How to Process Laue Data
Using Precognition
Zhong Ren
(This page best viewed with Firefox.)
This tutorial shows a step-by-step procedure to process a Laue dataset. One of the first Laue datasets taken on the 14-ID beamline after its upgrade is chosen as an example. This dataset consists of 31 images collected from a wild-type photoactive yellow protein (PYP) crystal at its ground state with 2° φ spacing between consecutive images, and φ spanning from -30° to 30°. A Mar165 CCD detector was used at a crystal-to-detector distance of 100 mm. Two collinear undulators (U23 at a gap of 10.78 mm and U27 at 15.87 mm) provided the X-ray source. The elapsed exposure time of each image is 2.13 s, but within this exposure time, there were only 8 X-ray pulses of 2 μs each. The APS synchrotron was running in 324-bunch mode (APS fill patterns are described at http://www.aps.anl.gov/Facility/Storage_Ring_Parameters/node5.html.
4. Scaling and Wavelength Normalization
Laue images in a tar-ball (99 MB)
Entire processing directory in a tar-ball (103 MB)
The particular sample used wasnÕt the greatest PYP crystal. The images were well exposed. The last image shows slightly decayed diffraction. Figure 1 shows the first image with some defect spot shape.
Figure 1. The first image in the dataset pyp1_001.mccd.
Before data processing, it is very useful to learn the soft limits of the dataset, such as a reasonable σ-cut (i.e., the minimum signal-to-noise ratio) and resolution limit dmin. They are called soft because these nmonochromatic dataset. Nevertheless, choosing the appropriate values leads to trouble-free processing. A small script limit1.inp helps to identify these soft limits.
diagnostic off
busy on
Input
Format
MarCCD
Distance 100
Center
1000.5 1030.5
Pixel
0.0792 0.0792
Image
images/pyp1_001.mccd
Wavelength 1.05 1.4 1.09
Quit
Spot 10 4 3.7 pyp1_001.spt
Profile
Limits
Quit
Listing 1. Input script limit1.inp.
The first two lines of the script diagnostic and busy are switches that control message output. This script usually runs in the foreground, so that busy messages are switched on. To run this script in foreground and save a copy of the log file, type:
% Precognition limit1.inp | tee limit1.log
The middle section provides a number of input data, all of which should be available from a beamline staff member. The three values to the keyword Wavelength are λmin, λmax, and λreference. The wavelength where the source spectrum peaks makes a good reference wavelength.
The command Spot takes three values for spot length, width in pixel and σ-cut, which can be thought of as the third dimension. At the beginning, one may simply choose 10, 10, and 3 as defaults to these values, if no better idea suggests itself. We will see shortly that this procedure learns better values from the image specified. It performs spot recognition throughout the image, and optionally saves the spots into a file, if a filename is given. It is highly recommended to plot the spots with your favorite plotting software. For example, gnuplot (http://www.gnuplot.info) is convenient to use for this purpose.
% gnuplot
gnuplot> plot Õpyp1_001.sptÕ using 4:5
gnuplot> set size ratio -1
gnuplot> set yrange [2000:0]
gnuplot> replot
Listing 2. Plotting a spot file using gnuplot.
It is important to make sure the plotted pattern (Figure 2) looks like the one in the original image (Figure 1) before proceeding further. If the original image is very over or underexposed, or if the crystal diffracts poorly, the first try of limit1.inp may result in something far from expectation.
Figure 2. Scatter plot of the recognized spots saved in pyp1_001.spt.
Precognition prints some ASCII arts in its log file to signal important messages of different types. A few numbers near the end of the log file limit1.log are interesting. The first is about the mean spot profile, which is generated by the command Profile. The values of overall spot length and width shall replace the first two arguments to the command Spot above.
______
| )_
| Report |
| ------ |
| ------ |
| ------ |
| ----
|
|________|
An overall mean profile is recognized.
Semi-major & -minor axes (pixel): 2.58865
1.05309
Non-elliptical correction: 0.000305247 0.0112322
-0.00527107 0.0050105
Non-Gaussian correction: 0.914063
0.780225
Overall spot length is set to 10 pixels.
Overall spot width is set to 4 pixels.
Estimated crystal dimension is 0.192782 mm.
Estimated mosaic spread in FWHM is 0.110279 degree.
Listing 3. Log file section regarding a mean spot profile from limit1.log.
The command Limits generates more messages about the soft limits. The most important soft limit is σ-cut at various stages of processing. This procedure identified a σ-cut of 3.9 results in 10% spots that exceed an underlying spot distribution model. This value shall replace the third argument to the command Spot above. The spot distribution model implemented may not be very accurate, and may even fail completely under some circumstances, but the σ-cut values it identifies serve as guidelines during the following steps of processing. For example, during indexing and geometry refinement, the suggested range of σ-cut 4.5 to 5.8 will be helpful.
Best sigma-cut estimated at 4.45.
1803 real spots on this image.
Sigma-cut results in 10% noise is 3.87.
Suggested sigma-cut for indexing and geometry
refinement is between 4.45 and 5.84.
Maximum spot density 9.6/mrad at Bragg angle of 14.3
degree.
Diffraction limit estimated at Bragg angle of 21.8
degree or 1.68 A resolution.
Suggested resolution for indexing and geometry
refinement is 2.11 A.
Listing 4. Log file section regarding some soft limits from limit1.log.
The first step of processing is indexing, that is, an assignment of Miller indices to all spots on the images. Indexing is required only for one image, often the first in a set, if all others have a known spatial relationship with the indexed one. Indexing is done by another small command script index.inp.
diagnostic off
busy off
Input
@ pyp.inp
Format
MarCCD
Distance 100
Center
997.5 1031.8
Pixel
0.0792 0.0792
Omega
-90 0
Goniometer 0 0 -30
Image
images/pyp1_001.mccd
Resolution 2.2 100
Wavelength 1 1.4 1.1
Quit
Spot 10 4 5.8 pyp1_001.re.spt
Ellipse
Pattern 5 pyp1_001.pre.spt
Quit
Listing 5. Input script index.inp.
This script looks very similar, except that the Input section contains more information, and two different commands Ellipse and Pattern are used. In the Input section, the arguments to commands Format, Distance, Center, Pixel, Omega, and Wavelength should be known from a beamline staff member. The command Omega -90 0 is specific to 14-ID beamline of BioCARS.
The argument to Image specifies a filename. A relative path can be used as shown. All images are arranged in a subdirectory images.
The three arguments to Goniometer are ω, κ, and φ in degree.
A special command @ pyp.inp redirects the input stream
to another script pyp.inp, which
contains one line Crystal 66.9 66.9
40.8 90 90 120 173. These
values are the unit cell constants and space group number. This line can optionally replace the @ command.
Finally, the resolution limit
often has significant impact on the processing. Suggestion from limit1.log
(Listing 4) shall be followed. During the stages of indexing and geometry refinement, conservative
resolution limit often works better.
The command Spot is exactly the same as
before. It is recommended to use a
σ-cut towards the higher end of the
suggested range (Listing 4).
5 possible crystal orientations are recognized;
corresponding cell constants and detector parameters
are refined.
Indexing 1
R.M.S. deviation (pixel):
2.44762
Number of spots matched:
570
Cell lengths (Angstrom):
66.9000
66.9000 40.8000
Cell angles (degree):
90.0000 90.0000 120.0000
Euler angles (degree):
20.4120 111.4379 -84.9195
Euler angles (radian):
0.3563
1.9450 -1.4821
Missetting matrix:
-0.04397746
0.94481500 0.32463919
0.37208200
0.31706507 -0.87236731
-0.92715747 0.08242790 -0.36549237
Goniometer omega, chi, phi(degree): 0.0000 0.0000 -30.0000
Omega-axis polar orientation (deg): -90.0000 0.0000
Detector type:
flat
Crystal-to-detector distance (mm): 100.0000
Direct-beam center (pixel):
998.3930
1031.8162
Pixel size (mm):
0.0792000
0.0792000
Detector swing angles (degree):
0.0000 0.0000
Detector tilt angles (degree):
0.0000 0.0000
Detector bulge corrections (10^-12): 0
0
Listing 6. Parameters of one orientation match in index.log.
The command Pattern accepts an integer and a
filename. The integer specifies
how many orientation matches should be found before the program exits. The orientation matrices and other
parameters will be printed in the log file. The solutions will be sorted by their merits. The first two numbers are the most
important: RMSD and matched spots. Obviously, the smaller the RMSD and the more the matched spots, the more
reliable the solution. The
filename is for another spot file that contains predicted spots under the best
orientation match. The spot files pyp1_001.re.spt (recognized) and pyp1_001.pre.spt
(predicted) can be now plotted together
to show the correctness of the indexing (Figure 3).
Figure 3. Superposition of the recognized (red +) and predicted (green ×) patterns.
A silent product of this script index.inp is another input script pyp1_001.pre.spt.inp, that is, the string argument to Pattern appended by .inp. This newly generated file essentially repeats all parameters listed in Listing 6, except that this file has the valid syntax to be loaded back in (see below).
As shown in Figure 3, these two
patterns obviously match with each other, but they look a bit displaced. Geometry refinement is a process to
minimize the displacement while adjusting the geometric parameters, such as
unit cell and detector parameters. On the other hand, this is also a process to distribute the crystal
orientation found from one image by indexing to an entire dataset, provided
that all images are related by a known φ-spacing.
diagnostic off
busy off
@ pyp1_001.pre.spt.inp
Input
Crystal 0 0 0.2
0 0 0 free
Format
MarCCD
Distance 1.0 free
Center
1.0 free
Pixel
0.0001 free
Tilt
0.2 free
Bulge
fix
Goniometer 0 0 -30 pyp1_001.mccd
Goniometer 0 0 -28 pyp1_002.mccd
...
Goniometer 0 0 30
pyp1_031.mccd
Resolution 2.0 100
Wavelength 1 1.2 1.03
Spot
10 4 4.5
Quit
Dataset progressive
In images
Quit
Quit
Listing 7. Input script refine.inp (with
omission).
Input script refine.inp first uses the @ command we have seen before to load
back the geometric parameters found by indexing. This set of parameters is the starting point of a
refinement. Once again, an even
longer Input section provides
more control. Notice that the
numerical argument turns into an allowed standard deviation when fix or free
is the string argument. If a
standard deviation is supplied to a parameter, the displacement of this
parameter will be suppressed according to the standard deviation. This provides a means to restrain the
refinement before some parameters drifting too far away from their physically
meaningful ranges.
A sequence of the same command Goniometer adds frames to the dataset
to be processed. From this point
on, the word "process" deviates to different meanings, but the command that
triggers the "process" is always Dataset. Now, "process" stands for geometry
refinement, and we choose one of the refinement modes called progressive.
As the log file reported, three
files are generated for each image: two spot files and an input script pyp1_001.mccd.inp. These spot files can be again plotted
together to check the goodness of refinement. Nevertheless, the log file reports a much better fit compared
to the RMSD and matched spots after indexing. Now, the set of input scripts contains all geometric
parameters needed for prediction of Miller indices, location, and wavelength of
all spots in the dataset, visible or too weak to show.
File
./pyp1_001.mccd.re.spt is overwritten.
...
Cell constants, lattice orientation, goniometer
setting,
and detector parameters: after geometric refinement
Title:
pyp1_001.mccd
Cell lengths (Angstrom):
66.9000
66.9000 41.0262
Cell angles (degree):
90.0000 90.0000 120.0000
Euler angles (degree):
20.3627 111.5005 -84.9477
Euler angles (radian):
0.3554
1.9461 -1.4826
Missetting matrix:
-0.04447376
0.94509715 0.32374908
0.37291472
0.31635045 -0.87227118
-0.92679917
0.08193762 -0.36650993
Goniometer omega, chi, phi(degree): 0.0000 0.0000 -30.0000
Omega-axis polar orientation (deg): -90.0000 0.0000
Detector type:
flat
Crystal-to-detector distance (mm): 100.2936
Direct-beam center (pixel):
998.1205
1031.8685
Pixel size (mm):
0.0792047
0.0792000
Detector swing angles (degree):
0.0000 0.0000
Detector tilt angles (degree):
-0.1633
-0.2998
Detector bulge corrections (10^-12): 0
0
R.M.S.D. in pixel & matched spots: 0.3647 1093
File ./pyp1_001.mccd.pre.spt is overwritten.
File ./pyp1_001.mccd.inp is overwritten.
Listing 8. Result from geometry refinement in refine.log.
Optionally, a different
refinement mode called final may
be used after progressive mode is
done. The final mode may result in better fit in some, but not all,
cases. Notice that the @ commands are used repeatedly in
script final.inp. This is also a chance to loosen up the
standard deviations, resolution, and wavelength ranges, use more spots, or even
refine more parameters.
diagnostic off
busy off
prompt off
result off
@ pyp1_001.mccd.inp
@ pyp1_002.mccd.inp
...
@ pyp1_031.mccd.inp
prompt on
result on
Input
Crystal 0 0 0.2
0 0 0 free
Format
MarCCD
Distance 1 free
Center
1
free
Pixel
0.0001 free
Tilt
0.2 free
Bulge
fix
Resolution 1.9 100
Wavelength 1.02 1.2 1.04
Spot
10 4 4.5
Quit
Dataset final
In images
Quit
Quit
Listing 9. Input script final.inp (with
omission).
Once a pattern is well refined,
additional soft limits can be found out, that is, an estimated source spectrum,
even before integration. The
well-refined geometric parameters need to be loaded first. The command Limits now requires a numerical argument as λmax and a filename for the estimated
spectrum. The spectrum is plotted
in Figure 4.
diagnostic off
busy off
@ pyp1_001.mccd.inp
Input
Omega
-90 0
Goniometer 0 0 -30
Format
MarCCD
Image
images/pyp1_001.mccd
Quit
Spot 10 4 3.7 pyp1_001.spt
Limits 1.3 estimate.lam
Quit
Listing 10. Input script limit2.inp.
Integration is another meaning of "process"; it also has various modes, but we choose nonlinearAnalytical, whatever it means, it is the most aggressive mode of integration (not necessarily the best for all cases). The @ command @ pyp1.inp is essentially the same as loading a sequence of geometric input scripts in final.inp. They are now saved into another file pyp1.inp to keep the conciseness.
diagnostic off
busy off
warning off
@ pyp1.inp
Input
Image
start.lam
Spot
10 4 4.5
Quit
Dataset nonlinearAnalytical
In images
Resolution 1.5 1000
Wavelength 1 1.3 1.04
Quit
Quit
Listing 11. Input script integrate.inp.
A new item has appeared in Input section: start.lam, which is a small file that contains the source
spectrum. The best spectrum can be
used here, but if not, a few points on the spectrum are sufficient (Figure
4). The following file has
wavelength in in the first column and the relative intensity in the second. The estimated spectrum estimate.lam from limit2.inp is another good option.
1.000 0.0
1.034 1.0
1.100 0.3
1.200 0.1
1.300 0.0
Listing 12. The starting source spectrum start.lam.
The spot parameters are the same,
length, width, and height in the form of σ-cut, but the σ-cut
is particularly important here, since it is one of the criteria to select
sample spots for profile fitting. Needless to say, this has no effect if a non-profile-fitting integration
mode is chosen (see Precognition
manual for other integration modes). Follow suggestion in limit1.log
and check .re.spt by plotting.
Resolution and wavelength ranges
are now specified in Dataset
section, since they are very specific to the integration process. All other input parameters are removed,
since they reside in the set of geometric parameter scripts now.
Integration produces a set of .ii files that contain integrated intensities. Distributions
of all kinds can be plotted from these files for diagnostic purposes.
4. Scaling and Wavelength
Normalization
This procedure is a large-scale, multi-parameter, nonlinear minimization. In most cases, a single pass of scale.inp solves the problem nicely, but sometime two or more passes may further improve the result.
diagnostic off
busy off
warning off
@ pyp1.inp
# @ restore.inp
Input
Image start.lam
Resolution 1.5 1000
Wavelength 1 1.3
1.034
Anomalous off
Quit
Scale
Restore
restore.inp
Sigma 2
Mosaicity 0 fix
Isotropy 0 scale
Isotropy 0
temperature
Expansion fix
Lambda-shift free
Chebyshev 64
<