(Demeter main page) (Cookbook main page)
| /01DataHandling /02Fitting /03AtomsAndFeff /04PlottingAndConfiguration /05HgDNA /06Uranyl |
A complicated fitting example
Hg bound to a synthetic DNA molecule
![Figure 1 (click for full size image) [Figure 1]](/ifeffit/Demeter/Cookbook/05HgDNA?action=AttachFile&do=get&target=hg_thymidine.png)
Figure 1 Hg bound to the pyrimidine ring of a thymidine nucleotide. The part in the gray box is the pyrimidine ring.
This example demostrates the analysis described in this presentation [5.4 MB] that I gave at the XAS08 conference at the Swiss Light Source.
Some background: XAS was measured on a solution containing 3mM Hg, 3mM duplex DNA, a buffer, and a salt all dissolved in water. The DNA is the active part of a DNA sensor for Hg contamination in water. The purpose of the experiment was to shed some light on how the Hg interacts with the DNA.
Not knowing much about how the DNA might interact with the Hg, my approach was to generate feff.inp files with the Hg in various positions around each nucleotide. In the example fleshed out here, I am placing the Hg atom at a position of relatively high symmetry near the N atom in thymidine's pyrimidine ring. In other attempts to analyze the data (none shown here), I placed the Hg atom in other locations around each of the four nucleotides.
From simple examination of the data, I know that the Hg atom is about 2.04 Ang. away from its nearest neighbor. For this attempt at fitting the data, I presume that the Hg atom is 2.04 Ang. from the N atom on thymidine's pyrimidine ring, that it is equidistant from the C atoms on either side of the N in the pyrimidine ring, and that it is in the plane of those three atoms. All of this is shown in Figure 1. In that image, the pyrimidine ring is the bit in the gray box.
The dna2feff.pl script sets up the set of coupled equations solving all the conditions described in the last pararaph and uses Ifeffit to solve them. It them writes out a feff.inp based on the thymidine Protein Data Bank file with the Hg atom at the position solving the coupled equations.
The fit.pl script shown below then uses that feff.inp file and data contained in the HgDNA_data.prj Athena project file. It sets up a fitting model and runs the fit. The basic assumptions in this fitting model are that the Hg atom is in a position of high symmetry and that the pyrimidine ring is rigid compared to the Hg-N bond.
All of the files referred to here can be found in the example/HgDNA/ folder when you download Demeter from the svn repository.
Some things to notice in the fit script:
Data is imported directly from the Athena project file at line 18 and 19 using the Prj object.
Feff is run on the fly. The potentials are computed at line 66 and the paths are enumerated at line 68. feffNNNN.dat files are generated as needed and behind the scenes once the fit is called.
A Data object (lines 20-26), several GDS objects (29-58), and several Path objects (83-144) are set up.
The various GDS objects encode the trigonometry relating the Hg-C distance to the Hg-N distance (lines 46-49).
The paths from the calculation are shown in this interpretation. This is what line 70 would write out if it were uncommented.
The find_path method (line 83 and elsewhere) is used to find the correct path for each Path object. That's not strictly necessary -- in this case I could just count off the paths returned by the pathlist method (which are returned in the order shown in the the interpretation). But I wanted to demonstrate Demeter's semantic path description capabilities.
Since this script is intended to be run from a command line, the screen UI mode is turned on at line 3. This will display a moving indicator on-screen during the fit and it enables the use of the interview method at line 173. This allows for some limited interaction with the fit results using a simple command line interface.
1 #!/usr/bin/perl -I/home/bruce/codes/demeter/lib
2
3 use Demeter qw(:ui=screen); # enable the use of Interview and Spinner
4 # strict and warnings automatically imported with Demeter
5
6 ## -------- clean up in preparation of the next fit
7 unlink("hgfit.iff") if (-e "hgfit.iff");
8 unlink("hgfit.dpj") if (-e "hgfit.dpj");
9
10 ## I like to have a dummy object around for things like set_mode and
11 ## simpleGDS, although you can use any object for those purposes...
12 my $demeter = Demeter->new;
13 $demeter -> set_mode(screen => 0, ifeffit => 1, file => ">hgfit.iff");
14 $demeter -> po -> set(kweight => 2, rmax => 6);
15
16
17 ## -------- import data and set up the FT and fit parameters
18 my $prj = Demeter::Data::Prj -> new(file => 'HgDNA_data.prj');
19 my $data = $prj -> record(2); # import 2nd record, which contains the Hg/DNA data
20 $data -> set(name => 'Hg with DNA',
21 fft_kmin => 2.0, fft_kmax => 8.8,
22 fit_space => 'r',
23 fit_k1 => 1, fit_k2 => 1, fit_k3 => 1,
24 bft_rmin => 1, bft_rmax => 3.1,
25 fit_do_bkg => 0,
26 );
27
28 ## -------- create all the guess, set, def, and after parameters
29 my @gds = (
30 $demeter->simpleGDS("set angle1 = 115.9 * pi / 180"),
31 $demeter->simpleGDS("set angle2 = 116.6 * pi / 180"),
32 $demeter->simpleGDS("set b1 = 1.373"),
33 $demeter->simpleGDS("set b2 = 1.384"),
34 $demeter->simpleGDS("set m = 1.43"), # crude scaling factor for MS paths
35
36 $demeter->simpleGDS("guess amp = 1"),
37 $demeter->simpleGDS("guess enot = 0"),
38
39 ## geometry for location equidistant from two 2NN atoms in a 6-member ring
40 $demeter->simpleGDS("set anot = 2.04"),
41 $demeter->simpleGDS("guess deltaa = 0"),
42 $demeter->simpleGDS("def a = anot + deltaa"), # net Hg - N distance
43 $demeter->simpleGDS("def angle = (angle1 + angle2)/2"), # average Hg-N-C angle
44 $demeter->simpleGDS("def b = (b1+b2)/2"), # average N-C distance
45
46 ## some fun trigonometry follows
47 $demeter->simpleGDS("def tanth = (a + b) * tan(angle/2) / (a - b)"),
48 $demeter->simpleGDS("def theta = atan(tanth)"),
49 $demeter->simpleGDS("def c = (a-b) * cos(angle/2) / cos(theta)"),
50
51 ## the rest of my fitting parameters, all MS paths will be approximated in terms of these
52 $demeter->simpleGDS("guess dro = 0"),
53 $demeter->simpleGDS("guess ssn = 0.003"),
54 $demeter->simpleGDS("def ssc = m*ssn"),
55 $demeter->simpleGDS("guess sso = 0.003"),
56
57 $demeter->simpleGDS("set szs = 0.82"), # s02 determined from fit to HgO data
58 $demeter->simpleGDS("after cn = amp/szs"), # compute coordination number for log file
59 );
60
61
62 ## -------- run the feff calculation
63 my $feff = Demeter::Feff->new(file=>'15/withHg.inp', workspace=>'15');
64 $feff -> set(screen=>0, buffer=>q{}, save=>1);
65 $feff -> co -> set_default("pathfinder", "fs_angle", 25);
66 $feff -> potph;
67 $feff -> rmax(4.5);
68 $feff -> pathfinder;
69 ## $feff -> freeze('15/feff.yaml');
70 ## print $feff -> intrp, $/;
71
72
73 ## -------- begin setting up paths
74 ## note that I am using the `find_path' method here as a
75 ## demonstration of how to use Demeter's semantic path
76 ## descriptions. for instance, in the case of the first
77 ## path, I want to use "the SS path that is less than 3
78 ## angstroms and scatters from a nitrogen atom"
79 my @paths = ();
80 my $index = 0;
81 my @common = (parent => $feff, data => $data, s02 => "amp", e0 => "enot",);
82
83 my $p = $feff->find_path(lt=>3, tag=>['N']); ## find the nearest neighbor, N at a short distance
84 push @paths, Demeter::Path -> new(@common,
85 sp => $p,
86 delr => "deltaa",
87 sigma2 => "ssn",
88 );
89
90 $p = $feff->find_path(lt=>3, tag=>['C']); ## find the second neighbor C atoms in the pyrimidine ring
91 push @paths, Demeter::Path -> new(@common,
92 sp => $p,
93 delr => "c-reff",
94 sigma2 => "ssc",
95 );
96
97 $p = $feff->find_path(lt=>3.5, tag=>['O']); ## find the third neighbor O atoms dangling from the pyrimidine ring
98 push @paths, Demeter::Path -> new(@common,
99 sp => $p,
100 delr => "dro",
101 sigma2 => "sso",
102 );
103
104 $p = $feff->find_path(lt=>4, tag=>['C', 'N']); ## find the C-N triangle paths
105 push @paths, Demeter::Path -> new(@common,
106 sp => $p,
107 delr => "(c-2.924)/2 + deltaa/2",
108 sigma2 => "ssc+ssn",
109 );
110
111 $p = $feff->find_path(lt=>4, tag=>['N', 'C', 'N']); ## find the N-C-N dog leg
112 push @paths, Demeter::Path -> new(@common,
113 sp => $p,
114 delr => "deltaa",
115 sigma2 => "ssn",
116 );
117
118 $p = $feff->find_path(lt=>4, tag=>['C', 'O']); ## find the C-O triangle
119 push @paths, Demeter::Path -> new(@common,
120 sp => $p,
121 delr => "(c-2.924)/2 + dro/2",
122 sigma2 => "ssc+sso",
123 );
124
125 $p = $feff->find_path(lt=>4.2, tag=>['N', 'O']); ## find the N-O triangle
126 push @paths, Demeter::Path -> new(@common,
127 sp => $p,
128 delr => "deltaa/2 + dro/2",
129 sigma2 => "ssn+sso",
130 );
131
132 $p = $feff->find_path(lt=>4.2, tag=>['C', 'O', 'C']); ## find the C-O-C dog leg
133 push @paths, Demeter::Path -> new(@common,
134 sp => $p,
135 delr => "deltaa/2 + dro/2",
136 sigma2 => "ssn+sso",
137 );
138
139 $p = $feff->find_path(lt=>4.2, tag=>['C', 'C']); ## find the C-C triangle
140 push @paths, Demeter::Path -> new(@common,
141 sp => $p,
142 delr => "2*deltaa",
143 sigma2 => "4*ssn",
144 );
145
146 ## -------- a Fit object is a collection of GDS, Data, and Path objects
147 my $fit = Demeter::Fit->new(
148 gds => \@gds,
149 data => [$data],
150 paths => \@paths,
151 );
152
153 ## Up to this point in the script, Demeter does not significantly
154 ## reduce the amount of typing you have to do to create a fitting
155 ## model. Parameters *have* to be defined, data processing parameters
156 ## *have* to be set, paths *have* to be defined. The benefit of
157 ## Demeter is how easy everything else is after this point.
158 ##
159 ## Running the fit is trivial. Plotting the data, paths, and fit is
160 ## easy. Logfiles, output files, project files -- all those things
161 ## are easily created as well.
162
163 ## -------- do interesting things with the Fit object
164 $fit -> fit;
165 $fit -> logfile("hgfit.log", "Hg at N15 on pyrimidine", q{});
166 $fit -> freeze("hgfit.dpj");
167 $data -> save("fit", "hgfit.fit");
168
169 ## -------- simple, on-screen interaction with the fit results
170 $fit -> interview;
Some shortcomings in this script:
Temporary files from the Feff and Gnuplot interfaces are not yet cleaned up automatically, so line 173 takes care of that. (Fixed 11 Jan 2009)
There is currently no mechanism for automatically generating proper Index attributes of the Path objects, so that has to be handled by hand. (Fixed 31 Dec 2008)
Saving project files (line 166) is not working at this moment, but that is at the top of the list of things to fix. (It works now, use the rdfit script to examine the project file. BR 4 Nov 08)