

Writing Athena Filetype Plugins

Introduction

Athena uses Ifeffit's read_data() function to import data. This means that Athena's notion of an acceptable data format is identical to Ifeffit's notion. Put another way, if Ifeffit can read a data file, so can Athena.

In practice, this works great. Ifeffit is able to read the data files generated by many of the world's XAS beamlines. And so, consequently, is Athena. Sadly, there are many beamlines that use a format that confounds Ifeffit and Athena. There are three reasonable ways that I could deal with data from those beamlines:

  1. Refuse to deal with them and require the user to transform the data into a form that Ifeffit can handle.
  2. Hardwire code into Athena to deal with each new data format as I become aware of it.
  3. Create a plugin architecture that allows Athena to be extended to deal well with new data formats without having to change the underlying code.

For a long time, Athena relied on a combination of 1 and 2 from that list. Recently I decided to adopt number 3. This wiki page is an attempt to fully document the plugin architecture so that Athena's users can begin writing their own filetype plugins.

Overview of how plugins work

In simple language, a plugin is a short file containing special perl code placed in a special location. Athena uses the code contained in that file to recognize and pre-process data files so that they can be imported properly using Ifeffit.

In somewhat more technical language, a plugin is just a perl module placed beneath the user's $HOME/.horae/ directory. This file is used when Athena starts and its methods are available when data is imported.

When a plugin is available for use, it is invoked every time a file is imported into Athena using either the Open file or Open many files functions. The new file is checked using one of the plugin's methods to ascertain if the file is of the sort serviced by the plugin. If the file is recognized, another method in the plugin transforms the original data file into a form that is readable by Ifeffit. This transformation is done in a way that leaves the original data file unchanged.

If the transformation is successful, the user is presented with Athena's column selection dialog and can import data in the normal manner. Ideally, a plugin is written in a way that makes the import of the data into Athena a completely transparent process for the user.
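The recognize-then-transform sequence described above can be sketched in a few lines. This is only an illustration of the idea, not Athena's actual code: the import_with_plugins function and the Demo::ToyPlugin package are hypothetical names invented for this sketch, and the main window and persistent hash arguments are stubbed out.

```perl
use strict;
use warnings;

# Try each plugin in turn. The first one whose is() method recognizes the
# file transforms it with fix(); otherwise the original file is used as-is.
sub import_with_plugins {
    my ($file, $stash_dir, @plugins) = @_;
    foreach my $plugin (@plugins) {
        next unless $plugin->is($file);
        # the main window ($top) and the persistent hash are stubbed out here
        return $plugin->fix($file, $stash_dir, undef, {});
    }
    return $file;
}

{
    # Hypothetical toy plugin: claims any file whose name ends in ".toy"
    package Demo::ToyPlugin;
    use File::Basename qw(basename);
    use File::Spec;

    sub is {
        my ($class, $file) = @_;
        return $file =~ /\.toy$/ ? 1 : 0;
    }

    # A real fix method would write a transformed copy; this one only
    # computes where that copy would live in the stash directory.
    sub fix {
        my ($class, $file, $stash_dir) = @_;
        return File::Spec->catfile($stash_dir, basename($file));
    }
}
```

Calling import_with_plugins($file, $stash_dir, 'Demo::ToyPlugin') returns a path in the stash directory for a recognized file and the original path for anything else.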

Example plugin

Here is a complete example of a functional plugin. This plugin allows Athena to import files from NSLS beamline X10C. As you can see, the plugin is quite short. The following sections of this wiki page will explain this example in detail.

   1 package Ifeffit::Plugins::Filetype::Athena::X10C;
   2 
   3 use vars qw(@ISA @EXPORT @EXPORT_OK);
   4 use Exporter;
   5 use File::Basename;
   6 use File::Copy;
   7 @ISA = qw(Exporter AutoLoader);
   8 @EXPORT_OK = qw();
   9 
  10 ## define the required variables
  11 use vars qw($is_binary $description);
  12 $is_binary = 0;
  13 $description = "Read files from NSLS beamline X10C.";
  14 
  15 ## this method recognizes a file from NSLS X10C
  16 sub is {
  17   shift;
  18   my $data = shift;
  19   open D, $data or die "could not open $data as data (X10C)\n";
  20   my $first = <D>;
  21   close D, return 0 unless (uc($first) =~ /^EXAFS/);
  22   my $lines = 0;
  23   while (<D>) {
  24     close D, return 1 if (uc($_) =~ /^\s+DATA START/);
  25     ++$lines;
  26     #close D, return 0 if ($lines > 40);
  27   };
  28   close D;
  29   return 0;
  30 };
  31 
  32 ## this method transforms a file from NSLS X10C
  33 sub fix {
  34   shift;
  35   my ($data, $stash_dir, $top, $r_hash) = @_;
  36   my ($nme, $pth, $suffix) = fileparse($data);
  37   my $new = File::Spec->catfile($stash_dir, $nme);
  38   ($new = File::Spec->catfile($stash_dir, "toss")) if (length($new) > 127);
  39   open D, $data or die "could not open $data as data (fix in X10C)\n";
  40   open N, ">".$new or die "could not write to $new (fix in X10C)\n";
  41   my $header = 1;
  42   my $null = chr(0).'+';
  43   while (<D>) {
  44     $_ =~ s/$null//g;           # clean up nulls
  45     print N "# " . $_ if $header; # comment headers
  46     ($header = 0), next if (uc($_) =~ /^\s+DATA START/);
  47     next if ($header);
  48     $_ =~ s/([eE][-+]\d{1,2})-/$1 -/g; # clean up 5th column
  49     print N $_;
  50   };
  51   close N;
  52   close D;
  53   return $new;
  54 }
  55 
  56 1;
  57 __END__

Namespace

The module must be in a particular namespace. The namespace is defined by the package declaration on line 1 of the example. The convention for the namespace used by the plugin is slightly unwieldy, but was chosen for a good reason. The package must be in the Ifeffit::Plugins::Filetype::Athena namespace and should have a name that describes the format it handles. In the case of the example, the plugin is intended to transform X10C files, so the full namespace of the module is Ifeffit::Plugins::Filetype::Athena::X10C. Lines 3-8 include requisite boilerplate which allows this module to work properly with Athena and load some modules that are almost always useful.

The reason I chose such an unwieldy namespace is to allow for the possibility of moving much functionality in both Athena and Artemis into the form of plugins. With this choice, I will have considerable flexibility without having to rewrite any existing code.

Required methods and variables

The plugin must supply two "public" variables and two "public" methods. (I put public in quotes because, of course, perl does not provide true encapsulation of variables and methods. The sense in which I mean public is that Athena requires certain variables and methods in the namespace of the plugin. Without them, the plugin will fail noisily.)

required variables

Lines 11-13 define the two required variables in a way that allows them to be accessed outside the scope of this module.

$is_binary
  • A boolean that tells Athena whether the input file format is text or binary. Athena handles binary files slightly differently in the column selection dialog.
$description
  • A short text string describing the purpose of this plugin. This string will be displayed in the plugin registry -- take a look at the amount of space available there, and make your string shorter than that.
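As a rough illustration of what "accessed outside the scope of this module" means, here is a sketch of how a host program might read these package variables knowing only the plugin's package name. The Demo::Plugin::Example package and the plugin_variable helper are hypothetical, and I use "our" here, which is the more modern equivalent of the "use vars" declaration in the example. This is not necessarily how Athena itself does the lookup.

```perl
use strict;
use warnings;

{
    package Demo::Plugin::Example;    # hypothetical plugin namespace
    our $is_binary   = 0;
    our $description = "Read files from an example beamline.";
}

# Look up a package variable by name, given only the package name as a string.
sub plugin_variable {
    my ($package, $name) = @_;
    no strict 'refs';                 # symbolic reference, on purpose
    return ${"${package}::${name}"};
}
```

Because the variables are package variables rather than lexicals declared with "my", a call such as plugin_variable('Demo::Plugin::Example', 'description') can see them from outside the plugin.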

the "is" method

Lines 16-30 show the is method. This method is called by Athena to try to recognize an input data file as being of a particular format. In the case of this example, the X10C file is recognized by some of the text in the first few lines of the file. When the file is recognized, this method returns a true value. If the test fails, it returns 0. When Athena sees the true return value, it applies the fix method to transform the data file.

It is quite important that the is method be fast. It is possible that a data file will have to be tested against a large number of plugins. If the is method is slow, file import will be slow.
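One way to keep a recognition test fast is to stop reading after a fixed number of header lines. Line 26 of the example hints at such a cap but leaves it commented out. The following is a minimal sketch of that idea as a standalone function; the name looks_like_x10c and the $max_lines parameter are my own inventions for illustration, and the default of 40 is simply taken from the commented-out line.

```perl
use strict;
use warnings;

# Return 1 if the first line starts with EXAFS and a DATA START line
# appears within the next $max_lines lines, 0 otherwise.
sub looks_like_x10c {
    my ($file, $max_lines) = @_;
    $max_lines = 40 unless defined $max_lines;   # hypothetical default cap
    open my $fh, '<', $file or return 0;
    my $first = <$fh>;
    my $found = 0;
    if (defined $first and uc($first) =~ /^EXAFS/) {
        my $count = 0;
        while (my $line = <$fh>) {
            if (uc($line) =~ /^\s+DATA START/) { $found = 1; last; }
            last if ++$count >= $max_lines;      # give up early on long files
        }
    }
    close $fh;
    return $found;
}
```

A file that fails the first-line test is rejected after reading a single line, and a file that passes it costs at most $max_lines further reads.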

the "fix" method

Lines 33-54 show the fix method. This method is called when the is method returns true. In some manner it makes a copy of the original data file and transforms that copy into a form that can be read by Ifeffit. This method needs to follow a number of strict rules. Within those rules, however, there is a lot of flexibility about how the transformation is accomplished and the scope of what that transformation does to the data.

This method takes four scalars as inputs.

$data
  • This is a string containing the fully resolved name of the input data file.
$stash_dir
  • This is a string with the location where Athena will look for the transformed copy of the data.
$top
  • This is a reference to Athena's main window. This allows you to create a GUI input dialog to collect information interactively from the user. See the Encoder plugin for an example of how this is used.
$r_hash
  • This is a reference to a hash that can contain information about how the data are transformed. This hash persists between invocations of the fix method, thus allowing you to reuse parameters about the transformation.

The return value of this method is the fully resolved file name of the transformed file which must be located in the directory indicated in the $stash_dir input scalar.
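To make the role of the four arguments concrete, here is a sketch of a fix-style function that remembers a parameter in the persistent hash. Everything specific in it is an assumption made for illustration: the function name fix_with_memory, the header_lines key, and the default of 2 are all invented, and a real plugin would more likely ask the user for the value through a dialog built on $top.

```perl
use strict;
use warnings;
use File::Basename qw(fileparse);
use File::Spec;

# Comment out the first N lines of the file, where N is remembered in the
# persistent hash so that later calls can reuse it.
sub fix_with_memory {
    my ($data, $stash_dir, $top, $r_hash) = @_;
    # 'header_lines' is an invented key; $top is unused in this sketch
    $r_hash->{header_lines} = 2 unless defined $r_hash->{header_lines};
    my ($name) = fileparse($data);
    my $new = File::Spec->catfile($stash_dir, $name);
    open my $in,  '<', $data or die "could not open $data\n";
    open my $out, '>', $new  or die "could not write to $new\n";
    my $count = 0;
    while (my $line = <$in>) {
        $line = "# $line" if $count++ < $r_hash->{header_lines};
        print {$out} $line;
    }
    close $out;
    close $in;
    return $new;    # fully resolved name of the copy in the stash directory
}
```

The original file is only ever opened for reading, and the value stored in the hash is still there the next time the function is called with the same hash reference.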

The basic workflow of the fix method is to open the original data file, perform some kind of operation on the data, and write the transformed data to the $stash_dir. This can be done almost any way. Some plugins use Ifeffit commands (via the Ifeffit module) to operate on the data. Other plugins use pure perl to parse and transform the file.

In the example given on this page, the first thing the fix method does is to create a file name in the stash directory for the transformed file. Lines 36 and 37 tell Athena to give the stash file the same name as the original file (that is the function of the fileparse command) but in the stash directory (the catfile method builds a fully resolved filename in a platform transparent manner). Line 38 checks the length of the fully resolved filename to avoid running into one of Ifeffit's internal limitations.
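The file naming logic of lines 36-38 can be pulled out into a small function to see what it does on its own. The function name stash_name is hypothetical; the calls and the 127 character limit are the ones used in the example.

```perl
use strict;
use warnings;
use File::Basename qw(fileparse);
use File::Spec;

# Build the name of the transformed file: same base name as the original,
# but located in the stash directory. Fall back to "toss" when the full
# path would be longer than 127 characters.
sub stash_name {
    my ($data, $stash_dir) = @_;
    my ($name) = fileparse($data);    # base name, without the directory
    my $new = File::Spec->catfile($stash_dir, $name);
    $new = File::Spec->catfile($stash_dir, "toss") if length($new) > 127;
    return $new;
}
```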

Three things are done to transform an X10C file. The header is stripped of null characters, the header is commented out by putting # characters in the first column, and a formatting problem in some files involving a lack of white space between columns is resolved. Each line of the original file is read, operated on, and written to the transformed file in the stash directory. The while loop starting at line 43 reads through the file line-by-line and performs the operations.
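To see the two substitutions from lines 44 and 48 at work, here is a small sketch that applies them to a single line. The function name clean_line is made up for this illustration, and I have written the null pattern as \0+ rather than building it with chr(0), which should match the same thing.

```perl
use strict;
use warnings;

# Strip null characters and put a space between two columns that have run
# together at a minus sign, as in "1.0E+01-2.0E+00".
sub clean_line {
    my ($line) = @_;
    $line =~ s/\0+//g;                       # remove runs of null characters
    $line =~ s/([eE][-+]\d{1,2})-/$1 -/g;    # exponent followed directly by "-"
    return $line;
}
```

For example, clean_line("8979.5 1.0E+01-2.0E+00\n") gives back "8979.5 1.0E+01 -2.0E+00\n", so that Ifeffit sees three columns rather than two.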

Lines 51 and 52 close the original and new file handles. The filter should always close the file handles. This is not such a huge issue under unix, but Windows places a lock on any open file handle. If you fail to close one, for as long as Athena is running no other process will be able to do anything with that file.

At line 53, the method returns with the fully resolved name of the transformed file. At no point was the original file altered. When Athena exits, it will clean up the stash directory, thus avoiding a pile up of unnecessary data files.

The workflow in this example is a simple stream from one file to another. Other filters (Lambda.pm is an example) in the horae distribution use Ifeffit to perform the transformation. X15B.pm uses perl's pack function to transform a binary file. Encoder.pm generates a simple GUI dialog to get data necessary for the transformation from the user. As you can see, the architecture of these plugins is quite flexible, allowing you to solve the transformation problem in whatever manner best suits the situation.

Athena's plugin registry

Because there might be a large number of filetype plugins, it is possible for the user to turn the checks for the file types on and off. In the Settings menu, you will find the Plugin Registry. This is a simple list of all plugins found in the system and user directories. The check buttons enable and disable the plugins. The value of the $description variable is displayed in the list (so be sure to choose a suitable and suitably short value for that variable).

Note that the order in which the plugins are displayed is the same order in which files are checked against the plugins. User plugins are checked before system plugins. After that the plugins are ordered alphabetically. If you want your system plugins to be checked against the data first, choose a name that comes early in the alphabetical sense.
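The ordering rule can be written as a two-level sort. This is a sketch under assumptions: the check_order function and the name/origin hash keys are hypothetical, and I have assumed a plain case-sensitive string comparison for the alphabetical part, which may not be exactly what Athena does.

```perl
use strict;
use warnings;

# Sort plugin records so that user plugins come before system plugins and,
# within each group, names are in alphabetical order.
# Each record is a hash ref: { name => ..., origin => 'user' or 'system' }.
sub check_order {
    my (@plugins) = @_;
    return sort {
        ($a->{origin} eq 'user' ? 0 : 1) <=> ($b->{origin} eq 'user' ? 0 : 1)
            or ($a->{name} cmp $b->{name})
    } @plugins;
}
```

With this rule a user plugin called Zeta is still checked before a system plugin called Encoder, because the user/system distinction is applied first.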

Reformatting and data processing

When I originally conceived of the filetype plugins, their scope was the problem of importing data in a format not recognized by Ifeffit. The plugins can also be used for some data processing. For instance, you might do deadtime corrections using the ICR and OCR columns of an MED file using a plugin. In that case, the fix method would perform the deadtime correction as the data is streamed between the original file and the stash directory.

Another example of pre-processing is the filter I wrote for MED files from APS Sector 10. We try to run in a mode where dead time is negligible. However, the large number of columns tends to use way too much of Ifeffit's memory. That tends to slow Athena way down. A good solution to that problem was to use a plugin to strip all the unused columns, leaving only the ROI columns in the file in the stash directory. Note that the Sector 10 plugin is an example of using a plugin to alter a file that Athena could already read in a way that added value to the user's interaction with Athena.
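Stripping unused columns can be as simple as splitting each data line on white space and keeping a subset of the fields. The sketch below shows the idea only. The function name keep_columns is hypothetical, and which columns hold the ROI data depends entirely on the file, so the indices in the usage example are made up and are not taken from the actual Sector 10 plugin.

```perl
use strict;
use warnings;

# Keep only the listed columns (counting from 0) of a white space
# delimited data line and return them as a single space delimited line.
sub keep_columns {
    my ($line, @keep) = @_;
    my @fields = split ' ', $line;    # also drops leading space and the newline
    return join(' ', @fields[@keep]) . "\n";
}
```

For example, keep_columns("8979.5 100 200 300 400\n", 0, 2, 4) returns "8979.5 200 400\n". Inside a fix method this would be applied to each data line as it is streamed to the stash directory.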

System plugins and user plugins

Athena looks in two different places for these plugins. One place is in Athena's installation location where it finds the plugins that come with the horae distribution. The other is in the user's space ($HOME/.horae/Ifeffit/Plugins/Filetype/Athena/ on unix and C:\Program Files\Ifeffit\horae\Ifeffit\Plugins\Filetype\Athena\ on Windows). In both places, it reads the contents of the plugin directory and attempts to import the files which end in .pm.
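Finding the candidate files in a plugin directory might look something like the following. This is a sketch of the directory scan only, with a hypothetical function name plugin_names. It does not attempt to show how Athena actually loads the modules it finds.

```perl
use strict;
use warnings;

# Return the sorted base names (without the .pm extension) of all files in
# $dir that end in .pm. Returns an empty list if the directory is unreadable.
sub plugin_names {
    my ($dir) = @_;
    opendir my $dh, $dir or return;
    my @names;
    foreach my $entry (readdir $dh) {
        push @names, $1 if $entry =~ /^(.+)\.pm$/;
    }
    closedir $dh;
    return sort @names;
}
```

A file named X10C.pm in that directory would show up as X10C, which is also the last part of the package name the plugin must declare.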

Miscellaneous advice on plugins

  1. Cut-n-paste is an excellent way to get started on a new plugin. Make a copy of a plugin for a file that is similar to your own file and use that as the basis for your new plugin.
  2. X15B is an example of a plugin for a binary format.
  3. Encoder is an example of a plugin that uses both GUI elements and the persistent hash.
  4. Lambda is an example of using Ifeffit to perform the transformation. (Simply use Ifeffit qw(ifeffit); near line 5, then make use of the ifeffit function in the fix method.)

  5. You can use any module that you need, thus you have all of CPAN available to you when designing your plugin. If you need to do any seriously heavy lifting, check out the Math::Pari module or the Perl Data Language.

  6. Although a well-tested, robust plugin should be your goal, one of the nice features of the plugin architecture is that a "good-enough" plugin is easy to write and can quickly get you over a hurdle.


AthenaFiletypePlugins (last edited 2007-01-04 00:15:17 by BruceRavel)