autoPROC Documentation : Examples

Copyright    © 2004-2015 by Global Phasing Limited
 
  All rights reserved.
 
  This software is proprietary to and embodies the confidential technology of Global Phasing Limited (GPhL). Possession, use, duplication or dissemination of the software is authorised only pursuant to a valid written licence from GPhL.
Documentation    (2004-2015)  Clemens Vonrhein, Claus Flensburg, Wlodek Paciorek & Gérard Bricogne
 
Contact proc-develop@GlobalPhasing.com


Simple one-liners

Here we show some relatively simple examples of how to run autoPROC and discuss the context of each command.

Simple case with images in current directory and correct image headers:

process -d 01
This assumes that the processing output should go into a subdirectory (here 01) of the directory containing the images, or that symbolic links were created to make the images visible in the current directory.

Of course, the above command does not save standard output, so it is better to run

process -d 01 > 01.lis 2>&1          # bash/sh/zsh/kash
- or -
process -d 01 >& 01.lis              # tcsh/csh
in order to save standard output in a file (here 01.lis). It seems a good idea to name output (sub)directory and file with standard output in a consistent way (and not using "test", "new", "newer" etc).
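One way to keep directory and log names in step is to derive both from a single run identifier. The sketch below is only an illustration: the helper name next_run_id and the two-digit numbering scheme are assumptions and not part of autoPROC. It prints the command rather than running it, so the result can be checked first.

```shell
#!/bin/sh
# next_run_id (hypothetical helper, not part of autoPROC):
# print the first two-digit ID (01 ... 99) for which neither the output
# directory <ID> nor the log file <ID>.lis exists in the current directory.
next_run_id() {
    n=1
    while [ "$n" -le 99 ]; do
        id=$(printf '%02d' "$n")
        if [ ! -e "$id" ] && [ ! -e "$id.lis" ]; then
            printf '%s\n' "$id"
            return 0
        fi
        n=$((n + 1))
    done
    echo "no free run ID left" >&2
    return 1
}

# Show the command for the next free ID; run it once it looks right.
if run=$(next_run_id); then
    echo "process -d $run > $run.lis 2>&1"
fi
```

With this scheme, directory 02 always belongs to log file 02.lis, which makes it easier to compare runs later.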

Images are in different location:

If autoPROC should be run in one location while the actual diffraction images are in a different directory, the -I flag can be used to point to this directory:
process -I /some/where/Images -d 01
Sometimes it might be useful to use an explicit specification of images to be used - e.g. if a range of images should be excluded or a larger dataset should be split into several parts.
process -Id "early,/some/where/Images,lyso_####.cbf,1,450" -d 01
or
process -Id "early,/some/where/Images,lyso_####.cbf,1,450" \
        -Id "late,/some/where/Images,lyso_####.cbf,451,900" \
        -d 01
... which isn't quite a one-liner any more.
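When several sweeps are defined, it can help to assemble the -Id values in a small script instead of typing them by hand. The following is a minimal sketch: the helper name sweep_spec is hypothetical, and the field order (identifier, image directory, file template, first image, last image) is simply read off the examples above. The command is printed for inspection, not executed.

```shell
#!/bin/sh
# sweep_spec (hypothetical helper): join the five fields of a -Id value,
# i.e. identifier, image directory, file template, first and last image.
sweep_spec() {
    printf '%s,%s,%s,%s,%s\n' "$1" "$2" "$3" "$4" "$5"
}

dir=/some/where/Images
template='lyso_####.cbf'

early=$(sweep_spec early "$dir" "$template" 1 450)
late=$(sweep_spec late "$dir" "$template" 451 900)

# Print the resulting command for inspection instead of running it.
printf 'process -Id "%s" -Id "%s" -d 01\n' "$early" "$late"
```

Keeping directory and template in variables means that only the image ranges need to be edited when the split point changes.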

Beam centre in image header is wrong:

For a general discussion please see autoPROC wiki.

If it is known that the beam centre in the image header is completely wrong and the correct value is known (or has been determined e.g. visually), one can give this to the program via the beam parameter:

process beam="1230 1330" -d 01
The beam coordinates have to be given in pixels (not mm). If the values in the image header are correct but use a different convention (the imginfo program and beam8.sh jiffy might be helpful here), the BeamCentreFrom parameter will define the conversion:
process BeamCentreFrom=header:-y,x -d 01
If the conversion itself is not known (and one can't determine this conversion visually from the actual diffraction images), then running once with
process BeamCentreFrom=getbeam:init -d 01
could be helpful - although it is not guaranteed to find the correct answer in all cases.
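Since the beam parameter expects pixels, a beam centre that is only known in mm has to be divided by the pixel size first. The sketch below shows just this unit conversion. The helper name mm_to_pixels and the example values (a beam centre of 211.56 mm, 228.76 mm and a pixel size of 0.172 mm) are assumptions for illustration; the origin and axis order of the detector are not handled here and still need to match what autoPROC expects.

```shell
#!/bin/sh
# mm_to_pixels (hypothetical helper): divide a distance in mm by the
# pixel size in mm and print the result with one decimal place.
mm_to_pixels() {
    LC_ALL=C awk -v mm="$1" -v px="$2" 'BEGIN { printf "%.1f\n", mm / px }'
}

# Assumed example values: beam centre at (211.56 mm, 228.76 mm) on a
# detector with 0.172 mm pixels.
x=$(mm_to_pixels 211.56 0.172)
y=$(mm_to_pixels 228.76 0.172)

# Print the command for inspection instead of running it.
echo "process beam=\"$x $y\" -d 01"
```

The pixel size should be taken from the image header or the detector specification rather than from this example.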

Problematic diffraction pattern or challenging project:

The first try could be to use the LowResOrTricky macro with
process -M LowResOrTricky -d 01
The main purpose of the settings within this macro (see process -M show for details) is to switch off parameter refinement during integration - since this can become unstable for problem cases. More cycles of integration and post-refinement are done to compensate for this, resulting in a longer run-time but hopefully better behaved integration. Please be aware that this will most likely cause problems for datasets where radiation damage induced a significant change in cell parameters, because then the assumption of a single set of cell parameters for all images no longer holds.

Running on data from a specific beamline:

A few beamlines require specific settings to describe the configuration (of detector, goniostat, direct beam and others). autoPROC provides several pre-defined beamline macros that we were able to test on datasets. No automatic activation of these macros will be done - it is up to the user to make use of them explicitly. Using
process -M list
will give a short summary of each macro while
process -M show
will show the actual parameter settings for each macro. Please remember that command-line arguments are processed in the order they are read (and macros are expanded at that point). Settings later on the command-line might therefore overwrite previous settings.

If there is a pre-defined macro for a given beamline, running

process -M SlsPXIII -d 01
will pick up those settings. Otherwise please check the autoPROC wiki for known (and hopefully up-to-date) settings of other beamlines.

When encountering problems

Often, the first time a user seriously reads the documentation (apart from a quick glance to get started) is when problems occur. Here we want to provide some pointers and tips to get over some of the more common and typical hurdles - although this is of course no substitute for a proper reading of this documentation or the autoPROC wiki.

What should one try if there are problems when running with (most) default settings? The different stages of a data processing run are looked at separately below.

Getting started

If autoPROC doesn't even find any images, this could be due to problems with the image headers. Image headers have an amazing mutation rate - maybe due to being close to hot synchrotron beams. Especially image headers that provide a simple and easy-to-read ASCII section at the top (ADSC and Pilatus mini-cbf) seem to attract the ambition to make them more complicated, inconsistent and generally different from everybody else's version of those headers.

The imginfo program within autoPROC is the tool that handles image formats and header content. We try to cater for as many variants of image headers as possible - but will not encode work-arounds for every single detector employed anywhere in the world (unless there is a very good reason for it that we can be convinced of). In most cases where autoPROC/imginfo fails to read the image header correctly, this is due to very unusual values or items being written into the header by the beamline control software. Unfortunately, we very rarely get a preview of changed image headers (beamline reconfiguration etc). If every beamline/instrument posted detailed and up-to-date information about the beamline configuration and the meaning of the image header items (sticking a post-it to the hutch door doesn't count), life would be easier for users and software developers by an order of magnitude.

As you can see from the rant above, image headers are a permanent pain for us and users: if you hit problems in that area, please contact us with some example files and (after we analyse what is going on) then get back to the beamline staff to ensure any problems are fixed asap ... so that other users won't hit the same problems as well.

During indexing

Indexing might fail for one of several reasons belonging to two groups: either the experiment is not described accurately enough or the diffraction pattern is problematic. To distinguish between the two it can be useful to process data from a test crystal (like Lysozyme or such) collected under the same or very similar conditions on the same instrument. If that also fails unexpectedly, the instrument settings should be analysed (see also here).

A common difference to the default settings is a change in direction for a positive rotation around the spindle axis. This can be specified using

process ReversePhi=yes ...
Another possibility (and sometimes in conjunction with the above) is a difference in beam centre specification. If
imginfo some.cbf
shows that the beam-centre is significantly (more than a couple of pixels) off the image centre, then the correct convention should be used with e.g.
process BeamCentreFrom=header:y,x ...
How do you find that convention? As mentioned above, the imginfo program and the beam8.sh jiffy might be helpful here.

If the detector was swung out by the 2-theta arm, the parameter autoPROC_TwoThetaAxis should be specified (if it differs from the rotation/omega axis). Also note that autoPROC assumes the beam centre to be specified at a 2-theta angle of datum position: if this is not the case, some back-calculation is required.

If data was collected using e.g. the Phi-axis of a multi-axis (Eulerian or Kappa) goniostat, the exact rotation needs to be calculated using the exact specifications of the goniostat. This involves the Kapparot system and the XdsGetRotationAxisViaKapparot parameter.

Of course, indexing might still fail (or give problems) if everything is correctly defined but it is the diffraction pattern that gives problems. By default, autoPROC will already try to optimise the indexing solution via the XdsOptimizeIdxrefAlways parameter. This can still fail and a few possible settings could be:

During integration

The most common reason for faulty integration is an instability of parameter refinement. This can be avoided by using
process -M LowResOrTricky ...
Sometimes it is better to exclude problematic image ranges right from the start, e.g. with
process -Id "early,/where/ever/images,lyso_####.cbf,1,435" ...
 - or -
process -Id "later,/where/ever/images,lyso_####.cbf,101,900" ...
 - or -
process -Id "middle,/where/ever/images,lyso_####.cbf,101,435" ...
If a few intermediate images are buggy (issues with beamline or detector hardware/electronics etc), these images could be moved away using
mv lyso_0201.cbf lyso_0201.cbf.bad
mv lyso_0202.cbf lyso_0202.cbf.bad
process -Id "allgood,/where/ever/images,lyso_####.cbf,1,900" ...
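If more than a handful of images need to be set aside, a small loop saves typing. The sketch below is an illustration under stated assumptions: the helper name set_aside is hypothetical, and the file names are assumed to follow the lyso_####.cbf template with four zero-padded digits. Renaming files in the original data directory is a destructive step, so working on symbolic links or a copy may be preferable.

```shell
#!/bin/sh
# set_aside (hypothetical helper): rename images to <name>.bad so that
# they no longer match the template. Usage: set_aside DIR NUMBER...
# Assumes files named lyso_NNNN.cbf (four digits, zero-padded).
# Give image numbers without leading zeros (e.g. 201, not 0201), since
# printf would read a leading zero as an octal prefix.
set_aside() {
    dir=$1
    shift
    for n in "$@"; do
        f=$(printf '%s/lyso_%04d.cbf' "$dir" "$n")
        if [ -e "$f" ]; then
            mv "$f" "$f.bad"
        else
            echo "not found, skipped: $f" >&2
        fi
    done
}

# Example call (left commented out so nothing is renamed by accident):
# set_aside /where/ever/images 201 202
```

Images that are not found are reported on standard error and skipped, so a typo in an image number does not stop the loop.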

During scaling

Scaling can be a complicated procedure: decisions need to be made about what images to include/exclude, which resolution limits to apply, what scaling protocol to use etc. The good thing is that scaling can be run separately within autoPROC using the aP_scale module - so several approaches can be tested without the need for re-running the full integration. However, if after scaling a significant range of images is excluded (for a variety of possible reasons), it might be a good idea to re-run the integration using only the known-to-be-good images in the first place.

The easiest way of improving the scaling results is to include only data from well diffracting image ranges. Running e.g.

aP_scale -mtz XDS_ASCII.mtz \
  -P Lyso nat early -b 1-435 -id early_01 > early_01_aP_scale.log
or
aP_scale -mtz XDS_ASCII.mtz \
  -P Lyso nat later -b 101-900 -id later_01 > later_01_aP_scale.log
or
aP_scale -mtz XDS_ASCII.mtz \
  -P Lyso nat middle -b 101-435 -id middle_01 > middle_01_aP_scale.log
This will obviously only give a complete dataset if high enough multiplicity was collected (another good argument for low-dose, high-multiplicity data collection strategies).
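
To compare several image ranges side by side, the aP_scale command lines can be generated in a loop. This is a minimal sketch that follows the pattern of the commands above: the helper name scale_cmd and the label:range notation are assumptions made for this example, and the commands are only printed so that they can be checked before being run.

```shell
#!/bin/sh
# scale_cmd (hypothetical helper): print one aP_scale command line for a
# given label and image range, following the pattern used above.
scale_cmd() {
    label=$1
    range=$2
    printf 'aP_scale -mtz XDS_ASCII.mtz -P Lyso nat %s -b %s -id %s_01 > %s_01_aP_scale.log\n' \
        "$label" "$range" "$label" "$label"
}

# One line per image range to be compared, written as label:range.
for spec in early:1-435 later:101-900 middle:101-435; do
    scale_cmd "${spec%%:*}" "${spec#*:}"
done
```

Since each run gets its own identifier and log file, the results of the different ranges can be inspected independently afterwards.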
Last modification: 10.06.2015