autoPROC Documentation : Examples

Copyright © 2004-2018 by Global Phasing Limited

All rights reserved.

This software is proprietary to and embodies the confidential technology of Global Phasing Limited (GPhL). Possession, use, duplication or dissemination of the software is authorised only pursuant to a valid written licence from GPhL.

Documentation (2004-2018) Clemens Vonrhein, Claus Flensburg, Wlodek Paciorek & Gérard Bricogne

Contact proc-develop@GlobalPhasing.com

autoPROC Wiki
autoPROC reference card (PDF) with short examples and typical usage of commands (for print-out as a two-sided A4 page)
Simple one-liners
When encountering problems

Simple one-liners

Here we show some relatively simple examples of how to run autoPROC and discuss the context of each command.

Simple case with images in current directory and correct image headers:

process -d 01

This assumes that you want to have the processing output in a subdirectory of the directory containing images. Or maybe symbolic links were created to have images visible in the current directory.

Of course, the above command will actually not save standard output. So it would be better to run

process -d 01 > 01.lis 2>&1          # bash/sh/zsh/ksh
- or -
process -d 01 >& 01.lis              # tcsh/csh

in order to save standard output in a file (here 01.lis). It seems a good idea to name output (sub)directory and file with standard output in a consistent way (and not using "test", "new", "newer" etc).
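One way to keep the names of the output directory and the log file consistent is to derive both from a single run identifier. The sketch below is an illustration only: the helper name next_run_id and the two-digit numbering scheme are assumptions and not part of autoPROC. It picks the first number for which neither a directory nor a .lis file exists yet and prints the matching command instead of running it.

```shell
# Hypothetical helper (not part of autoPROC): print the first two-digit
# run identifier (01, 02, ...) for which neither a directory nor a
# matching .lis file exists in the current directory.
next_run_id() (
  n=1
  while [ "$n" -le 99 ]; do
    id=$(printf '%02d' "$n")
    if [ ! -e "$id" ] && [ ! -e "$id.lis" ]; then
      printf '%s\n' "$id"
      exit 0
    fi
    n=$((n + 1))
  done
  exit 1   # all identifiers 01..99 are taken
)

# Print (rather than run) the command for the next free identifier.
id=$(next_run_id) && echo "process -d $id > $id.lis 2>&1"
```

The printed line can be pasted into a bash/sh/zsh session once it looks right.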

Images are in a different location:

If autoPROC should be run in one location while the actual diffraction images are in a different directory, the -I flag can be used to point to this directory:

process -I /some/where/Images -d 01

Sometimes it might be useful to use an explicit specification of images to be used - e.g. if a range of images should be excluded or a larger dataset should be split into several parts.

process -Id "early,/some/where/Images,lyso_####.cbf,1,450" -d 01

process -Id "early,/some/where/Images,lyso_####.cbf,1,450" \
        -Id "late,/some/where/Images,lyso_####.cbf,451,900" \
        -d 01

... which isn't quite a one-liner any more.
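For larger datasets the -Id specifications can be generated rather than typed. The following sketch assumes the "name,directory,template,first,last" form shown above; the helper name split_ids and the partN naming are hypothetical and chosen only for this illustration.

```shell
# Hypothetical helper (not part of autoPROC): print one -Id specification
# per chunk, in the "name,directory,template,first,last" form used above.
# Usage: split_ids PREFIX DIRECTORY TEMPLATE FIRST LAST CHUNK_SIZE
split_ids() (
  prefix=$1 dir=$2 template=$3 first=$4 last=$5 size=$6
  part=1
  while [ "$first" -le "$last" ]; do
    end=$((first + size - 1))
    [ "$end" -gt "$last" ] && end=$last      # last chunk may be shorter
    printf '%s%d,%s,%s,%d,%d\n' "$prefix" "$part" "$dir" "$template" "$first" "$end"
    first=$((end + 1))
    part=$((part + 1))
  done
)

# Images 1-900 in chunks of 450 images each.
split_ids part /some/where/Images 'lyso_####.cbf' 1 900 450
```

Each printed line is meant to be passed, in double quotes, to its own -Id flag.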

Beam centre in image header is wrong:

For a general discussion please see the autoPROC wiki.

If it is known that the beam centre in the image header is completely wrong and the correct value is known (or has been determined e.g. visually), one can give this to the program via the beam parameter:

process beam="1230 1330" -d 01

The beam coordinates have to be given in pixels (not mm). If the values in the image header are correct but use a different convention (the imginfo program and beam8.sh jiffy might be helpful here), the BeamCentreFrom parameter will define the conversion:

process BeamCentreFrom=header:-y,x -d 01

If the conversion itself is not known (and one can't determine this conversion visually from the actual diffraction images), then running once with

process BeamCentreFrom=getbeam:init -d 01

could be helpful - although it is not guaranteed to find the correct answer in all cases.
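Since the beam parameter expects pixels, a beam centre that is only known in mm has to be converted first by dividing by the pixel size. The sketch below shows just this arithmetic. The helper name mm_to_pixel is hypothetical, the coordinates and the pixel size of 0.172 mm are made-up example values, and any difference in origin or axis convention is ignored.

```shell
# Convert a beam-centre coordinate from mm to pixels by dividing by the
# pixel size (also in mm). The numbers below are made-up example values.
mm_to_pixel() {
  awk -v mm="$1" -v size="$2" 'BEGIN { printf "%.1f\n", mm / size }'
}

x=$(mm_to_pixel 211.56 0.172)
y=$(mm_to_pixel 228.76 0.172)

# Print (rather than run) the resulting command.
echo "process beam=\"$x $y\" -d 01"
```

It is worth comparing the converted values with the direct beam position seen on an actual image before relying on them.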

Problematic diffraction pattern or challenging project:

The first try could be to use the LowResOrTricky macro with

process -M LowResOrTricky -d 01

The main purpose of the settings within this macro (see process -M show for details) is to switch off parameter refinement during integration - since this can become unstable for problem cases. More cycles of integration and post-refinement are done to compensate for this, resulting in a longer run-time but hopefully better behaved integration. Please be aware that this will most likely cause problems for datasets where radiation damage induced a significant change in cell parameters, because then the assumption of a single set of cell parameters for all images no longer holds.

Running on data from a specific beamline:

A few beamlines require specific settings to describe the configuration (of detector, goniostat, direct beam and others). autoPROC provides several pre-defined beamline macros that we were able to test on datasets. No automatic activation of these macros will be done - it is up to the user to make use of them explicitly. Using

process -M list

will give a short summary of each macro while

process -M show

will show the actual parameter settings for each macro. Please remember that command-line arguments are processed in the order they are read (and macros are expanded at that point). Settings later on the command-line might therefore overwrite previous settings.
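The "later settings win" rule can be pictured with a toy model. The function below is not how autoPROC parses its command line; it merely scans NAME=VALUE arguments from left to right and reports the last value seen for a given name. The function name final_value and the parameter values are arbitrary choices for this sketch.

```shell
# Toy model only (not autoPROC code): report the value that a NAME=VALUE
# parameter ends up with when arguments are read from left to right.
# Usage: final_value NAME ARG...
final_value() (
  name=$1
  shift
  value=
  for arg in "$@"; do
    case $arg in
      "$name"=*) value=${arg#*=} ;;   # a later occurrence replaces an earlier one
    esac
  done
  printf '%s\n' "$value"
)

final_value RunIdxrefStartWithTop RunIdxrefStartWithTop=500 -d 01 RunIdxrefStartWithTop=1000
```

By the same logic, a macro placed after an explicit parameter could overwrite that parameter if the macro happens to set it, so giving -M first and individual parameters afterwards is the safer order.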

If there is a pre-defined macro for a given beamline, running

process -M SlsPXIII -d 01

will pick up those settings. Otherwise please check the autoPROC wiki for known (and hopefully up-to-date) settings of other beamlines.

When encountering problems

Often the first time a user starts some serious reading of documentation (apart from a quick glance to get started) is when problems occur. Here we want to provide some pointers and tips to get over some of the more common and typical hurdles - although this is of course no substitute for a proper reading of this documentation or the autoPROC wiki.

What should one try if there are problems when running with (most) default settings? There are different stages of the data processing run that we can look at separately below.

Getting started

If autoPROC doesn't even find any images, this could be due to

permission problems: are you allowed to read the images autoPROC is supposed to process? Check if
```
imginfo /where/ever/some.img
```
prints the content of the image header.
file naming: the automatic detection of image files is based on file names following certain naming conventions. If the image file names contain spaces or other "odd" (to UNIX) characters like "+", "#" or such, the automatic detection might fail. Also, localized character sets will cause problems - best is to stick with plain ASCII characters and numbers plus a few delimiters like ".", "-" and "_".
compressed files: autoPROC does support the processing of compressed files - but not by default. This can be switched on using a FindImages_AllowCompressedImages=yes parameter setting. But be aware that the actual handling of compressed files within XDS is done by uncompressing the file into a temporary file (usually within the /tmp directory) each time it needs to be read. So throughout a full autoPROC processing run it is easily possible to uncompress (and then delete) each file a dozen times ... which depending on your file system and network connection can be slower than doing an initial copy and uncompress to then process those already uncompressed images directly.
pointing to wrong directory: remember that autoPROC will look for images in the current directory or (if provided) in the directory given with the -I flag. Better double check that this is correct.
fragmented or incomplete datasets: the find_images tool looks for sets of files with a minimum number of consecutively numbered images and without too many gaps in this numbering. To process such partial or fragmented datasets with autoPROC you might want to use the -Id setting instead for full control (see below). Whether autoPROC is the right tool for such types of data is a different question.
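The file naming point above can be checked before starting a run. The sketch below is a hypothetical pre-flight check and not part of autoPROC: it lists every file name in a directory that contains characters other than plain ASCII letters, digits, ".", "-" and "_". The helper name odd_image_names is an assumption made for this example.

```shell
# Hypothetical pre-flight check (not part of autoPROC): list files in a
# directory whose names contain anything other than ASCII letters,
# digits, ".", "-" and "_". Prints nothing if all names look safe.
# Usage: odd_image_names DIRECTORY
odd_image_names() (
  LC_ALL=C                           # make A-Z, a-z mean plain ASCII ranges
  cd "$1" || exit 1
  for f in *; do
    [ -e "$f" ] || continue          # skip the literal "*" of an empty directory
    case $f in
      *[!A-Za-z0-9._-]*) printf '%s\n' "$f" ;;
    esac
  done
)

# Check the current directory.
odd_image_names .
```

Any name that is printed is a candidate for renaming (or for a symbolic link with a simpler name) before running autoPROC.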

Image headers have an amazing mutation rate - maybe due to being close to hot synchrotron beams. Especially image headers that provide a simple and easy to read ASCII section at the top (ADSC and Pilatus mini-cbf) seem to attract the ambition to make them more complicated, inconsistent and generally different from everybody else's version of those headers.

The imginfo program within autoPROC is the tool that handles image formats and header content. We try to cater for as many variants of image headers as possible - but will not encode work-arounds for every single detector employed anywhere in the world (unless there is a very good reason for it that we can get convinced of). In most cases where autoPROC/imginfo fails in reading the image header correctly, this is due to very unusual values or items being written into the header by the beamline control software. Unfortunately, we very rarely get a preview of changed image headers (beamline reconfiguration etc). If every beamline/instrument posted detailed and up-to-date information about the beamline configuration and the meaning of the image header items (sticking a post-it to the hutch door doesn't count), life would be easier for users and software developers by an order of magnitude.

As you can see from the rant above, image headers are a permanent pain for us and users: if you hit problems in that area, please contact us with some example files and (after we analyse what is going on) then get back to the beamline staff to ensure any problems are fixed asap ... so that other users won't hit the same problems as well.

During indexing

Indexing might fail for one of several reasons belonging to two groups: either the experiment is not described accurately enough or the diffraction pattern is problematic. To distinguish between the two it can be useful to process data from a test crystal (like Lysozyme or such) collected under the same or very similar conditions on the same instrument. If that also fails unexpectedly, the instrument settings should be analysed (see also here).

A common difference to the default settings is a change in direction for a positive rotation around the spindle axis. This can be specified using

process ReversePhi=yes ...

Another possibility (and sometimes in conjunction with the above) is a difference in beam centre specification. If

imginfo some.cbf

shows that the beam-centre is significantly (more than a couple of pixels) off the image centre, then the correct convention should be used with e.g.

process BeamCentreFrom=header:y,x ...

How do you find that convention:

use the table on the autoPROC wiki
visually determine the beam centre and compare it to the header values using the beam8.sh jiffy
use the getbeam program to check all 8 possibilities with
```
process BeamCentreFrom=getbeam:init ...
```

If the detector was swung out by the 2-theta arm, the parameter autoPROC_TwoThetaAxis should be specified (if it differs from the rotation/omega axis). Also note that autoPROC assumes the beam centre to be specified at a 2-theta angle of datum position: if this is not the case, some back-calculation is required.

If data was collected using e.g. the Phi-axis of a multi-axis (Eulerian or Kappa) goniostat, the exact rotation needs to be calculated using the exact specifications of the goniostat. This involves the Kapparot system and the XdsGetRotationAxisViaKapparot parameter.

Of course, indexing might still fail (or give problems) if everything is correctly defined but it is the diffraction pattern that gives problems. By default, autoPROC will already try to optimise the indexing solution via the XdsOptimizeIdxrefAlways parameter. This can still fail and a few possible settings could be:

fix some parameters during indexing via
```
process autoPROC_XdsKeyword_REFINEIDXREF="BEAM AXIS ORIENTATION CELL" ...
```
This will fix the detector origin position during indexing.
use only the strongest N spots with
```
process RunIdxrefStartWithTop=1000 ...
```
exclude higher resolution spots from indexing via
```
process RunIdxrefExcludeHighRes=3.5 ...
```

exclude reflections in ice-rings with

process RunIdxrefExcludeIceRingShells=yes ...

use a specific range (or ranges) of images for indexing

process autoPROC_XdsKeyword_SPOT_RANGE="1 10|101 110" ...
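When several of the indexing settings above are to be tried in turn, it helps to give each attempt its own output directory and log file. The loop below is only a sketch: it prints the commands instead of running them, the idx01, idx02, ... naming and the helper name indexing_trials are arbitrary choices, and the parameter values are the example values used above.

```shell
# Print (rather than run) one process command per indexing setting, each
# with its own output directory and log file. The "idxNN" naming scheme
# is an arbitrary choice for this sketch.
indexing_trials() (
  n=1
  for setting in "$@"; do
    id=$(printf 'idx%02d' "$n")
    printf 'process %s -d %s > %s.lis 2>&1\n' "$setting" "$id" "$id"
    n=$((n + 1))
  done
)

indexing_trials RunIdxrefStartWithTop=1000 \
                RunIdxrefExcludeHighRes=3.5 \
                RunIdxrefExcludeIceRingShells=yes
```

Settings whose values contain spaces or a "|" character (such as the SPOT_RANGE example) would need their quotes restored when the printed line is pasted into a shell.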

During integration

The most common reason for faulty integration is an instability of parameter refinement. This can be avoided by using

process -M LowResOrTricky ...

Sometimes it is better to exclude problematic image ranges right from the start, e.g. with

process -Id "early,/where/ever/images,lyso_####.cbf,1,435" ...
 - or -
process -Id "later,/where/ever/images,lyso_####.cbf,101,900" ...
 - or -
process -Id "middle,/where/ever/images,lyso_####.cbf,101,435" ...

If a few intermediate images are buggy (issues with beamline or detector hardware/electronics etc), these images could be moved away using

mv lyso_0201.cbf lyso_0201.cbf.bad
mv lyso_0202.cbf lyso_0202.cbf.bad
process -Id "allgood,/where/ever/images,lyso_####.cbf,1,900" ...
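If more than a handful of images need to be set aside, the renaming can be done in a loop. The helper below is a hypothetical sketch and not part of autoPROC. It assumes a four-digit, zero-padded image number (as implied by the lyso_####.cbf template) and expects the first and last image numbers as plain decimal integers without leading zeros.

```shell
# Hypothetical helper (not part of autoPROC): rename images FIRST..LAST
# by appending ".bad", assuming a four-digit zero-padded image number.
# Files that do not exist are skipped silently.
# Usage: set_aside PREFIX SUFFIX FIRST LAST   (run inside the image directory)
set_aside() (
  prefix=$1 suffix=$2 n=$3 last=$4
  while [ "$n" -le "$last" ]; do
    f=$(printf '%s%04d%s' "$prefix" "$n" "$suffix")
    [ -f "$f" ] && mv "$f" "$f.bad"
    n=$((n + 1))
  done
  exit 0
)
```

For example, `set_aside lyso_ .cbf 201 202` run inside the image directory renames images 201 and 202. Working on a copy or on symbolic links keeps the original data untouched.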

During scaling

Scaling can be a complicated procedure: decisions need to be made about what images to include/exclude, which resolution limits to apply, what scaling protocol to use etc. The good thing is that scaling can be run separately within autoPROC using the aP_scale module - so several approaches can be tested without the need for re-running the full integration. However, if after scaling a significant range of images is excluded (for a variety of possible reasons), it might be a good idea to re-run the integration using only the known-to-be-good images in the first place.

The easiest way of improving the scaling results is to include only data from well diffracting image ranges, for example by running

aP_scale -mtz XDS_ASCII.mtz \
  -P Lyso nat early -b 1-435 -id early_01 > early_01_aP_scale.log

aP_scale -mtz XDS_ASCII.mtz \
  -P Lyso nat later -b 101-900 -id later_01 > later_01_aP_scale.log

aP_scale -mtz XDS_ASCII.mtz \
  -P Lyso nat middle -b 101-435 -id middle_01 > middle_01_aP_scale.log

This will obviously only give a complete dataset if high enough multiplicity was collected (another good argument for low-dose, high-multiplicity data collection strategies).
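The three aP_scale runs above differ only in the name and the image range, so the commands can be generated from a short list. The sketch below prints the commands rather than running them. The helper name scale_commands and the NAME:FIRST-LAST argument form are assumptions made for this example, while the aP_scale options are copied unchanged from the commands above.

```shell
# Print (rather than run) one aP_scale command per image range, reusing
# the options shown above. Each argument has the form NAME:FIRST-LAST.
scale_commands() (
  for spec in "$@"; do
    name=${spec%%:*}      # text before the first ":"
    range=${spec#*:}      # text after the first ":"
    printf 'aP_scale -mtz XDS_ASCII.mtz -P Lyso nat %s -b %s -id %s_01 > %s_01_aP_scale.log\n' \
      "$name" "$range" "$name" "$name"
  done
)

scale_commands early:1-435 later:101-900 middle:101-435
```

Comparing the resulting log files side by side then shows which image range gives the best statistics without re-running the integration.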

Last modification: 16.05.2018
