What can I do to have it run faster?


Introduction

Everyone wants programs to run faster. The simplest step is to run them on the fastest computer available (in terms of clock speed as well as number of CPUs/cores) and to avoid expensive network traffic when reading images (try storing the images on a fast local disk). But sometimes these parameters are non-negotiable, and other options have to be used.

However, the speed gained from a shortcut or a skipped analysis step could just as well mean time lost further down the structure solution or refinement path, because of suboptimal data processing or even mistakes. This doesn't mean that running with these options will produce poor data - it all depends on the crystal and diffraction quality.

The main areas to speed up data processing are:

Making use of the XDS cluster feature

If your local XDS installation is configured to make use of a cluster, setting the MAXIMUM_NUMBER_OF_JOBS parameter will speed things up:

% process autoPROC_XdsKeyword_MAXIMUM_NUMBER_OF_JOBS=4 ...

This would fork 4 jobs at the COLSPOT and INTEGRATE stages. You might also want to set the related MAXIMUM_NUMBER_OF_PROCESSORS parameter; which combination of those two parameters is ideal depends completely on your local setup.
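As a starting point for choosing the two values, here is a minimal sketch that assumes a single multi-core machine (not a real cluster) and simply keeps jobs times processors within the number of available cores. The variable names and the splitting rule are illustrative assumptions, not an autoPROC or XDS recommendation, and the script only prints the command instead of running it:

```sh
#!/bin/sh
# Assumption: everything runs on one machine, so
# MAXIMUM_NUMBER_OF_JOBS * MAXIMUM_NUMBER_OF_PROCESSORS should not exceed the core count.
ncores=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)

jobs=4                                   # number of jobs we would like to fork
if [ "$ncores" -lt "$jobs" ]; then
    jobs=$ncores                         # never ask for more jobs than cores
fi
procs=$(( ncores / jobs ))               # cores left for each job (at least 1)

# Print the resulting command line; the trailing "..." stands for your usual arguments.
echo "process autoPROC_XdsKeyword_MAXIMUM_NUMBER_OF_JOBS=$jobs" \
     "autoPROC_XdsKeyword_MAXIMUM_NUMBER_OF_PROCESSORS=$procs ..."
```

On a real cluster the jobs may land on different nodes, so the best split can look quite different from this.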

Use as few images as possible

The only stage at which the number of images can be cut down is the initial indexing - or rather the spot search required for the subsequent determination of an orientation matrix. This can be done using one of our pre-defined macros:

% process -M fast ...

There is obviously also a potential danger to this (that's why it is not the default) - it might miss one of the following:

  • the diffraction pattern shows multiple lattices, split or streaky spots
  • radiation damage occurs (leading to changes in cell parameters or loss of diffraction)
  • the crystal is poorly centered (and moves out of the beam at certain orientations)
  • spatial overlaps within the image (poor spot separation) or within the oscillation range (overlaps due to too large an oscillation angle)
  • an incorrect beam centre value
  • any other pathological feature

Any of these could lead to wrong indexing solutions (in terms of space group, cell, orientation matrix, distance, beam centre etc). But if the crystal system is well known and the diffraction pattern is clean, using -M fast is an easy way of speeding up the initial indexing.


Avoid integrating areas of the image without diffraction

Ideally, the crystal-detector distance should have been chosen to make maximum use of the detector surface. This would mean that there is diffraction up to the edge of the detector (as opposed to the corners of a square detector). In that case one could change the default (which is to use the full area of the detector, right into the corners) by using

% process autoPROC_XdsKeyword_TRUSTED_REGION="0.0 1.05" ...

Incidentally, this also shows the general mechanism for setting specific XDS keywords.

Of course, there are situations where changing the default might not be adequate:

  • an extremely well diffracting crystal or one with very large unit cell parameters: in those cases it might not be possible to move the detector closer to the crystal - either because of hardware limitations or because of the danger of badly resolved spots
  • a low-intensity pass (sometimes called a 'low-resolution pass'): if the initial scan(s), exposed to get the highest resolution data possible, resulted in a large number of overloaded reflections, an additional pass with additional attenuation and a larger detector-crystal distance is advisable. This pass would collect those overloaded reflections but might show diffraction to the full extent of the detector.

Another option to restrict the used area of the detector is to use a high-resolution limit. If it is already known that on any image there are no diffraction spots beyond a given resolution limit, one can use e.g.

% process -R 999.9 1.6 ...

to limit the resolution to a maximum of 1.6 Å. This limit is used at both the spot searching and the integration stage. Some things to remember though:

  • The very first image of a scan/dataset might show weaker diffraction than images, say, 90 degrees apart. Crystals can have shapes that result in a different crystal volume in the beam at different rotation angles, or they can show severely anisotropic diffraction due to the internal crystal packing or disorder.
  • Some spots that are visible at high resolution on the images could well be coming from ice (built up around the sample or due to the cryo-cooling conditions and procedure).
  • It might be a good default to always slightly over-estimate the high-resolution limit after visual inspection of the images - as long as one avoids extremes (a 6 Å diffracting crystal at a detector-crystal distance that allows 1.5 Å data to be collected is clearly suboptimal).
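To judge whether a given limit is sensible for a particular experimental setup, it can help to work out which resolution the detector edge and corner actually correspond to. The sketch below uses Bragg's law, d = wavelength / (2 sin(theta)) with 2*theta = atan(radius / distance). It assumes the direct beam hits the centre of a flat detector mounted perpendicular to the beam; the function name and the numbers (1.0 Å wavelength, 200 mm distance, 150 mm half-width) are made up for illustration:

```sh
#!/bin/sh
# resolution_at_radius WAVELENGTH(A) DISTANCE(mm) RADIUS(mm)
# Hypothetical helper: resolution (in A) at a given radius from the direct beam,
# assuming a flat detector perpendicular to the beam and centred on it.
resolution_at_radius() {
    LC_ALL=C awk -v lambda="$1" -v dist="$2" -v r="$3" 'BEGIN {
        theta = atan2(r, dist) / 2                   # Bragg angle: 2*theta = atan(r / dist)
        printf "%.2f\n", lambda / (2 * sin(theta))   # d = lambda / (2 sin(theta))
    }'
}

resolution_at_radius 1.0 200 150      # edge of the detector   -> 1.58
resolution_at_radius 1.0 200 212.13   # corner (150 * sqrt(2)) -> 1.26
```

With these example values, restricting the data to 1.6 Å would exclude everything outside the detector edge, including the corners.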

Putting this all together

Combining several of these options can give a rather long command-line:

% process -Id LysoHepes,`pwd`,exp5_lyso_ligands_1_###.img,61,120 \
          -M fast \
          autoPROC_XdsKeyword_TRUSTED_REGION="0.0 1.05" \
          autoPROC_XdsKeyword_MAXIMUM_NUMBER_OF_JOBS=4 \
          -R 999.9 1.6 -d 03 | tee 03.lis
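Whether these options really save time depends on the data and the local setup, so it can be worth measuring. The following is a small sketch of a wrapper (the name time_run and the log-file naming are assumptions made for this example) that runs any command, stores its output in a log file and reports the elapsed wall-clock time:

```sh
#!/bin/sh
# time_run LABEL COMMAND [ARGS...]
# Hypothetical helper: run a command, write its output to LABEL.lis and
# report elapsed wall-clock seconds together with the exit status.
time_run() {
    label=$1
    shift
    start=$(date +%s)
    "$@" > "$label.lis" 2>&1
    status=$?
    end=$(date +%s)
    echo "$label: $(( end - start )) s (exit status $status)"
}

# Example use, with "..." standing for your usual arguments:
#   time_run default process ... -d 01
#   time_run fast    process -M fast ... -d 02
```

Comparing the two reported times (and the resulting data quality) for one representative dataset gives a better idea of the actual gain than any general rule.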