TutorialSeaCoast2020

Content:

Introduction

Setup

Example data
Working with own data

Some potentially necessary performance tuning

Introduction

These are some notes for the SEA COAST 2020 workshop running at KMUTT (Bangkok) in January 2020. We will try to keep them up to date as we go along.

Remember, our software is usually run from the command line (within a terminal). You should be reasonably familiar with some basic shell commands: see e.g. here and the handout you should have received.

To keep the different program and tutorial runs organised, it might be a good idea to run everything related to autoPROC in a separate directory, e.g. doing

mkdir ~/autoPROC
cd ~/autoPROC

(the first command creates a sub-directory in your home directory and the second command changes your current working directory to that newly created one).

You should have a copy of the 2-sided autoPROC reference card in your handouts with the most commonly used command-line arguments. More details can be found in the online manual. The crucial bits to remember are:

the autoPROC command itself is called process and you get help by running process -h

you need to tell the program where your images are (usually via the -I flag)

you want to save all output into a separate sub-directory, since you never know whether you will need or want to run processing multiple times to fine-tune or optimize it. It is useful to decide on a logical system for naming output directories consistently, e.g. 01, 02, 03 etc. (and not "new", "newer", "newest", "test", "test1" etc.).

the main output/result file to look at (even while the job is running) is called summary.html and is located within the output sub-directory specified (you need to reload that file from time to time while the job is still running).
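The numbered naming scheme for output directories can be automated with a few lines of shell. The following is only a minimal sketch: next_run_dir is a hypothetical helper (not part of autoPROC) that prints the first name of the form PREFIX.01, PREFIX.02, ... that does not exist yet in the current directory.

```shell
# Sketch: print the next unused two-digit run directory name for a dataset.
# "next_run_dir" is a hypothetical helper, not part of autoPROC.
next_run_dir() {
  prefix=$1                      # e.g. 6OE7
  n=1
  while :; do
    name=$(printf '%s.%02d' "$prefix" "$n")
    # stop at the first name that does not exist yet
    [ -e "$name" ] || break
    n=$((n + 1))
  done
  printf '%s\n' "$name"
}

# Example: prints 6OE7.01 if no 6OE7.NN directory exists here yet
next_run_dir 6OE7
```

One possible use is to store the result in a variable first (e.g. out=$(next_run_dir 6OE7)) and then pass "$out" to the -d flag, so that the same name can also be used for a log file.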

Setup

It is very important to have everything set up to run our programs. That involves typing:

module load ccp4
module load xds
module load gphl
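A quick way to confirm that the setup worked is to ask the shell whether it can find the programs. This is a sketch only: check_tools is a hypothetical helper, and it assumes that the modules above put the commands process and xds on your PATH (adjust the names if your installation differs).

```shell
# Sketch: report whether the programs used in this tutorial can be found.
# Assumption: after "module load ..." the commands "process" and "xds"
# are on the PATH. "check_tools" is a hypothetical helper.
check_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found:   $tool"
    else
      echo "MISSING: $tool"
      missing=1
    fi
  done
  return "$missing"
}

check_tools process xds || echo "Some programs were not found: repeat the module load commands."
```

If anything is reported as missing, the module load commands probably need to be repeated in the terminal you are currently using.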

Example data

There are some local example datasets available at

~/Desktop/data

that can be used as examples for the different data processing tutorials during the workshop. After opening a terminal (click on the icon called "Terminal" on your desktop) you should be able to run autoPROC on these using e.g.

mkdir -p ~/autoPROC
cd ~/autoPROC
process -I ~/Desktop/data/6OE7 -d 6OE7.01 | tee 6OE7.01.lis

Let's look at the various components here:

process is the actual command for running autoPROC

-I ~/Desktop/data/6OE7 tells the program where the images are:

-I is a so-called command-line flag (for image directory)
~/Desktop/data/6OE7 is the directory name - where ~ is a useful shortcut for your home directory
please ensure that there are spaces between -I and ~/Desktop/data/6OE7!

-d 6OE7.01 defines where all output should go:

it is often a good idea to adopt some kind of naming convention for this
some numerical system for different runs might be handy: often we will run processing multiple times

the last bit is a bit of (normal) shell "magic":

the vertical bar | is called a "pipe", meaning that the output (i.e. everything that is usually written into the terminal) is passed over to another program
this other program is "tee" - which takes its name from T-junction (see what the man pages have to tell about it - by typing man tee in a terminal)
the nice thing about "tee" is that it will now write standard output into a file 6OE7.01.lis while at the same time also writing it to the terminal: so we get the best of both worlds, namely seeing what is being done in real-time and saving all this information for later
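If you want to see what the pipe and tee do without starting a processing job, you can try them with a harmless command. The example below uses echo in place of process and a temporary file in place of 6OE7.01.lis; both are just stand-ins for illustration.

```shell
# Demonstration of "| tee" with a harmless command instead of process.
log=$(mktemp)                              # temporary log file for this demo
echo "hello from the pipe" | tee "$log"    # printed to the terminal ...
cat "$log"                                 # ... and also stored in the file
rm -f "$log"                               # tidy up the temporary file
```

The same text appears twice: once written by tee to the terminal and once read back from the file by cat.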

While the job is running (should take about 5-6 minutes), you can open the main output file (summary.html) via

firefox ~/autoPROC/6OE7.01/summary.html

or by putting the full path to that file into the location bar of your browser (maybe in a separate tab to the one you are currently using):

/home/li-mth04/autoPROC/6OE7.01/summary.html

(remember you need to use the correct account name: ~ will not work here and li-mth04 is only correct for one of you).
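If you are unsure about your account name, the shell can print the full path for you. This small sketch relies on the standard HOME variable, which holds the location of your home directory, and assumes the output directory 6OE7.01 used above.

```shell
# Print the full path of summary.html for the current account.
# $HOME is expanded by the shell, so no account name has to be typed.
echo "$HOME/autoPROC/6OE7.01/summary.html"
```

The printed line can be copied and pasted into the location bar of the browser.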

We will use this 6OE7 example during the practical tutorial - and then go through the output to show you the type of information provided by autoPROC processing with XDS.

Working with own data

This would not be fundamentally different to using the example/tutorial data mentioned above. Some additional care should be taken though - especially in checking for any beamline-specific settings. If the beamline/instrument you used for collecting your data is not listed or the instrumentation/setup has changed without us being aware of it and data processing is not working, the most common reasons (apart from poor diffraction quality) are

rotation axis is inverted (relative to "standard" setup): add ReverseRotationAxis=yes to the command-line
rotation axis is vertical (instead of the more common horizontal orientation): add autoPROC_XdsKeyword_ROTATION_AXIS="0.0 -1.0 0.0" or autoPROC_XdsKeyword_ROTATION_AXIS="0.0 1.0 0.0"
beam-centre definitions as stored in image header follow unknown convention: you could try different conventions (e.g. BeamCentreFrom=header:y,x) or tell autoPROC to try and automatically test for it via BeamCentreFrom=getbeam:init
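When it is not clear which of these settings applies, one option is to try them one at a time, each in its own numbered output directory. The sketch below only prints the command lines instead of running them, so you can check them first. The image directory ~/images/MyData, the name MyData and the helper print_trials are hypothetical placeholders; the settings themselves are the ones listed above, and which of them (if any) is needed depends on your instrument.

```shell
# Sketch: print (not run) one command line per setting to try, each with
# its own numbered output directory and log file.
# "print_trials" and "MyData" are hypothetical names for illustration.
print_trials() {
  images=$1                      # directory containing your images
  n=1
  for setting in \
      'ReverseRotationAxis=yes' \
      'autoPROC_XdsKeyword_ROTATION_AXIS="0.0 -1.0 0.0"' \
      'BeamCentreFrom=header:y,x' \
      'BeamCentreFrom=getbeam:init'
  do
    out=$(printf 'MyData.%02d' "$n")
    echo "process -I $images -d $out $setting | tee $out.lis"
    n=$((n + 1))
  done
}

print_trials ~/images/MyData
```

Each printed line can then be copied into the terminal and run separately, and the resulting summary.html files compared.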

Have a close look at the information provided by the summary.html file, especially any warning messages that might point to problems with diffraction, crystal, instrumentation or processing.

Some potentially necessary performance tuning

To avoid issues with performance or your machine becoming unresponsive, it might be necessary to restrict the way we are running autoPROC a bit (in order to work with the 8 GB of memory the workshop machines have). This might include the following flags:

process -nthreads 1 ...

and/or

process -nthreads 1 xds=/usr/local/xtal/bin/xds ...

and/or

process -nthreads 1 xds=/usr/local/xtal/bin/xds autoPROC_XdsKeyword_NUMBER_OF_IMAGES_IN_CACHE=0 ...

(where ... represents any other arguments like -I or -d).
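Before deciding which of these restrictions to try, it may help to check what the machine you are sitting at actually offers. The sketch below uses the standard Linux tools nproc and free and assumes they are installed (each is skipped if it is not).

```shell
# Show how many CPU cores and how much memory this machine has (Linux).
# Each tool is only called if it is installed.
if command -v nproc >/dev/null 2>&1; then
  echo "CPU cores: $(nproc)"
fi
if command -v free >/dev/null 2>&1; then
  free -h      # the "Mem:" line shows total, used and free memory
fi
```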