To set up the environment and copy some of the example data over, please run

source /data/rapidata2/gphl.csh

whenever you connect to one of the processing machines. If everything works as expected, you should then be placed automatically into a directory like


which would contain several subdirectories with example data (all named according to the PDB identifier).

If you are interested in some of our work related to Covid-19: see here for data processing with autoPROC and our notes regarding refinement with BUSTER.


Several example datasets are available for running autoPROC:

1o22/Images   => 90 degree, 1.0 deg/image, CCD
3get/Images   => 90 degree, 1.0 deg/image, CCD
3isy/Images   => 2 wavelengths (90 degree, 1.0 deg/image), CCD
4hpe/Images   => 360 degree, 0.5 deg/image, Pilatus
4j8p/Images   => 100 degree, 0.5 deg/image, Pilatus
4jm1/Images   => 3 wavelengths, 0.5 deg/image, Pilatus
7jiw/Images   => 999 images, 0.3 deg/image, Pilatus

Or have a look at the examples here. Of course, the most interesting option would be to use one of your own datasets, if you have any available and can transfer them to the SSRL computers. Below are some suggestions on how to run full data processing on those datasets, but also see

  process -h

and

  process -M list

The simplest way is to run

  process -I /where/ever/image/directory -d out.01 | tee out.01.lis

All output will be written into subdirectory out.01 and standard output is saved into out.01.lis (but also written to the terminal - the tee command does this little trick). The most important output file is out.01/summary.html, so you could also run

  process -I /where/ever/image/directory -d out.01 > out.01.lis &
  firefox out.01/summary.html
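
When the job runs in the background, summary.html may not exist yet at the moment the browser is started. A minimal sketch of a polling helper follows. The function name wait_for_file and its default timeout and interval are assumptions made for illustration; it is not part of autoPROC. It is written for sh, so if your login shell is csh/tcsh, save it in a script and run that with sh.

```shell
#!/bin/sh
# Hypothetical helper (not part of autoPROC): poll until a file exists
# and is non-empty, or give up after a timeout.
wait_for_file() {
    file=$1            # path to wait for, e.g. out.01/summary.html
    timeout=${2:-600}  # seconds before giving up (assumed default)
    interval=${3:-5}   # seconds between checks (assumed default)
    waited=0
    while [ ! -s "$file" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "gave up waiting for $file" >&2
            return 1
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
    return 0
}

# Example use, placed in the same script after the background job was started:
# wait_for_file out.01/summary.html && firefox out.01/summary.html
```

The file will be rewritten as processing continues, so reloading the page in the browser from time to time is still needed.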

A few commonly used options are (... denotes rest of the arguments as described above):

  process -M LowResOrTricky ...                       # difficult data

  process -M HighResCutOnCChalf ...                   # isotropic high-resolution limit based
                                                      # on CC1/2 (instead of I/sig(I))

  process -M ScalingX ...                             # use XSCALE instead of AIMLESS scaling

These arguments (macros invoked via the -M flag) can also be combined.
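
As an illustration of combining macros, here is a small sketch that only prints a command line rather than running it. It assumes that macros are combined by repeating the -M flag once per macro, which is worth confirming with process -h. The function name build_process_cmd and the output directory out.02 are made up for this example.

```shell
#!/bin/sh
# Print (do not run) a process command line with one -M flag per macro.
# Assumption: macros are combined by repeating -M; check "process -h".
build_process_cmd() {
    images=$1   # image directory
    outdir=$2   # output subdirectory
    shift 2     # remaining arguments are macro names
    cmd="process"
    for macro in "$@"; do
        cmd="$cmd -M $macro"
    done
    echo "$cmd -I $images -d $outdir"
}

build_process_cmd 1o22/Images out.02 LowResOrTricky HighResCutOnCChalf
# prints: process -M LowResOrTricky -M HighResCutOnCChalf -I 1o22/Images -d out.02
```

Printing the command first makes it easy to check the flags before committing to a long processing run.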


You should be able to run the following commands for various SAD examples:

      process -ANO -M HighResCutOnCChalf -I 1o22/Images -d 1o22_process.01 | tee 1o22_process.01.lis
      process -ANO -M HighResCutOnCChalf -I 4hpe/Images -d 4hpe_process.01 | tee 4hpe_process.01.lis
      process -ANO -M HighResCutOnCChalf -I 4j8p/Images -d 4j8p_process.01 | tee 4j8p_process.01.lis
      process -ANO -M HighResCutOnCChalf -I 7jiw/Images -d 7jiw_process.01 | tee 7jiw_process.01.lis
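
The four commands above differ only in the PDB identifier, so they can be generated in a loop. The sketch below is one way to do that in sh. The DRY_RUN variable and the function name run_sad_examples are assumptions introduced here: by default the commands are only printed, and setting DRY_RUN=0 would run them one after the other from the directory that contains the example subdirectories.

```shell
#!/bin/sh
# Loop over the four SAD example datasets listed above.
# DRY_RUN=1 (the default here) only prints the commands;
# DRY_RUN=0 runs them sequentially.
DRY_RUN=${DRY_RUN:-1}

sad_examples="1o22 4hpe 4j8p 7jiw"

run_sad_examples() {
    for id in $sad_examples; do
        out="${id}_process.01"
        if [ "$DRY_RUN" = 1 ]; then
            echo "process -ANO -M HighResCutOnCChalf -I $id/Images -d $out | tee $out.lis"
        else
            process -ANO -M HighResCutOnCChalf -I "$id/Images" -d "$out" | tee "$out.lis"
        fi
    done
}

run_sad_examples
```

Running the jobs sequentially keeps each one from competing with the others for CPU threads, at the cost of a longer total wait.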

Instead of waiting for the program to finish, you can open a browser (firefox - the globe icon at the bottom of your desktop) and go to the summary.html file of a particular job, e.g.


(substitute the correct rd20NN number etc).


As you will see, you have several subdirectories available: one for each of the examples. You can then look at the whole list of examples and run each of them with the command line shown, after changing into the corresponding directory. E.g.

  cd 1o22 \
    -seq 1o22.pir -ha "Se" \
    -wvl 0.9778 peak -7 5 -sca \
    -d autoSHARP_SAD-1 | tee autoSHARP_SAD-1.lis

Remember: look at the autoSHARP reference card (PDF) for more help. Or run -h

for online help.

We can also run all of those examples with two extra flags to go for speed: -fast -nowarp ...

On the fast 72-thread processing machines (pxproc01 to pxproc12, Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz) we get the following results (chains/ASU and #residues refer to the deposited PDB model; #built, #chains and #sequenced are what you get out of autoSHARP):

PDB   Phasing type    sequence  chains/ASU  #residues  #built  #chains  #sequenced  time
1o22  Se-SAD               169           1        149     153        1         153  5 minutes 47 seconds
4j8p  Se-SAD               158           1        156     156        1         155  6 minutes 19 seconds
4hpe  Se-SAD               307           6       1735    1827       13        1765  33 minutes 31 seconds
3isy  Se-MAD, 2 wvls       119           1        117     118        1         118  7 minutes 49 seconds
4jm1  Se-MAD, 2 wvls        83           1         84      77        2          73  5 minutes 39 seconds
4is3  Se-MAD, 3 wvls       267           4        997    1009        5        1000  30 minutes 17 seconds
4me8  Se-MAD, 3 wvls       150           1        117     105        4          80  8 minutes 19 seconds
3get  MR-SAD (Se)          364           2        726
1gxt  SIRAS (Hg)            90           1         88      88        3          85  10 minutes 19 seconds
3zft  MIRAS (Hg, Ir)       147           1        148     142        5         123  7 minutes 33 seconds

The MR-SAD example (3get) didn't work here, and the 4ME8 result also looks like it could have been better. But as you can see, a lot of those jobs worked fine in a very short time: ideal for a tutorial and for trying multiple examples.

autoSHARP examples

It might be interesting to use one of the autoPROC examples during the tutorials: to see the combination of data processing and experimental phasing together. For this you could run one of the following commands:

              -fast -nowarp \
              -seq 1o22/1o22.pir -ha "Se" \
              -wvl 0.9778 peak -7 5 -sca 1o22_process.01/ \
              -d 1o22_autoSHARP.01 | tee 1o22_autoSHARP.01.lis
  • 4HPE (will run for quite some time: ~2h)
    • weak low-resolution anomalous signal
    • initially solved by molecular replacement (even though it is a Se-MET protein)
              -nowarp \
              -seq 4hpe/4HPE.pir -ha "Se" \
              -wvl 0.9794 peak -8 5.6 -sca 4hpe_process.01/ \
              -d 4hpe_autoSHARP.01 | tee 4hpe_autoSHARP.01.lis
  • 4J8P (~7 min)
    • 1 Met in 159 residues
    • originally solved with autoSHARP (SHELXC/D and SHARP)
              -fast -nowarp \
              -seq 4j8p/4J8P.pir -ha "Se" \
              -wvl 0.97858 peak -8 6 -sca 4j8p_process.01/ \
              -d 4j8p_autoSHARP.01 | tee 4j8p_autoSHARP.01.lis
  • 7JIW (~14 min)
    • this was initially not solved by Zn-SAD, but rather via molecular replacement
    • but the Zn signal should be strong enough to be used for experimental phasing
    • we'll have to guess the number of Zn sites per chain to some extent ...
              -fast -nowarp \
              -seq 7jiw/7JIW.seq -ha "Zn" -nsit 2 \
              -wvl 0.9778 hrem  -sca 7jiw_process.01/ \
              -d 7jiw_autoSHARP.01 | tee 7jiw_autoSHARP.01.lis


We could also cover refinement, restraint dictionaries, ligand fitting, screening campaigns etc. if needed. In the meantime, check out the BUSTER wiki for details and examples.