TutorialRapiData2022

Content:

Introduction
Example 1 (Lysozyme)
Example 2 (1o22)
Example 3 (3isy)

Introduction

These notes are specific to the RapiData 2022 workshop at SSRL. We assume that you know how to (1) connect to NX/NoMachine and (2) start a terminal on one of the processing machines, and that you (3) have some basic familiarity with the command line.

Please remember to run

    gphl

whenever you open a new terminal/shell - in order to have a correct setup for our software and these tutorials. The above command will create a tutorial directory you can then work in. If you want to use the latest version of our software (to be released in the next couple of weeks), please type

    gphllatest

You should now be able to see the online help via

    process -h

If you don't have your own data available for running autoPROC, you can have a look at some of the example data provided by SSRL staff: these are in

    /data/rapidata2/SampleData2022

with a separate directory for each project. In each you will find a sequence file (*.pir or *.seq) and a subdirectory Images (if the raw diffraction data is available). You can also run

    gphlinfo

to get to that information (and some more).
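If you prefer to look at the sample data directory directly, a small shell loop along the following lines could summarise which projects have an Images subdirectory and how many sequence files they contain. This is only a sketch: the function name list_projects and the variable SAMPLE_DIR are made up for this illustration, and it assumes the directory layout described above.

```shell
# Base directory from the text above; SAMPLE_DIR is a hypothetical override variable.
SAMPLE_DIR="${SAMPLE_DIR:-/data/rapidata2/SampleData2022}"

# list_projects BASE: print one summary line per project directory under BASE.
list_projects() {
    base="$1"
    for proj in "$base"/*/; do
        # Skip the unexpanded pattern if BASE is missing or empty.
        [ -d "$proj" ] || continue
        name=$(basename "$proj")
        if [ -d "${proj}Images" ]; then images="yes"; else images="no"; fi
        # Count sequence files (*.pir or *.seq) directly inside the project directory.
        nseq=0
        for f in "$proj"*.pir "$proj"*.seq; do
            if [ -f "$f" ]; then nseq=$((nseq + 1)); fi
        done
        echo "$name images=$images sequence_files=$nseq"
    done
}

list_projects "$SAMPLE_DIR"
```

The gphlinfo command remains the more complete source of information; the loop is just a quick way to see which projects come with raw images.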

Example 1 (Lysozyme)

We can run this via a simple command:

    process -I /data/rapidata2/SampleData2022/CoLyso_SAD/Images -ANO -d CoLyso_SAD.01 | tee CoLyso_SAD.01.lis

As you can see:

process is the command to run the autoPROC program

we give it three arguments here:

-I /data/rapidata2/SampleData2022/CoLyso_SAD/Images to tell it where to find a directory with images
-ANO since we already know (from deciding on a data collection strategy) that we have high multiplicity and are looking for an anomalous signal
-d CoLyso_SAD.01 to define the output directory for all results

we also "pipe" (vertical bar) the standard output into the tee command to show and save all the information written to the terminal by autoPROC.

see also the autoPROC reference card
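To see what the pipe into tee does without running a full processing job, the following sketch uses a stand-in function in place of process. The function fake_job and the file name demo.lis are invented for this illustration only.

```shell
# Work in a scratch directory so that nothing in the tutorial directory is overwritten.
workdir=$(mktemp -d)

# Stand-in for a program that writes progress messages to standard output
# (hypothetical; in the tutorial this role is played by the "process" command).
fake_job() {
    echo "step 1: indexing"
    echo "step 2: integration"
    echo "step 3: scaling"
}

# The messages appear on the terminal AND are saved in demo.lis.
fake_job | tee "$workdir/demo.lis"

# The saved copy can be inspected later, e.g. by counting lines or searching it.
wc -l < "$workdir/demo.lis"
grep -i "scaling" "$workdir/demo.lis"
```

Note that a plain pipe only passes on standard output; anything a program writes to standard error would still be shown on the terminal but not saved in the file unless it is redirected as well (for example with 2>&1 before the pipe).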

One of the first things autoPROC does is to create an HTML document within the output directory - in this case CoLyso_SAD.01/summary.html. This contains very similar information to standard output (which we save via the tee pipe), but enhanced via graphs, tables, links to documentation/reports and a clear annotation of the various results (including deposition-ready mmCIF files). You might want to open this file in a browser, not only to follow the progress of the processing job, but also to learn about any interesting features - reported as notes, warnings and (hopefully no) error messages.
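Since summary.html only exists once the job has started writing its output, a small helper like the one below could be used in a second terminal to wait for the file before opening it. This is a sketch under assumptions: wait_for_file is a made-up helper (not part of autoPROC), and the browser command in the commented example is only a guess at what is installed on the processing machines.

```shell
# wait_for_file FILE [TIMEOUT_SECONDS]
# Poll once per second until FILE exists; give up after TIMEOUT_SECONDS (default 60).
wait_for_file() {
    file="$1"
    timeout="${2:-60}"
    waited=0
    while [ ! -f "$file" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "gave up waiting for $file" >&2
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    echo "$file is available"
}

# Example, run from the directory in which "process" was started
# (firefox is an assumption; use whatever browser is available):
#   wait_for_file CoLyso_SAD.01/summary.html 120 && firefox CoLyso_SAD.01/summary.html &
```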

Example 2 (1o22)

Since we use 1o22 as an example in autoSHARP, we can also use it as an example for data-processing:

    process -I /data/rapidata2/SampleData2022/1o22/Images -d 1o22.01 | tee 1o22.01.lis

You might want to open 1o22.01/summary.html once that job is on its way - and remember to "reload" that document from time to time (since new content will appear throughout the processing job).

Example 3 (3isy)

This is a 2-wavelength MAD dataset that can be processed using

    process -I /data/rapidata2/SampleData2022/3isy/Images -d 3isy.01 | tee 3isy.01.lis

It will show you how autoPROC first processes each dataset separately and then scales them all together (while taking care that the data for each wavelength are merged separately). There is also some indication of anisotropy in that dataset, which is analysed by STARANISO as part of that autoPROC run.

This might be a good opportunity to become aware of the additional scaling paths within the autoPROC system:

XSCALE path (via -M ScalingX argument)

AIMLESS-only scaling path (via -M ScalingA3 argument)
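As an illustration, the two arguments above could be combined with the command line from this example as sketched below. The script only prints the commands so that they can be checked before being run by hand; the helper build_cmd and the output directory names 3isy.02 and 3isy.03 are made up here, following the naming pattern used earlier.

```shell
# Image directory as used in the 3isy example above.
IMAGES=/data/rapidata2/SampleData2022/3isy/Images

# build_cmd MACRO OUTDIR: print (not run) a process command using the given -M argument.
# build_cmd is a hypothetical helper for this sketch only.
build_cmd() {
    echo "process -I $IMAGES -M $1 -d $2 | tee $2.lis"
}

build_cmd ScalingX  3isy.02   # XSCALE path
build_cmd ScalingA3 3isy.03   # AIMLESS-only scaling path
```

Using a separate output directory for each run keeps the results of the different scaling paths side by side, so that the corresponding summary.html files can be compared.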