From the session website:

We often encounter datasets that include weak experimental phases that we must depend on in order to solve our structures. The following two datasets are examples of data collected for the purposes of solving structures by Single-wavelength Anomalous Diffraction (SAD) phasing. The anomalous signal is present, but weak, and therefore care must be taken to preserve the anomalous signal for phasing. This demonstration will educate the user on how to preserve the anomalous differences in a dataset.

Introduction

  • Data kindly provided by: Juergen Bosch
  • NOTE: This example includes SeMet data collected at two different wavelengths (and two different beamlines).
  • Sample: Plasmodium falciparum 6-pyruvoyl tetrahydropterin synthase (PTPS)
  • PDB Code: 1Y13
  • ALS:
    • Date data were collected: June 27, 2004
    • Beamline where data were collected: ALS 8.2.1
    • Image file format: The individual images were collected on an ADSC Quantum 210 detector.
  • SSRL:
    • Date data were collected: May 17, 2004
    • Beamline where data were collected: SSRL BL9-1
    • Image file format: The individual images were collected on an ADSC Quantum detector.

What data do we have?

Let's look at the datasets from the two synchrotron trips separately.

ALS data

Running

% find_images

shows two datasets: j1603b3PK_1_E1_001.img to j1603b3PK_1_E1_592.img and j1603b3PK_1_E2_001.img to j1603b3PK_1_E2_591.img. A bit more detail can be seen with

% imginfo j1603b3PK_1_E1_001.img j1603b3PK_1_E1_592.img

which gives

################# File = j1603b3PK_1_E1_001.img

>>> Image format detected as ADSC

===== Header information:
date                                = 27 Jun 2004 07:04:21
exposure time             [seconds] = 5.000
distance                       [mm] = 210.000
wavelength                      [A] = 0.979400
Phi-angle (start, end)     [degree] = 0.000 0.250
Oscillation-angle in Phi   [degree] = 0.250
Omega-angle                [degree] = 0.000
2-Theta angle              [degree] = -0.003
Pixel size in X                [mm] = 0.102400
Pixel size in Y                [mm] = 0.102400
Number of pixels in X               = 2048
Number of pixels in Y               = 2048
Beam centre in X               [mm] = 105.100
Beam centre in X            [pixel] = 1026.367
Beam centre in Y               [mm] = 101.050
Beam centre in Y            [pixel] = 986.816
Overload value                      = 65535

################# File = j1603b3PK_1_E2_001.img

>>> Image format detected as ADSC

===== Header information:
date                                = 27 Jun 2004 07:04:57
exposure time             [seconds] = 5.000
distance                       [mm] = 210.000
wavelength                      [A] = 0.918400
Phi-angle (start, end)     [degree] = 0.000 0.250
Oscillation-angle in Phi   [degree] = 0.250
Omega-angle                [degree] = 0.000
2-Theta angle              [degree] = -0.003
Pixel size in X                [mm] = 0.102400
Pixel size in Y                [mm] = 0.102400
Number of pixels in X               = 2048
Number of pixels in Y               = 2048
Beam centre in X               [mm] = 105.100
Beam centre in X            [pixel] = 1026.367
Beam centre in Y               [mm] = 101.050
Beam centre in Y            [pixel] = 986.816
Overload value                      = 65535

So this looks like a two-wavelength Se MAD dataset, collected at the inflection point and at a high-energy remote. We can get a bit more information about the actual data collection using
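As a quick cross-check of that interpretation, the two header wavelengths can be converted to photon energies with E = hc/λ and compared with the Se K absorption edge. The sketch below is illustrative only and not part of autoPROC: the function names are made up here, and the edge energy of about 12658 eV is a nominal tabulated value (the edge position in a real sample will differ slightly).

```python
# Convert X-ray wavelengths (Angstrom) to photon energies (eV): E = h*c / lambda.
HC_EV_ANGSTROM = 12398.42   # h*c in eV*Angstrom (rounded)
SE_K_EDGE_EV = 12658.0      # nominal Se K edge; an assumption, not from the headers


def wavelength_to_energy_ev(wavelength_angstrom):
    """Photon energy in eV for a wavelength given in Angstrom."""
    return HC_EV_ANGSTROM / wavelength_angstrom


def offset_from_se_edge_ev(wavelength_angstrom):
    """Energy difference (eV) between the photon energy and the nominal Se K edge."""
    return wavelength_to_energy_ev(wavelength_angstrom) - SE_K_EDGE_EV


# Wavelengths copied from the two image headers above.
for label, wavelength in (("E1", 0.979400), ("E2", 0.918400)):
    energy = wavelength_to_energy_ev(wavelength)
    offset = offset_from_se_edge_ev(wavelength)
    print(f"{label}: {wavelength:.4f} A = {energy:.1f} eV ({offset:+.1f} eV from edge)")
```

With the header values, E1 comes out at roughly 12659 eV (essentially on the edge) and E2 at roughly 13500 eV (well above it), which is consistent with an inflection/high-energy-remote pair.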

% imgdate.sh -s *.img > img.lis

which shows

# sorted list of: file, Epoch, Date, seconds-to-previous
j1603b3PK_1_E1_001.img 1088319861  27 Jun 2004 07:04:21 0
j1603b3PK_1_E1_002.img 1088319869  27 Jun 2004 07:04:29 8
j1603b3PK_1_E1_003.img 1088319877  27 Jun 2004 07:04:37 8
j1603b3PK_1_E1_004.img 1088319886  27 Jun 2004 07:04:46 9
j1603b3PK_1_E2_001.img 1088319897  27 Jun 2004 07:04:57 11
j1603b3PK_1_E2_002.img 1088319905  27 Jun 2004 07:05:05 8
j1603b3PK_1_E2_003.img 1088319913  27 Jun 2004 07:05:13 8
j1603b3PK_1_E2_004.img 1088319922  27 Jun 2004 07:05:22 9
j1603b3PK_1_E1_005.img 1088319933  27 Jun 2004 07:05:33 11
j1603b3PK_1_E1_006.img 1088319941  27 Jun 2004 07:05:41 8
j1603b3PK_1_E1_007.img 1088319949  27 Jun 2004 07:05:49 8
j1603b3PK_1_E1_008.img 1088319957  27 Jun 2004 07:05:57 8
j1603b3PK_1_E2_005.img 1088319968  27 Jun 2004 07:06:08 11
...
j1603b3PK_1_E2_368.img 1088326532  27 Jun 2004 08:55:32 9
j1603b3PK_1_E1_369.img 1088326543  27 Jun 2004 08:55:43 11
j1603b3PK_1_E1_370.img 1088326551  27 Jun 2004 08:55:51 8
j1603b3PK_1_E1_371.img 1088326560  27 Jun 2004 08:56:00 9
j1603b3PK_1_E1_372.img 1088326568  27 Jun 2004 08:56:08 8
j1603b3PK_1_E2_369.img 1088326579  27 Jun 2004 08:56:19 11
j1603b3PK_1_E2_370.img 1088326588  27 Jun 2004 08:56:28 9
j1603b3PK_1_E2_371.img 1088327966  27 Jun 2004 09:19:26 1378
j1603b3PK_1_E2_372.img 1088327974  27 Jun 2004 09:19:34 8
...

We can see:

  • the two wavelengths were interleaved, with 4 images per wavelength block
  • something must have happened between images 370 and 371 of the second wavelength (E2): there is a time gap of 1378 seconds, i.e. about 23 minutes
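
Both observations can also be extracted programmatically from the listing. The following is a minimal sketch, not part of autoPROC: it assumes the column layout shown above (file, epoch, date, seconds-to-previous), takes the wavelength label (E1/E2) from the file name, and uses an arbitrary 60-second threshold for what counts as a gap. The two excerpts are copied from the listing and kept separate so that the elided lines are not mistaken for gaps.

```python
from itertools import groupby

HEAD = """\
j1603b3PK_1_E1_001.img 1088319861  27 Jun 2004 07:04:21 0
j1603b3PK_1_E1_002.img 1088319869  27 Jun 2004 07:04:29 8
j1603b3PK_1_E1_003.img 1088319877  27 Jun 2004 07:04:37 8
j1603b3PK_1_E1_004.img 1088319886  27 Jun 2004 07:04:46 9
j1603b3PK_1_E2_001.img 1088319897  27 Jun 2004 07:04:57 11
j1603b3PK_1_E2_002.img 1088319905  27 Jun 2004 07:05:05 8
j1603b3PK_1_E2_003.img 1088319913  27 Jun 2004 07:05:13 8
j1603b3PK_1_E2_004.img 1088319922  27 Jun 2004 07:05:22 9
j1603b3PK_1_E1_005.img 1088319933  27 Jun 2004 07:05:33 11
j1603b3PK_1_E1_006.img 1088319941  27 Jun 2004 07:05:41 8
j1603b3PK_1_E1_007.img 1088319949  27 Jun 2004 07:05:49 8
j1603b3PK_1_E1_008.img 1088319957  27 Jun 2004 07:05:57 8
"""

TAIL = """\
j1603b3PK_1_E1_369.img 1088326543  27 Jun 2004 08:55:43 11
j1603b3PK_1_E1_370.img 1088326551  27 Jun 2004 08:55:51 8
j1603b3PK_1_E1_371.img 1088326560  27 Jun 2004 08:56:00 9
j1603b3PK_1_E1_372.img 1088326568  27 Jun 2004 08:56:08 8
j1603b3PK_1_E2_369.img 1088326579  27 Jun 2004 08:56:19 11
j1603b3PK_1_E2_370.img 1088326588  27 Jun 2004 08:56:28 9
j1603b3PK_1_E2_371.img 1088327966  27 Jun 2004 09:19:26 1378
j1603b3PK_1_E2_372.img 1088327974  27 Jun 2004 09:19:34 8
"""


def parse_listing(text):
    """Return (file name, wavelength label, epoch seconds) for each image line."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or line.startswith("..."):
            continue
        fields = line.split()
        name, epoch = fields[0], int(fields[1])
        label = name.rsplit("_", 2)[1]   # "j1603b3PK_1_E1_003.img" -> "E1"
        records.append((name, label, epoch))
    return records


def block_lengths(records):
    """Lengths of consecutive runs of images sharing the same wavelength label."""
    return [(label, len(list(run))) for label, run in groupby(records, key=lambda r: r[1])]


def find_gaps(records, threshold_s=60):
    """Pairs of consecutive images separated by more than threshold_s seconds."""
    gaps = []
    for previous, current in zip(records, records[1:]):
        delta = current[2] - previous[2]
        if delta > threshold_s:
            gaps.append((previous[0], current[0], delta))
    return gaps


print(block_lengths(parse_listing(HEAD)))
print(find_gaps(parse_listing(TAIL)))
```

The time differences are recomputed from the epoch column rather than read from the last column, so the sketch does not depend on how that column was generated.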

SSRL data

Running

% find_images

reports 90 images: J11C05b3_12_001.img to J11C05b3_12_090.img. With

% imginfo J11C05b3_12_001.img

we get some more information

################# File = J11C05b3_12_001.img

>>> Image format detected as ADSC

===== Header information:
exposure time             [seconds] = 83.630
distance                       [mm] = 450.000
wavelength                      [A] = 0.979245
Phi-angle (start, end)     [degree] = 209.000 210.000
Oscillation-angle in Phi   [degree] = 1.000
Omega-angle                [degree] = 0.000
Pixel size in X                [mm] = 0.102588
Pixel size in Y                [mm] = 0.102588
Number of pixels in X               = 3072
Number of pixels in Y               = 3072
Beam centre in X               [mm] = 157.500
Beam centre in X            [pixel] = 1535.267
Beam centre in Y               [mm] = 157.500
Beam centre in Y            [pixel] = 1535.267

Unfortunately, the image header doesn't record the collection date, so we can't analyse the time course of the data collection as we did above to look for potential problems.
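
What the headers do allow is a simple consistency check of the geometry: the beam centre in pixels should equal the value in mm divided by the pixel size. A minimal sketch of that arithmetic, using the numbers from the imginfo output above (the helper name is made up for illustration):

```python
def mm_to_pixel(value_mm, pixel_size_mm):
    """Convert a detector coordinate from mm to pixels."""
    return value_mm / pixel_size_mm


# (beam centre X [mm], beam centre Y [mm], pixel size [mm]) copied from the headers
HEADERS = {
    "ALS": (105.100, 101.050, 0.102400),
    "SSRL": (157.500, 157.500, 0.102588),
}

for name, (x_mm, y_mm, pixel_mm) in HEADERS.items():
    x_px = mm_to_pixel(x_mm, pixel_mm)
    y_px = mm_to_pixel(y_mm, pixel_mm)
    print(f"{name}: beam centre = ({x_px:.3f}, {y_px:.3f}) pixel")
```

This reproduces the pixel values reported by imginfo for both datasets. It only shows that the header is self-consistent, not that the recorded beam centre matches the image.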


Looking at images

First let's check the ALS dataset:

ALS data

Image Full image Centre region Upper-left
j1603b3PK_1_E1_001 j1603b3PK_1_E1_001.smaller.png j1603b3PK_1_E1_001.centre.smaller.png j1603b3PK_1_E1_001.upper-left.smaller.png
j1603b3PK_1_E1_061 j1603b3PK_1_E1_061.smaller.png j1603b3PK_1_E1_061.centre.smaller.png j1603b3PK_1_E1_061.upper-left.smaller.png
j1603b3PK_1_E1_121 j1603b3PK_1_E1_121.smaller.png j1603b3PK_1_E1_121.centre.smaller.png j1603b3PK_1_E1_121.upper-left.smaller.png
j1603b3PK_1_E1_181 j1603b3PK_1_E1_181.smaller.png j1603b3PK_1_E1_181.centre.smaller.png j1603b3PK_1_E1_181.upper-left.smaller.png
j1603b3PK_1_E1_241 j1603b3PK_1_E1_241.smaller.png j1603b3PK_1_E1_241.centre.smaller.png j1603b3PK_1_E1_241.upper-left.smaller.png
j1603b3PK_1_E1_301 j1603b3PK_1_E1_301.smaller.png j1603b3PK_1_E1_301.centre.smaller.png j1603b3PK_1_E1_301.upper-left.smaller.png
j1603b3PK_1_E1_361 j1603b3PK_1_E1_361.smaller.png j1603b3PK_1_E1_361.centre.smaller.png j1603b3PK_1_E1_361.upper-left.smaller.png
j1603b3PK_1_E1_421 j1603b3PK_1_E1_421.smaller.png j1603b3PK_1_E1_421.centre.smaller.png j1603b3PK_1_E1_421.upper-left.smaller.png
j1603b3PK_1_E1_481 j1603b3PK_1_E1_481.smaller.png j1603b3PK_1_E1_481.centre.smaller.png j1603b3PK_1_E1_481.upper-left.smaller.png
j1603b3PK_1_E1_541 j1603b3PK_1_E1_541.smaller.png j1603b3PK_1_E1_541.centre.smaller.png j1603b3PK_1_E1_541.upper-left.smaller.png
j1603b3PK_1_E2_001 j1603b3PK_1_E2_001.smaller.png j1603b3PK_1_E2_001.centre.smaller.png j1603b3PK_1_E2_001.upper-left.smaller.png
j1603b3PK_1_E2_061 j1603b3PK_1_E2_061.smaller.png j1603b3PK_1_E2_061.centre.smaller.png j1603b3PK_1_E2_061.upper-left.smaller.png
j1603b3PK_1_E2_121 j1603b3PK_1_E2_121.smaller.png j1603b3PK_1_E2_121.centre.smaller.png j1603b3PK_1_E2_121.upper-left.smaller.png
j1603b3PK_1_E2_181 j1603b3PK_1_E2_181.smaller.png j1603b3PK_1_E2_181.centre.smaller.png j1603b3PK_1_E2_181.upper-left.smaller.png
j1603b3PK_1_E2_241 j1603b3PK_1_E2_241.smaller.png j1603b3PK_1_E2_241.centre.smaller.png j1603b3PK_1_E2_241.upper-left.smaller.png
j1603b3PK_1_E2_301 j1603b3PK_1_E2_301.smaller.png j1603b3PK_1_E2_301.centre.smaller.png j1603b3PK_1_E2_301.upper-left.smaller.png
j1603b3PK_1_E2_361 j1603b3PK_1_E2_361.smaller.png j1603b3PK_1_E2_361.centre.smaller.png j1603b3PK_1_E2_361.upper-left.smaller.png
j1603b3PK_1_E2_421 j1603b3PK_1_E2_421.smaller.png j1603b3PK_1_E2_421.centre.smaller.png j1603b3PK_1_E2_421.upper-left.smaller.png
j1603b3PK_1_E2_481 j1603b3PK_1_E2_481.smaller.png j1603b3PK_1_E2_481.centre.smaller.png j1603b3PK_1_E2_481.upper-left.smaller.png
j1603b3PK_1_E2_541 j1603b3PK_1_E2_541.smaller.png j1603b3PK_1_E2_541.centre.smaller.png j1603b3PK_1_E2_541.upper-left.smaller.png
  • the beamstop is quite small
  • the diffraction appears to weaken in later images

The direct beam position is slightly off the centre of the detector. This often causes problems when the beam centre visible on the image and the value recorded in the header do not agree. So let's quickly check that:

% adxv j1603b3PK_1_E1_001.img

will show the image

adxv_1.png

and the control panel

adxv_2.png

We will zoom in (selecting the "100%" button in the control panel)

adxv_3.png

and move the mouse to the visible beam centre

adxv_4.png

We can read off the pixel values as (1026,1060). How does that relate to the header values (1026.367,986.816) from above?

% beam8.sh 1026.367 986.816 2048

shows

Convention     Beam centre
----------------------------- 
 x, y      =  1026.37   986.82
-x, y      =  1021.63   986.82
 x,-y      =  1026.37  1061.18
-x,-y      =  1021.63  1061.18
 y, x      =   986.82  1026.37
-y, x      =  1061.18  1026.37
 y,-x      =   986.82  1021.63
-y,-x      =  1061.18  1021.63

So the header seems to follow the (x,-y) convention: that row, (1026.37, 1061.18), is the one closest to the position (1026,1060) read off the image.
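
The table and the comparison can be reproduced with a few lines of code. This is a sketch that mimics the output shown above, not the actual implementation of beam8.sh: it assumes that a "minus" sign means counting from the opposite detector edge (N - value, with N the number of pixels along an edge of a square detector) and that swapping x and y exchanges the two coordinates. The function names are hypothetical.

```python
import math


def beam_centre_conventions(x, y, n_pixels):
    """The eight candidate beam centres (in pixels) for a square detector."""
    flipped_x, flipped_y = n_pixels - x, n_pixels - y
    return {
        "x,y": (x, y),
        "-x,y": (flipped_x, y),
        "x,-y": (x, flipped_y),
        "-x,-y": (flipped_x, flipped_y),
        "y,x": (y, x),
        "-y,x": (flipped_y, x),
        "y,-x": (y, flipped_x),
        "-y,-x": (flipped_y, flipped_x),
    }


def closest_convention(header_xy, observed_xy, n_pixels):
    """Name of the convention whose beam centre lies nearest the observed one."""
    candidates = beam_centre_conventions(header_xy[0], header_xy[1], n_pixels)

    def distance(name):
        cx, cy = candidates[name]
        return math.hypot(cx - observed_xy[0], cy - observed_xy[1])

    return min(candidates, key=distance)


# Header values and the position read off in adxv, as given above.
print(closest_convention((1026.367, 986.816), (1026, 1060), 2048))
```

Picking the nearest candidate automates the visual comparison; when two candidates lie close together (for example with a nearly centred beam), inspecting the image remains the safer check.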

SSRL data

The images of the single scan dataset:

Image Full image Centre region Upper-left
1 J11C05b3_12_001.smaller.png J11C05b3_12_001.centre.smaller.png J11C05b3_12_001.upper-left.smaller.png
31 J11C05b3_12_031.smaller.png J11C05b3_12_031.centre.smaller.png J11C05b3_12_031.upper-left.smaller.png
61 J11C05b3_12_061.smaller.png J11C05b3_12_061.centre.smaller.png J11C05b3_12_061.upper-left.smaller.png
  • clean beamstop
  • severe ice-rings
  • lower resolution than the ALS data

Initial run

We first process the ALS dataset:

ALS data

We've already noticed that the beam centre recorded in the header needs to be converted in order to match the actual image. So we could tell autoPROC about that by using

% process BeamCentreFrom="header:x,-y" -d 01 | tee 01.lis

However, we also noticed that the diffraction seems to get weaker towards the end of data collection. So let's switch on a series of options that try to take care of such potential issues (apart from the beam centre convention and the loss of diffraction power, they also try to deal with ice-rings):

% process -M automatic -d 01 | tee 01.lis

This reads a so-called macro (named "automatic"); for a list of available macros, run process -M list.

For the inflection wavelength, this gives spacegroup P4212:

Summary data for   Project: Test Crystal: A Dataset: 0.979400

                                           Overall  InnerShell  OuterShell
---------------------------------------------------------------------------
  Low resolution limit                      33.345      33.345       2.437
  High resolution limit                      2.429      11.134       2.429

  Rmerge                                     0.081       0.055       0.442
  Ranom                                      0.079       0.052       0.400
  Rmeas (within I+/I-)                       0.086       0.057       0.480
  Rmeas (all I+ & I-)                        0.085       0.059       0.481
  Rpim  (within I+/I-)                       0.036       0.023       0.261
  Rpim  (all I+ & I-)                        0.026       0.019       0.188
  Total number of observations              265518        2369        1637
  Total number unique                        26953         321         254
  Mean(I)/sd(I)                               20.0        31.3         4.2
  Completeness                                98.6        95.3        98.8
  Multiplicity                                 9.9         7.4         6.4

  Anomalous completeness                      98.4        95.0        99.6
  Anomalous multiplicity                       5.3         5.4         3.3

and for the high-energy remote, spacegroup P41212:

Summary data for   Project: Test Crystal: A Dataset: 0.918400

                                           Overall  InnerShell  OuterShell
---------------------------------------------------------------------------
  Low resolution limit                      33.345      33.345       2.480
  High resolution limit                      2.473      11.134       2.473

  Rmerge                                     0.082       0.057       0.478
  Ranom                                      0.080       0.055       0.459
  Rmeas (within I+/I-)                       0.088       0.061       0.544
  Rmeas (all I+ & I-)                        0.086       0.061       0.519
  Rpim  (within I+/I-)                       0.036       0.025       0.289
  Rpim  (all I+ & I-)                        0.026       0.020       0.199
  Total number of observations              251499        2404        1152
  Total number unique                        25542         324         175
  Mean(I)/sd(I)                               19.7        30.4         4.1
  Completeness                                98.6        96.1        98.3
  Multiplicity                                 9.8         7.4         6.6

  Anomalous completeness                      98.3        97.2       100.0

The final files:

SSRL data

We can run with all defaults

% process -d 01 | tee 01.lis

to get spacegroup P41212 and

Summary data for   Project: Test Crystal: A Dataset: 0.97925

                                           Overall  InnerShell  OuterShell
---------------------------------------------------------------------------
  Low resolution limit                     131.822     131.822       3.458
  High resolution limit                      3.447      15.990       3.447

  Rmerge                                     0.103       0.037       0.392
  Ranom                                      0.096       0.032       0.385
  Rmeas (within I+/I-)                       0.112       0.036       0.447
  Rmeas (all I+ & I-)                        0.112       0.041       0.426
  Rpim  (within I+/I-)                       0.058       0.018       0.223
  Rpim  (all I+ & I-)                        0.044       0.018       0.160
  Total number of observations               62137         658         512
  Total number unique                         9595         128          79
  Mean(I)/sd(I)                               18.0        40.2         4.6
  Completeness                                97.8        97.7        81.4
  Multiplicity                                 6.5         5.1         6.5

  Anomalous completeness                      93.2        94.1        68.3
  Anomalous multiplicity                       3.6         3.3         3.9

and files:

Can those already be used for solving the structure? See the autoSHARP tutorial.
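
As a quick sanity check on the three summary tables above, the "Multiplicity" row should simply be the total number of observations divided by the number of unique reflections. A minimal sketch with the overall values copied from the tables (the names are made up for illustration):

```python
# (total observations, total unique) from the "Overall" column of each summary
OVERALL = {
    "ALS 0.979400": (265518, 26953),
    "ALS 0.918400": (251499, 25542),
    "SSRL 0.97925": (62137, 9595),
}


def multiplicity(n_observations, n_unique):
    """Average number of measurements per unique reflection."""
    return n_observations / n_unique


for name, (n_obs, n_unique) in OVERALL.items():
    print(f"{name}: multiplicity = {multiplicity(n_obs, n_unique):.1f}")
```

The results agree with the reported values to one decimal place. The SSRL data have noticeably lower multiplicity than either ALS wavelength, which matters when the anomalous signal is weak.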

Notes

The POINTLESS step, which determines the most likely spacegroup, comes up with P4212 (ALS inflection) and P41212 (ALS high-energy remote and SSRL data). Although there is still some checking to be done, it seems most likely that we have P41212 (or its enantiomorph P43212).
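
The difference between these candidates lies in the axial reflections. The sketch below encodes the standard reflection conditions for the three spacegroups (h00: h = 2n and 0k0: k = 2n from the 2_1 axes; 00l: l = 4n from the 4_1 or 4_3 axis). It is illustrative only and is not how POINTLESS scores spacegroups; the function name is hypothetical.

```python
def is_systematically_absent(hkl, spacegroup):
    """True if reflection hkl is forbidden by the screw axes of the spacegroup."""
    if spacegroup not in ("P4212", "P41212", "P43212"):
        raise ValueError(f"spacegroup not covered by this sketch: {spacegroup}")
    h, k, l = hkl
    # 2_1 screw axes along a and b: h00 and 0k0 are present only for even indices.
    if k == 0 and l == 0 and h % 2 != 0:
        return True
    if h == 0 and l == 0 and k % 2 != 0:
        return True
    # 4_1 or 4_3 screw axis along c: 00l is present only for l = 4n.
    if spacegroup in ("P41212", "P43212") and h == 0 and k == 0 and l % 4 != 0:
        return True
    return False


for hkl in ((0, 0, 2), (0, 0, 4), (3, 0, 0)):
    flags = {sg: is_systematically_absent(hkl, sg) for sg in ("P4212", "P41212", "P43212")}
    print(hkl, flags)
```

Only the 00l reflections separate P4212 from P41212, and there are few of them, so the decision can differ between datasets depending on which axial reflections were measured and how well. P41212 and P43212 share identical conditions, so systematic absences cannot distinguish the two enantiomorphs.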


Can we do better?

Work in progress

ALS data

SSRL data