ExamplesAutoSharp

Attachments
3ZFR_Ir.mtz	1MB
3ZFQ_Hg.mtz	302K
3ZFT.pir	159B
1o22.pir	206B
3isy_aimless_0.97934A.sca	189K
3ZFT_nat.mtz	1MB
4JM1.pir	96B
1GXT_hg.mtz	375K
1GXT.pir	102B
4IS3_truncate_0.97919.mtz	5MB
3GET.pir	380B
4IS3_truncate_0.91162.mtz	5MB
4ME8_truncate_0.97894.mtz	411K
3isy.pir	131B
4HPE.pir	322B
4ME8_truncate_0.91837.mtz	411K
4JM1_truncate_0.97917.mtz	1MB
3FFH_ala_MR.pdb	270K
4ME8_truncate_0.97944.mtz	411K
1o22_peak.sca	543K
3isy_aimless_0.91162A.sca	190K
3GET.sca	2MB
4J8P_truncate.mtz	1MB
4HPE_truncate.mtz	5MB
4JM1_truncate_0.97849.mtz	1MB
4IS3.pir	282B
4IS3_truncate_0.97936.mtz	5MB
1GXT_nat.mtz	674K
4ME8.pir	163B
4J8P.pir	171B

Content:

Notes

SAD (single wavelength)
MAD (multiple wavelength)
Using an initial (partial) model, e.g. from MR solution
SIRAS (native plus single derivative)
MIRAS (native plus multiple derivatives)

What to look out for?

Notes:

It might be a good idea to run these tutorials in a fresh directory on your system, e.g. via

       mkdir -p ~/Projects/autoSHARP/Tutorials
       cd ~/Projects/autoSHARP/Tutorials

If you are running these examples as part of a workshop, please make sure to follow the local instructions regarding locations and directories (i.e. where you should run jobs). If you are supposed to work in a specific directory, just do e.g.

             mkdir -p autoSHARP/Tutorials
             cd autoSHARP/Tutorials

Once you have changed directory ("cd" command above), it can be useful to remind yourself of where you are:

pwd

Downloading example files from here usually involves moving the mouse pointer over the file name, then "[right mouse click] -> Save Link As ...". You might (or might not) be asked where to save the file. You could just browse to the relevant directory (see "pwd" output above) or decide to save them all into some default directory (which could be e.g. "Downloads" in your home directory).

After downloading the relevant data files (sequence, reflection, initial model PDB etc), it might be necessary to move these files into the current working directory. Often, the browser will automatically save downloaded files in ~/Downloads (the tilde means "my HOME directory") and you could use e.g.

       mv ~/Downloads/1o22* .
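Before starting a job it may help to confirm that every file listed under "Download N files" really is in the current directory. The following is a minimal sketch only: check_inputs is a hypothetical helper name and not part of autoSHARP.

```shell
# Report any expected input file that is missing from the current directory.
# check_inputs is a hypothetical helper name, not an autoSHARP command.
check_inputs() {
    missing=0
    for f in "$@"; do
        if [ ! -f "$f" ]; then
            echo "missing: $f"
            missing=1
        fi
    done
    # Exit status 0 if all files exist, 1 otherwise.
    return "$missing"
}
```

For the first SAD example this could be called as: check_inputs 1o22.pir 1o22_peak.sca && echo "all inputs present"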

The commands shown below could be used as-is with the provided files - just cut-n-paste them into your terminal window:

mark the full command shown on this page with the left mouse button, move the mouse pointer back into your terminal window and then press "[middle mouse]". If this doesn't work, use the normal "Copy" and "Paste" functionality (usually with the right mouse button).
the "\" character marks a continuation and therefore is part of the command

Each of these commands will write some information to your terminal ("standard output"), including the name of an HTML file ("LISTautoSHARP.html"). You should open this HTML document in a browser (Firefox, Chrome, Safari etc) and reload it from time to time while autoSHARP is still running and writing to this file.
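Since the commands on this page also save the terminal output via "tee" into a ".lis" file, the HTML file name can be recovered from that file later. This is a rough sketch under assumptions: find_report is a hypothetical helper, and it assumes the file name appears in the log as part of a path without spaces.

```shell
# Print the first word ending in LISTautoSHARP.html found in a saved log.
# find_report is a hypothetical helper; it assumes the path has no spaces.
find_report() {
    grep -o '[^[:space:]]*LISTautoSHARP\.html' "$1" | head -n 1
}
```

A possible use would be: find_report autoSHARP_SAD-1.lis and then opening the printed path in your browser.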

The timings given below are for a reasonably fast computer with 4 threads: this might be faster on more modern and powerful machines or slower when running on some old hardware.

Most of the time will be spent in automatic model building and density modification (the actual HA substructure solution, completion and phasing typically takes only 2-10 minutes in most cases).

The automatic building can be skipped by adding "-nobuild" to the command-line (but this might not give as good density). You will still be able to visualise the initial electron density after density modification, which should give you an idea of whether you have solved the structure or not.
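As an illustration, the SAD-1 example from the next section could be turned into a "-nobuild" run as sketched here. The function only prints the command so that it can be inspected first. The helper name nobuild_command and the "_nobuild" suffix of the output names are made up for this sketch, and placing "-nobuild" directly after the script name is an assumption.

```shell
# Print (not run) the SAD-1 command with "-nobuild" added.
# nobuild_command and the "_nobuild" output names are made up for this sketch;
# the position of "-nobuild" on the command line is an assumption.
nobuild_command() {
    printf '%s\n' 'run_autoSHARP.sh -nobuild -seq 1o22.pir -ha "Se" -wvl 0.9778 peak -7 5 -sca 1o22_peak.sca -d autoSHARP_SAD-1_nobuild | tee autoSHARP_SAD-1_nobuild.lis'
}
```

Calling nobuild_command prints a single line that can then be pasted into the terminal window.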

SAD (single wavelength)

SAD-1 (1O22):

Download 2 files: 1o22.pir, 1o22_peak.sca
Command to run (time: 33 minutes 13 seconds)

      run_autoSHARP.sh \
          -seq 1o22.pir -ha "Se" \
          -wvl 0.9778 peak -7 5 -sca 1o22_peak.sca \
          -d autoSHARP_SAD-1 | tee autoSHARP_SAD-1.lis

      # or:

      run_autoSHARP.sh -seq 1o22.pir -ha "Se" -wvl 0.9778 peak -7 5 -sca 1o22_peak.sca -d autoSHARP_SAD-1 | tee autoSHARP_SAD-1.lis

SAD-2 (4J8P):

Download 2 files: 4J8P.pir, 4J8P_truncate.mtz
Command to run (time: 42 minutes 37 seconds)

      run_autoSHARP.sh \
          -seq 4J8P.pir -ha "Se" \
          -wvl 0.97858 peak -8.000 6.000  -mtz 4J8P_truncate.mtz \
          -d autoSHARP_SAD-2 | tee autoSHARP_SAD-2.lis

SAD-3 (4HPE):

Download 2 files: 4HPE.pir, 4HPE_truncate.mtz
Command to run (time: 2 hours 44 minutes 22 seconds)

      run_autoSHARP.sh \
          -seq 4HPE.pir -ha "Se" \
          -wvl 0.9794 peak -7.963 5.573  -mtz 4HPE_truncate.mtz \
          -d autoSHARP_SAD-3 | tee autoSHARP_SAD-3.lis

MAD (multiple wavelength)

Of course, any of these MAD examples could also be run as SAD, i.e. using only one of the wavelengths. Some should still work in those cases, while others might not. It might be a good exercise to try different combinations of SAD and/or MAD (e.g. 2 versus 3 wavelengths) to see the effect on HA substructure solution, phasing, density modification and final automatic building.

MAD-1 (3ISY):

Download 3 files: 3isy.pir, 3isy_aimless_0.97934A.sca, 3isy_aimless_0.91162A.sca
Command to run (time: 24 minutes 27 seconds)

      run_autoSHARP.sh \
          -seq 3isy.pir -ha "Se" \
          -wvl 0.97934 infl -11 3.3  -sca 3isy_aimless_0.97934A.sca \
          -wvl 0.91162 hrem -1.8 3.3 -sca 3isy_aimless_0.91162A.sca \
          -d autoSHARP_MAD-1 | tee autoSHARP_MAD-1.lis

MAD-2 (4JM1):

Download 3 files: 4JM1.pir, 4JM1_truncate_0.97849.mtz, 4JM1_truncate_0.97917.mtz
Command to run (time: 31 minutes 25 seconds)

      run_autoSHARP.sh \
          -seq 4JM1.pir -ha "Se" \
          -wvl 0.97849 peak -4.660 4.060  -mtz 4JM1_truncate_0.97849.mtz \
          -wvl 0.97917 infl -7.690 2.050  -mtz 4JM1_truncate_0.97917.mtz \
          -d autoSHARP_MAD-2 | tee autoSHARP_MAD-2.lis

MAD-3 (4IS3):

Download 4 files: 4IS3.pir, 4IS3_truncate_0.97936.mtz, 4IS3_truncate_0.91162.mtz, 4IS3_truncate_0.97919.mtz
Command to run (time: 2 hours 18 minutes 15 seconds)

      run_autoSHARP.sh \
          -seq 4IS3.pir -ha "Se" \
          -wvl 0.97936 infl -11.400 3.710  -mtz 4IS3_truncate_0.97936.mtz \
          -wvl 0.91162 hrem  -1.700 3.300  -mtz 4IS3_truncate_0.91162.mtz \
          -wvl 0.97919 peak  -8.700 6.670  -mtz 4IS3_truncate_0.97919.mtz \
          -d autoSHARP_MAD-3 | tee autoSHARP_MAD-3.lis

MAD-4 (4ME8):

Download 4 files: 4ME8.pir, 4ME8_truncate_0.97944.mtz, 4ME8_truncate_0.91837.mtz, 4ME8_truncate_0.97894.mtz
Command to run (time: 31 minutes 2 seconds)

      run_autoSHARP.sh \
          -seq 4ME8.pir -ha "Se" \
          -wvl 0.97944 infl -8.600 2.660  -mtz 4ME8_truncate_0.97944.mtz \
          -wvl 0.91837 hrem -1.800 3.400  -mtz 4ME8_truncate_0.91837.mtz \
          -wvl 0.97894 peak -6.860 4.580  -mtz 4ME8_truncate_0.97894.mtz \
          -d autoSHARP_MAD-4 | tee autoSHARP_MAD-4.lis
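As mentioned at the start of this section, each wavelength of a MAD example could also be tried on its own as SAD. The sketch below prints (but does not run) one such command per wavelength of MAD-4, reusing the values from the command above. The helper name print_sad_commands and the output names "autoSHARP_MAD-4_SAD-<label>" are made up for this sketch; the variable names fp and fpp assume that the two numbers after the label are f' and f''.

```shell
# Print (not run) one SAD command per wavelength of the MAD-4 example.
# print_sad_commands and the output directory names are made up here;
# fp/fpp assume the two numbers after the label are f' and f''.
print_sad_commands() {
    while read -r wvl label fp fpp mtz; do
        printf '%s\n' "run_autoSHARP.sh -seq 4ME8.pir -ha \"Se\" -wvl $wvl $label $fp $fpp -mtz $mtz -d autoSHARP_MAD-4_SAD-$label | tee autoSHARP_MAD-4_SAD-$label.lis"
    done <<'EOF'
0.97944 infl -8.600 2.660 4ME8_truncate_0.97944.mtz
0.91837 hrem -1.800 3.400 4ME8_truncate_0.91837.mtz
0.97894 peak -6.860 4.580 4ME8_truncate_0.97894.mtz
EOF
}
```

Each printed line can be pasted into the terminal separately, which makes it easy to compare the results of the individual wavelengths with those of the full MAD run.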

Using an initial (partial) model, e.g. from MR solution

autoSHARP allows the use of an input (already placed) model for any phasing scenario - in which case the de-novo HA substructure finding step (with SHELXC/D) is skipped and HA sites are found based on the initial phases from this model (via LLG residual maps in SHARP). Such a model could e.g. be an initial MR solution that is not accurate enough to allow refinement, or a single placed component of a hetero-multimer. There are a variety of situations where some meaningful initial PDB model is available. This model and the reflection data given to autoSHARP should have the same (correct) spacegroup as determined through the steps leading to the initial model.

Remember that this can be used with any of the methods supported by autoSHARP (SAD, MAD, SIRAS etc).

MR-1 (3GET):

Download 3 files: 3GET.pir, 3FFH_ala_MR.pdb, 3GET.sca
Command to run (time: 2 hours 39 minutes 27 seconds)

      run_autoSHARP.sh \
          -seq 3GET.pir -ha "Se" \
          -pdb 3FFH_ala_MR.pdb \
          -wvl 0.9789 peak -8 4 -sca 3GET.sca \
          -d autoSHARP_MR-1 | tee autoSHARP_MR-1.lis

SIRAS (native plus single derivative)

This could also be run as SAD, using only the derivative dataset.

SIRAS-1 (1GXT):

Download 3 files: 1GXT.pir, 1GXT_nat.mtz, 1GXT_hg.mtz
Command to run (time: 46 minutes 54 seconds)

      run_autoSHARP.sh \
          -seq 1GXT.pir \
          -nat -mtz 1GXT_nat.mtz \
          -ha "Hg" -nsit 2 -wvl 0.99970 peak -16 10 -mtz 1GXT_hg.mtz \
          -d autoSHARP_SIRAS-1 | tee autoSHARP_SIRAS-1.lis

MIRAS (native plus multiple derivatives)

This could also be run as SIRAS using only one of the derivative soaks.

MIRAS-1 (3ZFT):

Download 4 files: 3ZFT.pir, 3ZFT_nat.mtz, 3ZFQ_Hg.mtz, 3ZFR_Ir.mtz
Command to run (time: 56 minutes 41 seconds)

      run_autoSHARP.sh \
          -seq 3ZFT.pir \
          -nat -mtz 3ZFT_nat.mtz \
          -ha "Hg" -nsit 1 -wvl 1.54179 -mtz 3ZFQ_Hg.mtz \
          -ha "Ir" -nsit 2 -wvl 1.54179 -mtz 3ZFR_Ir.mtz \
          -d autoSHARP_MIRAS-1 | tee autoSHARP_MIRAS-1.lis

What to look out for?

There are several main themes one should consider:

Even before running autoSHARP: is there some indication of a useful heavy-atom signal in your data? E.g. for phasing on anomalous differences:

correlation between anomalous differences of random half-sets (often called "CC(ano)")
anomalous signal (DANO/sig(DANO) or SigAno), i.e. anomalous difference (absolute value) over its sigma

correlation between anomalous differences for distinct wavelengths (in the MAD case)
comparison of statistics ignoring or taking Friedel's Law into account (e.g. comparing Rmeas0 and Rmeas)
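To make the "anomalous difference (absolute value) over its sigma" indicator concrete, here is a small sketch that averages |DANO|/sig(DANO) over all rows of a plain text table. The two-column layout (DANO, then its sigma) and the helper name mean_ano_signal are assumptions made for this illustration only; data processing programs usually report this statistic per resolution shell, which is more informative than a single overall number.

```shell
# Mean |DANO|/sig(DANO) from a two-column text table (DANO, SIGDANO).
# The table layout is an assumption; rows with sigma <= 0 (or a text
# header) are skipped.
mean_ano_signal() {
    awk '($2 + 0) > 0 { d = $1 + 0; if (d < 0) d = -d; sum += d / $2; n++ }
         END { if (n > 0) printf "%.2f\n", sum / n; else print "no data" }' "$1"
}
```

For a hypothetical file dano.txt this would be called as: mean_ano_signal dano.txt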

Is there an indication of heavy-atom signal during data analysis in autoSHARP? This is reported at the SHELXC step during heavy-atom substructure solution.

Did the substructure solution step give a promising looking solution?

For multi-site searches we expect the first few sites to have a high (relative) occupancy.
Does the cross-word table show good scores (PSMF) between the first few sites, i.e. is the solution consistent with the Patterson map?
Do we see a more or less bimodal distribution of successful and incorrect trials (in the final set of scatter plots)?

Does heavy-atom refinement and phasing (including iterative interpretation of the LLG (residual) maps) give meaningful indications of phasing signal?

Does it keep all/most of the initial sites (suggesting that these are correct)?
Does it add additional (strong) sites up to the expected number?
At what resolution does the phasing power drop below 1.0? This should be in sync with the indications of heavy-atom signal mentioned above.

For a successful solution we would like to see a significant difference in score between the two hands (enantiomorphs):

the absolute value is not that important (although: the higher usually the better)
a score of 0.25 versus 0.11 would be good (or 0.5 versus 0.3, 0.15 versus 0.09 etc)

Some additional hints for running:

it is often not necessary to use the full resolution when working with really high-resolution data: rather, do a quicker initial test using

      run_autoSHARP.sh -R 50.0 2.6 ...

there are additional flags for speeding things up, which may be useful for initial tests:

"-fast" (with the danger of "missing" a solution)
"-nobuild" (stop after density modification)
"-nowarp" (skip the ARP/wARP building step and only use BUCCANEER for model building, with the final model still needing quite a bit of manual work)