Notes:

  • It might be a good idea to run these tutorials in a fresh directory on your system, e.g. via
       mkdir -p ~/Projects/autoSHARP/Tutorials
       cd ~/Projects/autoSHARP/Tutorials
    • If you are running these examples as part of a workshop, please follow the local instructions regarding locations and directories (i.e. where you should run jobs). If you are supposed to work in a specific directory, just do e.g.
             mkdir -p autoSHARP/Tutorials
             cd autoSHARP/Tutorials
  • Once you have changed directory ("cd" command above), it can be useful to remind yourself of where you are:
      pwd
  • Downloading example files from here usually involves moving the mouse pointer over the file name, then "[right mouse click] -> Save Link As ...". You might (or might not) be asked where to save the file. You could just browse to the relevant directory (see "pwd" output above) or decide to save them all into some default directory (which could be e.g. "Downloads" in your home directory).
  • After downloading the relevant data files (sequence, reflection, initial model PDB etc), it might be necessary to move these files into the current working directory. Often, the browser will automatically save downloaded files in ~/Downloads (the tilde means "my HOME directory") and you could use e.g.
       mv ~/Downloads/1o22* .
  • The commands shown below can be used as-is with the provided files - just copy and paste them into your terminal window:
    • mark the full command shown on this page with the left mouse button, move the mouse back into your terminal window and then press "[middle mouse]". If this doesn't work, use the normal "Copy" and "Paste" functionality (usually with the right mouse button).
    • the "\" character marks a continuation and therefore is part of the command
  • Each of those commands will write some information into your terminal ("standard output") - including the name of an HTML file ("LISTautoSHARP.html"). You should open this HTML document in a browser (Firefox, Chrome, Safari etc) and reload it from time to time while autoSHARP is still running and writing to this file.
  • The timings given below are for a reasonably fast computer with 4 threads: runs might be faster on more modern and powerful machines or slower on old hardware.
  • Most of the time will be spent in automatic model building and density modification (the actual HA substructure solution, completion and phasing typically takes only 2-10 minutes in most cases).
  • The automatic building can be skipped by adding "-nobuild" to the command-line (but this might not give as good density). You will still be able to visualise the initial electron density after density modification, which should give you an idea of whether you have solved the structure or not.
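
The setup steps in the notes above can be followed by a small pre-flight check before launching a job. The following is a minimal sketch and not part of autoSHARP: the function name check_inputs is hypothetical, and it only tests that each named file exists and is non-empty in the current working directory.

```shell
# Hypothetical helper (not part of autoSHARP): list every named input file
# that is missing or empty in the current working directory.
# Returns 0 if all files are present, 1 otherwise.
check_inputs() {
    rc=0
    for f in "$@"; do
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f"
            rc=1
        fi
    done
    return "$rc"
}
```

For the SAD-1 example below, this could be called as "check_inputs 1o22.pir 1o22_peak.sca" after the "mv" step; no output means both files are in place.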

SAD (single wavelength)

 

SAD-1 (1O22):

      run_autoSHARP.sh \
          -seq 1o22.pir -ha "Se" \
          -wvl 0.9778 peak -7 5 -sca 1o22_peak.sca \
          -d autoSHARP_SAD-1 | tee autoSHARP_SAD-1.lis

      # or:

      run_autoSHARP.sh -seq 1o22.pir -ha "Se" -wvl 0.9778 peak -7 5 -sca 1o22_peak.sca -d autoSHARP_SAD-1 | tee autoSHARP_SAD-1.lis
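
Since the command above saves a copy of the terminal output in autoSHARP_SAD-1.lis, the name of the HTML report can also be recovered from that file, e.g. from a second terminal while the job is running. A minimal sketch, assuming the path is printed as a single token ending in "LISTautoSHARP.html" (the exact wording of the log line is not specified here); find_report is a hypothetical helper name:

```shell
# Hypothetical helper: print the first token ending in LISTautoSHARP.html
# found in a saved log file. Assumes the path contains no spaces or quotes.
find_report() {
    grep -o '[^[:space:]"]*LISTautoSHARP\.html' "$1" | head -n 1
}
```

It would be called as "find_report autoSHARP_SAD-1.lis"; the printed path can then be opened in a browser.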
 

SAD-2 (4J8P):

      run_autoSHARP.sh \
          -seq 4J8P.pir -ha "Se" \
          -wvl 0.97858 peak -8.000 6.000  -mtz 4J8P_truncate.mtz \
          -d autoSHARP_SAD-2 | tee autoSHARP_SAD-2.lis
 

SAD-3 (4HPE):

      run_autoSHARP.sh \
          -seq 4HPE.pir -ha "Se" \
          -wvl 0.9794 peak -7.963 5.573  -mtz 4HPE_truncate.mtz \
          -d autoSHARP_SAD-3 | tee autoSHARP_SAD-3.lis
 

MAD (multiple wavelength)

Of course, any of these MAD examples could also be run as SAD, i.e. using only one of the wavelengths. Some should still work in those cases, while others might not. It can be instructive to try different combinations of SAD and/or MAD (e.g. 2 versus 3 wavelengths) to see the effect on HA substructure solution, phasing, density modification and final automatic building.

MAD-1 (3ISY):

      run_autoSHARP.sh \
          -seq 3isy.pir -ha "Se" \
          -wvl 0.97934 infl -11 3.3  -sca 3isy_aimless_0.97934A.sca \
          -wvl 0.91162 hrem -1.8 3.3 -sca 3isy_aimless_0.91162A.sca \
          -d autoSHARP_MAD-1 | tee autoSHARP_MAD-1.lis
 

MAD-2 (4JM1):

      run_autoSHARP.sh \
          -seq 4JM1.pir -ha "Se" \
          -wvl 0.97849 peak -4.660 4.060  -mtz 4JM1_truncate_0.97849.mtz \
          -wvl 0.97917 infl -7.690 2.050  -mtz 4JM1_truncate_0.97917.mtz \
          -d autoSHARP_MAD-2 | tee autoSHARP_MAD-2.lis
 

MAD-3 (4IS3):

      run_autoSHARP.sh \
          -seq 4IS3.pir -ha "Se" \
          -wvl 0.97936 infl -11.400 3.710  -mtz 4IS3_truncate_0.97936.mtz \
          -wvl 0.91162 hrem  -1.700 3.300  -mtz 4IS3_truncate_0.91162.mtz \
          -wvl 0.97919 peak  -8.700 6.670  -mtz 4IS3_truncate_0.97919.mtz \
          -d autoSHARP_MAD-3 | tee autoSHARP_MAD-3.lis
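
As noted at the start of this section, each MAD example can also be tried as SAD. A sketch for MAD-3 using only the peak wavelength: the -wvl line is copied unchanged from the command above, while the job name is a hypothetical choice so that the MAD-3 results are not overwritten. The guard around the call is only there so that the snippet does nothing on a machine without autoSHARP.

```shell
jobid=autoSHARP_MAD-3_SAD-peak   # hypothetical job/directory name

if command -v run_autoSHARP.sh >/dev/null 2>&1; then
    # same sequence, HA type and peak dataset as in MAD-3 above
    run_autoSHARP.sh \
        -seq 4IS3.pir -ha "Se" \
        -wvl 0.97919 peak -8.700 6.670 -mtz 4IS3_truncate_0.97919.mtz \
        -d "$jobid" | tee "$jobid.lis"
else
    echo "run_autoSHARP.sh not found in PATH" >&2
fi
```

Comparing the resulting report with the three-wavelength run shows what the extra wavelengths contribute.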
 

MAD-4 (4ME8):

      run_autoSHARP.sh \
          -seq 4ME8.pir -ha "Se" \
          -wvl 0.97944 infl -8.600 2.660  -mtz 4ME8_truncate_0.97944.mtz \
          -wvl 0.91837 hrem -1.800 3.400  -mtz 4ME8_truncate_0.91837.mtz \
          -wvl 0.97894 peak -6.860 4.580  -mtz 4ME8_truncate_0.97894.mtz \
          -d autoSHARP_MAD-4 | tee autoSHARP_MAD-4.lis
 

Using an initial (partial) model, e.g. from MR solution

autoSHARP allows the use of an input (already placed) model for any phasing scenario. In that case the de-novo HA substructure finding step (with SHELXC/D) is skipped and HA sites are found based on the initial phases from this model (via LLG residual maps in SHARP). Such a model could be, e.g., an initial MR solution that is not accurate enough to allow refinement, or a partial solution in which only one component of a hetero-multimer could be placed. There are a variety of situations where some meaningful initial PDB model is available. This model and the reflection data given to autoSHARP should have the same (correct) spacegroup, as determined through the steps leading to the initial model.

Remember that this can be used with any of the methods supported by autoSHARP (SAD, MAD, SIRAS etc).

MR-1 (3GET):

      run_autoSHARP.sh \
          -seq 3GET.pir -ha "Se" \
          -pdb 3FFH_ala_MR.pdb \
          -wvl 0.9789 peak -8 4 -sca 3GET.sca \
          -d autoSHARP_MR-1 | tee autoSHARP_MR-1.lis
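
The requirement that model and reflection data share the same spacegroup can be checked quickly on the model side. A minimal sketch, assuming the PDB file carries a standard CRYST1 record (spacegroup symbol in columns 56-66); pdb_spacegroup is a hypothetical helper, and the spacegroup of the reflection data still has to be looked up separately, e.g. in the data processing output:

```shell
# Hypothetical helper: print the spacegroup symbol from the first CRYST1
# record of a PDB file (columns 56-66 of that record).
pdb_spacegroup() {
    awk '/^CRYST1/ {
        sg = substr($0, 56, 11)
        gsub(/^ +| +$/, "", sg)   # strip padding spaces
        print sg
        exit
    }' "$1"
}
```

For the example above this would be called as "pdb_spacegroup 3FFH_ala_MR.pdb".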
 

SIRAS (native plus single derivative)

This could also be run as SAD, using only the derivative dataset.

SIRAS-1 (1GXT):

      run_autoSHARP.sh \
          -seq 1GXT.pir \
          -nat -mtz 1GXT_nat.mtz \
          -ha "Hg" -nsit 2 -wvl 0.99970 peak -16 10 -mtz 1GXT_hg.mtz \
          -d autoSHARP_SIRAS-1 | tee autoSHARP_SIRAS-1.lis
 

MIRAS (native plus multiple derivatives)

This could also be run as SIRAS using only one of the derivative soaks.

MIRAS-1 (3ZFT):

      run_autoSHARP.sh \
          -seq 3ZFT.pir \
          -nat -mtz 3ZFT_nat.mtz \
          -ha "Hg" -nsit 1 -wvl 1.54179 -mtz 3ZFQ_Hg.mtz \
          -ha "Ir" -nsit 2 -wvl 1.54179 -mtz 3ZFR_Ir.mtz \
          -d autoSHARP_MIRAS-1 | tee autoSHARP_MIRAS-1.lis
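
As mentioned at the start of this section, the same data can be run as SIRAS with a single derivative. A sketch that launches one SIRAS job per derivative, with all values copied from the MIRAS command above; the function name run_siras and the job names are hypothetical, and without autoSHARP in the PATH the function only reports what it would do:

```shell
# Hypothetical wrapper: run one SIRAS job (native + one derivative).
# Arguments: heavy-atom type, number of sites, derivative MTZ file.
run_siras() {
    ha=$1; nsit=$2; mtz=$3
    jobid="autoSHARP_SIRAS_3ZFT_$ha"     # hypothetical job name
    if command -v run_autoSHARP.sh >/dev/null 2>&1; then
        run_autoSHARP.sh \
            -seq 3ZFT.pir \
            -nat -mtz 3ZFT_nat.mtz \
            -ha "$ha" -nsit "$nsit" -wvl 1.54179 -mtz "$mtz" \
            -d "$jobid" | tee "$jobid.lis"
    else
        echo "would run SIRAS job $jobid (run_autoSHARP.sh not in PATH)"
    fi
}

run_siras Hg 1 3ZFQ_Hg.mtz
run_siras Ir 2 3ZFR_Ir.mtz
```

The two jobs run one after the other, each in its own directory, so their reports can be compared side by side with the MIRAS result.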
 

What to look out for?

There are several main themes one should consider:

  • Even before running autoSHARP: is there some indication of a useful heavy-atom signal in your data? E.g. for phasing on anomalous differences:
    • correlation between anomalous differences of random half-sets (often called "CC(ano)")
    • anomalous signal (DANO/sig(DANO) or SigAno), i.e. anomalous difference (absolute value) over its sigma
    • correlation between anomalous differences for distinct wavelengths (in the MAD case)
    • comparison of statistics ignoring or taking Friedel's Law into account (e.g. comparing Rmeas0 and Rmeas)
  • Are there indications of heavy-atom signal during data analysis in autoSHARP? These are given at the SHELXC step during heavy-atom substructure solution.
  • Did the substructure solution step give a promising looking solution?
    • For multi-site searches we expect the first few sites to have a high (relative) occupancy.
    • Does the cross-word table show good scores (PSMF) between the first few sites, i.e. is the solution consistent with the Patterson map?
    • Do we see a more or less bimodal distribution of successful and incorrect trials (in the final set of scatter plots)?
  • Does heavy-atom refinement and phasing (including iterative interpretation of the LLG (residual) maps) give meaningful indications of phasing signal?
    • Does it keep all/most of the initial sites (suggesting that these are correct)?
    • Does it add additional (strong) sites up to the expected number?
    • At what resolution does the phasing power drop below 1.0? This should be in sync with the indications of heavy-atom signal mentioned above.
  • For a successful solution we would like to see a significant difference in score between the two hands (enantiomorphs):
    • the absolute value is not that important (although: the higher usually the better)
    • a score of 0.25 versus 0.11 would be good (or 0.5 versus 0.3, 0.15 versus 0.09 etc)
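
To make the example score pairs above easier to compare, the difference and ratio between the two hands can be tabulated. This is plain arithmetic on the numbers quoted in the list, not a cutoff used by autoSHARP; compare_hands is a hypothetical helper name.

```shell
# Hypothetical helper: print difference and ratio of two hand scores.
# Arguments: score of the better hand, score of the other hand.
compare_hands() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        printf "%.2f vs %.2f: difference %.2f, ratio %.2f\n", a, b, a - b, a / b
    }'
}

compare_hands 0.25 0.11
compare_hands 0.5  0.3
compare_hands 0.15 0.09
```

For these three pairs the ratios come out at roughly 2.3, 1.7 and 1.7, which illustrates that it is the relative separation between the hands, rather than the absolute values, that the examples have in common.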

Some additional hints for running:

  • it is often not necessary to use the full resolution when working with really high-resolution data; rather, do a quicker initial test with a restricted resolution range using
      run_autoSHARP.sh -R 50.0 2.6 ...
  • there are additional flags for speeding things up, which can be useful for initial tests:
    • "-fast" (with the danger of "missing" a solution)
    • "-nobuild" (stop after density modification)
    • "-nowarp" (skip the ARP/wARP building step and only use BUCCANEER for model building, with the final model still needing quite a bit of manual work)
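
The hints above can be combined into a single quick test. A sketch based on the SAD-1 example, purely to illustrate the syntax: the resolution limits are copied from the hint above and not tuned for this dataset, the job name is hypothetical, and placing the extra flags directly after the script name is an assumption (the hint shows "-R" in that position; the position of "-nobuild" is not specified).

```shell
jobid=autoSHARP_SAD-1_quick   # hypothetical job/directory name

if command -v run_autoSHARP.sh >/dev/null 2>&1; then
    # restricted resolution range, stop after density modification
    run_autoSHARP.sh -R 50.0 2.6 -nobuild \
        -seq 1o22.pir -ha "Se" \
        -wvl 0.9778 peak -7 5 -sca 1o22_peak.sca \
        -d "$jobid" | tee "$jobid.lis"
else
    echo "run_autoSHARP.sh not found in PATH" >&2
fi
```

If this quick run already shows a clear difference between the two hands, the full run (all data, with model building) can be started in a separate directory.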