Chapter 2

Phase Improvement and Interpretation Manual - Control

Copyright © 2001-2006 by Global Phasing Limited

All rights reserved.

This software is proprietary to and embodies the confidential technology of Global Phasing Limited (GPhL). Possession, use, duplication or dissemination of the software is authorised only pursuant to a valid written licence from GPhL.

Documentation (2001-2006) Clemens Vonrhein

Contact sharp-develop@GlobalPhasing.com

Introduction
Density modification
NCS detection
Automatic building using ARP/wARP
Utilities
Output
- Density modification

Introduction

The "Phase Improvement and Interpretation Control Panel" lets you carry out most of the steps that follow heavy atom refinement and phasing with SHARP during a structure solution: improving the initial phases and obtaining the best possible electron density map. However, some tools are at a more advanced stage of development than others. In particular, everything concerned with non-crystallographic symmetry (detection and use) is still rudimentary. If you have any comments or remarks, please contact us at sharp-develop@globalphasing.com.

Since several similar procedures might be run within the same sub-directory of your sharpfiles/logfiles directory, we need to distinguish them from each other. This is done through the solvent fraction XX.Xpc (expressed as a percentage with one decimal place). The interface will warn you if you try to run a similar procedure with the same solvent fraction.
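The labelling scheme can be illustrated with a minimal Python sketch. The function names are hypothetical and not part of the interface, and the exact formatting of the label (for example any zero padding) is an assumption based only on the XX.Xpc pattern given above.

```python
def run_label(solvent_fraction: float) -> str:
    """Format a solvent fraction (between 0 and 1) as a percentage label
    with one decimal place, following the XX.Xpc pattern."""
    if not 0.0 < solvent_fraction < 1.0:
        raise ValueError("solvent fraction must lie strictly between 0 and 1")
    return f"{solvent_fraction * 100:.1f}pc"


def is_new_run(solvent_fraction: float, existing_labels: set) -> bool:
    """Return True if no earlier run in the same directory has this label."""
    return run_label(solvent_fraction) not in existing_labels


# Hypothetical example: one earlier run at a solvent fraction of 0.450.
existing = {run_label(0.450)}
print(run_label(0.451), is_new_run(0.451, existing))
print(run_label(0.450), is_new_run(0.450, existing))
```

A change of 0.001 in the solvent fraction is the smallest step that changes the label, which is why that step size is suggested for otherwise identical runs.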

Density modification

This section contains all options specific for solvent flattening/flipping.

Solvent fraction

This is the solvent content of your crystal. It has to be unique for this procedure: to run a similar procedure again, just change it in steps of 0.001. The solvent fraction will then differ slightly, but this should not have much influence on the result.

This doesn't have to be the correct solvent content of your crystal: it is the one used for calculating the solvent envelope.

Resolution range

The resolution range to use throughout the procedure. Make sure to take the phase quality into account.

Real solvent content

If using DM at any stage of the solvent flattening procedure(s), this solvent content will be used for data scaling (see DM documentation). It should always be the real solvent content of your crystal. The default of "0" will make this the same as the solvent fraction.
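If the real solvent content is not known, a common way to estimate it is via the Matthews coefficient. The sketch below is not part of the interface: the function name and the example numbers are hypothetical, and the value of 1.23 Å³/Da assumes a typical protein partial specific volume of about 0.74 cm³/g.

```python
def matthews_solvent_fraction(cell_volume: float, n_symops: int,
                              n_copies: int, mol_weight: float) -> float:
    """Estimate the solvent fraction from the Matthews coefficient V_M.

    cell_volume : unit cell volume in cubic Angstrom
    n_symops    : number of symmetry operators of the space group
    n_copies    : molecules per asymmetric unit
    mol_weight  : molecular weight of one molecule in Dalton
    """
    v_m = cell_volume / (n_symops * n_copies * mol_weight)  # A^3 per Dalton
    # 1.23 A^3/Da is the volume occupied by a typical protein (assumption).
    return 1.0 - 1.23 / v_m


# Hypothetical crystal: 500000 A^3 cell, 4 operators, 2 copies of 25 kDa.
print(round(matthews_solvent_fraction(500000.0, 4, 2, 25000.0), 3))
```

For nucleic acids or complexes the constant would need adjusting, so the result should only be treated as a starting estimate.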

Starting solvent content

If this differs from the default of "0", the first ten cycles of solvent flattening will change the solvent content used for defining the solvent envelope from this value to the solvent fraction in 10 equal steps. This can be helpful if the starting phases are of particularly bad quality: a gradually increasing solvent content at the beginning of the solvent flattening makes sure that no potential protein regions are flattened (and therefore never recovered).
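The ramp can be pictured with a short Python sketch. The function name is hypothetical, and it is an assumption that the target solvent fraction is reached exactly at cycle 10 (rather than the starting value being used unchanged for cycle 1).

```python
def solvent_ramp(start: float, target: float, n_steps: int = 10) -> list:
    """Solvent content used for the envelope in each of the first n_steps cycles.

    A start value of 0 is the default and means: no ramp, use the target
    solvent fraction from the first cycle on.
    """
    if start == 0:
        return [target] * n_steps
    step = (target - start) / n_steps
    # Assumption: cycle i (1-based) uses start + i * step.
    return [round(start + i * step, 6) for i in range(1, n_steps + 1)]


# Hypothetical example: ramp from 0.30 up to a solvent fraction of 0.50.
print(solvent_ramp(0.30, 0.50))
```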

Radius of solvent sphere

SOLOMON defines the solvent envelope by using a local standard deviation map. A sphere at each grid point is used to calculate the rmsd of electron density within this sphere. The protocol for solvent flattening/flipping used here allows this sphere to slowly decrease during the various cycles. Tests suggest that a good starting value is the resolution where the figure-of-merit drops below 0.5. The final radius is usually the high resolution limit used.
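The idea of a local standard deviation map can be sketched in plain Python on a small grid. This is an illustration only and not SOLOMON's implementation: the function names are hypothetical, the radius is given in grid units, periodic boundaries are assumed, and it is an assumption that the grid points with the lowest local standard deviation are assigned to solvent until the requested solvent fraction is reached.

```python
import math
from itertools import product


def local_std_map(rho, radius):
    """Standard deviation of the density within a sphere around each grid point.

    rho is a nested list indexed as rho[x][y][z]; the grid is treated as periodic.
    """
    nx, ny, nz = len(rho), len(rho[0]), len(rho[0][0])
    r = int(math.floor(radius))
    offsets = [(dx, dy, dz)
               for dx, dy, dz in product(range(-r, r + 1), repeat=3)
               if dx * dx + dy * dy + dz * dz <= radius * radius]
    out = [[[0.0] * nz for _ in range(ny)] for _ in range(nx)]
    for x, y, z in product(range(nx), range(ny), range(nz)):
        vals = [rho[(x + dx) % nx][(y + dy) % ny][(z + dz) % nz]
                for dx, dy, dz in offsets]
        mean = sum(vals) / len(vals)
        out[x][y][z] = math.sqrt(sum((v - mean) ** 2 for v in vals) / len(vals))
    return out


def solvent_mask(std_map, solvent_fraction):
    """Grid points with the lowest local standard deviation, up to the
    requested fraction of the grid (assumed envelope definition)."""
    points = sorted((v, (x, y, z))
                    for x, plane in enumerate(std_map)
                    for y, row in enumerate(plane)
                    for z, v in enumerate(row))
    n_solvent = round(solvent_fraction * len(points))
    return {xyz for _, xyz in points[:n_solvent]}


# Hypothetical 4x4x4 map: flat everywhere except one strong peak.
rho = [[[0.0] * 4 for _ in range(4)] for _ in range(4)]
rho[0][0][0] = 10.0
mask = solvent_mask(local_std_map(rho, radius=1.0), solvent_fraction=0.5)
print(len(mask), (0, 0, 0) in mask)
```

A larger sphere gives a smoother, more conservative envelope, which is consistent with starting from a large radius and decreasing it over the cycles.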

Flipping factor

The protocol for solvent flipping using SOLOMON needs a flipping factor. The default of "0" makes sure that the correct (gamma correction) value is used.

Mean density

During solvent flipping using SOLOMON, the densities in the solvent and protein regions are adjusted to have a ratio defined by the mean density in the solvent and in the protein region. Water should have a mean density of 0.32 e/Å³. For proteins this should be about 0.43 e/Å³; DNA should have about 0.60 e/Å³. For mixtures of protein and DNA/RNA you might need to adjust this.

min / max density truncation in "protein" region

The solvent flattening/flipping procedure used will truncate the lowest and highest grid points in the "protein" region. The defaults (lowest 40 % and highest 1 %) seem to work in most cases. If you suspect large peaks (e.g. caused by heavy atoms) in the protein region you might want to increase the fraction of highest peaks truncated. See also the documentation for SOLOMON.
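As an illustration of what such a truncation could look like, the following Python sketch clamps the values in the "protein" region to cut-offs derived from the two fractions. This is only one plausible reading: the function name is hypothetical, and whether SOLOMON clamps the values in exactly this way is an assumption.

```python
def truncate_protein_density(values, low_fraction=0.40, high_fraction=0.01):
    """Clamp the lowest and highest grid point values in the protein region.

    values        : density values of all grid points in the protein region
    low_fraction  : fraction of lowest values to truncate (default 40 %)
    high_fraction : fraction of highest values to truncate (default 1 %)
    """
    ordered = sorted(values)
    n = len(ordered)
    # Cut-offs are the first values that are NOT truncated at either end.
    low_cut = ordered[min(n - 1, int(low_fraction * n))]
    high_cut = ordered[max(0, n - 1 - int(high_fraction * n))]
    return [min(max(v, low_cut), high_cut) for v in values]


# Hypothetical example: 100 grid points with values 0..99.
truncated = truncate_protein_density(list(range(100)))
print(min(truncated), max(truncated))
```

Increasing high_fraction in this picture flattens more of the strongest peaks, which corresponds to the suggestion above for maps with large heavy atom peaks.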

starting DM run ?

In some cases - especially when only low resolution data or phases are available - it might be beneficial to start with a simple run of DM (solvent flattening and histogram matching) before switching to solvent flipping with SOLOMON. This can lead to a better starting map for the iterative solvent flipping procedure.

Including available partial model

If some kind of model is available (partial molecular replacement solution, partially built model, automatically built model etc) it can be included in the solvent flipping procedure. This can greatly improve the starting cycles of density modification by providing a much better initial solvent envelope. If some loops or flexible parts of your model are missing in the solvent flattened map it might be because these regions were initially included in the solvent part of the envelope and therefore flattened.

Note: make sure that the experimental phases (i.e. heavy atom sites) and the model are on the same origin and have the same handedness!

NCS detection

Currently, only closed local symmetries of the form C_n can be located. More complicated forms of NCS have to be dealt with externally.

Molecular weight of monomer

During NCS detection using GETAX, this is used to estimate the volume of your "protein" monomer, i.e. the part of your structure obeying the NCS.
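A rough idea of how a molecular weight translates into a volume is given by the Python sketch below. It is not GETAX's own calculation: the function names are hypothetical, and the conversion factor of 1.23 Å³/Da assumes a typical protein partial specific volume of about 0.74 cm³/g.

```python
import math

# Typical volume per Dalton for a protein (assumption, see text above).
PROTEIN_VOLUME_PER_DALTON = 1.23  # cubic Angstrom per Dalton


def monomer_volume(mol_weight: float) -> float:
    """Approximate monomer volume in cubic Angstrom from its weight in Dalton."""
    return mol_weight * PROTEIN_VOLUME_PER_DALTON


def equivalent_sphere_radius(volume: float) -> float:
    """Radius in Angstrom of a sphere with the given volume."""
    return (3.0 * volume / (4.0 * math.pi)) ** (1.0 / 3.0)


# Hypothetical 25 kDa monomer.
volume = monomer_volume(25000.0)
print(round(volume), round(equivalent_sphere_radius(volume), 1))
```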

Number of rotations & translations to test

During NCS detection, for each rotation determined from a self-rotation function, a certain number of possible translation solutions can be tested. Your own set of self-rotation function peaks can be supplied by editing the file selfrot.lis and adding polar angles omega, phi, kappa (for the angle convention see the POLARRFN documentation). Each of these rotations is fed into GETAX to find possible centres of a C_n rotation axis.

Monomer positions

Although NCS detection using GETAX assumes that a pure C_n rotation is present, a slight deviation from this ideal scenario might occur. This might not have a great effect on finding the (roughly) correct position of the rotation axis. However, during refinement of the operators, the use of a pure multimer mask when a monomer mask is required can drown the correct solution in noise. Therefore, a set of possible monomer masks is tested, each kappa/N degrees apart (where kappa is the rotation angle around the rotation axis - 180, 120, 90 etc. - and N is the number of monomer positions to be tested).

This option is probably only useful when the correlation map from GETAX shows clear stretches of high correlation (corresponding to a possible rotation axis position) but the subsequent NCS operator refinement using a multimer mask is unsuccessful.
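The angular spacing of the tested monomer masks can be made concrete with a small Python sketch. The function name is hypothetical, and it is an assumption that the first mask sits at an offset of 0 degrees.

```python
def monomer_mask_angles(n_fold: int, n_positions: int) -> list:
    """Angular offsets (in degrees) around a C_n axis at which monomer
    masks are tested, spaced kappa/N apart."""
    kappa = 360.0 / n_fold  # rotation angle of the C_n axis: 180, 120, 90 ...
    return [i * kappa / n_positions for i in range(n_positions)]


# Two-fold axis (kappa = 180) with 4 monomer positions to test.
print(monomer_mask_angles(2, 4))
# Three-fold axis (kappa = 120) with 3 monomer positions to test.
print(monomer_mask_angles(3, 3))
```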

Ignoring crystal symmetry

In some cases, local rotation axes can be parallel to crystallographic rotation axes. These would be buried under the large crystal symmetry peaks in the self-rotation. In these circumstances, this option could be turned off.

In some cases, these parallel local symmetry axes are visible in a native Patterson function.

Automatic building using ARP/wARP

The program package ARP/wARP is used for automatic building and tracing in a solvent flattened map. If you want to use this feature, you have to have the latest version installed and configured (ask your SHARP site administrator about that).

We use a script that is more or less identical to the distributed arp_warp.sh script: it differs in that it only supports the warpNtrace protocol and handles errors differently. So you should be able to use the warp.par file produced through this interface with your own ARP/wARP installation without any further changes.

Number of residues

The total number of protein residues that ARP/wARP should be building. This is not the number of residues per chain/monomer, but the total number of protein residues in the asymmetric unit.

Total number of cycles

Two parameters control the number of cycles that ARP/wARP should be running in automatic building and tracing mode (warpNtrace). This defines the overall number of cycles after which it will stop. The default of 100 should be a good starting point - but see the ARP/wARP documentation for more details.

Number of cycles between rebuilding

The number of cycles, after which ARP/wARP will try to re-interpret the density/model. At this stage, automatic building and (possibly) side-chain docking is done. In most cases, the default of 10 should be adequate - but see the ARP/wARP documentation for more details.

Sequence file

If a sequence file is available, the pull-down menu should allow for its selection. ARP/wARP will try side-chain docking at various stages of the warpNtrace procedure. See the ARP/wARP documentation for more details.

Phased refinement

The auto-building will always use experimental phases during refinement. These can come either directly from SHARP or from the solvent flattening procedure. In the latter case the Hendrickson-Lattman coefficients from the last DM run are used - and these should be automatically adjusted so that at later stages of the automatic tracing/refinement they are appropriately dampened.

Automatic adjustments

The fraction of successfully traced main chain can be used to dampen the contribution of experimental phases during each refinement cycle. If phase information after solvent flattening is used this is recommended (the phase probabilities after density modification are nearly always overestimated). If you prefer to use your SHARP phases directly it should not be necessary - since SHARP outputs a reliable estimation of phase probabilities.

The weight between X-ray and geometry term in the refinement can automatically be adjusted to give a reasonable geometry of the final model. The rms deviation on bond distances is used for this.

Convergence criteria

To speed up the process of refinement, the free R-factor can be used to stop the refinement cycles between rebuilding cycles. A certain degree of fluctuation is allowed, but too large increases in Rfree will trigger the refinement cycles to be abandoned and the switch to a building cycle.
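The kind of stopping rule described here can be sketched in Python as follows. This is not the ARP/wARP implementation: the function name is hypothetical, and both the tolerance value and the comparison against the best Rfree seen so far are assumptions made for illustration.

```python
def refinement_cycles_before_rebuild(rfree_values, tolerance=0.01):
    """Number of refinement cycles run before switching to a building cycle.

    rfree_values : Rfree after each refinement cycle, in order
    tolerance    : allowed increase over the best Rfree so far (assumed value)
    """
    best = float("inf")
    for cycle, rfree in enumerate(rfree_values, start=1):
        if rfree > best + tolerance:
            # Increase is too large: abandon refinement, go to rebuilding.
            return cycle
        best = min(best, rfree)
    return len(rfree_values)


# Hypothetical Rfree trace: small fluctuation at cycle 3, large jump at cycle 4.
print(refinement_cycles_before_rebuild([0.45, 0.44, 0.445, 0.47, 0.43]))
```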

Utilities

Some tools are provided that can help you using or analysing the results from the various phase improvement and interpretation protocols.

Optimal solvent content

A simple procedure can help you in finding the solvent content that gives the best overall density for a given protocol. If you ran three different solvent flattening runs where only the solvent content was changed, a parabolic fit to these three values should give an idea of the optimal solvent content. This is exactly the strategy used by the "Solvent-flattening (optimising)" protocol.

Note: this will obviously not work if any of these runs produced complete nonsense. See the remarks in each solvent flattening log-file for possible problems.

External phases

This tool makes it possible to extract Hendrickson-Lattman coefficients from

solvent flattening,
automatic building or
a model

These can then be used as external phase information in SHARP to help refinement of heavy atom parameters.

If you have several versions of SHARP installed and want to use a version of SHARP later than 2.0.0, you should make sure that these are written into a combined file.

Difference Fourier

To calculate ordinary difference Fourier maps (isomorphous/dispersive and anomalous differences) any of the solvent flattened phases can be used. All amplitudes and anomalous differences in the REFL01.mtz file (i.e. the data file used for the corresponding SHARP run) will be used.

Although it is recommended to use the residual (log-likelihood gradient) maps, this is an easy way to check existing sites, find sites in new, unused datasets etc.
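For a single reflection, the coefficients of such a difference Fourier map can be sketched in Python as below. The function name is hypothetical; the phase shift of -90 degrees for anomalous differences is the usual convention, but whether the interface uses exactly these coefficients (and any weighting) is an assumption.

```python
import cmath
import math


def difference_coefficient(f_a: float, f_b: float, phase_deg: float,
                           anomalous: bool = False) -> complex:
    """Map coefficient (f_a - f_b) * exp(i * phase) for one reflection.

    For isomorphous/dispersive differences f_a and f_b are the amplitudes
    of the two datasets; for anomalous differences they are F(+) and F(-),
    and the phase is shifted by -90 degrees.
    """
    shift = -90.0 if anomalous else 0.0
    return cmath.rect(f_a - f_b, math.radians(phase_deg + shift))


# Hypothetical reflection with a solvent flattened phase of 90 degrees.
print(difference_coefficient(120.0, 100.0, 90.0))
print(difference_coefficient(55.0, 45.0, 90.0, anomalous=True))
```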

Output

Most of the output produced during any of the above steps should be self-explanatory. However, here are some basic explanations about the various results you can obtain.

Density modification

The density modification can be run either for each solvent content by hand or using the 'self-optimising' option. In any case, each run with a specific solvent content will produce a log file with a lot of information. This will be explained here.

Comparison of standard deviations (SOLOMON)

This value compares the standard deviation of electron density in the solvent region with that in the protein region. Since we expect relatively flat solvent (low sd) and a lot of features in the protein (high sd), this value should steadily go down during density modification. Obviously, determining where solvent and protein are is crucial. So again, the very first solvent mask determination has a large effect on this quality indicator.

Overall R-factor R0 (ICOEFL)

If a newer version of SHARP/autoSHARP is used (post 2.0.0) an additional step of scaling using ICOEFL is performed. The final R-factor is given. This should be similar to the ones coming from SFALL and/or RSTATS. Obviously, it should go down during the iterative density modification - but be aware of the bias problem intrinsic in this type of density modification!

Overall reliability index (SFALL)

This is a simple R-factor between the structure factor amplitudes from the modified map and the 'observed' data. This should be similar to the ones coming from ICOEFL and/or RSTATS. Obviously, it should go down during the iterative density modification - but be aware of the bias problem intrinsic in this type of density modification!

R-factor (RSTATS)

This is a simple R-factor between the structure factor amplitudes from the modified map and the 'observed' data. This additional scaling is done right after the calculation of structure factors (in SFALL). The value should be similar to the ones coming from ICOEFL and/or SFALL. Obviously, it should go down during the iterative density modification - but be aware of the bias problem intrinsic in this type of density modification!
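The R-factors reported by ICOEFL, SFALL and RSTATS all follow the same basic pattern, which the Python sketch below illustrates. The function name is hypothetical, and the single overall scale factor (ratio of the amplitude sums) is an assumption: each of the programs applies its own scaling, which is why the three values are similar rather than identical.

```python
def r_factor(f_obs, f_calc):
    """R = sum(| |Fobs| - k*|Fcalc| |) / sum(|Fobs|) with one overall scale k.

    f_obs  : 'observed' structure factor amplitudes
    f_calc : amplitudes calculated from the modified map
    """
    scale = sum(f_obs) / sum(f_calc)  # simple overall scale (assumption)
    numerator = sum(abs(fo - scale * fc) for fo, fc in zip(f_obs, f_calc))
    return numerator / sum(f_obs)


# Hypothetical amplitudes: perfect agreement apart from the scale ...
print(r_factor([10.0, 20.0, 30.0], [5.0, 10.0, 15.0]))
# ... and a poor agreement.
print(round(r_factor([10.0, 20.0], [20.0, 10.0]), 3))
```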

Overall Correlation on |E|**2 (SIGMAA)

During calculation of appropriate weights for the modified structure factors (and finally phase combination with the experimental phase information from SHARP), the correlation coefficient on E**2 values is calculated in SIGMAA (between the structure factor amplitudes of the 'observed' data and the modified map). This is a good indicator of the quality of the map - at least at the very first cycle (when the procedure has not yet introduced any bias). In our experience, a value lower than 0.1 at the very first cycle usually points to a very bad starting map (and therefore bad starting phase values and/or data).

Here, only the overall value is given. But it might be a good idea to look at these values as a function of resolution (just follow the link 'directory listing' to get to the complete log files). The logfile for the first cycle is also kept.

A good way to judge whether the starting map is of promising quality is to compare this value at cycle 1 for each of the two possible hands (if a change in handedness is possible). In good cases, you should see a significant difference in this value between the two hands.
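A correlation coefficient on E**2 can be sketched in Python as a plain linear correlation between the squared normalised amplitudes. The function name and example values are hypothetical, and the exact formula and resolution binning used by SIGMAA may differ from this sketch.

```python
import math


def correlation_on_e2(e_obs, e_calc):
    """Linear correlation coefficient between |E_obs|**2 and |E_calc|**2."""
    x = [e * e for e in e_obs]
    y = [e * e for e in e_calc]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("correlation undefined for constant values")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical normalised amplitudes for the two hands at cycle 1.
e_obs = [0.5, 1.0, 1.5, 2.0, 0.8]
hand_a = [0.6, 1.1, 1.3, 1.8, 0.9]
hand_b = [1.2, 0.7, 1.0, 1.1, 1.4]
print(correlation_on_e2(e_obs, hand_a) > correlation_on_e2(e_obs, hand_b))
```

In this picture, the hand giving the clearly higher value at cycle 1 would be the more promising one.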

Final DM run

At the end of the solvent flipping procedure (and in fact, after the last cycle of solvent flattening), an additional density modification calculation using DM is performed. This will result in another set of phases as well as a set of Hendrickson-Lattman coefficients. The latter can be fed back into SHARP as external phase information to help the refinement of the heavy atom model as well as the calculation of residual maps.

Last modified: Fri Sep 15 14:58:02 BST 2017
