BUSTER User Manual
NCS-restrained refinement and completion of a partial structure tutorial: Barnase

BUSTER tutorial 2


NCS-Restrained refinement and completion of a partial structure

This tutorial illustrates the use of BUSTER for NCS-restrained Maximum Likelihood refinement of a partial model against 1.7 Å data, followed by 2.1 Å Maximum Entropy completion of the missing part of the structure. The degree of incompleteness is about 25%.

The structure of the R69S mutant of the hydrolase barnase from Bacillus amyloliquefaciens was solved (together with other mutants) by Buckle et al. in 1993, in the course of a study of the contribution of hydrophobic residues and side-chain packing to the stability of globular proteins [1].

This tutorial is a synthetic example of refinement of a "bad model". To prepare it, the published structure (PDB code 1B20) was shaken by torsion angle dynamics to an rms deviation of about 1 Å from the published coordinates. The worst quarter of the structure - the residues for which at least one atom was disordered in one of the three copies of the protomer - was removed; the solvent molecules were also excluded from the partial model, and all B factors of the fragment were set to 25 Å².
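For reference, the rms deviation quoted above can be computed as in the following minimal Python sketch. It assumes that both models list the same atoms in the same order and in the same crystal frame (no superposition is performed); the function name rmsd and the coordinates are hypothetical and are not taken from 1B20.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (in Å) between two matched coordinate lists.

    Each list holds (x, y, z) tuples; atom i in coords_a must correspond
    to atom i in coords_b. No superposition is applied (an assumption).
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Hypothetical two-atom example: the published and the "shaken" positions.
published = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
shaken = [(1.0, 0.0, 0.0), (1.5, 1.0, 0.0)]
print(rmsd(published, shaken))
```

Here every atom has moved by exactly 1 Å, so the rms deviation is also 1 Å; in a real shaken model the individual displacements vary around that value.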

What does BUSTER do here

There are three separate steps in this job:

  1. cycle 0: the first BUSTER run uses the starting structural model to phase the structure factor amplitudes; from these phases a 2Fo-Fc electron density map is computed and a prior distribution for the location of the missing atoms is derived; the output of this scaling-phasing only job is LIST.0.html
  2. cycles 1-21: the second step is the Maximum Likelihood refinement of the partial structure model, while the distribution for the missing atoms is kept equal to the initial one; data up to 1.7 Å are used for refinement; soft NCS restraints are enforced throughout.
  3. cycles 22-23: the last step is the Maximum Entropy completion of the missing structure: parameters are varied to modify the density of the initial distribution for the missing atoms; atomic positions and B factors for the atoms in the partial structure are kept fixed to the values they reached at the end of the refinement. Given the degree of imperfection of the partial structure, only data up to 3.00 Å are used for the MaxEnt calculation.

Input Preparation

A few parameters need some attention: see the simple quick input guide.

Main items of the output

We list here a few of the key items you might want to check in the LIST.html output file:
  1. R-factors and Log-likelihood gain plots: the simplest way of checking if the model parameters are indeed improving the fit to the working- and free-set.

    During refinement (cycles 1-21) the R factors decrease; the working-set and free-set R factors do not differ by more than a fraction of a percent, because the NCS relates reflections in the free set to reflections in the working set. By chance, the free R is lower than the working R.

    The plot of log-likelihood gain (LLG) vs. cycle number shows the LLG increasing from its initial value of 0.

    The working-set LLG is bound to increase, because the refinement is driven by maximising the likelihood of the model with respect to the working set; it is more important to check that the free-set LLG increases as well.

  2. Correlation coefficient plots: the correlation coefficient curves are a good tool to monitor the improvement in the structural model during refinement and MaxEnt completion; they are independent of the overall scale factor but do depend on the values of the relative scale factors between the partial and missing structures.

    The (Fobs,Ffrag) and (Fobs,Fcalc) curves only differ at low resolution because in Fcalc there are contributions from the solvent and the low-resolution envelope for the missing atoms, while in Ffrag only the atoms in the PDB model for the partial structure are taken into account.

    Most importantly, the (Fcalc,Fexpct) curve depends on the imperfection parameters that parameterise the BUSTER internal error model. The larger the internal estimate of the error on the calculated F, the more this CC curve departs from unity. Comparing the (Fcalc,Fexpct) and (Fobs,Fcalc) correlation coefficient curves indicates how adequate the BUSTER internal error model is: if the error model is correct, the two curves should be close to one another after the first cycle.

    The (Fobs,Fobs+d) curve is a measure of the noise on the data vs. resolution. This correlation coefficient is lower than unity when the noise on the data becomes large (typically at high resolution, where I/σ(I) is lowest).

    • It is useful to compare the Correlation Coefficient curves between cycle 0 and cycle 21 (beginning and end of refinement):

      Cycle 0: After the first round of scaling, you can see that the "observed correlation" (Fobs,Fcalc) agrees with the "predicted correlation" (Fcalc,Fexpct): the initial error model is adequate.

      The (Fobs,Fcalc) correlation coefficients are better than the (Fobs,Ffrag) at low resolution (except for the 54 reflexions in the first resolution bin) because of the improvements brought about by solvent and missing atoms models.

      Cycle 21: At the end of the refinement all the CC values are closer to unity than they were at the beginning: this is due to the improvement of the model for the partial structure.

      The imperfection B factor for the partial structure at the end of refinement comes down to a value around 4 (see curves below): the "predicted correlation" (Fcalc,Fexpct) is still in good agreement with the "observed" one, (Fobs,Fcalc), as it should be if the error model is adequate.

    • Let's now check the curves showing the geometric restraints residuals and the parameters of the imperfection models, as refinement progresses:

    • Stereochemistry residuals: the plot of the normalised sum of geometry residuals can be used to make decisions as to the value of the X-ray weight.

      As refinement progresses, and if the X-ray weight is adequate, the normalised sum of observed residuals should tend towards the ideal value of 1 (meaning that ideally every restraint should be obeyed within 1 associated e.s.d.).

      Final values of the observed residuals higher than unity suggest too loose a geometry and point to the need for a lower X-ray weight; values lower than unity indicate too tight a geometry, in which case the X-ray weight could be increased.

      In this case the refinement hasn't converged yet - so no changes to the X-ray weight are advised.

      Other useful stereochemistry statistics such as the list of the top violations of stereochemistry restraints are found in the geometry output file.

    • Imperfection B factor refinement: the plot of the imperfection B factors can be used to confirm that the partial structure is improving during refinement and that the estimated error on the model is decreasing.

      As refinement progresses, the value of the imperfection B for the partial structure should decrease. In this case Bimpf frag decreases from about 15 to 5.5.

      The missing atoms model is roughly constant during this partial structure refinement (only the solvent and missing atoms scale factors and B factors are being refined here), so that the changes in the missing atoms and solvent imperfection parameters are mostly due to the coupling with the partial structure model.

      The Luzzati parameters BLuzzWork, BLuzzFree and BLuzz are not refined; rather, they are estimated at each cycle from a sigmaA plot.
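As a rough illustration of the statistics discussed in this section, the Python sketch below computes a conventional R factor, a correlation coefficient between two sets of amplitudes, and an rms normalised restraint deviation. It is not BUSTER code: the function names (r_factor, correlation, rms_z) and the toy numbers are assumptions made for the example, and the rms of deviations in units of their e.s.d. is only one plausible reading of the "normalised sum of geometry residuals"; the exact definitions used in the LIST.html output may differ.

```python
import math


def r_factor(f_obs, f_calc, scale=1.0):
    """Conventional R factor: sum|Fobs - k*Fcalc| / sum(Fobs)."""
    numerator = sum(abs(fo - scale * fc) for fo, fc in zip(f_obs, f_calc))
    return numerator / sum(f_obs)


def correlation(x, y):
    """Pearson correlation coefficient between two amplitude lists.

    Undefined (division by zero) if either list is constant.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rms_z(observed, ideal, esd):
    """Rms of restraint deviations expressed in units of their e.s.d.

    A value near 1 means restraints are obeyed, on average, to within
    one e.s.d. (assumed normalisation, not necessarily BUSTER's).
    """
    z_squared = [((o - i) / s) ** 2 for o, i, s in zip(observed, ideal, esd)]
    return math.sqrt(sum(z_squared) / len(z_squared))


# Toy amplitudes (hypothetical, one resolution bin).
f_obs = [100.0, 80.0, 60.0, 40.0]
f_calc = [95.0, 84.0, 57.0, 42.0]
print("R factor:", r_factor(f_obs, f_calc))
print("CC(Fobs,Fcalc):", correlation(f_obs, f_calc))

# Toy bond lengths in Å with ideal values and e.s.d.s (hypothetical).
print("rms Z:", rms_z([1.54, 1.50], [1.52, 1.52], [0.02, 0.02]))
```

Multiplying f_calc by any positive constant leaves correlation() unchanged, whereas r_factor() changes unless the scale argument is adjusted accordingly; this is the sense in which the correlation curves are independent of the overall scale factor. To reproduce curves against resolution, the same functions would be applied separately to the reflections in each resolution bin.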

Final Results

References

[1] Buckle, A.M., Henrick, K. and Fersht, A.R. (1993). Crystal structural analysis of mutations in the hydrophobic cores of barnase. J. Mol. Biol. 234, 847.
Eric Blanc, <blanc@GlobalPhasing.com>
Pietro Roversi, <pietro@GlobalPhasing.com>

Last modified: Fri Jan 9 11:24:09 GMT 2004