STARANISO

Anisotropy of the Diffraction Limit
and
Bayesian Estimation of Structure Amplitudes


  1. Perform an anisotropic diffraction cut-off of the merged intensities, instead of the traditional isotropic cut-off, using a locally-averaged mean I/σ(I) as the cut-off threshold.  The local average is calculated within a sphere of reciprocal space centered on each reflection (default radius r* = 0.15 Å⁻¹), and the contributions to the average are weighted by the exponential function w = exp(-4(s*/r*)²) of the reciprocal-space distance s* from the center.
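The local averaging in step 1 can be sketched as follows.  This is only an illustration, not STARANISO's implementation: the function name, the use of Cartesian reciprocal-space coordinates, the brute-force pairwise distance calculation, and the choice to average the individual I/σ(I) ratios are all assumptions, and symmetry-equivalent reflections are assumed to have been expanded already.

```python
import numpy as np

def local_mean_i_over_sigma(coords, i_over_sig, radius=0.15):
    """Weighted local mean of I/sigma(I) around each reflection.

    coords     : (n, 3) Cartesian reciprocal-space coordinates in 1/Angstrom
    i_over_sig : (n,) I/sigma(I) values
    radius     : sphere radius r* in 1/Angstrom (0.15 is the quoted default)
    """
    coords = np.asarray(coords, dtype=float)
    i_over_sig = np.asarray(i_over_sig, dtype=float)
    # Pairwise distances s* between reflections (O(n^2); fine for a sketch).
    diff = coords[:, None, :] - coords[None, :, :]
    s = np.sqrt((diff ** 2).sum(axis=-1))
    # w = exp(-4 (s*/r*)^2) inside the sphere, zero outside it.
    w = np.where(s <= radius, np.exp(-4.0 * (s / radius) ** 2), 0.0)
    # Each reflection contributes to its own average with weight 1,
    # so the denominator is never zero.
    return (w @ i_over_sig) / w.sum(axis=1)
```

A reflection would then be kept if its local mean exceeds the chosen threshold; the threshold value itself is not specified here.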

  2. Determine the anisotropy of the observed intensity distribution, corrected where necessary by the systematic absence factor (Wilson [1987]), using either the error-free likelihood function proposed by Popov & Bourenkov [2003], or the Bayesian likelihood function (default) which uses the French-Wilson formalism and which takes experimental errors into account.  In either case, by default a precalculated expected intensity profile is used (thanks to Alexander Popov for furnishing this); this assumes an average solvent content so in cases where the actual solvent content is much higher than normal it may not provide an accurate estimate of the contribution of bulk water.

    The overall scale and those elements of the overall anisotropic displacement tensor that are not constrained by the point-group symmetry are optimized by maximum likelihood.  Note that the default Popov & Bourenkov profile was obtained by averaging the observed profiles of a number of protein-only structures, so it is not strictly applicable to structures containing a substantial proportion of nucleic acid.

  3. Optionally, renormalize the intensity profile by applying a d*-dependent scale factor determined such that the mean normalized intensity (Z) is 1 in all d* bins.  This may help to average out the effect of differences from the averaged profile.
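The renormalization in step 3 might look like the sketch below.  It assumes a piecewise-constant scale over equal-width d* bins; the actual binning scheme and the functional form of the d*-dependent scale factor are not stated above, and the function name is hypothetical.

```python
import numpy as np

def renormalise_by_bin(z, d_star, n_bins=10):
    """Scale normalized intensities Z so that their mean is 1 in each d* bin."""
    z = np.asarray(z, dtype=float)
    d_star = np.asarray(d_star, dtype=float)
    # Equal-width bins in d* (an assumption of this sketch).
    edges = np.linspace(d_star.min(), d_star.max(), n_bins + 1)
    # Using only the interior edges gives bin indices 0 .. n_bins-1.
    idx = np.digitize(d_star, edges[1:-1])
    scale = np.ones(n_bins)
    for b in range(n_bins):
        in_bin = idx == b
        if in_bin.any():
            # Assumes the bin mean is non-zero.
            scale[b] = 1.0 / z[in_bin].mean()
    return z * scale[idx]
```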

  4. Use the anisotropy from step 2 to compute an anisotropic prior of the expected intensity, i.e. divide the expected intensity obtained from the profile by the scale/anisotropy correction.
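As an illustration of step 4, the sketch below divides a profile intensity by a scale/anisotropy correction.  The form of the correction is an assumption: a Debye-Waller-type factor k·exp(4π² sᵀUs) applied to intensities, with U a Cartesian displacement tensor in Å² and s the scattering vector in Å⁻¹ (|s| = d*).  The conventions actually used by STARANISO are not given above and may differ.

```python
import numpy as np

def anisotropic_prior(profile_intensity, s_vectors, u_cart, scale=1.0):
    """Expected observed intensity = profile / (scale * anisotropy factor).

    profile_intensity : (n,) expected intensities from the isotropic profile
    s_vectors         : (n, 3) scattering vectors in 1/Angstrom, |s| = d*
    u_cart            : (3, 3) symmetric displacement tensor in Angstrom^2
    scale             : overall scale factor k
    All names and the functional form are assumptions of this sketch.
    """
    s = np.asarray(s_vectors, dtype=float)
    u = np.asarray(u_cart, dtype=float)
    # Quadratic form s^T U s for every reflection.
    quad = np.einsum("ni,ij,nj->n", s, u, s)
    # Assumed intensity correction that would undo the attenuation.
    correction = scale * np.exp(4.0 * np.pi ** 2 * quad)
    return np.asarray(profile_intensity, dtype=float) / correction
```

With this form, directions in which U is larger have a smaller prior expected intensity at the same d*, which is the behaviour an anisotropic prior is meant to capture.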

  5. Perform Bayesian estimation of structure amplitudes by the method of French & Wilson [1978], but using the anisotropic prior in place of the traditional isotropic prior originally suggested by F & W.  STARANISO incorporates subroutines from the Netlib repository, in place of the approximate look-up tables used in TRUNCATE, to compute high-accuracy parabolic cylinder functions (scaled to avoid numerical under/overflow issues: Gil et al. [2006]) and thereby obtain all the required moments.

  6. Input anomalous data are treated differently from non-anomalous data in the Bayesian estimation.  If anomalous data are present on input it is naturally assumed that the anomalous differences are statistically significant (otherwise what is the point of keeping the Bijvoet pairs separate?).  If this is not the case then the correct course of action is to re-run the merging step, this time also merging the Bijvoet pairs, since this will deal with outliers correctly.  Otherwise the Bayesian estimation is performed twice per unique reflection on the separately merged means of I[+] and I[-] (where these are observed), not on the overall merged mean including all I[+] and I[-].

    This is because the Bayesian estimation assumes a centric or acentric Wilson distribution as appropriate, but the average of two random variates each with an acentric distribution with different expected values does not necessarily itself have an acentric distribution.  Hence it is not correct to perform the Bayesian estimation as currently implemented on the average of two Wilson intensity variates with different expected values.  Rather I[+] and I[-] should be separately converted to Fs, and then averaged.

    There are further issues concerning the optimal procedure for averaging F[+] and F[-] when they have different standard uncertainties.
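The argument in step 6 can be checked with a short calculation.  An acentric Wilson intensity is exponentially distributed, so its variance equals the square of its mean.  The sketch below computes variance/mean² for the average of two such variates with means m1 and m2 and correlation coefficient rho (the function name and the correlation parameter are introduced here for illustration only).  A value other than 1 shows that the average cannot itself follow an acentric Wilson distribution.

```python
def relative_variance_of_mean(m1, m2, rho=0.0):
    """variance / mean^2 for (X1 + X2) / 2, with X1 and X2 exponential.

    m1, m2 : expected values of the two acentric (exponential) variates
    rho    : assumed correlation coefficient between them
    For an exponential variate the standard deviation equals the mean,
    so var(Xi) = mi^2 and cov(X1, X2) = rho * m1 * m2.
    """
    mean = 0.5 * (m1 + m2)
    var = 0.25 * (m1 ** 2 + m2 ** 2 + 2.0 * rho * m1 * m2)
    return var / mean ** 2
```

For independent variates with equal means the ratio is 0.5, and it stays below 1 unless the two variates are perfectly correlated.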

  7. Optionally correct the amplitudes for anisotropy.

  8. Finally, create a new MTZ file containing F and σ(F) columns (and also anomalous F and σ(F) columns if anomalous I columns were read in).  Note that it is formally invalid to take the Fs from the Bayesian estimation and square them in a misguided attempt to recover the Is (which are needed, for example, by some twinning tests).  Rather, the posterior Is should be estimated by the same procedure as for the posterior Fs.  For this reason there is an option to output the posterior intensities (MTZ column labels Ipost, SIGIpost etc.).
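The point about squaring the posterior Fs can be illustrated numerically.  The sketch below evaluates the French & Wilson posterior means of I and F for an acentric reflection by brute-force summation on a grid, assuming a Gaussian measurement error and an exponential (acentric Wilson) prior with a given expected intensity.  It is not the method used by STARANISO, which relies on parabolic cylinder functions; the function name and the grid settings are assumptions.

```python
import numpy as np

def fw_acentric_posterior(i_obs, sigma, prior_mean, n=20001):
    """Posterior means of I and F = sqrt(I) for an acentric reflection.

    Posterior density on J >= 0 is proportional to
        exp(-J / prior_mean) * exp(-(i_obs - J)^2 / (2 sigma^2)).
    """
    # The posterior is a Gaussian centred here, truncated at J = 0.
    centre = i_obs - sigma ** 2 / prior_mean
    j = np.linspace(0.0, max(centre, 0.0) + 10.0 * sigma, n)
    log_w = -j / prior_mean - (i_obs - j) ** 2 / (2.0 * sigma ** 2)
    # Subtract the maximum before exponentiating to avoid underflow.
    w = np.exp(log_w - log_w.max())
    w /= w.sum()
    i_post = (w * j).sum()
    f_post = (w * np.sqrt(j)).sum()
    return i_post, f_post
```

Because the posterior mean of I equals the square of the posterior mean of F plus the posterior variance of F, squaring the output F always underestimates the posterior I.  The discrepancy is largest for weak or negative observed intensities.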

REFERENCES

French, S. & Wilson, K.S. (1978) "On the treatment of negative intensity observations." Acta Cryst. A34, 517-525.  See also: "Bayesian treatment of negative intensity measurements in crystallography".

Gil, A., Segura, J. & Temme, N.M. (2006) "Algorithm 850: Real parabolic cylinder functions U(a,x), V(a,x)." ACM Transactions on Mathematical Software (TOMS). 32, 102-12.  See also: "Computing the real parabolic cylinder functions U(a,x), V(a,x)".

Morris, R.J., Blanc, E. & Bricogne, G. (2004) "On the interpretation and use of <|E|²>(d*) profiles." Acta Cryst. D60, 227-40.

Popov, A.N. & Bourenkov, G.P. (2003) "Choice of data-collection parameters based on statistical modelling." Acta Cryst. D59, 1145-53.

Wilson, A.J.C. (1987) "Treatment of enhanced zones and rows in normalizing intensities." Acta Cryst. A43, 250-2.