BUSTER Documentation : Examples

Copyright © 2003-2020 by Global Phasing Limited

All rights reserved.

This software is proprietary to and embodies the confidential
technology of Global Phasing Limited (GPhL). Possession,
use, duplication or dissemination of the software is authorised
only pursuant to a valid written licence from GPhL.

Contact buster-develop@GlobalPhasing.com

Normal refinement
Some ligand is (possibly) present, but location is not well known
A ligand is (possibly) present, and the location is well known
A ligand is (possibly) present, and the location is well known: variation
Some settings that might need adjustment

Normal refinement

To run a normal refinement, only a PDB file and an MTZ file are needed:

% refine -p some.pdb -m other.mtz -d Results.1

Results available

The results of a BUSTER refinement (in the current directory or in the subdirectory pointed to with the "-d" flag) include:

refine.pdb: the final, refined PDB file (including a header section with additional information)
refine.mtz: MTZ file with columns to calculate electron density maps. Use
- 2FOFCWT/PH2FOFCWT (2mFo-DFc map)
- FOFCWT/PHFOFCWT (mFo-DFc map)
- 2FOFCWT_aniso-fill/PH2FOFCWT_aniso-fill (2mFo-DFc map with unobserved reflections classified as "observable" by e.g. STARANISO given DFc/PHIc values)
  These will only be created if the input MTZ file contains an appropriate flag (e.g. column "SA_flag" from STARANISO)
- 2FOFCWT_iso-fill/PH2FOFCWT_iso-fill (2mFo-DFc map with unobserved reflections to the full high-resolution limit of the data given DFc/PHIc values)
  Please be aware of the potential for serious model bias if the input data is incomplete (e.g. missing cusp, ice-rings, detector gaps, anisotropy): this should be used with care in those cases.
It is easy to load these two files e.g. into Coot using
```
% coot --pdb refine.pdb --auto refine.mtz
```
BUSTER_model.cif and BUSTER_refln.cif: deposition-ready PDBx/mmCIF files.
refine.corr: tabulated values for real-space correlation of refined model against 2mFo-DFc map
analyse.html: small HTML document with tabulated statistics for each BIG cycle (and thumbnails of potential ligand-binding sites, if the -L or -Lpdb option was used together with the setting do_analyse="yes").
refine_CC-mc_Chain-<ID>.mtv and refine_CC-sc_Chain-<ID>.mtv: graphical plots of main-chain (mc) and side-chain (sc) real-space correlation for each chain <ID>. These can be viewed using plotmtv (e.g. in $BDG_home/helpers/linux/plotmtv).
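
As a quick sanity check after a run, the file names listed above can be tested for in the results directory. The following is a minimal sketch in POSIX shell, not part of BUSTER: the function name missing_results is hypothetical, and some of the listed files may not be produced by every run, depending on the options used.

```shell
# Print the names of expected BUSTER output files that are absent from a
# results directory. The file names come from the list above; the function
# name is hypothetical and not part of BUSTER.
missing_results() {
    dir=${1:-.}
    for f in refine.pdb refine.mtz BUSTER_model.cif BUSTER_refln.cif \
             refine.corr analyse.html; do
        [ -e "$dir/$f" ] || echo "$f"
    done
}

# Example: report anything missing from Results.1
missing_results Results.1
```

An empty output means that all six files were found.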

Handling of waters

By default, the water structure will not be updated. However, updating the water model might be a good idea once the protein model has been built and refined and is very close to the final structure. At early stages of refinement (when the macro-molecule still requires major manual or automatic rebuilding), the placement of water molecules might not be ideal. On the other hand, if larger parts of the model are still missing, placing these so-called "waters" might give the bulk solvent correction a much better and more realistic envelope. Similarly, towards the end of refinement, when water molecules have been checked manually, this feature should probably be left switched off.

Rigid-body refinement

When the starting model is poor or the cell parameters have changed (e.g. between an apo structure and a compound soak) it is a good idea to first start with some rigid-body refinement. This allows for collective motions that would otherwise take a lot of time or be impossible to achieve within a normal refinement.

To perform rigid-body refinement use the -RB command line argument. This will set up a single rigid body for each chain and start refinement with a single big cycle of rigid-body refinement (after which it will switch to normal, positional refinement for the subsequent big cycles).
It is possible to produce custom rigid-body definitions and use them with the -RB <rigid.dat> command line argument. See the Rigid-body description file format section for the syntax and how to do this.
We recommend using rigid-body refinement when starting from any molecular replacement structure or where there is a reasonable degree of non-isomorphism between the data and input model.
During a rigid-body refinement big cycle, non-bonded contacts are weighted to zero but bonded contacts continue to be active. This is a good idea as it allows e.g. misplaced loops on the outside of the protein to have short contacts with other chains or with adjacent symmetry copies. Such contacts may be relieved by normal refinement after the initial rigid-body step(s), but there can be problems, particularly for loops that are in close contact with symmetry-related copies of themselves. It is important to check for bad contacts in the screen type output or using the visualise-geometry-coot tool after doing a rigid-body refinement.
Temperature factors are held constant during rigid-body refinement big cycles.
It is sometimes a good idea to use only low resolution data during the rigid-body refinement cycles. See the Rigid-body description section for details on how to do this.
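
The two ways of invoking rigid-body refinement described above can be sketched as follows. This is a dry run that only prints the command lines, since refine requires a BUSTER installation. The helper name rb_command and the directory name Results.rb are hypothetical, and placing a bare -RB directly before -d is an assumption modelled on the bare -TLS example elsewhere in this document.

```shell
# Dry run: print a rigid-body refinement command instead of executing it.
# rb_command and Results.rb are hypothetical names used for illustration.
rb_command() {
    # $1 = PDB file, $2 = MTZ file, $3 = output directory,
    # $4 = optional rigid-body definition file
    if [ -n "${4:-}" ]; then
        echo "refine -p $1 -m $2 -RB $4 -d $3"
    else
        echo "refine -p $1 -m $2 -RB -d $3"
    fi
}

# One rigid body per chain (default set-up)
rb_command some.pdb other.mtz Results.rb
# Custom rigid-body definitions
rb_command some.pdb other.mtz Results.rb rigid.dat
```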

NCS restraints

The recommended way of defining NCS is to start from the initial hypothesis that all copies of the macro-molecule within the asymmetric unit are identical. Only if there are clear indications that parts of one monomer differ from the rest (side-chains in crystal contacts, domain and loop movements, etc.) should these parts be taken out of the NCS restraints. Therefore, the procedure to define NCS restraints should start from a completely restrained description that changes during the course of refinement and rebuilding to leave parts of the molecules out. However, the final NCS restraints should probably still cover 80-90 % of the atoms in each monomer.

The easiest way to define NCS restraints is using the -autoncs command-line flag. This will apply LSSR-type NCS restraints between all matching chains. It will automatically take care of real differences by removing those from the NCS-relation (so-called "pruning"). If the NCS-relation within the starting structure has been allowed to diverge too much (by over-eager model building into noisy maps or too aggressive refinements), it might be a good idea to try to re-instate the NCS-relation. For that, the pruning option can be switched off with -autoncs_noprune. This might also be necessary in situations where the X-ray data are rather weak, e.g. at lower resolution. But it depends a lot on the particular problem and especially the modelling history (NCS restraints do not only matter during refinement: manual model building also needs to be done under NCS restraints).

Another useful tool is the -sim_swap_equiv flag: this will try and correct problems where NCS-related atoms are chemically identical but have been given different atom names in the PDB files.
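
A dry-run sketch of the two automatic NCS variants follows; it only prints the command lines. The helper name ncs_command and the directory name Results.ncs are hypothetical. It is also an assumption here that -autoncs_noprune is given in place of -autoncs rather than in addition to it, so check the refine usage message before relying on this.

```shell
# Dry run: print a refinement command with automatic NCS restraints.
# ncs_command and Results.ncs are hypothetical names used for illustration.
# Assumption: -autoncs_noprune replaces -autoncs instead of accompanying it.
ncs_command() {
    # $1 = PDB file, $2 = MTZ file, $3 = output directory,
    # $4 = "noprune" to switch pruning off
    if [ "${4:-}" = "noprune" ]; then
        flag=-autoncs_noprune
    else
        flag=-autoncs
    fi
    echo "refine -p $1 -m $2 $flag -d $3"
}

ncs_command some.pdb other.mtz Results.ncs
ncs_command some.pdb other.mtz Results.ncs noprune
```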

B-factor refinement

Under normal circumstances, the mode of B-factor refinement is determined automatically, depending on the resolution. At lower than 3.5 Å resolution the default is to turn off any B-factor refinement, whereas individual atomic B-factors are refined at higher than 3.5 Å.

Previous versions of BUSTER used grouped B-factor models at moderate resolution (2.8 - 3.0 Å). However, we have found that with tight BCORREL restraints (the default in BUSTER), individual B-factors give superior results.

Individual B-factor refinement at lower than 3.5 Å resolution, or turning off B-factor refinement at higher than 3.5 Å, can be enforced by use of -B individual or -B None.

The resolution cutoff between these two schemes can be set with the parameter UseBrefNoneFrom.
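
The automatic choice described above amounts to a single comparison against the cutoff. The following sketch only illustrates that rule and is not BUSTER code: the function name default_b_mode is hypothetical, and treating a resolution of exactly 3.5 Å as "individual" is an assumption, since the text does not say which side the boundary falls on.

```shell
# Illustrate the documented default: no B-factor refinement at resolutions
# lower than the cutoff (numerically larger than 3.5 A), individual
# B-factors otherwise. default_b_mode is a hypothetical helper.
# Assumption: a resolution exactly at the cutoff is treated as "individual".
default_b_mode() {
    res=$1
    cutoff=${2:-3.5}
    awk -v r="$res" -v c="$cutoff" \
        'BEGIN { if (r + 0 > c + 0) print "None"; else print "individual" }'
}

default_b_mode 2.1       # prints: individual
default_b_mode 4.0       # prints: None
default_b_mode 3.2 3.0   # prints: None (cutoff moved to 3.0 A)
```

The optional second argument plays the role of the UseBrefNoneFrom cutoff, and the printed value is what would be passed to -B to enforce the same mode explicitly.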

More complex B-factor refinement modes can be set by use of the -B user option, in conjunction with -Gelly <gelly.file>. As an example, the following command may be used to refine a structure, defining a single B-factor per protein chain.

% refine -p some.pdb -m other.mtz -B user -Gelly gelly.dat

The gelly.dat file uses Gelly combine syntax. For a structure with chains A and B it would contain:

NOTE BUSTER_COMBINE B { A|* }
NOTE BUSTER_COMBINE B { B|* }
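
For structures with many chains, lines of this form can be generated rather than typed. The following is a minimal sketch that simply repeats the pattern of the two lines above for each chain identifier given. The function name chain_b_lines is hypothetical, and it is an assumption that the same pattern carries over unchanged to other chain identifiers.

```shell
# Print one BUSTER_COMBINE line per chain ID, following the pattern of the
# two-chain example above. chain_b_lines is a hypothetical helper.
chain_b_lines() {
    for chain in "$@"; do
        printf 'NOTE BUSTER_COMBINE B { %s|* }\n' "$chain"
    done
}

# Reproduce the two-chain example; redirect to gelly.dat to create the file:
#   chain_b_lines A B > gelly.dat
chain_b_lines A B
```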

TLS refinement

To enable the use of TLS parametrisation, use the -TLS option of the refine command.

In its simplest invocation use:

% refine -p some.pdb -m other.mtz -TLS -d Results.1

This will perform TLS refinement for the first big cycle and do regular refinement for subsequent big cycles. If TLS definitions are present in the input pdb file header (both group definitions AND tensors), they will be used. Otherwise, it will define a single TLS group per macro-molecular chain.

Alternatively, use of:

% refine -p some.pdb -m other.mtz -TLS tls.dat -d Results.1

will similarly do TLS refinement for the first big cycle, but using TLS domain definitions specified in tls.dat (see TLS description for format details).

For convenience two different macros can be used.

TLSbasic

% refine -p some.pdb -m other.mtz -M TLSbasic -d Results.1

This will switch on TLS refinement for the first and third big cycles and do regular refinement in the other big cycles. If TLS definitions are present in the input PDB file header, they will be used (group definitions ONLY). Otherwise, it will define a single TLS group per macro-molecular chain. We recommend using -M TLSbasic in the first instance.
TLSalternate

% refine -p some.pdb -m other.mtz -M TLSalternate -TLS tls.dat -d Results.1

Similar to use of -TLS or -TLS tls.dat alone, but will perform (up to 10) alternating cycles of TLS and restrained refinement (starting with TLS). Note that the -TLS option must be specified with this macro. Furthermore, this option does not increase the number of big cycles (default is 5). To carry out the full 10 cycles (if wanted) specify -nbig 10.
This can be especially useful when carrying out additional refinement cycles after small model alterations. The current set of TLS parameters can always be extracted using the pdb2tls tool and that output used as argument to the -TLS flag.
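
Combining the TLSalternate macro with -nbig 10, as described above, gives a command line like the one printed by this dry-run sketch. The helper name tls_alternate_command is hypothetical, and the relative order of the flags is an assumption.

```shell
# Dry run: print a TLSalternate command that asks for the full 10 big cycles.
# tls_alternate_command is a hypothetical helper; flag order is an assumption.
tls_alternate_command() {
    # $1 = PDB file, $2 = MTZ file, $3 = TLS definition file,
    # $4 = output directory, $5 = number of big cycles (10 if omitted)
    echo "refine -p $1 -m $2 -M TLSalternate -TLS $3 -nbig ${5:-10} -d $4"
}

tls_alternate_command some.pdb other.mtz tls.dat Results.1
```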

NOTE: Any atoms that are not included in a TLS domain definition will undergo normal restrained refinement.

For a more detailed description of the use of these TLS options please see the TLS tutorial WIKI.

Some ligand is (possibly) present, but location is not well known

The -L flag tells the program to remove water atoms around residual difference density at the last cycle. This should make the difference density in these (potentially) 'interesting' regions clearer. The starting PDB file should obviously not contain any atoms for the unknown ligand.

% refine -p some.pdb -m other.mtz -L -d Results.2

The file Results.2/analyse.html can be used to look at pictures of the found (possible) binding sites (requires setting of do_analyse="yes").

A ligand is (possibly) present, and the location is well known

If the location of the binding site of a new ligand is known (e.g. from previously solved structures, biochemical data or docking experiments), a PDB file with a model of this (or a similar) ligand can be given with the -Lpdb flag. This PDB file should not contain the putative ligand as present in the crystal or even a similar structure (the risk of introducing bad model bias would be unacceptably high), but just a collection of atoms that cover the space likely to be occupied by the unknown ligand structure, without highlighting its shape.

This option tells the program to remove water atoms around this PDB file at the last cycle. This should make the difference density in these 'interesting' regions clearer.

Note: Be careful when using dummy atoms to describe a large area in space: these atoms are also used to describe the region not covered by bulk solvent. So if these dummy atoms are within the bulk solvent region, some artificial difference density will appear (corresponding to the bulk solvent).

% refine -p some.pdb -m other.mtz -Lpdb lig-model.pdb -d Results.3

The file Results.3/analyse.html can be used to look at pictures of densities within the user-defined binding sites (requires setting of do_analyse="yes").

A ligand is (possibly present) in a known location. A variation: excluding regions from bulk solvent during refinement

Use the -x flag to exclude a region described by the provided PDB file from both water addition and bulk solvent region throughout the refinement. This should make the difference density in this region clearer.

However, there is always the danger of creating a biased imprint of the used PDB file in cases where nothing has bound in that site. Under those circumstances, the difference density visible is due to unmodelled bulk solvent (since the region is left out of the bulk-solvent mask). Be careful when decreasing the density level while looking at maps, especially mFo-DFc difference density maps: if one has to go to a level at which there is a lot of difference density all over the remainder of the model, it is unlikely to be significant.
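
By analogy with the -Lpdb example in the previous section, a run using -x might look like the command printed by this dry-run sketch. The helper name exclude_command and the directory name Results.4 are hypothetical, and reusing lig-model.pdb as the region file is only for illustration.

```shell
# Dry run: print a refinement command that excludes the region described by
# a PDB file from water addition and from the bulk solvent mask.
# exclude_command and Results.4 are hypothetical names used for illustration.
exclude_command() {
    # $1 = PDB file, $2 = MTZ file, $3 = PDB file describing the region,
    # $4 = output directory
    echo "refine -p $1 -m $2 -x $3 -d $4"
}

exclude_command some.pdb other.mtz lig-model.pdb Results.4
```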

Some settings that might need adjustment

Here are some flags that might need changing:

-l <library>
If a good-quality geometry dictionary is already available for ligands/compounds that are present in the input PDB file, it is recommended that these are given on the command line (to prevent the automatic generation of geometric restraints based on the current coordinates). Make sure that the residue name is correct and that all atom names match (some modelling programs rename atom names sequentially, so that the coordinates and the dictionary might be out-of-sync).
-Gelly <NCS file>
If there is more than one copy of a macro-molecule in the asymmetric unit, NCS restraints should be used. In general it seems a good initial assumption that the various copies of a monomer are identical to each other. Only if the density or crystal-contact analysis give clear indications might it be necessary to leave some residues and/or loops out of the NCS restraints. Also, if different domain-orientations can be seen, some fine-tuning in the description of the NCS-relations might be necessary.
However, completely removing NCS-restraints in case of several monomer-copies per asymmetric unit seems a bad idea and will most likely lead to over-fitting.
This is now mostly automated by the -autoncs (and related) command-line flags.
-WAT [<ncyc>]
If the solvent structure of the input PDB file is already very complete, it might be a good idea to leave the automatic update of the water structure switched off. Also, if the input structure is just at the beginning of the refinement (and rebuilding) process, the addition of waters too early in the process might prevent larger parts of the structure from moving. On the other hand, if the structure is fairly incomplete, the interpretation of so-far unexplained density by adding waters might be better than to leave large regions of additional density unmodelled.
It is difficult to give an easy recipe for dealing with waters (those present in the input PDB file as well as those visible in mFo-DFc difference maps). Some experimentation based on the characteristics of each structure/dataset/project is necessary.
There are several methods available for updating the solvent structure: PKMAPS, PKMAPS with restraints on hydrogen-bonding partners, Coot's findwaters program as well as the possibility of a completely user-defined plugin.
-r <rms(bond) target>
The value given here is probably a rather complicated way of actually weighting the X-ray and geometric terms relative to each other. Effectively, the X-ray weight will be adjusted so that the rms(bond) value comes out at roughly 0.008. Using only a single criterion for judging the relative weight between the X-ray and geometric terms is probably not sufficient. Also, the value of 0.008 is most likely not correct in many cases (the only reason we came up with this value is that an analysis of the whole PDB gives something very close to this as the mean value in nearly all resolution ranges).
Note: the whole area of weighting X-ray and geometric term as well as the weighting of the various geometric terms relative to each other will be revisited in the future.
-RB [<rigid.dat>]
If large movements are to be expected (e.g. when refining an apo-structure against a new dataset containing a compound) and the most likely movements are already well known (active-site loop motion, domain closure, etc.), it is a good idea to give BUSTER one or several rigid-body description files covering these movements. The command pdb2rig can be used to generate (fairly complete) templates for rigid-body descriptions (in GELLY syntax).
-B <B-ref type>
Sometimes it is a good idea to switch off the default B-factor refinement scheme (-B None), especially at lower resolution and/or at early stages of refinement. In case of very high non-crystallographic symmetry it could still be useful to do B-factor refinement even at resolutions lower than the current 3.5 Å cutoff (-B individual).
-nbig <no BIG cycle>
If one wants to calculate a map very quickly, the following command-line flag could be used:
```
refine -nbig 1 ...
```
-nsmall <no SMALL cycle>
The current set of defaults for a refinement using BUSTER seems a good compromise for a whole range of refinements. However, for rigid-body refinement of large rigid bodies, a smaller number of cycles could be used. Also, a larger number of cycles (several hundred) might be able to move many more side-chains into the correct place, even when large rotations/movements are required. In any case, the value given here is only an upper limit and, if convergence has been reached, far fewer iterations might be done anyway.
Note: we're working on better convergence criteria to make these decisions automatically.

Last modification: 04.02.2020
