
autoBUSTER Documentation : Usage

Copyright    © 2003-2013 by Global Phasing Limited
  All rights reserved.
  This software is proprietary to and embodies the confidential
technology of Global Phasing Limited (GPhL). Possession,
use, duplication or dissemination of the software is authorised
only pursuant to a valid written licence from GPhL.


Running the "refine" command

Command line arguments for the "refine" command

The most important command line flags are summarised below:
-h
    Basic help message. Special option to print help message and exit. Most important options shown.

-hh
    Longer help message. Special option to print help message and exit. More options shown. To show all options use -hhh.

-p <PDB file>
    PDB file with the complete macromolecule to be refined. See "PDB file requirements for autobuster".

-m <MTZ file>
    Reflection file in MTZ format with correct space group and cell parameters. See "MTZ file requirements for autobuster".

-d <subdir>
    All files will be created in this sub-directory. It is a good idea to use some systematic numbering, otherwise the current working directory might get cluttered with output. If I/O over the network is slowing down calculations, this sub-directory should be located on a fast, local file-system.

-l <library>
    User-supplied geometric restraints dictionary. Several -l switches may be given (as many as necessary). These restraint dictionary files can be REFMAC-style CIF restraint dictionaries or in TNT format. Conversion of REFMAC-style restraint dictionaries is done with refmacdict2tnt (since the March 2010 release).

-WAT [<ncyc>]
    Switches on water updating (optionally only after cycle <ncyc>). Default = don't do water updating.

-M
    Use a predefined macro. Each macro combines a set of related parameters to perform a specific task. To see a list of available macros: refine -M list.

-nbig <no BIG cyc>
    Number of BIG cycles (refinement/water building/bulk solvent mask update/weight adjustment) to perform. A positive number is required. Default = 5. Note that this number may be automatically increased if water updating is selected and there is a significant change in the overall water structure.

-nsmall <no SMALL cyc>
    Number of SMALL cycles of refinement to perform during each BIG cycle. Default = 100.

-R <reslow> <reshigh>
    Low- and high-resolution limits for refinement. Default = use all data present in the MTZ file.

-r <rms(bond) target>
    Target value for the rms(bond) deviation used for automatically adjusting the X-ray weight. Default = 0.010 Å.

-w <X-ray weight>
    Starting X-ray weight. Default = take the recorded value from the header of the input PDB file if it was previously refined with BUSTER; otherwise start with a value of 4.0.
    Note that the weight will still be adjusted throughout the run to achieve the desired rms(bond) deviation, as set by the -r flag (or at least to get reasonably close to this value). To use a constant X-ray weight, set the desired weight with the -w flag and set the parameter AdjustXrayWeightAutomatically to "no".

-Seq <TNT seq>
    TNT sequence file. Default = generate automatically from the input PDB file using the pdb2seq tool. For more complex connectivity, such as covalently bound cofactors, see the "TNT sequence file" section.

-RB [<rigid.dat>]
    Perform rigid-body refinement for one BIG cycle. The default behaviour of -RB is to assign a single rigid body per chain. Specific rigid-body descriptions can be supplied in the optional file; see "Rigid-body description format" for more details.
    Several -RB flags may be given, in which case rigid-body refinement is performed for one BIG cycle for each of the specified rigid-body descriptions, in the order given; see "Rigid-body usage" for more details.

-L
    Turns on water updating and uses it to enhance difference density to aid in the identification of potential ligand sites with unknown location. If potential locations are found, they are described in the form of PDB files cluster-<i>.pdb. These are also used to generate thumbnail pictures of those regions (see file analyse.html).
    For further information see "ligand chasing procedure (unknown position)".

-Lpdb <PDB file>
    Turns on water updating and uses it to enhance difference density to aid in the identification of unmodelled ligands whose location is known. The location is described by a PDB file which contains "atoms" describing the space of the binding site. Any water atoms placed around the positions defined in this PDB file will be removed prior to the last BIG cycle.
    For further information see "ligand chasing procedure (known position)".

-noWAT [<ncyc>]
    Switches off water updating for the first <ncyc> cycles. The default is to switch it off for all cycles. Since the default is NOT to update waters (see -WAT), this argument only has an effect if -L or -Lpdb has been set previously.

-autoncs
    Use automatic setup of LSSR-type NCS restraints. See the "NCS restraints" section for more details.

-autoncs_noprune
    Switch off automatic pruning of NCS outliers. See the "NCS restraints" section for more details.

-target <target PDB>
    Target-structure refinement against a known, high-quality and/or high-resolution structure using LSSR restraints. See "Target restraints".

-sim_swap_equiv
    Improve the NCS relationship of the symmetrical side-chains Asp, Glu, Tyr, Phe and Arg by swapping equivalent atoms.

-sim_swap_equiv_plus
    As -sim_swap_equiv, but also includes Asn, Gln and His.

-nthreads <no. of threads>
    How many threads to use on multi-CPU/multi-core machines. The default is to use a limited number of the available threads; see "Controlling the number of threads" for details. If a negative value is given, a fraction of the available threads is used (e.g. -2 means to use half the threads and -4 means to use a quarter of the threads).

-report
    Run buster-report at the end of refine. It is important to ensure buster-report is correctly set up before using this option. See the "buster-report" chapter for details.

-qm <ligand name and charge> (e.g. <LIG+1>)
    Residue type for which to use the quantum energy. Can be given more than once to handle multiple types. BUSTER from the October 2010 release onwards can compute the quantum-mechanical energy of a ligand conformation directly, and use this as part of the objective function in refinement. See AutobusterLigandQM on the wiki for details.

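
As an illustration of how the most common flags combine, the following Python sketch assembles an argument list for "refine" from flags documented above (-p, -m, -d, -nbig, -l, -WAT). The function name build_refine_command and the file names model.pdb, data.mtz, lig.cif and 01-BUSTER are hypothetical; the sketch only builds the command and does not run it.

```python
from typing import List, Optional


def build_refine_command(
    pdb: str,
    mtz: str,
    subdir: str,
    nbig: int = 5,
    libraries: Optional[List[str]] = None,
    water_update: bool = False,
) -> List[str]:
    """Assemble an argument list for "refine" (hypothetical helper)."""
    if nbig <= 0:
        # The documentation states that -nbig requires a positive number.
        raise ValueError("-nbig requires a positive number of BIG cycles")
    cmd = ["refine", "-p", pdb, "-m", mtz, "-d", subdir, "-nbig", str(nbig)]
    # Several -l switches may be given, one per restraint dictionary.
    for lib in libraries or []:
        cmd += ["-l", lib]
    # -WAT without <ncyc> switches on water updating.
    if water_update:
        cmd.append("-WAT")
    return cmd


# Hypothetical file names, for illustration only.
cmd = build_refine_command("model.pdb", "data.mtz", "01-BUSTER", libraries=["lig.cif"])
print(" ".join(cmd))
```

The resulting list could be passed to subprocess.run on a machine where "refine" is installed and licensed.
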
Less frequently used command-line arguments:
-TLS [<tls.dat>]
    Do TLS refinement (with optional TLS description). We recommend using the -M TLSbasic macro in the first instance. See "TLS refinement" for more details on the use of TLS refinement.

-Gelly <file>
    File with GELLY-style commands. Use of the -Gelly flag allows expert setting of more complex NCS restraints, target restraints, B-factor groupings and occupancy refinement. See the "GELLY Manual" for more details.

-x <PDB file>
    Waters will not be placed around any atoms in this PDB file at any step during the refinement. This has a slightly different effect from the -Lpdb flag! For further information see "ligand chasing procedure (known position: variation)".

-autoncs_weight <number>
    Weight to use for -autoncs LSSR restraints. Default = 2/(no. of NCS chains in the set); see the LIST.html file (with the BUSTER run details) for the actual value. It is not normally necessary to change the default. However, if -autoncs worsens Rfree, try reducing this weight.

-target_weight <number>
    Weight to use for -target LSSR restraints. Default = 1.0. It is not normally necessary to change the default. However, if applying target restraints worsens Rfree, try reducing the target weight.

-dlim <number>
    Set the convergence limit within each BIG cycle: maximum rmsd distance to the starting structure. Default = not set.

-glim <number>
    Set the convergence limit within each BIG cycle: maximum value of the gradient. Default = 4.0.

-special_dist <number>
    Distance in Angstroms used to identify atoms and ions at special positions.

-B <B-ref type>
    Type of B-factor refinement to perform: one of "individual", "None" or "user". Default = determined automatically by resolution. At resolutions higher than 3.5 Å, individual B-factors are refined. At resolutions lower than 3.5 Å, no B-factor refinement is performed. -B user must be used in conjunction with any -Gelly command that describes a user-defined B-factor refinement scheme. See "B-factor refinement" for details.

-reportrm
    Run buster-report at the end of refine and remove the original -d directory. It is important to ensure buster-report is correctly set up before using this option. See the "buster-report" chapter for details. Use this option with caution.
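
As a worked example of the -autoncs_weight default quoted above, 2/(no. of NCS chains in the set), here is a minimal Python sketch. The function name default_autoncs_weight is hypothetical; the value actually used in a run is the one reported in LIST.html.

```python
def default_autoncs_weight(n_ncs_chains: int) -> float:
    """Default weight for -autoncs LSSR restraints: 2 / (number of NCS chains)."""
    if n_ncs_chains < 1:
        raise ValueError("an NCS set needs at least one chain")
    return 2.0 / n_ncs_chains


# Two NCS-related chains give a weight of 1.0, four chains give 0.5.
for n in (2, 4):
    print(n, default_autoncs_weight(n))
```
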

Controlling the number of threads used by BUSTER

BUSTER includes OpenMP multiprocessing code and can therefore take advantage of multiprocessor machines. By default, the "refine" command obtains the number of CPUs reported by the operating system of the machine on which it runs (see below) and uses the number of threads shown in the table below, unless the environment variable OMP_NUM_THREADS is set or the refine argument -nthreads is used.

    Number of CPUs    Default number of threads used by BUSTER
    1                 1
    2                 2
    3                 3
    4-23              4
    24-63             6
    64 or more        8
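
The table above can be expressed as a small lookup function. This is a minimal Python sketch of the documented mapping only; the function name default_threads is hypothetical and this is not BUSTER's own code.

```python
def default_threads(ncpu: int) -> int:
    """Default number of threads for a given CPU count, per the table above."""
    if ncpu < 1:
        raise ValueError("number of CPUs must be at least 1")
    if ncpu <= 3:
        return ncpu  # 1, 2 or 3 CPUs: one thread per CPU
    if ncpu <= 23:
        return 4     # 4-23 CPUs
    if ncpu <= 63:
        return 6     # 24-63 CPUs
    return 8         # 64 CPUs or more


for n in (1, 3, 4, 23, 24, 63, 64):
    print(n, default_threads(n))
```
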

The number of CPUs reported by the operating system is determined by running:

    Linux : % grep -c '^processor' /proc/cpuinfo
    Darwin: % /usr/sbin/sysctl hw.ncpu
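
For readers who want to reproduce these counts in a script, the following Python sketch mirrors the two commands. It assumes the usual /proc/cpuinfo layout on Linux and the "hw.ncpu: <N>" output format of sysctl on Darwin; both function names are hypothetical.

```python
def count_linux_processors(cpuinfo_text: str) -> int:
    """Count lines starting with 'processor', like grep -c '^processor'."""
    return sum(1 for line in cpuinfo_text.splitlines() if line.startswith("processor"))


def parse_darwin_ncpu(sysctl_output: str) -> int:
    """Extract N from sysctl output of the form 'hw.ncpu: N'."""
    return int(sysctl_output.split(":", 1)[1].strip())


# Shortened, made-up sample inputs for illustration.
sample_cpuinfo = "processor\t: 0\nmodel name\t: example\nprocessor\t: 1\n"
print(count_linux_processors(sample_cpuinfo))
print(parse_darwin_ncpu("hw.ncpu: 8\n"))
```
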

To override this default behaviour, set the environment variable OMP_NUM_THREADS; its value will then be used in preference to the default. Note that other applications using OpenMP are also affected by OMP_NUM_THREADS, so care is needed to avoid conflicts.

Another way to control the number of threads used by a "refine" job is the refine parameter nthreads, e.g. nthreads="8". This could be included in a .autoBUSTER file, but this would seldom be useful.

Finally, use of the "refine" command-line argument -nthreads will take precedence over both the default behaviour and the environment variable OMP_NUM_THREADS. A positive value <N> is used directly, while a negative value makes BUSTER use the fraction: (all available)/<|N|>.
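
The precedence described in this section can be sketched in Python as follows. This is an illustration under stated assumptions, not BUSTER's implementation: the function name resolve_threads is hypothetical, "all available" is taken to mean the CPU count reported by the operating system, the fraction is rounded down with a minimum of one thread, and the nthreads refine parameter is left out because its precedence is not stated here.

```python
from typing import Mapping, Optional


def resolve_threads(
    ncpu: int,
    default: int,
    env: Optional[Mapping[str, str]] = None,
    nthreads_arg: Optional[int] = None,
) -> int:
    """Pick a thread count: -nthreads first, then OMP_NUM_THREADS, then the default."""
    if nthreads_arg is not None:
        if nthreads_arg == 0:
            raise ValueError("-nthreads 0 is not described in the documentation")
        if nthreads_arg > 0:
            return nthreads_arg  # positive value is used directly
        # Negative value: (all available)/|N|. Rounding down and the
        # minimum of 1 are assumptions made for this sketch.
        return max(1, ncpu // abs(nthreads_arg))
    omp = (env or {}).get("OMP_NUM_THREADS")
    if omp:
        return int(omp)  # environment variable overrides the default
    return default


# 24-CPU machine with a default of 6 threads (see the table above).
print(resolve_threads(24, 6))
print(resolve_threads(24, 6, env={"OMP_NUM_THREADS": "12"}))
print(resolve_threads(24, 6, env={"OMP_NUM_THREADS": "12"}, nthreads_arg=-4))
```
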

Some information on how BUSTER "refine" scales with the number of threads on a 24-CPU machine is available on the BUSTER wiki page BusterShortRefineTest2.

Picture generation with Pymol

To get final thumbnails (and larger pictures) of the (potential) binding site with various types of density displayed, the graphics program PyMOL needs to be installed and available in your PATH as "pymol". ImageMagick programs are used only to convert the final pictures into JPEG format.

This is only relevant if the -L or -Lpdb flag is used, i.e. when autoBUSTER tries to detect ligand binding sites. The file analyse.html will then contain pictures of the (potential) binding site(s).

Automatic restraints generation

If a residue is encountered for which no standard dictionary is found in the Engh & Huber parameter file for proteins ($BDG_home/tnt/data/protgeo_eh99.dat) or the distributed DNA/RNA parameter file ($BDG_home/tnt/data/nuclgeo.dat), the following logic is used:

  1. Check the other well-defined dictionary files for co-factors ($BDG_home/tnt/data/cofactor_geo.dat), sugars ($BDG_home/tnt/data/sugar.dat) and other frequent compounds ($BDG_home/tnt/data/othergeo.dat).
  2. If the NeverGenerateDictionary option is set to no, PDB2TNT is used to generate a dictionary based on the current set of coordinates as found in the PDB file. This does not work if the current coordinates for the ligand include hydrogen atoms.
We would strongly recommend that you do not turn on the automatic restraints generation, and instead use grade to generate dictionaries. A set of sample coordinates, particularly without hydrogens, is a very bad description of a ligand's chemistry, and there are serious problems with hysteresis over repeated refinements. It is also possible to use quantum-mechanical restraints for a ligand, with the -qm LIG option, but a ligand dictionary in CIF format is still required in order to get the atom typing right.
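
The lookup order described above can be sketched in Python as follows. This is an illustration of the documented logic only: the function name restraint_source and the mapping "known" (dictionary file path to the set of residue names it defines) are hypothetical stand-ins for reading the real files, and paths are given relative to $BDG_home.

```python
from typing import Mapping, Optional, Set

# Standard dictionaries, consulted first.
PRIMARY = ["tnt/data/protgeo_eh99.dat", "tnt/data/nuclgeo.dat"]
# Step 1: other well-defined dictionaries (co-factors, sugars, frequent compounds).
SECONDARY = ["tnt/data/cofactor_geo.dat", "tnt/data/sugar.dat", "tnt/data/othergeo.dat"]


def restraint_source(
    residue: str,
    known: Mapping[str, Set[str]],
    never_generate: bool,
) -> Optional[str]:
    """Return where restraints for a residue would come from, or None if nowhere."""
    for path in PRIMARY + SECONDARY:
        if residue in known.get(path, set()):
            return path
    # Step 2: only if NeverGenerateDictionary is "no" is PDB2TNT used.
    if not never_generate:
        return "PDB2TNT"
    return None


# Made-up contents for illustration.
known = {"tnt/data/protgeo_eh99.dat": {"ALA", "GLY"}, "tnt/data/sugar.dat": {"NAG"}}
print(restraint_source("NAG", known, never_generate=True))
print(restraint_source("LIG", known, never_generate=True))
```

A result of None corresponds to the recommended route: supply a dictionary explicitly with -l, for example one generated with grade.
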
Last modification: 16.07.2014