Appendix C2: Similarity restraint: .Gelly card specifications
Copyright © 2008-2016 Global Phasing Ltd.
All rights reserved.
Overview
The use of NCS restraints in refinement can be complex. Because of this we have introduced LSSR similarity restraints and
simple-to-use command line arguments for applying them.
These should cover many common applications.
However, the automated setup methods may need manual tweaking.
In addition, users may want to use more sophisticated treatments. This appendix describes the .Gelly cards that allow the application
of different kinds of restraints, both for NCS and to target structures.
See also how to use .gelly cards in BUSTER.
Defining similarity groups for NCS with the NOTE BUSTER_SIM_DEFINE card
This card is used to define a "similarity group".
The card syntax is:
NOTE BUSTER_SIM_DEFINE <Group_name> <Set_specifier> <Related_chain_list> ...
where:
<Group_name>:
this is an arbitrary user-defined group name. It is usually best to make this
informative. For instance, to define an NCS relation between chains A and B a good
group name would be "ncsAB", but if you wanted you could call it
"Peter_Rabbit".
<Set_specifier>:
This is used to define the template atoms for the NCS
relation. The template must be from a single chain but
can be all or any part of this. Set_specifier can either
be (a) a BUSTER set name - for instance 'Chain_D' or
(b) a direct atom selection within curly brackets, for
instance { D|*:CA } for just CA atoms in chain D.
<Related_chain_list>:
this is a list of chains whose atoms are to be
restrained to the position of the corresponding atom
in the template after superposition. Note that the
template chain name must not be specified. If more
than one related chain is given, restraints are used
between all pairs of chains, but only for atoms whose
equivalents are in the template. (The word 'null'
should be used to specify the blank chain.)
- The match used requires that related atoms have a different chain identifier but the same atom name
and residue number/insertion code.
- If you want to define a similarity group that relates two chains to another two chains then this can be done with a
NOTE BUSTER_SIM_JOINT card.
- Note that the definition of a group does not in itself turn on any restraints; this is done independently, for example with
a NOTE BUSTER_SIM_RESTRAIN_LSSR card.
- The same NOTE BUSTER_SIM_DEFINE card is used for both NCS and target type relations.
The use of the NOTE BUSTER_SIM_DEFINE card for target restraint definition is dealt with below.
- As part of setting up a similarity group, the RMSD for a superposition between the different chains and an analysis
of the difference in temperature factors are listed.
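The matching rule and the "all pairs of chains" behaviour described above can be illustrated with a short Python sketch. This is not gelly's implementation: the atom records (plain dictionaries) and the function names `atom_key` and `restrained_pairs` are hypothetical and exist only for this illustration.

```python
from itertools import combinations

def atom_key(atom):
    # Atoms are matched on atom name plus residue number/insertion code.
    return (atom["name"], atom["resnum"], atom["icode"])

def restrained_pairs(atoms, template_chain, template_keys, related_chains):
    """List (chain1, chain2, key) for every pair of chains in the group.

    Only keys in the template selection are considered, and a pair is
    listed only when both chains actually contain the matching atom.
    """
    chains = [template_chain] + list(related_chains)
    present = {chain: set() for chain in chains}
    for atom in atoms:
        if atom["chain"] in present:
            present[atom["chain"]].add(atom_key(atom))
    pairs = []
    for c1, c2 in combinations(chains, 2):
        for key in sorted(template_keys):
            if key in present[c1] and key in present[c2]:
                pairs.append((c1, c2, key))
    return pairs

# Hypothetical toy model: chain C lacks residue 2.
atoms = [
    {"chain": "A", "name": "CA", "resnum": 1, "icode": ""},
    {"chain": "A", "name": "CA", "resnum": 2, "icode": ""},
    {"chain": "B", "name": "CA", "resnum": 1, "icode": ""},
    {"chain": "B", "name": "CA", "resnum": 2, "icode": ""},
    {"chain": "C", "name": "CA", "resnum": 1, "icode": ""},
]
template = {atom_key(a) for a in atoms if a["chain"] == "A"}
print(len(restrained_pairs(atoms, "A", template, ["B", "C"])))  # 4
```

In this toy case chains A and B share two matched atoms, while A/C and B/C share one each, giving four matched atom pairs in total.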
Examples
- To define NCS between chains A and B use the card:
NOTE BUSTER_SIM_DEFINE ncsAB Chain_A B
- Suppose you have six-fold NCS between chains A, B, C, D, E and F. You want to define an NCS group for this
but for the protein main chain only:
NOTE BUSTER_SET Chain_A_NCS = Chain_A & Back
NOTE BUSTER_SIM_DEFINE ncsABCDEF Chain_A_NCS B C D E F
- Following on from the previous example, you can define a tight superposition-based restraint on positions
for the group with a separate NOTE BUSTER_SIM_RESTRAIN_RMSD card. A
sigma of 0.05 Å is used here, so the coupling between the chains is very tight.
NOTE BUSTER_SET Chain_A_NCS = Chain_A & Back
NOTE BUSTER_SIM_DEFINE ncsABCDEF Chain_A_NCS B C D E F
NOTE BUSTER_SIM_RESTRAIN_RMSD ncsABCDEF 0.05
- Continuing with this example, suppose you found that the loop between residue numbers 107 and 112 had distinct conformations. You
can "prune" these residues out of the NCS restraint by adjusting the set of atoms used from the template. You may want to look
at the help for the NOTE BUSTER_SET card syntax.
NOTE BUSTER_SET Chain_A_NCS = Chain_A & Back
NOTE BUSTER_SET Chain_A_NCS = Chain_A_NCS \ { A|107 - A|112 }
NOTE BUSTER_SIM_DEFINE ncsABCDEF Chain_A_NCS B C D E F
NOTE BUSTER_SIM_RESTRAIN_RMSD ncsABCDEF 0.1
Once a similarity group has been defined (with a NOTE BUSTER_SIM_DEFINE card) LSSR restraints
can be activated for this group with a NOTE BUSTER_SIM_RESTRAIN_LSSR card. The theory of LSSR restraints
is detailed elsewhere. The card syntax is:
NOTE BUSTER_SIM_RESTRAIN_LSSR <Group_name> [Weight]
where:
<Group_name>:
this is a user-defined similarity group name that has already been defined with
a NOTE BUSTER_SIM_DEFINE card or
a NOTE BUSTER_SIM_JOINT card.
[Weight]:
this is an optional weight. The weight must be a positive real number. If no
weight is given the default weight of 1.0 is used.
- Example:
Suppose you have two chains A and B that are related by NCS. To use LSSR restraints for these you could use the .gelly cards:
NOTE BUSTER_SIM_DEFINE ncsAB Chain_A B
NOTE BUSTER_SIM_RESTRAIN_LSSR ncsAB
This means that close interatomic contact distances within the A chain will be restrained to be close
to the corresponding distances within the B chain. The plateauing LSSR restraint function will be used.
Note that LSSR restraints act solely on positions; no restraint on temperature factors is produced (this can be done
separately).
- Example where weights are set: you have two homotrimers in the asymmetric unit. One trimer has chains A, B and C and the other D, E and F. A 2-fold axis
relates A to D, B to E and C to F. Each chain is conformationally closer to its 2-fold partner than to the other chains. You decide
you want to restrain the 2-fold relations with a weight of 2.0 and the relations within each trimer with a weight of 0.5:
# double weight for NCS across 2 fold
NOTE BUSTER_SIM_DEFINE ncsAD Chain_A D
NOTE BUSTER_SIM_DEFINE ncsBE Chain_B E
NOTE BUSTER_SIM_DEFINE ncsCF Chain_C F
NOTE BUSTER_SIM_RESTRAIN_LSSR ncsAD 2.0
NOTE BUSTER_SIM_RESTRAIN_LSSR ncsBE 2.0
NOTE BUSTER_SIM_RESTRAIN_LSSR ncsCF 2.0
# half weight within each trimer
NOTE BUSTER_SIM_DEFINE ncsABC Chain_A B C
NOTE BUSTER_SIM_DEFINE ncsDEF Chain_D E F
NOTE BUSTER_SIM_RESTRAIN_LSSR ncsABC 0.5
NOTE BUSTER_SIM_RESTRAIN_LSSR ncsDEF 0.5
- Note that the TNT "WEIGHT NCS" (set in BUSTER by parameter wncs) has no effect on LSSR restraints; it only influences harmonic restraints.
Once a similarity group has been defined (with a NOTE BUSTER_SIM_DEFINE card), superposition-based
RMSD restraints can be activated for this group with a NOTE BUSTER_SIM_RESTRAIN_RMSD card:
NOTE BUSTER_SIM_RESTRAIN_RMSD <Group_name> [Sigma_XYZ]
where:
<Group_name>:
this is a user-defined group name that has already been defined with
a NOTE BUSTER_SIM_DEFINE card or
a NOTE BUSTER_SIM_JOINT card.
[Sigma_XYZ]:
is an optional positive real number specifying the sigma to be used on the restraint
in Å. Each atom pair in the restraint will contribute separately with this sigma.
If no Sigma_XYZ is specified then a sigma will be obtained from the
"WEIGHT NCS" that is current.
The TNT "WEIGHT NCS" value is normally set by the BUSTER parameter
wncs (the default value for wncs is 50).
Following the practice of the TNT ncs program,
Sigma_XYZ is set to 1/sqrt(wncs) (if not specified on the card). The
default Sigma_XYZ is accordingly 0.14 Å.
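The relation between wncs and the default Sigma_XYZ can be checked with a few lines of Python. This is only a worked version of the formula quoted above; the function name `default_sigma_xyz` is a hypothetical helper, not part of BUSTER or gelly.

```python
import math

def default_sigma_xyz(wncs=50.0):
    """Sigma_XYZ in Angstrom when none is given on the card: 1/sqrt(wncs)."""
    if wncs <= 0:
        raise ValueError("wncs must be positive")
    return 1.0 / math.sqrt(wncs)

print(f"{default_sigma_xyz():.2f}")       # 0.14 (default wncs of 50)
print(f"{default_sigma_xyz(400.0):.2f}")  # 0.05
```

By the same formula, a wncs of 400 would correspond to a sigma of 0.05 Å, the value used for very tight coupling in the examples.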
- These restraints use RMSD superposition routines from:
Coutsias, E.A., Seok, C., Dill, K.A.(2004)
"Using quaternions to calculate RMSD", J. Comput. Chem., 25:1849-1857
- See also TNT Users' Guide section on
Noncrystallographic Restraints
- We used to call these restraints "soft-NCS". However, because the restraints are fully harmonic
and superposition is inflexible for large chains, they are not really soft! LSSR restraints are normally much softer.
If you want to ensure that two chains are kept pretty much identical then use RMSD based restraints with
a sigma of 0.05 Å. Differences will then effectively not be allowed during refinement.
- Restraints on temperature factors can be separately specified with a
NOTE BUSTER_SIM_RESTRAIN_B card.
- For example, if you have two chains A and B and want tight RMSD restraints for all atoms, use a .gelly file
containing:
NOTE BUSTER_SIM_DEFINE ncsAB Chain_A B
NOTE BUSTER_SIM_RESTRAIN_RMSD ncsAB 0.05
- For another example, suppose you have 4 chains A, B, C and D. You wish to have a soft NCS restraint on positions only, for backbone atoms, covering all of the chain except residues 23 to 27, where there is a loop with distinct conformations. This can be achieved by using the dictionary-provided set 'Back' for backbone atoms and the automatic set 'Chain_A'. The sigma used is left under the control of wncs (defaults to 0.14 Å):
NOTE BUSTER_SET Chain_A_NCS = Chain_A & Back
NOTE BUSTER_SET Chain_A_NCS = Chain_A_NCS \ { A|23 - A|27 }
NOTE BUSTER_SIM_DEFINE ncsABCD Chain_A_NCS B C D
NOTE BUSTER_SIM_RESTRAIN_RMSD ncsABCD
Once a similarity group has been defined (with a NOTE BUSTER_SIM_DEFINE card)
harmonic restraints coupling the temperature factors can be
activated for this group with a NOTE BUSTER_SIM_RESTRAIN_B card:
NOTE BUSTER_SIM_RESTRAIN_B <Group_name> [Sigma_B]
where:
<Group_name>:
this is a user-defined group name that has already been defined with
a NOTE BUSTER_SIM_DEFINE card or
a NOTE BUSTER_SIM_JOINT card.
[Sigma_B]:
is an optional positive real number specifying the sigma to be used on the restraint
in Å². Each atom pair in the restraint will contribute separately with this sigma.
If no Sigma_B is specified then a sigma will be obtained from the
"WEIGHT NCS" that is current.
The TNT "WEIGHT NCS" value is normally set by the BUSTER parameter
wncs (the default value for wncs is 50).
Following the practice of the TNT ncs program,
Sigma_B is set to (5/0.3)*1/sqrt(wncs) (if not specified on the card). The
default Sigma_B is accordingly 2.36 Å².
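As a quick check of the default quoted above, the Sigma_B formula can be evaluated in Python. The function name `default_sigma_b` is a hypothetical helper used only for this arithmetic; it is not a BUSTER or gelly routine.

```python
import math

def default_sigma_b(wncs=50.0):
    """Sigma_B in Angstrom^2 when none is given: (5/0.3) * 1/sqrt(wncs)."""
    if wncs <= 0:
        raise ValueError("wncs must be positive")
    return (5.0 / 0.3) / math.sqrt(wncs)

print(f"{default_sigma_b():.2f}")  # 2.36 (default wncs of 50)
```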
- The restraint is based on that used in TNT: it couples individual temperature factors
but allows the average temperature factor for each chain to vary without penalty.
- In practice, use of NCS restraints on temperature factors usually produces only a marginal improvement
in Rfree within BUSTER. For this reason the automated options do not use temperature factor restraints.
- For example, if you have two chains A and B and want to use LSSR for positions and to couple temperature factors
with the default moderate coupling, use a .gelly file containing:
NOTE BUSTER_SIM_DEFINE ncsAB Chain_A B
NOTE BUSTER_SIM_RESTRAIN_LSSR ncsAB
NOTE BUSTER_SIM_RESTRAIN_B ncsAB
Joining together different similarity groups with the NOTE BUSTER_SIM_JOINT card
(useful for including water in NCS)
The NOTE BUSTER_SIM_JOINT card allows the joining together of two or more similarity groups to form
a new group.
The card syntax is:
NOTE BUSTER_SIM_JOINT <New_group_name> <Existing_group1> <Existing_group2> ...
where:
<New_group_name>:
this is an arbitrary user-defined name for the new group. It is usually best to make this
informative.
<Existing_group1>:
this is a user-defined group name that has already been defined with
a NOTE BUSTER_SIM_DEFINE card.
<Existing_group2>:
this is a user-defined group name that has already been defined with
a NOTE BUSTER_SIM_DEFINE card.
<Existing_group3>:
a JOINT card must have at least two groups but can be composed of as many as you want.
- The main practical use of a joint card is to include water molecules in NCS restraints. This is demonstrated in the
4cha including water in to NCS
example found at the
Global Phasing BUSTER Wiki.
The CCP4 tool sortwater (http://www.ccp4.ac.uk/dist/html/sortwater.html)
can be used to sort waters by the protein chain to which they "belong". If this has been done for a protein with chain IDs A and B,
producing waters in chains U and V, then the following .Gelly file would be useful:
NOTE BUSTER_SIM_DEFINE ncsA-B Chain_A B
NOTE BUSTER_SIM_DEFINE ncsU-V Chain_U V
NOTE BUSTER_SIM_JOINT ncsAU-BV ncsA-B ncsU-V
NOTE BUSTER_SIM_RESTRAIN_LSSR ncsAU-BV
The joint similarity group means that the A/U hybrid is related to the B/V hybrid.
LSSR restraints are used. This means that the contact distances between corresponding water
molecules and their proteins will be restrained to be similar.
TNT uses a CLUSTER card with CHAINS
in it to specify harmonic restraints on positions and temperature factors. gelly will interpret CLUSTER
cards by 'translating' them into its own format and should produce identical function values. Details
of the translation will be listed in LIST.html. For instance the CLUSTER card:
CLUSTER NTERM RESIDUE 11 - 59 CHAINS A B C D
is translated by gelly:
Translating card 'CLUSTER NTERM RESIDUE 11 - 59 CHAINS A B C D' into gelly format:
NOTE BUSTER_SIM_DEFINE cluster0001_NTERM { A|11 - A|59 } B C D # card added - translated CLUSTER
NOTE BUSTER_SIM_RESTRAIN_RMSD cluster0001_NTERM # card added - translated CLUSTER
NOTE BUSTER_SIM_RESTRAIN_B cluster0001_NTERM # card added - translated CLUSTER
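The translation shown above can be mimicked with a small Python sketch. It handles only the single card layout used in this example (one residue range followed by CHAINS) and is not gelly's actual translator. The function name `translate_cluster` is hypothetical, and treating the "0001" in the group name as a running card index is an assumption.

```python
def translate_cluster(card, index=1):
    """Translate 'CLUSTER <name> RESIDUE <first> - <last> CHAINS <c1> <c2> ...'."""
    tokens = card.split()
    # Only this one layout is supported in the sketch.
    if (len(tokens) < 9 or tokens[0] != "CLUSTER" or tokens[2] != "RESIDUE"
            or tokens[4] != "-" or tokens[6] != "CHAINS"):
        raise ValueError(f"unsupported CLUSTER card: {card!r}")
    name, first, last = tokens[1], tokens[3], tokens[5]
    # The first chain acts as the template; the rest are the related chains.
    template, related = tokens[7], tokens[8:]
    group = f"cluster{index:04d}_{name}"  # assumed numbering scheme
    selection = f"{{ {template}|{first} - {template}|{last} }}"
    return [
        f"NOTE BUSTER_SIM_DEFINE {group} {selection} {' '.join(related)}",
        f"NOTE BUSTER_SIM_RESTRAIN_RMSD {group}",
        f"NOTE BUSTER_SIM_RESTRAIN_B {group}",
    ]

for line in translate_cluster("CLUSTER NTERM RESIDUE 11 - 59 CHAINS A B C D"):
    print(line)
```

For the card above this prints the same three cards as the gelly listing, without the trailing "# card added" comments.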
The previous release of gelly used a NOTE BUSTER_NCS_SOFT card to allow more flexible
specification of harmonic NCS restraints. These cards continue to be read and will produce identical
results. They are interpreted by 'translating' them into the new format. Details
of the translation will be listed in LIST.html. It is advised to switch to the new format
as this allows the use of new features such as LSSR.
Target restraints enable the exploitation of the similarity of
the structure under refinement to an already known structure.
A command line shortcut sets up
a basic default approach. This section explains the gelly cards for target restraints.
The cards are both used by the shortcut routine and can be user specified for finer control.
The first thing that is necessary for target restraints is to load an external target structure to be used.
The
NOTE BUSTER_TARGET01 full_path_to_file.pdb
card allows the specification of the coordinates to be used for a target coordinate set.
- Note that the target coordinates are kept fixed during each refinement.
- It is necessary to specify the full unix pathname for the coordinate file.
- It is normal to use .pdb coordinates for the target file but the routines also allow
TNT type .cor records to be used.
- All records other than ATOM/HETATM are ignored.
- It is not necessary or desirable to displace the coordinates into a common frame with the structure being refined.
LSSR restraints are based on close interatomic distances within a structure, so superposition is irrelevant.
RMSD based restraints will work out the optimal superposition for the similarity group(s) during the calculation.
- If more than one target coordinate set is required then use
NOTE BUSTER_TARGET02 full_path_to_file2.pdb
for the next. Up to 99 coordinate sets can be defined.
- Note that the -target filename.pdb command line option will produce "on-the-fly" additional
gelly cards including NOTE BUSTER_TARGET01 to activate default (all atom) LSSR target restraints to a given structure.
- Target similarity groups are defined by a special form of the
NOTE BUSTER_SIM_DEFINE card.
In the special form a single "Related_chain" is specified, starting with "TARGET01_".
- For example
NOTE BUSTER_TARGET01 full_path_to_file.pdb
NOTE BUSTER_SIM_DEFINE AtargetY Chain_A TARGET01_Y
NOTE BUSTER_SIM_RESTRAIN_LSSR AtargetY
will define a similarity group between the predefined BUSTER_SET "Chain_A" (that is composed of the A chain)
and the Y chain from the TARGET01 structure. LSSR restraints are used for the similarity group.
- For target similarity groups only, it is allowed to use a wildcard for the target chain, for example
NOTE BUSTER_TARGET01 full_path_to_file.pdb
NOTE BUSTER_SIM_DEFINE AllTarget All TARGET01_*
NOTE BUSTER_SIM_RESTRAIN_LSSR AllTarget
This will match all atoms from all chains to the corresponding atom from the target structure
(matching atom name, residue number and chain identity).
- If more than one target structure is used then the "Related_chain" chain specifier should
start with "TARGET01_", "TARGET02_" etc. For example, consider a structure under refinement with two
chains A and B. The A chain is based on a high resolution structure "foo.pdb". The B chain is
based on a separate high resolution structure "another.pdb". Target restraints could be defined:
NOTE BUSTER_TARGET01 full_path/foo.pdb
NOTE BUSTER_SIM_DEFINE Afoo Chain_A TARGET01_A
NOTE BUSTER_SIM_RESTRAIN_LSSR Afoo
NOTE BUSTER_TARGET02 full_path/another.pdb
NOTE BUSTER_SIM_DEFINE Banother Chain_B TARGET02_A
NOTE BUSTER_SIM_RESTRAIN_LSSR Banother
- Once a similarity group has been defined to a target structure, it can be used for
LSSR restraints, and/or harmonic RMSD based restraints and/or temperature factor restraints.
The same cards as for NCS restraints are used. See
NOTE BUSTER_SIM_RESTRAIN_LSSR,
NOTE BUSTER_SIM_RESTRAIN_RMSD and
NOTE BUSTER_SIM_RESTRAIN_B card discussion.
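When several target structures are involved, the card triplets above follow a regular pattern that a short script can generate. The Python sketch below is only a convenience illustration: the function name `target_cards` and the path `/data/another.pdb` are hypothetical, and the checks simply encode the two constraints stated in this section (a full pathname, and target numbers 01 to 99).

```python
import posixpath

def target_cards(index, pdb_path, model_set, target_chain, group):
    """Build the three cards for LSSR restraints to one target chain."""
    if not 1 <= index <= 99:
        raise ValueError("target index must be between 1 and 99")
    if not posixpath.isabs(pdb_path):
        raise ValueError("the full unix pathname of the target file is required")
    tag = f"TARGET{index:02d}"
    return [
        f"NOTE BUSTER_{tag} {pdb_path}",
        f"NOTE BUSTER_SIM_DEFINE {group} {model_set} {tag}_{target_chain}",
        f"NOTE BUSTER_SIM_RESTRAIN_LSSR {group}",
    ]

for line in target_cards(2, "/data/another.pdb", "Chain_B", "A", "Banother"):
    print(line)
```

The generated lines would still need to be placed in a .gelly file by hand; RMSD or temperature factor restraint cards for the same group can be appended in the same way.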
Page Author: Oliver S. Smart
Please send feedback to: buster-develop@globalphasing.com
Last modification: 24.11.2016