Preferred input formats for grade

For compounds that the PDB has already assigned a code to

To get a dictionary for compound QV7, just run:

grade_PDB_ligand QV7

This downloads information from RCSB Ligand Expo. The resulting dictionary will use the atom names that the PDB assigned.


For novel compounds

In order to produce a usable dictionary, grade needs the following information:

  • Atom types
  • Bond orders
  • Names for all the atoms
  • 3D coordinates of all the atoms

If the dictionary is to be used for placing a new ligand in a structure, the atom names are not so important. If the ligand is already known to the PDB, however, significant extra work during deposition can be avoided if the atom names match those used by the PDB (see the section "For compounds that the PDB has already assigned a code to" above).

The input formats differ in which of these they can supply:

Format    Atom types  Bond orders  Atom names  Coordinates
PDB       No          No           Yes         Yes
SMILES    Yes         Yes          No          No
MOL2      Yes         Yes          Yes         Yes
MOL/SDF   No          Yes          No          Yes

We therefore suggest MOL2 files containing 3D coordinates for all atoms (including hydrogen atoms) as the preferred input format. Using OpenBabel (http://openbabel.org) to convert a PDB file containing 3D coordinates to a MOL2 file fills in the atom types and bond orders.
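As a rough sketch of the conversion just described, the shell function below wraps the OpenBabel command line. It assumes OpenBabel 2.3 or later, where the converter is called obabel (older releases provide a babel command instead). The function name pdb_to_mol2, the OBABEL override variable and the file names are illustrative only and are not part of grade or OpenBabel.

```shell
# Convert a PDB file with 3D coordinates to MOL2, letting OpenBabel
# perceive the atom types and bond orders.
# Assumption: obabel is on the PATH; set OBABEL to point at another binary.
pdb_to_mol2() {
    "${OBABEL:-obabel}" "$1" -O "$2"
}
```

For example, pdb_to_mol2 ligand.pdb ligand.mol2 would write ligand.mol2. It is worth inspecting the perceived bond orders in the result before passing the file to grade.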

The coordinates in MOL2 files are assumed to be good 3D ones, and grade will misbehave if given input with only 2D coordinates. If 2D coordinates are all you have, you will probably get better results using a SMILES string. You might also use OpenBabel (versions after 2.2.3) to convert from 2D to 3D.
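One quick way to spot a 2D-only MOL2 file before running grade is to test whether every atom has a z coordinate of zero. The function below is a minimal sketch of such a check, not something grade provides: the name mol2_is_flat is made up for illustration, and it assumes a standard Tripos MOL2 layout in which each line of the @<TRIPOS>ATOM section reads "id name x y z type ...". A molecule that is flat in some other plane would not be detected.

```shell
# Succeed (exit status 0) if the file has atoms and all z coordinates are zero,
# which usually means the coordinates are a 2D drawing rather than a 3D model.
mol2_is_flat() {
    awk '
        # Track whether we are inside the ATOM section.
        /^@<TRIPOS>/ { in_atoms = ($1 == "@<TRIPOS>ATOM"); next }
        # Field 5 is the z coordinate of an atom record.
        in_atoms && NF >= 6 { n++; if ($5 + 0 != 0) nonzero++ }
        END { exit !(n > 0 && nonzero == 0) }
    ' "$1"
}
```

A possible use is: if mol2_is_flat ligand.mol2; then echo "2D coordinates only"; fi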


Hydrogen atoms and grade

grade uses quantum methods to define some of the restraints, and these quantum methods only work if all the hydrogen atoms are explicitly present in the input file.
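A simple sanity check before running grade is to count the hydrogen atoms in the input. The sketch below does this for a MOL2 file; the function name mol2_count_hydrogens is hypothetical, and it assumes the Tripos convention that the sixth field of each @<TRIPOS>ATOM record is the atom type, with hydrogens typed as H or H.something. A count of zero shows that hydrogens are missing, but a non-zero count does not prove that all of them are present.

```shell
# Print the number of hydrogen atoms found in the ATOM section of a MOL2 file.
mol2_count_hydrogens() {
    awk '
        # Track whether we are inside the ATOM section.
        /^@<TRIPOS>/ { in_atoms = ($1 == "@<TRIPOS>ATOM"); next }
        # Field 6 is the atom type; count plain H and subtypes such as H.spc.
        in_atoms && NF >= 6 && ($6 == "H" || $6 ~ /^H\./) { n++ }
        END { print n + 0 }
    ' "$1"
}
```

For example, mol2_count_hydrogens ligand.mol2 printing 0 suggests the file needs hydrogens added before it is given to grade.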

If you have a PDB file without hydrogens, there are two approaches:

  • load the file into CCP4 sketcher, use the 'Create Library Description' function, and check that all bond orders and added hydrogen atoms are sensible. Once you are happy, locate the ideal-coordinates PDB file that the sketcher has produced and convert it to MOL2 format using a recent version of babel.
    • Obtaining atom types and bond orders from pure coordinate information is sufficiently unreliable that we recommend you check the hydrogenated output before proceeding.
  • obtain a SMILES string for the compound (for example with obprop ligand.pdb) and use that as the input; note that the atom names in the final dictionary will then be different.

We have received problem reports when versions of OpenBabel before the 2.2 release were used for the format conversion.


Page by Tom Womack, July 2011. Any questions regarding our software or this wiki should be directed to buster-develop@globalphasing.com