.. _usage:

*****
Usage
*****

Before you run Grade2, make sure that you have followed the :ref:`Configuration` instructions and :ref:`tested ` that Grade2 works properly.

Running the ``grade2`` command
==============================

To run Grade2 you need to specify the molecule that you want to create a restraint dictionary for. There are currently 4 alternative input options:

.. _SMILES_input:

1. Molecule input from SMILES string.
-------------------------------------

SMILES (Simplified Molecular-Input Line-Entry System) provides a way to describe a molecular structure as an ASCII string https://en.wikipedia.org/wiki/Simplified_molecular-input_line-entry_system.

To generate a restraint dictionary for a given SMILES string simply run ``grade2`` on the command-line followed by the SMILES surrounded by single quotes: ``grade2 'SMILES'``, for example:

::

  $ grade2 'CN1C=NC2=C1C(=O)N(C(=O)N2C)C'

Please note that the dollar symbol ``$`` above represents the command prompt. This will run grade2 producing output like the following:

::

  $ grade2 'CN1C=NC2=C1C(=O)N(C(=O)N2C)C'
  set CSDHOME=/home/software/xtal/CCDC/CSDS/2021.3/CSD_2022 from $BDG_TOOL_MOGUL=/home/software/xtal/CCDC/CSDS/2021.3/CSD_2022/bin/mogul
  ############################################################################
  ## [grade2] ligand restraint dictionary generation
  ############################################################################
  Copyright (C) 2019-2022 by Global Phasing Limited
  All rights reserved.
  This software is proprietary to and embodies the confidential technology of Global Phasing Limited (GPhL). Possession, use, duplication or dissemination of the software is authorised only pursuant to a valid written licence from GPhL.
  Version: 1.1.0 <2022-02-01>
  Authors: Smart OS, Sharff A, Holstein J, Womack TO, Flensburg C, Keller P, Paciorek W, Vonrhein C and Bricogne G
  -----------------------------------------------------------------------------
  RDKit generated molecule and coordinates from input SMILES: CN1C=NC2=C1C(=O)N(C(=O)N2C)C
  CHECK: Check the molecule's InChiKey against known PDB components:
  CHECK: Exact match to PDB chemical component(s):
  CHECK: CFF https://www.rcsb.org/ligand/CFF "caffeine"
  Minimization with MMFF94s reduces energy from -104.82 to -123.49 kcal/mol
  Using CCDC Mogul-like geometry analysis.
  Mogul version 2021.3.0, CSD version 543, csd-python-api 3.0.9
  Mogul Data Libraries: as543be_ASER
  Geometry Optimize coordinates against restraints using gelly ....
  ---- gelly: Took 42 steps, reducing the rms gradient to 0.04
  ---- gelly: and the rms bond deviation to 0.004 Angstroms.
  Have written CIF-format restraint dictionary to: LIG.restraints.cif
  Have written ideal coordinates to PDB-format file: LIG.xyz.pdb
  Have written ideal coordinates to SDF-format file: LIG.xyz.sdf
  Have written ideal coordinates in MOL2-format to: LIG.xyz.mol2
  Have written schematic 2D diagram SVG-format file: LIG.diagram.svg
  Have written 2D diagram & atom_id labels to file: LIG.diagram.atom_labels.svg
  Suggestion: to view/edit the restraints, use one of the commands:
  coot -p LIG.xyz.pdb --dict LIG.restraints.cif
  EditREFMAC LIG.restraints.cif LIG.xyz.pdb LIG
  Normal termination (6 secs)

* As you can see, before the restraint dictionary is produced a ``CHECK`` is made to see whether the ligand has already been defined in the wwPDB Chemical Component Dictionary https://www.wwpdb.org/data/ccd that describes residues and small molecules in PDB entries. In this case the SMILES string is for caffeine and it would be sensible to use a restraint dictionary for ``CFF`` so that the atom names agree with the existing definition https://www.rcsb.org/ligand/CFF. See the next subsection.
* Note that the CIF-format restraint dictionary is written to file ``LIG.restraints.cif`` and has the default PDB chemical component id (aka residue name or 3-letter code) of ``LIG``. To set the 3-letter code use the command-line option :ref:`--resname `.
* As well as the CIF-format restraint dictionary, grade2 will write "ideal" coordinates based on the restraints to PDB, SDF and MOL2 formats. For more details see the :ref:`coordinates files ` section.
* Molecular diagrams are also produced; for more details see the :ref:`schematic 2D molecular diagrams <2d_diagrams>` section.
* Finally, suggestions are given on how to view the coordinates and restraints produced using ``Coot`` or ``EditREFMAC`` (supplied with ``BUSTER``).
* Note that if you do not want the coordinate or molecular diagram output files then the :ref:`--just_cif ` option can be used.
* It should be noted that when the chirality of one or more chiral centers is not specified in the SMILES string the output molecule will have an arbitrary stereochemistry assigned. :ref:`From release 1.3.1 `, Grade2 produces an output restraint dictionary where the chiral restraint volume is set to ``both`` for the ambiguous centers rather than being arbitrarily assigned. A warning message is now written when there are any ambiguous chiral centers.

.. _PDB_ligand_input:

2. PDB chemical component definition
------------------------------------

To generate a restraint dictionary for an existing PDB ligand it is best to use the :ref:`--PDB_ligand ` option. For instance, to generate a restraint dictionary for caffeine ``CFF`` run:

::

  $ grade2 --PDB_ligand CFF

or using the equivalent short option ``-P``

::

  $ grade2 -P CFF

This will produce output using the `wwPDB Chemical Component Dictionary`_ (CCD) compound record for caffeine ``CFF`` (see https://www.rcsb.org/ligand/CFF for an overview).
Grade2 will download the wwPDB CCD CIF file for the compound from either PDBeChem: https://www.ebi.ac.uk/pdbe-srv/pdbechem/ or from Ligand Expo: http://ligand-expo.rcsb.org/. The output restraint dictionary will be called ``CFF.restraints.cif`` and other files will be named ``CFF.*``, see the :ref:`outputs` chapter. If the ``--PDB_ligand`` option is used then the atom names will agree with the wwPDB CCD definition for the compound. This has the advantage that if you deposit the final structure to the PDB the compound's atoms will not be renamed.

.. _`wwPDB Chemical Component Dictionary`: https://www.wwpdb.org/data/ccd

.. _file_input:

3. Input molecule file
----------------------

The third input option is to use a file to specify the input molecule. The command-line option :ref:`--in ` should be used to specify the input filename. For instance, to generate a restraint dictionary for the SDF file ``ligand_35.sdf`` with the 3-letter code ``L35`` run:

::

  $ grade2 --in ligand_35.sdf --resname L35

or using the equivalent short options :ref:`-i ` and :ref:`-r `

::

  $ grade2 -i ligand_35.sdf -r L35

The output restraint dictionary will be ``L35.restraints.cif`` and other files will be named ``L35.*``, see the :ref:`outputs` chapter.

Normally, the format of the input file is detected from the filename extension (for example ``.sdf``). If necessary the command-line option :ref:`--itype ` can be used to specify the input format. Currently, Grade2 supports the following input formats:

.. list-table:: Grade2 Input Molecular File Formats
   :widths: 15 15 70
   :header-rows: 1

   * - File format
     - Normal extension
     - Notes
   * - `mol/sdf`_
     - ``.mol`` or ``.sdf``
     - The MDL Molfile and SDF file formats provide a good exchange-format for molecules between applications and databases. As the format lacks atom names these will be generated by Grade2. Please note that, if an SDF file contains multiple molecules only the first molecule will be processed by Grade2.
       Please note that chiral restraints for 3D MOL/SDF are based on the input coordinates rather than any of the many chiral flags that might be used in the file (please see https://depth-first.com/articles/2021/12/29/stereochemistry-and-the-v2000-molfile-format/ for background). If you use a 2D MOL/SDF input with chiral centres then please check the stereo configuration of the result matches your requirements.
   * - `Tripos MOL2`_
     - ``.mol2``
     - The MOL2 format has the advantage of representing bond orders, atom IDs (names) and Cartesian coordinates. On the other hand, MOL2 format has ambiguity in the format definition and is `not supported by RDKit`_. Grade2 uses the CSD Python API to read (and write) MOL2 files and so can handle MOL2 files produced by CSD programs. The CSD-convention for MOL2 files is to use the partial charge field to store the formal charge of an atom. Other programs, such as `Open Babel `_, use the MOL2 partial charge field to store partial charges and atomic formal charge information is lost. For MOL2 files with partial charges, Grade2 now attempts to reconstruct the atomic formal charges from valency considerations. If the reconstruction process fails, it is possible to manually edit in the correct formal charges; please see the :ref:`FAQ Editing MOL2 file of a charged molecule with atomic partial charges `.
   * - `SMILES`_
     - ``.smi``
     - Please note that, if the SMILES file contains multiple molecules only the first molecule will be processed. If the SMILES file has a name field then this will be used for the name of the ligand, unless the command-line option :ref:`--name ` is specified. It is often easier to directly specify a :ref:`SMILES input string on the command-line ` rather than a SMILES file.
   * - restraint dictionary CIF
     - ``.cif``
     - `CIF`_ stands for Crystallographic Information File. It should be noted that CIF-format can be used for many types of data (for instance macromolecular coordinates or reflection data).
       Grade2 uses CIF-format for its principal output, the restraint dictionary file (see :ref:`Outputs chapter`) and this can also be used as an input file. Grade2 can read CIF-format restraint dictionaries written by Grade2 itself, `eLBOW `_, `AceDrg `_ and `Grade `_. As Grade CIF restraint dictionaries lack atom formal charge (`_chem_comp_atom.charge `_) records these are set to zero when the restraint dictionary is read and care must be taken as this may cause the output molecule to be incorrect. Please see the FAQ :ref:`How can I use Grade2 to generate a restraint dictionary with atom names consistent with an existing Grade dictionary? ` for more detail.
   * - wwPDB CCD CIF
     - ``.cif``
     - `CIF`_ files for existing PDB ligands defined in the `wwPDB Chemical Component Dictionary`_ can be obtained either from PDBeChem: https://www.ebi.ac.uk/pdbe-srv/pdbechem/ or from Ligand Expo: http://ligand-expo.rcsb.org/ . Note that it is normally easier to get Grade2 to retrieve the wwPDB CCD CIF information directly using the :ref:`--PDB_ligand ` option. Downloading the CCD CIF file and using the ``--in`` option is useful if there are firewall issues preventing script downloads.

.. _lookup_option:

4. The ``--lookup`` option
--------------------------

The ``--lookup`` option provides a mechanism whereby an external script is invoked to look up details of a ligand from a database. To use your own script, set the environment variable ``BDG_GRADE2_LIGAND_LOOKUP`` to the location of the script. Please see https://gitlab.com/gphl/grade2_lookup_scripts for example scripts written in different languages and a description of what your script needs to do.

By default, if ``BDG_GRADE2_LIGAND_LOOKUP`` is not set, ``grade2 --lookup CID`` uses a script that downloads ligand details from PubChem https://pubchem.ncbi.nlm.nih.gov/ using ``CID``, the PubChem compound identifier.
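As a rough illustration of how the environment variable and the ``--lookup`` option fit together, the following Python sketch prepares a ``grade2 --lookup`` call with ``BDG_GRADE2_LIGAND_LOOKUP`` set for that one call only. The helper name ``lookup_command`` and the script path ``/path/to/my_lookup_script`` are hypothetical placeholders and not part of Grade2. What the lookup script itself must do is described in the linked repository and is not reproduced here.

```python
import os


def lookup_command(ligand_id, lookup_script=None, base_env=None):
    """Return (argv, env) for a grade2 --lookup run (hypothetical helper)."""
    env = dict(os.environ if base_env is None else base_env)
    if lookup_script is not None:
        # Point Grade2 at a site-specific lookup script.
        env["BDG_GRADE2_LIGAND_LOOKUP"] = lookup_script
    else:
        # With the variable unset, the default PubChem lookup script is used.
        env.pop("BDG_GRADE2_LIGAND_LOOKUP", None)
    return ["grade2", "--lookup", str(ligand_id)], env


if __name__ == "__main__":
    # "/path/to/my_lookup_script" is a placeholder, not a real location.
    argv, env = lookup_command("123", lookup_script="/path/to/my_lookup_script")
    print(argv, env["BDG_GRADE2_LIGAND_LOOKUP"])
    # To actually run it (requires a working Grade2 installation):
    # import subprocess; subprocess.run(argv, env=env, check=True)
```

Passing the environment per call, rather than exporting the variable in the shell, keeps the choice of lookup script local to one run.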
For example, running

::

  $ grade2 --lookup 123

will download details of the drug Tiformin from PubChem using its CID ``123`` (see https://pubchem.ncbi.nlm.nih.gov/compound/123 for the Tiformin PubChem entry). Running the command with the ``--just_cif`` option produces output like the following:

::

  $ grade2 --lookup 123 --just_cif
  ############################################################################
  ## [grade2] ligand restraint dictionary generation
  ############################################################################
  Copyright (C) 2019-2022 by Global Phasing Limited
  All rights reserved.
  This software is proprietary to and embodies the confidential technology of Global Phasing Limited (GPhL). Possession, use, duplication or dissemination of the software is authorised only pursuant to a valid written licence from GPhL.
  Version: 1.3.0 <2022-10-??>
  Authors: Smart OS, Sharff A, Holstein J, Womack TO, Flensburg C, Keller P, Paciorek W, Vonrhein C and Bricogne G
  -----------------------------------------------------------------------------
  Lookup option --lookup "123"
  ---- Database: "PubChem"
  ---- Information: https://pubchem.ncbi.nlm.nih.gov/compound/123
  ---- Molecule name: "Tiformin"
  Systematic name set to "4-(diaminomethylideneamino)butanamide"
  RDKit generated molecule and coordinates from input SMILES: C(CC(=O)N)CN=C(N)N
  CHECK: Check the molecule's InChiKey against known PDB components:
  CHECK: The input molecule does not match any existing PDB chemical component (up to 2022-08-26).
  For help on checks against known PDB components, , see: ....
  ---- https://gphl.gitlab.io/grade2_docs/faqs.html#checkpdbmatch
  Minimization with MMFF94s reduces energy from -118.68 to -162.18 kcal/mol
  Using CCDC Mogul-like geometry analysis.
  Mogul version 2021.2.0, CSD version 542, csd-python-api 3.0.8
  Mogul Data Libraries: as542be_ASER, Feb21_ASER, May21_ASER, Sep21_ASER
  Geometry Optimize coordinates against restraints using gelly ....
  ---- gelly: Took 249 steps, reducing the rms gradient to 0.05
  ---- gelly: and the rms bond deviation to 0.002 Angstroms.
  Have written CIF-format restraint dictionary to: CID_123.restraints.cif
  Normal termination (4 secs)

Note that the SMILES string ``C(CC(=O)N)CN=C(N)N`` downloaded from PubChem is used as a starting point for the molecule. A CIF-format restraint dictionary is output to the file ``CID_123.restraints.cif``, and this will include information about the molecule's name, its systematic (IUPAC) name and the PubChem information page.

.. _`mol/sdf`: https://en.wikipedia.org/wiki/Chemical_table_file
.. _`Tripos mol2`: http://chemyang.ccnu.edu.cn/ccb/server/AIMMS/mol2.pdf
.. _`not supported by RDKit`: https://github.com/rdkit/rdkit/discussions/3647
.. _`SMILES`: https://en.wikipedia.org/wiki/Simplified_molecular-input_line-entry_system
.. _`CIF`: https://en.wikipedia.org/wiki/Crystallographic_Information_File

.. _cl-arguments:

Command-line arguments for ``grade2``
=====================================

Please note that most grade2 command-line arguments have a long version, for instance ``--just_cif``, and a short version, ``-j`` (see :ref:`--just_cif `). The long version can be abbreviated when this creates no ambiguity.

Help & setup command-line arguments
-----------------------------------

``-h, --help``
^^^^^^^^^^^^^^

The ``--help`` option will write out a help message listing all the command-line arguments. Please note that help on each option is deliberately brief and more detail can be found in this chapter.

``-checkdeps, --checkdeps``
^^^^^^^^^^^^^^^^^^^^^^^^^^^

``-checkdeps`` is a special option that checks that the external tool (CSD) that grade2 needs is accessible and works properly. It is useful for setting up grade2 and for a quick test that the program works on a particular host. Please see the :doc:`Installation ` section of this document for more details.

..
.. _versions:

``-V, --versions``
^^^^^^^^^^^^^^^^^^

``--versions`` writes out version numbers of the program and Python/Data libraries used. Please use this option when reporting bugs.

Molecule input arguments
------------------------

You must specify exactly one molecular input argument, so if you provide a SMILES string you cannot also provide an input CIF file.

``'SMILES'``
^^^^^^^^^^^^

SMILES string input. The SMILES string should be given in single quotes to avoid shell mangling, for instance:

::

  grade2 'C(=O)O'

Please see the :ref:`section above ` for more details.

.. _PDB_ligand:

``-P PDB_ID, --PDB_ligand PDB_ID``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Downloads information for the given PDB chemical component id (also known as the residue name or 3-letter code) from PDBe or RCSB PDB. Please see the :ref:`section above ` for more details.

.. _in:

``-i IN_FILE, --in IN_FILE``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Use the filename ``IN_FILE`` for the input molecule. Please see the :ref:`section above ` for more details, including supported file formats.

.. _lookup:

``-L ID, --lookup ID``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Use an external script to look up the molecule with ``ID`` in an external database. Please see the :ref:`section above ` and https://gitlab.com/gphl/grade2_lookup_scripts for more details.

Optional command-line arguments
-------------------------------

.. _resname:

``-r PDB_ID, --resname PDB_ID``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The ``--resname`` option sets the output PDB chemical component id (aka residue name or 3-letter code) to the string specified by ``PDB_ID``. Note that using ``--resname`` will normally alter the output filenames. The default PDB_ID code is ``LIG`` unless the code is available from the input (for instance, if the :ref:`PDB_ligand` option has been used). Please see the FAQ `What are the Grade2/BUSTER restrictions on residue name? `_ for more information.

..
.. _out:

``-o OUT_ROOT, --out OUT_ROOT``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Output files produced will have filenames starting with this string. The actual filenames will be formed of the specified ``OUT_ROOT`` with an appropriate extension (see the :ref:`outputs` chapter for more details), for instance the restraint dictionary CIF file will be called ``OUT_ROOT.restraints.cif``. If ``--out`` is not specified, by default output filenames will start with ``LIG.``, where ``LIG`` is the PDB_ID that can be set by the :ref:`--resname ` or :ref:`--PDB_ligand ` options.

.. _ocif:

``-ocif OUT_CIF, --ocif OUT_CIF``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The ``--ocif OUT_CIF`` option sets the full filename for the CIF restraint dictionary to the user-specified string ``OUT_CIF``. This option can be used to exactly control the filename for the restraint dictionary including its file type. For instance, using ``--ocif ../ligand_ABC.dic`` will result in the restraint dictionary being written to a file ``ligand_ABC.dic`` in the directory above the current working directory. Please note that the ``--ocif`` option overrides the ``-o/--out`` option. Furthermore, the ``--ocif`` option has no effect on the filename for other output files (if any). Consequently, it is recommended that it is used with the :ref:`--just_cif ` option.

.. _force_overwrite:

``-f, --force_overwrite``
^^^^^^^^^^^^^^^^^^^^^^^^^

By default ``grade2`` will not overwrite existing files, instead exiting with an error message. Use the ``--force_overwrite`` option (or the ``-f`` short option) to force overwriting existing files.

.. _just_cif:

``-j, --just_cif``
^^^^^^^^^^^^^^^^^^

By default ``grade2`` writes a number of output files (see the :ref:`outputs` chapter). The ``--just_cif`` option will cause ``grade2`` to write only the CIF-format restraint dictionary. It turns off the production of all other (PDB, SDF, MOL2 & SVG) files.

..
.. _shelx_option:

``-s, --shelx``
^^^^^^^^^^^^^^^

Produce `SHELX`_ restraint ``.dfix`` format output files. If ``--shelx`` is specified two additional output files will be created with the extensions ``.dfix`` and ``.with_hydrogen.dfix``. The former file has restraints excluding those to hydrogen atoms.

.. _SHELX: https://shelx.uni-goettingen.de/

.. _no_charging:

``-N, --no_charging``
^^^^^^^^^^^^^^^^^^^^^

Use the ``--no_charging`` option to turn off the standard charging scheme that modifies groups likely to be charged at pH 7. For instance, the standard charging scheme alters a neutral carboxylic acid to a carboxylate ion and also a neutral phosphoric acid to a phosphate ion; for more detail see the :ref:`charging` chapter. It should be noted that the ``--no_charging`` option leaves the input molecule unchanged. So if the input molecule has a charged group then this will NOT be altered by the ``--no_charging`` option. If you want to model a ligand with a protonation state that is distinct from the standard charging scheme then use manual editing with Mercury as demonstrated by the :ref:`FAQ How can I produce restraints for a ligand with a different protonation state or tautomer? `.

.. _ecloud:

``-e, --ecloud``
^^^^^^^^^^^^^^^^

The ``--ecloud`` option now specifies that the ideal xyz coordinates will use the electron-cloud distances for bonds to hydrogen atoms rather than nuclear distances. It should be noted that in the first public release :ref:`1.0.0 ` of Grade2 the ``--ecloud`` option specified that bond restraints to hydrogen atoms were set to electron-cloud distances that are adequate for X-ray refinement. From release :ref:`1.1.0 `, Grade2 produces CIF restraint dictionaries containing both electron-cloud and nucleus X-H bond restraints, avoiding the requirement of separate restraint dictionaries for the two use cases. The ``--ecloud`` option is retained with the narrower effect on just the ideal xyz coordinates.

..
.. _chirality_both:

``-c, --chirality_both``
^^^^^^^^^^^^^^^^^^^^^^^^

Use the ``--chirality_both`` option if you are not certain of the chiral configuration of the input molecule. The ``--chirality_both`` option sets the volume of all chiral restraints identified to ``"both"`` to allow for cases of ambiguous stereochemistry. Note that the ``--chirality_both`` flag is not needed if starting from a non-stereo SMILES as restraints will then automatically be set to "both".

.. _chiral_non_carbon:

``--chiral_non_carbon``
^^^^^^^^^^^^^^^^^^^^^^^

Grade2 by default only places chiral configuration restraints on chiral tetrahedral carbon atoms. Use the ``--chiral_non_carbon`` option to also place chiral restraints for atoms that are nitrogen, phosphorus and sulfur. It is noteworthy that chiral centres at nitrogen atoms can often rapidly interconvert, for example ammonium ions are not regarded as chiral unless they are quaternary (see `Athabasca University Chemistry 350 Organic Chemistry I: 5.10: Chirality at Nitrogen, Phosphorus, and Sulfur `_). Using ``--chiral_non_carbon`` can introduce undesirable chiral restraints for ammonium ions and phosphates (such as ATP). We advise you to use this option cautiously for cases where you are sure of the chiral configuration.

``-b, --big_planes``
^^^^^^^^^^^^^^^^^^^^

Produce large fused planes that overemphasize ring planarity. For a full description please see the :ref:`Treatment of Planar Groups ` chapter.

.. _four_atom_planes:

``-4, --4_atom_planes``
^^^^^^^^^^^^^^^^^^^^^^^

Instead of creating a single plane restraint for each flat 5/6-atom ring, produce 5 or 6 separate four-atom planes around that ring. In practice, using this option has little effect on refinement results. The ``--4_atom_planes`` option is included for testing as separate four-atom plane restraints are used both by Grade and by the first Grade2 release :ref:`1.0.0 `.

..
.. _eh99_sigma_correction_option:

``--eh99_sigma_correction``
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Scales up all non-hydrogen bond and angle sigmas to match the mean sigma values of the EH99 amino acid restraints used by BUSTER. Please see the :ref:`Comparing Grade2 and EH99 Restraints for amino acid side chains ` chapter for background and details of the option.

.. _name:

``-n``, ``--name NAME``
^^^^^^^^^^^^^^^^^^^^^^^

The full name of the ligand can be set using the ``--name`` option. Ideally, the full name should be human-readable, for example, "retinoic acid". The name will be shown in **buster-report** output. You should use quotation marks if the full name contains a space, for example:

::

  $ grade2 'Ic1ccccc1C(=O)[O-]' --name '2-iodobenzoic acid'

By default, the full name will be set to the `InChIKey`_ for the molecule, unless a name is already known, for instance for PDB ligands.

.. _systematic:

``--systematic IUPAC_NAME [PROGRAM] [PROGRAM_VERSION]``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The ``--systematic`` option allows the systematic (IUPAC) name of the molecule to be specified. The systematic name provided will be included in the :ref:`output CIF restraint dictionary` using the `_pdbx_chem_comp_identifier `_ data category. It is optional to specify the name and version of the program used to find the systematic name. For example, specifying ``--systematic "2-acetyloxy-4-iodobenzoic acid"`` specifies just the systematic name, without recording the program details. Note the use of the double quotation marks as the systematic name has a space. To record the program used and its version, simply add them after the systematic name.
For example, ``--systematic "2-acetyloxy-4-iodobenzoic acid" ACD/Name v2021`` will result in the following CIF records in the output restraint dictionary:

::

  _pdbx_chem_comp_identifier.comp_id LIG
  _pdbx_chem_comp_identifier.type "SYSTEMATIC NAME"
  _pdbx_chem_comp_identifier.program ACD/Name
  _pdbx_chem_comp_identifier.program_version v2021
  _pdbx_chem_comp_identifier.identifier "2-acetyloxy-4-iodobenzoic acid"

.. _pubchem_names:

``--pubchem_names``
^^^^^^^^^^^^^^^^^^^

The ``--pubchem_names`` option performs an online search for the ligand in the PubChem database https://pubchem.ncbi.nlm.nih.gov/. If the option is activated and the molecule is found then the PubChem title is used for the full name of the ligand and the systematic name is set to the PubChem IUPAC name. The `PubChemPy `_ package is used to make most of the lookups.

The online search involves uploading the SMILES string of the molecule to PubChem. For this reason, **the** ``--pubchem_names`` **option should not be used for confidential ligands.** To be extra careful, by default the ``--pubchem_names`` option is deactivated until the environment variable ``BDG_GRADE2_PUBCHEM_NAMES_ON_ACCEPT_SMILES_TO_WEB`` is set. If the option is specified without activation then Grade2 will terminate with an error message.

To activate the ``--pubchem_names`` option, if you are a bash, ksh or dash shell user:

::

  $ export BDG_GRADE2_PUBCHEM_NAMES_ON_ACCEPT_SMILES_TO_WEB="yes"

But if you are a csh or tcsh shell user:

::

  $ setenv BDG_GRADE2_PUBCHEM_NAMES_ON_ACCEPT_SMILES_TO_WEB "yes"

If you are happy for ``--pubchem_names`` to be permanently enabled for all users of grade2 at your site then please see the :ref:`Advanced Configuration ` section.

.. _group:

``--group GROUP``
^^^^^^^^^^^^^^^^^

Set the CCP4-extension CIF item ``_chem_comp.group`` to GROUP. This item is used by CCP4 programs, like Coot, when producing restraints to link monomers together.
Grade2 automatically sets the ``_chem_comp.group`` to ``peptide`` for amino acids both for PDB chemical components and while **Setting atom IDs for amino acids**. The item is also automatically set for PDB chemical components that are saccharides (to ``pyranose`` or ``furanose``). The ``--group`` option can be used to manually set ``_chem_comp.group`` to any value. If the option is used it overrides any automatically set value. Please note that, for monomers to be connected properly, it will also be necessary to set appropriate atom IDs.

.. _InChIKey: https://en.wikipedia.org/wiki/International_Chemical_Identifier#InChIKey

.. _database_id:

``-d``, ``--database_id ID [DB_NAME] [URL] [DETAILS]``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Set a corporate or database ``ID`` for the molecule and optionally other details for the molecule. The ``ID`` should be the database identifier for the molecule, for example: ``2083`` (for PubChem) or ``DB01001`` (for DrugBank). One or more additional optional arguments ``DB_NAME``, ``URL`` and ``DETAILS`` can also be given (separated by spaces). ``DB_NAME`` should be the name of the database (for example, ``PubChem`` or ``DrugBank``). The ``URL`` should be a URL of a page giving details of the ligand at the database (for instance, ``https://pubchem.ncbi.nlm.nih.gov/compound/2083``). ``DETAILS`` can be used for any other information (for example, ``"Corporate Compound Database - internal access only"``). The ``ID`` will be shown in **buster-report** output. Future reporting tools will display all the information.

As an example, when producing a restraint dictionary for the PDB component ``VIA`` information about the DrugBank entry for Sildenafil from https://go.drugbank.com/drugs/DB00203 can be added:

::

  $ grade2 --PDB_ligand VIA --database DB00203 DrugBank https://go.drugbank.com/drugs/DB00203

Note how grade2 options can be abbreviated when there is no ambiguity with other options.
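The abbreviation rule can be illustrated with Python's standard ``argparse`` module, which accepts any unambiguous prefix of a long option. Whether Grade2 parses its options this way is an assumption made for this sketch. The parser below is a toy containing only two of the options from this chapter (``--database_id`` and ``--debug``), not Grade2's real parser, and ``make_toy_parser`` is a hypothetical name.

```python
import argparse


def make_toy_parser():
    """Toy parser with two options from this chapter (not Grade2's real parser)."""
    parser = argparse.ArgumentParser(prog="toy_grade2")
    # ID plus optional DB_NAME, URL and DETAILS; the option may be repeated.
    parser.add_argument("-d", "--database_id", nargs="+", action="append",
                        metavar="FIELD")
    parser.add_argument("--debug", action="store_true")
    return parser


if __name__ == "__main__":
    parser = make_toy_parser()
    # "--database" is an unambiguous prefix of "--database_id".
    args = parser.parse_args(
        ["--database", "DB00203", "DrugBank",
         "https://go.drugbank.com/drugs/DB00203",
         "--database", ".", "Wikipedia",
         "https://en.wikipedia.org/wiki/Sildenafil"])
    for entry in args.database_id:
        print(entry)
    # "--d" would be rejected: it matches both --database_id and --debug.
```

In this toy parser each use of the option contributes one list of fields, so two database entries are kept separate.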
The database information given with ``--database_id`` will be included in the output restraint dictionary CIF in the ``gphl_chem_comp_database`` CIF data category:

::

  loop_
  _gphl_chem_comp_database.comp_id
  _gphl_chem_comp_database.id
  _gphl_chem_comp_database.database
  _gphl_chem_comp_database.url
  _gphl_chem_comp_database.details
  VIA VIA PDB https://www.rcsb.org/ligand/VIA "RCSB PDB"
  VIA VIA PDB https://www.ebi.ac.uk/pdbe-srv/pdbechem/chemicalCompound/show/VIA PDBe
  VIA DB00203 DrugBank https://go.drugbank.com/drugs/DB00203 .

For more information please see the section :ref:`Database Information in output CIF Restraint Dictionary`.

Please note that if you want to add information about more database entries then further ``--database_id`` options can be specified. For instance, to add information about the Wikipedia page:

::

  $ grade2 --PDB_ligand VIA --database DB00203 DrugBank https://go.drugbank.com/drugs/DB00203 \
      --database . Wikipedia https://en.wikipedia.org/wiki/Sildenafil

.. _no_extra:

``-X``, ``--no_extra``
^^^^^^^^^^^^^^^^^^^^^^

By default the output restraint dictionary CIF file will have many extra Grade2-specific items, for instance giving the source of restraint values. Use the ``--no_extra`` option to turn off the extra Grade2-specific items.

.. _itype:

``--itype {cif,sdf,mol,mol2,smi}``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Format for the :ref:`--in ` input file, selected from the allowed list. By default, the format is detected from the filename extension and file contents (please see the :ref:`section above ` for more details).

.. _rcsb:

``--rcsb``
^^^^^^^^^^

For the :ref:`--PDB_ligand ` option download first from the RCSB site https://files.rcsb.org/ligands/ rather than from PDBeChem.

.. _debug:

``--debug``
^^^^^^^^^^^

The ``--debug`` option turns on debug-level terminal output. The STDOUT output written by Grade2 will then include a large number of lines starting ``DEBUG:``. These are not intended to be intelligible to end users but instead are useful to the program developers.
You should only use the ``--debug`` option if reporting problems with Grade2.

Optional arguments for setting atom IDs (aka atom names)
--------------------------------------------------------

.. _antecedent:

``-a, --antecedent RELATED_RESTRAINTS_CIF``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This option is used to base the atom IDs and 2D coordinates on those from a related molecule. A filename ``RELATED_RESTRAINTS_CIF`` for a CIF restraint dictionary of a related molecule must be provided. It is best if the restraint dictionary is produced by Grade2 itself. The option is demonstrated in the :ref:`Atom Naming ` chapter.

.. _antecedent_disregard:

``-ad, --antecedent_disregard_element RELATED_RESTRAINTS_CIF``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This option is similar to :ref:`--antecedent ` except that atoms are not required to have the same element. Where possible atom IDs are altered so that the non-element part of matching atoms is maintained. So for example if atom ``CL24`` is matched to a fluorine atom it will be given the atom ID ``F24`` (provided there is not another atom with that label).

.. _rdkit_canonical_atom_ids:

``-R, --rdkit_canonical_atom_ids``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Sets the atom IDs from the RDKit canonical SMILES order. This means you will get the same atom IDs regardless of the input atom order. This option is explained and demonstrated in the :ref:`Atom Naming ` chapter.

.. _inchi_canonical_atom_ids:

``--inchi_canonical_atom_ids``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Set InChI-canonical atom IDs. These are "universal" but rather ugly. This option is explained and demonstrated in the :ref:`Atom Naming ` chapter.

.. _no_aa_labels:

``--no_aa_labels``
^^^^^^^^^^^^^^^^^^

This option turns off recognizing amino acids and setting atom IDs to ``N CA C O OXT CB``. Please see :ref:`Setting atom IDs for amino acids ` for more details.

..
.. _aa_loose:

``--aa_loose``
^^^^^^^^^^^^^^

Extends setting atom IDs to "exotic" amino acids, such as N-modified and beta amino acids. Please see :ref:`Setting atom IDs for "exotic" amino acids ` for more details.
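The input rules described in this chapter can be sketched as a small Python helper that assembles a ``grade2`` command line. This is only an illustrative sketch: the function name ``build_grade2_command`` and its parameters are hypothetical and not part of Grade2. Only the command-line options themselves (``--PDB_ligand``, ``--in``, ``--lookup``, ``--resname``, ``--just_cif`` and ``--force_overwrite``) are taken from this chapter. The helper enforces the rule that exactly one molecule input is given.

```python
import shlex


def build_grade2_command(smiles=None, pdb_ligand=None, in_file=None,
                         lookup_id=None, resname=None, just_cif=False,
                         force_overwrite=False):
    """Return an argv list for grade2 (hypothetical helper, not part of Grade2)."""
    inputs = {
        "smiles": smiles,
        "pdb_ligand": pdb_ligand,
        "in_file": in_file,
        "lookup_id": lookup_id,
    }
    given = [key for key, value in inputs.items() if value is not None]
    # The chapter states that exactly one molecule input must be specified.
    if len(given) != 1:
        raise ValueError("specify exactly one molecule input, got: %s" % given)

    argv = ["grade2"]
    if smiles is not None:
        argv.append(smiles)  # one argv element, so no shell quoting is needed
    elif pdb_ligand is not None:
        argv += ["--PDB_ligand", pdb_ligand]
    elif in_file is not None:
        argv += ["--in", in_file]
    else:
        argv += ["--lookup", str(lookup_id)]

    if resname is not None:
        argv += ["--resname", resname]
    if just_cif:
        argv.append("--just_cif")
    if force_overwrite:
        argv.append("--force_overwrite")
    return argv


if __name__ == "__main__":
    cmd = build_grade2_command(in_file="ligand_35.sdf", resname="L35",
                               just_cif=True)
    # shlex.join quotes arguments so the line can be pasted into a shell.
    print(shlex.join(cmd))
    # To actually run it (requires a working Grade2 installation):
    # import subprocess; subprocess.run(cmd, check=True)
```

Building the command as a list and passing it to ``subprocess.run`` without a shell avoids the quoting problems that the single quotes around a SMILES string guard against on the command line.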