WORK-IN-PROGRESS
We will work in a separate directory and create two subdirectories in it (one for the deposited data and one as a work directory). You should be able to just cut-and-paste the commands given in the code blocks.
With a series of shell commands
mkdir Deposited
cd Deposited
# Raw diffraction data
# use "curl -O" instead of "wget" if the latter is not available
wget -q "https://data.proteindiffraction.org/other/8agq_zip_8AGQ.tar.bz2"
tar -xf 8agq_zip_8AGQ.tar.bz2
ln -s 8agq_zip_8AGQ/data Images
# MR search model
# use "curl -O" instead of "wget" if the latter is not available
wget -q "https://files.rcsb.org/download/5F07.pdb"
egrep "^LINK|^SSBOND|^CRYST1|^ATOM|^HETATM|^ANISOU|^TER" 5F07.pdb > start.pdb
# currently deposited model (for comparison):
fetch_PDB_gemmi 8AGQ | tee fetch_PDB_gemmi.log
ln -s r8agqsf.mtz deposited.mtz
ln -s 8agq.pdb deposited.pdb
cd ..
we should now have all relevant (deposited) data in the subdirectory Deposited.
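A quick sanity check at this point can save confusion later. The following is a minimal sketch (the helper name `report_missing` is our own invention, not part of any of the packages used here) that lists any expected file or link that is absent; a broken symbolic link also counts as missing.

```shell
# report_missing DIR FILE...: print each FILE that cannot be found in DIR.
# "-e" follows symbolic links, so a dangling link is reported as missing too.
report_missing() {
    dir=$1
    shift
    for f in "$@"; do
        [ -e "$dir/$f" ] || echo "missing: $dir/$f"
    done
}

# the four names created by the commands above
report_missing Deposited Images start.pdb deposited.pdb deposited.mtz
```

No output means that all four entries are in place.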
Running
mkdir Work
cd Work
# create some symbolic links for files to be used here
ln -s ../Deposited/Images .
ln -s ../Deposited/start.pdb .
ln -s ../Deposited/deposited.pdb .
ln -s ../Deposited/deposited.mtz .
# create pseudo APO model
egrep "^LINK|^SSBOND|^CRYST1|^ATOM|^HETATM|^ANISOU|^TER" deposited.pdb | egrep -v "^HETATM.*M5O|^ANISOU.*M5O" > apo.pdb
# getting some Grade2 restraint dictionaries
# use "curl -O" instead of "wget" if the latter is not available
wget https://www.globalphasing.com/buster/wiki/plugin/attachments/GPhLTutorials8AGQ/GSH.restraints.cif
wget https://www.globalphasing.com/buster/wiki/plugin/attachments/GPhLTutorials8AGQ/M5O.restraints.cif
will get us all required files.
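To see what the two-stage filter used for apo.pdb actually does, it can help to try it on a few lines first. The sketch below wraps the same two patterns in a small function (`make_apo` is a name made up for this illustration) and feeds it a toy input; the records are abbreviated and hypothetical, not taken from 8AGQ. It uses `grep -E`, which is equivalent to `egrep` and avoids the deprecation warning printed by recent GNU grep versions.

```shell
# make_apo LIGAND < model.pdb > apo.pdb
# Stage 1 keeps only the record types used in this tutorial; stage 2 drops
# the HETATM/ANISOU records that mention the given ligand name.
make_apo() {
    grep -E "^LINK|^SSBOND|^CRYST1|^ATOM|^HETATM|^ANISOU|^TER" |
        grep -E -v "^HETATM.*$1|^ANISOU.*$1"
}

# toy input (hypothetical, abbreviated records)
make_apo M5O <<'EOF'
REMARK   2 RESOLUTION.
ATOM      1  N   MET A   1
HETATM  900  C1  M5O A 301
HETATM  950  O   HOH A 401
EOF
```

Only the ATOM line and the HOH water survive: the REMARK is not a selected record type and the M5O record is removed in the second stage. Note that the ligand name is matched anywhere on the line, exactly as in the original command, so a name that also occurs in another column would remove that record as well.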
To check if our software is correctly set up and configured, we can run
buster_maponly -p 8agq.pdb -m 8agq.mtz -o maponly.mtz | tee maponly.log
to get a set of map coefficients in the maponly.mtz output file.
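If a command fails with "command not found", the setup scripts of the relevant package have probably not been sourced in the current shell. As a rough first check, assuming all tools used in this tutorial are meant to be on the PATH, the following sketch prints the names of those that cannot be found (`missing_tools` is a helper written for this example only).

```shell
# missing_tools CMD...: print the name of every command not found on the PATH
missing_tools() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}

# the command names that appear in this tutorial
missing_tools process refine aB_autorefine buster_maponly \
              check_indexing add_freerflag.sh fetch_PDB_gemmi grade2
```

An empty result only shows that the commands are found; it does not verify licences or the configuration of the programs themselves.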
Processing the diffraction images can be done in three different ways.
process -I Images -d process.01 | tee process.01.lis
This has the advantage of avoiding any kind of prior bias regarding the crystal form (cell/SG) of that dataset: sometimes the crystal form changes, e.g. due to co-crystallisation with a particular compound. However, if this dataset comes in a previously observed crystal form, the associated cell/SG is not enforced. This can become relevant when not all screw axes are observed (so processing arrives at P21212 just because the last 2-fold screw was not measured) or when a particular setting is required.
If working within a larger project with several datasets, the correct SG/cell might have to be manually set/adjusted and a consistent set of test-set flags needs to be established afterwards.
process -I Images cell="90 55 55 90 114 90" symm=C2 -d process.02 | tee process.02.lis
This will enforce a specific cell/SG already at the indexing stage, which can be problematic if e.g. the unit-cell has doubled and the discarding of half of the (maybe weak) spots is thus enforced.
The test-set flags automatically created at the end of processing will not be consistent with any previously determined model/dataset (see use of check_indexing and add_freerflag.sh below).
process -I Images -ref 8agq.mtz -d process.03 | tee process.03.lis
Here we not only enforce a specific cell/SG (extracted from the MTZ file header: see e.g. mtzana 8agq.mtz or gemmi mtz 8agq.mtz output), but will also ensure that the newly processed data is consistently indexed (for spacegroups that allow different equivalent indexing solutions). Finally, the same set of test-set flags will be used and (if necessary) extended to the full resolution limits of the newly processed data.
This mode of processing is recommended when working within the same project and crystal form on a large number of datasets.
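When scripting this for many datasets, the choice between the first and the third mode can be made depending on whether a reference MTZ file is available. The sketch below only prints the command line so that it can be inspected before anything is run; `process_cmd` is a hypothetical helper, and only the `-I`, `-ref` and `-d` options shown above are used.

```shell
# process_cmd IMAGES OUTDIR [REFMTZ]
# Print the "process" command line; -ref is added only if REFMTZ exists.
process_cmd() {
    images=$1
    outdir=$2
    ref=${3:-}
    if [ -n "$ref" ] && [ -f "$ref" ]; then
        echo "process -I $images -ref $ref -d $outdir"
    else
        echo "process -I $images -d $outdir"
    fi
}

process_cmd Images process.03 8agq.mtz
```

Falling back silently to processing without a reference is a design choice of this sketch; in a real project it might be safer to stop with an error when the expected reference file is missing.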
If no reference MTZ file was provided during data processing, the newly processed reflection data might require some transformations in relation to existing reference data. This can be done e.g. via
check_indexing -v 8agq.mtz process.01/staraniso_alldata-unique.mtz
check_indexing -v 8agq.mtz process.02/staraniso_alldata-unique.mtz
In the case here (C2 symmetry) nothing very interesting will happen: there are no alternative indexing possibilities, potentially missed screw axes or different settings.
Of course, there are other programs available to perform similar checks (see also our own aP_select_pdb for a variant): the important point is that any newly processed data should be consistent with any previously available data of the same crystal form to avoid confusion solely because of some trivial re-indexing or difference in settings.
Within a larger project, the test-set flags for a given crystal form should be the same across different datasets. This ensures that Rfree values computed even after a minimal amount of refinement are meaningful and not biased. We provide a simple tool (that internally runs the usual CCP4 programs), which should be run after any indexing/setting ambiguity has been resolved:
add_freerflag.sh -f 8agq.mtz -m process.01/staraniso_alldata-unique.mtz
add_freerflag.sh -f 8agq.mtz -m process.02/staraniso_alldata-unique.mtz
If the reference file 8agq.mtz was less complete (e.g. lower resolution) than the newly processed MTZ file, the test-set flags will be extended. This would happen afresh for every new dataset handled this way, which is why creating a highly optimistic reference MTZ file (with test-set flags to the highest resolution envisaged) would be a good idea.
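With more than a handful of datasets, typing these two steps by hand becomes error-prone. The following sketch generates the pair of commands for each processing directory, in the order described above (indexing check first, test-set flags second). It prints the commands instead of running them; `harmonise_cmds` is a name invented for this example, and the sketch assumes that no re-indexing turns out to be necessary. If check_indexing reports otherwise, the second command would have to be adjusted accordingly.

```shell
# harmonise_cmds REF DIR...: for each processing directory, print the
# check_indexing and add_freerflag.sh commands used above.
harmonise_cmds() {
    ref=$1
    shift
    for d in "$@"; do
        mtz=$d/staraniso_alldata-unique.mtz
        echo "check_indexing -v $ref $mtz"
        echo "add_freerflag.sh -f $ref -m $mtz"
    done
}

harmonise_cmds 8agq.mtz process.01 process.02
```

The printed lines can be reviewed and then pasted into the shell, or piped into `sh` once they look right.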
The aB_autorefine interface to BUSTER will run a series of individual BUSTER refinement jobs with some automatic decision making in between (when to use TLS, ADP, solvent model update, outlier rejection etc.). This is not a quick process, but provides a very consistent and reliable way of getting a large number of structures to a reasonable state for subsequent analysis and manual adjustment.
aB_autorefine -p search.pdb -m process.03/staraniso_alldata-unique.mtz -d refine.01 | tee refine.01.log
A single BUSTER refinement can be run using
refine -p search.pdb -m process.03/staraniso_alldata-unique.mtz -l GSH.restraints.cif -d refine.02 | tee refine.02.log
There are a lot of options available (see refine -h for details) for fine-tuning the behaviour, although we think that the defaults should be adequate for most situations. Please note that this job will internally run BUSTER several times (so-called "big cycles", with updates to the solvent mask and the X-ray weighting) and that before the last of those cycles a feature called "void correction" is activated (to account for cavities that should be excluded from the bulk solvent model).
This uses the so-called -L feature (Vonrhein, C. and Bricogne, G., 2005. Automated Structure Refinement for High-Throughput Ligand Detection with BUSTER-TNT. Acta Cryst. Sect A, 61, p.c248.) as described in more detail here. The "Polder maps" in Phenix follow a similar idea.
refine -L -p search.pdb -m process.03/staraniso_alldata-unique.mtz -l GSH.restraints.cif -d refine.03 | tee refine.03.log
The result should hopefully be a clearer difference-density map for the bound ligand.
We will need a description of the ligand in the form of an mmCIF restraints dictionary before we can try to fit the ligand into the (hopefully clear) difference density map that provides our evidence for a bound compound. There are several ways of generating this type of restraints file (and the different programs should create files that are compatible with each other), with Grade2 being the latest incarnation of our own approach. If you have the required CSD software from CCDC installed, you should be able to run
grade2 -P M5O
(for a known identifier in the chemical components dictionary), or
grade2 -r LIG 'Oc1cc(O)c2c(c1)O[C@@H](c1ccc(O)c(O)c1)[C@H](O)C2'
if using a SMILES string. Please also see the extensive Grade2 documentation and the Grade2 webserver.