Pipedream Tutorial 3


A common way to reduce the number of X-ray experiments in the first phase of a fragment-based screening strategy is to soak crystals with several compounds at once, rather than screening fragments individually.

This typically involves soaking crystals in cocktails of 2 to around 8 small (hence usually weakly binding) and, importantly, shape-diverse fragments, then analysing the difference electron density maps to determine which, if any, of the components have bound to the target.

Pipedream currently provides basic support for this scenario.

Before proceeding with the tutorial, we should highlight a number of points that need to be carefully considered when using Pipedream to look at structures containing small, weakly bound fragments (whether soaked as cocktails or individually). Because the compounds are small (and generally weakly bound), we are looking for similarly "small" difference density features. We therefore need to be mindful of the increased risk of false positives: difference density features that are mistaken for genuine compound peaks but in fact result from conformational change, disorder, incompleteness of the model (or indeed the data) or simply "noise".

In order to keep existing waters from the reference structure the following points MUST be followed:

This tutorial uses some trypsin structures to illustrate how to use Pipedream with cocktails of small fragments. A series of trypsin structures soaked with a series of compounds, both individually and in cocktails, is described by:

Yamane, J. et al. (2011), J. Appl. Cryst. 44, 798-804.

All of the resultant structures are available in the PDB.

For this tutorial we will use the following structures/data:

3rxt.mtz the structure factors as supplied by the PDB for 3rxt, converted to MTZ format
3rxe.mtz the structure factors as supplied by the PDB for 3rxe, converted to MTZ format
3rxe.pdb the model coordinates as supplied by the PDB for 3rxe

3rxt was soaked with a cocktail of two compounds (ligand codes SZ4 and 2AP).

(A) Restraint dictionary generation

grade 'COc1cccc(CN)c1' -resname L01
grade 'Nc1cccc[nH+]1' -resname L02

(B) Pipedream run

Before we can run Pipedream, we need to "prepare" the reference structure.

3rxe is not an apo structure: it has benzamidine bound in the binding site. So, before using 3rxe as the reference, we need to remove the benzamidine from the model. Also, as this is a high resolution structure with a stable water structure, we will keep the water model rather than have Pipedream discard and rebuild it. However, we want to ensure that no solvent is encroaching on the ligand binding site.
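The preparation described above can also be scripted rather than done by hand. The following is a minimal sketch, not part of Pipedream: it assumes that benzamidine carries the residue name BEN in 3rxe.pdb, that waters are named HOH, and that a 5 Angstrom cutoff around the ligand atoms is a sensible definition of "encroaching" (all three are assumptions to check against your own file). The function names are hypothetical.

```python
import math

COORD_RECORDS = ("ATOM  ", "HETATM")

def xyz(line):
    """Orthogonal coordinates from the fixed PDB columns 31-54."""
    return (float(line[30:38]), float(line[38:46]), float(line[46:54]))

def residue_key(line):
    """Residue name, chain ID and residue number plus insertion code."""
    return (line[17:20], line[21], line[22:27])

def prepare_reference(lines, ligand_resname="BEN", water_cutoff=5.0):
    """Remove the bound ligand and any water within water_cutoff of it.

    ligand_resname and water_cutoff are assumptions, not Pipedream defaults.
    """
    atoms = [l for l in lines if l.startswith(COORD_RECORDS)]
    ligand = [l for l in atoms if l[17:20].strip() == ligand_resname]
    ligand_xyz = [xyz(l) for l in ligand]
    drop = {residue_key(l) for l in ligand}
    for l in atoms:
        if l[17:20] == "HOH" and any(
            math.dist(xyz(l), p) < water_cutoff for p in ligand_xyz
        ):
            drop.add(residue_key(l))
    # ANISOU records share the residue columns, so they are filtered too.
    return [
        l for l in lines
        if not (l.startswith(COORD_RECORDS + ("ANISOU",)) and residue_key(l) in drop)
    ]

def prepare_file(src, dst):
    """Read src, strip ligand and nearby waters, write dst."""
    with open(src) as handle:
        lines = handle.read().splitlines()
    with open(dst, "w") as handle:
        handle.write("\n".join(prepare_reference(lines)) + "\n")
```

Calling prepare_file("3rxe.pdb", "input.pdb") would produce a file with the name used for -xyzin in this tutorial. It is worth inspecting the result in a graphics program before use, since a simple distance cutoff cannot judge which waters are genuinely part of the stable water structure.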

We are now ready to run Pipedream.

Run Pipedream with the following command:

pipedream -hklin 3rxt.mtz -xyzin input.pdb -hklref 3rxe.mtz \
-keepwater -nowateradd -rhofit grade-L01.cif,grade-L02.cif \
-postref -d pipe &


You should also consider the following points when running Pipedream with cocktails:

  1. Currently, it is not possible to direct Rhofit to a specific binding site. Run with default options, Rhofit picks what it considers to be the best difference density feature in which to fit the ligand(s). As mentioned above, with small, weakly binding fragments there is an increased risk of Rhofit choosing a feature away from the expected binding site, giving a false positive (although it could equally be indicating a genuine alternative binding site). The solution is to use the -allclusters option in Pipedream, which tells Rhofit to fit the ligand to all possible clusters. This increases the run time of the job, but has the advantage that all potential binding events are analysed. The results then need to be examined closely to decide which, if any, of the solutions are genuine. This is a determination that only you can make.
  2. Only the "best" solution from Rhofit is post-refined. If Pipedream is run without -allclusters and Rhofit has selected the wrong cluster, CPU time may be wasted post-refining an incorrect or unwanted solution. Specifying -postref in addition to -allclusters further increases the CPU time required, but has the definite advantage that the electron density maps after refinement, both 2fofc and fofc, should discriminate much better between a potential binding solution and a false positive. Using -postquick instead of -postref (to perform a faster, more curtailed refinement) would be a reasonable way to save time in this instance.

(C) Analysis

Now, we will look at the output and the results.

Look at the Rhofit and post-refinement results in the summary.out file.

firefox report-grade-(L01,L02)/index.html

These would suggest that L01 is correct.

Now, look at the Rhofit solutions graphically:

visualise-rhofit-coot rhofit-grade-(L01,L02)

and look at the difference density that Rhofit was trying to fit into and the solutions it found.


The above picture shows the top solution from L01 on the left and L02 on the right.

The density strongly suggests that L01 has bound.

Now look at the output maps from post-refinement:

visualise-geometry-coot report-grade-(L01,L02)


The above picture shows L01 on the left and L02 on the right.

Now the positive difference density around L02 gives the game away, clearly suggesting that L01 is bound.

And indeed L01 is the ligand shown bound in the deposited structure 3rxt.

Follow on

The above results may suggest, on the face of it, that L01 (ligand code SZ4) and not L02 (ligand code 2AP) is bound.

However, when designing cocktails for fragment-based X-ray screening, one of the key points is to ensure shape diversity between the components of each cocktail, so that it is easy to identify which component may have bound.

In the above example, this is most certainly not the case. In fact L02 is a perfect substructure of L01, making exclusion of L02 as a potential binder impossible from this experiment. A deconvolution experiment - soaking both components individually - would be required.

Indeed, this is exactly what Yamane et al. did, confirming that L02 does indeed bind. They report an IC50 of 188 µM for L01, but the binding of L02 was so weak that standard enzyme kinetic studies were not sensitive enough to measure it. X-ray crystallography, however, was able to detect and show binding. 3rxc is trypsin soaked with L02 and 3rxd is trypsin soaked with L01.


The above picture shows the bound ligand in 3rxc (left) and 3rxd (right) with the 2fofc maps contoured at the same level. Clearly L02 is bound in 3rxc, although the density appears somewhat weaker than the density for L01 in 3rxd.

If the components of the cocktail were truly shape diverse, would it be possible to "deconvolute" their binding modes (without doing a deconvolution experiment) and also potentially tell anything about their relative binding affinities?

Quite possibly - by refining the structure with both components present, making use of BUSTER's occupancy refinement capabilities, and showing that the difference density features are diminished relative to refinement of the structures with the individual components.

In this case:

  1. Take a copy of the final post-refined structure with L01 (postrefine-grade-L01/refine.pdb) and with an editor, add in the coordinates of ligand L02 taken from postrefine-grade-L02/refine.pdb.
  2. Change the residue number of L02 to 4001
  3. Assign alternate conformation code A to L01 and code B to L02
  4. Reset the occupancy of all atoms in both L01 and L02 to 0.5
  5. Save the modified pdb file (say input.pdb)
  6. Generate a gelly input file to tell BUSTER to refine the relative occupancies of L01 and L02:
pdb2occ -p input.pdb -o occupancy.gelly
  7. Now run BUSTER:
refine -p input.pdb -m 3rxt.mtz -Gelly occupancy.gelly -l grade-L01.cif -l grade-L02.cif -d refine > refine.out &
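Steps 1 to 5 of the list above (the manual editing of the PDB file) could also be scripted. The following is a minimal sketch under stated assumptions: the ligands carry the residue names L01 and L02 in the two post-refined files (as set with -resname when running grade), and the files follow the standard fixed-column PDB layout. The function names are hypothetical and atom serial numbers are left untouched, just as they would be after editing by hand.

```python
COORD_RECORDS = ("ATOM  ", "HETATM", "ANISOU")

def retag(line, altloc, occupancy=0.50, resseq=None):
    """Set alternate location code, occupancy and optionally residue number."""
    line = line.rstrip("\n").ljust(66)
    line = line[:16] + altloc + line[17:]              # column 17: altLoc
    if resseq is not None:
        line = line[:22] + "%4d" % resseq + line[26:]  # columns 23-26: resSeq
    if not line.startswith("ANISOU"):                  # ANISOU has no occupancy
        line = line[:54] + "%6.2f" % occupancy + line[60:]
    return line

def combine_ligands(l01_lines, l02_lines, first="L01", second="L02",
                    second_resseq=4001):
    """Return the L01 model plus the L02 atoms as alternates A and B."""
    out = []
    for line in l01_lines:
        if line.strip() == "END":
            continue  # a single END is written after the added ligand
        if line.startswith(COORD_RECORDS) and line[17:20] == first:
            line = retag(line, "A")
        out.append(line.rstrip("\n"))
    for line in l02_lines:
        if line.startswith(COORD_RECORDS) and line[17:20] == second:
            out.append(retag(line, "B", resseq=second_resseq))
    out.append("END")
    return out

def combine_files(l01_pdb, l02_pdb, dst):
    """File-based wrapper around combine_ligands."""
    with open(l01_pdb) as a, open(l02_pdb) as b:
        merged = combine_ligands(a.readlines(), b.readlines())
    with open(dst, "w") as handle:
        handle.write("\n".join(merged) + "\n")
```

For this tutorial the call would be combine_files("postrefine-grade-L01/refine.pdb", "postrefine-grade-L02/refine.pdb", "input.pdb"). Check the resulting file before passing it to pdb2occ, for example to confirm that residue number 4001 is not already in use.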


Run buster-report:

buster-report -d refine -dreport report

Look at the output, at the "ligand report" tab

firefox report/index.html

The centre column lists the refined occupancy for both ligands - in this case ~0.79 for L01 and ~0.21 for L02.
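As a cross-check on the report, the refined occupancies can also be read directly from the output coordinates. The sketch below averages the occupancy column per ligand residue. It assumes a standard fixed-column PDB file and the residue names L01 and L02, and the file name refine/refine.pdb in the usage note is an assumption based on the -d refine option used above. The function name is hypothetical.

```python
from collections import defaultdict

def ligand_occupancies(lines, resnames=("L01", "L02")):
    """Mean occupancy per (resname, chain, residue number, altloc)."""
    totals = defaultdict(lambda: [0.0, 0])
    for line in lines:
        if line.startswith(("ATOM  ", "HETATM")) and line[17:20] in resnames:
            key = (line[17:20], line[21], line[22:26].strip(), line[16])
            totals[key][0] += float(line[54:60])  # columns 55-60: occupancy
            totals[key][1] += 1
    return {key: total / count for key, (total, count) in totals.items()}

def report(path, resnames=("L01", "L02")):
    """Print one line per ligand copy found in the file at path."""
    with open(path) as handle:
        result = ligand_occupancies(handle.readlines(), resnames)
    for (resname, chain, resseq, altloc), occ in sorted(result.items()):
        print(f"{resname} {chain}{resseq} altloc {altloc}: {occ:.2f}")
```

Running report("refine/refine.pdb") should list both ligands. Because the two occupancies are refined relative to each other, they are expected to sum to approximately 1.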