[sharp-discuss] question on mir phasing

Thu May 27 20:52:18 CEST 2010

Hi Fengyun,

On Thu, May 27, 2010 at 01:25:51PM -0500, alpharyun at rice.edu wrote:
> 1) That's Rfactor calculated by SCALEIT for all the reflection data  
> between derivative and the native. I was told that if the value is lower 
> than 20% and higher than 5%, then the derivative data is useful for sir 
> phasing, is this right?

I would phrase it differently: if it is above 20% you probably have a
non-isomorphism problem. And if it is below 5% there probably isn't
any heavy atom in there.

However, the 5-20% range can be due to slight non-isomorphism or real
heavy-atom substitution ... or a mixture.
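
For reference, the number under discussion is conventionally defined as
sum|FPH - FP| / sum(FP) over the reflections common to both datasets.
Here is a minimal Python sketch of that definition and of the rule of thumb
above. The function names and the toy amplitudes are invented for
illustration, and the exact formula SCALEIT uses (scaling, choice of
denominator) may differ:

```python
def r_iso(f_native, f_deriv):
    """Conventional isomorphous R-factor: sum|FPH - FP| / sum(FP).

    Assumes both lists hold amplitudes for the same reflections,
    already on a common scale (hypothetical helper, not SCALEIT code).
    """
    if len(f_native) != len(f_deriv) or not f_native:
        raise ValueError("need two equal-length, non-empty amplitude lists")
    num = sum(abs(fph - fp) for fp, fph in zip(f_native, f_deriv))
    den = sum(f_native)
    return num / den


def interpret(r):
    """Rule of thumb from the text; 5% and 20% are only rough guides."""
    if r > 0.20:
        return "probably non-isomorphous"
    if r < 0.05:
        return "probably no heavy atom bound"
    return "ambiguous: substitution, slight non-isomorphism, or both"


# Toy amplitudes, invented for illustration only.
fp = [100.0, 200.0, 150.0, 50.0]
fph = [110.0, 185.0, 160.0, 55.0]
r = r_iso(fp, fph)
print(f"Riso = {r:.3f}: {interpret(r)}")
```

A value in the middle range does not by itself separate real substitution
from slight non-isomorphism, which is why the third branch stays ambiguous.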

> 2) We didn't do the fluorescence scan. The only thing we tried was that 
> we ran the native gel on the samples after HA soaking. The samples with 
> band shift on the native gel were considered as to be HA substituted.

Yes, also a good test.

> 3) For anomalous signal, the <ano>=-0.6, <|ano|>=9.3 for Hg; the  
> <ano>=-0.6, <|ano|>=12.3 for Pt; these are the output of FHSCALE. How  
> can I tell whether these statistics are good or not?

Good question: I never look at those. Have a look at the SHELXC tables
given in the relevant autoSHARP section: f"/sig they are called (or
similar).
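
The idea behind that kind of table is the mean of |F(+) - F(-)| divided
by its standard deviation. Here is a rough Python sketch of that ratio, not
SHELXC's actual implementation. The tuple layout, the function name and
the toy numbers are assumptions, and the errors on F(+) and F(-) are
assumed to be uncorrelated. For pure Gaussian noise the expected value of
such a ratio is sqrt(2/pi), about 0.80, so only values clearly above that
suggest real anomalous signal:

```python
import math


def mean_anom_signal(pairs):
    """Mean of |F(+) - F(-)| / sigma(difference) over Bijvoet pairs.

    pairs: iterable of (F_plus, sig_plus, F_minus, sig_minus); assumed layout.
    sigma(difference) is taken as sqrt(sig_plus**2 + sig_minus**2).
    """
    ratios = []
    for f_plus, sig_plus, f_minus, sig_minus in pairs:
        sig_diff = math.hypot(sig_plus, sig_minus)
        if sig_diff > 0:
            ratios.append(abs(f_plus - f_minus) / sig_diff)
    if not ratios:
        raise ValueError("no usable Bijvoet pairs")
    return sum(ratios) / len(ratios)


# Expected mean |x| for a standard normal variable, i.e. the pure-noise level.
NOISE_LEVEL = math.sqrt(2.0 / math.pi)

# Toy Bijvoet pairs, invented for illustration only.
pairs = [(103.0, 3.0, 97.0, 4.0), (50.0, 3.0, 55.0, 4.0), (80.0, 6.0, 80.0, 8.0)]
signal = mean_anom_signal(pairs)
print(f"mean anomalous signal/sigma = {signal:.2f} (noise level {NOISE_LEVEL:.2f})")
```

In practice the ratio is tabulated per resolution shell, so that one can
see where it falls to the noise level.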

> 4) For the isomorphous difference, I got the output from SCALEIT as follow,
>    RMS differences     RMSiso    RMSano
>    F_hg                54.18     31.60
>    F_pt                88.93     37.13

Also a good question: I'm not familiar with those. I tend to look at
the table (versus resolution) of <diso> (i.e. <|FPH-FP|>). You expect a
large value at low resolution that goes down with resolution.
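
To make the idea of such a table concrete, here is a small Python sketch
that bins <|FPH-FP|> into resolution shells. The tuple layout, the
function name, the shell limits and the toy reflections are all
assumptions for illustration. It is not the code behind the autoSHARP
table:

```python
from collections import defaultdict


def diso_by_shell(reflections, edges):
    """Mean |FPH - FP| per resolution shell.

    reflections: iterable of (d_spacing_in_A, FP, FPH) tuples (assumed layout).
    edges: shell limits in A, from low to high resolution, e.g. [15, 8, 6, 5, 4].
    Returns a list of (d_max, d_min, n_reflections, mean_diso) tuples.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for d, fp, fph in reflections:
        for i in range(len(edges) - 1):
            d_max, d_min = edges[i], edges[i + 1]
            if d_min < d <= d_max:
                sums[i] += abs(fph - fp)
                counts[i] += 1
                break  # each reflection belongs to at most one shell
    table = []
    for i in range(len(edges) - 1):
        n = counts[i]
        mean = sums[i] / n if n else float("nan")
        table.append((edges[i], edges[i + 1], n, mean))
    return table


# Toy reflections, invented for illustration only.
data = [(12.0, 100, 130), (9.0, 90, 70), (7.0, 80, 92),
        (5.5, 60, 52), (4.5, 40, 44), (4.2, 30, 28)]
edges = [15.0, 8.0, 6.0, 5.0, 4.0]
table = diso_by_shell(data, edges)
for d_max, d_min, n, mean in table:
    print(f"{d_max:5.1f} - {d_min:4.1f} A  n={n:3d}  <diso>={mean:6.2f}")
```

With real data the thing to check is the trend: the mean difference should
fall off smoothly towards high resolution, and a shell that breaks the
trend deserves a closer look.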

> 5) When we processed the data in HKL2000, the results for indexing in H3 
> and H32 were very close to each other.

The indexing won't really tell you - the scaling would though. I don't
know how to use scalepack for that kind of thing: but if you had
processed your data with XDS or MOSFLM in H3 you could quickly run

  % pointless mosflm.mtz

or

  % pointless xdsin XDS_ASCII.HKL

Pointless (Phil Evans) is very good at figuring out the symmetry of
the data.

> 6) The warning messages from scaling are like this,
>     * differences in amplitude
>     * very high R-factor(s) in low-resolution bin (nat-PT)

Not ideal: there could be a problem with the low-resolution data. I guess
you didn't explicitly mask the beamstop during your HKL2000 runs on all
datasets? I highly recommend doing that (very easy to do in MOSFLM and
d*TREK, a bit of a pain in XDS ... not sure how to do it in HKL2000):
after all, some outliers in the strongest reflection regime can mess
things up big time. And later (during density modification) you want
good low-res data for the solvent flattening/flipping to perform with
all its power.

There are automatic methods in all programs for doing that, but in my
experience it is worthwhile double-checking those approaches. And once
you have figured out how to do it properly by hand (in your favourite
program) it is usually a one-minute job (or often even faster).

>    The warning messages from ns analysis are like this,
>     * unexpected non-origin peaks (derivative 1, HG)
>     * unexpected non-origin peaks (derivative 2, PT)
>
>  and the non-origin peaks look like this, they all very close to origin 
> or alternative origins.

Then they're probably ripples ... maybe because of some low-res
outliers. You're right: I would only worry about them at this point if
they're away from origins.

> 7) I didn't get the exact meaning of this question,
>   * what does it have to say about likely strength of your HA signal
>     (ano and isomorphous)?

Usually, autoSHARP will say something about the recommended resolution
cutoffs for iso/ano data in two places: during the scaling/analysis
and during the SHELXC/D HA detection.
>
> 8) I set "search for 1" site for each derivative, because the crystal  
> should have 2 monomers in the a.u. as calculated by the Matthews  
> coefficient. SHELXD searches for 2 sites. Here are the correlations,
>
>    For Hg-SIRAS, the highest correlation=0.042;
>    For Hg-SAD  , the highest correlation=0.074;
>    For Pt-SIRAS, the highest correlation=0.150;
>    For Pt-SAD  , the highest correlation=0.123.

Not very high ...

> 9) How can I view the plot of CCall vs CCweak? I don't have CURVE2D, can 
> I view it by EXCEL or other program?

You need to use the plotmtv program - it is in

  /where/ever/sharp/helpers/linux/plotmtv

Usually (if you're on a Linux box and use firefox) when you click a
*.mtv hyperlink the first time it should ask you what to do with
it - this is the normal behaviour for the browser when it encounters a
file of an unknown type. Then point it to the

  /where/ever/sharp/bin/helpers.x-plotmtv

script (similar scripts are available for *.pdb, *.mtz files and maps
to be viewed with Coot). Usually, during installation the specific
settings were generated - but you could check
/where/ever/sharp/bin/helpers.local if you still have problems.

> 10) The first round of sharp gives FOM=0.013, PPiso/ano=0.047/0.043 (for Pt).
>     Then sharp removes 2 sites from pt and adds 2 new sites to pt.

Sounds like it didn't work: it removes the 2 sites it just found
... and the stats are very poor. So I wouldn't trust anything after
that at all.

>     * Compound 3 : phasing power ANO = 0.043 (drop below 1 at 32.45 ?)

VERY poor.

> 11) I went to 4A and 5A. You said "if that is only 5-6A then the statistics
> coming out of SHELXC/D can be very noisy and inflated". Do you mean that 
> the result of SHELXC/D at this resolution is not reliable?

The results can be correct - but it is much more tricky to know if
they are correct. The usual stats (CC values, CCweak vs CCall plot
etc) are very unreliable then.

> 12) Let me show you the map what I got (see attached files). In our  
> case, the structure should be a trimer. The monomer is expected to have a 
> long helix structure. The three-fold axis happens to be the same as the 
> crystal symmetry.

I'll have a look at it.

> 13) We did not get data at lower resolution. The reason I cut at 15A is 
> that the completeness at lower resolution is low.

Aha: is the completeness low because of distance/beamstop? Or because
of a large number of overlaps? In the latter case: you could collect a
low-intensity pass to have all strongest (!) reflections in your data
as well.

The next time I would definitely try and collect something to at least
30-40A ... unless the beamstop is one of those stupid, old-fashioned,
big thingies that cost 100 USD and block half the (10000000 USD)
detector surface ...

> 14) PPiso/ano=0.297/0.147 is to 4A for Hg. They all drop below 1 at 14.85 A.
> 15) PPiso/ano=0.108/0.175 is to 4A for Pt. They all drop below 1 at 14.85 A.

Not good.

> 16) You see, I still have the hand issue because my map is very flat.

I'm not sure I understand: the map is 'flat' because you told the
program to flatten it. This is not a good criterion for a good/bad map
or the correct handedness.

It is only a good criterion if you look at maps _before_ solvent
flattening (i.e. pure SHARP phases): but unless you have a whopping
good derivative and work with Lysozyme, this is not a map I would
actually look at ;-)

I would have another look at your images and data processing (let me
know if you need more info in that area) - so that you can be sure
that the low-resolution data are as good as they get, you understand
the incompleteness in that low-resolution range, you know where those
outliers come from and you know your spacegroup. After that you can
concentrate on the HA solution to see if that goes anywhere ...

Cheers

Clemens

-- 

***************************************************************
* Clemens Vonrhein, Ph.D.     vonrhein AT GlobalPhasing DOT com
*
*  Global Phasing Ltd.
*  Sheraton House, Castle Park 
*  Cambridge CB3 0AX, UK
*--------------------------------------------------------------
* BUSTER Development Group      (http://www.globalphasing.com)
***************************************************************