SHARP User Manual
Appendix 3

Detailed description of the Sharp INput file

Copyright © 2001-2006 by Global Phasing Limited

All rights reserved.

This software is proprietary to and embodies the confidential technology of Global Phasing Limited (GPhL). Possession, use, duplication or dissemination of the software is authorised only pursuant to a valid written licence from GPhL.

Documentation (2001-2006) Claus Flensburg, Marc Schiltz, Clemens Vonrhein

Contact sharp-develop@GlobalPhasing.com

Running the various programs through the graphical user interface Sushi is highly recommended. If you nevertheless want to create or edit SHARP input files (SIN files) by hand, this appendix gives a more detailed description of their syntax.

Note : you might also want to have a look at Appendix 2: How to use SHARP under UNIX

Introduction
General keywords
G-Sites
Compound
Crystal
Wavelength
Batch

1. I n t r o d u c t i o n

The SIN file is an ASCII representation of the hierarchical description that SHARP uses. The hierarchy is represented by nodes within curly brackets: { opens a node and } closes it. The closing bracket is always needed.

The input for each keyword must always be on a single line. Furthermore, most keywords require exact spelling (no mixing of upper- and lower-case letters). Numerical values can be given in free format, and white space is ignored.
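
Because every node must be closed, a quick bracket check can catch a common hand-editing mistake before SHARP is run. The following Python sketch is not part of SHARP; the function name check_braces is hypothetical, and it assumes that curly brackets never appear inside keyword values.

```python
def check_braces(sin_text):
    """Return the deepest nesting level; raise ValueError on unbalanced brackets."""
    depth = 0
    max_depth = 0
    for lineno, line in enumerate(sin_text.splitlines(), start=1):
        for ch in line:
            if ch == "{":
                depth += 1
                max_depth = max(max_depth, depth)
            elif ch == "}":
                depth -= 1
                if depth < 0:
                    raise ValueError(f"line {lineno}: closing bracket without an open node")
    if depth != 0:
        raise ValueError(f"{depth} node(s) left unclosed at end of file")
    return max_depth


# Fragment taken from the SPHCLUSTER example later in this appendix.
example = """C-SITES {
   C-SITE-01  G-SITE-01  Ta  SPHCLUSTER  Ta6Br12:Ta
   C-SITE-02  G-SITE-01  Br  SPHCLUSTER  Ta6Br12:Br
}"""
print(check_braces(example))  # prints 1
```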

REFINE/NOREFINE

The refinement of each refinable parameter can be specifically switched on or off using the REFINE or NOREFINE flag following the parameter value(s). These flags only have an effect if the MODE keyword specifies that refinement is requested.

The default for each parameter is NOREFINE.

ESTIMATE/NOESTIMATE

The estimation of each parameter can be specifically switched on or off using the ESTIMATE or NOESTIMATE flag following the REFINE/NOREFINE flag. These flags only have an effect if the MODE keyword specifies that estimation is to be done.

The default for each parameter is NOESTIMATE.

anisotropic temperature factors

2. G e n e r a l k e y w o r d s

DATAFILES

This points to the directory where the MTZ file is expected. It is usually something like /home/user/sharpfiles/datafiles (or a path relative to the main SHARP/autoSHARP installation).

OBSFILE

Name of the MTZ file (in DATAFILES). This should contain all columns (and optionally the Hendrickson-Lattman coefficients HLA ... HLD for external phase information).

You can override the DATAFILES/OBSFILE keywords by putting a file named REFL01.mtz into the current directory.

TITLE

A string describing the calculation you want SHARP to do; handy for keeping track of several runs with the same identifier.

MODE

Can be one or several of:

ESTIMATE
All parameters defined as ESTIMATE will be estimated as a first step.
REFINE
Maximum likelihood refinement of all parameters defined as REFINE.
RESIDUAL
Calculation of log-likelihood gradient maps.
ELECTRON_DENSITY
Calculation of electron density (phases).
SIGMAZ
Calculation of Luzzati tables à la SIGMAA. Requires the columns Fn, Fmod and PHImod in the reflection file.

SYMMETRY

Give the space-group name: make sure to use

H3 for hexagonal setting (versus R3 for rhombohedral setting)
unique-b setting for monoclinic

CELL

Cell parameters (all 6 values have to be given) in units of Å and degrees.

CYCLES

The syntax of the CYCLES card is:

CYCLES NumCycles iCycBeg iCycEnd

where all variables are integers. Refinement will be done starting at BIG cycle iCycBeg up to BIG cycle iCycEnd (with up to NumCycles small cycles at each BIG cycle). The maximum value for iCycEnd is 20.

REJECT

The syntax of the REJECT card is:

REJECT s q

where s is the string 'YES' or 'NO', switching rejection on or off, and q is a multiplier used to determine the cut-off value at which rejections are retained. The default is 'REJECT NO', while the usual setting (in the distributed tutorial files and in the SIN files prepared by Sushi and autoSHARP) is 'REJECT YES 5'.

SPARSE

The syntax of the SPARSE card within the SIN file is:

SPARSE Sparse_cut

where Sparse_cut is the allowed distance in Å (defaults to 8 Å). If the distance is negative the sparse approximation is switched off. Using the sparse approximation to the Hessian is usually not required if there are fewer than 200 variables.

The default, which leaves the sparse approximation switched off because the distance is negative, is:

SPARSE -8.0

WEED

The syntax is:

WEED I1 Q1 Q2

where I1 is an integer, Q1 and Q2 are reals.

If I1 is non-zero weeding is switched on. Q1 is an absolute threshold (weed_t1) and Q2 is a factor to multiply the standard uncertainty of the mean of the scores to construct the relative threshold weed_t2, with

weed_t2 = <score> - Q2 * Sigma(<score>).

The default is to do weeding with thresholds of 0.50 and 3.00 for Q1 and Q2, respectively, i.e.

WEED  1 0.50 3.00
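
As a worked illustration of the two thresholds, the Python sketch below evaluates weed_t1 and weed_t2 for a given mean score and its standard uncertainty. It is not SHARP code: the function name and the input numbers are hypothetical, and how SHARP applies the thresholds to individual sites is not covered here.

```python
def weed_thresholds(mean_score, sigma_mean, q1=0.50, q2=3.00):
    """Return (weed_t1, weed_t2) for the card 'WEED 1 q1 q2'."""
    weed_t1 = q1                              # absolute threshold
    weed_t2 = mean_score - q2 * sigma_mean    # relative threshold
    return weed_t1, weed_t2


# Made-up scores: mean 1.0 with a standard uncertainty of 0.125.
print(weed_thresholds(1.0, 0.125))  # prints (0.5, 0.625)
```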

STRATEGY

For each BIG cycle it is possible to switch on/off classes of variables using the STRATEGY card. The classes are:

Description	Variable(s)	Class Number	Value
Scaling	SCAL_K SCAL_B SCAL_B6_ADD	1	1
Non-isomorphism	NISO_BGLO NISO_BLOC NISO_CLOC NANO_BGLO NANO_BLOC NANO_CLOC	2	2
Occupancy	HAT_OCC	3	4
Coordinates	X, Y, Z	4	8
B-factors	HAT_B HAT_B6_ADD	5	16
Scattering factors	ATOM_f' ATOM_f"	6	32
Others		7	64

The STRATEGY card does NOT alter the REFINE/NOREFINE flag for the specific parameters.

The syntax of the STRATEGY card is as follows:

STRATEGY I1 I2 I3 ... In

where the parameters I1 to In are integers.

For each BIG cycle you can construct the sum of the Class Values (see table above) for the variable classes which are to be optimised in that BIG cycle and put it on the STRATEGY card. Some examples:

Standard STRATEGY 7 15 63 means that variables from the following classes are allowed to vary:

BIG cycle 1 :	Scaling, Non-isomorphism, Occupancy
BIG cycle 2 :	Scaling, Non-isomorphism, Occupancy, Coordinates
BIG cycle 3 :	Scaling, Non-isomorphism, Occupancy, Coordinates, B-factors, Scattering factors

Careful STRATEGY 7 15 23 63 means that variables from the following classes are allowed to vary:

BIG cycle 1 :	Scaling, Non-isomorphism, Occupancy
BIG cycle 2 :	Scaling, Non-isomorphism, Occupancy, Coordinates
BIG cycle 3 :	Scaling, Non-isomorphism, Occupancy, B-factors
BIG cycle 4 :	Scaling, Non-isomorphism, Occupancy, Coordinates, B-factors, Scattering factors

Reckless STRATEGY 0 7 63 means that variables from the following classes are allowed to vary:

BIG cycle 1 :	Nothing (this BIG cycle will be skipped)
BIG cycle 2 :	Scaling, Non-isomorphism, Occupancy
BIG cycle 3 :	Scaling, Non-isomorphism, Occupancy, Coordinates, B-factors, Scattering factors
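
Since each class value is a distinct power of two, the integers on a STRATEGY card behave like bit masks. The Python sketch below (not part of SHARP; the function name decode_strategy is hypothetical) decodes a card into the classes enabled at each BIG cycle, using the class values from the table above.

```python
# Class values as listed in the STRATEGY table.
CLASS_VALUES = {
    1: "Scaling",
    2: "Non-isomorphism",
    4: "Occupancy",
    8: "Coordinates",
    16: "B-factors",
    32: "Scattering factors",
    64: "Others",
}


def decode_strategy(card):
    """Return one list of enabled class names per BIG cycle."""
    fields = card.split()
    if not fields or fields[0] != "STRATEGY":
        raise ValueError("not a STRATEGY card")
    cycles = []
    for field in fields[1:]:
        total = int(field)
        # A class is enabled when its bit is set in the sum.
        cycles.append([name for value, name in CLASS_VALUES.items() if total & value])
    return cycles


for number, classes in enumerate(decode_strategy("STRATEGY 7 15 23 63"), start=1):
    print(f"BIG cycle {number} :", ", ".join(classes) or "Nothing")
```

Going the other way, the value for a BIG cycle is simply the sum of the wanted class values, e.g. 1 + 2 + 4 + 16 = 23 for Scaling, Non-isomorphism, Occupancy and B-factors.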

PRINT_HESS

The syntax is:

PRINT_HESS I1

where I1 is an integer. If the card is present or the variable I1 is non-zero, the Hessian file (printed at each small cycle through the Auxiliary Cycle Information) will contain all elements, and there will also be a link to a file with the eigenvectors. The latter is useful for detecting problems due to over-parametrisation.

The default is not to print the full Hessian (and eigenvectors):

PRINT_HESS 0

BOX_CONSTR_F'

The syntax is:

BOX_CONSTR_F' I1 QAbs QRel

where I1 is an integer and QAbs and QRel are reals.

If the variable I1 is non-zero, the f' values will be constrained to lie within the bounds:

MIN(f' - |f'|*QRel, f' - QAbs) and MAX(f' + |f'|*QRel, f' + QAbs)

computed with the value of f' at the start of each BIG cycle. The default is to use the card:

BOX_CONSTR_F' 1 2.0 0.5
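
To make the bounds concrete, the Python sketch below evaluates the MIN/MAX expressions above for a given starting value of f'. It is an illustration only; the function name f_prime_box and the example values of f' are hypothetical.

```python
def f_prime_box(f_prime, q_abs=2.0, q_rel=0.5):
    """Return (lower, upper) bounds for f' as defined for BOX_CONSTR_F'."""
    lower = min(f_prime - abs(f_prime) * q_rel, f_prime - q_abs)
    upper = max(f_prime + abs(f_prime) * q_rel, f_prime + q_abs)
    return lower, upper


# With the default card 'BOX_CONSTR_F' 1 2.0 0.5':
print(f_prime_box(-8.0))  # prints (-12.0, -4.0): the relative term dominates
print(f_prime_box(-1.0))  # prints (-3.0, 1.0): the absolute term dominates
```

The box is therefore never narrower than ± QAbs, and grows with |f'| once |f'|*QRel exceeds QAbs.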

BOX_CONSTR_XYZ

The syntax is:

BOX_CONSTR_XYZ I1 Q1

where I1 is an integer and Q1 is a real.

If the variable I1 is non-zero, each coordinate will be boxed within its initial value ± delta, where

delta = Q1 * Lowest_HiRes / Cell_parameter

where Lowest_HiRes is the lowest high-resolution limit among all Batches and Cell_parameter is A for X, B for Y, C for Z.

The default is to box coordinates using:

BOX_CONSTR_XYZ 1 0.7071
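
The size of the box can be worked out directly from the formula for delta. The Python sketch below does this for all three axes; the function name and the example resolution and cell lengths are hypothetical. Dividing by the cell length suggests that delta is expressed in fractional coordinates, but that reading is an assumption.

```python
def xyz_box_delta(q1, lowest_hires, cell_lengths):
    """Return (delta_X, delta_Y, delta_Z) = q1 * lowest_hires / (A, B, C)."""
    return tuple(q1 * lowest_hires / length for length in cell_lengths)


# Made-up example: 2.0 Å resolution limit, cell lengths 50, 60 and 70 Å.
delta_x, delta_y, delta_z = xyz_box_delta(0.7071, 2.0, (50.0, 60.0, 70.0))
print(f"{delta_x:.4f} {delta_y:.4f} {delta_z:.4f}")
```

With the default Q1 of 0.7071, Q1 * Lowest_HiRes is about 1.41 Å in this example, i.e. roughly 0.028 of a 50 Å cell edge.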

SPHCLUSTER

The syntax of the SPHCLUSTER card (for SHARP versions 2.1.0 and above) within the SIN file is:

SPHCLUSTER TAG d1 d2 ... dN

where TAG is a unique identifier (up to 12 characters) for this cluster of a single heavy atom type. The d1 to dN parameters are the distances (from the centre of the cluster [Å]) of each of the N atoms of the given type.

There has to be a SPHCLUSTER card for each element type in a cluster.

Use one G-Site to specify the location of the centre of the cluster.

At the C-site level you specify the actual scatterer type of the cluster component and add SPHCLUSTER TAG to specify which cluster component this C-site/G-site belongs to. E.g.

   C-SITES {
      C-SITE-01  G-SITE-01  Ta  SPHCLUSTER  Ta6Br12:Ta
      C-SITE-02  G-SITE-01  Br  SPHCLUSTER  Ta6Br12:Br
   }

Optimisation of all the usual atomic parameters is allowed for a cluster component.

Note : Since Sushi does not (yet) handle the Spherical cluster keywords properly, you will have to run SHARP either as described in Appendix 2: How to use SHARP under UNIX or use the following recommended procedure:

Set up a SIN file using Sushi and save it rather than running a SHARP job.
Edit the SIN file entering the required SPHCLUSTER information.
From the SHARP Control Panel request a restart of this job to launch SHARP.

A database of pre-defined clusters is available in the file $BDG_home/database/sphcluster. Any cluster defined in this database can be used directly. To use your own database of cluster definitions, set the environment variable SPHCLUSTER to a database file with the same format as $BDG_home/database/sphcluster:

# Comment line
TAG
N
d1 d2 ... dN

(TAG and d1 to dN have the same meaning as above)
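
A user-written database can be sanity-checked before it is handed to SHARP. The Python sketch below is not SHARP's own reader: it assumes that comment lines start with #, that blank lines can be skipped, that each cluster occupies exactly three lines (TAG, N, distances) and that a file may hold several such records. The tag DEMO:X and its distances are invented placeholders, not real cluster geometry.

```python
def parse_sphcluster_db(text):
    """Return {TAG: [d1, ..., dN]} from database text in the format shown above."""
    lines = [ln.strip() for ln in text.splitlines()]
    lines = [ln for ln in lines if ln and not ln.startswith("#")]
    if len(lines) % 3 != 0:
        raise ValueError("expected TAG, N and distance lines in groups of three")
    clusters = {}
    for i in range(0, len(lines), 3):
        tag = lines[i]
        n_atoms = int(lines[i + 1])
        distances = [float(d) for d in lines[i + 2].split()]
        if len(distances) != n_atoms:
            raise ValueError(f"{tag}: N is {n_atoms} but {len(distances)} distances given")
        if len(tag) > 12:
            raise ValueError(f"{tag}: TAG is longer than 12 characters")
        clusters[tag] = distances
    return clusters


demo = """# Hypothetical cluster, for illustration only
DEMO:X
3
1.0 1.0 2.5
"""
print(parse_sphcluster_db(demo))  # prints {'DEMO:X': [1.0, 1.0, 2.5]}
```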

ABSSCALE_MODE

The syntax is:

ABSSCALE_MODE MODE

where MODE is a character string.

This card is used to switch between different algorithms to determine an absolute scale. MODE can be either WILSON or E2PROTEIN. See: Morris & Bricogne for more details.

Default is to use the Wilson mode.

ABSSCALE_RESOL

The syntax is:

ABSSCALE_RESOL QLow [QHigh]

where QLow and QHigh are reals in units of Å.

This card sets the resolution limits for the determination of an absolute scale. If only one argument is present, it is assumed to be QLow.

PRIOR

This card specifies the form of the prior to be used during SHARP computations. The syntax is:

PRIOR Type

where Type is one of: NONE, WILSON, EXTERNAL or SIGMAZ.

If Type is EXTERNAL, the reflection file MUST contain columns with the following names: PriorA, PriorB, PriorV11, PriorV12 and PriorV22.

If Type is SIGMAZ, the reflection file MUST contain columns with the following names: Fn, Fmod and PHImod.

Cards to specify a particular prior for only some of the MODEs have the following form:

PRIOR_REFINE Type
PRIOR_RESIDUAL Type
PRIOR_PHASING Type

where Type has the same meaning as for the general PRIOR card.

The default is to use the Wilson prior for all computations.
