Covid19ReRefine6W9C

Please also check the autoPROC (Re)processing page for 6W9C

Content:

Introduction
Deposited model and data
(Re)processed data and refinement

Introduction

The 6W9C model was solved by molecular replacement (with Phaser) using 5Y3Q as the search model, and was refined with REFMAC.

Deposited model and data

Given the low completeness (see remarks on the autoPROC (re)processing page), it is not surprising that the overall validation metrics are somewhat poor. Here are some numbers from different versions of the MolProbity suite/method:

MolProbity 4.02:

    All-Atom Contacts
      Clashscore, all atoms    6.57               99th percentile* (N=189, 2.70 Å +/- 0.25 Å)
      (Clashscore is the number of serious steric overlaps (> 0.4 Å) per 1000 atoms.)
    Protein Geometry
      Poor rotamers            40         4.93%   Goal: <1%
      Ramachandran outliers    5          0.54%   Goal: <0.05%
      Ramachandran favored     779       84.40%   Goal: >98%
      MolProbity score^        2.56               83rd percentile* (N=5412, 2.70 Å +/- 0.25 Å)
      C-beta deviations >0.25 Å  0        0.00%   Goal: 0
      Bad backbone bonds       0 / 3720   0.00%   Goal: 0%
      Bad backbone angles      0 / 4643   0.00%   Goal: <0.1%

CCTBX version (as distributed with CCP4):

     Ramachandran outliers =   0.54 %   
                   favored =  84.40 %   
     Rotamer outliers      =   7.39 %   
     C-beta deviations     =     0   
     Clashscore            =   6.70 (percentile: 64.1 N=23192, 2.70A+/-0.25A) 
     RMS(bonds)            =   0.0135   
     RMS(angles)           =   1.77   
     MolProbity score      =   2.70 (percentile: 33.2 N=22793, 2.70A+/-0.25A) 
     Resolution            =   2.70   
     R-work                =   0.2350 (percentile: 17.4 N=23083, 2.70A+/-0.25A) 
     R-free                =   0.3090 (percentile: 5.5 N=22111, 2.70A+/-0.25A) 
     Refinement program    = REFMAC
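
The two MolProbity scores above (2.56 and 2.70) can be cross-checked against the individual metrics. The sketch below assumes the commonly quoted MolProbity score formula (a log-weighted combination of clashscore, rotamer outliers and Ramachandran favored percentage); that formula is not given on this page, and the function name `molprobity_score` is only illustrative.

```python
import math

def molprobity_score(clashscore, rotamer_outliers_pct, rama_favored_pct):
    """Combine three validation metrics into one MolProbity-style score.

    Assumed formula (not taken from this page):
      0.426*ln(1 + clashscore)
    + 0.33 *ln(1 + max(0, rotamer_outliers% - 1))
    + 0.25 *ln(1 + max(0, (100 - rama_favored%) - 2))
    + 0.5
    """
    rota_term = max(0.0, rotamer_outliers_pct - 1.0)
    rama_term = max(0.0, (100.0 - rama_favored_pct) - 2.0)
    return (0.426 * math.log(1.0 + clashscore)
            + 0.33 * math.log(1.0 + rota_term)
            + 0.25 * math.log(1.0 + rama_term)
            + 0.5)

# Values copied from the two reports above.
mp_402 = molprobity_score(6.57, 4.93, 84.40)   # MolProbity 4.02
cctbx = molprobity_score(6.70, 7.39, 84.40)    # CCTBX version
print(f"{mp_402:.2f} {cctbx:.2f}")
```

Under this assumption, the difference between the two scores comes mostly from the rotamer outlier percentage (4.93% versus 7.39%), since the Ramachandran favored value is identical and the clashscores are close.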

(Re)processed data and refinement

We will use the re-processed data and start again from an initial "MR-like" starting point. To do that, we are:

mutating the 5Y3Q model to have the 6W9C sequence (using Coot)
placing three copies of this starting model on top of the existing chains A, B and C

The idea is to start with a model of good geometry, since 5Y3Q is a 1.65 Å structure with complete data and good validation metrics. From this starting point, the options are to:

refine normally, i.e. let the data (and refinement - including NCS restraints) do their job
add LSSR restraints against the starting model (to preserve some of the initial good geometry, since the poorer data and completeness can easily cause overfitting)
use only a "poly-ALA" model (i.e. all side-chains truncated to Cbeta) as external restraints, because the initial model was obtained through automatic mutation towards the correct sequence and its side-chains may therefore be less reliable
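
As an illustration of what a "poly-ALA" model means in practice, here is a minimal sketch that truncates side-chains to Cbeta by keeping only the N, CA, C, O and CB atoms of each residue in PDB-format text. It is not the procedure used for this page; the function name `truncate_to_poly_ala` and the dummy coordinates are hypothetical, and residue names are left unchanged (an assumption, since only the atom content matters for this sketch).

```python
# Atom names kept in a poly-ALA model: backbone plus C-beta.
BACKBONE_PLUS_CB = {"N", "CA", "C", "O", "CB"}

def truncate_to_poly_ala(pdb_lines):
    """Drop ATOM/ANISOU records for side-chain atoms beyond C-beta.

    The atom name is read from the fixed PDB columns 13-16 (line[12:16]).
    All other records (CRYST1, HETATM, TER, ...) are passed through unchanged.
    """
    kept = []
    for line in pdb_lines:
        if line.startswith(("ATOM", "ANISOU")):
            if line[12:16].strip() not in BACKBONE_PLUS_CB:
                continue
        kept.append(line)
    return kept

# One leucine residue with dummy coordinates (illustrative only).
example = [
    "ATOM      1  N   LEU A   1       0.000   0.000   0.000  1.00 30.00           N",
    "ATOM      2  CA  LEU A   1       0.000   0.000   0.000  1.00 30.00           C",
    "ATOM      3  C   LEU A   1       0.000   0.000   0.000  1.00 30.00           C",
    "ATOM      4  O   LEU A   1       0.000   0.000   0.000  1.00 30.00           O",
    "ATOM      5  CB  LEU A   1       0.000   0.000   0.000  1.00 30.00           C",
    "ATOM      6  CG  LEU A   1       0.000   0.000   0.000  1.00 30.00           C",
    "ATOM      7  CD1 LEU A   1       0.000   0.000   0.000  1.00 30.00           C",
]

poly_ala = truncate_to_poly_ala(example)
print([line[12:16].strip() for line in poly_ala])
```

Glycine residues have no CB and simply keep their four backbone atoms. Hydrogens, if present, would also be dropped by this filter.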