Best practice

A normal autoPROC run (version 20240710 or later) will create a whole range of reflection data files in MTZ or mmCIF format. Not all of them are worth keeping: it is always best to refer to the main summary.html file, which describes all relevant output (anything not mentioned there is of intermediate nature and can be ignored). If you want to keep or transfer only a single file, take summary.tar.gz: once unpacked, it provides summary.html and all relevant files.
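As an illustration of the "single file" route, here is a minimal sketch of unpacking the archive into its own directory and locating summary.html. The function name unpack_summary and the default directory name autoPROC_summary are choices made for this example only, and the internal layout of the archive is not assumed (hence the search with find).

```shell
# Hypothetical helper, not part of autoPROC.
# Usage: unpack_summary [archive] [destination-directory]
unpack_summary() {
  archive=${1:-summary.tar.gz}   # assumed to be in the current directory by default
  dest=${2:-autoPROC_summary}    # arbitrary directory name chosen for this example
  mkdir -p "$dest" &&
  tar -xzf "$archive" -C "$dest" &&
  # Print the path of every summary.html found in the unpacked tree.
  find "$dest" -name summary.html
}
```

Calling unpack_summary without arguments in the directory holding summary.tar.gz should print the location of summary.html, which can then be opened in a browser.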


User problems with OneDep ("missing freeR set") - Oct 2022

The information on DepositionMmCif should provide adequate instructions when using both autoPROC+STARANISO and BUSTER for the data processing and refinement stages ... but what if different systems were used? A user might encounter a wwPDB deposition/validation problem similar to this:

So I tried to run a wwPDB validation job by submitting as mmCIF
structure factor file Data_1_autoPROC_STARANISO_all.cif, and as
mmCIF coordinate file the cif file output by phenix.refine. However,
I still get a "Structure factor file is missing freeR set" error.

We have to remember the distinction between the information given in the _refln.pdbx_r_free_flag column of a reflection mmCIF data block and the notion of a "freeR set" mentioned in the error message one gets.

The current situation can quickly give the following understandable impression

Thus it is not quite true that Data_1_autoPROC_STARANISO_all.cif is
"deposition-ready", because it appears to be so only if combined
with BUSTER-derived coordinates.

because of (1) the loss of information and provenance when going from data processing into subsequent use of reflection data and (2) the assumptions the deposition system makes about what a "typical" set of two mmCIF files should look like. What we provide in the autoPROC+BUSTER world is a way of combining the different deposition-ready files in order to create two files (model and reflection) from the following input:

We then need to do the following in order for the validation/deposition system to be happy:

So we would argue that all our mmCIF files are deposition-ready after all (they contain the complete and correct information); it is just that the validation/deposition system makes certain assumptions that are tricky to meet. A reflection file from data processing will never contain a _refln.status flag, since this is a derived quantity computed within downstream processes.
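To see which of the two items a given reflection mmCIF actually carries, a quick text scan can help. The following is a minimal sketch and not one of our tools: refln_flag_report is a hypothetical helper name, and it assumes that data block headers start in the first column and that item names appear as the first token on a line (as in a typical loop_ header). It does not parse CIF properly and says nothing about what the deposition system will accept.

```shell
# Hypothetical helper: for each data block, report whether the item names
# _refln.pdbx_r_free_flag and _refln.status occur in it.
refln_flag_report() {
  awk '
    # Print the result for the block collected so far (if any).
    function report() {
      if (block != "") printf "%s free_flag=%s status=%s\n", block, flag, status
    }
    # A new data block starts: emit the previous one and reset the markers.
    /^data_/ { report(); block = $1; flag = "no"; status = "no"; next }
    $1 == "_refln.pdbx_r_free_flag" { flag = "yes" }
    $1 == "_refln.status"           { status = "yes" }
    END { report() }
  ' "$1"
}
```

Running refln_flag_report on a reflection mmCIF prints one line per data block, which makes it easy to see whether a file coming straight from data processing has the test-set flag column but no _refln.status column.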

One might then rightly ask:

Surely it would be desirable to allow users to deposit
autoPROC-derived data, even if for whatever reason they used for
refinement a package different from BUSTER?

Absolutely - other refinement packages/systems should probably provide a similar tool to our aB_deposition_combine to combine the often much richer reflection mmCIF from data processing with the more limited reflection mmCIF from refinement ... at least as long as the deposition system itself lacks the flexibility to allow for the same combination steps and checks outlined above.

Remember that one can always "just" deposit the reflection and model mmCIF files coming out of refinement (any package) - with certain significant limitations:


Test-set flag principles - Apr 2026

The following OneDep message ("Structure factor file is missing freeR set") is causing our users a lot of headaches:

[Screenshot: Screenshot_2026-04-15_15-53-12.png]

Here is our assessment of it:

       Classification of a reflection so as to indicate its status with
       respect to inclusion in the refinement and the calculation of R
       factors.

The official wwPDB documentation has

But:

Therefore:

This incorrect warning/error message from OneDep has caused a lot of problems over several years, and we have tried to support users on a case-by-case basis to get the data through the annotation process unchanged. Unfortunately, this has not yet triggered a fix in the OneDep system. We feel very strongly that the bogus check (_refln.status in datablocks N>1) should be removed from OneDep as a matter of urgency: it is preventing a lot of our users from depositing metadata-rich, multi-datablock reflection data, whereas the prevailing Zeitgeist is that depositions should be as "rich" as possible to help train Machine Learning engines.


Reflection data file warning messages - Apr 2026

If you see the following block of warnings

[Screenshot: 6-back-to-square-one.png]

you have to remember first that

Also:

The warnings about "unwanted CIF item" are misleading

Why OneDep is complaining about the _diffrn_radiation_wavelength.wt item is unclear:

The warnings about an "abnormal" value are incorrect

The warnings about missing "mandatory items" are incorrect

                    The dataset used for the refinement should be listed as a first
                    data block and should contain diffraction indices h,k,l, observed
                    amplitudes and/or intensities, their respective sigma values
                    and refinement test set.

Bottom line: we think that all those warnings are either misleading or incorrect and can be ignored.


I've lost some files

What should you do if you "lost" those important files (summary.tar.gz or Data_1_autoPROC_STARANISO_all.cif) but you still want to deposit rich, multi-datablock reflection data?

If you only have aimless_alldata_unmerged.mtz, aimless_alldata.mtz, staraniso_alldata-unique.mtz, staraniso_alldata-unique.cif and staraniso_alldata.log, you could run

 % aP_deposition_prep -p 1
 % gemmi mtz2cif --no-comments --no-history --separate aimless_alldata.mtz aimless_alldata_unmerged.mtz 2_aimless_alldata.cif
 % cat 1_autoPROC_STARANISO_all.cif 2_aimless_alldata.cif > Data_1_autoPROC_STARANISO_all.cif

This should create the mmCIF file Data_1_autoPROC_STARANISO_all.cif, which is similar to the one originally created by autoPROC itself (but with less rich metadata).
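Since the file is assembled with a plain cat, it can be worth a quick sanity check that it contains the expected data blocks and that no block name occurs twice. The sketch below is only an illustration: datablock_summary is a hypothetical helper name, and it assumes data block headers start in the first column. How many blocks to expect depends on your input files, so no number is suggested here.

```shell
# Hypothetical helper: count the data blocks in an mmCIF file and
# list any block name that occurs more than once.
datablock_summary() {
  # Collect the data block headers (first token of each "data_..." line).
  names=$(grep '^data_' "$1" | awk '{print $1}')
  echo "blocks: $(printf '%s\n' "$names" | grep -c '^data_')"
  # uniq -d prints only names that are repeated in the sorted list.
  dups=$(printf '%s\n' "$names" | sort | uniq -d)
  [ -z "$dups" ] || echo "duplicate: $dups"
}
```

For example, datablock_summary Data_1_autoPROC_STARANISO_all.cif prints the number of data blocks, followed by a "duplicate:" line only if a block name is repeated.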

Now you should be able to run

  % aB_deposition_combine -aP Data_1_autoPROC_STARANISO_all.cif BUSTER_model.cif BUSTER_refln.cif

to get two deposition-ready mmCIF files:

aB_deposition_combine_model.cif
aB_deposition_combine_refln.cif