MultiFoXS: determination of multi-state models

# Introduction

In many cases, data comes from a heterogeneous sample. Heterogeneity can be compositional, conformational, or both. Interpreting data collected on a heterogeneous sample requires a multi-state model: a model that specifies two or more co-existing structural states and values for other parameters, such as the weight of each state.

IMP includes a command line tool multi_foxs for enumeration and scoring of multi-state models against SAXS profiles. There is also a web interface available, which functions similarly.

In this example, we demonstrate how to use multi_foxs with PDB structures of replication protein A (RPA) and a simulated SAXS profile.

For full help on the multi_foxs tool, run from a command line:

multi_foxs --help

# Setup

First, obtain the input files used in this example and put them in the current directory, by typing:

cp <imp_example_path>/multi_state/rpa/* .

(On a Windows machine, use copy rather than cp.) Here, <imp_example_path> is the directory containing the IMP example files. The full path to the files can be determined by running in a Python interpreter 'import IMP.multi_foxs; print(IMP.multi_foxs.get_example_path('rpa'))'.
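
As a platform-independent alternative to cp or copy, the same step can be sketched in Python. The helper name copy_example_files is hypothetical (it is not part of IMP), and the example directory is assumed to be the path obtained as described above.

```python
import glob
import os
import shutil


def copy_example_files(example_dir, dest_dir="."):
    """Copy every regular file in example_dir into dest_dir.

    Hypothetical helper, not part of IMP. Returns the sorted list of
    copied file names so the caller can check what arrived.
    """
    copied = []
    for path in sorted(glob.glob(os.path.join(example_dir, "*"))):
        if os.path.isfile(path):  # skip subdirectories
            shutil.copy(path, dest_dir)
            copied.append(os.path.basename(path))
    return copied
```

Calling copy_example_files(path) with the rpa example directory should leave the PDB files and weighted.dat in the current directory.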

# Calculation

The structure of RPA in complex with ssDNA is available in the RCSB Protein Data Bank (PDB) as code 1jmc (file 1jmc.pdb), and the unbound RPA structures are available as code 1fgu (files 1fguA.pdb and 1fguB.pdb). The SAXS profile is given in the file weighted.dat. It was simulated from the three structures with the following weights:

1jmc.pdb 0.5
1fguA.pdb 0.25
1fguB.pdb 0.25


3% Gaussian noise was added to the simulated profile.
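
To make the construction of such a profile concrete, here is a minimal sketch of mixing per-structure profiles by weight and adding Gaussian noise. It is not the script used to produce weighted.dat. The function name mix_profiles is hypothetical, and the assumption that "3% noise" means a standard deviation of 3% of the intensity at each point is ours.

```python
import random


def mix_profiles(profiles, weights, noise_fraction=0.03, seed=0):
    """Weighted sum of intensity profiles with added Gaussian noise.

    profiles: list of equal-length intensity lists, one per structure.
    weights:  one weight per profile (expected to sum to 1).
    Assumption: the noise standard deviation at each point is
    noise_fraction times the noise-free mixed intensity.
    Returns (noisy_intensities, errors).
    """
    if len(profiles) != len(weights):
        raise ValueError("need exactly one weight per profile")
    rng = random.Random(seed)  # fixed seed for reproducibility
    intensities, errors = [], []
    for k in range(len(profiles[0])):
        clean = sum(w * p[k] for w, p in zip(weights, profiles))
        sigma = noise_fraction * clean
        intensities.append(rng.gauss(clean, sigma))
        errors.append(sigma)
    return intensities, errors
```

With the weights listed above, the call would be mix_profiles([p_1jmc, p_1fguA, p_1fguB], [0.5, 0.25, 0.25]), where the three p_ lists are hypothetical per-structure profiles sampled on the same q grid.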

The atomic structures can be fit against the SAXS profile by running MultiFoXS:

multi_foxs weighted.dat 1fguA.pdb 1fguB.pdb 1jmc.pdb

MultiFoXS calculates the theoretical profiles of the input structures, enumerates multi-state models, and fits them to the input SAXS profile. The output is a list of the best-scoring multi-state models of size 1, 2, and 3, written to ensembles_size_1.txt, ensembles_size_2.txt, and ensembles_size_3.txt, respectively. It also generates multi_state_model_x_x_x.dat files, each containing the computed theoretical profile for a multi-state model and its fit against the experimental profile.
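
The enumerate-and-score idea can be illustrated with a simplified sketch. It is not the tool's implementation: it lists every subset of states exhaustively, omits the fitting of state weights, and uses one common definition of chi (root mean square of error-normalized residuals after fitting a single scale factor c). The function names are hypothetical.

```python
import itertools
import math


def chi(exp_intensity, exp_error, model_intensity):
    """Chi between an experimental and a model profile.

    Assumed definition: fit one scale factor c by weighted least
    squares, then take sqrt(mean(((I_exp - c * I_model) / sigma)**2)).
    """
    num = sum(e * m / s ** 2
              for e, m, s in zip(exp_intensity, model_intensity, exp_error))
    den = sum(m * m / s ** 2
              for m, s in zip(model_intensity, exp_error))
    c = num / den
    total = sum(((e - c * m) / s) ** 2
                for e, m, s in zip(exp_intensity, model_intensity, exp_error))
    return math.sqrt(total / len(exp_intensity))


def enumerate_subsets(names, max_size):
    """Yield every combination of state names of size 1..max_size."""
    for size in range(1, max_size + 1):
        yield from itertools.combinations(names, size)
```

For the three input structures, enumerate_subsets yields three 1-state, three 2-state, and one 3-state candidate. A full treatment would also fit non-negative weights for each candidate before scoring it.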

In this example, the best fit is obtained with a 3-state model (Chi=1.74, listed in ensembles_size_3.txt) with weights of 0.533, 0.197, and 0.270, showing a maximal deviation of 7% compared to the weights that were used to simulate the profile:

1 |  1.74 | x1 1.74 (1.03, 0.53)
0   | 0.533 (0.533, 1.000) | 1jmc.pdb (0.333)
1   | 0.197 (0.197, 1.000) | 1fguA.pdb (0.333)
2   | 0.270 (0.270, 1.000) | 1fguB.pdb (0.333)
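
To compare fitted and simulated weights programmatically, the per-state lines can be parsed with a few lines of Python. This is a sketch that assumes the layout shown in the listing above (index, then weight, then file name, separated by "|"); the function name parse_state_line is hypothetical, and the meaning of the parenthesized values is not interpreted here.

```python
def parse_state_line(line):
    """Parse '<index> | <weight> (...) | <file> (...)' into (file, weight).

    Assumes the three-field layout of the per-state lines shown above;
    only the first token of the weight and file fields is used.
    """
    fields = [field.strip() for field in line.split("|")]
    weight = float(fields[1].split()[0])
    file_name = fields[2].split()[0]
    return file_name, weight


state_lines = [
    "0   | 0.533 (0.533, 1.000) | 1jmc.pdb (0.333)",
    "1   | 0.197 (0.197, 1.000) | 1fguA.pdb (0.333)",
    "2   | 0.270 (0.270, 1.000) | 1fguB.pdb (0.333)",
]
fitted = dict(parse_state_line(line) for line in state_lines)

# Weights used to simulate the profile, as listed earlier.
simulated = {"1jmc.pdb": 0.5, "1fguA.pdb": 0.25, "1fguB.pdb": 0.25}

for name in simulated:
    print(name, "fitted:", fitted[name], "simulated:", simulated[name])
```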