IMP Reference Guide  develop.63b38c487d,2024/12/21
The Integrative Modeling Platform
IMP.spatiotemporal.prepare_protein_library Namespace Reference

Function for preparing spatiotemporal models for sampling.

Detailed Description

Function for preparing spatiotemporal models for sampling.

Functions

def prepare_protein_library
 Function that reads in experimental stoichiometry data and determines which compositions and location assignments should be sampled for spatiotemporal modeling; these are saved as config files.
 

Function Documentation

def IMP.spatiotemporal.prepare_protein_library.prepare_protein_library(
    times,
    exp_comp_map,
    expected_subcomplexes,
    nmodels,
    output_dir = '',
    template_topology = '',
    template_dict = {},
    match_final_state = True
)

Function that reads in experimental stoichiometry data and determines which compositions and location assignments should be sampled for spatiotemporal modeling; these are saved as config files.

Optionally, a PMI topology file can be provided, in which case topology files for each composition and location assignment are also written. Three types of output files are produced:

  1. *_time.config - configuration files, which list the proteins included at each time point for each model
  2. time.txt - protein copy number files. Each row is a protein copy number state and each column is the copy number of one protein in that state. Note that each protein copy number state can result in multiple location assignments.
  3. *_time_topol.txt - topology files for each copy number and location assignment.
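
As a rough illustration of how these three file types might be gathered after a run, the sketch below groups the files for one time point by the patterns listed above. It assumes that "time" in each pattern stands for the time label (for example 0min). The helper name collect_outputs is hypothetical and not part of IMP.

```python
from pathlib import Path


def collect_outputs(output_dir, time_label):
    """Group the files found for one time point by the three documented patterns.

    Assumption: 'time' in '*_time.config', 'time.txt' and '*_time_topol.txt'
    is replaced by the time label. This helper is illustrative only.
    """
    # An empty output_dir means the current working directory, as documented.
    root = Path(output_dir) if output_dir else Path.cwd()
    copy_number_file = root / f"{time_label}.txt"
    return {
        # One configuration file per model at this time point.
        "config": sorted(p.name for p in root.glob(f"*_{time_label}.config")),
        # A single copy number file per time point, or None if it is absent.
        "copy_numbers": copy_number_file.name if copy_number_file.exists() else None,
        # Topology files exist only if a template topology was supplied.
        "topology": sorted(p.name for p in root.glob(f"*_{time_label}_topol.txt")),
    }
```

Calling collect_outputs('', '0min') would inspect the current working directory. An empty "topology" list is expected when no template topology was given.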
Parameters
times: list of strings, the times at which the stoichiometry data should be read.
exp_comp_map: dictionary describing protein stoichiometry. Each key names a protein and should correspond to a name in expected_subcomplexes. Only copy numbers for proteins or subcomplexes included in this dictionary will be scored. For each of these proteins, a csv file with protein copy number data should be provided. The csv file should have 3 columns: 1) "Time", which matches the possible times in the graph, 2) "mean", the average protein copy number at that time point from experiment, and 3) "std", the standard deviation of that protein copy number from experiment.
expected_subcomplexes: list of all possible subcomplex strings in the model. Should be a list, without duplicates, of all components in the subcomplex configuration files.
nmodels: int, number of models with different protein copy numbers to generate at each time point.
output_dir: string, directory where the output will be written. An empty string means the current working directory. (default: '')
template_topology: string, name of the topology file for the complete complex. (default: '', no topology files are output)
template_dict: dictionary connecting the spatiotemporal model to the topology file. The keys (string) are the names of the proteins, as defined by expected_subcomplexes. The values (list) are the names of all proteins in the topology file that should have the same copy number as the labeled protein, specifically their "molecule_name" values. (default: {}, no topology files are output)
match_final_state: Boolean, determines whether to fix the final state to the state defined by expected_subcomplexes. True enforces this match and thus ensures that the final time has only one state. (default: True)
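
The input format described for exp_comp_map can be sketched as follows. This is a minimal example under stated assumptions, not the reference workflow. The time labels, protein names, copy number values, file names, and the helpers write_copy_number_csvs and run_prepare are all invented for illustration. It also assumes that each dictionary value is the path to that protein's csv file, which the text above implies but does not state outright.

```python
import csv
from pathlib import Path

# Hypothetical time labels and (mean, std) copy numbers, invented for illustration.
times = ["0min", "1min", "2min"]
measurements = {
    "A": [(0.2, 0.1), (1.1, 0.3), (2.0, 0.2)],
    "B": [(1.0, 0.2), (1.5, 0.4), (2.0, 0.3)],
}


def write_copy_number_csvs(data_dir, times, measurements):
    """Write one csv per protein with the columns Time, mean, std.

    Returns a dictionary mapping protein name to csv path, which is
    assumed here to be the form expected for exp_comp_map.
    """
    data_dir = Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)
    exp_comp_map = {}
    for protein, rows in measurements.items():
        csv_path = data_dir / f"exp_comp{protein}.csv"  # hypothetical file name
        with open(csv_path, "w", newline="") as handle:
            writer = csv.writer(handle)
            writer.writerow(["Time", "mean", "std"])
            for time, (mean, std) in zip(times, rows):
                writer.writerow([time, mean, std])
        exp_comp_map[protein] = str(csv_path)
    return exp_comp_map


def run_prepare(exp_comp_map, times, output_dir=""):
    """Call prepare_protein_library with the signature documented above.

    Requires an IMP installation; the import is local so that the csv
    helper can be used without IMP.
    """
    import IMP.spatiotemporal.prepare_protein_library as ppl

    ppl.prepare_protein_library(
        times,
        exp_comp_map,
        expected_subcomplexes=["A", "B"],  # must cover the keys of exp_comp_map
        nmodels=2,
        output_dir=output_dir,
    )
```

A typical sequence would be exp_comp_map = write_copy_number_csvs("data", times, measurements) followed by run_prepare(exp_comp_map, times). Leaving template_topology and template_dict at their defaults means no topology files are written.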
Note
This function is only available in Python.

Definition at line 57 of file prepare_protein_library.py.