IMP
2.4.0
The Integrative Modeling Platform
This page walks through an iterative design process to illustrate which issues matter and what to consider when choosing how to implement some functionality.
Hao wants to add ligand/protein scoring to IMP so that he can take advantage of the existing infrastructure. The details of the scoring function are currently experimental. As discussed below, the code needs to read ligands from mol2 files and score ligand/protein atom pairs using a PMF.
Since the mol2 reader is quite separate from the scoring, we will consider it on its own first. In analogy to the pdb reader, it makes sense to provide a function read_mol2(std::istream &in, Model *m) which returns an IMP::atom::Hierarchy.
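The parsing half of such a reader can be sketched without IMP. The example below is only an illustration: `Mol2Atom` and `read_mol2_atoms` are hypothetical names, and a real read_mol2() would build an IMP::atom::Hierarchy in the given Model instead of returning a plain vector. It assumes the standard Tripos layout, in which each line of the `@<TRIPOS>ATOM` section starts with atom id, atom name, x, y, z, and atom type.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the atoms an IMP::atom::Hierarchy would hold.
struct Mol2Atom {
  std::string name;  // atom_name column
  std::string type;  // SYBYL atom_type column, e.g. "C.3"
  double x, y, z;    // Cartesian coordinates
};

// Parse only the @<TRIPOS>ATOM section of a mol2 stream.
// All other sections are skipped; malformed atom lines are ignored.
std::vector<Mol2Atom> read_mol2_atoms(std::istream &in) {
  std::vector<Mol2Atom> atoms;
  bool in_atom_section = false;
  std::string line;
  while (std::getline(in, line)) {
    // Any record type indicator starts a new section.
    if (line.compare(0, 9, "@<TRIPOS>") == 0) {
      in_atom_section = (line.compare(0, 13, "@<TRIPOS>ATOM") == 0);
      continue;
    }
    if (!in_atom_section) continue;
    std::istringstream fields(line);
    int id = 0;
    Mol2Atom a;
    // Columns: atom_id atom_name x y z atom_type [optional fields...]
    if (fields >> id >> a.name >> a.x >> a.y >> a.z >> a.type) {
      atoms.push_back(a);
    }
  }
  return atoms;
}
```

Taking a std::istream rather than a file name, as the proposed signature does, lets the same function read from files, strings, or any other stream.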
The mol2 atom types can either be added at runtime using IMP::atom::add_atom_type() or be added as a list of predefined constants, similar to IMP::atom::AT_N. The latter requires editing both IMP/atom/Atom.h and modules/atom/src/Atom.cpp, so it is a bit harder to get right.
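To make the runtime option concrete, here is a minimal sketch of a registry that hands out integer indexes for type names as they are first seen. It is not IMP's implementation: `AtomTypeRegistry` and its methods are hypothetical, and the real IMP::atom::add_atom_type() may behave differently in detail.

```cpp
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical registry illustrating runtime registration of atom types.
class AtomTypeRegistry {
 public:
  // Register a new type name and return its integer index.
  int add(const std::string &name) {
    if (index_.count(name) != 0) {
      throw std::invalid_argument("duplicate atom type: " + name);
    }
    int id = static_cast<int>(names_.size());
    names_.push_back(name);
    index_[name] = id;
    return id;
  }
  bool has(const std::string &name) const { return index_.count(name) != 0; }
  // Look up the index of a registered name (throws if unknown).
  int get(const std::string &name) const { return index_.at(name); }
  // Look up the name of a registered index (throws if out of range).
  const std::string &name(int id) const {
    return names_.at(static_cast<std::size_t>(id));
  }

 private:
  std::vector<std::string> names_;
  std::map<std::string, int> index_;
};
```

With this approach a mol2 reader can register each unfamiliar type the first time it appears in a file, at the cost of not having named compile-time constants for those types.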
First, this functionality should probably go in a new module since it is experimental. One can use the scratch module in a separate git branch, for example.
One could then have a PMFRestraint which loads a PMF file from the module data directory (or from a user-specified path). It would also take two IMP::atom::Hierarchy decorators, one for the ligand and one for the protein, and score all pairs between the two. For each pair of atoms, it would look at the IMP::atom::Atom::get_type() value and use that to find the function to use in a stored table.
Such a design requires a reasonable amount of implementation, especially once one is interested in accelerating the scoring by only scoring nearby pairs. The PMFRestraint could use an IMP::core::ClosePairsScoreState internally if needed.
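The core loop of such a restraint might look like the sketch below. It is written against plain stand-in types rather than IMP: `TypedAtom`, `PmfTable`, and `score_all_pairs` are hypothetical names, and the PMF is assumed (for illustration only) to be one curve per type pair, sampled in uniform distance bins starting at zero. The loop still visits every ligand/protein pair and merely skips those beyond the cutoff; avoiding the quadratic enumeration altogether is the job a close-pairs structure would take over.

```cpp
#include <cmath>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-in for an atom with an integer type index.
struct TypedAtom {
  int type;
  double x, y, z;
};

// Hypothetical PMF table: one binned curve per (ligand type, protein type).
struct PmfTable {
  double bin_width;  // width of each distance bin, must be > 0
  std::map<std::pair<int, int>, std::vector<double> > curves;
};

double atom_distance(const TypedAtom &a, const TypedAtom &b) {
  const double dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
  return std::sqrt(dx * dx + dy * dy + dz * dz);
}

// Sum the PMF value over all ligand/protein pairs closer than cutoff.
// Pairs with no curve, or beyond the end of their curve, contribute 0.
double score_all_pairs(const std::vector<TypedAtom> &ligand,
                       const std::vector<TypedAtom> &protein,
                       const PmfTable &pmf, double cutoff) {
  double total = 0.0;
  for (const TypedAtom &l : ligand) {
    for (const TypedAtom &p : protein) {
      const double d = atom_distance(l, p);
      if (d >= cutoff) continue;
      auto it = pmf.curves.find(std::make_pair(l.type, p.type));
      if (it == pmf.curves.end()) continue;
      const std::size_t bin = static_cast<std::size_t>(d / pmf.bin_width);
      if (bin < it->second.size()) total += it->second[bin];
    }
  }
  return total;
}
```

Note how pair generation, type lookup, and function evaluation are all tangled together in one function; the designs that follow pull these apart.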
One could instead separate the scoring from the pair generation by implementing the scoring as an IMP::PairScore. Then the user could specify an IMP::core::ClosePairsScoreState when experimenting to see what is the fastest way to implement things.
As with the restraint solution, the IMP::PairScore would use the IMP::atom::Atom::get_type() value to look up the correct function to use.
If you look around in IMP for similar pair scores (see IMP::PairScore and the inheritance diagram), you see there is an IMP::core::TypedPairScore which already does what you need. That is, it takes a pair of particles, looks up their types, and then applies a particular IMP::PairScore based on their types. IMP::core::TypedPairScore expects an IMP::IntKey to describe the type. The appropriate key can be obtained from IMP::atom::Atom::get_type_key().
Then all that needs to be implemented is a function, say IMP::hao::create_pair_score_from_pmf(), which creates an IMP::core::TypedPairScore, loads a PMF file, and then calls IMP::core::TypedPairScore::set_pair_score() for each pair stored in the PMF file after translating PMF types to the appropriate IMP::atom::AtomType.
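The shape of that factory function can be sketched with a small stand-in for the typed pair score. Nothing here is IMP code: `TypedPairScoreSketch` only mimics the dispatch-by-type idea, and the PMF file layout (one line per type pair: two type names, a bin width, then the binned values) is invented for the example, since the actual format is not described here. Treating the type pair as unordered and scoring unregistered pairs as zero are likewise assumptions of the sketch.

```cpp
#include <cstddef>
#include <functional>
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Maps a distance to a score; stands in for an IMP::PairScore.
typedef std::function<double(double)> PairScoreFn;

// Hypothetical stand-in that dispatches on an unordered pair of types.
class TypedPairScoreSketch {
 public:
  void set_pair_score(PairScoreFn f, int atype, int btype) {
    scores_[ordered(atype, btype)] = std::move(f);
  }
  // Unregistered type pairs score 0 (an assumption of this sketch).
  double evaluate(int atype, int btype, double distance) const {
    auto it = scores_.find(ordered(atype, btype));
    return it == scores_.end() ? 0.0 : it->second(distance);
  }

 private:
  static std::pair<int, int> ordered(int a, int b) {
    return a < b ? std::make_pair(a, b) : std::make_pair(b, a);
  }
  std::map<std::pair<int, int>, PairScoreFn> scores_;
};

// Read the invented PMF format and register one binned function per pair.
// Lines that are malformed or name unknown types are skipped.
TypedPairScoreSketch create_pair_score_from_pmf(
    std::istream &in, const std::map<std::string, int> &type_index) {
  TypedPairScoreSketch tps;
  std::string line;
  while (std::getline(in, line)) {
    std::istringstream fields(line);
    std::string a, b;
    double bin_width = 0.0;
    if (!(fields >> a >> b >> bin_width) || bin_width <= 0.0) continue;
    auto ia = type_index.find(a);
    auto ib = type_index.find(b);
    if (ia == type_index.end() || ib == type_index.end()) continue;
    std::vector<double> values;
    double v = 0.0;
    while (fields >> v) values.push_back(v);
    tps.set_pair_score(
        [bin_width, values](double d) -> double {
          if (d < 0.0) return 0.0;
          const std::size_t bin = static_cast<std::size_t>(d / bin_width);
          return bin < values.size() ? values[bin] : 0.0;
        },
        ia->second, ib->second);
  }
  return tps;
}
```

The factory is the only PMF-specific code; which pairs get scored is left entirely to whatever generates the pairs.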
This design has the advantage of requiring very little code. As a result, it is easy to experiment (for example, by moving to 3D tables or changing the set of close pairs). Also, different, non-overlapping PMFs can be combined by just adding more terms to the IMP::core::TypedPairScore.
The disadvantage is that the scoring passes through more layers of function calls, making it hard to use optimizations such as storing all the coordinates in a central place.