IMP
2.0.1
The Integrative Modeling Platform
This page presents instructions on how to develop code using IMP. Developers who wish to contribute code back to IMP or distribute their code should also read the Contributing code to IMP page.
The input files in the IMP directory are structured as follows:
- tools contains various command line utilities for use by developers. They are documented below.
- doc contains inputs for general IMP overview documentation (such as this page), as well as configuration scripts for doxygen.
- applications contains various applications implemented using a variety of IMP modules.
- each subdirectory of modules/ defines a module; they all have the same structure. The directory for module name has the following structure:
  - README.md contains a module overview
  - include contains the C++ header files
  - src contains the C++ source files
  - bin contains C++ source files, each of which is built into an executable
  - pyext contains files defining the Python interface to the module as well as Python source files (in pyext/src)
  - test contains test files that can be run with ctest
  - doc contains additional documentation that is provided via .dox files
  - examples contains examples in Python, as well as any data needed for examples
  - data contains any data files needed by the module
When IMP is built, a number of directories are created in the build directory. They are
- include, which includes all the headers. The headers for module name are placed in include/IMP/name
- lib, where the C++ and Python libraries are placed. Module name is built into a C++ library lib/libimp_name.so (or .dylib on a Mac) and a Python library with Python files located in lib/IMP/name and the binary part in lib/_IMP_name.so
- doc, where the html documentation is placed in doc/html and the examples in doc/examples with a subdirectory for each module
- data, where each module gets a subdirectory for its data.
When IMP is installed, the structure from the build directory is moved over more or less intact except that the C++ and Python libraries are put in the (different) appropriate locations.
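As a quick reference, the build layout described above can be written down as a small helper. This is only an illustrative sketch: `build_layout` is a hypothetical function, not part of IMP, and the per-module subdirectory names under doc/examples and data are assumed to match the module name.

```python
import sys


def build_layout(name, platform=sys.platform):
    """Return the build-directory paths expected for module `name`.

    Hypothetical helper for illustration; it only encodes the layout
    described in the text and does not inspect a real build.
    """
    # The C++ library ends in .dylib on a Mac and .so elsewhere.
    ext = "dylib" if platform == "darwin" else "so"
    return {
        "headers": "include/IMP/%s" % name,
        "cpp_library": "lib/libimp_%s.%s" % (name, ext),
        "python_package": "lib/IMP/%s" % name,
        "python_extension": "lib/_IMP_%s.so" % name,
        # Assumption: the per-module subdirectory is named after the module.
        "examples": "doc/examples/%s" % name,
        "data": "data/%s" % name,
    }


if __name__ == "__main__":
    for kind, path in sorted(build_layout("mymodule").items()):
        print("%-18s %s" % (kind, path))
```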
The easiest way to start writing new functions and classes is to create a new module using the make-module script. This creates a new module in the modules directory, complete with example code.
We highly recommend using a revision control system such as Subversion or Git to keep track of changes to your module.
If, instead, you choose to add code to an existing module, you need to consult with the person or people who control that module. Their names can be found on the module main page.
When designing the interface for your new code, you should search IMP for similar functionality and, if there is any, adapt the existing interface for your purposes. For example, the existing IMP::atom::read_pdb() and IMP::atom::write_pdb() functions provide templates that should be used for the design of any functions that create particles from a file or write particles to a file. Since IMP::atom::BondDecorator, IMP::algebra::Segment3D and IMP::display::Geometry all use methods like IMP::algebra::Segment3D::get_point() to access the endpoints of a segment, any new object which defines similar point-based geometry should do likewise.
You may want to read the design example for some suggestions on how to go about implementing your functionality in IMP.
When there is a significant group of new functionality, a new set of authors, or code that is dependent on a new external dependency, it is probably a good idea to put that code in its own module. To create a new module, run the make_module script from the main IMP source directory, passing the name of your new module. The module name should consist of lower case characters and numbers, and should not start with a number. In addition, the name "local" is special and is reserved for modules that are internal to code for handling a particular biological system or application.
The script adds a number of example classes to the new module, which can be read and deleted.
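The naming rules above are easy to check before running the script. The following is a minimal sketch of such a check; `check_module_name` is a hypothetical helper, not part of IMP, and it also accepts underscores because the description of the module-creation script elsewhere on this page allows them.

```python
import re


def check_module_name(name):
    """Return a list of problems with a proposed module name (empty if OK).

    Hypothetical helper; the rules are taken from the text of this page.
    """
    problems = []
    # Lower case letters and numbers; underscores are accepted as well
    # (an assumption based on the later description of the script).
    if not re.match(r"[a-z0-9_]+\Z", name):
        problems.append("use only lower case letters, numbers and underscores")
    if name[:1].isdigit():
        problems.append("must not start with a number")
    if name == "local":
        problems.append('"local" is reserved for system-specific internal modules')
    return problems


if __name__ == "__main__":
    for candidate in ["mymodule", "my_module", "2fast", "MyModule", "local"]:
        print(candidate, check_module_name(candidate) or "ok")
```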
The next step is to update the information about the module stored in modules/mymodule/README.md. This includes the names of the authors and descriptions of what the module is supposed to do.
If the module makes use of external libraries, you can add code to modules/mymodule/SConscript to check for them. For example, you can check for a library called libextern.so (or libextern.dylib on a Mac) that is accessed through the header extern.h, using a function call such as call_extern(1.0). If the library is required, declare it where the SConscript makes its env.IMPModuleBuild call, so that your module is only built if the library is found. If the library is optional, declare it as optional instead; code can then test whether IMP.mymodule.use_extern is True.
Each module has an auto-generated header called modulename_config.h. This header contains basic definitions needed for the module and should be included (first) in each header file in the module. In addition, there is a header module_version.h which contains the version info as preprocessor symbols. This should not be included in module headers or cpp files, as doing so will force frequent recompilations.
Ensuring that your code is correct can be very difficult, so IMP provides a number of tools to help you out.
The first set are assert-style macros:
See the Error reporting/checking page for more details. As a general guideline, any improper usage should produce at least a warning, and all return values should be checked by such code.
The second is logging macros such as:
Finally, each module has a set of unit tests. The tests are located in the modules/modulename/test directory. These tests should try, as much as possible, to provide independent verification of the correctness of the code. Any file in that directory or a subdirectory whose name matches test_*.{py,cpp}, medium_test_*.{py,cpp} or expensive_test_*.{py,cpp} is considered a test. Normal tests should run in at most a few seconds on a typical machine, medium tests in 10 seconds or so and expensive tests in a couple of minutes.
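The file-name rules above can be expressed as a short classifier. This is a sketch only: `classify_test` is a hypothetical helper, not the mechanism the build system actually uses, and the time budgets are rough illustrative numbers read off the guidance above (3, 10 and 120 seconds).

```python
import fnmatch
import os

# (file-stem pattern, category, illustrative time budget in seconds)
_CATEGORIES = [
    ("expensive_test_*", "expensive", 120),
    ("medium_test_*", "medium", 10),
    ("test_*", "normal", 3),
]


def classify_test(path):
    """Return (category, budget) if `path` is named like a test, else None."""
    stem, ext = os.path.splitext(os.path.basename(path))
    if ext not in (".py", ".cpp"):
        return None
    for pattern, category, budget in _CATEGORIES:
        # fnmatchcase matches the whole stem, so "medium_test_x" does not
        # accidentally match the plain "test_*" pattern.
        if fnmatch.fnmatchcase(stem, pattern):
            return category, budget
    return None


if __name__ == "__main__":
    for name in ["test_bonds.py", "medium_test_io.cpp",
                 "expensive_test_sampling.py", "helper.py"]:
        print(name, classify_test(name))
```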
Some tests will require input files or temporary files. Input files should be placed in a directory called input in the test directory. The test script should then call the corresponding helper to get the true path to the file. Likewise, appropriate names for temporary files should be obtained from the corresponding helper rather than constructed by hand. Temporary files will be located in build/tmp. The test should remove temporary files after using them.
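The clean-up pattern for temporary files can be sketched with the standard unittest module alone. The names here are assumptions for illustration: `make_tmp_name` is a hypothetical stand-in for the helper provided by the IMP test framework, and it uses a scratch directory from `tempfile` rather than build/tmp.

```python
import os
import tempfile
import unittest


class TemporaryFileTest(unittest.TestCase):

    def make_tmp_name(self, name):
        """Hypothetical stand-in for the framework helper.

        Returns a path inside a fresh scratch directory, which is removed
        once the test has finished.
        """
        tmp_dir = tempfile.mkdtemp()
        self.addCleanup(os.rmdir, tmp_dir)
        return os.path.join(tmp_dir, name)

    def test_round_trip(self):
        path = self.make_tmp_name("values.txt")
        try:
            with open(path, "w") as handle:
                handle.write("1.0 2.0 3.0\n")
            with open(path) as handle:
                values = [float(x) for x in handle.read().split()]
            self.assertEqual(values, [1.0, 2.0, 3.0])
        finally:
            # The test removes its temporary file after using it.
            if os.path.exists(path):
                os.remove(path)
```

Removing the file in a `finally` block means it is cleaned up even when an assertion in the test fails.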
To assist in testing your code, we report the coverage of all IMP modules and applications as part of the nightly builds. Coverage is basically a report of which lines of code were executed by your tests; it is then straightforward to see which parts of the code have not been exercised by any test, so that you can write new tests to test those parts. (Of course, lines of code that are never executed have no guarantee of working correctly.)
Both the C++ and Python code coverage is reported. For C++ code, only the lines of code that were exercised are reported; for Python code, which conditional branches were taken are also shown (for example, whether both branches from an 'if' statement are followed).
Ideally, coverage reflects the lines of code in a module or application that were exercised only by running its own tests, rather than the tests of the entire IMP package, and generally speaking you should try to test a module using its own tests.
If you have code that for some reason you wish to exclude from coverage, you can add specially formatted comments to the code. For Python code, add a "pragma: no cover" comment to the line to exclude. For C++ code, an individual line can be excluded by adding LCOV_EXCL_LINE somewhere on that line, or a block can be excluded by surrounding it with lines containing LCOV_EXCL_START and LCOV_EXCL_STOP.
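For Python, the exclusion comment looks like the sketch below. The function names are hypothetical, and the behaviour noted in the comments assumes the coverage report is produced by a tool that handles the pragma the way coverage.py does, where a pragma on a `def` line excludes the whole function body.

```python
def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    if denominator == 0:
        return None
    return numerator / denominator


def _debug_dump(values):  # pragma: no cover
    # Diagnostic helper that no test calls. With coverage.py-style handling,
    # the pragma on the "def" line excludes the entire function.
    for index, value in enumerate(values):
        print(index, value)
```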
Make sure you read the API conventions page first.
To ensure code consistency and readability, certain conventions must be adhered to when writing code for IMP. Some of these conventions are checked automatically by source control before a new commit is allowed, and you can also run the same checks yourself on new code.
All C++ headers and code should be indented in 'Linux' style, with 2-space indents. Do not use tabs. This is roughly the output of Artistic Style when configured for that style. Split lines if necessary to ensure that no line is longer than 80 characters.
Rationale: Different users have different-sized windows or terminals, and different tab settings, but everybody can read 80 column output without tabs.
All Python code should conform to the Python style guide (http://www.python.org/dev/peps/pep-0008/). In essence this translates to 4-space indents, no tabs, and similar class, method and variable naming to the C++ code. You can ensure that your Python code is correctly indented by using the script provided for that purpose as part of the IMP distribution.
See the names section of the IMP conventions page. In addition, developers should be aware that
- all preprocessor symbols (things created by `#define`) must begin with IMP, and no IMP code should depend on preprocessor symbols which do not start with IMP.
- files that implement a single class should be named for that class; for example the SpecialVector class could be implemented in SpecialVector.h and SpecialVector.cpp
- files that provide free functions or macros should be given names separated_by_underscores, for example container_macros.h
- functions which take a parameter which has units should have the unit as part of the function name, for example IMP::atom::SimulationParameters::set_maximum_time_step_in_femtoseconds(). Remember the Mars orbiter. The exception to this is distance and force numbers, which should always be in angstroms and kcal/mol angstrom respectively unless otherwise stated.
Rationale: This makes it easier to tell between class names and function names where this is ambiguous (particularly an issue with the Python interface). The Python guys also mandate CamelCase for their class names, so this avoids any need to rename classes between C++ and Python to ensure clean Python code. Good naming is especially important with preprocessor symbols since these have file scope and so can change the meaning of other people's code.
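The rule on preprocessor symbols lends itself to a simple automated check. The sketch below is a hypothetical helper, not the checker that IMP itself runs: it scans source text and lists symbols created by `#define` that do not begin with IMP.

```python
import re

# Matches "#define NAME", allowing whitespace around the "#".
_DEFINE = re.compile(r"^\s*#\s*define\s+([A-Za-z_][A-Za-z0-9_]*)", re.MULTILINE)


def find_bad_defines(source):
    """Return the names of #define'd symbols that do not start with IMP."""
    return [name for name in _DEFINE.findall(source)
            if not name.startswith("IMP")]


if __name__ == "__main__":
    example = (
        "#define IMP_EXAMPLE_LIMIT 10\n"
        "#define MAX_SIZE 100\n"
        "  # define square(x) ((x) * (x))\n"
    )
    print(find_bad_defines(example))
```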
Store collections of Name objects using a Names. Declare functions that accept them to take a NamesTemp (a Names is a NamesTemp). Names are reference counted (see IMP::RefCounted for details), NamesTemp are not. Store collections of particles using a Particles object, rather than decorators.
All classes must have a show method which takes an optional std::ostream and prints information about the object (see IMP::Object::show() for an example). The helper macros, such as IMP_RESTRAINT(), define such a method. In addition they must have operator<< defined. This can be easily done using the IMP_OUTPUT_OPERATOR() macro once the show method is defined. Note that operator<< writes human readable information. Add a write method if you want to provide output that can be read back in.
Classes and methods should use IMP exceptions to report errors. See IMP::Exception for a list of existing exceptions. See also the list of functions that aid in error reporting and detection.
Use the provided IMPMODULE_BEGIN_NAMESPACE, IMPMODULE_END_NAMESPACE, IMPMODULE_BEGIN_INTERNAL_NAMESPACE and IMPMODULE_END_INTERNAL_NAMESPACE macros to put declarations in a namespace appropriate for module MODULE.
Each module has an internal namespace, module_name::internal, and an internal include directory modulename/internal. Any function which is not meant to be part of the public interface should be declared in an internal header and placed in the internal namespace.
The functionality in such internal headers is not part of the documented interface. As a result, such functions do not need to obey all the coding conventions (but we recommend that they do).
IMP is documented using Doxygen. See documenting source code with doxygen to get started. We use //! and /** ... */ blocks for documentation.
Python code should provide Python doc strings.
All headers not in internal directories are parsed through Doxygen. To hide any function that you do not want documented (for example, because it is not well tested), surround it with
```
#ifndef IMP_DOXYGEN
void messy_poorly_thought_out_function();
#endif
```
We provide a number of extra Doxygen commands to aid in producing nice IMP documentation. The commands are used by writing \commandname{args} or \commandname if there are no arguments.
- \command{the command text}
- \salilab{imp, the IMP project}, which produces the link "the IMP project"
- \external{http://boost.org, Boost}, which produces the link "Boost"
- \imp, which writes IMP so that no link is produced
- \advanceddoc, which marks advanced documentation in IMP code, for example: \advanceddoc You can tweak this class in various ways in order to optimize its performance. Similarly, advanced methods should be marked with \advancedmethod
- \warning, for example: \warning Be afraid, be very afraid.
- \unstable{Classname}. The documentation will include a disclaimer and the class or function will be added to a list of unstable classes. It is better to simply hide such things from Doxygen.
- \untested{Classname}
- \comparable, after which you should hide the comparison functions from Doxygen (there are a lot of them and they aren't very interesting).
Restraints take the current conformation of the particles and return a score and, if requested, add to the derivatives of each of the particles used. Evaluation can be done in one of two ways.
In whole model evaluation, each restraint is called one at a time and given a chance to compute its score based on the current conformation of the particles and to add to each particle's derivatives. That is, if \(R(P_i)\) is the score of the restraint on particle conformation \(i\), \(R'(P_i)\) is its derivative, and there are no other restraints:
| Stage | Score for R | Particle attribute | Particle derivative |
|---|---|---|---|
| before model evaluation | undefined | \(P_0\) | undefined |
| before restraint evaluation | 0 | \(P_0\) | 0 |
| after restraint evaluation | \(R(P_0)\) | \(P_0\) | \(R'(P_0)\) |
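The stages in the table can be traced with a toy model. The classes below are hypothetical and do not use the IMP API; the restraint is an assumed harmonic term \(R(x) = \frac{1}{2}k(x - x_0)^2\) on a single one-dimensional particle, chosen only so that the score and derivative are easy to verify by hand.

```python
class Particle(object):
    def __init__(self, x):
        self.x = x              # particle attribute (the conformation)
        self.derivative = None  # undefined before model evaluation


class HarmonicRestraint(object):
    """Toy restraint R(x) = 0.5 * k * (x - x0)**2 on one particle."""

    def __init__(self, particle, x0, k=1.0):
        self.particle = particle
        self.x0 = x0
        self.k = k

    def evaluate(self, calc_derivatives):
        d = self.particle.x - self.x0
        if calc_derivatives:
            # Restraints add to the derivative; they never overwrite it.
            self.particle.derivative += self.k * d
        return 0.5 * self.k * d * d


def evaluate_model(particles, restraints, calc_derivatives=True):
    """Whole model evaluation: reset, then call each restraint in turn."""
    if calc_derivatives:
        for p in particles:
            p.derivative = 0.0  # "before restraint evaluation" row
    score = 0.0
    for r in restraints:
        score += r.evaluate(calc_derivatives)
    return score


if __name__ == "__main__":
    p = Particle(3.0)
    r = HarmonicRestraint(p, x0=1.0, k=2.0)
    print(evaluate_model([p], [r]), p.x, p.derivative)
```

Because each restraint adds to the derivative instead of overwriting it, several restraints acting on the same particle accumulate correctly.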
Writing examples is a very important part of being an IMP developer and one of the best ways to help people use your code. To write a (Python) example, create a file myexample.py in the examples directory of an appropriate module, along with a file myexample.readme. The readme should provide a brief overview of what the code in the module is trying to accomplish as well as key pieces of IMP functionality that it uses.
When writing examples, one should try (as appropriate) to do the following:
- begin with import lines for the IMP modules used
- define a function create_representation which creates and returns the model with the needed particles along with a data structure so that key particles can be located. It should define nested functions as needed to encapsulate commonly used code
- define a function create_restraints which creates the restraints to score conformations of the representation
- define a function get_conformations to perform the sampling
- define a function analyze_conformations to perform some sort of clustering and analysis of the resulting conformations
- finish by calling the create_representation and create_restraints functions, performing sampling and analysis, and displaying the solutions.
Obviously, not all examples need all of the above parts. See Nup84 for a canonical example.
The example should have enough comments that the reasoning behind each line of code is clear to someone who roughly understands how IMP in general works.
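The overall shape of such an example can be sketched without any IMP calls. Everything below is a stand-in chosen for illustration: the "model" is a plain dictionary of one-dimensional coordinates, the restraint is an ordinary Python function, and sampling is uniform random. A real example would import the relevant IMP modules and use them in each function.

```python
import random


def create_representation():
    """Create the toy model and a structure for locating key particles."""
    model = {"a": 0.0, "b": 0.0}  # particle name -> 1D coordinate
    key_particles = {"first": "a", "second": "b"}
    return model, key_particles


def create_restraints(model, key_particles):
    """Create the restraints that score a conformation."""
    first, second = key_particles["first"], key_particles["second"]

    def distance_restraint(conf):
        # Zero when the two particles are exactly 1.0 apart.
        return (abs(conf[first] - conf[second]) - 1.0) ** 2

    return [distance_restraint]


def get_conformations(model, restraints, n_samples=200, seed=0):
    """Sample random conformations and score each one."""
    rng = random.Random(seed)
    conformations = []
    for _ in range(n_samples):
        conf = dict((name, rng.uniform(-2.0, 2.0)) for name in sorted(model))
        score = sum(r(conf) for r in restraints)
        conformations.append((score, conf))
    return conformations


def analyze_conformations(conformations, n_best=5):
    """Keep the best-scoring conformations (a stand-in for clustering)."""
    return sorted(conformations, key=lambda sc: sc[0])[:n_best]


# Tie the pieces together and display the solutions.
model, key_particles = create_representation()
restraints = create_restraints(model, key_particles)
conformations = get_conformations(model, restraints)
for score, conf in analyze_conformations(conformations):
    print("%.4f %s" % (score, sorted(conf.items())))
```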
IMP provides a variety of scripts to aid the lives of developers.
Creating such a module is the easiest way to get started developing code for IMP. First, choose a name for the module. The name should only contain letters, numbers and underscores as it needs to be a valid file name as well as an identifier in Python and C++. Next, create the module by running the make-module script with that name.
Once that is done, if you run scons with localmodules=True, your new module will be built. The new module includes a number of examples and comments to help you add code to the module.
You can use your new module in a variety of ways:
- add C++ code by putting .h files in modules/my_module/include and .cpp files in modules/my_module/src. In order to use your new functions and classes in Python, you must add a line %include "IMP/my_module/myheader.h" near the end of the file modules/my_module/pyext/my_module.i.
- write C++ programs that use IMP by creating .cpp files in modules/my_module/bin. Each .cpp file placed there is built into a separate executable.
- add Python code by putting a .py file in modules/my_module/pyext/my_module/ or by adding %pythoncode blocks to modules/my_module/pyext/my_module.i.
- add tests by putting .py files in modules/my_module/test or a subdirectory.
If you feel your module is of interest to other IMP users and developers, see the contributing code to IMP section.
If you document your code, building the documentation will cover all of the modules including yours. To access the documentation, open doc/html/index.html.
In order to be shared with others as part of the IMP distribution, code needs to be of higher quality and more thoroughly vetted than typical research code. As a result, it may make sense to keep the code as part of a private module until you better understand what capabilities can be cleanly offered to others.
The first set of questions to answer are
- Is there similar functionality already in IMP? If so, it might make more sense to modify the existing code in cooperation with its author. At the very least, the new code needs to respect the conventions established by the prior code in order to maintain consistency.
You are encouraged to post to imp-dev to find help answering these questions, as it can be hard to grasp all the various pieces of functionality already in the repository.
All code contributed to IMP must satisfy a number of requirements, including one concerning the documentation build (scons doc).
The next suggestions provide more details about the above choices and how to implement them.
Small pieces of functionality or extensions to existing functionality probably should be submitted to an existing module. Please contact the authors of the appropriate module and discuss the submission and how the code will be maintained.
A list of all current modules in the IMP repository can be found in the modules list or from the modules tab at the top of this page.
As always, if in doubt, post to imp-dev.
Patches to modules for which you have write access can be submitted directly by committing them to svn.
If you have a large group of related functionality to submit, it may make sense to create a new module in svn. Please post to imp-dev to discuss your plans.
Once you have submitted code, you should monitor the Nightly build status to make sure that your code builds on all platforms and passes the unit tests. Please fix all build problems as fast as possible.
The following sorts of changes must be announced on the imp-dev mailing list before being made
We recommend that changes be posted to the list a day or two before they are made so as to give everyone adequate time to comment.
In addition to monitoring the imp-dev list, developers who have a module or are committing patches to svn may want to subscribe to the imp-commits email list which receives notices of all changes made to the IMP SVN repository.
IMP is designed to run on a wide variety of platforms. To detect problems on other platforms we provide nightly test runs on the supported platforms for code that is part of the IMP SVN repository.
In order to make it more likely that your code works on all the supported platforms:
- do not use `and` and `or` in C++ code; use && and || instead.
- avoid friend declarations involving templates; use #if blocks on SWIG and IMP_DOXYGEN to hide code as needed instead.
IMP now turns on C++ 11 support when it can. However, since compilers are still quite variable in which C++ 11 features they support, it is not advisable to use them directly in IMP code at this point. To aid in their use when practical we provide several helper macros:
- one that adds the override keyword when available
- one that adds the final keyword when available
More will come.
The contents of this page are aimed at C++ programmers, but most apply also to Python.
Two excellent sources for general C++ coding guidelines are
IMP endeavors to follow all of the guidelines published in those books. The Sali lab owns copies of both of these books that you are free to borrow.
Below are some suggestions prompted by bugs found in code submitted to IMP.
- Do not use `using namespace` outside of a function; instead explicitly provide the namespace. (This avoids namespace pollution, and removes any ambiguity.)
- Do not use the preprocessor to define constants; use const variables instead. Preprocessor symbols don't have scope or type and so can have unexpected effects.
- Pass objects by value or by const & (if the object is large) and store copies of them.
- Return by const value or const ref if you are not providing write access. Returning a const copy means the compiler will report an error if the caller tries to modify the return value without creating a copy of it.
- Order includes so that headers from IMP modules and the kernel come before outside includes. This makes any dependencies in your code obvious, and by including standard headers after IMP headers, any missing includes in the headers themselves show up early (rather than being masked by other headers you include).
- Use double variables for all computational intermediates.
- Avoid functions that take several parameters of the same type (for example, several double parameters): it is easy for a user to mix up the order of arguments and the compiler will not complain. int and double count as equivalent types for this rule since the compiler will transparently convert an int into a double.
IMP uses SWIG to wrap C++ code and export it to Python. Since SWIG is relatively complicated, we provide a number of helper macros and an example file (see modules/example/pyext/swig.i-in). The key bits are
- one IMP_SWIG_VALUE(), IMP_SWIG_OBJECT() or IMP_SWIG_DECORATOR() line per value type, object type or decorator object the module exports to Python.
- %include lines, one per header file in the module which exports a class or function to Python. The header files must be in order such that no class is used before a declaration for it is encountered (SWIG does not search #include files for the %include headers).
- a %template call for any template that needs to be exported.
On Linux you can use gperftools for code profiling; setting it up involves config.py.
At some point you may want to understand how some aspect of IMP works under the hood. These pages explain various aspects of how IMP works.