IMP
2.3.0
The Integrative Modeling Platform
This page presents instructions on how to develop code using IMP. Developers should also read Getting started as a developer.
The input files in the IMP directory are structured as follows:
- tools: contains various command line utilities for use by developers. They are documented below.
- doc: contains inputs for general IMP overview documentation (such as this page), as well as configuration scripts for doxygen.
- applications: contains various applications implementing easier-to-use command line functionality, using a variety of IMP modules.
- modules/: each subdirectory defines a module; they all have the same structure. The directory for module name has the following structure:
  - README.md: contains a module overview
  - include: contains the C++ header files
  - src: contains the C++ source files
  - bin: contains C++ source files, each of which is built into an executable
  - pyext: contains files defining the Python interface to the module as well as Python source files (in pyext/src)
  - test: contains test files that can be run with ctest
  - doc: contains additional documentation that is provided via .dox or .md files
  - examples: contains examples in Python and C++, as well as any data needed for examples
  - data: contains any data files needed by the module
When IMP is built, a number of directories are created in the build directory. They are
- include: contains all the headers. The headers for module name are placed in include/IMP/name.
- lib: holds the C++ and Python libraries. Module name is built into a C++ library lib/libimp_name.so (or .dylib on a Mac) and a Python library with Python files located in lib/IMP/name and the binary part in lib/_IMP_name.so.
- doc: holds the html documentation in doc/html and the examples in doc/examples, with a subdirectory for each module.
- data: holds a subdirectory for each module's data.
When IMP is installed, the structure from the build directory is moved over more or less intact except that the C++ and Python libraries are put in the (different) appropriate locations.
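As a rough illustration of the build layout described above, the following Python sketch maps a module name to the locations listed. The function name build_paths, its parameters (including the default build directory name "build") and the dictionary keys are hypothetical and chosen only for this illustration; the paths themselves are the ones given above.

```python
import posixpath


def build_paths(name, build_dir="build", mac=False):
    """Return the build-directory locations described above for a module.

    Hypothetical helper for illustration only; it is not part of IMP.
    """
    # The C++ library uses .dylib on a Mac and .so elsewhere.
    cpp_ext = ".dylib" if mac else ".so"
    join = posixpath.join
    return {
        "headers": join(build_dir, "include", "IMP", name),
        "cpp_library": join(build_dir, "lib", "libimp_" + name + cpp_ext),
        "python_files": join(build_dir, "lib", "IMP", name),
        "python_binary": join(build_dir, "lib", "_IMP_" + name + ".so"),
        "examples": join(build_dir, "doc", "examples", name),
        "data": join(build_dir, "data", name),
    }


if __name__ == "__main__":
    for key, path in sorted(build_paths("mymodule").items()):
        print(key, path)
```

Keeping the mapping in one place makes it easy to see that every location is derived from the module name alone.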
The easiest way to start writing new functions and classes is to create a new module using make-module.py. This creates a new module in the modules directory. Alternatively, you can simply use the scratch module.
We highly recommend using a revision control system such as git or svn to keep track of changes to your module.
If, instead, you choose to add code to an existing module, you need to consult with the person or people who control that module. Their names can be found on the module main page.
When designing the interface for your new code, you should
You may want to read the design example for some suggestions on how to go about implementing your functionality in IMP.
Make sure you read the API Conventions page first.
To ensure code consistency and readability, certain conventions must be adhered to when writing code for IMP. Some of these conventions are automatically checked by source control before a new commit is allowed, and you can also check new code yourself by running check_standards.py.
All C++ headers and code should be indented with 2-space indents. Do not use tabs. clang-format can help you do this formatting automatically.
All Python code should conform to the Python style guide. In essence this translates to 4-space indents, no tabs, and similar class, method and variable naming to the C++ code. You can ensure that your Python code is correctly indented by using the cleanup_code.py script.
See the introduction first. In addition, developers should be aware that
- IMP.
- SpecialVector class could be implemented in SpecialVector.h and SpecialVector.cpp
- separated_by_underscores, for example container_macros.h
- Name using a Names. Declare functions that accept them to take a NamesTemp (Names is a NamesTemp). Names are reference counted (see IMP::RefCounted for details), NamesTemp are not. Store collections of particles using a Particles object, rather than decorators.
All values must have a show method which takes an optional std::ostream and prints information about the object (see IMP::base::Array::show() for an example). Add a write method if you want to provide output that can be read back in.
Classes and methods should use IMP exceptions to report errors. See IMP::base::Exception for a list of existing exceptions. See checks for more information.
Use the provided IMPMODULE_BEGIN_NAMESPACE, IMPMODULE_END_NAMESPACE, IMPMODULE_BEGIN_INTERNAL_NAMESPACE and IMPMODULE_END_INTERNAL_NAMESPACE macros to put declarations in a namespace appropriate for module MODULE.
Each module has an internal namespace, e.g. IMP::base::internal, and an internal include directory IMP/base/internal.
Any function which is
should be declared in an internal header and placed in the internal namespace.
The functionality in such internal headers is
As a result, such functions do not need to obey all the coding conventions (but we recommend that they do).
IMP is documented using doxygen. See Documenting your code in doxygen to get started. We use //! and /** ... */ blocks for documentation. You are encouraged to use Doxygen's markdown support as much as possible.
Python code should provide Python doc strings.
All headers not in internal directories are parsed through doxygen. Any function that you do not want documented (for example, because it is not well tested) can be hidden by surrounding it with
    #ifndef IMP_DOXYGEN
    void messy_poorly_thought_out_function();
    #endif
We provide a number of extra Doxygen commands to aid in producing nice IMP documentation.
- \unstable{Classname}. The documentation will include a disclaimer and the class or function will be added to a list of unstable classes. It is better to simply hide such things from doxygen.
- \untested{Classname}.
- \untested{Classname}.
Ensuring that your code is correct can be very difficult, so IMP provides a number of tools to help you out.
The first set are assert-style macros:
See the checks page for more details. As a general guideline, any improper usage should produce at least a warning, and all return values should be checked by such code.
The second is logging macros such as:
Finally, each module has a set of unit tests. The tests are located in the modules/modulename/test directory. These tests should try, as much as possible, to provide independent verification of the correctness of the code. Any file in that directory or a subdirectory whose name matches test_*.{py,cpp}, medium_test_*.{py,cpp} or expensive_test_*.{py,cpp} is considered a test. Normal tests should run in at most a few seconds on a typical machine, medium tests in 10 seconds or so and expensive tests in a couple of minutes.
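The naming rule above can be sketched in a few lines of Python. This is only an illustration of the three file-name patterns, not the code IMP's build system uses to discover tests; the function name classify_test and the returned labels are hypothetical.

```python
import fnmatch
import os

# File-name patterns from the text. Each pattern must match from the
# start of the file name, so the three cases cannot overlap.
_PATTERNS = [
    ("normal", "test_*"),
    ("medium", "medium_test_*"),
    ("expensive", "expensive_test_*"),
]


def classify_test(path):
    """Return 'normal', 'medium', 'expensive', or None if not a test."""
    base = os.path.basename(path)
    # Only Python and C++ source files count as tests.
    if os.path.splitext(base)[1] not in (".py", ".cpp"):
        return None
    for kind, pattern in _PATTERNS:
        if fnmatch.fnmatchcase(base, pattern):
            return kind
    return None
```

For instance, classify_test("modules/core/test/medium_test_sampling.py") yields "medium", while a helper file such as "utils.py" in the same directory yields None.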
Some tests will require input files or temporary files. Input files should be placed in a directory called input in the test directory. The test script should then call
to get the true path to the file. Likewise, appropriate names for temporary files should be found by calling
. Temporary files will be located in build/tmp.
The test should remove temporary files after using them.
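The clean-up rule above can be illustrated with Python's standard unittest and tempfile modules. This is a generic sketch, not IMP's own test helper: the class name, the test and the file contents are all hypothetical, and the temporary file is created by tempfile in the system temporary directory rather than in build/tmp.

```python
import os
import tempfile
import unittest


class TemporaryFileTest(unittest.TestCase):
    """Hypothetical test that writes a scratch file and removes it."""

    def setUp(self):
        # Create a uniquely named temporary file for this test.
        handle, self.tmp_name = tempfile.mkstemp(suffix=".txt")
        os.close(handle)
        # Register removal so the file is deleted even if the test fails.
        self.addCleanup(os.remove, self.tmp_name)

    def test_round_trip(self):
        # Write some values, read them back and compare.
        with open(self.tmp_name, "w") as out:
            out.write("1.0 2.0 3.0\n")
        with open(self.tmp_name) as inp:
            values = [float(x) for x in inp.read().split()]
        self.assertEqual(values, [1.0, 2.0, 3.0])
```

Registering the removal in setUp, rather than deleting the file at the end of the test body, means the file is cleaned up whether the assertions pass or fail.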
Writing examples is a very important part of being an IMP developer and one of the best ways to help people use your code. To write a (Python) example, create a file myexample.py in the example directory of an appropriate module, along with a file myexample.readme.
The readme should provide a brief overview of what the code in the module is trying to accomplish as well as key pieces of IMP functionality that it uses.
When writing examples, one should try (as appropriate) to do the following:
- include import lines for the IMP modules used
- define create_representation, which creates and returns the model with the needed particles along with a data structure so that key particles can be located. It should define nested functions as needed to encapsulate commonly used code
- define create_restraints, which creates the restraints to score conformations of the representation
- define get_conformations to perform the sampling
- define analyze_conformations to perform some sort of clustering and analysis of the resulting conformations
- call the create_representation and create_restraints functions, then perform sampling and analysis and display the solutions.
Obviously, not all examples need all of the above parts.
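The layout listed above might look roughly like the following skeleton. To stay runnable without IMP, it uses plain-Python stand-ins: the "model" is a dict of particle names to coordinates, the single "restraint" is an ordinary function, sampling is uniform random, and the analysis just keeps the best scores. All of these are assumptions made for illustration; a real example would import the IMP modules it uses and build IMP objects instead.

```python
import math
import random


def create_representation():
    """Create the stand-in model and a list of key particle names."""
    def make_particle(model, name):
        # Nested helper encapsulating commonly used code.
        model[name] = (0.0, 0.0, 0.0)
        return name

    model = {}
    key_particles = [make_particle(model, n) for n in ("a", "b")]
    return model, key_particles


def create_restraints(key_particles, target_distance=5.0):
    """Return scoring functions; lower scores are better."""
    first, second = key_particles

    def distance_restraint(conformation):
        p, q = conformation[first], conformation[second]
        d = math.sqrt(sum((u - v) ** 2 for u, v in zip(p, q)))
        return (d - target_distance) ** 2

    return [distance_restraint]


def get_conformations(model, restraints, n_samples=200, seed=0):
    """Sample random conformations and return (score, conformation) pairs."""
    rng = random.Random(seed)
    scored = []
    for _ in range(n_samples):
        conformation = {
            name: tuple(rng.uniform(-10.0, 10.0) for _axis in range(3))
            for name in model
        }
        score = sum(r(conformation) for r in restraints)
        scored.append((score, conformation))
    return scored


def analyze_conformations(scored, n_best=5):
    """Keep the best-scoring conformations (a stand-in for clustering)."""
    return sorted(scored, key=lambda pair: pair[0])[:n_best]


def main():
    model, key_particles = create_representation()
    restraints = create_restraints(key_particles)
    scored = get_conformations(model, restraints)
    best = analyze_conformations(scored)
    for score, _conformation in best:
        print("score %.3f" % score)
    return best


if __name__ == "__main__":
    main()
```

Splitting the script this way lets a reader jump straight to the stage they care about, and lets each function be reused or replaced independently.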
The example should have enough comments that the reasoning behind each line of code is clear to someone who roughly understands how IMP in general works.
Examples must use methods like IMP::base::get_example_data() to access data in the example directory. This allows them to be run from anywhere.
IMP uses SWIG to wrap C++ code and export it to Python. Since SWIG is relatively complicated, we provide a number of helper macros and an example file (see modules/example/pyext/swig.i-in). The key bits are
- one IMP_SWIG_VALUE(), IMP_SWIG_OBJECT() or IMP_SWIG_DECORATOR() line per value type, object type or decorator object the module exports to Python. Each of these lines looks like
      IMP_SWIG_VALUE(IMP::module_namespace, ClassName, ClassNames);
- include lines, one per header file in the module which exports a class or function to Python. The header files must be in order such that no class is used before a declaration for it is encountered (SWIG does not do recursive inclusion)
- template call. It should look something like
      namespace IMP {
        namespace module_namespace {
          %template(PythonName) CPPName<Restraint, 3>;
        }
      }
When there is a significant group of new functionality, a new set of authors, or code that is dependent on a new external dependency, it is probably a good idea to put that code in its own module. To create a new module, run the make-module.py script from the main IMP source directory, passing the name of your new module. The module name should consist of lower case characters and numbers, and should not start with a number. In addition, the name "local" is special and is reserved for modules that are internal to code for handling a particular biological system or application. For example:
./tools/make-module.py mymodule
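The naming rule for modules can be checked with a short regular expression. The sketch below is not the check that make-module.py itself performs; the function name is hypothetical, and reading "lower case characters" as the ASCII letters a-z is an assumption.

```python
import re

# Lower case letters and digits only, and the first character must be
# a letter. Treating "characters" as ASCII a-z is an assumption.
_MODULE_NAME = re.compile(r"[a-z][a-z0-9]*")


def is_valid_module_name(name):
    """Check a proposed module name against the rule stated above."""
    return _MODULE_NAME.fullmatch(name) is not None
```

Under this reading, "mymodule" and "em2d" are accepted, while "2dem" (starts with a number) and "MyModule" (upper case) are rejected. The special name "local" passes the pattern and would need to be handled separately.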
The next step is to update the information about the module stored in modules/mymodule/README.md. This includes the names of the authors and descriptions of what the module is supposed to do.
If the module makes use of external libraries, see the files modules/base/dependencies.py and modules/base/dependency/Log4CXX.description for examples.
Each module has an auto-generated header called modulename_config.h. This header contains basic definitions needed for the module and should be included (first) in each header file in the module. In addition, there is a header module_version.h which contains the version info as preprocessor symbols. This should not be included in module headers or cpp files as doing so will force frequent recompilations.
In order to be shared with others as part of the IMP distribution, code needs to be of higher quality and more thoroughly vetted than typical research code. As a result, it may make sense to keep the code as part of a private module until you better understand what capabilities can be cleanly offered to others.
The first set of questions to answer are
You are encouraged to post to the imp-dev list to find help answering these questions, as it can be hard to grasp all the various pieces of functionality already in the repository.
All code contributed to IMP
- gcc, clang++ and Visual C++) without warnings
See getting started as a developer for more information on submitting code.
Once you have submitted code, you should monitor the Nightly build status to make sure that your code builds on all platforms and passes the unit tests. Please fix all build problems as fast as possible.
In addition to monitoring the imp-dev list, developers who have a module or are committing patches to svn may want to subscribe to the imp-commits email list, which receives notices of all changes made to the IMP repository.
IMP is designed to run on a wide variety of platforms. To detect problems on other platforms we provide nightly test runs on the supported platforms for code that is part of the IMP repository.
In order to make it more likely that your code works on all the supported platforms:
- Do not use the keywords "and" and "or" in C++ code; use && and || instead.
- Avoid friend declarations involving templates; use the preprocessor, conditionally on the symbols SWIG and IMP_DOXYGEN, to hide code as needed instead.
IMP now turns on C++ 11 support when it can. However, since compilers are still quite variable in which C++ 11 features they support, it is not advisable to use them directly in IMP code at this point. To aid in their use when practical we provide several helper macros:
- override keyword when available
- final keyword when available
More will come.
Two excellent sources for general C++ coding guidelines are
IMP endeavors to follow all of the guidelines published in those books. The Sali lab owns copies of both of these books that you are free to borrow.
Below are suggestions prompted by bugs found in code submitted to IMP.
- Do not use 'using namespace' outside of a function; instead explicitly provide the namespace. (This avoids namespace pollution, and removes any ambiguity.)
- Use const variables instead. Preprocessor symbols don't have scope or type and so can have unexpected effects.
- const & (if the object is large) and store copies of them.
- Return a const value or const reference if you are not providing write access. Returning a const copy means the compiler will report an error if the caller tries to modify the return value without creating a copy of it.
      #include <IMP/mymodule/mymodule_exports.h>
      #include <IMP/mymodule/MyRestraint.h>
      #include <IMP/Restraint.h>
      #include <vector>
- Use double variables for all computational intermediates.
      FloatKey get_my_float_key() {
        static FloatKey k("hello");
        return k;
      }
- double parameters) it is easy for a user to mix up the order of arguments and the compiler will not complain. int and double count as equivalent types for this rule since the compiler will transparently convert an int into a double.