IMP Tutorial
When writing C++ code for IMP it can sometimes be difficult to track down bugs in the code. Some more subtle bugs may not crash the program but give incorrect results; in other cases the results may look OK on some machines but not on others. Bugs with memory allocation may waste or even exhaust memory. In this tutorial we will cover using the Valgrind tool to assist in finding such issues. Valgrind runs your program in a virtual machine, so it can accurately track and verify all memory access (at the expense of running more slowly).
This tutorial assumes you are already familiar with building IMP and creating new modules, as discussed in the IMP coding tutorial. You will also need to install Valgrind; it works best on Linux systems.
The C++ example code used here is very similar to that in the IMP coding tutorial, where we implemented a simple restraint to harmonically restrain particles to the XY plane. The full sources for the module can be found on GitHub. The only change made is to delegate the calculation of the score (in src/MyRestraint.cpp) to a helper class, ScoreCalculator (in the anonymous namespace since it's not part of the public interface):
To build the custom module, either drop the entire module into IMP's modules directory and then build IMP from source code in the usual way, or build the module out of tree, pointing CMake to an existing IMP installation. In order for Valgrind to be maximally useful, build the module with extra debugging information available by passing -DCMAKE_CXX_FLAGS="-g" to CMake. The module should build without errors (or even warnings) with gcc.
In order to exercise the potentially buggy code, one or more unit tests are needed. (Code coverage will help to show whether some code paths have been missed by tests.)
For this restraint, we already wrote a simple test case (test/test_restraint.py, also on GitHub) to evaluate the restraint and test the score. We can run this test case through Valgrind to see if it can pick up any issues:
Valgrind has a lot of command line options; see the Valgrind manual for more information. We used some common options here. Let's look at each part of the command line in turn:
PYTHONMALLOC=malloc tells Python (3.6 or later) to use the regular, slower, system dynamic memory allocator. By default Python uses its own small-object allocator (pymalloc), which results in lots of spurious warnings from Valgrind.
./setup_environment.sh in IMP's build directory sets the Python search path so that it can find IMP modules. Valgrind is then run in this environment.
--log-file puts Valgrind's output in a separate file, rather than having it interleaved with the IMP output.
--track-origins=yes makes Valgrind report where each uninitialised value came from, which will help us track down where problems occur in the code.
--leak-check=full --show-leak-kinds=definite will show where we definitely lose memory.
$TESTDIR is the directory containing test_restraint.py.
Valgrind can produce a lot of output. Searching the output file valg.out for our custom IMP::foo module will help to narrow this down. With our (contrived) example Valgrind finds two issues. Here's the first one:
The --track-origins=yes Valgrind option results in some extra information about the source of this value:
This shows us that the score we return to IMP (on line 28) from the ScoreCalculator object (created on line 22) is a function of an uninitialized value, i.e. a variable that has no defined value. This means that we can't trust the score. The value is often zero, which might give reasonable-looking results, but in principle could be anything, causing random and hard-to-find bugs. In this case it is easy to find the problem. We used k2_ as the force constant on line 13 but never assigned a value (we should have used k_ instead).
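The fix can be sketched as follows. This is a minimal, standalone illustration rather than the module's actual code: it assumes a simplified ScoreCalculator that takes the particle's z coordinate directly (the member z_ and the method name get_score are hypothetical), whereas the real helper works with IMP decorators. The point is only that every member read in the score is set in the constructor.

```cpp
namespace {

// Hypothetical stand-in for the tutorial's helper class. The real class
// operates on an IMP particle; here a bare z coordinate is assumed.
class ScoreCalculator {
  double z_;  // distance of the particle from the XY plane (assumed name)
  double k_;  // force constant, always set by the constructor

 public:
  ScoreCalculator(double z, double k) : z_(z), k_(k) {}

  // Harmonic score 0.5 * k * z^2, computed only from initialized members.
  double get_score() const { return 0.5 * k_ * z_ * z_; }
};

}  // namespace
```

With this version, ScoreCalculator(2.0, 3.0).get_score() evaluates to 6, and there is no second, never-assigned force constant left in the class for the score to depend on.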
(As an aside, because this score value is then passed back to Python and used elsewhere, we see many more "depends on uninitialised value(s)" errors in the Valgrind log. The first is the most informative.)
The second issue is a memory leak:
The issue here should be clear; we created a ScoreCalculator object on the heap using new (on line 22) but never freed the memory. We only leaked 32 bytes here but since we do this on every restraint evaluation, over the course of a long simulation this could result in a large leak.
The simplest solution here is to delete the object when we're done with it to free the memory. However, this is not ideal for a couple of reasons:
If an exception is thrown after the new but before the delete, the memory will still leak.
We have to make sure that every new is paired with a delete and that we never try to free the same memory more than once.
In this case, where the object is small, it would be better to avoid dynamic memory allocation entirely and just create the ScoreCalculator object as an automatic variable (on the stack) with ScoreCalculator calc(d, k_); instead of new. For a larger object, derive from the IMP::Object class and use a smart pointer to make sure it gets cleaned up automatically (replacing new with IMP_NEW and using IMP::Pointer rather than raw C++ pointers).
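Both alternatives can be illustrated with a small standalone sketch. It does not use IMP: std::unique_ptr stands in for IMP::Pointer as a standard-library analogue, and the ScoreCalculator below is a hypothetical simplification that takes a z coordinate instead of the decorator d. The instance counter and the exception thrown for a negative force constant are added purely so that cleanup can be observed.

```cpp
#include <memory>
#include <stdexcept>

namespace {

// Hypothetical helper that counts live instances so cleanup can be checked.
class ScoreCalculator {
  double z_, k_;

 public:
  static int live;  // number of ScoreCalculator objects currently alive

  ScoreCalculator(double z, double k) : z_(z), k_(k) { ++live; }
  ~ScoreCalculator() { --live; }
  ScoreCalculator(const ScoreCalculator &) = delete;
  ScoreCalculator &operator=(const ScoreCalculator &) = delete;

  // Throws for a negative force constant, only to exercise the error path.
  double get_score() const {
    if (k_ < 0.) throw std::invalid_argument("negative force constant");
    return 0.5 * k_ * z_ * z_;
  }
};

int ScoreCalculator::live = 0;

// Small object: automatic storage, destroyed when the function exits,
// whether by return or by exception.
double score_on_stack(double z, double k) {
  ScoreCalculator calc(z, k);
  return calc.get_score();
}

// Larger object: heap storage owned by a smart pointer. std::unique_ptr is
// used here as a stand-in for IMP::Pointer; no explicit delete is needed.
double score_with_smart_pointer(double z, double k) {
  std::unique_ptr<ScoreCalculator> calc(new ScoreCalculator(z, k));
  return calc->get_score();
}

}  // namespace
```

In both functions the object is destroyed even when get_score() throws, so ScoreCalculator::live returns to zero on every path. That is the property a manually paired new and delete cannot guarantee.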
(Valgrind also reports some small memory leaks from IMP's SWIG interface. These aren't worth worrying about, since this is not our code, and they should be one-time allocations, which get cleaned up at the end of the program anyway.)