Introduction

When writing C++ code for IMP, bugs can be difficult to track down. Some of the more subtle ones do not crash the program but give incorrect results; in other cases the results may look OK on some machines but not on others. Bugs in memory allocation may waste or even exhaust memory. In this tutorial we will cover using the Valgrind tool to assist in finding such issues. Valgrind runs your program in a virtual machine, so it can accurately track and verify every memory access (at the expense of running more slowly).

This tutorial assumes you are already familiar with building IMP and creating new modules, as discussed in the IMP coding tutorial. You will also need to install Valgrind; it works best on Linux systems.

C++ code

The C++ example code used here is very similar to that in the IMP coding tutorial, where we implemented a simple restraint to harmonically restrain particles to the XY plane. The full sources for the module can be found at GitHub. The only change made is to delegate the calculation of the score (in src/MyRestraint.cpp) to a helper class, ScoreCalculator (in the anonymous namespace since it's not part of the public interface):

namespace {
  class ScoreCalculator {
    core::XYZ xyz_;
    double k_, k2_;
  public:
    ScoreCalculator(core::XYZ xyz, double k) : xyz_(xyz), k_(k) {}
    double get_score() { return .5 * k2_ * square(xyz_.get_z()); }
  };
}
void MyRestraint::do_add_score_and_derivatives(ScoreAccumulator sa) const {
  core::XYZ d(get_model(), p_);
  ScoreCalculator *calc = new ScoreCalculator(d, k_);
  double score = calc->get_score();
  if (sa.get_derivative_accumulator()) {
    double deriv = k_ * d.get_z();
    d.add_to_derivative(2, deriv, *sa.get_derivative_accumulator());
  }
  sa.add_score(score);
}
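
As a reading aid for the code above, the standard harmonic form that a restraint to the XY plane is meant to compute, for a particle at height $z$ and force constant $k$, can be written as follows (the symbols $E$, $z$ and $k$ are notation chosen here, not names from the module):

```latex
E(z) = \tfrac{1}{2}\, k\, z^{2}, \qquad \frac{\partial E}{\partial z} = k\, z
```

The derivative term corresponds to the k_ * d.get_z() expression in do_add_score_and_derivatives.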

Build procedure

To build the custom module, either drop the entire module into IMP's modules directory and then build IMP from source code in the usual way, or build the module out of tree, pointing CMake to an existing IMP installation. For Valgrind to be maximally useful, build the module with extra debugging information by passing -DCMAKE_CXX_FLAGS="-g" to CMake. The module should build without errors (or even warnings) with gcc.

Testing with Valgrind

In order to exercise the potentially buggy code, one or more unit tests are needed. (Code coverage will help to show whether some code paths have been missed by tests.)

For this restraint, we already wrote a simple test case (test/test_restraint.py, also at GitHub) to evaluate the restraint and test the score. We can run this test case through Valgrind to see if it can pick up any issues:

PYTHONMALLOC=malloc ./setup_environment.sh valgrind --log-file=valg.out --track-origins=yes --leak-check=full --show-leak-kinds=definite python3 $TESTDIR/test_restraint.py

Valgrind has a lot of command line options; see the Valgrind manual for more information. We used some common options here. Let's look at each part of the command line in turn:

PYTHONMALLOC=malloc tells Python (3.6 or later) to use the regular, slower, system dynamic memory allocator. By default Python uses its own small-object allocator (pymalloc), which results in lots of spurious warnings from Valgrind.

./setup_environment.sh in IMP's build directory sets the Python search path so that it can find IMP modules. Valgrind is then run in this environment.

--log-file puts Valgrind's output in a separate file, rather than having it interleaved with the IMP output.

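--track-origins=yes makes Valgrind record where each uninitialized value was created, which helps us track down the source of such problems in the code (at some additional cost in speed).

--leak-check=full --show-leak-kinds=definite reports each leaked block individually, with the stack trace of its allocation, and restricts the report to blocks that are definitely lost (no pointer to them remains).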

$TESTDIR is the directory containing test_restraint.py.

Valgrind output

Valgrind can produce a lot of output. Searching the output file valg.out for our custom IMP::foo module will help to narrow this down. With our (contrived) example Valgrind finds two issues. Here's the first one:

==286540== Conditional jump or move depends on uninitialised value(s)
==286540==    at 0x1707A77D: IMP::ScoreAccumulator::add_score(double) (ScoreAccumulator.h:84)
==286540==    by 0x17079D89: IMP::foo::MyRestraint::do_add_score_and_derivatives(IMP::ScoreAccumulator) const (MyRestraint.cpp:28)

The --track-origins=yes Valgrind option results in some extra information about the source of this value:

==286540==  Uninitialised value was created by a heap allocation
==286540==    at 0x4839E7D: operator new(unsigned long) (vg_replace_malloc.c:342)
==286540==    by 0x17079CA2: IMP::foo::MyRestraint::do_add_score_and_derivatives(IMP::ScoreAccumulator) const (MyRestraint.cpp:22)

This shows us that the score we return to IMP (on line 28) from the ScoreCalculator object (created on line 22) is a function of an uninitialized value, i.e. a variable that was never given a defined value. This means that we can't trust the score. The value is often zero, which might give reasonable-looking results, but in principle it could be anything, causing random and hard-to-find bugs. In this case the problem is easy to find: we used k2_ as the force constant on line 13 but never assigned it a value (we should have used k_ instead).
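
One way the fix might look is sketched below. So that it compiles without IMP, core::XYZ is replaced by a hypothetical FakeXYZ struct and square() is defined locally; harmonic_z_score() and check_scores() are likewise illustrative helpers that are not part of the module. The only change that matters is that get_score() now reads k_, which the constructor initializes, and the unused k2_ member is removed.

```cpp
#include <cassert>

// Hypothetical stand-in for core::XYZ so the sketch compiles without IMP.
struct FakeXYZ {
  double z;
  double get_z() const { return z; }
};

// Local replacement for the square() helper used in the original code.
inline double square(double x) { return x * x; }

class ScoreCalculator {
  FakeXYZ xyz_;
  double k_;  // the never-initialized k2_ member has been removed
public:
  ScoreCalculator(FakeXYZ xyz, double k) : xyz_(xyz), k_(k) {}
  // Every member read here is set in the constructor's initializer list.
  double get_score() const { return .5 * k_ * square(xyz_.get_z()); }
};

// Illustrative wrapper: score for a particle at height z with force constant k.
inline double harmonic_z_score(double z, double k) {
  return ScoreCalculator(FakeXYZ{z}, k).get_score();
}

// Hand-computed checks: 0.5 * 3 * 2^2 = 6, and a particle in the plane scores 0.
inline void check_scores() {
  assert(harmonic_z_score(2.0, 3.0) == 6.0);
  assert(harmonic_z_score(0.0, 3.0) == 0.0);
}
```

Removing the unused member, rather than merely initializing it, also removes the chance of reading it by mistake again later.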

(As an aside, because this score value is then passed back to Python and used elsewhere, we see many more "depends on uninitialised value(s)" errors in the Valgrind log. The first is the most informative.)

The second issue is a memory leak:

==286540== 32 bytes in 1 blocks are definitely lost in loss record 444 of 5,686
==286540==    at 0x4839E7D: operator new(unsigned long) (vg_replace_malloc.c:342)
==286540==    by 0x17079CA2: IMP::foo::MyRestraint::do_add_score_and_derivatives(IMP::ScoreAccumulator) const (MyRestraint.cpp:22)

The issue here should be clear: we created a ScoreCalculator object on the heap using new (on line 22) but never freed the memory. We only leaked 32 bytes here, but since we do this on every restraint evaluation, over the course of a long simulation this could add up to a large leak (for example, 100 million evaluations would leak about 3.2 GB).

The simplest solution here is to delete the object when we're done with it to free the memory. However, this is not ideal for a couple of reasons:

If a C++ exception occurs after the new but before the delete, the memory will still leak.
In more complex programs with multiple code paths it can be tricky to make sure that every new is paired with a delete and that we never try to free the same memory more than once.

In this case, where the object is small, it would be better to avoid dynamic memory allocation entirely and just create the ScoreCalculator object as an automatic variable (on the stack), as in ScoreCalculator calc(d, k_). For a larger object, derive from the IMP::Object class and use a smart pointer to make sure it gets cleaned up automatically (replacing new with IMP_NEW and using IMP::Pointer rather than raw C++ pointers).
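
The difference between these approaches can be sketched in plain C++ without IMP. Everything in the block below is illustrative: Calculator is a hypothetical stand-in for ScoreCalculator that counts live instances, the exception on a negative force constant exists only to create an early exit path, and std::unique_ptr stands in for a smart pointer in general (in IMP code, IMP::Pointer plays that role for classes derived from IMP::Object).

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Hypothetical stand-in for ScoreCalculator that counts live instances,
// so leaks can be checked without IMP or Valgrind.
class Calculator {
  double z_, k_;
public:
  static int live;  // number of Calculator objects currently alive
  Calculator(double z, double k) : z_(z), k_(k) { ++live; }
  Calculator(const Calculator &) = delete;
  ~Calculator() { --live; }
  double get_score() const {
    // Artificial failure mode, used only to exercise the exception path.
    if (k_ < 0) throw std::invalid_argument("negative force constant");
    return .5 * k_ * z_ * z_;
  }
};
int Calculator::live = 0;

// Raw new/delete, shown for contrast: the delete is skipped if get_score() throws.
double score_raw(double z, double k) {
  Calculator *calc = new Calculator(z, k);
  double score = calc->get_score();  // an exception here would leak *calc
  delete calc;
  return score;
}

// Automatic variable: destroyed on every exit path, including exceptions.
double score_stack(double z, double k) {
  Calculator calc(z, k);
  return calc.get_score();
}

// Heap allocation owned by a standard smart pointer.
double score_unique(double z, double k) {
  std::unique_ptr<Calculator> calc(new Calculator(z, k));
  return calc->get_score();
}

// True if f throws for a negative k and no Calculator is left alive afterwards.
template <class F>
bool throws_without_leak(F f) {
  try {
    f(1.0, -1.0);
  } catch (const std::invalid_argument &) {
    return Calculator::live == 0;
  }
  return false;
}

inline void check_no_leak() {
  assert(throws_without_leak(score_stack));
  assert(throws_without_leak(score_unique));
}
```

Both score_stack and score_unique release the object during stack unwinding, so neither needs a matching delete on any code path; score_raw would leave one live object behind if called with a negative k.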

(Valgrind also reports some small memory leaks from IMP's SWIG interface. These aren't worth worrying about, since this is not our code, and they should be one-time allocations, which get cleaned up at the end of the program anyway.)
