
Re: [IMP-dev] [Fwd: PDB lib]



At any rate, this PDB reader stuff needs to be discussed on imp-dev before we proceed. For example, what's wrong with the BALL stuff you were playing with before?
BALL is dead. No activity on email list. No response to bugs. No move to
actually document their newest version even though it was released a
year ago. I don't think we want to tie ourselves to it. Sure we can take
it to IMP dev. No one else seems to care much :-)

People certainly care (they keep coming to talk to me, anyway). But I guess they don't like writing emails.

If that's really the case for BALL, then we should probably explore other possibilities, as per Frido's email. I know that BALL's Python interface is rather lacking, certainly.

I have looked around and asked around and couldn't find any decent PDB
readers (in C or C++) which are not buried in some huge project.
Why can't we link against this PDB library, rather than cut-and-pasting thousands of lines of code?
The nice thing about it is that it is small and simple and mine so we
can just ship it along with IMP and not worry about dependencies, name
collisions etc. I don't want people to have to get another library from
somewhere else, hence my desire to put a copy into imp svn. Soon enough
the lib will make it to Fedora Extras (whenever the next CGAL release
is) so we could potentially just use that.

If it's an external library, it should be a dependency, not part of IMP. Otherwise, regardless of whether you describe it as a "fork", it'll fork as versions of it elsewhere change. CGAL source control sounds like the best place for it if it's going to be part of CGAL. Embedded copies of other projects are a great way to ensure that bugs never get fixed (think of all the projects that bundle zlib).

and 3. from a brief reading, it looks like a not-very-good PDB library anyway (hard-coded atom names - what's with that?)
Well, it is either that or use strings which pushes the checks to
runtime rather than compile time. Adding to an enum and recompiling is
trivial (and adding a constant externally works just as well for most
purposes). Checking everywhere that an object falls in a small set of
allowed strings is hard (especially if you can't specify that set of
strings anywhere). BALL has hardcoded atoms for that matter (just a lot
more of them :-)

A PDB reader which needs to be recompiled for every new HETATM type is simply not going to work. See http://www.bmrb.wisc.edu/elec_dep/pdb_het_library/pdbhetn.htm for an idea of how many there are. Hao's project, for one, absolutely requires HETATMs. And I don't share your concern about runtime checks, since PDB reading is not performance-critical.

Any PDB reader that we adopt needs to be extensible at runtime. Even Modeller can do that. PyMOL, for example, has a library of HETATM fragments (stored as Python pickles, I believe). The reader also needs to be extensible enough to read PDBML or possibly mmCIF.
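
To make the runtime-extensibility point concrete, here is a rough sketch (not taken from IMP, Modeller, PyMOL, or the library under discussion). It keeps the set of accepted residue names in one place, so the set can still be checked everywhere, but lets callers add new HETATM types without recompiling. All names here (AtomRecord, ResidueRegistry, parse_atom_line) are hypothetical, and the column slices assume the standard fixed-width ATOM/HETATM layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AtomRecord:
    """Subset of the fields in a fixed-width ATOM/HETATM line."""
    record: str      # "ATOM" or "HETATM"
    atom_name: str
    res_name: str
    chain_id: str
    res_seq: int
    x: float
    y: float
    z: float


class ResidueRegistry:
    """Hypothetical registry: the one place where allowed names live."""

    def __init__(self, names=()):
        self._names = {n.strip().upper() for n in names}

    def register(self, name):
        # New residue or HETATM types are added at runtime, no rebuild.
        self._names.add(name.strip().upper())

    def __contains__(self, name):
        return name in self._names


def parse_atom_line(line, registry):
    """Parse one ATOM/HETATM line; return None for any other record."""
    record = line[0:6].strip()
    if record not in ("ATOM", "HETATM"):
        return None
    res_name = line[17:20].strip()
    if res_name not in registry:
        # The check happens at runtime, against a single declared set.
        raise ValueError(f"unknown residue name {res_name!r}")
    return AtomRecord(
        record=record,
        atom_name=line[12:16].strip(),
        res_name=res_name,
        chain_id=line[21],
        res_seq=int(line[22:26]),
        x=float(line[30:38]),
        y=float(line[38:46]),
        z=float(line[46:54]),
    )


# Sample line assembled field by field so the column widths are explicit.
SAMPLE = "".join([
    "HETATM",    # cols 1-6   record name
    " 1001",     # cols 7-11  serial
    " ",         # col 12
    "FE  ",      # cols 13-16 atom name
    " ",         # col 17     alternate location
    "HEM",       # cols 18-20 residue name
    " ",         # col 21
    "A",         # col 22     chain
    " 142",      # cols 23-26 residue number
    " ",         # col 27     insertion code
    "   ",       # cols 28-30
    "  12.345",  # cols 31-38 x
    "  -6.789",  # cols 39-46 y
    "   0.500",  # cols 47-54 z
])

registry = ResidueRegistry(["ALA", "GLY"])
registry.register("HEM")  # extend the allowed set at runtime
atom = parse_atom_line(SAMPLE, registry)
```

The same registry could just as well be filled from a file of HETATM definitions at startup; the point is only that the allowed set is data rather than a compiled-in enum.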

Everybody and his dog has written a PDB reader. Andrej wrote one. Maya wrote one. Javi wrote one. Keren wrote one. There's one in Biopython, one in BALL, one in PyMOL, one in Chimera, and one in Biskit, all free and widely available software. I can't believe we have to burden the world with another one.

	Ben
--
                      http://salilab.org/~ben/
"It is a capital mistake to theorize before one has data."
	- Sir Arthur Conan Doyle