Re: [IMP-dev] design discussions

To: List for IMP development <imp-dev@salilab.org>

Subject: Re: [IMP-dev] design discussions

From: Javier Ángel Velázquez Muriel <javi@salilab.org>

Date: Fri, 9 Oct 2009 10:02:34 -0700


Reply-to: List for IMP development <imp-dev@salilab.org>

2009/10/9 Daniel Russel <drussel@gmail.com>

As a follow-up to my email, Javi had raised issues concerning the choice of names in HierarchyType. Having such a type label is problematic (a protein is a molecule, but a single label makes that relationship complicated to express). I propose removing the Hierarchy::get_type() method and the HierarchyType type and replacing them with methods like
- get_is_protein() (true for any piece of protein)
- get_is_molecule() (true for any molecule)
- get_is_residue()
- get_is_atom()
- get_is_assembly() (not sure if we want this)
etc...
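
To make the predicate idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the Node class, the "kinds" labels and the ancestor walk are hypothetical and are not IMP's Hierarchy API. It only shows how separate get_is_*() methods let a node be both a protein and a molecule, and let a residue or atom report that it is a piece of a protein, which a single type label cannot do.

```python
class Node:
    """Hypothetical hierarchy node for illustration; not IMP's Hierarchy class."""

    def __init__(self, name, kinds=()):
        self.name = name
        self.kinds = frozenset(kinds)  # labels that apply to this node itself
        self.parent = None
        self.children = []

    def add_child(self, child):
        child.parent = self
        self.children.append(child)
        return child

    def _self_or_ancestor_has(self, kind):
        # Walk up the tree: true if this node or any ancestor carries the label.
        node = self
        while node is not None:
            if kind in node.kinds:
                return True
            node = node.parent
        return False

    def get_is_protein(self):
        # True for any piece of a protein (the protein itself, its residues, its atoms).
        return self._self_or_ancestor_has("protein")

    def get_is_molecule(self):
        # True only for nodes that are themselves molecules.
        return "molecule" in self.kinds

    def get_is_residue(self):
        return "residue" in self.kinds

    def get_is_atom(self):
        return "atom" in self.kinds


# A protein is labelled as both a protein and a molecule.
protein = Node("protein_A", kinds=("protein", "molecule"))
residue = protein.add_child(Node("ALA1", kinds=("residue",)))
atom = residue.add_child(Node("CA", kinds=("atom",)))
water = Node("HOH", kinds=("molecule",))
```

With this layout, atom.get_is_protein() is true while atom.get_is_molecule() is false, and water is a molecule without being a protein.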

Something else to think about for Tuesday.

I'm not entirely sure I agree (witness the last developers' meeting :-), but I am ambivalent. It could be structured better, though, which would help things converge faster.

That said, we have several outstanding questions:
- do we have PROTEINs which can contain CHAINs, or just PROTEINs (which are chains)? For this we should probably just find something authoritative and use it. I don't much care either way.

- what are the most useful things for one or more read_pdb functions to return? For this we should come up with standard use cases. I would propose a few here:
  - someone is running through lots of PDB files and wants to load one protein from each file. To do this, it would be nice to have a function which loads a protein from the PDB file and returns a hierarchy containing only that protein. Whether this protein has one or more chains depends on the answer to the first question.
  - load the whole structure from a PDB file, complete with many proteins, ligands and other molecules. For this it is useful to be able to read everything from one PDB model record.
  - take one piece of the PDB file and use it (such as a chain or ligand). For this it is nice not to have to dissect a hierarchy.
  - load a bunch of model records from a single PDB file and deal with all the molecules in each record.

Any other cases? Think about it and we will discuss it on Tuesday.

A proposal to think about, which handles the above cases, is:
- one function which reads a protein from a pdb
- one function which reads everything from one model record in a pdb and returns it in a list/vector
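
As a rough sketch of what those two functions could look like, here is some plain Python. The names read_model and read_protein, their signatures and the dict-based return values are all hypothetical and are not existing IMP functions. The sketch also assumes, for simplicity, that a protein is a single chain (the first open question above) and that every ATOM record belongs to a protein, which is not true for nucleic acids.

```python
def _parse_models(lines):
    """Split PDB lines into models; each model is a list of atom dicts."""
    models, current = [], []
    for line in lines:
        record = line[0:6].strip()
        if record == "MODEL":
            current = []
        elif record == "ENDMDL":
            models.append(current)
            current = []
        elif record in ("ATOM", "HETATM"):
            # Fixed-column fields from the PDB format.
            current.append({
                "record": record,
                "name": line[12:16].strip(),
                "res_name": line[17:20].strip(),
                "chain": line[21],
                "res_seq": int(line[22:26]),
            })
    if current:  # file without MODEL/ENDMDL records
        models.append(current)
    return models


def read_model(lines, model_index=0):
    """Hypothetical: return every molecule in one model record as a list."""
    molecules = {}
    for atom in _parse_models(lines)[model_index]:
        if atom["record"] == "ATOM":
            key = ("protein", atom["chain"])  # assumption: one chain = one protein
        else:
            # Each hetero residue (ligand, water, ...) is its own molecule.
            key = ("other", atom["chain"], atom["res_name"], atom["res_seq"])
        entry = molecules.setdefault(
            key, {"kind": key[0], "chain": atom["chain"], "atoms": []})
        entry["atoms"].append(atom)
    return list(molecules.values())


def read_protein(lines, chain_id, model_index=0):
    """Hypothetical: return only the protein with the given chain ID."""
    for molecule in read_model(lines, model_index):
        if molecule["kind"] == "protein" and molecule["chain"] == chain_id:
            return molecule
    raise KeyError("no protein with chain %r in model %d" % (chain_id, model_index))
```

Under these assumptions, the first and third use cases go through read_protein (or picking one entry out of the read_model list), the second is a single read_model call, and the fourth is a loop over model_index.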

_______________________________________________
IMP-dev mailing list
IMP-dev@salilab.org
https://salilab.org/mailman/listinfo/imp-dev