
Re: [IMP-dev] Loading multiple molecules from a pdb file




On Oct 8, 2009, at 7:02 PM, Keren Lasker wrote:

A PDB file might contain the chains of an assembly; not all multi-chain files are NMR structures. I think read_pdb is fine as is, but if we decide to follow your suggestion, I vote for an additional function: read_pdb_assembly.
Currently it makes a PROTEIN consisting of anything it finds in the PDB (including random other molecules), which then throws an exception when it discovers that this is not a valid hierarchy :-)

So something needs to be fixed.

A fixed version of the current approach (returning a UNIVERSE analogue containing the chains and other molecules) is annoying whenever you want to handle the different molecules separately, as you have to remove each of them from the current hierarchy before doing something with them (rather than just sticking them into the new place). We could provide helper functions to merge UNIVERSES (or some alternate name, as Javi doesn't like the name). You also have to make sure you get rid of all the water and other stuff in there yourself.

So maybe a better solution is:
- read_pdb which returns a UNIVERSE consisting of everything found in the first model in the PDB and makes people sort out the proteins, ligands and waters. Making this switch would break a lot of example code, so it might not be a trivial change.

- read_protein_from_pdb which reads the first chain and returns a PROTEIN, for people who just want a protein and don't want to worry about the junk

If we feel like it, we could then provide
- read_molecules_from_pdb which returns a Hierarchies, with one hierarchy for each molecule. This can be implemented later (but is easy).
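
To make the proposed split concrete, here is a rough sketch in plain Python. It is not IMP code and does not use the IMP API: the function names are borrowed from the list above, while the return types (a dict standing in for a UNIVERSE, lists of raw PDB lines standing in for hierarchies) and the choice to drop waters in read_molecules_from_pdb are assumptions made purely for illustration. It relies only on the fixed-column layout of ATOM/HETATM records.

```python
# Illustrative stand-in only: plain dicts and lists, not IMP hierarchies.
WATER_NAMES = {"HOH", "WAT"}  # assumed set of water residue names


def _first_model_atoms(lines):
    """Yield (record, res_name, chain_id, line) for atoms of the first model."""
    for line in lines:
        record = line[0:6].strip()
        if record == "ENDMDL":
            break  # ignore every model after the first
        if record in ("ATOM", "HETATM"):
            yield record, line[17:20].strip(), line[21:22], line


def read_pdb(lines):
    """Return a 'universe' holding everything in the first model, unsorted by use."""
    universe = {"chains": {}, "ligands": [], "waters": []}
    for record, res_name, chain_id, line in _first_model_atoms(lines):
        if record == "ATOM":
            universe["chains"].setdefault(chain_id, []).append(line)
        elif res_name in WATER_NAMES:
            universe["waters"].append(line)
        else:
            universe["ligands"].append(line)
    return universe


def read_protein_from_pdb(lines):
    """Return only the first chain, skipping waters and other hetero atoms."""
    chains = read_pdb(lines)["chains"]
    if not chains:
        raise ValueError("no ATOM records found in the first model")
    return next(iter(chains.values()))


def read_molecules_from_pdb(lines):
    """Return one entry per molecule: each chain, then each hetero residue.

    Waters are dropped here; that is an assumption of this sketch.
    """
    universe = read_pdb(lines)
    molecules = list(universe["chains"].values())
    ligands = {}
    for line in universe["ligands"]:
        key = (line[21:22], line[22:26], line[17:20])  # chain, resSeq, resName
        ligands.setdefault(key, []).append(line)
    molecules.extend(ligands.values())
    return molecules
```

With this layout, someone who just wants a protein calls read_protein_from_pdb and never sees the junk, while read_pdb keeps everything and leaves the sorting to the caller.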