A PDB file might contain the chains of an assembly; not every file
with multiple chains is an NMR structure.
I think read_pdb is fine as is, but if we do decide to follow your
suggestion, I vote for an additional function: read_pdb_assembly.
Currently it makes a PROTEIN consisting of anything it finds in the
PDB (including random other molecules), which then throws an exception
when it discovers that this is not a valid hierarchy :-)
So something needs to be fixed.
A fixed version of the current approach (returning a UNIVERSE analogue
containing the chains and other molecules) is annoying whenever you
want to handle the different molecules separately, as you have to
remove each of them from the current hierarchy before doing something
with them (rather than just sticking them into the new place). We
could provide helper functions to merge UNIVERSES (or some alternate
name, as Javi doesn't like the name). You also have to make sure you
get rid of all the water and other stuff in there yourself.
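
To make the helper idea concrete, here is a rough sketch of what
"merge" and "get rid of the water" helpers could look like. Universe,
Molecule, strip_waters and merge_universes are all made-up names for
illustration, not existing classes or functions in our code, and the
"kind" tag is an assumption about how molecules would be labelled.

```python
from dataclasses import dataclass, field


@dataclass
class Molecule:
    # Hypothetical stand-in for one molecule in a hierarchy.
    name: str                      # e.g. a chain ID or ligand residue name
    kind: str                      # assumed tag: "protein", "ligand" or "water"
    atoms: list = field(default_factory=list)


@dataclass
class Universe:
    # Hypothetical flat container of molecules.
    molecules: list = field(default_factory=list)


def strip_waters(universe):
    """Return a new Universe without the molecules tagged as water."""
    return Universe([m for m in universe.molecules if m.kind != "water"])


def merge_universes(*universes):
    """Return a new Universe holding the molecules of all inputs, in order."""
    merged = Universe()
    for u in universes:
        merged.molecules.extend(u.molecules)
    return merged
```

With something like this, moving molecules between containers becomes
merge_universes(strip_waters(a), strip_waters(b)) instead of removing
each molecule from its old hierarchy by hand.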
So maybe a better solution is:
- read_pdb which returns a UNIVERSE consisting of everything found in
the first model in the PDB and makes people sort out the proteins,
ligands and waters. Making this switch broke a lot of example code, so
it might not be a trivial change.
- read_protein_from_pdb which reads the first chain and returns a
PROTEIN, for people who just want a protein and don't want to worry
about the junk.
If we feel like it, we could then provide
- read_molecules_from_pdb which returns a Hierarchies, with one entry
for each molecule. This can be implemented later (but is easy).
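
A minimal sketch of how the three readers could relate to each other,
using plain Python over PDB text lines rather than our real hierarchy
classes. Everything here is an assumption for illustration: the
return types (a dict and lists of record lines instead of UNIVERSE,
PROTEIN and Hierarchies), the rule that ATOM records count as
"protein" (which also catches nucleic acids), and the water residue
names.

```python
# Residue names treated as water (an assumption; extend as needed).
WATER_NAMES = {"HOH", "WAT", "DOD"}


def _first_model_records(lines):
    """Yield the ATOM/HETATM lines of the first model only."""
    for line in lines:
        record = line[:6].strip()
        if record == "ENDMDL":
            break  # everything after this belongs to later models
        if record in ("ATOM", "HETATM"):
            yield line


def read_pdb(lines):
    """Group every atom record of the first model by molecule.

    Returns a dict (insertion ordered) mapping a key to a list of lines:
    ("protein", chain), ("water", chain, resseq) or
    ("ligand", chain, resname, resseq).
    """
    universe = {}
    for line in _first_model_records(lines):
        record = line[:6].strip()
        res_name = line[17:20].strip()   # PDB columns 18-20
        chain_id = line[21]              # PDB column 22
        res_seq = line[22:26].strip()    # PDB columns 23-26
        if record == "ATOM":
            key = ("protein", chain_id)
        elif res_name in WATER_NAMES:
            key = ("water", chain_id, res_seq)
        else:
            key = ("ligand", chain_id, res_name, res_seq)
        universe.setdefault(key, []).append(line.rstrip("\n"))
    return universe


def read_protein_from_pdb(lines):
    """Return the records of the first protein chain, ignoring the rest."""
    for key, atoms in read_pdb(lines).items():
        if key[0] == "protein":
            return atoms
    raise ValueError("no protein chain found in the first model")


def read_molecules_from_pdb(lines):
    """Return one list of records per molecule found in the first model."""
    return list(read_pdb(lines).values())
```

The point of the sketch is that read_protein_from_pdb and
read_molecules_from_pdb are thin wrappers over read_pdb, which is why
the last one should be easy to add later.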