
Re: Helper functions

To: imp@salilab.org
Subject: Re: Helper functions
From: Daniel Russel <drussel@salilab.org>
Date: Fri, 2 Nov 2007 17:27:25 -0700

Rather than answer Ben's questions one by one, I'll try to address things in bulk.

The route we chose to go down for representing particles in IMP is very non-structured. You just give it a string and get back a value. This makes things very flexible (for example I can trivially implement the hierarchy on top of it) but makes it hard to maintain invariants. The best way to maintain them is to provide helper functions and beat users into using them. For example an add_child helper function adds a child to a node and makes sure the parent_index is correct. A compute_coordinates_from_center_of_mass_of_children function does exactly that. Unfortunately, there is no way of making sure that it gets called every time the set of children changes (although add_child/remove_child could of course be made more clever and we can use a State object to make sure it gets called after coordinates are updated by the optimizer).
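To make the invariant concrete, here is a minimal sketch of such a helper. ParticleTable and the attribute names "num_children" and "child_N" are assumptions made up for illustration; only add_child and parent_index come from the discussion above, and none of this is the actual IMP interface.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a string-keyed particle store (not IMP's real API).
struct ParticleTable {
  // One attribute map per particle: attribute name -> integer value.
  std::vector<std::map<std::string, int>> attrs;

  // Creates a particle with no children and no parent; returns its index.
  int add_particle() {
    attrs.emplace_back();
    attrs.back()["num_children"] = 0;
    attrs.back()["parent_index"] = -1;  // -1 means "no parent"
    return static_cast<int>(attrs.size()) - 1;
  }
};

// Helper that updates both sides of the parent/child link in one place,
// so callers cannot set a child without also setting its parent_index.
inline void add_child(ParticleTable& t, int parent, int child) {
  assert(t.attrs[child]["parent_index"] == -1 && "child already has a parent");
  const int n = t.attrs[parent]["num_children"];
  t.attrs[parent]["child_" + std::to_string(n)] = child;
  t.attrs[parent]["num_children"] = n + 1;
  t.attrs[child]["parent_index"] = parent;
}
```

Nothing stops a caller from writing "child_3" into the table by hand, which is exactly why users have to be pushed toward the helper.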

Another issue is that all lookups involve searching for a string in a table. This can be expensive. The cost of generating the string should be trivial as the strings can easily be cached (I do so in my get_child helper function).
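A rough sketch of the kind of caching I mean follows. The names child_key and AttributeTable are hypothetical, and this get_child is a simplified illustration rather than the real helper. The cache is not thread-safe.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <unordered_map>

// Hypothetical attribute table for a single particle: name -> integer value.
using AttributeTable = std::unordered_map<std::string, int>;

// Returns the attribute name for the i-th child ("child_0", "child_1", ...).
// Each name is built once and reused, so repeated calls do no formatting.
// A deque is used because push_back leaves references to existing elements valid.
inline const std::string& child_key(std::size_t i) {
  static std::deque<std::string> cache;
  while (cache.size() <= i) {
    cache.push_back("child_" + std::to_string(cache.size()));
  }
  return cache[i];
}

// Looks up the i-th child. The string is cached, but the table lookup
// (hashing and comparing the key) still happens on every call.
inline int get_child(const AttributeTable& attrs, std::size_t i) {
  const auto it = attrs.find(child_key(i));
  assert(it != attrs.end() && "no such child");
  return it->second;
}
```

The caching removes the concatenation cost only; the lookup cost remains.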

The alternative would have been to use an object hierarchy and have the objects manage everything internally. Then we can have all sorts of types of objects which allow you to get and set attributes directly (hiding the Model_data object and the indirection provided by the IntIndex sort of things from users of the various Particle classes). Then we would have a GeometricParticle which has methods x() and y() which return floats for the coordinates and a HierarchyParticle which has child(i) etc. The main disadvantage is that you have to cast all over the place (but now that C++ has RTTI this isn't too bad). The other disadvantage is that loading data from files is more tricky as the mapping between the text string in the file and the attribute no longer happens for free (you have to know "X" corresponds to the function set_x()). We can provide macros to make this mapping easier though.
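As a sketch of what the class-based version might look like: the Particle base class, add_child, number_of_children, set_y and set_from_file_field are all assumed names for illustration, and an explicit lookup table stands in here for the macros mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical polymorphic base so that dynamic_cast (RTTI) works.
class Particle {
 public:
  virtual ~Particle() = default;
};

class GeometricParticle : public Particle {
 public:
  float x() const { return x_; }
  float y() const { return y_; }
  void set_x(float v) { x_ = v; }
  void set_y(float v) { y_ = v; }

 private:
  float x_ = 0.0f;
  float y_ = 0.0f;
};

class HierarchyParticle : public Particle {
 public:
  void add_child(Particle* c) { children_.push_back(c); }
  Particle* child(std::size_t i) const {
    assert(i < children_.size());
    return children_[i];
  }
  std::size_t number_of_children() const { return children_.size(); }

 private:
  std::vector<Particle*> children_;  // non-owning pointers
};

// The name-to-setter mapping that the string-based design gets for free
// has to be written down explicitly here. Returns false for unknown names.
inline bool set_from_file_field(GeometricParticle& p, const std::string& name,
                                float value) {
  static const std::map<std::string, void (GeometricParticle::*)(float)>
      setters = {{"X", &GeometricParticle::set_x},
                 {"Y", &GeometricParticle::set_y}};
  const auto it = setters.find(name);
  if (it == setters.end()) return false;
  (p.*(it->second))(value);
  return true;
}
```

Getting the coordinates of a child then needs a cast such as dynamic_cast<GeometricParticle*>(h.child(0)), which returns a null pointer if the child is not geometric.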

Personally I think the class based approach is better, but Brett liked databases and went with the former, string-based approach. The one thing I think we should not do is mix the two. Either everything is an object and you get things through C++ calls or everything is as it is currently and you manipulate things through helper functions. If we mix, it is hard to keep track of what everything is and to make sure that things like saving and restoring state happen properly, and it is just ugly.


On Nov 2, 2007, at 4:54 PM, Ben Webb wrote:

Daniel Russel wrote:
- Is Residue just an example of a member of a hierarchy, or would chains and proteins be treated differently?
A tree node is a tree node. It can happen to also have some biological function, but that is orthogonal to being a hierarchy node.
I think you misunderstood my question.

Quite likely :-)

The wiki page has a description of what attributes a Residue has, but nothing about chains or proteins, so I was just trying to ascertain whether you just put in Residue as an example (and just haven't done chains/proteins yet) or whether you think they should be treated specially. I think your answer means the former, yes?

Yes, the former. I just haven't had any reason to add more fields to chains or proteins other than what they have from being in the hierarchy and being a generic object (i.e. they have a name, a type and children and parents).

Well, sure, but let's say I have a rigid body containing 500 atoms. It has 7 attributes - the xyz of its center of mass, and an orientation quaternion. These would both have to be updated if particles were added to or removed from the rigid body. By making these 'dumb' attributes, the only way to do that is to do the update every time you want to use the rigid body, which seems inefficient to me. In contrast, a ParticleContainer object could have a method to add/remove particles, so that it could do the update when necessary.

To not answer your question, for updates to locations caused by the optimizer, a State object would handle things quite nicely.

I see your point that we need somewhere to put the functionality to call it when you add or remove a point. Personally I would prefer a free floating function that you call passing a particle in the hierarchy (like my hierarchy helper functions for getting the ith child). Then you could easily provide your own function if you want to do something slightly different or could apply the "compute center of mass of all children" function to a body which didn't happen to be rigid.
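Something along these lines, as a sketch only. For brevity it uses a plain Node struct with an explicit child list instead of the string-keyed table, and Node, its mass field and the function name are assumptions for illustration.

```cpp
#include <cassert>
#include <vector>

// Hypothetical simplified node: coordinates, a mass, and child indices.
struct Node {
  double x = 0.0, y = 0.0, z = 0.0;
  double mass = 1.0;
  std::vector<int> children;  // indices into the same node vector
};

// Free function: sets the parent's coordinates to the mass-weighted mean
// of its direct children. It knows nothing about rigid bodies, so it can
// be applied to any hierarchy node, or replaced by a user's own variant.
inline void set_coordinates_from_center_of_mass_of_children(
    std::vector<Node>& nodes, int parent) {
  Node& p = nodes[parent];
  assert(!p.children.empty());
  double total = 0.0, cx = 0.0, cy = 0.0, cz = 0.0;
  for (int c : p.children) {
    const Node& n = nodes[c];
    total += n.mass;
    cx += n.mass * n.x;
    cy += n.mass * n.y;
    cz += n.mass * n.z;
  }
  assert(total > 0.0);
  p.x = cx / total;
  p.y = cy / total;
  p.z = cz / total;
}
```

The caller decides when to invoke it: after adding or removing a child, or from a State object after each optimizer step.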

- If I wanted to pull out every atom in residue 1, I'd really have to scan through every single particle to figure out which ones a) have a residue attribute and b) have it = 1 ? That seems inefficient.
You would find the particle for residue 1 and get "child_0", "child_1"... I don't think you should ever have to scan through all particles (and, personally, I don't think you should be able to as it would encourage bad habits).
Ah, I see - it wasn't clear to me from the wiki page. Then my concerns here are 1) you have the information in two locations, so you will need to do consistency checks to make sure that the child/parent pointers all point to the right thing;

Yes, that is true.

2) that seems grossly inefficient - imagine a container with 10000 atoms, doing the string concatenation and formatting to get child_0 through child_9999, then the hashtable lookup, as opposed to just iterating through a std::vector<int>.

Well, you wouldn't actually do the string concatenation since that can be trivially cached (in fact I currently do it in my helper function). You would have to do the table lookup though. This is a general problem with our architecture which may prove to be a problem in the long run. Even if we special case the children in the hierarchy, you still have the same problem when you want to do anything other than look at the children/parents of a hierarchy node (such as the coordinates).
