
Re: [IMP-users] [Coarse grain modeling] a bunch of questions and comments regarding the nup84 cg example



   1. I think there is a minor bug in the documentation for IMP.atom.create_protein which states that a hierarchy of balls is created, and "The balls are held together by a ConnectivityRestraint with the given spring constant.". The second assertion seems erroneous to me.
Indeed, the function changed but not the docs.

I would probably also add some details on the hierarchy returned by the first occurrence of IMP.atom.create_protein, since it differs from the second one (an extra hierarchical level seems to be inserted).
I'll take a look. I've been purposefully vague about the internal structure of the hierarchy returned by various methods to increase flexibility, but I'm not sure if that is still worthwhile.


   2. In the nup84 cg example, the connection between the balls of the same protein is made in
        h=IMP.atom.create_protein(m, name, resolution, ds)
with a call to
        r=IMP.atom.create_connectivity_restraint([IMP.atom.Selection(c ) for c in h.get_children()],k)
…And after dissecting the function's code I am still wondering what exactly is done.
In my understanding, when applied to a hierarchy returned by IMP.atom.create_protein(), the generated restraint is always created through a ConnectingPairContainer, which connects balls in a tree-like structure, and I cannot see how and when this tree is built.
Plus, I was basically expecting something much simpler, such as a distance restraint applied between successive fragments in the molecule.
It is supposed to apply the simplest restraint it can based on what is passed. That is, one of:
- distance restraint
- KClosePairsPairScore-based restraint
- ConnectingPairContainer with a distance pair score
- connectivity restraint

If you have a case where it isn't doing the simplest, let me know.


   3. Nothing very important, just a bit noisy/confusing: in the create_protein() sub-function, the leaves variable
        leaves= IMP.atom.get_leaves(h)
        is never used… So why not just strip it?
I don't see that. Where is it?


   4. Minor bug in the documentation: some occurrences of create_connectivity_restraint() have no return type listed.
Where do you see this?


   5. When it comes to inserting inter-molecule restraints, I think I understand the meaning of the two functions add_connectivity_restraint and add_distance_restraint, but I'd like to be sure:
the first one forces the specified molecules to be connected in some way (technically: consider each molecule as a node in a complete graph, weight each edge (A,B) with the smallest distance computed for a pair of particles belonging to AxB (thanks to KClosePairsPairScore), then compute the MST and derive the score (thanks to ConnectivityRestraint)).
the second one merely favors the two molecular hierarchies being in exact contact (given the two molecules A and B, test the closest pair of particles in AxB, and return the score for that pair of particles).
In practice they are more or less the same thing (modulo implementation details) when both passed a pair of selections. The names are just different for consistency with other parts of IMP. I'm not sure if that was a good decision.
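
To make the description in the question concrete, here is a minimal plain-Python sketch of a minimum-spanning-tree connectivity score. It does not use IMP; the ball representation as (x, y, z, radius), the harmonic penalty, and every function name are assumptions made for illustration only, not IMP's implementation.

```python
import math
from itertools import combinations


def surface_distance(ball_a, ball_b):
    """Gap between two balls given as (x, y, z, radius); 0 if they touch or overlap."""
    gap = math.dist(ball_a[:3], ball_b[:3]) - ball_a[3] - ball_b[3]
    return max(gap, 0.0)


def closest_pair_score(mol_a, mol_b, k=1.0):
    """Harmonic penalty (an assumed form) on the closest pair of balls in mol_a x mol_b."""
    d = min(surface_distance(a, b) for a in mol_a for b in mol_b)
    return 0.5 * k * d * d


def connectivity_score(molecules, k=1.0):
    """Sum of closest-pair scores over a minimum spanning tree (Kruskal's algorithm)."""
    n = len(molecules)
    # Complete graph: one weighted edge per pair of molecules, cheapest first.
    edges = sorted(
        (closest_pair_score(molecules[i], molecules[j], k), i, j)
        for i, j in combinations(range(n), 2)
    )
    parent = list(range(n))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    total = 0.0
    for score, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # edge joins two components, so it is in the MST
            parent[root_i] = root_j
            total += score
    return total
```

With exactly two molecules the spanning tree has a single edge, so in this sketch connectivity_score([A, B]) equals closest_pair_score(A, B), which is consistent with the remark that the two functions behave alike for a pair of selections.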




II. Concerning the sampling part

I am not sure I understand how the MCCG sampler works.
In my understanding, the sampler uses an optimizer to improve a set of initial random solutions, hence generating several putative solutions, or at least reasonably good ones (a sample).
In this context, the two lines :
    sampler.set_number_of_conjugate_gradient_steps(100)
    sampler.set_number_of_monte_carlo_steps(50)
merely control each optimization step, whereas
    sampler.set_number_of_attempts(40)
controls the number of initial (or final?) retained solutions. Am I correct, or at least close enough?
Basically there are three nested loops:
1) attempts
2) MC steps
3) CG steps

I should add that to the docs to make it clearer.
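
The three nested loops can be sketched in plain Python as follows. This is only an illustration of the loop structure, not the MCCGSampler code: plain gradient descent stands in for conjugate gradients, and the random starting point, Gaussian move, Metropolis acceptance rule, and all names and default values are assumptions.

```python
import math
import random


def mccg_sketch(score, gradient, dim, attempts, mc_steps, cg_steps,
                step=0.01, temperature=1.0, seed=0):
    """Return one configuration per attempt (whether all are kept is an assumption)."""
    rng = random.Random(seed)
    samples = []
    for _ in range(attempts):              # 1) attempts: each starts from a random point
        x = [rng.uniform(-10.0, 10.0) for _ in range(dim)]
        for _ in range(mc_steps):          # 2) Monte Carlo steps: perturb, refine, accept or reject
            trial = [xi + rng.gauss(0.0, 1.0) for xi in x]
            for _ in range(cg_steps):      # 3) local refinement (gradient descent stand-in for CG)
                g = gradient(trial)
                trial = [ti - step * gi for ti, gi in zip(trial, g)]
            delta = score(trial) - score(x)
            # Metropolis criterion: always accept improvements, sometimes accept worse moves.
            if delta <= 0 or rng.random() < math.exp(-delta / temperature):
                x = trial
        samples.append(x)
    return samples
```

In this sketch the gradient is evaluated attempts * mc_steps * cg_steps times, so with the values from the example (40, 50, 100) the innermost step would run 200,000 times.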



III. Concerning the analysis part of the example :

1.  The last argument of :
    embed= IMP.statistics.ConfigurationSetXYZEmbedding(cs,
                 IMP.container.ListSingletonContainer(IMP.atom.get_leaves(all)), True)
is not documented.
Even though the name of the variable "bool align=false" is quite suggestive, I have trouble guessing what type of alignment is meant here. Maybe a simple question can help clarify my problem: let's say I have two configurations that can be derived from one another through a simple rotation∘translation; does setting this parameter to true give me the same embedding for both conformations, and hence classify both in the same cluster?
It is whether rigid alignment is performed. Currently, this alignment is against the first configuration, which may not be the best option. I'll add a note to the docs.
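
As an illustration of why rigid alignment against a reference answers the question, here is a small plain-Python sketch. It is a 2D simplification (IMP works in 3D, where an SVD-based superposition would typically be used), and the function names are hypothetical, not part of IMP.

```python
import math


def centroid(points):
    """Mean position of a list of (x, y) points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def align_to_reference(points, reference):
    """Least-squares rigid superposition of points onto reference (2D only)."""
    cp, cr = centroid(points), centroid(reference)
    centered = [(x - cp[0], y - cp[1]) for x, y in points]
    ref_centered = [(x - cr[0], y - cr[1]) for x, y in reference]
    # The optimal rotation angle maximizes cos(t) * dot + sin(t) * cross.
    dot = sum(px * rx + py * ry for (px, py), (rx, ry) in zip(centered, ref_centered))
    cross = sum(px * ry - py * rx for (px, py), (rx, ry) in zip(centered, ref_centered))
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    # Rotate about the origin, then translate onto the reference centroid.
    return [(c * px - s * py + cr[0], s * px + c * py + cr[1]) for px, py in centered]
```

If a second configuration is just the first one rotated and translated, aligning it to the first recovers the same coordinates, so a coordinate-based embedding would place both at the same point.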



2. In the analyze_conformations() function, I think the line
        cs.load_configuration(i)
ought to be replaced by
        cs.load_configuration( cluster.get_cluster_representative(i) )
Yup. Thanks.
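
The difference between the two lines is that i counts clusters, while load_configuration expects a configuration index. A minimal sketch with stand-in classes (hypothetical, written only so the loop runs without IMP; they mimic just the two calls quoted above plus an assumed get_number_of_clusters()):

```python
class StandInClustering:
    """Hypothetical stand-in: stores one representative configuration index per cluster."""

    def __init__(self, representatives):
        self._representatives = list(representatives)

    def get_number_of_clusters(self):
        return len(self._representatives)

    def get_cluster_representative(self, i):
        return self._representatives[i]


class StandInConfigurationSet:
    """Hypothetical stand-in: only remembers which configuration was loaded last."""

    def __init__(self):
        self.loaded = None

    def load_configuration(self, index):
        self.loaded = index


def loaded_representatives(cs, clustering):
    """Load the representative of every cluster and report which indices were loaded."""
    loaded = []
    for i in range(clustering.get_number_of_clusters()):
        # i is a cluster index; map it to a configuration index before loading.
        cs.load_configuration(clustering.get_cluster_representative(i))
        loaded.append(cs.loaded)
    return loaded
```

With representatives [7, 21, 33], the corrected loop loads configurations 7, 21 and 33, whereas cs.load_configuration(i) would have loaded 0, 1 and 2.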