I think it can make sense. However, since I've realized there are some problems with what we (or at least I :-) said yesterday, and I want to make sure we are all on the same page, I'd like to go into a bit more detail. Would the idea be something like the following?
class KMeansLloydsClustering: public PartitionalClustering {
public:
  KMeansLloydsClustering(EmbeddingAdaptor embed, unsigned int k,
                         unsigned int num_iterations);
  void refine(unsigned int num_iterations);
};
so you would do:
clu = KMeansLloydsClustering([[0,1], [1,2]], 2, 10)
print clu.get_cluster(0), clu.get_cluster(1)
clu.refine(10)
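To make the intended behaviour concrete, here is a rough pure-Python sketch of what I imagine the constructor and refine() doing. It is only a toy stand-in, not the real implementation: the plain list of points takes the place of the EmbeddingAdaptor, and the seed argument, the random choice of initial centers, and the private helper names are all assumptions of mine.

```python
import random


class KMeansLloydsClustering:
    """Toy stand-in for the proposed class (not the real implementation)."""

    def __init__(self, points, k, num_iterations, seed=0):
        # `points` plays the role of the EmbeddingAdaptor argument.
        # `seed` is a hypothetical extra parameter, for reproducibility only.
        self.points = [list(p) for p in points]
        if not 1 <= k <= len(self.points):
            raise ValueError("k must be between 1 and the number of points")
        rng = random.Random(seed)
        # Assumed initialization: k distinct input points become the centers.
        start = rng.sample(range(len(self.points)), k)
        self.centers = [list(self.points[i]) for i in start]
        self.assignment = [0] * len(self.points)
        self._assign()
        self.refine(num_iterations)

    def _assign(self):
        # Assign every point to its nearest center (squared Euclidean distance).
        for i, p in enumerate(self.points):
            dists = [sum((a - b) ** 2 for a, b in zip(p, c))
                     for c in self.centers]
            self.assignment[i] = dists.index(min(dists))

    def _update_centers(self):
        # Move each center to the mean of its members; an empty cluster
        # keeps its previous center.
        for j in range(len(self.centers)):
            members = [self.points[i]
                       for i, a in enumerate(self.assignment) if a == j]
            if members:
                self.centers[j] = [sum(col) / len(members)
                                   for col in zip(*members)]

    def refine(self, num_iterations):
        # Run further Lloyd iterations starting from the current state.
        for _ in range(num_iterations):
            self._update_centers()
            self._assign()

    def get_cluster(self, j):
        # Indices of the points currently assigned to cluster j.
        return [i for i, a in enumerate(self.assignment) if a == j]
```

The point of the sketch is that refine() continues from the current centers rather than restarting, so the constructor's num_iterations and later refine() calls simply add up.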
Whatever we end up doing here should be mirrored on the Metric side.
BTW, I split up various headers, so make sure you update before you dive in.
Daniel and I briefly discussed consolidating the clustering code in statistics. I am not touching anything outside the statistics / kmeans module for now, but please tell me whether it is agreed that things will work as follows:
* The Embedding family of classes (used to embed data in vector form) will remain as is
* There will be a "Clustering" class from which all clustering algorithms will derive, with:
  * a constructor that takes either an Embedding class, or an EmbeddingAdaptor for implicit conversions from, e.g., vector form