The MS Sequence Clustering algorithm has a parameter named “CLUSTER_COUNT”, and its definition states that if we set this value to 0, the algorithm automatically chooses the best number of clusters.
So far, so good.
Assume we run the algorithm with CLUSTER_COUNT=0 and get 5 clusters.
My question is:
How does the algorithm decide that the best number of clusters is 5? How can one show that 5 clusters are better than 4 or 6? In other words, what goodness measure does the algorithm use?
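For context on what such a goodness measure can look like: the SQL Server documentation does not spell out the internal score, so the sketch below is NOT the actual Sequence Clustering implementation. It only illustrates one common model-selection approach, where clustering is run for several candidate values of k and the k with the best silhouette score (a standard cluster-quality measure) is kept. The data, the simple 1-D k-means, and the score choice are all illustrative assumptions.

```python
import random

# Illustrative only: NOT how SQL Server's Sequence Clustering works
# internally. It shows one common "goodness" measure (the silhouette
# score) that a model-selection loop could use to pick a cluster count.
random.seed(42)

# Hypothetical 1-D data with two well-separated groups.
data = sorted([random.gauss(0, 1) for _ in range(100)]
              + [random.gauss(10, 1) for _ in range(100)])

def kmeans_1d(points, k, iters=50):
    """Plain k-means on 1-D data with deterministic quantile seeding."""
    n = len(points)
    centers = [points[int((i + 0.5) * n / k)] for i in range(k)]
    labels = [0] * n
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: (p - centers[j]) ** 2)
                  for p in points]
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centers[j] = sum(members) / len(members)
    return labels

def mean_silhouette(points, labels, k):
    """Average silhouette: (b - a) / max(a, b) per point, where
    a = mean distance to own cluster, b = mean distance to the
    nearest other cluster. Higher is better."""
    total = 0.0
    for i, p in enumerate(points):
        same = [abs(p - q) for q, l in zip(points, labels)
                if l == labels[i]]
        a = sum(same) / (len(same) - 1) if len(same) > 1 else 0.0
        b = min(
            sum(abs(p - q) for q, l in zip(points, labels) if l == j)
            / sum(1 for l in labels if l == j)
            for j in range(k)
            if j != labels[i] and any(l == j for l in labels)
        )
        total += (b - a) / max(a, b) if max(a, b) > 0 else 0.0
    return total / len(points)

# Try several candidate cluster counts and keep the best-scoring one.
scores = {}
for k in range(2, 6):
    labels = kmeans_1d(data, k)
    scores[k] = mean_silhouette(data, labels, k)

best_k = max(scores, key=scores.get)
print(best_k)
```

With two clearly separated groups, the silhouette score peaks at k=2 and drops when a group is split further. SQL Server's clustering algorithms are documented as using a probabilistic (EM-based) model rather than k-means, so its internal score is presumably likelihood-based, but the overall pattern of "score each candidate k, keep the best" is the same idea.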
Thanks for any help.