Alternative Methods for Cluster Validity

Hi,

I am new to KNIME and have been using k-means and c-means to generate clusters for some datasets.

I have noticed the Entropy node, which calculates the entropy of each cluster and gives an overall entropy. Are there any additional nodes that could be downloaded to compute other cluster validity measures, such as purity, squared error or any other form of validation? If not, are there any methods I could use to calculate these measures myself?

Any help would be appreciated, thanks.

Hi John, 

Someone will correct me if I am wrong, but I don't think we have any nodes that can do this natively. The usual fallback in this situation is to use R in an R-Snippet node. A quick Google search doesn't turn up any great libraries among the core R packages (http://nmf.r-forge.r-project.org/index.html has something), but at least for purity, there is some simple example code shown here:

http://stackoverflow.com/questions/9253843/r-clustering-purity-metric
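
For reference, both purity and squared error are simple enough to compute by hand. Below is a minimal sketch in plain Python, not taken from the linked answer. It assumes each row has a hard cluster assignment (for c-means, that would mean taking the cluster with the highest membership) and, for purity, a known class label. The function names purity and sum_squared_error are made up for this illustration.

```python
from collections import Counter, defaultdict


def purity(cluster_ids, class_labels):
    """Fraction of points that fall in the majority class of their cluster."""
    if len(cluster_ids) != len(class_labels):
        raise ValueError("cluster_ids and class_labels must have the same length")
    if not cluster_ids:
        raise ValueError("inputs must be non-empty")

    # Count how often each class label occurs inside each cluster.
    counts_by_cluster = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, class_labels):
        counts_by_cluster[cluster][label] += 1

    # Sum the size of the largest class in every cluster, then normalise.
    majority_total = sum(max(c.values()) for c in counts_by_cluster.values())
    return majority_total / len(cluster_ids)


def sum_squared_error(points, cluster_ids):
    """Sum of squared Euclidean distances from each point to its cluster mean."""
    members = defaultdict(list)
    for point, cluster in zip(points, cluster_ids):
        members[cluster].append(point)

    total = 0.0
    for pts in members.values():
        dim = len(pts[0])
        # The centroid is recomputed from the members (an assumption; the
        # centres reported by the clustering node could be used instead).
        centroid = [sum(p[d] for p in pts) / len(pts) for d in range(dim)]
        total += sum((p[d] - centroid[d]) ** 2 for p in pts for d in range(dim))
    return total


if __name__ == "__main__":
    # Toy data: two clusters of three points each (illustrative values only).
    clusters = [0, 0, 0, 1, 1, 1]
    labels = ["a", "a", "b", "b", "b", "b"]
    points = [(0.0, 0.0), (0.0, 2.0), (2.0, 0.0),
              (10.0, 10.0), (10.0, 12.0), (12.0, 10.0)]
    print("purity:", purity(clusters, labels))
    print("SSE:", sum_squared_error(points, clusters))
```

Purity needs ground-truth class labels, so it only applies to labelled data, while the squared error needs only the numeric columns and the cluster assignments. The same logic should port to an R-Snippet node, or run in a Python scripting node if the Python integration is installed.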

Does that help at all?

Regards,

Aaron