How to find the optimal number of clusters

Hi KNIME team & users, I am trying to determine the optimal number of clusters for my data set (I am using k-means). The problem is that I need to specify the number of clusters before I can execute the node.

Is there any way to find the optimal number of clusters without setting it first? I am quite new and I don’t know how to code in KNIME, so it would be helpful if you could kindly show that as well. :smile:

Thanks in advance

Very good question!
IBM’s SPSS products have the TwoStep algorithm, which determines the optimal number of clusters within a specified range. I don’t know about KNIME, though.

Hi,
This feature has already been requested:

:blush:

There is an example workflow on the server that shows how to determine a good number of clusters by looping over a set of values for k. You could check that out.

https://hub.knime.com/knime/workflows/Examples/06_Control_Structures/04_Loops/01_Loop_over_a_set_of_parameter_for_k_means*V-WokGlnamXnmo7j

Normalized Entropy = smaller is better
Quality = larger is better
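
To illustrate why a smaller normalized entropy is better, here is a minimal Python sketch. It is an assumption, not the formula used in the example workflow: it takes a generic definition (size-weighted entropy of reference labels inside each cluster, divided by log2 of the number of reference classes), and it assumes you have reference labels to compare against. The function names are hypothetical.

```python
from collections import Counter
from math import log2


def label_entropy(labels):
    """Shannon entropy (in bits) of the label distribution in one cluster."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def normalized_entropy(cluster_ids, reference_labels):
    """Size-weighted cluster entropy scaled to [0, 1].

    Generic definition used for illustration only (an assumption):
    0 means every cluster contains a single reference class,
    1 means the clusters are as mixed as they can be.
    """
    n_classes = len(set(reference_labels))
    if n_classes < 2:
        return 0.0  # a single class cannot be mixed
    groups = {}
    for cid, label in zip(cluster_ids, reference_labels):
        groups.setdefault(cid, []).append(label)
    n = len(reference_labels)
    weighted = sum(len(g) / n * label_entropy(g) for g in groups.values())
    return weighted / log2(n_classes)


reference = ["a", "a", "b", "b"]
pure = normalized_entropy([0, 0, 1, 1], reference)   # each cluster holds one class
mixed = normalized_entropy([0, 0, 0, 0], reference)  # one cluster holds everything
print(pure, mixed)
```

When looping over several values of k, the k whose clusters give the lowest value under such a measure would be the preferred one.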

Hi

On the example server there is a workflow where the number of topics is derived via the Elbow Method, "in which the sum of squares at each number of clusters is calculated and graphed, and the user looks for a change of slope from steep to shallow (an elbow) to determine the optimal number of clusters. This method is inexact, but still potentially helpful." (quote from towardsdatascience.com)

gr Hans

[Image: elbow_method]
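
For readers who want to see the elbow method outside of KNIME, here is a minimal, standard-library Python sketch. It is not the example workflow: the toy data, the farthest-point initialization (chosen so the run is deterministic), and the "largest second difference" rule for picking the bend are all assumptions made for illustration. In practice you would plot the curve and judge the elbow by eye, as the quote says.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_point_init(points, k):
    """Deterministic start (an assumption): pick points far from chosen centers."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centers))
        )
    return centers


def kmeans_wcss(points, k, max_iter=100):
    """Plain Lloyd's k-means; returns the within-cluster sum of squares."""
    centers = farthest_point_init(points, k)
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centers[i]))
            clusters[nearest].append(p)
        # Move each center to the mean of its members; keep it if the cluster is empty.
        new_centers = [
            tuple(sum(col) / len(members) for col in zip(*members)) if members else centers[i]
            for i, members in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return sum(min(squared_distance(p, c) for c in centers) for p in points)


def elbow_k(wcss):
    """Pick the k where the curve bends most (largest second difference)."""
    ks = sorted(wcss)
    return max(ks[1:-1], key=lambda k: wcss[k - 1] - 2 * wcss[k] + wcss[k + 1])


# Toy data (hypothetical): three tight 3x3 grids of points, far apart from each other.
offsets = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)]
points = [(cx + dx, cy + dy) for cx, cy in [(0, 0), (10, 10), (20, 0)] for dx, dy in offsets]

wcss = {k: kmeans_wcss(points, k) for k in range(1, 7)}
for k in sorted(wcss):
    print(k, round(wcss[k], 2))
print("elbow at k =", elbow_k(wcss))
```

On this toy data the sum of squares drops steeply up to k = 3 and flattens afterwards, so the bend falls at three clusters. Real data rarely gives such a clean elbow, which is why the method is described as inexact.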

Hi All,

Thanks for the solutions. :slight_smile: