It seems that the k-Means node assigns cluster names (cluster_1, cluster_2…) in an order that has nothing to do with the reference value. Is it possible to make the names follow the order of the reference value? Right now I am using the Rule Engine node to re-order them, e.g. “$Recency_Cluster$ = 2 => 1”, but if I re-run the analysis the clusters may be named differently again, and then the rule no longer works.

Hello @anguslou,

what do you mean by reference value?
Do you mean that the first cluster in the table is not necessarily cluster_1?



By reference value I mean the columns that the clustering is based on, e.g. total payment in my case.

I find that if I sort the table by total payment, the clusters come out as cluster_0, cluster_2, cluster_1, and so on…

So you mean the columns that are used to compute the clustering, i.e. the columns in the include list in the dialog of the k-Means node?

Yes, that’s right. I wonder how to get the clusters named in order of that value.

I can see how that might make sense in your case, but if you have more than one column, it’s not clear how to define such an order.
There is actually a caveat with the current implementation, namely that it uses the first rows as initialization, which is a very poor way to initialize k-Means if the rows are ordered.
Therefore I’d recommend using the Shuffle node to break up any order, then running k-Means, then sorting your table again, and finally reassigning the clusters in the desired order.
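
To make the last step (reassigning the clusters in the desired order) more concrete, here is a minimal Python sketch of the idea. It is only an illustration, not what the k-Means node does internally: the function name `relabel_by_mean` and the example data are made up, and it assumes a single reference column (total payment here), which is the case where an order is well defined. It ranks the clusters by the mean of the reference column and renames them accordingly, so the result no longer depends on which names a particular run happened to produce.

```python
from collections import defaultdict


def relabel_by_mean(values, labels):
    """Return new labels so that cluster_0 has the lowest mean value,
    cluster_1 the next lowest, and so on.

    Hypothetical helper for illustration; assumes one numeric
    reference column and one label per row.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value, label in zip(values, labels):
        sums[label] += value
        counts[label] += 1

    # Rank the original labels by the mean of the reference column.
    ranked = sorted(sums, key=lambda label: sums[label] / counts[label])
    mapping = {old: f"cluster_{rank}" for rank, old in enumerate(ranked)}
    return [mapping[label] for label in labels]


# Made-up example: total payment per customer and the arbitrary
# names a clustering run might have assigned.
total_payment = [10.0, 12.0, 480.0, 500.0, 95.0, 105.0]
raw_labels = ["cluster_0", "cluster_0", "cluster_1", "cluster_1",
              "cluster_2", "cluster_2"]

ordered = relabel_by_mean(total_payment, raw_labels)
print(ordered)
```

Because the mapping is recomputed from the data on every run, it replaces a hard-coded rule such as “$Recency_Cluster$ = 2 => 1” and keeps working when the names change after a re-run. With more than one column you would first have to decide on a single ordering value yourself (for example one chosen column), since there is no natural order otherwise.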



