Spark k-Means - the index is off between the model and labels applied to input

Hi all,
I created a k-Means model using the Spark k-Means node and noticed that the cluster numbering differs between the model and the labels applied to the input.
Any idea how to make these numbers consistent?



Hi @adityatw17,

The clusters are just named by their index, which starts at 1 in the model view and at 0 in the data view. As a quick workaround, you can add one to the cluster index in the data view so that it matches the cluster names in the model view. Still, you're right that this is confusing and should be consistent. I'll report this as a bug and keep you posted as soon as a fix is available. Thanks for your feedback!
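
In case it helps, here is a minimal sketch of that +1 shift in plain Python, for example if you post-process the labels in a scripting step. It assumes the cluster column in the data view holds plain integers starting at 0; the function and variable names are made up for illustration and are not part of the Spark k-Means node.

```python
def to_model_numbering(data_labels):
    """Shift 0-based cluster indices (data view) to 1-based ones (model view)."""
    return [label + 1 for label in data_labels]


# Hypothetical cluster indices as they would appear in the data view.
data_view_labels = [0, 2, 1, 0]

model_view_labels = to_model_numbering(data_view_labels)
print(model_view_labels)  # [1, 3, 2, 1]
```

If the cluster column holds strings rather than integers, the index would have to be parsed out first, so treat this only as a starting point.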



Excellent. Thank you so much, Marten!