Scoring new clusters with existing cluster centers

Hi everyone,

I have a K-means clustering project from a couple of years ago. Unfortunately, what I have is the data with cluster membership, but not the original K-means predictor model. So I want to use the cluster centers as input (just the matrix of the old means) and score new data. I'm guessing (hoping?) that there is a straightforward way to do this in KNIME, but I've never tried it there.

If it helps, it’s only a handful of variables and a dozen clusters so it would not be difficult to build a small matrix manually, but I want to avoid writing the calculation manually.

Any ideas would be welcome.

Thank you, Keith

Hi Keith and welcome to the KNIME community

This should not be difficult to implement, but I need at least two more details: Which distance metric did you use originally (Euclidean, Tanimoto, or another)? Were all of your variables (columns) numeric? Could you also upload your data here, so that I can work on it directly and provide a better solution?

Best

Ael

Thanks so much. The metric was Euclidean, and the variables are integer values ranging from 1 to 5. Uploading the data would not be easy at this time, but I could upload a fake version of it if required.

Thanks again, Keith

Hello @keith_mccormick,

Dummy data works just fine :wink:

Welcome to the community!

Br,
Ivan

Hi @keith_mccormick

My pleasure. Thanks for your message with the required K-Means configuration.

If I have understood your need correctly, I believe the following workflow should provide a possible solution:

20211021 Pikairos Scoring new clusters with existing cluster centers.knwf (533.0 KB)

The workflow is self-commented but please get in touch if you need further explanations.
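
For readers who cannot open the workflow, the calculation described in this thread (assign each new row to the old cluster center that is closest in Euclidean distance) can be sketched outside KNIME. The Python snippet below is only an illustration under stated assumptions, not an export of the workflow above: the function names `nearest_center` and `score_rows` are hypothetical, and the centers and rows are made-up values on the 1 to 5 scale mentioned earlier. It also assumes the original clustering ran on unnormalized data; if the variables were normalized back then, the same normalization would have to be applied to the new rows first.

```python
import math

def nearest_center(row, centers):
    """Return (index, distance) of the center closest to `row` (Euclidean)."""
    distances = [math.dist(row, center) for center in centers]
    # On an exact tie, min() keeps the first (lowest-index) center.
    best = min(range(len(centers)), key=distances.__getitem__)
    return best, distances[best]

def score_rows(rows, centers):
    """Assign every row to the index of its nearest center."""
    return [nearest_center(row, centers)[0] for row in rows]

# Made-up cluster centers (one row per cluster, one column per variable).
centers = [
    [1.0, 1.5, 2.0],
    [4.5, 4.0, 5.0],
    [3.0, 3.0, 3.0],
]

# Made-up new observations to score.
new_rows = [
    [1, 2, 2],
    [5, 4, 5],
    [3, 3, 4],
]

labels = score_rows(new_rows, centers)
print(labels)
```

The labels are zero-based positions in the `centers` list, so they would need to be mapped back to the original cluster names if those differ.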

Hope this helps.

Best

Ael
