Using K-Means cluster midpoints to assign new values

Snowy · September 3, 2019, 2:10pm

Hi!
I have an SPSS file (.sav) that contains the cluster midpoints (centroids) for each variable of a K-Means clustering. I can read the SPSS file with the R Snippet node using the foreign package, but I am wondering whether it is possible to assign new rows to one of the existing clusters using those midpoints?

This is what the SPSS file looks like when read through R:

I believe this is simply an export of the output. I also don’t believe it is possible to export a PMML file from SPSS Statistics.

Thanks for your help!
Snowy

nemad · September 4, 2019, 7:39am

Hello @Snowy,

k-Means assigns a new data row to the cluster whose centroid is closest.
The K Nearest Neighbor node uses a similar approach for classification.
Simply convert your cluster id into a string column and use it as the target in the K Nearest Neighbor node, with your table of midpoints as the training data and the number of neighbors set to 1, and you should be good to go.
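
For reference, the nearest-centroid rule described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the KNIME node's implementation: the centroid values, the cluster names and the helper `assign_to_cluster` are all made up for the example, and it assumes Euclidean distance on variables that are scaled the same way as when the clusters were built in SPSS.

```python
import math

def assign_to_cluster(row, centroids):
    """Return the id of the centroid closest to `row` (Euclidean distance)."""
    return min(centroids, key=lambda cid: math.dist(row, centroids[cid]))

# Hypothetical midpoints: one entry per cluster, one value per variable.
# In practice these would come from the exported SPSS table.
centroids = {
    "cluster_1": [1.0, 2.0],
    "cluster_2": [5.0, 6.0],
    "cluster_3": [9.0, 1.0],
}

# Hypothetical new rows that should be assigned to an existing cluster.
new_rows = [[1.5, 2.5], [8.0, 2.0], [4.0, 5.0]]

labels = [assign_to_cluster(row, centroids) for row in new_rows]
print(labels)  # ['cluster_1', 'cluster_3', 'cluster_2']
```

With one neighbor and one training row per cluster, a nearest-neighbor classifier reduces to exactly this rule, which is why the K Nearest Neighbor node can stand in for the missing K-Means model.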

Cheers,

Adrian
