Implement Clustering K-Medoids for new Data Pt. 2

stelfrich · June 7, 2021, 11:47am

To be honest, I don’t see how joining could solve the problem. As your workflow highlights, the Cluster Assigner would instead need to compute the distance from each row to the extracted medoids and assign the row to the closest one. A join doesn’t perform that computation; it only checks for equality, which isn’t what you are looking for.

Having said that, I don’t have a good solution available at the moment. You could try to get your New Data and the data from the second output port of the k-Medoids node into the same table. Once you have done that, you can compute a Distance Matrix for all pairs of rows (which now include the medoids). With the Distance Matrix Pair Extractor you can extract all pairs and filter for the ones that contain one of the medoids. If you then group by the new data row, you should be able to find the shortest distance for each row and assign the cluster of the corresponding medoid to that row.
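
In case it helps to see the idea outside of KNIME nodes, here is a minimal Python sketch of the same nearest-medoid assignment. It is only an illustration, not what the Cluster Assigner does internally: the function name `assign_to_nearest_medoid` and the sample values are made up, and it assumes purely numeric columns and Euclidean distance, which may differ from the distance configured in your k-Medoids node.

```python
import numpy as np

def assign_to_nearest_medoid(new_data, medoids):
    """Return, for each new row, the index of the closest medoid and that distance.

    Hypothetical helper: assumes numeric features and Euclidean distance.
    """
    new_data = np.asarray(new_data, dtype=float)
    medoids = np.asarray(medoids, dtype=float)
    # diffs[i, j] is the difference vector between new row i and medoid j
    diffs = new_data[:, None, :] - medoids[None, :, :]
    # distances[i, j] is the Euclidean distance between new row i and medoid j
    distances = np.linalg.norm(diffs, axis=2)
    # "group by row and take the minimum": pick the closest medoid per row
    nearest = distances.argmin(axis=1)
    shortest = distances[np.arange(len(new_data)), nearest]
    return nearest, shortest

# Made-up example values: two medoids and three new rows
medoids = [[0.0, 0.0], [10.0, 10.0]]
new_rows = [[1.0, 1.0], [9.0, 8.0], [4.0, 4.0]]
labels, dists = assign_to_nearest_medoid(new_rows, medoids)
print(labels)  # index of the assigned medoid for each new row
```

The `argmin` over each row plays the role of the grouping step described above: among all pairs that involve a given new row and a medoid, only the one with the shortest distance is kept.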

Best regards,
Stefan