This is the prompt for the homework we were given. I’m not sure how to filter the clusters, or what it means when it says clusters 2, 3, 4, 5. I was thinking each cluster could be a car type, like race car, family car, luxury car, etc. Not sure if I’m on the right track.

This is the data file we were given to build the k-means clusters from:

BIT-445-RS-Automobiles.xlsx (22.8 KB)

Any help would be extreeeeeeeeeemely appreciated, and I’d forever be in your debt, thank you

K-means clustering normally needs numerical data, because it relies on a distance measure to form the clusters. So you probably need to encode your data first. Then you can try different numbers of clusters, e.g. 2, 3, 4, 5, and see which gives the best metric.
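To make the "try 2, 3, 4, 5 and compare" idea concrete, here is a rough sketch in plain Python rather than KNIME. The `points` rows are made-up numbers, not values from the homework file, and `sq_dist` and `kmeans` are hypothetical helper names. The metric used here is the within-cluster sum of squares (WCSS), which is only one possible choice.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's algorithm. Returns (centroids, labels, wcss)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k of the rows
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    wcss = sum(sq_dist(p, centroids[l]) for p, l in zip(points, labels))
    return centroids, labels, wcss

# Made-up, already numeric rows (assumption: two features per car).
points = [
    (1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
    (5.0, 5.0), (5.2, 4.9), (4.8, 5.1),
    (9.0, 1.0), (9.1, 1.2), (8.9, 0.9),
]

results = {}
for k in (2, 3, 4, 5):
    _, _, wcss = kmeans(points, k)
    results[k] = wcss
    print(k, round(wcss, 3))
```

WCSS tends to keep shrinking as k grows, so people usually look for the k where the improvement levels off instead of just taking the smallest value. If the columns are on very different scales, normalizing them before clustering is usually a good idea.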

br


How would I encode my data?

There are various ways. One is one-hot encoding, which can be done with the One to Many node in KNIME. But you might explore other options as well.
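In case it helps to see what one-hot encoding actually produces, here is a small plain-Python sketch. It is not how the KNIME node is implemented, and the `body_style` column and its values are invented for the example.

```python
def one_hot(values):
    """Turn a list of category labels into 0/1 indicator rows.

    Returns (categories, rows), where rows[i][j] is 1 when values[i]
    equals categories[j] and 0 otherwise.
    """
    categories = sorted(set(values))
    rows = [[1 if v == cat else 0 for cat in categories] for v in values]
    return categories, rows

# Hypothetical categorical column, not taken from the homework file.
body_style = ["sedan", "suv", "coupe", "sedan"]

categories, encoded = one_hot(body_style)
print(categories)
for row in encoded:
    print(row)
```

Each category becomes its own 0/1 column, so the data is numeric and a distance can be computed on it.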

br


@NathanV a few resources about clustering and k-means that might help you:

If you know a class you can see what would be a good number of clusters:

Optimizing a “silhouette coefficient”:
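As a rough illustration of what the silhouette coefficient measures, here is a plain-Python sketch. The `points` and the two labelings are made up and hand-assigned for the example; in practice the labels would come from your k-means runs for each k. `silhouette` is a hypothetical helper name.

```python
from math import dist

def silhouette(points, labels):
    """Mean silhouette over all points (needs at least two clusters)."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    scores = []
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        # (dist(p, p) is 0, so dividing by len(own) - 1 excludes p itself).
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the nearest other cluster.
        b = min(
            sum(dist(p, q) for q in other) / len(other)
            for key, other in clusters.items() if key != l
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

# Made-up numeric rows with three visible groups.
points = [
    (1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
    (5.0, 5.0), (5.2, 4.9), (4.8, 5.1),
    (9.0, 1.0), (9.1, 1.2), (8.9, 0.9),
]
labels_k2 = [0, 0, 0, 1, 1, 1, 1, 1, 1]  # two groups merged into one
labels_k3 = [0, 0, 0, 1, 1, 1, 2, 2, 2]  # one label per visible group

score_k2 = silhouette(points, labels_k2)
score_k3 = silhouette(points, labels_k3)
print("k=2:", round(score_k2, 3))
print("k=3:", round(score_k3, 3))
```

The score is between -1 and 1, and higher means tighter, better separated clusters. "Optimizing" it just means picking the k whose clustering gives the highest mean score.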
