k-means clustering

Hi all

I want to cluster those columns:

But k-means doesn't accept strings.

It accepts only numbers.

How can I convert the data to numbers?

Note that Category To Number and String To Number both give me a conversion error.

I want each whole expression converted to a number, not each word within a cell.


k-means is not suitable for clustering terms. It needs the objects to lie in a multi-dimensional, continuous numeric space, and strings do not. What you can do is first compute distances between your terms using the distance matrix/function nodes and an appropriate distance function (e.g. Levenshtein). Then use k-medoids or hierarchical clustering. Both should work when you only have pairwise distances between objects rather than coordinates in a continuous space.
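
For anyone who wants to see the idea outside of the workflow nodes, here is a minimal Python sketch of the same two steps: a Levenshtein distance matrix over whole expressions, then k-medoids on that matrix. It is only an illustration under assumptions, not what the KNIME nodes do internally: the example terms are made up, the farthest-first initialisation and the simple alternating update are my own choices, and the function names are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn a into b."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete ca
                               current[j - 1] + 1,       # insert cb
                               previous[j - 1] + cost))  # substitute
        previous = current
    return previous[-1]


def distance_matrix(items, dist):
    """Full pairwise distance matrix as a list of lists."""
    return [[dist(x, y) for y in items] for x in items]


def k_medoids(dist, k, max_iter=100):
    """Alternating k-medoids on a precomputed distance matrix.
    Returns (medoid indices, cluster label per item)."""
    n = len(dist)
    # Assumed initialisation: start with item 0, then repeatedly add
    # the item farthest from the medoids chosen so far.
    medoids = [0]
    while len(medoids) < k:
        medoids.append(max(range(n),
                           key=lambda i: min(dist[i][m] for m in medoids)))
    labels = [0] * n
    for _ in range(max_iter):
        # Assign every item to its nearest medoid.
        labels = [min(range(k), key=lambda c: dist[i][medoids[c]])
                  for i in range(n)]
        # Within each cluster, pick the member with the smallest
        # total distance to the other members as the new medoid.
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:  # keep the old medoid for an empty cluster
                new_medoids.append(medoids[c])
                continue
            new_medoids.append(min(members,
                                   key=lambda m: sum(dist[m][j] for j in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, labels


# Made-up example terms; each whole expression is one object.
terms = ["red apple", "red apples", "green apple",
         "blue car", "blue cars", "blue cart"]
matrix = distance_matrix(terms, levenshtein)
medoids, labels = k_medoids(matrix, k=2)
for term, label in zip(terms, labels):
    print(label, term)
```

Note that no term is ever turned into a number here. Only the pairwise distances are used, which is why k-medoids works where k-means does not.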

Thank you so much

It worked for me. It gives me 5 clusters.

I have another question, please.

How can I visualize the results so that I can see which row belongs to which cluster?

This result appears:

For example, for the "black part" I want to know the row numbers that are in that cluster.

I believe it should be possible to use the highlighting feature. If you open the original table in an Interactive Table View node, select a cluster in the Pie Chart, and highlight the selected rows via the menu, then you should see the corresponding rows in the table view.
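
If you would rather have a plain listing than use highlighting, the same information can be obtained by grouping row IDs by cluster label. Below is a small Python sketch of that grouping. The row IDs and cluster names are invented for illustration, and `rows_by_cluster` is a hypothetical helper, not a KNIME function.

```python
from collections import defaultdict


def rows_by_cluster(row_ids, labels):
    """Map each cluster label to the list of row IDs assigned to it."""
    groups = defaultdict(list)
    for row_id, label in zip(row_ids, labels):
        groups[label].append(row_id)
    return dict(groups)


# Hypothetical clustering output: one cluster label per row.
row_ids = ["Row0", "Row1", "Row2", "Row3", "Row4"]
cluster_labels = ["cluster_0", "cluster_1", "cluster_0", "cluster_2", "cluster_1"]

groups = rows_by_cluster(row_ids, cluster_labels)
for label in sorted(groups):
    print(label, groups[label])
```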

Thank you so much

It worked for me.


Another question, please.

I want to apply association rules to the same data (the 2 columns above).

I tried bit vector and collection columns, but neither gives me the desired results.

What should I do?