Search in Clusters


I have a set of documents that I preprocessed; I then extracted keywords and created a document vector.

I also clustered them using k-medoids, and then I inserted the results into a GroupBy node to group each cluster and create a document list for each one of the clusters. That resulted in words and their mean values in each cluster.

My aim is to search for one or more words and return all occurrences in each cluster; then, for the highest-ranking cluster, I want to search inside each document.
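To make the goal concrete, here is a minimal sketch of that two-stage search in plain Python, outside KNIME. The toy data, the function names `rank_clusters` and `search_in_cluster`, and the scoring rule (mean of summed query-term weights per cluster) are all assumptions made for illustration, not something KNIME provides.

```python
from collections import defaultdict

# Hypothetical toy data: document id -> (cluster id, {term: weight}).
# In practice this would come from the document vector plus the cluster assignment.
docs = {
    "d1": (0, {"knime": 0.75, "cluster": 0.5}),
    "d2": (0, {"knime": 0.25, "search": 0.5}),
    "d3": (1, {"search": 0.75, "index": 0.5}),
    "d4": (1, {"cluster": 0.25}),
}


def rank_clusters(docs, query):
    """Stage 1: score each cluster by the mean, over its documents,
    of the summed weights of the query terms (an assumed scoring rule)."""
    totals = defaultdict(float)
    sizes = defaultdict(int)
    for cluster, vector in docs.values():
        sizes[cluster] += 1
        totals[cluster] += sum(vector.get(term, 0.0) for term in query)
    scores = {cluster: totals[cluster] / sizes[cluster] for cluster in sizes}
    # Highest-scoring cluster first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def search_in_cluster(docs, cluster, query):
    """Stage 2: inside one cluster, return the documents that contain
    at least one query term, best match first."""
    hits = []
    for doc_id, (doc_cluster, vector) in docs.items():
        if doc_cluster != cluster:
            continue
        score = sum(vector.get(term, 0.0) for term in query)
        if score > 0:
            hits.append((doc_id, score))
    return sorted(hits, key=lambda item: item[1], reverse=True)


query = ["search", "index"]
ranking = rank_clusters(docs, query)
best_cluster = ranking[0][0]
results = search_in_cluster(docs, best_cluster, query)
```

With this toy data, cluster 1 ranks first for the query, and only document `d3` is returned from it.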

I tried the Similarity Search node, but it doesn't seem to serve my aim.

Is there a way to do this inside KNIME, or should I create a database for it? Note that I have one column per word, and there are about 3000 words.

Hi Alaa,

You could do it by searching the bag-of-words result table produced after your keyword extraction process, e.g. with the Row Filter node. Alternatively, you can use the Indexing extension to create an index based on the preprocessed documents and query that index.
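As a rough illustration of the two ideas (filtering bag-of-words rows versus querying an index), here is a small Python sketch. It does not use KNIME or reproduce how the Row Filter node or the Indexing extension work internally; the `bag_of_words` rows and the helper names `filter_rows`, `build_index` and `query_index` are hypothetical.

```python
from collections import defaultdict

# Hypothetical bag-of-words table: one (document id, term) row per occurrence.
bag_of_words = [
    ("d1", "knime"), ("d1", "cluster"),
    ("d2", "knime"), ("d2", "search"),
    ("d3", "search"), ("d3", "index"),
]


def filter_rows(rows, query):
    """Row-filter style search: keep only the rows whose term is in the query."""
    wanted = set(query)
    return [(doc_id, term) for doc_id, term in rows if term in wanted]


def build_index(rows):
    """Build a simple inverted index: term -> set of document ids."""
    index = defaultdict(set)
    for doc_id, term in rows:
        index[term].add(doc_id)
    return index


def query_index(index, query, match_all=False):
    """Look up the query terms in the index.
    match_all=False returns documents containing any term,
    match_all=True only documents containing every term."""
    postings = [index.get(term, set()) for term in query]
    if not postings:
        return set()
    if match_all:
        return set.intersection(*postings)
    return set.union(*postings)


index = build_index(bag_of_words)
any_match = query_index(index, ["search", "index"])
all_match = query_index(index, ["search", "index"], match_all=True)
```

Filtering rescans the whole table for every query, while the index is built once and then answers each query by lookup, which tends to matter more as the vocabulary (here about 3000 words) and the document count grow.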

Cheers, Kilian