Clustering of Text based on Co-occurrences

I’ve tried to extract co-occurrences and put them into a document vector for later clustering. However, it turns out that it is not possible to build a term-term matrix instead of a document-term matrix.

It gives me the warning: Only one column containing TermCells allowed!

Am I overlooking something?

Hi @b0raas,

Have a look at the workflow "Term Cooccurences" on KNIME Hub, in particular the result of the GroupBy node. It is just a listing of term pairs and not a whole matrix, but I’m also not sure it is feasible to build the whole matrix, which would probably have thousands of rows and columns.

If the workflow isn’t any help, maybe you could share the state you’re at and where you want to go (what do you need the entire matrix for)?

Kind regards
Marvin


What I am looking for is a Term Vector that also has the terms as attributes (columns), not only as rows, the way the Document Vector has them.

This would result in a Term-Term matrix. I think this is currently not possible with KNIME.
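
To make the target structure concrete, here is a minimal sketch in plain Python (not KNIME nodes) of how a listing of co-occurring term pairs could be pivoted into a symmetric term-term matrix. It assumes simple whitespace tokenization and counts co-occurrence at document level. The function names and the sample documents are made up for illustration and are not taken from the workflow mentioned above. Something along these lines could probably be adapted for a Python Script node, but that is untested here.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(documents):
    """Count, per unordered term pair, how many documents contain both terms."""
    counts = Counter()
    for doc in documents:
        # Assumption: lowercase + whitespace split stands in for real tokenization.
        terms = sorted(set(doc.lower().split()))
        for a, b in combinations(terms, 2):
            counts[(a, b)] += 1
    return counts


def term_term_matrix(counts):
    """Pivot the pair listing into a symmetric matrix (list of lists)."""
    vocab = sorted({term for pair in counts for term in pair})
    index = {term: i for i, term in enumerate(vocab)}
    matrix = [[0] * len(vocab) for _ in vocab]
    for (a, b), n in counts.items():
        matrix[index[a]][index[b]] = n
        matrix[index[b]][index[a]] = n  # mirror the count to keep the matrix symmetric
    return vocab, matrix


# Hypothetical sample documents, for illustration only.
docs = ["knime text mining", "text mining clustering", "knime clustering"]
pairs = cooccurrence_counts(docs)
vocab, matrix = term_term_matrix(pairs)

for term, row in zip(vocab, matrix):
    print(term, row)
```

Each row of the matrix is then a term vector with the terms as attributes, which could be passed to a clustering step. The matrix is dense, so its memory use grows quadratically with the vocabulary size; for thousands of terms a sparse representation would likely be needed.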