Clustering of Text based on Co-occurrences

b0raas · November 19, 2021, 8:31am

I’ve tried to extract co-occurrences and put them into a document vector for later clustering. However, it turns out that it is not possible to build a term-term matrix instead of a document-term matrix.

It gives me the warning: Only one column containing TermCells allowed!

Am I overlooking something?

marvin.kickuth · November 23, 2021, 9:01am

Hi @b0raas,

have a look at the workflow Term Cooccurences – KNIME Hub, in particular the result of the GroupBy node. This is just a listing and not a whole matrix, but I’m also not sure it is feasible to build the whole matrix, which would probably have thousands of rows and columns.
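
For illustration outside KNIME, a pair listing of this kind can be sketched in plain Python. This is only a minimal sketch and not what the workflow does internally: the function name `cooccurrence_listing` and the sample documents are hypothetical, and "co-occurrence" is assumed to mean that both terms appear in the same document.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_listing(documents):
    """Count, for each unordered term pair, the number of documents containing both terms."""
    counts = Counter()
    for doc in documents:
        # Assumption: lowercasing and whitespace splitting stand in for real preprocessing.
        terms = sorted(set(doc.lower().split()))
        # Sorting makes each pair appear in one canonical (alphabetical) order.
        counts.update(combinations(terms, 2))
    return counts


# Hypothetical sample data, for illustration only.
docs = ["knime text mining", "text mining nodes", "knime nodes"]
listing = cooccurrence_listing(docs)
for (term_a, term_b), n in listing.most_common():
    print(term_a, term_b, n)
```

Only pairs that actually occur are stored, so the listing stays small even when the full matrix would be mostly zeros.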

If the workflow isn’t any help, maybe you could share the state you’re at and where you want to go (what do you need the entire matrix for)?

Kind regards
Marvin

b0raas · November 30, 2021, 9:11am

The particular thing I am looking for is a Term Vector that also has the terms as attributes (not only as rows), like the Document Vector has them.

This would result in a Term-Term matrix. I think this is currently not possible with KNIME.
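
To make the target structure concrete, here is a rough sketch in plain Python (outside KNIME) of the kind of term-term matrix meant here. The function name `term_term_matrix` and the sample documents are hypothetical, and the cell value is assumed to be the number of documents in which both terms occur.

```python
def term_term_matrix(documents):
    """Return (vocabulary, matrix); matrix[i][j] counts documents containing both terms."""
    # Assumption: lowercasing and whitespace splitting stand in for real preprocessing.
    doc_terms = [set(doc.lower().split()) for doc in documents]
    vocabulary = sorted(set().union(*doc_terms))
    index = {term: i for i, term in enumerate(vocabulary)}
    size = len(vocabulary)
    matrix = [[0] * size for _ in range(size)]
    for terms in doc_terms:
        ids = [index[t] for t in terms]
        # Every pair of terms in the same document co-occurs once;
        # the diagonal therefore holds each term's document frequency.
        for i in ids:
            for j in ids:
                matrix[i][j] += 1
    return vocabulary, matrix


# Hypothetical sample data, for illustration only.
docs = ["knime text mining", "text mining nodes", "knime nodes"]
vocab, matrix = term_term_matrix(docs)
print(vocab)
for term, row in zip(vocab, matrix):
    print(term, row)
```

Each row could then serve as the feature vector of one term for clustering. Note that the dense matrix grows quadratically with the vocabulary size, so this form is only practical for a limited number of terms.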
