TF-IDF for a Pair

  • Hi guys, how are you? Do you know if it is possible, and correct, to compute TF-IDF for a pair (verb-noun)?
  • I have been working on a TF-IDF implementation and calculating TF-IDF for a set of sentences, but now I want to do it, if possible, for a verb-noun pair in each sentence.

Welcome to the KNIME Community Forum @Benimaru !

I am not an expert in text processing, but I recommend searching hub.knime.com, as it is the best place to find example workflows. We have nodes for TF and IDF. If you follow these links, you will find workflows that use the nodes at the bottom of the page, under the "Related workflows & nodes" section.

For example, I found this workflow, Regression on Text – KNIME Hub, which you can have a look at.

Hope this helps!

Best,
Temesgen

Hi @Benimaru, you would need to use the n-gram node with n=2. Then split each bigram so that its two words end up in two separate columns (using the Cell Splitter node). Next, run a POS tagger on each word, and filter out the rows that do not match the pattern Verb (1st word) followed by Noun (2nd word).
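Outside of KNIME, the same bigram-then-filter idea can be sketched in a few lines of Python. This is only an illustration, not the node configuration itself: it assumes the tokens have already been POS-tagged with Penn Treebank tags (verbs start with `VB`, nouns with `NN`), and the function name `verb_noun_bigrams` is made up for this example.

```python
from typing import List, Tuple

# Assumption: each token arrives as (word, Penn Treebank POS tag),
# e.g. produced by whatever POS tagger you already use.
TaggedToken = Tuple[str, str]


def verb_noun_bigrams(tagged: List[TaggedToken]) -> List[Tuple[str, str]]:
    """Return adjacent (verb, noun) word pairs from one tagged sentence."""
    pairs = []
    # zip(tagged, tagged[1:]) walks over all bigrams (n = 2).
    for (w1, t1), (w2, t2) in zip(tagged, tagged[1:]):
        # Keep the bigram only if word 1 is a verb and word 2 is a noun.
        if t1.startswith("VB") and t2.startswith("NN"):
            pairs.append((w1.lower(), w2.lower()))
    return pairs


sentence = [
    ("She", "PRP"), ("reads", "VBZ"), ("books", "NNS"),
    ("and", "CC"), ("writes", "VBZ"), ("code", "NN"),
]
print(verb_noun_bigrams(sentence))
# [('reads', 'books'), ('writes', 'code')]
```

Note that strict bigrams only catch a verb immediately followed by a noun, so "reads the books" would be missed unless determiners are removed first.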

From there, I assume you know what to do with TF-IDF calculations.
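In case it helps, here is a rough sketch of TF-IDF where each "term" is a (verb, noun) pair instead of a single word. It assumes relative term frequency (count divided by the number of pairs in the document) and an unsmoothed IDF of log(N / df); the KNIME TF and IDF nodes, or other libraries, may use different variants. The function name `tf_idf_pairs` and the sample data are made up for illustration.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple

Pair = Tuple[str, str]


def tf_idf_pairs(docs: List[List[Pair]]) -> List[Dict[Pair, float]]:
    """Compute TF-IDF per document, treating each (verb, noun) pair as a term.

    Assumed definitions: tf = count / pairs in document, idf = ln(N / df).
    """
    n_docs = len(docs)
    # Document frequency: in how many documents does each pair occur?
    df = Counter()
    for pairs in docs:
        df.update(set(pairs))

    scores = []
    for pairs in docs:
        counts = Counter(pairs)
        total = len(pairs)
        scores.append({
            pair: (count / total) * math.log(n_docs / df[pair])
            for pair, count in counts.items()
        })
    return scores


# Each document is the list of verb-noun pairs extracted from it.
docs = [
    [("read", "book"), ("write", "code")],
    [("read", "book")],
    [("write", "code"), ("write", "code"), ("drink", "tea")],
]
for i, doc_scores in enumerate(tf_idf_pairs(docs)):
    print(i, doc_scores)
```

A pair that occurs in every document gets an IDF of 0 under this definition, so only pairs that are specific to some documents score highly.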

Also, on your question of whether it is ‘correct’ to pair words in this way: it depends on what you’d like to find out from the text. Popular collocation patterns for compound nouns include noun+noun and adjective+noun.


Also, you can take a look at the

node for a shorter solution.

I would like to extract the importance of each verb-noun pair in the document. Thank you for your answer.

Here is a simple intro to TF-IDF:
https://stevenloria.com/tf-idf/
