In “Technical Report: The KNIME Text Processing Feature: An Introduction” (2012), Dr. Killian Thiel and Dr. Michael Berthold mention that the BoW generates frequency columns beside each term. They illustrate this in figure 10 on page 12. Was that only available in an old version of KNIME, or is it still available in the current version? If it is still available, which node should be used for that purpose? The report is attached.
The report is a little bit outdated. To extract frequencies, you can use the TF (term frequency), DF (document frequency) and IDF (inverse document frequency) nodes right after the Bag of Words Creator node.
Another option would be the Unique Term Extractor, which gives you a list containing each occurring term once, plus all of the above-mentioned frequencies. This node can be used without a BoW: simply connect it to a table that contains documents.
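For anyone unsure what these frequency columns contain, here is a rough Python sketch of the same idea outside KNIME. It is not how the nodes are implemented: the function name `term_frequencies`, the whitespace tokenization, and the plain `log(N / df)` form of IDF are all assumptions made for illustration, and the KNIME nodes may use different variants depending on their configuration.

```python
import math
from collections import Counter


def term_frequencies(docs):
    """Build bag-of-words rows (one per document/term pair) with frequency columns.

    Hypothetical helper for illustration only; it does not reproduce
    the exact formulas of the KNIME TF, DF and IDF nodes.
    """
    n_docs = len(docs)
    # Assumption: naive lowercase + whitespace tokenization.
    tokenized = [doc.lower().split() for doc in docs]

    # DF: number of documents in which each term occurs at least once.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    rows = []
    for doc_id, tokens in enumerate(tokenized):
        counts = Counter(tokens)
        for term, count in sorted(counts.items()):
            rows.append({
                "doc": doc_id,
                "term": term,
                "tf_abs": count,                     # absolute term frequency
                "tf_rel": count / len(tokens),       # relative term frequency
                "df": df[term],                      # document frequency
                "idf": math.log(n_docs / df[term]),  # assumed plain IDF variant
            })
    return rows


docs = ["the cat sat", "the dog sat down", "a cat ran"]
rows = term_frequencies(docs)
for row in rows:
    print(row)
```

Each printed row corresponds to one term in one document, with the frequency values next to it, which is roughly the layout the report shows for the BoW.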