Trying to count word frequency

I’m new to KNIME and I’m completely lost.
I am trying to load a file with the Flat File Document Parser and count word frequencies. The problem I run into is that uppercase and lowercase forms of a word are counted separately, and words with attached punctuation are counted as distinct terms. I’ve tried a number of different workflows, which essentially look like this:
Flat File Document Parser → Bag Of Words Creator → TF

Whenever I add a string cleaner, it converts the terms to strings and the TF node no longer works. If I add a String To Term node, the frequencies are all 0. If I try to clean the file before the Bag Of Words Creator, it cleans the file name rather than the contents, and the output is a string, which does not work as input for the Bag Of Words Creator.

Use the Case Converter & Punctuation Erasure nodes to prep the data. Also see this post:
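
For anyone who wants to sanity-check the counts outside KNIME, here is a rough Python sketch of the same idea: lowercase the text, strip punctuation, then count. This only approximates what the Case Converter and Punctuation Erasure nodes do; it is not KNIME's implementation, and the function name `term_frequencies` and the sample sentence are made up for illustration.

```python
import re
from collections import Counter


def term_frequencies(text):
    """Return relative term frequencies after lowercasing and removing punctuation."""
    # Step 1: fold case so "Cat" and "CAT" become the same term
    lowered = text.lower()
    # Step 2: drop every character that is not a word character or whitespace
    # (assumption: this also removes apostrophes, so "don't" becomes "dont")
    no_punct = re.sub(r"[^\w\s]", "", lowered)
    # Step 3: split on whitespace and count
    tokens = no_punct.split()
    total = len(tokens)
    if total == 0:
        return {}
    counts = Counter(tokens)
    # Relative frequency: occurrences of a term / total number of tokens
    return {term: n / total for term, n in counts.items()}


sample = "The cat sat. The CAT, the cat!"
freqs = term_frequencies(sample)
for term, tf in sorted(freqs.items()):
    print(term, round(tf, 3))
```

Without the first two steps, "The", "the", "CAT," and "cat!" would each be counted as a separate term, which is the behaviour described in the question.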
