I am new to KNIME. I want to use KNIME to analyze Software Support Tickets containing unstructured text to identify patterns and trends.
To get started, I created a new workflow following instructions provided by Killian. (Thank you, Killian)
I am getting unexpected results from the workflow shown below, which starts from a .csv file.
• File Reader
• Strings to Documents
• Punctuation erasure
• Stop word filter
• N chars filter
• Snowball Stemmer
• Case converter
• Bag of words creator
• TF
• Sorter
• Count Sorted
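For comparison, the effect I would expect from the filtering steps can be sketched in plain Python. This is only an illustration under stated assumptions, not how KNIME implements its nodes: the stop word list, the function names, and the minimum length of 3 characters are all hypothetical, stemming is left out, and case conversion is done first so that stop word matching is case-insensitive.

```python
import string
from collections import Counter

# Hypothetical stop word list for illustration; a real list is much longer.
STOP_WORDS = {"the", "a", "an", "is", "to", "and", "of", "in", "on", "not"}

# Map every ASCII punctuation character to a space.
PUNCT_TABLE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def preprocess(text, min_chars=3):
    """Return the terms of one ticket after the filtering steps."""
    text = text.lower()                  # case conversion
    text = text.translate(PUNCT_TABLE)   # punctuation erasure
    tokens = text.split()                # split on whitespace
    tokens = [t for t in tokens if t not in STOP_WORDS]   # stop word filter
    tokens = [t for t in tokens if len(t) >= min_chars]   # drop very short terms
    return tokens


def term_counts(documents):
    """Count terms over all tickets, most frequent first."""
    counts = Counter()
    for doc in documents:
        counts.update(preprocess(doc))
    return counts.most_common()


if __name__ == "__main__":
    tickets = [
        "The printer is not responding!",
        "Printer driver crash, printer offline.",
    ]
    for term, count in term_counts(tickets):
        print(term, count)
```

In this sketch no stop word or punctuation mark can reach the final counts, which is the behavior I expected from the workflow.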
The results from the "Count Sorted" operator include words and punctuation marks that I would expect to be omitted, as shown below.
Should these words and punctuation marks be in the final results?
Should each word be followed by a pair of square brackets, [ ]?
Is it possible to edit the Stop Words file? If so, how?
Thanks for any and all help.