Hi experts. I am new to KNIME. I have 5 text files named 1500.txt, 1600.txt, ..., 1900.txt. Using Python, I selected the 200 most frequently used words from 1500.txt and compared their counts across the other text files. The following is an example of those words:
Words   1500   1600   1700   1800   1900
love    1775    904    897    887    798
great   1564    832   2044   2025   1574
good    1508   1599   2009   1671   1329
thee    1494   1644   1023    339     75
lord    1203    877   2110    823     84
Now the question is: I want to calculate the TF-IDF of these 200 words in each text file. I am attaching the list of 200 words for reference. I hope my question is clear.
Thanks Marc. Actually, I don't know which nodes to use or how to use them. Could you please help me build the workflow and walk me through the nodes step by step?
You can import the data from your CSV file using the CSV Reader node. After that, you can use one or more Math Formula nodes to calculate whatever you need (normalized term frequency, inverse document frequency). Alternatively, you can use the Java Snippet node if you are familiar with Java.
If you still don't understand what I mean, you may find some tutorials and example workflows somewhere on the KNIME homepage.
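Since the counts were produced with Python, here is a minimal sketch of the calculation the Math Formula nodes would have to reproduce. It is not a KNIME workflow, and it rests on assumptions that are not in the original post: each text file is treated as one document, the term frequency is normalized by the sum of the listed word counts in that file (the true total word count per file is not given), and IDF is log(N / df). The function name `tf_idf` and the `smooth` option are made up for illustration; the smoothed variant uses log((1 + N) / (1 + df)) + 1.

```python
import math

# Counts copied from the example table above (one list entry per file).
files = ["1500", "1600", "1700", "1800", "1900"]
counts = {
    "love":  [1775, 904, 897, 887, 798],
    "great": [1564, 832, 2044, 2025, 1574],
    "good":  [1508, 1599, 2009, 1671, 1329],
    "thee":  [1494, 1644, 1023, 339, 75],
    "lord":  [1203, 877, 2110, 823, 84],
}


def tf_idf(counts, n_docs, smooth=False):
    """Return {word: [tf-idf per document]} from a word-by-document count table.

    Assumption: the denominator of the term frequency is the sum of the
    listed word counts per document, not the document's real word total.
    """
    # Column sums: total of the listed counts in each document.
    totals = [sum(row[j] for row in counts.values()) for j in range(n_docs)]
    result = {}
    for word, row in counts.items():
        # Document frequency: number of documents containing the word.
        df = sum(1 for c in row if c > 0)
        if smooth:
            idf = math.log((1 + n_docs) / (1 + df)) + 1
        else:
            idf = math.log(n_docs / df) if df else 0.0
        result[word] = [
            (c / totals[j]) * idf if totals[j] else 0.0
            for j, c in enumerate(row)
        ]
    return result


scores = tf_idf(counts, len(files))
smoothed = tf_idf(counts, len(files), smooth=True)

for word in counts:
    print(word, [round(v, 4) for v in smoothed[word]])
```

One thing to watch for: the five sample words occur in all five files, so the plain IDF is log(5/5) = 0 and every unsmoothed TF-IDF score is zero. With only five documents, many of the 200 words are likely to behave the same way, which is why the smoothed variant is printed here.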