Hi experts. I am new to KNIME. I have 5 text files named 1500.txt, 1600.txt, ... 1900.txt. Using Python, I selected the 200 most frequently used words from 1500.txt and counted how often those words occur in the other text files. The following is a sample of the results:
Words   1500  1600  1700  1800  1900
love    1775   904   897   887   798
great   1564   832  2044  2025  1574
good    1508  1599  2009  1671  1329
thee    1494  1644  1023   339    75
lord    1203   877  2110   823    84
Now my question: I want to calculate the TF-IDF of these 200 words in each text file. I am attaching the list of 200 words for reference. I hope my question is clear.
Use the Flat File Parser to read in the txt files. Use the Dictionary Tagger to tag the 200 most frequent words that you counted before. Then chain the following nodes: General Tag Filter (to filter out all terms that have not been tagged, which reduces the number of terms and the amount of data) -> Bag of Words Creator -> TF (absolute).
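If you want to cross-check the KNIME output in Python, here is a rough sketch that works directly on a count table like the one in the question. It is not what the KNIME nodes do internally. The function name tf_idf and the plain log(N / df) form of IDF are my assumptions, and KNIME may use a different IDF variant, so the numbers may not match exactly.

```python
import math

def tf_idf(counts):
    """counts: dict mapping word -> list of absolute counts, one per document.

    Returns a dict mapping word -> list of scores (absolute TF * IDF),
    where IDF = log(N / df) is an assumed, unsmoothed definition.
    """
    n_docs = len(next(iter(counts.values())))
    scores = {}
    for word, tf in counts.items():
        # df: number of documents in which the word occurs at least once
        df = sum(1 for c in tf if c > 0)
        idf = math.log(n_docs / df) if df else 0.0
        scores[word] = [c * idf for c in tf]
    return scores

# Two rows copied from the table in the question (columns 1500 ... 1900).
counts = {
    "love": [1775, 904, 897, 887, 798],
    "thee": [1494, 1644, 1023, 339, 75],
}
scores = tf_idf(counts)
```

Note that with this unsmoothed IDF, any word that occurs in all 5 files gets IDF = log(5/5) = 0, so its TF-IDF is 0 in every file. That applies to every word in the sample table above, so you may want a smoothed IDF variant if most of your 200 words appear in all files.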