Finding TF for specified words from a number of PDF files

I have created a list of words and phrases using the Table Creator node. Now I want to find the TF (term frequency) of the words and phrases listed in that table across a number of PDF files. Can someone please spare some time and guide me in developing a workflow for this job…

Welcome to KNIME.
Take a look here


The PDF Parser node returns a column in which each row is a PDF document converted to a text document, ready to process. With a Chunk Loop Start you can process every document (maybe using more relevant preprocessing steps, as described in the document from @izaychik63 above), and finally calculate the term frequency and collect the results together as columns.
For more inspiration, search for “text” on the KNIME examples server; you will find lots of useful workflows.
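
To make the per-document calculation concrete, here is a minimal sketch in plain Python of what the loop has to do for each document. It is not how the KNIME nodes are implemented. It assumes the PDF text has already been extracted, and that TF means the number of occurrences of a word or phrase divided by the total number of tokens in the document. The names `tokenize` and `term_frequencies`, the file names, and the sample texts are made up for illustration.

```python
import re

def tokenize(text):
    # Lowercase the text and split it into alphanumeric tokens (an assumed, very simple tokenizer).
    return re.findall(r"[a-z0-9]+", text.lower())

def term_frequencies(text, terms):
    # Return {term: (absolute count, relative frequency)} for one document.
    tokens = tokenize(text)
    total = len(tokens)
    result = {}
    for term in terms:
        term_tokens = tokenize(term)  # a phrase becomes a sequence of tokens
        n = len(term_tokens)
        if n == 0 or total == 0:
            result[term] = (0, 0.0)
            continue
        # Slide a window of n tokens over the document and count exact matches.
        count = sum(
            1 for i in range(total - n + 1) if tokens[i:i + n] == term_tokens
        )
        result[term] = (count, count / total)
    return result

# Hypothetical stand-ins for the text extracted from each PDF.
documents = {
    "doc1.pdf": "Text mining turns text into data. Text mining is useful.",
    "doc2.pdf": "KNIME is a data analytics platform.",
}

# The word/phrase list that would come from the table.
terms = ["text mining", "data", "knime"]

# One row per document, one entry per term, like the collected loop output.
tf_table = {name: term_frequencies(text, terms) for name, text in documents.items()}

for name, row in tf_table.items():
    for term, (absolute, relative) in row.items():
        print(f"{name}\t{term}\tabs={absolute}\trel={relative:.3f}")
```

The word list is applied inside the per-document step, so only the listed words and phrases are counted, while the denominator is still the full token count of the document.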

gr. Hans

Thank you, sir/madam, for your time… I have one doubt… where should I apply my selected words/phrases for which I want to find the TF…? Actually, I had tried the Dictionary Tagger…
