Generate a table from PDF input

dav1b · March 2, 2018, 8:26pm

Hi all,

KNIME newbie here.

I want to read in data from a PDF, extract certain strings from the document, and place them into a table.

I've managed to read in a PDF using the PDF Parser node, but I'm not sure which nodes to use next. Can one of you experts please assist?

Many thanks in advance, David

RolandBurger · March 12, 2018, 11:03am

Hi David,

In addition to the PDF Parser, you can also use the Tika Parser node to extract data from PDFs. After that, the next step is usually to use a Strings to Document node to prepare the data for text processing. The following links show some examples of what you can do with text processing in KNIME:

https://www.knime.com/nodeguide/other-analytics-types/text-processing/tika-parsing
https://www.knime.com/nodeguide/other-analytics-types/text-processing/document-classification
https://www.knime.com/nodeguide/other-analytics-types/text-processing/sentiment-classification
https://www.knime.com/nodeguide/other-analytics-types/text-processing/topicextraction-with-the-elbowmethod
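
To give a rough idea of the "extract certain strings and place them into a table" step, here is a minimal sketch in plain Python, outside of KNIME. It assumes the PDF text has already been extracted into a string. The sample text, the field names, and the patterns are all made up for illustration, so you would replace them with whatever your documents actually contain.

```python
import re

# Hypothetical sample standing in for text already extracted from a PDF.
text = """Invoice No: INV-1042
Date: 2018-03-02
Total: 199.50
"""

# Hypothetical field patterns (assumptions, not taken from any real document).
# Each pattern has exactly one capture group holding the value to keep.
PATTERNS = {
    "invoice_no": r"Invoice No:\s*(\S+)",
    "date": r"Date:\s*(\d{4}-\d{2}-\d{2})",
    "total": r"Total:\s*([\d.]+)",
}


def extract_row(text, patterns):
    """Build one table row: column name -> first matching string, or None."""
    row = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, text)
        row[name] = match.group(1) if match else None
    return row


# One row per document; collecting the rows of several documents gives a table.
row = extract_row(text, PATTERNS)
print(row)
```

Each document becomes one row and each pattern becomes one column. A field that is not found ends up as a missing value instead of causing an error.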

I hope this helps you to get started!

Cheers,
Roland
