Extract Tables from PDF

oole · October 23, 2018, 3:58pm

Hello everyone,

Extracting the table is possible if the PDF allows it, meaning that the string we get from the PDF represents the table in a form we can manipulate.

I took @JAGBI’s example and parsed the first three pages to demonstrate. Extracting the whole table/document would require some more string manipulation. The magic happens in the Extract Table metanode, where the string is parsed into an actual table. The workflow would have to be adapted to other PDFs/tables, but it worked pretty well on the given PDF.
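
For readers who want to see the idea outside KNIME, here is a rough Python sketch of the kind of string-to-table parsing described above. It is not what the Extract Table metanode does internally. The function name `parse_table`, the `min_gap` parameter, and the sample text are all made up for illustration, and it assumes the extracted text has one table row per line with columns separated by two or more spaces, which may not hold for a given PDF.

```python
import re


def parse_table(text, min_gap=2):
    """Split extracted PDF text into rows of cells.

    Assumption: one table row per line, and columns separated by at
    least `min_gap` whitespace characters. Single spaces inside a cell
    are kept as part of that cell.
    """
    splitter = re.compile(r"\s{%d,}" % min_gap)
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between rows or pages
        rows.append(splitter.split(line))
    return rows


# Hypothetical text, standing in for the string read from the PDF.
sample = """Name        Qty   Price
Blue widget   3     4.50
Gadget        12    0.99
"""

table = parse_table(sample)
header, body = table[0], table[1:]
```

As with the workflow, the splitting rule would have to be adapted to each PDF, for example when cells are empty or when columns are separated by a single space.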

Here is the workflow: table_from_pdf.knwf (320.6 KB)

I hope it helps.