Extracting tables from PDF

Hello

I'm trying to load an image in a PDF into KNIME in order to read it and convert it to an Excel file, or to extract some columns I require. How can I do that?

Hi @ada,

This is not a primary use-case for KNIME Analytics Platform. You can, however, use other open-source tools to extract tables from a PDF and then import the result into KNIME. Try, for example, https://tabula.technology/

Best regards,
Stefan


Stefan, you can be a little more optimistic. See the article below.
https://www.researchgate.net/publication/322776745_Table_Recognition_in_Heterogeneous_Documents_Using_Machine_Learning

I wasn’t pessimistic. Quoting from the link that I had posted earlier:

Tabula allows you to extract that data into a CSV or Microsoft Excel spreadsheet using a simple, easy-to-use interface. Tabula works on Mac, Windows and Linux.

I just think that it would make more sense to use an external tool developed for exactly that use-case instead of implementing a custom solution…
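For anyone following the external-tool route, here is a minimal sketch of the step between Tabula and KNIME, assuming the table has already been exported from Tabula as CSV and only some columns are needed. The function name `select_columns`, the column names, and the sample data are hypothetical and only serve as illustration; it uses nothing beyond Python's standard `csv` module.

```python
import csv
import io


def select_columns(csv_text, wanted):
    """Return CSV text containing only the `wanted` columns, in that order."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Fail early if the export does not contain a requested column.
    missing = [name for name in wanted if name not in (reader.fieldnames or [])]
    if missing:
        raise KeyError(f"columns not found in export: {missing}")
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=wanted, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        # Columns not listed in `wanted` are dropped by extrasaction="ignore".
        writer.writerow(row)
    return out.getvalue()


# Hypothetical sample standing in for a Tabula CSV export.
sample = "Date,Item,Amount\n2018-01-05,Widget,12.50\n2018-01-06,Gadget,7.25\n"
result = select_columns(sample, ["Date", "Amount"])
print(result)
```

In practice the text would come from the exported file rather than a string, and the reduced CSV could then be read into KNIME, for instance with a CSV Reader node. Selecting the columns inside KNIME after import would work just as well; this only shows that the export is ordinary CSV.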
