Access table in PDF

I’m trying to get the table in the attached PDF file into KNIME, but I can’t seem to get it to work. I have looked at some other posts using the Tika Parser, but I’m not able to get it to acquire all of the data. Any help to point me in the right direction would be greatly appreciated.

2022_M_ArubaOceanClub.pdf (434.9 KB)

@hdavis54 you could take a look at the examples in this article.

Another approach is to use LLMs to extract the information. Here are examples that use locally run models in order to protect data privacy.


Hi @hdavis54 ,

As of now, there is no single node that does this directly. If you just want to convert the PDF to Excel and then use KNIME for further processing, you can use the TabulaPDF tool. It is open source, needs no AI, and is a web app that runs locally. (If your PDF layout stays the same, you could instead build one workflow with the Tika Parser and regex nodes.)
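To give a rough idea of the regex route for a fixed layout, here is a minimal Python sketch. It assumes the PDF text has already been extracted (for example by the Tika Parser) and that each table row is a text label followed by whitespace-separated numbers. The pattern, the `parse_rows` helper, and the sample text are all made up for illustration and are not taken from the attached PDF, so the pattern would need adjusting to the real layout.

```python
import re

# Assumed row layout (hypothetical): a text label, then one or more numbers
# separated by whitespace, e.g. "Maintenance Fee   1,250.00   75".
ROW_PATTERN = re.compile(
    r"^(?P<label>.+?)\s+(?P<values>(?:-?\d[\d,]*(?:\.\d+)?\s*)+)$"
)


def parse_rows(text):
    """Return (label, [numbers]) for every line that matches the assumed layout."""
    rows = []
    for line in text.splitlines():
        match = ROW_PATTERN.match(line.strip())
        if match is None:
            # Headers, blank lines, and free text do not match and are skipped.
            continue
        # Drop thousands separators before converting to float.
        values = [float(v.replace(",", "")) for v in match.group("values").split()]
        rows.append((match.group("label"), values))
    return rows


# Made-up sample text standing in for the extracted PDF content.
sample = """Item            Amount     Units
Maintenance Fee   1,250.00   75
Reserve Fund      310.50     12
"""

for label, values in parse_rows(sample):
    print(label, values)
```

Note that a label ending in a number (such as "Week 12") would have that number read as the first value, so a layout like that needs a stricter pattern.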

KNIME team, please consider including this library and adding nodes for it.