Hi, I’m the PDF guy on the forum. I’ve never heard of a non-OCR PDF. What is that exactly?
We recently had a PDF extraction event via Data Connect. The slides can be found here.
For PDFs, you may also find that the Tika Parser node works better for extraction, depending on what you want to extract and how.
We also covered PDF extraction in a Just KNIME It challenge:
Extracting a Table from a PDF – alinebessa
Given a text-based PDF document with a table, can you partially extract the table into a KNIME data table for further analysis? For this challenge we will extr…
Take a look at the community solutions as well.
Please look at those examples and then post your workflow so we can diagnose the issue and give you the best possible answer. Thank you!