Tika Parser Node - Content Bug

closer1 · October 3, 2025, 2:48pm

Hi. I recently started using the Tika Parser node to read PDF files, but for some PDFs the extracted content comes out wrong.

I attached two examples below: the first shows how the content should be read, and the second shows how it is actually read.

Any ideas on how to solve this?

Visually, the PDF that produces the error is exactly the same as the one that reads perfectly.

rfeigel · October 3, 2025, 11:35pm

Let me make sure I understand. You have two PDFs that appear to be identical, and the Tika Parser reads one correctly but not the other. Was the one that won't read correctly scanned? Could you provide examples of both? What language are you working with?
