Hi. I recently started using the Tika Parser node to read PDF files, but some PDFs are being read with content errors.
I attached 2 examples below: the first shows how the content should be read, and the second shows how it is actually read.
Any ideas on how to solve it?
Visually, the PDF that gets the error is exactly the same as the one that reads perfectly.
Let me make sure I understand. You have 2 PDFs. They appear to be identical. The Tika Parser will read one but not the other. Was the one that won’t read correctly a scanned document? Could you provide examples of both? What language are you working with?