Hello KNIME Community,
I’m encountering an issue with the PDF Parser node in KNIME. When I use the node to extract text from a PDF document with a two-column layout, it combines both columns into a single line in the output. As a result, the extracted text loses its original formatting and structure.
I’ve attached two screenshots to illustrate the problem:
- The first shows the result from the PDF Parser node, where the two columns are combined into one row.
- The second shows the original PDF with the two distinct columns.
Additionally, I’ve attached the PDF article for reference.
ijmsv11p1185.pdf (1.1 MB)
Is there a way to configure the PDF Parser node to maintain the original column structure and extract the text as it appears in the source PDF? Or is there a workaround or setting I might be missing?
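To make the expected result concrete, here is a minimal Python sketch of the reading order I am hoping for. It is not the PDF Parser node's behavior or API: the `(x, y, text)` fragments, the `page_width` parameter, and the function name are all hypothetical, and it assumes the text positions are available and the page has exactly two equal-width columns.

```python
from typing import List, Tuple

# Hypothetical input: one (x, y, text) tuple per text fragment,
# with x measured from the left edge and y from the top of the page.
Fragment = Tuple[float, float, str]


def read_in_column_order(fragments: List[Fragment], page_width: float) -> str:
    """Return the text with the left column first, then the right column."""
    midpoint = page_width / 2  # assumption: two equal-width columns
    left = [f for f in fragments if f[0] < midpoint]
    right = [f for f in fragments if f[0] >= midpoint]
    # Within each column, read top to bottom.
    ordered = sorted(left, key=lambda f: f[1]) + sorted(right, key=lambda f: f[1])
    return "\n".join(text for _, _, text in ordered)


# Made-up sample data, not taken from the attached PDF.
sample = [
    (50.0, 100.0, "Left line 1"),
    (320.0, 100.0, "Right line 1"),
    (50.0, 115.0, "Left line 2"),
    (320.0, 115.0, "Right line 2"),
]
print(read_in_column_order(sample, page_width=600.0))
```

With this sample, both left-column lines come out before the right-column lines, instead of each left line being merged with the right line at the same height.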
Any help or guidance on how to address this issue would be greatly appreciated.
Thank you in advance!