Divided columns on table from PDF

I have a PDF with a table whose headers are client1, client2 and client3. When I use the Tika Parser or the PDF Parser, all clients are combined into one cell, but I need them in separate cells. The problem is that the client names don't follow any pattern that would let me distinguish them from each other. Is there a way to extract the client names cleanly, or is there a node that can read the table with its columns already separated?

Hi @lizcuenca

Welcome to the KNIME Community!

Can you post some (anonymized) examples? It’s very hard to judge from a distance what your cell exactly looks like. The more you can provide, the better the help usually gets :wink:


Look at the thread below.

It may give you some ideas.


Hi, this is an example of what I need to extract. I need the content of each text box separated, but the Tika Parser and the PDF Parser read everything as a single line.
image

Hi! Here is an example: divided columns on table from pdf - #4 by lizcuenca
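
When the text in the cells has no pattern to split on, one idea that may help is to split by position instead of by content. The sketch below is only an illustration under assumptions: it assumes you can get each word together with its horizontal position from some layout-aware extraction step (the plain text output of a parser does not carry this), and the function name `split_row_into_columns`, the coordinates and the client names are all made up. It could be tried in something like a Python Script node, but it is not an existing KNIME node.

```python
from bisect import bisect_right


def split_row_into_columns(words, boundaries):
    """Assign each word of one table row to a column by its x-position.

    words:      list of (x_left, text) tuples for a single row
    boundaries: sorted x-positions where the 2nd, 3rd, ... column starts
    """
    columns = [[] for _ in range(len(boundaries) + 1)]
    # Sort by x so words inside a cell keep their left-to-right order.
    for x, text in sorted(words):
        # bisect_right counts how many column starts lie at or left of x.
        columns[bisect_right(boundaries, x)].append(text)
    return [" ".join(col) for col in columns]


# Hypothetical row: positions and names are invented for illustration.
row = [
    (40.0, "Acme"), (75.0, "Corp"),
    (210.0, "Globex"),
    (380.0, "Initech"), (425.0, "Ltd"),
]
# Assumed column starts, e.g. taken from the x-positions of the headers.
boundaries = [200.0, 370.0]

cells = split_row_into_columns(row, boundaries)
print(cells)
```

The column boundaries do the work here, so the names themselves do not need any recognisable pattern. If the headers client1, client2 and client3 can be located, their x-positions are a natural choice for the boundaries.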
