I have a PDF with a table whose header contains client1, client2 and client3. When I use the Tika Parser or PDF Parser, all clients are combined into one cell, but I need them separately. The problem is that the client names don't follow any pattern that would distinguish them from each other. Is there a way to extract the client names cleanly, or is there a node that can read the table with the columns already separated?

Can you post some (anonymized) examples? It’s very hard to judge from a distance what your cell exactly looks like. The more you can provide, the better the help usually gets :wink:

Hi, this is an example of what I need to extract. I need the content of each of the different text boxes separated, but the Tika Parser and PDF Parser read everything as a single line.
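Since the names themselves have no pattern, one possible approach is to split by position on the page instead of by text. Below is a minimal sketch in plain Python, assuming you can get each word together with its x/y coordinates from a layout-aware extractor (for example `extract_words()` in pdfplumber). The `(x, y, text)` tuple format, the function name `split_into_columns`, the sample words, and the column boundaries are all made up for illustration; you would have to read the real boundaries off your own PDF.

```python
from typing import List, Tuple

# Hypothetical input format: (x_left, y_top, text) for each word on the page.
Word = Tuple[float, float, str]


def split_into_columns(
    words: List[Word],
    column_edges: List[float],
    row_tolerance: float = 2.0,
) -> List[List[str]]:
    """Group words into rows by y, then into columns by x.

    column_edges holds the x positions where a new column starts,
    so N edges produce N + 1 columns.
    """
    rows = []  # each entry: (y of the first word in the row, list of cells)
    for x, y, text in sorted(words, key=lambda w: (w[1], w[0])):
        # Open a new row when y is further than the tolerance from the current row.
        if not rows or abs(y - rows[-1][0]) > row_tolerance:
            rows.append((y, [[] for _ in range(len(column_edges) + 1)]))
        # Column index = number of edges that lie at or left of the word.
        col = sum(1 for edge in column_edges if x >= edge)
        rows[-1][1][col].append((x, text))
    # Inside each cell, restore left-to-right order and join the words.
    return [
        [" ".join(t for _, t in sorted(cell)) for cell in cells]
        for _, cells in rows
    ]


# Made-up sample: a header row with three client names and one data row.
sample_words = [
    (10, 100, "Acme"), (40, 100, "Corp"),
    (200, 100.5, "Globex"), (400, 100, "Initech"),
    (10, 120, "1"), (200, 120, "2"), (400, 120, "3"),
]
table = split_into_columns(sample_words, column_edges=[150, 350])
```

With the sample above, the first row of `table` holds the three client names in separate cells. This only works if the columns sit at stable x positions; if they shift from page to page, the edges would need to be detected per page.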

