Extract a PDF

Brain · November 6, 2023, 3:39pm

Hello,

I would like to extract the content of this PDF with the PDF Parser or Tika Parser node, from this website:

I have tried, but I can't find a way to do it.
Does anyone know how?
Thanks

lelloba · November 6, 2023, 5:54pm

You can do this in two ways:

1. Download the file, save it into a folder, then read it with the Tika Parser node.
2. Use a Table Creator to hold the link as a string, then use a Tika Parser URL Input node to read the link and download and parse the file, as shown in the workflow below.

The strange thing is that your original link (https://www.arqana.com/upload/pedigrees/vente334/complet_eng.pdf?v=2311060311) doesn’t work in the Tika Parser URL Input node. If you remove the trailing query string and use https://www.arqana.com/upload/pedigrees/vente334/complet_eng.pdf instead, the node works as expected. I don’t know whether this is a bug or expected behaviour that I can’t explain.
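
If you have many links to clean, the same "remove the last part" step could be scripted rather than done by hand. This is only a minimal sketch using Python's standard library, not a KNIME node configuration; the helper name strip_query is my own, and I am assuming the server returns the same PDF with or without the ?v=... parameter, as it did for this link.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_query(url: str) -> str:
    """Return the URL without its query string and fragment (hypothetical helper)."""
    parts = urlsplit(url)
    # Keep scheme, host and path; drop everything after "?" or "#".
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


original = "https://www.arqana.com/upload/pedigrees/vente334/complet_eng.pdf?v=2311060311"
cleaned = strip_query(original)
print(cleaned)
```

The cleaned string is what you would then put into the Table Creator. Inside KNIME, a String Manipulation node could probably do the same job, but I haven't tested that here.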

Hope it helps! Have a nice evening,
Raffaello Barri

Brain · November 6, 2023, 6:20pm

Thanks a lot. There is still a long way to go with this PDF, but it works.
