Table Reader - Strings to Document

Hi,
I am getting the message below when I try to pass the output of Table Reader into Strings to Document:

“Execute failed: Cell at index 0 is null!”

Hi there @AhmetYavuz,

welcome to KNIME Community Forum!

What is the content of the Table Reader, and what does the Strings to Document configuration look like?

Br,
Ivan

Hi @ipazin,

Thank you so much for your reply,

Actually, I am trying to use the NGram node. Before that, I used PDF Parser to extract 2000 PDF documents containing scientific articles, then I used Document Extractor to extract tables from the PDF Parser output.

I wrote this out as a table and then read it back with Table Reader, so the Table Reader contains 2000 PDF articles.

I want to use NGram after preprocessing. Do you have any advice for large volumes of documents, such as thousands of articles?

Kind Regards,
Ahmet

Hi @AhmetYavuz -

The Strings to Document node only works on data of type string, whereas you created a table using the PDF Parser, which already produces documents. (That is, if I’m understanding you correctly - let me know if I’m not.)
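To illustrate the type distinction, here is a rough Python sketch. It is not KNIME code and does not reproduce the node's internals; the `Document` class and `to_document` helper are hypothetical stand-ins. It only shows the idea that plain strings need wrapping, cells that are already documents can pass through unchanged, and missing cells have to be handled before conversion:

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class Document:
    """Hypothetical stand-in for a document cell (not a KNIME class)."""
    title: str
    text: str


Cell = Union[str, Document, None]


def to_document(cell: Cell, title: str) -> Optional[Document]:
    """Wrap strings, pass documents through, and skip missing cells."""
    if cell is None:
        # A missing cell cannot be converted; skip it instead of failing.
        return None
    if isinstance(cell, Document):
        # Already a document, so no conversion is needed.
        return cell
    if isinstance(cell, str):
        return Document(title=title, text=cell)
    raise TypeError(f"Unsupported cell type: {type(cell).__name__}")


# Illustrative rows: one string, one existing document, one missing cell.
rows = ["Some abstract text", Document("paper-2", "Already parsed"), None]
converted = [to_document(cell, f"row-{i}") for i, cell in enumerate(rows)]
usable = [doc for doc in converted if doc is not None]
```

In workflow terms, this roughly corresponds to checking the column type and filtering out rows with missing values before the conversion step.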

We actually just published a blog post a couple of weeks back that deals with pre-processing of OCR data, along with creating bigrams for additional analysis. I realize you are using PDFs as input as opposed to images, but a lot of the concepts are similar. As with most of our blog posts, it comes with a workflow too - check it out here:

https://www.knime.com/blog/an-experiment-in-ocr-error-correction-sharing-treasure-on-the-knime-hub
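For readers unfamiliar with bigrams, the following is a minimal Python sketch of counting word n-grams across a collection of texts. It is independent of KNIME and of the workflow in the blog post; the `word_ngrams` function, the simple tokenization rule, and the sample sentences are assumptions made for illustration only:

```python
import re
from collections import Counter


def word_ngrams(text, n=2):
    """Return the word n-grams of a text as a list of tuples."""
    # Very simple tokenization: lowercase runs of letters and digits.
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Illustrative input; in practice each entry would be one article's text.
docs = [
    "Text mining of scientific articles",
    "Scientific articles need text mining",
]

counts = Counter()
for doc in docs:
    # Handling one document at a time keeps only the running counts in
    # memory, which may help with collections of thousands of articles.
    counts.update(word_ngrams(doc, n=2))

most_common = counts.most_common(2)
```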

I am encountering the same problem. Is there any solution?

@Saivinod,

I answered here:

Hi there,
This happens after reading a PDF file with a Tika Parser node, where I am trying to read the contents of the PDF file.

@Saivinod,

is it possible to provide an example pdf where this happens?

Best,
Julian

The text I am trying to extract is from the PDF at the link below:
https://onlinelibrary.wiley.com/doi/abs/10.1002/bse.2703

Due to copyright issues, I won't be able to share the PDF directly.
