Table Reader - Strings to Document

#1

Hi,
I get the message below when I try to pass the output of Table Reader into Strings to Document:

“Execute failed: Cell at index 0 is null!”

#2

Hi there @AhmetYavuz,

welcome to KNIME Community Forum!

What does the Table Reader output contain, and how is the Strings to Document node configured?

Br,
Ivan

#3

Hi @ipazin,

Thank you so much for your reply,

Actually, I am trying to use the NGram node. Before that, I used PDF Parser to parse 2000 PDF documents containing scientific articles, then used Document Extractor to extract tables from the PDF Parser output.

I wrote the result to a table file and read it back with Table Reader, so Table Reader contains the 2000 PDF articles.

I want to use NGram after preprocessing. Do you have any advice for large document volumes, such as thousands of articles?

Kind Regards,
Ahmet

#4

Hi @AhmetYavuz -

The Strings to Document node only works on data of type string, whereas you created your table using the PDF Parser, which already produces documents. (That is, if I’m understanding you correctly - let me know if I’m not.)

We actually just published a blog post a couple of weeks back that deals with pre-processing of OCR data, along with creating bigrams for additional analysis. I realize you are using PDFs as input as opposed to images, but a lot of the concepts are similar. As with most of our blog posts, it comes with a workflow too - check it out here:

https://www.knime.com/blog/an-experiment-in-ocr-error-correction-sharing-treasure-on-the-knime-hub
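
For anyone who wants to see the bigram idea outside of KNIME, here is a minimal Python sketch of counting n-grams across many documents one at a time, so the whole corpus never has to sit in memory. It is not the KNIME NGram node or the workflow from the blog post; the function names (`tokenize`, `ngrams`, `count_ngrams`) and the simple regex tokenizer are assumptions made for illustration. Missing texts are skipped, much like filtering out missing cells before a conversion step.

```python
import re
from collections import Counter
from typing import Iterable, Iterator, List, Optional, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def ngrams(tokens: List[str], n: int = 2) -> Iterator[Tuple[str, ...]]:
    """Yield consecutive n-token tuples; yields nothing if there are fewer than n tokens."""
    return zip(*(tokens[i:] for i in range(n)))


def count_ngrams(documents: Iterable[Optional[str]], n: int = 2) -> Counter:
    """Count n-grams over an iterable of document texts, processing one document at a time."""
    counts: Counter = Counter()
    for text in documents:
        if not text:  # skip missing or empty documents instead of failing on them
            continue
        counts.update(ngrams(tokenize(text), n))
    return counts


if __name__ == "__main__":
    # Hypothetical sample texts; in practice this could be a generator reading files lazily.
    docs = ["The cat sat on the mat.", None, "The cat ran."]
    for gram, freq in count_ngrams(docs, n=2).most_common(3):
        print(" ".join(gram), freq)
```

Because `count_ngrams` accepts any iterable, a generator that reads one article at a time works just as well as a list, which is what matters once you get to thousands of articles.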
