Table Reader - Strings to Document

AhmetYavuz · October 9, 2019, 9:41am

Hi,
I am getting the below message when I am trying transformation of Table Reader to Strings to Document,

“Execute failed: Cell at index 0 is null!”

resim

ipazin · October 9, 2019, 3:49pm

Hi there @AhmetYavuz,

welcome to KNIME Community Forum!

What is content of Table Reader and how does Strings to Document configuration looks like?

Br,
Ivan

AhmetYavuz · October 10, 2019, 4:40am

Hi @ipazin,

Thank you so much for your reply,

Actually I try to use NGram node, before using it I used PDF Parser for extract 2000 pdf documents containing sciencific articles, then I used Document Extractor to extract tables from PDF Parser,

I wrote this as table then read with Table Reader, so

Table Reader contains 2000 pdf article,

I want to use NGram after preprocessing, do you have any comment for big volume documents like thousands of artcile

Kind Regards,
Ahmet

ScottF · October 15, 2019, 2:34pm

Hi @AhmetYavuz -

The String to Document node only works on data of type string, whereas you created a table using the PDF Parser, which produces documents already. (That is, if I’m understanding you correctly - let me know if I’m not.)

We actually just published a blog post a couple of weeks back that deals with pre-processing of OCR data, along with creating bigrams for additional analysis. I realize you are using PDFs as input as opposed to images, but a lot of the concepts are similar. As with most of our blog posts, it comes with a workflow too - check it out here:

https://www.knime.com/blog/an-experiment-in-ocr-error-correction-sharing-treasure-on-the-knime-hub

Saivinod · March 24, 2021, 6:02pm

Encountering such problem. Any solution

julian.bunzel · March 25, 2021, 11:07am

@Saivinod,

I answered here:

Saivinod · March 26, 2021, 6:11pm

Hi There,
This after having read PDF file via a Tika Parser node where I am trying to read the contents of the PDF file.

julian.bunzel · March 29, 2021, 10:22am

@Saivinod,

is it possible to provide an example pdf where this happens?

Best,
Julian

Saivinod · March 29, 2021, 1:39pm

The text I am trying to extract is from the pdf of the link below:
https://onlinelibrary.wiley.com/doi/abs/10.1002/bse.2703

Due to copyright issue I wont be able to share the pdf directly.

system · June 2, 2023, 9:40pm

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.