OCR with Tika Parser, Image Reader Table and Tess4J

Hi @RezaBaharmand ,

after playing around a bit, I now can :slight_smile: Two things to note:

  1. Apparently, the Tess4J node expects the traineddata files to live in a folder called “tessdata”.
  2. Since KNIME uses version 3.* of Tesseract, the traineddata for 4.* will not work. You can find the traineddata for version 3.* here. (Thanks to post #10 by stelfrich in the thread “Tess4j for chinese Execute failed: Invalid memory access”.)
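
If you want to double-check the folder layout before pointing the Tess4J node at it, a rough sketch like the following might help. It is not part of KNIME or Tess4J; the function name `check_tessdata` and the default language code `nor` are just assumptions for illustration. It only checks the folder name and the presence of the `<language>.traineddata` file, and it cannot tell whether that file was built for Tesseract 3.* or 4.*.

```python
from pathlib import Path


def check_tessdata(folder: str, language: str = "nor") -> list[str]:
    """Return a list of problems with a tessdata folder (empty list = looks fine).

    Hypothetical helper: only the folder name and file presence are checked,
    not the Tesseract version the traineddata was built for.
    """
    path = Path(folder)
    problems = []

    # The Tess4J node apparently expects the folder itself to be called "tessdata".
    if path.name != "tessdata":
        problems.append(f"folder is named '{path.name}', expected 'tessdata'")

    if not path.is_dir():
        problems.append(f"'{path}' is not an existing directory")
    elif not (path / f"{language}.traineddata").is_file():
        # Language files follow the <language>.traineddata naming pattern.
        problems.append(f"missing {language}.traineddata in '{path}'")

    return problems


if __name__ == "__main__":
    # Example path; replace it with wherever you unpacked the 3.* traineddata.
    for problem in check_tessdata("tessdata", "nor"):
        print(problem)
```

An empty result only means the layout looks right; the version mismatch from point 2 still has to be ruled out by downloading the 3.* files.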

I also uploaded a little example workflow:

Please excuse my norsk; I have no idea what is written there :wink: But the transcript looks pretty similar to the original text to me.

Cheers,
Lukas
