Hi,
I'm using Tess4J to OCR an image that contains Chinese text. With the default setup for English characters it works fine. However, when I use the external training data and run OCR for Chinese, I get an error: Execute failed: Invalid memory access.
If anyone has run into the same problem and has a solution, please let me know. Thank you so much.
If possible, could you provide a workflow with some example data to reproduce the issue? It would also be very helpful if you could provide the log file (see Bug Reporting Best Practices - Knowledge sharing - KNIME Community Forum for more information), which should contain a more detailed error message. With that, we should be able to trace the source of the error!
Extract the archive to any folder and enter the folder
Open your installation folder of KNIME Analytics Platform in a second window
You should find a plugins/org.knime.knip.tess4j.base_1.3.3.v201906051307/tessdata folder in there
Copy chi_tra.traineddata or chi_sim.traineddata from the extracted archive into the tessdata folder in your KNIME installation
Once you have restarted KNIME, it should pick up the new training data and should be able to recognize Chinese. If it still doesn’t work, please follow the instructions of my previous post and provide additional information.
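If you would rather script the copy step than do it by hand, here is a minimal sketch in plain Java. It assumes the plugin folder name starts with org.knime.knip.tess4j.base_ as in the path above (the version suffix may differ in your installation), and the class name TessdataInstaller is just something I made up for illustration. It only copies the file; it does not check whether the training data matches the Tesseract version.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of KNIME or Tess4J.
public class TessdataInstaller {

    /**
     * Copies a .traineddata file into the tessdata folder of the Tess4J
     * plugin found under <knimeHome>/plugins and returns the copied file.
     */
    public static Path install(Path knimeHome, Path trainedData) throws IOException {
        Path plugins = knimeHome.resolve("plugins");
        // The version suffix differs between installations, so match by prefix.
        try (DirectoryStream<Path> dirs =
                 Files.newDirectoryStream(plugins, "org.knime.knip.tess4j.base_*")) {
            for (Path plugin : dirs) {
                Path tessdata = plugin.resolve("tessdata");
                if (Files.isDirectory(tessdata)) {
                    Path target = tessdata.resolve(trainedData.getFileName());
                    // Overwrite an older copy of the same language file, if any.
                    return Files.copy(trainedData, target,
                                      StandardCopyOption.REPLACE_EXISTING);
                }
            }
        }
        throw new IOException("No tessdata folder found under " + plugins);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println(
                "usage: TessdataInstaller <knime-install-dir> <file.traineddata>");
            return;
        }
        Path copied = install(Path.of(args[0]), Path.of(args[1]));
        System.out.println("Copied to " + copied);
    }
}
```

Run it with the KNIME installation folder and the path to chi_sim.traineddata (or chi_tra.traineddata) as the two arguments, then restart KNIME as described above.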
I linked to the data files for Tesseract 4.x, but the KNIME extension uses version 3.x. Could you try the two files from GitHub - tesseract-ocr/tessdata at 3.04.00 instead?