Tess4J for Chinese: Execute failed: Invalid memory access

thu · December 30, 2019, 10:25pm

Hi,

I'm using Tess4J to OCR images that contain Chinese text. With the default setting for English characters, it works fine. However, when I use external training data and OCR the image as Chinese, I get an error: Execute failed: Invalid memory access.

If anyone has run into the same problem and has a solution, please let me know. Thank you so much.

Best

Tao

Iris · January 2, 2020, 7:51am

Hi Tao,

The image upload seems to have failed. Could you re-upload it?

Thank you! Iris

stelfrich · January 7, 2020, 11:25am

Hi @thu,

If possible, could you provide a workflow with some example data to reproduce the issue? It would also be very helpful if you could provide the log file (see Bug Reporting Best Practices - Knowledge sharing - KNIME Community Forum for more information), which should contain a more detailed error message. With that, we should be able to trace the source of the error message!

Best,
Stefan

thu · January 7, 2020, 2:32pm

Hi Iris

Thank you for the reply. I think the problem may be that the training data for Chinese text is not right.

Could you give me any suggestions on how to use Tess4J to OCR Chinese characters? Thanks.

Best,

Tao

thu · January 7, 2020, 2:32pm

Hi Stefan

Thank you for the reply. I think the problem may be that the training data for Chinese text is not right.

Could you give me any suggestions on how to use Tess4J to OCR Chinese characters? Thanks.

Best,

Tao

stelfrich · January 8, 2020, 9:11am

Hi Tao,

I haven’t verified if this works, but it is definitely worth a try:

1. Download either the Chinese Traditional or Chinese Simplified archive.
2. Extract the archive to any folder and enter the folder.
3. Open your installation folder of KNIME Analytics Platform in a second window.
4. You should find a plugins/org.knime.knip.tess4j.base_1.3.3.v201906051307/tessdata folder in there.
5. Copy chi_tra.traineddata or chi_sim.traineddata from the extracted archive into the tessdata folder in your KNIME installation.

Once you have restarted KNIME, it should pick up the new training data and should be able to recognize Chinese. If it still doesn’t work, please follow the instructions of my previous post and provide additional information.
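
If you would rather script the copy step, here is a minimal Python sketch. I haven't run it against a real KNIME installation. The function name install_traineddata and both folder arguments are made up for illustration, and the plugin folder name will differ depending on the extension version you have installed.

```python
import shutil
from pathlib import Path


def install_traineddata(extracted_dir, tessdata_dir, languages=("chi_sim", "chi_tra")):
    """Copy <lang>.traineddata files from extracted_dir into tessdata_dir.

    Hypothetical helper: returns the language codes that were actually copied.
    """
    src = Path(extracted_dir)
    dst = Path(tessdata_dir)

    # Fail early if the tessdata folder of the KNIME plugin cannot be found.
    if not dst.is_dir():
        raise FileNotFoundError(f"tessdata folder not found: {dst}")

    copied = []
    for lang in languages:
        candidate = src / f"{lang}.traineddata"
        # Skip languages that are not present in the extracted archive.
        if candidate.is_file():
            shutil.copy2(candidate, dst / candidate.name)
            copied.append(lang)
    return copied
```

You would call it with the extracted archive folder as the first argument and the plugins/org.knime.knip.tess4j.base_1.3.3.v201906051307/tessdata folder of your KNIME installation as the second. An empty return value means no matching .traineddata file was found in the extracted folder.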

Best,
Stefan

christine_ywl · July 30, 2021, 7:42am

Hi Stelfrich,

Could you please re-post the two archive files? I can no longer download them. Thanks!

Best regards,
CY

stelfrich · July 30, 2021, 8:00am

Hi @christine_ywl,

You should be able to get them from

Best,
Stefan

christine_ywl · August 3, 2021, 2:56am

Hi Stefan,

Thanks! I tried to use chi_sim to recognize a PDF file but encountered the following error message:

ERROR Tess4J 3:137 Execute failed: Invalid memory access

Could you please advise what I should do to get this resolved?

Best regards,
CY

stelfrich · August 3, 2021, 9:49am

I linked to the data files for Tesseract 4.x, but the KNIME extension uses version 3.x. Could you try the two files from GitHub - tesseract-ocr/tessdata at 3.04.00?

Best,
Stefan

christine_ywl · August 4, 2021, 2:41am

I replaced the files, but I still cannot execute the node. These are the error messages:

ERROR Tess4J 3:137 Error initializing Tesseract.
ERROR Tess4J 3:137 Execute failed: Invalid memory access

system · June 2, 2023, 9:11pm

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.