Extract text in an image

Hi,
I want to extract the text from an image. The image looks like this:


How can I get the text "FD4I" from the image using KNIME nodes?
I would appreciate it if someone could build a KNIME workflow to get the text.

Hello @Hawk326040,

You can use the Tess4J node from the KNIME Image Processing - Tess4J Integration extension. This example workflow may be a good starting point: Integration: Tess4J

Best,
Janina

Hello Janina,
Thanks for your recommendation.
Yes, I tried the Tess4J nodes, but in my test image there is no clear difference (no color edge) between the text and the background, so it needs preprocessing. I tried many filter and crop nodes and still can’t separate the text from the background. When I test my workflow with an image that has black text on a white background, it works. So the problem is that the text in my test image can’t be separated from the background.

Hello @Hawk326040,

Yes, your image needs a lot of preprocessing. There is no fixed rule for how to do that; you will need to experiment with different approaches to increase the contrast between text and background.
Here is an example workflow that might help you get started: Application: Difference Of Gaussians
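
To illustrate what a Difference of Gaussians does to a low-contrast image, here is a minimal, dependency-free Python sketch. It is not the linked KNIME workflow; the function names and the sigma values are illustrative assumptions, and the image is just a nested list of grayscale values. The idea is to blur the image twice with different widths and subtract, so that flat background cancels out and thin strokes remain.

```python
import math


def gaussian_kernel(sigma):
    """1-D Gaussian kernel with radius ceil(3 * sigma), normalised to sum to 1."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_rows(image, kernel):
    """Convolve every row with the kernel, clamping indices at the borders."""
    radius = len(kernel) // 2
    width = len(image[0])
    return [
        [
            sum(k * row[min(max(x + j - radius, 0), width - 1)]
                for j, k in enumerate(kernel))
            for x in range(width)
        ]
        for row in image
    ]


def transpose(image):
    """Swap rows and columns of a nested-list image."""
    return [list(column) for column in zip(*image)]


def gaussian_blur(image, sigma):
    """Separable Gaussian blur: blur the rows, then blur the columns."""
    kernel = gaussian_kernel(sigma)
    horizontal = blur_rows(image, kernel)
    return transpose(blur_rows(transpose(horizontal), kernel))


def difference_of_gaussians(image, sigma_small=1.0, sigma_large=2.0):
    """Subtract a wide blur from a narrow blur (band-pass filter)."""
    fine = gaussian_blur(image, sigma_small)
    coarse = gaussian_blur(image, sigma_large)
    return [[f - c for f, c in zip(fine_row, coarse_row)]
            for fine_row, coarse_row in zip(fine, coarse)]


# Hypothetical test image: background 100 with a faint vertical stroke of 104.
faint_stroke = [[104 if x == 4 else 100 for x in range(9)] for _ in range(9)]
response = difference_of_gaussians(faint_stroke)
```

In `response`, the stroke column comes out positive while the flat background stays near zero or slightly negative, which makes a simple threshold much easier to choose. For real images, a library routine such as `scipy.ndimage.gaussian_filter` would be the more practical choice than this pure-Python version.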

Kind regards,
Janina

Thanks, Janina.
It’s difficult to increase the contrast between the text and the background because their colors are almost the same.
I am now trying EasyOCR in a Python node, and the accuracy has increased to 80%.

Hello @Hawk326040,

Wow, that’s great. Would you like to share your workflow on the Hub? I think other people might be interested in it, and I would also like to have a look. :blush:

Kind regards,
Janina

Hi Janina,
Sorry for the late reply.
Here is my solution, which uses EasyOCR in a Python node.
EasyOCR.knar (1.6 MB)
To make the workflow run, install the Python packages listed in requirements.txt.
requirements.txt (404 Bytes)
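
For anyone who cannot open the attachment, here is a rough sketch of what calling EasyOCR from Python can look like. It is not necessarily what the attached workflow does: the helper names `run_easyocr` and `join_confident_text`, the confidence threshold, and the left-to-right ordering are assumptions made for illustration. It relies on `easyocr.Reader(...).readtext(...)` returning a list of `(bounding_box, text, confidence)` entries.

```python
def run_easyocr(image_path, languages=("en",)):
    """Run EasyOCR on one image file and return its raw detections."""
    # Imported lazily so the helper below can be used without EasyOCR installed.
    import easyocr

    reader = easyocr.Reader(list(languages))
    return reader.readtext(image_path)


def join_confident_text(results, min_confidence=0.5):
    """Join detected text fragments from left to right, dropping weak detections.

    Each entry in `results` is (bounding_box, text, confidence), where
    bounding_box is a list of four [x, y] corner points.
    """
    ordered = sorted(results, key=lambda entry: min(point[0] for point in entry[0]))
    return " ".join(text for _box, text, confidence in ordered
                    if confidence >= min_confidence)
```

Inside a KNIME Python node you would call `run_easyocr` with the path of each image and write the result of `join_confident_text` to an output column. The threshold of 0.5 is only a starting value and probably needs tuning for low-contrast images like the one in this thread.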
