Image to json

AnthonyCREng · November 14, 2022, 3:14am

Hi Knime Community.

I have been wondering about the following scenario: let's imagine we have an image that shows a snapshot of an avro_schema, and we want to convert that image into a JSON file.

Is there something in KNIME that would let us do this in a practical way?
Note: I mean something related to image recognition that can analyze and identify the syntax of a document, and then infer and convert that syntax into the actual document type that uses those syntactic patterns (e.g. JSON documents, which contain characters such as {“”:“”,“”:“”}).

I would appreciate your comments.

badger101 · November 14, 2022, 11:53am

Hi @AnthonyCREng, what output do you get if you use this node: Tess4J – KNIME Hub?

AnthonyCREng · November 20, 2022, 3:18am

Hi badger101, thanks for your prompt response.

Unfortunately, the Tess4J node does not work for this scenario. According to the GitHub repository (GitHub - tesseract-ocr/tesseract: Tesseract Open Source OCR Engine (main repository)), Tesseract-OCR is described as follows:

Tesseract has unicode (UTF-8) support, and can recognize more than 100 languages “out of the box”.
Tesseract supports various image formats including PNG, JPEG and TIFF.
Tesseract supports various output formats: plain text, hOCR (HTML), PDF, invisible-text-only PDF, TSV and ALTO (the last one - since version 4.1.0).

In summary, based on the information from (Languages supported in different versions of Tesseract | tessdoc), Tesseract is an open-source text recognition (OCR) engine aimed specifically at natural languages.
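Since plain text is one of the output formats listed above, one possible follow-up is to post-process the recognized text and try to parse it as JSON. The snippet below is only a minimal sketch of that idea, not something tested inside KNIME: the function name `ocr_text_to_json` and the sample string are hypothetical, the OCR step itself is assumed to have already happened, and real OCR output would likely contain more errors than the curly quotes handled here.

```python
import json

# Assumption: the OCR engine returned typographic (curly) double quotes
# instead of the ASCII quotes that JSON requires.
_QUOTE_FIXES = str.maketrans({"\u201c": '"', "\u201d": '"'})


def ocr_text_to_json(ocr_text: str):
    """Normalize curly quotes in OCR'd text and parse the result as JSON.

    Raises json.JSONDecodeError (a ValueError) if the cleaned text
    is still not valid JSON.
    """
    cleaned = ocr_text.translate(_QUOTE_FIXES).strip()
    return json.loads(cleaned)


# Hypothetical OCR output for an image of a small Avro schema.
sample = (
    "{ \u201ctype\u201d: \u201crecord\u201d, \u201cname\u201d: \u201cUser\u201d, "
    "\u201cfields\u201d: [ {\u201cname\u201d: \u201cid\u201d, \u201ctype\u201d: \u201clong\u201d} ] }"
)

schema = ocr_text_to_json(sample)
print(json.dumps(schema, indent=2))
```

If parsing fails, the raised error reports the position of the first offending character, which can help to spot where the OCR result needs manual correction.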
