OCR Feature Request

This is a little different from what the existing Image Processing Tools can do, but how about an Optical Character Recognition node? It would be very powerful for reading in documents that are not in a searchable format (e.g. some PDFs): you could just pass the image in and get the text out.



A while back, I tried using the OSRA package with the External Tool node to do optical structure recognition (OSR) and grab structures out of PDFs.


Mixed success. It depends heavily on the quality of the images, how close annotations are to structures in the figures, and the presence of shorthand labels like R and X.

Although... now that the RDKit nodes are available, I might have another go.


(The other) Simon

See discussion in other thread: http://tech.knime.org/forum/indigo/image-ocr-feature-request#comment-28732