OCR Feature Request

This is a little different from what the existing Image Processing Tools can do, but how about an Optical Character Recognition node? It would be very powerful for reading in documents that are not in a searchable format (e.g. some PDFs): you could just feed the image in and get the text out.

 

Simon.

A while back, I tried using the OSRA package with the External Tool node to do optical structure recognition (OSR) and grab chemical structures out of PDFs.

http://cactus.nci.nih.gov/osra/

Mixed success. It depends heavily on the quality of the images, how close annotations are to the structures in figures, and the presence of shorthand labels like R and X.
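
For anyone wanting to try OSRA outside the External Tool node, here is a rough Python sketch of calling it on a single image. It is only an illustration, not the workflow I used: it assumes an `osra` executable is on the PATH and that, given an image file as its only argument, it prints one recognised structure (as SMILES) per line to standard output. The function names `run_osra` and `parse_osra_output` are made up for this example.

```python
import shutil
import subprocess
from typing import List


def parse_osra_output(stdout: str) -> List[str]:
    """Return the non-empty, stripped lines of the tool's output.

    Assumption: each non-empty line is one recognised structure.
    """
    return [line.strip() for line in stdout.splitlines() if line.strip()]


def run_osra(image_path: str, executable: str = "osra") -> List[str]:
    """Run OSRA on one image file and return the structures it reports.

    Assumption: `executable` is on PATH and writes its results to stdout
    when called with the image path as its only argument.
    """
    if shutil.which(executable) is None:
        raise FileNotFoundError(f"{executable!r} was not found on PATH")
    result = subprocess.run(
        [executable, image_path],
        capture_output=True,  # collect stdout/stderr instead of printing them
        text=True,            # decode the output to str
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return parse_osra_output(result.stdout)


# Example call (hypothetical file name):
# structures = run_osra("figure_from_pdf.png")
```

The parsing is kept separate from the subprocess call so it can be checked without OSRA installed. Pages of a PDF would still need to be converted to images first.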

Although... now that the RDKit nodes are available, I might have another go.

 

(The other) Simon

See the discussion in the other thread: http://tech.knime.org/forum/indigo/image-ocr-feature-request#comment-28732