Process tabular data in an image

I would like to process tabular data in an image. The image contains a table; to be specific, it is a university certificate with grades, i.e. a report card. In the end I would like to have the text in one column and the grades in another. As far as I know, there is no KNIME node for something like that, or is there? (E.g. the ImageJ nodes are basically completely undocumented, so I am not sure whether they have a capability like that.) One idea of mine would be to have the Tesseract node parse the image and then treat each line as a table row and the last "word" as the grade. For that I would have to build on "How to split a string into multiple rows at linebreaks?". However, I would like to know whether there is a better way than this slightly ad-hoc solution.
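To make the idea concrete, here is a rough sketch in Python of the "last word is the grade" approach, e.g. as it might look inside a Python scripting node. The function name `parse_report_card` and the sample text are made up for illustration, and it assumes the OCR step returns one table row per line with the grade as the last whitespace-separated token.

```python
def parse_report_card(ocr_text):
    """Split OCR output into (subject, grade) pairs, one per non-empty line."""
    rows = []
    for line in ocr_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines in the OCR output
        # Split once at the last run of whitespace: text on the left, grade on the right.
        parts = line.rsplit(None, 1)
        if len(parts) == 2:
            subject, grade = parts
        else:
            subject, grade = parts[0], None  # single word, so no grade was found
        rows.append((subject, grade))
    return rows


# Hypothetical OCR output, only for illustration.
sample = "Linear Algebra 1.3\nIntroduction to Databases 2.0\n\nThesis"
for subject, grade in parse_report_card(sample):
    print(subject, "|", grade)
```

This would of course break as soon as the OCR merges or splits lines, or if a row has trailing text after the grade, which is why it feels ad hoc.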

I found a much nicer solution to your multi-line string problem; see the attached workflow:
Split multiline string.knwf (11.4 KB)


@gab1one Thanks, but how is this more elegant? Now you use 2 nodes (Java Snippet to break at \n, Split Collection Column to split into columns), whereas your previous idea just used the Cell Splitter node to achieve the same. In both cases you need a Transpose node or an Unpivot node afterwards (the latter is actually nicer because it is more flexible and can remove the missing values directly).

Yeah, you are right, I totally overcomplicated this :roll_eyes: