Trouble reading PNG files

Since I had trouble reading JPEG files, I used ImageMagick to convert them to PNG, which seems to be better supported. However, those files are also not read by the Image Reader (Table) node. Image handling appears fairly fragile with the image sources I'm using, even though the files display fine in browsers and the Windows image viewer.

Also, the tools to cast a binary blob (e.g. one read from a file or stream) to an image type are weak; there is only a blob-to-PNG casting node, and no other formats are supported.

Versions:

KNIME Image Processing    1.5.4.201706010607    org.knime.knip.feature.feature.group    University of Konstanz / KNIME

KNIME Image Processing - Deeplearning4J Integration    1.1.0.v201612201033    org.knime.knip.dl4j.feature.feature.group    University of Konstanz, Germany

KNIME Analytics Platform    3.3.2.v201704061137    org.knime.product.desktop    null

Platform: Windows 10

Error:

2017-06-02 12:31:46,948 : DEBUG : KNIME-Worker-20 : ImgReaderTableNodeModel : Image Reader (Table) : 0:110:67 : Encountered exception while reading image:
io.scif.img.ImgIOException: javax.imageio.IIOException: Error reading PNG image data
    at io.scif.img.ImgOpener.openImgs(ImgOpener.java:389)
    at io.scif.img.ImgOpener.openImg(ImgOpener.java:529)
    at org.knime.knip.io.ScifioImgSource.getImg(ScifioImgSource.java:297)
    at org.knime.knip.io.nodes.imgreader2.AbstractReadImgFunction.readImageAndMetadata(AbstractReadImgFunction.java:138)
    at org.knime.knip.io.nodes.imgreader2.readfrominput.ReadImgTableFunction.lambda$0(ReadImgTableFunction.java:101)
    at java.util.stream.Streams$RangeIntSpliterator.forEachRemaining(Streams.java:110)
    at java.util.stream.IntPipeline$Head.forEachOrdered(IntPipeline.java:567)
    at org.knime.knip.io.nodes.imgreader2.readfrominput.ReadImgTableFunction.apply(ReadImgTableFunction.java:95)
    at org.knime.knip.io.nodes.imgreader2.readfrominput.ImgReaderTableNodeModel.execute(ImgReaderTableNodeModel.java:154)
    at org.knime.core.node.NodeModel.execute(NodeModel.java:732)
    at org.knime.core.node.NodeModel.executeModel(NodeModel.java:566)
    at org.knime.core.node.Node.invokeFullyNodeModelExecute(Node.java:1128)
    at org.knime.core.node.Node.execute(Node.java:915)

Hi Matthew.

I tried reproducing the issue on both Linux and Windows (using the same versions as you) without success; the image you provided is read without an issue.

Are you reading the image files from a network share or a similar source that is not physically located on your computer? The stack trace you posted suggests there might be an issue with accessing the bytes when reading the file.

I wrote a minimal Java program (attached jar file) that replicates just the PNG-reading part. It would be very helpful if you could try reading a few of the offending images with it, to see whether you get the same error there.
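The attached jar is not reproduced here, but the core of such a check might look like the following sketch. The class and method names are placeholders of my own; it uses the same javax.imageio path that appears in your stack trace, so a file that fails in KNIME should fail here too:

```java
import javax.imageio.ImageIO;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;

public class PngReadCheck {

    /** Try to decode one file via javax.imageio, the API shown in the stack trace. */
    static String check(File f) {
        try {
            BufferedImage img = ImageIO.read(f);
            if (img == null) {
                return f.getName() + ": no suitable image reader found";
            }
            return f.getName() + ": OK (" + img.getWidth() + "x" + img.getHeight() + ")";
        } catch (IOException e) {
            // A corrupt file or an I/O problem on a network share would surface here,
            // as an IIOException like the one in the KNIME log.
            return f.getName() + ": FAILED - " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Pass the offending PNG files as command-line arguments.
        for (String path : args) {
            System.out.println(check(new File(path)));
        }
    }
}
```

Running it as `java PngReadCheck img1.png img2.png ...` prints one status line per file.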

best,

Gabriel

It may be a memory problem; at one point I also ran out of disk space, which may have caused the problems accessing the bytes. I noticed that the Image Reader node that reads the PNG images holds about 5 GB of data (the size of its folder in the workspace), but the 'PNGImage to ImgPlus' node that follows holds 55 GB. That roughly 10x expansion may be the problem. My test case has only 5,000 PNG images, so it does not look feasible to handle 50,000 images in this context.

The text mining tools also seem to run out of heap, even though the text associated with the images is only about 5 MB, and I've filtered out the image columns for the text processing with the goal of joining them back together afterwards.

If you have memory issues, the first band-aid is to process your input data chunkwise (if possible), e.g. by using the Chunk Loop Start node in combination with any Loop End node, so that only a group of files is read and processed at a time. Additionally, you can use the Don't Save Start and Don't Save End nodes to define parts of the workflow that are not saved to disk, which can further improve performance.
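For intuition, the chunking that Chunk Loop Start performs can be sketched outside of KNIME like this (a minimal illustration, not KNIME code; the file list and chunk size are made up):

```java
import java.util.ArrayList;
import java.util.List;

public class ChunkedProcessing {

    /** Split the input into fixed-size chunks, as Chunk Loop Start does with table rows. */
    static <T> List<List<T>> chunks(List<T> items, int chunkSize) {
        List<List<T>> out = new ArrayList<>();
        for (int i = 0; i < items.size(); i += chunkSize) {
            out.add(items.subList(i, Math.min(i + chunkSize, items.size())));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> files = List.of("a.png", "b.png", "c.png", "d.png", "e.png");
        for (List<String> chunk : chunks(files, 2)) {
            // Only this chunk's images need to be in memory at a time; the loop body
            // processes them and appends results before the next chunk is loaded.
            System.out.println("processing " + chunk);
        }
    }
}
```

The peak memory use is then bounded by the chunk size rather than by the total number of images.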

I attached an example workflow which should make this process clear.