KNIME Keras ImageDataGenerator - Usage


Has anyone used the Keras ImageDataGenerator functionality within KNIME? If so, after I run the code to augment new images, how do I get the result back into the output_table (the node's output) so that it can be used later in the workflow? In Python the output is a zip object, but how can I then use it in, say, a KNIME learner or executor node?

image_generator = image_datagen.flow(X_train, seed=randomseed, batch_size=BatchSize, shuffle=True)
mask_generator = mask_datagen.flow(y_train, seed=randomseed, batch_size=BatchSize, shuffle=True)

Just zip the two generators to get a generator that provides augmented images and masks at the same time:

train_generator = zip(image_generator, mask_generator)
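To illustrate why both generators are given the same seed, here is a minimal sketch that runs without Keras. The function toy_flow is a hypothetical stand-in for image_datagen.flow (it is not part of Keras); it shuffles and randomly flips batches using only NumPy. Because both stand-ins draw the same random numbers, each mask batch receives exactly the same shuffle and flip as its image batch.

```python
from itertools import islice

import numpy as np


def toy_flow(data, seed, batch_size):
    """Hypothetical stand-in for ImageDataGenerator.flow (not the Keras API).

    Yields shuffled, randomly flipped batches forever.
    """
    rng = np.random.default_rng(seed)
    n = len(data)
    while True:
        order = rng.permutation(n)
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            batch = data[idx].copy()
            # Flip roughly half of the samples along the width axis.
            flip = rng.random(len(idx)) < 0.5
            batch[flip] = batch[flip][:, :, ::-1]
            yield batch


# Toy data: 4 images of shape 2x3; each mask is derived from its image.
X_train = np.arange(24, dtype=float).reshape(4, 2, 3)
y_train = X_train > 11

randomseed = 42
image_generator = toy_flow(X_train, seed=randomseed, batch_size=2)
mask_generator = toy_flow(y_train, seed=randomseed, batch_size=2)

# zip is lazy in Python 3, so zipping two endless generators is safe.
train_generator = zip(image_generator, mask_generator)

# Take a bounded number of batches; the generators never stop by themselves.
pairs = list(islice(train_generator, 3))

# Every mask batch still matches its (shuffled and flipped) image batch.
aligned = all(np.array_equal(m, x > 11) for x, m in pairs)
print(aligned)
```

With different seeds the two streams would be shuffled and flipped independently, and the masks would no longer line up with their images.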

Hi @brendanPdoherty,

the output_table variable in the Python nodes expects a pandas DataFrame. Hence, you would need to iterate over the augmentation generator and collect the data in a DataFrame.

Did you already input images into the Python node? Or did you load the data within Python? In case you loaded images into the node, you should be already familiar with the next section.

Generally, if you want to input/output images to/from Python nodes, you need to install an additional extension (see here). The linked GitHub repo also shows example code for how to use it. In your case, the steps would roughly be:

  1. Iterate over the generator with the batch size set to 1
  2. Get the numpy array holding the augmented image data
  3. Create and collect KNIPImages (as shown in the linked repo)
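The steps above can be sketched roughly as follows. This is only a sketch under stated assumptions: collect_augmented, its wrap parameter, and toy_pairs are hypothetical names introduced for illustration, and toy_pairs merely stands in for the zipped Keras generators so that the snippet runs outside KNIME. Inside a KNIME Python node you would presumably pass the extension's KNIPImage class as wrap, assuming it can be constructed from a numpy array as in the linked repo's examples. Note that Keras generators loop indefinitely, so the loop has to stop after a fixed number of samples.

```python
import numpy as np
import pandas as pd


def collect_augmented(generator, n_samples, wrap=lambda arr: arr):
    """Pull n_samples (image, mask) pairs from a generator into a DataFrame.

    Assumes the generator yields batches of size 1. `wrap` is a hypothetical
    hook; in KNIME it would be the image wrapper class from the extension.
    """
    images, masks = [], []
    for _ in range(n_samples):  # bounded: the generator itself never ends
        image_batch, mask_batch = next(generator)
        images.append(wrap(image_batch[0]))  # batch size 1 -> first element
        masks.append(wrap(mask_batch[0]))
    return pd.DataFrame({"image": images, "mask": masks})


def toy_pairs():
    """Stand-in for zip(image_generator, mask_generator) with batch size 1."""
    rng = np.random.default_rng(0)
    while True:
        image = rng.random((1, 4, 4))
        yield image, image > 0.5


# In a KNIME Python node, this DataFrame is what output_table expects.
output_table = collect_augmented(toy_pairs(), n_samples=5)
print(output_table.shape)
```

Each row of the resulting table holds one augmented image and its mask, which downstream nodes can then consume as regular table columns once the cells are wrapped in a type KNIME understands.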


Hi David

I am currently loading the image data within the Python Source node, but I also have another workflow where I use KNIME nodes to load and process the data; I have not used the ImageDataGenerator with that approach yet. Which do you think is preferable in this instance?

I’ll try the KNIPImages method, many thanks for this information.


Hi @brendanPdoherty,

if the available KNIME nodes suffice for your use case, this is the preferable option. However, I'm not sure whether there are nodes that replicate all the functionality of the Keras ImageDataGenerator, or whether it is easy to do so. It depends on what exactly you want to do. Working without Python nodes, especially when working with images, is certainly easier.