Image Column - access meta data in Python extension

MartinDDDD · September 10, 2024, 9:20am

Hello (again),

I’ve been trying to solve this problem for a while now, have tried / experimented, but can’t seem to make progress - maybe someone of you can help me!

TL:DR:
Long story short: Is there a way for me to convert the dict I get in python for an Image cell into the Image object, which contains meta data like the source?

The Idea
I have started to improve my previous node that allowed prompting OpenAI and using the structured output feature. I have recently played around with vision models and did not find a way to prompt a vision model with KNIME. So I wrote my own python script, which also takes care of encoding the image to Base64 by taking the local file path. In my original workflow where I prompted via python script, I get the file URI via a range of nodes chained after Read Files / Folder Node.

Whilst trying to solve this problem, I noticed that when viewing the image column in a table editor JavaScript, that it appears that the metadata, like the path to the file on the local drive, is inside an image object by accessing Image[“source”]:

View in Table:

View in Table Editor:

Right now I am trying to also add the option to add images to the prompt and my idea was, to avoid having to chain multiple nodes to get the file path, to extract this from this Image object that comes out of the Image Reader Node.

What I am struggling with:

I’m struggling to “access” this image object - when printing the content of the image column in a Python Script Node I can see that it is shown as a dict with key 0 and 1:

I also get this information which seems to relate to the cell class:

Name: Image, dtype: PandasLogicalTypeExtensionType(struct<0: extension<knime.struct_dict_encoded<StructDictEncodedType>>, 1: extension<knime.struct_dict_encoded<StructDictEncodedType>>>, {"value_factory_class":"org.knime.core.data.v2.value.cell.DictEncodedDataCellValueFactory","data_type":{"cell_class":"org.knime.knip.base.data.img.ImgPlusCell"}})

When I convert to dict and print I get:

{'image_0.jpg': {'0': b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x01\x00\x00\x00$\x00\x00\x00aa5aac11-887f-48f8-b5a1-f43c5f8b501b\x00\x00\x00\x00&\x00\x00\x000_681f77e6-0314-4689-a1ca-691cad962688\xff\xff\xff\xff', '1': 'org.knime.knip.base.data.img.ImgPlusCell'}}

I already tried to add knip as a dependency (it is definitely installed) and also ran knime.extension.supported_value_types() however the only image data type I see in the list is PIL.Image.Image and that did not help.

I take that the data behind the ‘0’ key is somewhat the serialised / encoded Image object:

b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x01\x00\x00\x00$\x00\x00\x00aa5aac11-887f-48f8-b5a1-f43c5f8b501b\x00\x00\x00\x00&\x00\x00\x000_681f77e6-0314-4689-a1ca-691cad962688\xff\xff\xff\xff'

and the data behind the ‘1’ key:

org.knime.knip.base.data.img.ImgPlusCell

is supposed to help to “translate it”.

I already looked into knime.api.types and found:

get_converter()

However when instantiating an object by passing cell_class / value_factory_class from the below:

value_factory_class":"org.knime.core.data.v2.value.cell.DictEncodedDataCellValueFactory",
"data_type":{"cell_class":"org.knime.knip.base.data.img.ImgPlusCell"

However this resulted in an instance of FallbackPythonValueFactory, and when using decode() method it doesn’t help either…

Long story short: Is there a way for me to convert the dict I get in python for an Image cell into the Image object, which contains meta data like the source?

MartinDDDD · September 30, 2024, 8:58am

bump…

Anyone any idea or is it not clear what I am trying to do here?

takbb · September 30, 2024, 9:48am

Hi @MartinDDDD, I think we are all so used to you finding solutions for other people these days that we see your name and assume you’ve got it all in hand,

Not sure I will be able to help, but I can ask some questions to get the conversation started.

First off, do you have a simple demo of what you have shown above that you can share so we can have a play?

So you are wanting to write the “meta data” for the image (name, source, dimensions etc) - which KNIME appears to hold - to the console within the Python node? Is that the ultimate goal?

Thinking out loud here… is the meta data perhaps Exif data (which I think can be extracted using the PIL library in python), or does Exif not contain the info that KNIME is displaying?

MartinDDDD · September 30, 2024, 12:48pm

Hey there

That seems to be a luxury problem to have xD.

Ok let me try to explain a bit differently - reading through the above again I must have been “deep in the zone”, maybe close to turning insane xD:

I’ve create a Python-based KNIME Extension that includes a node that allows to prompt vision models GitHub - miuuu1/FinnovationFlows-Extension: KNIME Extension by Finnovation Flows
As of now, there are a fair few nodes required to wrangle an image into base64 format, so that it can be provided to the vision model (see image)
I am looking for a way to move some of that into the vision model prompter node itself, to make it easier to use - i.e. I want to get rid of the steps in the red box in the image
I noticed that when reading an image initially, it’s not the “full image” that is loaded, but some sort of meta data. In my initial tests I “base64-ed” the Image column and got weird responses from the models - when I then printed the base64 string and decoded I saw that it was this Image object as shown in the image below in the Table View.
That lead to the idea to grab the local path to the image inside my vision prompter node and to take care of the base64 conversion inside the node
When I tried to access the Image object, I got errors and on further investigation, I found that the data I get in Python does not (yet) marry up with what I see in the cell when looking at it via java script table view node, but it seems to be encoded:
{'image_0.jpg': {'0': b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x01\x00\x00\x00$\x00\x00\x00aa5aac11-887f-48f8-b5a1-f43c5f8b501b\x00\x00\x00\x00&\x00\x00\x000_681f77e6-0314-4689-a1ca-691cad962688\xff\xff\xff\xff', '1': 'org.knime.knip.base.data.img.ImgPlusCell'}}
So I’m trying to find a way how I can pass the Image column in the image below directly to my vision prompter node. In the code behind my node I want to then access the local path in the Image object so I can use that to load the image, convert to base64 and send it on its way to the vision model api inside the code of my node.

My working assumption is that there must be a way to convert the “gibberish” above into the Image object… I did dive quite deep into the different python modules and found experimented however with not much success.

Image:

Link to the example workflow I created for the extension (Screenshot taken from there with the Table View node added)

takbb · September 30, 2024, 1:12pm

Hi, I’m not sure if this will help progress things or not, but there are a couple of nodes:

The Image Properties can be added after your Image Reader (Table) and you can select the meta info to return, via the features tab in config.

The Binary Objects to PNGs gives you a PNG Image in your table, instead of “just” an image. I don’t know if that’s helpful or not in getting you where you want to go!
Edit: I now see you already had a PNG image after the Ungroup following ImgPlus to PNG images, so that’s probably not of help other than to point out yet another PNG node

takbb · September 30, 2024, 1:39pm

@MartinDDDD , Reading more about the subject of images and python in KNIME, did you see this topic?:

and @mlauber71 's suggestion:

MartinDDDD · September 30, 2024, 3:07pm

Thanks @takbb - will see if this can make things more efficient… ideally I want to go straight from Read Image node to the vision model prompter and handle everything there (User selects the image column and the node takes care of all the troublesome stuff like converting to b64…).

Let’s see

takbb · September 30, 2024, 3:37pm

There was also a recent thread I was involved in that I just remembered regarding storing images in a local database

gonhaddock · September 30, 2024, 3:38pm

Hello @MartinDDDD
You can read the images from Py PIL, avoiding the Image Reader (Table) node. And then, convert it to base64 in the same Script node with little coding.

https://stackoverflow.com/questions/31826335/how-to-convert-pil-image-image-object-to-base64-string

The only input needed is a table with the images’ URL locations.

BR

gonhaddock · October 2, 2024, 1:49pm

Hello @MartinDDDD
I’ve been trying to deploy the idea from my latest post. Then, I’m releasing a workflow that is encoding base64 in Python from the sourcing JPG file.

I don’t fully understand the context of the challenge; so I don’t know either if transforming to PNGs, it is a necessary step. I’ve included a PNG file within the sample data, and it’s currently performing ok with mixed files.

BR

PS1.- with some code update, it would read from www if preferred

PS2.- I’ve just updated the Py code in workflow to report Image Dimensions column

Brain · October 3, 2024, 9:28am

Hi,
thanks for your answer. I use your workflow. Now how must i continue to get the image (image view) in my report ?

Thanks
Br

k10shetty1 · November 11, 2024, 2:01pm

Hello @Brain,

If you’re looking to display images directly in your report, you can use the Table View node.

Best,
Keerthan

system · February 9, 2025, 2:01pm

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.