Image Column - access meta data in Python extension

Hello (again),

I’ve been trying to solve this problem for a while now and have experimented a lot, but can’t seem to make progress - maybe one of you can help me!

TL;DR:
Long story short: Is there a way for me to convert the dict I get in python for an Image cell into the Image object, which contains meta data like the source?

The Idea
I have started to improve my previous node that allowed prompting OpenAI and using the structured output feature. I have recently played around with vision models and did not find a way to prompt a vision model with KNIME. So I wrote my own python script, which also takes care of encoding the image to Base64 by taking the local file path. In my original workflow where I prompted via python script, I get the file URI via a range of nodes chained after Read Files / Folder Node.
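
For readers following along, the "encode the image to Base64 from a local file path" step could look roughly like the sketch below. It is not the script from this post; the function names are made up for illustration, and wrapping the result in a data URL is an assumption about what the target API accepts, so check that against the API you are calling.

```python
import base64
import mimetypes
from pathlib import Path


def to_data_url(data: bytes, mime_type: str) -> str:
    """Base64-encode raw image bytes and wrap them in a data URL."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def image_file_to_data_url(path: str) -> str:
    """Read an image file from disk and return it as a data URL."""
    file_path = Path(path)
    # Guess the MIME type from the file extension; fall back to a generic type.
    mime_type, _ = mimetypes.guess_type(file_path.name)
    return to_data_url(file_path.read_bytes(), mime_type or "application/octet-stream")
```

Note that this encodes the bytes of the file itself, not the content of the KNIME Image cell, which is why a local path is needed in the first place.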

While trying to solve this, I noticed something when viewing the image column in a Table Editor (JavaScript) node: the metadata, such as the path to the file on the local drive, appears to sit inside an image object and can be reached via Image[“source”]:

View in Table:
image

View in Table Editor:

Right now I am trying to add the option to include images in the prompt. To avoid having to chain multiple nodes just to get the file path, my idea was to extract it from the Image object that comes out of the Image Reader node.

What I am struggling with:

I’m struggling to “access” this image object - when printing the content of the image column in a Python Script Node I can see that it is shown as a dict with key 0 and 1:

I also get this information which seems to relate to the cell class:

Name: Image, dtype: PandasLogicalTypeExtensionType(struct<0: extension<knime.struct_dict_encoded<StructDictEncodedType>>, 1: extension<knime.struct_dict_encoded<StructDictEncodedType>>>, {"value_factory_class":"org.knime.core.data.v2.value.cell.DictEncodedDataCellValueFactory","data_type":{"cell_class":"org.knime.knip.base.data.img.ImgPlusCell"}})

When I convert to dict and print I get:

{'image_0.jpg': {'0': b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x01\x00\x00\x00$\x00\x00\x00aa5aac11-887f-48f8-b5a1-f43c5f8b501b\x00\x00\x00\x00&\x00\x00\x000_681f77e6-0314-4689-a1ca-691cad962688\xff\xff\xff\xff', '1': 'org.knime.knip.base.data.img.ImgPlusCell'}}

I already tried adding knip as a dependency (it is definitely installed) and also ran knime.extension.supported_value_types(); however, the only image data type I see in the list is PIL.Image.Image, and that did not help.

I take it that the data behind the ‘0’ key is the serialised / encoded Image object in some form:

b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x01\x00\x00\x00$\x00\x00\x00aa5aac11-887f-48f8-b5a1-f43c5f8b501b\x00\x00\x00\x00&\x00\x00\x000_681f77e6-0314-4689-a1ca-691cad962688\xff\xff\xff\xff'

and the data behind the ‘1’ key:

org.knime.knip.base.data.img.ImgPlusCell

is supposed to help to “translate it”.
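
One way to sanity-check that assumption without any KNIME API is to look at which parts of the byte string are readable text. The sketch below only extracts printable ASCII runs; it makes no claim about the actual serialisation format, and the helper name is made up for illustration.

```python
import re

# The value behind the '0' key, copied from the output above.
payload = (
    b"\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x01"
    b"\x00\x00\x00$\x00\x00\x00aa5aac11-887f-48f8-b5a1-f43c5f8b501b"
    b"\x00\x00\x00\x00&\x00\x00\x000_681f77e6-0314-4689-a1ca-691cad962688"
    b"\xff\xff\xff\xff"
)


def printable_runs(data: bytes) -> list:
    """Return every run of at least 8 printable ASCII characters in data."""
    return [m.decode("ascii") for m in re.findall(rb"[\x20-\x7e]{8,}", data)]


for run in printable_runs(payload):
    print(run)
```

The only readable parts are two UUID-like identifiers, and the whole value is far too short to hold pixel data. That suggests (as an interpretation, not something confirmed in this thread) that the cell holds a reference to an image stored elsewhere rather than the image or its source path.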

I already looked into knime.api.types and found:

get_converter()

However, when I instantiate an object by passing the cell_class / value_factory_class from below:

"value_factory_class":"org.knime.core.data.v2.value.cell.DictEncodedDataCellValueFactory",
"data_type":{"cell_class":"org.knime.knip.base.data.img.ImgPlusCell"}

the result is an instance of FallbackPythonValueFactory, and using its decode() method doesn’t help either…

Long story short: Is there a way for me to convert the dict I get in python for an Image cell into the Image object, which contains meta data like the source?

bump…

Does anyone have an idea, or is it not clear what I am trying to do here? :slight_smile:

Hi @MartinDDDD, I think we are all so used to you finding solutions for other people these days that we see your name and assume you’ve got it all in hand, :wink:

Not sure I will be able to help, but I can ask some questions to get the conversation started.

First off, do you have a simple demo of what you have shown above that you can share so we can have a play?

So you are wanting to write the “meta data” for the image (name, source, dimensions etc) - which KNIME appears to hold - to the console within the Python node? Is that the ultimate goal?

Thinking out loud here… is the meta data perhaps Exif data (which I think can be extracted using the PIL library in python), or does Exif not contain the info that KNIME is displaying?
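
For anyone who wants to try the Exif route, a minimal sketch with Pillow is below. It assumes Pillow is installed in the Python environment, and the function name is made up for illustration. Exif is stored inside the image file itself; whether it overlaps with what KNIME displays is an open question in this thread.

```python
from PIL import ExifTags, Image


def describe_image(source):
    """Return format, pixel size and any Exif tags of an image.

    source can be a file path or an open binary file object.
    """
    with Image.open(source) as img:
        # Map numeric Exif tag ids to readable names where Pillow knows them.
        exif = {
            ExifTags.TAGS.get(tag_id, tag_id): value
            for tag_id, value in img.getexif().items()
        }
        return {"format": img.format, "size": img.size, "exif": exif}
```

Many files carry no Exif data at all, in which case the "exif" entry is simply an empty dict.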

Hey there :slight_smile:

That seems to be a luxury problem to have xD.

Ok let me try to explain a bit differently - reading through the above again I must have been “deep in the zone”, maybe close to turning insane xD:

  1. I’ve created a Python-based KNIME extension that includes a node for prompting vision models (GitHub - miuuu1/FinnovationFlows-Extension: KNIME Extension by Finnovation Flows).
  2. As of now, a fair few nodes are required to wrangle an image into base64 format so that it can be passed to the vision model (see image).
  3. I am looking for a way to move some of that into the vision model prompter node itself to make it easier to use, i.e. I want to get rid of the steps in the red box in the image.
  4. I noticed that when an image is first read, it’s not the “full image” that is loaded, but some sort of metadata. In my initial tests I “base64-ed” the Image column and got weird responses from the models. When I then printed the base64 string and decoded it, I saw that it was this Image object, as shown in the image below in the Table View.
  5. That led to the idea of grabbing the local path to the image inside my vision prompter node and taking care of the base64 conversion inside the node.
  6. When I tried to access the Image object, I got errors. On further investigation, I found that the data I get in Python does not (yet) match what I see in the cell when looking at it via the JavaScript Table View node; it seems to be encoded:
     {'image_0.jpg': {'0': b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x01\x00\x00\x00$\x00\x00\x00aa5aac11-887f-48f8-b5a1-f43c5f8b501b\x00\x00\x00\x00&\x00\x00\x000_681f77e6-0314-4689-a1ca-691cad962688\xff\xff\xff\xff', '1': 'org.knime.knip.base.data.img.ImgPlusCell'}}
  7. So I’m trying to find a way to pass the Image column in the image below directly to my vision prompter node. In the code behind my node, I then want to read the local path from the Image object so I can load the image, convert it to base64 and send it to the vision model API.

My working assumption is that there must be a way to convert the “gibberish” above into the Image object… I dived quite deep into the different Python modules and experimented, but without much success.

Image:

Link to the example workflow I created for the extension (Screenshot taken from there with the Table View node added)

Hi, I’m not sure if this will help progress things or not, but there are a couple of nodes:

The Image Properties node can be added after your Image Reader (Table), and you can select the meta info to return via the Features tab in the config.

The Binary Objects to PNGs gives you a PNG Image in your table, instead of “just” an image. I don’t know if that’s helpful or not in getting you where you want to go!
Edit: I now see you already had a PNG image after the Ungroup following ImgPlus to PNG images, so that’s probably not of help other than to point out yet another PNG node :wink:

@MartinDDDD , Reading more about the subject of images and python in KNIME, did you see this topic?:

and @mlauber71 's suggestion:

Thanks @takbb - will see if this can make things more efficient… ideally I want to go straight from Read Image node to the vision model prompter and handle everything there (User selects the image column and the node takes care of all the troublesome stuff like converting to b64…).

Let’s see :slight_smile:

There was also a recent thread I was involved in that I just remembered regarding storing images in a local database

Hello @MartinDDDD
You can read the images with PIL in Python and skip the Image Reader (Table) node, then convert them to base64 in the same Script node with a little coding.

https://stackoverflow.com/questions/31826335/how-to-convert-pil-image-image-object-to-base64-string

The only input needed is a table with the images’ URL locations.
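
If the locations arrive as file:// URIs rather than plain paths, they need to be turned into local paths before the file can be opened. The sketch below is one way to do that with the standard library; the function name is made up for illustration, and which kind of location string a given workflow produces is an assumption to verify.

```python
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import url2pathname


def to_local_path(location: str) -> Path:
    """Turn a plain file path or a file:// URI into a local Path."""
    parsed = urlparse(location)
    if parsed.scheme == "file":
        # url2pathname undoes percent-encoding (e.g. %20) and handles Windows drives.
        return Path(url2pathname(parsed.path))
    if len(parsed.scheme) <= 1:
        # No scheme, or a single letter that is really a Windows drive ("C:\...").
        return Path(location)
    raise ValueError(f"Not a local file location: {location!r}")
```

Anything that is not a local file (for example an http URL) raises a ValueError here, so remote images would need separate handling.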

BR

Hello @MartinDDDD
I’ve been working on the idea from my last post, and I’m sharing a workflow that base64-encodes the source JPG file in Python.

I don’t fully understand the context of the challenge, so I also don’t know whether converting to PNG is a necessary step. I’ve included a PNG file in the sample data, and it currently works fine with mixed file types.

BR

PS1.- with some code changes, it could read from the web if preferred

PS2.- I’ve just updated the Py code in the workflow to report an Image Dimensions column

Hi,
thanks for your answer. I am using your workflow. How do I continue from here to get the image (image view) into my report?

Thanks
Br