Hello (again),
I’ve been trying to solve this problem for a while now and have experimented a lot, but I can’t seem to make progress. Maybe one of you can help me!
TL;DR:
Is there a way to convert the dict I get in Python for an Image cell into the Image object, which contains metadata like the source?
The Idea
I have started improving my previous node, which allows prompting OpenAI and using the structured output feature. I recently played around with vision models and did not find a way to prompt a vision model from KNIME, so I wrote my own Python script, which also takes care of encoding the image to Base64 given the local file path. In my original workflow, where I prompted via a Python script, I get the file URI through a chain of nodes after the Read Files / Folder node.
While trying to solve this problem, I noticed that when viewing the image column in a Table Editor (JavaScript) node, the metadata, such as the path to the file on the local drive, appears to be inside an image object and can be accessed via Image["source"]:
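For context, the Base64 step looks roughly like the sketch below. This is a simplified illustration, not my actual node code: the function name image_to_data_url is made up for this post, and the "data:<mime>;base64,..." string is only the data URL form that I assume the vision endpoint accepts. The demo writes a 4-byte dummy file, not a real image.

```python
import base64
import mimetypes
import tempfile
from pathlib import Path


def image_to_data_url(path: str) -> str:
    """Read a local file and return it as a Base64 data URL (hypothetical helper)."""
    data = Path(path).read_bytes()
    # Guess the MIME type from the file extension; fall back to a generic type.
    mime = mimetypes.guess_type(path)[0] or "application/octet-stream"
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime};base64,{encoded}"


# Small demo with a dummy file so the sketch runs without a real image.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "image_0.jpg"
    sample.write_bytes(b"\xff\xd8\xff\xe0")  # JPEG magic prefix only, not a valid image
    url = image_to_data_url(str(sample))

print(url)
```

The important point is that this needs the local file path as input, which is exactly what I currently have to collect with extra nodes.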
View in Table:
View in Table Editor:
Right now I am trying to add the option to include images in the prompt. To avoid having to chain multiple nodes to get the file path, my idea was to extract it from the Image object that comes out of the Image Reader node.
What I am struggling with:
I’m struggling to access this image object. When printing the content of the image column in a Python Script node, I can see that it is shown as a dict with keys 0 and 1:
I also get this information which seems to relate to the cell class:
Name: Image, dtype: PandasLogicalTypeExtensionType(struct<0: extension<knime.struct_dict_encoded<StructDictEncodedType>>, 1: extension<knime.struct_dict_encoded<StructDictEncodedType>>>, {"value_factory_class":"org.knime.core.data.v2.value.cell.DictEncodedDataCellValueFactory","data_type":{"cell_class":"org.knime.knip.base.data.img.ImgPlusCell"}})
When I convert to dict and print I get:
{'image_0.jpg': {'0': b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x01\x00\x00\x00$\x00\x00\x00aa5aac11-887f-48f8-b5a1-f43c5f8b501b\x00\x00\x00\x00&\x00\x00\x000_681f77e6-0314-4689-a1ca-691cad962688\xff\xff\xff\xff', '1': 'org.knime.knip.base.data.img.ImgPlusCell'}}
I already tried adding KNIP as a dependency (it is definitely installed) and also ran knime.extension.supported_value_types(). However, the only image data type I see in the list is PIL.Image.Image, and that did not help.
I take it that the data behind the ‘0’ key is some form of the serialised / encoded Image object:
b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x01\x00\x00\x00$\x00\x00\x00aa5aac11-887f-48f8-b5a1-f43c5f8b501b\x00\x00\x00\x00&\x00\x00\x000_681f77e6-0314-4689-a1ca-691cad962688\xff\xff\xff\xff'
and the data behind the ‘1’ key:
org.knime.knip.base.data.img.ImgPlusCell
is supposed to help to “translate it”.
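To see what is readable in that byte string without any KNIME API, the printable parts can be pulled out with plain Python. This is only an inspection sketch: the helper name printable_runs is made up, and my reading of the two strings as identifiers (rather than image data or a path) is a guess, not something I found documented.

```python
import re

# The raw value behind key '0', copied from the printed dict above.
blob = (
    b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x01\x00\x00\x00$\x00\x00\x00'
    b'aa5aac11-887f-48f8-b5a1-f43c5f8b501b'
    b'\x00\x00\x00\x00&\x00\x00\x00'
    b'0_681f77e6-0314-4689-a1ca-691cad962688'
    b'\xff\xff\xff\xff'
)


def printable_runs(data: bytes, min_len: int = 4) -> list:
    """Return every run of printable ASCII characters that is at least min_len long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


runs = printable_runs(blob)
for run in runs:
    print(len(run), run)
```

The only readable content is two UUID-like strings of 36 and 38 characters, and the bytes '$' (0x24 = 36) and '&' (0x26 = 38) directly in front of them look like length prefixes. There is no file path anywhere in the blob, so it seems to be a reference to the image rather than the image object itself.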
I already looked into knime.api.types and found:
get_converter()
However, when I instantiate a converter by passing the cell_class / value_factory_class from the type information above:
"value_factory_class":"org.knime.core.data.v2.value.cell.DictEncodedDataCellValueFactory",
"data_type":{"cell_class":"org.knime.knip.base.data.img.ImgPlusCell"}
I only get an instance of FallbackPythonValueFactory, and using its decode() method doesn’t help either…
Long story short: Is there a way for me to convert the dict I get in python for an Image cell into the Image object, which contains meta data like the source?