I am quite new to KNIME. I want to cluster a set of input PNG images based on their pixel values. I can load the images with the Image Reader and then extract a single color channel with the Splitter, but after that I am stuck.
How can I convert an image into numerical columns of its data? I am missing something like an "Image to Vector" node...
Thanks a lot,
So you want to cluster several images based on their pixel values? We don't have a node that produces vectors containing the pixel values of the images (if it's urgent, I can write a KNIME node for you and add it to our nightly build next week).
What we do have is a node called "Image Features". With this node you can extract simple statistics from an image (mean, std. dev., skewness, kurtosis, texture information, ...), which gives you one feature vector per image. Based on these vectors you can perform a clustering with the standard KNIME nodes.
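To make the idea of one feature vector per image concrete, here is a minimal sketch in plain Python. It is not the implementation of the "Image Features" node; the function name `image_features`, the example file names, and the pixel lists are assumptions made for illustration, and texture features are left out.

```python
import math

def image_features(pixels):
    """Return (mean, std dev, skewness, excess kurtosis) for a flat list of pixel values."""
    n = len(pixels)
    mean = sum(pixels) / n
    # Population central moments of order 2, 3 and 4.
    m2 = sum((p - mean) ** 2 for p in pixels) / n
    m3 = sum((p - mean) ** 3 for p in pixels) / n
    m4 = sum((p - mean) ** 4 for p in pixels) / n
    std = math.sqrt(m2)
    if std == 0:
        # A completely flat image has no spread; skewness and kurtosis are set to 0 here.
        return (float(mean), 0.0, 0.0, 0.0)
    skewness = m3 / std ** 3
    kurtosis = m4 / m2 ** 2 - 3.0  # excess kurtosis (0 for a normal distribution)
    return (mean, std, skewness, kurtosis)

# Hypothetical single-channel images, already flattened to pixel lists.
images = {"a.png": [0, 0, 255, 255], "b.png": [10, 20, 30, 40]}

# One short feature vector per image; these rows are what a clustering step would consume.
features = {name: image_features(px) for name, px in images.items()}
```

Each image is reduced to four numbers regardless of its size, which is why clustering on such features is much cheaper than clustering on raw pixels.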
Additionally, we are currently working on SIFT/SURF/HoG implementations, which will be available in a future release.
I hope this helps. If you need this "Image To Vector" node, just tell me and I'll try to add it to our nightly build as soon as possible.
Yes, it would be really great to be able to turn an image into a vector. Then one could apply all the methods outside of KNIME Image Processing (such as PCA, clustering, etc.) to this vector, which currently does not seem to be possible. It would also be really good if the reverse were possible, i.e. converting several columns into a single image column with a specific height and width. Then you could visualize the results of the processing later.
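The two conversions I have in mind can be sketched in plain Python as follows. This only illustrates the idea and is not KNIME code; the function names `image_to_vector` and `vector_to_image` and the row-major pixel order are assumptions.

```python
def image_to_vector(image):
    """Flatten a 2-D image (a list of rows) into one row-major vector of doubles."""
    return [float(p) for row in image for p in row]

def vector_to_image(vector, width, height):
    """Rebuild a 2-D image with the given width and height from a row-major vector."""
    if len(vector) != width * height:
        raise ValueError("vector length does not match width * height")
    return [vector[r * width:(r + 1) * width] for r in range(height)]

# A tiny 3 x 2 single-channel image (width 3, height 2).
image = [[0, 64, 128],
         [192, 255, 32]]

vector = image_to_vector(image)                       # one table row with 6 numeric values
restored = vector_to_image(vector, width=3, height=2)  # back to 2 rows of 3 pixels
```

The width and height have to be supplied for the way back, because the vector alone no longer carries the image dimensions.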
If it is not too much work for you, I would greatly appreciate it if you added these nodes.
Thank you very much for your help.
I just added a simple node to our nightly build update site, and it will be available by tomorrow morning (link to the nightly build update site: http://tech.knime.org/community). It's called "Img To DataRow" and simply converts Imgs to collection cells of DoubleType. I attached an example workflow. If you tell me what further requirements you have for such a node, I can add the functionality.
Please note: as this node is not part of an official release yet, it may still change (name, configuration, dialog, etc.) until it is officially released :-)
The second node, DataRow to Img, is a little more complicated, but I see your point and we will add it in a future release.
I hope this helps!
It works on your sample workflow. However, if I use it on a handful (5-10) of PNG images (210x260 px), the Transpose node doesn't seem to cope with the amount of data. It has been calculating for over an hour now, and no progress is indicated (the blue progress bar just runs from right to left and back again)...
My intention is to use KNIME to quickly perform and compare clusterings of images, some of which I would preprocess. The preprocessing could be a simple PCA or HOG3D, or cutting out tracked interest points (supplied externally). My images will be no larger than 260x210, but I want to work on several hundred to several thousand images.
Is there any chance I could do this with KNIME (in terms of internal data structure, memory, etc.)?
Actually, this should work. Just a few questions and remarks:
1. I don't know if you really need to transpose the images. For K-Means clustering, for example, you don't need to transpose.
2. If you read in PNG images, you may also be reading in all color channels. Do you need the color information? If not, you could simply read in Channel = 0 (Subset Selection in the Image Reader), which reduces the number of pixels stored in each vector.
3. Concerning images: a rule of thumb is that as long as one of your images fits into memory twice, we should be able to process as many images as you want. We have implemented a memory management system which takes care of that. Concerning KNIME: KNIME is able to process huge amounts of data, so it should work. I just tested the workflow I gave you with 1000 images (300x300 each). The maximum amount of heap required, right before the Transpose node was called, was 2.6G (max heap set to 12G). The Transpose node itself takes a while, but it finishes. Did you increase your heap space (http://tech.knime.org/faq#q4_2)?
4. We certainly need to reduce the amount of memory required by the Img To DataRow node. Currently we only create double-valued vectors from the images, even if the image type is, for example, ByteType. We will of course enhance this node before we officially release it :-)
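As a rough back-of-the-envelope sketch for points 3 and 4, the raw payload of the pixel vectors can be estimated as below. The helper name `vector_bytes` is made up for this example, and the numbers cover only the raw values, not any per-cell or per-row overhead inside KNIME, which I am not quantifying here.

```python
def vector_bytes(width, height, n_images, bytes_per_pixel):
    """Raw size in bytes of n_images single-channel pixel vectors."""
    return width * height * n_images * bytes_per_pixel

# The test above: 1000 images of 300x300 pixels.
as_double = vector_bytes(300, 300, 1000, 8)  # 8 bytes per double value
as_byte = vector_bytes(300, 300, 1000, 1)    # 1 byte per value if a byte type were kept

print(as_double / 1e6, "MB as doubles")  # 720.0 MB
print(as_byte / 1e6, "MB as bytes")      # 90.0 MB

# The images from the question: 10 images of 210x260 pixels as doubles.
small_case = vector_bytes(210, 260, 10, 8)
print(small_case / 1e6, "MB")            # 4.368 MB
```

So keeping the original byte type would cut the raw vector data by a factor of 8, and the 5-10 images from the question are tiny in terms of raw data.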
Let me know if it works ;-)
One more comment: you can increase the number of chunks ("chunk size" parameter in the node dialog of the Transpose node); the transpose will then run faster.
PS: You want to preprocess the images with a PCA, so you need the pixel values in a table, right? For that we would need another node, and the workflow would look like this: Image Reader -> Chunk Loop Start -> Image to Table (complete table = one image) -> PCA -> Table to Image -> Loop End -> Images to Vector -> ... further steps ... clustering ...
Correct me if I'm wrong.
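For the clustering step at the end, the reason no transpose is needed is that each image is already one row and each pixel position one column. A minimal sketch of k-means on such rows in plain Python (not the KNIME K-Means node; the naive initialisation with the first k rows and the toy pixel values are assumptions for illustration):

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equally long vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def k_means(rows, k, iterations=20):
    """Cluster row vectors (one flattened image per row) into k groups."""
    centroids = [list(r) for r in rows[:k]]  # naive init: the first k rows
    labels = [0] * len(rows)
    for _ in range(iterations):
        # Assignment step: each row goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: squared_distance(row, centroids[c]))
                  for row in rows]
        # Update step: each centroid becomes the column-wise mean of its members.
        for c in range(k):
            members = [row for row, label in zip(rows, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Four toy "images" of 4 pixels each: two dark ones and two bright ones.
rows = [[0, 0, 10, 10],
        [1, 0, 9, 10],
        [200, 210, 250, 255],
        [205, 205, 255, 250]]

labels, centroids = k_means(rows, k=2)
```

On this toy input the two dark rows end up in one cluster and the two bright rows in the other, without the table ever being transposed.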