Solutions to "Just KNIME It!" Challenge 21 - Season 2

:boom: New Wednesday, new Just KNIME It! challenge! :boom:

:golf: This week we’ll work on a golf problem – get your clubs ready! :golfing_woman: To help a caddie who is exhausted from trying to find golf balls in the field, you’ll be performing image segmentation to separate the white balls from the green grass.

Here is the challenge. Let’s use this thread to post our solutions to it, which should be uploaded to your public KNIME Hub spaces with tag JKISeason2-21.

:sos: Need help with tags? To add tag JKISeason2-21 to your workflow, go to the description panel in KNIME Analytics Platform, click the pencil to edit it, and you will see the option for adding tags right there. :slight_smile: Let us know if you have any problems!


Hands-on with the image processing challenge, the 2nd one in the series.
Sharing my submission, built with a basic set of nodes.


This is my solution.
The golf balls are shown in all red and also merged into the original image.


Hi KNIMErs :slight_smile:

Interesting challenge this week!

I decided to approach this one as a clustering problem rather than purely image processing.

First of all, I read the image and used the -Splitter- node to split it into RGB Channels. I kept only the Red Channel and used the -Image to DataRow- node to convert the image to a collection of pixel intensity values. I then ungrouped the collection to organise the pixels into just one column of pixel intensities. The maximum image dimensions were then used to calculate the X and Y coordinates of each pixel using separate -Math Formula- nodes.
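The pixel-extraction step above can be sketched in a few lines of NumPy (the workflow itself uses KNIME nodes, not Python; this is just an illustration, assuming the image is loaded as an RGB array):

```python
import numpy as np

def red_channel_pixels(img):
    """Flatten an RGB image into rows of (x, y, red_intensity).

    img: NumPy array of shape (height, width, 3).
    Returns an (N, 3) array, one row per pixel, mirroring what the
    -Splitter-, -Image to DataRow-, Ungroup, and -Math Formula- nodes
    produce in the workflow.
    """
    height, width = img.shape[:2]
    red = img[:, :, 0].ravel()   # keep only the red channel
    idx = np.arange(red.size)
    x = idx % width              # column index from the flat position
    y = idx // width             # row index from the flat position
    return np.column_stack([x, y, red])
```

The X/Y formulas are the same modulo/integer-division trick the two -Math Formula- nodes apply to the row index using the maximum image dimensions.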

Using only the red channel is sufficient to discriminate between the pixels that represent the golf balls and those that represent the grass. Of the three channels, the red channel discriminates the most between grass and balls: the grass is green, so its red-channel intensity is low and it appears dark, while the white balls have high intensity in every channel, including red. The distribution of pixels per intensity can be seen in the histogram below:

The histogram clearly shows a bimodal distribution, where the left mode corresponds to the grass pixels and the right mode corresponds to the ball pixels. To discriminate between them, I could have chosen the histogram’s minimum, around intensity 130 where the two distributions overlap, as the threshold. However, I chose a higher value of 200 to make sure that no grass pixels are retained. The choice of threshold could also have been automated, but that could be the subject of another challenge :slight_smile: The number of pixels was further reduced by randomly sampling 1000 rows from the remaining pixels.
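Thresholding and sampling can be sketched like this (a minimal NumPy version of the filtering and -Row Sampling- step; the threshold of 200 is the manually chosen value from the post):

```python
import numpy as np

def ball_pixels(rows, threshold=200, sample_size=1000, seed=42):
    """Keep pixels brighter than the threshold in the red channel,
    then randomly sample at most sample_size of them.

    rows: (N, 3) array of (x, y, red_intensity).
    """
    bright = rows[rows[:, 2] > threshold]          # drop grass pixels
    rng = np.random.default_rng(seed)
    n = min(sample_size, len(bright))
    pick = rng.choice(len(bright), size=n, replace=False)
    return bright[pick]
```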

The next step of the workflow determines the number of balls and their positions. Therefore, I applied the -Distance Matrix Calculate- node to the X and Y coordinates, followed by the -Hierarchical Clustering (DistMatrix)- node to generate a hierarchical cluster tree based on the ball pixels. Pixels from the same ball are closer together than pixels from other balls. To determine the optimum number of clusters (k), I used the -Parameter Optimization Loop Start- node starting from k=2 up to k=16 (double the number of golf balls in the image). The -Hierarchical Cluster Assigner- node assigns each pixel to a cluster based on the number of clusters in that iteration of the loop.

In order to save time, it is best to use this combination of nodes instead of just the -Hierarchical Clustering- node, because the latter regenerates the tree on every run even though the tree is always the same: it is independent of the number of clusters assigned to the data.

I have used the -Silhouette Coefficient- node inside the loop and then calculated the mean silhouette coefficient for each k value and plotted this using the -Line Plot (Labs)- node:
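The tree-once, score-every-k pattern described above can be sketched with SciPy and scikit-learn (the linkage method is an assumption; the workflow’s exact settings are not stated):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from sklearn.metrics import silhouette_score

def best_k(coords, k_min=2, k_max=16):
    """Build one hierarchical tree, then score every candidate k with
    the mean silhouette coefficient, mirroring the Parameter
    Optimization loop. The tree is computed only once, since it does
    not depend on k."""
    tree = linkage(coords, method="average")   # linkage method assumed
    scores = {}
    for k in range(k_min, k_max + 1):
        labels = fcluster(tree, t=k, criterion="maxclust")
        scores[k] = silhouette_score(coords, labels)
    return max(scores, key=scores.get), scores
```

Plotting `scores` against k reproduces the line plot: a clear peak at the true number of balls.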

The plot clearly shows that the optimum number of clusters is 8. I was very happy to see this, as there are 8 golf balls in the original image :slight_smile:

k=8 was therefore applied as the number of clusters in the final -Hierarchical Cluster Assigner- node:

The X and Y coordinates of each pixel plus the Mean X and Y coordinates of each cluster were plotted using the -Scatter Plot- node:
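Computing the per-cluster mean coordinates (the estimated ball centres added to the scatter plot) is a simple group-by, sketched here in NumPy:

```python
import numpy as np

def cluster_centers(coords, labels):
    """Mean X/Y position of each cluster: the estimated golf-ball
    centres overlaid on the pixel scatter plot."""
    centers = {}
    for lab in np.unique(labels):
        centers[int(lab)] = coords[labels == lab].mean(axis=0)
    return centers
```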

Again, I was very happy to see the result :slight_smile: The clusters are plotted in the same places as the golf balls on the image:

This proves that taking a sample of the data was sufficient to produce a good result!

In conclusion, the coordinates of the golf balls have been successfully determined and Caddie Tom would be able to use them to find where the golf balls are on his images. The method can be applied to different numbers of golf balls and different image sizes without the need to edit the workflow. However, the method could be refined since at the moment it detects anything with a white/off-white colour, so Tom may end up with a “chihuahua or muffin” situation on his hands :joy:


Here’s the link to my workflow on the hub:

Thanks @aworker for your supervision on this challenge!

Happy KNIMEing!!


Whoah! Very creative approach!


Thank you @alinebessa :slight_smile:


:stuck_out_tongue_winking_eye: Hi my Crazy KNIMErs! :heart_hands:

Here is my solution: JKISeason2-21 Ángel Molina – KNIME Community Hub

It was based on the Neubias webinar

I’m super intrigued to see the second part of this challenge, so I hope to write a full article on both challenges in my newsletter.
If you want and have time @HeatherPikairos , we could collaborate on this article or another one in the future. I love your explanations :smiling_face:

Happy Sunday my crazy KNIMErs :people_hugging: and happy low-coding!


Hello everyone,

Here is my solution.

Thank you @HeatherPikairos for your solution, I learned a lot from it, since I do not have much experience working with image processing. I think I slightly improved on your solution by using DBSCAN instead of Hierarchical Clustering: I believe it runs faster, and there is no need for parameter optimization. I also implemented the case of multiple images (just 2 in my case), and the output returns the segmented images, the number of objects, and the centers of the objects.
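The DBSCAN variant can be sketched with scikit-learn (the `eps` and `min_samples` values are assumptions and depend on image resolution; the workflow’s actual settings are not stated):

```python
import numpy as np
from sklearn.cluster import DBSCAN

def count_balls_dbscan(coords, eps=5, min_samples=5):
    """Cluster bright-pixel coordinates with DBSCAN. Unlike the
    hierarchical approach, DBSCAN discovers the number of clusters
    itself, so no optimization loop over k is needed."""
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(coords)
    n_clusters = len(set(labels)) - (1 if -1 in labels else 0)  # -1 = noise
    return n_clusters, labels
```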

Another experimental approach I used is the Python package Segment Anything (facebookresearch/segment-anything on GitHub). It is NN-based, so I would assume it performs better on more complicated cases than the one in this exercise. However, I did not manage to run it on GPU, since it requires at least 12 GB of VRAM! It still runs on CPU, but takes quite a lot of time.

Still, it was a very interesting exercise for me this week, since I learned more about image processing.


Hi @Artem Thanks for the nice comment about the workflow! I’ll take a look at your adapted version :slight_smile:


I almost didn’t share this. I know nothing about image processing. I tried to develop a workflow solely using image processing nodes/components. I found two components on the Hub; they both use legacy Python scripting nodes and are written for outdated versions of scikit-image, so they will not work with the current version (0.21). The environment I used is Python 3.6 with scikit-image 0.17, and the workflow runs with that environment. I spent a lot of time playing with parameters, but as you can see, my workflow misses one ball and essentially overlays another. Hopefully someone with more experience can offer some suggestions. The workflow was too large to share in executed form.


Very elegant solution. I would never have thought of that in a million years. Kudos!


Thank you very much @rfeigel :slight_smile:


Hi, KNIMEr :partying_face: :partying_face: :partying_face:

Here is mine.

Do you like it? Very simple :face_with_hand_over_mouth:

Best regards,


:sparkles: As always on Tuesdays, here’s our solution to last week’s Just KNIME It! challenge :sparkles:

:framed_picture: We offer two solutions based on classic image processing. They are very similar, but one comes straight from a shared component. No experiments with clustering here: just basic convolution work and connected component analysis! :nerd_face:
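A classic pipeline of this kind can be sketched with scipy.ndimage, assuming the image is already a 2D grayscale (or red-channel) array; the threshold and smoothing values here are illustrative, not the component’s actual settings:

```python
import numpy as np
from scipy import ndimage

def segment_balls(gray, threshold=200, sigma=1.0):
    """Classic image-processing pipeline: Gaussian smoothing
    (convolution), global thresholding, then connected component
    analysis to count and locate the balls."""
    smoothed = ndimage.gaussian_filter(gray.astype(float), sigma=sigma)
    mask = smoothed > threshold          # bright pixels = candidate balls
    labeled, n = ndimage.label(mask)     # connected components
    centers = ndimage.center_of_mass(mask, labeled, range(1, n + 1))
    return n, centers
```

The smoothing suppresses single-pixel noise before thresholding, so each ball ends up as one connected blob rather than several fragments.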

See you tomorrow for this challenge’s part 2!


This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.