Solutions to “Just KNIME It!” Challenge 25 - Season 4

:sun_with_face: Happy Wednesday, everybody! Shall we hone our geoprocessing skills this week with a new Just KNIME It! challenge? :world_map:

:cityscape: Zurich’s city council wants you to interpret a growing dataset of citizen-submitted service reports. To ensure equitable and efficient resource distribution, the council wants to break down the city into smaller, more manageable clusters. Can you pinpoint a systematic method to group Zurich’s neighborhoods based on the incoming reports? :eyes:

Here is the challenge. Let’s use this thread to post our solutions to it, which should be uploaded to your public KNIME Hub spaces with tag JKISeason4-25.

:sos: Need help with tags? To add tag JKISeason4-25 to your workflow, go to the description panel in KNIME Analytics Platform, click the pencil to edit it, and you will see the option for adding tags right there. :blush: Let us know if you have any problems!

Hi all,

Here is my solution: JKISeason 4-25 - Zurich Clustermap – KNIME Community Hub

I was a bit lost reading the description of the challenge, so I hope I got it right.

First, I spent some time finding the original dataset, which comes with a short description (in German). Thanks to ChatGPT for the translation and overview!

Here are the steps:

  1. GeoPackage Reader
  2. Since each report is a single geo point, I grouped the reports per neighbourhood. To do so, I created a GeoGrid of 500 meters.
  3. Joining each point to the grid
  4. I decided to compute the time to completion. I am not sure the updated date can be used for that, but it was the only information available.
  5. Grouping per gridID (neighbourhood) and computing some features: total number of cases, number of unique case types, and average solving time. I could extract more information and compute some additional ratios.
  6. Normalization
  7. Clustering with k-Means, with the number of clusters set to 3. More experiments could be run to evaluate the impact and find the best number of clusters, or other methods could be tried.
  8. Visualisation of the clusters on a map (as a grid).
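
For readers who prefer code to nodes, steps 2 to 7 can be sketched roughly as follows. This is a minimal plain-Python illustration, not the KNIME workflow itself: it assumes coordinates are already projected to metres, and the field names `x`, `y`, `type` and `days_open` (standing in for the gap between the report date and the updated date) as well as the toy records are hypothetical.

```python
import random
from collections import defaultdict

CELL_SIZE_M = 500  # grid cell edge in metres (step 2)

def cell_id(x, y, size=CELL_SIZE_M):
    """Map projected coordinates in metres to a (column, row) grid cell."""
    return (int(x // size), int(y // size))

def aggregate(reports):
    """Steps 3-5: per cell, [case count, distinct case types, mean days open]."""
    cells = defaultdict(list)
    for r in reports:
        cells[cell_id(r["x"], r["y"])].append(r)
    return {
        cid: [len(rows),
              len({r["type"] for r in rows}),
              sum(r["days_open"] for r in rows) / len(rows)]
        for cid, rows in cells.items()
    }

def min_max(features):
    """Step 6: scale each feature column to [0, 1]."""
    cols = list(zip(*features.values()))
    lo, hi = [min(c) for c in cols], [max(c) for c in cols]
    return {
        cid: [(v - l) / (h - l) if h > l else 0.0
              for v, l, h in zip(vec, lo, hi)]
        for cid, vec in features.items()
    }

def dist2(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def k_means(vectors, k, seed=0, max_iter=100):
    """Step 7: plain Lloyd's algorithm; returns one cluster label per vector."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: dist2(v, centroids[j]))
                      for v in vectors]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:  # keep the old centroid if a cluster goes empty
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels

# Hypothetical toy records, for illustration only (not the Zurich data).
reports = [
    {"x": 100, "y": 100, "type": "graffiti", "days_open": 2},
    {"x": 200, "y": 300, "type": "lighting", "days_open": 4},
    {"x": 450, "y": 20, "type": "graffiti", "days_open": 3},
    {"x": 1200, "y": 700, "type": "waste", "days_open": 10},
]
features = aggregate(reports)
scaled = min_max(features)
cells = list(scaled)
labels = k_means([scaled[c] for c in cells], k=2)
```

Normalizing before clustering matters here because k-Means relies on distances, so an unscaled case count would otherwise dominate the other two features.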

The result on the map shows:

  • Size of the circle is the number of cases
  • Color is the cluster

It seems that the city center is a cluster in itself.
I am sure I should work on feature engineering to improve the clusters.

The visualization could also be improved by removing the blue color from the grid.

Enjoy all!

Cheers

Jerome

Hello team,

Here is my solution to this week’s challenge.

JKISeason4-25

This is my first real exposure to geospatial data, and I used guidance from ChatGPT and from @trj's workflow.

Number of clusters: 8

Random seed: 1234

What the cluster map tells me is that there is a correlation: when demand is higher in a given cluster, the average fix time is lower, and when demand is lower, the average fix time is higher. This seems about right if the objective is to complete as many jobs as possible in high-demand areas in order to maximize service fees.
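
One way to check that impression would be to compute the Pearson correlation between the number of cases and the average fix time per cluster. The snippet below is only a sketch of that idea in plain Python; the per-cluster numbers are made up for illustration and are not results from the workflow.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Made-up per-cluster values, for illustration only.
cases_per_cluster = [120, 80, 40]
mean_fix_days = [3.0, 5.0, 7.0]
r = pearson(cases_per_cluster, mean_fix_days)
```

A value of `r` close to -1 would support the "higher demand, shorter fix time" reading, while a value near 0 would suggest the pattern is weaker than it looks on the map. With only 8 clusters, any such number should be treated with caution.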

This was definitely another fun challenge. Cheers.