Solutions to "Just KNIME It!" Challenge 10 - Season 3

:exploding_head: Wow! We just published our 10th Just KNIME It! challenge this season! :hourglass: Time is really flying this year, huh? :sweat_smile:

:package: This week, imagine that you work as a data analyst for a delivery company. :memo: Given a dataset with successful deliveries (due to no typos) and unsuccessful ones (due to typos), your goal is to automatically fix incorrect postal addresses by leveraging the correct ones.

Here is the challenge. Let’s use this thread to post our solutions to it, which should be uploaded to your public KNIME Hub spaces with tag JKISeason3-10.

:sos: Need help with tags? To add tag JKISeason3-10 to your workflow, go to the description panel in KNIME Analytics Platform, click the pencil to edit it, and you will see the option for adding tags right there. :slight_smile: Let us know if you have any problems!

4 Likes

Hi @alinebessa :wave: :slightly_smiling_face:

I wanted to let you know that tonight, Jakarta time, I’ve uploaded the solution (v1.0) for the JKISeason3-10 challenge.

6 Likes

My solution to the 10th challenge. It’s a surprise even for me that I had the time to solve it this quickly :smiley:

This challenge was really enjoyable and required some thoughtful problem-solving and nodes that you do not use every day.

I’m excited to see how others approached it. Since this isn’t a visualization challenge, I’m sure the final solutions will showcase peak efficiency just as in the folder challenge!

5 Likes

Hi all,
This is my solution. The two methods produce equivalent output under specific input data and configuration conditions.

5 Likes

Hi all,

My solution: three nodes, quick and dirty. Did I miss something?

8 Likes

Hello @ JKIers
Here is my take for the challenge using ‘KNIME Distance Matrix Extension’ nodes.

I also checked for typos in the key segments of the $full Address$ string (street number, address text, postal district number). The aim is to highlight that a double typo may suggest that the delivery company should take some audit action.

In the results table, a typo warning in ‘street number’ may call for correcting only the address text, since a wrongly assigned street number can mean an unhappy customer.

Keep coding :vulcan_salute:t4:

6 Likes

Here’s my solution. Nothing fancy. I also corrected the ZIP codes by adding a "0" to the front of any 4-digit ZIP.
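For anyone who wants to check the ZIP fix outside KNIME, here is a minimal Python sketch of the same idea. It assumes ZIP codes arrive as strings and should be five digits long; `pad_zip` is a hypothetical helper name, not part of any workflow in this thread.

```python
def pad_zip(zip_code: str, width: int = 5) -> str:
    """Left-pad a numeric ZIP code with zeros up to `width` characters."""
    digits = zip_code.strip()
    # Only pad purely numeric codes that are shorter than the target width.
    if digits.isdigit() and len(digits) < width:
        return digits.zfill(width)
    return digits


print(pad_zip("2134"))   # 4-digit ZIP gets a leading zero
print(pad_zip("90210"))  # 5-digit ZIP is returned unchanged
```

Values that are not purely numeric are returned as they are, so the helper does not silently rewrite anything that is not a short ZIP.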

6 Likes

Here is my solution for this challenge.
It is only suitable for company addresses, as there is usually only one number on the same street.

5 Likes

Hello everyone,
My solution is here.

In my solution, I added a download link for the corrected data. Users can download the corrected file after confirming the corrected data.

4 Likes

Where can I download the string matcher?

Hello, this node is in the “Text Processing” extension of “Other Data Type”. It is not installed by default in KNIME. You need to install it manually.

One of the installation methods therein:

2 Likes

Thanks, I installed the extensions and am solving it now.

1 Like

My first KNIME submission for Season 3.

I took inspiration from the workflows by @tomljh and @RBre, and added a Column Expressions node to format the ZIP code to a length of five.

What I learned from this challenge:

  1. String Matcher: how to find the relationship between two strings.
  2. Value Lookup: how to use it and how to extract values from the dictionary column.
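The two steps above (find the closest correct string, then look it up) can be sketched in plain Python. This is only an illustration under stated assumptions: it uses `difflib` from the standard library, whose similarity ratio is not the same measure as the String Matcher node's distance, and `fix_address`, the `cutoff` value, and the sample addresses are all hypothetical.

```python
from difflib import get_close_matches


def fix_address(bad: str, known_good: list[str], cutoff: float = 0.8) -> str:
    """Return the closest known-good address, or the input if none is close enough."""
    # n=1 keeps only the best candidate; cutoff is the minimum similarity ratio.
    matches = get_close_matches(bad, known_good, n=1, cutoff=cutoff)
    return matches[0] if matches else bad


# Hypothetical sample data, not taken from the challenge dataset.
correct = ["12 Main Street 02134", "7 Oak Avenue 90210"]
print(fix_address("12 Mian Street 02134", correct))
```

If no candidate reaches the cutoff, the address is left unchanged, so that unmatched rows can be reviewed by hand instead of being replaced with a poor guess.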
4 Likes

Yes, I agree with your point of view. As everyone’s exploration has shown, this string has a special structure, and the errors occur in the first part of the address. Both methods produce the same result. :smiley:

Hi all,
Here is my solution.
I highlighted the corrections in yellow for easy identification.

8 Likes

Hi all,
Here is my solution, which is similar to the others.

5 Likes

Hello everyone,
My workflow is similar to others’, taking inspiration from their approaches. I highlighted the corrections, using @sryu's ideas. Thanks.

7 Likes

Catching up after the KNIME Community Hacking Day. Here is mine.

3 Likes

Hi @alinebessa

I noticed a lot of duplicate entries in the dataset. After removing 90% of the initial input, the next steps were more or less straightforward.
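As a rough Python equivalent of the deduplication step, assuming each row is a plain address string and only exact duplicates should go (`drop_duplicates` is a hypothetical helper name):

```python
def drop_duplicates(rows: list[str]) -> list[str]:
    """Remove exact duplicate rows while keeping first-seen order."""
    # dict keys are unique and preserve insertion order (Python 3.7+).
    return list(dict.fromkeys(rows))


# Hypothetical sample rows, not taken from the challenge dataset.
rows = ["12 Main Street 02134", "7 Oak Avenue 90210", "12 Main Street 02134"]
print(drop_duplicates(rows))
```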

gr. Hans

8 Likes

My submission for the challenge.

3 Likes