Clustering data after postal codes

Hi,

i have an excel sheet with over 27.000 customers with postal codes.
E.g.:
A: Name B: Postal code
1 Customer X 35679
2 Customer Y 76521

I want to make the postal code from 35679 to 35 and from 76521 to 76
How can I do this?

I need this for k-means later.

Thank you

Hi @Fatih97

Welcome to the KNIME Forum. Theres several ways to do this in KNIME. You could use a string manipulation node. Example attached!

Regards,
Wali

1 Like

If it is numeric and you don’t want to convert it to string you can use math formula
(assuming your postal codes are between 10k and 100k)

keep

br

2 Likes

Thanks.
but my postal codes are also e.g. 01594 … and then the formula doesnt work…

Hi @Fatih97,

then go with @wkhan post, it will handle leading zeros.

Best regards,

Paul

I dont know, how to make the code to get the first 2 chars

update: ok i got it. next problem is, that after “string to number” the postal codes e.g. from 02 get to 2.
but I need the zero too

Hi @Fatih97,

then you should just remove the “String to Number”. But than you would have the truncated PLZ-Codes in String format, but that seams to be what you want?
By the way clustering by postal code is not trivial as it seams (when you are located in Germany,


), because for example, 06 to 39 can be nearer than 06 to 01. Maybe you can tell us what your goal is?

Best regards,

Paul

Hi,

Maybe this can help you

Yes, I am in Germany.
My goal is to find out, where the most customers of a company are to open a new location. I have an excel sheet with over 20.000 customers and their postal codes.
I want to make 3 oder 4 clusters, to see where in germany (north, east, south, west) the most of the customers sit. Then I will plot this with a scatter plot and make a histogram.

1 Like

Hi,

It will be more accurate if you use GPS coordinates instead of postal codes … Here you can find the data

and the clustering of GPS coordinates is here Clustering geo datas by user and by date

1 Like

How can I put the latitudes and longitudes in my excel sheet to the correct postal code…?
I have 9000 rows…

Hi,

Here is also the databaset that you can import https://gist.github.com/iteufel/af379872bbc3bf5261e2fd09b681ff7e
If you do it in excel you can use VLOOKUP in KNIME use the JOIN node

Do I have to put the customers to the right latitudes and logitudes manually ?
I have over 9000 rows…can I make it automaticlly?

You could use the Joiner Nodes, to append the geo cordinates to your existing columns

Hi,

As @goodvirus said use the JOIN node and join the tables based on the ZIP code

which reader i have to take for the zip. its a sql data

Hi,

It is faster if you just copy the txt in excel …
ZIP_Codes.xlsx (881.0 KB)

i lose the overview…can we meet us at skype or zoom?

i made a scatter plot and how i have to choose the axis?
Latitudes on x axis or y axis?

Hi @Fatih97 ,
maybe this workflow can help you 20220322_zipcodes – KNIME Hub
It reads the file with postal codes and lat/lon on Github, then it calculates the centroid for each code (a lot of codes are shared between many towns) and then it joins the resulting table with the customers data (a test table with just zipcodes).
result:

The bubble chart is just an example of visualization