How can I read only the postal code, city and street from a text in the table in Knime?

I hope you can help me. I have already tried several regex statements, but I just can’t get any further.

In KNIME, I have a table column with free-text content, and I want to extract only the postal code, the city and the street from it.

Input: Test 1234 00123/68213 1, -prio 45468 Mühlheim an der Ruhr, Ruhrstr. 1, test

Output: 45468 Mühlheim an der Ruhr, Ruhrstr. 1

The surrounding text can differ from row to row, but the postal code and the city are always contiguous and are always followed by the street.

Are there possibly nodes that could simplify this for me somehow?

Welcome to the KNIME Forum!
You might try using the Column Expressions node to evaluate multiple regex expressions at a time. I made an example workflow based on your sample input that I got to match:
german_street_regex_example.knar.knwf (6.6 KB)

Using the Column Expressions node, I evaluated the regex below and output the address into a separate column.

The regex I used was the following:
/[0-9]{5}\s[\x7f-\xffa-zA-Z\s]+[,]{1}\s[\x7f-\xffa-zA-Z0.\s]+\s[0-9]+?/

This assumes the address always consists of a 5-digit zip code ([0-9]{5}), a space (\s), the city name with umlauts included ([\x7f-\xffa-zA-Z\s]+), a comma and a space ([,]{1}\s), and the street name (including abbreviations) followed by the house number ([\x7f-\xffa-zA-Z0.\s]+\s[0-9]+?).
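
For anyone who wants to try the pattern outside of KNIME, here is a minimal sketch in plain Python. It is not the code from the attached workflow. The function name extract_address is made up for illustration, and the sketch assumes a greedy [0-9]+ for the house number so that multi-digit numbers are kept whole. Otherwise the pattern follows the pieces described above.

```python
import re

# Pattern following the reply: 5-digit postal code, space, city (umlauts included),
# comma and space, street name (abbreviations allowed), space, house number.
# Assumption: the final quantifier is greedy ([0-9]+) so that a house number
# such as "12" is not cut off after its first digit.
ADDRESS_RE = re.compile(
    r"[0-9]{5}\s[\x7f-\xffa-zA-Z\s]+[,]{1}\s[\x7f-\xffa-zA-Z0.\s]+\s[0-9]+"
)


def extract_address(text):
    """Return the first 'postal code city, street number' match, or None."""
    match = ADDRESS_RE.search(text)
    return match.group(0) if match else None


sample = "Test 1234 00123/68213 1, -prio 45468 Mühlheim an der Ruhr, Ruhrstr. 1, test"
print(extract_address(sample))
```

The other digit groups in the sample (1234, 00123, 68213) are skipped because none of them is a 5-digit run followed by a space and then a letter, which is what the pattern requires before the city name.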

Hopefully this helps.
