Hello,
I'm trying to find a way to split a column so that I can extract the municipality:
Example:
Sint-Willibrordusplein 4 3550 Heusden-Zolder
Henri Horriestraat 31 8800 Roeselare
Dreef 1 3220 De Holsbeek
Arthur Dezangrélaan 17 1950 Kraainem
In the examples above, the municipality is the last part of each line (Heusden-Zolder, Roeselare, De Holsbeek, Kraainem). It always comes after the 4-digit Belgian postal code.
Do you know how this can be solved? Using Regex?
Thank you!
By the way, KNIME is fantastic. Keep up the fantastic work!
I recommend having a look at the Palladian toolkit plugin. It offers a user-friendly Regex Extractor node with an instant preview, which makes it very easy to build regular expressions for cases like this (see here for more details about the recent release).
Thank you for the clear and useful answer! The Regex Extractor node will certainly be useful in the future. Out of curiosity, once the regex is identified (in our case " .\d{4}\s(?.) "), how can we use it in the “Regex Split” node? I tried, with no luck. Thanks again, Philipp!
Indeed, with the correct quotes the regex runs smoothly. Thank you!
However, the result is not what I intended. I'm only interested in the municipality, that is, whatever comes after the 4-digit postal code. For line 1, that would be only “Brugge”. It feels like we're almost there!
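For anyone who wants to test the idea outside KNIME, here is a minimal Python sketch of "keep only what follows the 4-digit postal code". It is not the Regex Split or Regex Extractor configuration from this thread. The pattern and the function name `extract_municipality` are my own assumptions, and it assumes the postal code is the last standalone 4-digit number on the line.

```python
import re

# Assumption: the postal code is the last standalone 4-digit number on the
# line, and everything after it is the municipality.
# Group 1 = postal code, group 2 = municipality.
PATTERN = re.compile(r"^.*\b(\d{4})\s+(.+)$")


def extract_municipality(address):
    """Return the text after the 4-digit postal code, or None if no match."""
    match = PATTERN.match(address.strip())
    return match.group(2) if match else None


addresses = [
    "Sint-Willibrordusplein 4 3550 Heusden-Zolder",
    "Henri Horriestraat 31 8800 Roeselare",
    "Dreef 1 3220 De Holsbeek",
    "Arthur Dezangrélaan 17 1950 Kraainem",
]

for address in addresses:
    print(extract_municipality(address))
```

The leading `.*` is greedy, so a short house number such as `4` or `31` is skipped and the match settles on the postal code. The pattern uses only basic constructs (anchors, `\b`, `\d{4}`, capture groups), so it will probably carry over to a Java-style regex node, but I have not verified that in KNIME itself.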