Hi bluena,
I recommend having a look at the Palladian toolkit plugin. It offers a user-friendly Regex Extractor node with an instantaneous preview, which allows you to build regular expressions for such cases very easily (see here for more details about the recent release).
Here’s an example in action:
The regular expression which I used here is:
.*\d{4}\s(?<city>.*)
This means:
- match arbitrary characters (these are not captured),
- followed by exactly four digits,
- followed by a space,
- followed by the city; this part is in parentheses as a named capturing group, so that I can use the group name city as the output column name
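
If you want to try the expression outside of KNIME, here is a minimal sketch in plain Java, whose regex flavor supports the same (?<city>.*) named-group syntax. The class name, the helper method and the sample address are made up for illustration, and I am assuming that the whole input string has to match the expression.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CityExtractor {
    // Same expression as in the post; (?<city>...) is a named capturing group.
    private static final Pattern ADDRESS = Pattern.compile(".*\\d{4}\\s(?<city>.*)");

    /** Returns the text captured by the "city" group, or empty if the input does not match. */
    public static Optional<String> extractCity(String address) {
        Matcher m = ADDRESS.matcher(address);
        // Assumption: the entire string must match, hence matches() rather than find().
        return m.matches() ? Optional.of(m.group("city")) : Optional.empty();
    }

    public static void main(String[] args) {
        // Hypothetical sample address, not taken from the original question.
        System.out.println(extractCity("Musterweg 12 3000 Bern").orElse("<no match>"));
    }
}
```

Because the leading .* is greedy, the expression anchors on the last run of four digits followed by whitespace, so a house number such as 12 does not get in the way.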
You can of course set additional capturing groups to split the street, zip code, …
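
As a rough idea of what that could look like, here is a sketch with three named groups, again in plain Java. The group names street and zip, the class name and the sample address are my own assumptions, and the pattern only covers the simple "street, four-digit zip code, city" layout.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AddressSplitter {
    // Assumed extension of the expression from the post: one named group per output column.
    private static final Pattern ADDRESS =
            Pattern.compile("(?<street>.*)\\s(?<zip>\\d{4})\\s(?<city>.*)");

    /** Returns {street, zip, city}, or null if the input does not match. */
    public static String[] split(String address) {
        Matcher m = ADDRESS.matcher(address);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group("street"), m.group("zip"), m.group("city") };
    }

    public static void main(String[] args) {
        // Hypothetical sample address, not taken from the original question.
        String[] parts = split("Musterweg 12 3000 Bern");
        System.out.println(String.join(" | ", parts));
    }
}
```

In the Regex Extractor node, each named group would then presumably end up as its own output column, in the same way as city does.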
You can find the workflow on my NodePit Space:
Hope this helps!
Philipp