Only keep specified characters

Hi there,

I have a spreadsheet with addresses in different alphabets, but I only want to keep characters from the Latin alphabet as well as some symbols. For example:

河南省 Henan Province becomes just Henan Province

*221/B, [Baker Street] * becomes just *221/B, Baker Street *

I know there’s a way to filter out rows containing undesired characters, and another way to specify which characters to remove, but I think it’d be easier in my case to specify which characters to keep. Is this possible?

Many thanks!

You probably want to use a regex, e.g. in the String Manipulation node.
This needs some refinement for your case (we do not know which special characters you want to keep), but it should hopefully get you started.
br


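Hello @mac95,
You can use the ‘String Manipulation’ node in KNIME with an expression like:

regexReplace($COL-1$, "[^a-zA-Z*\/\\.,\s]+", "")

Note that digits are not in this character class, so add 0-9 to it if numbers such as the 221 in your example should be kept.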
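For anyone who wants to try the "specify what to keep" idea outside KNIME, here is a rough Python sketch of the same negated-character-class approach. The function name `keep_only` and its `extra_symbols` parameter are made up for illustration, and the set of kept characters (Latin letters, digits, whitespace, plus `*/.,`) is only an assumption based on the two examples in the question, so adjust it to your data.

```python
import re

def keep_only(text, extra_symbols="*/.,"):
    """Remove every character that is not explicitly allowed.

    Allowed (an assumption, adjust as needed): ASCII Latin letters,
    digits, whitespace, and whatever is passed in extra_symbols.
    """
    # re.escape makes symbols such as * or . safe inside the character class.
    allowed = "a-zA-Z0-9\\s" + re.escape(extra_symbols)
    # The leading ^ negates the class: match runs of characters NOT allowed.
    return re.sub("[^" + allowed + "]+", "", text)

# The two examples from the question.
print(keep_only("河南省 Henan Province").strip())
print(keep_only("*221/B, [Baker Street] *"))
```

The leading space left behind after the Chinese characters are dropped is trimmed with `strip()` here; in KNIME a separate trimming step would be needed for that.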

Please refer to the attached images for additional context.

Output: [screenshot]


This seems to have done the trick - many thanks!

That seems to have done it - thank you :)

