Ex Replace

Hi Community Members,

Please help me in getting the output data below

sample.xlsx (9.6 KB)

I have used the cell splitter node, but it is not that efficient. Experts please help me.

Input Output
125.ABC-ETS 125.ABC
4RC.000-ETS 4RC.000
223.F10S-ETS 223.F10S
407.6GIS-ETS 407.6GIS
TBD.125.ABC-ETS 125.ABC
ATY.223.F10S-ETS 223.F10S

Use the string manipulation node with the function substr(str,start,length), after having separated each cases and concatenate them all again ?KNIME_project5.knwf (26.1 KB)

1 Like

Hi there!

I came up with a solution using couple of nodes. First use String manipulation node to remove “-ETS” part. Then split column based on “.” and afterwards combine first two columns using Column Combiner. At the end filter columns you need using Column Filter.

Here is a workflow:
2019_01_11_String Manipulation.knwf (27.1 KB)

Br,
Ivan

1 Like

Hi,

You can do this using a String Manipulation in which you enter this expression:
regexReplace(regexReplace($input$, "-ETS", ""), "^[1-9a-zA-Z]*\\.(?=[1-9a-zA-Z]*\\.[1-9a-zA-Z]*)", "")

This is a nested regexReplace function which first removes the “-ETS” part and then the first part of the string if there are 3 parts left.
nested%20regex
input-output
nested regex.knwf (10.3 KB)

If you have any other special cases that this expression doesn’t cover, just let me know and I will modify it for you.

Best,
Armin

1 Like

Thanks a lot!

Thanks!

Thanks & I will let you know if i encounter any special case! Can you please help me in understanding the expressions inside [ ]

Sure,
[1-9a-zA-Z]* means any character in this range and can be repeated multiple times. So it matches your string in parts cause the dot character is not included.

Thanks for clarifying.
And what about ?=[1-9a-zA-Z]

The part that is in parentheses:
(?=[1-9a-zA-Z]*\\.[1-9a-zA-Z]*)
checks whether such a string exists or not. So if some string like A2A.2A2 exists after the string which we specified before the parentheses then that string will be replaced. This is the case when you have 3 parts like “ATY.223.F10S” so the “ATY” will be replaced but if the string has 2 parts like “4RC.000” then the string will remain unchanged.

You are a genius in Knime! I would need your expert help

1 Like

I’ll do my best to support KNIME community. :blush:

1 Like

Hi @Priyonko!

You are welcome but I have notice a mistake in my workflow cause when there are 3 parts left I was taking the first two instead of last two. Here is workflow to see how I have handled it if interested.
2019_01_11_String Manipulation.knwf (38.2 KB)

For these problems I think armingrudd solution using regex is a better approach cause it is only using one node.

Feel free to hit Solution button on replay that solves your problem :wink:

Br,
Ivan

1 Like