How do I split a column into 2 based on 2 separate delimiters? (one or the other)

Hello there,

I am new and would appreciate help in figuring out my issue on splitting the column into 2.

I want a column split so that the below delimiters are kept in column 1, but everything after these is split into column 2. There is also a space after each of these, so I could just split it out by the first space/underscore if that is easier! Thanks

LW
L4
L12
L24
L52

Column 2 would contain any text after the above mentioned text/numbers.

image

Hello @UKfan12 and welcome to the KNIME community.

You can simply use a Cell Splitter node stepping on ‘_’ as cell delimiter. The triggered problem is that you will split the other ‘_’ within your text. This approach will generate further work to concatenate back the description columns with Column Aggregator (concatenate, delimiter value ‘_’).

The Regex Split node can do the job in a single node using the following code:

(.*?)_(.*)

BR

2 Likes

Thank you for the quick reply!

Unfortunately I get this when I use your formula. Not sure where I messed up.

image

image

image

Hello @UKfan12
I am testing it and it works properly in my side…


20230222_regex_split_underscore.knwf (12.0 KB)

Can you provide a sample of your column to check?

BR

2 Likes

Your workflow does not have the data embedded so unfortunately nobody can work with it.

image

Please run the workflow, export it again and uncheck the box to reset the workflow.

@gonhaddock Could also be that he has a \n as delimiter. KNIME represents it in the same way.

2 Likes

Sorry about that!
Sample 2.knwf (78.6 KB)

1 Like

So I tried the below formula
image

and it worked on some but not all of the rows for some reason…

image

Hello @UKfan12
Congrats! I was expecting something like that, special characters hided on display. But some rows do not work with it.

BR

@UKfan12
There is an Unicode mistranslation at some point of your data preparation… I would probably solve this in a first instance with a String manipulation node with the following code:

image

regexReplace(
	regexReplace($ColumnNames$, "\\n", "_")
	, "\/", "")

And finally the original Regex Split code:

(.*?)_(.*)

BR

2 Likes

Perfect, thanks so much!

2 Likes

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.