How to use Regex Split Column

Hi Guys,

Can someone help me with regex cuz I’m having a hard time trying to split the data into 5 columns using regex split. I’m beginner in RegEx.

Sample Data
I remove the dublicate space.
image

FX Sell - FX Forward 28,424,437,259,534.00 705,130,159,759.00 432,808,376,076.00 28,152,115,475,851.00
FX Sell - FX Forward-Arbitrage 1,189,409,133,972.00 88,638,777,818.00 66,847,812.00 1,100,837,203,966.00
FX Sell - FX Forward-NDF 763,142,000,000.00 90,000,000.00 0.00 763,052,000,000.00
FX Sell - LTFX 324,594,630,527.00 403,635,711.00 0.00 324,190,994,816.00
FX Sell - LTFX - Internal 709,693,879,680.00 481,467,420.00 0.00 709,212,412,260.00
FX Sell - FX Spot 6,262,183,991,491.00 4,472,523,449,594.00 4,136,565,241,758.00 5,926,225,783,655.00
FX Sell - FX Spot-Arbitrage 1,222,763,124,154.00 744,975,068,104.00 428,525,143,886.00 906,313,199,936.00

Sample Output

Thank you in advance!

1 Like

Hello @seanmanzanilla,

give this expression a try:
(.*) ([\d,]+\.\d\d) ([\d,]+\.\d\d) ([\d,]+\.\d\d) ([\d,]+\.\d\d)

(If I know how to write expression for repeating patterns above code would be way shorter :sweat_smile: And first capturing group should be improved…)

Br,
Ivan

3 Likes

Hello @seanmanzanilla

To keep it simple i would solve this in a two step approach:

  1. Try to split the text part from the numeric part with the ‘Regex Split’ node

(^[a-zA-Z -]+)(.*)$

The final text $split_0$ space can be easily strip()ed with the ‘String Manipulation’ node.

  1. A second ‘Cell Splitter’ node with space as delimiter will disaggregate the four remaining numbers.

BR

2 Likes

It looks like there is a space between your numbers? If so cell splitter without regex possible like @gonhaddock suggested?
br

2 Likes

The input behaves nicely for this kind of split: The last 4 columns are separated by whitespace, with an arbitrary number of spaces in the first column.

Ivan’s expression can actually be simplified to
^(.*) ([^\s]*) ([^\s]*) ([^\s]*) ([^\s]*)$
because we do not care about number formats at this point.

We can’t use the normal Cell Splitter for that. It can do a fixed number of columns, but only from the beginning. But there’s one simple trick: Reverse the input column to use the normal Cell Splitter. The resulting columns values and order then need to be reversed back. Better to use the RegEx Split though.


How to use Regex Split Column.knwf (50.6 KB)

4 Likes

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.