How to Split 1 Column to 9 Columns using regex

Hi friends,

I’ve been trying to process this thru regex but the alphabet didn’t detect. I want to split the column into more columns using regex.

Before Raw Data
Col0
2153531702 BSER-IENC PLMT INTERNAL
0355000000 3,602.40- 3,602.40- 0.00 0.00 3,602.40 3,602.40
0744000000 3,602.40 3,602.40 0.00 0.00 3,602.40- 3,602.40-
2153531840 BSER-IENC PLMT INTERNAL
0355000000 76.26 102.94 0.00 0.00 76.26- 102.94-
0744000000 76.26- 102.94- 0.00 0.00 76.26 102.94
2250511036 CURRENT A/C - TEMPORARY OD-AUD
4850000000 2.08- 2.04- 2.08- 2.04- 0.00 0.00
2250511344 CURRENT A/C - TEMPORARY OD-HKD
0304000000 121.45- 21.02- 121.45- 21.02- 0.00 0.00
0305000000 155.88- 26.98- 155.88- 26.98- 0.00 0.00
0339000000 1.42- 0.24- 1.42- 0.24- 0.00 0.00
0747000000 1,114.98- 193.04- 1,114.98- 193.04- 0.00 0.00
4612000000 564.69- 97.77- 564.69- 97.77- 0.00 0.00
4850000000 1259,613.96- 218,090.41- 1259,613.96- 218,090.41- 0.00 0.00

After Raw Data

Column 1 Column 2 Column 3 Column 4 Column 5 Column 6 Column 7 Column 8 Column 9
2153531702 BSER-IENC PLMT INTERNAL 355000000 3,602.40- 3,602.40- 0 0 3,602.40 3,602.40
2153531702 BSER-IENC PLMT INTERNAL 744000000 3,602.40- 3,602.40- 0 0 3,602.40 3,602.40
2153531840 BSER-IENC PLMT INTERNAL 355000000 76.26 102.94 0 0 76.26- 102.94-
2153531840 BSER-IENC PLMT INTERNAL 744000000 76.26- 102.94- 0 0 76.26 102.94
2250511036 CURRENT A/C - TEMPORARY OD-AUD 4850000000 2.08- 2.04- 2.08- 2.04- 0 0
2250511344 CURRENT A/C - TEMPORARY OD-HKD 304000000 121.45- 21.02- 121.45- 21.02- 0 0
2250511344 CURRENT A/C - TEMPORARY OD-HKD 305000000 155.88- 26.98- 155.88- 26.98- 0 0
2250511344 CURRENT A/C - TEMPORARY OD-HKD 339000000 1.42- 0.24- 1.42- 0.24- 0 0
2250511344 CURRENT A/C - TEMPORARY OD-HKD 747000000 1,114.98- 193.04- 1,114.98- 193.04- 0 0
2250511344 CURRENT A/C - TEMPORARY OD-HKD 4612000000 564.69- 97.77- 564.69- 97.77- 0 0
2250511344 CURRENT A/C - TEMPORARY OD-HKD 4850000000 1259,613.96- 218,090.41- 1259,613.96- 218,090.41- 0 0
![image 690x180](upload://jZOVBWsC7sxbzvzLGC7DQOPfTZw.png)
1 Like

Hi @seanmanzanilla

I’m a bit confused with your data. If I copy & paste as below the “text” version that you have shared with us, it seems to be separated by spaces, but from your snapshot, it seems to be separated by tabs (\t).

So maybe in your workflow, there is a data transformation that converts tabs into spaces. In other words, originally your data seams to be tabulated, but eventually it is not, which makes its processing complicated.

Could you please post here your data in text format as it seems to be originally tabulated ? It would be easier for us to help you from there.

Hope this helps.

Best

Ael

2 Likes

Sorry for that, Here’s my sample Raw Data.

Sample Raw Data.txt (2.2 KB)

I wanted the output to be like this

Column 1 Column 2 Column 3 Column 4 Column 5 Column 6 Column 7 Column 8 Column 9
2153531702 BSER-IENC PLMT INTERNAL 355000000 3,602.40- 3,602.40- 0 0 3,602.40 3,602.40
2153531702 BSER-IENC PLMT INTERNAL 744000000 3,602.40- 3,602.40- 0 0 3,602.40 3,602.40
2153531840 BSER-IENC PLMT INTERNAL 355000000 76.26 102.94 0 0 76.26- 102.94-
2153531840 BSER-IENC PLMT INTERNAL 744000000 76.26- 102.94- 0 0 76.26 102.94
2250511036 CURRENT A/C - TEMPORARY OD-AUD 4850000000 2.08- 2.04- 2.08- 2.04- 0 0
2250511344 CURRENT A/C - TEMPORARY OD-HKD 304000000 121.45- 21.02- 121.45- 21.02- 0 0
2250511344 CURRENT A/C - TEMPORARY OD-HKD 305000000 155.88- 26.98- 155.88- 26.98- 0 0
2250511344 CURRENT A/C - TEMPORARY OD-HKD 339000000 1.42- 0.24- 1.42- 0.24- 0 0
2250511344 CURRENT A/C - TEMPORARY OD-HKD 747000000 1,114.98- 193.04- 1,114.98- 193.04- 0 0
2250511344 CURRENT A/C - TEMPORARY OD-HKD 4612000000 564.69- 97.77- 564.69- 97.77- 0 0
2250511344 CURRENT A/C - TEMPORARY OD-HKD 4850000000 1259,613.96- 218,090.41- 1259,613.96- 218,090.41- 0 0
![image 690x180](upload://jZOVBWsC7sxbzvzLGC7DQOPfTZw.png)

This is what is looks like in Knime.

Hi @seanmanzanilla

Your data is not originally tabulated but space-separated. The good news is that there are enough spaces as separators to recognize that this is not just a set of spaces but a tabulator-like column separator. My recommendation here would be to use a -string manipulation- node to replace a sequence of white spaces by tabs and then process it further, as follows:

20220210 Pikairos How to Split 1 Column to 9 Columns using regex.knwf (52.0 KB)

Hope it helps.

Best

Ael

1 Like

Hi @aworker, sorry for the confusion I need it also to append.

image

The Rows that contains Word/String is connected to succeeding rows until the next Rows has new Word/String.

Ex.
Row 0 = Row 1, 2
Row 3 = Row 3,4
Row 6 = Row 7
Row 8 = Row 9,10,11,12,13,14
Row 15= Row 16 to 20

Just like this one One to Many

Hi @seanmanzanilla

No problem at all. Please find below a workflow for the full processing as needed :

20220210 Pikairos How to Split 1 Column to 9 Columns using regex full.knwf (89.8 KB)

Hope it helps.

Best

Ael

3 Likes

Thank you! it will help a lot!

1 Like

I guess so haha ha :wink: !

Glad to help and thanks for having validated the solution :+1: !

Best wishes

Ael

1 Like

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.