Splitting a cell

Hi all,

I’m currently processing a Java export in which I have a column of activity codes, as shown below.

These represent activity codes that are assigned to tasks in Asta Powerproject.

What I’d like to do is split the cell into 01 xxxx, 02 xxxx, and so on, all the way to 14. I tried using a cell splitter on commas, but there are instances where more than one code can be assigned, which means splitting by comma doesn’t work.

How would you split the cells?

I’ve attached a dummy Excel file.
dummy data.xlsx (14.9 KB)

Any help would be greatly appreciated

Hi @lahiru_ten , I have a couple of options here.

  1. You could try the Regex Extractor node for this, which is part of the Palladian extension available from NodePit:

The following regex will hopefully be a reasonable starting point:

([0-9]+\s+[^0-9\[\],]+)

This matches a sequence of one or more digits, followed by at least one whitespace character, followed by one or more characters that are not digits, square brackets or commas. The whole match is captured as a single group.

Alternatively capture only if it begins with 2 digits…

([0-9]{2}\s+[^0-9\[\],]+)
  2. There is also Regex to List from AF Utilities (kudos to @AnotherFraudUser) which you can then follow with an Ungroup or a Split Collection Column.

Configure using the same regex as above
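
To see what the two patterns above return outside of KNIME, here is a minimal Python sketch. The cell values are hypothetical (I have not reproduced the attached file), and the helper name extract is only for illustration. It also shows what happens when a title itself contains a digit: the negated character class stops at that digit.

```python
import re

# Hypothetical cell values; the real export's format may differ.
SIMPLE = "[01 Phase: Design, 02 Area: North]"
WITH_DIGIT = "[01 Project: Tower 2 Fitout, 02 Area: North]"

ANY_DIGITS = re.compile(r"([0-9]+\s+[^0-9\[\],]+)")
TWO_DIGITS = re.compile(r"([0-9]{2}\s+[^0-9\[\],]+)")


def extract(pattern, cell):
    """Return every non-overlapping match, with surrounding whitespace stripped."""
    return [match.strip() for match in pattern.findall(cell)]


simple_codes = extract(ANY_DIGITS, SIMPLE)
loose_codes = extract(ANY_DIGITS, WITH_DIGIT)
strict_codes = extract(TWO_DIGITS, WITH_DIGIT)

print(simple_codes)  # ['01 Phase: Design', '02 Area: North']
print(loose_codes)   # ['01 Project: Tower', '2 Fitout', '02 Area: North']
print(strict_codes)  # ['01 Project: Tower', '02 Area: North']
```

With either pattern the first code is cut short at "Tower", so both are only safe when the text after the code number contains no digits.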

Edit:
The following regex (courtesy of ChatGPT) might be a better fit for the project name structure, as looking down the list I noticed that some projects contain digits in the title:

([0-9]{2}\s[^:]+:[^,\]]+)
  • [0-9]{2}: Matches exactly two digits.
  • \s: Matches a single whitespace character after the digits.
  • [^:]+: Matches any characters that aren’t a colon (so the text before the colon).
  • :: The literal colon.
  • [^,\]]+: Matches any characters that aren’t a comma or closing square bracket (so the text after the colon up to a comma or bracket).
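
As a rough sketch of how this pattern could feed the 01 to 14 split asked for in the question, the Python below groups each match by its two-digit prefix. The sample cell and the function name split_by_prefix are assumptions for illustration, not taken from the attached file.

```python
import re
from collections import defaultdict

CODE = re.compile(r"([0-9]{2}\s[^:]+:[^,\]]+)")


def split_by_prefix(cell):
    """Group every matched code under its two-digit prefix ('01', '02', ...)."""
    grouped = defaultdict(list)
    for match in CODE.findall(cell):
        # The pattern guarantees the first two characters are digits.
        grouped[match[:2]].append(match.strip())
    return dict(grouped)


# Hypothetical cell with two codes sharing the prefix 02.
cell = "[01 Project: Tower 2 Fitout, 02 Area: North, 02 Area: South]"
result = split_by_prefix(cell)
print(result)
# {'01': ['01 Project: Tower 2 Fitout'], '02': ['02 Area: North', '02 Area: South']}
```

One caveat: [^:]+ will happily run across commas, so an entry without a colon would be merged with the entry that follows it. If the real data has such entries, the pattern would need adjusting.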

Much appreciated for the help!!!