Extract period from filename

In the filename of my input file (xlsx), I would like to detect the period for further use. The filename is taken from the Excel Reader node and stored in a table as a string. It looks like this: IN/Download 2KEE 2023P05 20230329-121500.xlsx. I need the part that matches the pattern “20??P??”. Help is very much appreciated.

I think I found it: I used the Column Expressions node with this formula: substr(column("FilePathName (#1)"), indexOf(column("FilePathName (#1)"), "2KEE")+7, 7). The result was 2023P05, as required. But is there a better way?
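To make the offset logic explicit, here is a minimal Python sketch of the same substring idea (it is not KNIME code). It assumes the column holds the URL-encoded path quoted later in the thread, where "2KEE%20" is 7 characters long, which would explain the +7. The function name `period_by_offset` and its parameters are hypothetical.

```python
def period_by_offset(path: str, marker: str = "2KEE", offset: int = 7, length: int = 7):
    """Return `length` characters starting `offset` characters after `marker` begins."""
    start = path.find(marker)
    if start < 0:
        return None  # marker not present, nothing to extract
    begin = start + offset
    return path[begin:begin + length]


# URL-encoded form: each space is "%20", so "2KEE%20" spans 7 characters.
encoded = "knime://knime.workflow/data/IN/Download%202KEE%202023P05%2020230329-121500.xlsx"
print(period_by_offset(encoded))  # 2023P05

# Plain form: "2KEE " spans only 5 characters, so the offset must change.
plain = "IN/Download 2KEE 2023P05 20230329-121500.xlsx"
print(period_by_offset(plain, offset=5))  # 2023P05
```

The hard-coded offset is the weak point: it depends on both the marker text and how the separator is encoded.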

Hello @PeWo,

A better way is to use a regular expression. One node that handles this easily is the Regex Extractor node, with the following expression:
20\d{2}P\d{2}
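
For readers who want to try the expression outside KNIME, here is a minimal sketch using Python's `re` module. It assumes the pattern should be searched for anywhere in the string; the helper name `extract_period` is hypothetical.

```python
import re

# "20", two digits, a literal "P", two digits: e.g. 2023P05
PERIOD = re.compile(r"20\d{2}P\d{2}")


def extract_period(name: str):
    """Return the first substring matching the period pattern, or None."""
    match = PERIOD.search(name)
    return match.group(0) if match else None


print(extract_period("IN/Download 2KEE 2023P05 20230329-121500.xlsx"))  # 2023P05
# The search does not depend on spaces or on how they are encoded.
print(extract_period("IN/Download%202KEE%202023P05%2020230329-121500.xlsx"))  # 2023P05
```

Because the pattern is located by content rather than by position, it also works when the separators are missing or URL-encoded.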

Br,
Ivan


It seems like you could also just use the Cell Splitter node with the space character as your delimiter.
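
As a rough illustration of the delimiter idea in Python (not the Cell Splitter node itself), the sketch below splits the plain filename on spaces and takes the third piece. The function name `period_by_split` and the fixed index 2 are assumptions based on the example filename.

```python
def period_by_split(name: str, index: int = 2, delimiter: str = " "):
    """Split on the delimiter and return the piece at `index`, or None if absent."""
    parts = name.split(delimiter)
    return parts[index] if index < len(parts) else None


print(period_by_split("IN/Download 2KEE 2023P05 20230329-121500.xlsx"))  # 2023P05
# If the spaces are missing, there is only one piece and nothing to return.
print(period_by_split("IN/Download2KEE2023P0520230329-121500.xlsx"))  # None
```

This only works while every filename uses the same delimiter and the same field order.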

Thank you, Ivan. I tried, but I can’t find the Regex Extractor node; I only see Regex Split. I tried installing some extensions, but without success.
When I insert a Regex Split node with your proposed expression, I don’t get a result.
When I change it slightly to (20\d{2}P\d{2}) and run it on my input (knime://knime.workflow/data/IN/Download%202KEE%202023P05%2020230329-121500.xlsx), I do get a separate column, but it stays empty. I admit I haven’t mastered regex. Any hints?
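
One possible explanation, offered as an unverified assumption about the Regex Split node: if the node requires the pattern to match the entire string rather than a part of it, a bare capture group will never match a full path. The Python sketch below mimics that behaviour with `re.fullmatch`; the helper name `split_period` is hypothetical.

```python
import re

path = "knime://knime.workflow/data/IN/Download%202KEE%202023P05%2020230329-121500.xlsx"

# Whole-string matching: the bare group cannot cover the full path, so no match.
print(re.fullmatch(r"(20\d{2}P\d{2})", path))  # None


def split_period(value: str):
    """Match the whole string, capturing only the period in group 1."""
    match = re.fullmatch(r".*?(20\d{2}P\d{2}).*", value)
    return match.group(1) if match else None


# Padding the group with ".*" lets the pattern span the whole string.
print(split_period(path))  # 2023P05
```

If the assumption holds, an expression such as .*(20\d{2}P\d{2}).* in the node may fill the new column; this is worth testing rather than taking as given.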

Correct, but what if a user does not use spaces? Then my script doesn’t work.

Looking at that file name, I had just assumed that it was system-generated. If the file names are written by users, the potential for error certainly creates risks and challenges on the regex side as well, unless you issue some sort of file naming guidance.

Hello @PeWo,

It is part of the Palladian extension. Check these instructions:

Br,
Ivan
