Problem with Rename column titel

My basic file read in with Excel Reader node has various columns. Including columns that start with a 7-digit number, of which the first 3 digits always start with 605. After the 7 digits, there is a longer alphanumeric sequence separated by spaces. My goal is to create a workflow that automatically selects the columns starting with 605 and renames the column title to the first 7 digits, i.e. 605xxxx. This means that the alphanumeric sequence attached to the original name should no longer be displayed. Iā€™ve already used the String Manupulation Node and the Coumn Rename Node for this, but I didnā€™t get anywhere with both. I think that one difficulty is, among other things, that the content contains multiple lines in the string formated cell - example:

6052311
KA3
0M41
R 1300 R

I just want 6052311 in the title column. Iā€™m asking for urgent help.

Hello @steppel23 and welcome to the KNIME Community

You can test the following code in a ā€˜Column Rename (Regex)ā€™ node:

Search String (regexp):

(605\d{4}).+

Replacement:

$1

Regards

1 Like

Another aproach: Filter the table as long as you need to get only one row with the desired headings as a result. Then use this row in the ā€œRow to column namesā€ node. This node uses the first row only, this is why I ask you to filter first.

Hey gonhaddock - thanks a lot. Yeah, I tried to fix my problem with this node yesterday and also got a same search string from chat gpt but nothing of this is working. I will attach you my reduced example excel sheet. The colums from P to V are the one I want to rename - in the result there must be for example (columne P) just 6052311. Do you have any idea? thanks a lot, stefan
Example_KNIME.xlsx (12.0 KB)

Hello @steppel23
Sorry because I didnā€™t take so literal your challenge description. The goal to be achieved is because the format of your column headers is multiline text with carriage returns [\r]

My proposal is based in ā€˜Column Rename (Regex)ā€™ node as well, but we can split the problem in two steps.

The first one aiming to remove the carriage returns in all the columns (Node 2):

Search String (regexp):

(?:[\s])(.+)?

Replacement:

$1

And for the second step -rename target columns- you can use the code from my first post (Node 3): (605\d{4}).+

BR

Dear Mc Ready, thanks for your help - but this was not the solution I needed. I just replyed to gonhaddock one post before yours. I donā€™t have a problem with elaminating rowsā€¦ I think the reason why the renaming is not working is the in the header row by itself - I atteched you my example excel sheet as well. in want to rename row P to V, in the result there should be only for example in P: 6052311. And I need a workflow, that this will work atumatically - because next time I have another basic excel sheet where this must work y itself - the trigger is always that the renamed colums begin with 605ā€¦
Example_KNIME.xlsx (12.0 KB)

Ah, ok, now I see what you mean - multi-row headers with line-breaks in Excel.

Try this one:

Row Splitter: Matches wildcard with Value ā€œ605*ā€
String Manipulator: substr($Column Name$,0,7) with output to ā€œnew column nameā€
Column Renamer with Lookup ā€œColumn nameā€ and Names column ā€œnew column nameā€

1 Like

@gonhaddock you made my day - thank you so much!

1 Like

Just for the records; this code was properly working for @steppel23 because the challenge column headers didnā€™t have any space character before the first carriage return (current code removes spaces)ā€¦

Aiming to make the code more robust, and honors space characters in the first text row; you can use the following code version:

- - - Remove ā€˜carriage Returnsā€™ [\r] from multi-line text in column headers - - -


regex101.com

ā€˜Column Rename (Regex)ā€™ node > Search String (regexp):

(?:[\s])?(.+)

Replacement:

$1|

BR