Column expressions adds a "(#1)" to each column name

ThomasJ · July 20, 2020, 1:23pm

I have two expressions in a Column Expression node. I have a simple string manipulation expression to strip each column of whitespace, and a rule expression which removes empty strings. I’m very new to Knime but i think these work ok except the rule expression adds “(#1)” to every column name. The names go into the column expressions node as (e.g.) “Product_code_1” and come out as “Product_code_1 (#1)”.

Clearly I’m doing something wrong but i don’t know what.

Below is a screenshot of the nodes in sequence, the regex nodes change the column names all to ANONYMOUS_COLUMN and then changes it back with a flow variable. I tried to make the workflow without renaming the columns and adding the flow variable of the currentColumnName to the output flow variable in the column expression node, but this worked out even worse as each column was duplicated 3 time. I saw someone using the renaming technique in an example in order to make the Output Column on the expression dynamic (instead of using flow variable).

The rule expression is:

if (column(“ANONYMOUS_COLUMN”) != “”) {
column(“ANONYMOUS_COLUMN”)
}

ipazin · July 20, 2020, 1:57pm

Hi there @ThomasJ,

welcome to KNIME Community!

How about using String Manipulation (Multi Column) node if you are performing same manipulation over multiple columns?

Have couple of ideas what could be wrong in your current design but it would be best if you share workflow to avoid unnecessary guessing if you are interested to find that out See here how to do that:

Br,
Ivan

DemandEngineer · November 17, 2020, 9:53pm

I really like the Column expression node as it lets us build some fairly complex logic without having to go to another node.

2 Feature requests:

add a function to let you use regex to parse out parts of a string as a capture group
allow you to use new output columns created in expressions above it to be used in the current expression… I realize the workaround is to use a second node after the column is created with the new column… this one is definitely less important

gab1one · November 18, 2020, 10:45am

Hi @DemandEngineer,

That is a good suggestion, you can already do this with the Regex Split node though, even if that is a bit less handy than column expressions.

best,
Gabriel

DemandEngineer · November 18, 2020, 3:45pm

Thanks for the suggestion, @gab1one . I was mainly asking for that feature so that I could have access to the regex in the same node’s if else logic. I wanted to avoid doing the regex first to create a new column of data just for the logic and then filtering out the column after.

gab1one · November 19, 2020, 2:11pm

Yes that makes a lot of sense, I will add this to the list of possible improvements of that node. Just wanted to let you know that the regex node exists so you can do what you, even if it is not as convenient as it should be.

best,
Gabriel

roberting · August 4, 2021, 2:36pm

Hi there,
I think the thread went a bit off-topic? the main question – “(#1)” being appended to the column name – is not answered, as far as I can see.
I have the same problem as the thread author: the column name is changed after some simple string manipulation within the column expressions node. Is this a general behavior we have to live with?

ipazin · August 4, 2021, 2:42pm

Hello @roberting,

and welcome to KNIME Community!

Just a little bit (We are not that strict about it.)

Do you also use Column Expressions inside loop? As indicated in answer to original post can you share workflow example? Dummy data works just fine

Br,
Ivan

roberting · August 5, 2021, 3:49pm

Hi Ivan,
thanks for your reply! actually I used the node without a loop, but I realized, that the number #1 is only appended, when I use more than one expression inside the node.

bruno29a · August 5, 2021, 4:44pm

Hi @roberting and welcome to the Knime Community.

The (#1) being appended to the column name happens in the event that you have 2 columns with the same name. Just like you can’t have 2 fields with the same name in a db table, the same rule applies in Knime tables.

There are several cases where this can happen. You can either rename the first column before the process that is going to create/generate another column with the same name, or rename the column after the process has created/generated another column with the same name.

In this case, what probably happened is that the user did not choose the “Replace Column” option during the String Manipulation, but instead chose to “Append new column” and added the same column name as new column. So Knime created a new column with the same name, but since the name already existed, Knime appends (#1) to the name so that there is no name clash/conflict. It will append (#2), (#3), etc is there are more columns with the same name.