Column rename with suffix

Braveen · September 10, 2020, 3:57pm

Hi,

I am trying to create a denormalized table by concatenating tables one over another.

I can’t find a way to append the table name as a suffix to each column name. I saw that the row IDs get a suffix.

For example:
Table name : Fruits
Col 1: Apple
Col 2: Grape

I want the output to be like:

Col 1: Apple_Fruits
Col 2: Grape_Fruits

I tried the Column Rename (Regex) node, but it replaces the whole name with the given value.

Any suggestion or workflow would be appreciated.

Thanks.

s.roughley · September 10, 2020, 5:28pm

The Column Rename (Regex) node should work if you set the pattern as:

(.+)

And the replacement as

$1_Fruits

Steve

scapuzzi · September 10, 2020, 5:32pm

@Braveen, this is the solution. I have verified.

Braveen · September 11, 2020, 4:19am

Thanks for the answer.

Out of curiosity, how does $1 come to mean the original column name in KNIME?

Also, I have 4 column names that I want to exclude from the renaming (there are usually 12-15 columns in each file). What is the syntax for that?

Let’s say apple, banana, baby and ox are the four names.

I read and tried the documentation on the node help page, but I couldn’t figure it out.

I know I can do this with the Column Rename node by changing the names manually. But I have to do this for 12 files.

I would highly appreciate the help.

Braveen · September 11, 2020, 4:19am

Thanks for verifying it.

s.roughley · September 11, 2020, 6:31am

OK, first the easy-to-explain part - in the matching pattern, (.+), ‘.’ matches any character, and ‘+’ means 1 or more of them, so ‘.+’ matches the entire column name. That match is surrounded by ‘(’ and ‘)’, which in regular expressions is a capturing group (see https://www.regular-expressions.info/refcapture.html). In the replacement, ‘$1’ refers to the contents matched by the first capturing group - in this case the whole column name.
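To see the capturing group at work outside KNIME, here is a minimal sketch in Python. It is only an illustration, not what the node runs internally: KNIME uses Java regular expressions, where the group reference is written $1, while Python's re module writes it as \g<1>. The function name add_suffix is made up for this example.

```python
import re

def add_suffix(column_names, suffix):
    """Append _<suffix> to every column name (hypothetical helper)."""
    # (.+) captures the whole column name as group 1.
    pattern = re.compile(r"(.+)")
    # \g<1> is Python's spelling of the $1 used in the KNIME dialog.
    # The suffix is assumed to be plain text without backslashes.
    replacement = r"\g<1>_" + suffix
    return [pattern.sub(replacement, name) for name in column_names]

print(add_suffix(["Apple", "Grape"], "Fruits"))
```

With the two column names from the original question this should give Apple_Fruits and Grape_Fruits.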

Now the trickier part - excluding names. You can either use a Column Splitter node to separate the columns you don’t want to rename from those you do, rename the latter as above, and then put them back together with a Column Appender node, or you can change your regular expression as follows:

(?!apple|banana|baby|ox)^(.+)

This works as follows - the first part, enclosed in ‘(’ and ‘)’, is called a negative lookahead (see https://www.regular-expressions.info/lookaround.html) - the match must start at a position where whatever follows the ‘?!’ sequence does not match, with ‘|’ meaning ‘or’. Next comes ‘^’, which means ‘the start of the string’, and then the same ‘one or more of any character’ match we had previously. Without the ‘^’ you will get some very strange behaviour in this case.
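The effect of the ‘^’ can be tried out with a small Python sketch. The assumptions: it uses Python's re.sub, which replaces every match it finds in the name, with \g<1> standing in for $1. The node itself is Java-based, so take this as an illustration of the regular expression only, not a guaranteed reproduction of the node's behaviour. The names WITH_ANCHOR, WITHOUT_ANCHOR and rename are made up for the example.

```python
import re

# The pattern from the post, and the same pattern with the ^ left out.
WITH_ANCHOR = re.compile(r"(?!apple|banana|baby|ox)^(.+)")
WITHOUT_ANCHOR = re.compile(r"(?!apple|banana|baby|ox)(.+)")

def rename(pattern, name, suffix="Fruits"):
    """Apply the replacement <group 1>_<suffix> wherever the pattern matches."""
    return pattern.sub(r"\g<1>_" + suffix, name)

for name in ["apple", "ox", "grape", "oxen"]:
    print(name, "->", rename(WITH_ANCHOR, name), "|", rename(WITHOUT_ANCHOR, name))
```

In this sketch the anchored pattern leaves apple and ox alone and renames grape. Without the ‘^’, the search simply moves on one character, matches the rest of the name from there, and apple still ends up as apple_Fruits, so the exclusion silently stops working. Note also that the lookahead only checks how the name starts, so a column called oxen would be skipped as well; if that matters, a stricter lookahead such as (?!(?:apple|banana|baby|ox)$) is one possible variation.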

Steve

Braveen · September 11, 2020, 7:20am

Awesome Steve. That’s informative.
