Change row values based on condition

I have a dataframe with with two columns:

ID | Timepoint
1   | Baseline 
1   | 2
1   | 4
1   | 8
2   | Baseline 
2   | 2

What I want is to add the string “Hour” just before the timepoint value whenever the value is not baseline.

So I would get:

ID | Timepoint
1   | Baseline 
1   | Hour2
1   | Hour4
1   | Hour8
2   | Baseline 
2   | Hour2

I though of using a rule engine as follows, but theres an error:

$Timepoint$ NOT "baseline" => "Hour"$Timepoint$ 
TRUE => $Analysis Timepoint (N)$

It would be more helpful if you said what the error was, but based on the rules you provided I can see that they’re not formatted correctly and they don’t really comply with functions that the Rule Engine node can do.

The first issue is that the NOT function must not be preceded by an argument. It’s only followed by an expression.

The second issue is that I don’t think you can concatenate strings with the Rule Engine node. From the node description:

The outcome of a rule may be any of the following: a string (between " or / ), a number, a boolean constant, a reference to another column or the value of a flow variable value

The good news is that there are a couple ways this can be done:

First approach:

  1. Use the Rule Engine node to create a new column with “Hour” in it when the Timepoint column is not “Baseline”. Use the Expression NOT ($Timepoint$ MATCHES "Baseline") => "Hour"
  2. Merge the columns with a String Manipulation node using the expression join($Hour$,$Timepoint$) and make sure that “Insert Missing As Null” is checked.


The second approach, and the one I’m partial to, uses the Column Expressions node to combine both manipulations using the expression

if (contains(column("Timepoint"),"Baseline"))
{ column("Timepoint") }
else { join("Hour",column("Timepoint")) }

Alternatively, you could use the expression

if (not(contains(column("Timepoint"),"Baseline")))
{join("Hour",column("Timepoint")) }
else {column("Timepoint")}



This reminded me of @takbb’s thread about the Java ternary operator, that works in String Manipulation Nodes, while not being part of the documentation.

I never use it in programming because it’s ugly, but it comes in handy because it allows me to Node Golf this question. This expression produces the desired output in a single node, without the use of the cheaty scripting nodes:

string($Timepoint$).equals(“Baseline”) ? $Timepoint$ : join(“Hour”, $Timepoint$)

Does nothing if the cell contains “Baseline”, otherwise concatenate “Hour” in front of it. No picture because it’s a single String Manipulation Node that replaces the “Timepoint” Column.
change row values with condition.knwf (11.9 KB)


I also thought of column expression like @elsamuel

   join("Hour ",column("column1_Arr[1]")) 

The other idea is just add it everywhere and then use a cell replacer to replace Hour Baseline with Baseline


This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.