Hi, I have a table with 2 columns: age and gender.
I would like to replace all the values equal to zero or less than zero (for example all the wrong negative values) for the column age with NULL values.
Is there a way or a workaround for perform this sort of operation?
Thanks in advance.
Hi @gujodm -
You can use the Rule Engine node to do this. If you define a rule like this:
$age$ > 0 => $age$
and use the Replace Column option for age, any age value less than or equal to zero will be replaced with NULLs, and the other values will be kept as-is.
@ScottF - is there a loop-type node for rule engine which can run this rule through multiple (100’s of columns)?
If you have a static missing value pattern, you could use the Column Auto Type Cast node to set values to missing across columns without any looping at all.
If you needed to apply a rule as above, I think you could use the Column List Loop Start node to apply it columnwise.
Does that help?
@ScottF - please could you provide some guidance as to how to code the Rule Engine using variables?
Sometimes Rule Engine, Variables and Null values can be tricky. I have added an example where you could define your own replacement rules. Age gets replaced by NULL and gender by “NA”.
- there is no generic NULL in the Rule engine but ipazin came up with an clever solution (Create or replace column with NULL values) - create a column with just missing and use that
- you would always need a TRUE rule that covers everything else. If you do not have one you will create missing values
- you could deal with missing values later with the missing value node
The rules are defined in Excel and are then applied in a loop in KNIME
kn_example_replace_missings.knwf (47.1 KB)
This is a great example - thanks so much! Variable looping is super useful and having the rules documented in a separate sheet brings a lot of transparency!
One thing I noticed was the age rule didn’t seem to replace the age <=0 with the _dummy_null column, I am not entirely sure why? The gender rule worked well. Thanks.
it is a problem with the loop, the results are not stored properly. I changed the loop, although I am not completely satisfied with the result it shows what could be done.
kn_example_replace_missings.knwf (118.4 KB)