# Replace certain numeric values with null or missingvalue

I have some odd values in numeric columns like 9999, 9998, 9991. They mean missing or some odd value that is intentionally labeled as a very large numeric value for recognition. It seem to be the old fashion statistical data encoding practice. Is there a short logical way to replace these specific values with null or missingvalue? I do not want to convert the Numeric column to String, do String manipulation using toNull(), and then convert it back to Numeric, if possible. It would be nice to do this in Rule engine or in Math formula nodes. But there, how do we assign Null as a replacement for a numeric value in the expression?

Hi @dursundelen , the rule engine will make something missing if you donâ€™t give it an outcome for a particular value, so if all of your special values in â€śColumn1â€ť are (for example) 9990 and above then a rule engine that says:

`\$Column1\$<9990 =>\$Column1\$`
and set to replace Column1 would
replace this:

with this:

There isnâ€™t an easy way of doing a â€śnot equalsâ€ť in the rule engine, so if the values you want to replace are more varied and not easily found by all being â€śless thanâ€ť or â€śgreater thanâ€ť, then one option might be two Rule Engines. The first would have rules matching all the values you want to remove, setting them to a single â€śspecifialâ€ť value, which is then interpreted by the second Rule engine in the way described above,

e.g. First rule engine

Second rule engine

This would handle if you had a variety of â€śspecial valuesâ€ť that werenâ€™t just in one range

becomes

6 Likes

Hi takbb,

Thank you for the quick and well detailed answer. I like it. In teh meantime, I tried the following (although it is a two step process).
Step 1.
Used a Rule engine node to replace the odd values with â€śâ€ť (empty string / nothing)
\$Column1\$ >= 9900 => â€śâ€ť
\$Column1\$ < 0 => â€śâ€ť
TRUE => \$age\$
Choosing append a column. will create a new String column. The replaced cells will have no values in them (but not Nulls either, not yet)
Step 2.
Used a String to Number node for the newly created column. That will convert the String column to a numeric column with empty cells converted to Null values.

1 Like

Another trick that I sometimes use to insert missings into a numeric column is by using the Math Formula node. Starting with

The following Math Formula node config

will insert a missing value:

It would be nicer if there was an explicit way to insert missing values with the Rule Engine. Feature suggestion!

Best
Aswin

6 Likes

Thank you, Aswin,

In your expression, the value of â€ś0/0â€ť is interpreted as Null. Wow!

1 Like

I think it is worth mentioning is that it is possible to combine a Math Formule node and a Rule Engine node:

Use the Math Formula node to create a column that consists of only missing values (again by using 0/0), and then use a Rule Engine to insert those missing values in the column of interest.

2 Likes

Hello there,

there is NOT function in Rule Engine one can use for â€śnot equalsâ€ť. So following expression with replace option should do the trick if I got it right:

`NOT \$column1\$ >= 9900 AND NOT \$column1\$ < 0 => \$column1\$`

@Aswin request noted on existing ticket. (Internal reference: AP-9716)

Br,
Ivan

5 Likes

Lolâ€¦ @ipazin, I canâ€™t believe I didnâ€™t spot thatâ€¦ All this time Iâ€™d been wrongly thinking the NOT was simply for handling boolean columnsâ€¦ rather than negating other predicatesâ€¦ so had completely overlooked what should be obvious. I clearly need some stronger coffeeâ€¦

On a related topic from yesterday (where I also started off on completely the wrong track ) , there is a use of String Manipulation (multi column) which can also be used as a type of Rule Engine for similar use casesâ€¦ (ie turning specific values to â€śmissingâ€ť across a dataset)

https://forum.knime.com/t/string-to-missing-value/33901/5?u=takbb

2 Likes

Hello @takbb,

not bad approach but seems a bit too much code in KNIME (String Manipulation) terms

Br,
Ivan

2 Likes

This topic was automatically closed 182 days after the last reply. New replies are no longer allowed.