Missing Value Handling in Math Functions


If I want to calculate the average from several columns for each row, but there are different missing values in each row, how do I get around this?  

For example, let's say I have the following 2 rows of information, spanning 3 columns:

10    ?    7

?     4     6

Ideally, I would just calculate the average of both rows, and the new column would contain the numbers 8.5 and 5.  As it stands, when you try to do this through the Math Formula node, it gives you blank values :(

Can anyone help?

Try using the Column Aggregator node.  Select the columns that you want an average for.  Click on "options" and choose mean.

Yes, agree exactly with tangeroo, use column aggregator.

the missing value issue with the maths node is a real pain, an issue I have encountered many times before. Thankfully the column aggregator node was a recent addition into knime which makes this solution quite easy now.


Thanks for your comments, Tangeroo and Richard.  Ideally, what would be nice is to calculate a kind of slope for the numbers occurring in each row across the columns concerned.  In my case, they represent charitable giving on a yearly basis.

So let's say you have giving totals across 5 columns (representing 5 years of possible giving for the person)

100 200 ? 500 ?

Ideally it would be nice to have a slope value that represents the average increase (or in another case decrease) on a yearly basis.  Here it would be something over 100.

If all else fails, the aggregator node is decently useful for my purposes.

Perhaps you could use a few column mergers and push the values from the right INTO cells on the left that have missing values.  Then use column aggregators to find the difference between 2 columns and so forth.


Thanks for pointing me towards the Column Aggregator node, it seems very useful. One problem though- I do not see an option for "mean" in the options menu. There are other things there, such as "mode", but no "mean". I am very puzzled by this!


Hello! I have another problem.

I have

Column1 Column2

?              7

4              6

I need column



This column has Column2 as value if Column1 is null. If it's not then new column has Column1 as value. The node Math formula does not understand operator MISSING.

Looks like a job for Rule Engine.