Missing Value Handling in Math Functions

inkhorn · May 23, 2013, 7:45pm

Hi,

If I want to calculate the average from several columns for each row, but there are different missing values in each row, how do I get around this?

For example, let's say I have the following 2 rows of information, spanning 3 columns:

10 ? 7

? 4 6

Ideally, I would just calculate the average of both rows, and the new column would contain the numbers 8.5 and 5. As it stands, when you try to do this through the Math Formula node, it gives you blank values :(

Can anyone help?

tangerooo · May 23, 2013, 8:53pm

Try using the Column Aggregator node. Select the columns that you want an average for. Click on "options" and choose mean.

richards99 · May 23, 2013, 10:06pm

Yes, agree exactly with tangeroo, use column aggregator.

the missing value issue with the maths node is a real pain, an issue I have encountered many times before. Thankfully the column aggregator node was a recent addition into knime which makes this solution quite easy now.

simon

inkhorn · May 23, 2013, 10:25pm

Thanks for your comments, Tangeroo and Richard. Ideally, what would be nice is to calculate a kind of slope for the numbers occurring in each row across the columns concerned. In my case, they represent charitable giving on a yearly basis.

So let's say you have giving totals across 5 columns (representing 5 years of possible giving for the person)

100

200

?

500

?

Ideally it would be nice to have a slope value that represents the average increase (or in another case decrease) on a yearly basis. Here it would be something over 100.

If all else fails, the aggregator node is decently useful for my purposes.

tangerooo · May 24, 2013, 12:25am

Perhaps you could use a few column mergers and push the values from the right INTO cells on the left that have missing values. Then use column aggregators to find the difference between 2 columns and so forth.

dsiegel · April 19, 2014, 1:28am

Hi,

Thanks for pointing me towards the Column Aggregator node, it seems very useful. One problem though- I do not see an option for "mean" in the options menu. There are other things there, such as "mode", but no "mean". I am very puzzled by this!

-dsiegel

Goger · October 6, 2017, 8:07am

Hello! I have another problem.

I have

Column1 Column2

? 7

4 6

I need column

7

4

This column has Column2 as value if Column1 is null. If it's not then new column has Column1 as value. The node Math formula does not understand operator MISSING.

Tom_Hawkins · October 10, 2017, 4:45pm

Looks like a job for Rule Engine.