Convert Missing Values to NaN for use in Math Formula Node

Hi,

I'm trying to use the Median function in the Math Formula node. I am using 6 columns for input and unless there are 6 missing values in a row, I want the function to return an actual number. This will happen if there are NaN entries, but not missing values.

Here's the behavior I'm seeing when applying the median function:

Row A: median(double, double, double, double, double, double) = double

Row B: median(double, double, NULL, double, double, double) = NULL (or missing)

What I want from Row B is a double that is calculated using all but the NULL value.

Row C: median(NULL, NULL, NULL, NULL, NULL, NULL) = NULL

Row C's behavior is preferred, but returning NaN would also be okay.

 

The documentation for the Math Formula node claims that I will get the desired behavior if I have a NaN value and not something Missing (NULL).

Is there a way to accomplish what I want?

Thanks,

Margaret

2 Likes

Hi Margaret,

the Math Formula node only handles NaNs but not Missing Values. If there is a missing value in the input, there is a missing value in the ouput. Hence you need to resolve the missing values beforehand to be able to handle them.

But for this usecase you can also use the Column Aggregator. It has the option missing for all aggregations. And if you don't select the option missing, it will ignore the missing values.

Best regards, Iris

3 Likes

Hi Iris,

I solved my issue by using an R-snippet node. I was able to assign NaN to a variable in R and then use that to fill in for the missing values. (Matlab or Octave probably would have worked too.)

The Math node works as expected now. 

Thanks for the reply. I'll play with the Column Aggregator node to see how it works.

Best Regards,

Margaret

Would be helpful if possible in future to have a function in the Math Formula node to resolve a missing value, something like ifMissing(x, replaceValue) would be nice, this can then be used to embed different expressions in the same formula if necessary

Indeed the Math node cannot do this, it seems. But most/all functions in the Math node are also available in the Column Expressions node, where a missing value is inserted as a null:

Alternatively, most functions in the Math node are also accessible from the Java Math library in a Java Snippet node (as, for example, Math.sin etc.). So, as an alternative to the Math node you can use a Java Snippet node where you can check “Insert missing as null” and then in the formula you can test for null, just like above:

1 Like

Hi there @David_Ko,

you like reviving old topics, don’t you? :smiley:

In this case (if you don’t wanna use Missing Value node prior to Math Formula node cause you only want replaced values for calculation and leave missing value in output table) I would say you should try Column Expressions node. It also has isMissing() function which return Boolean.

Br,
Ivan

2 Likes