Converting NaNs to missing values

anshuls235 · January 31, 2021, 10:32am

Hi,

I’m using Python’s pandas library for data manipulation in one of my workflows. Since the data is messy, there are a lot of manipulations involved, and there are a lot of missing values as well. Pandas treats NaN as a missing value, while KNIME doesn’t. As a result, my aggregations are not working correctly in KNIME, and many of the manipulations can’t be achieved using native KNIME nodes alone.
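
For illustration, here is a minimal sketch of the pandas behaviour described above. The column name `value` and the sample numbers are made up, and the snippet only shows the pandas side, not how KNIME receives the data:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: one NaN between two real numbers.
df = pd.DataFrame({"value": [1.0, np.nan, 3.0]})

# pandas flags NaN as a missing entry ...
missing_mask = df["value"].isna()

# ... and skips it in aggregations by default (skipna=True).
mean_value = df["value"].mean()
count_value = df["value"].count()  # counts non-missing entries only
```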

Is there any way to replace the NaNs with missing values? I searched through the topics on the forum, but they didn’t really help. I’m using version 4.3.

qqilihq · January 31, 2021, 10:46am

Hi,

A very simple solution would be to use a Math Formula node and simply return the column value unchanged.

As the node documentation states:

When any of the used columns contains a missing value, the result is missing, just like when the result would be NaN, infinite value, or outside of the 32 bit signed integer range when that is requested.

This means a NaN is converted to a ? (missing value). See here on my NodePit Space for an example workflow.
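
The rule quoted from the node documentation can be sketched in plain Python. This is only an illustration of the logic, not KNIME's implementation: the helper `to_missing` is hypothetical, `None` stands in for a KNIME missing value, and the 32-bit integer range check is left out.

```python
import math
from typing import Optional


def to_missing(value: Optional[float]) -> Optional[float]:
    """Return None (standing in for a missing value) for missing, NaN or infinite input."""
    if value is None or math.isnan(value) or math.isinf(value):
        return None
    return value


# Hypothetical column containing a NaN, an infinity and an already-missing cell.
column = [1.5, float("nan"), float("inf"), None, 4.0]
cleaned = [to_missing(v) for v in column]
```

Regular numbers pass through unchanged, so applying the function to every cell of a column only affects the NaN and infinite entries.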

Hope this helps!

–Philipp

anshuls235 · January 31, 2021, 1:02pm

Thanks @qqilihq, it worked perfectly for me!

mlauber71 · January 31, 2021, 1:44pm

Another way could be to use a Java Snippet node.
