Replacing Missing Values in Boolean Column - feature request

Hi all,

The addition to the Missing Value node to export settings to PMML seems really useful, thanks!

However, it seems to me that something has been overlooked that I personally would prioritize over this cool new feature: per-column replacement options that actually make sense for the given column type.

To illustrate my point: when trying to replace missing values in a Boolean column, I can choose from many options such as mean, moving average and linear interpolation, which of course are irrelevant for Booleans. The fixed value that I would like to choose (true or false) is not an option.

It would simplify my work if more sensible replace options were made available.

For those looking for a workaround, there is of course always the Java snippet.
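A minimal sketch of that workaround in plain Java, assuming a missing cell reaches the snippet as a null Boolean. The class and method names here are made up for illustration and are not part of KNIME:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class BooleanMissingValue {

    // Returns the fallback when the cell is missing (null),
    // otherwise the cell's own value. Hypothetical helper, not a KNIME API.
    static Boolean replaceMissing(Boolean value, boolean fallback) {
        return value == null ? Boolean.valueOf(fallback) : value;
    }

    public static void main(String[] args) {
        // A Boolean column with two missing cells, modelled as nulls.
        List<Boolean> column = Arrays.asList(Boolean.TRUE, null, Boolean.FALSE, null);

        List<Boolean> filled = column.stream()
                .map(v -> replaceMissing(v, false))
                .collect(Collectors.toList());

        System.out.println(filled); // prints [true, false, false, false]
    }
}
```

Inside the Java Snippet node itself this would probably shrink to a single null check on the input field that is assigned to the output field; the exact field names depend on how the columns are mapped in the node dialog.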

P.S.: I recently upgraded to KNIME 2.12.0 and only noticed this issue afterwards. I do not know whether it was present in earlier versions.

 

Kind regards,

Elissa

Now the Missing Value node can't even be executed anymore when placed after a join node (no matter which nodes are in between). I'm getting:

ERROR Missing Value        0:148      Configure failed (NullPointerException): null

 

The node was working fine in exactly the same place in the workflow before...

Hi,

I am currently working on a similar problem with this node and would like to know if your problem is related to it. Could you post your workflow here so I can have a look at it? The default value for Boolean fields is a good suggestion. When we created the node we focused on the things the old node could do, but it is easy to add other missing value replacement strategies. The Boolean data type in KNIME can be treated as a double, which is why all the other missing value handlers appear in the dialog.
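A rough sketch of why that happens, using made-up stand-in types rather than the real KNIME classes. It assumes each strategy declares the most general value type it accepts and the dialog offers every strategy whose accepted type the column is compatible with. The "Fixed value (true/false)" entry is the suggested addition, not something the node offers today:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class StrategyFilter {

    // Hypothetical stand-ins for value interfaces; a Boolean is also usable as a double.
    interface DoubleValue { double getDoubleValue(); }
    interface BooleanValue extends DoubleValue {
        boolean getBooleanValue();
        default double getDoubleValue() { return getBooleanValue() ? 1.0 : 0.0; }
    }

    // A strategy names the most general value type it can handle.
    static final class Strategy {
        final String name;
        final Class<?> accepts;
        Strategy(String name, Class<?> accepts) { this.name = name; this.accepts = accepts; }
    }

    // Offer every strategy whose accepted type is a supertype of the column type.
    static List<String> offeredFor(Class<?> columnType, List<Strategy> all) {
        List<String> names = new ArrayList<>();
        for (Strategy s : all) {
            if (s.accepts.isAssignableFrom(columnType)) {
                names.add(s.name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        List<Strategy> all = Arrays.asList(
                new Strategy("Mean", DoubleValue.class),
                new Strategy("Linear interpolation", DoubleValue.class),
                new Strategy("Fixed value (true/false)", BooleanValue.class));

        // Boolean columns inherit the double strategies as well.
        System.out.println(offeredFor(BooleanValue.class, all));
        // Plain double columns do not get the Boolean-only strategy.
        System.out.println(offeredFor(DoubleValue.class, all));
    }
}
```

Under these assumptions a Boolean column is offered all three strategies, while a plain double column is offered only the first two.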

Regards,

Alexander

Hi Alexander,

Thank you for replying! I've created the workflow in a commercial setting so I'm not allowed to post it, but I can describe it. Basically I just did a row split and calculated some new fields on the split part of the data (nodes: groupby, java snippet, time difference, join, constant value column). Then I concatenated this with the other split part, which creates missing values for the newly calculated fields that I want to replace with fixed values.

The missing value replacement that's producing the error above is for a double type field generated using a time difference node (days between 2 given date fields after the splitting). It is set to replace with fixed value 0,0 (all types are set to "do nothing"). Now for the weird part. The error only occurs when I execute a metanode that calculates 2 geocoordinate columns and 2 string columns and join these back to the original data, before all the splitting etc. If I sidestep the addition of those 4 columns (which are not used in the missing value replacement), then the missing value replacement works.

Is there any other information I can share to help you along?

Kind regards,

Elissa

 

 

Hi Elissa,

Thank you for providing that information! There is indeed one more thing you could tell me: when you view the spec of the incoming data (Right click on the producing node -> Click on the output port -> Open Tab "Spec - Column: ..."), do you see a value that starts with "Non-Native" in the "Column Type" column? If yes, this is a bug I have just fixed for KNIME 2.12.1 :-)

Regards,

Alexander

Hi Alexander,

Sorry for the delay! Indeed, there's a non-native data type in my incoming data, namely geocoordinates (created with the LatitudeLongitudeToCoordinate node). I've updated KNIME, deleted and reinserted the Missing Value node, but it's still giving me the error:

ERROR Missing Value        0:102:151  Configure failed (NullPointerException): null

And I get this when opening the node interface:

ERROR Missing Value        0:102:151  Error loading model settings

Did I do something wrong?

Kind regards, 

Elissa