Replace missing values based on another column

Is there a way to use the Missing Value node to replace a missing value with an entry imputed from another column?

I have a data set with vendors, categories, and products. Some rows are missing the category, and I’d like to fill those in with the most common (typically only) category for that vendor. The only way I can think to do it is to group by vendor and take the mode of category, join that back to my table, and then use a Rule Engine to replace the missing values with the new imputed value.
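For readers who think in code, here is a rough sketch of that group, join, and rule workaround in plain Python. It is not KNIME code; the sample rows and the names `rows`, `mode_by_vendor`, and `filled` are made up for illustration, and ties for the most common value simply go to whichever value was seen first.

```python
from collections import Counter, defaultdict

# Hypothetical sample data: None marks a missing category.
rows = [
    {"vendor": "A", "category": "tools", "product": "hammer"},
    {"vendor": "A", "category": None, "product": "wrench"},
    {"vendor": "A", "category": "tools", "product": "saw"},
    {"vendor": "B", "category": "paint", "product": "primer"},
    {"vendor": "B", "category": None, "product": "roller"},
]

# Step 1 (group by): count the non-missing categories per vendor
# and keep the most frequent one.
counts = defaultdict(Counter)
for row in rows:
    if row["category"] is not None:
        counts[row["vendor"]][row["category"]] += 1
mode_by_vendor = {v: c.most_common(1)[0][0] for v, c in counts.items()}

# Steps 2 and 3 (join + rule): look up the vendor's mode and use it
# only where the category is missing. The input rows are not modified.
filled = [
    {
        **row,
        "category": row["category"]
        if row["category"] is not None
        else mode_by_vendor.get(row["vendor"]),
    }
    for row in rows
]
```

Every extra column to impute would need its own lookup and rule, which is the part that feels clunky.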

That will work but it seems clunky and not super scalable. The data set has a lot more than the three columns of data and I might need to do this for other columns as well.

It seems like there should be a way to create a loop for this, but I do not know how in KNIME.

Thanks in advance,

Hi @ewhulbert,

Use a Group Loop Start node with Vendor as the grouping column, then, inside the loop, handle the missing values in the category column by replacing them with the most frequent value.
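The same idea can be sketched outside KNIME in plain Python: process one vendor's rows at a time and fill each gap with the most frequent value in that group. This is only an illustration of the logic, not what the nodes do internally; the function `fill_with_group_mode`, the `region` column, and the sample rows are all hypothetical. Ties go to the value seen first, and a group with no non-missing values is left as it is.

```python
from collections import Counter
from itertools import groupby
from operator import itemgetter


def fill_with_group_mode(rows, group_key, columns):
    """Return copies of rows, grouped by group_key, with missing (None)
    values in each of `columns` replaced by that group's most frequent value."""
    out = []
    # groupby needs its input sorted by the grouping key.
    ordered = sorted(rows, key=itemgetter(group_key))
    for _, group in groupby(ordered, key=itemgetter(group_key)):
        group = [dict(r) for r in group]  # copy so the input stays untouched
        for col in columns:
            seen = Counter(r[col] for r in group if r[col] is not None)
            if not seen:
                continue  # nothing in this group to impute from
            mode = seen.most_common(1)[0][0]
            for r in group:
                if r[col] is None:
                    r[col] = mode
        out.extend(group)
    return out


# Hypothetical sample data with two columns to impute.
sample = [
    {"vendor": "B", "category": None, "region": "west"},
    {"vendor": "A", "category": "tools", "region": None},
    {"vendor": "A", "category": None, "region": "east"},
    {"vendor": "B", "category": "paint", "region": None},
    {"vendor": "C", "category": None, "region": None},
]

result = fill_with_group_mode(sample, "vendor", ["category", "region"])
```

Because the columns are passed in as a list, covering another column means adding its name rather than building another group, join, and rule chain. Note that the result comes back ordered by vendor, not in the original row order.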


Woohoo, thanks Armin! I think this is the second or third time you’ve helped me, much appreciated.


