Dear Knimers,
Consider the following table:
Suppose we want to use the “Missing Value Column Filter” node to remove some of the columns with missing values. The somewhat confusingly named “Missing value threshold (in %)” setting is the minimum percentage of missing values among all cells in a column; a column with a higher percentage of missing values is removed. Say we enter 20%.
This results in:
So far so good. What if we want to remove all columns that have any non-zero number of missing values? Naively I would enter a threshold of 0%… Uh oh:
A threshold of 0% removes all columns, even those that do not have any missing values. If I want to remove all columns that have at least 1 missing value but keep the columns that have no missing values, I usually end up using an absurdly low but non-zero threshold to make sure that it works for all table sizes that I am likely to encounter:
But I think it would be more elegant and intuitive if a threshold of 0% simply kept the columns with no missing values.
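To make the difference concrete, here is a minimal Python sketch of the two rules. It is not the node's actual implementation: it only assumes, from the behaviour described above, that a column is removed when its missing percentage is greater than or equal to the threshold. The function names, the four-row table and the 0.0001 threshold are all made up for illustration.

```python
from typing import Dict, List, Optional

Column = List[Optional[float]]
Table = Dict[str, Column]


def missing_percentage(column: Column) -> float:
    """Percentage of missing (None) cells in a column."""
    if not column:
        return 0.0
    return 100.0 * sum(cell is None for cell in column) / len(column)


def filter_observed(table: Table, threshold: float) -> Table:
    """ASSUMED current rule: remove a column when missing % >= threshold."""
    return {
        name: col
        for name, col in table.items()
        if missing_percentage(col) < threshold
    }


def filter_proposed(table: Table, threshold: float) -> Table:
    """Suggested rule: same as above, but a column without missing values
    is never removed, so a threshold of 0% keeps exactly those columns."""
    return {
        name: col
        for name, col in table.items()
        if missing_percentage(col) == 0.0 or missing_percentage(col) < threshold
    }


# Hypothetical four-row table: 0%, 25% and 75% missing values.
table: Table = {
    "full": [1.0, 2.0, 3.0, 4.0],
    "one_missing": [1.0, None, 3.0, 4.0],
    "three_missing": [None, None, None, 4.0],
}

print(list(filter_observed(table, 20)))      # only "full" survives
print(list(filter_observed(table, 0)))       # nothing survives
print(list(filter_observed(table, 0.0001)))  # the tiny-threshold workaround
print(list(filter_proposed(table, 0)))       # what I would expect from 0%
```

Under this assumed rule, a single missing value in a table with n rows gives a missing percentage of 100/n, so a non-zero threshold t only catches it while t is at most 100/n. An example threshold of 0.0001% would therefore stop working for tables with more than a million rows, which is why the workaround depends on table size.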
Best
Aswin