I have a table (see workflow attached) that contain some missing values. I would like to loop over each column and if the percentage of missing values of the selected column is greater then 70% of total rows of the specific column, then I want to replace all the rows of this specific column with an empty (or missing) value.
If the percentage of missing values of the selected column is less then or equl to 70% of total rows , then I would leave the column as it is.
And then I need to continue the loop to do the same the same for all the other columns.
Please, could you help me?
Thank you in advance
P.S. I know that the node "Missing Value Column Filter" exist but I don't want to remove the column.
Thank you very much! It works. But I wonder if there is a more efficient solution that avoid us creating a lot of unused column that then we need to filter...
I like your approach, never thought of this "trick". However, I have got three comments:
The column list loop approach is more flexible in regards to changing columns or column numbers. That's because of the Enforce exclusion/inclusion option provided by Column List Loop Start node.
This approach also retains the row order.
You can save one more node by not "pivoting" the RowIDs column.
thank you! I didn't think about your good aproach. But when the tabe is big Unpivoting could be costly.
I know you can "column filter" before "Column List Loop Start" and then join at the end of the loop, but is there a way to tell to the node "Column List Loop Start" to not include the excluded columns several times but to consider them only once? If not I think this could be a very good option to include as a feature... Don't you think so?
the goal is tho exclude from repetition the columns that are inserted in the left side (the exclude part) of the "Column List Loop Start". So inserting the function "skip known column" in the loop end or the loop start could be very usefull.