Missing Value Column Replacer

Hi,

I have a table (see workflow attached) that contain some missing values. I would like to loop over each column and if the percentage of missing values of the selected column is greater then 70% of total rows of the specific column, then I want to replace all the rows of this specific column with an empty (or missing) value.

If the percentage of missing values of the selected column is less then or equl to 70% of total rows , then I would leave the column as it is. 

 And then I need to continue the loop to do the same the same for all the other columns.

Please, could you help me?

Thank you in advance

 

P.S. I know that the node "Missing Value Column Filter" exist but I don't want to remove the column.

Maybe there's a more simple solution, but at least this works.

Best,
Marc

Thank you very much! It works. But I wonder if there is a more efficient solution that avoid us creating a lot of unused column that then we need to filter...

If you want to do something for a lot of columns you can always use the Unpivoting - Pivoting Trick..  As in the attached WF.

Best, Iris

Hey Iris,

I like your approach, never thought of this "trick". However, I have got three comments:

The column list loop approach is more flexible in regards to changing columns or column numbers. That's because of the Enforce exclusion/inclusion option provided by Column List Loop Start node.

This approach also retains the row order.

You can save one more node by not "pivoting" the RowIDs column.

Best,
Marc

Hi Iris,

thank you! I didn't think about your good aproach. But when the tabe is big Unpivoting could be costly.

I know you can "column filter" before  "Column List Loop Start"  and then join at the end of the loop, but is there a way to tell to the node "Column List Loop Start" to not include the excluded columns several times but to consider them only once? If not I think this could be a very good option to include as a feature... Don't you think so?

 

Hi iiiaaa,

Yes, I also thought about this feature. It would be something like "skip known column" as option in the loop end node?

Iris

Hi Iris,

the goal is tho exclude from repetition the columns that are inserted in the left side (the exclude part) of the "Column List Loop Start". So inserting the function "skip known column" in the loop end or the loop start could be very usefull. 

Many thanks!

I think this would be a really useful option and it should be activated per default in the node's configuration dialog.

Cheers,
Marc