Hello to all,
I am new to this community and to KNIME. Thank you in advance for your help and patience.
I want to randomly delete cells from a table across columns and rows.
In other words, I want to copy random cells from a table into a new table while keeping the same table structure.
Does anyone have an idea how to do that simply and in a resource-efficient way?
My table has over 30 million rows and 10 columns.
Thanks!
That is a one node solution to my problem. Very impressive!
Thanks @ScottF
I can get a table with 10% missing values, but chaining another Disturber node onto the 3rd output port seems to have no effect.
Do you know how I could obtain 30% missing values, for example?
How about using a Partitioning node to remove 30% of the data randomly and then joining its output back to the main table? The join will take a while in your case, but this way you will get exactly what you want.
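For anyone who wants to see the logic outside of KNIME, here is a rough sketch of the partition-and-join idea in plain Python. It is only one reading of the suggestion: it assumes the partition/join is done once per column so that missing values are spread across rows and columns, and the function name `blank_exact_fraction` is made up for illustration. It is not KNIME code and is not tuned for 30 million rows.

```python
import random

def blank_exact_fraction(table, fraction, seed=0):
    """Return a copy of `table` (a list of rows) in which exactly
    round(fraction * n_rows) cells of every column are None."""
    rng = random.Random(seed)
    n_rows = len(table)
    n_cols = len(table[0]) if table else 0
    n_blank = round(fraction * n_rows)
    result = [list(row) for row in table]  # leave the input untouched
    for col in range(n_cols):
        # "Partition": choose which rows lose their value in this column.
        for r in rng.sample(range(n_rows), n_blank):
            # "Join back": a row without a match ends up with a missing value.
            result[r][col] = None
    return result

table = [[r * 3 + c for c in range(3)] for r in range(100)]
blanked = blank_exact_fraction(table, 0.30)
```

Because the rows are sampled without replacement, each column ends up with exactly the requested share of missing values, which is the "exactly what you want" part of the suggestion.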
You are right, @HansS. The Disturber node always assigns missing values to the same cells if the table is the same. As a workaround, one can put a Shuffle node in between, or use @armingrudd's solution.
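To illustrate why repeating the same step has no effect, here is a small plain-Python sketch. It is an analogy, not the Disturber node's actual implementation: it assumes the behavior described above comes from a fixed random seed, and `blank_cells` and `missing_share` are hypothetical helper names.

```python
import random

def blank_cells(table, fraction, seed):
    """Copy `table`, setting each cell to None with probability `fraction`."""
    rng = random.Random(seed)
    return [[None if rng.random() < fraction else cell for cell in row]
            for row in table]

def missing_share(table):
    """Fraction of cells that are None."""
    cells = [cell for row in table for cell in row]
    return sum(cell is None for cell in cells) / len(cells)

table = [[r * 10 + c for c in range(10)] for r in range(10_000)]

# Same seed every pass: the same cells are hit each time, so nothing changes.
same_seed = table
for _ in range(3):
    same_seed = blank_cells(same_seed, 0.10, seed=42)

# Different seed per pass (comparable to shuffling in between):
# the passes hit different cells and the missing share grows.
varied = table
for seed in (1, 2, 3):
    varied = blank_cells(varied, 0.10, seed=seed)

print(missing_share(same_seed), missing_share(varied))
```

With the same seed the share stays at roughly 10%. With three independent passes it approaches 1 - 0.9^3 = 27.1% rather than 30%, because some cells are hit more than once, so for a precise 30% it is simpler to ask for 30% in a single pass.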