Dear KNIME users,
Yesterday I was looking for a way to filter out rows containing missing values. Most forum posts suggest the Missing Value node, so I tried it. However, with ~1000 columns and 3 rows the node takes about 5 minutes to execute. As far as I can see, the main issue is the PMML generation (which is non-standard if I exclude the row instead of filling in a value): the node hangs at around 43% for most of the 5 minutes.
I benchmarked the timings for different numbers of columns (using 100 iterations of the benchmark nodes):
1 column: ~0.5s
~100 columns: ~40s
~500 columns: ~180s
~1100 columns: ~300s
(Using Win10, 16GB RAM, KNIME version 4.1.2)
Is there any way to skip the PMML generation, or a more efficient way to remove rows with missing values?
The final table will contain ~200,000 rows, so I guess transposing and filtering would not be efficient either.
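To make clear what I am after, here is a minimal plain-Python sketch of the operation: a single pass over the rows that keeps only complete ones, with no PMML and no transposing. The names `is_missing`, `drop_rows_with_missing` and `toy_rows` are made up for illustration, and treating `None` and NaN as "missing" is my assumption. I have not wired this into KNIME or benchmarked it there.

```python
import math


def is_missing(cell):
    """Treat None and float NaN as missing (an assumption for this sketch)."""
    return cell is None or (isinstance(cell, float) and math.isnan(cell))


def drop_rows_with_missing(rows):
    """Keep only the rows in which no cell is missing."""
    return [row for row in rows if not any(is_missing(cell) for cell in row)]


# Hypothetical toy table: 3 rows, 4 columns, invented for illustration.
toy_rows = [
    [1.0, 4.0, "x", 7],           # complete row, should be kept
    [float("nan"), 5.0, "y", 8],  # NaN in the first column
    [3.0, 6.0, None, 9],          # None in the third column
]

filtered = drop_rows_with_missing(toy_rows)
print(filtered)
```

The work here grows with rows times columns and stops checking a row at its first missing cell, which is the behaviour I would hope for from a node as well.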
Thank you for your help!