This workflow was built to compare performance and scalability measurements across different execution modes. First, a dataset of a defined size is created. This dataset is then passed to multiple workflows. Three modes are compared: native KNIME nodes only, KNIME with Spark nodes, and KNIME with H2O nodes. Finally, the measurements are repeated multiple times to ensure that the results are independent of effects outside the workflow.
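For readers who want to see the measurement pattern outside of KNIME, here is a minimal sketch in plain Python. It is not the workflow itself and uses no KNIME, Spark, or H2O API: `make_dataset`, `benchmark`, and the three placeholder workloads are hypothetical names chosen for illustration. It only shows the idea described above: build one dataset of a fixed size, run it through each mode, and repeat the timing several times so a single outlier run does not dominate.

```python
import random
import statistics
import time


def make_dataset(n_rows, seed=0):
    """Create a reproducible list of n_rows random floats (stand-in for the generated data)."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n_rows)]


def benchmark(modes, n_rows, repeats=5):
    """Time every mode on the same dataset, `repeats` times each.

    Returns {mode_name: (median_seconds, min_seconds, max_seconds)}.
    """
    data = make_dataset(n_rows)
    timings = {name: [] for name in modes}
    for _ in range(repeats):
        # Interleave the modes so slow drift on the machine affects all of them similarly.
        for name, run in modes.items():
            start = time.perf_counter()
            run(data)
            timings[name].append(time.perf_counter() - start)
    return {
        name: (statistics.median(t), min(t), max(t))
        for name, t in timings.items()
    }


# Placeholder workloads (assumptions): these are NOT the real native, Spark, or H2O branches.
modes = {
    "native": lambda d: sorted(d),
    "spark_placeholder": lambda d: sum(x * x for x in d),
    "h2o_placeholder": lambda d: max(d),
}

summary = benchmark(modes, n_rows=10_000, repeats=3)
for name, (median_s, min_s, max_s) in summary.items():
    print(f"{name}: median={median_s:.6f}s min={min_s:.6f}s max={max_s:.6f}s")
```

Reporting the median together with the minimum and maximum makes it easier to spot runs that were disturbed by something outside the measured code.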
This is a companion discussion topic for the original entry at https://kni.me/w/_yfTsyB-ZNNkW61-