Performance question

niederle · November 8, 2018, 1:00pm

Hi there,
I was using the ‘Test Data Generator’ node and noticed something I do not understand. I have two similarly configured nodes (‘Number to String’ in my example, but the effect does not seem to be specific to this node). One gets the input table (just reduced by columns) and executes very slowly. The other one gets a row-filtered table (with the criteria set so that nothing actually gets filtered) and executes much faster. The two input tables are compared by a ‘Table Difference Checker’ node and appear to be identical.
I have attached an example workflow that illustrates the performance difference between the two nodes.
From a developer's point of view, I would just like to understand why there is a difference.
test extreme.knwf (24.1 KB)

niederle · November 8, 2018, 1:09pm

(should go to the ‘Developers’ section - sorry…)

niederle · November 8, 2018, 1:44pm

I think I know the answer. The column filter apparently does not physically remove the columns, so the first ‘Number to String’ node still reads its data from the big table generated by the ‘Test Data Generator’ node. The ‘Row Filter’ rewrites the table, so the second ‘Number to String’ node is fed the reduced dataset.
Correct me if I’m wrong.
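
To make the guess concrete, here is a toy Python sketch. It is not KNIME code, and every class and function name in it is made up for illustration. It only assumes that a column-filtered table is a view onto the original wide rows, while a rewritten table stores just the remaining cells.

```python
# Toy model of the guess above. NOT KNIME code; all names are hypothetical.

class Table:
    """Row-oriented table that counts how many cells are read from it."""

    def __init__(self, rows):
        self.rows = rows
        self.cells_read = 0

    def __iter__(self):
        for row in self.rows:
            # Reading a row touches every cell stored in it.
            self.cells_read += len(row)
            yield row


class ColumnView:
    """Hides columns without copying; each pass still reads the wide rows."""

    def __init__(self, source, keep):
        self.source = source
        self.keep = keep

    def __iter__(self):
        for row in self.source:
            yield [row[i] for i in self.keep]


def rewrite(table):
    """Copy only the visible cells into a new, physically narrow table."""
    return Table([list(row) for row in table])


big = Table([[0] * 200 for _ in range(1000)])  # 1000 rows x 200 columns
view = ColumnView(big, keep=[0, 1])            # "column filter": 2 columns visible

for _ in view:                                 # consumer reading the view
    pass
cells_via_view = big.cells_read                # 200000: the wide table was read

narrow = rewrite(view)                         # "row filter": writes a new table
big.cells_read = 0
for _ in narrow:                               # consumer reading the rewritten table
    pass
cells_via_copy = narrow.cells_read             # 2000: only the narrow rows are read
```

In this model both consumers see the same two columns, which would match what the ‘Table Difference Checker’ reports, but the one behind the view pays for the full width of the original table on every pass.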

izaychik63 · November 8, 2018, 2:21pm

Look at this node: https://nodepit.com/node/org.knime.base.node.util.cache.CacheNodeFactory
It may help to improve performance in your case.

niederle · November 9, 2018, 7:54am

Thank you! This node might indeed be very helpful (especially before looping over such an (un)filtered table).