Performance issues while accessing large amounts of data from Redshift

Our KNIME workflows connect to a Redshift cluster to fetch large amounts of data. One of the queries returns about 3 billion rows (about 49 GB) of data.

We are using the JDBC driver for Redshift… Currently, our Database Reader node stops after executing for some time. Is there any best practice for accessing such a huge volume of data from Redshift?

If you use the DB (Labs) nodes, the DB Connector (Labs) node has a Fetch size parameter on its Advanced tab. The default value is 10000; you can increase it for better throughput. Also, if you have filtering or other streamable nodes, you can enable streaming execution so rows are processed as they arrive instead of after the full table has been read. See here:
https://www.knime.com/blog/streaming-data-in-knime
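For context on what the Fetch size setting controls under the hood, here is a minimal JDBC sketch. The cluster URL, credentials, and `big_table` are placeholders, not your actual setup. Setting the fetch size tells the driver to retrieve rows in batches instead of buffering the entire result set in memory:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class RedshiftFetchExample {
    public static void main(String[] args) throws Exception {
        // Hypothetical endpoint and credentials -- replace with your own.
        String url = "jdbc:redshift://example-cluster.abc123.us-east-1.redshift.amazonaws.com:5439/dev";
        try (Connection conn = DriverManager.getConnection(url, "user", "password")) {
            // PostgreSQL-derived drivers (Redshift's included) typically honor
            // the fetch size only when auto-commit is off, so disable it to get
            // cursor-based, batch-wise fetching.
            conn.setAutoCommit(false);
            try (Statement stmt = conn.createStatement()) {
                // Retrieve rows in batches of 10000 -- the same default the
                // DB Connector (Labs) node uses; raise it for faster reads
                // at the cost of more memory per batch.
                stmt.setFetchSize(10000);
                try (ResultSet rs = stmt.executeQuery("SELECT * FROM big_table")) {
                    long count = 0;
                    while (rs.next()) {
                        count++; // process each row without holding all rows in memory
                    }
                    System.out.println("Rows read: " + count);
                }
            }
        }
    }
}
```

With a 3-billion-row result, the fetch size is a trade-off: larger batches mean fewer round trips to the cluster but more heap usage per batch, so increase it gradually while watching memory.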
