Performance issues while accessing large amounts of data from Redshift

Our KNIME workflows connect to a Redshift cluster to fetch large amounts of data. One of the queries returns about 3 billion rows (about 49 GB) of data.

We are using the JDBC driver for Redshift… Currently, our Database Reader node stops after executing for some time. Is there any best practice for accessing such a huge volume of data from Redshift?

If you use the DB (Labs) nodes, the DB Connector (Labs) node has a Fetch size parameter on its Advanced tab. The default value is 10000; you can increase it for better throughput. Also, if you have filtering or other streamable nodes, you can enable streaming execution so rows are processed as they arrive instead of after the full table has been read. See here:
https://www.knime.com/blog/streaming-data-in-knime
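For context on what the Fetch size setting controls under the hood, here is a minimal JDBC sketch. The cluster URL, credentials, and `big_table` are placeholders, not your actual setup. Setting the fetch size tells the driver to retrieve rows in batches instead of buffering the entire result set in memory:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class RedshiftFetchExample {
    public static void main(String[] args) throws Exception {
        // Hypothetical endpoint and credentials -- replace with your own.
        String url = "jdbc:redshift://example-cluster.abc123.us-east-1.redshift.amazonaws.com:5439/dev";
        try (Connection conn = DriverManager.getConnection(url, "user", "password")) {
            // PostgreSQL-derived drivers (Redshift's included) typically honor
            // the fetch size only when auto-commit is off, so disable it to get
            // cursor-based, batch-wise fetching.
            conn.setAutoCommit(false);
            try (Statement stmt = conn.createStatement()) {
                // Retrieve rows in batches of 10000 -- the same default the
                // DB Connector (Labs) node uses; raise it for faster reads
                // at the cost of more memory per batch.
                stmt.setFetchSize(10000);
                try (ResultSet rs = stmt.executeQuery("SELECT * FROM big_table")) {
                    long count = 0;
                    while (rs.next()) {
                        count++; // process each row without holding all rows in memory
                    }
                    System.out.println("Rows read: " + count);
                }
            }
        }
    }
}
```

With a 3-billion-row result, the fetch size is a trade-off: larger batches mean fewer round trips to the cluster but more heap usage per batch, so increase it gradually while watching memory.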
