Issue SQL COPY statement to AWS Redshift

Hi All,

I'm looking to copy data from a KNIME table to a redshift table. I should be able to create a blank table using the 'Database Table Creator' node, but I'd then like to write data to the table using a COPY statement, rather than using the 'Database Writer' node. The reason for this is that I want to trigger the automatic column compression encodings that Redshift creates when data is copied to a blank table.

As far as I can see, my only option is to generate an appropriate SQL COPY statement, and run this using 'Database SQL Executor'. However, it looks like in order to get that to work I'll need to firstly write the data table to a local file, which feels cumbersome and would no doubt be slow.

Any suggested workarounds? Ideally I'd really like to have the same kind of functionality as 'Database Writer', but with 'COPY' style functionality.



Hi Andrew,

As you describe, using the COPY command to import data into Redshift is exactly the right way to be doing things. You're also right that using the COPY command in the 'Database SQL Executor' is a way to execute that command.

I used the CSV Reader and the 'Limit Rows' option to read the first 50 lines to specify the column types (you can read from a remote URL (only the top 50 lines are fetched).

Then I use the 'Database Table Creator' to create the required table in Redshift.



Thanks Jon.

That sounds like a good approach. I've since got this working reasonably well using a direct copy from S3.

That said, there is still some odd behaviour around column compression with COPY, regardless of the tool. In a nutshell, the compression types that Redshift assigns on initial copy (with compupdate) differ from those that it returns when running 'analyze compression', even when using exactly the same data. This is not KNIME specific, and is noted in other forums on the web. What I have noticed, is that using the Database Writer node will produce the same compression types as using COPY from SQL Executor - meaning that (and contrary to my initial impressions), with respect to Redshift, Database Writer seems to be a reasonable substitute for SQL Executor, except of course that it doesn't offer the fine-grained control of a COPY command. 

Thanks again,