AWS S3 Bucket to Bucket File Transfer is Slow

I have a workflow where I am simply transferring files (mostly small PDFs) from one S3 bucket to another within my account. It is very slow. If I transfer the files from my source S3 bucket to a local directory it performs acceptably.

I often use the AWS CLI from a terminal for this kind of copy, and in that case the command is executed within the AWS backbone. It then runs at AWS's internal network speed (100GB/sec or whatever it is) and is extremely fast.

So I guess what I'm asking is: does anyone know how the Transfer Files node works? Is it downloading all the files locally and then uploading them, i.e. source bucket to a local temp directory and then from the local temp directory to the destination bucket? If so, I might just use a Python node and call the AWS CLI instead (maybe I'll just try this out now… ).

Thanks,
Troy

@troy.smith Developer here.

Just for terminology: each file system connector creates a new “connection” when it executes. That connection is passed on via the output port.

The Transfer Files node behaves as follows:

  • when transferring between different connections, it downloads and then uploads each file
  • when transferring within the same connection, it uses S3’s copyObject operation (the data is not downloaded and re-uploaded).

Hence, if you use just one S3 Connector instead of two, the transfer will be faster.
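
To illustrate the difference between the two cases, here is a minimal Python sketch. It is not the node's actual implementation, just the same idea expressed with boto3-style calls. The function names are made up for this example, and the `s3` argument is assumed to be a boto3 S3 client (e.g. from `boto3.client("s3")`) with credentials that can read the source bucket and write the destination bucket.

```python
def copy_via_download(s3, src_bucket, dst_bucket, key):
    """Download the object to this machine, then upload it again.

    Every byte crosses the network twice (S3 -> here -> S3).
    """
    body = s3.get_object(Bucket=src_bucket, Key=key)["Body"].read()
    s3.put_object(Bucket=dst_bucket, Key=key, Body=body)


def copy_server_side(s3, src_bucket, dst_bucket, key):
    """Ask S3 to copy the object itself (CopyObject).

    Only the request and response travel over the network;
    the object data stays inside S3.
    """
    s3.copy_object(
        Bucket=dst_bucket,
        Key=key,
        CopySource={"Bucket": src_bucket, "Key": key},
    )


# Hypothetical usage (bucket names and key are placeholders):
#   import boto3
#   s3 = boto3.client("s3")
#   copy_server_side(s3, "my-source-bucket", "my-destination-bucket", "docs/a.pdf")
```

With many small files, both variants still pay a per-object request cost, so the server-side copy mainly saves the download and upload of the data itself. Note that a single CopyObject call only handles objects up to 5 GB, which should not matter for small PDFs.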

Perfect, thanks for the explanation Bjoern! That cut the time in half, from ~50 seconds to ~25 seconds.

Thanks,
Troy
