Loop S3 Bucket (AWS / Amazon Web Services)

Hi,

I am new to KNIME and am trying to loop through several thousand CSV files in AWS S3. They all have the same header. The files contain machine sensor data.

Thanks in advance for any help!

-Ken

I am able to do this with no problem:

 

Here is the Iterate meta node - the 'List Files' and 'File Reader' are both happy pointing to my PC:

I want to avoid downloading the files to my PC - i.e. I would like to loop over the files directly in AWS.

So far, it seems that I am only able to pick one file at a time with 'AWS File Picker'.

I've made countless attempts at the workflow below without success:

Hi,

It sounds like you made some progress, thanks for the update. Could you give a little more detail about what you would like to achieve when you say you "would like loop the files directly against AWS"? Am I correct in thinking that you want to extract file metadata without downloading the complete file content?

Thanks,

Jon

Hi Jon,

I'd like to create a consolidated CSV file with the full contents of the 2,500+ files in S3, without first downloading them to my PC.

Does this make sense?

Thanks,

-Ken

Hi Ken,

That certainly makes sense. The nodes that we have don't allow you to do that directly in S3 at this point. Probably the simplest option is to use your current workflow and KNIME Cloud Analytics Platform (or KNIME Server) to avoid moving the data out of the cloud: https://aws.amazon.com/marketplace/pp/B071ZNNLC6

Best,

Jon

Has any progress been made on this? I too would like to be able to loop over remote files and process them directly from AWS S3, without having to download them to the local drive first.
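
For anyone who wants to try this outside of KNIME nodes, here is a minimal Python sketch of the idea discussed above: read many CSV files that share a header and write one consolidated CSV, without saving the individual files to disk. It is not a KNIME feature and not an official solution. The function names (merge_csv_streams, iter_s3_csv_lines) and the bucket/prefix arguments are made up for illustration, and the S3 part assumes the third-party boto3 package and working AWS credentials. Note that the data still streams through whichever machine runs the script; running it on an EC2 instance would keep the traffic inside AWS.

```python
import csv
from typing import Iterable, Iterator, TextIO


def merge_csv_streams(streams: Iterable[Iterable[str]], out: TextIO) -> int:
    """Append rows from many same-header CSV streams to `out`.

    The header is written once, taken from the first non-empty stream.
    Returns the number of data rows written.
    """
    writer = csv.writer(out, lineterminator="\n")
    header = None
    rows_written = 0
    for stream in streams:
        reader = csv.reader(stream)
        file_header = next(reader, None)
        if file_header is None:
            continue  # skip empty files
        if header is None:
            header = file_header
            writer.writerow(header)
        elif file_header != header:
            raise ValueError(f"unexpected header: {file_header!r}")
        for row in reader:
            writer.writerow(row)
            rows_written += 1
    return rows_written


def iter_s3_csv_lines(bucket: str, prefix: str) -> Iterator[Iterator[str]]:
    """Yield one line iterator per .csv object under `prefix` (hypothetical helper).

    Assumes boto3 is installed and the objects are UTF-8 text. Quoted fields
    containing line breaks are not handled, because lines are read one by one.
    """
    import boto3  # third-party; imported lazily so the merge logic has no dependency

    s3 = boto3.client("s3")
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            if obj["Key"].endswith(".csv"):
                body = s3.get_object(Bucket=bucket, Key=obj["Key"])["Body"]
                yield (line.decode("utf-8") for line in body.iter_lines())
```

A possible way to use it, with placeholder bucket and prefix names: open the output file with open("consolidated.csv", "w", newline="") and call merge_csv_streams(iter_s3_csv_lines("my-bucket", "sensors/"), out). Only the consolidated file is written; each source file is read as a stream and discarded.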