Merging several *.csv files

I am new to KNIME and am looking for a workflow that lets me import multiple *.csv files (six to start with) and combine them into a single *.csv file to work with. All six are structured the same, with the same header variables.

For starters, you can use six File Reader nodes, one set up for each of the CSV files.

Then, use a “Concatenate” node to combine the six tables. There’s a variant with four input ports.

Since 4 < 6, you’ll need to chain two of them together: connect the first four File Reader nodes to one Concatenate node, then connect its output and the two remaining File Readers to a second Concatenate node.
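For readers who think in code, the chaining can be sketched in Python. This is only an illustration of row-wise concatenation with a header check, not how KNIME implements the node; the `concatenate` helper and the sample tables are hypothetical.

```python
def concatenate(*tables):
    """Stack tables row-wise. Each table is a (header, rows) pair (hypothetical format)."""
    if not tables:
        raise ValueError("need at least one table")
    header = tables[0][0]
    rows = []
    for tbl_header, tbl_rows in tables:
        # All inputs must share the same column names, in the same order
        if tbl_header != header:
            raise ValueError(f"header mismatch: {tbl_header} != {header}")
        rows.extend(tbl_rows)
    return header, rows


# Six small made-up tables with identical headers
tables = [(["id", "value"], [[i, i * 10]]) for i in range(6)]

# Mimic two four-port nodes: the first four inputs, then that result plus the last two
first = concatenate(*tables[:4])
combined = concatenate(first, *tables[4:])
print(combined)
```

The second call takes the intermediate result as just another input, which is the same idea as wiring one node's output into the next.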

Later, when you want to make your workflow more generic and want to be able to handle 7, 8, 11, or 235 CSV files (and just one File Reader, instead of 235), you could do this with the looping nodes.
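As a rough picture of what such a loop does, here is a minimal Python sketch that reads every CSV file in a folder with one reader and appends the rows to a single output file. It assumes all files share the same header; the function name `merge_csv_files` and its parameters are made up for illustration and are not part of KNIME.

```python
import csv
from pathlib import Path


def merge_csv_files(folder, output_path):
    """Append every *.csv in `folder` to one file, writing the header only once."""
    output_path = Path(output_path)
    # Collect inputs up front and skip the output file if it lives in the same folder
    paths = sorted(
        p for p in Path(folder).glob("*.csv")
        if p.resolve() != output_path.resolve()
    )
    header = None
    with open(output_path, "w", newline="") as out:
        writer = csv.writer(out)
        for path in paths:
            with open(path, newline="") as f:
                reader = csv.reader(f)
                file_header = next(reader, None)
                if file_header is None:
                    continue  # skip empty files
                if header is None:
                    header = file_header
                    writer.writerow(header)
                elif file_header != header:
                    raise ValueError(f"{path.name} has a different header")
                writer.writerows(reader)
    return len(paths)
```

The same function handles 6 or 235 files, because the number of inputs comes from the folder listing rather than from the code.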

– Philipp

Well, that’s an easy start. What I really need is (I suspect) some sort of loop. That is, I have a directory where I will be adding CSV files 1 to n, and at some point I will want to combine them into one file.

I’ve started toying with this workflow, but it’s not working yet.

What’s the message from Table Reader? You see it when you hover your mouse over the warning sign.

WARN Table Reader 2:4 Input file ‘C:\Program Files\KNIME\0’ does not exist

Node 2 (List Files) sees all six files and lists them.
Node 3, I’m assuming, starts the loop.
Node 4 should read the tables. I set Use Variable to currentIteration under Variable Settings, but I’m not sure that is right.
Node 5, again I’m assuming, loops until there are no more files.

That’s just an index number, hence the …\0. You’ll need to use the file path instead.
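The difference between the two values can be illustrated in Python. In this sketch the file names are made up; the loop counter plays the role of currentIteration, while the list entry is the path a reader actually needs.

```python
# Hypothetical file list, standing in for the output of a file-listing step
file_paths = ["C:/data/a.csv", "C:/data/b.csv", "C:/data/c.csv"]

for current_iteration, path in enumerate(file_paths):
    # current_iteration is only a counter (0, 1, 2, ...).
    # Handing it to a reader would make it look for a file literally named "0".
    # path is the value the reader should receive instead.
    print(current_iteration, path)
```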

Where do I set that option?

I got it: I had to set Node 4 to the first CSV file. So now, do I continue my workflow from Node 4 or Node 5?

Hi there @reichmaj,

welcome to KNIME Community Forum!

There is the KNIME Hub, where you can search for example workflows and use them as templates. Here is one that reads multiple files and merges them.

https://kni.me/w/UPm1nNWmew1uu2L0

Also, searching the forum for “reading multiple csv files” will give you plenty of results :wink:

Br,
Ivan

@reichmaj

Attached is a workflow I built as a training tool. It is similar to the link that @ipazin shared; I just broke it down even more to help beginners.

It’s a great collection of examples. I would recommend naming it “Importing Multiple Files (ETL)” so that it is easier to find in search.

@izaychik63

Thank you! Done!

Hi there,

nice one @TardisPilot. Will bookmark this topic for future reference :wink:

Also you can share it on KNIME Hub if you want.

Edit: I have checked it briefly and think adding example files to the workflow directory and using workflow-relative paths would be very nice. That way the examples could be executed immediately.

Br,
Ivan

Just seconding what Ivan already stated - nice work @TardisPilot! You should definitely upload your workflow to the Hub if you have time, for the widest possible availability. :slight_smile:

@ipazin & @ScottF,

Thanks! I will work on that, update the workflow, and then upload it to KNIME Hub. I have a few other examples (and some other great components built to auto-load data from a user’s Downloads folder, etc.) that I want to include as well.

@ScottF

I have the workflow updated; I’m just not certain how to upload it to the Hub. Are there instructions somewhere?

@TardisPilot -

Sure - check out the 1st animated gif on this page: https://hub.knime.com/site/about

You will probably also want to scroll down a bit to the Workflow Metadata Editor section too. :slight_smile:

Ok! Here it is! Sorry for the delay (my Workspace was a bit goofed up)

Thank you for the help and suggesting this!

https://hub.knime.com/tardispilot/space/Importing%20Multiple%20Files%20(ETL)

Hi - I can get the files into the file list and see that they are listed there. But when I want to add the variable, url is not an option. The only choice offered is not where the files are. How do I get it to show url as a choice so I can add it? If it isn’t done in the List Files node, please specify which node you put this variable in. Right now, I cannot point any of the nodes to anything except knime.workspace. Thanks.

And here is the file list -