Take a large CSV file and break it into 18 smaller files

I have a CSV file with 183,506 rows of data and I would like to break it down into 18 smaller files. If possible, I would like the split to be random so that the different towns are mixed up across the CSV files.

Hi @sgilmour,

you can do this with the following steps:
Use the CSV Reader node to load the whole file (183,000 rows should not be a problem unless you have an absurd number of columns).

Use the Shuffle node to scramble the rows of your file.

Then use the Chunk Loop Start node with 18 chunks on the shuffled data.

Then use a CSV Writer node to write each chunk to a different file.
Set the file path through a flow variable, and use the iteration number of the loop to make each file name unique,
for example with the String Manipulation node:
https://files.knime.com/sites/default/files/inline-images/_manipulation_column_column_transform_String_Manipulation.html
roughly: join("yourOutputFileLocationFile", currentIteration, ".csv")
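For comparison, the same shuffle-then-chunk idea can be sketched outside KNIME in plain Python. This is only a minimal sketch, not what the KNIME nodes do internally: the function name `split_csv_randomly`, the output naming scheme, and the assumptions that the file has a single header row and fits in memory are all illustrative choices, not something taken from the workflow above.

```python
import csv
import random
from pathlib import Path


def split_csv_randomly(input_path, output_dir, n_chunks=18, seed=None):
    """Shuffle the data rows of a CSV file and write them to n_chunks files.

    Hypothetical helper for illustration. Assumes the first line is a header
    row and that the whole file fits in memory.
    """
    input_path = Path(input_path)
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)

    # Load the whole file (the "CSV Reader" step).
    with input_path.open(newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader)
        rows = list(reader)

    # Scramble the row order (the "Shuffle" step).
    random.Random(seed).shuffle(rows)

    # Split into n_chunks nearly equal parts; the first `extra` chunks
    # receive one additional row each (the "chunk loop" step).
    base, extra = divmod(len(rows), n_chunks)
    written = []
    start = 0
    for i in range(n_chunks):
        size = base + (1 if i < extra else 0)
        chunk = rows[start:start + size]
        start += size

        # Use the iteration number to make each file name unique
        # (the "CSV Writer" step).
        out_path = output_dir / f"{input_path.stem}_{i}.csv"
        with out_path.open("w", newline="", encoding="utf-8") as f:
            writer = csv.writer(f)
            writer.writerow(header)
            writer.writerows(chunk)
        written.append(out_path)
    return written
```

A call such as `split_csv_randomly("towns.csv", "chunks", n_chunks=18, seed=42)` (file and folder names are placeholders) would write `towns_0.csv` through `towns_17.csv` into the `chunks` folder, each with the original header. Passing a fixed `seed` makes the shuffle repeatable.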

Hope this helps you with your problem.


Thanks, sounds great.
I only have 11 columns.
Scott

I have this workflow, but it is not creating the 18 files. It is only creating one file. How do I get it to create 18 individual files from my one large CSV file? I have attached my current workflow in case anybody has an idea.


KNIME_project4.knwf (15.0 KB)

Hi @sgilmour

It looks like your flow variables are not connected in the right way, and there may also be something wrong with the configuration. See this example flow: write_18_files_from_1.knwf (34.1 KB).
Screenshot from 2020-06-05 21-27-53
Hope this helps,
gr. Hans


Thank you, I will check it out. Should I replace the Random Number Generator with the File Reader?
The Random Number Generator just gives me a bunch of random numbers in Column 1.
I will play around with this. Thank you, Hans.
