Hello everyone! I have a huge file containing around 8000 rows and I’d like to split this data into multiple CSV files of 100 rows each.
The data structure is pretty simple: I have an ID column (unique values), a comment column, and a Counter column that numbers the lines.
I created an additional column with a Math Formula node (floor($Counter$ / 100) + 1) that gives the same value every 100 rows. So the first 100 rows will have 1, the next 100 (from 101 to 200) will have 2, and so on. This way I know that I can cut the table every time the value in this column changes.
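A quick way to sanity-check the grouping formula is to replay it outside KNIME. The Python sketch below is only an illustration (the function names are made up, and it is not KNIME code). It shows that floor(Counter / 100) + 1 gives clean groups of 100 when the Counter starts at 0. If the Counter starts at 1, row 100 already falls into group 2, and subtracting 1 before dividing is one way to compensate.

```python
import math


def chunk_id_floor(counter: int, size: int = 100) -> int:
    """Mirror of the Math Formula expression floor($Counter$ / 100) + 1."""
    return math.floor(counter / size) + 1


def chunk_id_one_based(counter: int, size: int = 100) -> int:
    """Hypothetical variant for a Counter that starts at 1 instead of 0."""
    return (counter - 1) // size + 1


if __name__ == "__main__":
    # Counter starting at 0: rows 0..99 share group 1, row 100 opens group 2.
    print(chunk_id_floor(0), chunk_id_floor(99), chunk_id_floor(100))
    # Counter starting at 1: the original formula moves row 100 into group 2.
    print(chunk_id_floor(1), chunk_id_floor(100))
    # The shifted variant keeps rows 1..100 together.
    print(chunk_id_one_based(1), chunk_id_one_based(100), chunk_id_one_based(101))
```

Whether the shift is needed depends on how the Counter column was generated, which the thread does not say.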
Next I need to loop through the data and filter on this column, but I can’t get the CSV Writer to create multiple files. I’m trying to create a variable, but it’s not working, and the node’s variable settings have no field for “file path” or “file name”. As a result, all the data is always written to the same file.
Does anyone know how I can solve this?
Thanks a lot!
Could you share your current workflow? It seems like you could use a Chunk Loop node and write the separate files inside the loop.
TEST.knwf (17.2 KB)
Here is an example. I reduced the data set to 20 rows with a Chunk Loop of 5. I think the problem is with the CSV Writer.
You need to create the full file path for the CSV Writer. I’ll try to tackle it this evening if you’re still stuck.
Can you please show the configuration of the String to Path (Variable) node?
NOTE: If you set the file system to Local File System, you have to provide the full path. If you select Relative to and Current workflow data area, you don’t need to give the full path; in that case the files are created in the data folder inside your workflow folder.
You can refer to:
KNIME File Handling Guide
In this case, you have to create a different file name, with the full path, on every loop iteration.
I would suggest not using the loop start data output to create the file name. Instead, take the loop start flow variable output, combine the iteration variable with the full path, convert the result with String to Path (Variable), and use that variable in the CSV Writer.
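The idea of one file name per iteration can be sketched in plain Python. This is not the KNIME configuration itself, only the same logic written out; the function name, the "chunk" prefix, and the output folder are assumptions for illustration. In the workflow, the iteration number would come from the loop’s iteration flow variable.

```python
from pathlib import Path


def chunk_file_path(output_dir: str, iteration: int, prefix: str = "chunk") -> Path:
    """Build a distinct CSV path for one loop iteration.

    output_dir and prefix are placeholders; use whatever folder and
    naming scheme fit your own setup.
    """
    return Path(output_dir) / f"{prefix}_{iteration}.csv"


if __name__ == "__main__":
    # Each iteration gets its own file name, so nothing is overwritten.
    for iteration in range(3):
        print(chunk_file_path("output", iteration).as_posix())
```

If the path stays constant across iterations, every chunk is written to the same file, which matches the behaviour described in the question.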
I still don’t get the loop right. I get just one CSV with one iteration and the file name doesn’t contain the right variable.
Did you download the workflow I sent?
If not, download it and replace the Table Creator node with a CSV Reader.
Try this. You’ll need to change the path creation in the String Manipulation node to store files on your computer. If I find time, I’ll work on an approach which obviates creating the “Math Formula” column.
TEST REF.knwf (94.8 KB)
This works for my case! The String Manipulation node joining the path and the variable works perfectly. Thank you @rfeigel and @prashant7526 for your support!
As promised, here’s a workflow which doesn’t require the Counter and Math Formula columns, which I assume add no value to the output. It’s a little more complicated. It creates the file name from the loop iteration.
TEST REF rev 1.knwf (114.2 KB)
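For anyone who wants to see the same idea outside KNIME, here is a rough Python sketch of chunking by position and naming each file from the iteration number, without a Counter or Math Formula column. It does not reproduce the attached workflow; the function names, the chunk_N.csv naming, and the output folder are all assumptions.

```python
import csv
from pathlib import Path
from typing import Iterator, List, Sequence


def chunks(rows: Sequence[Sequence[str]], size: int) -> Iterator[Sequence[Sequence[str]]]:
    """Yield consecutive slices of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def write_chunks(header: Sequence[str], rows: Sequence[Sequence[str]],
                 output_dir: str, size: int = 100) -> List[Path]:
    """Write each chunk to its own CSV file, named after the iteration."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    written: List[Path] = []
    for iteration, chunk in enumerate(chunks(rows, size)):
        path = out / f"chunk_{iteration}.csv"  # hypothetical naming scheme
        with path.open("w", newline="", encoding="utf-8") as handle:
            writer = csv.writer(handle)
            writer.writerow(header)  # repeat the header in every file
            writer.writerows(chunk)
        written.append(path)
    return written


if __name__ == "__main__":
    # Same size as the reduced test case in the thread: 20 rows, chunks of 5.
    data = [[str(i), f"comment {i}"] for i in range(20)]
    for path in write_chunks(["ID", "comment"], data, "output", size=5):
        print(path)
```

With roughly 8000 rows and a chunk size of 100, this would produce about 80 files.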
[Image: Excel Output]