I have to create this workflow: from a folder containing CSV files, I have to retrieve the files, process them, and then delete them from the folder. What do you recommend?
Can you please provide more detail about what you’re trying to do? For example, are you processing the csv files separately or trying to join/concatenate/append them? Do all of the csv files have the same structure? Do you want to store the processed files somewhere or just view the output?
Hi, I’m pasting a screenshot of the part of the work I did that works. It’s a CSV taken from a folder that is processed and then written to Excel. The starting folder is Temp and the final folder is Out.
I need to retrieve the CSV files from the FTP folder, process them with transcoding (the workflow posted above), and then delete the files in the FTP folders. I asked Copilot to first run a local test with 4 folders: In (source), Temp (file to process), Error (if there are errors), Out (final folder), and Archive (backup folder). It did this for me, but I’m lost. Thanks.
There are still too many open variables:
- are we talking large amounts of data or small?
- can you read everything from the network?
- do you need to transfer the files first and buffer them locally?
- do you have enough space to store them temporarily on a local disk?
- are you expecting interruptions in the process?
- etc.
in general, you should be fine with:
- list files & folders from the ftp location
- create a temporary folder locally
- transfer files (leave originals)
- load the local files with the CSV Reader
- do stuff
- delete remote files
- delete temp directory
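Outside of KNIME, the same sequence of steps can be sketched in plain Python with the standard library's `ftplib`. This is only an illustration of the logic above; the host, credentials, remote directory, and the processing step are all placeholders you would replace with your own:

```python
import csv
import shutil
from ftplib import FTP
from pathlib import Path


def fetch_and_process(host: str, user: str, password: str,
                      remote_dir: str, local_dir: Path) -> list[str]:
    """List files on the FTP server, transfer them locally (leaving the
    originals in place), process the local copies, then delete the remote
    files and the temp directory. Returns the processed file names."""
    local_dir.mkdir(parents=True, exist_ok=True)   # create a temporary folder
    processed = []
    with FTP(host) as ftp:
        ftp.login(user, password)
        ftp.cwd(remote_dir)
        for name in ftp.nlst():                    # list files & folders
            target = local_dir / name
            with open(target, "wb") as fh:         # transfer (leave original)
                ftp.retrbinary(f"RETR {name}", fh.write)
            process_csv(target)                    # "do stuff"
            ftp.delete(name)                       # delete remote file
            processed.append(name)
    shutil.rmtree(local_dir)                       # delete temp directory
    return processed


def process_csv(path: Path) -> list[list[str]]:
    """Placeholder for the real transformation: just parse the rows."""
    with open(path, newline="") as fh:
        return list(csv.reader(fh))
```

The per-file order (transfer, process, then delete the remote original) means an interruption never loses a file that hasn't been processed yet.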
Hi, thanks
They’re small quantities, one file per folder. I managed to create this workflow that transfers, reads, and writes. Once written, it deletes the files and leaves only Out and Archive. All I need is an email notification if a file isn’t readable. Is that possible?
You don't need to loop.
E.g. for transfer and deletion, there are nodes that do it all at once.
The CSV Reader can also read all files in a folder at once.
But this will make it more difficult to debug.
If a file isn't readable, is that because it could be corrupt? Or do you expect a change in structure?
OK, I didn't know that. If a file is not readable or has a different structure, the workflow doesn't work and I don't get the Excel file. From an automation perspective I have to understand which folders have non-working files. There could be 300 folders, each with one file inside.
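For the notification side of this, a plain-Python sketch of the idea (KNIME has its own email nodes; here the SMTP host and both addresses are placeholders, and the message-building part is separated out so it can be tested without a mail server):

```python
import smtplib
from email.message import EmailMessage


def build_alert(bad_files: list[str]) -> EmailMessage:
    """Build a notification listing the files that could not be read."""
    msg = EmailMessage()
    msg["Subject"] = f"CSV import failed for {len(bad_files)} file(s)"
    msg["From"] = "workflow@example.com"       # placeholder sender
    msg["To"] = "you@example.com"              # placeholder recipient
    msg.set_content("Unreadable files:\n" + "\n".join(bad_files))
    return msg


def send_alert(bad_files: list[str], host: str = "smtp.example.com") -> None:
    """Send the alert only if something actually failed."""
    if not bad_files:
        return
    with smtplib.SMTP(host) as smtp:           # placeholder SMTP server
        smtp.send_message(build_alert(bad_files))
```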
This is exactly what I was looking for! The logic with the delete node at the end makes so much sense for keeping the folder clean.
@luca_er2011 in my opinion you will have to take it step by step and at each step build in fail-safes that handle errors. The key, from my experience, is not to try to do it all at once but to split it up into small, understandable tasks. KNIME is ideally suited for that.
- create a list of the files on FTP
- connect to FTP, then connect the list files node
- transfer the files to a local folder (you can also connect this node with the FTP node - the blue port)
- if there could be problems, do it file by file (loop) and wrap a try-catch node combination around it, recording which files caused problems
- store the names and paths in tables so you can later, for example, steer the deletion
- from the local folder (temp folder) try to import the (CSV?) files
- do it in a flexible way: either import every structure and then validate it against a standard pattern, or let the CSV Reader fail if the structure has changed
- again: record the failures so you can later skip, handle, delete … whatever
- the ‘successful’ files can then be handled: combined, stored as Excel, and so on (if you are paranoid you can again insert a try-catch)
I think you get the idea. Yes this is some work and mostly involves planning.
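The try-catch-and-record pattern from the steps above can be sketched in plain Python like this; the expected header is an assumption standing in for your real standard pattern:

```python
import csv
from pathlib import Path

# Assumed standard pattern; replace with your files' real header.
EXPECTED_HEADER = ["id", "name", "value"]


def load_folder(folder: Path):
    """Try each CSV file; collect successes and record failures (with the
    reason) so later steps can steer deletion, archiving, or retries."""
    good, failed = [], []
    for f in sorted(folder.glob("*.csv")):
        try:
            with open(f, newline="") as fh:
                rows = list(csv.reader(fh))
            # Validate against the standard pattern instead of failing blindly.
            if not rows or rows[0] != EXPECTED_HEADER:
                raise ValueError(f"unexpected header: {rows[:1]}")
            good.append((f, rows[1:]))          # 'successful' file, data rows
        except (OSError, ValueError, csv.Error) as exc:
            failed.append((f, str(exc)))        # record which file failed and why
    return good, failed
```

The two result lists play the role of the KNIME tables: one feeds the Excel/combine branch, the other feeds the error handling (and, if you want, the email notification).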
Here are two articles and a guide that explain the elements and techniques you will need:
Medium: “KNIME, Paths and Loops — Automate Everything”
Medium: “KNIME — Cases, Switches and Catching Errors”
KNIME File Handling Guide
If you only want to check the structure, you can also simply tell the CSV Reader not to do any transformations, load only the first line (which is the header), and list the columns.
Then you get a list of the ordered columns of each file and can test it against the expected pattern, so you know which files deviate in structure (content, trailing lines, etc. may still cause issues).
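In plain Python, that header-only check is just reading the first line of each file and comparing it to the expected pattern; the expected column list here is a placeholder:

```python
from pathlib import Path


def header_of(path: Path) -> list[str]:
    """Read only the first line (the header), with no type transformations."""
    with open(path, newline="") as fh:
        first = fh.readline().rstrip("\r\n")
    return first.split(",")


def deviating_files(folder: Path, expected: list[str]) -> list[str]:
    """Names of CSV files whose header differs from the expected pattern.
    (Content or trailing lines may still cause issues later.)"""
    return [f.name for f in sorted(folder.glob("*.csv"))
            if header_of(f) != expected]
```

Because only one line per file is read, this stays cheap even across 300 folders.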