Description: In my KNIME workflow, a CSV file is generated and deposited into a specific directory (Directory X). An external batch process (FEX process), which runs every 5 minutes, transfers this file to another directory. The issue arises when the CSV file is large: KNIME takes time to complete the generation. If the batch process runs while the file is still being created, it transfers it partially, leading to an incomplete and unusable file.
I have tested a solution that involves writing the file to a temporary directory and then moving it to Directory X once the generation is complete. However, KNIME takes nearly as long to perform this move (around 50 seconds), so the window in which the batch process can pick up a partial file is still there and the issue is not resolved.
Question: Is there a native or optimized solution in KNIME to address this problem? For instance, a way to delay the visibility of the file in the target directory until it is completely generated, or another strategy to avoid this conflict.
Hi,
Thank you for your response. I tested renaming with the Transfer File node, but I saw the same processing times, because the node rewrites the file's contents while renaming it.
Ah, now I get it:
The problem lies in the external batch process, as it runs at a fixed interval.
Is it possible to change this and trigger the transfer from KNIME instead?
If not, it depends on how the FEX is built. Does it search for a specifically named file, or does it just look for new files?
In the first case, I would save the CSV under a name that the FEX ignores and rename it after the export is finished.
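As a rough sketch of that write-then-rename idea, assuming you can run Python from the workflow (for example in a Python Script node) and that the FEX really does ignore files ending in `.part` (both are assumptions, and the function name and suffix are made up for illustration):

```python
import csv
import os
from pathlib import Path


def write_csv_then_rename(rows, target, tmp_suffix=".part"):
    """Write rows to '<target><tmp_suffix>', then rename it to target.

    tmp_suffix is a hypothetical suffix that the batch process is
    assumed to ignore; adjust it to whatever the FEX actually skips.
    """
    target = Path(target)
    tmp = target.with_name(target.name + tmp_suffix)

    # Slow part: the file grows under the temporary name only.
    with open(tmp, "w", newline="", encoding="utf-8") as fh:
        csv.writer(fh).writerows(rows)
        fh.flush()
        os.fsync(fh.fileno())  # ask the OS to push the data to disk

    # Fast part: a rename within the same directory does not copy the data.
    os.replace(tmp, target)
    return target
```

Because the temporary file sits in the same directory as the final one, the rename should only change the directory entry instead of rewriting the data. How atomic that is on a network share depends on the file system behind it, so it is worth testing against the real target.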
In the second case, I would add an additional folder inside the directory where the CSV is exported and write the file there. Once the export has finished, I would move the file using system commands. If the file is large and lies on a network drive, this can be slow as well, because the data is first transferred to your PC and then sent back to the network drive. In my case, I used a virtual machine on a server that had a better network connection than my PC.
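A minimal sketch of the staging-folder move, again assuming Python is available and that the staging folder sits on the same volume as the target directory (the function name and the `st_dev` check are my own illustration, not something KNIME provides):

```python
import os
from pathlib import Path


def publish_from_staging(staged_file, target_dir):
    """Move a finished file from a staging folder into target_dir.

    Refuses to run when the two folders are on different volumes,
    because a cross-volume move copies the data and reopens the
    time window in which a partial file is visible.
    """
    staged_file = Path(staged_file)
    target_dir = Path(target_dir)

    # Same device id means the move can be a plain rename.
    if os.stat(staged_file.parent).st_dev != os.stat(target_dir).st_dev:
        raise RuntimeError("staging and target folders are on different volumes")

    destination = target_dir / staged_file.name
    os.replace(staged_file, destination)  # rename, no data copy
    return destination
```

Called as `publish_from_staging("X/staging/export.csv", "X")` (paths are placeholders), this only touches the directory entry. On a network drive the rename is still a request to the file server, so it may be worth confirming that it is handled server-side and not as a copy.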