Managing the concurrency issue between KNIME and a batch process during CSV file generation

Description: In my KNIME workflow, a CSV file is generated and deposited into a specific directory (Directory X). An external batch process (FEX process), which runs every 5 minutes, transfers this file to another directory. The issue arises when the CSV file is large: KNIME takes time to complete the generation. If the batch process runs while the file is still being created, it transfers it partially, leading to an incomplete and unusable file.

I have tested a solution that involves writing the file to a temporary directory and then moving it to Directory X once the generation is complete. However, KNIME takes nearly the same amount of time to perform this move—around 50 seconds—which does not fully resolve the issue.

Question: Is there a native or optimized solution in KNIME to address this problem? For instance, a way to delay the visibility of the file in the target directory until it is completely generated, or another strategy to avoid this conflict.

@mosmel have you thought about creating the file with a temporary name and then renaming it to the target name?

Hi,
I think the “wait” node could help.
Wait… – KNIME Community Hub

Hi,
Thank you for your response. I tested renaming with the Transfer File node, but I experienced the same processing times because the node used for renaming the file rewrites it while renaming.

Hi,
Thank you for your response.
How can I use it? I tried using this node, but it didn’t solve the problem.

As a workaround, could you increase the time between the external batch runs?

If I increase the duration of the batch, the probability of the incident decreases, but it does not solve the problem.

@mosmel you could try using a python script. Maybe not the most elegant way.

Ah, now I get it:
The problem lies in the external batch process, as it runs at a fixed interval.
Is it possible to change this and have KNIME trigger the transfer instead?

If not, it depends on how the FEX is built. Does it search for a file with a specific name, or does it just pick up any new file?

  • In the first case I would save the CSV under a name that the FEX ignores
    and rename it after the export is finished.
  • In the second case I would add an additional folder inside the directory the CSV is exported to and write the file there. Once the export is finished, I would move the file using system commands. If the file is large and lies on a network drive, this can be slow as well, because the data is first transferred to your PC and then sent back to the network drive. In my case I used a virtual machine on a server that had a better network connection than my PC.
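
For the second case, here is a minimal Java sketch of the "staging subfolder, then move" idea. It assumes the FEX does not look inside subfolders and that the staging folder sits on the same volume as the watched directory, so the move is a metadata operation rather than a copy. The class and method names (`StagedExport`, `release`) are made up for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class StagedExport {

    /**
     * Moves a finished file from a staging subfolder into the parent
     * (watched) directory, keeping the file name.
     * Assumption: stagedFile lives in <watchedDir>/<stagingFolder>/.
     */
    static Path release(Path stagedFile) throws IOException {
        Path watchedDir = stagedFile.toAbsolutePath().getParent().getParent();
        Path target = watchedDir.resolve(stagedFile.getFileName());
        // ATOMIC_MOVE asks for a single rename-style operation. If the file
        // system cannot do that (for example across volumes), Java throws
        // AtomicMoveNotSupportedException instead of silently copying.
        return Files.move(stagedFile, target, StandardCopyOption.ATOMIC_MOVE);
    }
}
```

Letting the exception propagate is deliberate here: if the move cannot be done atomically, a slow copy would reintroduce the original problem, so it is probably better to find out.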

Hello all,

Thank you for all your recommendations. In the end, I used the Java Snippet node, which renames the file instantly.
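
For anyone finding this later, a rough sketch of what such a rename can look like in plain Java. This is not the exact snippet from my workflow; the class name `CsvPublisher`, the method `publish`, and the `.part` suffix in the usage note are only illustrative, and the code would need to be adapted to the Java Snippet node's input and output fields. It assumes the temporary file is written into the same directory under a name the batch process ignores.

```java
import java.io.IOException;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class CsvPublisher {

    /**
     * Renames a fully written temporary file to its final name
     * within the same directory and returns the new path.
     */
    static Path publish(Path tempFile, String finalName) throws IOException {
        Path target = tempFile.resolveSibling(finalName);
        try {
            // Within one directory this is normally a plain rename,
            // so the file content is not rewritten.
            return Files.move(tempFile, target, StandardCopyOption.ATOMIC_MOVE);
        } catch (AtomicMoveNotSupportedException e) {
            // Fallback for file systems without atomic rename support;
            // this variant may be slower and is not guaranteed to be atomic.
            return Files.move(tempFile, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

Usage would be along the lines of `CsvPublisher.publish(Path.of("X", "export.csv.part"), "export.csv")`, called only after the CSV writer has finished.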
