Read new files whenever a file arrives in the directory

Hi,

I need a node that reads files from a directory whenever a new file arrives, and that does not read the old files again.

 

Can anyone help me?

 

Regards,

Prasanna

This is certainly something I am very interested in as well.

Hello Prasanna,

I don't think there is such a node, but there is a solution to your problem:

First you need to use the File Meta Info node in combination with a List Files node and some others. With these you can get the newest file in your directory.
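For readers who want to see the "newest file" step outside of KNIME, here is a minimal Python sketch. It is not the node-based approach described above, only an illustration of the same idea using the file modification time; the function name `newest_file` is made up for this example.

```python
import os
from typing import Optional


def newest_file(directory: str) -> Optional[str]:
    """Return the path of the most recently modified regular file, or None if there is none."""
    newest_path = None
    newest_mtime = float("-inf")
    with os.scandir(directory) as entries:
        for entry in entries:
            if not entry.is_file():
                continue  # skip sub-directories and other non-file entries
            mtime = entry.stat().st_mtime
            if mtime > newest_mtime:
                newest_path, newest_mtime = entry.path, mtime
    return newest_path
```

Note that "newest by modification time" is only a proxy for "newly arrived": a file copied with its original timestamp preserved may not be picked up this way.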

To trigger the execution, you need to write a script outside of the Analytics Platform. You can register a monitor on this directory and then use the batch execution to trigger the workflow execution.
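A rough sketch of such an external script, assuming simple polling instead of an operating-system file monitor. The names `find_new_files`, `watch` and `run_batch` are hypothetical, and the batch command in the comment at the end is only an example of what a call might look like; please check the batch execution documentation for your KNIME version before relying on it.

```python
import os
import subprocess
import time
from typing import Callable, List, Optional, Set


def find_new_files(directory: str, seen: Set[str]) -> List[str]:
    """Return paths of files not yet in `seen` (sorted by name) and mark them as seen."""
    with os.scandir(directory) as entries:
        current = {entry.name for entry in entries if entry.is_file()}
    new_names = sorted(current - seen)
    seen.update(new_names)
    return [os.path.join(directory, name) for name in new_names]


def watch(directory: str,
          on_new: Callable[[List[str]], None],
          interval_s: float = 5.0,
          max_polls: Optional[int] = None) -> None:
    """Poll `directory` and call `on_new` with each batch of newly arrived files."""
    seen: Set[str] = set()
    find_new_files(directory, seen)  # files already present count as old
    polls = 0
    while max_polls is None or polls < max_polls:
        new_files = find_new_files(directory, seen)
        if new_files:
            on_new(new_files)
        polls += 1
        time.sleep(interval_s)


def run_batch(command: List[str]) -> int:
    """Run an external command (for example a batch workflow call) and return its exit code."""
    return subprocess.run(command, check=False).returncode


# Example call (assumption: paths and arguments must be adapted to your installation):
# watch("/data/incoming",
#       lambda files: run_batch(["knime", "-nosplash", "-application",
#                                "org.knime.product.KNIME_BATCH_APPLICATION",
#                                "-workflowDir=/path/to/workflow"]))
```

The `seen` set lives only in memory here, so after a restart every file that is already in the folder is treated as old. A file may also be detected while it is still being written, so the workflow should tolerate incomplete files or the script should wait before triggering.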

Best,
Ferry

Hi,

I have an example workflow that performs this process. However, it contains quite a few Java snippets.

We are working on a dedicated set of nodes that will run only on new files in a folder. Unfortunately, the timeline is not set yet.

The workflow is attached. You need to change the directory in the Watch Files Meta node, and the reading step in the Read and process new files meta node. Currently it reads only KNIME tables and processes them to the output.

I hope this helps,

Cheers, Iris

Hi Iris,

thanks for sharing this workflow. Is there a reason why you're using deprecated List Files nodes and the legacy Quickforms nodes? I always tried to avoid these in my workflows, but I'd be interested if there are use cases where there is no replacement available yet.

 

Thanks,

Jan

 

The reason is that the workflow is old and I didn't have the time to update it.

You might be interested in this article: https://www.knime.org/blog/reproducibility-and-knime

Thanks, Iris, for letting me know. I appreciate that KNIME ensures the reproducibility of older workflows. However, I believe there was a reason why those nodes received an update and were developed further. And when learning new methods, I prefer being based on the state of the art, hence my questions above.

I'd like to encourage the KNIME team to keep their example workflows up-to-date with the latest node implementations. These example workflows (both on the example server and here on the forum) are extremely helpful, and it just creates confusion when those examples use all kinds of different paradigms (e.g. the concepts of metanodes vs wrapped metanodes and the implementations of legacy quickforms vs quickforms, etc.)

 

Cheers

Jan

 

Hi all,

I recently faced the same need, so I built a lightweight alternative to the previous suggestions. It also relies on the recursive loop, in combination with a Python node that gets the state of the folder.

The second Python node, which does the processing, can be replaced by other processing nodes. Sadly, the output data at the loop end cannot be visualised before the loop finishes.

In this configuration, the loop runs for a given number of iterations defined in the loop end configuration, but it can also be stopped using a flow variable. The loop can also be stopped by cancelling the workflow, but in this case no visualisation is possible.
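To give an idea of the logic outside of KNIME, here is a plain Python sketch of the "folder state" idea. It is not the content of the actual Python nodes in the workflow: the names `snapshot`, `changed_files` and `run_loop` are made up for this example, `max_iterations` stands in for the iteration limit of the loop end, and `should_stop` stands in for the flow variable.

```python
import os
import time
from typing import Callable, Dict, List, Tuple

FolderState = Dict[str, Tuple[float, int]]  # file name -> (modification time, size)


def snapshot(directory: str) -> FolderState:
    """Record the current state of the folder (regular files only)."""
    state: FolderState = {}
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_file():
                info = entry.stat()
                state[entry.name] = (info.st_mtime, info.st_size)
    return state


def changed_files(previous: FolderState, current: FolderState) -> List[str]:
    """Names that are new, or whose (mtime, size) differs from the previous state."""
    return sorted(name for name, signature in current.items()
                  if previous.get(name) != signature)


def run_loop(directory: str,
             process: Callable[[str], None],
             max_iterations: int,
             should_stop: Callable[[], bool] = lambda: False,
             interval_s: float = 1.0) -> List[str]:
    """Carry the folder state from one iteration to the next, as a recursive loop would."""
    state = snapshot(directory)  # files present at start are treated as old
    processed: List[str] = []
    for _ in range(max_iterations):
        if should_stop():
            break
        current = snapshot(directory)
        for name in changed_files(state, current):
            process(os.path.join(directory, name))
            processed.append(name)
        state = current  # this is the data passed back to the loop start
        time.sleep(interval_s)
    return processed
```

Comparing modification time and size means a file that is overwritten in place is processed again, which may or may not be what you want.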

As I've been using loop-workarounds for this as well I can only say that I'd love to see this feature!

One further alternative would be to use the KNIME Streaming Mode and change the batch size to 1 so that each new "event" is handled by the KNIME workflow right away. But this would obviously include some under-the-hood node development which might not be what most people here are looking for.

Hi Iris,

Has any progress been made on the directory watcher nodes? I am working on setting this up to act as a kind of batch processing for template files being submitted to our database. I am having trouble with implementing the directory_watcher_0 and Recursive Loop - WatchFolder workflow examples. Can you please help?

Thanks,
Gina