I have a workflow that starts with a “List Files” node, which reads a folder of over 1600 .xml files to create a table. When I first built the workflow this was not a problem, because I only had a few XML files.
As time went on, the number of files grew, thus making the workflow a lot slower to complete.
I was hoping that someone could share a way to only process the “new xml files” that are added so that the workflow could be a lot faster. I have no idea if what I am asking is possible but I decided to ask anyways.
Thank you in advance
It’s almost certainly possible, but how to do it depends on how you will know that the files are ‘new’. Are you looking simply at a file creation date-based approach, or is there some other way of looking up which files have already been processed (e.g. from a database, a list of output files etc.)?
Also, how are you reading them? If you are using a Table Row To Variable Loop Start -> XML Reader combination after your List Files node, then almost certainly it will be much quicker to use the Load Text-Based Files node (from the Vernalis community contribution - see https://hub.knime.com/Vernalis/extensions/com.vernalis.knime.feature/latest/com.vernalis.nodes.io.txt.LoadTxtNodeFactory or https://nodepit.com/node/com.vernalis.nodes.io.txt.LoadTxtNodeFactory) followed by a String to XML node.
Thank you Steve. I will check it out!
Hi @stevens_albert and welcome back to the KNIME community forum,
Regarding @mlauber71’s second suggestion, you can export the output of the List Files node to a file and read it back each time you run the workflow. Then use the Reference Row Filter node to exclude the files that are already listed in the exported file.
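In case it helps to see the logic outside of KNIME, here is a minimal Python sketch of the same "reference list" idea. It is not a KNIME workflow and not anyone's actual implementation: the function names (`find_new_files`, `mark_processed`) and the one-path-per-line manifest file are assumptions made up for illustration. The manifest plays the role of the exported List Files output, and the set difference plays the role of the Reference Row Filter.

```python
from pathlib import Path


def find_new_files(folder, manifest_path, pattern="*.xml"):
    """Return paths in `folder` that are not yet listed in the manifest."""
    manifest = Path(manifest_path)
    # Paths recorded on earlier runs (empty on the very first run).
    if manifest.exists():
        seen = set(manifest.read_text(encoding="utf-8").splitlines())
    else:
        seen = set()
    current = sorted(str(p) for p in Path(folder).glob(pattern))
    # Keep only the paths that were not processed before.
    return [p for p in current if p not in seen]


def mark_processed(manifest_path, paths):
    """Append newly processed paths to the manifest, one per line."""
    with open(manifest_path, "a", encoding="utf-8") as fh:
        for p in paths:
            fh.write(p + "\n")
```

You would call `find_new_files` at the start of a run, process only what it returns, and call `mark_processed` once that succeeded. Updating the manifest only after a successful run means a failed run does not silently skip files next time.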
Thank you very much for your reply. I will try this and let you know!
How can I get the date of my last run?
Thank you! I will try that and let you know!
You could use the:
A few more examples of how to handle date and time variables. You might convert your time of execution to a number like:
and store it with your data and later simply use a Rule Engine to filter out cases with older timestamps.
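For anyone who wants to see the timestamp idea spelled out, here is a rough Python sketch of it, again outside of KNIME. The assumptions are mine: the last run is stored as epoch seconds in a small text file, "new" means the file's modification time is later than that stored value, and the function names are hypothetical. The comparison in `files_modified_since` corresponds to the rule that filters out older timestamps.

```python
import time
from pathlib import Path


def read_last_run(stamp_path):
    """Return the stored epoch timestamp, or 0.0 if no run was recorded yet."""
    p = Path(stamp_path)
    return float(p.read_text().strip()) if p.exists() else 0.0


def write_last_run(stamp_path, when=None):
    """Store the execution time as a plain number (epoch seconds)."""
    value = time.time() if when is None else when
    Path(stamp_path).write_text(repr(float(value)))


def files_modified_since(folder, last_run, pattern="*.xml"):
    """Return files whose modification time is later than `last_run`."""
    return sorted(
        str(p) for p in Path(folder).glob(pattern)
        if p.stat().st_mtime > last_run
    )
```

One caveat with this approach: a file that is copied into the folder while keeping an old modification date would be skipped, so the reference-list approach may be safer if that can happen in your setup.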