Input files

I created a workflow to process dat and xls files (merge and reformat data). The input consists of 2 files, which I select using 2 File Reader nodes, say:
feed1_ddmmyyyy.dat
feed2_ddmmyyyy.xml
I have scheduled the workflow to run daily using a bat file and the Windows scheduler.
Because the input feeds arrive daily with a different timestamp in the name (e.g. feed_1_14/08/2021), it is not possible to run the workflow automatically.
Is it possible to recognize the input name in KNIME based on the letters only, without the date, so that the automation will work?

thanks!

Hi @Sharon_c

You can use the List Files/Folders node and use its “Filter options” to select the files without the date. Then you feed the filename as a flow variable into the File Reader node.

gr. Hans


Hi @Sharon_c, there seem to be some inconsistencies in what you are presenting.

  1. You mentioned processing “dat and xls”, but the sample file names you gave are of type dat and xml.
  2. Your sample file is of format feed1_ddmmyyyy, but the example you gave is feed_1_14/08/2021, which is of format feed_1_dd/mm/yyyy, although in your screenshot you are using feed1_ddmmyyyy.

It becomes a bit difficult to provide a solution when we can’t tell what the filename format and file type are without making assumptions on our part.

The assumptions I will make are:

  1. File types are .dat and .xml. I chose .xml instead of .xls, because the File Reader would be able to open an xml file, but not an xls file.
  2. Date format is ddmmyyyy since that’s what it shows in the screenshot.

To answer your question, the File Reader unfortunately does not accept wildcards for file names, so you have to specify the exact full path of the file.

A side question: if the File Reader could read wildcards (like feed1_* for example), what happens to the files after the processing is done? If the files are not moved or renamed, the next run of the workflow will process them again.
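
One common way to avoid that is to move each file into an archive folder once it has been processed. Here is a minimal sketch of the idea in plain Python, outside KNIME. The function name `archive_processed` and the archive folder are assumptions for illustration, not something from the workflow:

```python
import shutil
from pathlib import Path


def archive_processed(file_path, archive_dir):
    """Move a processed file into archive_dir and return its new location."""
    archive = Path(archive_dir)
    # Create the archive folder on first use (hypothetical location).
    archive.mkdir(parents=True, exist_ok=True)
    target = archive / Path(file_path).name
    # After the move, a later "feed1_*" listing no longer sees this file.
    shutil.move(str(file_path), str(target))
    return target
```

You would call it once per input file at the end of a successful run, so the next run only finds files that have not been handled yet.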

You can control which file to read. There are a few ways to do this.
If it’s just a question of getting the date, you can get the current date, build a date string from it, and then build the file paths from that string.

Here’s a quick sample:
image

I generate a dateString variable containing today’s date:
image

I have a base filepath:
image

And I generate the paths for the .dat and the .xml files:
image
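
In case the screenshots are hard to follow, the same logic can be sketched in plain Python (this is only an illustration of the steps, not what the KNIME nodes run internally). The base folder `C:\feeds` and the function name are assumptions; the file name pattern follows the feed1_ddmmyyyy.dat / feed2_ddmmyyyy.xml assumption above:

```python
from datetime import date
from pathlib import PureWindowsPath


def build_feed_paths(base_dir, day):
    """Return the expected .dat and .xml paths for the given day."""
    date_string = day.strftime("%d%m%Y")  # ddmmyyyy, e.g. 14082021
    base = PureWindowsPath(base_dir)
    dat_path = base / f"feed1_{date_string}.dat"
    xml_path = base / f"feed2_{date_string}.xml"
    return dat_path, xml_path


# A fixed date keeps the example reproducible; a daily run would pass date.today().
dat_path, xml_path = build_feed_paths(r"C:\feeds", date(2021, 8, 14))
print(dat_path)  # C:\feeds\feed1_14082021.dat
print(xml_path)  # C:\feeds\feed2_14082021.xml
```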

Another alternative is to use the List Files/Folders node. Here, there are a few different options.
Option 1: You can filter by file name with wildcard directly within the List Files/Folders:


You’d use this if you had different file names in the folder, and wanted to process only files with filenames starting with “feed1_”.

However, if you wanted to filter feed1_<current_date> among the other feed1_* files, then you cannot use this option as it is, which leads to option 2 below.
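
To make the limitation concrete, here is a small Python sketch of what a feed1_* wildcard does to a folder listing. The file names are made up for the example, and the matching uses Python's `fnmatch` module, which I am assuming behaves like the node's wildcard filter for a simple pattern such as this:

```python
from fnmatch import fnmatchcase


def filter_by_wildcard(file_names, pattern):
    """Keep only the names that match a shell-style wildcard pattern."""
    return [name for name in file_names if fnmatchcase(name, pattern)]


# Hypothetical folder content: two days of feed1 plus unrelated files.
folder_listing = [
    "feed1_13082021.dat",
    "feed1_14082021.dat",
    "feed2_14082021.xml",
    "notes.txt",
]

feed1_files = filter_by_wildcard(folder_listing, "feed1_*")
print(feed1_files)  # ['feed1_13082021.dat', 'feed1_14082021.dat']
```

The wildcard keeps every feed1 file, including the one from the previous day, so on its own it cannot single out the file for one particular date.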

Option 2: This is still about using the List Files/Folders, but filtering on specific files. There are actually 2 sub-options here. Both involve building the filepath from the current date, in the same way that I built the dateString above. After that:
2a: You can point to a variable containing the filepath directly in the List Files/Folders:


image

2b: You can filter via a Rule Engine or Rule-based Row Filter after the List Files/Folders:
For example, let’s say I have yesterday’s and today’s files in the folder, and I only want today’s files:
image

After filtering:
image

This option gives you a bit more flexibility in that you can do different filtering and not be limited by the options of the List Files/Folders node.
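
The filtering step in 2b boils down to something like the following Python sketch. It is only an illustration of the rule, not the Rule Engine syntax; the function name and the file names are assumptions, with the dates at the end of the name in ddmmyyyy format:

```python
from datetime import date
from pathlib import PurePath


def keep_files_for_day(file_names, day):
    """Keep only the files whose name (without extension) ends in _ddmmyyyy for the given day."""
    suffix = "_" + day.strftime("%d%m%Y")
    return [name for name in file_names if PurePath(name).stem.endswith(suffix)]


# Hypothetical listing with yesterday's and today's files.
listing = [
    "feed1_13082021.dat",
    "feed2_13082021.xml",
    "feed1_14082021.dat",
    "feed2_14082021.xml",
]

todays_files = keep_files_for_day(listing, date(2021, 8, 14))
print(todays_files)  # ['feed1_14082021.dat', 'feed2_14082021.xml']
```

Since the rule is just a condition on the file name, you can swap in any other condition (a different day, a different prefix) without touching the listing step.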

I’ve attached a workflow that has both alternatives:
Input files.knwf (31.7 KB)


Hi,
thanks for the detailed response!

  1. It’s dat and xls, sorry for the confusion.
  2. The dates in the file names are not today’s date. Usually they are yesterday’s, but not always: on weekends, the dates are 3 days back. File names (not including dates) are unique. Therefore, I’m trying to use the wildcard solution… currently with no success…

Where can I find the “List Files\Folders” node? Only the List Folders node and the List Files node are available in the repository.

Thanks,
Sharon.

Thanks! I have managed to complete the workflow successfully.

