Automatic file selection reader node and delivery of outcoming figures into a certain Excel spreadsheet

Dear KNIME community,

I have set up a KNIME workflow in order to replace different daily reports that we had to run manually in the past. The outcome of each report is one figure per day. All of this figures are then collected in a certain column within an excel spreadsheet and we have a new column for every day. For each month we create a new worksheet within the same excel file.

My requirements/questions are the following:

  1. The database for my KNIME workflow are two excel files. For report 1-8 file A and for report 9+10 file B. There a fixed folders where the database files are stored. Currently I daily have to manually select the files (two excel reader nodes). Would it somehow be possible that the excel reader nodes select the database files (the most up to date ones in the folders) automatically? AND

  2. Would it be possible to have the execution of the KNIME workflow automated after the database files have been selected or if this is not possible at a certain point in time?

  3. Currently the 10 figures per day are written into 10 different csv files (new line for each day). Would it be possible to use KNIME in order to write this 10 figures directly into central excel spreadsheet, that I have described in the beginning? It would be ideal to have all of this automated and keep the logic that the daily figures are collected in a certain column within the excel spreadsheet and to have a new column for every day. Same applies for the procedure that a new worksheet is created within the same excel file for each new month. In case there is no direct way, maybe there is an indirect way? :thinking::wink:

BR
Matt

Hi @MattonKNIME,

thanks for your questions! Let me try to help here.

  1. So regarding your first question. If you say database you don’t mean a “real” database but more a folder containing Excel files correct?
    I don’t know how is the naming convention of your files, but I could imagine that using the List Files node where you can select a folder could be a good starting point. Then you could sort the list according to a number that you give to the files before. If you can manage to have the file you need at the top or bottom you could then filter that file automatically using the Row Filter or Top k Selector node. The URL could then be provided to the Excel reader automatically.
    Like this for example:
    image

  2. This is workflow scheduling and a feature of the KNIME Server.

  3. I don’t see any reason why this should not be possible, but I am not sure if I got it correctly. So if you have the images in one list in KNIME then, in the end, it is not a big deal to write them into one Excel sheet. You need to read that file first and then add your data to that (concatenate node probably) and then use the Excel writer node. I don’t know how complex your example is, but if you have some dummy data with a simple example workflow maybe that would make it easier here.

Best,
Martyna

3 Likes

Hi @Martyna,

thanks for your hints! Let me go through it point by point:

  1. Yes, you are right. I meant a specific folder that contains the relevant excel files. Sorry for being unprecise here :slightly_smiling_face:. However when trying to create such workflow, I was not able to find the node “Top k Selector” in the node repository. Currently I´m using KNIME version 4.0.2. Could this be the reason?

  2. Allright, but is there any alternative to have the execution of a KNIME workflow automated without using a KNIME server (we might only consider to use a KNIME server at a later stage)?

  3. OK, will build an example once I got a solution for point 1 :wink:

BR
Matt

Hi Matt,

  1. yes, the version here is the reason. In version 4.0.2 the node was not part of the basic installation and was included in the active learning extension with the name “Element Selector” if I am not wrong. So you would need to install that extension or update KNIME to at least 4.1
  2. no, at least not to my knowledge
  3. Good luck :grinning: let me know if you need help here!

Best,
Martyna

2 Likes

Well of course you should buy an KNIME server but if you combine windows task manager and KNIME in batch mode you should be able to do some automation. Also on UNIX and MacOSX you could try to use the respective schedulers.

But KNIME server is great and provides a lot of functions and is also available as an AWS service.

1 Like

OK @Martyna I got the latest KNIME version now and I can select all relevant nodes now. However I am still struggling with point 1. My intention would be to have always the excel file with the respectively latest amendment or creation date selected automatically. Not sure which possibilities I have there. This would be more comfortable because If I have to go through ascending numbers in the file naming the naming of the files would then be manual in the end, too…

Ok I think I found finally the node I was looking for that might help you :slight_smile:
First you list the files of the folder, then you need to transform the URL column to the URI format with String to URI and then you can use the File Meta Info that gives you a column with last modified dates.
sort according to the date column and select the first from the top.

image

I hope thats better now :slight_smile:

3 Likes

@Martyna before I will go ahead and create an example in order to come to a solution to all of my initial questions, just a short feedback on my trials on the automated file selection.

In the meantime I managed to have the selection based on the file name (URL). The reason why the automated selection of the desired file failed in my workflow was that I had the date in the format DD.MM.YYYY within the file naming convention. After changing the format to YYYY_MM_DD it is working properly now :grinning: :+1: .

However when trying it through your last approach with “String to URI” and “File Meta Info” the columns are not filled with the respective data, e. g. last modified date:

Unbenannt

Therefore the automated selection based on URI / last modified date does not deliver the correct output at the moment. Do you have any idea what is going wrong?

This is how both nodes are configurated currently:
String to URI
File Meta Info

Good to hear that something is already working :slight_smile:

Did you try to get the Meta Info for different folders?
Could you please try something that is local where you are sure to have permissions?

This topic was automatically closed 182 days after the last reply. New replies are no longer allowed.