How to dynamically read excel sheet

Hello,
I am currently working on an Excel file that contains different sheets in date format. For my work, I need the data of the sheet that is named as yesterday’s date. For example: If today is 09.12, I need the data from 08.12. How can I do this?

image

Hi @emshihab !!

I assume you are going to run the workflow every day, so today’s date and the run date could be the same value, then, the yesterday’s date will be execution - 1 day. Once you get that value, the rest of the task consist in formatting.

At the end, you can use the formatted date as a variable for the ‘Excel Reader’ node, passing the variable through ‘sheet_name’ parameter.

Hope it works for you. Best regards,


dynamically read excel sheet.knwf (26.5 KB)

2 Likes

Hi @cristiancandia , your approach in general is good, but I would not use String Manipulation to “extract” day and month from the date. I’ve not looked what you did, but it’s most likely that you are assuming a date format when you are doing this. Date formats depends on locale, meaning that day and month are not always in the same position. For example, some locales have dates as yyyy-mm-dd (or yyyy/mm/dd) while others have mm-dd-yyyy, and others have dd-mm-yyyy, etc. This means that your formatted_date is valid only for specific locale.

It’s safer to use the Extract Date&Time Fields which will factor in the locale.

@emshihab , I’ve put something together that offers flexibility of changing the date in case you want to run the workflow for some other date, but default is the current date. The workflow also will work for any date format.

My workflow looks like this:

Looking at the first node, it’s configured to use the current date:

But should you want to run it for a specific date, just uncheck that box, and click on that little icon to get a nice calendar popup where you can choose which date you want to run it for:

2 nodes that you need to configure based on your locale. The Node 3 and Node 5. Both should use the same locale, which should be your locale. I’m on Canada, so mine is en-CA. By default, Knime would set your locale automatically based on what your system/computer is set to:
Node 3:

Node 5:

So, for today’s date, it will generate a variable called sheet_name with value “08.12”:
image

I have also configured it so that if the month is from 1 to 9 (Jan to Sept), the month will be formatted as 01 to 09, so if you run the workflow for Feb 4th, the sheet_name will be “03.02”.

The Excel Reader is configured to use the sheet_name as sheet_name:

You just need to point to the file, and that’s it.

Here’s the workflow: Dynamically read excel sheet.knwf (20.6 KB)

9 Likes

Big thanks Bruno, i totally agree with you!!

2 Likes

Is your “yesterdays” sheet always the second one? If so just read sheet by index 1
br

thanks for both of your solution.

Hence i am working on knime automation for this excel file also, the position of excel sheet is changed. Now today’s sheet is placed in the last position. And the yesterday’s sheet is placed before the today’s sheet. That’s why I assume index might not work.

If you today’s sheet which becomes yesterday’s sheet tomorrow is shifted one tab because the new today is added then reading always the second sheet to get yesterdays data would work. But I assume your use case is a little different
glad that you found a solution which works for you.
br

In one sense, your idea is also nice. My current workflow reading data by sheet name is perfect if I do the task everyday. But if I miss one day and next day when I will try to execute the workflow, it will not work for me because I have given condition to read sheet name of yesterday.
Currently, I am using knime 4.2. Directly, I cannot provide the index for sheet. Can you tell me how can I call the sheet by index?

Excel Reader should provide you with option to read by sheet name or index based on your configuration
br

@emshihab not entirely sure if this is of any help but I thought about Excel and index of sheets and toyed around with openpyxl. There you can extract the explicit index of a file. Maybe you could do the same by just importing all sheets via Excel node and see their sorting.

As far as I understand your case you will have to either have the date in the name of the sheet (would there be a Excel file per month or would you have to take care of last and first date of month also) or you will have some indication within the data itself.

All being said this seems not an ideal way to safely transfer data so you might try to work on that part too. Maybe name the sheets with an explicit timestamp “2021-12-24” that would be easier to handle.

1 Like

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.