Hello, I just started learning KNIME today and I’ve run into an issue with the Excel Reader.
I have a folder with weekly data in excel files and they all have the same table structure. The only difference between all these files is the sheet name since the sheet name changes every time the data is exported from my source (which I have no control over).
Is there a way to get Excel Reader to read all the files without having to manually change the sheet name in every excel file so that they’re all the same? I know I wouldn’t have to worry about that if I had the files in another file format, but I’d like to avoid that since I have years worth of data. My goal is to have these files combined into a single file.
I’ve tried to look at other people’s questions to see if I can get some kind of answer to work with, but nothing seems to help.
So I’ve tried using the Read Excel Sheet Names node and connecting it to the Table Row to Variable Loop Start node, but I’m still getting an issue in my Excel Reader node. The example you’re referring to is if the files have multiple sheets, but it seems like there is still an assumption that the sheet names are the same across multiple excel files. Here’s a picture of my workflow.
The error message I get is:
ERROR Excel Reader (XLS) 2:5 Execute failed: java.io.IOException: Workbook “2020 Week 2.xlsx” does not contain a sheet called “Sheet1”
It’s happening because the sheet names are different between the files.
Everything else in the workflow works fine (thanks to the hub which you pointed out for me).
The workflow should be ok with that. Have you checked that the sheet names are set as a variable as well as the excel file itself. And would this error happen with an early file or in the middle (so to speak).
The you might check if your file actually does contain a sheet1
Do you know where I would set the sheet names and excel file as a variable? I don’t see an option for that in either of the excel nodes.
The error is happening early in the file (like, between the very first and and second files). The first file contains a Sheet1, but the rest of them don’t.
In the configurations for Excel Reader, I have “Files in folder” and “first sheet with data” selected.
I am still having issues with the Read Excel Sheet Names node. The output table shows that it is only reading the sheet name of the first excel file, but not the others. I noticed in your workflow that the file you have selected to read is “randomForest.xlsx”, but that’s not a file that I see in the files you’ve provided. Is that a file containing the sheet names?
The workflow in my previous post did not match the picture. I removed the RowFilter and the deprecated nodes. You can find the flow on KNIME Hub . But in the end this doesn’t alter the approach. And yes you see a link to some randomforest.xls, but that file will be override by a flow variable (the location of the file from the List Fiel node).
I’ve checked through every configuration and made sure that they match yours, but Read Excel Sheet Names is still only reading the sheet name of the first file. I did notice that I get the following warning:
I changed then example to two nested loops and put it on the hub. Maybe you could check it out. And it also uses the KNIME protocol to reference the files so it should work on all platforms and computers.