Multiple excels files with multiple sheets

Hi Knimers,

I have a challenge here, I need to read 5 different excel files, each one has 5 or 4 sheets. I previously built a workflow classifying the date by Rank on these last 5 files of the month. The only way I found was to read each one, filter with Row filter by row number, and then read each sheet separately and then Concatenate each category, for example: all 5 files have 5 tabs,
file 1 has a-b-c-d-e,
file 2 has a-b-c-d-e,
file 3 has a-b-c-d-e,
file 4 has a-b-c-d-e,
file 5 has a-b-d-e.
All the “a” must be together, the same as the “b” and so on.

This is a quick view of what my file looks like:

Do you know a better way to build a beautiful and more effective one?
BTW: sometimes file 5 has or not “c” or “b”, it is variable.
Another thing, excel files are read by “Variable: URI” because I am executing the Rank first.

Thanks in advance for your help!! :slight_smile:

You could use a combination of loops to extract the necessary informations. The File Meta Info node could provide you with information about when the file was last saved if you do not want to use the file name itself.

If you want to explore further options how to deal with Excel import you might want to have a look at this collection:

4 Likes

Hello!! Thanks for your help, I set up the workflow, but something happened:

Sure enough, I have a file for each of my sheets, but I have one, “Magazine” for example, it should have 102 rows, but when I read the result with the *** Table reader *** node, this is the amount:

image

I only modified 2 nodes, List files to provide the location of the data, Java edit variable with the new location of the result.

By the way, I need to work with this data, each sheet has a different number of columns, I need to filter rows, columns, add others, and finally when the data is complete, concatenate all the data in one file. The best way is after running this workflow to start again with Table reader?

Hi,
are you sure that your Magazine file does not have rows contains unvisible chars?

Just open our file with Excel and jump to the end or save it as csv-file and view in that or load it into Knime.

BR

2 Likes

Hi! thanks for your answer, I saved it as csv but the result is the same :see_no_evil:

@IvaR do you get actual duplicates? Maybe you could check what is going on with a sample line.

It is difficult to understand what is going on. Maybe if you could construct an example that demonstrates your findings we could investigate.

@IvaR
did you look in the imported table? How does the data look like?

One point which i had long time ago with an excel file:
The file was formated with specific formats over the whole table. I had in this case an similar issue. The Knime excel sheet read tried to read the whole table. But after deleting all the seemly “empty” rows below with data filled block solves that issue. Maybe you have an equal case.

@to the knime developers: This could also be an improvement for the excel nodes.

BR

2 Likes

Hello @morpheus and @mlauber71,

Sorry to bother you, I was working on this project but had to quit, and now I need to get back on it.

I was looking for the error in the workflow. In my case, I am trying to read 3 different files and each file has 7 different tabs. I noticed that the inner loop reads the first tab of each of the 3 files, gives them the correct name, but the resulting 7 files have exactly the same information (first tab of the 3 files). I have the impression that the inner loop is running, but the outer loop is not changing. That is, the loop that reads the files if it changes and reads the first tab of the next file, but does not read the other 6.

Thanks in advance :slight_smile:

@IvaR welcome back :slight_smile:

It is difficult to say what is going on in your workflow. I think you would have to provide us with an example representing your specific challenge (without giving away sensitive information of course) since I provided several working examples that demonstrate how to import data from Excel in various settings. If your scenario was not covered we would have to take a closer look.

Often it has to do with the correct ‘wiring’ of loops. If you want to educate yourself further on that topic I created a small collection about that:

1 Like

Thank you very much, I rebuilt the example I am using. As you can see, I added two “table readers” to check the data, one for the result of the “a” tabs contained in the 5 files and the other for the result of the “b” tabs. As you can see, the results are the same in both tables, which means that only the “a” tab is being read, although the tabs have different names, all 5 contain the same information.

Example loops Iva.knwf (111.1 KB)

Thank you very much for your help!
And for all the information, I am reading right now. :slight_smile:

@IvaR you are welcome. Unfortunately your workflow does not contain the Excel files. edit: OK they seem to be there but they are not accessed properly - I can adapt that. edit 2: OK are they the ones I originally included …

A good way to include the files is to put them in a /data/ subfolder under the main workflow so they would be included when you export the file. Most data handling nodes would offer the option to access the data relative to the current workflow for example:

Other than that the KNIME protocol offers a good option to access data (more on that you can read in the mentioned meta node about Loops and Paths)

1 Like

Thanks, to the originals you included I made some changes to the data, such as the identification numbers (from 1 to 125) and the data I changed to places

file_01.xlsx (12.3 KB)
file_02.xlsx (12.0 KB)
file_03.xlsx (12.0 KB)
file_04.xlsx (12.0 KB)
file_05.xlsx (12.0 KB)

Thanks :slight_smile:

@IvaR indeed there was a problem with a variable setting. Only the URL got changed the sheet was always the same. I added the Sheet as a variable. Now it should work:

4 Likes

AMAZING!!! WORKS Perfectly!!! Thanks a lot :star:

3 Likes

You are welcome. Sorry for the confusion and happy KNIME-ing :slight_smile:

2 Likes

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.