Excel Reader - So Annoying -Transformation Tab

So Annoying Excel Reader
I’m having an issue with the Excel Reader node in a loop.

Situation:

I have a folder with 10 Excel files.

  • All files have the same layout (same number of columns and data types).
  • They all have the same number of sheets.

I’m iterating over a specific sheet.
I’m using a flow variable for each file within the Group Loop.

  • In the first iteration, everything works fine.
    (See screenshot.)
  • In the second or later iterations, an error may occur, stating that the data type of a column in the first file is different from the data type in the next file being iterated.

I can’t seem to fix this at all.

I’ve already deleted the node, added a new one, and the same issue keeps happening every time.
It’s been really difficult to get it working.

I’ve also tried unchecking Enforce Type, but the problem persists.

See this frame of the first iteration
Look for this column selected

Then Error
image

*Now, look that same column, that jumps to the end with another datatype,

Sometimes the more colums “jumps” to the End, are not selected

Configuration Excel Reader


Can someone help?

Hey there,

first question:

Have you considered changing the “Mode” from “File” to “Folder” and to simply read all the files at once?

If column structure is the same and sheet names are always the same you can skip the initial loop for reading the data (you can configure the node so that the file path is added to each row - so if you need to process each file’s content individually you can later iterate over that file path column using a group loop)

I tried to recreate your scenario different ways, however did not manage to replicate the error…

I tried to start with a file that has only integers in the column and transform to double and then have a mix of files with only int, only double or mix of both.

Also started with a double only column (w/o touching transformation tab) and then using the other files - also no issue…

Here’s the workflow both with the loop and the read all in one…

ExcelReaderSchema.knwf (109.5 KB)

@MartinDDDD

Hello Martin,

Yes, I had considered that, but I decided to do it this way because some files were large, so I thought it would be better to process them one by one (even though, in the end, the Loop End concatenates them all together).

About the issue
It’s really difficult to simulate the error.
I also don’t know why it happens.

The most frustrating part is that I built this workflow for a friend, and she was running it but kept complaining about this error.
I tested it on my machine, and there was no error. Then, I exported it to her, and on her machine, the error appeared.
I went crazy.

But I ended up solving it with a simple strategy:

  1. I went to the Transformation tab in the Excel Reader node and changed everything to String. This way, the Excel Reader won’t throw an error due to column type issues.
  2. Before reaching the Loop End (but it could also be after it), I used the String to Number node.

In short, I treated everything as String first and only changed the Data Type after the Excel Reader had processed the files.
See the screenshot.

I believe the error was caused by the following scenario:
Imagine that in the first file, there is a column where the numbers are "0" or "1" to "9".
In this case, the Excel Reader will consider the column as Integer.
However, in the second file, imagine that the same column contains at least one value like "10.1".
Then, the Excel Reader will treat it as Double.

At this moment, I believe the column “jumps” to the end and triggers a Data Type error.

If I check the “Change table specifications” option in the Loop End, it wouldn’t be ideal because the columns that “jumped” to the end would stay there, changing the order of colums dinamic.
This isn’t what I want because of this issue—each file could be different depending on which column caused the error.

Another functionality I couldn’t find was an option in the Excel Reader node to not use the first row as a header.
If I could disable that, the columns would automatically be treated as String, eliminating the need to manually force it in the Transformation tab.
Since I couldn’t find such an option, I had to manually set everything to String and then convert it back with String to Number.

In the end, the result was satisfactory, and the error no longer occurred.

On the forum, I found a post from 2020 where someone complained about the same issue.
I’ll send it to you so you can check it out if you’re curious.

My strategy
All colums to String

Changing to Number after

Same problem

2 Likes

Awesome that you solved it! The „treat everything as string and then convert“ I had in mind, too but it’s more of a last resort :).

I think in data area tab there’s a setting where you can choose which row to take as column headers. I think you can also uncheck that so then you get default excel headers A, B, C etc and your first row is just part of the data…

I agree with what you suspect as the cause of error and that is what I tried to replicate ( therefore trying different scenarios for the column in the first file…)

Might remain a mystery, but glad you made it work

1 Like

@Felipereis50 you can skip the first row and just import standard column names, and reuse them later.

deactivate “use values in row”

Also you can choose what to do with column names

2 Likes

Thanks for the help my friends.

@MartinDDDD
@mlauber71

2 Likes

Hello there!

Additionally Loop End node has option Allow variable column types which helps in this case and in general when looping and don’t want to know to much about column types.

From node description:
Allow variable column types
“If checked, the loop does not fail when the column types between different table iterations change. The resulting column will have the common super type of the different column types.”

Br,
Ivan

2 Likes

OMG, on my face and I didn’t see it.

Thanks for the tip

2 Likes

Learned something new - love it!

2 Likes

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.