Hello,
Just some feedback to the updated Excel reader. I like it, but came across one minor issue:
I deal with messy Excel data a lot and use KNIME to clean it up.
In a typical case, I get a .xlsx file with 100+ sheets, there is no real data schema, it’s all over the place.
My prefered solution is to create a loop and append all sheets into one file (as strings) and go on from there.
Worked fine with the old Excel reader.
The new Excel reader tries to force a schema (columns and data types) on me, based on the first sheet he finds, or I select.
No bueno
Feature request:
Let the users create a desired target schema in the transformation tab:
Currently I don’t see a way how to add new columns
Force numbers to strings
My current workaround is to append a “target schema sheet” in the original file before reading all sheets. It works, but not exactly proud of that.
Also, “any unknown new column” probably works fine for new files, but doesn’t really do much in a loop.
1 Like
janina
January 7, 2021, 11:42am
2
Hallo @dr_snglr ,
thanks for your suggestions and feedback. This feature is already on our list (internal reference: AP-15888), so we will try to implement it as soon as possible.
I will keep you updated.
Best,
Janina
@dr_snglr in these discussions there are examples how to import Excel Files into KNIME with the help of R and either let the system decide about the data type or force it to import everything as string so you could later decide how to handel it
that is actually a good idea
I built a workflow that uses your idea and does three imports of an Excel file, one all numeric, one all strings and one guess. For the numeric ones when it might actually be a String there will be a huge number of missings. I do a calculation of % missing and I set a variable (75%) if there are 75% or more missing and the guess suggests a String it is a string.
Obviously if you have date variables you would have to do something different.
Here the …
These are item no 510 from this collection about KNIME and Excel (import)
3 Likes