Hi, I’m new here and have only been using KNIME for a few days, which means I’m at a really basic level.
But I haven’t found a solution for my problem so far. Basically, I would like to read Excel files, manipulate them, and import them into a database (SQLite, Microsoft SQL Server, …).
The big challenge is that these XLSX files reach 150 to 190 MB in size. With smaller files (~40 MB) the reading process works fine. Is there anybody out there who has a solution for me?
“The performance of the reader node is limited (due to the underlying library of the Apache POI project). Reading large files takes a very long time and uses a lot of memory (especially files in xlsx format when formula reevaluation is selected).”
One way would be to use R to import the Excel files and write the results directly to SQLite, ARFF, Parquet or CSV. That way the Excel files would not need to be read in KNIME at all (and you could always load the data back into KNIME from any of those formats).
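The same idea (stream the Excel file outside KNIME and push the rows straight into SQLite) can also be sketched in Python instead of R. This is only a rough sketch under some assumptions: the function names are made up for illustration, the reader part assumes the third-party package openpyxl is installed, the first row is assumed to be the header, and every column is stored as TEXT.

```python
import sqlite3
from itertools import islice


def iter_xlsx_rows(path, sheet_name=None):
    """Yield rows one by one from an .xlsx file (hypothetical helper).

    Assumes openpyxl is installed. read_only=True makes openpyxl stream
    the sheet instead of building the whole workbook in memory, and
    data_only=True returns cached values instead of formulas.
    """
    from openpyxl import load_workbook  # lazy import: only needed here

    wb = load_workbook(path, read_only=True, data_only=True)
    try:
        ws = wb[sheet_name] if sheet_name else wb.worksheets[0]
        for row in ws.iter_rows(values_only=True):
            yield row
    finally:
        wb.close()


def load_rows_into_sqlite(rows, conn, table, batch_size=10_000):
    """Write an iterator of rows into SQLite in batches.

    The first row is treated as the header, and all columns are created
    as TEXT. Returns the number of data rows written.
    """
    rows = iter(rows)
    header = next(rows)
    columns = ", ".join(f'"{name}" TEXT' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({columns})')

    total = 0
    while True:
        # Pull at most batch_size rows so memory use stays bounded.
        batch = [
            [None if value is None else str(value) for value in row]
            for row in islice(rows, batch_size)
        ]
        if not batch:
            break
        conn.executemany(
            f'INSERT INTO "{table}" VALUES ({placeholders})', batch
        )
        conn.commit()
        total += len(batch)
    return total
```

Usage would look something like `load_rows_into_sqlite(iter_xlsx_rows("big_file.xlsx"), sqlite3.connect("data.sqlite"), "my_table")`, where the file and table names are placeholders. KNIME can then read the SQLite table instead of the Excel file.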
If the Excel files become more complex (sheet names, etc.), you could also use R to help you with that.
If you have problems with column formats, you could force them all to string and then decide on the right format afterwards.
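To illustrate the "force everything to string, decide later" idea, here is a minimal Python sketch. The function names are hypothetical, and the rule it uses (try integer, then float, otherwise keep string) is just one simple assumption about how to pick a format.

```python
def infer_column_type(values):
    """Return 'int', 'float' or 'str' for a column of string values.

    None and empty strings count as missing and are ignored.
    """
    present = [v.strip() for v in values if v is not None and v.strip() != ""]
    if not present:
        return "str"
    for name, caster in (("int", int), ("float", float)):
        try:
            for v in present:
                caster(v)
        except ValueError:
            continue  # at least one value does not fit, try the next type
        return name
    return "str"


def convert_column(values):
    """Convert a string column to its inferred type; missing values become None."""
    caster = {"int": int, "float": float, "str": str}[infer_column_type(values)]
    return [
        None if v is None or v.strip() == "" else caster(v.strip())
        for v in values
    ]
```

Note that a simple rule like this would turn codes such as `"007"` into the number 7, which is exactly the kind of thing you can check for while the data is still all strings.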
If you have to deal with password-protected files, you could do something with Python.
If you want to do more advanced things with KNIME and Excel (with the help of R and Python)