Two weeks ago KNIME 4.7.x was able to open Excel file with Excel Reade node even if the file was open in Excel application. Then, my computer crashed (which was bad thing) and a couple of good things happend afterwards:
My computer was repaired with the entire office suite installed from scratch.
Company I work for changed antivirus software.
I upgraded KNIME to 5.2 version.
And now, when I try to open Excel file that is already open, KNIME informs me that ‘The process cannot access the file because it is being used by another process’. I’m not sure if the root-cause relates to KNIME or to other computer/company settings.
Is it possible to change any preferences or configurations in KNIME to work with already open Excel files?
If so, what should I do?
Thank you @evert.homan_scilifelab.se for your prompt response and your confirmation.
So, either we have various KNIME settings or other, non-KNIME settings are crucial here.
I’m not aware of any KNIME setting that would change this, and that error message is resulting from low level operations between java and the operating system, so if it’s occurring you wouldn’t be able to just bypass it, so something else is going on here.
All I have is questions at the moment:
After getting this message, if Excel is closed, does the Excel Reader function as expected, and gain access to the file?
Does this affect ALL excel files or only certain files?
Are the Excel files still in the same folder that they used to be when it worked?
Is the folder containing the Excel files a onedrive/sharepoint/dropbox or similar “cloud” folder that is being synchronised with your computer?
Hi @takbb
Thank you for your feedback and your questions. Please see my answers: 1. After getting this message, if Excel is closed, does the Excel Reader function as expected, and gain access to the file? Yes, it is enough to close Excel file (Excel app could stay open) and then Excel Reader reads the file correctly. 2. Does this affect ALL excel files or only certain files? This affects all Excel files I have tested very recently. 3. Are the Excel files still in the same folder that they used to be when it worked? Yes, all file locations are the same. 4. Is the folder containing the Excel files a onedrive/sharepoint/dropbox or similar “cloud” folder that is being synchronised with your computer? Yes, all paths refer to company OneDrive. I have just tested Excel Reader node with Excel file stored on a local hard drive only: I was able to work with Excel file that is open in Excel app! 5. What OS are you running? This is Windows 10 Enterprise 64-bit operating system. 6. What version of Excel/Office? This is Microsoft 365 Apps for enterprise.
So, it seems OneDrive could have someting in common with the issue I’m facing…
I would certainly expect not to be able to write to a file that is open but I know that when working on a flow, sometimes I will also have an excel open for viewing in Excel at the same time I’m reading it in the workflow, especially when I’m initially developing the flow. (sometimes just because I’m too lazy/forget to close it )
But it may be the case if I’m configuring the Excel Reader for a particular range of rows (rather than just “all cells”, since the Reader’s preview of the data doesn’t necessarily give an accurate representation of the row numbers (for example) if you have things like “skip empty rows” selected, so looking in Excel may be the simplest way to know what the physical range of the cells is that I want to configure.
I might (very occasionally) even modify it in Excel, manually and then re-read it in the flow whilst developing.
The particular issue above hasn’t caused me problems because my tendency is to not work with files directly on OneDrive, but it’s odd that the scenario works with 4.7 but not with 5.2, and additionally that it apparently isn’t a problem if using a local drive rather than OneDrive.
To reply @Daniel_Weikert’s question: Excel file covers various data, including errors that are ‘dynamic’ (there are errors or not; there are errors that could be omitted and those that need attention). To investigate root-cause for errors requiring attention, I work with Excel and KNIME: I look into Excel for error area and then execute appropriate sub-workflow in KNIME for investigating details.
And sometimes, it happens that I simply forget to close Excel file while using KNIME workflow refering to the same file
Thank you @takbb for spending time on reproducing the issue I face and for your findings. At least I know all is fine, except that bug.
I hope that bug will be removed in the next KNIME releases.
with KNIME 5.2 we changed the way how the Excel Reader uses the underlying library (Apache POI) to read Excel files in order to support reading larger files (that would previously exceed the Heap space when read).
Unfortunately, it seems that there is a special exclusive lock present when Excel (or any other MS Office program/Office file) opens an Excel file located in OneDrive specifically (which caught us by surprise, to be honest). This lock seems to be different than normal file locks on Windows and is probably there to support the tightly integrated Office sync functionality, which is why files located outside of OneDrive continue to work.
Since we have not found a robust way to detect this special case where it could be somewhat OK to ignore the exclusive file lock, we cannot simply revert the behavior, unfortunately, as it would otherwise remove the possibility to read larger Excel files again and possibly introduce subtle data loss or wrong computations when simply ignoring any file locks. If you are interested in a more technical background I can gladly provide what we have been able to find out.