Issues with Excel Reader

Dear community, I have issues with the Excel Reader. After listing folders and the Excel files therein, I would like to use the Excel Reader to extract the actual data in a variable loop. While the listing nodes work fine, I occasionally receive the following error when executing the Excel Reader: execute failed, the specified file does not exist.

However, the file does exist. The proof is that the execution does work from time to time, but (too) often it does not.

Does somebody have an idea what to do?

Thanks!

Hi MB12
Could you please share a screenshot of the listed files and a screenshot of the excel reader?
I am working with this in a couple of projects: gathering and reading files in a folder, then listing them all in a new file to do some work.

Unfortunately, I cannot share a screenshot.

@MB12 can you give us any more details about your system and setup? Maybe a log file in debug mode from when the problem happens?

ERROR Excel Reader 0:1924:1917 Execute failed: \path\dummy.xlsx: Die Datei oder das Verzeichnis ist kein Analysepunkt.

WARN Excel Reader 0:1922:1915 The node configuration changed and the table spec will be recalculated during execution.

Sometimes the above-mentioned error even occurs when I manually browse through the folder structure and pick an Excel file to be read. Any ideas what to do? Thanks!

@MB12 I think you would have to provide us with more information about your system and settings. Also you might want to consider providing us with a log file.

I would try to copy the Excel file to a different location and try again. If it works, the problem is related to the original file in the folder.
br

Maybe the file is open? KNIME won’t read a file that is open…

If the file or the folder is shared, that can also cause this error… Is your test running locally or remotely?

Thanks,

Denis

Hi Denis,

The file is stored on a shared drive. I attached an image of the workflow with the corresponding error notification.

[image: workflow with the corresponding error notification]

ERROR Excel Reader 5:4 Execute failed: The specified file xyz.xlsx does not exist.

@MB12 then it seems there is a problem with the connection to the remote file. You could try to copy the file to a local location and then do the import. Or, again, you might tell us more about the system. What kind of remote drive is it? Maybe there is a generic connector.

It is a network drive inside the company network. Since the amount of data is relatively large, I would prefer to avoid data transfer.

Hi @MB12
It’s always quite tricky to diagnose a problem like this where we cannot see anything of your configuration. Quite simply, we know that such a workflow DOES work, so it is unlikely we are looking for a bug in KNIME, but much more likely we are looking for something wrong with your setup. We are therefore trying to second-guess all the possible config errors, or infrastructure issues that could lead to what you are seeing.

That’s fine, but this then leads us to ask questions which you may think are obvious, but I’m afraid they have to be asked, because somewhere in these questions is likely to be the answer… so please bear with me on this, as I fire a few questions at you:

Have you confirmed manually that the specific file that the Excel Reader is attempting to read really DOES exist?

I would suggest that just because it works on one iteration with one file, this is not proof of the existence of a different file on a separate iteration of the loop.

If you wait a couple of minutes after the error occurs, and then manually re-execute the failing Excel Reader node, does it then work, or does it continue to fail?

We know that the List Files/Folders is returning at least one file. We assume (but don’t know) that you have your Excel Reader configured to read the current file in the loop, rather than a file name from some other flow variable (and that file happens to be present on one iteration, but not on another)…

Assuming it is configured correctly, we know that the current file existed at the time List Files/Folders read the folder, but there is nothing here that guarantees that it still exists when the Excel Reader finally attempts to read it.

My thoughts then are that your problem is one of the following:

(1) The file name has been found by List Files/Folders but the file is still being written when the Excel Reader attempts to read it. I think that would return a “locked by another process” message rather than “file does not exist”, though.

(2) The file name has been found by List Files/Folders but the file has been deleted by an external process before the Excel Reader attempts to read it.

(3) You have a misconfiguration with flow variables and the file that the Excel Reader is attempting to read is not actually the right one (e.g. the file name has been hard-coded in the Excel Reader or the wrong flow variable is being used).

(4) An unstable network connection, or other network issue, is causing intermittent problems when attempting to read the file.
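To help tell these four cases apart, one option is to test the same files outside KNIME. The following is only a minimal Python sketch, not something taken from the workflow in this thread: `read_with_retry` and its parameters are hypothetical names, and the mapping of exception types to the hypotheses above is an assumption about how such problems typically surface on a network drive.

```python
import time
from pathlib import Path


def read_with_retry(path, attempts=5, delay_s=2.0, sleep=time.sleep):
    """Try to read a file several times and record why each attempt failed.

    Returns (data, log): data is the file content as bytes, or None if every
    attempt failed; log is a list of (attempt_number, outcome) tuples.
    """
    log = []
    for attempt in range(1, attempts + 1):
        try:
            data = Path(path).read_bytes()
        except FileNotFoundError:
            # Points towards (2) or (3): the file is gone or the path is wrong.
            log.append((attempt, "not found"))
        except PermissionError:
            # Points towards (1): another process may still hold the file.
            log.append((attempt, "locked or access denied"))
        except OSError as exc:
            # Points towards (4): network problems tend to show up as other OSErrors.
            log.append((attempt, f"os error: {exc}"))
        else:
            log.append((attempt, "ok"))
            return data, log
        if attempt < attempts:
            sleep(delay_s)  # wait before the next attempt
    return None, log
```

If a file is reported as "not found" on the first attempt and "ok" on a later one without anything being changed in between, that would suggest a transient access problem rather than a missing file or a wrong path.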

Additional questions…

How many files is the List Files/Folders node returning?

Is it always the same file that causes problems? i.e. if you rerun the workflow from the start, does it fail on the same file every time?

Do you have a problem EVERY time you attempt to run the workflow, or is it intermittent?

Is the folder containing the file also being written to by an external process, or other processes on other machines?


Oh hang on… a completely different idea. I just translated that last error

“Die Datei oder das Verzeichnis ist kein Analysepunkt.”
“The file or directory is not a reparse point”

Are you using OneDrive or some other file-syncing service on this folder? Do you have an XLSX file “present” but which is actually being backed up to cloud, leaving just the filename/pointer, but not the actual file itself on the drive?

Thanks for your detailed reply. A few comments from my side:

It is not always the same file that is causing trouble. Sometimes, but rather rarely, the workflow runs smoothly. Most of the time, however, it does not. Sometimes it already crashes when trying to read the first file in the list. Sometimes a few iterations can be executed, so it takes me quite some time to restart the loop over and over again.

Let me know if there might be additional information that could be useful to find out what the origin of these issues is.

If you google this error, you find hints that it might have to do with the handling of file (archive?) attributes or path names on a Windows server. Do you know anything about what happens on the server?

Can you try a Microsoft Authentication to access the data, or a different form of the path? Do you currently have a drive letter assigned to the data, like "P:"?

You might want to try to collect the list of files you want to process and store the ones you have already successfully handled and try the rest again instead of starting everything new:
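The same idea, sketched in plain Python rather than as KNIME nodes: keep a small record of the files that were handled successfully and skip them on the next run. This is an illustration under assumptions, not the workflow itself; `process_remaining`, `handle` and `done_file` are hypothetical names.

```python
import json
from pathlib import Path


def process_remaining(files, handle, done_file):
    """Process only the files that are not yet recorded in done_file.

    handle is any callable that takes a path and raises OSError on failure.
    done_file is a JSON list of the paths handled successfully so far.
    Returns the list of files that failed in this run.
    """
    done_path = Path(done_file)
    done = set(json.loads(done_path.read_text())) if done_path.exists() else set()
    failed = []
    for f in files:
        if f in done:
            continue  # already handled in an earlier run
        try:
            handle(f)
        except OSError:
            failed.append(f)  # leave it for the next run
            continue
        done.add(f)
        # Persist after every success so that a crash loses no progress.
        done_path.write_text(json.dumps(sorted(done)))
    return failed
```

Calling the function again with the same list only touches the files that failed before, so a single bad read no longer forces a restart of the whole loop.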

Yes, in my case the drive letter is H:

Hi @MB12,

When it fails, and says “execute failed, the specified file does not exist.”, at that point, if you right click on the Table Row to Variable Loop Start, and look at the current flow variables, and you see the current file name, are you definitely able to see that file name (which the Excel Reader says does not exist) in Windows Explorer, and if so, are you able to open it in Excel?

And if you are able to do both of those things, then if you close Excel and then right-click on the Excel Reader, and execute it, does it then work, or does it still say the file does not exist?

I can see the file name that is the current flow variable in Windows Explorer. And yes, the corresponding file does exist and can be opened.

When executing the Excel Reader manually, it does work!

Hi @MB12, is the folder being synchronised to the cloud by any services such as OneDrive or Dropbox?

If the file “becomes available” when you are opening it directly in Excel and KNIME is then able to read it, this suggests to me that one of these cloud services may be the issue.

If the file is present only as a “link”, attempting to open it in KNIME will not cause the service to retrieve the file from the cloud, and KNIME is likely to report it as not found or another error.

Opening it with Excel, however, forces the service to pull the whole file down so that it is then available locally.

The apparent random nature of your problem could be explained by the “indeterminate” situation where any given file may be present locally or only on the cloud.
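One way to check this theory might be to look at the Windows file attributes of a failing file before opening it in Excel. The sketch below is a rough illustration only: `FLAGS`, `placeholder_flags` and `check_file` are hypothetical names, the numeric flag values are the Windows file attribute constants, and whether a given sync service or file server actually sets these flags on your drive is an assumption that would need to be verified.

```python
import os

# Windows file attribute flags that can indicate a placeholder or offline file.
FLAGS = {
    "REPARSE_POINT": 0x400,
    "OFFLINE": 0x1000,
    "RECALL_ON_OPEN": 0x40000,
    "RECALL_ON_DATA_ACCESS": 0x400000,
}


def placeholder_flags(attrs):
    """Return the names of the placeholder-related flags set in attrs."""
    return [name for name, bit in FLAGS.items() if attrs & bit]


def check_file(path):
    """Return the placeholder-related flags of path, or None when not on Windows."""
    # lstat is used so that the reparse point itself is inspected where possible.
    attrs = getattr(os.lstat(path), "st_file_attributes", None)
    if attrs is None:
        return None  # st_file_attributes is only available on Windows
    return placeholder_flags(attrs)
```

If `check_file` returns a non-empty list for a file that KNIME reports as missing, and an empty list after the file has been opened once in Excel, that would support the placeholder explanation.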