Dear KNIME users,
I have difficulties uploading large CSV files (> 200 MB) from a network drive into the KNIME Analytics Platform, and from a local drive to the KNIME Server. I am uploading with a “File Upload Widget” so that I am able to execute the workflow both locally and on the server. To circumvent the problem I am now pasting the folder path into a String Widget, then selecting the file via a Table View, and subsequently transferring the file to a temporary directory. Apparently the file transfer node has no problems with the size of the CSV files, and once they are in a temporary directory I can open them with a CSV Reader. A more elegant solution for me would be to use the File Upload Widget to just obtain the path of the local file and then transfer it to a temporary folder. However, it seems that there is no configuration option in the File Upload Widget to just return the path? Are there any alternative solutions? The File Chooser node does not work in a component and on the server. I am using KNIME version 4.7.8.
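For what it's worth, the transfer-to-temporary-directory step described above boils down to a plain file copy. A minimal Python sketch of that idea (e.g. for a Python Script node; the function name and directory prefix are my own, not a KNIME API):

```python
import shutil
import tempfile
from pathlib import Path

def copy_to_temp(source: str) -> str:
    """Copy a file into a fresh temporary directory and return the new path."""
    tmp_dir = Path(tempfile.mkdtemp(prefix="knime_upload_"))
    dest = tmp_dir / Path(source).name
    # Plain byte copy; works the same for local and mounted network paths.
    shutil.copy(source, dest)
    return str(dest)
```

The returned path can then be handed to a downstream reader, just like the flow variable the widget would normally expose.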
Thanks in advance for any ideas
Stefan
Hey Stefan,
Sadly, it is not possible to just expose the path from the File Upload Widget. This is a security feature of the native file browser that prohibits getting the actual path of the file. You only get access to the content, which is why we upload the content and write it to a temporary location on the Server/AP, so that we know the path and can expose it as a flow variable.
That being said, let's focus on the root cause of your problem. From your post it sounds like the file transfer solution is also just a workaround, and I guess you are using it because the File Upload Widget has problems with files > 200 MB?
Greetings,
Daniel
Hi Daniel, thanks for the explanation. And you are right, the main reason for my workaround was the difficulty with uploading large CSV files. This seems to be related to the transfer over the network: when I ran the workflow on my PC and the file to upload was on my PC as well, I could handle large files. The problem occurred when I was executing the KNIME workflow on my PC and the file to upload was on a network drive, and also when I executed the workflow on the server and the file to upload was stored locally or on a network share. Hence, it really looks like a timeout or something similar when uploading large files.
Kind regards
Stefan
Hey Stefan,
just to make sure, have you tried increasing the timeout option in the File Upload Widget?
Greetings,
Daniel
Hi Daniel, yes, I increased the timeout to 200 seconds. On the server the upload still failed.
The server returns the message “upload failed”. Smaller files below 200 MB are uploaded successfully. As indicated before, using the file transfer to a temporary folder is possible with these large files.
Kind regards
Stefan
@sscholz you could try splitting the CSV into Parquet files, which are compressed, and then uploading them individually. They can later be used as one big file again.
Hello,
There is a JIRA ticket designated AP-20331, which describes an upload size limitation in the AP for the File Upload Widget node. However, I don’t have a specific file-size number on that ticket.
But it seems likely that if uploads of files > 200 MB fail 100% of the time while smaller files succeed 100% of the time, that may be the limit, and this JIRA may be your issue.
It is not marked as resolved at this time, but I would suggest
a) downloading the latest AP and testing with it, just to verify whether the issue still reproduces in the latest version; or
b) possibly working around the issue by looping: uploading the file in chunks < 200 MB and reconstructing the data after reading.
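Option b) can also be done at the byte level, before the data ever reaches a reader node. The following Python sketch is my own (arbitrary names, and a chunk size chosen to stay safely below the apparent ~200 MB ceiling):

```python
from pathlib import Path

CHUNK_BYTES = 150 * 1024 * 1024  # 150 MB, safely below the ~200 MB limit

def split_file(path: str, out_prefix: str,
               chunk_bytes: int = CHUNK_BYTES) -> list[str]:
    """Split a file into numbered parts small enough to upload individually."""
    parts = []
    with open(path, "rb") as src:
        i = 0
        while True:
            data = src.read(chunk_bytes)
            if not data:
                break
            part = f"{out_prefix}.part{i:03d}"
            Path(part).write_bytes(data)
            parts.append(part)
            i += 1
    return parts

def join_files(parts: list[str], out_path: str) -> None:
    """Concatenate the parts, in order, back into the original file."""
    with open(out_path, "wb") as dst:
        for part in parts:
            dst.write(Path(part).read_bytes())
```

Splitting on bytes rather than CSV rows means the parts must be rejoined before parsing, since a chunk boundary can fall mid-row; the row-wise Parquet approach above avoids that at the cost of a pandas dependency.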
Thank you,
Nickolaus