Hi,
Sorry if this is a very common question, but I am stuck. I need to download a file (gzip, csv) from an https URL. I’ve tried many nodes (download, HTTPs Connector, Transfer Files, CSV Reader, HTTPS Connection) but I am not able to download the file.
Thank you. I’ve tried that, but I get a 1 KB file that has no file extension.
I think something is not working properly because I am not downloading a .gzip/.csv via the link.
I am just trying to download the CSV. I thought that with a GET Request I could see more of what the server is doing, but this is obviously not the case.
So basically I still want to download the gzip-compressed CSV file from the initial file path. However, when I use the Decompress Files node (the one node that gives me a real result), the downloaded file is a file called “1”, with no file type.
Well, based on your screenshot, you are NOT downloading the gzip-compressed CSV file. You are trying to decompress the file on the fly from the server (most probably KNIME eventually downloads the file to a temp folder first).
To download the compressed file, you would use the Transfer Files node:
And then decompress the downloaded file using the Decompress Files node as you did, except that you would choose the locally downloaded file. In your screenshot, that is not the case; you are pointing directly to the online compressed file. In theory, this should also work, but it looks like it’s not working in this case.
Can you try to first download the zipped file with the Transfer Files node, and then decompress it via Decompress Files? If you still get the same result, can you decompress the file via your OS and see if you get the same results - this is just in case that’s what’s in the zip file.
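For comparison outside KNIME, the same two-step approach (download first, decompress the local copy second) can be sketched in Python. This is only a minimal sketch using the standard library; the helper names `download` and `gunzip` and the example.com URL in the comments are placeholders, not the real feed address.

```python
import gzip
import shutil
import urllib.request
from pathlib import Path


def download(url: str, dest: Path) -> Path:
    """Save the raw bytes behind `url` to `dest` without decompressing them."""
    with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest


def gunzip(src: Path, dest: Path) -> Path:
    """Decompress the local gzip file `src` into `dest`."""
    with gzip.open(src, "rb") as compressed, open(dest, "wb") as out:
        shutil.copyfileobj(compressed, out)
    return dest


# Hypothetical usage; the URL below is a placeholder:
# archive = download(
#     "https://example.com/path_to_the_file/datafeed_xxxxx.csv.gz",
#     Path("datafeed_xxxxx.csv.gz"),
# )
# gunzip(archive, Path("datafeed_xxxxx.csv"))
```

If the decompressed file looks the same here as it does in KNIME, the content of the archive is the cause rather than the node configuration.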
This is why I tried the GET Request node: to see which other files I could choose.
However, when I use the file without the “/” at the end, I get this result again:
This file cannot be read with the Decompress Files node: the download folder stays empty, i.e. the file cannot be read.
If I change it manually to “1.gzip” I can decompress it and I can see one file called “1”. If I rename this file to “1.csv” I can now see my CSV.
Again: whenever I put the same path (from the Transfer Files node) into my browser, I can download a file called “datafeed_xxxxx.csv.gz”. This can easily be uncompressed and the final result is a file called “datafeed_xxxxx.csv”. This is the behaviour I am trying to recreate with KNIME.
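A file such as “1” that only decompresses after being renamed can be identified by its content instead of its extension: every gzip stream starts with the two bytes `1f 8b`. The following is a minimal Python sketch of that check, not something KNIME does internally as far as this thread shows; the helper names `is_gzip` and `preview` are made up for illustration.

```python
import gzip
from pathlib import Path

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream (RFC 1952)


def is_gzip(path: Path) -> bool:
    """Return True if the file starts with the gzip magic number."""
    with open(path, "rb") as f:
        return f.read(2) == GZIP_MAGIC


def preview(path: Path, lines: int = 3) -> list[str]:
    """Return the first few text lines, decompressing first if needed."""
    opener = gzip.open if is_gzip(path) else open
    with opener(path, "rt", encoding="utf-8") as f:
        # zip() stops after `lines` items, so large files are not read fully
        return [line.rstrip("\n") for _, line in zip(range(lines), f)]
```

Calling `preview(Path("1"))` on the extensionless download would show whether it is the compressed archive or already the plain CSV, without renaming anything.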
Hi @Awiener, if you have a “/” at the end of the URL, it means you are not pointing to a file. You are pointing to a folder path. You need to point to a file, and you can see the message that confirms this.
If you expect a gzip file, then you should point to path_to_the_file/your_file.gz for example.
Similarly, if it’s a csv file, your URL should point to path_to_the_file/your_file.csv
EDIT:
That is not correct. Putting this path in your browser does not download the file. It opens the online folder and shows you the file, and then you have to download the file yourself, by clicking on it for example.
The Transfer Files node does not have the interaction of showing you the file for you to click download. You have to give the full path of the file. And again, this is clearly mentioned in the message from your screenshot (I wrote these comments without looking at the messages, and only noticed them after I wrote them).
So, what you need to do is add “datafeed_xxxxx.csv.gz” after the “/”. For example: path_to_the_file/datafeed_xxxxx.csv.gz
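The folder-versus-file distinction can also be checked programmatically. Here is a minimal Python sketch, assuming the common convention that a URL path ending in “/” names a folder; the helper names and the example.com host are placeholders.

```python
from urllib.parse import urljoin, urlsplit


def points_to_folder(url: str) -> bool:
    """A URL whose path ends in "/" names a folder, not a file."""
    return urlsplit(url).path.endswith("/")


def file_url(folder_url: str, filename: str) -> str:
    """Append `filename` to a folder URL, adding the trailing "/" if missing."""
    if not points_to_folder(folder_url):
        folder_url += "/"
    return urljoin(folder_url, filename)


# Placeholder host; only the file name comes from this thread.
folder = "https://example.com/path_to_the_file/"
full = file_url(folder, "datafeed_xxxxx.csv.gz")
```

Here `full` is the kind of URL the Transfer Files node needs: the folder path followed by the file name.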
Last and minor question: is it possible to connect the nodes somehow to each other? So that in batch mode decompress is done AFTER the file has been downloaded, and the CSV is read AFTER the gzip has been decompressed? At the moment the nodes “fly” around in space
You can use “flow variable” connections for that. Drag with your mouse from the very top right corner of the preceding node to the successor node. This way they are executed in sequence, each one starting once the previous one has finished.
Alternatively, right click the node and select “Show flow variable ports” to explicitly show the ports.
Hi @Awiener, yes, you can link nodes via the Flow Variable ports. This actually applies to any node, so you can also link the CSV Reader after the Decompress Files, like this: