I'm still trying to use KNIME to download a file with the HTTP Connection and Download nodes, so far with no luck. I have no problem downloading the file (see the "search_result.zip" file attached below) with a web browser (Safari for Mac, etc.) by simply visiting the following URL: http://126.96.36.199/ct2/results?lup_s=12%2F01%2F2016&lup_e=12%2F01%2F2016&studyxml=true (when I do this, a ZIP file containing individual XML files is automatically downloaded to my "Downloads" folder). But my attempts to get the same output with KNIME have been unsuccessful (see the "http_connection_and_download.zip" workflow attached below). Instead of the ZIP file, KNIME downloads a single file containing only some metadata about the individual XML files (see the "resultslup_s122f012f2016lup_e122f012f2016studyxmltrue.txt" file attached below). So obviously there's something missing here. Any suggestions or solutions?
I have investigated the issue and found that the Download node does not support a server-side redirect to an SSL (HTTPS) version of the download resource if that resource uses an SSL certificate that does not match the host name in the download URL.
That is what happens in the example above. In a browser like Chrome the download works, because Chrome does not seem to care about the certificate issue when the server requests the redirect. It does not work in Chrome either when I go directly to the HTTPS version; in that case Chrome warns me about the security issue with the certificate. The certificate is valid for "*.clinicaltrials.gov", but it is not issued for an IP address like the one in the example above.
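To illustrate why the certificate does not match, here is a minimal Python sketch of a simplified host name check. It is not what KNIME or Chrome actually run; the function names are made up for this illustration, and the wildcard rule is reduced to "a leading * stands for exactly one label".

```python
import ipaddress


def is_ip_address(host: str) -> bool:
    """Return True if host is a literal IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def host_matches_cert_name(host: str, cert_name: str) -> bool:
    """Simplified check of a URL host against one DNS name from a certificate.

    Assumption: a leading "*" label matches exactly one label, and a DNS
    name never matches a literal IP address.
    """
    if is_ip_address(host):
        return False
    host_labels = host.lower().split(".")
    cert_labels = cert_name.lower().split(".")
    if len(host_labels) != len(cert_labels):
        return False
    if cert_labels[0] == "*":
        # Compare everything after the wildcard label.
        return host_labels[1:] == cert_labels[1:]
    return host_labels == cert_labels


print(host_matches_cert_name("126.96.36.199", "*.clinicaltrials.gov"))           # False
print(host_matches_cert_name("www.clinicaltrials.gov", "*.clinicaltrials.gov"))  # True
```

With the IP address from the URL above the check fails, while a host name under clinicaltrials.gov passes.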
As a workaround I used two Palladian nodes, HttpRetriever and HttpResultDataExtractor, which handle SSL more flexibly and do not require the certificate name to match the host.
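For anyone who wants to reproduce the idea outside KNIME, here is a rough Python sketch of a download that still verifies the certificate chain but skips the host name check. This is only an analogy, not the Palladian implementation; build_relaxed_context and download are names I made up, and the destination path is a placeholder.

```python
import ssl
import urllib.request


def build_relaxed_context() -> ssl.SSLContext:
    """TLS context that validates the certificate chain but not the host name."""
    ctx = ssl.create_default_context()
    # Assumption: we accept a certificate issued for a different name,
    # e.g. when the URL uses an IP address instead of the host name.
    ctx.check_hostname = False
    return ctx


def download(url: str, dest_path: str) -> int:
    """Fetch url (following redirects), write the body to dest_path,
    and return the HTTP status code of the final response."""
    opener = urllib.request.build_opener(
        urllib.request.HTTPSHandler(context=build_relaxed_context())
    )
    with opener.open(url, timeout=60) as response, open(dest_path, "wb") as out:
        out.write(response.read())
        return response.status


# Example call (needs network access):
# status = download("http://126.96.36.199/ct2/results?lup_s=12%2F01%2F2016"
#                   "&lup_e=12%2F01%2F2016&studyxml=true", "search_result.zip")
```

Turning off the host name check weakens the protection TLS gives you, so I would only do this for a server you already trust.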
It would be good if this functionality could also find its way into the regular KNIME core Download node.
Kudos to Manuel for developing a meta node (attached below) that solved my problem! BR, Martin
Note that, besides the dependency on the Palladian nodes, the meta node also contains an internal node that is not publicly available. The Fuzzy Numeric Row Splitter (a NIBR node) can easily be replaced with any other node that checks whether the HTTP response code is >= 200 and < 300; it should be simple to adapt.
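The check that the replacement node has to perform is small. As a sketch in Python (the row layout with a "status" key is an assumption for illustration, not the actual column name in the meta node):

```python
def is_success(status_code: int) -> bool:
    """True for HTTP 2xx response codes, i.e. 200 <= code < 300."""
    return 200 <= status_code < 300


def split_by_status(rows):
    """Split rows into (successful, failed) based on their "status" value.

    Assumption: each row is a dict with an integer "status" entry.
    """
    ok, failed = [], []
    for row in rows:
        (ok if is_success(row["status"]) else failed).append(row)
    return ok, failed


rows = [{"url": "a", "status": 200}, {"url": "b", "status": 404}, {"url": "c", "status": 299}]
ok, failed = split_by_status(rows)
print([r["url"] for r in ok])      # ['a', 'c']
print([r["url"] for r in failed])  # ['b']
```

In KNIME itself a standard row splitter with a numeric range condition on the status column should do the same job.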