Trying out Selenium for the first time (and ensuring I had the correct download site configured ;)), I’m now facing a challenge of tackling an authentication dialog.
The dialog is not a true popup (not a window), and it isn’t within a frame of any sort. So I have two questions:
1. How does one tackle this type of authentication? The closest posts I found used Java snippets, but I believe those were tackling an iframe. There was a Facebook-related one as well, but I believe that login is embedded into the core page itself.
2. Slight spin-off question: once authenticated, let’s assume there is a list of CSV links to download. Normally, CSV links can be read by setting the URL in the CSV Reader node, which will then trigger an authentication window within KNIME. Can the authentication from #1 carry downstream for such purposes, perhaps in a workflow variable, or are CSVs extracted differently in Selenium?
Overall quite impressed with the Selenium node set - feels like I should have jumped on this boat a long time ago!
Context matters with issues like this, so if possible, I would recommend sharing the website where you encounter this so that people are better equipped to help you out.
// the URL to download (first argument passed to the script)
var url = arguments[0];
// asynchronous callback, triggered when download has finished,
// this is the last argument in the arguments array
var callback = arguments[arguments.length - 1];
if (typeof callback !== 'function') {
  throw new Error('The callback is missing or not of type function! ' +
    'Please enable the [asyncCallbackMethod] in the node configuration.');
}
var xhr = new XMLHttpRequest();
xhr.addEventListener('error', function(event) {
  callback('ERROR: ' + JSON.stringify(event));
});
xhr.open('GET', url, true);
xhr.responseType = 'arraybuffer';
xhr.onload = function (oEvent) {
  if (xhr.response) {
    var byteArray = new Uint8Array(xhr.response);
    // convert ArrayBuffer to Base64 string; code taken from here:
    // http://stackoverflow.com/a/9458996/388827
    // important: do not use fancy one-liners as they obviously
    // cause exceptions with larger arrays
    var binary = '';
    for (var i = 0; i < byteArray.byteLength; i++) {
      binary += String.fromCharCode(byteArray[i]);
    }
    callback(window.btoa(binary));
  }
};
xhr.send(null);
The second approach is to use another Navigate node, with the URL fed in directly via a variable.
The Navigate node times out, and the JavaScript snippet produces a 1 KB file.
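One way to find out what such a small result actually contains is to decode the string returned by the snippet and look at its first characters. The following is only a rough sketch for Node.js, run outside KNIME; `describeDownloadResult` is a hypothetical helper name, and the idea that the 1 KB file might be an HTML page (for example a login or error page) rather than the CSV is an assumption, not something confirmed in this thread.

```javascript
// Hypothetical helper: inspect the string passed to the snippet's callback.
// The snippet returns either 'ERROR: ...' (XHR failure) or Base64 data.
function describeDownloadResult(result) {
  if (result.startsWith('ERROR: ')) {
    // The XHR error handler fired; there is no file content to decode.
    return { kind: 'xhr-error', byteLength: 0, preview: result };
  }
  // Decode the Base64 payload back into raw bytes.
  const bytes = Buffer.from(result, 'base64');
  // Only look at the first 200 bytes to keep the preview short.
  const preview = bytes.subarray(0, 200).toString('utf8');
  // Assumption: an HTML document here hints at a login or error page.
  const kind = /^\s*(<!doctype html|<html)/i.test(preview) ? 'html' : 'other';
  return { kind: kind, byteLength: bytes.length, preview: preview };
}

// Usage: pass in the string that the snippet returned.
const example = Buffer.from('a,b\n1,2\n').toString('base64');
console.log(describeDownloadResult(example));
```

If `kind` comes back as `html`, the request probably did not reach the CSV itself, which would point towards the authentication rather than the download code.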
I would not recommend the “XHR” approach nowadays. It dates back to a time when there was no better way to trigger file downloads with the Selenium API and automated browsers. Fortunately, this has changed for the better, and you can now usually just execute a “Click” action on a download link.
For that, use the “Download Files Templates” in the Start WebDriver or Factory node (according to the screenshot, you already added this). This is needed to (a) adapt some security settings, which are rather strict by default, (b) disable the download prompt dialog so that the download starts right away, and (c) set the download destination.
Then, extract/locate the link you want to download and use a “Click” node to start the download.
For larger files which need some time to finish downloading, you can continuously poll the file system to check if the download is still running, or has finished.
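The polling idea can be sketched roughly as follows. This is a Node.js illustration outside KNIME, not the logic of the example workflow; the helper names `isDownloadFinished` and `waitForDownload` are made up for this sketch. It assumes the browser marks unfinished downloads with a temporary suffix (Chrome uses `.crdownload`, Firefox uses `.part`) and that the final file name is known in advance.

```javascript
const fs = require('fs');
const path = require('path');

// Suffixes of in-progress download files (Chrome and Firefox).
const PARTIAL_SUFFIXES = ['.crdownload', '.part'];

// True once the expected file exists and no partial files remain.
function isDownloadFinished(downloadDir, fileName) {
  const entries = fs.readdirSync(downloadDir);
  const stillRunning = entries.some(function (name) {
    return PARTIAL_SUFFIXES.some(function (suffix) {
      return name.endsWith(suffix);
    });
  });
  return !stillRunning && entries.includes(fileName);
}

// Poll the directory every intervalMs until the download has
// finished, or reject once timeoutMs has elapsed.
function waitForDownload(downloadDir, fileName, timeoutMs, intervalMs) {
  const deadline = Date.now() + timeoutMs;
  return new Promise(function (resolve, reject) {
    (function poll() {
      if (isDownloadFinished(downloadDir, fileName)) {
        resolve(path.join(downloadDir, fileName));
      } else if (Date.now() >= deadline) {
        reject(new Error('Download did not finish within ' + timeoutMs + ' ms'));
      } else {
        setTimeout(poll, intervalMs);
      }
    })();
  });
}
```

A call such as `waitForDownload(dir, 'data.csv', 60000, 500)` would then resolve with the full file path once the browser has finished writing. Inside a workflow, the same check would be built from nodes in a loop rather than from code.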
A while ago I built an example workflow just for that. It additionally includes some logic to use a dynamically created temporary directory for the downloaded file, but this is not strictly necessary.
You can find the workflow here:
There’s also an older forum post on this topic, which is worth a look:
Regarding the authentication dialogs, I’ll add some more content here later, as we already discussed via email.
Awesome! Thanks very much for the workflows and post link. While I hadn’t run into that one post (thought I read them all for Selenium), I did have an ever-growing list of arguments to try to get the larger download going.
So overall, for larger downloads, we do have to go with the Selenium/Palladian mix in the meantime.