Problems with Selenium

I'm having some difficulties with the Selenium nodes for KNIME. My aim is to scrape data from web pages that require authentication. I've downloaded the example workflows from http://seleniumnodes.com/workflows, and now I'm running the first one (Han shot first). The problem is that it is taking a very long time (3h+ and still running).

Could something be wrong with it? Does it need further configuration? By the way, is there any documentation for the Selenium nodes apart from the web page itself?

Can you please be more specific: which task is taking 3h+? The example workflows given on the website should run within seconds. Please also provide some details about your system configuration and the DEBUG level log output. Thank you.

The Start WebDriver node seems to stay in execution permanently. I've discovered that if I change the WebDriver from PhantomJS to HtmlUnit it works, but then some nodes fail (I suppose HtmlUnit is not the right WebDriver for this workflow). So the problem seems to be the PhantomJS WebDriver.

I'm on a Win XP desktop.

Thanks for answering.

Thank you aff, that definitely helps to isolate the problem. Can you also try using one of the GUI browsers, preferably Firefox? Please enable DEBUG logging in KNIME's prefs and attach the log output, if you should encounter any issues.

Also, please try running with PhantomJS once more and also post the log output.

Are you using Windows 32 or 64 bit?

Best regards,
Philipp

In that case (Firefox WebDriver), the Start WebDriver node fails, producing the following message:

ERROR Start WebDriver      0:3        Execute failed: Could not start a new session. Possible causes are invalid address of the remote server or browser start-up failure.
Build info: version: 'unknown', revision: 'unknown', time: 'unknown'
System info: host: 'tic-vbt', ip: '172.26.12.216', os.name: 'Windows XP', os.arch: 'x86', os.version: '5.1', java.version: '1.8.0_60'
Driver info: driver.version: FirefoxDriver
ERROR Start WebDriver      0:32       Execute failed: Could not start a new session. Possible causes are invalid address of the remote server or browser start-up failure.
Build info: version: 'unknown', revision: 'unknown', time: 'unknown'
System info: host: 'tic-vbt', ip: '172.26.12.216', os.name: 'Windows XP', os.arch: 'x86', os.version: '5.1', java.version: '1.8.0_60'
Driver info: driver.version: FirefoxDriver

 

Is Firefox within your system's path? Does Firefox start when you enter "firefox.exe" on the command line? If not, you will need to extend the system path so that Firefox can be found. We'll be adding a configuration option for this within the next releases.
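
If typing the command by hand is inconvenient, the same lookup can be scripted. The following is only a minimal sketch in Python, assuming a Python interpreter is available on the machine; `find_on_path` is a hypothetical helper name and is not part of the Selenium nodes.

```python
import shutil


def find_on_path(executable):
    """Return the full path of `executable` if it is found on PATH, else None."""
    # shutil.which searches the directories listed in PATH; on Windows it
    # also honours PATHEXT, so "firefox" matches "firefox.exe".
    return shutil.which(executable)


if __name__ == "__main__":
    location = find_on_path("firefox")
    if location is None:
        print("firefox was not found on PATH; extend the system path first")
    else:
        print("firefox found at", location)
```

Note that a changed system path is usually only picked up by programs started afterwards, so KNIME may need a restart before it sees the new value.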

I'm not sure about the PhantomJS issue; the necessary binary is bundled with the Selenium nodes in new versions of the nodes, however I have not been able to perform extensive testing under Windows yet. On the other hand, I haven't heard any complaints :) Could you please enable DEBUG logging (KNIME prefs -> KNIME GUI), and try to re-run the workflow with PhantomJS? Then please attach the log -- thank you!

Besides that, I'll try to perform some testing on Windows tomorrow, to see if there are any general issues in the current version.

PS: More extensive documentation will be coming with the final release :)

Thanks again Philipp.

First of all, Firefox wasn't in the system's path. I've included it, but, apparently, it didn't change a thing :-(

DEBUG logging is enabled. I've attached two log files: one for Firefox and one for PhantomJS. The PhantomJS execution is at a standstill, as it was yesterday.

Thank you for the logs, aff. I'll perform some more tests on Windows during these days to see if there are any general issues. However, after looking at your logs, I would assume that something on your system is blocking access to binaries and network communication. Are you running any anti virus/protection/firewall software?

[edit1] Btw; which version of Firefox are you using?

[edit2] Could you please try executing the following file on the command line? (Note: the forum does not break the line, so make sure it ends with "phantomjs.exe".)

C:\Archivos de programa\KNIME\plugins\ws.palladian.nodes.selenium.driver.win32_1.0.0.201601031617\binaries\phantomjs.exe

If it works as expected, it should give you a "phantomjs>" prompt.
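
The same check can also be scripted, which sidesteps quoting problems caused by the spaces in "Archivos de programa". This is a rough sketch in Python, not something shipped with the nodes; `can_launch` is a hypothetical helper, and it calls the binary with `--version` so that PhantomJS prints its version and exits instead of opening the prompt.

```python
import subprocess


def can_launch(command, timeout=10):
    """Try to start `command` (a list of arguments) and return (ok, detail)."""
    try:
        result = subprocess.run(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
            timeout=timeout,
        )
    except OSError as error:
        # Raised when the file is missing or the OS refuses to load it,
        # e.g. "not a valid Win32 application" on Windows.
        return False, str(error)
    except subprocess.TimeoutExpired:
        return False, "no response within {} seconds".format(timeout)
    return result.returncode == 0, result.stdout.strip()


if __name__ == "__main__":
    # Path copied from the post above; adjust it to the local installation.
    binary = r"C:\Archivos de programa\KNIME\plugins\ws.palladian.nodes.selenium.driver.win32_1.0.0.201601031617\binaries\phantomjs.exe"
    print(can_launch([binary, "--version"]))
```

Passing the command as a list means the path does not need to be wrapped in quotes. A result starting with `False` together with an operating system error message would point to the binary itself rather than to KNIME.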

Well, we are using ESET Endpoint Antivirus, and Firefox is version 43.0.4.

Unfortunately, the execution did not give us the expected output. Apparently, phantomjs.exe is not a valid Win32 application.

 

Okay, first about the PhantomJS issue: PhantomJS requires at least Windows Vista, as I just discovered. I will update the FAQ.

Concerning the issue with Firefox: when you run the Start WebDriver node, is a new Firefox application instance launched? Can you retry the procedure with the antivirus software disabled, or on a machine without antivirus software?
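
For reference, the values from the "System info" line in the error output are enough to check this requirement: Windows XP reports os.version '5.1', while Windows Vista corresponds to version 6.0. A minimal sketch in Python follows; `meets_windows_minimum` is a hypothetical name, and the check only says something about Windows systems.

```python
def meets_windows_minimum(os_name, os_version):
    """Return True if a Windows system is Vista (version 6.0) or newer.

    The requirement only concerns Windows, so other systems return True.
    """
    if not os_name.lower().startswith("windows"):
        return True
    # The major version is the part before the first dot, e.g. "5" in "5.1".
    major = int(os_version.split(".")[0])
    return major >= 6


if __name__ == "__main__":
    # Values taken from the "System info" line in the log output above.
    print(meets_windows_minimum("Windows XP", "5.1"))  # False
```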

Regards,
Philipp

Thanks for the info about PhantomJS; we will try to run it on a more up-to-date machine. Concerning Firefox, a new Firefox instance is not launched when we run the workflow. I've just temporarily disabled the antivirus software and run it again, but I'm still getting the same error message:

ERROR Start WebDriver      2:3        Execute failed: Could not start a new session. Possible causes are invalid address of the remote server or browser start-up failure.
Build info: version: 'unknown', revision: 'unknown', time: 'unknown'
System info: host: 'tic-vbt', ip: '172.26.12.216', os.name: 'Windows XP', os.arch: 'x86', os.version: '5.1', java.version: '1.8.0_60'
Driver info: driver.version: FirefoxDriver

 

Hi aff,

I just wanted to get back to you after some test runs on Windows 7. PhantomJS as well as Firefox are working without any issues for me. I would be really interested to hear whether it works on a more recent OS for you.

Best,
Philipp

Tested on Ubuntu at home. It worked perfectly apart from the PhantomJS issue, so I used Firefox instead. The problem is that we have to use it in a Windows environment.

Thanks for your feedback. There will be more messages ;-)

On Linux, you currently need to build PhantomJS on your own, as there is no universal portable binary. Check the FAQ#q8 for more info.

Best regards

Well, I finally managed to run the workflow on Windows XP. I used Chrome, which surprisingly worked (yesterday, it didn't). I don't know whether the problems with Firefox are related to using Selenium on Windows XP or simply to my computer's configuration. Further research is needed :-P.