I have a problem and I don’t understand why it happens. When moving from page 15 to page 16, the browser closes on its own without any intervention, and I cannot retrieve the data from the remaining pages that are still in progress.
Why do you think the browser is closing by itself?
I’ll hop in with some ideas, since the user you linked hasn’t responded yet.
It’s hard to tell why that might be happening without more detail on what’s going on inside the component. If it’s a copy-paste of your previously run page miner components, the problem may lie with the actual webpage rather than the component.
You may need to add a wait to give the webpage enough time to populate; if it hasn’t fully loaded, any logic in your component can error out.
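In case you want to try the same idea outside of KNIME, a minimal plain-Selenium (Python) sketch of “pause before reading the page” looks like this. The URL is just a placeholder, not your actual page (a smarter waiting approach is mentioned further down the thread):

```python
import time
from selenium import webdriver

driver = webdriver.Chrome()
driver.get("https://example.com/results?page=16")  # placeholder URL

time.sleep(5)  # crude fixed wait so the page can finish populating

# ... read data from the page here ...
print(driver.title)
driver.quit()
```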
Does this issue happen reproducibly, always with the same URL? Are you running the latest Selenium Nodes version, and/or have you restarted your computer, just to rule out the low-hanging fruit? Which browser are you using?
Hello @thor_landstrom and @qqilihq, thank you both for the replies. Philipp, I sent you the workflow I was having problems with via e-mail. Operating system: Ubuntu; KNIME 5.2.3.
Thanks for sending over the workflow! I had a look (impressive!), and while I cannot say for certain, I think the reason for the problem is that you generate an extreme amount of temporary data during workflow execution, and I suspect this eventually causes the browser to crash.
Here are some suggestions which I recommend implementing, and which will for sure be helpful for other Selenium Nodes users as well. They will make the workflow use fewer resources and run snappier, and hopefully also solve the issue with the crashing browser.
Avoid the combination of Find Elements + Execute JavaScript (you use the JS only for extracting text content). If you need to extract strings, I recommend using the Extract Text node instead: do not use a “Find Elements” node unless you have to; instead, enter the XPath or CSS query directly in the Extract Text node using its “Find Elements” option. This makes things much faster, as you avoid writing lots of intermediate data.
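Just to make the idea concrete for anyone reading along outside of KNIME, here is a rough plain-Selenium (Python) sketch of “read the text directly instead of running JavaScript per element”; the URL and XPath are made up for illustration:

```python
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("https://example.com/listing")  # placeholder URL

# Instead of running JavaScript per element, e.g.
#   driver.execute_script("return arguments[0].textContent;", element)
# read the text property directly from the located elements:
titles = [el.text for el in driver.find_elements(By.XPATH, "//h2[@class='title']")]
print(titles)
driver.quit()
```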
If you still need to use Find Elements, avoid ticking the setting “Append additional WebElement information”. It slows things down considerably, and I generally recommend it only for debugging, not for information-extraction tasks.
You can replace most of the Find Elements + Execute JavaScript combinations with a single Table Extractor node. It will automatically extract most of the table information for you, and you’ll only need to do some simple string post-processing.
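Conceptually this saves you the kind of manual row-and-cell looping you would otherwise have to build yourself. For comparison, a rough plain-Selenium (Python) sketch of that manual version, with a placeholder URL and selector:

```python
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("https://example.com/report")  # placeholder URL

# Grab the whole table in one pass: one list of cell texts per row.
table = driver.find_element(By.CSS_SELECTOR, "table#results")  # placeholder selector
rows = [
    [cell.text for cell in row.find_elements(By.CSS_SELECTOR, "th, td")]
    for row in table.find_elements(By.TAG_NAME, "tr")
]
print(rows[:3])  # header row plus the first two data rows
driver.quit()
```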
It’s not necessary to have the combination of GET Request + HTTP Retriever. Use one or the other; I recommend HTTP Retriever, as it works best with the HTML Parser node.
Instead of a fixed Wait node, make use of the “smart” waiting options. The “Find Elements” settings have a “Wait up to …” option, which I suggest using instead. It will wait and then continue execution as soon as elements for the entered XPath/CSS query become available on the page.
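For anyone who wants to see the equivalent behaviour in plain Selenium (Python), this is essentially an explicit wait; the URL and XPath below are placeholders:

```python
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome()
driver.get("https://example.com/results?page=16")  # placeholder URL

# Wait at most 30 seconds, but continue as soon as the elements appear,
# instead of always sleeping for a fixed amount of time.
rows = WebDriverWait(driver, 30).until(
    EC.presence_of_all_elements_located((By.XPATH, "//table//tr"))  # placeholder XPath
)
print(len(rows))
driver.quit()
```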