[Palladian] Pooled Web Driver Crashed

Hi @qqilihq,

I have to use the pooled web driver because the failure-handling process, or KNIME in general, does not cope well with either the WebDriver column or the WebDriver port type.

The Chrome instance crashed, which hadn’t happened for a long time, so I suspect the pooled web driver is at fault.

Here is the thread dump. System resources were not even half utilized, but the KNIME log showed `"SWTException": Invalid thread access` during execution, around 18:53 when Chrome crashed.

org.eclipse.swt.SWTException: Invalid thread access
	at org.eclipse.draw2d.DeferredUpdateManager.sendUpdateRequest(DeferredUpdateManager.java:264)
	at org.eclipse.draw2d.DeferredUpdateManager.queueWork(DeferredUpdateManager.java:251)
	at org.eclipse.draw2d.DeferredUpdateManager.addDirtyRegion(DeferredUpdateManager.java:119)
	at org.eclipse.draw2d.Figure.repaint(Figure.java:1540)
	at org.eclipse.draw2d.Figure.repaint(Figure.java:1531)
	at org.eclipse.draw2d.Figure.repaint(Figure.java:1549)
	at org.knime.workbench.editor2.figures.WorkflowFigure.setJobManagerFigure(WorkflowFigure.java:214)
	at org.knime.workbench.editor2.WorkflowEditor.updateJobManagerDisplay(WorkflowEditor.java:1603)
	at org.knime.workbench.editor2.WorkflowEditor.nodePropertyChanged(WorkflowEditor.java:3822)
	at org.knime.core.node.workflow.NodeContainer.notifyNodePropertyChangedListener(NodeContainer.java:481)
	at org.knime.core.node.workflow.WorkflowManager.notifyNodePropertyChangedListener(WorkflowManager.java:8218)
	at org.knime.core.node.workflow.NodeContainer.setJobManager(NodeContainer.java:353)
	at org.knime.core.node.workflow.SubNodeContainer.setInactive(SubNodeContainer.java:2720)
	at org.knime.core.node.workflow.NodeExecutionJob.checkForTryCatchScope(NodeExecutionJob.java:257)
	at org.knime.core.node.workflow.NodeExecutionJob.internalRun(NodeExecutionJob.java:206)
	at org.knime.core.node.workflow.NodeExecutionJob.run(NodeExecutionJob.java:117)
	at org.knime.core.util.ThreadUtils$RunnableWithContextImpl.runWithContext(ThreadUtils.java:367)
	at org.knime.core.util.ThreadUtils$RunnableWithContext.run(ThreadUtils.java:221)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
	at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
	at org.knime.core.util.ThreadPool$MyFuture.run(ThreadPool.java:123)
	at org.knime.core.util.ThreadPool$Worker.run(ThreadPool.java:246)
2024-08-13 17:46:32,466 : ERROR : KNIME-Worker-176-Extract Data 3:1126:0:972 :  : LocalNodeExecutionJob : Extract Data : 3:1126:0:972 : Caught "SWTException": Invalid thread access
org.eclipse.swt.SWTException: Invalid thread access
	at org.eclipse.draw2d.DeferredUpdateManager.sendUpdateRequest(DeferredUpdateManager.java:264)
	at org.eclipse.draw2d.DeferredUpdateManager.queueWork(DeferredUpdateManager.java:251)
	at org.eclipse.draw2d.DeferredUpdateManager.addDirtyRegion(DeferredUpdateManager.java:119)
	at org.eclipse.draw2d.Figure.repaint(Figure.java:1540)
	at org.eclipse.draw2d.Figure.repaint(Figure.java:1531)
	at org.eclipse.draw2d.Figure.repaint(Figure.java:1549)
	at org.knime.workbench.editor2.figures.WorkflowFigure.setJobManagerFigure(WorkflowFigure.java:214)
	at org.knime.workbench.editor2.WorkflowEditor.updateJobManagerDisplay(WorkflowEditor.java:1603)
	at org.knime.workbench.editor2.WorkflowEditor.nodePropertyChanged(WorkflowEditor.java:3822)
	at org.knime.core.node.workflow.NodeContainer.notifyNodePropertyChangedListener(NodeContainer.java:481)
	at org.knime.core.node.workflow.WorkflowManager.notifyNodePropertyChangedListener(WorkflowManager.java:8218)
	at org.knime.core.node.workflow.NodeContainer.setJobManager(NodeContainer.java:353)
	at org.knime.core.node.workflow.SubNodeContainer.setInactive(SubNodeContainer.java:2720)
	at org.knime.core.node.workflow.NodeExecutionJob.checkForTryCatchScope(NodeExecutionJob.java:257)
	at org.knime.core.node.workflow.NodeExecutionJob.internalRun(NodeExecutionJob.java:206)
	at org.knime.core.node.workflow.NodeExecutionJob.run(NodeExecutionJob.java:117)
	at org.knime.core.util.ThreadUtils$RunnableWithContextImpl.runWithContext(ThreadUtils.java:367)
	at org.knime.core.util.ThreadUtils$RunnableWithContext.run(ThreadUtils.java:221)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
	at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
	at org.knime.core.util.ThreadPool$MyFuture.run(ThreadPool.java:123)
	at org.knime.core.util.ThreadPool$Worker.run(ThreadPool.java:246)
2024-08-13 18:54:36,447 : ERROR : KNIME-Worker-300-Click 3:1126:0:972:0:1377:0:1368 :  : Node : Click : 3:1126:0:972:0:1377:0:1368 : Execute failed: java.util.concurrent.TimeoutException
ws.palladian.nodes.selenium.SeleniumNodeExecutionException: java.util.concurrent.TimeoutException
	at ws.palladian.nodes.selenium.AbstractWebElementMethodNodeModel.execute(AbstractWebElementMethodNodeModel.java:319)
	at org.knime.core.node.NodeModel.execute(NodeModel.java:812)
	at org.knime.core.node.NodeModel.executeModel(NodeModel.java:588)
	at org.knime.core.node.Node.invokeFullyNodeModelExecute(Node.java:1286)
	at org.knime.core.node.Node.execute(Node.java:1049)
	at org.knime.core.node.workflow.NativeNodeContainer.performExecuteNode(NativeNodeContainer.java:594)
	at org.knime.core.node.exec.LocalNodeExecutionJob.mainExecute(LocalNodeExecutionJob.java:98)
	at org.knime.core.node.workflow.NodeExecutionJob.internalRun(NodeExecutionJob.java:198)
	at org.knime.core.node.workflow.NodeExecutionJob.run(NodeExecutionJob.java:117)
	at org.knime.core.util.ThreadUtils$RunnableWithContextImpl.runWithContext(ThreadUtils.java:367)
	at org.knime.core.util.ThreadUtils$RunnableWithContext.run(ThreadUtils.java:221)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
	at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
	at org.knime.core.util.ThreadPool$MyFuture.run(ThreadPool.java:123)
	at org.knime.core.util.ThreadPool$Worker.run(ThreadPool.java:246)
Caused by: org.openqa.selenium.TimeoutException: java.util.concurrent.TimeoutException
Build info: version: '4.22.0', revision: 'c5f3146703*'
System info: os.name: 'Windows 11', os.arch: 'amd64', os.version: '10.0', java.version: '17.0.5'
Driver info: org.openqa.selenium.chrome.ChromeDriver
Command: [86eba025f0c54b9575a839940dcdf437, clickElement {id=f.AAA5358D1D041BB1194AD86DFC027433.d.A671B6FC30F2E8F78997477816AF881E.e.621968}]
Capabilities {acceptInsecureCerts: false, browserName: chrome, browserVersion: 127.0.6533.100, chrome: {chromedriverVersion: 127.0.6533.88 (a2d0cb026721..., userDataDir: C:\Users\MIKEWI~1\AppData\L...}, fedcm:accounts: true, goog:chromeOptions: {debuggerAddress: localhost:14981}, networkConnectionEnabled: false, pageLoadStrategy: normal, platformName: windows, proxy: Proxy(), se:cdp: ws://localhost:14981/devtoo..., se:cdpVersion: 127.0.6533.100, setWindowRect: true, strictFileInteractability: false, timeouts: {implicit: 0, pageLoad: 300000, script: 30000}, unhandledPromptBehavior: dismiss and notify, webauthn:extension:credBlob: true, webauthn:extension:largeBlob: true, webauthn:extension:minPinLength: true, webauthn:extension:prf: true, webauthn:virtualAuthenticators: true}
Element: [[ChromeDriver: chrome on windows (86eba025f0c54b9575a839940dcdf437)] -> xpath: //*[@id="reservations_active"]//following-sibling::*[contains(@class, "section__content")]//*[contains(@class, "collection__item")][8]]
Session ID: 86eba025f0c54b9575a839940dcdf437
	at org.openqa.selenium.remote.http.jdk.JdkHttpClient.execute0(JdkHttpClient.java:399)
	at org.openqa.selenium.remote.http.AddSeleniumUserAgent.lambda$apply$0(AddSeleniumUserAgent.java:42)
	at org.openqa.selenium.remote.http.Filter.lambda$andFinally$1(Filter.java:55)
	at org.openqa.selenium.remote.http.jdk.JdkHttpClient.execute(JdkHttpClient.java:355)
	at org.openqa.selenium.remote.HttpCommandExecutor.execute(HttpCommandExecutor.java:187)
	at org.openqa.selenium.remote.service.DriverCommandExecutor.invokeExecute(DriverCommandExecutor.java:216)
	at org.openqa.selenium.remote.service.DriverCommandExecutor.execute(DriverCommandExecutor.java:174)
	at org.openqa.selenium.remote.RemoteWebDriver.execute(RemoteWebDriver.java:518)
	at org.openqa.selenium.remote.RemoteWebElement.execute(RemoteWebElement.java:223)
	at org.openqa.selenium.remote.RemoteWebElement.click(RemoteWebElement.java:76)
	at jdk.internal.reflect.GeneratedMethodAccessor180.invoke(Unknown Source)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
	at java.base/java.lang.reflect.Method.invoke(Unknown Source)
	at ws.palladian.nodes.selenium.SeleniumUtils$SeleniumSyncInvocationHandler.invoke(SeleniumUtils.java:45)
	at ws.palladian.nodes.selenium.SeleniumUtils$2.invoke(SeleniumUtils.java:208)
	at jdk.proxy23/jdk.proxy23.$Proxy120.click(Unknown Source)
	at ws.palladian.nodes.selenium.click3.Click3NodeFactory$ClickType.lambda$0(Click3NodeFactory.java:30)
	at ws.palladian.nodes.selenium.click3.Click3NodeFactory$ClickType.execute(Click3NodeFactory.java:43)
	at ws.palladian.nodes.selenium.click3.Click3NodeFactory$1.execMethod(Click3NodeFactory.java:72)
	at ws.palladian.nodes.selenium.AbstractWebElementMethodNodeModel.execute(AbstractWebElementMethodNodeModel.java:317)
	... 14 more
Caused by: java.util.concurrent.TimeoutException
	at java.base/java.util.concurrent.CompletableFuture.timedGet(Unknown Source)
	at java.base/java.util.concurrent.CompletableFuture.get(Unknown Source)
	at org.openqa.selenium.remote.http.jdk.JdkHttpClient.execute0(JdkHttpClient.java:382)
	... 33 more
2024-08-13 18:54:36,494 : ERROR : KNIME-Worker-389-Variable Condition Loop End 3:1126:0:972:0:1211 :  : Node : Variable Condition Loop End : 3:1126:0:972:0:1211 : Active Scope End node in inactive branch not allowed.
2024-08-13 18:57:36,708 : ERROR : KNIME-Worker-374-Find Elements 3:1126:0:972:0:1196:0:1314 :  : Node : Find Elements : 3:1126:0:972:0:1196:0:1314 : Execute failed: Expected condition failed: waiting for presence of elements located by By.xpath: //*[@class="toggle-switch"]//*[contains(@class, "toggle-switch__item")][last()]/span (tried for 5 second(s) with 0 milliseconds interval)
ws.palladian.nodes.selenium.SeleniumNodeExecutionException: Expected condition failed: waiting for presence of elements located by By.xpath: //*[@class="toggle-switch"]//*[contains(@class, "toggle-switch__item")][last()]/span (tried for 5 second(s) with 0 milliseconds interval)
	at ws.palladian.nodes.selenium.findelements2.FindElements2Settings.execFind(FindElements2Settings.java:343)
	at ws.palladian.nodes.selenium.findelements2.FindElements2NodeModel.execute(FindElements2NodeModel.java:82)
	at org.knime.core.node.NodeModel.execute(NodeModel.java:812)
	at org.knime.core.node.NodeModel.executeModel(NodeModel.java:588)
	at org.knime.core.node.Node.invokeFullyNodeModelExecute(Node.java:1286)
	at org.knime.core.node.Node.execute(Node.java:1049)
	at org.knime.core.node.workflow.NativeNodeContainer.performExecuteNode(NativeNodeContainer.java:594)
	at org.knime.core.node.exec.LocalNodeExecutionJob.mainExecute(LocalNodeExecutionJob.java:98)
	at org.knime.core.node.workflow.NodeExecutionJob.internalRun(NodeExecutionJob.java:198)
	at org.knime.core.node.workflow.NodeExecutionJob.run(NodeExecutionJob.java:117)
	at org.knime.core.util.ThreadUtils$RunnableWithContextImpl.runWithContext(ThreadUtils.java:367)
	at org.knime.core.util.ThreadUtils$RunnableWithContext.run(ThreadUtils.java:221)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
	at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
	at org.knime.core.util.ThreadPool$MyFuture.run(ThreadPool.java:123)
	at org.knime.core.util.ThreadPool$Worker.run(ThreadPool.java:246)
Caused by: org.openqa.selenium.TimeoutException: Expected condition failed: waiting for presence of elements located by By.xpath: //*[@class="toggle-switch"]//*[contains(@class, "toggle-switch__item")][last()]/span (tried for 5 second(s) with 0 milliseconds interval)
Build info: version: '4.22.0', revision: 'c5f3146703*'
System info: os.name: 'Windows 11', os.arch: 'amd64', os.version: '10.0', java.version: '17.0.5'
Driver info: jdk.proxy23.$Proxy115
	at org.openqa.selenium.support.ui.WebDriverWait.timeoutException(WebDriverWait.java:84)
	at org.openqa.selenium.support.ui.FluentWait.until(FluentWait.java:228)
	at ws.palladian.nodes.selenium.findelements2.FindElements2Settings.execFind(FindElements2Settings.java:333)
	... 15 more
2024-08-13 18:57:36,771 : ERROR : KNIME-Worker-428-Network Dump End [BETA] 3:1126:0:972:0:1171 :  : Node : Network Dump End [BETA] : 3:1126:0:972:0:1171 : Active Scope End node in inactive branch not allowed.
2024-08-13 19:00:36,925 : ERROR : KNIME-Worker-438-Find Elements 3:1126:0:972:0:1196:0:1314 :  : Node : Find Elements : 3:1126:0:972:0:1196:0:1314 : Execute failed: Expected condition failed: waiting for presence of elements located by By.xpath: //*[@class="toggle-switch"]//*[contains(@class, "toggle-switch__item")][last()]/span (tried for 5 second(s) with 0 milliseconds interval)
ws.palladian.nodes.selenium.SeleniumNodeExecutionException: Expected condition failed: waiting for presence of elements located by By.xpath: //*[@class="toggle-switch"]//*[contains(@class, "toggle-switch__item")][last()]/span (tried for 5 second(s) with 0 milliseconds interval)
	at ws.palladian.nodes.selenium.findelements2.FindElements2Settings.execFind(FindElements2Settings.java:343)
	at ws.palladian.nodes.selenium.findelements2.FindElements2NodeModel.execute(FindElements2NodeModel.java:82)
	at org.knime.core.node.NodeModel.execute(NodeModel.java:812)
	at org.knime.core.node.NodeModel.executeModel(NodeModel.java:588)
	at org.knime.core.node.Node.invokeFullyNodeModelExecute(Node.java:1286)
	at org.knime.core.node.Node.execute(Node.java:1049)
	at org.knime.core.node.workflow.NativeNodeContainer.performExecuteNode(NativeNodeContainer.java:594)
	at org.knime.core.node.exec.LocalNodeExecutionJob.mainExecute(LocalNodeExecutionJob.java:98)
	at org.knime.core.node.workflow.NodeExecutionJob.internalRun(NodeExecutionJob.java:198)
	at org.knime.core.node.workflow.NodeExecutionJob.run(NodeExecutionJob.java:117)
	at org.knime.core.util.ThreadUtils$RunnableWithContextImpl.runWithContext(ThreadUtils.java:367)
	at org.knime.core.util.ThreadUtils$RunnableWithContext.run(ThreadUtils.java:221)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
	at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
	at org.knime.core.util.ThreadPool$MyFuture.run(ThreadPool.java:123)
	at org.knime.core.util.ThreadPool$Worker.run(ThreadPool.java:246)
Caused by: org.openqa.selenium.TimeoutException: Expected condition failed: waiting for presence of elements located by By.xpath: //*[@class="toggle-switch"]//*[contains(@class, "toggle-switch__item")][last()]/span (tried for 5 second(s) with 0 milliseconds interval)
Build info: version: '4.22.0', revision: 'c5f3146703*'
System info: os.name: 'Windows 11', os.arch: 'amd64', os.version: '10.0', java.version: '17.0.5'
Driver info: jdk.proxy23.$Proxy115
	at org.openqa.selenium.support.ui.WebDriverWait.timeoutException(WebDriverWait.java:84)
	at org.openqa.selenium.support.ui.FluentWait.until(FluentWait.java:228)
	at ws.palladian.nodes.selenium.findelements2.FindElements2Settings.execFind(FindElements2Settings.java:333)
	... 15 more
2024-08-13 19:00:36,989 : ERROR : KNIME-Worker-487-Network Dump End [BETA] 3:1126:0:972:0:1171 :  : Node : Network Dump End [BETA] : 3:1126:0:972:0:1171 : Active Scope End node in inactive branch not allowed.
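
To line up log entries with the time of the crash, a small filter along these lines can help. It is only a sketch: the class and method names are made up for illustration, and it assumes the header layout `<timestamp> : ERROR : ...` seen in the excerpt above.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: keeps knime.log ERROR header lines that fall inside a time window. */
final class KnimeLogWindow {

    // Timestamp layout as it appears in the excerpt, e.g. "2024-08-13 18:54:36,447".
    private static final DateTimeFormatter TS =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss,SSS");

    /** Returns ERROR header lines with a timestamp in [from, to]; stack-trace lines are skipped. */
    static List<String> errorsBetween(List<String> lines, LocalDateTime from, LocalDateTime to) {
        List<String> hits = new ArrayList<>();
        for (String line : lines) {
            // Header lines are assumed to look like "<timestamp> : ERROR : <rest>".
            String[] parts = line.split(" : ", 3);
            if (parts.length < 3 || !"ERROR".equals(parts[1].trim())) {
                continue;
            }
            LocalDateTime ts;
            try {
                ts = LocalDateTime.parse(parts[0].trim(), TS);
            } catch (DateTimeParseException e) {
                continue; // first field is not a timestamp, so this is not a header line
            }
            if (!ts.isBefore(from) && !ts.isAfter(to)) {
                hits.add(line);
            }
        }
        return hits;
    }
}
```

Calling `errorsBetween` with the lines of knime.log and a window of a few minutes around the crash leaves only the error headers from that period.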

Unfortunately, I do not have more information to share.

[240813 Selenium Pooled Chrome Webdriver snaped threaddump-1723568399363.tdump.txt|attachment](upload://b4cH08XI1zJYE7FLtRDHRU0B2UD.txt) (103.3 KB)

[270813 Pooled Web Driver Crashed knime.log.txt|attachment](upload://zaGDgGMXFen7ACJVLpZB0QxOlIR.txt) (384.2 KB)

Best
Mike

Hey Mike,

Let’s pick this up here at the root: could you elaborate on your issues with the mentioned column or port types?

Generally, the “Pooled” node should handle crashing browsers gracefully (crashes are not exactly uncommon in heavy automation scenarios): it should remove the crashed instance from the pool and simply start a new one. Does this not work?
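
The idea can be sketched roughly as follows. This is not the actual Palladian code, just a minimal illustration with made-up names (`SelfHealingPool`, `borrow`, `giveBack`): a dead instance is dropped when it is encountered, and a fresh one is created when no healthy instance is left.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Predicate;
import java.util.function.Supplier;

/** Hypothetical pool: hands out healthy instances, discards dead ones, creates replacements. */
final class SelfHealingPool<T> {
    private final Deque<T> idle = new ArrayDeque<>();
    private final Supplier<T> factory;   // starts a new instance, e.g. launches a browser
    private final Predicate<T> isAlive;  // health check, e.g. a cheap call that fails on a dead browser

    SelfHealingPool(Supplier<T> factory, Predicate<T> isAlive) {
        this.factory = factory;
        this.isAlive = isAlive;
    }

    /** Returns an idle, healthy instance, or a freshly created one if none is left. */
    synchronized T borrow() {
        while (!idle.isEmpty()) {
            T candidate = idle.pop();
            if (isAlive.test(candidate)) {
                return candidate;
            }
            // Crashed instance: it is simply dropped from the pool and the search continues.
        }
        return factory.get();
    }

    /** Puts an instance back into the pool; a crashed one is not re-added. */
    synchronized void giveBack(T instance) {
        if (isAlive.test(instance)) {
            idle.push(instance);
        }
    }

    /** Number of instances currently waiting in the pool. */
    synchronized int idleCount() {
        return idle.size();
    }
}
```

A real pool would additionally have to terminate the browser process behind a discarded instance, which this sketch leaves out.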

Thanks!
Philipp

Hi Philipp,

For failover, I write the data into temp tables. KNIME, however, for apparent reasons, cannot save the WebDriver column. In case of an error, the WebDriver column or the row is missing, resulting in:

It is also not possible to pass the WebDriver port into the innermost loop to start the instance there, although I believe the generic loop can pass all varieties of ports.

Best
Mike

@mwiegand I think I roughly get the (pain) point, and I have faced similar challenges too. I would really like to go through this with you on a call to evaluate some ideas (besides many other topics :slight_smile: ).

I’ll be in touch via email.

What about Friday? My wife and son have their birthdays today and tomorrow :wink:

So something odd is really happening when using the pooled web driver:

  1. The process runs out of memory although the system still has plenty available
  2. Several instances are still running in the background when they shouldn’t be, which indicates that something in the pooling might not work (pure speculation, as I am still familiarizing myself with pooling)
  3. Crashed and pooled Chrome instances are not terminated when KNIME is closed

I also noticed, after starting KNIME and reinitiating the scrape, that a total of seven Chrome processes are launched, which might be intended but is unclear from the node description.
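
For checking which Chrome and ChromeDriver processes are alive before and after closing KNIME, a small listing like the one below could be used. It is a sketch only: `ChromeProcessLister` is a made-up name, and matching on the executable name is an assumption that may need adjusting. Note that a single Chrome instance normally consists of several operating-system processes (browser, renderers, GPU, utilities), so the process count alone does not say how many browsers were started.

```java
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

/** Hypothetical helper that lists running Chrome/ChromeDriver processes via ProcessHandle (Java 9+). */
final class ChromeProcessLister {

    /** True if the executable name starts with "chrome" (also matches chromedriver). */
    static boolean looksLikeChrome(String command) {
        String normalized = command.toLowerCase(Locale.ROOT).replace('\\', '/');
        String executable = normalized.substring(normalized.lastIndexOf('/') + 1);
        return executable.startsWith("chrome");
    }

    /** Returns "pid<TAB>command" for every visible process whose executable looks like Chrome. */
    static List<String> runningChromeProcesses() {
        return ProcessHandle.allProcesses()
                // command() can be empty, e.g. for processes owned by other users
                .filter(p -> p.info().command().map(ChromeProcessLister::looksLikeChrome).orElse(false))
                .map(p -> p.pid() + "\t" + p.info().command().orElse("?"))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        runningChromeProcesses().forEach(System.out::println);
    }
}
```

Running it once while the workflow executes and once after KNIME has been closed shows whether any process survives the shutdown.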

I also believe I might have stumbled upon a possible bug:

2024-08-16 16:19:13,082 ERROR Logging 3:1126:0:1658 Execute failed: Cannot invoke "org.openqa.selenium.chromium.ChromiumDriverLogLevel.name()" because "level" is null

This might have been triggered because a pooled Chrome instance was already running when I decided to reset the web driver factory to change the log level to warn.

PS: Even after killing the process and restarting KNIME, the error persists.

PS: Investigating further, I found a potential workaround, which I am testing at the moment, by defining these args:

--max_old_space_size=4096 --optimize_for_size --max_executable_size=4096 --stack_size=4096

However, since these flags come from very old sources, I am not certain whether they are still applicable.
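
One thing worth checking: these look like options of V8, Chrome’s JavaScript engine, rather than Chrome switches. Chrome forwards V8 options through its `--js-flags` switch, so as top-level arguments they may have no effect. The sketch below only shows how such a combined argument could be built; `ChromeJsFlags` is a made-up helper, and whether each individual flag is still recognized by the bundled V8 version is not verified here.

```java
import java.util.List;

/** Hypothetical helper that wraps V8 options into a single Chrome command-line argument. */
final class ChromeJsFlags {

    /** Joins the given V8 options with spaces and prefixes them with Chrome's --js-flags switch. */
    static String toJsFlagsArgument(List<String> v8Flags) {
        return "--js-flags=" + String.join(" ", v8Flags);
    }

    public static void main(String[] args) {
        // The flag values are taken from the post above; they are examples, not recommendations.
        System.out.println(toJsFlagsArgument(
                List.of("--max_old_space_size=4096", "--stack_size=4096")));
    }
}
```

The resulting string would be supplied as one single browser argument, not split at the spaces.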

PS: So far no out of memory Chrome crash :crossed_fingers:
