Get browser user authentication in KNIME using Selenium

Hi, some websites block access to the data page when I try to scrape it using the Selenium nodes and redirect to a security page instead. If I could open my Gmail profile within Selenium, this might solve the issue, but I couldn’t get the workflow from Armin’s example to work. Is there a sample solution for this?

WARN Java Snippet 10:16 Bundle “org.seleniumhq.selenium.selenium-merged” required by this snippet was not found.
Bundle “com.github.kklisura.cdt.cdt-java-client-merged” required by this snippet was not found.
Compile with errors:
Error in line 16: The import com.fasterxml cannot be resolved
Error in line 17: The import com.fasterxml cannot be resolved
Error in line 18: The import com.fasterxml cannot be resolved
Error in line 19: The import com.github cannot be resolved
Error in line 20: The import com.github cannot be resolved
Error in line 21: The import com.github cannot be resolved
Error in line 22: The import com.github cannot be resolved
Error in line 23: The import com.github cannot be resolved
Error in line 24: The import com.github cannot be resolved
Error in line 25: The import com.github cannot be resolved
Error in line 26: The import com.github cannot be resolved
Error in line 27: The import com.github cannot be resolved
Error in line 28: The import com.github cannot be resolved
Error in line 29: The import com.github cannot be resolved
Error in line 30: The import com.github cannot be resolved
Error in line 97: ObjectMapper cannot be resolved to a type
Error in line 97: DeserializationFeature cannot be resolved to a variable
Error in line 98: TypeReference cannot be resolved to a type
Error in line 105: ChromeDevToolsService cannot be resolved to a type
Error in line 107: WebSocketService cannot be resolved to a type
Error in line 107: WebSocketServiceImpl cannot be resolved
Error in line 108: ChromeDevToolsServiceConfiguration cannot be resolved to a type
Error in line 108: ChromeDevToolsServiceConfiguration cannot be resolved to a type
Error in line 109: CommandInvocationHandler cannot be resolved to a type
Error in line 109: CommandInvocationHandler cannot be resolved to a type
Error in line 111: ChromeDevToolsServiceImpl cannot be resolved to a type
Error in line 111: ProxyUtils cannot be resolved
Error in line 112: ChromeDevToolsServiceImpl cannot be resolved to a type
Error in line 113: WebSocketService cannot be resolved to a type
Error in line 113: ChromeDevToolsServiceConfiguration cannot be resolved to a type
Error in line 121: WebSocketServiceException cannot be resolved to a type
Error in line 122: ChromeServiceException cannot be resolved to a type
Error in line 139: ChromeDevToolsService cannot be resolved to a type
Error in line 139: The method createDevToolsService(java.lang.String) from the type JSnippet refers to the missing type ChromeDevToolsService
Error in line 142: Fetch cannot be resolved to a type

Umut,

The Java packages in the imports you mention are from a very old version of the Selenium nodes and are no longer included in recent versions, so this will not work.

Can you please describe step by step what exactly you’re trying to do?

-Philipp

Hello, Philipp @qqilihq

Some websites block access when I attempt to navigate to certain pages using Selenium, redirecting to a different URL instead. I recently discovered an alternative approach: launching Google Chrome with a saved user profile via Selenium.
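For readers who want to try the saved-profile launch, here is a minimal sketch of the capabilities involved. It assumes ChromeDriver's `goog:chromeOptions` capability and Chrome's `--user-data-dir` and `--profile-directory` switches; the function name and the example path are hypothetical, and where exactly this JSON is entered in the Selenium nodes depends on the node version, so treat it as a starting point rather than a confirmed configuration.

```python
import json


def chrome_profile_capabilities(user_data_dir, profile_directory="Default"):
    """Build capabilities that ask ChromeDriver to start Chrome with an existing profile.

    user_data_dir: folder that holds the Chrome user data (hypothetical path in the demo).
    profile_directory: name of the profile sub-folder inside user_data_dir.
    """
    args = [
        # Point Chrome at the saved user data directory.
        f"--user-data-dir={user_data_dir}",
        # Select which profile inside that directory to open.
        f"--profile-directory={profile_directory}",
    ]
    return {
        "browserName": "chrome",
        "goog:chromeOptions": {"args": args},
    }


if __name__ == "__main__":
    # Hypothetical location of a dedicated scraping profile.
    caps = chrome_profile_capabilities("/home/user/chrome-selenium-profile")
    # Print the JSON so it can be copied into a WebDriver configuration.
    print(json.dumps(caps, indent=2))
```

Chrome should not already be running with the same user data directory when the driver starts, which is one reason a dedicated copy of the profile is often used for this.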

However, even with a saved profile, some websites still block access and redirect to restricted pages. The key to overcoming this is adjusting browser DNS settings within the saved profile, which resolves the issue for now. That said, integrating Gmail access via Selenium with a saved profile would be an even more robust solution.

Interesting approach with changing the DNS - I wasn’t aware of that.

Do you need Gmail to retrieve a confirmation link, or for something else? I think I’ve accessed Gmail within a workflow once before, and I built a node for that. Did you have a look at this node:

Does that help?

Unfortunately, this would not work as a solution for my workflow. For example, I was successfully scraping data from the URL below. Initially there were no issues, but after some time the website started redirecting the Selenium browser to a different page. Adjusting the DNS settings has resolved the issue for now.

Additionally, if we could open a Gmail session in a saved profile, it would be much more effective at getting past barriers such as Cloudflare and human verification challenges. I now run into Cloudflare or human verification on many pages, far more often than in previous years.

https://bieterportal.noncd.db.de/evergabe.bieter/eva/supplierportal/portal/tabs/vergaben

When the DNS is configured, the page loads as expected in the first image.

In the second image, when Selenium opens the page without DNS configuration, it redirects to a different page.

Additionally, I would like to share this article to provide insights for those facing similar challenges. It may help in developing alternative solutions for comparable scenarios.

If you don’t block cookies, they will eventually block you again. I’ve shared all my findings and experiences on overcoming these challenges. First, clear all history from the saved profile. Then configure DNS and cookie settings before visiting the website.