Solutions to "Just KNIME It!" Challenge 9 - Season 3

Just so you don’t run into a dead end when you try to open my versions: I deleted the two versions from my public space because they were not working. I also added the command line argument to my first solution.

1 Like

I may not be very smart, but I’m persistent. I modified @MartinDDDD and @RBre’s workflows to remove the Clicker nodes, and the revised workflows now work for me. IMHO the Clicker node is not ready for prime time. I have no idea what the problem is; it may have something to do with browser cookie settings. The node descriptions are so sketchy that they’re no help. I’m running Chrome on Windows 11. @RBre, I had to modify your search terms since AI and NVIDIA were no longer in the list.

3 Likes

One possible cause I can think of is the EU’s General Data Protection Regulation (GDPR), which regulates cookies and how companies handle personal data.

I do not know where you live, @rfeigel, but maybe it’s not in Europe (or anywhere else where the handling of personal data is heavily regulated). For @MartinDDDD it seems to be the opposite: I saw in the video that the cookie page appears, so I think it’s regulated there.

We could test this theory with a VPN by connecting through servers in countries where it is regulated and where it is not. That way a general solution could be developed as well. Sadly, I do not have a VPN like this, but that could be the answer.

In my opinion:
From a regulated country: the Clicker node is needed (a cookie page appears)
From a country that is not heavily regulated: the Clicker node is not needed (no cookie page appears)

4 Likes

I checked - I can also just ignore the notice as the HTML loads in the background correctly already - just left it in so I could demonstrate how the clicker works :-).

If that notice does not appear for someone (for whatever reason), that explains the error. It sounds like error handling via Try/Catch might be necessary to make sure the workflow works when shared “internationally”.
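
For anyone reading along outside KNIME, here is a rough Python sketch of the same try/catch idea: attempt the click, and if the cookie notice is not there, just carry on. Everything in it is made up for illustration. `FakePage`, `ElementNotFound`, and the `#accept-cookies` selector are hypothetical stand-ins, not the Clicker node's behaviour and not a real browser API.

```python
class ElementNotFound(Exception):
    """Raised when a selector matches nothing on the page (hypothetical)."""


class FakePage:
    """Stand-in for a browser page; NOT a real KNIME or Selenium API."""

    def __init__(self, html, has_cookie_banner):
        self.html = html
        self.has_cookie_banner = has_cookie_banner

    def click(self, selector):
        # Mimic a clicker that fails when the target element is missing.
        if selector == "#accept-cookies" and self.has_cookie_banner:
            self.has_cookie_banner = False
            return
        raise ElementNotFound(selector)


def scrape(page):
    """Dismiss the cookie notice if present, then return the page HTML."""
    try:
        page.click("#accept-cookies")  # hypothetical selector
    except ElementNotFound:
        # No notice was shown (e.g. visitor outside the EU): carry on.
        pass
    return page.html


eu_page = FakePage("<html>stocks</html>", has_cookie_banner=True)
us_page = FakePage("<html>stocks</html>", has_cookie_banner=False)
print(scrape(eu_page) == scrape(us_page))
```

The point is only that the scrape step runs in both cases; in a workflow the `try`/`except` would correspond to the Try and Catch branches around the Clicker node.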

5 Likes

:sun_with_face: Happy Tuesday, folks! :sun_with_face:

As usual, here’s our solution to last week’s Just KNIME It! challenge!

:railway_track: We took a bit of a different path here, and are happy to see how you all found different ways of leveraging KNIME Analytics Platform for web scraping! :robot:

See you tomorrow for a challenge on :memo: typo correction!

2 Likes

@berti093 You’re right. I tested on a German server and the pop-up appeared. I live in the USA. I would urge KNIME developers to try to handle this internally to avoid the necessity of try/catch branches.
@ScottF @MartinDDDD @alinebessa

3 Likes

I don’t mean to beat a dead horse, but I thought I’d develop a workflow that checks for “EU-like” cookie popups. I modified @MartinDDDD’s workflow with Try/Catch nodes. I tested it on German and USA servers and it seems to work with either. This is not an ideal solution. There should be a “Cookie Checker” node that does this.
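
To make the “Cookie Checker” idea a bit more concrete, here is a minimal Python sketch that inspects the page HTML for a consent banner before deciding whether a click is needed. It is only an illustration using the standard library: the `cookie-consent` id and the sample HTML are assumptions, since the real site's markup is not shown in this thread, and it is not how the workflow itself is built.

```python
from html.parser import HTMLParser


class _IdFinder(HTMLParser):
    """Records whether any start tag carries the wanted id attribute."""

    def __init__(self, wanted_id):
        super().__init__()
        self.wanted_id = wanted_id
        self.found = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the current tag.
        if dict(attrs).get("id") == self.wanted_id:
            self.found = True


def has_cookie_banner(html, banner_id="cookie-consent"):
    """Return True if the page contains an element with the banner id.

    The default banner_id is a hypothetical placeholder.
    """
    finder = _IdFinder(banner_id)
    finder.feed(html)
    return finder.found


# Made-up pages standing in for what EU and US visitors might receive.
eu_html = '<body><div id="cookie-consent"><button>Accept</button></div><table></table></body>'
us_html = "<body><table></table></body>"

for label, html in (("EU", eu_html), ("US", us_html)):
    action = "click accept first" if has_cookie_banner(html) else "scrape directly"
    print(label, "->", action)
```

Checking first avoids relying on a failed click as the signal, at the cost of having to know how the banner is marked up on the target page.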

4 Likes

A bit late to the game since I focused on the Community Hacking Days for the 5.3 release. Let’s see if I can catch up xD

I added several other methods for educational and illustrative purposes, including Selenium and Palladian, which I see as the most advanced scraping nodes available by a wide margin. I am also working on an article, “How to scrape the web using Knime”.

If the AF-Utilities nodes get fixed, I will update the workflow accordingly.

1 Like
