Collecting KNIME forum data

Hi,
How can I get the links for all topics, so that I can collect them all?
Assume that I want to gather all the topic titles and their first posts. In the old forum there was a list of topics on separate pages, and one could use the next button to go through all topics automatically. But in this new forum, new topics only appear when one scrolls down.
Would you please help me with that?

Previous replies: Using Selenium nodes to login to the KNIME forum - #11 by Iris

Reply to @Iris: Thank you for the guide, but I couldn’t find any option to collect all topics of a category at once. Instead, in a response I found that I can use this link (https://forum.knime.com/c/knime-analytics-platform?page=x), where x starts from 0 for the latest topics and older topics are on the following pages. Now I have a new issue: in a browser (I tried Chrome and Firefox) I can enter new page numbers and new topics are shown, but in KNIME, using HttpRetriever in a loop, I always get the same topics for all pages.
I think I can solve this problem using Selenium nodes but I wonder what the problem is.

Best,
Armin

KNIME forum.knwf (16.9 KB)

(sorry, hadn’t seen this post before, probably b/c it was split from the other thread)

Hi Armin,

the reason is that the content on the follow-up pages is loaded dynamically via XHR / AJAX. The HttpRetriever only sees the static page, not the content that is added later by JavaScript. That’s the main motivation for the Selenium Nodes to exist.

In a nutshell, consider it like this:

  • HttpRetriever from Palladian: Pure downloading of pages via HTTP/HTTPS
  • Selenium: Complete browser logic.
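
To make the difference concrete, here is a minimal Python sketch (not KNIME, and not the forum's real markup). It assumes a hypothetical page where one topic is in the static HTML and the rest would be added by a script; `STATIC_HTML`, `loadMoreTopics`, and the `topic` class are made up for illustration. A plain HTTP download behaves like the parser below: it reads the markup as delivered and never runs the script.

```python
from html.parser import HTMLParser

# Hypothetical page as a server might deliver it: one topic is in the static
# HTML, further topics would only be added by JavaScript inside a browser.
STATIC_HTML = """
<html><body>
<ul id="topics">
  <li class="topic">Topic A</li>
</ul>
<script>
  // A browser would run this and append more topics; a plain HTTP client does not.
  loadMoreTopics("/more?page=1");
</script>
</body></html>
"""


class TopicCollector(HTMLParser):
    """Collects the text of <li class="topic"> elements present in the markup."""

    def __init__(self):
        super().__init__()
        self.topics = []
        self._in_topic = False

    def handle_starttag(self, tag, attrs):
        if tag == "li" and ("class", "topic") in attrs:
            self._in_topic = True

    def handle_endtag(self, tag):
        if tag == "li":
            self._in_topic = False

    def handle_data(self, data):
        # Script text also arrives here, but it is ignored because we are
        # not inside a topic element at that point.
        if self._in_topic and data.strip():
            self.topics.append(data.strip())


def static_topics(html):
    """Return the topics visible without executing any JavaScript."""
    parser = TopicCollector()
    parser.feed(html)
    return parser.topics


print(static_topics(STATIC_HTML))
```

Only the topic that is already in the markup is found. Anything the script would have added is missing, which matches the symptom of getting the same topics for every page.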

– Philipp

