Example - WebCrawler Workflow

Hi All,

I am a new user and just starting to learn KNIME. While trying out the examples, I could not find the "webcrawler workflow" mentioned in the "Knowledge Extraction from a Web Forum" white paper. Can anyone help me find the source for the workflow that crawls data from a forum? I am trying to build similar functionality as a learning exercise.

Thanks & Regards



Hi kichenin,

The workflows are available on the KNIME Example Server under 050_Applications/050007_ForumAnalysis, in the workflows folder. See https://www.knime.org/example-workflows for a description of how to connect to the example server. The crawling workflow requires the Palladian extension (http://tech.knime.org/community/palladian) and the XML extension.

I have also attached a small crawling example workflow. It loads the science page of the New York Times website (http://www.nytimes.com/pages/science/index.html) and extracts the titles, links, authors, and summaries of all articles on that page. This workflow requires the Palladian and XML extensions as well.
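
For anyone who wants to see the extraction idea outside of KNIME, here is a rough Python sketch of the same kind of step: parse a page and pull out title, link, author, and summary per article. It is not the attached workflow and does not use Palladian. The markup in SAMPLE_HTML, the class names "story", "byline", and "summary", and the names StoryParser and extract_stories are all made up for illustration; the real page structure will differ and has to be inspected first.

```python
from html.parser import HTMLParser

# Hypothetical markup, invented for this sketch. A real page will use
# different tags and class names.
SAMPLE_HTML = """
<div class="story">
  <h3><a href="/2015/01/01/science/first.html">First article</a></h3>
  <span class="byline">By A. Author</span>
  <p class="summary">Summary of the first article.</p>
</div>
<div class="story">
  <h3><a href="/2015/01/02/science/second.html">Second article</a></h3>
  <span class="byline">By B. Writer</span>
  <p class="summary">Summary of the second article.</p>
</div>
"""


class StoryParser(HTMLParser):
    """Collects one dict per <div class="story"> block (assumed layout)."""

    def __init__(self):
        super().__init__()
        self.stories = []      # finished records
        self._current = None   # record being filled, or None outside a story
        self._field = None     # which field the next text chunk belongs to

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        css = attrs.get("class") or ""
        if tag == "div" and css == "story":
            self._current = {"title": "", "link": "", "author": "", "summary": ""}
        elif self._current is None:
            return
        elif tag == "a" and "href" in attrs:
            # The link text is treated as the article title.
            self._current["link"] = attrs["href"]
            self._field = "title"
        elif tag == "span" and css == "byline":
            self._field = "author"
        elif tag == "p" and css == "summary":
            self._field = "summary"

    def handle_data(self, data):
        # Only keep text that falls inside a field we are tracking.
        if self._current is not None and self._field:
            self._current[self._field] += data.strip()

    def handle_endtag(self, tag):
        if tag in ("a", "span", "p"):
            self._field = None
        elif tag == "div" and self._current is not None:
            self.stories.append(self._current)
            self._current = None


def extract_stories(html):
    """Return a list of dicts with title, link, author, and summary."""
    parser = StoryParser()
    parser.feed(html)
    return parser.stories


if __name__ == "__main__":
    for story in extract_stories(SAMPLE_HTML):
        print(story)
```

To run this against a live page you would first download the HTML (for example with urllib.request.urlopen from the standard library) and adapt the tag and class checks to whatever the page actually uses.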

Cheers, Kilian

Hi Kilian,

Thanks for the reply. Actually, I was looking for the KNIME forum crawler workflow referred to in the forum analysis white paper. The example server folder does not contain that particular workflow. But I will try to use the Palladian example you provided as a basis and build my own crawler. Thanks again.