I am a new user and have just started learning KNIME. While trying out the examples, I could not find the "webcrawler workflow" mentioned in the "Knowledge Extraction from a Web Forum" white paper. Can anyone help me find the workflow that crawls data from a forum? I am trying to build similar functionality as a learning exercise.
The workflows are available on the KNIME Example Server under 050_Applications/050007_ForumAnalysis. The workflows are in the workflows folder. See https://www.knime.org/example-workflows for a description of how to connect to the example server. The crawling workflow requires the Palladian (http://tech.knime.org/community/palladian) and XML extensions.
In addition, please find a small example crawling workflow attached. The workflow loads the content of the science page of the New York Times website (http://www.nytimes.com/pages/science/index.html) and extracts the titles, links, authors, and summaries of all articles on that page. This workflow requires the Palladian and XML extensions as well.
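If it helps while building your own crawler, the extraction step described above (pulling titles, links, authors, and summaries out of a page) can be sketched outside KNIME in plain Python. This is only a rough illustration, not how the Palladian or XML nodes work internally. The sample HTML, the class names (`story`, `byline`, `summary`), and the helper `extract_articles` are all made up for the example and do not reflect the real New York Times markup.

```python
from html.parser import HTMLParser

# Hypothetical markup for illustration only; the real page is structured differently.
SAMPLE_HTML = """
<div class="story">
  <h3><a href="http://example.com/a1">First title</a></h3>
  <p class="byline">By Jane Doe</p>
  <p class="summary">Summary one.</p>
</div>
<div class="story">
  <h3><a href="http://example.com/a2">Second title</a></h3>
  <p class="byline">By John Roe</p>
  <p class="summary">Summary two.</p>
</div>
"""


class StoryParser(HTMLParser):
    """Collects one dict per <div class="story"> block (assumed structure)."""

    def __init__(self):
        super().__init__()
        self.articles = []
        self._current = None   # article currently being filled in
        self._field = None     # which field the next text chunk belongs to
        self._in_h3 = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        cls = attrs.get("class", "")
        if tag == "div" and cls == "story":
            self._current = {"title": "", "link": "", "author": "", "summary": ""}
        elif self._current is None:
            return
        elif tag == "h3":
            self._in_h3 = True
        elif tag == "a" and self._in_h3:
            # The headline link carries both the URL and the title text.
            self._current["link"] = attrs.get("href", "")
            self._field = "title"
        elif tag == "p" and cls == "byline":
            self._field = "author"
        elif tag == "p" and cls == "summary":
            self._field = "summary"

    def handle_data(self, data):
        if self._current is not None and self._field:
            self._current[self._field] += data.strip()

    def handle_endtag(self, tag):
        if tag in ("a", "p"):
            self._field = None
        elif tag == "h3":
            self._in_h3 = False
        elif tag == "div" and self._current is not None:
            # Simplification: assumes story blocks contain no nested <div>.
            self.articles.append(self._current)
            self._current = None


def extract_articles(html):
    """Return a list of dicts with title, link, author, and summary."""
    parser = StoryParser()
    parser.feed(html)
    return parser.articles


for article in extract_articles(SAMPLE_HTML):
    print(article)
```

For a real site you would first download the page and then adapt the tag and class checks to whatever markup that site actually uses. In the KNIME workflow that adaptation corresponds to adjusting the extraction expressions in the nodes.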
Thanks for the reply. Actually, I was looking for the KNIME forum crawler workflow referred to in the forum analysis white paper. The example workflow does not contain that particular workflow. But I will try to use the Palladian example you provided as a basis and build my own crawler. Thanks again.