Filter Hyperlinks from Website

Hi Everybody,

I have just started working with KNIME, and my first project is to analyze a website with job offers. I would like to extract all hyperlinks that point to the individual job postings.

I used the following workflow (see attachment).

The output of the HtmlParser is quite strange (see attachment), and the HtmlParser always reports this error: "Input of http(s) URLs is deprecated. Connect an upstream HttpRetriever node and parse the HttpResults instead."

 

I want to extract all links to the job offers and collect them. Which query do I need to put in the XmlPath node? :/

This is the site I want to extract the links from: https://recruiting.bmwgroup.de/ibs/Servlets/ibs/controller/sm

Thank you very much in advance for your help.

 

Cheers

 

Vanessa

Please keep related questions in one thread:

https://tech.knime.org/forum/knime-general/webcrawler-workflow
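
On the query itself: selecting every link in a parsed page usually comes down to an XPath along the lines of //a/@href, possibly with a namespace prefix if the parser emits XHTML. Below is a minimal sketch of that idea in Python rather than KNIME. The sample markup, the example.com base URL, and the extract_links helper are made up for illustration, since the actual structure of the job page is not shown in this thread.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# Assumption: the parsed page is well-formed XHTML in the standard XHTML namespace.
XHTML_NS = {"x": "http://www.w3.org/1999/xhtml"}

# Hypothetical sample document; the real job page will look different.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <a href="/jobs/1">Job 1</a>
    <a href="/jobs/2">Job 2</a>
    <a name="top">No link here</a>
  </body>
</html>"""


def extract_links(xhtml_text, base_url):
    """Return the absolute URL of every <a> element that has an href attribute."""
    root = ET.fromstring(xhtml_text)
    # Roughly equivalent to the XPath //x:a[@href] with "x" bound to the XHTML namespace.
    anchors = root.findall(".//x:a[@href]", XHTML_NS)
    # Resolve relative links against the page URL.
    return [urljoin(base_url, a.get("href")) for a in anchors]


if __name__ == "__main__":
    for link in extract_links(SAMPLE, "https://example.com/"):
        print(link)
```

To keep only the job offers, the same query would need an extra condition on the href value or on the surrounding elements, which depends on how the real page is built.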