XPath node for non-programmers who can't make any sense of it - helpful resources


I am a non-programmer who could not make any sense of how to use the Webpage Retriever node and then extract content from its output with the XPath node. The material I found on the KNIME forum was way too opaque for me. After struggling with this, I found a couple of online resources that proved really useful. The first is this cheatsheet: http://scraping.pro/res/xpath-cheat/xpath_css_dom_recipes.pdf, and the second is this XPath tutorial: http://zvon.org/comp/r/tut-XPath_1.html. After reading through them, and with a bit of trial and error, it all became considerably more pellucid, and I managed to extract content from several hundred online project web pages (each one having more or less the same structure) and get it all into a nice table. I thought I'd share this in case it is of interest or use to anyone who has had similar struggles with the XPath node…
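For anyone who wants to see the idea outside KNIME, here is a minimal sketch in Python of what "same structure on every page, one table row per item" looks like with XPath-style expressions. The markup, class names and the `extract_rows` helper are all invented for illustration, not taken from the pages I actually scraped. It also uses Python's built-in `xml.etree.ElementTree`, which understands only a subset of XPath and needs well-formed markup, so treat it as an illustration of the expressions rather than a replacement for the XPath node.

```python
import xml.etree.ElementTree as ET

# Hypothetical page: invented markup standing in for one project page.
PAGE = """
<html>
  <body>
    <div class="project">
      <h2>River cleanup</h2>
      <p class="summary">Removing litter from the river bank.</p>
    </div>
    <div class="project">
      <h2>School garden</h2>
      <p class="summary">Building raised beds with pupils.</p>
    </div>
  </body>
</html>
"""


def extract_rows(page_text):
    """Return one (title, summary) tuple per project block on the page."""
    root = ET.fromstring(page_text)
    rows = []
    # .//div[@class='project'] = every <div> at any depth whose class is "project"
    for project in root.findall(".//div[@class='project']"):
        # These paths are relative to the current project block.
        title = project.findtext("h2")
        summary = project.findtext("p[@class='summary']")
        rows.append((title, summary))
    return rows


rows = extract_rows(PAGE)
for row in rows:
    print(row)
```

The point is that one expression selects the repeating block and a couple of short relative expressions pick out the columns, which is why pages with the same structure end up as a tidy table.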



Did you know that you can click on the elements in the preview to generate an XPath expression automatically? Sometimes you have to adjust it a bit, but it is often quite helpful for getting an expression that is in the ballpark of what you are trying to extract.
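As a rough illustration of the kind of adjustment that can be needed: I am assuming here that a generated expression points at the one element you clicked using a position, something like `/html/body/div[2]/h2`. That exact form is an assumption, not something taken from the KNIME documentation. The sketch below is Python with the built-in `xml.etree.ElementTree` and invented markup; it shows how dropping the position widens the match from one element to all of the similar ones.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup; this parser needs well-formed input.
PAGE = (
    "<html><body>"
    "<div><h2>First</h2></div>"
    "<div><h2>Second</h2></div>"
    "<div><h2>Third</h2></div>"
    "</body></html>"
)
root = ET.fromstring(PAGE)

# Assumed generated form: /html/body/div[2]/h2
# ElementTree paths are relative to the root <html> element,
# so the leading /html is dropped here.
clicked_only = [h.text for h in root.findall("body/div[2]/h2")]

# Adjusted form: remove the [2] so every <div> is matched.
all_titles = [h.text for h in root.findall("body/div/h2")]

print(clicked_only)
print(all_titles)
```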

