xpath node for non-programmers who can't make any sense of it - helpful resources

Rich_ard · March 18, 2020, 8:01pm

Hi,

I am a non-programmer who could not make any sense of how to use the Webpage Retriever node and then extract content from the result using the XPath node. The material I found on the KNIME forum was way too opaque for me. After struggling with this, I found a couple of online resources that proved really useful. The first is this cheatsheet: http://scraping.pro/res/xpath-cheat/xpath_css_dom_recipes.pdf, and the second is this XPath tutorial: http://zvon.org/comp/r/tut-XPath_1.html. After reading through these, and with a bit of trial and error, it all became considerably more pellucid, and I managed to extract content from several hundred online project web pages (each one having more or less the same structure) and get it all into a nice table. Thought I’d share this in case it might be of interest or use to anyone who has had similar struggles with the XPath node…
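
For anyone who wants to see the general idea outside KNIME, here is a rough sketch in Python. It is not how the XPath node works internally, just an illustration of "one expression per column, applied to many pages with the same structure". The page markup, the class names and the column names are all made up for the example. Python's built-in `xml.etree.ElementTree` only supports a subset of XPath 1.0, and real web pages are rarely well-formed XML, so treat this as a toy under those assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical project pages that share one structure (invented for illustration).
PAGES = [
    """<html><body>
         <h1 class="title">Project A</h1>
         <div class="meta"><span class="lead">Alice</span><span class="budget">1000</span></div>
       </body></html>""",
    """<html><body>
         <h1 class="title">Project B</h1>
         <div class="meta"><span class="lead">Bob</span><span class="budget">2500</span></div>
       </body></html>""",
]

# One XPath-style expression per output column (column names are assumptions).
COLUMNS = {
    "title": ".//h1[@class='title']",
    "lead": ".//div[@class='meta']/span[@class='lead']",
    "budget": ".//div[@class='meta']/span[@class='budget']",
}


def extract_row(page_source):
    """Parse one page and return a dict with one entry per column."""
    root = ET.fromstring(page_source)
    row = {}
    for name, path in COLUMNS.items():
        node = root.find(path)  # first element matching the expression, or None
        row[name] = node.text.strip() if node is not None and node.text else None
    return row


# Apply the same expressions to every page to build the table rows.
rows = [extract_row(page) for page in PAGES]
for row in rows:
    print(row)
```

Each expression reads as "anywhere below the root (`.//`), find this tag with this attribute value (`[@class='...']`)", which is the kind of pattern the cheatsheet and tutorial above explain in more detail.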

R.

AlexanderFillbrunn · March 20, 2020, 2:21pm

Hi,
did you know that you can click on the elements in the preview to generate an XPath expression automatically? Sometimes you have to adjust it a bit, but it is often quite helpful for getting an expression that is in the ballpark of what you are trying to extract.

Kind regards
Alexander
