Yesterday I found out how to scrape a web page with KNIME. Awesome!
Yet I have no clue how to split up this path, since the site has spread the body text across many elements.
Here is an excerpt of the code I'm struggling with: https://pastebin.com/fdpBGpwG
and the whole code of the page can be found here:
You may need to follow very different approaches depending on the data and the design of the web pages you want to import. Because pages are built with different methods, you need to pick appropriate methods and tools once you understand how the page responds to requests and the logic of its designer.
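For the case in the question, where the body text is spread over many elements, the usual idea is to select all of the text-carrying elements under one container and join their text. Below is a minimal sketch of that idea in Python rather than as a KNIME node configuration. The markup, the `article` class name, and the `collect_body_text` helper are all made up for illustration, and the standard-library parser used here only accepts well-formed markup, so a real page would likely need an HTML-tolerant parser first.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed excerpt standing in for the real page.
SAMPLE = """
<html><body>
  <div class="article">
    <p>First part of the body.</p>
    <p>Second part with <b>inline</b> markup.</p>
    <div class="ad">Unrelated block</div>
    <p>Third part.</p>
  </div>
</body></html>
"""

def collect_body_text(markup: str, container_class: str = "article") -> str:
    """Join the text of every <p> that sits directly under the container div."""
    root = ET.fromstring(markup)
    # One path selects all paragraphs under the container, however many there are.
    paragraphs = root.findall(f".//div[@class='{container_class}']/p")
    parts = []
    for p in paragraphs:
        # itertext() also picks up text inside nested inline tags such as <b>.
        text = "".join(p.itertext())
        # Collapse runs of whitespace so the pieces join cleanly.
        parts.append(" ".join(text.split()))
    return " ".join(parts)

print(collect_body_text(SAMPLE))
```

The point of the sketch is that the path does not have to be divided per element: one query that matches all the paragraphs, followed by a join, is often enough.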
I collect data from many different sources, and some websites can be very challenging.
You can see examples at the bottom of the page below.
Those are paid nodes. I love KNIME for letting me experiment for free in my free time.
But I'm making progress with the Palladian nodes. Setting up a loop now.
If only I could remove the 404 errors from the results…
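The filtering step I have in mind would look roughly like this, sketched in Python rather than with nodes. It assumes each fetched page comes back with its HTTP status code next to it. The `FetchResult` record, its field names, and the example URLs are hypothetical, not what any particular node outputs.

```python
from typing import List, NamedTuple

class FetchResult(NamedTuple):
    """Hypothetical record for one fetched page."""
    url: str
    status: int
    body: str

def drop_not_found(results: List[FetchResult]) -> List[FetchResult]:
    """Keep every result except those that came back with HTTP 404."""
    return [r for r in results if r.status != 404]

# Made-up results from a loop over three pages.
results = [
    FetchResult("https://example.com/page/1", 200, "<html>...</html>"),
    FetchResult("https://example.com/page/2", 404, ""),
    FetchResult("https://example.com/page/3", 200, "<html>...</html>"),
]

kept = drop_not_found(results)
print([r.url for r in kept])
```

In a workflow, the equivalent would presumably be a row filter on the status column, if the retriever node exposes one.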