Scraping text from a website / list of websites

I used the pagecrawlerwithpagination workflow from the example server. However, it fails with an error because the HTTP Retriever node returns no result. The only thing I changed is the HTTP address in the initial Table Creator node.

Kindly clarify

The workflow on the example server is likely outdated, but there’s a new version on NodePit. I also suggest you have a look here:

If you still encounter errors, please post your modified workflow and/or the error messages which you see.

– Philipp

@SridharVenu Did it work?

No, it doesn’t work.

The articlelist column comes back blank, so the subsequent nodes return empty output as well.

I tested this against the http://baputrust.com website.

I had already replied much earlier by email from my mailbox.

When I enter the URL in the initial node and execute the HTTP Retriever, I get a 200 (i.e. successful) response, and the HTML Parser parses the DOM properly. The subsequent nodes of course need to be adapted to the website in question, as the workflow descriptions clearly state.
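
To illustrate why a successful response can still end in an empty column, here is a minimal Python sketch. It is not the workflow itself. It assumes the later nodes pick elements with an XPath-style query, and the markup, the `articlelist` class on a `div`, the `news` list, and the `extract_links` helper are all made up for the example. A query written for one page layout matches nothing on a page that is structured differently, even though both pages load and parse fine.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed markup standing in for two different sites.
ORIGINAL_SITE = """
<html><body>
  <div class="articlelist">
    <a href="/post/1">First</a>
    <a href="/post/2">Second</a>
  </div>
</body></html>
"""

OTHER_SITE = """
<html><body>
  <ul id="news">
    <li><a href="/news/a">A</a></li>
    <li><a href="/news/b">B</a></li>
  </ul>
</body></html>
"""


def extract_links(markup, query):
    """Return the href of every element matched by the XPath-style query."""
    root = ET.fromstring(markup.strip())
    return [a.get("href") for a in root.findall(query)]


# Query written for the original site's layout (assumed for illustration).
ORIGINAL_QUERY = ".//div[@class='articlelist']//a"
# Query adapted to the other site's layout.
ADAPTED_QUERY = ".//ul[@id='news']//a"

print(extract_links(ORIGINAL_SITE, ORIGINAL_QUERY))  # ['/post/1', '/post/2']
print(extract_links(OTHER_SITE, ORIGINAL_QUERY))     # []
print(extract_links(OTHER_SITE, ADAPTED_QUERY))      # ['/news/a', '/news/b']
```

The second call is the situation described above: the page parses, but the unchanged query finds nothing, so everything downstream is empty until the query is adapted to the new site's structure.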

I didn’t send you any email.

– Philipp