HttpRetriever Execution Cancelled

Hi everyone,

Trying to use HttpRetriever and Boilerpipe to get the text from a list of pages.

Current setup:

A String Manipulation node builds a Boilerpipe URL for each page, like this:

http://boilerpipe-web.appspot.com/extract?url=http://facebook.com&extractor=ArticleExtractor&output=text&extractImages=

When I pass that into the HttpRetriever, I get a warning that says "Execution Cancelled".
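One thing that might be worth ruling out is that the page URL sits unencoded inside the query string of the Boilerpipe URL. Below is a minimal sketch in Python (not the KNIME node itself) of building the same request with the inner URL percent-encoded. `build_boilerpipe_url` is a hypothetical helper name, and whether encoding has anything to do with the warning is only an assumption.

```python
from urllib.parse import urlencode

# Endpoint taken from the example URL above.
BOILERPIPE_ENDPOINT = "http://boilerpipe-web.appspot.com/extract"


def build_boilerpipe_url(page_url, extractor="ArticleExtractor", output="text"):
    """Hypothetical helper: build an extract URL with encoded query values."""
    params = {
        "url": page_url,        # the page to extract, percent-encoded by urlencode
        "extractor": extractor,
        "output": output,
        "extractImages": "",    # left empty, as in the original URL
    }
    return BOILERPIPE_ENDPOINT + "?" + urlencode(params)


print(build_boilerpipe_url("http://facebook.com"))
```

The only difference from the hand-built URL is that `http://facebook.com` becomes `http%3A%2F%2Ffacebook.com`, so characters such as `&` or `?` in a page URL cannot be mistaken for part of the outer query string.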

Anyone got any good troubleshooting tips, or ideas about what I may be doing wrong?

Thanks!

Adam

Hi Adam,

It's difficult to debug this without the workflow. Could you attach your workflow with some sample data, so that I can have a look?

Best,
Philipp

PS: We provide a node for content extraction from web pages (ContentExtractor); you might consider using this one instead of Boilerpipe. It doesn't require any web API access, and during our evaluations it was more reliable than the Boilerpipe algorithm.

I'll give that a go, thanks Philipp!
