HttpRetriever

I was using the old HttpRetriever node to download web pages, but after the update the new HttpRetriever does not work. Is it an update bug or ...?

 

Hard to tell without any details.

Best,
Philipp

Here is one simple workflow: Table Creator --> HttpRetriever --> HtmlParser

In the Table Creator I just type any website. When I run this workflow, the error message I get is:

WARN HttpRetrieverCellFactory Error retrieving https://www.google.co.krXXXX: Exception org.apache.http.conn.HttpHostConnectException: Connection to https://www.google.co.kr refused for URL https://www.google.co.kr/XXX: Connection to https://www.google.co.kr refused

 

Before, I was using the deprecated HttpRetriever with no problem, but now neither node works.

Does that problem happen for all URLs you're trying to access?

Do you use a specific proxy configuration (company network, etc.)?

Can you access the mentioned URL within your browser?

1. It happens with all the URLs I tried, all of them well-known websites.

2. Yes, I am using a company network, but those URLs are not blocked.

3. I can access the URLs if I type them directly into my IE browser.
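For anyone hitting the same symptom, here is a rough sketch (outside of KNIME, in plain Python) of one way to check whether "connection refused" comes from the direct route rather than from the site itself. It tries a direct TCP connection to the target host and, if the operating system reports a proxy, a TCP connection to that proxy as well. The function names `host_and_port`, `can_connect` and `diagnose` are made up for this example and are not part of KNIME or the HttpRetriever node; the script only looks at the system proxy settings visible to Python, which may differ from what the browser or KNIME uses.

```python
import socket
from urllib.parse import urlsplit
from urllib.request import getproxies


def host_and_port(url):
    """Return the (host, port) pair a client would connect to for this URL."""
    parts = urlsplit(url)
    # Fall back to the default port of the scheme when none is given.
    port = parts.port or (443 if parts.scheme == "https" else 80)
    return parts.hostname, port


def can_connect(host, port, timeout=5.0):
    """Return True if a plain TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures alike.
        return False


def diagnose(url):
    """Print whether the target and the system proxy (if any) are reachable."""
    host, port = host_and_port(url)
    direct = can_connect(host, port)
    print(f"direct connection to {host}:{port}: {'ok' if direct else 'failed'}")

    # getproxies() reads proxy settings from the environment or the OS.
    proxy = getproxies().get(urlsplit(url).scheme)
    if proxy is None:
        print("no system proxy found for this scheme")
        return
    if "://" not in proxy:
        proxy = "http://" + proxy  # assumption: a bare host:port entry
    p_host, p_port = host_and_port(proxy)
    via_proxy = can_connect(p_host, p_port)
    print(f"connection to proxy {p_host}:{p_port}: {'ok' if via_proxy else 'failed'}")


if __name__ == "__main__":
    diagnose("http://www.yahoo.com/")
```

If the direct connection fails while the proxy is reachable, the machine most likely can only get out through the proxy, so a client that ignores the proxy settings would see exactly this kind of refused connection even though the browser works.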

 

All this happened just after the latest node update, about one week ago.

Could you please enable DEBUG logging in KNIME's preferences and append the output here?

WARN HttpRetrieverCellFactory Error retrieving http://www.yahoo.com/: Exception org.apache.http.conn.HttpHostConnectException: Connection to http://www.yahoo.com refused for URL "http://www.yahoo.com/": Connection to http://www.yahoo.com refused

There was a regression concerning proxy handling in recent versions. It should be fixed with today's update. Let me know if it works for you.
