HttpRetriever

I was using the old HttpRetriever to download webpages, but after the update the new HttpRetriever does not work. Is it an update bug or ...?

 

Hard to tell without any details.

Best,
Philipp

Here is a simple workflow: Table Creator --> HttpRetriever --> HtmlParser

In the Table Creator I just type in any website. When I run the workflow, this is the error message I get:

WARN HttpRetrieverCellFactory Error retrieving https://www.google.co.krXXXX: Exception org.apache.http.conn.HttpHostConnectException: Connection to https://www.google.co.kr refused for URL https://www.google.co.kr/XXX: Connection to https://www.google.co.kr refused

 

Before, I was using the deprecated HttpRetriever with no problems, but now neither node works.

Does that problem happen for all URLs you're trying to access?

Do you use a specific proxy configuration (company network, etc.)?

Can you access the mentioned URL within your browser?

1. It happens with all URLs I tried, including well-known websites.

2. Yes, I am using a company network, but those URLs are not blocked.

3. I can access the URLs if I type them directly into my IE browser.

 

All of this started right after the latest node update, about one week ago.

Could you please enable DEBUG logging in KNIME's prefs and append the output here?

WARN HttpRetrieverCellFactory Error retrieving http://www.yahoo.com/: Exception org.apache.http.conn.HttpHostConnectException: Connection to http://www.yahoo.com refused for URL "http://www.yahoo.com/": Connection to http://www.yahoo.com refused

There was a regression concerning proxy handling in recent versions. It should be fixed with today's update. Let me know if it works for you.
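For anyone who sees the same "Connection ... refused" behind a company proxy while the browser works fine: a browser such as IE normally picks up the system proxy settings, whereas a Java program only uses a proxy if one is configured for the JVM. The following is a minimal sketch, not the node's actual code, that prints which proxy the JVM's default ProxySelector would choose for a URL. The class name ProxyCheck and the method describeProxies are made up for illustration, and whether HttpRetriever consults these JVM settings at all is an assumption that this thread does not confirm.

```java
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.ProxySelector;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of KNIME or the HttpRetriever node.
final class ProxyCheck {

    // Returns one entry per proxy the JVM's default selector proposes for the
    // URL, e.g. "DIRECT" or "HTTP <host>:<port>". No connection is opened.
    static List<String> describeProxies(String url) {
        List<String> result = new ArrayList<>();
        for (Proxy proxy : ProxySelector.getDefault().select(URI.create(url))) {
            if (proxy.type() == Proxy.Type.DIRECT) {
                // DIRECT means the JVM would connect without any proxy.
                result.add("DIRECT");
            } else {
                InetSocketAddress address = (InetSocketAddress) proxy.address();
                result.add(proxy.type() + " " + address.getHostString() + ":" + address.getPort());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // URL taken from the log output above; pass another one as an argument.
        String url = args.length > 0 ? args[0] : "http://www.yahoo.com/";
        System.out.println(url + " -> " + describeProxies(url));
    }
}
```

If this reports DIRECT on a network that only allows outbound traffic through a proxy, a refused connection like the one in the log would be the expected result. Running it with the standard JVM properties, e.g. -Dhttp.proxyHost=... -Dhttp.proxyPort=..., should change what it reports.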