HttpRetriever SSLException - hostname in certificate didn’t match

I’m suddenly having a problem with my HttpRetriever node. This is the curl command that successfully gets the data I need:

curl "https://variable.site1.com/detail/itemDetail.htm?itemId=12345&callback=setVariable" -H "Referer: https://detail.site2.com/item.htm?id=12345"

This has been working fine in KNIME for a few weeks but suddenly yesterday it stopped working (it still works in curl). When I dug into the problem I found that the HttpRetriever node is hitting a Java SSLException “hostname in certificate didn’t match”.

Here’s the complete error from the log file:

WARN    HttpRetrieverCellFactory          Error retrieving https://variable.site1.com/detail/itemDetail.htm?itemId=12345&callback=setVariable: Exception javax.net.ssl.SSLException: hostname in certificate didn't match: <variable.site1.com> != <*.site2.com> OR <*.site2.com> for URL "https://variable.site1.com/detail/itemDetail.htm?itemId=12345&callback=setVariable": hostname in certificate didn't match: <variable.site1.com> != <*.site2.com> OR <*.site2.com>

You can see that a site2 page is requesting data from a site1 service (like it always has) and that this has suddenly become a problem. Note that the domains site1.com and site2.com are both part of the same group of services.
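For anyone following along, the check that fails here can be sketched roughly as below. This is only an illustration of wildcard hostname matching, not the code that Palladian or the Apache HTTP Client actually runs; the class and method names (HostnameMatchSketch, matchesPattern, matchesAny) are made up for this example, and the names passed in are the ones from the log message above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class HostnameMatchSketch {

    /** Returns true if the hostname matches one certificate name, which may start with "*.". */
    public static boolean matchesPattern(String hostname, String pattern) {
        String host = hostname.toLowerCase(Locale.ROOT);
        String pat = pattern.toLowerCase(Locale.ROOT);
        if (!pat.startsWith("*.")) {
            // No wildcard: the names must be identical.
            return host.equals(pat);
        }
        // "*.site2.com" becomes the required suffix ".site2.com".
        String suffix = pat.substring(1);
        if (!host.endsWith(suffix)) {
            return false;
        }
        // Assumption: the wildcard stands for exactly one non-empty label.
        String label = host.substring(0, host.length() - suffix.length());
        return !label.isEmpty() && !label.contains(".");
    }

    /** Returns true if the hostname matches any of the names presented by the certificate. */
    public static boolean matchesAny(String hostname, List<String> certNames) {
        for (String name : certNames) {
            if (matchesPattern(hostname, name)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // The two names reported in the exception text.
        List<String> presented = Arrays.asList("*.site2.com", "*.site2.com");
        System.out.println(matchesAny("variable.site1.com", presented)); // false
        System.out.println(matchesAny("detail.site2.com", presented));   // true
    }
}
```

With the names from the log, the requested host variable.site1.com can never match *.site2.com, so the exception is what you would expect whenever the client is handed a certificate that lists only the site2 wildcard.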

The KNIME node doesn’t give me a lot of choices. I’ve tried accepting self-signed certificates, and I’ve tried adding a useful-looking header: -H "Host: variable.site1.com". I'm now looking to see if there is something I can do at a system level, such as adding an entry to the Windows hosts file, that would get around this exception.

Any ideas?

 

A colleague found this enhancement request in the Apache JIRA to support alternative names when checking the hostname against the certificate:

https://issues.apache.org/jira/browse/HTTPCLIENT-614

The problem is fixed in the Apache HTTP Client Version 4.0 and the current version is now 4.5.

I looked around the Palladian source code and it wasn't clear whether or not Palladian is using the Apache HTTP Client. But if it is then perhaps it needs to be updated to at least version 4.0?

Doing an SSL handshake with my problem host "variable.site1.com" succeeds, and the certificate is verified against that hostname:

Server certificate:
*      start date: 2015-06-25 09:56:06 GMT
*      expire date: 2015-12-26 15:59:59 GMT
*      subjectAltName: variable.site1.com matched
*      SSL certificate verify ok.

This implies that the certificate is indeed valid and that the HTTP Client should match it.

 

My apologies. This question was held in a "Revision pending" state for an indefinite period of time. Thinking I had made some kind of mistake when entering the question the first time, and not being able to resolve it, I posted the question a second time. But that second question is now also in a "Revision pending" state. I am very sympathetic to the problem of keeping spam off the site, but I am at a loss as to how to ensure a legitimate question gets posted. I'm sorry for the trouble.

Dear Edlueze,

we are sorry for overlooking your post. We go over our pending posts daily, but unfortunately we sometimes overlook a real post in between.

I published both of your posts; I think they are duplicates. If your posts ever get stuck, feel free to contact me directly or via our contact page https://www.knime.org/contact

Best, Iris

Thanks Iris!

Not to worry - I was able to connect with the folks from Palladian and their support continues to be amazing!

Palladian added a quick patch to the KNIME nightly builds for the HttpRetriever node that replaced the "Accept self-signed certificates" option with an "Accept all certificates" option. This works fine for me. I don't know if they are also considering other ways to handle this exception in the future.
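I have not seen the patch itself, so the following is only a guess at what an "Accept all certificates" switch generally amounts to in plain Java (JSSE), not Palladian's actual code. The class and method names (AcceptAllSketch, trustAllContext, acceptAnyHostname) are invented for this sketch. Note that it switches off both the certificate chain check and the hostname check, so it should only be pointed at hosts you already trust.

```java
import java.security.GeneralSecurityException;
import java.security.cert.X509Certificate;
import javax.net.ssl.HostnameVerifier;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLSession;
import javax.net.ssl.TrustManager;
import javax.net.ssl.X509TrustManager;

public class AcceptAllSketch {

    /** Builds an SSLContext whose trust manager accepts every certificate chain. INSECURE. */
    public static SSLContext trustAllContext() {
        TrustManager[] trustAll = { new X509TrustManager() {
            @Override
            public void checkClientTrusted(X509Certificate[] chain, String authType) {
                // Deliberately empty: nothing is rejected.
            }

            @Override
            public void checkServerTrusted(X509Certificate[] chain, String authType) {
                // Deliberately empty: nothing is rejected.
            }

            @Override
            public X509Certificate[] getAcceptedIssuers() {
                return new X509Certificate[0];
            }
        } };
        try {
            SSLContext context = SSLContext.getInstance("TLS");
            // No key managers, the trust-all manager, default source of randomness.
            context.init(null, trustAll, null);
            return context;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Could not create SSL context", e);
        }
    }

    /** Returns a verifier that skips the hostname check entirely. INSECURE. */
    public static HostnameVerifier acceptAnyHostname() {
        return new HostnameVerifier() {
            @Override
            public boolean verify(String hostname, SSLSession session) {
                return true;
            }
        };
    }
}
```

With the standard javax.net.ssl.HttpsURLConnection, these would be applied through setSSLSocketFactory(trustAllContext().getSocketFactory()) and setHostnameVerifier(acceptAnyHostname()). Trusting every certificate chain does not by itself silence a hostname mismatch on that class, which is why the sketch includes the second piece.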