Cannot connect to Kerberos-enabled Cloudera Impala

Hello

We are trying to connect to an Impala instance on a Cloudera 5.14 cluster secured by MIT Kerberos, using KNIME AP 4.0, but we have not been successful so far.
We first set up the Kerberos configuration in Preferences. “Validate” and “Log in” work as expected.

We have tried setting up 4 Connectors as follows:
1. Using the built-in JDBC Impala driver, with JDBC Parameters
kerberosAuthType=fromSubject
principal=impala/<hostname>@<REALM>
ssl=false

The error message when trying to Execute the Connector is:

ERROR Impala Connector 0:1 Execute failed: Could not open client transport with JDBC Uri: jdbc:hive2://***:21050/***;ssl=false;kerberosAuthType=fromSubject;principal=impala/***@***: null

2. Same setup as above, using the built-in JDBC Impala driver but with SSL enabled, with JDBC Parameters:
kerberosAuthType=fromSubject
principal=impala/<hostname>@<REALM>
ssl=true

The error message when trying to Execute the Connector is:

ERROR Impala Connector 0:1 Execute failed: Could not open client transport with JDBC Uri: jdbc:hive2://***:21050/***;ssl=true;kerberosAuthType=fromSubject;principal=impala/***@***: javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target

The Impala node is using a self-signed certificate which has been imported to the Java keystore via the Java Control Panel on Windows 10.

3. Using the Impala JDBC driver from Cloudera, v2.6, with JDBC parameters:
AuthMech=1
KrbHostFQDN=<hostname>
KrbRealm=<realm>
KrbServiceName=impala

Again, we cannot get a connection, with the error:
ERROR Impala Connector 0:1 Execute failed: [Cloudera][ImpalaJDBCDriver](500164) Error initialized or created transport for authentication: [Cloudera][ImpalaJDBCDriver](500169) Unable to connect to server: [Cloudera][ImpalaJDBCDriver](500591) Kerberos Authentication failed..

4. Using the Impala JDBC driver v2.5, with JDBC Parameters as above:
AuthMech=1
KrbHostFQDN=<hostname>
KrbRealm=<realm>
KrbServiceName=impala

Connecting fails with the error:
ERROR Impala Connector 0:8 Execute failed: [Simba][ImpalaJDBCDriver](500310) Invalid operation: Unable to obtain password from user
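
For reference, the error messages in attempts 1 and 2 show that the built-in driver appends the JDBC parameters to the URI as semicolon-separated key=value pairs. A minimal sketch of that composition follows, assuming nothing beyond the URI shape visible in the errors. The class name `Hive2UrlSketch`, the host, the database and the realm are hypothetical placeholders, not values from our cluster.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: rebuilds the jdbc:hive2 URI shape seen in the error messages. */
final class Hive2UrlSketch {

    /** Joins host, port, database and parameters as host:port/db;key=value;... */
    static String buildUrl(String host, int port, String database, Map<String, String> params) {
        StringBuilder url = new StringBuilder("jdbc:hive2://")
                .append(host).append(':').append(port).append('/').append(database);
        for (Map.Entry<String, String> p : params.entrySet()) {
            url.append(';').append(p.getKey()).append('=').append(p.getValue());
        }
        return url.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the parameters in insertion order.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("ssl", "false");
        params.put("kerberosAuthType", "fromSubject");
        // Hypothetical principal; replace hostname and realm with real values.
        params.put("principal", "impala/impala.example.com@EXAMPLE.COM");
        System.out.println(buildUrl("impala.example.com", 21050, "default", params));
    }
}
```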

We can connect in other apps, with ODBC drivers, without problems.

What is the recommended way of connecting to such a setup?

Hi @nasospat

welcome to the KNIME community!

Concerning your issues: I am assuming that your Impala is set up in such a way that it expects clients to use SSL. This means that attempts 1, 3 and 4 cannot work, because they are not connecting with SSL.

“The Impala node is using a self-signed certificate which has been imported to the Java keystore via the Java Control Panel on Windows 10.”

KNIME includes its own Java Runtime Environment (it was Oracle Java last time I checked). The Java Control Panel on Windows 10 has no effect on the Java runtime that is included in KNIME.
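
To see which truststore a particular Java runtime consults by default, a small diagnostic along the following lines may help. It is only a sketch and assumes the standard JSSE lookup order: the `javax.net.ssl.trustStore` system property first, then `jssecacerts`, then `cacerts` under `<java.home>/lib/security`. The class name `TrustStoreLocator` is hypothetical. Run it with the `java` executable of the runtime you want to inspect.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Reports which truststore file the running JVM would use by default. */
final class TrustStoreLocator {

    /** Follows the assumed JSSE order: system property, jssecacerts, cacerts. */
    static Path defaultTrustStore() {
        // An explicit -Djavax.net.ssl.trustStore=... setting takes precedence.
        String override = System.getProperty("javax.net.ssl.trustStore");
        if (override != null && !override.isEmpty()) {
            return Paths.get(override);
        }
        Path securityDir = Paths.get(System.getProperty("java.home"), "lib", "security");
        // jssecacerts wins over cacerts when it exists.
        Path jssecacerts = securityDir.resolve("jssecacerts");
        if (Files.exists(jssecacerts)) {
            return jssecacerts;
        }
        return securityDir.resolve("cacerts");
    }

    public static void main(String[] args) {
        Path store = defaultTrustStore();
        System.out.println("java.home  = " + System.getProperty("java.home"));
        System.out.println("truststore = " + store + " (exists: " + Files.exists(store) + ")");
    }
}
```

If the printed `java.home` is not inside the KNIME installation folder, the certificate was imported into a different runtime than the one KNIME uses.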

There are two solutions to this:

  1. Make a new truststore file (= Java Keystore in JKS format) that contains Impala’s self-signed certificate and mark it as trusted. Then tell the JDBC driver about it:
    • The built-in driver needs the ssl, sslTrustStore and trustStorePassword parameters (see [1])
    • The Cloudera driver needs the SSL, SSLTrustStore and SSLTrustStorePwd parameters (please consult the Installation and Configuration Guide PDF that comes with the driver; it contains all the details)
  2. Add the certificate to the global truststore that is part of the JRE that KNIME uses. You can find the truststore under <knime-installation-folder>/plugins/org.knime.binary.jre.<os>.x86_64_<version>/jre/lib/security/cacerts. Please note that you have to repeat this after updating KNIME (whenever we update our JRE).
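
The first solution can be sketched with the standard `java.security.KeyStore` API instead of a command-line tool. This is only a minimal illustration, assuming the Impala certificate has already been exported to a file in PEM or DER form. The class name `ImpalaTrustStoreBuilder`, the alias `impala` and the command-line arguments are hypothetical choices, not something KNIME or the drivers require.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.GeneralSecurityException;
import java.security.KeyStore;
import java.security.cert.Certificate;
import java.security.cert.CertificateFactory;

/** Writes a new JKS truststore that contains a single trusted certificate. */
final class ImpalaTrustStoreBuilder {

    static void createTrustStore(Path certFile, Path trustStoreFile, char[] password, String alias)
            throws IOException, GeneralSecurityException {
        // Parse the exported certificate (PEM or DER encoded X.509).
        Certificate cert;
        try (InputStream in = Files.newInputStream(certFile)) {
            cert = CertificateFactory.getInstance("X.509").generateCertificate(in);
        }
        // Start from an empty JKS keystore and add the certificate as a trusted entry.
        KeyStore trustStore = KeyStore.getInstance("JKS");
        trustStore.load(null, null);
        trustStore.setCertificateEntry(alias, cert);
        // Persist the truststore, protected by the given password.
        try (OutputStream out = Files.newOutputStream(trustStoreFile)) {
            trustStore.store(out, password);
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 3) {
            System.err.println("usage: ImpalaTrustStoreBuilder <cert-file> <truststore-file> <password>");
            return;
        }
        createTrustStore(Paths.get(args[0]), Paths.get(args[1]), args[2].toCharArray(), "impala");
        System.out.println("Wrote truststore to " + args[1]);
    }
}
```

The resulting file and password are what the truststore parameters of the JDBC driver would then point to.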

[1] HiveServer2 Clients - Apache Hive - Apache Software Foundation

Björn

Hello Björn,

We followed your instructions and it worked. We can now connect successfully.
Thank you for your help!

Hi Björn,
Up to version 4.3, my Impala node configuration worked, but after updating KNIME to 4.7 I get this error on the console:

[Cloudera]HiveJDBCDriver Unable to connect to server: GSS initiate failed.

The driver I am currently using is Cloudera JDBC 2.6.0 with Kerberos authentication. The settings are saved in my workspace, so I can start with a fresh installation of KNIME without repeating all the tedious setup. Indeed, after trying unsuccessfully with KNIME 4.7, I uninstalled it and reinstalled version 4.3, and everything started working again.

Does this give you any idea what I should do in order to use the latest version of KNIME?

Thank you in advance for your help.

Francesco

Hi @francesco.bindella,

This might be related to the Java upgrade in KNIME and the deprecation of some encryption types that are a known security risk. The following thread might help to understand the problem: connect HDFS WITH KERBEROS

You have to choose a recent and secure encryption type in your Kerberos config. You are welcome to open a new thread if you need additional help.
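
As a rough aid for reviewing a Kerberos config, the sketch below scans krb5.conf text for encryption types that are commonly considered weak. It rests on assumptions: the list of weak types (DES, 3DES and RC4 variants) and the three setting names checked are my own selection, not something confirmed in this thread, and the class name `Krb5EnctypeCheck` is hypothetical. Your KDC administrator has the final word on which types are acceptable.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Flags enctype settings in krb5.conf text that name assumed-weak encryption types. */
final class Krb5EnctypeCheck {

    // Assumption: DES, 3DES and RC4 variants are treated as weak here.
    private static final Set<String> WEAK = new HashSet<>(Arrays.asList(
            "des-cbc-crc", "des-cbc-md4", "des-cbc-md5",
            "des3-cbc-sha1", "des3-hmac-sha1",
            "arcfour-hmac", "arcfour-hmac-md5", "rc4-hmac"));

    // krb5.conf settings that list encryption types.
    private static final Set<String> SETTINGS = new HashSet<>(Arrays.asList(
            "default_tkt_enctypes", "default_tgs_enctypes", "permitted_enctypes"));

    /** Returns one "setting: enctype" entry per weak type found. */
    static List<String> findWeakEnctypes(String krb5Conf) {
        List<String> findings = new ArrayList<>();
        for (String rawLine : krb5Conf.split("\\R")) {
            String line = rawLine.trim();
            // Skip comment lines.
            if (line.startsWith("#") || line.startsWith(";")) {
                continue;
            }
            int eq = line.indexOf('=');
            if (eq < 0) {
                continue;
            }
            String key = line.substring(0, eq).trim();
            if (!SETTINGS.contains(key)) {
                continue;
            }
            // Values are separated by whitespace and/or commas.
            for (String type : line.substring(eq + 1).trim().split("[\\s,]+")) {
                if (WEAK.contains(type.toLowerCase(Locale.ROOT))) {
                    findings.add(key + ": " + type);
                }
            }
        }
        return findings;
    }

    public static void main(String[] args) {
        String sample = "[libdefaults]\n"
                + "  default_tkt_enctypes = aes256-cts-hmac-sha1-96 rc4-hmac\n"
                + "  permitted_enctypes = aes128-cts-hmac-sha1-96\n";
        System.out.println(findWeakEnctypes(sample));
    }
}
```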

Cheers,
Sascha
