Hive Connector & Kerberos (Cloudera) - Unable to connect

Hi,

I am trying to connect to a Cloudera cluster with Kerberos enabled.
I followed the steps in the blog post https://www.knime.org/blog/speaking-kerberos-with-knime-big-data-extensions
but I am still getting timeout messages when trying to connect to the Hive server.

Any suggestions how to fix this?
"
WARN Hive Connector 0:1 Your database timeout (15 s) is set to a rather low value for Hive. If you experience timeouts increase the value in the preference page.
ERROR Hive Connector 0:1 Exception creating Kerberos based jdbc connection. Error: Can’t get Kerberos realm
ERROR Hive Connector 0:1 Execute failed: Could not create connection to database: Cannot locate default realm
"
Thanks !

Hi @irving-ccc

unfortunately the blog post has not aged well and the steps that described where to place the krb5.conf do not work anymore on KNIME 3.5 and newer. I believe this is what caused your problem, because KNIME does not seem to find your krb5.conf file.

I have updated the blog post and replaced the section on where to put the krb5.conf file with the following:

Now, put the krb5.conf file into a location of your choice, where it can be accessed by Analytics Platform. It is recommended to store the file outside of the Analytics Platform installation folder, to avoid accidentally deleting it during upgrades. Then append the following line to the knime.ini file inside the Analytics Platform installation folder:

-Djava.security.krb5.conf=PATH

Replace PATH with the full path to your krb5.conf file, for example C:\krb5.conf.

Also, you will have to update your KRB5_CONFIG environment variable to reflect the new location of krb5.conf. My apologies for the confusion. With KNIME 3.5 the jre/ folder was moved to a location inside the plugins/ folder. Since that location may change during KNIME upgrades, it is not a reasonable location for krb5.conf.
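To double-check that the option actually ended up in knime.ini, a small script along these lines may help. This is only a sketch: the function name `find_krb5_conf_path` and the sample ini content are made up for illustration, and a real knime.ini contains many more lines.

```python
from typing import Optional

# JVM option described above; the value after "=" is the path to krb5.conf.
KRB5_OPTION = "-Djava.security.krb5.conf="


def find_krb5_conf_path(ini_text: str) -> Optional[str]:
    """Return the path passed to -Djava.security.krb5.conf, or None if absent.

    If the option appears more than once, the last occurrence is returned.
    """
    found = None
    for line in ini_text.splitlines():
        line = line.strip()
        if line.startswith(KRB5_OPTION):
            found = line[len(KRB5_OPTION):]
    return found


# Hypothetical, shortened knime.ini content for illustration only.
SAMPLE_INI = "\n".join([
    "-vmargs",
    "-Xmx2048m",
    r"-Djava.security.krb5.conf=C:\krb5.conf",
])

if __name__ == "__main__":
    print(find_krb5_conf_path(SAMPLE_INI))
```

To run it against a real installation, read the text of your own knime.ini and pass it to `find_krb5_conf_path`. If it returns `None`, the line was not added.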

Best,
Björn


bjoern.lohrmann:

It’s a great honor for a rookie to be guided by a master. Your method has solved almost all of the problems, but the Hive Connector still does not work. The console shows:

WARN Hive Connector 0:1 Your database timeout (15 s) is set to a rather low value for Hive. If you experience timeouts increase the value in the preference page.
WARN Hive Connector 0:1 >>> KrbCreds found the default ticket granting ticket in credential cache.
WARN Hive Connector 0:1 >>> Obtained TGT from LSA: Credentials:
client=mlb/mlb-cluster@PERSP.NET
server=krbtgt/PERSP.NET@PERSP.NET
authTime=20180625005959Z
startTime=20180625005959Z
endTime=20180626010001Z
renewTill=20180702005959Z
flags=FORWARDABLE;RENEWABLE;INITIAL;PRE-AUTHENT
EType (skey)=18
(tkt key)=18
ERROR Hive Connector 0:1 Exception creating Kerberos based jdbc connection. Error: failure to login using ticket cache file null
ERROR Hive Connector 0:1 Execute failed: Could not create connection to database: Unable to obtain Principal Name for authentication

Best regards, and thanks!

Hi @irving-ccc

to debug this further I think I will need the full KNIME log (or rather all of the log messages around the time when you tried to connect to Hive). You can find the full log under View > Open KNIME log.

If you want you can send it to me via private message and I will post the response here.

Best,
Björn

Hi @irving-ccc

from the logs I would say that your Kerberos ticket is expired:

client=mlb/mlb-cluster@PERSP.NET
server=krbtgt/PERSP.NET@PERSP.NET
authTime=20180625005959Z
startTime=20180625005959Z
endTime=20180626010001Z
renewTill=20180702005959Z
flags=FORWARDABLE;RENEWABLE;INITIAL;PRE-AUTHENT
EType (skey)=18
(tkt key)=18
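As a quick way to read these timestamps, the endTime value can be compared against the current time. The sketch below assumes the trailing "Z" means UTC, as is usual for Kerberos timestamps; the helper names `parse_krb_time` and `is_expired` are made up for illustration.

```python
from datetime import datetime, timezone
from typing import Optional


def parse_krb_time(stamp: str) -> datetime:
    """Parse a timestamp such as 20180626010001Z into an aware UTC datetime."""
    parsed = datetime.strptime(stamp, "%Y%m%d%H%M%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def is_expired(end_time: str, now: Optional[datetime] = None) -> bool:
    """Return True if the ticket's endTime lies at or before `now` (default: current UTC time)."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now >= parse_krb_time(end_time)


if __name__ == "__main__":
    # endTime taken from the log excerpt above.
    print(is_expired("20180626010001Z"))
```

A ticket whose endTime has passed can no longer be used, even if renewTill is still in the future. It has to be renewed or replaced first.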

Then do the following:

  1. Destroy your current ticket and then get a new one using the “MIT Kerberos for Windows” client that you installed as part of the blog post.
  2. Check that after getting a new ticket you have a recently created/modified file called:
    C:\Users\<username>\krb5cc_<username>
  3. Restart KNIME Analytics Platform
  4. Open your workflow and, before running it, open the configuration dialog of the Hive Connector.
  5. Since you are using the built-in Hive JDBC driver you need different “Parameters” than you are currently using (your current ones are for the proprietary Simba-based Hive drivers). They should be something like:
    principal=hive/_HOST@PERSP.NET
    This tells the driver about the Kerberos service principal of the Hive server you are connecting to. “_HOST” will automatically be replaced with “perspcluster1node3.persp.net” by the driver, hence this assumes that the service principal is “hive/perspcluster1node3.persp.net@PERSP.NET”. However, if you know that the Kerberos service principal is something different, specify that one instead.
  6. Click OK and run the Hive Connector node.
  7. If it still doesn’t work I will need the “full” KNIME log again.
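To illustrate what the “_HOST” placeholder in step 5 amounts to, here is a rough sketch. It is not the driver's actual code, only an imitation of the substitution described above. The function names are made up, and the port 10000 and database name "default" in the example URL are assumptions that you should replace with your cluster's values.

```python
def resolve_principal(principal: str, server_host: str) -> str:
    """Replace the _HOST placeholder with the host name of the Hive server."""
    return principal.replace("_HOST", server_host)


def hive_jdbc_url(host: str, principal: str, port: int = 10000,
                  database: str = "default") -> str:
    """Build a hive2 JDBC URL that carries the service principal as a parameter.

    Port and database defaults are assumptions for illustration.
    """
    return f"jdbc:hive2://{host}:{port}/{database};principal={principal}"


if __name__ == "__main__":
    host = "perspcluster1node3.persp.net"
    print(resolve_principal("hive/_HOST@PERSP.NET", host))
    print(hive_jdbc_url(host, "hive/_HOST@PERSP.NET"))
```

If the resolved principal does not match the service principal that your Hive server actually runs under, enter the correct principal explicitly instead of using “_HOST”.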

Best,
Björn


Hi @irving-ccc

happy to hear that it is finally working!

The warning appears because your database timeout (15 s) is set to a rather low value for Hive. This means every query that takes longer than 15s will be canceled. You can increase the timeout under File > Preferences > KNIME > Databases.

Best,
Björn


Hi @bjoern.lohrmann

Your help made my workflow a success.

I really appreciate it.

Best wishes,

Irving
