I am trying to access Hive from Knime using the Big Data Connectors.
For this I use a simple workflow with just a Hive Connector and a Database Reader (see attached zip file).
For simple queries like "select * from <table>" this works on both Knime Desktop and Knime Server. However, as soon as the query is slightly more involved (e.g. "select count(*) from <table>"), it stops working on Knime Server, although it still works on Knime Desktop. The error on Knime Server is:
Database Reader 6:325 - Execute failed: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
When changing the execution engine to Spark in the Cloudera cluster, I get the following error:
Database Reader 9:325 - Execute failed: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.spark.SparkTask
Following advice from various places on the internet, I have tried the following, to no avail:
- Set hive.auto.convert.join to false in the Cloudera cluster (via Cloudera Manager)
- Tried different users in the Hive Connector (my personal user account, hive, hdfs)
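For reference, a minimal sketch of what the session-level equivalents of these settings would look like if run in the same Hive session as the failing query (for example from beeline). This is an assumption for illustration only: the changes described above were made cluster-wide through Cloudera Manager, not with these statements, and <table> is the same placeholder as in the queries above.

```sql
-- Sketch only: session-level versions of the cluster-wide settings described above.
-- These statements were not part of the original workflow.
SET hive.auto.convert.join=false;   -- disable automatic map-join conversion
SET hive.execution.engine=spark;    -- or mr, to switch back to MapReduce
SELECT COUNT(*) FROM <table>;       -- the query that fails on Knime Server
```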
I cannot find any relevant log messages in the Knime Server logs or the Hive logs.
I have ruled out a network problem between Knime Server and the Cloudera cluster, since the simple query works fine from the server.