Big Data - Connection with AWS EMR

I got a big data connector license and tried to connect to AWS EMR (5.x), but got various errors:

1. If I use the Hive Connector, every step passes except the "Database Connection Table Reader" node.

- Flow design (1_HIVE_Flow.png)

- Error details (1_HIVE_ERROR.txt)

2. If I use the generic Database Connector (with the AWS Hive JDBC driver 4/4.1/4.2), I get a similar error. For details, please refer to 2_HIVE_ERROR.txt; the settings are shown in 2_HIVE_SETTING.png.

3. The AWS EMR JDBC driver was downloaded from: https://amazon-odbc-jdbc-drivers.s3.amazonaws.com/public/AmazonHiveJDBC_1.0.4.1004.zip

In fact, when I extract the KNIME-generated query and run it directly on EMR, it works fine.

Has anyone successfully used KNIME with AWS EMR (Hive)?

Hi,

unfortunately, we have not done any testing against EMR 5.x so far. What we would recommend, however, is to use the Hive Connector together with the AWS EMR JDBC driver (if present, the driver is automatically detected and used by the Hive Connector).

About your problem: the error you are seeing is probably an EMR-internal problem with Hive/JDBC. Have you tried running the query through beeline (which also uses JDBC)?
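As a rough illustration of the beeline check, here is a minimal Python sketch that only assembles the command line; it does not connect to anything by itself. The host name, the user "hadoop", the database "default", the port 10000 and the sample query are all assumptions for illustration and need to be replaced with the values of your cluster.

```python
import shlex


def build_beeline_command(host, query, port=10000, database="default", user="hadoop"):
    """Return the argument list for a beeline call against HiveServer2.

    All defaults are assumptions: 10000 is the usual HiveServer2 port,
    "hadoop" and "default" are placeholders for your user and database.
    """
    jdbc_url = f"jdbc:hive2://{host}:{port}/{database}"
    # -u: JDBC URL, -n: user name, -e: query to execute
    return ["beeline", "-u", jdbc_url, "-n", user, "-e", query]


if __name__ == "__main__":
    # Hypothetical host and query; paste the query extracted from KNIME instead.
    cmd = build_beeline_command("emr-master.example.com",
                                "SELECT * FROM my_table LIMIT 10")
    print(shlex.join(cmd))
```

Running the printed command on the EMR master node shows whether the same query also fails over JDBC outside of KNIME.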

The real cause is probably in the logs of the YARN containers that were started for your query, in particular the logs of the YARN application master container.
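To look at those logs, one option is the "yarn logs" command on the cluster. The following Python sketch builds that command and filters log text for suspicious lines. The application ID format, the marker strings and the function names are assumptions chosen for illustration, and log aggregation has to be enabled on the cluster for "yarn logs" to return anything.

```python
def build_yarn_logs_command(application_id):
    """Return the argument list that fetches the aggregated logs of one YARN application."""
    return ["yarn", "logs", "-applicationId", application_id]


def error_lines(log_text, markers=("ERROR", "FATAL", "Exception")):
    """Return the log lines that contain any of the given marker strings.

    The markers are a guess at what is worth reading first, not a complete list.
    """
    return [line for line in log_text.splitlines()
            if any(marker in line for marker in markers)]


if __name__ == "__main__":
    # Hypothetical application ID; take the real one from the YARN ResourceManager UI.
    print(" ".join(build_yarn_logs_command("application_1234567890123_0001")))
```

The output of the command can be saved to a file and passed through error_lines to narrow it down to the application master's stack traces.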

Best,

Björn
