Hadoop Impala - problem with DB Insert

Hi all,

I tried to use the “DB Insert” node on an Impala connection to insert (append) data into a table
and got the error message: “Execute failed: INSERT is not supported by Impala”.
However, it is possible to run INSERT via Hadoop SQL commands (e.g. in SQuirreL) - we are
using Cloudera 6.2. It seems this error message comes from KNIME, not from Hadoop.

Is there a solution or workaround for this?

Best regards
Roland Grund

You will first have to load the data into a (temporary) table with the Big Data environment and then insert it into the existing Impala or Hive table via the SQL Executor node.
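For example, the statement you run in the SQL Executor node could look like this (a minimal sketch - the table names `my_staging` and `my_target` are only placeholders for your own tables):

```sql
-- Append the staged rows into the existing Impala table in one bulk insert.
INSERT INTO my_target
SELECT * FROM my_staging;

-- Optionally drop the staging table afterwards.
DROP TABLE IF EXISTS my_staging;
```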

I have created a few workflows that demonstrate different ways of using the Big Data nodes. They might serve as inspiration.

https://hub.knime.com/mlauber71/spaces/Public/latest/kn_example_bigdata_hive_csv_loader/m_001_import_hive_csv

The first part loads a new Hive table
https://hub.knime.com/mlauber71/spaces/Public/latest/kn_example_h2o_sparkling_water


Thanks - very helpful!


Hi Roland,
the reason we have disabled inserting records via the SQL INSERT command is that each command results in a separate file in HDFS containing only a single record - not something you want when inserting large amounts of data. Please use the DB Loader node or the Impala Loader (legacy) node to upload data to Impala, as mlauber71 already pointed out.

If you use the DB Loader of the new DB framework, you do not need to upload the data into a temporary table yourself in order to insert it into an existing table: if the table already exists, the new DB Loader automatically creates a temporary table which it uses to transfer the data into the destination table. The old Impala Loader did this only for partitioned tables.
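To illustrate the difference (the table and column values here are only placeholders): each single-record INSERT ends up as its own tiny HDFS file, whereas the bulk transfer the loader performs writes the same data in one pass:

```sql
-- Anti-pattern: each statement produces its own tiny file in HDFS.
INSERT INTO sales VALUES (1, 'apples', 3.50);
INSERT INTO sales VALUES (2, 'pears', 2.20);

-- Conceptually what the loader does instead: stage the data once,
-- then move it into the destination table in a single bulk operation.
INSERT INTO sales SELECT * FROM sales_tmp;
```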
Bye
Tobias

