How to insert data into a Hive table with partitions

@hhkim there are two ways to add a partition to an existing table:

You can use the DB Connection Table Writer; just make sure the partition columns are the last columns of the data you write and in the same order as in the target table, like the column “education” in this example. The node appends, so executing it more than once would add the same data again (here, all rows with education = Masters).

Or you could use the syntax you already mentioned, “INSERT OVERWRITE”, where you name the partition column. Again: the partition columns have to be the last columns of the SELECT (and in the same order as in the target table). Unlike an append, INSERT OVERWRITE replaces the existing contents of the partitions it writes to.

INSERT OVERWRITE TABLE `default`.`data_train`
PARTITION (`education`)
SELECT *
FROM `default`.`data_education_masters`
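
A minimal sketch of the same statement with an explicit column list, which makes the "partition column last" rule easier to see. The column names `age` and `workclass` are hypothetical placeholders, since the thread does not list the table's other columns. The two `SET` lines are standard Hive settings for dynamic partitioning; whether you need them depends on how your cluster is configured.

```sql
-- Allow dynamic partitioning without a static partition value
-- (assumption: not already enabled on your cluster).
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- Dynamic partition insert: the partition value is taken from the
-- last column of the SELECT. `age` and `workclass` are hypothetical.
INSERT OVERWRITE TABLE `default`.`data_train`
PARTITION (`education`)
SELECT
    `age`,
    `workclass`,
    `education`
FROM `default`.`data_education_masters`;

-- Static alternative: name the partition value explicitly and leave
-- the partition column out of the SELECT list.
INSERT OVERWRITE TABLE `default`.`data_train`
PARTITION (`education` = 'Masters')
SELECT `age`, `workclass`
FROM `default`.`data_education_masters`;
```

Replacing `INSERT OVERWRITE TABLE` with `INSERT INTO TABLE` should append to the partition instead of replacing it.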

Either way, the new data ends up on HDFS as an additional partition folder, or as additional Parquet files within an existing partition folder.
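
To check the result from the SQL side, something like the following should work. It is only a sketch: `MSCK REPAIR TABLE` is needed only when partition folders were created directly on HDFS instead of through an INSERT, which can matter here because the table properties further down set 'discover.partitions' to 'false'.

```sql
-- List all partitions currently registered for the table.
SHOW PARTITIONS `default`.`data_train`;

-- Only if folders were added on HDFS by hand: register any
-- partition folders the metastore does not know about yet.
MSCK REPAIR TABLE `default`.`data_train`;
```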

The settings for the target table “data_train” were set in the additional options of the DB Table Creator node:

STORED AS PARQUET
PARTITIONED BY (education STRING)
TBLPROPERTIES ( 'parquet.compression'='snappy'
  , 'external.table.purge'='true'
  , 'transactional'='false'
  , 'discover.partitions' = 'false'
  )
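
For orientation, this is roughly how those options would fit into a complete statement. It is a hand-written sketch, not the SQL the DB Table Creator node actually generates, and the columns `age` and `workclass` are hypothetical. Hive's documented CREATE TABLE syntax lists PARTITIONED BY before STORED AS, so the sketch uses that order; other engines may accept the clauses in a different order.

```sql
-- Hypothetical regular columns; the partition column is declared
-- only in PARTITIONED BY, not in the column list.
CREATE TABLE `default`.`data_train` (
    `age`       INT,
    `workclass` STRING
)
PARTITIONED BY (`education` STRING)
STORED AS PARQUET
TBLPROPERTIES (
    'parquet.compression'   = 'snappy',
    'external.table.purge'  = 'true',
    'transactional'         = 'false',
    'discover.partitions'   = 'false'
);
```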
