Hello.
I’m having trouble migrating hive query to spark node.
I don’t have idea how to migrate ‘case when’ and ‘not like’ to spark node.
The query is as below:
SELECT svc_id
, st_ymd
, svc_1
, svc_2
, svc_3
, ipadr
, CASE
WHEN REPLACE(host_adr, ‘', ‘’) <> ‘’ THEN LOWER(host_adr)
WHEN REPLACE(svr_nm, ‘’, ‘’) <> ‘’ THEN LOWER(svr_nm)
WHEN REPLACE(host_dns, ‘’, ‘’) <> ‘’ THEN LOWER(host_dns)
ELSE '’
END as con_adr
, CASE
WHEN SUBSTR(st_hhmiss, 0, 2) IN (‘00’, ‘01’) THEN ‘A_0000_0200’
WHEN SUBSTR(st_hhmiss, 0, 2) IN (‘02’, ‘03’, ‘04’, ‘05’) THEN ‘B_0200_0600’
WHEN SUBSTR(st_hhmiss, 0, 2) IN (‘06’, ‘07’, ‘08’, ‘09’) THEN ‘C_0600_1000’
WHEN SUBSTR(st_hhmiss, 0, 2) IN (‘10’, ‘11’, ‘12’, ‘13’) THEN ‘D_1000_1400’
WHEN SUBSTR(st_hhmiss, 0, 2) IN (‘14’, ‘15’, ‘16’, ‘17’) THEN ‘E_1400_1800’
WHEN SUBSTR(st_hhmiss, 0, 2) IN (‘18’, ‘19’, ‘20’, ‘21’) THEN ‘F_1800_2200’
WHEN SUBSTR(st_hhmiss, 0, 2) IN (‘22’, ‘23’) THEN ‘G_2200_2300’
ELSE ‘’
END as time_cd
FROM DW_TABLE_1
WHERE etl_ymd = ‘2022-09-14’
AND CONCAT(REPLACE(host_adr, ‘’, ‘’), REPLACE(svr_nm, ‘’, ‘’), REPLACE(host_dns, '’, ‘’)) <> ‘’
AND svc_1_nm NOT IN (‘A’, ‘B’)
AND svc_2_nm NOT IN (‘C’, ‘D’, ‘E’)
AND svc_3_nm NOT IN (‘F’, ‘G’, ‘H’, ‘I’)
AND svc_3_nm NOT LIKE ‘%DNS%’
AND svc_id <> ‘12345’
AND ipadr NOT LIKE ‘%220.220.220.1%’
AND ipadr NOT LIKE ‘%220.220.220.2%’
AND ipadr NOT LIKE ‘%220.220.220.3%’
And I was wondering if there are some node could calculate these functions:
SUM(tx_qnt + rx_qnt) as byte_size
TO_CHAR(CURRENT_TIME_STAMP, ‘yyyyMMddHHmmss’) as etl_date
I’d appreciate it if you could tell me if there are some node could replace this query.
Thanks,
hhkim