About migration hive query to spark node

hhkim · September 14, 2022, 8:10am

Hello.
I’m having trouble migrating hive query to spark node.
I don’t have idea how to migrate ‘case when’ and ‘not like’ to spark node.
The query is as below:

SELECT svc_id
, st_ymd
, svc_1
, svc_2
, svc_3
, ipadr
, CASE
WHEN REPLACE(host_adr, ‘', ‘’) <> ‘’ THEN LOWER(host_adr)
WHEN REPLACE(svr_nm, ‘’, ‘’) <> ‘’ THEN LOWER(svr_nm)
WHEN REPLACE(host_dns, ‘’, ‘’) <> ‘’ THEN LOWER(host_dns)
ELSE '’
END as con_adr
, CASE
WHEN SUBSTR(st_hhmiss, 0, 2) IN (‘00’, ‘01’) THEN ‘A_0000_0200’
WHEN SUBSTR(st_hhmiss, 0, 2) IN (‘02’, ‘03’, ‘04’, ‘05’) THEN ‘B_0200_0600’
WHEN SUBSTR(st_hhmiss, 0, 2) IN (‘06’, ‘07’, ‘08’, ‘09’) THEN ‘C_0600_1000’
WHEN SUBSTR(st_hhmiss, 0, 2) IN (‘10’, ‘11’, ‘12’, ‘13’) THEN ‘D_1000_1400’
WHEN SUBSTR(st_hhmiss, 0, 2) IN (‘14’, ‘15’, ‘16’, ‘17’) THEN ‘E_1400_1800’
WHEN SUBSTR(st_hhmiss, 0, 2) IN (‘18’, ‘19’, ‘20’, ‘21’) THEN ‘F_1800_2200’
WHEN SUBSTR(st_hhmiss, 0, 2) IN (‘22’, ‘23’) THEN ‘G_2200_2300’
ELSE ‘’
END as time_cd
FROM DW_TABLE_1
WHERE etl_ymd = ‘2022-09-14’
AND CONCAT(REPLACE(host_adr, ‘’, ‘’), REPLACE(svr_nm, ‘’, ‘’), REPLACE(host_dns, '’, ‘’)) <> ‘’
AND svc_1_nm NOT IN (‘A’, ‘B’)
AND svc_2_nm NOT IN (‘C’, ‘D’, ‘E’)
AND svc_3_nm NOT IN (‘F’, ‘G’, ‘H’, ‘I’)
AND svc_3_nm NOT LIKE ‘%DNS%’
AND svc_id <> ‘12345’
AND ipadr NOT LIKE ‘%220.220.220.1%’
AND ipadr NOT LIKE ‘%220.220.220.2%’
AND ipadr NOT LIKE ‘%220.220.220.3%’

And I was wondering if there are some node could calculate these functions:

SUM(tx_qnt + rx_qnt) as byte_size
TO_CHAR(CURRENT_TIME_STAMP, ‘yyyyMMddHHmmss’) as etl_date
I’d appreciate it if you could tell me if there are some node could replace this query.

Thanks,
hhkim

mlauber71 · September 15, 2022, 3:34pm

@hhkim you might want to take a look at this example comparing Hive SQL and Spark SQL. Most commands should work the same. But it also might depend on your Spark version and your basic big data system.

Date and Time variables are a specific challenge on big data systems. I have put together a few examples in this sample workflow that you might want to check. But again my experience is that this also might depend on a version.

system · December 14, 2022, 3:34pm

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.