Local big data environment - Spark to Parquet (to S3), AWS Credentials Chain

DXRX · August 31, 2019, 11:02am

Hi Folks,
I have a local big data environment running a Spark job, and I want to use the Spark to Parquet node connected to S3. My S3 connection is set up and working via the default credential provider chain, but I need some way of supplying that chain to the local Spark environment.

Is there a way to do this with the local big data environment, e.g. through access to core-site.xml? And which options need to be set to pick up the credentials chain?

mareike.hoeger · September 2, 2019, 2:23pm

Hi @DXRX,
Unfortunately, it is currently not possible to use the S3 connection with the Spark data nodes in the Local Big Data Environment. We already have an open ticket for this.

The workaround is to write the data to your local file system with the “Spark to Parquet” node and the local HDFS connection of the “Local Big Data Environment” node. Afterwards you can upload it to S3 with the “Upload” node.
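For anyone who wants to script the upload step outside of KNIME, here is a minimal sketch in Python. It is not what the “Upload” node does internally, just an illustration of the same idea. It assumes the third-party boto3 package is installed. A boto3 client created without explicit keys resolves credentials through boto3's own default chain (environment variables, shared credentials file, instance role, and so on). The function names, directory path, bucket name, and key prefix are all made up for the example. Spark writes a Parquet dataset as a directory of part files, so the sketch walks the whole directory.

```python
from pathlib import Path


def plan_uploads(local_dir, key_prefix):
    """Map every file under local_dir to an S3 object key below key_prefix."""
    root = Path(local_dir)
    prefix = key_prefix.strip("/")
    plan = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # Keep the sub-directory layout (e.g. partition folders) in the key.
            relative = path.relative_to(root).as_posix()
            key = f"{prefix}/{relative}" if prefix else relative
            plan.append((path, key))
    return plan


def upload_directory(local_dir, bucket, key_prefix):
    """Upload all files of a locally written Parquet directory to S3."""
    import boto3  # third-party; imported here so plan_uploads works without it

    # No explicit keys are passed: boto3 falls back to its default credential chain.
    s3 = boto3.client("s3")
    for path, key in plan_uploads(local_dir, key_prefix):
        s3.upload_file(str(path), bucket, key)
```

A call could look like `upload_directory("/tmp/spark-out/my_table.parquet", "my-example-bucket", "exports/my_table.parquet")`, where all three values are placeholders for your own paths and bucket.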

Best regards,
Mareike

DXRX · September 2, 2019, 3:19pm

OK, thanks for the information. I will make the necessary changes.
