Create Spark Context (Livy) with Azure Blob Storage

#1

Hi,
This is my first post here, and I want to be helpful from the start. I’m working with a client who chose KNIME (3.7) on Azure (with HDInsight 3.6 / Spark). I’m an Azure guy, not a KNIME one, so please be gentle if I get something wrong.
I noticed a problem with creating a Spark context with Livy. I’m using Blob Storage for it. It works great, but only when secure transfer is not enforced on the blob storage. Enforcing it has been the default setting for many months now, and disabling it will not pass any security audit. Currently the Create Spark Context (Livy) node sends a job to Livy with a link to that job in Blob Storage. The link is built for unencrypted transfer with wasb://; it should use wasbs:// instead. Is it possible for you to change it? The Azure Blob Storage connector works great, and the context node also creates the job in Blob Storage when encryption is enabled. It only fails when triggering the job in Livy.
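To illustrate the difference, here is a minimal Python sketch of the two URI forms. It is not KNIME code, and the container name, account name, and file path are made up for the example. It only assumes the usual Hadoop layout for Azure Blob Storage URIs, `wasb[s]://<container>@<account>.blob.core.windows.net/<path>`, where the `wasbs` scheme selects the encrypted (TLS) transport.

```python
def blob_uri(container: str, account: str, path: str, secure: bool = True) -> str:
    """Build a Hadoop-style Azure Blob Storage URI.

    secure=True yields wasbs:// (encrypted transfer), which is what a storage
    account with "secure transfer required" accepts.
    """
    scheme = "wasbs" if secure else "wasb"
    return f"{scheme}://{container}@{account}.blob.core.windows.net/{path.lstrip('/')}"


def to_secure(uri: str) -> str:
    """Rewrite a wasb:// URI to wasbs://; leave any other URI unchanged."""
    prefix = "wasb://"
    if uri.startswith(prefix):
        return "wasbs://" + uri[len(prefix):]
    return uri


# Hypothetical names, for illustration only.
insecure = blob_uri("staging", "examplestore", "/livy/job.jar", secure=False)
print(insecure)             # the form the node sends today
print(to_secure(insecure))  # the form needed when secure transfer is enforced
```

The only difference between the two printed links is the scheme, which is why I hope the change on the KNIME side is small.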

#2

Hi @smereczynski,
welcome to the KNIME Community!
I am afraid it is currently not possible to create a secure Azure URL with the KNIME nodes. However, you might be able to use the HDFS nodes, as HDInsight offers access to the local files via the HDFS protocol.


So in the described case, I would suggest trying the local file system as the staging area.

I will open a ticket for secure transfer support in Azure.
best regards Mareike

#3

Hi,
It would be nice to have this feature in the near future, because otherwise I may be forced to drop KNIME. I’m automating an environment deployment, and having to extract the Hadoop NameNode in order to connect to HDFS from KNIME is probably not acceptable. Maybe you need some help with Azure/Java here? My team can help, but we don’t know how, as we are not familiar with the KNIME source code. Together it’s possible.

#4

I asked the software engineer responsible for the Azure nodes to comment on this.

best regards Mareike

#5

Hi @smereczynski,

Unfortunately, it is not possible to switch to wasbs at the moment. We created a ticket for that; it should only be a small change to the Azure Blob Storage connection.
