Problem creating Spark context on Microsoft SQL Server Big Data Clusters

Hello KNIME community!

For some time now we have been trying to connect to HDFS and create a Spark context on our Spark cluster with the following workflow:

(screenshot of the workflow)

Connecting to HDFS with Kerberos seems to work fine, using the HDFS Connector (webHDFS with SSL/HttpFS with SSL) or KNOX, yet the Create Spark Context node gives the following error:

ERROR Create Spark Context (Livy) 3:8 Execute failed: Failed to upload Kryo version detector job: Cannot retry request with a non-repeatable request entity. (NonRepeatableRequestException)

We traced this error message to line 449 of LivySparkContext.java (in the KNIME update-site.zip > org.knime.bigdata.spark.core.livy.source_4.4.0.v202106241517.jar),
but couldn’t find what is causing it.
Running the node creates the Livy session ID and its work folder on HDFS, but crashes right after.
We overcame a lot of different errors, but got stuck here.

Is this use case supported on the Analytics Platform?
Are we missing a configuration?

Thanks in advance!

Hi @botichello and welcome to the KNIME community!

Running Spark via Livy on a secured cluster should work fine. Can you provide some details about your cluster (Type, Cluster Version and Spark Version)? E.g. Cloudera 7.1.7 and Spark 3.0.

The debug logs in KNIME might contain more information. You can enable them in Preferences → KNIME and then select DEBUG as Log Level. Can you post the logs here?

Hi @sascha.wolke!

We are using Microsoft SQL Server Big Data Clusters with CU13, version number 15.0.4178.15 (SQL Server Big Data Clusters CU13 release notes - SQL Server Big Data Clusters | Microsoft Docs),
with Spark version 3.1.2.

The relevant part of the error log is the following:

2022-01-10 14:31:19,593 : DEBUG : KNIME-Worker-0-Create Spark Context (Livy) 0:6 :  : SparkJarRegistry : Create Spark Context (Livy) : 0:6 : Time to collect all jars for Spark version 3.0: 725 ms
2022-01-10 14:31:55,486 : DEBUG : KNIME-Worker-0-Create Spark Context (Livy) 0:6 :  : LivySparkContext : Create Spark Context (Livy) : 0:6 : Uploading Kryo version detector job jar.
2022-01-10 14:31:55,524 : INFO  : KNIME-Worker-0-Create Spark Context (Livy) 0:6 :  : LivySparkContext : Create Spark Context (Livy) : 0:6 : Destroying Livy Spark context 
2022-01-10 14:31:55,857 : INFO  : KNIME-Worker-0-Create Spark Context (Livy) 0:6 :  : SparkContext : Create Spark Context (Livy) : 0:6 : Spark context sparkLivy://3405e6ed-a9ac-4c13-9fe1-5dcff85952a4 changed status from OPEN to CONFIGURED
2022-01-10 14:31:55,858 : DEBUG : KNIME-Worker-0-Create Spark Context (Livy) 0:6 :  : Node : Create Spark Context (Livy) : 0:6 : reset
2022-01-10 14:31:55,858 : DEBUG : KNIME-Worker-0-Create Spark Context (Livy) 0:6 :  : SparkNodeModel : Create Spark Context (Livy) : 0:6 : In reset() of SparkNodeModel. Calling deleteSparkDataObjects.
2022-01-10 14:31:55,858 : DEBUG : pool-7-thread-1 :  : DestroyAndDisposeSparkContextTask :  :  : Destroying and disposing Spark context: sparkLivy://3405e6ed-a9ac-4c13-9fe1-5dcff85952a4
2022-01-10 14:31:55,858 : INFO  : pool-7-thread-1 :  : LivySparkContext :  :  : Destroying Livy Spark context 
2022-01-10 14:31:55,859 : ERROR : KNIME-Worker-0-Create Spark Context (Livy) 0:6 :  : Node : Create Spark Context (Livy) : 0:6 : Execute failed: Failed to upload Kryo version detector job: Cannot retry request with a non-repeatable request entity. (NonRepeatableRequestException)
java.util.concurrent.ExecutionException: org.apache.http.client.ClientProtocolException
	at java.base/java.util.concurrent.FutureTask.report(Unknown Source)
	at java.base/java.util.concurrent.FutureTask.get(Unknown Source)
	at org.knime.bigdata.spark.core.livy.context.LivySparkContext.waitForFuture(LivySparkContext.java:492)
	at org.knime.bigdata.spark.core.livy.context.LivySparkContext.uploadKryoVersionDetectorJob(LivySparkContext.java:447)
	at org.knime.bigdata.spark.core.livy.context.LivySparkContext.detectKryoVersion(LivySparkContext.java:349)
	at org.knime.bigdata.spark.core.livy.context.LivySparkContext.open(LivySparkContext.java:323)
	at org.knime.bigdata.spark.core.context.SparkContext.ensureOpened(SparkContext.java:145)
	at org.knime.bigdata.spark.core.livy.node.create.LivySparkContextCreatorNodeModel2.executeInternal(LivySparkContextCreatorNodeModel2.java:85)
	at org.knime.bigdata.spark.core.node.SparkNodeModel.execute(SparkNodeModel.java:240)
	at org.knime.core.node.NodeModel.executeModel(NodeModel.java:549)
	at org.knime.core.node.Node.invokeFullyNodeModelExecute(Node.java:1259)
	at org.knime.core.node.Node.execute(Node.java:1039)
	at org.knime.core.node.workflow.NativeNodeContainer.performExecuteNode(NativeNodeContainer.java:559)
	at org.knime.core.node.exec.LocalNodeExecutionJob.mainExecute(LocalNodeExecutionJob.java:95)
	at org.knime.core.node.workflow.NodeExecutionJob.internalRun(NodeExecutionJob.java:201)
	at org.knime.core.node.workflow.NodeExecutionJob.run(NodeExecutionJob.java:117)
	at org.knime.core.util.ThreadUtils$RunnableWithContextImpl.runWithContext(ThreadUtils.java:365)
	at org.knime.core.util.ThreadUtils$RunnableWithContext.run(ThreadUtils.java:219)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
	at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
	at org.knime.core.util.ThreadPool$MyFuture.run(ThreadPool.java:123)
	at org.knime.core.util.ThreadPool$Worker.run(ThreadPool.java:246)
Caused by: org.apache.http.client.ClientProtocolException
	at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:187)
	at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
	at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:108)
	at org.apache.livy.client.http.LivyConnection.executeRequest(LivyConnection.java:292)
	at org.apache.livy.client.http.LivyConnection.access$000(LivyConnection.java:68)
	at org.apache.livy.client.http.LivyConnection$3.run(LivyConnection.java:277)
	at java.base/java.security.AccessController.doPrivileged(Native Method)
	at java.base/javax.security.auth.Subject.doAs(Unknown Source)
	at org.apache.livy.client.http.LivyConnection.sendRequest(LivyConnection.java:274)
	at org.apache.livy.client.http.LivyConnection.post(LivyConnection.java:242)
	at org.apache.livy.client.http.HttpClient$4.call(HttpClient.java:268)
	at org.apache.livy.client.http.HttpClient$4.call(HttpClient.java:265)
	at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
	at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.base/java.lang.Thread.run(Unknown Source)
Caused by: org.apache.http.client.NonRepeatableRequestException: Cannot retry request with a non-repeatable request entity.
	at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:225)
	at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
	at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
	at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
	... 16 more

Hi @botichello,

I’m sorry, but KNIME currently only supports Spark up to version 3.0, and you have to select the matching Spark version in the Livy node. We are already working on this and hope to support more Spark versions in one of the next KNIME releases.

Hi @sascha.wolke,

We have another server with BDC CU12 (SQL Server Big Data Clusters CU12 release notes - SQL Server Big Data Clusters | Microsoft Docs). It uses Spark 2.4 and Livy 0.6.

Running this workflow with the same configs and certificates against the Spark 2.4 server gives the exact same error log, so it’s surely not about the Spark version.

Any other lead for this issue?

and…
May I ask about your roadmap for supporting new Spark versions?

Thanks!

Hi @botichello,

Was this the full error log you posted above? The interesting part of the stack trace would be at the end and unfortunately seems to be missing. Maybe the Livy log contains more information?

Right now we do not support Microsoft SQL Server Big Data Clusters, as we have never tested it. It seems to use the usual tools like Hadoop/Livy, so you might be able to get it running with KNIME. I can’t reproduce this, as I don’t have a test SQL Server Big Data Cluster at the moment.
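Independent of KNIME, one way to narrow this down would be to talk to Livy’s REST API directly and see whether session creation works outside the node. A minimal sketch in Python — the gateway URL here is a placeholder, and a real request would additionally need your Kerberos/SPNEGO or KNOX authentication attached before sending:

```python
import json
import urllib.request

def livy_create_session_request(base_url: str) -> urllib.request.Request:
    """Build the POST /sessions request that asks Livy to start a Spark session.

    base_url is a placeholder (e.g. your KNOX gateway's Livy endpoint);
    this only constructs the request, it does not send it.
    """
    payload = json.dumps({"kind": "spark"}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/sessions",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Hypothetical gateway URL -- replace with your cluster's Livy endpoint.
req = livy_create_session_request("https://gateway.example.com/gateway/default/livy/v1")
print(req.get_method(), req.full_url)
```

If a plain POST to /sessions through KNOX behaves differently than it does from KNIME (for example, answers with redirects or extra auth challenges), that would point at the gateway rather than at the node.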

Spark 3.2 support is on the roadmap for the next KNIME summer release, but it’s not finished yet and plans might change.

Thank you for the answers @sascha.wolke,

Sadly, that is all the log I have. I tried to configure the Spark logs via plugins/org.apache.axis…/lib/log4j.properties (scala - Get Full Stack Trace in Spark Log - Stack Overflow), but it still shows “… 16 more”.

Can I configure it elsewhere?

I also found that this is irrelevant, because those frames are already present earlier in the stack trace
(stack trace - How to read the full stacktrace in Java where it says e.g. "... 23 more" - Stack Overflow),
so there won’t be more info in the Spark logs.

Hi @botichello,

The logs above are from the Livy client / Apache HttpClient, which uses the KNIME logger; there is no Spark or Log4j involved at all. If there is nothing more, then the NonRepeatableRequestException seems to be the root of the problem. I have never seen such an error before and can’t reproduce it right now. You might try looking into the cluster logs and checking the KNOX, Livy, and YARN container logs. Maybe it’s a permission problem and KNOX or Livy fails somewhere.
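For background, the NonRepeatableRequestException comes from Apache HttpClient: when a request body is streamed (readable only once, e.g. from a file), the client cannot replay it if something forces a second attempt. Since your stack trace passes through RedirectExec, a redirect or authentication round-trip (possibly from KNOX) could be what triggers that replay during the jar upload. A small, language-agnostic illustration of the “repeatable” distinction in Python:

```python
import io

# A streamed body can be consumed only once: after the first (failed)
# attempt, the stream is exhausted, so a retry would send an empty body.
streamed = io.BytesIO(b"jar-file-bytes")
first_attempt = streamed.read()
retry_attempt = streamed.read()  # nothing left to resend

# Buffering the whole body in memory first makes the request "repeatable":
# every attempt re-sends the same bytes.
buffered = b"jar-file-bytes"

print(len(first_attempt), len(retry_attempt), len(buffered))  # prints: 14 0 14
```

On the Java side, Apache HttpClient’s BufferedHttpEntity applies the same idea by buffering a wrapped entity so it can be re-sent, which is one reason a gateway answering the first upload attempt with a redirect could break a non-buffered jar upload.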