Hi,
I recently tried to install and configure the spark-job-server compatible with the Apache Spark 1.6 that ships with CDH 5.13.
I have followed the installation steps, and when I run:
/etc/init.d/spark-job-server start
Everything seems to work as expected.
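In case it is useful, I can also query the job server's REST API directly from the shell as a rough sanity check (port 8090 is the default and is the same one that shows up in the KNIME log below; /contexts and /jars are standard Spark Jobserver endpoints):
curl http://localhost:8090/contexts   # should return a JSON list of the running contexts, e.g. []
curl http://localhost:8090/jars       # should return a JSON map of the uploaded job jars
If both answer with JSON, the server itself should at least be up and listening on that port.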
But when I try to create a Spark context from the KNIME platform, it still doesn't work.
On the KNIME console I get a generic misconfiguration/incompatible-version error:
Log file is located at: /home/ubuntu/knime-workspace/.metadata/knime/knime.log
ERROR Create Spark Context 0:1 HTTP Status code: 500 | Response Body: The server was not able to produce a timely response to your request.
ERROR Create Spark Context 0:1 Execute failed: Spark Jobserver gave unexpected response (for details see View > Open KNIME log). Possible reason: Incompatible Jobserver version, malconfigured Spark Jobserver
The fact is that I am quite sure I have correctly configured the node and the Spark preferences with the correct version, Spark 1.6 (CDH 5.9+). And if it really is a misconfiguration error, how can I find out what is causing the problem?
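To be concrete, these are the settings I am using in the Create Spark Context node (field names roughly as they appear in the dialog; the URL and context name are the same ones that show up in the log below):
Job server URL : http://localhost:8090
Context name   : knimeSparkContext
Spark version  : 1.6 (CDH 5.9+)
Authentication : none (no Kerberos)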
Below are the specs:
- Ubuntu 16.04 LTS
- KNIME version: knime_3.5.2.linux.gtk.x86_64.tar.gz
- CDH 5.13 with the default Spark 1.6
- spark-job-server version installed: spark-job-server-0.6.2.3-KNIME_cdh-5.13.tar.gz
Here is the detailed View > Open KNIME log output:
2018-03-27 18:05:39,554 : DEBUG : main : Node : Create Spark Context : 0:1 : reset
2018-03-27 18:05:39,554 : DEBUG : main : SparkNodeModel : Create Spark Context : 0:1 : In reset() of SparkNodeModel. Calling deleteSparkDataObjects.
2018-03-27 18:05:39,554 : DEBUG : main : Node : Create Spark Context : 0:1 : clean output ports.
2018-03-27 18:05:39,554 : DEBUG : main : NodeContainer : Create Spark Context : 0:1 : Create Spark Context 0:1 has new state: IDLE
2018-03-27 18:05:39,555 : DEBUG : main : SparkContextCreatorNodeModel : Create Spark Context : 0:1 : Reconfiguring old context with same ID.
2018-03-27 18:05:39,555 : DEBUG : main : Node : Create Spark Context : 0:1 : Configure succeeded. (Create Spark Context)
2018-03-27 18:05:39,555 : DEBUG : main : NodeContainer : Create Spark Context : 0:1 : Create Spark Context 0:1 has new state: CONFIGURED
2018-03-27 18:06:11,318 : DEBUG : main : WorkflowEditor : : : Saving workflow test 0
2018-03-27 18:06:11,358 : DEBUG : ModalContext : FileSingleNodeContainerPersistor : : : Replaced node directory "/home/ubuntu/knime-workspace/test/Create Spark Context (#1)"
2018-03-27 18:06:15,229 : DEBUG : main : ExecuteAction : : : Creating execution job for 1 node(s)…
2018-03-27 18:06:15,230 : DEBUG : main : NodeContainer : : : Setting dirty flag on Create Spark Context 0:1
2018-03-27 18:06:15,230 : DEBUG : main : NodeContainer : : : Setting dirty flag on test 0
2018-03-27 18:06:15,230 : DEBUG : main : NodeContainer : : : Create Spark Context 0:1 has new state: CONFIGURED_MARKEDFOREXEC
2018-03-27 18:06:15,230 : DEBUG : main : NodeContainer : : : Create Spark Context 0:1 has new state: CONFIGURED_QUEUED
2018-03-27 18:06:15,230 : DEBUG : KNIME-Workflow-Notifier : WorkflowEditor : : : Workflow event triggered: WorkflowEvent [type=WORKFLOW_DIRTY;node=0;old=null;new=null;timestamp=Mar 27, 2018 6:06:15 PM]
2018-03-27 18:06:15,230 : DEBUG : main : NodeContainer : : : test 0 has new state: EXECUTING
2018-03-27 18:06:15,230 : DEBUG : KNIME-WFM-Parent-Notifier : NodeContainer : : : ROOT has new state: EXECUTING
2018-03-27 18:06:15,234 : DEBUG : KNIME-Worker-4 : WorkflowManager : Create Spark Context : 0:1 : Create Spark Context 0:1 doBeforePreExecution
2018-03-27 18:06:15,234 : DEBUG : KNIME-Worker-4 : NodeContainer : Create Spark Context : 0:1 : Create Spark Context 0:1 has new state: PREEXECUTE
2018-03-27 18:06:15,234 : DEBUG : KNIME-Worker-4 : WorkflowManager : Create Spark Context : 0:1 : Create Spark Context 0:1 doBeforeExecution
2018-03-27 18:06:15,236 : DEBUG : KNIME-Worker-4 : NodeContainer : Create Spark Context : 0:1 : Create Spark Context 0:1 has new state: EXECUTING
2018-03-27 18:06:15,236 : DEBUG : KNIME-Worker-4 : WorkflowFileStoreHandlerRepository : Create Spark Context : 0:1 : Adding handler 06d64ce1-87ef-4086-8d27-5c970651fa67 (Create Spark Context 0:1: ) - 1 in total
2018-03-27 18:06:15,237 : DEBUG : KNIME-Worker-4 : LocalNodeExecutionJob : Create Spark Context : 0:1 : Create Spark Context 0:1 Start execute
2018-03-27 18:06:15,237 : INFO : KNIME-Worker-4 : JobserverSparkContext : Create Spark Context : 0:1 : Spark context jobserver://localhost:8090/knimeSparkContext changed status from CONFIGURED to CONFIGURED
2018-03-27 18:06:15,238 : DEBUG : KNIME-Worker-4 : JobserverSparkContext : Create Spark Context : 0:1 : Checking if remote context exists. Name: knimeSparkContext
2018-03-27 18:06:15,259 : DEBUG : KNIME-Worker-4 : JobserverSparkContext : Create Spark Context : 0:1 : Remote context does not exist. Name: knimeSparkContext
2018-03-27 18:06:15,259 : DEBUG : KNIME-Worker-4 : JobserverSparkContext : Create Spark Context : 0:1 : Creating new remote Spark context. Name: knimeSparkContext
2018-03-27 18:07:15,500 : ERROR : KNIME-Worker-4 : CreateContextRequest : Create Spark Context : 0:1 : HTTP Status code: 500 | Response Body: The server was not able to produce a timely response to your request.
2018-03-27 18:07:15,501 : INFO : KNIME-Worker-4 : JobserverSparkContext : Create Spark Context : 0:1 : Spark context jobserver://localhost:8090/knimeSparkContext changed status from CONFIGURED to CONFIGURED
2018-03-27 18:07:15,501 : DEBUG : KNIME-Worker-4 : Node : Create Spark Context : 0:1 : reset
2018-03-27 18:07:15,501 : DEBUG : KNIME-Worker-4 : SparkNodeModel : Create Spark Context : 0:1 : In reset() of SparkNodeModel. Calling deleteSparkDataObjects.
2018-03-27 18:07:15,501 : ERROR : KNIME-Worker-4 : Node : Create Spark Context : 0:1 : Execute failed: Spark Jobserver gave unexpected response (for details see View > Open KNIME log). Possible reason: Incompatible Jobserver version, malconfigured Spark Jobserver
2018-03-27 18:07:15,501 : DEBUG : KNIME-Worker-4 : Node : Create Spark Context : 0:1 : Execute failed: Spark Jobserver gave unexpected response (for details see View > Open KNIME log). Possible reason: Incompatible Jobserver version, malconfigured Spark Jobserver
org.knime.bigdata.spark.core.exception.KNIMESparkException: Spark Jobserver gave unexpected response (for details see View > Open KNIME log). Possible reason: Incompatible Jobserver version, malconfigured Spark Jobserver
at org.knime.bigdata.spark.core.context.jobserver.request.AbstractJobserverRequest.createUnexpectedResponseException(AbstractJobserverRequest.java:154)
at org.knime.bigdata.spark.core.context.jobserver.request.AbstractJobserverRequest.handleGeneralFailures(AbstractJobserverRequest.java:123)
at org.knime.bigdata.spark.core.context.jobserver.request.CreateContextRequest.sendInternal(CreateContextRequest.java:76)
at org.knime.bigdata.spark.core.context.jobserver.request.CreateContextRequest.sendInternal(CreateContextRequest.java:1)
at org.knime.bigdata.spark.core.context.jobserver.request.AbstractJobserverRequest.send(AbstractJobserverRequest.java:72)
at org.knime.bigdata.spark.core.context.jobserver.JobserverSparkContext.createRemoteSparkContext(JobserverSparkContext.java:465)
at org.knime.bigdata.spark.core.context.jobserver.JobserverSparkContext.access$4(JobserverSparkContext.java:459)
at org.knime.bigdata.spark.core.context.jobserver.JobserverSparkContext$1.run(JobserverSparkContext.java:242)
at org.knime.bigdata.spark.core.context.jobserver.JobserverSparkContext.runWithResetOnFailure(JobserverSparkContext.java:341)
at org.knime.bigdata.spark.core.context.jobserver.JobserverSparkContext.open(JobserverSparkContext.java:230)
at org.knime.bigdata.spark.core.context.SparkContext.ensureOpened(SparkContext.java:64)
at org.knime.bigdata.spark.node.util.context.create.SparkContextCreatorNodeModel.executeInternal(SparkContextCreatorNodeModel.java:155)
at org.knime.bigdata.spark.core.node.SparkNodeModel.execute(SparkNodeModel.java:242)
at org.knime.core.node.NodeModel.executeModel(NodeModel.java:567)
at org.knime.core.node.Node.invokeFullyNodeModelExecute(Node.java:1172)
at org.knime.core.node.Node.execute(Node.java:959)
at org.knime.core.node.workflow.NativeNodeContainer.performExecuteNode(NativeNodeContainer.java:561)
at org.knime.core.node.exec.LocalNodeExecutionJob.mainExecute(LocalNodeExecutionJob.java:95)
at org.knime.core.node.workflow.NodeExecutionJob.internalRun(NodeExecutionJob.java:179)
at org.knime.core.node.workflow.NodeExecutionJob.run(NodeExecutionJob.java:110)
at org.knime.core.util.ThreadUtils$RunnableWithContextImpl.runWithContext(ThreadUtils.java:328)
at org.knime.core.util.ThreadUtils$RunnableWithContext.run(ThreadUtils.java:204)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at org.knime.core.util.ThreadPool$MyFuture.run(ThreadPool.java:123)
at org.knime.core.util.ThreadPool$Worker.run(ThreadPool.java:246)
2018-03-27 18:07:15,501 : DEBUG : KNIME-Worker-4 : WorkflowManager : Create Spark Context : 0:1 : Create Spark Context 0:1 doBeforePostExecution
2018-03-27 18:07:15,501 : DEBUG : KNIME-Worker-4 : NodeContainer : Create Spark Context : 0:1 : Create Spark Context 0:1 has new state: POSTEXECUTE
2018-03-27 18:07:15,501 : DEBUG : KNIME-Worker-4 : WorkflowManager : Create Spark Context : 0:1 : Create Spark Context 0:1 doAfterExecute - failure
2018-03-27 18:07:15,501 : DEBUG : KNIME-Worker-4 : Node : Create Spark Context : 0:1 : reset
2018-03-27 18:07:15,501 : DEBUG : KNIME-Worker-4 : SparkNodeModel : Create Spark Context : 0:1 : In reset() of SparkNodeModel. Calling deleteSparkDataObjects.
2018-03-27 18:07:15,501 : DEBUG : KNIME-Worker-4 : Node : Create Spark Context : 0:1 : clean output ports.
2018-03-27 18:07:15,502 : DEBUG : KNIME-Worker-4 : WorkflowFileStoreHandlerRepository : Create Spark Context : 0:1 : Removing handler 06d64ce1-87ef-4086-8d27-5c970651fa67 (Create Spark Context 0:1: ) - 0 remaining
2018-03-27 18:07:15,502 : DEBUG : KNIME-Worker-4 : NodeContainer : Create Spark Context : 0:1 : Create Spark Context 0:1 has new state: IDLE
2018-03-27 18:07:15,502 : DEBUG : KNIME-Worker-4 : SparkContextCreatorNodeModel : Create Spark Context : 0:1 : Reconfiguring old context with same ID.
2018-03-27 18:07:15,502 : DEBUG : KNIME-Worker-4 : Node : Create Spark Context : 0:1 : Configure succeeded. (Create Spark Context)
2018-03-27 18:07:15,502 : DEBUG : KNIME-Worker-4 : NodeContainer : Create Spark Context : 0:1 : Create Spark Context 0:1 has new state: CONFIGURED
2018-03-27 18:07:15,502 : DEBUG : KNIME-Worker-4 : NodeContainer : Create Spark Context : 0:1 : test 0 has new state: CONFIGURED
2018-03-27 18:07:15,502 : DEBUG : KNIME-WFM-Parent-Notifier : NodeContainer : : : ROOT has new state: IDLE
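On the job server side, I can also check its own log around the same timestamps (assuming the default log location from settings.sh; the file name and path may differ on other installs):
tail -n 200 /var/log/spark-job-server/spark-job-server.log   # job server's own log, written to LOG_DIR
to see what the server itself reports while the Create Spark Context node is running.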
I have also attached the other error log files.
Can someone help me understand how to make this work? I don't know what to think anymore; I have tried all the possible combinations, but without any positive result.
Since I didn't mention it before: there is no Kerberos authentication configured, neither for the cluster defined in Cloudera nor for the KNIME Spark job server connection. I have just installed and configured the spark-job-server without any other features.
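For completeness, this is roughly what the job server configuration looks like on my side (paths and values reflect my understanding of the defaults from the installation guide, so please treat them as assumptions rather than exact quotes from my files):
# settings.sh (deployment settings)
SPARK_HOME=/opt/cloudera/parcels/CDH/lib/spark
LOG_DIR=/var/log/spark-job-server
# environment.conf (relevant part)
spark {
  master = "yarn-client"
}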
Thanks…
hs_err_pid13101.log (127.4 KB)
knime.log (15.2 KB)