Run an existing Spark job?

Is it possible to use KNIME to run a pre-existing Spark job? For example, can I run SparkPi from KNIME and then do further processing with its output? If so, which KNIME node(s) would I use to run my Spark job?

Thanks!

Hi,

The Spark Java Snippet node can run pre-existing Spark code (you can paste it in there). It is handed an existing SparkContext and, depending on the snippet type (regular, source, or sink), up to two input RDDs. The SparkPi example will not work out of the box, however, because it is a full Spark application that creates its own Spark context, while the KNIME Spark Executor creates and manages its own. If you need to use precompiled code, you can also add jar files in the Spark Java Snippet node. Note that you currently still need to copy those jar files over to your Spark cluster manually.
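
To give you an idea, here is a rough sketch of how the SparkPi logic could be adapted to run inside the snippet instead of as a standalone application. This is untested, and the exact method signature generated by the snippet node depends on your KNIME version; I am assuming here that the JavaSparkContext handed in by the node is available as sc:

    import java.util.ArrayList;
    import java.util.List;
    import org.apache.spark.api.java.JavaSparkContext;

    // Do NOT create a new JavaSparkContext here - the KNIME Spark Executor
    // manages the context and passes it to the snippet.
    public static double estimatePi(final JavaSparkContext sc, final int slices) {
        final int n = 100000 * slices;
        final List<Integer> seeds = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            seeds.add(i);
        }
        // Monte Carlo estimate: fraction of random points that fall
        // inside the unit circle, distributed over the cluster.
        final long inside = sc.parallelize(seeds, slices)
                .filter(i -> {
                    final double x = Math.random() * 2 - 1;
                    final double y = Math.random() * 2 - 1;
                    return x * x + y * y <= 1;
                })
                .count();
        return 4.0 * inside / n;
    }

If you want downstream nodes to pick the result up, you would then wrap the returned value in a one-row output RDD rather than just printing it, as the original SparkPi does.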

Hope that helps,

Björn

Hi,

You can use the Spark Java Snippet node to run arbitrary Spark jobs within KNIME. For an example of how to use the Spark Java Snippet node, have a look at the Modularized Spark Scripting example in the Node Guide.

The Spark Java Snippet node is part of the commercial KNIME Spark Executor. Here you can request a trial licence for the KNIME Big Data Extensions, which include the KNIME Spark Executor.

Bye

Tobias