Questions about the Output Port in PySpark

Hello, I have a question about the PySpark node in KNIME version 4.7.1.

The Python Script node can export trained ML/DL models through its output port as an "Output object (pickled)."

However, the PySpark node does not appear to offer this kind of output port.

So how can we export ML/DL models trained in the PySpark node as .pkl files, other than doing it in code?

The export target is the Local space or KNIME Server.

Hi @JaeHwanChoi,

The PySpark nodes don’t have the pickled output port.

What ML/DL models would you like to export? Spark models have a save method to export them the Spark way; they should not be exported as pickles.



Hi @sascha.wolke,

Thank you for your answer

I want to train TabNet and Prophet, DL/forecasting algorithms developed by Google and Facebook, in the PySpark node, and then save the trained models to Object Storage as .pkl files.

In the Python Script node, this process works fine using the Model Writer node.

  1. If so, how should I proceed in the PySpark node?

  2. Is the “save method” a code-based way of storing models within PySpark? If so, could you share the documentation on how to use it?

  3. Is there any way to export to an output port other than saving via code?

Hi @JaeHwanChoi,

I suggest using the Python nodes in KNIME to get started with Prophet.

Do you run a Spark cluster with multiple nodes? Otherwise, there might be no benefit to running the code with PySpark.

To exchange the data, use Spark data frames instead of pickles. They are what the PySpark node in KNIME exposes as an output port, and what Spark uses to distribute the data in the cluster. Spark data frames can be converted to pandas data frames and used e.g. with Prophet, but doing this at scale might become way more difficult and require some custom Python code, as Prophet does not support Spark out of the box.

A nice blog post about this: Scalable Time-Series Forecasting with Spark and Prophet | by Young Yoon | Medium

Keep in mind that running this with PySpark only helps if you run a Spark cluster with multiple executors.



Thank you for your answer, @sascha.wolke .

Well, applying various analysis models in PySpark seems to be limited.

If so, the guide for using the Python Script node in KNIME seems to be available via “KNIME Python API — KNIME Python API documentation #knime-python-script-labs-api”.

Then, is there any guide to using PySpark in KNIME? Examples of code used inside the PySpark node, like in the document above, would be helpful.

Hi @JaeHwanChoi,

Just to confirm, are you using an Apache Spark or Databricks cluster?

Well, applying various analysis models in PySpark seems to be limited.

Yes, that’s true. PySpark can’t be used as a 1:1 replacement for existing Python code, as it has to run distributed across a cluster, and this requires a lot more effort to organize the execution.

A great starting point with PySpark is the Apache Spark documentation: Getting Started — PySpark 3.2.0 documentation

Then, is there any guide to using PySpark in KNIME?

Not right now. Compared to the other Python nodes in KNIME, the PySpark nodes are limited and do not support the KNIME Python API.

