I have a question about the KNIME PySpark integration. I'm following these steps:
- I'm using the "Create Databricks Environment" node to create the Spark context.
- Then I use the "PySpark Script Source" node to run some Python code; its output is a resulting DataFrame.
- Where is this output DataFrame stored in Spark?
- Will it persist for as long as the cluster does?
- How can I delete this result DataFrame once the results have been written to DBFS/blob storage?
Also, it would be great if you could link to any relevant reference documentation in your answers.