Storage and memory of Pyspark Script result data

Hi KNIME Support team.

I have a memory question regarding Pyspark Script.

After creating a Spark session through Livy, I am importing about 5 million records as Parquet and running ML analysis code with the PySpark Script node.

  1. After the PySpark Script finishes, where is the result data held in memory? Is it stored in Spark driver memory?

  2. If it is held in a specific location, can I reduce memory usage by clearing the cache there?
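
For reference, in plain PySpark an explicitly cached DataFrame is held in executor storage (memory and/or disk, depending on the storage level) rather than on the driver, unless the results are collected. Releasing it usually looks something like the minimal sketch below. The helper name `release_cached_data` and its parameters are hypothetical, and whether this also frees the DataFrames that KNIME keeps for the node's output is an assumption that is not confirmed in this thread.

```python
def release_cached_data(spark, df=None):
    """Drop cached data held by a Spark session (illustrative helper).

    spark: an existing pyspark.sql.SparkSession (assumed to be available
           in the script under this name).
    df:    optionally, one cached DataFrame to release first.
    """
    if df is not None:
        # Remove this DataFrame's cached blocks; blocking=True waits
        # until the blocks are actually deleted.
        df.unpersist(blocking=True)
    # Remove every cached table/DataFrame from the session's in-memory cache.
    spark.catalog.clearCache()

# Hypothetical usage inside a script where `spark` is the active session:
# release_cached_data(spark, df=my_cached_df)
```

This only affects data that was cached with `cache()` or `persist()`; it does not change the configured driver or executor memory sizes.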

Your answers will be appreciated.

Hi @JaeHwanChoi,

Is this the same question as in "Driver memory specs inquiry when using Pyspark Script"?

It would be great if we can keep this in one thread instead of double posts.

Thanks!
Sascha
