Problem with disk space during workflow execution

Hello.

I have a problem with disk space during execution of my workflow. I have a dataset of 85 GB (that's huge for me :grinning:) and only 350 GB of free space on my device, so execution stops not far from the end due to lack of disk space. Is there any way to avoid storing the results of some nodes (in temporary files)? For example, I have several nodes that preprocess my dataset, and I don't need to keep the results of all of them; I only need the result of the last node in this group. Is there a way to deal with this somehow?

Thanks.

@dbolshev streaming might be an option for you
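To illustrate what streaming execution buys you conceptually (this is a plain-Python sketch, not KNIME API code): each "node" below is a generator that handles one row at a time, so the intermediate results of the preprocessing steps are never fully materialized on disk or in memory; only the final output is collected.

```python
# Conceptual sketch of streaming: chained generators act like a row-wise
# pipeline of preprocessing "nodes". Names and steps here are made up
# for illustration.

def read_rows():
    # Stand-in for a large input table: yields rows one at a time.
    for i in range(1000):
        yield {"id": i, "value": i}

def drop_odd_ids(rows):
    # Preprocessing "node" 1: filter out rows with odd ids.
    for row in rows:
        if row["id"] % 2 == 0:
            yield row

def double_value(rows):
    # Preprocessing "node" 2: transform the value column.
    for row in rows:
        row["value"] *= 2
        yield row

# Chain the nodes; nothing intermediate is stored. Only the final
# list() call materializes a result.
result = list(double_value(drop_odd_ids(read_rows())))
print(len(result))
```

In KNIME the same idea applies per node batch rather than per row, but the effect is the same: intermediate tables between streamed nodes are not written out.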


@mlauber71 thanks a lot!


Sorry for bothering you, but I can’t find “wrapped metanodes” in my interface. I use KNIME 4.6.0. Does it need to be installed separately?


Hi @dbolshev ,

I believe that “wrapped metanode” is the old name (earlier KNIME versions, before my time!) for what is now a “component”.


@takbb, @dbolshev you can configure streaming in the “Job Manager Selection” tab of the component.

You might also want to consider tweaking the data storage. The Parquet and ORC file formats offer a good combination of compression and preservation of column types. They also let you handle large files in chunks while still treating them as one (big) table when necessary.


Thank you @takbb! Got it.

@mlauber71 thanks a lot!

