How to manage resources in KNIME Server

Hi there.

I’m wondering how to manage resources on a KNIME Server when processing a dataset that is larger than the server’s available memory.

When I process such a dataset, the executor gets killed.

What I want is to process a dataset larger than the available memory without killing the executor.

Is there a way to handle this automatically, for example by swapping to disk or working through the data with a queue structure?

Please let me know if there is any way to solve this problem.

Thanks,
hhkim

Hi @hhkim ,

this is more of an Analytics Platform question, as it is independent of whether the workflow is running on an Executor or in the Analytics Platform.

This is even a question you could pose for any software in general. In KNIME, there are ways of handling data in batches. Check out chunk loops, for instance; a small sketch of the idea follows below.
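Just to illustrate the batching idea outside of KNIME: a minimal plain-pandas sketch (the file names, column names, and chunk size are made-up placeholders), similar in spirit to what a Chunk Loop does, that keeps only one batch in memory at a time:

```python
# Plain-Python illustration of the chunking idea (not a KNIME node):
# read a large CSV in fixed-size batches so only one batch is in memory at a time.
# "big_input.csv", "aggregated.csv", the column names and the chunk size are placeholders.
import pandas as pd

chunk_results = []
for chunk in pd.read_csv("big_input.csv", chunksize=100_000):
    # do the per-batch work here, e.g. filter and aggregate
    filtered = chunk[chunk["value"] > 0]
    chunk_results.append(filtered.groupby("category")["value"].sum())

# combine the partial results, which are small compared to the raw data
result = pd.concat(chunk_results).groupby(level=0).sum()
result.to_csv("aggregated.csv")
```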

Some nodes have settings to either run (faster) in-memory algorithms or cache intermediate results on disk. Be sure to untick any “Process in memory” checkboxes so those nodes work from disk instead of holding the whole table in RAM.
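In the same spirit, here is a generic sketch of caching an intermediate result on disk instead of keeping it in memory. This is plain pandas rather than a KNIME node setting; the paths and column names are assumptions, and Parquet support requires pyarrow to be installed:

```python
# Generic illustration of caching an intermediate result on disk instead of
# holding it in memory (not a KNIME node setting). Paths and columns are placeholders.
import pandas as pd

df = pd.read_csv("raw_data.csv")
intermediate = df.merge(pd.read_csv("lookup.csv"), on="key", how="left")

# persist the intermediate table and release the in-memory copies
intermediate.to_parquet("intermediate.parquet")
del df, intermediate

# later steps reload only the columns they need
intermediate = pd.read_parquet("intermediate.parquet", columns=["key", "value"])
```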

Of course you can also increase the allowed heap size (the -Xmx setting in the Executor’s knime.ini) beyond the physical memory present and configure the OS to use the main disk as swap, but this will be painfully slow; I’d advise against this approach.

You may also be able to change the order in which you process your data (filter first, before applying transformations), or use algorithms that are more memory-efficient; see the sketch below. I’m afraid this is very dependent on the task at hand.
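As a small sketch of the “filter first” idea (again plain pandas, with made-up column names): the expensive transformation is only applied to the rows that survive the filter, so the large intermediate table is never materialized.

```python
# Sketch of the "filter first" idea with placeholder column names:
# reducing the row count before an expensive transformation keeps the
# large intermediate table from ever being built.
import pandas as pd

df = pd.read_csv("big_input.csv")

# memory-hungry order: transform everything, then throw most of it away
# wide = df.join(df["text"].str.get_dummies())
# recent = wide[wide["year"] >= 2020]

# leaner order: filter first, then transform only the rows you keep
recent = df[df["year"] >= 2020]
wide = recent.join(recent["text"].str.get_dummies())
```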

I hope these pointers help.

Kind regards
Marvin
