In Memory Computation using Ramdisk


A colleague asked my why, when attempting to bring my system to the limit to find the most efficient solutions, I am not using a Ramdisk. It’s been years I used one and on Unix but not Windows so his idea struck me like lightning (Kudos to him!).

Using ImDisk I created one of 16 GB size. Exported a test workflow, you find both linked below, and created a new test workspace. This approach is like a MySQL Server and indeed blew me away.

Crunching the 10 Million rows is by an order of magnitude faster. Only when trying to write the data, as I checked something out in regards to this performance bug report of mine, the 16 GB did not suffice as they were allocated within mere seconds.

Anyways, this begs the questions:

  1. What is the difference to the “Memory Policy” tab where the default is set to “Cache tables in memory”?
  2. Wouldn’t “Full in-memory processing” be an awesome feature?


I believe ‘-Xmx2048m’ + “Cache tables in memory” roughly equal to 2G ram disk

I have 64 GB in total of which I allocated up to 50 to Knime. After / During processing the node result data is read and written to disk.