KNIME + H2O - how it works...

I have tried KNIME with Spark on our cluster and saw that the Spark nodes expose configuration options (address, port, etc.). With the H2O nodes, however, I don't see any node with a configuration dialog. How does the H2O integration work in general? (I have no idea about H2O, by the way.) How does KNIME integrate with H2O, either on a cluster or locally? With Spark, for example, we have a configurable node plus the Spark Job Server.

Thanks for answers.

Hi @zebov -

You start H2O with an H2O Local Context node, and then usually convert your KNIME data into the H2O format with an H2O Table to Frame node before you begin processing.

Last week Marten wrote an excellent post on our blog about how to tackle a Kaggle challenge using KNIME’s H2O integration. The example workflow would be very useful for you, I think. Check it out here:


Thanks, I read it, but I don't have H2O installed and was still able to execute the node. That is what confuses me: no H2O installed, and yet the H2O Local Context node executes fine. This is something I don't get…

Hi @zebov,

In the case of the H2O Local Context, we ship the H2O installation with KNIME. It’s not yet possible to run H2O Sparkling Water, but we’re on it.

I hope this helps,



OK, so if H2O is integrated within KNIME, how does it do “distributed” operations? Or is it in-memory only within KNIME, on a local one-node cluster? In general, how can we create an H2O cluster in KNIME? Is that possible?

Right, at the moment H2O runs only on “a single cluster node”, which is your local machine. It still runs in parallel on that machine, though. However, we designed the extension so that we could, in theory, add other “H2O Contexts” that are not local (e.g. running on a cluster), and the existing nodes would still work out of the box. One of these contexts will be “H2O Sparkling Water”. I can’t promise anything regarding a timeline, but we’re on it ;)
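
To make the "pluggable context" idea a bit more concrete, here is a minimal Python sketch. It is not KNIME's or H2O's actual code: every class and function name (`H2OContext`, `LocalContext`, `ClusterContext`, `mean_node`) is hypothetical, and the "cluster" is only simulated in-process by splitting the data into chunks. The point is just that a downstream node talks to an abstract context, so swapping the context does not require changing the node.

```python
from abc import ABC, abstractmethod
from typing import Sequence


class H2OContext(ABC):
    """Hypothetical interface: decides where the computation runs."""

    @abstractmethod
    def describe(self) -> str:
        ...

    @abstractmethod
    def column_mean(self, values: Sequence[float]) -> float:
        ...


class LocalContext(H2OContext):
    """Stand-in for a local context: everything runs on this machine."""

    def describe(self) -> str:
        return "local (single node)"

    def column_mean(self, values: Sequence[float]) -> float:
        return sum(values) / len(values)


class ClusterContext(H2OContext):
    """Hypothetical non-local context; workers are only simulated here."""

    def __init__(self, workers: int) -> None:
        self.workers = workers

    def describe(self) -> str:
        return f"cluster ({self.workers} simulated workers)"

    def column_mean(self, values: Sequence[float]) -> float:
        # Split the column into one chunk per worker, compute partial
        # sums and counts, then combine them into the overall mean.
        chunks = [values[i::self.workers] for i in range(self.workers)]
        partials = [(sum(c), len(c)) for c in chunks if c]
        total = sum(s for s, _ in partials)
        count = sum(n for _, n in partials)
        return total / count


def mean_node(context: H2OContext, values: Sequence[float]) -> float:
    """Stand-in for a downstream node: it only uses the context interface."""
    return context.column_mean(values)


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 4.0]
    for ctx in (LocalContext(), ClusterContext(workers=3)):
        print(ctx.describe(), mean_node(ctx, data))
```

In this sketch `mean_node` never checks which context it received, which mirrors the claim above that nodes would keep working if a non-local context were added later.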