I have tried KNIME with Spark on our cluster and saw some configuration available in the node (address, port, etc.), but when trying the H2O nodes I don't see any node with a config dialog. How does this work with H2O in general? (I have no idea about H2O, by the way.) How does KNIME integrate with H2O, either on a cluster or locally? With Spark, for example, we have a configurable node plus the Spark Job Server, etc.
Thanks for any answers.
Hi @zebov -
You initially invoke H2O using an H2O Local Context node, and then usually transfer data from KNIME into the H2O format using an H2O Table to Frame node to begin processing.
Last week Marten wrote an excellent post on our blog about how to tackle a Kaggle challenge using KNIME’s H2O integration. The example workflow would be very useful for you, I think. Check it out here: https://www.knime.com/blog/solving-a-kaggle-challenge-using-the-combined-power-of-knime-analytics-platform-h2o
Thanks, I read it, but I don't have H2O installed and was still able to execute the node. That is what confuses me: no H2O is installed, and yet the Local Context node executes fine. This is something I don't get.
In the case of the H2O Local Context, we ship H2O with KNIME, so no separate installation is needed. It's not yet possible to run H2O Sparkling Water, but we're on it.
I hope this helps.
OK, so if H2O is integrated within KNIME, how does it do “distributed” operations then? Or within KNIME does it run only in memory, on a local one-node cluster? In general, how can we create an H2O cluster in KNIME? Is it possible?
Right, at the moment H2O runs only on “a single cluster node”, which is your local machine. It still runs in parallel, etc. However, we designed the extension in a way that we could theoretically add other “H2O Contexts” that are not local (e.g. running on a cluster), and the nodes will still run out of the box. One of these contexts will be “H2O Sparkling Water”. I can't promise anything regarding a timeline, but we're on it.
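To make the “swappable context” idea a bit more concrete, here is a rough Python sketch of the general design pattern. It is not KNIME's or H2O's actual code: every class and function name in it (H2OContext, LocalContext, RemoteClusterContext, table_to_frame) is hypothetical, as are the host and port, and the "remote" context only pretends to talk to a cluster. The point is just that a node written against the abstract context does not need to change when a new kind of context is added.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class H2OContext(ABC):
    """Hypothetical abstraction of 'where H2O runs' (not a real KNIME/H2O API)."""

    @abstractmethod
    def describe(self) -> str:
        """Return a human-readable description of the backend."""

    @abstractmethod
    def upload(self, rows: List[Dict]) -> str:
        """Hand rows to the backend and return an identifier for the new frame."""


class LocalContext(H2OContext):
    """Stand-in for a bundled, single-node H2O on the local machine."""

    def __init__(self) -> None:
        self._frames: Dict[str, List[Dict]] = {}

    def describe(self) -> str:
        return "local single-node H2O"

    def upload(self, rows: List[Dict]) -> str:
        frame_id = f"frame_{len(self._frames)}"
        self._frames[frame_id] = list(rows)
        return frame_id


class RemoteClusterContext(H2OContext):
    """Stand-in for a non-local context; the address and port are assumptions."""

    def __init__(self, address: str, port: int) -> None:
        self.address = address
        self.port = port
        self._frames: Dict[str, List[Dict]] = {}

    def describe(self) -> str:
        return f"remote H2O cluster at {self.address}:{self.port}"

    def upload(self, rows: List[Dict]) -> str:
        # A real implementation would send the data over the network;
        # this sketch only stores it in memory to stay self-contained.
        frame_id = f"frame_{len(self._frames)}"
        self._frames[frame_id] = list(rows)
        return frame_id


def table_to_frame(context: H2OContext, rows: List[Dict]) -> str:
    """A 'node' that depends only on the abstract context, not on its type."""
    return context.upload(rows)


if __name__ == "__main__":
    data = [{"x": 1.0, "y": 2.0}, {"x": 3.0, "y": 4.0}]
    for ctx in (LocalContext(), RemoteClusterContext("example-host", 54321)):
        print(ctx.describe(), "->", table_to_frame(ctx, data))
```

In this sketch, table_to_frame plays the role of a node such as H2O Table to Frame: it receives whatever context is connected upstream and works the same way for both.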