H2O Sparkling Water connection with KNIME

Hi, I am new to H2O and am just trying it out with the KNIME connector.

I have been using a lot of Spark nodes in KNIME, but I found it hard to do cross validation with them since there are no native nodes for Spark CV. However, I saw that there is an H2O CV.

I have a question: to create the Sparkling Water context, I need to connect it to an existing Spark context. Let’s say I connect it to a Cross Validation on a Random Forest. If I use this Sparkling Water context, will it run on Spark?

How can I tell which H2O nodes run on Spark and which run on just H2O?

Also, any idea when H2O XGBoost will come to KNIME?

Thanks!
-Mizu

Hi @Mizunashi92,

If you use the H2O Sparkling Water Context, all the following nodes will process their computations on Spark, i.e. in your example both the CV and the Random Forest will be executed on Spark.

If you are using just one H2O context, all of your H2O nodes will run with this context. If you have both a local and a Spark context node in the same workflow, it depends on how you connect the nodes: all nodes downstream of a context node use that context.
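
To make the rule concrete, here is a minimal toy sketch in Python. It is not KNIME or H2O API: the node names, the `upstream` dictionary, and the `resolve_context` helper are all hypothetical, and it only assumes that a node uses the first context node found when walking back along its connections.

```python
from typing import Dict, List, Optional

# Hypothetical labels for the two kinds of context node (illustrative only).
CONTEXT_KINDS: Dict[str, str] = {
    "local_context": "local",
    "sparkling_water_context": "spark",
}


def resolve_context(node: str, upstream: Dict[str, List[str]]) -> Optional[str]:
    """Walk upstream from `node` and return the kind of the first context found."""
    seen = set()
    stack = [node]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        if current in CONTEXT_KINDS:
            return CONTEXT_KINDS[current]
        # Keep walking towards the start of the workflow.
        stack.extend(upstream.get(current, []))
    return None  # no context node upstream


# Toy workflow with two branches: each node maps to its upstream nodes.
upstream = {
    "sparkling_water_context": ["spark_context"],
    "cross_validation": ["sparkling_water_context"],
    "random_forest": ["cross_validation"],
    "gbm": ["local_context"],
}

print(resolve_context("random_forest", upstream))  # spark
print(resolve_context("gbm", upstream))            # local
```

In this toy workflow the Random Forest sits behind the Sparkling Water context, so it resolves to Spark, while the node wired to the local context stays local.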

H2O XGBoost is on our list, but we are not sure yet when it will be done.
However, you might want to check out our implementation of XGBoost. There are several nodes you can use (they don’t run on Spark, unfortunately): xgboost – KNIME Community Hub

Cheers,
Simon

Just to add to @SimonS’s answer: the H2O implementation of GBDT is not much slower than the one from XGBoost. So just try it out in a Sparkling Water context. Or do you have specific requirements in mind that would favour XGBoost over H2O?

@lisovyi I think what @Mizunashi92 meant was the H2O XGBoost implementation. H2O does have one, but we don’t have it integrated into KNIME yet.
Do you mean the H2O Gradient Boosting Machine with GBDT? You are right, this is a good alternative to the XGBoost algorithm.

Thanks for the answer, it has been very helpful! :smiley:
