H2O AutoML Learner

helfortuny · February 20, 2023, 1:03pm

Hi!!!

I have a question. I’m trying to build a classification model. Right now I’m using the H2O models, because they let me specify a weight for each row through an extra column. I’m interested in that option because I have a very unbalanced dataset.
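As an illustration of what such a weight column could contain, here is a minimal Python sketch that assumes inverse class-frequency weighting. This is one common choice, not something the H2O nodes prescribe, and the helper name `inverse_frequency_weights` and the example labels are hypothetical.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return one weight per row so that every class carries the same total weight."""
    counts = Counter(labels)
    n_rows = len(labels)
    n_classes = len(counts)
    # weight = n_rows / (n_classes * class_count): rows of rare classes get larger weights
    return [n_rows / (n_classes * counts[label]) for label in labels]


# Hypothetical unbalanced target column: 8 rows of "no", 2 rows of "yes"
labels = ["no"] * 8 + ["yes"] * 2
weights = inverse_frequency_weights(labels)

# Each "no" row gets a small weight, each "yes" row a large one
print(weights[0], weights[-1])
```

The resulting list would be written to the extra column that is then selected as the weight column in the learner.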

I have found a node called H2O AutoML Learner, which trains all the models selected in the configuration on the dataset and returns the leading model among them. Before, I was implementing each classification model one by one, using a loop start and end to optimise its parameters, and at the end I compared the accuracy of all the models. I understand that with this node I can do exactly the same: optimise the parameters of each model and compare different models, right?

So it should be better to use that node than all the nodes I used before, right?

Daniel_Weikert · February 20, 2023, 5:40pm

This might depend on whether AutoML also optimizes parameters or only benchmarks individual models.
br

evert.homan_scilifelab.se · February 20, 2023, 8:28pm

From the KNIME Hub:

Learns the specified types of models using H2O AutoML and returns the leading model amongst these. As part of the learning process, hyperparameters are automatically optimized by H2O using a random grid search.
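To make the term "random grid search" concrete, here is a minimal, library-free Python sketch of the general technique: sample a limited number of combinations from a discrete hyperparameter grid, score each one, and rank them. It does not reproduce H2O's internals. The function `random_grid_search`, the grid values, and `toy_score` (a stand-in for a real cross-validated metric) are all assumptions made for illustration.

```python
import itertools
import random


def random_grid_search(grid, score_fn, max_models, seed=0):
    """Score up to max_models randomly chosen combinations from a discrete grid."""
    names = sorted(grid)
    # Enumerate every combination of the grid values
    combos = [
        dict(zip(names, values))
        for values in itertools.product(*(grid[name] for name in names))
    ]
    rng = random.Random(seed)
    # Sample without replacement, so no combination is evaluated twice
    sampled = rng.sample(combos, min(max_models, len(combos)))
    results = [(score_fn(params), params) for params in sampled]
    # Best score first; sort on the score only, since dicts are not orderable
    results.sort(key=lambda item: item[0], reverse=True)
    return results


def toy_score(params):
    """Stand-in for a validation metric (higher is better); not a real model."""
    return -abs(params["max_depth"] - 6) - 10 * abs(params["learn_rate"] - 0.1)


# Hypothetical grid: 4 x 3 = 12 combinations, of which only 5 are tried
grid = {"max_depth": [2, 4, 6, 8], "learn_rate": [0.01, 0.1, 0.3]}
leaderboard = random_grid_search(grid, toy_score, max_models=5)
for score, params in leaderboard:
    print(round(score, 3), params)
```

Keeping the full ranked list, rather than only the top entry, is what gives the per-combination comparison that a manual parameter-optimisation loop also produces.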

SimonS · February 21, 2023, 8:35am

Hi all,

as @evert.homan_scilifelab.se already pointed out, the H2O AutoML Learner also optimizes hyperparameters. It is a very powerful node.

There are a few downsides to the AutoML learner:

It does not support all the models for which separate nodes exist. You can see the list of supported models in the Algorithm Settings tab.
It needs more computational power and time to run than a single node.
It does not output a comparison of the performance for each hyperparameter and model combination. If you do it manually as before, you get more insights.

It depends on what you want to achieve. If you can live with the downsides or if the model it gives out works for you, the node might be perfect for you.
The AutoML learner might also just be used to get a feeling for which model actually performs best. Afterwards, one could still do a manual optimization of that specific model.

Hope that helps a bit.
Best,
Simon

Daniel_Weikert · February 21, 2023, 5:27pm

Thanks for the update @evert.homan_scilifelab.se
br
