H2O binomial classification cutoff


I used the H2O Gradient Boosting Machine Learner in one of my workflows and noticed that for a binomial classification task the cutoff between the positive and negative value is not at the standard 0.5.

This behaviour can also be seen in the example workflow 05_H2O_Scoring on the example server. There are 3 rows whose probability for Cluster_1 is above 0.5, but the prediction is still Cluster_0.

Are the probabilities correct, so that I could just overwrite the prediction based on the probability?


Hi lindig,

H2O does not use a fixed cutoff for binomial models. It chooses the threshold that maximizes the F1 score, which is why it is usually not exactly 0.5. The probabilities themselves are correct, so if you want 0.5 as the threshold you can derive the prediction yourself from the probability column.
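
To illustrate, here is a small Python sketch of both ideas: re-deriving the prediction with a fixed 0.5 cutoff, and finding a threshold that maximizes F1 on labelled data. It is only an illustration, not H2O's actual implementation. The function names, the sample probabilities, and the choice of ">=" at the cutoff are assumptions made for the example; only the class labels Cluster_0 and Cluster_1 come from the workflow mentioned above.

```python
# Illustrative sketch only; not H2O's internal threshold search.
# Function names and sample data are hypothetical.

def predict_with_threshold(p_positive, threshold=0.5,
                           positive="Cluster_1", negative="Cluster_0"):
    """Assign the positive class when its probability reaches the threshold.

    Using ">=" at the boundary is an assumption of this sketch.
    """
    return [positive if p >= threshold else negative for p in p_positive]


def max_f1_threshold(p_positive, actual, positive="Cluster_1"):
    """Try each observed probability as a cutoff and keep the best F1."""
    best_t, best_f1 = 0.5, -1.0
    for t in sorted(set(p_positive)):
        pairs = list(zip(p_positive, actual))
        tp = sum(1 for p, y in pairs if p >= t and y == positive)
        fp = sum(1 for p, y in pairs if p >= t and y != positive)
        fn = sum(1 for p, y in pairs if p < t and y == positive)
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when there are no TPs
        f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1


# Made-up probabilities for Cluster_1 and the true classes
probs = [0.15, 0.40, 0.55, 0.62, 0.70, 0.90]
truth = ["Cluster_0", "Cluster_0", "Cluster_0", "Cluster_0",
         "Cluster_1", "Cluster_1"]

fixed_cutoff_predictions = predict_with_threshold(probs, 0.5)
f1_threshold, f1_value = max_f1_threshold(probs, truth)
f1_cutoff_predictions = predict_with_threshold(probs, f1_threshold)

print(fixed_cutoff_predictions)
print(f1_threshold, f1_value)
print(f1_cutoff_predictions)
```

With this made-up data the F1-maximizing cutoff lands above 0.5, so the rows with probabilities 0.55 and 0.62 are labelled Cluster_0 even though their Cluster_1 probability is above 0.5. That is the same effect as described in the question.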