Variable Importance H2O Gradient boosted trees

Cosima · December 22, 2020, 8:16pm

Hello,

I’m using an H2O gradient boosted trees model to predict a binary (dummy) target variable.
My independent variables are binary dummies as well.
I want to use the variable importance output as a tool for the model interpretation.

Is that a valid approach?
Can I tell whether an independent variable is considered important at its 1 level or at its 0 level?

Thank you so much in advance!

marten_kose · December 28, 2020, 10:24am

The variable importance of these nodes is based on whether the feature was selected for a split and the decrease in overall error due to this split. The higher the value, the more important the variable. For further reference please revisit the H2O documentation: http://docs.h2o.ai/h2o/latest-stable/h2o-docs/variable-importance.html#feature-importance-aka-variable-importance-plots
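To make the "decrease in overall error due to this split" part concrete, here is a minimal sketch in plain Python. It is not H2O's actual implementation; the data and the names `sse`, `split_gain`, `x_pos`, `x_neg` and `x_noise` are made up for illustration. It computes the squared-error reduction from a single split on a binary feature:

```python
# Toy illustration only (not H2O's code): squared-error reduction
# obtained by splitting the rows on one binary feature.

def sse(values):
    """Sum of squared errors of the values around their mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def split_gain(feature, target):
    """Decrease in squared error when rows are split into feature == 0 and feature == 1."""
    left = [t for f, t in zip(feature, target) if f == 0]
    right = [t for f, t in zip(feature, target) if f == 1]
    return sse(target) - (sse(left) + sse(right))

# Made-up data: x_pos moves with the target, x_neg moves against it,
# x_noise is unrelated to it.
target  = [1, 1, 1, 0, 0, 0, 1, 0]
x_pos   = [1, 1, 1, 0, 0, 0, 1, 0]
x_neg   = [0, 0, 0, 1, 1, 1, 0, 1]
x_noise = [1, 0, 1, 0, 1, 0, 0, 1]

gains = {
    "x_pos": split_gain(x_pos, target),
    "x_neg": split_gain(x_neg, target),
    "x_noise": split_gain(x_noise, target),
}
print(gains)
```

In this toy data, `x_pos` and `x_neg` get the same gain even though they push the target in opposite directions. A split-based importance of this kind measures how much a variable helps, not whether its 1 level or its 0 level is the one that raises the prediction.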

Cosima · April 7, 2021, 1:47pm

Thank you so much Marten.
I solved my problem by switching to the non-H2O gradient boosted trees node, implementing a permutation feature importance to extract the most important variables, and using a partial dependence plot to identify the direction of each relationship.
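In case it helps someone else, here is a rough sketch of the two ideas in plain Python. It is not my actual workflow: `predict` is a hypothetical stand-in for a trained model, and the features `a`, `b`, `c` and the helper names are made up for illustration. Permutation importance measures how much accuracy drops when one column is shuffled. For a binary variable the partial dependence reduces to the average prediction with the variable forced to 1 minus the average with it forced to 0, and the sign of that difference gives the direction.

```python
import random

def predict(row):
    """Hypothetical stand-in for a trained model: returns P(target = 1).
    Assumed toy relationship: 'a' raises it, 'b' lowers it, 'c' is ignored."""
    return 0.5 + 0.4 * row["a"] - 0.3 * row["b"]

def accuracy(rows, labels):
    """Share of rows where the thresholded prediction matches the label."""
    hits = sum((predict(r) >= 0.5) == bool(y) for r, y in zip(rows, labels))
    return hits / len(rows)

def permutation_importance(rows, labels, feature, n_repeats=20, seed=0):
    """Mean drop in accuracy after shuffling one feature column."""
    rng = random.Random(seed)
    baseline = accuracy(rows, labels)
    drops = []
    for _ in range(n_repeats):
        shuffled = [r[feature] for r in rows]
        rng.shuffle(shuffled)
        permuted = [{**r, feature: v} for r, v in zip(rows, shuffled)]
        drops.append(baseline - accuracy(permuted, labels))
    return sum(drops) / n_repeats

def partial_dependence_effect(rows, feature):
    """Average prediction with the feature forced to 1 minus forced to 0."""
    at_one = sum(predict({**r, feature: 1}) for r in rows) / len(rows)
    at_zero = sum(predict({**r, feature: 0}) for r in rows) / len(rows)
    return at_one - at_zero

# Made-up data: 200 rows of random binary features, labelled by the toy model.
data_rng = random.Random(42)
rows = [{name: data_rng.randint(0, 1) for name in ("a", "b", "c")}
        for _ in range(200)]
labels = [int(predict(r) >= 0.5) for r in rows]

for name in ("a", "b", "c"):
    print(name,
          permutation_importance(rows, labels, name),
          partial_dependence_effect(rows, name))
```

With this toy model, `a` and `b` both show a positive permutation importance, but the partial dependence effect is positive for `a` and negative for `b`. The ignored feature `c` gets zero on both.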