Variable Importance H2O Gradient boosted trees

Hello,

I’m using an H2O Gradient Boosted Trees model to predict a binary dummy target variable.
My independent variables are binary dummies as well.
I want to use the variable importance output as a tool for model interpretation.

  1. Is that a valid approach?
  2. Can I tell whether an independent variable is considered important at its 1 level or at its 0 level?

Thank you so much in advance!

The variable importance reported by these nodes is based on whether a feature was selected for a split and on how much that split decreased the overall error. The higher the value, the more important the variable. For further reference, please see the H2O documentation: http://docs.h2o.ai/h2o/latest-stable/h2o-docs/variable-importance.html#feature-importance-aka-variable-importance-plots
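
To make the "decrease in error due to a split" idea concrete, here is a minimal sketch in plain Python. It is not H2O's implementation: it scores a single candidate split per feature on made-up data, whereas a real GBM accumulates this kind of gain over every split in every tree. The feature names `a` and `b`, the data, and the helper functions are all hypothetical.

```python
# Toy illustration only: scores one candidate split per binary feature.
# A real GBM sums such gains over all splits in all trees.

def squared_error(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def split_gain(rows, target, feature):
    """Reduction in squared error from splitting on a binary feature."""
    left = [t for r, t in zip(rows, target) if r[feature] == 0]
    right = [t for r, t in zip(rows, target) if r[feature] == 1]
    return squared_error(target) - squared_error(left) - squared_error(right)


# Hypothetical data: two binary dummies and a binary target.
rows = [
    {"a": 0, "b": 0}, {"a": 0, "b": 1}, {"a": 1, "b": 0}, {"a": 1, "b": 1},
    {"a": 0, "b": 0}, {"a": 0, "b": 1}, {"a": 1, "b": 0}, {"a": 1, "b": 1},
]
target = [0, 0, 1, 1, 0, 0, 1, 0]

gains = {f: split_gain(rows, target, f) for f in ("a", "b")}
print(gains)  # "a" gets the larger gain on this toy data
```

Note that the gain is a single non-negative number per feature. It says how much a split on that feature helped, not whether the 1 level or the 0 level pushes the prediction up, so importance alone cannot answer the second question.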


Thank you so much Marten.
I solved my problem by switching to the non-H2O Gradient Boosted Trees node, implementing permutation feature importance to extract the most important variables, and using a partial dependence plot to identify the direction of the relationship.
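
For anyone who wants to see the two techniques outside of a workflow tool, here is a rough sketch in plain Python. It is not the exact procedure used above: `predict` is a hypothetical stand-in for a fitted model, and the data, feature names, and helper functions are invented for illustration. Permutation importance is measured as the average drop in accuracy after shuffling one column, and for a binary feature the partial dependence reduces to comparing the average prediction with the feature forced to 1 versus forced to 0.

```python
import random


def predict(row):
    """Hypothetical stand-in for a fitted model: returns P(target = 1)."""
    return 0.1 + 0.7 * row["a"] - 0.05 * row["b"]


def accuracy(rows, target):
    """Share of rows where the thresholded prediction matches the target."""
    hits = sum((predict(r) >= 0.5) == bool(t) for r, t in zip(rows, target))
    return hits / len(rows)


def permutation_importance(rows, target, feature, repeats=20, seed=0):
    """Mean drop in accuracy after shuffling one feature's column."""
    rng = random.Random(seed)
    baseline = accuracy(rows, target)
    drops = []
    for _ in range(repeats):
        column = [r[feature] for r in rows]
        rng.shuffle(column)
        shuffled = [dict(r, **{feature: v}) for r, v in zip(rows, column)]
        drops.append(baseline - accuracy(shuffled, target))
    return sum(drops) / repeats


def partial_dependence(rows, feature, level):
    """Average prediction with the feature forced to one level in all rows."""
    return sum(predict(dict(r, **{feature: level})) for r in rows) / len(rows)


# Hypothetical data: the toy target depends on "a" only.
rows = [{"a": i % 2, "b": (i // 2) % 2} for i in range(40)]
target = [r["a"] for r in rows]

results = {}
for f in ("a", "b"):
    importance = permutation_importance(rows, target, f)
    effect = partial_dependence(rows, f, 1) - partial_dependence(rows, f, 0)
    results[f] = (importance, effect)
    print(f, round(importance, 3), round(effect, 3))
```

The importance column ranks the features, and the sign of the effect column shows the direction: a positive value means the 1 level raises the predicted probability, a negative value means it lowers it.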
