Variable Importance for H2O Gradient Boosted Trees


I’m using an H2O Gradient Boosted Trees model to predict a binary dummy target variable.
My independent variables are binary dummies as well.
I want to use the variable importance output as a tool for model interpretation.

  1. Is that a valid approach?
  2. Can I tell whether an independent variable is considered important at its 1 level or at its 0 level?

Thank you so much in advance!

The variable importance reported by these nodes is based on whether a feature was selected for a split and on how much that split decreased the overall error. The higher the value, the more important the variable. Note that this measure has no direction: it does not tell you whether the 1 level or the 0 level drives the prediction. For further reference, please see the H2O documentation.
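To make the split-based idea concrete, here is a minimal sketch in plain Python. It assumes squared error as the error measure and looks at a single split on a binary feature. The names `squared_error`, `split_gain`, `x_strong` and `x_weak` are made up for this illustration and are not part of H2O or KNIME. A real model sums such gains over every split that uses the feature, across all trees.

```python
def squared_error(values):
    """Sum of squared deviations from the mean (0.0 for an empty node)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def split_gain(feature, target):
    """Error decrease from splitting the node on a binary (0/1) feature."""
    left = [t for f, t in zip(feature, target) if f == 0]
    right = [t for f, t in zip(feature, target) if f == 1]
    return squared_error(target) - squared_error(left) - squared_error(right)


# Toy data (hypothetical): x_strong matches the target exactly, x_weak only partly.
target = [1, 1, 1, 0, 0, 0, 1, 0]
x_strong = [1, 1, 1, 0, 0, 0, 1, 0]
x_weak = [1, 0, 1, 0, 1, 0, 1, 0]

gain_strong = split_gain(x_strong, target)
gain_weak = split_gain(x_weak, target)

# Recoding the dummy (swapping 0 and 1) leaves the gain unchanged.
x_weak_flipped = [1 - f for f in x_weak]
gain_weak_flipped = split_gain(x_weak_flipped, target)

print(gain_strong, gain_weak, gain_weak_flipped)
```

The flipped feature gets the same gain as the original one, which is why this kind of importance cannot say whether the 1 level or the 0 level matters.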


Thank you so much Marten.
I solved my problem by switching to the non-H2O Gradient Boosted Trees node, implementing permutation feature importance to extract the most important variables, and using a partial dependence plot to identify the direction of the relationship.
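For anyone who wants to try the same idea outside KNIME, here is a rough sketch in Python. It is not the workflow I built. It assumes scikit-learn instead of the KNIME nodes, and it uses synthetic binary data in which feature 0 raises the chance of the target, feature 1 lowers it, and features 2 and 3 are noise. The helper `direction` is a made-up name for this example.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import partial_dependence, permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic binary dummies (assumption: 4 features, 2000 rows).
rng = np.random.default_rng(0)
n = 2000
X = rng.integers(0, 2, size=(n, 4)).astype(float)
logit = 2.0 * X[:, 0] - 2.0 * X[:, 1]  # features 2 and 3 have no effect
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)

# Permutation importance: drop in test score when one column is shuffled.
perm = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
ranking = np.argsort(perm.importances_mean)[::-1]  # most important first


def direction(feature):
    """Change in the average model response when the dummy goes from 0 to 1.

    The response is a probability or a raw score, depending on the method
    scikit-learn picks; only the sign is used here.
    """
    result = partial_dependence(model, X_test, [feature], kind="average")
    avg = result["average"][0]  # one value per grid point, here 0 and 1
    return avg[-1] - avg[0]


for f in ranking:
    print(f, round(perm.importances_mean[f], 3), "+" if direction(f) > 0 else "-")
```

The permutation importance gives the ranking, and the sign of the partial dependence difference shows whether the 1 level pushes the prediction up or down.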
