Gradient Boosted Trees Predictor Regression)

Slightly off-topic, but I wanted to point out that the workflow you show is, in my opinion, flawed anyway. A single partitioning is simply not going to give a result that generalizes. Put differently, the difference in model performance will most likely be affected more by the choice of observations (i.e. the partitioning) than by any single feature.
With this setup you're optimizing for this one specific partition (assuming the seed is fixed; if the seed isn't fixed, then comparing the scores is completely apples to oranges).
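
To illustrate the point about partitions, here is a minimal sketch in plain Python (not the KNIME workflow itself). It scores the same model on the same data and changes only the partition seed. Everything in it is an assumption made for illustration: the synthetic data, the helper names `make_data`, `fit_line` and `holdout_rmse`, and the single-feature least-squares line that stands in for the Gradient Boosted Trees learner to keep the example dependency-free.

```python
import random
import statistics

def make_data(n=200, seed=0):
    """Synthetic regression data (assumed): y depends on x0, x1 is pure noise."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x0, x1 = rng.gauss(0, 1), rng.gauss(0, 1)
        rows.append(((x0, x1), 2.0 * x0 + rng.gauss(0, 1)))
    return rows

def fit_line(train, j):
    """Least-squares fit of y = a + b * x_j (a stand-in for the real learner)."""
    xs = [x[j] for x, _ in train]
    ys = [y for _, y in train]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((v - mx) ** 2 for v in xs)
    b = sum((v - mx) * (w - my) for v, w in zip(xs, ys)) / sxx
    return my - b * mx, b

def holdout_rmse(rows, j, seed, test_frac=0.3):
    """RMSE on one random train/test partition determined by `seed`."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_frac)
    test, train = shuffled[:n_test], shuffled[n_test:]
    a, b = fit_line(train, j)
    mse = statistics.fmean((y - (a + b * x[j])) ** 2 for x, y in test)
    return mse ** 0.5

rows = make_data()
# Same data, same model, same feature: only the partition seed changes.
scores = [holdout_rmse(rows, 0, seed) for seed in range(20)]
spread = max(scores) - min(scores)
print(f"RMSE min={min(scores):.3f} max={max(scores):.3f} spread={spread:.3f}")
```

If the spread across seeds is of the same order as the gain you see from adding or dropping a feature, a single partition cannot tell the two apart.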

So to actually make this meaningful, IMHO you would need to cross-validate each feature combination, which only makes it more computationally expensive. Hence I'm not really a fan of feature selection this way, especially for trees. Trees should be relatively immune to unimportant features. For feature selection I would simply go with the linear correlation filter and the low variance filter.
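
As a rough sketch of what cross-validating each feature combination could look like, the plain-Python example below scores every feature subset with k-fold cross-validation and reports mean and standard deviation across folds. It is an illustration under stated assumptions, not the KNIME nodes: the synthetic data, the helper names `make_data`, `knn_predict` and `kfold_rmse`, and the small k-nearest-neighbour regressor used in place of Gradient Boosted Trees are all made up for this example.

```python
import itertools
import random
import statistics

def make_data(n=120, seed=0):
    """Synthetic data (assumed): y uses x[0] and x[1]; x[2] is pure noise."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = [rng.gauss(0, 1) for _ in range(3)]
        rows.append((x, 2.0 * x[0] - 1.0 * x[1] + rng.gauss(0, 0.5)))
    return rows

def knn_predict(train, x, cols, k=5):
    """Mean target of the k nearest training rows, using only features in `cols`."""
    def dist(row):
        return sum((row[0][c] - x[c]) ** 2 for c in cols)
    nearest = sorted(train, key=dist)[:k]
    return statistics.fmean(y for _, y in nearest)

def kfold_rmse(rows, cols, k_folds=5, seed=0):
    """Mean and standard deviation of RMSE over k folds for one feature subset."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    fold_scores = []
    for f in range(k_folds):
        test_idx = set(idx[f::k_folds])  # every k-th shuffled index forms a fold
        train = [rows[i] for i in idx if i not in test_idx]
        test = [rows[i] for i in test_idx]
        mse = statistics.fmean(
            (y - knn_predict(train, x, cols)) ** 2 for x, y in test
        )
        fold_scores.append(mse ** 0.5)
    return statistics.fmean(fold_scores), statistics.stdev(fold_scores)

rows = make_data()
results = {}
# Every non-empty feature subset gets its own full cross-validation run.
for size in (1, 2, 3):
    for cols in itertools.combinations(range(3), size):
        results[cols] = kfold_rmse(rows, cols)

best = min(results, key=lambda c: results[c][0])
for cols, (mean, sd) in sorted(results.items(), key=lambda kv: kv[1][0]):
    print(cols, f"rmse={mean:.3f} +/- {sd:.3f}")
```

Note the cost: with p features there are 2^p - 1 non-empty subsets, each trained k times, which is why this gets expensive quickly. The per-fold standard deviation also gives you a yardstick for whether a difference between two subsets means anything.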
