Overfitting control in Gradient Boost Regression Model


I have a gradient boosting regression model and I would like to learn how I can control overfitting for it.

Best Regards,

Hello @gokhan_sir,

I assume you are using the Gradient Boosted Trees Learner (Regression) node?
This node essentially provides you with three parameters that can help to control overfitting:

  • Tree depth: The deeper a tree, the more it overfits the training data, but a tree that is too shallow might not allow for enough feature interactions.
  • Number of models: This is the number of trees to learn. Generally, more models mean more overfitting.
  • Learning rate: The learning rate defines the influence a single tree has on the overall prediction and a lower learning rate can be used to counter overfitting.
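
To make the role of the three parameters concrete, here is the textbook form of gradient boosting for squared error. This is a sketch of the general technique, not necessarily the exact procedure inside the KNIME node; the symbols \nu (learning rate), M (number of models) and d (tree depth) are notation introduced here for illustration.

```latex
\begin{aligned}
F_0(x) &= \bar{y} \\
r_i^{(m)} &= y_i - F_{m-1}(x_i)
  && \text{residuals of the current ensemble} \\
h_m &= \text{regression tree of depth} \le d \text{ fitted to } \{(x_i, r_i^{(m)})\} \\
F_m(x) &= F_{m-1}(x) + \nu \, h_m(x), \qquad m = 1, \dots, M
\end{aligned}
```

A smaller \nu shrinks each tree's contribution, a smaller M stops the sum earlier, and a smaller d limits how complex each h_m can be.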

My recommendation is to keep the tree depth in the interval 3 to 5 and play with different configurations of learning rate and number of models. Note, however, that there are no general best settings (otherwise, they would be our defaults :wink: ) and you will have to try and see what works for your data.
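
A minimal sketch of what "playing with" learning rate and number of models can look like, written in plain Python so it runs without any library. Everything in it is an assumption for illustration: the synthetic one-feature data set, the helper names `fit_tree`, `r2` and `staged_r2`, and the toy tree learner, which is not the KNIME implementation. The point is only the pattern: fix the tree depth, then compare R^2 on held-out data for several configurations.

```python
import math
import random

def fit_tree(xs, ys, depth):
    """Fit a regression tree on a single feature; return a predict function."""
    mean = sum(ys) / len(ys)
    if depth == 0 or len(xs) < 2:
        return lambda x: mean
    pairs = sorted(zip(xs, ys))
    n, total = len(pairs), sum(ys)
    best_score, best_i, left_sum = None, None, 0.0
    for i in range(1, n):
        left_sum += pairs[i - 1][1]
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # cannot split between identical feature values
        # Maximizing this score minimizes the squared error of the split.
        score = left_sum ** 2 / i + (total - left_sum) ** 2 / (n - i)
        if best_score is None or score > best_score:
            best_score, best_i = score, i
    if best_i is None:
        return lambda x: mean
    threshold = (pairs[best_i - 1][0] + pairs[best_i][0]) / 2
    left = fit_tree([x for x, _ in pairs[:best_i]],
                    [y for _, y in pairs[:best_i]], depth - 1)
    right = fit_tree([x for x, _ in pairs[best_i:]],
                     [y for _, y in pairs[best_i:]], depth - 1)
    return lambda x: left(x) if x < threshold else right(x)

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def staged_r2(train, valid, depth, learning_rate, checkpoints):
    """Boost on train; record validation R^2 at each checkpoint tree count."""
    xs, ys = train
    vx, vy = valid
    base = sum(ys) / len(ys)
    pred_train = [base] * len(xs)
    pred_valid = [base] * len(vx)
    scores = {}
    for m in range(1, max(checkpoints) + 1):
        # For squared error, each new tree is fitted to the current residuals.
        residuals = [y - p for y, p in zip(ys, pred_train)]
        tree = fit_tree(xs, residuals, depth)
        pred_train = [p + learning_rate * tree(x) for p, x in zip(pred_train, xs)]
        pred_valid = [p + learning_rate * tree(x) for p, x in zip(pred_valid, vx)]
        if m in checkpoints:
            scores[m] = r2(vy, pred_valid)
    return scores

# Hypothetical data: a noisy sine curve, split into training and validation sets.
rng = random.Random(0)

def make_data(n):
    xs = [rng.uniform(0, 6) for _ in range(n)]
    ys = [math.sin(x) + rng.gauss(0, 0.3) for x in xs]
    return xs, ys

train, valid = make_data(200), make_data(200)

results = {}
for lr in (0.05, 0.1, 0.5):
    results[lr] = staged_r2(train, valid, depth=3, learning_rate=lr,
                            checkpoints=(10, 100, 300))
    for n_models, score in results[lr].items():
        print(f"learning rate {lr:<4}  models {n_models:<3}  validation R^2 {score:.3f}")
```

In KNIME the same comparison can be done by re-running the learner with different settings and scoring on a partition that was not used for training. The configuration to prefer is the one with the best held-out score, not the best training score.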




Hi Adrian,

Thank you for your reply.
Tree depth: I am using a tree depth of 4, as you recommended :slight_smile:
Number of models: The default setting is 100. Is that OK, or can I use a different number? With 100 models our R^2 is 0.913; with 1000 models our R^2 is 0.93, and we also have some root mean squared error.
Learning rate: I am also using the default value (0.1).

When I varied the learning rate between 0.05 and 0.5, R^2 changed between 0.94 and 0.90. Does this mean our data is normal? :slight_smile:

I am new to these machine learning algorithms :slight_smile: . I apologize if my questions are very simple or dumb :slight_smile:

Best Regards,

Hello Gökhan,

I can’t tell you whether your data is normal, as I am not sure how "normal" would be defined in this case.
However, a larger R^2 means that your model explains more of the variation in the data, so a smaller learning rate seems to benefit your model, perhaps because it allows the algorithm to make finer distinctions.
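
For reference, here is a small sketch of how R^2 is computed. The function name `r_squared` and the numbers are made up for illustration. When the question is overfitting, it usually helps to compute this on data the model was not trained on, since R^2 on the training data tends to keep rising as more trees are added.

```python
def r_squared(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot: share of the variation explained by the model."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical target values and two sets of predictions.
y_true = [1.0, 2.0, 3.0, 4.0]
close_predictions = [1.1, 1.9, 3.2, 3.8]   # small errors
mean_predictions = [2.5, 2.5, 2.5, 2.5]    # always predicts the mean

score_close = r_squared(y_true, close_predictions)
score_mean = r_squared(y_true, mean_predictions)
print(score_close, score_mean)
```

A model that always predicts the mean gets an R^2 of 0, and predictions close to the true values push it towards 1.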

Kind regards,