The start node in this case will allow you to select the combinations of parameters you want to optimize, along with a choice of search strategies (e.g. brute force, Bayesian optimization, and others). In the loop end you choose which metric you want to optimize (e.g. maximize accuracy, minimize error).
There is a video on our Youtube channel that explains this process in more detail:
Somewhat off topic, but that video should be taken with a huge grain of salt. If you want to do parameter optimization, you should do it with cross-validation, not a single train-test split.
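KNIME-node details aside, here is a minimal scikit-learn sketch of the same idea: each parameter combination is scored by k-fold cross-validation rather than one train/test split, so the chosen settings don't just fit the quirks of a single holdout. The dataset and parameter grid below are placeholders:

```python
# Cross-validated parameter search (illustrative data and grid).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Each candidate in param_grid is scored by 5-fold CV, not a single split.
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [50, 100], "max_depth": [3, None]},
    cv=5,
    scoring="accuracy",
)
search.fit(X, y)
print(search.best_params_)
```

The CV score averages over five different holdouts, which is what makes the comparison between parameter combinations less noisy than a single split.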
Thanks a lot, it was really useful. But my data is a time series, so for training I will select about the first 80% of the data, and the remaining 20% at the end will be used for testing.
One cannot shuffle the data to build a second tree, as that disturbs the sequence.
Also, rather than forecasting, what I am interested in is which variables are important, plus some type of sensitivity analysis.
With a simple CART tree, the forecast accuracy is not good.
Is XGBoost a correct model for multivariate time series analysis, apart from multivariate ARIMA?
Since you have time series data, you could use linear sampling instead of random sampling. Another option would be to bypass CV and just use the regular Partitioning node with the Take from top mode.
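For readers outside KNIME, both options above have direct equivalents in Python: "Take from top" is just a chronological slice, and linear sampling corresponds to order-preserving CV folds such as scikit-learn's `TimeSeriesSplit`. The array below is a stand-in for your series:

```python
# Time-series-safe splitting: no shuffling, order is preserved.
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

data = np.arange(100)                 # stand-in for a time-ordered series
cut = int(len(data) * 0.8)
train, test = data[:cut], data[cut:]  # chronological 80/20 ("Take from top")

# CV that respects time order: each test fold comes strictly
# after the observations it was trained on.
tscv = TimeSeriesSplit(n_splits=5)
splits = list(tscv.split(data))
```

Either way, the test observations always come after the training observations, which is what the shuffled default sampling would violate.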
If your primary interest is in interpretability, you may want to take a look at SHAP or LIME. Here’s a workflow that demonstrates those. It’s set up to use a Random Forest, but you could switch it to XGBoost without too much trouble:
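SHAP and LIME themselves need their own packages (`shap`, `lime`); as a dependency-free sketch of the "which variable is important" question, permutation importance in scikit-learn ranks features by how much shuffling each one degrades the model, which is a coarser but related idea. Data and model below are illustrative:

```python
# Permutation importance: a simple, model-agnostic importance measure.
# (SHAP/LIME give richer per-prediction attributions; this is a lighter sketch.)
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

X, y = make_regression(n_samples=200, n_features=5,
                       n_informative=2, random_state=0)
model = RandomForestRegressor(random_state=0).fit(X, y)

result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
ranking = result.importances_mean.argsort()[::-1]
print(ranking)  # feature indices, most to least important
```

Swapping the `RandomForestRegressor` for an XGBoost model works the same way, since `permutation_importance` only needs a fitted estimator with a `score` method.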