Solutions to “Just KNIME It!” Challenge 25 - Season 2

Hi KNIMErs :slight_smile:

This week I split the data into training (70%) and test (30%) sets and used the training set to perform 5-fold cross-validation with the -X-Partitioner- node. Inside the cross-validation loop, I used the -Parameter Optimization Loop- nodes to determine the optimal parameters (Number of Boosting Rounds, Max Tree Depth, and Learning Rate) for the -XGBoost Tree Ensemble Learner- node, with the objective function set to the F-Measure.
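For anyone who prefers a scripted view of this step, here is a rough Python sketch of the same idea (not the workflow itself): a parameter search scored by 5-fold cross-validated F-Measure. Note that the nesting is inverted relative to my workflow (here the search wraps the cross-validation, whereas the workflow runs a Parameter Optimization Loop inside each fold), and the dataset and parameter ranges are only stand-ins for illustration.

```python
# Rough Python analogue of the cross-validated parameter search.
# Dataset and parameter ranges are placeholders, not from the actual challenge data.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split, GridSearchCV, StratifiedKFold
from xgboost import XGBClassifier

X, y = load_breast_cancer(return_X_y=True)           # stand-in dataset
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=42
)                                                     # 70% / 30% split

param_grid = {
    "n_estimators": [10, 20, 30, 50],                 # Number of Boosting Rounds
    "max_depth": [3, 5, 7],                           # Max Tree Depth
    "learning_rate": [0.1, 0.2, 0.3],                 # Learning Rate
}

search = GridSearchCV(
    XGBClassifier(),
    param_grid,
    scoring="f1",                                     # objective function: F-Measure
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=42),
)
search.fit(X_train, y_train)
print("Best parameters:", search.best_params_)
print("Best CV F-Measure:", round(search.best_score_, 3))
```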

After obtaining the optimal parameters for each cross-validation fold, I selected the best-performing set of parameters, trained a final XGBoost model on the full training set, and used that model to predict the test set.

The average F-Measure for the prediction on the test set was 0.87, using the following parameters:
Number of Boosting Rounds = 21
Max Tree Depth = 3
Learning Rate = 0.2
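
For completeness, a hedged Python sketch of that final step: refitting a single XGBoost model with the parameter values above on the full training set and scoring the F-Measure on the held-out 30%. The dataset and split are again only placeholders, so the score will not match the 0.87 reported here.

```python
# Sketch of the final refit and test-set evaluation, using the parameter
# values reported above. Dataset and split are placeholders for illustration.
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

X, y = load_breast_cancer(return_X_y=True)            # stand-in dataset
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=42
)

final_model = XGBClassifier(
    n_estimators=21,        # Number of Boosting Rounds
    max_depth=3,            # Max Tree Depth
    learning_rate=0.2,      # Learning Rate
)
final_model.fit(X_train, y_train)                      # train on the full training set

y_pred = final_model.predict(X_test)                   # predict the 30% test set
print("Test F-Measure:", round(f1_score(y_test, y_pred), 3))
```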

You can find my workflow on the KNIME Hub here:

Thanks to @aworker for your guidance :slight_smile:

Best wishes
Heather
