Decision Tree Model Optimal settings

Anton6491 · March 22, 2018, 11:36am

Hello guys,
I’m a new KNIME user and I’m trying to build an effective decision tree model on a dataset of 7043 rows. I partitioned the data set (75% for the learning set and 25% for the testing set) and used the default settings for both the learner and the predictor nodes (the only change was increasing the learner node’s min number of records per node to 4). I got an accuracy of 77.4%. Do you have any tips for increasing it without overfitting the model?
Thank you for your help!

nemad · March 22, 2018, 6:15pm

Hello Anton,

I believe you will find this workflow helpful.
It shows how to do parameter optimization in KNIME.
If you are worried about overfitting, you can use cross-validation instead of a simple split into training and test sets.
See this workflow for an example of how to do that.
I’ll leave it to you to figure out how to combine the two to do a cross-validated parameter optimization.
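
For readers who want to see the idea outside of KNIME, here is a minimal plain-Python sketch of a cross-validated parameter optimization. It is not the KNIME workflow itself. The data are synthetic, the tiny one-feature tree is a stand-in for the real learner, and all names (`build_tree`, `k_fold_accuracy`, `min_records`, the candidate values) are made up for illustration. The point is only the structure: for each candidate setting, average the accuracy over k folds, then keep the setting with the best average.

```python
import random

def build_tree(rows, min_records):
    """Grow a tiny one-feature tree from (x, label) pairs, labels 0 or 1.

    A split is accepted only if both children keep at least
    min_records rows (a stand-in for "min number of records per node").
    """
    rows = sorted(rows)
    n = len(rows)
    ones = sum(y for _, y in rows)
    leaf = ("leaf", 1 if 2 * ones >= n else 0)   # majority label
    if ones in (0, n):
        return leaf                              # pure node, stop
    best_err, best_i, left_ones = None, None, 0
    for i in range(1, n):                        # split between rows i-1 and i
        left_ones += rows[i - 1][1]
        if i < min_records or n - i < min_records:
            continue                             # a child would be too small
        if rows[i - 1][0] == rows[i][0]:
            continue                             # cannot split equal values
        right_ones = ones - left_ones
        # misclassified rows if each child predicts its majority label
        err = min(left_ones, i - left_ones) + min(right_ones, n - i - right_ones)
        if best_err is None or err < best_err:
            best_err, best_i = err, i
    if best_i is None:
        return leaf
    threshold = (rows[best_i - 1][0] + rows[best_i][0]) / 2
    return ("node", threshold,
            build_tree(rows[:best_i], min_records),
            build_tree(rows[best_i:], min_records))

def predict(tree, x):
    """Walk down the tree until a leaf is reached."""
    while tree[0] == "node":
        _, threshold, left, right = tree
        tree = left if x <= threshold else right
    return tree[1]

def k_fold_accuracy(rows, min_records, k=5):
    """Mean accuracy over k folds; each fold is the test set exactly once."""
    folds = [rows[i::k] for i in range(k)]
    scores = []
    for i, test in enumerate(folds):
        train = [r for j, fold in enumerate(folds) if j != i for r in fold]
        tree = build_tree(train, min_records)
        correct = sum(predict(tree, x) == y for x, y in test)
        scores.append(correct / len(test))
    return sum(scores) / k

# Synthetic data (an assumption): label is 1 when x > 0.5, with 20% label noise.
rng = random.Random(0)
rows = []
for _ in range(200):
    x = rng.random()
    y = int(x > 0.5)
    if rng.random() < 0.2:
        y = 1 - y
    rows.append((x, y))

# Parameter optimization: score every candidate with cross-validation.
candidates = [1, 2, 4, 8, 16, 32]
results = {m: k_fold_accuracy(rows, m) for m in candidates}
best_min_records = max(results, key=results.get)

for m in candidates:
    print(f"min_records={m:>2}  mean CV accuracy={results[m]:.3f}")
print("best min_records:", best_min_records)
```

Because every candidate is scored on data the tree did not see during training, a setting that only memorizes the training rows should not come out ahead just for that reason. It is still a good idea to keep a final held-out test set for the chosen setting, since the cross-validated score of the winner is itself slightly optimistic.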

If you need more examples, I’d recommend checking out the node guide on the web, or the example server, which you can access directly from within your KNIME Analytics Platform (just log in to EXAMPLES in the KNIME Explorer; you don’t need a password).

Cheers,
nemad

Anton6491 · March 23, 2018, 3:26pm

Hello,

Thank you for your help!