Dear @Haroon_954
To complement the information provided by @Iris and @mlauber71, I have added to your workflow a simple way of doing variable selection based on a Decision Tree (DT). I guess you also need to understand how a DT in particular, and a Random Forest (RF) in general, does variable selection. For this, I believe it is good to start with a single DT, since an RF is made of DTs.
In the workflow I have:
- Taken your data and trained a DT using your data splitting (70% / 30%)
- Extracted the variables from the DT rules using the -Decision Tree to Ruleset- node
- Counted how many times each variable was used by the DT. Usually, people determine DT variable importance from the branch level at which a variable is used: the higher the level (closer to the root), the more important the variable. It turns out that a variable's branch level correlates strongly with its number of occurrences in the DT rule set, since a variable tested near the root is part of many rules. Thus you can also determine variable importance from the number of occurrences in the DT rule set
- Filtered out variables with fewer than 15 occurrences. This threshold is set arbitrarily here, but it could be estimated too. I am not adding threshold estimation, so as to give you a first simple solution that is easy to understand to begin with.
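Outside KNIME, the counting and filtering steps above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the toy tree, its variable names and the helper functions are hypothetical and are not taken from the workflow, and the threshold is 3 instead of 15 because the toy tree is tiny.

```python
from collections import Counter

# Hypothetical toy tree (not the one from the workflow).
# Internal node: (variable, left_subtree, right_subtree); leaf: class label.
toy_tree = (
    "age",
    ("income", "no", ("balance", "no", "yes")),
    ("income", ("duration", "no", "yes"), "yes"),
)

def extract_rules(node, path=()):
    """Return one rule per leaf as (variables tested on the path, label)."""
    if isinstance(node, str):  # a leaf ends the rule
        return [(path, node)]
    variable, left, right = node
    return (extract_rules(left, path + (variable,))
            + extract_rules(right, path + (variable,)))

def count_occurrences(rules):
    """Count how often each variable appears over all rules."""
    counts = Counter()
    for variables, _label in rules:
        counts.update(variables)
    return counts

def select_variables(counts, min_occurrences):
    """Keep variables whose occurrence count reaches the threshold."""
    return sorted(v for v, n in counts.items() if n >= min_occurrences)

rules = extract_rules(toy_tree)
counts = count_occurrences(rules)
selected = select_variables(counts, min_occurrences=3)
```

Here "age" and "income" sit near the root, so they appear in all 6 rules, while "balance" and "duration" appear in only 2 each. This is why the occurrence count follows the branch level.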
Using this variable selection, you end up with the 11 most important variables in terms of rule set occurrence:

In a second workflow, I filter IN only the selected variables, which are then used to train your RF classifier.
From the Scorer results, you can see that using only these 11 variables gives the same performance statistics as using all 20.
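As a rough Python equivalent of this filter-IN step, here is a minimal sketch that keeps only the selected columns plus the target before training. The column names, the sample rows and the function filter_in_columns are hypothetical and only stand in for what the column filter does in the workflow.

```python
def filter_in_columns(rows, selected, target):
    """Keep only the selected variables and the target column in each row."""
    keep = list(selected) + [target]
    missing = [c for c in keep if rows and c not in rows[0]]
    if missing:
        raise KeyError(f"columns not found in data: {missing}")
    return [{c: row[c] for c in keep} for row in rows]

# Hypothetical sample rows, for illustration only.
rows = [
    {"age": 35, "income": 4200, "balance": 150, "label": "yes"},
    {"age": 52, "income": 1800, "balance": 900, "label": "no"},
]

filtered = filter_in_columns(rows, selected=["age", "income"], target="label")
```

The filtered rows are what you would then pass to the RF learner, so the classifier never sees the variables that were dropped.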
The statistics obtained by the RF remain similar before:
and after variable selection:
This approach can be extended to do variable selection using a Random Forest instead of a DT, but my aim here is to give you an example of how to easily achieve variable selection based on a DT, how to extract this information from the tree, and how to reuse it in an RF.
The whole workflow is here below:
20211229 Pikairos Feature Selection for Random Forest Classification.knwf (877.1 KB)
Hope this minimalist example helps you understand variable selection by DT.
Best
Ael




