Hi, I’m trying to explain feature importance from a Random Forest classification. I haven’t found any useful viz nodes, or anything else that would help me explain the relative importance of the features to laymen. Any clues on this?
Also, I’d like to export the “average” or median tree from the random forest, or the ruleset arising from the random forest classification. Is this possible?
See this thread for an idea of how to calculate feature importance.
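To make the idea concrete outside of KNIME, here is a minimal sketch of permutation importance, one common model-agnostic way to score features. It is not necessarily the method from the linked thread. The idea: shuffle one feature column at a time and measure how much the accuracy drops. `toy_predict` is a hypothetical stand-in for a trained model, and the data is randomly generated for illustration only.

```python
import random


def accuracy(predict, rows, labels):
    """Fraction of rows for which predict(row) equals the true label."""
    hits = sum(1 for row, label in zip(rows, labels) if predict(row) == label)
    return hits / len(rows)


def permutation_importance(predict, rows, labels, n_repeats=10, seed=0):
    """Mean drop in accuracy per feature when that feature's column is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    importances = []
    for col in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle only this column; all other columns stay untouched.
            shuffled = [row[col] for row in rows]
            rng.shuffle(shuffled)
            permuted = [row[:col] + [value] + row[col + 1:]
                        for row, value in zip(rows, shuffled)]
            drops.append(baseline - accuracy(predict, permuted, labels))
        importances.append(sum(drops) / n_repeats)
    return importances


# Hypothetical stand-in for a trained classifier: it only looks at feature 0.
def toy_predict(row):
    return 1 if row[0] > 0.5 else 0


# Synthetic data, for illustration only.
data_rng = random.Random(42)
rows = [[data_rng.random(), data_rng.random()] for _ in range(200)]
labels = [toy_predict(row) for row in rows]

importances = permutation_importance(toy_predict, rows, labels)
print(importances)
```

Feature 1 is never used by the toy model, so shuffling it costs no accuracy and its importance comes out as zero, while feature 0 gets a clearly positive score.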
The visualization is really up to you. You can, for example, use a scatter plot with the features on the x-axis and the importance on the y-axis, or write your own view with the Generic JavaScript View if the options KNIME provides don’t suit your needs.
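For a lay audience, a ranked bar chart showing each feature's share of the total importance is often easier to read than raw numbers. Here is a minimal sketch of that idea as plain text output. The function name `importance_bars` and the importance values are hypothetical; in practice you would feed in whatever importance scores you calculated.

```python
def importance_bars(importances, width=40):
    """Return one text bar per feature, most important first.

    Each bar shows the feature's share of the total importance.
    """
    total = sum(importances.values())
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    longest = max(len(name) for name in importances)
    lines = []
    for name, value in ranked:
        share = value / total if total else 0.0
        bar = "#" * round(share * width)
        lines.append(f"{name.ljust(longest)}  {bar} {share:.0%}")
    return lines


# Hypothetical importance values, for illustration only.
example = {"age": 0.10, "income": 0.25, "tenure": 0.15}
chart = importance_bars(example)
print("\n".join(chart))
```

Normalizing to percentages lets you say things like "income accounts for about half of what the model relies on", which tends to land better with non-technical readers than an unscaled score.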
What do you mean by average or median tree?
Ideally, the trees in a random forest are extremely diverse, and as far as I know there is no sensible way to calculate an average tree.
You can extract all the trees from a random forest with the Tree Ensemble Model Extract node and then apply the Decision Tree to Ruleset node, but I doubt you will gain much insight from the gigantic set of rules that results.
Random forests are powerful classifiers, but I would be careful about drawing conclusions from the rules they provide, because the individual trees are built with a large amount of randomization.
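To illustrate why the ruleset gets so large, here is a minimal sketch of how a single decision tree turns into rules: every path from the root to a leaf becomes one rule. This is not how the KNIME nodes are implemented; the nested-dict tree format, `tree_to_rules`, and the toy tree are all hypothetical and only there to show the principle.

```python
def tree_to_rules(node, conditions=()):
    """Return one 'IF ... THEN ...' rule per leaf of a binary decision tree."""
    if "label" in node:  # leaf node: emit the rule for the path taken so far
        premise = " AND ".join(conditions) if conditions else "TRUE"
        return [f"IF {premise} THEN class = {node['label']}"]
    test = f"{node['feature']} <= {node['threshold']}"
    negated = f"{node['feature']} > {node['threshold']}"
    return (tree_to_rules(node["left"], conditions + (test,))
            + tree_to_rules(node["right"], conditions + (negated,)))


# Hypothetical toy tree, for illustration only.
toy_tree = {
    "feature": "income", "threshold": 50000,
    "left": {"label": "no"},
    "right": {
        "feature": "age", "threshold": 30,
        "left": {"label": "no"},
        "right": {"label": "yes"},
    },
}

rules = tree_to_rules(toy_tree)
for rule in rules:
    print(rule)
```

Since each tree contributes one rule per leaf, a forest of, say, 100 trees with a few hundred leaves each already yields tens of thousands of rules, and rules from different trees can contradict each other on the same row.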