Feature importance is difficult to obtain from KNIME's tree nodes. For global feature importance, Random Forest and Tree Ensemble provide split information for only the first 3 levels, so if the trees are deeper, the split information for the lower levels is lost (a frustration already expressed by @aconca at Tree Ensemble Learner - variable importance?). For H2O GBT, feature importance is provided as a single score, but the KNIME GBT nodes provide no indicators of feature importance at all. There is also no means to extract the trees from anything but the Tree Ensemble, which would allow such metrics to be computed externally. Local feature importance has no support from any of the nodes. Being unable to explain and investigate models reduces the utility of these nodes. Are there ways to derive feature importance metrics that I am unaware of?
Hi @bfrutchey take a look for inspiration and some directions
One old trick back in the day was to take the model's scores and feed them back as the target into a simple decision tree (together with the original variables) and see what the top branches are. Might be worth a try besides the hints by @bfrutchey
Thanks @HansS those are really great examples for generating local explanations! Definitely makes me want to upgrade to version 4 of KNIME soon (we are stuck in 3.7 for a bit) to use the new nodes.
If I am patient and get the Shapley values for every prediction I could even average them to get global model explanations. That does seem like a waste of processing though, as the model learners generally can expose the global feature importance immediately after training.
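A rough sketch of that averaging step, assuming the per-prediction Shapley values have already been exported as one record per row (the `rows` data and the `global_importance` helper below are hypothetical, not output from any KNIME node). One detail worth noting: averaging the absolute values is usually the safer choice, because positive and negative contributions would otherwise cancel each other out.

```python
from collections import defaultdict


def global_importance(local_explanations):
    """Mean absolute Shapley value per feature across all explained rows."""
    totals = defaultdict(float)
    for row in local_explanations:
        for feature, value in row.items():
            totals[feature] += abs(value)
    n = len(local_explanations)
    return {f: t / n for f, t in totals.items()} if n else {}


# Hypothetical per-row Shapley values (feature -> contribution to that prediction).
rows = [
    {"age": 0.30, "income": -0.10, "tenure": 0.00},
    {"age": -0.20, "income": 0.40, "tenure": 0.05},
    {"age": 0.10, "income": -0.30, "tenure": -0.05},
]

importance = global_importance(rows)
ranking = sorted(importance.items(), key=lambda kv: kv[1], reverse=True)
for feature, score in ranking:
    print(f"{feature}: {score:.3f}")
```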
I did discover that I could extract all the trees from the Tree Ensemble node, then parse them to get their splits and create a global feature importance score. This only works for the Tree Ensemble, though, not the Gradient Boosting nodes… which I would prefer for my use case. I am attaching a mini-workflow which shows how I extract and parse the trees.

feature_influence_demo.knwf (16.8 KB)
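For readers without the workflow at hand, the general idea of scoring features from parsed trees can be sketched as below. This is not the logic of the attached workflow: the nested-dict tree format, the `count_splits` and `split_importance` helpers, and the weighting of a split by `1 / 2**level` are all assumptions chosen for illustration. Unlike the 3-level statistics from the learner nodes, the walk covers every level of each tree.

```python
from collections import Counter


def count_splits(node, level=0, counts=None):
    """Count how often each feature is used as a split at each tree level."""
    if counts is None:
        counts = Counter()
    feature = node.get("split")  # leaves have no "split" key
    if feature is not None:
        counts[(feature, level)] += 1
        for child in node.get("children", []):
            count_splits(child, level + 1, counts)
    return counts


def split_importance(trees):
    """Aggregate split counts over all trees; splits near the root weigh more."""
    counts = Counter()
    for tree in trees:
        count_splits(tree, 0, counts)
    scores = Counter()
    for (feature, level), n in counts.items():
        scores[feature] += n / 2 ** level  # assumed weighting, not a KNIME formula
    return counts, scores


# Hypothetical parsed trees: each node names its split feature and lists its children.
trees = [
    {"split": "age", "children": [{"split": "income", "children": [{}, {}]}, {}]},
    {"split": "income", "children": [
        {},
        {"split": "age", "children": [{}, {"split": "tenure", "children": [{}, {}]}]},
    ]},
    {"split": "age", "children": [{}, {}]},
]

counts, scores = split_importance(trees)
for feature, score in scores.most_common():
    print(feature, score)
```

Other weightings (plain counts, or counts normalized by how often a feature was a candidate) would slot into the same loop.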
@HansS that is. Did not scroll correctly…