I was applying a Decision Tree model (learner and predictor) to perform a turnover prediction for employees in a company. After looking at the outputs, which include a probability for each individual to leave/stay, we asked ourselves whether it is possible to identify the “driving factors” per individual case. With this information, we would be able to better understand which actions could be taken (on an individual level) to mitigate certain probabilities.
Any ideas how this can be achieved in KNIME?
Thanks and best regards,
Welcome to the KNIME community.
There are different approaches to interpret/explain model predictions:
- Classical approach. A decision tree is a simple model whose predictions can be explained by inspecting the tree structure. In particular, the top-level split uses the most important feature, the one that best separates the classes in a classification problem. The Decision Tree Learner node has a view that visualises the structure of the trained model. See https://kni.me/w/gxrPOF2R8QCJCBk0 for an example.
- Modern state-of-the-art approaches. Starting with KNIME 4.0 we provide dedicated nodes to calculate SHAP, Shapley and LIME values. These approaches are model-agnostic, i.e. they work not only for a decision tree model, but also for linear and deep learning models. See https://kni.me/w/iuy-zmbLbhkAO3KU for a good example using all three methods to explain the predictions of a (random forest) model.
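To illustrate the classical approach outside of KNIME: the idea of reading the “driving factors” for one employee off the tree is just tracing that employee's path from the root to a leaf and collecting the tests along the way. A minimal sketch in plain Python, with an entirely hypothetical toy tree and feature names:

```python
# Sketch: explaining one prediction by tracing its path through a
# (hypothetical) decision tree for employee turnover. Each internal
# node tests one feature; the features tested along the path are the
# "driving factors" for that individual.

# A toy tree as nested dicts; leaves carry P(leave).
tree = {
    "feature": "overtime_hours", "threshold": 10,
    "left": {"feature": "salary", "threshold": 50000,
             "left": {"leaf": 0.60},
             "right": {"leaf": 0.15}},
    "right": {"feature": "tenure_years", "threshold": 2,
              "left": {"leaf": 0.80},
              "right": {"leaf": 0.35}},
}

def explain(node, employee, path=None):
    """Return (P(leave), list of decisions taken) for one employee."""
    path = path if path is not None else []
    if "leaf" in node:
        return node["leaf"], path
    value = employee[node["feature"]]
    went_left = value <= node["threshold"]
    path.append((node["feature"], value, node["threshold"],
                 "<=" if went_left else ">"))
    return explain(node["left"] if went_left else node["right"],
                   employee, path)

employee = {"overtime_hours": 15, "tenure_years": 1, "salary": 45000}
prob, factors = explain(tree, employee)
print(prob)     # 0.8 for this toy tree and employee
for f in factors:
    print(f)    # e.g. ('overtime_hours', 15, 10, '>')
```

As the questioner notes below, walking the tree per individual does not scale well by hand, which is exactly what makes the model-agnostic methods attractive.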
Thanks for the prompt answer, really appreciate it!
Do you know, when KNIME 4.0 will be released? Do I understand correctly, that we can only apply your linked example to a Random Forest model, not a classic Decision Tree model?
As we are dealing with large data-sets, I’d like to avoid the classical approach of going down the tree for each individual.
KNIME 4.0 is the most recent publicly available release. You can download it from https://www.knime.com/downloads. As of this moment the latest release is 4.0.1, and 4.0.2 is coming out soon.
As for the usage of the explanation methods, those are fully model-agnostic, i.e. they can provide explanations for the predictions of any model. You can replace the Learner and Predictor nodes with the corresponding nodes for Decision Tree. Note that the Predictor nodes are hidden in the ML Interpretability metanode. I have created a simple example to explain predictions of a Decision Tree using SHAP (which is in most cases the best speed/reliability compromise option):
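Independent of the workflow, the underlying Shapley idea can be sketched in a few lines of plain Python: a feature's Shapley value is its marginal contribution to the prediction, averaged over all orders in which features are switched from a baseline to the instance being explained. The model and feature names below are purely hypothetical stand-ins for any black-box predict function (the KNIME nodes estimate this by sampling instead of enumerating all orderings):

```python
# Sketch of exact Shapley values for one prediction. The toy model
# and baseline are hypothetical; any predict(x) works.
from itertools import permutations

def predict(x):
    # Toy turnover model: two additive effects plus an interaction.
    return (0.1 + 0.4 * x["overtime"] + 0.2 * x["low_salary"]
            + 0.2 * x["overtime"] * x["low_salary"])

def shapley(predict, instance, baseline):
    """Average marginal contribution of each feature over all
    orderings, moving from the baseline to the instance."""
    features = list(instance)
    phi = {f: 0.0 for f in features}
    orders = list(permutations(features))
    for order in orders:
        x = dict(baseline)            # start from the reference employee
        prev = predict(x)
        for f in order:               # switch features on one by one
            x[f] = instance[f]
            cur = predict(x)
            phi[f] += cur - prev
            prev = cur
    return {f: v / len(orders) for f, v in phi.items()}

instance = {"overtime": 1, "low_salary": 1}   # employee to explain
baseline = {"overtime": 0, "low_salary": 0}   # reference employee
phi = shapley(predict, instance, baseline)
print(phi)
# The values sum to predict(instance) - predict(baseline),
# so they distribute the prediction gap over the driving factors.
```

The per-feature values are exactly the per-individual “driving factors” asked about above: a large positive value means that feature pushed this employee's leave probability up relative to the baseline.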