Decision Tree predicts one outcome more than the others?

This isn’t really a KNIME issue; it’s a decision tree issue. A decision tree basically finds the variable that best separates the outcome classes, splits on that variable, then goes into each resulting group and repeats the process. Decision trees are good for getting a peek at which variables might be most strongly associated with the outcome, and at the interactions between them, but they really aren’t good for much beyond that on small data sets. Random forest or XGBoost may be better suited to your needs.
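To make the "split, then repeat within each group" idea concrete, here is a rough sketch in plain Python. It is not what KNIME's Decision Tree Learner does internally; it assumes Gini impurity as the split measure and numeric features only. The toy data, the column meanings (hours studied, classes missed), and the function names are all made up for illustration.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means all one class)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split(rows, labels):
    """Return (feature_index, threshold, weighted_gini) for the best binary split.

    Rows with value <= threshold go left, the rest go right.
    Returns None if no threshold separates the rows into two non-empty groups.
    """
    best = None
    n = len(rows)
    for f in range(len(rows[0])):
        for threshold in sorted({row[f] for row in rows}):
            left = [lab for row, lab in zip(rows, labels) if row[f] <= threshold]
            right = [lab for row, lab in zip(rows, labels) if row[f] > threshold]
            if not left or not right:
                continue
            # Size-weighted impurity of the two groups; lower is better.
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (f, threshold, score)
    return best

def build_tree(rows, labels, depth=0, max_depth=3):
    """Split recursively until a group is pure, unsplittable, or max_depth is hit."""
    split = None
    if depth < max_depth and gini(labels) > 0:
        split = best_split(rows, labels)
    if split is None:
        # Leaf: predict the most common label in this group.
        return Counter(labels).most_common(1)[0][0]
    f, threshold, _ = split
    left_idx = [i for i, row in enumerate(rows) if row[f] <= threshold]
    right_idx = [i for i, row in enumerate(rows) if row[f] > threshold]
    return {
        "feature": f,
        "threshold": threshold,
        "left": build_tree([rows[i] for i in left_idx],
                           [labels[i] for i in left_idx], depth + 1, max_depth),
        "right": build_tree([rows[i] for i in right_idx],
                            [labels[i] for i in right_idx], depth + 1, max_depth),
    }

def predict(tree, row):
    """Walk from the root to a leaf and return the leaf's label."""
    while isinstance(tree, dict):
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree

# Made-up toy data: [hours_studied, classes_missed] -> grade band
rows = [[1, 8], [2, 6], [3, 7], [6, 2], [7, 1], [8, 3]]
labels = ["fail", "fail", "fail", "pass", "pass", "pass"]
tree = build_tree(rows, labels)
```

With only a handful of rows, each split is chosen from very little evidence, which is part of why a single tree tends to be unstable on small data sets.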

The bigger problem is that there are a lot of variables that are highly influential on grades which you couldn’t possibly have in your data set, so you’re only going to be able to speak to a few trends that you can observe.
