I am brand new to analytics in KNIME. Members of my team have previously looked at predicting an outcome based on attendance. We know that the “NEET” outcome becomes more likely as the absence rate gets closer to 1, but I am having trouble replicating that in KNIME. This is more of a training exercise for me at the moment, but I hope eventually to feed in various kinds of data to predict the outcome a person is likely to have next year.
It’s a top-heavy dataset in that (as above) the higher the absence rate, the more likely “NEET” is. However, when I run the prediction I get a lot of 0s predicted as “NEET” and a few 1s predicted as “EET”! I am sure this has something to do with my poor understanding of how decision trees work, but is there any advice anyone can give?
A good example: given that I am (currently) only feeding raw numbers into it, how has it managed to predict these four differently?
Here is a sample of the input, the set-up, and the confusion matrix at the end. It seems alright with the EETs, but not with the NEETs. Maybe it’s just a lack of input?
Most of the questions I asked in my first post still stand. Without answers to those, we’re not really in a position to help you diagnose any issues.
Your cropped screenshots just raise more questions. Solely from your description and the screenshots, it looks like you’re using a decision tree to predict a classification, and employing only 1 feature (absence rate)? Is that correct? If this is the case, I don’t really see the point of this or how it could possibly work.
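To make that concern concrete, here is a rough sketch in plain Python (not KNIME) of what a tree with a single numeric feature boils down to: a cut point on absence rate. The data is entirely made up for illustration and is not your dataset. The single-threshold "stump" is a simplification too, since a real decision tree learner typically uses an impurity measure such as Gini and may split more than once. The point is only that wherever the two classes overlap on that one feature, no choice of threshold can separate them, so some rows will always land on the wrong side of the confusion matrix.

```python
# Synthetic (absence_rate, outcome) pairs, invented for illustration only.
RECORDS = [
    (0.02, "EET"), (0.05, "EET"), (0.10, "EET"), (0.15, "EET"),
    (0.20, "EET"), (0.30, "EET"), (0.45, "NEET"), (0.55, "EET"),
    (0.60, "NEET"), (0.70, "EET"), (0.80, "NEET"), (0.90, "NEET"),
    (0.95, "NEET"),
]


def count_errors(records, threshold):
    """Count misclassifications when predicting NEET for rate > threshold."""
    errors = 0
    for rate, outcome in records:
        predicted = "NEET" if rate > threshold else "EET"
        if predicted != outcome:
            errors += 1
    return errors


def best_threshold(records):
    """Try the midpoint between each pair of neighbouring rates and keep
    the first one with the fewest errors (a hypothetical one-split stump)."""
    rates = sorted({rate for rate, _ in records})
    candidates = [(a + b) / 2 for a, b in zip(rates, rates[1:])]
    best = candidates[0]
    for t in candidates[1:]:
        if count_errors(records, t) < count_errors(records, best):
            best = t
    return best


def confusion_matrix(records, threshold):
    """Return counts keyed by (actual, predicted)."""
    matrix = {(a, p): 0 for a in ("EET", "NEET") for p in ("EET", "NEET")}
    for rate, outcome in records:
        predicted = "NEET" if rate > threshold else "EET"
        matrix[(outcome, predicted)] += 1
    return matrix


threshold = best_threshold(RECORDS)
matrix = confusion_matrix(RECORDS, threshold)
print("chosen threshold:", threshold)
for (actual, predicted), count in sorted(matrix.items()):
    print(f"actual={actual:4s} predicted={predicted:4s} count={count}")
```

In this toy data a couple of EET rows have high absence rates, so even the best cut point misclassifies them. If your real data overlaps in a similar way, that alone could produce the kind of confusion matrix you describe, which is why the missing details about your data and set-up matter.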