How to fix biased random forest prediction while maintaining accuracy?

CA.knwf (3.5 MB)

Sorry for the beginner question. I'm new to KNIME and can't figure this out. I have tried three models (KNN, decision tree, and random forest) to predict loan status, and all of them predict “1” more often than “0”, especially the random forest, which predicts “1” almost exclusively.

I have tried stratified sampling and SMOTE, but I guess I did something wrong, because neither worked.
I have also tried equal size sampling: the predictions are less biased, but accuracy drops from an already low 60-something percent to 40-something percent.

I would appreciate any help on how to fix the biased predictions while maintaining (or maybe even improving) accuracy. Thank you!

Hello @Lucius_L,

Welcome to the Forum.

I would suggest avoiding linear combinations of existing features, as they don’t add new information. Instead, run a feature importance analysis to identify the key variables and use those. Optimize the model parameters (e.g., tree depth, number of trees), apply pruning techniques, and avoid forcing specific columns for root splitting. Also, ensure that SMOTE or any other balancing technique is applied correctly.
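As a rough illustration of what "correctly applied" usually means for a balancing step: the data is split first, only the training partition is rebalanced, and the test partition keeps its original class ratio so the reported accuracy stays realistic. The sketch below is plain Python, not a KNIME workflow, and it uses simple random oversampling as a stand-in for SMOTE. The toy data, the 80/20 class ratio, and the helper names `stratified_split` and `oversample_minority` are all assumptions made up for this example.

```python
import random
from collections import Counter


def stratified_split(rows, label_of, test_fraction=0.3, seed=0):
    """Split rows into train/test while keeping the class ratio in both parts."""
    rng = random.Random(seed)
    by_class = {}
    for row in rows:
        by_class.setdefault(label_of(row), []).append(row)
    train, test = [], []
    for members in by_class.values():
        members = members[:]  # copy so the caller's list is left untouched
        rng.shuffle(members)
        n_test = round(len(members) * test_fraction)
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return train, test


def oversample_minority(train, label_of, seed=0):
    """Randomly duplicate rows of the smaller classes until every class
    has as many rows as the largest one (a simple stand-in for SMOTE)."""
    rng = random.Random(seed)
    by_class = {}
    for row in train:
        by_class.setdefault(label_of(row), []).append(row)
    target = max(len(members) for members in by_class.values())
    balanced = []
    for members in by_class.values():
        balanced.extend(members)
        balanced.extend(rng.choice(members) for _ in range(target - len(members)))
    rng.shuffle(balanced)
    return balanced


def label_of(row):
    """Return the class label of a (feature, loan_status) row."""
    return row[1]


# Hypothetical toy data: (feature, loan_status), with 80% of rows labelled 1.
rows = [(i, 1) for i in range(80)] + [(i, 0) for i in range(80, 100)]

# 1) Split first, 2) balance the training partition only.
train, test = stratified_split(rows, label_of, test_fraction=0.3, seed=42)
train_balanced = oversample_minority(train, label_of, seed=42)

train_counts = Counter(label_of(r) for r in train_balanced)
test_counts = Counter(label_of(r) for r in test)
print("balanced train:", dict(train_counts))
print("untouched test:", dict(test_counts))
```

In a KNIME workflow the equivalent ordering would presumably be a Partitioning node first, with SMOTE or Equal Size Sampling attached only to the training output and the learner fed from there. If the balancing happens before the split, or the test partition is also rebalanced, the accuracy figure no longer reflects the real class distribution.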

Hope this helps.

Best,
Keerthan
