How to fix biased random forest prediction while maintaining accuracy?

Lucius_L · January 25, 2025, 5:11am

Sorry for the noob question, I'm new to KNIME and can't figure this out. I have tried 3 models (KNN, decision tree, and random forest) to predict loan status, and all of them predict more “1” than “0”, especially random forest, which predicts “1” almost exclusively.

I have tried stratified sampling and SMOTE, but I guess I did something wrong because neither worked.
I have also tried equal size sampling: the predictions are less biased, but accuracy drops from an already low 60% or so to around 40%.

Would appreciate any help on how to fix the biased prediction while maintaining accuracy (or maybe improve accuracy?), thank you!

k10shetty1 · January 29, 2025, 12:56pm

Hello @Lucius_L,

Welcome to the Forum.

I would suggest avoiding linear combinations of existing features, as they don’t add new information. Instead, compute feature importance to identify the key variables and use those. Optimize the model parameters (e.g., tree depth, number of trees), apply pruning techniques, and avoid forcing specific columns for root splitting. Also, make sure SMOTE or any other balancing technique is applied correctly.
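
As a rough illustration of that last point, here is a minimal Python sketch (not a KNIME workflow) of one common way to apply balancing: split the data first, balance only the training partition, and score on the untouched test partition using per-class recall rather than plain accuracy. Everything in it is an assumption made for illustration: the 80/20 toy labels, the helper names, and the use of simple random oversampling as a stand-in for SMOTE. In KNIME this would roughly correspond to placing the Partitioning node before the balancing node, so that no duplicated or synthetic rows reach the test set.

```python
import random
from collections import Counter

def stratified_split(labels, test_fraction=0.3, seed=0):
    """Split row indices so each class keeps about the same share in both parts."""
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    train_idx, test_idx = [], []
    for idx in by_class.values():
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test_idx += idx[:n_test]
        train_idx += idx[n_test:]
    return train_idx, test_idx

def oversample_minority(indices, labels, seed=0):
    """Randomly duplicate rows of the smaller classes until all classes
    are as large as the biggest one (a simpler stand-in for SMOTE)."""
    rng = random.Random(seed)
    by_class = {}
    for i in indices:
        by_class.setdefault(labels[i], []).append(i)
    target = max(len(idx) for idx in by_class.values())
    balanced = []
    for idx in by_class.values():
        balanced += idx
        balanced += [rng.choice(idx) for _ in range(target - len(idx))]
    return balanced

def per_class_recall(y_true, y_pred):
    """Recall for each class; the mean of these is the balanced accuracy."""
    hits, totals = Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        hits[t] += (t == p)
    return {c: hits[c] / totals[c] for c in totals}

# Hypothetical toy data: 80 loans with status 1 and 20 with status 0.
labels = [1] * 80 + [0] * 20

# 1) Split first, 2) balance the training partition only.
train_idx, test_idx = stratified_split(labels)
balanced_train = oversample_minority(train_idx, labels)
train_counts = Counter(labels[i] for i in balanced_train)
test_counts = Counter(labels[i] for i in test_idx)

# 3) Score on the untouched test partition. A "model" that always
#    predicts 1 shows why plain accuracy hides the bias.
y_true = [labels[i] for i in test_idx]
y_pred = [1] * len(y_true)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
recall = per_class_recall(y_true, y_pred)
balanced_accuracy = sum(recall.values()) / len(recall)

print("train class counts:", dict(train_counts))
print("test class counts:", dict(test_counts))
print("accuracy:", accuracy, "| recall per class:", recall)
print("balanced accuracy:", balanced_accuracy)
```

The always-1 predictor scores well on plain accuracy while never finding a single “0”, which is why per-class recall (or balanced accuracy) is usually a more informative number to track than accuracy alone when one class dominates.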

Hope this helps.

Best,
Keerthan
