k-fold cross-validation for Random Forest

Hello Everyone,

Quick question - is k-fold cross-validation commonly used for Random Forest models? I understand that since Random Forests use OOB, cross-validation isn’t technically needed but is it wrong to use cross-validation for RF models?

Thanks

Hi @Subha_D

Yes, it is. It is true that RF includes OOB (out-of-bag) error estimation in its algorithm (although it is only an option in most implementations). However, the fact that it includes OOB does not mean that k-fold cross-validation (KFCV) is not needed to evaluate or compare its performance.
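As a quick illustration of the OOB option mentioned above, here is a minimal sketch using scikit-learn, where OOB scoring is off by default and has to be enabled explicitly (the dataset is just a built-in toy example):

```python
# Minimal sketch: enabling the optional OOB estimate in scikit-learn's
# RandomForestClassifier (oob_score is False by default).
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

X, y = load_breast_cancer(return_X_y=True)

rf = RandomForestClassifier(n_estimators=200, oob_score=True, random_state=0)
rf.fit(X, y)

# OOB accuracy: each tree is evaluated on the samples left out
# of its bootstrap sample, giving a "free" internal estimate.
print(f"OOB accuracy: {rf.oob_score_:.3f}")
```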

Usually, one doesn’t try just a single machine learning method but several, in order to determine which one is best at the task. A good technique for evaluating performance, comparing it across ML methods, and choosing the best one is definitely KFCV, regardless of whether the compared methods include an internal technique that can already shed some light on how well they are performing.
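To make the comparison idea concrete, here is a minimal sketch (assuming scikit-learn and a built-in toy dataset, not your actual data) where the same k-fold splits score a Random Forest against a second model:

```python
# Minimal sketch: comparing a Random Forest with another classifier
# using 5-fold cross-validation on the same data.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

models = {
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "logistic_regression": LogisticRegression(max_iter=5000),
}

results = {}
for name, model in models.items():
    # One accuracy score per fold; mean +/- std summarizes each model
    scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
    results[name] = scores
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```

Because both models are scored on identical folds, the mean scores are directly comparable, which is exactly what the OOB estimate alone cannot give you across different methods.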

Just by googling -Random Forest, Cross Validation and JCIM- (Journal of Chemical Information and Modeling) you’ll see how popular KFCV is for evaluating RF in this field, for instance. I’m citing this example because I gather from one of your previous posts on tautomers that you work in this field.

And to answer your last question: no, it is definitely not wrong.

Hope this helps.

Best

Ael


Hi @aworker ,

Thank you for the explanation! And yes I’m currently doing my dissertation in this field.

Thank you,
Subha


Hi @Subha_D

My pleasure to help you. I’d be grateful to have a copy of your dissertation once it is finished, if possible.

Best

Ael


Hi Ael,

I am happy to send you a copy once it has been completed and corrected (which will take a while).

Thanks,
SD


Hi Subha,

Thanks a lot. Let’s keep in touch in the meantime, and I’d be glad to continue helping you if possible.

Best wishes for your dissertation!

Ael


This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.