Quick question - is k-fold cross-validation commonly used for Random Forest models? I understand that since Random Forests use OOB, cross-validation isn’t technically needed but is it wrong to use cross-validation for RF models?
Yes, it is commonly used. It is true that RF includes OOB (Out-of-Bag) error estimation in its algorithm (although it is only an option in most implementations), but the fact that it includes OOB does not mean that k-fold cross-validation (KFCV) is unnecessary to evaluate or compare its performance.
Usually, one does not try just one machine learning method but several, to determine which performs best at the task. KFCV is a good technique for evaluating performance in a way that is comparable across ML methods, so you can choose the best one, regardless of whether any of the compared methods includes an internal technique that already sheds some light on how well it is performing.
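As an illustrative sketch (not from this thread), here is how you might compare the OOB estimate with a 5-fold CV score for a Random Forest in scikit-learn; the toy dataset and all parameter values are placeholders:

```python
# Sketch: comparing the OOB estimate with 5-fold cross-validation
# for a Random Forest, using scikit-learn on a synthetic toy dataset.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# OOB: each tree is scored on the samples left out of its bootstrap draw.
# Note that oob_score=True must be requested explicitly.
rf = RandomForestClassifier(n_estimators=200, oob_score=True, random_state=0)
rf.fit(X, y)
print(f"OOB accuracy:       {rf.oob_score_:.3f}")

# 5-fold CV: the whole forest is retrained and scored on held-out folds,
# giving an estimate that is directly comparable across model types
# (you could pass any other estimator to cross_val_score the same way).
cv_scores = cross_val_score(rf, X, y, cv=5)
print(f"5-fold CV accuracy: {cv_scores.mean():.3f} +/- {cv_scores.std():.3f}")
```

The two numbers are usually close, but only the CV protocol generalizes to models that have no OOB mechanism, which is why it is the standard choice when comparing methods.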
Just googling -Random Forest, Cross Validation and JCIM- (Journal of Chemical Information and Modeling) you'll see how popular KFCV is for evaluating RF in this field, for instance. I'm citing this example because I guess from one of your previous posts on Tautomers that you work in this field.
So to answer your second question: no, it is definitely not wrong to use cross-validation for RF models.
Hope this helps.
Hi @aworker ,
Thank you for the explanation! And yes I’m currently doing my dissertation in this field.
My pleasure to help you. I would be grateful to receive a copy of your dissertation once it is finished, if possible.
I am happy to send you a copy once it has been completed and corrected (which will take a while).
Thanks a lot. Let's keep in touch in the meantime, and I'd be glad to continue helping you if possible.
Best wishes for your dissertation!
This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.