# Random Forest - Random Forest column split and candidate counts

Hi,

In a trained Random Forest, the Attribute Statistics include #splits (level 0), #splits (level 1), #splits (level 2), #candidates (level 0), #candidates (level 1), #candidates (level 2).

It is easy to understand #splits (level 0) and #candidates (level 0).

Does anybody know how the following numbers are determined/calculated?
#splits (level 1) and #candidates (level 1)
#splits (level 2) and #candidates (level 2)

Thank you!

Hi,
Conceptually, those numbers are calculated in the same way as those for the root split, just for the second and third levels of the tree.
The candidate number indicates how often the attribute was in the attribute sample used to find a split at that level, and the number of splits is the number of times the attribute won the split.
Note that the numbers roughly double with each level because, with binary splits, the second level contains up to two splits per tree and the third level up to four.
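
To make the counting concrete, here is a minimal Python sketch. It is not KNIME's implementation: the `forest` structure, the attribute names, and the `attribute_statistics` function are all made up for illustration. Each split is recorded as its level, the attributes sampled as candidates, and the attribute that won.

```python
from collections import Counter

# Hypothetical toy forest: each tree is a list of splits, and each split is
# (level, attributes sampled as candidates, attribute that won the split).
forest = [
    [  # tree 1
        (0, ["price", "size"], "price"),
        (1, ["size", "age"], "size"),
        (1, ["price", "age"], "price"),
    ],
    [  # tree 2
        (0, ["price", "age"], "age"),
        (1, ["price", "size"], "price"),
        (1, ["size", "age"], "size"),
    ],
]


def attribute_statistics(forest, max_level=2):
    """Count, per (attribute, level), how often it was a candidate and how often it won."""
    candidates = Counter()
    splits = Counter()
    for tree in forest:
        for level, sampled, winner in tree:
            if level > max_level:
                continue  # only the first few levels are reported
            for attribute in sampled:
                candidates[(attribute, level)] += 1
            splits[(winner, level)] += 1
    return splits, candidates


splits, candidates = attribute_statistics(forest)
for attribute in ["price", "size", "age"]:
    for level in range(2):
        print(
            f"{attribute}: #splits (level {level}) = {splits[(attribute, level)]}, "
            f"#candidates (level {level}) = {candidates[(attribute, level)]}"
        )
```

In this toy data every tree has one split at level 0 and two at level 1, so the level 1 counts sum to twice the level 0 counts across all attributes.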


A follow-up of the same question. With the Random Forest Learner node, we can derive variable importance by using the # of times an attribute “was a candidate” vs how many times it “won the split”, but we can’t see where the split is.

For example, if a continuous variable “price” is, at level 0, the most important attribute, how can I know at which price the split is happening (if at all)? I know Random Forests are ensembles of trees, so even if price “wins” 10/10 splits, odds are it “wins” the splits at different price points; are these points of split also averaged?

Best,
Joel

Hello @JoelMenendez,

I don’t understand how the split point would affect the variable importance, but if you want to see the split point, you can go to the node’s view, where the splits are displayed.
You can also extract the individual decision trees as PMML, which is an XML-based format, and process it further to get this kind of information.
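
As a rough sketch of that further processing, the Python below collects the numeric split values for one field from a PMML tree. It assumes the splits appear as standard PMML `SimplePredicate` elements with `field`, `operator`, and `value` attributes. The `PMML_SNIPPET` is hand-written and simplified for illustration, not actual KNIME output, and `split_points` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hand-written, simplified PMML fragment for illustration only.
# A real export contains more elements (data dictionary, scores, and so on).
PMML_SNIPPET = """<PMML xmlns="http://www.dmg.org/PMML-4_2" version="4.2">
  <TreeModel functionName="classification">
    <Node id="0">
      <True/>
      <Node id="1">
        <SimplePredicate field="price" operator="lessOrEqual" value="19.5"/>
      </Node>
      <Node id="2">
        <SimplePredicate field="price" operator="greaterThan" value="19.5"/>
      </Node>
    </Node>
  </TreeModel>
</PMML>"""


def local_name(tag):
    """Strip the XML namespace, e.g. '{http://...}Node' -> 'Node'."""
    return tag.rsplit("}", 1)[-1]


def split_points(pmml_text, field):
    """Return the sorted, distinct numeric split values used for `field`."""
    root = ET.fromstring(pmml_text)
    values = set()
    for element in root.iter():
        if local_name(element.tag) == "SimplePredicate" and element.get("field") == field:
            values.add(float(element.get("value")))
    return sorted(values)


print(split_points(PMML_SNIPPET, "price"))
```

Running this over the PMML of every tree in the forest would give the set of price thresholds each tree uses. The `float` conversion only works for numeric attributes, so nominal splits would need separate handling.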

Best,