Random forest missing domain information

Hi, I have a random forest model and I am getting the following warning: “WARN Random Forest Learner (Regression) 6:128 15 column(s) were ignored due to missing domain information: – change the node configuration or use a Domain Calculator node to fix it”
The model uses almost 600 columns, but these 15 columns are different iterations of a column I originally had. The original column is a simple numerical measurement, and its values do not differ much. I also tried replacing the empty values with 0, but that didn’t change anything. I am not sure how to use the Domain Calculator node or how to change the Random Forest Learner configuration. I would appreciate any help.

Hi @amirmbhd,

The Domain Calculator node scans the selected columns and updates their domain information, including the list of possible values for nominal columns. If you have a nominal column (String type, for example) with more than 60 unique values, you need to increase the maximum number of possible values in the node’s configuration.
Please give it a try and see if it solves the issue.
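
To make the idea of a column "domain" more concrete, here is a rough conceptual sketch in Python. It is not KNIME's actual implementation; the function name `compute_domain` and the `max_values` parameter are hypothetical, and the default of 60 simply mirrors the limit mentioned above. The assumption is that a numeric column's domain is its lower and upper bound, while a nominal column's domain is its list of distinct values, which is dropped when there are too many of them.

```python
from typing import Any, Dict, List, Optional


def compute_domain(values: List[Any], max_values: int = 60) -> Dict[str, Optional[Any]]:
    """Conceptual sketch of a column domain (hypothetical helper, not a KNIME API)."""
    # Missing cells (None) do not contribute to the domain.
    present = [v for v in values if v is not None]
    if not present:
        return {"kind": "empty", "lower": None, "upper": None, "possible_values": None}

    # Numeric column: the domain is described by its bounds.
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in present):
        return {"kind": "numeric", "lower": min(present), "upper": max(present),
                "possible_values": None}

    # Nominal column: the domain is the list of distinct values,
    # kept only if it does not exceed the configured limit.
    distinct = sorted({str(v) for v in present})
    possible = distinct if len(distinct) <= max_values else None
    return {"kind": "nominal", "lower": None, "upper": None, "possible_values": possible}
```

In this sketch, a String column with 61 distinct values ends up with no value list under the default limit, which is the kind of situation where raising the limit in the node configuration would help.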

In addition, I would consider some reduction of dimensionality in your dataset, if possible. You might need all 600 of those columns, depending on the context of the problem. But there’s a good chance you can get by with a much smaller subset without sacrificing the predictive capability of the model, and improve performance to boot.
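
As one illustration of what such a reduction could look like, here is a minimal Python sketch of a low-variance filter, which flags columns whose values barely change. This is only one of several possible techniques and not necessarily the one the blog post describes. The helper names `low_variance_columns` and `drop_columns` and the threshold value are assumptions made for this example; the threshold depends on the scale of your data.

```python
from statistics import pvariance
from typing import Dict, List


def low_variance_columns(table: Dict[str, List[float]], threshold: float = 1e-3) -> List[str]:
    """Return names of columns whose population variance is at or below the threshold."""
    return [name for name, col in table.items() if pvariance(col) <= threshold]


def drop_columns(table: Dict[str, List[float]], names: List[str]) -> Dict[str, List[float]]:
    """Return a copy of the table without the named columns."""
    return {name: col for name, col in table.items() if name not in names}


# Toy data (made up for illustration): "a" is constant, "c" is nearly constant.
table = {
    "a": [1.0, 1.0, 1.0],
    "b": [1.0, 2.0, 3.0],
    "c": [0.5, 0.5001, 0.4999],
}
to_drop = low_variance_columns(table)
reduced = drop_columns(table, to_drop)
```

Columns measured in very different units are not directly comparable by raw variance, so in practice you would probably want to bring them onto a common scale before applying a single threshold.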

See this blog post for more detail:
