I built a linear regression model to forecast revenue, using around 25 variables as in the workflow below, but I am still not getting good results. Can someone suggest an optimization method to improve them? I have already tried feature selection and cross-validation.
The variables contain data such as number of samples, population, wealth index, and actual revenue as a reference.
Hi @ananggaersan
If many of your features are categorical, I'd recommend trying the random forest algorithm first. Tree-based algorithms handle categorical features acceptably by default.
If you would rather improve the linear model, one-hot encoding the categorical features may help. The KNIME node for this is “One to Many”.
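To make the one-hot idea concrete, here is a minimal sketch in plain Python of what such an encoding does: a categorical column is replaced by one 0/1 column per category. The column names, the values, and the `column=category` naming are made up for illustration; the columns KNIME actually produces may be named differently.

```python
def one_hot(rows, column):
    """Replace one categorical column with one 0/1 column per category."""
    # Collect the distinct categories in a stable (sorted) order.
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        # Copy every column except the categorical one.
        new_row = {k: v for k, v in row.items() if k != column}
        # Add one indicator column per category.
        for cat in categories:
            new_row[f"{column}={cat}"] = 1 if row[column] == cat else 0
        encoded.append(new_row)
    return encoded


# Hypothetical rows: "region" and its values are assumptions, not from the thread.
rows = [
    {"population": 1200, "region": "north"},
    {"population": 800, "region": "south"},
    {"population": 950, "region": "north"},
]

encoded = one_hot(rows, "region")
print(encoded[0])
```

With a linear model that includes an intercept, it is common to drop one of the indicator columns per feature, since the full set is perfectly collinear with the intercept.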