Optimizing Linear Regression Machine Learning

Hi all, or @SupportOfficer1 @SupportOfficer2,

I built a linear regression model to forecast revenue, using around 25 variables in the workflow below, but I am still not getting good results. Can someone suggest an optimization method to improve them? I have already used feature selection and cross-validation.

The variables contain data such as number of samples, number of populations, wealth index, and actual revenue as the reference.

Current Results


Cross Validation


Hi @ananggaersan
If there are many categorical features, I recommend trying the random forest algorithm first. Tree-based algorithms handle categorical features acceptably by default.

If you want to improve the linear model instead, one-hot encoding the categorical features may help. The KNIME node for this is “One to Many”.
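For anyone unsure what one-hot encoding does to the table, here is a minimal plain-Python sketch of the idea. It is not the KNIME node's implementation. The `one_hot` helper, the `region` column and the sample rows are all hypothetical and only there for illustration.

```python
def one_hot(rows, column):
    """Replace one categorical column with one 0/1 column per distinct value."""
    # Sort the categories so the order of the new columns is deterministic.
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        # Copy every other column unchanged.
        new_row = {key: value for key, value in row.items() if key != column}
        # Add one indicator column per category.
        for category in categories:
            new_row[f"{column}={category}"] = 1 if row[column] == category else 0
        encoded.append(new_row)
    return encoded


# Hypothetical data: "region" stands in for any categorical feature.
rows = [
    {"region": "north", "population": 1200, "revenue": 50.0},
    {"region": "south", "population": 800, "revenue": 31.5},
    {"region": "north", "population": 950, "revenue": 42.0},
]

encoded = one_hot(rows, "region")
for row in encoded:
    print(row)
```

Each distinct category becomes its own numeric column, which a linear model can use directly. With a linear regression that has an intercept, one of the indicator columns is redundant, so it is common to drop one of them.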


Hi @tomljh ,

Thanks for the suggestion. May I know in which part of the workflow I should put the One to Many node?

That node belongs to feature preprocessing, so it is generally used after reading the data and before splitting the dataset.
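The ordering (read, encode, then split) can be sketched in plain Python as below. This is only an illustration under assumed names: `one_hot`, `split`, the `region` column and the in-memory rows are hypothetical stand-ins for the reader, One to Many and partitioning steps of a workflow.

```python
import random


def one_hot(rows, column):
    """Replace one categorical column with one 0/1 column per distinct value."""
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        new_row = {key: value for key, value in row.items() if key != column}
        for category in categories:
            new_row[f"{column}={category}"] = 1 if row[column] == category else 0
        encoded.append(new_row)
    return encoded


def split(rows, train_fraction=0.8, seed=42):
    """Shuffle a copy of the rows and cut it into train and test parts."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Step 1: read the data (hypothetical in-memory rows instead of a file reader).
rows = [
    {"region": "north", "revenue": 50.0},
    {"region": "south", "revenue": 31.5},
    {"region": "east", "revenue": 27.0},
    {"region": "north", "revenue": 42.0},
    {"region": "south", "revenue": 35.5},
]

# Step 2: encode before splitting, so both parts get the same columns.
encoded = one_hot(rows, "region")

# Step 3: split the encoded table into training and test sets.
train, test = split(encoded)
print(len(train), "training rows,", len(test), "test rows")
```

Encoding before the split means the training and test tables always share the same set of indicator columns, even if a category happens to appear in only one of them.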