Optimizing Linear Regression Machine Learning

Hi All or @SupportOfficer1 @SupportOfficer2 ,

I built a linear regression model to forecast revenue, using around 25 variables in the flow below, but I am still not getting a good result. Can someone suggest an optimization method to improve the result? I have already used feature selection and cross validation.

The variables contain data such as number of samples, number of populations, wealth index, and actual revenue as a reference.

Current Results
image

Flow

Cross Validation

Thanks

Hi @ananggaersan
If there are many categorical features, I recommend trying the random forest algorithm first. Tree-based algorithms handle categorical features acceptably by default.

If you want to improve the linear model instead, one-hot encoding the categorical features may help. The KNIME node for this is “One to Many”.
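
For anyone unsure what one-hot encoding produces, here is a minimal Python sketch using pandas (not KNIME). The column names and values are made up for illustration and are not taken from the original workflow; the idea is just that each category value becomes its own 0/1 column.

```python
import pandas as pd

# Hypothetical toy data; "region" stands in for any categorical feature.
df = pd.DataFrame({
    "region": ["north", "south", "east", "south"],
    "wealth_index": [0.8, 0.5, 0.6, 0.4],
    "revenue": [120.0, 80.0, 95.0, 70.0],
})

# Replace the categorical column with one 0/1 indicator column per category.
encoded = pd.get_dummies(df, columns=["region"], dtype=int)

print(encoded)
```

The numeric columns are left untouched, and a linear model can then fit a separate coefficient for each category.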

BR

Hi @tomljh ,

Thanks for the suggestion. May I know in which part of the flow I should put this One to Many node?

That node belongs to feature preprocessing, so it is generally used after reading the data and before splitting the dataset.
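
To make the ordering concrete, here is a rough sketch of the same sequence in Python with scikit-learn rather than KNIME nodes: read the data, encode, split, then cross-validate. The data is synthetic and the column names are assumptions for illustration only, so treat it as an outline of the order of steps, not as the actual workflow.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic stand-in for the real table (all columns are hypothetical).
rng = np.random.default_rng(0)
n = 60
df = pd.DataFrame({
    "region": rng.choice(["north", "south", "east"], size=n),
    "population": rng.integers(1000, 50000, size=n),
    "wealth_index": rng.random(n),
})
df["revenue"] = (0.01 * df["population"] + 100 * df["wealth_index"]
                 + rng.normal(0, 5, size=n))

# Step 1: one-hot encode right after reading the data.
# drop_first=True drops one indicator per feature so the remaining ones
# are not perfectly collinear with the model's intercept.
X = pd.get_dummies(df.drop(columns="revenue"), columns=["region"],
                   drop_first=True, dtype=float)
y = df["revenue"]

# Step 2: split into training and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Step 3: cross-validate on the training part, then score on the held-out part.
model = LinearRegression()
cv_scores = cross_val_score(model, X_train, y_train, cv=5, scoring="r2")
model.fit(X_train, y_train)
test_r2 = model.score(X_test, y_test)

print("CV R^2 per fold:", cv_scores)
print("Test R^2:", test_r2)
```

Encoding before the split means the training and test tables end up with the same set of indicator columns.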