Optimizing Linear Regression Machine Learning

ananggaersan · July 4, 2024, 9:11am

Hi All or @SupportOfficer1 @SupportOfficer2 ,

I built a linear regression model to forecast revenue using around 25 variables, as shown in the flow below, but I am still not getting good results. Can someone suggest an optimization method to improve the results? I have already used feature selection and cross-validation.

The variables contain data such as number of samples, number of populations, wealth index, and actual revenue as a reference.

Current Results

Flow

Cross Validation

Thanks

tomljh · July 4, 2024, 9:22am

Hi @ananggaersan
If many of your features are categorical, I recommend trying the Random Forest algorithm first. Tree-based algorithms handle categorical features acceptably by default.

If you want to improve the linear model itself, one-hot encoding the categorical features may help. The KNIME node for this is “One to Many”.
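For readers who want to see what one-hot encoding does to a table, here is a minimal sketch in plain Python. It is not the KNIME node's implementation; the helper `one_hot_encode`, the column names, and the sample rows are all made up for illustration.

```python
def one_hot_encode(rows, column):
    """Replace one categorical column with one 0/1 indicator column per category.

    Hypothetical helper for illustration only, not a KNIME API.
    """
    # Collect the distinct category values in a stable (sorted) order.
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        # Copy every column except the categorical one being expanded.
        new_row = {key: value for key, value in row.items() if key != column}
        for category in categories:
            new_row[f"{column}={category}"] = 1 if row[column] == category else 0
        encoded.append(new_row)
    return encoded


# Made-up sample data; the real table has around 25 variables.
rows = [
    {"region": "north", "population": 1200, "revenue": 310.0},
    {"region": "south", "population": 800, "revenue": 150.0},
    {"region": "north", "population": 950, "revenue": 240.0},
]

encoded = one_hot_encode(rows, "region")
for row in encoded:
    print(row)
```

Each text category becomes its own numeric column, which a linear regression can use directly. With a linear model that includes an intercept, it is common to drop one of the indicator columns per feature so the columns are not perfectly collinear.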

BR

ananggaersan · July 4, 2024, 10:27am

Hi @tomljh ,

Thanks for the suggestion. May I know in which part of the flow I should put this One to Many node?

tomljh · July 4, 2024, 11:03am

That node belongs to feature preprocessing, so it is generally used after reading the data and before splitting the dataset.
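The ordering (read, encode, then split) can be sketched in plain Python as below. This is only an illustration under assumed names: `one_hot_encode`, `split_rows`, the 25% test fraction, the seed, and the sample rows are all hypothetical and do not correspond to specific KNIME nodes or settings.

```python
import random


def one_hot_encode(rows, column):
    """Expand one categorical column into 0/1 indicator columns (hypothetical helper)."""
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        new_row = {key: value for key, value in row.items() if key != column}
        for category in categories:
            new_row[f"{column}={category}"] = 1 if row[column] == category else 0
        encoded.append(new_row)
    return encoded


def split_rows(rows, test_fraction=0.25, seed=42):
    """Shuffle a copy of the rows and split it into (train, test) lists."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Step 1: "read" the data (made-up rows standing in for the real table).
regions = ["north", "south", "east", "north", "south", "east", "north", "south"]
rows = [
    {"region": region, "population": 1000 + 50 * i, "revenue": 200.0 + 10 * i}
    for i, region in enumerate(regions)
]

# Step 2: encode the categorical column on the full table.
encoded = one_hot_encode(rows, "region")

# Step 3: split only after encoding.
train, test = split_rows(encoded)

print(len(train), len(test))
print(sorted(train[0].keys()))
```

Because the encoding runs on the full table before the split, the training and test partitions end up with exactly the same indicator columns, even if some category happens to be missing from one partition.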
