The [Linear Regression Learner] Node in question is found in the attached workflow:
Qn003 - Large coefficient values.knar (1022.4 KB)
The dataset that the [Linear Regression Learner] Node takes in for training is all numeric. A number of the columns contain binary data: 0 or 1.
Is it normal/usual to see absolute coefficient values above 1 trillion in the “Coeff.” column?
In the screenshot, their p-values (the “P>|t|” column) are below 0.05, which suggests they are significant enough to be included in the linear regression formula. But I’m holding back because the coefficient values are so large.
I think you should consider a few changes to your workflow to see if they improve the results:
- Converting the binary integer columns to string, since that data is actually categorical
- Removal of columns with extremely low variance
We have nodes to support all these functions. Is this for an assignment, by any chance?
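For readers following along outside of KNIME, the two suggestions above can be sketched in pandas (this is an illustrative equivalent, not the KNIME nodes themselves; the variance threshold and the helper name `preprocess` are placeholders):

```python
import pandas as pd

def preprocess(df, variance_threshold=1e-4):
    """Rough pandas equivalent of the two suggested steps.

    1. Treat 0/1 integer columns as categorical (string) data.
    2. Drop remaining numeric columns with (near) zero variance.
    """
    df = df.copy()
    # Step 1: columns whose values are only 0/1 become strings,
    # so a learner will treat them as categories, not magnitudes.
    for col in df.select_dtypes(include="number").columns:
        if set(df[col].dropna().unique()) <= {0, 1}:
            df[col] = df[col].astype(str)
    # Step 2: remove numeric columns that carry almost no information.
    low_var = [c for c in df.select_dtypes(include="number").columns
               if df[c].var() < variance_threshold]
    return df.drop(columns=low_var)
```

In KNIME the same effect would come from the corresponding conversion and column-filter nodes; the sketch just makes the logic explicit.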
To answer the question, yes, you guessed it correctly! This was for my first academic assignment using Knime.
Pertaining to point #1, the (many) binary-data columns in ‘Qn003.csv’ were the result of one-hot encoding of the original categorical columns. I was told it is possible that some learning algorithms work better with numbers than with the String data type, so I avoided label encoding.
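For what it's worth, one-hot encoding *every* level of a categorical column while also fitting an intercept is a classic source of unstable coefficients (the "dummy variable trap"): the dummy columns sum to 1 in every row, so they are perfectly collinear with the intercept. A small pandas sketch of the difference (pandas used purely for illustration; the toy `city` column is made up):

```python
import pandas as pd

df = pd.DataFrame({"city": ["A", "B", "C", "A"]})

# Encoding every level keeps all three dummy columns. Together with an
# intercept term they are redundant: each row's dummies sum to exactly 1.
full = pd.get_dummies(df, columns=["city"])

# Dropping one reference level removes that redundancy.
reduced = pd.get_dummies(df, columns=["city"], drop_first=True)
```

If the workflow one-hot encoded all levels of each original categorical column, dropping one dummy per original column (or any equivalent step in KNIME) may be worth trying.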
Thank you for suggesting points #2 and #3.
My actual assignment workflow had more Nodes than ‘Qn003 - Large coefficient values.knar’.
The knar file posted earlier was created to ask the community whether any members have encountered such large coefficient values (output by the [Linear Regression Learner] Node) before, and if so, whether they could share the known cause and how they eventually rectified it.
I could exclude the columns for which the learner node reports large coefficient values, but something tells me that wouldn’t address the underlying problem.
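In case it helps pin down the cause: when two columns are near-duplicates (as can easily happen among many one-hot columns), ordinary least squares can "explain" a tiny wiggle in the target by trading enormous opposite-signed coefficients against each other, which is the usual story behind trillion-scale values. A tiny NumPy demonstration with made-up numbers (not the Qn003 data):

```python
import numpy as np

# Two predictors that differ only by a tiny amount in one row.
x1 = np.array([1.0, 2.0, 3.0, 4.0])
x2 = x1.copy()
x2[3] += 1e-6                     # near-perfect collinearity

# Target that is essentially 3 * x1 plus a small wiggle in that same row.
y = 3.0 * x1
y[3] += 0.01

X = np.column_stack([x1, x2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
# The fit attributes the 0.01 wiggle to the 1e-6 gap between the columns,
# so the coefficients blow up to roughly -9997 and +10000, even though
# their sum stays near the sensible value of 3.
```

This is why simply excluding the offending columns feels unsatisfying: the inflated values are a symptom of multicollinearity in the design matrix, not a property of any single column. Checking for (near-)duplicate or linearly dependent columns, or dropping a reference level per one-hot group, targets the cause directly.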