How to render a scatter plot including best fit straight line? (without R)

Hi guys,

Is there a way in KNIME (without using R) to represent a scatter plot including a straight line of the best fit and the coefficient of determination R2?

I tried using the Scatter Plot (JFreeChart) node (so that I can later export the chart to a KNIME report) and also using BIRT directly, but I'm not succeeding.

Thanks in advance for any help.

Gio

The 2d/3d scatter plot from the Erlwood community nodes is what you need.

simon.

Hi Simon,

Thank you for your fast reply. I know that node and it's great, but the problem is that after generating the chart I want to export it to the reporting system (BIRT) and I think that is not possible from there.

Please, do you know if there are other possibilities? 

Okay, unfortunately I haven't found a way of doing a line of best fit or R2 directly in BIRT.

But you can do it indirectly.

You can use the Linear Correlation node on the x and y variables to get the Pearson correlation coefficient, then use the Math Formula node to square it so you get the R2. You can then pass that value into a column to use in BIRT to display the R2.
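To make the arithmetic concrete, here is a minimal sketch in plain Python (not KNIME nodes) of what those two steps compute. The function names `pearson_r` and `r_squared` are made up for illustration. It relies on the fact that, for a straight-line fit with an intercept, R2 equals the square of the Pearson coefficient.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant column")
    return sxy / math.sqrt(sxx * syy)


def r_squared(xs, ys):
    """R2 of a simple linear fit, obtained by squaring Pearson's r."""
    return pearson_r(xs, ys) ** 2
```

For example, `r_squared([1, 2, 3], [1, 3, 2])` gives 0.25, since the Pearson coefficient for those points is 0.5.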

For the line of best fit, I fed the x and y variables into a linear regression model, and then used the predictor node to predict the y value for the lowest x value on your axis and for the highest x value on your axis. Then feed these predictions into your KNIME table in a new column which goes to the reporting node for BIRT.

Then in your scatter plot in BIRT, you just add your data in the normal way, plus an extra data series holding the two predicted points (lowest x and highest x with their predicted y values), and choose to draw a line between those two data points. You now have your line of best fit.
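As a rough illustration of the two points that extra series needs, here is a small plain-Python sketch. It assumes the regression is an ordinary least-squares straight-line fit; the names `fit_line` and `best_fit_endpoints` are hypothetical and not part of KNIME or BIRT.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def best_fit_endpoints(xs, ys):
    """Predicted (x, y) at the lowest and highest x: the two points to join."""
    slope, intercept = fit_line(xs, ys)
    x_lo, x_hi = min(xs), max(xs)
    return [(x_lo, slope * x_lo + intercept), (x_hi, slope * x_hi + intercept)]
```

Two points are enough because the fitted model is a straight line, so joining the predictions at the smallest and largest x reproduces the whole line across the plotted range.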

It's a bit long-winded but it works well.

simon.

Very nice workaround! I didn't think about it. I will give it a try.

Thanks a lot Simon!

Thanks Richards99

I tried this:

You can use the Linear Correlation node on the x and y variables to get the Pearson correlation coefficient, then use the Math Formula node to square it so you get the R2. You can then pass that value into a column to use in BIRT to display the R2.

For the line of best fit, I fed the x and y variables into a linear regression model, and then used the predictor node to predict the y value for the lowest x value on your axis and for the highest x value on your axis. Then feed these predictions into your KNIME table in a new column which goes to the reporting node for BIRT.

Question: the output from the correlation node is a correlation measure. How do I get access to this measure so that I can feed it into a Math Formula node, please?

Thanks