Straight line of best fit

Hi,

I am struggling to find anything in KNIME that can plot a few points (3 to 4) on logarithmic x and y axes and draw a straight line of best fit through them, so that the result can be used as an image in the Report Designer.

Any ideas?

The best I have found is the chart tool in the BIRT Report Designer, which will plot my data points on a logarithmic scale, but I can only get a straight line between each data point or a curved fit line, NOT a straight fit line. Am I overlooking a setting somewhere?

Thanks,

Simon.

Simon,

Did you consider using R and ggplot? stat_smooth should do the fitting nicely, see here:

http://docs.ggplot2.org/current/stat_smooth.html
 

Then add your axis scaling with scale_continuous (scale_x_log10 and scale_y_log10 for log axes):

http://docs.ggplot2.org/current/scale_continuous.html
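
As a rough illustration of the two steps above, here is a minimal sketch. The data values, the object names (df, p) and the output file name are made up for the example. It relies on ggplot2 applying the scale transformation before the fit, so the "lm" line comes out straight on the log-log axes.

```r
library(ggplot2)

# Hypothetical data: four points that roughly follow a power law.
df <- data.frame(x = c(1, 10, 100, 1000),
                 y = c(2.1, 19, 230, 1900))

p <- ggplot(df, aes(x = x, y = y)) +
  geom_point() +
  # method = "lm" requests a straight least-squares line; se = FALSE hides the band.
  stat_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  # The log scales are applied before the fit, so the line is straight on these axes.
  scale_x_log10() +
  scale_y_log10()

# Write the plot to an image file (file name is an assumption for this example).
ggsave("loglog_fit.png", plot = p, width = 5, height = 4, dpi = 150)
```

ggsave is used here only to get an image on disk. Inside a KNIME R view node, printing p is probably what is needed instead, but that is an assumption.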
 

The "purist" KNIME way would be to predict y based on x with a Linear Regression Learner, followed by plotting x/y pairs as points, and x/y_pred pairs as a line using BIRT (which will handle the log scaling). Easier w/ggplot IMHO.

Cheers
E

P.S.: Just to add, of course "best fit" is a minimised sum of squares in both cases - a maximum likelihood estimator might give a better fit. :-)


Thanks for the tips. I have managed to do it the purist KNIME way by using linear regression to create a prediction for the maximum and minimum x values and plotting this in BIRT.
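
For comparison, the same two-endpoint idea can be sketched in R. This is only an illustration with made-up data and names (x, y, fit, ends), not the KNIME workflow itself. It assumes the regression is done on log10 values, which is what makes the fitted line straight on log-log axes.

```r
# Hypothetical data points.
x <- c(1, 10, 100, 1000)
y <- c(2.1, 19, 230, 1900)

# Fit log10(y) = a + b * log10(x), i.e. a straight line on log-log axes.
fit <- lm(log10(y) ~ log10(x))

# Predict only at the smallest and largest x; two points define the line.
ends <- data.frame(x = range(x))
ends$y_pred <- 10^predict(fit, newdata = ends)

print(ends)
```

The two rows in ends are the x/y_pred pairs that would be drawn as the line series, with the original points as a separate series.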

However, I really would like to try R, but I have no idea what I am doing. I connected up the R Plot node from the R Scripting repository but haven't managed to get any further. It fails with "Could not connect to R, server is not running". I do not know what this means; in the KNIME preferences it is set to localhost. I have no idea how to start an Rserve instance.

Thanks,

Simon.

Hi Simon,

Take care *not* to use the community R scripting nodes for learning / local use, since they require an Rserve instance and aren't interactive. You will want to use the "regular" R nodes (top-level folder between "Quick Form" and "Reporting" in full installs), which version 2.9 used to call the "interactive" nodes. They are truly the latest and greatest, and they now include an R v3 environment which runs out-of-the-box on Windows. AFAIK both Linux and Mac OS X require a separate R package install.

Note that you'll need to run the following before using ggplot2 (same for any other non-core package):

install.packages("ggplot2")

Hope that helps.

Cheers
E

This should work in standard R, if ggplot is not essential.
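
A minimal base R sketch of the same idea, without ggplot, might look like the following. The data values and the output file name are made up for illustration, and the fit is assumed to be done on log10 values so that the line is straight on the log-log axes.

```r
# Hypothetical data points.
x <- c(1, 10, 100, 1000)
y <- c(2.1, 19, 230, 1900)

# Least-squares fit in log-log space.
fit <- lm(log10(y) ~ log10(x))

# Open a PNG device (file name is an assumption for this example).
png("loglog_fit_base.png", width = 600, height = 480)

# Scatter plot with both axes logarithmic.
plot(x, y, log = "xy", pch = 19)

# Draw the fitted line between its two end points, back-transformed from log10.
xs <- range(x)
lines(xs, 10^predict(fit, newdata = data.frame(x = xs)))

dev.off()
```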