Using R Snippet

Hi Knimers!

I'm still unsure how to use R in KNIME and need a hand here.

1- What is the difference between R Snippet and R Snippet (2:1)?

2- I still don't know how to use the script editor (sorry for that). My main issues are:
        2.1 - I couldn't understand why the script editor starts with the code "rOut <- kIn"
        2.2 - How can I install a new package (Hmisc for example)?
        2.3 - How can I run the code below using all of my columns? (knime.out <- rcorr(as.matrix(knime.in)))

I appreciate your support.

Tks

Hi Fabio,

Most of the answers you need can be found in the node documentation or in the R Snippet node wiki here: https://github.com/knime-mpicbg/knime-scripting/wiki/R-server-for-knime

Nevertheless I will do my best to provide some additional explanations below.

 

1- What is the difference between R Snippet and R Snippet (2:1)?

In short, the R Snippet (2:1) node is exactly like the R Snippet node, except that it takes two distinct tables as input. This is useful if the task you are going to perform in R requires two matrices or vectors as input (e.g. calculating a covariance matrix between two sets of columns). Otherwise you can stay with the R Snippet node.
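As a rough illustration of a two-table task, here is a minimal sketch in plain R. The names first_table and second_table are hypothetical stand-ins for the two inputs; I have not shown the actual input variable names of the (2:1) node here, so check the node description for those.

```r
# Hypothetical stand-ins for the two input tables (same number of rows).
first_table  <- data.frame(a = c(1, 2, 3, 4), b = c(2, 1, 4, 3))
second_table <- data.frame(c = c(10, 20, 30, 40))

# cov(x, y) gives the covariance between the columns of x and the columns of y.
cross_cov <- cov(as.matrix(first_table), as.matrix(second_table))

# Convert back to a data.frame, as you would before handing a table to KNIME.
result <- as.data.frame(cross_cov)
print(result)
```

The result has one row per column of the first table and one column per column of the second table.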

 

2.1 I couldn't understand why the script editor starts with the code "rOut <- kIn"

This is the default code in the node. It takes the input table from KNIME (kIn) and passes it straight, as is, to the output (rOut). This makes the node fully transparent, which is fine when you first place it in a workflow, but it does not accomplish anything. What you normally do is modify the code so that it produces some useful output.

For example, if you want to take a numeric input table and divide all its elements by 2 you can use this code:

rOut <- kIn / 2
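If you want to try this outside KNIME first, here is a small sketch you can paste into a plain R console. The kIn table below is made up for illustration; inside the node, kIn is provided by KNIME and you would not define it yourself.

```r
# Made-up numeric table standing in for the KNIME input.
kIn <- data.frame(x = c(2, 4, 6), y = c(10, 20, 30))

# The default script: the output is an unchanged copy of the input.
passthrough <- kIn

# The modified script: every element is divided by 2.
rOut <- kIn / 2
print(rOut)
```

Note that the division only works like this when all columns are numeric.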

 

2.2 - How can I install a new package (Hmisc for example)?

The R Snippet nodes, which are part of the Community Nodes, rely on Rserve to function. This means you need a running Rserve instance, either on the same machine or on a remote one (note that this works under Windows but it is not recommended). To start Rserve, first start your local/remote R session, then run the R command:

library(Rserve); Rserve(args='--vanilla');

This basically creates an R instance you can control from KNIME. The wiki article I linked above explains it all, including what to do if you are running on a Windows machine. Whatever packages you have installed in your local/remote R instance will be available in the R Snippet node as well, provided you load them first with library() or require().

To install the Hmisc package, start your local/remote R instance and type:

install.packages("Hmisc")

in the R console. This makes the package available via Rserve too. By the way, for everything to work as expected, the Rserve package needs to be installed as well and properly configured in KNIME, but I guess you knew that already.
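If you are not sure which packages are already there, a quick check along these lines can help. This is only a sketch: missing_packages is a helper name I made up, and it only reports what is missing rather than installing anything.

```r
# Hypothetical helper: return the names of the packages that cannot be loaded.
missing_packages <- function(pkgs) {
  available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!available]
}

# Check the two packages discussed in this thread.
to_install <- missing_packages(c("Rserve", "Hmisc"))

if (length(to_install) > 0) {
  message("Still to install: ", paste(to_install, collapse = ", "))
} else {
  message("Rserve and Hmisc are both installed.")
}
```

Run it in the same R installation that Rserve is started from, since that is the one the R Snippet node will see.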

 

2.3 - How can I run the code below using all of my columns?

Once you have installed the Hmisc package, and with Rserve running, you can enter this code in the code editor of the R Snippet node:

require(Hmisc)
a <- rcorr(as.matrix(kIn))
b <- a$r
rOut <- as.data.frame(b)

Note that I have split it into steps using intermediate variables to make it clearer, but this is not strictly necessary. The rcorr function requires a numeric matrix as input, which we obtain from the kIn data.frame with the as.matrix() call.

rcorr returns a list with three elements; the first one, r, is the correlation matrix. We extract this element and pass it back to KNIME by first converting it to a data.frame and then assigning it to the rOut variable. If you forget to go back to a data.frame before assigning the output, you will very likely get a type conversion error of some sort.
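One thing to watch for when using all of your columns: if the table contains a text column, as.matrix() produces a character matrix and the correlation fails. Here is a sketch that keeps only the numeric columns first. The kIn table is made up for illustration, and the fallback to base R's cor() is my own addition so the snippet also runs where Hmisc is not installed (it gives the same Pearson correlations for complete data).

```r
# Made-up input with one non-numeric column; in the node, kIn comes from KNIME.
kIn <- data.frame(
  height = c(1.62, 1.75, 1.80, 1.68, 1.91, 1.73),
  weight = c(58, 72, 81, 65, 95, 70),
  group  = c("a", "b", "a", "b", "a", "b"),
  stringsAsFactors = FALSE
)

# Keep only the numeric columns so that as.matrix() yields a numeric matrix.
numeric_cols <- vapply(kIn, is.numeric, logical(1))
m <- as.matrix(kIn[numeric_cols])

# Use rcorr when Hmisc is available, otherwise fall back to base R's cor().
corr <- if (requireNamespace("Hmisc", quietly = TRUE)) Hmisc::rcorr(m)$r else cor(m)

# Back to a data.frame before handing the result to the output variable.
rOut <- as.data.frame(corr)
print(rOut)
```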

 

Give it a try and let me know if it works as expected.

Cheers,
Marco.

Hi Marco!

Works perfectly!

Thank you again for your support!
With your help I'm improving fast in KNIME!

Success!