Create a Node that executes custom, non-editable R code

Hi,

I'm trying to build a series of KNIME nodes that execute custom R code. Kind of like the existing "R snippet" and "R view" nodes, but the code should be contained "within" the node, and not be visible/editable by the user.

I'll need to develop my own views as well as snippets, but for now I'm trying to build a node that shows me a png of, say, a histogram of 1000 normally distributed values, as in 'hist(rnorm(1000))' in R.

What would be the best approach to do this? Should I extend the RView node somehow? (I'm not too familiar with OOP yet.) Or should I create a new node which has an "RViewNodeModel" as a property? Or would it be smarter to copy and paste the code from the existing RView node into a node of my own?

Thanks in advance,

 Alex

Hi Alex,

did you have a look at the 'Scripting Integration' extension for R? (It can be found on the community contributions site.) We created nodes (plot + snippet) which can execute R code and hide the code behind a configuration GUI (which would, for example, allow a user to set the binning for a histogram). The nodes use template files to provide several R scripts.

The second idea: Take our code and make use of the possibility to create a node which takes such a template and integrates it completely. Then you can make it look like a 'real' KNIME node, but it still executes R code in the background. All our nodes work with an R server, which can be local or remote (easy to set up; have a look at the Rserve package for R).

If you'd like to know more about this stuff or have questions, just ask.

Ciao, Antje

Hey,

this is _exactly_ what I was looking for. Thanks a lot! :) You see, we have a bunch of R scripts which implement an existing workflow, but we now have to implement them in KNIME for the non-programmer staff to use.

Only one problem: We will have to generate a few separate nodes, one for each of a few categories. For example, we'll need a node "compute models" that can compute a model from some input data in many different ways. Each of these ways will be a <rgg></rgg> section, I guess. But the "compute models"-node should only display templates from a category "compute models".

Then we'll have another node, "visualize models", which has a few templates for plotting and showing estimated models and input data. And maybe another node for "prediction", which predicts response variables with some input data and model specification.

I think each of these nodes should be a separate version of your "R snippet", with each using a different templates.txt file consisting of a few different ways (<rgg>-tags) to execute the node (and a dialog for fine-tuning). You said it's possible to "make use of the possibility to create a node which takes such a template and integrates it completely". How would I begin to do this?

This is quite hard to explain, I hope you'll understand this...

Again, many thanks for your answer =)

Dear both,

First of all: Wow, I wasn't aware the scripting extensions had such power! In line with Alex though, I'd like to understand the necessary steps for "full integration". Also, do you foresee supporting KNIME's "default" (local) way of handling R scripts at some point?

Especially for self-service end users without local admin rights this would be really useful. I'm aware that I *could* set up R and Rserve in parallel to KNIME anyway, but I'll certainly enrage our IT department less by maintaining only my (approved) KNIME distribution with its "carry-on R".

Thanks,
E

Hi there,

I'm happy that you like our approach :-)

Unfortunately, I'm very busy until the end of the week, and it might take a bit of time to explain what is possible and what is not. I'll answer your questions as soon as I can at the beginning of next week. I hope that's fine for everybody :-)

Ciao,

Antje

That's very fine, just take your time.
And while you work in sweat and grime
to earn your necessary dime,
I'll make it my prime to learn more KNIME.

Er... what he said! :-)

M

Hi Alex,

so far it's not possible to split R snippet nodes into different categories. There are two possible ways:

1. If the input and parameters do not change for all snippets of one category (e.g. the 'compute models' category), I would propose creating a node which integrates an R script that can call all the different methods depending on a user choice (like a combo box which provides the different method names). Here you would have to create a Java node which uses abstract classes from our scripting core (it should be enough if you use our jar file).

2. The other solution would be to put all scripts into one (or several) template files and to assign proper categories. Then the user will see a tree view with all categories, where he can choose 'compute models', for example, and will then find all the different scripts. I don't think that this is more difficult than choosing different nodes, no? Here, you would only have to provide the template file(s).
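To make option 1 a bit more concrete, here is a minimal sketch in plain Java of the "user choice selects the method" part only. It does not use the scripting core or any KNIME API; the class name `ModelScriptBuilder`, the method names "linear" and "logistic", and the R variable `input` are all made up for illustration. A real node would take the choice from its dialog and hand the resulting script to the R server.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: maps a user-visible method name to a fixed R snippet. */
final class ModelScriptBuilder {

    // Names a combo box could offer; the R calls and the variable "input" are assumptions.
    private static final Map<String, String> METHODS = new LinkedHashMap<>();
    static {
        METHODS.put("linear", "model <- lm(%s, data = input)");
        METHODS.put("logistic", "model <- glm(%s, data = input, family = binomial)");
    }

    /** Returns the method names in the order they were registered. */
    static String[] methodNames() {
        return METHODS.keySet().toArray(new String[0]);
    }

    /** Builds the R script for the chosen method; the user never edits the R code itself. */
    static String buildScript(String method, String formula) {
        String template = METHODS.get(method);
        if (template == null) {
            throw new IllegalArgumentException("Unknown method: " + method);
        }
        return String.format(template, formula);
    }

    public static void main(String[] args) {
        for (String name : methodNames()) {
            System.out.println(name + " -> " + buildScript(name, "y ~ x"));
        }
    }
}
```

The point of the sketch is only that the R code stays fixed inside the node and the dialog exposes nothing but the method name and its parameters.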

Let me know what you prefer.

Antje

Hi Ergonomist,

Also, do you foresee supporting KNIME's "default" (local) way of handling R scripts at some point?

Especially for self-service end users without local admin rights this would be really useful. I'm aware that I *could* set up R and Rserve in parallel to KNIME anyway, but I'll certainly enrage our IT department less by maintaining only my (approved) KNIME distribution with its "carry-on R".

Hmm, I see what you mean... There are no plans so far to run our nodes without a server behind them. We will think about it, but at the moment our resources are not enough to integrate R fully into KNIME.

(If you could get an R installation on your machine, that would be enough. You could then run a local R server instance without any help from the IT department.)
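If it is unclear whether a local R server is actually up, a quick check like the following sketch can help. It only tests whether something accepts TCP connections on a port; it does not verify that the listener is really Rserve. The class name `RserveProbe` is made up, and 6311 is assumed here because it is Rserve's default port; adjust it if your server is configured differently.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical helper: checks whether anything is listening on a host and port. */
final class RserveProbe {

    // Assumption: Rserve runs with its default port.
    static final int DEFAULT_RSERVE_PORT = 6311;

    /** Returns true if a TCP connection can be opened within the timeout. */
    static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, timed out, or host not reachable.
            return false;
        }
    }

    public static void main(String[] args) {
        boolean up = isListening("localhost", DEFAULT_RSERVE_PORT, 1000);
        System.out.println("Something listening on port " + DEFAULT_RSERVE_PORT + ": " + up);
    }
}
```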

Ciao, Antje

Hey Antje,

so you're saying I will create only one node with a giant template file, which can compute a model, or visualize a model, or even write stuff in a database, right? And I will determine the node's task by choosing which script of that _one_ template file to run?

If I set up a workflow that fetches data from a database, then computes a model, and then visualizes that model, I would basically build a workflow of three consecutive instances of this same node, right? But the first one fetches data from the DB, the second one computes something, and the third one visualizes it?

I'd just like to make sure I understand what you are saying.

Thanks again :)

Hi Alex,

yes, that's correct (except that you would need one snippet node for fetching the data, another one for calculating the model, and a view node for the visualization).

Hi Antje,

Thanks for the reply. So I'll try to get something like this set up (somehow I failed last time I tried it on the company server, unlike on my private PC), and other than that patiently wait until you come out of your resource crunch. :-)

Cheers
E