Over the last few months I have collected a few entries about installing R alongside KNIME, with hints on how to overcome some of the obstacles you may hit along the way. I want to bring them together here. (If you are looking for a basic guide on how to install R in the first place, you might look here or here.)
A few remarks upfront. Yes, it takes some effort to get KNIME and R up and running, but in the end you are rewarded with the whole world of possibilities R has to offer. You will be able to solve tasks that were not easy to solve before, while having the power of the workflow concept of one of the leading data science platforms at your fingertips.
OK, first: there are two ‘flavours’ of R you can install. On Windows you can use the integrated binaries (KNIME then manages all of R), or you can get yourself a standalone version of R (which I strongly recommend, in the 64-bit version wherever possible). Alongside that I recommend installing RStudio, which makes it easy to manage R and is useful in itself. To work with KNIME, R needs several packages, most prominently Rserve (we come to that in a minute) and Cairo. You will be able to, and will have to, install further libraries as your tasks evolve.
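To see where you stand, here is a minimal sketch that checks whether the two packages named above are already in your R library. The helper name `missing_packages` is my own, not something KNIME or R provides, and the install line is commented out on purpose so that nothing is downloaded by accident.

```r
# Packages the KNIME R integration relies on, as named in the text.
required <- c("Rserve", "Cairo")

# Return the subset of `pkgs` that is not present in the R library.
# `missing_packages` is a hypothetical helper written for this post.
missing_packages <- function(pkgs, installed = rownames(installed.packages())) {
  setdiff(pkgs, installed)
}

to_install <- missing_packages(required)
if (length(to_install) > 0) {
  message("Still missing: ", paste(to_install, collapse = ", "))
  # Uncomment to install from your configured CRAN mirror:
  # install.packages(to_install)
} else {
  message("Rserve and Cairo are both installed.")
}
```

The same check works for any further libraries you need later: just extend the `required` vector.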
Once you have R up and running you will need Rserve so that KNIME nodes can communicate with it. Again there are two flavours of nodes: the R nodes managed by KNIME itself, and the R Scripting nodes from the community. The difference is that the KNIME-managed nodes start Rserve by themselves, while the community nodes need Rserve to be started manually from within R, and it has to stay up and running while you use the nodes. That can be confusing, so maybe stick with KNIME’s own nodes for now.
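If you do want to try the community nodes, starting Rserve by hand could look roughly like the sketch below. The wrapper `start_rserve` is a name I made up for illustration; the actual work is done by the `Rserve()` function from the Rserve package, and I am assuming the default port is fine for your setup.

```r
# Start a local Rserve instance by hand, as the community nodes expect.
# `start_rserve` is a hypothetical wrapper; it only reports and returns
# FALSE if the Rserve package is not installed.
start_rserve <- function() {
  if (!requireNamespace("Rserve", quietly = TRUE)) {
    message("Rserve is not installed; install it first.")
    return(invisible(FALSE))
  }
  # "--no-save" keeps the server session from saving a workspace on exit.
  Rserve::Rserve(args = "--no-save")
  invisible(TRUE)
}

# Call this once before running the community R nodes:
# start_rserve()
```

The call is left commented out so that pasting the snippet does not launch a server you did not ask for.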
About the Rserve library itself …
There have been some problems with it, and it is highly recommended that you use the latest version, which is 1.8-6 as of now (Jan 2019).
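To find out which version you currently have, something like the following should do. The helper `is_recent_enough` is hypothetical and simply compares version strings; the minimum of 1.8-6 is the one recommended in this post, not an official requirement I can vouch for.

```r
# Compare a version string against the minimum recommended in the text.
# package_version() accepts both the "1.8.6" and the "1.8-6" spelling.
is_recent_enough <- function(version, minimum = "1.8-6") {
  package_version(version) >= package_version(minimum)
}

if (requireNamespace("Rserve", quietly = TRUE)) {
  installed <- as.character(packageVersion("Rserve"))
  message("Rserve ", installed, " recent enough: ", is_recent_enough(installed))
} else {
  message("Rserve is not installed yet.")
}
```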
In theory this should be as easy as running this line in your R or RStudio.
or if you have the file in a local folder something like:
install.packages('~/Downloads/Rserve_1.8-6.tar', repos = NULL, type="source")
If that is not working you can try to install it manually by downloading the package - or trying to compile it yourself as we have discussed here:
I also have an entry about being able to compile packages on Mac:
Compiling the Rserve package on macOS seems to be particularly tricky and to depend very much on the exact version of macOS you use. Several steps have helped in the past.
- install R on MacOS (https://cran.r-project.org/bin/macosx/)
- install RStudio (https://www.rstudio.com/products/rstudio/download/)
- install XQuartz (https://www.xquartz.org) - not 100% sure if you need it for everything
- install Clang 6.0.0 for OS X 10.11 and higher, (https://cran.r-project.org/bin/macosx/tools/)
- install GNU Fortran 6.1 for OS X 10.11 and higher (https://cran.r-project.org/bin/macosx/tools/)
- run this installer if the compilation still does not work (cf. this discussion)
sudo installer -pkg /Library/Developer/CommandLineTools/Packages/macOS_SDK_headers_for_macOS_10.14.pkg -target /
- check out these hints for your specific version of MacOS (https://cran.r-project.org/doc/manuals/R-admin.html#macOS)
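After working through the steps above, a quick look from within R can tell you which build tools are visible on your PATH. This is only a rough sketch: the tool names are my assumption of what a source build typically calls, and a compiler installed outside the PATH (and configured via Makevars instead) would show up as missing here even though R can use it.

```r
# Look up the build tools that the steps above are meant to provide.
# Sys.which() returns "" for any tool that is not found on the PATH.
tools <- Sys.which(c("clang", "gfortran", "make"))
print(tools)

missing_tools <- names(tools)[tools == ""]
if (length(missing_tools) > 0) {
  message("Not found on PATH: ", paste(missing_tools, collapse = ", "))
} else {
  message("All three build tools were found.")
}
```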
Please be aware that these hints might not work on every system or under all circumstances. In my experience these settings tend to change a lot, so please be patient and keep working on the setup.
Being able to compile R libraries also comes in handy if you want to stay on top of the latest developments, or if you need academic packages that are not hosted in binary form on some server. Of course new libraries can also bring challenges with stability and bugs, as is often the case with advanced (free) software.
Yes, there is some work to do, but you will be better off once you have invested the effort. I hope you enjoy KNIME and R, and if you have questions there is always the KNIME forum and the good people from KNIME there to help.
Since we are at it I want to share some more quirks about R that I have come across over the months. They may or may not be relevant for your problems.
On Windows, Rtools (needed for compiling packages) does not seem to like very long and complex paths. So stick to something like c:\rtools instead of something fancy.
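As a rough illustration of what I mean by a risky path, here is a small check. The function `path_looks_risky` and the 40-character limit are both made up for this post; Rtools does not document such a limit, so treat it as a rule of thumb only.

```r
# Flag install paths that contain spaces or are rather long.
# The 40-character limit is an arbitrary illustration, not an Rtools rule.
path_looks_risky <- function(path, max_chars = 40) {
  grepl(" ", path, fixed = TRUE) || nchar(path) > max_chars
}

path_looks_risky("c:\\rtools")                              # FALSE
path_looks_risky("C:\\Program Files\\Tools for R\\rtools")  # TRUE
```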
Especially on Windows, Rserve seems to be picky when other instances of it are running in parallel. So if, for example, you have a Windows server and several people use the same installation of R, you might run into problems (I think there is a mention of that somewhere on the Rserve page). You might try giving everyone a separate installation or a different port. You also do not want to start too many instances of R at the same time.
For further problems with compiling Rserve, please see this entry about command line compilation with Rtools on Windows.
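If you go the route of different ports, a sketch for manually started instances could look like this. The helper `port_for_user` and the idea of numbering users from 0 are my own invention; 6311 is the Rserve default port, and `--RS-port` is the Rserve command line option for choosing another one. Whatever port you pick also has to be entered on the KNIME side, which I do not show here.

```r
# Give each user a distinct port, counting up from the Rserve default.
# `port_for_user` is a hypothetical helper; user_index starts at 0.
port_for_user <- function(user_index, base_port = 6311) {
  stopifnot(user_index >= 0)
  base_port + user_index
}

my_port <- port_for_user(1)
rserve_args <- paste("--RS-port", my_port, "--no-save")
message("Would start Rserve with: ", rserve_args)

# Uncomment to actually start the server on that port:
# Rserve::Rserve(args = rserve_args)
```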