Dear all,
I have just started using KNIME and have run into a problem executing R scripts within a workflow. After extraction, I send a data table to an "R Learner" node, in which I perform feature selection with the R package caret. Unfortunately, the script (below) fails during execution with: Error in { : task 1 failed - "undefined columns selected".
If I save the input table of the KNIME node as CSV and import it into R, the same code runs without errors. I checked the R library path loaded within KNIME, and it corresponds to the one loaded in native R. I tried executing the code on a single processor, without success. I also tried the same code (with small adaptations to the input/output objects) in both the community R snippet (with a local server) and the standard implementation, and got the same error. Other R scripts, on the other hand, run without problems within KNIME.
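To make the CSV comparison more concrete, here is a minimal sketch of how the table seen inside the node could be compared with the one re-imported from CSV. The helper `compare_tables` and the toy data frames are hypothetical and only for illustration. In practice the first table would be `knime.in` (for example, written out with `saveRDS()` from inside the node) and the second the result of `read.csv()`. The sketch relies on the fact that `read.csv()` by default rewrites column names into syntactically valid ones and may infer different column types, so the two tables are not necessarily identical. Whether that explains the error here is only a guess.

```r
# Hypothetical helper (not part of KNIME or caret): report how two data
# frames differ in column names and in the class of their shared columns.
compare_tables <- function(a, b) {
  classes_a <- vapply(a, function(col) class(col)[1], character(1))
  classes_b <- vapply(b, function(col) class(col)[1], character(1))
  shared <- intersect(names(a), names(b))
  list(
    only_in_a      = setdiff(names(a), names(b)),
    only_in_b      = setdiff(names(b), names(a)),
    class_mismatch = shared[classes_a[shared] != classes_b[shared]]
  )
}

# Toy stand-ins (assumed data, not the real table):
# `from_knime` mimics a table whose column name contains a space,
# `from_csv` mimics the same table after a CSV round trip via read.csv().
from_knime <- data.frame(`LD 50` = c(1.2, 3.4), count = c(1L, 2L),
                         check.names = FALSE)
from_csv   <- data.frame(LD.50 = c(1.2, 3.4), count = c(1, 2))

result <- compare_tables(from_knime, from_csv)
print(result)
```

Running `.libPaths()` and `sessionInfo()` in both environments would complement this by showing whether the same package versions are actually loaded.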
I am wondering why the same R code gives different outcomes when executed within KNIME and in the native R environment.
Any help is really appreciated.
Thanks,
Luigi
-----------------------------------------------------
##SETTINGS
ncpu=2
niters=2
numCVs=2

##EXECUTE
library(caret)
library(doMC)
registerDoMC(ncpu)

#Import table
tableIn=knime.in
tableIn<-tableIn[complete.cases(tableIn),]
yId=grep('^LD',colnames(tableIn))
y=tableIn[,yId]
x=tableIn[,-c(1,yId)]
rownames(x)<-tableIn[,grep('^database_',colnames(tableIn))]

## REMOVE ZERO VARIANCE VARIABLES
xzv<-x[,!nearZeroVar(x, saveMetrics= TRUE)$zeroVar]

## REMOVE HIGHLY CORRELATED VARIABLES
correlationMatrix <- cor(xzv)
highlyCorrelated <- findCorrelation(correlationMatrix, cutoff=0.85)
xzvhc<-xzv[,-highlyCorrelated]
dim(xzv)

Ctrl<-rfeControl(functions = caretFuncs, rerank = FALSE,method = "repeatedcv", saveDetails = FALSE,number = niters ,repeats = numCVs,verbose = TRUE, returnResamp = "final",p = .75,index = NULL, timingSamps = 0,seeds = NULL,allowParallel = TRUE)

nvar=0
i=0
## N SET VARIABLES TO EVALUATE
while(nvar < dim(xzv)[2]) { i=i+1; nvar = 2^(i+1) }
if (i>6) {i=6}
sizevar=2^(2:i)

rfefit <- rfe(xzvhc,y, metric="RMSE", maximize=FALSE, method = "pls", rfeControl = Ctrl,sizes=sizevar)
knime.model <- rfefit$fit