Different outcome when running R within KNIME

Dear all,

I have just started using KNIME and have run into a problem executing R scripts within a workflow. After extraction, I send a data table to an "R Learner" node, in which I perform feature selection with the R package caret. Unfortunately, the script (below) fails during execution with: Error in { : task 1 failed - "undefined columns selected".

However, if I save the node's input table as CSV and import it into R, the same code runs without errors. I checked the R library path loaded within KNIME, and it corresponds to the one loaded in native R. I tried running the code on a single processor, without success. I also tried running the same code (with small adaptations to the input/output objects) in the R Snippet node, both the community version (with a local server) and the standard implementation, and got the same error. On the other hand, other R scripts run without problems within KNIME.
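One difference the CSV round trip can introduce, and which might be worth ruling out, is that read.csv() rewrites column names into syntactically valid R names by default (check.names = TRUE), while a table handed over directly keeps its original names. The sketch below only illustrates that comparison in base R. The data frame from_knime and its column names are hypothetical stand-ins for knime.in, and whether this is the actual cause of the error is an assumption, not something confirmed here.

```r
# Hypothetical stand-in for knime.in: names with spaces and symbols,
# kept as-is because check.names = FALSE.
from_knime <- data.frame(
  "database_id"  = c("a", "b", "c"),
  "LD50 (mg/kg)" = c(1.2, 3.4, 5.6),
  "Mol Weight"   = c(100, 200, 300),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Round trip through CSV, as when the table is exported and read in native R.
csv_path <- tempfile(fileext = ".csv")
write.csv(from_knime, csv_path, row.names = FALSE)
from_csv <- read.csv(csv_path)  # check.names = TRUE by default

# List the columns whose names were rewritten by read.csv().
renamed <- colnames(from_knime) != colnames(from_csv)
print(data.frame(
  original = colnames(from_knime)[renamed],
  from_csv = colnames(from_csv)[renamed]
))

# Column classes can also be compared side by side.
print(sapply(from_knime, class))
print(sapply(from_csv, class))
```

If the names do differ, applying make.names() to colnames(knime.in) at the top of the script would be one way to test whether the two environments then behave the same.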

I was wondering why the same R code gives different outcomes when executed within knime and in the native R environment.

Any help is really appreciated.

Thanks,

Luigi

 

-----------------------------------------------------

##SETTINGS
ncpu=2
niters=2
numCVs=2

##EXECUTE
library(caret)
library(doMC)

registerDoMC(ncpu)

#Import table
tableIn=knime.in

tableIn<-tableIn[complete.cases(tableIn),]
yId=grep('^LD',colnames(tableIn))
y=tableIn[,yId]
x=tableIn[,-c(1,yId)]

rownames(x)<-tableIn[,grep('^database_',colnames(tableIn))]

## REMOVE ZERO VARIANCE VARIABLES
xzv<-x[,!nearZeroVar(x, saveMetrics= TRUE)$zeroVar]

## REMOVE HIGHLY CORRELATED VARIABLES
correlationMatrix <- cor(xzv)
highlyCorrelated <- findCorrelation(correlationMatrix, cutoff=0.85)

xzvhc<-xzv[,-highlyCorrelated]

dim(xzv)

Ctrl<-rfeControl(functions = caretFuncs, rerank = FALSE,method = "repeatedcv",
      saveDetails = FALSE,number = niters ,repeats = numCVs,verbose = TRUE,
      returnResamp = "final",p = .75,index = NULL,
      timingSamps = 0,seeds = NULL,allowParallel = TRUE)

nvar=0
i=0

## N SET VARIABLES TO EVALUATE
while(nvar < dim(xzv)[2]) {
	i=i+1; nvar = 2^(i+1)
	}

if (i>6) {i=6}
sizevar=2^(2:i)

rfefit <- rfe(xzvhc,y, metric="RMSE", maximize=FALSE, method = "pls", 
              rfeControl = Ctrl,sizes=sizevar)

knime.model <- rfefit$fit
