problem with loops for multiple imputation

Northern · June 11, 2018, 11:27am

Hi,

I am trying to do multiple imputation with the Amelia R package inside a cross validation. There is a problem with the loop and it cannot continue running. Can anyone help me check where the problem is and how to solve it? I've attached the workflow as a Google Drive link.
On the training set I ran a normal imputation 10 times and got one complete data set with imputed values.
On the test set I first filtered out the target column and combined the test set (without the target column) with the training set for imputation. Then I filtered out the imputed test rows and added the target column back to get a complete test data set.
Then I use the imputed training and test data to build a decision tree model using cross validation.
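
For readers who want to see the test-set steps in code, here is a minimal base R sketch. It is not the attached workflow: `impute_stub` is a hypothetical stand-in (simple column-mean imputation) for the Amelia step, `impute_test_with_train` is an invented helper name, and dropping the target column from both sets before stacking them is an assumption.

```r
# Hypothetical stand-in for the imputation step; in the thread this role is
# played by Amelia. Here each numeric NA is replaced by the column mean.
impute_stub <- function(df) {
  for (col in names(df)) {
    if (is.numeric(df[[col]])) {
      missing <- is.na(df[[col]])
      df[[col]][missing] <- mean(df[[col]], na.rm = TRUE)
    }
  }
  df
}

# Impute the test predictors together with the training predictors,
# then return only the test rows with the original target reattached.
impute_test_with_train <- function(train, test, target, impute_fn = impute_stub) {
  predictors <- setdiff(names(train), target)
  # Stack training and test predictors; remember which rows are test rows.
  combined <- rbind(train[, predictors, drop = FALSE],
                    test[, predictors, drop = FALSE])
  is_test <- c(rep(FALSE, nrow(train)), rep(TRUE, nrow(test)))
  imputed <- impute_fn(combined)
  # Keep only the test rows and put the untouched target column back.
  out <- imputed[is_test, , drop = FALSE]
  out[[target]] <- test[[target]]
  rownames(out) <- NULL
  out
}

# Toy data, made up for illustration only.
train <- data.frame(x1 = c(1, 2, NA, 4), x2 = c(10, NA, 30, 40), y = c(0, 1, 0, 1))
test  <- data.frame(x1 = c(NA, 6), x2 = c(50, NA), y = c(1, 0))
test_imputed <- impute_test_with_train(train, test, target = "y")
print(test_imputed)
```

The target values of the test rows never enter the imputation step, so they cannot influence the imputed predictors.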

Best Regards,
N.

https://drive.google.com/file/d/11dVofKrqt6G1Dc0157_T3ELVGIDA6oJE/view?usp=sharing

mlauber71 · June 11, 2018, 9:00pm

Since I provided the initial Amelia workflow, I thought I would have a look at this one. From my perspective, the problem has to do with the export of the Amelia CSV files and their import. I put the export and the import into separate nodes and connected them with a flow variable so that they run one after another in the loop, so as not to confuse R. I also got rid of the blanks in the KNIME file name.

Maybe you can take another look. I have not checked the scoring logic or the construction with the mix of training and test files; that should be as you left it. I may try this imputation myself at some point and see if it gets good results.

So please be extra careful with the interpretation of the results. You might set aside a third chunk of the data that has not been used in the whole workflow to see if you like the results.

On another note: you use just a 0/1 scorer. From my perspective it might make sense to use a continuous score between 0.0 and 1.0, but that of course depends entirely on your task.
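
One possible reading of this, sketched in base R: score the predicted probability of the positive class instead of only the 0/1 class label. The vectors `truth` and `prob` are made-up toy values, and the Brier score is just one example of a continuous measure, not something taken from the workflow.

```r
# Toy values for illustration: true classes and predicted probabilities
# of the positive class (for example from a decision tree).
truth <- c(1, 0, 1, 1, 0)
prob  <- c(0.9, 0.4, 0.55, 0.2, 0.1)

# 0/1 scoring: threshold the probability and count exact hits.
pred     <- as.integer(prob >= 0.5)
accuracy <- mean(pred == truth)

# Continuous scoring: the Brier score is the mean squared difference
# between predicted probability and true class (lower is better).
brier <- mean((prob - truth)^2)

print(c(accuracy = accuracy, brier = brier))
```

The 0/1 score treats a prediction of 0.55 the same as one of 0.9, while the continuous score rewards the more confident correct prediction.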

write.amelia(a.out, separate = FALSE, file.stem = "amelia_imp2", impvar = "imp", orig.data = FALSE)

10062018_deal_with_mvs_amelia.knwf (747.0 KB)

Northern · June 12, 2018, 9:52pm

Hi, thanks for the tips. They are really very helpful :) I'll take a look at it tomorrow and see if it gets a good result.

Best Regards,
N.

Northern · June 14, 2018, 6:01am

Hello,

When I use this imputation in a cross validation, I get the error below. Do you know why it happens and how to solve it? Thanks.
ERROR R To R 0:621:739:726:7 Execute failed: R evaluation failed.: "knime.tmp.ret<-NULL;printError<-function(e) message(paste('Error:',conditionMessage(e)));for(exp in tryCatch(parse(text=knime.tmp.script),error=printError)){tryCatch(knime.tmp.ret<-withVisible(eval(exp)),error=printError)
if(!is.null(knime.tmp.ret)) {if(knime.tmp.ret$visible) print(knime.tmp.ret$value)}};rm(knime.tmp.script,exp,printError);knime.tmp.ret$value"

Best Regards,
N.

mlauber71 · June 14, 2018, 10:04am

These things come to my mind:

- I recently experienced problems with R 3.5 and KNIME on a Mac (see "R Problems on macOS (high sierra) since R version 3.5"). So if you use R version 3.5, you might want to try R version 3.4.x instead as a test.
- Do you use R as a standalone program, or do you use the "integrated" R binaries (Windows only)? (cf. "Importing R Code in Knime")
- I have seen problems with KNIME in folders that were managed by Microsoft OneDrive because, depending on the configuration, it might not properly support the special characters that KNIME relies on (namely the hash sign #).
- Considering your other post ("reload problem with Excel Reader(XLS)"), you might want to check whether your file system is still healthy and can handle the load. You could try to install KNIME on a different drive or move the knime-workflow folder to a different drive.
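
To check the first point, a small base R snippet such as the following could be run in the same R installation that KNIME calls. The variable names are made up for illustration, and the 3.5.x range is only the version mentioned above, not a confirmed cause of the error.

```r
# Report which R installation and version is actually being used.
r_version <- getRversion()
message("Running ", R.version.string, " from ", R.home())

# TRUE if this R is in the 3.5.x series mentioned above.
is_r_35 <- r_version >= "3.5.0" && r_version < "3.6.0"
message("R 3.5.x series: ", is_r_35)
```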

system · June 21, 2018, 10:04am

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.