I have a script in R that does a few data manipulation steps and runs some tests on the dataset. I’m trying to integrate it into my KNIME workflow. I tried using the R Snippet node and the Table to R node.
#convert all character columns to numeric
chars = sapply(knime.in, is.character)
knime.in[ , chars] = as.data.frame(apply(knime.in[ , chars], 2, as.numeric))
#convert from numeric to nominal comparison
most = max(knime.in)
multiple = c(2:most)
for (num in multiple) {
  knime.in[knime.in == num] = 1
}
rm(chars,most,multiple,num)
knime.out <- knime.in
I have been getting the following error on the “most=max(knime.in)” line: “Error: only defined on a data frame with all numeric-alike variables.” This indicates to me that the sapply() function may not be working properly, perhaps? Both R Snippet and Table to R nodes are giving me the same error. I’m not too sure of the difference between the R nodes. Any help is appreciated. I am aware there are some KNIME video courses; if you are able to refer me to the right ones, that would also be much appreciated. Thank you.
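One way to see what max() is objecting to is to look at the class of every column. Below is a minimal sketch, assuming a made-up stand-in data frame `df` (hypothetical, used in place of knime.in) in which one text column is a factor rather than character. Whether KNIME actually delivers columns this way in this workflow is an assumption that would need checking on the real table.

```r
# Hypothetical stand-in for knime.in; column names and values are made up.
df <- data.frame(
  a = c("1", "2", "3"),          # character column
  b = factor(c("1", "3", "2")),  # factor column: text, but not character
  c = c(1, 2, 3),                # numeric column
  stringsAsFactors = FALSE
)

# Class of each column; a factor is neither character nor numeric.
col_classes <- sapply(df, class)
print(col_classes)

# Columns that are not numeric and that an is.character() test would skip.
is_num <- sapply(df, is.numeric)
is_chr <- sapply(df, is.character)
missed <- names(df)[!is_num & !is_chr]
print(missed)
```

If `missed` is not empty on the real table, those columns would survive the is.character() conversion unchanged and could still trip up max().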
Hi @ssheriff
I’d be more than happy to help you if you can provide at least a sample of your dataset.
But just from looking at the code, I think this is purely an R problem. I’ve noticed that the line most = max(knime.in) isn’t referring to an individual column but to the entire data frame. You should only call functions like max() or min() on a whole data frame if all of its columns are numeric. Take a look at this test I’ve made so we can reproduce the same error:
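A minimal sketch of how this kind of error can arise in plain R, assuming a made-up data frame `df` with one text column (hypothetical data, not the table from the workflow):

```r
# Made-up data frame with one non-numeric column (an assumption for illustration).
df <- data.frame(
  x = c(1, 2, 3),
  y = c("a", "b", "c"),
  stringsAsFactors = FALSE
)

# max() on the whole data frame fails because column y is not numeric.
# tryCatch() captures the error message instead of stopping the script.
err <- tryCatch(max(df), error = function(e) conditionMessage(e))
print(err)

# max() on a single numeric column works as expected.
col_max <- max(df$x)
print(col_max)
```

The exact wording of the message can differ between R versions, but the cause is the same: at least one column in the data frame is not numeric.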
Hello, thank you for your reply! All of the code worked in R; it was only when I copied and pasted it in KNIME and replaced my df with knime.in that I was getting this error.
The character-to-numeric conversion at the top of the script was sufficient in R, but I’m not sure if it’s doing the same thing when executed in KNIME. My df had two chr columns that this code converted to numerics. Since my df should be the same as knime.in, I’m not sure why knime.in would have any columns that are neither character nor numeric…
I’ll try narrowing the scope of max() to sets of columns to see if that helps.
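Narrowing max() to the numeric columns could look roughly like this. It is only a sketch, assuming a made-up data frame `df` (hypothetical, standing in for knime.in) that still has one non-numeric column left over:

```r
# Hypothetical stand-in for knime.in; values are made up.
df <- data.frame(
  a = c(1, 4, 2),
  b = c(3, 1, 5),
  label = c("x", "y", "z"),
  stringsAsFactors = FALSE
)

# Logical index marking which columns are numeric.
num_cols <- sapply(df, is.numeric)

# max() over the numeric columns only. drop = FALSE keeps a data frame
# even if just one column is selected; na.rm = TRUE skips missing values.
most <- max(df[, num_cols, drop = FALSE], na.rm = TRUE)
print(most)
```

This avoids the error but does not explain it, so it is still worth checking which columns are being left out.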
Yes, you are right. I’ve reproduced the code in RStudio with a data frame containing a couple of string columns, and it works perfectly fine.
I’m still trying a couple of things inside KNIME, but it doesn’t work. It’s weird, since it is just a simple conversion. I’ll let you know if I find something useful.
Actually, another question: how/why does your script use kIn and rOut instead of knime.in and knime.out? I was wondering if the periods might mess up code in the future.