R in Knime: Column Name change

Hi Knimers!

After I select some columns in my script, the column names change to knime.in.XXXXX.

Why is that, and how can I fix it?

Thanks again!

 

 

Fabio,

why don't you post an excerpt of your R script, at least the part that seems to be causing the issue, so you make it easier for everyone to help you? Otherwise you get no answers or, best case scenario, some generic advice.

Cheers,
Marco.

Hi Marco!

Here is the script:

knime.out <- data.frame("Age" = ifelse(test = is.na(knime.in$"Age"), yes = mean(knime.in$"Age", na.rm = TRUE), no = knime.in$"Age"), "Salary" = ifelse(test = is.na(knime.in$"Salary"), yes = mean(knime.in$"Salary", na.rm = TRUE), no = knime.in$"Salary"), knime.in$"Country",knime.in$"Purchased")

And below is a picture of my columns.

You know that I have been practicing R in KNIME, and for my script below I'm just writing the code and clicking on the variable. As a result, my columns always have another knime.in$XXXXXXXX in front of the name, as shown in my second picture.

 

 

I appreciate your help

Hi Fabio,

Thanks, that makes it a bit clearer to understand. Do you have a workflow with multiple R snippets, one connected to the other? Do you get the strange "variable" names in the first snippet as well, or only after the first one?

Usually the knime.in and knime.out data frames are just for integration purposes, and their names should not be visible outside of the R snippet, unless something goes wrong in the creation of the output data frame and, for some reason, the names are carried over to the output columns.

It may help if you could share the portion of the workflow where that happens.

Cheers,
Marco.

Hi Marco!

Indeed, I have a workflow with several R snippets.

Please take a look at the attachment. I'm following a simple exercise and trying to reproduce it in two ways, in KNIME and in R.

Thank you for your support

 

Hi Fabio,

can you also provide a sample Data file (data.csv)? Otherwise I have to try to re-invent the data to test your workflow.

Cheers,
Marco.

Hi Marco,

Indeed there was something missing.

You can use the attached file; it's the main source for this exercise.

I appreciate your help.

Hi Fabio,

your problem is pretty simple to spot. If you don't assign a name to a column in your R output data frame, there is no name to pass on to the corresponding column in the KNIME output table, so the default is used instead: the R expression itself, which is something like knime.in$yyyyy. Since the $ symbol is not allowed in a standard R column name, data.frame() converts it to a period.

Considering you have a chain of R snippet nodes all containing that issue, you end up with a long chain of knime.in.something.knime.in.somethingelse etc.
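The chaining effect can be reproduced in plain R, outside KNIME. This is only a minimal sketch: the knime.in data frame here is a made-up stand-in for the table the R Snippet node would provide, and step1/step2 are hypothetical names for the outputs of two consecutive snippets.

```r
# Stand-in for the table KNIME passes to the R Snippet node (assumed data).
knime.in <- data.frame(Age = c(44, 27, NA), Salary = c(72000, NA, 54000))

# Snippet 1: "Age" is passed without a name, so data.frame() derives the
# column name from the expression itself. "Salary" is named explicitly.
step1 <- data.frame(knime.in$"Age", "Salary" = knime.in$"Salary")
print(names(step1))

# Snippet 2: the output of snippet 1 arrives as the new knime.in, and the
# same mistake is repeated on the already-renamed column.
knime.in <- step1
step2 <- data.frame(knime.in$"knime.in.Age", "Salary" = knime.in$"Salary")
print(names(step2))
```

Each unnamed column picks up one more knime.in. prefix per snippet, while the explicitly named Salary column keeps its name throughout.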

Consider this line of code taken from one of your R snippet nodes:

knime.out <- data.frame(knime.in$"Age",knime.in$"Salary",knime.in$"knime.in.Purchased","Country" = factor(knime.in$"knime.in.Country", levels = c('France', 'Spain', 'Germany'), labels = c(1,2,3)))

In the case of Age, Salary etc. you are not assigning a name to the column, hence the column ends up named after the expression used to fill it. Only Country is handled properly with an assigned name.

The fix is simple. You need to assign a name to the column in the output data frame so KNIME can use it to name the column in the output table. For example, the line above should become:

knime.out <- data.frame(
  "Age" = knime.in$"Age",
  "Salary" = knime.in$"Salary",
  "Purchased" = knime.in$"knime.in.Purchased",
  "Country" = factor(knime.in$"knime.in.Country", levels = c('France', 'Spain', 'Germany'), labels = c(1, 2, 3))
)

In this way you end up with 4 columns named Age, Salary, Purchased and Country like you expect.

Cheers,
Marco.

Got it, Marco!

Pretty clear!

The only point from my side is that it will be "complicated" to handle many columns step by step.

In my exercise, for instance, if I'm only using the column "Country" in a function, the other columns should be passed through to the output without any extra coding. But this is the point of view of an analyst (me), not a coder/developer.

Anyway, thanks for your help again!
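
On the many-columns concern above, one possible approach (not something confirmed in this thread) is to copy the whole input table and overwrite only the column being transformed, so the other columns keep their names without being listed one by one. The sketch below uses a made-up stand-in for knime.in, since that object only exists inside the R Snippet node; the sample values are assumptions for illustration.

```r
# Stand-in for the input table (assumed data; inside KNIME the node
# provides knime.in).
knime.in <- data.frame(
  Country = c("France", "Spain", "Germany", "Spain"),
  Age = c(44, 27, 30, 38),
  Salary = c(72000, 48000, 54000, 61000),
  Purchased = c("No", "Yes", "No", "No")
)

# Copy everything first, so all existing column names carry over untouched.
knime.out <- knime.in

# Overwrite only the column that actually needs processing.
knime.out$Country <- factor(knime.in$Country,
                            levels = c("France", "Spain", "Germany"),
                            labels = c(1, 2, 3))

print(names(knime.out))
```

Because no new data frame is built from unnamed expressions, no knime.in. prefix is introduced, however many columns the table has.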
