Performing Log Transform on multiple columns

Hi,

I have a large number of numerical columns which have a large skew. Before performing an analysis, I wish to transform these skewed columns via a log transform.

Here is an equivalent R code for what I'm looking for:

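library(e1071)  # assumed source of skewness(); the moments package also provides one
# Skip the first and last columns, as in the original range
for (nm in names(train)[2:(length(names(train)) - 1)]) {
  sk <- abs(skewness(train[, nm]))
  minval <- min(train[, nm])  # was undefined in the original snippet
  # log1p needs non-negative input; only transform heavily skewed columns
  if (sk > 5 && minval >= 0) {
    train[, nm] <- log1p(train[, nm])
  }
}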

In KNIME, I can identify all the skewed columns with the Statistics node. However, I only seem to be able to apply a math function one column at a time, using the Math Formula node inside a column list loop (with the column regex rename trick).

For large numbers of columns (anything over 20 really) this takes a long time - several minutes up to hours when you get to hundreds of columns. The R code takes seconds.

Is there a quicker, more efficient way to do this in KNIME?

Hi,

Just guessing here, but Unpivot / Log / Pivot might get it done - or one of the new GroupBy aggregation functions perhaps? "Sum of logs" sounds promising - the "RowID" node will provide a primary key if required.
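
To make the unpivot / log / pivot idea concrete, here is a rough sketch in base R, chosen to match the R code earlier in the thread. It only illustrates the data flow and is not a KNIME node configuration. The table `wide`, its column names, and the `id` key are all hypothetical.

```r
# Hypothetical toy table: a key column plus two numeric columns.
wide <- data.frame(id = 1:3, a = c(0, 1, 9), b = c(2, 5, 100))
value_cols <- setdiff(names(wide), "id")

# Unpivot: one row per (id, column name, value) triple.
# unlist() stacks the columns one after another, so the ordering
# matches rep(value_cols, each = nrow(wide)).
long <- data.frame(
  id = rep(wide$id, times = length(value_cols)),
  column = rep(value_cols, each = nrow(wide)),
  value = unlist(wide[value_cols], use.names = FALSE)
)

# Log: a single vectorised call covers every original column.
long$value <- log1p(long$value)

# Pivot: spread the values back into one column per original name.
# This relies on the row order within each column being unchanged.
wide_log <- data.frame(id = wide$id)
for (nm in value_cols) {
  wide_log[[nm]] <- long$value[long$column == nm]
}
```

The point of the long format is that the transformation runs once over a single value column instead of once per column. To restrict it to skewed columns, only those columns would be unpivoted in the first place.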

Cheers
E

I'm looking for the same thing. There must be a way to apply any simple transformation across many columns.  

I missed that the original question already used a column list loop. That's the answer for simply transforming variables one at a time. It's slow, as he stated, but there is a good example of it on the KNIME Examples Server, called 03_Looping_over_all_columns_and_manipulation_of_each.
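
For comparison with the column-by-column loop, the selection and the transformation can each be written as one step over all candidate columns. The following is a minimal sketch in base R under some assumptions: the sample table is made up, and `skew` is a simple moment-based helper defined here so the snippet has no package dependency. Its values may differ slightly from a package `skewness()` function.

```r
# Hypothetical sample data; the first and last columns are skipped,
# mirroring the range used in the loop from the original question.
train <- data.frame(
  id = 1:100,
  heavy = c(rep(1, 99), 1000),          # strongly skewed, non-negative
  mixed = c(rep(-1, 99), 1000),         # strongly skewed, has negatives
  mild = seq(-1, 1, length.out = 100),  # symmetric
  target = rep(0:1, 50)
)

# Assumed helper: moment-based sample skewness.
skew <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}

candidates <- names(train)[2:(ncol(train) - 1)]

# Compute both criteria for every candidate column in one pass each.
sk <- vapply(train[candidates], function(x) abs(skew(x)), numeric(1))
mn <- vapply(train[candidates], min, numeric(1))
to_log <- candidates[sk > 5 & mn >= 0]

# Apply log1p to all selected columns in a single assignment.
train[to_log] <- lapply(train[to_log], log1p)
```

With this sample data only `heavy` meets both conditions. `mixed` is skewed but contains negative values, and `mild` is not skewed, so both are left unchanged.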