GroupBy node - aggregate by type option does not follow the settings

genericknimenodes
#1

Hi guys,

I think I found a potential bug in the GroupBy node. As you can see in the attached workflow, I group by a certain string column and then aggregate by type using the following settings:

  • string type: concatenate
  • integer type: sum
  • double type: min, max, mean and SD

Nevertheless, the integer column in the input table, apart from being aggregated using sum, is also aggregated using min, max, mean and SD, methods that in theory are reserved for columns of double type.

Do you know why this happens?

Thanks in advance,

Gio

groupby_node_type-based_aggregation_problem.knwf (14.3 KB)

#2

When you select “double” it includes all numbers, including ints; this is by design.

From the GroupBy help text:
"The “Type Based Aggregation” tab allows to select an aggregation method for all columns that are compatible with the selected data type. For example to apply an operation on all numeric columns simply select DoubleCell. This will include all numeric cells that are compatible with DoubleCell such as IntCell and LongCell. "

Edit: I think many people are surprised by this behavior. Feature request: instead of writing “Number (double)” in the Data Types field, write “Number (double, long, int)”
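
To make the compatibility rule from the help text concrete, here is a rough Python sketch of the matching logic. It is not KNIME's actual code: the type names, the `COMPATIBLE_WITH` table and `methods_for_column` are made up for illustration, and I am assuming int is readable as long and double, and long as double, as the help text describes.

```python
# Hypothetical model of type-based aggregation (not the real KNIME API).
# Each column type lists the types it can be read as.
COMPATIBLE_WITH = {
    "string": {"string"},
    "int": {"int", "long", "double"},  # assumption: int is also a valid long/double
    "long": {"long", "double"},        # assumption: long is also a valid double
    "double": {"double"},
}

# Settings from the first post: selected data type -> aggregation methods.
TYPE_SETTINGS = [
    ("string", ["concatenate"]),
    ("int", ["sum"]),
    ("double", ["min", "max", "mean", "sd"]),
]


def methods_for_column(column_type):
    """Collect the methods of every selected type the column is compatible with."""
    methods = []
    for selected_type, aggregations in TYPE_SETTINGS:
        if selected_type in COMPATIBLE_WITH[column_type]:
            methods.extend(aggregations)
    return methods


print(methods_for_column("int"))     # ['sum', 'min', 'max', 'mean', 'sd']
print(methods_for_column("double"))  # ['min', 'max', 'mean', 'sd']
print(methods_for_column("string"))  # ['concatenate']
```

The int column matches both the "int" entry and the "double" entry, which is why it picks up sum plus the four double methods in the workflow above.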

#3

Hi Aswin,
Oops… sorry, I didn’t read that part. Thank you for the explanation!
Anyway, yes: writing “Number (double, long, int)” instead of “Number (double)”, as you suggested, would be more exact and less prone to misunderstanding.
Best

#4

Hi there,

That would be a reasonable proposal, but another (numeric) data type that is compatible with double could be developed at any time (because KNIME Analytics Platform is open source software :open_mouth: ), so the label would no longer be accurate and the approach can’t work in general :smiley:

Anyway, some solution for this would be welcome, as this is a recurring question… :slight_smile:

Br,
Ivan
