(Notice that the numbers in the second col are all in the range of an INT or INT4)
b) Make a Workflow:
CSV Reader -> Column Rename -> GroupBy
c) in Column Rename, convert the ANUMBER column either to LONG or DOUBLE, check that the output corresponds to the selected type (it does).
d) In GroupBy, group by column CATEGORY in tab Groups. Change to tab Options and add column ANUMBER, then select function SUM from the dropdown in the function field.
Run the workflow or node. Node returns the following error:
WARN GroupBy Skipped group(s): ANUMBER groups: "catA" cause: "Sum > maximum int value. Convert column to long."
The problem is that SUM converts the summands back to INTEGER. The other aggregation functions seem to work.
Weirdness: This happens ONLY when the numbers contained in the CSV are within the range of an INT and the column is interpreted by the CSV Reader node as INT. If you add the following row to the file
catB;3000000000
then the second column will be interpreted as Double and the GroupBy suddenly has no problem summing what is there, and correctly choses Double as output format.
the problem is that the Column Rename node does not return a proper LongCell but a non native type that implements the Long as well as the Int interface. That is why the GroupBy node still thinks that the column is an IntCell and tries to return an IntCell.
A possible workaround is to convert the Int column to a Long column using the Java Snippet (simple) node. Select in the dialog "Replace column : ANUMBER" and Long as return Type. In the Method body write:
return (long)$ANUMBER$;
The java snippet node will than replace the Int column with name ANUMBER to a Long column.
Bye,
Tobias
Hi,
the problem is that the Column Rename node does not return a proper LongCell but a non-native type that implements the Long as well as the Int interface. That is why the GroupBy node still thinks that the column is an IntCell and tries to return an IntCell.
A possible workaround is to convert the Int column to a Long column using the Java Snippet (simple) node. Select in the dialog "Replace column: ANUMBER" and Long as return Type. In the Method body write:
return (long)$ANUMBER$;
The java snippet node will than replace the Int column with name ANUMBER to a Long column.
This is a common problem with the column replace node. I have had (and others) had the same problem downstream in a workflow. Why can this node not be altered to convert columns properly into the type the user selects. If its not possible, can we have a typecast node instead of needing to rely on java snippets.