Dear KNIMErs,
Just a quick heads-up that I'm not sure how to categorise: I am processing a large file list (paths, sizes, dates, owners, etc.) generated by FileList. Five of the 650k files are so large that their byte size won't fit in a 32-bit integer, so I passed the sizes to a Java snippet node, which converted them to long.
So far, so good. But when I then ran a GroupBy that sums the byte sizes, I realised that GroupBy converts the results to Double. As Double is easier to read thanks to thousands separators, and as it's generally better supported, I went back and converted the sizes to Double with the standard "String to Number" node instead. But guess what? GroupBy now took ages to complete, if it completed at all, because memory usage was much higher and close to my heap space limit.
How come? Size-wise, both types take 8 bytes per value, but I guess it's an "int vs. float" performance issue? This is on standard x86 hardware under Windows.
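For anyone wondering about the overflow, here is a minimal standalone Java sketch of it. It is not the actual snippet from my workflow: the class name, the helper methods and the 5 GB sample value are all made up for illustration, and nothing in it is KNIME API.

```java
public class FileSizeParsing {

    /** Hypothetical helper: parses a byte-size string into a 64-bit long. */
    static long parseSize(String sizeText) {
        return Long.parseLong(sizeText.trim());
    }

    /** Returns true if the size is too large for a 32-bit int. */
    static boolean exceedsInt(long size) {
        return size > Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        String sizeText = "5000000000"; // made-up file size, roughly 5 GB

        long size = parseSize(sizeText);
        System.out.println(size + " exceeds int range: " + exceedsInt(size));

        // The same text cannot be parsed as an int at all.
        try {
            Integer.parseInt(sizeText);
        } catch (NumberFormatException e) {
            System.out.println("Integer.parseInt rejects " + sizeText);
        }
    }
}
```

Anything above Integer.MAX_VALUE (2,147,483,647 bytes, about 2 GB) needs the long path.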
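To back up the "8 bytes per value" point, here is a small plain-Java check. It only looks at the primitive types and says nothing about how KNIME stores or aggregates its cells internally, which is presumably where the difference comes from. The class name, method names and sample sizes are made up for illustration.

```java
public class LongVsDouble {

    /** Sums the sizes using 64-bit integer arithmetic. */
    static long sumAsLong(long[] sizes) {
        long total = 0L;
        for (long s : sizes) {
            total += s;
        }
        return total;
    }

    /** Sums the same sizes using 64-bit floating-point arithmetic. */
    static double sumAsDouble(long[] sizes) {
        double total = 0.0;
        for (long s : sizes) {
            total += (double) s;
        }
        return total;
    }

    public static void main(String[] args) {
        // Both primitive types occupy 8 bytes.
        System.out.println("long bytes:   " + Long.BYTES);
        System.out.println("double bytes: " + Double.BYTES);

        long[] sizes = {5_000_000_000L, 3_000_000_000L}; // made-up file sizes
        System.out.println("sum as long:   " + sumAsLong(sizes));
        System.out.println("sum as double: " + sumAsDouble(sizes));

        // A double represents every integer exactly only up to 2^53.
        long limit = 1L << 53;
        System.out.println("2^53 + 1 survives as double: "
                + ((double) (limit + 1) != (double) limit));
    }
}
```

So for byte sizes, both types are the same width and a double holds the values exactly as long as the totals stay below 2^53 (about 9 petabytes).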
Having discovered that, wouldn't it be beneficial to improve "long" support in KNIME, if only by providing "long" as a target format in "String to Number"?
Thanks for your comments,
E
P.S.: Subsequent aggregations now always need a double-to-long conversion to run at an acceptable speed, so "long" support in the "GroupBy" node would be appreciated, too, I guess. :-)