Using the new Expression node in 5.3 (which is looking very promising), is it the intention that it treats Long and Integer datatypes the same?
I noted that when presented with both Integer and Long column data, it sees both of them as “Integer”, but it actually treats them as “Long”, and from what I can see it can only output “Long” for integer data:
I haven’t yet found a “toInt” converter built into that node, so currently with respect to integer data:
Expression can only output Long (64 bit) integers
Math Formula can only output Integer (32 bit) integers
Is that how it is intended to be, or is this just a transient situation with this first release of the Expression node?
Should we read into this, for example, that the expectation for future-KNIME is that 32 bit integer data is destined to be phased out?
Or alternatively, have I just missed something with the new node?
I can understand that there may be technical reasons why the preference was to use Long internally (to avoid loss of precision), but perhaps it would still make sense to allow the user to specify the output datatype that they require and perform a conversion if necessary.
The idea that simply adding 1 to an Integer always generates a Long feels wrong to me.
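To illustrate the kind of user-selectable conversion I have in mind, here is a minimal sketch in plain Python. It is not KNIME expression syntax, and `to_int32` is a hypothetical helper name of my own, not an existing function in the node. It narrows a Long-style result to a 32 bit Integer only when the value actually fits:

```python
# Range of a 32 bit signed Integer.
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def to_int32(value: int) -> int:
    """Hypothetical 'toInt' helper: accept a value only if it fits in 32 bits.

    Raises OverflowError rather than silently wrapping or truncating.
    """
    if not INT32_MIN <= value <= INT32_MAX:
        raise OverflowError(f"{value} does not fit in a 32 bit Integer")
    return value


# Adding 1 to a small Integer still fits in 32 bits, so Integer output is safe.
small = to_int32(41 + 1)

# Adding 1 to the largest 32 bit value no longer fits; this is the case
# where widening to Long is genuinely needed.
try:
    to_int32(INT32_MAX + 1)
    overflowed = False
except OverflowError:
    overflowed = True
```

The point is only that the narrowing can be made explicit and safe, so defaulting to Long internally need not rule out Integer output.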
The Long datatype has for some time been the “poor cousin” of the datatypes within KNIME and neglected in many of the scripting-based nodes.
The new Expression node goes the other way and neglects Integer in favour of Long. I see the point for the future, but the implementation is in my view neglecting the “now”, and ignoring the other nodes that people will be using. I too had considered that the main problem node was Math Formula, but on further investigation the issue is a little more widespread.
I have created the following image to annotate the primary inconsistencies that I have found with a move to only creating the Long data type and not creating Integer.
Mostly you will see that the “problem nodes” fall into two camps:
Nodes that cannot see Long Variables and/or Long Columns
Nodes that can only be configured using Integer Variables (if configuring with variables)
I am sure my screenshot is not exhaustive. I hope to have captured the major affected nodes that I can think of.
To me, it isn’t that the removal of the distinction between Integer and Long is necessarily a bad thing going forward. But either the Expression (and Variable Expression) nodes should be given the ability to create Integer output, or all of the existing nodes ought to be (at last) modified to handle Long, especially Long variables, which are really out of favour in the older nodes.
Without that there is going to be a need to add lots of additional Double to Int nodes to convert Long to Int just to be able to use the new nodes alongside the old.
EDIT 24 April 2025
Another node that can be included in the above diagram:
DB Query Reader
(and possibly other DB Nodes)
This node cannot see LONG flow variables, so to include them in a query, they need to be converted to String. I would not recommend converting them to Double or Integer for inclusion in a SQL query because of the potential (likely) loss of precision.
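A quick sketch of why I would avoid Double or Integer here, again in plain Python rather than anything KNIME-specific. The value, the `wrap_to_int32` helper, and the `my_table` / `id` names are made up for illustration. A Double only has a 53 bit significand, so large Long values get rounded, and narrowing to 32 bits discards the high bits, whereas the String form keeps every digit:

```python
def wrap_to_int32(value: int) -> int:
    """Illustrative helper: mimic a narrowing cast that keeps only the low 32 bits."""
    return (value + 2**31) % 2**32 - 2**31


long_value = 9_007_199_254_740_993  # 2**53 + 1, well within the 64 bit Long range

as_double = float(long_value)           # rounded to the nearest representable double
as_int32 = wrap_to_int32(long_value)    # high bits are thrown away
as_string = str(long_value)             # every digit preserved

# Hypothetical query text built from the String form.
query = f"SELECT * FROM my_table WHERE id = {as_string}"

print(int(as_double) == long_value)  # False: the Double no longer matches
print(as_int32 == long_value)        # False: the 32 bit value no longer matches
print(int(as_string) == long_value)  # True: the String round-trips exactly
```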