Row Index Inconsistency results in erroneous processing - Always start Row Index with 0

Hi,

I have noticed on many occasions that the row index is interpreted inconsistently by nodes. For example, the Rule Engine starts with zero:

However, the Row Filter (new node) as well as the Rule-based Row Filter use a Row Index starting at 1.


2024-09-18 10:25:28,205 WARN Row Splitter 4:1121:0:1130:0:1133 Row number must be larger than zero: 0

This inconsistency can easily lead to serious data processing issues, especially because it only logs a warning and does not really error out!
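The effect can be sketched outside Knime. The Python below is only an illustration, not Knime code: `keep_by_index` and `keep_by_number` are hypothetical stand-ins for a node that counts rows from 0 and a node that counts rows from 1. The same threshold keeps a different number of rows, and nothing raises an error.

```python
rows = ["r0", "r1", "r2", "r3"]


def keep_by_index(rows, threshold):
    """Keep rows whose 0-based row index is <= threshold (hypothetical node A)."""
    return [row for index, row in enumerate(rows) if index <= threshold]


def keep_by_number(rows, threshold):
    """Keep rows whose 1-based row number is <= threshold (hypothetical node B)."""
    return [row for number, row in enumerate(rows, start=1) if number <= threshold]


# Same threshold value, two conventions, two different results.
kept_by_index = keep_by_index(rows, 1)    # index 0 and index 1
kept_by_number = keep_by_number(rows, 1)  # row number 1 only
print(kept_by_index, kept_by_number)
```

Because both calls succeed, the off-by-one difference only shows up later as missing or extra rows.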

From a coding perspective, numbering starts at zero, not at one. This, however, might confuse people who are not familiar with that concept. On the other hand, the row index has always started at zero in Knime.

To prevent further mismatches and potentially broken workflows, I strongly recommend sticking with the default and starting the Row Index at zero in all nodes.

PS: Here is a simple example showing how the Row Index mismatch can cause trouble.

Best
Mike

Hi @mwiegand and thank you for your feedback,

You are right, we are using the 1-based “Row number” instead of the 0-based row index in the new dialogs. This, as you mentioned, prevents confusion for users who are not familiar with 0-based indexing.
Row IDs are still 0-based if you need them.
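
As a rough sketch of how the three notions relate (plain Python, not the KNIME API; the helper names are made up for illustration, and the `Row0`-style IDs assume default, unmodified Row IDs):

```python
def index_to_number(row_index: int) -> int:
    """Convert a 0-based row index to a 1-based row number."""
    if row_index < 0:
        raise ValueError(f"Row index must not be negative: {row_index}")
    return row_index + 1


def number_to_index(row_number: int) -> int:
    """Convert a 1-based row number to a 0-based row index."""
    if row_number < 1:
        # Fail loudly instead of only warning, so a 0 never slips through.
        raise ValueError(f"Row number must be larger than zero: {row_number}")
    return row_number - 1


def default_row_id(row_index: int) -> str:
    """Build a default-style Row ID from a 0-based index (assumed format)."""
    return f"Row{row_index}"


first_row_id = default_row_id(number_to_index(1))
print(first_row_id)
```

Row number 1, row index 0 and the assumed default Row ID `Row0` all refer to the same first row.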


Shouldn’t e.g. the Row Filter then auto-adjust that in accordance with the minimum selectable value? That mismatch, with some nodes starting at 0 and others at 1, is unnecessarily confusing. It does not help to promote Knime as a pro tool, as inconsistency is the opposite of “it just works”.

Just imagine how embarrassing it must be if Knime trainers were required to explain, which they cannot, why a Row Filter has the Row Number start at 1 but a Rule-based Row Filter has it start at zero.


Well, I double-checked the new Expression node, and this node, which is the first in line to replace the old expression-based nodes, actually supports both “Row index” and “Row number”.
That made me think: why not have the same in all other nodes?

Thanks to your feedback and perseverance in proving your point :wink:, now we have 2 new tickets + 1 existing ticket to address the issues you have mentioned here:

  • Existing ticket UIEXT-2094 for the min value in input fields, such as Row number in the Row Filter
  • New ticket UIEXT-2172 for further discussion on adding “Row index” to the new dialogs wherever there is a “Row number”
  • New ticket AP-23342 to provide the Row ID, number and index options with the column list in the Expression nodes’ dialogs (these options are already supported and mentioned in the nodes’ descriptions)