Sorter node logic

Why does the KNIME sorter node prioritize capital letters ahead of lower case ones when sorting strings?

I thought something was missing from my data when a string that began “Ab” was not at the top of a sort while a string that began “AM” was at the top. I looked through the rest of my data via a GroupBy and it appears that when sorting strings in ascending order KNIME puts capitalization before what the actual letter is.

For example, KNIME would sort these strings in the following order:

AD
Ab
BC
BD
Bv
CA
aa
ba
cc

That seems ridiculous; a reasonable person would assume the sort goes:
aa
Ab
AD
ba
BC
BD
Bv
CA
cc

Does that make sense? Is there a way to change this default setting or deal with it other than changing every entry to all caps or all lower case?

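It may well seem ridiculous, but historically the rule has been to sort strings by the numeric codes assigned to each character. In ASCII, the uppercase letters (codes 65 to 90) come before the lowercase letters (codes 97 to 122), so every string starting with a capital sorts ahead of every string starting with a lowercase letter.
Here is an ASCII table for reference: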
https://www.cs.cmu.edu/~pattis/15-1XX/common/handouts/ascii.html
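
To make the code-based ordering concrete, here is a small Python sketch (not KNIME itself, just an illustration) using the strings from the question. It assumes plain string comparison by character code, which is what Python's `sorted` does by default, and contrasts it with a case-insensitive sort key.

```python
# Strings from the question, in the order the poster expected.
strings = ["aa", "Ab", "AD", "ba", "BC", "BD", "Bv", "CA", "cc"]

# Default comparison goes character by character using the character codes:
# 'A'-'Z' are 65-90 and 'a'-'z' are 97-122, so capitals come first.
by_code = sorted(strings)

# Comparing on a case-folded key ignores capitalization.
case_insensitive = sorted(strings, key=str.casefold)

print(by_code)           # ['AD', 'Ab', 'BC', 'BD', 'Bv', 'CA', 'aa', 'ba', 'cc']
print(case_insensitive)  # ['aa', 'Ab', 'AD', 'ba', 'BC', 'BD', 'Bv', 'CA', 'cc']
print(ord("D"), ord("b"))  # 68 98, which is why "AD" sorts before "Ab"
```

The first result matches the order KNIME produced in the question, and the second matches the order the poster expected.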

Hi @ewhulbert,
@izaychik63 already gave the reason for this behavior, but here is a workaround: you can convert everything to lower case, sort, and then join this sorted table back to the original table using the row ID as the join key.
Kind regards
Alexander
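
The workaround above can be sketched outside KNIME in a few lines of Python. This is only an illustration of the idea, not a KNIME workflow: the row IDs (`Row0`, `Row1`, ...) and the sample values are made up for the example, and a dictionary stands in for a table keyed by row ID.

```python
# Hypothetical table: row ID -> original string (values are illustrative).
original = {"Row0": "AD", "Row1": "Ab", "Row2": "aa", "Row3": "ba", "Row4": "BC"}

# Step 1: make a lower-cased copy that keeps the same row IDs.
lowered = {row_id: value.lower() for row_id, value in original.items()}

# Step 2: sort the lower-cased copy and keep the resulting row ID order.
sorted_ids = [row_id for row_id, _ in sorted(lowered.items(), key=lambda item: item[1])]

# Step 3: "join" back to the original table on row ID to recover the
# original capitalization in the new order.
result = [(row_id, original[row_id]) for row_id in sorted_ids]

for row_id, value in result:
    print(row_id, value)
```

The original strings are never modified; only the sort order comes from the lower-cased copy.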
