Bug in Table Difference Finder?

I am trying to use the Table Difference Finder to compare two tables, but it struggles when comparing doubles. Here is a simple example:
Input:
[Screenshot of the input table]

The input is split into two tables, which are then compared. The output reports that 10 and 10.0 are different.

[Screenshot of the Table Difference Finder output]

Even if I round the numbers down to 0 decimal places, it still fails. Here is the workflow:

I know about the IEEE 754 floating-point representation and the errors that can occur with it. That’s why I tried rounding the numbers. But even when I round to 0 decimal places, it does not work. It fails even when rounding to 1 significant digit. (For my application, 4 decimal places would be enough.) A number like 10.0 should also have only a single IEEE 754 representation.
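The claim that 10.0 has exactly one IEEE 754 representation can be checked outside KNIME. The following Python sketch (the helper name `double_bits` is made up for illustration) prints the 64-bit pattern of both values and compares two nearby numbers after rounding to 4 decimal places. It says nothing about how the Table Difference Finder compares cells internally.

```python
import struct

def double_bits(x: float) -> str:
    """Return the 64-bit IEEE 754 pattern of x as a big-endian hex string."""
    return struct.pack(">d", x).hex()

a = float(10)   # the integer 10 converted to a double
b = 10.0        # the literal double 10.0

print(double_bits(a))                    # 4024000000000000
print(double_bits(b))                    # 4024000000000000
print(double_bits(a) == double_bits(b))  # True: identical bit patterns

# Rounding to 4 decimal places, the precision mentioned above
print(round(10.00001, 4) == round(10.00004, 4))  # True
```

If the bit patterns are identical, a difference reported between 10 and 10.0 cannot come from the numeric value alone.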

The comparison works correctly if I apply a “Math Formula (Multi Column)” node to each double column and add 0.0 (formula: $$CURRENT_COLUMN$$ + 0.0).
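Why adding 0.0 helps is not established in this thread. One possible reading, which is purely an assumption, is that the two cells hold the same number but differ in some other property, such as their underlying type, and that the formula produces a fresh plain double on both sides. The Python sketch below only illustrates that idea with a hypothetical `strict_equal` helper. It is not KNIME's comparison logic.

```python
def strict_equal(a, b) -> bool:
    """Hypothetical comparison: equal only if both type AND value match."""
    return type(a) is type(b) and a == b

left, right = 10, 10.0   # same number, different underlying types

print(left == right)                          # True: numerically equal
print(strict_equal(left, right))              # False: int vs. float
print(strict_equal(left + 0.0, right + 0.0))  # True: both are plain floats now
```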

I don’t get where the problem is. Any ideas?

Here is the full workflow:
Table difference bug.knwf (81.8 KB)

Hello @masgo,

I checked it, and it seems to me that it’s failing due to domain differences (check the second output port of the Table Difference Finder node). If you include the double columns in a Domain Calculator node after rounding, it will work fine. Maybe it has to do with how you got those numbers in the first place, since you are using a Table Reader, but that’s just a guess on my side…

Additionally, if you create the same double value (10) in two Table Creator nodes, it works as expected.

Br,
Ivan

Hi @ipazin,

the data is from a real-world application where I encountered this error. For privacy reasons, I cannot provide all of the data. So I used a row filter and a column filter to create this minimal example and then saved the data with a Table Writer.

The domains are different because the data originates from two different sources (Excel and a DB). It gets filtered and transformed, and at the end I want to know the differences. However, the differing domains are not the reason it’s not working.

Here I create a table where the columns have different domains. I even entered one value as 10.0 and the other as 10, but the comparison works fine. The Difference Finder does not flag it as a difference (as expected/desired) but shows on the second output that the domains are different (as expected).

Table difference bug2.knwf (85.5 KB)

For the real-world data, the domains will most likely always differ, but only some values differ. Right now I am comparing two tables with ~1,400,000 rows each and ~10 double columns, so a total of ~14,000,000 cells are compared. For ~3,000 cells I am getting this kind of error, where 10 and 10.0 are somehow considered different. (It’s not only 10; it seems to affect most numbers.)

Hello @masgo,

I see, and you are right. It’s not due to domain differences. I hope someone will take a look at it.

Br,
Ivan