Does the Joiner node have a bug?

Hello guys,
In KNIME 4.0 I am noticing that something that worked before no longer does. I have a table created by a workflow and a CSV file. I need all records from the workflow table (top/left) and the matching records from the CSV. My left table has 5859 records, and I expect this number to stay the same after I left-outer-join the table with the CSV. However, I end up with ~9K rows if I join by the “id” column, or 5862 if I join by the “name” column.
Can anyone shed some light on why the number of rows would change at all with a left outer join?

See here
http://doc.nuodb.com/2.5.5/Content/About-LEFT-OUTER-Join-Operations.htm

What is the join that will bring the results from the second table for matching rows but will not add rows?

Matching can introduce new rows. Say table 1 has one row with key 5 and table 2 has 2 records whose key is also 5; the join will then return 2 rows even though the first table has just 1 row.
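
To make the row multiplication concrete, here is a minimal sketch in plain Python. It is not how the KNIME Joiner is implemented; the `left_outer_join` function and the sample rows are made up for illustration and only mimic standard left outer join semantics.

```python
from collections import defaultdict

def left_outer_join(left, right, key):
    """Left outer join of two lists of dicts on a shared key column (illustrative helper)."""
    # Group the right-hand rows by key so every match can be found.
    index = defaultdict(list)
    for row in right:
        index[row[key]].append(row)

    joined = []
    for row in left:
        matches = index.get(row[key])
        if not matches:
            # No match: the left row is kept once, without right-hand columns.
            joined.append(dict(row))
        else:
            # One output row per matching right-hand row.
            for match in matches:
                joined.append({**row, **match})
    return joined

# Hypothetical sample data: key 5 appears twice in the right table.
left = [{"id": 5, "name": "a"}, {"id": 6, "name": "b"}]
right = [{"id": 5, "value": "x"}, {"id": 5, "value": "y"}]

result = left_outer_join(left, right, "id")
print(len(left), len(result))  # 2 3
```

The left table has 2 rows, but the result has 3, because the single left row with key 5 is paired with both right rows that share that key.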

OK, but I still want to achieve what I described.
Add extra data from table one for the rows I have in table two. How can I achieve that?

Hi @IrynaK,

If you want to keep all the rows from the first table and have the matches from the second table, the left join is fine. But if you want to have the same number of rows after joining the tables, you have to decide how to choose among the rows with the same “id”.
To do that, you can use the Duplicate Row Filter node, select the “id” column, and in the advanced tab choose how to pick the row. You can also use a GroupBy node to do that, but you need to choose an aggregation function for every other column in your table.
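
The idea of reducing the second table to one row per “id” before joining can be sketched in plain Python as follows. This is an assumption-laden illustration, not the KNIME nodes themselves: `keep_first_per_key` and `lookup_join` are hypothetical helpers, and "keep the first row" is just one possible rule for picking among duplicates.

```python
def keep_first_per_key(rows, key):
    """Return a dict mapping each key value to the first row that carries it."""
    first = {}
    for row in rows:
        # setdefault only stores the row if the key has not been seen yet.
        first.setdefault(row[key], row)
    return first

def lookup_join(left, right, key):
    """Enrich each left row with at most one right row, so the row count never changes."""
    lookup = keep_first_per_key(right, key)
    return [{**row, **lookup.get(row[key], {})} for row in left]

# Hypothetical sample data: key 5 appears twice in the right table.
left_rows = [{"id": 5, "name": "a"}, {"id": 6, "name": "b"}]
right_rows = [{"id": 5, "value": "x"}, {"id": 5, "value": "y"}]

enriched = lookup_join(left_rows, right_rows, "id")
print(len(left_rows), len(enriched))  # 2 2
```

Because the second table is collapsed to one row per key first, the output always has exactly as many rows as the left table. Which duplicate survives is a choice you have to make deliberately.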

:blush:

I tried the Duplicate Row Filter. Firstly, it preselects which column is used as the ‘id’; secondly, it removes some duplicates, but the row count is still off, so I am not achieving what I want.
I could swear in my previous versions it worked correctly, I always checked the record count before and after the join and it was correct. Is there maybe a lookup node that would achieve it?

I just realized what is happening. The bottom table has two rows for my one row, and that's why it multiplies. Sorry, I probably needed more coffee all this time to focus on this correctly.
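
For anyone hitting the same thing, a quick way to confirm this kind of cause is to count how often each key occurs in the bottom table. A minimal sketch in plain Python, with a hypothetical `duplicated_keys` helper and made-up rows:

```python
from collections import Counter

def duplicated_keys(rows, key):
    """Return the key values that occur more than once, with their counts."""
    counts = Counter(row[key] for row in rows)
    return {k: n for k, n in counts.items() if n > 1}

# Hypothetical bottom table: id 5 occurs twice, id 7 once.
bottom_table = [
    {"id": 5, "value": "x"},
    {"id": 5, "value": "y"},
    {"id": 7, "value": "z"},
]

dupes = duplicated_keys(bottom_table, "id")
print(dupes)  # {5: 2}
```

Any key listed here will produce extra rows in a left outer join for each left row that matches it.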
