Joiner node is adding more rows than the original file

Hi,

My Joiner top input is an Excel file with 4530 rows, in which I need to classify one column based on a list in another file, similar to a VLOOKUP in Excel. The problem is that when I use the Joiner (see its settings below), it adds 100+ extra rows, duplicating all the values. I just want to do the VLOOKUP; I don’t want to add anything. If the Joiner doesn’t find a match, it can put something like “Error”, but it just CAN’T add more rows.

What am I doing wrong? I heard that I have to perform a Left Join only, but I can’t see this setting in the Joiner node. Please help!!!

Hi @Rodrigo_Rasswei, if the Joiner node is adding rows, it is because more than one row matches your join condition. For example, perhaps EAN isn’t unique in your lower table, so the Joiner finds two rows for a given value of EAN and returns both of them, each joined to the original row from the top table.
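For readers who want to see this effect outside KNIME, here is a minimal pandas sketch of the same situation. The table contents and the column name `order_id` are made up for illustration; only EAN and Classification come from the thread.

```python
import pandas as pd

# Hypothetical top table: purchase orders, where the same EAN repeats.
orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "EAN": ["111", "111", "222", "333"],
})

# Hypothetical lookup table in which EAN "111" is listed twice.
lookup = pd.DataFrame({
    "EAN": ["111", "111", "222"],
    "Classification": ["A", "A", "B"],
})

# Left join: every order is kept, but each order with EAN "111"
# matches two lookup rows and is therefore emitted twice.
joined = orders.merge(lookup, on="EAN", how="left")

print(len(orders), len(joined))
```

The join type is not the cause here: even a left join grows the table whenever the lookup side has repeated keys.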

Your options (if you are going to use the Joiner node) are to find an additional join column that guarantees a single unique row to be joined, or else to remove duplicates in one or both tables prior to using the Joiner.
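As a rough illustration of the "remove duplicates first" option, the pandas sketch below deduplicates a lookup table on EAN before a left join. The data and the names `orders`, `lookup` and `order_id` are hypothetical; keeping the first duplicate is an arbitrary choice that only makes sense if the duplicates carry the same classification.

```python
import pandas as pd

# Hypothetical data: four orders, and a lookup table with a repeated EAN.
orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "EAN": ["111", "111", "222", "333"],
})
lookup = pd.DataFrame({
    "EAN": ["111", "111", "222"],
    "Classification": ["A", "A", "B"],
})

# Keep a single row per EAN in the lookup table before joining.
lookup_unique = lookup.drop_duplicates(subset="EAN", keep="first")

# validate="many_to_one" makes pandas raise a MergeError if the lookup
# side still contains duplicate keys, so extra rows cannot appear silently.
result = orders.merge(lookup_unique, on="EAN", how="left", validate="many_to_one")

print(len(orders), len(result))
```

With a unique lookup key, the result has exactly as many rows as the top table.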

Alternatively, have you tried the Value Lookup node, which gives additional options on how to handle multiple matching rows?
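A lookup of this kind can be sketched in pandas as follows. This is only an analogy and not how the Value Lookup node is implemented; the data is hypothetical, and "keep the first match" and the “Error” placeholder are assumptions taken from what the original question asks for.

```python
import pandas as pd

# Hypothetical data: four orders, and a lookup table with a repeated EAN.
orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "EAN": ["111", "111", "222", "333"],
})
lookup = pd.DataFrame({
    "EAN": ["111", "111", "222"],
    "Classification": ["A", "A", "B"],
})

# Build an EAN -> Classification mapping, keeping the first match per EAN.
ean_to_class = (
    lookup.drop_duplicates(subset="EAN", keep="first")
    .set_index("EAN")["Classification"]
)

# map() fills one value per existing row, so the row count cannot change.
# EANs without a match become missing values, which are replaced by "Error".
orders["Classification"] = orders["EAN"].map(ean_to_class).fillna("Error")

print(orders)
```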

Hi @takbb, thank you for your response!

And yes, the EANs aren’t unique. They are like a SKU code, a product identifier. My rows are purchase orders for different clients, different destinations and different revenues, but the EAN repeats. There are around 4530 purchase orders but only 100+ EANs, so each EAN appears more than once.

I need to classify the EANs for an analysis, but the classification is in another file.

This is what I need to do: fill the yellow column based on the values in Table 2.
[image: ean-classification]

I don’t know what to do :frowning:

As @takbb said - the behavior you describe suggests your table 2 has duplicates in column EAN.

To check this you could:

  • connect table 2 to a GroupBy node
  • select column EAN as the group column
  • go to the Manual Aggregation tab, select column Classification as the aggregation column and use the method Count
  • after the node executes, sort the table in descending order to identify those EANs that occur more than once
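The same check can be sketched in pandas, for anyone who prefers to verify it in a script. The lookup data below is invented for illustration.

```python
import pandas as pd

# Hypothetical table 2: EANs "111" and "333" each appear twice.
lookup = pd.DataFrame({
    "EAN": ["111", "111", "222", "333", "333"],
    "Classification": ["A", "A", "B", "C", "D"],
})

# Group by EAN, count the Classification entries, sort descending.
counts = (
    lookup.groupby("EAN")["Classification"]
    .count()
    .sort_values(ascending=False)
)

# EANs that occur more than once in the lookup table.
duplicated_eans = counts[counts > 1]

print(duplicated_eans)
```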

I’d then check whether the classification values for the duplicates differ. If they don’t, you can remove the duplicates - if they do, the question is: why?
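A minimal pandas sketch of that second check, again on invented data: it counts the distinct classifications per EAN, so exact duplicates can be told apart from conflicting ones.

```python
import pandas as pd

# Hypothetical table 2: "111" is an exact duplicate, "333" has two
# different classifications.
lookup = pd.DataFrame({
    "EAN": ["111", "111", "222", "333", "333"],
    "Classification": ["A", "A", "B", "C", "D"],
})

# Number of distinct classifications per EAN.
distinct = lookup.groupby("EAN")["Classification"].nunique()

# EANs whose duplicates disagree; these need a manual decision.
conflicting = distinct[distinct > 1].index.tolist()

# Rows that are identical in every column are safe to drop.
lookup_clean = lookup.drop_duplicates()

print(conflicting, len(lookup_clean))
```

In this example the repeated rows for "111" collapse into one, while "333" remains duplicated and would still multiply rows in a join until the conflict is resolved.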

Ohh I got it now! It worked, thank you guys!
