Joiner - Problem

Chrimbo · March 10, 2019, 6:00pm

Hi Community,

next rookie-problem from me ^^.

I understand the joiner nearly a vlookup.

Compare Column A, when you find a match, write column B and C.
But i have no succes when i compare A, and table 1 has more rows like Table B or B has more than A. Example:

Table 1:

Table 2:

Result:

Dream-Result:

Many, many thanks in advance!

Chrimbo

izaychik63 · March 10, 2019, 6:32pm

To have dream result join table 2 on port 0 table 1 on port 1, use left outer join by 3 fields A, B and C: Collect result fields only from tab 1.

Chrimbo · March 10, 2019, 6:50pm

Hi izaychik63, (second time you help me ! ),

thanks for the result - but:

Table 1 contain rows which are not in Table 2, and Table 2 contains rows which are not in table 1, but both have same rows.

e.g.:

Table 1:

A
B
C

Table 2:

C
D
E

Do you have a solution ? When i use the joiner, i get empty rows when in Table 1 are rows which are not in table 2.

Thanks in advance !

Chrimbo · March 10, 2019, 7:03pm

or to make it simple:

Check Column A with A, when you have a match write: A,B,C

My Problem:
KNIME dont write A when it dont have a Match, but in both Tables are More Rows as only the Matches Rows. I need a mix from join and concatenate.

Hope you understand what i mean.

Thx in advance!

izaychik63 · March 10, 2019, 7:19pm

Possibly concatenate and following Group by to get rid of duplicates will do what you need.

Chrimbo · March 10, 2019, 8:10pm

hey,

i dont know if this is the best solution.

i try to explain it other wise:

Table 1

Column: A B C
1 2 3
12 23 34
45 67 89

Table 2:

Column: A B C D
1 2 3 4
12 23 34 xyz
48 19 32 abcde

Target: A B C D
1 2 3 4
12 23 34 xyz
48 19 32 abcde
45 67 89

With concatenate, i get two times the same rows. And what should i do with group by?

Is there no other chance to do this with one node? (joiner?!?!)

Thank you very much in advance!

Chrimbo

mlauber71 · March 10, 2019, 8:30pm

You could use the joiner node with full outer join and then Rule Engine to determine a common ID and if there was an initial match. I also carry information about the source of the data. You would then have to make a decision how to proceed with the data - eg which table gets the ‘lead’ or to check if the two columns B and B (#1) do match - for example you could give table1 the lead and only take data from table2 if it is missing.

kn_example_joiner_problem.knwf (44.5 KB)

Chrimbo · March 10, 2019, 8:55pm

Hello mlauber71, (second time you helped me )

i download your wf, but the result is not what i try to do.

You rebuild my tables, so i think you know what i try to do.

In my opinion its an art of extended joiner.

Compare Table 1 and 2. When you have a Match in Column A, join them from both tables, if not, write the rows without the match.

Table 1:

C1----C2----C3

A B C
AB CD DE
FG HI JK

Table 2

C1—C2—C3—C4

A B C D
AB CD EF GH
MN OP QR ST

= A joiner who join ALL Columns and rows and combine the matches.

Hope you both have an idea ^^

thx a lot !

Chrimbo · March 10, 2019, 8:59pm

ok,

i checked again. (Thank you !!!) When i now put a column filter, i get the result i want.

But, is there easier way to do that?

mlauber71 · March 10, 2019, 9:21pm

Not sure if this is ‘easier’ but you can do it SQLite style

kn_example_joiner_problem.knwf (63.4 KB)

Chrimbo · March 10, 2019, 9:27pm

hi mlauber71,

lol, nice answer! yes, for me its more difficult.

But i am very thankful, because you show me the way and i realize, that i have to learn a lot to be on a nearly level like you.

Even with your answers to my first thread, I realized that my knowledge is a grain of sand on the beach ( i booked an online-Python-course !!! ) ^^

I need an easy-way to explain my colleagues tomorrow

Maybe i get a result with concatenate and group by (but i am not sure), if not, i take yours

Thanks for your work !

And maybe one day, we can do a same-level-talk ^^

mlauber71 · March 10, 2019, 9:40pm

The Join functions of KNIME are somewhat limited and you would have to use a few rules and maybe more nodes to bring the data together. I thin in your case you cannot avoid setting a rule what to do with the columns that are the same in both tables but are not IDs for matching. You could even put all that in a single SQL statement but that is not really more readable.

From your last entry with the c–c--c columns I wonder if I fully understand what you are planning to do. Maybe if the example is not the right one you could construct a file that better suits your question (eg. if there could be more than one ID you might put that in your example). Typically it is much easier to solve problems with sample data - and constructing the example forces the person with the question (that is you) to think about where the specific problem is.

If you need to do this for an unknown set of variables you might have to use a loop.

With KNIME it is often the case you have to use quite some nodes to achieve your goal and often there is more than one way to do it, and especially in the beginning one might take a few extra or strange steps. In the end what matters it that you get your result and you can reproduce it - so if you have to explain it to other people use the node descriptions and annotations to structure your workflow.

So keep up with KNIME and Python - you will see great results and hopefully have a lot of fun on the way

Chrimbo · March 12, 2019, 6:40pm

Hello mlauber71,

thank you very much for your long answer!

Yes, next time when i am at home, i will show you the “real” datas to get a better answer.

I will keep you up to date and of course try myself to solve more and more difficult problems. It will be a long way to go, but I know it will be worth it. I wish you, izaychik63 and of course the whole community a great time!

Thanks again for your help!

Chrimbo

agaunt · March 13, 2019, 3:43pm

@chrimbo: Have you thought about a Joiner (Full Outer Join with unchecked option “Remove joining columns from bottom input”) followed by three Column Merger Nodes?

morpheus · March 14, 2019, 10:07am

Hello Chrimbo,
normaly you have to set the joiner mode to left outer join. In this case all rows of table A are shown and only entries of table B which match the defined key(s). In your example Column A.

Chrimbo · March 17, 2019, 4:15pm

Hey guys,

Thanks again for your answers!

I had a lot of stress during the week, so unfortunately I am so late!

Attached as promised, the tables that represent my data source. Now, I would like, as already mentioned, a node that joins me both tables and match and complete same lines to the final table that exist in one but not in the other table and write the no-matches-found-rows and write this too in the final table.

I solved the problem with concatenate and Group by ( Soloution from izaychik63) BUT, mlauber71 woke me up in the positive to continue relentlessly going my way ^^ )

Tonight i will try the solution from agaunt and morpheus and give you a feedback if it works for my special problem ^^

Nice evening guys!

Chrimbo

A.xlsx (9.8 KB)

B.xlsx (8.8 KB)

Final table:

Finaltable.xlsx (9.1 KB)

eferro · August 19, 2021, 1:08pm

I had the same problem, I resolved it extracting the ROWID to a column and from there using this column to join. See the flow below.

https://hub.knime.com/eferro/spaces/Public/latest/problema_joiner