Merging multiple columns, simple and fast

Hello community,

I would like to join two tables having the same columns but with different data.

E.g.:
Table 1
ID Col1 Col2 Col3
1 123 123 123
2 12 3 6
3 ? ? ?

Table 2
ID Col1 Col2 Col3
1 ? ? ?
2 ? ? ?
3 1 1 1

out
ID Col1 Col2 Col3
1 123 123 123
2 12 3 6
3 1 1 1

At the moment i am using a joiner node and multiple column merger nodes after that (like 30 of them, capsuled into a meta node), to get rid of duplicate columns.
I dont realy like that solution, it’s very slow and kind of awkward, but gets the job done.

Is there maybe a knime extension with a merge-join node or something simelar?

Thanks for any kind of ideas and help :).

Regards Sebastian

Hi @SebastianK,
You are looking for the Concatenate node, that should do exactly what you want.
best,
Gabriel

Hi Sebastian,

how about this: First extract the rowId using the RowId node on both tables.
Than concatenate the two of them.
In the end use the groupby node on the RowID and on the type based aggregation tab you can define how e.g. string columns should be combined, than it is done for all columns at once.

Cheers, Iris

1 Like

I guess it depends on the actual data.

In your potentially simplified example you could just use concatenate and a row filter on one of the columns (empty). This of course only works if all columns of that row are empty, eg like in your example.

Else Iris approach should work and is more general. In the aggregation method you can then remove the missing value check mark which will exclude missing values. So something like “first” for string columns would work.

Thank you all for the usefull tips! :^)

I have realized that my given example was a bit unfortunate. My Problem is a bit more complex, but the ideas tips u gave me helped me much and allowed me to solve my Problem.

So basicly I’ve got 2 tables and want to join them to append the columns from table 2 on table 1. The Problem about this was that table 2 has 2 different ID numbers (like ID#1 and ID#2) which need to be joined on table 1 which has only 1 ID number, including all ID’s from table 2 (ID#1 and ID#2).

Table 1
ID col1 col2 col3

Table 2
ID#1 col4 col5 ID#2

My first attempt was to left join Table 1, Table 2 on ID and ID#1. And then left join again on ID and ID#2

What i got was
ID col1 col2 col3 col4 col5 col4#1 col5#1
(but like 30 duplicate columns which needed to be merged)

What I did now: I splited table 2 into two separte tables (one with ID#1 + all columns and one with ID#2 + all columns), renamed ID#1 /#2 to ID in both tables, and concatenated them back together.
After that I could simply join table 1 and table 2 together without merging duplicate columns.

Thank you very much! :smile:

2 Likes

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.