Join on partial string match, or Join if string is element of a larger string

Hi KNIMERs,

I have an issue which I couldn’t figure out,yet.

Following Scenario: In the first table, I have a column which can look like this, a bunch of keys. The length of the overall string, as well as the single elements can variate alot (length and position):

ROW 1: ,c5411,abc,rbst,c41,caat,nbcat,nbfact,nbfcpex,nbfcpin,
ROW 2: ,c12281,gdesge1,gfase1,gsocce2,c113e,cprdta,reqrpq,c1820,c2,cactio0,ccbat,gmat,

In a second table, I have those elements in singular representation, such as:

ROW 1: ,c5411,
ROW 2: ,c12281,

I would like to join those two tables following the logic if ROW 1 from table 2 is an element of the long string of ROW 1 in table 1, then join the information.
In SQL I would work here with sth. like INNER JOIN … CONTAINS (table2.table2column, table1.table1column)

Cross Join is too computationally expensive. Cell Splitter and then joining is too variable, as I don’t know with how many columns I will end up with.

Any ideas?

Best regards,
Stiefel

Hi @Residentstiefel

My suggestion here would be to convert the concatenated keys of first table into a collection (list) of values using the cell splitter but checking the option to generate a single column with a collection of values. Then I would use ungroup, do the joining and then group again if needed. The idea is here but if not clear enough, just tell me please and I’ll be happy to provide the workflow solution.

Hope this helps.

Best

Ael

2 Likes

Hi @aworker,

This did the job! Still a little bit mysterious to me first creating a list, then ungrouping again, but this didthe job fairly quick and efficient.

Thanks alot!

My pleasure and thanks @Residentstiefel for validating the answer. Indeed, SQL provides a more direct solution which, as far as I know, it is not possible in KNIME when implementing a Joiner:

I would be interested too in such joining “equivalent to SQL solution” in KNIME if it does exist.

Best

Ael

1 Like

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.