Feature request: theta joins, advanced joins

Hello KNIME users and developers,

I’m suggesting about the possibility to implement theta joins using only the joiner node or a dedicated “advanced joiner” node:

I mean the ability to make advanced joins between tables using complex logical operators like:
“BETWEEN”, ≤, ≥,<,>
and any combinations of them using parenthesis + AND, OR, NOT operators.

For example (given two tables T1, T2) the following join condition:
NOT (T1.columnA=T2.columnB) AND ((T1.columnA BETWEEN T2.columnC AND T2.columnD) OR (T1.columnE ≥ T2.columnE))

I know that it can currently be obtained in two ways (some examples are available in KNIME hub):

  • using a combination of nodes and structures like loops, crossjoins and sequential filters,
  • using SQL script with a connection to a database, execute, and download-reimport results

I tried both… but first way is often very slow and not optimized (especially on huge tables), the second one is ok with execution, but it can result a cumbersome process and seems a kind of workaround…

I really hope someone can think about that,
and I hope it can be realized in the future because I think it’s a very very useful task…

Thank you for any suggestion or any interest

Ema

2 Likes

Hello @Ema,

some use cases can be addressed with Binner (Dictionary) node. Also here is nice topic in case you missed:

Will add +1 to already existing ticket mentioned in linked topic.

Br,
Ivan

Thank you @ipazin for the feedback!

1 Like