How to Join Multiple Tables with Different Schemas in KNIME (Similar to Alteryx Multi-Join)?

Hi KNIME Community,

I have a use case where I need to join multiple incoming tables from flat files, and these tables have different schemas.

In Alteryx, there is a direct tool called Join Multiple, which allows joining multiple inputs easily.

However, in KNIME, I couldn’t find a single node that does this.

So far, I’ve tried using two Joiner nodes to combine the tables, but this approach doesn’t scale well when the number of tables increases.

I came across this Multiple join - #3 by aworker suggests using Recursive Loop Start/End nodes to wrap the Joiner node and perform multiple joins iteratively. However, I couldn’t get this to work in my case.

Here’s what I need help with:

  • How can I configure the Recursive Loop so that the Joiner node joins all tables one by one?

  • Is there a better approach for joining multiple tables with different schemas in KNIME?

Any guidance, examples, or best practices would be greatly appreciated!


What I’ve Tried:

  • Used multiple Joiner nodes manually (works but not scalable).

  • Attempted Recursive Loop setup (Joiner inside loop), but I’m stuck because the Joiner node only accepts two inputs.


Goal:

A solution similar to Alteryx’s Join Multiple tool, but in KNIME.

Hi @Arpit_Sharma ,can you give an idea of the type of data you are trying to join, possibly even some small demo files.

As you have noted, there is no joiner node in KNIME that can handle joining more than two tables at once.

I’m wondering though in your case how much data from each input table is to be added to the output. It could be for example that there are two primary tables to be joined and then other tables simply augment the output which could be efficiently achieved by a series of Value Lookup nodes.

I don’t think there is necessarily a “one pattern fits all” for this, so if you can give more information about the nature of the various input tables and expected output table that would be beneficial.

btw I understand your concerns about not scaling, but trying to suggest any pseudo-generic solutions with no idea of the use case is difficult.

Hi,

it’s not easy to find a generalized solution but maybe this is a good start:
The idea is, that we define a list of Paths, Columns to use for the GroupBy and for the Joining. Then iterate recursively over this list and expand the table in each iteration.

1 Like

But you need to be careful with the sizes… I dont think that hundrets of iterations will work. I tested it with three tables

HI guys,

A question… If tyou have some files/tables with the same scheme, you can use a csv/json/file reader node with the folder configuration to load multiples files. This will make a single table with all data that you have.

If you need to compare file by file and add some column because they have differences, use de join node for it.

After that, you can use de group node to sum the information or make some calc and at the end, use de row filter and column filter to remove information that you don’t need any more.

Can it be a solution for you?

Thanks,

Denis Iongbloed