Is there a node equivalent to the "joiner node" with more than two input ports?

rogerius1st · September 7, 2023, 1:26am

Dear Knimers,
I have to join n files (where n > 2: about a dozen, and possibly even more). Of course, they have one (or more) columns in common. But the classical Joiner node allows me to join just two files at a time, so I have to repeat the process time and time again: joining the first two, and then joining another file to the previous result. Thus, I am obliged to add n-1 Joiner nodes to complete all my joins. This adds an unexpected (and possibly unnecessary) overload to my workflow. Is there a more flexible way to implement that? I expected to find an alternative to the Joiner node with the option to modify the input ports, adding as many files as I might need. Is there such an option in KNIME?
Thanks for any help.
Rogério.

mlauber71 · September 7, 2023, 5:49am

@rogerius1st I would see two options: using loops, or a local SQL database, which would mean loading the data there first.
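
To illustrate the local-database option, here is a minimal sketch in plain Python using the standard-library `sqlite3` module rather than KNIME nodes. The table names (`t1`, `t2`, `t3`), the key column `id`, and the sample rows are all made up for the illustration; the point is only that, once the tables are loaded, a single SQL statement can join any number of them.

```python
import sqlite3

# Hypothetical sample tables: each row is (id, value); "id" is the shared key.
tables = {
    "t1": [(1, "a"), (2, "b"), (3, "c")],
    "t2": [(1, 10.0), (2, 20.0), (3, 30.0)],
    "t3": [(1, "x"), (3, "z")],
}

def join_all(tables, key="id"):
    """Load every table into an in-memory SQLite database and inner-join them on `key`."""
    con = sqlite3.connect(":memory:")
    names = list(tables)
    for name in names:
        # One value column per table, named after the table to avoid clashes.
        con.execute(f"CREATE TABLE {name} ({key} INTEGER, val_{name})")
        con.executemany(f"INSERT INTO {name} VALUES (?, ?)", tables[name])
    first, rest = names[0], names[1:]
    select = ", ".join([f"{first}.{key}"] + [f"{n}.val_{n}" for n in names])
    # Build one JOIN clause per additional table, however many there are.
    joins = " ".join(f"JOIN {n} ON {n}.{key} = {first}.{key}" for n in rest)
    query = f"SELECT {select} FROM {first} {joins} ORDER BY {first}.{key}"
    rows = con.execute(query).fetchall()
    con.close()
    return rows

result = join_all(tables)
print(result)
```

Because this is an inner join, only the IDs present in every table survive (here 1 and 3); a different join type would need a different query.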

rogerius1st · September 7, 2023, 6:46pm

Thank you, once again, dear Markus, for your tremendous generosity in sharing your knowledge.
But unfortunately (once again) I feel as if I had a rope around my neck. I must conclude this research by the end of this month. Thus, I do not have enough time to acquire this additional competence (of dealing with databases and Structured Query Language (SQL)).
In the meantime, I had already adopted another loop, using the nodes: ‘List Files / Folders’ → ‘Table Row To Variable Loop Start’ → ‘Java Edit Variable’ → ‘CSV Reader’ → ‘Variable To Table Column’ → [plus a series of other nodes to implement sequential transformations on the data] → ‘Loop End’. Of course, this sequence has the additional cost of writing my intermediate files out to my local system and then importing them all again into my workflow, so I pay for this undesired burden with higher computational costs.
What I hoped for was a node (or metanode, or component) inside KNIME that could perform this function of joining several intermediate tables (generated along a workflow).
But once again, thank you for all your inestimable efforts.
I wish you all the best.
Bye.
Rogério.

mlauber71 · September 7, 2023, 7:36pm

@rogerius1st from my experience it can sometimes help to define a clear case and maybe provide sample data with the (let’s call them) rules and restrictions that you will face. Based on that, it might be possible to plan and build a workflow that could accomplish a merging of several files/data (?).

The files might be in a folder and have the same ID. You could set up a reference table that would state which ID would have to be joined with which other one and so on.

The question will also be how many such files and loops are to be expected. If there are a dozen, it might work like this. If you have hundreds or thousands, one might have to store intermediate results.
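
As a rough illustration of the loop idea, outside KNIME and in plain Python: a pairwise join can be folded over a list of tables, so that each step joins the next table onto the accumulated result. The helper names `inner_join` and `join_many`, the key column `id`, and the sample rows are hypothetical; this is a sketch of the principle, not of any particular KNIME node.

```python
from functools import reduce

def inner_join(left, right, key="id"):
    """Inner-join two tables (lists of dict rows) on a shared key column."""
    # Index the right table by key so each left row finds its matches directly.
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    joined = []
    for lrow in left:
        for rrow in index.get(lrow[key], []):
            # On a column-name clash (other than the key), the right value wins.
            joined.append({**lrow, **rrow})
    return joined

def join_many(tables, key="id"):
    """Join any number of tables by applying the pairwise join repeatedly."""
    return reduce(lambda acc, table: inner_join(acc, table, key), tables)

# Hypothetical sample data: three small tables sharing the column "id".
tables = [
    [{"id": 1, "a": "x"}, {"id": 2, "a": "y"}],
    [{"id": 1, "b": 10}, {"id": 2, "b": 20}],
    [{"id": 2, "c": True}],
]
merged = join_many(tables)
print(merged)
```

The number of pairwise joins is still n-1, but the loop performs them, so nothing has to be wired up by hand for each additional table.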

Concerning time and workload of the system I would say: this is what machines and software are made for: so you can let them do the work.

Another observation: often a simpler construct is best, and then you just let KNIME run a few minutes longer.

So would it be possible to build up your case with 3-5 dummy files (CSV) with relevant IDs to demonstrate what you want to do?
