One for the dev team - Any chance of a multi-join node in the pipeworks?

Gavin_Attard · June 16, 2020, 8:50am

Hi Knime Devs!

Firstly a nod of appreciation to the work you guys are doing.
Having come from a strong Alteryx background, and now getting over the initial learning curve, I have to say I am mighty impressed with this platform and can see it growing in market strength. Indeed, I am happy to report that I have now embedded Knime workflows for 2 clients. It will be interesting to see how they adopt and scale, and we are supporting them on this journey.

So, to the point of the post: any chance you guys are working on or planning a multi-join tool similar to the one in Alteryx?

I know there are workarounds (there always are), but the point of a platform like this is to make it simple and virtually codeless, in particular for folk who are not DB admins, data engineers or coders, but who have domain knowledge and will be using a tool like this to facilitate their job and turn them into data heroes.

Cheers

Gavin

beginner · June 16, 2020, 9:59am

Spamming the forum with this request is not going to make it appear faster. This annoys me because you have been told multiple times by different people to just create a component. If you could show that you at least tried that, it would make you look a little less demanding. You can build a component that fits your needs in like 5 minutes, faster than writing these forum posts.

I leave the exact node configs as an exercise for you, as that might make you try to learn some more “advanced” knime concepts. Sometimes it just needs some thinking outside the box.
The top out port is the inner join, then the left and right joins. The column configuration is for selecting the join columns, with the assumption that they are named the same in both tables. If that is not the case, rename them before the component. Splitting can be done by RowID, as any left-only RowID will end with a "?" and any right-only RowID will start with a "?".
(As I said, thinking outside the box).
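The RowID trick described above can be sketched outside KNIME in plain Python. This is only an illustration, not the component itself: the helper names `full_outer_join` and `split_by_row_id` and the `L0_R1` style of row IDs are assumptions made for the example. The only part taken from the post is that left-only IDs end with "?" and right-only IDs start with "?".

```python
def full_outer_join(left, right, key):
    """Full outer join of two lists of dict rows; returns {row_id: row}.

    Row IDs are built as "<left id>_<right id>", with "?" standing in
    for the missing side (hypothetical format, chosen for this sketch).
    """
    joined = {}
    matched_right = set()
    for li, lrow in enumerate(left):
        hits = [ri for ri, rrow in enumerate(right) if rrow[key] == lrow[key]]
        for ri in hits:
            matched_right.add(ri)
            # On a column name clash the left value wins.
            joined[f"L{li}_R{ri}"] = {**right[ri], **lrow}
        if not hits:
            joined[f"L{li}_?"] = dict(lrow)
    for ri, rrow in enumerate(right):
        if ri not in matched_right:
            joined[f"?_R{ri}"] = dict(rrow)
    return joined


def split_by_row_id(joined):
    """Split a full outer join into (inner, left_only, right_only) by row ID."""
    left_only = {k: v for k, v in joined.items() if k.endswith("?")}
    right_only = {k: v for k, v in joined.items() if k.startswith("?")}
    inner = {k: v for k, v in joined.items()
             if k not in left_only and k not in right_only}
    return inner, left_only, right_only


left = [{"id": 1, "a": "x"}, {"id": 2, "a": "y"}]
right = [{"id": 2, "b": "p"}, {"id": 3, "b": "q"}]
inner, left_only, right_only = split_by_row_id(full_outer_join(left, right, "id"))
```

Here `inner` holds the matched row for `id` 2, `left_only` the row for `id` 1, and `right_only` the row for `id` 3, which corresponds to the three output ports described in the post.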

And as said writing this took me about 5 times longer than creating this example.

Gavin_Attard · June 16, 2020, 10:32am

Hi @beginner

It seems I may have hit a nerve with you, so much so that I don’t think you read the post completely and went off on one.
Whatever it is, may I suggest a calming cup of tea and maintaining a respectful and helpful tone throughout; this is not reddit or a gaming forum…

This post is completely unrelated to the post about handling nulls in joins or its parent, which refers to three output ports for the left, inner and right joins respectively.

This is about equivalency to the Alteryx Multi-Join Tool, which allows a variable number of inputs to be joined, incredibly useful as you can imagine.
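To make the idea of joining a variable number of inputs concrete, here is a minimal Python sketch. It is not how the Alteryx tool or any KNIME node is implemented; `inner_join` and `multi_join` are hypothetical helpers, and the sketch assumes every input shares one key column and that an inner join is wanted.

```python
from functools import reduce


def inner_join(left, right, key):
    """Inner join of two lists of dict rows on a shared key column."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    return [{**lrow, **rrow} for lrow in left for rrow in index.get(lrow[key], [])]


def multi_join(tables, key):
    """Join any number of tables by folding inner_join over the list."""
    if not tables:
        return []
    return reduce(lambda acc, table: inner_join(acc, table, key), tables)


customers = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}]
orders = [{"id": 1, "total": 30}, {"id": 2, "total": 45}]
regions = [{"id": 1, "region": "EU"}]

result = multi_join([customers, orders, regions], "id")
```

The caller passes a list of inputs of any length and gets one table back, which is the convenience being asked for. Internally it is still a chain of pairwise joins.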

Of course I am aware of components and how to build them. I have also built the Alteryx equivalents (Macros) in much of the work I do.
So much so that my contributions to forums and aiding users on the Alteryx platform had me designated as an ACE for 2 years running.
There is nearly always a workaround and multiple ways of doing the same thing… I’m interested in the discussion behind them, as they often reveal new things and help a person grow.

Bear in mind, I am also bringing the viewpoint of clients with me, where the point of a platform such as this is to facilitate, so that simpler concepts such as multi-stream joins are effortless and, for those not initiated in SQL, nulls pass through, and so on and so forth…

Leaving components building for the more complex stuff…

I do hope this clears the air and we can reset our forum relationship to one that is more collegial in nature.

Gavin

Gavin_Attard · June 16, 2020, 10:34am

Incidentally @beginner

I do thank you for putting forward solutions to the two issues I raised in my other posts; perhaps they would be best posted there?

beginner · June 16, 2020, 12:49pm

Actually I did read it completely. Your initial post doesn’t mention what the multi-join tool is at all hence the misunderstanding.

If you actually want to join multiple inputs, just chain the joiner as many times as needed? Indeed you hit a nerve, because you are proposing stuff that is already trivial to do. This one, as far as I can tell from your explanation, is even easier than the other one.
(Another option could be concatenate + GroupBy, but that would actually be more black-boxy, as it hides your actual intention and is trickier to configure.)
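As a rough illustration of the concatenate + GroupBy alternative mentioned above, the following Python sketch stacks all input rows and then merges them per key. `concat_group_join` is a hypothetical helper, not a KNIME node. The sketch assumes each key occurs at most once per table, and later tables overwrite earlier values on a column name clash.

```python
from itertools import chain


def concat_group_join(tables, key):
    """Stack all rows, then group by key and merge the columns per group.

    With unique keys per table this behaves like a full outer join;
    columns missing for a key are simply absent from that row.
    """
    grouped = {}
    for row in chain.from_iterable(tables):
        grouped.setdefault(row[key], {}).update(row)
    return list(grouped.values())


table_a = [{"id": 1, "x": 10}, {"id": 2, "x": 20}]
table_b = [{"id": 2, "y": "b"}, {"id": 3, "y": "c"}]

merged = concat_group_join([table_a, table_b], "id")
```

The code is short, but nothing in it says "join", which is the "black-boxy" concern raised in the post: the intent is hidden in how the grouping merges rows.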

That is where we disagree. Chaining joiners seems straightforward and effortless to me.
Explaining the meaning of null/missing value and how null behaves is a matter of explanation/training but yeah I can see some people having issues with understanding that. But are these the right people to use a tool like knime?

Gavin_Attard · June 16, 2020, 1:09pm

But are these the right people to use a tool like knime?

Yes, of course they are; you only need to see the rapid and incredible expansion of Alteryx and the enablement of the users it serves. The primary reason: it’s easy and intuitive.
Indeed, I find myself for the most part using 2 to 3 more nodes/clicks to get the same thing done.

Now, I’m not knocking the platform; as per my previous post, it’s great, and indeed the capability to run nodes individually is brilliant (not to mention multi-OS etc…).
Personally I think Knime may be the better platform, and I only seek to suggest nodes and methods which I see as gaps between the two platforms.

Yes, of course you can chain join nodes, but why not have a single tool that does the multi-chain for you: fewer steps/clicks, easier to configure, cleaner on the canvas…

It’s just good UX.

AnotherFraudUser · June 16, 2020, 10:06pm

hi @Gavin_Attard,

I see where you are coming from: a node which does multiple steps and reduces the clutter of the workflows does not sound bad.
E.g. the new concat node with an optional number of inputs is really nice.
However, I would say the current approach, where every node does one small process step, is not bad UX.
I think having small nodes with easily understandable functions has a charm of its own, with simple, easy-to-understand steps to the result, e.g. one node reads in a file, the next row-filters value A, the next one joins the table.
With well-set node descriptions this becomes a process documentation by itself while providing great flexibility.

While I am not against more multi-task nodes, I guess I would prefer the focus to be on new features which I am currently unable to build with the native knime nodes at all.

But maybe coming from another tool gives you another point of view on this

Maybe you could collect your development/UX suggestion in one thread to keep the forum tidy?
E.g. as a collection of suggestions (i guess would be easier for developers as well to check multiple suggestions at once instead of many small threads with side discussions like these which bury your actual suggestions)
As someone checking many different threads for people who I might be able to help, I can see why @beginner is somewhat annoyed by many threads with what are basically non-problems from a solution perspective. That in itself is not a problem, as many new users have “easy” problems. However, you do not actually want “a” solution but mostly better/combined/other UX, while knowing that there are solutions, which can be irritating
(which should not downplay your suggestions it is just a different point of view)

beginner · June 17, 2020, 5:48am

Exactly. Having these large black-box-like nodes would ruin this specific feature of knime.

beginner · June 17, 2020, 6:06am

Maybe you are right. While I would like users to be able to do more, often they simply don’t have the time or don’t want to. I think it’s better to have an expert tool, and as such an “expert” that provides the solution to the users, where the solution could be a one-off extract of data or a periodically repeated workflow/report. That lets the users focus on their core work. For the average users (who in my case are not dumb; they are PhDs) it is simply too much to ask to understand a tool like KNIME or Alteryx on top of their core work. This needs time and dedication, and why should the user waste hours when I can do it in 10 minutes? Therefore I think it’s better to keep knime as an “expert” tool, and you can use knime-server (not free) to share your workflows with said users. (Probably doesn’t work so nicely in your case, as you seem to be in consulting?)

In general I’m firmly against the trend of moving more and more down to the end-user versus how it used to be, to have dedicated people (“experts”) do the stuff for all the users.

That is where I’m coming from.

Gavin_Attard · June 17, 2020, 6:13am

Hey guys,

Perhaps best to clarify: I am not knocking the smaller single-function nodes. I am not calling for them to be replaced or deprecated, merely for an additional node that can handle those multiple steps, in particular where there is a time-saving value (who likes repetition…)

I’ll shift the suggestion discussions to the Knime Development category, which, to @AnotherFraudUser’s point, is perhaps the better place. I was looking for a ‘product ideas’ thread, and I think this one is the closest to that.

Gavin_Attard · June 17, 2020, 11:45am

I see your point, and don’t disagree. I think there is space for both of these to co-exist.

We serve our clients by building and maintaining their flows, but also training them to self serve. That’s where the magic happens.

DemandEngineer · October 30, 2020, 10:00pm

Hi @Gavin_Attard,

Hope this can help you and maybe you can give back and help update:

Gavin_Attard · January 21, 2021, 5:07pm

Will certainly spend some time to contribute soon. Also identified a couple of components that can be built…

system · July 23, 2021, 5:07am

This topic was automatically closed 182 days after the last reply. New replies are no longer allowed.