I am transitioning from Alteryx to Knime and I am struggling with Joiner tool. I need to do a left join aka “anti-join” like in Alteryx. Could some one please explain step by step how to do it. Also after I want to do a union of left join and inner join matching columns with the same names.
unfortunately I have never used Alteryx, so if you could explain your data and the expected result, that would help a lot. Maybe you can come up with a small example? As far as I know a left join is not the same as an anti-join. While the former takes all the data from the left table and whatever it can match from the right table it appends, the latter is kind of the reverse of a join, in that it matches rows where the predicate is not true. Left join can be selected in KNIME’s Joiner node at the top of the configuration dialog (it is called left outer join here). If you need to remove rows from a table where a value occurs in another table, the Reference Row Filter can be used. For more complicated joins you will need to perform a cross join and then do the necessary calculations for each row, followed by a Row Filter.
Welcome to KNIME! I’ve used Alteryx and have been using KNIME extensively. I think you’ll find the transition “familiar yet new”, and over time you’ll find that KNIME is a lot more flexible in just about every area than Alteryx.
I think you might find this topic useful to get your solution. It includes an example as well.
Good luck and don’t hesitate to ask more questions!
I’ve spent a lot of time using Alteryx and put together training and transition documents for my team on moving over to KNIME. If you still have access to Alteryx load the example from the Join Tool. I built a copy of the example in KNIME to explain this to you:
Unlike Alteryx which splits your datastream into Left Unjoined, Joined, and Right Unjoined, KNIME will generally keep the data together and just generate missing values based on the kind of join you are wanting to do (which can be helpful or not)
For instance, in a LEFT OUTER JOIN in the KNIME Joiner Node the Galileo entry has a MISSING value in the FirstPurchaseDate column because it did not match any values from the BOTTOM or RIGHT table. In Alteryx this row is separated out into the LEFT Unjoined stream.
I have approached this problem in a couple ways
1. I used the Full Outer Join and then a Rule Based Row Filter to see the rows that DID NOT have any matching values.
2. Generally though, for the work that I do, I use the INNER JOIN option in the Joiner which acts like the J Output on the Alteryx JOIN tool. The only drawback here (versus the Alteryx tool) is that you will get an output that only shows you the joined values. You would need to do additional work (as mentioned above) to see the Unjoined values.
In the first image below I’ve highlighted the kind of join I used and the output with the missing value for the row containing Galileo. In the second image you see the Alteryx workflow with the same data generated but in this case the Galileo line is the only output from the L portion of the Join tool.
Finally, in the last image I am demonstrating how you can isolate the particular row to provide a close approximation of the Alteryx L output.
I have spent a LOT of time learning and transitioning from Alteryx to KNIME and am happy to answer any questions you have and provide translations solves for common problems!
Thank you for your reply. It is helpful. But indeed requires Some extra steps. In Alteryx I really like the joiner tool because it helps me to avoid many mistakes because I can see right away if some values didn’t match.
I hope someone will develop a tool which is equivalent to Alteryx.
I have mixed feelings about the Join function. While I appreciate that it works the same as in SQL so there is consistency, from a business use case perspective having it work like in alteryx make more sense to me… as each output is mutually exclusive and each can have different paths for processing. Joined records move forward. Left unjoined are sent to X team for review. Right Unjoined are sent to data enrichment. In Knime I guess it just takes an additional node to filter those that were joined as @TardisPilot outlined.
The best of both worlds is to have an Enhancement that has 2 more items in the join mode or better yet 2 additional output ports that are Left Unjoined and Right Unjoined.
@ScottF Is there a better way for us to suggest feature enhancements?