I have 2 data sets with employee names in each. In one excel, it includes 3 columns:
A1) First Name
A2) Preferred Name
A3) Last Name
B1) First Name
B2) Last Name
Unfortunately, some names are registered in the 2nd excel (Excel B) with the difference in first name (Some are using A1 as the first name, and some are using A2 as the first name.
How do I create a joining method that relies in the rules of:
If B1 = A1 OR, B1= A2
B2 = A3
Then Join the data sets.
In other words:
IF First name (B) matches First name (A), or First name (B) matches Preffered name (A)
And Last name (B) matches Last Name (A) - then Join data sets.
Thanks in advance!
I use this H2 based custom joiner component that @takbb made. The syntax is easy and it really opens up a ton of advanced join options.
The other option is basically joining / concatenating together multiple joiner node outputs.
It doesn’t seem to work after inputting this function. Do you have any idea why?
I used the function:
t1.“First Name UC” = t2.“Split_0” or t1.“Prefered First Name UC” = t2.“Split_0” and t1.“Last Name UC” = t2.“Split_1”
Before you can check if the first_name or preferred_name from A matches the first_name of B, you have to join both tables A en B.
In this wf you join them twice. Concat the results, and use a rule engine to filter the matches.
It looks like the 1st “or” clause should be in parenthesis logic wise. I am not in the office, so I can’t troubleshoot it or get to my work for syntax examples until later today.
You also need to make sure that the data types are H2 compliant. I often have to convert data types and then return them back after the join.
The approach that @HansS posted works just fine for this situation. I do think it is worthwhile to get good with the H2 join approach as well. At some point you will likely get a complex multi test / multi approach join challenge. A few of these components with some complex formulas can save hundreds of nodes dramatically simplify reviewing, proofing and adjusting.
Thank you but I have one enquiry - what is the point of the last joiner node? I am having troubles understanding what to join here?
Thank you I would like to learn more about this - perhaps whenever you have the time could you provide me with the right logic so I can use it someday?
The logic for the last joiner was to check which records from table B have a match with records from table A (not in scope of your initial question). I guess you have some expectations in advance, and it is always good (i.m.o.) to check your results and see if it match your expectations.
To add, for
B1 = A1 OR, B1= A2 you can use one joiner, add both comparisons and set the option any of the following.
I should be able to upload a sample workflow later today or tomorrow so that you can see how it works.
I was thinking about that as an option but it did not allow me to add (AND last names match)
Very true! Thank you HanS!!!
Forum H2 Join.knwf (134.9 KB)
I used the RowID and Reference Column Filters to show the unmatched left and right for the inner Join.
Thanks for this. Unfortunately the node: Join Custom condition keeps giving me an error and these two messages pop up. When I look for updates - it returns with nothing. Do you have any idea how I can fix this?
What version of KNIME are you running? You can open or expand the component (and any metanodes inside) to see all of the nodes that make it up. That may help you pinpoint the problem if it is a “missing node”.
The latest - there are no more updates available. I opened it and I believe the problem lies in the DB Query Reader
Is there any additional info on the error if you click on the the DB Query Reader and then hover over the ? Did the H2 Connector that feeds it appear to work?
Thanks @iCFO for providing the support for this component.
@Menoufi are you actually getting an error when you attempt to execute the workflow? Warnings with yellow exclamation marks aren’t generally actual fatal errors and “warnings on load” can happen for a variety of reasons and often don’t prevent execution unless there are actually nodes missing (such as version incompatibility such as @iCFO is alluding to).
This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.