Get the parent position code using two other columns

Yesterday also I posted a query like this but unfortunately my data set was not clear, apologies for that.

today I have a clear explanation and a data set to explain you all helpers. Below is the table,

I have position name column and position id column both are same but position id column has additional digits at the end. Next to that is the position parent name column where it takes the names from previous couple of columns(position name and id) . I need to get the position parent code from the position code column. if you want you can alter data position id and position parent name column adding few digits at the end. I say to amend here because see below snapshot where we cannot get the precise parent name as they do not have a digit at the end.

Could somebody help me to get this done as I have been trying to get this sorted for couple of weeks and still I am no where closer to get this done by Monday.

thanks in advance here i have attached the data set also.

KNIME HELP.xlsx (10.6 KB)

Hi @psycho

Your desired output is not really feasible at the moment. Just extracting the information only takes one Joiner node. But there are two flaws: the position name and the parent position name are not uniform. Director is written with and without capital D.

Second, there is not distinction for professor on codes 33,34 and 35 and Director respectively. If you do a left outer join, each row gets matched three times. This even besides the fact that for some you need to join on position id while for the others you need to join on position name, which is a bit strange. Why not fill in a position id for all rows?

The first issue is easily solved with a multi column string manipulation node with upperCase($$CURRENTCOLUMN$$)

For the second issue, you can make a start stacking both joiners and choose any of the following.

But that doesn’t take away that you either have make one entry for Professor and Director or make them unique with a position id and then only have one join statement.

A way to tackle this could be to use a duplicate row filter excluding the parent position code column, but then you will end-up with only one ID depending on the logic that you set in the dup row filter (under the Advanced tabsheet).

1 Like

yes we have to get the position parent code either from the position name or position id column because as you can see there are some fields where position id column is blank. I tried as you explained but i get multiple column of duplicates. Is there an easy way to give position id column gen filled because there are more than 8k fields in that column to be filled.

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.