Compare two tables with different table structure

Dear All,

I have two tables, one is the master, and the other is the raw data, I would like to map the type of offer to the raw data, according to the appearance of the the no. of router installed. The raw data only have column up to “Delivery & Install 4th router”. But I want the output file to contain all columns of the master file… I tried to use joiner to compare, but since the two tables are not having the same no. of column, the compared result is wrong, (I used pivot table of Excel to verified the result)… the Delivery & Install 5th router in the output file got some “1”…But since the raw data do not have this column, i presume the value for the column should be 0… Does anyone know how to fix this problem?
master.xlsx (10.1 KB)
raw data.xlsx (10.5 KB)

Hi @catcat,

you can solve this issue by using the Table Validator (Reference) node in combination with the Missing Value node.

  1. The Table Validator (Reference) node to create the missing columns. The column values are missing values (missing values are represented via red questionnaires).
  2. The Missing Value node to replace all missing values with the value 0.

This little workflow performs the two steps: Compare two tables – KNIME Hub

Cheers
Kathrin

6 Likes

Hi @catcat , I looked at your 2 files. I think you need to first figure out on your side what the business rules should be. We cannot set these rules for you.

Comparisons between the 2 tables on the match columns can be done, and if that’s what needs to be done, then there is no problem. But you need to determine what the rules should be.

For example, if you can specify what should be done for the following questions, it would help:

  1. Should the empty cells in the raw data file be considered as 0?
  2. Should only the 1st, 2nd, 3rd and 4th routers be compared, since only these 4 routers exist in both files? If not, how should the 5th router from the master file be compared with the raw data file?
  3. If we are going to compare only the 4 routers, there will be duplicates. For example, records from the raw data file that match line 4 in the master file (Basic Offer 3) will also match the line 5 in the master file (Basic Offer 4) since the 1st, 2nd, 3rd, and 4th routers are the same for both lines. Similary for line (7 and 10), line (9 and 11), etc. Is that OK?
  4. Finally, what kind of output are you expecting? Is it something like:
    Order No., Deli…& Inst… 1st router, 2nd router, 3rd router, 4th router, type
2 Likes

This topic was automatically closed 182 days after the last reply. New replies are no longer allowed.