Identify differences by comparing 2 CSV files

Hi Team,

I would like to discuss the best possible way to compare two CSV files (A.csv and B.csv). Ideally the two are identical in structure, but the upstream data source may occasionally disrupt that structure. We are exploring KNIME's capabilities to identify such differences and write logs about them.

Problem statements:

  1. How to identify if someone added or removed a column in B.csv
  2. How to identify if someone changed the datatype for a particular column in B.csv
  3. How to identify if someone changed the order of columns in B.csv
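
For reference, all three checks come down to comparing two schemas, where a schema is the ordered list of column names with a type per column. Below is a minimal sketch of that logic in plain Python, outside KNIME. It is only an illustration: the function names (`infer_type`, `read_schema`, `compare_schemas`) are hypothetical, the type inference is a naive assumption (int, then float, else string) and not KNIME's type detection, and column names are assumed to be unique.

```python
import csv
import io


def infer_type(values):
    """Naive type guess: 'int', 'float' or 'string' (an assumption, not KNIME's rules)."""
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        return "unknown"
    for name, cast in (("int", int), ("float", float)):
        try:
            for v in non_empty:
                cast(v)
            return name
        except ValueError:
            continue
    return "string"


def read_schema(text_stream):
    """Return an ordered list of (column_name, inferred_type) pairs."""
    rows = list(csv.reader(text_stream))
    if not rows:
        return []
    header, data = rows[0], rows[1:]
    schema = []
    for i, name in enumerate(header):
        column = [row[i] for row in data if i < len(row)]
        schema.append((name, infer_type(column)))
    return schema


def compare_schemas(schema_a, schema_b):
    """Compare reference schema A against B and return the findings as a dict."""
    names_a = [n for n, _ in schema_a]
    names_b = [n for n, _ in schema_b]
    types_a, types_b = dict(schema_a), dict(schema_b)
    # Columns present in both files, in the order each file lists them.
    common_a = [n for n in names_a if n in types_b]
    common_b = [n for n in names_b if n in types_a]
    return {
        "added": [n for n in names_b if n not in types_a],
        "removed": [n for n in names_a if n not in types_b],
        "type_changed": {
            n: (types_a[n], types_b[n])
            for n in common_a
            if types_a[n] != types_b[n]
        },
        "order_changed": common_a != common_b,
    }


# In-memory stand-ins for A.csv and B.csv (made-up sample data).
a = io.StringIO("id,price,name\n1,2.5,x\n2,3.0,y\n")
b = io.StringIO("name,id,price,extra\nx,1,cheap,k\ny,2,3.0,m\n")
report = compare_schemas(read_schema(a), read_schema(b))
print(report)
```

With real files, the `io.StringIO` objects would be replaced by `open("A.csv", newline="")` and `open("B.csv", newline="")`. Order is compared only over the columns both files share, so an added or removed column does not by itself count as a reordering.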

I am trying to achieve this with the following approach:

  1. Read A.csv and B.csv with two CSV Reader nodes, each followed by an Extract Column Header node, then Transpose and Joiner nodes
  2. Table Difference Finder followed by GroupBy and Extract Table Dimension

Unfortunately, I cannot upload the workflow, but I have tried to describe it through the node sequence. I hope that helps to visualize it.

Please do suggest if there is a better way to approach this, as the results aren’t promising.

Hi @akhil_dhir_14

Can you be a lot more specific about “the results aren’t promising”? What is it about the Table Difference Finder that does not do the job?

To me it seems like port 1 of the node covers all your problem statements :wink:
