Table Difference Finder and Looping

I need to compare a table in a “target” database to the same table in the “source” to make sure an ETL process was complete. In this case, I may have 50 tables where I grab the table from source and target and manually enter that into each DB Table Selector and then essentially do a Select * on each and run that into the Table Difference Finder node. That works pretty well but I was wondering if I could provide a listing of source and target tables (they are not named the same exactly) and loop through the process and output files to see if there are differences or not. This would allow me to just click start and come back later when it ran through all 50 tables and I can see which had differences. The other caveat is that on some of them, it looks like the rows may not align perfectly so I just sort on PK or 4 different columns to get them sorted first before the compare.

Ideally, I could create that listing of paired tables and let it run.

Hi,

To create a list of database table names, you can, for example, execute a SQL statement using the “DB Query Reader” node to create a data table from the DB table names. The exact statement depends on the database you use. In PostgreSQL, for example, you can use the following SQL to get the DB table names:

====================================================================
SELECT
table_name
FROM
information_schema.tables
WHERE
table_type = ‘BASE TABLE’
AND
table_schema NOT IN (‘pg_catalog’, ‘information_schema’);

=====================================================================
while in MySQL you can use the “SHOW TABLES IN db_name” statement.

Once you have the 2 lists from the Target and Source DB-s, they have to be combined to create the pairs. Do you have the logic already for creating the pairs if the table names are not exactly the same?
After this, you can iterate through each pair using the “Table Row to Variable Loop Start” node and then run a “SELECT *” query for each table of the pair in each iteration using another “DB Query Reader” node before you add the pairwise comparison of the Source and Target tables.
Listing the table names and iterating through all tables in a single DB with a loop would look similar to the screenshot below:


I hope you find this information helpful. Please let us know if you have further questions.

Best regards,
Dóra

1 Like