DB Extract then compare to current DB

Looking for help on which nodes I could use, or a starting point for a workflow for this.
I want to build something I can run each week that extracts the database and time-stamps the result, so that in future weeks I can compare the current database data back to the previous week's extract, or to any other time-stamped extract.
I also use Tableau, so I think I'll then just need to publish to Tableau data sources.
Any ideas?
Thanks.

1. DB Connector.
2. DB Reader / DB Query.
3. Write the output to a location (one file for each week).
4. Use the List Files node and choose a previous file to read.
5. Compare.
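The five steps above could be sketched like this in Python, using an in-memory SQLite table as a stand-in for the real database and CSV files as the time-stamped extracts (the table name, columns, and dates are all made up for illustration; the actual workflow would use KNIME nodes):

```python
import csv
import glob
import os
import sqlite3
import tempfile

# Stand-in database: a small "jobs" table (hypothetical schema).
workdir = tempfile.mkdtemp()
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE jobs (id INTEGER, status TEXT)")
con.executemany("INSERT INTO jobs VALUES (?, ?)",
                [(1, "done"), (2, "done"), (3, "open")])

def snapshot(stamp):
    """Extract the table and write one time-stamped file per week."""
    rows = con.execute("SELECT id, status FROM jobs ORDER BY id").fetchall()
    path = os.path.join(workdir, f"jobs_{stamp}.csv")
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(rows)
    return path

snapshot("2024-01-01")                       # last week's extract
con.execute("UPDATE jobs SET status = 'done' WHERE id = 3")
snapshot("2024-01-08")                       # this week's extract

# "List Files" step: the timestamp in the file name sorts chronologically,
# so the last two files are the previous and current extracts.
files = sorted(glob.glob(os.path.join(workdir, "jobs_*.csv")))
prev, curr = files[-2], files[-1]

def load(path):
    with open(path, newline="") as f:
        return {tuple(row) for row in csv.reader(f)}

changed = load(curr) - load(prev)            # rows added or changed this week
print(changed)                               # {('3', 'done')}
```

The compare step here is a plain set difference; in KNIME the equivalent would be a join or a Table Difference-style comparison between the two read tables.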

Hi there @mr2mike,

welcome to KNIME Community!

If I may ask, what is the reason for the comparison, and what kind of comparison is it?

Here is a link to the database guide, and feel free to browse the KNIME Hub for nodes and workflow examples :wink:

Br,
Ivan

Thanks for the replies.

The database is real time and does NOT track changes that are made by all the employees.
So say we have:
Last week: 10 jobs completed.
This week: 5 jobs completed.

If you poll the database, you will see 15 jobs completed, but there's no way to retrieve how many jobs were completed this week alone.

I know this is an issue with how the database collects data, and it should be updated to track this natively, but I wanted a workaround that doesn't involve dumping to Excel on a weekly basis to track these weekly changes.
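To spell out the arithmetic with the numbers above (variable names are mine): the live database only holds the running total, so the weekly figure has to be recovered by differencing two time-stamped extracts.

```python
# Cumulative completed-job counts, as seen in two weekly extracts.
last_week_total = 10   # total stored in last week's extract
this_week_total = 15   # what polling the live database shows now

# The database never records "this week" directly; differencing the two
# snapshots recovers the jobs actually completed this week.
completed_this_week = this_week_total - last_week_total
print(completed_this_week)  # 5
```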

Does that help?

So far I have gone with:
Oracle Connector
DB Query Reader <-- I need to join a few tables.


Hi there @mr2mike,

thanks for the additional explanation. It sure does help, and it seems you are aware this is a database issue :smiley:

Anyway, you can use KNIME for your task (extract the current data and compare it to a previous extract), but you will still have to “dump” your data somewhere on a weekly basis in order to be able to compare it. I suggest using the Table Writer and Table Reader nodes for this. These nodes use KNIME's internal format, so they are much faster at writing/reading data, and the files take up less disk space.

Regarding database operations, I suggest using database nodes with pushdown logic as much as possible (especially when the data is big and there are a lot of joins, for example), and reading the data into KNIME only after the transformations are done.
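A rough illustration of the pushdown idea, using SQLite as a stand-in for Oracle (the table and column names are invented): the join, filter, and aggregation all run inside the database engine, so only the small result set crosses over to the client / into KNIME, instead of both full tables.

```python
import sqlite3

# Stand-in database with two tables that need joining (hypothetical schema).
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE jobs (job_id INTEGER, emp_id INTEGER, status TEXT);
    CREATE TABLE emps (emp_id INTEGER, name TEXT);
    INSERT INTO jobs VALUES (1, 10, 'done'), (2, 11, 'done'), (3, 10, 'open');
    INSERT INTO emps VALUES (10, 'Ann'), (11, 'Bob');
""")

# Pushdown: the whole query executes inside the database engine; only the
# joined, filtered, aggregated rows are transferred back to the client.
pushed = con.execute("""
    SELECT e.name, COUNT(*) AS done_jobs
    FROM jobs j JOIN emps e ON j.emp_id = e.emp_id
    WHERE j.status = 'done'
    GROUP BY e.name
    ORDER BY e.name
""").fetchall()
print(pushed)  # [('Ann', 1), ('Bob', 1)]
```

In KNIME terms, this corresponds to chaining the brown DB nodes (DB Joiner, DB Row Filter, DB GroupBy) before a single DB Reader, rather than reading each raw table and joining in memory.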

Good luck and in case of any questions feel free to ask!

Br,
Ivan