Get rolling dataset from SQL-db

tobias.koetter · April 8, 2020, 8:50am

Hi mabm,
I would store the maximum timestamp from the database records that where loaded in the current workflow execution. Up on the next execution you use a sql query that returns all records that have a timestamp greater then the stored max time stamp from the last time of execution. This ensures two things:

You do not load any duplicates since you have the fixed timestamp. Even better would be a db column with a sequence number since depending on the speed the records get inserted and the granularity of the timestamp you still might get duplicates.
If the workflow didn’t execute for what ever reason up on the next execution it will fetch all rows that haven’t been processed so far.

The max timestamp or sequence number could be stored in the workflow using the KNIME protocol and used in the DB Row Filter node via a Flow Variable. Let us know if you need an example for this.
Bye
Tobias