Dynamic updating

Hello, dear Knimers :innocent:

My question is related to Dynamic Updating.

Let's say I have two tables that are joined on a proper key, and both tables are updated incrementally.
I have two requirements for this setup.

  1. If there are new updates in the source table/file, the data should be updated automatically.
  2. After the join, if something is updated in the source files, the output should also be updated automatically.

How can I handle these cases in KNIME without KNIME Server and scheduling? How do I create a workflow that says "if there is an update, please run the workflow"?

Thanks

Hi Karlygash,
There are two general approaches: push and pull. Either the database pushes changes to the workflow somehow, or the workflow periodically checks the database and updates.

Push is possible with KNIME Server and the REST API, if the database supports triggers and you can call a REST API from those triggers.

Pull is a bit easier, but hard if you do it without scheduling. You can combine a Counting Loop Start with a Wait… node to check the database periodically in your workflow, but for that the workflow must be running non-stop.

Next is the issue of data transfer: you do not want to transfer the whole table with every check. This means you need to cache the data locally and then query only for new data, which in turn means that the rows in the database need a timestamp indicating when they were added. You can then remember the last retrieved timestamp and query the table roughly like this: SELECT * FROM <table> WHERE timestamp > <last_timestamp>. If this query returns a result, you update your cache and redo the join.

As you can see, it is a bit of effort. With KNIME Server you can schedule the workflow instead of relying on a loop, but the general approach stays the same.
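To make the "remember the last timestamp" idea more concrete, here is a minimal sketch in Python, outside of KNIME. It uses an in-memory SQLite database, and the table name `source_table`, the column `added_at`, and the helper `fetch_new_rows` are all made up for illustration. It assumes timestamps are stored as ISO 8601 strings, so comparing them as text matches their chronological order. In a workflow, the same logic would live in the query of your database reader plus a stored variable for the last timestamp.

```python
import sqlite3


def fetch_new_rows(conn, last_ts):
    """Return rows added after last_ts and the updated watermark.

    source_table and added_at are hypothetical names for this sketch.
    """
    rows = conn.execute(
        "SELECT id, payload, added_at FROM source_table "
        "WHERE added_at > ? ORDER BY added_at",
        (last_ts,),
    ).fetchall()
    # If nothing new arrived, keep the old watermark.
    new_ts = rows[-1][2] if rows else last_ts
    return rows, new_ts


# Demo data: two rows exist before the first check.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE source_table (id INTEGER PRIMARY KEY, payload TEXT, added_at TEXT)"
)
conn.executemany(
    "INSERT INTO source_table VALUES (?, ?, ?)",
    [(1, "a", "2024-01-01T10:00:00"), (2, "b", "2024-01-01T11:00:00")],
)

cache = []       # local copy of everything retrieved so far
watermark = ""   # empty string sorts before any ISO timestamp

# First check: everything is new.
rows, watermark = fetch_new_rows(conn, watermark)
cache.extend(rows)
first_count = len(rows)

# A new row arrives in the source, then we check again.
conn.execute("INSERT INTO source_table VALUES (3, 'c', '2024-01-01T12:00:00')")
rows, watermark = fetch_new_rows(conn, watermark)
cache.extend(rows)
second_count = len(rows)
```

Only the second check's single new row is transferred; the join would be redone only when a check returns at least one row. Note that this only catches inserted rows. Rows that are modified in place would need a "last modified" timestamp instead.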
Kind regards,
Alexander


Hi @Karlygash, with any database client tool you use, the results of a SELECT statement do not change or update automatically when the underlying data changes. You need to re-run the SELECT statement to see the updated data. It is the same with KNIME: you will see the updated data once you re-run your DB Reader node.

Regarding this:

There are actually 2 questions here:

  1. What should trigger the check for updates?
  2. How to implement the "run the workflow if there is an update" logic?

It is possible to implement this without KNIME Server and scheduling, probably via an infinite loop that constantly checks for updates, but that is VERY inefficient and very resource-consuming.
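For a source file (rather than a database table), the check itself can be fairly cheap. Below is a rough Python sketch of one way to decide whether a file has changed since the last check, by comparing content hashes. This is just an illustration of the "is there an update?" step and is not KNIME functionality; the helper names `fingerprint` and `has_changed` are made up, and the demo uses a temporary file as a stand-in for the real source.

```python
import hashlib
import os
import tempfile


def fingerprint(path):
    """Return a SHA-256 hex digest of the file's content."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def has_changed(path, last_fingerprint):
    """Compare the current fingerprint with the one from the last check."""
    current = fingerprint(path)
    return current != last_fingerprint, current


# Demo with a temporary file standing in for the source file.
fd, source_path = tempfile.mkstemp(suffix=".csv")
os.close(fd)
with open(source_path, "w") as handle:
    handle.write("id,value\n1,a\n")

changed_first, last_fp = has_changed(source_path, None)      # first check: treated as changed
changed_second, last_fp = has_changed(source_path, last_fp)  # nothing written in between

with open(source_path, "a") as handle:
    handle.write("2,b\n")
changed_third, last_fp = has_changed(source_path, last_fp)   # a row was appended

os.remove(source_path)
```

Something still has to call this check periodically and remember the last fingerprint between runs, which is exactly the part a scheduler handles better than an endless loop.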

There was a thread similar to this where I explained the pros and cons of this (mostly cons) but I can’t find it.


Found the thread I was looking for. This is more about having the workflow handling the scheduling vs having the scheduler:

As per my comments there:
I see a few drawbacks such as:

  1. Your workflow would constantly be running, and looping infinitely. I am not sure how efficient this is, not to mention that it would be using resources constantly.
  2. What happens if the workflow is somehow stopped? What would re-trigger the workflow?
  3. It would mean keeping KNIME open all the time. What if you need to restart KNIME? What would re-trigger the workflow?
  4. Based on what you explained, you are expecting the workflow to be reset by the workflow itself. I don’t think that this is possible.

With the task scheduler:

  1. Your workflow does not have to keep running, so it only uses resources whenever it runs.
  2. You do not have to worry about manually triggering or re-triggering the workflow.
  3. You can restart KNIME independently, without worrying about manually triggering or re-triggering the workflow.
  4. Each trigger via the command line is a new execution and has the option to reset the workflow before executing.
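As a rough illustration of the command-line trigger, here is a small Python sketch that assembles a call to KNIME's batch application, which an OS task scheduler (or a script started by it) could then run. The flags shown are the ones I understand the batch application to accept, so please verify them against the documentation for your KNIME version. The executable path, the workflow directory, and the helper `build_batch_command` are hypothetical placeholders.

```python
def build_batch_command(knime_executable, workflow_dir, reset=True):
    """Assemble the argument list for one batch execution of a workflow.

    Flag names are assumed from KNIME's batch application; check them
    against your installation before relying on this.
    """
    command = [
        knime_executable,
        "-nosplash",
        "-consoleLog",
        "-application",
        "org.knime.product.KNIME_BATCH_APPLICATION",
        f"-workflowDir={workflow_dir}",
    ]
    if reset:
        # Reset the workflow before executing, so each run starts fresh.
        command.append("-reset")
    return command


# Hypothetical paths; replace them with your own installation and workflow.
command = build_batch_command(
    "/opt/knime/knime",
    "/home/user/knime-workspace/JoinUpdate",
)

# To actually launch it from a scheduled script, something like:
#   import subprocess
#   subprocess.run(command, check=True)
```

Because every scheduled run is a fresh process, there is nothing to keep alive between runs, and a restart of KNIME or of the machine does not require re-triggering anything by hand.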
