KNIME HUB: Schedules getting disabled

Hi KNIME,

Some of our users found that their schedules were disabled after a single failure.

image

We suggested that users set “Execution retries” to 3 instead of 0.
But will this retry the execution and still disable the schedule afterwards?

In the KNIME Server Administration Guide, there was this option:

com.knime.server.job.max_schedule_failures=<number>: 
It is now possible to configure the number of attempts to create a job 
before a schedule is disabled. Set to -1 to deactivate auto-disabling 
of schedules. This was previously hardcoded to 3.

In the KNIME Business Hub Release Notes for 1.13.2, there is this note:

Existing execution contexts are switched to set 
defaultMaxScheduleFailures to unlimited, 
effectively removing the disabling of deployments as a consequence 
of load errors. This migration is only applied 
to execution contexts where 
defaultMaxScheduleFailures is left at the default value of 3.

Where is the “defaultMaxScheduleFailures” set in KNIME HUB?

Is that in the Customization Profile for executors?

I confess that I don’t know the settings for KNIME Server.
The way I read it, setting Execution Retries to 3 will cause it to try 3 times, but it will not disable the schedule unless the Disable schedule box is checked.
For future readers, the Execution Retries box is found in the UI to create or edit a Schedule Deployment, images shown below.

The defaultMaxScheduleFailures value can only be set through the API. If you had previously changed the setting on an Execution Context from its default of 3 to anything else, the upgrade to 1.13.2 will not change it again; it will remain at whatever number you set. If, on the other hand, you had never changed it, then the upgrade to 1.13.2 changes it from the old default of 3 to the new default, which on my instance appears to be 2,147,483,647, a number that is functionally unlimited.
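To make the migration rule concrete, here is a minimal sketch of how I read it. This is not KNIME code; the function name and constants are my own, and the value 2,147,483,647 is simply what I see on my instance.

```python
# Hypothetical illustration of the 1.13.2 migration rule as described
# in the release notes. Names are made up for this example.

OLD_DEFAULT = 3          # previous default of defaultMaxScheduleFailures
INT_MAX = 2**31 - 1      # 2,147,483,647, the value observed after upgrading


def migrated_max_schedule_failures(current: int) -> int:
    """Return the value an execution context would have after the upgrade.

    Only contexts still at the old default are switched to "unlimited";
    any other value is left untouched.
    """
    if current == OLD_DEFAULT:
        return INT_MAX
    return current


print(migrated_max_schedule_failures(3))   # context left at the old default
print(migrated_max_schedule_failures(10))  # context with a custom value
```

Note that a context someone had deliberately set to 3 would look the same as one left at the default, so under this reading it would be switched as well.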

I cannot presently explain why the schedule would have been disabled after a failure if the box for Disable schedule was not checked. I also don’t know what version you are running.

I can tell you that if you have an Application Password set for a user with Admin permissions, you can see how your Execution Context is currently set by using the Swagger examples on the API page of your Hub. In case you haven’t had occasion to browse to that page, it is by default at https://api.yourHubURL/api-doc/ (if your regular Hub URL is https://hub.companyname.com, take everything after the forward slashes and put it in place of yourHubURL in the example above). The service you want is execution-service, and the GET and PUT endpoints are under Execution Contexts.
image
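
If it helps, the URL substitution described above can be sketched in a few lines of Python. The helper name is hypothetical, and the pattern only covers the default location of the API page; your installation may be configured differently.

```python
from urllib.parse import urlsplit


def api_doc_url(hub_url: str) -> str:
    """Build the default API documentation URL from a regular Hub URL.

    Takes the host part after the forward slashes and prefixes it with
    "api.", following the default pattern https://api.yourHubURL/api-doc/.
    """
    parts = urlsplit(hub_url)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"Not a full URL: {hub_url!r}")
    return f"{parts.scheme}://api.{parts.netloc}/api-doc/"


print(api_doc_url("https://hub.companyname.com"))
```

For the example Hub URL this gives https://api.hub.companyname.com/api-doc/, which is the page where execution-service is listed.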

Here are two screenshots of the UI to create or edit a Schedule deployment on the Hub.

image


Hi @llepome

Thank you very much for the detailed answer.

The emails that caused the fuss looked like this:

And we did not know how many users had received emails like this.

In any case, we upgraded to 1.13.3.

Thank you for pointing to the API. Now I see the info: 2,147,483,647 (INT_MAX)

The Keycloak errors may be a larger problem and might be the reason the schedules were disabled. If they persist after your upgrade, I encourage you to open a ticket with Support (see “Best practice how to submit a ticket to support” for tips).
