Trouble with API responses

We have a workflow API that runs on KNIME Server Medium 4.10.2. Periodically, when the load is high, our API stops responding. This lasts from 5 to 20 minutes, after which performance is restored.
I have attached the log.
Do you have any idea what the problem is?
localhost_cutted.txt (3.2 MB)

The server waits ~1 minute for a workflow to load. After that it cancels and discards the job. This is in line with the “ProgressMonitor is canceled” message that you get. You can increase the load timeout by passing the “timeout” parameter when loading a job.

It’s not a job. It’s an API workflow, deployed via KNIME Analytics Platform. How can we pass the “timeout” parameter in this case?

Config Info

  • com.knime.server.server_admin_groups: admin
  • com.knime.server.webportal.debug: false
  • com.knime.server.login.jwt-lifetime: 30d
  • com.knime.enterprise.executor.embedded-broker: false
  • com.knime.server.csp-report-only: false
  • com.knime.server.job.default_load_timeout: 1m
  • com.knime.server.job.swap_check_interval: 1m
  • com.knime.server.executor.knime_exe: /opt/knime/knime-latest/knime
  • com.knime.server.executor.start_port: 50100
  • com.knime.server.job.default_swap_timeout: 1m
  • com.knime.server.executor.max_instances: 200
  • com.knime.server.repository_path: /srv/knime_server
  • com.knime.server.job.max_execution_time:
  • com.knime.server.executor.update_metanodelinks_on_load: false
  • com.knime.server.executor.skip_teamspace_mount: false
  • com.knime.server.webportal.title_label: WebPortal
  • com.knime.server.job.discard_after_timeout: true
  • com.knime.server.job.max_lifetime: 7d
  • com.knime.server.default_mount_id: knime-server
  • com.knime.server.webportal.sketcher_page: VAADIN/sketcher/sketcher.html
  • com.knime.server.executor.reject_future_workflows: true
  • com.knime.server.server_admin_users: knimeadmin
  • com.knime.server.job.status_update_interval: 60s
  • com.knime.server.job.default_report_timeout: 1m
  • com.knime.server.webportal.hide_version: false
  • com.knime.server.executor.max_lifetime: -1
  • com.knime.server.webportal.sketcher_size: 300.0x300.0
  • com.knime.server.config.watch: true
  • com.knime.server.executor.prestart: true
  • com.knime.server.webportal.csp: default-src 'self'; script-src 'unsafe-inline' 'unsafe-eval' 'self'; style-src 'unsafe-inline' 'self'; img-src 'self' data:;
  • com.knime.server.webportal.disable_warning_messages: true
  • com.knime.server.job.max_time_in_memory: 60m
  • com.knime.server.webportal.disable_report_preview: false

Hello Sergey,
the sixth row from the top in your config is: com.knime.server.job.default_load_timeout: 1m. You can set the default timeout to something higher here. Additionally, when making a call to the workflow you can specify the query parameter timeout=<time> to specify the timeout in milliseconds for that particular call.
Kind regards
Alexander
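
For illustration, here is a minimal Python sketch of how the per-call timeout could be attached to the request URL. The host name, the workflow path, and the `/rest/v4/repository/<workflow>:execution` endpoint layout are assumptions for this example, so please check them against your own server's REST documentation. The helper `build_execution_url` is hypothetical and only builds the URL; it does not send a request.

```python
from urllib.parse import quote, urlencode


def build_execution_url(base_url: str, workflow_path: str, timeout_ms: int) -> str:
    """Build a workflow execution URL that carries a per-call timeout.

    Assumption: the endpoint layout is
    <base_url>/rest/v4/repository/<workflow_path>:execution
    The timeout is given in milliseconds, as described above.
    """
    # Percent-encode the workflow path but keep the slashes between folders.
    encoded_path = quote(workflow_path.strip("/"))
    query = urlencode({"timeout": timeout_ms})
    return f"{base_url.rstrip('/')}/rest/v4/repository/{encoded_path}:execution?{query}"


# Hypothetical server and workflow; 120000 ms corresponds to 2 minutes.
url = build_execution_url("https://knime.example.com/knime", "/api/my_workflow", 120_000)
print(url)
```

The resulting URL can then be used with whatever HTTP client you already call the workflow with.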

Thanks a lot.
I still have a small question. The average API response time is currently about 10 seconds. Why might 1 minute not be enough to load the workflow, and can using job pools help?

Hi Sergey,
with a high-load scenario like yours Job Pools can certainly help. The workflows will not be loaded for every call and it seems like you are mostly running one particular workflow, so it should have a significant effect on response time. Please let us know of your findings!
Kind regards
Alexander

Hi. We changed the settings: we set 20 job pools and default_load_timeout=2m.
After that, the server worked normally for 6 hours instead of 1-1.5 hours, but then the same problem arose.
Today we set 50 job pools and will see what happens.

Last update: we still have the issue, but we found a workaround: automatically rebooting KNIME Server every hour.

Hi @Sergey_Bazov,
KNIME Server periodically (by default every 24 hours) recycles its executor. Maybe it is enough for you to set that time to 1h instead, so you don’t have to reboot the whole server. In your knime-server.config file, change the following setting:

com.knime.server.executor.max_lifetime=<duration with unit, e.g. 60m, 36h, or 2d>

https://docs.knime.com/2018-12/server_admin_guide/index.html#knime-server-configuration-file
Kind regards
Alexander
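
If the setting is changed by a script rather than by hand, a small Python sketch like the following could do it. It assumes the config consists of plain `key=value` lines, as in the setting shown above. The helper name `set_config_value` and the sample text are hypothetical, and `60m` is just one of the example durations mentioned.

```python
def set_config_value(config_text: str, key: str, value: str) -> str:
    """Return config_text with `key` set to `value`; append the key if absent."""
    new_line = f"{key}={value}"
    lines = config_text.splitlines()
    replaced = False
    for i, line in enumerate(lines):
        # Match only an exact, active "key=..." line (not comments or longer keys).
        if line.split("=", 1)[0].strip() == key:
            lines[i] = new_line
            replaced = True
    if not replaced:
        lines.append(new_line)
    return "\n".join(lines) + "\n"


# Hypothetical excerpt of a knime-server.config file.
sample = (
    "com.knime.server.executor.max_lifetime=-1\n"
    "com.knime.server.executor.prestart=true\n"
)
updated = set_config_value(sample, "com.knime.server.executor.max_lifetime", "60m")
print(updated)
```

In practice you would read the real file, pass its text through the function, and write the result back after making a backup copy.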