Enhancement: Service performance in Business Hub

Description:
Service performance can be boosted after fixing this.

I’m happy to find this :partying_face: :partying_face: :partying_face:

Steps to reproduce:

  1. Create a simple json_api_test workflow and deploy it as a Service. Basically, it outputs the current timestamp as JSON.

  2. Create another workflow (json_api_call) that calls the previous Service and times it. The workflow is simple: 500 empty rows call the Service 500 times.

    image

  3. Here is the interesting part: every row returns at the same time, and the Status is Completed in 0 secs

    image

    but in reality, the workflow as a whole takes far too long:

My guess is that the Service calls finished at the same time because the small containers all finished at the same time (if that is correct). But the Call Workflow node waits for the results one by one. If it received all the results at once and reordered them afterwards, performance could be boosted.

Or the Service has a bug (if the Service’s JSON timestamp is wrong). I think the former hypothesis is a bit more likely.
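To make the "receive everything, then reorder" idea concrete, here is a minimal Python sketch. It is not KNIME code and says nothing about how the Call Workflow node is actually implemented: `call_service` is a hypothetical stand-in for one Service call, and `LATENCY_S` is an assumed, purely illustrative latency.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

LATENCY_S = 0.02  # assumed per-call latency, purely illustrative


def call_service(row_index: int) -> dict:
    """Hypothetical stand-in for one Service call."""
    time.sleep(LATENCY_S)  # simulate network plus execution time
    return {"row": row_index, "timestamp": time.time()}


def call_sequentially(n_rows: int) -> list:
    """Wait for each result before sending the next request."""
    return [call_service(i) for i in range(n_rows)]


def call_concurrently(n_rows: int, workers: int = 10) -> list:
    """Send requests concurrently, then restore the original row order."""
    results = [None] * n_rows
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(call_service, i): i for i in range(n_rows)}
        for future in as_completed(futures):
            # Results arrive in completion order; the row index puts
            # each one back into its original slot.
            results[futures[future]] = future.result()
    return results


def timed(fn, *args):
    """Return (result, elapsed seconds) for a single call of fn."""
    start = time.perf_counter()
    out = fn(*args)
    return out, time.perf_counter() - start


if __name__ == "__main__":
    _, seq_time = timed(call_sequentially, 20)
    _, par_time = timed(call_concurrently, 20)
    print(f"sequential: {seq_time:.2f} s, concurrent: {par_time:.2f} s")
```

With one-by-one waiting, the total time grows with the number of rows times the latency; with concurrent dispatch it grows roughly with the number of rows divided by the number of workers.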

Actual results:

Expected results:

Attachments:

OS:

Hello @HaveF,

thanks for reporting! It’s a bit of a mixture. The Call Workflow (Row Based) node provides three settings for what to send to the callee workflow:

  1. From Column (send a selected cell from the input row to the callee)
  2. Custom JSON (send a constant value)
  3. Default (send nothing, use default value of the Container Input node)

In cases 2 and 3 the callee workflow does not get re-executed (because the input did not change - however, this makes sense only if the workflow is deterministic and has no side effects). If you select option 1, you’ll get the expected output.
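The effect described above can be mimicked with a small Python sketch. This only illustrates the observable behaviour (no re-execution while the input stays the same); the class `InputKeyedCaller` and its methods are hypothetical names, and nothing here reflects how the node is implemented internally.

```python
import json
import time


class InputKeyedCaller:
    """Illustrative only: re-runs the callee just when the input changes."""

    def __init__(self):
        self._cache = {}
        self.executions = 0  # how often the callee really ran

    def _run_callee(self, payload):
        # Hypothetical callee: like json_api_test, it returns a timestamp,
        # so it is NOT deterministic.
        self.executions += 1
        return {"timestamp": time.time()}

    def call(self, payload):
        # Identical input means the stored result is reused.
        key = json.dumps(payload, sort_keys=True)
        if key not in self._cache:
            self._cache[key] = self._run_callee(payload)
        return self._cache[key]


if __name__ == "__main__":
    caller = InputKeyedCaller()

    # Options 2 and 3: the same constant input for all 500 rows.
    constant = [caller.call({"value": "fixed"}) for _ in range(500)]
    print(caller.executions, len({r["timestamp"] for r in constant}))

    # Option 1: a different cell value per row forces re-execution.
    per_row = [caller.call({"row": i}) for i in range(500)]
    print(caller.executions)
```

With a constant input the callee runs once and all 500 rows share one timestamp, which matches the "every row returns at the same time" observation. With a per-row input the callee runs once per row.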

And you’re right, in case the workflow is not re-executed, there is no need to actually call it over and over and performance could be drastically improved. I think to improve this, we need to

  1. Improve node descriptions (this is highly unintuitive)
  2. Rethink the behaviour and either go fully in the direction of omitted re-execution, or, more likely, make re-execution the default for options 2. and 3.

Created a ticket: AP-21620

@CarlWitt I’d probably prefer this one. If the workflow has no side effects and the same input always gives the same output, the caching should be handled by a tool like Redis and not by the Service API itself (of course, the Service API could be responsible for it, but then it should be done in a different abstraction layer).

Btw, the option name Default is a bit vague; I wouldn’t understand what it means without your explanation. Maybe it’s better to call it Default value of container input (admittedly a bit long) or something like that.

@CarlWitt Btw, if I add a Table to JSON node after the Empty Table Creator and select From Column (send a selected cell from the input row to the callee) in the Call Workflow node, it is still too slow to be an API. It takes me almost 7 mins to finish 500 calls.

That is far too much time; the API workflow is just getting a timestamp!
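For scale, here is the back-of-the-envelope arithmetic as a tiny Python snippet. It takes "almost 7 mins" as a full 7 minutes, so the per-call figure is an upper bound rather than a measurement.

```python
total_seconds = 7 * 60  # "almost 7 mins", so treat this as an upper bound
calls = 500

seconds_per_call = total_seconds / calls
calls_per_second = calls / total_seconds

print(f"{seconds_per_call:.2f} s per call")       # 0.84 s per call
print(f"{calls_per_second:.2f} calls per second")  # 1.19 calls per second
```

So each call to a workflow that only returns a timestamp costs a bit under a second.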

Since I haven’t used anything like KNIME Server or Edge, I don’t know if this issue is specific to Business Hub, or if it’s been this slow all along…

@CarlWitt I don’t know if you missed this (it takes me almost 7 mins to finish 500 calls). :joy: Just a kind reminder. Thanks!

Best Regards,

Internal ticket ID: AP-21620
Fix version(s): 5.3.0
Other related topic(s): -