POST Request concurrency - performance optimisation

Hi all,

I have a question about the REST POST Request node's concurrency option. I want to post a lot of data to an endpoint on the same host, and I have to do it in batches.

However, this slows down the data loading process (it takes about a day to get the data in).

I have the POST Request node's concurrency option set to 64 threads (the maximum), but it's still not optimal. I have added some screenshots so you can see the impact on the computer's resources, but I doubt there is a disk/CPU bottleneck. Any idea what the issue or solution might be?
– I didn't see any related topics on the forum.


I'm on KNIME 3.5.0.

Many thanks!

Herman

The total number of concurrent requests is also limited by the number of threads available for KNIME node execution. This can be set in the preferences and defaults to twice the number of CPU cores. Therefore, setting the concurrency level in the node to 64 won't help much; you also need to increase the thread limit.
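The interaction between the two settings can be illustrated with a small Python sketch (a simulation of the general idea, not KNIME's actual scheduler): asking for 64 concurrent tasks achieves nothing if the global worker pool only provides, say, 4 threads.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

POOL_SIZE = 4    # stand-in for the global "maximum working threads" preference
REQUESTED = 64   # stand-in for the node's concurrency setting

peak = 0         # highest number of tasks ever running at the same time
active = 0
lock = threading.Lock()

def fake_request(_):
    """Pretend to issue one HTTP request (just waits a bit)."""
    global peak, active
    with lock:
        active += 1
        peak = max(peak, active)
    time.sleep(0.01)  # simulate waiting for the server
    with lock:
        active -= 1

with ThreadPoolExecutor(max_workers=POOL_SIZE) as pool:
    list(pool.map(fake_request, range(REQUESTED)))

print(f"requested concurrency: {REQUESTED}, achieved peak: {peak}")
```

The achieved peak never exceeds the pool size: the effective concurrency is min(node setting, available worker threads).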


Hi Thor,

Thanks for the input! I had read about the KNIME thread-limit setting, but I thought it only applied across nodes; thanks for pointing that out.

  • restarted KNIME

However, I don't really see a noticeable difference.

Is there something we are overlooking?
– Command-line parameters to the JVM, or something similar?
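On the JVM-parameters side: KNIME is Eclipse-based, so JVM options live in the knime.ini file next to the executable, and everything after the -vmargs line is passed to the JVM. A minimal fragment, assuming you want to raise the maximum heap (the 8g value is purely illustrative, not a recommendation for this workflow):

```ini
-vmargs
-Xmx8g
```

Note that a larger heap helps memory-bound workflows; it won't by itself speed up requests that are waiting on the server.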

Herman

What are you expecting in the end? You won't see full CPU usage, because even if you issue 64 concurrent requests, they spend most of their time waiting for the server to respond or transmitting data, which doesn't take much CPU.
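This point can be made concrete with a small simulation (plain Python, nothing KNIME-specific): 64 threads that spend their time waiting consume wall-clock time but almost no CPU time.

```python
import threading
import time

def wait_like_a_request():
    time.sleep(0.2)  # stands in for waiting on the server's response

threads = [threading.Thread(target=wait_like_a_request) for _ in range(64)]

wall_start = time.perf_counter()
cpu_start = time.process_time()
for t in threads:
    t.start()
for t in threads:
    t.join()
wall = time.perf_counter() - wall_start
cpu = time.process_time() - cpu_start

# wall time is roughly 0.2 s, while CPU time stays near zero:
# the threads were idle (blocked on I/O), not computing
print(f"wall time: {wall:.2f}s, CPU time: {cpu:.2f}s")
```

In other words, low CPU usage during many in-flight requests is expected for an I/O-bound workload, not evidence of a KNIME problem.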

Hi Thor,

good question!
My "expectation" would be that CPU usage goes to 100%, so that the data loading towards the REST endpoint is as fast as possible.

There are two optimal scenarios, equally good for me, both in the setup where the KNIME project and the REST endpoint service run on the same machine:

A) CPU goes to 100% because the REST endpoint needs to process each request, making the POST Request node wait for a server response.

B) CPU goes to 100% because the REST endpoint can handle each request very quickly, letting the POST Request node "hammer" the endpoint.

Clearly, neither scenario is happening, so there is a bottleneck somewhere. It's not disk or CPU, so it must be software throttling or a timeout. Given that I cannot monitor or change the REST endpoint, I'm looking at optimising the KNIME project.
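One way to narrow such a bottleneck down is simple queueing arithmetic (Little's law): throughput is roughly concurrency divided by per-request latency. A rough sketch with made-up numbers; the latency and request count below are assumptions for illustration, not measurements from this workflow:

```python
# Little's law: throughput (req/s) = concurrency / per-request latency (s)
def expected_throughput(concurrency: int, latency_s: float) -> float:
    return concurrency / latency_s

concurrency = 64
latency_s = 0.5             # hypothetical per-request server latency
total_requests = 1_000_000  # hypothetical number of batched POSTs

rps = expected_throughput(concurrency, latency_s)
hours = total_requests / rps / 3600
print(f"{rps:.0f} req/s -> {hours:.1f} h for {total_requests} requests")
```

Timing a handful of single requests gives the real latency; if the measured throughput is far below concurrency/latency, something between the node and the endpoint (throttling, connection setup, timeouts) is adding delay.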

Not sure if this makes sense?

Thanks,

H

But maybe there is indeed nothing wrong with KNIME's thread handling after all, as I see it spawns 64 connections:

[hvereyck.VEREYCKEN-HXL] ➤ netstat -a | grep 9999
TCP 0.0.0.0:9999 VEREYCKEN-HXL:0 LISTENING
TCP 127.0.0.1:9999 VEREYCKEN-HXL:62222 ESTABLISHED
TCP 127.0.0.1:9999 VEREYCKEN-HXL:62223 ESTABLISHED
TCP 127.0.0.1:9999 VEREYCKEN-HXL:62224 ESTABLISHED
TCP 127.0.0.1:9999 VEREYCKEN-HXL:62225 ESTABLISHED
TCP 127.0.0.1:9999 VEREYCKEN-HXL:62226 ESTABLISHED
TCP 127.0.0.1:9999 VEREYCKEN-HXL:62227 ESTABLISHED
TCP 127.0.0.1:9999 VEREYCKEN-HXL:62228 ESTABLISHED