GET request - size limitation

Dear KNIME Community,
I seem to have encountered an issue that has been posted before but was closed without a final solution.

GET Request's result is too long?

I have been trying to retrieve time series from an internal database using the GET Request node. Since these time series are quite long, I only get a result for a day or two (see Fig. 1) but not for more than two days (see Fig. 2). So I would guess that there is a limit to the amount of data the GET Request node returns.

Fig. 1

Fig. 2

To check, I pasted the URL into a browser and got the entire result without any delay. Memory space is also unlikely to be the issue, given the significant resources allocated to KNIME.

So is there a chance this is a problem with KNIME? (I am currently using version 4.7.3.)
Or is there any additional way to configure the GET Request node?

Thank you very much in advance.

Maybe you can wrap a loop around your GET Request and iterate in one- or two-day intervals to get your data.
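A rough sketch of that idea outside KNIME, in plain Python with the requests library; the endpoint URL and the "from"/"to" query parameters are placeholders, not the real internal API:

```python
# Hypothetical sketch: split the requested period into one-day windows and
# collect the partial results; endpoint URL and parameter names are made up.
from datetime import date, timedelta

import requests

BASE_URL = "https://internal.example.com/api/timeseries"  # placeholder endpoint

def fetch_in_daily_windows(start: date, end: date) -> list:
    results = []
    current = start
    while current < end:
        window_end = min(current + timedelta(days=1), end)
        resp = requests.get(
            BASE_URL,
            params={"from": current.isoformat(), "to": window_end.isoformat()},
            timeout=200,
        )
        resp.raise_for_status()
        results.extend(resp.json())  # assumes the endpoint returns a JSON array
        current = window_end
    return results
```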
br

Hi @Klocker,

  • Did you try increasing the timeout for the request?
  • Could you also enable the “Send large data in chunks” option?
  • Does your server accept asynchronous requests?
  • Are all request headers provided, e.g. Content-Type and Accept with application/json? Nothing missing?

In the Error Handling tab, you can enable the option to output all errors in the result as well, which may shed some light on this situation.

And the obvious question: you can send it with either the GET or the POST method, but does the endpoint accept both? Could you try the POST method?
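For comparison outside KNIME, a minimal sketch of a request with those headers and a longer timeout set explicitly; the URL is a placeholder, not the poster's real endpoint:

```python
# Minimal sketch with explicit Accept/Content-Type headers and a long timeout;
# the URL is a placeholder for the internal endpoint.
import requests

resp = requests.get(
    "https://internal.example.com/api/timeseries",  # placeholder
    headers={"Accept": "application/json", "Content-Type": "application/json"},
    timeout=200,  # seconds
)
resp.raise_for_status()
print(resp.headers.get("Content-Type"), len(resp.content))
```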

Tks,

Denis

Good morning and thank you for your replies.

@Daniel_Weikert
I agree that a loop might be a quick fix. However, the plan for this workflow is to get data for fairly long time series, and looping day by day over several years would most likely slow the workflow down significantly.

@denisfi
I have tried your suggestions:

  • I added 200 seconds. However, the node finishes almost immediately.
  • I tried that. No change in result.
  • I don’t know that. But I checked the box and the result hasn’t changed.
  • Yes, all the necessary information for the API is provided. The URL worked fine in both Swagger and a normal browser.

The option in the Error Handling tab gives us the following reply. Unfortunately, this is also no different from the information we had so far.

(screenshot of the error output)

My guess is that the JSON is cut off at some point, which leads to a syntax error in the JSON format.
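One way to test that guess outside KNIME is to run the raw body through a JSON parser; a truncated body fails with an error position close to the end of the string (minimal sketch, the sample string is made up):

```python
# If the body is truncated, json.loads raises with a position near its end.
import json

def check_json(body: str) -> None:
    try:
        json.loads(body)
        print(f"JSON parses fine ({len(body)} characters)")
    except json.JSONDecodeError as err:
        print(f"JSON broken at character {err.pos} of {len(body)}: {err.msg}")

check_json('{"values": [1, 2, 3')  # truncated example -> error near the end
```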

Finally, I also tried the suggestion to use the POST method. I don't believe the API accepts it, which results in a 405 error.

(screenshot of the 405 error response)

Interestingly, I have another workflow that I've been using for a while in which the GET Request, returning a very large JSON response, works fine. It is a different API, but the node configuration is identical…

Do you have another idea where some configuration might be missing?

Thank you in advance.

@Klocker ,

I believe you can try a few changes:

  • Shorten the period of your request; it will return less data, and you can test whether the problem is about the content itself or a limitation in KNIME.

  • I saw a setting in KNIME's preferences that may help for testing: “Preferred Renderers” > “Others” > “HTTP Result”.

By default, it is set to render only the first 1000 characters, but you can change it to full. MAYBE it can help you.

  • The server's response to the POST said that you need to specify a media type in the request header… try setting “Content-Type” to “application/json” and “Accept” to “application/json”.

  • The POST method works with large amounts of data; GET can have some limitations.

  • Check whether the endpoint works synchronously or asynchronously too.

Try it and give us some insights…

Tks,

Denis

Can you try to extract all HTTP headers from the response into columns? It could be that the server is sending a chunked response which, ironically, I believe the REST nodes cannot handle properly. You should see this in the Transfer-Encoding response header.
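For reference, a small sketch of the same check outside KNIME, assuming the requests library and a placeholder URL: inspect the Transfer-Encoding response header and stream the body:

```python
# Sketch: look at the Transfer-Encoding header and stream a possibly chunked body.
import requests

resp = requests.get(
    "https://internal.example.com/api/timeseries",  # placeholder
    stream=True,
    timeout=200,
)
print("Transfer-Encoding:", resp.headers.get("Transfer-Encoding"))  # "chunked" if chunked

body = b"".join(resp.iter_content(chunk_size=8192))
print("Received", len(body), "bytes")
```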

Hi @thor, but the node was set to send large data in chunks. Still, I understand and agree with you about extracting the response headers to see something else…

Tks,

Denis

The option only affects the request that is sent by the node, not the response that is received from the server.

Uhhh… so I suggest an upgrade for this node for future use too… lol

tks,

Denis

Yes, loops are generally slow; maybe a parallel chunk loop could help a little. If it's not too much overhead, I would give it a try just to see the performance. Also, have you looked at the API doc? Maybe there is some additional useful info there.
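As a rough illustration of the parallel idea outside KNIME (hypothetical endpoint and parameter names, using a thread pool to fetch the daily windows concurrently):

```python
# Sketch: fetch per-day windows concurrently instead of sequentially.
from concurrent.futures import ThreadPoolExecutor
from datetime import date, timedelta

import requests

BASE_URL = "https://internal.example.com/api/timeseries"  # placeholder

def fetch_day(day: date) -> list:
    resp = requests.get(
        BASE_URL,
        params={"from": day.isoformat(), "to": (day + timedelta(days=1)).isoformat()},
        timeout=200,
    )
    resp.raise_for_status()
    return resp.json()

days = [date(2023, 1, 1) + timedelta(days=i) for i in range(10)]
with ThreadPoolExecutor(max_workers=4) as pool:
    daily_results = list(pool.map(fetch_day, days))
```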
br

Good morning everyone,

thank you all for your replies and suggestions.

@denisfi

  • I tested this with a shortened request, and then it works fine.
  • Unfortunately, I don’t seem to have the option you are referring to.
  • Adding the media type information to the request header using POST did not change the error message I get.
    (screenshot of the unchanged error message)
  • I then also tried to add these entries in the response header (just out of curiosity) but it didn’t change anything either.

@thor

If this really comes down to some limitation in the GET Request node, I would really appreciate some future upgrades there too.

@Daniel_Weikert

  • I did look at the API doc and there are no anomalies. As for the chunk loop, I agree that it might work better than a regular loop, but my performance reservations still remain.

It’s very likely caused by the chunked response. I will open a feature request.


Great, thank you very much. In the meantime I will try to work my way around this issue with some loops or, even better, some temporary database storage. Let's see what works best.
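A minimal sketch of that temporary-storage idea, assuming each daily chunk is kept as raw JSON in a local SQLite table (the file, table, and column names are made up for illustration):

```python
# Sketch: cache each day's raw JSON payload in a local SQLite database.
import json
import sqlite3

conn = sqlite3.connect("timeseries_cache.db")
conn.execute("CREATE TABLE IF NOT EXISTS chunks (day TEXT PRIMARY KEY, payload TEXT)")

def store_chunk(day: str, records: list) -> None:
    conn.execute(
        "INSERT OR REPLACE INTO chunks (day, payload) VALUES (?, ?)",
        (day, json.dumps(records)),
    )
    conn.commit()

store_chunk("2023-01-01", [{"t": "2023-01-01T00:00", "value": 1.23}])
print(conn.execute("SELECT day, length(payload) FROM chunks").fetchall())
```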

Thank you all for your support!

That's not a public, free API (for testing), is it?
br

Hi, unfortunately not. It’s all company internal.

Good morning @thor ,

has there been any development on the potential feature request regarding the issue with the GET request?

Best Regards

Unfortunately not. We have other priorities at the moment.

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.