GET request not returning all data

Using Postman for a GET request, I’m successfully returning all 400+ records and its 17k rows of JSON data:

I then setup the exact same request in KNIME. But as you can see the end of the results here pasted into Notepad++, it’s only returning 50+ records with 800 rows of JSON:

I even tried increasing the timeout as was suggested in other posts:

But the result is the same. It only gets 50 records instead of the almost 500 records. Any ideas?

Hi @austinrecords, can you share the workflow? Then I can have a closer look. Other than that, is it possible that there a slight differences in the setup of the request (e.g. in the headers)?

Hi @austinrecords , looking a the 2 screenshot, it looks like they are not using the same URL. The URL in Postman seems much longer than the one you are using in Knime. Could it be that it’s missing some parameters contained in the URL?

This has nothing to do with timeout by the way.

And to @JoergWas 's point, you are showing us the headers sent via Postman, but we can’t see what headers you are sending via Knime.

Can you also show us the Response Headers that you are receiving?

1 Like

@JoergWas - Unfortunately, I can’t share the workflow because it involves accounts with my job and the company I work at. As for your question about there being a difference in the headers, or anything for that matter, I’m fairly confident I set everything up the same between Postman and Knime. But I’ve been wrong before :slight_smile: , so perhaps you can tell me for sure. Here are the headers for both.

Postman:

Knime:

@bruno29a - the thought occurred to me in the original posting that people might see the default google address covered up and ask about it. But I figured the chances were low. My apologies, I should’ve explained that the address is indeed the same, it’s just coming from a String Configuration node passed in as a flow variable.

The weird thing is that I get no errors whatsoever, it just returns partial data instead of all as if it’s supposed to.

Hi @austinrecords , understood for the URL, and your request headers in Knime seem to be the same as what you are sending via Postman. How long did you run the requests (Postman and Knime) from each other? Could it be that the data changed by the time you ran the request via Knime?

Can you also show us what the Response Headers did you get? What is the size that the headers show?

Thanks for looking into this, @bruno29a. I ran the requests randomly. Sometimes immediately after, sometimes days after. It never seems to matter. Knime always only gets a portion of the data while Postman gets all. Here are the response headers from Postman in 2 images due to having to scroll:


Here is all that Knime has for this:

Hi @austinrecords , can you check the box “Extract all headers” next time you run it again? It will show us all the headers

@austinrecords, as far as i can see Postman automatically adds autogenerated headers to requests. Have you checked them?

@bruno29a: I checked the box and ran it again. Still it only had those 2 lines, but this time they were grayed out:

@duristef: these are what’s shown when I look at those hidden autogenerated headers from Postman:

The only reason I can think of which is compatible with a correct JSON result in both cases (nearly 500 records vs 50+) is that a different “quantity” parameter is involved in the two requests. The API response could be “paginated” and in that case the request process requires a loop, which is managed by Postman but not by Knime

Hi @austinrecords , those 2 lines are only for configuration, in case you are not extracting all headers. These are not the values of the headers. The headers will be included in the response. Can you show us what values for the headers were retrieved?

Agree or there is some API key / Auth / Cookie involved and without it you get a limited API that returns only a partial response for trial/testing.

1 Like

Hey @bruno29a, forgive me for I’m new to Knime and not sure where to get those headers included in the response. Here’s the spec of what comes back. Is there somewhere else I should look for this?

@duristef: along the lines of your thinking, in the image above I see “52” value for X-Total-count. That’s definitely not something I’m setting, and I don’t know if that gets auto-set or if it’s just showing a result of how many records that’re coming back.

@austinrecords I guess that X-Total-Count is a header that returns the total number of items in the response body (you can check that), but it’s definitely not a setting. What we’re looking for has to do with the request:
a) if it lacks auth parameters (key, cookie) and the result is limited by default (as @kienerj suggests) there’s nothing we can do here
b) if it lacked a parameter that tells the API “return all the items”, we would find it in the GET string or in the headers, but apparently we don’t
c) if the result is paginated there must be some kind of “hook”, a value that tells your app when to go on requesting items and when to stop. It can be a token (in this case: same conclusion as in a) or something hard-coded in the JSON file, for example “next-url” or a page number
I can’t think of anything else

Yea I’m running out of ideas myself. Just to see, I replicated the same call using R:

It resulted in getting all records (397). You can see in the lines above I only added the 2 headers and the basic authentication. That’s the same simple thing I did in Postman and Knime. Still for some reason, Knime won’t pull back all the records. I suppose I could put this in an R Snippet and parse out the json. But the whole reason I’m trying to use Knime is to get away from coding everything. I’m new to Knime, but love it and am trying to convince those above me at work that we could use it as a viable tool. However, not being able to make a simple REST call is not helping my case.

In any event, here is the returned object from R in case you see anything I’m missing. I’m pretty much out of ideas after this:

$ url        : chr "(hiding this since it's related to my job)"| 
 $ status_code: int 200
 $ headers    :List of 20
  ..$ server                   : chr "snow_adc"
  ..$ date                     : chr "Wed, 13 Apr 2022 19:30:18 GMT"
  ..$ content-type             : chr "application/json;charset=UTF-8"
  ..$ transfer-encoding        : chr "chunked"
  ..$ connection               : chr "keep-alive"
  ..$ set-cookie               : chr "BIGipServerpool_(hiding this since it's related to my job); httponly; secure; path=/; SameSite=None"
  ..$ set-cookie               : chr "JSESSIONID=EE31E6744CD721D10A10D4A40647C7BA; Path=/; HttpOnly; secure; SameSite=None"
  ..$ server-timing            : chr "sem_wait;dur=0, sesh_wait;dur=0"
  ..$ content-encoding         : chr "gzip"
  ..$ set-cookie               : chr "glide_user=; Max-Age=0; Expires=Thu, 01-Jan-1970 00:00:10 GMT; Path=/; HttpOnly; secure; SameSite=None"
  ..$ set-cookie               : chr "glide_user_session=; Max-Age=0; Expires=Thu, 01-Jan-1970 00:00:10 GMT; Path=/; HttpOnly; secure; SameSite=None"
  ..$ set-cookie               : chr "glide_user_route=glide.d9df430433a4d35cf67ed960e230ab00; Max-Age=2147483647; Expires=Mon, 01-May-2090 22:44:25 "| __truncated__
  ..$ x-is-logged-in           : chr "true"
  ..$ x-transaction-id         : chr "b8022d5287ba"
  ..$ set-cookie               : chr "glide_session_store=74022D5287BA8D501E3B326E0EBB353E; Max-Age=1800; Expires=Wed, 13-Apr-2022 20:00:18 GMT; Path"| __truncated__
  ..$ x-total-count            : chr "397"
  ..$ pragma                   : chr "no-store,no-cache"
  ..$ cache-control            : chr "no-cache,no-store,must-revalidate,max-age=-1"
  ..$ expires                  : chr "0"
  ..$ strict-transport-security: chr "max-age=63072000; includeSubDomains"
  ..- attr(*, "class")= chr [1:2] "insensitive" "list"
 $ all_headers:List of 1
  ..$ :List of 3
  .. ..$ status : int 200
  .. ..$ version: chr "HTTP/1.1"
  .. ..$ headers:List of 20
  .. .. ..$ server                   : chr "snow_adc"
  .. .. ..$ date                     : chr "Wed, 13 Apr 2022 19:30:18 GMT"
  .. .. ..$ content-type             : chr "application/json;charset=UTF-8"
  .. .. ..$ transfer-encoding        : chr "chunked"
  .. .. ..$ connection               : chr "keep-alive"
  .. .. ..$ set-cookie               : chr "BIGipServerpool_(hiding this since it's related to my job)=9acd2069d914487e27f5c0b3dda36f84; httponly; secure; path=/; SameSite=None"
  .. .. ..$ set-cookie               : chr "JSESSIONID=EE31E6744CD721D10A10D4A40647C7BA; Path=/; HttpOnly; secure; SameSite=None"
  .. .. ..$ server-timing            : chr "sem_wait;dur=0, sesh_wait;dur=0"
  .. .. ..$ content-encoding         : chr "gzip"
  .. .. ..$ set-cookie               : chr "glide_user=; Max-Age=0; Expires=Thu, 01-Jan-1970 00:00:10 GMT; Path=/; HttpOnly; secure; SameSite=None"
  .. .. ..$ set-cookie               : chr "glide_user_session=; Max-Age=0; Expires=Thu, 01-Jan-1970 00:00:10 GMT; Path=/; HttpOnly; secure; SameSite=None"
  .. .. ..$ set-cookie               : chr "glide_user_route=glide.d9df430433a4d35cf67ed960e230ab00; Max-Age=2147483647; Expires=Mon, 01-May-2090 22:44:25 "| __truncated__
  .. .. ..$ x-is-logged-in           : chr "true"
  .. .. ..$ x-transaction-id         : chr "b8022d5287ba"
  .. .. ..$ set-cookie               : chr "glide_session_store=74022D5287BA8D501E3B326E0EBB353E; Max-Age=1800; Expires=Wed, 13-Apr-2022 20:00:18 GMT; Path"| __truncated__
  .. .. ..$ x-total-count            : chr "397"
  .. .. ..$ pragma                   : chr "no-store,no-cache"
  .. .. ..$ cache-control            : chr "no-cache,no-store,must-revalidate,max-age=-1"
  .. .. ..$ expires                  : chr "0"
  .. .. ..$ strict-transport-security: chr "max-age=63072000; includeSubDomains"
  .. .. ..- attr(*, "class")= chr [1:2] "insensitive" "list"
 $ cookies    :'data.frame':	5 obs. of  7 variables:
  ..$ domain    : chr [1:5] "#(hiding this since it's related to my job).service-now.com" "#(hiding this since it's related to my job).service-now.com" "#(hiding this since it's related to my job).service-now.com" "#(hiding this since it's related to my job).service-now.com" ...
  ..$ flag      : logi [1:5] FALSE FALSE FALSE FALSE FALSE
  ..$ path      : chr [1:5] "/" "/" "/" "/" ...
  ..$ secure    : logi [1:5] TRUE TRUE TRUE TRUE TRUE
  ..$ expiration: POSIXct[1:5], format: NA NA "2090-05-01 17:44:25" NA ...
  ..$ name      : chr [1:5] "(hiding this since it's related to my job)" "JSESSIONID" "glide_user_route" "glide_user_activity" ...
  ..$ value     : chr [1:5] "9acd2069d914487e27f5c0b3dda36f84" "EE31E6744CD721D10A10D4A40647C7BA" "glide.d9df430433a4d35cf67ed960e230ab00" "U0N2M18xOkhLMVZJaTROODI0QndxaGI2ekZzenQ3Z1NHVUtqMHo1S2ZCQWtuMFlDZU09OlJCTnp1UUdxUS9xQXJIOHo0MTdxS05VWXpJK0xibnp"| __truncated__ ...
 $ content    : raw [1:265365] 7b 22 72 65 ...
 $ date       : POSIXct[1:1], format: "2022-04-13 19:30:18"
 $ times      : Named num [1:6] 0 0.0295 0.1258 0.3404 0.8651 ...
  ..- attr(*, "names")= chr [1:6] "redirect" "namelookup" "connect" "pretransfer" ...
 $ request    :List of 7
  ..$ method    : chr "GET"
  ..$ url       : chr "(hiding this since it's related to my job)"| __truncated__
  ..$ headers   : Named chr [1:2] "application/json" "application/json"
  .. ..- attr(*, "names")= chr [1:2] "Content-Type" "Accept"
  ..$ fields    : NULL
  ..$ options   :List of 4
  .. ..$ useragent: chr "libcurl/7.77.0 r-curl/4.3.2 httr/1.4.2"
  .. ..$ httpauth : num 1
  .. ..$ userpwd  : chr "hiding this since it's due to my job"
  .. ..$ httpget  : logi TRUE
  ..$ auth_token: NULL
  ..$ output    : list()
  .. ..- attr(*, "class")= chr [1:2] "write_memory" "write_function"
  ..- attr(*, "class")= chr "request"
 $ handle     :Class 'curl_handle' <externalptr> 
 - attr(*, "class")= chr "response"

Could results depend on a time span parameter? For example: the 52 items you get with Knime are the ones which refer to last week or month

In a last attempt to solve the problem I would suggest you try to copy the request headers of Postman to those of Knime. To do that you should start with the “Generic REST Template” of the Get Request node


and then add some custom headers.
I’m sorry I can’t be of more help

An update: prior to reading the last message from @duristef , I decided to delete the entire workflow, reboot, update my Knime, and build it from scratch again. This time it worked. I can’t explain it. I have no reasoning for it working now other than I’m sure I was doing something wrong before, I just don’t know what it was. Thanks to all who offered help, and I hope I didn’t waste your time.

3 Likes

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.