Sample GitHub account IDs

My goal is to count the number of valid GitHub account IDs. To retrieve this data, I am using KNIME. Because of the GitHub API limits, I have to work with a sample of 5,000 accounts.

There is a limit of 5,000 requests per hour and per authenticated user for User-to-Server requests on GitHub. All requests from OAuth applications authorized by a user or a personal access token owned by the user, and requests authenticated with any of the user’s authentication credentials, share the same quota of 5,000 requests per hour for that user.
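To make the quota concrete: GitHub reports the remaining budget in the `X-RateLimit-Remaining` and `X-RateLimit-Reset` response headers (the latter is a Unix timestamp). A minimal sketch of how a script might decide whether to pause is below. The helper name `seconds_to_wait` is made up for illustration, and it assumes the headers are passed in as a plain dict-like object.

```python
import time


def seconds_to_wait(headers, now=None):
    """Return how long to sleep before the next request (0 if quota remains).

    `headers` is assumed to be the response headers of a GitHub API call.
    """
    remaining = int(headers.get("X-RateLimit-Remaining", 1))
    if remaining > 0:
        return 0
    now = time.time() if now is None else now
    # X-RateLimit-Reset is the epoch second at which the quota is refilled
    reset = int(headers.get("X-RateLimit-Reset", now))
    return max(0, reset - now)
```

With `requests`, this could be called as `time.sleep(seconds_to_wait(response.headers))` after each request.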

With this limitation in mind, how can I retrieve the 5,000 users? I have designed a flow, but I am not sure whether this is the right way to do it.

I need to save the result as JSON.

I also tried the Python code below as another way to fetch the data, but it has a problem.

Code:

import json

import requests

headers = {
    'Authorization': 'token <TOKEN>',  # replace <TOKEN> with your token
}

# collect data by users API (lists accounts with an ID greater than `since`)
id_ = 0
response = requests.get('https://api.github.com/users',
                        params={'since': id_}, headers=headers)
data = response.json()

# collect data by search API; the query goes through `params` so that it is
# URL-encoded (a raw '&' inside q would start a new query parameter)
query = 'created:<2020-01-14 created:>2020-01-13'
response = requests.get('https://api.github.com/search/users',
                        params={'q': query}, headers=headers)
search_data = response.json()
json_formatted_str = json.dumps(search_data, indent=2)
print(json_formatted_str)

Each request returns 30 results. I would like to use a loop to crawl more data.

The following is an example of data[0] from the users API:

{'login': 'mojombo',
 'id': 1,
 'node_id': 'MDQ6VXNlcjE=',
 'avatar_url': 'https://avatars0.githubusercontent.com/u/1?v=4',
 'gravatar_id': '',
 'url': 'https://api.github.com/users/mojombo',
 'html_url': 'https://github.com/mojombo',
 'followers_url': 'https://api.github.com/users/mojombo/followers',
 'following_url': 'https://api.github.com/users/mojombo/following{/other_user}',
 'gists_url': 'https://api.github.com/users/mojombo/gists{/gist_id}',
 'starred_url': 'https://api.github.com/users/mojombo/starred{/owner}{/repo}',
 'subscriptions_url': 'https://api.github.com/users/mojombo/subscriptions',
 'organizations_url': 'https://api.github.com/users/mojombo/orgs',
 'repos_url': 'https://api.github.com/users/mojombo/repos',
 'events_url': 'https://api.github.com/users/mojombo/events{/privacy}',
 'received_events_url': 'https://api.github.com/users/mojombo/received_events',
 'type': 'User',
 'site_admin': False}
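
For reference, a loop over the users API could look roughly like the sketch below. It is only an illustration, not a tested crawler. It relies on the `since` parameter already used above and on `per_page`, which the users API caps at 100, so 5,000 accounts take about 50 requests, well under the hourly quota. The function names and the `users.json` file name are made up for this example.

```python
import json


def fetch_page(since, per_page=100, token="<TOKEN>"):
    """Fetch one page of accounts whose ID is greater than `since`."""
    import requests  # imported here so the paging logic itself needs no network library

    response = requests.get(
        "https://api.github.com/users",
        params={"since": since, "per_page": per_page},
        headers={"Authorization": f"token {token}"},
    )
    response.raise_for_status()
    return response.json()


def collect_users(target=5000, fetch=fetch_page):
    """Page through the users API until `target` accounts are collected."""
    users = []
    since = 0
    while len(users) < target:
        page = fetch(since)
        if not page:  # nothing left to list
            break
        users.extend(page)
        since = page[-1]["id"]  # the next page starts after the last ID seen
    return users[:target]


def save_users(users, path="users.json"):
    """Write the collected accounts to a JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(users, f, indent=2)
```

Calling `save_users(collect_users())` would then write the sample to disk. Passing the fetch function as an argument is just a design choice that lets the loop be tried out with a fake page source before spending any API quota.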

Hi,
you could also use your Python script node inside the loop. Have you already tried that?
br


Hi Daniael,
The Python code throws an error, and I could not fetch the data with a Python loop. Is this possible with a KNIME flow?

It's solved now, using Python nodes.
Thanks.

