24 Hour Restriction of the API after using Google Address Geocoder

Hello!

I used the Google Address Geocoder node to convert 6300 addresses to coordinates. I had to do it twice (approx. 12k calls in total), everything went very well, and I was very happy :smiley:

Unfortunately, a few days later I received the following email from Google:

Dear Developer,
We are happy to see developers using Google Maps Platform Services. However, we recently detected that your Google Cloud/API Project idlike - XXX (id: XXX) could be scraping data from our Google Maps APIs in violation of Google Maps Platform Terms of Service Section 3.2.3.

We work hard to build and maintain high quality data for use in our Google Maps APIs. Scraping, building databases, or otherwise creating permanent copies of data from our Google Maps APIs is strictly prohibited by both the Google Maps Platform Terms of Service and the Google API Terms of Service.

We have restricted your use of the Maps API for 24 hours as granted by the Google Maps Platform Terms of Service Section 5.1 and Google APIs Terms of Service Section 8.a, due to non-compliance with Google Maps Platform Terms of Service Section 3.2.3.

Your previous access to the Maps API will resume in 24 hours, however if your projects continues scraping activity after regaining access, we will suspend it.

Is this expected? Has it happened before? Is there anything I can submit to Google as an appeal so that they don’t permanently suspend my project?

Thanks,
Fabrizio

I have never had this before; it seems like some arbitrary “AI” rule was triggered. I wouldn’t even bother appealing, as you will not get any response anyway. Instead, have a look at the other geocoder which we have in Palladian:

Fingers crossed.


Thanks a lot! I’ll try that out :slight_smile:


Hi @tmalust, this is something I have warned several users about when using any API: do not abuse the API server.

This might be a result of sending too many requests within a given time period. Most APIs document this limit and tell you how many requests you may send per unit of time. If you go beyond it, you might get “banned” temporarily, and eventually permanently if you continue to abuse it.

Servers impose such limits to protect themselves from attacks.

Sending 6300 addresses in 1 shot is a lot.

So, even if you switch to another API such as the one @qqilihq suggested, if you send the requests the same way, you will be banned.

You need to send the requests in batches with some cooling time in between the batches. For example, send batches of 100 requests with a wait time of 5 seconds between batches. Obviously those numbers vary per API, so check its documentation.
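To make the batch-and-delay idea concrete outside of KNIME, here is a minimal Python sketch. The names `send_in_batches` and `send` are made up for illustration, and the batch size of 100 and delay of 5 seconds only mirror the example numbers above; they are not limits documented by any particular API.

```python
import time
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def send_in_batches(
    items: Iterable[T],
    send: Callable[[T], R],
    batch_size: int = 100,   # example value from the post, not an API limit
    delay_s: float = 5.0,    # example value from the post, not an API limit
    sleep: Callable[[float], None] = time.sleep,
) -> List[R]:
    """Call `send` for every item, pausing `delay_s` seconds between batches.

    `send` is a hypothetical callable that performs one API request.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    pending = list(items)
    results: List[R] = []
    for start in range(0, len(pending), batch_size):
        batch = pending[start:start + batch_size]
        results.extend(send(item) for item in batch)
        # Cool down only if another batch is still waiting.
        if start + batch_size < len(pending):
            sleep(delay_s)
    return results
```

You would pass your own request function as `send`, for example a function that geocodes one address, and tune `batch_size` and `delay_s` to whatever the API documentation allows.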

The GET/POST Request nodes already have these options embedded, so you don’t have to use Loops and the Wait node to implement the batch and delay. You can just do it in the GET/POST Request nodes. Look at the Delay and Concurrency options:

The Delay represents how many milliseconds (ms) you want to wait before sending the next batch. So 5000 ms means 5 seconds.

The Concurrency means how many rows you want to send at the same time, so that’s your number of requests per batch.

EDIT: Based on what I’m seeing online, Google has a 50 Requests per second (QPS - I think Query Per Second) rate limit. So 6300 was well over that limit.


Thanks for your answer! I was using the Google Address Geocoder node and I did not think that I should limit the API calls.

Also there is no setting for this in the node itself so I guess for the future I’ll have to manually do it with a loop + Wait node.

Thanks again for the detailed answer!


For clarification:

Sending 6300 addresses in 1 shot is a lot.

EDIT: Based on what I’m seeing online, Google has a 50 Requests per second (QPS - I think Query Per Second) rate limit. So 6300 was well over that limit.

Both geocoders run the requests sequentially, one by one, using a fixed-size thread pool. Thus, in practice you cannot run into a limit of 50/second, as each HTTP request has “enough” overhead that you’ll be far from reaching 50/second (unless you have a direct backbone to the Google servers) :slight_smile:
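As a rough back-of-the-envelope check of that argument, here is a small Python sketch. The 100 ms round-trip time is purely an assumed figure, not a measurement, and the pool size is a parameter because it is not stated here; `max_request_rate` and `latency_needed` are hypothetical helper names.

```python
def max_request_rate(latency_s: float, workers: int = 1) -> float:
    """Upper bound on requests/second when each worker sends one request at a time."""
    if latency_s <= 0 or workers < 1:
        raise ValueError("latency_s must be > 0 and workers >= 1")
    return workers / latency_s


def latency_needed(target_rate: float, workers: int = 1) -> float:
    """Per-request latency (seconds) at which `workers` would reach `target_rate`."""
    if target_rate <= 0 or workers < 1:
        raise ValueError("target_rate must be > 0 and workers >= 1")
    return workers / target_rate


ASSUMED_LATENCY_S = 0.1  # assumption: 100 ms per HTTP round trip

rate = max_request_rate(ASSUMED_LATENCY_S)
print(f"one worker at {ASSUMED_LATENCY_S * 1000:.0f} ms/request: about {rate:.0f} requests/s")
print(f"6300 addresses would then take about {6300 / rate / 60:.1f} minutes")
print(f"reaching 50 requests/s with one worker needs {latency_needed(50) * 1000:.0f} ms/request")
```

Under that assumption a single worker stays around 10 requests per second, and it would need round trips of about 20 ms to get anywhere near 50 per second.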

If you exceed a given rate limit, you’d typically get a 503 response from the API which would be shown as error on the node and the execution would stop, which was obviously not the case here.

According to your feedback, Google obviously triggered some obscure rule about which they are not absolutely sure themselves or which they do not further want to clarify – they write “could be scraping data” (which is somewhat ironic coming from Google). Thus I would just move on to the other service as mentioned above.

–Philipp


There are a lot of rules in that 3.2.3. In that same section of “No scraping”, it says:
For example, Customer will not: (ii) bulk download Google Maps tiles, Street View images, geocodes, directions, distance matrix results, roads information, places information, elevation values, and time zone details

Hi @qqilihq,

Yes, the geocoders will run the requests sequentially one-by-one on their side, but from their point of view, they want to limit how many requests are being sent per second from the same requester. While I agree that you might not get responses for 50 requests/second, they still don’t want you to send more than that, for different reasons:

  1. Protect themselves against attack
  2. Give everyone some chance to jump in the queue. For example, let’s say you get 3 requesters sending 50 requests per second. The queue might end up with:
    1 Request from R1
    2 Request from R1

    50 Request from R1
    51 Request from R2
    52 Request from R2

    100 Request from R2
    101 Request from R3
    102 Request from R3

    150 Request from R3
    151 Request from R4
    etc as other requests come in

Now, if everyone starts sending 6000 requests at once, the queue might end up as:
1 Request from R1
2 Request from R1

6000 Request from R1
6001 Request from R2
6002 Request from R2

12000 Request from R2
12001 Request from R3
12002 Request from R3

18000 Request from R3
18001 Request from R4

Look at where the request from R4 is able to get in.
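The queue positions follow a simple formula, sketched here in Python. It assumes the simplified model from the example, where every requester ahead of you enqueues its whole burst back to back; `first_queue_position` is a made-up helper name, not part of any API.

```python
def first_queue_position(requester: int, burst_size: int) -> int:
    """1-based queue position of the first request from requester number
    `requester` (R1 = 1), assuming each earlier requester enqueues
    `burst_size` requests back to back (simplified model)."""
    if requester < 1 or burst_size < 1:
        raise ValueError("requester and burst_size must be at least 1")
    return (requester - 1) * burst_size + 1


for burst in (50, 6000):
    print(f"burst of {burst}: R4 starts at position {first_queue_position(4, burst)}")
# burst of 50: R4 starts at position 151
# burst of 6000: R4 starts at position 18001
```

So with bursts of 50, R4 waits behind 150 requests, while with bursts of 6000 it waits behind 18000.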

You would not necessarily get a 503 response; Google is probably able to handle the 6000 requests. A paid package probably allows you to send many more than just 50. You’d get a 503 if the server is taken down.

@tmalust , as I mentioned, you do NOT have to implement a Loop + Wait to do the batch and delay. The GET/POST Request already allows you to do that as I explained with the screenshot.
