We are using the OAuth2 Authenticator to access Google APIs.
However, we notice that after a period of time (approx. 1 hour?) the token is lost and we are unable to get a refresh token.
This causes our data transfers to fail midway.
The refresh token from Google doesn’t, to my knowledge, have a time expiry.
I can’t use the Palladian OAuth2 connector, as it has no blue box connector to the Google Analytics Connector.
Two things: first, can the bug be fixed at some point, please? Second, is there a way to create a blue connector?
The connector does not even make the call; it fails because it has an empty refresh_token, which ought to be coming from the OAuth2 Authenticator. For some reason, after an hour or so that value is no longer available, so the GA node fails and spits out the error without ever making the request.
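For reference, Google access tokens do expire after roughly an hour, but the refresh token itself normally does not; the authenticator is expected to silently exchange it for a fresh access token. A minimal Python sketch of that exchange, assuming the requests library and placeholder credentials from your own Google Cloud project:

import requests

TOKEN_URL = "https://oauth2.googleapis.com/token"  # Google's standard token endpoint

def refresh_access_token(client_id: str, client_secret: str, refresh_token: str) -> str:
    """Exchange a long-lived refresh token for a new ~1 hour access token."""
    resp = requests.post(TOKEN_URL, data={
        "client_id": client_id,          # placeholder: your own OAuth client ID
        "client_secret": client_secret,  # placeholder: your own OAuth client secret
        "refresh_token": refresh_token,  # the value the GA node reports as empty
        "grant_type": "refresh_token",
    })
    resp.raise_for_status()
    return resp.json()["access_token"]

If the refresh_token really is empty after an hour, this exchange can never happen, which matches the failure described above.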
Thanks for the additional details. Maybe I missed something in your initial post. Did you try the Google Auth node, and have you updated KNIME as well as all extensions to the most recent version?
That is quite interesting. I never ran into any API limitations querying GA using the Google API, even though I pretty much exploited the API, extracting raw data on a per-event basis (each marked with a Unix timestamp) with 50k results per query over a time span of years. So with 1,000 queries of 50k results each I managed to get 50,000,000 (50 million) results and didn’t even scratch the quota limit.
Reading the post you linked, it rather feels like the query being done might be inefficient. But I stray from the topic.
Though, even if you use a custom endpoint, the data still needs to come from a Google API, doesn’t it?
I roughly recall that the Google API quota got unintentionally exhausted by peers at a company I once worked for. But that whole topic is so complex that, without direct access to anything, it is virtually impossible to make any recommendation. Rest assured, I do feel your pain.
That reminds me that I slapped my forehead some time ago when GA4 was announced, as I had written an article about quota limitations. It might be that one or another:
Same as yourself, I don’t usually have quota issues, however this time: 96 views × 365 days × 4 years (I know, the client insisted) × 8 queries × an average of 4.5 pages per query ≈ 5M requests.
The quota limit is 50k per day per project.
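For reference, a quick sanity check of that arithmetic in Python (the daily quota is the default per-project value quoted above):

views, days, years = 96, 365, 4
queries_per_view_day = 8
avg_pages_per_query = 4.5
daily_quota = 50_000  # default queries/day/project

total_requests = views * days * years * queries_per_view_day * avg_pages_per_query
print(f"{total_requests:,.0f} requests")                    # 5,045,760
print(f"{total_requests / daily_quota:.0f} days to finish") # ~101 days at the default quota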
Gonna try another tack on the other thread. If some kind soul from KNIME can let me know what localhost URL they use…
I believe this is too application-specific, as the limitation is presumably imposed by the API, not KNIME. Since you mentioned UA, I recalled having read an article about backing up the data recently, too.
Did you, by any chance, read one or both of the following support articles?
About your math resulting in 5 million queries: 50k queries × 50k results (max) per query = 2,500,000,000 results per day.
That should work out. I suppose only specific data is required, like sales, but not everything, correct?
On another note, you said you are using OAuth. When I try the interactive method with the Google Auth node, I see:
That reminded me of the issue I mentioned earlier, where someone else consumed all the credits. Having your own project in the Google Developer Console with your own credentials would help. However, I might have gotten something wrong in our exchange as, like I mentioned, it’s pretty complex without seeing anything.
PS: I believe I was able to reproduce the issue and recall the cause. Since the OAuth method bundles all app-specific requests (difficult to describe) from everyone using that method in KNIME, it all counts towards that one project.
Several years ago I faced that very same issue but cannot locate the post from back then. Though, my memory of it is quite vivid, so I have high confidence in the cause.
To resolve it you really need to create your own project, credentials, etc., which is … challenging.
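As a rough illustration, authenticating against your own project could look like this in Python with the google-auth-oauthlib package, assuming you have downloaded a client_secret.json for an OAuth client you created in the Cloud Console (file name and scope are illustrative):

from google_auth_oauthlib.flow import InstalledAppFlow

SCOPES = ["https://www.googleapis.com/auth/analytics.readonly"]

# Opens a browser for consent; subsequent API usage then counts against
# your own project's quota instead of a shared one.
flow = InstalledAppFlow.from_client_secrets_file("client_secret.json", SCOPES)
creds = flow.run_local_server(port=0)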
ERROR Google Analytics Query (legacy) 4:1406 Execute failed: 403 Forbidden
GET https://analytics.googleapis.com/analytics/v3/management/segments
{
  "code": 403,
  "details": [
    {
      "@type": "type.googleapis.com/google.rpc.ErrorInfo",
      "reason": "RATE_LIMIT_EXCEEDED",
      "domain": "googleapis.com",
      "metadata": {
        "quota_limit_value": "50000",
        "quota_metric": "analytics.googleapis.com/default",
        "service": "analytics.googleapis.com",
        "quota_location": "global",
        "quota_limit": "defaultPerDayPerProject",
        "consumer": "projects/868899538500"
      }
    },
    {
      "@type": "type.googleapis.com/google.rpc.Help",
      "links": [
        {
          "description": "Request a higher quota limit.",
          "url": "https://cloud.google.com/docs/quota#requesting_higher_quota"
        }
      ]
    }
  ],
  "errors": [
    {
      "domain": "usageLimits",
      "message": "Quota exceeded for quota metric 'Queries' and limit 'Queries per day' of service 'analytics.googleapis.com' for consumer 'project_number:868899538500'.",
      "reason": "rateLimitExceeded"
    }
  ],
  "message": "Quota exceeded for quota metric 'Queries' and limit 'Queries per day' of service 'analytics.googleapis.com' for consumer 'project_number:868899538500'.",
  "status": "PERMISSION_DENIED"
}
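For what it’s worth, the reason field in that error body is machine-readable, so a workflow could in principle detect the condition and back off instead of failing outright. A hedged Python sketch; raw stands in for whatever error string your handler receives:

import json

def is_daily_quota_error(raw: str) -> bool:
    """Return True if the error body carries the RATE_LIMIT_EXCEEDED reason."""
    body = json.loads(raw)
    for detail in body.get("details", []):
        if detail.get("@type", "").endswith("google.rpc.ErrorInfo"):
            return detail.get("reason") == "RATE_LIMIT_EXCEEDED"
    return False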
Adding to this, I noticed that the new Google Analytics Connector node might only support GA4, so try using the legacy node.
Yeah, setting up our own credentials was not the issue; that was the easy bit.
We have a few of them to help us get the data we need in good time (15 credentials, used 5 at a time with 2 parallel query nodes, gives us a nice run rate with no errors, and we are able to download a year’s worth of data at a time relatively reasonably). However, one query is generating a lot of data, which takes longer than the OAuth refresh-token timeout. We are archiving top-level metrics at the daily level (non-ranged, so 1 query per day for each master query; that avoids sampling).
Also events and pageviews with several other metrics, so quite a data haul.
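To illustrate the one-query-per-day pattern mentioned above, a small Python sketch that expands a year into single-day start/end pairs (the names are mine, not from any KNIME node):

from datetime import date, timedelta

def daily_ranges(start: date, end: date):
    """Yield (start-date, end-date) pairs covering one day each."""
    d = start
    while d <= end:
        yield d.isoformat(), d.isoformat()  # start == end: a non-ranged query
        d += timedelta(days=1)

for start_date, end_date in daily_ranges(date(2023, 1, 1), date(2023, 12, 31)):
    ...  # issue one unsampled GA query per day here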
The issue is that when you set up a credential you need to specify a redirect_url; this way you can do the auth just like the Interactive method on Google Auth.
I wanted to use our credentials with the Google Auth tool, but we don’t know what redirect_url was set in the KNIME Google Cloud project (they hardcoded this value into the tool, you see).
If I use the OAuth2 connector, it all works, as I can set the redirect_url to whatever I configured in Google Cloud. However, the issue is the timing out or loss of the refresh token, which doesn’t happen with Google Auth, presumably because they factored in that it doesn’t expire.
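For anyone following along: the redirect_url has to match, character for character, a listener you control. A minimal Python sketch of the loopback catcher an interactive flow redirects to (port and response text are illustrative):

import http.server
import urllib.parse

class CodeCatcher(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        # The OAuth provider redirects here with ?code=... in the query string.
        query = urllib.parse.urlparse(self.path).query
        code = urllib.parse.parse_qs(query).get("code", [""])[0]
        print("authorization code:", code)
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"You can close this tab.")

# Handle exactly one redirect on http://localhost:8080/
http.server.HTTPServer(("localhost", 8080), CodeCatcher).handle_request()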
My temp solution is to halve the query date range to half-years rather than full years, which is a faff…
PS: My former calculation was wrong. It wasn’t 50k × 50k but 50k × 10k … still 500,000,000 results, which is plenty. It also seems to be possible to apply for a higher quota.
About your workflow: personally, I’d use the API credentials (as JSON or P12), as you can manage them programmatically, which drastically streamlines your workflow. The Parallel Chunk nodes I’d use with caution, as on many occasions they caused instability (also within KNIME).
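As a sketch of that programmatic approach, assuming a service-account JSON key and the google-auth package (file name and scope are placeholders):

from google.oauth2.service_account import Credentials

SCOPES = ["https://www.googleapis.com/auth/analytics.readonly"]
creds = Credentials.from_service_account_file("service-account.json", scopes=SCOPES)
# No interactive consent and no refresh-token timeout: access tokens are
# minted from the key on demand, which suits long-running transfers.

Note that the service account’s e-mail address still has to be granted access to the Analytics view.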
In regards to the redirect URL: if you deauthorize KNIME and authorize it again, open the browser console and enable persistent logging. That way you should be able to capture the redirect URL.