Getting an Error on HttpRetriever Status Code 429

Hi Everyone,
I am trying to visit HTML links and parse data from them with this workflow: GetData.knwf (43.4 KB)
In the first part I collect many URLs, and that works well. But in the second part, when I try to parse the pages and extract the data, HttpRetriever returns status code 429 (as you may already know, it means “Too Many Requests”).
I tried to get around it with a chunk loop and some waiting nodes, but I couldn’t solve it this way. I’m new to KNIME and a novice, so I really need tips of this kind.
Does anybody have advice or a sample workflow for getting past “too many requests”?

Regards,
Emine

Hi Emine,

Some kind of waiting would be the obvious way to go. As these rate-limiting strategies are typically a “black box”, finding a delay that works involves tedious trial and error.
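
To make the waiting idea concrete outside of KNIME, here is a minimal Python sketch of retrying with a growing delay. It is not how the HttpRetriever node works internally. The names `retry_delay` and `fetch_with_backoff`, the `fetch` callable, and the default values for `base`, `cap`, and `max_attempts` are all assumptions made for illustration. The sketch only reads the seconds form of the `Retry-After` header and expects it under exactly that key; many servers do not send it at all.

```python
import time
from typing import Callable, Mapping, Tuple

# Hypothetical response shape for this sketch: (status code, response headers).
Response = Tuple[int, Mapping[str, str]]


def retry_delay(attempt: int, headers: Mapping[str, str],
                base: float = 1.0, cap: float = 60.0) -> float:
    """Return the seconds to wait before the next try (attempt is 0-based)."""
    retry_after = headers.get("Retry-After", "").strip()
    if retry_after.isdigit():
        # The server stated a wait in seconds; use it, up to the cap.
        return min(float(retry_after), cap)
    # No usable hint: double the wait on every attempt, up to the cap.
    return min(cap, base * 2 ** attempt)


def fetch_with_backoff(fetch: Callable[[str], Response], url: str,
                       max_attempts: int = 5,
                       sleep: Callable[[float], None] = time.sleep) -> int:
    """Call fetch(url) until the status is not 429 or attempts run out."""
    status = 429
    for attempt in range(max_attempts):
        status, headers = fetch(url)
        if status != 429:
            break
        if attempt < max_attempts - 1:
            # Wait only if another attempt will follow.
            sleep(retry_delay(attempt, headers))
    return status
```

Passing `fetch` and `sleep` in as parameters keeps the sketch independent of any particular HTTP library and lets you try out the timing without sending real requests.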

An alternative (or better: complementary) approach would be to use different request IPs, e.g. by parallelizing your workflow across different machines or by the use of proxy servers.
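
As a rough illustration of spreading requests over several IPs, the Python sketch below assigns URLs to proxies in round-robin order. The function name `assign_proxies` and the proxy addresses in the usage note are hypothetical placeholders, and whether rotating IPs is acceptable depends on the target site's terms of use.

```python
from itertools import cycle
from typing import List, Sequence, Tuple


def assign_proxies(urls: Sequence[str],
                   proxies: Sequence[str]) -> List[Tuple[str, str]]:
    """Pair every URL with a proxy, cycling through the proxy list."""
    if not proxies:
        raise ValueError("at least one proxy is required")
    pool = cycle(proxies)  # endless round-robin iterator over the proxies
    return [(url, next(pool)) for url in urls]
```

For example, `assign_proxies(urls, ["http://proxy-a.example:8080", "http://proxy-b.example:8080"])` alternates between the two placeholder proxies, so each one sees only half of the requests.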

Best,
Philipp

Hi there @Eminegul,

welcome to KNIME Forum!

Do you still get 429 after increasing the time in the Wait… node?

Alternatively, you can implement the Try/Catch Errors nodes.

See here for more:

Br,
Ivan

Hi, and many thanks for your replies @qqilihq and @ipazin. Yes, I ran some tests with longer and shorter Wait times before opening this thread. I guess I will keep trying to find the optimum wait time. Or maybe I can work with a VPN.
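
One way to search for a workable wait time without tuning it by hand is to let the delay adjust itself: lengthen it sharply after a 429 and shorten it slowly after a success. The Python sketch below shows only that adjustment rule. The function name `next_delay` and the factors and limits (`2`, `0.9`, `minimum`, `maximum`) are assumptions picked for illustration, not values known to suit any particular site.

```python
def next_delay(current: float, status: int,
               minimum: float = 0.5, maximum: float = 120.0) -> float:
    """Return the wait (in seconds) to use before the next request."""
    if status == 429:
        # Rate limited: back off quickly by doubling the wait.
        return min(maximum, current * 2)
    # Success: creep back toward shorter waits in small steps.
    return max(minimum, current * 0.9)
```

Called once after every response, this settles near the shortest delay the server tolerates, at the cost of an occasional 429 while probing.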

Best Regards,
Emine
