Hi Everyone,
I am trying to visit HTML links and parse data within this workflow: GetData.knwf (43.4 KB)
In the first part I collect many URLs, and everything works well. But in the second part, when I try to parse and extract the data, HtmlRetriever returns status code 429 (as you may already know, this means “Too Many Requests”).
I tried to get around it with a chunk loop and some waiting nodes, but I couldn’t solve it this way. I’m new to KNIME and a novice, so I really need tips like this.
Does anybody have advice or a sample workflow for getting past “Too Many Requests”?
Some kind of waiting would be the obvious way to go. As these rate-limiting strategies are typically a “black box”, finding a delay that works involves tedious trial and error.
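Outside of KNIME, the waiting idea can be sketched in a few lines of Python. This is only an illustration under assumptions, not what the workflow nodes do: the function name `fetch_with_backoff`, the starting delay of one second and the limit of five attempts are all made up for the example. It doubles the wait after every 429 response and uses the server's `Retry-After` header instead when the server sends one as a number of seconds.

```python
import time
import urllib.error
import urllib.request


def fetch_with_backoff(url, max_attempts=5, base_delay=1.0,
                       opener=urllib.request.urlopen, sleep=time.sleep):
    """Fetch url, waiting and retrying whenever the server answers 429.

    opener and sleep are parameters only so the function can be tried
    out without real network access or real waiting.
    """
    for attempt in range(max_attempts):
        try:
            with opener(url) as response:
                return response.read()
        except urllib.error.HTTPError as err:
            # Give up on any other error, or when attempts are used up
            if err.code != 429 or attempt == max_attempts - 1:
                raise
            # Prefer the server's hint (seconds) if it provides one
            retry_after = err.headers.get("Retry-After") if err.headers else None
            if retry_after and retry_after.isdigit():
                delay = float(retry_after)
            else:
                # Otherwise wait 1 s, 2 s, 4 s, ... (assumed values)
                delay = base_delay * 2 ** attempt
            sleep(delay)
```

The doubling means the right delay does not have to be guessed up front, which takes some of the trial and error out of tuning a fixed wait time.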
An alternative (or better: complementary) approach would be to use different request IPs, e.g. by parallelizing your workflow across different machines or by using proxy servers.
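As a rough illustration of the proxy idea, again in plain Python rather than KNIME, requests can be sent through a list of proxies in round-robin order. The proxy addresses below are placeholders, and the helper names `make_rotating_openers` and `fetch_via_next_proxy` are invented for this sketch. Whether rotating IPs is acceptable depends on the target site's terms of use.

```python
import itertools
import urllib.request


def make_rotating_openers(proxy_urls):
    """Return an endless round-robin iterator of openers, one per proxy."""
    openers = [
        urllib.request.build_opener(
            urllib.request.ProxyHandler({"http": proxy, "https": proxy})
        )
        for proxy in proxy_urls
    ]
    return itertools.cycle(openers)


# Placeholder addresses (assumption): replace with proxies you may use
PROXIES = ["http://proxy-a.example:8080", "http://proxy-b.example:8080"]
openers = make_rotating_openers(PROXIES)


def fetch_via_next_proxy(url, timeout=30):
    """Send each call through the next proxy in the rotation."""
    opener = next(openers)
    with opener.open(url, timeout=timeout) as response:
        return response.read()
```

This spreads the requests over several IPs, so each one stays further below the server's limit. It can be combined with waiting between requests.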
Hi, and many thanks for your replies, @qqilihq and @ipazin. Yes, I ran some tests with longer and shorter times in the Wait node before I opened this thread. I guess I will keep trying until I find the optimal wait time. Or maybe I can work with a VPN.