Web scraping for pages with 403

dashak · January 19, 2024, 11:28am

Hello,

I need data from web pages such as https://www.motion.com/products/sku/11508980 . Unfortunately, I can’t get it with the existing nodes and settings (Palladian and Selenium nodes): I always get status 403. Selenium (WebDriver) is recognized as an automated browser. How can I bypass this and get the data? Is there a special setting? Rotating the proxy doesn’t help either.
Thank you
Svetlana

takbb · January 20, 2024, 8:26am

Hi @dashak , I don’t have an answer to your question I’m afraid. The 403 response usually means the website is blocking access.

I also note that the terms and conditions of use for that website, notably clause “5. Services Use Restrictions”, preclude any automated trawling or scraping of information from that website without prior written consent, so I suspect they may be refusing to respond to automated requests.

You may well have sought permission to do so, but without written permission from the website owner I would personally be unwilling to try accessing that site to investigate the cause, as I would then be breaching their usage terms.

dashak · January 22, 2024, 7:03am

Hello @takbb,

that’s clear to me now. Thank you very much for your answer. I will forward this to my requesters.
