/g/ - /wsg/ - Web Scraping General - Technology

Anonymous

/wsg/ - Web Scraping General 06/21/24(Fri)17:32:38 No.101090758

File: scraper.png (1.62 MB, 1892x2142)

Web Scraping General

Reverse engineering edition pt 2

QOTD: Which is easier: parsing HTML or reverse engineering private/undocumented APIs to scrape from?

FAQ: rentry co/t6237g7x

> Captcha services
https://2captcha.com/
https://www.capsolver.com/
https://anti-captcha.com/

> Proxies
https://hproxy.com/ (no blacklist) (recommended, owned by friend of /wsg/)
https://infiniteproxies.com/ (no blacklist)
https://www.thunderproxies.com/
http://proxies.fo/ (not recommended)

> Network analysis
https://mitmproxy.org/
https://portswigger.net/burp

> Scraping tools
https://beautiful-soup-4.readthedocs.io/en/latest/
https://www.selenium.dev/documentation/
https://playwright.dev/docs/codegen
https://github.com/lwthiker/curl-impersonate
https://github.com/yifeikong/curl_cffi

Official Telegram: @scrapists
Last thread: >>101054257

Anonymous
06/21/24(Fri)19:13:44 No.101091990

bump

Anonymous
06/21/24(Fri)21:41:25 No.101093799

bump

Anonymous
06/22/24(Sat)00:49:53 No.101095385

bump

Anonymous
06/22/24(Sat)00:58:51 No.101095455

It's over

Anonymous
06/22/24(Sat)01:00:31 No.101095472

>>101091990
>>101093799
>>101095385
>>101095455
reddit fucking shitstain, you do not "bump" useless threads.
if no one wants to post in it, it means no one wants your garbage on the board
fuck off with your fucking cancerous "general" garbage

Anonymous
06/22/24(Sat)01:29:24 No.101095686

>>101095472
Seethe.

Anonymous
06/22/24(Sat)01:31:44 No.101095698

Go back to /b/ ranjeet

Anonymous
06/22/24(Sat)02:01:49 No.101095881

>>101095472
This guy probably gave his left testicle for access to a read-only API