Web Scraping General
Whitehat cuck edition continued

QOTD: What are some good sources for scraping AI training data from?

> Captcha services
https://2captcha.com/
https://www.capsolver.com/
https://anti-captcha.com/

> Proxies
https://infiniteproxies.com/ (no blacklist)
https://www.thunderproxies.com/
http://proxies.fo/

> Network analysis
https://mitmproxy.org/
https://portswigger.net/burp

> Scraping tools
https://beautiful-soup-4.readthedocs.io/en/latest/
https://www.selenium.dev/documentation/
https://playwright.dev/docs/codegen
https://github.com/lwthiker/curl-impersonate
https://github.com/yifeikong/curl_cffi

Official Discord: discord.gg/9EKk3psXMr
Last thread: >>100150524
bump
sage
>>100167855
Poster had to show his driver's license, hand over a DNA and semen sample, and pay $100/month just to gain access to a read-only API he could have just scraped (even though that would have gone against the website's ToS)
At the end of the day yt-dlp is really the solution to pretty much everything
>>100141630
Aren't there like 10B possible phone numbers?
>>100143919
> Indirectly by training ML models on data
On this, what are some good sources for pulling data for training ML models?
>>100150865
Join cybercrime TG groups and look for people spreading drainer links, they should know about Twitter scraping
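On the 10B phone numbers question above: a quick sanity check of the arithmetic, as a rough sketch. The 10B figure assumes every one of the 10 digits is free. The second count assumes a North American style NXX-NXX-XXXX layout (N is 2-9, X is 0-9), which is an assumption about which numbering plan is meant, and it is only an upper bound since reserved codes are not subtracted.

```python
# Count candidate 10-digit phone numbers under two assumptions.

# Naive count: all 10 digits free.
naive = 10 ** 10

# Assumption: NANP-style NXX-NXX-XXXX, where N is 2-9 and X is 0-9.
# Reserved codes (e.g. N11) are NOT subtracted, so this is an upper bound.
area_codes = 8 * 10 * 10   # NXX
exchanges = 8 * 10 * 10    # NXX
line_numbers = 10 ** 4     # XXXX
nanp_upper_bound = area_codes * exchanges * line_numbers

print(f"naive:            {naive:,}")
print(f"NANP upper bound: {nanp_upper_bound:,}")
```

So "like 10B" is right for the naive count, and the format rules cut it to about 6.4B before any reserved ranges are removed.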
>>100167898
Was waiting for them to fix comments not downloading before I started scraping channels again, but the ZFS pool I was going to use to store the videos fuckin died
>>100168026
YouTube sucks ass, who cares about video comments
Where's the Euro Greek anon that runs the Discord with a data scraping channel? Show yourself
>>100167925
> On this, what are some good sources for pulling data for training ML models?
Hugging Face, Kaggle, Roboflow, or I scrape it myself, which is way more rewarding since the best data is always gatekept
Does anyone here know a Castle bypass, or am I gonna have to pay someone in the sneaker botting comms?
>>100168034
Imagine scraping comments and using them to train a YouTube comment bot
Having an issue with the Selenium IDE (the web browser extension) throwing a fit over a 2D array:
Command: execute script
Target: return [["val1", "val2", "val3"], ["2d", "3d", "4d"]]
Value: A1
It gives me an error: invalid or unexpected token.
Has anyone tried using 2D arrays before in their little web app? I can get it to work fine in the normal Selenium WebDriver, but the IDE is a bit of a pain.
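For comparison, here is roughly what the WebDriver version of that command looks like in Python. This is a minimal sketch, not the poster's actual code: `fetch_grid`, `GRID_SCRIPT` and `main` are made-up names, and `main` assumes Selenium 4 with a local Chrome install. It only covers the WebDriver side that reportedly works, not the IDE error.

```python
# Same script as the IDE "Target" field, passed to WebDriver instead.
GRID_SCRIPT = 'return [["val1", "val2", "val3"], ["2d", "3d", "4d"]];'


def fetch_grid(driver):
    """Run the script in the page; WebDriver hands the JS array back as a list of lists."""
    return driver.execute_script(GRID_SCRIPT)


def main():
    # Assumption: Selenium 4 and Chrome are installed locally.
    # Imported here so fetch_grid stays usable without a browser.
    from selenium import webdriver

    driver = webdriver.Chrome()
    try:
        driver.get("about:blank")
        grid = fetch_grid(driver)
        # Index it like any nested list: row first, then column.
        print(grid[1][0])
    finally:
        driver.quit()
```

Call `main()` to try it against a real browser. The nested array needs no special handling on the WebDriver side, which matches the report that only the IDE chokes on it.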
What's web scraping?
>>100171769
Never mind, I got it working.
>>100170754
You'd need a shitload of proxies though
Anyone know where I can scrape unobfuscated browser JS from?
Planning on training a GPT to deobfuscate obfuscated JS