I need something that can
>queue multiple http requests
>can execute them asynchronously (perhaps using a thread pool)
>abort and retry requests on timeouts
>return downloaded http response synchronously
Preferably for Python, but any language is fine.
https://www.python-httpx.org/
Would this work?
>>101200279
aiohttp, asyncio (builtin since 3.6 iirc).
use a standard queue, fill it up and consume it in some looping coroutine. use the timeout options and write a basic try/except to retry. store the results somewhere, like a list or something.
you can use asyncio.run to run the entire event loop until completion to get the results synchronously, and then you can continue with your code after it's done.
i don't know why you'd want to return the http response synchronously though.
if you're lazy and old, use scrapy. you will use twisted.
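A minimal sketch of the queue + looping-consumer pattern described above, using only stdlib asyncio so it runs as-is. `fetch()` is a stand-in for a real aiohttp call (e.g. `async with session.get(url) as resp: return await resp.text()`); the names `worker`, `crawl`, and the timeout/retry values are illustrative, not from any library:

```python
import asyncio

async def fetch(url: str) -> str:
    # stand-in for an aiohttp request; simulates network I/O
    await asyncio.sleep(0.01)
    return f"body of {url}"

async def worker(queue, results, timeout=5.0, retries=3):
    # looping coroutine: consume urls from the queue forever
    while True:
        url = await queue.get()
        for attempt in range(retries):
            try:
                # abort the request if it exceeds `timeout`, then retry
                results[url] = await asyncio.wait_for(fetch(url), timeout)
                break
            except asyncio.TimeoutError:
                continue
        queue.task_done()

async def crawl(urls, n_workers=4):
    queue = asyncio.Queue()
    results = {}
    for url in urls:
        queue.put_nowait(url)
    workers = [asyncio.create_task(worker(queue, results))
               for _ in range(n_workers)]
    await queue.join()   # block until every queued url is processed
    for w in workers:    # consumers loop forever; stop them explicitly
        w.cancel()
    return results

# asyncio.run blocks until the loop finishes, so the caller
# gets the downloaded responses back synchronously:
pages = asyncio.run(crawl([f"https://example.com/{i}" for i in range(10)]))
```

Failed urls (all retries timed out) simply never appear in `results`; real code would probably record the last exception instead of dropping them silently.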
>>101200294
Seems more like a modern replacement for the requests module.
>>101200319
I know how to do it. I'm just too lazy.
>>101200362
well, the code for it is so small that it would be a hassle to actually make a library for it. like, just write it yourself.
>>101200279
use batch and aria2
>>101200319
Go back, r*ddit typer
All scraping libraries suck desu. I gave up and wrote my own shit in golang, but I had more advanced requirements than you.
>>101200279
You basically need bash
>>101200362
I'd recommend you try out some estrogen, it really improved my coding performance
>>101200279
asyncio and aiohttp should do the trick easily
>>101200279
What you're looking for is httpx, check it out.
If you need to process a lot of requests in an efficient manner, just do Go, use the standard http lib and you are good.
If you like touching tips with your friends, you can use Python, async functions to request and asyncio.gather to run them concurrently. You can use tenacity for the retry stuff.
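The Python half of this advice can be sketched with stdlib only: `asyncio.gather` fans the coroutines out concurrently, and the hand-rolled `retrying()` wrapper below stands in for what tenacity's `@retry` decorator would do in real code. `fetch()` is a stand-in for an actual httpx/aiohttp request:

```python
import asyncio

async def fetch(url: str) -> str:
    # stand-in for e.g. (await client.get(url)).text with httpx
    await asyncio.sleep(0.01)
    return f"<html>{url}</html>"

async def retrying(url: str, timeout: float = 5.0, attempts: int = 3) -> str:
    # re-try on timeouts/connection errors, re-raise the last failure
    last_exc = None
    for _ in range(attempts):
        try:
            return await asyncio.wait_for(fetch(url), timeout)
        except (asyncio.TimeoutError, OSError) as exc:
            last_exc = exc
    raise last_exc

async def main(urls):
    # gather runs every coroutine concurrently and
    # returns results in the same order as the input
    return await asyncio.gather(*(retrying(u) for u in urls))

bodies = asyncio.run(main([f"https://example.com/p{i}" for i in range(5)]))
```

With tenacity you would drop `retrying()` and decorate `fetch` with something like `@retry(stop=stop_after_attempt(3))` instead; the gather part stays the same.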
>>101203848
Sounds like he's doing I/O bound tasks, Go is unnecessary. In fact it's the perfect use case for python.
Just use celery queue my man
>>101200279
You can do all of that with just bs4, concurrent.futures/threading and requests. If you're trying to get JS content and stuff like that you'll probably need headless browsers.
I've built scrapers in Python and C but honestly the best way I've found to do it to date is in JS (I know, not my favorite either) by creating browser extensions that use stuff like Playwright.
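The concurrent.futures + requests route sketched, again runnable offline: `get()` is a stand-in for `requests.get(url, timeout=...).text`, and `fetch_with_retry`/`fetch_all` are illustrative names, not library APIs. A thread pool works fine here because the work is I/O bound:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def get(url: str) -> str:
    # stand-in for requests.get(url, timeout=5).text
    return f"page at {url}"

def fetch_with_retry(url: str, retries: int = 3) -> str:
    # with real requests, Timeout/ConnectionError would be caught here
    for attempt in range(retries):
        try:
            return get(url)
        except Exception:
            if attempt == retries - 1:
                raise  # out of retries: surface the error

def fetch_all(urls, max_workers: int = 8) -> dict:
    downloaded = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # submit everything, then collect results as they finish
        futures = {pool.submit(fetch_with_retry, u): u for u in urls}
        for fut in as_completed(futures):
            downloaded[futures[fut]] = fut.result()
    return downloaded

downloaded = fetch_all([f"https://example.com/{i}" for i in range(6)])
```

`fut.result()` re-raises any exception from the worker thread in the caller, so a permanently failing url fails loudly instead of disappearing.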
I do this in C++ using libcurl. I don't really use an HTML parser either; I just notice the patterns then run a SIMD substring search, and it finds what I need x1000 faster while not using retarded-tier amounts of RAM per page.
>(((asynchronously)))
Kys
/wsg/ is this way retard >>101208241
>>101200362
>I know how to do it. I'm just too lazy.
give those instructions to ChatGPT then.