the great debate
>>106762012
curl is for doing specific web requests, mimicking complicated auth chains and headers. wget is a less engineered tool for downloading/archiving content from a network.
>>106762012
wget for downloading things, curl for everything else; the two programs complement each other
curl -O makes wget obsolete
>>106762012
bin/wget:
#!/bin/bash
fn="$(basename "$1")"
curl "$1" > "$fn"
>>106762584
there are some wget features for spidering stuff that I'm not sure curl does well
>>106762584
cURL is much more fickle; you also need at least -J and -L to mimic Wget's behavior. Now try to throw a list of hundreds of URLs at the cURL command-line utility and see how that works out. You can't even chain arguments like -O.
>>106762889
also no recovery on failure, it clobbers existing files, and there's no retry by default. cURL is clearly made for handcrafting requests, not bulk downloading.
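To be fair, curl does ship flags that get you closer to Wget's bulk-download defaults. A minimal sketch, assuming only a stock curl install; it fetches a file:// URL so it runs offline:

```shell
# sketch: approximating Wget-ish downloading with curl
# -L        follow redirects
# -O        save under the remote file name (Wget's default behavior)
# --retry 3 retry transient failures (curl retries nothing by default)
mkdir -p /tmp/curl_demo && cd /tmp/curl_demo
printf 'hello\n' > /tmp/payload.txt
curl -sS -L -O --retry 3 "file:///tmp/payload.txt"
cat payload.txt   # prints "hello"
```

For a list of URLs, something like `xargs -n1 curl -L -O --retry 3 < urls.txt` is the usual workaround (urls.txt being a hypothetical one-URL-per-line file). There's still no resume or no-clobber unless you add -C - and your own checks.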
what's the one that gets whole websites?
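That's Wget's recursive mode. A minimal sketch of the usual mirror invocation (the function name is made up; the flags come from Wget's manual):

```shell
# sketch: mirror a whole site with wget's recursive mode
# --mirror          shorthand for -r -N -l inf --no-remove-listing
# --convert-links   rewrite links so the copy browses locally
# --page-requisites also fetch the CSS/images a page needs to render
# --no-parent       never ascend above the starting directory
mirror_site() {
  wget --mirror --convert-links --page-requisites --no-parent "$1"
}
# usage (not run here): mirror_site https://example.com/docs/
```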
>>106762012
Curl is amazing, and open source devs like curl creator Daniel Stenberg get too little credit.
>>106762649
you're supposed to do the logic yourself or trust someone else's code
>>106762012
Both are good for their specific purposes, but I use aria2 over wget nowadays.
>>106762012
Curl, unless you're using the spidering / recursive download features of Wget like: >>106762649
Here's a list of things Curl can do which Wget cannot:
>HTTP/2
>HTTP/3
>Impersonate common browsers like Google Chrome (https://github.com/lwthiker/curl-impersonate): it's ironic that Wget can't do that, since its spidering/indexing would benefit from it
>>106762897
https://github.com/curl/wcurl
>>106766038
>Impersonate common browsers like Google Chrome (https://github.com/lwthiker/curl-impersonate): it's ironic that Wget can't do that, since its spidering/indexing would benefit from it
not really. spidering is usually something you do with common courtesy, while curl-impersonate is meant to get around blocks maliciously (understandably so; I use it myself).
>>106766065
Not necessarily maliciously, but rather because the block is there for everyone except a web browser now, and there's no other way past it for bots. You can have a legitimate interest in scraping / spidering a website and do everything right and respectfully, but you're still not getting past the filters of the modern web easily; everyone has some sort of filter in place now.
>>106766074
It makes me wonder how people like Archiveteam have managed:
https://wiki.archiveteam.org/
They have a fork of Wget; can it impersonate browsers, or are they fucked for archiving anything that wants a browser now? You need a bit more than "User-Agent: Chrome" to bypass the sophisticated filters that check things like TLS fingerprints, etc.
curl_cffi is based