/g/ - Technology
File: 1752437844165730.png (300 KB, 1920x1200)
the great debate
>>
>>106762012
curl is for doing specific web requests, mimicking complicated auth chains and headers.
wget is a less engineered tool for downloading/archiving content from a network
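to illustrate, roughly each tool in its comfort zone (URL and token made up):
# curl: hand-built request, custom auth header, follow redirects
curl -L -H "Authorization: Bearer $TOKEN" https://api.example.com/v1/items
# wget: pull down files or whole directory trees for keeps
wget -r -np https://example.com/files/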
>>
>>106762012
wget for downloading things, curl for everything else; the two programs complement each other
>>
curl -O makes wget obsolete
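e.g. (made-up URL):
curl -O https://example.com/file.tar.gz
saves it as file.tar.gz in the current dir, same as wget would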
>>
>>106762012
bin/wget
#!/bin/bash
# poor man's wget: save the URL given as $1 under its basename
fn="$(basename "$1")"
curl "$1" > "$fn"
>>
>>106762584
there are some wget features for spidering stuff that I'm not sure curl does well
>>
>>106762584
cURL is much more fickle; you also need at least -J and -L to mimic Wget behavior. Now try to throw a list of hundreds of URLs at the cURL command line utility and see how that works out. You can't even reuse arguments like -O across URLs, it has to be repeated for every single one.
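roughly what bulk downloading looks like with each (urls.txt is hypothetical, one URL per line):
# wget reads the list natively
wget -i urls.txt
# curl needs a loop or xargs, one -JLO invocation per URL
xargs -n 1 curl -JLO < urls.txt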
>>
>>106762889
also no resume, no clobber protection and no retries by default; cURL is clearly made for handcrafting requests, not bulk downloading
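for the record, my guess at the flags you'd bolt on to get close to wget's behavior (made-up URL):
# wget resumes with -c and retries on its own
wget -c https://example.com/big.iso
# curl has to be told to resume (-C -) and retry explicitly
curl -C - --retry 5 -OL https://example.com/big.iso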
>>
what's the one that gets whole websites?
>>
>>106762012
Curl is amazing, and open source devs like curl creator Daniel Stenberg get too little credit.
>>
>>106762649
you're supposed to do the logic yourself or trust someone else's code
>>
>>106762012
Both are good for their specific purposes, but I use aria2 over wget nowadays.
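e.g. a segmented download over several connections (flags from memory, made-up URL):
aria2c -x 8 -s 8 https://example.com/big.iso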
>>
>>106762012
Curl, unless you're using the spidering / recursive download features of Wget like:
>>106762649

Here's a list of things Curl can do which Wget cannot:
>HTTP/2
>HTTP/3
>Impersonate common browsers like Google Chrome (https://github.com/lwthiker/curl-impersonate): it's ironic that Wget can't do this, since its spidering/indexing would benefit from it (quick sketches after this list)
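quick sketches, assuming a curl built with HTTP/3 support and the curl-impersonate wrapper scripts installed (wrapper names vary by release):
curl --http2 https://example.com/
curl --http3 https://example.com/
# curl-impersonate ships browser-named wrappers
curl_chrome116 https://example.com/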
>>
>>106762897
https://github.com/curl/wcurl
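fwiw basic usage is just (made-up URL):
wcurl https://example.com/file.tar.gz
it wraps curl with wget-ish defaults (remote filename, follow redirects, retries, iirc)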
>>
>>106766038
>>Impersonate common browsers like Google Chrome (https://github.com/lwthiker/curl-impersonate): it's ironic that Wget can't do this, since its spidering/indexing would benefit from it
not really. spidering is usually something you do with common courtesy, while curl-impersonate is meant to get around blocks maliciously. understandably so; I use it.
>>
>>106766065
Not necessarily maliciously, but rather because the block is there for everyone except a web browser now, and there's no other way past it for bots.

You can have a legitimate interest in scraping / spidering a website and do everything right and respectfully, but you're still not getting past the filters of the modern web easily; everyone has some sort of filter in place now.
>>
>>106766074
It makes me wonder how people like Archiveteam have managed:
https://wiki.archiveteam.org/

They have a fork of Wget; can it impersonate browsers, or are they fucked for archiving anything that wants a browser now?

You need a bit more than "User-Agent: Chrome" to bypass the sophisticated filters that check things like TLS fingerprints, etc.
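i.e. something like this alone won't cut it; the TLS handshake still screams curl no matter what the header says:
curl -A "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36" https://example.com/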
>>
curl_cffi is based