/g/ - Technology




I need to download as many books as possible in less than 24 hours.

Any ideas on how to do this?
>>
>>107997475
quickly
>>
>>107997475
entire libgen inventory. i seed as much as i can.
https://libgen.li/torrents/libgen/
>>
>>107997475
I hope nvidia downloaded everything they wanted
>>
>>107997475
why exactly did anna's archive even involve themselves in music piracy? the mission was about books and they're just drawing heat on themselves for this bullshit
>>
>>107997598
How many TBs is that?
>>
>>107997686
i dunno, i've got 10tb seeded so far and it's not even a dent in the page.
>>
>>107997475
Might as well sue them for double infinity.
They’ll get the same amount.
>>
>>107997598
I only want written books (so excluding audiobooks). Can you post the specific torrents for books?
>>
>>107997475
>sued
Nothingburger. Anna's Archive is a hydra. You can cut off as many heads as you like, and it will just grow them back.
>>
>>107997720
libgen doesn't host audiobooks as far as i know.
and the torrents aren't parsed. the files all have hash names. you'd have to sort them yourself.
>>
>>107997475
Scrape torrent trackers for anything tagged as book
>>
>>107997475
When I first heard about Anna's Archive, they only functioned as a search engine for other shadow libraries. It showed all links to a book with the same MD5 sum. This was supposedly to avoid takedown attempts.
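
For anyone wondering what "all links with the same MD5 sum" amounts to, here is a minimal sketch, not their actual code. The mirror URLs and the `group_links_by_md5` helper are made up for illustration; only `hashlib.md5` is a real library call.

```python
import hashlib
from collections import defaultdict


def md5_of_bytes(data: bytes) -> str:
    """Hex MD5 digest of a file's raw bytes."""
    return hashlib.md5(data).hexdigest()


def group_links_by_md5(records):
    """Group (md5_hex, url) pairs so every copy of the same file shares one entry."""
    groups = defaultdict(list)
    for md5_hex, url in records:
        # Normalise case so "ABC..." and "abc..." land in the same bucket.
        groups[md5_hex.lower()].append(url)
    return dict(groups)


# Hypothetical listings from two mirrors (example.* URLs are placeholders).
digest = md5_of_bytes(b"example book contents")
records = [
    (digest, "https://mirror-a.example/file1"),
    (digest.upper(), "https://mirror-b.example/file2"),
    (md5_of_bytes(b"another book"), "https://mirror-a.example/file3"),
]
grouped = group_links_by_md5(records)
```

Identical bytes always give the same digest, so one search result can list every mirror that holds that exact file.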

Skip forward 3 years or so. Now they have their own download links on their own servers and a subscription plan for faster downloads, and they literally downloaded all the songs on Spotify and intend to publish them for free, just asking for a fight with the record industry. The same industry that, if I remember correctly, even tried to ban private copying to cassette tapes back in the day.

Unless it is hosted and run by people with good connections in the deepest parts of Russia, it might be over.

Would have been fun if the clanker companies started running around defending Anna's Archive, since most of them have likely used it to train their clanker chat algorithms.
>>
>happens right after nvidia gets done scraping all the books for their aislop machines
hmmmm
>>
>>107997927
>Unless it is hosted and run by people with good connections in the deepest parts of Russia, it might be over.

My personal guess is some unknown Satoshi-era crypto bro is bankrolling this from somewhere 3rd world-ish. Consider that even holding a single copy of Anna's Archive is likely $50k in hard disks alone. They are getting close to 2PB in total volume and they are absolutely brazen in what they do. Either that or it's state sponsored, which I consider unlikely due to the whole selling to AI companies thing.
>>
>>107997475
Can you store that much?

wget.
>>
>>107997995
https://annas-archive.li/blog/critical-window.html

>As of the time of writing [2024], disk prices per TB are around $12 for new disks, $8 for used disks, and $4 for tape. If we’re conservative and look only at new disks, that means that storing a petabyte costs about $12,000. If we assume our library will triple from 900TB to 2.7PB, that would mean $32,400 to mirror our entire library. Adding electricity, cost of other hardware, and so on, let’s round it up to $40,000. Or with tape more like $15,000–$20,000.

>On one hand $15,000–$40,000 for the sum of all human knowledge is a steal. On the other hand, it is a bit steep to expect tons of full copies, especially if we’d also like those people to keep seeding their torrents for the benefit of others.
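
The arithmetic in the quote is easy to check. A quick sketch using only the per-TB prices and library sizes quoted above; the function and variable names are mine, and the overhead behind the rounded-up $40,000 and the $15,000–$20,000 tape figure is not modelled.

```python
# Per-TB media prices quoted in the blog post (2024, USD).
USD_PER_TB = {"new_disk": 12, "used_disk": 8, "tape": 4}

TB_PER_PB = 1000          # decimal petabyte, as the quote uses
CURRENT_LIBRARY_TB = 900  # library size quoted in the post
GROWTH_FACTOR = 3         # "triple from 900TB to 2.7PB"


def raw_media_cost(size_tb: int, usd_per_tb: int) -> int:
    """Cost of the storage media alone, ignoring electricity and other hardware."""
    return size_tb * usd_per_tb


future_library_tb = CURRENT_LIBRARY_TB * GROWTH_FACTOR
one_pb_new = raw_media_cost(TB_PER_PB, USD_PER_TB["new_disk"])
mirror_new = raw_media_cost(future_library_tb, USD_PER_TB["new_disk"])
mirror_tape = raw_media_cost(future_library_tb, USD_PER_TB["tape"])

print(f"1 PB on new disks:   ${one_pb_new:,}")
print(f"2.7 PB on new disks: ${mirror_new:,}")
print(f"2.7 PB on tape:      ${mirror_tape:,} (media only)")
```

The new-disk numbers match the quote ($12,000 per PB, $32,400 for 2.7PB). Raw tape comes out below the quoted $15,000–$20,000, so that range presumably includes costs beyond the media itself.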
>>
>>107997475

What books? You really should pick a publisher and look for its torrent.
>>
>>107999387
That was before the AI price rises.
>>
>>107997475
https://www.myanonamouse.net/
>>
>>107997475
Go on the dark web, you can find anything and everything you want there.
>>
>>107997475
Nah. They can use the AI defence, i.e., how come AI companies are allowed to scrape copyrighted material for profit and get away with it.
>>
>>107997943
>Nu-/g/ doesn't realize that AI companies are invited to pay $$$$ for high-speed access to the whole dataset plus special unreleased datasets
>>
>>107997475
i just downloaded all the libgen fiction torrents.
so much trash in that dataset, like tons of spanish programming books among the pdfs, german bordello porno novels.
the pdfs were like 2tb alone, insta delete.
even some exe files in there.
100k rar files, and many of the zips are epubs i think.
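
On the "zips are epubs" guess: an EPUB is a zip archive containing a `mimetype` entry that reads `application/epub+zip`, so it can be checked instead of guessed. A rough sketch using well-known magic bytes; `sniff`, `looks_like_epub` and `make_zip` are names made up here, and the signature list is far from exhaustive.

```python
import io
import zipfile


def looks_like_epub(data: bytes) -> bool:
    """True if data is a zip whose 'mimetype' entry declares an EPUB."""
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as zf:
            return zf.read("mimetype").strip() == b"application/epub+zip"
    except (zipfile.BadZipFile, KeyError):
        # Not a readable zip, or no 'mimetype' entry inside it.
        return False


def sniff(data: bytes) -> str:
    """Classify a file by its leading bytes rather than its (hash) name."""
    if data.startswith(b"MZ"):
        return "exe"
    if data.startswith(b"Rar!\x1a\x07"):
        return "rar"
    if data.startswith(b"%PDF-"):
        return "pdf"
    if data.startswith(b"PK\x03\x04"):
        return "epub" if looks_like_epub(data) else "zip"
    return "unknown"


def make_zip(files: dict) -> bytes:
    """Build a small in-memory zip for the demo below."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, content in files.items():
            zf.writestr(name, content)
    return buf.getvalue()


fake_epub = make_zip({"mimetype": "application/epub+zip", "book.xhtml": "<html/>"})
plain_zip = make_zip({"notes.txt": "hello"})
print(sniff(fake_epub), sniff(plain_zip), sniff(b"%PDF-1.7 ..."))
```

For real files, reading the whole thing into memory is wasteful. `zipfile.ZipFile` also accepts a path, so only the prefix check needs the first few bytes.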

around 100 of the 4000 torrents failed to download.

main focus this weekend will be converting all the usable formats to epub version 3.4 and then ingesting them into a database
>remove langs i dont want
>dedup with simhash on chapter level so i can identify omnibuses
>convert all images to webp to push down filesizes

webp at compression 10 looks completely fine for covers and shit.
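
The chapter-level simhash idea from the list above could look roughly like this. It is a sketch under assumptions, not the poster's pipeline: 64-bit fingerprints over 3-word shingles hashed with BLAKE2b, and a Hamming-distance threshold of 3, are all arbitrary choices made here, as are the function names.

```python
import hashlib
import re


def simhash(text: str, bits: int = 64, shingle: int = 3) -> int:
    """Simhash fingerprint of a text, built from overlapping word shingles."""
    words = re.findall(r"\w+", text.lower())
    count = max(1, len(words) - shingle + 1)
    shingles = [" ".join(words[i:i + shingle]) for i in range(count)]
    weights = [0] * bits
    for s in shingles:
        digest = hashlib.blake2b(s.encode("utf-8"), digest_size=bits // 8).digest()
        h = int.from_bytes(digest, "big")
        for i in range(bits):
            # Each shingle votes +1 or -1 on every bit position.
            weights[i] += 1 if (h >> i) & 1 else -1
    return sum(1 << i for i, w in enumerate(weights) if w > 0)


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


def shared_chapters(book_a, book_b, max_distance: int = 3):
    """Index pairs (i, j) where chapter i of book_a nearly matches chapter j of book_b."""
    fa = [simhash(c) for c in book_a]
    fb = [simhash(c) for c in book_b]
    return [(i, j) for i, x in enumerate(fa) for j, y in enumerate(fb)
            if hamming(x, y) <= max_distance]


# Toy data: an "omnibus" that contains every chapter of novel_1 plus more.
novel_1 = ["It was a dark and stormy night in the old harbour town.",
           "By morning the rain had stopped and the ships were gone."]
novel_2 = ["The desert caravan left at dawn with forty camels and no map."]
omnibus = novel_1 + novel_2

matches = shared_chapters(novel_1, omnibus)
print(matches)
```

If every chapter of one book has a near match inside another, the second one is probably an omnibus containing it. Comparing all chapter pairs is quadratic, so a real library would need some bucketing on the fingerprints first.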
>>
>>107997475
Be Jewish.


