/t/ - Torrents

  • There are 84 posters in this thread.

File: file.png (70 KB, 499x367)
This is a collection of all the HTML files from yuki.la as of Feb 2021, packaged by Anonymous !wJbF8ZWUxk.

The scraping took multiple months and finished around the time that yuki.la vanished for good.

Each board is collected into 2 tar.gz files, containing either the entirety of a board's HTML files, or a portion. After each file is extracted, you will have all the files of the board.

Unfortunately, no thumbnails or images were saved before yuki.la went offline, but it is possible to find images using other archive websites.

If you'd like to support my efforts, you can donate ETH to: 0xD952EeD3a5f10A891f962f9bC411fFa41272F78A

More information regarding 4chan archives can be found at https://bibanon.org/

Cheers! I hope you find this collection useful.

magnet:?xt=urn:btih:42b3aa33de93b705ef1074776b27f50765891b86&dn=yuki.la&tr=udp%3a%2f%2ftracker.openbittorrent.com%3a6969&tr=udp%3a%2f%2ftracker.opentrackr.org%3a1337%2fannounce
>>
Shit, I've been putting off setting up my seedbox for months, but now this is something that's really worth spending the effort for
>>
Are you from archive.is LPySg?
>>
>>1033298
Yes
>>
Lack of images sucks. The request section has a continuous stream of hot girl pics. I'm sure someone has a huge archive of them. That kind soul should share a magnet of them.
>>
>>1033260
Nice anon
>>
>>1033260
You're a hero, OP.
>>
>>1033260
I love you archive kun
>>
bump
>>
>>1033260
fuck bibanon. i hope that they all get suicide bombed
>>
>>1033260
>Unfortunately, no thumbnails or images were saved before yuki.la went offline, but it is possible to find images using other archive websites.
then it's worthless
you should have at least saved image urls from the site, i think it was using imgur as a front end for image hosting.

pray to god somebody has a solution to undo the loss.
>>
>>1034377
fuck off discord goon
>>
>>1033260
you do know there's archived.moe and https://archive.wakarimasen.moe archiving everything? latter even having full search?
only issue is they are not so old

archiveofsins.com has /t/ since 5years+
>>
>>1034400
The image urls are saved you fuck. It's the raw HTML from the threads.
>>
>>1034446
>archived.moe
-outsources links from other archive
-most of the search is unavailable aside slow or coom boards
>https://archive.wakarimasen.moe
-another fresh FoolzFuuka archive that hasn't been around for years and could kick the bucket any time like the previous ones
-also doesn't have any of the old stuff yuki.la had either
>archiveofsins.com
unrelated
>>
>>1033260
>If you'd like to support my efforts, you can donate ETH to: 0xD952EeD3a5f10A891f962f9bC411fFa41272F78A

No one cares about your shitty efforts.
Go beg elsewhere faggot
>>
>>1034622
you have to be over 18 to use this website
>>
>>1034622
Iunno there's plenty of peers on this torrent.
>>
bruh it took 16 hours to decompress /a/
though my HDD is shit
>>
>>1033260
Do I have to download the whole collection or can I choose what files I want?
>>
>>1035206
choose
but make sure you have a lot of space
/a/ alone might be over 200-300 GB
>>
File: file.png (12 KB, 1362x34)
>>1034400
>>
File: file.png (29 KB, 948x434)
>>1034400
>>
>>1035246
>>1035256
I knew i was wrong
was thinking of (ugh) 4chanarchive
>>
>no images
>400+ GB

wot
>>
>>1035310
4chan has had literally billions of posts, it adds up
>>
>>1034622
lol you want a medal or something with that post?
>>
>>1033260
What happened to yukila and will it ever come back?
>>
>>1035980
No one really knows, and probably not.
>>
>>1036936
It's really a shame how inconsiderate some of those archive owners are.

They know how valuable the info on those sites is to some people, and while we can't force them to keep doing something they don't want to/can't afford to do, they should at least have the decency to dump the data somewhere and not just disappear without a trace like that

Also I am pretty sure some of them get tons of donations anyway, so it's not like they are simply doing people a favor.
>>
>>1036936
Ah, well, thanks anyways. What I'm having to do now is search for stuff in that torrent collection, get the thread numbers, then look those threads up on archived.moe. Even then it's still not as good because it only has thumbnails instead of yukila's saved images.
>>
What are our alternatives? wakarimasen is pretty good, but it lacks a lot of old stuff
>>
>>1033260
Bump
>>
Bump for epic thread
>>
>>1033260
I unpacked the archives and fuck. there are so many html files that my computer will take weeks to transfer them onto my external hard drive...
>>
>>1038959
why didn't you download them directly to your external hard drive?
>>
>>1038959
Yes it's a huge collection. It took almost half a year to complete the download.
>>
>>1033260
holy shit I can't believe I didn't see this until now! thank you so much anon with that weird apostrophe thing that isn't actually an apostrophe at least in english
I also recall you saying that you would upload it to archive.org when it was finished zipping. Is that still in the plans? Or am I just retarded and can't find it?
>>
>>1039081
Yes, I have tried uploading to IA but kept getting disconnected by their servers. Filesize may be too big, but I wanted to get it into multiple hands as soon as possible so a torrent was what came to mind.

Would have hated for my hard drive to die or something like that before getting it uploaded.
>>
Thank you anon!
>>
>>1039124
oops, wrong tripcode
>>
>>1039081
backtick
>>
Someone make a website and upload all the stuff there
>>
>>1035310
>>1035323
I don't know why this was deleted AND intentionally excluded from archive.org. I also can't find the 4chan article about it either, but from this dead link: http://chrishateswriting.com/post/68794699432/small-things-add-up
>By migrating to the new domain, end users now save roughly 100 KB upstream per page load, which at 500 million pageviews per month adds up to 46 terabytes per month in savings for our users. I find this unreal.
This was simply from changing image domains from img.4chan.org to i.4cdn.org to save 50 bytes.
Just that accounted for 46TB of extra text data a month.
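The quoted figure can be sanity-checked with a few lines of Python. This is only a rough sketch: it assumes "100 KB" means 100 x 1024 bytes and "terabytes" means 1024**4 bytes, which is my reading of the units and not something the quote states.

```python
# Back-of-the-envelope check of the quoted savings figure.
# Assumption: binary units (1 KB = 1024 bytes, 1 TB = 1024**4 bytes).
KB = 1024
TB = 1024 ** 4

saved_per_pageview = 100 * KB      # "roughly 100 KB upstream per page load"
pageviews_per_month = 500_000_000  # "500 million pageviews per month"


def monthly_savings_tb(per_view_bytes, views):
    """Total bytes saved per month, expressed in binary terabytes."""
    return per_view_bytes * views / TB


savings = monthly_savings_tb(saved_per_pageview, pageviews_per_month)
print(f"{savings:.1f} TB per month")  # prints "46.6 TB per month"
```

With decimal units the same inputs come out to 50 TB, so the 46 TB in the quote is consistent with binary units.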
>>
Bump
>>
>>1033260
What happened to yuki anyway? Will it come back?
>>
>>1039752
>>1033260
Also, not an option for me to back it up all of it now, but maybe some day. Thank you for the effort.
>>
>>1034377
>>1034401
Okay what did I miss?
>>
Bump
>>
how long does it date back to?
>>
Come back pls... yuki, the search on other archives is garbage
>>
>>1040317
Dunno, yuki was kind of annoying with how it loaded, but the good thing was how extensive it was. I wonder what really happened to it. As far as I know they didn't communicate or take donations...
Also, wish more archival sites let you give donations without me have to bother with memecoins.
>>
Guys tell me the date of these archives so i can know if i should bother downing and seeding
>>
>>1040735
February 2, 2008 to sometime in February 2021
>>
>>1040881
ok thanks for the info
>>
bump
>>
any more torrentable 4chan dumps?
>>
>>1041838
check bibanon and related sites, as well as internet archive (which generates a torrent file for most initial uploads, though obviously the torrent can't be updated if the collection changes)
>>
So what happened?
>>
>>1042518
It’s dead
>>
>>1042789
but why
>>
>>1042815
he car door man hook horn
>>
do any archives at all allow searching for symbols? seems like yuki.la was the only one that allowed this. like i cant search for things like "a****" , it either brings up irrelevant results or none at all.
>>
Is there any archive atm that plays webms on /gif/
>>
How do you search now for /t/ archives?
Archived.moe doesnt have search activated
>>
>>1043599
archive of sins but their search feature is momentarily disabled
>>
>>1040317
>tfw didnt even know yuki had a search engine until it went down

>>1040619
>wish more archival sites let you give donations without me have to bother with memecoins
this, I'm too much of a fucktard to figure it out and plus its annoying
>>
What was yuki.la?
>>
>>1044740
4chan threads archive
>>
>>1044740
one of the few archives that saved .webms from longer than 5 years ago and seemed to be one of the oldest surviving archives up until it died over a month ago.

A lot of people have wanted to help out to bring it back because of that, but no one knows anything about it other than it existed and the email for the manager/admin is lost because it was also from yuki.la




>>
>>1044930
the archiver never interacted with anyone and never had a contact email or anything like that either, or else people would have contacted him ages ago when things started to go down
>>
>>1044935
his contact email was hosted or however it's called from yuki.la servers. so once that went down, no one could email him.
I forget what error code the website was giving out that weekend, so maybe someone could have tried to reach out on early Sunday and Saturday, before it was just GONE.

sad that he never attempted to reach out.
>>
>>1044954
This, hell, pls archivers, set up proper donations (not just memecoin ones), if you must.
>>
Bump
>>
bump, thank you very much for the archive, man I miss yuki like you wouldn't believe
>>
>>1039809
I would also like to know
>>
>>1033260
Thanks anon, keep it up! Fucking super glad you archived this, so thanks!
I can't donate now but I'll keep this seeded forever.
>>
>>1034400
Bullshit it's worthless, if it has all the image hashes then it can be rebuilt from image archives.
A billion images are useless without their threads.
Having a nice archive like this sitting around is invaluable. In time I plan on archiving all archives, scraping all the images I can, and compiling them into a mega-archive.
The software I'll need to handle it will need to be written from scratch but my programming is improving and I'll be there soon. For now, I'm just happy to see people saving everything they can.
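As a starting point for the rebuild idea, here is a minimal sketch of pulling image references out of one saved thread page. It assumes the pages link images through ordinary `href`/`src` attributes and, possibly, carry a `data-md5` attribute on thumbnails; nobody in this thread has confirmed what the yuki.la markup actually looks like, and the sample snippet is made up.

```python
from html.parser import HTMLParser

IMAGE_EXTS = (".jpg", ".jpeg", ".png", ".gif", ".webm")


class ImageRefCollector(HTMLParser):
    """Collect image-looking URLs and any data-md5 attribute values."""

    def __init__(self):
        super().__init__()
        self.urls = []    # href/src values ending in a known media extension
        self.hashes = []  # data-md5 values, if the markup has them (assumption)

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if value is None:
                continue
            if name in ("href", "src") and value.lower().endswith(IMAGE_EXTS):
                self.urls.append(value)
            elif name == "data-md5":
                self.hashes.append(value)


def collect_image_refs(html_text):
    """Return (urls, hashes) found in one page of HTML."""
    parser = ImageRefCollector()
    parser.feed(html_text)
    return parser.urls, parser.hashes


# Hypothetical markup, for illustration only.
sample = (
    '<a href="//example.invalid/t/123.png">'
    '<img src="//example.invalid/t/123s.jpg" data-md5="abc=="></a>'
)
urls, hashes = collect_image_refs(sample)
print(urls, hashes)
```

The collected hashes or URLs could then be matched against whatever image archives are still around.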
>>
>>1040317
>tfw arch.b4k.co is fucking garbage
>rebeccablacktech fucking merged with fucking desuarchive
>>
>>1033260
oh thank fuck
i miss yuki.la so fucking much
you would've thought anons would have learned from other past archive failures like the old archive, fireden, fireden 2, 4tan etc by now
>>
Stuck at 95% anyone else?
>>
>>1046628
After I posted this it resumed again.
Thanks to whoever archived this but I wish there were pictures.
>>
bump
>>
Thank you for doing this anons. I enjoyed reading old threads from times past
>>
>>1034622
reddit ritard niger
>>
thank you for your effort archive anon
>>
Got the whole thing sitting on my NAS, can't donate but I'll seed forever.
Thanks OP, you're an absolute legend
>>
Bump
>>
Would make a backup, but need more space...
>>
last bump
>>
That's absolutely amazing, thank you for your efforts.
>>
Bumping till my new drive gets here.
>>
Bump
>>
Bump
>>
>>1033260
I quickly wrote up a couple of simple Python (tested on 3.7.9) scripts to extract all threads and posts matching given criteria.
It should be pretty easy to repurpose them to fit anyone's needs. You could even build a database with the scraped data for more efficient storage.

Script 1
>move all HTML files that match a regex to a different folder
https://pastebin.com/rZKZW7V2

Script 2
>scan all HTML files, match regex in the thread subject field, extract every post, clean and split the posts and save them to a single column CSV
https://pastebin.com/0u8etcZX

The regexes are probably not perfect, and I used the CSV in an AI project, so you'll probably want to take out the part that cleans and splits posts and save more columns than just the post text.
It should be easy to get them to work on tar files rather than folders of html files but I haven't tried that.
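For the tar case, a minimal sketch using only the standard library might look like the following. It is not the pastebin scripts; the function name is made up, and it assumes the archives are gzip-compressed tars whose thread pages end in `.html`.

```python
import re
import tarfile


def matching_members(tar_path, pattern):
    """Yield (member name, text) for .html members whose text matches pattern."""
    regex = re.compile(pattern, re.IGNORECASE)
    # "r:gz" opens a gzip-compressed tar. Members are read one at a time,
    # so nothing has to be unpacked to disk first.
    with tarfile.open(tar_path, "r:gz") as archive:
        for member in archive:
            if not member.isfile() or not member.name.endswith(".html"):
                continue
            handle = archive.extractfile(member)
            if handle is None:
                continue
            text = handle.read().decode("utf-8", errors="replace")
            if regex.search(text):
                yield member.name, text
```

Something like `for name, _ in matching_members("board.tar.gz", r"seedbox"): print(name)` would list matching threads, where `board.tar.gz` is a placeholder file name. Reading straight from the archive avoids the multi-hour unpacking step, at the cost of decompressing the whole file on every scan.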
>>
>>1052675
thank you friend
>>
bump
>>
>>1040317
Yuki's search sucked. I used desuarchive to find files and posts, and then went to the thread/post on yuki for the image file.
>>
RIP
>>
>no images
>no webms
nooo my memerinos
>>
File: 1574887227398.jpg (7 KB, 250x201)
>>1033260
Great praise OP.

Thank you for your efforts. A part of 4chan died when yuki.la vanished. Does anyone even know what happened to yuki? Can't believe the best archive would just vanish like that without much warning or a chance to rescue more of what it had archived.
>>
>>1036937
The only existing /v/ and /vg/ full image archive from 2015-2019 is now on fireden's onion service, which could go down at literally any time. It'd take years to scrape it all and there are no dumps. Real downer eh?
>>
I wrote a scraper for yuki.la at one point. It used this niche file format called nozomi, named after a Japanese train service, which stored all thread OP post IDs for a given board in a single file. It's kind of an odd choice, but it offloads some pagination and lookup work to the client.
Another website, which goes by the name nozomi.la, uses this same niche nozomi format, but to greater effect. Each tag has its own nozomi file, and using a single set intersection operation across any number of nozomi files, a client can find images which satisfy all tags fairly quickly and then request those from the server one by one.
My guess is that the same person made both websites. They had the same TLD, use the same tech, and have similar interests.
I could previously find a github or gitlab which explained what nozomi was but I can't find it anymore. Maybe it was privated when yuki went down.
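To illustrate the client-side lookup described in this post, here is a rough sketch. The byte layout is an assumption (a flat run of big-endian unsigned 32-bit IDs); the post does not describe the actual on-disk format, and the IDs below are made-up sample data.

```python
import struct


def parse_nozomi(data):
    """Decode bytes as consecutive big-endian unsigned 32-bit IDs (assumed layout)."""
    count = len(data) // 4
    return list(struct.unpack(f">{count}I", data[: count * 4]))


def intersect_ids(*id_lists):
    """Return IDs present in every list, highest (newest) first."""
    if not id_lists:
        return []
    common = set(id_lists[0])
    for ids in id_lists[1:]:
        common &= set(ids)
    return sorted(common, reverse=True)


# Made-up data standing in for two downloaded index files.
tag_a = struct.pack(">4I", 1033260, 1034400, 1052675, 1057149)
tag_b = struct.pack(">3I", 1034400, 1057149, 1074381)
print(intersect_ids(parse_nozomi(tag_a), parse_nozomi(tag_b)))
# prints [1057149, 1034400]
```

The server only has to hand out static files; the filtering and pagination happen in the client, which matches the "offloads work to the client" point above.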
>>
>>1057149
I would love more detail about how it worked.
>>
Bump
>>
>>1033260
Dimes
>>
Thank you, I won't be able to download it all, but I will try to seed the boards i shitpost on the most
>>
Bump
>>
bump
>>
>>1058036
seconded
>>
So, it never went back on?
>>
Could someone do a DDL upload of some of the smaller/slower boards from this? Especially /m/.
>>
File: 7251.gif (801 KB, 250x195)
>>1033260
cool and all that someone saved it but i wish there was a site that indexed these text-only archive projects and provided a search engine, so that people could access information in old posts without having to download everything themselves and basically set up their own search engine.
guess what i'm saying is, anyone actually have a use for torrents like this?
or you saving just because it's fun to collect data without ever looking at it?
>>
Bump
>>
File: 2021-08-31.png (510 KB, 1475x1829)
>>1039338
Damn dude, this thread, your post, got me riled up.

Remember a few years ago, when tumblr came through with their "let's restrict porn" idea? There was some internet outrage, and people frantically scraped sites they cared about, and those are still available on archive.org. So you can get a copy of fatponybutts.tumblr.com... but not moot's blog? Doesn't seem right.

It doesn't make sense to me, but there are web scrapes hosted on archive.org that are not available through their 'wayback machine'. If you're using their builtin search, look for 'warc', 'warcarchives', something like that. These are scrapes in their format (cdx) that you can download, and view using a server on *your* machine (pywb, wayback-cdx-server, etc)... but not the instance running on *their* machine. Maybe I'm missing something here?

Anyway, I jumped through the hoops & made this screenshot for you.
>>
>>1067692
Saved. Thanks Anon.
>>
bump
>>
>>1065754
iktf.
>>
Please tell me this isn't ded
>>
thnks
>>
>>1070138
RIP
>>
It's frightening to know some nerd can read about your posts years after you posted it. Might stop posting on 4chan.
>>
>>1071085
You might stop posting on every site ever
Get a clue
>>
File: 1242953042280.jpg (200 KB, 1280x1024)
back to page one

I did scrape a 36k, images and caps, from dumblr during the porn thing.
3.5g worth. So there is that.

Also have about 1,200 pre-'09 images from a download I did then. 750mb

Sad part of that rip was, it was a html archive of glorious proportions.
Being a dumb ass, not knowing wtf a html was, I deleted all of
them and only kept the fucking pics.
I have spent hours, usually twice a year, rebooting my old hardware
searching the drives hoping to find some remnants that might have
got put in a save folder somewhere. I never liked to toss data even back then.

If someone is interested I will hash it all up.

Here is a chan poo from 2004. memeing before it was cool
>>
File: 1632247439744.jpg (31 KB, 306x273)
so there's pretty much no other decent archive sites now that yuki is gone? been to desu, moe etc. but can't find any archives of the board I'm looking for (lgbt). wakarimasen doesn't even go back until 2019 apparently
>>
>>1073892
off yourself tranny
>>
>>1043329
archived.moe works for me
threads for /gif/ go back to 2015 or so
not sure if webms from that far back work
>>
>>1073892
https://archive.4plebs.org/_/articles/credits/
Here's a list of archives, might find something you need there.
>>
>>1073892
try fireden.net since you're so much of a faggot
>>
I thought this was going to have images, got my hopes up damn. I remember some months back, or has it been a year already? I contacted the admin of archived.moe if he could share a dump of /r/ and /biz/ in a certain timespan (June 2017 to June 2018). This is probably not the specifics of my request but close enough.

It's because I used random lewd images on 4chan as keyfiles for my keepass databases, where those images are now lost because I didn't back up properly because I'm a retard. I have the passwords to the kdbx files, and I have the kdbx files too, just those keyfiles missing. Since then I managed to recover a few by going page by page on warosu, saving everything that may have been a keyfile (and I honestly can't believe I managed to recover some of my accounts this way, even got my main email back) but still, there's still a huge chunk missing

I even sent him some hundreds of dollars in crypto (probably thousands now since it pumped) but I got jack shit, didn't get a decent reply once I sent it. I'm still bummed about it to this day, not because I got scammed (?), but because maybe the admin dude thinks I'm some hackerman or some shit instead of just a retard who lost his keys. Oh well. I even have a script set up ready to parse through countless images, trying to make use of each image one by one on my kdbx file and alerting me if it worked. This is how I saved time because downloading one by one from warosu was already a pain in the ass there's no fucking way I would've tried decrypting with the countless images I saved one by one too. I'm lucky keepass allows me to do it via cli otherwise I would've given up from the start

Now I don't even have any crypto left to try and pay some other admin for an image dump. I really thought this post would have them. Fuuuck. Should I just scrape warosu? I don't even know how to begin with that
>>
>>1074381
I may be able to help you. What information do you have on the images you need?
>>
>>1074381
I'm not surprised archived.moe admin did that desu
>>
>>1074381
wow what a fucking faggot.
assholes like him are why we keep losing archives so easily.

how long until BibleAnon takes over and does all the archiving as a neutral group effort?
>>
>>1039338
>>1067692
You can find more on archive.is
https://archive.is/Mx4uq
Use the "next" "previous" at the bottom to find more pages
>>
>>1074544
On second look, here's the page with all the pages listed
https://archive.is/bf5Wj
>>
>>1052675
Smart to filter with regex before grabbing contents with bs4, beautifulsoup can be so fucking slow for this stuff
>>
Figure I'll ask here

I'm looking for a single specific webm posted on /gif/, that was in this thread: https://desuarchive.org/gif/thread/13505776

None of the 4chan archives I've checked have fullsize images/webms of this thread

does anybody have it/know where I could get it?
>>
File: 1.jpg (59 KB, 696x870)
>>1074512
Hi anon. Sorry it's been so long since I've lost them, though there's really no rhyme or reason to them, just that they are lewd images of women. Pic related is an example but I just got this from a random website. It's supposed to be from 4chan so that the file's hash is the same (thank fucking god warosu doesn't modify the images in any manner). I know, I'm retarded.

Besides that, I only know that it's possibly from that span I mentioned. Well, I just looked at what I've managed to salvage so far and I think it can be further restricted to just from June 2017 to Nov 2017. Oh, almost all of it is from /biz/ and there's one I got from /r/. That's it. This is also why the script is useful because I don't need to know the specifics. I just chuck all possible images in a folder and run this line on each of them
echo my_password | keepassxc-cli extract -k "$theImage" my_kdbx_file >> ~/output
and so on. Then I'd check (grep) the output afterwards for lines that aren't just the usual error
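The loop around that one-liner could be sketched in Python roughly as follows. The function name and the `run` parameter are made up for illustration, and the code assumes `keepassxc-cli` exits with status 0 only when the database opens; that exit-code behaviour and the `extract` subcommand name should be checked against the installed version, since they may differ between releases.

```python
import subprocess
from pathlib import Path


def try_keyfiles(image_dir, database, password, run=subprocess.run):
    """Return the first file in image_dir that unlocks database as a key file, or None."""
    for image in sorted(Path(image_dir).iterdir()):
        if not image.is_file():
            continue
        # Same invocation as the one-liner above; the password goes in on stdin.
        result = run(
            ["keepassxc-cli", "extract", "-k", str(image), str(database)],
            input=password + "\n",
            capture_output=True,
            text=True,
        )
        # Assumption: a zero exit status means the key file was accepted.
        if result.returncode == 0:
            return image
    return None
```

Stopping at the first success saves the grep step, and passing the command as a list avoids trouble with spaces in image file names.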

>>1074513
>>1074533
I'm probably (definitely) just coping but to this day I've no idea if he really is an asshole or he just forgot because my request was too difficult, I don't fucking know
>>
>>1074873
No he's actually an asshole. He accepted thousands of dollars in crypto to delete some posts.
>>
File: file.png (250 KB, 1482x659)
>>1074873
Gotcha. So I can't give you the files (as I don't have them), but I'm exporting a list of URLs to where you can download all the files posted to /r/ and /biz/ between June 2017 and June 2018. It will take a little while, but I hope it can help you out. I'll post here once I have that. The image is a chart of posts I have data on between those times. Green is /biz/ yellow is /r/

>>1074819
Search "View Same" and then try https://archive.wakarimasen.moe/ to see if they have it since it was reposted recently.




