/t/ - Torrents

  • There are 83 posters in this thread.



File: frogs.png (629 KB, 1309x725)
i am sharing my collection of cartoon frogs. these were all collected from 4chan. all files are unique but there are many duplicate frogs. i put the images into zip files because there are so many. torrent is 15 GB in total.

magnet:?xt=urn:btih:7a165aec0c1917489773d44c05e44e0e978a41d1
>>
File: pepe.png (211 KB, 1200x675)
>>990243
what... for real bro?
>>
the only post that matters on this entire board
>>
15 gigs is worth it im coming pepe
>>
what the fuck?
>>
this is a thread that's really worth it
>>
bump the fuck out of this
>>
Have you gotten rid of the duplicates?
>>
>>990243
crashing pepe market with no survivors
>>
>>990329
i have attempted to cluster the images to remove duplicates. it works alright but my code is too inefficient and my computer doesnt have enough resources to run it on the entire dataset. i made a website that lets you explore the "deduplicated" dataset and view similar pepes. it only has 30,000 images. https://bbwroller.com/frens

>>990243
also i fucked up. it doesnt actually have 130,000 images. the torrent only has 100,000. i didnt include the other 30,000 because the probability they were actually pepes was lower
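One common reason clustering code cannot cope with 100,000+ images is that it compares every pair. A minimal sketch of one way around that, assuming each image has already been reduced to a 64-bit perceptual hash (the function names and the `max_dist` default are illustrative, not OP's actual code): split each hash into `max_dist + 1` bands and only compare images that share at least one identical band. By the pigeonhole principle, two hashes that differ in at most `max_dist` bits must agree exactly on at least one band, so no close pair is missed.

```python
from collections import defaultdict
from itertools import combinations

HASH_BITS = 64  # assumed hash length


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def group_near_duplicates(hashes: dict[str, int], max_dist: int = 3) -> list[set[str]]:
    """Group names whose hashes are linked by pairs within max_dist bits."""
    n_bands = max_dist + 1
    bounds = [i * HASH_BITS // n_bands for i in range(n_bands + 1)]

    # Bucket every image once per band, keyed by (band index, band bits).
    buckets = defaultdict(list)
    for name, h in hashes.items():
        for i in range(n_bands):
            lo, hi = bounds[i], bounds[i + 1]
            chunk = (h >> lo) & ((1 << (hi - lo)) - 1)
            buckets[(i, chunk)].append(name)

    # Union-find to merge confirmed near-duplicate pairs into groups.
    parent = {name: name for name in hashes}

    def find(x: str) -> str:
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Only pairs that share a bucket are compared in full.
    for names in buckets.values():
        for a, b in combinations(names, 2):
            if hamming(hashes[a], hashes[b]) <= max_dist:
                parent[find(a)] = find(b)

    groups = defaultdict(set)
    for name in hashes:
        groups[find(name)].add(name)
    return [g for g in groups.values() if len(g) > 1]
```

Groups here are connected components, so a chain of small edits can link two images that are further apart than `max_dist` from each other.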
>>
no wojaks?
>>
>>990243
absolutely based I am not a frog poster but I have become one starting from today
>>
This should be stickied
>>
File: 1609096703399.gif (142 KB, 1221x1007)
>>990389
>>
>>990455
no u
>>
>>990456
Leddit is around the corner.
>>
File: download.png (8 KB, 194x259)
>>990458
>Leddit is around the corner.
>>
>>990459
Leddit is around the corner.
>>
Stupid thread.
>>
>>990243
are you from the past?

do you have that kitten who wants a cheezburger too?
>>
>>990477
Then go back to jerking off to your trap/sissy megapacks, you tranny faggot. Better yet, kys.
>>
>>990243
Based
I'm seeding this now
>>
seed baby seeeeed!!!!!!!!!!!!!
>>
if you got rid of identical duplicates i would dl and sneed.
>>
Thanks. Can we have a wojak collection too?
>>
Now That's What I Call Autism vol. 5
>>
sneeded+seeding this gem
>>
thanks OP for making this available, it's a nice large dataset to experiment on. the problem is the absolutely massive amount of dupes and how to get rid of them. checksums are out, you need to use a tool for comparing images. first off, and what i should have done at the start, delete all pics under 126 pixels width. apparently people save fucking thumbnails instead of full pics and it's no loss to jettison those. you can do this easily in XnView, Search, and specify the dimensions.

Next, Get VisiPics or AntiDupl.Net to use to try and automatically find similar images. This is where it gets tough because they dont have a simple way to say "out of this group of similar pics, just take the best one". And unless you are an autistic NEET, do not attempt to manually go through and select the best from the 4000 groups i found in the 100.zip alone. it will take days.

currently looking for better methods to dedupe this thing.
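The thumbnail step above can also be scripted instead of done through a GUI. A minimal sketch, assuming only PNG and GIF files need checking (JPEG and WEBM headers are not parsed here and are left alone) and using the 126 pixel cutoff from the post; the function names are made up for illustration. It reads the width straight out of the file header, so nothing has to be decoded.

```python
import struct
from pathlib import Path
from typing import Optional

MIN_WIDTH = 126  # cutoff suggested in the post above


def image_width(header: bytes) -> Optional[int]:
    """Return the pixel width of PNG or GIF data, or None if unrecognised."""
    # PNG: 8-byte signature, 4-byte chunk length, b"IHDR", then big-endian width.
    if len(header) >= 24 and header[:8] == b"\x89PNG\r\n\x1a\n" and header[12:16] == b"IHDR":
        return struct.unpack(">I", header[16:20])[0]
    # GIF: 6-byte signature, then little-endian 16-bit logical screen width.
    if len(header) >= 10 and header[:6] in (b"GIF87a", b"GIF89a"):
        return struct.unpack("<H", header[6:8])[0]
    return None


def is_thumbnail(header: bytes, min_width: int = MIN_WIDTH) -> bool:
    """True only when the width is known and below the cutoff."""
    width = image_width(header)
    return width is not None and width < min_width


def find_thumbnails(folder: str) -> list[Path]:
    """List files under folder whose header says they are thumbnail sized."""
    found = []
    for path in Path(folder).rglob("*"):
        if path.is_file():
            with path.open("rb") as f:
                if is_thumbnail(f.read(24)):
                    found.append(path)
    return found
```

`find_thumbnails` only lists candidates; deleting or moving them is left as a separate step so the list can be reviewed first.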
>>
>>990785
the other thing i should mention is you have to sort of come up with some rules for specifying a dupe in something this diverse. for example, it's easy to say "just take the file with the larger dimensions" but what about when you have two identical pics and one has background transparency and the other doesnt? and what about when you have a pic that is maybe 5% different because some dude put his obscure website favicon logo on pepe's shirt? do you want to keep that one? you can't hardcode that rule because a lot of good, similar pics are slight changes in facial expression, which you want to keep.

what about when they are identical dimensions but one is 400K larger than the other? are you somehow losing some valuable data? these are just some of the decisions that i'm struggling to find a way to automate.
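One way to make these decisions explicit is to write them down as a ranking and let the code pick the top file in each group. A minimal sketch, assuming a hypothetical rule order (largest pixel area first, then transparency, then larger file size); the poster did not settle on these rules, and slight edits such as changed expressions still have to be kept out of the group by the similarity step beforehand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Metadata for one file in a group of near-identical images."""
    name: str
    width: int
    height: int
    has_alpha: bool
    size_bytes: int


def keeper_rank(c: Candidate) -> tuple:
    """Sort key: bigger area wins, then transparency, then file size."""
    return (c.width * c.height, c.has_alpha, c.size_bytes)


def pick_keeper(group: list[Candidate]) -> Candidate:
    """Return the single file to keep from a duplicate group."""
    return max(group, key=keeper_rank)
```

Changing the policy then only means reordering the tuple in `keeper_rank`, for example putting `size_bytes` ahead of `has_alpha`.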
>>
I'll be seeding for awhile
>>
>>990683
>>990455
>>
>>990243
Fucking bump
>>
>>990243
Bump this shit, based as fuck
>>
dead meme
>>
>>990378
you could post the code and let someone with a better computer run it.
>>
>>990243
anon, a thread died for this...
and fuck that thread good job
>>
File: 1608769480213.jpg (20 KB, 600x600)
>>991036
based
>>
File: 1527904428380.png (65 KB, 500x382)
>>990243
Thanks, Based One
>>
File: apu panda.png (60 KB, 300x250)
Here OP add this to your collection
>>
we did it niggers
>>
Okay so heres the plan:
>Somehow delete duplicates with worse quality
>Run every image through https://github.com/fhanau/Efficient-Compression-Tool with -strip -9 flags
>zip
>20% less size and a better dataset overall
>???
>profit
>>
>>991122
this is how you get in a position where you can't find duplicates, you retard, it's just 15gb anyway are you poor or something?
>>
>>991132
english nigger do you speak it?
>>
File: 1609434300617.png (309 KB, 657x731)
>>990528
Now this is a based post
>>
File: livingispain.png (460 KB, 666x666)
>>991122
you forgot step 0, which would be to remove all thumbnail sized images from the set first. and you are just fluffing over the remove dupes step which is the hardest part.

before anyone tries to modify this massive archive you should ask yourself what you plan to do with this. at first i thought "oh, it'd be cool to have a folder where i could pick a pic for any feel when i post" but the fact is there are so many images with no descriptive filename, you'd spend 15 minutes looking and not even browse 5% of them. it's not good for that. it's not even good for creating a folder of go-to images because, again, you have to browse so many unsorted pics.

this seems well suited for training an ai model. i'm gonna start reading up on the criteria for that because i donno what else to do with this, honestly. there's one interesting part to this archive that may make you NOT want to remove dupes. that is, it may be true that the most popular pics have the most variations (in size, dimension, etc.), in which case maybe you don't want to remove dupes, so that when you feed it to a model generator, it's biased toward the most popular images. i donno. i dont know shit about ai yet.

Visipics is great for generating a crude histogram of images with the most variations. you better believe there is a base pepe with literally every college/pro sports team hat on it in this archive.

if you guys want the same thing but soijak, check out this archived thread for links:
>>>/g/79476879
also, just grab this 2gig zip of jaks: https://mega.nz/#F!tOhxAYjI!Y9nFdFHI_2wlCryV__4-wQ
>>
>>991201
>only 2gig
>>
>>991202
>>990389
>>990683
>>991201
sorry so smol. how do i delete link?
>>
>>991207
just leave it, but on the internet there are way more variations of wojak than of Pepe, thats why I was surprised the wojak archive is only 2gb while the Pepe one is around 10gb (if you delete most duplicates)
>>
>>991212
i was joking. if you want to try and do what op did for soijak, in that thread i posted they suggest scraping basedjak dot party. maybe you'll get a lot more. then again, i'd like to see him just scrape 4chan in the same way for it.
>>
>>991201
>you better believe there is a base pepe with literally every college/pro sports team hat on it in this archive

https://bbwroller.com/frens/search/0afd4ddba7970487d8de848a7ea1ebfd1584908662bc2378e66e31ee97b4a014
https://bbwroller.com/frens/search/19619e62cf88cec4a08b6727570279fa0bd5eef8afae901141e6bceb40890c0a
https://bbwroller.com/frens/search/d8e36989e2dd52b284333074267829782d8009e208af21b1e893f363fecb2465
>>
>>990243
omg i gonna cry, this is amazing.

how close do you think you are to a complete collection?
>>
Bump
>>
File: sad_frog - Copy.png (16 KB, 366x313)
You guys must be just as addicted to coming here as I am, I don't post enough to want so many pepes though, here's a pepe I created through paint though, OP, enjoy.
>>
this is better than porn
>>
not a frog poster, but still high quality post OP. thanks
>>
>>990243
crashed qbittorrent
>>
Is anyone seeding right now?
>>
>>991498
it took a long time for the metadata to fully load on this in qbittorrent for me. i also tried adding a large list of trackers to it, but i donno if this is even on a tracker so that might not help

>>991580
i am, but it appears like i've been the only seed for a couple days. i'm not OP and i'm not gonna do it forever. people dont have an excuse to not seed this thing, it's not like it's copyrighted material or anything. r-r-right?

>>991237
this is incredible. what software do you use to determine similarity?
>>
File: pepefingers.jpg (27 KB, 550x400)
Beautiful work bro
>>
>>991580
I'll be seeding until 11am EST tomorrow
>>
>>991655
>what software do you use to determine similarity?
https://github.com/JohannesBuchner/imagehash

>>991580
I am seeding on a vps with 200 MB connection. Try these trackers
https://ngosang.github.io/trackerslist/trackers_best.txt
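For anyone wondering what a perceptual hash actually does: the linked library works on real image files, while the stdlib-only sketch below just illustrates the idea behind an average hash on a grid that is assumed to be already downscaled and converted to grayscale (normally 8x8, values 0-255). It is not the library's code. Each pixel becomes one bit depending on whether it is brighter than the mean, and two images are compared by counting the bits that differ.

```python
def average_hash(gray: list[list[int]]) -> int:
    """Pack one bit per pixel: 1 if brighter than the grid's mean, else 0."""
    pixels = [p for row in gray for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits; 0 means the hashes are identical."""
    return bin(a ^ b).count("1")
```

Small brightness or compression changes tend to leave most pixels on the same side of the mean, which is why re-saved jpegs usually end up only a few bits apart.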
>>
>>990528
This.
>>
File: 047 - TWTFPSf.jpg (8 KB, 222x204)
now this is awesome!!
>>
File: 1600121791950.jpg (23 KB, 384x332)
>>990243
if this is real im going to shed a tear.
>>
>>990378
Does this mean out of 130k only 30k are unique?
>>
File: 1605048541153.png (99 KB, 746x512)
>>990243
HOLY FUCKING BASED !!
THANKS SO MUCH FREN !!




>>
what a waste of time.
>>
>>991995
yeah complaining about pepe image dumps in the torrent section of a basket weaving forum is much more productive
>>
>>992046
based
>>
>>990243
Wouldn't that make Limited Edition Pepe less valuable?
>>
>>990243
Fucking based, thanks op
>>
took like 7 hours to download but hek i got it now thanks friend
>>
File: newest meme very new.jpg (31 KB, 600x571)
thanks but I'll wait for something better
>>
File: di768rKAT.jpg (124 KB, 1920x1080)
>>
great torrent. i'll seed this one for a while
>>
File: AnAngryGod.jpg (22 KB, 400x300)
>>992434
MILHOUSE IS NOT A MEME YOU NEWFAG FUCK
>>
File: brainlet 3d.webm (2.77 MB, 480x480)
any 3d ones like this
>>
File: 1599918140866.gif (1.84 MB, 266x199)
>>
File: V Tip.gif (2.79 MB, 312x250)
>>
>>990243
this is solid gold props to you OP
>>
>>990243
This is the level of autism that makes this site worthwhile
Thank you friend
>>
File: 1593437131240.png (150 KB, 343x343)
>>993054
This thread needs to live
>>
>>991273
I just like that fucking frog.
>>
File: 00 The Fool.gif (78 KB, 227x533)
Does it have the Tarot pepe?
>>
bump
>>
>>990243
Thanks OP, very cool!
>>
Based
>>
>>990243
hahaha awesome. i have been saving Pepes and Trump images since 2015. i must have 1000s by now lol.
>>
Anon, you don't really expect me to download 15 gb of cartoon frogs, do you? Because I will. Thank you.
>>
There's a few thousand extra pepes available at the-eye too. https://the-eye.eu/public/Images/Pepe/
>>
>>990243
op here. i've got something in the works to remove thumbnails. i'll try to tackle duplicates later
>>
>>994592
Yeah, please. I downloaded this, extracted the 001 one and when I went through it, I just wasted my time looking at low quality ones, and in the end I figured my pepe collection is better and deleted it all...now I kinda regret it coz I could have searched for *.gif or sorted by size and kept some quality ones. if you could remove the shitty ones it would be an amazing torrent bro. Duplicates, well that would be great, but my biggest complaint was the low quality pixelated thumbs in there.
But thank you for your effort, dont wanna come out as a dick, you are doing keks work here after all.
>>
>>991273
This is history man, you need to think long term. In a decade or two, these will be worth millions.
>>
someone post the removed duplicate version of op's post
>>
File: CthKtAiVYAIE2zh.jpg (70 KB, 540x514)
>>990243
thanks for the frens, faggit.
>>
File: aqq7j.jpg (34 KB, 619x453)
>>990243
Am I there?

T. 178cm man
>>
File: download.jpg (8 KB, 220x184)
fuck, torrent dead
>>
>>994765
na just very slow, fren. currently downloading and i'll keep seeding asap
>>
>>994765
I'm gonna wait on the fixed one with the dupes/thumbnails removed, personally
>>
>>990243
stupid frogposters
>>
Seeding
>>
>>990243
Oh boi
>>
Shit, quality is shit kinda
>>
>>994828
im dl'ing now
have an exam tomorrow, then the dedupe is probably gonna take a day or 2
>>
>>990243
Thanks!
>>
File: Now I am become Death.gif (1.7 MB, 438x392)
>>
a
>>
online viewer
https://bbwroller.com/frens
>>
File: 1606092213246.png (238 KB, 756x569)
>>995260
pretty cool
>>
>>990458
'no u' started on this site
>>
File: 625.png (24 KB, 657x527)
>>990243
And here I was feeling bad over having collected almost 1100 frogs over the years.
>>
File: Capture.png (166 KB, 1113x612)
now something for the other faggots trying to remove the dupes
should edits like these be considered dupes?
quite a few images are jpegs of jpegs of jpegs, so the similarity between these 2 is greater than between the original and some of its jpegs

do 4channelers really cant into saving a fucking png
>>
File: dupes.gif (1.41 MB, 769x612)
going deeper into the rabbithole, would these be considered dupes? there are 48 fucking variations of this file
>>
>there's 84 more variants of the same file but mirrored
what the fuck are 4channelers doing
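Mirrored variants defeat a plain hash comparison because flipping the image moves every bit. A minimal sketch of one workaround, assuming the same toy average hash on a small grayscale grid as a stand-in for whatever hash is really used: compare one image against both the other image and its horizontal flip, and keep the smaller distance. Whether a mirrored pepe should count as a dupe at all is still a policy call.

```python
def average_hash(gray: list[list[int]]) -> int:
    """Pack one bit per pixel: 1 if brighter than the grid's mean, else 0."""
    pixels = [p for row in gray for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def mirror(gray: list[list[int]]) -> list[list[int]]:
    """Flip the grid horizontally."""
    return [row[::-1] for row in gray]


def mirror_aware_distance(a: list[list[int]], b: list[list[int]]) -> int:
    """Smallest hash distance between a and either orientation of b."""
    ha = average_hash(a)
    return min(hamming(ha, average_hash(b)),
               hamming(ha, average_hash(mirror(b))))
```

This doubles the hashing work per comparison, so in practice the flipped hash would be computed once per image and stored next to the normal one.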
>>
put them through duplicate image software

found 15k duplicates

gj anon
>>
Nigger.gif
>>
>>995396
>should edits like these be considered dupes?
well obviously, are you not human?
>>
>>995523
no they shouldnt. that is a unique pepe. the black squiggle is representative of dust or hair on your monitor, and the smug pepe is to make you mad after you realize it is part of the picture. it is a modern masterpiece.
>>
File: So Basic.jpg (11 KB, 340x202)
Bump
>>
thank you OP, now I have a frog for every occasion
>>
>Sorted out 112k duplicate frens :)
>Reduced amount of frens down from 130k to 18k very unique (at least 50% different) frens
>Decreased file size from ~15gb to ~4gb
>Put the frens together in one folder :D

Not op but removed duplicates for personal use. May contain some random anime pictures that it failed to catch. Still some pesky fucking duplicates in there even though i used the strictest settings on 2 different softwares.

No magnet link but feel free to make one: https://mega.nz/file/y5MliCBL#Ku78MX9flMiC3vhPYl7n4r-4KbhPVmIdlAOVLGb4tB4
>>
>>995941
thanks anon
>>
>>990243
Thanks fren!
>>
>>995941
thanks anon
but can someone make a torrent please? mega has a download limit of 4gb and this one is 4.09gb
>>
File: This is a conundrum.jpg (214 KB, 1242x594)
>>990243
Bumpn
>>
>>990243
Hey, could someone seed me please? I can't even download the torrent name yet.
>>
>>995941
op here. this looks really good. what software did you use to find dupes?

>>995985
515231255edffbef39af3a08e79e74423d99688d
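Assuming the string above is a BitTorrent info hash (it has the same 40 hex character shape as the one in OP's magnet link), it can be turned into a magnet link by prefixing `magnet:?xt=urn:btih:`. A minimal sketch; the function name is made up and the tracker URL in the usage note is a placeholder, not a real tracker.

```python
from urllib.parse import quote

HEX_DIGITS = "0123456789abcdef"


def magnet_link(info_hash: str, trackers: tuple[str, ...] = ()) -> str:
    """Build a magnet URI from a 40-character hex info hash."""
    info_hash = info_hash.strip().lower()
    if len(info_hash) != 40 or any(c not in HEX_DIGITS for c in info_hash):
        raise ValueError("expected a 40-character hex info hash")
    link = f"magnet:?xt=urn:btih:{info_hash}"
    # Each tracker is percent-encoded and appended as its own tr= parameter.
    for tracker in trackers:
        link += "&tr=" + quote(tracker, safe="")
    return link
```

For example, `magnet_link("515231255edffbef39af3a08e79e74423d99688d", ("udp://tracker.example.org:1337/announce",))` yields a link with one `tr=` parameter; without trackers the client has to find peers through DHT.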
>>
>>995941
>50% different
Can you do one that only checks like 98% similar so small variations are still there?
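How a percentage like "98% similar" maps to a setting depends on the tool, and the software used for the 18k version is not named in the thread. As a rough sketch under one assumption, namely that similarity means the fraction of matching bits in a 64-bit perceptual hash, the percentage converts to a maximum number of differing bits like this (function names are illustrative):

```python
import math


def max_distance(similarity: float, hash_bits: int = 64) -> int:
    """Largest Hamming distance still counted as a duplicate."""
    if not 0.0 <= similarity <= 1.0:
        raise ValueError("similarity must be between 0 and 1")
    # Small epsilon guards against float error just below a whole number.
    return math.floor((1.0 - similarity) * hash_bits + 1e-9)


def is_duplicate(hash_a: int, hash_b: int, similarity: float = 0.98) -> bool:
    """True when the two hashes match on at least `similarity` of their bits."""
    return bin(hash_a ^ hash_b).count("1") <= max_distance(similarity)
```

Under that assumption 98% allows only 1 differing bit out of 64, while 50% allows 32, which would explain why so many small variations got merged.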
>>
>>996101
why make a torrent of the zip?
the extracted version is 3.47GB as opposed to 4.08GB and also lets you do shit with the images while seeding
also please announce future torrents to a couple trackers, DHT is botnet
>>
>>995342
it didnt.
>>
>>996147
i just made a torrent from what i downloaded from mega. i haven't extracted it yet. i use trackers from here: https://github.com/ngosang/trackerslist
>>
>>995941
so in total there is only 18K pepes that arent dupes even in the torrent op posted




