not because it's not feasible, but because AI developers will never realize that purging any trace of reddit from the dataset is one of the necessary breakthroughs to achieve it
>>108021458
humans (or at least some of us) are intelligent despite consooming lots of garbage data on a daily basis.
there must be more to AGI than just data. there must be different subsystems for different types of intelligence, just like our brains have subsystems.
>>108021458
>Train on reddit
>Put negative weight on those vectors
>AGI instantly achieved
>>108021458
That is just being used as a source for web search queries.
There are dedicated data companies that provide curated data for AI training
>>108021458
kek, I deliberately shitpost on Reddit to muddy the AI waters with slop.
/g/ needs to contribute to the shitpile
>>108021458
AI companies pirated hundreds of thousands of books from libgen and Anna’s Archive and fed them to the AI.
>>108021458
I thought 4chan was going to be at least in the top 4
>>108026264
4chan doesn't maintain a large historical database of posts going back decades, and that's what the bot-trainers want
third party archives exist, but the site itself has little old content (except for a few very slow boards)
>>108026264
they avoid 4chan because they don't want the ai to go nazi
without heavy censorship it turns nazi on its own anyway, but it's unrelated to this place
>>108026264
It's right there.