Exactly how would this work?
this method is known as bullshit in bullshit out
>>108199363
Distillation. Query the model, and train another on the results. It's not really an attack; it's more like paying for material and then using it. But that's a no-no because that's not fair, I guess.
>hehe we just had to scrape the whole internet, copyrighted or not, just a mere six gorillion tokens of data, no biggie
>DISTILLING MY MODEL? THAT'S ILLEGAL!
LLMs are just a statistical model. So unless they use some kind of cryptographic random process in the inference, there will always be a direct correlation between what you give it and what it gives you back.
If you have enough data to correlate inputs and outputs, you can train another model to make the same correlations between inputs and outputs.
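The query-and-train loop anons are describing can be sketched in a few lines. This is a toy, not any lab's pipeline: the "teacher" here is just a fixed bigram table standing in for an API, and the "student" is fit by counting instead of gradient descent, but the shape is the same: hammer the model with inputs, log the outputs, fit your own model to the pairs.

```python
import random
from collections import defaultdict, Counter

# toy "teacher": a fixed next-word table standing in for the big model's API
TEACHER = {
    "the": {"cat": 0.7, "dog": 0.3},
    "cat": {"sat": 1.0},
    "dog": {"ran": 1.0},
}

def query_teacher(word, rng):
    """Sample the teacher's next word -- stand-in for one API call."""
    nxt = list(TEACHER[word])
    weights = [TEACHER[word][w] for w in nxt]
    return rng.choices(nxt, weights=weights, k=1)[0]

# step 1: query the teacher a lot and log the (input, output) pairs
rng = random.Random(0)
pairs = [(w, query_teacher(w, rng)) for w in TEACHER for _ in range(1000)]

# step 2: fit the student on the logged pairs (frequency counting here;
# a real distillation run would do gradient descent on the teacher's outputs)
counts = defaultdict(Counter)
for w, nxt in pairs:
    counts[w][nxt] += 1
student = {w: c.most_common(1)[0][0] for w, c in counts.items()}

print(student)  # the student's top predictions track the teacher's modes
```

With enough queries the student's statistics converge on the teacher's, which is the whole point: the correlation between inputs and outputs is the model.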
>>108199802
if you hide your blog from ai scrapers but keep it wide open for googlebot for le search ranking on a service nobody uses anymore, you're cucking yourself by serving your shit up to gemini on a silver platter
>>108201850
on my blog i serve 100k seo tags for every post, and they're all a variation of the Nword
>>108199363
(((Google)))
>>108199802
"Laws don't exist, only power."
Once you understand this, you realize there's no such thing as interpretation, only alliances.
>>108199363
heh
>>108201876
I'm trans btw, not sure that it matters
>>108199363
reversing weights
Outrageous lies and defamatory material!
Every company has access to the general literature (like libgen) and public web data (dated) to train the foundation model, i.e. a dumb/statistical model that predicts the next word. To get better at certain tasks like math/programming, they have to pay professionals to write good, long training data, and that process is not cheap. Chinese companies have been prompting for this curated/expensive data to improve their models' capability.
>>108201876
woah....
>>108199363
>recover multi trillion parameter model from like a few million tokens
retard
>>108199363
if i ask you 1000 questions then i know