/g/ - Technology




File: file.png (1.71 MB, 1571x1134)
https://www.wired.com/story/youtube-training-data-apple-nvidia-anthropic/

I fail to see how any of this shit is "stealing" in and of itself. There may be arguments to be made about monetization, or about inclusion in proprietary data sets of something that was made publicly available, etc., but not about the action itself. The idea seems to be that putting something out on the Internet intending it for public exposure is fine no matter if someone, say, writes a blog post and charges for it or otherwise monetizes it (i.e. Medium, Substack, something similar) based on accessing many YouTube videos or other sites, but that suddenly the rules need to change when "AI" does exactly the same thing an individual can do. All this ends up doing is hurting attempts at FOSS AI model improvement and instead centralizing AI performance around monied megacorps. More nuanced critique may be appropriate depending on the circumstances, but the guy with a YouTube talk show bitching that AI is being trained on it because they didn't come specifically ask him for it (and pay, no doubt) doesn't make a lot of sense. It's hard not to see this as a wider application of "anti AI artist" self-interested profiteering that demands AI/LLMs not be allowed to do the exact same things humans do.

Edit: the article is co-published with Proof News, which seems to have an axe to grind against anything having to do with AI.
>>
>publicly available for free
>stolen
Pick one, JewTube rep.
>>
>>101557170
That's my exact point. If I put up a video on YouTube that's intended to be watched by anyone who's interested (as opposed to a private video just for my own use that isn't searchable, or one only accessible via a public link if you know it, etc.), then how can anyone bitch about it being used for... anything?

Assuming nobody is trying to scrub the attribution, who made it, etc. (and the article doesn't seem to indicate that; otherwise how could they know Khan Academy vids were being used? The data set seems to accurately attribute and list its inclusions), what is the problem? If you're fine with it being done by a human, histrionics about AI doing the same thing make no sense.
>>
>>101557292
>then how can anyone bitch about it being used for...anything?
Well you just said "intended to be watched by anyone of interest". These companies aren't watching them.
>>
>>101557680
They're "watching" them just the same way that some guy who puts them on for background music , or whatever the fuck else. Without going to
>YOU ARENT WATCHING THE COMMERCIALS YOU ARE STEALING TV ARREST HIM YOUR HONOR
levels of stupidity (which we used to universally decry as moronic when it came from RIAA/MPAA types, or from those who tried to restrict ad blocking or make it illegal), there's no case to be made here.

>intended to be watched by anyone of interest
I meant that in contrast to something made to be private, like if someone uploaded home movies to YT but set them to only be accessible from their own account. If someone uploads something like the things mentioned, which are intended for the public - others can follow a link or search for it regardless of their account - then being upset that an AI has the same access by the same method makes no sense.
>>
File: 1699425701556175.jpg (92 KB, 750x1000)
>>101557030
Excuse me "YOUR" videos? Since when do (you) own youtube videos that (you) upload?
>>
>>101557927
>They're "watching" them just the same way that some guy who puts them on for background music , or whatever the fuck else.
No, they aren't.
>>
>>101557170
just like 4chan posts
>>
>>101557927
My videos are intended for human viewers. If an AI company wants to use them as training data, they can buy a different license for that. They apparently crave shitloads of high quality video, so time to pay up
>>
>>101557030
there are many crunchyroll videos available for free. does that mean sony's copyright is no longer valid?
this argument is braindead. these large ai companies are scum and you're defending wealthy satanists
>>
>>101557030
This is honestly a good thing; it will filter out bad channels like linus tech tips, markass brownlee and gamers nexus, and will result in people seeking out more organic videos.
>>
File: 2fe.jpg (587 KB, 1448x2048)
>>101557030
It's an obvious copyright violation because AIs have no Fair Use. What they are trained on, they become a Derivative Work of.
Hence they infringe on the original author's copyright (which was originally based on preventing competition with the first printer of a book while it was still in print with them). If someone uploads their life advice or art project to YouTube and it gets incorporated into an AI model, that AI will then be a derivative work AND compete with the original author on the basis of that author's work, without their permission and quite possibly against their will.
>>
>>101558139
This makes no sense.

>>101558167
Without going to asinine lengths in terms of IP law that would make a Disney exec cum hard enough to hit the ceiling, there's no way to justify this. Why not say
>Excuse me you only have the right to watch my video WITH MONETIZATION IF ITS ADBLOCKED YOU ARE STEALING
>Excuse me you only have the right to watch my video FOR ENTERTAINMENT IF YOU LEARN ANYTHING AND OR USE IT FOR ANY CREATIVE ENDEAVOR, YOU HAVE TO PAY THE THOUSANDS OF DOLLARS LICENSE. OH I AM SORRY YOU MADE REFERENCE TO IT IN A SUBSTACK GIVE ME ALL YOUR MONEY
and so on and so forth. It obliterates fair use, and going NUH UH the moment fair use involves an AI model is moronic.
>high quality video
Also, they don't even need the video in many cases; the one discussed in the article is about taking subtitles.
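For what it's worth, anyone with a browser can already pull those same subtitles for any public video. A minimal sketch of what that looks like, assuming yt-dlp is installed and with VIDEO_ID as a placeholder:

import subprocess

# Fetch the uploader-provided and auto-generated English subtitles for one
# public video, without downloading the video stream itself.
subprocess.run(
    [
        "yt-dlp",
        "--skip-download",    # don't download the actual video
        "--write-subs",       # uploader-provided subtitles, if any
        "--write-auto-subs",  # YouTube's auto-generated captions
        "--sub-langs", "en",
        "https://www.youtube.com/watch?v=VIDEO_ID",
    ],
    check=True,
)

Same public endpoint, same access any individual already has; the only difference is doing it at scale.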

>>101558201
It does not, but that's beside the point here, because they're not doing anything that would violate those copyrights if done by a human, in terms of fair use.
>TEH WEALTHY SATANISTS HURRR
Fuck off. The reason this shit is such a problem, as I mentioned, is that it's not going to affect the megacorps, who can afford to throw money all over the place to license everyone and everything; they're happy to do it to be "ethical" because they know it basically pulls the ladder up behind them. The issue is that FOSS, self-hosted, independent AI models can't afford to pay every random tard who screeches for thousands of dollars in licensing, so now you've basically established that the only performant AI models will be exclusively proprietary, secret, and owned by megacorp tech companies.

>>101558206
...no? Those 'bad channels' were made by humans, and humans make equal if not worse knock-offs if they think they can profit from them. None of the I WANT TO BE PEWDIEPIE OR MR BEAST monetization knock-offs happened because of AI. Are you even thinking?

Out of room, I'll get to the other point.
>>
>>101557030
>Edit:
You have to go back
>>
>>101557030
Good, they are public
>>
>>101558283
Fair use does not evaporate because a tool is used; AI is not independent, and both training a model and prompting it are actions of the person doing so.
>what they are trained on they become a derivative work of
This is the same kind of nonsense as accusing a search engine of violating copyright because it has to "know" the content and its address in order to serve it. Also, again, fair use is intact, including for derivative works, especially at this degree of extrapolation. AI/LLMs and humans, for instance, both learn
>What drawn big tits look like
based on their exposure to existing drawn big tits. Unlike with a digital model, the only issue is that we can't crack open some random patreon artist's head and prove that he decided to draw big heavy banana boobs inspired by Raita in a specific work of his, while the nipples are more like another artist's style and the body types are closer to a combination of two other approaches, etc. Many of those exposures may even have been 'illegal' - he didn't buy that Raita book, he saw it somewhere random online - but we only see the end product. It's the same overall process of exposure to those figures, styles, etc. Why do you think there are so many bad "ColdSteel the Edgehog" or "Goku but with another hair color" style "this is just an existing character with a couple of things swapped around" knockoffs? Humans do it all the fucking time and it's considered fair use in the vast majority of cases. Commissioning some artist to make a knockoff Hazbin Hotel character requires knowing those properties; an LLM does the same.

AI/LLM is just another tool, no different from how CGI and digital art tools differ significantly from physical media (and there were some fans of the latter who claimed the former were 'not real art' too - YOU HAVE AN UNDO BUTTON? YOU DONT HAVE TO MIX THE COLORS AND CAN JUST PICK!? NOT AN ARTIST HURRRR etc), and fair use doesn't go away because you use different tools.
>>
>>101558928
Then we simply don't see eye-to-eye on the limitations of Fair Use.
Bye!!
>>
>>101559059
See you tomorrow!


