/g/ - Technology






NB4 Spam Edition

Discussion of Free and Open Source Text-to-Image/Video Models and UI

Prev: >>106755435

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/sd-scripts/tree/sd3
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts
https://github.com/tdrussell/diffusion-pipe

>WanX
https://comfyanonymous.github.io/ComfyUI_examples/wan22/
https://github.com/Wan-Video

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
Training: https://rentry.org/mvu52t46

>Neta Lumina
https://huggingface.co/neta-art/Neta-Lumina
https://civitai.com/models/1790792?modelVersionId=2203741
https://neta-lumina-style.tz03.xyz/

>Illustrious
1girl and Beyond: https://rentry.org/comfyui_guide_1girl
Tag Explorer: https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbours
>>>/aco/csdg
>>>/b/degen
>>>/b/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
>>
>Discussion of Free and Open Source Text-to-Image/Video Models and UI
https://files.catbox.moe/ccd0qs.mp4
>>
Focus on the maintain thread quality part.
>Maintain Thread Quality
https://rentry.org/debo
He's having an episode, so just do what's needed: don't engage, and do what you need to do.
>>
File: 1753078401259449.png (293 KB, 549x617)
When will local reach OpenAI Sora 2 quality?
>>
blessed thread of saas
>>
>>
>>106758722
then why do you keep engaging?
>>
>>106758722
you are truly obsessed, ranjeet
>>
thx 4 da phree bumps cloudcucks :D
>>
>>106758722
and remember guys, every time you see a post you don't like, that's definitely debo. climate change? that's also debo! World War 2? that was debo as well!
>>
File: file.png (140 KB, 247x247)
Wish I didn't waste so much time on the 1.2B model, something is seriously wrong with the architecture, probably the mlp. Using the HDM mlp setup the 600m model test is learning so much quicker and works with much higher learning rates.
>>
oh he big mad with those replies kek go back to spamming
>>
>>106758769
fuck off debo
>>
>>106758770
In his mind if he doesn't look at it in OP it doesn't exist
>>
>>106758769
>Using the HDM mlp setup the 600m model test is learning so much quicker and works with much higher learning rates
Based.
>HDM mlp
Wat is?
>>
>>106758722
ngl we hear more about you complaining about debo than debo posting, you are more annoying, fag
>>
>debo
>>
>>106758769
would you humor me and upload the 1.2b? if anything, to admire the file since it's one of a kind
>>
Newfag here, who is debo?
>>
>>106758784
import torch
import torch.nn as nn
import torch.nn.functional as F


class SwiGLUTorch(nn.Module):
    """
    SwiGLU MLP: y = W3( SiLU(W1 x) ⊙ (W2 x) )
    - Supports packed weights via a single Linear projecting to 2*hidden_features.
    - For compatibility with callers that pass extra kwargs (e.g., HW=...), forward accepts **kwargs.
    """
    def __init__(self, in_features, hidden_features=None, out_features=None, bias=True, _pack_weights=True):
        super().__init__()
        self.in_features = in_features
        self.hidden_features = hidden_features or in_features
        self.out_features = out_features or in_features
        self._pack_weights = _pack_weights

        if _pack_weights:
            # Packed: one projection produces both the gate and value halves.
            self.w12 = nn.Linear(in_features, 2 * self.hidden_features, bias=bias)
            self.w1 = None
            self.w2 = None
        else:
            self.w12 = None
            self.w1 = nn.Linear(in_features, self.hidden_features, bias=bias)
            self.w2 = nn.Linear(in_features, self.hidden_features, bias=bias)

        self.w3 = nn.Linear(self.hidden_features, self.out_features, bias=bias)

    def forward(self, x, *args, **kwargs):
        if self.w12 is not None:
            # Split the packed projection into gate (x1) and value (x2) halves.
            x1, x2 = self.w12(x).chunk(2, dim=-1)
        else:
            x1 = self.w1(x)
            x2 = self.w2(x)
        return self.w3(F.silu(x1) * x2)


It's how the layers and parameters are glued together, and it controls how the data flows.
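For anyone who wants to poke at it, here's a minimal smoke test of the block above (the dimensions are made up for illustration, not the actual 600M config; HW= is just there to show the ignored extra kwargs):

# Hypothetical usage sketch; shapes are illustrative only.
mlp = SwiGLUTorch(in_features=1024, hidden_features=4096)
x = torch.randn(2, 256, 1024)   # (batch, tokens, dim)
y = mlp(x, HW=(16, 16))         # extra kwargs like HW=... are accepted and ignored
assert y.shape == x.shape       # out_features defaults to in_features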
>>
lol https://files.catbox.moe/203axg.mp4
>>
File: 1730864918478479.mp4 (1.78 MB, 640x768)
>>
>anistudio again in the op
>>
>>106758815
lmaooo
>>
>>106758795
You enjoyed the best of it with the images I posted. And to use it you'd need all my code which I won't release. Anything I actually release would be a full open source drop including the modified Pixart training/inference code.
>>
>>106758805
https://desuarchive.org/g/thread/102628478/#q102629904
>>
>>106758823
>And to use it you'd need all my code which I won't release
ah, understandable. i enjoy reading your posts about it regardless. ty anon.
>>
File: 1730350897814088.png (261 KB, 1354x1142)
>>106758815
I never thought sound would make memes that much funnier, but now I realize we're still at the Charlie Chaplin stage with Wan 2.2. Can't wait to get a local model that produces sound as well; maybe it'll be sooner than you expect
https://xcancel.com/bdsqlsz/status/1973262731311755334#m
>>
>>106758722
we should rename /ldg/ to /cadg/ -> complain about debo general
>>
Too bad it can only output "man impersonating celeb" instead of the actual celeb. But hey, Sama memes can be funny I guess.
>>
>>106758836
remember to get on your knees and beg them for it nicely
>>
>>106758850
man it got his voice perfect and his face almost perfect, what more do you want
>>
>>106758851
And remember, be grateful for every release. Even if you think it's slopped, repeat after me: FINETUNES WILL FIX IT!
>>
>>106758858
>what more do you want
For it to actually look like him and not someone cosplaying as him. Pretty simple request desu.
>>
File: 00005-3184870178.png (1.37 MB, 1024x1240)
>>
>>106758863
like are we talking perfect facial bone / skin alignment? cause I dont think that is gonna happen any time soon
>>
>>106758862
I swear to god, if this general didn't cope so hard and was harsher on those slop chinks, we would've gotten better models by now. Ultimately we get what we tolerate; if we stopped hyping slopped models, maybe it would make them understand that they should stop doing that
>>
>>106758829
But like I said, this updated setup learns much, much faster and is learning more like how I'd expect, where the model goes from "abstract" patch blocks and slowly gets more and more detailed with each epoch. So it's going to take a fraction of the time. The other model really felt like it had no noticeable improvement epoch over epoch (much like this video clip).
>>
>>106758863
>For it to actually look like him and not someone cosplaying as him. Pretty simple request desu.
no model can do that, not Sora, Veo, or any local model
>>
>>106758874
Damn, local wins that one I guess.
>>
>>106758885
im sure anon is just imagining all those celeb deepfakes made with local models kek cope harder cloudcuck go back
>>
why does debo defend local so much?
>>
>>106758836
Translation
>New intel: We will be sourcing Sora2 videos from here onwards! We will be benchmaxxing and claiming we have caught up with our slopped datasets and models! Worry not peasants!
>>
>>106758888
show me lol
>>
>>106758894
>im sure anon is just imagining all those celeb deepfakes made with local models kek
must be easy as fuck to find one example then, go on anon, I'm waiting for the proof (plot twist, he won't provide the proof)
>>
>>106758902
>>106758903
go on /gif/, /b/ or /r/ right now. surprised you haven't heard of those threads yet... are you new?
>>
>>106758910
just one example, just one post, that's all we're asking, the burden of proof is on you, remember that
>>
>pretending that local models cant into deepfakes
>pretending most of /r/ isnt wizard requests
>pretending theres not a link to the celeb fake thread in OP
Just go back to spamming bro
>>
>>106758910
I just looked, none of them look close to the real person in the one thread I found
>>
>>106758923
maybe you should learn how to argue your point debo
>>
>>106758903
>(plot twist, he won't provide the proof)
>>106758923
>narrator: he didn't provide the proof
like clockwork, damn I'm so good at this
>>
>>106758895
>>106758929
>spamming didn't work
>moves to muddying the waters
How many newfags are lurking right now do you think? KEK
>>
>sora can't do cele- acckk!
https://files.catbox.moe/l68j9b.mp4
>>
File: Chrowan_00026_.jpg (1.07 MB, 2016x3072)
Anon, cousin, stop fighting.

>>106758609
Happy halloween!
>>
>>106758944
it doesn't load, catbox is down or something?
>>
>>106758941
now this is classic debo cope after losing an argument
>>
>>106758902
>>106758924
>>>/b/940546596
>>
>>106758914
>>>/b/940555975
bruh I’m not the guy you’re in a pissing contest with but like what are you even trying to argue. Deepfakes are one of the things people have been doing from the start
>>
>>106758952
it's a big one, might take a min
>>106758959
>video
>>
>>106758959
>we asked for a video model capable of doing deepfakes since we're talking about the capabilities of Sora 2
>he provides an image instead
damn... must be tough living in this world while being this retarded, I have some pity not gonna lie
>>
>moving the goalposts
but keep spamming anime videos or whatever lol
>>
>>106758944
ok it works now
>>
>>106758963
>Deepfakes are one of the things people have been doing from the start
doesn't mean that they're perfect, remember, this is the criteria
>>106758874
>like are we talking perfect facial bone / skin alignment?
>>
>>106758974
>>106758944
>anime
>>
File: 00024-1558125614.png (1.41 MB, 1024x1240)
>>
>>106758982
Qwen Edit is capable of perfect deep fakes. Wan is capable of animating those perfect deep fakes. And there's no content cop saying you're unsafe.
>>
File: 1756220638628612.png (1.73 MB, 1360x768)
>>106758987
>Qwen Edit is capable of perfect deep fakes.
me when I lie
>>
I log off and go to bed and there's been three threads in the intervening twelve hours wtf happened
>>
Seems like you are butthurt.
>>
>>106758944
Unironically who is that supposed to be
>>
File: 00025-1558125615.png (1.44 MB, 1024x1240)
Don't argue just post gens
>>
>>106758944
is this sora 2? the watermark logo is the sora 1 logo
>>
>>106758993
You don't know because you're poor and no one is going to post anything because deepfakes are illegal so right now the internet is just poorfags living in dirt hovels who aren't afraid of the law.
>>
>>106759000
Any time a new cloud model drops it's used as a wedge. Some discuss it in earnest but the guy who spammed the last two threads is just trolling.
>>
>>106759008
sora 1, music is Suno
>>
>>106759010
I literally posted a Qwen Image Edit render of Sam Altman with Will Smith, does that look like "perfect deep fakes" to you?
>>
>>106759014
So... to prove that Sora 2 can do celebrities you post a Sora 1 video?? lmao
>>
>>106759016
This is no different than you building a shitty birdhouse and saying it's impossible to make a good birdhouse with home tools. You don't prove me wrong, you only show you're stupid.
>>
>>106758983
>generic ai face woman
whats that supposed to prove exactly
>>
File: 00013-2348413427.jpg (1.66 MB, 2450x2450)
>>
>>106759028
>Qwen Image Edit can definitely do it, I just won't show it!
All right I'm completly convinced right now, great argument anon!
>>
>>106759025
He's not very smart. Just laugh at him getting all riled up.
>>
>>106759039
Your premise is completely flawed. I don't *need* you to use Qwen Edit and in fact the more you favela poorfags avoid the good local models the better.
>>
this is higher res with pro account
https://files.catbox.moe/4zf45q.mp4
>>
>other AI threads haven't been bumped in 40 minutes
Wooooaaaa I'm noooooootocing!
>>
File: 00035-2324060362.png (1.44 MB, 1024x1240)
>>
>>106759025
isn't that even worse for local?
>>106759033
holy brown
>>
>>106759035
>muscular woman
*barfs*
>>
>>106759055
So tell me who it is so we can compare unless you're scared of being proven wrong again kek
Also it's more brown to think a random british guy dressed up as austin powers is mike myers, but that's beside the point
>>
https://files.catbox.moe/r25a0j.mp4
>>
File: 00036-2324060363.png (1.93 MB, 1024x1240)
He really is going all out today. Sadly he can't make anything locally that's passable, so we have another week of the same non-local trolling.
>>
>>106759094
Broke the 180 rule. Animators remain safe.
>>
>>106759003
>>106759053
>>106759100
Chroma?
>>
File: 00039-2324060366.png (1.47 MB, 1024x1240)
>>106759128
Yes it's really good with more irl stuff, but I could probably do this with flux if I had it on my computer
>>
>>106759050
>one minute after this post they were bumped
this ability to notice is a burden some anon must carry
>>
File: 00040-2324060367.png (1.86 MB, 1024x1240)
>>106759146
99% of anons notice; you can tell by the engagement in the other thread. Nobody save for a few mentally ill anons and the ones that want to quell him posts there now. He has been losing this game since he lost the thread split.
>>
File: 00048-1749936220.png (1.91 MB, 1280x1024)
>>
>only posts are rans
looking grim for local
>>
>>106758881
That gif reminds me of the one where it shows progressive steps of a prompt that's like "woman on the beach" with euler a. I thought that one illustrated convergence well and it's a shame I can never find it when I need it.
>So it's going to take a fraction of the time.
Nice. I can't wait to see the progress gens.
>>
>>106759182
He's melting down over scarecrows and spamming the thread harder than the other schizo he's whining about.
>>
*yawn*
>>
>>106759182
>>106759191
yeah, at some point the complaining is even worse than the schizo himself, let's hope that one day he'll understand that he should stop feeding trolls
>>
Once again I wonder how many newfags you think are here right now
>>
>>106759120
>Animators remain safe.
remember, this is the worst we'll ever get
>>
File: 00051-3342282911.png (2.51 MB, 1240x1240)
>>
>>106759217
I think you completely misunderstand that there's a ton of compute between right now and a model smart enough to not break the 180 rule and if the future is 100B+ models, yes, animators are safe.
>>
>>106759207
>>106759191
to be fair it is fun to shitpost at him since it's really easy to set him off
my favorite is when he goes to /sdg/ afterwards to melty in there too, thinking it was debo
>>
File: 1732511172659797.png (205 KB, 2096x883)
https://arxiv.org/abs/2509.22935
Apple saved local, we'll be able to go for very low quant models and have good accuracy
>>
>>106759120
>flips the clip horizontally in editing
>>
>maybe if i samefag hard enough anon will believe me
>>
>>106759229
>there's a ton of compute between right now and a model smart enough to not break the 180 rule and if the future is 100B+ models
OpenAI has that compute, and we're far from done optimizing neural network architectures; if you believe Transformers are the end of the road, you're not gonna make it
>>
>>106759214
Surely this will work after it not working for 3 years, doing the same exact style and bit.
>>
>superior sora saas starts sdgtard seething
love to see it
>>
>>106759254
They call that the definition of insanity, don't they.
>>
*brap*
>>
>>106759246
kek, and that's it, animators are dead now
>>
>>106759217
We are getting to the point of incredibly incremental upgrades. This has better editing for the overall video, but the movement and consistency are still wack. I love AI, and it's fun AF, but until hallucinations are fixed this tech really has no future.
>>
>>106759253
Actually they don't, and realistically it's not going to be 100B to be "perfect", it's going to be something like 10T. And that's not even talking about context problems; 10s is cute but last time I checked movies are 90+ minutes.
>>
>>106759238
id rather chop off my left nut and give it to chang than put trust into crapple. considering their record id be surprised if its not a nothing burger.
>>
>>106759278
>10s is cute but last time I checked movies are 90+ minutes.
the average length of a scene before a cut from a movie is 12 seconds, all they have to do is to make multiple cuts, like you would do on a real movie
>>
File: xyz_grid-0001-1970925430.jpg (1.21 MB, 4000x1111)
>>
>>106759293
Neat
>>
>>106759288
>average length of a scene is 12 seconds
Okay you're totally rotted by TikTok. Have you ever watched a movie in your life? Yes, there are quick cuts that bring down the average, but scenes are not a series of 12 second cuts. Also that still doesn't solve the context problem, because you might realize, you need to have your 12 second clips work within a plausible, realistic, "physical" environment. So even if you managed to keep your characters consistent, you still need the model to not hallucinate physical differences in the environment, which means you NEED a full 3D global model.
>>
>>106759315
I don't know why providing that fact made you so upset but it is what it is, you can't just ignore reality
https://gointothestory.blcklst.com/the-shortening-of-movies-43dc906852f9
>The average shot length of English language films has declined from about 12 seconds in 1930 to about 2.5 seconds today
>>
>>106759315
You can use a 3d model of the environment as an input to ground the model when necessary.

Honestly I find it funny how you guys think everything has to be pure text to video. In a realistic production there would be a much more advanced pipeline with a lot of human talent in the loop. Live action and 2D/3D animated productions use an immense amount of human labor and talent for every minute of video.
>>
File: 00049-188999635.png (1.22 MB, 1024x1280)
sloprangers
>>
>>106759338
You have no understanding of visual language. With AI videos every cut looks like it has a different cinematographer. It is one of the biggest glaring problems with making longer AI videos
>>
>>106759338
Wow you just completely ignored what I said.
- movies aren't just 2.5 second or 12 second clips stitched together
- even if they are, those clips are taken with 2 or more cameras inside of a physical scene, which means even if you had 12.5 second clips, you still need the previous clip for context
>>
>>106759359
You can use an edit model to provide plausible changes of angles, you add characters and you do some I2V shit, and boom
>>
File: 00062-3008569877.png (2.44 MB, 1240x1240)
>>
>>106759357
Yeah anon, where's your 60 second clip. You're so confident it can be done I'm sure you can slap something together in 20 minutes. Really give us the SOTA state of video gen.
>>
>>106759388
>Yeah anon, where's your 60 second clip.
https://www.youtube.com/shorts/2njx4yIONSU
>>
>>106759375
lol, I should have realized you would have no idea what I am even saying before I replied. Good bait anon.
>>
>>106759404
It's all right, I accept your concession.
>>
>>106759403
>not 60 seconds
>literally hallucinates details clip to clip
>>
>>106759357
>In a realistic production there would be a much more advanced pipeline with a lot of human talent in the loop. Live action and 2D/3D animated productions use an immense amount of human labor and talent for every minute of video.
>>106759388
>I'm sure you can slap something together in 20 minutes
Kek, talk about missing the point.
>>
>>106759228
Its ability to capture TV fuzz and the general look of a tube display always amazes me. Thank god we have 16ch VAEs now.
>>
>>106759415
>I accept your concession.
classic debo phrase
>>
File: 1754151454735528.png (9 KB, 269x95)
>>106759421
>>not 60 seconds
oh yeah my b, it's 62 seconds, the horror
>>
File: 00052-845814631.png (2.46 MB, 1536x1920)
>>
File: 1753774423789611.png (846 KB, 936x1112)
the man is holding a taco and wearing a sombrero. keep his expression the same.

my food is augmented.
>>
File: proof.png (96 KB, 382x491)
>>106759423
What point? You said it's been done, it's easy. I'm merely asking you to substantiate your claims.
>>
>>106759437
>only debo use this sentence in this world
completely brainbroken
>>
>>106759438
Please be honest, are you intentionally bad faith or are you retarded and thought I didn't mean a 60 second clip of a single scene. But it's still funny because you showed a series of 5 second clips and they don't even have continuity.
>>
>>106759445
>I'm merely asking you to substantiate your claims.
we already did but you decided to pretend it didn't exist, we can't do much more when dealing with someone in pure denial >>106759403
>>
>>106759000
hes one of those guys who as a kid would deflate the ball because no one wanted to play with him
>>
>>106759445
>you said it's easy
No, I said the opposite of that. Realistic film productions don't shit out a minute of finished film in 20 minutes. You're just incapable of arguing the point and decided to create that strawman to argue against instead
>>
>artists right now
https://files.catbox.moe/oii2yi.mp4
>>
>>106759462
Please be honest, are you intentionally bad faith or are you retarded and thought I didn't mean a 60 second clip of a single scene. But it's still funny because you showed a series of 5 second clips and they don't even have continuity.
>>
>>106759448
>brainbroken
not helping yourself debo, at least try to change up your lingo
>>
>>106759459
>I didn't mean a 60 second clip of a single scene.
good thing it's not a 60 second clip of a single scene, it's a 60-second video with multiple cuts, all with a single prompt
>>
File: 00054-3320588119.png (3.23 MB, 1240x1240)
>>106759426
Had to pull out the base model for these
>>
>>106759483
You mean like how actual cartoons are? Oh, the horror!
>>
SaaS is honestly insanely powerful, these threads move so fast when amazing new SaaS models drop. It's clear that OpenAI continues to dominate the conversation in the AI space. What must local do to remain relevant?
>>
>>106759483
> it's a 60-second video with multiple cuts, all with a single prompt
Okay should be simple for you to recreate this while screen recording.
>>
>>106759482
>only debo says "brainbroken"
damn, debo is such a unique guy, only him has the right to say words you don't like it seems
>>
Still can't into real people though unfortunately
>>
>>106759493
the paper and the project are here, feel free to read it, but I think you're too deep in denial and will pretend it never existed, I guess
https://test-time-training.github.io/video-dit/
>>
https://files.catbox.moe/cawxp1.mp4
>>
>>106759493
>but bro I made a Tom and Jerry LoRA for Wan and I just stitched together clips, video is solved!
>>
>>106759494
>debo is such a unique guy
classic debo talking in the third person
>>
>>106759511
You're doing everything except provide proof YOU can do this right now. I'll accept "I can't do this myself anon because I'm poor" as an answer btw.
>>
File: 00071-3008569886.png (2.66 MB, 1240x1240)
>>
>>106759514
>I just stitched together clips
it's literally a 1 minute video made with a single prompt, no stitching or anything, the model did the whole minute by itself >>106759511
>>
>>106758944
>cele-
and it's an animation of AniStudio
>>
File: 1740296506594498.png (909 KB, 936x1112)
the man is holding a bowl of Chinese white rice and wearing a Chinese rice hat. Give him a moustache and goatee. keep his expression the same.

you should check out the lucky money club, anons.
>>
>>106759525
>well yeah some people did it and showed how it can be done but YOU didn't, therefore it doesn't exist
(You)
>>
>>106759529
You're doing everything except provide proof YOU can do this right now. I'll accept "I can't do this myself anon because I'm poor" as an answer btw.

Okay it's all one prompt, where's your 60 second video using only one prompt.

The prompt
"A cartoon airplane in an intense war scene from World War 2, the pilot is a nigger."
>>
>>106759522
oh hi debo, still talking with yourself?
>>
>>106759537
>t. I can't do this myself anon because I'm poor
It's okay anon, I know your pride won't let you say it.
>>
>>106759540
>it doesn't count unless you put 0 effort in
>>
>>106759551
i'm debo#2 you're debo#1
>>
>>106759557
You're doing everything except provide proof YOU can do this right now. I'll accept "I can't do this myself anon because I'm poor" as an answer btw.
>>
>>106759554
the model is here, feel free to test it out (you won't because you're too poor)
https://github.com/test-time-training/ttt-video-dit
>>
>>106759035
Exquisite. 2d > 3d every fucking time
>>
>>106759565
>moving the goalpost
we went from "a model cannot do a 1 minute video" to "w-well yeah it can do a 1 minute video but can YOU do it?"
>>
alive thread of stagnant (local) tech
it's so over
>>
File: 00056-2960080141.png (3.07 MB, 1536x1920)
fml
>>
>>106759565
I could do it using the model anon posted, but I would never waste time on a low-effort workflow when I could build out a professional pipeline that uses 5% of the effort of traditional filmmaking and surpasses it in results
>>
Guys I made a Tom and Jerry LoRA for Wan. I generate a still frame, input it for starting frame to video for 5 seconds, then I take the last frame as input for Qwen Edit, ask it to change the camera position, use a VLM to write a prompt, and repeat until I have 60 seconds of video. Video is solved!
>>
>>106759584
at least your gens are cool desu
>>
>>106759592
What's your point? Just saying something with a sarcastic undertone doesn't automatically invalidate it, you know.
>>
File: 1736625547512232.png (666 KB, 936x1112)
view from the back perspective

neat
>>
>>106759621
Can it do an extreme angle from below? Like from his feet looking up, at an extreme angle
>>
>>106759592
>still lying about the true process
why is it so hard to admit they managed to make a model that can do a 1 minute video all by itself?
>>
File: ComfyUI_temp_qofnp_00059_.jpg (360 KB, 1024x1280)
>>
>>106759633
probably low iq
>>
File: 1749737344170142.png (778 KB, 1098x790)
>>106759621
with an anri pic

like magic.
>>
File: 1751388724538094.png (645 KB, 1162x717)
>>106759649
diff girl to test

setup: qwen edit v2 (2509), 8 steps with Qwen-Image-Lightning-8steps-V2.0.safetensors (works better than the edit v1 lora)

pretty cool.
>>
>>106759649
she's a granny now bro time to move on already
>>
>wheelchairs are posted
>goes on the fritz
>>
Peacefully observing the localmonkeys in their natural habitat, don’t mind me
>>
File: 1736570015130821.png (1.61 MB, 912x1144)
>>106759672
what's neat is you can't take a base image and just inpaint them in reverse. you'd need a new pose and to render it in proper detail. edit models can do that. it even works with large buildings or houses. this is just a fun way to test functionality.
>>
>>106759633
>multi-scene videos up to one minute in length directly from storyboards.
Actually it's funny that my sarcastic reply is actually their process.
>>
File: 1750255481991735.png (1.32 MB, 856x1216)
DAMN. AI is pretty cool after all.
>>
>>106759708
not at all, it's literally one single prompt and you have the full 1 minute video, are you baiting or something?
>>
File: file.png (87 KB, 1323x836)
>>106759708
You really should read your own papers.
>>
>>106759708
>if I say it sarcastically that means it's a bad thing!
You need to go back.
>>
>>106759649
That's not a view from her back though.
>>
>>106759727
>but you can just look at the JAV
the idea is to test functionality of the model.
>>
>>106759732
>>106759731
woo that aged poorly

>>106759736
No, it's funny because it doesn't make the video all by itself. You can literally do their paper at home with a Tom and Jerry LoRA for Wan.
- use a LLM and Qwen create a series of storyboards along with a video prompt
- animate the storyboard with Wan with the storyboard image and video prompt
- stitch together
>>
>>106759492
>SaaS is honestly insanely powerful

Only ClosedAI is. Not a single Chink model has ever held a candle to Dalle, nor 4o, and now this... They never will hold a candle to this.
>>
>>106759757
>woo that aged poorly
?? what are you talking about, it's still one single prompt with multiple descriptions, can you read? >>106759732
>>
>>106759775
>Only ClosedAI is.
Google has veo 3 remember
>Not a single Chink model has ever held a candle to Dalle, nor 4o, and now this...
Seedream is great though, still the best image model for realistic scenes
>>
Smells like cloudcucks in here.
>>
>>106759757
Let's say that's how it works, why is that a bad thing? Explain in plain english, don't just sarcastically describe the process like it's obvious why we should be predisposed against it
>>
File: 1729181907036598.png (981 KB, 1024x1024)
one more, a test of a random anime pic (Haruhi):

even got the back of the uniform right.
>>
>>106759778
>one single prompt is actually a JSON input with multiple prompts
Why do you insist on being this obtuse? It's okay to admit you're wrong. I get it, AI is like magic to you so you made fantastical assumptions about the state of the tech and you're upset I'm bringing you back to Earth. You wanted to imply they just typed in "Tom and Jerry in the office" and not actually a full series of prompts and storyboards fed into Cog.
>>
File: 1757992628634973.png (1.06 MB, 1024x1024)
>>106759805
the character is sitting at one of the desks. her hands are on the desk.

and it works. 20-30 seconds, and more effective than spending more time with openpose to get this type of repose.
>>
>>106759806
>nooo, why can't it make a movie if I provide 3 words max
a prompt is a prompt anon, you just don't like it when it's long, but you're moving the goalpost. you said that it couldn't make a 1 minute video with a single prompt, now your argument is "I want that prompt to be short". it's all right, just say you were wrong and you have trouble admitting it because your ego is too fragile to do that, and we're good, deal?
>>
>>106759803
This entire conversation is about people overexaggerating the capabilities of video AI and grossly misrepresenting the current state of AI and the requirements to achieve a human-free AI pipeline. There is NOTHING wrong with what I just said for making a 60 second AI video; my contention is that the human effort to make that clip is non-zero. Yes anon, you can make a video right now only using AI with existing models and tools. But not without significant care and effort.
>>
>>106759790
Looking at their raw capabilities, Seedream is not even 1% as good as 4o. And kek, no, that slopped crap is not realistic.
>>
local tooling
you just cant beat it
>>
>>106759840
>Seedream is not even 1% as good as 4o.
you're talking about the piss filter model? ahah good bait, 7/10
>>
File: file.png (27 KB, 635x460)
>>106759828
>You wanted to imply they just typed in "Tom and Jerry in the office" and not actually a full series of prompts and storyboards fed into Cog.

When you have two text prompts in an array, it's called "prompts".
>>
>>106759851
>You wanted to imply they just typed in "Tom and Jerry in the office"
I never implied that, you're talking out of your ass again
>>
>>106759851
it's still a 1 minute video from just pure text; I don't get what you're complaining about. Now you don't want to write text to get that video? You want the model to guess what's in your mind or something?
>>
>>106759848
The piss filter is them not even trying though. Probably some really bad fingerprint/censorship. Their video model is what happens when they try. What do you think they will give us next for text to image? It will be over. Also, 4o is still the best thing for text and overall concept knowledge. It's not even close.
>>
where did all the competent anons go?
>>
I've been thinking about how in the future movie studios and game studios will just make their own model and train it on a per movie/game basis
I know it's not a very complex thought but I felt like in the context of people expecting generalist AIs to make competent movie scenes it's kind of relevant. Like this wouldn't be what the studios use anyway.
>>
ComfyUI is such a wonderful program. If only it worked.
>>
>>106759877
>What do you think they will give us next for text to image? It will be over.
I don't talk in the wishful thinking language, they still haven't provided anything good, so it doesn't exist
>>
>>106759511
>last updated
>4 months ago

sigh, only tencent and nvidia can save us now
>>
>>106759894
>I don't talk in the wishful thinking language, they still haven't provided anything good, so it doesn't exist

>Sora 2 exists
>Dalle 3 still exists
>4o is just a teaser for what they will give us next...

Anon don't be so naive..
>>
>>106759903
>>Dalle 3 still exists
this shit is ultraslopped, can't believe you decided to bring this shit to the table
>>
File: 00086-534935041.png (2.8 MB, 1240x1240)
>Eat lunch with GF
>Still doing it
>>
>>106759885
Realistically, in video production AI is most useful for inbetweening and neural rendering. Maybe some storyboarding and animatics help. Just like AI is useful today in programming and writing. Ultimately AI is a solver of the blank canvas problem, but the heavy lifting is the human judging, iterating and controlling the end result.
>>
>>106759913
The model is still them not trying and yet it's SOTA in concept knowledge.
>>
File: 1744788858507866.png (1.03 MB, 1024x1024)
the girl is holding a baseball and is wearing a Dodgers baseball uniform.

hmm qwen knows the doyers.
>>
>>106759947
>t-they didn't try on Dalle3 and 4o, but next time they'll try! Trust the Sama
I won't
>>
>>106759885
They will train LoRA vWhatever, sure, but they will not be training new base models on a per-project basis. The economics make no sense.
>>
>>106759959
When they released Sora 1, they also were clearly not trying and noticeably holding back. Did you really think Sora 2 just came as a natural upgrade to Sora 1? It's several generations ahead of Sora 1, which itself was behind every Chink model that was being put out at the time (even Veo 3)...
>>
>>106759971
I just think that Sam Altman is a weird motherfucker. For his image model he added a piss filter because he was terrified of deepfakes, and now he made Sora 2 and the cameo shit that lets every random make a video selfie of themselves and share it for everyone to see. He's so inconsistent lol
>>
File: 00095-2394186116.png (2.02 MB, 1240x1240)
I wish I was more interested in irl style gens but I'm not, really liking these outputs
>>
>>106759971
Sora 2 is just a natural progression of the audio-to-video tech, I don't think it's that far ahead of the curve. Isn't Wan 2.5 basically on par with Sora 2? The only thing OpenAI has going for it is that they obviously have the best dataset, but that's ultimately beaten by LoRAs, as specialization beats generalization.
>>
Qwen Edit seems worse at t2i than plain old Qwen Image. Not significantly, just seems jankier with slightly worse prompt adherence, but maybe that's my imagination.
Are they going to update Qwen Image or are they going to focus on regular updates to Qwen Edit from now on?
>>
>>106759979
Or Sam is well aware that open source is catching up, and his time is almost up. He either puts up a good model or gets replaced.
>>
>>106759979
The man marketed his AI assistant with a movie about falling in love with your AI assistant, and then proceeded to kneecap the human model as much as possible as often as possible and virtue signal about how moral he is for not creating an Ani-equivalent. He has no plan or vision, he just does whatever and assumes he is entitled to AGI by divine right of kings.
>>
File: 1756656700051655.png (1.03 MB, 1024x1024)
>>106759949
the girl is holding a baseball bat and is wearing a New York Yankees baseball uniform.

anyways you get the idea. it's an amazing tool for edits that can be used with either noob/illustrious gens or realistic gens/photos.
>>
>>106759991
>Isn't Wan 2.5 basically on par with Sora 2
not even close, the sound is terrible, the images are kinda slopped, the movements have a lot of glitches and it's far from having all the characters/styles concepts of Sora
>>
>>106759993
>Qwen Edit seems worse at t2i than plain old Qwen Image. Not significantly, just seems jankier with slightly worse prompt adherence, but maybe that's my imagination.
it is, because it's not the goal of QIE to be a t2i model; they sacrificed that aspect so that it can be good only at editing
>>
>>106760011
Sora 2 sounds like they're chewing gravel.

https://www.youtube.com/watch?v=yvD8TxNsR4Q
It sounds like you're full of shit honestly.
>>
>>106759971
yeah, I never believed OpenAI would reach Veo 3's level this soon, they surprised me on that one, I should stop underestimating them lol
>>
>>106760028
>sounds like they're chewing gravel.
you're literally describing Wan 2.5, what the fuck are you talking about?
https://youtu.be/yvD8TxNsR4Q?t=126
>>
>>106760011
It doesn't have any real people other than Sam Altman though so
>>
>>106760018
I vastly prefer QIE, and if you do a LoRA you can enhance its text-to-image capabilities, making QI redundant or not worth switching between them. I personally think QIE is better because the model has a better 3D understanding of the generations, which is hard to put into words.
>>
>>106760038
>It doesn't have any real people other than Sam Altman though so
Wan can only do Trump accurately lol
>>
>>106760037
There are Sora clips in this fucking thread that sound like a wav file from 2001. But it's okay, you're the resident OpenAI shill.
>>
>>106760050
>A Wan 2.5 (API only) shill complaining about other API shills
kek, this is funni ngl
>>
>>106760011
ClosedAI understands that in order for a model to be good, it needs to understand as much as possible and naturally communicate with the world. Chinks don't understand this, they just want to benchmax, fit in as many generic concepts as they can, and have zero vision.
>>
File: 1746823423338828.png (981 KB, 1024x1024)
>>106760002
remove the girls outfit and replace it with a two piece bikini with the Dodgers baseball logo on it.

qwen image edit is so smart (for detecting stuff and edits)
>>
>>106760047
I think we all know the real use case of Wan and it's nothing Sora will ever do without getting your account banned.

But please, give me a woman in a bikini in a hot tub doing ASMR for her 18+ Twitch stream.
>>
>>106760047
With i2v it can do any real person. Can't say the same for Sora unfortunately.
>>
>>106760058
that's a chinese alibaba shill
>>
>>106760028
Can Wan 2.5 do songs?
https://files.catbox.moe/1c3h2s.mp4
>>
>>106760070
> do
do what
>>
>>106760077
>guys there's no gravel I swear in this 12kbps audio clip
>>
the localcope is real
>>
File: 1759022990171238.png (1000 KB, 1024x1024)
>>106760064
remove the girls outfit and replace it with a taliban terrorist outfit.

kek
>>
calling it now

>wan 3.0 - 2026, slaps sora 2 veo 3 and the rest
>local - 12 secs, 18B, 720p, 30fps
>ayy pee eye - 30 secs, 39B, 1080p, 60fps
>>
>>106760086
I never said the sound is perfect, but somehow you can't stop ignoring context: that sound quality is on par with Veo 3 and better than Wan 2.5. That's the best sound quality we get at the moment
>>
>>106760058
With Sora 2 it's inevitable Wan is going to release locally lmao because the economic damage to OpenAI is too juicy.
>>
saar you wish to make video? please only sama it is not safe for others. please saar.
>>
>>106760111
that's only if localkucks beg hard enough
>>
>>106760077
>https://files.catbox.moe/1c3h2s.mp4
THIS IS SO CUTE WTF
>>
File: 1735810645415272.png (1.13 MB, 1024x1024)
>>106760098
remove the girls outfit and replace it with a Japanese samurai outfit.
>>
>>106760113
> is going to release locally
> 228B
>>
>>106760112
I don't care to verify capabilities because it misses the point. Given audio2video exists, why would I want a really shitty sounding song when I can use a better AI to make a good song and then use that to generate a video?
>>
https://files.catbox.moe/nxirwk.mp4
that's a weird demon souls mod lol
>>
File: 1736751045400578.png (971 KB, 1024x1024)
>>106760126
the girl is wearing a large teddy bear costume.

cute!
>>
>>106760127
Wan 2.5 is not going to have exponentially more parameters than Wan 2.2. It's just like QIE, it's just a model that has longer context and has better input options (e.g. an audio stream).
>>
>>106760133
>why would I want a really shitty sounding song when I can use a better AI to make a good song and then use that to generate a video?
that's too much of a pain, I just want to write text and get a video with sound in the output. It's way more fun that way; having to provide audio you already know removes the surprise magic
>>
>>106760152
>too much of a pain
It's a literal pain to listen to that clip, it's so low quality. And we're talking about programming here; it's literally why things like ComfyUI exist, that's the whole fucking point of using node workflows.
>>
i dont want audio in any local video model unless there is an option to completely disable it to increase generation speed. we already have 5b wasted parameters in every local model dedicated solely to generating text on signs thanks to emad.
>>
>>106760077
ngl the song is quite catchy
>>
>>106760113
>the economic damage to OpenAI is too juicy
Oh, fuck. The entire western economy is currently hanging almost exclusively on delusional valuations of nvidia. All China has to do is keep undercutting OpenAI's releases, or produce cheaper chips with some CUDA equivalent, and the entire thing will come crashing down. And OpenAI is easy to undercut due to the wastefulness of their models (4o vs DeepSeek) and safetyism, and the nvidia embargo means new chips are a matter of time.

This is all going to go really, really bad, isn't it?
>>
With a local Sora 2, both porn and anime would be solved. A shame we will never have that.
>>
>>106760163
like I said, I don't want to provide my own audio, I want the model to make everything by itself. I agree that at the moment the sound quality isn't great, but let's not pretend they won't improve on that, they will
>>
>>106760179
Too unsafe, goy.
>>
>>106760183
>like I said, I don't want to provide my own audio
okay you're too stupid to understand what I said
>>
>>106760187
nigga that's gay
>>
>>106760179
Jesus all you people do is complain
>>
>>106760187
Oh no. The providers aren't going to like this one.
>>
https://files.catbox.moe/p8zyu7.mp4
>>
>>106760192
I accept your concession.
>>
File: 00114-2792768966.png (3.2 MB, 1240x1240)
I'm going to aim for the VHS recording look next
>>
>>106760167
Shouldn't they be separate, dedicated models that communicate anyway? Like, I want to be able to add arbitrary foley synched to motions, not produce whatever sound the video model thinks is realistic. Like, I should be able to gen an image of a big tiddy asian girl bouncing, and then subsequently prompt "sound of milk jugs being swished around" as a separate process.
>>
>>106760205
>Sora 2 won't allow you to make edgy jok-ACK
>>
>>106760213
1) specialized models are better than general models
2) a specialized audio model will always beat a shitty video model wasting parameters on a shitty audio generator
3) functionally using a separate AI model to generate an audio clip and then feeding it to a video model is exactly the same to a retard like you
>>
>>106760167
>i dont want audio in any local video model
I do, it's way funnier with sound
>unless there is an option to completely disable it to increase generation speed.
fair, at least everyone is happy if there's an option to disable it
>>
File: 1750178826663197.png (1.03 MB, 1024x1024)
>>106760148
the girl is dressed in a business suit with a red tie and is holding a black rifle and firing it towards the camera. she is smoking a cigar.
>>
>>106760233
>specialized models are better than general models
This incorrect assumption set back AI by 10 years btw
>>
>>106760233
>they can't do it, it's too hard!!
that's why they're engineers at OpenAI and you aren't, they have ambition and you don't
>>
btw normies dont know about sora 2 theyre all distracted by zucks thing
>>
File: 00126-2373027009.png (2.79 MB, 1240x1240)
>>
>>106760251
>theyre all distracted by zucks thing
wait what? did something happen recently with zucc?
>>
>>106760228
Who are you quoting?
>>
>>106760281
debo
>>
>>106760241
No, it’s still correct. AI is fundamentally a statistical solver over data distributions. Diluting that distribution across modalities increases variance and slows convergence. The fix is exponentially more parameters, but that brings diminishing returns: more compute, longer training per step, and far more steps to converge. That’s why specialized models outperform general ones: drastically smaller, yet higher quality in their domain. But I understand why people who own the datacenters want you to believe that's not true, they want every model impossible to run locally even if it's grossly inefficient. Literacy is dangerous in the hands of the peasants after all.
>>
File: 00000-1390880363.png (728 KB, 1024x1280)
>>
The brain is a generalist that coordinates between specialists. The purpose of specialised neural anatomy is to solve specific problems. You need both, and you need both to be in communication.
>>
>>106760281
>Who are you quoting?
the guys who said you can't do edgy jokes, you need to lurk the previous thread to see that
>>106751481
>anything exciting would have not passed the censorship anyway
>>
>>106760319
oh my bad i must have lost it in all the spam and totally reasonable criticisms
>>
>>106760293
that's it guys, we pack it up, that random anon said it can't be done so it must be true
>>
>>106760327
it's all right, that's why I'm here to point it out. I know how to navigate the noise; it's a skill not a lot of people have
>>
>Exciting = Edgy
They need to find smarter jeets
>>
File: 00001-1296422259.png (1.55 MB, 1024x1280)
>>
>>106760334
>>Exciting = Edgy
we're on 4chan, we only get excitement through edgy jokes, DUH!
>>
File: 00136-1668900265.png (2.99 MB, 1240x1240)
>>
>>106760347
>>
>>106760334
And I'm still waiting for Sora 2 to do that one though
>>106751628
>maybe if the Jew dog was dancing while the twin towers were falling I'd chuckle but we both know it's going to be censored
>>
>>106760315
The brain isn’t a single generalist blob, it’s a federation of specialized modules. Visual cortex, auditory cortex, motor cortex, language centers, etc. all evolved for domain-specific processing. Coordination doesn’t erase specialization; it depends on it. And by the way, the brain has trillions of synapses: orders of magnitude beyond any AI model. If you think that comparison justifies wasting parameters on unfocused multimodal models, you’re proving my point: specialization is what actually makes the system efficient. The irony is our brain is more akin to a MoE model with specialized domains all of which filter and prepare inputs for the "generalist" model. Do you think your brain processes the raw auditory data?
>>
>>106759949
>>106760002
>>106760064
>>106760098
>>106760126
>>106760148
>>106760237
The results are quite amazing, are you using the default comfy workflow?
Also how does it handle LoRAs with concepts it doesn't understand?
>>
>>106762189
default comfy with the 8 step qwen edit lightning v2.0 lora (not edit v1), 8 steps, with qwen edit 2509, Q8 version.

it works with loras too, just chain it to the lightning lora (or use them by themselves).
>>
>>106762211
Noted, time to download it and play with it.



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.