/g/ - Technology


Thread archived.
You cannot reply anymore.




Looks Like We Won Edition

Discussion of Free and Open Source Text-to-Image/Video Models and UI

Prev: >>106743839

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/sd-scripts/tree/sd3
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts
https://github.com/tdrussell/diffusion-pipe

>WanX
https://comfyanonymous.github.io/ComfyUI_examples/wan22/
https://github.com/Wan-Video

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
Training: https://rentry.org/mvu52t46

>Neta Lumina
https://huggingface.co/neta-art/Neta-Lumina
https://civitai.com/models/1790792?modelVersionId=2203741
https://neta-lumina-style.tz03.xyz/

>Illustrious
1girl and Beyond: https://rentry.org/comfyui_guide_1girl
Tag Explorer: https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbours
>>>/aco/csdg
>>>/b/degen
>>>/b/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
>>
first for /sdg/
>>
>>106748252
yeah you basically force it to see only one thing you want it to do, focusing it
>>
theres a really good wan 2.2 penis lora that just dropped on civitai
>>
comfy lied
local died
outputs fried
trani cried
wan api’d
>>
>>106748266
Massive victory for API nodes today lads.
>>
>>106748283
When can we see the mutilated female circumcism lora, that's what everyone is waiting for.
>>
>>106748283
The "taz" one? I'll get it, hopefully it's not as bad as some of the badly trained others changing the face of the women.
>>
long video waiting room

https://huggingface.co/Efficient-Large-Model/LongLive-1.3B

>Long Video Gen: LongLive supports up to 240s video generation, with visual consistency.
>Real-time Inference: LongLive supports 20.7 FPS generation speed on a single H100 GPU, and 24.8 FPS with FP8 quantization with marginal quality loss.
>Efficient Fine-tuning: LongLive extends a short-clip model to minute-long generation in 32 H100 GPU-days.

https://github.com/TencentARC/RollingForcing

>Rolling Forcing performs real-time streaming text-to-video generation at 16 fps on a single GPU and is capable of producing multi-minute-long videos with minimal error accumulation.
>>
localsissies BTFO
https://www.youtube.com/watch?v=gzneGhpXwjU
>>
File: ComfyUI_00947_.png (1.41 MB, 1024x1024)
AMD GPU support on windows:
https://github.com/comfyanonymous/ComfyUI/discussions/10116
>>
File: 1728073027711065.gif (305 KB, 220x301)
>>106748319
>long video
fuck yeah!
>1.3b
oh...
>>
>>106748324
cool, my 7900xt can finally be useful
>>
>>106748323
>>106748295
>>
what's with the black horizontal stripes that sometimes appear on chroma gens? I'm using dpm2pp and beta as I have achieved the best results with those (of the combos that I've tried anyway) and good performance too.
>>
Blessed thread of frenship
>>
>>106748323
>sound
looks like Veo 3 finally got some serious rivals, I was wondering when OpenAI will finally wake the fuck up
>>
>>106748324
Does this work with api nodes?
>>
File: 1745514214200638.png (293 KB, 395x689)
>>106748323
IT SUCKS AT ANIME
>>
>>106748319
Hopefully they'll do something with the 14B model, otherwise it's not very interesting in a practical sense, even if the idea is really cool, actually directing the video.
>>
>>106748323
Does this work with api nodes?
>>
>>106748342
>black horizontal stripes
?
catbox an example
>>
>looks like we (local) won edition
>immediate API fag cope
nb4 hurr durr API is local
>>
>AI slop only social media by OpenAI

who is the target audience for this?
>>
>>106748372
ComfyUI users
>>
How do you prompt wan2.2 so that it obeys a series of events?

Do you :
- make lists?
- numbered lists?
- just make "and then" sentences?
- ?

Is there a way that works?
>>
File: 1756992813678447.png (1.53 MB, 2066x991)
>>106748323
>Sam is not here
nothingburger
>>
>>106748323
>>106748295
It's funny the more simplified and better the models get the more normies hate it.
I'll miss the day they hired me for adding "I like to play around with AI" on my CV
>>
I don't understand trani's and nigbo's forced narrative at all
>have free software
>can gen comfortably offline
>get optional API nodes for normies
>NOOOO COMFY IS DED SAARS HES A SAAS SHILL NOW AND DED THE CRASHING SD.CPP WRAPPER IS OUR ONLY HOPE OMG
did i summarize that correctly? Because that makes no sense at all
>>
>>106748372
>AI slop only social media
This sounds like a response to what Meta announced.
And it's still baffling to me anyone thought it was a good idea.
>>
>>106748328
>>106748353
Got to give it to them, at least the nvidia one (LongLive) released something. But yeah, hopefully they do release the 14b model. This is all I ever wanted wan to have.
>>
>>106748323
>omg guys we added anime filters on our app
WHO CARES
>>
File: .png (207 KB, 279x540)
>>106748323
>>
File: ComfyUI_39725_.png (1.97 MB, 1120x1440)
>>106748342
thats just a feature of the model, anon
take it or leave it
(or lower your res)
>>
File: dmmg_0035.png (1.46 MB, 832x1216)
>>106748319
incredible starting point. if this thing drops and works we're just months away from some wild local stuff
>>
>>106748399
This! I’m just glad ComfyUI still supports legacy local models. API can get Wan2.5 if it means more funding for UI improvements.
>>
>>106748323
>looks like upscaled 720p generation thats worse for details than wan 2.2 and more plastic
holy shit closedai is cooked lmao

the only thing thats ok is background sounds generations which are good enough with video for some things, voice is still synthetic unlike even a lot of FOSS alternatives

trash
>>
>>106748414
openai features "safe physics", where women have skeleton boobs using ps1 physics engines
>>
>>106748429
>worse for details than wan 2.2 and more plastic
HOLY LOCALSISSY COPE
>>
>video featuring big boobs girl jumping on skateboard
>Slow motion
>Lacks jiggle*
trash
>>106748433
Local will always win
>>
>>106748429
>holy shit closedai is cooked lmao
that's why they're putting a lot of emphasis about those meme filters, they know it's a bad model so they're trying to change your attention on something no one gives a fuck
>>
>>106748380
Just describe it normally like a white man would
>>
Local quite literally lost wan just last week lmao
>>
>>106748266
qwen miku tip: add "add a 01 to her arm in red text", because for some reason she defaults to lines.
>>
>>106748399
Ani is smarter then debo do t without Ani's help he poorly freestyles it and fails horribly. You want to hurt him?
>ignore his thread leave it dead
>post handicap signs whenever he post
Ani is not involved in today's activities debo has been seething about his own thread being dead for days. Ani takes better care when he trolls debo crashes out because he knows everyone hates him already.
>>
File: 1754433108192662.png (613 KB, 694x1238)
>>106748323
that one is quite impressive, do they use a real video from the us open and they insert their character in there?
>>
Can we all just calm down and buy some api tokens? Sora v2 is lookin sweet!
>>
>>106748453
You “anyone I don’t like is this one schizo” schizos are fuckin annoying man, damn, no one fucking cares shut up go away.
>>
He loves to make false reports so don't feel bad giving debo valid ones
>off topic
>instigating a flame war
I'm not announcing anything just giving a recommendation for thread health. We follow rules that he has never done himself.
>>
>muh ecelebs, muh troons
jesus fuck, this general is absolute cancer
>>
>>106748453
>ignore his thread leave it dead
Ok then so-
>but also post handicap signs whenever he post
That sounds a bit counter productive.
>>
>>106748441
It's funny how they have the talent to make incredibly good looking stuff, but the second it involves bouncing female bits, or femininity, suddenly we are back to generations behind.
Though the ultimate experts on this are BFL.
>>
>>106748469
I think it's generated from scratch
>>
>>106748453
>>106748488
just filter "ani" and "debo" on 4chanX and you won't hear about those schizos anymore
>>
>>106748495
In our thread it triggers him
>>106748488
>whenever he's gone his thread sinks to page 10 and this thread is peaceful
>surely it's not him
Use your fucking head this has been a thing for years
>>
>>106748501
I have a hard time believing it, the text is flawless
>>
File: 1743736333849340.png (3.62 MB, 2390x1364)
>>106748323
bruh they trained their shit on hibike euphonium there's no way lmao
>>
GPU Buyer:
>NOOOOO I SPENT $1000 ON A 3090 WHAT DO YOU MEAN IT CANT RUN HUNYUAN80B?!?!
API Token Buyer
>Its like crypto but the value only goes up! API models keep getting cheaper, 500 tokens today is worth 5000 tomorrow! i can run all the top-tier models on the fastest enterprise hardware backed by the power of Comfy workflows
>>
>>106748496
it kind of requires talent to suppress bouncing breasts from videos, like you have to do it on purpose
>>
>>106748517
curated examples
>>
File: Chrowan_00014_.jpg (1.37 MB, 2688x4032)
Man, I'd love Chroma so much if it didn't shit the bed sometimes.
>>
>>106748529
>Strawman
>"It's like crypto"
Not buying your onlyfans
>>
File: 1749698037729750.png (879 KB, 1176x880)
no depth map:

replace the girl on the right with the yellow hair girl in image3. (just to swap)
>>
File: 1734831705104723.png (804 KB, 1184x880)
>>106748559
and then with this image and the depth map of the original:

the yellow hair anime girl is leaning against the wall on the right using the pose of image2.

neat
>>
>>106748559
what about :
replace the girl on the right with the yellow hair girl in image3. keep the pose from image1
>>
>>106748401
It's the ultimate AI product.
Maximizing user time on platform through endless slop.
>>
>>106748548
looks good, catbox?
>>
>>106748591
yeah but I don't see people using that, it's gonna flop really hard once the novelty effect wears off
>>
File: 1734758125493116.png (822 KB, 1176x880)
replace the girl on the right with the woman wearing white in image3.

instead of motoko I got asuka dressed as motoko, still a cool result.
>>
>>106748324
What did you spend the 17 million dollars on?
>>
File: .png (883 KB, 1295x708)
>>106748352
Ok nevermind this is actually cool
>>
>>106748591
What >>106748602 wrote, people won't use it, already AI shorts on tiktoks/youtube are usually way less popular than non AI ones.
>>
>>106748625
On localizing your address.
>>
>>106748627
it's literally hibike euphonium
https://youtu.be/TplA1k6GWDI?t=324
>>
>>106748625
Buying out Wan2.5 API exclusivity
Seedream 4.0 API implementation
Wan2.5/Seedream meetup
Sora 2.0 day-one availability
Sabotaging Chroma-hd to promote The Chroma Awards
>>
File: 1758640961229294.png (474 KB, 431x945)
>"our model is great at physics"
>not flowing out of the tap but out of his mouth
>tap levitating in the air
>nonsensical body movements that only models from a year ago had
lmao, what the fuck were openai doing all these months despite having sora 1 all that time ago? this is just brutal
>>
File: 1757723173873401.png (385 KB, 1217x290)
>>106748724
>>106748517
>the text is flawless
lawl
>>
Can qwen edit text from a photo of printed document or is it too much for it?
>>
Enjoy your bill, api fags. I'll be over here generating on my solar powered 5070.
>>
ComfyUI is just AniStudio for people who can't handle local freedom. Fight me.
>>
File: 1547211305663.png (179 KB, 463x492)
>>106748743
Also is there a way for Qwen Edit to use the mask drawn on the input image?
>>
>>106748743
yes, you can fake documents and IDs with qwen
>>
>106748835
What a perfectly natural reaction to post answering a question and pointing anons to material in the OP that tells them to ignore a well documented schizo
>>
Video sisters, some food

>Kandinsky 5.0 T2V Lite is a lightweight video generation model (2B parameters) that ranks #1 among open-source models in its class. It outperforms larger Wan models (5B and 14B) and offers the best understanding of Russian concepts in the open-source ecosystem.

https://github.com/ai-forever/Kandinsky-5
>>
>>106748883
>offers the best understanding of Russian concepts in the open-source ecosystem.
Can it do wide putin memes?
>>
>>106748883
>releasing a local model when Wan exists
can't tell if this is courage or retardation
>>
File: file.png (1 KB, 383x74)
>>106748883
>>
>>106748324
oh nice
>>
>>106748323
https://xcancel.com/GabrielPeterss4/status/1973071380842229781#m
this is actually pretty good
>>
>>106748324
Wooo
Total comfy dominance
Based
>>
File: file.png (21 KB, 629x158)
these people smell their own farts and love it, don't they?
a normal human does not talk like this, wtf
>>
Does this work with api nodes?
>>
File: 1733599995614066.png (22 KB, 349x206)
openaisissies... not like this
>>
Is there a retard proof guide for wan animate?
>>
>>106748324
Why is your underage foxgirl so dressed up? If locals are for pedos
Is it so your shareholders don't think badly of your UI?Coward
>>
File: 1733375905682966.png (721 KB, 928x1120)
gentlemen
>>
Comfy isn’t local though, it’s API
>>
>>106748981
that's a robot.
>>
He loves reporting people and all this api shilling is off topic.
Want to stop it?
You know what to do
>>
>>106749089
>You know what to do
move to superior /sdg/?
>>
Trying to do my first wan 2.2 lora. Any rule of thumb on epochs? I've got 21 5 second clips. Really annoying that you have to juggle two separate loras now, was thinking 200 epochs each?
>>
File: file.png (91 KB, 242x242)
never converge even when they tell you to
>>
>>106749180
I still believe
>>
File: 1727727706761367.png (585 KB, 1112x936)
replace the man on the left with Miku Hatsune.

with an openpose of the same image as image2 source.
>>
File: 1734089130720266.png (618 KB, 1112x936)
>>106749193
without openpose source: close, but the size is closer to the original and the eyes are more focused, because you have the openpose model from the controlnet node (aio aux preprocessor).
>>
File: Chrowan_00016_.jpg (805 KB, 2016x3024)
I want to play some dumb video games, but I can't stop messing around with Chroma. These horizontal/vertical line artifacts when genning above 1MP on some images are really annoying though.
And the fact that it still likes to mess up hands and anatomy.
Any Anons with some pro-tips for any of that?
>>
>>106748323
>12:25 kickflip
The long video clip with the skateboard is laughable next to this.
>>
>>106749237
Lora helps fight it but you're kind of fucked, it's a feature of the model. I was down this road for 3 weeks and while I learned stuff this model is fundamentally fucked.
>>
File: 1740384145410693.png (652 KB, 1112x936)
>>106749214
but if you just say replace the character on the left with image2, that works too.

and what makes qwen edit great is you can use it with your anime or realistic gens too, to make easy changes/manipulations.
>>
>>106749265
I really wish he coordinated with Ostris because he made a superior pruned Flux model that doesn't have weird speed problems that Chroma has.
>>
>>106749237
base gen on lower res + upscale are better
>>
File: 1740745699251189.png (591 KB, 1112x936)
also note qwen image lightning 8 steps v2 is better overall than the qwen edit v1 lora, it just works better in general. since the new version (2509) the v2 lora works great with 8 steps, over the edit one with 8 steps.

done testing for now, maybe they will release wan 2.5 "soon".
>>
>>106749281
With that said, I believe Ostris is pruning Qwen Image which will be a better layman's training starter because it doesn't have built-in lobotomies and training countermeasures like Flux does. When I get my Spark I'll probably run a nsfw finetune on one of the large models even if it's super slow.
>>
It’s released on ComfyAPI right now
>>
>>106749281
He decided to freestyle everything while being smug about it. Now he has to come to the realization he did something fucking retarded and stop wasting resources playing with a broken foundation.
>>106749296
You actually get better cohesion with anatomy and composition at higher resolutions but you can still get those lines on the high res pass.
>>
>>106749328
Sunk cost fallacy is a hell of a drug
>>
File: ape pee eye.jpg (6 KB, 533x32)
wansisters....i-i dont feel so good
>>
>>106749336
It really fucking is and anons warned him around epoch 20 and he still kept doing it. I don't know what he expects from radiance, it's a schizo config still plagued by the fundamental issues of chroma.
All he had to do is
>Keep the resolution at 1024,
Seeing how he can still finetune models what the fuck was the point of down scaling everything especially since his late training cope completely failed.
>don't block tokens
He decided to link up with the pony retard and somehow he ended up having all the worst aspects of pony but WORSE
>Listen to people
Many people including anon told him he was fucking up but nope just that gay emoji of his fursona with it's tongue out
>>
>>106749148
Well I did 400 epochs, seems like a lot anyway.
>>
>>106749365
Radiance is dumb but it's par for the course for him. The problem isn't like SDXL VAE where the VAE is actively sabotaging training by being incapable of reproducing certain details. Flux VAE is 99.5% accurate and the benefit is less data for the model to process per step, which means faster steps and thus faster convergence. It's funny because they ran the wrong direction when they realized Flux was improperly architected. Yes, they designed the model so that the VAE is effectively processed at pixel resolution, which means instead of doing Radiance, he should've retrained the first layers to efficiently use the VAE at its trained data resolution, which would be a huge gain on throughput.
>>
>>106749148
There is no set rule for epochs, it depends on the objective and the model's prior understanding of the training data.
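
For a rough sense of scale, epochs can at least be converted into optimizer steps. This is a minimal sketch, assuming batch size 1, no dataset repeats, and no gradient accumulation (none of which are stated in the posts above). `total_steps` is a hypothetical helper, not part of any trainer:

```python
import math

def total_steps(num_clips: int, epochs: int, batch_size: int = 1, repeats: int = 1) -> int:
    """Optimizer steps for one run, assuming one step per batch (no gradient accumulation)."""
    steps_per_epoch = math.ceil(num_clips * repeats / batch_size)
    return steps_per_epoch * epochs

# 21 clips, as in the question; batch_size=1 and repeats=1 are assumptions.
for epochs in (200, 400):
    print(epochs, "epochs ->", total_steps(21, epochs), "steps per LoRA")
```

Under those assumptions 200 epochs is 4200 steps and 400 epochs is 8400 steps, for each of the two LoRAs. Whether that is too many still depends on the learning rate and the data.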
>>
>>106749454
You can tell him that, but expect that retarded emoji followed by a failure of a model.
He's just burning money to fucking burn it at this point
>>
>>106749526
We'll see if someone cares enough to invest time fixing the model for him, but honestly it's pretty cooked and I think realistically what we should do as a community is set up a proper tiny relevant concept dataset that can be used to train base models without wasting compute on useless images/captions.
>>
>>106749562
Sounds good to me.
I think he's going to be in full Hitler in the bunker cope if he doesn't course correct by winter.
>>
Once again openai opens up a year+ lead over open sores with sora 2
>>
File: radiance.png (2.85 MB, 832x1488)
>>106749308
i think so too. but i only did a few dozen gens with it, so take my opinion with a grain of salt.

also it's still a noteworthy difference from not having the lightning lora. unsurprisingly. it's like 60% or something, you do notice the other 40%... even if the speedup is very nice.
>>
>>106749587
Is there a desktop version of it yet? I refuse to use my phone for AI, that sounds fucking gay
>>
File: radiance.png (2.3 MB, 832x1488)
>>
>>106749634
this is so bad in every single way
>>
>>106748574
fix her fingertips first CatJank
>>
File: cumra.jpg (83 KB, 1012x624)
>>106748324
>>
File: radiance.png (2.1 MB, 832x1488)
>>106748323
>localsissies BTFO
let me guess, this is still censored?
>>
>>106749652
>every single way
State five of these ways
>>
>>106749693
I wonder if the geeks who work at openai are able to access the truly open model or if they are so brain rotted by “safety” that they cuck themselves out of access
>>
>>106749626
Website's been updated. Only way androidpoors can use it.
>>
>>106749717
I don't wonder about them at all.
>>
>>106749717
>spending millions to train a model so our employee can generate porn
doubt
>>
>>106749697
fingers
arm
feet
style
taste
>>
>>106749020
but sora 2 is better than those 2 models no?
>>
>>106749693
Not being able to create a decent gen using openchink models is a form of censorship too
>>
>>106749739
Outside of background audio, no
>>
>>106749737
I’ll give you 3 out of 5. The style and taste seem fine.
>>
>>106749737
*eye
>>
File: 00357-57834769.png (2.35 MB, 1824x1248)
>>
File: radiance.png (2.99 MB, 832x1488)
>>106749652
quite fine with me

>>106749717
i have no idea. maybe they even filtered the training data so intensely by now the current truly open model also can't do much?
>>
File: 00366-1671142133.png (2.62 MB, 1248x1824)
>>
File: radiance.png (2.82 MB, 832x1488)
>>106749741
that seems like some desperate redefinition of what censorship is.

not saying chinese models have no censorship, but it does get quite ridiculous and extremely restrictive if you publish models even more censored
>>
what is a good nsfw implanting checkpoint that I can use to test some stuff
>>
>>106749906
bstaber
>>
>>106748548
Keep same seed. Take 1 or 2 more steps or change samplers. Fixes it most of the time.
>>
>>106749717
That was the case for dalle3, and even the captioning llm they used for their dataset was uncensored.
>>
>>106749935
great days, hopefully the vietnamese will do the same, chinks are incapable
>>
>>106749902
Sounds like cope to me. Depriving a man of a fishing rod is infinitely worse than telling him he can't fish a certain way.
>>
File: fairy2.png (1.41 MB, 864x1152)
>>
>>106749741
>is a form of censorship too
is a form of skill issue acktually
>>
>>106748323
Hailuo 2 beat them to everything they showed.
>>
File: ComfyUI_18867.png (3.1 MB, 1152x1728)
>>106748323
Was Sora actually popular? I don't recall seeing much of it around... at least nothing comes to mind.

>>106748324
But is this delivering shareholder value?
>>
>>106748323
>Everyone needs to give explicit permission to use their cameo

Wow, way to go. Now to wait for China to give us a proper implementation of this feature.
>>
whats the meta on lightning loras with wan2.2? anything change since those almost two giggerbyte big ones dont work?
>get literally perfect prompt adherence on my first run but forgot the loras are busted so my girl goes ghost
>>
>>106750087
don't care stop 1girling you intensely smelly brownoid
>>
>>106750087
new t2v that fixed most of the problems got released
we are still waiting for updated i2v
>>
>>106750100
>stop 1girling
1girling is the whole point of boughteting a 5090
>>
File: ComfyUI_01548_.jpg (1.4 MB, 1296x1728)
I heard something about brown 1girls?
>>
>>106750102
my fucking heart will fade from all the waitfagging if i have to waitfag an upteenth time more good gravy goodness glaciers

>>106750144
keep posting (and maybe if theres a solution ill be posting vid brown 1girls)
>>
>>106750150
>my fucking heart will fade from all the waitfagging if i have to waitfag an upteenth time more good gravy goodness glaciers
I know... but can't do shit about it sadly
>>
File: 40895273764724.webm (2.31 MB, 640x360)
>>106750031
It will never happen. China spends too much time benchmaxxing which is why local never truly catches up. Vidrel is Sora 2.
>>
>>106750166
>Vidrel is Sora 2.
give it vhs degradation, and it might as well be a tony hawks pro skater intro. woah.
>>
File: radiance.png (2.8 MB, 832x1488)
>>106749952
the SaaS censored garbage is doing both, screw them
>>
Why the fuck is some idiot uploading Chroma gens that look like SD 1.5 quality? He's making local models look bad!
>>
>>106750240
nigbo?
>>
>>106750166
it looks amateur, I like it, I think it's better than veo 3
>>
File: ComfyUI_01609_.png (3.24 MB, 1296x1728)
>>106750150
>keep posting (and maybe if theres a solution ill be posting vid brown 1girls)
You got it
>>
>>106750166
https://xcancel.com/WuxiaRocks/status/1935298213613027521#m

https://www.youtube.com/watch?v=5yI9wEys2dc&t=251s

Look at the date it was released... ClosedAI is just now catching up to China's old tech. Open sourced China and closed source are different beasts.
>>
File: 11122.png (1.96 MB, 1024x1024)
This is getting bad they need to fix this
>>
>>106750303
Hailuo 02 has no sounds, Sora 2 has
>>
>>106747195
>Works on everything
>ComfyUI implementation likely never (as always)
>>
I think Netta Yume will take the anime cake and then the furries will make it god tier once one that can follow basic instruction can finetune it with their booru
>>
it's funny how Scam Altman's demo is literally worse and more limited than wan video + talk models.
>>
File: 46942135.jpg (70 KB, 460x460)
>>
>>106750351
At this point I just expect random limitations like: it doesn't work with video models, it does only work in multiples of 87.24584 resolutions, it's not compatible with loras...

If it's not snake oil and it's as good as it looks, someone will make the finetunes and create the nodes.
>>
File: radiance.png (2.9 MB, 832x1488)
>>106750223
ty. radiance is generally looking pretty good too if you ask me.
>>
File: Chrowan_00022_.jpg (1.95 MB, 3328x5120)
>>106749265
What kind of LoRA?
>>106749296
Yeah, I tried but then anatomy/hands get even worse sometimes. It's really an annoying model at this point. And all the other replies I'm reading don't really inspire confidence.
Kinda shitty because it's so promising but also so bad at the same time.
>>
File: Sora 2.mp4 (2.25 MB, 1280x704)
https://xcancel.com/AnalyticsVidhya/status/1973123588140970028#m
impressive
>>
File: radiance.png (3.05 MB, 832x1488)
>>
File: 65363456.png (20 KB, 825x348)
ATTENTION! 2 NEW NEO FORGE DEV BRANCHES!!! "dev" and "mem" and they are more up to date than the "neo" branch!
>>
>>106750418
I fine-tuned a few loras but the inherent nature of chroma makes it outright overpower loras. Even an extra epoch that looks crisp most of the time will have random seeds where chroma will not follow it and give you a different style. The model is fundamentally fucked
>>106750452
What can I expect?
Would really like native netta yume support
>>
File: 1731621568814488.png (180 KB, 1746x576)
>>106747195
>>106750391
https://github.com/dc-ai-projects/DC-VideoGen

28 minutes -> 4 minutes
Impressive.
>>
File: catgirl-combat-maid.jpg (1.19 MB, 1256x2488)
>>106750427
Is this that thing Altman was talking about last week that was super powerful but expensive and only for their high paying customers?
>>
>>106750427
https://files.catbox.moe/4hwrs6.mp4
imagine the memes we would make if we had this shit locally
https://xcancel.com/ai_for_success/status/1973097111064289332#m
>>
>>106750472
hypeman
>>
>>106750471
whoa pretty numbers
>>
>>106750471
wait, they got better scores with the compression shit? how is that even possible?
>>
Remember that Haoming02 is too busy fixing bugs of Qwen, WAN, Chroma and Flux to compete with ComfyUI.
Remember that he hasn't updated anything for SDXL in over 7 months.
Remember that only Panchovix was the one dev who actually cared about updating SDXL slop mixes and he 44% off himself.
>>
can we appoint a resident goonmaster to keep a rentry up to date with the latest goon adjacent workflows (aptly named cumfyUI-x-x.json) and other materials/resources? Let's face it, if a model can't produce smut, it's functionally, literally, unironically worthless. If any model training scientists are currently visiting this general, you better make sure to put cock, balls, and vaginas in your training dataset!
>>
File: 1738610924212446.png (460 KB, 785x439)
>>106750506
>Under deep compression settings, causal video autoencoders suffer from low reconstruction quality. In contrast, non-causal video autoencoders achieve better reconstruction quality but generalize poorly to longer videos.
>>
>>106750523
>causal
>non-causal
what is that? sounds like a big deal, did they just find a way to make shit like 5x faster while keeping the quality? if it's true it's insane
>>
File: 4545215514.webm (3.47 MB, 1920x1080)
It's over for local.
>>
>>106750523
that's weird, the chink casual has a higher PSNR but the image looks better than the casual one
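
For context on why a PSNR number and the eyeball test can disagree: PSNR is a purely pixel-wise error metric, so it says nothing about how an image is perceived. A minimal sketch of the standard definition, 10 * log10(MAX^2 / MSE), on flat lists of 8-bit pixel values (the sample values are made up for illustration):

```python
import math

def psnr(reference, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(reconstructed):
        raise ValueError("inputs must have the same length")
    mse = sum((a - b) ** 2 for a, b in zip(reference, reconstructed)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs, no error at all
    return 10.0 * math.log10(max_value ** 2 / mse)

original = [0, 64, 128, 255]   # made-up pixel values
decoded = [10, 60, 120, 250]   # made-up reconstruction
print(round(psnr(original, decoded), 2), "dB")
```

A small uniform shift and a handful of ugly local artifacts can land on similar mean squared error, which is one reason papers usually report perceptual metrics next to PSNR.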
>>
File: radiance.png (3.01 MB, 832x1488)
>>106750427 >>106750479
not bad. but as a "recording of speech" it sounds quite a lot more artificial than the better TTS managed to do for a while now

and the simulation of where the audio is coming from as a simulation of a stereo mic mounted on the "camera" or w/e it'd be seems not that good either

pretty strong uncanny valley in terms of audio
>>
>>106750546
holycringe
>>
>>106750471
Wansisters, thinking we're back

>speed boost
>2 long vid gens

We are about to FEAST!
>>
>>106750523
That's a smart trade off.
>>106750545
Causal = this thing happened because that thing happened in the past.
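
In the autoencoder sense from the quote, "causal" usually means the output for frame t is computed only from frames up to t, while a non-causal one can also look at later frames. A toy sketch of that difference with a 1D filter over time (`temporal_filter` is a hypothetical helper for illustration, not code from the linked repo):

```python
def temporal_filter(frames, kernel, causal):
    """Apply a 1D filter along the time axis with zero padding.

    causal=True:  output[t] depends on frames[t-k+1 .. t] (past and present only).
    causal=False: output[t] depends on a window centered on t (needs future frames).
    """
    k = len(kernel)
    left = k - 1 if causal else (k - 1) // 2
    right = k - 1 - left
    padded = [0] * left + list(frames) + [0] * right
    return [
        sum(w * padded[t + i] for i, w in enumerate(kernel))
        for t in range(len(frames))
    ]

frames = [0, 0, 0, 9, 0, 0]  # a single event at t=3
window = [1, 1, 1]           # simple sum over three frames

print("causal:    ", temporal_filter(frames, window, causal=True))
print("non-causal:", temporal_filter(frames, window, causal=False))
```

In the causal run nothing before t=3 changes, so it can be processed as a stream. The non-causal run already reacts at t=2 because it peeks one frame ahead.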
>>
>>106750460
There is a way to get neta to work on forge with this extension. You have to recode some stuff but it's fairly simple to do. Though if the dev is still active you could probably see if he can add it officially
https://github.com/DenOfEquity/Lumina2-for-webUI
>>
>>106750546
with sound: https://xcancel.com/angrypenguinPNG/status/1973077740333994056#m
>>
File: soy4.jpg (60 KB, 909x1024)
>>106750575
>>
>>106749980
Popular with free tier jeets and other such thirdies who are the largest and fastest growing user segment lol. I use ChatGPT plus for work and have sora access and the explore feed is constant like Indian shit and shitskins all over. And it is absolutely terrible for video (excellent still image generator though the best one I’ve personally used), video with sora 1 feels like sd1.4 did, reroll dozens of times for one mediocre result and never again kek. But enough about the cloud.
>>
File: 1736716020795536.png (251 KB, 1756x788)
>>106750471
>27.52/3.58 = 7.69x faster
>better scores
lmao, they found magic here, gimme the weights NOW!!
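
Quick sanity check of that ratio. A minimal sketch, assuming both figures are generation times in the same unit (minutes is a guess based on the "28 minutes -> 4 minutes" summary earlier in the thread); `speedup` and `time_saved` are hypothetical helper names:

```python
def speedup(baseline: float, accelerated: float) -> float:
    """How many times faster the accelerated run is than the baseline."""
    return baseline / accelerated

def time_saved(baseline: float, accelerated: float) -> float:
    """Fraction of the baseline wall-clock time that is no longer needed."""
    return 1.0 - accelerated / baseline

base, fast = 27.52, 3.58  # figures quoted in the post; the unit is assumed
print(f"{speedup(base, fast):.2f}x faster")
print(f"{time_saved(base, fast):.0%} less time")
```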
>>
>>106750575
Great, try inputing a loli pantyshot.
>>
>>106750598
>The code and pretrained models will be released after the legal review is completed.
soon...
>>
>>106750362
Isn't there also another anime lumina tune coming, anlia or something?
>>
>>106750610
>after the legal review is completed.
translation: "we will cuck Wan"
loooool, can't we get this shit by ourselves instead?
>>
>>106750144
Speaking of brown 1girls, do the “modern” models like flux and the stuff post sdxl understand race? My day to day is basically anime models trained on boorutags and danbooru has multiple times struck down adding race or ethnicity tags. Hard to make girls that aren’t Asian or “anime white” features even if it can make their skin dark.
>>
>>106750598
this looks honestly too good to be true, it's weird
>>
I have yet to train a lora. Do some of you constantly train loras if you want to make something specific? Is 10 GB of VRAM enough? I want to make loras for specific armor types from different periods
>>
>>106750285
Delicious choco princess more pls
>>
>>106750452
Guy needs to learn versioning lmao he’s as bad at update tracking as auto1111 was
>>
>>106750617
I believe so
>>106750573
I played with it, I'm not a fan of it not working with the main UI because it leaves a ton of things on the table that could make better gens
>>
>>106750639
what model?
>>
File: 484545455414.webm (3.14 MB, 1920x1080)
>>106750546
>>
File: radiance.png (2.83 MB, 832x1488)
>>106750630
it depends. almost all can be convinced to make some variations of brown or asian girls. but some really are severely "sameface" models. many don't have a particular opinion on some nationalities.

if you really wanted to make picture books with each nation on the planet you'd likely be a bit disappointed how basically nothing is THAT credible, but you can get a pretty wide range of looks from some models
>>
>>106750685
There is barely any movement in these, is just camera panning shots...
>>
>>106750685
>nu-anime 2010+ artstyle
put in the trash
>>
>>106750695
this + sdxl and 1.5 models have really good range if you just try it
get the face you want, gen a good amount of angles (or just cheat and use wan on one) and then train in the model you actually wanna gen in. ez.
>>
>>106750669
i use flux, qwen, sdxl
>>
>>106750702
>>106750703
...just like flat forgettable modern animeslop then
>>
I'm feeling so safe right now
>>
File: 1733256558263449.mp4 (2.28 MB, 2048x738)
>>106750637
there's some examples on their page
https://hanlab.mit.edu/projects/dc-videogen
>>
>>106750695
Well the prime example I can think of is like any of the sdxl models that are popular and primarily booru trained, try prompting an Indian woman, dot or feather (“why would you want to” aside). It’s quite difficult. But humorously a strongly Indian character gets you there like prompting “symmetra (overwatch)”. But then that also adds the headgear which is hard to remove. I just want to goon around the world is it so much to ask
>>
>>106750734
and suddenly i'm not interested in this tech anymore
>>
>>106750734
the examples look fucking good
it's black magic wtf
>>
>>106750734
LMAO
>>
File: 1746369367016443.mp4 (3.95 MB, 2048x732)
>>106750734
I really doubt it keeps the quality, look at the first frame, you have weird artifacts on the compressed VAE one
>>
File: radiance_indian_dot.png (3.08 MB, 832x1488)
>>106750737
a bunch of sdxl models wouldn't be too bad at this especially if you can use qwen image edit / flux kontext to add the more fancy extras, which works for a whole lot of things

but yes you have a chance with newer models. the model switch itself will get you a few 1girls too even if they're severely sameface/samebody for some or most nations.

> try prompting an Indian woman, dot or feather
here is one
>>
Are there any ComfyUI nodes that can make a 'ding' noise when the gen is ready?
>>
>FINALLY figure out why comfyui was ooming on me
>start fucking around with 2.2 and discover the horrid news on the state of the lightning loras
>oh fuck oh god now there's a new speedup
>oh fuck the SaaS jews are at it again
can this scene calm the FUCK down for FIVE SECONDS?!

>>106750834
>he redeemed the racemixing fetish
>>
File: 1739140321073514.png (1.14 MB, 2958x542)
>>106750828
doa, it's unable to keep the details of the original vae
>>
>>106750828
It might be useful with some snakeoils. The quality loss is noticeable but the reduction in gentimes is genuinely crazy.
>>
>>106750828
Another "free lunch" that isn't. Shocking!
>>
File: i3066_w.jpg (522 KB, 1600x1120)
>>106750737
>feather
not a bad idea
>>
>>106750848
good enough if it allows a 14x speedup
>>
>>106750856
desu I wish they had gone for something less aggressive, I would be ok with 7x if the quality is almost the same, but here it's too noticeable
>>
File: 548445454521.webm (3.79 MB, 1920x1080)
>>106750685
>>106750702
I'm sure you can prompt for movement
>>106750427

If you saw these clips in the wild you'd think they are real, that's what's most impressive.
>>
>>106750834
I guess my mistake was downloading the most popular models and mixes which are anime oriented. Perhaps it’s time to take the post-sdxl pill. I do have a 5070ti it shouldn’t be too painful
>>
File: chroma_indian_feather.png (2.91 MB, 832x1488)
>>106750858
feather. repost because filename was wrong, it's chroma base
>>
>>106750855
But this is what I was talking about: the costume is right, but the character basically looks like your standard “Asian” anime character, even with the tan.
>>
>>106750868
I think it's worth testing with wan2.2 to see if it's any better
because if you combine this with lightning lora, it would be like a few seconds per video, unreal
>>
>>106750598
>mememarks showing it has better quality
>>106750828
>>106750848
>the reality
once again, your eyes will always be the best judge
>>
File: question.jpg (154 KB, 1105x619)
So I decided to try creating a video to see how it works, but the part where it says model_type FLOW, the program takes forever to continue. It can take up to 10 minutes before the next line, "Requested to load WAN21", finally appears.
Is this normal?
For example, in the pic related, the video took 17 minutes to generate. However, the GPU only worked for about 2 minutes. The rest was a long wait...
>>
>>106750890
it's ranma so yeah, but i think she looks a little native
>>
>>106750853
give me your lunch money, faggot!
>>
https://www.reddit.com/r/StableDiffusion/s/iK3iRC9uqP

This Reddit post basically proves /ldg/ is being sour grapes about Hunyuan 3.0.
>>
>>106750882
>shit encrusted fingernails
Holy shit this model is accurate
>>
File: chroma.png (3.17 MB, 832x1488)
>>106750877
various realism-focused sdxl tunes *definitely* have more range than many of the anime finetunes

chroma base also recommended by me. pic is chroma but prompting a feather necklace and feather ear jewelry
>>
>>106750923
>wow, close up of plastic characters
you really owned the libs anon!
>>
File: file.png (50 KB, 2361x284)
The best safety is when the program refuses to comply with any order.
>>
>>106750924
the same way that slop infection in the training data causes models to reproduce the AIslop look sometimes when it shouldn't, meme slop data infection is going to make poop start appearing when you just prompt for normal images of Indian men and women
>>
>>106750903
Are you using torch compile? Also are you loading from HDD?
>>
>>106750923
who cares, Comfy already made Tencent kneel in submission and we'll get a 20b model instead
>>
File: chroma.png (2.58 MB, 832x1488)
>>106750942
yea, i definitely did not prompt fingernails
>>
>>106748324
you say this like once a month and it's still garbage. fuck you
>>
>>106750952
cumra took the call
comfyOS any day now!
>>
>>106750835
https://github.com/royceschultz/ComfyUI-Notifications
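
If you'd rather not install a node pack, a rough alternative is to poll the server from a separate script and ring the terminal bell once the queue drains. This is only a sketch: it assumes ComfyUI is on its default local address (`http://127.0.0.1:8188`) and that `GET /queue` returns JSON with `queue_running` and `queue_pending` lists. The helper names are made up for illustration and are not part of the linked repo.

```python
import json
import time
import urllib.request


def queue_is_idle(payload: dict) -> bool:
    """True when the payload lists nothing running and nothing pending."""
    return not payload.get("queue_running") and not payload.get("queue_pending")


def wait_and_ding(base_url: str = "http://127.0.0.1:8188",
                  poll_seconds: float = 2.0) -> None:
    """Poll the (assumed) /queue endpoint until it is empty, then ring the bell."""
    while True:
        with urllib.request.urlopen(f"{base_url}/queue") as resp:
            payload = json.load(resp)
        if queue_is_idle(payload):
            # ASCII BEL: only audible if the terminal has its bell enabled.
            print("\a", end="", flush=True)
            return
        time.sleep(poll_seconds)
```

Queue your gen, then call `wait_and_ding()` from a terminal. Swap the `print("\a")` for whatever sound player you have if the terminal bell is muted.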
>>
>>106750936
Ultimate saas cuckening, pay them money for them to do nothing.
>>
>>106750875
>If you saw these clips in the wild you'd think they are real, that's what's most impressive.
desu that happens often with new models. dont reply and call me a faggot i think the clips are cool im jus sayin
>>
File: ComfyUI_00002_.mp4 (1.11 MB, 640x640)
>>106750948
Yes, I'm running from the HDD. I don't know if I'm using "torch compile", I just installed the portable Comfy UI and let it take care of everything for me.

Note: I just created a second video, and it only took 6 minutes to generate (versus 17 minutes from the first one). Apparently, the extremely long wait is only in the first generation, I don't know...
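
For anyone wanting to check whether the disk is the bottleneck, here is a minimal sketch that times a sequential read of a file and turns a throughput figure into a rough load-time estimate. The function names and the example numbers are assumptions for illustration; nothing here is ComfyUI-specific.

```python
import os
import time


def read_throughput_mb_s(path: str, chunk_bytes: int = 16 * 1024 * 1024) -> float:
    """Read a file start to finish and return the observed throughput in MB/s."""
    size = os.path.getsize(path)
    start = time.perf_counter()
    with open(path, "rb") as f:
        while f.read(chunk_bytes):
            pass
    elapsed = time.perf_counter() - start
    return (size / 1e6) / max(elapsed, 1e-9)


def estimate_load_seconds(model_bytes: int, mb_per_s: float) -> float:
    """Rough lower bound on the time needed just to pull a file off disk."""
    return (model_bytes / 1e6) / mb_per_s
```

Point `read_throughput_mb_s` at one of your model files to get a number for your drive. A second run on the same file is usually much faster because the OS keeps recently read data cached in RAM, which would be consistent with the second gen being quicker than the first.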
>>
>>106750875
https://xcancel.com/patience_cave/status/1973129728266776669#m
kek
>>
>>106751004
>Yes, I'm running from the HDD
xD !
>>
File: file.png (2.92 MB, 832x1488)
>>
>>106750973
>all the safety crap to be fighting against "ai going rogue"
>all they do is to make ai refuse orders
logical
>>
>>106750875
>If you saw these clips in the wild you'd think they are real, that's what's most impressive.
true, some of those clips are really close to real animation
https://files.catbox.moe/1nhqy9.mp4
fucking 4chan still not allowing sound, this will be a pain once we get a local model that produces video with sound
>>
>>106751025
>"ai going rogue"
Nigga it's literally all anti-porn and anti-problematic stuff. Nobody cares about it going rogue, but they'd kill the model first before they'd let it say a singel bad thing about womxn or israel
>>
>>106751040
>Nigga it's literally all anti-porn
obviously
their biggest enemy is a nipple showing or bouncing boobs, not the ai going rogue
>>
>>106751040
Nothing is more unsafe than woman and jews
>>
File: 1511270610943.jpg (41 KB, 374x374)
Is conditioning concat better for prompting two characters in one pic, or does it change nothing?
>>
>>106751013
If I copy all the folders from the portable ComfyUI to the SSD, will it improve? Or have I already screwed everything up and have to start from scratch?
>>
>>106751068
maybe
>>
I've never felt safer in my life
>>
The speed of Chroma wouldn't be so bad if wrangling every other aspect didn't take like 10 gens per image. I give up. It was a fun ride, but unless someone picks up where the furry left off, I think I'm done with that fuckass model.
>>
>>106750936
Which is why Sora 2 is completely useless until the tech behind it is open sourced.
>>
>>106750702
>There is barely any movement in these, is just camera panning shots...
ngmi
https://xcancel.com/humbleguava/status/1973134504702378330#m
>>
>>106751070
Portable should be fine with just moving.
>>
>>106751036
That's pretty cool but honestly like 4o before only a matter of time before they safetize it
>>
>>106751070
it's a waste of space on nvme, save that for OS and models only
>>
>>106751077
the best time i had using chroma was with two of the experimental loras that would let you gen an image in around 8-9 seconds but those don't exist anymore unless they've been reuploaded.
>>
>>106751084
>can't show violence
>can't show sexy
Obviously the result is just boring. It's both impressive (the voice sounds good), and boring, because any spice is removed for safety.
>>
File: 1733376902029904.mp4 (360 KB, 352x640)
https://xcancel.com/cloud11665/status/1973115723309515092#m
bruh... I want this, it's soo fucking cool
>>
>>106751103
https://huggingface.co/clover-supply/Chroma-loras/blob/main/chroma-unlocked-v4x-hyper-turbo-flash-r64-fp32.safetensors
>>
>>106751077
Chroma HD Flash speed is okay on 24GB.
>>
>>106751110
It is impressive, very coherent, but it's too constrained.
>>
File: radiance.png (3.11 MB, 832x1488)
>>106750936 >>106750973
predictable. it's also predictable that most SaaS will bend the knee and cancel political parody and so many other things... like maybe not depictions of anyone with an illegal musket without the loicense visible and it being clear that it's an authorized shooting range.

pic here is actually current radiance now
>>
is chroma so slow just because you can use cfg>1 or is there another reason on top of that
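
For context on the cfg part, a rough sketch of why cfg>1 costs more: classifier-free guidance mixes a conditional and an unconditional prediction at every step, so each step needs two model evaluations instead of one. The functions below are illustrative only (plain lists stand in for tensors) and say nothing about other reasons a given model might be slow.

```python
def cfg_combine(uncond: list[float], cond: list[float], scale: float) -> list[float]:
    """Standard classifier-free guidance mix: uncond + scale * (cond - uncond)."""
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


def forward_passes(steps: int, cfg_scale: float) -> int:
    """Model evaluations per image, assuming the uncond pass is skipped at cfg 1."""
    per_step = 1 if cfg_scale == 1.0 else 2
    return steps * per_step
```

At scale 1 the mix reduces to the conditional prediction alone, which is why the unconditional pass can be skipped there. Under that assumption the same step count at cfg>1 is about twice the compute.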
>>
>>106751084
Lol that gundam is static the entire time, looks awkward af but at least something I guess.
>>
>>106751128
An.. Indian woman?
>>
>>106751110
>>
how long do you think this nu cloud shit will keep them going for
i suspect at least a week maybe two tops
>>
>>106751016
This is cultural appropriation sir
>>
>>106751110
how does that work? they even got the voice sound of Sam right, do you have to provide an image input + a clip sound of him or something?
>>
>>106751151
Remember Sora1?
https://www.youtube.com/watch?v=xbxmDYk1l2w

I give them one month to have every normie and normie geek channel saying "wow it's crazy" and other "is it the end of movies".
Then little by little the limitations will be obvious, you literally can't do anything fun with it.
And people will forget it.
>>
>>106751170
the thing with sora 1 was that the official clips made by OpenAI looked good, but the clips made by regular people were ass. it's not the case here, those clips are made by regular people and they look really good
>>
File: 00517-3119191728.png (1.79 MB, 1248x1824)
>>
>>106751181
Sure but :
- can't do real people
- can't do violence
- can't do sexy or even hint at it
- can't do crass humor
- can't do caricatures

At some point even the normiest of the normie will be bored.
>>
>>106751128
>gather round people, 'runs with train' will tell us the legend of the two shape shifting creatures, bob and gene
>>
>>106751199
>can't do real people
it can? that's the whole point of sora 2, the cameo thing, that's why they used Sam Altman and some OpenAI engineers as examples in their demo
>>
Okay, guys, so I did the following:
>installed the portable Comfy UI
>downloaded a text-to-video workflow
>downloaded an image-to-video workflow

I can already do basic things, but if I want to improve the results, what should I look for now? Specific "loras" for WAN2.2 or something like that?
>>
>>106751211
interpolation, upscaling, loras, keep trying to gen the stuff you want, you will eventually learn
>>
File: literally.mp4 (1.12 MB, 704x1280)
>>106751151
ngl, Sam is great at hyping shit, he just has to snap some fingers and a lot of people are ready to dance with him
https://xcancel.com/GabrielPeterss4/status/1973096194508251321#m
>>
File: 1737436589421232.png (115 KB, 1603x786)
>>106751209
I meant known people.
>>
>>106748323
>>106751110
It's unironically over.

>>106751068
Last time I checked (a month ago), yes.
Weirdly not many people know about it.
>>
File: Api kek'ed.png (1.91 MB, 2000x1333)
>>106751230
So no Will Smith? bullshit!!
>>
>>106751167
It was trained on him obviously. In an unconstrained and unfiltered environment, it could probably do celebs as well. But they do have a feature where you scan a video of your face and voice, and with verification you can make these fake type of videos of yourself. This is very far ahead of what we can do locally. I guess that's what happens when China is the leader in open models and they're giving us scraps anyways all the time.
>>
>>106751247
>>
>>106751238
No will smith, no slap, no nothing.
This shit is as impressive as boring.
>>
File: file.png (2.69 MB, 832x1488)
>>106751138
yea. it does certainly seem weaker than base chroma at indians (maybe nationalities in general) and perhaps photos in general at this point.

but it's not like it's a complete mess
>>
File: x.png (3.47 MB, 832x1488)
>>106751158
idk why you people memed that one into existence in the usa, really
>>
File: radiance.png (2.56 MB, 832x1488)
>>106751201
guessing that prompt has pretty interesting results on a sufficiently powerful LLM
>>
>>106751068
>>106751234
>conditioning concat
qrd? I'm very interested in
>prompting two characters
situations as it's very gay to do
>>
File: RA_NBCM_00023.jpg (919 KB, 1872x2736)
>>
File: ComfyUI_00005_.png (3 KB, 904x1152)
Is it qwen edit example workflow feature or one updoot too much syndrome?
>>
>>106752217
>>106752241

New >>106751249


