/g/ - Technology


Thread archived.
You cannot reply anymore.




Discussion of Free and Open Source Text-to-Image/Video Models

Prev: >>106952799

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/sd-scripts/tree/sd3
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts
https://github.com/tdrussell/diffusion-pipe

>WanX
https://comfyanonymous.github.io/ComfyUI_examples/wan22/
https://github.com/Wan-Video

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
Training: https://rentry.org/mvu52t46

>Neta Lumina
https://huggingface.co/neta-art/Neta-Lumina
https://civitai.com/models/1790792?modelVersionId=2203741
https://neta-lumina-style.tz03.xyz/

>Illustrious
1girl and Beyond: https://rentry.org/comfyui_guide_1girl
Tag Explorer: https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/b/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
>>
let's enjoy bouncing breasts and asses
>>
>>106957370
>>
AHHHHHHHHH CATBOX IS DOWN

Can't up these mp3 diffusions.
>>
>>106957402
https://vocaroo.com/upload
>>
File: screenshot.1761011513.jpg (237 KB, 816x811)
For those that missed, InvokeAI has joined Adobe. The main team will no longer be working on the free edition. It is effectively dead.
>>
>>106957417
is it really news if nobody used it?
>>
File: ComfyUI_00521_.mp4 (1.16 MB, 720x1280)
titty bounce lora
>>
>>106957422
I guess not. It was never even in the OP.
>>
>>106957417
I wonder how adobe will disfigure it.
>>
>>106957414
YuE gen goes hard
https://voca.ro/1giaGXFhN5m7
>>
>>106957414
>https://vocaroo.com/upload
fanks
https://vocaroo.com/12wVie68juN5

Still learning to use songbloom
>>
>>106957417
i'm sure that was the hope from the beginning
>>
>>106957433
nice but

>random brightness
is there truly no way to get rid of that wan issue?
>>
>>106957437
No you don't get it. The team joined Adobe, meaning Invoke is not being worked on anymore.
>>
>>106957439
>YuE gen
wats dis
>>
>>106957445
its from the lora(s) .. shitty loras do shitty things
>>
your prompts are logged and will be used for blackmail in the near future
>>
>>106957451
My prompts will be used to promote me in the 4th reich.
>>
>>106957446
Can't the local project be maintained by other people then?
>>
>>106957447
https://github.com/joeljuvel/YuE-UI
>>
File: ComfyUI_00522_.mp4 (1.53 MB, 720x1280)
walky walky
>>
>>106957391
Good collage
>>
>>106957467
Yeah but no one is going to make any serious contributions to it.
>>
>>106957477
that hot cat walk style will never be available on cloud models
>>
>>106957441
I like this version better, different lyrics.
https://vocaroo.com/1bZd4EkRGas1
>>
>>106957496
thank god for that. stop fucking objectifying women.
>>
>>106957473
does the yue thing work on comfy?
>>
Chroma is great for hideous body horror and not much else
>>
>>106957504
so true so true
now if you excuse me, I'm gonna go gen more hypnotic swaying hips walks
>>
>>106957513
>does the yue thing work on comfy?
Not officially
https://github.com/smthemex/ComfyUI_YuE

Exl2 through that gradio UI that I linked is the fastest way to run it though.
>>
>>106957517
nah i make damn good shit with chroma. just takes a few rolls.
>>
>>106957537
Yeah, I'll wait lmao
>>
i remember some anon said i'd be burnt out from ai. been going a year strong gen'ing nearly daily and im not even close to losing interest. i feel sorry for people that have no imagination and feel they hit a wall. actual NPC's.
>>
bros when are we getting an audio network that can moan
>>
>>106957568
Some simply are not cut out for it
>t. 3 year club
>>
>>106957572
https://huggingface.co/phazei/NSFW_MMaudio/tree/main
>>
>>106957467
Why would anybody bother maintaining a local UI in 2025? Local is all about the grift, and Invoke already got bought out just like ComfyUI. Anyone working for free on someone else’s project is a sucker
>>
>>106957574
>cant do dialogue too
worthless
>>
File: ComfyUI_00046_.mp4 (1.68 MB, 832x832)
>>
songbloom is suno at home.
>>
>>106957544
>i make damn good shit
as long as you don't look at the hands...
>>
File: ComfyUI_00047_.mp4 (1.93 MB, 832x832)
>>
>>106957587
why does it suck that much at hands though?
>>
>>106957597
512x512 training with half the dataset being furry scat.
>>
>>106957422
invoke was pushed as a more stable normie-friendly solution with a smoother inpainting ui, and it was good for that. it makes sense that adobe decided to kill them since it clowned on their implementation.
>>
>>106957547
https://vocaroo.com/1ao5t1Ynsn6y

It's never going to be implemented in Comfy anon. At least not until the model gets an update of sorts.
>>
>>106957568
not everyone is a gooner like you
>>
>>106957595
cool
>>
>>106957607
nice. songbloom is hard to control. idk why this has a female vocal.

https://vocaroo.com/12QgKzwaKo15
>>
>>106957629
>not everyone is a gooner like you
I struggle to understand anyone who's into this but not a gooner. The year I started genning I gooned 3x harder than I ever had in my life, yet I still got bored of genning.
>>
File: ComfyUI_00048_.mp4 (1.67 MB, 832x832)
Good night anon
>>
>>106957629
I don't always gen goon material. I make actual art sometimes too and it gets reposted quite often in generals I frequent. I have so many potential things I want to do I simply don't have the time to do it over NSFW content, that's all.

anyone saying they're 'bored' of gen'ing is creatively bankrupt though, especially if you use AI to write your prompts or give you ideas. then you definitely have hit the creative dead end.
>>
>>106957645
nobody cares.

I'm white.
>>
>>106957645
>yet I still got bored of genning
you lack creativity
>>
>>106957602
Shouldn't a basic hand lora fix the problems though? People keep constantly bitching about hands but there is no Chroma hand lora available yet. Clearly not that big a deal.
>>
>>106957673
Me too.

>>106957678
No, I don't.
>>
>>106957645
i've never wanked it to anything i've genned.. just doesn't work for me when I know how the sausage is made
>>
>>106957644
Not genning these. Too lazy, these are from the demo
https://map-yue.github.io/
>>
>>106957679
I never get great hands with chroma. Ever.
>>
>>106957684
see >>106957678
>>
>>106957681
Yes, you do. A creative person wouldn't be bored. Perhaps the problem is you're not motivated because your AI art doesn't get the attention you want. This is another reason I don't care much for SFW AI art. Doesn't matter how creative or good it is, it will always get shit on by most people.
>>
>>106957686
Yeah, perfectly stated. not comfy, cba.
>>
File: morrissey bored.png (69 KB, 1008x469)
>>106957700
>>
>>106957689
ok why not just make a damn hand lora already. it's not a big deal for me personally because most of my stuff doesnt involve complex hand positions
>>
>>106957686
Composition in YuE is insane though
https://vocaroo.com/19yARcSdjCzf

No other open model compares so far. It's the closest thing to Suno
>>
>>106957725
Are you chinese or something? I posted better examples from songbloom.
>>
>>106957717
ADHD/depression/anhedonia/etc. its not normal to be bored of everything.
>>
File: ComfyUI_00528_.mp4 (2.14 MB, 720x1280)
>>
File: 1760461844356622.jpg (230 KB, 2000x991)
>>106957746
Oh I'm "not normal"? Great, thanks, noted, etc

Will try to be more creative in future!
>>
>>106957761
me on the right
>>
>>106957775
Same! :D
>>
we have people that gen the same 1girl in the same pose hundreds of time for weeks on end. there is something clearly wrong in their brain chemistry.
>>
>>106957795
Refinement.
>>
>>106957742
Songbloom only has better sound quality. Its compositional ability is not YuE tier. Though YuE skips a lot of lyrics so there's that, but the YuE songs are superior in quality.
>>
>>106957795
yeah I feel like the past threads didnt have enough crouching pointing at viewer laughing 1girls. Do you agree?
>>
>>106957800
acestep kicks yue all over town in that respect.
>>
>>106957602
it is pretty shocking how good chroma is at scat. fucking disgusting haha
>>
the nogen has "opinions"
>>
>
>>
>>106957834
half of the dataset is degenerate furry slop. what did you expect?
>>
Here's acestep :^)

https://vocaroo.com/1lMkksHMvinz
>>
File: dmmg_0097.png (1.24 MB, 896x1152)
>it's one of those threads
>>
>>106957883
join in and argue who is more or less autistic
>>
>106957883
vomit

I'M EATING
>>
>>106957817
This is the first song from ACE Step's showcase
https://vocaroo.com/1JPCPL96ecq8
If you think that is YuE tier you are retarded. There are many issues here, not just skipping lyrics, but that is a robot singing and it clearly doesn't have a good grasp on the English language (though it seems to have a grasp on Chinese, I can't speak that so don't quote me on it).
Its voice quality while just as loud is also not as good as Songbloom.
>>
i heard there's a benefit to running comfy on edge instead of chrome or firefox? why so?

>>106957883
my fav lora
>>
acestep is garbage
yue is garbage
suno is garbage
udio is good
>>
wake me up when i can make a synth sound like a real violin with AI
>>
>>106957911
>udio is good
They shitted it up.
>>
>>106957725
Second version using same prompt/lyrics
https://voca.ro/1iergfR1M0ly

Neither of these are something ACE Step could pull off. The varying intonation, the musical composition, the chorus... Even with matching instruments it would go off into some unnatural sounding mess.
>>
>>106957954
how so?
>>
File: dmmg_0094.png (1.23 MB, 896x1152)
>>106957897
i think anyone using a name without a tripcode is clearly leading the pack on that one

>>106957908
ty anon, almost done generating the samples for civit
>>
Unrelated observation: my Chinese ain't so hot but I feel like you might not want 风格,作品 (roughly "style" and "artwork") in the negative prompt for certain kinds of WAN gen
>>
>>106957961
provide lora sir
>>
>>106957958
idk. One day they shitted it hard.
>>
>>106957968
right...
>>
>>106957911
The Chinks are close to an Udio quality model, just a matter of iterating on YuE.
>>
>>106957972
?

I went a whole day of genning without a good gen. canceled. They shitted it.

Respect your elders, child.
>>
>>106957965
elaborate
>>
>>106957987
>they uhh they just downgraded cuz uhhh because uhhhh
>>
>>106957961
VOMIT

I'm EATING
>>
File: bimbo_grid.jpg (2.52 MB, 3132x3132)
>>106957966
still making contact sheets

>a candid photograph of a 30 year old woman. she has long straight black hair with bangs. she has striking eyes and heavy eyebrows. She is wearing a white bikini and laying on the beach. white sand all around, two seashells are scattered in the sand nearby. she lays on a teal towel in the bright overhead sun. a white sunhat lays near her.
>>
sad to hear comfyorg will probably sell off the project at some point
>>
File: ComfyUI_06352_.png (1.33 MB, 1144x904)
>>
>>106957989
If you're trying to gen, say, an anime video then by its nature it is going to be a stylized drawing and not photorealistic. So those two terms in the negative prompt would be counterproductive. That's my intuition anyway.
>>
>memegen made it to the collage, but not my other gens
sad
>>
>>106957997
So anyway, how's your transitioning?
>>
File: ComfyUI_06354_.png (1.35 MB, 1032x1008)
>>
>>106958024
same way your weight loss journey is going
>>
>>106958024
meant for
>>106958006
>>
>>106957208
meant for
>>106958036
>>
>>106957908
firefox has a bunch of bugs when I use it for comfy. chrome just werks. dunno about edge
>>
File: cheeks_grid.jpg (2.57 MB, 3132x3132)
>>106957908
which one
>>
>>106958109
the tranny lora
>>
>>106958109
i think the best is 0.8 str with 7 guidance, or 1.0 str with 7 guidance
>>
>>106958109
VOMIT

I'm eating!
>>
I've heard that nvidia is basically a must for ai, does the cpu make a difference? I don't know if I should go for ryzen or intel
>>
File: dmmg_0006.png (1.77 MB, 896x1152)
>>106958132
anon picks the one config that is absolutely bonkers lmao. guidance 8, 1.0 str is absolutely bugged in gen.
>>
>>106958192
>does the cpu make a difference
no but I think amd is the go to nowadays
>>
>>106958194
VOMIT VOMIT VOMIT
>>
>>106958194
but maybe the 1.0 7 guidance isn't bugged
>>
File: dmmg_0007.png (1.61 MB, 896x1152)
>>106958210
yeah this .8str 7 guidance is solid, thanks anon
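The strength/guidance combinations being compared in these contact sheets can be enumerated up front instead of picked by hand. A minimal sketch, assuming you just want the list of settings to feed into whatever queues your gens; the `sweep` helper and the dictionary keys are made up for illustration and are not part of any UI's API:

```python
from itertools import product

def sweep(strengths, guidances):
    """Return every (LoRA strength, guidance) pair in row-major order."""
    return [
        {"lora_strength": s, "guidance": g}
        for s, g in product(strengths, guidances)
    ]

# Values mentioned in the thread: 0.4 / 0.8 / 1.0 strength, 6.5 / 7 / 8 guidance.
grid = sweep([0.4, 0.8, 1.0], [6.5, 7.0, 8.0])
for cell in grid:
    print(cell)
```

Keeping the seed fixed across all cells is what makes a sheet like this comparable; that part depends on your frontend and is not shown here.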
>>
>>106958221
VO
MIT
>>
>>106958227
can you fuck off nigger
>>
>>106958232
Like put on some CLOTHES
>>
>>106958232
just ignore him
>>
>>106958243
Oh yes ignore the guy who's tryin ta eat
>>
These "family size" cracker boxes only have 1 snack in them. donny's economy.
>>
File: 1730036204094670.jpg (653 KB, 2670x1693)
babe wake up, a new model got released
https://jiuhaichen.github.io/BLIP3o-NEXT.github.io/
https://huggingface.co/BLIP3o
>>
>>106958254
wtf is this shit?
>>
>>106958261
Behold, the dragon awakens.
>>
fill me up with slop daddy
>>
>>106957645
>I struggle to understand anyone who's into this but not a gooner.
Professionals. I have a friend in textile design, she was so offput by the idea of genning at the start I could barely talk to her about it. Last time we've talked, she was running overnight batches in Comfy for three different jobs. Because let's face it, prompting isn't hard, having credentials in the industry is.
>>
>>106958201
okay but why
>>
>>106958276
I kind of understand a textile industry person's interest back when we had tile controlnets—I made a floral backdrop for a graphic for a gift once—but how do people do tiled gens in the newest models?
>>
>>106958254
>bad
>good in a worse way
Garbage
>>
>>106957968
You've probably noticed that the model is now way worse than what it used to be. Well, that is expected. Their monetization tactic is two-fold. First, they offer a decent model to garner public interest. Then, they gather attention of music labels. Some underground labels make a deal with them to get exclusive rights to some aspects or versions of the model. Then, they patch the model and the public only gets accessed to a watered down version of it. So they continue milking the masses who use their model, while the real money is in the underground label which is using it for their own songs. It is genius when you think about it, the music industry is teeming with Jews ready to do this. There's a lot of money in AI music, and even more than first meets the eye (we're talking billions of $ in exclusive deals with labels).
>>
>>106958276
>she
ahahahahahahah

AHAHAHAHAH

aaaaHAHAHAHAHAHAHHHAAHAHAHHa

woooooeeeee ahahahahahAHAHAHAHAH
>>
Qwen on Windows 10: 650 seconds
Qwen on Fedora KDE: 250 seconds

I have a 3080
>>
>>106958109
0.4 at most. Otherwise it looks like a lora for lighting conditions and camera grain instead of cheeks.
>>
File: dmmg_0021.png (1.88 MB, 896x1152)
>>106958323
honestly, any lora on krea is just fighting for its life against the built in styles (0.4str/6.5)
>>
>>106958254
why would you release something this bad in the year of our lord 2025?
>>
>>106958276
>prompting isn't hard
you say that but i read what anon posts and i have to believe he profusely disagrees
>>
>>106958290
I'm fairly certain she fixes complicated ones by hand the way she always did. But most prints don't require much work by virtue of being on plain background, like those Dior Delft Blue ones.
>>
>>106958335
It's BLIP, it was meant as a mental exercise, not as a working replacement for anything.
>>
File: file.png (2.73 MB, 2560x1354)
Hey /ldg/ I've been working on a tool that acts like a local danbooru style gallery for cataloguing my gens, memes, and lewds.
Has anyone built anything like this?
>>
>>106958388
>you say that but i read what anon posts and i have to believe he profusely disagrees
Who, me? Who do you think I am? You're almost certainly mistaken.
>>
>>106958454
You built that?
>>
>>106958454
I just started working on exactly this
>>106957304
I'm aiming for something like your picrel, maybe without the rating, but yours looks great
>>
>>106958481
Yes, admittedly it's largely vibe coded. But I had a good idea of what I was after.

It leans on https://github.com/deepghs/imgutils for doing additional auto-tagging.
And took the basic idea from rclip https://github.com/yurijmikhalevich/rclip
To add clip search, and reverse image similarity support.

Right now there's no support for adding custom tags or anything in the interface, but the core functionality to do prompt-based tag search and semantic search is working.
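For anyone wanting to try the tag-search half of this themselves, here is a rough sketch of an inverted index over tags. It is not the tool from the post above and does not use imgutils or rclip; the file names, tags, and both function names are invented for the example. Semantic (CLIP) search would need an embedding model and is left out.

```python
from collections import defaultdict

def build_index(items):
    """Map each lowercase tag to the set of file names carrying it."""
    index = defaultdict(set)
    for name, tags in items.items():
        for tag in tags:
            index[tag.lower()].add(name)
    return index

def search(index, query):
    """Return files that match ALL tags in a space-separated query."""
    tags = query.lower().split()
    if not tags:
        return set()
    result = set(index.get(tags[0], set()))
    for tag in tags[1:]:
        result &= index.get(tag, set())
    return result

# Hypothetical library: file name -> tags pulled from prompts or an auto-tagger.
library = {
    "gen_0001.png": ["landscape", "sunset", "mountains"],
    "gen_0002.png": ["landscape", "forest"],
    "meme_0001.jpg": ["meme", "cat"],
}
index = build_index(library)
print(sorted(search(index, "landscape sunset")))
```

Set intersection keeps multi-tag queries cheap even for a large gallery, since only the per-tag sets are touched.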
>>
>>106958527
Wow, it looks really good. Nice, anon. I really wanna test this out.
>>
>>106958549
Give me a little bit to clean my real name out of the commit history and I'll post it.
>>
>>106958334
is the Lora trained on Krea or no? Ones that were trained on BF16 Dev work ok with it, ones that were trained on FP8 Dev are usually shit. The results look MUCH better though if you actually train on Krea.
>>
should I use qwen or qwen edit for inpaint model?
>>
>>106958776
no
>>
>real video in the collage
>>
FUCK MERGING MODELS
FUCK THESE FUCKERS UPLOADING SHARDED AND UN-NAMED FILES

FUCK THESE CUNTS

STUPID LAZY MOTHERFUCKERS
>>
He still use Wan
>>
Does anyone use Framepack here ?
>>
>>106958882
why not
>>
>>106958966
No. This is like the 2nd time I've seen this asked. No one uses Framepack anymore.
>>
Does anyone use Framepack here ?
>>
>>106958998
what's the point of the collage then?
>>
>>106957370
What model was used in the pic with the socks?
>>
Just realize how cucked our local model is compared to the one that jews use. Animators and Artists totally won hard.
>>
>>106959012
to put something in the op obviously
>>
I found this workflow to upscale a video. Why am I ooming with 32gb vram and 64gb ram? How can I improve it?
>>
Why haven't /ldg/ created their own model yet?
>>
>>106959040
I actually don't know why you're OOMing with 32GB VRAM using these settings and that resolution

but I suppose you could try using comfyui-multigpu's distorchv2 to offload to system ram, or use the kijai sampler where you can offload block layers

>>106958966
focus is mostly on wan, hunyuanvideo and framepack are very rare. feel free to post videos made with them tho
>>
File: 1737661786512591.jpg (1.78 MB, 2906x2415)
Is it really worth it doing this fucked up mess for 5 second video ?
Seems like pain in the ass desu. Is this because our local models so damn shit ?
>>
>>106959053
bigma
>>
>>106959117
you can use subgraph
>>
>>106958254
>3b
Cool, I like them small. It's still garbage, but at least it's not a bloat
>>
>>106959117
you only need like 5 nodes to do wan videos.
>>
>>106959145
>>106959166

In the end it becomes complex like that pic.
If it's not a simple prompt, it's not worth it.
>>
>just hide the mess bro
>sweep it under the rug
>>
>>106959103
Yeah I can't figure out why I oom so often when the models arent even that big.
Is it possible that a pytorch/cuda version mismatch or a corrupted install could be reducing the usable vram?

I'll try those two options when I return.
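A quick way to see what the installed PyTorch build actually reports, as a hedged sketch: it cannot prove or rule out a corrupted install, it only prints the versions and the free/total VRAM the runtime sees. `describe_torch_env` is a made-up helper name; the `torch` calls it uses (`torch.version.cuda`, `torch.cuda.is_available`, `torch.cuda.mem_get_info`) are standard ones.

```python
def describe_torch_env():
    """Collect version and memory facts that help when debugging OOMs."""
    try:
        import torch
    except ImportError:
        return {"torch": None}  # PyTorch is not installed in this environment
    info = {
        "torch": torch.__version__,
        "cuda_build": torch.version.cuda,  # None on CPU-only or ROCm builds
        "cuda_available": torch.cuda.is_available(),
    }
    if info["cuda_available"]:
        free, total = torch.cuda.mem_get_info()  # bytes, current device
        info["free_gib"] = round(free / 2**30, 2)
        info["total_gib"] = round(total / 2**30, 2)
    return info

print(describe_torch_env())
```

If the free figure is already far below the total before any model is loaded, something else (another process, a leftover ComfyUI instance, the desktop itself) is holding VRAM, which would be a more likely cause of early OOMs than the model size.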
>>
>>106959117
what is that even supposed to do lol
>>
Theres really no point to buy RTX 5090 if our local models is shit. Prove me wrong
>>
>>106959177
that's the primary reason python is popular in AI
>>
>>106959170
>In the end it become complex like that pic.
if you don't know how to organize your nodes. are you a toddler?

>>106959181
it's not even comfy. it's geometry nodes for blender I believe.
>>
>>106959177
Yeah that's why we have front ends(pretty thing you interact with) and back end(mess meant to be hidden). You do the same thing with nodes.
>>
>>106959196
>if you don't know how to organize your nodes. are you a toddler?
moar nodes, moar problems
>>
File: screenshot.1761032183.jpg (383 KB, 1729x647)
>>106959170
Make it look cute.
>>
>>106959196
I hate that shit in blender, i hate that in comfy.
FUCK NODES. TRANNY SCHIZO SHIT
>>
>>106959212
i've never had a problem getting overwhelmed or confused with nodes. any limitation I have is with the models themselves.
>>
Hi, 9070 XT owner here.

How bad is it?
>>
I don't see the interface anymore. All I see is latent space.
>>
>>106959230
You already know AMD is shit for anything AI related. This isn't a meme or some Intel vs AMD shitposting. You're free to try your luck though.
>>
>>106959235
ok slop neo
>>
>>106959230
60 fps on video games

0 fps on AI

Its AI is fucked either way. They regulated this shit harder than cigarretes. Just learn how to draw
>>
>>106959238
>slop neo
Kek
>>
>>106959238
>>106959268
Thats not a quote from neo you FUCKING RETARDED FAGGOT
>>
>>106959230
Pick up a pencil
>>
>>106959276
>>106959262
Why are you telling them to draw? More like get a job and buy a fucking 3090/4090/5090.
>>
>>106959236
Thanks. Just holding my breath until the ROCm support AMD promised is implemented.

And I thought local diffusions would be censorship free-ish?
>>
>>106959282
Some people have morals and boycott nvidia.
>>
>>106959288
You lost, kid?
>>
>>106959283
>And I thought local diffusions would be censorship free-ish?
fine tuned models are censorship free, e.g. chroma is uncensored, illustrious is uncensored, etc.
>>
>>106959288
then they have no business doing anything AI related because nvidia has a complete monopoly in this field.
>>
>>106959288
It's less moral more like common fucking sense to not support N*idia
>>
>>106959293
been here since wood screws but go on son
>>
File: hidream.png (2.32 MB, 1024x1024)
>>106959262
get a suitable gpu and gen locally instead of posting your SaaS depression in the wrong thread
>>
>>106959306
>hidream
why on earth would anyone still be using this piece of shit.
>>
>>106959288
Lol
>>
>>106959306
theres no point if i cant video generate donald trump sucking a hard shit from 16yo anus
>>
File: hidream.png (1.11 MB, 1024x1024)
>>106959288
because with a local gpu, you can even while using other models too. it is fully slopped but fairly capable.
>>
>>106959357
Now gen her with a penis
>>
File: hidream.png (1.99 MB, 1024x1024)
>>106959360
DIY. if you don't want to train/use loras, feel free to use another model.
>>
>>106959355
>theres no point if i cant video generate donald trump sucking a hard shit from 16yo anus
You can absolutely do that though. The issue with anuses is that all loras are trained on adults (no yummy kidbutt yet) so 16 year old anus is incredibly doable especially if you're just doing a boring i2v

>>106957568
I burnt out of WAN, not out of AI. Wan is too limited compared to Sora and I can only use it for quick goons now
>>
Can someone sneak in to Sora and Grok data center and steal their models ?? You did this before. You can do it again
>>
>>106959502
Didn't someone already do that not long ago?
>>
if /ldg/ just jumped on the H100 train a lot of these issues wouldn't be talked about
>>
>>106959524
The issue is not GPU, its the models. Even WAN jewed out with their 2.5
>>
File: 00028-1769928823.png (1.72 MB, 1248x1824)
sdxl for another year right local bros?
>>
>>106958310
I noticed no change at all, I'm using the same model as before.
>>
File: 00037-2765771820.png (2.05 MB, 1248x1824)
>>
File: WanVid_00001.webm (1.45 MB, 720x960)
made with love
>>
>>106959582
>Fast Inference
>Perfect Prompt Adherence
>Fast and easy to Train
>Good ControlNET support
>Low VRAM
>Superior Quality
I don't think anything will top it within the foreseeable future.
>>
>>106959763
Can you make one with the saliva being clear instead of yellow so I can fap to it with a clean conscience instead of being half horny and half disgusted like I am right now

Maybe I'm not as burnt out as I thought I am and I am indeed just uncreative (I cannot visualize an apple in my head)
>>
>free multimodal model that is really high on tests
"Wow, I want to test this loca-"
>2tb model file
>>
>>106960078
tangent but can quants also be quanted?
like Q1 of Q1 of Q1 might work
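On the "quant of a quant" question, a toy example suggests why it would not help. This is a plain uniform quantizer written for illustration, an assumption that stands in for real formats (GGUF-style quants use per-block scales and are more involved). Re-quantizing at the same bit width lands on the same levels, so nothing shrinks; the only way to get smaller is to drop to fewer bits, which is just a coarser quant of already-degraded values.

```python
def quantize(values, bits, lo=-1.0, hi=1.0):
    """Snap each value to the nearest of 2**bits evenly spaced levels in [lo, hi]."""
    levels = 2 ** bits
    step = (hi - lo) / (levels - 1)
    out = []
    for v in values:
        v = min(max(v, lo), hi)          # clamp into the representable range
        k = round((v - lo) / step)       # index of the nearest level
        out.append(lo + k * step)
    return out

weights = [-0.93, -0.41, 0.02, 0.37, 0.88]
once = quantize(weights, bits=2)
twice = quantize(once, bits=2)    # same bit width again
coarser = quantize(once, bits=1)  # fewer bits, starting from quantized values

print(once)
print(twice == once)
print(coarser)
```

In this toy setup the second pass is a no-op, and the 1-bit pass collapses everything to the two endpoints of the range.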
>>
>>106960078
Link ?
>>
>>106960082
No idea, doesn't sound worth it.

>>106960231
https://huggingface.co/inclusionAI/Ring-1T-preview
>>
>>106960268
That's not an image model bro.
>>
File: fuck off.png (543 KB, 1200x900)
https://xcancel.com/_akhaliq/status/1980265202500116525#m
>with a High-Quality Synthetic Dataset
>>
>>106960285
You're not exactly a model yourself either.
>>
>>106960291
kek
>>
>>106960268
The fuck ? Not even VideoGen ?
>>
>>106960268
>1T parameters
what the fuck
>>
Apologies if this is not the right thread to ask, but how can I blend this overlayed image with the surrounding background so that it looks like one, using AI or prompting? Any ideas? I tried WAN but it didn't do it right, at least not with the way I prompted it
>>
>>106960432
Try qwen edit. If it doesn't work then gemini
>>
>>106960432
maybe if you tell qwen image edit [or flux kontext / hidream e1.1] to change/blend the background or something like that.
>>
>>106960432
Do you know what a mask is?
>>
>>106958561
no need, we already know you're nonwhite
>>
>run a gen
>goes through
>run again
>oom

Thanks, kijai.
>>
>>106959288
buy used, retard
>>
>>106960705
Post your hand
>>
>>106960723
me posting against the troonku obsession of a generic tranime girl you commited your whole identity around spamming is proof enough that i dont have 80 iq retard brain
>>
>>106959000
But but
https://github.com/lllyasviel/FramePack/issues/768
>>
File: 00045-2948199529.png (2.66 MB, 1248x1824)
>>
When you guys gen pussy with Wan2.2 do you prompt it in detail or not?
I feel like wan gets kinda worse when you give it toooooo much detail.
>>
File: 00053-734991350.png (2.8 MB, 1248x1824)
i love illustrious convenient censorship :)
>>
>>106960465
great movie
>>
File: 00060-3852705153.png (3.01 MB, 1824x1248)
>>
>>106957370
Does Wan2GP really capable for poorfag VRAMlet GPU ??
>>
>>106961027
yes saar
>>
>>106961027
wan2gp gets the job done but deepbeepmeep is too fucking slow with the updates and is obsessed with vace and multitalk shit. He needs to add more samplers, upscalers and schedule types and stop with the vace and wan animate crap. image generation settings are too barebones.
>>
File: 00077-2755917040.png (2.17 MB, 1248x1824)
>>
>>106957761
https://youtu.be/dDgBq206pnI
>>
>>106961113
is that txttoimg or imgtoimg
>>
How do I convince people to sub to my patreon of my gens?
>>
>>106961196
make good gens
>>
>>106960928
I only use 2.2 with lightning and at low cfg it doesn't listen that well, so I try to keep it a little vague. Most important thing is to prompt anything you think might get hidden, like if eyes close and you don't say "blue eyes", you might get green eyes when they open, stuff like that.
>>
>>106957817
Will they ever release 1.5 ?

Current step on their road map:
>Train and Release ACE-Step V1.5

For 6 months at least...
>>
>>106959230
Hope you're running Linux. I have one and it works fine, just a little slow.
>>
>>106959420
>hidream
use case?
>>
>>106961235
Yeah I understand that.
The problem I'm facing is that T2V can do pussies very well like if you prompt "Lady with open legs and shaved pussy" it can do a fairly realistic one.
But on I2V if you put a picture and tell it: "The lady opens her legs and shows her pussy" it just do pic related. Even with loras, tried all of pic related
>>
>>106959193
you're not wrong. every time I get the itch to buy a 5090 I have to remind myself that I'd be doing the same bullshit I'm doing now with a 3090, just faster
>>
>>106958645
trained on Krea, the results are better. it's just got a really low learning rate already (3e-5), and i possibly have to go lower
>>
File: wan22___0006.png (1.6 MB, 832x1216)
>>106959020
>this guy is into pegging
wan 2.2
>>
Wtf, seedvr2s extraargs node disappeared on me..
>>
I'm kinda of autistic while typing.
I meant it the "pussy reveal" looks weird.
Also, don't see much difference with different samplers
>>
>>106961335
did you somehow switch from nightly?
>>
>>106961328
not him but i'm impressed it can get patterns socks without looking like a mess.
>>
File: sdcap.png (32 KB, 740x444)
Can I tag a NL model lora dataset with tags like these or is it gonna fall apart?
>>
>>106961400
depends what model
>>
>>106961284
Yeah, that's true. I haven't found a truly effective uncensor for i2v yet which will let you do what you're describing with pussies.
>>
Pic related, is that just a straight up tiler while genning. Does it reduce vram just like how the vae decoder tiler does? Or will this OOM once decoding?

Wanted to try that wan 2.2 detailer, but now that I ran the video through seedvr2, holy fuck so much better. Still want the option of more upscalers.

>>106961361
I have no idea how. It's a fast reinstall anyway.
>>
File: 00053-1419447086.jpg (379 KB, 1280x1920)
>>
>>106961499
what does seedvr2 do?
>>
>>106961543
an upscaler. it's very vram intensive
>>
>>106961549
Do you need a separate workflow for it? Or can you just whip it into an existing Wan2.2 workflow at the end before saving?
Looking at their hugging face couldn't find a model smaller than 13.6 so I guess it's not for my 12gb 4070 :(
>>
>>106961627
>Do you need a separate workflow for it?
No but I have it in a separate workflow so I'm not upscaling shit gens
>12gb 4070
Yeah that's not gonna work at all
>>
>>106961627
>Do you need a separate workflow for it?
no, it's 3 nodes.

>Or can you just whip it into an existing Wan2.2 workflow at the end before saving?
you can, but it's not ideal.
>>
File: 00012-4238704569.png (709 KB, 1024x1280)
>>
>>106959230
>9070 XT
Works fine. Image gen can take less than 20 seconds (maybe better with the latest rocm nightlies).
The feature set is limited on Windows, for example, triton and *attentions (except for pytorch's) don't work, but for Linux I think they do.
>>
File: seedvr.jpg (431 KB, 2543x599)
>>106961682
To follow up, another reason it's better to put the upscaler in its own workflow is so you can do batch upscaling. What if you want to upscale non-wan videos as well?

imo, it's just faster to generate a bunch of wan videos, select the best ones, then batch upscale it.
>>
>>106961885
nta, seedvr2 is not a quick process even on a gpu with a ton of compute. You only want to upscale something that is worth upscaling. I mainly use it for general videos, it is quite good, arguably on par with topaz.
>>
>>106961885
Personally I do it from my own machine outside of comfy: a script checks a folder where I put all the gens I like every few seconds and just upscales/interpolates everything in it.
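
A rough sketch of what a folder-polling script like that could look like. This is not the anon's actual script: the `inbox`/`done` folder layout, the extension list, and `upscale_cmd` are all made up for illustration, and `upscale_cmd` stands in for whatever upscaler/interpolator CLI you really use (assumed here to take an input path and an output path as its last two arguments).

```python
import subprocess
import time
from pathlib import Path

# Assumption: which file types count as gens worth processing.
VIDEO_EXTS = {".mp4", ".webm", ".mkv"}


def find_pending(inbox: Path, done: Path) -> list[Path]:
    """Return videos in `inbox` that have no same-named file in `done` yet."""
    finished = {p.name for p in done.iterdir()} if done.exists() else set()
    return sorted(
        p
        for p in inbox.iterdir()
        if p.is_file() and p.suffix.lower() in VIDEO_EXTS and p.name not in finished
    )


def watch(inbox: Path, done: Path, upscale_cmd: list[str], interval: float = 5.0) -> None:
    """Poll `inbox` every `interval` seconds and process anything new, forever."""
    done.mkdir(parents=True, exist_ok=True)
    while True:
        for src in find_pending(inbox, done):
            # Hypothetical CLI contract: <upscale_cmd...> <input> <output>
            subprocess.run([*upscale_cmd, str(src), str(done / src.name)], check=True)
        time.sleep(interval)
```

You'd call `watch(Path("keepers"), Path("upscaled"), ["my-upscaler"])` with your own paths and command. Using the output folder as the "already done" record means a restart doesn't redo finished files, but a file that is still being copied into the inbox can get picked up half-written, so move files in rather than saving directly there.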
>>
Don't make fun of me or I'll sue you!
>>
File: wan22___0018.png (1.37 MB, 832x1216)
>>106961362
wan image gen is pretty cool
>>
>>106961885

Is seedvr worth it for images? I tried once, and didn't see much difference
>>
Artists won
>>
>>106961912
>seedvr2 is not a quick process even on a gpu with a ton of compute
It takes about 15 minutes for a 7 second wan video going from 480P to 720P, so not too bad.
The batch_size really controls whether you'll OOM or not. Higher batch_size means higher quality. Ideally, on an H200 or a GPU cluster, you'd want batch_size to cover every frame for maximum temporal consistency. Definitely can't do 5 minute video clips or anything, which ironically makes it really only useful for wan.

It's way better than Topaz though, but on the flipside you don't need insane vram to run topaz upscale models.

>>106961914
Sounds pretty useful. I would do something like that if I had dual gpus. One gpu runs wan gens while the other handles upscaling

>>106961974
I've never tried single images yet but I imagine it is. I should compare it to topaz gigapixel to see if it's worth it
>>
>>106962018
>Sounds pretty useful. I would do something like that if I had dual gpus. One gpu runs wan gens while the other handles upscaling
Yeah wan runs on my other "server" pc, and I do the upscaling locally.
>>
>>106962018
>It's way better than Topaz though
How much vram would it need to upscale 720->4k?
It's very fast on topaz, only issue I have is interpolation of fast moving stuff, looks really bad.
>>
>>106962062
I don't think you can do 4k video upscales with seedvr on consumer gpus, at least not with any batch_size that'd be worth it.
>>
File: 00017-677119547.png (663 KB, 512x640)
>>
>>106962082
Oh ok, well too bad then.
>>
>>106962018
>you don't need insane vram to run topaz upscale models
Yeah, my 4090 is barely touched with Topaz. I could easily run four concurrent sessions with it... and that's with frame interpolation on as well.
>>
If you run Ovi i2v with no reference image, it behaves like t2v. The image is a bit blurry, but it does follow the prompt.
>>
>>106962018
>Definitely can't do 5 minute video clips or anything
You could always split videos into 5 second pieces, upscale with seedvr, then merge them back. I'm guessing this would fuck with the overall quality though if the model needs more than 5 seconds to do a proper upscale.
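
For the splitting part, a minimal sketch of how you might work out the piece boundaries in frames. `split_ranges` and its parameters are hypothetical, not part of seedvr or any node pack. The optional overlap is my own addition, on the assumption that giving neighbouring pieces a few shared frames might make the seams less visible when you merge; whether that actually helps with this model is untested.

```python
def split_ranges(total_frames: int, chunk: int, overlap: int = 0) -> list[tuple[int, int]]:
    """Return half-open (start, end) frame ranges that together cover the clip.

    Consecutive ranges share `overlap` frames; the last range may be shorter.
    """
    if chunk <= 0 or not 0 <= overlap < chunk:
        raise ValueError("need chunk > 0 and 0 <= overlap < chunk")
    ranges = []
    start = 0
    while start < total_frames:
        end = min(start + chunk, total_frames)
        ranges.append((start, end))
        if end == total_frames:
            break
        # Step back by `overlap` so the next piece re-covers the seam.
        start = end - overlap
    return ranges


print(split_ranges(10, 4))             # [(0, 4), (4, 8), (8, 10)]
print(split_ranges(10, 4, overlap=1))  # [(0, 4), (3, 7), (6, 10)]
```

For a real clip you'd set `chunk` to seconds times fps (for example 5 * 16 = 80 if your video is 16 fps, which is an assumption about your source), cut and upscale each range, then drop the overlapping frames from one side before concatenating.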
>>
>>106961123
I need moar
>>
>>106961974
I just tried it with a single image, and yeah it's WAY better than Gigapixel. Give me an image to upscale and I'll show you.
>>
File: 1751936546321950.mp4 (2.33 MB, 720x1072)
>>106961531
tried 6 steps high with no light lora, 3.5 cfg and 4 steps low with light lora like one anon suggested. this shit is cooked
>>
>>106962377
Give me the prompt lemme try
>>
>>106962377
wow looks like shit, who could've guessed?
>>
"Sex Smash Cut" is a fun lora idea, but man does it love to give body horror results
>>
>>106962386
for that I used chatgpt but I had the same shit happen with simple prompts
Scene: bright indoor gym sideline under cool white lights, clean white backdrop bleed, light rim-light on hair and horns, soft studio shadows.
Subject: blue-skinned fantasy cheerleader with short horns, long pale hair in a high ponytail, pointed ears, glowing eyes, blue and white cheer uniform with pleated skirt, pink pom-poms in both hands, slim athletic build, friendly confident smile.
Action (continuous, loop-free): she takes a half-step forward, pumps both pom-poms up-down twice, then a quick 8-count cheer: right-left-up, small hop, lands, finishes with arms raised in a “V” and a proud hold; natural breathing and subtle body weight shift; skirt, ponytail, and tassels sway with secondary motion; pom-poms ruffle; no mouth callouts—just expression and movement.
Camera: medium full-body, slight 5% dolly-in and gentle 10° arc rightcenter; handheld micro-jitter <1%; shallow DOF that tightens near the end.
Look/Style: crisp illustration realism, smooth gradients on skin, specular highlights on fabric, clean edges, no film grain, no heavy bloom.
Lighting: key from front-left, soft fill, cool back rim; balanced exposure (no blown whites).
Ending: holds final “V” pose for ~0.5s, eyes on camera.
>>
>>106962442
I prefer product consistency lora.
>>
>>106962442
Oh yeah, meant to try that today. Seems it can only do vaginal missionary.
>>
>>106962464
Wait, does that work for NSFW?
what
>>
File: dmmg_0218.png (1.38 MB, 832x1216)
ty to the anon who told me to train loras in the first place, psxCheeks is on civit if you want it.
>>
>>106962474
Also, it can only do medium/big boobs, and it's pretty strong: even if you prompt flat/large chest it just ignores it.
>>
>install reforge without problems
>install ersatzForge
>4 quadrillion dependencies missing, install will now proceed to shit itself
why is this allowed
>>
>>106962492
time to get comfy
>>
ok bros I need some 1girl inspiration
>>
>>106962509
elf
>>
>>106962457
what the actual fuck
>>
>>106962509
1girl, topless, massive breasts, oily body, frying 2 eggs and a sausage on top of her breasts
>>
>>106962509
boob
>>
>>106962526
hmmm bacon
>>
>>106962475
Yes, in combination with other loras and prompting something like "instantly switches to the next scene:"
Wan already can do it by itself, the lora just helps to preserve and shorten a transition.
>>
File: 00020-421442988.png (427 KB, 512x640)
>>
>>106962548
Nice to know, thank you helpful anon.
>>
File: 251021-234204-Wan5s 00001.mp4 (2.32 MB, 1120x1696)
>>106961531
>>106962457
with bouncing boobs lora
>>
>>106962579
butiful lightx2v slowmo slop
>>
>>106962481
seedvr upscale
https://files.catbox.moe/lngypv.png
>>
File: 1732282059015109.mp4 (2.68 MB, 720x1072)
>>106962586
this is 10/10 steps with no light. same slow-mo lol
>>
>>106962610
what extension do i grab for this?
>>
>>106962627
https://github.com/numz/ComfyUI-SeedVR2_VideoUpscaler/tree/nightly?tab=readme-ov-file

workflows
https://github.com/numz/ComfyUI-SeedVR2_VideoUpscaler/tree/nightly/example_workflows
>>
File: dmmg_0044.png (1.72 MB, 896x1152)
>>106962610
that's intense. can you use that to save shitty gens? like if it had a lot of noise or unclear details?
>>
>>106962618
fug, I think using the rcm low lora was why the output was fried
>>
>>106962464
>I prefer product consistency lora.
You mean this?
https://civitai.com/models/2000699/productconsistency-wan22-i2v-consistencylora3

Can you catbox an example? I never thought it would be anything useful in nsfw.

>>106962474
Yes, I'd like someone to generalize the idea.

>>106962488
Yeah it ignores both small breasts and huge ones, tried both, seems like it's overtrained with one type only.
>>
>>106962492
everything should work like python modules, where it also installs all dependencies. Or better yet just put the .dll next to the .exe
>>
File: 251022-000507-Wan5s 00001.mp4 (2.21 MB, 1120x1696)
>>
>>106962759
>everything should work
it doesn't
>>
>>106962794
in this instance you have to specify that she has a tail
>>
File: 251022-000849-Wan5s 00001.mp4 (2.39 MB, 1120x1696)
>>106962804
right in the middle of genning u said it
>>
>>106962794
>>106962830
>>106962618
>>106962579
>>106962377

What res are you guys genning these at, and are you upscaling them?
I can't gen bigger than about 518k pixels (720x720) or I get OOM
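
For reference, 720x720 is 518,400 pixels. If you want a different aspect ratio under the same pixel budget, something like this works as a quick calculator. It's only a sketch: `fit_resolution` is a made-up helper, and snapping both sides down to a multiple of 16 is an assumption about what your workflow accepts, not something confirmed in this thread.

```python
import math


def fit_resolution(budget: int, aspect_w: int, aspect_h: int, multiple: int = 16) -> tuple[int, int]:
    """Pick a (width, height) close to aspect_w:aspect_h whose area fits `budget`.

    Both sides are rounded DOWN to a multiple of `multiple`, so the result
    never exceeds the pixel budget (it may land slightly under it).
    """
    scale = math.sqrt(budget / (aspect_w * aspect_h))
    width = int(aspect_w * scale) // multiple * multiple
    height = int(aspect_h * scale) // multiple * multiple
    return width, height


budget = 720 * 720                    # 518400
print(fit_resolution(budget, 1, 1))   # (720, 720)
print(fit_resolution(budget, 2, 3))   # (576, 880)
```

Pixel count is only a rough proxy for memory use: frame count, block swap settings and the model itself all matter too, so treat the budget as a starting point rather than a guarantee.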
>>
Pony V7 seems like it might be interesting for realism-focused tuning, at least going by what I've gotten out of it so far on the few occasions Civit's ridiculously unstable generator didn't fail and refund the buzz
Attached pic is with `style_cluster_450` which seems to be just like, general photographic style
I do think these clusters were a dumb idea though overall
And they REALLY need proper documentation as the model is really finicky to use otherwise
>>
>>106962893
I'm 720x1072 guy genning directly at that res. 720p 81 frames is about my limit with 24gb vram and maxing out the blockswap settings
>>
shes cooked!
>>
>>106962915
why dude
just use chroma
the image you posted fucking SUCKS
>>
>>106962946
How long (time) for those gens? for 81f 720x720
9 steps 6 swap I take 1000s (16~17min).
Can be sub 10 minutes with 6 steps but quality takes a hit.
>>
>>106962959
you ripped that off hot shots, part deux I think
>>
File: file.png (2.23 MB, 1328x1328)
>>
new
>>106963047
>>106963047
>>106963047
>>106963047
>>
NTA but I've tried Chroma a bunch of times for photorealistic stuff. I don't have a knack for prompting NPCs, not to mention Chroma has some weird ideas about what people in their late twenties or early thirties look like.
Tracking down loras for well-known real-world subjects is an option but doing that feels too weird.
>>
>>106962974
I'd tell you but I restarted comfy
>>
>>106962727
> Can you catbox an example? I never thought it would be anything useful in nsfw.
Can't open catbox.
>>
>>106962970
It's not bad in the sense of a 1536x1280 baseline gen for a model I didn't actually expect to have better out of the box realism capabilities than V6 IMO
>>
>>106962946
u def fucked something post wf
>>
File: ComfyUI_00530_.mp4 (1.01 MB, 720x1280)
>>106961113
this is wan? how'd you get it to not be slow-mo?


