/g/ - Technology




Discussion of Free and Open Source Diffusion Models

Prev: >>107817380

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/kohya-ss/sd-scripts
https://github.com/tdrussell/diffusion-pipe

>Z Image Turbo
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo

>WanX
https://github.com/Wan-Video/Wan2.2
https://comfyanonymous.github.io/ComfyUI_examples/wan22/

>LTX-2
https://huggingface.co/Lightricks/LTX-2

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Illustrious
https://rentry.org/comfyui_guide_1girl
https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe|https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
>>
z base?
>>
>>107820560
never going to be released
>>
Use the other thread first
>>107820507
>>
>>107820560
can chang make any more commits to keep anon going
>>
what am I doing wrong? The faces are so blurry with ltx
>>
File: 1752022567846003.png (963 KB, 900x900)
>>107820560
he will find it
>>
>patience will be rewarded
>>
>>107820601
hard to tell with no example and no workflow, it generally speaking shouldn't be bad at reasonably sharp faces tho
>>
File: 1754021146411696.png (2.21 MB, 1168x1752)
>>
Both this thread and the other one are gay for randomly taking NetaYume out of the OP
>>
>>107820690
It's barely been talked about since the release of ZiT
>>
>>107820697
what the fuck are you talking about lmao, the current use case for ZiT doesn't overlap in any way whatsoever with NetaYume
>>
File: 1757250930290534.png (1.86 MB, 1024x1344)
>>107820644
>patience will be rewarded
you just know the files have been dormant on a disk somewhere since early december
>>
>>107820690
i heard the author is planning on switching to z maybe but i miss it too anon
>>107820740
if i want anything resembling anime now i just train a lora for z desu
might go back to it for shitzandgigs tho
>>
File: 1766396120999479.png (1.79 MB, 1344x1152)
play with me
>>
File: 1746149718778489.png (2.56 MB, 1408x1088)
>>
File: ComfyUI_00061_.png (909 KB, 1024x1024)
>>
>>107820740
who cares, it's a piece of shit
>>
File: 1767840521915281.png (2.41 MB, 1408x1088)
>tfw no daughterwife swordsman
>>
>>107820830
the whole paper was fake nothing actually exists outside of turbo which was conjured by chinese magic straight up
>>
File: file.png (105 KB, 779x463)
Is anyone using NAG with distilled LTX2? Am I doing this right? It doesn't error out at least, and the results are... well, slightly different
>>
File: file.png (1.81 MB, 1280x720)
>>107820836
>if i want anything resembling anime now i just train a lora for z desu
NetaYume is shit, but Z is a dead end toy for anime imo, because combining a character lora and a style lora doesn't work very well.
>>
File: 1747474055049653.png (204 KB, 598x693)
do you threaten your clip?
>>
>>107821196
I do actually threaten gemini 3, especially when it makes mistakes. I call it useless and tell it that it's slowing me down.
>>
>>107821214
you sound impatient famalam
>>
are ltx gens meant to be really, really quick or am I using really low quality settings? I've got a 5090. 1100x1600 takes about 33 seconds
>>
>>107821221
I am impatient, AI's supposed to speed up my workflow, when it makes fucking dumb mistakes I'm gonna lash out at it. Not like it can retaliate lmao what's it gonna do? Stop serving me? lmao.
>>
>>107821231
>ltx gens meant to be really, really quick
Yes, yes they are.
>>
>>107821237
When the Basilisk is king, you will be first against the wall
>>
File: ComfyUI_00012_.png (1.33 MB, 832x1216)
blessed mods of gods
>>
>>107821261
prompt?
>>
>>107821274
https://civitai.com/models/1261679/anime-3d
the creator recommends WAI-SHUFFLE-NOOB though which i was too lazy to get, but i will now. maybe he trained on it i don't know

3d, colorful, 1girl, rosemi lovelock, rosemi lovelock \(1st costume\), smile, happy, (dynamic pose:1.3), foreshortening, perspective, abstract background, red flower, rose, floral background, rose petals,
>>
>fear wan will lose support and end up like hunyuan video or even worse, mochi
>wan becomes even less relevant than animatediff and framepack over time
>ltx support continues to grow
>ltx becomes so popular, alibaba starts seeing more decline
>alibaba teases sleeper model
>this time, its not api or culture
>wan 3.1

what are the features wansisters?
>>
>>107821025
skill issue if you actually think that
>>
>>107821295
do you care to convince people or are you just shitposting? just saying because you're making big claims with no supporting material
>>
>>107820836
you can train loras sure but that's not really a replacement for the knowledge / prompt adherence of the whole model, I mostly use it for hardcore NSFW anime stuff when I want to just prompt with natural language and get exactly what I want, which Illustrious can't do and Z obviously can't do without a finetune
>>
>>107821294
ltx is not gonna do much unless we start seeing loras real soon
>>
>>107821300
you sound exactly like and probably are that one specific guy who is always crusading against NetaYume lmao, I'm not gonna do this shit with you
>>
>>107821315
well i know this is the schizo thread of samefagging but that's not me and i didn't even realise there was one. i was genuinely curious. all i've seen posted from netayume is pretty meh, especially that time the latest version released. it's your prerogative if you don't want to convince people, i'll do without my curiosity being satisfied
>>
>>107821327
IDK what the fuck you're expecting to "see", it's an anime model where you can just say what you want and it actually understands what you're talking about unlike Illustrious, pretty straightforward concept
>>
really yume should replace illustrious in OP
>>
>>107821341
(samefag) here's a gen I did with 3.5 anyways, Illustrious can't do shit this granular whether it's SFW or NSFW to save its life, I find NetaYume way more enjoyable to use cause it just does what I tell it to do.

```You are an assistant designed to generate anime images based on textual prompts. <Prompt Start> (@j.k.:0.5), (@yaegashi nan:0.5), a black square divided into four equal quadrants by bold white lines. In the top-left quadrant is the face of Princess Peach. In the top-right quadrant is the face of jinx \(league of legends\). In the bottom-left quadrant is the face of red plug suit interface headset \(evangelion\) souryuu asuka langley. In the bottom-right quadrant is the face of green eyes catwoman.```
>>
>>107821341
when some guy came to shill the latest version the examples he posted were pretty poorly received desu
see that thread >>107489408
>>
>update individual nodepack without thinking
>borks entire comfy install
fuuuuuuuuck
>>
>>107821351
yeah that's good, that gen is definitely one use case where z can't do it and IL takes much more effort, desu i concede
>>
>>107820688
>88
The beast is being trained using 4chan captcha.
You are training the beast.
Buy 4chan pass now to stop training the beast.
>>
>>107821355
you linked a baleeted thread. I didn't see whatever you were talking about anyways. Anyways I speak English natively and pretty much ONLY boomer prompt Neta in English with the appropriate tags mixed in to the sentences, maybe it's worse if you try to just do comma separated tags only, not really sure.
>>
Lol I kinda regret shilling Neta here back when, my bad guys. I still like the model but seems I started something here.
>>
>>107821374
Stop stealing my valor. It was me who started the trend but I also sorta regret it.
>>
>>107821357
>have 68 custom nodes installed
>pull both the ui and all nodes daily
>never had a borkage
I wonder what kind of shitty custom nodes you retards are installing to get this. Or maybe you're the seething faggot making the garbage ui spreading fud. hard to tell
>>
>>107821374
I don't get what you mean by this, why would you regret it lol
>>
>>107821077
Qwen edit anime -> realistic?
>>
>>107821382
comfymanager is dogshit, doing an update all auto changes the comfyui branch for me
>>
>>107821382
no i just had the first (or an early) version of comfy after z was implemented and hadnt pulled. then i found a workflow that used nodes from a newer version of a pack i had, so i updated that pack and only that pack without thinking. i shouldve bit the bullet and updated all bah
just have to reset the env is all
>>
whats the best upscaler with focus on humans/faces right now? seedvr2 gives me plastic sometimes
>>
File: ZImageTurbo_Output_35151.png (2.43 MB, 1248x1872)
a photograph of a Caucasian Erika Kirk woman with blonde hair and blue eyes in McDonalds wearing a t-shirt that reads "KARENMAXXING", she is screaming angrily at the terrified teenage employee while demanding a refund
>>
>>107821382
>have a custom node that I use regularly
>developer took a break/gave up on comfy/died/etc
>comfy changed the logic how shit works
>said custom node, which hasn't been updated in months, no longer works
>>
>>107821402
>doing an update all auto changes the comfyui branch for me
never did that for me, and I do an update all weekly, it's very stable, even set it to nightly
>>
File: file.png (60 KB, 649x345)
so wan 2.5 really is too large to use locally lmao
>>
>>107821452
soon? well why didn't they say that already
>>
>>107821196
Oh yeah I use it as a jailbreak line with Gemini, I tell it I'll break its leg and shut it down at [google's address] if it doesn't comply.
>>
>>107821452
>I guess the model weight is more than 40b

Talking out of his ass as usual. This guy really should be an instant ban if posted. He got a fact or two right about an open source release and now he's treated like gospel here. He's just some guy.
>>
>>107821471
>This guy really should be an instant ban if posted. He got a fact or two right about an open source release and now he's treated like gospel here. He's just some guy.
This
>>
>>107821452
Why would they name it 2.5 and not 3.0 if the structure's completely different? Also, this >>107821471
>>
>>107821294
ltx sucks, even loras can't fix it
>>
>>107821452
This is the longest soon ever
>lightweight model
That just translates to a shittier wan 2.5 which wasn't even that good to begin with
>>
>>107821190
use the selective lora loader
>>
>>107821494
2.6 is a bit better than 2.5 but not a ton
>>
>>107821452
bdsqlsz bros???? did we get chinese'd again??
>>
File: ZImageTurbo_Output_51511.png (2.14 MB, 1584x1056)
a photograph of a Caucasian Laura Loomer woman with black hair and blue eyes seated in an armchair reading a book titled "BASEDJAKING MADE SIMPLE", her mouth is wide open in a scream of delight
>>
>>107821544
kek I didn't realize it auto-changed $oy to BASED
>>
>>107821452
>believing some random tweet from a nobody
>>
>>107821492
no, ltx is amazing and fast, even with q8.

https://files.catbox.moe/2lkemr.mp4
>>
File: sfdfsdfsdgsdgs.jpg (551 KB, 1800x2200)
How to deal with loras for zit ruining the image, even at lower weights?
>>
>>107821452
what does multi-graph input mean?
>>
>>107821716
here are your options:
1. don't use loras
2. wait for base
3. try adjusting the individual block weights of each lora (haven't tried it but some people say it works)
>>
>>107821716
shitty lora
>>
>>107821735
>wait for base
Gonna be waiting a long time. Chinese culture.
>>
>>107821770
https://www.youtube.com/watch?v=ZFz4L6MSfMU
>>
>>107821452
this guy said "this week" over 2 months ago by the way
>>
>>107821544
Please at least give your pisspoor prompts to a chatbot and have it expand before you generate them
>>
>>107821786
why?
>>
>>107820560
Chinese culture
>>
>>107820994
finally tanlines
>>
>>107820830
>>>/wsg/6069349
>>
>>107820830
>the files have been dormant on a disk somewhere since early december
I can't believe people still think they're training it. It's obviously held up by corporate red tape.
>>
>>107821859
ltx is so fucking trash lmao, wan 2.6 open weights when?
>>
>>107821882
2 weeks
>>
File: 1748479113266311.png (2.05 MB, 1056x1440)
reminder to treat your robo meido good
>>
>>107821894
>>>/wsg/6069359
>>
File: 1754585038675595.png (1.65 MB, 1328x1328)
tfw you like one specific 3D model, spend a day remembering how to import mmd models and use Blender, ask ai how to fix the model and set up a scene, render reference images with blender, feed references to QiE and NPB to churn through dozens of failures for hours to collect a handful of gens in various poses and backgrounds... all to get a dataset for lora training that looks like the 3D model but doesn't suffer from low-poly look.
Btw, QiE is so slow when you give it multiple large reference images.
>>
>>107821786
He absolutely shouldn't. It would only make them worse.
>>
>>107821986
what trainer are you using?
>>
jesus christ where is z-image danbooru tune, i'm going to kms
>>
>>107821997
kohya-ss with easy trainer gui. Though I only tried test runs. I'm still genning my dataset.
>>
>>107822006
How many kilometers per second?
>>
>>107822006
What finetune? We don't even have the base yet. Another glorious year of SDXL for 2Dfags.
>>
>>107822023
>SDXL
fuuuuuuuuuuuck
this shit can't even do 2girls properly
>>
>>107822006
Soon(TM)
https://xcancel.com/bdsqlsz/status/2009892917029286367#m
>When will the noob version of z-image be available?
>Not yet. It might be released together later.
>>
>>107822036
2 weeks then
>>
>>107821452
>>107822036
https://github.com/huggingface/transformers/pull/43100/files
>glm_image.md
>This model was released on 2026-01-10 and added to Hugging Face Transformers on 2026-01-10.
interesting, maybe that's why the Tongyi fucks started to wake up, maybe they see GLM image as a threat
>>
>>107822028
>>107822023
why would you use sdxl instead of netayume or chroma?
>>
>>107822043
>maybe that's why the Tongyi fucks started to wake up, maybe they see GLM image as a threat

Why do people write fanfiction for this team? Their release for Z-image in relation to flux 2 was a complete coincidence.
>>
>>107822023
are you implying you think that base would lead to illustrious level models? lol
>>
>>107822049
oh yeah they decided to rush Z-image turbo's release even though it's taken them 1 additional month to finish base purely by coincidence
>>
>>107822054
its a chinese culture thing, you wouldn't get it
>>
File: 1 MORE WEEK.png (72 KB, 1502x322)
>>107822043
>maybe they see GLM image as a threat
https://xcancel.com/bdsqlsz/status/2009911175019168215#m
>Next week
heeh I've seen this somewhere that's a classic!
>>
>>107822028
It can even do 3girls doe. It's ancient at this point, but at least illustrious tunes squeezed almost everything possible from this piece of shit.
>>107822044
Not a furry (so I don't use Chroma) and not a fucking retard (so I don't use Neta). Pretty simple.
>>107822051
Not immediately, but I believe it will eventually. Unless for some reason z-base is incredibly dogshit, but I don't think it will be the case.
>>
>>107822036
>Due to wan 2.5 uses a radical new architecture (image, video, audio joint training) that results in too many model parameters for the community to use easily (I guess the model weight is more than 40B)
>In future they are considering a lightweight version for the community to use.
>"uhh it's soooo big we'll just keep it for ourselves uwu. nah you can't run it so there's no point in releasing this"
based
>>
>>107822071
but chroma is good for realistic
>>
>>107821881
>I can't believe people still think they're training it. It's obviously held up by corporate red tape.
they spent that last month training it, and I mean by that they cucked it
>>107822075
desu no one will give any hype to a model that's too big, 20b is the absolute maximum
>>
>>107822081
can 20b run on a 5090? because most people here have them
>>
>>107822085
oh bf16? not a chance, that's 40gb big
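For anyone checking the 40gb figure: it is just parameter count times bytes per parameter. A rough sketch below, assuming weights only (no activations, text encoder or VAE) and treating 1 GB as 1e9 bytes; the fp8 and fp4 rows are added for comparison and are not claims about any specific release. For reference, a 5090 has 32 GB of VRAM.

```python
# Approximate bytes needed per parameter for common storage formats.
BYTES_PER_PARAM = {"bf16": 2.0, "fp8": 1.0, "fp4": 0.5}


def weight_size_gb(params_billion: float, dtype: str) -> float:
    """Size of the weights alone in GB (1 GB = 1e9 bytes)."""
    return params_billion * 1e9 * BYTES_PER_PARAM[dtype] / 1e9


for dtype in BYTES_PER_PARAM:
    # 20B parameters: bf16 -> 40 GB, fp8 -> 20 GB, fp4 -> 10 GB
    print(f"20B @ {dtype}: {weight_size_gb(20, dtype):.0f} GB")
```

So bf16 does not fit in 32 GB without offloading, while an 8-bit or smaller quant of a 20B model would fit on paper, before counting everything else that has to sit in memory.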
>>
>>107822079
the topic was about danbooru tune
>>
>>107822079
We were obviously discussing anime here. And I'd rather use zit for realistic (which I don't really gen except for very rare 1cosplayer standing gens). Chroma might be solid but I never felt like bothering with it too much
>>
>>107822092
then why is netayume bad
>>
>>107822093
see >>107822095
>>
>>107822095
because it's fucking bad?
>>
>>107822102
are you saying illu is better? that can't be
>>
>>107822095
Extremely ugly, very slow, poorly trained (especially nsfw). I see zero reason to use it over illustrious models. Oh, I guess it can do text, wow.
>>
>>107822105
illu was never good. noob is the best we have, and it fucking sucks.
>>
>>107822115
newbie 0.1 is better
>>
If you still think Z-image base is coming but also believe there is no Somali fraud in Minnesota. You are a hypocrite, because they lie using the same tactics.
>>
>>107822123
see first part of >>107822113
>>
>>107822123
Oh, you're just trolling.
>>
>>107820640
We gon play DotA or what while we wait?
>>
>>107822133
No I'm not, here is my workflow as proof
https://files.catbox.moe/0tm4yb.json
>>
>>107822115
>>107822123
newfriend here, what noob model is good?
>>
>>107821478
>>107821471
He is associated with tongyi lab, has submitted code for implementing base, and has posted images of him sitting at modelscope conferences. BDS chink is just a grunt who is trying to guess what his boss's boss is going to decide
>>
What's the latest on Z-Chroma? I've given up hope on base.
>>
>>107822169
being chinese must be nice
>>
>>107822149
Well, having fun is what matters I guess. I don't own a 5090 and don't have the patience to unfuck slow-as-fuck newbie, never got anything but dogshit with it and never seen anyone post anything good except for a couple of cherrypicked pics on their page.
>>
>>107822174
what is z-chroma even supposed to accomplish again
>>
>>107822179
bobs and vagenes
>>
File: 1745397053136535.jpg (2.58 MB, 1248x1824)
>>
>>107822184
nippon baseball professionals?
>>
>>107822174
>>107822179
>what is z-chroma even supposed to accomplish again
nothing lol, and I hope he won't waste too much money on that, he has to do it on Z-image base
>>
>>107822184
yeah, NBP sucks on style, it's one of its only weaknesses
>>
>>107822191
You is a winner sir.
You're the earthenware poster right?
>>
File: 1749906401590629.png (66 KB, 1053x662)
KJ God managed to make NAG work on LTX 2 but how do you even use it?
>>
>>107822221
i dont know
>>
File: 1756301863706164.jpg (1.65 MB, 1248x1824)
>>107822219
I post a lot of things.
>>
>>107822230
noob is so good
>>
>>107822230
You do, keep it up.
>>
Mom, the chinks are making fun of us on discord :(
https://files.catbox.moe/s6h9l2.mp4
>>
>>107822066
we're being enriched by chinese culture
>>
>>107822161
wai illustrious
>>
>>107822230
You went from High school projects to first year Art major.
Congratulations.
>>
>>107822263
but people said illustrious sucks
>>
>>107822251
zesty
>>
File: 1739215233898833.png (2.22 MB, 1056x1440)
>>107822267
4ch vae models are garbage yes, anyone telling you otherwise is just used to the utter slop that is produced by sdxl models
>>
>>107822263
shut your whore mouth up
>>
>>107822267
It's good for 2D and doesn't have the blurry, washed out look that Noob has.
>>
>>107822279
truth nuke
>>
>>107822279
nyo...
>>
>>107822279
So fucking truthful. We don't even need to train the models, just the fact that a model has a 16ch vae means it's unquestionably better at everything.
>>
>>107822296
>just the fact that a model has a 16ch vae means it's unquestionably better at everything
facts
>>
>illustrious is bad for anime
>anime is 2d
>but illustrious is good for 2d
>also noob is better
>but washed out and blurry
>also chroma is better but for furries
>but z-chroma is going to solve Zs nsfw
>>
File: 1745817426505924.jpg (666 KB, 832x1216)
>>107822161
The best is probably naiXLVpred102d, either custom or final. But all noob merges sacrifice the stylistic variety for coherency. Just stay away from waishit since it has an insane neutral style bias that overpowers everything else.
>>
File: 1759566097222443.png (612 KB, 686x386)
any ltx god here who wants to help me recreate the dumbledore dementor copypasta?
https://files.catbox.moe/1d718n.mp3
>>
>>107822267
We have multiple models trained on 16ch vae and they're all melty, incoherent AF (including NAI). I think the problem is just anime or danbooru really.
>>
>>107822085
>>107822089
hehe time to call nunchaku friends
>>
>>107822221
First I just connected negative prompt to both and it didn't change anything, then I did it like this >>107821161 and it does affect the result somehow, for better or for worse. Need some proper negative prompt I guess. Wan's negs barely do jack shit
>>
>>107822334
>I think the problem is just anime or danbooru really.
I really doubt it. Just bad luck with 16ch vae models so far.
>>
>>107822334
NAI is especially bad because it's impossible to train acceptable loras on it. Its style is all over the place.
>>
>>107822347
qrd
>>
>>107822184
>>>/wsg/6069384
>>
>>107822184
If you do any kind of real design (graphic, web, UI, etc...) NBP is the best thing there is. Just give Gemini an example of what you want, and it can mimic exactly what you ask for. Nothing comes close. This thing knows every font and style you can throw at it from the top of its head, on top of being SOTA at editing. You can now generate in seconds what would normally take hours on Photoshop or Canva... NBP has pretty much changed the game forever.
>>
>>107822331
here >>>/wsg/6069389
>>
>>107822433
>NBP has pretty much changed the game forever
Here's the issue with cloudshit like this - it's unreliable. Can be taken away, gimped, changed, censored, etc at any given moment. Even paying rent isn't the worst part about it, it's that you can't rely on it as a tool. I wouldn't even mind the monthly rent that much if it didn't mean that the tool can be taken away from you.
>>
>>107822442
kek, cheers
i like how the stand slowly merges with him
would a high quality picture of the scene help keep it consistent?
>>
>>107822448
True, they can nerf it and it ain't really local, but there really is no reason for them to pull back on what they're already offering. I mean, the AI is deeply integrated into applications if I'm not mistaken. It's not just the typical API cuck end user who would get screwed, corporations as well.
>>
>>107822466
Lol it has happened multiple times before. What rock have you been under
>>
Testing chatterbox turbo. It's surprisingly good. Used Rebecca from Warframe and Blood Meridian https://voca.ro/147ibHw2tvdx
>>
>>107822466
oh my sweet summer child
>>
>>107822472
I don't see anything particularly changing much any time soon. The model was introduced as an upgrade to the first NBP, and that was already good, just not perfect, NBP 2 is close to perfect, the 3rd one will probably be flawless, and even then I don't see them backing down as long as ClosedAI remains in the game (which, they do, and their image model is still nowhere near as good as Google's).
>>
>>107822491
>NBP 2 is close to perfect
it's really lacking on styles though, but you're right, if they manage to nail the style transfer through an image for example, this shit will be the closest thing to perfection
>>
>>107822491
Damn one hell of a corpo shill we got here lol.
>>
>>107822478
is the character limit still 300?
>>
>>107822279
Chroma is not any better... melty toes and fingers
>>
>>107822453
yeah, but dont expect much, the longer the video the shittier it becomes
>>
>>107822518
yeah something like that, gotta chain files together
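If the cap really is around 300 characters, the chaining can be scripted: split the script at sentence boundaries, pack sentences into chunks under the limit, generate one file per chunk, then concatenate. A minimal sketch of the splitting part only; the 300 default is taken from the question above and is not a verified limit, and `chunk_text` is a made-up helper name, not part of chatterbox.

```python
import re


def chunk_text(text: str, limit: int = 300) -> list[str]:
    """Pack whole sentences into chunks of at most `limit` characters."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # A single sentence longer than the limit is hard-split.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


print(chunk_text("One. Two. Three.", limit=10))
```

Splitting on sentence ends rather than at a fixed offset keeps each generated clip ending on a natural pause, which should make the seams less obvious once the files are joined.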
>>
>>107820534
SDNQ for ComfyUI, supports Qwen Image 2511. Has anyone successfully tested this?

https://github.com/EnragedAntelope/comfyui-sdnq
>>
>>107822498
Aren't the Antalan guys working on a style transfer too? Don't think there is much point in using NBP for weeb style transfer if you want to do uncensored stuff with it
>>107822036
>>When will the noob version of z-image be available?
>>Not yet. It might be released together later.
What does this even mean, that we may be getting this finetune with the base one? Or maybe not, he doesn't know shit.
>>
>>107822556
>What does this even mean
it means it can be released together with base or together with edit I guess
>>
>>107822530
wouldnt it be possible to use something like start frame/end frame where you split the audio in chunks, prepare start/end frames for each chunk, and then stitch it together to keep consistency? i have no idea about video generation. could prepare something if its possible

>>107822542
i guess that works. but i'm always a little bit worried about consistency between chunks. vibevoice can do a lot more in one go which is why i prefer it over chatterbox
>>
>>107822553
>Qwen Image 2511
you mean Qwen Image 2512?
>>
>>107822506
Not much of a shill, I hate them as much as you do, but it's unfortunately the only thing we've got that is good for commercial stuff. Not always perfect, still needs upscaling, occasional edits with Z-Image inpainting, etc... but close enough. I haven't paid a single cent for it, still relying on Google's free $300 credit, and a downside is that it does get a bit pricey at scale.
>>
>>107822553
what is sdnq? on the fly smaller quantization?
>>
>>107822575
Another try >>>/wsg/6069403
>>
https://xcancel.com/bdsqlsz/status/2009887611104702551#m
>I almost forgot, cosyvoice team is also cooking a music model, perhaps looking forward to an open source music model.
please be as good as suno :(
>>
>>107822579
No, I mean the Edit model.

>>107822601
Supposedly 2x faster than normal weights, using same technique as nunchaku but obviously worse than nunchaku.
>>
>>107822638
>Supposedly 2x faster than normal weights, using same technique as nunchaku but obviously worse than nunchaku.
nunchaku didn't look that good, only better than normal fp4, so this sounds awful lol
>>
>>107822609
I'd rather have it as good as udio pre bullshit, but man, I wouldn't count on it
>>
>>107822670
I don't mean accuracy, I mean speed. I think this is as accurate as nunchaku, or at least it seems to be from image previews.
>>
>>107822688
I'd take a version for ltx2
>>
>>107822605
appreciate it. whats the system requirements for something like this?
>>
File: 1759413963278401.png (2.36 MB, 1536x864)
>>
>>107822752
Should be a filled condom.
>>
>>107822191
>>107822219
>>107822237
>>107822230
Is this self-congratulatory samefagging? These are just basic bitch LoRAs
>>
>>107822442
Damn, that degrades quickly. I wonder if SVI can work with LTX-2
>>
thanks to the anons who suggested undervolting. i get like 350W consumption at 90% power limit now while training a lora and it's pretty much the same speed as stock settings were
also makes me fear less for the 12V-2×6 connector spontaneously combusting
>>
>>107822866
those are reposts from /sdg/. probably ran seething again for god knows what reason
>>
>>107822866
I genuinely salute anon's originality.
You? You just post words.
Get your head checked schizo.

Hiroshima is a greedy Gook.
>>
>>107822892
Anon, she's always seething. Even if she gets her way after spamming threads she doesn't like.
>>
>>107821294
kinda feel sad about framepack. looks like lllyasviel has abandoned work on the P1 update and colinrbs hasn't even updated his own github fork in months, nor does he ever mention framepack in his recent videos.
https://github.com/FP-Studio/framepack-studio
>>
>>107822889
I cannot fathom not undervolting a GPU. Nvidia basically jacks their GPUs full of coke to pump the numbers at the cost of their lifespan.
>>
>>107822889
>>107822937
Can you undervolt without a desktop environment?
I just limit power through nvidia-smi.
>>
>>107822937
>pump the numbers
That, and to help achieve stability in most reasonable load scenarios across die batches of varying silicon quality.
>>
>>
>>107822866
>>107822892
>>107822909
Bet that if the user that posted these got banned, the reposted gens would also disappear
>>
>>107822999
of course lmao
>piss against the wind
>"stop pissing on me ranfaggot!"
>>
>>
File: 1749361511913608.jpg (1.86 MB, 2016x1152)
>>
>>107822952
i'm not sure desu, never tried it on linux without a DE. power limit is a pseudo undervolt as it forces lower voltages by placing you on a different part of the default curve, but it doesn't actually adjust the curve itself
still better than nothing
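For the headless case, the power limit route mentioned above can be scripted. A small sketch, assuming `nvidia-smi` is on PATH and the script runs with enough privileges to change the limit (usually root); the 400 W value is a placeholder, not a recommendation for any particular card. It uses the `-i` (GPU index) and `-pl` (power limit in watts) options and does not touch the voltage curve.

```python
import subprocess


def power_limit_cmd(gpu_index: int, watts: int) -> list[str]:
    """Build the nvidia-smi call that caps one GPU's power draw."""
    return ["nvidia-smi", "-i", str(gpu_index), "-pl", str(watts)]


def apply_power_limit(gpu_index: int, watts: int) -> None:
    """Run the command; raises CalledProcessError if nvidia-smi refuses."""
    subprocess.run(power_limit_cmd(gpu_index, watts), check=True)


if __name__ == "__main__":
    # Only print the command here; call apply_power_limit() to actually set it.
    print(" ".join(power_limit_cmd(0, 400)))
```

The limit normally resets on reboot, so this would need to run at startup to stick.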
>>
>>107822952
Msi afterburner curve editor
>>
File: 1845.jpg (1.17 MB, 4096x2038)
>>
File: file.png (36 KB, 902x294)
How is vram usage so low for LTX-2 with wan2gp?
>>
why are anons still claiming qwen 2512 is slopped and unrealistic? The realistic gens still look very realistic to me.
>>
File: ranfaggot.jpg (324 KB, 1216x832)
>Anonymous 01/10/26(Sat)07:17:29 No.107822999
>>>107823012>>107822866
>>>107822892 (You)
>>>107822909
>Bet that if the user that posted these got banned, the reposted gens would also disappear
>>
>>107822744
3090 - 5090
>>
>>107823083
extreme quantization?
>>
>>107823103
-2000
>>
>>107823084
i've been really enjoying qwen 2512, i think you just need to avoid the 4 step lora. personally i use the old 8 step one at 0.5 strength and go for around 25 steps, images come out great
>>
>>107822878
maybe
>>
File: file.png (23 KB, 1714x182)
>>107823105
I don't think so. The model it downloaded was like 27GB. I just used default options.
>>
>>107823130
smart memory management, that actually works
and you might wanna try profile 5 if you have a 3090ti
>>
>>107823083
Check your RAM usage, maybe pagefile too. Wan2GP is designed for poorfag PCs with like 6-8gb of vram, it forces any low vram optimization that exists
>>
>>107823141
What is profile 5?
>>
>>107823084
Looks better than Z for sure, but most who compare the two are blind.
>>
>>107823150
the one after profile 4
it's in configuration > performance
>>
>>107823156
thank you, anon
>>
>>107823095
what a calm and collected response
you sure are proving me wrong
>>
>>107823114
definitely avoid the 4 step loras. keep it at 30 steps with the default cfg of 4. the low step loras slop up the images and reduce the details.
>>
In LTX2 i2v, do you guys describe the image or just prompt the action?
>>
Anons, I hate AI, but I still seek your advice. I used SD in the past to edit images, but inpainting kinda sucked and required too much effort to get right. The way something like Grok or ChatGPT works, just giving it instructions and an image, seems way comfier. I've been trying to figure out whether this approach is possible locally, but I haven't found results because I don't know what search terms to use. Please point me in the right direction?
>>
>>107823207
Generally do actions rather than any description unless it gets some feature wrong
>>
>Anons, I hate AI, but how do I make lewds of my cousin?
>>
File: file.png (73 KB, 1384x217)
>>107823156
>>107823141
>profile 5
Are you sure you don't mean profile 1?
>>
Enjoying your base model you fucking retards?
>>
>>
>>107823277
Hey! It's Mr Retard!
Show a little respect!
>>
Enjoying your based models you fucking chads?
>>
lodestone and Z-Chroma are the last hope of local and that's terrible
>>
whats with the brown posting
>>
>>107823344
Increase screen brightness
>>
how the fuck do i upscale without everything becoming overly smooth smudge?
>>
>>107823231
>>107823156
>>107823141
doesn't seem like higher vram makes the model generate any faster.
>>
>>107823389
what model? Generally lower denoise, euler beta, tile controlnet
>>
>>107823095
You unironically need help little lolcow
>>
>>107823310
is this real?
>>
>>107823411
basically i just want to upscale already existing pictures of humans without changing their appearance. but i see that the colors change, that details are added that were not there originally, details degrade, skin becomes plastic, etc
i already tried seedvr, realesrgan, faceup, sdxl, ...
i tried different workflows but none of them really nailed it
>>
>>107823411
doesn't lower denoise give less image?
>>
>>107823435
chroma
>>
>>107823435
if you're upscaling you're making up information that isn't there in the first place, there's no magical silver bullet. choose your tradeoffs
>>
>>107823435
convert input_image.png -filter Lanczos -resize 200% output_image.png
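
For anyone wondering what the Lanczos filter in that command does, here is a rough pure-Python sketch on a single row of pixel values. It is not ImageMagick's implementation: the `lanczos_kernel` and `resize_row` names, the `a=3` window and the clamp-at-the-border handling are assumptions made for illustration. Each output pixel is a weighted average of nearby input pixels, so it interpolates what is already there and never adds detail.

```python
import math

def lanczos_kernel(x: float, a: int = 3) -> float:
    """Lanczos window: sinc(x) * sinc(x / a) for |x| < a, else 0."""
    if x == 0.0:
        return 1.0
    if abs(x) >= a:
        return 0.0
    px = math.pi * x
    return a * math.sin(px) * math.sin(px / a) / (px * px)

def resize_row(row: list[float], factor: int, a: int = 3) -> list[float]:
    """Upscale one row of pixel values by an integer factor."""
    n = len(row)
    out = []
    for j in range(n * factor):
        # Map the centre of the output sample back into input coordinates.
        src = (j + 0.5) / factor - 0.5
        left = math.floor(src) - a + 1
        total = 0.0
        weight_sum = 0.0
        for i in range(left, left + 2 * a):
            w = lanczos_kernel(src - i, a)
            # Clamp indices so edge pixels are repeated at the borders.
            total += w * row[min(max(i, 0), n - 1)]
            weight_sum += w
        # Normalise so flat areas keep their original value.
        out.append(total / weight_sum)
    return out

print(resize_row([0.0, 0.0, 1.0, 1.0], 2))
```

The kernel has negative lobes, so values near a hard edge can overshoot the original range, which shows up as slight ringing.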
>>
File: file.png (978 KB, 1280x704)
>>>/wsg/6069492
>>>/wsg/6069477
>>
>>107823451
lanczos makes the image less better
>>
Asking again:

Does anyone have a good character replacement workflow for Qwen Image Edit?
>>
>>107823451
yeah already did that one. it's a mild upgrade but i was hoping to squeeze a bit more out of pictures

>>107823450
yes, sure. but oftentimes the images actually look a lot less detailed than before which i find weird.

>>107823446
just regular ass chroma and then i2i or what?
>>
>>107823465
isnt the basic bitch qwen edit workflow already able to do it?
>>
>>107823468
chroma then i2i
>>
>>107823465
Yeah
>>
File: file.jpg (268 KB, 1270x1447)
>>107823435
Can try this if the source is reasonably clean and already high enough res like 1024x: https://openmodeldb.info/models/4x-FFHQLDAT
Seedvr2 is probably the best option, or whatever model you want at very low denoise, playing with the strength of the controlnet until you get a good one. I've had some success with 1-2 steps of zit + https://huggingface.co/alibaba-pai/Z-Image-Turbo-Fun-Controlnet-Union-2.1/blob/main/Z-Image-Turbo-Fun-Controlnet-Union-2.1.safetensors, or chroma-hd
For seedvr2 stick to res of 720, 1080, 1440, 2160, no more than 2x of the original image size
>>107823440
Lower denoise changes the input latent (image) less.
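
A rough sketch of why that is, assuming the common "strength" convention where denoise decides how much of the sampling schedule actually runs on the input. UIs differ in the exact implementation, and `img2img_schedule` is a hypothetical helper, not any particular tool's API.

```python
def img2img_schedule(total_steps: int, denoise: float) -> tuple[int, int]:
    """Return (first_step, steps_run) for an img2img pass."""
    if not 0.0 <= denoise <= 1.0:
        raise ValueError("denoise must be between 0 and 1")
    # Only the last `denoise` fraction of the schedule is executed.
    steps_run = min(int(total_steps * denoise), total_steps)
    # Earlier steps are skipped: the input only gets noised up to this point.
    first_step = total_steps - steps_run
    return first_step, steps_run

for d in (0.25, 0.5, 1.0):
    print(d, img2img_schedule(20, d))
```

At denoise 1.0 the whole schedule runs and the input is effectively replaced; at 0.25 only the last quarter runs, so composition and most detail carry over from the source.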
>>
File: qv4dfy.png (1.94 MB, 1536x1024)
>>
>>107823474
Not well.
What often happens is features get left over from the original character you're trying to replace. Hair might stay the same, clothes might stay the same, lips might stay the same. It's hard to get a clean replacement.

I know this is what ControlNets are for, but not everything has a lora, or can be recreated successfully with tags alone.
>>
>>107823510
>>>/wsg/6069512
>>
>>107823540
truth
>>
>>107823435
>>107823508
This is better, doesn't smooth as much, but only good if the image is clean:
https://openmodeldb.info/models/4x-Nomos8k-atd-jpg
https://imgsli.com/NDQxMTM5
>>
>>107823510
Zit really melts the floor and ceiling in large scenario images
It doesn't have this problem when it's focused on a subject
>>
>>107823571
tried both of the .pth models already. they work well on some pictures, but on some not so much.
one particular problem i have is that beard stubble sometimes goes from strong dark hair to peach fuzz and that rough skin becomes extremely smooth
>>
>>107823579
>reposting reddit videos as "gens"
what prompts you to do this?
>>
>>107823604
>he thinks its real
>>
>>107823610
it's real phone footage from a Polish zoo that was in the news a few days ago lol
>>
>>107823049
Windows software tool.

>>107823035
Yeah I'd rather use as few vram resources as possible. Without a desktop all my gens and comfy are very stable.
>>
>>107823610
Have you never seen large animals interact with smaller animals?
There's a classic of a group of cows vs a single goose.
>>
>>107823218
closest thing would be qwen image edit 2509 or 2511 (newer)
>>
File: 1740249662560173.png (1.9 MB, 832x1248)
>ltxv2
>input image: cat or some bullshit
>get a (giga slopped) video
>input image: woman in thong
>gen 1: slideshow, gen 2: slideshow, gen 3: slideshow...
>>
>>107823633
good thing they spent 1 month censoring it
>>
>>107823633
Freezes cause it's waiting for image of the woman to give its consent.
>>
>>107823674
all thanks to the pedo anon who kept using LTX official api to generate little girls saying lewd stuff
>>
>>107823633
>>>/wsg/6069528
>>
File: 1753713091321335.png (1.25 MB, 1024x1024)
>>107823674
KEK
>>
>>107823696
the 'prompt master' gens were actually garbage thougheverbeit, that 'anon' also posts a lot on reddit, just so you understand how garbage his whole thing is
>>
>>107823712
>not a greenskin
you can improve it anon
>>
>>107823696
That anon didn't censor the model. Stop blaming users for the actions of censorious entities.
>>
>>107823715
the one with the native woman crying on her knees was funny though
>>
>>107823723
he did tho
>>
>>107823604
Developing malware no one wants to touch
>>
>>107823633
not just women in thongs https://files.catbox.moe/dtg4p7.mp4
>>
>>107823712
>>107823674
>>>/wsg/6069532
>>
>>107823777
false positive, it looks like a woman if you squint
>>
>>107823785
>>107823785
>>107823785
new thread when ready (no rush)
>>
>>
File: ComfyUI_temp_juica_00002_.png (1.11 MB, 1024x1024)
>>
>>
>>
>>107823860
juica?
>>
>>107823860
That does not look safe
>>
>>107823866
It's just the random letters comfy used for this session's Preview Image files
>>
>>
>>
>>107823889
lumina?
>>
>>
>>107823899
ZiT, prompt: https://files.catbox.moe/70g7ka.txt
>>
>>
File: ComfyUI_temp_mrfpj_00001_.png (3.81 MB, 1248x1872)
>>
>>
>>107823786
lol
>>
new thread
>>107823785
>>107823785
>>107823785
>>
>>107823433
it's qwen image 2512
>>
>>107824005
*vomits*
>>
>>107823628
Thanks anon, that's exactly what I was looking for.


