/g/ - Technology

Discussion of Free and Open Source Diffusion Models

Prev: >>107877194

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/kohya-ss/sd-scripts
https://github.com/tdrussell/diffusion-pipe

>Z Image Turbo
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo

>WanX
https://github.com/Wan-Video/Wan2.2

>LTX-2
https://huggingface.co/Lightricks/LTX-2

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>NetaYume
https://huggingface.co/duongve/NetaYume-Lumina-Image-2.0
https://nieta-art.feishu.cn/wiki/RZAawlH2ci74qckRLRPc9tOynrb

>Illustrious
https://rentry.org/comfyui_guide_1girl
https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe|https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
>>
Blessed thread of friendship
>>
File: 1746678213470292.png (1.43 MB, 816x1264)
fun stuff, even the 4 step 9b distill model works well for klein. I still like qwen edit 2511 but now I have to see which does what better. two edits, worked fine.

also you can do face swaps now, bfl learned a lesson I guess.
>>
>2025: flux 2 dev is bad and a skipped model, bfl is so done
>2026: flux 2 klein is great, china is so done
explain
>>
File: 1752919971140979.jpg (2.16 MB, 1248x1824)
In some aspects Klein is better than zimage but worse in others. Still a surprisingly good model regardless.
>>
>>107878590
Also, since audio quality can vary too, who's to say those weren't simply seed-based failgens? You can obviously get gens superior to Suno v4.5 and Udio if you simply vary the seed or improve the settings. That's the advantage of local after all, and the devs are running experiments to find the best settings for everything, so once we do get ACEStep 1.5, it will be the real deal.
>>
>>107878598
flux 2 was great, it was just too big for 99% of people so they all cried sour grapes.
>>
>>107878598
all part of the plan
>>
File: 1747460777572664.jpg (2.46 MB, 1248x1824)
>>
>>107878607
>flux 2 was great
prove it with some images
>>
>update comfyui
>gen previews broken again
neat
>>
>>107878598
>explain
looks like BFL has an ego and finally decided to work harder to please us and shut some mouths, I like it :^)
>>
>>107878616
plus they released a base model, which they've never done before (?)
>>
File: Flux2-Concat_00028_.png (3.71 MB, 2428x1712)
>>
>>107878615
How do you fix this?
>>
File: 1749528738779129.png (431 KB, 800x582)
>>107878598
don't worry, Alibaba will come back to put the other companies back in their place, which is not first
>>107878615
>>107878638
add the --preview-method taesd flag
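For anyone launching from a script, the flag just gets appended to the usual ComfyUI start command. A minimal sketch, assuming the stock `main.py` entry point; the helper name `comfyui_launch_args` is made up for illustration, and the list of accepted values reflects ComfyUI's CLI as I understand it, so check `python main.py --help` on your install. TAESD previews may also need the TAESD decoder weights present in `models/vae_approx`.

```python
import sys

# Preview methods assumed to be accepted by ComfyUI's --preview-method flag.
# Verify against `python main.py --help` for your version.
ASSUMED_PREVIEW_METHODS = {"none", "auto", "latent2rgb", "taesd"}


def comfyui_launch_args(preview_method="taesd", main_script="main.py"):
    """Build the argument list for starting ComfyUI with a preview method.

    Hypothetical helper for illustration; it only assembles the command,
    it does not start anything.
    """
    if preview_method not in ASSUMED_PREVIEW_METHODS:
        raise ValueError(f"unknown preview method: {preview_method!r}")
    return [sys.executable, main_script, "--preview-method", preview_method]


if __name__ == "__main__":
    # Print the command; pass the list to subprocess.run() to actually launch.
    print(" ".join(comfyui_launch_args()))
```

Run it from the ComfyUI folder, or add the same flag to whatever launch script you already use.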
>>
>>107878598
China released ZIT and BTFO'd Flux2 bloat, forcing BFL's hand to release decent small models in retaliation, demonstrating that competition is good for innovation and the end consumer
>>
>>107878598
it was a trap for chinese so they release prematurely and lose reputation

>>107878424
big boobs saar
>>
>>107878646
>competition is good for innovation and the end consumer
this, you only give your best if you have a worthy rival in front of you
>>
File: Klein 9b.png (1.41 MB, 1200x864)
>>107878645
>>
>>107878646
yeah
if they don't release z image base then it'll be flooded with klein loras and it'll be too late
>>
>>107878607
>muh sour grapes
Models of this size are always dead on arrival because no one will finetune them
>>
File: 1762127397584242.png (2.63 MB, 1664x896)
nice edit, transposed my wf to klein, was painless
>>
File: 1752247729942484.png (1.05 MB, 816x1264)
>when you run out of fent
I have to say, pretty clean inpainting for klein. but now we have a template.
>>
File: 1753742454776126.png (1.11 MB, 1958x1275)
comfy bros, we wonned!!!
>>
>>107878746
vl_megapixels = 1
it should've been at 0, a value > 0 is only for QiE
>>
File: GOTY.png (2.07 MB, 1744x768)
>>
>>107878746
how do you use q8 with the base klein workflow? default only allows non gguf .safetensors files
>>
File: 1768007998169905.png (1.36 MB, 816x1264)
>>107878726
the anime girl in image 2 is stepping on the black man lying on the floor with her right boot in image 1. keep the appearance of the black man on the ground the same.
>>
>>107878753
meh, does it really matter?
>>
File: 1765528394026146.png (32 KB, 946x344)
>>107878771
you use the gguf loader
https://github.com/city96/ComfyUI-GGUF
>>
is this 'put her in a bikini' at home?
we've probably had this for a while, but kontext sucks and I never got qwedit to work, so this is all new to me
>>
File: 1764189196028464.png (77 KB, 690x734)
>>107878781
cant link it with this default one, might need a new workflow for q8 I guess
>>
does hires fix work for qwen image on forge neo? i keep getting this error
TypeError: Cannot handle this data type: (1, 1, 1, 896), |u1
Cannot handle this data type: (1, 1, 1, 896), |u1
>>
File: 1738377643305865.png (2.64 MB, 1664x896)
>>107878800
tbf qwen can also do it
>>
File: 1739569470383814.png (1.39 MB, 816x1264)
>>107878777
the anime girl in image 2 is stepping on the black man lying on the floor with her left boot in image 1, and is holding a green leek vegetable with her right hand. keep the appearance of the black man on the ground the same.

kino
>>
>>107878803
https://github.com/BigStationW/ComfyUi-TextEncodeEditAdvanced/blob/main/workflow/workflow_Flux2_Klein_9b.json
here's a workflow without the Subgraph AIDS, replace the default loader by the gguf loader
>>
>>107878820
thanks, and yeah I hate the subgraph stuff i'd rather have it all in the open not nested.
>>
any lora trainer supports klein yet? i wanna try training some shit
>>
File: 2026-01-16 12.45.43.png (41 KB, 582x618)
Klein bros, wtf is this?
>>
>>107878841
weird error, did you update comfyui?
>>
File: 1741349902470214.png (3.2 MB, 1664x896)
same input image for caption, but diffused in klein
lmao
>>107878280
>>
File: wowza.png (1.55 MB, 2733x1359)
>>107878820
anon i love you so much thank you
>>
File: 1756108179230154.png (2.89 MB, 1664x896)
>>107878856
when asked to make the source image photorealistic
lole. qwen is still a bit uncanny but not this bad looking from what I recall
>>
>>107878868
you're welcome :3
>>
>>107878810
>light mode
how do you even see this shit, it looks like a flashbang in the thumbnail
>>
Decided to try that music model someone linked in the last thread
Here's the output using the example lyrics and tags after the first test run.

https://voca.ro/1hMV9WTu7RY7

https://github.com/HeartMuLa/heartlib?tab=readme-ov-file

Gonna try some more whimsical shit next.
>>
>>107878889
yo that's kind of a bop
i have shelf speakers hooked to my pc and the bass at the beginning was crazy
>>
>>107878886
lmao'd
>>
>>107878889
It supports Japanese! Nice! Thanks for the link.
>>
>>107878875
uwu :3333
>>
https://voca.ro/1b34wdfSy0wW

Was supposed to be upbeat. But sounds sad like the last one. Hmm
>>
File: 1755286020718786.png (2.23 MB, 1664x896)
>>107878871
original bikinize'd. im off to lunch now, all in all, good model.
>>
File: topaz.png (3 KB, 169x49)
small price to pay for not using rife interp and sneedvr2
>>
>>107878915
lmao god that's depressing.
>>
Can you use Flux2Klein on Forge Classic Neo?
>>
File: 1759773062501240.png (2 MB, 2284x1600)
https://xcancel.com/bdsqlsz/status/2012047511566107012#m
Oh NOW they're about to release it, WHAT A COINCIDENCE
>>
File: 2026-01-16 13.07.11.png (93 KB, 879x692)
>>107878849
I did from the manager but turns out I needed to do this from the update folder.

Now I'm getting something different.
>>
>>107878889
>>107878915
jesas that's pretty good, it kinda falls apart in some places but still
we're really approaching the age of dead internet theory at an insane pace...
>>
>>107878956
based Hans forcing Chang's hand
>>
>>107878956
Based actually, BFL can pound sand I don't care how good their models are. I just hope Z-Image-(Omni-)Base meets expectation and gets people excited to finetune.
>>
>>107878966
turn off animated previews
>>
File: file.png (377 KB, 585x410)
>>107878956
disappeared :smiley_face: haha
>>
>>107878889
>>107878915
What vram/ram requirements?
>>
1 day worth of training ltxv, its gonna be crazy:
https://files.catbox.moe/iibwa3.mp4
https://files.catbox.moe/ykmdq3.mp4
https://files.catbox.moe/0zm6rf.mp4
https://files.catbox.moe/qcpd33.mp4
https://files.catbox.moe/r70qg1.mp4
https://files.catbox.moe/qac25e.mp4
>>
>>107878992
>surprise!
https://youtu.be/2tWHvQQMkLE?t=7
>>
File: 2026-01-16 13.12.10.png (19 KB, 723x220)
>>107878984
I just set it to none but the problem remains
>>
>>107878997
Sat at around 20 on a 3090 for the 3b model but that's just their basic inference script.
>>
>>107878998
Looks like shit
>>
>>107878956
the damage is done
>>
>>107879004
are you using the right text encoder from HERE:
https://huggingface.co/Comfy-Org/flux2-klein-9B/tree/main/split_files/text_encoders
>>
>>107878966
your clip is probably wrong how is it lookin
>>
>>107879012
it's 14000 steps of training on T2V on a model that knew nothing of nsfw. It looks incredible for how early the training is compared to wan2.2
>>
File: file.png (8 KB, 438x91)
>>107879004
try disabling this too for good measure if it's on
otherwise no idea
>>
>>107879020
It's furryshit. Take all the steps, doesnt matter.
>>
>>107879029
why would he need to disable previews for Klein? they work just fine for me. it looks like some issue with his text encoder
>>
>>107879036
you will be able to use it for humans as well. It also works for I2V
>>
>>107879012
It's not bad considering the time input. Not even a furfag, I just think it's an impressive proof of concept of the thing training well on something it can't do otherwise.
>>
>>107878915
That sounds like ACEStep 1.0 at best.
Meanwhile here is just a random ACEStep 1.5 gen I came across
https://files.catbox.moe/hm2stn.mp3
>>
File: wtf.png (307 KB, 405x720)
>>107878956
https://xcancel.com/bdsqlsz/status/2012022892461244705#m
>Z-image in the final testing phase, although it's not z-video, but there will be a basic version z-tuner, contains all training codes from pretrain sft to rl and distillation.
>z-video
what is he talking about???
>>
>>107879070
I think he is just being Chinese "its not something like video to be excited about"
>>
>>107879070
>but there will be a basic version z-tuner, contains all training codes from pretrain sft to rl and distillation.
this is good though cause it seems those are needed to unslop the base
>>
>>107879039
NTA, but VHS's video previews use their own hacky system and hijack image previews as well.
>>
>>107879017
>>107879018
ah yeah you guys are right I was on the wrong text encoder, now it works, thanks!

>>107879029
Where is this?
>>
>>107879070
>>107879098
this is actually really interesting, I thought they would keep the RL script to themselves (since it's the secret sauce), based, now let's see lodestone ignore all of that and go for another one of his schizo mumbo jumbo and we'll go for another 6 month cope ride until everything gets broken :(
>>
>>107878992
where is this posted?
did you just lie on the internet? D:
>>
>>107879124
https://xcancel.com/ModelScope2022/status/2012055794020409361#m
>>
>>107879116
lodestone is gonna merge Klein-Base with Z-Base, remove the VAE, randomly delete half the parameters and then train the Klein-Z-Chroma-Radiance-Furry-Edition-FrankensteinV2 we've all been waiting for
>>
>>107879065
>Meanwhile here is just a random ACEStep 1.5 gen I came across

That's very nice but neither you nor I have ace step so kindly piss off until you do.
>>
Also Ace step was worse than HeartMuLa. That's not to say it's great, but Ace step was pretty bad considering.
>>
>>107879148
We will soon.
>>
>>107878998
infinite daddy daughter princess blowjobs with cute squeals and gurgling bubbling noises locally by 2029 and it'll be the furries that got us there just like they did with imagegen

>>107879020
>>107879036
don't listen to him furry anon he doesn't understand the value of technological progress and the potential to save the anuses of children and dogs all around the world as a result of the substitution effect

>>107879082
>>107879070
No, zvideo is a thing they're working on, it's been mentioned before, but that doesn't mean they're releasing it, it's literally just a thing they're testing
>>
File: CnP_16012026_123139.png (575 KB, 979x364)
what is used here to combine two images?
>>
>>107879190
flux 2 klein
>>
>>107879194
I think he's asking about the image concatenate node but yeah.
>>
>>107879190
>>107878820
>>
>>107879177
>zvideo is a thing they're working on, its been mentioned before
source??
>that doesn't mean they're releasing it
oh maaan :(
>>
File: ComfyUI_temp_euvgt_00009_.png (2.87 MB, 1580x1104)
You guys said klein was good?
>>
>>107878950
Nobody answered my question....
>>
>>107879244
its ok
>>
File: harsh truth.png (130 KB, 326x202)
>>107879247
because nobody is using forge
>>
>>107879244
GROK, PUT THE GIRL IN TOP LEFT IN A BIKINI PLS
>>
>>107879247
maybe start using the big boys tool instead of retarded subhuman children toys?
>>
>>107879247
as a forgesissie you'll have to wait two more weeks for support while comfychads feast
>>
File: Klein 9b.jpg (779 KB, 3328x928)
Can't believe he never won an oscar...
>>
Klein?

More like Kucked.

"replace just the face of image 1 with the face in image 2 and maintain the hair and clothing of image 1. change the woman in image1 to lift her skirt with her hands above her waist. the woman in image1 has visible panties."
>>
File: flux2i2i.png (1.08 MB, 1264x816)
>>
it seems like bfl allowed nudity this time. It can do boobs better than z image. Same detailless crotches though
>>
>>107879310
lmao this looks like a shitty photoshop
>>
>>107879310
Flux Kek 2
>>
File: 1740574393713664.png (2.11 MB, 816x1264)
ty gguf workflow anon, things seem to be working pretty well, wanted to try q8 klein edit cause q8 is closer to full quality and the model isn't large at all.
>>
File: ComfyUI_temp_euvgt_00020_.png (2.2 MB, 1693x1024)
It might be fun for some memes, but ye nah, deleting.
>>
Looks like I can run Klein 4b and 9b-fp8 on my iGPU if I make the text encoder unload once it's done. The only weird thing is after the run is finished, I get like a solid minute or more of 100% disk usage by "System" that seems longer than any disk usage during the actual gen. I can only assume a bunch of OS stuff was swapped to disk and is being loaded back into RAM, but it's still not something I've seen with other models, e.g. Z-Image. Hopefully we still get those other Z-models sometime.
>>
>>107879310
obviously it's far from perfect, now I'm waiting for Z-image edit to raise the bar even higher lul
>>
>>107879310
Skill issue. Literally the best method for face swap is a single input image, then passing in the image you want edited as a latent and lowering the denoise.
>>
>>107879337
use 8 steps instead of the default 4, they are not nearly enough
>>
>>107879337
would
>>
>>107879206
>>107879194
workflow?
>>
File: 1742248490238546.png (1.26 MB, 816x1264)
>>107879337
but it is fun, and not a large model.
>>
>>107879337
>>107879310
alternatives to flux 2 klein then?
>>
>>107879352
>>107878820
>>
>>107879357
big flux 2
>>
>>107879355
go for billy mitchell and karl jobst lol
>>
>>107879361
thanks
>>
File: ComfyUI_temp_fkiia_00033_.png (2.6 MB, 1344x1728)
>>
File: 1768431116755372.png (2 MB, 816x1264)
>>107879371
kek this model is good

2 edits: one for the cop face, one for fent man.

replace the face of the black man lying face down on the floor in image 1 with the man in image 2.
>>
>>107879244
>>107879310
>>107879337
china please
>>
>>107879394
lmao, nice
>>
>>107879244
cox is so hot
>>
>>107879394
man looks fucking photoshopped lmao, bad.
>>
>>107879341
Also, the seed value seems stuck for some reason. (And I'm not on Nodes 2.0.) It's not on Fixed mode, but I can't get it to change when I click Run again. Weird.
>>
File: 1758351051006019.png (2.07 MB, 816x1264)
>>107879394
diff billy, also a bit nicer kek
>>
https://voca.ro/185Dz7rIIPth

An example of the music model doing Japanese. Not exactly the genre I pictured. But it mostly got the reading right.
>>
File: 1753422465633060.png (1.6 MB, 1360x752)
give the asian man on the left black skin. add a Netflix logo above the text "RUSH HOUR".
>>
>>107879425
https://voca.ro/11VGjmohRUXO
Oh damn.


