/g/ - Technology

Discussion of Free and Open Source Diffusion Models

Prev: >>108018763

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/tdrussell/diffusion-pipe

>Z
https://huggingface.co/Tongyi-MAI/Z-Image
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo

>Klein
https://huggingface.co/collections/black-forest-labs/flux2

>LTX-2
https://huggingface.co/Lightricks/LTX-2

>Wan
https://github.com/Wan-Video/Wan2.2

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>NetaYume
https://huggingface.co/duongve/NetaYume-Lumina-Image-2.0
https://nieta-art.feishu.cn/wiki/RZAawlH2ci74qckRLRPc9tOynrb

>Illustrious
https://rentry.org/comfyui_guide_1girl
https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
>>
>>108022550
bad fagollage
>>
>>108022550
>>Maintain Thread Quality
>https://rentry.org/debo
>https://rentry.org/animanon
ew. gross schizobabble in the op again. schizo is gonna do some mental gymnastics now
>>
Hi guys, can someone help me?
>>
File: 1761235250943712.jpg (3.73 MB, 3378x3372)
>>108022550
Alternative kino collage
>>
>>108022586
no
>>
>>108022589
proper bake template

Discussion of Free and Open Source Diffusion Models

Prev: >>108018763 #

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/tdrussell/diffusion-pipe

>Z
https://huggingface.co/Tongyi-MAI/Z-Image
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo

>Klein
https://huggingface.co/collections/black-forest-labs/flux2

>LTX-2
https://huggingface.co/Lightricks/LTX-2

>Wan
https://github.com/Wan-Video/Wan2.2

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>NetaYume
https://huggingface.co/duongve/NetaYume-Lumina-Image-2.0
https://nieta-art.feishu.cn/wiki/RZAawlH2ci74qckRLRPc9tOynrb

>Illustrious
https://rentry.org/comfyui_guide_1girl
https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg
>>
File: ComfyUI_temp_xmjmi_00011_.jpg (703 KB, 2432x1664)
https://files.catbox.moe/jf9k7g.png
>>
File: o_00216_.png (1.67 MB, 1280x768)
>>
File: ComfyUI_temp_xmjmi_00066_.jpg (623 KB, 2432x1664)
https://files.catbox.moe/sdryqa.png
>>
>>108022632
do you have non comfy metadata?
>>
Blessed thread of frenship
>>
File: ComfyUI_temp_xmjmi_00037_.jpg (1.99 MB, 3840x2176)
>>108022656
No I don't, sorry. The prompt is one of those "war-and-peace" multi-paragraph ones, so I can't paste it here. You can save the png and open it as a text file; you should be able to get what you need
https://files.catbox.moe/tljubc.png
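For anyone who'd rather not dig through the raw PNG in a text editor, here's a minimal Python sketch (Pillow assumed installed, filename is just the saved catbox image) that prints the metadata chunks ComfyUI embeds, typically under the "prompt" and "workflow" keys:

[code]
# minimal sketch: dump the workflow/prompt JSON that ComfyUI embeds in its PNGs
from PIL import Image  # pip install pillow

img = Image.open("tljubc.png")  # the saved catbox image from this post
# PNG text chunks end up in img.info; ComfyUI uses the "prompt" and "workflow" keys
for key, value in img.info.items():
    if isinstance(value, str):
        print(f"--- {key} ---")
        print(value[:1000])  # preview only; the full value is the JSON graph
[/code]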
>>
File: o_00217_.png (1.64 MB, 1280x768)
>>
>>108022674
neoforge has a metadata viewer. dunno why comfy doesn't have one, it's been years. it's just annoying copying values
>>
>>108022632
nice
>>
File: ComfyUI_temp_mcozp_00002_.jpg (640 KB, 2432x1664)
>>108022696
true
>>108022715
thanks
https://files.catbox.moe/4od2ay.png
>>
>>108022685
ai slop
>>
friendly reminder you cannot use z-image loras with z-image turbo.
>>
File: ComfyUI_temp_qiaol_00012_.jpg (734 KB, 1664x2432)
https://files.catbox.moe/92d0yi.png
>>
File: ComfyUI_temp_qiaol_00022_.jpg (841 KB, 2432x1664)
https://files.catbox.moe/xm25fx.png
>>
File: ComfyUI_temp_qiaol_00033_.jpg (1011 KB, 1664x2432)
https://files.catbox.moe/vry28w.png
>>
Imagine wan but with klein quality: super fast, multiple inputs, easily trainable, incredible vae.
>>
>>108022793
Let's all imagine this guy's mom's pussy
>>
Can anybody please tell me what I am doing wrong? I am getting the infamous "You do not have CLIP state dict!" error and I don't understand which file I am missing.
>>
whats the latest face detailer meta?
>>
>>108022632
>>108022674
>>108022754
>>108022765
>>108022782
Kino is back on the menu
>>
> win7
> forge
> ff
> 2026
>>
File: ComfyUI_temp_xmjmi_00006_.jpg (901 KB, 2432x1664)
https://files.catbox.moe/g5prdi.png
>>108022793
one can dream
>>
>>108022813
>literal retard unable to use UI made for retards
LMAO
why do you have 2 vaes?
why is the diff model in the vae?
>>
>>108022602
>>108022583
Lol suffer tr*ni or de*o
>>
>>108022826
i made this image
>>
>>108022828
huh?
>>
File: ComfyUI_temp_xmjmi_00013_.jpg (609 KB, 2432x1664)
>>108022821
thanks fren :)
https://files.catbox.moe/phlg81.png
>>
>>108022813
Search the github repository and google for "You do not have CLIP state dict!"; there will be multiple other people who had the same problem and the possible causes/solutions. If you can't do that, you can also ask an LLM such as Grok, ChatGPT, Gemini or Copilot how to search issues on the github repository of the webui you are using
>>
File: ComfyUI_temp_xmjmi_00027_.jpg (700 KB, 2432x1664)
>>108022836
erm acksually z-image's algorithm made it, you just prompted it
https://files.catbox.moe/t8foa7.png
>>
>>108022825
based
>>
File: ComfyUI_temp_xmjmi_00046_.jpg (926 KB, 2432x1664)
https://files.catbox.moe/lwd2kp.png
>>
File: zimg_00038.png (1.41 MB, 960x1536)
>>
>>108022840
You heard me
Suffer
>>
>>108022813
Out of pure curiosity what card do you have
>>
>>108022898
what?
>>
File: ComfyUI_temp_xmjmi_00049_.jpg (850 KB, 2432x1664)
https://files.catbox.moe/4os6p6.png
>>
File: ComfyUI_temp_xmjmi_00061_.jpg (509 KB, 2432x1664)
https://files.catbox.moe/xd9ld6.png
>>
Why is SaaS adware included in the OP for local models?
>>
File: 697.png (63 KB, 192x192)
>>108022923
are you mad your uncle stopped diddling you cuz you got older and ugly or something?
>>
File: ComfyUI_temp_xmjmi_00064_.jpg (689 KB, 2432x1664)
https://files.catbox.moe/8981vu.png
>>108022923
because money dear boy
>>
File: ComfyUI_temp_xmjmi_00063_.jpg (740 KB, 2432x1664)
https://files.catbox.moe/amyore.png
>>
>>108022933
Damn, new models could be good for baking lightning + diffusion maps.
>>
>>108022932
that explains a lot about catjak. if you author the rentry you should put that in
>>
File: ComfyUI_temp_xmjmi_00042_.jpg (665 KB, 2432x1664)
https://files.catbox.moe/1t97ag.png
>>
File: 1755702301550849.png (5 KB, 185x164)
I'm trying out LoKr instead of LoRA for the first time. do you load these things the same way you load loras in comfy?
>>
>>108022957
Why do you give catjak so much power?
>>
>>108023026
does lokr even work in comfy?
>>
>>108023026
Is LoKr short for low key retarded?
>>
File: bitmap.jpg (788 KB, 2432x1664)
https://files.catbox.moe/763vi5.png
>>108022956
FOSS models are this close to being production-ready; I think that the cloud/SaaS models are already there. If the pace of dev continues, we could see FOSS catching up in 6 months or a year
>>
>>108023026
It's only better than lora when it's used in the same model it was trained on btw (and you should use 99999 dim to trigger full matrix and control the size by the factor), lokr transfers terribly
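To illustrate what that means, here is a rough sketch of the LoKr idea (shapes and names are illustrative, not any trainer's actual code): the weight update is a Kronecker product of two small factors, and when the requested dim is huge the second factor stays a full matrix, with the factor setting controlling how the parameters are split.

[code]
# illustrative sketch of a LoKr-style weight update, not taken from any specific trainer
import torch

out_dim, in_dim, factor = 1024, 1024, 8

w1 = torch.randn(factor, factor)                       # small Kronecker factor
w2 = torch.randn(out_dim // factor, in_dim // factor)  # stays a full matrix when "dim" is set huge
delta_w = torch.kron(w1, w2)                           # (1024, 1024) update added to the base weight

print(delta_w.shape)             # torch.Size([1024, 1024])
print(w1.numel() + w2.numel())   # far fewer trained parameters than a full 1024x1024 delta
[/code]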
>>
>>108023032
wdym? he's the thread lolcow. he just chose this thread to graze in and shit his pampers. sometimes it's funny, sometimes it's annoying. sometimes he pretends to be other anons like what you are doing right now. all we know for sure is he is a failure.
>>
>>108023056
According to you and your "friends" he's been able to
>destroy /sdg/
>get anons to migrate to /ldg/
>delete post that goes against his narrative
>be present 24/7
>control the OP for the majority of the threads
>ruin ani's project
>frame ani as debo as the thread schizos
You make him sound like some autistic schizo god.
>>
>>108023082
yfw that anon IS catjack and does all those things
>>
a message from ani in /adt/:
>>108022262
>>
File: Video_00001.mp4 (2.73 MB, 720x1280)
How the fuck do I prompt for pov shit? Camera, cameraman, camera man, ain't working.
>>
>>108023087
The thread is deader than usual because the OP was vandalized
Dev schizo also tried to claim upscaling destroys artist styles and that's objectively false and a serious skill issue. I find it funny he ignored the post pointing that out to him.
Did he say that because his frontend can't do upscaling?
>>
>>108023090
He's running off to share that beer with an underage prostitute.
>>
>>108023087
*yawn*
>>
>>108023119
>Dev schizo also tried to claim upscaling destroys artist styles
he isn't wrong. the style is always more sloped than the input. realism seems to hold up better nowadays
>>
File: Video_00001.mp4 (2.08 MB, 720x1280)
>>108023090
What the fuck, now it's working. Rng I guess..
>>
>>108023156
prompt thief
>>
>>108022910
RTX 3060 12GB
>>
>>108023151
>Being wrong
Please stop
>>
>>108023156
Did you try "cohesium"?
>>
>>108023090 >>108023156
there's 8 billion people on this flat earth and you hollow brain fuckers keep posting the same slop.
>>
>>108023208
show us you are right. I am really interested in what you have learned about it
>>
File: ComfyUI_09035.png (3.52 MB, 1440x2160)
>>108023026
>LoKr
That's for people that deep-fry their datasets. Doing 15k steps and using only that final, extra crispy output, completely ignoring any point where it might have converged earlier.
>>
File: z-image_00011_.png (1.71 MB, 960x1536)
>>108023026
yes

>>108023034
yes

>>108023035
no
>>
File: 1727936307662.png (76 KB, 1321x446)
>open reference klein workflow
>try to make sense of it
>2-level nested subgraphs
>to hide picrel
FUCK this shit.
>>
>>108023119
>Did he say that because his frontend can't do upscaling?
Since he didn't post proof and oldfags know it's not true, he's probably hoping newfrens take him at face value.
>>108023228
The original claim that upscaling "destroys styles" has yet to be proven.
>>
>>108023243
damn, how do you get so much detail? or is this zit?
>>
File: ComfyUI_00353_.png (2.19 MB, 1024x1472)
>>108023240
>>108023240
tf this bih look like da grinch
>>
File: 179985.png (190 KB, 512x512)
>>108023240
mentally ill
>>
File: Loras.jpg (3.7 MB, 4032x1728)
So yeah, I can 100% confirm that ZIB-trained loras are WORSE used on ZIT than ZIT-trained loras used on ZIT. However, ZIB-trained loras used on ZIB itself are pretty fine, assuming you use the right negative. Picrel is the exact same Rubi Rose dataset, 120 pics, Gemini 3 Pro captioned, trained at the best possible quality on both ZIT and ZIB (1024x1024, full BF16 models, no quantization during training) with the same settings.

ZIB on ZIT is the least close to what she actually looks like, whereas ZIT on ZIT and ZIB on ZIB are both quite believable in terms of facial likeness given the dataset did have her with a ton of different hairstyles and different lighting conditions and stuff.
>>
>>108023295
Are you using AI Toolkit for training? Does the cope of upping ZiB LoRA to 2 strength on ZiT work?
>>
>>108023295
>So yeah, I can 100% confirm that ZIB-trained loras are WORSE used on ZIT than ZIT-trained loras used on ZIT
yeah i have drawn the same conclusion. did you bump up the strength? it helps a little with likeness but also looks more borked. it's a bit of a bummer because genning takes ages with zib....
>assuming you use the right negative
anything out of the ordinary or do you mean stuff like "low quality", "deformed"?
>>
>>108023295
I guess impatientlets stuck with ZiT will have to cope with using sub optimal versions of my kinosovl LoRAs.
>>
Is there a decent workflow anyone has been using for video to video?
>>
is there a comfy custom node that takes in any input and returns the filename?
so vae, image, model, -> node -> filename ?
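Not aware of a stock node for that, and loaded MODEL/VAE objects don't normally carry their source path, but a custom node is only a few lines. A hypothetical sketch using the common wildcard-type trick for an "accept anything" input (all names here are made up for illustration):

[code]
# hypothetical ComfyUI custom node: accepts any input, returns a string
# drop a file like this into custom_nodes/ (names are made up for illustration)

class AnyType(str):
    """Wildcard type: never compares unequal, so it connects to any output socket."""
    def __ne__(self, other):
        return False

any_type = AnyType("*")

class FilenameFromAnything:
    @classmethod
    def INPUT_TYPES(cls):
        return {"required": {"value": (any_type, {})}}

    RETURN_TYPES = ("STRING",)
    FUNCTION = "run"
    CATEGORY = "utils"

    def run(self, value):
        # MODEL/VAE/IMAGE objects don't keep their source filename by default,
        # so this falls back to repr() unless an upstream node attached one
        return (getattr(value, "filename", repr(value)),)

NODE_CLASS_MAPPINGS = {"FilenameFromAnything": FilenameFromAnything}
NODE_DISPLAY_NAME_MAPPINGS = {"FilenameFromAnything": "Filename From Anything"}
[/code]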
>>
It is confirmed: z-image is another chinkshit failbake. Flux Klein won
>>
File: z-image_00053_.png (1.76 MB, 960x1536)
>>108023257
that was zit, this is base, i have to update my naming still
>>
File: o_00226_.png (1.87 MB, 1280x768)
>>
>>108023295
>100% confirmed
>single test
erm....
>>
>>108023352
also looks pretty good. are you using default base and default workflow or is some magic involved?
>>
File: 28.png (2.05 MB, 1296x1296)
:3
>>
>>108023247
What exactly is wrong with it? This is how subgraphs should be used - to hide stuff that you set up once and never touch again
>>
>>108022865
I did but I can't find anything. Search engines are so shit these days that they only give me unrelated results. And AIs tell me to check if I have the necessary files, without telling me which files are necessary. That's why I asked here; you guys know your stuff, so I thought you would notice my error right away from my screenshot and point it out.
>>
>>108023257
looks like ZIT to me.
>>
File: z-image_00058_.png (1.46 MB, 832x1216)
>>108023403
>first google result
>"You do not have CLIP state dict!"
>https://www.reddit.com/r/StableDiffusion/comments/1exhalk/anyone_know_why_i_get_assertionerror_you_do_not/

>>108023361
don't tell him, kek

>>108023368
just different scheduler/sampler settings from r*ddit and an optional face detail but i only turn that on if it's losing details
>https://files.catbox.moe/bxd5v8.png
>>
>>108023313
I used Musubi. And no, turning it up to 2 in any of the three setups fries the absolute fuck out of the image (as you'd expect); anyone that trick works for at all just didn't train their lora correctly, it means the lora is just woefully undertrained in general.
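For context on why cranking strength fries the image: lora strength is just a linear multiplier on the learned weight offset, so 2x strength means a 2x larger delta than anything seen in training. A generic sketch (not musubi- or Comfy-specific, shapes illustrative):

[code]
# generic illustration of lora strength: the learned delta is scaled linearly
import torch

W = torch.randn(1024, 1024)      # base weight
down = torch.randn(16, 1024)     # lora_down (rank 16)
up = torch.randn(1024, 16)       # lora_up
alpha, rank, strength = 16.0, 16, 2.0

# strength 2 doubles the offset, pushing weights far outside what the lora was trained for
W_patched = W + strength * (alpha / rank) * (up @ down)
[/code]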
>>
File: 1755731465352306.jpg (330 KB, 768x1024)
Anyone having trouble using Adetailer with forge neo? Just doesn't seem to work.
>>
>>108023361
I've tested others and am testing more, but the conclusion, to be clear, is that ZIB simply is not in fact the direct parent model of ZIT, not that training is the problem. Again, as I said, the "ZIB on ZIB" output looks fine for what it is in terms of likeness (and I was using a negative for that one, as ZIB generally needs).
>>
File: ComfyUI_09067.png (3.92 MB, 1440x2160)
>>108023274
More like; "The QT that brought Sexmas!"
>>
>>108023430
thanks for the catbox.
when i do close-ups like yours, i also get a lot of detail. but as soon as the person is farther away, faces usually start to look quite plastic.
>>
File: ComfyUI_temp_cyrxa_00010_.jpg (749 KB, 2432x1664)
https://files.catbox.moe/nq6pfc.png
>>
File: 1769590261338506.png (926 KB, 1024x1024)
thoughts?
>>
File: before and after.png (3.29 MB, 1930x1426)
>use wildcards to generate various characters in various locations
>if either the character or the background is good, save it for later
>use edit models to swap the good characters onto the good backgrounds
>relight, inpaint, make changes
I'm having fun over here.
>>
File: o_00230_.png (2.1 MB, 1152x896)
>>
>>108023451
>the conclusion to be clear is that ZIB simply is not in fact the direct parent model of ZIT
oh yeah that was obvious from the getgo. you could just look at their original examples in the paper and see it. also the fact that they sat on ZiB for two months.
>>
Greta/Kirk lora merge?
>>
File: 564670433796886.jpg (350 KB, 2432x1664)
>>108023499
nice
>>
File: WAN__00010.mp4 (1.47 MB, 640x480)
https://files.catbox.moe/ftqvm6.mp4

Are there any current new workflows that give longer vids with Wan 2.1? like 10 sec and i'd be happy. only have 12GB VRAM though
>>
>>108023352
I gave the ZIT one to Gemini and prompted Klein 9B Distilled with the output, 4 steps
>>
>>108023322
nothing weird for the negative, specifically it was `worst quality, low quality, bad quality, very displeasing, lowres, unfinished, pixelated, low resolution, jpeg artifacts, disfigured, deformed, fused, conjoined, disproportionate, missing eye, closed eyes, cross-eyed, lazy eye, asymmetrical irises, bad hands, missing finger, extra digits, broken, crushed, melted, illustration, cartoon, anime, drawing, painting, watercolor, 2d, 3d, cgi, render, simple background, blurry, sketch, ugly`

for the ZIB gens
>>
File: ComfyUI_temp_tahyb_00001_.png (2.29 MB, 1694x1286)
>>108023472
yeah face detail is for that exact scenario

>>108023543
yeah i know a lot of folks are miserable in these threads but i can get great output from klein 9b, zit/base, i can't even choose which one to use for my daily driver, they all take loras great too and train fast as shit.
>>
>>108023570
sdxl slop
>>
>>108023570
>i can't even choose which one to use for my daily driver, they all take loras great too and train fast as shit.
in the midst of trolling anon often forgets how incredible it is we even have more than a single model competing for the top
>>
File: o_00231_.png (1.69 MB, 1280x768)
>>
File: rip_sdxl.png (1019 KB, 896x1152)
https://huggingface.co/circlestone-labs/Anima

babe wake up, new SOTA uncensored anime model
>>
File: 754537457217746.png (1.39 MB, 1152x896)
>>
>>108023663
>this model constitutes a "Derivative Model" of Cosmos-Predict2-2B-Text2Image
Unironically DOA
>>
>>108023663
>Built on NVIDIA Cosmos
DOA
>>
File: 01676-1057906093.png (1.37 MB, 1152x896)
Tell me about Flux Klein. I see it's 64 GB in size. Does it run on 24GB cards? does it run well? is it worth it?
>>
>>108023663
1girl standing holding a sign benchmark SOTA
>>
What's the latest snake oil?
>>
>Tag order
>[quality/meta/year/safety tags] [1girl/1boy/1other etc] [character] [series] [artist] [general tags]
lmfao
>>
File: 852454766989206.png (1.38 MB, 896x1152)
>>
File: x_anqbv6.png (1.16 MB, 1024x1024)
>>
>>108023663
i don't mind anime models but how is it SOTA?
>>
Is there an alternative to WanAnimate?
>>
>>108023663
I saw Comfy add the support a while ago, wondered what it was. Seems like it has to be fast at 2B + 0.6B encoder. Gonna give it a shot now.
>>
>>108023678
it's not 64GB, wtf do you mean. The 9B base and distilled Klein are like 18GB, the 4B base and distilled Klein are like 7GB.
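Rough size arithmetic, assuming bf16 weights (2 bytes per parameter) and that the text encoder/VAE ship as separate files:

[code]
# back-of-envelope: bf16 weights are ~2 bytes per parameter
print(9e9 * 2 / 1e9)  # ~18 GB for a 9B transformer
print(4e9 * 2 / 1e9)  # ~8 GB for a 4B transformer (close to the ~7 GB quoted above)
[/code]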
>>
>>108023671
It must be an architectural mod though, the original Cosmos 2 used T5-XXL for the TE
>>
File: poole.jpg (202 KB, 1780x984)
>>108022550
Mod is eeez rawleee
>>
>>108023717
In short:
Architecture modifications allow very good training efficiency, so it is trained on a gorillion epochs. It rivals the best SDXL tunes in terms of breadth of knowledge, while being a DiT-based model that also supports natural language. It particularly excels at more niche concepts and complex prompts.

I would encourage you to just try the model yourself. It's small, natively supported in comfy, and runs on a potato. I could cherrypick images that make it look good but you can do that with any model.
>>
File: 645621935234530.png (1.55 MB, 896x1152)
>>
>>108023761
>natively supported in comfy
I want it in neoforge and anistudio instead. comfy is malware
>>
>>108023754
interesting.
I knew there was subversion going on but i didn't think it was this embedded
>>
So is there a simple client that's similar to lmstudio for image generation where I can upload an image and ask a model to generate an image output? "Remove watermark" etc?
>>
File: 636761349382255.png (1.53 MB, 896x1152)
>>108023761
If you have it running, give a few examples of more complex prompts, like multiple characters interacting, etc..
>>
File: 01234-1390049721.png (2.04 MB, 1792x2304)
>>108023747
I see one of the models - flux 2 dev - is 64 GB. Naturally, I feel this is probably the superior model. Bigger is better right?
But I guess you have a point. I don't have 64GB vram so I'll have to do with one of these other models. This is why I ask.
So I'm guessing I probably want to go for 9B right?
>>
>>108023761
just feels like nothing is using the optimizations yet. like sure, use a small dataset first to test it out but then you're supposed to use a giant dataset after but it seems like we are just getting the small dataset prototypes
>>
>>108023769
killed the
>it's an anime website
lie forever

stolen from
Mini Modu
@MinModulation
>>
>>108023761
>>108023663
Post examples please
Whoever posted the repo with one image is a fucking retard
>>
File: 59.png (2.46 MB, 1392x1200)
>>
>>108023799
too hairy
>>
File: 1053135782600405.png (1.42 MB, 1152x896)
>>
File: 01638-1340798938.png (804 KB, 768x1152)
>>108023747
Sorry, I guess I'm being retarded dev is obviously not klein
>>
File: 1.png (853 KB, 896x1152)
>>108023796
ok, some random prompts from the validation set
>>
>>108023768
surely the developer of anistudio is not a nocoder and is able to implement it without relying on a 3rd party library... oh wait
>>
>>108023799
this poster is a pedophile btw
>>
>>108023663
This model is horny as fuck, can't post images here.
>>
File: 1745374401789572.png (8 KB, 532x108)
just realized this significantly improves FPS when moving around in comfy
its a bit ugly though
>>
>>108023690
Regardless of your initial model selection, do a second pass with Klein image edit to fix possible issues etc.
>>
File: 2.png (785 KB, 896x1152)
>>108023844
>>
File: 3.png (1.16 MB, 896x1152)
>>108023856
>>
>>108023854
So klein is good at picking up styles?
>>
>>108023663
that example image on the page REEKS of qwen image, dont tell me this is a distill of it
>>
File: 4.png (1.29 MB, 896x1152)
>>108023867
>>
>>108023870
I wish. I'd still be using the edit model if it did.
>>
File: 5.png (957 KB, 896x1152)
>>108023876
>>
File: 6.png (1.08 MB, 896x1152)
>>108023883
>>
File: 156497796706489.png (1.8 MB, 896x1152)
>>
what is this dogshit model? why is it fucking up in every example image?
>>
File: 7.png (1011 KB, 1024x1024)
>>108023894
not perfect yet at a bunch of characters at once but it's getting there
>>
>>108023788
yeah Flux.2 Dev is their mainline big chungus open weights model and uses a big Mistral TE, the Kleins are unrelated and use smaller Qwen TEs
>>
File: 01683-755039903.png (1.21 MB, 896x1152)
>>108023877
is it at least fast?
>>
>>108023871
it uses Qwen VAE and Qwen 3 0.6B TE, but it appears to be an arch mod of Cosmos 2 overall
>>
File: 626606076365767.png (1.56 MB, 896x1152)
>>108023844
>>108023883
>>108023905
These are pretty good actually.
>>
>>108023877
i've had good results doing style transfer with Klein
>>
>>108023912
Conclude your own tests. don't believe the schizo who replies to other people's posts.
>>
>>108023916
yeah it looks at least as good as what I can get out of NetaYume V4.0, gonna try it now
>>
>>108023914
what im saying is was this trained on booru or on synthetic garbage
>>
>>108023912
Yes.
>>108023920
We have different standards for "good".
>>
>>108023927
it was trained on 5 million booru images and 800k non-anime artistic images
>>
Seems good, looking forward for the final version
https://files.catbox.moe/yfbt5c.jpg
>>
>>108023940
weirdo
>>
does zim/zit have problems with belly buttons? they often look deformed
>>
>>108023939
ok good i will try it
>>
>>108023940
>catbox down for me
fuck this gayass earth
>>
>>108023922
you're right. there are a lot of people with questionable motives itt
>>
>>108023949
For me too, it didn't upload properly the first time I tried, and now it won't load the image for me either.
>>
>>108023949
it always does this shit. it's garbage
>>
>>108023960
>it always does this shit
it doesn't tho
>>
>>108023969
it just did thougheverbeit
>>
>>108023956
try not uploading gcn next time


