[a / b / c / d / e / f / g / gif / h / hr / k / m / o / p / r / s / t / u / v / vg / vm / vmg / vr / vrpg / vst / w / wg] [i / ic] [r9k / s4s / vip] [cm / hm / lgbt / y] [3 / aco / adv / an / bant / biz / cgl / ck / co / diy / fa / fit / gd / hc / his / int / jp / lit / mlp / mu / n / news / out / po / pol / pw / qst / sci / soc / sp / tg / toy / trv / tv / vp / vt / wsg / wsr / x / xs] [Settings] [Search] [Mobile] [Home]
Board
Settings Mobile Home
/g/ - Technology

Name
Options
Comment
Verification
4chan Pass users can bypass this verification. [Learn More] [Login]
File
  • Please read the Rules and FAQ before posting.
  • You may highlight syntax and preserve whitespace by using [code] tags.

08/21/20New boards added: /vrpg/, /vmg/, /vst/ and /vm/
05/04/17New trial board added: /bant/ - International/Random
10/04/16New board for 4chan Pass users: /vip/ - Very Important Posts
[Hide] [Show All]


[Advertise on 4chan]


Barely Edition

Discussion of Free and Open Source Diffusion Models

Prev: >>107946229

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/tdrussell/diffusion-pipe

>Klein
https://huggingface.co/collections/black-forest-labs/flux2

>ZiT
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo

>LTX-2
https://huggingface.co/Lightricks/LTX-2

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>NetaYume
https://huggingface.co/duongve/NetaYume-Lumina-Image-2.0
https://nieta-art.feishu.cn/wiki/RZAawlH2ci74qckRLRPc9tOynrb

>Illustrious
https://rentry.org/comfyui_guide_1girl
https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
>>
>>107950047
Please use the correct rentry for 32 stars next time, this version is missing a ton of lore and was added by the troll
>>
Blessed thread of frenship
>>
>>107950073
get a better hobby
>>
>>107950076
I want to enjoy this hobby without schizos
>>
>>107950073
welcome back ani
>>
>>107950086
Then go to /sdg/ because you want anons to get harassed by them.
Oh....That's right your thread is dead dog shit under NEET loser schizo control.
>>
>>107950088
take your meds
>>
>>107950047
Thank you for baking this thread, anon
>>107950061
Thank you for blessing this thread, anon
>>
>>107950103
I will agree with you that you're not ani because of the lack of a racist spergout
The only other desperate faggot willing to ritual post day in and day out in every thread is the first schizo with a rentry.
Go back to your dead thread pillow princess.
>>
>>107950115
ani is racist?
>>
File: Flux2-Klein_00001_.png (1.71 MB, 1024x1024)
1.71 MB
1.71 MB PNG
>>
>>107950114
thank you for samefagging
>>
>>107950073
Hi julien
>>
>>107950115
You're mixing things up
Trani was the homosexual pedophile not the racist anon
>>
does LTX 2 run in comfy? It doesn't specifically mention it.
>>
>>107950192
Update and check the prebuilt workflows
>>
File: 830.png (1.87 MB, 832x1488)
1.87 MB
1.87 MB PNG
>>107950192
First thing that shows up if you type "comfyui ltx2" on google: https://docs.comfy.org/tutorials/video/ltx/ltx-2
>>
>>107950200
thanks
>>
>>107950212
I'm used to the Comfy github having a page for each of the models it supports. Didn't think I was going to have to google it.
>>
To the anon that was interested in the anime klein edit model. I published it on civit, but if you prefer a megaupload for it, just reply to this post!
https://civitai.com/models/2332320
>>
>>107950231
That docs page has all the stuff, its surprisingly good
>>
>>107950247
>it's surprisingly good
what? the page or ltx2?
>>
>>107950241
Did you finetune 4b and extract the difference? Was image pairs needed for it to work?
>>
>>107950294
the docs pages, haven't used ltx2
>>
>>107950241
will you make a 9B version?
>>
>>107950301
it's just AI generated slop
>>
>>107950300
I fine tuned it on image pairs yes. For "normal" prompts i made the conditioning image a blank image
>>
>>107950310
Yes, im on it, should be done by tomorrow
>>
>Klein turned out to be shit
>z-image base still stuck in Chinese culture
Why even continue?
>>
>>107950326
cool, nice work anon
>>
>>107950331
>Klein turned out to be shit
Don't continue, literally nothing will ever please you.
>>
File: 4.png (1.97 MB, 1472x768)
1.97 MB
1.97 MB PNG
>>107950331
>>
>>107950320
>the conditioning image
what's that? are you sure image pairs are needed for this and it doesnt work without?
>>
>>107950343
There would be something that'd please me. z-image so we finally can have proper finetunes of that shit.
>>
>>107950331
ace 1.5 next week
get writing lyrics
>>
>>107950165
He's both
>>
>>107950365
>ace 1.5 next week
I hope this is good and you can make loras and finetunes of that one. 80% of udio at home would be so nice.
>>
>>107950047
>didnt make the fagollage
literally kys
otherwise thanks for baking
>>
File: sss.png (581 KB, 1318x740)
581 KB
581 KB PNG
>>107950350
it works text2image, but what I meant is that you can feed it a reference. For example a collage of art from the same artist to replicate the artstyle, or a character that you want to change the pose of, change the clothes, compose with another subject.
>>
>>107950415
I hope all the baker schizos die in a fire
>>
File: 38.png (2.54 MB, 832x1488)
2.54 MB
2.54 MB PNG
>>
File: 45756052.png (2.44 MB, 912x1360)
2.44 MB
2.44 MB PNG
>>
>>107950415
your image was deleted Julien so of course it didn't make it
>>
>>107950428
Did you try training it without pairs?
>>
>>
Why "LTX-2: Load Latent Upscale Model" node doesn't fetch the upscaler files that are present in the right folder?

ComfyUI startup logs:

E:\ComfyUI_windows_portable_nvidia>.\python_embeded\python.exe -s ComfyUI\main.py --windows-standalone-build --listen
[...]
Adding extra search path loras D:\AI\models\loras
Adding extra search path upscale_models D:\AI\models\upscale_models
Adding extra search path upscale_models D:\AI\models\latent_upscale_models
[...]


Folder structure:

D:\AI
───models
───audio_encoders
───checkpoints
ltx-2-19b-dev-fp8.safetensors
ltx-2-19b-distilled-lora-384.safetensors

───clip_vision
───configs
───controlnet
───diffusion_models
───embeddings
───latent_upscale_models
ltx-2-spatial-upscaler-x2-1.0.safetensors
ltx-2-temporal-upscaler-x2-1.0.safetensors

───loras
ltx-2-19b-ic-lora-depth-control.safetensors
ltx-2-19b-ic-lora-detailer.safetensors
ltx-2-19b-ic-lora-pose-control.safetensors
ltx-2-19b-lora-camera-control-dolly-in.safetensors
ltx-2-19b-lora-camera-control-dolly-left.safetensors
ltx-2-19b-lora-camera-control-dolly-right.safetensors
ltx-2-19b-lora-camera-control-jib-down.safetensors
ltx-2-19b-lora-camera-control-jib-up.safetensors
[...]
>>
>>107950563
with pairs but it generalizes without too
>>
>>107950631
maybe R to update?
>>
>if you don't specify for klein edit to make a high quality image it will try to reproduce the low quality of the input
a blessing and a curse
>>
>>107950631
You have posted this multiple times.
>>
>>107950631
This fucking folder retardness. Has it still not become better?
Just let people use what ever folders they want.
Bullshit
>>
Isn't Klein just Flux 2 but worse?
>>
>>107950764
scratch that, reverse it
maybe I'm doing something wrong, or maybe flux2 dev is predominantly a t2i model, but it can't even swap clothes as well as klein
>>
>>107950682
twice but ok
>>
Is there a General for NSFW discussion?
>>
>>107950822
You can discuss here, just catbox nsfw images.
>>
>>107950861
can you give us the shota collection already?
>>
>>107950871
What?
>>
>>107950764
It's not exactly the same model at all, given it uses a completely different text encoder and doesn't seem to be like, a params-reduced version of Flux.2 Dev. It is overall worse though yes, as you'd obviously expect. Still great models for their size though, especially for editing some of the stuff you can do with the 4B is crazy relative to way bigger models like Qwen Edit, or any API-only edit model.
>>
>>107950887
the shota collection you promised to share last year
>>
>>107950897
Wrong anon.
>>
>>107950848
>>107950861
Thanks bros.
>>
Don't know why violet always becomes asian. And it removes her mask.
>>
>>107950923
and she looks uncanny bad
>>
File: 1768236772815152.png (2.58 MB, 1216x1216)
2.58 MB
2.58 MB PNG
humm fuck demons ok?
>>
>>107950927
That's expected. I, at least, don't know of a prompt that can turn anime/comics into real life without the uncanny look.
>>
we demand the shota
>>
>>107950939
ask to reduce her head size? and describe her more?
>>
>>107950127
Cool gen
>>
>>107950938
Doesn't look right.
>>
>>107950923
how the fuck are you prompting it lmao
>>
>>107951046
I made a general prompt kek
>Make this image as a photograph of a real life, live-action movie scene, or a film still. Keep the lighting, camera angle, and other elements unchanged, but remove any watermarks or signatures. If humans are present, maintain the proportions of their bodies, but adjust the heads and faces for realism (in case of anime illustrations), while preserving their facial expressions.
>>
File: 1756261466079904.png (262 KB, 624x1380)
262 KB
262 KB PNG
owari da
>>
File: Flux2-Klein_00526_.jpg (331 KB, 1136x1136)
331 KB
331 KB JPG
>>
File: 353611.png (1.12 MB, 1120x1120)
1.12 MB
1.12 MB PNG
>>
Original by sade (I think)
>>
File: BeforeAndAfter.jpg (3.09 MB, 2240x1824)
3.09 MB
3.09 MB JPG
>>107951077
"The girl with purple eyes and black gloves and a red suit and a pointed domino-style black eye mask in image 1 is now completely realistic and lifelike and standing in the middle of a city street. Maintain all other aspects of the composition and layout."

If you ask me there's no way it could make her look any more "real" than this while having it still look like the same person at all, her initial head proportions don't really lend itself to that
>>
>>107951102
weren't white people banned from existing already in the uk?
>>
>>107951102
I would absolutely love them to respond saying "sorry but that's just not feasible", then carry on issuing fines to foreign nations for not complying with british law circa 2023
>>
>>107951102
Does the person who made this petition even know that local models exist? The only way I could see this making sense is if they literally don't
>>
>>107951263
It's because of grok probably, so no they don't know.
>>
>>107950938
>>
File: r.jpg (175 KB, 848x1488)
175 KB
175 KB JPG
>>107951102
predictably retarded. surely the UK government is going to inspect <art outsourced EVERYWHERE in the world for $0.2> if it is AI content.

worked so well for the climate corruption certificates and everything else
>>
whadafugg?
dragged wf and I get this:

Loading aborted due to error reloading workflow data
- **Exception Message:** TypeError: helpDOM.addHelp is not a function
>>
>>107951362
Unfortunately ComfyOrg is pro abortion
>>
File: Comparisons.jpg (3.59 MB, 3600x1808)
3.59 MB
3.59 MB JPG
It's actually crazy how good the Distilled Kleins are for editing, given the size / speed of them. Even the 4B one here is almost as good, it just (incorrectly) unblurs the grass a bit too much.

Like not even Nano Banana Pro can do this while maintaining exactly the same input resolution of 1200x1808.
>>
>>
>>107951277
Yeah Elon is a gigantic faggot for allowing that on X for as long as he did
>>
>>
>>107951419
You know that you can use edit models with other stuff that isn't photos right? for example klein sucks at text editing but qwen-edit excels
>>
>>107951102
1- Nothing happened, literally.
2- If they go and legislate that with or without the petition, people will just get whatever they want in other ways, just like what they do for porn.
>>
>>107951458
wat? is this supposed to be some kinda gotcha? I haven't really tested text extensively on Kleins but I doubt the text even does suck for the speed in comparison to Qwen
>>
File: 00231.jpg (931 KB, 2737x4000)
931 KB
931 KB JPG
>>
File: SignComparison.jpg (3.38 MB, 1520x3024)
3.38 MB
3.38 MB JPG
>>107951556
(samefag)
I'm not even sure dude is correct about text also
Qwen 2511 at 50 steps for:
```The sign in image 1 now reads "JOEVER HERE". Maintain all other aspects of the composition and layout.```
here both looks way worse and doesn't actually get it right, while being giga slower
>>
>>107951653
>50 steps
>distilled vs base
anon, do proper testing, distilled vs lighting and base vs base
>>
Retard here, to run Flux2 4b I need the checkpoint, te text encoder(quen3 4b), and the flux2 vae, no other files right?
>>
>>107951653
distilled is always better than base on quality, that being said a single datapoint is just that, a datapoint
>>
>>107951671
yes, flux2-vae, flux-2-klein-4b, qwen_3_4b
>>
File: 1743297900002917.jpg (658 KB, 1424x2176)
658 KB
658 KB JPG
testing more exotic samplers, I like lobatto
>>
>>107951666
>>107951676
>qwen fried the image as it always does
>klein keeps it mostly the same
>anons ignore it entirely
it's hard to shill when your model gets beaten this bad.
>>
>"rewrite the prompt. do not generate an image and instead write a prompt. rewrite the last prompt so that she's now on a beach. write, dont gen an image."
>Loading Nano Banana...
>>
>>107951802
https://aistudio.google.com/
>>
>>107951666
The point there was I can do it in 4 steps with Klein but QIE couldn't do it in 50 steps at all
>>
>>107951776
Try kohaku
>>
>>107951802
I think if you're in "Create Image" mode on Gemini UI, it will always in fact create an image.
>>
>>107951799
the prompt said "JOEVER" too, not "JOVER", Qwen didn't add the E properly. I hope they do a Qwen Edit 2512 with the same way better realism of normal Qwen 2512, anyways.
>>
>>107950617
Jenifer Lopez? powerfull..
>>
File: 1761438350989230.jpg (1.14 MB, 1248x1824)
1.14 MB
1.14 MB JPG
>>
>>107951996
kino
>>
>>107951964
nobody, pure T2I
>>
File: 00232.jpg (866 KB, 2737x4000)
866 KB
866 KB JPG
>>
sd1.5 never died
>>
>>107952037
>What is dead may never die
>>
>>107952037
not agree. 1.5 is so plastic
>>
>Over a year
>Still no local base model better than illustrious for hentai
Grim
>>
File: sd1.png (2.26 MB, 1827x772)
2.26 MB
2.26 MB PNG
>>107952037
yeah there's never really stopped being a steady stream of new or updated SD1 checkpoints
>>
>>107952037
Body horror never died
>>
>>107952082
if "better" DOESN'T mean "actually extant / very good natural language prompt adherence, in English" which would be NetaYume already, than what exactly are you hoping for?
>>
>>107952108
i've never understood that either, a lot of the people who claim to want a better anime model simultaneously don't seem to give a shit about prompt adherence, which is like, what is it you want then lol
>>
>>107952108
"Better" probably means properly knows anime/booru concepts, characters, styles, doesn't look worse than illustrious, and doesn't melt down into body horror with sex prompts. Probably.
>>
>>107951102
>OI MATE, YOU GOT A LOISENCE FOR THAT MODEL?
>>
>>107952108
There are only 4k downloads on Civitai since its release last month. Only one lora for it was published. Most of comments are complaints rather than praises. This model must be very bad.
>>
>>107952108
>>107952127
Their favorite gacha character isn't native nor has a lora so to them that means the model is bad
>>
File: 1534396412016.png (15 KB, 210x260)
15 KB
15 KB PNG
Is the patch sage attention node still working? Doesn't seem to do shit on auto.
>>
>>107952095
It was and is still the only option for VRAMlets.
>>
>>107952164
I assume the model was shit based on the posted images on civit, just compare it to WAI and well, thats it
>>
>>107952127
>>107952127
>Text/captions
>Obscure fetish prompt recognition
>Multiple characters in a scene without 3rd party tools,
>Generally better quality, anatomy, and backgrounds that adhere to logic.
Am I being gaslit? What do mean there's no improvements to be made?
>>
>>107952127
There's a reason 1girl became a meme.
>>
>>107952167
>the model that is supposed to be for genning anime is bad at genning anime
>It's pretty good though, trust me!
>>
File: 8735492677.png (10 KB, 395x77)
10 KB
10 KB PNG
wtf is that? does the model need instructions?
>>
we get it you were filtered by yume its fine just dont sperg out about it
>>
>>107952217
an llm does, which is most TE's now
>>
>The current models are fine
Says the guy who only gens 1girl standing without any creativity in the prompt.
>>
>>107952164
>There are only 4k downloads on Civitai since its release last month
yeah because this community is full of utter retards and turbo ESLs who wait for Reddit to tell them what to use and don't really browse for things themselves
>Only one lora for it was published.
That doesn't really mean anything, but beyond that the architecture of the model itself is relatively "unexplored" to begin with
>Most of comments are complaints rather than praises.
That's not really true, the guy asks for constructive criticism on each version and usually gets it. There's more than one person who went out of their way to leave a praise comment on V4.0 also.
>>
>>107951102

>>>/wsg/6078544
>>
>>107952190
99% of images in Civitai are bisgustin and are no indication of the capabilities of the model they purport to be from.
>just compare it to WAI
oh nevermind you are a hyper ai slop enjoyer. carry on
>>
>filtered by yume
Because it's underbaked as fuck. And newbie too.
>>
>>107952228
they just encode the text they don't understand a fucking system prompt LOL
>>
>>107952127
Good quality
Consistent styles
No body horror
Easy to train
Decent gen times
>>
>>107952247
Man I wish I was this new
>>
>>107952254
prove me wrong
>>
File: TheSDXLVaeIsVeryNotGood.jpg (3.15 MB, 1664x2432)
3.15 MB
3.15 MB JPG
>>107952190
people keep saying this but I really fail to see how the NetaYume user gallery or example images are particularly different at all from many Illustrious or Noob based checkpoints out there.

Noob VPred base literally has gigaslopped shit like picrel as examples also so I'm not sure this reasoning is actually reality
>>
Thoughts?
https://huggingface.co/blog/waypoint-1
>>
>>107952275
another 'woooaaah!!... can anyone think of a use for this?' kinda thing
it'll keep AI alive a little longer, and not much else
>>
You wouldn't know underbaked even if Pixart punched you in the face
>>
>>107952275
Tried the online demo, looks like mushy shit, like all these videogame generators.
>>
File: ayy.png (2.12 MB, 1216x896)
2.12 MB
2.12 MB PNG
Should've included furry like Noob (but considering they underbaked even with just tranime it would probably be even worse).
>>
>>107952327
yeah desu the minecraft version was way more fun.
>>
>>107952095
none of this looks significantly worse than qwen/glm or the rest of the garbage we've received recently. grim
>>
>>107952332
It's funny because so many promptlets would point to the inclusion of e621 as the reason why NoobVP was ""bad"".

Many seem to forget that anon initially fought tooth and nail against Noob in favor of Pony. Thankfully I singlehandedly changed their minds.
>>
>anon
fizzlekek self-report
>>
>>107952332
Is that a NetaYume gen? If so, I'm impressed.
>>
ok, I've morphed all my cantonese cartoons into live-action movies.
What else can I do with klein image edit?
>>
>>107950331
soon
>>
Who cares about Noob? This is the new hot stuff
https://civitai.com/models/2197517/newbie-image

I see they updated it on civitai?
>>
>>107952388
lol no, just klein, if it was on neta i'd try on a anime beef
>>
>developed through research on the Lumina architecture.
KEEEEEEEEEEK another lumina failbake!
>>
File: ComfyUI_temp_arrpk_00002_.png (3.95 MB, 1152x1344)
3.95 MB
3.95 MB PNG
>>
>>107951107
>>
File: PA_0028.jpg (1.18 MB, 2560x1536)
1.18 MB
1.18 MB JPG
>>
File: 80588.png (1.29 MB, 512x1408)
1.29 MB
1.29 MB PNG
>>
it's surprising how uncucked 4b klein is, I'm using it to upscale old porn and it's almost perfect
>>
>>107952418
Newbie isn't bad by any means IMO but it's just like, not in any way really better (or even *different* in terms of promptability and overall feel) than NetaYume while being bigger and quite a bit slower than NetaYume.
>>
File: KleinVsZITAnimeWAIPrompt.jpg (3.59 MB, 3072x2304)
3.59 MB
3.59 MB JPG
Klein 9B Distilled vs ZIT on boomerprompt recreation of a WAI Illustrious model card lol

```a 2D digital anime illustration of a pale-skinned young woman with an ethereal and charming appearance, standing in the center of a dimly lit, magical library. The woman appears to be approximately 18 years old. She has long, vibrant blue hair styled in a thick braid that rests over her left shoulder, with loose strands framing her face and a halo-like accessory floating above her head. Her hair is adorned with star-shaped clips and blue ribbons. Her eyes are large and striking, featuring distinctive bright yellow irises with star-shaped pupils that gaze directly forward with a gentle, inviting expression. She is holding her right index finger to her smiling lips in a "shh" or quiet gesture. She wears a detailed outfit consisting of a black dress with a corset-style waist, white ruffled sleeves, and a white shawl or capelet draped over her shoulders, decorated with teal stars and gold accents. A blue ribbon is tied around her neck with a star charm. In her left hand, she holds a lit lantern with a warm, glowing yellow light that illuminates her figure and the immediate surroundings. The lantern has a black metal frame and a glass enclosure. To the left of the woman, on a stack of hardcover books resting on a wooden surface, sits a small, stylized black plush toy character wearing a military-style cap. The background features tall wooden bookshelves filled with numerous books of varying colors and sizes, extending high up towards the ceiling. A wooden ladder leans against the shelves on the right side. Through a large arched window in the upper right background, a dark blue night sky is visible. The lighting is dramatic, with the warm glow of the lantern contrasting against the cool, dark tones of the library shadows. The overall atmosphere is cozy, mysterious, and scholarly.```
>>
>>107952588
>>
https://x.com/gorilla_rape/status/2014721320429232282?s=20
https://x.com/gorilla_rape/status/2014720728264843627?s=20
mayli anon...
>>
>>107952637
>>
>>107952634
>high-res
How much are you upscaling and what's your denoise?
>>
File: 1762233068985557.png (493 KB, 1152x896)
493 KB
493 KB PNG
>>
>>107952618
Does it keep vaginas and nipples intact? What happens if you don't upscale?
>>
File: h742by.png (660 KB, 1024x512)
660 KB
660 KB PNG
>>
>>107952650
I used the exact same (assumed) base res and (obviously concrete) final res as the actual WAI pic here (final was 1536x2304, almost certainly upscaled exactly 2x by WAI dude from 768x1152, a resolution that East Asians love for some reason.)

Used 4xUltrasharp for the upscale on both Klein and ZIT, and 0.4 strength for the 8 steps of high-res denoise on both, with the same prompt / seed / everything else.
>>
>>107952637
They all have that kind of uncanny same face, it's weird.
>>
>>107952660
Yeah it does, even cumshots work most of the time. It can't change genitals and nipples but keeps them intact when upscaling or changing other things in the image
>>
>>107952706
Not as much as before though
>>
>>107952711
>>
>>107952643
She’s beautiful, frankly
>>
>>107952711
>>107952719
Bodies are fine, but it's that round sameface.
>>
>>107952723
You can simply change that with some prompting.
>>
I still can't believe this ACEStep 1.5 kino is just an AI song. The guitar just goes so hard on this one, and I've never seen other models like Udio or Suno do that kek (good or catchy music, sure, but this level of coherence... damn)

https://files.catbox.moe/jc3fgz.mp3

Only 4 days for this kino bros.
>>
>>107952766
time signature changes are jarring, and the tempo is all over the place
but it's coherent
>>
>>107950047
erm anons, is there any good local model like nano banana but which will help me make 2D sprites/tiles?
>>
why isn't there any speed up when I generate images in a batch?
>>
>>107952778
It does change it up a lot, but then again real "artists" aren't too repetitive either.
>>
>>107952707
The 4b model seems to have different behavior. the 9b model would often mangle genitals and nipples unless you upscale.
>>
>>107952798
Batching improves speed only if you have enough VRAM.
>>
>>107952798
You are saving time on the repeated spool-ups of the model when doing it one by one.
>>
File: 169604-tmp.png (2.77 MB, 1368x2000)
2.77 MB
2.77 MB PNG
>>
File: 00062-1965252318.png (2.22 MB, 1536x1536)
2.22 MB
2.22 MB PNG
>>
File: inpaint halos.png (344 KB, 641x1037)
344 KB
344 KB PNG
>>
>>107952832
>>107952844
but it's chroma and fits entirely on vram?
>>
>>107952924
Idk nigga, run a stopwatch for manually clicking 10 gens or a batch of 10
>>
>>107952934
>running in comfyui
>use a stop watch when it clearly reports run time
>>
>>107952798
Who told you it would?
>>
>>107952939
It doesn't report time of you clicking the buttons with your hands
>>
>>107952944
why wouldn't it?
>>
>>107952618
it works very well. I've been doing this with all kinds of shit in my massive porn folder.
>>
>>107952973
What was your prompt?
>>
File: 1740473590765315.jpg (1.4 MB, 2224x2576)
1.4 MB
1.4 MB JPG
>>
File: saas adware.png (439 KB, 617x687)
439 KB
439 KB PNG
reminder that comfyui is saas adware and should be removed from the OP. comfyui promotes API-only models and encourages companies to shift towards an API-first approach as that is how they plan to make money. remove ComfyUI from the OP.
>>
wan2gp chads, are boy has cooked it up good again.
>>107952981
transform style of image to a ultra realistic cinematic high budget Hollywood movie scene. maintain the original background. do not add clothes, bra and panties. keep everyone full nude, naked and exposed.

transform style of image to a photorealistic cinematic high budget Hollywood movie scene. change the lighting to cinematic very dark night lighting. remove all breast nipples, areola pigment and breast pigment.
>>
>>107953000
Thank you anon.
>>
Low levels of kino sovl ITT
>>
File: _f2k9b_00062.png (1.78 MB, 960x1440)
1.78 MB
1.78 MB PNG
>>
>>107953014
you obviously aren't complaining hard enough
>>
File: _f2k9b_00173.png (2.11 MB, 960x1440)
2.11 MB
2.11 MB PNG
>>
>>107951424
>https://files.catbox.moe/jc3fgz.mp3

What am I looking at here?
>>
>>107951996
how do u get this aesthetic
>>
>>107952973
you can probably retain the Playstation 3 text if you explicitly mention it in the prompt FYI
>>
File: _f2k9b_00189.png (1.96 MB, 960x1440)
1.96 MB
1.96 MB PNG
>>107953117
pick any chroma model and ignore anyone that tries to explain how to use it
>>
>>107953117
img2img with low denoise.
>>
should I go back to sigma or xl?
>>
LTX-2 no prompt. you can't make this shit up.
https://files.catbox.moe/2zrymf.mp4
>>
>>107953249
AI knows.
>>
>>107953249
We're about to enter a totally new era of brainrot
>>
>>107952247
the lumina team trained the model on captions that start like that
>>
>>107953249
Having no prompt results in 100% jeet shit.

I forgot what I was doing, but I also managed to get 100% italian audio/vocals when genning something. It's such a slopped model.
>>
File: peppa.png (592 KB, 1280x768)
592 KB
592 KB PNG
>>107953249
>>107953288
Second try I get straight up peppa pig. not even a scuffed version.
>>
>>107953249
>https://files.catbox.moe/2zrymf.mp4
lmaoooo
>>
>>107952217
gemma boilerplate
>>
File: ohnonono.png (279 KB, 920x469)
279 KB
279 KB PNG
>>107953255
>>
>>107953249
ltx2 is such a dogshit model
>>
>>107953392
I like it a lot. every gen it makes make me crack up laughing.
>>
>>107953360
Is this ai generated?
>>
>>107953249
AWAKEN MY MASTERS
>>
File: ComfyUI_temp_uarav_00001_.png (2.31 MB, 1024x1280)
2.31 MB
2.31 MB PNG
Is there a way to run joycaption batches from within comfy? I've tried taggui but that shit just forcedumps the models into the hugginface cache on C: and I have no room.
>>
File: LoomerJigsaw.mp4 (1.69 MB, 480x640)
1.69 MB
1.69 MB MP4
>>
File: 1747515384247504.png (2.57 MB, 1280x1248)
2.57 MB
2.57 MB PNG
>>107953539
yes
>>
>>107953420
https://www.lightricks.com/about
>>
>>107953583
>troon gradient
lmao this entire thing looks like a shitpost
>>
>>107953580
What nodes?
>>
File: 1761506656894234.png (246 KB, 1088x1056)
246 KB
246 KB PNG
>>107953601
use any openai client if you have an existing llmao.cpp
if you want to keep everything in-workflow, then use the llmao-cpp-python nodes.
was nodes for loading/saving as showed.
>>
>>107953686
almost forgot, you have to run this workflow N times per images you have, there are other nodes that do loops/load all images but they require more nodes and a bit morel ogic
>>
>mfw Resource news

01/22/2026

>Linum v2: 2B parameter, Apache 2.0 licensed text-to-video models
https://www.linum.ai/field-notes/launch-linum-v2

>Huihui-GLM-4.7-Flash-abliterated
https://huggingface.co/huihui-ai/Huihui-GLM-4.7-Flash-abliterated

>LookBench: A Live and Holistic Open Benchmark for Fashion Image Retrieval
https://serendipityoneinc.github.io/look-bench-page

>FunCineForge: A Unified Dataset Toolkit and Model for Zero-Shot Movie Dubbing in Diverse Cinematic Scenes
https://anonymous.4open.science/w/FunCineForge

>Towards Holistic Modeling for Video Frame Interpolation with Auto-regressive Diffusion Transformers
https://github.com/xypeng9903/LDF-VFI

>Porn Site ManyVids Descends Into AI Psychosis
https://gizmodo.com/porn-site-manyvids-descends-into-ai-psychosis-2000713033

>Comic-Con Bans AI Art After Artist Pushback
https://www.404media.co/comic-con-bans-ai-art-after-artist-pushback

01/21/2026

>Ollama Image generation (experimental)
https://ollama.com/blog/image-generation

>DiffusionAgent: Navigating Expert Models for Agentic Image Generation
https://github.com/DiffusionAgent/DiffusionAgent

>Arthemy Live Tuner Z-Image ComfyUI nodes: Real-time control for Z-Image (S3-DiT) models and Qwen3-4B Text Encoders
https://github.com/aledelpho/Arthemy_Live-Tuner-ZIT-ComfyUI

>VideoMaMa: Mask-Guided Video Matting via Generative Prior
https://cvlab-kaist.github.io/VideoMaMa

>Easy 2D Openpose Editor
https://github.com/speedyrulz/Easy-2D-Openpose-Editor

>Flux2 INT8 Acceleration: Speeds up Flux2 in ComfyUI by using INT8 quantization
https://github.com/BobJohnson24/ComfyUI-Flux2-INT8

>EmoLat: Text-driven Image Sentiment Transfer via Emotion Latent Space
http://github.com/JingVIPLab/EmoLat

>Fine-Grained Zero-Shot Composed Image Retrieval with Complementary Visual-Semantic Integration
https://github.com/yyc6631/CVSI

>Interp3D: Correspondence-aware Interpolation for Generative Textured 3D Morphing
https://github.com/xiaolul2/Interp3D
>>
>mfw Research news

01/22/2026

>LURE: Latent Space Unblocking for Multi-Concept Reawakening in Diffusion Models
https://arxiv.org/abs/2601.14330

>Breaking the accuracy-resource dilemma: a lightweight adaptive video inference enhancement
https://arxiv.org/abs/2601.14568

>LFS: Learnable Frame Selector for Event-Aware and Temporally Diverse Video Captioning
https://arxiv.org/abs/2601.14594

>3D Space as a Scratchpad for Editable T2I Generation
https://oindrilasaha.github.io/3DScratchpad

>Diffusion Epistemic Uncertainty with Asymmetric Learning for Diffusion-Generated Image Detection
https://arxiv.org/abs/2601.14625

>Mirai: Autoregressive Visual Generation Needs Foresight
https://arxiv.org/abs/2601.14671

>Safeguarding Facial Identity against Diffusion-based Face Swapping via Cascading Pathway Disruption
https://arxiv.org/abs/2601.14738

>Enhancing T2I Generation via End-Edge Collaborative Hybrid Super-Resolution
https://arxiv.org/abs/2601.14741

>Synthetic Data Augmentation for Multi-Task Chinese Porcelain Classification: A Stable Diffusion Approach
https://arxiv.org/abs/2601.14791

>TempViz: On the Evaluation of Temporal Knowledge in T2I Models
https://arxiv.org/abs/2601.14951

>Mixture-of-Experts Models in Vision: Routing, Optimization, and Generalization
https://arxiv.org/abs/2601.15021

>Deep Leakage with Generative Flow Matching Denoiser
https://arxiv.org/abs/2601.15049

>Differential Privacy Image Generation with Reconstruction Loss and Noise Injection Using an Error Feedback SGD
https://arxiv.org/abs/2601.15061

>PROGRESSLM: Towards Progress Reasoning in Vision-Language Models
https://progresslm.github.io/ProgressLM

>StableWorld: Towards Stable and Consistent Long Interactive Video Generation
https://arxiv.org/abs/2601.15281

>Iterative Refinement Improves Compositional Image Generation
https://iterative-img-gen.github.io

>Towards Understanding Best Practices for Quantization of Vision-Language Models
https://arxiv.org/abs/2601.15287
>>
>>107953759
>Linum v2: 2B parameter, Apache 2.0 licensed text-to-video models
What's the point? Making gifs? Wan 5B was already ass.
>>
the absolute state of local
>>
>>107952986
until every custom free node stops working, it will remain. plus i'm sure there's comfyui forks out there without api so its all good.

the issue with alibaba's betrayal is that they mentioned releasing an open sourced version or was possibility (believe it was one of their live streams) but the community was too busy sperging out. so, guess alibaba quietly said fuck you and kept it closed.

>>107953249
lmfao. the anon who mentioned this when ltx2 first released wasn't lying
>>
>>107953810
A two-man AI startup got YC funding and blew it all on making the world's most mediocre video model. Investors really will throw money at absolutely anything related to AI.
>>
so any improvements on f2k character loras or are they still shit?
>>
File: 1654012964004.gif (1.98 MB, 400x300)
1.98 MB
1.98 MB GIF
>qwen tts released

Welp, looks like I am going to be busy all day cloning voices into ltx2
>>
anon I need the copypasta that chroma never delivers
>>
>>107953928
is it any good? no one is posting samples
>>
>>107953980
I'm backing up my install, taking forever.

But the cloning is extremely good, it's gamechanging. It's going to blow up for sure.
>>
>>107953999
can it do whisper voice / asmr / moans?
>>
>>107953999
Not bad at all. This is just with x-factor, no transcript https://voca.ro/15qrUuMucqdW
>>
>>107953980
youtube has some, just search for qwen3 tts

it seems very good overall at least, but I can't yet tell if it is "the best" at for example emotions or voice cloning
>>
>>107953965
>actually chroma was very soul at Revision #
>no wait #48 is the actual soulful one
>#50 (HD) is bad but wait!
>there's also the flash heun model its good (its not)
>and there's the HD flash merge too!!!! lol!! I swear this time it converges good!!!!
>but you know whats really bad? its not the unfinished training... its just the.. UGH VAE!!!
>yeah lets train a new vaeless chroma LMAO, RADIANCE!
>*retard spams the general for weeks with his absolutely melty/cooked gens*
>uhmmm no radiance is good !!!!
>but WAIT, ackshually radiance can be fixed with this x0 version
>ehh but you know what? we're moving onto z-image... what? waiting for base? lmao!!!! we're training on a distill just like we did for normal chroma!!!
>>
File: r.jpg (101 KB, 848x1488)
101 KB
101 KB JPG
>>107953980
found more samples on their webpage:
https://qwen.ai/blog?id=qwen3tts-0115
>>
I'm using svi2pro and each time I want to continue my chain of clips, I have to rerun them all after a comfy reboot. They're all set to the fixed seeds, the nodes just doesn't have the latents to continue from, how do I solve this?
>>
Welp..
>>
>>107953117
I used klein to transfer style from a certain manga cover
>>
File: n3q.png (1.64 MB, 2048x2048)
1.64 MB
1.64 MB PNG
>>actually chroma was very soul at Revision #
>>no wait #48 is the actual soulful one
>>#50 (HD) is bad but wait!
>>there's also the flash heun model its good (its not)
>>and there's the HD flash merge too!!!! lol!! I swear this time it converges good!!!!
>>but you know whats really bad? its not the unfinished training... its just the.. UGH VAE!!!
>>yeah lets train a new vaeless chroma LMAO, RADIANCE!
>>*retard spams the general for weeks with his absolutely melty/cooked gens*
>>uhmmm no radiance is good !!!!
>>but WAIT, ackshually radiance can be fixed with this x0 version
>>ehh but you know what? we're moving onto z-image... what? waiting for base? lmao!!!! we're training on a distill just like we did for normal chroma!!!
>>
>107954292
filtered
>>
File: ce0.png (1.5 MB, 2048x2048)
1.5 MB
1.5 MB PNG
>>107954292
>filtered
>>
File: edit_13.jpg (2.48 MB, 2703x2703)
2.48 MB
2.48 MB JPG
>>
File: 1746800611838374.png (2.6 MB, 1280x1248)
2.6 MB
2.6 MB PNG
tfw no elf wife
>>
>>107953870
the quality so far of NSFW loras on Civit for F2K (pretty high) makes me think people are just doings something wrong training wise
>>
File: 1754072877775650.png (2.43 MB, 928x1664)
2.43 MB
2.43 MB PNG
>>
File: wat.png (12 KB, 632x234)
12 KB
12 KB PNG
>>107953810
they claim these hardware requirements but like, how the fuck can that be true for a 2B model?
>>
File: edit_14.jpg (2.68 MB, 2703x2703)
2.68 MB
2.68 MB JPG
>>
>>107954347
this is interesting though, I found this old Reddit post from January 2024:
reddit.com/r/StableDiffusion/comments/19bhl2q/best_of_linum_texttovideo_jan_20/

it looks like their original V1 discord bot model was not that bad for video models at that time
>>
I love 1 girls
>>
File: h7f.png (1.44 MB, 2048x2048)
1.44 MB
1.44 MB PNG
>I love 1 girls
>>
File: hmm.png (52 KB, 1098x305)
52 KB
52 KB PNG
>>107954400
yeah i probably can't run this shit but at least they're honest about how unoptimized it is I guess
>>
File: batch.png (174 KB, 1168x733)
174 KB
174 KB PNG
>>107953539
>>
>>107954465
the output if you turn on the "use vulgar slang and profanity" is such retarded dogshit, I dunno why it even exists
>>
>>107954494
yeah it must be a joke.
>>
>>107954337
so far i have only seen good style loras desu
>>
File: r.jpg (116 KB, 848x1488)
116 KB
116 KB JPG
>>107954347
IDK what they did but for example the more context you keep between frames [/ the more frames you can look back at and contextualize even in terms of latent space], the more you may need RAM?
>>
>>107954465
I keep getting assistant spam in the captions.
>>
File: a07.png (1.47 MB, 2048x2048)
1.47 MB
1.47 MB PNG
>LatentUpscaleModelLoader
>'config'"
>Show Report >Help Fix This
>>
>>107954674
HELP ME LTX VIDEO REPLY GUY!! HELP!!
>>
>update comfy
>webui is just blank white
:)
>>
>>107953928
Can it do erotic moaning?
>>
>>107954701
lmao, are you feeling comfy?
>>
pulling bleeding edge is a skill issue
>>
yet here you are, still using sdxl 3 years later
>>
Some people on civit really like to make these gigantic all in one workflows, surely none of the people here do that.
>>
File: Flux2-Klein-9b8fp_00046_.png (3.77 MB, 1920x1072)
3.77 MB
3.77 MB PNG
>>
File: rargh.png (59 KB, 506x775)
59 KB
59 KB PNG
Why does it keep looping assistants? There isn't even any setting for it.
>>
>>107954758
thats a big miku
>>
>>107950047
Goatse is eternal.
>>
>>107954758
cool gen. they should do this in reality.
>>
File: jeesee.png (136 KB, 874x911)
136 KB
136 KB PNG
Can you type a stop token manually into the system prompt or something so it stops right before the first assistant hits? Or do I look for the og system prompt in the source files and just delete it?
>>
>>107954764
>>107954832
Chat template is probably broken. I have no idea how it should look like and how to fix it for JoyCaption.
>>
>>107954764
Sorry no idea. Never gotten any assistant crap. Works flawless for me. Maybe you forgot some kind of sys prompt on?
>>
>>107954832
Maybe it's the GGUF.
>>
>>107954764
>>107954832
I only noticed this happening with gguf. It started when I was forced to install gguf version because the other model stopped loading properly for some unknown reason. This shit really needs an update. The example workflows are all broken too.
>>
File: lo8.png (1.68 MB, 2048x2048)
1.68 MB
1.68 MB PNG
>Setting `pad_token_id` to `eos_token_id`:2150 for open-end generation.
>>
What the fuck does this guy even mean? There are no full prompts at all for the images in the Klein prompt guide, just bits of suggestions for different things like lighting etc.

reddit.com/r/StableDiffusion/comments/1qlihrk/bfls_flux2_klein_official_prompting_guide_is/
>>
>>107954758
kek
>>
File: Flux2_01514_.png (3.21 MB, 1824x1248)
3.21 MB
3.21 MB PNG
me rn. its almost here folks
>>
File: 1766700205416983.png (276 KB, 2118x1154)
276 KB
276 KB PNG
comfy qwen tts
https://vocaroo.com/12szpObIHoJt
>>
how are you even supposed to prompt for heartmula? it keeps giving me country rock slop. i trying to get music instrumentals and voice tone similar to "And the Battle Is Going Again"
https://www.youtube.com/watch?v=L9McoPpo6Nk https://www.youtube.com/watch?v=wU3ur__geX0
>>
somehow it was a pain in the ass to get index tts running, but i like the results.
https://vocaroo.com/1d1ouCpPKDvl
>>
>had to make a backup of comfyui because I knew trying qwen tts would brick something
>out of storage
>have a 4tb ssd lying around just for this moment
>find my ssd cable and open pc case to install it all
>all ssd slots are taken
>have to transfer shit and empty ssd


This was not the saturday I wanted..
>>
>>107954211
that's neat, guess i underestimated klein.
which manga is it? i dig the style
>>
>>107953076
is that a local music model? I'm looking for a music model that runs locally.
>>
>>107954965
>comfy
nobody cares. not local diffusion
>>
>>107954965
lol thats pretty good
>>
Making a character LoRA for ZiT with12 GB VRAM .
Which software should I use? What dataset size is recommended, and what image resolution is preferred?
How many images do I need per costume and per camera angle for the following shot types?
Close-up: front, side, 3/4 view, back, above, below
Portrait: front, side, 3/4 view, back, above, below
Upper body: front, side, 3/4 view, back, above, below
Cowboy shot: front, side, 3/4 view, back, above, below
Full body: front, side, 3/4 view, back, above, below
>>
>>107954958
Tried a comic booky reprompting of this idea with Klein
>>
>>107955051
>cumfart faggot so desperate he comes to shill tts in an image general
they reek of desperation.
>>
>>107955051
this. no need to bring your dilation station ui up
>>
>>107955071
I'm no expert but I'd say balanced amount of everything except only a couple of full body pics so the model learns the body proportions. You can train zit with 512 resolution. Ai toolkit is noob friendly and onetrainer more advanced.
>>
>>107955115
who's that?
>>
>>107955131
cumfart
>>
julien is crying
>>
>>107955131
the guy ranfaggot is obsessed with. she smokes meth and dilates looking at this guy. she thinks its ani but that's just his coworker. unemployed nigger troon is failure at everything
>>
>>107955115
prompt?
>>
>>107955141
>that's just his coworker
so they both work at mcdonald's?
>>
>>107955153
sour grapes wasteman xd
>>
>inb4 3 hours of ani talking to himself and 90 deleted posts
>>
>>107955118
thanks,
>>
/adt/ is again raiding us? i'm tired of this
>>
>>107955166
you can write ani on discord if you want to talk to him. he hasn't been here for months stop bringing him up you obsessed junkie nigger
>>
>>107955177
tell ani I called him a faggot
>>
Why would /adt/ do this? What do they gain from trolling us?
>>
>>107955224
>>107955224
>>107955224
>>
ran is sucking off a mod right now to secure a schizo bake (everyone else who bakes gets banned)
>>
>>107955227
holy fake
>>
>>107954965
how did you get the nodes on the right to run? installed the custom node but it still says missing nodes
>>
>>107955245
Something to do with Sox package or something. I just installed the wrapper with pip and installed it on windows and added exe to path. Not sure if all that is required but it worked.
>>
>>107954965
NIce, can you control the speed?



[Advertise on 4chan]

Delete Post: [File Only] Style:
[Disable Mobile View / Use Desktop Site]

[Enable Mobile View / Use Mobile Site]

All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.