/g/ - Technology

Discussion of Free and Open Source Diffusion Models

Prev: >>107892557

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/tdrussell/diffusion-pipe

>Z Image Turbo
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo

>Flux Klein
https://huggingface.co/collections/black-forest-labs/flux2

>WanX
https://github.com/Wan-Video/Wan2.2

>LTX-2
https://huggingface.co/Lightricks/LTX-2

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>NetaYume
https://huggingface.co/duongve/NetaYume-Lumina-Image-2.0
https://nieta-art.feishu.cn/wiki/RZAawlH2ci74qckRLRPc9tOynrb

>Illustrious
https://rentry.org/comfyui_guide_1girl
https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
>>
Blessed thread of frenship
>>
File: 1764909011075340.jpg (3.99 MB, 4800x6912)
>>
>Maintain Thread Quality
>https://rentry.org/debo
https://rentry.org/animanon

What do these mean?
>>
>>107894984
pretty cool image anon, what model?
>>
>>107894997
the schizo anon didn't leave the general after being given multiple hints
>>
>>107895009
neta
>>
>>107894997
It means you are gay ani
>>
>>107894997
you're a lolcow
>>
>>107895012
?

>>107895015
What's a gay ani
>>
File: 1755682829164266.png (1.12 MB, 955x1447)
Still no base? Oh man I was expecting Klein to wake them up a little
>>
>>107893436
that guy tested FP8 Klein vs BF16 ZImage, and also only used 4 steps for Klein while using 8 for ZImage. It's a useless trash comparison for these reasons. There's no good reason not to test them at the exact same precision with the exact same number of steps and exact same sampler / scheduler. Anyone doing otherwise is either retarded or a disingenuous Chinese Z Shill. (And I like Z, don't get me wrong. I think people who feel you have to choose exactly one model at any given time are dumb as fuck.)
>>
>>107895024
Chinks lost
Chinese culture lost
>>
ltx2 video extend workflow is pretty fun, can clone voices and do whatever you prompt.

starting at frame 65:

https://files.catbox.moe/bcgutr.mp4
>>
>>107895038
buy an ad
>>
>>107895033
>And also only used 4 steps for Klein while using 8 for ZImage.
which are the standard default settings for each model? if BFL tells you it's supposed to work at 4 steps and it doesn't work at 4 steps then it's their fault, not ours
>>
>>107895038
did you implement the audio regularisation node shit?
>>
>>107895024
based renetta
>>
>>107895047
no he didn't, he is a schizo advertising his cobbled together workflow
>>
>>107895039
>no gen
>complaining about posts in a general about gens
great post
>>107895047
I just got the updated distilled model that they said has a fixed vae, and am testing stuff.
>>
File: elsacrush.mp4 (765 KB, 480x848)
no spicy
>>
>>107895040
I mean I expect people posting these sorts of things on fucking Reddit to be not retarded drones who blindly follow the default "recommendations" when it can easily be discerned that, "hey, these are both distilled flow-matching DiT models, thus they certainly will work with the exact same settings, so I should test them that way."
>>
>>107895061
>hey, these are both distilled flow-matching DiT models, thus they certainly will work with the exact same settings
only a retard would say something like that, Z-image turbo was distilled on 8 steps, and Klein was distilled on 4 steps, do you know it's different right? 4 is not 8, can you see that anon?
>>
>>107895038
https://files.catbox.moe/amaoqw.mp4
>>
File: 1750679923727940.png (1.95 MB, 1120x1376)
doresny
>>
>I am just testing
>>
>>107895081
ever hear what john wick could do to troons?

https://files.catbox.moe/35jds8.mp4
>>
>>107895053
sounds like he was right, i will just implement it myself
>>
>>107895102
shut the fuck up you third world schizo, just because you are too poor to buy a GPU or RAM doesn't mean you can cry about everyone else all day.

go jump off the roof of your favela.
>>
>Klein 4b base has the apache 2.0 licence
>It's way better than SDXL base and can do edit shit
I guess SDXL is officially dead right? regardless of whether we get Z-image base or not
>>
File: 552964.png (1.34 MB, 1665x1248)
>>
holy melty
>>
>>107895114
post 50 more videos about troons and blacks, they are le hilarious, i do wonder how you even got comfy running with your iq
>>
File: 1763801198562605.jpg (61 KB, 975x730)
>>107895128
you will never be a woman
>>
File: 654126588.png (1.71 MB, 1088x1088)
>>107895126
>>
>>107895128
>they are le hilarious
>le
troon ledditor spotted
>>
>>107895132
Indeed, just as you will never be a non retard
>>
>>107895121
>make the most delicious /d/eviant porn to drive my genitals crazy with sdxl
>struggle to get good hands with any of the new models
sdxl forever and ever
>>
>>107895121
it will take 1 big finetune but yes. 3 finetuners are already testing and say it learns absurdly fast. It's small enough for vramlets to run, and the flux 2 vae is far better than z image's flux 1 vae, so it will be a better base...
>>
>>107895024
they're trying to tease it so they'll release it, I wonder what the marketing idea behind that teasing is though, unless they're actually still finetuning behind the scenes
>>
>NOOOO you don't get it I MUST post hundreds of videos about trannies DAILY
>you WILL hear about TRANNIES every day and you WILL be HAPPY
>>
>>107895138
the distill has those issues, the base is great though and we can make our own speed up lora
>>
>>107895147
it's just a test bro
>>
File: 1748398002619234.png (245 KB, 1400x788)
>>107895147
who's forcing you to open up those catboxes anon?
>>
>>107895151
sometimes he makes funny stuff
>>
>>107895137
yes, we should contribute more to the thread by complaining about gens in the DIFFUSION GENERAL. that would be a great idea.

also, complaining adds so much to the thread, instead of constructive feedback. thanks for contributing to high memory prices by wasting RAM on your computer.
>>
>>107895145
>I wonder what the marketing idea behind that teasing is though, unless they're actually still finetuning behind the scenes
I have no idea wtf they are doing, I thought they were just waiting for Klein to be released to dunk on BFL again, and it didn't happen, what are they waiting for? fucking CHINESE CULTURE
>>
>>107895165
here is my constructive feedback, stop being a moron and gen good shit
>>
File: 2828.png (1.93 MB, 1088x1088)
Free access to mental health services would've prevented this
>>
>>107895170
>stop being a moron and gen good shit
where's your good shit anon? don't give advice when you can't even show an example
>>
>>107895169
>LTX2 has audio and is more useful than wan
>klein is better than qwen edit
>chinks keeping good stuff behind APIs
fuck chinkland
>>
>AttributeError: 'NoneType' object has no attribute 'to'
has anyone been able to use NAG with Klein?
>>
>>107895178
just check the collage ;)
>>
>>107895173
I'm leaning pretty right but for that specific shit I totally agree with libtards, we must make asylum great again!
>>
>>107895184
yeah
>>
>>107895187
fuck off retard
>>
>>107895024
>>107895145
who cares about it, just find something else to do. if they release it great, if they dont move on
>>
>>107895184
there's no NAG implementation of Klein yet
>>
>>107895196
damn, ok thanks anon
>>
has anyone created lora with butt cellulite for Chroma or Z? their skin is too smooth
>>
>>107895196
when are you planning to do it?
>>
>>107895205
I'm not a coding genius unfortunately :(
>>
>>107895145
They are cleaning the model from illegal stuff. (Illegal in China that is.)
>>
>>107895214
that came in a dream to you
>>
>>107895219
No it didn't
>>
>>107895204
>Z
>skin is too smooth
I think the opposite, sometimes it goes overkill with wrinkles
>>
>>107895219
Chinese culture, mate
>>
>>107895214
the ironic thing is that porn is illegal in china and legal in western countries, yet when you compare their models, they're both as cucked when it comes to nudity, really makes you think
>>
I wish there was a way to not have every skin so damn full of peach fuzz with klein
>>
>>107895204
Chroma probably needs no LoRA. Describe precisely what kind of ass you want and it will do it, can even do saggy tits if you ask for them.
>>
>>107895242
too much of a grey area. I am surprised flux risked putting out an edit model, with the whole grok thing that went on and is making people push for laws against ai edits
>>
>>107895204
for chroma, just use the existing flux lora https://civitai.com/models/497253/cellulite-ass-15orponyorxlorflux and of course some good prompting, terms like extreme or extremely help a lot.

t. slopper
>>
>>107895070
only a retard would say what you just said when it's trivially observable that Klein (much like Z Image and most other distilled models) works well with up to 10 steps or so before it starts to fry out
>>
>>107895257
>I am surprised flux risked putting out a edit model with the whole grok thing
same, it's really not the best time to release a good edit model, yet they did it, damn that Z-image turbo really hurt their ego bad they wanted to make a statement with this model lmao
>>
File: 1761519280498993.png (1.56 MB, 1536x1024)
>>
>>107895242
China: CCP
West: Investors
Shrimple as
>>
>>107895276
can you make it give birth
>>
>>107895272
>damn that Z-image turbo really hurt their ego bad they wanted to make a statement with this model lmao
and then they failed
>>
>>107895270
desu I didn't notice much difference between 4 steps and 8 on Klein, but if we go by your logic, if you want to go for 8 steps for klein, then we must allow Z-image turbo to go for 16, and ultimately the quality will always go in favor of Z so...
>>
File: 1750434173510867.png (1.32 MB, 1360x768)
replace the man in the blue shirt in image 1 with the anime girl in image 2 wearing a blue shirt, black jacket, and beige cargo pants, with her hands on her waist.
>>
>guys i need a celulite lora
>first guy to suggest one is a chromer
pottery
>>
>>107895292
you had one job
>>
lol so klein edit changes 2d penises to thumbs half the time
https://files.catbox.moe/nm5x1g.jpg
>>
File: Klein.png (1.9 MB, 1744x768)
>>107895280
investors don't exist in China?
>>107895288
>and then they failed
lol, it's an incredible edit model and klein 4b base will finally be the replacement of SDXL, sounds like success to me
>>
>>107895288
people who think z image is better are just shills. It is photo and aesthetic maxxed to the point where the whole thing collapses a few thousand steps in for any sort of training while the klein base is 95% of the way there without all that. A finetune will go crazy
>>
File: 1768136013387660.png (1.32 MB, 1360x768)
>>107895301
which is that?
>>
>>107895307
the model has no idea what a penis is, so it defaults to what it knows, if it looks like a finger it'll render a finger lol
>>
>>107895307
its early still but try >>107890346
its already learning nsfw pretty well this early in
>>
File: tic tac bretzel gwello.png (431 KB, 800x582)
>>107895310
>klein 4b base will finally be the replacement of SDXL
... until Z-image base gets released
>>
File: specs.png (25 KB, 759x303)
>>107895149
wtf are you talking about, the base lacks any aesthetic tuning. It can be better SOMETIMES for some stuff (assuming we're talking about T2I) at like 50+ steps but mostly it's aesthetically inferior by a lot. BFL themselves actively claim 9B Distilled, not 9B Base, to have the best prompt adherence, too.
>>
>>107895330
I'm using the 9B version sadly
>>
File: 1754610120446880.png (1.29 MB, 1360x768)
>>107895312
a big edit for you
>>
File: 1741818993972679.jpg (128 KB, 1300x1150)
First time posting here. Finally got nvfp4 to work on my ComfyUI setup and tested LTX-2. The cool thing is that the quality is not so bad at high resolutions (1920x1080). With loras I can only work at 1280x720, though, but I can still generate long-ass clips for some reason (managed to do 14 seconds in less than 30 minutes).

Could any anons who can run the heavier model/quants tell me how close the model gets to wan 2.2 when using loras, so I know whether it's worth training my own loras on it?
>>
>>107895331
unless they completely trained it from scratch z image will have a far worse vae which makes it worse by default if we want to make a full finetune

>>107895335
I was saying the distill had anatomy issues
>>
>>107895242
Something being "illegal" in china is not a big deal. Especially when it comes to the culture, because their rusted governmental machine has a hard time catching up with the rapidly changing times.
>>
File: 1740532825398230.png (1.59 MB, 1248x1248)
>>
>>107895335
>BFL themselves actively claim 9B Distilled, not 9B Base, to have the best prompt adherence, too.
that's because Klein 9b is using Qwen 8b, and Klein 4b is using Qwen 4b, that's a shame desu
>>
>>107895341
now make him give birth
>>
>>107895310
>investors don't exist in China?
Are you dense or otherwise challenged?
They are not the controlling force. It's the CCP. You can't do anything in China without party approval, you won't even get that far as to even meet investors.
>>
>>107895351
what lora re you using?
>>
File: zspecs.png (43 KB, 892x467)
>>107895331
I can't wait for the "I'm so disappointed by the quality of Z-Image Base" reddit posts by NPC retards who didn't grasp that Tongyi has made it VERY CLEAR the WHOLE FUCKING TIME that there was no reason to expect base to perform better out of the box, and that isn't the point of it (much like it's not the point of base Klein either).
>>
chinese culture vs germanic culture?
>>
>>107895361
>you won't even get that far as to even meet investors.
chat is this true?
>>
>>107895367
reminder that they first posted the quality as bad for base before the marketing team made them change it
>>
>>107895352
it has anatomy issues with the garbage trash default Comfy workflow (that doesn't even have working seed randomization). Use 8 steps like you would for Z, problem solved.
>>
>>107895310
Klein is slow and doesn't even reach the quality of z-image, which generates better pictures much faster.
>>
>>107895365
the furry one that got posted yesterday on civitai. it has 15000 steps and is quite heavy, so I figured it was the best one to test as it would be similar to what I want to train
>>
>the year is 2026
>BFL decided to become based and gave us a base model that could legitimately end SDXL's career
>the chinks are still making unfunny "Soon(TM)" memes on twitter >>107895024
the pendulum has swung so hard I feel like I'm in a different universe lmao
>>
ideal number of steps for distilled klein 9B? their rec of 4?
>>
>>107895356
what? both Klein 9Bs have better prompt adherence than both Klein 4Bs. The point I was making is they say Distilled 9B has better prompt adherence than Base 9B, both of those using Qwen 8B.
>>
>>107895386
8
>>
File: 1758565225309764.png (1.45 MB, 1360x768)
replace the plane in image 1 with the robot in image 2.

neat
>>
>>107895381
>Klein is slow
what? in my testing I found that klein 9b distill was as fast as Z-image turbo per step (I had ~1.6 s/it on both models)
>>
>>107895381
>slow
wat? are you yet another retard who can't grasp the difference between the undistilled bases and the distilled regular ones?
>>
>>107895371
China is basically dollar store germany
>>
>>107895367
>>107895377
I expect anons doing the exact same here, only with added "china is shit" "ccp is great" and other nonsense sports team bullshit.
>>
How well does Flux2 Klein retain original image quality?
>>
>>107895386
officially 4, but feel free to go for more and see what happens, personally I didn't notice much difference when going higher, that effect is stronger on Z-image turbo
>>
>>107894964
Can lora be trained yet on these Klein base? I want to train my imoutotv girls datasets on something better than sdxl.
>>
>>107895393
I am getting 1.4s/it with z-image
>>107895395
No, cunt.
>>
File: 1738081449934319.png (114 KB, 640x640)
>>107895404
I'm just rooting for whoever gives me the best model, if it's a bretzel I'll congratulate them, if it's a chink I'll congratulate them, hell even the kikes released a decent video model and I congratulated them
>>
>>107895382
at least post your shit here bastard
>>
>>107895389
>>107895408
ok, I'm trying either going higher or using res2s/4s with less steps
>>
>>107895414
when did we do anything bad against you?
>>
>>107895413
again, I was saying that klein 9b distill is as fast as Z-image turbo 6b, let that sink in, they managed to make a 9b model as fast as a 6b model, Klein is actually faster if you account the parameters
>>
File: randomanimeman.jpg (25 KB, 632x155)
>inhale the hopium
>exhale the copium
>>
File: 1753687187638197.png (1.47 MB, 1440x1072)
>>
I just wish Klein was less slopped, it still doesn't beat z's asian girls
>>
>>107895426
The model size has nothing to do with speed if you can fit everything into VRAM.
z-image gets a better quality than Klein. It's just a fact. Flux just can't get textures right.
>>
>>107895429
that guy masked off recently with how hard he was trying to slide klien / hype z base. actual paid shill
>>
File: oh god.png (1.53 MB, 1280x720)
>>107895436
>The model size has nothing to do with speed
>>
>>107895445
he is right though
>>
>>107895436
>z-image gets a better quality than Klein
klein gets far more styles right, knows more and is trainable. z image is literally over trained on good looking photos
>>
>>107895440
>actual paid shill
it's obvious he's too tied to alibaba to criticize them, if he gets critical of their models he won't be able to be an insider anymore, so I'll take this shilling if it means he can get information no one else can
>>
>>107895445
>he has no clue
Many such cases
>>
>>107895432
clearly aesthetically zimage asian girls look way better, but I'm surprised klein even made passable girls at this point
>>
>>107895453
>klein gets far more styles right
That doesn't matter if the textures look bland.
>>
File: OH GOD.png (53 KB, 168x300)
>>107895451
>calculating 9 billion parameters on the GPU is the same as calculating 6 billion parameters on the GPU
>>
>>107895386
6 - 10. 8 is a good default.
>>
I think klein 9b base could 100% become the new sdxl, we just need to ignore the license, lets see what happens, why is everyone so scared
>>
File: 1737835276027067.png (1.35 MB, 1360x768)
remove the plane behind the man in image 1. remove the men behind the man wearing a blue shirt. add the robot in image 2 far away the man in image 1, the robot is 100 feet tall.

fund it sunrise
>>
File: 1756632427291196.png (995 KB, 1120x1392)
>>
>>107895465
>parameters = size
anon...
>>
File: 1768527432239093.jpg (1.15 MB, 4580x1242)
reminder that z image is literally only good at realism
>>
File: OH MY FUCKING GOD.png (343 KB, 686x386)
>>107895475
>>parameters = size
yes?? you literally calculate the size by multiplying the number of parameters by the average bits per parameter, are you fucking retarded????
>>
>>107895465
Dude... learn LLMs before you try to act smart. You are not calculating 9 billion parameters. The 9 billion params need to fit into VRAM and then the tensors get calculated that are relevant to the context. That's what the attention is for.
>>
is Mr catjak here? I have a question
>>
>>107895483
getting angrier won't change reality, calm down
>>
>>107895480
are you one the girls? please be in lodon
>>
>>107895483
Melty
>>
>>107895491
I literally explained to you that size = (number of parameters)X(average of bits) so obviously the size is related to the number of parameters, why are you so retarded anon?
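
The size formula quoted in this post can be sketched in a few lines. This is a minimal illustration that assumes dense weights only (no text encoder, VAE, or activations) and uses the round parameter counts mentioned in the thread; `weight_size_gb` is a hypothetical helper name, not part of any library.

```python
def weight_size_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate size of the weights alone, in gigabytes (10^9 bytes).

    size = (number of parameters) x (average bits per parameter) / 8
    """
    return num_params * bits_per_param / 8 / 1e9


# Round parameter counts taken from the thread (assumed, not measured).
models = [("9B model", 9e9), ("6B model", 6e9), ("4B model", 4e9)]
formats = [("bf16", 16), ("fp8", 8)]

for name, params in models:
    for fmt, bits in formats:
        print(f"{name} @ {fmt}: {weight_size_gb(params, bits):.1f} GB")
```

This only covers how much memory the weights take up; it says nothing by itself about seconds per iteration.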
>>
File: Comparison.jpg (2.69 MB, 2496x1872)
>>107895406
pretty well IMO, even at higher resolution. This was a 1248x1872 input, output at the same resolution, with "The man is now wearing blackface. Everything else is exactly the same." on Klein 9B Distilled, with 8 steps.
>>
>>107895484
>You are not calculating 9 billion parameters.
you are, is this the low IQ timezone or what? what time is it in india at the moment?
>>
>>107895406
it's all right, but you can still see the differences, that'll never be resolved until they finally get rid of the VAE and go for some pixel radiance X0 shit mumbo jumbo, lodestone was right all along
>>
File: 1753233767238681.jpg (587 KB, 2725x768)
replace the blonde anime girl in image 1 with the anime girl in image 2 in the same pose.

very good, using 8 steps since it's already so fast.
>>
>>107895505
Melty
>>
>>107895497
i think we can all agree you lost
>>
File: 1747169600770559.png (2.19 MB, 992x1568)
>>
>>107895470
4B could too quite frankly, it's not THAT much worse than 9B when running both at a reasonable step count
>>
>>107895497
gora
>>
>>107895514
I'd like it to keep the style of the original drawing though, I've heard that going with megapixels > 1 can help with that
>>
>>107895531
i noticed going too high makes it shit out slop
>>
>>107895531
it can go pretty high, you should almost always just try to output at literally the same size as the input image
>>
>>107895470
>why is everyone so scared
imagine spending 200k on a finetune and then having to spend an additional 400k on a probable lawsuit against BFL (and obviously losing), no one will take such a risk
>>
File: 1759213098971548.png (488 KB, 392x504)
>>
>>107895550
as long as you aren't in europe you will be fine, you really think Trump is going to allow germans to sue americans with the current climate?
>>
File: 1752784298070946.png (3.64 MB, 1472x1488)
>>107895531
>I'd like it to keep the style of the original drawing though
then don't say "replace character 1 by character 2", say something like "the girl from image 1 has the outfit of the girl from image 2" and it'll work better
>>
>>107895470
>we just need to ignore the license
sure, name 1 person who will throw away a couple hundred grand for us
>>
>>107895560
can you do it? please
>>
>>107895556
lodestone lives in europe, and even if you make your finetune outside of europe, huggingface and civitai won't allow your model to be published, so it's instant DOA
>>
File: 1751699086780306.png (2.16 MB, 1072x1440)
>>
File: 1749489106249070.png (1.59 MB, 1360x768)
neat, now it swapped both, didnt say to use the same pose though.

replace the blonde anime girl in the middle with the anime girl in image 2, in the same artstyle.
>>
>>107895568
the day of torrent-style sites for ai models has been a long time coming, with civ censoring loras
>>
>>107895465
the retard is shitting with you anon
>>
>>107895574
ooo a private tracker for models i like it
>>
File: 1753622435252015.png (3.64 MB, 1280x1632)
>>
>>107895552
prompt? this rocks
>>
>>107895545
is there no rule about size needing to be a multiple of x?
>>
>>107895584
A ballpoint pen illustration on blue paper, traffic on a freeway at night, heavy cross-hatching, scribbled texture, etching style. Limited color palette: cyan, black, hot pink, and yellow. High contrast, distinct line work, moody atmosphere, lo-fi aesthetic, graphic novel art style. --ar 4:5
>>
>>107895589
i don't remember
>>
>>107895556
It's not about where the person lives, it's about what big platforms like civitai and hf do about it, and they will comply with any hissy fit BFL makes.
So spicier 9B finetunes are out of the question.
LoRAs will probably be fine unless it's celebs or cunny artist styles.
>>
File: image.png (218 KB, 1024x576)
Can someone do this one?
>>
File: 7.png (1.01 MB, 1088x960)
>>
File: cat (2).gif (688 KB, 300x289)
>>107895418
Well, it's nvfp4 and I don't think anons are gonna enjoy it because it belongs on /trash/, but sure

this is out-of-the-box cartoon, no lora, nvfp4, with a low-res image to start

https://files.catbox.moe/2o458a.mp4


This is my first test of a British accent and realism + lora for nsfw. As this is a blue board, it's zoomed in so the nsfw stuff is not visible, which makes it look way blurrier. Also, the nsfw stuff is pretty bad, but I think that's down to the lora being overtrained and me not being able to use better quants/weights or a better scheduler than euler simple

https://files.catbox.moe/bgn511.mp4

same style as the previous one, but now it lasts 13 seconds

https://files.catbox.moe/z4hr1p.mp4


Overall they are pretty shit and I doubt anyone here wants to see this stuff; here is a cat for your troubles. I figured it was worth a shot asking here before I rent a pod to train loras for it
>>
https://www.reddit.com/r/StableDiffusion/comments/1qftepq/you_are_making_your_loras_worse_if_you_do_this/
I got only partway through this AI slop post before throwing my hands up in rage. Specifically to this part:
>When you train a LoRA on "downward dog pose" and your captions mention "brown hair, purple mat, minimalistic studio, natural light, Canon EOS R5" you're entangling all of that with the pose
It is QUITE LITERALLY the exact fucking opposite of that. If you caption your image "downward dog pose" and nothing else, you're telling the model that caption means to generate everything in the image. The short caption becomes entangled with not just the pose, but anything else that tends to be in images of that pose. Only by describing things in more detail does the model disentangle the pose from everything else in the image.

It is the Year of Our Lord 2026 and people still don't understand the very basics of how to train these models. It is never, ever going to change and at this point I've just given up trying to help people.
>>
>>107895622
are you making a lora for birthing
>>
File: 1764814317284564.png (2.57 MB, 1504x1024)
>>
File: 1768071222565262.jpg (74 KB, 752x1003)
>>107895628
Not really, I don't train loras for specific stuff. I just train them on styles so the out-of-the-box distilled model understands certain styles and stops eating the motion or expression whenever it's not doing realism
>>
>>107895624
I guess the models are big enough for both ways to work nowadays
>>
File: Flux2-Concat_00082_.jpg (3.21 MB, 4096x2048)
>>107895589
It handles resizing automatically, so you might end up with something which is 1080 in and 1072 out.
I only downscale if I'm working with something gigahuge, it also upscales really nicely too
>>
File: Capture.png (160 KB, 1132x833)
>>107895624
so we are stealing reddit comments now??
>>
>>107895645
proof?
>>
>>107895658
he wrote it twice, one time there one time here, big deal
>>
File: 20.png (1.93 MB, 1088x1088)
>>107895647
When you don't caption, the model ends up knowing what to do despite that, but the quality will always be worse. Training a lora is no different than training the model from scratch in this regard: everything that is not captioned becomes implied
>>
>>107895674
samefag
>>
File: 1753237448555748.png (1.6 MB, 1072x1440)
>>
>>107895690
makes sense
>>
>>107895690
So am I right or wrong?
>>
>>107895702
samefag
>>
>>107895624
anon is a retard that doesn't know about regularization images.
>>
>>107895718
what does nag do?
>>
File: 1740571217240902.png (228 KB, 2301x1198)
>>107895624
good caption of something specific in an image > no caption of that element > bad caption of that element
>>
File: 1766676214161670.png (2.16 MB, 1024x1520)
>>
>>107895624
This guy is obviously regurgitating advice he read somewhere and hasn't done much training himself.
>>
>>107895709
You are right that it will work, but that should always be disclaimed as "not correct but functional", people assume what works is right and then when stuff goes wrong they go through insane hoops trying to fix it and breaking other stuff in the process, which leads to the plethora of bogus theories and methods about how to train stuff.
>>
>>107895738
I knew it, I'm always right
>>
>>107895624
what about people using trigger words like : bl0wj0b or s3x
>>
>>107895730
>nipples in a blue board
uh oh
>>
>>107895753
they're also blue
>>
>>107895753
it's fine the nipples are blue too
>>
>>107895730
can you fix the inside of her mouth
>>
>>107895647
With a single concept, and enough diverse images, you can often get away with short captions. But it can go wrong.

Suppose half of your downward dog images have a blue vase in the corner. If your captions are just "downward dog pose" the model learns that half the time there should be a blue vase in the corner. It got entangled, the model has a tendency to make a blue vase when prompting for that pose.

If you caption those images "downward dog pose, but also there is a blue vase in the corner", the model learns to only generate a blue vase when you specifically mention a blue vase (because that's exactly how it was captioned). It disentangled the concept. Basically what >>107895690 said: "everything not captioned becomes implied".
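
The blue vase argument can be illustrated with a toy count. This is only a co-occurrence tally over a made-up eight-image dataset, not a simulation of diffusion training; the captions, the dataset, and the helper `vase_rate_by_caption` are all hypothetical.

```python
from collections import defaultdict


def vase_rate_by_caption(dataset):
    """For each distinct caption, return the fraction of its images containing a vase."""
    counts = defaultdict(lambda: [0, 0])  # caption -> [images with vase, total images]
    for caption, has_vase in dataset:
        counts[caption][0] += int(has_vase)
        counts[caption][1] += 1
    return {caption: with_vase / total for caption, (with_vase, total) in counts.items()}


# Half of the hypothetical images have a blue vase in the corner.
images_have_vase = [True, False] * 4

# Short captions: every image gets the same text, vase or not.
short = [("downward dog pose", v) for v in images_have_vase]

# Detailed captions: the vase is mentioned only when it is actually there.
detailed = [
    ("downward dog pose, blue vase in the corner" if v else "downward dog pose", v)
    for v in images_have_vase
]

print(vase_rate_by_caption(short))     # the bare pose caption co-occurs with a vase half the time
print(vase_rate_by_caption(detailed))  # the vase appears only under captions that mention it
```

With short captions, the pose text is the only signal available to explain the vase, which is the entanglement described above.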

Also I just realized that ledditor is the Pyro guy who makes the shitty overfit SDXL porn models. Not exactly a preeminent ML researcher here lmao.
>>
>>107895770
Now in english
>>
File: 1768305936434867.png (2.19 MB, 1717x752)
>>107895571
>>107895514
>Replace the dress of the girl from image 1 with the dress of the girl from image 2, keep the pose and the art style the same
>>
Thanks for all those flux/zit/qwen comparisons, anon. I understand that there's no best overall model, each one has its weaknesses, which is why I'm still gonna wait for ZIM.
>>
File: Flux2-Klein_00060_.png (1.48 MB, 1280x816)
>>
File: va3.png (2.33 MB, 2560x2560)
What's the go-to for models paywalled by flaming ferrets on civitai?
>>
File: 1737487027290583.png (1.65 MB, 1360x768)
>>107895781
did this but with teto:
>>
>>107895770
yeah I agree, in an ideal world you'd describe everything accurately, but at some point it's probably too many details
"there is a blue vase in porcelain with slight cracks in its upper corner, the vase is small"
>>
File: 1755926272355712.png (1.68 MB, 1360x768)
>>107895825
and this iteration got asuna + teto dress.
>>
File: 1740582060501048.png (2.13 MB, 928x1664)
oo
>>
>>107895825
>>107895837
try to increase the input resolution, go for 1.5 mp, I'm noticing it keeps the original style more like that
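For picking the actual numbers, a minimal sketch: it keeps the aspect ratio, targets a pixel count, and snaps each side to a multiple of 16. `scale_to_megapixels` is a hypothetical helper, "1 MP = 1,000,000 pixels" is an assumption, and the multiple of 16 is taken from the VAE discussion in this thread rather than from any official spec.

```python
import math

def scale_to_megapixels(width, height, target_mp=1.5, multiple=16):
    """Return (new_width, new_height) close to target_mp megapixels,
    keeping the aspect ratio and snapping each side to `multiple`."""
    # Assumption: 1 megapixel = 1,000,000 pixels.
    scale = math.sqrt(target_mp * 1_000_000 / (width * height))
    new_w = max(multiple, round(width * scale / multiple) * multiple)
    new_h = max(multiple, round(height * scale / multiple) * multiple)
    return new_w, new_h

# One of the input sizes posted above.
print(scale_to_megapixels(1360, 768))
```

Snapping to the nearest multiple shifts the aspect ratio slightly, so resize and then crop if the exact ratio matters.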
>>
>>107895817
none. the only time i visit is if anon trains a cool one and uploads it there instead of catbox. i only check out a few profiles. maybe huggingface
>>
>>107895568
Lodestone said on his discord he's gonna try an experimental 4B Base Klein tune I think
>>
>>107895874
that's because 4b klein has the apache 2.0 so there's no problem with that model
>>
>>107895589
doesn't seem to be. Or at least all my images were already conventional multiples of whatever. You can shave a couple pixels off your input where needed to make it divide better by 2 or something if you really want.
>>
>>107895589
>is there no rule about size needing to be a multiple of x?
yes, it must always be a multiple of 16, that's what the vae can handle
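Shaving an input down to that is just modulo arithmetic. A minimal sketch, assuming the multiple-of-16 rule stated above holds for your model (`crop_box_to_multiple` is a hypothetical helper, and the box uses the left/top/right/bottom order that most image libraries take for cropping):

```python
def crop_box_to_multiple(width, height, multiple=16):
    """Return a centered (left, top, right, bottom) crop box whose width
    and height are the largest multiples of `multiple` that fit."""
    new_w = width - width % multiple
    new_h = height - height % multiple
    left = (width - new_w) // 2
    top = (height - new_h) // 2
    return left, top, left + new_w, top + new_h

# 1145x1835 loses 9 columns and 11 rows, split across both edges.
print(crop_box_to_multiple(1145, 1835))  # (4, 5, 1140, 1829)
```

Cropping like this loses at most 15 pixels per side, where resizing to the nearest multiple would resample the whole image.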
>>
How do I change the Klein fp8 loading in the Comfy default workflow? As it stands (with this sub-graph shit) it only wants to read that file and nothing else, but I want to use the larger bf16 model.

Undo the sub-graph and redo the entire workflow, or what?
>>
>>107895833
you need to describe things in a way the model already understands, this is why too many tags can also be a bad thing. like those civit loras that describe every single piece of an outfit and it just creates a big activation tag that breaks the moment you leave one thing out.
>>
>>107895882
yeah I know. There's no problem with 9B either, at least as far as lora training goes.

Additionally, Pixelwave is an example of an actual finetune of Flux Dev by a guy who did not, in fact, have commercial interests and thus wasn't violating the license:
https://civitai.com/models/141592/pixelwave

A small handful of SAAStards with active specific commercial interests are 90% of who gives any kind of shit about the license stuff.
>>
>>107895902
>Additionally, Pixelwave is an example of an actual finetune of Flux Dev by a guy who did not, in fact, have commercial interests and thus wasn't violating the license:
pixelwave probably wasn't expensive to make and he didn't use 5 million images like Chroma did
>>
File: 1767020398013453.jpg (744 KB, 1856x1664)
>>
File: Comparison.jpg (1.81 MB, 4872x1184)
base has better prompt understanding (because of CFG) but the quality is ass, NAG Klein can't come soon enough
>>
>>107895833
You only need to describe what you want to control, what you want to be disentangled. If half your images have a blue vase, and you don't want blue vases to randomly appear unprompted, then those images better mention a blue vase in the caption. If you don't care about the details of the vase, then that part doesn't need to be described.
>>
File: 1757140612774702.png (2.33 MB, 1440x1072)
>>
>>107895891
Never mind, I'm retarded and didn't see the fucking thing at the bottom of the node.
>>
>>107895384
>could legitimately end up SDXL career
lmao ok nigga calm down
>>
File: lo9.png (2.98 MB, 2560x2560)
>>107895624
>2023+3
>people still don't know how to make loras and are fighting over the basics
lol
>>107895703
>>107895655
neat
>>
File: 1752540129410827.mp4 (3.79 MB, 2048x1228)
>>
>>107895954
SDXL career will soon be ended, but not by Klein
>>
File: 1753943522670295.png (1.66 MB, 1360x768)
Replace the dress of the girl from image 1 with the dress of the girl from image 2, keep the pose and the art style the same

this model is fun stuff.
>>
File: 241.jpg (154 KB, 1024x1024)
Love Brazilian pussy and now with AI, I can coom and coom and coom and never stop cooming
>>
>>107895977
ai slop
>>
File: oh4.png (2.96 MB, 2560x2560)
>NEXON
>>
>>107895985
puto
>>
>>107895974
agreed, hopefully we get something this year
>>
File: lmao.png (3.92 MB, 2436x928)
>>107895934
At 50 steps Klein base 9b kinda got it... but that image is cursed on so many levels!
>50/50 [06:41<00:00, 8.03s/it]
>>
File: 1748722826894198.png (1.57 MB, 1120x1376)
>>107895961
ty
>>
>get amazing german engineered model with barely any censorship built in, great license too
yet you bastards want to wait for z base, fuck you
>>
>>107895891
any difference in quality between the fp8 encoder and the bf16?
>>
File: 1737955694370732.png (838 KB, 1920x1080)
>>107895999
>steps Klein
>>
>>107895655
>>107895883
>>107895589
I can automatically go with 16, so I'll let it do that
>>
>>107895920
Left side is cooler but these are all pretty neat.
>>
>>107896007
Yeah
>>
>>107895934
the color mismatch is too much, it should get the palette from ba
>>
>>107895994
i would coom in them both till my balls explosed and never worked again.
>>
meow
>>
File: Flux2-KleinEdit_00045_.png (3.45 MB, 1448x1448)
2D isometric perspective view of the building in Image 1
>>
File: 1740400381304063.png (1.2 MB, 1088x944)
the man in image 1 is holding a framed picture of image 2.

mein pippa
>>
>>107896027
yeah I agree, it's just a base model so it's not that good on image quality, but it shows the potential, Klein can be better than what we currently have
>>
File: 1743295843322324.mp4 (3.83 MB, 1638x2048)
>>107896003
>>
File: Flux2-KleinEdit_00059_.png (2.21 MB, 1371x1523)
2D isometric perspective view of the scene in Image 1, 45 degree angle, plain background
>>
>>107896104
qwen edit could never
>>
>>107896096
fridge bearing hips
>>
>>107895743
That might still work. Stable diffusion models don't actually learn the exact words. You can omit the vowels and the loras can still have an effect.
>>
retard here. since Klein can do edits does that mean if you finetune it you'll need to include edit pairs as well?
>>
File: 1738888476836942.png (2.16 MB, 2409x1022)
Klein is lucky it's an edit model, it's the only thing it's good at, as a text2image model it gets destroyed by Z-image turbo
>>
File: 1759283252011307.mp4 (3.97 MB, 2048x1706)
>>107896070
>>
File: 1739734469863958.png (1.15 MB, 1088x944)
change the face of the man in image 1 to the face of the man in image 2.

literally me-in
>>
File: 1750396334289546.png (1.08 MB, 1088x944)
>>107896146
>>
File: 1767748088861994.png (81 KB, 1323x385)
>>107896146
>>107896153
try to go for simple scheduler + a lower shift and see if it makes it less slopped
>>
>>107896135
i didn't realize how fun editing can be if the model is actually good at it. tried qwen edit before and gave up on it very quickly
>>
>>107896162
it's gonna be even more fun with Z-image edit (which I expect to be better than Klein)
>>
>>107896060
Where's the water fountain?
>>
File: 1747907156060792.png (1.09 MB, 1088x944)
>>107896156
dont have that yet, tried res multistep
>>
>>107896174
I hope the license is better at least, and maybe even nicer if it knows what a dick or nipples are from the get go.
>>
>>107896128
No one knows for sure since this is the first worthwhile local model that does both.
I don't think that's strictly necessary, but how well the knowledge would transfer between the two domains, I have no clue.
Like if I make a good t2i bob vagen lora, would it also improve bob vagen in edit mode? To the same degree?
>>
File: 1758338686196892.mp4 (3.94 MB, 2048x2048)
>>107896058
>>
So, I assume Klein edit doesn't do nsfw like qwen "remove clothes" etc? Also, anyone run a test transforming anime into photo-real, thank you
>>
Is flux2's vae that much better than flux's vae? they claim it is but I don't think Klein is keeping the original input's details better than Kontext for example, there's still a shift in colors and shit
>>107896183
Alibaba has always released its models under the Apache 2.0 license so it'll probably be the same for that one
>>
>>107896189
>Also, anyone run a test transforming anime into photo-real
I did, it works, it just has no idea for nsfw stuff or haram body parts
>>
File: 1742523558033674.png (1.08 MB, 1088x944)
>>107896182
back to euler: here is the fent fuhrer:
>>
>>107896178
sold off to fund the greenland invasion
>>
>>107896210
keek
>>
>>107895384
You lot are forgetting Tongyi was given the full datasets used in NetaYume & NoobAI, the only thing keeping SDXL alive
>>
>>107896189
>So, I assume Klein edit doesn't do nsfw like qwen "remove clothes" etc?
It's poisoned in that regard. 9B one is noticeably better with nudity than 4B. The boobs aren't very good still and nether regions are Barbie dolls.
>>
>>107894964
What's the best model for doing NSFW with an accurate artstyle to a show atm? Mostly thinking about anime
>>
File: 1761343038101865.png (1.6 MB, 1088x944)
frame it
>>
File: 1761734382938115.png (2.42 MB, 2325x1069)
man, QiE is such slopped garbage, Klein is by far the best local edit model now
>>
>>107896225
Damn, that sucks. Is the texture quality good at least? I still use QIE for anime to photo-real, with a personal workflow that works really well, but still not as realistic as Z. When Zedit...
>>
>>107896186
very nice, i like it
>>
File: 1742467311047196.mp4 (3.8 MB, 2048x2048)
>>107895992
>>
>>107896225
>The boobs aren't very good still
zero nipples for me
>>
File: Klein 9b.jpg (650 KB, 3328x928)
>>107896248
it's not at Z-image turbo's level, but it's way better than QiE that's for sure
>>
>>107896255
skill issue
>>
>>107896245
i made it
>>
File: file.jpg (1.12 MB, 2342x1272)
>>107894964
/lmg/ is so gay lol
>>
>>107896260
yeah
skill issue
ai slop
Now in english
samefag
proof?
>>
>>107896273
uh oh meltie
>>
File: Flux2-KleinEdit_00074_.png (2.5 MB, 1145x1835)
>>
File: hd3.png (2.83 MB, 2048x2584)
>>107896252
neat. Wan?
>>
>>107896273
he's kind of obvious isn't he
>>
File: 1759788137975013.png (1.65 MB, 1088x944)
>Berlin, 1944:
>>
File: merged.png (2.12 MB, 1365x1536)
>>107896189
Maybe it works with more detailed prompt engineering, but just "Turn this anime image into a real life photo." doesn't seem to work well.
>>107896255
4B? Skill issue if you are using 9B. 9B also occasionally refuses to do nipples across seeds but I got a lot of it still.
>>
Fresh when ready

>>107896297
>>107896297
>>107896297
>>
>>107896270
haha i member that one https://desuarchive.org/g/thread/107887535/#107887535
>>
>>107896270
/lmg/ has nothing happening except that endless meme vocaloid spam in the year 2026 of our lord for the sake of it.
>>
>>107896058
>Juggernaut
brave
>>
do the anons here share a singular brain cell or something?
>>
>>107896186
average wan 2.2 experience


