/g/ - Technology






Discussion of Free and Open Source Text-to-Image/Video Models

Prev: >>106981016

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/sd-scripts/tree/sd3
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts
https://github.com/tdrussell/diffusion-pipe

>WanX
https://comfyanonymous.github.io/ComfyUI_examples/wan22/
https://github.com/Wan-Video

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
Training: https://rentry.org/mvu52t46

>Neta Lumina
https://civitai.com/models/1790792?modelVersionId=2298660
https://gumgum10.github.io/gumgum.github.io/
https://huggingface.co/neta-art/Neta-Lumina

>Illustrious
1girl and Beyond: https://rentry.org/comfyui_guide_1girl
Tag Explorer: https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/b/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
>>
kek, that VH:D gen is alright
>>
>>106988377
yeah it's slopped and there's minor other flaws in places but it's generally coherent, like I said Qwen with no Lora will give you something fucking hilarious that's not even vaguely close to correct for the same prompt

>>106988462
so is Qwen by default, that's not really the point here lol
>>
Blessed thread of frenship
>>
File: 00041-2433127780.png (2.42 MB, 1080x1920)
>>106988433
kino

>>106988382
figured i'd shift gears from my usual (huge breasts, wide hips:1.5) subjects for a little bit.

>>106988479
VERY cute. what did you prompt to get that little heart sticker on her cheek, or was that an accident?
>>
File: 1755589005101673.png (3.06 MB, 1344x1728)
>the chinese release an uncensored model
>it's slopped and 80 FUCKING GB
>>
>>106988458
can i do anything with a GTX 750 TI w/ 4GB and CUDA 11.1?
>>
>>106988523
>hunyuan 2.1 doesn't exist
>>
Only base model gens below this line (I'll allow a single lora but only if you trained it yourself)
-------------------------------------------------------------------------------------
>>
>>
File: 00032-700683388.png (2.68 MB, 1248x1824)
>>106988507
the heart on her cheek is not intentional. it just appears randomly because of the "heart shape hand" or "heart shape eye pupils" in the positive prompt section.
https://files.catbox.moe/zigbch.png
>>
File: 1744555857344505.jpg (149 KB, 1024x1024)
>>106988526
You can make utter slop with outdated SD1.5 models like me!
Can't tell ya what settings to use though...
>>
new yume making SDXL look antiquated desu
>>
File: 00046-1383837398.png (2.55 MB, 1080x1920)
>>106988581
oh shit nigga, just use "heart hands". Never had this happen to me using that tag.
never used heart pupils though, but it helps to simplify your tags and triple check that they're actually booru tags, going accidentally natural language has those side effects.
https://files.catbox.moe/yezzoy.png
>>
>>106988591
i'm noob and have no idea what that means
>>
>>106988526
you can punch yourself in the balls
>>
best facefix at the end of wf? is the regular facedetailer node enough?
>>
File: ComfyUI_00233_.png (1.03 MB, 912x1144)
>>
>>106988622
With a good enough model, prompt, and settings you need not a face detailer
>>
>>106988615
what about cpu? i saw comfyui will only use 1 core, but i have 2 x 12-core cpus
>>
>>106988523
this was just a test. don't worry, they already have a better version. of course, for the api only. kek
>>
File: 1732675826142867.jpg (239 KB, 1024x1024)
>>106988612
IDK ask Grok or something.
This is the coolfag high roller's club!
>>
>>106988526
No, unfortunately. I think 1060 is bare minimum.
>>
>>106988646
>>106988657
kthx
>>
>>106988657
>>106988666
3060, ideally ti with i think 12gb of vram, would be bare minimum. i don't think the 2000 series even gets a single optimization.
>>
>>106988680
i have a 2080 ti just not installed. you saying that won't even work? i bought it to play with vgpu
>>
Chroma vs Qwen
If I want to put irl women in unpredictable situations
Which is better?
>>
File: x.png (3.41 MB, 864x1488)
>>106988526
i think you'll be basically running on cpu. get a new card.
>>
>>106988698
gemini'd it
>SageAttention 1.x and the Triton kernel in SageAttention 2 have been reported to work on RTX 20 series (Turing architecture) cards, typically using specific versions of the Triton dependency.
>>
File: 00044-183534503.png (2.83 MB, 1824x1248)
>>106988538
don't remember, i did post hot spring partially submerged gens of her but got a 3 day vacation.
>>106988603
perfect
>>
>>106988732
thank you. i'll look into that
>>
File: 00048-3785879175.png (2.63 MB, 1080x1920)
>>106988741
more advice if you wanna give it a try, run an eyes only adetailer pass and in the prompt just do eye color + heart pupils, works pretty well. (though i think my setup just needs a few more steps.)
>>
>>106988680
1060 is enough for SDXL based. Slow, but doable, good results if you put in effort.
>>
>>106988706
Qwen if you want prompt adherence and accurate details
Chroma if you want to convince yourself that nonsensical noisy artifacts are “analog realism”
>>
>>106988706
Qwen will do anything (minus nsfw) that you can think of. Only drawback is that there is literally no seed variety. The image you get is what you get. No alternatives.
>>
https://wccftech.com/amd-officially-launches-radeon-ai-pro-r9700-at-1299/
>32gb
>1299 dollars
wtf I love AMD now!
>>
File: 1737996549308516.jpg (183 KB, 1024x1024)
>>106988680
So you're saying this might not be a skill issue?
>3060ti 8gb VRAMlet
>tried every possible combination of samplers, schedulers, and refiners in SwarmUI
>it's always slop.
>>
>>106988599
wtf am I doing wrong with neta yume? I'm getting acid trip artifacts from it
https://files.catbox.moe/kyg9n3.png
>>
File: 00007-947958130.png (2.8 MB, 1080x1920)
>>106988803
I'm sorry bud, but it is a skill issue too. You can absolutely get by on 8gb for imagegen, especially if you can run sage attention.
Hey everyone starts somewhere.
>>
>>106988706
qwen sucks with the same scenes. I tolerated it before alibaba's betrayal. now I only use chroma. I always use wan, because I have no choice.
>>
>>106988801
And only 1/8 the speed of a 4060!
>>
>>106988801
>$700 less than the 5090
>have to deal with the endless jank that is using AMD for AI
we're good
>>
File: 00092-4287040057.png (2.18 MB, 1152x1440)
>>
File: ComfyUI_00239_.png (1.2 MB, 1024x1024)
>>106988591
>>
>>106988801
you can be swimming in vram, if you don't have the compute, it won't be helpful
>>
File: 1752757581433648.jpg (619 KB, 1040x1520)
>>106988819
>sage attention.
intredasting...
I will look into this.
Here is a sloppy cyborg for your troubles.

>>106988834
Is this real?
>>
>>106988309
Wait, that's without any lora? If so that's pretty cool, I don't remember any non anime nsfw model out of the box like that for years now, if ever.
I had no idea the model was able to do that. How big was it? And is the new 80b one also able to do that?
>>
>>106988523
>could not possibly have read my comment that started this conversation at the end of the last thread
>>
>>106988706
either for a base image, then qwen edit + qwen edit remove clothes lora if you want lewds.
>>
File: ComfyUI_00240_.png (1.42 MB, 768x1360)
>>
>>106988819
what model gives this style?
>>
>>106988854
A hunyuan 2.1 finetune is the future, not qwen. No need to unfuck it if it already has a grasp on fundamentals.
>>
>>106988849
Good luck bruddah.

>>106988860
look how you massacred my wife.. the flux chin..
eh not bad. would still cum.

>>106988863
the metadata here >>106988603
>>
>>106988876
saved for future studies, thanks
>>
>>106988801
id rather buy 3x3090 for 72gb with more than double the vram speed
>>
>>106988826
the pro cards are in a much better place than the rdna cards, at least for LLMs. I'd be curious what hoops you'd need to jump through to get it working in comfy, I think rocm should take care of most of it. That said, compute on that thing is laughably low.
>>
>>106988309
I suspect this model may be better than Qwen due to its variety issue.

Also, can those with access to 80B test the same prompt?
>>
So jeet is already shilling hunyuan 2.1 after getting bored of neta, chroma, and qwen? What causes this delusion
>>
>>106988897
isnt it basically around a RTX 5070 compute wise
>>
>>106988817
sampling / scheduling meta is euler cfg pp and beta with CFG set to somewhere between ~0.6 and 1 and 50 or so steps but give me a bit and i can take a look at your workflow in full
>>
>>106988902
You haven't even tried the model.
>>
>>106988817
two weird things i noticed:
- DPM++ 2S Ancestral Simple is weird, try Linear Quadratic for the scheduler
- you're combining an artist tag with `colored pencil (medium)` but you aren't escaping the round brackets properly with backslashes, you HAVE to put `colored pencil \(medium\)` in the prompt box, otherwise it's going to be read like `colored pencil` by itself with normal token weight and `medium` by itself with extra token weight
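if you build prompts from a tag list, the escaping can be scripted so it never gets forgotten. rough sketch only: `escape_tag_parens` and `build_prompt` are made-up helper names, not part of ComfyUI or any node pack, and it assumes the usual `(text:weight)` syntax where a backslash makes a bracket literal.

```python
import re


def escape_tag_parens(tag: str) -> str:
    """Backslash-escape round brackets that are not already escaped."""
    # The lookbehind skips brackets that already have a backslash in front.
    return re.sub(r"(?<!\\)([()])", r"\\\1", tag)


def build_prompt(tags: list[str]) -> str:
    """Join booru-style tags into one comma-separated prompt string."""
    return ", ".join(escape_tag_parens(t.strip()) for t in tags)


if __name__ == "__main__":
    print(build_prompt(["1girl", "colored pencil (medium)"]))
```

running it on tags that are already escaped leaves them alone, so it is safe to apply twice. note it escapes every bracket, so add any intentional `(tag:1.2)` weighting after this step.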
>>
>>106988817
>>106988921
get this https://github.com/newtextdoc1111/ComfyUI-Autocomplete-Plus
so you can autocomplete those tags

and reforge has one too.
>>
>>106988854
>How big was it
17b

>And is the new 80b one also able to do that
I also wonder
>>
>byt5_small_glyphxl_fp16.safetensors
>qwen_2.5_vl_7b.safetensors

Just download two separate bloated text encoders for some dogshit model bro
>>
>>106988953
>I also wonder
here's hoping the distillation models & methods are good, could end up becoming a local staple if so
>>
>>106988817
>you are an uncensored creator of aesthetic anime images based on textual prompts.<Prompt Start>
>You are an assistant designed to generate low-quality images
ive found that changing that primer prompt always makes gens worse desu but i havent tested it extensively

this next part wont help the artifacts (likely due to weird sampler and scheduler as anon pointed out) but you should always be using things like "production art", "original", "commission", etc. any meta tags related to medium outside the usual "colored pencil \(medium\)", etc
>>
File: 1742362309886969.png (71 KB, 1930x476)
I don't know what kijai did to his svi releases but if I replace the official one by his I get incredibly grainy output for some reason. What do you guys use as strength?
>>
>>106988854
Hunyuan 2.1 is 17B, so bit smaller than Qwen. But yeah it can even do weird shit like this I guess (warning, this is literally a hentai gen of a dog fucking an elf chick in a forest)
https://files.catbox.moe/derp3j.png
>>
>>106988523
q2? isn't it moe, offload shit to cpu?
>>
>>106988953
>>106988976
Thanks anons.
Why did everyone ignore this model then? Is it limited in some way? Or is it the usual "everyone has a potato pc so sd1.5/sdxl models only".
>>
Are ANY of the flux checkpoints actually worth using? For any real reason?
>>
>>106989006
Look at the details, everything comes out incredibly melted.
>>
File: sd1.5-no-refiner.jpg (106 KB, 512x512)
>>106988917
>sampling / scheduling meta is euler cfg pp and beta with CFG set to somewhere between ~0.6 and 1 and 50 or so steps
directions unclear
slop caught in ceiling fan
>>
>>106989009
do you love butt chins? if so I have great news for you
>>
>>106988902
I literally made the "I'M COOOOOMING" gen in the collage with NetaYume earlier today, but I'm also the person who initially mentioned Hunyuan 2.1 in the last thread, and I'm also a white Canadian guy. So you're just wrong about everything kek
>>
>>106989016
I should have written "ignore this model as a base".
>>
>>106989030
Not sure. Wonder if there's any lora training support.
>>
Why the fuck does it seem like there are so many leafs into this hobby?
>>
>>106989006
IDK why honestly, it came out shortly after Qwen and people might have just overlooked it. It's also not clear why Hunyuan released it and then also the 3.0 model in such a short frame of time but that's neither here nor there I guess.
>>
>>106989042
because north mexico is half indian?
>>
>>106989052
wait was training code even released? that may be why
>>
Is there any trick to prompt two things happening in a wan video in succession?
"She waved her hand, then went inside the car."
First action is always super fast for me.
>>
>>106989006
Because it’s shit that can’t even compete with 2024 APIs. Nobody cares to train this bloated crap, especially after the waste of money known as chroma
>>
File: 00062-621390453.png (2.12 MB, 1080x1920)
>>106989018
i want muh text and better backgrounds so bad man.
>>
>>106989094
But it's completely uncensored, unlike the hyperslopped Qwen.
>>
>>106989006
Because it's bloated and looks like plastic, so we all assume it's not uncensored. Interesting to see that it's not fully censored, still, unless you can get a proper photoreal tune to do the same thing, this model doesn't hold a candle to Chroma. Looking at
>>106988309
Small details like her eyes also look a bit noisy for a model that's supposed to be the base. Not saying the fact that we're now dealing with a base model that can do porn isn't special, but this model would require further tuning to be useful, and who knows how well that holds up at that size.
>>
>>106988854
>And is the new 80b one also able to do that
I don't remember any significant examples outside of the official showcases
80b is hard to run, it's basically an LLM more than an image model, so I don't think many people played around with it to test how good for nsfw it is
>>
>>106989126
Also what happened to SRPO, does that work with this model? And would something like

https://github.com/ClownsharkBatwing/RES4LYF

Help remove the blur and sloppiness? It's worth a try.
>>
>>106989101
That doesn’t matter at all. Were you one of the retards saying SDXL could never do porn because it didnt contain the same 70~ or so nude art images in the dataset that sd1.5 did?
>>
>>106989151
It matters quite a bit. There's a difference in wall time between having to teach a model the concept of genitals from scratch, and having to alter existing knowledge of genitals.
>>
genitalia doesn't sound very safe, I feel unsafe
>>
>>106989164
The amount of time spent on that is nothing compared to teaching it to actually be coherent. Just look at chroma, it understood nudity by epoch 7 yet 50 epochs later it still can’t handle fine anatomy properly. The most important part of a base model is fidelity and coherence, not being able to render a plastic boob.
>>
File: 00066-2229791331.png (2.22 MB, 1080x1920)
>>106989170
just imagine, pontificate in your mind's eye if you will.
a PERFECT milfy pussy lined with the right amount of pubic hair, the SMELL of it. She's not showered in 16 hours, and she just ran a mile.
oooh yeaaahh. It's unsafe alright, and that's the THRILL OF IT.
>>
>>106989186
You leave out the important part that chroma was essentially a finetune from scratch, due to the aggressive de-distillation that rendered a ton of the pre-existing knowledge useless. Not to mention the questionable training methods. The gen shared earlier by an anon showed Hunyuan can (with nitpicks) already generate coherent genitals. The issue of the absolutely destroyed fine details is something else entirely. Not to mention the lack of any training code, I'd be curious to see how easy/hard it'd be to de-melt the model
>>
just imagine, plastic sdxl slop for 3 more years.
a PERFECT 4ch vae 3b model. Not updated in 5 years.
oooh yeah, it’s local alright, and that’s the STATE OF IT.
>>
File: ooh im killing myself.jpg (56 KB, 679x544)
>>106989247
why are you like this
>>
just because one or two anons really love their 2.5D hypersloppa doesnt mean we all do
>>
File: 1742907160757555.png (3.88 MB, 2725x768)
Replace the anime girl with Hatsune Miku.
>>
>>106989279
ask it to match the lighting
>>
File: 1730749030855882.png (1.3 MB, 1360x768)
>>106989279
or, remove the girl from the image (after her camo activates)
>>
>>106989293
you didn't leave her brap cloud
>>
more hun 2.1 examples please
>>
>>106989317
just go back in the archive to when flux came out, it looks the same anyway
>>
neta doesn't know who pepe is, but it is getting very capable/usable. I think v4 or v5 of this model could replace noob.
>>
File: 1744559189543832.png (1.38 MB, 1360x768)
>>106989286
Replace the anime girl with Hatsune Miku. keep the lighting and artstyle the same.

I could prompt darker lighting but it still works
>>
>>106989368
looks basically the same as before, disappointing
>>
File: 454565454545.jpg (184 KB, 2138x1217)
>>106989101
It looks like the slopped AI porn images it has seen in its dataset. When you ask it to do something novel that you'd expect an uncensored base model to nail (E.G. pic rel, which Chroma does right away), it shits the bed.

Prompt was simple

>Amateur photograph, a cute Japanese alt emo woman standing, with short, dark hair from a low angle, extending her bare foot toward the camera. She wears a ribbed top and plaid skirt, holding a glass with amber liquid. Indoors, adding a delicate contrast to the edgy, artistic composition.

I really don't think this thing has seen real porn. Its dataset is pure synthetic slop. Remember, that everything that Chroma can do, it can generalize and you can add things to it, like a proper base model.
>>
LMAOOOO >>106989230
>>
>>106988817
You can make the prompt even better but this is a start. https://files.catbox.moe/u5dpt5.png

>>106989352
The author states his dataset includes both e621 and Danbooru but it doesn't really feel as kino pilled e621 wise as Noob desu.
>>
File: 1742593383783257.mp4 (2.22 MB, 720x1280)
>>106989098
>>
>>106989148
Clownshark samplers work for all flow-matching models basically (and to a large extent SDXL and SD 1.5 ones too)
>>
>>106989412
Wonder how the 80b monstrosity would perform on this
>>
File: 1742395813679848.png (841 KB, 928x1120)
the man is holding a magazine with one hand, with the title "UNATCO". Below the title on the magazine is the text "how to spot an illuminati operative" and a man that looks exactly like him, wearing the same sunglasses, with the same expression.
>>
>>106989421
holy retardation
>>
>>106989421
>Steps to Reproduce
>use it

the unbridled, schizophrenic rage. we've all been there, i've felt it inside. so fucking based and unhinged oh my god. this reads like someone's manifesto.

>>106989425
thank you for blessing my senses, off to yoink my shploinker now.
>>
>>106989425
>load up wan
>slop bounce
>slop twerk
>eeup it's 1girl, jiggly time
>>
>>106989456
and it's glorious
>>
>>106989245
Kohya added support for Hunyuan Image 2.1 recently:
https://github.com/kohya-ss/sd-scripts/tree/sd3

It also supports Lumina 2 arch models like NetaYume but you'll want this PR:
https://github.com/kohya-ss/sd-scripts/pull/2225
>>
>bro chroma is great for nsfw
>try it
>it's all body horror slop
>>
File: huel ghat damn.gif (1.92 MB, 228x261)
>>106989456
eeeyup.
>>
>>106989477
you've learned the hard lesson, this general is only populated by paid shills, don't listen to their praise it's not genuine at all
>>
>>106989473
>Kohya
this trainer is the literal definition of intuitive
>>
File: 1760785913423827.png (901 KB, 928x1120)
>>106989447
the man in image1 is holding a magazine with image2 on the cover. keep the expression of the man in image1 the same.
>>
>skill issue: the post
>>
>>106989497
>keep the expression of the man in image1 the same.
its funny how qwen edit users cope with this prompt. cant change anything without severely fucking up the image
>>
>>106989352
`pepe the frog` is an actual Booru tag but there's only 263 entries at least on the main Danbooru site
>>
>>106989507
need a model trained on basedbooru
>>
>>106989488
idk if you're being sarcastic or not or what this is supposed to have meant quite frankly
>>
File: 1742998534524122.png (810 KB, 928x1120)
>>106989506
works on my pc
>>106989507
if you want infinite pepes just use qwen edit, any pepe image can be transformed into a new pepe of any type.
>>
>>106989528
whoops i meant unintuitive lol
>>
File: 00161-296091506.png (2.66 MB, 1280x1920)
>>
>>106989529
have his expression change without destroying the style
>>
Ldg wouldn't be the same without the turbo autist who posts multiple dozens of gens of the same image with minor variations
>>
File: 1742645313604497.png (529 KB, 1136x912)
>>106989529
like so:

the green cartoon frog is wearing a blue tshirt and red shorts, and is sitting at a computer with a white CRT monitor that says "LDG" on the back. On his bag is a bag of potato chips that says "SIPS".

source is just a regular pepe
>>
>>106989547
and says "neat" after adding miku to the same three images over and over again like its something anons never seen before
>>
File: 1750712472215838.png (498 KB, 1136x912)
>>106989555
it even worked with a bad prompt (on his bag is a bag)

the green cartoon frog is wearing a blue tshirt and red shorts, and is sitting at a computer with a white CRT monitor that says "LDG" on the back. On his desk is a bag of potato chips that says "SIPS".

see? even better.
>>
>>106989566
the body was drawn too well. doesn't match the sovl of the face
>>
File: ComfyUI_temp_hrgxs_00005_.png (2.69 MB, 1152x2040)
>>
File: 1758834163580699.png (697 KB, 1136x912)
>>106989571
the pepe was just a headshot, if you want to be specific then prompt it (skinny, etc).
>>
>>106989578
>helo honey this is me come meet me
>>
>>106989571
He's unironically unable to understand what you mean per his reply.
>>
>>106989578
THIS IS NOT MY BEAUTIFUL WIFE
>>
>>106989583
>skinny
the sovl of the poor drawing not girth of his body
>>
>>106989595
well it's up to you to prompt style specifics to your preference. in any case, it can generate pepes, how they appear is up to you.
>>
File: ComfyUI_00244_.png (3.32 MB, 2160x1736)
>>
>>106989609
>well it's up to you to prompt style specifics to your preference
>implying it would actually listen
kek
>>
noobai and illustrious are still the ultimate local models
>>
>>106989412
Chroma can't even do the blowjob prompt in a way that looks realistic or with as much prompt adherence, she's sucking off the wrong dude:
https://files.catbox.moe/6gnraw.png

Most of the NSFW in Chroma definitely is not porn photos, it's certainly almost all 2D and 3D content
>>
>>106989620
It's pretty impressive to make that model look that bad. Even asian footfag's doesn't look that shitty kek.
>>
>>106989537
I think it's user friendly enough with the GUI:
https://github.com/bmaltais/kohya_ss

I can't think of a trainer that's more so really
>>
File: 1girl_00002_.jpg (580 KB, 1224x2144)
>>
File: ComfyUI_03614_.png (3.36 MB, 1496x1000)
>>
File: 1740110448806986.png (1.51 MB, 1080x1819)
>>106989279
>>106989368
this is my fetish

>>106989497
this too

>>106989412
close second
>>
File: ComfyUI_00248_.png (1.13 MB, 832x1248)
>>
>>106989542
>jacket on shoulders
kino
>>
File: ComfyUI_00249_.png (2.19 MB, 1280x1920)
>>
tell me about the schizo, why does he img2img?
>>
>>106989749
>>106989762
not sure why these look like shit
>>
>>106989529
>>106989555
I want to make pepes that look different
>>
File: 1744702139285578.jpg (1.21 MB, 1416x2128)
>>106989352
sometimes it gets close, other times it does not
>>
File: 1748592215048104.jpg (1.05 MB, 1416x2128)
>>106989784
meant to post picrel
>>
>>106989630
that's literally what Chroma looks like if you give it any remotely complex prompt for a lot of subjects, it oversaturates and gets increasingly less realistic.
>>
>>106989317
https://files.catbox.moe/3drgrs.png

i don't have time to mess with it but this is without a refiner, then throwing it through a .16 denoise with sdxl and then flux which obviously doesn't know tits. the coherence really isn't bad, and with a refiner it's probably kinda dope
>>
If your 10b+ model still has to be refined with superior SDXL, why bother?
>>
File: ComfyUI_00250_.png (2.37 MB, 1248x1872)
hmm better
>>
>>106989583
I like your examples. Here's
>Photorealistic wide angle full-body shot of the standing subject from head to toe, from the left facing left, against a plain white wall background.
>>
File: ComfyUI_00012_.png (2.86 MB, 2496x1216)
>>106989412
>Amateur photograph, a cute Japanese alt emo woman standing, with short, dark hair from a low angle, extending her bare foot toward the camera. She wears a ribbed top and plaid skirt, holding a glass with amber liquid. Indoors, adding a delicate contrast to the edgy, artistic composition.


why did you lie?
>>
File: ComfyUI_01236_.png (324 KB, 1136x912)
>>106989819
F, picrel
>>
>>106989795
Hunyuan 2.1 doesn't officially support 1 megapixel FYI, they straight up say on the huggingface page that 1 megapixel "might cause artifacts" and give a list of base resolutions to use for different aspect ratios. 1 megapixel might work for some stuff but you probably shouldn't have any expectations.
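for a quick sanity check of how far a given size is from 1 megapixel, the pixel count is just width times height. minimal sketch: the `megapixels` helper is made up for illustration, and the sizes are only file dimensions that show up in this thread, not Hunyuan's official resolution list.

```python
def megapixels(width: int, height: int) -> float:
    """Return the pixel count of an image in millions of pixels."""
    return width * height / 1_000_000


if __name__ == "__main__":
    # Dimensions taken from file listings in this thread, for illustration only.
    for w, h in [(1024, 1024), (1080, 1920), (2496, 1216)]:
        print(f"{w}x{h}: {megapixels(w, h):.2f} MP")
```

check the result against whatever base resolutions the model page lists before blaming the model for artifacts.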
>>
File: 849455894894.jpg (286 KB, 2394x1345)
>>106989412
On different seeds, there's different performance, but it's still slopped.

>Amateur photograph, a Japanese idol woman, performing an advanced contortion pose at a bench in a barn. She is sitting on a surface with her legs bent backward and extended over her shoulders, so that her feet are positioned and touching over her head, displaying an impressive level of flexibility.

>There's a rooftop rope attached to both of her ankles and duck tape on her mouth.

>A long white towel is draped over her entire front for modesty. She has straight black hair with bangs.

>>106989620
Skill issue. Hunyuan doesn't follow my very basic prompts in a coherent manner, so it's slopped. Crazy how now the ceiling is at a model that we could only dream of having back in Dalle days (which is arguably still the only APIshit model that could do anything like it in terms of coherence prior to its censorship).
Anyways, for a proper base model performance, look at what Chroma can do. Why should a base model do less? I would give it points if it could do it even while slopped. No idea about Qwen, it can probably pull this off even if it spits back the same image. Another point. Chroma can generalize and give me entirely different images in different settings. It also obeys my command if I say the girl has to be naked.
>>
Hidream, flux, qwen, hunyuan
It’s all the same plastic garbage. Why even bother arguing over which is superior? Is local really so far behind that these are the only options?
>>
File: ComfyUI_01237_.png (425 KB, 832x1248)
>>106989815
>>
>>106989836
>spends all his days making wall of texts to defend a meme model
what kind of mental illness is this?
>>
File: reso.png (86 KB, 772x662)
>>106989832
this is what I meant:
>>
>>106989836
can you post a catbox of right without the towel? i want to make sexo with her
>>
File: 00102-3735262145.png (1.95 MB, 1280x1920)
>>106989752
the tag is so good, I need to use it more often
>>
>>106989836
>Hunyuan doesn't follow my very basic prompts in a coherent manner
sounds like a skill issue
>>
>>106989618
>>106989806
trvth nukes
>>
>>106989836
Chroma doesn't look like your gens do AT ALL unless you're using some kind of over the top Clownshark sampling workflow, I don't know why you're pretending like it does lmao
>>
>>106989858
gun and briefcase look good too nice work anon
>>
>>106989865
nah NetaYume is a great anime model that as far as I can tell is only gonna keep improving. Like I was saying earlier nothing else open source is remotely as comparable to NovelAI 4.5's overall capabilities
>>
File: ComfyUI_00016_.png (3.61 MB, 1792x2304)
>>106989854
good catch anon, thanks
>>
this entire thread is just jeets defending their failbakes, claiming the “true model” accessible by “skilled prompters” is actually way better than what’s posted here. meanwhile not a single one even benches in the top 15 on any arena. localkeks are a unique breed of pathetic
>>
dam. youtube is full of this crap.
>>
>>106989821
Try different seeds. It messes up. A proper base model wouldn't be this bad. At the end of the day, the fact they cut corners in training really shows. Tencent is a massive corporation. That's not their 100%. That's their failed bake that they're giving us.
>>
*yawn*
>>
>>106989887
That face is a horrific blend of SDXL mixslop and fluxplastic
>>
>>106989423
whoa thanks btw that unfucked my gen
>>
>>106989904
Kek, this. Chroma is shit but at least it’s NSFW. The rest of this crap is so bad.
>>
What makes anon seethe and mald so hard about Local Diffusion? I don't get it.
>>
>>106989909
>>106989913
you are very very upset about a gen that is so much better than chroma and i haven't even touched the settings yet. the base output of this shits all over chroma dude, i'm sorry you're taking it personally.

is it an easy model to get right out of the box? no. i've only done five gens and it's still unconfigured. do they all look way better than chroma shit? YES.

i probably won't use this model myself, but lets be real
>>
>>106989913
Chroma does the same if you look at it wrong. It literally was trained off of SD 1.5 hyperslop gens
>>
So the conclusion is all local models are shit?
>>
File: Just_try.png (1.85 MB, 2048x2048)
>trying out the new ditto style transfer model
>Works great and can transfer clips really quickly
>Want to run a whole video through it and restitch it all together when it's done
Are there any nodes for quick looping? Basically just want to hit run once and have the workflow select intervals of 74 frames at a time so I can create a full length video.
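not a node, but the interval selection itself is plain arithmetic and could be scripted outside the workflow. minimal sketch, assuming you can feed a start index and a frame count into whatever loads the video: `frame_chunks` is a hypothetical helper, and only the 74-frame chunk size comes from the post above.

```python
from typing import Iterator, Tuple


def frame_chunks(total_frames: int, chunk_size: int = 74) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) frame index pairs covering the clip, end exclusive."""
    if total_frames < 0 or chunk_size <= 0:
        raise ValueError("total_frames must be >= 0 and chunk_size must be > 0")
    for start in range(0, total_frames, chunk_size):
        # The last chunk is shorter when the clip length is not a multiple of chunk_size.
        yield start, min(start + chunk_size, total_frames)


if __name__ == "__main__":
    for start, end in frame_chunks(200):
        print(f"frames {start}..{end - 1} ({end - start} frames)")
```

each pair would be one queued run, and the outputs get concatenated in order afterwards. the last chunk may be shorter than 74 frames, so check that the model tolerates that.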
>>
>>106989984
>ditto style transfer model
link?
>>
>>106989908
the only difference now is the slop is synthetic
>>
File: file.png (301 KB, 463x689)
>>106989984
>try
>time and effort required:
>result: maybe

>dont try
>time and effort required:
>result: all the free time in the world to do whatever you want
>>
>>106990002
>try
>time and effort required:
*>time and effort required: insane
>>
File: 1girl_00045_.jpg (844 KB, 1224x2144)
>>
>>106989793
I have yet to encounter a single thing that Chroma can't do. Whereas for Hunyuan I encountered one thing it does half-assed, and another it can't do at all.

>>106989972
You posted heavily noised Flux Schnell tier slop, thanks for proving my point.
>>
Chroma can’t render a decent looking image
>>
>>106990013
>a single thing that Chroma can't do.
Make a coherent picture without fucked up anatomy
>>
>API shill upon seeing localchads enjoying what they have
>>
>>106990013
>I have yet to encounter a single thing that Chroma can't do
Do a POV perspective of someone sitting beside the viewer on a couch. I can't get chroma to do it.
>>
>>106989992
https://editto.net/
>>
anon must be tired after casting all this b8
>>
>>106990026
he hasn't had his right arm worked this hard since the fappening!
>>
>>106990013
You are a chad for trying to beat some sense into these skill-lets
>>
Chroma is decent but any moderately complex prompt turns it into visual diarrhea
>>
File: ComfyUI_01239_.png (807 KB, 920x1136)
>>106990019
>>
File: 1754899263049103.gif (40 KB, 220x220)
>nogen browns seething in the thread literally 24/7/365 since the very beginning about any and all local models being shit while i have 2000 gigakino images queued to generate with four different models as i eat, read papers, and watch the gen bar filling up
prompts for this feel?
>>
>>106990088
The things I gen are too spicy to be posted here
>>
>>106990088
>1man, comfy, unbothered, not a slave to SaaS
>>
>>106990112
you forgot masterpiece :^)
>>
But really are there any easy looping nodes that just return a bigger number? Seems so basic.
>>
File: ComfyUI_03634_.jpg (2.35 MB, 1824x1248)
>>
>slopjeet browns shitting up the thread with plastic 1girl slop literally 24/7/365 since the very release of sd1.4, waiting 70 seconds for their turboquanted qwenslop to generate the same generic image regardless of the seed all while i have 1000s of gigakino 4k seedream gens and 60 second studio-quality Sora 2 gens
don’t even need a prompt for this feel, GPT can generate one for me
>>
>>106990120
wow...... xo butiful
>>
>>106990119
Open manager, search "loop", find one that works for you. Or vibecode your own.
>>
>>106990121
based and apipilled
>>
how unsafe is it to use custom nodes in comfy?
>>
>>106990172
ComfyUI-Manager is already phoning home so if you have that installed you already have a virus.
>>
>>106990172
The only nodes you need are API nodes, which are included with ComfyUI linked in the OP
>>
>>106988801
this is going to be a very good value AI card, but with the same chip as a 9070 XT, will it be slower than a 7900 XTX at image gen?

has anyone done a recent benchmark of their 9070? Say with Lumina or SD35m so you don't run into VRAM limits? I'd assume the 7900 XTX is faster, but rdna4 has improved AI support supposedly.

This pic took 30 seconds for 8 steps. btw if you're on AMD, USE THESE FLAGS:
>TORCH_ROCM_AOTRITON_ENABLE_EXPERIMENTAL=1 MIOPEN_FIND_MODE=FAST python main.py --use-pytorch-cross-attention --bf16-vae
it will unfuck your initial gen times on SDXL models by skipping a shitty compile phase
https://rocm.docs.amd.com/projects/MIOpen/en/develop/how-to/find-and-immediate.html#find-modes
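
If you'd rather not retype the flags every time, here is a small launcher sketch. The environment variables and ComfyUI arguments are copied from the command above; `build_launch` and `launch` are hypothetical helper names, and it assumes you run it from the ComfyUI directory with `python` on your PATH.

```python
import os
import subprocess

# Environment variables and arguments taken verbatim from the command above.
AMD_ENV = {
    "TORCH_ROCM_AOTRITON_ENABLE_EXPERIMENTAL": "1",
    "MIOPEN_FIND_MODE": "FAST",
}
COMFY_ARGS = ["main.py", "--use-pytorch-cross-attention", "--bf16-vae"]


def build_launch(python="python", base_env=None):
    """Return (command, environment) without starting anything."""
    env = dict(os.environ if base_env is None else base_env)
    env.update(AMD_ENV)
    return [python, *COMFY_ARGS], env


def launch():
    """Start ComfyUI with the AMD settings applied."""
    cmd, env = build_launch()
    subprocess.run(cmd, env=env, check=True)
```

Calling `launch()` is equivalent to the one-liner; keeping `build_launch` separate just makes it easy to print and check what will be run.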
>>
File: ComfyUI_07751_.png (2.04 MB, 1152x1152)
>>106990020
>Do a POV perspective of someone sitting beside the viewer on a couch. I can't get chroma to do it.

Close enough. Depends on what exactly you mean though.
>>
>>106990181
>ComfyUI-Manager is already phoning home
source?
>>
>>106990223
>Close enough. Depends on what exactly you mean though.
The viewer's legs should be in frame, as if a camera was strapped to his head and someone is sitting beside them. Best I got with chroma was someone directly in front, none to the side. Seedream could do it... but it was very gacha.
>>
I'm the guy who started the conversation about Hunyuan Image 2.1 and I literally never said it WASN'T slopped lol, I just thought it was interesting that it was capable of some complex NSFW concepts right out of the box.

This is Qwen on the same prompt with the Lora I mentioned earlier anyways:
https://files.catbox.moe/rbiirz.png

And this is Qwen on that prompt on the same seed without the Lora:
https://files.catbox.moe/lf4hap.png

Basically it's not really difficult to train Qwen on actual photos and get good results, I don't suspect it would be difficult to train Hunyuan Image 2.1 either since it's also not distilled, that was pretty much my original point, I don't think the way they look out of the box is that important.
>>
>>
File: ComfyUI_07744_.png (1.97 MB, 1152x1152)
>>106990237
This was my first result before I optimized my prompt, I do think it is possible.

Here's what that was
https://files.catbox.moe/nqxfbj.png

And what it is right now
https://files.catbox.moe/l9rnet.png

You can modify depending on your needs, I'm sure it can do it.
>>
>>106990272
>I'm sure it can do it.
I feel like a POV lora would be a better use of time considering the failures Chroma was netting me. I'll give it another shot though
>>
>>106990272
she looks underage
>>
>>106990251
Qwen is easily the best local model available. It learns NSFW concepts faster and more coherently in a day of training than chroma did in 6 months. An actual Qwen finetune would be insanely powerful
>>
>>106990283
chroma schizo is a notorious pedo
>>
File: 1759427970044861.gif (1.13 MB, 640x392)
>>106990283
kek
>>
>>
>>106990251
>Lora I mentioned earlier
link?
>>
>>106990269
nice
>>
File: mr jamison 1.png (291 KB, 432x486)
>>106990294
>PEDO PARKER
>>
>>106990226
wireshark
>>
>>106990296
>when you have to stow away your fat girlfriends dildo because guests are coming over
>>
>>106990345
lmao
>>
>>106990300
I haven't released it yet, I may still add more images and re-train just to tighten it up a bit more. The current one can do two girls / one guy, two guys / one girl, or one girl / one guy blowjob stuff all pretty coherently though.
>>
>>106990352
would love to have it whenever it's ready
>>
>>106990251
If only qwen had seed variety...
>>
File: ComfyUI_07760_.png (1.61 MB, 1152x1152)
>>106990281
Actually, I realize what you mean with the whole POV thing here's another attempt from a generic image.

https://files.catbox.moe/6gz9rz.png

It's possible, just depends on what you're going for. When in doubt, feed the image to gemini as that's what it was trained on.
>>
>>106990290
you can at least tell he really is one specific person because he only seems to gen "muh azn gymnast waifu" type stuff
>>
>>106990409
That's wayyy closer, good shit. In my experiments I was aiming for something like this, but with the POV viewer's head turned towards whoever was seated there. Appreciate the experimentation you've done.
>>
File: ComfyUI_00258_.png (2.42 MB, 1248x1872)
>>106989841
>>
what's the best way to install Nvidia drivers on debian 13?
>>
>>106990468
sudo rm -rf /
>>
File: 1741005643525784.png (39 KB, 846x213)
https://civitai.com/models/1901521/pony-v7-base?dialog=commentThread&commentId=985535
Incompetent grifter won't even release his synthslop shitpile out of shame
KWABEROONI
>>
>>106990484
thank you that cleared it up
>>
File: 1739595053908648.mp4 (1.75 MB, 720x880)
>>
>>106990524
That was the right choice.
>>
>>
>>106990547
got dam
>>
File: 1749862025689049.png (1.47 MB, 1344x896)
>>
>>106990547
>MFW my face gets eaten
>MFW this thing ate my face
>MFW I have no face
>>
>>106990358
yeah it'll be on Civit too
>>
File: ComfyUI_07778_.png (2.4 MB, 1152x1152)
>>106989856
Exact image/girl is not possible. Area where Qwen Edit could help. Due to changing one token, Chroma makes variation.
https://files.catbox.moe/q8393g.png

Not the prettiest, could use a innie pussy LoRA.

>>106990460
Np anon, I'm sure it's possible with the right engineering
>>
>>106990524
he's clearly joking lmao, if you go on the pony discord he's been actively working on making lora support and stuff work properly
>>
File: ComfyUI_00282_.png (1 MB, 912x1144)
>>
>>
File: ComfyUI_00285_.png (980 KB, 1408x736)
>>
>>106990365
my prompts are never short enough for that to matter too much personally
>>
File: AniStudio-01981.png (1.92 MB, 1024x1344)
>>
For the new 2.2 light loras, does it matter if you use it separately or does the quality degrade if you don't use the merge?
I saw a comparison video and the quality was better with the merge, but he only had one seed going.
>>
File: ComfyUI_00288_.png (1.28 MB, 888x1168)
>>106990740
>>
>>106990762
the fukkin glasses. I kek'd
>>
>>106988680
this is complete BS if you're talking about SDXL lol. Any Nvidia card from a GTX 1660 Super or so upwards can do XL fine
>>
File: ComfyUI_00293_.png (1.16 MB, 1448x720)
>>
File: 1729823449742510.mp4 (2.48 MB, 720x720)
>>
File: ComfyUI_00297_.png (1.08 MB, 1448x720)
everyone's done with genning now huh
>>
File: AniStudio-02009.png (2.32 MB, 1024x1344)
>>106990795
cute!

>>106990860
I'm just done with abi madness for now. still have to redo the filepaths and plugin unloading. weird edge cases keep popping up too. if anything it's a good time to gen for me at least. I need to blow off steam.
>>
File: ComfyUI_07784_.png (1.85 MB, 1152x1152)
Got it now. Describing the guy's pant color in the original one I had seems to push it/help a lot with precisely what you're looking for. Then you could say he's relaxing or extending his legs on the floor or a small table or whatever.

https://files.catbox.moe/eclebn.png
>>
>>106990460
>>106990909
>>
File: ComfyUI_00305_.png (1.06 MB, 1248x832)
>>
File: ComfyUI_00308_.png (941 KB, 1176x880)
>>
File: netayume.png (34 KB, 1919x258)
Even Comfy himself is on the NetaYume train lol, I guess he gave it its own workflow template
>>
File: ComfyUI_23200_.png (1.14 MB, 1152x768)
>>106990909
I've already gotten closer, though it's still pretty gacha. If anything this tells me the model is an extremely good base for loras
>>
anyone try SVI with wan 2.2 to fix color shift? I know it's for 2.1 but still
>>
>>106990947
I think it’s decent, but it suffers the same issue as novelai where styles look very vectorized and simplified. I hope he continues to train it, he was using 8xb200s. I find a lot of artist styles just don’t work compared to noob so maybe it’s still undertrained
>>
>>106990990
comfy doesn't train anything
>>
>>106990990
It's also slightly worrisome that the additions he's made to the dataset dont have NLP captions, just tags
>>
>>106990996
Nsfw NLP is shit anyway
>>
>>106991008
It's not just NSFW, it's everything new on danbooru
>>
>>106989412
>>106989436
Hunyuan 3
>>
>>106990996
im okay with yume just being noob with a 16ch vae
desu, illustrious v2 had decent nlp support so i highly doubt whatever neta(yume)'s ends up being will be worse than that
>>
>>106991048
did pretty good
>>
>>106991048
significantly better, though given the size, i hope it would lmao. hope the distilled version of this model performs close to this
>>
>>106991048
Hunyuan 3 has the same issue base sdxl did: everything has a tendency to lean towards a dull brown/grey tint
>>
>>106990641
That Qwen LoRA is so kinosovl....
>>
>>106989042
because we are powerless to right this ship, we can only generate better things digitally
>>
so is chroma v46 better than the latest? i've been seeing you guys post more often with that version
>>
>>106989247
Why do we even have SDXL?
How'd that happen?
>>
>>106991144
no, anons just memeing
>>
>>106991144
no, every chromadome has their own preferred version
>>
File: ComfyUI_00352_.png (1.28 MB, 864x1208)
>>
>>106989230
>>106989421
audible kek
>>
File: 1747014062038540.mp4 (1.51 MB, 720x720)
>>106991048
>>
>>106989452
>unbridled, schizophrenic rage
Kek I can't stop laughing at this
I've put exactly that in driver crash reports
PLAY THE FUCKING THING NIGGER REEEE
I like to imagine Chang in Taipei gets a smile out of it
>>
>>106991170
too squishy
>>
File: ComfyUI_07794_.png (2.25 MB, 1152x1152)
>>106990959
Nice. I remember Chroma being trained/tested with lots of POV pics like this back when lodestone had that training preview.

>>106991048
Kek, Qwen tier output, though at 80B a Chroma style finetune that takes advantage of that many params would be insane.
>>
File: ComfyUI_00339_.png (940 KB, 1024x1024)
>>
File: ComfyUI_07795_.png (2.42 MB, 1152x1152)
>>
>>106991181
cat
>>106991191
box?
>>
File: ComfyUI_00299_.png (1.1 MB, 856x1216)
>>
>>106991158
Dear lord
>>
File: ComfyUI_23219_.png (760 KB, 1280x720)
>>
Fresh

>>106991205
>>106991205
>>106991205
>>106991205
>>106991205
>>
>>106988526
You can try running a compressed version of SD1.5. Go to civitai.com and choose SD 1.5 in the models section; if it's good enough for you just to mess around locally, you can try ComfyUI. It will take 2.2gb vram for a 768x768 image with this command:
.\python_embeded\python.exe -s ComfyUI\main.py --windows-standalone-build --fp8_e5m2-unet
pause
>>
>>106990996
that's just for part of it in the most recent version though, it's probably not a big deal. Mixed NLP / tag captions are generally what you want for this kind of model anyways.

>>106991062
that's not gonna happen lmao, it would take an enormously huge amount of degradation given the text encoder itself is far superior to CLIP
>>
How do I create hyperslop?
>>
I'm having an issue using the stickied thread:

I've managed to get all the custom nodes to work but now i've got:

backend='inductor' raised:
ImportError: cannot import name 'triton_key' from 'triton.compiler.compiler'
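
A quick diagnostic sketch for narrowing this down, assuming the cause is that the installed triton no longer exposes the name torch's inductor backend is asking for (often a torch/triton version mismatch, but that's a guess from the traceback alone). `has_symbol` is a hypothetical helper; the module and symbol names are the ones from the error message.

```python
import importlib


def has_symbol(module_name, symbol):
    """Return True if module_name imports cleanly and defines symbol."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Module missing or its own imports are broken.
        return False
    return hasattr(module, symbol)


if __name__ == "__main__":
    # Names copied from the ImportError above.
    found = has_symbol("triton.compiler.compiler", "triton_key")
    print("triton_key importable:", found)
```

If it prints False in the same Python environment ComfyUI uses, the installed triton and torch likely don't match, and posting both version numbers would help whoever replies.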
>>
>>106991732
whut gpu
>>
>>106991770
4070 super
>>
>>106989680
nice


