/g/ - Technology

Thread archived.


File: the longest dick general.jpg (2.4 MB, 2000x1201)
Discussion of free and open source text-to-image models

Previous /ldg/ bred : >>102804738

Hellish Fate Edition

>Beginner UI
Fooocus: https://github.com/lllyasviel/fooocus
EasyDiffusion: https://easydiffusion.github.io
Metastable: https://metastable.studio

>Advanced UI
Forge: https://github.com/lllyasviel/stable-diffusion-webui-forge
reForge: https://github.com/Panchovix/stable-diffusion-webui-reForge
Automatic1111: https://github.com/automatic1111/stable-diffusion-webui
ComfyUI: https://github.com/comfyanonymous/ComfyUI
InvokeAI: https://github.com/invoke-ai/InvokeAI
SD.Next: https://github.com/vladmandic/automatic
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI

>Use a VAE if your images look washed out

>Model Ranking

>Models, LoRAs & training

Quants: https://huggingface.co/TheYuriLover/flux-dev-de-distill-GGUF/tree/main

>Pixart Sigma & Hunyuan DIT
Nodes: https://github.com/city96/ComfyUI_ExtraModels

>Index of guides and other tools

>Try online without registration
txt2img: https://www.mage.space
img2img: https://huggingface.co/spaces/huggingface/diffuse-the-rest
sd3: https://huggingface.co/spaces/stabilityai/stable-diffusion-3-medium

>Maintain thread quality

>Related boards
I like your collages OP, you put some meme images or some non AI related pictures, but those are interesting nonetheless
you dropped >>102826381
File: ComfyUI_01919_.png (1.9 MB, 1024x1024)
You know, if you have two GPUs, you can block swap to the second GPU instead of using the CPU for full finetuning Flux on Kohya, which is slightly faster. The microstutter is awful though, it makes my audio stutter so I can't watch Youtube with a stutter every second.
File: ComfyUI_01926_.png (3.89 MB, 1920x1080)
>a whole row just for my gens
baker is trying to turn me into a mommy guy...
that collage must smell foul
>Hellish Fate Edition
No more big old bitches for a while, I really must apologize again.
File: file.png (2.72 MB, 1408x1539)
>2024/10/13: We have adapted and open-sourced the CogView-3Plus-3B model in the diffusers version. You can experience it online.
Cool that they released it, it looks all right, not at the Flux level but it's a fucking 3B model so I didn't expect better results lol
File: file.png (2.05 MB, 1852x1685)
bruh it doesn't know what a photo is or something?
File: ComfyUI_01930_.png (1.43 MB, 1024x1024)
are quantized flux models as good as the original?
File: ComfyUI_01931_.png (1.52 MB, 1024x1024)
Q8 is extremely close to lossless, though if you do side by sides with FP16 you will notice some pixels shuffled around. 4bit introduces noticeable errors.
File: file.jpg (2.04 MB, 7961x2897)
>are quantized flux models as good as the original?
only Q8 is really close, you can more easily notice the difference with the others
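To put rough numbers on the Q8 vs 4-bit claim, here is a minimal sketch of why fewer bits means more visible error. It uses plain per-tensor absmax rounding, which is an assumption for illustration and not the actual GGUF block format (GGUF quants scale small blocks separately, so real errors are smaller than this). The random "weights" are made up and stand in for a real tensor.

```python
import random

def quantize_roundtrip(weights, bits):
    """Round weights to a signed integer grid of `bits` bits, then map back to floats."""
    qmax = 2 ** (bits - 1) - 1              # 127 for 8-bit, 7 for 4-bit
    scale = max(abs(w) for w in weights) / qmax
    return [round(w / scale) * scale for w in weights]

def mean_abs_error(a, b):
    """Average absolute difference between two equally long lists."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

random.seed(0)
# Hypothetical stand-in for one weight tensor (not real Flux weights).
weights = [random.gauss(0.0, 0.02) for _ in range(4096)]

err8 = mean_abs_error(weights, quantize_roundtrip(weights, 8))
err4 = mean_abs_error(weights, quantize_roundtrip(weights, 4))
print(f"8-bit mean abs error: {err8:.6f}")
print(f"4-bit mean abs error: {err4:.6f}")
```

The 4-bit grid has 16 levels against 256 for 8-bit, so every weight lands much further from its original value, which lines up with "some pixels shuffled around" at Q8 and noticeable errors at 4-bit.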
are people finishing with flux or is it XL for final i2i?
Did you give her nails on purpose?
File: ComfyUI_01935_.png (1.35 MB, 1024x1024)
actually no I did not
I think some people do an i2i pass with an XL model at the end to reduce the slop look. Personally though I find that using a good lora is enough to de-slop it.
Oh, okay... I was just curious, sorry.
>good lora
I only visit civit so I wouldn't know.
kek... fucking kek....
File: ComfyUI_01932_.png (1.33 MB, 1024x1024)
I guess thats how the LoRa learned from the training material.
Aika always has her nails done
I'm still using an Automatic1111 stable diffusion installation I set up in late June of 2023.
I didn't upgrade as soon as I could have so I'm a little behind of where I should be. Should I be upgrading to SDXL or Flux?
desu you could make that style interesting
dont skip xl
>Should I be upgrading to SDXL or Flux?
flux, definitely flux
are you asking the right question?
A1111 upgrades to A1111, forge or reforge.

SDXL and Flux are models. If you are asking this then >>102828261
Okay... not to sound like a complete idiot, but I never really read up on SDXL. Do I have to start from scratch or can I use my existing setup and actually upgrade from SD to SDXL?
Just drop in the XL models of your choice and you are good to go
i recommend you ditch a1111 and switch to either forge or reforge, they are forks of a1111 so the ui should feel very familiar and your extensions should work fine. i personally use forge for flux and reforge for xl/sd 1.5

Install Forge in a new folder or bite the bullet and learn ComfyUI.
You dog
>bite the bullet and learn ComfyUI.
learning ComfyUI is easy when all you do is download workflows and load them lol
Alright. Thanks for your time and advice.
I actually did that when SDXL was pretty new and quickly realized that was probably not going to work because I was being dumb.
So I just kept using SD with the weird model I was using that kept giving me really nice results.
File: ComfyUI_01937_.png (1.43 MB, 1024x1024)
File: ComfyUI_01939_.png (1.86 MB, 1920x1080)
File: ComfyUI_01950_.png (2.62 MB, 1920x1080)
De-distill allows cfg>1, so what's your negative prompt bros?
What are some things we can do with un-distilled that differentiate it from dev?
File: main.jpg (604 KB, 2131x1196)
Zero-shot editing for flux. At first i was excited but then i saw it's from Google. This will never see the light of day.

File: file.png (1.46 MB, 1024x1024)
It got better, i was using 20 steps, by using 40 steps i got a way better image. Had no idea Flux needed so many steps.
>Had no idea Flux needed so many steps.
yeah, flux distilled could get away with less steps, the undistilled version is less forgiving on that
yeah that's sad because that could solve the style issues on Flux, you just put in a picture of something you like and there you go, it mimics that style, no need for a lora or anything
you're quoting something that is answering the question you're asking lol
The ball and phone being in the wrong order has me mildly frustrated. Not in particular but in general. I always get swap ups like that for various things and I have to roll the dice until I get lucky.
Who are you quoting?
File: file.jpg (498 KB, 3135x1586)
wtf it's still not converging at 1000 steps?
File: file.png (2.33 MB, 1594x1324)
>This will never see the light of day.
we have the paper though so we can make the code by ourselves no? They're talking about a new sampler specialized for Flux (FluxSDE), that looks hella interesting
>Additionally, we extend our framework to design a stochastic sampler for Flux. Our inversion method allows for state-of-the-art performance in zero-shot inversion and editing, outperforming prior works in stroke-to-image synthesis and semantic image editing, with large-scale human evaluations confirming user preference.
K, now I'm waiting for a comfyUi sampler node of this
Didn't PuLID solve this?
pulid is only about using a celebrity's face to make a cool scene with it, it has nothing to do with styles

What improves? I haven't seen dedistilled understanding my prompts better.
>I haven't seen dedistilled understanding my prompts better.
I have, for example if you write something + "pixel art style", distilled can't do it but undistilled can
it was the thing before this anon. Seems everyone forgot because they are too busy cooming.
Inspiration (in-spirit)

wait i just remembered i have an old rx 570 4gb lying around
think it'd be worth it using it?
btw how do you even run 2 gpus in a consumer motherboard?

If your mobo supports 3 GPUs the manual will tell you.
i will be sincere
i like flux
i ask for a drawing and it gives me something akin to an actual drawing on paper
shame it takes so much
now be insincere
The best part of something other than flux finally hitting this general will be that the mikufag isn't excused for spamming his retarded Miku pictures every thread and reposting slight variations of milu with dreads so frequently that he might as well be considered an SDG tier avatarfag at this point
Why are you guys so full of hate
Now if comfy could be chased out every time for being the utter retard he is as well...
Don't get surprised when people call out your weird Discord ""customs"", we don't respect them because they're weird and you're unpleasant.
having standards is what prevents sdg happening. some of you clearly needed to be bullied a lot harder in your formative years
>some of you clearly needed to be bullied a lot harder in your formative years
But i only post in /ldg/
I just dont get why you're so full of hate :/
rope immediately.
I wish I believed you were trolling right now, I truly, genuinely wish I had that amount of faith left in me
File: 00005-534283554.png (1.99 MB, 1128x1512)
The stockings are a bit much
Yeah, I try not to read here when things are slow because thread """personalities""" become really obvious. Made the mistake of checking in just now, mind you.
this isn't your personal blog retard, kys
he's right tho.
>he's right tho.
I'm tired of seeing miku poster too. they were cross posting with illustrious in hdg as well, it's really obvious and annoying. I shouldn't be able to
anyone on an anonymous board desu
>schizo general
You deserve to be here in the containment thread, all of you.
you can't contain my girth
Shut the fuck up and go back
i think flux gens are aesthetically pleasing
what about blurfag
blurfag if you're reading this.. FUCK YOU!
File: 1728870488355780.png (1.22 MB, 768x1344)
what is the point of all of this
I love ldg
My brain automatically categorizes his posts as bot/troll tier spam so he might as well not exist
Full finetune does seem to work now on Kohya for Flux on a 24 GB GPU. I don't know about their psychotic settings though. A learning rate of 1e-5 seems more appropriate for this model, and it also seems psycho not to have grad norm set.
only /ldg/ avatar i support is the scott pilgrim art style pony diffusion monster girl poster. where is he? does he still bake?
File: 1697814031846063.png (1.86 MB, 768x1344)
why are we here
*angery tranny noises*
yep, the snowflake whining about avatarfags is probably taking HRT
very debo-esque
the founding changs left this general to rot
Yeah i think we should merge back with /sdg/
>we should go back to posting with the schizo children that spam the same images every day because they're actually, unironically mentally ill
merge conflict
>accept theirs
Not gonna happen, sorry bub
butt why
where'd my balls go?
why does sdg want this threads 3 and 1/2 posters to return to it so badly
Because that will double the amount of people in their thread.
>This will never see the light of day.
whys that?
maybe the j word
I wrote a molmo-based captioner for flux dev lora training, as an attempt to improve on the one built into fluxgym which used Florence2:

There's a command-line and gradio version. Molmo does a much better job than Florence2, it's able to describe nudity and much more subtle details about a scene. I think it improves the quality of lora training.
Bigma soon
two more weeks
File: 30606581.png (804 KB, 1216x832)
Oh cool, my post (which was not actually my gen) got in.
Eh at least we have working full fine tuning for Flux on 24 GB
we all miss him. he made the best collages :(
I'm trying to make an "Anime to Real Life" workflow in comfyui. Is there a way to load a directory of images one after another and add each prompt to the queue? I've tried the Inspire Load Image List from Dir node as well as the one from VideoHelperSuite and they seem to run the prompts in parallel which I don't have enough RAM for.
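One way around the parallel loading, sketched under assumptions: export the workflow in API format, then queue one job per image through ComfyUI's HTTP `/prompt` endpoint, so each image is a separate queue entry and they run one after another. The node id `"1"`, the file names, and the default `127.0.0.1:8188` address are placeholders, and the images are assumed to already sit in ComfyUI's input folder so a `LoadImage` node can find them by name. Not tested against any particular ComfyUI version.

```python
import copy
import json
import os
import urllib.request

IMAGE_EXTS = (".png", ".jpg", ".jpeg", ".webp")

def list_images(folder):
    """Return image file names in `folder`, sorted for a stable order."""
    return sorted(f for f in os.listdir(folder) if f.lower().endswith(IMAGE_EXTS))

def build_jobs(workflow, load_node_id, image_names):
    """Make one request body per image by swapping the LoadImage input."""
    jobs = []
    for name in image_names:
        wf = copy.deepcopy(workflow)        # never mutate the template
        wf[load_node_id]["inputs"]["image"] = name
        jobs.append({"prompt": wf})
    return jobs

def submit(job, url="http://127.0.0.1:8188/prompt"):
    """POST one job to a running ComfyUI instance (address is an assumption)."""
    req = urllib.request.Request(
        url,
        data=json.dumps(job).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

# Offline demo with a made-up one-node workflow; nothing is sent anywhere.
demo_workflow = {"1": {"class_type": "LoadImage", "inputs": {"image": "placeholder.png"}}}
jobs = build_jobs(demo_workflow, "1", ["a.png", "b.png"])
print(len(jobs), "jobs built")
```

With a real exported workflow you would load the JSON, call `build_jobs` with the id of your own LoadImage node and the result of `list_images`, then loop `submit(job)` over the list. Only one image is loaded per job, which should keep RAM use down to a single run at a time.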
File: 1693153131654296.png (390 KB, 660x796)
Does 3080 Ti 12gb and 32ram fit flux Q8 or do I need to cuck myself to Q6K and lower.
are there no .gguf versions of vision models?
File: file.png (611 KB, 512x512)
is it just me or flux has absolutely no fucking knowledge of anime?
this is what i get out from "saint seya, from the knights of the zodiac anime."
Not trained on anime
It was gimped during training, so out of all anime characters it only knows Donald Trump and Hatsune Miku, you can also trick it into generating Goku from Dragon Ball if you describe him without naming.
There probably is. This was just a "get it working" version. I'll add the ability to use a quantized model eventually.
It fits but expect around a minute gen time
Where is the flux anime tune? Tell the richfags to get on it.
Wait for 5090s
Wait for
version that can be used with 3060 12gb might get quite few downloads
You could also wait 20 seconds per step on a 4090.
File: file.jpg (332 KB, 1024x1024)
doesn't look that bad
it literally doesnt know what saint seya is. I did a 100 count batch and not ONE got even close to it
all i got was females and females with kemomimi
>all those params
>cant into anime
The absolute state of FLUX.
>hire an office filled with accountants
>none can cook me a lobster thermidor
The state of accounting
what was your prompt
So this seems to be moreso the place to discuss models and technique, where the other thread is moreso where gens are posted, right?
two different generals for two different groups of autism
File: l.png (329 KB, 600x600)
Not really it officially split because Stability AI turned full retard so Anons didn't want to call the thread "Stable" diffusion general anymore.

Unofficial reason is there was some drama and Anons got triggered by other anons posting the same gens and avatarfagging.
It's the same reason why there's an adult table and a kids table.
I sat at the adult table as a kid and it has fucked me up
Look at the collage full of images
>(which was not actually my gen)
where'd you find it?
File: glurrh.png (2.27 MB, 1024x1024)
no new models makes anon antsy as evidenced by this bread
Yeah and not even the better ones
bigma when
File: image-37.jpg (194 KB, 1024x1024)
Can't get it to repeat
So I used to mostly believe the people on the /h/ board were generally trolling but I'm starting to think they actually are in many cases people who for some reason really cannot think in a context outside of "current meta checkpoint vs xyz incumbent meta checkpoint" or consider that anyone else might not approach things from that context by default.

Maybe I'll just stick to the /b/ board, the content was always way better there anyways.
File: sana.png (84 KB, 1727x626)
New Pixart paper?
File: pixart news.png (79 KB, 1068x483)
wtf we sana now
some of the samples look decent, others look completely terrible. 0.6b so not exactly a 'bigma', about the same size as sigma. i will try it out later and compare. smaller models are cute but i still think a solid 3-4b model without any t5 shit would be ideal.
File: Flux_00898_.png (925 KB, 1024x1024)
CivitAI, there was a LoRA for 2000's alt girls.
File: sana.png (191 KB, 1363x743)
after reading more, there's a 0.6b and 1.6b version as well. if this comparison holds true, it would be a solid base for playing around with. but as we see so often, these benchmarks tend to stretch the truth. hope the release doesnt take forever
seems like there's 2 models, a 0.6b and a 1.6b. i didn't know models this small could generate text, wtf was flux doing with 12b then
Have faith in the chinamen
dissenters will be drawn and quartered
the day of reckoning is upon us
0.6b = actually trainable
And its quality is good enough that you can make your porn pony model on the cheap
imagine the humiliation stability ai must be feeling right now, quite excited for the new sana model. hope it releases soon.
File: ComfyUI_01974_.png (1.83 MB, 1024x1024)
>As a result, Sana-0.6B is very competitive with modern giant diffusion model (e.g. Flux-12B), being 20 times smaller and 100+ times faster in measured throughput. Moreover, Sana-0.6B can be deployed on a 16GB laptop GPU, taking less than 1 second to generate a 1024 × 1024 resolution image.
what the fuck, how is it possible
wtf i love ldg now
In the same way that Florence2 is extremely small yet is really good. Poor architecture causes the weights to be underutilized and the model to struggle finding the optimal solution.
Also ding dong the T5 is dead.
I knew my Pixart boys would deliver.
- lightning sampler to begin with
- better text encoder
- better vae
- tiny model that runs on a laptop GPU (that's without the 8-bit / quantization copes)
ldg returning to its roots, the second coming of pixart is here. at our lowest low, chang came back!
are any of these woman names? do you think they're cute?
if they aren't you can always i2i, soon with blazing fast speeds
somebody explain that equation to me, I was top tier in maths smoking weed and barely studying, now I can't even make a +2 digits divisions on paper
Also it's not "4K" as we would say 4K, it's 4096x4096 images in 9 seconds.
official sana weights waiting room
see >>102815183
/G/entoomen, we are seeing a major rebound
In praise of products for the high brand of diffusion
The, if you will, super-sana era

Chang light the way to our next endeavor
Lo, the fallen models: two have become one
Yet the image, it is singing

A new spirit of diffusion has arrived

In praise of speed
I say unto you, my weary anon:
The super-sana era is here!
>Also ding dong the T5 is dead.
I hope I never have to use language models to unnecessarily expand prompts again. "Natural lighting" vs "Sun queefs its rays across the whole fucking image"
wake me up when they drop the demo
watch this, i am going to make them release the sana weights in 5 minutes.
sounds like they just need basic 3d fluid simulation, shit solved decades ago
File: file.png (311 KB, 622x370)
Also we're looking at huge speedups in training since they're using linear transformers now, which are O(n) rather than O(n^2) in the number of tokens. There's also no positional encoding, which means the training process is likely even more aspect-ratio independent. This also means that the 0.6B of this model is going to be more efficient compared to the 0.6B of Sigma (less network wasted on positioning). The text encoder is Gemma-2, which apparently has better contextual language and internal reasoning / chain of thought compared to T5. They're also using a 32x Latent Representation rather than 8x, the VAE appears to be 32 channels (rather than 16).
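For anyone wondering where the linear vs quadratic difference comes from, here is a toy sketch. It assumes "linear transformer" means kernelized attention, where softmax is replaced by a feature map (ReLU is picked here purely for illustration) so the matrix products can be regrouped. The usual normalization term is left out to keep it short, so this is not a full attention layer and not Sana's actual code.

```python
import random

def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(r) for r in zip(*m)]

def relu(m):
    return [[max(0.0, x) for x in row] for row in m]

def attention_quadratic(q, k, v):
    """(phi(Q) phi(K)^T) V: builds an n x n score matrix first."""
    scores = matmul(relu(q), transpose(relu(k)))   # n x n
    return matmul(scores, v)                        # n x d

def attention_linear(q, k, v):
    """phi(Q) (phi(K)^T V): builds a d x d summary first, never an n x n matrix."""
    kv = matmul(transpose(relu(k)), v)              # d x d
    return matmul(relu(q), kv)                      # n x d

random.seed(0)
n, d = 64, 8   # n tokens, d features per token (toy sizes)
q, k, v = ([[random.gauss(0, 1) for _ in range(d)] for _ in range(n)] for _ in range(3))

out_a = attention_quadratic(q, k, v)
out_b = attention_linear(q, k, v)
max_diff = max(abs(x - y) for ra, rb in zip(out_a, out_b) for x, y in zip(ra, rb))

cost_quadratic = 2 * n * n * d   # multiply-adds for both n x n steps
cost_linear = 2 * n * d * d      # multiply-adds for both d x d steps
print("max difference:", max_diff)
print("multiply-adds:", cost_quadratic, "vs", cost_linear)
```

Both orderings give the same result up to float rounding because matrix multiplication is associative. Doubling `n` quadruples the first cost but only doubles the second, which matters a lot once `n` is the token count of a 4096x4096 image.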
File: dnd_highres_00596_.png (3.94 MB, 2400x1680)
>They're also using a 32x Latent Representation rather than 8x, the VAE appears to be 32 channels (rather than 16).
we're eating so good
proud to be a pixartsexual
32 channel VAE is interesting, but ultimately the dataset is what will define the model (that is, if people will even bother using the base model anyways before finetuning it)
Cog and that other shite img2vid model from last week are still shit right?
The ones where the comfyui install broke itself and cog couldnt write a fucking readme to save their lives.
Anything changed or anyone has any comments on the (absolute) state of img2vid?
It's so lightweight you'd finetune it with whatever within a couple of weeks. It's not going to be like Flux, you're probably going to be doing batch 8 on a 4090 at 1024px with 2 seconds per step with this model.
File: 4K-4.jpg (3.97 MB, 4096x4096)
there was another significantly better model that released but i cannot remember what it was called. pyramid or something.
unbridled sovl
File: Screenshot.png (288 KB, 645x570)
> never have to use language models to unnecessarily expand prompts again
idk anon, feels like they just baked it in for you this time
File: 38.png (1.08 MB, 1024x1024)
hope this isnt another cascade
Gemma-2 has been very good for local bot. Easy to guide creative style.
Thanks, pyramid was the other shite model I was referring to, I'd forgotten its name as it was kinda DOA.
i hope that sana will be our mistral nemo moment
oh, to add, it was DOA to me as they were still training the model (to not be shit) but released the shit version anyway, because social credit points for productivity i guess???
Pixart Sigma was very good, this model appears to fix everything. The architecture truly is what is SOTA and anyone wanting to make a model should probably use it.
is there any anime finetune that isnt danbooru but rather more descriptive?
i find it hard to make these models give me exactly what i want when i am beholden to tags
Wait, it contains system prompt? Can it be changed?
>released the shit version anyway
probably as a way to get funding for the bigger model
in what way?
i think he meant that he hopes it doesn't get ignored like cascade was
Cascade was ignored because SAI said SD3 was the best thing ever and was coming out in two weeks. If Flux didn't come out we'd probably be using Cascade right now but honestly the architecture of Cascade wasn't that great.
that the giga compression doesnt make it impossible to train into something actually usable
also the relatively low param count sucks
Anon, it's a VAE, if it didn't work it wouldn't produce any high quality images.

>also the relatively low param count sucks
>wah why isn't anyone making a porn finetune for Flux
File: peer.png (2.15 MB, 1024x1024)
>Put in a bunch of AI images of various quality, mostly very high quality, into AI image detector
>Detects them with 100% accuracy
>Even put in abstract ones with no possible fingers to count wrong or obvious shit to copy
>Can always determine the human abstract from the AI abstract
>Put in an image I inpainted a bit
>Human rating was higher than the original
>Put in a picture I pored over and inpainted meticulously to make it high quality
>95% human rating
What is the detector detecting, exactly? Is there some uniquely human way of doing things that just comes through, even in the way we edit images? I didn't use an image editing tool on the 95% human one, just a ton of inpainting. It was still all AI, but it still recognized that it was heavily human-touched. Pic unrelated, since the 95% one is NSFW.
AI images have detectable sub-pixel patterns, the detection isn't as complex as you think and is easily confused.
cool gen
>uses Gemma-2 2b as the text encoder
Why not just go for the 9b? You only process the prompt once per image. It's not the bottleneck, even with a relatively large TE. For actual LLM use cases, the 2b tier is just retarded. The model is simply too small. Meanwhile gemma 2 9b is quite good, and known for even punching above its weight. I feel like they left a lot of performance on the table for no real reason.
Ahh, I see! Figured it was something like that, it just seemed odd that the patterns would be so broad, you'd think that such small patterns would be easily detectable even within the small areas of inpainting. Is the fact that those patterns exist something that can be solved, or is it inherent to the architecture?
You're not talking to it, you just want it to know "dog" and ":dogemoji:" are the same thing. It doesn't need to write a coherent five paragraph story.
you think the eye/clothing texture artifacts are ALL just from the unet randomly deciding to make them? go take a look at some cascade images and just how much it loves to take a dump on the details
and clearly if less than 2b params was enough we would all be using pixart sigma or other memes rather than everyone moving on to either sdxl or flux
File: 407660.jpg (552 KB, 1428x1400)
ayy, thanks! it's really fun mashing styles together and totally forsaking quality tags, it makes some super neat stuff. Girls go apeshit when you mix them with Charlotte from Madoka and album covers (picrel), turns out.
I think you 12B fags will be forever waiting for a model lmao
The 1.6B model is going to be the porn model everyone uses, guaranteed.
Are the patterns human-detectable? I know images that were img2img'd were visible to the human eye if you just looked a little closely, they had this really subtle, odd texture to them that couldn't really be removed.
Probably not human detectable and who really cares. The pattern is going to be from the VAE decoding images from latent space.
>anon brings up Cascade again
Stockings with sneakers is a travesty
but I like the style
I mean maybe, but I'd still like to see how much improvement you get using an even larger text encoder. Like look at prompt comprehension of pixart sigma vs SD1.5. The diffusion model is of comparable size, the gains are purely coming from pixart using T5-XXL. The Sana paper even has a chart showing that gemma-2 2b is slightly worse than T5-XXL (they claim it is "comparable"). So there's no real improvement on the TE front. Imagine if the jump from T5 to a larger modern LLM is as big as the jump from CLIP to T5. But we'll never know.
Will Sana's compression cause it to shit out on details like Cascade? Cascade sucked ass at fine details, eyes always blurry etc
It compensates by being 32 channels.
File: file.png (1.85 MB, 704x1312)
Also Cascade was undertrained just like Pixart Sigma, so we don't know what its final form could've been.
If Flux was smart they'd release their Pro 1.1 weights.
no it doesnt, all that matters is the latent size, sdxl is 4x128x128 for a 3x1024x1024 input image
this compresses the latents to 32x32x32 which is still half the size of sdxl
maybe they have found some super efficiency gains to get them through this and it will look great, then thats awesome, but im skeptical
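The arithmetic in that post checks out and can be spelled out in a few lines. The downsample factors and channel counts are the ones quoted in the thread (8x with 4 channels for SDXL, 32x with 32 channels for Sana). The helper names are made up for this sketch.

```python
def latent_shape(height, width, downsample, channels):
    """Shape (channels, h, w) of the latent for an image of the given size."""
    return (channels, height // downsample, width // downsample)

def numel(shape):
    """Total number of values in a tensor of this shape."""
    total = 1
    for dim in shape:
        total *= dim
    return total

pixels = numel((3, 1024, 1024))              # 3145728 values in the RGB image
sdxl = latent_shape(1024, 1024, 8, 4)        # (4, 128, 128)
sana = latent_shape(1024, 1024, 32, 32)      # (32, 32, 32)

print("sdxl latent:", sdxl, numel(sdxl))     # 65536 values
print("sana latent:", sana, numel(sana))     # 32768 values
print("sana / sdxl:", numel(sana) / numel(sdxl))   # 0.5
print("compression:", pixels // numel(sdxl), "x vs", pixels // numel(sana), "x")
```

So the extra channels claw back part of the 32x spatial downsample, but the latent still holds half as many numbers as SDXL's for the same 1024x1024 image. Whether that is enough for fine detail is exactly what the two anons are arguing about.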
Nice action
We're talking about a 2D image being represented as a 32x32x32 cube. What you understand about compression doesn't apply to a (V)AE.
sana-samas.... the wait is killing me... is it here yet?
yeah right, we will see in practice, because the examples they gave sure dont look too hot
sanasexual pride month
The Sigma examples weren't great either but the model itself was capable of good images. I don't believe it's going to be the AE holding back Sana. And maybe it's going to end up equivalent to SDXL for overall fidelity of fine details, but that might be worth being four or eight times faster at inference and training.
most animu models will give a little to regular prompts as opposed to booru tags but the nature of animu necessitates using said tags considering booru is really the only database for it
perhaps merging an animu with non-animu checkpoint will get you closer to what you want
>>cant into anime
>The absolute state of FLUX.
But there's tons of anime LoRAs for it.
I did these myself:
If anything filters people from Flux, it's probably that it started with ComfyUI, which isn't as easy as automatic1111.
File: ass.jpg (596 KB, 3375x1108)
Alright someone spoonfeed a babby because I'm genuinely too retarded.
Why does it refuse to listen to the prompt? Did I download some retarded meme version of flux? Because I'm getting nf4 tier slop results when I downloaded the Q8.
12gb 3080 Ti and 32gb system ram. Does the model go retarded because it can't load it all with vram? Am I missing some bullshit text encoder?
>Chubby miku guy
Please tell me you put some cute chubbers in there. I'd kill for a high quality chubby anime flux LORA.
Well, if I had to guess, it's probably that a lot of anime data was scrubbed, so even if it can do Miku, it doesn't "know" how to do a lot of shit with anime-style figures in different situations, since it has basically zero anime in the training data.
genned 1500 flux 1girls overnight, here's the best failed gen of the batch
but >>102827873
is doing it right there. Thats flux dev Q8 right? So why is mine being retarded.
what's fluxfusion?
holy shit it's so slop
for a second I thought you linked a SDXL model
what went wrong?
Love it
maybe that's the reason it's not listening to your prompts, try base flux. flux finetunes and merges are still just memes
Yeah that was my guess I'll try the direct Q8 conversion of base flux since no chance of running the full model.
The dedistilled dev model has better prompt adherence.
I'm going to need the best team of prompt scientist working for 3 days exclusively on Stephanie from Lazy Town
>The dedistilled dev model has better prompt adherence.

That can't run on Forge yet though or can it? I remember some Anon yesterday couldn't run it.
Oh hi miguanon.
There's no use arguing with that one. It's an eldeegy shitposter just like the antilocal/antiopensource shitter on elemgy.
File: 0.jpg (300 KB, 1344x768)
/ldg/ morale status?
we have no morals
They have a cargo cult mindset
File: file.png (1.26 MB, 1024x1024)
rock hard
I can see it, the details are very detailed (the 32ch VAE) but the general structure in which the details happen is extremely messy (the super-compression)
Anyone else having problems with loras working on forge? It's like forge is sometimes 'forgetting' to apply them, they show up in the lora tab and everything, but only some seem to be working. All those loras previously worked on A1111.
did you try updating?
Is Comfy really the best way to run flux?
File: 00037-2962381111.png (2.09 MB, 1024x1536)
I'm creating a lora. Should I make it fp16 or bf16? It sounds like I should only choose fp16 so that people with 20 series cards can use it. Is that a large audience? Should I prioritize compatibility?
>Stability AI turned full retard so Anons didn't want to call the thread "Stable" diffusion general anymore.
I haven't genned anything since august 2023, and haven't looked back into the space until today. What happened?
The usual muh safety and ethics bullshit so SD3 got lobotomized so badly it forgot human anatomy.
File: 0.jpg (296 KB, 1344x768)
What brought you back?
I hope the Sana model will be decent, it will be so nice to go back to a faster model again.
If you remember SD 2.0, SAI did it again but worse with SD3 except this time it couldn't even make pictures of women laying in grass. Then what happened was multiple things: the company has been fragmenting apart since the beginning of the year, a group of those researchers made Flux (the new hot model), SD3 came out, was absolute shit, didn't even come out with proper training tools so no one even tries finetuning it.

Of course outside of SAI you have multiple other models (particularly from China) which have been the real innovation in the space including now Sana that was announced today which is likely going to be a SOTA model especially for small, efficient models which will come out soon. These alternative models caused the general split because SDG is full of a bunch of literal children (and their groomers) who are incapable of change and just want to spam 1girls.
File: 00007-98318545.png (2.12 MB, 1024x1536)
>Gemma2 license
the fat girl in the OPs photo not gonna lie. I'm fiercely horny today
thank you for the QRD. I wish I was here for that, it must've been a shitstorm.
I really can't tell what possesses these companies into moralfagging their way into failure.
>don't commit crimes with it otherwise we don't give a fuck
Okay, really the hill you're dying on huh?
will sana finally put an end to the 1girl menace plaguing /ldg/?
maybe not if we can do 1girl 50x faster than flux, 50x more waifus
Nothing will end the 1girl menace until low effort AI content is banned.
Nothing will end the 1girl menace until high effort models are made
so until then post more 1girls
>when ldg died
ldg stay kil until sana comes out and saves us all. if sana bad /ldg/ die for ever, sorry bud.
I might need to ask for help making 1girl slop. I made a lora that I want to post on civitai and I figure having 1girl as the main image will be most helpful since that's what people use SD for most of the time. I don't usually make generic anime girls so this is new to me.
If it's anything like Sigma it's going to rock and if it's more efficient compared to Sigma like they claim then it's going to be the king mini model. The only question is how the Gemma text encoder handles new concepts and particularly nsfw, hopefully the prompt enhancer doesn't do bullshit like "sorry I cannot do this". But since it's 2B, I guess you can fix that with a finetune too.
post some pics
File: ComfyUI_01992_.png (1.06 MB, 1024x1024)
if thats flux i wanna know what loras you used
Too good to be true
Too true to be good
could you share prompt pls?
File: ComfyUI_01991_.png (1.06 MB, 1024x1024)
yes its Flux and the LoRa I used is my Aika LoRa that was banned from Civitai.
banned why?
isnt that a character lora? shes nice but im mostly impressed by the artstyle
wondering if you would share the prompt/methods
File: ComfyUI_01859_.png (1016 KB, 1024x1024)
>Hi there! I'm writing because we've been contacted by T-Powers, an entertainment agency, requesting the removal of model (https://civitai.com/models/694163/jav-aika) and a number of other models featuring people whom they represent. I'm removing that content per our policy allowing real people (or their representatives) to request the removal of their likeness, and I appreciate your understanding! Thanks.
LDG never die
File: ComfyUI_00940_.png (1.33 MB, 1024x1024)
>isnt that a character lora?
>wondering if you would share the prompt/methods
its nothing fancy actually, here see for yourself
Just to suffer?
File: catbox_k79428.png (1002 KB, 832x1216)
The lack of arms is intentional, that's the purpose of the lora.

I'm using autismmix ponyxl and I wish it didn't look so halfway between anime and 3d. I've been having a lot of issues with the faces looking way too young as well, this one here is the best I could get after several rerolls.

I ended up uploading the LoRA with this image here, but I think the cover image could be so much better: https://civitai.com/models/858871?modelVersionId=960964
god i hope
Is it?
interesting, thanks
that's cool, pic looks fine to me but i've never used pony before

based and fatbitchpilled
File: 2024-10-15_00003_.png (1.83 MB, 1024x1024)
Wow, crazy. Glad I downloaded a bunch of loras lol
File: ComfyUI_34287_.png (1.35 MB, 848x1024)
File: ComfyUI_34330_.png (1.49 MB, 848x1024)
File: ComfyUI_34336_.png (1.43 MB, 848x1024)
File: file.png (6 KB, 781x65)
the fuck does this even mean??
care to share the catbox?
File: ComfyUI_34334_.png (1.3 MB, 848x1024)
With loras from here https://mega.nz/folder/mtknTSxB#cGzjJnEqhEXfb_ddb6yxNQ
thank you anon, looks sick as fuck
