/g/ - Technology

Thread archived.
You cannot reply anymore.

File: photo-collage.png.png (3.13 MB, 1080x1920)
Discussion of free and open source text-to-image models

Previous /ldg/ bread : >>102235413

>Beginner UI
EasyDiffusion: https://easydiffusion.github.io
Fooocus: https://github.com/lllyasviel/fooocus
Metastable: https://metastable.studio

>Advanced UI
Automatic1111: https://github.com/automatic1111/stable-diffusion-webui
ComfyUI: https://github.com/comfyanonymous/ComfyUI
Forge: https://github.com/lllyasviel/stable-diffusion-webui-forge
InvokeAI: https://github.com/invoke-ai/InvokeAI
SD.Next: https://github.com/vladmandic/automatic
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI

>Use a VAE if your images look washed out

>Model Ranking

>Models, LoRAs & training


>Pixart Sigma & Hunyuan DIT
Nodes: https://github.com/city96/ComfyUI_ExtraModels

>Index of guides and other tools

>GPU performance

>Try online without registration
txt2img: https://www.mage.space
img2img: https://huggingface.co/spaces/huggingface/diffuse-the-rest
sd3: https://huggingface.co/spaces/stabilityai/stable-diffusion-3-medium
txt2vid: https://hailuoai.com/video

>Maintain thread quality

>Related boards
File: 1703906564197.jpg (386 KB, 1024x1024)
nice, got two in the collage
File: delux_dc_00024_.png (2.5 MB, 1152x1344)
Nice collage
that /pol/ link isn't working, gg
File: 1696672159253.jpg (885 KB, 1024x1024)
if the hailuoai thread is making half of pol seethe, i cannot imagine the reaction from twitter
File: 1725518441535591.webm (3.69 MB, 720x406)
Yeah, not sure why
File: 00013-3296611028 copy.png (228 KB, 340x482)
maybe >>>/pol/uncensored+ai
let's test to see if it's caps killing it

File: 1725514526669048.webm (1.33 MB, 1280x720)
Ah maybe because I used '%20' instead of '+'?
And thanks, we can use that in the next bread
File: Untitled.jpg (3 MB, 3840x3708)
This is a really dumb trick and will only matter to you if you're autistic about the texture of an image like I am
but I've discovered you can kinda get rid of that unpleasant noise texture that Flux sometimes puts on art by simply VAE encoding the image with the old 1.5-mse VAE and then decoding it again
no ksampler in between, literally just vae encode and then decode again. the 1.5 vae gets rid of a lot of the shitty Flux noise

(heavily zoomed-in crop from a larger pic to make it more obvious)
File: 1725518905.png (511 KB, 1024x1024)
File: delux_dc_00023_.png (2.38 MB, 1152x1344)
pol is in a perpetual state of seethe. I don't think thats a good measure of anything
>Made my first lora yesterday, character lora for flux. rank/alpha 32/32, learning rate 0.0004, adafactor, split_mode, 1600 steps and spitting out wips at 400/800/1200 too. trained at 512x512, ~25 fairly high quality high res varied aspect ratio images but no preprocessing beyond whatever bucketing/cleaning the comfyui flux training node does for you automatically.
>I assume my dim 32/32 was huge overkill should go to 8/8 as a rule of thumb default, do I need to toggle any dependent variables alongside that? past that I assume going to 768 or 1024 training is worth it if my 10GB is enough, especially for e.g. characters with detail, and probably lowering the learning rate or using prodigy if I wanna be lazy (even the 300 steps lora was basically complete, using 0.0004). but I'm guessing the next lowest hanging fruit is in a higher quality dataset and preprocessing it more.
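On the dim/alpha question, a rough back-of-envelope sketch. It assumes the common LoRA convention where the learned update is scaled by alpha/rank and each adapted linear layer adds rank * (in + out) parameters. The 3072 layer width is only an illustrative number, not a claim about any particular Flux layer, and the function names are made up for this example.

```python
def lora_scale(rank: int, alpha: float) -> float:
    """Multiplier applied to the LoRA update under the alpha/rank convention."""
    return alpha / rank


def lora_params(rank: int, in_features: int, out_features: int) -> int:
    """Extra parameters for one adapted linear layer: A is (rank x in), B is (out x rank)."""
    return rank * (in_features + out_features)


if __name__ == "__main__":
    width = 3072  # illustrative layer width (assumption)
    for rank, alpha in [(32, 32), (8, 8), (8, 32)]:
        print(
            f"rank={rank:>2} alpha={alpha:>2} "
            f"scale={lora_scale(rank, alpha):.2f} "
            f"params/layer={lora_params(rank, width, width)}"
        )
```

Under this convention, going from 32/32 to 8/8 keeps the scale at 1 and cuts the per-layer parameter count to a quarter, so the learning rate would not need retuning for that reason alone. Dropping rank to 8 while leaving alpha at 32 would quadruple the effective scale.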
File: 00076-3656386129 copy.png (373 KB, 873x340)
File: 1725519139158028.webm (3.06 MB, 1280x720)
The quality is quite decent and the frames are surprisingly consistent, not to mention it doesn't distort the figures that much unlike the others, and the hits are more than the misses.
I don't recall asking or caring for your opinion
why wouldn't you go for a full i2i?
It's just adding blur to the picture.
you could try adamw8bit with cosine if you want. It's a pretty tried and trusted combo
because i2i with a 1.5 model is gonna fuck things up, due to 1.5 being old and dumb
>due to 1.5 being old and dumb
welp. Can't argue with kool-aid.
File: LOLpasta.webm (994 KB, 1280x720)
I wonder what kinda tech it runs on to generate so much video so fast.
It's time to move on, anon. Staying with SD1.5 at this point is like being one of those holdout japanese soldiers who refused to accept WWII had ended.
With all the video surveillance data they have on their people I wouldn't be surprised if they have optimized a model to perfection.
File: 1702033870277920.png (3.71 MB, 2304x1152)
Not him, but for me, it's 1.4
here is a short documentary piece about chalk.

It shows why a good tool can be beautiful.
Just resize it to dim 8 or lower and check how low you can go, it's recommended to train at high dim and resize later.
can I run flux models in forge?
You can run GGUF and NF4 ones, yes.
I don't know anything about flux i've been lazily ignoring it until now. What does that mean?
Two different quantization formats, i.e. ways of storing the weights at lower precision so the model fits in less VRAM. GGUF comes from the LLM (Large Language Model) world, ie. chatbots, gpts and the likes; it offers several compression levels (Q8, Q5, Q4 and so on) but seems slower to use because the weights have to be dequantized on the fly. NF4 is a 4-bit format from bitsandbytes that seems to allow faster inference (image generation) whilst keeping the compression within a comfortable range.

If you end up wanting to try out Flux on Forge, something like:
should work right out of the box, having to change "Diffusion in Low Bits" to bnb-nf4 as mentioned in the filename.
With GGUFs you might need to also download the vae, clip and t-5 encoders and slap them in the vae/textencoder section.

You can find more details over at:
i just now found out the news about the latest pytorches, and also found out ive been on pytorch 2.1 this whole time
so i updated through pip and everything, do i have to REINSTALL comfyui and shit for them to use my now latest pytorch or what?

>go easy on me please ive never done this before
Anyone here using SkimmedCFG?

When I use it with a sampler CFG of 100 (with Skimming_CFG set to 7), the image gets a bit grainy. Has anyone else noticed this?
And so, if 100 is too much, what CFG do you use?
I know this isn't the silly questions thread but how does one suppress a specific warning in the console when running comfy? pytorch 2.5 related, not safe, etc
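One way to do it, as a minimal sketch: Python's standard `warnings` module can ignore one warning by message and category while leaving the rest visible. The message text and the `FutureWarning` category below are placeholders (assumptions); copy the real ones from your console. The filter has to be installed before the warning fires, so in practice it would go at the top of the entry script.

```python
import warnings


def noisy() -> None:
    """Stand-in for library code that emits warnings (hypothetical messages)."""
    warnings.warn("hypothetical: this call is not safe", FutureWarning)
    warnings.warn("something else worth seeing", UserWarning)


def run_filtered() -> list[str]:
    """Run noisy() with one specific warning suppressed; return what still got through."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        # The message argument is a regex matched against the start of the warning text.
        warnings.filterwarnings(
            "ignore", message=r".*not safe.*", category=FutureWarning
        )
        noisy()
    return [str(w.message) for w in caught]


if __name__ == "__main__":
    print(run_filtered())
```

The blunter option is the interpreter flag, e.g. `python -W ignore::FutureWarning main.py` (assuming `main.py` is your launcher), which hides every FutureWarning rather than one specific message.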
what's the news? faster inference?
If you update, you won't be able to reproduce your previous gens.
>i already wasnt able to since i upgraded to comfyui
Can AI draw feet yet
Forge is getting updated? I thought it was abandoned?
any CFG value under 9000 just isn't high enough.
tasty ones
Yup. At this point vanilla Automatic1111 looks more like abandonware.
>tasty ones
Any evidence?
Can't post them on blue board.
Nevermind ignore this im fucking stupid and forgot what the virtual environment was and why every ai program uses them
you shitting me? pony (and pmuch all realistic pony merges) can do flawless feet. flux is ok, and can certainly get them right with an inpaint. sd15, sdxl and sd3 do not know what feet are. but that is the past. (you can get a good foot with sdxl when you inpaint but that is slot machine gacha)
<insert foot pic here>
this is why we need PonyFlux, these furfags hold the key to flawless imagegen for some unknown reason.
if you're on windows it doesn't use a venv and you should be able to confirm it's using your updated pytorch in the CMD window when it starts up (it'll say pytorch version somewhere in that text)
nothing special needed
if you're using Linux with a venv then yeah gotta install to the venv
it was definitely not using my updated pytorch, i had to use the bat files for updating them for comfyui.
now it is, at least upscaling seems just slightly faster but nothing's really changed.
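For anyone else confused about which pytorch their UI is actually using, a small sketch that can be run with the same interpreter the launcher calls (the venv's python, or whatever python your .bat file points at). It only uses the standard library; the function name is made up for this example.

```python
import importlib.util
import sys


def interpreter_report() -> dict:
    """Describe the running interpreter and where (if anywhere) it would load torch from."""
    spec = importlib.util.find_spec("torch")  # locates the package without importing it
    return {
        "executable": sys.executable,
        # In a venv, sys.prefix points at the venv and differs from the base install.
        "in_venv": sys.prefix != sys.base_prefix,
        "torch_location": spec.origin if spec else None,
    }


if __name__ == "__main__":
    for key, value in interpreter_report().items():
        print(f"{key}: {value}")
```

If `torch_location` is not inside the folder your UI runs from, then `pip install --upgrade` went to a different Python than the one the UI uses, which matches what happened above.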
just some decent training, its not magic. the realistic depiction of a human body in various poses was just never on the agenda of stability AI.
Astralite is a hack and pony being anything usable is a fluke. The only reason it maintained dominance is no one else had the desire to put that much funds into training coomer shit. Pony itself is absolute overtrained fried in garbage and the dude is hashing artist styles and replacing them with some kind of retarded style system that I have no doubt is gonna fuck any bake he attempts on any arch to hell and back. Which doesn't even touch on that he will never train flux proper because he can't profit off it.

It always shocks me people here talk about pony like it's some miracle model and astralite like some god, when you go to h and he's laughed at for being a narcissistic, pony fucking retard who didn't know what caption drop out meant while tard boxing Lykon
Speaking of pony I wish the lora makers on Civitai would stop using obvious pny images as training data, it's so obvious because the result looks like a crappy pony mix
I'm not well versed in all the drama, but i do faintly remember something regarding the lead dev's actual mental retardation and not understanding how the tagging system even worked despite putting it in himself(?)
It's definitely not talked about nearly enough, but that still doesn't take away how objectively solid pony is.
it's horrid because I always thought the biggest complaint about pony was that shitty cooked in style... these jeets actually desire it
all of that doesn't matter jack fucking shit. pony can do stuff no other model can, end of document.
using generated images as training data in general is not healthy, big news
>of that doesn't matter jack fucking shit. pony can do stuff no other model can, end of document.
Name one thing pony is better at that isn't cartoon porn, I'll wait. Then consider that the only competition it's had in the smut arena is NAI (superior to pony but not local, so irrelevant), so it 'wins' best at smut by default of being the literal only option
this so this
>pony can do stuff no other model can
yes horse pussy
File: myFile_10_8.0_006.png (2.94 MB, 2136x1664)
unique poses, doing more than just ((1girl)) renders, being able to train it to handle complex or even very simple art styles without trying hard to steer it in baked in directions (yes even sdxl is like this, the aislop art style jeets love and normies can point from a mile away)
i know you're only asking this as a rhetorical question but i'm mentioning this for the 2 people reading this thread right now.
also you should consider possibly killing yourself today,preferably before you've had another coffee this morning. (and before i've had my second.)
File: 2024-09-05_00103_.png (1.11 MB, 720x1280)
l-la creatividad..
wow a hardcore pony lover appears
>>102240333 >>102240347
I remember you lol. ignore them, aggro baiters (probably just one guy samefagging). pony bad pony horseporn. ad infinitum.
File: myFile_10_3.0_059.png (2.14 MB, 1536x1536)
i couldn't possibly claim that title, this guy seems to be the horse genner la creatividad >>102240398

im having fun, haven't lurked legit since those pizza gens. It's surreal to think all of this is just a drop in the bucket for whatever's coming next year.
its like every gen, every stupid fun argument on these threads, all of it is leading up to that orgasmic moment of perfect videogen models synched up to audiogen and text gen. holy fuck. im optimistic to say the least regardless of who helps us get there.
>pony bad pony horseporn
This but unironically.
>being able to train it to handle complex or even very simple art styles without trying hard to steer it in baked in directions
Have you ever trained a lora in your life? Pony is notorious for being shitty to train on because Astralite fried the base style in with retardedly high LR. it's known for doing it's ponyisms even through styles, incredibly difficult to steer away from it's shitty cartoon thick lines, and because Astralite refuses to give any real insight on his captioning or dataset and keeps lying about hashing terms, there's a good chance your lora tags will just flat not work because the burnt ass pony model has decided the words "red skirt" mean something distinctly different. I'm not even humoring you past this, you are either baiting me or know absolutely nothing
this is pony by the way >>102240420
(still havent figured out the adetailer nodes to fix hands)
File: 2024-09-05_00112_.png (1.33 MB, 1280x720)
ya.. I tried to gen a night sky with a visible milky way.. it gave me some silly pony character that is named milky way.. no way Id use that model for anything artistic.
It looks like some horrible amalgamation between a 3d model and some uncanny valley skin suit made of plastic, and you can see the hands got fried having to overcook in realism (something pony doesn't know natively). All you did was further prove you don't know what the fuck you're talking about.
File: myFile_20_4.0_049.png (2.59 MB, 1536x1536)
sneed this
here for the ultra softcore porn, thank you all
t. nofapper q;^)
File: 2024-09-05_00115_.png (862 KB, 1280x720)
File: 2024-09-05_00116_.png (949 KB, 1280x720)
daaaamn its gonna be hot here again 35°C .. damnit.. my poor GPU
File: 2024-09-05_00119_.jpg (700 KB, 3840x2160)
also I am sad, my dim 96, alpha 32 lora ran OOM mid night.. saaad indeed
tell me about the wabbits, why do they wield the ice cream?
File: 2024-09-05_00121_.png (881 KB, 1280x720)
its vorpal rabbits with vorpal ice cream.. very deadly. You think Guts' sword is dangerous? you never felt the wrath of a vorpal rabbit
File: hqdefault-3148551175.jpg (18 KB, 480x360)

Why do I see so many images on Civitai generated with Pony that use easynegative when easynegative is actually for 1.5 models?
average IQ on Civit is two digits and closer to the lower side of 50 than it is to 100
it's a civitai slopgenner thing, ponyschizposter. seen it on pretty much every model i've ever had an interest in using, people think using 1.5 negatives on xl/pony/even FLUX once will actually do anything.
wait til you see the flux prompts with
>Best quality, no bad hands, perfect anatomy, score_9
it does do stuff. whether that is good or bad is up to you.
File: 2024-09-05_00127_.png (932 KB, 1280x720)
>absurdres, 16k, shot by a Canon EOS R1, trending on artstation, by Greg Rutkowski
monkey learn, monkey repeat, even if not viable. You also see ppl use SD15 embeddings on SDXL or FLUX gens.

ya this.. seen them already
bold of you to assume the average civitjeet even knows what different archs are desu
Go back
File: image (14).png (900 KB, 3486x1362)
A buddy of mine is having issues where his gens are greyed out or bricked if he uses anything other than Euler A. I'm not too familiar with comfy. Can you guys spot anything fucked up with his workflow?
ty i'll have a look at that
using this?
the docs are in moonrunes but i assume i can just run this like `python resize_lora.py --model old.safetensors --save_to new.safetensors --new_conv_rank 8`
i'd usually say that's some xformers or --fast type thing or just a bug from being on a bleeding edge commit but i notice the scheduler is karras, from memory some scheduler+sampler combos are not appropriate. in webui/forge this is hidden from you by just offering you the combo like "dpm++ 2m karras" but in comfy you're allowed to mix and match, so make sure he's at least aiming for a sampler/scheduler combination from some list (like from a webui) for now. I don't think that's necessarily his problem though
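For context on what the "karras" scheduler choice changes, here is a sketch of the noise-level spacing from Karras et al. (2022). The scheduler only decides which noise levels get visited; the sampler decides how to step between them, which is why comfy lets you mix the two. The `sigma_min` and `sigma_max` defaults are illustrative values (assumptions); real UIs derive them from the model and append a final zero.

```python
def karras_sigmas(n: int, sigma_min: float = 0.03, sigma_max: float = 14.6,
                  rho: float = 7.0) -> list[float]:
    """Return n noise levels from sigma_max down to sigma_min, Karras-spaced.

    Interpolates linearly in sigma**(1/rho) space, which packs more steps
    into the low-noise end than a plain linear schedule would.
    """
    if n < 2:
        raise ValueError("need at least 2 steps")
    max_inv = sigma_max ** (1.0 / rho)
    min_inv = sigma_min ** (1.0 / rho)
    return [
        (max_inv + (i / (n - 1)) * (min_inv - max_inv)) ** rho
        for i in range(n)
    ]


if __name__ == "__main__":
    for step, sigma in enumerate(karras_sigmas(10)):
        print(f"step {step}: sigma = {sigma:.4f}")
```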
File: 2024-09-05_00128_.png (1.21 MB, 1280x720)
in comfy you load embeddings like this:
also try to load the VAE from the LoadCheckpoint node, it probably is baked in. also that's way too many negatives: SD15 needed many negatives, sdxl is happy with a few.

other than that the only problem I see is using sdxl
this is text to image, sampler denoise must be 1. also several commas missing from the prompt, can fuck up pony
this >>102240747
deliver an empty latent with denoise 1 always.. for upscaling or img2img you reduce denoise
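A simplified sketch of why denoise below 1 breaks txt2img. The assumption here is that a sampler given `steps` and `denoise` skips the early, high-noise part of the schedule and only runs roughly the last `steps * denoise` steps; real UIs differ in the exact rounding. The function name is made up for this example.

```python
def denoise_window(steps: int, denoise: float) -> tuple[int, int]:
    """Return (start_step, steps_run) for a schedule of `steps` at a given denoise.

    denoise=1.0 starts from pure noise at step 0; lower values start later,
    which assumes the latent already contains a (partly noised) image.
    """
    if not 0.0 < denoise <= 1.0:
        raise ValueError("denoise must be in (0, 1]")
    steps_run = max(1, round(steps * denoise))
    return steps - steps_run, steps_run


if __name__ == "__main__":
    for d in (1.0, 0.6, 0.3):
        start, run = denoise_window(20, d)
        print(f"denoise={d}: start at step {start}, run {run} steps")
```

With an empty latent there is no image to preserve, so starting late means the sampler never removes the bulk of the noise it assumes is already gone, which would fit the grey or bricked outputs described above.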
can't believe i missed that, dishonorable seppukuing now. NTA but what do you mean by missing commas fucking up pony? I wasn't aware it greatly mattered whether you used spaces/enters/commas to separate tags, is this something you're confident of or have just noticed in passing?
the original forge guy came back from his "i'm bored" hiatus and added the anon who made the reforge fork to the main forge repo
how do you get forge to save the first gen pass? I actually don't even remember if it's saving images before it upscales.. only been a few months since i last used it and already forgot.
yeah pony can spazz out (you'll notice it when it happens, looks like modern art) and I think it's related to the prompt and missing separators. I see two there, between "best quality" and "1girl" and between "maid uniform" and "standing", the line breaks basically.
not using forge, but since it's an a1111 clone: it is probably hidden in the settings, that's where it was in a1111
>I wasn't aware it greatly mattered whether you used spaces/enters/commas to separate tags,
Nta but what the fuck have you been doing, dear God anon... How have you even genned anything so far... You've been using spaces and linebreaks to "seperate" your tags?
What the fuck
My brother in Christ
I can't even begin...
>always save all generated images and gens per pass are completely different settings
what did the A1111 devs mean by this? thanks. I have to tick a lot of fucking boxes now.
not the hecking boxes y'alls!
when renders come out real grainy and oily like this, what is that an indication of? god i hate taking long breaks from genning then forget why things break.

>at least forge IS now a whole minute faster in genning 1500pix images compared to comfy

shut up faggot retard
I wonder if that's just it hitting one of the hashes, like how "gpo" turns it into a really ugly shitty deviantart style? I've really never had this issue afaik, admittedly 95% of my prompts would be comma separated tags and 95% of the remainder would be very short/clear sentences arranging some tags into composition / multi-subject instructions, so the only time I'd have two tags separated by something other than a comma or another word would be the very rare case where I separate two blocks with line breaks for readability without using a BREAK
could you post an example of a fucked up gen that hits this discontinuity and resolves it with a comma, if you happen to have any saved? also applies to >>102240804
This fudtard uses masterpiece
Using a pony realism model, in my experience
then that's what i call a skill issue my man
>at least i eventually overcome mine
cool gen anon, what lora are you using?
Let me see an example
File: 2024-09-05_00156_.png (1.83 MB, 1024x1024)
flux makes some nice stuff with just "masterpiece" as prompt
The tokenizer understands commas to be the tag separator
How is this considered some esoteric, mythical knowledge to you, it's literal fact
what in the uncanny valley of horror is this anon. might be lora related in this case.
I think so, yeah. and no I don't have any of the failgens saved. you immediately notice it, looks like a pop art painting or something. btw, you know about the artist tags?
https://files.catbox.moe/5y78n4.jpg (nsfw-ish, ass with g-string) just a small selection
File: 2024-09-05_00114_.png (860 KB, 1280x720)
ty, Disgea lora
It being fanfiction you made up in your head would be my guess
Masterpiece on flux is understood via the t5
Masterpiece on pony is some shitty quality tag hold over from ripping tags off boorus and has no place in any prompt, usually used from misguided tards who used it on some other arch or model because it was hard baked in as a trigger word for "quality"
Wow you had me going, I really thought you were a just retard and not a troll. now I know you're both, bravo
>were a just retard
Just a. Fucking phone posting.
>tags off boorus
its not even a booru tag (masterpiece is non existent on any booru). The real story is that "masterpiece" was a baked in quality tag of the NAI leak. NAI used masterpiece to tag high quality pictures in their finetune of SD15
File: file.png (2.6 MB, 1024x1024)
thanks anon
I'm sure you are convincing a lot of people with your constant trust me bro word salad non arguments and avoiding posting documentation that doesn't exist. Fudtard
it was the lora, i forgot to NEVER set it to 1. thank ya muchly.
hey whichever one of you is the pony schizposter, do you have any good youtube channel recommendations that go over the history of LLM's and imagegen? I wanna see a nice mini doc on the story of how we even got this shit in the first place, my scattered understanding is that basically everything we get to use now got its start from leaks. Llama, NAI, etc.

>there's no way no one's covered this as a story yet
File: file.png (1.79 MB, 1024x1024)
how do I get comfy to output the image in jpg/jpeg format instead of png?
File: 2024-09-05_00155_.png (1.15 MB, 1024x1024)
you are welcome

im not the pony poster, but the lore is scattered in 4chan threads, have not seen any youtuber touch this yet .. kicked off with the usual suspects gathering in /sdg/ .. voldy was practically living in the thread around summer 2022 - autumn 2022, until comfy released his first version .. then it got into bitter fighting (you can probably still find their bickering on desuarchive), meanwhile everyone tried their best to get even something done with SD14 and SD15.. until NAI leaked .. then all hell broke loose, and the only way to further enhance SAI was to merge the silly finetunes that were spread on endless rentry pages, with secret mix recipes (ow the infamous berrymix).. and then the LoRAs happened in early 2023 and it got way more structured .. uuf that was fun times
you can use this node
My bad you are correct, misremembering on my part
do I just clone that repo to \custom_nodes ?
look at the readme
better use comfy ui manager to install custom nodes
quick recap of the thread so far:
yeah if you wanna do something "realistic" with a toon like that you need to dial in the lora carefully. also doesn't hurt to try if the toon is already in the model
File: file.png (1.94 MB, 1914x1050)
>Stable Image Ultra
What's that?
>Set the aperture f/4 to keep the entire face in sharp focus while maintaining a pleasing depth of field.
Holy fuckeroni.. SAI needs to die in a fire.
>cloud based
Don't care
that's the thing, SAI was all about "local models", now they're keeping Large and Ultra to themselves, this company is definitely dead, and has been replaced by BFL
>photography terminology
Would be interesting to see what /p/ says about how accurate it is
replaced by BFL, who *checks notes* keep their best model to themselves
considering bfl team were the former sai team too
>also doesn't hurt to try if the toon is already in the model
honestly true, though LORAs when theyre done right seem to help loads with accuracy
though that could just be a prompting skill issue, not like i can fully predict exactly what tags pony was trained to use for that character.
>i say this as i removed the lora, genning the new image and it's 100% accurate to the character
>replaced by BFL, who *checks notes* keep their best model to themselves
BFL never *checks notes* promised they would never keep their model to themselves, but SAI *checks notes* did make such promises
File: 2024-09-05_00164_.png (1.56 MB, 1024x1024)
so basically the best elements of the company left because of the incompetence of their CEO to make their own and better model, that's really similar to Anthropic (Claude) and OpenAI
flux pro mogs flux pleb pretty hard but let's not think about that
yeah lol, btw working on a lewd zelda set as I type so there's that. no lora either, just some artist tags and a style lora. loras can help a lot with clothing consistency
When did SAI make such promises? Feel free to provide a reference to back up your claim.
kek thank you for your personal breakdown of the events even if it presumes i know any of what you're vaguely pointing out
that looks a lot more fun than what was going on in /aicg/ and /lmg/ respectively. wish i had any idea how decent imagegen was even before 2023, would've loved to at least lurk.

I honestly feel like a fucking idiot for not just trying this from the start and seeing where it went by experimenting with my settings more. Just got so caught up in how nice the results were with the two loras.
Though now im facing some issues with the gens being kinda desaturated so thats fun.
as I get it SAI was basically a big incubator.. all that Emad was doing is get investors in and hire ppl, he had no real clue what they were doing and how to make a successful business out of that, then ran as soon as trouble was afoot and quit as CEO. Also toxic workplace.. I mean London cmon.. so the nerds that actually made SD15 took their stuff and left to found BFL .. now all they got is some diversity hires maintaining the status quo
File: 2024-09-05_00165_.png (1.35 MB, 1024x1024)
London isn't all bad but the office was literally above a chicken shop
File: ComfyUI_Flux_12469.jpg (496 KB, 672x1504)
File: file.png (2.26 MB, 1024x1024)
File: 2024-09-05_00169_.png (1.05 MB, 1024x1024)
did you say lewd Zelda?
File: 00000-2527154214.jpg (748 KB, 1728x1344)
Is this disturbing to anyone else?
>flux pro mogs flux pleb pretty hard but let's not think about that
let's think about that, where are the side by sides?
It was Emad's mission as the former StabilityAI's CEO at least
>This is why open source is important.
>My own personal experience was that we went from an open and flat structure to very corporate/siloed very quickly as I delegated out ex-research and model building to massively experienced folk.
>Didn't spend enough time instilling openness & other elements as core to culture.
File: ComfyUI_Flux_12487.jpg (410 KB, 672x1504)
I didn't wanna dive into that much further, here something I saved
pro is on another level.
yeah bottomless, sitting on a golden chest. with the nostalgia pony lora and some artist tags thrown in
>We are building national open models and datasets across modalities and will be funding hundreds of local researchers, communities and initiatives.
>We are not a release one model type of company but driven by community.
File: ttNpreview_00040.jpg (3.47 MB, 4303x2245)
the posts you're replying to aren't me, i'm only >>102240834 & the original asker. anyway, here's a test prompt at 1024x1024 on autismmix dpm++ 2m sde karras cfg6 24steps a1111-parser/mean-normalisation:
>score_9, score_8_up, score_7_up BREAK 1girl, black hair, bob cut, blouse, white shirt, blue skirt, pleated skirt, blue eyes, brown hair BREAK standing, looking at viewer, light smile, hand on hip, simple background
>score 1, score 2, score 3, text, watermark, 3d, cg, sex, nude
This is the ultra extreme case where the commas are just straight up all removed so the prompt reads like "standing looking at viewer light smile hand on hip simple background" and even in that extreme (keeping in mind the context here was a prompt with just a couple of missing commas) I'm not seeing major changes, nor a predictable effect in either direction. I'm assuming >>102240881 is being honest and that perhaps in rare cases a combination of unseparated tags do hit some poison token or something, but i'm not really sure the changes in this grid justify calling me a "troll and a retard"; that's very unkind of you and only drags the discourse of our happy /ldg/ down into the mud.
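For anyone who wants to repeat this comparison on their own prompts, a small sketch that builds the comma-free variant automatically while leaving the `BREAK` chunk boundaries alone. It only does string handling; the helper names are made up for this example, and feeding both prompts to the same seed and settings is left to your UI.

```python
import re


def split_on_break(prompt: str) -> list[str]:
    """Split a prompt into chunks at the uppercase BREAK keyword."""
    return [c.strip() for c in re.split(r"\s*\bBREAK\b\s*", prompt) if c.strip()]


def strip_commas(chunk: str) -> str:
    """Turn 'tag a, tag b' into 'tag a tag b' (the extreme no-separator case)."""
    return " ".join(t.strip() for t in chunk.split(",") if t.strip())


def comma_ablation(prompt: str) -> str:
    """Rebuild the prompt with every comma removed but BREAKs preserved."""
    return " BREAK ".join(strip_commas(c) for c in split_on_break(prompt))


if __name__ == "__main__":
    original = "score_9, score_8_up BREAK standing, looking at viewer, light smile"
    print("with commas:   ", original)
    print("without commas:", comma_ablation(original))
```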
File: file.png (2.94 MB, 2128x1379)
It's literally the motto of StabilityAI when you go on their main site page
>Open models in every modality,
for everyone, everywhere.
File: ComfyUI_33342_.png (906 KB, 768x1024)
>comfyui genning to 1500pix took 3 mins for 1 image at 1024pix base scale
>forge 3 mins for 2 images at this config
god damn they're on a roll lately, and i was just doomposting about forge 2 months ago.
>Don't be evil.
ya.. once again proves: don't believe company mottos
>provide evidence
>no not that evidence it doesn't count
amazing strategy
yeah I know, I never said I believe promises of companies, but my point was that they did make such promises
File: ComfyUI_Flux_12477.jpg (491 KB, 672x1504)
File: ComfyUI_Flux_12495.jpg (438 KB, 672x1504)
File: 00017-681986612.jpg (738 KB, 1728x1344)
cool image dude
interesting. here found something in my history, that is how the gens look, all of a sudden. https://www.reddit.com/r/StableDiffusion/comments/1f73xch/glitched_images_occurring_in_pony_sdxl/?rdt=36218
and to add my experience, I've had it be persistent even after a restart of comfy & had it happen in reforge too. anyways
beautiful gens son, I grabbed the lora last night
File: Lenghty Dick General.jpg (3.28 MB, 3264x3264)
is there ever any benefit to fucking around with upscaling settings? it seems like 0.3 denoising and 5 samples to get to my 1500pix res is good enough, but my anal mindset is telling me to tweak anyway.
I really like the art style. Are you using a LoRA for that or is it just prompt?
Just tPonynai3_v61 checkpoint and some inpainting. Prompt wise I was trying to do something the likes of >>102240467
>is there ever any benefit to fucking around with upscaling settings?
Once you get it where you want no. But you may need to adjust settings if say you're using a character LoRA.
File: 00023-4033627191.jpg (647 KB, 1728x1344)
File: file.png (2.1 MB, 1024x1024)
>Digital illustration,
>A woman doing a pov selfie in front of a giant billboard in New York in the shape of a cloud presenting the game "Disgaea" at night
how do you make an actual selfie on Flux?
>doing a pov selfie
i just prompted "taking a selfie, camera POV, amateur photography" after a BREAK in pony and it worked fine.
File: FLUX_00063_.png (1.55 MB, 1440x1120)
In pony? Sure, that's probably the "selfie" tag doing the heavy lifting there, but some of those rigorously defined tags in booru finetunes are quite hard to achieve in Flux
get the dip
>the "selfie" tag
nah like the schizo was pointing out, none of this works the way you probably expect it to. It's pony's training and XL's understanding of what those phrases mean. Combining them gave me the effect of looking at the character through her phone, taking a selfie. Flux i imagine works exactly the same way.
File: ComfyUI_Flux_12515.jpg (397 KB, 672x1504)
>Stable Image Ultra has an additional 4 billion parameters vs. Flux.
So their closed model is a 16b model? I hope it's better than Flux pro or else they are fucking done
File: file.png (103 KB, 1960x786)
>So their closed model is a 16b model?
maybe try something like
>nighttime selfie of a woman standing in front of a giant billboard in the shape of a cloud, it depicts an advertisement for a game "Disgea".
Huh? The original guy posted an example of the word "selfie" not really working for him on Flux. I'd expect it works sometimes (hell, maybe it works a lot and he just got unlucky or is bamboozling us, baking atm so can't flux to test), but I'm pretty confident it's reliable on Pony because of 20,000 identical compositions on https://danbooru.donmai.us/posts?tags=selfie being fed to Pony with 100% reliability and correlation. Smoke this joint to prove to me you're not the real schizo here, schizos aren't allowed to
It's too early for me to be smoking marijuana right now, so my comprehension of the conversation could be wrong, but aren't you agreeing with my point? Why are you calling me the schizo?
>still 16b parameters translates to how much required vram?
it's still manageable to run a 16b model if you go for Q8, the quality is really close to fp16 so that's all right
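A rough sketch of the arithmetic behind that, weights only. The bits-per-weight figures are assumptions based on the GGUF block layouts (Q8_0 and Q4_0 both store one fp16 scale per 32 weights); real files mix tensor types, and the text encoder, VAE and activations need memory on top of this.

```python
# Weight-only memory estimate. Bits-per-weight values are assumptions
# taken from the GGUF block layouts, not measured file sizes.
BITS_PER_WEIGHT = {
    "fp16": 16.0,
    "Q8_0": 8.5,  # 32 int8 weights + one fp16 scale per block
    "Q4_0": 4.5,  # 32 4-bit weights + one fp16 scale per block
}


def weight_gb(n_params: float, bits_per_weight: float) -> float:
    """Size of the weights alone, in decimal gigabytes."""
    return n_params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    for name, bits in BITS_PER_WEIGHT.items():
        print(f"16B @ {name}: {weight_gb(16e9, bits):.1f} GB")
```

So under these assumptions a 16b model is about 32 GB at fp16 and about 17 GB at Q8_0, which is still more than a single 16 GB card holds without offloading.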
made a few solid mirror selfies in flux but an actual selfie, uh. haven't tried. still gonna take a puff if you don't mind (diff anon here)
you can take the blunt he handed to me by force, i really shouldn't smoke before 3PM. here you go bud, one hell of a nightmare blunt rotation if i do say so myself.
>i seriously just sent that message at 4:20 AM
time to gen 20 images of princess peach smoking ganja (nude) (subsurface scattering) (golden hour)

yknow maybe the pony schizoposter was the friends we passed weed to along the way
File: file.png (2.09 MB, 1024x1024)
Managed to make it work with something like this I guess:
>A close up portrait of a woman, behind her there's a giant billboard in the shape of a cloud, it depicts an advertisement for a game "Disgaea"
File: 2024-09-05_00191_.png (1.08 MB, 720x1280)
Did not know that Disgea has Mario and Pikachu .. pic related.
it could have anything in it because Disgea doesn't exist, but Disgaea does
is disgea that bottled water that was recalled for having microplastics in it (how times change)
>>102241458 lol. I haven't smoked in 8 years and its 1pm here. I'd kill me.
>>102241477 maybe the pony schizo was the weed we smoked? think about it.
File: ComfyUI_hgdf_00386_.png (1.46 MB, 1344x768)
kek.. I fixed it in my lora description .. silly me
File: 2024-09-05_00196_.png (1.22 MB, 720x1280)
File: tmp987qzan2.png (1.48 MB, 1152x896)
wish i had access to weed
gg wp anon
File: 2024-09-05_00199_.png (1.34 MB, 720x1280)
use this:
>This is a selfie portrait of a woman standing in front of a giant billboard in the shape of a cloud, it depicts an advertisement for a game "DISGAEA". It is night time. The advertisement shows an anime style combat scene of a fantasy army battle.

>gg wp anon
File: ComfyUI_hgdf_00388_.png (1.35 MB, 1344x768)
File: file.png (1.89 MB, 1024x1024)
>use this:
I think at this point it's just the lora's fault, it's raping the weights so hard it doesn't know what a pov selfie means anymore
File: 2024-09-05_00201_.png (970 KB, 720x1280)
I am using the Disgaea lora for >>102241621 and >>102241573 .. maybe it's the sampler? I use dpmpp_2m with sgm_uniform atm
or the lora weight.. I lowered it a bit
Yeah but you don't seem to be using the "Digital illustration" trigger word, I want it to be fully anime
>retro flux lora
thats where T5 helps you .. digital illustration will trigger the style for the whole picture (I made that lora so I know whats in the captions), but if you tell t5: this is a selfie portrait, then later tell it the advertisement shows anime style, it can do both styles in one picture
File: file.png (2.06 MB, 1024x1024)
File: 1723832077294177.png (752 KB, 1314x361)
>Fixed Point Diffusion Models

>We compare our model, trained at a 256 × 256 resolution against the state-of-the-art DiT [37] on four datasets (FFHQ, CelebA-HQ, LSUN-Church, ImageNet) using compute equivalent to 20 DiT sampling steps. FPDM (right) demonstrates enhanced image quality with 87% fewer parameters and 60% less memory during training.

>it can do both styles in one picture
it sure can do that, but what I want is the disgaea style everywhere, not a human woman mixed with anime drawing behind
>at a 256 × 256 resolution
it's the GAN architecture meme all over again? Good at low resolution and shit at high resolutions?
It's September.
Where's bigma?
File: file.png (93 KB, 350x350)
I forgor Pixart ever existed lol
File: ComfyUI_hgdf_00373_.png (1.21 MB, 1024x1024)
File: 2024-09-05_00212_.png (1.09 MB, 1024x1024)
ah gotcha..
>This is a close up anime-style portrait selfie, anime-style portrait of a woman's head and upper body in front of a giant billboard in the shape of a cloud, it depicts an advertisement for a game "DISGAEA". It is night time. The advertisement shows an anime style combat scene of a fantasy army battle.
That smile is one tiny squint away from needing correction.
>Flux quality at 200% smaller size

I believe.
File: 2024-09-05_00213_.png (1021 KB, 1024x1024)
oops I doubled anime-style portrait.. works with only one mention of it too
>This is a close up anime-style portrait selfie of a woman's head and upper body in front of a giant billboard in the shape of a cloud, it depicts an advertisement for a game "DISGAEA". It is night time. The advertisement shows an anime style combat scene of a fantasy army battle.
File: ComfyUI_Flux_12539.jpg (429 KB, 672x1504)
These guys are a fucking meme now. SD3 8B wasn't even close to Flux Dev. At prompt following it was even worse than Pixart 600m, because they also censored the shit out of that 8B model (it couldn't do feet, etc...)
Also the Ultra model as tested on Discord is also a meme riddled with the same issues as 8B. HunyuanDiT which was only 1.5B performed much better than their 16B.
oh yeah? can you show some examples so we can laugh a bit?
File: file.png (1.83 MB, 1024x1024)
yeah that's better, thanks anon
>Also the Ultra model as tested on Discord is also a meme riddled with the same issues as 8B.
Emad told us training SD3 cost them 10 million, I find it funny they managed to burn even more millions to make another meme model
File: 88846854645465.png (52 KB, 867x563)
I did a while back

Although HDiT does do an amazing job (with fingers even better than flux at times) at excluding AI artifacts, there is still no excuse for them to be SDXL quality.
>3 months ago
what? I thought that model was released a few days ago, the fuck is happening there? lol
File: ComfyUI_hgdf_00380_.png (1.3 MB, 1352x768)
What's wrong with it?
what do you actually prompt to get this? lora does not seem to be doing much at all
I still laugh in the face of hunyuan shills
it was an sdxl finetune before, they said sd ultra is their best model but not a specific one
File: ComfyUI_Flux_12549.jpg (467 KB, 672x1504)
File: 515154454521.jpg (114 KB, 592x592)
Kek I realize that was 2B, but here's an anime image generated with Ultra, prompt "a girl waifu anime style". I remember browsing through the gens at the time and concluded the model was dogshit.
>I did a while back
no I meant the recent Ultra one, the one released yesterday
File: file.png (2.1 MB, 1024x1024)
to the disgaea lora anon, you did a good job of not fucking up the anatomy with it, it's cool when that happens
File: 4884465465.png (136 KB, 403x297)
Ultra is not a new model, they have been testing it on Discord for a while
Released on Bedrock yesterday but it is not a new model.
File: 2024-09-05_00216_.png (968 KB, 1280x720)
the prompt for this one was
>Anime-style rabbit warrior holding a giant ice pop that looks like a two handed sword. In the back ground is a night sky. The rabbit stabs a goblin.
(it fucked up the goblin, but who cares, the rest is cool)

also pic related is the same prompt + seed without the lora.. it's okay, but not what I wanted. Also you would be amazed at how many blue anime ice pop pictures I have in the dataset

File: 5156454545645.jpg (110 KB, 592x592)
As for their photoreal Ultra model, here's "photoreal instagram avi for a personal trainer named Lauren Blanchard who loves to hike"
>she loves to hike so much that she started to become the trail
File: ComfyUI_Flux_12567.jpg (438 KB, 672x1504)
what's with the resolution
all i hope is that elon buys them out and fires everyone there
wtf? it's ultra saturated, and even for a prompt as simple as "1girl" they messed it up, she doesn't have her right arm. can't believe amazon bought that shit, ffs it's a 16b model
oh yeah you definitely want the ((GROK)) guy going into imagegen, and not just let him give the FLUX devs free advertising.
File: ComfyUI_Flux_12571.jpg (441 KB, 672x1504)
what's up with that grainy effect? I had the same with this old school lora one
>... who loves to hike and compete in the Paralympic one armed tennis team.
not local, don't care
File: file.png (125 KB, 2475x769)
yeah they even made a funny joke about that one on leddit
File: ComfyUI_Flux_12573.jpg (441 KB, 672x1504)
thats just part of the aesthetic
can you go for something bigger?
i wish bfl all the best but its clear they dont really give a shit about local trainers
File: 87789786565.png (289 KB, 1108x376)
I'm not the one genning since I'm not a paypig, these are what I find on Discord, they do not let you download in higher res by default so it's probably some safety feature (I probably must use their GUI or some shit, won't mess with it).
oh ok, it's kind of embarrassing that those pictures have such horrible anatomy issues. if they can't do something good with a 16b model they will never make it, I think this company is definitely dead, and Flux was the final nail in their coffin
File: 00010-1078962388.png (1.29 MB, 1024x1536)
File: ComfyUI_Flux_12581.jpg (422 KB, 672x1504)
that is one terrible prompt tho
File: 20240905074541-1356504967.png (1.5 MB, 1024x1280)
sloppa incoming
File: 20240905074642-1356504969.png (1.44 MB, 1024x1280)
why though
File: ComfyUI_Flux_12587.jpg (427 KB, 672x1504)
it is how it is
File: FLUX_00086_.png (1.75 MB, 1120x1440)
ladies night
File: file.png (3.9 MB, 1457x1514)
>oh yeah? can you show some examples so we can laugh a bit?
there's this, and the images Ultra makes are terrible
File: 00000-1859538050.png (4 MB, 1664x2432)
File: 2024-09-05_00228_.png (1.51 MB, 832x1216)
File: 00050-1267951095.png (1.2 MB, 1024x1024)
File: out.jpg (1.06 MB, 3072x1280)
i figured out why my loras weren't working with the flux1-dev-q8 gguf
left - before
middle - flux1-dev-q8
right - flux1-dev-q8 + t5-v1_1-xxl-encoder-Q8_0.gguf

forge has a dropdown which wasnt showing for me called:
"diffusion in low bits"
it was set to auto, i switched to auto + lora fp16
and loras back to working
File: 20240905085033-711349899.png (1.12 MB, 1024x1280)
i had to jack the strength way up and reroll a lot before it even produces one image vaguely in the correct style so this is a fail in my book, your shit is undercooked
File: ComfyUI_Flux_12595.jpg (430 KB, 672x1504)
nvm it was this lol
want the dim 64 5000 step version? it's 650MB .. also I heard others complain it's too overcooked... I guess I'll just be fine with it
>Forge still does that thing where it sometimes hangs on finishing a gen, causing my system to explode to 100% usage and slowing everything down
aaaaahhhhhh just when i thought things were finally looking up.
>at least i finally got around to genning princess peach smoking a blunt (it looks like a cig though)
File: 1706574999819.jpg (566 KB, 1024x1024)
hailuoai is fun to play with
not me
File: 1723190408582.jpg (319 KB, 1024x1024)
File: 00005-3026436554.jpg (406 KB, 1664x2432)
anons ive been using flux1-dev-bnb-nf4 but is there a better one I should be using? I have a 4070 super btw
yeah, wow, that looks great, anon
you have 12gb of vram, you should go for Q4_K_M instead
File: 20240905090055-1369836339.png (1.11 MB, 1024x1280)
bowstrings are a bit beyond it but it's still neat
File: 2024-09-03_00328_.jpg (803 KB, 3840x2160)
ya, I think there is only one archer in the data set
File: file.png (2.94 MB, 1534x1134)
>flux dev seems to actually converge at about 35-40 steps. The image at 35 steps is what the model is trying to do. Illustrations converge faster so you get different results earlier. ie: before 35 steps it's just a not completely formed image (that is still nice to us)
that's interesting
File: ComfyUI_hgdf_00384_.png (1.85 MB, 1680x960)
File: 2024-09-05_00236_.png (1.16 MB, 1216x832)
File: ComfyUI_Flux_12607.jpg (433 KB, 672x1504)
I'm only seeing Q4_K_S, where do I get the M version?
my b, I meant Q4_K_S yeah, you could even try the Q5, I think they might fit on a 12gb vram card
File: ComfyUI_00168_.jpg (234 KB, 1072x1648)
still can't get over how good flux looks
>still can't get over how good flux looks
>posts bug eyes plastic skinned freak to show it

Her male looksmatch is hanging from a rafter right now.
maybe that's his kind of girl, I won't judge
File: 00001-1343288145.jpg (572 KB, 2048x2048)
File: FluxDev_04740_.jpg (221 KB, 832x1216)
File: ComfyUI_00170_.jpg (177 KB, 1072x1648)
you guys wouldn't believe how angry people get at me genning anya
Best local diffusion for Mac OS?
well there's not much else to critique, it's just her against a blank background
Anya is not the issue, do get a realism lora to avoid the plastic skin issue.
ooph.. all of them will be slow as fuck, the tensor performance of even the best M4/M3 Max model is shit ..
File: ComfyUI_00159_.jpg (187 KB, 1648x1488)
how about just appreciate instead of always critiquing
nta but exactly this. Last I checked it's slower than a 1060... it just happens to have a lot more ram (unified) available. Better for LLMs but even at high param count models the preprocessing time starts to be a killer.
how come mac can't do it well?
regardless of who the subject is, the face is classic for shit civitai loras of her.
then there's the hairline, lack of skin texture, bleached sclerae, teeth clipping, eyelash artifacts, DOF, incoherent textile patterns, the bra seemingly hovering in front of her chest, mismatched shadows, et c
that's impossible
File: ComfyUI_00176_.jpg (385 KB, 1328x1376)
cause they don't have fast memory or tensor cores. The UMA gives a lot of video ram, but it's creeping slow (150-300GB/s vs 1000GB/s on the 40xx series). And the tensor performance is on the level of a good smartphone or a bad 10xx series card..

Apple likes to talk big about their CPUs, but if you wanna do AI that isnt some slow LLM they are a joke.
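To put the bandwidth figures quoted above in perspective, here is a back-of-the-envelope sketch. It assumes every weight is read from memory once per forward pass and ignores compute entirely, so it is only a floor, not a prediction. The 12 GB weight size is a hypothetical example, and the bandwidths are the ones from the post, not measured values.

```python
# Lower bound on time per forward pass if it were limited purely by
# streaming the weights from memory once. Real gens are also limited
# by compute, so treat this as a floor only.
def min_seconds_per_pass(weight_gb: float, bandwidth_gb_s: float) -> float:
    return weight_gb / bandwidth_gb_s


if __name__ == "__main__":
    WEIGHT_GB = 12.0  # hypothetical model size, for illustration only
    for label, bw in [("UMA low", 150), ("UMA high", 300), ("40xx", 1000)]:
        ms = min_seconds_per_pass(WEIGHT_GB, bw) * 1000
        print(f"{label:8s} {bw:5d} GB/s -> at least {ms:.0f} ms per pass")
```

Under these assumptions the floor is about 80 ms, 40 ms and 12 ms per pass, and that gap gets multiplied by every sampling step.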
everything in the AI space is geared towards coding for Nvidia's CUDA cores which no Apple hardware has. I don't know if anyone has tried coding natively for the GPU that Apple does have but I'm sure we would have seen something by now if anyone thought it would give decent performance.
I wonder why Apple isn't panicking about getting left behind when it comes to this stuff. it's not like they dont have the $ to make their machines viable for ai gen content
There's CoreML, which is Apple's native machine learning framework.

It's faster to use it (but annoying, since it means you have to convert models), but still horrendously slow compared to even shitty CUDA cards.
Apple doesn't seem all that interested in running high end models on customer hardware. Look at the image gen of their 100% native Apple Intelligence - it's pathetic.
Makes SD1.5 laugh.

Apple only needs to run models that are good enough for the normies and it shows. Remember that "Superhero mom" nightmare fuel and how proudly they showed it…?
nah.. they never were in the graphics high end game, even when they were intel they did not support NVidia and always shipped with amd. there are not even drivers for NVidia gpus on the old macOS
could at least gen it yourself cunt
Your next loaf, ready to collect...
To give you a perspective on speed:

An M4 boasts about 38 TOPS (trillion operations per second) for AI
A 4090 has 1321 AI TOPS

.. so a 4090 GPU is ~35x faster than an M4 at AI
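The ratio above checks out as plain arithmetic. A small sketch using the two figures quoted in the post; these are vendor peak numbers and may not be measured the same way (precision, sparsity), so the result is a ballpark rather than a benchmark.

```python
# Peak "AI TOPS" figures as quoted in the post above (vendor marketing
# numbers, possibly measured under different conditions).
M4_TOPS = 38
RTX_4090_TOPS = 1321


def speedup(fast_tops: float, slow_tops: float) -> float:
    """Naive ratio of peak throughput figures."""
    return fast_tops / slow_tops


if __name__ == "__main__":
    print(f"4090 vs M4: ~{speedup(RTX_4090_TOPS, M4_TOPS):.1f}x")
```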
I'm in the op... today is a good day

Ultra looks like schnell
Does comfy have something similar?
File: ComfyUI_hgdf_00388_.png (1.3 MB, 1344x768)
KEK a luxury brand or high-end campaign is not going to do that, what are they gonna do, pay a professional photographer to take shots of their products, then pay high-end prices to SAI, then make a lora and hope for the best?

>Stable Image Ultra: Photorealistic, Large-Scale Output
>Ideal For: Ultra-realistic imagery for luxury brands and high-end campaigns.
>Use Case Example: A luxury brand uses Stable Image Ultra to create stunning visuals of its latest collection for magazine spreads, ensuring a premium feel that matches its high standards.
Kek, Ultra easily looks the worst. That's what they call photorealistic? That's plastic.
