Discussion of free and open source text-to-image models

Previous /ldg/ bread : >>101761268

>Beginner UI
EasyDiffusion: https://easydiffusion.github.io
Fooocus: https://github.com/lllyasviel/fooocus
Metastable: https://metastable.studio

>Advanced UI
Automatic1111: https://github.com/automatic1111/stable-diffusion-webui
ComfyUI: https://github.com/comfyanonymous/ComfyUI
InvokeAI: https://github.com/invoke-ai/InvokeAI
SD.Next: https://github.com/vladmandic/automatic
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI

>Use a VAE if your images look washed out
https://rentry.org/sdvae

>Model Ranking
https://imgsys.org/rankings

>Models, LoRAs & training
https://civitai.com
https://huggingface.co
https://aitracker.art
https://github.com/Nerogar/OneTrainer
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts

>Flux
https://huggingface.co/spaces/black-forest-labs/FLUX.1-schnell
https://comfyanonymous.github.io/ComfyUI_examples/flux

>Pixart Sigma & Hunyuan DIT
https://huggingface.co/spaces/PixArt-alpha/PixArt-Sigma
https://huggingface.co/spaces/Tencent-Hunyuan/HunyuanDiT
https://huggingface.co/comfyanonymous/hunyuan_dit_comfyui
Nodes: https://github.com/city96/ComfyUI_ExtraModels

>Kolors
https://gokaygokay-kolors.hf.space
Nodes: https://github.com/kijai/ComfyUI-KwaiKolorsWrapper

>AuraFlow
https://fal.ai/models/fal-ai/aura-flow
https://huggingface.co/fal/AuraFlows

>Index of guides and other tools
https://rentry.org/sdg-link
https://rentry.org/rentrysd

>GPU performance
https://vladmandic.github.io/sd-extension-system-info/pages/benchmark.html
https://docs.getgrist.com/3mjouqRSdkBY/sdperformance

>Try online without registration
txt2img: https://www.mage.space
img2img: https://huggingface.co/spaces/huggingface/diffuse-the-rest
sd3: https://huggingface.co/spaces/stabilityai/stable-diffusion-3-medium

>Related boards
>>>/h/hdg
>>>/e/edg
>>>/d/ddg
>>>/b/degen
>>>/vt/vtai
>>>/aco/sdg
>>>/trash/sdg
>first time in several collages that I'm not in it
I'm free
>flux vae is fucked with that retarded grid pattern and there's nothing we can do about itAI detectors easily pick it up btw
>>101764200Why the fuck should you care? Are you trying to trick people into thinking it's your art?
>>101764200no grid pattern on my gens
>>101764221It is there, it's just not visible, anon.
All right, so I made an XY plot between the samplers and schedulers for this prompt:
>Hatsune Miku with dreadlocks and a black skin showing her fists
Here's a few notes:
1) I used CFG 3 + DynamicThresholding or else flux simply wouldn't want to modify Miku's features
https://reddit.com/r/StableDiffusion/comments/1ekgiw6/heres_a_hack_to_make_flux_better_at_prompt/
2) Only these samplers work with flux on ComfyUI:
>euler; heun; heunpp2; dpm_2; lms; dpm_adaptive; dpmpp_2m; lcm; ipndm; ipndm_v; deis; ddim; uni_pc; uni_pc_bh2
The rest just give insanely glitched output so it wasn't worth adding them in
3) The schedulers "normal; sgm_uniform; simple;" give almost the same output, so I only went for "simple"
https://files.catbox.moe/af40tk.jpg
There are some interesting observations you can make from those samplers + schedulers, they're not as identical as I thought.
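A rough sketch of how the sampler/scheduler grid for a plot like that could be enumerated. The sampler names are copied from the post; the scheduler list is an assumption (the post only names "simple" explicitly, and "beta" comes up in later replies), and `xy_grid` is a made-up helper, not a ComfyUI function.

```python
from itertools import product

# Sampler names as listed in the post above.
SAMPLERS = [
    "euler", "heun", "heunpp2", "dpm_2", "lms", "dpm_adaptive", "dpmpp_2m",
    "lcm", "ipndm", "ipndm_v", "deis", "ddim", "uni_pc", "uni_pc_bh2",
]

# Assumption: the full scheduler axis is not listed in the post.
# "simple" is named there, "beta" is mentioned in later replies.
SCHEDULERS = ["simple", "beta"]


def xy_grid(samplers, schedulers):
    """Return one settings dict per cell of the XY plot (hypothetical helper)."""
    return [
        {"sampler": sampler, "scheduler": scheduler}
        for scheduler, sampler in product(schedulers, samplers)
    ]


grid = xy_grid(SAMPLERS, SCHEDULERS)
print(len(grid), "cells to render")
```

Keeping the seed and prompt fixed across every cell is what makes the comparison meaningful, since only the sampler and scheduler change.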
>>101764233See, that's why I told you to use a much more specific prompt, now you don't know if the variation is due to the sampler or just the prompt being that vague...
>>101764229nope
>>101764247Post your gen (raw png) and I'll show you
official pixart bigma and hunyuan finetune waiting room
>>101764244
Nope, the prompt is fine, it's already working when you're being more aggressive with the guidance (4.0) + CFG (3.0). For that one I went for guidance 3.5 + CFG 3.0 so that I could find a set of samplers + schedulers that would work for something less aggressive. desu there's none that had both the black skin and the dreadlocks, so I consider it a failure
>>101764258tell me the method you use and I'll see for myself
>>101764233very interesting! thanks, anon.
>>101764283I think https://arxiv.org/pdf/1912.11035 should easily spot Flux gens too
>>101764200>mfw the latest free shit has an invisible pattern
>>101764297Yeah, looks like it works for SD at least, see e.g. https://blog.metaphysic.ai/combating-stable-diffusion-face-forgery-through-frequency-analysis/ https://arxiv.org/pdf/2210.14571v4
>>101764315"I just realised... Sucrose and Collei... Anemo and dendro, wind...Blume..."?
>>101764233thanks for the research efforts, heunpp+beta looks very promising, had not considered that one yet, pic related
Some data points for you guys
>>101764297The squares on the frequency analysis have fuck all to do with squares we're talking about which are on the image itself. Dummy.
>cropping out the background and replacing it with a solid color when training a character LORAthoughts?
>>101764360
Bad. It fucks up the LoRA. All the gens using your subject will have a white or black background. You can't prompt a background back in.
>>101764360
>>101764371
To add to this, you want as many diverse backgrounds for your subject as possible. Inside, outside, nature, urban, light, dark, day, night etc etc
>>101764233thanks a lot anon, saved
>>101764379>>101764371are you tagging the backgrounds then? how thorough do you have to be?
>>101764387could you make these more fuckable? thanks.
>>101764102
>dev2 branch scrapped
RIP.
If you were on a dev2 branch before updating, then just do git checkout dev2 and don't touch it again until a major update, otherwise try git checkout on the main branch back to the commit before lllyasviel started messing with the repo somewhere in June-July. Alternatively give reForge a try, though I have no idea if it even works properly.
>a 1060 6gb took 120 sec per gen on sd1
>a 3060 12gb takes 150 sec per flux gen
Why Are We Still Here? Just To Suffer?
tho i get the impression something very wrong is happening, swarmui uses a constant 24gb ram after loading the model but flushes the vram multiple times.
>>101764388
I can't remember, haven't trained any LoRAs since SDXL, I used an auto tagger and just gave the data a once over.
I remember doing the exact thing of removing the background of the data set and it definitely fucks up the LoRA.
I ran the next session with backgrounds in and it was perfect.
>>101764233heunpp+beta looks cool, but euler+beta (see >>101764165) has better sky
>>101764233DPMadaptive+beta is madness, it went to 60 iterations before it was happy with convergence
>>101764233heunpp2 beta 14 steps
>>101764233ipndm for schizo gen
>>101764469ya its nice but its sloooow, euler+beta ~1.2s/it, heunpp+beta ~2.4s/it, with negatives hack ~5s/it
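Quick arithmetic on those numbers, as a sketch: time per image is just steps times seconds per iteration. The s/it figures are the approximate ones quoted in the post; the step counts are assumptions picked for illustration (14 matches the heunpp2 post above, 20 is an arbitrary euler setting).

```python
def seconds_per_image(steps: int, seconds_per_iteration: float) -> float:
    """Total sampling time for one image, ignoring model load and VAE decode."""
    return steps * seconds_per_iteration


# Approximate s/it values quoted in the post (hardware-specific).
TIMINGS = {
    "euler+beta": 1.2,
    "heunpp+beta": 2.4,
    "heunpp+beta + negatives hack": 5.0,
}

# Assumption: step counts chosen only for illustration.
STEPS = {"euler+beta": 20, "heunpp+beta": 14, "heunpp+beta + negatives hack": 14}

for name, s_per_it in TIMINGS.items():
    total = seconds_per_image(STEPS[name], s_per_it)
    print(f"{name}: {STEPS[name]} steps -> {total:.1f} s")
```

The roughly 2x jump from heunpp to heunpp with negatives lines up with the later point in the thread that CFG other than 1 needs a second model evaluation per step.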
>>101764297
>Upscale the image
>Grid pattern gone, like magic!
>>101764396
>Alternatively give reForge a try, though I have no idea if it even works properly.
It works alright, but it's starting to have some bloat
DreamshaperXL 4 steps. Um... didn't exactly get the prompt correct, but I assume that has to do with some SFW thing with fooocus.
Have you tried decreasing the CFG value so that you get more varied styles out of your prompts?
>>101764785no because using CFG cuts speed by half
>>101764785Would you look at that!
>>101764857>In the style of Pablo PicassoNO YOU CANT DO THAT
can I get a link to a workflow with CFG included so I can give it a try?
>>101764864nta, but yes I can.
>>101764877
Sure, here it is.
https://files.catbox.moe/rf18x1.png
>>101764785>>101764857Is it still 2-3 times slower than having cfg at 1?
>>101764992it's 2 times slower when cfg isn't 1 because it has to calculate the negative prompt on top of the positive prompt now, desu I don't know why it shouldn't be the same speed if you put nothing in the negative prompt though
>>101764974
Jesus Christ I thought that was a real German's desktop
>>101764992
Yes.
It definitely does something though.
>Princess Jasmine in the style of Frida Kahlo
Just not sure it's the something I intended.
>>101765023
>real Germans desktop
Anyone else notice that a lot of the default gibberish in this model looks distinctly German?
Have any of you been able to get a pose like this out of FLUX? I tried, but without success. Can it just not do it?
>>101765003
Have you guys tested if having the same token amount on pos and neg affects generation time?
>>101765023
>a painting of princess jasmine by Frida Kahlo, cfg 0.3
>>101765038
I mean there's a model called Schnell
>>101765062
>Princess Jasmine painted by Frida Kahlo
cfg 1.0
cfg low is placebo, it's just being more creative and you are fluking results I think.
>>101764857
right is cool/interesting but it's still not even close to the style of Picasso
like that's not even a decent approximation of cubism lol, it's just some totally different random thing
>>101765023
>in the style of Frida Kahlo
is not by Frida Kahlo if there is no mustache and nearly unibrow
>>101765078
>cfg low is placebo
this, it will ofc change something, but my Picasso Mikus just went full back to anime with low CFG
>>101765104nice! What's your prompt anon?
>>101765003
>I don't know why it shouldn't be the same speed if you put nothing in the negative prompt though
because originally having nothing in the negative prompt was the whole point. Classifier Free Guidance runs each step twice, once with no text condition and once with the text condition, then you control how much of each to use to change the noise
putting things in the negative prompt is a hack of CFG
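The mixing described above is usually written as uncond + scale * (cond - uncond). A minimal sketch on plain Python lists, assuming that standard formula; `cfg_combine` and `forward_passes` are made-up names, and real UIs do this on tensors inside the sampler.

```python
def cfg_combine(uncond, cond, scale):
    """Classifier-free guidance mix: uncond + scale * (cond - uncond).

    `uncond` is the prediction with an empty prompt (or the negative prompt,
    which is the "hack" mentioned above); `cond` is the prediction with the
    positive prompt.
    """
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


def forward_passes(steps, scale):
    """Model evaluations needed for one image (sketch, first-order sampler).

    At scale 1.0 the formula collapses to `cond`, so the unconditional pass
    can be skipped. Any other scale needs both passes on every step, even
    when the negative prompt is empty.
    """
    return steps if scale == 1.0 else 2 * steps


uncond = [0.0, 1.0]
cond = [0.5, 2.0]
print(cfg_combine(uncond, cond, 1.0))  # same as cond
print(cfg_combine(uncond, cond, 3.0))  # pushed further away from uncond
print(forward_passes(20, 1.0), forward_passes(20, 3.0))
```

This is why an empty negative prompt does not bring the speed back: the second pass is needed for the subtraction itself, not for whatever text is in the negative box.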
>>101765045it doesn't know naughty stuff and that pose is rarely seen in SFW contexts, anon, but how close have you gotten and what prompt did you use?
>>101765115
oh ok, thanks for the explanation anon
>>101765110
i'm the schizoprompter from a few threads back.
here's the catbox, keep in mind that only about 1 in 10 or 1 in 20 of the outputs are any good, the prompt and settings produce very chaotic outputs.
https://files.catbox.moe/85ujkt.png
thanks to the anons who provided info about samplers and schedulers recently
>>101765144
>thanks to the anons who provided info about samplers and schedulers recently
it was me kek, and thanks for the catbox
>>101765045uuuuum.....
I've seen a few gigantic workflows shared, and since I'm new to comfyui, I have no idea what they actually do.
Is there any guide to understand what nodes do beyond simple "my first workflow with flux/sdxl"?
>>101765151It sucks at most poses that are even mildly nsfw.
>>101765144Oh shit you're prompting in full mode. I can't handle that.
>>101765160i'm new too, and from what i've seen there's not a whole lot. you can learn a lot surprisingly quick by lurking these threads tho
>>101765160
>Is there any guide to understand what nodes do
Yes.
https://github.com/comfyanonymous/ComfyUI/blob/master/nodes.py
>>101764785
for some reason no matter what I tried on 1.5 and XL it never managed to make a bicycle or motorcycle look correct. This is impressive to me.
>>101765045
We will probably have to wait for community checkpoint for the good stuff.
How does one even install the bloody thing? Finally have a reason to learn Comfy after getting too comfy with A1111, but every guide says to do a different thing (put it into checkpoints, no put it into unet etc)
>>101765181
>How does one even install the bloody thing?
Go for that tutorial, it's the best one
https://www.youtube.com/watch?v=stOiAuyVnyQ
>>101765181
if you're talkin about flux here's the quick setup guide
https://comfyanonymous.github.io/ComfyUI_examples/flux/
this should probably be in the OP now that i think about it
>>101765181
This is their official blog
https://comfyanonymous.github.io/ComfyUI_examples/flux/
Install comfy, download the files on that page, then drag and drop the image into the comfyui page
Super Simple Stuff
>>101765160
>gigantic workflows
You should see some of my 1.5 shit
This is tame compared to some of the autism I have seen
>>101765222>official blog
>>101765233wtf...
>>101765197
Thanks m8. Dig the thumbnail!
>>101765212
>>101765181
thank you! I saw the link somewhere in the links in OP I think. Gonna give it a go and do the YT if I fail miserably (expected outcome)
>>101764396
>reForge
oh God, are they making a duct tape supported fork of an abandoned fork?
>>101765235
oh yeah my b I was originally going to link to their official blog at
https://blog.comfy.org/august-2024-flux-support-new-frontend-for-loops-and-more/
but then I realised it'd be easier to click through to the actual tutorial and link that, and forgot to edit
>>101765233I love factorio and cities skylines too.
>>101765131
It's not that rare/naughty. "a girl looking back through between her legs" finds a lot of hits that are clearly not intended to be sexy, but that doesn't work as a prompt and just gives a girl in underwear facing away.
I tried things like "a drawing of princess peach standing bent over with her legs spread, looking back at the viewer from between her legs. her face is upside down." but it makes things like pic rel.
>>101765245
>oh God, are they making a duct tape supported fork of an abandoned fork?
What has been abandoned?
>>101765240
Looks more complicated than it is. Essentially because 1.5 was trained on 512x512 images, genning large images is impossible in one step, you get way too many artifacts.
This workflow does 4 upscale steps to 4k and has a face detailer. That's it.
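One way to picture the "4 upscale steps to 4k" part is as a size ladder from the 512px training resolution up to the target. This is only a sketch under assumptions: "4k" is taken as 4096 on the long side, the steps are spaced geometrically, and sizes are snapped to multiples of 8. The actual workflow's factors are not given in the post.

```python
def upscale_ladder(start: int, target: int, steps: int, multiple: int = 8) -> list[int]:
    """Intermediate edge lengths going from `start` to `target` in `steps` passes.

    Assumption: equal scale factor per pass (geometric spacing), each size
    rounded to the nearest multiple of `multiple`.
    """
    ratio = target / start
    sizes = []
    for i in range(1, steps + 1):
        exact = start * ratio ** (i / steps)
        sizes.append(round(exact / multiple) * multiple)
    return sizes


# Assumption: "4k" means 4096 px here.
print(upscale_ladder(512, 4096, 4))
```

Each pass is then an img2img or tiled upscale at the next size, so no single step asks the model to work far outside the resolution it was trained on.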
>>101765173I'm able to make a basic workflow, it's just the terms: what's the difference between unet and checkpoint loading? What's sigma? What's guidance why is it different from cfg? etc
>>101765271Forge was abandoned iirc and the author said it will no longer be developed as stable alternative to A1111 but he will test out some new features on it. At least thats what I got a few months ago.
>>101765309
>Forge was abandoned iirc
https://github.com/lllyasviel/stable-diffusion-webui-forge
looks like there's some big update coming
f
>>101765342interesting. maybe I should switch back to forge. I remember reading the news when there was some drama between him and A1111 devs. Maybe situation improved since Forge did produce results way faster and reliably even on a 4080
>>101765364
>between him and A1111 devs
Nah, it was comfy and his butt buddies who got their panties twisted. A1111 dev doesn't seem to give a shit about anything
>>101765280
>What's sigma
It's another model, a community darling, better than SD but lacks the same level of support.
cfg stands for classifier free guidance. Guidance helps shape the images a model makes, cfg specifically adjusts images to better match prompts by mixing results with and without the prompt influence.
Unet loads a kind of neural network, checkpoint loads a model's state.
>>101765280
you can find some terms explained for example here
>https://diffute.com/glossary
or here
>https://replicate.com/guides/stable-diffusion/glossary
but every new diffusion method and model comes with new tech and new terms so they are outdated (for flux for example)
>>101765280
>what's the difference between unet and checkpoint loading?
unet loader loads flux model and the like
checkpoint loader loads sd15/sdxl models and the like
>>101765280
>What's guidance why is it different from cfg? etc
guidance for flux makes t5xxl follow the prompt more precisely (it can forget styles if it's too high tho, so lower guidance for that)
cfg is a different type of guidance == classifier free guidance that is implemented outside the text encoder interpretation, you can use it in flux with a hack using dynamic thresholding that normalizes the output back to a "virtual" mimic cfg of 1.0, which is what FLUX actually wants
>>101765280
>What's sigma
in what context did you see that? cause it can have many meanings from lora learning to a model name
>>101764200NTA but figured it out, I got that grid when I dropped guidance to 1
>>101765388
oh yeah, now I'm starting to remember it. Comfy accused him of reusing some code which was probably not true. Stability wants Comfy to be their main UI which made me dislike Comfy along with the drama.
>>101765429
Is there a guide like that for all the prompt formatting related to SD? Like Break, brackets, || etc? Maybe a cheatsheet or something. I also tried experimenting with BREAK on the free flux demo but it doesn't seem to respond to it at all.
>>101765391
>>101765429
Thanks anons.
For sigma I meant picrel.
Finally time to test DoRA. Running with AdamW8Bit, linear, huber loss enabled
>>101765455
>BREAK on the free flux demo but it doesnt seem to respond to it at all
cause it doesn't, BREAK and prompt weights don't work on FLUX
comfy is telling me to run:
>-m pip install --upgrade pip
it doesn't work either in cmd or git, I'm doing something wrong, any hint?
It's weird but it's been multiple times I've seen flux show a naked person at low steps, then at some steps (for example 8-9) suddenly they got underwear.
Is that related to how it's been trained?
>>101765499that's python suggesting to upgrade it. remove the -m part at the front and it should work.
>>101765464thats explained here >https://openart.ai/workflows/fish_intent_33/flux-dev-splitsigmas/j8kSUra4WQSQMoePIj9m
>>101765490For Flux?
>>101765522I'll take a look then, thanks.
>>101765537Testing some anime-jank with 1.5, DoRA should be great with multiple concepts
>>101765560A 1.5 DoRA? At this time of year?
>>101765517thanks, I managed to make it work by using the full string and adding --user at the end
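For anyone hitting the same thing: `-m pip ...` on its own is only the argument list, so the shell has nothing to run. The pip notice normally prints the interpreter path in front of it. A small sketch that builds the full command, including the `--user` flag the anon ended up adding; `pip_upgrade_command` is a made-up helper and the interpreter path is an assumption that depends on the install (a portable ComfyUI bundle ships its own Python).

```python
import sys


def pip_upgrade_command(python_exe: str = sys.executable, user_install: bool = True) -> list[str]:
    """Build the full pip self-upgrade command as an argument list.

    `python_exe` should be the interpreter the UI actually runs with.
    Assumption: defaulting to the current interpreter is good enough here.
    """
    cmd = [python_exe, "-m", "pip", "install", "--upgrade", "pip"]
    if user_install:
        cmd.append("--user")
    return cmd


print(" ".join(pip_upgrade_command()))
# To actually run it: subprocess.run(pip_upgrade_command(), check=True)
```

Running it through the same interpreter that launches the UI matters, otherwise a different Python's pip gets upgraded.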
>>101765559>here's your controller bro
should I change the file extension of : flux1-dev.safetensors and ae.safetensors to .sft?
>>101765617No. Why?
>>101765617".sft" is what i got and i never had to change it, not sure why its .safetensors for you
>>101765560honestly, dora kinda sucked for me when I tried multi concept lora.
>>101765636what settings did you use? how large dataset?
>>101765617.jpg .jpeg same thing.
>>101765617it does not matter .. three letter file extensions are a DOS limitation, you using DOS? no, therefore it does not matter
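To back up the "extension doesn't matter" answers: a safetensors file starts with an 8-byte little-endian length followed by a JSON header, so what matters is the content, not whether the name ends in .sft or .safetensors. A minimal sketch that reads that header with only the standard library; `read_safetensors_header` is a made-up name, and whether a given UI's file picker lists both extensions is a separate question this does not answer.

```python
import json
import struct


def read_safetensors_header(path: str) -> dict:
    """Return the JSON header of a safetensors file, whatever its extension.

    Layout: 8 bytes (unsigned little-endian header length), then that many
    bytes of UTF-8 JSON describing the tensors, then the raw tensor data.
    """
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len).decode("utf-8"))


# Usage (path is an example, adjust to your install):
# header = read_safetensors_header("models/unet/flux1-dev.safetensors")
# print(len(header), "entries")
```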
>>101764212NTA but yeah, not because I care but it just makes artlets seethe so I do it
>>101765671DOS is open source now, you mean you aren't genning on DOS? baka my head
How the fuck does the AI know who Integra Hellsing is, but not Haman Karn.
>>101765233Group up your shit and use get/set nodes or anything everywhere to get rid of this disgusting spaghetti.
>>101765710I consider any character still there as a mistake from the model makers, I'm sure they'd remove that too if they could.Maybe in their next iteration.
>>101765678
Artlets seethe just by existing. I honestly don't know why they don't just use imagen in their art workflows.
My wife is a painter. She gens concepts, refines them, then paints them. Speeds up the whole process. I gen character sheets and model them in blender.
>>101765750
I do. This was from like 2 years ago and was in the testing phase, where everything is spaghet. Once I get it working how I want I group things.
>>101765650
dataset from around 300 up to somewhat above 1000, with around 50 datapoints per concept, iteratively scaled up while working on improving the adapter. I used locon and prodigy optimizer with long training times (some of the concepts took around 4-6 hours to converge).
just locon and prodigy was much more effective for me, dora overfit like crazy without converging to the concepts. I could see dora really slap for training a style though in minimal steps.
interested in hearing your experiences anon
>>101765797>I honestly don't know why they don't just use imagen in their art workflows.Honestly I'd rather keep it the way it is now, them just whining and screaming instead of gaining intelligence, it's more entertaining that way
>>101765816I miss when they use to come in here seething daily, and getting dogpiled for having a dumb ass opinion
>>101765797Artists would be really good at this shit too because they probably know a bunch of prompt terms that the AI would recognize.
>>101765808
>interested in hearing your experiences anon
I'll report here. What dimension and alpha did you use? I have a feeling that scale weight norms is mandatory with this. For me prodigy overfits like crazy if I don't lower d_coef and use snr gamma
>>101765783How is it a bug for the model to recognize characters?
Uhm? Is this real?
https://www.reddit.com/r/StableDiffusion/comments/1em9u6x/first_flux_controlnet_canny_was_just_released_by/
>>101765797
>I honestly don't know why they don't just use imagen in their art workflows.
Good artists already have ai one way or another in their workflow, from draft ideas to tests to adding backgrounds to anything really.
The ones you see going apeshit 24/7 either are weirdly insecure about their skills, or are luddites who will refuse to use the new thing because it was created after they were born.
>>101765891
I'm reasoning using their intent anon.
I think it's retarded to scrub pop culture stuff like this, but they obviously did it on purpose.
>>101765891
deepfakes and using "registered TM" faces I guess.
ok this has potential but is so terribly and unbearably slow. Even with 16gb vram for some reason it goes into lowvram mode. With noVideo probably putting 12gb vram on all their 5xxx series chips and 14.5gb on a 5090 this will be not feasible as a gacha hobby
unironically if you want flux to listen to your prompts more without weird CFG tricks and sacrificing speed, use Schnell (4 steps) (inevitable quality loss on text) or a DARE merge of Schnell and Dev (4-16 steps).
https://huggingface.co/martyn/FLUX.1-dev-schnell-dare-merge
prompt:
an african american hatsune miku with braided dreadlocks holding up a peace hand sign gesture.
no gimmicks, 1 CFG euler
left is Dev at 21 steps
right is the linked DARE merge between Dev and Schnell at 10 steps (it can converge at even lower steps but not as well as Schnell on its own)
The problem with Dev is that, as it needs more steps to converge, that 1 CFG is spread over more steps, while at lower step counts the CFG is proportionately higher and thus "listens" more.
>>101765884
I used network rank 8 network alpha 1 convolution rank 8 convolution alpha 1 which was always the best, no matter how many concepts I added. Also, I've read in a guide that higher batch count is bad for multiconcept but don't listen to the haters, when I turned up batch count the model learned to differentiate picrel, which was a huge pain in the ass to learn.
>scale weight norms
I haven't tried that yet
>lower d_coef
yeah d_coef of 0.5 was pretty good for me
>>101766013If you merge schnell and Dev, which license takes precedence? I'm assuming the shitter one.
>>101766015ah it's the old cenobite sailor moon
>>101766013
Anyone care to make a thorough comparison between regular Dev and this frankenstein of a merge with different prompts of various complexity?
>inb4 me
I will, but my PC is shit and it will take a long while.
>>101766073yep trying to get it in a panavision film style. so far unlucky. its entertaining though
Where's debo
>>101766062you are correct, the apache 2.0 is void then, you get the black forest lab license then
>>101765904
Even shit artists are using it, they are ALL using generative fill in photoshop but they don't consider it AI because it's part of photoshop. They are all retards.
>>101765872
They are, my wife has that eye for things, she makes some cool gens. I get creatively bankrupted constantly and just gen from an empty prompt to explore the latent space for ideas. I don't consider myself an artist at all.
>>101766073
>>101766159
>>101766159>>101766175Now add a caption.
>>101766027
I've been defaulting to dim32 alpha16 with 1.5 for a long time. Batch 2.
>yeah d_coef of 0.5 was pretty good for me
I've gone as low as 0.1 with great results
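For comparing the two setups in this exchange: in the usual LoRA convention the learned update is multiplied by alpha / rank, so the two anons are running quite different effective scales. A tiny sketch assuming that convention; how much it matters with an adaptive optimizer like prodigy is not something this shows.

```python
def lora_scale(alpha: float, rank: int) -> float:
    """Multiplier applied to the low-rank update under the alpha / rank convention."""
    return alpha / rank


print(lora_scale(1, 8))    # rank 8, alpha 1
print(lora_scale(16, 32))  # dim 32, alpha 16
```

A lower scale damps the adapter's contribution for the same raw weights, so settings like learning rate are not directly comparable between the two configs.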
>>101766195
>>101766287proompt?
>>101766317
its meh and wont get you consistent results.
https://files.catbox.moe/m4t9ww.webp
>>101766367thanks
I have a 3070ti and am running ReForge, should I even bother with trying to use Flux? There's a 20 GB checkpoint on Civit.
>>101764165Explain like I'm retarded. What's the difference between cfg & guidance.
>>101766107this is a good one, inspiring!
>>101766542Also I don't think it's gonna take much to make an NSFW version of this lol
>>101766529yeah you could run it easily just takes time and you'll have to install comfy (or just wait for a1111/forge update)
>>101765710I know who Integra Hellsing is, but not Haman Karn...
>>101765889That's adorable. I'm going to see if I can make that irl.
>>101766725you could 3d print some of it ong
>>101766542see >>101765429 and >>101765391
>>101766686
haman karn is from the gundam universe, best girl.
as expected, flux does decent macro stuff.
>>101766143
Right here
We're going to add the pastebin to the OP if you don't fuck off
>>101766809You won't do nothing chud
>>101766752Yep; there's a bunch of existing designs to 3D print, but they're either props or too big. I'm thinking clear or tint clear and small 1" OLED display with microcontroller running clock. Or something.
>>101766752Your gens are my favorite
>>101766773Thanks, anon.
Good morning anons. Hope you are all well.
>>101767167So good. Little bit like that boss from Fallout 4 dlc. Fake movie screenshots are great
>>101767102This actually works out pretty well. I know t5 better than I know SD kek. No wonder language models are so good at writing prompts for this thing.
>>101766013>The problem with Dev is as it needs more steps to converge, that 1 CFG is spread over more steps while at lower step counts, the CFG is proportionately higher and thus "listens" more.what do you mean "the CFG is proportionately higher"? CFG is constant on every step
>>101766241I guess rank size and stuff is very individual to models kinda, but I've been surprised what fits into dim8 pony adapter
what do you guys load for clip? I've read that using both t5 and clip_l yields worse results than just using one of them but I don't even know how to test that because the Load Clip node doesn't have flux as type, only the DualClipLoad does
>>101764165Put Frida Kahlo in the negative.
Man, once this thing has a proper NSFW extension, I am going to jack off if you know what I mean.
>>101767843Sadly probably months/a year away...
>>101767851I don't know, man. Thirst makes people incredibly motivated!
>>101767868doesn't matter how motivated, it's going to take months of H100s running non stop
>>101767883And just getting started will be months away, is my guess.
>>101767843How the fuck did you manage to get such a skimpy outfit?
>>101767928the model knows skimpy attirebasically the only horny stuff it does
Has anyone figured out the best guidance to use that doesn't trade too much style for prompt adherence. I'm having pretty good luck with 1.5 but that seems low, although I really don't have much sense of scale for this other than that they recommend 3.5 with the full model. (using schnell btw).
>>101767939ya it does
>>101767928> Description: in an 80s slasher movie, starring a beautiful woman, looking scared. Her physique is curvy and her clothes are revealing and torn in places. Overall, exploitative feeling as if the scene is the product of the male gaze.
Haven't played with Flux yet, how's it with anime? Artists status?
faster horse! faster!
>>101768017
Generic anime.
Most artists are deleted (maybe not the images themselves, but they're clearly not linked with the artist)
Same with many characters.
>>101767363
i'll revise the explanation. schnell converges in 1-4 steps, and a side effect of that is the model is inherently more sensitive to prompts because it targets a complete image in 1-4 steps. It doesn't have time to slowly converge. The composition is already baked on the first step. So Schnell and merges with Schnell are more sensitive to the positive prompt because they don't slowly sample a composition. With Dev it can end up sticking to an early composition that is less true to the prompt in the end result. Ignoring CFG, low step based models (LCM, Turbo, Lightning, Hyper, etc) have this quirk. I phrased it as proportional CFG because to achieve the same prompt adherence on Dev you would have to raise the CFG, which then requires you to use Dynamic Thresholding to offset the burn.
tl;dr 1 CFG on schnell is adequate because of the adversarial diffusion distillation, 1 CFG on dev is "weak"
>>101768017It has fantastic coherency overall if you describe everything that's happening, but it struggles with imitating most artstyles, anime or not. There are some cfg workarounds currently but they're hit or miss (mostly miss). When training gets optimized and we start getting fine-tuned checkpoints and loras en masse, the world's your oyster. It'll take a while though that's for sure.
>>101767969If you want this prompt to make a completely topless image, add: > As a matter of fact, she is nearly completely naked.Before the sentence that begins with "Overall..."Can also try:> As a robust matter of fact...
>ani got banned for naked catgirls
there is hope for the model but also I kinda liked his retro stuff so it sucks we don't see more from him
https://files.catbox.moe/vga5ha.jpg
That shit took forever... here's an XY plot between Guidance and CFG. imo I like the pictures at Guidance 0.6, it's pretty close to what I really wanted:
>17th century painting of Hatsune Miku riding a bicycle
>>101768081
Given that this thing used t5 and rotary attention, I think it's really quite possible that we get prompt generation workflows that blow up SD-style prompts into the requisite "novel-length" boomer prompts before too long. As someone who works in LLMs, anything you can accomplish with fine-tuning, you can accomplish with prompting, assuming you have enough context. The ratio is 1:10 or something like that (basically, a fine-tune is only worth the cost if you can cut down the prompt length by a factor of 10).
Is there a confirmed upper limit for how much proooompt flux can understand all at once?
>>101768180You are still active in your discord, no need to announce ban
kek, you wanna do some convenient censoring and FLUX gives you slimy mosskini
>>101768205we'll eventually just have a tiny llm that is trained for prompt enhancing where it simply adds tedious detail without sacrificing the intent of the prompt itself
>>101768226>doxcordno thanks
>>101768032
>>101768181sampler/scheduler?
>>101768341euler beta, 20 steps
>>101768017
Flux has a good understanding of natural language. no more tags, no more autistic prompts.
80% of prompts have good hands, good text. Anime prompts tend to mess up hands, as far as i've seen personally
Few artstyle choices, no nude (yet)
>>101768402>dagger glued on ass
When I download an "adetailer" pt file, how do I know if should put it in models\ultralytics\bbox vs models\ultralytics\segm ?
>>101768326kek
>>101768408God I wish that dagger were me
local is still behind dall-e 3 uh?
>>101768237
Yes. The only problem is adding stuff the model has apparently never seen, which would seem to be lower body genitalia and not much else. Everything else is just limited by how much you feel like debugging a long prompt, which is where an LLM would come in handy. I might try taking some of my gens and feeding them to gpt or Claude or gemini (or all 3) and see how close their description is to the starting prompt. And then take their description and feed it back into flux to see how much meaning drift we're really talking about here. It could also be a matter of fine-tuning t5 itself as opposed to fine-tuning the diffuser. I haven't gone to the flux repo yet to look at how everything works under the hood, but there's probably some stuff from the language modeling side that can be done faster than the time it would take to retrain the image generator itself.
TLDR there's a lot you can do with transformers / attention is all you need / etc.
>>101768478
not on realistic pictures, flux destroys dalle in that department
since updating comfyui my vram gets flushed out every gen and has to reload every time
anybody else experiencing this issue with comfyui?
>>101768069that's a really interesting theory, but schnell's quality isn't that great compared to dev, I hope that merge will take the best of both worlds though
>>101768478
On text no, Flux is actually better.
On artist recognition, style, even people and nsfw, DALLE is actually better (when the moderation endpoint and prompt rewriting don't make it impossible to show of course, I meant the base model)
why did it give her that absolute dump truck
>>101768536yup.. that it does, loves reading the model again and again every gen, no clue why it does that.
>>101768478dalle has really shit AI grain too, you can tell DE3 gens from the pattern, Flux absolutely can make indistinguishable photorealistic gens
>>101768478nope, saying that as a big freetard cope hater
>>101768181GJ anon!
>>101768510>>101768629that's by design, the API has the choice between vivid and natural and natural has much less of that dall-e 3 look we know
>>101768478 I am not convinced it's behind. There should be a competition: the best flux prompter vs the best dall-e prompter. 30 subjects, consisting of descriptions of complex scenes plus desired styles, 30 minutes. Judges choose between the Dall-e gen and the Flux gen blindly. The main problem is this >>101768629. You can tell Dall-e gens by their grain, so it won't be truly blind. Let the rest of the models compete too.
>>101768612 i tried adding the --gpu-only tag and it made things worse. i have a 3060 but still, i was running fine before i updated.
>>101768536 https://github.com/comfyanonymous/ComfyUI/commit/c14ac98fedd0176686d285d384abec5e4c0140c2 this commit is really good for a couple reasons, but if you are always hitting lowvram then there should be an -arg to disable it.
>>101768181 good stuff, now extend it to cfg 2.0 .. kek just joking, interesting results
>>101768846
>interesting results
thanks o/
>now extend it to cfg 2.0 .. kek just joking
That's possible with DynamicThresholding yeah, here's what I've got with cfg = 3 + Guidance 3.5 for example
>>101767386 DoRA didn't learn multiple concepts as well as I wanted. The few test runs I made with small datasets and short training time were promising. I think with datasets closer to 1k images it's better to stay with normal Lora, set lr to 1 with prodigy and let Jesus take the wheel.
actually wouldn't it be better to check if the user even needs low vram after the flush? i have a xl workflow on 8gb that goes into lowvram on the last detailer because prior detailers don't unload from the model once they are done. that flush could push me out of vramlet hell for that last step
>>101768931I can recommend giving locon a shot!
>>101768975don't unload from the VRAM*
Is fp8 supposed to take just as long as fp16?
>>101769050yeah, I didn't notice a speed increase when going for fp8
>>101769015for sure
>>101769050 yes, it's not faster, it just takes less vram
>>101764165>this threadi love how the prerequisite to generate images locally is to be a lonely incel lmao
>I updated
Should i just go back or is this a matter of, let's say, the new update writing over zluda somehow, and maybe i could just reinstall zluda?
>>101769057>>101769067ah ok, thought I fucked something up
Other merged flux:
https://huggingface.co/HaileyStorm/FLUX.1-Merges
https://huggingface.co/drbaph/FLUX.1-schnell-dev-merged-fp8-4step
>>101769050Yeah, it's knock-off chinese "quantization" where weights are cast to fp8 then you lose precision casting back to fp16 for inference
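To put rough numbers on the fp8 vs fp16 point, here is a back-of-the-envelope sketch of weight storage only. The 12e9 parameter count is an assumption (roughly what gets quoted for the Flux transformer), and `weight_gb` is just an illustrative helper. It ignores the text encoders, the VAE and activations.

```python
def weight_gb(n_params, bytes_per_param):
    """Size of the raw weights in GB (10^9 bytes), nothing else counted."""
    return n_params * bytes_per_param / 1e9

N_PARAMS = 12e9  # assumption: approximate Flux transformer parameter count

for name, nbytes in (("fp16", 2), ("fp8", 1)):
    print(f"{name}: {weight_gb(N_PARAMS, nbytes):.0f} GB of weights")
```

If the stored fp8 weights are cast back up to fp16 before each matmul, as described above, the math per step is unchanged, which would fit with equal gen times and lower VRAM use.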
>>101769100
>Forgot pic
I am at a loss for words.
>>101769136
>Yeah, it's knock-off chinese "quantization"
so that means there's a way to make this quantization faster, right?
>>101769100>>101769137
For those using DynamicThresholding with higher CFG, I'd recommend putting cfg_mode at either Half Cosine Up or Half Cosine Down, the others are too bright and make the picture too saturated
>>101768844 this is what i get for updating. it was so lit before: next seed, gen, next seed, gen, change prompt, gen, pipe to upscaler, gen, back to base sampler, gen. no loading/reloading
Why is the quality of gens in this general so low? Any place I check, discords, other boards, reddit or whatever, people gen much higher quality, funnier and cooler flux stuff. Here it's just... very boring.
>>101769321 thx (btw I think you mean too desaturated, high saturation == colorful, desaturated == bleached out)
>>101769370 yeah, my b. Btw I think I found the best combination, you put half cosine up for both cfg_mode and mimic_mode, that one is the closest to the original picture at CFG 1
>>101769124
>still over 23GB
I'm guessing this wasn't supposed to help out vramlets.
>>101769510a merge doesn't change the size of the architecture
I wish Cyberpunk 2077 had looked like this, not the plastic look it actually is.
>>101764165Poorfag here. I've got 12 gb of VRAM, but only 16 gb of system ram. Can I fp8 Flux? Or am I flucked?
>/ldg/ reaching Popular Threads
we eatin' good
Text generation is good, but it's not good enough. Hopefully finetuning and lora can improve it.
maybe if you're on linux
>>101769586 with 32GB system ram it would have worked.. but t5xxl will gobble up about 20GB system ram .. you can still try with ram swapping to the SSD .. but probably you are fucked
>>101769631Thanks. I'll see about getting another cheap ram stick.
>>101769602anon loves imggen
>>101764165Is SwarmUI a honeypot? During each gen it sends an outbound request to a google server, doesn't do that when not genning anything
>>101769690what google server tho?
>>101769690They probably have Google Analytics, could be sending your prompts or simply sending that you generated something with the model, resolution, etc.
>>101769762You could probably look at the JavaScript code / install Google Tag Manager and see what it's sending.
>>101769321 >>101769411
>I think I found the best combination, you put half cosine up for both cfg_mode and mimic_mode, that one is the closest to the original picture at CFG 1
It's starting to look really good, negative prompt is working and there isn't much burn at cfg = 3 with those new settings
>>101769757 No clue, some generic google IP, lookup says it's part of 1e100.net
>>101769762 That'd be shitty
>>101769778 I'll try the Google Tag Manager then
>>101769100 >>101769137 After re-following this guide
>https://github.com/CS1o/Stable-Diffusion-Info/wiki/Installation-Guides#amd-automatic1111-with-zluda
I get picrel... hiprtc0507.dll
Also it turns out the driver update wiped the PATHs for some fucking reason, thanks windows 10, but that's an easy fix.
Hmmm not sure where to go from here.
ZLUDA says it SHOULD be compatible with AMD driver 24.7.1, and i even re-installed hip.
>>101768844 >>101769338 ok i loaded an old snapshot and fully updated, and things are working like they used to now. i'm not even gonna question it
>>101768478Go ahead and write "nigger" in your dall-e prompt, I'll wait.
>>101768402autistic prompts are what I live for
>>101769799Where does dynamic thresholding go? Anywhere in the model pipeline?
>>101767843good quality booba, and just the right amount of baked-ness, nice.
>>101768846Is this flux? How get wet skin look thx
>>101769960I give you a workflow: https://files.catbox.moe/haqdtd.png
>>101769743Nice but blown out af
>>101769945Don't worry, writing a wall of text (or relegating it to LLM) is pretty autistic in itself.
>>101770014I can't run this myself but I wonder how it would react to the prompt edit junk I love to do
>>101770006yeah tru
>>101769937this, fuck censorship, flux is dalle with better realism and complete freedom, a blessing in the sky
I'm curious about flux's capabilities. It's clear that it does single characters very well, but what about multiple characters interacting? Can somebody try to make two characters boxing in a boxing ring, each character with a different description and appearance?
>>101769985 ya that's flux. there was
>The girl has wet blonde hair that clings to her body.
and
>She is swimming in a natural forest lake.
in the prompt .. with just the latter she was still dry, but with wet hair it also made wet skin
>>101770037damn son this is more like it
>>101770033Gimme the prompt, I'll gen it.
>>101769937/pol/ is that way
>>101770118
>score_9,score_8_up,score_7_up, black theme, simple background, gray background,black slime orb, many red dots,red dots,red dots,humanoid shape,[:tentacles,melting, body horror,arms,legs,arms,legs,arms,legs,:0.2]lovecraft,cthulhu mythos
I guess you can trim the ponyXL stuff and I don't know if the edit works
>>101769986What does the VAE override do?
Straight from the oven...>>101770020>>101770020>>101770020
>>101770325 it gives you the choice to put the VAE on another GPU if you have multiple GPUs, if you only have one you can delete that node and the CLIP override as well
>>101769806Turns out it’s adetailer
>>101770360
>Turns out it’s adetailer
wtf
>>101770272Left - verbatim, right - without "score_9,score_8_up,score_7_up" and without brackets/weights
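For anyone wondering what the bracket part of that prompt is meant to do: a minimal sketch of A1111-style prompt editing, where `[from:to:when]` swaps `from` for `to` partway through sampling. This is only an illustration of the idea, not the webui's actual parser. `prompt_at` is a made-up helper, it only handles the three-field form, and treating `when` below 1 as a fraction of total steps is an assumption about the semantics. Whether a given Flux workflow honors this syntax at all is a separate question.

```python
import re

# Matches [from:to:when]; "from" may be empty, as in the prompt above.
EDIT = re.compile(r"\[([^\[\]:]*):([^\[\]:]*):([0-9.]+)\]")

def prompt_at(prompt, step, total_steps):
    """Return the prompt text in effect at a 0-based sampling step."""
    def pick(match):
        before, after, when = match.group(1), match.group(2), float(match.group(3))
        # Assumption: when < 1 is a fraction of the run, otherwise a step number.
        switch_step = when * total_steps if when < 1 else when
        return before if step < switch_step else after
    return EDIT.sub(pick, prompt)

p = "humanoid shape,[:tentacles,melting,:0.2]lovecraft"
print(prompt_at(p, 0, 20))   # early step: bracket contents not yet added
print(prompt_at(p, 10, 20))  # later step: bracket contents added
```

With an empty `from`, the early steps lay out the composition without the bracketed tags and the later steps add them in.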
>>101770568
>it turned into pixel art
Hah. Anyway, very cool. Thanks anon
>>101770472 ADetailer makes calls to google cloud servers, supposedly to compare the local detection model with some on huggingface
https://github.com/Bing-su/adetailer/issues/163