/g/ - Technology




File: tmp.jpg (1.14 MB, 3264x3264)
Discussion of free and open source text-to-image models

Previous /ldg/ bread : >>102520372

>Beginner UI
Fooocus: https://github.com/lllyasviel/fooocus
EasyDiffusion: https://easydiffusion.github.io
Metastable: https://metastable.studio

>Advanced UI
Forge: https://github.com/lllyasviel/stable-diffusion-webui-forge
Automatic1111: https://github.com/automatic1111/stable-diffusion-webui
ComfyUI: https://github.com/comfyanonymous/ComfyUI
InvokeAI: https://github.com/invoke-ai/InvokeAI
SD.Next: https://github.com/vladmandic/automatic
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI

>Use a VAE if your images look washed out
https://rentry.org/sdvae

>Model Ranking
https://imgsys.org/rankings

>Models, LoRAs & training
https://civitai.com
https://huggingface.co
https://aitracker.art
https://github.com/Nerogar/OneTrainer
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts
https://github.com/kohya-ss/sd-scripts/tree/sd3

>Flux
https://huggingface.co/spaces/black-forest-labs/FLUX.1-schnell
https://comfyanonymous.github.io/ComfyUI_examples/flux

>Pixart Sigma & Hunyuan DIT
https://huggingface.co/spaces/PixArt-alpha/PixArt-Sigma
https://huggingface.co/comfyanonymous/hunyuan_dit_comfyui
Nodes: https://github.com/city96/ComfyUI_ExtraModels

>Index of guides and other tools
https://rentry.org/sdg-link
https://rentry.org/rentrysd

>Try online without registration
txt2img: https://www.mage.space
img2img: https://huggingface.co/spaces/huggingface/diffuse-the-rest
sd3: https://huggingface.co/spaces/stabilityai/stable-diffusion-3-medium

>Maintain thread quality
https://rentry.org/debo

>Related boards
>>>/aco/sdg
>>>/aco/aivg
>>>/b/degen
>>>/c/kdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/tg/slop
>>>/trash/sdg
>>>/u/udg
>>>/vt/vtai
>>
File: -.png (417 KB, 1216x832)
How is diffusion for 3d models coming along? Anything good enough to be used in game dev yet (and without needing ~50gb vram?)
>>
Monkey's Paw
>>
>>102535316
no
>>
blessed thread of frenship
>>
File: ComfyUI_34083_.png (1.44 MB, 848x1280)
>>
>>102535316
you can get some basic objects but there are a ton of problems, it's going to be a while before anything actually usable shows up
>>
File: 0018.jpg (128 KB, 1440x960)
>>102535404
chin up
>>
>>102535404
Nice style
>>
>>102535428
Aw. Hopefully a simple style like Wind Waker's will be doable a little earlier than more complex models, at least
>>
>>102535241
i completely missed bottom right. very cool gen.
>>
>>102535204
>man i don't like flux prompting. i don't want to write a 1000 word paragraph every time.
Verbose prompting isn't necessarily required.
>>
>OP is a massive fag and makes new threads before image bump limit is hit and before the thread is even on page 10
>the OP image continually has the same type of shit over and over, likely made by the same small group of people
really makes you think
>>
>>102536178
post better gens then
>>
>>102536220
wow, that's crazy.
>>
>only a small number of anons make decent gens
really makes you think
>>
imggen is dead
>>
flux is making the "thread highlights" look like sterilized corpo powerpoint slides
>>
>>102536241
are you on that small % of those anons, anon?
>>
>>102535241
Terrible collage, great work.
>>
>>102536838
i only produce the finest slop
>>
>>102536734
Did you just wake up from a two month long nap?
>>
>
>>
File: 01051-2210615452.png (2.96 MB, 1296x1728)
>>
File: ComfyUI_temp_pxdai_00100_.png (1.64 MB, 1024x1024)
>>
File: ComfyUI_temp_pxdai_00102_.png (1.72 MB, 1024x1024)
>>102538218
>>
File: 00000-2210615452.png (2.41 MB, 1728x1296)
>>
File: ComfyUI_temp_pxdai_00109_.png (1.63 MB, 1024x1024)
>>102538255
>>
File: 01105-2210615454.png (2.26 MB, 1728x1296)
>>
File: ComfyUI_temp_pxdai_00116_.png (1.55 MB, 1024x1024)
>>102538417
>>
>>102537052
Wow, that's crazy.
>>
File: 00164-1315524702.png (3.18 MB, 1120x1440)
>>
File: 00002-2210615453.png (2.14 MB, 1296x1728)
>>
>>
Trying to swap over from auto1111 to forge, can't seem to get the dataset tag editor to show up properly. Do I need to edit some settings or install a previous version?
>>
>>102538948
why not just run it locally instead of in forge..
>>
Please remove mage space as it requires a login.
>>
>>102539095
tried that, but kept getting a "no module named print_color" error on install. Figured that running it via forge would work instead, but it's not.
>>
>>102539107
Why are people so anti logging in to things
>>
>>102539149
Wow, that's crazy.
>>
>>102539148
weird. try this instead maybe?
https://github.com/starik222/BooruDatasetTagManager
>>
File: 2024-09-24_00007_.png (1.13 MB, 720x1280)
>>102539259
inpainting with flux, flux refuses to obey.

I am trying to fix the horseshoe...
>>
>>102539148
>>102539294
sorry I'm brain dead today
you can fix an error like that by pip installing print_color or any other module it says it's missing
or add it to its requirements.txt, then install it via that
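
A minimal sketch of that fix, assuming the missing module from the error message installs under the same name on PyPI (not always true, so treat the package name as a guess). The helper names `missing_modules` and `pip_command` are made up for illustration; only `importlib.util.find_spec` and `python -m pip install` are real.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be imported here."""
    return [n for n in names if importlib.util.find_spec(n) is None]


def pip_command(packages):
    """Build the pip command line that would install the given packages."""
    return ["python", "-m", "pip", "install", *packages]


if __name__ == "__main__":
    # Module name copied from the error message; the PyPI package name
    # is assumed to be the same, which is an assumption.
    needed = ["print_color"]
    missing = missing_modules(needed)
    if missing:
        # Print the command instead of running it, so you can check it first.
        print(" ".join(pip_command(missing)))
```

Run it inside the same venv the tool uses, otherwise the check looks at the wrong environment.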
>>
File: my mask.png (960 KB, 720x1280)
>>102539306
My mask
>>
>>102539306
he looks like when you're high on mushrooms and look at your skin and it looks all weird and red and floaty
>>
File: ComfyUI_00089_.png (1.14 MB, 720x1280)
>>102539320
annnd the result.
>>
>>102539320
Try masking in the shape of a horseshoe?
>>
>>102539325
Yeah, fried cfg 6.0 because I was using negatives to try to tone down the cartoonishness of his face. fwiw, we need male loras so badly. I want him to look like Bubbles from Trailer Park Boys.
>>
>>102539320
new mask.
>>
File: my mask.png (789 KB, 720x1280)
>>102539370
>>
>>102539384
bigger
>>
>>102539350
>Yeah, fried cfg 6.0 because I was using negatives to try to tone down the cartoonishness of his face.
what are you using? the best ones that don't fry the picture that much at cfg 6 are skimmedCFG and AutomaticCFG
>>
File: ComfyUI_00090_.png (1.12 MB, 720x1280)
>>102539384
reeezult
>>
File: my mask.png (560 KB, 720x1280)
>>102539390
>>
>>102539259
I can’t believe I made a meme just by posting that like twice
>>
File: ComfyUI_00091_.png (1.14 MB, 720x1280)
>>102539507
>one huge horseshoe
At least it's complete. I'll try to mask off the rest of the coat.
>>
>>102539536
wow, that's crazy.
>>
>>102539565
Stabbhhhhhpppp
>>
File: ComfyUI_00092_.png (1.1 MB, 720x1280)
>>102539507
>horseshoe
kekked, no idea
>>
>>102539576
would
>>
>>102539419
skimmedCFG and AutomaticCFG? not sure what I have.
>>
>>102539608
bit of a butterface
>>
>>102539671
they only work on ComfyUI though, so if you run it on that software you gotta try those out
https://reddit.com/r/StableDiffusion/comments/1eza71h/four_methods_to_run_flux_at_cfg_1/
>>
>>102539700
I have been using AutomaticCFG. Not sure why it's burned, but I have to add quite a few words to the negative, sometimes.
>>
>>102539847
show a screen of your workflow, it shouldn't be this burned, something's wrong with your settings
>>
>>102539864
https://files.catbox.moe/cnrxb2.png
>>
>>102539864
the disabled upscaler junk I don't use.
>>
>>102539934
wtf? you're using AutomaticCFG for the inference and then DynamicThresholding for the upscale? Ditch DT and use AutomaticCFG for both of them
>>
>>102539995
I haven't used the upscaler, actually. I have never made a wf from scratch. But just replace dt for automatic?
>>
Maybe the reason they looked fried is that comfy has to resort to the Tiled vae?

> Warning: Ran out of memory when regular VAE decoding, retrying with tiled VAE decoding.
>>
File: file.png (551 KB, 2175x1621)
>>102539995
well you're using AutomaticCFG for inference, dunno why it looks so fried, should've worked but I can't see the issue there
>>
File: file.png (433 KB, 1618x1433)
>>102539934
you made a mistake there, the DualClipLoader should only have one output line, it's the Power Lora Loader that should have 2 output lines for both positive and negative prompt
>>
File: file.png (412 KB, 2264x1203)
>>102539934
>>102540152
should look like this instead
>>
>>102540162
Thanks, trying it out. The result appears to be exactly the same, for whatever reason, but no point organizing it wrongly.
>>
File: 2024-09-24_00009_.png (1.37 MB, 720x1280)
>>102540162
Dynamicthresholdingfull, with recommended settings, is unfrying me. He looks like Alan Greenspan now, because ai is crazy.
>>
File: 00103-86020175.png (2.34 MB, 1024x1536)
>>102538218
>>102538255
>>102538417
>>102538587
Love the opal styled elf, very nice.
>>
>>102540654
I see... I still find it weird it's not working on AutomaticCFG, did you update your nodes?
>>
I don't like the Invoke niggers, but I have to say their canvas support has gotten really good. What's the best canvas extension for Comfy?
>>
Can someone please go beat google’s image search team with a shovel
(Nsfw) https://files.catbox.moe/jvmxlk.png
>>
>>102540948
Comfy will never be as good as invoke :/
>>
>>102541045
I saw that new invoke demo they posted and WTF I don’t understand why they are building that as a company with an MIT license.
>>
File: 00004-3104485413.png (2.23 MB, 1024x1536)
>>
>>102541166
Sloppppp
>>
File: 00025-1779597322.png (2.39 MB, 1024x1536)
>>102541176
Post better then. I do what I like =.="
>>
>>102541199
don't listen to that retard, post whatever you like anon, freedom of expression is a beautiful thing
>>
>>102541166
Her thumb nail is on par with Megan Fox
>>
File: 00026-1945740566.png (2.36 MB, 1024x1536)
>>102541206
Thanks! Seems like a bit of a slow night tho.
>>102541217
Yeah hands elude my 1060 running 1.5
>>
File: 00024-1630348299.png (1.48 MB, 1024x1536)
>>
I have a situation when making loras. The character I am working with has very little art besides the one the author made for it. Is there a way to minimize the effect the author has on how the character looks?
>>
>>102541320
You want to use some author's art, but at the same time you want to minimize the effect his art has on the outcome?
>>
>>102541320
Img2img with Flux, then train with those images.

If the img2img doesn't produce good results, train with the original images and use the lora to guide the img2img to produce better img2imgs.

With the new images, train the lora.
>>
>>102541431
I want the character concept, not the author's personal style. It bleeds on every image I make.

>>102541443
That could work. Thanks.
>>
>>102541233
>>102541166
>>102541199
make her asian pls
>>
File: 00037-3698434179.png (2.68 MB, 1024x1536)
>>102541666
I'll try
>>
File: 00039-1243346775.png (2.66 MB, 1024x1536)
>>102541666
1st try :)
>>
>>102541739
she needs to be more asian bro
>>
>>102541791
well be more specific
>>
File: 00040-1781348319.png (2.52 MB, 1024x1536)
>>102541791
How much more?
>>
File: 00041-3556864975.png (2.58 MB, 1024x1536)
>>
File: 00043-509673630.png (2.28 MB, 1024x1536)
>>
>>102541233
>Seems like a bit of a slow night tho.
Everyone's playing with the new video shit on red boards
>>
>>102541814
>>102541837
>>102541919
way more asian, as asian as it gets bro
>>
>>102535241
Tips for getting the ai to focus on a specific body part? I'm trying to generate an image of just the torso area, but I keep getting legs/face in my output as well.
>>
File: 00045-1288075561.png (2.42 MB, 1024x1536)
>>102541962
Aah I think I've seen some of that going around. That would make sense.
>>102541993
I have failed you :( I told computer more asian. Even with increased importance it didn't change much.
>>
>>102542006
What does your prompt look like?
>>
>>102541962
which new video shit?
Cogvideo?
>>
>>102541962
>the new video shit on red boards
where exactly? it's for work :^)
>>
>>102542019
>Even with increased importance it didn't change much.
why is that? because of the model or is it a prompt issue?
>>
File: 00006-2120213302.png (2.18 MB, 1024x1536)
>>102542043
Might be partly my ability to judge asian-ness and partly the model's tendency to generate white features.
>>
>>102542025
I've stripped it down to the simplest prompt/scene I could think of just to figure this out, but it still doesn't work. Here you go though:
> Positive:
close-up torso of a woman, background distant plain wall, style amateur photograph, perspective three-quarter view, pose sitting down, lighting soft and cool, high resolution
> Negative:
face, head, leg, legs, hand, hands,fake eyes, deformed eyes, bad eyes, cgi, 3D, digital, airbrushed, text, watermark
>>
File: 00044-3583454186.png (2.27 MB, 1024x1536)
>>102542124
i'd remove the woman part (maybe try female torso or just torso and add breasts somewhere in the prompt), also the sitting down most likely messes with it as well. Could also go for just cropping the image if nothing else works.
>>
File: 00426-520042103.png (2.42 MB, 1296x1728)
>>
>>102541058
I didn't see the new demo. What's good about it?
>>
File: 00049-2310171290.png (2.37 MB, 1024x1536)
>>102542270
looks hand drawn. Lovely contrast.
>>
>>102542239
This helped. Thanks!
>>
File: 00036-1120686066.png (2.64 MB, 1024x1536)
>>102542323
Glad that it helped, np. I tried getting some torso gens myself and I've had no luck so far xD Gonna hit the hay, gn
>>
File: 00439-520042105.png (3.07 MB, 1296x1728)
>>
once I tune in an IP adapter is it possible to save the file for later? Loading the workflow every time is very slow.
>>
is controlnet for sdxl supposed to add a whole extra second of iteration time to my gens? couldn't help but notice the obscenely huge file size for just the controlnet file, is that shit really staying in VRAM while the image gens?
>>
File: myFile_5_5.0_001.png (996 KB, 1024x1024)
>>102542417
okay well this wasn't worth five gigabytes and four seconds per iteration, guess people weren't just doomposting about the absolute state of openpose for XL. swear on me muddah what a sad state
>>
>>102542408
You can technically save it as its own model, I think.
>>
Is kohya the only game in town for training loras?

Is there a version with a GUI?
>>
File: 00006-397334496.png (3.88 MB, 1296x1728)
>>
>>102542468
probably true. I would like to avoid that though as it would eat through space very quick.

I guess I can pull apart the node and grab the weights, but I find it surprising nobody has wanted this.
>>
File: ComfyUI_07417_.png (1.21 MB, 1024x1024)
>>102542499

I am going to spoonfeed you only once. Your googlefu is weak. How the fuck do you even survive this day and age? Use SD flux branch

https://www.google.com/search?q=kohya%20ss%20gui
>>
>>102542284
https://www.youtube.com/watch?v=y80W3PjR0Gc
They really improved the canvas. Very intuitive, a lot of granular control.
>>
>>102542742
I found it a few minutes ago but thanks for using my favorite airplane spoon god bless
>>
File: 00504-520042104.png (2.59 MB, 1296x1728)
>>102542747
I just wonder what are the hw requirements. Last time I tried Invoke it was pretty memory hungry
>>
File: 00498-520042105.png (1.36 MB, 864x1152)
>>
>>102542747
Imagine that level of polish with all the community nodes from comfy. We won't live to see that day.
>>
we should normalize bullying 1girl posters again
>>
>>102542911
desu only the bad ones
>>
>>102542911
>>102542924
desu every 1girl poster
>>
>>102542924
how do they know they are bad? The (you)s are pretty rare these parts.
>>
>>102542911
desu i support all 1girl posters left behind policy
>>
1girl in service of china man a okay
>>
china man make 1girl in lab with nation top 100 scientist,,,, not /ldg/ top 10 jeet
>>
>>102541233
nice cameltoe. necklace needs editing, maybe make it a cross?
>>
File: 00520-520042103.png (3.28 MB, 1296x1728)
>>
https://github.com/hyoo14/Finetuned-SD3-Correcting-Anomalous-Images
>Flux exists
>he Finetunes SD3 anyway
lol
>>
File: 00526-520042104.png (2.92 MB, 1296x1728)
>>
>>102543076
>no example outputs
Does he expect us to use our imagination
>>
>>102543154
neat lightning effect
>>
>>102543154
the glowing worm coming out of her poo hole
>>
File: 00539-2648562677.png (3.03 MB, 1296x1728)
>>102543183
yeah, nice rng

>>102543215
cyberpunk tapeworm
>>
File: 00540-2648562681.png (3.05 MB, 1296x1728)
>>
>>102543068
>>102543154
>>102543248
>>102543269
>LOOK MA I GENNED 1girl Portrait Masterpiece AGAIN!

nice gens but try something more interesting bud
>>
the beating the shit out of 1girl posters
>>
File: 00191-3741067434.png (1.15 MB, 1024x1024)
>1dog
>>
>>102542742
the sd3 flux branch shit itself multiple times while installing but the master works.
>>
>>102543173
its on the github

https://github.com/hyoo14/Finetuned-SD3-Correcting-Anomalous-Images/tree/main/data
>>
File: 00548-2371158935.png (3.08 MB, 1296x1728)
>>
File: file.png (2.25 MB, 1024x1024)
>>102543327
looks like watercolor painting, not a photo
>>
File: ComfyUI_08079_.png (1.7 MB, 960x1280)
>>102543327

So anyone else is kinda tired of Flux already? The lack of flexibility and consistency (i.e. no negative prompt, confusing cfg settings), plus there are no good finetunes, plus the fact that it's slow compared to sdxl. I'm seeing that all the hype is already gone lol, even in these threads people are posting SDXL/1.5 gens instead of flux. Maybe we shouldn't have been so hard on SD3, it had potential
>>
File: ComfyUI_07045_.png (1.18 MB, 1024x1024)
>>102543303

Really? Don't come crying when it doesn't work on master for Flux LoRa.
>>
File: file.png (176 KB, 704x728)
>>102543424
>So anyone else is kinda tired of Flux already?
I think that's obvious enough when you look at this thread, it's so slooooooow, no one is posting anymore. I guess we're all waiting for something to happen, like a finetune or some shit. Personally I'm still waiting for PuLID on ComfyUI
https://github.com/cubiq/PuLID_ComfyUI/issues/69#issuecomment-2360010118
>>
File: ComfyUI_10382_.png (2.41 MB, 1152x1728)
>>102543424
Another thing I don't like about flux is the limited artistic style it has, plus the fact that you have to write LLM-style prompts to get something somewhat decent, while in SD 1.5/SDXL you can use a simple CLIP prompt and get something decent
>>
>>102543303
>>102543431
I think I realize what I did wrong: I downloaded the zip because I don't know how to clone a branch and gave up after the first 5 suggestions from google didn't work. It needed the --recursive option, so it might be missing stuff if you get it as a zip
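
For anyone else stuck on the same thing, here is a rough sketch of a branch clone that also pulls submodules. The repo URL and branch name below are placeholders, not the real ones, and `clone_command` is a made-up helper; `git clone --recursive --branch <name> <url>` itself is standard git.

```python
def clone_command(repo_url, branch, dest=None):
    """Build a `git clone` command that checks out one branch and its submodules."""
    cmd = ["git", "clone", "--recursive", "--branch", branch, repo_url]
    if dest:
        # Optional target directory; git picks a default name when omitted.
        cmd.append(dest)
    return cmd


if __name__ == "__main__":
    # Placeholder values: substitute the actual repo URL and branch name.
    cmd = clone_command("https://example.com/some/repo.git", "some-branch")
    print(" ".join(cmd))
    # To actually run it: import subprocess; subprocess.run(cmd, check=True)
```

A zip download from the web UI usually leaves submodule folders empty, which would fit the install falling over partway through.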
>>
>>102543473
>cubiq

See? even cubiq can't get something decent out of flux, just look at his twitter account and all the testing he has done of Flux, nothing decent has come out of it, just more confusion than answers lol
https://twitter.com/cubiq/
>>
>>102541612
>I want the character concept, not the author's personal style. It bleeds on every image I make.
there's some technique to combat this where you train a lora on the author's style, then merge that lora into the checkpoint. then you train the character lora on that merged lora model, and it ends up having less of the author's style in it
https://rentry.org/copier_lora (see bottom)
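
The "merge that lora into the checkpoint" step is, per layer, just adding the low-rank product onto the base weight. Below is a toy sketch of that arithmetic on plain lists, assuming the usual LoRA form W' = W + scale * (up x down). `merge_lora` is a made-up name, and real trainers do this on tensors with their own scaling conventions (alpha/rank and so on), so take it as illustration only.

```python
def merge_lora(weight, down, up, scale=1.0):
    """Fold a low-rank LoRA update into one weight matrix.

    weight: out x in base matrix
    down:   rank x in matrix (often called A or lora_down)
    up:     out x rank matrix (often called B or lora_up)
    scale:  merge strength; how it is derived is an assumption here
    """
    rank = len(down)
    merged = []
    for i, row in enumerate(weight):
        new_row = []
        for j, w in enumerate(row):
            # Entry (i, j) of the product up @ down.
            delta = sum(up[i][k] * down[k][j] for k in range(rank))
            new_row.append(w + scale * delta)
        merged.append(new_row)
    return merged


if __name__ == "__main__":
    base = [[1.0, 0.0], [0.0, 1.0]]
    down = [[1.0, 2.0]]    # rank 1, in 2
    up = [[1.0], [3.0]]    # out 2, rank 1
    print(merge_lora(base, down, up, scale=0.5))
```

Once the style is baked into the base weights like this, a character lora trained on top of the merged model has less reason to learn the style again, which is the idea behind the technique.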
>>
File: 00585-2371158937.png (2.46 MB, 1296x1728)
>>
>>102540827
thank you
>>
>>102543473
same ol' AI toy cycle

new ai toy comes out > everybody gets excited > it gets the shit spammed out of it > people get tired of its limitations > interest drops off
>>
>>102543514
https://xcancel.com/cubiq/status/1838615642926166292#m
>is this the end of stable diffusion as we know it?
I think he's genuinely retarded, the last good model we got from SAI was SDXL in july 2023, of course it's the end of SAI it's been more than a year at this point
>>
>>102543483
you can write with clip on flux too anon..
>>
File: file.png (710 KB, 680x680)
>>102543514
https://xcancel.com/cubiq/status/1835930759468642426#m
kek, cubiq if you're reading this, the sooner you've finished your node, the sooner we'll talk about something else :p
>>
>>102543573
>In honor of James Cameron joining the board of directors, it is tradition that we generate an image of his choosing
>I dont know uhhh
>A naked woman laying in grass
>>
>>102543573
image generation has already peaked, what else do we want anyway, at least i think I can generate anything I want. the future is video, we need a good img2video model, those videos on /aco/ make me so jealous, the physics are so good
>>
>>102543583
yeah but the results are not the same, flux got boring really fast
>>
>>102543638
those are inpaints anon. I guarantee they aren't generating those videos.
>>
>>102543638
>image generation has already peaked, what else do we want anyway, at least i think I can generate anything I want
nah, Flux doesn't know any celebrities/characters/styles, it's not fun playing with it when all you can do is to render Hatsune Miku and Donald Trump
>>
>>102543638
>the future is video, we need a good img2video model
2 weeks anon
https://blackforestlabs.ai/up-next/
>>
File: fluxsucks.png (18 KB, 629x147)
>>102543607
this user gets it
>>
>>102543671
>this user gets it
he's just a VRAMlet, of course he got filtered by Flux
>>
>>102543653
but SDXL does, the many finetunes out there are really complete. another thing I hated about flux, or BFL really, is the lack of documentation, the fuckers have never released any official documentation about flux or how it works, it's just
>umm its a distilled model and you can't profit off it, ok thx
>>
>>102543483
does flux not respond to artist/photographer prompts?
>>
File: 00035-2425293528.png (2.72 MB, 1824x1248)
>>102543722
kinda, it does, but it lacks flexibility, IMHO once you've seen a Flux gen you have seen them all, they all look alike

>>102543483
compare this with pic rel. (SDXL) and SDXL mogs the shit outta Flux
>>
File: 00251-3142409133.png (2.64 MB, 1344x1920)
>>102543722
>>102543745
same with movie posters. I think it was a big mistake from BFL/Flux to neuter the CFG scale, because the whole point of the CFG scale was how much freedom you gave the model when generating a picture. I don't know why it mutated into this >muh burned image, REEE I want a 100 CFG scale that doesn't fry my image

Same thing with the negative prompt, it allows you as a user to have a little control over what gets genned or not
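
For reference, a bare-bones sketch of what the CFG scale and the negative prompt do in standard classifier-free guidance. It runs on plain lists of numbers standing in for the model's noise predictions; `cfg_combine` is a made-up name, and this is the textbook formula, not a claim about how Flux or any particular UI implements it.

```python
def cfg_combine(uncond, cond, scale):
    """Classifier-free guidance: push the prediction away from the
    unconditional (or negative-prompt) branch toward the positive one.

    uncond: prediction for the empty or negative prompt
    cond:   prediction for the positive prompt
    scale:  the CFG scale
    """
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


if __name__ == "__main__":
    neg = [0.0, 1.0]   # toy numbers, not real model output
    pos = [1.0, 3.0]
    for s in (1.0, 2.0, 6.0):
        print(s, cfg_combine(neg, pos, s))
```

At scale 1 the result is exactly the positive prediction, so the negative branch drops out entirely. That would be why negatives only do something once the scale goes above 1, and why high scales extrapolate far past the positive prediction and start to fry the image.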
>>
>>102543524
Thank you, I will try this out.
>>
File: 00021-414697068.png (2.52 MB, 1024x1536)
>>102543788
For me, the whole LLM prompting that Flux requires is so retarded because you have to create these elaborated prompts to get something good, instead of a simple prompt that makes your life easier, I mean, Dalle-3 works like that you gave it a prompt and the model actually adds to it thats why they all look good, same with SD(XL), you can add a simple prompt and a play with the cfg values and you can have a beautiful image right away, the whole point of AI models is to make the life of the user easier not harder
>>
File: bake again.png (192 KB, 866x539)
LOL so only now everyone wakes up to the fact that flux is sterile shit and the synthslopped captions and pruning of artstyles resulted in an incredibly boring and overbloated model?
at the end of the day ART always wins, and when your model fails at art it becomes boring. local is scared shitless of putting art first which is why all the recent models are so bland and lifeless
>>
>>102543986
just wait till we loop back around to "well okay 1.5 wasnt THAT bad" after all the overconfidence about flux and sdxl brought people to unholy heights of pride and retardation
>still not gonna forget that blunder from the other day where anons here couldn't tell the diff between 1.5 and flux
>>
>>102543745
>once you've seen a Flux gen you have seen them all
And it was the death of seeds. I have a workflow where I use hundreds of image generators and test prompts in them; it's faster to just feed the same prompt to the same model because the random seed will give something completely different.
Not so with Flux, you're getting a slight variation of the same image, because the noise doesn't matter much and DiT has an idea of what the prompt should be drawn like and does it for all the seeds.
People don't realize how bad it is, and we may have gotten the same blandness with a different, more complete dataset; the only way to fix it and get the variety and creativity back is to stick with unet.
When you want a duck, just pick the part of the noise that looks the most like a duck, and gives you the most creative duck, instead of "I always imagine the same duck and can put it anywhere in the picture no matter the seed" that Flux has going just to allow people to choose where the damn duck is on it.
>>
>>102543986
I wouldn't blame the data it was trained on, but the limitations the devs burdened the model with, it's like a LLM model with 0.1 temperature
>>
>>102543999
You haven't got a link to that post saved?
>>
>>102543999
Stable Diffusion 1.6 scored better than base SDXL on blind tests, and I bet it's garbage compared to the best SD1.5 finetunes.
So, yes, even SDXL was a mistake and I wonder if something like SD1.5+better text encoder could be done, just use all the models that we have with better understanding and coherence, and whatever would fix the eyes and anatomical problems.
The eyes of SD1.5 models have aged pretty badly and it's incredible to go back to old generations I deemed great but now I see the messed up eyes that were always there...
>>
>>102544053
base sdxl has a ton of issues like limited dynamic range which is why they pushed cosxl as a copium replacement
>>
>>102544053
the other problem that 1.5 has is its low resolution, 512x512, even tho a few months ago there was this 768x768 model that popped up and it was really good quality
>>
Alibaba
https://menyifang.github.io/projects/MIMO/index.html
>>
>>102543986
>nothing more than jeets begging for nsfw finetunes

kek that is so true
>>
File: 00688-2371158934.png (2.42 MB, 1296x1728)
>>102544053
>SD1.5+better text encoder could be don
Wasn't there that 16ch vae already?

>>102544044
>like a LLM model with 0.1 temperature
Pretty accurate.
>>
>>102544053
sdxl was fun and soulful
I've moved on now like everyone else, but I'll remember it fondly
never understood the hate for it on /g/
>>
File: contrast.png (1.25 MB, 1536x768)
>>102544062
That was a problem with SD1.5 as well until epiNoiseOffset was introduced.
Most models overdid it, though, here's animeTWO with animeTEN at its side, you can see how the noise offset makes everything less saturated and not as colorful, but once it's in, there's no way to take out the pasta out of the soup, and to bring back the vibrancy you'd need to do something like adding the Gloomifier as negative, but that changes the compositions as well.
I'd have loved to just get a version of animeTEN with original limited dynamic range.
>>
>>102544107
Huh? ZootVision Epsilon delivered well at 1024x1024. SD1.5 has full potential to be like that, he even released a Fluxxer Lora that is supposed to enable 1024x1024 into any SD1.5 model, though I haven't tested it yet.
I don't know what happened with later versions of ZootVision, though, I was never able to get the fine details of Epsilon on them, they seemed like degraded models.
>>
whenever I consider leaving flux I remember what eyes and details look like on sdxl without adding a bunch of shit like adetailer and hiresfix to my WF which are a RNG crapshoot and go back to baking loras for flux for whatever its missing
>>
>>102542006
My default these days is to find/crop an image that has what I want in focus, upload it to Joy Caption, and work from there
>>
>>102544120
>Wasn't there that 16ch vae already?
It's not about that, it's about the model understanding what you want to draw.
flux beats every other model at that, by miles, but it's only because of its text encoder, nobody has made a version of SD1.5 with T5 because that'd be like trying to turn a Delorean into a time machine, but it could be the jump that oldie but goodie technology needs.
>>
>>102544133
civitai was a mistake, all it did is create inbred models
>>
>>102544125
I'll challenge you again to put a SDXL gen that I can't reproduce with some SD1.5 model specialized for whatever you post.
>>
File: ComfyUI_07858_.png (3.19 MB, 1280x1280)
SD3 wasn't that bad either, people got carried away by the whole >le woman laying on grass meme
maybe there was a fix we could've worked on, we failed as a community, fuck reddit, fuck civitai
>>
>>102544165
It was HuggingFace's fault, they could have capitalized on image generation and made people not have any reason to visit Civitai, especially as HF allows unlimited uncensored generations without an account, and that should have been enough of an edge.
Instead, people visited HF and couldn't even find the download button, because clicking on a model, instead of downloading it, goes to an information page about it, so you may download that and see it doesn't work, or get ?download=true appended at the end of the filename so the UI can't use the model.
Plus, their search function is so bad I have many models with the word "World" on them, and none of them appear when you search for world (it's not only me, other models won't appear either), and google won't find them either so your best luck is searching FOR THEIR HASH which does show them on Google, but who's going to know their hash in the first place?
Oh, and, apparently making NSFW is fine as long as you don't publish it, so we can't have NSFW models showcasing what they do like at civitai, which was what made them the standard.
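
On the ?download=true suffix: if a file got saved with the query string stuck on the end of its name, stripping it is a one-liner. A small sketch using only the standard library; `clean_filename` is a made-up helper and the example names are invented.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def clean_filename(name_or_url):
    """Drop any ?query part and return just the file name."""
    path = urlsplit(name_or_url).path
    return PurePosixPath(path).name


if __name__ == "__main__":
    # Invented example names, not real model files.
    print(clean_filename("model.safetensors?download=true"))
    print(clean_filename("https://example.com/a/b/model.safetensors?download=true"))
```

Renaming the saved file to the cleaned name should be enough for a UI that picks models by extension.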
>>
>>102544173
Did you try Kolors? I have this theory that's the model people should have focused on, because it already got creativity and style and who knows how to add those to flux.
And it doesn't have problems about girls sitting on the grass.
>>
File: 1700235085916989.png (16 KB, 128x128)
>>102543424
>So anyone else is kinda tired of Flux already?
Nope, having as much fun as ever
>>
Been out of the loop for quite a while, is this chink Anything XL model good for anything at all? I see it's right under Autism which I know everyone and their mother is using but can't find shit about AXL in the archives.
>>
File: 1700859651071774.png (18 KB, 128x128)
>>102544285
>>
>>102544361
>is this chink Anything XL model good for anything at all
No, it's only popular because it was done by the same guy that made Anything V3 and people spent a lot of time using it.
Anything V7 is better.
>>
>>102544435
>Has a fetish for pictures for ants.
>>
>>102544285
doing what? that? lol
>>
>>102544496
Because it's impossible to find 64x64 icons of those things on the internet.
>>
>>102544521
you literally added nothing to the conversation, and created nothing of interest
>>
>>102544594
What prompts do you use?
>>
File: 1707479625300011.png (20 KB, 128x128)
>>102544473
>phoneposting on /g/
>>102544496
>>102544521
Yes
>>
File: 1712778702362135.png (15 KB, 128x128)
>>102544804
>>
>>102544285
How
>>
>>102543473
>it's so slooooooow, no one is posting anymore

People are afraid. You can get auto-banned for posting too many pictures.
>>
>>102544144
>ZootVision
The ugliest inbred slop model this anon has ever laid eyes on KEK
>>
File: 1719031201150026.png (625 KB, 882x824)
is this the first flux finetune?
>>
File: 31014173.png (3.22 MB, 1152x1728)
>>102545123
It's just doing what the base model already does like every other "finetune".
Also, were the fingers supposed to be like this (right hand)?
>>
>>102545123
A finetune to gen niggers I could gen with vanilla flux (but didn't because there's already too many niggers in this world)? I'm so hyped I'm going to support you directly on Boosty.
>>
>>102545282
it does nudes
>>
>>102545301
it explicitly states "basic" nsfw will be added in a future version kek
>>
>>102545123
>>102545282
Boring. Not impressed.
>>
File: bComfyUI_122166_.jpg (379 KB, 1024x1024)
cursed gloryholes
>>
>>102541612
Have you tried describing the style extensively in the captions for each image? Everything that isn’t the character that is described should be “cancelled out” that way.
>>
File: IMG_0170.jpg (792 KB, 1125x1154)
>>102543986
Someone needs to make a Lora of just art words.
>>
>>102545123
The problem with every flux “finetune” is that the model immediately gets overfit and loses its original quality, i.e., all women look the same, and photos look like the ones from the dataset and lose all the textures from the original flux model
>>
>>102544285
>>102544435
how did you do that? that's cool
>>
File: file.png (22 KB, 1102x128)
Trying to dreambooth train a ponyXL model right now, and it's attempting to use 52GB of VRAM for a model that's only 7GB in size.
Is that normal? I have 24 GB VRAM and 64GB RAM and it still complains that it can't allocate so much. Shouldn't you need much less?
>>
>>102546335
Isn't that the point of a finetune... If I wanted it to look like the original, I'd inpaint on the original
>>
>>102546383
you fucked up some setting + are using prodigy
>>
>>102546448
yeah it's prodigy, why?
>>
>>102544594
I poked and made fun of something, it's interesting under Poe's law.
>>
>>102546426
No, a proper finetune adds stuff instead of replacing it: a complement, not a new allocation. A proper SD finetune would add new stuff that you can play with and mix with what was already present in the model. Every flux finetune I’ve tested just overfits crap from the training dataset, i.e. the women look the same no matter what you prompt
>>
File: file.png (119 KB, 1069x1281)
>>102546383
>>102546448
>>
>>102546369
>how did you do that?
What if he did NOT do that? What if he made 256x256 pictures, shrunk them down to 64x64, and pasted them together?
That's what I did, well, except I don't have his prompts, but what's new?
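The shrink-and-paste approach described above can be sketched without any image library. This is an illustrative sketch only: images are plain lists of grayscale rows, the flat-colored inputs stand in for real 256x256 gens, and the helper names `shrink` and `paste_grid` are made up for this example.

```python
def shrink(img, factor):
    """Box-downscale a square grayscale image (list of rows) by an integer factor."""
    size = len(img) // factor
    out = []
    for y in range(size):
        row = []
        for x in range(size):
            # Average the factor x factor block of source pixels.
            block = [img[y * factor + dy][x * factor + dx]
                     for dy in range(factor) for dx in range(factor)]
            row.append(sum(block) // len(block))
        out.append(row)
    return out


def paste_grid(tiles, cols):
    """Paste equally sized tiles into one sheet, row-major, `cols` tiles per row."""
    sheet = []
    for start in range(0, len(tiles), cols):
        group = tiles[start:start + cols]
        for y in range(len(group[0])):
            sheet.append([px for tile in group for px in tile[y]])
    return sheet


# Four flat-colored 256x256 stand-ins for generated pictures.
images = [[[v] * 256 for _ in range(256)] for v in (0, 85, 170, 255)]
# Shrink each to 64x64 and paste them 2x2 into a single 128x128 sheet.
sheet = paste_grid([shrink(im, 4) for im in images], cols=2)
print(len(sheet), len(sheet[0]))
```

With real pictures the same two steps would normally be done with an image library's resize and paste operations.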
>>
>>102546492
You assume it's possible to make a proper finetune on a distilled model. If it were possible, someone would have added true CFG to it already.
And if it were possible, someone could make something that beats their pro version, so it is in their interest for finetuning to have these problems on purpose.
>>
Is there a reason why lora training GUI defaults to the dreambooth tab in 2024? I've already lost count of how many times I ran into an error when trying to start training because "lol fuck you, it was the dreambooth tab all along"
>>
>>102546458
prodigy + high dim + low batch = big VRAM
>>102546492
Yeah, that's why in sdxl everyone uses pony, right? Your idea of a proper finetune is underbaking
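A back-of-the-envelope sketch of why training can want far more memory than the 7GB checkpoint size. Every number here is an assumption for illustration, not a measurement: the checkpoint is assumed to be stored at 2 bytes per weight, training is assumed to run in 4-byte precision on all weights, activations are ignored, and the per-parameter optimizer state counts (2 for an Adam-style optimizer, 4 for a heavier one) are assumed values.

```python
def training_memory_gb(checkpoint_gb, stored_bytes=2, train_bytes=4, optimizer_states=2):
    """Rough memory for weights + gradients + optimizer state, ignoring activations."""
    # Parameter count implied by the checkpoint size (assumes stored_bytes per weight).
    params = checkpoint_gb * 1e9 / stored_bytes
    # One copy of the weights, one of the gradients, plus the optimizer's own tensors.
    tensors_per_param = 1 + 1 + optimizer_states
    return params * train_bytes * tensors_per_param / 1e9


# Assumed: Adam-style optimizer with 2 state tensors per parameter.
print(training_memory_gb(7, optimizer_states=2))
# Assumed: a heavier optimizer with 4 state tensors per parameter.
print(training_memory_gb(7, optimizer_states=4))
```

Under these assumptions the footprint is many times the checkpoint size before batch size or activations are counted, which is the general direction of the "big VRAM" remark above.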
>>
>>102546693
Because bmaltais has 0 skill and did the equivalent of adding a mspaint paint bucket fill over kohya's work lmao
>>
>>
i wish BFL employees would stop posting here
>>
>>102545763
:O
>>
>>102543693
>but SDXL does, the many finetunes there are out there are really complete
I mean, you're comparing a 1.5-month-old model to a 1+ year old model, of course SDXL has more interesting finetunes, it had more time to mature
>>
>>102543638
>image generation has already peak
It will peak when we can no longer find ways to improve it. There is still so much to improve.
>>
File: teaser_scene2 (1).webm (3.6 MB, 830x830)
>>102544108
wtf that shit is impressive
>>
>>102544990
>People are afraid. You can get auto-banned for posting too many pictures.
wait what? what happened?
>>
>>102547655
it's a thread schizo don't give it attention
>>
>>102547627
FUCKING
EVERYWHERE
>>
>>102547739
hes right tho
>>
>>102547745
And opened an issue that was already asked 6 hours earlier, useless grifter
>>
>>102547762
more issues opened = higher chance of success
>>
>>102547739
kek, at least he wants us to get the good toys, so I won't mind for that one
>>
>>102547739
fuck this guy. Twice I have hit open issues with no resolution for stuff. He is wasting dev time and creating a mess.
>>
>the absolute power of a singular Turkish grifter
>>
>>102547739
>>102547627
>>102544108
did they say they're gonna release the weights? the consistency is quite insane, it's another level there
>>
>>102547627
How can it be so consistent, I never expected this shit to be this good now, maybe in a year or two
>>
>>102547903
Doesn't say. Pretty sad if no
>>
Does any anon know where I can find the latest joycaption local (v21) for free?
It seems to be better than Florence-2 and Llama 3 for captioning images for FLUX loras, but the only place I can find it is on Patreon, behind a paywall.
Kemono hasn't been updated in ages.
>>
>>102544120
nice
>>
>>102544108
>Excellent model
>Trained on a shit ton of parkour, breakdancing, and basket moves videos
That's what I've been saying all this time man! If you give the model complicated human poses it'll be a great one, you can't expect a model to be good at anatomy if you only give it "1girl standing"
>>
>>102547903
>>102547963
I very much doubt it, it's got the usual "Academic research only goy!" disclaimer.
Meanwhile at BFL....
*silence*
>>
>>102544053
>Stable Diffusion 1.6 scored better than base SDXL on blind tests
Yeah, but if I remember correctly, 1.6 is API only and will most likely never become local.
>>
>>102547903
>did they say they're gonna release the weights? the consistency is quite insane, it's another level there
I mean, if they made a github page, it's to be used to run a local model right?
>>
>>102548002
https://desuarchive.org/g/thread/102520372/#102523894
>>
>>102548063
Thank you anon.
>>
>>102547803
He wants it to be released so he can bitch at the devs for weeks to get it to work in comfyui and then sell his patreon.
He doesn't give a shit about anons, that's been proven, he wants to make money out of the process of getting closed software on specialist hardware to work so he can gatekeep it and build his grift.
>>
>>102547903
https://github.com/menyifang/MIMO
>(2024-09-25) The project page, demo video and technical report are released. The full paper version with more details is in process.
They don't tease us with a weight release, it's unlikely we'll get this shit on our hands, it's way too good to be local
>>
>>102548002
https://huggingface.co/spaces/fancyfeast/joy-caption-alpha-one/tree/main
just download it?
>>
>>102548102
>it's unlikely we'll get this shit on our hands, it's way too good to be local
https://github.com/menyifang/MIMO/issues/5
>Alibaba never releases the code of anything it releases so don't expect any of this, not even an API, animate anyone is a clear example
Yep, this is a nothingburger, we can move on
>>
>>102547067
what do you call this creature
>>
>>102548093
To be fair, we all want local models only for ourselves, I don't hate you or anything, but at no point am I telling myself "YES! If this model is local, a lot of people are gonna have their fun with that", I just think of my own enjoyment, like everyone kek
>>
>>102548002
You can literally git clone it from the huggingface repo and run it locally, why the fuck would anyone need the grifter's Patreon version, it's pointless. Mfers really paying $5 because they can't ask chatgpt to make them a .bat file to launch the app.py or what the fuck. That retarded grifter is so inept he claimed on Reddit joycaption won't work with anything but the gated llama repo, when you can change it to anything you want in the app.py. I don't think he even codes the shit he releases, probably asks ai to write it for him then resells. It's so tiresome
>>
>>102548127
half turtle half ant
>>
>>102548122
>>102548145
I wasn't aware the HF one was the latest. I thought he gatekept the actual latest behind a patreon paywall and that the HF one was an older version.
>>
File: file.png (770 KB, 851x898)
>To anyone reading this, does that make you happy?
https://www.youtube.com/watch?v=tQLKc3zSLdU
>>
>>102548158
A turant or anrtle
>>
>>102548160
the creator of joycaption releases everything for free iirc, the only thing gatekept behind Patreon is the dumb grifters' pointless UI for it (it literally comes with gradio built in, I hate that faggot so much)
>>
>>102547739
If you drop a paper and people harass you for code, they are assholes.
If you drop a paper PROMISING CODE with NO CODE it should be legal to gangstalk and terrorize your entire family until you push code.
>>
>>102548199
>the only thing gatekept behind Patreon is the dumb grifters' pointless UI for it (it literally comes with gradio built in, I hate that faggot so much)
come on man, this man gives us the most important thing (the weights) and wants to make a bit of money out of his hard work, don't be like that
>>
>>102548201
>If you drop a paper PROMISING CODE with NO CODE it should be legal to gangstalk and terrorize your entire family until you push code.
Where did they promise anything? I only see that >>102548102
>>
>>102548173
Tl;dr some infantile faggot is upset other people have fun in ways that don't adhere to his preferences. Coming up at 10: cutting his dick off and listing preferred pronouns
>>
>>102548002
You can’t spend five minutes tweaking the 200line app.py?
Fuck, you can’t spend TEN SECONDS telling Claude to?
LAZYBONES
>>
>>102548208
>come on man, this man gives us the most important thing (the weights)
The grifter didn't contribute shit, he just resells the hard work other people gave out freely
>>
>>102548173
why can't these people just understand that the "brush" is bigger now, that's all.
with a bigger brush, you paint a bigger picture. it's still the same amount of human involvement.

For example instead of making one picture, you make hundreds for the sake of a game/comic/animation/storyboards whatever. The project is larger, to suit the larger brush that you now have.
>>
>>102548173
That's the problem with the west: They conflate being able to draw with being an artist.
It's not the same.
Being able to draw is a skill you have to train, but it does not make you an artist.
The average "artist" can only draw capeshit actors and other insipid nonsense, they lack creativity.
If you call yourself an "artist" and think AI can replace you, you are not really an artist, just someone that can draw pretty things.
>>
>>102548173
>Bohoho, art is supposed to be made by hum-ACK
This guy has no idea what he's talking about, the point of art is that it's something completely subjective, everyone has his own interpretation of art, so gatekeeping this definition to "only humans" is like saying that "only painting is art" or "only sculpture" is art.

Art is supposed to thrive from new techniques, new concepts, and AI is just a new tool added to the rest, what a fucking retard.
>>
>>102548217
A link to GitHub on the project page is a legally binding implicit promise of code
Having a link to a GitHub that doesn’t have anything in it is like being a tranny and taking a guy too drunk to tell home without telling. It’s rude and if you die it’s your fault.
>>
File: wuat.jpg (375 KB, 1080x1576)
>>102548173
>>
I don't care about the rantings of some literal who plebbitor and neither should you
>>
>>102548173
I agree with him but my brain is so good at coping that I don't really care. Someone else's problem to deal with desu.
>>
>>102548269
They're the ones sending mail to governors about how AI is dangerous and should be shut down, so no anon, they aren't just seething on reddit, they have real impact that can hurt us
>>
BIGMA SOON
>The new model is almost done. Working hard on report writing. See ya soon!
from their discord
>>
>>102548173
>I don't want to live in a world where humans are completely redundant
Narcisism alert! Humans aren't special nigger, the sooner you learn that, the better. We are just parasites that are hurting the world and nature with our roads, pollutions and shit.
>>
>>102548282
So go write some letters
>>
>>102548287
care to share a screen? did they show some output examples to tease the release?
>>
>>102548236
oh I thought you were talking about the same guy who made JoyCaption, my bad
>>
>Improvements to SDXL in NovelAI Diffusion V3
>https://arxiv.org/abs/2409.15997
They did the v-prediction
>>
>>102548327
NTA >>>/h/8223661
>>
>>102548343
No, that guy is gold. The grifter is that Turkish faggot
>>
File: Andersen.png (808 KB, 680x3149)
>>102548269
>I don't care about the rantings of some literal who plebbitor and neither should you
Normally i would agree.
The problem is that artists are retarded-ness has reached the legal realm.
Sarah Andersen and other talentless hacks GENUINELY believes they owns the concept of doodles and will sue AI companies trying to create precedents and "take down AI".
Her lawsuits are being taken seriously by courts.
>>
>>102548359
>One answer
>"Trash, don't care"
he has a point, now if a base model doesn't reach Flux's level at least, it's gonna be discarded, that's how high the local standards have become
>>
>>102548347
> We trained the model on our 256x H100 cluster for many epochs and roughly 75k H100 hours.
cheap
>>
>>102548378
>The problem is that artists are retarded-ness has reached the legal realm.
I need to fucking proofread what i write.
I meant:
>The problem is that artists retarded-ness have reached the legal realm.
>>
File: file.png (187 KB, 480x525)
>>102548378
>Sarah Andersen
This bitch makes 3yo drawing cartoons and she believes she has a say on all of this? Lmao, it's always the bad artists that are seething about AI not gonna lie
>>
>>102548378
>talentless hacks GENUINELY believes they owns the concept of doodles
Good thing that's not how it works. I almost feel bad for them with how their legal team is simply taking their money. Almost.
>>
>>102548347
I mean, good for them, only ultrarichfags that can borrow thousands of GPUs are gonna use this technique lol
>>
>>102548379
>>One answer
>>"Trash, don't care"
that stung kek
>>
>>102548378
Wouldn't Greg Rutkowski's fear of bad output images be solved by advancement in AI where all the quality is only good, and then it benefits as free advertising?
Lol at her getting torn up on that when she's drawing the equivalent of stick figures.
>>
>>102548403
>I almost feel bad for them with how their legal team is simply taking their money. Almost.
I feel good personally, they got money-punished for being actual retards, we need more of them to lose court battles so that it'll scare the other artists that want to do the same, this bitch is so retarded she's actually hurting her cause
>>
>>102548394
>This was honed with years of education!
My god she's retarded
>>
>>102548403
Her first lawsuit was rejected by a judge because it had outrageous claims like "AI models zip billions of art files into a 5gb file", but since then she (or her lawyers) have amended the lawsuit and now it is about using her art to train AI checkpoints.
It SEEMS some judge agrees this might infringe in artists copyright and does not fall under transformative/fair use.
>>
>>102548413
It would be the art model. V-prediction and zero terminal SNR improve colors so much.
>>
>>102548454
I can't wait for the court battles to fail so bakers stop being retarded and refuse to use artists and copyright in their datasets
>>
>>102548478
Meant stop refusing, oops lmao
>>
>>102548466
>It SEEMS some judge agrees this might infringe in artists copyright and does not fall under transformative/fair use.
nah the judge merely thought it was worth actually hearing the arguments from both sides instead of instantly dismissing it, like he did with all other claims.
>>
The next loaf is ready right here...
>>102548479
>>102548479
>>102548479
>>
File: file.png (336 KB, 850x400)
>>102548466
>It SEEMS some judge agrees this might infringe in artists copyright and does not fall under transformative/fair use.
which is retarded because every artist looks at the work of others to improve their skills
>>
>>102548503
>but it's different cuz it's compooter!
I dislike Twitter artfags
>>
>>102548102
The 2000's equivalent of a "Spring Breakers gone wild" promo clip, you'll have to pay big bucks to attend the party.
Sorry anons, there's nothing for us here.
>>
>>102548387
$2/hr for an h100
So it was $150k to fine tune a tiny model
Finetuning flux would be like $500k
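The arithmetic behind these figures can be checked with a few lines. This is a sketch using only numbers quoted in the thread (75k H100 hours, a 256x H100 cluster, $2/hr, and the $500k guess for flux); the function names are made up for illustration, and the $2/hr rate is the poster's assumption, not a quoted price.

```python
def training_cost_usd(gpu_hours, usd_per_gpu_hour):
    """Total rental cost for a given number of GPU hours."""
    return gpu_hours * usd_per_gpu_hour


def wall_clock_days(gpu_hours, gpus):
    """Days of continuous running if the hours are spread evenly over `gpus` GPUs."""
    return gpu_hours / gpus / 24


def implied_gpu_hours(budget_usd, usd_per_gpu_hour):
    """GPU hours that a budget buys at a given hourly rate."""
    return budget_usd / usd_per_gpu_hour


print(training_cost_usd(75_000, 2))            # 150000
print(round(wall_clock_days(75_000, 256), 1))  # 12.2
print(implied_gpu_hours(500_000, 2))           # 250000.0
```

So the $500k guess corresponds to about 250k H100 hours at the same assumed rate, a bit over three times the quoted run.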
>>
nb4



