/g/ - Technology


Thread archived.




File: the longest dick general.jpg (2.03 MB, 3264x1178)
Discussion of free and open source text-to-image models

Waiting 'til October 10th Edition

Previous /ldg/ bred : >>102699906

>Beginner UI
Fooocus: https://github.com/lllyasviel/fooocus
EasyDiffusion: https://easydiffusion.github.io
Metastable: https://metastable.studio

>Advanced UI
Forge: https://github.com/lllyasviel/stable-diffusion-webui-forge
Automatic1111: https://github.com/automatic1111/stable-diffusion-webui
ComfyUI: https://github.com/comfyanonymous/ComfyUI
InvokeAI: https://github.com/invoke-ai/InvokeAI
SD.Next: https://github.com/vladmandic/automatic
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI

>Use a VAE if your images look washed out
https://rentry.org/sdvae

>Model Ranking
https://imgsys.org/rankings

>Models, LoRAs & training
https://aitracker.art
https://civitai.com
https://huggingface.co
https://github.com/Nerogar/OneTrainer
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts
https://github.com/kohya-ss/sd-scripts/tree/sd3

>Flux
https://replicate.com/black-forest-labs/flux-1.1-pro
https://huggingface.co/spaces/black-forest-labs/FLUX.1-schnell
https://comfyanonymous.github.io/ComfyUI_examples/flux

>Pixart Sigma & Hunyuan DIT
https://huggingface.co/spaces/PixArt-alpha/PixArt-Sigma
https://huggingface.co/comfyanonymous/hunyuan_dit_comfyui
Nodes: https://github.com/city96/ComfyUI_ExtraModels

>Index of guides and other tools
https://rentry.org/sdg-link
https://rentry.org/rentrysd

>Try online without registration
txt2img: https://www.mage.space
img2img: https://huggingface.co/spaces/huggingface/diffuse-the-rest
sd3: https://huggingface.co/spaces/stabilityai/stable-diffusion-3-medium

>Maintain thread quality
https://rentry.org/debo

>Related boards
>>>/aco/sdg
>>>/aco/aivg
>>>/b/degen
>>>/c/kdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/tg/slop
>>>/trash/sdg
>>>/u/udg
>>>/vt/vtai
>>
File: soyblonde.png (902 KB, 475x485)
'tra anchor
>>
File: cjungle1.png (2.13 MB, 2688x1536)
>>
File: cforest2.png (2.62 MB, 2688x1536)
Love booba, love fantasy environment, simple as
>>
post your best UNDISTILLED flux gen
>>
File: ComfyUI_06863_.png (1.45 MB, 1024x1024)
>>102723492
>post your best UNDISTILLED flux gen
they are really similar to distilled flux, which is a good thing, the goal here was to just remove the distillation bullshit, not make a completely different model
>>
File: grid-0379.jpg (774 KB, 1792x2304)
>>
File: grid-0479.jpg (639 KB, 2432x3328)
>>
File: grid-0502.jpg (1.11 MB, 3328x2432)
>>
File: grid-0515.jpg (1.27 MB, 3328x2432)
>>
File: grid-0560.jpg (988 KB, 3328x2432)
>>
Are there any flux loras or models that improve the skin and make it more realistic? The models I've tried all make people look shiny and waxy and too smooth.
>>
>>102723739
go for the amateur loras?
https://civitai.com/models/652699/amateur-photography-flux-dev
>>
>>102723739
have you tried the /ldg/ certified technique? img2img a blurry pic with denoise set to ~.9
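Not claiming this is any specific UI's code, but the usual img2img implementation maps that denoise knob to how much of the sampler schedule actually runs; a rough sketch (function name is made up):

```python
# Rough sketch of the usual img2img "denoise" semantics (not any specific
# UI's implementation): the input image is noised up to an intermediate
# step, and only the last `denoise` fraction of the schedule is sampled.
def img2img_step_range(total_steps: int, denoise: float) -> range:
    """Steps that actually run; the rest are skipped up front."""
    start = round(total_steps * (1.0 - denoise))
    return range(start, total_steps)

# denoise ~0.9 keeps only a faint imprint of the blurry source,
# which is why it can fix plastic skin without copying the blur.
ran = img2img_step_range(30, 0.9)
```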
>>
File: grid-0562.jpg (897 KB, 3328x2432)
>>
File: grid-0564.jpg (1.15 MB, 2432x3328)
>>
I use A1111 but I'm starting to experiment more with things like [tag1:tag2:10] and [tag1|tag2] to adjust how the prompt changes as the steps progress. It is giving me some good results but I wish I had even more control. Are there any frontends that make it easier to adjust how my prompt changes throughout the steps? I think the way A1111 does it will get pretty messy pretty fast.
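For reference, the two syntaxes do different things: `[from:to:when]` swaps text at a given step (treated as a fraction of total steps when `when` < 1), while `[a|b]` alternates every step. A toy resolver assuming those documented semantics, handling one pattern per prompt (real A1111 supports nesting and more forms):

```python
# Toy A1111-style prompt-editing resolver. Assumes the documented
# semantics: [from:to:when] switches at step `when` (fraction if < 1),
# [alt1|alt2] alternates every step. One pattern per prompt only.
import re

def resolve_prompt(prompt: str, step: int, total_steps: int) -> str:
    # [from:to:when] form
    m = re.search(r"\[([^\[\]|:]*):([^\[\]|:]*):([\d.]+)\]", prompt)
    if m:
        frm, to, when = m.group(1), m.group(2), float(m.group(3))
        switch = when * total_steps if when < 1 else when
        chosen = to if step >= switch else frm
        return prompt[: m.start()] + chosen + prompt[m.end():]
    # [alt1|alt2] form
    m = re.search(r"\[([^\[\]:]+)\|([^\[\]:]+)\]", prompt)
    if m:
        alts = (m.group(1), m.group(2))
        return prompt[: m.start()] + alts[step % 2] + prompt[m.end():]
    return prompt
```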
>>
>>102723432
>>102723475
Based.
>>
>>102723899
If you want more control, use Comfy.
>>
>>102724119
I used comfy once or twice and I remember just having a textbox for the prompt, which didn't inspire confidence. Is there more advanced prompt stuff in comfy?
>>
>>102724134
try it again
>>
>>102724227
Okay now what
>>
>>102724365
don't know i use forge
>>
File: file.png (130 KB, 1430x551)
>Stability Ai (almost) quintuples the API costs for SDXL and SD 1.6
This is insane, they'll probably be bankrupt next year, that's for sure
>>
>>102724424
>SD1.6
the fuck?
>>
>>102724548
it's a finetune of 1.5 that they somehow called 1.6 lol
>>
File: ferr.png (2.31 MB, 1568x1568)
>>102723260
Some of these collage selections are awful. It's like you're picking random-ass ones
>>
>>102723475
where booba?
>>
>>102724598
i like the randomness and variety
>>
>>102724548
i think it was a more censored version of 1.5
>>
>>102724424
they've increased the price only for the models that make them the most money; you know SD3 is a piece of shit when they don't even bother to milk it before jumping ship lol
>>
File: file.jpg (1.15 MB, 4320x3890)
https://github.com/EnVision-Research/Lotus?tab=readme-ov-file
For those doing some controlnet, I think you'll be interested in this one
>>
haven't done this stuff in a while, Automatic's UI still good? What's the general anime model used nowadays for someone with 8gb of VRAM? I used to use EverythingV3 last time I dicked around with this
>>
>>102724861
Is it just for image preprocessing?
>>
>>102724908
anythingV3*
>>
>>102724939
yeah, that's an important part, with a bad preprocessing image it won't work well
>>
>>102724134
You need to get the example workflows from ComfyUI's github. You just drag and drop the image to load the workflow. There are also workflows available on Civitai. If you want more control, you need to put up with the complexity.
>>
>>102724908
The A1111 fork Reforge is the go-to UI. SDXL Pony model for anime
>>
>>102723698
model?
>>
>>102724424
>they're probably bankrupt next year
anon has been saying this forever kek
>>
>>102725238
that's fair, I really thought they wouldn't survive all of 2024; I have no idea how they can still make money with such shit products, now that Flux exists they're obsolete
>>
File: file.png (2.05 MB, 1024x1024)
please send help
>>102725348
i much prefer illustrious outputs, but sometimes it gives me SUCH GARBAGE nonsense noise
>>
>>102725439
go for anime loras anon
>>
brothers i am new to this, please enlighten me
i am using a model that says it's finetuned on danbooru tags, very well.
But why do i see shit like "detailed background" in prompts on civitai... when it isn't a danbooru tag?
is this what they call a LAION caption? since the model is tuned to danbooru tagging, wouldn't this be counterproductive?
>>
>>102725581
in my experience rule34.xxx sticks to the tag list way more accurately than danbooru does.

https://rule34.xxx/index.php?page=post&s=list&tags=detailed_background
it has 110731 entries

there are also furry boorus that the pony guy probably trained on, I haven't looked at them.
>>
>>102725631
i dont use pony
>>
>>102725670
I find it hard to believe there's a model finetuned on booru tags that isn't a "descendant" of pony.
>>
>>102725581
>why do i see shit like "detailed background" in prompts on civit
Civitfags are retarded and notoriously omit or lie about prompts. Its only use is as a repository for models, and even then it blows ass.
>>
>pixart shills left
>flux shills left
>>
>>102725842
t: homosexual
>>
nice gens
>>
>>102725721
what rock have you been living under everyone's using illustrious now
>>
>>102724564
And people like it better than SDXL Base!
It's the branch we should have gotten instead of the SD2.0 and SD3.0 fiascos.
Instead, it's private.
>>
>>102724750
No, they made it have body coherence and better eyes à la SDXL.
It's incredible the community was never able to produce something like that, SD1.5's compositions done right while keeping all the access to styles and artists; community efforts just destroyed them and did their own thing.
Lehina 2.0 was the closest we got to an SD1.6 and it was even worse than Lehina 1.0, lol.
(Lehina is a great model; "worse" here is like a B+ being worse than an A.)
>>
>>102724908
Nobody ever topped iffyMix.
>>
>>102725264
People vote with their wallets.
Democracy doesn't work because half the people can't get to 100IQ.
>>
File: fs_0218.jpg (153 KB, 1232x1440)
>>
File: bComfyUI_126494_.jpg (728 KB, 1440x1024)
>>
File: bComfyUI_127149_.jpg (667 KB, 1536x768)
>>
File: fs_0228.jpg (151 KB, 1232x1440)
>>
>>102726894
trippy
>>
File: bComfyUI_127380_.jpg (692 KB, 1440x1024)
>>
Bi-
>>
File: fs_0248.jpg (133 KB, 1232x1440)
>>
>>102726461
>Democracy doesn't work because half the people can't get to 100IQ.
true that
https://www.youtube.com/watch?v=XFfJC3D2ZWs
>>
https://huggingface.co/nyanko7/flux-dev-de-distill
Is there some people who have tried loras on this model? If yes, what's your impressions? Do you manage to get something more interesting than the distilled one?
>>
File: c_00012_b_red_00003_.png (753 KB, 832x1216)
I am trying to run some tests and having issues getting Comfy to name things. Pic related is not associated with any file named c_00013_b_red_00003_.png. The issue is that batches don't hit any kind of counter, so I can't number anything within the same batch.

All I want to generate in one run is two sets of files:
test_1
test_1_alpha
test_1_beta
test_2
test_2_alpha
test_2_beta

Is there a node for this or am I writing yet another bs node for myself?
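For what it's worth, that target list is just a nested loop outside Comfy; a plain-Python sketch (function and variant names are hypothetical) that a custom save node could reuse for its filename prefix:

```python
# Hypothetical sketch of the naming scheme asked for above: a base name
# per run plus suffixed variants, enumerated in one pass so the counter
# survives across the whole batch.
def batch_names(run_index: int, variants=("alpha", "beta")) -> list:
    base = f"test_{run_index}"
    return [base] + [f"{base}_{v}" for v in variants]

names = [n for i in (1, 2) for n in batch_names(i)]
# names == ['test_1', 'test_1_alpha', 'test_1_beta',
#           'test_2', 'test_2_alpha', 'test_2_beta']
```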
>>
File: file.png (3.03 MB, 856x1528)
https://civitai.com/models/832858/anime-art?modelVersionId=931713
Hehh that's pretty good!
>>
>>102727644
Yes, the quality / prompt understanding of loras trained on this is noticeably better than dev. But more importantly, the model itself just gives better results than anything I can get with normal dev + CFG + antiburner. I'd rank things like this:
dedistill lora on dedistill > dev lora on dedistill >>> dev lora on dev > dedistill lora on dev
>>
>>102727698
You can try using Image Save from WAS Suite. Use Convert Widget to Input and connect it to a primitive to standardize the naming.
>>
>>102727797
What do you use to train the dedestilled lora? I couldn't get it to work on Kohya.
>>
>>102727762
I really don't like these kinds of hyper detailed images. I like styles that have less room to make tiny mistakes.
>>
File: file.png (1.99 MB, 1024x1024)
>>102727831
>I really don't like these kinds of hyper detailed images. I like styles that have less room to make tiny mistakes.
I see what you mean, but the lora seems to give both details and good coherence, got that one with undistill dev
>>
>>102727823
>What do you use to train the dedestilled lora? I couldn't get it to work on Kohya.
that's weird, because he said he made it work for undistill
https://github.com/bmaltais/kohya_ss/issues/2885#issuecomment-2395679834
>The latest commit includes the latest sd-scripts updates with flux dev de-distilled model support. Give it a try.

I think you need to use this branch to get it to work though
https://github.com/bmaltais/kohya_ss/tree/sd3-flux.1
>>
>>102727797
>dedistill lora on dedestill > dev lora on dedistill >>> dev lora on dev > dedistill lora on dev
where did you find a dedistill lora though? if you have one can you share it with us? I'm interested to see how much better they can be
>>
File: node_hack.png (48 KB, 1173x397)
>>102727801
>Image Save from WAS Suite
I am not seeing anything in here that will help. The naming isn't the issue; it's that I need to run a for loop (or something) inside the save node. Here is something I threw together. The arrow indicates where the increment happens: it only fires between runs and stays static during batches.

I could be missing something with WAS, so tell me if I am.
>>
File: file.png (801 KB, 1183x766)
>>102727797
>the model itself just gives better results than anything I can get with normal dev + CFG + antiburner
I tested it and yeah, the prompt understanding of undistill + CFG is equivalent to normal dev + CFG + AutomaticCFG, but the most important part is that undistill doesn't burn (as it should), so you can finally get realistic gens with great prompt understanding without that burning effect. I'm glad it's possible now, we're so back
>>
File: file.png (3.69 MB, 2781x1486)
I really recommend trying the APG node, he updated the default values and I'm loving them
https://github.com/MythicalChu/ComfyUI-APG_ImYourCFGNow
https://www.reddit.com/r/StableDiffusion/comments/1fxbfzn/comment/lqleowq/?utm_source=share&utm_medium=web2x&context=3
>>
>>10272788
I'm using the same branch. Did you use the same training settings for dedistilled model that you use for distilled dev? When I say it's not working, I mean the samples from training look like gibberish.

It's not uncommon for him to release features that don't work correctly or at all.
>>
>>102727986
Meant to quote
>>102727883
>>
>>102727986
>When I say it's not working, I mean the samples from training look like gibberish.
I think you have to ask that anon, he seemed to have better results with it >>102727797
>>
>>102727891
It's my own, trained on women peeing and wetting themselves (yes, this is my fetish...). I've uploaded the dev version on civit a while back, and plan on uploading the dedistill once I've done a few more experiments.
>>
>>102727948
this works with flux?
>>
>>102728040
>this works with flux?
it works on flux undistill yeah, because that one can handle CFG
https://huggingface.co/nyanko7/flux-dev-de-distill
>>
>>102728038
>It's my own, trained on women peeing and wetting themselves (yes, this is my fetish...).
what hyperparameters did you use anon? were they that different compared to distilled lora trainings?
>>
>>102728048
needs Q8
>>
>>102728087
>needs Q8
already here
https://huggingface.co/TheYuriLover/flux-dev-de-distill-Q8_0-GGUF
>>
>>102728068
No different from dev. LR 5e-5, global batch size 16 (4 GPUs with gradient accumulation steps of 4), rank 96 lora, guidance scale 1, timestep_sampling='flux_shift', AnyPrecisionAdamW optimizer with bf16 state (basically AdamW, but you can make all the state bf16 to save memory; I added this optimizer by just copying the code in), adam betas (0.9, 0.99), images at 512x512 res with aspect ratio buckets. Anything else is just what kohya recommended in the readme.
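Sanity-checking the batch math in that recipe: global batch = GPUs × per-GPU micro-batch × gradient accumulation, so 16 with 4 GPUs and accumulation 4 implies a per-GPU micro-batch of 1 (my inference; the post doesn't state it):

```python
# Effective (global) batch size arithmetic for multi-GPU training with
# gradient accumulation. The per-GPU micro-batch of 1 is inferred from
# the post's numbers, not stated there.
def global_batch(gpus: int, micro_batch: int, grad_accum: int) -> int:
    return gpus * micro_batch * grad_accum

assert global_batch(4, 1, 4) == 16  # matches the post's setup
```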
>>
>>102727912
>It will only happen between runs and remains static during batches
I don't believe Comfy can do that OOTB nor with any custom node unfortunately
>>
File: Image.jpg (1.49 MB, 2304x1792)
>>
>>102728221
Damn, I love stuff like this, prompt?
>>
>>102727238
-gma
>>
File: 1727391601186172.png (2 MB, 832x1216)
https://civitai.com/models/833294/noobai-xl-nai-xl

Been using this extremely early illustrious finetune. Getting good results with DPM++ 2M Karras in SwarmUI.

Just a little bit more and Pony's reign of terror will be over. At least for V6. I dream of local NAI.

Illus is trained on an up-to-date booru dataset so it can recognize a lot of obscure artists, and unlike Pony, it doesn't hide artist tags.
>>
>>102728561
we've come quite a ways with xl
>>
>>102728561
>Just a little bit more and Pony's reign of terror will be over. At least for V6. I dream of local NAI.
if only they would go for flux dev instead, this anime model would obliterate everything in sight

>and unlike Pony, it doesn't hide artist tags.
Pony is a cuck and I'm glad people with balls are up to the task of dethroning him
>>
>>102728561
>NAI-XL
Wait that's a finetune made by NovelAI? Why are they giving it to us?
>>
>>102728561
>must sign in to download on civitai and huggingface
Goddamnit.
>>
>>102728097
Nothing atypical there. Are you using the fp8 undistilled model or the full-sized one?
>>
File: flux fucked.jpg (185 KB, 2051x949)
How do you use the de-distilled GGUF flux on comfy? Using my old setup for normal flux GGUF seems to fuck the gens up. Do I need a different vae or something?
>>
>>102728722
anon, it's de-distilled, which means it's supposed to be run at CFG > 1; go for CFG 3.5/4 to get good gens
>>
>>102728729
That was it, thanks.
>>
File: file.png (2.33 MB, 1024x1024)
>>102727762
that lora has a lot of sovl, with good pictures, Flux can stop being boring and that's cool
>>
>>102725439
use that illustrious-smooth-spo finetune or whatever it's called and artist tags
>>
>>102728101
I figured out how to do it, but I am not willing to change everything else. Using an integer ticker and integer division you can delay everything until you hit a "batch size". You then queue everything as normal, and the settings only change when x mod "batch size" == 0.

The main reason I am not pursuing it is that comfy is a piece of trash that refuses to use proper types, and I am having issues with a ton of custom nodes not agreeing on what int, bool, boolean, string and combo are.
>>
is bigma here yet?
>>
File: file.png (2.65 MB, 1024x1024)
>>
>fuck Diffusers
Why tho?
>>
>>102728704
I have the fp16 model but set fp8_base_unet to train it using fp8.
>>
File: Image.jpg (2.1 MB, 1152x2688)
>>102728375
>nintendo 64 screenshot, n64 screenshot, scenery, warping twisted schizophrenic dream drug intoxication sequence in another reality,

>form constant, corridor, arboreal, interior, mario, controller,
>>
File: file.png (3.9 MB, 3326x1477)
>>102727762
it's insane how much we've improved vanilla flux in just 2 months: we got PuLID, APG, good loras, we undistilled flux dev; combine all this shit and you start to get serious quality and soulful pictures
>>
File: file.png (420 KB, 688x1088)
undistilled flux dev has this tendency to go for vertical black bars, that's annoying
>>
File: file.jpg (452 KB, 2887x1547)
>>102729011
>undistilled flux dev has this tendency to go for vertical black bars, that's annoying
that's the moment where it's convenient to finally have a working negative prompt
>>
>>102728867
Looks like there might be something wrong with the fp8 model I used.

https://huggingface.co/MinusZoneAI/flux-dev-de-distill-fp8/discussions/2
>>
File: ComfyUI_temp_ipchu_00025_.png (2.99 MB, 1880x1072)
>>
>>102729077
the guy who made that post is a retard, you're not supposed to get the exact same result from distilled to undistilled; but yeah, his undistilled fp8 picture looks bad as fuck, I think that model is broken. I suggest you download the fp16 and set fp8_base_unet for the training
>>
>>102729107
And you don't seem to understand...
>>
>>102729110
>his undistilled fp8 picture looks bad as fuck
that's because he used the distilled workflow, which doesn't have a cfg node, so it runs at cfg = 1 and of course it's fucked; the undistilled model should be run at cfg 3 at least
>>
>>102728937
Thanks
>>
>>102728038
>(yes, this is my fetish...)
I thought that was mine as well until I managed to make DreamShaper draw a girl with a miniature golden waterfall in the middle of her legs that successfully imitated that.
And I got nothing, the whole thing was about the chase and managing to get what I couldn't, once it was done and I knew how, I lost interest.
The same is happening now with pantyshots, I'd merge entire models with one another just to manage to get a girl to show her panties in the style of the model that would never do it.
But in recent times I've been getting so good at prompting that I've managed to get full upskirts from safe models that would never do it before.
It was a promptographer skill issue.
But the challenges are disappearing and the more I can draw, the more boring it becomes.
I guess that's a thing that makes Flux so boring: it was fun to explore the limits, but there's nowhere to go with no limits.
>>
>>102728593
NAI-XL stands for NoobAI-XL, noob
>>
>>102728561
>this is an image generation model based on training from Illustrious-xl
but that's version 0.1, why didn't they wait a bit so they could finetune the final version instead?
>>
https://epiphqny.github.io/Loong-video/
this makes CogVideo look like it's fucking sora kek
>>
>>102729307
It's just a test to check their multi-GPU nodes. Also, we won't get a final version of illustrious; angel got in trouble because some sperg from the WD discord tattled to the CEO
>>
File: ComfyUI_temp_ipchu_00037_.png (2.2 MB, 1880x1072)
>>
>>102729335
>some sperg from the WD discord tattled to the CEO
what happened? did the CEO not intend to put this on local initially?
>>
is it just me or they look like garbage
https://www.youtube.com/watch?v=Lru3Esv2Ya8
https://www.youtube.com/watch?v=5vvTfBgVNag
https://www.youtube.com/watch?v=oqqT-0FGk6A
https://www.youtube.com/watch?v=Z9GGW8WTOHs
>>
>>102729360
It's somewhat unclear because Angel's English is shaky at best, but that's more or less how I understand it. That, or he changed his mind after some Cer Furkan tier tard harassed him, decided it wasn't worth it, and pulled out.
>>
>>102729391
Left looks definitely better.
>>
>>102729391
>is it just me or they look like garbage
no it's not you it looks really uncanny, as if the CFG is too high or something, but what's impressive about it is the temportal consistency, it's on another level
>>
>>102729394
>after some Cer Furkan tier tard harassed him
jesus fucking christ there are so many useless spergs who contribute absolutely nothing
>>
>>102729391
we can already simulate the entire engine we dont need simple img2img
https://www.youtube.com/watch?v=p6RzS_mq-pI
>>
>>102729431
as impressive as the doom demo is, it's not the future; what if you want to modify the game or add some mods, some DLC... you'd have to retrain the model all over again
>>
>>102729426
>there are so many useless spergs who contribute absolutely nothing
it's worse, they destroy cool projects and are proud of it, those people are parasites on earth
>>
>>102729460
>simulating entire game engines isn't the future, img2img is
>>
>>102729508
yep, vid2vid is, because you just need to make simple 3d scenes and let AI do the rest (texturing, lighting, shading...); you have way more control that way
>>
File: ComfyUI_temp_ipchu_00045_.png (2.66 MB, 1880x1072)
>>
>>102729565
what model anon?
>>
>>102729574
>>102728561
>>
>>102729594
thanks
>>
File: file.png (3.72 MB, 2055x1606)
yeah that one doesn't look like Monica at all, or is it?
>>
File: file.png (1.91 MB, 1024x1024)
>>
>>102729656
Looks little bit like her
>>
>https://github.com/declare-lab/della
Could this be made to work with sdxl checkpoints?
>>
>>102729391
The problem is those graphics are ALREADY aesthetically pleasing, so you can only make them worse; you can only go downwards.

It's funny, it reminds me a lot of hollywood writers rewriting the stories of books for adaptations. Those books are already at the top of their game (that's why they were a popular book) - so any changes made will always be for the worse.
>>
>>102730076
That's not the case: there are high levels of quality and high levels of aesthetics, but you can't ever reach perfection, so things can always be improved.
It's just that it requires skills that the people put in those positions often lack, so they make changes that drop the quality.
AI could be used to improve graphics because there's a skill ceiling: the originals weren't perfect and can't be perfect, so they can be improved indefinitely. But not like this, never like this.
>>
>>102730115
>>102730076
I think that in these cases AI could be used to improve textures, animation, improve the game code itself etc. I2I like this seems like a waste.
>>
>>102730157
>I2I like this seems like a waste
Only because it wasn't done well, if you can't imagine how an image could look better it's a problem with your imagination.
>>
>>102730115
>It's just that it requires skills that the people often put in those positions lack, so they make changes that make the quality go down.
yeah, 99 times out of 100. so basically it's the height of ego to think you can improve on other professionals' work; that's the only point i was making.
>>
>>102726931
Very cool
>>
File: segs.png (1.57 MB, 1008x748)
I love how Comfy's Impact Pack calls the mask-like things their nodes need "SEGS"
Even better, they put what the detailers need on the right, so you get "Detailer (SEGS)"
Unironically made my comfy experience fun
>>
>>102728729
Does changing CFG on de-distilled drastically increase gen times like it does on distilled? I could get drastically different images on the same seed by changing CFG to 0.9 or 1.1-1.5, at the cost of double to triple gen times. If de-distilled is as slow, it's probably not worth using for me, although I'm downloading it now to try.
>>
File: VPX-105.jpg (2.93 MB, 2608x2080)
Morning
>>
>>102731691
morning
>>
>>102730076
No that's not the issue. The issue is that the AI generated result lacks art direction. The games' graphics have been specifically designed to create certain moods and atmosphere. That's why certain scenes are at specific times of day, have specific lighting, it's a crafted visual experience.

The AI-generated result is just the AI doing its best to convert that footage into a "realistic" visual style, but it's limited by whatever footage it was trained on. That's why the visuals take on an uncanny "stock footage" quality; there's a lack of artistic intent. However, the AI also captures so much subtle nuance in lighting and texture that the final result is undoubtedly "realistic", more realistic than pretty much anything traditional rendering techniques can achieve. But it's also kind of sloppy and generic due to the limitations of the technology.
>>
>>102731124
how long were the gen times for you?
>>
File: file.png (2.65 MB, 1272x1720)
>>
File: bComfyUI_126233_.jpg (751 KB, 1440x1024)
>>
File: bComfyUI_126291_.jpg (846 KB, 1280x1024)
>>
File: bComfyUI_127608_.jpg (727 KB, 1440x960)
>>
Is Illustrious or NoobAI worth picking up if what I want to do is gen environments/backgrounds?
>>
>>102732539
try them and find out, it's half the fun of messing around with this stuff and seeing what you can get.
>>
>>102732424
I love it.
>>
>>102726931
yeah, very cool
>>
>>102731124
cfg > 1 always doubles the time of rendering yeah
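That doubling falls out of how classifier-free guidance works: at CFG > 1 every step needs two model passes, one conditional and one unconditional, combined as below, while at CFG 1 only the conditional pass is needed. A minimal sketch with floats standing in for latents:

```python
# Classifier-free guidance combine step. With cfg == 1 the result equals
# the conditional prediction, so the unconditional pass can be skipped;
# any cfg > 1 forces both passes per step, hence roughly 2x gen time.
def cfg_combine(uncond: float, cond: float, cfg: float) -> float:
    return uncond + cfg * (cond - uncond)
```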
>>
https://reddit.com/r/aivideo/comments/1fz06y3/im_perdy/
wait we can do img2vid on Minimax now?
>>
>>102729391
They look like bogged remakes, plus some awful color grading / post-processing mods made by someone with no taste, turned up to the max.
>>
>>102732442
cool as hell
>>
File: IMG_0471.png (1.16 MB, 1024x1024)
Schnell
>>
>>102732539
Looking at the civitai page, it also looks good at 1 or 2 girl
>>
>>102723899
A1111 has the simplest implementation of prompt editing. Everything else is more convoluted.
>>
File: 00020-279650773.png (2.81 MB, 1024x1440)
>>
>>102734186
What model/loras?
>>
>>102733703
schnell is worthless desu
>>
>>102732442
>Reactor online
>Sensors online
>Weapons online
>All systems nominal.
>>
>>102734246
flux dev, Retro Anime lora, kodak_800 lora at .3 for lighting, and some version of my asuka lora
>>
Bigma status?
>>
>>102723284
what did she mean by this
>>
>>102734639
two more weeks
>>
File: 0.jpg (280 KB, 788x1195)
>>
>>102732888
>VIDEO REMOVED
>REASON:
> videos must be at least 10 seconds long, no loops
alright
>>
>>102729530
Vid2vid was the future until we figured out how to do >>102729431
>>
>>102735174
this is a retarded rule, Minimax only allows for 5sec videos, they're killing their subreddit that way
>>
>>102734460
nta but ty
>>
>>102728561
Has anyone tried to mix it with other XL models? That should work, right?
>>
RELEASE YOUR SHIT ALREADY PIXSHART
ARE YOU GAY
ARE YOU RETARDED
>>
it's too hyped now, everyone will be disappointed
>>
>>102732442
decent
>>
I haven't tried anything on Comfy yet, but does anyone know whether it's possible to do something akin to NAI's Vibe Transfer on SDXL? I've tried T2I and IPAdapter, and it might just be me being complete crap at understanding how to use them in Forge, but I'm not getting any results with either.
I've tried prompting for the desired outcome, but... Not much luck there.
>>
File: file.png (254 KB, 1514x1125)
https://github.com/DwanZhang-AI/SePPO/tree/main
https://arxiv.org/pdf/2410.05255
are we using any of those techniques on a Lora training?
>>
>>102735818
New here. What is?

>>102735620
QRD?
>>
>>102736344
IPAdapter for XL is worse than 1.5 but either way your best bet is to play around with the settings. Comfy has a number of "advanced" IPAdapter nodes as well that may serve you better than Forge.
>>
File: file.png (1.54 MB, 2333x1390)
https://github.com/ohayonguy/PMRF
could this be used on Flux or something?
>>
>>102736483
>Posterior
Heh...
>>
File: file.png (21 KB, 1180x169)
>>102736414
It would appear that the built-in controlnet is borked, the fuck do I do
>>
>>102729391
it is not just you.

there might be ways to incorporate """AI""" to make video games look better. This is not it.
>>
>>102736483
what are the chances we just get this again?
>>
>>102736606
seems unlikely; the more pixelated the source, the more possible solutions there are, and the original here is so pixelated that it admits too many solutions for the AI to just guess the right one
>>
>>102734929
Ugh
>>
pixartsexuals on suicide watch
>>
>>102729391
I don't get why people keep doing this.
There are exciting things happening in graphics, but real-time frame-by-frame AI is not going to be powering almost any of it
>>
>>102736606
Why do I get the feeling Obama wants me to buy Skyrim?
>>
NoobAI-XL is pretty good. I think it'll be the last decent XL checkpoint we get.
>>
we love FLUX don't we folks
>>
is stable-diffusion.cpp actually good if i want to write my own gay little script for inference? perhaps adding a dead simple UI on top as well?
the only other option i see is diffusers, but that appears mega cucked, nor do i fully understand the difference betwixt the two
>>
File: IMG_0474.png (1.07 MB, 1024x1024)
>>102736988
I do
>>
>>102736483
If you believe hard enough anything is possible
>>
File: thank you mistrail.png (365 KB, 990x866)
>>
File: 00020-2606397679 copy.png (3.31 MB, 1632x1152)
>>
>>102736730
real time AI is the holy grail of rendering
>>
>>102738014
Very nice
>>
>>102738026
It’s a waste
>>
joycaption author uploaded this to HF with no info on the card:
https://huggingface.co/fancyfeast/llama-joycaption-alpha-two-hf-llava/tree/main
anyone know if he posted anywhere about it?
>>
holy dead
>>
>forever stuck genning 1girls because it makes peepee hard
Fuck
>>
any Azula gens?
>>
>>102738026
not via img2img or vid2vid desu
>>
File: IMG_0498.png (1.21 MB, 1024x1024)
>>102739153
Ikr
>>
>>102739153
follow the peepee anon
>>
>>102736730
literally the only way to make money with AI. Outside of existing systems (recommender and such) all the AI progress has been speculation and showing other people cool shit. This stuff is entirely masturbatory in one way or another.
>>
File: IMG_0516.png (1.24 MB, 1024x1024)
>>102739740
No way
>>
>>102736730
Why do anything? Many people do hobbies just for fun. There is no angle outside of some personal amusement and perhaps sharing that amusement with others. It also is a glimpse of what AI might look like in 10 or 20 years. It's like griping about people doing 3D graphics on a Commodore 64.
>>
File: 00164_.png (1.28 MB, 896x1152)
Besides turning up the bbox threshold, any tips for not running my auto hand inpainting on feet?

>>102739804
>people doing 3D graphics on a Commodore 64
that was surprisingly groundbreaking
>>
File: wrb123.webm (3.38 MB, 1280x720)
The future is so close, frends
>>
File: file.png (442 KB, 1287x514)
>>102731945
Gone from ~2 minutes per gen with what I was using to 4-5 minutes with de-distilled, and I have to re-tinker so many levers to actually get gens in the same ballpark as before, so I don't know if it's actually better yet. Negative prompting to unfuck fingers has been nice though.
This is on a 7900XTX.
>>
>>102730995
prompt?
>>
>>102740349
>FluxGuidance 4.0
I don't think distilled guidance works anymore, that's the whole point of de-distill; you can remove that node anon
>>
>>102738014
nice
>>
File: 00172_00003.jpg (128 KB, 896x1152)
>>102739969
bump. Thinking of trying segment anything if anyone has some thoughts on that.
>>
I've used A1111 a few months ago and had decent results creating lewd stuff, but now I'm seeing really good shit. Do I have to update anything since my absence?
>>
>>102740958
use forge
current lewd checkpoint for hentai is illustrious (smooth) or noob nai
for still-sfw lewd, flux (see op)
for photorealistic sexo... I guess some sdxl model, idk about that shit
>>
>>102741005

Thanks, getting on that right now. Hopefully I can recreate some of what I see around. It seems they really improved the lewd shit, amazing.
>>
>>102726104
ty anon
>>
File: ComfyUI_00238_.png (1.78 MB, 770x1126)
>>102723260

I'm using ComfyUI and I noticed the metadata in the images I generate doesn't record which model I used. It includes the names of the LoRA nodes but nothing else useful. Is it possible to have it write the LoRA and model hashes into the metadata?
>>
>>102727644
still waiting for someone to make a lower quant
>>
>>102741680
what quant do you want? I can try it
>>
>>102741677
share a catbox? ive never heard that before very strange
>>
>>102741700
i like nf4 (v1) for the speed, but something like a Q4 k quant would be good too
>>
File: 1717433525523875.png (1.76 MB, 770x1126)
>>102741723
https://files.catbox.moe/xih3u0.png

Bear in mind I'm still pretty new to comfy.
>>
>>102741727
>Q4 k
the issue is that the only repo I found for quants is this one, and it doesn't seem to do the K quants
https://github.com/leejet/stable-diffusion.cpp/blob/flux/docs/quantization_and_gguf.md
>>
>>102741677
https://github.com/alexopus/ComfyUI-Image-Saver should allow you to save hash data i havent used it tho
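if you'd rather not add a node at all, the short hash sites like civitai display is (iirc, so treat this as an assumption) just the first 10 hex chars of the file's SHA-256, which you can compute yourself and stuff into your own metadata:

```python
import hashlib

def short_model_hash(path, length=10, chunk_size=1 << 20):
    """SHA-256 of a checkpoint file, truncated civitai 'AutoV2'-style."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # stream the file so multi-GB checkpoints don't blow up RAM
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()[:length]
```

then pass the result as extra PNG info through whatever save-image node you use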
>>
>>102739969
>>102740942
I dont gen Feet sorry but I like these booba
>>
>>102741759
use that one to get your QK quants anon
https://github.com/city96/ComfyUI-GGUF/tree/main/tools
>>
>>102739969
maybe crop the feet out, run through detailer, and then stitch them back together? brain dead idea
>>
>>102727644
https://huggingface.co/nyanko7/flux-dev-de-distill/discussions/3#6705765f2214de561f5499d4
>My first test was a total success. I trained many people at the same time without bleeding between each other, it works perfectly, and the lora can be used on regular flux dev, which is much faster at inference. I get a little bleeding with two subjects with similar names ("Diego man" / "Dani man") but it's minimal, so it can be fixed by changing the name to "Daniel man". I saw a little class bleeding but I think it can be fixed using regularization images. The model behaves very similarly to regular SDXL. Training many subjects at the same time was impossible with regular flux-dev. The lora is still a little undertrained so I will continue, and try regularization for the person class, because my dataset contains people of different genders and ages. My captions are very simple: "name class". This model is awesome and very promising so far.
that's cool, I guess that's a good model if you want to make better quality loras
>>
>>102740070
>left keeps trying to take her top off
>cucked ai forbids her from doing it
The horror
>>
>>102741794
do you happen to know how to make nf4 ones too?
>>
>>102741865
I have no idea dude lol, desu you should download the fp16 model and try to convert it to nf4 if you can
>>
>>102741870
i tried looking it up but couldnt find any tools for converting them to nf4, and people only make q quants now
>>
File: file.jpg (2.04 MB, 7961x2897)
>>102741893
>people only make q quants now
that's because nf4 sucks, Q4_0 is the same size but is better than this garbage quant
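for anyone wondering what Q4_0 actually does: roughly, per-block absmax scaling then rounding to 16 uniform levels. toy numpy sketch below (the block size and scale rule are simplifications, not the exact ggml kernel, and nf4's codebook is nonuniform so this alone doesn't settle the nf4 debate):

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.standard_normal(32 * 1024).astype(np.float32)  # stand-in weight tensor

def q4_0_roundtrip(x, block=32):
    """Roughly Q4_0: per-block absmax scale, round to 16 uniform int levels."""
    xb = x.reshape(-1, block)
    scale = np.abs(xb).max(axis=1, keepdims=True) / 7.0
    q = np.clip(np.round(xb / scale), -8, 7)  # 4-bit signed range
    return (q * scale).reshape(-1)

rmse = float(np.sqrt(np.mean((w - q4_0_roundtrip(w)) ** 2)))
print(f"4-bit roundtrip RMSE on N(0,1) weights: {rmse:.4f}")
```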
>>
Newb here. Are Loras not meant to be used with inpainting? I downloaded several crying Loras to add some tears to a photo, but it doesn't work no matter what I do.
>>
>>102741914
no it doesnt, i tried them myself and often preferred nf4 outputs to q4, granted im using v1 and not v2, which i think is worse
that miku comparison meme image is really not as insightful as you think it is
>>
>>102741968
>that miku comparison meme image is really not as insightful as you think it is
it's evidence of which quant is better; where's your evidence, anon? Y'know what, I was about to run the quants and put them on huggingface but I won't now, you sound like a serious retard. You want the quants? Do them yourself.
>>
>>102741920
>crying Loras
?? what checkpoint are you using that can't do tears?
>>
>>102742039
I've tried EpicRealismPureEvolutionInpainting versions 1 and 5, and RealisticVisionInpainting version 6.
>>
>>102741968
>really not as insightful
How would you make it more insightful, anon?
>>
>>102742016
>evidence
>one image
i don't have anything to prove, i know i like nf4 more
also yeah im trying out that comfyui gguf tool the other anon posted
>>102742080
more images at different seeds, settings and steps
obviously that would be too complicated to put in one image, so that's why miku skateboard prevails
>>
File: file.png (656 KB, 640x480)
>>102742109
>more images at different seeds, settings and steps
>>
to be honest i dont even get how people shit on nf4 based on that image
is it just because she has a slightly different pose than the others? yeah it gets the skin wrong but so does q4, and aesthetically the nf4 background looks more distinct and comic book style
>>
File: 0.jpg (190 KB, 832x1216)
>>
>>102742204
You completely missed the point. The goal of a quant is to be as close as possible to fp16, nothing more, nothing less; if it's too far off, it's a shit quant, period.
>>
https://xcancel.com/mapchannels/status/1843665565166252448#m
MiniMax can do image2video now, it's ova
>>
>>102742226
in my book the goal of a quant is to be smaller and look good
i dont see a problem if it works and outputs something coherent that follows the prompt as well as its similar size peers
>>
>>102742258
using an autistic man for narration is far more egregious than whatever state txt2voice is in now
>>
>>102742263
>in my book the goal of a quant is to be smaller and look good
no one cares about your subjective feelings; the only goal of a quant is to give you an experience as close as possible to the real deal (fp16)

>i dont see a problem if it works and outputs something coherent
but it doesn't, it's different because it's making more mistakes. for this miku picture it's not that bad, but the more numerical errors a quant makes, the more likely you'll get shit pictures, and you'll go "Flux isn't that good after all" when in reality you're using a broken Flux model
>>
>>102742283
it's not broken if it works
from my experience using them q4 makes as many shit pictures as nf4 does, maybe even more
>>
>>102742300
>it's not broken if it works
I never said it never works; my point is that the shittier a quant is, the more likely you'll get broken pictures

>from my experience using them q4 makes as many shit pictures as nf4 does, maybe even more
that's fair, you can believe that. so far we only have 1 picture to get a clue which quant is better, but I feel like you're the only guy here who uses nf4; everyone else prefers the Q quants. and as a /lmg/ fag I've already seen this episode: nf4 was also a thing for LLMs until the Q quants appeared and everyone noticed how bad nf4 was compared to Q4_0
>>
File: 1726710608.png (1.18 MB, 1024x1024)
>>
File: ComfyUI_temp_ivcpc_00023_.png (3.77 MB, 1072x1880)
>>
File: 0.jpg (366 KB, 832x1216)
>>
File: ComfyUI_temp_ivcpc_00027_.png (3.53 MB, 1072x1880)
>>
>>
>>102742327
if one of these formats is better for LLMs, that doesn't necessarily mean it will be better for diffusion models too, but i get what you mean
i just think it's odd to fixate on the pose having to be exactly the same
if the quanted weights introduce some slightly different bias into the interpretation of the prompt at a given seed, but consistency and coherence are just as high, then the model isn't suffering from more mistakes, because a mistake would be fucking up a body part or misinterpreting part of the prompt
we can theoretically explain everything as well as we want, but in the end all we have is our eyes to judge
since you're also familiar with LLMs, i'm sure you're aware of how unreliable benchmarks are, for example, and that the only way to verify a model is actually good is to use it yourself
>>
>>102742540
>if the quanted weights introduce some slightly different bias into the interpretation of the prompt at a given seed, but consistency and coherence are just as high, then the model isn't suffering from more mistakes
it's just one example, but the fact of the matter is that it ends up too far from fp16 purely through numerical error, so it's fair to assume this model is more likely to fuck things up because it makes more errors than the rest

>since you're also familiar with LLMs, i'm sure you're aware of how unreliable benchmarks are and that the only way to verify a model is actually good is to use it yourself
That's why we use perplexity to quantify the quality of a quant. now that you mention it, that gives me an idea: how about doing the same and measuring the FID between those pictures? it's the image equivalent of perplexity. I'm going to check that out
>>
>>102740070
reminds me of an aphex twin video
>>
>>102740070
what did you use to get those videos?
>>
>>
File: file.png (43 KB, 925x462)
>>102742594
>>102742540
I went for SSIM (Structural Similarity Index) in the end, that's really interesting. I put the python script below for those interested. Pic related is the SSIM for the Miku pictures compared to fp16
https://files.catbox.moe/p4wjjn.py
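the gist of it, for anyone who doesn't want to grab the catbox file: SSIM compares the means, variances and covariance of the two images. a single-window (global) version fits in a few lines; note the real metric slides an 11x11 gaussian window over the image, so this is the idea, not a faithful reimplementation:

```python
import numpy as np

def ssim_global(x, y, data_range=1.0):
    """Single-window SSIM between two same-shape grayscale float images."""
    c1 = (0.01 * data_range) ** 2  # stabilizers from the original SSIM paper
    c2 = (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx**2 + my**2 + c1) * (vx + vy + c2)
    )
```

identical images score 1.0; unrelated noise scores near 0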
>>
>>102742687
that's a pretty cool way to compare them actually, i wonder how the K quants would fare in this
the result for nf4 here is basically what i expected, since it is a structurally different image
>>
File: file.png (1.17 MB, 2035x927)
>>102742687
>>102742848
https://files.catbox.moe/wfhykf.py
I went for a more sophisticated SSIM (this one includes the RGB channels instead of converting to grayscale), here are the results
>>
oh yay my quant just finished
>>
oh no... im.... im quanting
>>
File: ComfyUI_temp_ivcpc_00049_.png (3.31 MB, 1072x1880)
>>
oh crap my quant doesn't work
>>
>>102742848
>>102742863
>i wonder how the k quants would fare in this
the difference between Q5 (0.49) and fp8 (0.51) is much smaller than between fp8 (0.51) and Q8 (0.66), that's crazy... I'm sure Q6_K is better than fucking fp8 at this point
>>
>>102730319
Because I'm the 1/100.
>>
>>102742870
>>102742873
>>102742948
what repo did you use to get the quantization?
>>
>>102743006
i think we're both using city96/ComfyUI-GGUF
>>102742873
please tell me yours works
>>
File: file.png (27 KB, 806x237)
>>102742863
kek, I'll always be impressed by jpg compression, like you get a file five times smaller and the quality barely decreases
>>
>>102742863
Am I stupid or is this saying that quanting it from fp16 AT ALL turns it to absolute dogshit? I'm used to LLMs, and if a score dropped that much when quantized it would be unusable.
>>
>>102743062
yeah I'm also surprised that Q8 "only" gives a 66% similarity, but I don't think it can be compared to LLMs. with LLMs you usually have logits that absolutely should not be touched: if I say "The previous president of the United States was Donald...", and the logits don't go for "Trump" at ~100%, then it's wrong. here we're talking about pixels, art... that can be subjective. if at the end it doesn't make giant mistakes like her having a third arm, people won't mind if one green pixel changed into a yellow pixel I guess
>>
File: file.png (1 MB, 2016x989)
>>102743062
>Am I stupid or is this saying that quanting it from fp16 AT ALL turns it to absolute dogshit?
that's just one picture. I think the best way of handling this is to average over several pictures. here's another example: here Q8_0 has a 0.78 similarity, which is good
>>
File: ComfyUI_temp_ivcpc_00063_.png (1.97 MB, 1072x1880)
>>102728561
i have trouble making it not look like digital slop
>>
File: IMG_0534.png (1.17 MB, 1024x1024)
>>102743107
>>102743231
Oh okay. Imagegen is odd.
>>
File: 1728447006.png (1.2 MB, 1024x1024)
>>
>>
>>102743273
that left hand
>>
File: file.png (1.23 MB, 2057x1470)
>>102743231
I also went for LPIPS which is something more accurate to measure similarity
https://richzhang.github.io/PerceptualSimilarity/
https://files.catbox.moe/y37uhy.py
>>
is autismmix lightning as good as it gets for a pony lightning model? anyone have recommendations?
trying to do a particular super simple style, i really want to try and get it down to as few samples as possible.
>>
File: file.jpg (93 KB, 832x1152)
is there a way to get dynamic step counts? like you start with a 50-step schedule, and after 20 steps, when the image isn't moving much anymore, you drop to 5 more steps to finish the job?
>>
>>102743636
Try double sampling.
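or roll your own stopping rule: measure how much the latent changes each step and bail once it stalls. toy sketch of just the rule (hooking it into a real sampler callback is left as an exercise, and whether your sampler exposes one is an assumption on my part):

```python
import numpy as np

def run_with_early_stop(step_fn, x0, max_steps=50, min_steps=20, tol=1e-3):
    """Run up to max_steps, stopping once the per-step change drops below tol."""
    x = x0
    for i in range(1, max_steps + 1):
        x_next = step_fn(x, i)
        delta = float(np.abs(x_next - x).mean())
        x = x_next
        if i >= min_steps and delta < tol:
            return x, i  # converged early
    return x, max_steps

# toy "sampler": each step halves the distance to a target latent
target = np.ones((4, 4))
x, used = run_with_early_stop(lambda x, i: x + 0.5 * (target - x), np.zeros((4, 4)))
print(used)  # → 20, well before the 50-step budget
```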
>>
>>102742873
status update?
>>
>>102743647
can you give me the workflow? that looks interesting
>>
File: IMG_0536.png (1.16 MB, 1024x1024)
>>102743463
Sorry, schnell does its best but the hands…
>>
>>102743661

https://files.catbox.moe/t9ztp7.png
>>
>>102743739
it's working, thanks anon!
>>
>reduce the vram and make it lame and gay
>>
File: file.png (149 KB, 308x238)
>>102743966
>>
File: ComfyUI_temp_ivcpc_00092_.png (2.24 MB, 1072x1880)
>>
File deleted.
>>102741792
I normally don't either, but some positions cause feet to be in frame.

>>102741799
I guess I could take out the bottom 1/4 of the image before the process. Feet will be there most of the time.

>>102742061
that kind of lora should be fine. You need to floor the weight though.
>>
>>102744096
>floor the weight

Sorry, does that mean to lower it or raise it?
>>
>>102744121
lower it significantly. You have to experiment, but I would start at 65% of recommended value for your lora.
>>
>>102744129
Thanks, I'll try that.
>>
>>
>You have disabled the safety checker for <class 'diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline'> by passing `safety_checker=None`. Ensure that you abide to the conditions of the Stable Diffusion license and do not expose unfiltered results in services or applications open to the public. Both the diffusers team and Hugging Face strongly recommend to keep the safety filter enabled in all public facing circumstances, disabling it only for use-cases that involve analyzing network behavior or auditing its results. For more information, please have a look at https://github.com/huggingface/diffusers/pull/254
>>
>>102744297
tf is this?
>>
>>
>>102744316
dumbass diffusers shit
>>
Fresh

>>102744592
>>102744592
>>102744592


