/g/ - Technology

File: tmp.jpg (967 KB, 3264x3264)
Discussion of free and open source text-to-image models

Previous /ldg/ bread : >>102405949

>Beginner UI
EasyDiffusion: https://easydiffusion.github.io
Fooocus: https://github.com/lllyasviel/fooocus
Metastable: https://metastable.studio

>Advanced UI
Automatic1111: https://github.com/automatic1111/stable-diffusion-webui
ComfyUI: https://github.com/comfyanonymous/ComfyUI
Forge: https://github.com/lllyasviel/stable-diffusion-webui-forge
InvokeAI: https://github.com/invoke-ai/InvokeAI
SD.Next: https://github.com/vladmandic/automatic
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI

>Use a VAE if your images look washed out
https://rentry.org/sdvae

>Model Ranking
https://imgsys.org/rankings

>Models, LoRAs & training
https://civitai.com
https://huggingface.co
https://aitracker.art
https://github.com/Nerogar/OneTrainer
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts

>Flux
https://huggingface.co/spaces/black-forest-labs/FLUX.1-schnell
https://comfyanonymous.github.io/ComfyUI_examples/flux

>Pixart Sigma & Hunyuan DIT
https://huggingface.co/spaces/PixArt-alpha/PixArt-Sigma
https://huggingface.co/spaces/Tencent-Hunyuan/HunyuanDiT
https://huggingface.co/comfyanonymous/hunyuan_dit_comfyui
Nodes: https://github.com/city96/ComfyUI_ExtraModels

>Index of guides and other tools
https://rentry.org/sdg-link
https://rentry.org/rentrysd

>Try online without registration
txt2img: https://www.mage.space
img2img: https://huggingface.co/spaces/huggingface/diffuse-the-rest
sd3: https://huggingface.co/spaces/stabilityai/stable-diffusion-3-medium

>Maintain thread quality
https://rentry.org/debo

>Related boards
>>>/h/hdg
>>>/e/edg
>>>/c/kdg
>>>/d/ddg
>>>/b/degen
>>>/vt/vtai
>>>/aco/sdg
>>>/u/udg
>>>/tg/slop
>>>/trash/sdg
>>>/aco/aivg
>>
Blessed thread of frenship
>>
File: Sigma_14651_.png (2.25 MB, 1024x1024)
hodor
>>
File: Sigma_14660_.png (1.61 MB, 1024x1024)
>>
File: Sigma_14670_.png (954 KB, 1024x1024)
>>
File: Sigma_14629_.png (764 KB, 1024x1024)
>>
how come percentile_clipping and block_wise won't work with regular Lion? Is PagedLion8bit so much different?
>>
>>
>>102418422
Most likely open source tools plus custom LoRAs.
>>
File: 00060-2457933165.png (2.05 MB, 1152x1536)
>>
File: 00003-48613435.jpg (173 KB, 1080x1280)
>>
When is this shit going to be good enough to generate doujins for works that somehow don’t have any despite practically crying out for them?
>>
File: 00005-4205403894.jpg (140 KB, 1080x1280)
>>
File: 00006-2321852529.jpg (138 KB, 1080x1280)
>>
my slowpoke self only just got the time to look up the full spec sheet of the 4070 compared to the 3090
how the fuck does it have half the CUDA cores of the 3090? the ((marketing)) claim is that it's at least faster due to the newer architecture among other things, but is that actually true in practice? i can't find any direct comparisons, and the data is all over the place with people's wildly different setups. the general consensus puts the 3090 as the better value for the vram alone, but how does that stack up against a card that might be faster despite having fewer cuda cores and less vram?

>people still using sd 1.5 in 2024 at 512x as if that helps get an idea of the performance differences

>vergil is MFW this entire rabbithole
>>
File: 00081-2457933167.png (2.54 MB, 1152x1536)
>>102418997
>>102418978
uncanny stuff
>>
File: 00011-2545065371.jpg (105 KB, 1080x1280)
>>
>>102419032
I bet the inference speed would be pretty close between a 4070 Super and a 3090. The 4000 series clocks are a lot higher than the 3000 cards'. 12GB of VRAM is tough, though, if you're actually going to gen a lot.
>>
I've noticed something interesting with flux for anime images. The resolution you generate at influences the style. Higher resolutions produce a more visually appealing, higher-quality style overall. And I don't just mean that higher res captures more fine detail, obviously that's true. I mean the actual style the image is drawn in changes. A lower-res gen is not the same as a scaled-down version of a higher-res gen. For photos, though, I don't notice any difference.

I think this is because flux was trained at whatever resolution each image was natively in. A 512-res anime image on the internet is likely to be kinda shitty and low quality, but a 1024 image, on average, had more effort put into it. For photos this isn't true; a lower-res photo might have just been scaled down to fit on a webpage.

It's kind of annoying. If you're using lower res to quickly iterate on prompts, you're not really seeing what flux is capable of with anime.
>>
File: 0.jpg (274 KB, 1016x1024)
>>
File: ComfyUI_01586_.png (1.49 MB, 1024x1024)
>>
File: ComfyUI_01472_.png (1.98 MB, 1024x1024)
>>102419196
so whats a good resolution for anime images?
>>
>>102419601
896x1152 seems good for portrait aspect ratio. I've barely tried higher than that, mainly because it's so fucking slow.
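Resolutions like 896x1152 come from the common practice of keeping the pixel count near 1024x1024 with both sides divisible by 64. A minimal sketch of picking one for a given aspect ratio (the `res_for_aspect` helper and its scoring are made up for illustration, not from any UI):

```python
def score(w: int, h: int, aspect: float, target: int) -> float:
    # Penalize deviation from the wanted aspect ratio and from the target pixel count.
    return abs(w / h - aspect) + abs(w * h - target) / target

def res_for_aspect(aspect: float, target: int = 1024 * 1024,
                   step: int = 64, max_side: int = 2048) -> tuple:
    """Pick the width/height pair with sides divisible by `step` that best
    matches the aspect ratio while staying near ~1 megapixel."""
    sides = range(step, max_side + 1, step)
    return min(((w, h) for w in sides for h in sides),
               key=lambda wh: score(*wh, aspect, target))

print(res_for_aspect(896 / 1152))  # → (896, 1152)
print(res_for_aspect(1.0))         # → (1024, 1024)
```

Since higher res is slow, this is mostly useful for finding the smallest "native-looking" bucket for a given ratio rather than genning bigger.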
>>
File: ComfyUI_01338_.png (1.42 MB, 1024x1024)
>>
File: 00056-377886493.jpg (186 KB, 1080x1280)
>>
File: 00059-4231916271.jpg (153 KB, 1080x1280)
>>
On A1111 is there some way to send a pic to inpaint sketch? It doesn't seem to be a default option and I can't find an extension for it so I feel like I'm missing something obvious. I don't want to always save and open a pic to send it there, I'd like to do it directly like with regular inpainting
>>
File: 00065-4095912729.jpg (189 KB, 1080x1280)
>>
File: 0.jpg (251 KB, 1024x1024)
>>
>>
>>102419995
From txt2img there is a button; you need to hover over it.

In img2img the buttons are around the input image.
>>
https://civitai.com/models/755161/wizards-deck-of-affirmation-cards
>>
File: 1726543760.png (1.28 MB, 1024x1024)
>>
>>102418659
Awesome
>>
File: ComfyUI_33677_.png (1.29 MB, 768x1024)
>>
File: 1726543790.png (1 MB, 1024x1024)
>>
>>102419032
Isn't it supposed to be ~one generation apart = one tier difference, two generations = two tiers? So 3090 ≈ 4080 ≈ 5070
>>
>>
File: file.png (17 KB, 1968x43)
4090 noob here, is this the model I need for Flux LoRA training?
https://huggingface.co/black-forest-labs/FLUX.1-dev/tree/main
>>
Epoch [1/100], Step [10], Loss: 2770.3979, Importance Loss: 150.7213, Perceptual Loss: 1.0613, Total Loss: 2785.4805, EMA Loss: 3282.3598, LR: 0.00000039, Grad Norm: nan, Max Grad: inf, Param Norm: 982.0561, Time/Step: 8.18s

Patches with importance and perceptual loss bb.
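The `Grad Norm: nan, Max Grad: inf` in that log can at least be caught before the optimizer step. A minimal stdlib sketch of such a guard (the `check_finite` name and the metric keys are invented for illustration; a real loop would skip the step or clip grads when this fires):

```python
import math

def check_finite(metrics: dict) -> list:
    """Return the names of any float metrics that are NaN or infinite."""
    bad = []
    for name, value in metrics.items():
        if isinstance(value, float) and (math.isnan(value) or math.isinf(value)):
            bad.append(name)
    return bad

# Mirrors the kind of values printed in the log above.
step = {"loss": 2770.3979, "grad_norm": float("nan"), "max_grad": float("inf")}
print(check_finite(step))  # → ['grad_norm', 'max_grad']
```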
>>
>>102420988
You need the fp8 version or else the training is going to be a lot slower.
>>
File: dino_00061_.png (977 KB, 1024x1024)
>>102421074
OK, thank you
>>
>>102419921
>>102420001
>>102419944
My queen
>>
someone please make a krill myself meme with Flux where the krill breaks his handcuff chains
>>
hibernation mode
>>
>>102415140
Why would that matter?
>>
>>102421677
because just one image isn't enough to draw a conclusion; to be sure something is worse than the other, it has to be consistently worse, not just once I guess. nta btw
>>
https://github.com/lschaupp/PuLID_ComfyUI_FLUX
It's cool that we're technically able to run PuLID on ComfyUI, but I'm seeing a requirements.txt + requirements_fp8.txt; I guess it's not working on GGUF quants, right?
>>
>>102421809
looks like anon did multiple tests https://desuarchive.org/g/thread/102253191/#q102257955
still trying to figure out which xformers works with torch 2.3.1 tho
>>
>>102421881
>still trying to figure out which xformers works with torch 2.3.1 tho
If I remember right, the newest versions of torch and CUDA, especially CUDA 12.1, aren't compatible with xformers anymore
>>
File: file.png (1.22 MB, 3826x1782)
>>102421842
doesn't seem to be working for me
>>
>>102420576
Odd in a good way
>>
>>102421891
nuking the venv and manually installing deps seems to have done it
hopefully
>>
https://github.com/comfyanonymous/ComfyUI/commit/0849c80e2ab50b2821856c3de1ae6ebc1f879260
I highly suggest updating ComfyUI now; this new commit allows changing a LoRA or a LoRA strength without any unload/reload
>>
my images aren't the same as before, but also different from when I was using the newest torch
the only difference should be the missing xformers, but that shouldn't change the output
>>
File: file.png (3.45 MB, 3185x1612)
>>102422219
>my images aren't the same as before but also different than when using the newest torch
that's normal, every new version of torch means different outputs
>>
File: file.png (90 KB, 1404x438)
>>102422081
if you use the GGUF node it keeps unloading though, to prevent that you can use the Force device
https://reddit.com/r/StableDiffusion/comments/1el79h3/flux_can_be_run_on_a_multigpu_configuration/
>>
>>102422249
i mean my NEW outputs using 2.3.1 don't match my OLD outputs that also used 2.3.1. perhaps there's something i'm missing, but i'm done fucking around with my venv for today
>>
>>102422290
oh, well if you're using xformers, this shit isn't deterministic so you get different pictures each time even on the same seed
>>
>>102422300
>this shit isn't deterministic
https://desuarchive.org/g/thread/102195069/#q102197163
>xformers has been deterministic for quite a while now, they fixed the thing from the old 2022 versions where the image changed on every gen

anyway, output is the same with or without xformers but still not the same as my previous install with 2.3.1
>>
>>102422319
well, even if it's deterministic now, it doesn't mean that output + xformers = output, maybe xformers changed the picture anyway
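The determinism argument can be settled empirically: run the exact same workflow twice and compare the raw output bytes. A minimal stdlib sketch of that check (`fake_gen` is a stand-in; a real test would hash the decoded latents or the saved PNG bytes from two identical seeded runs):

```python
import hashlib
import random

def output_fingerprint(gen_fn, *args) -> str:
    """Hash a generation's raw bytes so two runs can be compared exactly."""
    return hashlib.sha256(bytes(gen_fn(*args))).hexdigest()

def is_deterministic(gen_fn, *args, runs: int = 2) -> bool:
    """True if every run produces byte-identical output for the same inputs."""
    fingerprints = {output_fingerprint(gen_fn, *args) for _ in range(runs)}
    return len(fingerprints) == 1

# Stand-in for a sampler: a fixed seed makes it reproducible.
def fake_gen(seed: int) -> bytes:
    rng = random.Random(seed)
    return bytes(rng.randrange(256) for _ in range(64))

print(is_deterministic(fake_gen, 1234))  # → True
```

Note this only answers "same environment, same output"; it says nothing about outputs drifting across torch or xformers versions, which is a separate (and expected) effect.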
>>
File: file.png (1.87 MB, 1024x1024)
What if George Costanza and Tidus have a son?
>>
>>102422349
Tidus laugh scene or we're not BACK baby
>>
I generate 100% women.
Can't I just train a base model on 20TB of women images?
Do I need pictures of Hubble space, Swiss cheese, caves in Thailand, the Great Wall of China etc. in my dataset?
Does my model need this knowledge to generate boobs? Pls smart anon, answer me.
Asking for a friend
>>
>>102422383
Only if you want such things in the background of your tits model. The cost of training a full model isn't really worth it compared to just making some LoRAs, though
>>
>>102422329
>be me
>july
>presumably using torch 2.3.1
>gen image, call it output A
>wow this looks really good
>ff to august
>update
>gen image with output A's workflow
>receive entirely different image, call it output B
>oh it looks like the problem is a torch update
>ff to today
>finally have the urge to downgrade torch to 2.3.1
>gen image with output A's workflow, receive another completely different image to output A or B, call it output C
>maybe xformers has something to do with it
>install xformers
>receive same image as output C
this idea is likely retarded as it only treats the symptom and not the root cause, but i wish metadata included the version numbers of the pip packages used, so i could click a button that goes "one of the updates messed with the output, please revert everything to what it was when i generated this image"
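The snapshot half of that wish is doable with nothing but the stdlib (the `snapshot_env`/`env_diff` names are hypothetical, and actually embedding the dict in the PNG's text chunks is left out):

```python
import json
from importlib.metadata import distributions

def snapshot_env() -> dict:
    """Map every installed package to its version, e.g. {'torch': '2.3.1', ...}."""
    return {d.metadata["Name"]: d.version for d in distributions()}

def env_diff(old: dict, new: dict) -> dict:
    """Which packages changed between two snapshots, as name -> (old, new)."""
    return {
        name: (old.get(name), new.get(name))
        for name in set(old) | set(new)
        if old.get(name) != new.get(name)
    }

old = {"torch": "2.3.1", "xformers": "0.0.27"}
new = {"torch": "2.4.1", "xformers": "0.0.27"}
print(json.dumps(env_diff(old, new)))  # → {"torch": ["2.3.1", "2.4.1"]}
```

Stashing `snapshot_env()` alongside the workflow metadata wouldn't auto-revert anything, but diffing it against the current environment would at least show exactly which packages drifted since the image was genned.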
>>
>>102422406
>i wish metadata included version numbers of pip packages used so i can click a button that goes "one of the updates messed with the output, please revert everything to what it was when i generated this image"
that would be a horrible idea, not everyone has the same version of each package
>>
>>102422391
What % of flux training is data that I don't need? 99%? So if I only need 1% of the data, wouldn't the model be much smaller and therefore much cheaper?
Would still be very expensive, but not millions?
>>
>>102422421
maybe prompt the user if they'd like to match their venv/install with the one used for the loaded output/workflow if its not the same? i did say it was a retarded idea... regardless that wont solve my current problem :( i just want to be able to generate the same images that i was back in july :(
>>
Wake up, find out CogX img2vid is pretty much released (there's a .pth file that needs to be converted to safetensors). Not a word in the thread.
*shocked pikachu face*
https://old.reddit.com/r/StableDiffusion/comments/1fibroc/cogvideo_5b_image2video_model_has_been_released/
>>
>>102422479
Wait until you find out what lousy quality the thing generates. Then you'll never talk about it again. That's probably why you don't see anything about it here.
>>
>>102422474
>i just want to be able to generate the same images that i was back in july :(
I suggest reinstalling ComfyUI and then going back to 2.3.1 (the new default is 2.4.1); maybe it'll work
>>
>>102422479
wake me up when there's a decent video inpainting workflow
>>
>>102422479
>Not a word in the thread.
the last thread talked about it anon >>102415243
>>
File: 1726474928926460.webm (754 KB, 1280x720)
>>102422479
I mean, now that we've tasted MiniMax, everything else looks like a fucking toy, not gonna lie
>>
>>102422501
i changed requirements.txt to use 2.3.1 (and that's what pip list shows as well), so that leads me to believe it's something other than torch or xformers, as in theory everything is the same as it was in july. but there's really no way to check it's for sure the same, is there? did something in comfy itself change the outputs?
>>
>>102422474
Pytorch updated?
The new versions generate significantly poorer quality. There was a thread with comparative images in /r/stablediffusion.
The difference is huge, no idea what they fucked up.
>>
>>102422514
One day you will be able to easily impregnate anyone you wish
>>
>>102422517
Well, that'll teach me for not reading old threads while i'm drinking my morning brew, ty.
>>102422537
Fair comment, going to give it a whirl anyway just so i can get the full depressive experience.
>>
File: bolter_.png (3.25 MB, 1696x1696)
>>
File: Flux.1_00010.png (1.49 MB, 896x1152)
>>
>>102422548
follow the reply chain, everything should be as it was before >>102422406
>>
>>102422548
>The difference is huge, no idea what they fucked up.
I've seen the same complaints a year ago with even older versions of PyTorch; feels like the more time passes, the shittier our images get, that's fucking grim
>>
File: 1720697341987448.png (491 KB, 500x500)
>>102422550
if someone made a comfyui to aftereffects bridge that day could be TODAY
>>
it's my fault for not backing up before updating desu
>>
>>102422563
Uh I was writing you bastard.
Karma that fucks up your fucking comfy, hope it stays that way fucker. ;)
>>
>>102422579
go eat brekky, anon
>>
/ldg/.. my sweet home..
>>
>>102422577
At least say which UI goofed up again
>>
File: Flux.1_00012.png (1.93 MB, 896x1152)
>>102422562
>>
>>102422562
>>102422618
Anon says flux can't into dynamic poses but these are pretty decent
>>
>>102422622
Anon says a lot of things
>>
>>102422622
I cant seem to get the lettering tho
>>
anon says the darndest things
>>
File: ComfyUI_00336_.png (1.53 MB, 1024x1024)
>>102422622
>Anon says flux can't into dynamic poses
dunno why they say that, flux is way better than anything we got so far at poses
>>
>>102421966
I hope you figure it out anon this thread is severely lacking jlaw imgs.
>>
>>102422641
>flux is way better than anything we got so far at poses
arguable, flux isn't great for its aesthetics but for its more complex prompt adherence
>>
File: 1726510282143.png (2.51 MB, 1024x1024)
>>102422667
just set guidance to 1 and cfg to 5
>>102414677
>>
>>102422629
Using a lora?
>>
>>102422682
>just set guidance to 1 and cfg to 5
it was guidance 1 + cfg 1 for the ultimate kino kek
>>
>>102422698
yeah, is it coz of that?
>>
>>102422682
>implying I can run flux
>>
>>102422712
how much vram do you have anon? there are GGUF quants now; you can run this shit on 8GB of VRAM without much quality loss
>>
File: ThatFeeling.jpg (1.41 MB, 2330x1472)
>>
>>102422711
Could be ever so overtrained. We only just recently figured out that targeting specific blocks doesn't fuck up the hands. I dunno.
>>
>>102422716
I have exactly 8GB of VRAM but I've had no luck running quants. They run on par with, or slower than (due to loading), nf4 in my case.
>>
File: Flux.1_00017_.png (1.61 MB, 896x1152)
>>102422727
I guess, picrel is a cherry picked example with loras
>>
>>102422711
which one?
>>
>>102422752
Its vintage abstract film
>>
>>102418659
cute!
>>
File: ComfyUI_temp_jaoco_00004_.png (2.8 MB, 1520x1520)
>>
File: 20240917_104611.webm (119 KB, 720x480)
CogX
Started off ok, then got chinkyfied, bloody Chang and his "locally sourced" datasets eh.
Kind of the theme of "Uglies", but in reverse.
>>
>>102422812
I'd say not too shabby either way
>>
>>102422812
not too bad.. the examples on leddit are shit tho
>>
AI is only good when you exploit what's unique to the medium.
>>
>>102422824
Well said. It's funny how, instead of using it to create new things they wouldn't otherwise be able to do traditionally, people are more keen on imitating what we're already familiar with.
>>
>>102422831
I much prefer remixing what exists in a funny way to trying to make something completely new. The Will Smith spaghetti meme got so popular for a reason: it's just fun to see him struggle with something as basic as eating spaghetti
>>
File: file.png (1.74 MB, 1024x1024)
>>102422645
nah I can't make it work, guess that the guy hasn't finished his repo yet
>>
>>102422847
>it's just fun to see him struggle with something as basic as eating spaghetti
...which is unique to the medium
>>
>>102422831
>people are more keen on imitating what we're already familiar with.
>>102422847
>Will Smith eating spaghetti
>>102422917
>which is unique to the medium
How is having Will Smith eat spaghetti completely unique? It's a completely mundane scene


