/g/ - Technology


Thread archived.
You cannot reply anymore.




File: tmp.jpg (904 KB, 3264x3264)
904 KB
904 KB JPG
Discussion of free and open source text-to-image models

Previous /ldg/ bread : >>101954365

>Beginner UI
EasyDiffusion: https://easydiffusion.github.io
Fooocus: https://github.com/lllyasviel/fooocus
Metastable: https://metastable.studio

>Advanced UI
Automatic1111: https://github.com/automatic1111/stable-diffusion-webui
ComfyUI: https://github.com/comfyanonymous/ComfyUI
Forge: https://github.com/lllyasviel/stable-diffusion-webui-forge
InvokeAI: https://github.com/invoke-ai/InvokeAI
SD.Next: https://github.com/vladmandic/automatic
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI

>Use a VAE if your images look washed out
https://rentry.org/sdvae

>Model Ranking
https://imgsys.org/rankings

>Models, LoRAs & training
https://civitai.com
https://huggingface.co
https://aitracker.art
https://github.com/Nerogar/OneTrainer
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts

>Flux
https://huggingface.co/spaces/black-forest-labs/FLUX.1-schnell
https://comfyanonymous.github.io/ComfyUI_examples/flux

>Pixart Sigma & Hunyuan DIT
https://huggingface.co/spaces/PixArt-alpha/PixArt-Sigma
https://huggingface.co/spaces/Tencent-Hunyuan/HunyuanDiT
https://huggingface.co/comfyanonymous/hunyuan_dit_comfyui
Nodes: https://github.com/city96/ComfyUI_ExtraModels

>Index of guides and other tools
https://rentry.org/sdg-link
https://rentry.org/rentrysd

>GPU performance
https://vladmandic.github.io/sd-extension-system-info/pages/benchmark.html
https://docs.getgrist.com/3mjouqRSdkBY/sdperformance

>Try online without registration
txt2img: https://www.mage.space
img2img: https://huggingface.co/spaces/huggingface/diffuse-the-rest
sd3: https://huggingface.co/spaces/stabilityai/stable-diffusion-3-medium

>Maintain thread quality
https://rentry.org/debo

>Related boards
>>>/g/sdg
>>>/h/hdg
>>>/e/edg
>>>/d/ddg
>>>/b/degen
>>>/vt/vtai
>>>/aco/sdg
>>>/trash/sdg
>>
what's the little plastic thing in the middle of american pizza boxes called
>>
>>101957379
fix the hand on this one and it'd be a straight 10/10 gen, that's fucking awesome
>>
>>101957392
Pizza Table
>>
>>101957392
Pizza Tower
>>
File: ComfyUI_04732_.png (1.46 MB, 1024x1024)
1.46 MB
1.46 MB PNG
>>101957403
It was made with cfg = 8, here's the one with cfg = 6, the hands definitely look better
>>
>>101957417
100% that's a way better gen, god damn very nice.
>>
>>101957424
thanks anon
>>
>>101957417
wholesome
>>
File: FD_00029_.png (1.91 MB, 1024x1024)
1.91 MB
1.91 MB PNG
>>
https://civitai.com/models/658922/flux-image-enhancer
>Flux Enhancer
>none of the images look enhanced
>>
>>101957392
the one ring
>>
Prompt challenge:
Try to gen your pet, full boomer description of their features

Compare to real picture of your pet if you're brave
>>
File: 101911600.png (1.22 MB, 896x1152)
1.22 MB
1.22 MB PNG
>>
>>101957461
All the "detailer" loras are trash. If you want more details in your image, inject more latent noise in the middle of the generation.
>>
>>101957461
in fact it looks like it REMOVES detail and kills whatever artistic stylizing flux was going for
why are detailer loras never actual detailers? either they go too far and explode into excess detail, or they flat out remove it...
>>
>>101957461
>AI "Artist"

>never produces anything worthy of being called art
>>
>>101957492
Nowadays, a banana taped to a wall is considered "art" so...
https://paintvine.co.nz/blogs/news/the-banana-on-the-wall?srsltid=AfmBOophoaT0aahy7RaEDemjddce9fXzIFq49FdiH-T9zHHe132E9WSC
>>
File: FLUX_00036_.png (1.02 MB, 896x1152)
1.02 MB
1.02 MB PNG
>>
>>101957509
Now THIS, is ART!
>>
someone gen a rat dancing on a tree stump in the middle of a spooky forest, in flux please.
the rat *can* be wearing accessories like a sombrero or something if you wish.
>>
Blessed thread of frenship
>>
>>101957417
saved
>>
>>101957531
that will be 200 Buzz
>>
File: what.jpg (278 KB, 2000x992)
278 KB
278 KB JPG
What's going on here? I only got 10s once, right after switching the device from cpu to cuda:0; after that it only increases gen time
>>
>Put some of my NSFW 3D work into a LoRA trainer
>Looks exactly like my stuff but better and more expressive.
>It even got the clipping errors down
>>
I wish I farmed more buzz.
>>
>>101957551
Welcome to the future. /3/ is kill.
>>
File: 1692640742893106.png (1.16 MB, 1024x1024)
1.16 MB
1.16 MB PNG
>>101957531
i used your message as the prompt
>>
>>101957551
time to get into really weird and specific shit that current AI can't handle
then repeat that every 9 months
>>
>>101957574
:D
>>
File: 3093197053.png (1.86 MB, 1024x1024)
1.86 MB
1.86 MB PNG
>>101957531
>>
>>101957561
Just spam upload a bunch of gens. You can farm buzz from liberals if your gens have anti-russian sentiment.
>>
>>
>>101957588
:D

>>101957598
I spent the entire month of June, once a day, uploading random failgens with the post name "YWNBAW" and managed to farm a pretty good amount of buzz off of it.
You can upload pretty much anything, I don't think people are paying attention that hard, everyone's just grinding for buzz in the exact same ways.
>>
There is literally no reason for a LoRA of a person to be more than 20mb.
>>
>>101957620
these are facts
it can even know several people in 20MB
a lora for all characters in Breaking Bad should be under 20MB
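quick napkin math backs this up. a rank-r LoRA on a weight W (out x in) only adds two matrices A (r x in) and B (out x r). the dims below are my guess at Flux-ish numbers (57 blocks, 3072-wide projections, fp16), so treat it as a sketch not gospel:

```python
# Back-of-the-envelope LoRA size: each adapted weight adds rank * (d_in + d_out) params.
# layers/d_in/d_out here are assumed Flux-ish values, not measured from a real checkpoint.
def lora_bytes(layers, d_in, d_out, rank, bytes_per_param=2):  # 2 bytes = fp16
    params_per_layer = rank * (d_in + d_out)
    return layers * params_per_layer * bytes_per_param

size = lora_bytes(layers=57, d_in=3072, d_out=3072, rank=16)
print(f"{size / 1024**2:.1f} MiB")  # → 10.7 MiB
```

so even rank 16 across every block is comfortably under 20MB; the gigabyte loras on civit are just trained at absurd ranks.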
>>
>>101957615
>uploading random failgens with the post name "YWNBAW" and managed to farm a pretty good amount of buzz off of it.
dare I say based?
>>
How do you add a LoRA to the gguf node?
>>
File: file.png (1.3 MB, 768x1024)
1.3 MB
1.3 MB PNG
>>101957531
>>
>>101957620
Yes sir I did the needful 20 gigabites on the civit AI. Please redeem buzz
>>
>>101957615
Yeah but you want them specifically to pay you the buzz for your gens, not just like it. Liking it only gets you 2 buzz, if they give it to you then you get shit loads.
I just farmed enough today to gen a Flux Lora, currently training now.
>>
is there a way to refresh a node? I added a new lora and it doesn't show up to the load lora node, it's a pain to have to delete it and redo the connections and shit
>>
>>101957632
You attach a load LoRA node to the load gguf node.
>>
>>101957643
there is a button literally called Refresh, press it
>>
So, how do I get 1024x1024 dev images in under a minute using an 8GB AMD card?
>>
>>101957645
You connect the clip from lora to gguf model node?
>>
>>101957663
Q4_K.gguf
>>
>>101957665
Put the same color node in the same color hole.
>>
>>101957661
Oh yeah you're right, damn I didn't pay attention to it at all, thanks anon
>>
>>101957669
That takes 10 minutes to make a 1024.
BUT I just saw city96 released a Q2 quant at some point. I'm going to test that.
https://huggingface.co/city96/FLUX.1-dev-gguf
>>
File: 1708850800857174.png (2.95 MB, 1536x2048)
2.95 MB
2.95 MB PNG
>>
File: Untitled.jpg (39 KB, 500x334)
39 KB
39 KB JPG
>>101957676
There's no node for it

Prompt outputs failed validation
LoraLoader:
- Required input is missing: clip
>>
>>101957704
have you generated any image at all? then you have a CLIP loader, connect it to that input
>>
>>101957663
Takes me 5 minutes per gen with a 6700 XT (Q4_K)
>>
File: Capture.jpg (18 KB, 561x213)
18 KB
18 KB JPG
>>101957704
>>101957713
why can't he simply use the "LoraLoaderModelOnly" node instead?
>>
>>101957704
Where is your clip loader? Connect clip loader to lora.
>>
>>101957697
I recall an anon testing it on another thread and the outputs were pretty shit. Give it a try.
>>
>>101957724
because he must learn first
>>
File: 1.jpg (289 KB, 1600x960)
289 KB
289 KB JPG
>>
File: FD_00014_.png (1.87 MB, 1024x1024)
1.87 MB
1.87 MB PNG
>>
status of loras on flux? it was pretty trivial to make a semi decent one for SD1.5 when I last tried
>>
>>101957798
>status of loras on flux?
never been this easy to make loras thanks to flux, this shit learns concepts so well
https://civitai.com/models/650330/yorha-2-gou-b-gata-2b-2b-nier-automata
>>
>>101957782
Tell me this is SD or SDXL
>>
>>101957798
It's insanely easy to make it look good, so long as you have the hardware.
If you don't, you can train one on Civit for like 2 bucks.
>>
File: ComfyUI_04737_.png (1.66 MB, 1024x1024)
1.66 MB
1.66 MB PNG
>>
>>101957663
>So, how do I get 1024x1024 dev images in under a minute using an 8GB AMD card?
>under a minute
>AMD
you don't
>>
>>101957812
Man the anime was so shit compared to the game. Even if you don't compare them.
>>101957812
I see people are releasing finetunes of the base model already, and people are asking for NF4 quants instead of GGUF?
>>
>>101957782
what is she reading?
>>
>>101957857
She's not even looking at the book.
>>
>>101957854
>I see people are releasing finetunes of the base model already
I hope they'll convert to gguf's, I'm not quitting my beloved Q8_0
>>
>>101957697
Your iteration speed will be the same regardless of the quant. It just makes model loading time shorter.
>>
File: 1696808681928064.png (36 KB, 449x461)
36 KB
36 KB PNG
Does anyone know if there's a node that lets me read the tags/prompts from the metadata of a generated image? ForgeUI had a tab like that where you would just select an image and it would show all the relevant information contained in the metadata. I know i can load the image and get the workflow with all the prompts but that's very inefficient.
>>
>>101957874
I thought my iteration speed was slower because my VRAM is full and offloading to RAM.
>>
>>101957871
If it ain't quant I don't want
>>
>>101957874
This is not true if you can't fit the whole model in the GPU to begin with.
>>
>>101957877
Just read the exif on the image.
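it's not even exif really, ComfyUI stuffs the graph into PNG text chunks. minimal Pillow sketch if you want it outside comfy (the demo filename and dummy prompt json are made up for illustration):

```python
from PIL import Image
from PIL.PngImagePlugin import PngInfo

def read_gen_metadata(path):
    # ComfyUI stores its graph under "prompt"/"workflow" text chunks;
    # A1111/Forge store a "parameters" chunk. All show up in Image.info.
    return {k: v for k, v in Image.open(path).info.items() if isinstance(v, str)}

# demo: fake a gen by writing a PNG with a prompt chunk, then read it back
info = PngInfo()
info.add_text("prompt", '{"6": {"inputs": {"text": "a rat dancing on a tree stump"}}}')
Image.new("RGB", (8, 8)).save("demo.png", pnginfo=info)
print(read_gen_metadata("demo.png")["prompt"])
```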
>>
>>101957812
>not 3D 2B
worthless
>>
>>101957818
VendoPonyRealistic

>>101957857
Tolkien, probably
>>
>>101957910
you can make her 3d though? you just specify it's a 3d character in the prompt, and it can work >>101956399
>>
>>101957886
It's mostly slow because you can't use flash attention with older AMD cards (and even older nvidia).
>>
>>101957931
it's not the same
>>
File: ComfyUI_04738_.png (851 KB, 1024x1024)
851 KB
851 KB PNG
>>101957501
>>
>>101957909
the exif is pretty much unreadable unless I open it in an external program. I'm trying to do it all in comfyui
>>
>>101957935
Dude you talk too much out of your ass. ROCm has had flash attention support for a while.
>>
File: ComfyUI_20902_.png (1.16 MB, 1280x720)
1.16 MB
1.16 MB PNG
>>
>>101957971
Only for RDNA3+ (and not even fully upstreamed) and CDNA2+
>>
>>101956442
on this do I have to run the model locally via say ooba and joycaption can see it?
If I run say mistral large that will use up all my vram over both GPUs, does joycaption need extra or is it sending requests to the open LLM.
>>
File: greensnakewoman.png (3.19 MB, 1536x1536)
3.19 MB
3.19 MB PNG
censored version
>>
>>101957986
you can put the uncensored version on a catbox anon
>>
File: ComfyUI_04739_.png (1.29 MB, 1024x1024)
1.29 MB
1.29 MB PNG
>>
so with flux, are we currently at the point where you can't do finetunes or LoRAs with 24GB or less of VRAM, and there's no dual GPU stuff available?
But my understanding is KohyaSS will soon^tm support training on less vram?
>>
File: 0.jpg (381 KB, 1024x1408)
381 KB
381 KB JPG
>>
File: ComfyUI_20830_.png (1.04 MB, 1280x720)
1.04 MB
1.04 MB PNG
>>
>>101957724
>>101957713
>>101957746
Got q5 to work but it's trash, and same gen time
>>
File: ComfyUI_20905_.png (2.05 MB, 1920x1080)
2.05 MB
2.05 MB PNG
>>
File: Quants.jpg (225 KB, 2403x1539)
225 KB
225 KB JPG
>>101958066
You can't go higher than Q5_0?
https://huggingface.co/city96/FLUX.1-dev-gguf/tree/main
>>
File: redditbtfo.png (40 KB, 893x345)
40 KB
40 KB PNG
REDDIT BTFO
>>
>>101958097
Holy cope
>>
>>101958089
Q8 is 12gb but why even bother if the gen time is the same to fp8
>>
>>101958097
the fire rises..
>>
>>101958097
>don't trust your eyes, mine is better because...because it just is, OK!?
>>
File: ComfyUI_20900_.png (1.06 MB, 1280x720)
1.06 MB
1.06 MB PNG
>>
>>101958125
because the outputs are closer to FP16
>>
>>101958097
He really went from "nf4 is better than fp8 I swear" to "Q4_1 is more precise than NF4"
AHAHAHAH
>>
>>101958125
Because Q8 is significantly closer to FP16 than FP8 for the same speed.
>>
File: Comparison_all_quants.jpg (3.84 MB, 7961x2897)
3.84 MB
3.84 MB JPG
>>101958125
because it gives a better quality image? it's the closest to fp16
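to see why more bits track fp16 closer, here's a toy round-trip in pure Python. this is naive per-tensor quantization, NOT the actual block-wise GGUF K-quant kernels, but the error scaling is the same idea:

```python
import random

def quantize_roundtrip(xs, bits):
    # naive symmetric quantization: snap each value to a signed int grid and back
    qmax = 2 ** (bits - 1) - 1
    scale = max(abs(x) for x in xs) / qmax
    return [max(-qmax, min(qmax, round(x / scale))) * scale for x in xs]

def mse(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

random.seed(0)
weights = [random.gauss(0.0, 1.0) for _ in range(10_000)]  # stand-in for a weight tensor

err8 = mse(weights, quantize_roundtrip(weights, 8))
err4 = mse(weights, quantize_roundtrip(weights, 4))
print(err8 < err4)  # → True, 8-bit round-trip error is far smaller
```

each bit you drop roughly doubles the grid spacing, so error grows ~4x per bit; that's why Q8 sits basically on top of fp16 while Q2 gets fried.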
>>
File: ComfyUI_04743_.png (783 KB, 1024x1024)
783 KB
783 KB PNG
>>101958097
death to his shit quant
>>
>>101957886
It went from 37s/it to 35s/it
>>
>>101957487
>inject more latent noise
How do I do this? I read about it before and tried doing it with two ksamplers advanced with sdxl but just got garbled mess
>>
I hate that I can't post a gen of a witch who happens to be wearing purple because some annoying faggot got a monopoly on purple witches.
>>
can i run a fp8 on rtx 3060 12gb? i seen reddit posts claiming it runs on it but when I try to run i get nmap error. i guess i have to stick with q6
>>
File: 3146235815.png (975 KB, 1344x768)
975 KB
975 KB PNG
>>
>>101958097
What I read there is that the GGUF implementation we're using for Flux is incomplete (unlike llama.cpp's)? Y'all too eager to "own" and win arguments to even read?
>>
>>101958246
You can, it will overflow to vram and go slow though. Go for Q8, you can fit that into vram.
>>
does flux have strongly recommended dimensions like SD?

ie should I always do 1024x1024, or can it handle 1.5x that easily enough? 2x?
>>
>>101958255
So what you're saying is the gguf versions are going to get a lot better? Nice.
>>
File: ComfyUI_20903_.png (2.13 MB, 1920x1080)
2.13 MB
2.13 MB PNG
>>
is there/does some have a lora guide for flux? i want to see what it does
>>
>>101958276
I don't know shit about this topic, but that screenshot seems to imply there's some stuff in the llama.cpp implementation that would make them better.
>>
>>101958167
>none of those images are 50's comic book style
>no seed/samplers/workflow info is given
>its always one guy spamming the same image comparison

we are at peak snake oil levels rn
>>
>all these flux loras coming out
>they're all shit
>>
File: FD_00079_.png (51 KB, 128x256)
51 KB
51 KB PNG
>>101958270
No. You can do pretty much any multiple of 32 you want.
>>
>>101957821
>train one on Civit for like 2 bucks.
These are actually rip-off prices for the GPU rental required.
>>
>>101958255
pfft of course it is. commits being pushed that break shit every other day, rushed quanting, this scene needs to slow the fuck down a little bit and take it easy. blunders like this would have /lmg/ on fire for days, here its just business as usual.
>>
>>101958302
wasn't it multiples of 64?
>>
>>101958306
they are, however you can get their mtx currency for free and not pay a dime just by uploading your gens.
>>
>>101958280

Cuter and cooler than what Pixar has produced in at least the last 15 years. Anyone wanna guess when Disney pulls the plug on all the artfags? Or at least a good portion of them. Can't be much longer now.
>>
File: FD_00065_.png (34 KB, 128x128)
34 KB
34 KB PNG
>>101958317
Maybe, point is it's not strict about it at all.
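if you want to snap an arbitrary size the arithmetic is trivial; use step=64 if your setup turns out to be stricter:

```python
def snap(x, step=32):
    # round a requested dimension to the nearest allowed multiple, never below one step
    return max(step, round(x / step) * step)

print(snap(1000), snap(1000, step=64))  # → 992 1024
```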
>>
File: file.png (468 KB, 512x512)
468 KB
468 KB PNG
>>101957886
Using 512x512 I go from 12s/it to 9s/it, so for the record, Q2_K definitely generates faster than Q4_0 on my RX 5700 XT.
>>
File: ComfyUI_00263_.png (1.47 MB, 1408x704)
1.47 MB
1.47 MB PNG
is dev worth it? schnell has a 16s gen time while dev has 100s
>>
>>101958297
kek people are riding the popularity wave, sadly there is little to no filter to ai content
>>
>>101958336
Go even smaller, Anon.
>>101958333
>>101958302
>>
File: ComfyUI_20852_.png (1.09 MB, 1280x720)
1.09 MB
1.09 MB PNG
>>
>>101958342
everybody is using dev, no one is making loras or finetuning schnell
>>
File: 2024-08-18_00188_.png (1.46 MB, 1344x768)
1.46 MB
1.46 MB PNG
>https://civitai.com/models/659150/zdzislaw-beksinskie-style-flux-lora?modelVersionId=737548
aah I am saved someone made Beksinski lora for FLUX
>>
>>101958342
quality over quantity
>>
>>101958348
Please don't mock me. This hurts enough as it is.
>>
File: LOL.jpg (23 KB, 1191x212)
23 KB
23 KB JPG
>>101958097
>GGUF is a pure compression tech, means it's smaller but also slower
>NF4's computation is faster
I think lying is pathological with him, right? >>101958167
>>
File: 2024-08-18_00190_.png (1.07 MB, 1344x768)
1.07 MB
1.07 MB PNG
>>
File: FD_00233_.png (25 KB, 96x128)
25 KB
25 KB PNG
>>101958348
>>101958370
>>
File: file.png (562 KB, 512x512)
562 KB
562 KB PNG
>>101958336
OK, I think there's a problem with the Q2_K quant. Pic related is the same prompt and seed using Q4_K. They look like two completely different seeds. I double checked.
>>
>>101958403
well duh, Q2 is basically a lobotomy
>>
>>101958403
*Q4_0
>>
has anyone tried getting their waifu into flux yet with LoRAs (say an anime character the model doesn't know at all), simple enough test.
>>
>>101958421
If by waifu you mean my actual wife then yes and it works ridiculously well.
>>
>>101958421
there are already dozen of waifu anime loras that actually work in civitai for flux
>>
File: 00056-4032470775.png (316 KB, 640x360)
316 KB
316 KB PNG
>>
File: ComfyUI_20708_.png (1.27 MB, 1280x720)
1.27 MB
1.27 MB PNG
>>
File: ComfyUI_20696_.png (1.77 MB, 1080x1920)
1.77 MB
1.77 MB PNG
>>101958439

My neo lora , did mutagrn man, neo, Cain from robocop 2 is next
>>
>>101958380
it's slower for me, goes from 3.8 on NF4 to 5 on GGUF; probably depends on your GPU, I have 12 GB vram
>>
>>101958456
Try doing a whole movie or show lora with all the characters.
>>
can you do controlnets and inpainting on flux yet? Ideally in forge
>>
>>101958458
for me the speed is equivalent, I'm on a 3090
>>
File: file.png (1.26 MB, 800x1169)
1.26 MB
1.26 MB PNG
He finally found them.
>>
>>101958310
>this scene needs to slow the fuck down a little bit and take it easy.
I am compiling pytorch and would like my scene to speed up a bit actually.
>torch.cuda.device_count()
>1
That's goo-
>torch.cuda.current_device()
>[...] IndexError: list index out of range [...]
What the fuck.
>>
>>101958475
Inpainting you can but controlnets, not yet.
>>
File: 2024-08-18_00195_.png (1.45 MB, 1344x768)
1.45 MB
1.45 MB PNG
>>
>>101958380
you seem like a schizo spamming your tests and workflows 24/7, I've even seen you on github FFS
>>
>>101958483
See?
>>
File: FD_00011_.png (1.24 MB, 1024x1024)
1.24 MB
1.24 MB PNG
>>
File: file.png (2.32 MB, 1024x1024)
2.32 MB
2.32 MB PNG
>>101958414
Ah, OK. I thought this was a problem because people were making such a big deal about NF4 producing different images than the other quants.
>>101958458
Is there a way to use NF4 on AMD? I would like to test it, but I hit a wall trying to get bitsandbytes to work.

Also, I have confirmed that the speedup on a smaller quant only happens if everything fits in VRAM. The moment it unloads "for lowram load", the speeds get as slow as they've ever been.
>>
what ur experience on amd
>>
>>101958515
Doesn't it just werk in Forge?
>>
>>101958523
pain
>>
>>101958525
Forge can't do offloading at all for me, it just crashes if it tries to do any offloading.
>>101958523
Pain and cope.
>>
>>101958515
No idea sorry, haven't had an AMD for like 10 years.
>>
File: 2024-08-18_00198_.png (1.65 MB, 1344x768)
1.65 MB
1.65 MB PNG
>>
File: file.png (176 KB, 256x256)
176 KB
176 KB PNG
>>101958348
On Q2_K and schnell? See pic related.
>>101958546
Fuck I'm so tempted to just buy a 3060 even if it's just barely an upgrade to my 5700 in general terms.
>>
>>101958542
RIP. Sorry for AMDs cuckery. They dragged their heels on AI and now they are cunted.
>>
File: 00129-3196969707.png (1.39 MB, 1216x832)
1.39 MB
1.39 MB PNG
>>101958456
>My neo lora , did mutagrn man, neo, Cain from robocop 2 is next

Used this as a prompt
>>
so I have flux working in forge, is there no way to change the guidance in Forge? I saw on comfy very different results from 3.5 vs 2 for example.
Or is this just CFG?
>>
>>101958562
It's a significant upgrade if you plan to do any AI workloads.
>>
>>101958566
What are the two little holes on the head for?
>>
every single flux likeness lora so far affects every face in the image
T5 is smarter, Flux is smarter, loras have to be smarter
>>
File: ComfyUI_20889_.png (1.02 MB, 1280x720)
1.02 MB
1.02 MB PNG
>>101958472

I tried that with SDXL in the beginning but it just mixed styles; inpainting would probably be better, don't know how Flux reacts yet. Just got started transferring my loras and I've got like 10 more to do, like Jessica Rabbit and Battle Angel Alita. Inspector Gadget is next, gonna skip Cain, gonna use actual gens from my SDXL as the dataset to get him exactly how I want him. This is so much fun... not sure how much genning on a 4070 Ti all day would cost, probably cheaper than the A100 I'm renting, this shit is getting expensive. I'm finding Flux does a wayyyy better job at learning features that SDXL did not pick up on no matter how much I tried...
>>
>>101958590
>But not me, someone else will figure it out for me.
>>
>>101958038
we're past that point: lora training can be done on 10GB VRAM, and kohya is in the process of releasing finetuning for even lower than 24GB VRAM
>>
>>101958585
>AI workloads
I find it hard to justify to play with imagegen and fap to textgen once in a blue moon.
>>
File: 2024-08-18_00200_.png (1.62 MB, 1344x768)
1.62 MB
1.62 MB PNG
>>
File: ComfyUI_04746_.png (1.18 MB, 1024x1024)
1.18 MB
1.18 MB PNG
>>
File: 00130-3196969708.png (1.52 MB, 1216x832)
1.52 MB
1.52 MB PNG
>>101958589
I was going to say it's for your dick, but really it's for multiple reasons; sometimes he has antennas of sorts that pop out to help communicate with devices such as satellites etc

Basically it's a multi use hole.
>>
>>101958605
You can hire me to make your data set better.
>>
File: 1723390335453186.jpg (80 KB, 608x839)
80 KB
80 KB JPG
Can A1111 run Flux yet?
>>
>>101958609
is there a special dev version I have to do this on? I have 24gb vram available I just want to get my waifu into flux
>>
File: file.png (622 KB, 1509x570)
622 KB
622 KB PNG
>>101958238
Something like this. High sigmas go into the first sampler, low sigmas go into the second sampler, in my pic 60% of the steps are done in the 1st sampler as set in the SplitSigmas node, then I inject noise and move to the second one. Don't forget to disable noise for the 2nd sampler because you inject it manually.
https://imgsli.com/Mjg4ODEz
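same data flow as a pure-Python schematic, for anyone who can't load the pic. the numbers are made up and toy_denoise just stands in for a real KSampler pass; the point is where the split and the manual noise go:

```python
import random

def toy_denoise(latent, sigmas):
    # stand-in for a sampler pass: each step shrinks the noise level toward the next sigma
    for s_cur, s_next in zip(sigmas, sigmas[1:]):
        latent = [v * (s_next / s_cur) for v in latent]
    return latent

random.seed(0)
sigmas = [14.6, 8.0, 4.0, 2.0, 1.0, 0.3]      # made-up schedule
split = 3                                     # ~60% of steps in the first sampler
high, low = sigmas[:split + 1], sigmas[split:]

latent = [random.gauss(0.0, sigmas[0]) for _ in range(4)]
latent = toy_denoise(latent, high)                     # 1st KSampler (high sigmas)
latent = [v + random.gauss(0.0, 0.5) for v in latent]  # manual noise injection at the split
latent = toy_denoise(latent, low)                      # 2nd KSampler, add_noise disabled
print(latent)
```

the injected noise gives the second sampler fresh high-frequency content to resolve into detail, which is why it works where detailer loras don't.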
>>
>>101958481
this is the beauty of memes, anon, you get it. I remember when there was a potato revival because her mom took her on TV to be pissed about ancient potato memes from back when she was a little girl, and we made the Potato Airlines meme
>>
File: ComfyUI_01864_.png (747 KB, 1024x768)
747 KB
747 KB PNG
>>
>>101958595
yeah makes sense, gonna be amazing when everyone starts finetuning this with their own GPUs
>>
File: FD_00398_.png (1013 KB, 1024x1024)
1013 KB
1013 KB PNG
>>101958590
Not the Kamala one. Although I have only tried it with Kamala and Trump
>>
>>101958481
LA CHANKLA
>>
File: ComfyUI_20706_.png (1.98 MB, 1920x1080)
1.98 MB
1.98 MB PNG
>>
>>101958640
ONE CAKE
I DON'T KNOW WHY
>>
>>101958622
>a multi use hole
I was going to say like your mom's, but I think that's a weird thing to have in a mechanized armor. What if water or blood gets in there? What a pain to clean.
>>
>>101958523
It's ok but slow, make me want to buy a newer card.
>>
>>101958640
>2 girls, 1 cake
oh no...
>>
>>101958648
>Not the Kamala one
which one
>>
File: 1701827901368854.jpg (113 KB, 821x951)
113 KB
113 KB JPG
Are there any animatediff plugins for after effects? I've mastered image morphs and now I want to do video
>>
>>101958653
the hole also releases hot air; it can vent at high pressure, which blasts away any debris, blood or water. Just don't be near it when it happens because the pressure can kill.
>>
>>101958680
hnnnnggh
>>
>>101958668
I don't know there's a few now.
>>
>>101958342
Hair and face look plastic, otherwise pretty good.
>>
>>101958635
Thanks, I'll try it
>>
>>101958680
>"I've mastered image morphs"
>that huge ridge over the belly
>>
>>101958481
Damn, this gives me the feels unironically. Has it already been 3 years? Felt like just yesterday.
>>
>>101958680
Why is this so hot?
>>
File: 2024-08-18_00205_.png (1.68 MB, 1344x768)
1.68 MB
1.68 MB PNG
>>
>>101958296
Well then provide a better comparison.
>>
File: ComfyUI_20911_.png (2.08 MB, 1920x1080)
2.08 MB
2.08 MB PNG
>>
>>101958635
This looks nice. Does it add a lot of overhead?
>>
File: 4147331765.png (1.4 MB, 768x1344)
1.4 MB
1.4 MB PNG
>>
https://civitai.com/models/658111/thefatone
This guy's dumping one kino after another
>>
What does this mean?

ComfyUI-GGUF\dequant.py:8: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
data = torch.tensor(tensor.data)
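as far as I can tell it's harmless: PyTorch just dislikes the copy idiom ComfyUI-GGUF's dequant code uses, your gens are unaffected. what the warning is asking for, in isolation:

```python
import torch

t = torch.ones(3)
a = torch.tensor(t.data)  # what dequant.py does today: copies fine, but triggers the UserWarning
b = t.clone().detach()    # the spelling the warning recommends: same copy, no warning
print(torch.equal(a, b))  # → True
```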
>>
no (You)s blues...
>>
>>101958296
Yes. And the guy spamming that image seems to be very easily triggered.
>>
File: ComfyUI_20909_.png (2.22 MB, 1920x1080)
2.22 MB
2.22 MB PNG
Would have liked him standing but he looks happy the way he sits lol
>>
File: Capture.jpg (20 KB, 616x254)
20 KB
20 KB JPG
>>101958769
>TheFatOne
>>
File: 1713707557328419.jpg (127 KB, 576x1024)
127 KB
127 KB JPG
>>101958710
sorry I mean I've masturbated to image morphs
>>
File: thefatone.png (1.76 MB, 1904x801)
1.76 MB
1.76 MB PNG
>>101958769
I gotta say, this guy's political leanings are a mystery!
Glad he's keeping his politics outta his art!
Good to know i can rely on him for keeping things apolitical!
>>
>>101958763
It doesn't add any additional time at all. Even noise injection took 0 seconds as you can see on the screenshot
>>
>>101958769
Putin lora when? I want some real kino
>>
File: 00134-2476226602.png (1.53 MB, 1216x832)
1.53 MB
1.53 MB PNG
>>101958776
Hey you.
>>
>>101958769
legitimately can't stand how this guy is parading around the safest political opinions you can hold like he's some kind of renegade.
>>
>>101958796
He's in the middle, so basically both sides hate him more than they hate each other. The curse of not giving a shit.
>>
is there any flux dev-schnell-merge gguf yet?
>>
File: file.png (2.31 MB, 1024x1024)
2.31 MB
2.31 MB PNG
Q2_K seems to produce this kind of garbled pseudo-pixelation no matter the size I gen at. I'm going to test Q3 next.
>>
File: 00135-2476226603.png (1.69 MB, 1216x832)
1.69 MB
1.69 MB PNG
>>
File: 1723975840880815.png (59 KB, 1489x399)
59 KB
59 KB PNG
The CLIP from force set clip device goes to prompts or load lora?
>>
>>101958854
think about it
>>
File: ComfyUI_20717_.png (1.16 MB, 1280x720)
1.16 MB
1.16 MB PNG
>>101958776

Well I've been genning happily , guessing that's what others are doing now or hopefully creating their own loras. Skies the limit right now, go make cool stuff and remember to share.
>>
how do I build a local stable diffusion server? /hsg/ is so useless with their old hardware
>>
>>101958796
His whole profile does
>>
>>101958825
>>101958880
I was jk, I'm just having fun too
>>
File: 3556858103.png (1.07 MB, 768x1344)
1.07 MB
1.07 MB PNG
>>
>>101958877
Here it's on prompt but there's no lora
>>
File: file.png (2.25 MB, 1024x1024)
2.25 MB
2.25 MB PNG
>>101958848
Q3_K_S
>>
>>101958923
think about it harder then
>>
File: ComfyUI_20826_.png (1.03 MB, 1280x720)
1.03 MB
1.03 MB PNG
>>
>>101958923
You are looking at another image for the answer, you are not thinking about it.
Think for yourself and use logic to decide, not some external authority you follow blindly. Can you do that?
When do you want the model to be moved to your preferred device? After it is used in the workflow or before?
>>
Legally distinct Saria
>>
>>101958964
how did you manage to get that n64 low poly style?
>>
>>101958923
Anon there are colours that match, it's not as hard as you think it is.
>>
>>101958964
OOT vibes
>>
>>101958973
I trained a lora on a couple of screenshots I took in OOT
>>
File: ComfyUI_20915_.png (2.01 MB, 1920x1080)
2.01 MB
2.01 MB PNG
>>
>>101959013
I hope you'll release that kino in Civitai anon
>>
>>101958923
Anon, go pirate Factorio and don't even think about genning until you launch the rocket.
>>
>>101959031
But he probably has too many questions that he needs somebody else to answer for him before he can even get a torrent downloading.
>>
File: 160823_00001_.png (1.85 MB, 1024x1024)
1.85 MB
1.85 MB PNG
>>
does adetailer work on flux?
>>
>>101958964
>>101959013
Damn, are you the same guy who trained one for SD during its peak? I still have it on my hard drive lol, and the link apparently >>>/h/7222779
>>
>>101958796
>terminally obsessed redditor discovers AI for the first time
>>
>>101958796
>SuperBigote
kek
>>
File: 0.jpg (321 KB, 1024x1408)
321 KB
321 KB JPG
>>
I'm about to spoonfeed him because this is painful
>>
>>101959068
And this.
https://archived.moe/h/thread/7223742/#7225728
>>
>>101959075
Can you? Fuck off. This thread is not for requests.
>>
>>101959110
someone gen this anon getting furious at this anon >>101959075
for requesting a fat guy with a stubble happily typing on his pc
>>
>>101959110
You fuck off
>>
Someone fuck me.
>>
File: 2024-08-18_00218_.jpg (727 KB, 2688x1536)
727 KB
727 KB JPG
>>
File: ComfyUI_04756_.png (1.64 MB, 1024x1024)
1.64 MB
1.64 MB PNG
Holy kino! Now I'm convinced flux was trained with GPT4V captions, it responds so well to some GPT4 slop prompt
>>
>>101959075
Gratz, you just made a few fat guys with a stubble happily type on their PCs.
>>
File: download2.jpg (187 KB, 1024x1024)
187 KB
187 KB JPG
>>101959122
>>
>>101958572
hello sirs
>>
File: file.png (623 KB, 512x512)
623 KB
623 KB PNG
>>101958925
Q2_K dev 20 steps (~3 min)
This quant is completely fried.
>>
>>101959191
soul
>>
>>101959168
lmao
>>
>>101959191
lmao
>>
>>101959191
The prompt is
>This is a photo of Miku Hatsune riding a skateboard at a high speed through the streets of Tokyio. Godzilla can be seen in the background destroying the city. A person on the sidewalk is saying "Save us Miku!"
It would probably be best to mention a speech bubble, but I've done all the tests with this one and I don't want to change it now.
>>
File: file.png (642 KB, 512x512)
642 KB
642 KB PNG
>>101959191
Schnell Q8_0 (45 seconds because this size fits in my GPU).
>>101959223
>Tokyio
I just realized that, sorry.
>>
>>101959172
>>101959203
>>101959208
>>101959216
all me btw
>>
https://github.com/pamparamm/sd-perturbed-attention

The PAG custom node now seems to work with Flux when combined with the new snf rescale mode.
>>
File: 00140-1792250101.png (1 MB, 825x940)
1 MB
1 MB PNG
>>
File: d4gjP8CEbwg3av1u.jpg (72 KB, 720x900)
72 KB
72 KB JPG
>>101959258
Oh nice, gotta try that one out
>>
File: 0.jpg (366 KB, 1024x1024)
366 KB
366 KB JPG
>>
>>101959110
didn't know it was like that imma leave you alone, peace
>>
File: 00144-2811210283.png (1.64 MB, 832x1216)
1.64 MB
1.64 MB PNG
Inspired by OOT anon
>>
>>101959258
Perturbed/self-attention guidance were great on sdxl to help make shit a little bit more coherent, I think it's kinda unnecessary on flux given the speed reduction
>>
File: 2024-08-18_00220_.jpg (787 KB, 2688x1536)
787 KB
787 KB JPG
>>
File: file.png (2.26 MB, 1024x1024)
2.26 MB
2.26 MB PNG
>>101959242
>because this size fits in my GPU
Wait, this is probably not correct. Whatever. I'm a brainlet.
Anyway, this is Q8_0 too (3 min). My verdict is that it doesn't matter which one you use with my shit card because everything runs slow like crap, so you might as well just go with Q8_0 because everything else is retarded.
Thanks for reading my techbro blog.
>>
>>101959258
>the new snf rescale mode.
the what?
>>
File: ComfyUI_temp_fthgz_00085_.png (2.06 MB, 1024x1400)
2.06 MB
2.06 MB PNG
>>
make me a sandwich
>>
File: file.png (17 KB, 643x161)
17 KB
17 KB PNG
>>101959330
https://arxiv.org/abs/2311.10329

New setting that supposedly helps increase image quality at higher guidance - and more importantly is the only rescale mode that made the node work with Flux for me
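I can't vouch for the exact math of the snf mode, but rescale modes in general work by pulling the over-scaled CFG prediction back toward the conditional prediction's statistics. A generic std-matching rescale sketch for illustration (this is the common rescale trick, not necessarily what snf does; `phi` is an illustrative blend parameter):

```python
import numpy as np

def rescale_guidance(eps_cond, eps_cfg, phi=0.7):
    """Std-matching guidance rescale (generic sketch, NOT the exact snf math).

    Shrinks the CFG'd prediction back toward the conditional prediction's
    per-sample standard deviation, which tames the fried/washed-out look
    at high guidance scales.
    """
    std_cond = eps_cond.std(axis=(1, 2, 3), keepdims=True)
    std_cfg = eps_cfg.std(axis=(1, 2, 3), keepdims=True)
    rescaled = eps_cfg * (std_cond / std_cfg)
    # Blend between the rescaled and the raw CFG output.
    return phi * rescaled + (1.0 - phi) * eps_cfg

rng = np.random.default_rng(1)
cond = rng.normal(size=(1, 4, 8, 8))
cfg = 10.0 * cond  # exaggeratedly over-scaled prediction
out = rescale_guidance(cond, cfg, phi=1.0)
assert np.isclose(out.std(), cond.std())
```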
>>
can u make a photo of vegana
>>
File: 00147-2811210286.png (1.68 MB, 832x1216)
1.68 MB
1.68 MB PNG
>>
File: ComfyUI_20918_.png (2.02 MB, 1920x1080)
2.02 MB
2.02 MB PNG
Wonder how much the chinks would charge to have something like this made and sold in limited quantities, like a set of 10 or something.
>>
File: ComfyUI_00003_.png (1.09 MB, 1024x1024)
1.09 MB
1.09 MB PNG
>>101958923
Anon here is the most simple workflow for LoRA. Just drag this into your comfy workspace.
Please pay attention to how everything flows into one another.
https://files.catbox.moe/ne3vn2.png
>>
File: ifx88.png (1.45 MB, 1024x1024)
1.45 MB
1.45 MB PNG
>>
>>101959381
awwwwwwwwwwww look at that baabbeeeee hes literally me (and anyone else that comes to this general for help)


With that said, What are some good words/prompt combos to use when you want the model to gen an interesting pose?
>>
>>101959013
>>101959023

https://civitai.com/models/659573/flux-n64-oot-style?modelVersionId=738029

Thanks a lot anon.
>>
I've been conducting some experiments to see how a vision model like GPT-4o describes first-person images, and how Flux generates images from those descriptions. The first finding is that 4o never says "first person", "point of view", or anything indicating that the arm interacting with the scene belongs to the viewer. For example, this is how it captioned this image.
>A confident, smiling African American businessman in a dark suit and white dress shirt, standing in a modern, sunlit office with exposed brick walls. He is shaking hands with someone off-camera, extending his right hand. The office has large windows with views of greenery outside. In the background, there are shelves filled with binders, a plant, and a flip chart with colorful graphs. The lighting is soft and natural, creating a professional and warm atmosphere.
And when you plug this into Flux, you get many images that aren't truly from a first-person perspective.

Coupled with the knowledge that Flux often avoids generating first-person perspective interactions even if you specify them, this may be the reason why. It can still generate them sometimes, so it's obviously seen some during training, but the model doesn't have a strong connection to or understanding of these terms.
>>
File: Capture.jpg (341 KB, 2232x1473)
341 KB
341 KB JPG
>>101959350
oh nice, what are your parameters anon? I want to start on a good basis, I made a first try and it's not that good with the default parameters
>>
File: ifx75.png (1.14 MB, 1024x1024)
1.14 MB
1.14 MB PNG
>>101959380
might be cheaper domestically; if you have complex parts it's harder to 3D print and then colorway them, esp as some are within a plastic shell
>>
why cant pony do thick pubic hair? i have to go back to 1.5 to get it
>>
File: FD_00098_.png (1.33 MB, 1344x768)
1.33 MB
1.33 MB PNG
>>101959393
just describe the pose. providing it's not overtly pornographic it will generate it.
>>
File: 1715522541333440.png (1.44 MB, 1024x1024)
1.44 MB
1.44 MB PNG
>>101959372
>can u make a photo of vegana
>>
>>101958572
.
>>
>>101959427
use the tags "hairy anus, excessive pubic hair"

>>101959429
its mostly a me problem, i don't even know how to begin describing the hazy image in my mind's eye, i just want the model to surprise me with standing full body focus poses.
>>
>>101959398
It's still a bit under-cooked, I'll try to improve upon it in the future.
>>
File: FD_00146_.png (1.07 MB, 1024x1024)
1.07 MB
1.07 MB PNG
>>101959450
Try "dynamic pose" or "action pose"
>>
>>101959469
Yeah especially when we get some good finetunes to work on
>>
>>101959426

It would have to be something durable. I've printed in PLA before and that shit melts if it's in your car too long on a hot day. I'm assuming it would be expensive either way: first gotta find plastic in that color, and if you can't, gotta paint it from scratch. That's after paying someone to make a 3D model, then clearances for arms, questions like whether the eyeball is inside the helm or outside.
>>
>>101959405
However, this changes a bit for other first-person interactions. Look at this: it's the description 4o gave for this image.
>An image of a woman with long, wavy blonde hair wearing sunglasses and a sleeveless grey top. She is holding a wide-brimmed hat in her right hand and pulling someone forward with her left hand, looking back with a joyful smile. The setting is a serene, wide-open beach with soft sand and distant waves, framed by rocky cliffs covered in trees in the background. The sky is overcast, creating a soft, natural light. The perspective is from the point of view of the person being pulled, with their arm and hand visible in the foreground, holding onto the woman's hand.
Notice again that it doesn't say viewer, perspective, point of view, etc. It just talks about elements in the image. And what Flux spits out from this... is actually on point! What we can learn from this is that this particular kind of description IS in the training data, AND that no other images have this type of description, so the model generates this scene.
The only issue when you actually try this is that the hands get mangled pretty badly, so I guess the model hasn't seen a ton of this in its training, but just enough to understand the prompt.
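If you want the captioner to actually say it, you can force the wording in the instruction you send with the image. A hypothetical request-builder sketch using the standard OpenAI-style chat message schema (`build_pov_caption_request` and the instruction text are my own, not anything 4o does by default):

```python
def build_pov_caption_request(image_url, model="gpt-4o"):
    """Build a chat-completion payload that forces the captioner to
    state the camera perspective explicitly.

    Hypothetical helper: the instruction wording and model name are
    assumptions; the message format follows the standard OpenAI-style
    vision chat schema.
    """
    instruction = (
        "Describe this image for a text-to-image model. If any limb or "
        "object belongs to the viewer, say so explicitly using the words "
        "'first-person point of view'."
    )
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": instruction},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }],
    }

req = build_pov_caption_request("https://example.com/pov.jpg")
assert "first-person" in req["messages"][0]["content"][0]["text"]
```

Whether Flux then honors the phrase consistently is a separate problem, as the handshake example shows.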
>>
File: 2024-08-18_00226_.jpg (809 KB, 2688x1536)
809 KB
809 KB JPG
>>
>>101959509
> that's after paying someone to make a 3d model
use Tinkercad
>>
>>101959518
>And what Flux spits out from this... is actually on point!
>She is holding a wide-brimmed hat in her right hand
>>
>>101959518
Oh wait I'm blind. It does mention point of view. But not in the beginning. It's still able to make the connection though with the beginning part of the prompt which is interesting. But in my other experiments 4o does not mention POV, so this is an outlier.
>>
>>101959518
its like... properly describing an image will get you the image that you want... wow anon, truly remarkable discoveries you're making there, keep it going
>>
File: ComfyUI_03346_.png (1.37 MB, 1024x1024)
1.37 MB
1.37 MB PNG
>>101959411
Default settings for PAG right now, but I did turn the CFG down to 5.0 and the dynamic threshold at 0.990. GuidanceNeg 8.0. It seems to have a very narrow window of settings it functions properly at, might just be a worse option than the current meta.
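In case anyone's unsure what that 0.990 knob does: it's presumably an Imagen-style dynamic threshold. A rough numpy sketch of that idea (the node's actual implementation may differ):

```python
import numpy as np

def dynamic_threshold(x0_pred, percentile=0.990):
    """Imagen-style dynamic thresholding sketch (node internals may differ).

    Clamp the predicted image to the per-sample `percentile` of its
    absolute pixel values, then renormalize so extreme values don't
    blow out the rest of the image at high guidance.
    """
    flat = np.abs(x0_pred).reshape(x0_pred.shape[0], -1)
    s = np.quantile(flat, percentile, axis=1)
    s = np.maximum(s, 1.0).reshape(-1, 1, 1, 1)  # never shrink below [-1, 1]
    return np.clip(x0_pred, -s, s) / s

rng = np.random.default_rng(2)
x = rng.normal(scale=3.0, size=(2, 3, 16, 16))
y = dynamic_threshold(x, 0.990)
assert np.abs(y).max() <= 1.0
```

A higher percentile like 0.990 clips less aggressively, which matches it only kicking in at extreme values.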
>>
>>101959518
That image looks undercooked, try increasing the CFg or something
>>
>>101959564
Shut up. At least he's trying to help. What are you doing?
>>
>>101959580
EXACTLY thanks for appreciating me anon baby boy
>>
>>101959573
I see, I'll try those settings out first and try to figure out if I can improve from there; there's also a "Smoothed Attention" node, but when I try it, it looks exactly the same. I have no idea what differences they can have kek
>>
>>101959564
Try making a prompt that generates POV consistently in a variety of scenarios then.
>>
File: file.png (2.23 MB, 1024x1024)
2.23 MB
2.23 MB PNG
No matter what I do, it insists on having Miku say the thing. I've generated a dozen images and they all do this. What should I do to get the man to say it?
>>
>>101959540
Anon that's a real stock photo. I asked 4o to describe it.

>>101959564
Unfortunately it doesn't work that way. Image captioning systems have biases and don't always completely describe the image faithfully or in a way a human would. And in this case if you try to specify certain first person views in Flux, it doesn't generate them consistently. Some will be, like the one with the beach. Some won't be, like the one with the handshake.
>>
>>101959594
Yeah, I tried that too - it seems that a lot of the implementation might be broken for Flux, since adjusting some variables just doesn't do anything.
>>
Will models ever come together on one unified, agreed upon tagging and captioning system?
>>
File: ifx90.png (1.53 MB, 1024x1024)
1.53 MB
1.53 MB PNG
>>
>>101959616
HAHAHAHAHAHAHAHAHAHAHA
>>
>>101959616
Yes. The people who brought us unified prompt formats for text models are working on it this very second.
>>101959618
Beautiful
>>
>>101959616
maybe, after people do
for better or worse, a picture paints a thousand words
>>
>>101959604
>Anon that's a real stock photo. I asked 4o to describe it.
I misread you but still gonna say it is your fault.
>>
HELLO SIRS DOES THE CFG ON FORGE FOR FLUX DO THE SAME THING AS GUIDANCE IN COMFY
>>
File: Capture.jpg (278 KB, 2067x1455)
278 KB
278 KB JPG
>>101959615
please someone end her suffering ;_;
>>
File: 180995429.png (1.05 MB, 896x1152)
1.05 MB
1.05 MB PNG
>>
File: FLUX_00040_.png (1.08 MB, 896x1152)
1.08 MB
1.08 MB PNG
I'm not gonna spam this stuff, just thought this was cool
>>
>>101959657
that's funny
>>
>>101959665
Almost cute
>>
So should I use WD14 to train a lora for a pony model?
Does it matter that much?
I just think it makes sense.
>>
>>101959657
That Emma Watson lora has her hair firmly stuck in that style.
>>
File: ComfyUI_04760_.png (1.44 MB, 1024x1024)
1.44 MB
1.44 MB PNG
>>101959651
>Anon, why do you do this to me???
>>
Come and get some piping hot...
>>101959699
>>101959699
>>101959699
>>
File: ComfyUI_03353_.png (1.48 MB, 1024x1024)
1.48 MB
1.48 MB PNG
>>101959651
>>101959698

The 0.990 on adaptive threshold helped a lot, which probably means I have the wrong cfg tuned in.

Also tonemap seems to be compatible and adds more detail on top of the PAG gen. This is at tonemap 0.42.

https://files.catbox.moe/n31wod.png

Here's a catbox, but as it stands this just looks like an inferior restrictive method atm
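For reference, the tonemap trick compresses the magnitude of the (cond - uncond) guidance vector with a Reinhard curve instead of letting high CFG blow it out. A simplified numpy sketch of that idea (the actual ComfyUI node's math may differ in detail; `multiplier` plays the role of the 0.42 setting):

```python
import numpy as np

def tonemap_guidance(eps_uncond, eps_cond, cfg_scale, multiplier=0.42):
    """Reinhard-style tonemap of the guidance vector (simplified sketch).

    Preserves the guidance direction per pixel but compresses its
    magnitude, softening extreme guidance spikes at high CFG.
    """
    guidance = eps_cond - eps_uncond
    # Per-pixel magnitude across channels, floored to avoid div-by-zero.
    mag = np.linalg.norm(guidance, axis=1, keepdims=True) + 1e-10
    unit = guidance / mag
    # Scale ceiling derived from the magnitude statistics.
    top = (mag.std() * 3.0 + mag.mean()) * multiplier
    # Reinhard compression m / (m + 1), applied in units of `top`,
    # so compressed magnitudes never exceed `top`.
    compressed = (mag / top) / (mag / top + 1.0) * top
    return eps_uncond + unit * compressed * cfg_scale

rng = np.random.default_rng(3)
u, c = rng.normal(size=(2, 1, 4, 32, 32))
out = tonemap_guidance(u, c, cfg_scale=5.0)
assert out.shape == u.shape
```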
>>
>>101959616
the strength of all of this decentralized open source shenanigans where no one works together is that it behaves a lot like evolution
>>
>>101959725
Oh cool, didn't know it could be applied with tonemap, but yeah if the results aren't much better then it's not worth the x3 speed decrease
>>
File: 3732831602.png (948 KB, 1152x896)
948 KB
948 KB PNG
>>
>>101959811
fail


