/g/ - Technology


Thread archived.
You cannot reply anymore.




File: tmp.jpg (1.2 MB, 3264x3264)
Discussion of free and open source text-to-image models

Previous /ldg/ bread : >>102384776

>Beginner UI
EasyDiffusion: https://easydiffusion.github.io
Fooocus: https://github.com/lllyasviel/fooocus
Metastable: https://metastable.studio

>Advanced UI
Automatic1111: https://github.com/automatic1111/stable-diffusion-webui
ComfyUI: https://github.com/comfyanonymous/ComfyUI
Forge: https://github.com/lllyasviel/stable-diffusion-webui-forge
InvokeAI: https://github.com/invoke-ai/InvokeAI
SD.Next: https://github.com/vladmandic/automatic
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI

>Use a VAE if your images look washed out
https://rentry.org/sdvae

>Model Ranking
https://imgsys.org/rankings

>Models, LoRAs & training
https://civitai.com
https://huggingface.co
https://aitracker.art
https://github.com/Nerogar/OneTrainer
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts

>Flux
https://huggingface.co/spaces/black-forest-labs/FLUX.1-schnell
https://comfyanonymous.github.io/ComfyUI_examples/flux

>Pixart Sigma & Hunyuan DIT
https://huggingface.co/spaces/PixArt-alpha/PixArt-Sigma
https://huggingface.co/spaces/Tencent-Hunyuan/HunyuanDiT
https://huggingface.co/comfyanonymous/hunyuan_dit_comfyui
Nodes: https://github.com/city96/ComfyUI_ExtraModels

>Index of guides and other tools
https://rentry.org/sdg-link
https://rentry.org/rentrysd

>Try online without registration
txt2img: https://www.mage.space
img2img: https://huggingface.co/spaces/huggingface/diffuse-the-rest
sd3: https://huggingface.co/spaces/stabilityai/stable-diffusion-3-medium

>Maintain thread quality
https://rentry.org/debo

>Related boards
>>>/h/hdg
>>>/e/edg
>>>/c/kdg
>>>/d/ddg
>>>/b/degen
>>>/vt/vtai
>>>/aco/sdg
>>>/u/udg
>>>/tg/slop
>>>/trash/sdg
>>>/pol/uncensored+ai
>>
File: swiftelontrump.png (3.51 MB, 1568x1568)
>>
>>102406123
I can't even tell who this is intended to trigger anymore
>>
File: jetcruise.png (2.86 MB, 1568x1568)
>>102406128
>why can't we all just get along
>>
>>102406128
Liberals
>>
File: Sigma_14439_.png (935 KB, 1024x1024)
>>
File: Sigma_14448_.png (1.54 MB, 1024x1024)
>>
File: Sigma_14475_.png (1.84 MB, 1024x1024)
>>
File: Sigma_14482_.png (982 KB, 1024x1024)
>>
File: Sigma_14496_.png (1000 KB, 1024x1024)
>>
File: ComfyUI_08180_.png (241 KB, 1200x1752)
>>
File: 1724455739878807.jpg (2.13 MB, 2040x2040)
>>
File: 1718783989430728.jpg (1.9 MB, 2040x2040)
>>
>>102406283
>>102406297
>>102406348
>>102406357
>>102406365
>>102406400
>>102406411
>>102406430
These all suck, >>102396906 for the real gens.
>>
just upgraded from a 3060Ti to a 4090
do i need to redownload torch or anything like that?
i did switch to the Q8 flux quant.
>>
File: ComfyUI_08184_.png (245 KB, 1200x1752)
>>102406400
>>
>>102406476
4090 user here as well. You don't need to redownload anything.
>>
File: lounge.png (3.31 MB, 1561x1564)
>>102406435
>literally all 1girls
>>
>>102406534
awesome, thanks
i went from about 0.2 it/s to 1.5 it/s if any other VRAMlets were curious
>>
>>102406593
i have a 3080 and just leave flux generating batches when im not at home. the power usage and heat of the 4090 concerns me, the 3080 is bad enough.
AI power usage seems to be an issue, its like you get diminishing returns, someone needs to break the nvidia mold
>>
File: 1697908976890139.jpg (2.83 MB, 2040x2040)
>>
>>102406668
i wanted 24GB for training, but yeah i should have just got a 3090 instead
>>
File: view (4).jpg (795 KB, 2560x1440)
>>
File: kittensoup.png (2.91 MB, 1568x1568)
>>102406845
>>
File: view (5).jpg (890 KB, 2560x1440)
>>102406917
kek, topical
>>
File: voodooqueenlowres.png (1.92 MB, 1016x1016)
>>102406933
>>
File: ComfyUI_05257_.png (1.53 MB, 1024x1024)
>>102406411
>>102406430

3dpd
>>
>>
File: ComfyUI_01524_.png (348 KB, 512x512)
>>
>>102406950
>>
>>
File: ComfyUI_01536_.png (1.44 MB, 1024x1024)
>>
File: view (6).jpg (862 KB, 2560x1440)
>>
File: 1701341526254387.png (545 KB, 640x480)
>>
File: flightreactionsyell.gif (192 KB, 220x220)
>>102406593
>he paid $2000 to get 1.5it/s
it's so over
>>
>>102407522
prob flux numbers. it/s is a trash measurement without a ton of context. He 7.5x'd his speed (0.2 → 1.5 it/s). He is also still using the quantized model, so no idea what is happening.
>>
>>102407522
Not him but $2000 is about one week of post-tax labor for me
>>
>>102407537
flux or not that money should be getting us to the moon, not bringing us back to running an rx 580 on sd 1.5 in 2023(?)

>>102407548
making good money is based, pissing it away proudly to a company that's about to ship 8gb on a $600 card in 2025 isn't though.
>>
>>102407559
>pissing it away proudly to a company that's about to ship 8gb on a $600 card in 2025 isn't though
He didn't buy that card THOUGH
>>
>>102407522
More importantly, I can train now too

>>102407537
Should I go for fp8/32 instead?
>>
>>102407586
training at least i can see being worth it, gen speeds are fucking PISS.
>>
>you get magically teleported to a school playground in Wales, and the girl that discovers you looks at you from behind with this expression, WDYD?

pony's expressions will never cease to make me laugh
>>
pixartsexuals waiting room
>>
>>102407770
whats that?
>>
>>102407773
we've been edging for months
>>
File: ComfyUI_08218_.png (245 KB, 1200x1752)
>>
URPM is coming back!

https://civitai.com/articles/7418/urpm-is-back-i-will-be-releasing-over-two-years-worth-of-urpm-proprietary-models
>>
File: 00226-203681330.png (2.03 MB, 1328x1328)
>>102407780
>POV: You just told her that to her face
>>
>>102407849
Might be ok
>>
>>102407670
>'Ah kent Adam and Eve it! A dishy nonce for once! The girls'll be full a beans when I bring him to the do!

>>102407918
>'Wut do you mean 'Too old, not a pony'?!
>>
>>102407849
>SDXL
deprecated
>>
File: 00007-389307229.png (1.39 MB, 1152x896)
>>102408241
kek


>i FINALLY learned how to upgrade from 1girl
>after finding out base forge is trash and i need reforge because the dev broke everything
>and just getting latent couple because its better than regional prompter
long time c(u)oming.
>>
>>102408442
Stocking and Fran Bow?
>>
>>
>>102408483
yessir
>and sometimes fran bow's doppelganger, not sure what's causing this, my prompt's solid
>>
>>
>>102408442
just don't updoot
>>
File: ComfyUI_33470_.png (752 KB, 1024x768)
>>
File: 2024-09-16_01-42-52_7453.png (1.01 MB, 1152x896)
>>102406744
>>
File: 2024-09-16_01-54-51_1668.png (1.12 MB, 1152x896)
>>102406411
>>
File: 2024-09-16_02-05-36_3256.png (1.23 MB, 1152x896)
>>102406430
>>
>>102407586
Isn't Q8 slower than fp16?
>>
>>102405949
How do I run this on my Radeon 680M?
>>
>>102410128
It probably depends on whether you're memory or compute bound.
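a back-of-envelope sketch of why it depends (the 12B parameter count is my stand-in for a Flux-scale model, not a measured figure):

```python
# Rough arithmetic for why Q8 vs fp16 speed depends on the bottleneck.
# 12e9 params is an assumed Flux-scale size; numbers are illustrative only.
params = 12e9

fp16_gb = params * 2 / 1e9  # fp16: 2 bytes per weight
q8_gb = params * 1 / 1e9    # Q8: ~1 byte per weight (ignoring scale overhead)

print(fp16_gb, q8_gb)  # 24.0 12.0

# Memory-bound: each sampling step streams the weights, so Q8 moves half
# the bytes and can be faster. Compute-bound: Q8 adds dequantization work
# on top of the same matmuls, so it can end up slower than fp16.
```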
>>
File: 0.jpg (170 KB, 1024x1024)
>>
File: ComfyUI_00656_.png (1.81 MB, 1024x1280)
>1girl, AI slopface
>>
File: Sigma_14503_.png (984 KB, 1024x1024)
>>
File: Sigma_14517_.png (1.44 MB, 1024x1024)
>>
File: Sigma_14542_.png (1.01 MB, 1024x1024)
>>
File: Sigma_14534_.png (1.07 MB, 1024x1024)
>>
>>102411700
very nice! what is the prompt if i may ask? some new sigma finetune?
>>
File: Sigma_14508_.png (976 KB, 1024x1024)
>>102411778
> A glass prism catches a beam of sunlight, projecting a vibrant rainbow across a stark white wall. The spectrum of colors adds a magical quality to the minimalist setting, illustrating the hidden beauty within seemingly simple things.

Not released test but bumping mainly to keep this baby going
>>
went back to ponyxl after playing around with flux for a few weeks and it looks so fucking shit
>>
>>102411855
>Not released test
So it's the one you are tuning yourself. Looks nice so far. Can it deal with hands?
>>
File: 1714147155080054.jpg (2.66 MB, 2040x2040)
>>
>>102408903
cute style
>>
Is it still hard to use AMD GPUs?
>>
>>102412296
Yes
>>
>>102406128
Good.
>>
>>102412367
Is it changing anytime soon?
>>
File: 2024-09-16_08-22-37_8453.png (1.06 MB, 1152x896)
>>102411944
>>
File: 1699655916907105.jpg (3.04 MB, 2040x2040)
>>
File: 1720796527030362.jpg (2.43 MB, 2040x2040)
>>
>1girl
>stock photography
>basic abstract
>political memes
Which way, image generation man?
>>
>>102412296
What gpu you have?
>>
>>102412483
unlikely. amd doesn't seem to give a shit.
>>
https://www.reddit.com/r/StableDiffusion/comments/1fi1e04/flux_halflife_but_soviet_era/

No Half Life or Stalker lora yet?
>>
>>102412561
im not on this list
>>
>>102412561
i've been combing the first 3 since 1.5
>>
https://www.reddit.com/r/StableDiffusion/comments/1fhpk7p/2_years_later_and_ive_still_got_a_job_none_of_the/

Can anyone make this image in Flux? OP says you can't
>>
File: 2024-09-16_08-39-04_6710.png (1.23 MB, 1152x896)
>>102411944
>>
>>102412638
>OP says
Good for him
>>
>>102412714
Aaarrgghh get down advertiser sama!
>>
File: Sigma_14552_.png (1.31 MB, 1024x1024)
>>102411907
>So it's the one you are tuning yourself
Yes, it's bunline dataset with higher lr and higher betas on CAME.

>Can it deal with hands?
Only when you get lucky. And no text either. Not flux quality, but I'm not pretending it is either
>>
File: file.png (2.49 MB, 1504x1024)
>>102412638
this is my attempt, meh
>>
>>102412785
Is it hard to find good parameters for CAME? I don't mind if it's not Flux quality. Text can be edited in.
>>
File: 1712654216433254.png (1.08 MB, 768x1024)
Using duchaiten pony, trying to get this girl to be outside at night. But I keep getting daytime pictures.
Do I need a lora for this or does anyone know how to word the input to to get night time?
>>
File: file.jpg (646 KB, 3072x1280)
Someone used the "true_cfg" method from Flux PuLID at its advantage, maybe we could make a node out of it, if it turns out it's better than the other methods (AutomaticCFG, SkimmedCFG...) then we'll be eating good I guess:
https://xcancel.com/linoy_tsaban/status/1835655132174004717#m
https://gist.github.com/linoytsaban/aec23eee5cc168726f594c04413ead7a
https://raw.githubusercontent.com/linoytsaban/diffusers/cfg-flux-exploring/examples/community/pipeline_flux_with_cfg.py

Here's the demo:
https://huggingface.co/spaces/multimodalart/flux-cfg
>>
mmm okay flux is indeed pretty fun.
>>
>>102412879
Oh. I figured it out. Nighttime works as one word but not as two for some reason.
>>
>>102412912
you used a lora to get Cirno?
>>
>>102412897
>we could make
>we
Literally just ask chatgpt to write you the node, takes 20 seconds
>>
File: 00097-2224858638.png (2.2 MB, 1120x1440)
>>
>>102412575
RX 7900 xt, got it for basically free from a friend that upgraded to a RTX 4090.
>>
>>102412973
that's what I'm doing kek
>>
>>102412984
Godspeed
>>
>>102412930
nope just "Cirno ice fairy touhou zun with ice wings smiling like a baka" plus rest of prompt, adding blue eyes and blue hair bow works decent too.
>>
>>102412897
someone check if using this also helps stop its weird censoring of nips (ie. even male nips look like weird puffy splotches 9 times out of 10)
>>
>>102412973
its surprising that I've been messing with ai for over two years and how often I still catch myself forgetting I can just get chatgpt to make/fix/explain things.
>>
>>102412897
holy shit the hf demo is slow, this method is not going to be plausible for vramlets is it...
>>
File: 00040-3803674750.png (735 KB, 1216x832)
>>
>>102413035
It's a blessing and a curse, sometimes think I'm going to regress in crit thinking skills or coding knowledge for depending on it
>>
>>102413152
NOOOOOOOOO!!!!!!!!
>>
vramggots still on suicide watch they will never recover
>>
File: 00655-2227675913.jpg (562 KB, 1152x1536)
>>
>>102413191
:( I don't want to get a job
>>
>>102413227
>Its MA'AM
>>
>>102413155
I felt that, I was a way better coder during the pre-AI days, now I feel like if they remove chatgpt out of me I couldn't do anything by myself anymore
>>
>>102412897
damn this does look really good
>>
>>102412978
See how this works
https://github.com/patientx/ComfyUI-Zluda

Should take 10 seconds at 20 steps
>>
File: 1.png (998 KB, 1024x1024)
>>102412834
>>102412638
>>
File: file.png (644 KB, 3152x1442)
>>102412973
yeah... Claude 3.5 sonnet helped me making a node but it's not working as expected :(
>>
>>102413339
Alright, thanks.
>>
>>102412897
Wait am I fucking retarded? I looked at the Diffusers PR, and it's literally just CFG and NOTHING else? There is no new technique that's better than AutomaticCFG etc, it's actually just CFG and they're acting like it's some brand new revolutionary thing.

Surely not though, somehow show me how I'm wrong.
>>
File: Sigma_14562_.png (1.29 MB, 1024x1024)
>>102412638
Sigma

>>102412861
It's really sensitive and frequently learns more than I expect at low LR. Increasing betas keeps previous gradient more intact. It's a schedule-free optimizer, which makes at least the schedule nice at a constant rate.
>>
>>102413438
Write something in the negative prompt
>>
>>102413585
I don't think so, it's using a "true_cfg" parameter that is calculated with this script:
https://raw.githubusercontent.com/linoytsaban/diffusers/cfg-flux-exploring/examples/community/pipeline_flux_with_cfg.py
>>
>>102413610
won't work, Claude is retarded, it included the "true_cfg" parameter but it wasn't linked with the "pipeline_flux_with_cfg.py" script at all, which is fair because the transition from a diffusers inference pipeline to a ComfyUI MODEL node isn't trivial at all
>>
>>102413614
If true_cfg is > 1, it simply computes a negative noise prediction and then does this:
noise_pred = neg_noise_pred + true_cfg * (noise_pred - neg_noise_pred)

That's just plain CFG. I can't find anywhere where there's some kind of new anti-frying technique like AutomaticCFG.
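For reference, that one line really is the whole operation — a minimal numpy sketch (the function name and toy values are mine, not from the gist):

```python
import numpy as np

def cfg_mix(noise_pred, neg_noise_pred, true_cfg):
    # plain classifier-free guidance: push the prediction away from the
    # negative (unconditional) prediction by the guidance scale
    return neg_noise_pred + true_cfg * (noise_pred - neg_noise_pred)

pos = np.array([1.0, 2.0])   # toy "positive" noise prediction
neg = np.array([0.5, 0.5])   # toy "negative" (uncond) noise prediction

print(cfg_mix(pos, neg, 1.0))  # true_cfg=1 returns the positive pred unchanged
print(cfg_mix(pos, neg, 5.0))  # true_cfg=5 amplifies the difference: [3. 8.]
```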
>>
>>102413669
there's no way it's just the regular cfg, you would get something fucked up at cfg = 5, and it's not the case for that example picture >>102412897
you don't get something this clean at cfg = 5 without an antiburner
>>
>>102413685
Yes, that's why I think I must be the one who's a retarded codelet. Someone please show me in the code where it does the new antiburn technique, because I'm not seeing it.
>>
>>102413602
>Increasing betas keeps previous gradient more intact.
Sorry for dumb question, but how do you see this..with Tensorboard? So you go back to previous epoch and adjust it manually?
>>
File: file.jpg (1.13 MB, 2048x2743)
>>102413728
I'm not seeing it too but I'm probably a retarded codelet too, there's something goin on that makes it work lol
>>
File: robot.jpg (995 KB, 1344x1728)
>>102413728 (me)
Nah I'm right. Here's the example prompt, with guidance=1 and CFG=5. No antiburner. The guidance=1 is important, otherwise it's fried.

Fucking redditors. Their "innovation" is literally vanilla CFG and nothing else.
>>
>>102413849
elemayeoh
>>
File: file.png (3.08 MB, 3025x1387)
>>102413849
>Fucking redditors. Their "innovation" is literally vanilla CFG and nothing else.
that's a big deal no? like I never thought we could simply disable distilled guidance, go for CFG > 1 and it works, we don't need antiburners then with that config
>>
>>102413669
>>102413849
https://github.com/search?q=repo%3Ahuggingface%2Fdiffusers%20noise_pred_uncond%20%2B%20self.guidance_scale%20*%20(noise_pred_text%20-%20noise_pred_uncond)&type=code
>noise_pred = noise_pred_uncond + self.guidance_scale * (noise_pred_text - noise_pred_uncond)
yeah that's just the standard cfg lol
>>
File: 00011-2019439815.jpg (1.84 MB, 2475x1564)
>>
File: file.jpg (1.45 MB, 4196x1887)
>>102413926
>standard CFG + distilled guidance 1 (disabled)
Why haven't I thought of something like that earlier? that's fucking clever
>>
>>102413602
nice
>>
File: file.png (1.66 MB, 1024x1024)
>>102414047
>A photo of a woman in New York
>Distilled Guidance 1
>CFG 5
yeah... that doesn't look really good, maybe adding an antiburner on top of that could do the trick, Idk, looks way better than a regular cfg 5 you usually see on flux though
>>
>>102414249
are you sure your workflow isn't fucked up? I'm not at the PC to try this myself but the HF results do not look like that and that's all it has enabled on there
>>
File: file.png (1.99 MB, 3021x1435)
>>102414282
>the HF results do not look like that and that's all it has enabled on there
how does it look like in your end? because I'm doing this on ComfyUi, you just have to change the cfg and distilled guidance values
>>
File: ComfyUI_01545_.png (1.13 MB, 1024x1024)
>>
File: file.png (1.78 MB, 1423x1802)
>>102414282
>>102414357
Got similar results on the demo
>>
File: file.png (2.96 MB, 3239x1546)
this method makes flux very good at styles, even though Migu looks like a tranny here kek
>>
File: 1719079785578307.png (55 KB, 1264x421)
that can't be good. oh god, it's still going ahead with the generation. i'm making dolphin mustard
>>
>>102414390
huh no you're right, my earlier people tests didn't have this issue but trying your prompt its fucking up for me too. maybe its only good for stylized pictures and photos are hit and miss
>>
>>102414486
I thought everyone knew about the guidance=1 + CFG since the beginning? When guidance is 1, the model is trying to mimic CFG 1, i.e. trying to predict only the positive noise with no built-in negative. So theoretically that should be like any other model without guidance distillation. In practice, it doesn't work quite right, it makes images look "messy" with weird high-frequency noise artifacts. For certain types of painting styles this works well (the impressionist example with the robot) but for photos it looks like shit.
>>
>>102414390
>>102414357
That's how guidance at 1 looks by default. I think it was just luck with the impressionist painting style
>>
>>102414532
well, fuck. damn plebbitors getting my hopes up
>>
File: FluxDS_00006_.jpg (143 KB, 1216x832)
Is 15s/it normal for Flux models on a 4070?
>>
>>102414581
depends on which version of the model youre running
cute gen tho
>>
File: ComfyUI_06504_.png (1.37 MB, 1024x1024)
>>102414532
>>102414547
>>102414575
yeah, basically guidance = 1 means it's disabled, so you can technically go for CFG > 1 without burning the image, but the images are still fucked anyway so...
>>
>>102414596
>HHH
nigku having a stroke there
>>
>>102414586
fp8 all-in-one .safetensors
>>
import torch
from transformers import T5Tokenizer, T5EncoderModel
from diffusers import AutoencoderKL

tokenizer = T5Tokenizer.from_pretrained(t5_models_dir, subfolder="tokenizer")
t5_model = T5EncoderModel.from_pretrained(
    t5_models_dir,
    subfolder="text_encoder",
    torch_dtype=torch.float16,
).to(offload_device)
vae = AutoencoderKL.from_pretrained(
    "F:/ai/models/vae-kl-f8-d16", torch_dtype=torch.float16
).to(offload_device)

# KV compression config
kv_compress = True
kv_compress_config = {
    'sampling': 'conv',  # options: 'conv', 'uniform', 'ave'
    'scale_factor': 2,
    'kv_compress_layer': list(range(22, 45)),  # layers 22 to 44
}

# Initialize the full model on cuda:0 (Diffussy is the custom model class)
model = Diffussy(
    vae=vae,
    depth=44,
    hidden_size=1440,
    num_heads=20,
    kv_compress=kv_compress,
    kv_compress_config=kv_compress_config,
    learn_sigma=True,
    pred_sigma=True,
).to(device)


2B is all you need... right?
>>
>>102414605
yeah, kek
>>
for the love of God someone finally convince me to stop testing epoch 1-10 on 30 epoch loras, wasting time and cursing myself with uncanny valley results
>>
>>102414619
>Diffussy
Any images yet, anon?
>>
File: Sigma_14572_.png (1.25 MB, 1024x1024)
>>102413807
I keep validation noise fixed so you can tell how much is changing. Right now I'm saving to file in a folder because tensorflow viewing experience kinda sucks.

>  for image in images:
> -    formatted_images.append(np.asarray(image))
> +    image.save(f'output/validation/{step}_{validation_prompt}.jpg')
>>
>>102414619
>Diffussy
This sounds like the most troon name in the existence of troon names please reconsider
>>
>>102414619
>2B is all you need... right?
2b is so much all you need that SAI went for a 16b model (SD-Ultra) on their API lol
>>
>>102414626
I always start from final epoch with 0.7
>>
>>102414607
try the Q8 GGUF and see if it's any better
>>
File: file.png (2.51 MB, 1024x1024)
>>102414596
>cfg 1
>distilled guidance 1
https://youtu.be/NZkig1r2zCs?t=6
>>
>>102414637
No I'm experimenting, so it's all VAE learning shit right now. Dialing in the parameters. Implementing the features needed to make a basic transformers model.

>>102414646
I hope you learned your lesson about large models and local.
>>
>>102414677
unfiltered sovl holy shit
>>
>see AI image I really like
>fuck what model/lora is this
>it's midjourney
>fuck
>see AI image I really like
>fuck what model/lora is this
>it's midjourney...
aaaaaAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
>>
File: Sigma_14598_.png (1.34 MB, 1024x1024)
>>102414619
HYPE
>>
midjourney's regular aesthetic is pretty mid desu
>>
>>102414724
I felt that anon, MJ can make so much kino pictures, no one is close to it yet
>>
File: Sigma_14605_.png (611 KB, 1024x1024)
>>102414638
tensorboard*
>>
File: Sigma_14612_.png (884 KB, 1024x1024)
>>
>>102414469
So I figured out this is a LoRA problem. In fact, some LoRAs just seem to not work for me. Is it a problem with my workflow? How do I fix this? Clearly the LoRA works for others
>>
>>102414745
kills me. I'd be fine sticking with the shitty SDXL arch or whatever else if it was midjourney/niji tier. but I absolutely refuse to use their shitty discord service. one day I'll collect enough images to make a MJ slop lora worth my time, I guess
>>
>>102414729
I'm trying to implement patches but I'm too much of a brainlet and I really want to do a crop-free training workflow where you train basically on ratios and total pixels. Ie 1.0 @ 65536 pixels (ie 256x256). I do wonder if you're willing to pay the price if doing the full latent is better in the end.
>>
>>102414774
Sorry, I'm wrong, my old workflows are broken too. I assume updating python just fucked things god dammit
>>
>>102414774
are you using a lora that is not compatible with the model you are using? ie. 1.5 lora with sdxl, or something?
>>
>>102414792
>one day I'll collect enough images to make a MJ slop lora worth my time
have you tried those ones though?
https://civitai.com/search/models?baseModel=Flux.1%20D&modelType=LORA&sortBy=models_v9&query=mj
https://civitai.com/search/models?baseModel=Flux.1%20D&sortBy=models_v9&query=midjourney
>>
File: 1725670433241014.png (135 KB, 1258x583)
>>102414807
The workflow I was using was shared from another anon, and he was making images with it without issue. Even old workflows that worked now do picrel. Not sure what to do either, all I did was run 'update_comfyui_and_python_dependencies.bat'
>>
>>102414800
so you want to train a 2b model from scratch? why not simply finetune Flux, that would be less work imo
>>
>>102414867
No it wouldn't and it certainly wouldn't be cheap. This would be 100% my model, my code too. So I could license it how I please.
>>
>>102414850
well, share your WF then, can't really guess otherwise. that error to me says you're loading a lora that isn't made for the model you're using, but maybe you're just using an out of date custom node
>>
File: 1723158894339026.png (2.81 MB, 2048x1224)
despair, i'm probably going to have to reinstall won't i
>>
>>102414877
fair enough
>>
>>102414877
>This would be 100% my model, my code too. So I could license it how I please.
What if we finetune schnell so that it's not a piece of distilled shit and would be as good as dev (and will work with a lot of steps), that way we can keep the Apache 2.0 licence while having some quality in it
>>
>>102414906
https://files.catbox.moe/3gd4fy.json
here's the workflow for the frieren fumo one. both LoRAs can be found on Civit, they're for Flux and as you can see from >>102414907 I've gotten both to work perfectly fine in the past

All I did since that was update dependencies and load up someone else's workflow https://files.catbox.moe/vcbn01.png (which was bait, but i liked the lora and output quality. workflow included in that image) with this LoRA https://mega.nz/folder/mtknTSxB#cGzjJnEqhEXfb_ddb6yxNQ/folder/71kWXLjA
>>
>>102414907
i feel you anon this shit fucking sucks mega gay ass
>>
>>102414949
i'm fine downloading models and shit for hours but i would like the result of those downloads to WORK and STAY WORKING. I'll provide any info I can but I have a feeling nobody knows what's going on
>>
>>102414372
more
>>
File: 00143-595692129.png (2.25 MB, 1120x1440)
anyone else get the thing where flux just makes the poses floaty? I swear it is something I am doing because certain prompts really bring it out but it's just unlucky.
>>
File: file.png (18 KB, 931x258)
anybody happen to know how big CuDNN is?
it's been downloading for over half an hour
>>
File: file.png (3.45 MB, 3185x1612)
>>102414907
>>102414949
>>102414995
did you update comfy in a way it also updated torch? because the current installation uses torch 2.4.1 but the older ones look better, that's why I decided to stay at 2.3.1
>>
>>102415071
800MB ish I think. you on dial up anon?
>>
>>102415113
pretty much (500KB/s) and i have to share my connection
thanks
>>
>>102415109
Still waiting for a multi seed comparison before I believe this
>>
>>102414677
Best image posted in weeks
>>
https://reddit.com/r/StableDiffusion/comments/1fic0if/playing_with_cogvideoxs_new_image_to_video_feature/
CogVideoX-5b can do image to video now
https://huggingface.co/spaces/THUDM/CogVideoX-5B-Space
>>
>>102415109
I think so. I'm also really suspicious about the GGUF loader for my quantized flux model. In all the workflows I import I have to remove the model loader for the GGUF one and I'm seriously starting to suspect it's the culprit for the crashing. I mean even the basic comfyui workflow on https://comfyanonymous.github.io/ComfyUI_examples/flux/ crashes. Only thing I'm changing is the model loader
>>
>>102415272
>In all the workflows I import I have to remove the model loader for the GGUF one and I'm seriously starting to suspect it's the culprit for the crashing.
that's weird, I never had such problem with the GGUF node, did you update the package?
>>
File: 00702-1951859936.jpg (442 KB, 1152x1536)
>>
>>102415243
wow thats bad
>>
>>102415289
Just updated it, was actually some stuff to pull. Fingers crossed that was the problem!
>>
File: 20240916_210621.webm (680 KB, 720x480)
>>102415243
meh
>>
File: ComfyUI_06519_.png (1.02 MB, 1024x1024)
https://civitai.com/models/765554?modelVersionId=856284
let's goo, I suggest you to go for Strength 0.7 or else you'll only get FFX characters kek
>>
>>102415243
>>102415377
well at least since its open source there's improvement potential.

>and we won't need to rely on the chinese for long
>>
>>102414767
pixart would be the meta if it had more knowledge and a 16ch vae
>>
haha lol
>>102415243
>>
>>102415394
>10000 ingame Screenshots
this guy is fucking chopping up and snorting blitzball
>>
File: ComfyUI_01544_.png (1.21 MB, 1024x1024)
>>102414998
>>
File: 20240916_211549.webm (338 KB, 720x480)
>>102415243
>>102415394
it's not that bad, definitely meme potential, far from MiniMax of course
>>
>>102415394
>10000 screenshots
holy shit
>>
File: ComfyUI_06520_.png (1.05 MB, 1024x1024)
>>102415394
kek, desu that was a bad idea to mix the high quality cutscenes with the in game cutscenes, dunno how to switch from one or the other
>>
i hope anons realize we're not getting any good local video models in 2024 simply due to hardware constraints. you probably need 64gb of VRAM for a 20fps 720p 5 second video and each video will take 10 minutes unless you own a H100 in which case it will take 5
>>
File: they call me cory.webm (168 KB, 720x480)
i dont think the 5b air is doing too well with cory
>>
>>102415556
>you probably need 64gb of VRAM for a 20fps 720p 5 second video and each video will take 10 minutes unless you own a H100 in which case it will take 5
for CogVideoX-5b I think it's asking for less than 24gb, and even if it's too big, we can go for GGUF quants, that works
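rough weight-only math for why quants help here (the byte-per-weight figures are standard, but this ignores activations, the VAE and the text encoder, so real usage is higher):

```python
# Weight-only VRAM estimate for a 5B-parameter model at a few precisions.
# Activations, VAE and text encoder come on top, so treat these as floors.
params = 5e9
for name, bytes_per_param in [("fp16", 2.0), ("q8", 1.0), ("q4", 0.5)]:
    print(f"{name}: {params * bytes_per_param / 1e9} GB")
# fp16: 10.0 GB, q8: 5.0 GB, q4: 2.5 GB
```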
>>
>>102412484
top tier, more latex bodysuits please.
>>
>da da da DAAAAAAAAAAAAAAAAAAAAAAAA
>*snake eater*

okay this is really fun, even if it only gets slightly better by the end of the year it has a lot of meme potential.
>>
File: w.png (1.28 MB, 1024x1024)
>>
>>102415625
lmaooo, it's basically a worse version of Luma we got at home, image to video will never cease to be funni
>>
File: 0.jpg (176 KB, 1024x1024)
>>
>>102415580
yeah and cogvideox is 720x480 and 8fps, which is worthless outside of a tech demo

pre-nerf kling is my benchmark for actually usable text-to-video or image-to-video AI, so 1280x720 or 960x960 and at least 20fps, preferably 24fps. i cant fap to girls jiggling at 8fps
>>
>>102415394
is it possible to prune a lora of this size?
>>
File: 1722438442515160.png (2.14 MB, 1024x1024)
>>102415289
Updating it didn't work, but restarting afterwards seems to have done it. Old workflows work again. We're so back. Thanks for the help!
>>
File: 20240916_212642.webm (181 KB, 720x480)
>>102415552
>>102415243
not gonna lie that's funni
>>
>>102415682
you're welcome anon o/
>>
File: dino_00104_.png (1.1 MB, 1024x1024)
>>102415682
nice
>>
>>102415691
thats like a fucked up fmv game cutscene, going from ps1/early pc ingame graphics to an fmv with a hard cut lmao
>>
File: 00020-2564301890.png (744 KB, 1024x1024)
>>
>>102415394
Makes blurry images at 1
>>
>>102415781
at strength = 1? show me a screen of your workflow anon
>>
File: inkling.jpg (146 KB, 1024x1024)
wow surprisingly not terrible, kinda cute even.
>image resolution is too large
https://files.catbox.moe/dznvia.mp4
>>
File: 1.png (1.22 MB, 1024x1024)
>>102415793
Yes, some loras don't work at 1 with boomer prompting, they all get distorted
>>
>>102415833
yeah, that's why I recommanded 0.7
>>
File: 20240916_213837.webm (235 KB, 720x480)
>>102415728
kek
>>
File: 00729-2723491664.png (886 KB, 896x1152)
>>
File: 5.png (1.41 MB, 1024x1024)
>>102415850
0.50
>>
File: tmp5jxk_nqb.png (888 KB, 896x1152)
888 KB
888 KB PNG
>>
>>102415873
show me a screen of your workflow anon, maybe something is wrong with your settings
>>
>>102415859
works well on that one because 8fps is not that low in the "Wallace and Gromit" kind of style
>>
File: 1696968389879154.png (1.64 MB, 1024x1024)
1.64 MB
1.64 MB PNG
ah........ no more random ass crashes and bullshit... just back to good old proompting... feels good man
>>
File: 20240916_214637.webm (471 KB, 720x480)
471 KB
471 KB WEBM
Andrew Tate is that you?
>>
File: 25.png (1.3 MB, 1024x1024)
1.3 MB
1.3 MB PNG
>>102415892
0.25

It works as intended, it's the prompt with tags like 4k that changes it

The lora is good, if it was broken it would not apply the style it was trained on
>>
>>102413954
nice one
>>
>>102415814
>wow surprisingly not terrible, kinda cute even.
true, I hope they'll improve their model, it has potential
>>
File: 0.jpg (214 KB, 1024x1024)
214 KB
214 KB JPG
>>
File: file.gif (1.76 MB, 360x240)
1.76 MB
1.76 MB GIF
>>
File: GXjKVWMXoAALFDY.jpg (217 KB, 1290x1217)
217 KB
217 KB JPG
Can someone please make a movie poster for The Clumsy Assassin: Part Two?
>>
>>102416058
Sub to my patreon first
>>
File: file.gif (2.13 MB, 360x240)
2.13 MB
2.13 MB GIF
wtf
>>
>>102405949
Please make sure to delete >>>/pol/uncensored+ai from OP next time. My disdain for the board aside, it seems their general died out.
>>
File: reddit.png (1.19 MB, 1152x896)
1.19 MB
1.19 MB PNG
>>
>>102416373
you're joking right? their videos are hilarious
>>
File: 00195-3446841138.png (2.07 MB, 1440x1120)
2.07 MB
2.07 MB PNG
>>
>>102416403
When was the last time you checked there? Every time I looked up their catalog nothing was there. Maybe tough luck on my part.
>>
>>102416423
>Every time I looked up their catalog nothing was there.
by "nothing" you mean "nothing interesting"? Yeah I'd say there's 80% of gagbage but the remaining 20% are pure kino, if you have enough patience you can find some gems
>>
>>102416445
No I meant there's no general for us to link to. I don't frequent pol anymore, so I've probably missed what you're mentioning, though I've seen this and that pop up on other boards, since it leaks into /v/ and the likes.
>>
File: file.png (1.96 MB, 1024x1024)
1.96 MB
1.96 MB PNG
>>
>>102416457
oh yeah, I find it weird too, I'm sure there's other stuff out there besides some /pol/ board but I can't find it, we definitely need an AI board ffs...
>>
>>102416472
I wouldn't be surprised if at some point /trash/ became a containment board for ai content under some new rule, due to the sheer amount flooding other boards. Unless we live in a timeline where ai becomes commonplace like ms paint, that's also an option I guess.
>>
File: 00236-3420271091.png (2.5 MB, 1328x1328)
2.5 MB
2.5 MB PNG
to add to this current conversation, it is baffling how we're now a solid year in with actually good local AI (besides voice) and STILL no AI general board, that would've even been a logical move to make during the initial leaks.
why are the glowies and hiroshimoot so retarded and incompetent?
>>
>>102416559
Path of least resistance I guess. Less maintenance for them, and chances are they're in it for the money and little else, so our priorities and common sense likely do not align. Though that begs the question why boards like the vtuber one or the myriad video game related categories exist. I've been frequenting chan on and off for a good couple of years now and the mentioned examples flew completely over my head whenever they got implemented.
>>
>>102416559
I could probably add an AI board on my site, but it has little over 1.5 users anyways.
>>
File: file.gif (1.96 MB, 360x240)
1.96 MB
1.96 MB GIF
>>
>>102416590
gamers on their own are harder to tardwrangle than AI generals, which would be significantly less work if they were tardwrangled onto their own board.
hell it's easier to moderate if it's all one board. It's not /g/ that's spammed with bbcposting, doomposting, and glowies literally every hour of every day, mysteriously ceasing during certain middle eastern conflicts.
Even the advertising argument makes zero sense given the loads of nsfw ads, which from what i heard have ramped up over the past year.
>>
File: 00073-1093100536.png (2.04 MB, 1328x1328)
2.04 MB
2.04 MB PNG
>>102416675
shit i meant to attach another funny gen to this post to contribute my bad
>>
>>102416602
don't bother, the feds will spam cp on it, only one honeypot image board allowed for the english speaking community.
>>
File: tmpj6be2bmw.png (1.2 MB, 896x1152)
1.2 MB
1.2 MB PNG
We live in a society and I just hope affordable ai resources keep being developed. XL was slow enough, and ever since F1 every gen has felt like an eternity.
>>
File: 00069-4074743733.png (818 KB, 768x768)
818 KB
818 KB PNG
>>102416769
trust the plan
nvidia squeezing our balls tight with vram means devs will have no choice but to optimize optimize OPTIMIZE going forward when 99% of users won't have more than 12-16gb available on even brand new $1k cards.
the silver lining is there, you just have to be willing to see it.
>>
File: tmpvrt4d8ox.png (1.16 MB, 896x1152)
1.16 MB
1.16 MB PNG
>>102416976
Can't say I trust it, since there are always fiascos, fuckups and greedy obstacles to be expected, but the way you described it is precisely how I've been looking at it all this time, at least when it comes to local; due to its open source nature we're bound to work around limited resources. Glad to see someone share the sentiment.
>>
>>102413191
I can't wait for the 5090 64GB VRAM card to come out.
>>
File: ComfyUI_00750_.png (3.18 MB, 1792x2304)
3.18 MB
3.18 MB PNG
>>
>>102413191
:( I can't find a job
>>
File: 00070-3944207463.png (745 KB, 768x768)
745 KB
745 KB PNG
>>102417068
nah trust me i share the paranoid sentiment too, But it's not healthy to stay thinking that way. We're here because we're in the possibly blessed timeline to some degree. Enough that i can gen tanned loli inklings to my heart's content on a decade old graphics card kek like i said just trust the plan.
>>
File: tmpl2ylslmb.png (1.03 MB, 896x1152)
1.03 MB
1.03 MB PNG
>>102417124
Ngl it's amazing how much we can already get away with at relatively low specs. I'm personally satisfied with my current tools when it comes to pixels. LLMs could use some work though, never really found a satisfying model quant.
>>
>>102417246
LLMs are a weird one. I find loras easy to make for SD but i've NEVER touched training anything for LLMs besides downloading other people's work, and i've also tried a myriad of text models and found most of them lacking.
Right now i'm sitting on NeuralDaredevil-8B-abliterated.Q5_K_S.gguf, seems the best i've ever used in its B class; at least the gap between low and high parameter counts for ERP/general purpose seems to be closing, but no one's ready to admit that on here at least kek

>shit i need to gen more SFW clowns
>>
>>102412974
Looks great. Can you do a catbox?
>>
Clown World
>>
>>102417087
If only it would actually come with that much. Current rumors put it at 28gb. Anything north of 32gb and they're saving that for data centres and charging 6k+ for it.
>>
File: tmppk8onvta.png (1.22 MB, 1152x896)
1.22 MB
1.22 MB PNG
>>102417317
Thanks for sharing the model, I'll probably give it a go once I'm bored of pixels. When it comes to LLMs, I guess vram is even more of a gamechanger with how much context comprehension is needed to keep it coherent.
>>
File: t43yeq3hao161.jpg (32 KB, 640x480)
32 KB
32 KB JPG
>Windows
1. Install and unzip folder
2. Click .bat
3. It launches

>Linux
1. Install dependencies
2. Clone repository
3. Clone another repository
4. Create and activate Virtual Environment
5. Create another virtual environment
6. Activate virtual environment
7. Install
8. Run
>>
>>102417733
A friend of mine just switched to Ubuntu, and the first thing he had to do when we wanted to play together was boot up a windows virtual machine.
>>
>>102417733
Linux is a Rube Goldberg machine
>>
>>102417578
I find configuration way more important. Everything has quants by now. VRAM just buys you speed.

>>102417767
>boot up a windows virtual machine.
a noob in linux definitely did this. GPU passthrough is easy shit.
>>
File: file.png (9 KB, 888x85)
9 KB
9 KB PNG
>downloading torch with my shit internet for the 10th time for my 10th AI project
Auto1111, ComfyUI, Kohya, Kobold, Ultimate Vocal Remover, RVC, AI Voice Cloning, who knows what else, all have their own separate ~5GB torch install
kill me
>>
>>102417912
>RVC, AI Voice Cloning
please link what youre using, been around the block with ai voice and stopped because the 10gb for shit audio wasnt worth it
>>
>>102417912
fix your pip cache then
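for reference, the usual fix looks something like this (a sketch; assumes Linux and a recent pip, and ~/.cache/pip is already pip's default cache location there, so this mostly just makes it explicit):

```shell
# Point every venv at one shared download cache so the torch wheel
# only gets fetched from the network once:
export PIP_CACHE_DIR="$HOME/.cache/pip"
mkdir -p "$PIP_CACHE_DIR"
# Each project still gets its own venv, but the second project's
# `pip install torch` resolves from the cache instead of redownloading.
```

each UI still unpacks its own copy of torch into its venv, so this saves bandwidth, not disk.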
>>
>>102417912
>bloatwrappers, my beloved..
>>
>>102405949
why is controlnet so fucking slow? Is it poor optimization? I even see people with 4090s having issues. Are there any alternatives? I think I tried every optimized model, setting and trick and it still shits itself.
>>
>>102417912
There's no reason why they can't allow a common folder for dependencies.
>>
>>102417912
Saar please git pull saar
>>
>>102417925
desu i haven't touched it in a year
this was the main thing I used
https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI
and then the voice models from "AI Hub" & "AI Hub Brazil" Discord
>>
>>102417967
>why they can't allow a common folder for dependencies.
venvs are designed to avoid having a common folder. I couldn't imagine the tears if comfyanon decided to use some test package and screwed up everyone's shit.
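a quick illustration of the isolation being described here (throwaway /tmp paths, purely illustrative; --without-pip just keeps the demo fast):

```shell
# Two venvs = two fully independent site-packages trees.
python3 -m venv --without-pip /tmp/ui_a
python3 -m venv --without-pip /tmp/ui_b
# Each venv's python reports its own prefix, so packages installed
# into one can never clobber the other:
/tmp/ui_a/bin/python -c "import sys; print(sys.prefix)"   # -> /tmp/ui_a
/tmp/ui_b/bin/python -c "import sys; print(sys.prefix)"   # -> /tmp/ui_b
```

that per-project isolation is exactly why one UI updating its pinned torch can't break another, and also exactly why you end up with five copies of torch on disk.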
>>
Is voice ai any good for ASMR? Any samples mayhaps?
>>
File: tmpzj8a6_rk.png (1.16 MB, 1152x896)
1.16 MB
1.16 MB PNG
>>
File: ComfyUI_00778_.jpg (410 KB, 1792x2304)
410 KB
410 KB JPG
>>
>>102414724
Never happened to me since I only like big fat cocks
>>
>>102417939
Heh
>>
>>102414745
>no one is close to it yet
skill issue
>>
>>102416373
if anything, >>>/aco/aivg should replace it (if it sticks around)
>>
>>102418205
the futa one aligns better with this thread. aivg is entirely cloud.
>>
https://github.com/lschaupp/PuLID_ComfyUI_FLUX
Anyone kicked the tires on this yet?
>>
>>102418227
nvm about the futa thing. Apparently the search is capturing it now.
>>
>>102418235
I have. Might have been doing it wrong, but my test image folder is full of weird oversaturated stuff.
>>
>>102418235
Not seen anyone here try it, only the few on reddit.
>>
>>102418201
Damn
>>
I don’t know fuck all about AI, but I am curious what kinds of tools/programs people like Nyakki and ChocoPizza use for generating anime porn. That shit looks kinda cash.
>>
>>102418377
I've not heard of either of those two before, brb googling
>>
File: flux_052.png (845 KB, 1024x1024)
845 KB
845 KB PNG
>>
>>102418205
We are so close boys
>>
>>102418390
They both have Patreons, and Nyakki goes by @nyakki46873203 on twitx
>>
It's bread time...
>>102418421
>>102418421
>>102418421
>>
>>102417366
>In a detailed anime digital painting style, stark modern room Asuka Langley,<lora:asuka_v3_000002600:1> sits in a sleek metal chair, with abstract art on wall under a dramatic painting resembling a distorted sun. She wears a chic black jacket with metallic accents, her auburn hair complemented by high-fashion sunglasses. Her posture is relaxed but assertive, suggesting deep contemplation. The room, minimalist with sharp contrasts, casts shadows that add to the introspective ambiance.<lora:RetroAnimeFluxV1:0.8> <lora:kodak_800_flux:0.35>
my forge is fucked up and saves prompts wrong and I'm too lazy to try and fix it. The kodak lora is pure kino, just need a really low strength or it will overpower everything.
All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.