/g/ - Technology






File: tmp.jpg (858 KB, 3264x3264)
858 KB
858 KB JPG
Discussion of free and open source text-to-image models

Previous /ldg/ bread : >>102074644

>Beginner UI
EasyDiffusion: https://easydiffusion.github.io
Fooocus: https://github.com/lllyasviel/fooocus
Metastable: https://metastable.studio

>Advanced UI
Automatic1111: https://github.com/automatic1111/stable-diffusion-webui
ComfyUI: https://github.com/comfyanonymous/ComfyUI
Forge: https://github.com/lllyasviel/stable-diffusion-webui-forge
InvokeAI: https://github.com/invoke-ai/InvokeAI
SD.Next: https://github.com/vladmandic/automatic
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI

>Use a VAE if your images look washed out
https://rentry.org/sdvae

>Model Ranking
https://imgsys.org/rankings

>Models, LoRAs & training
https://civitai.com
https://huggingface.co
https://aitracker.art
https://github.com/Nerogar/OneTrainer
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts

>Flux
https://huggingface.co/spaces/black-forest-labs/FLUX.1-schnell
https://comfyanonymous.github.io/ComfyUI_examples/flux

>Pixart Sigma & Hunyuan DIT
https://huggingface.co/spaces/PixArt-alpha/PixArt-Sigma
https://huggingface.co/spaces/Tencent-Hunyuan/HunyuanDiT
https://huggingface.co/comfyanonymous/hunyuan_dit_comfyui
Nodes: https://github.com/city96/ComfyUI_ExtraModels

>Index of guides and other tools
https://rentry.org/sdg-link
https://rentry.org/rentrysd

>GPU performance
https://vladmandic.github.io/sd-extension-system-info/pages/benchmark.html
https://docs.getgrist.com/3mjouqRSdkBY/sdperformance

>Try online without registration
txt2img: https://www.mage.space
img2img: https://huggingface.co/spaces/huggingface/diffuse-the-rest
sd3: https://huggingface.co/spaces/stabilityai/stable-diffusion-3-medium

>Maintain thread quality
https://rentry.org/debo

>Related boards
>>>/h/hdg
>>>/e/edg
>>>/c/kdg
>>>/d/ddg
>>>/b/degen
>>>/vt/vtai
>>>/aco/sdg
>>>/u/udg
>>>/trash/sdg
>>
Slow Sunday
>>
I'm a brainlet with almost no programming experience, but I thought of a project that would be helpful to me. Is this the correct area to ask about it, or would a different thread be more suitable? It's not for image generation specifically, but AI image detection and I figured databases that are used for AIgen would be useful for it.
>>
>>102079146
maybe you should just ask your question and see if anyone answers instead of pussyfooting around it
>>
File: 00005-1583210901.png (2.04 MB, 1024x1440)
2.04 MB
2.04 MB PNG
>>
File: file.png (1.92 MB, 1024x1024)
1.92 MB
1.92 MB PNG
>>102079104
But how do you create a lora that can use different captions for each of them?
>>102079115
Thank (you). It's obvious you don't own a GPU capable of generating anything, and just want to shit and piss on everything other people do, so good luck on getting a job and getting your head out of your ass.
>>102079146
We're all brainlets here, but we do use image detection.
>>
File: 0.jpg (314 KB, 1024x1024)
314 KB
314 KB JPG
>>
File: file.png (2.49 MB, 1024x1024)
2.49 MB
2.49 MB PNG
>>
File: 1713881644956056.png (360 KB, 787x794)
360 KB
360 KB PNG
>2024
>STILL no real solution to the seam/tiling issue when upscaling
fluxbros...
>>
>>102079265
There's an option on sdupscale
>>
>>102079263
very good
>>
3 in the chamber ready for bake
I really should stop to smell the roses and just gen stuff, this FOMO got to me
>>
File: file.png (2.05 MB, 1024x1024)
2.05 MB
2.05 MB PNG
>>102079280
Thanks. I'm so grateful to that anon who told me not to pussyfoot on GPU prices when I was struggling with schnell, and I just went and got a 3090 and a new PSU.
>>
is there hope for a 12gb poorfag?
i know flux is too rich for me, is there any alternative i can play with?
>>
File: 1711298528895562.png (225 KB, 565x459)
225 KB
225 KB PNG
>>102079273
somehow i'm 99% certain the "seam fix mode" isn't going to fix the fabric type on one end of a garment from looking completely different to the other end, or stitching cutting off randomly like picrel
so again, no real solution
it's a shame because the detail on some of these is SO GOOD, you can even see the individual threads on embroidery
but the tiling sucks so bad, i wish there was a good solution to it
>>
>>102079104
yeah but it wasn't trained like that
>>
>>102075064
Thanks for your help. I got it working after finding out the image folder had to be named a specific way, which i must have missed when i was reading through everything pulling my last hairs out.
Only 8-ish GB being used so i have room to play with DIM and alpha and the other settings, thank you.
Training for 2000 steps (20 images) at 6.55s/it with your tweaks. Maybe it can be improved, i'm just happy it's working.
>>
>>102079349
>went and got a 3090
Way back then I was hesitating between a 16GB and a 3090, and same, anons here told me to go for 3090, and overall I'm glad I did.
>>
>>102079351
Your dick.
>>
>>102079351
It takes some setting up, but SDXL can give you hours and hours of fun.
>>
>>102079364
How do you train it like that? If it was trained for boomer prompting, how will the lora work properly if someone only uses clip?
>>
>>102079146
yes, no, maybe, can you repeat the question
>>
>>102079397
Our daddy taught us not to be ashamed of our dicks.
>>
File: file.png (2.29 MB, 1280x960)
2.29 MB
2.29 MB PNG
>>102079415
>How do you train it like that? If it was trained for boomer prompting, how will the lora work properly if someone only uses clip?
NTA but the best you can do is test it. Here is an image generated by passing this to T5:
>This is an analog postcard photo featuring a beautiful petite brunette wearing a classic 50s nurse outfit. The photo is almost black and white, and very faded out. There is the faint outline of lipstick on one corner of the image.
>>
File: file.png (2.27 MB, 1280x960)
2.27 MB
2.27 MB PNG
>>102079441
Now the same image, still using T5, but with this prompt:
>analog postcard photo, beautiful petite brunette, classic 50s nurse outfit, black and white, faded out, outline lipstick on corner
>>
>>102079441
Use the same prompt only with clip_l.safetensors loaded
>>
File: file.png (2.99 MB, 1280x960)
2.99 MB
2.99 MB PNG
>>102079457
Now using CLIP, with this prompt:
>analog postcard photo, beautiful petite brunette, classic 50s nurse outfit, black and white, faded out, outline lipstick on corner
>>
>>102079146
>inb4 it's furniture guy again
>>
File: file.png (3 MB, 1280x960)
3 MB
3 MB PNG
>>102079473
And finally CLIP with this prompt:
>This is an analog postcard photo featuring a beautiful petite brunette wearing a classic 50s nurse outfit. The photo is almost black and white, and very faded out. There is the faint outline of lipstick on one corner of the image.
lol
>>
File: file.png (3.03 MB, 1280x960)
3.03 MB
3.03 MB PNG
>>102079493
For good measure, the same test with ViT-L14-BEST
>>
File: cp005.jpg (440 KB, 1025x1023)
440 KB
440 KB JPG
>>
>>102079415
you just train with boomer prompts and with both CLIP and T5
don't worry about people using only one of them, you should train as the base model was trained
>>
As you can see, T5 can mostly handle tags, but CLIP can't handle prose.
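If anyone wants to reproduce this split outside a UI: a minimal sketch assuming the diffusers FluxPipeline, where prompt feeds CLIP-L and prompt_2 feeds T5 (the prompts and settings below are just placeholders, not the exact ones from the test above):

import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
)
pipe.enable_model_cpu_offload()  # keeps VRAM use down on smaller cards

image = pipe(
    prompt="analog postcard photo, petite brunette, 50s nurse outfit, faded",            # goes to CLIP-L
    prompt_2="This is a faded analog postcard photo of a petite brunette wearing a classic 50s nurse outfit.",  # goes to T5
    guidance_scale=3.5,
    num_inference_steps=28,
).images[0]
image.save("clip_vs_t5_test.png")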
>>
File: file.png (2.53 MB, 1024x1024)
2.53 MB
2.53 MB PNG
Am I paranoid or does 1024x1024 output significantly better images than any other resolution?
>>102079533
Why did you move one pixel from height to width?
>>
File: 00196-4268935926.png (977 KB, 896x1152)
977 KB
977 KB PNG
>>
File: 7n3ix8do9vgd1.png (2.29 MB, 3258x3242)
2.29 MB
2.29 MB PNG
>>102079505
Got similar results, clip only is not accurate

T5 works well with boomer prompting

Only difference I noticed when using both clip and t5 is with some loras, where the clip makes it recognize tags better

reddit.com/r/StableDiffusion/comments/1elqc3e/just_some_flux_images_it_pays_using_both_the_t5/

>>102079537
I use both, good loras respond to CLIP as a style with tags and T5 as boomer prompt, I want to train a lora the same way, not only with boomer prompting, it's useful when you combine loras to mix styles
>>
if t5 is so good, then why do loras still consume the whole gen
it should know that the person named X is the one to apply the weights to, and no one else unless specified
>>
>>102079613
In order to train a lora to respond like that, do you have to do anything special, or just include tags and prose in your caption?
>>
>>102079594
Have you tried 1920 x 1080?

You use 1024x1024 with lora on?
>>
>>102079662
I was using smaller resolutions in terms of total pixels, so it's probably that.
>>
>>102079613
imagine my disappointment when I found out there is no documentation behind this image. It drives me crazy that I have to push so hard to learn this shit because I am piecing it together from papers I don't understand, reddit, and 4chan posts that need to be triple verified.
>>
>>102079613
>I want to train a lora the same way, not only with boomer prompting
don't
>>
>make a lora of my gf
>gen 1000 studio quality modeling shots of her
>she shows her friends, they get jealous, naturally
>pay me to do it for them too
Guys I'm worried BFL is gonna snipe me because I am profiting off Flux!
jk but do you think I can convince them it works better if they include nudes in the data set?
>>
File: 00136-3956641964.png (401 KB, 359x538)
401 KB
401 KB PNG
lora-trainer master, moi.
>>
>>102079757
>they dont know i can now generate images of them pissing on my face
Doubt
>>
>>102079757
irl south park hobbit episode.
>>
>>102079827
prompt?
>>
Her tongue is extended, touching the tip of the penis, which is circumcised and uncircumcised at the same time, with a visible foreskin.

i love joycap
>>
>>
>>102079880
what does this mean
>>
>>102079880
anon
>>
>>102079827
blue board anon. Delete this and stay a while.
>>
>>102079634
I think a lot of people don't know how to train it properly, they just farm buzz and make slop

>>102079644
idk, still looking into it

https://www.reddit.com/r/StableDiffusion/comments/1enuib1/i_trained_an_anime_aesthetic_lora_for_flux/
>>
>>102079867
>which is circumcised and uncircumcised at the same time
???
>>
>>102079912
:(
>>
>>102079634
it should be possible, but apparently nobody has figured it out yet, or they're rushing out slop to farm buzz as >>102079913 says
>>
>>102079827
TJD
Who's in the office on a Sunday
>>
>>102079827
How??
Lora again?
>>
>>102079827
graped
>>
>>102079971
Yes it's a Belle Delphine lora from Civitai https://civitai.com/models/679858/belle-delphine-fluxdev
>>102079986
i deleted it myself
>>
File: 00009-3702331730.jpg (146 KB, 1040x1176)
146 KB
146 KB JPG
Dope Cow energy drink™ with patent pending ProColostrum©
>>
>>102079634
because people are still training only on closeups and/or single subject images
many use meaningless "trigger words" that T5 won't associate with a person until it is hammered in, or use just "woman" which is utterly retarded
all of these are responsible for Flux loras still affecting all faces in the image
>>
>>102080001
oh ok thanks anon
>>
When you use a LoRA, upload some of your gens to the LoRA page so the creator gets free money.
>>
>249/4000 steps 6hrs remaining
Bros I'm excited
>>
>>102080050
It's gonna look like shit
>>
>>102080054
Why?
>>
>>102080050
how do anons keep working? I find most of the past year has been waiting for shit.
>>
>>102080059
Because I am trying to demoralise you.
>>
>>102080062
I'm a lifelong ideaguy for a dream game that may never be made, so being able to proompt my ideas into quasi-reality always feels rewarding
>>
>>102079827
Does anyone know how long desuarchive is going to not be saving images for?
>>
>>102079594
Just 1024 then upscale
>>
>>102080090
how are you managing to multitask game dev when genning images? I find myself just waiting for the images to be complete rather than completing my code.
>>
File: ComfyUI_33005_.png (887 KB, 720x1280)
887 KB
887 KB PNG
>>
File: FLUX_00452_.png (606 KB, 768x768)
606 KB
606 KB PNG
>>102079137
bank holiday, innit
>>
>>102079096
Holy shit that risitas looks so damn good.
>>
File: ComfyUI_Flux_0252.jpg (1010 KB, 1536x2688)
1010 KB
1010 KB JPG
>>
>>102079963
>How did you caption your dataset ? Is it a list of danbooru-style tags or is everything described with natural language ?
Both

>Great job! How many images did you use ?
Like only 700 since I didn’t know it would work. Will use a much bigger dataset in the future.
>>
>>102080205
nice
>>
File: 2024-08-25T222043.249.jpg (151 KB, 1024x1024)
151 KB
151 KB JPG
>>102080163
Monday here. Should I pull a sicky and gen all day? I can say I have diarrhoea. Nobody questions diarrhoea.
>>
>>102079850
> jew cam whore kneeling in the desert with big ass, wearing boots, red hat
>>
File: 2024-08-25_00402_.jpg (1.6 MB, 3840x2160)
1.6 MB
1.6 MB JPG
>>
>>102080323
I don't think that's it, anon. lying is not nice.
>>
>>102079398
any guides?
>>
File: FFLUX_00483_.png (947 KB, 1024x1024)
947 KB
947 KB PNG
>>102080286
just gen while you work, most of it is waiting anyway
>>
Anyone got some wildcard ideas? I just got chatgpt to make me some
>>
>>102080386
The problem is teams and remote desktop is a cunt to my gens.
>>
File: 2024-08-25_00403_.png (1.02 MB, 1280x720)
1.02 MB
1.02 MB PNG
>>102080323
nice prompt
>>
>been genning non-stop all month
>power bill is $32 less than last month where I wasn't
they just make this shit up on a whim don't they.
>>
>>102079392
where i live a 3090 is the price of two 4060ti 16gb... which would net me 32 gb instead
so i am saving for a second one
>>
>>102080478
Anon that's not how this works.
>>
>>102080478
you can't split an image model between two GPUs tho
>>
>>102080478
I don't think you can still pool multiple GPUs for image gen, at best you can put the model on one and the clip on the other etc, but it's not as good as a full 32GB of VRAM.
>>
>>102080491
why
why cant just buy multi gpu motherboard and profit?
>>
>>102080498
*yet, not still
>>
>>102080498
>>102080497
god this sucks so damn bad
>>
>>102080509
yeah sorry
>>
>>102080503
For this >>102080498 reason.
Nvidia also stopped doing SLI
>>
>>102080478
this >>102080491 >>102080497 >>102080498
you will bite yourself in the ass with a useless 4060ti
>>
I guess no one is working to make pooling a thing ?
>>
>>102080533
i wouldn't call a 4060ti useless.... just not future proof
>>
>>102080524
* for consumer cards at least, but even with NVLink (new name for SLI), torch does not work on two cards anyway (3090 still has SLI/NVLink but still can't use two 3090s)
>>
>>102080533
Not a useless 4060ti, TWO useless 4060tis
>>
>>102080435
get some old computer from ebay and use that as your dedicated work machine if they didn't give you one

>>102080462
i have a smart meter that tells me electricity consumption in watts for the whole house in real time, really handy
>>
File: 2024-08-23_00269_fp16.png (1.07 MB, 1024x1024)
1.07 MB
1.07 MB PNG
>>102080546
you know only VRAM counts anon.. everything else is pointless

>VRAM VRAM VRAM VRAM VRAM VRAM
>>
civit is weird they consider this an R rated image
>>
File: 2024-08-25_00405_.jpg (1.85 MB, 3840x2160)
1.85 MB
1.85 MB JPG
meanwhile, while the VRAMlets were squirming, my 4090 made this
>>
What's the least I need to use high guidance (or cfg?) but maintain a fast gen speed?
Gens are either very slow or not adhering to the prompt.
>>
>>102080533
>a useless 4060ti
the next 16GB NVIDIA card is double the price, okay??? I bought what I could!
>>
>>102080592
Looks better than your previous upscales.
>>
>>102080592
well, it's alright, but hardly remarkable or stunning in quality. kinda grey and dull.
>>
File: 0.jpg (272 KB, 1024x1024)
272 KB
272 KB JPG
>>
File: FFLUX_00498_.png (872 KB, 1024x1024)
872 KB
872 KB PNG
>>102080580
yeah their dumb ai classifier flags every male who shows a bit of shoulder with the PARTIAL NUDITY tag
>>
File: 2024-08-25_00017_.jpg (1.18 MB, 2560x1440)
1.18 MB
1.18 MB JPG
>>102080608
ya the more crowded the content the better the upscales are, its pretty bad at upscaling areas of plain or gradient color.. I guess it just itches to make something there
>>102080612
I sadly agree, but I guess its the simplicity of the prompt .. its just "intergalactic war"
>>
File: 2024-08-26T085036.343.jpg (219 KB, 1024x1024)
219 KB
219 KB JPG
>>102080612
>>
currently using:
>--pretrained_model_name_or_path D:/flux1-dev.sft
when training, can I instead use:
>flux1-dev-fp8.sft
to save VRAM when generating?

how many steps is typical? got my first lora with 1100 steps, and it looks real nice but seeing some butt-chin still, main issue though is generation speed, currently about 2 minutes per image
>>
File: 2024-08-25_00407_.jpg (1.86 MB, 3840x2160)
1.86 MB
1.86 MB JPG
>>102080663
kek
>>
File: ComfyUI_02147_.png (1.4 MB, 1024x1400)
1.4 MB
1.4 MB PNG
>>
File: ComfyUI_02154_.png (1.58 MB, 1416x1120)
1.58 MB
1.58 MB PNG
>>
File: ComfyUI_02156_.png (1.61 MB, 1416x1120)
1.61 MB
1.61 MB PNG
>>
>>102080592
I have 24 GB but if I had to choose between acting like a cunt and being a vramlet, I'd pick the latter
>>
File: formerly Abbey.png (1.12 MB, 800x800)
1.12 MB
1.12 MB PNG
>>
>>102080690
>>102080705
>>102080721
>lets hit the image limit faster by spamming my single-prompt image
>>
>>102079351
>12gb

It works fine with 12Gb, do some research
>>
File: ComfyUI_02155_.png (1.57 MB, 1416x1120)
1.57 MB
1.57 MB PNG
>>
File: FluxDev_03031_.jpg (238 KB, 1024x1024)
238 KB
238 KB JPG
it's not the same without the sweaty skin and sultry look
>>
i love ldg
>>
File: 2024-08-25_00409_.jpg (2.23 MB, 3840x2160)
2.23 MB
2.23 MB JPG
let's dine in the Kuiper belt in a grey and dull atmosphere!

>>102080753
as if we ever hit the image limit.. I think we did once in the past two days
>>
File: 00000-610985329.png (1.53 MB, 896x1152)
1.53 MB
1.53 MB PNG
>>
>>102079351
>i know flux is too rich for me, is there any alternative i can play with?

Yes, here are the smaller versions of flux.
https://huggingface.co/city96/FLUX.1-dev-gguf/tree/main
https://huggingface.co/city96/t5-v1_1-xxl-encoder-gguf/tree/main
>>
>>102079356
What resolution do you want to upscale to?
>>
>>102080599
whenever you need help
ask in /sdg/ instead
>>
>>102080535
there is a comfy node that can pool. It doesn't change that you can't do math across two cards efficiently. I would think you could beat a single safetensor with a split t5 and gguf across two cards. I haven't seen anyone post trying it so I am guessing.

>>102080599
cfg does not change speed in a significant way.

>>102080842
I moved here. There is nothing of value left.
>>
File: 00004-2810829831.png (1.7 MB, 1152x896)
1.7 MB
1.7 MB PNG
>>102079351
https://github.com/lllyasviel/stable-diffusion-webui-forge/discussions/1050

Read up on how to set up flux here, I have 12GB and I'm doing fine.
>>
File: 2024-08-25_00411_.jpg (1.73 MB, 3840x2160)
1.73 MB
1.73 MB JPG
>>102079356
You know what? I tested with seam fix modes, bandpass, half tile, half tile + intersection for hours in frustration .. but all the upscales you see here I did with flux are "seam fix mode = NONE" .. it just works without it, as long as the rest of the settings are right
>>
File: 00268-383083857.png (1.74 MB, 968x1728)
1.74 MB
1.74 MB PNG
>>
File: ComfyUI_02169_.png (1.8 MB, 1416x1120)
1.8 MB
1.8 MB PNG
>>102080753
>>
>>102079356
these are the settings I use for 1280x720 -> 4k
>>
File: ComfyUI_postColor_00872_.jpg (2.85 MB, 3192x2184)
2.85 MB
2.85 MB JPG
>>102080685
nice gen
>>
>>102080875
>9 minutes to gen a response
>>
>>102079351
I gen on a 12GB but train lora's on google colab
>>
>>102080862
>there is a comfy node that can pool. It doesn't change that you can't do math across two cards efficiently. I would think you could beat a single safetensor with a split t5 and gguf across two cards. I haven't seen anyone post trying it so I am guessing.
Only nodes I've seen were to assign cuda:0/cuda:1 to different nodes.
>>
File: ifx242.png (1.22 MB, 1024x1024)
1.22 MB
1.22 MB PNG
>>
File: 0.jpg (306 KB, 1024x1024)
306 KB
306 KB JPG
>>
File: 00118-108369158.png (2.41 MB, 1824x1024)
2.41 MB
2.41 MB PNG
>>
File: ComfyUI_Flux_0257.jpg (1.2 MB, 3840x2048)
1.2 MB
1.2 MB JPG
>>
File: ComfyUI_01077_.png (2.98 MB, 2304x1792)
2.98 MB
2.98 MB PNG
baked a new lora with the latest sd-script updates, used "split_qkv=True" and "timestep_sampling shift". Usually my loss rate is around .4 but this time was at .08; not really sure what that indicates ultimately but the lora turned out pretty good
>>
>>102080909
nta but it's understandable.
>>
>>102080975
is that the queue?
>>
File: 1709378132078506.png (58 KB, 928x263)
58 KB
58 KB PNG
Uhhhh bros when is the loss supposed to start going down?
>>
File: 2024-08-25_00415_.jpg (1.33 MB, 3840x2160)
1.33 MB
1.33 MB JPG
>>102080903
ty
>>102080955
cool way of making an XY plot with tiled upscale .. actually a pretty cool result
>>
>>102080925
that is basically it. Loading take a bunch of time so if you are swapping out your clip and your model I think you could "win".

Swarm is claiming:
The "Swarm" name is in reference to the original key function of the UI: enabling a 'swarm' of GPUs to all generate images for the same user at once (especially for large grid generations)

Then he is using C# and claiming the GIL is some unbreakable obstacle. The dream keeps dying because of skill issues. The smart people are getting paid by the mega corps who don't want to solve this issue.
>>
>>102079634
that's not how neural networks work, no matter the context, loras fundamentally rape the weights
>>
>>102079146
okay, my question is, I want to make a live deepfake using my webcam to help with my body dysphoria, i want to look at my own body in real time but with a beautiful deepfaked vagina because I'm truly a woman deep down. I don't know if it's possible but I bought a 4090 to try to generate this digital vagina. if it works I will buy another 4090 to give me breasts
>>
>>102081016
On a fully trained model loss stays low forever, most of the progress will be seen in the sample images.
>>
>>102080862
>CFG
That's not even supposed to be used with dev anyway
>>
>>102081036
d*bo?
>>
>>102081024
I thought maybe the release of Flux would push them to do something since it's the first local image model that requires so much memory, but it seems no one cares sadly.
>>
>>102080862
>cfg does not change speed in a significant way.
CFG drops the speed in half, what the fuck are you talking about
>>
>>102080991
Yes
>>
File: ComfyUI_01080_.png (1.52 MB, 1152x896)
1.52 MB
1.52 MB PNG
>>
>>102081036
the return of furniture guy

>>102081044
You are correct. I shouldn't mix up guidance and CFG. It is confusing.
>>
>>102081058
why
>>
>>102081043
how can a model exhibit a low loss on things that it does not know how to depict yet?
how do the optimizers know how to steer the training without a loss metric?
>>
>>102081056
if you mean guidance then yes
is it supposed to be prevented with "skimmed cfg" or "autocfg"? so many options right now
>>
>>102081083
because of this >>102079757
I queued up 300 of them.
>>
>>102081114
thats a nice thing to do for your gf anon desu, if I had a girlfriend I would make a lora to generate a younger, more attractive girlfriend no doubt
>>
>>102081093
it depends on how the loss is calculated, but more or less it's a comparison of the expected pixels vs the generated pixels. Every image is, more or less, going to be within the ballpark of correctness; a fully trained model is basically going to generate images that are mostly correct even on a new concept
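NTA, but for intuition, the loss usually has this simple shape; toy sketch only, and in practice diffusion/flow models compare predicted noise or velocity in latent space rather than raw pixels:

import numpy as np

def mse_loss(target: np.ndarray, predicted: np.ndarray) -> float:
    # mean squared error over every value in the tensor
    return float(np.mean((target - predicted) ** 2))

# once the model is decent its predictions land close to the target on most inputs,
# so this number stays low even while the sample images keep visibly improving
target = np.random.rand(64, 64)
predicted = target + 0.05 * np.random.randn(64, 64)  # small residual error
print(mse_loss(target, predicted))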
>>
>>102080899

Thanks anon
>>
File: 2024-08-25_00419_.jpg (1.2 MB, 3840x2160)
1.2 MB
1.2 MB JPG
>>
>>102081138
She's actually older and uglier but she's going through some shit so I am making the pics seem more attractive than she is irl.
>>
>>102080927
based
>>
>>102081107
no I mean CFG, that's what it has always been called
with Flux Dev, "Guidance" means the learned distilled CFG value; no one was calling CFG "Guidance" before Flux Dev
>>
File: file.png (610 KB, 512x512)
610 KB
610 KB PNG
1.3B fine art
>>
>>102081212
worth the weight
>>
File: comparison.png (708 KB, 2560x512)
708 KB
708 KB PNG
testing a comparison of different ratios of my two soft vs hard pixel art LoRAs. i think a 50/50 mix (middle) looks best
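(if anyone wants to script that ratio sweep instead of wiring it up in a UI, here's a rough sketch using diffusers' multi-adapter LoRA support; the lora filenames, adapter names and prompt are made up)

import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
).to("cuda")
pipe.load_lora_weights("pixel_art_soft.safetensors", adapter_name="soft")  # hypothetical file
pipe.load_lora_weights("pixel_art_hard.safetensors", adapter_name="hard")  # hypothetical file

for soft_w in (1.0, 0.5, 0.0):  # roughly left / middle / right of the comparison
    pipe.set_adapters(["soft", "hard"], adapter_weights=[soft_w, 1.0 - soft_w])
    img = pipe("pixel art wizard in a blue robe", num_inference_steps=28).images[0]
    img.save(f"mix_soft_{soft_w:.1f}.png")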
>>
>>102081230
Yes, this model will beat pony for horny.
>>
File: 2024-08-25_00421_.jpg (1.2 MB, 3840x2160)
1.2 MB
1.2 MB JPG
>>102081237
I agree the middle one is the best.. although the robe on the far right one is damn cool
>>
File: SD15__00021_.png (334 KB, 512x512)
334 KB
334 KB PNG
>>102081212
Don't forget how much of a revolution 1.5 was
>>
>>102081016
>>102080964
For 250 images are 10000 steps good? How many epochs?

https://www.reddit.com/r/StableDiffusion/comments/1enuib1/i_trained_an_anime_aesthetic_lora_for_flux/

This guy trained it for 1$ on runpod
>>
>>102081281
It's over 2x the size of Pixart Sigma, and Sigma was at the same level or better than SDXL.
>>
local video gen on par with kling/luma when
>>
File: 174320_00001_.png (2.07 MB, 944x1408)
2.07 MB
2.07 MB PNG
>>102081302
Pixart S wasn't enough of an improvement over base XL and nowhere near the finetunes so nobody gave a shit about it.
Underrated model though and I wish it had taken off like Flux has.
>>
>>102081337
Kling and Luma are not on the same level
and the best local video models are below ground in comparison to just Luma
>>
>>102081347
We'll see at the end of the day what the end result of my model looks like. It's promising though and it'll be my own. Pixart did a lot with 600m so I think my 1.3B model should hit a sweet spot for quality and comprehension. But if not, that's okay because it's still trained on a dataset of stuff I care about.
>>
File: 1700906504075.png (227 KB, 1472x573)
227 KB
227 KB PNG
Hey homos, ever hear of frequency domain filters?

http://meesoft.com/Analyzer/help/FrequencyDomainFilter/FrequencyDomainFilter.html

Check your images with one and see if there's any weird patterns.
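If you'd rather script it than use that tool, a quick numpy/PIL sketch of the same idea (the filename is a placeholder):

import numpy as np
from PIL import Image

# 2D FFT of a grayscale copy; periodic artifacts (tiling seams, grid patterns)
# show up as bright, regularly spaced spikes in the shifted spectrum
img = np.asarray(Image.open("your_gen.png").convert("L"), dtype=np.float32)
spectrum = np.fft.fftshift(np.fft.fft2(img))
magnitude = np.log1p(np.abs(spectrum))  # log scale so the peaks are visible
magnitude = (255 * magnitude / magnitude.max()).astype(np.uint8)
Image.fromarray(magnitude).save("spectrum.png")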
>>
>>102081285
I'm doing 68 images, 4000 steps, will cost me about $4 on colab, need to start paying civitai instead lol
>>
I haven't been following the AI gen scene for about a year and I want to start genning again.

Should I let my A1111 update or will it break stuff?
>>
>>102081443
lmao
>>
>>102081443
Use Forge now
>>
>>102081443
just update dude
>>
File: 2024-08-25_00425_.jpg (1.27 MB, 3840x2160)
1.27 MB
1.27 MB JPG
>>102081443
will be fine.. all will be fine...

heck you better just make a new install and keep your old one, nothing will be the same, also a1111 kinda is abandonware these days

if you like the way it worked try forge or sd.next
>>
File: 1696507663277717.png (405 KB, 640x480)
405 KB
405 KB PNG
learning inpainting, anything I'm doing obviously wrong?
>>
File: 1722288925082593.png (371 KB, 640x449)
371 KB
371 KB PNG
>>102081492
>a1111 kinda is abandonware these days
Wow things have been moving fast I see.

>>102081484
Will look into it, thanks.
>>
>>102081492
>also a1111 kinda is abandonware these days
wtf no
it's shit but not abandoned
>>
>>102081513
I said "kinda" .. the update cycle is sluggish as hell
>>
File: cfg_compare.png (429 KB, 2531x1367)
429 KB
429 KB PNG
>>102081056
what am I missing. I cranked the settings and yes it is slower, but nothing crazy.

RUN 1 settings - top left
RUN 2 settings (live) bottom left
timestamps for a 4 batch count right - top right
terminal - bottom right
>>
>>102081525
ok then
wonder what game is taking all the dev's time again
>>
I've only used euler beta for both dev and flux.
what should I use instead?
>>
>>102081443
Don't update, it will definitely break stuff. I have like 4 different A1111 installs. But Comfy/Forge are probably better now anyways
>>
>>102081503
Forge is literally the same thing UI wise but faster and can run flux (the only relevant model right now). If you want to run older models use reforge, an anon's pet project that's a fork of old forge with some quality updates. Either way, a1111 is ass
>>
>>102079351
Subscribe to my patreon and I will tell you.
>>
File: SD3_13624_00062_.png (1.73 MB, 1024x1024)
1.73 MB
1.73 MB PNG
>>102081302
>>102081347
>>102081384
By this logic then 2.5b SD3.1 can beat Flux
>>
>>102081582
No, because unlike SD my model is uncensored from the ground up.
>>
>>102081588
That doesn't sound very safe of you, Anon.
>>
>>102081602
It's a very unsafe model.
>>
File: 00019-4291639819.png (1.04 MB, 832x1216)
1.04 MB
1.04 MB PNG
>>
>>102081613
Ground up or fine tune?
>>
>>102081546
what's with the time calculations, just show us the s/it
>>
pc just shit the bed real fucking hard, would not boot
was kinda hoping it was my gpu so I could return it for a refund, but no, it was somethin stupid
>>
File: 1708915888676116.png (786 KB, 1666x1019)
786 KB
786 KB PNG
>>102081496
my inpainting workflow if anyone wants to critique my settings. i'm a total noob at this.
>>
>>102081588
Is this fantasy or are you actually training a model?
Do you have any documentation for it? I'm interested in doing the same.
How much did the hardware cost, 50k?
>>
>>102081636
My critique is you are doing it in comfy and it's fucking annoying to do inpainting in comfy
>>
>>102079634
Because the model is unfamiliar with the concept. You need a proper finetune model for the lora to work in a flexible manner. A lot of loras are also poorly trained, adding to the inflexibility.
>>
>>102081636
my eyes
>>
File: 2024-08-25_00429_.jpg (1.06 MB, 3840x2160)
1.06 MB
1.06 MB JPG
>>102081636
I rate it:
>fancy group colors/10
>>
File: file.png (666 KB, 512x512)
666 KB
666 KB PNG
>>102081629
It's from the ground up, initialized from the void. Basically Pixart Sigma except with 1.3B parameters, and I'm using Adam Mini as the optimizer to train it; it's better than 8 bit Adam because it stays in FP16. (Flux could probably be trained the same since it's a transformers model)

>>102081646
@MODELS.register_module()
def PixArtMS_XL_2_Next(**kwargs):
    return PixArtMS(
        depth=42,
        hidden_size=1408,
        patch_size=2,
        num_heads=16,
        **kwargs
    )


kv_compress = True
kv_compress_config = {
    'sampling': 'conv',  # ['conv', 'uniform', 'ave']
    'scale_factor': 2,
    'kv_compress_layer': [14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41],
}
>>
>>102081657
What's different about doing it in Comfy? Normally I'd just use Florence for masking but it kept messing up and I had to do it manually.
>>
>>102081630
20.34it/s to 11.68it/s
It took some insane settings to get there. If there are working gens at 20 distilled CFG and 20 scale I am curious and please share. Everything I produced was mostly black unusable trash.
>>
>>102081671
>>102081646
It's trained on two 4090s on a single computer. About two months of training so far and a dataset of about 5 million images.
>>
>>102081695
it's a cp model isn't it
>>
>>102081706
no but I do want it to gen all my favorite celebrities nude :)
I also care a lot about art so it has a lot of fine art in it
>>
File: ComfyUI_temp_uvpjp_00024_.png (3.58 MB, 1360x1600)
3.58 MB
3.58 MB PNG
>>
File: 00022-4291639822.png (1.19 MB, 832x1216)
1.19 MB
1.19 MB PNG
>>
>>102081623
nice
>>
>>102081671
>>102081695
If this is the output it's producing after 2 months I guess we will see you in a year.
>>
>>102081680
>20.34it/s
what fucking enterprise card are you using?
anyway, that's a (near) doubling of gen time as you can see
You're not supposed to put distilled at 20CFG, also on ComfyUI running CFG requires special nodes to keep it from burning and I don't know if Forge has that logic builtin
>>
>>102081733
Progress is exponential, it takes a long time for a model to learn the basics of generating color.
>1girl, holding_animal, animal, dog, holding, solo, faux_traditional_media, fine_art_parody, traditional_media, realistic, painting_medium, sky, upper_body, black_hair, looking_to_the_side, closed_mouth, brown_hair, long_sleeves, rating_general, fine_art, artist_joshua_reynolds, portrait, style_rococo
For the prompt, it's actually 80% there.
>>
>>102081756
>it's actually 80% there.
anon...
>>
>>102081769
yes anon, I know all you know is burned out 1girl digital illustrations
>>
>>102079662
i have tried 1920x1080 with the anime screenshot lora and it doesn't seem any better than 1024x1024 and has 2x longer gen time
>>
>>102081791
it's your electricity, anon
or is it
>>
>>102081738
it is a near doubling and Forge doesn't have protections as those are stupid settings. I think if I have to go that hard on those settings then it isn't significant. I'll agree that is on the list of things that slow things down and stupid settings will hurt you.

I am running GGUF so that might be why it is a weird it/s. I hate using it/s because it does vary wildly on the model choice and the environment if anons are comparing.
>>
>>102081756
See you in 2 weeks with your dog shit model then. Look forward to the monsters it produces.
>>
File: 00026-2746889373.png (1.16 MB, 832x1216)
1.16 MB
1.16 MB PNG
>>102081727
cheers, using a new Princess Zelda lora, it seems quite versatile
>>
>>102081816
>I think if I have to go that hard on those settings then it isn't significant.
Anon, the CFG value doesn't scale the speed gradually; anything other than 1.0 just doubles gen time
>>
>>102081824
I really don't get what you get out of this. Is this some sort of projection of a helpless individual who does nothing who is threatened by people who do things?
>>
>>102081816
>I am running GGUF so that might be why it is a weird it/s
you just cannot have 20it/s with the resolution shown in the screenshot with Flux, GGUF doesn't change anything
also the screenshot shows 3.77s/it so what gives
>>
>>102081756
Do you have any place where you post updates or something? If it's an actual uncensored model I'm interested, too many corpo shitters with only profit in mind running ai right now.
Hypothetically, if you made the training data available as open source, would I be able to continue training the model, adding more stuff to the dataset?
>>
>>102081851
To discourage you Anon.
>>
>>102081851
>For the prompt, it's actually 80% there.
We're just all laughing at your delusion.
>>
This lykon guy is funny, constantly talking all kinds of shit about flux when his model can't even make a coherent woman.
>>
File: high_it.png (202 KB, 1279x833)
202 KB
202 KB PNG
>>102081846
weird, what I said here >>102080862

I ran the test and hopefully am smarter. Possibly not.

>>102081885
I could be an idiot. I scrolled back, apparently not enough. I jumped to a sd1.5 model after I made that screenshot.
>>
>>102081913
is he a fanboy or part of the SAI staff
or both I guess
>>
>>102081899
I only post updates here. I won't be releasing my dataset since a lot of it was sourced with blatant disregard for where the images come from, but you can do something similar by scraping duckduckgo images on a wide range of topics. Same with scraping gelbooru and other boorus.

Other datasets used:
- https://huggingface.co/datasets/tomg-group-umd/pixelprose
- https://huggingface.co/datasets/CaptionEmporium/coyo-hd-11m-llavanext
>>
there's some image sets behind paywalls on getty images, i want to train a lora on one of them, am i totally fucked? could i get AI to remove the watermark from the preview and then train on that, the previews themselves are pretty high quality/res and they only have a single stamped watermark
>>
So that promise to speed up SDXL gens significantly with nf4 was a lie?
>>
I'm convinced there is something wrong with captioning in LoRA training for Flux, which makes sense since text encoder training is currently disabled when training LoRAs for it.
And before someone says
>Go ahead and try training t5 retard
I'm talking about the clip model.

Until that function is restored, I don't think adding unique tokens for different concepts in a LoRA is functional.
>>
File: 2024-08-26_00002_.jpg (1.31 MB, 3840x2160)
1.31 MB
1.31 MB JPG
>>
>>102081912
I really don't care what you think anon. People like you hate people doing things. When you say we you actually mean yourself, because you have no friends. And you know why? Because when you had friends you shat on them whenever they tried something new. You're a boring, sad person.
>>
>want new gpu
>can't afford new gpu
might as well just kill myself when life is this hard
>>
File: FFLUX_00564_.png (2.02 MB, 1024x1024)
2.02 MB
2.02 MB PNG
>>102081913
i mean what did you expect, his credentials were basically clicking two buttons in the UI and merging other people's models
his shitty ass same1girlface model, dreamshaper, is dogshit
>>
>>102081948
>could i get AI to remove the watermark
People have completely forgotten about Photoshop
>>
File: ComfyUI_01094_.jpg (967 KB, 1792x2304)
967 KB
967 KB JPG
>>
File: taylorsphinx.png (1.81 MB, 1018x1018)
1.81 MB
1.81 MB PNG
>>
>>102081976
you care enough to respond, we'll keep laughing at your delusion
why can't you admit it is not anywhere near "80% there"
>>
File: 00013-2463479091.png (2.04 MB, 1152x1536)
2.04 MB
2.04 MB PNG
>>
>>102082028
>we
>>
>>102082036
why can't you admit it, anon
>>
>>102081948
iopaint can do batch eraser using a coordinate

>>102081990
that's AI anon. It is ALL AI!!! More importantly if he is stealing images he probably isn't paying for photoshop. I assume the piracy of PS is difficult at this point.

>>102082036
ignore it anon. There is a touched individual here assigning all sorts of traits to everyone. It really isn't worth it since he says the same shit every time.
>>
>>102082028
>anon actually puts in effort to train something himself
>he is ridiculed
what have you done anon other than smashed the gen button on 'girl booba with huge tits'
>>
>>102082033
nice one
>>
>>102082074
he is ridiculed for being delusional
>what have you done anon other than smashed the gen button on 'girl booba with huge tits'
but girl booba with huge tits is all I want, what I have or have not done doesn't change the fact the guy is delusional
now stop sucking his cock
>>
File: liter flux dev 16 dimm.jpg (485 KB, 1024x3072)
485 KB
485 KB JPG
>>102081971

Token activation is functional in Flux Dev. 1 single LoRA. Danbooru-style tagging only. 16 dim/alpha

>pic related
>>
File: img2img.png (1.53 MB, 1349x786)
1.53 MB
1.53 MB PNG
>>
File: 2024-08-26_00010_.jpg (1.18 MB, 3840x2160)
1.18 MB
1.18 MB JPG
>>
>>102082137
anon do you ever wonder why you don't have friends and you have to say "we" here as if you do? that's some social parasite shit, get help
>>
>>102082209
>deflecting this hard
just admit it is not "80% there"
>>
File: FFLUX_00570_.png (1.12 MB, 1024x1024)
1.12 MB
1.12 MB PNG
>>102082195
can i get a lap dance and an nvidia H100 on the side?
>>
>>102082209
also there is no "we" in that comment, you monumental retard
>>
File: file.png (1.28 MB, 1024x1024)
1.28 MB
1.28 MB PNG
>>
>>102082175
all 3 look shit
>>
>>102082247
>double post seething
Looks like I found the winner. You were so happy to say we earlier when the truth is no one really likes you and if you posted hanging around here, no one would miss you. You say "we" as if you're part of the group, you're not. You're a parasocial hang around loner.
>>
>>102082275
>more deflection
why can't you admit it is not "80% there"?
>>
File: FLUX_00079_.png (1.23 MB, 896x1152)
1.23 MB
1.23 MB PNG
>>
File: 00036-4241309416.png (1017 KB, 1216x832)
1017 KB
1017 KB PNG
>>102082033
Love this one, nice details
>>
what's the criteria for 80%? how are we quantifetterancing this?
>>
File: 2024-08-26_00019_.jpg (1.29 MB, 3840x2160)
1.29 MB
1.29 MB JPG
>>102082246
>Coming up, Sir. Get a seat. Enjoy the the queue.
kek
>>
>>102082285
Anon, at the end of the day, you are seeking my approval.
>>
>>102082313
sounds normal to me
now, why can't you admit it isn't "80% there"?
>>
>>102082303
Well he probably thinks the end result looks like oversaturated Dreamshaper 1girls.
>>
Which model can I download on my own pc that lets me input my own pictures and videos so I can output pictures that are me but not actually me?
>>
is there a way to run FLUX on windows amd?
>>
>>102082343
Yes, ZLUDA

>>102079351
Try GGUF merge
>>
>>102082299
Lora or does Flux know the suit that well?
>>
File: 00039-319277975.png (1.06 MB, 1024x1024)
1.06 MB
1.06 MB PNG
>>102082299
Noice.
>>
File: FFLUX_00577_.png (984 KB, 1024x1024)
984 KB
984 KB PNG
>>
>>102082299
you're repeating images, anon
>>
>>102082343
no. See https://pytorch.org/get-started/locally/

There is a docker option if you are a madman:
https://rocm.docs.amd.com/projects/install-on-linux/en/develop/install/3rd-party/pytorch-install.html

or this >>102082384
>>
File: 2024-08-26_00022_.png (1.14 MB, 1280x720)
1.14 MB
1.14 MB PNG
>>
>>102082399
Lora?
>>
>>102082397
we didn't have loras on the 12th, which is when he posted that the first time
>>
>>102082417
there's finite seeds
what do you want from me
>>
Something about the look of these greasy blondes is very irritating to me.
>>
>>102082436
It's the same filename, anon!
>>
File: 1706882722780039.png (1.25 MB, 960x1280)
1.25 MB
1.25 MB PNG
A1111 still works fine, the new options look nice.

>>102081557
You were wrong.

>>102081488
This Anon was correct.
>>
File: 00001-319277977.png (1.03 MB, 1024x1024)
1.03 MB
1.03 MB PNG
>>102082426
Yeah this one

https://civitai.com/models/682944/princess-zelda-flux-or-dogmaai
>>
>>102082458
Yeah but Forge is way faster

Especially if you want to get into Flux, which is the new main thing now
>>
is it normal that my first attempt at flux loras shit all over my old SD loras, or was I just bad at SD
>>
File: 6 fingers.png (1.73 MB, 1152x1152)
1.73 MB
1.73 MB PNG
>>102082006
is there a reason for this?
>>
>>102082384
>ZLUDA
i opened the github page and it not only says:
>Does ZLUDA support AMD GPUs?
>Certainly not currently, but it might be technically possible
but also that amd told him to remove shit
>>
>>102082479
Yeah it's a lot harder to fuck up Flux loras, hard to overtrain too
>>
>>102082458
>You were wrong.
Does it have good Flux support now?
>>
>>102082479
SD has a shit architecture
>>
File: 1702177294888889.png (1.25 MB, 960x1280)
1.25 MB
1.25 MB PNG
>>102082493
I don't care about Flux.
>>
>>102082479
it's the power of training on top of a solid foundation
>>
File: paladintaylorswift.png (2.28 MB, 888x1167)
2.28 MB
2.28 MB PNG
>>102082483
that does not look like taylor
>>
File: 1701309744949514.png (16 KB, 107x62)
16 KB
16 KB PNG
>>102082504
Clearly
>>
>>102082509
yours sucks too
>>
When is the next big thing coming out? It seems like the visual side of things has stagnated.
>>
File: 00022-2463479091.png (1.81 MB, 1152x1536)
1.81 MB
1.81 MB PNG
>>
File: FluxDev_02908_.jpg (127 KB, 1216x832)
127 KB
127 KB JPG
Mine's perfect.
>>
>>102082528
poo you
>>
>>102082534
Flux video
>>
>>102082137
>>
File: image.jpg (105 KB, 1024x1024)
105 KB
105 KB JPG
>>102082560
it's not very impressive, the progress is not noticeable.
>>
>>102082509
>Taylor Swift after going on a fast food diet
>>
File: 00004-109943326.png (930 KB, 622x1151)
930 KB
930 KB PNG
>>
>>102079880

>>
File: ifx243w.jpg (418 KB, 1600x1600)
418 KB
418 KB JPG
>>
>>102082587
guess we'll see when he finishes training the remaining 20%, should be done in 2 years
>>
Why do I get OOM error when saving a sample/lora in Kohya?
>>
>>102082597
I think flux video will be something fresh because a community will be able to finetune it etc, we will have more control with local flux video vs the website AI video sites.

And by the way Flux video is actually coming
>>
>>102082625
Why do you care again? Do you just hang out here with your "friends" as a social leech?
>>
>>102082642
why can't you admit it is not "80% there"?
>>
>>102082651
Only if you admit you have no friends here or real life and you just hang around like you belong because it's all you have before the sweet release of death?
>>
>>102082665
That has always been true.
Now your turn.
>>
>>102082665
Why do so many people do this? Stick to the topic, it just makes it obvious you can't have a debate on the facts.

Like those twitter fags
>>
>>102082676
It's 80% there. You simply don't understand because you don't know where it's come from, its recent progress, and what the goal is. It's clear you've never trained anything before.
>>
File: taylorfixed.png (1.89 MB, 1152x1152)
1.89 MB
1.89 MB PNG
>>102082528
relax anon, fixed it for ya
>>
File: FFLUX_00605_.png (1.05 MB, 1024x1024)
1.05 MB
1.05 MB PNG
>>102082421
>>
>>102082629
There is zero evidence that it's going to be open weights
You guys did not learn from SD3, did you?
>>
File: ComfyUI_02389_.png (1.3 MB, 960x1344)
1.3 MB
1.3 MB PNG
>>
>>102082683
Because when you talk to someone who is a social black hole it's basically pointless to speak to them outside of reminding them of how unwanted they are.
>>
I took a peek in the SDG thread, and man it's so dead there and the anons are just arguing about nothing.

>>102082545
>>
>>102082685
How dishonorable, anon, you said you'd admit it.
>you don't know where it's come from
it's come from noise
>its recent progress
it's two months of training
>and what the goal is
it is not that image obviously
>>102082734
you just have to admit it is not "80% there" and stop sucking that anon's cock
>>
>>102082734
But you are the one that's acting like someone that's pointless to talk to, you don't try to meet half way, you just keep insulting without sticking to the topic.

NTA, but what makes you say the messed up image is 80% there? Is there an actually logical reason that we don't know about?

Because to most people that looks more like 5-10% or something.
>>
can i setup comfy to iterate through prompts, or even swap out words in prompts, e.g. do 10 generates with a "red" background, 10 with a "blue" background etc
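not the Comfy-native answer, but the word-swapping part is trivial to script and feed into whatever queues your gens; plain-python sketch, the template and word lists are placeholders:

import itertools

backgrounds = ["red", "blue", "green"]
subjects = ["a knight in plate armour", "a 50s nurse"]
template = "photo of {subject}, {background} studio background"

prompts = [
    template.format(subject=s, background=b)
    for s, b in itertools.product(subjects, backgrounds)
]
for p in prompts:
    for _ in range(10):  # 10 gens per variation
        print(p)         # hand each one to your API / queue of choice here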
>>
>>102082752
Prison of their own design.
>>
>>102082699
I bet you were one of the ones that thought Flux wouldn't work on consumer PCs
>>
File: ComfyUI_02277_.png (1.17 MB, 1024x1024)
1.17 MB
1.17 MB PNG
>>
File: taysphinx_.png (2.53 MB, 1432x1432)
2.53 MB
2.53 MB PNG
>>102082539
good work anon
>>
>>102082766
There is no meeting halfway, he started with being a faggot, he has interest in seeing someone succeed, his goal is to turn everyone into the miserable faggot he is. There is nothing constructive happening here, he is an unhappy person who only wants to make others around him unhappy. He didn't need to say anything and nothing he said was remotely of interest or value and is a reflection of the shitty life he has IRL. The pathetic part was his appeal that he actually is part of a group here when in fact he's an undesirable loser that would not be missed if he stopped posting.
>>
>>102082801
Yeah but you can't know which anon said the first comment, another anon is probably asking you to clarify the 80% part.

Anyway forget it
>>
>>102082801
>*GARGLE GARGLE GARGLE*
take that anon's dick off your mouth and just admit it is not "80% there"
or... wait, are you the anon training the model?
>>
File: 1696234605979146.png (62 KB, 472x864)
62 KB
62 KB PNG
ah fug i'm a bit late kek

>>102080826
well, that specific one was 1024x1536 -> 4096x6144, was just curious how much detail it could generate and honestly it's amazingly good (apart from the FUCKING SEAMS)
but i'd even be happy with a simple 2x or 3x upscale, only problem is there's still incoherencies like one pocket looking different than the other because of the tiles, it's awful

>>102080873
that's interesting, because i don't use any seam fixes and mine still come out nasty
i guess it's because a planet might be easier to generate? a mountain is a mountain, if they look different it's kind of natural i guess? as opposed to the specific details on clothes, idk tbdesu

>>102080899
i'll give these a try, not sure if it'll change anything but it can't hurt
picrel were my current settings
>>
>>102082801
clarify your criteria for 80%, how is that worked out mathematically per se?
>>
>>102082819
Anon, at the end of the day, your opinion doesn't matter and I hope you get in a better place someday and have friends you can share positive experiences with. When I close this thread, it's like you don't exist.
>>
File: 00036-4152069828.png (1.97 MB, 1152x1536)
1.97 MB
1.97 MB PNG
>>
>>102082839
that's true for any anon here, retard, no matter how nice they are
just admit it is not "80% there"
>>
>>102082850
how about no and I hope it drives you nuts
>>
File: 00008-1297066943.png (1.1 MB, 896x1152)
1.1 MB
1.1 MB PNG
>>
>>102082839
Well that's true every single comment here is just pixels on a screen.
>>
>>102082862
I'm already nuts.
It just makes you a dick rider for someone you don't even know, and retarded because that's not "80% there" in any way or shape
or are you the anon training the model?
>>
>>102082880
No, there's definitely some people in here who live in this thread (and /sdg/ in particular) and it's not just pixels on the screen. They crave validation but as you already said, this is just pixels so they die of social malnourishment.
>>
>>102082778
No, in fact, I always called out Stability on their bullshit that 8B was not suitable for consumer hardware and "could not be fine-tunable".
>>
File: 00007-1297066942.png (1.13 MB, 896x1152)
1.13 MB
1.13 MB PNG
>>
given that in the next 12-24 months we're gonna be generating video on our GPUs, is anyone else scraping social media for training data? maybe I am too much of a coomer but I ended up writing a python project to rip all the insta/youtube/tiktok/etc/etc profiles i want
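(for anyone doing the same, most of the video sites are covered by yt-dlp; a rough sketch of its python API, with the URL and output path as placeholders)

from yt_dlp import YoutubeDL

opts = {
    "outtmpl": "dataset/%(uploader)s/%(id)s.%(ext)s",  # where downloads land
    "ignoreerrors": True,                              # skip private/removed posts
    "format": "bestvideo+bestaudio/best",
}
with YoutubeDL(opts) as ydl:
    ydl.download(["https://www.youtube.com/@someprofile/videos"])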
>>
Which caption do you think is better?

This digital artwork is a highly detailed exploration of dark fantasy, featuring a commanding, dark-skinned female pirate captain standing boldly on the deck of an imposing, wooden ship. The woman boasts long, cascading white hair and piercing green eyes, complemented by an intense, almost menacing glare. She dons a form-fitting, black corset that accentuates her curvaceous figure, along with a tricorn pirate hat embellished with a captivating, glowing green gem. Her ensemble is adorned with intricate, otherworldly designs in glowing green and black that shimmer and dance with an ethereal energy. The ship's rigging is a complex web of ropes and dark blue, almost black sails, casting an ominous ambiance over the scene. In the background, a turbulent stormy sea churns beneath dark clouds, while a full moon casts an eerie, silvery light on the pirate captain. The wood textures are exquisitely detailed, showcasing their natural grain and wear, while the fabric elements have a sleek, glossy finish. The pirate captain holds her stance with authority, her expression conveying dominance and danger, making the overall mood intense, foreboding, and distinctly erotic. She brandishes firearms with familiarity, their metallic surfaces gleaming and complementing her fierce aesthetic. Her attire includes high-heeled boots that accentuate her powerful stance, and her single bare arm adds a touch of raw vulnerability amidst her formidable appearance. The artwork stylishly blends elements of classic pirate lore with a mystical, fantasy twist, invoking a sense of both allure and peril, bringing to life a world of adventure and mystery.

or
>>
>>102082932
This is a highly detailed, digital artwork in a fantasy genre, depicting a fierce, dark-skinned female pirate captain standing confidently on the deck of a large, wooden ship. The woman has long, flowing white hair, and her eyes are a striking green, accentuated by her menacing expression. She wears a black, form-fitting corset and a tricorn pirate hat, with a prominent, glowing green gem embedded in the brim. Her attire is adorned with intricate, glowing green and black designs, adding a mystical, otherworldly touch. The ship's rigging is visible, with ropes and sails in shades of dark blue and black, contributing to the eerie atmosphere. The background is a dark, stormy sea, with dark clouds and a full moon casting a silvery glow. The texture of the wood and fabric is highly detailed, with the wood showing natural grain and wear, while the fabric has a smooth, glossy appearance. The overall mood is intense and foreboding, with the pirate captain's posture and expression exuding dominance and danger.
>>
>>102082927
Not exactly scraping but I'm more likely to keep a lot of media now, just in case it will be useful in the future for something like that, so kinda.
>>
>>102082779
kino, is that someone you trained a lora on?
>>
>>102082927
>we're gonna be generating video on our GPUs, is anyone else scraping social media for training data?

People barely got the image model working locally, I am very pessimistic about video running locally on anything but top of the line systems, forget training.
>>
>>102082932
Put "world of Warcraft" in negative prompt
>>
>>102082939
This one
>>
>>102082955
It's likely in a couple years we'll have high memory architectures like the Mac Pros that don't rely on GPUs to run AI models with 192GB+ unified memory
>>
>>102082967
interesting, the 1st is using a 70b LLM, the second an 8b
>>
>>102082955
Nothing stays the same, why do you anons never learn this? every single time I've had discussions about technology in the past 20+ years there is always comments like yours, which makes no sense, THINGS ALWAYS CHANGE.

You are thinking from the perspective of today, today is not tomorrow and tomorrow is not a year from now, and a year from now is not 3 years from now.

At one point it will be easily done on home PCs, sooner or later. Let's see what flux video does, then if that's not possible we will see the next until it does work
>>
>>102083003
GPUs have to improve A LOT to make generating videos a realistic thing.
>>
>>>/a/270283991
Who of you fags is working for SHAFT?
>>
>>102083003
Nah, you're being overly optimistic and filtering your bias towards scenarios where there were workarounds and fixes for making the solution work locally. Take a look at /lmg/ where the current meta to run the best local models is going upwards of 4 24gb gpus in a single rig to run at a reasonable quant and speed.

Now when you bring the distant future into the argument, the conversation changes, but we have a rough idea of the hardware that is on the horizon and as far as inference goes, it's not a big step.
>>
>>102083016
Even looking at basic trends it's not outrageous to say we'll have GPUs at the level of H100s in the 6090 series. We also can expect several disruptors from new players because there's too much money in AI for someone new not to get into it.
>>
>>102083016
Of course it's gonna take a lot, but there's always ways to optimise, we start with low res etc

Also videos are basically separate images put together.
>>
>>102082948
No, I just prompted for this specific haircut
>>
>>102083016
Not him, but wouldn't it go a long way if you could just
>stabilize the output generated by a prompt
>small modifications to the prompt don't change the entire output, only the detail that's supposed to change
>generate a dozen of frames or so stop-motion style by adjusting the prompt a little each time
>generate more frames with generic AI interpolation not dependent on the prompt
That would be good enough for looped hentai animation, the thing that 99% of all users here want.
>>
>>102083030
>Also videos are basically seperate images put together.
This quote isn't exactly inspiring confidence in me regarding your knowledge of how text to video works.
>>
>>102083027
No, I'm telling you there's a thing called time, nothing stays the same. Even if it takes 3 years to figure out, an anon saving up training data is still worthwhile.

I dislike when anons always view the future with the perspective of today's technology. As if things don't change.

1.5 and Nai shocked everyone that said it wasn't possible at home.
>>
>>102082941
yeah, I've deleted media in the past which I wish I had kept. I know for a fact a lot of it just doesn't exist on the internet anymore, and most of that probably doesn't even exist on some random hard drive either. I would always do random "purges" of my storage to clean things up, and now I regret it.

desu I've also started doing things like recording video calls. Began a few years ago just with my dad; we would Skype regularly and he died this year unexpectedly, and I didn't record half as many of those conversations as I should have. I'm not interested in making some weird avatar of him, but even just having the calls and giving them to an LLM to look through so it can remind me of things, I think that may have a lot of personal value desu.

also I really wanna coom to tiktok sluts
>>
>>102083052
Yeah, same, I've regretted a lot of my purges.
>>
>>102083028
They wont increase the VRAM because they can make the datacenter niggers pay out the ass for quadros and A100s.
So no, you probably won't see 48GB VRAM cards anytime soon.
>We also can expect several disruptors from new players
lol lmao even

>>102083030
Just do the math: to make 1 second of video you need at least 30 frames per second, so you need to generate 30 images.
>>102083033
obviously this shit isn't as simple
>>
>>102083044
The jump from SD 1.4 to Flux is a lot, and that's 2 years. What's the jump from Flux to the new SOTA local model in 2026? It's useless to talk to these anons; they're midwits at the end of the day who don't possess abstract thinking. Many of them generate images on their mother's laptop and are destined to be Amazon warehouse workers. They cannot see what isn't physically in front of them.
>>
>>102083062
unless we get more VRAM we are kinda stuck
>>
>>102083058
>Just do the math, to make 1 second of video you need at least 30 frames per second. so you need to generate 30 images.
24 frames per second, you mean
also you'd generate fewer frames than that and use interpolation
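To put rough numbers on that (the interpolation factor and per-frame gen time below are made-up assumptions, not benchmarks):

# back-of-the-envelope math for diffused keyframes + interpolation
target_fps = 24
clip_seconds = 2
interp_factor = 4             # e.g. one 4x pass with a RIFE/FILM-style interpolator
sec_per_diffused_frame = 30   # assumed gen time per keyframe on a mid-range GPU

keyframes = target_fps * clip_seconds // interp_factor   # 12 instead of 48
total_minutes = keyframes * sec_per_diffused_frame / 60

print(keyframes, "diffused keyframes,", total_minutes, "minutes of gen time")
# -> 12 diffused keyframes, 6.0 minutes of gen time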
>>
>>102083044
Your argument is based on nothing but motivational mumbo jumbo, and going from single-image text-to-image to text-to-video is not a small leap. Looking at Flux and what is currently available for video diffusion is a very good indicator that it will not be trivial to run this locally. I would be happy to be wrong, but your baseless optimism is only going to leave you disappointed when you realize the basic cost of inference for a video model of decent quality is going to fill a single GPU several times over.
>>
>>102083028
>We also can expect several disruptors from new players
Like breaking Nvidia's monopoly? Is there any hope?

>>102083058
>obviously this shit isn't as simple
Why not? It would be an awesome feature to have for still image generation. Why the fuck does the fucking background need to change every gen?
>>
>>102083070
that's still 24 minutes per second of video (assuming about a minute per frame).
and with the potato GPUs most anons have, this is probably more like 3 hours per second of video.
>>
>>102083062
Yeah, several of these anons were making fun of an anon a few days ago for saying you could train a Flux LoRA with 12GB of VRAM; they said things like it's already a miracle it runs at all, so it can't go any further, and so on.

Well, plenty of people have trained with 12GB now lol
>>
>>102083069
the best model for any xGB of VRAM has yet to be built
Flux is not the epitome of 12B models, SD certainly won't be
There is much to improve that doesn't require more VRAM
>>
>>102083069
People will buy 40GB GPUs if that's what it takes to run a 30B image model locally. But that's irrelevant; we can't predict how AI images will be made in 2 years. Flux uses an entirely different architecture from SD 1.4, and the model in two years will use something different again. We cannot possibly imagine what the real requirements will be for any future model, but something we do already know is that current models are underutilized and inefficient.
>>
>>102083073
small hope
https://www.furygpu.com/
and if China ends up with a better video card, it's going to piss the US off enough that some weird shit is going to go down.
>>
>>102083082
if not for retarded nvidia jews we could already have 48GB VRAM in our 4090s and maybe 80GB VRAM in the 5090s.
>>
Stop fucking talking and post gens. This ruins every /ldg/.
>>
>>102083073
There are at least four major companies that have the capability to make AI processors.
>>
>>102083031
what was your prompt
>>
>>102083107
If you want to mindlessly post gens, go to /sdg/; this general has always been more discussion-focused.
>>
>>102083104
why not 96GB in the 4090s and ONE BAJILLION GB in the 5090s? idiot
when you sell something on a marketplace do you try for the lowest or highest price? choose wisely for this decides if you are a jew or not
>>
>>102083099
Even ignoring that, China is big on the whole AI thing, so I'm sure they are going to bring a lot of competition to Nvidia at some point.
>>
>>102083122
Nvidia is not the only company, and there is an extreme, dire need for affordable 40GB GPUs. If Nvidia refuses to comply, there are others who will fill the need, even if it means someone starts hard-modding 4090s in their basement.
>>
>>102083122
Well if you are going to call him a Jew for wanting Nvidia to sell GPUs with more Vram, then you would be wrong, right?
>>
>>102083122
>why not 96GB in the 4090s
NTA, but 48GB black market 4090s out of China are a thing. It's why people will sometimes buy a used 4090 and find parts of it missing.
>>
>>102083099
Wake me up when anyone manages to break CUDA's dominance. AI devs seem to be particularly stupid with how they write software. (Just remember that the first AI model data formats were just serialized Python objects, i.e. pickle, which famously can include executable code.) There was also a paper about how AI wastes a shit ton of computation power unnecessarily.
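For anyone who hasn't seen why that matters, here is a minimal sketch of the pickle problem, assuming plain Python; the class and payload are made up for illustration, and real attacks hide this inside what looks like a checkpoint:

import pickle

class NotAModel:
    # pickle calls __reduce__ while loading, so a malicious "model file"
    # can run arbitrary code the moment you open it
    def __reduce__(self):
        import os
        return (os.system, ("echo pwned",))

blob = pickle.dumps(NotAModel())
pickle.loads(blob)  # executes `echo pwned` just by deserializing

# safetensors stores raw tensor bytes plus a JSON header, nothing executable:
# from safetensors.torch import load_file
# state_dict = load_file("model.safetensors")  # just tensors, no code paths

This is the whole reason the ecosystem moved from .ckpt/.pt pickles to .safetensors.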
>>
>>102083109
I should also stress that Apple has already broken ground with a unified memory system that is AI-capable (just extremely slow). For all I know, we'll all be using Macs in 5 years.
>>
>>102083136
there are software issues, but it is already happening.

https://www.ebay.com/itm/256533572643?itmmeta=01J660JBXDSE81WSR0XC69JV73&hash=item3bba97ac23:g:bH8AAOSwgFBl5dm6
>>
>>102083140
are you retarded?
>>
>>102083069
You think the 6xxx series won't go up to at least 48GB?
>>
>>102083189
Unless there's a major threat it's going to be 32 GB, maybe 40 GB.
>>
>>102083189
Why would Nvidia purposely undercut their own dominance in the enterprise GPU market by releasing consumer-grade GPUs with more than 24GB of VRAM?
I bet it pains them that they even set the 24GB VRAM upper limit in the first place.
>>
File: 1701862341613523.png (398 KB, 640x480)
398 KB
398 KB PNG
>>102083189
Nvidia doesn't play that game. They've barely raised VRAM since the fucking 1080 Ti. They know consumers are desperate for more VRAM, so they can afford to charge an arm and a leg for 4090s. The 6000 series will cap at 32GB if we're lucky. More likely 24GB, because Nvidia sucks.
>>
>>102083212
It will be their undoing because there's a glaring market slot for a $2500 40GB GPU. Too bad AMD is working in collusion with Nvidia.
>>
>>102083202
>>102083212
>>102083213
Intel has cards that are starting to compete with Nvidia, not to mention >>102083169
>>
>>102083244
I'll be really curious to see if we get a 32GB+ Battlemage workstation card.
>>
>>102083244
I don't have a lot of hope for Intel. We're more likely to see something tangible from Qualcomm honestly.
>>
>>102083244
they are staying "budget". Still A770 is reasonable with 16GB and they are actually putting money into getting pytorch to work. Unlike some people *cough* amd.
>>
>>102083244
I'm praying Intel does the needful and releases actual GPUs with considerable VRAM upgrades. I'd deal with the CUDA workarounds if it meant I could actually run inference on large LLMs. I'm not gonna buy Chinese frankencards though.
>>
>>102083244
Intel is the Boeing of tech companies. They're completely incompetent and wholly propped up by Israeli money.
>>
>>102083213
>They know consumers are desperate for more VRAM
Most consumers are gamers who don't even need 24GB, not AI addicts like us.
It is enterprise that is desperate for cheaper VRAM, and doubling the VRAM on consumer cards would mean losing the enterprise customers who can afford the enterprise cards.
NVIDIA is now making 80% of its profit from selling enterprise hardware; what we want is irrelevant.
>>
>>102083292
>wholly propped up by Israeli money
That's just American money desu
>>
>>102083107
>nooooo you can't talk about technology on the technology board
>you must dump more images that don't have to do shit with technology!
/g/ is the worst board.
>>
>>102083170
Now a Jew sterotype would do exactly what you said, give the least ram as possible to get the most money as possible, which is a clever thing.

The other guys wanted the opposite, which you were suggesting was a jew thing, which makes no logical sense.

So that question right back at you.
>>
fresh bread

>>102083367
>>102083367
>>102083367
>>
>>102083360
>The other guys wanted the opposite, which you were suggesting was a jew thing, which makes no logical sense.
You really are retarded.
Just think it through again, anon. Good lord.
>>
>>102083312
In a just world, this would open up a business opportunity for whoever wants to produce consumer-grade cards with more VRAM, like AMD. Except no one wants cards that can't run CUDA, which is Nvidia's proprietary tech, so Nvidia has a stranglehold on the entire market until someone figures out how to get equivalent speeds on non-CUDA cards in Windows.
>>
>>102083372
I don't get what you are saying and you don't seem to want to clarify, so bye.
>>
>>102083156
>AI wastes a shit ton of computation power unnecessarily.
Like how you seem to only need 7B parameters to hold all English texts ever produced by humanity?
I think it's like that on purpose, to keep this technology in the hands of a select few.
>>
>>102083428
>Like how you seem to only need 7B parameters to hold all English texts ever produced by humanity?
no, you only hold technical-sounding 6th-grader essays with 7B parameters... try asking Llama to "please write all of Shakespeare" and see how far it gets
>>
Newfags don't know you can still post here!
>>
What's a good, consistent sampler for img2img? It's for SDXL.
>>
Over 2 hours after the new thread was posted, this thread is still usable. Will get archived in maybe 15 minutes.


