/g/ - Technology






Kandinsky Edition

Discussion of Free and Open Source Text-to-Image/Video Models

Prev: >>107208907

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/sd-scripts
https://github.com/tdrussell/diffusion-pipe

>WanX
https://rentry.org/wan22ldgguide
https://comfyanonymous.github.io/ComfyUI_examples/wan22/

>NetaYume
https://civitai.com/models/1790792?modelVersionId=2298660
https://nieta-art.feishu.cn/wiki/RY3GwpT59icIQlkWXEfcCqIMnQd
https://gumgum10.github.io/gumgum.github.io/
https://huggingface.co/neta-art/Neta-Lumina

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
Training: https://rentry.org/mvu52t46

>Illustrious
1girl and Beyond: https://rentry.org/comfyui_guide_1girl
Tag Explorer: https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/b/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
>>
>>107217952
what model would you recommend? I am running 64gb of ram and a 5070ti
>>
>Trying to find a video on how to run Qwen on a cloud service for free
>It's made by pajeet
>>
>>107217949
Not using your russian shovelware
>>
local is dead
>>
local is alive
>>
>>107218091
ur alrdy ded
>>
catbox is so slow today
>>
is this the ran worshipping thread?
>>
>>107216567
>another wan 2.2 boost/fix?
so what are the current boosts/fixes that I should be using then?
24+32 GB btw
>>
Illustrious 3.5 when?
>>
File: 1747101385929984.png (1.4 MB, 1024x1024)
I owe you an apology Chroma fags, I wasn't familiar with your game
https://huggingface.co/SG161222/SPARK.Chroma_preview
>>
>>107218235
spark fucking sucks, uncannyphotorealism is a good chroma checkpoint
>>
>>107218218
The following is mostly for video

>woct0rdho's builds of Triton, SageAttention, SpargeAttention, and RadialAttention: https://github.com/woct0rdho?tab=repositories
>Model Patch Torch Settings (it's under KJNodes): enable fp16 and use the fp16 bat for vids

>>107218235
Chroma is a real gem. I was an SDXL slut for so long, but I've been on Chroma since last year and haven't looked back since
>>
>>107218223
Illustrious but based on SD 3.5 Medium? I FUCKING WISH!!!
>Next up, as we planned back in our previous note, we are going to release our new model series based on SD 3.5, Animaestro! It will come out after Anim4gine, around Q1 to Q2 this year, so please bear with us!
>first and last mention: January 2025
AAAAAAAIIIIEEEEEEEEE
>>
are we 80% there yet?
>>
File: radiance.png (1.78 MB, 832x1488)
>>107218235
Chroma and Chroma Radiance are very nice indeed.
>>
File: radiance.png (1.78 MB, 832x1488)
>>107218307
no, regardless what it is
>>
File: radiance.png (1.85 MB, 832x1488)
>>107218307
there's more than 20% improvement left in just about anything nontrivial that is AI
>>
>>107218293
>been on Chroma since last year and haven't looked back since
you've been using Chroma from the moment it started training?
>>
>>107218371
Yes
>>
>>107218377
I don't believe you.
>>
File: radiance.png (1.79 MB, 832x1488)
>>
File: radiance.png (2.08 MB, 832x1488)
>>
File: radiance.png (1.86 MB, 832x1488)
>>
File: radiance.png (1.96 MB, 832x1488)
>>
File: radiance.png (1.69 MB, 832x1488)
>>
thats a lot of images anon
>>
File: radiance.png (1.71 MB, 832x1488)
>>
File: radiance.png (1.83 MB, 832x1488)
>>107218446
just a few 1girl images from current radiance
>>
>>107218371
Since v10 so close to a year

>>107218377
You dont even know what it is, babby go to sleeps
>>
File: file.png (1.79 MB, 832x1488)
>>
>>107218455
not close to a year and definitely not since last year
>>
File: radiance.png (1.89 MB, 832x1488)
>>
File: radiance.png (2.16 MB, 832x1488)
>>
>>107218453
>>107218463
>>107218486
I mean they kinda suck
Can chroma actually do something that illust cannot do better?
And I don't give a fuck about realism btw
>>
>>107218470
You can call police at any time
>>
>>107217949
Sorry for the reposting but how do you all keep your loras organized?
Atm I have them all in one big folder but was going to start sorting them and wanted some pointers.
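What I had in mind was scripting the sort. Rough sketch only: the keyword-to-folder map below is completely made up, you'd swap in whatever naming scheme your lora files actually follow:

```python
import shutil
from pathlib import Path

# keyword -> subfolder; example categories only, edit to match your collection
CATEGORIES = {
    "style": "styles",
    "char": "characters",
    "concept": "concepts",
}

def sort_loras(lora_dir, default="unsorted"):
    """Move top-level *.safetensors files into subfolders by filename keyword."""
    lora_dir = Path(lora_dir)
    moved = {}
    for f in list(lora_dir.glob("*.safetensors")):
        dest = default
        for keyword, subfolder in CATEGORIES.items():
            if keyword in f.name.lower():
                dest = subfolder
                break
        target = lora_dir / dest
        target.mkdir(exist_ok=True)
        shutil.move(str(f), str(target / f.name))
        moved[f.name] = dest
    return moved
```

I'd run it on a copy of the folder first in case the keywords misfire.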
>>
>>107218512
>ESL
should have known
>>
File: radiance.png (2.05 MB, 832x1488)
not 1girl

>>107218510
to me it has a larger variation in clothing / 1girls in general and more power in prompting overall

but illustrious is a cleaner training if you want specific anime characters
>>
File: radiance.png (1.75 MB, 832x1488)
>>
>>107218284
https://civitai.com/models/2086389?modelVersionId=2360624
too bad he didn't specify how many images he used to finetune chroma
>>
File: radiance.png (1.96 MB, 832x1488)
>>
File: radiance.png (1.84 MB, 832x1488)
>>
File: radiance.png (2.15 MB, 832x1488)
>>
File: radiance.png (1.9 MB, 832x1488)
>>
File: radiance.png (2.34 MB, 832x1488)
>>107218546
you could probably ask? maybe it's not even that many.
>>
Blessed thread of frenship
>>
>>107218568
nice lipstick
>>
File: 667080146.png (912 KB, 832x1216)
>>107218510
Better at non-anime and less popular anime art styles, also natural language prompt adherence I guess.
>>
damn, radiance fucking sucks at finer details. this retard spamming the thread thinks he's posting good gens, how delusional can you be? try posting a gen with non-fucked-up anatomy/eyes/hands? like dude, are you fucking blind?
>>
File: 00181-2603097378.png (1.15 MB, 1280x784)
>>
>>107218546
its probably a lora merged into the model
>>
File: nxyz_Nov11-1763244837_0.png (2.05 MB, 1200x1600)
Early training tests seem promising. What's the best way to merge models, by the way? I know you can use automatic but is there a better tool? I use comfy.
>>
Why do so many people prefer WAI14 to WAI15?
>>
>>107217949
They didn't just drop a 20B image model
https://huggingface.co/collections/kandinskylab/kandinsky-50-image-lite

Why is no one here talking about this? How does this compare to Flux, Qwen? Does it know copyrighted characters?
>>
>>107219400
>image model
video*

Also does anyone have a workflow for this? I will try it out
>>
>>107219400
gen a video of miku holding a pepe, apparently it's too big for ppl in this thread to run though
>>
>>107219455
it doesnt fit on a 4090?
>>
>>107219455
I can't run it either, but their 2B model still surpasses Wan 2.2. I'm curious about the image model though, dropped silently and there's no preview images or talks about it anywhere. The name suggests it's "Lite" meaning there's a bigger version incoming. If their previous 2B video model is anything to go by then this could be good.
>>
talking about the containerization of comfyui earlier makes me wish it could actually gracefully shut down after a few mins of inactivity. you could use socket activation to only run it when you need it in the background.
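until comfy supports that natively, a stdlib-only watchdog could handle the shutdown half. just a sketch: the /prompt endpoint and the queue_remaining field are from memory, verify both against your own install before trusting it:

```python
import time
import urllib.request

COMFY_URL = "http://127.0.0.1:8188/prompt"  # queue-status endpoint (assumed, check your install)
IDLE_LIMIT = 300  # seconds of inactivity before exiting

class IdleTimer:
    """Tracks how long it has been since the last observed activity."""
    def __init__(self, limit):
        self.limit = limit
        self.last_active = time.monotonic()

    def mark_active(self):
        self.last_active = time.monotonic()

    def expired(self):
        return time.monotonic() - self.last_active > self.limit

def queue_busy():
    """True if ComfyUI reports pending work; the JSON field name is an assumption."""
    try:
        with urllib.request.urlopen(COMFY_URL, timeout=5) as resp:
            return b'"queue_remaining": 0' not in resp.read()
    except OSError:
        return False  # server unreachable: treat as idle

def watchdog():
    """Exit once the queue has been empty for IDLE_LIMIT seconds straight."""
    timer = IdleTimer(IDLE_LIMIT)
    while not timer.expired():
        if queue_busy():
            timer.mark_active()
        time.sleep(15)
    # process exits here; a systemd .socket unit can respawn comfy on the next request
```

pair it with a systemd socket unit listening on the comfy port so the first request after shutdown starts it back up.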
>>
>>107217949
>Kandinsky
Are they finally better than the Chinese now? I've stopped caring. These groups seem to care more about SOTA bragging rights than about providing things normies and developers can actually use in their workflows. Russians are top-tier talent, but they're kept apart from the rest of the West.
>>
>>107219661
it's a 20b model though so...
>>
>>107219668
So, it's the same as usual?
>>
>>107219661
Looking at how good Sora 2 is, we absolutely need the SOTA without regard for what can be run locally. It's the only way forward. Even if we need to pay a few GPU cents per gen on some arbitrary cloud provider, that's still eons better than being told "Sorry, we can't do that" by ClosedAI; we can figure out the hardware later.
>>
>>107219698
You need to provide both SOTA and consumer-grade applications to prove your worth.
>>
>>107219698
>>107219712
Otherwise you're just showing off to a tiny minority and wasting power.
>>
File: nxyz_Nov11-1763248892_0.png (2.84 MB, 1800x1400)
>I need to finish brian griffin captions
>I'm not a furry
AAAAAAHHHHHHHHHH
>>
>>107219697
not at all, you can barely make videos on wan on a 24gb card and it's a 14b model
>>
>>107219722
Not really. You don't realize how bad the SOTA is on the API side. You can't even generate a girl cosplaying as Miku anymore without being hit by guardrails. An infinite meme machine turned into waste. We already have models that run on consumer hardware, what we need is SOTA without interference from third parties.
>>
File: 1747050945130268.jpg (487 KB, 1152x2016)
Even though comfy's template for qwen-edit says to limit images to 1Mpx, it turns out it can output much bigger images without issues. Don't even need hi-res fix.
>>
is there a node for fp16_accumulation? I need it for wan, but i want it disabled for sdxl. putting it as a flag in the bat arguments enables it globally
>>
File: nxyz_Nov11-1763249437_0.png (1.91 MB, 1800x1400)
I think local models are the only future if you want to make use of any kind of trademarked character. Since my 2 year hiatus from this, things have definitely improved in terms of tools, optimization, and just in general. Local is definitely moving slower than the big boys but that's not surprising and at least it's moving at all.
>>
File: proxy-image.jpg (38 KB, 474x485)
>>107218510
>Coomer starting to understand the limits of AI in 2026
>>
>>107219888
>Time to rape White girls IRL
>G-D bless
>>
>>107219851
Of course, which is why we need the SOTA to be local more than ever. API models are practically unuseable even if your content is 100% fair use and they would be a massive productivity boost in content creation otherwise.
>>
>>107219823
nvm, found it.
'Model Patch Torch Setting' from kjnodes if anyone else wants it.
>>
How is wan 2.2 compared to 2.1?
>>
>>107219909
SOTA models need SOTA hardware, it's nothing short of a miracle that WAN2.2 runs on consumer hardware as well as it does and even that can barely eke out five seconds on 16GB GPUs.
Still would be nice if WAN had better prompt adherence thoughbeit.
>>
File: 00223-874616549.png (1.71 MB, 1280x784)
>>
>>107219932
LLMs briefly caught up to SOTA a while back. I only tested Deepseek through OpenRouter, but this structure works. It wasn't ideal, and the model had to be jailbroken through a system prompt, but it was still basically uncensored GPT-4/o1 without having to go to ClosedAI's site, which was big at the time. It was a wake-up call for ClosedAI too, and the only reason they're still in the running at all is that Claude and Gemini were better than what Deepseek had to offer. Local only needs one disruptive model you can run yourself with insane hardware to be good; eventually we'll get there. Having to spend a couple thousand bucks on GPUs is better than not being able to run uncensored, capable models yourself in any capacity at all, and with tech advancing all the time, who knows how long the arm-and-a-leg price will even hold.
>>
>be me
>locally diffusing for a long time now
>decide to try local llming
>realize every single model is still aggressively censored even when run locally
>you have to jailbreak them the same way as online but you're definitely running an enormously worse version than the SOTA API-only equivalent

yeah that wasn't worth it. Suddenly I'm thankful for what we have / get as LDGers kek
>>
>>107219931
WAN 2.2 was trained on a significantly larger dataset (+65% more images and +83% more videos compared to 2.1), leading to better prompt adherence, better motion understanding, and better scene diversity.
>>
>>107220043
ah yes, Deepseek R1, the very locally accessible 670B param model
>>
>>107219932
even Wan 2.5 has dogshit prompt adherence when compared to e.g. the latest version of Kling, I found
>>
>>107220155
Man don't remind me, it was a Kling gen in a random spam thread one day that sent me down this rabbit hole to begin with. The things I would do to be able to run that model locally... do not include buying an A100 because fuck that.
I will continue to be greedy and hope that somebody comes up with One Weird Trick to break the video generation problem into small enough pieces that we can get SOTA gens on a 16GB consumer card.
>>
>>107220127
Does it do NSFW better? I heard something like wan 2.2 has a better understanding of it by default
>>
File: 1744197461880359.jpg (126 KB, 880x1168)
>>
>>107220212
literally me
>>
>>107220204
Somewhat, you can do things like undressing without needing a lora but it still can't do proper genitalia.
>>
>>107220126
DiT models have been pretty bad with censorship and it's only getting worse. feels like synthslop datasets shit up diffusion harder than llms too
>>
>>107220322
I mean the "censorship" we have to deal with is way softer though, it's moreso just like, the model straight up doesn't know certain stuff.

Dealing with the hard refusals to produce an output at all that you get with LLMs is vastly more annoying in my opinion.
>>
>>107220403
kontext has refusals but bfl is just known as the safetyslop company now.
>>
My setup of 32gb DDR5 + 4090 is taking 250-300s per video gen. ComfyUI shows I'm constantly using 99% of my ram during gens, but my 4090 and its VRAM are never saturated. How much speed can I expect to gain if I upgrade to 64gb?
>>
>>107220507
in what way? there's no text output is there?
>>
>>107220528
actually I'm wrong, qwen does that with their own encoder. it happens when trying to do undressing and other nsfw stuff
>>
>>107220545
I still don't know what you mean DESU, text-to-image models don't use text encoders in a way where refusals are possible
>>
>>107220155
Kandinsky looks about as good if not better than latest Kling.

>>107220137
The appeal of it is you can rent a GPU from a provider and get much cheaper rates than say the ClosedAI API (or anything else closed source). Rn the meta isn't Deepseek, but Kimi K2, but all the same, you can get crazy good rates. Ideally, the same applies to video models.
>>
>>107220633
Didn't the latest Kandinsky vid model come out like at least a month ago? Is there some new thing about it or something?
>>
>>107220617
the qwen encoder is an llm, not clip or t5
>>
File: chroma.jpg (3.86 MB, 2496x1824)
I remain very convinced that one could SIGNIFICANTLY improve Chroma with a minimum-1024px finetune of even 1000 well-captioned images

given that the attached pic is the difference between stock Chroma and just one Lora trained at 1024px on only 60 images, with the same seed / etc
>>
>>107220701
>what is spark chroma
>>
File: 1759830576629596.jpg (9 KB, 107x65)
Reminder to back up your shit
>>
>>107220670
that doesn't mean anything whatsoever, the way they run it isn't at all how you'd typically run an LLM. It's the same technique that allows Neta Lumina / NetaYume / etc (which as we know can very much do NSFW) to use Gemma-2 2B (a normally very "censored" model) as an encoder with no issues.
>>
>>107220706
hadn't heard of it actually. Looks like it's from the RealVis guy though, should be at least decent, will have to check it out.
>>
>>107220708
my storage node just died so point taken. 60TB just sitting cold, and I'm not even interested in fixing it.
>>
>>107220765
whatcha got in that 60 tee bee huh anon?
( ͡° ͜ʖ ͡°)
>>
>>107219973
interesting
>>
>>107220723
it's still training though
>>
>>107219731
Finetune or lora?
>>
>>107219851
The Chinese API I sometimes use does most of them, but it also sends me weird veiled death threats and tells me to kill myself so idk desu
>>
>>107219851
that's not really true at all TBQH, ChatGPT image gen is the ONLY one remaining I can think of that literally gives any kind of shit about copyright
>>
>>107218319
>>107218348
>>107218366
>>107218396
>>107218409
>>107218417
>>107218433
>>107218438
>>107218449

...
fuck is this troonime diffusion general
>>
This is Lain diffusion general
post Lain
>>
>>107220715
all I know is some prompts do absolutely nothing in qwen edit.
>>
File: 00045-3609147570.png (2.77 MB, 1248x1824)
>>
i'm gooning with nemo and im getting pretty good degenerate stuff but it seems like it keeps trying to "conclude" the scene and as the chat went on it felt like it was repeating itself and talking in circles. is it better to clear the chat and restart or do you guys keep it mindbroken? i gave it suggestions during the chat and it didnt really understand it and it kept bringing my suggestions back up and it sounded retarded
>>
>>107221085
>>>/lmg/
>>
>>107220798
childe 'p
>>
>>107220653
20B just got released.

Anyways, there are examples of their image model on https://kandinskylab.ai/

I tried running their stuff from the GitHub code but it doesn't work at all. Leave it to Russians to release possibly frontier models with a barely usable site and barely any documentation for the new model. Not sure about Comfy implementing this either, because they're slow too, so we'll have to wait until some autist gives us a custom node.
>>
tfw you realize WAN2.2 being split into high-noise and low-noise is not a drawback but a benefit, because you don't have to fit both in VRAM at the same time to be able to run it
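the two-pass flow, as a toy sketch (real workflows switch experts at a sigma boundary rather than a fixed 50/50 step count, and the load/free strings stand in for comfy's actual model management):

```python
def split_steps(total_steps, high_fraction=0.5):
    """Partition sampler steps between the high-noise and low-noise experts.
    The fixed fraction is a simplification of the sigma-boundary switch."""
    switch = int(total_steps * high_fraction)
    high = list(range(0, switch))            # early, high-noise steps
    low = list(range(switch, total_steps))   # late, low-noise steps
    return high, low

def run_two_stage(total_steps):
    """Only one expert is resident at a time: load high, run, free, load low."""
    high, low = split_steps(total_steps)
    order = ["load high_noise"]
    order += [f"high step {s}" for s in high]
    order.append("free high_noise")   # VRAM is released before...
    order.append("load low_noise")    # ...the second expert is loaded
    order += [f"low step {s}" for s in low]
    return order
```

the peak VRAM in that schedule is one expert plus latents, which is the whole point.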
>>
>>107218235
do meme low-step loras work with it? Everything I've tried results in a blurry-ish output
>>
File: ComfyUI_08540_.png (2.23 MB, 1152x1152)
>>107218235
Don't get why you would use this over Chroma HD Flash.
>>
File: 1733839727102240.png (890 KB, 832x1248)
>>107221001
>>
>>107221260
GGUF etc etc waiting room
>>
is there any sort of trick to getting comfy working on linux without crashing? i can do simple xl images, but anything related to videos usually either crashes in the terminal with oom or freezes my pc altogether. on windows i rarely have crash issues
>>
>>107221491
what's your setup other than linux? what libs are you running?
>>
>>107221491
>linux
>>
>>107221491
For linux just skip comfy or any other UI and use the command line. Much better. Try running it headless if you can.
>>
>>107221494
the setup i use is this one for video diffusion https://files.catbox.moe/2kfsuj.json

i have a 3080, so it's not too intensive when i use windows.
>>107221511
doing that for videos sounds like an absolute nightmare
>>
File: ComfyUI_temp_yktki_00001_.png (1.78 MB, 1248x1024)
>>107221491
What are your specs
>>
File: ComfyUI_temp_yktki_00002_.png (1.64 MB, 1248x1024)
>>
File: ComfyUI_temp_yktki_00003_.png (1.48 MB, 1248x1024)
>>
show me may li
>>
>>107221518
10gb 3080, 32gb ram [spoiler]5600g[/spoiler]and on arch linux.

also not running swap partitions if that's relevant.
>>
>>107221550
You are going to suffer with video with this. I have 12G/48G and it's ass at Q8. Try Q6 or give up.
>>
>>
>>107221555
Look at this smug gentleman
>>
>>107220993
Cute newfag,
Listen, these are called "shillers" and he is our beloved Radiance Shill.
Do you like his gens?
Yes? Cool because he has been posting the same for at least three months!
And you're asking yourself, 'but what the hell is this person doing generating that stuff for 3 months?!'

Nobody knows the truth, but the truth is these shillers are the pillars of this community, they give stable bumps, generate material for the people who visit and give the feeling of a living general, they are our soldiers.
>>
>>107221550
ouch, that's a little tight for video generation. I'd stick to lower resolutions. are you using the comfy native wan wf or the kijai wrapper? kijai's wrapper has a block swap option. search r/stablediffusion or r/comfyui for threads made by other people with your setup
>>
>>107220922
>ChatGPT Image gen

Arguably local is way ahead of all API image gen models if we're counting LoRAs and finetunes or finetune potential. But all local video models are far behind Sora 2 when it comes to video gens, and sure, they may be the only ones who care about copyright, but they're also the only good providers of video out there. When they allowed copyright: fun. Now that copyright isn't allowed: no fun. It's that simple.
>>
>>107220993
>>107221571
Just so you understand newfag, shillers are the backbone of a general.
>>
>>107221591
Also, it's sickening that they strategically allow it for one week to show what is possible and to draw users in, then pull the rug on most who were actually using it for edits and whatnot once they become so big they will still continue to get organic users.
>>
>>107221554
If I still can't get it working on Linux, then I'll probably just suck it up and stick with windows.
>>107221578
Think a default comfyui wan, but I'll try giving the wrapper thing a shot at some point
>>
>>107221618
And let's be honest, they are a massive corporation. Even NYT couldn't sue them. Don't kid yourself that they're afraid or can't win a court battle. This rule is simply about restricting and crippling the users.
>>
>>107221618
Stop being such a girl.
>>
>>107221623
https://www.reddit.com/r/StableDiffusion/comments/1mlhh2p/running_wan_22_on_an_8gb_vram_gpu_my_benchmarks/
>>
>>107221650
What? It's true. Where have you seen any of those meme toy video models that aren't Sora 2 be used for anything practical? Sora 2 is the only one that can make acceptable clips out of the box. And because it's not China you get 30 free videos a day, that's a lot more generous than fags like Kling.
>>
>>107221653
>576×720, 93 frames, 218s
how is that possible when chroma takes my 12gb 3060 3 minutes to gen an image at 15 steps?
>>
>>107221571
>>107221593
no retard
>>
>>107221689
4 steps =/= 15 steps

duh
>>
File: 00186-1838009399.png (3.05 MB, 1456x2128)
>>
>>107219698
Wait WTF forget about Sora 2
https://aaxwaz.github.io/Ovi/

This is pretty good why did we all forget about this?
>>
>>107221689
Hmm let's see

576 * 720 * 93 * 4 (steps)

Let's take 4 out of 93 to get 16 steps, now we've got 576 * 720 * 23.25 * 16

which = 576 * 720 * 4.80 * 4.80 * 16

which = 576 * 720 * 2.19 * 2.19 * 4.80 * 16
or,
1261 * 1577 * 16 * 4.8

Ok, maybe that's on the large side, so let's take 1.23 out of those numbers to get something like
1024 * 1280 * 16 * 4.8 * 1.23 * 1.23
or,
1024 * 1280 * 16 * 7.3

Then let's remember that 3m is less time than 218s, so we'll multiply 7.3 by 180/218 and get ~6

Now we check how the 3060 Ti compares generally for performance to the 3060 12gb, and find that its "gpu compute" benchmarks are about 20% better. So we'll divide 6 by 1.2 and get 5.

That 5 represents some combination of a smaller amount of work per pixel (which is probable in a video model) plus other mysterious unaccountable differences between two genner's performance benchmarks, which always seem to happen.
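the same estimate condensed, if anyone wants to check the arithmetic (the 1024x1280 chroma resolution is the same guess as in the post above):

```python
# pixels * frames * steps as a crude proxy for compute on each setup
video_work = 576 * 720 * 93 * 4   # the 8GB-card wan benchmark: 4 steps
image_work = 1024 * 1280 * 15     # guessed chroma gen at 15 steps

video_time = 218                  # seconds, from the reddit benchmark
image_time = 180                  # "3 minutes" on the 3060

# effective throughput of each run; the ratio is the unexplained gap
# before accounting for the ~20% gpu difference and per-pixel cost
ratio = (video_work / video_time) / (image_work / image_time)
adjusted = ratio / 1.2            # divide out the 3060 Ti vs 3060 gap
```

comes out to roughly 6.5 raw and ~5.4 after the gpu adjustment, same ballpark as the hand derivation.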
>>
Are stable diffusion and wan2.2 affected by whether you run them from an ssd or an hdd?
>>
I wonder if the name Caitlyn Genner is taken on civitai already
>>
>>107222041
You can generate some solid ASMR with this Ovi 1.1 model. This is insane.
>>
New localgen-induced fetish I've discovered: Asian chicks with white tits, pink nipples, etc

Imagine you're dating one of these chinks, you get her into her bed and she takes off her shirt, you're expecting to see weird foreigner brown nipple tits, but actually her tits look exactly like your mom's that you suckled as a babe
>>
>>107222050
loading wan from an hdd is excruciating but it doesn't affect the diffusion process
>>
>>
are all txt2img models shit as soon as you deviate from one person and do two people interacting in some way?
>>
File: 1735328531553397.png (2.61 MB, 1120x1440)
>>
>>107220212
holy kino
>>
>>107222065
This is about 90% of the way to what I'm looking for, it just needs to be polished a bit more; way better than that Kandinsky nonsense. It can probably handle the prompts I was throwing at Sora 2 and having to evade guardrails for (and complaining about, too).

All from their demo on HF:

https://files.catbox.moe/uagd1q.mp4
https://files.catbox.moe/6q6g8i.mp4
https://files.catbox.moe/ujjswe.mp4
https://files.catbox.moe/du5nym.mp4
https://files.catbox.moe/xowzmj.mp4

Fuck Russian shills on Plebbit, come to think of it this is probably why they released the Kandinsky 20B in the first place.
>>
File: ComfyUI_00004_.png (2.76 MB, 1248x1848)
>>107222144
fat. bitches.
>>
File: nxyz_Nov11-1763243892_0.png (1.89 MB, 1600x1200)
>>107220835
Both. My goal was to make a seth macfarlane checkpoint. I got drunk and watched a bunch of tv tonight instead of working on that. Captioning is bitchwork and it makes sense that the big boys can hire a bunch of people like me but more retarded to do what I do.
>>
File: 1757938233855908.png (2.44 MB, 1120x1440)
>>
File: nxyz_Nov11-1763245579_0.png (2.16 MB, 1200x1600)
>>107222178
Also, specifically, I am training off of porn and still images from the shows, so despite joycaption being as nice as it is, it does confuse basic concepts like anal and vaginal. It doesn't always understand which hole it's going into. These small things add up to a lot of big things in a big model, so I curate everything.
>>
>>107222215
your fetish a shit
>>
>>107222156
>Requires minimum of 24GB VRAM with fp8
Maybe GGUFs can get that lower? Wake me up when it fits inside 16GB.
>>
>>107222249
thanks. don't care, fag.
>>
>hurr durr I need BIG BLACK COCKS for MY FETISH
>>
>>107222289
this
>>
>>107222256
That's what I was thinking, and this is talking about the 720x720 model (I'm assuming 1.0), so we definitely need a Q4 quant to test this on 24GB, but the community hasn't picked this model up at all. It flew under everyone's radar.
>288 downloads this past month

That is abysmal for the only model resembling Sora 2 I've ever seen, especially now after an upgrade that makes so many of its videos this consistent. Granted, most of the examples are people talking, and we're not seeing the dynamic action scenes certain other video models show off, but that's not to say it can't do them; at least the static camera shots look really smooth so far.
>>
File: 419699939.jpg (275 KB, 1280x1920)
>>107222289
umad?
>>
File: 1762460180095305.png (2.5 MB, 1120x1440)
>>107222144
>>
>>107222072
what the fuck are you talking about you stupid fucking fuck, they have booba the same color as their face like every other woman
>>
>>107222329
overwatch, wow really impressed. i'm training my own model I don't care about your thousand hours of watched widow futa porn.
>>
>>107222372
you are making a model? Awesome pics. Look thick. Solid. Tight. Keep us all posted on your continued progress with any new progress pics or vid clips. Show us what you got man. Wanna see how freakin' huge, solid, thick and tight you can get. Thanks for the motivation.
>>
>>107222397
I honestly hate you. You're cringe. You will get progress pics I've probably been here longer than you, gaylord.
>>
File: ComfyUI_00092_.png (2.88 MB, 1280x1920)
>>107222406
you cute
>>
>>107222406
ive been here from when this site was still called 4channel
>>
>>107222421
It still is...

>>107222418
Why is Asuka a tiger! ACK!
>>
>>107222366
Fuck you, you don't know what you're talking about.
>>
File: 1748734463785595.png (364 KB, 640x498)
>>
Wanted to give firstmiddlelast frame a try again. These are the images, just seeing if it can even do morph or multishot.
>>
File: AnimateDiff_00047.mp4 (2.09 MB, 960x480)
>>107222768
The result. Completely ignores the other two images.
>>
>>107222777
resolution/aspect ratio maybe?
>>
>>
File: AnimateDiff_00062.mp4 (1.65 MB, 480x480)
>>107222786
It was the mode, swapping it to Normal worked.
>>
File: AnimateDiff_00069.mp4 (1.2 MB, 480x480)
Lul, ok it's working as I hoped now.
But I guess I need to do the 4 image version if I want the third section to use the image as the starting frame.
>>
File: 1757947405172970.png (2.71 MB, 1120x1440)
>>
File: AAAAAAAAA.webm (134 KB, 480x480)
>>107222936
>>
>>107223034
Damn, after testing, my idea of using this for loops was a failure. It seems this node just uses the built-in first-frame-last-frame, so the color and brightness shifts are still present.
>>
File: 1751328030746714.png (3.1 MB, 1120x1440)
>>
File: 1737767296875884.png (2.89 MB, 1120x1440)
>>
>>
File: IMG_1301.jpg (309 KB, 784x1168)
I was goofing around with sora, meta and grok today.
What kind of pc would be optimal for running local AI?
I want to make funny conspiracy videos,
but there are too many restrictions to get my point across.
>>
>>
File: 167726842.png (740 KB, 1152x896)
>>
>>107214555
not bait, i am just completely new to all this
i thought comfy ui was the gold standard. so this wan vace is better for doing ai video? or for editing existing video (like doing character swaps)? is it good for nsfw stuff? (i would mainly like to do anime sex loops)
>>
>>107223453
dude fucking google stuff you fucking retard, are you a bot or really sub 80 iq?
>>
File: 955580467.png (950 KB, 1216x832)
>>
File: 2819215727.png (613 KB, 1216x832)
>>
File: 1760944267912089.gif (718 KB, 500x281)
I have a nvidia 3080 card, been messing around with wan 2.2 and it's been fun. I stick to low resolution(512×512) and with the options to speed up and stuff.
This has me wanting to upgrade though, and i have my eyes on the 3090 which would have 24gbs of vram

I'm a techlet though, does anyone know if it would be worth it just for vid gen from a speed perspective? Like would it make it considerably quicker? Right now I can do a 7 second 512×512 6 steps gen in around 5 mins. I love the stuff I get but 5 minutes is a pretty long time. However i don't want to buy a 3090 if it's not gonna improve speed by a lot

Any help would be appreciated bros
>>
>>107223463
>google
who the fuck says this anymore
>>
>>107223984
>i want to improve speed
buy an rtx 6000 pro
>>
>>107223984
I have a 3080 too and I realized it's a lot better to work with what you have and learn to optimize; I can use Qwen and Flux just fine. And you gotta think realistically: how good are local models gonna get? do you really think you'll be making amazing stuff with more VRAM? A small jump in VRAM isn't worth the investment and I'm quite content with a 3080. It's better to wait
>>
>>107224013
>copewall: the text
lol
>>
>>107224004
I'm American and therefore getting ass raped on prices, so I'm on a budget lol
>>107224013
It's not a small jump though, it's going from 10GB of VRAM to 24.
>>
Is the 5060 Ti 16GB good enough for text 2 image and image 2 video?
>>
>>107223984
You will need a new power supply if you go further than just a 3090.

The 4090 and above need the retarded 16-pin connector.

I enjoy my 3090. It would be nice to have more VRAM for other things though.
>>
>>107224037

3090 >> 5060
>>
>>107224050
5070 ti is sexy
>>
>>107224050
>>107224054

I can get the 5060 Ti for 399€, while a quick look at eBay shows 3090s going for around 735€ with shipping. Is the performance worth the extra 335€?
>>
>>107224073
the 3090s are older and less power efficient. it's worth it at half that price
>>
>>107224037
yes
>>
>>107224045
>4090 and above need he retarded 16-pin connector
you don't need a new power supply to use the 12VHPWR connector
>>
>>107224092
>>107224106

I guess I will get the 5060 Ti now; with the RAM scarcity scare going on, I'm not sure the Supers will arrive anytime soon.
>>
>>107224123
you definitely want the 16GB version of the 5060 Ti and I doubt it is going for 399€
>>
File: order.png (57 KB, 788x538)
>>107224126

It's the Ventus version. I heard they're not so good, but the local shop has it on sale now, down from €469.
>>
>>107224147
so the 399 is pre-tax?
>>
>>107224155

399 is after tax (81.07 of that is the added tax). I'm still hesitating a bit over whether 16GB is enough to animate 5 sec with Wan though. Will it take too long?
>>
>>107224174
instant buy at that price
>>
File: 2918921327.png (658 KB, 1216x832)
>>
>>107224202

Yeah I ordered it, there's a return policy within 50 days, so I might as well.
>>
>>107222418
Coffee is a good idea
Asuka Taiga
>>
>>107222333
prompt?
>>
>>107224174
>hesitating a bit if 16GB is enough

F.A.F.O.
>>
>>107224174
>16gb is enough to animate 5 sec
brother, with my now-working sage attention 2 setup and fp16 accumulation, i'm genning 1280x720 5sec at as low as 59s/it with the PNY version, the one everyone says is the shittiest. get it.
also for reference, cards with less than 16gb OOM trying to gen that resolution.
>>
File: 1737222120455283.mp4 (2.13 MB, 512x512)
>>107223984
>I can do a 7 second 512×512 6 steps gen in around 5 mins
3090 here. Around 224 seconds (3.73 mins) from a cold start. Around 172 seconds (2.86 mins) if rerolling without changing prompt.
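For what it's worth, the implied speedup from those timings is easy to sanity-check; this sketch assumes the 3080 anon's "around 5 mins" means 300 seconds flat:

```python
# Rough speedup math for the timings quoted above.
gen_3080 = 300.0       # 7s 512x512, 6 steps, "around 5 mins" on the 3080 (assumed 300s)
gen_3090_cold = 224.0  # same settings on the 3090, cold start
gen_3090_warm = 172.0  # 3090, rerolling without changing the prompt

speedup_cold = gen_3080 / gen_3090_cold
speedup_warm = gen_3080 / gen_3090_warm
print(f"cold start: {speedup_cold:.2f}x faster, warm reroll: {speedup_warm:.2f}x faster")
```

So roughly a 1.3-1.7x speedup, before counting the bigger resolutions/longer clips the extra VRAM unlocks.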
>>
>>107224698

Thanks, that really puts my mind at ease! Appreciate it.

Does your card have 2 fans or 3? I was wondering if it's noisy when running Wan.
>>
File: WanVid_00017.webm (312 KB, 512x512)
>>107224705
same, 2:10 and however long it takes to switch models on DDR4
>>
>>107224770
i have no idea if all of MSI's cards are like this, but the out-of-the-box overclock was so aggressive and retarded it was constantly revving the fans from 30% to 80%, even when the temps were barely hitting 67c. to be fair, it was trying to hit an almost 3ghz OC, but i'm capped at ~2680 with my undervolt. just undervolt and set the fan to auto speed, both in msi afterburner. i never hit above 40% fan speed now, and temps never go above 66c, even during wan loads.
i never really hear the fans anymore, at least not over my case fans.
i probably COULD adjust this for a higher clock at slightly higher temps, but i like my low power usage and quiet fans. it's a dual fan model.
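If you're not on Windows/Afterburner, a rough equivalent of the clock capping described above (power/clock limits, not a true voltage-curve undervolt) can be sketched with nvidia-smi; the 280 W limit is a made-up example value, and the 2680 MHz cap just mirrors the number mentioned above:

```shell
# Query the card's current and min/max power limits (read-only, safe to run)
nvidia-smi -q -d POWER

# Cap the board power limit in watts (example value; pick one inside the
# min/max range reported by the query above; needs root)
sudo nvidia-smi -pl 280

# Lock GPU clocks to a min,max MHz range; undo with: sudo nvidia-smi -rgc
sudo nvidia-smi -lgc 210,2680
```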
>>
>>107224802

There are some complaints about the Ventus model being noisy. I guess after it arrives I will try to do the same as you and undervolt & set the fan.

Might try to change the thermal paste later on but I'm not sure if the 5000 series have the warranty void stickers.
>>
>>107224844
>Might try to change the thermal paste later on
in like 10 years, yeah, you'll have to worry about that.
really what you need to worry about are D R I V E R S. have fun with that fucking nightmare. i've had this card over a month and have reinstalled drivers/((dependencies)) at least 6 times for comfyui alone.
>>
>>107224864

Well shit, hopefully I won't end up with the same problem. Is it the same driver or did you have to hunt for a specific driver for it to work?
>>
File: 00004-896632504.png (1.3 MB, 1280x784)
>>
>>107224902
less about the core nvidia driver and more about how dependencies get fucked in the ass sideways every other day and MIGHT not play nice with the latest driver. you'll just grab whatever's new and probably be fine.
i think i had to reinstall over something a month ago, i forget by now. been fine since. but never believe the faggots in these threads that say "works on my machine", because something somewhere WILL break and you'll need to open that cute little cmd window again.
big plus side is, you may discover something wasn't working in the first place and end up fixing it like i did, like my sage attention install suddenly working right and getting me that nice speed.
>>
>>107220043
Why do you fags write like this? I can tell every time you write “closedAI” you do that typical nasal high pitched chuckle of some retarded neckbeard.
>>
File: 00050-2024489686.png (2.88 MB, 1080x1920)
>>107225006
it is crazy that his entire post was pretty eloquently written by 4chan standards to begin with, but you specifically chose that singular term to be offended by and felt the need to flex for god knows what reason.
anyway, all that to say, you write like a full on faggot and i hope more people in these threads start saying ClosedAI just to piss specifically (You) off. in fact, i never used to say it, but now i will.
>>
>>107225049
>Image
Gimmie more like it.
>>
File: 00082-3838312624.png (2.48 MB, 1080x1920)
>>107225124
you got it lad
gonna throw some multiple views gens into wan and see what happens while im at it.
>>
Anyone tried prior prediction / differential output preservation? What optimizer to use with this?

>>107223123
That's great
>>
File: 1744526614148679.jpg (83 KB, 1024x1024)
>>107217949
Does anyone have some sort of image comparison chart or comparisons that show the differences in quality between 4-bit, 8-bit, and full precision image outputs for Chroma?
>>
File: 120.jpg (101 KB, 791x166)
>>107224126
>>107224147

I usually look at the local used market.
A 5060 ti 16 gig goes for around 250+ euros.
>>
>>107225273
buy it and see what you get
>>
>>107225201
Kek. Not what I expected, but highly entertaining nonetheless.
>>
>>107225301
>that moment of realizing the thing you really wanna fuck looks hilarious and probably stinks really badly
>>
>>107225273

Damn, that is a good price. Finland is a small country though; I checked the secondhand market here and couldn't find anything on tori.fi.

I saw someone selling the Asus ROG Strix RTX 3090 for 680 euro there, but it doesn't really seem to be worth it.
>>
File: 16.jpg (150 KB, 763x163)
>>107225299
>>107225328

>buy it and see what you get
Don't need it, already have a used 3090 I snagged from the same site for 500 euros a few months back. So far I've yet to get fucked over.
>>
I'm doing non-photorealistic gens and playing around a bit with the dpmpp_2s_ancestral / linear_quadratic that somebody suggested in this thread a while back, and I'm noticing so many promising gens in the preview window immediately eat shit at exactly step 23/40. I haven't come up with a decent style prompt, so maybe that has something to do with it.
One thing I'm trying right now is to chain two sampler blocks together so that I can switch to a different sampler at step 23. The step count and scheduler have to remain the same of course, but I've managed to get some decent results by switching the sampler to euler_a or by halving the CFG. Is this a worthwhile approach to mess around with or am I just wasting my time?
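Not a waste of time; splitting steps across samplers is basically what the KSampler (Advanced) start/end-step fields do, and the thing that keeps it sane is that both legs share one schedule. Here's a sketch of that invariant; the Karras formula below is just a stand-in (I don't have the exact linear_quadratic formula at hand), and the splitting logic is the point:

```python
# Two chained samplers must be fed the SAME total step count and schedule;
# only the start/end step ranges differ, so the sigma at the handoff matches
# and the second sampler continues exactly where the first stopped.

def karras_sigmas(n, sigma_min=0.03, sigma_max=14.6, rho=7.0):
    # Stand-in noise schedule (Karras et al. style), decreasing, ending at 0
    ramp = [i / (n - 1) for i in range(n)]
    min_r, max_r = sigma_min ** (1 / rho), sigma_max ** (1 / rho)
    return [(max_r + t * (min_r - max_r)) ** rho for t in ramp] + [0.0]

total_steps, switch_at = 40, 23
sigmas = karras_sigmas(total_steps)

# First sampler (e.g. dpmpp_2s_ancestral): steps 0..22, leftover noise kept
first_leg = sigmas[: switch_at + 1]
# Second sampler (e.g. euler_a, maybe halved CFG): steps 23..39
second_leg = sigmas[switch_at:]

# Handoff sigma is identical, so the transition is seamless
assert first_leg[-1] == second_leg[0]
```

If your gens consistently die at step 23/40, swapping to a tamer sampler for the low-sigma tail like you're doing is a known trick, not a hack.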
>>
lmao even with my settings adjusted this shit is jiggling WAY too much. i underestimated how much that m4cro lora adds jiggle to things that aren't just the tits.
https://files.catbox.moe/10zx0l.mp4
https://files.catbox.moe/1c4fnf.mp4
funny how the opposite problem was plaguing me before this godlike painter node.. definitely do not lower it below its default, that adds the slowmo back in. i wonder if it fucks with lora strengths. Anyone else here testing it?
>>
File: wan21.jpg (15 KB, 606x260)
>try to run wan without noodles
>get https://github.com/Haoming02/sd-webui-forge-classic/tree/neo
>install
>runs
>use wan21 on comfy howto to get the necessary files, model, text encoder, vae
>load
>all outputs it produces like picrel with different samplers
Where am I going wrong?
>>
>>107225606
step 1
>>
>>107225606
step 1 i'm afraid. neo can barely handle flux as is.
just copy my catboxed workflows above your post.
>>
>>107225657
I'd rather not use wan at all than use comfy, but thanks
>>
>>107225049
Because it’s the giveaway of some “acktually” larper who thinks he’s twice as smart as he really is. Since when is open router local? It’s literally just a service to connect you to cloud providers.
>>
>>107225738
pfft weakling.
>>
>>107225771
I just hate the "UI" and the whole ecosystem around it mkay
>>
https://gizmodo.com/yann-lecun-world-models-2000685265
>LLM's are a dead end because they do not have sensory data and therefore lack an entire dimension of interpreting the world
Do you agree?
>>
>>107225793
and so do cumfart users, but it pales in comparison to the indomitable human spirit so.
>>107225798
i guess.
>>
>>107225738
use wan2gp then
>>
>>107225806
Therefore it made sense for me to try something else, no?
>>107225888
>why does x not work?
>well just use y
>>
>>107225798
no, lecunt is a retard. you don't need anything more than text to reach AGI, although having multiple modalities will help of course, and we do have multimodal models, just no big native ones yet. his "world" JEPA models are a toy.
>>
>>107225358
Bazoš only has scams if it is suspiciously cheap or something.
>>
>>107225421
Not sure which model you're using, but I'd start experimenting more with sampler/scheduler combinations before messing with chain-sampling. Get RES4LYF and you'll have more samplers/schedulers to try out (use a fixed seed, steps and CFG, and make notes of the gen times and output quality). If you do decide to experiment with chain-sampling, I recommend using the clownsharksampler instead of the regular ksamplers, and here's why: https://www.youtube.com/watch?v=A6CXfW4XaKs
>>
>>107226017
Very useful info, thanks. I'll look into those.
>>
>>107226011
Maybe you just had a shit experience.
Nothing was ever fucked up for me on Bazos; Aukro on the other hand got me scammed once before.
>>
>>107226203
catbox this pls. i am the milkman, my milk is delicious, and i wish to share it with this pumpkin spice latte enjoyer
>>
>>107226257

Do you have S? A custom character lora was used for it that I don't mind sharing.
>>
>>107226273
You can't find your S?
>>
>>107226273
what do you mean, signal? just litterbox that shit on a 5 minute timer if you wanted.

did you talk to that hunnie at least once before making a lora of her? :^)
>>
File: 152267347.png (617 KB, 1216x832)
>>
checkpoint looks interesting, worth looking into. may or may not report results.
https://civitai.com/models/694493/c4pacitor?modelVersionId=2252250
>>
cozy bread
>>
what are the recommended samplers for flux? Beta, I see, is the scheduler every model suggests, but there are some specific samplers like RES_2M that reforge neo doesn't have.
>>
File: ComfyUI_00233_.png (1.19 MB, 896x1152)
1.19 MB
1.19 MB PNG
>>107226323


Imagine willingly talking to these demons IRL
>>
>>107226692
so true!
>>
hey all, the hottest conspiracy just dropped. comfyorg hired trolls to fud anistudio because they are afraid of losing market share
>>
>>107227042
makes sense. comfy is a spurned lover that wants to see ani fail after she broke up with him
>>
>
>>
>>107224147
16 GB really just means running everything a 3090 can, but at 8 bit, which is more efficient anyway; not a bad deal if you ask me.
>>
why isn't anistudio in the OP? did the shills poison the thread?
>>
>>107227074
you kidding? there are shills here that do it for free
>>
yoland reads these threads yaknow
>>
>our cute sister general has trani aids
grim
>>
>>107227102
why wont he call me
>>
>>107227103
why? anistudio was in their OP for over a month but one schizo has a problem with it now? very sus
>>
>>107227109
chinks and jeets only. white people have been replaced
>>
>>107227112
There was a clear poll against having it in the OP, but trani forced himself back in by panic early baking.
Anons are angry about that because they don't care about the ui but want to post about anime.
>>
>>107227135
we don't even have polls here because they get botted like the chroma one
>>
>>107227135
Go back and leave us alone
Trani is your problem now
>>
>>107227156
the schizo is everyone's problem
>>
File: dog having freakout.gif (2.21 MB, 360x360)
all the posts from the past 12 minutes are totally organic after this thread was slow all morning and, most importantly, on topic. absolutely.
and mods are paid to not do a thing about it, except look like they do an important job.
anyway, fully convinced all the schizo spamming in EVERY ai thread, even on /trash/ and /b/, is some sort of social conditioning to get us to stop using local ai
>>
>>107227191
There are only three anons at any given time across every single diffusion thread combined on every imageboard site combined together in total
>>
>>107227191
>fully convinced all the schizo spamming in EVERY a.i thread even on /trash/ and /b/ are some sort of social conditioning to get us to stop using local a.i
this. it's not ani's fault in /g/'s case. every single thread has some tailored troll to make people stop posting. the motive is to kill off threads with concern trolling relentlessly
>>
>>107225798
He is right, and this explains well why we don't really see any big leaps anymore.
>>
>>107227232
crazy how not subtle it is too. I just don't see the motivation on hiroshimoot's side. Like, more traffic is a good thing, why let them kill off the most prevalent discussion on every one of these boards?
It's the only part i can't square.
>>
>>107220043
>LLMs briefly caught up to SOTA a while back.
Have you tried K2 thinking? For some use cases it's actually better than western SOTA.
>>
Local videogen is so sovlless. Why did alibaba leave us behind?
>>
>>107227420
api bucks
>>
I've been using wan 2.2 and I still don't understand something. When a lora has a high noise and a low noise version, do I use both at the same time? That's what I assume, but I haven't been able to find an answer anywhere. Kind of sucks due to the size of the loras though.
>>
>>107227488
>When a lora has a high noise and a low noise version do I use both at the same time
yes
>>
>>107227513
Thanks, is there a general rule for what strength to use for them? I know it'll differ by model, but it seems if I run any at default it'll fry the gen, so I have to lower them quite a bit.
>>
>>107227541
- use whatever the maker recommends, if they give one.
- if you use multiple loras, use a lower strength for each individual one.
- just experiment.
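For intuition on why lowering strength "un-fries" a gen: a LoRA ships low-rank factors A and B, and at load time the patched weight is roughly W' = W + strength * (B @ A) (exact alpha/scale conventions vary by trainer, so treat this as a minimal sketch). Wan 2.2's high-noise and low-noise experts are two separate models, so each one gets its own patch at its own strength:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 8))   # some base weight matrix
A = rng.normal(size=(2, 8))   # rank-2 LoRA down-projection
B = rng.normal(size=(8, 2))   # rank-2 LoRA up-projection

def apply_lora(W, B, A, strength):
    # The strength slider just scales the low-rank delta added to the weight
    return W + strength * (B @ A)

# strength 0 leaves the model untouched...
assert np.allclose(apply_lora(W, B, A, 0.0), W)

# ...and the delta scales linearly, which is why backing off from 1.0
# smoothly dials the effect down instead of switching it off
half = apply_lora(W, B, A, 0.5) - W
full = apply_lora(W, B, A, 1.0) - W
assert np.allclose(half * 2, full)
```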
>>
Is there any model able to get proper background characters without using inpainting?
>>
>>107222156
Did you check to make sure Putin isn't hiding under your bed, faggot?
>>
Fresh when ready

>>107227636
>>107227636
>>107227636
>>
ready now
>>
>>107222065
It's censored at the dataset level, sadly. The model barely knows anything beyond shit to make demoslop.
>>
and over


