/g/ - Technology




Discussion of Free and Open Source Text-to-Image/Video Models

Prev: >>106957370

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/sd-scripts/tree/sd3
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts
https://github.com/tdrussell/diffusion-pipe

>WanX
https://comfyanonymous.github.io/ComfyUI_examples/wan22/
https://github.com/Wan-Video

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
Training: https://rentry.org/mvu52t46

>Neta Lumina
https://huggingface.co/neta-art/Neta-Lumina
https://civitai.com/models/1790792?modelVersionId=2203741
https://neta-lumina-style.tz03.xyz/

>Illustrious
1girl and Beyond: https://rentry.org/comfyui_guide_1girl
Tag Explorer: https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/b/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
>>
Blessed thread of frenship
>>
sexy walks thread
>>
baker hates netayume even though its the best base anime model
>>
>sexy walks thread
1girl, walking, bouncy titties is truly the 1girl, standing of video gen
>>
>>106963093
not many neta yume gens last thread, and the one I did was too spicy for OP anyway
>>
>>106963100
Yep.
And Sora 2 cant even pass the test! LOL!
>>
1girl, grabbing own nipples

>girl misses the nipple
>>
>>106963125
1girl, grabbing own nipples, conveniently censored
>>
>>106963119
it could if it wasn't censored after gen
>>
>>106963145
>hypotheticals
a model isn't evaluated by its theoretical abilities, only what you can ACTUALLY do with it
>>
>>106962893
I gen at 1152x* and upscale/interpolate with Topaz. Everything takes about a minute (1min per step and 1min upscale/interpolate).
>>
File: 00039-2786000019.png (3.22 MB, 1920x1080)
>>
>>106963166
I agree but I just meant it's like dalle, the model would do nsfw or sexy if it wasn't shackled by retarded censorship
>>
>>106963093
He's put a few of mine in before, like the Dragon rider girl a couple threads back, never seen any from the other NetaYume poster anon though yeah IIRC
Baker might not always be the same guy also
>>
lolsuit status?
>>
>>106962970
It's not bad in the sense of a 1536x1280 baseline gen for a model I didn't actually expect to have better out of the box realism capabilities than V6 IMO
>>
>>106963188
It will (a neutered form of it), just have to upload your ID and sign a contract.
>>
>>106963264
elaborate
>>
>>106963093
he hates milfs too, low t faggot
>>
>>106963268
https://www.reddit.com/r/OpenAI/comments/1o6lfpc/sam_altman_confirms_less_restrictions_adult_mode/

Though it's not for video, one could assume it could come in the near future in a similar way (no uploading images or prompting certain fetishes such as creepshots though).
>>
>>106963288
Plus I think it's their answer to Grok (which is capable of Wan 2.2 tier NSFW).
>>
>>106963288
>one could assume it could come in the near future
I don't know why you'd assume that. that's a huge can of worms
>>
>>106963301
*was capable
>>
>>106963288
>>106963303
well to be fair this is listed in the documentation for sora 2 api:
Only content suitable for audiences under 18 (a setting to bypass this restriction will be available in the future).
>>
>>106963303
Well, their model was trained with celebs and whatnot, so it'd be hard, but not impossible.
>>
>>106963288
It took them years and an internal purge of their more crazy elements (who went and funded anthropic to be even """safer""") to allow tame sexy stuff on text, I highly doubt they would ever do nsfw images or videos, at least not until years from now if it even happens.
>>
>>106963301
Grok was censored.
>>
>>106963308
Imagine being a ClosedAI "safety researcher". The lucky bastard gets to gen all kinds of degeneracy and gets to test every celebrity and edge case you can imagine.
>>
File: 1731810916591150.jpg (468 KB, 2221x1634)
>>106963337
I wonder how many hit pieces it took for that.
>>
>>106963323
Yeah I don't think it's happening. Unless jailbreaking is solved they are essentially allowing some crafty users to generate Pokimane doing x, y or z.
>>
>>106963371
their third party content detectors aren't going away, and they're REALLY aggressive. i think that'll be fine
>>
>>106962727
> Can you catbox an example? I never thought it would be anything useful in nsfw.
https://litter.catbox.moe/2w7fttqv3j0rj2jv.webm
The image: >>106954551
>>
>>106963405
hory shit
>>
>>106963381
It was included in the training data. There's always a way though not immediately obvious.
>>
>>106963405
OK that's so much better than the other lora, I need to try that!
Did you use a t2v nsfw lora for that effect? or i2v works too?
>>
>>106963233
Filled with massive regret now that he's sober.

He's become just like debo and has even less pull than debo. His cope thread has been up for 3 days now.
>>
Thots on Krea video? Seems to do long vid properly, just needs gguf conversion maybe?

https://www.krea.ai/blog/krea-realtime-14b#long-inference
https://huggingface.co/krea/krea-realtime-video/tree/main
>>
>>106963522
>based on wan 2.1
>slow motion
it's dogshit
>>
I'm doing it. I'm finally shedding the last of my training wheels from reforge and going full cumfart.

It just feels strange, because comfyui is such overkill for image diffusion but, forge in all its iterations is so down syndrome. It's like you either go full spaghetti, or go full retard, there's no real in between solution.

By the way for anyone still on the fence, this exists, and it's bloody fucking lovely.
https://github.com/willmiao/ComfyUI-Lora-Manager
>>
>>106963482
i2v or t2v depends on a particular lora, testing is required.
>>
>>106963522
ditto (https://github.com/EzioBy/Ditto) also looked fun to mess around with for a bit, think kijai already did the lora.

https://www.reddit.com/r/StableDiffusion/comments/1ocfa2u/
also this dude made a p cool vid with animate.
>>
>>106963557
OK thanks anon.
>>
>>106963577
Cool, looks like it's a VACE variant? https://huggingface.co/Kijai/WanVideo_comfy/tree/main/LoRAs/Ditto Shame, I never use kijai nodes/models
>>
>>106963405
what prompt did you use?
>>
>>106963093
He's the only avatarfag who keeps shilling the model and only communicates through his filename instead of actually engaging like a real anon. Seems like an employee to me. If I keep seeing him use the same filename without actually contributing to discussions, I'm going to start reporting him.

If you're reading this, NetaShiller, change your filename because you're making your model and company look bad. There's another anon who also uses Neta Lumina, generates great gens, and participates in the community organically like a real anon should.
Learn from him
>>
>>106963722
nta but all the metadata is included. btw to the op, I don't think (this:1.5) works with wan
>>
File: waterfox_UTEmOUGHsw.png (200 KB, 1440x902)
can anyone spoonfeed me on wan 2.2? I followed the stuff in the OP and now I'm wondering how do I work with loras? I've seen there's a "normal" loader and then lora only which seems more apt to put in between the model and model sampling nodes? what about chaining multiple of them together? what about resolution is there any recommended one or an aspect ratio that works better? what's the best nsfw lora? what about "speed up" loras, will they fuck up the quality?
>>
>>106963659
unno, i haven't looked too far into it just bookmarked it for later when i get the time.
>>
File: 1749365348794732.png (103 KB, 1444x728)
>>106963755
>I've seen there's a "normal" loader and then lora only
you mean "lora loader model only"? yeah use that one, you don't need clip. you just chain them between the model and model sampling nodes as you said. or you can use power lora loader by rgthree to keep it tidy. there were recommended resolutions for wan 2.1, pic related. I dunno about 2.2 but I just use arbitrary resolutions and it just werks. there is "(wan 2.2 experimental) WAN General NSFW model" on civit that works for simple thrusting, but really you have to mix and match different loras depending on what you're trying to do. the lightx2v speed up loras don't hurt the visual quality but they do hurt the motion and prompt adherence. you can increase the fps to mitigate the slow motion, but the same number of frames then gives you a shorter video. for gooning purposes I think it's fine, else you will be waiting three times as long for a gen
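To put numbers on the fps tradeoff mentioned above: clip length is just frame count divided by playback fps. A minimal sketch, assuming an 81-frame gen and 16 vs 24 fps (both values are illustrative assumptions, not settings taken from the post):

```python
def clip_seconds(num_frames: int, fps: float) -> float:
    """Playback length in seconds for a fixed number of frames."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return num_frames / fps


if __name__ == "__main__":
    frames = 81  # assumed frame count, adjust to whatever you actually gen
    for fps in (16, 24):  # assumed playback rates for comparison
        print(f"{frames} frames at {fps} fps = {clip_seconds(frames, fps)} s")
```

Same frames, higher fps, shorter clip. If you want the original length back at the higher fps you have to gen more frames.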
>>
>>106963742
Working or not I used the same prompt and it provided similar results.
I'm testing different starting images and different fusion/light values to fine tune it

>>106963755
I don't use model sampling.
I recommend using the rgthree lora loader (it's one node and you can put however many loras you want in it)

But the essence is

High noise model > high noise loras > high noise ksampler
low noise model > low noise loras > low noise ksampler

high noise ksampler feeds the latent to the low noise
Imagine high noise is the movement/motion/scene setup and low noise are the details
put vae/clip on their correct places and play around with settings

steps should be the same for both samplers
for example 9 steps (high/low)
you start at 0 steps on high and go to 6
then on the low you put start at 6 and go to 10000 (it will stop at 9 in this case).

cfg 1.5~3.5 tend to work on high noise
cfg 1~1.5 on low
try shit out and see what works for you in terms of speed/quality
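The step split described above (9 total steps, high noise runs 0 to 6, low noise starts at 6 with a huge end value) can be sketched like this. It is only arithmetic, not a ComfyUI API call; the key names `start_at_step` / `end_at_step` mirror what the advanced sampler node appears to call its fields, so treat them as an assumption:

```python
def split_steps(total_steps: int, switch_at: int, open_end: int = 10000):
    """Return (high, low) step ranges for a two-sampler high/low noise setup.

    Both samplers are given the same total step count. The low-noise range
    uses a deliberately oversized end value, which is clamped to total_steps.
    """
    if not 0 < switch_at < total_steps:
        raise ValueError("switch_at must be between 1 and total_steps - 1")
    high = {"start_at_step": 0, "end_at_step": switch_at}
    low = {"start_at_step": switch_at, "end_at_step": open_end}
    return high, low


def effective_steps(step_range: dict, total_steps: int) -> list:
    """Steps a sampler actually runs once the end value is clamped."""
    end = min(step_range["end_at_step"], total_steps)
    return list(range(step_range["start_at_step"], end))


if __name__ == "__main__":
    high, low = split_steps(total_steps=9, switch_at=6)
    print("high noise runs steps:", effective_steps(high, 9))
    print("low noise runs steps: ", effective_steps(low, 9))
```

With 9 steps and a switch at 6, the high noise sampler does six steps and the low noise sampler does the remaining three. Where to put the switch is something to test per workflow.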
>>
File: screenshot.1761069744.jpg (865 KB, 2584x888)
>>106963554
Lora-Manager really needs to be officially included with Comfy. I don't think I can use comfy without it. Using any other lora node feels caveman tier.
>>
>>106963895
To add to this, you can use a node to automatically resize, pad, crop your original image for i2v to feed into the start image
KJnodes has a good one
>>
>>106963723
>>
>>106963946
There's also an even lesser known one called "local lora gallery", it's just that manager in a node. Which is what I wanted to begin with. Extremely useful since I can copy all the activation tags right there in the node setup.

Either option is nice to have, I think I'd use manager if I'm going on a spree of organizing/deleting loras. Manager has a really neat feature that downloads metadata of *every* deleted civitai lora, though for the two I needed it for it didn't grab the data.

https://github.com/Firetheft/ComfyUI_Local_Lora_Gallery
>>
>>106963723
Bruh the Neta*Yume* trainer guy is just some random dude from the community, he wasn't on the team at the actual company called Neta that trained the initial Neta Lumina 1.0, nor is he employed by them
Also the anon here (the one who kept doing roped up all fours ladies with NetaYume yesterday) isn't the same person as me (did the "lick it" blonde lady, dragon rider girls, blacked Tyrone joke one, etc) for example.

TLDR more than one person is using the model lol
>>
File: ComfyUI_00025_.png (2.03 MB, 1920x1152)
>>106963987
>blacked Tyrone joke one, etc
post it, i wasn't here for that.

also throwing my hat in, I mentioned wanting to train loras for it yesterday. Still haven't done that. Trainers are a cunt to work.
>>
File: 1741333251220618.png (295 KB, 1322x826)
>>106963952
yeah, I have it set up like this so I can quickly link one of the integer nodes to width or height and it will scale the other dimension accordingly. it doesn't scale up though for some reason which is why the scale image to total pixels node is there
>>
>>106963895
>>106963952
thanks a lot anons, the node you mentioned is "rgthree-comfy" right? I'll also check out KJnodes, so for i2v the base image should be the same size as the output video?

>>106963896
also in general do you split your loras by high and low noise then or do you connect to both of them?
>>
>>106964041
I use 2 Resizes in sequence.
One is to scale up/down proportionally
The second is to Crop to a size that Wan likes.
Even padding sometimes works, Wan sometimes uses the padding as an effect to "open up a scene", but it depends on the seed.
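The two-resize approach above (scale proportionally, then crop to the target size) comes down to this arithmetic. A minimal sketch with no image library; the 832x480 target in the example is an assumed resolution picked for illustration, and `scale_then_crop` is a hypothetical helper, not a node from any pack:

```python
def scale_then_crop(src_w: int, src_h: int, dst_w: int, dst_h: int):
    """Compute a proportional resize that covers the target, then a center crop.

    Returns ((scaled_w, scaled_h), (left, top, right, bottom)), where the crop
    box is expressed in the coordinates of the scaled image.
    """
    # Scale by the larger ratio so the scaled image fully covers the target.
    scale = max(dst_w / src_w, dst_h / src_h)
    scaled_w = max(dst_w, round(src_w * scale))
    scaled_h = max(dst_h, round(src_h * scale))
    # Center the crop window inside the scaled image.
    left = (scaled_w - dst_w) // 2
    top = (scaled_h - dst_h) // 2
    return (scaled_w, scaled_h), (left, top, left + dst_w, top + dst_h)


if __name__ == "__main__":
    size, box = scale_then_crop(1920, 1080, 832, 480)
    print("resize to:", size)
    print("crop box: ", box)
```

Cropping throws away the edges; padding instead keeps the whole image but leaves bars that the model may or may not animate over, as noted above.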
>>
File: comfy power.png (50 KB, 960x527)
Oh yeah. It's all coming together. I've about replaced my entire forge setup save for a proper upscaler. (And figuring out 2girls prompting.)

https://github.com/newtextdoc1111/ComfyUI-Autocomplete-Plus
>>
>>106963337
The version of Grok that could do celebs looked embarrassingly bad and generated at like SD 1.5 resolutions though
>>
>>106964084
>base image should be the same size as the output video
Yes, you should format your image to the sizes that Wan likes (or not, it seems to work on random sizes), I mostly do it to not get OOM.

>split your loras by high and low
All (or most?) Wan2.2 loras are divided into high/low loras.
Wan 2.1 loras work but it's case by case, so you have to test (in this case you put the same lora in both high and low)
>>
File: 1746354453481697.png (70 KB, 803x530)
>>106964084
just link the width and height like this. loras have separate high and low versions
>>
>>106964110
yeah it's awesome. just wish they made an option to disable the japanese text.
>>
>>106963405
why is the product consistency lora not used in the low pass too?
>>
>>106964197
Not him but I assume because you already set up the scene you want, you don't need it for the details you just need it for the major scene change/movement consistent characters.
Try with or without and check ur genz
>>
>>106963723
im here bro, idk why youre seething this much at my gens. having the model name in the filename is how I organize my files, and it's also helpful if other anons want to know which model im using. I posted a single gen last general too.
tldr: stop sneething and let people have fun
>>
>>106963093
even if it's technically better it will be a long time before it reaches the flexibility of XL due to the sheer amount of loras it has.
>>
>>106964314
it's just the resident schizo anon pay him no mind
>>
>>106964342
No worries. You can wait for "stabilizer" LoRAs, jeetmixes, and such.
>>
File: file.png (2.4 MB, 1328x1328)
>>
>>106963961
>NetaYumev4test
uh.... v4 wen? 0.o
>>
File: 1748740005341765.png (1.22 MB, 1080x1582)
Babe wake up, a new video model got released
https://github.com/Shopee-MUG/MUG-V
https://huggingface.co/MUG-V/MUG-V-inference
>>
>>106964402
you forgot the "add detail" loras.

>>106964433
I am staying cool as a cucumber until i see some locally genned examples.
>>
>>106964433
>video model
>no video examples
what did they mean by this?
>>
>>106964419
that was a semi-private finetune (had to give him email through HF) done by the netayume guy before the current release, I had assumed v4 (wrongly)
>>
>>106964402
>>106964456
also "darkness" loras kek
>>
>>106964197
Because in this case just high was enough. Only in low pass might work too in some scenarios. Or low and high together.
>>
Is there even LoRa training support for Lumina? Or just use Flux Dev LoRa training?
>>
>>106964433

>High-quality video generation: up to 720p, 3–5 s clips
>Image-to-Video (I2V): conditioning on a reference image
>Flexible aspect ratios: 16:9, 4:3, 1:1, 3:4, 9:16
>Advanced architecture: MUG-DiT (≈10B parameters) with flow-matching training

Despite that, I'm still trying to understand what does it actually do thats different.
>>
>>106964524
>To our knowledge, this is the first publicly available large-scale video-generation training framework that leverages Megatron-Core for high training efficiency
i think this is just a model trained to show off a new training platform
>>
>>106964402
sure, whenever noob or wai makes a finetune of it, then I'll show interest. until then I'm never going to be satisfied with whatever limited character knowledge it has.

I guess if all you care about is OCs or popular characters in basic poses, go for it.
>>
File: 1642586287418.gif (3.15 MB, 333x250)
>>106964524
>≥ 24 GB VRAM (for 10B-parameter inference)
>>
>>106964433
>>106964540
https://developer.nvidia.com/megatron-core
lmao it's just an ad for this shit
>>
File: AniStudio-01846.png (1.67 MB, 1024x1344)
>>
>>106964546
wai is a mix my young anon not a tune
the smart anons among us are going to start using it now (or already have been) so they can get a better handle on how it likes to be prompted. puts anon ahead of the curve much like how it was when il0.1 released
>noob tune
would be tight desu
>>
>>106964524
>Advanced architecture
it's the same architecture as Flux, more than a year ago, they havent found something better since? I don't believe that
>>
File: ComfyUI_00042_.png (2.2 MB, 1920x1152)
When did you notice WAINSFW got renamed to WAI?

and how long until you found out why?
>>
>>106964584
Lawsuit status?
>>
>>106964496
yeah i was thinking since it's used for face consistency in this scenario, it would add the face details in the low pass maybe
>>
>>106964616
what lawsuit?
>>
>>106964584
someones been really starved for your attention
>>
>>106964584
Pedo
>>
>>106964584
cute style
>>
>it begins
see you guys tomorrow
>>
how does he do it? schizo reacted immediately
>>
> bait
The troll thread died so he's fishing, ignore trani. Being irrelevant is the only thing he deserves.
>>
>>106964662
>>106963233
>>106963484
>>106961939
well i mean he has been waiting all day for ani to give him some attention, it was cute
>>
>>106964584
Get the fuck out of here you drunk retard.
Stay in your containment thread.
>>106934820
>>
File: NetaForge.png (7 KB, 737x162)
>IMPORTANT!:
Neta Lumina for NeoForge coming soon (or might already be out, not sure exactly how GitHub releases work)!
>IMPORTANT!:
>>
>>106964699
or just use cumfart.
>>
>>106964699
wow now everyone can download it, make 12 gens, compare it to noob then delete it because it's just not there!
>>
>>106964699
or just use an*studio
>>
>106964719
>noobschizo
>>
>>106964726
sdcpp doesnt support lumina
>>
File: file.png (307 KB, 2617x796)
>>106964433
>they haven't compared against wan 2.2
of course, and what's that STIV model? it's open source?
>>
>>106964758
I used to be the noob schizo before anon understood its local SOTA status. Glad that torch has passed from me to a poster who probably called me it way back then. He probably means some noob shitmix anyway.
Being so far ahead of most anons is a burden someone must carry, and that someone is me.
>>
>>106964788
this. wake up sheeple. comfy likes lumina so it's already cursed
>>
So with NeoForge I can do everything ComfyUI does but without dealing with nodes? It supports Wan, Qwen, Qwen Edit, Neta, SDXL, FLUX, and Chroma.

But the important question: has he fixed the memory management issues? That's a big drawback for me.
>>
>>106964822
neither comfy or neoforge fixed memory problems
>>
>low skilled poster misunderstands anons post
baka my baka
>>
>>106964822
yes, neoforge is for literal retards scared of nodes
>>
>>106964822
>has he fixed the memory management issues?
every form of forge will always be slower than cumfart, that's just the way the tard helmet fits.
>>
>>106964920
but why is cumfart slower than diffusers?
>>
>>106964964
Listen buddy, blows smoke in your stupid retard slack jawed face, I don't know ALL the technical jargon, I just know intentionally crippling yourself with a UI that isn't coded very well isn't a great way to go about things, and just sucking up your slightly lower than average intelligence and going with comfy is just the way this works now. Your 1girls will gen faster and look better when you put in the morning's work to make *your* workflow.
>>
File: 1731164133617579.png (803 KB, 1080x519)
babe wake up, they finetuned Qwen Image Edit and improved it
https://huggingface.co/chestnutlzj/Edit-R1-Qwen-Image-Edit-2509
https://github.com/PKU-YuanGroup/UniWorld-V2
>>
shut the fuck up
>>
believe in ani
>>
File: ComfyUI_00537_.mp4 (651 KB, 1280x720)
>>
File: 00018-1637512504.png (1.67 MB, 1024x1280)
>>
>>106965131
big if true
>>
>>106965131
i'm tired with image models. give us nu video models
>>
Is it possible to run Qwen Nunchaku with LORA?
>>
File: 1742212585016729.jpg (423 KB, 3946x1350)
>>106965131
they also did the same on flux kontext (but who cares about that inferior model with a worse licence lol)
>>106965288
lucky for you, you have one today >>106964433
>>
>>106965131
How to use this? A LoRa?

captcha: pwnj2
>>
>>106963119
>And Sora 2 cant even pass the test! LOL!
Sora 2 can absolutely generate walking from behind

>>106963145
>it could if it wasn't censored after gen
Reminder that it still is censored at the nipple and genital level so even if the weights leak you're not getting good nudity
>>
>>106965363
well the model can moan like a slut so i'll be happy if the weights leak
>>
>>106965353
yes, it's a lora you can use it right away
>>
>>106965243
lol cute
>>
>>106965380
Yeah audio is another thing entirely. Uncensored audio would be insane and probably very dangerous from all the old people getting scammed
>>
>>106964152
>>106964164
I see, I was trying to use "Female Genitals helper" which seems to be just a single file, but the other one mentioned, "(wan 2.2 experimental) WAN General NSFW model", does have high and low files

it's annoying though both of them don't seem to quite work right and genitals are still pretty wonky, while also degrading the quality to a certain extent, the "vanilla" models from the OP seem to work well enough with that default workflow but the genitals are either a black void or abominations
>>
>>106965395
this is probably still the reason we don't have any legitimately good generalized audio models locally
>>
File: ComfyUI_00057_.mp4 (3.08 MB, 1280x1280)
>>106965304
nta, knew this but didn't have evidence ty

vid is literally me
>>
>>106965412
>this is probably still the reason we don't have any legitimately good generalized audio models locally
You are correct. I wish old people didn't have all the money and power and we didn't need to care if they got scammed or not but death by gerontocracy is a real thing
>>
File: 1753849217766682.jpg (678 KB, 3215x976)
>>106965131
>https://huggingface.co/chestnutlzj/Edit-R1-Qwen-Image-Edit-2509
it seems to be better at prompt understanding indeed
>>
File: 1734467766821806.jpg (1.47 MB, 1560x2280)
I see that neo forge have a wan option now.
How do I do video stuff with it?
>>
>>106965482
show me your migu
>>
"Frivolous applications of AI do nothing to address the greatest challenges of our time and adds more weight to the growing anti-AI movement which sees America’s AI companies as disinterested in their fate, exploitative, and craven."
>>
File: 1759280196113328.mp4 (3.8 MB, 720x1056)
last cheerleader gen, excuse the autism. 6 steps high with no light, 4 steps low with the 2.2 light seko lora. genned at 560, upscaled with seedvr2, interpolated with gimm-vfi, fps increased slightly.

I guess the rcm low lora slops the output?
>>
>>106965595
catbox?
>>
File: ComfyUI_00539_.mp4 (1.84 MB, 720x1280)
>>
File: 1757168766555522.jpg (808 KB, 3400x978)
>>106965514
I prefer without the lora on that one
>>
is it possible to train a wan2.2 i2v character lora on 24gb vram? if yes, which trainer should i use?
>>
>>106965300
yes im doing it.
just copy over the files from the PRs on top of your shit, no need to recompile stuff
>>
>>106965131
it still zooms in the image, sad
>>
>>106965610
this is just for the initial gen. I use separate workflows for seedvr2 and gimm
https://litter.catbox.moe/5jxnxpmgt4shrap3.mp4
>>
>>106965624
so it was trained on 4o outputs huh
>>
>>106965513
please man >>106964710 my body and soul needs that metadata
>>
>>106965682
it's funny because it's true
https://arxiv.org/pdf/2510.16888#page=13&zoom=100,110,642
>We curate a dataset comprising 27,572 instruction-based editing samples in total (Figure 5), which are sourced from LAION (Schuhmann et al., 2022), LexArt (Zhao et al., 2025), and UniWorldV1 (Lin et al., 2025).
>UniWorldV1
https://huggingface.co/datasets/LanguageBind/UniWorld-V1
>Image Editing
>Imgedit-724k: Data is filtered using GPT-4o
>>
File: ouroboros.jpg (40 KB, 596x612)
>>106965715
are you flippin dippity doo doggy serious
>>
>>106965658
That isn't a qwen issue, that's a comfyui issue
>>
>>106965778
false, even disabling the node resizing, the zoom ins still happen. It's a model issue. There are multiple issues on this on the tracker with research
>>
>>106965715
kek, DOA
>>
>>106965795
Anon I've run it hundreds of times, it's 100% a comfyui issue. If you're using an appropriate resolution and correctly sized inputs you will not get zooming issues. If you're somehow still getting them then do a step or half step of img2img
>>
>>106965778
nah, kontext dev doesn't have that problem on comfyui
>>
File: 1739339979167633.webm (3.85 MB, 464x688)
>>106965617
>>106965513
How long till we can do this with local models.
>>
>>106965832
different node, retard.
>>
>>106965837
do what?
>>
is it possible to uninterpolate a video in comfyui?
>>
>>106965840
>noo, you don't understand, Comfy perfectly coded Kontext dev (even though it was his first try implementing an edit model) but somehow messed up QIE (even though at this point he had experience on edit model since he worked on Kontext dev before)
take this (You) kind saar
>>
loooooooooooooooooooooooooooooool
https://civitai.com/images/106861501

https://files.catbox.moe/qq5s1m.mp4
>>
>>106965830
no it STILL happens, even using diffusers.
the effect is AMPLIFIED if you're using a shitty resolution or AR.
Comfy has the latent and input resizing which further amplify this. you can edit out the resizing code from there too.
After removing the resizing shit and using proper resolutions, it's a matter of rolling a good seed that doesnt zoom for the specific input you're using.
>>
>>106965866
beautiful
>>
File: lmao.png (317 KB, 352x409)
>>106965866
>my honest reaction upon processing this information
>>
File: 1749151243646293.png (218 KB, 666x893)
>>106965849
>>
>>106965887
this doesnt remove the interpolated frames, just alters the video framerate
>>
>>106965866
bruhhhh
>>
>>106964456
>cool as a cucumber
https://www.youtube.com/watch?v=J8eTUXLcm-w
>>
>>106965862
You are a fucking idiot anon. You can perform in-context learning with a normal inpainting model like flux fill, it isn't some black magic that requires l33t coding. The bigger jump in tech was from QIE to 2509
>>
>>106965627
i've heard yes
i don't know, the one able to train wan2.2
>>
>>106965894
does it? too busy genning to test. you can ask chatgpt to write you a ffmpeg script that removes every nth frame
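For the "remove every nth frame" idea, a rough sketch of what such a script could look like. It assumes the interpolated frames sit at regular positions (every other frame for a 2x interpolation), which may not hold for every interpolator. It relies on ffmpeg's `framestep` filter, which keeps one frame out of every N and should lower the output frame rate to match so the clip length stays the same. The function names and file names are made up for illustration, and nothing here runs ffmpeg unless you uncomment the last line:

```python
import subprocess


def kept_frames(total_frames: int, factor: int) -> list:
    """Indices that survive when keeping one frame out of every `factor`."""
    if factor < 1:
        raise ValueError("factor must be >= 1")
    return list(range(0, total_frames, factor))


def ffmpeg_framestep_args(src: str, dst: str, factor: int) -> list:
    """Build an ffmpeg command that keeps every `factor`-th frame of src."""
    if factor < 1:
        raise ValueError("factor must be >= 1")
    return ["ffmpeg", "-i", src, "-vf", f"framestep={factor}", dst]


if __name__ == "__main__":
    print(kept_frames(8, 2))
    cmd = ffmpeg_framestep_args("interpolated.mp4", "original_rate.mp4", 2)
    print(" ".join(cmd))
    # subprocess.run(cmd, check=True)  # uncomment to actually run ffmpeg
```

If the interpolator inserted frames at an offset (so the originals are the odd frames, not the even ones), this keeps the wrong set, so check a few frames by eye before trusting it.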
>>
File: 00086-324552202.png (512 KB, 512x640)
>>
>>106965908
>The bigger jump in tech was from QIE to 2509
it still zooms in, it's still slopped, and the styles got worse, take this (You) again kind alibaba employee
>>
reminder for fellow editfags, if its not a VAEless edit model, its a toy
>>
File: 00022-3634275900.jpg (192 KB, 1200x1200)
>>106965901
This is cool. Thank you for linking it.
>>
>>106965901
I miss chiptune crack era
>>
>>106965924
No, I'm not turning this into a consolewars argument. You are objectively wrong and don't know jack shit about the technology. Get off >>>/g/ if you're going to treat AI like a black box, you're just clouding the signal
>>
>>106965952
You are objectively wrong and don't know jack shit about the technology.
>>
File: ComfyUI_00540_.mp4 (1.78 MB, 720x1280)
>>106965837
the hell are you talking about? this was local... you're in LOCAL diffusion general here
>>
File: two retards fighting.png (469 KB, 581x411)
>>106965952
>>106965957
me in the back
>>
>>106965901
very nice tune
>>
>>106965901
I wonder if you will be able to generate this stuff locally some day.
Udio being kind of alone in good music gen is sad.
>>
File: ComfyUI_00541_.mp4 (2.38 MB, 720x1280)
>>
>>106965987
audio is the neglected child of the local bunch
>>
>>106966002
every good audio model getting abandoned after v0.1 is a genuine tragedy
>>
Nvidia releases new rtx 5000 card, what do we think chat?

>72GB
>around $4500
>https://videocardz.com/newz/nvidia-quietly-launches-rtx-pro-5000-blackwell-workstation-card-with-72gb-of-memory
>>
>>106966089
it's too damn expensive
>>
>>106966089
>300W
decent intermediary
>>
>>106966089
i just spent $2k on a 5090.. maybe if i sell it and my 4090 i can get this instead.. but dunno if it would be worth a shit for gaming or not
>>
File: EejVBd_VoAMbfa-.png (300 KB, 560x374)
>>106966089
>72gb
>at 300w
i'll take it. Well, time to start grinding in the mines to afford it i guess.
>>
>>106965631
bro fucking how. i don't mean just the lightning lora but loading loras of my own. which pr?
>>
>>106966089
>less cuda cores than a 4090
nvidia can't stop jewing
>>
>>106966089
shame it's blackwell
>>
File: 00102-1022819231.png (3 MB, 1280x1920)
>>106965595
I like that one the best of the bunch. I ran through a few attempts myself but never got anything interesting.
>>
>>106966152
>shame it's blackwell
is that not their latest generation chip
>>
Wish we had better interpolation support in comfyui. The very latest versions of RIFE are better than FILM, yet the rife node hasn't been updated to support them.
>>
>>106966089
>>106966111
It's a castrated 5090 so it's not that great outside of the consumption and vram.
>>
>>106966161
it's also optimized for fp8 not fp16. it's a downgrade
>>
>>106966176
yeah.. the vram is nice tho
>>
>>106966185
It is but in the end it will be slower than a 5090+block swap for inference.
>>
>>106966089
>still not much vram
>cucked vram speed
lol, lmao even
>>
>>106966202
so who is this card for? suckers who cant pony up for a 6000?
>>
>>106966211
an extra $3k is not small bills
>>
File: AniStudio-01867.png (2.32 MB, 1024x1344)
anon is right. they skimped on cuda cores.
>>
File: almost unpleasant.png (159 KB, 384x390)
can nvidia just fuck off and burn in the lake of fire already i'm TIRED OF THIS
>>
File: 1731852789530640.jpg (49 KB, 1080x1016)
>>106965866
>>
>>106966211
It probably has nvlink so you can chain 2-4 of them and use them in multigpu contexts I guess, all for 1200W.
>>
>>106966261
we are born too early anon, in 100 years, people will make fucking movies in less than 10 seconds with quantum computer shit, sad :(
>>
born just in time to experience 1girl, walking, bouncy titties
>>
>>106966308
>>106966315
I'm not despairing completely, two more weeks and some richfag/furfag will deliver the goods. Never rely on the chinese or the jews.
Happy enough i can gen what i can on 16gb.
>>
>>106966326
>richfag/furfag will deliver the goods
name 5 times this has happened
>>
>>106966308
100 years? nah.. a few years maybe
>>
>>106966326
>some richfag/furfag will deliver the goods.
it happened once with pony v6 and it was a fluke
>>
>>106966343
>a few years
geg
>>
File: 1745928050465812.png (261 KB, 1230x690)
>>106966308
>>106966343
>>
>>106966393
sigh... at least I lived through the golden age of video games (1995 -> 2005), that's something
>>
>>106966308
At least I wasn't born a few centuries ago without any of that. It's the best moment in history so far to experience that.
>>
>>106966285
it seems more like it's going to be double slot
>>
>>106966403
we will be able to simulate that in vr to a pretty high extent, although yes, there is a slight difference in your experience when you truly live at that time not knowing the future and being excited about it
>>
>>106966138
open wide
https://github.com/nunchaku-tech/nunchaku/pull/754
https://github.com/nunchaku-tech/ComfyUI-nunchaku/pull/647
if you know how to use git, just check out the relevant branch/remote, otherwise if you're a nocoder retard, just download the branch and copypaste the files over. Ask your favourite LLM for further instructions
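
For the git route: GitHub exposes every pull request under a `pull/<number>/head` ref, so a PR can be fetched into a local branch without adding the contributor's fork as a remote. A minimal sketch in Python that builds (and optionally runs) the two git commands involved. The remote name `origin`, the local branch name `pr-<number>`, and both helper names are assumptions for illustration, not anything defined by the nunchaku repos.

```python
import subprocess


def pr_checkout_commands(pr_number, remote="origin", branch=None):
    """Build the git commands that fetch a GitHub PR into a local branch.

    Assumes `remote` points at the GitHub repo the PR was opened against.
    The branch name defaults to a hypothetical "pr-<number>".
    """
    branch = branch or f"pr-{pr_number}"
    return [
        # GitHub publishes each PR at refs/pull/<number>/head
        ["git", "fetch", remote, f"pull/{pr_number}/head:{branch}"],
        ["git", "checkout", branch],
    ]


def checkout_pr(repo_dir, pr_number, remote="origin"):
    """Run the commands inside an existing clone; raises if git fails."""
    for cmd in pr_checkout_commands(pr_number, remote):
        subprocess.run(cmd, cwd=repo_dir, check=True)


if __name__ == "__main__":
    # Print the commands for the two linked PRs instead of running them.
    for pr in (754, 647):
        for cmd in pr_checkout_commands(pr):
            print(" ".join(cmd))
```

Each PR lives in its own repo, so 754 would be fetched inside a clone of nunchaku and 647 inside a clone of ComfyUI-nunchaku. Whether either package then needs a rebuild or reinstall is not covered here.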
>>
Started trying qwen edit(2509) and it's awesome.
Any tips on nsfwmaxxing with it? Loras, workflows etc
>>
>>106966089
> jewvidia is trying to jew as much as possible before people knew about ramtorch
>>
>>106966443
>ramtorch
that's a meme right?
>>
What's the vramlet cut off? <16GB?
>>
>>106966456
https://www.youtube.com/watch?v=d49mCFZTHsg
>>
>>106966343
LOL you keks said the same thing 3 years ago. Yet here we are still stuck with sdxl. Any revolutionary tech will be api only
>>
Hope the Chinese make something with 4070tis speeds or higher and with 64gb at a reasonable price that works with comfy out of the box.
>>
>>106965131
where is the model?
>>
>>106966482
no cuda no try
>>
>>106966315
Good that I enjoy that.
>>
>>106966436
Use nude loras (you can't find these in civitai), and that's it, there aren't that many good loras for the model, which is sad.
>>
>>106966494
it's a lora
>>
>>106966516
oh
>>
https://www.youtube.com/watch?v=EeTDf6Anakg
he probably used Wan Animate to make this kino lmao
>>
>>106966476
no one said it had to be local
>>
>>106966515
So where do I find them?
Also how capable is the multi image stuff for nsfw shenanigans?
>>
>>106966531
>no one said it had to be local
if it's not local you won't be able to make anything remotely interesting with it, look at sora 2 it's lobotomized to hell now
>>
any wan2gp users here? thoughts on the recent update? also is dreamomni2 any good? is censored like flux kontext?
>>
>>106966569
>also is dreamomni2 any good? is censored like flux kontext?
nothing is better than Qwen Image Edit
>>
>>106966532
>So where do I find them?
Check here : https://civitaiarchive.com/search?q=qwen+edit&is_nsfw=true&is_deleted=true

>Also how capable is the multi image stuff for nsfw shenanigans?
By default it has no idea what underwear types are let alone actual nsfw, so unless loras add nsfw it's not usable for that.
>>
File: 1747092644748299.png (32 KB, 1160x117)
>>106966626
dude it's the second time they've been asked to change their name, from civitaibay to civitaiarchive, and now civarchive
civitai is maybe polite but what a pain
>>
File: ComfyUI_00061_.mp4 (3.51 MB, 960x1280)
>>106965246
^^
>>
>>106966546
>if it's not local you won't be able to make anything remotely interesting with it
depends on what you find interesting. for me there's quite a few people already creating good content using non local or sometimes mixing them.
>>
File: ComfyUI_07597_.png (1.93 MB, 1152x1152)
>>
>>106966723
for once I like the tongue out pose, nice
>>
Did anyone ever make an nsfw/better-control "lora" or finetune of vibevoice, the audio model microsoft released and then quickly deleted, which allowed nsfw voice but made everything sound like a script being read?
>>
>>106966741
They never released the training code. There's hacked together unofficial sloppy lora code, but I have yet to see anything good come from it. If they released code I'd have finetuned it by now
>>
>>106966755
Man, what a shame.
>>
>>106966772
Our second best bet is if the Ovi team release finetuning code, but I'd wager that too is a pipe dream. I'm foaming at the mouth for good local NSFW audio
>>
File: ComfyUI_07600_.png (1.98 MB, 1152x1152)
>>
>anon has reached the apathy phase with Julien
Perfect
>>
implications of Ramtorch?
>>
>>106966942
snake oil
>>
>>106966942
nothingburger, if it was really revolutionary it would mean you would be able to run HunyuanImage 3.0 on a vramlet card, and I'm not seeing that happening!
>>
>>106966953
slow but works, doesn't it?
>>
I fucking hate these pip comfyui trannies with a red hot intensity at this point. Why the fuck do I need to go through dependency hell just to run a fucking gooner model!?
>>
>>106966975
he claimed the speed was equivalent to having enough vram to load the whole model, that's bullshit
>>
>>106967016
ah yeah, I doubt that, otherwise most ai providers would use consumer grade gpus supplemented with crazy amount of ram
but even at 50-75% of the speed, it could be worth it
>>
File: AniStudio-01879.png (1.15 MB, 1024x1344)
>>106967015
lol
>>
>>106967016
he also claimed his 3-model merge technique was better than a traditional finetune yet chroma is a blurrymelt mess
>>
what's up with this board being filled with pedos?
>>
>>106967101
he also said that chroma would have artist tags and it didn't happen, this guy says a lot of random shit
>>
File: 1755071860559069.mp4 (3.96 MB, 1088x1080)
set new_resolution to 1088 in seedvr2. it outputted 798x1080 (shortest side was supposed to be 1088). the file properties say it's 1088x1080, which is obviously not the case. strange
>>
File: 1745555039218558.mp4 (2.55 MB, 720x720)
>>106966804
>>
what model is the de facto for realism?
>>
>>106966202
72GB is damn nice and 14K CUDA cores isn't bad, but if it's going to be a lot more than $4300 (the regular 5000 price) then either just get a 4090D 48GB ($3000) or go all the way for the 6000 Pro.
>>
File: punished howard 3.jpg (76 KB, 950x1070)
>>106967015
oh oh, that's nothing motherfucker
in my exhaustion stupor of trying to figure out WHY my install was requiring a specific pytorch dependency for my UltralyticsDetectorProvider piece of my upscaling workflow, turned out i installed this particular comfyui through the command line, and it was using my SYSTEM python this entire time.
so 99% of everything was working just fine, except that one fucking node, and that one node made me realize i installed comfyui wrong 4 months ago and in running this new blackwell gpu, i really should have just clean reinstalled the portable zip instead.

this shit is convoluted for the sake of being convoluted. anyway,
>updates your dependencies that were already working
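
A quick way to catch this kind of mix-up is to ask the interpreter itself where it lives. A minimal sketch using only the standard library; the function names are made up for illustration, and the folder passed to `runs_from` is whatever install directory you expect (hypothetical here). A venv check alone may not be enough, since an interpreter bundled with a portable build is not necessarily a venv, so comparing the executable path is the more general test.

```python
import sys
from pathlib import Path


def interpreter_report():
    """Describe which Python is running this process."""
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "base_prefix": sys.base_prefix,
        # Inside a venv, sys.prefix points at the venv and differs from base_prefix.
        "in_venv": sys.prefix != sys.base_prefix,
    }


def runs_from(root):
    """True if the running interpreter lives somewhere under `root`."""
    exe = Path(sys.executable).resolve()
    try:
        exe.relative_to(Path(root).resolve())
        return True
    except ValueError:
        return False


if __name__ == "__main__":
    for key, value in interpreter_report().items():
        print(f"{key}: {value}")
```

Run it with the same launcher that starts the UI. If `executable` points at a system-wide path rather than the install folder, packages installed for one interpreter will not be visible to the other, which would match the single-node failure described above.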
>>
>>106966953
>>106966971

>>106966466
>>
File: 00154-2612313.png (1.1 MB, 1168x840)
>>
>>106967186
bigasp, or chroma if you like mangled hands and fucked limb proportions
>>
>>106967186
Pony V6 if you actually want good results that don't take hours of your life away regenning
>>
>>106967186
For 1girl ‘realism’, sdxl. For actual realism, qwen or wan with loras
>>
>>106967186
pre v30 chroma, but the anatomy is shit so...
>>
>>106967201
nta but considering I was getting a 4090D, this just seems like the better option. genning and training wan models with 72GB would be a breeze.
>>
File: 1757771574256280.jpg (173 KB, 1080x1080)
>>106967332
>qwen
>realism
>>
>slower than a 5090
>more vram to run plastic localslop
>$5000+
why? are localjeets that addicted to generating ugly iphone blur with qwen?
>>
>>106967374
>so guys I've seen you capture a tiger how did you do it
>wanna subscribe to my stonetreon?
>>
>>106967167
I'd like to bust her melons if you get my double meaning.
>>
File: tis.jpg (85 KB, 1184x657)
If picrel isn't 24GB or over when it's released, this would be the absolute most cucked disrespectful thing nvidia has done
>>
>>106967374
>>106967416
>>
Just buy a H100 retards instead of debating on the upcoming card
>>
File: 1734079465529345.jpg (31 KB, 514x573)
>>106967454
shieet
>>
File: 1742522361831224.mp4 (3.8 MB, 1088x1472)
>>106967137
disregard. I didn't use the right encoder settings when compressing
>>
File: file.png (164 KB, 988x189)
>>106967449
>>
>try to use the smallest seedvr model
>oom
it's simply owari
>>
>>106967469
by the way if you guys have turk guy gens please share them, was 99% sure i grabbed all the ones i saw last year and early this year but nope. He really isn't memed that much anymore is he?

>>106967449
>>106967506
>it would be the absolute most cucked disrespectful thing nvidia has done
>not the 5050 and 5060 8gb
>>
why does this general post variations of the same gen?
>>
>>106967548
I'm gonna teeeeeeeeeest
>>
>>106967548
Local hasn’t received an actual good image model since sdxl, so the cope is i2v and qwen edit
>>
>>106967548
Because it's the same few posters posting the same things while the rest of us are primarily genning our own selves fucking hot women
>>
local might be done for at this point
>>
where can you upload goon slop from comfy ui
>>
>>106967548
>>106967585
crushes your tiny ballsack in my hands
don't forget to subscribe to my patreon you faggot
>>
>>106967585
API has been and will continue to be the future. Local will always be nothing but a toy.
>>
>>106967596
i've got balls of steel
>>
>>106967464
This is unironically nvidia’s entire business strategy: ensuring there is no cheap alternative to the H100. The instant an H100 competitor releases for under $10k their entire monopoly crumbles overnight
>>
File: ComfyUI_07632_.png (1.74 MB, 1152x1152)
>>106966729
Thanks

>>106967167
kek

https://files.catbox.moe/h90ted.png

>catbox down again

Oh well
>>
>>106967605
>Local will always be nothing but a toy.
a toy relative to API, once they reach Sora 3 level, we'll get sora 2 locally, and I would die in peace if it was the case
>>
>>106967674
the right foot has the toes inversed lol
>>
>>106967682
Weekly reminder that local never even caught up to dalle3 or elevenlabs, both of which are over 2 years old
>>
>>106967682
Sora 3 will make Sora 2 look like a joke and you'll be saying the same thing all over again.
>>
>>106967701
Except we already passed Dalle3 with Wan 2.2
>>
>>106967731
LOL holy localcope
>>
File: 1735652610832334.png (21 KB, 347x318)
>>106967701
passed in what? celeb knowledge? loras exist and dedistilled flux loras mog all models for celeb consistency
>>
File: ComfyUI_07626_.png (1.53 MB, 1152x1152)
>>106967332
The sdxl "realism" joke has gotten old. Qwen needs LoRAs for separate concepts, which is primitive compared to Chroma, and it also gives practically the same image on every seed. Wan is great for I2V but doesn't give you the same amateur look out of the box, and even LoRAs look a bit plasticky.
>>
>>106967701
we can't get the same things. because companies have their own settings, and don't use comshitui
>>
if you need to cope with loras your model is objectively inferior. you cant even get 2 celeb loras to interact without inpaint or snakeoil stacks due to lora bleed. loras are peak copium
>>
>>106967773
if you cant train loras "your" model was never yours and never got off the ground, sorry sis
>>
peak localcope
>>
the worst thing is that they're not even paid to do the api bait, they do it for free
>>
>>106967773
>>106967783
>he can't even use LORAs on his SaaS slop to make 100% accurate Facade gens

Sad!
>>
>>106967783
concession accepted
>>
>nooooo i NEED to be able to train overfit slop that bleeds into all aspects of my gens
lora is outdated 2019 tech, just like everything else local uses
>>
>He still uses euler over literally any dpm
>>
>I just ate a banana bread
>>
i wish local was as fast as grok
>>
>he thinks saas isn't a vast network of loras
>>
>>106967839
it could if you had better specs
>>
i wish local wasn’t complete shit
>>
I wish you a merry christmas
>>
and a SaaSy new year!
>>
>>106967835
how was it?
>>
>>106967877
delicious
>>
>>106967877
like scratching your eye but it's still fucking itchy
>>
File: 00045-2228726078.png (1.86 MB, 1824x1248)
>>106967891
Nice.
>>
Do you think Qwen image edit could be trained to unwrap textures onto a given UV shape?
>>
>>106968011
maybe, but it's hard to tell in advance and qwen isn't that quick to train
>>
File: ComfyUI_07648_.png (1.72 MB, 1152x1152)
>>106967701
>dalle3
You're right, we never caught up, we surpassed it.
>>
>>106967701
elevenlabs is very true local voice fucking sucks
>>
>>106963047
Is generating video files with 4070 possible or am I wasting my time? If possible which model should I look at?
>>
>>106968047
>we
Keep your elongated chromaslop to yourself. embarrassing
>>
>>106968093
>>106968093
>>106968093
>>106968093
>>
>>106968069
it's possible but the quality will suffer. people gen with 8gb supposedly. wan 2.2 is the only model anyone really uses
>>
>>106968081
You haven't seen a woman IRL in 30 years
>>
>>106966942
it works pretty well already on aitk, idk if it could be faster
>>
>>106967503
dis gud


