/g/ - Technology


Thread archived.
You cannot reply anymore.




The US will Ban that Card Edition

Discussion of Free and Open Source Text-to-Image/Video Models and UI

Prev: >>106790544

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/sd-scripts/tree/sd3
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts
https://github.com/tdrussell/diffusion-pipe

>WanX
https://comfyanonymous.github.io/ComfyUI_examples/wan22/
https://github.com/Wan-Video

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
Training: https://rentry.org/mvu52t46

>Neta Lumina
https://huggingface.co/neta-art/Neta-Lumina
https://civitai.com/models/1790792?modelVersionId=2203741
https://neta-lumina-style.tz03.xyz/

>Illustrious
1girl and Beyond: https://rentry.org/comfyui_guide_1girl
Tag Explorer: https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbours
>>>/aco/csdg
>>>/b/degen
>>>/b/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
>>
File: 1738957583041051.png (635 B, 471x430)
Maybe it would be a good idea to take real, very high-res photos of human skin with a lot of detail, run them through some 0.1-0.3 denoise of qwen image to destroy a lot of the detail and get the plastic skin, and then train a lora where you teach a model to basically take the plastic versions of images and try to "add realistic detail" where you use the original images as the goal?
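A minimal sketch of the pairing step, assuming the degraded ("plastic") renders keep the same file stem as the original photos. The directory names and the `pair_by_stem` helper are hypothetical, and the actual dataset layout depends on whichever trainer is used.

```python
from pathlib import PurePath

def pair_by_stem(degraded, originals):
    """Match each degraded image to the original photo with the same file stem.

    Returns (input, target) tuples: the degraded image is what the model sees,
    the original photo is what it should learn to reconstruct.
    """
    by_stem = {PurePath(p).stem: p for p in originals}
    pairs = []
    for d in sorted(degraded):
        stem = PurePath(d).stem
        if stem in by_stem:  # skip renders that have no matching original
            pairs.append((d, by_stem[stem]))
    return pairs

# Hypothetical file lists; in practice these would come from a directory scan.
pairs = pair_by_stem(
    ["plastic/skin_001.png", "plastic/skin_003.png"],
    ["real/skin_001.jpg", "real/skin_002.jpg"],
)
```

Only `skin_001` ends up paired here, since the other files have no counterpart.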
>>
>>106793018
>qwen image
*qwen image edit
>>
File: file.png (2.13 MB, 848x1488)
>>106793018
it sounds possible but perhaps not easy. maybe you can also use regularization images, haven't used that on qwen trainings yet tho.
>>
>>106793018
>https://github.com/MIGHTYEZ/Inversion-DPO

This is possible but most people can't afford it. It's called DPO, but you need to have the model in memory twice to use it, one with frozen weights and then the one unfrozen being trained.
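A rough sketch of why two copies are needed, assuming the standard DPO objective on per-sample log-likelihoods. The linked repo may use a different, diffusion-specific variant; `dpo_loss` and `beta` are illustrative names only. The loss compares the model being trained against a frozen reference on a preferred and a rejected sample, so both models have to be evaluated at every step.

```python
import math

def dpo_loss(policy_win, policy_lose, ref_win, ref_lose, beta=0.1):
    """Standard DPO loss for one preference pair.

    policy_* are log-likelihoods from the model being trained,
    ref_* are log-likelihoods from the frozen reference copy.
    """
    margin = (policy_win - ref_win) - (policy_lose - ref_lose)
    # -log(sigmoid(beta * margin)) written as log(1 + exp(-beta * margin))
    return math.log1p(math.exp(-beta * margin))
```

When the trained model and the reference agree, the margin is zero and the loss is log(2). It drops as the trained model favors the preferred sample more than the reference does.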
>>
>>106793085
How do people train qwen image edit loras then?
>>
>>106793018
That's more or less how anon trained the clothes remover LoRA, so yes.
>>
any way to speed up qwen edit when using multi image input? shit is like 7-9x slower
>>
>>106793018
It should be done with a wide range of images instead of strictly skin. You would have a "desloppa" LoRA. I'd be surprised if that's not a thing already.
>>
>>106793127
ur spilling into ram, look at your vram usage
>>
>>106793131
deslopper would be huge
>>
>>106793127
just use the 8 step qwen image 2.0 lightning lora, when I dont have image2 bypassed it's still around the same speed
>>
>>106793139
100% gpu and 22gb ram in python kekW I'd say you're right
>>106793150
I am using it, but I must be right on the limit with 1 image. Or there's some rocm faggotry failing me.
>>
>>106793093
???
what he's saying and what you're thinking are not the same thing lil bro
>>
>>106793141
we can only hope kek
>>
>>106793186
I assume to train qwen image edit loras to for example remove clothes you train it in a similar way to what I initially described?
>>
File: a113.webm (3.19 MB, 720x1088)
>>
Do any Qwen Edit loras even exist?
I'm talking about Loras to make this or that look better
>>
File: radiance.png (3.16 MB, 848x1488)
>>106793054
no surprise, chroma has very attractive girls (typical booru / internet standards) and i did prompt sexy clothes

> large breasts, aqua eyes, very long hair, black hair, hime cut, black latex jacket, cropped jacket, black micro bikini top only, black micro shorts, thong, highleg panties, fishnet pantyhose, fishnet legwear, makeup, black choker, jewelry, sunglasses on head
>>
File: 1759161576857065.png (3 KB, 197x92)
>comfyui first model load and inference still overfills vram and only on second gen onwards does it allocate the proper amount
>>
>>106793258
I love that style of bikini and material too. The shiny wetlook is the best.
>>
>>106793227
I've been told qwen image loras may work with qwen edit
>>
>>106793336

ty, anon
>>
File: 1740762369272791.jpg (5 KB, 205x246)
vibevoice + a snip of JC audio

https://voca.ro/1n9rSVHWiD5b
>>
>>106793398
I disagree with the message. Literally, nobody is interested in controlling your miserable life.

For (You)
>>
What were silveroxide's chroma lora experiments? Were they literally just flux models that he converted to work with Chroma?
>>
>>106793445
openAI is absolutely doing social engineering
>>
File: lillie_2025-rgalz.mp4 (1.31 MB, 480x832)
>>106793003
blestest flanzone of threads! ;D

>wan 2.1
>lora: wan hip swing twist
sauce: https://tensor.art/models/906123123449986583


byeeee
<33333
>>
>>106793398
anyone who is stupid enough to believe any capitalist is doing anything "for good" deserves what they get
>>
>>106793488
wait, requesting:
catbox\prompt\models\etc for >>106791094

;_;
so cute
>>
>>106793398
based
>>
>>106793498
MUH CAPITALISM
tell me you have a tv watcher\brainwashed take without actually telling me kek
oogie boogie money haunting you??
financials are the DIRECT reason people actively try to innovate\participate in a freemarket economy
it is the literal driving force
your qualms are not w\ the forces of capitalism themselves (intentional) but w\ crony corporatists rigging the game for themselves via corporate\gov welfare state


tldr;
stop posting libtard <3
your are uninformed (not libertarian) at best
or basically dumb\stupid at worst
>>
>>106793498
how much for sex?
>>
>>106793518
REKT
>>
File: 1740434040203715.png (219 KB, 430x454)
JC Denton on Sam Altman taking money for Sora 2 then censoring after taking $200 from people:

https://voca.ro/14gLLirZYX5o
>>
>>106793488
>>106793518
I never thought I could like a namefag tripfag
>>
>>106793467
use deepseek instead

https://files.catbox.moe/7yoy86.png
>>
>>106793518
woops ultra triggered the lib.. who hilariously enough calls other people libs
>>
>>106793003
>>106793028
>yea the "release" chromas are chroma 1 base and 2k, what he is working on is radiance (model architecture change without VAE)

ok, googled it...
>Release branch:

>Chroma1-Base: This is the core 512x512 model. It's a solid, all-around foundation for pretty much any creative project. You might want to use this one if you’re planning to fine-tune it for longer and then only train high res at the end of the epochs to make it converge faster.

>Chroma1-HD: This is the high-res fine-tune of the Chroma1-Base at a 1024x1024 resolution. If you're looking to do a quick fine-tune or LoRA for high-res, this is your starting point.

>Research Branch:

>Chroma1-Flash: A fine-tuned version of the Chroma1-Base I made to find the best way to make these flow matching models faster. This is technically an experimental result to figure out how to train a fast model without utilizing any GAN-based training. The delta weights can be applied to any Chroma version to make it faster (just make sure to adjust the strength).

>Chroma1-Radiance [WIP]: A radical tuned version of the Chroma1-Base where the model is now a pixel space model which technically should not suffer from the VAE compression artifacts.

So Chroma1-Base is somewhat different from Chroma 50?
>>
>>106793562
base is v48
>>
>>106793549
libertarians aren't calling for the gov to control money supply AKA STATISTS\commies
if you weren't chinese you would understand
>>
>>106793562
49 and 50 were meh version that turned into HD. 50 Annealed is an abomination.
>>
>>106793518
The father of capitalism, Adam Smith, would say you're an idiot and our system sucks.
If you don't like books, ask AI.
>>
>>106793518
surprisingly based
>>
>>106793529
isn't it open source?

https://www.youtube.com/watch?v=2SvPfkXs3Nk

https://files.catbox.moe/mg1ze6.png
>>
File: ComfyUI_temp_luduo_00009_.png (2.73 MB, 1152x1152)
Does the newer Qwen Edit model need its own lightning lora or does the old one work?
>>
>>106793586

libertarians =/= libtards

also, almost non-existent
>>
>>106793609
works for me
>>
>>106793623
also using this one
>>
>>106793609
the regular 8 step qwen image 2.0 lightning one works better than the 1.0 edit one, at least in my experience. use that, 8 steps.
>>
>>106793623
The edit and base loras are interchangeable?
>>
>>106793591
>sucks
yet all the spoils\benefits directly produced by it are how you are able to whinge\complain online kek
nice logic

no system is perfect
but historically the safest\smartest solution has always been to let the MARKET decide what is acceptable
vote with your wallet
you actually vote every single day

>dont like A, B, or C
>don't buy from them, dont support them
even walmart bends\noodles at the 10% margin loss mark, hence, all the surge of 'made in usa' items 'sweatshop free' items ect during that specific trend years ago

it doesn't take much for people to make real-world changes
but people would rather seethe and act like demoncraps that obey their (((tv))) programming overloads

you know im right
>>
File: 1753198225869052.jpg (496 KB, 2725x768)
the asian girl is sitting on a beach chair in a white bikini. to her right is a chibi plush doll of Hatsune Miku.

what a cool model.
>>
>>106793613
>almost non-existent
only if you believe the blatantly throttled viewcount numbers on 'YOU' tube
even elon tested this theory and laughed
>example: ron paul viewcounts on YT are in the 3-4 figures no matter what
>one shitpost show on twitter\x = viewcount in the hundreds of millions overnight
in response, fb, insta, google\youtube say:
'w-w-wait! it was the FEDS who MADE US do all that!!!"

the consolidation of the open-internet webtraffic was a mistake and will bring about its end
>>
File: WanVideo2_1_T2V_00223.mp4 (163 KB, 832x480)
How come nobody posts videos anymore?
>>
>>106793709
I'm a vramlet amdlet and I assume this would take me 30 min to gen so I don't bother
>>
By the way, when I use seedvr2 as an image upscaler with cfg 0.5 and 3 steps instead of 1, I get significantly better results—at the expense of time, of course.
I'm still surprised that it works with more steps.
>>
>>106793636
idk, I'm on Edit now

https://files.catbox.moe/bklho3.png
>>
>>106793726
What repo are you using that lets you adjust the cfg and steps?
>>
>>106793709
it takes substantially more time for me to render\edit\crop\salvage a wan2.1 vid than to show my illustrious waifus that i can doink out in 21 seconds per 1080p image hehe
>>
>>106793736
lora?
>>
>>106793709
the kino sora era may be over but its effects are everlasting
>>
>>106793744

this >>106779689
>>
File: jiggly.webm (3.85 MB, 1440x1904)
>>
>>106793761
<333333333333333333333
>>
>>106793762
SEXOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO
>>
>>106793738
numz repo
cfg had to be adjusted a few lines to make it accessible via node, and the step value is hardcoded deeper in the gen script.
>>
>>106793762
Lora or is this built in?
>>
I obtain an erection every time I find a new artist to train on.
>>
>>106793398
this is what i made
https://vocaroo.com/18yqEZuy1epV
>>
>>106793790
>Find artist to train on.
>Almost done building dataset
>Coom
>Lose interest in training

every time. On the rare occasions I do go to the point of training. I maintain a semi for like the entire training period.
>>
>>106793782
https://civitai.com/models/1822984/instagirl-wan-22
>>
>>106793774
Ah, okay. I can see "steps" in generation.py. I can just hardcode that to 3 and it should work, yeah? I can also change cfg to 0.5 in the shared values?
>>
>>106793809
I've seen this before but I still don't quite understand "what" it is.
>>
>>106793726
any examples/comparisons? curious
>>
can you guess what game this voice is from? https://vocaroo.com/1nqDuCh3fWUi
>>
>>106793827
it's a wan lora, you can use it for t2v or use it to generate a starting image and animate. it comes with an example workflow
>>
File: AniStudio-0000p.png (377 B, 666x666)
>>
>>106793832
vibe, can tell from the sting at the start
>>
>>106793832
I really have no idea but it gives me another hand touches the beacon vibes.
>>
>>106793844
switch sage attention off
>>
>>106793762
too skinny
>>
>>106793844
one billion dollars please mr softbank.
>>
>>106793844
Tile de VAE faggot
>>
>>106793709
>How come nobody posts videos anymore?
TrAnies in the thread can't gen videos and basedGODS are generating i2v sex of themselves and hot women without wanting to self dox
>>
>>106793811
Just in case anyone was curious: yes, this worked.
>>
File: gamers.png (787 KB, 1176x888)
>>
>>106793762
>women with big eyes roubd facdand chubby
Yuck, they are too gay
>>
>>106793762
I fucking love chubby women so much and I unironically hate Ozempic because I've seen a noticeable and sharp dropoff of fat women "retiring" and reemerging with saggy ozempic bodies..
>>
>>106793586
libs as in liberals, no one gives a flying fuck about libertarians. libertarians are as child-brained as anarchists

us government is a neoliberal government, an ideology it tries to foist upon the rest of the world, but only the vassals have fully accepted it and are getting ass pounded by it

simping for capitalists is simping for your owners like the good little wage slave you are

but i forget, 4chan is basically cia/mossad/mi6 whatever anyway so of course you'd simp for your masters
>>
>>
>>106793916
>wanting sound fiscal responsibility+ a properly maintained currency is child-brained
you simply do not know what you are talking about
>gov does bad things
stop giving them so much authority
>>
>>106793930
k you've got to be at least 18 to be here kiddo
>>
who cares
>>
local diffusion?
>>
sure is glow in here
>>
>>106793831
maybe later, still exp
>>
>>106793961
you cant even see your own nose on your face
your are literally under direct control of corporate interests if you want more gov\ hate capitalism\ etc
you are a useful idiot, sad.
you can only double-down
you can only see the splinter in your brothers eyes, while ignoring the log in your own eyes
a fool who cannot be taught
>>
rocketgulp needs another ban
>>
>>106793978
nta, but I just checked and I can see my nose. What kind of fake news is this?
>>
debo can you please just return to sdg and stop bringing up your libtard tirades?? you always do this
>>
>>106793978
read a book dumb fuck
>>
File: burntframe1.mp4 (1.36 MB, 480x832)
>988
can't argue with \ kill an idea? BAN HIM! ;D
>>>>/leddit/ (idiotville)
>>106793990
my favorite fake news was when they tried to say that the corn-syrup was better than regular sugar in the coke haha
how can these fucks even sleep at night??
>>106793958
yes ma'am!
my poor gpu will catch on fire any day now hehe

(too low of stepcount here)
>>
>>106793709
People always eventually lose interest in videos as it takes too long and you can only do so much with a few seconds of video. People so inclined are making porn with it but you can't post it here.
>>
qwen edit is literal magic. it's a meme maker, can make characters interact from diff images, can repose a character, change their clothes, remove elements, make them nude, make them thin or fat, etc.
>>
>>106794025
skittles sexo
>>
>>106793709
I don't play with video that much. It's silly fun though.
>>
>>106793709
sora 2 revealed how much of a toy wan is
>>
is there a qwen edit for video?
>>
>>106794042
>censora 2
>>
Why didst thou leave, Debo? I simply seek answers, my friend.
>>
>>106794044
vace I guess, but it's not quite the same.
>>
>>106794053
>unable to trick the machine
>>
File: 00331_.mp4 (707 KB, 832x640)
>>
>>106794063
we're going to have to go more jewish on this one
>>
>>106794063
>dudes finger is literally on fire
>>
>>106794069
back in the oven
>>
>rocketgurlp*** and n*gbo are fighting
love 2 see it
>>
>>106794061
>amazing anon, today's trick is tomorrow new censorship!
>>
>>106794061
post nsfw sora stuff then
>>
>>106794096
He meant tricking it in showing... gasp!... ankles!
Perhaps even the nape of the neck!
>>
>>106794106
kling will allow nipples to slip through if youre sneaky enough
>>
>>106794120
Yeah I was there when the cross trick worked, anything you got since they patched that is kind of sad.
>>
>>106794061
>training the censorship
>paying for this privilege
>>
>>106794120
pls saar, just a sliver of the bob saar
>>
>>106794155
Literally worse than a jannie
>>
>>106794158
blody bastards, no bobs, no vagane, no sexi sex
>>
>>106793529
>after taking $200 from people:
Was it really $200 to get in? Fucking kek
>>
>>106794175
The best part is there was absolutely no way that wasn't their intent from the start. They knew fully well people would and could make copyrighted characters with it and chose to let it fester for a day or two before pulling the rug.

Sometimes I hate the AI industry because of how much overlap it has with crypto.
>>
>>106794175
it's free
they're certainly paying with their data, however
>>
>>106794175
no, but 200 is their priciest tier for gpt in general (outside of api stuff)
>>
File: 1753637765341271.png (2.82 MB, 1416x2120)
>>
>>106794158
Kek. Yeah, it's always fun to see how far people will go to push the boundaries.
That being said, /b/ exists to subvert.
>>
>>106794199
Literally haven't been on /b/ since like 2010.
>>
>>106794084
>fighting
postcard seems a bit spicy but generally nice
>>
File: 00019-3344439821.png (2.77 MB, 1248x1848)
>>
https://vocaroo.com/1e6ioSLSpBbv
>>
it's truly a shame vibevoice never got training code
>>
>>106794270
training for what? it takes what ever the sample is and gives it back to you. if you want a girl yelling you just use a sample of her yelling. thats why video game wavs are so good cause you can grab expressive characters with different emotions in the same voice
>>
With the AI boom, I no longer look forward to the weekend, for I know there won't be any model releases during it... May monday come as soon as possible and may we get real time video gen by the end of next year.
>>
>>106794281
i want to feed it a bunch of shit from DLSITE and make it more robust for nsfw lol
>>
>>106793321
you're in luck since apart from chroma and the chroma radiance snapshot that I am posting, wan has good support. as do various sdxl derivative checkpoints

>>106793562
as the other anon said it's v48 but officialized
>>
File: 1400013636259.jpg (33 KB, 480x252)
>>106793003
>want to upgrade to using SSD for local
>need an AM5 motherboard
>need to replace 32GB of old ram
>>
>>106794293
>as do various sdxl derivative checkpoints
oh I know, some of the pony offshoots do it reaaaaalllllyyy well.
>>
Reminder a huge speed % is on the table with TensorRT that for some reason died.
https://github.com/comfyanonymous/ComfyUI_TensorRT
>>
>>106794294
What does ssd have to do with am5?
>>
>>106794314
>ComfyUI TensorRT engines are not yet compatible with ControlNets or LoRAs.
>>
>>106794341
The catch with TensorRT is that you have to spend a couple hours or whatever it was to get the optimized version of the model, but you can easily merge loras into the model and then optimize that to support any lora.
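For reference, merging a LoRA into a weight matrix is just adding the scaled low-rank product to it. A small pure-Python sketch, assuming the common `alpha / rank` scaling convention; `merge_lora` and `matmul` are illustrative helpers, not part of ComfyUI or TensorRT.

```python
def matmul(a, b):
    """Plain nested-list matrix product (a is m x k, b is k x n)."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

def merge_lora(weight, lora_down, lora_up, alpha, strength=1.0):
    """Return weight + strength * (alpha / rank) * (lora_up @ lora_down).

    weight:    out x in
    lora_down: rank x in
    lora_up:   out x rank
    """
    rank = len(lora_down)
    scale = strength * alpha / rank
    delta = matmul(lora_up, lora_down)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]

merged = merge_lora(
    weight=[[1, 0], [0, 1]],
    lora_down=[[1, 2]],   # rank 1, 2 inputs
    lora_up=[[1], [0]],   # 2 outputs, rank 1
    alpha=1,
)
```

Once merged, the result is an ordinary weight matrix, which is why the engine has to be rebuilt for every LoRA combination.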
>>
>>106794294
idk what limitations you have but maybe you could just use a pcie slot for the ssd?
>>
File: gamers2.png (932 KB, 1176x888)
>>
>>106794294
>need an AM5 motherboard for SSD
??????????????
>>
So I assume ovi was shit and a nothingburger?
>>
>>106794354
I don't mind waiting a few hours, but the lack of flexibility of being forced to merge loras is huge no unless I was doing the same thing for thousands of gens.
>>
>>106793208
nice
>>
Speaking of LoRAs. How does everyone here go about using multiple LoRAs for wan? I find if you add more than two the quality shits itself.
>>
>>106794390
If you dont mind waiting then let it optimize the model for each lora that you will be using a lot when you're not genning and thats it
>>
>>106793709
I don't want to play with Wan anymore, I tasted the forbidden fruit (Sora 2) and I don't want to go back to mid
>>
File: raadiance.png (2.54 MB, 848x1488)
>>106794374
seems to have video capabilities/aesthetics somewhat worse than wan and seems to have basic tts (far from one of the better open sauce tts)

but I don't think anyone here tested it very extensively yet. personally I don't see it as a high priority for myself.
>>
https://www.reddit.com/r/StableDiffusion/comments/1ny8971/its_not_perfect_but_neither_is_my_system_12gb/
not bad at all
>>
>>106794374
only 5090s can run it at the moment
>>
>>106794401
mostly just loading them in the rgthree stacker but i loaded them normally before as sequential nodes too.

it did not seem any worse than loading multiple loras on other models, but of course certain combinations work poorly as always
>>
>>106794374
i briefly tested it with their official inference code. it's wan 5b, but worse, with mediocre TTS strapped on. it was 100% released to cash in on the sora 2 hype
>>
>>106794374
>a wan 2.2 5b finetune is a nothingburger?
of course
>>
>>106794430
Yeah I find I have to walter white test tube meme the perfect weight at the right step to make sure the LoRAs don't cancel each other out and make body horror.
>>
File: _00001_.mp4 (831 KB, 832x640)
>>
File: mconion.png (613 KB, 1360x768)
corr
>>
>>106793562
> where the model is now a pixel space model which technically should not suffer from the VAE compression artifacts
does this mean perfect eyes, hands, fishnets, distant objects?
>>
File: raadiance.png (3.22 MB, 848x1488)
>>106794431
from the samples people posted i did notice that perhaps it gives more control over facial expressions

maybe that's of some use to someone
>>
>>106793498
>anyone who is stupid enough to believe any capitalist is doing anything "for good" deserves what they get
The problem is Capitalism, not the people. The majority genuinely believe they are doing great things. Unironically. You know you shitlibs think you're doing right, don't you?
>>
>>106794454
would

>>106794478
oh yes baby go ahead & donate 8^)
>>
does qwen image have prompt adherence as good as qwen edit?
>>
>>106794481
shitlibs are the idiots who believe in capitalism dummy.. they're the brunch crew and the pearl clutchers and the cheerleaders for the whole rotten system.. they only cry when the chickens finally come home to roost
>>
>>106794505
They think they are doing what is best.
>>
>>106793877
this
i wish i could train wan loras
>>
https://www.reddit.com/r/StableDiffusion/comments/1ny9h3f/samsungcam_ultrareal_qwenimage_lora/
this looks really good
>>
Chroma1-Base
or
Chroma1-HD
or
Chroma1-Flash
or
Chroma1-Radiance

I feel like this is a trick question.
>>
File: radiance.png (2.89 MB, 848x1488)
>>106794478
i do think the eyes are pretty decent.

fishnets usually have flaws but it's not like they're completely odd

hands certainly not always correct, distant objects... idk, what do you expect from them?
>>
>>106794510
sure, everyone thinks that, but most people are ignorant as fuck when it comes to politics and that's by design.. they're the morons who cheer for the system that actively fucks not just themselves, but virtually everybody else too
>>
>>106794186
What’s strange is that sora1 seemed pretty free rein up until now and they’ve clamped it all down. Wonder why they didn’t care before
>>
File: radiance.png (2.78 MB, 848x1488)
>>106794523
Base is a decent go-to.

Radiance is the current experimental model. For that one my personal opinion is that you might as well take the current snapshot if you're using something that experimental.
>>
File: wage_slave_worker_bees.mp4 (1002 KB, 832x640)
>>
>>106794530
democracy is stupid. women are retards.
>>
>>106794523
start with hd
>>
>>106794546
democracy is completely at odds with capitalism
>>
>>106794540
Why does it have banding?
>>
>>106794538
>Wonder why they didn’t care before
because no one wanted to play with sora 1, it's a bad model, OpenAI "only" got insane hype for GPT4, 4o imagegen and Sora 2
>>
>>106794551
hardly. capitalism typically captures whatever is around it, if it's weak.
>>
>>106794527
>fishnets usually have flaws but it's not like they're completely odd
this kind of pattern has always been the chroma tell. knowing that chroma itself can't produce these fine details makes radiance kind of useless as a proof of concept for pixel space vs vae.
>>
ok
>>
>>106794555
Call me crazy but sora1 was the best image Gen model I’ve used locally or saas. Won’t do lewd of course but it always seemed to understand whatever half baked idea I threw at it. Since I have ChatGPT plus for work I figured I may as well mess around with it and was pretty pleased actually. But local is still king I can’t gen big titty 1girls getting anally spit roasted on saas
>>
>>106794527
insufficient style prompting
>>
>>106794557
capitalism can only move in one direction and that is toward complete monopoly which is entirely at odds with anything but fascism
>>
File: ice protestor.png (1.97 MB, 1024x1024)
>>106794565
local is king for racism.

repost from previous thread. (other gen going on)
>>
>>106794565
>always seemed to understand whatever half baked idea I threw at it
All OAI image/video models interpret and rewrite your prompts. That's why they feel so "intuitive". Try doing the same with wan.
>>
>>106794582
not only that, but 4o imagegen is an autoregressive model, so it knows how to "think" and expand your simple prompt in its own layers without any rewrite, as a real human would actually
>>
so much newfaggotry recently
>>
>>106794588
always happens with any big model launch, saas or not
>>
File: radiance.png (2.87 MB, 848x1488)
>>106794552
I don't really see it. perhaps it's just the influence of 2d art

>>106794559
even if it simply learns more quickly it'll be fine for many users
>>
>>106794538
>Wonder why they didn’t care before
They did. They just wanted to hook people in before they built up any real liability.
>>
>>
>>106794607
classic bait and switch, and the APIkeks fall for it again and again, when will they learn?
>>
>>106794608
Are those guys packing cocaine?
>>
>>106794401
lower the strength of the loras on the low noise model side to something like 0.4.
>>
>>106794602
Well, when there's a vae, we'd say "it's the vae". But actually, I think it's the stripe the model follows. could be wrong.
>>
>>106794401
>I find if you add more than two the quality shits itself.
that's the problem with loras, you can't really stack them, that's why I'm angry that the modern base models know so few concepts, you can't go far with just loraMaxxing
>>
horror no ai can generate. checkmate ai.
>>
>>106794608
change the guy walking down the row to big smoke
>>
>>106794629
You can stack them if you basically do some final inpaint passes where you inpaint each character on its own pass. Style loras generally stack well.
That being said has comfy ever introduced a decent inpaint node where you can paint directly on the image or do you still have to make masks in a third party app?
>>
File: Samsung 990 PRO 1TB.png (156 KB, 1335x436)
>>106794355
>>106794332
>>106794370
My current mobo is MSI B450 Gaming Plus Max. Apparently I will only get half the speed from this ssd.
>>
>>106794658
It don't matter. None of 'dis matters.
>>
>>106794658
bro half speed is like 3GB/s lol, that ain't an issue
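Quick sanity check on the numbers, assuming the B450 board's M.2 slot runs at PCIe 3.0 x4 and the drive is a PCIe 4.0 x4 part. These are theoretical link ceilings, not measured drive speeds, and `pcie_gb_per_s` is just an illustrative helper.

```python
def pcie_gb_per_s(gen, lanes=4):
    """Theoretical one-direction PCIe bandwidth in GB/s for gen 3 or gen 4."""
    gt_per_s = {3: 8.0, 4: 16.0}[gen]   # transfers per second per lane
    encoding = 128 / 130                # 128b/130b line encoding overhead
    return gt_per_s * lanes * encoding / 8  # bits -> bytes

gen3 = pcie_gb_per_s(3)  # roughly 3.94 GB/s
gen4 = pcie_gb_per_s(4)  # roughly 7.88 GB/s
```

So the older slot caps the link at about half of what a gen 4 slot allows, which is still gigabytes per second.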
>>
How are the new Wan2.2 lightning LoRAs? Better than the 2.1 version?
>>
>>106794688
The NEW new ones? They definitely fix the motion issues, but it's also t2v only right now. They also increase the saturation like crazy if you don't tone them down. Like the acid just started kicking in or something.
>>
File: 1746979637772008.mp4 (1.47 MB, 720x1248)
the woman laughs. she drinks beer from the mug

the beer is floating in the air but it works
>>
>>106794737
good, almost there but still slopped sadly
>>
File: Animate01.webm (3.74 MB, 1440x1062)
>>
>>106794744
neat
>>
>>106794621
Which is why the multi day trolling could only be done by a disabled retard
>>
why has nobody done for qwen what pony and its derivatives did for sdxl
i love qwen man
>>
>>106794828
big model
>>
>>106794828
it's a 20b model anon, look at chroma, it's a 8.9b model and that furry fag had to wait 6 months to """finish""" that finetune
>>
>>106794744
butiful lightx2v slow motion wanslop
>>
>>106794840
>>106794852
are we talking like crypto rich nigga, ben mallah, or elon musk?
>>
>>106794853
can't wait to get the improved i2v version, they nailed their latest t2v lora
>>
bros i have yet to try qwen image / edit, are there any lightning loras or anything like that? gonna need all the cope i can get to run it
>>
>>106794868
it'll probably cost millions to make a full scale finetune of a 20b model, for chroma we have this information:
>8.9b model
>5 million images
>48 epochs at 512x512 + 2 epochs at 1024x1024
>150 000 dollars
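Back-of-the-envelope arithmetic on those figures. The numbers are taken from the post as-is and not verified; the linear-in-parameters extrapolation is a naive assumption that ignores dataset size, resolution mix and memory overhead, so treat it as a floor rather than an estimate.

```python
IMAGES = 5_000_000
EPOCHS_512 = 48
EPOCHS_1024 = 2
COST_USD = 150_000
PARAMS_B = 8.9  # billions of parameters

# Total image passes over the whole run: 250,000,000
total_passes = IMAGES * (EPOCHS_512 + EPOCHS_1024)

# Dollars per million image passes, averaged over both resolutions
cost_per_million_passes = COST_USD / (total_passes / 1_000_000)

def naive_cost(target_params_b):
    """Scale the quoted cost linearly with parameter count (assumption only)."""
    return COST_USD * target_params_b / PARAMS_B

qwen_floor = naive_cost(20)
```

Anything that grows the dataset or the share of high-resolution epochs pushes the real figure above this floor.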
>>
>>106794737
>the beer is floating in the air but it works
the beer was "floating" in the initial pic bro
>>
File: 1759637114976293.png (207 KB, 615x688)
yeah, I'm thinking pytorch is cooked
>>
>>106794872
https://huggingface.co/lightx2v/Qwen-Image-Lightning/tree/main
use 8 steps v2
>>
>>106794887
yea sure it """runs""" but how good will it actually be
>>
sora 2 killed the party. even the other online tools, are less impressive now. local is still great for nsfw content. but for sfw... lol
>>
File: radiance.png (2.53 MB, 848x1488)
>>106794628
IIRC it's 16x16 patches. IDK if anything could encourage banding.

>>106794658
I think you'll be ok with that speed. YMMV about the size.
>>
>>106794887
>1 bit
holy lobotomy
>>
>>106794897
you forgot sora giving itself a lobotomy for the sake of safety + not wanting the biggest class action lawsuit in history against them
>>
>>106794892
but it's for the old version of qwen edit right?
>>
>>106794887
finally, a model that can function like myself (just one brain cell)
>>
>>106794911
it works with 2509
>>
>>106794910
>not wanting the biggest class action lawsuit in history against them
this is why SaaS will always lose
>>
>>106794897
>>106794910
>you forgot sora giving itself a lobotomy
it's like witnessing the iPhone release a second time and then Steve Jobs stops producing those phones 2 days later, like we've seen the future but the company chickened out because it was deemed too dangerous, now we'll have to wait for a company with more balls to release the equivalent to the iPhone
>>
>>106794887
people need to train their models at 1bit from the start, you can't just quant an existing model down to 1bit and have it work
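
A toy illustration of why naive post-training 1-bit quantization hurts. This is a sketch with made-up Gaussian weights and a simple sign-plus-shared-scale scheme; it is not the method used by any particular 1-bit model, and the names are illustrative.

```python
import math
import random


def binarize(weights):
    """Naive 1-bit quant: keep only the sign, plus one shared scale.

    The scale is the mean absolute value, which minimizes squared error
    for a sign-only representation.
    """
    scale = sum(abs(w) for w in weights) / len(weights)
    return [scale if w >= 0 else -scale for w in weights]


def relative_error(original, approx):
    """L2 norm of the difference divided by the L2 norm of the original."""
    num = math.sqrt(sum((a - b) ** 2 for a, b in zip(original, approx)))
    den = math.sqrt(sum(a * a for a in original))
    return num / den


# ASSUMPTION: weights are roughly Gaussian, a stand-in for a real layer.
rng = random.Random(0)
weights = [rng.gauss(0.0, 0.02) for _ in range(10_000)]
quantized = binarize(weights)
err = relative_error(weights, quantized)

if __name__ == "__main__":
    print(f"distinct values after quant: {len(set(quantized))}")
    print(f"relative reconstruction error: {err:.2f}")
```

For Gaussian-like weights the relative error lands around 0.6, meaning most of the weight information is gone in a single rounding step, which is the gap that training at low precision from the start is meant to close.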
>>
is there any good db of openpose / controlnet skeletons?
>>
>>106794925
desu, if they managed to train a model at 4bit while keeping the quality of fp16, we would be able to run bigger models and compete with APIkeks
>>
File: 1733950966480600.png (94 KB, 276x182)
>>106794887
>I'm thinking pytorch is cooked
>April 2025
https://xcancel.com/LiorOnAI/status/1913664684705874030
>>
>>106794933
nvm im retarded
>>
reminder, magic is real with qwen edit.

with remove clothes qwen edit 2509 + the 8 step lightning lora: "remove the yellow banner over the breasts of the naked anime girls."

https://files.catbox.moe/dmjhva.png
>>
>>106794979
i love you for shilling this anon
>>
>>106794991
well no one has to pay me, I shill it cause it's a really neat tool for manipulating and editing all images. now all we need is wan 2.5.
>>
I only hope China can drop some insane uncensored models before they grow too indulgent and start caring about nonsense like women's rights and copyright
>>
>>106794997
>insane uncensored models
Yeah but from who? Wan is censored.
>>
>>106795004
I don't know. But there is only one nation which can rival Altman's legions of Ugandans. CHYNA.
>>
>>106794997
>before they grow too indulgent and start caring about nonsense like women's rights and copyright
they seem to care a lot about copyright though, Wan and Hunyuan don't know shit about pop culture
>>
>>106794979
also, if you feed a 1024x1024 EmptySD3LatentImage node into the KSampler's latent_image input (overriding the default size taken from image1)

you get this: "show the nude anime girls."

it's not wide enough for all of them but you get the idea.

https://files.catbox.moe/1iqs3d.png
>>
File: Animate02.webm (3.86 MB, 1440x1062)
>>106794871
hopefully it's good, lightning 2.1 is not as consistent
>>
>>106795019
>care a lot about copyright
this could've just been the autocaptioner not including character names or game names
>>
>>106794923
>dangerous
that's a funny way of saying expensive
>>
>>106795020
got a link to that lora?
>>
>>106795030
Anthropic got raped in court, why do you think OpenAI is immune to the copyright mafia?
>>
>>106794997
it's too late. they already started to act like americans. it wasn't even 5 years ago children had aspirations of being astronauts and scientists. now they all want to be brainrot influencers
>>
>>106795039
ill upload it, some anon posted it earlier in the week, give it a bit
>>
>>106795041
>Anthropic got raped in court
is this why they're going in an anti-slop direction with their "Keep Thinking" brand
>>
>a local model consistently passing my "2d cat with 3d tail" test that some SOTA models struggle with
oh shit-
>model too fucking fat for home use
:(
we're so fucking close man. hunyuan 3 would've been salvageable if smaller but at least this gives me hope for the future of image gen
>>
File: 1748058220901278.png (1.41 MB, 1360x768)
the anime girl is wearing a black bikini.

steel bikini run?
>>
does qwen image work best with hyperslopped llm rewritten prompts or can i just type shit in like with wan
>>
File: 1758693823771603.png (419 KB, 1630x1149)
>>106795027
they should use Gemini to caption images desu, this shit seems to know characters
>>
>>106795060
it understands language better than most models, no need for llm scripts
>>
>>106795064
is this pro or flash
>>
>>106795070
gemini 2.5 pro
>>
>>106795074
that'd be expensive as shit to run over tens of millions of images
>>
File: 1752170576051451.png (1.39 MB, 1360x768)
>>106795059
better glove:
>>
>>106795079
well yeah that's the price to pay for a quality dataset, we're talking about multi billion dollar companies (Tencent and Alibaba), I'm more angry at Tencent though, instead of spending millions on a giant 80b model they should've used that money on Gemini pro and a normal sized model
>>
>>106795090
they dont give a fuck about the dataset as long as it's merely okay. it's a means to an end so they can move on to the next project
>>
File: 1751144373739692.png (223 KB, 1413x888)
>>106795095
>they dont give a fuck about the dataset as long as it's merely okay.
that's the problem, they don't give a fuck and at the same time expect us to give a fuck
>>
For the sdxl-based 1girl sloppers out there, I’m messing around with the new epsilon scaling thing hao ported to forge classic and it does seem to produce nicer outputs. I can’t compare directly as pretty much all the gens I’ve got stored are from vpred noobxl and derivatives, no eps models, but even for the vpred models it seems to improve output. However it also changes the output, unlike what it is supposed to do for eps models. Fun to mess around with if you guys haven’t pulled recently. It’s also in comfy, he ported it from there
>>
File: 1734211382616241.png (45 KB, 468x60)
>>106795039
k it finished uploading

https://gofile.io/d/sfoxub

qwen clothes remove 2509 lora anons
>>
any word if Celestial will be any closer to tensor?
>>
>143s/it qwen edit
really nigga
>>
>>106795144
use the template workflow, use the 8 step qwen image v2 lightning lora
>>
File: hi.png (1.18 MB, 1024x1024)
>>106795058
not bad but also it doesn't indicate sufficient 1girl power even if
>>
>>106795110
ill give it a shot, thx anon
>>
>>106795160
I am :|
>>
>>106795058
>we're so fucking close man
far from it, the outputs are ultra slopped, it doesn't know characters or celebrities, it's just qwen image with a slightly better prompt comprehension since it's an autoregressive model
>>
>>106795058
for all those parameters it better be able to do that
>>
https://huggingface.co/drbaph/Qwen-Image-Edit-Mannequin-Clipper-LoRA

Is this trolling?
>>
>>106795163
which gpu?
>>
File: 1751504215756212.png (1.22 MB, 824x1256)
the anime girl Hatsune Miku is on a giant billboard on the side of a building, in Akihabara Japan. The billboard extends from the street to the ceiling.

bit redundant since the image is of an anime girl, but it still works.
>>
>>106795144
you using a 1050 ti or some shit? i'm a vramlet and im getting acceptable speeds
>>
>>106795175
>>106795181
7900 GRE. But I'm noticing there may be significant updates I can make to ROCm. I have to check this because I agree something is wrong.
>>
>>106795185
>amd
that explains it.
>>
>>106795095
>it's a means to an end so they can move on to the next project
so they're not aiming for something usable for themselves?
>>
>>106795162
Go to “settings in ui” and add “scaling_factor” to get the slider with your txt2img settings, rather than switching to the settings tab each time to fiddle with it.
If you have noobai eps or models based off it it should work as a “refiner” of sorts if I understand right.
>>
>>106795187
We will see. I'm leaning more towards my incompetence in setting it up with the latest updates. Someone says that every time and it always just ends up being my incompetence. AMD deserves bullying but it's not completely useless.
>>
File: radiance.png (2.08 MB, 848x1488)
>>
>>106795201
is he still working on that shit? do we know when it's gonna be done?
>>
fucking jeets, now we'll never get sovl models because souless shit like HunyuanImage 3.0 managed to be 1st on that memeboard
https://xcancel.com/TencentHunyuan/status/1974522542858911935#m
>>
>change seed on qwen
>image changes slightly
not a very creative model i see
>>
>>106795180
>in Akihabara Japan
akiba doesn't have the giant screens. it's all static billboards and none that big. you are thinking of shibuya
>>
File: 1755462213302149.png (234 KB, 1461x747)
https://xcancel.com/LodestoneRock/status/1974487600225546733#m
is this another snake oil?
>>
File: radiance.png (2.77 MB, 848x1488)
>>106795205
> is he still working on that shit?
yes, and also it has only been like a month

> do we know when it's gonna be done?
no clue
>>
File: 1739922243327881.png (3.76 MB, 2580x808)
the girl is sitting in a chair in a library. keep her expression the same.

neat, from a swimming photo even.
>>
>>106795218
how much slower is it compared to the regular VAE process on chroma? and does it eat more vram?
>>
>>106795194
>>106795162
>>106795110
I'm on comfy, I pulled but I can't seem to find the node, lol
>>
>>106795238
switch to nightly
>>
>>106795242
whats the node called
>>
>>106795226
>the girl
Can you feed it a highly specific person and have it work? Like does it work with this person?

I seriously doubt I can get qwen working with my gpu, gotta upgrade.
>>
>>106795216
>the pseudo intellectual furry retard that ruined his model to try meme experiments wants to experiment with memory directly
gee I wonder
>>
>>106795250
sure, use "keep their expression the same" to retain their facial details though, in general.
>>
>>106795252
does it work with the library thing? Like can you make him (he's just a random person) sit in the library? Or does it look basically like a photoshop?

Because my suspicion is that the model detects Lara Croft, and then just gens her. I don't think it has an internal sense of the look of a random new person.
>>
>>106795261
it works with any image, real person or anime. you can make them point a gun if you want, whatever. it's like gen AI but can manipulate stuff with prompts.
>>
File: radiance.png (2.74 MB, 848x1488)
>>106795227
i'd say about a third to half the speed but it is hard to compare. as for the RAM that might need comparative testing across a bunch of resolutions but IDK if anything much is optimized so not sure there's much of a point.
>>
>>106795247
It’s called “epsilon scaling”. I apologize for linking preddit but here https://old.reddit.com/r/StableDiffusion/comments/1nwmj4m/epsilon_scaling_a_real_improvement_for_epspred/
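
For anyone wondering what the slider does: as far as I understand it, epsilon scaling divides the model's predicted noise by a factor slightly above 1 before the sampler uses it. Below is a minimal sketch of that idea on plain lists. It is not the ComfyUI or Forge Classic code, the function names are made up, and the 1.005 default is only an illustrative value.

```python
def scale_epsilon(eps_pred, scaling_factor=1.005):
    """Shrink the predicted noise by dividing it by scaling_factor.

    ASSUMPTION: this is the basic form of epsilon scaling; the real nodes
    may apply it at a different point in the sampling loop.
    """
    if scaling_factor <= 0:
        raise ValueError("scaling_factor must be positive")
    return [e / scaling_factor for e in eps_pred]


def denoise_step(x, eps_pred, sigma, scaling_factor=1.005):
    """Estimate the clean sample as x - sigma * scaled_eps.

    ASSUMPTION: sigma parameterization, i.e. x = x0 + sigma * eps.
    """
    scaled = scale_epsilon(eps_pred, scaling_factor)
    return [xi - sigma * ei for xi, ei in zip(x, scaled)]


if __name__ == "__main__":
    x = [0.8, -1.2, 0.3]
    eps = [0.5, -0.7, 0.1]
    print(denoise_step(x, eps, sigma=1.0, scaling_factor=1.0))    # no scaling
    print(denoise_step(x, eps, sigma=1.0, scaling_factor=1.005))  # scaled
```

With a factor of 1.0 nothing changes, and values just above 1 pull slightly less noise out at each step, which is why small slider changes already shift the output.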
>>
File: 1746915410204115.png (974 KB, 1360x768)
change the background to a movie studio with large green screens and a couple of movie cameras. A white piece of paper taped to the wall says "we'll do it, some day".
>>
>>106795268
>>
File: radiance.png (2.5 MB, 848x1488)
>>106795261
like the wan orbital camera movement videos, it works with basically any character including random chroma gens or w/e where there's no way it's an actual known character as such

shit is pretty magical
>>
File: 1757910507362124.png (769 KB, 1176x880)
>>106795261
like this for example.

the man with glasses is pointing a silver pistol at his head.

the model doesn't know who the fuck they are. but it knows how to edit/swap stuff.
>>
>>106795216
>"We solved VRAM bro, Nvdia is no more!"
>By the creator of Chroma!
kek, you know it's bullshit, c'mon
>>
>>106795279
Yeah, but does it know how to read/understand faces enough to rotate them in 3d space? Like generating actual new views with a likeness intact?

if so, imo that would be the first ai with documented likeness capability.
>>
>>106795285
it can do transforms, like "view from the back" or side, pretty neat how it works desu
>>
>>106795278
>like the wan orbital camera movement videos

It's more magical to do a large movement than basically slight changes in a row.

Does the likeness drift if you have the head return to the original position?
>>
File: libraryman.png (1.15 MB, 1016x1032)
>>106795196
using
--use-split-cross-attention
improved it considerably
>>106795250
pic related
>>
>>106795291
3/4ths view, does it look like the same person? or at least usually?
>>
>>106795296
it depends, if no face is visible the model has to guess, but you can prompt details (eye color, ethnicity, shape, etc).
>>
What's the nomenclature for Qwen edit with multiple images? Insert image1 into image2? Can it work up to image3?
>>
>>106795307
the default 2509 edit layout that comes with comfyui has 3 load image nodes so I assume yes. And yes to the image1 image2 thing.
>>
>>106795294
not bad. Seems close enough.
>>
File: QwenEdit_00151_.png (1009 KB, 1016x1032)
>>106795317
>>106795294
added the samsung insta lora
>>
File: x.mp4 (1.85 MB, 1056x1856)
>>106795292
have you ever actually done anything like that? do slight changes the trivial way and you'll lose coherence, the backgrounds won't make sense and so on. ai techniques are pretty fucking magical now.
>>
>>106795330
That jiggle is so unrealistic, but also like so good. Real women never stood a chance
>>
best model at generating HELL?
>>
>>106795330
i'd play that walking simulator
>>
>>106795218
>>
is cum covered considered sfw? I could name the file "slime girl" or something.
>>
File: x3.mp4 (554 KB, 1056x1856)
>>106795350
features more jiggle than base wan, surely still to be perfected.

>>106795391
might not be long in the future - multiple different ai techniques could lead there

>>106795398
nice
>>
>>106795417
>features more jiggle than base wan, surely still to be perfected.
are you just using a lora on wan or something? what do you mean
>>
>wan 2.2 smoothmix is great at motion but ignores the last frame for a loop

Fuck off.
>>
>>106795214
who cares
>>
File: Spoopy2.mp4 (2.32 MB, 480x832)
It's spooky gen season
>>
>>106793488
>>106793501
>>106793518
>>106793638
Kill yourself
>>106793535
That was before you transitioned and suffered traumatic head injuries, i assume?
>>
https://files.catbox.moe/fnh3wo.png
nsfw titties
if only qwen didnt fuck up the teak
>>
File: glass_chair.png (927 KB, 840x1240)
>>106796532
>>
>>106796532
>>106796596
any tips on unslopping the skin? possible?
>>
qwen_image_edit_2509 is so bad at prompt following
>>
>>106796617
I think the model is really just trained for "putting clothes in image 2 on person in image 1" and anything other than that is a throw of the dice
>>
>>106796617
>>106796625
I don't know if this is some kind of elaborate samefag ruse but there has been an anon posting gens over the last few generals which has proven that this isn't true at all.
>>
>>106796636
I assume you mean e.g.
>>106791124
I tried that with the native comfy workflow i.e. https://docs.comfy.org/tutorials/image/qwen/qwen-image-edit
1/5 it works nicely, the rest of the time it spits out the input image unmodified. Tried different prompts, no luck.
>>
File: girlsubway.png (1.13 MB, 840x1240)
>>106796688
dunno what to tell you
I've had that before a few times, it was because my prompt was super vague or needed adjustment
first take
the girl is seated in the tokyo metro reading a newspaper in a populated subway car
>>
>/ldg/
>/sdg/
>/adt/
all identical please get your shit together


