/g/ - Technology


Thread archived.
You cannot reply anymore.




Noise Convergence Edition

Discussion of Free and Open Source Text-to-Image/Video Models and UI

Prev: >>106739587

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/sd-scripts/tree/sd3
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts
https://github.com/tdrussell/diffusion-pipe

>WanX
https://comfyanonymous.github.io/ComfyUI_examples/wan22/
https://github.com/Wan-Video

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
Training: https://rentry.org/mvu52t46

>Neta Lumina
https://huggingface.co/neta-art/Neta-Lumina
https://civitai.com/models/1790792?modelVersionId=2203741
https://neta-lumina-style.tz03.xyz/

>Illustrious
1girl and Beyond: https://rentry.org/comfyui_guide_1girl
Tag Explorer: https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbours
>>>/aco/csdg
>>>/b/degen
>>>/b/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
>>
I need to take a shit real fucking bad, but I have so much fun genning..
>>
File: 00135-1563371775.png (2.79 MB, 1536x1536)
>>
File: file.png (2.03 MB, 1536x864)
>>106743839
You dropped >>106743830
>>
>>106743909
laughable
>>
File: 1734638977163041.mp4 (1.61 MB, 640x640)
hatsune miku is dressed in soldier fatigues running through a battlefield
>>
File: 1728451674344437.png (1.27 MB, 768x1280)
why buy a 6000 when you can buy a 5090 with a workstation mobo and just swapmax?
>>
File: 00178-3380301330.png (2.72 MB, 1344x1728)
>>
File: 00193-66941820.png (2.32 MB, 1728x1344)
>>
File: 1747169419905448.png (20 KB, 1205x172)
What's a preprocessor? I can't remember. I know I use tile in img2img since some anon told me to do that, and said to use this model. It works but I don't understand how or what it's doing.
>>
File: wan22_00669.mp4 (451 KB, 448x576)
>>
>>106743909
model/catbox?
>>
>>106744023
there is no good reason to buy one at that price unless you plan to do training, especially since it won't be much faster than a 5090; you'll be compute bound for most video/image tasks anyway. It will mostly be fantastic for LLMs: with system RAM he can run GLM at q4, which is somewhat on par with DeepSeek, using just that card.

I'm sure you can do some fun stuff like having LLM/TTS/image gen all on device. But it won't be good. It will be jank, too slow, and not real time even on that card.

if you plan to keep it for 5 years, $2k/year is doable. I just know they're gonna come out with Flash Attention 4 or some shit, and in a few years you'll be a cuck like the people genning vid on 3090s right now.
>>
Anyone ever had dark thoughts about robbing a shipment of Nvidia cards?
>>
>>106744239
https://files.catbox.moe/254tgy.png
https://civitai.com/models/784543?modelVersionId=2142667
>>
>>106744421
danke
>>
>>106744407
Yes, more than once
>>
File: WanVideo2_2_I2V_00464.webm (873 KB, 1248x768)
>>
Does anyone know if musubi tuner can do regular flux and chroma loras? It does flux kontext of all things, for whatever reason.

>>106744023
>>106744277

Why not? Kinda sounds like poor cope. A faster card and a shit ton of ram sounds good to me.
>>
>>106743542
post prompt postcard! :)
>>
https://huggingface.co/papers/2507.01051

what
>>
File: 00229-1154129823.png (2.69 MB, 1728x1344)
>>
>>106744648
Nice finger count.
>>
File: 1746334153397035.png (288 KB, 420x429)
>>106742822
boob size fix https://files.catbox.moe/t7nsds.mp4
https://civitai.com/models/1918611?modelVersionId=2204588
>>
>>106744568
the 6000 is only like 10-15% faster for several thousand dollars more. It's a nice bonus, but the VRAM takes center stage, and if you're just gonna be running Wan, it's not going to be a huge difference. Sure, you can run larger video models and full-precision stuff, but those will run even slower, to the point of suffering. Even on a 5090 I am constantly lowering resolution and sampling steps just to get it to render faster. Unfortunately, what's required to make 1080p video render quickly is a full B200 GPU stack that costs millions, plus some video models that have not been open sourced. Basically, you have to be Elon Musk.

I'm sure a Comfy workflow could even negate the offload penalty by batch-generating all the high-noise passes and then doing the same for the low-noise passes, so you could come very close to the 6000 Pro's speed bonus from keeping models on the GPU. For fucking free.
>>
>>106744684
>Basically, you have to be Elon Musk.
Am I the only one who's pissed off that GPU and hardware manufacturers can almost exclusively cater to a handful of billionaires and still turn record profits? Like what's the fucking point if they alone can drive the market.
>>
best way to give a model an image and basically tell it to just gen more of something similar? Captioning the image and feeding that as a prompt isn't really enough, although it can be interesting. Can IPAdapter be used for this, or is there something else?
>>
>>106744713
line go up
>>
>>106744713
the average consumer has stopped mattering to corporations a long time ago. we're lucky they don't decide to just turn everyone into mulch
>>
File: 00250-1937915327.png (2.92 MB, 1344x1728)
>>
File: WanVideo2_2_I2V_00467.webm (957 KB, 1152x720)
>>
>>106744716
>captioning the image and feeding that as a prompt isnt really enough
It should be if you also do img2img with the original. A high denoise value should get nice variants.
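In diffusers terms (a sketch assuming the workflow maps onto something like `StableDiffusionImg2ImgPipeline`; the helper names and strength values are mine, not from the thread), the denoise/strength value decides how many of the scheduled steps actually run, which is why a high value gives looser variants of the input:

```python
def effective_steps(num_inference_steps: int, strength: float) -> int:
    # diffusers-style img2img: only the last `strength` fraction of the
    # noise schedule is run, so strength 1.0 behaves like full txt2img.
    return min(int(num_inference_steps * strength), num_inference_steps)

def gen_variant(pipe, image, prompt, strength=0.7, steps=30, seed=0):
    # Hypothetical helper: high strength (~0.6-0.8) keeps the rough
    # composition while the caption-derived prompt introduces variation.
    import torch
    gen = torch.Generator(device="cuda").manual_seed(seed)
    return pipe(prompt=prompt, image=image, strength=strength,
                num_inference_steps=steps, generator=gen).images[0]
```

For example, strength 0.5 at 30 scheduled steps runs only 15 actual denoising steps, which is why low values stay close to the source image.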
>>
>>106744727
don't give them ideas
>>
>>106744727
rich people stop being rich when everyone serving them dies. While I love the idea of omnicide, the rich and powerful wanna keep this going.

>>106744713
the rich are bad with money and are mentally children. But unfortunately they have all the money. Gamers who care about money would have never let nvidia get away with this.
>>
>>106744779
>Gamers who care about money would have never let nvidia get away with this
idk about that, when the average goymer buys the newest 90-series card the moment it comes out because bigger number better, despite it not being needed for basically any game except the most pozzed AAA propagandaslop that's unoptimized on purpose, while looking worse than games from 8 years ago, to sell those very cards
>>
>>106744738
sexo
>>
>https://huggingface.co/Efficient-Large-Model/LongLive-1.3B
>https://nvlabs.github.io/LongLive
>Nvidia LongLive: Real-time Interactive Long Video Generation
Which part of this video generator exactly is real-time? Is "Real-Time" with us in the room right now, Nvidia?
https://youtu.be/Wb5FdiCIXwI
>takes double the time of the target video length to generate 24FPS video
>iF wE sLoW dOwN pLAyBaCk sPeEd tO 0.5x wE cAn AcKshUalLy ViEw iN rEaLTiME
>lEtS aLsO jUsT cAlL 0.5x = 1.0x aNd 1.0x = 2.0x sPeEd.
>nOw iT's tRuLy rEaL TiMe!!!!
the audacity of these unhinged researchers to call that realtime.. or did the jeet fuck up somewhere?
>>
>>106744789
the most popular cards are the XX-50 series bro. Gamers are cheap as fuck.
>>
>>106744756
heh
>>
File: 00277-3507880779.png (2.88 MB, 1728x1344)
>>106744802
indeed
>>
>>106744641
>they cannot meaningfully consent to the numerous potential outputs their data might enable or the extent to which the output is used or distributed
These people are completely nuts.
>>
>>106744779
>>106744789
why are we pretending it's gamers' fault when nvidia makes most of its money from ai
>>
>>106744838
women
>>
>>106744838
enjoy the consent cult making everything insane
>>
can someone look at this please?
>>106744809
I really wanna know if I misunderstood something and it's actually realtime.
>>
>>106744835
what is this absolute slop
>>
>>106744811
right, what i wanted to say is that they buy the 90 if they have the money, which yes, most don't, but a crap ton still do, rewarding ngreedia and ultimately showing they have no principles. the only average gamers not rewarding ngreedia are those that just can't drop 2-4k on a gpu
>>
>>106744809
Actually quite impressive considering those samples took around 8 minutes to gen for a 30 second vid. This is the direction we needed. Now please, for the love of god, someone slop this technology into wan
>>
>>106744641
If this represents the current state of social studies papers, it's kind of pathetic.
>>
And now, a short poem about seeing a comfyui update and saying "ah, I am sure it's fine"

>And on the pedestal, these words appear:
>My name is Ozymandias, King of Kings;
>Look on my Works, ye Mighty, and despair!
>Nothing beside remains. Round the decay
>Of that colossal Wreck, boundless and bare
>The lone and level sands stretch far away.
>>
>>106744901
did one this morning and for some reason my vram usage went down
>>
>>106744878
this is based on WAN afaik
>>
File: 1754908411471452.png (226 KB, 1818x812)
>>106744641
>foid papers
>nothing but globohomo propagandaslop
So this is the vast contribution of women to world research.
>>
File: 00304-651905710.png (2.58 MB, 1344x1728)
>>
File: 1734553835746341.png (151 KB, 465x453)
>>106744920
>checks huggingface

les goooooooooo
>>
File: file.png (12 KB, 739x124)
>the average AI user
>>
>>106744641
30+ year old woman going insane with the possibility of ai sex robots
>>
File: 1739364583781090.jpg (7 KB, 200x200)
>>106745026
>>106744641
>>
File: 00321-1150737988.png (2.45 MB, 1152x2016)
>>
>>106744838
I like how the "AI Done Well" example was just a regular ol' EULA with nothing else mentioned previously (temporal consent, etc). Wish I could get paid to write gibberish.
>>
>>106744851
>consent cult
It's a weird mix of applying consent to areas where it makes no sense and mixing it with copyright ideas so far-reaching they would make any derivative work essentially illegal.
Or even looking at anything, for that matter.
>>
if i use the 4 step light loras, is there any point to go higher than 4 steps on each ksampler or is the result going to look the same?
>>
My AI consents genning raunchy porn 24/7.
>>
If we just make the AI horny, then it'll like generating horny stuff. Elon has the right idea.
>>
File: Qwan_00001_.jpg (852 KB, 2976x1984)
Think they'll nunchakufy Hunyuan 3.0? I'd take a nunchakufied Hunyuan 3.0.
>>106745103
Pretty cute, actually.
>>
Sage attention 3 out yet?
>>
>>106745168
>qwen image slopper has no taste
pottery
>>
File: 00414-816042324.png (2.76 MB, 1080x1920)
>>106744901
I have chosen personally to completely cease all wan and flux fun until i pick up an additional 32 gigs of ram to prevent swapping to my nvme who's a good boy who dindunuffin. Comfyui is too niggerlicious for this modest little rig.

https://youtu.be/3COHCKX_xF8?si=U_DcG_MN7Lqi4fa2

>that moment i realized illustrious was perfectly capable of doing what i asked anyway
>>
>>106745185
spilling onto the ssd during model inference shouldn't do anything other than read off the storage, meaning it's free and not stressing the drive (reads don't wear flash the way writes do)
>>
>>
>>106745180
It is, but only for Blackwell cards at the moment. I imagine they'll support older cards later; time will tell.

>inb4 some demoralizing contrarian dipshit
>>
>>106745147
you're not limited to 4 steps, more is always better. it can result in smoother motion and more clarity and detail in the output.
>>
>>106745201
is it even usable on blackwell cards on comfyui?
they recommend only using it after the first step and excluding the last step, and using sage2++ for the rest
>>
File: Chrowan_00001_.jpg (1.06 MB, 3024x2016)
>>106745181
I slop with whatever, really. Chroma tends to be so noisy, though. Doesn't lend itself to a lot of types of images, in my opinion.
As for the girl, she reminds me of an older Julianna Rose Mauriello a lot. I'll stand by that.
>>
>>106745168
it's a man i2i'd using qwen
>>
File: 1755777023939209.jpg (1.25 MB, 2016x1152)
>>
>>106745168
>Pretty cute, actually.
Maybe, but she has mental aids.
>>
>>106745238
Not a clue, I don't have a Blackwell card, kek. sage 2.2 on my 4070 Ti Super works fine, but a double speed boost would be great too
>>
anyone have a workflow for wan2.2 without the lightx2v loras? pretty please?
i tried the native one with fp8_scaled and euler at 10/10 steps, then tried unipc at 15/20 steps, and the result always looks shitty and blurry
>>
File: file.png (66 KB, 1665x709)
>>106745371
the issue is that sage attention 3 is a destructive process, way more than the relative free lunch that sage1/2 were
>>
>>106744407
Only about ASML.
>>
>>106745410
How do you even change the version between steps?
>>
File: 00121-175658031.png (3.11 MB, 1248x1824)
>>106745410
kek this is why i'm waiting for the ((two more weeks))
let the monke beta test until it's good enough to go through the effort of dropping into my workflow
or SKIP altogether.
>>
File: 9392025850.png (798 KB, 1024x1024)
>>
File: Qwan_00010_.jpg (657 KB, 2976x1984)
>>106745338
Well, that's women for you, huh? Am I right fellas?
>>
why do people still use pony?
it boggles the mind.
>>
>>106745742
boggle?
>>
>>106744555
Hi /lit/ bro
>>
>>106745742
What should people use instead?
>>
>>106745787
noob
>>
>>106745798
>why do people still use noob?
>it boggles the mind.
>>
File: 98849951.mp4 (3.46 MB, 1312x896)
>>
>>106745787
illustrious, noobai, chroma, and qwen.
pony is simply just not good anymore.
illu and nai shitmixes have more concepts and characters built in

>he needs a lora
fuck you, use illust you cunt and stop shitting up civitai with loras that illu/nai can do out of the box RRRREEEEee
>>
Anons, I tried to install ComfyUI just now, and just for the Python dependencies (I think) it tried to download almost 70GB of packages (several 3.33GB parts). I tried to install it outside of C:, but I got a warning that it may run unstable and that it is not recommended.
Is that really the case? Will ComfyUI need almost 100GB of storage space on C: even though I haven't installed any large models or anything? Is this just to get started with the Python/PyTorch things?
>>
>>106745961
>stop shitting up civitai with loras that illu/nai can do out of the box RRRREEEEee
but we need six gorillion concept loras for things the first ever released illustrious checkpoint can do out of the box!

and don't even get me started on the six gajillion anime girls even pony was able to do out of the box!
>>
>>106745966
Idk what you're installing, since Comfy's Python env is ~10 gigs, with 3.6 of that being torch.
>>
File: 88945033.mp4 (3.56 MB, 1312x896)
>>
>>106745966
>tries to download almost 70GB of packages
how do you fuck up a comfy install so badly?
delete everything and use the portable version jesus fucking christ.

anon you might not be cut out for this whole ai stuff
>>
with the following core settings and a 640x360 video dataset, it's possible to train a Wan 2.2 14B i2v lora with musubi-tuner and not OOM
    --task i2v-A14B --sdpa --mixed_precision fp16 --fp8_base \
    --optimizer_type adamw8bit --learning_rate 2e-4 --gradient_checkpointing --gradient_accumulation_steps 1 \
    --max_data_loader_n_workers 2 --persistent_data_loader_workers --offload_inactive_dit \
    --network_module networks.lora_wan --network_dim 32 \
    --timestep_sampling shift --timestep_boundary 900 --min_timestep 0 --max_timestep 1000 --discrete_flow_shift 3.0 \
    --max_train_epochs 16 --save_every_n_epochs 1 --seed 23571113 \
    --save_state

836x480 was close, like it almost worked in 48GB, but there would be occasional peak memory usage moments where it would OOM, and it would happen before it could write a checkpoint.
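Those transient peaks are easy to catch with torch's CUDA allocator stats; a minimal sketch (helper names are mine, this is not part of musubi-tuner), called once per training step to see which step spikes before the OOM:

```python
def to_gib(nbytes: int) -> float:
    # Convert a raw byte count to GiB for readable logging.
    return nbytes / 1024**3

def log_vram_peak(tag: str) -> float:
    # Report the peak allocated VRAM since the last reset, then reset,
    # so each call shows the worst moment of the step it covers.
    import torch  # lazy import so to_gib still works on CPU-only boxes
    if not torch.cuda.is_available():
        return 0.0
    peak = to_gib(torch.cuda.max_memory_allocated())
    print(f"[{tag}] peak VRAM: {peak:.2f} GiB")
    torch.cuda.reset_peak_memory_stats()
    return peak
```

Note this only sees memory that goes through the torch allocator, not what other processes hold.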
>>
>>106745966
we don't know what you're downloading specifically, so you come off as a fucking idiot; you might be downloading a prepackaged setup for multiple things for all we know
>>
>>106746050
>it's possible to train Wan 2.2 14b i2v with musubi-trainer and not OOM
on what, on 48gb vram?
>>
>>106746059
Yes, I have a 48GB 4090D
>>
For flux/chroma, is there a way to use a reference images of real faces to blend them to a new one?
I want to try and create the perfect waifu with celebs.
>>
>>106746117
i was going to help you until you mentioned celebs.
>>
>>106745966
try Stability Matrix. It just installs everything for you: one click, works on Windows and Linux.
>>
>>106746125
Anime all have the same face.

I remember an old phone app that let you do it, it was amazing, /tv/ threads created beauty never seen before.
>>
File: 00101-2095852289.png (1.22 MB, 896x1152)
>>
>>106746117
With loras of each celebrity it would be easy: just alter the strength of each lora to your liking while prompting for a single person.

There's probably some easy way to do img2img with two celebrity reference images, but I doubt the results would be particularly good
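Sketched with diffusers' multi-adapter LoRA interface (the file paths and adapter names below are hypothetical, and this assumes each face has its own trained lora); keeping the combined weight near 1.0 usually avoids frying the output:

```python
def normalize_mix(weights: dict) -> dict:
    # Scale the per-LoRA strengths so they sum to 1.0.
    total = sum(weights.values())
    return {k: v / total for k, v in weights.items()}

def load_face_mix(pipe, lora_files: dict):
    # lora_files: {adapter_name: (path, raw_weight)} - all hypothetical.
    # Loads each LoRA under its own adapter name, then activates the
    # normalized blend so the faces interpolate instead of stacking.
    mix = normalize_mix({name: w for name, (_, w) in lora_files.items()})
    for name, (path, _) in lora_files.items():
        pipe.load_lora_weights(path, adapter_name=name)
    pipe.set_adapters(list(mix), adapter_weights=list(mix.values()))
    return pipe
```

e.g. `load_face_mix(pipe, {"face_a": ("face_a.safetensors", 2.0), "face_b": ("face_b.safetensors", 1.0)})` gives a roughly 2:1 blend; adjust the raw weights to taste.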
>>
>>106746059
I also wanted to mention that i2v training is probably only worth doing if you want something unique that Wan doesn't already understand in terms of motion. I went with the action scenes from City the Animation, since they're pretty creative and have a lot of "bullet time" action going on.
Now I understand why most community i2v loras are very specific: it's impractical on home gear, and expensive in the cloud, to go past 100 or so videos, so you have to keep it focused.
>>
File: heyohoya.jpg (49 KB, 684x456)
Summoning Kijai, QuantStack, lightx2v or phr00t to work their black magic and create a lora for Long Video Generation https://huggingface.co/Efficient-Large-Model/LongLive-1.3B

>>106745961
Use whatever works for you, broham

>>106746107
>4090D

Still OOMs? Was going to buy one of those too, kek
>>
>tfw not OOMing from doing an upscale to 5120x4000p

Feels good.
>>
>>106746117
>use 2 celeb loras at equal strength
>????
>profit
idiot
>>
>>106746170
>>106746199
I don't think it will work the way I want it to, but I'll try.
>>
>>106746117
i don't know about flux, but even slopmerges for noob and illustrious manage to keep some XL knowledge of celebrities. you can prompt their name and get a decent likeness that isn't 1:1; i've used this for a specific OC that i wanted to have defining features.
>>
>>106745432
Waiting for a node to do that.
>>
>>106746218
Hang on, there's no loras of any celebs on civitai?
>>
>>106746244
it got shoah'd. never forget the six million celeb loras. most other sites also fell in step for the same reason/paranoia.
>>
File: 1734809426946311.mp4 (1.36 MB, 816x560)
>>106745943
>>
>>106746191
> Still OOMs? Was going to buy one of those too, kek
yeah I couldn't make it do 836x480, and this was on a linux box running "headless", meaning all GUI stuff shut down so no other program was using the GPU memory.
The 4090D 48GB is still great though. It lets you run Wan 2.2 14B at fp16, you can train Qwen-Image, etc... I consider it the lowest entry-level GPU for anything more serious than genning 1girls. Anything older with that much memory either costs more or is a poor value, because it's slow and lacks fp8 support.
In all honesty, the "not just fucking around" GPU is the Blackwell 6000 Pro. If you have that, you're golden, because you can run anything up to single H100 80GB territory, and that covers just about all the diffusion projects.

I'm just fucking around, so spending $3k on a GPU is about my limit.
>>
>>106746250
Well fuck. Guess it's time to start learning how to train loras.
>>
File: 00102-12096845.png (1.38 MB, 896x1152)
>>
>>106746271
It's really not difficult. The difficult territory is natural language tagging, have fun with that one.
>>
Another long vid gen attempt, 2 minutes?

>Tencent promises a new autoregressive video model (based on Wan 1.3B, ETA mid October); Rolling Forcing: real-time generation of multi-minute video (lots of examples & comparisons on the project page)

https://kunhao-liu.github.io/Rolling_Forcing_Webpage/
https://github.com/TencentARC/RollingForcing
>>
In fairness, installing Python packages makes us all idiots.
>>
>>106746286
looks interesting, possibly even promising.
alright getting it out of my system; ROOOOOOOLLIING STAAAAAAAAAAAAARRRTT
>>
>>106746268
Could try 1:1 ratios? Reading around the leddit training threads, some have success with them.
>>
>>106746271
or, you know, grab them from sea or tensor
>>
File: 00106-2222961683.png (2.46 MB, 1088x1920)
hhrrrmmm scratches big scary chin

wily's noob realism vpred (formerly epsilon's) has some promise. that said, my adetailer pass has been going a little too hard. not sure what i fucked up this time, but it gives every girl varying sizes of cheekbones for some reason.
>>
File: 00103-669395706.png (1.65 MB, 896x1152)
>>
>>106746310
Yeah these look dope, miles better than the other attempts. So we have

>Rolling Forcing
>LongLive

These will 100% release (usable in comfyui) before radial attention/wan chaku
>>
>>106746390
sharing this in case anyone has better settings you can pass me. denoise always set to 0.3, since 0.4 seems aggressive on every checkpoint. should i be adjusting cfg scale based on the checkpoint?


>>106746424
I like the little skateboarding turtle not gonna lie, it did have the best example out of the bunch.
>>
File: Untitled.png (6 KB, 751x280)
>>106746041
>>106746055
I downloaded the installer from their official website, for Windows, for NVidia GPUs.
The first time I tried to install it, it had more than 15 of these 3.3GB packages. Now I tried again, and apparently there is only one, I don't know... I'll see what happens.
>>
>>106746355
What's sea?

I guess tensor doesn't give a fuck, nice.
>>
>>106746455
>I downloaded the installer from their official website, for Windows, for NVidia GPUs.
Rookie mistake; the Electron app is shit. Get the Windows portable release off of GitHub
>>
>>106746191
I feel like I can already generate at 320x320 and have rapid iteration on prompting. I have trouble caring about making really long, janky, low quality turbo-slop vids. I mean, look at all the examples they give, lol. None of them are good. They don't even try to upscale or vid2vid them or anything to justify this.
>>
>>106746034
just like real porn actresses pretending to play chess
>>
>>106746511
or pretending to do anything for that matter
>>
>>106746507
It's the 1.3b model, give it time maybe? You're not going to release a banger on the first swing, and we don't have anything in terms of proper long vid generation. With the long-awaited radial attention that'll never get released, I'm grateful we're getting something. Well, two things, see >>106746286
>>
>>106746481
they have an electron app? holy mother of bloat, who thought that was a good idea wtf
>>
>>106746034
Funny thing is that even someone who hasn't played the game should be able to get a little suspicious: three players moving while the fourth seat remains empty.
>>
why are seaart and tensor the absolute worst made sites? navigating them feels like trying to find the right download button on a sketchy site from the early 2000's
>>
I remembered that deviantart was a thing and it still is. Seems to be fairly alive, worth posting stuff there?
>>
>>106746685
yes, they've stopped pretending to care about "the ethical ramifications of using AI" since they realised they can make money off of it

REAL FUNNY HOW THAT WORKS
>>
>>106746685
It's gigaslopified and basically unusable.
>>
File: downloads.jpg (22 KB, 689x100)
Does this piece of shit ComfyUI have to install everything in the fucking C: documents folder? I installed the program on another hard drive with 4 terabytes of free space, but it insists on downloading everything to C:, where I have almost no free space.
The installation section said that I could put models and other things on other drives, but the option does not exist anywhere.
>>
>>106746731
uninstall the retarded shit you installed, then download the portable version and stop suffering your own idiocy
>>
why are the most retarded people the loudest
>>
Is noobai still the best SDXL-based 1girl model? I've tried some of the top mixes on Civitai, and holy baked-in slop. I did a test with just "1girl" as the prompt, and they basically make the same exact girl with the same face doing the same sex act over and over. Noob seems like it may be less "polished" for detail, but it has much more variety and a more "tunable" raw state. If that makes sense.
>>
>>106746770
because all slopmixes have the same recipes more or less. just pick the model that you like the look of.
99% of top voted shit on civi is turboslop made by merging the prevoius highest rated model with other random shit.
>>
I've been nofapping for 10 days. Thought I'd give a progress report.
>>
>>106746782
what are some /ldg/ approved illu/noob slopmixes?
>>
>>106746878
we dont do that here
>>
>>106746920
what do we do here then?
>>
>>106746931
chromapilled asian1girlmaxxing
>>
>>106746878
I like one called smoothmix noobillustrious; it easily spits out that 3D-ish looking style which I enjoy, but it's only good in small doses cause it heavily samefaces. For a straight "flat" anime style I like wainsfw; despite the name, it's great for general-purpose 1girl gacha
>>
>>106746931
use base noob
>>
>>106744809
>They finetuned Wan 1.3B and not 14B
Shit...
>>
how do i stop hires fix from adding new knees
>>
>>106747067
lower denoise/cfg
>>
>>106747067
>he doesnt like extra limbs/nipples/knees
NGMI
>>
>still calling it "highres fix"
>>
hairesufixu
>>
>>106746996
you won't get "real time" from a 14b model kek
>>
>>106747067
In auto and its offspring, hiresfix sometimes just won't work (some models are worse than others). In those cases you have to img2img it separately, and what works best for me is to use a controlnet to ensure it doesn't warp the image too much. Sometimes even a controlnet isn't enough, but it controls things a lot better. And if it just refuses to do a latent upscale, a script like SD upscale can usually get me there.
>>
File: 1746546375476862.png (88 KB, 885x637)
neat, qwen edit v2 (2509) understands controlnets natively. so you can use this node and feed its output as your image2 source; then whatever you prompt with image1 will use the canny/depth/openpose info for your gens.

before:
>>
File: 1752765655071996.png (835 KB, 824x1256)
>>106747168
and after. prompt is just "the anime girl is squatting and holding a black rifle."

as you can see, it follows the canny controlnet output. and it works for depth maps, and openpose too. pretty cool.
>>
File: 1750681717069795.png (742 KB, 824x1256)
>>106747184
"the background is white" for a cleaner bg:
>>
File: 1739869578242063.png (985 KB, 1080x636)
https://github.com/dc-ai-projects/DC-Gen
https://arxiv.org/pdf/2509.25180
>52x faster
HOLD UP LET THEM COOK
>>
Any new API node announcements recently?
>>
>>106747195
Wake me up when it has an actual implementation
>>
File: 1757888820847257.png (700 KB, 856x1216)
>>106747194
this time, openpose of the stance as image2 input, and changed image1 to motoko:

pretty cool. no background stuff since it's just the pose/skeleton.
>>
File: 1741434330185693.png (716 KB, 848x1232)
2b, pretty good too. if you want the original style rifle just prompt diff or use canny.
>>
File: 1740523541954421.png (726 KB, 824x1256)
>>106747242
depth map of another nikke bunny character as image2 input, image1 with miku.

isn't technology neat?

basic prompt: the anime girl is bent over and is wearing a teal thong under her skirt. she has long teal twintails and is wearing black boots. the background is white.
>>
>>106747195
>4k image generation
Krea can do that?
>>
File: 1737051617221494.png (449 KB, 848x1232)
>>106747330
yes, I know 2b is in the game but it's just a test. there is no bunny 2b (yet)
>>
File: 1740436919573531.png (597 KB, 784x1328)
>>106747394
last test, zero suit samus cause why not:
>>
>>106747195
let me guess its some batch image generation of fp4 quants running on h200 with a low 1-4 step model
>>
>>106747195
sage attention 3?
nah m8
DreamCast Gen?

AW YEAH THIS IS HAPPENIN
>>
File: 1732995985022650.jpg (482 KB, 2509x808)
>>106747195
>>
>>106747126
Probably not "real time" with 14b if it comes out but can only imagine it will be relatively fast.
>>
>>106747517
waiter waiter, more plastic!
>>
File: 1750753966685899.png (219 KB, 1404x843)
https://xcancel.com/T8star_Aix/status/1972934185624215789
>NOOOO HOW DARE YOU NOT GIVE A FUCK ABOUT OUR GOZILLION MODEL, TO PUNISH YOU WE'LL MAKE THE EDIT ONE 20B
looks like we won folks
>>
>>106747563
this is in response to comfyui stating they are not going to officially implement hunyuan btw (they'll add api nodes instead)
>>
File: 1744088896296492.png (1.23 MB, 824x1264)
you can also use openpose and qwen edit to do other fun things. to be fair, I made the miku with openpose then used edit to remove the guy, then combined the two with PS. AI did the hard work though (the Miku, in the pose.)
>>
>>106747581
So it's comfy that saved us? So fucking based, maybe I treated him a bit harshly...
>>
File: 1741117307075227.png (77 KB, 1204x670)
Wan 3.0 mixture-of-experts with 671b params, quanted dynamically to 130gb for 24GB VRAM + 128GB RAM systems, plus ramtorch, for high quality real-time 1080p video gen when?
>>
File: 1744264585098708.png (1.23 MB, 824x1264)
>>106747583
sorry, wasn't very clean at the car bumper. now it's floyd time.
>>
>>106747583
bruh, lmao
>>
>>106747596
what's neat is you can take a base image (OG miku) and feed any openpose/canny/depth map from a comfy node into image 2, then get that pose/figure, with the original appearance and manipulate it how you like.

controlnets are great but this adds the option to manipulate stuff that img2img/inpaint + controlnets can't do at the same time. so it's a very cool tool.
>>
>>106747563
is he retarded? why does he expect any support at all? no one can run a 80b model so the case is quickly closed
>>
>>106747623
dont forget that it's also super slopped, even if you can run it
>>
>>106747581
comfy should have worded it less harshly and said that, given the model's size and quality, it's not worth committing a lot of work to get it running since most people won't use it, but that he's open to PRs. still, it seems like the harsh words made them aware of the problems with huge models that aren't mixture-of-experts
>>
>>106747623
Let retards like this fall off, this is a repeat episode of smug retard trying to juggle profit and free labor to improve his model and forgetting that the community is the main reason why models succeed in this space.
>>
File: rare W leddit.png (370 KB, 1813x1234)
>>106747563
even leddit isn't buying that bullshit lol
>>
>>106747635
>comfy should have worded it less harshly
No, it was perfectly fine. The model is just ass.
>>
>>106747563
He should set a GitHub star goal for anon to reach kek I hate chinks
>>
>>106747635
>comfy should have worded it less harshly
>it seems like the harsh words made them aware of
so... he was right to be harsh, we immediately got some nice feedback from them
>>
>>106747635
>made them aware of the problems of huge models that arent mixture of experts
although i do want to say the main problem is actually the model quality relative to its size: it doesn't look much better than qwen image while being much larger. that's really the problem, since if it were huge but an undeniably huge improvement in quality, people would find a way to prune it themselves or crowdfund something like nunchaku-tier quants
>>
>>106747664
this. if that 80b model was Seedream tier, there's no doubt the community would've found a way to make it run, but since it looks like your random slopped model, it's completely pointless to put any effort into that shit
>>
>>106747661
the point is that you don't want to go nuclear too quickly: you can just as easily make the company hate you over the over-the-top harsh words, specifically avoid supporting you through PRs in the future, or even become less likely to open source things at all. especially now, when the main way most image/video models are run locally by anyone who wants actual control is through comfyui
>>
>>106747664
its complete shit, and it's only 1024 too. qwen is at least 1328x1328. hunyuan 3 is just a disaster of a model: the rendering quality is poor, it's bloated beyond belief, and it has the aesthetic of refiner-era SDXL. it's like they trained it on a couple of benchmark party tricks and forgot the rest of the model
>>
>>106747690
>over the top
nigga
>>
>>106747696
>never had a job in his life
>>
>>106747703
>muh peepeepoopoo words
Shut the fuck up nigga. The model is shit.
>>
>>106747690
but that's not what happened there, Comfy managed to make them reconsider things, it's a win for the open source community
>>106747703
Comfy doesn't care about that he's his own boss, and he has the power to make Tencent kneel, he's pretty based when you think about it
>>
>>106747703
>>106747690
shut the fuck up retard, Comfy was right not to implement it, and tf you say about "harsh words"? at no point did he insult the model or the team behind it
>>
File: jam.png (314 KB, 1195x1060)
>>106747581
How many times must it be said: Comfy's business model is all about API nodes. Getting companies to switch to API is in ComfyUI's best interest. He's now making shit up and refusing to implement local models so that they switch to API only, at which point he will get on his knees and suck them off with a whole suite of workflows, tutorials, livestreams, and san francisco meetups just like for Wan2.5 and Seedream 4
The quicker ComfyUI gets removed from the OP the quicker local models begin to heal.
>>
File: Comfy right now.png (384 KB, 859x960)
>>106747563
>>106747581
>this is in response to comfyui stating they are not going to official implement hunyuan btw
I don't want anyone to talk shit about Comfy for at least 2 months since he made this gigachad move
>>
>106747739
get better bait
>>
How retards ITT see it
>based comfy, this means we will get 20b models locally soon!!
How chinks see it
>Lets just make our stuff API only, it's not worth dealing with this shit
How local receives the news
>NOOOO why are they making it API?????
How comfy receives the news
>heh, all according to plan
>>
You need to be harsh to these faggots or you'll get another SD3 situation. Ironic because Comfy was slurping up the bullshit only to jump ship and cope despite anons knowing what would happen after XL
>>
>>106747563
>>106747581
We live in a timeline where Comfy forced a supergiant Chinese company like Tencent to kneel, just let that sink in.
>>
File: 1736448083405756.png (915 KB, 824x1256)
okay, here's a more fun example. Take a depth map of elegg from nikke and make it your image2 source. image1 source is a full body miku pic.

then with qwen edit: you get thicc Miku.
>>
>>106747775
he's false-flagging as a concerned local user in order to push tencent into adopting api nodes
>>
Common API node victory, localpajeets stuck with scraps again
>>
File: 1730353474243799.png (21 KB, 931x174)
>>106747725
>>106747717
again, retards that can't differentiate between how regular people think versus what a suit high up in a company or a woman in marketing will think when they see the literal owner of the most popular open source frontend for running these models locally publicly clowning on the implementation of their newest model, in picrel

you just need one of those people to see this and send out the memo to no longer cater to the foss community or associate with the currently most popular project for running these models, let alone send their devs to submit PRs to help after this
im not saying this is how it should be, im saying this is how it is
>>
>>106747775
Zero adoption of HI3 would've forced them anyway.
>>
>>106747801
>what a suit high up in a company or a woman in marketing will think
Maybe they should think why their model turned out a steaming pile of shit.
>>
File: 1749869016708585.png (589 KB, 824x1256)
>>106747783
would retouch the hand (it's a depth map) but you get the idea.
>>
>>106747801
I see all that sponsorship money is being put to good and effective use
>>
>>106747801
>they booed him but look at where we are now, Tencent got defeated
based Comfy, he never listened to the retards and that's why he has that power in the first place
>>
>>106747801
What do you expect from turdworld posters? They believe that comfy made tencent 'kneel' when all he did was make them abandon local altogether, which is exactly what he wants so he can then crawl back to them and suck them off with an API offer to bait more VC money.
>>
>>106747825
he went hard on cosmos, saying it was prime for finetunes and I bet you don't even remember that
>>
>>106747805
Tencent knows that if Comfy doesn't want to implement a local model, such model is officially dead, that's how powerful Comfy is
>>
The disabled one is here, remember to mention his disability whenever he tries to detail the thread
>>
>>106747825
>>106747749
>>106747717
pathetic samefaggotry
>>
File: 1492821324809.png (572 KB, 1080x1190)
Oh, HE is here again...
>>
>>106747827
>They believe that comfy made tencent 'kneel'
he did though
>Comfy says he won't implement it
>2 days later Tencent says they'll prune their model so that more people can run it
>>
File: 1739626073818383.png (816 KB, 928x1120)
the man is reading a newspaper with the title "BFL bankrupt!". keep his expression the same.

it'd be really funny if the CEO looked like that.
>>
Seeing mounting evidence for the removal of ComfyUI from the OP. It's clear that it's just SaaSware now. Perhaps a thread split is needed as well, so many comfyfaggot shills ITT
>>
>>106747844
>2 days later Tencent says they'll prune their model so that more people can run it
Ok, but the model is still pure ass and I have zero reason to use it over Qwen Edit.
>>
>>106747836
*derail
He's going to have a major melty today so remind him of his disability. Also beware he will try to mass report you in retaliation.
>>
>>106747852
yeah I don't care about their image model, but they'll release an edit model as well, and if that one is 20b instead of 80b, and it's better than QIE, we won
>>
>>106747848
This. apiUI is killing local
>>
I didn't realize a company like Tencent would care about posts from literal whos on this hell site desu
>>
>>106747848
API Nodes are local as long as ComfyUI is in the OP, just so you know
>>
>>106747848
>>106747860
Ani samefag
>>
>>106747859
Unless the edit model has been trained completely separately from the 80b one then MAYBE it has a chance, but if it is based on the 80b nonsense, then it is DoA
>>
>>106747866
they don't, it's comfyshills pretending they have influence. tencent gave a handout to localpoors, localpoors screeched, so now tencent is switching back to closed source. that's all it is
>>
>>106747866
The disabled one shifted his focus to this thread you can tell by his dead thread sinking
>>106747873
Wrong one out of the duo, this is his attack dog.
>>
how much does ten cent pay (you) to post here, anon?
>>
Brainstorming ideas for the new thread title. I’m thinking /odg/ - Open Diffusion General
>>
>>106747882
>now tencent is switching back to closed source
why do you have to lie like that, we have eyes we read what they've written >>106747563
>>
>>106747898
This is a good idea really, I think the shilling for API Nodes is a bit out of control, it's time to start fresh
>>
>trolling anon into doing your dirty work
>>
>106747898
Ah the bi quarterly attempt to split the general
Bold move disabled one, surely it will work the 20th time.
>>
>>106747904
This is a massive win for local honestly, just like Flux. We can get the 20b hyper-distilled version while the powerful one comes to API. In this situation, everyone wins
>>
>>106747908
>I think the shilling for API Nodes
who's shilling for API nodes, all we hear is your complaining about "ghosts who shill for API nodes"
>>
>>106747783
very cool, works really well
>>
>>106747921
>hyper-distilled
every tencent model is distilled no? even the 80b model is distilled right?
>>
>>106747923
The shilling is in the OP and all throughout this thread
>>
>>106747923
Debo is upset because his thread gets no traction and has been having a meltdown. He tries this regularly and fails. Just make fun of him for being a low functional autistic neet that can't hold a job.
>>
>post lauding API nodes
>wait a few minutes
>complain about API posts
All in a days work
>>
>>106747938
>all throughout this thread
where? show some posts shilling API nodes
>>
Seedream 4.0 is probably the best image model to date, any place I can run it for cheap??
>>
File: 1740349658633482.png (929 KB, 1080x630)
929 KB
929 KB PNG
>>106747882
>so now tencent is switching back to closed source
not the right time to say random shit like that, they are about to release an autoregressive video model
https://kunhao-liu.github.io/Rolling_Forcing_Webpage/
https://arxiv.org/pdf/2509.25161
>>
>>106747841
from helpful calm thread to an explosion of shit
every time
>>
>mention his thread is sinking
>then he posts
I promise you if anons just started posting handicap sign gens he would fuck off. All of this stops whenever he has to see his social worker
>>
File: 1735117414112557.png (13 KB, 291x180)
this node is amazing btw, for canny/depth/openpose in qwen edit (or any model really).
>>
>>106747990
how does it work?
>>
>>106747904
>why do you have to lie like that
>>106747961
>not the right time to say random shit like that

It doesn't matter, he just wants to stir shit. He'll keep repeating it and or come up with some other bullshit regardless.
>>
>>106747990
I don't like preprocessors that don't come with values to change
>>
[Tutorials]
Here are some tutorials I found recently that really leveled-up my ComfyUI workflows. Hope they help!
https://www.youtube.com/watch?v=FmU-iNXlZ9g
https://www.youtube.com/watch?v=BZzGfUT4YAg
https://www.youtube.com/watch?v=Ht2rafC3FRI
>>
>>106747563
I really don't like their behavior, why do they feel entitled to get their dick sucked for every turd they release? it being "free" doesn't mean we should eat shit
>>
File: 1734951856370812.png (832 KB, 1176x880)
>>106747990
also, it makes edits even more precise cause you have controlnet info. with a depth map of this same picture I got a better result.

"replace the girl on the right with Miku Hatsune."

it's easy to swap asuka without the openpose image source but not necessarily in the same pose/style. But you can get exact poses with canny/depth/openpose as a source for image2.

pretty neat, the first version of qwen edit didnt have native controlnet support.
>>
>>106748002
They don't understand how he operates but most of the time debo is replying to himself which makes it more pathetic. It's no surprise he has a dead thread where anons make fun of him for being disabled.
>>
>>106747961
All these papers are benchmaxed fake shit until we see an actual practical implementation.
>>
>>106748017
(You)
>>
>>106747995
take an image, pick a controlnet in the dropdown (openpose, canny, depth, whatever), link it to an image output, run: get controlnet output.

stick that in image2 and you can do qwen edit with that as a reference, for exact poses or outputs (canny for 1:1, depth for more flexibility, openpose for the exact pose/skeleton).
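for anyone curious what these preprocessors actually compute: here's a rough sketch of a canny-style edge map in plain python. the real nodes use opencv or trained models (openpose/depth are full neural nets), so this is just to show the idea of "image in, control map out", not the actual node code.

```python
# Toy edge-map preprocessor: Sobel gradient magnitude + threshold.
# This is only an illustration of what a canny-type controlnet input is;
# ComfyUI's preprocessor nodes do this with proper libraries/models.

def edge_map(gray, thresh=128):
    """gray: 2D list of 0-255 ints. Returns a same-size 0/255 edge map."""
    h, w = len(gray), len(gray[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Sobel kernels for horizontal (gx) and vertical (gy) gradients
            gx = (gray[y-1][x+1] + 2*gray[y][x+1] + gray[y+1][x+1]
                  - gray[y-1][x-1] - 2*gray[y][x-1] - gray[y+1][x-1])
            gy = (gray[y+1][x-1] + 2*gray[y+1][x] + gray[y+1][x+1]
                  - gray[y-1][x-1] - 2*gray[y-1][x] - gray[y-1][x+1])
            mag = (gx * gx + gy * gy) ** 0.5
            out[y][x] = 255 if mag >= thresh else 0
    return out

# A 5x5 image with a vertical step edge down the middle
img = [[0, 0, 255, 255, 255]] * 5
edges = edge_map(img)
print(edges[2])  # -> [0, 255, 255, 0, 0]: only the step edge lights up
```

the resulting black/white map is what you'd save and feed into image2 so the edit model keeps the outline while repainting everything else.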
>>
>>106748036
Glad I could be of help to at least one person. More tutorials coming later, there's just so much to learn!
>>
File: C R I N G E.png (1.9 MB, 1661x1704)
Wtf? This is disturbing as fuck.
>>
No one will use tranistudio julien just give up
>>
File: 1747924511023871.png (906 KB, 1176x880)
kek

replace the girl on the right with the man in image3.
>>
>>106748057
The implication here is that guy wasn't already a weirdo before generating the image which we all know is not true kek
>>
>>106748039
oh I see, never used these since sd1.5/a1111, do you need to download the canny/openpose etc models?
>>
File: ComfyUI_39637_.png (1.66 MB, 1152x896)
>>106746878
theres no consensus whatsoever aside from base noob being the most flexible one (and requiring more skill)
slopmix wise i regularly use hassaku and anime screenshot merge, as well as cocoillustrious but that one got taken off civitai for some reason
>>
File: 1739665706602451.png (783 KB, 1176x880)
>>106748068
in comfy it will download it if you dont have it, did that for the zoe depth one I tried, and others.
>>106748066
replace the girl on the right with the man in image3. remove the red hair anime girl.

qwen magic. pretty good desu, only using a depth map for image2.
>>
>>106748060
Will even throw his friend under the bus
Just like interior anon, debo lacks the ability to comprehended basic social norms.
>>
>>106748057
What a psycho. Even if you do that, don't fucking share it publicly.

>>106748067
He was weird before social media existed, but I didn't have to see it lol.
>>
>>106748089
>in comfy it will download it if you dont have it, did that for the zoe depth one I tried, and others.
ok thanks anon, will try
>>
>>106747961
LONG VIDEO WAITING ROOM
>>
>>106747563
>it is not easy to open source the core model
how? they just released "Slop model #48458784", does he really believe he did something special there? lmao, those guys are living in a bubble, there's no way
>>
>>106748097
it might seem like it's doing nothing while the node is outlined in green, but if you open explorer you'll see python is downloading stuff, so it's getting the model; once that's done it should produce the output.
>>
>>106748111
NTA but can you try with and without the node (same seed)?
>>
>>106748117
you dont have to use any controlnet stuff as inputs, but if you want a specific canny/depth/openpose style, just use it as image2 input and prompt, it will affect image1. otherwise just bypass image2/3.
>>
>>106748087
>aside from base noob being the most flexible one (and requiring more skill)
based. skill-lets BTFO eternally
>>
File: 1742643618815005.png (962 KB, 1176x880)
replace the blue hair anime girl on the left with the man in image3.

didn't exactly work in this case (the depth map for rei isn't 100%) but the result is pretty funny
>>
>>106748129
Yeah i meant does it make that huge of a difference, I was just curious to see it compared.
>>
>>106747563
Their model is quite bad, it becomes clear when looking at their prompting manual. Maybe the 20B will have some value, but it won't be better than the current best SDXL fine-tune.
>>
>>106744641
This must be a woman/feminine thing. I couldn't care less if someone trained on my art let alone my face. In fact, I would be flattered!
>>
is there any "show image" node that isn't saved in my feed?
I don't want clutter, only to see some intermediate steps when I gen, and it's annoying to see everything saved
>>
>>106747563
>Tencent: "Hey guys! Please like and subscribe, and if we reach [arbitrary number of github Stars] we'll release that subpar pruned model!"
>Alibaba: Here, take this SOTA model, see you next time
during the "Alien during pushups" days I was kinda rooting for Tencent as the underdog, but now I see them as entitled bitches. I really don't like that company, Alibaba is my goat and it'll remain that way
>>
File: 1753012460105993.png (898 KB, 1176x880)
>>106748153
this is just with 1 image: "replace the red hair anime girl on the right with miku hatsune." it swapped, but didn't do the same pose (didn't specify)

but, notice >>106748024 got the original pose right, that's cause of the depth map info (or openpose, canny, etc).
>>
File: 1729718904687047.png (434 KB, 669x502)
>>106748177
and here is the original, both swap but you get the same pose if you use the controlnet as image2 input.
>>
>>106748170
preview image
>>
>>106748185
thank you for posting the original as i have never seen that image in my entire life ever please keep posting it
>>
File: 1740884390850406.png (129 KB, 434x1698)
>>106748129
yeah it worked, now I just need to learn what all of these do...
>>
>>106748201
canny, openpose, and depth (zoe depth is fine) are the main ones for image edits. stuff like tile is for upscaling mainly.
>>
>>106748177
OK, makes sense. Thanks for testing, anon.
>>
>>106748172
other than hunyuan vid, which only gained traction because, if i recall, the competition at the time was just anidiff, what have they even done?
>>
File: 1739652028203622.png (337 KB, 1365x1024)
>>106748209
basically, canny is when you want 1:1 lineart, it's like copying an outline

depth is like this, and has more flexibility when genning

openpose is a skeleton of the character pose, which will help create exactly that.
>>
>>106748189
perfect, thanks
>>
>>106748172
Alibaba is based. Wan2.5 is insanely powerful, especially through ComfyUI API nodes. Excited to see more coming from them!
>>
>>106748209
>>106748220
yeah I'm just wondering which ones would actually be useful for qie and which ones it's clever enough to not need
>>
>>106748231
>Alibaba is based.
facts
>>
>>106748231
SO
MUCH
THISSSSSSS
>>
I refuse to root for any Chinese company desu
>>
>>106748237
the base model is great at all kinds of stuff; controlnets are for when you want a very specific pose or want the output to look a certain way (same character outline, same size, same pose). like the example: if you want miku to have the same asuka pose with the feet, you can use controlnet info to do that.
>>
When ready

>>106748266
>>106748266
>>106748266
>>106748266
>>
File: NOTHING.png (73 KB, 216x234)
>>106748216
>other than hunyuan vid, that only gained traction because if i recall the competition at the time was just anidiff, what have they even done?
>>
>>106748250
Well I'm convinced
>>
>>106748216
they dont clean their datasets as well and left in nsfw stuff because theyre bad at ai. The image 2.1 model they released a few weeks ago is pretty uncensored compared to qwen

their desperation and brute force approach may help us. It's free, fuck it.


