/g/ - Technology

Discussion of Free and Open Source Text-to-Image/Video Models and UI

Prev: >>106879215

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/sd-scripts/tree/sd3
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts
https://github.com/tdrussell/diffusion-pipe

>WanX
https://comfyanonymous.github.io/ComfyUI_examples/wan22/
https://github.com/Wan-Video

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
Training: https://rentry.org/mvu52t46

>Neta Lumina
https://huggingface.co/neta-art/Neta-Lumina
https://civitai.com/models/1790792?modelVersionId=2203741
https://neta-lumina-style.tz03.xyz/

>Illustrious
1girl and Beyond: https://rentry.org/comfyui_guide_1girl
Tag Explorer: https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/b/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
>>
>>106884374
very shitty and retarded collage. besides the clever tobi gen, the rest are 1.5 tier slop.
>>
shit miku general of gay
>>
File: 1734328361897209.mp4 (1.75 MB, 720x1072)
>>
File: 00081-3579559290.png (2.61 MB, 1248x1824)
>>
File: 1743734666863196.mp4 (521 KB, 704x496)
the blonde anime girl rolls down her car window and throws her tea at the people outside.

new lora, kino
>>
>>106884501
These look like 3D renders more than anything else. Is it deliberate?
>>
I've subjected myself to the Forbidden Dream you can never wake up from, and lived
>>
Worth upgrading from 64gb to 96gb ram, with a 5090?
>>
>>106884573
are you running out of memory and swapping? if not, more RAM doesn't automatically provide a performance boost.
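quick way to actually check before buying (a minimal sketch, assuming pip install psutil; run it in another terminal and watch while the models load, ctrl-c to stop):

[code]
# if swap 'used' grows while the high/low models swap in and out,
# more RAM will help; if it stays flat, it probably won't
import psutil, time

while True:
    vm, sw = psutil.virtual_memory(), psutil.swap_memory()
    print(f"ram avail: {vm.available / 2**30:.1f} GiB  "
          f"swap used: {sw.used / 2**30:.1f} GiB")
    time.sleep(2)
[/code]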
>>
File: 00094-532180422.png (2.63 MB, 1248x1824)
>>106884547
yes indeed 3d render
>>
File: 1737950254942228.mp4 (624 KB, 704x544)
>>106884528
the blonde anime girl sips her cup of tea as people outside the car are walking by.
>>
>>106884580
I am. But it feels wasteful for such a "small" upgrade. There are no larger kits supported on the QVL.
>>
File: 1754781622220770.mp4 (1.09 MB, 640x640)
the anime girl eats cereal from her cereal bowl, and gives a thumbs up.

neat, double bowl.
>>
what does wan VACE do, is it for replacing characters in scenes? if yes, can't that be done with wanimate already?
>>
>>106884615
I'm assuming DDR5, but email the motherboard vendor with the model of a 2x64GB kit and ask if it'll be supported. my board shipped with "max. 64GB" at launch, but they've increased that to 256GB now that there are larger DIMM sizes.
>>
File: 1743176932730121.mp4 (1.14 MB, 640x640)
>>106884652
for RIFE VFI, is 47 or 49 better? (it recommends both)

this is with 49:
>>
>>106884660
editing, so they say
extending frames at front or back, extending the canvas, transferring poses, depth, outlines, and stuff I'm probably forgetting
>>
>>106884374
>i didnt make it to the collage
shit collage kys
>>
File: 1756005938070286.mp4 (1.91 MB, 640x640)
the man with the beard on the left drinks his champagne, and the yellow cartoon character on the right throws 100 dollar bills into the air
>>
File: 1740647612564184.mp4 (956 KB, 480x704)
the white hair anime girl wearing a blindfold stands up and runs quickly to the right through a door in her house.

new lora, pretty good
>>
File: 1756500235604357.mp4 (1.18 MB, 704x480)
the anime girl is riding a blue skateboard around a racetrack.
>>
File: 1755171141524004.mp4 (932 KB, 480x704)
>>
>>106884665
Dear god going through asus support is absolutely horrible.
I ordered the 96gb pack.
>>
File: 1742781671925168.mp4 (532 KB, 480x704)
the camera pans out and an anime style Hatsune Miku puts her hand on the man's shoulder, as he kneels in the snow outside.
>>
>increase cfg in high noise
>the FFLF now ignores the last frame

Who do I blame for this?
>>
File: 1733687252560453.mp4 (853 KB, 704x480)
>>106885033
the camera pans out and an anime style Hatsune Miku gives the man a hug as he stands in front of neon signs.

new lora works very well. kijai says use his lora for high noise + the 2.1 i2v for low.

https://huggingface.co/Kijai/WanVideo_comfy/tree/main/LoRAs/Wan22_Lightx2v
>>
File: 1734769040356789.mp4 (856 KB, 704x480)
>>106885073
one more test with this one.

the camera pans out and the man eats McDonalds french fries with his hands, that are on a round table.
>>
>>106884998
yeah, but not as bad when it's not an RMA. they didn't respond to my ticket, just closed it after they'd published the updated BIOS with a note about memory support.
>>
>>106885121
Pretty smooth
>>
File: ComfyUI_0083.jpg (2.26 MB, 1920x2656)
>>
>tfw migu
:)
>>
File: 00113-986247081.png (2.51 MB, 1536x1536)
>>
>>106885346
did you use a style lora/tag for this? if so do you mind sharing
>>
>>106883990
If any anon here feels that doesn't belong, remember that there is an actual place where we test these same models by actually using them creatively and having fun with them: no miku spamming or 1girl-pointing-at-viewer spamming, just actual good quality gens with the same models. Just saying. If you like being shilled, stay here.
>>
>>106885346
the more innocent she looks, the bigger the chance she'll push a finger up your bum while blowing
>>
>>106885375
>advertising sdg circlejerk bullshit general
>in ldg of all places, which was created specifically to get away from sdg cancer.
You fucking retard.
>>
File: ComfyUI_03312_.png (2.19 MB, 1024x1024)
These video gens with the fixed lora look like they're out of an entirely new model. local is eating good!
>>
>>106885492
I don't get it. What's the difference between the new one and this one that was released in August?

https://huggingface.co/Aitrepreneur/FLX/blob/main/Wan2.2-Lightning_I2V-A14B-4steps-lora_HIGH_fp16.safetensors
>>
File: 00155-2868255923.jpg (804 KB, 1536x1536)
>>106885492
what's the gen time for single image?
>>
File: ComfyUI_03316_.png (1.53 MB, 1024x1024)
>>106885512
picrel is/was Chroma1-HD-Flash. Extracting text embeddings is 6-7 seconds, 8 steps heun is 9-10 seconds
>>
File: ComfyUI_03313_.png (1.92 MB, 1024x1024)
>>106885505
>I don't get it.
It's magic to me too. Ask the god of latents Kijai
>>
File: 00138-2280755712.png (2.2 MB, 1824x1248)
>>106885346
haven't done 2d shit in a while so I'm kind of rusty with tags. doesn't look special to me.
https://civitai.com/models/1259516/sophie-sofi-3-outfits-tales-of-graces-tales-of-graces-f
https://files.catbox.moe/ilka1r.png
>>
File: ComfyUI_03319_.png (1.17 MB, 1024x1024)
>>
>>106885531
>>106885584
Still rocking the 1gb beast Emma lora?
>>
File: ComfyUI_03324_.png (1.3 MB, 1024x1024)
>>106885615
Rank 32 this time (213.8mb). I think it needs more baking. Ran out of captioned datasets and am starting to update old ones
>>
comfy when
https://github.com/hyeon-cho/Tangential-Amplifying-Guidance
https://hyeon-cho.github.io/TAG/
>>
>>106885688
Ah, you captioned it properly this time. How does it benchmark against the old version?
>>
File: ComfyUI_00147_.webm (3.85 MB, 768x768)
30 minutes on a 12gb 3060 with wan 14b. 5b is 5 minutes but it sucks.
>>
>>106885829
it's been in comfy for months as the 'Tangential Damping CFG' node.
nobody knows how to use it or what exactly it does though, so it was forgotten.
>>
>>106885958
it's a different thing, no?
>>
I heard we are back yes?
>>
>>106886059
the wan2.2 i2v lora i've been using seems no different from the new one posted, so i dunno.
>>
>>106886059
Back from genning tiktok videos and 1grils? I dunno.
>>
>>106886059
we never left desu
>>
>>106884573
I upgraded from 64gb of ram to 96gb (also have a 5090) and it was worth it to me as I use the fp16 Wan models. The noticeable boost is that the model switching (high>low) takes like 30 seconds now as it doesn't constantly use all of my system ram the entire time. If you just use the Q8 models of Wan then 64gb is fine imo.
>>
File: ComfyUI_0131.jpg (2.5 MB, 2432x1664)
>>
for wan if i wanted to create videos of a realistic character/celeb is it better to:

- generate a lora and use t2v
- use wan animate/vace to face swap
- something else?
>>
>>
File: ComfyUI_03326_.png (1.26 MB, 1024x1024)
>>106885855
Same captions. HD really dialed in the details
>>
Anyone have any joy running qwen on 12gb vram? i watched a video claiming it can be done and copied it, but it keeps failing.
>>
>>106886502
why does it look like that though
>>
>>106886140
does fp16 really make a difference over q8?

let's see how long fp16 takes, and if it's worth the extra time
>>
>>106885829
every single one of these guidance methods is snake oil, so nobody cares
>>
>>106886140
I'm running q8. I can run fp16, but with more loras it gets too heavy.
>>
>>106886529
You need to go outside and look at more women
>>
It seems like unless you have a supercomputer, local gen is a waste of time.
>>
>>106886435

Make a lora of them in SDXL/Chroma and then I2V those for the most consistency.
>>
>>106886568
4GB laptop gang represent
>>
Wan 2.2 lightning lora testing results:

After trying many combinations including old loras and new rCM, I got the best results with:

New HIGH:
https://huggingface.co/Kijai/WanVideo_comfy/blob/main/LoRAs/Wan22_Lightx2v/Wan_2_2_I2V_A14B_HIGH_lightx2v_MoE_distill_lora_rank_64_bf16.safetensors

Old LOW:
https://huggingface.co/Kijai/WanVideo_comfy/blob/main/LoRAs/Wan22-Lightning/old/Wan2.2-Lightning_I2V-A14B-4steps-lora_LOW_fp16.safetensors

4 steps, cfg 1, unipc

I think unipc does the most work here at only 4 steps to remove blur from motion, and the new HIGH lora does well to add more motion compared to the old HIGH lora.
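if you want the same setup programmatically, the relevant bits in ComfyUI API format look roughly like this (a sketch only: node ids, links, and the scheduler choice are placeholders, assuming the stock 2.2 i2v template with a LoraLoaderModelOnly in front of each of the two models and the usual paired KSamplerAdvanced nodes):

[code]
# illustrative fragment: new MoE distill lora on HIGH, old Lightning
# lora on LOW, 4 steps split 2/2, cfg 1, uni_pc
fragment = {
    "lora_high": {"class_type": "LoraLoaderModelOnly", "inputs": {
        "model": ["high_noise_model", 0],  # placeholder link
        "lora_name": "Wan_2_2_I2V_A14B_HIGH_lightx2v_MoE_distill_lora_rank_64_bf16.safetensors",
        "strength_model": 1.0}},
    "lora_low": {"class_type": "LoraLoaderModelOnly", "inputs": {
        "model": ["low_noise_model", 0],  # placeholder link
        "lora_name": "Wan2.2-Lightning_I2V-A14B-4steps-lora_LOW_fp16.safetensors",
        "strength_model": 1.0}},
    # model/positive/negative/latent_image links omitted for brevity
    "sampler_high": {"class_type": "KSamplerAdvanced", "inputs": {
        "add_noise": "enable", "noise_seed": 42, "steps": 4, "cfg": 1.0,
        "sampler_name": "uni_pc", "scheduler": "simple",
        "start_at_step": 0, "end_at_step": 2,
        "return_with_leftover_noise": "enable"}},  # feeds sampler_low
    "sampler_low": {"class_type": "KSamplerAdvanced", "inputs": {
        "add_noise": "disable", "noise_seed": 42, "steps": 4, "cfg": 1.0,
        "sampler_name": "uni_pc", "scheduler": "simple",
        "start_at_step": 2, "end_at_step": 4,
        "return_with_leftover_noise": "disable"}},
}
[/code]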
>>
>>106886568
a high end consumer pc is not a 'super computer'.
you just need a 3090/4090/5090 gpu and 64gb of ddr5 ram. that's it.
>>
>>106886568
you can imagegen with a $600-800 pc and videogen with a $1-1.5k pc. if you don't have that by now while liking tech, you've got bigger problems
>>
An H200 is about $32k.
A solar panel setup to save on electricity is about $40k.
Man, I really want to access the full power of these models unrestricted... but the cost savings from solar panels would be enormous... I could gen without ever worrying about how much electricity I use.

Such tough decisions.
>>
>>106886564
most women i see in real life are not violently saturated and pixelated

>>106886602
you mean generate pictures using sdxl/chroma then train a lora based on those? or generate an image in sdxl/chroma then use that for i2v?
>>
>>106886648
I should have clarified with image2video.
>>106886632
>you just need 3090/4090/5090 gpu and 64gb ddr5 ram
And that's faster than using a web gen?
>>
>>106886435
another option is i2v and switch the scene with the person (having them do whatever you want) through prompting.

>>106886532
A while back I tested Q8 and fp8 scaled against fp16 with the i2v models and the fp8 scaled just melted at higher video length (10 sec) while the Q8 was almost the same as the fp16 model (there were some additional minor movements in the background with the fp16 model). Honestly I just use the fp16 model because I can.
>>
File: Video_00064.mp4 (936 KB, 480x704)
>>106886700
i2v could work with a few tweaks
>>
File: SD35Medium_Output_36267.png (3.73 MB, 1216x1600)
>>
File: DiscoElysium_00013_.jpg (1002 KB, 1256x1704)
>>
https://www.reddit.com/r/StableDiffusion/comments/1o67ntj/comment/njfdj4a/?utm_source=share&utm_medium=web3x&utm_name=web3xcss&utm_term=1&utm_content=share_button
>A new somewhat interesting option is Nvidia's rCM distillation, which I also extracted as a LoRA:
>https://huggingface.co/Kijai/WanVideo_comfy/tree/main/LoRAs/rCM
>It's for 2.1, so for 2.2 it needs to be used at higher strength, but it seems to have more/better motion and also bigger changes to the output than lightx2v, granted we may not have the exact scheduler they use implemented yet.
>>
>>106886827
>>106886613
>>
>>106886841
I'd like him to show a result with rCM and see how it compares to that new lora (and rCM is for t2v so it's irrelevant to I2V)
>>
File: Video_00068.mp4 (1.63 MB, 480x704)
>>106886700
nah you were right
>>
>>106886661
>most women i see in real life are not violently saturated and pixelated
You need to get out more
>>
File: 1736403781670206.jpg (777 KB, 968x1152)
>>
>>106886502
can you post the lora?
>>
File: 1731489577131934.png (84 KB, 738x488)
https://xcancel.com/sainingxie/status/1977936710135669130#m
if this kills VAE I'm all for it
>>
File: 1751162075949848.png (1.7 MB, 1920x1039)
>>106885829
>oh wow guys, it looks better compared to SD3 at cfg 1
no shit, SD3 isn't supposed to run with no cfg
>>
>>106886538
>every single one of these guidance methods is snake oil, so nobody cares
which is insane to me. they invented CFG years ago and somehow it's still the optimal method. either we can't replace perfection and they were the luckiest motherfuckers on earth, or it can actually be replaced at some point
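since it keeps coming up, this is the entirety of what CFG does at sampling time (a minimal sketch; the names are illustrative):

[code]
# classifier-free guidance: run the model twice per step (with and
# without the text conditioning) and linearly extrapolate between the
# two noise predictions. scale == 1.0 disables guidance entirely.
def cfg(eps_uncond, eps_cond, scale):
    return eps_uncond + scale * (eps_cond - eps_uncond)
[/code]

one line of linear extrapolation, which is part of why it's so hard to beat: every "improved" guidance paper is competing with something that costs almost nothing and has a single intuitive knob.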
>>
>>106884426
Local? For real? What settings did you use?
>>
>>106884426
looks like the bouncing boobs and leaning forward lora
>>
>>106887145
it's math. there is one true way of doing something as efficiently as possible. all these knock-offs are just papermaxxing someone's resume
>>
>>106887204
>there is one true way of doing something as efficiently as possible.
and usually it doesn't manage that on the first try, it gets improved over time. but not CFG, they somehow nailed that shit on the first try
>>
>>106887007
>we’re also open sourcing a JAX nnx diffusion codebase that Willis @ma_nanye has been building
lol that's pretty jokes
>>
File: 1740993114453035.png (254 KB, 1652x940)
Lodestone is fucked since that new huggingface update lool
https://xcancel.com/bdsqlsz/status/1978114907598909724#m
>>
>>106887247
>Best-effort
>impactful work
the fuck?
>>
best sampling settings for wan 2.2 WITHOUT lightning?
>>
>>106887275
the default euler ones
>>
>>106887275
uni_pc/beta for anime
deis/beta for realism

that's what i've personally found works best.
>>
>>106887332
cfg/shift?
>>
>>106887378
1 cfg if using lightning
5 cfg without

5 shift for both
>>
>>106886613
>New HIGH:
>https://huggingface.co/Kijai/WanVideo_comfy/blob/main/LoRAs/Wan22_Lightx2v/Wan_2_2_I2V_A14B_HIGH_lightx2v_MoE_distill_lora_rank_64_bf16.safetensors
I'm not a big fan of that lora. yeah, kijai fixed it with his own format, but I'm still getting ghosting shit; the motion is there but something else is gone (for euler at least, it's a bit better on uni_pc like you said)
>>
File: DiscoElysium_00028_.jpg (655 KB, 1256x1704)
Chroma doesn't know how to type "fägäri" because of the ä. It's over!
>>
>>106886655
Unless you live in an actual hellhole (Cali, prepare to be arrested for using a local AI model), solar panels will never pay back against normal electricity prices. Solar panels are for buying electrical independence, not for saving money.
>>
>>106887607
from what i hear some places give you big monetary incentives to buy solar, which might make it worth it, but yes
>>
File: ComfyUI_03332_.png (2.23 MB, 1024x1536)
>>106886998
https://gofile.io/d/3h45bi
>>
>>106887556
kek
>>
>>106887652
based emma enjoyer, danke.
>>
>>106887640
No, there is no payback and it's all a scam. even with stealing the money of your fellow taxpayers you're lucky to break even over 30 years -- assuming they last that long -- and it's pure sunk cost, especially understanding that every 10 years there are major breakthroughs in solar panel and battery technology. Again, they're for energy independence and off-grid living, not a realistic way to save money. You'd be way better off spending the money on insulation, which literally pays back within several years.
>>
File: RaMu BeAcH.webm (3.92 MB, 852x1280)
>>106886827
>New High + rCM Low (8 steps)
Neat combo for motion, but I think the new High introduced some gamma issues. Everything's been a touch darker since I've been using it.
>>
>>106887696
>8 steps
nahh, 4 steps or bust. they managed to nail it on wan 2.1, I expect the same for wan 2.2
>>
File: 00090-1832475527.png (1.09 MB, 832x1216)
I'm trying to do a thing where we see a full-body illustration on one side of the image, and multiple close-ups of different parts of the girl's body on the other side of the image.
>ass focus, breast focus, crotch focus
>(full body:2.0), lower body, upper body
>close-up, multiple views
It only works 10% of the time.
Any tips on how to make this work more consistently? What kind of prompt would I use? Maybe there's already a lora for this?
>>
Ordered my Spark, I think my first experiment will be trying to make a long video Wan LoRA.
>>
>>106887728
train a lora for it, you are on /g/
>>
>>106887728
first thing I would try is changing it to (multiple views:2.0)
>>
>>106887694
I pay $500/mo for electric. That's $6k a year. After 5 years, I'd be saving money if I can get my electric cut in half. They have 10-15 year warranties.

>You'd be way better off and it'd be way cheaper to spend money on insulation
All the insulation in the world isn't going to reduce how much electricity my gpu + server uses when running 24/7.
>>
>>106887752
cool bait anon
>>
>>106887777
Which part, that I got a Spark or that it can train a long frame LoRA?
>>
>>106887728
gen x amount of images and take the one that has good composition and put that one into ControlNet with depthanythingv2. This way you'll almost always get the thing you want. You can also use gimp/photoshop to make the original gen for depth
>>
>>106887771
>15 year warranty
Well you're in luck anon I have a warranty for a crypto coin to sell you.
>>
>>106887752
you're going to train a lora on a machine with just ram? lol
>>
>>106887789
the spark is for LLMs, and its memory bandwidth is worse than a GTX 1080's. no idea what you're doing but good luck.
>>
>>106887814
>LLMs
You mean AI models that also use Transformers? Are you technically illiterate?

But yes, I'll also use it to finetune a Gemma model on smut.
>>
>>106887752
You will regret it. A Wan 2.2 i2v lora with just 100 640x360 clips takes almost 6 days on an Ada 6000, and that's far more powerful than a Spark.
>>
>>106887825
Here's your (You). Not wasting my time on you.
>>
>>106887854
You're wasting my time because you think LLMs and Diffusion models aren't fundamentally the same.
>>
>>106887864
>LLMs and Diffusion models are fundamentally the same
ohh that's a quality bait not gonna lie
>>
what a special little guy
>>
>>106887752
>Ordered my Spark
>GB10 delivers up to 1 PFLOP of sparse FP4 tensor performance, placing its AI capability roughly between that of an RTX 5070 and 5070 Ti
>>
>>106887841
Regret what? Having a machine training over time with 128 GB of available RAM, allowing for much larger model training? No wonder nothing ever gets trained or finetuned: retards need 5-second gratification on literally everything. I don't care if it takes 2 months to train if it results in long videos.
>>
Poorfags absolutely malding
>>
>>106887888
Yeah, I think the key point is that it has 128 GB of RAM; memory capacity is the bottleneck for most training, especially video models.
>>
File: 1751641381890583.mp4 (3.72 MB, 976x544)
>>106886613
Their new lora model isn't great desu, they had much more success with the 2.2 T2V one
>>
>bro why buy a terminal in 1987, don't you realize it'll take you 6 months to code anything?
>>
ey gais ordered my spunk, gonna do some llms and ais on it ... k cya
>>
>>106887888
https://www.reddit.com/r/LocalLLaMA/comments/1lk5te5/nvidia_dgx_spark_whats_the_catch/

>It was never meant as a stand-alone product for inference or training beyond testing whether what you're trying to do will actually work.
>>
File: WanVid_00064.webm (828 KB, 400x608)
I tried to trick it
>>
Kill cogsuckers. Behead cogsuckers. Roundhouse kick a cogsucker into the concrete. Slam dunk cogsucker's babies into the trashcan. Crucify filthy cogsuckers. Defecate in a cogsucker's mouth. Launch cogsuckers into the sun. Stir fry cogsuckers in a wok. Toss cogsuckers into active volcanoes. Urinate into a cogsucker's face. Judo throw cogsuckers into a wood chipper. Twist cogsuckers heads off. Report cogsuckers to the IRS. Karate chop cogsuckers in half. Curb stomp cogsucker. Trap cogsuckers in quicksand. Crush cogsuckers in the trash compactor. Liquify cogsuckers in a vat of acid. Smack cogsuckers. Dissect cogsuckers. Exterminate cogsuckers in the gas chamber. Stomp cogsucker heads with steel toed boots. Cremate cogsuckers in the oven. Lobotimize cogsuckers. Mandatory prison sentences for cogsuckers. Grind cogsuckers in the garbage disposal. Drown cogsuckers in acid. Vaporize cogsuckers with thermite. Kick old cogsuckers down the stairs. Feed cogsuckers to lions. Slice cogsuckers heads off with laserbeams.
>>
>>106887888
isn't that super close to 6000 Ada performance?
>>
>>106888089
>cogsuckers
but CogVideo is a fine model though :(
https://github.com/zai-org/CogVideo
>>
>>106888094
That doesn't matter, they can't afford it, and it's not 10 pflops for $350 with 512 GB of VRAM.
>>
>>106887275
What hardware are you running? Curious what you need to gen without the lightning loras
>>
>>106888199
3090+128ram
>>
>>106888066
>The NVIDIA RTX 3090 offers approximately 142 TFLOPS
>meant to be
According to whom? There are people who still buy 3090s for "cheap" finetuning, and those operate at much higher wattage and slower speeds. People are ultimately mad that the Spark is overpriced, which is true, but if it were $1000 there'd be a 2-year waiting period to buy one.
>>
File: 1746722118413402.jpg (1018 KB, 3570x4096)
https://xcancel.com/Alibaba_Qwen/status/1978150959621734624#m
I have a feeling they'll use those new VL models as the text encoder for their future video/edit models
>>
>>106888199
regular porn clips?
everything looks better without using lightning loras obviously, unless you're doing boring poses or something.
>>
>>106887890
>doesn't care if it takes two months
Uhuh. You will care. Enjoy your gold painted brick.
>>
>>106888124
What really happened to cog and mochi? They could have been on par with wan if they weren't so slow. Even ltxv kinda drifted off. Who knows, maybe they're all cooking behind the scenes.
>>
>>106888403
Computers aren't run by hand crank any more, grandpa. it can literally sit in the back of my office doing whatever I want, freeing up my main workstation. And it's a hilarious assertion: it taking two months really doesn't matter, especially compared to what you do, which is nothing. Maybe someone who isn't you will finetune something lmao.
>>
>>106888291
>but if it was $1000 there be a 2 year waiting period to buy one.
Only because nvidia doesn't give a shit. Notice how Apple, as gay as they are, is actually able to meet demand for a product, even at launch?
Again, if it's not a datacenter GPU, nvidia does not really give a shit. The Spark is a "look what I can do" toy for twitter idiots, most of whom got theirs for free or on loan.
>>
>>106888435
>Apple
too bad it can't do anything except specific inference workflows lmao, at least a Spark can run diffusers
>>
>>106888428
Sounds like you've never trained a LoRA before. You'll be in for a surprise when, two months later, you try it and realize it's overbaked, underbaked, you forgot a key setting, etc., and now you have to wait another two months.
Just return the fucking thing and buy a 5000 Pro, or a pair of Quadro 8000s, which BTW will be faster than the Spark, since they have much faster memory.
>>
>>106888472
Apparently you have never trained a LoRA because you think it takes 2 months to train one. Even if it was 1/4 the speed of a 4090, it's still literally several days, not several months.
>>
Question, how hard is video gen, ComfyUI, WAN to do/learn?

I've been screwing around for a while using Forge but haven't tried Comfy. Can i keep using Stability Matrix?
>>
>>106888472
Let anon bask in their stupidity. Some people need to learn the hard way.
>>
>>106888495
As we all know a LoRA is like baking a cake and there's no way to test its performance mid training, literally impossible, you just train for 2 months straight and cross your fingers. Every day it's more and more obvious why the LoRA and finetune ecosystem has dried up.
>>
>>106888428
Sorry, I totally missed the part where you hilariously said you were going to finetune a model. Yes, you do that on your Spark; everyone else is an idiot for using an H100 cluster.
https://huggingface.co/quarterturn/models
Now post your huggingface page.
>>
File: WAN2.2_00365.mp4 (3.81 MB, 960x542)
>>
I can't believe 1.5 is still better than 2.2 and there is no 3.0 after all this time.

Also, is this shit finally useful for making comic books, or is it just as useless?
>>
>>106888539
whitebros...
>>
>>106888546
>I can't believe 1.5 is still better than 2.2 and there is no 3.0 after all this time.
>1.5
you mean wan 2.1?
>>
File: ComfyUI_05934_.png (1.12 MB, 888x1176)
>>
>>106888524
Sorry, you’ll need to post a picture of your face next to your monitor showing that page with a piece of paper that says “I’m not doing stolen valor.”

>everyone else is an idiot for using an H100 cluster.
That's not my assertion, but that's par for the course from someone like you arguing in bad faith.

Thanks for conceding that you can fine-tune a model on a Spark. Now you'll have to concede that a Spark performs roughly like a 3090, which people do use to fine-tune models like Flux. In another thread you'd probably be calling that a "good GPU."

The only person pretending a Spark is supposed to be an H100 is you, and that's disingenuous. An H100 costs more than a dollar an hour to rent, meaning if you mess up a run, you literally just burn money. My Spark doesn't cease to exist after a bad fine-tune.

So then you have to move the goalposts to "wasted time," as if that even matters. People spend time on hobbies all the time. This isn't a hand-cranked computer; it runs without me touching it. Are you going to tell someone growing plants that they're wasting time because a seedling died after three months? Are you a real person, or just incredibly sad?
>>
File: WAN2.2_00370.mp4 (3.45 MB, 616x848)
>>106888539
>>106888555
checked
>>
File: wan22___0047.png (1.54 MB, 832x1216)
>>106888489
i don't know anything about stability matrix but if you have half a brain and can read documentation you can pick it up in a few weeks
>>
>>106888330
Didn't they already use VL to auto-caption their training set for Qwen-Image?
Actually, on that note, has anybody outside of Alibaba tried using VL for captioning?
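it's easy to try locally. a minimal sketch with transformers (assuming the Qwen2-VL 2B instruct checkpoint; swap in whichever VL variant you actually want to test):

[code]
# caption a single image with a Qwen VL model via transformers
from transformers import AutoProcessor, Qwen2VLForConditionalGeneration
from PIL import Image

model_id = "Qwen/Qwen2-VL-2B-Instruct"
processor = AutoProcessor.from_pretrained(model_id)
model = Qwen2VLForConditionalGeneration.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": [
    {"type": "image"},
    {"type": "text", "text": "Write a detailed training caption for this image."},
]}]
text = processor.apply_chat_template(messages, tokenize=False,
                                     add_generation_prompt=True)
inputs = processor(text=[text], images=[Image.open("sample.png")],
                   return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=128)
print(processor.batch_decode(out[:, inputs["input_ids"].shape[1]:],
                             skip_special_tokens=True)[0])
[/code]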
>>
>>106888199
you're just increasing the steps which doesn't change the hardware requirements, it just takes longer
>>
File: RaMu TrUnK.webm (3.93 MB, 852x1280)
>>106887711
I was only getting videos full of artifacts at four steps, but yeah, I'm sure they'll figure it out.
>>
>>106888539
Almost completed the Saika dataset earlier. Same difficulty as with k-pop stars: every photo is photoshopped to hell, so loras produce alien-face mongoloids
>>
File: WAN2.2_00372.mp4 (3.65 MB, 944x552)
>>106888807
oh hi! are you the hailey rose trainer?? what else did you cook up my man!
>>
File: WAN2.2_00374.mp4 (3.84 MB, 592x880)
>>106888835
>>
File: videoframe_4984.png (2.78 MB, 1188x1740)
>>106884374
Please, can anyone help? I'm trying to generate batches of images with the same seed for the wildcard generator and also the ksampler.

So far this seems impossible (generate n images, then change the seed). Even with a counter node I can't do it, because you can't reset the counter.

You also can't generate a seed with a node and then use that seed for the ksampler or the wildcard processor, because when you change it, it doesn't register until you start and stop generating.
>>
>>106888835
nah different guy
>>
File: WAN2.2_00376.mp4 (3.81 MB, 536x960)
>>106888904
>>106888977
ah my b
>>
>>106888911
Have you tried the LatentBatchSeedBehavior node?
>>
>>106888911
post workflow
>>
>>106888911
Use code and the local API. Not everything is a nail to be screwed.
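for your seed-every-n problem specifically, something like this works (a minimal sketch, assuming a stock install on 127.0.0.1:8188, a workflow exported with "Save (API format)" as workflow_api.json, and that "3" is whatever node id your sampler/wildcard node actually has in that file):

[code]
# queue n gens per seed against ComfyUI's local HTTP API,
# bumping the seed only between batches
import json, random, urllib.request

N_PER_SEED = 4   # gens before the seed (and thus the wildcard) changes
BATCHES = 10

with open("workflow_api.json") as f:
    workflow = json.load(f)

for _ in range(BATCHES):
    seed = random.getrandbits(32)
    workflow["3"]["inputs"]["seed"] = seed  # hypothetical node id, adjust
    for _ in range(N_PER_SEED):
        req = urllib.request.Request(
            "http://127.0.0.1:8188/prompt",
            data=json.dumps({"prompt": workflow}).encode(),
            headers={"Content-Type": "application/json"},
        )
        urllib.request.urlopen(req)
[/code]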
>>
File: 1744091827126195.mp4 (2.74 MB, 720x960)
>>
>>106889038
>>106889045
>>106889089
literally just: how can i change the seed, and thereby the generated wildcard, every n generations with the ImpactWildcardProcessor.

Why did they invent this "programming" language without loops...
>>
>>106888428
2 months for a cooked lora, meanwhile your electricity bill jumps 20%
>>
>>106889148
>guys how can I programmatically do something extremely specific without programming
>>
>>106889131
neat
>>
>>106888904
pot on head
>>
Where can I adjust extra parameters in OneTrainer... betas and such?
>>
File: 1748570477587942.png (814 KB, 1584x1648)
>>106888720
I sure hope not, this shit is ass
>>
>>106889148
i don't know that meme node. just use a fixed seed, queue X gens, change the seed, queue more?
>>
>>106889027
ooh i like this
>>
>>106889027
Damn, this is good.
>>
File: SD35Medium_Output_543643.png (2.99 MB, 1120x1680)
>>
>>106889027
>>106889200
yeah, pretty cool
>>
Can we gen our way out of Chinese Communist engineering dominance?
>>
>>106889266
>Chinese Communist engineering dominance
>dominance
what dominance? Sora 2 destroyed everything
>>
>>106889266
the US is cooked, might as well start learning Mandarin right now to get ahead of the curve
>>
>>106889266
The way NVIDIA is operating, and the degree to which the US economy is dependent on it, means we are begging for China to gain complete leverage over us.
>>
>>106889266
move Taiwan to Hawaii, just uproot the entire island
>>
>>106889461
>tsmc gets killed by a volcano or tsunami
epic
>>
>>106889480
the alternative is to move it somewhere outside of Cali, but then it would just be gay
>>
File: wan22___0014.png (1.5 MB, 832x1216)
>>106888835
my man. did this film style lora for WAN
>>
File: wan22___0022.png (1.26 MB, 896x1152)
>>106888904
tommy king body style lora
>>
>>106889373
>Sora 2 destroyed everything
only for a short time. now, sora 2 is as boring as the first. because of censorship
>>
China in charge:
>snakeoil paper spam
>slop datasets
>bench/bloatmaxxing
US in charge:
>censorship retardation
>saasfagging
>walled gardens

we lose no matter what. slavs should save us because they don't give a fuck
>>
>>106889518
yeah it's completely cucked, but you can make state of the art "cat working at mcdonalds" videos i guess
>>
File: 1740076951864263.png (88 KB, 201x251)
>>106889520
>slavs should save us because they don't give a fuck
but slavs are fucking retarded (djokovic is still my goat though)
>>
File: mikus_00001_.mp4 (1.66 MB, 1536x1920)
>>
>>
File: dmmg_0013.png (1.49 MB, 832x1216)
flux custom character lora (merging actresses).

>>106889027
this is dope as hell btw
>>
>>106889518
>>106889533
you can still make edgy shit though
>>>/wsg/5999084
>>>/wsg/5999085
>>>/wsg/5999088
>>
>>106889538
>slavs are fucking retarded
ever hear of retard strength? also all the papers that actually matter have slavs credited. they are the only ones not chasing bullshit
>>
>>106889541
tired of you spamming this shit. fuck comfyui. migu should not be advertising such slop
>>
>>
>>106887556
Great style
>>
>>106889623
i don't get it
>>
>>106889520
https://github.com/ai-forever/Kandinsky-5
some russian bank is making a video
model and they claim it outperforms wan
>>
>>106889734
>they claim it outperforms wan
https://www.youtube.com/watch?v=hpbGz9JPadM
>>
>>106889734
>2B parameters
>It outperforms larger Wan models (5B and 14B)
Riiiiiiight......
>>
>Kijai

I haven't really tested that much lately, I don't like the 2.2 Lightning LoRAs personally as they affect the results aesthetically (everything gets brighter), so for me the old 2.1 Lightx2v at higher strength is still the go-to.

A new somewhat interesting option is Nvidia's rCM distillation, which I also extracted as a LoRA:

https://huggingface.co/Kijai/WanVideo_comfy/tree/main/LoRAs/rCM

It's for 2.1, so for 2.2 it needs to be used at higher strength, but it seems to have more/better motion and also bigger changes to the output than lightx2v, granted we may not have the exact scheduler they use implemented yet.

some have said this one for the low noise with the new 2.2 one for high is a good combo; gonna give it a try.
>>
>>106889685
That's perfectly fine
>>
>>106889758
>It's for 2.1, so for 2.2 it needs to be used at higher strength
a shame he didn't say the strength value to use, I'm too lazy to test that out and find the good spot only to find out it's inferior or some shit
>>
File: 00073-3439849120.png (930 KB, 1344x768)
>>
>>106889807
>I've tried all of these in a few combos in the past hour on my 5090: new "moe distill i2v" that dropped earlier today, your MoE 2.2 i2v high you linked above, nvidia rcm, original 2.2 lightning i2v, 2.1 lightning i2v...

My best results by far so far are the version of the 2.2 i2v MoE distill lightning lora HIGH you linked above in high, and the nVidia rcm rank148 in low.

It's even better if you bump up the steps to like double, but that goes for all of these with motion...

gonna try the low and 2.2 kijai lora at 1 str first just to see what it does. otherwise 2.1 low works fine.
>>
>>106889807
the low model is basically the same as 2.1 so you don't need to increase the strength for low
>>
File: 1740334194873440.mp4 (878 KB, 704x480)
test prompt: 2.2 kijai lora (high) 1 str, nVidia rcm rank148 lora 1 str for low

the camera pans out and the man shakes hands with an anime style Hatsune Miku.

https://huggingface.co/Kijai/WanVideo_comfy/tree/main/LoRAs/rCM

https://huggingface.co/Kijai/WanVideo_comfy/tree/main/LoRAs/Wan22_Lightx2v

works, need to test other prompts though
>>
>>106884374
hey goys, this was posted here too, it was removed:
https://www.reddit.com/r/StableDiffusion/comments/1o655q8/bytedance_just_released_faceclip_on_hugging_face/

They were pickle files. Does anybody know why it vanished?
>>
File: 1742187541151325.mp4 (1.01 MB, 480x704)
>>106889915
the anime girl opens a white pizza box and eats a slice of pizza.

looks smooth to me. using rife interpolation node (2x).
>>
>>106889520
china is just racing to get ahead, but they have momentum and absolute staying power

the us is only good at short bursts due to the shit money cycle and short-term ROI requirements for everything
>>
>>
>>106889554
yeah real edgy watermarked shit
>>
File: sbs.jpg (601 KB, 1670x1216)
some experiments with wan speed up loras, same seed. i wanted to see what differences come up when these loras are in place, as i use them a lot. i think the difference is less dramatic when the prompt isn't complex, so i tried to put a twist on it. there are no other loras or upscale applied.

prompt:
>a dslr photograph of a 32yo blonde woman, a demon, a succubus, her eyes are red, and her skin is pitch black, lined with veins, standing in a graffiti filled alleyway. She wears a black thong with a short white crop top, with a relaxed yet deliberate stance. she has a sexy presence, strong cinematic lighting.

res_2s/beta57 for both.
>>
>>106889975
>moves the goalpost
you can remove the watermark if you go for the pro subscription (yeah, I know that sucks, but it's possible), and there are watermark removers on the internet that work fine as well
>>
>>106889952
the anime girl opens a mcdonalds paper bag and eats a mcdonalds cheeseburger.

yeah, this new combo works pretty well. still need to test more.
>>
>>106889990
>indian guy screaming at computer
whoa crazy stuff
>>
File: 1758218224536983.mp4 (1 MB, 480x704)
>>106889999
helps if I add the video.
>>
File: sbs2.jpg (513 KB, 1670x1216)
prompt:
>3dcg, a blue haired 19yo woman in a sailor costume sits on an inflatable whale floating in space

so we get a lack of understanding of some concepts with short prompts
>>106889915
thanks for this. gonna do some more testing after these
>>
File: 1740660208177521.mp4 (960 KB, 480x704)
>>106890011
the anime girl runs to the right very fast out a door in the white room, and closes it.

rife vfi for interpolation (2x, so 32fps); it seems faster than film vfi, but that one's better quality. this is for quicker gens.
>>
I don't get this Context Windows (Manual) node. seems like it does the exact same shit whether or not I have it enabled
>>
>>106890010
the computer explodes though! and the epstein files! and the heckin vegetables jokes makin fun of disabled people though!
>>
File: 1748457196499462.mp4 (1.31 MB, 480x704)
>>106890038
the anime girl gets in a teal colored convertible car and drives away out of the white room, through a garage door.

okay this combo is really good imo. 1 strength for both, base wan 2.2 i2v template in comfy, with the two loras added (2.2 kijai for high noise, nvidia one by kijai for low noise).
>>
>>106889969
love this
>>
File: sbs3.jpg (715 KB, 1670x1216)
prompt:
>In the still frame, a lone, ancient tree stands tall in the center of a misty marsh. Its twisted branches reach out against a cloudy, overcast sky, while the soft, reflective waters surround its base, creating a mirror-like surface. The muted tones of greens and grays evoke a quiet, haunting atmosphere, with wisps of fog drifting through the scene, emphasizing the solitude and resilience of the tree amid the vast, tranquil expanse of the marsh.
>>
File: sbs4.jpg (746 KB, 1670x1216)
last one. a full prompt for our 3DCG to see if that makes a difference.

prompt:
>In a surreal, otherworldly scene rendered in stunning 3D computer graphics, a 19-year-old woman with vibrant blue hair sits gracefully atop an enormous inflatable whale. She is dressed in a classic sailor costume, complete with a navy blue and white striped top, a sailor collar, and a small anchor emblem. The inflatable whale, with its glossy, smooth surface and cheerful expression, floats effortlessly through the vast emptiness of space. Surrounding her, the cosmos stretches infinitely—stars shimmer softly in the distance, and faint nebulae cast a gentle glow, creating a mesmerizing contrast between the playful innocence of her attire and the boundless mystery of the universe. The scene exudes a whimsical, dreamlike quality, blending childhood nostalgia with cosmic wonder in a visually striking, cinematic tableau.

all same seed: 671162467703350
>>
>>106889982
What's the process like for genning just one image on Wan?
>>
File: 1744847164258965.mp4 (1.33 MB, 480x704)
>>106890069
the man runs out the door as the camera tracks him, down the street of New York at night.
>>
>>106890133
Also, why does a lora speed things up? That's surprising to me, all the loras I ever used at best were the same speed.
>>
>>106890154
>he turns into another person
why can't ai into likenesses?
>>
>>106890163
they are speed-up loras: they let you generate an image with fewer steps, but as you can see they skip details and, in some cases, lose prompt adherence.

>>106890144
set frames to 1 instead of 81 or whatever
>>
>>106890163
>>106890167
stfu retard
>>
File: 1744581820037833.mp4 (1.85 MB, 640x640)
kek
>>
>>106889266
fuck off glowies, no one likes you
>>
File: 1747293202682629.mp4 (1.36 MB, 640x640)
>>106890188
the person in the blue coat throws a molotov at the van in front of him, causing it to explode into flames.
>>
File: file.png (1.79 MB, 1152x896)
So what's the peak UI atm? Is it still Comfy, or did another come out on top?
>>
>>106890188
lmao
>>
>>106890242
comfy since day 1. it can't be topped, due to the flexible, customizable nature of the program, as well as having only the smartest people in the world contributing to its source code. giving all of that to us for free? man... the guy responsible needs a nobel peace prize.
>>
>>106890172
They both added Earth. Or at least a planet.
>>
File: 1730199160780852.mp4 (1.25 MB, 640x640)
the large anime girl on the billboard waves hello, as cars drive by.

2.2 kijai lora (new) and 2.1 i2v rank64 lightx2v lora (old, but works fine)

the cars somehow didn't crash!
>>
>>106890242
For me, it's stable-diffusion.cpp. But imo it has a loooong way to go to get past alpha, including incomplete parameter validation bugs lol
>>
File: basic.png (2.36 MB, 1600x896)
>>106884374
>>106884374
>>106884374
>>
https://github.com/bytedance/FaceCLIP
>page not found
>>
>>106890242
what model/lora did you use for that goblin? mine always come out with normal faces/noses
>>
File: 1741286595632864.mp4 (626 KB, 704x480)
the anime girl is playing her guitar on stage.

2.2 kijai + older 2.1 i2v lora
>>
>>106890400
it would help if more people contributed. it's just four guys doing it
>>
>>106890480
Are you using 4 steps or 8? I keep getting grainy outputs with 4 no matter what I do.
>>
why can't i local diffuse music yet
>>
File: 1734644584112201.mp4 (871 KB, 704x480)
>>106890480
this time with the rCM nvidia lora that kijai posted for low:

all tested with 1 str for each lora

might be better, need to test more though.

https://huggingface.co/Kijai/WanVideo_comfy/tree/main/LoRAs/rCM
>>
>>106890501
default i2v wan 2.2 template in comfy, should be 6 steps (3/3 for high/low).
>>
File: jubilee.png (3.1 MB, 1296x1728)
>>
>>106890011
fuck you, now I'm craving that literal poison
>>
>>106890537
There's ACE-Step. It's pretty lo-fi
>>
>>106890569
it tastes like dogshit, fuck is wrong with you
>>
>>106890569
>poison
oxygen is poison, yet we consume this shit constantly. I don't see you complaining about that!
>>
haven't genned in a year but what the fuck happened to comfyui? it's so fucking shit now
>>
>>106890595
https://en.wikipedia.org/wiki/Enshittification
>>
>>106890537
>>106890575
I genuinely don't understand why there is no big "midi first" music diffusion model. Wouldn't that be the cleanest possible dataset, combined with the most utility for integration into existing music production workflows? It's not like midi files are even heavy, so you'd think the model would be easy to train and the data easy to tag.
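to the point about the data being light: midi is trivially machine-readable. a minimal sketch with mido (assuming pip install mido and some local .mid file):

[code]
# dump the note events from a midi file: this is essentially the
# entire "dataset" a midi-first model would have to learn from
import mido

for msg in mido.MidiFile("song.mid"):
    if msg.type in ("note_on", "note_off"):
        print(msg.type, msg.note, msg.velocity, round(msg.time, 4))
[/code]

a whole song is a few kilobytes of events like these, already perfectly labeled with pitch, timing, and velocity.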
>>
File: 1738399241194140.mp4 (1.34 MB, 704x480)
>>106890544
the anime girl puts down her guitar and starts playing the drums.

kek, with 2.2 kijai and 2.1 rCM (low), 1 str.
>>
>>106890595
>>106890609
ani samefag
>>
>>106890624
butthurt shill
>>
File: 1737630351404052.mp4 (840 KB, 704x480)
>>106890619
the anime girl transforms into hatsune miku, with a microphone.

I like this new combo, still need to test other stuff though.
>>
>>106890609
sad. where are we moving to now? sdcpp like anon linked? seems like a good idea since all this python shit just sucks for brainlets
>>
File: 1732562280931515.png (1.76 MB, 1280x1919)
>>106890609
for once it's a jew that made the great noticing kek
>>
>>106890582
It's the food equivalent of "it hurts so good"

>>106890592
Not all things are created equal. Oxygen is both necessary and damaging to our biology, and with a diet rich in antioxidants we have systems to mitigate the damage. Constant consumption of Mickey D's, on the other hand, is a death sentence
>>
File: 1753705156644775.mp4 (820 KB, 704x480)
the anime girl transforms her guitar into a black pistol, and fires it at the camera.

smooth bocchi.
>>
>>106890703
I still want to punch this faggot's face
>>
>>106890668
The language used to set up all of the CUDA machinery really doesn't matter that much. Python and PyTorch are ass but they're easy to iterate with and that's all that matters.
>>
File: 1748772244455822.mp4 (815 KB, 704x480)
the men who are sitting get up, kick their chairs, and walk off camera to the left.

no kick but pretty good.
>>
In short, what does it take (hardware, etc) to generate high quality videos today?
>>
File: 1742263191761777.mp4 (579 KB, 704x480)
>when AI companies try to charge you $100 per prompt
>>
>>106890794
nobody here knows because we all have to quant vid models. rent an h100 perhaps?
>>
>>106890794
>>106890810
I run it at full precision and the quality doesn't really seem to be any better than what I usually see here.
>>
File: 1754994046870697.mp4 (683 KB, 704x480)
>>106890796
>>
>>106890832
prove it :]
>>
>>106890794
nothing because wan only outputs 5, max 8 seconds
>>
File: 1748998386864494.mp4 (1.17 MB, 480x704)
the japanese girl jumps up and down.

with rCM for low, 2.2 kijai high:
>>
>>106890848
Sure, give me a prompt and I'll run it if I have time.
>>
>>106890885
prompt: anything you want, idgaf. just post something you made
>>
>>106890865
That's fine
>>
File: 1735961131455279.mp4 (1.25 MB, 480x704)
>>106890884
and this time with 2.1 i2v lightx2v distil rank64:
>>
File: 1756960652294340.webm (1.78 MB, 736x560)
>>106890751
>>106890796
>>106890847
your prompts are boring, you should make good stuff like me instead
>>
File: 1742028233487178.mp4 (1.05 MB, 480x704)
>>106890910
another with 2.1 i2v low:

I think the rCM one works a bit better, more motion/physics.
>>
File: 1739711425441239.mp4 (1.18 MB, 480x704)
>>106890941
and this is with rCM low. it seems to yield better motion in general.
>>
File: 1728993923542542.mp4 (909 KB, 480x704)
>>106890957
seems to work, diff image:
>>
File: ff_g_wan__00022_1.mp4 (3.03 MB, 864x768)
>>106890893
Sure
>>
>>106890668
>>106890595
>>106890609
samefag. so obvious.
anon hasn't "genned in a year" and suddenly mentions sdcpp as if that is something anyone remotely uses.

fuck off. your tactics are getting old.
>>
File: screenshot.1760483612.jpg (208 KB, 591x668)
NetaYume Lumina is officially supported in ComfyUI! ComfyUI remains the leading UI for cutting-edge AI!

https://x.com/ComfyUI/status/1978127680886521869
>>
File: are you ok anon?.png (33 KB, 504x240)
>>106891036
?
>>
>>106891049
this shit any good?
>>
>>106891035
wow, your settings suck. didn't even bother to mention speed or hardware. fuck you anon, and do the second pass like a normal human bean
>>
File: 1736531098818906.mp4 (1.02 MB, 480x704)
okay, boobs are nice, but let's try something more fun.

the anime girl snaps her finger, and the black man on the floor disappears.
>>
File: s.jpg (38 KB, 672x122)
>>106891056
anyone can inspect element or use vpn/proxies to dual post anon.
>>
>>106891059
no it's severely undercooked and the authors ran out of money
>>
File: 1748427121084863.png (538 KB, 1074x525)
>>106891063
he became a dolphin or what? lool
>>
>>106891077
too much fent.
>>
File: 1747596653940764.webm (2 MB, 528x784)
>>106890884
>>
>>106886940
fucking love this, well done
>>
>>106891049
nobody cares. tell the authors to add it to sdcpp
>>
>>106891061
The settings are all stock from the hf repo. RTX 6000 BW. Feel free to suggest something if you want to see it. I'm not too interested in video.
>>
File: you right now.png (254 KB, 640x480)
>>106891073
so how can I prove I'm not a samefag? that's the funny thing about this, you claim someone is a samefag while knowing he has no way to prove his good faith, that's delightfully devilish, Seymour
>>
>>106891088
but it's so obvious except when asked for the burden of proof!
>>
>>106891059
no, that's why no one here uses it.
the ONLY models that matter currently for local are: XL (image gen for anime), WAN (video gen), Qwen (editing) & Chroma (image gen realism).
>>
>>106887652
thank you for the lora
>>
>>106891088
>shits on comfy as usual then immediately brings up a meme alternative
anon, we aren't stupid. I can also dual post. Watch, I will reply to this post within 5 seconds of posting, which is impossible even with a pass.

Check it
>>
>>106891114
hi
>>
>>106891121
>>106891121
>>106891121
>>106891121
>>
>>106891114
>>106891118
>it is possible to samefag therefore everyone samefags
if only life were this simple
>>
File: 1748657684254626.mp4 (1.77 MB, 480x704)
the anime girl stands up and starts dancing beside the cop car.

this time: 2.2 kijai lora high, 2.1 i2v distilled rank64 low.
>>
>>106890615
>"midi first" music diffusion model.
idk, diffusion has mostly not been used for this type of data. But someone says there is a language model that uses diffusion, so idk, maybe.

I think *we* have the potential to train an llm to create um...

idk, what are rosegarden files called? I hate them, but I think ai could make those really well, and they can be turned into midi.

btw, there are commercial ai out there that seem to be basically "techno" midi. idk what they call it, edm? they seem only eh, so-so - but maybe I did it wrong trying them out.
>>
>>106890751
That's actually disturbing, it looks pretty real. The girl behind is perfect, except for the deformed hand.
>>
AAAAHAHAHAHAHAHAHAHAH

https://github.com/leejet/stable-diffusion.cpp/issues/396

>closed

>not fixed

stable-diffusion.cpp does NOT use clip_l.
>>
>>106891096
lmao chroma sucks



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.