/g/ - Technology


Thread archived.
You cannot reply anymore.


File: collage.jpg (3.07 MB, 5737x3170)
Discussion of Free and Open Source Text-to-Image/Video Models

Prev: >>106904218

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/sd-scripts/tree/sd3
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts
https://github.com/tdrussell/diffusion-pipe

>WanX
https://comfyanonymous.github.io/ComfyUI_examples/wan22/
https://github.com/Wan-Video

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
Training: https://rentry.org/mvu52t46

>Neta Lumina
https://huggingface.co/neta-art/Neta-Lumina
https://civitai.com/models/1790792?modelVersionId=2203741
https://neta-lumina-style.tz03.xyz/

>Illustrious
1girl and Beyond: https://rentry.org/comfyui_guide_1girl
Tag Explorer: https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/b/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
>>
>mfw
>>
File: WAN2.2_00380.mp4 (3.69 MB, 640x816)
blessed bred
>>
File: ComfyUI_00002_.png (341 KB, 512x640)
>>
>>106910897
simple and nice
>>
>>106910897
off to a good start
>>
Blessed thread of frenship
>>
Blessed thread of frenship
>>
>>106910897
strangely hypnotic
>>
File: WAN2.2_00382.mp4 (3.75 MB, 608x856)
>>106910897
>>
comfy shoudl be dragged out on the street and shot
>>
File: 45747445.png (243 KB, 785x1000)
is 12GB VRAM and 32GB RAM enough to train chroma loras or will my computer shit itself
>>
Which qwen image edit lightx2v lora is recommended?
>>
>>106911030
only if you're wh*te
>>
Is there a bypass for Lora Manager license key?
>>
File: WAN2.2_00386.mp4 (3.67 MB, 670x720)
>>106911034
>>
>>106911048
beg ai-toolkit dev to tell you how to setup his project with ramtorch for chroma
>>
File: image_00121_.jpg (695 KB, 1264x1712)
>>
That faggot comfy changed something and now my wan previews aren't animating, despite me setting it TAESD. It's just a static fucking image. Anyone know how to fix it?
>>
>>106911217
Display animated previews when sampling in options. It's always been like that. You might've pulled and it reset or something
>>
>>106911230
I set it to auto-pull on launch and it's never disabled it before, but you're right. What a cunt.
>>
>>106911036
>The size of tensor a (768) must match the size of tensor b (128) at non-singleton dimension 2
Yeah, I'm thinking you're right. This error is literally random now.
>>
File: image_00123_.jpg (808 KB, 1264x1712)
>he pulled
>>
>>106910887
extremely based collage
>>
>>106911048
I train on 12GB/48GB with onetrainer and on loading it maxes my ram but idk if it just takes all available or if I'm at the threshold.
>>
>>106911317
>1girl x10
yeah bro extremely based bro, never seen shit like this before
>>
File: radiance.png (2.81 MB, 864x1488)
>>
>>106911334
You fucking fags have whole discord just for you already. Fuck off.
>>
What lightx2v's are best for Wan 2.2? I've tried a couple, but I feel like motion is worse than 2.1 with its own lightx2v LoRA.

This was recommended by some redditor, but same issue, motion seems stiff :

High - Wan_2_2_I2V_A14B_HIGH_lightx2v_MoE_distill_lora_rank_64_bf16

Low - Wan_2_1_T2V_14B_rCM_lora_average_rank_148_bf16
>>
>my gens didnt make the collage
shit thread, reported, saged, contacted hiroshimoot, spammed the irc and sent an anonymous report to the FBI
>>
File: radiance.png (3.27 MB, 864x1488)
>>
File: radiance.png (2.85 MB, 864x1488)
>>106911411
kek
>>
>>106911317
not with catjak tranny in it
>>
File: radiance.png (2.48 MB, 864x1488)
>>
File: radiance.png (3.27 MB, 864x1488)
>>
File: image_00133_.jpg (741 KB, 1264x1712)
>>
>>106911457
that's a sexo from me
>>
File: nodes.png (30 KB, 276x529)
is there any way I can change the value of the width/height fields for all at once instead of having to go through 1 by 1
>>
>>106910887
>snubbed again in favor of some absolute slop
This baker fucking sucks
>>
>>106911556
Integer node -> connect to all of them.
>>
>>106911556
>>
File: ohshit.mp4 (143 KB, 500x376)
>>106911577
>>106911579
>you can just do that
>>
>>106911556
but why do this anyways?
>>
>>106911600
Now imagine doing that in other UIs.
>>
File: nodes2.png (96 KB, 988x502)
>>106911608
sdxl batch generation
I can sense I'm about to get shit on
>>
>>106911612
>batch_size 1
nigga bffr
>>
>>106911615
tried it with just one and playing with batch size, destroys the result
>>
File: 1739967043839916.jpg (399 KB, 2016x1152)
>>
>>106911322
what do you use to set the captions for a chroma dataset?
>>
>>106911657
taggui with joycaption. Need to load the 4bit version so I don't oom. Needs QC afterwards.
>>
>>106911657
some people use gemini with great success
>>
File: 1755867583030139.png (136 KB, 661x935)
wan 2.2 anons, use this lora setup with the new kijai 2.2 lora, works really well

and shift 8:

4 steps, works well
>>
>>106911692
Doesn't gemini have a daily limit? How do I batch 40-50 pics?
>>
File: 1732836395387686.mp4 (1.27 MB, 480x704)
>>106911706
the anime girl on the large advertisement waves hello, as people walk by on the streets of Tokyo.
>>
>>106911706
>stole it from reddit
kek
>>
>>106911714
too bad the city and people all look like plastic
>>
File: ComfyUI_00122_.png (1.73 MB, 1024x1024)
>>106910662
>>
>>106911729
well, it is a good combo so why not link the node setup

also template shift was 5 default, 8 seems to help motion too.
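
For context on what changing shift from 5 to 8 does to the noise schedule, here is a minimal sketch. It assumes the time-shift formula commonly used with flow-matching samplers, sigma' = shift * sigma / (1 + (shift - 1) * sigma); whether the linked workflow applies exactly this formula is an assumption, and the function names are illustrative only.

```python
def shift_sigma(sigma: float, shift: float) -> float:
    """Remap a noise level in [0, 1]; shift > 1 pushes it toward 1 (more noise)."""
    return shift * sigma / (1.0 + (shift - 1.0) * sigma)


def shifted_schedule(steps: int, shift: float) -> list[float]:
    """Linearly spaced sigmas from 1 down to 0, remapped by the shift."""
    return [shift_sigma(1.0 - i / steps, shift) for i in range(steps + 1)]


if __name__ == "__main__":
    # Compare the two shift values mentioned in the thread on a 4-step schedule.
    for shift in (5.0, 8.0):
        print(shift, [round(s, 3) for s in shifted_schedule(4, shift)])
```

Under this assumption, a higher shift keeps the intermediate steps at higher noise levels, so more of a short 4-step run is spent where large-scale motion is decided. That may be why shift 8 appears to help motion here, but it is a guess, not a measured result.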
>>
>>106911706
Never tried mixing the light loras like that. If it works, cool.
>>
>>106911709
>Doesn't gemini have a daily limit? How do I batch 40-50 pics?
As far as I know people who use it pay for it. No clue how they feed it a batch of images, but I'd guess it's just asking Grok to code a Python script for it
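
A rough sketch of what such a batch script could look like. It only handles the folder loop and writes a same-named .txt caption next to each image, which is the layout most LoRA trainers expect. The captioner itself is left as a pluggable function: `placeholder_caption` is a hypothetical stand-in, and no real Gemini or JoyCaption API call is shown here.

```python
from pathlib import Path
from typing import Callable

IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".webp"}


def caption_folder(folder: str, caption_fn: Callable[[Path], str],
                   overwrite: bool = False) -> int:
    """Write a same-named .txt caption next to every image; return how many were written."""
    written = 0
    for image in sorted(Path(folder).iterdir()):
        if image.suffix.lower() not in IMAGE_EXTS:
            continue
        sidecar = image.with_suffix(".txt")
        if sidecar.exists() and not overwrite:
            continue  # keep captions that were already written or hand-edited
        sidecar.write_text(caption_fn(image).strip() + "\n", encoding="utf-8")
        written += 1
    return written


def placeholder_caption(image: Path) -> str:
    """Hypothetical stand-in; replace with a call to your captioning model or API."""
    return f"a photo, filename {image.stem}"
```

Because existing captions are skipped by default, the script can be re-run after hitting a daily limit and it will only process the images that are still missing a caption.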
>>
>>106911746
sdxl really doesn't believe in prompt adherence unless you cajole it like an autistic who speaks in google search
>>
>>106911746
>everything is wrong
lol
>>
>>106911709
If you've only got a small dataset, just use joycaption. You'll have to go through and do some cleanup afterwards, but that's standard for local tagging models.
>>
>>106911755
nothing wrong with just linking to the reddit post that has all the info and workflow already
>>
File: 1750468602644971.mp4 (1.16 MB, 640x640)
the pink hair anime character is standing on a car drifting around the street of Tokyo at night. Smoke emits from the tires as it drifts.

with new setup (shift 8, from 5, + loras)
>>106911799
https://pastebin.com/g19a5seP

cant link site thinks it's spam.

this is WAY smoother than before. it has rife VFI interpolation but still, much better motion. the new kijai lora works very well for the high noise pass.
>>
>>106911816
also, qwen edit is great for making wan 2.2 i2v source content.
>>
>>106911839
>also, qwen edit is great for making wan 2.2 i2v source content.
is it when you put end image there as well so it knows what's supposed to happen?
>>
File: 1751708608465127.mp4 (1.58 MB, 640x640)
the pink hair anime character is standing on a car which drives over a ramp and flies high into the sky in Tokyo.

holy shit, that escalated fast.
>>
>>106911858
rocket league double jump
>>
>>106911855
you can do that for the first/last wan one, havent messed with that too much though just regular i2v.
>>
>>106911816
>cant link site thinks it's spam.
https://www.reddit.com/r/StableDiffusion/comments/1o8exnu/
weird never had issues myself

https://www.reddit.com/r/StableDiffusion/comments/1o7r7sb/
https://www.reddit.com/r/StableDiffusion/comments/1o8662h/
if anybody is interested in playing with animate more these look interesting, even has a cunny showcase
>>
File: image_00145_.jpg (710 KB, 1264x1712)
>>
File: AniStudio-00649.png (2.47 MB, 800x1280)
>>
>>
File: 1758691329805062.mp4 (1.78 MB, 640x640)
yeah, the lora combo + higher shift (8) seems to be a winner. this will do till wan 2.5 if it comes out.
>>
I get a bigger buzz seeing my influence in other gens, rather than direct acknowledgement
*cough* more leather *cough*
>>
File: 1752091978773568.mp4 (502 KB, 640x496)
the blonde anime girl drinks her tea as people outside her car walk by.

yeah this combo is definitely a big improvement, the new 2.2 kijai lora + the 2.1 combo seems much better than the old setup.
>>
File: ComfyUI_00007_.png (500 KB, 448x576)
>>
>>106912019
what card do you gen on?
>>
>>106912023
4080 (16gb)

wan works fine on almost anything though, 8-12gb works too
>>
retard here new to this. Got Wan2.2 on Comfy running and was recommended to use a dictionary autocorrect spellchecker and wildcards. What the fuck does any of that mean/do
>>
>>106912037
>hey claude
>retard here new to comfyui. Got Wan2.2 on Comfy running and was recommended to use a dictionary autocorrect spellchecker and wildcards. What the fuck does any of that mean/do
>>
>>106912037
ask grok
>>
Is the turbo contrarian trolling on /h/ all just one dude? I kinda suspect it is at this point, he just aggressively disagrees about everything almost no matter what it is in a pretty distinct way
>>
>>106912037
>recommended to use a dictionary autocorrect spellchecker and wildcards
wut. what are you trying to do? if you're just doing i2v then prompting is very simple
>>
>>106912054
https://files.catbox.moe/5zjzyd.webm
>>
>>106912037
>was recommended to use a dictionary autocorrect spellchecker
Huh? Are you a special needs person or retarded? Just type normally into the fucking box and gen, what the fuck man.
>>
File: 1735541842223448.mp4 (1005 KB, 640x640)
>>106911855
the man puts on a blue hat

cool, it works
>>
>>106912075
saar...you must ask grok...redeem the grok
>>
>>106912082
he should hold up a sign after which says "IT JUST WORKS"
>>
File: stupid bitch.png (89 KB, 214x233)
do i have to close forge everytime i want to delete a lora i used but didn't like? i can delete loras after using them in comfy
>>
File: Screenshot_2.png (97 KB, 664x939)
>>106911706
Where can I find this file? For exact comparison.
>>
File: 1758776293234536.mp4 (829 KB, 704x496)
wow, this combo is so much smoother. and I went from 6 steps to 4 (default).

two FBI agents arrest the man and take him away, off screen to the right.
>>
>>106912122
should be this:

https://huggingface.co/lightx2v/Wan2.2-Lightning/tree/main/Wan2.2-I2V-A14B-4steps-lora-rank64-Seko-V1
>>
>>106912136
the low noise one (for wan 2.2 low)

the other one is the default wan 2.1 lightx2v lora.
>>
>>106910887
Based hand crafted collage
>>
>>
>>106912075
>https://files.catbox.moe/5zjzyd.webm
grok is this true?
>>
neta is so fucking dogshit, what a waste
>>
>>106912136
also I think the shortened link was considered spam.

https://www.reddit.com/r/StableDiffusion/comments/1o8exnu/zero_cherrypicking_crazy_motion_with_new_wan22/

this works well. (workflow in post)
>>
I wish i had gork
>>
Any linux wizards here? thinking of going full linux or linux with virtual machine with an rtx, what is best recommended for running comfy? Won't be trying dual boot shit again (too many issues)
>>
>>106912204
virtual machine is worst recommended
>>
>>106912204
for a while i just grabbed a 1tb ssd, made it a bootable ubuntu install and ran comfy in there. when i wanted/needed windows back i just removed the usb cable and rebooted. linux was slightly more performant though so i still use both
>>
File: annie.jpg (108 KB, 1024x1024)
>https://xcancel.com/__TheBen/status/1829554120270987740#m
>two layers at 640px

why don't we hear more about this?
does anyone know other lora hacks?
>>
>>106912136
is that the same as the original 2.2 light lora? my fucking head is spinning from all these versions
>>
File: 81042162538295_00001_F.jpg (1.4 MB, 2000x3000)
>>
File: 1752297817345700.mp4 (634 KB, 704x496)
the man puts on a wizard hat and casts a frost spell, making the desk turn to ice.

kek idk why it did a transition
>>
>>106912235
Wait..so I can just unplug windows and plug in linux without dual boot fuckery? The idea was to go full linux but if I can just do that, then that would save a world of headache. I just wanna keep win 10 as all of my softwares work and dont require updating
>>
File: 81042162538295_00001_F2.mp4 (3.68 MB, 1024x1536)
>>106912296
>>
>>106912290
there is a new version, kijai fixed it

https://huggingface.co/Kijai/WanVideo_comfy/tree/main/LoRAs/Wan22_Lightx2v
>>
File: 1750414098318120.mp4 (626 KB, 704x496)
>>106912319
the man puts on a wizard hat and points in the air, causing a large block of ice to form on his desk.
>>
>>
>>106912342
Look at you fag forgot to change filename?
>>
>>106912352
what
>>
File: ComfyUI_07400_.png (2.15 MB, 1152x1152)
>>106910389
>I never got base Chroma to produce anything remotely close to this

Nta but you'd be surprised what Chroma can do when you prompt it slightly differently. Like I didn't think it could do gyarus, and it doesn't when you just prompt for it, but actually it turns out that simple change of prompt and enhanced description is all it takes to get gyarus.
>>
File: ComfyUI_00784_.png (1.7 MB, 720x1280)
>>106912323
yeah i have my boot order to boot off the disk if it's present, if not just regular ass windows. i wanted the same set up since i use this machine for gaming, and the ubuntu for genning. plus who knows what the fuck i'm downloading with these models
>>
File: 1748704496338947.mp4 (429 KB, 640x640)
the anime girl is typing at her computer.
>>
File: ComfyUI_07394_.png (2.38 MB, 1152x1152)
>>
File: test4.webm (3.54 MB, 768x1202)
Idle
Attack 1
Attack 2
Run
Guard
Evade
Taking Damage
At Low HP
>Incapacitated
Triumph
Flourish

1st test: Shift 8, 4 high/4 low steps. No net benefit to the complicated mix. For wan22 mix. The spark is missing when weapons collide. The fog in the background was unnecessarily denoised. Blood already on the floor before the girl falls. Freaking reddit should post a side by side comparison before screaming from the rooftops of their break throughs.
>>
File: 1732001248059577.mp4 (667 KB, 640x640)
>>106912371
the anime girl stands up and walks out of the computer lab.
>>
>>106912417
>should post a side by side comparison before screaming from the rooftops of their break throughs.
very rarely happens even here
>>
File: 1751899030550743.mp4 (669 KB, 640x640)
kek

the man opens the bag of ONIONS potato chips and eats one.

the motion is OBJECTIVELY better with the new lora + combo.
>>
File: boing.mp4 (3.3 MB, 640x1096)
>>
I'm against it.
>>
>>106912492
I wanted lemon lime!
>>
whens that faggot going to post his coveted ani shota collection?
>>
>>106912352
What is bro hallucinating about?
>>
File: 1746500961142691.mp4 (694 KB, 640x640)
>>
File: ComfyUI_0346.jpg (1.83 MB, 2048x2048)
>>
>>106912492
yea, good result. what is the "combo"?
>>
>>106912366
prompt? and what chroma version
>>
>>106912545
https://www.reddit.com/r/StableDiffusion/comments/1o8exnu/zero_cherrypicking_crazy_motion_with_new_wan22/

workflow: https://pastebin.com/g19a5seP
>>
File: radiance.png (2.82 MB, 864x1488)
>>
File: 1748813465896441.mp4 (1.44 MB, 640x640)
>>106912091
>>
File: 1734418836052784.mp4 (1.29 MB, 640x640)
>>106912539
the pink hair anime girl puts down her guitar and starts playing the drums on stage.

smooth transition desu
>>
the noob TE schizo is back with more useless bullshit that doesn't do anything
https://redlib.catsarch.com/r/StableDiffusion/comments/1o7nnc1/clips_can_understand_well_beyond_77_tokens/

why do these "people" go to all the effort on their model training exercises only to provide nothing of practical value?

>>106909976
there are some startup flags that apparently can mitigate this, search comfyui github issues on AMD. I have yet to do a deep dive and test to figure out what works. also there's a way to turn off the bullshit compiling phase that SDXL models go through whenever you gen at a new res
>>
>>106912580
>why do these "people" go to all the effort on their model training exercises only to provide nothing of practical value?
autism is a hell of a drug
>>
>>106912552
thanks, will have a look
>>
>>106912483
niggers always do this. they go WHOA CHECK THIS OUT, post one comparison then never talk about it again
>>
>>106912580
>clip l
I sleep
>>
File: 1749731994493163.mp4 (1.64 MB, 640x640)
the pink hair anime girl is running around the stage while playing her guitar.
>>
>>106912341

Does this only work well with img2vid? and not txt2vid?
>>
File: 00184-2202340486.png (2.66 MB, 1248x1824)
>>
>still no updated I2V 2.2 lightning lora
>>
>>106912613
t2v should work fine but the workflow/setup in this case is with i2v loras. not sure there is a new 2.2 t2v update
>>
does anyone know how to prompt picrel? - https://files.catbox.moe/rwgu80.jpeg

not so much concerned about the magazine cover style, more about the pose with chair and the sparkling water and lighting.

it looks like derp photoshop
>>
File: ohnononolton.png (836 KB, 904x1152)
>>106912572
based
>>
File: ComfyUI_07410_.png (1.95 MB, 1152x1152)
>>106912548
HD Flash
https://files.catbox.moe/dlksp4.png

Btw there's also a way to get stylized/filtered images with Chroma (pic rel).
>>
File: ComfyUI_01285_.png (673 KB, 1024x1024)
>>
File: test5.webm (3.67 MB, 768x1202)
Less coherence with this reddit combo. Imagine my disappointment, my day ruined.
>>
>>106912636
>Btw there's also a way to get stylized/filtered images with Chroma
ok. prompt?
>>
>>106912323
i keep a portable nvme with windows on it for when i want to be a gaymer. you dont have to dual boot
>>
>>106912628
thanks, so maybe try that moe img lora, but with light text loras. Since I cant find a text version of that moe lora.
>>
File: 00001-377839115.jpg (1.35 MB, 2048x2560)
>>
File: ComfyUI_07417_.png (1.94 MB, 1152x1152)
>>106912643
https://files.catbox.moe/k7p6ku.png
>>
>>106912737
Cool
>>
>>106912646
>>106912367

Nice, this seems like the best option.
>>
File: 00008-1827925854.png (584 KB, 512x640)
>>
>>106912289
>does anyone know other lora hacks?
lora+ actually works
>>
File: 1744345978239860.mp4 (1.61 MB, 640x640)
the anime girl stands up and starts dancing in the Japanese classroom.
>>
Comfy is a PROUD partner of NVIDIA, getting EXCLUSIVE access to NVIDIA's products for making ComfyUI truly great!
>>
File: 1732043276799670.mp4 (600 KB, 800x480)
>>106912492
I like the OG better
>>
>>106912977
The chalkboard eraser was *completely gone* for 32 frames and the model still remembered it perfectly!
>>
Any lora or prompt to type to get a digicam kind of look on Chroma?
>>
>>106912987
wtf?? give me the sauce anon lmao
>>
>>106913039
https://x.com/ComfyUI/status/1978529150798569531
>>
File: ComfyUI_07430_.png (1.68 MB, 1152x1152)
>>106912774
Thanks
>>
File: 1675527265623422.jpg (36 KB, 604x604)
>>106912987
>for making ComfyUI truly great
great in what? more api? lmaooooo
>>
i wish chroma wasn't so FUCKING SLOW WHAT THE FUCK
>>
File: 1742248599636002.png (681 KB, 1045x698)
>>106913050
didn't know will I am was a nerd, based lol
https://www.youtube.com/watch?v=WpYeekQkAdc
>>
>>106912987

Why are they shilling Spark? That thing is so underpowered.
>>
>>106913074
william is a larping aliexpress merchant
>>
>>106913081
Because NVIDIA is paying a bunch of AI relevant companies to shill it.
>>
>>106912987
>Comfy: "No I will not implement HunyuanImage 3.0 it's a bloated product"
based
>Also comfy: "Yass qween, Nvdia DGMeme is the future!"
cringe...
>>
>>106913081
what is the use case again? certainly not image diffusion. it doesn't even have that much memory for LLMs.
>>
File: 1748270988287389.mp4 (1.62 MB, 640x640)
the anime girl stands up and shakes hands with hatsune miku.

I like the new lora + combo, and im using 4 steps instead of 6 now, still good outputs:
>>
I was pretty involved in SD in the early days. What models do you guys use these days? Everything still seems to be based on 1.5. Has Flux taken over SD?
>>
>>106912987
the comfy curse. anything he endorses is shit. all started with sd 3.0
>>
>>106913133
anime: noobai/illustrious, for anime I use wai v15

realism: qwen, qwen edit

video: wan 2.2 + lightx2v
>>
>>106913101
he barely said shit for like a 5 sec clip and got a dgx spark for free to fuck around with. i'm jelly
>>
>>106913115
for using large text based models like deepseek r1 or Ollama. it is not for image/video gen. no, it is not even for training or finetuning either.
>>
>>106913134
>the comfy curse. anything he endorses is shit. all started with sd 3.0
Did he really endorse it? I can't believe it
>>
>>106913158
for months he was saying it's the best model ever
>>
>>106913165
I don't believe you
>>
>>106913176
then you have a year of lore to catch up on
>>
>>106913165
>>106913176
he said that because he was an employee of StabilityAI, then he realized he was selling his soul to the wrong company, now it's all right, he's selling his soul to API nodes, that's much better
>>
>>106913115
from reddit;
>DGX Spark is a dev kit for GB300. So if you’re developing a high performance software and can’t afford to buy/rent GB300 for development, you can buy DGX Spark and test your code there.

>DGX Spark is not for local LLM inference.
> If you buy one, do not use it for LLM inference, that's dumb.

https://www.reddit.com/r/LocalLLaMA/comments/1o69vm5/whats_the_point_of_a_dgx_spark_for_inference_if_a/

Basically the anon 2 days ago that bought it thinking it was going to make WAN loras was stupid as fuck.
>>
https://github.com/comfyanonymous/ComfyUI/pull/10373
>Workaround for nvidia issue where VAE uses 3x more memory on torch 2.9
wtf, did anyone switch to torch 2.9?
>>
>>106913180
>>106913185
This has to be bullshit
>>
>>106913208
if you update to the latest version, it will put you on torch 2.9, which breaks xformers btw. there's an alternative updated version that still uses 2.8 though.
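
To see which versions an update actually left you on, a small standard-library check can be run with the same Python interpreter that launches ComfyUI. This is a sketch: the package names are the usual pip distribution names, and the helper name is illustrative.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a pip package, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    for name in ("torch", "xformers"):
        print(name, installed_version(name) or "not installed")
```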
>>
>Kijai
>>
>>106913215
it is true though, that's why he left StabilityAi, he couldn't accept lying this much about such a mid product like SD3 medium
>>
File: ComfyUI_07435_.png (1.69 MB, 1152x1152)
>>106913068
Chroma HD Flash is all you need to speed it up. Though ideally nunchaku Chroma would be out by now, any day now...
>>
Is Chroma really better than Flux?
>>
>>106913268
>that's why he left StabilityAi
he left because robin left. grift chink scooped him up to slap new chains on him now he shills api nodes and shitty hardware
>>
lightly technical question:
I'm using comfy, and almost every lora I've downloaded has the trigger words baked into the file, so that the Lora Info node can read them and let me just copy them into the prompt
however, one lora I've downloaded hasn't done this, and the output from the Lora Info node is completely blank, presumably because the author is a retard
does anyone know of a node that can save notes attached to a lora, even when that lora isn't loaded, or failing that can I edit the lora itself so that the trigger words (which I can get from the civitai page) show up in the Lora Info node?
and before you ask, no I can't get comfy to just pull the info from civitai itself, I just get 500 errors
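
On the "edit the lora itself" option: a .safetensors file starts with an 8-byte little-endian header length followed by a JSON header, and free-form string metadata lives under the `__metadata__` key. The sketch below copies a file with extra metadata merged in. Which key the Lora Info node actually reads is not known here, so `trigger_words` in the usage note is a hypothetical key name; running `read_header` on a LoRA that does show its trigger words should reveal the real one.

```python
import json
import struct


def read_header(path: str) -> dict:
    """Return the JSON header of a .safetensors file."""
    with open(path, "rb") as f:
        (size,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(size))


def add_metadata(src: str, dst: str, extra: dict) -> None:
    """Copy src to dst with extra string entries merged into __metadata__."""
    with open(src, "rb") as f:
        (size,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(size))
        tensor_bytes = f.read()  # tensor offsets are relative to this buffer
    meta = header.get("__metadata__") or {}
    meta.update({str(k): str(v) for k, v in extra.items()})  # values must be strings
    header["__metadata__"] = meta
    raw = json.dumps(header, separators=(",", ":")).encode("utf-8")
    raw += b" " * (-len(raw) % 8)  # pad with spaces to an 8-byte boundary
    with open(dst, "wb") as f:
        f.write(struct.pack("<Q", len(raw)))
        f.write(raw)
        f.write(tensor_bytes)
```

Usage would be something like `add_metadata("in.safetensors", "out.safetensors", {"trigger_words": "foo, bar"})`. It writes to a new file rather than editing in place, so the original stays intact if the node still ignores the result. The whole file is read into memory, which is fine for typical LoRA sizes.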
>>
can you clean vram on forge?
>>
So I'm going to try image gen which I stopped shortly after the release of ComfyUI. Is there a go-to UI for retards where I can get started fairly easily and dive into details later?
>>
>>106913301
ComfyUI
>>
File: cc.png (20 KB, 325x261)
>>106913068
>>
>>106913310
I don't get
>>
>>106913317
open your manager and look for it retard
>>
>>106913310
post workflow
>>
>>106913310
post skin color
>>
File: 1735984996884521.mp4 (840 KB, 992x560)
comfy nooooooooooo
>>
>>106913317
you just skip the first 30% of the gen and get straight to the good stuff, ez
>>
File: power lora loader.jpg (245 KB, 2081x723)
>>106913282
Don't know what you're talking about. Try power lora loader and check what tokens were trained.
>>
>>106913341
No, it's the opposite. It doesn't skip the first 30% because they are most important. And the rest it skips every 1 step.
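
The pattern described above can be sketched as a list of which sampling steps are actually run through the model. This is only an illustration of the description in this post: the node in the screenshot is not identified, and both the 30% warmup and the reading of "skips every 1 step" as "compute every other step" are assumptions.

```python
def computed_steps(total_steps: int, warmup_fraction: float = 0.3) -> list[int]:
    """Steps run through the model; the others would reuse a cached result."""
    warmup = int(total_steps * warmup_fraction + 0.5)  # round half up
    steps = list(range(warmup))                   # never skip the early steps
    steps += list(range(warmup, total_steps, 2))  # then compute every other step
    return steps


if __name__ == "__main__":
    print(computed_steps(20))
```

For a 20-step gen this computes 13 of the 20 steps under these assumptions, which is roughly where the speedup would come from.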
>>
>>106913340
lmao
>>
>>106912366
That looks slopped and nowhere close to what that other anon did (the anon who refused to share his catbox)
But I get your intention was just to make a big titty 'schoolgirl'
>>
is there a single reason to use pony?
>>
File: 1744041405171694.mp4 (738 KB, 640x640)
the blue hair anime girl gives the red hair anime girl a wrapped gift.
>>
>>106913329
https://files.catbox.moe/9e1twx.png
>>
>>106913386
Horse too expensive, mule too stubborn
>>
>>106913395
sankyu, neuro enjoyer
>>
>>106913342
I don't need the trained tokens, and I can already get those with pythongs lora loader anyway
>>
>>106913138
wow, he used it for 5 mins then just uses his 5090
>>
>>106913390
What is the point of spamming garbage from this scene every other thread? Those "tests" lost their novelty and are not interesting at all.
Are you the same guy who used to spam Miku edits with Kontext and Qwen Edit?
>>
File: ComfyUI_07440_.png (1.56 MB, 1152x1152)
>>106913355
I'm not prompting for that though. It's clearly not impossible with a model as good as Chroma, so not sure why you're doubting him.
>>
>>106913432
If you ran it for 30 steps, it would look way better. Or used base + flash lora
>>
>>106913428
it's just a test nogenner
>>
>>106913432
Anon, I have been telling you since the last thread: your gens are not as good as you think. Pay attention to the walls, skies etc, there are noticeable artifacts, and the subjects have a weird smudge in their skin
>>
File: ComfyUI_07380_.png (2.34 MB, 1152x1152)
>>106913438
All my images are only 8-9 steps with Heun/beta, CFG 1. I'm on Chroma HD Flash so no need for that many steps and it pretty much one shots it most of the time. I know many of my prompts would take a bunch of tries if not impossible on the full version (HD Flash is closer to convergence).
>>
>>106913427
worth it
>>
>>106913466
>8-9
That's why it looks so ass retard
>>
post a single gen from chroma that looks good
>>
File: ComfyUI_06106_.png (1.8 MB, 1328x1328)
>>106912987
his name is literally "ComfyAnonymous"?
wat
>>
>>106913487
I guess he doesn't want to say his real name
>>
File: gamer.jpg (19 KB, 258x538)
>>
File: 1752887687239170.mp4 (903 KB, 992x560)
comfy yes
>>
>>106913446
A tranime character giving a gift to another, wow, so interesting and original
What is the purpose of sharing those uninteresting tests every thread?
It was interesting early on (new model releases with significant differences), now they are just a waste of space
>>
File: ComfyUI_06037_.png (1.25 MB, 1096x952)
>>106913493
hm okay
>>
>>106910887
Yo, What's the minimum hardware requirement to start with this shit?
>>
>>106913507
the guy is clearly autistic and slow in the head and has no idea how to read the room. go easy on him. in real life you also dont walk up to any mentally ill retard and ask him why hes doing the retarded shit that hes doing. hes retarded. convincing him to act like a normal person is going to be impossible, like trying to get a downie to play a normal human. doesnt work.
so chill out, just ignore the guy if you dont like his posts.
>>
>>106913522
All of it
>>
>>106913522
what shit, retard? requirements vary greatly based on what model you want to use
>>
when is the ani shota collection suppose to drop bros
>>
>>106913522
>Yo
ey YO yiki yoYO that was a pretty broad question yokoYO
>>
>>106913510
lmao, this is amazing
>>
>>106911368
Use that but add a first step in high without the lora.
>>
>>106913503
Kek
>>
File: ComfyUI_07450_.png (1.75 MB, 1152x1152)
>>
>>106913503
ahahah, it's been a while I haven't seen such kino in this place
>>
>>106913503
make ani hit him with a shovel
>>
File: 00009-1095993197.png (659 KB, 512x640)
>>
Lmao I've never head this kind of failgen
>>
File: 1747602393232666.png (1.04 MB, 1368x760)
love qwen edit
>>
File: AniStudio-00650.png (1.26 MB, 1184x864)
>>106912987
Thats why I use AniStudio
>>
>>106913780
that mspaint lora is putting in work!
>>
https://www.youtube.com/watch?v=qGe_fq68x-Q
Westsisters? Our response?
>>
>>106913804
no CUDA
>>
>>106913804
>no fan
it's obviously shit. didn't click
>>
>>106913818
is this the /g/ version of
>no tail
>>
>>106913786
mspaint is local, and my hand movements while drawing are my tags
>>
File: 1745925645290088.mp4 (799 KB, 720x720)
>>106912638
>>
File: 1730568284113520.gif (1.71 MB, 452x332)
>wan i2v models and wf
>forgot to connect load image node to WanFirstLastFrameToVideo node so no image provided
>still get a scene of what I prompted like t2v
>>
File: 1751358666848077.png (3.52 MB, 1728x1344)
>>
File: 00011-519273879.png (407 KB, 640x512)
>>
>>106913780
When are we getting a good AI tool not designed by neckbeards and trannies?
It seems all non-autistic dudes that write good software go on the SaaS route

Hopefully the llm vibe coding culture lowers the barrier to write sane software
>>
File: 1730488747672025.png (1.03 MB, 1360x768)
>>
>>106913804
LPDDR4x. It's probably slower than a DGX Spark at full compatibility
>>
>>106913925
do people "vibe code" with chatgpt or do they use open source llms with lm studio and so on?

I imagine an unrestricted model would be better than censorshipAI
>>
File: 1750333162678291.png (1.03 MB, 1360x768)
>>106913928
I will note that qwen edit is now better with text; before, kontext was better on that front. now qwen is better on all fronts.
>>
>>106913933
local LLMs are a joke and are only useful for degenerate purposes (ERP and write erotica), but don't let anyone know that
Even the high end open weights LLMs (the ones that no one can run on local hardware unless you own an enterprise grade cluster at home) underperform vs the proprietary API-only ones
>>
>>106913951
so for making an app/game let's say, grok or chatgpt? I figure grok might be better cause OpenAI loves censoring shit
>>
File: 1748761299990051.mp4 (3.1 MB, 720x960)
>>106911457
>>
>>106913951
>local LLMs are a joke and are only useful for degenerate purposes (ERP and write erotica), but don't let anyone know that
Painfully dumb take, hurts to read it. GJ anon.
>>
File: ComfyUI_07478_.png (1.73 MB, 1152x1152)
>>
>>106913948
kontext is better at safety
>>
>>106913958
"censorship" doesn't matter that much when it comes to write software (which those "censored" LLMs do well), it's just a retarded narrative parroted in AI circles where people only use AI to jerk off
>>
>>106913958
Unfortunately Claude is really good, but it's expensive enough and the company is horrible enough that you aren't going to want to pay them unless you're really desperate. I found Deepseek pretty serviceable, but sometimes it gets really dumb. I've been hearing great things about GLM 4.6 but I haven't used it for coding yet. Supposedly the new Gemini blows everything out of the water but I don't think it is actually out yet.
>>
File: 1738561864786039.png (1.02 MB, 1360x768)
yeah. the new qwen edit is much better at text.

you heard her!
>>
What's the best method to run ComfyUI?
- Desktop?
- Portable?
- Stability Matrix?
>>
File: fray.jpg (581 KB, 1799x1368)
>>
>>106914027
in a venv on a dedicated linux machine
>>
>>106913310
Enjoy your grain
>>106913522
You can run SD1.5 on a raspberry pi
>>
very inorganic
>>
>>106914027
Portable. I've heard desktop has some issues.
>>
>>106912019
What are your times like? I've got a 4080 FE and a 6-second clip takes me a solid 9 minutes at 480p using the fast rentry setup for T2V. I2V is much faster though.
>>
File: ComfyUI_07481_.png (1.79 MB, 1152x1152)
>>106913457
Not me you were talking to, I'm purposely prompting for grainy images and I can swap the style out if I want to.
>>
>>106914066
with lightx2v loras and 4 steps (2/2) it's like 100-120 seconds with interpolation

don't do wan without the loras or it takes forever. quality can still be very good with them. before the loras, gens would take like 10 to 15 min.
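
a rough sketch of what the "4 steps (2/2)" split means, assuming it is read as: the high-noise model runs the first 2 steps and the low-noise model finishes the last 2. `split_steps` is a hypothetical helper written only for illustration, not part of ComfyUI or any workflow; the (start, end) pairs are meant to mirror the start/end step values you would set on two chained sampler passes.

```python
def split_steps(total_steps: int, high_steps: int):
    """Return (start, end) step ranges for the high-noise and low-noise passes.

    Hypothetical helper: assumes the high-noise model takes the first
    `high_steps` steps and the low-noise model takes the remainder.
    """
    if not 0 < high_steps < total_steps:
        raise ValueError("high_steps must be between 1 and total_steps - 1")
    high = (0, high_steps)           # first pass: high-noise model
    low = (high_steps, total_steps)  # second pass: low-noise model
    return high, low


# the 4-step, 2/2 split from the post above
high, low = split_steps(total_steps=4, high_steps=2)
print(high, low)  # (0, 2) (2, 4)
```

the second pass has to start at the step where the first one stopped, otherwise the two passes overlap or leave a gap in the schedule.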
>>
>>106914103
also, use the wan 2.2 i2v template workflow in comfy, it works well and has the lora setup as well I believe.
>>
Retard here, been mesmerized by this for a while. Love wardrobe malfunction, and qt nihons. I want to wallow in depression.

What do I need to gen something like this and can we actually get nipples and genitals?
>>
>>106913139
bro you aren't even close to fitting deepseek on that
>>
>>106914116
These models are not yet in a state where they can do a specific fetish out of the box, even if you prompt really hard for it; they rely heavily on LoRAs
So unless you train the LoRA yourself or someone already did, you shouldn't expect to get what you want easily
>>
>>106914181
You could fit several ollama deepseek R1s.
>>
>>106914215
you could fit several of my nuts in your mouth, but that does not mean they are the real deepseek
>>
>>106914243
The Ollama devs singlehandedly wrote the code that made local LLMs possible so I'm going to trust them over you on this one.

I know, I'm laying it on too thick.
>>
>>106914296
>The Ollama devs singlehandedly wrote the code that made local LLMs possible
You mean the grifters who just forked llama.cpp, never gave it credit, and made a flashy normie-baiting product (which is essentially only a wrapper/interface with a repo) out of it?
>>
>>106914296
the experts of ldg always manage to surprise me so im never sure
>>
>>106914314
Yeah, but now they're working on rewriting the codebase after the fact so they can take it private. Then nobody can say they don't deserve it!
>>
Any recommendations for music generation?
>>
>>106914340
they will fade into obscurity when LlamaBarn gets released
>>
>>106914341
Honestly just stick to SaaS for that (Suno, Udio). All open-source musicgen models are shit
>>
>>106914341
YuE afaik

>>106914391
why don't you let anon decide that for themself
>>
>>106914443
>YuE afaik
Yeah... If you can afford waiting 10~20 minutes to get so-so outputs that skip entire verses, sure
>>
>>106914341
udio is so far ahead local it would be like having sd1.4 local compared to current novelai on the saas front
>>
>>106914341
songbloom, but it's not that great

>>106914027
i'd use stabilitymatrix or portable
>>
File: file.png (349 KB, 472x478)
>>
>>106914479
I agree. We'll catch up eventually, there's no way Chinks are sitting on this goldmine that is AI music gen. Someone has to be cooking something good.
>>
>>106914116
>>
File: ComfyUI_temp_pfsdp_00008_.png (2.55 MB, 1192x1648)
>>
I just want a closed source competitor to ComfyUI
>>
>>106914641
Not bad
>>
>>106914661
Comfy?
>>
Damn, this mmaudio nsfw finetune actually works.
(nsfw) https://files.catbox.moe/nahift.mp4
https://huggingface.co/phazei/NSFW_MMaudio
>>
>>106914462
YuE is still best for composition.

https://map-yue.github.io/music/%E5%AE%8C%E7%92%A7%E3%81%AA%E9%96%A2%E4%BF%82.mp3

I have posted this many times before, feel free to read the lyrics here under "English + Japanese + Korean Code Switching Kpop"
https://map-yue.github.io/

It pretty much nails every language.

It's behind closed source but this is a bad Udio output:
https://www.udio.com/songs/79crys6WpDoA1FQUswzuWK

No, ACE-Step cannot do anything like this. All ACE-Step seems to be good at is some Chinese rap music. Regardless of how good its sound quality may be compared to YuE, its composition is not at the same level. There is a massive difference in the quality of the datasets used to train the two models. The same can be said about Songbloom (even more so, and that model can't even be prompted without a sample). If you want to compare models genning from samples, look at this:
https://x.com/cocktailpeanut/status/1886456240156348674

Can Songbloom do this level of quality with the instruments? No.
>>
File: 00125-1373436559.png (2.87 MB, 1248x1824)
>>
>>106914674
Neat. I can't believe I'm saying this.. does MMAudio have comfyUI integration? Since Wan22 is already there, might as well add MMAudio.
>>
>>106914701
Retard
>>
>>106914710
https://github.com/kijai/ComfyUI-MMAudio/tree/main
>>
>>106914674
How long does this take to run inference? Being 2B it should be fast but doesn't hurt to ask.
Maybe I should give it a try.
>>
>>106914720

Kijaigod... I kneel.
>>
>>106914726
like ten seconds
>>
>>106914701
>It's behind closed source but this is a bad Udio output

And I meant to say that the YuE output is very similar in quality to the bad Udio output (slightly worse in terms of sound quality, but still).
>>
can wanimate be run without speed loras with the wrapper and KJ scaled model, or do i need the native workflow and model?
for some reason, when i try running without the speed lora even with 40 steps, it still comes out looking slopped compared to using speed lora and I can't figure out why
>>
>>
File: ComfyUI_00349_.png (2.24 MB, 1024x1472)
>>
>>106914674
Can it do sfx too?
Like bed creaking, etc?
>>
File: dmmg_0002.png (1.47 MB, 896x1152)
>>106912963
this ain't bad for a 1hr train
>>
which video upscaling model would you recommend that i could run locally? only need a 2x upscale at most
>>
>>106912963
>>106914864
Is it in any trainers?
>>
>>106914031
good gen
>>
File: 00069-639778528.png (1.45 MB, 1344x768)
>>
File: dmmg_0016.png (1.55 MB, 896x1152)
>>106914883
i don't think lora+ is, but as for the method from TheBen, it's available in any trainer that lets you specify layers. the lora is indeed 9MB
>>
File: ComfyUI_03053_.png (1.2 MB, 912x1160)
>>
File: dmmg_0020.png (1.5 MB, 896x1152)
bottom heavy lora, 10 images from (/s/thread/22290821)
>>
File: dmmg_0021.png (1.5 MB, 896x1152)
homie got weird titty but she kinda fine
>>
>>106915006
what model, tho?
>>
>>106914998
Upload please.
>>
>>106915006
now that's a proper ass
>>
File: dmmg_0024.png (1.42 MB, 896x1152)
>>106915008
flux, going to run these with chroma soon to compare
https://files.catbox.moe/02kzlc.png
>>
>>106915035
ty, king <3
>>
>>106914674
I should mention, videos have to be 24 fps at least.
>>
>>106915035
Which Flux? Krea is best.
>>
flux can't do nsfw without a billion loras though
>>
bak?
>>
bak to the past
>>
new
>>106915102
>>106915102
>>106915102
>>106915102
>>
>>106912987
why the long face at 0:13? is it because he's shilling a piece of shit?
>>
>>106913057
i don't know why i find these psycho asian chicks so hot.. never imagined i would
>>
>>106913990
>and the company is horrible enough
qrd?


