/g/ - Technology


Thread archived.
You cannot reply anymore.




Yume Edition

Discussion of Free and Open Source Text-to-Image/Video Models and UI

Prev: >>106688541

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/sd-scripts/tree/sd3
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts
https://github.com/tdrussell/diffusion-pipe

>WanX
https://comfyanonymous.github.io/ComfyUI_examples/wan22/
https://github.com/Wan-Video

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
Training: https://rentry.org/mvu52t46

>Neta Lumina
https://huggingface.co/neta-art/Neta-Lumina
https://civitai.com/models/1790792?modelVersionId=2203741
https://neta-lumina-style.tz03.xyz/

>Illustrious
1girl and Beyond: https://rentry.org/comfyui_guide_1girl
Tag Explorer: https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbours
>>>/aco/csdg
>>>/b/degen
>>>/b/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
>>
File: API NOOOOODES.png (932 KB, 896x1438)
Anyone try out the new Wan 2.5 yet? Made possible by ComfyUI
>>
>>106691543
kek
>>
Bro what is this pr
>>
I know people don't unironically talk about the API nodes here, but does anyone actually know how they work? Like could I pass in all the same conditioning I pass into VACE/wan-animate into 2.5? Or is it just very specific types of inputs?
>>
>>106691532
>"Gigantabooba!"

https://files.catbox.moe/724riu.webm

This isn't mine, just an AI video I saw on JoyReactor the other day.
>>
>>106691562
Tranime retards are called that for a reason
>>
File: RA_NBCM_00021.jpg (888 KB, 1872x2736)
>>
>>106691594
That's gud.
>>
>>106691594
Prompt or artist? Badass.
>>
>>106691294
i'm not using stable diffusion to make the videos retarded nogen

>>106691519
>What do you use for negative prompt in wan2.2 negative prompts?
tattoo, tattoos
chink pasta makes the quality worse imo
it doesn't help very much even with NAG
>>
>nunchaku r128
Are the leftover pixels caused by the non-standard aspect ratio or the original image being clear png? Should I remove the alpha first before editing?
>>
>>106691602
without the leftover pixels the leggings would lose the 3d look
>>
>>106691602
>nunchaku r128
Not sure what it is but your image looks like a failure to premultiply alpha, what exactly are you trying to do?
>>
>>106691610
The jagged pixels are along the entire edge of the pic tho.
>>
Can I use a 'high' lora on both the high and low lora slots? I'm worried that this breast bounce lora will only apply to the high denoiser and then when details are added in low, the breasts will lose all the jiggle detail.
>>
File: f1ir69jfr4ce1.png (443 KB, 608x1752)
>>106691623
Just testing replacing clothes. The orig had clear background so wondering if I should just turn it white first.
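For what it's worth, flattening the transparency first is the right instinct: edge artifacts often come from stale RGB values hiding under zero-alpha pixels. A minimal stdlib-only sketch of the straight-alpha compositing (background color and the nested-list pixel layout are assumptions for illustration, nothing nunchaku-specific):

```python
def flatten_alpha(pixels, bg=(255, 255, 255)):
    """Composite rows of straight-alpha (r, g, b, a) tuples onto a solid
    background, returning RGB rows. Compositing first discards whatever
    RGB garbage sits under fully transparent pixels."""
    out = []
    for row in pixels:
        new_row = []
        for r, g, b, a in row:
            t = a / 255.0
            new_row.append(tuple(round(c * t + bc * (1 - t))
                                 for c, bc in zip((r, g, b), bg)))
        out.append(new_row)
    return out
```

With Pillow the equivalent is pasting the RGBA image onto a white `Image.new("RGB", size)` using its own alpha band as the paste mask, then feeding that flattened copy to the edit model.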
>>
>>106691600
nomura tetsuya was the artist.
>>
>>106691543
Quite ironic but that's Purple Witch if you know what I mean.
>>
>>106691669
Nobody cares about your ban evading tranny debo, he's such a thin skin little bitch he seethes all night when EU makes fun of him.
>>
File: 1559764475058.jpg (171 KB, 600x437)
>raise num_blocks_gpu to 10
>still no increase in vram use
>raise to 12
>oom
This nunchaku shit is fucking retarded. 60% of my card is doing nothing.
>>
Why even bother? Local will forever be stuck where it is now till the end of time. What's the point of doing anything knowing it will never get better.
>>
File: 1750961231306354.png (760 KB, 1360x768)
>>
Let's say you are making a finetune checkpoint.
And the checkpoint isn't something that can be trivially a lora instead. (I dunno say beating photorealism into an anime model, feel free to give a better example)
How many images would you need for this task? How many total steps with Prodigy cosine would you expect to be necessary?
Does anyone have a ballpark number?
>>
>>
File: 1728245372383282.png (438 KB, 1024x1024)
>>
Windows 10 is in its death throes. I saved so much slop onto my Desktop and Downloads folder I think I completely broke Explorer because it can't show thumbnails anymore and if I try to do anything like sort by date it hangs up until I have to restart explorer in Task Manager.

I will never, ever make a Microsoft account
>>
File: 00052-338067704.png (1.19 MB, 1192x736)
>>
>>106691839
wangblowz breaks the os when its (((update))) time
you can clear/replace all thumbnails under disk cleanup but it will only marginally help
>>
File: bob.jpg (1.66 MB, 3072x1280)
can I post anime in here?
>>
>>106691860
Of course
>>
>>106691860
As long as it's original and inspiring.
>>
>>106691848
Tried that, it didn't work. I heard Fedora KDE is more lightweight. Win10 isn't built to handle massive outputs of local AI slop
>>
Trying out the new Qwen Image Edit; I've changed my mind from my initial testing on the online demo, it's much better than the old QIE overall, but it still needs to be pixelspace and hopefully MoE at some point too
>>
>>106691821
>(I dunno say beating photorealism into an anime model, feel free to give a better example)
This is easier than you think
>>
>>106691877
either your drive is failing or you have an obscene amount of files in a single folder
>>
File: thanksking.jpg (824 KB, 896x1152)
>>106691866
>>106691869
Ok, thank you
>>
>>106691901
what is that, nyte tyde? you cranked the shit out of it
>>
File: extreme.png (384 KB, 3652x1866)
>>106691895
>you have an obscene amount of files in a single folder

son you have no idea
>>
comfyui is a mass slop production factory
>>
File: king.jpg (650 KB, 832x1216)
>>106691913
huaishen
>>
>>106691916
>son you have no idea
i do, file explorer starts slowing down in the 10-50 thousand range

just segment them
>>
File: ComfyUI_02161_.png (1.71 MB, 1024x1024)
>>106691893
Well yeah but that was just an example, not a great one admittedly.
I was actually thinking of something like unfucking chroma's broken anatomy and other faults. (Pic not related)
I made a few loras in the past but never finetuned a model.
If it was easy someone probably would have done by now so I feel like I am missing something important.
>>
File: ComfyUI_03294_.mp4 (1.03 MB, 1024x1024)
>>106691827
>>
File: 1758487527002466.jpg (160 KB, 681x681)
>>106691932
>flux
tch
>>
>Look in /h/ thread
>Ask simple question
>Also has a dedicated schizos
>They all seem to talk about SaaS and waste the thread with stupid one liners
This dude is going cross country on everyone that ever slighted him
>>
>>106691839
>retards acting as if your computer magically stops working
>>
crazy how the official QIE workflows still dont autocrop all dimensions so they are divisible by 112 to fix a lot of the quality loss lmao
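The autocrop being described is just snapping both dimensions down to the nearest multiple and centering the crop box; a sketch, with the 112 figure taken from the post's claim rather than any official ComfyUI source:

```python
def crop_to_multiple(width: int, height: int, multiple: int = 112):
    """Center-crop box (left, top, right, bottom) with both sides
    snapped down to the nearest multiple. Assumes the image is at
    least `multiple` pixels in each dimension."""
    new_w = (width // multiple) * multiple
    new_h = (height // multiple) * multiple
    left = (width - new_w) // 2
    top = (height - new_h) // 2
    return (left, top, left + new_w, top + new_h)
```

With Pillow that would be `img.crop(crop_to_multiple(*img.size))` before handing the image to the workflow.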
>>
>>106691821
i think noob tuned illustrious on something like 10m images so...
>>
>>106691969
Has illustrious caught up yet?
The dog fucker booru really pushed it above and beyond
>>
>>106691971
>Has illustrious caught up yet?
There was new version recently with updated dataset.
>>
>>106691971
Nope
>>
>>106691969
>noob tuned illustrious on something like 10m images so...
holy fuck, what are the best noobs for anime and realism then?
>>
>>106691945
i like how it did the eyes
>>
https://github.com/lodestone-rock/RamTorch
Can I vibecode this into a comfy node or does it need to be installed on a python level?
>>
File: 00055-2984449425.png (1.21 MB, 1192x736)
>>
>>106691984
You might as well train your own lora for realism. For anime, base has always been king.
>>
>>106691886
Wtf, QIE 2509 is better than nano banana on the single-image editing prompts I previously tried with nano banana in my limited testing; this shit is very good. Are there any prompts people tried and failed at that I can try with the new QIE?

I think people who complain are probably the usual ones that use fp8 scaled or even worse Q2-6 quants instead of Q8 and fp16 8 step v2 lightning lora
>>
>>106691959
It's actual mental illness. Like I don't like the SaaS stuff, but going around shitting up every local image diffusion thread with it is not the right play. He's basically turned the site into his toilet.
>>
File: ComfyUI_02172_.png (1.05 MB, 832x1216)
>>106691969
Hopefully that many images wouldn't be necessary
>>
>>106692011
Yeah It's been pretty good in my testing. Just don't expect to throw all the parts into a cauldron and tell it what to spit out. You still need to take things in logical baby steps.
>>
https://x.com/bdsqlsz/status/1971022216675590380
>>
>>106692032
I'm not even gonna open it. That's the liar and fraud that told us wan 2.5 would be open source.
>>
Double exposure being used as a test now thanks to based Pixart being the only model that could do it during its time.
>>
why does chroma take so fucking long to gen a single image when illustrious takes like 5 seconds. i dont understand
>>
File: elf-hugger_00617_.png (2.89 MB, 1088x1920)
>>106691839
just use CachyOS like someone who knows how to go on the computer
>>
>>106692050
one is SDXL the other is dogshit
>>
>>106692058
i think chroma's nice, but it takes too fucking long
>inb4 buy a 5090 and 128gbs of ram lolololol
>>
>>106691994
>Your question is nonsensical, you need to install both the pip library and import the module into code
>gguf nodes already do something like this
I am actually lowkey curious how well this performs though, maybe it can be useful for LORA training on large models for us VRAMlets
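A naive sketch of the weight-streaming idea RamTorch implements (this is not RamTorch's actual API, just the concept expressed with plain PyTorch hooks; real implementations use pinned memory and overlap the copies with compute on a separate CUDA stream):

```python
import torch
import torch.nn as nn

def _load(m, _inputs, device="cpu"):
    m.to(device)   # pre-hooks must return None, or the return replaces the input

def _unload(m, _inputs, _output):
    m.to("cpu")    # forward hooks must return None, or the return replaces the output

def stream_from_cpu(model: nn.Module, device="cpu"):
    """Keep every top-level block's weights in system RAM and copy them
    to `device` only for the duration of that block's forward pass."""
    for block in model.children():
        block.to("cpu")
        block.register_forward_pre_hook(
            lambda m, inp, device=device: _load(m, inp, device))
        block.register_forward_hook(_unload)
    return model
```

On a real model you would pass `device="cuda"`; the default keeps everything on CPU so the sketch runs anywhere. The cost is one host-to-device copy per block per step, which is why synchronous versions of this are slow.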
>>
>>106692050
why does sdxl take so fucking long to gen a single image when sd 1.4 takes like 0.5 seconds. i dont understand
>>
>>106692068
Flux and Qwen are faster, and better, I can totally use the with my good old 3080
>>
>>106692076
illu takes like 5 seconds, chroma takes like 1-2 minutes. its ridiculous
>>
File: file.png (1.77 MB, 768x1280)
>>106691821
What's your goal? If it's something pretty specific like a single artist's style then you don't need that much. I made a noobai finetune with about 800 images of a specific artist's style and it did a very good job.
>>
File: 1745460361875049.png (108 KB, 1456x614)
>>106692032
>>106692044
there's a chance it'll be open source
>>
>>106692050
>why does the model that is 3.5 times the size of the other work slower than the other?
>>
>>106692072
The upside is supposed to be fewer slowdowns thanks to heavy offloads to ram, unlike current solutions.
>>
>>106692099
Sorry I don't believe anyone after the horrific backstabbing we all just received this week.
>>
>>106692092
illust doesnt take 5 seconds if you actually do a proper second pass and dont use some cope quants/faster loras
chroma doesnt need a second pass and it still can create more details than illust which can only do tranime

if you want realism, then you need chroma, otherwise stay on illust/noob, simple as
>>
>>106692097
Better anatomy and composition >>106691941
I am curious if it can be done without spending a fortune.
>>
>>106692090
Flux is faster since it's distilled, Qwen is NOT faster
>>
>>106692113
Hmm can't comment on that, seems like it would take a lot more data and you would need a pristine dataset.
>>
File: ComfyUI_01308_.png (1.16 MB, 1024x1024)
>>
>>106692112
>chroma doesnt need a second pass
It absolutely does. It does wonders.
>>
>>106692126
post workflow
>>
What workflow are you guys using for Wan 2.2? I've been shopping around for one that uses lightning and supports an input lora, that will safely run on 24gb VRAM (when I've tried adding a lora to the one that otherwise worked, I get an allocation error)
>>
i shan't post mine workfloweth
>>
>>106692145
concession accepted
>>
>>106692128
https://files.catbox.moe/jgqobf.png
Normally I'd set the denoise ~0.65. If you go higher it starts changing/improving/worsening features.
>>
>>106692155
accept this

*unzips lora of my penis*
>>
>>106692138
You can replace the boards 4chan org with desuarchive in your URL bar, and then go to the previous thread and search for "json" to find my workflow. It's t2v but can easily be turned into i2v by switching the model and loras to the i2v versions and changing the empty hunyuan video latent to an image encode node if you want

With 24gb of vram you can probably do 720p in under 5 minutes per video since it'll all be inside the gpu at Q8_0
>>
>>106692138
When I was a VRAMlet I used the UnetLoaderGGUFMultiGPU node to load the model, which allows you to specify some amount of the model to offload to system memory (at the cost of speed), if you're not using that node already then you just need to switch to that and then play with the number until you no longer oom. Unfortunately I'm no longer a VRAMlet and I just load the models in full so I can't just send you my workflow.
>>
>>106692097
>800 images
Tips on scraping? Seems like most sites make it near impossible.
>>
>>106692223
Well in my case it was part of the game's files but if you're scraping danbooru just use

https://github.com/Bionus/imgbrd-grabber

It has support for other sites but I think a lot of them will require an account or you'll get heavily throttled.
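If you'd rather script it than use a GUI grabber, danbooru exposes a public JSON endpoint; a stdlib-only sketch (endpoint and field names per the public API docs — anonymous access is rate-limited and capped to two tags per query, so treat the limits as assumptions that may change):

```python
import json
import urllib.parse
import urllib.request

API = "https://danbooru.donmai.us/posts.json"  # public JSON endpoint

def page_url(tags: str, page: int, limit: int = 200) -> str:
    """Query URL for one page of posts; 200 is the usual per-page maximum."""
    query = urllib.parse.urlencode({"tags": tags, "page": page, "limit": limit})
    return f"{API}?{query}"

def fetch_page(tags: str, page: int) -> list:
    """Fetch one page of post metadata as a list of dicts."""
    with urllib.request.urlopen(page_url(tags, page)) as resp:
        return json.loads(resp.read())

# usage sketch: walk pages until one comes back empty, then download
# each post's "file_url" (hypothetical tag shown):
# for post in fetch_page("some_artist rating:general", 1):
#     print(post.get("file_url"))
```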
>>
>>106692109
What happened?
>>
>>106692239
Wan 2.5 API.
>>
File: AniStudio_00172.png (1.59 MB, 1326x1152)
>>
File: output.webm (3.88 MB, 720x1280)
>>106692097
>>
>>106692274
Interesting style. I would ask for a catbox but it looks like it was made with a meme ui
>>
>>106692286
Why would you need a catbox to replicate a prompt?
>>
>>106692274
Looks like absolute shit
What does Ani even offer anymore?
He used to larp saying his animation work was saving the space and we see how that failed, now he larped saying that he's important for his vibe code UI and we see how that went, his gens are shit and he's a drunken sperg that advertises every fucking day
>>
>>106692231
Nice. Thanks anon.
>>
>>106692286
>>106692325
for multiple reasons one can deduce that image as not being created with anistudio. you were duped.
>>
>>106692336
I simply asked a question about our persistent shameless drunk slob shill that will never touch the power he so desperately craves because everyone and their mother can tell he would abuse it and burn everything to the ground.
>>
for qwen edit v2, how do you reference the second or third image? image2? I know you can describe it but can you reference the node? (ie: what if you have two girls in both nodes)
>>
File: 1753439079261214.png (955 KB, 1360x768)
>>106692366
k, image2 did work

a computer with image2 on the a CRT monitor is in front of the man with the black pistol.
>>
>>106692366
I use the numbering on the image hooks.
>>
>>106692379
*on the
>>
>>106692099
they are liars. they will never come back for free. hailuo is now 100% paid. they literally removed all free daily points. only new accounts still have limited points. chinks never change.
>>
File: 1731276279433061.png (912 KB, 928x1120)
the man in sunglasses is holding a white CRT monitor with image2 on the screen. keep his pose and expression the same.

this model is so fun and versatile. this + wan, and noob/illustrious for anime gens, can do basically anything.
>>
>>106692403
This. Never underestimate how ruthless they are once they believe they have the upper hand. It's like a switch is flipped. I'm not commenting on the morality of that behavior, but pretending it doesn't exist is foolish.
>>
does comfyui glow when local?
>>
>>106692336
What do you mean?
>>
File: 1740570398337605.png (1.13 MB, 1024x1024)
Change the text "Deus Ex" to "LDG General". The man in sunglasses is sitting at a computer typing, a white CRT monitor is on his desk. He is wearing a black trenchcoat. (qwen edit v2)
>>
Qwen edit V1 was pretty good. V2 is even better. it's not quite nano bana at home for a multitude of reasons, but it can do what Nanobana does with some creativity.
>>
>>106691959
I'm offended you asked /h/ instead of here
>>
Ahem, this was the last thread, any objections /ldg/?
>>
yes, /ldg/ is a saas shillware thread, we know this. now remove comfyui from the OP already
>>
Can someone explain to me what Comfy's "output" vs "denoised output" means on the custom sampler nodes? Because "denoised output" seems to behave identically to the regular ksampler's "return with leftover noise: enabled" option, and I'm struggling to understand what kind of concept could be described as both "denoised" and "returned with leftover noise"
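One plausible reading: "denoised output" is the model's clean-image prediction from the final step, while "output" is the latent after the last scheduled step, which still carries noise whenever the sigma schedule stops above zero — which is exactly why it behaves like "return with leftover noise". A k-diffusion-style Euler step sketch of the relationship (a conceptual sketch, not Comfy's actual node code):

```python
def euler_step(x, denoised, sigma, sigma_next):
    """One k-diffusion style Euler step: `denoised` is the model's
    prediction of the fully clean image at noise level `sigma`, and
    the step moves the noisy latent `x` toward it. If the schedule
    ends at sigma_next > 0, the returned latent still contains that
    much noise."""
    d = (x - denoised) / sigma        # direction pointing at the noise
    return x + d * (sigma_next - sigma)
```

Stepping all the way to `sigma_next = 0` lands exactly on the prediction; stopping early leaves the remaining noise in the "output" while "denoised" stays clean.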
>>
>>106691959
Seems like you are obsessed.
>>
File: 1750571549784792.png (572 KB, 832x1248)
>>
>15-40 minutes between gens thanks to glorious new i2v paradigm

2022 was a special year and the thread will never be that exciting again.
>>
>106692646
>As he cries because his sole function in life is to be a human gnat on multiple boards and threads
>>
>>106692670
It's because everyone is genning themselves fondling their favourite hot woman with i2v which can't be posted
>>
File: 1733568684459837.png (1.1 MB, 832x1248)
the japanese girl in image1 is wearing the outfit of the girl in image2.

covered her boobs but it's a very good sailor outfit swap.
>>
there has never been a clearer demonstration of what "sovl" vs "soulless" is than the Chroma vs LoRA pics in this op collage >>106691532
>>
File: 1744493175971728.png (1.13 MB, 832x1248)
>>106692681
a blouse/skirt image, also worked fine:
>>
File: 1752674379745772.png (1.21 MB, 832x1248)
>>106692692
and of course, it works with anime too. this can do a lot more than the first version.

the japanese girl in image1 is wearing the outfit of the anime girl in image2.

of course, image2 is miku.
>>
>106692686
Swing and a miss
>>
>>106692709
lora is very obviously slopped tf up
>>
Begging once again for someone to run the wan 2.2 template workflow and post the outputs. No matter what I do I cannot get correct outputs from fp8 models but quants work fine. I've used a fresh comfy install, I've upgraded pytorch, downgraded pytorch, nothing works and I am being driven insane
>>
File: 1736013679290088.png (1.28 MB, 912x1144)
asuka looks a bit different...
>>
>>106692754
>and I am being driven insane
a new schizo is about to be born
>>
File: 1729620703552279.png (1.17 MB, 880x1176)
>>
File: combine_00006.mp4 (1.74 MB, 768x1536)
>>106692801
I've been trying to figure out why this is happening for DAYS now
Another example using some chink lora from civit
Example video
fp8_scaled
Q8
>>
>>106692754
>fp8 models
Kijai's and other fp8 variants aren't interchangeable between workflows, at least in my experience (things might have changed since I experimented though).
>>
>>106692754
>cannot get correct outputs from fp8 models but quants work fine
fp8 is a quant, and a cope one at that, Q8 or go home
>>
File: 1751743152644687.png (1.07 MB, 1024x1016)
>>
>ranfaggot
>>
>>106692839
Using the template workflow with the correct models just produces garbage and I don't know why
>>106692840
I do use Q8, but I need to test my shit on fp8 for the copers before uploading them, and I cannot get fp8_scaled models to work
>>
File: 1729792771826961.png (900 KB, 872x1192)
>>
File: 1731897740528927.png (1.21 MB, 856x1216)
and one more test, seems to work very well: just reference the images as image1 or image2.

ie: replace the outfit of the woman in image1 with the outfit of the anime girl in image2. keep her expression the same.
>>
>>106692864
>I do use Q8, but I need to test my shit on fp8 for the copers before uploading them and and I cannot get fp8_scaled models to work
Think you already posted something like this before; as I said, I think it might be some lora training setting that ultimately fucks with inference at lower than full precision.

There was a case with horizontal artifacting happening even on Q8 with some prompts on Chroma without a lora compared to bf16, so it's probably a deeper training issue that can't easily be fixed.

If it really matters to you and you can get similar lora performance with it, try using another trainer to train the loras. Try the lora on different fp8 scale types, e4 e5, try all those fp8 quants on both kj and comfy nodes, and if nothing works then just say that your lora version there only works with Q8 for now.
>>
File: 1739076902939488.png (1.16 MB, 856x1216)
>>106692892
>>
hi anon
is there flux redux equivalent for qwen yet?
>>
>>106692899
That was me. It's not my lora that's the problem, that was just some civitai retard with an unrelated issue. In testing that, though, I found that using fp8 is just completely fucked for me; this >>106692754 is the completely standard comfyui lightx2v template workflow with the listed models. If someone could run it and post their result I'd really appreciate it
>>
File: 1749291325723293.png (965 KB, 744x1392)
>>106692904
>>
>>106692922
Post the exact workflow you used and links to the exact versions of the models and loras you are using and ill do it
>>
File: hunyuan.jpg (155 KB, 800x600)
>>
File: 1757871672835469.png (1.46 MB, 816x1272)
>>106692932
it's pretty cool how qwen edit/kontext can do all this which would take a lot of effort with inpainting or controlnets and other stuff, plus edits with this model treat elements like their own layer. so you can change/remove elements without altering the composition otherwise.
>>
>>106692940
The girl on the right is asian so she still gets some /ldg/ points.
>>
>i'm a nigbophile
>>
File: 1730470629086912.png (1.44 MB, 816x1272)
>>106692948
like, how would you change a plugsuit into a cammy outfit, high denoise + prompt, right? but then how would you get it in perspective, or done properly, even with openpose? and you'd need a mask to get it pixel perfect.

it's a pretty neat tool, inpainting is still useful but this is another option to make stuff with and is very versatile.
>>
i miss schizo anon
>>
>>106692849
nice
>>
>>106692940
I'm honestly getting an urge to train a lora on SD3 'girl lying in grass' output, if nothing else you could upload it as Cronenberg style
>>
>>106692948
what is the container size?
>>
File: 1747687471552506.png (1.06 MB, 1288x808)
>>106692960
replace the outfit of the red hair anime girl in image1 with the outfit of the girl in image2. keep her expression the same, and her red hair the same.

and poof, asuka but dressed as cammy.
>>
File: Combine.mp4 (2.71 MB, 1920x640)
>>106692935
You should already have the workflow, it's in templates but here anyway
What it should be
https://files.catbox.moe/8y5xjs.mp4
fp8
https://files.catbox.moe/tmwx3y.mp4
Q8
https://files.catbox.moe/k4nahp.mp4
fp8 models/textencoder are linked in the workflow. Or find them here https://docs.comfy.org/tutorials/video/wan/wan2_2
Q8 models here
https://huggingface.co/QuantStack/Wan2.2-I2V-A14B-GGUF/tree/main
>>
File: 1734589635568483.png (1.09 MB, 1168x888)
*sip*
>>
>finally can train loras and craft all the goofy shit i want
>only limitation is vram for loading wan frames

Wish it was like animatediff where you can load 1000 frames without issue. Any kind anon with a 32gb card or higher: can you confirm how many frames at, let's say, 512x512 you can load for wan i2v, and does it slop out or hold context? I can only get up to like 190
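For ballparking frame counts: Wan's causal video VAE compresses 4x in time and 8x in space with 16 latent channels, per the commonly cited figures for the 14B models — treat those numbers as assumptions rather than something checked against the weights, and note that attention activations over the latent frames, not the latent itself, are what actually eat VRAM as you add frames:

```python
def wan_latent_shape(frames: int, width: int, height: int,
                     channels: int = 16, t_down: int = 4, s_down: int = 8):
    """Latent shape (C, T, H, W) for a Wan-style causal video VAE.
    The causal VAE keeps the first frame, hence the (frames - 1) term."""
    t = (frames - 1) // t_down + 1
    return (channels, t, height // s_down, width // s_down)

def latent_mib(shape, bytes_per_elem: int = 2) -> float:
    """fp16 size of the latent alone, in MiB."""
    n = 1
    for d in shape:
        n *= d
    return n * bytes_per_elem / 1024 ** 2
```

For example 81 frames at 512x512 is only a ~2.6 MiB latent, which shows the latent is never the bottleneck; the quadratic attention cost over those latent frames is.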
>>
File: 1740818206870636.png (973 KB, 1360x768)
>>
I'll just say it. I think the collages lately have been lacking in the care and effort department.
>>
Serious question. How did comfyui get to the point that I’m getting fucking Nvidia emails advertising it? Anyone else remember when the comfy dev himself was in these generals back during the very first threads shilling comfy so much that it almost got banned from the OP?

From shilling comfy on 4chan to partnering with Nvidia is insane. How did he do it?
>>
>>106693163
By never engaging the schizos and going for the throat of big businesses.
>>
>>106693080
>'It's hip to be /ldg/'
>>
>>106693163
start emailing Nvidia and tell them comfyui is shit and anistudio has faster inference
>>
So how many users does Ani studio have and what models does it support? I might jump ship if it's worth it.
And don't try and pawn another gradio shit interface on me. Not interested.
>>
>>106693197
pros:
model loading is incredibly fast
inference is noticably faster
it's a pure C/C++ application
cons:
memory management isn't ready so you have to reload the checkpoint every gen
no qwen support for now
doesn't support all the diffusers Lora formats either
upscaling only supports two esrgan models
also, ani hasn't said anything about it being ready to switch over and he is in Japan to drum up funding and support. he's playing too many roles to have things done in a timely manner but who knows what will happen when he is finished in tokyo
>>
>>106693227
if ani brings Japan back into the AI race to beat the chinks I would be forever grateful but he's still a fag rn. at least he is a hard working fag
>>
>>106693239
I've worked in Japan for quite a while and Japanese people are some of the most clueless motherfuckers when it comes to AI, it's genuinely astounding. Like the ones who know their shit are genuinely good, but for the most part companies still haven't gotten over CV.
>>
>>106693227
>he is in Japan
lucky nigger
>to drum up funding and support
I have a bad feeling this could end up being a comfy situation but at least he knows he can't do it alone
>playing too many roles to have things done in a timely manner
comfy was incapable of running his own company let alone a fucking frontend and just let the grift chink sell everything out. maybe I should just trust ani because he doesn't seem like a greedy sellout like the alternatives
>who knows what will happen when he is finished in tokyo
please be good end
>>
>>106693138
That is because the best prompters are posting API gens.
>>
>>106693138
just skip niggerjak bakes
>>
>>106693288
>comfy was incapable of running his own company let alone a fucking frontend and just let the grift chink sell everything out.
lol so true!
>>
>>106693227
honestly, just sounds like it needs more time to mature but it's on the right track. I find it wild ggml beat out pytorch despite being some random balkan basement experiment.
>>
>>106693227
LMAO. Everything about this sounds like it is an absolute meme that will get abandoned in two months.
Can you guys try working on it and maybe get it to a respectable state instead of relentlessly shilling it here?
>>
>>106693343
>abandoned in two months.
The real joke is that this has taken way more than two months to get to this point.
>>
>>106693343
>guys
it's just one guy compared to an army of chink sloppers comfy has
>>
>>106693343
He's working since 3 years to get to this state kek
>>
>>106693354
to be fair, it takes 2 years on average for homebrew game engines to get to a stable state
>>
>almost solved the color shifting for loops
>introduced a large stutter in motion instead
>>
niggerjak woke up and immediately started seething I see
>>
>>106693334
if it has all the ggml options, multigpu is finally solved
>>
nvidia sissies...
>China's latest GPU arrives with claims of CUDA compatibility and RT support — Fenghua No.3 also boasts 112GB+ of HBM memory for AI

https://archive.is/jQVZo
>>
>>106693402
>China
>claims
>>
>>106693365
I don't know how you specifically always seems to have these issues.
>>
>ran took everything from me
>>
recommended comfyui img2prompt model?
>>
>>106693413
Try doing it on more complex images. I'm doing seamless loops of my older gens which are far from simple.
>>
>>106693428
joycaption
>>
>>106693428
If it's single images just use joycaption on HF or ask Gemini
>>
>>106693428
https://www.reddit.com/r/LocalLLaMA/comments/1not4up/qwen3vl235ba22bthinking_and/
>>
>>106693440
>Gemini
I know it's SaaS shit by my god Gemini understands images so well.
>>
>>106692754
>>106693010
I've finally fucking narrowed down what the issue is, it's how native comfy nodes merge the lora weights to the model

https://huggingface.co/Kijai/WanVideo_comfy/discussions/52#689196f3665bcc325ec1dbac
>When using GGUF in native Comfy or the unmerged LoRA mode in my wrapper, the LoRA weights are not merged and instead handled like I explained.
Picrel is fp8 on KJ nodes, working as intended
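A toy demonstration of the merged-vs-unmerged difference Kijai describes, using a coarse rounding grid as a stand-in for fp8 (real fp8 scaling is more subtle, but the failure mode is the same: a small LoRA delta gets rounded away when merged into low-precision weights, while the unmerged path applies it at full precision):

```python
import numpy as np

def fake_quant(w, step=0.25):
    """Stand-in for a low-precision weight format: snap every value
    to a coarse grid."""
    return np.round(w / step) * step

rng = np.random.default_rng(0)
step = 0.25
W = fake_quant(rng.normal(size=(8, 8)), step)   # base weights already on the grid
delta = np.full((8, 8), 0.1 * step)             # small LoRA update, i.e. alpha * B @ A
x = np.ones(8)                                  # dummy activation

merged = fake_quant(W + delta, step) @ x        # merge first, then quantize
unmerged = fake_quant(W, step) @ x + delta @ x  # quantized base + full-precision LoRA path
exact = (W + delta) @ x
```

Because the delta is smaller than half a grid step, `fake_quant(W + delta)` collapses back to `W` and the LoRA's effect vanishes from the merged weights; the unmerged computation reproduces the exact result.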
>>
lets remove comfyui from the OP first and figure out the technicals after
>>
>>106693463
So is the error something that can be fixed in the official comfy workflow or
>>
>>106693463
Does this affect all models or just wan?
>>
>>106693434
thanks
>>106693440
>ask Gemini
I'm sure Gemini will gladly img2txt my nsfw images
>>106693450
I'm a vramlet so no qwen for me unfortunately
>>
>>106693480
use taggui with local joycaption model
https://github.com/jhc13/taggui
>>
>>106693473
I don't know, it could be a broken node, or something wrong with my setup. I have a 40xx series card so I should have issues with fp8 and I've tried every pytorch+cu combination under the sun. At this point I'm just going to swap to KJ nodes
>>106693475
Also don't know I pretty much exclusive use comfyui for Wan
>>
>>106693498
>so I should have NO* issues with fp8
>>
>>106693163
By being an aggressive sperg who hires bots to damage anything remotely resembling competition. Remember his feud with lllyas right here, and then he tried the same with invoke, even though literally nobody cares about invoke. Except comfy, because they're in his space. Don't cross comfy's path, he'll go right for the throat.
>>
>>106693480
Pretty sure Gemini stops giving a shit once you put it in silly tavern.
>>
>>106693521
>agressive sperg who hires bots to damage anything remotely resembling competition
>Every accusation is an admission
>>
ComfyUI is SaaS adware and should be removed from the OP. If you are fine with ComfyUI being in the OP, you are fine with SaaS gens and API shilling.
>>
File: 1754080378544948.png (5 KB, 217x103)
>the most memory optimized official comfyui workflow (QIE2)
>>
what if you worked on your trash wrapper instead of endlessly sperging here?
>>
>>
>>
>>
>>
>>106693577
>everyone I don't like is ani
why don't you contribute to something instead of shitting your schizo diaper every 5 mins itt?
>>
>>
>>106693580
>>106693586
>>106693589
>>106693594
>>106693600
Why are you posting sora gens with the assets_task in the filename replaced with comfyui_2Ftask anon?
Did you want (You)s, here are some.
Please fuck off now.
>>
File: ComfyUI_1759285.png (2.35 MB, 1536x1024)
>>
>>106693589
pissmaxxed
grainmaxxed
slopmaxxed
redditmaxxed
cuckmaxxed
>>
They appear to be locally generated though? They have comfyui in the filename
>>
>>106693639
>>106693569
>>
>>106693589
It would take over 5 hours to gen something like this in chroma btw
>>
File: ComfyUI_temp_axhxq_00004_.jpg (2.09 MB, 1440x1440)
>>
>>106693589
People pay for this shit, lmao?
>>
>>106693649
Why would anyone want to gen this piss ?
>>
>>106693486
interesting
thanks
>>
File: Chroma2k-test_00068_.jpg (793 KB, 1216x1760)
>>
Which one is better, Flux Kontext Dev, Pro, or Max?
>>
>>106693911
leave before you become a schizo too.
>>
>>106693911
Qwen Image Edit
>>
>>106693911
Max of course.
>>
File: Chroma2k-test_00077_.jpg (395 KB, 1216x1760)
>>
>>106693943
Thanks. And between Qwen vs Seedream 4?
>>
>>106693911
wan2.5
>>
>>106693954
Seedream 4 easily. It's the best model available currently.
>>
>>106693963
*Diffused locally via API*
>>
>>106693963
For editing I suppose? I have tried it for image gen and it didn't seem that good, at least for anime-ish stuff, though I'm a promptlet so it might be a skill issue
>>
>>106693975
Neither are good for anime, you will want illust/noob or novelai for that.
>>
>ups died while I was genning cunny
fucking APC
>it's still in warranty
I'm fucking fuming the cunny was COMING so prime I could feel it, the combo of samplers and tags I was using made it GLISTENING, but no, the APSHIT had to fail.
>>
>>106693498
>At this point I'm just going to swap to KJ nodes
not q8 gguf? i think you indicated earlier that the "quants work fine", i took that to mean the gguf quants
>>
File: 1727451982686878.png (1.27 MB, 1008x1008)
Qwen Image Edit 2509

v2 lightning loras seem the best versions compared to v1 and v1.1
4 step lora quality is actually very close to 8 step; due to RNG it's not uncommon for it to be better on a given seed.
Neither is great for very large changes, though, where you need cfg 4 and thus 50 steps.

Solid improvement in the model overall, especially for cartoony styles, very similar to the Wan 2.1 to 2.2 jump; it seems like Qwen had a lot more cartoony data to add to their models.

Single image editing is almost nano banana level overall, and for a large amount of prompt types it's better, although with the same somewhat opinionated changes that the model makes. Not unexpected, as they almost certainly trained on its outputs.

It's still annoying that image height and width need to be divisible by 112px so it crops them less, and that it still crops a little even then.
VAE quality loss is still there, this is a huge problem for image edit models, we obviously need pixel space models asap, like Chroma radiance.

Outside of this, the biggest thing to improve in the model itself is copying the likeness of the input images even better; it still slops them up too much into non-candid plastic when doing bigger changes. That's something that will only go away slowly as the training process and datasets improve over time.
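A quick sketch of the resize-to-multiples-of-112 step, if anyone wants it (the 112 multiple and the ~1 MP target are just my own assumptions from testing, not anything official):

```python
def snap_to_112(width: int, height: int, target_px: int = 1024 * 1024) -> tuple[int, int]:
    """Scale an input image's dimensions to roughly target_px pixels,
    then round each side to the nearest multiple of 112 so the model
    crops less. Aspect ratio shifts slightly from the rounding."""
    scale = (target_px / (width * height)) ** 0.5
    w = max(112, round(width * scale / 112) * 112)
    h = max(112, round(height * scale / 112) * 112)
    return w, h

print(snap_to_112(1920, 1080))  # -> (1344, 784), both divisible by 112
```

Resize to those dimensions before feeding the image in and it should zoom/crop a lot less often.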
>>
>>106691532
why hasn't the SAAS troonware been removed from the OP yet?
>>
>>106694081
>VAE quality loss is still there
it royally destroys complex backgrounds, smudges them all over from my experience. very sad.
>>
File: flux1girl.jpg (2.3 MB, 2688x1536)
>>
>>106694094
If you keep it divisible by 112, and probably around 1 megapixel, rng it a little, it can keep most of the details.

But it's probably better to try some more advanced masking workflow which I didn't yet
https://civitai.com/models/1983350/ultimate-qwen-image-edit-plus-2509
https://civitai.com/models/1986315?modelVersionId=2248464
>>
>>106694094
>it royally destroys complex backgrounds
I only had that happen once. Other times it stayed the same. I don't know what the cause was.
>>
>>106694108
>>106694130
it's when it fucking zooms in, which can be mitigated by setting the multiplier to 112 like the other anon said (you also need to not apply the vae to the text encode node and instead use the classic reference latent conditioning)
But even then, it CAN still happen, and when it does it's a wasted gen.
To mitigate this I'm personally just using the lightning 4step so at least I only waste 8 seconds of my life if the zoom in happens.
>>
If I could get any superpower I wanted, it'd be deleting people from existence through civitai.
>>
>>106694040
what a shame
surely you still have the workflow so you can gen it later
>>
>>106694074
Basically for me
>Default Comfy nodes
Q8 works fine
fp8 broken
>KJ nodes
Q8 works fine
fp8 works fine
From what I can gather this is due to how comfy merges loras into fp8 before inference
You can read KJ explain it better here https://huggingface.co/Kijai/WanVideo_comfy/discussions/52#689196f3665bcc325ec1dbac
>>
>>106694180
Surely there's a comfy API node that can fix this?
>>
Is there a general purpose inpainting model that can just "pick up" style from the surrounding image?
I just need to unfuck a small hand.
>>
>>106694193
no, if you want classic masking and/or SEGS/detailer, your best bet is either the edit models (qie/kontext) or dedicated inpaint models (flux inpaint). Usually they 'pick up' the original style, but sometimes you gotta either lora it up or switch to a model that best resembles the style you're aiming for
>>
>>106694134
What am I looking at, did you hide people that got rid of their account?
>>
>>106694216
Blocked users. All the fuckers uploading furry, gay etc.
>>
>>106694220
holy fucking based
>>
File: 1755216455321419.png (76 KB, 703x446)
the fucking chink promised the qie+ lightning models today, WHERE THE FUCK ARE THEY YOPU FUCKING CHINMKOID
>>
>>106694211
I don't want Kontext/QIE since I don't want it to touch the rest of the image (Plus they may not be too happy about the booba.).
I guess I can give a shot to flux fill, it's mask based as I want if I understand correctly.
I hope it doesn't stand out too much or produce those weird inpaint smear artifacts.
>>
>>106694180
thanks, nice findings
>>
>>106694220
Oh I see, wouldn't it be better to just hide the tags? (If that's possible)
>>
>>106694242
you can use the edit models for INPAINT workflows, that's what I mean, not in pure edit mode.
>>
File: ComfyUI_00039_.png (1.05 MB, 864x1208)
>try qwen edit
>it adheres so well to the dress design that it removes the cleavage
>change dress

Success. This is extremely useful.
>>
>>106694247
It's right above the blocked users section and there's even a toggle to get rid of furry content. They're either retarded or autistic
>>
>>106694249
Oh, so I can give Kontext a mask and it will only touch that mask?
Let me try this.
>>
>>106694277
the mask literally forces the models to only touch the masked part, this is valid for ALL models
>>
>>106694274
>>106694247
It would be so lovely, but these fuckers don't tag their shit, at all. So it's useless.
>>
File: ComfyUI_00043_.png (1.37 MB, 752x1384)
>>106694255
Kek, it even works on less coherent images. I even managed to prompt for it to hide the entire body.
>>
>>106694284
Pretty sure civitai auto tags images with no way to edit them yourself
>>
File: ComfyUI_00047_.png (1.67 MB, 888x1168)
I haven't used these types of editing models before, so I'm pretty blown away like a boomer.
(this was grayscale)
>>
>mix the physical features of the two women together into one
Finally works with QIE now with 2509, big
>>
>>106694367
make him fall onto a walmart parking lot
>>
File: ComfyUI_00049_.png (1.35 MB, 1176x888)
"turn the two men on horses into office workers wearing black suits and fashionable pants. remove their hats and replace with short stylish black hair."

The overall quality gets destroyed, but you can just do further edits in photoshop, masking them out etc.
>>
>>106694081 >>106694255 >>106694376
got an interesting workflow for this or is it the example one?
>>
>>106694389
can't you just add a grain/gauss node or something?
>>
>>106694393
Example with Q8 gguf loader, you can also try >>106694108
>>
File: ComfyUI_00060_.png (2.05 MB, 888x1168)
>>106694380
>>
>>106694376
Actually, it doesn't really work
>>
File: ComfyUI_02274_.png (1.53 MB, 768x1152)
>>
File: ComfyUI_00071_.png (1.2 MB, 832x1248)
Holy shit I lost myself in this for an hour already.

>>106694393
Yeah the example one with q8 as well.

>>106694408
I'd rather do it manually in photoshop along with the rest of the editing.
>>
There used to be an eye symbol in ComfyUI that quickly hid all connections/noodles. It seems to be gone now for some retarded reason, and that's after reinstalling after not using AI for a couple months. Anyone know how to get it back? I don't want to have to go to the options menu and set render mode to none every fucking time I want to hide them.
>>
File: 1758110151020813.png (8 KB, 330x118)
>>106694549
>>
How compatible are base flux loras to kontext and fill?
>>
File: ComfyUI_temp_axhxq_00005_.jpg (2.2 MB, 1440x1440)
>>106693660
Second pass with 0.95 denoise turned it kino
>>
>>106694568
on a scale of are to not; not
>>
>>106694574
>Second pass with 0.95 denoise
Indistinguishable from just doing a first pass.
Also which model is this, do be aware that many won't like you rawdogging 1440x1440 resolution too much.
>>
>>106694559
Ugh, I see, that quick bar only shows up in focus mode. Lame.
Thanks though.
>>
>>106694597
View > Bottom Panel, then drag the logs window down so you don't see it. That way, you keep the quickbar
>>
>>106694594
Chroma. When you start pumping the denoise past 0.80 on the second sampler it starts making big changes, and past 0.90 it just uses the original pic as an i2i reference
>>
what happened to the voice diffusion from microsoft? need to catfish people
>>
>>106694611
That did it, thanks!
>>
>>106694617
Yes you are just doing a whole new image.
Just use empty latent, first pass becomes a pointless waste of time above 0.7.
>>
Is this comfy ui alert anything to be concerned about?
"Invalid workflow against zod schema:
Validation error: Invalid format. Must be 'github-user/repo-name' at "nodes[43].properties.aux_id""
>>
>>106694621
They pulled it, but it's an open license so they can't take down anyone who hosts it.
>https://github.com/diodiogod/TTS-Audio-Suite
Use that guys. It auto-downloads it. Get 7B and don't quantize to 4bit if you're not a poorfag with a sub 24GB.
>>
>>106694588
I am just disappointed that there is no lora to help NSFW stuff with Kontext.
>>
>>106694634
1st pass + 2nd with upscale gets better gens than just trying to smash it raw and hoping the model keeps coherence on high Mpx count
>>
what do you reckon is the size of wan2.5? it has a bunch of extra shit this time, with the audio model and such. wan2.2 was 16ish gb on q8/fp8.

i wonder if the audio/speech is done during video generation or afterwards.
man i want to know the tech details so badly.
>>
>qwen edit refuses nudity
Shieeet.

>>106694621
Shouldn't you be out shitting on the streets of canada?
>>
>>106694662
audio model?

are we going to be able to gen audio for i2v and t2v?
>>
>>106694666
blame trudeau
>>
>>106694621
>need to
No you don't, Pranjesh
>>
>>106694659
1. The resolution of both images you posted was the same, so no, it's not an upscale.
2. No, it helps jack shit at such high denoising. The model sees next to nothing from the original image. It will have the exact same level of coherency as drawing from scratch.
>>
>>106694694
The first was upscale on lower denoise
>>
>>106694673
it can generate speech/audio for the video. the quality is.. shit to be honest but you can always just dub over it as the lip movement will be there.
>>
>>106694662
FUCK OFF SAAS SHILL
>>
>>106694393
>example one
which one is that?
>>
>>106694814
are you retarded? if i were a shill i'd say how amazing it is. but it isn't. the audio is shit and it's slow as fuck. man kys, i just want to know the model details. i do not give a fuck if it stays api or not.
>>
>>106694876
ok you can ask about the model details in the NON LOCAL diffusion general. you're welcome
>>
>>106695094
Where's that?
>>
localjeets never recovered from dall-e 3
>>
>>106695149
yes, yes, local is toy, look upon saas and despair, yadayada
>>
we need someone to leak the seedream model
>>
>>106694648
Cracks me up how they purged their github of the models, crying over people "abusing" the model. The fuck did they expect people to use a zero shot voice cloning model for?
>>
>>106695167
>create robot intended to cook for you, but it can also jerk you off real good
>majority of users use said robot to jerk them off
surprised pikachu face
>>
File: Untitled.png (309 KB, 941x727)
>Replace the outfit of the man in picture 1 with the outfit in picture 3.

>Change nothing else about picture 2. Only the outfit.
>>
heh
>>
File: 1740510758926792.png (1.94 MB, 1024x1552)
neta yume is pretty good ngl, I've been gooning for the past two hours genning nekopara bitches
>>
File: ComfyUI_01366_.png (1.26 MB, 1024x1024)
>>
>>106695359
meh, it didnt humanize the red bitch
>>
>>106694862
All the "example" workflows for any model I think of so far were in comfyui (browse the workflows) OR -especially before this feature- on github/huggingface where the model was published... also I just trust the anons to know if there's something remarkably special in their workflow or if it's just ~the thing anyone would have

>>106694410
ty for that too
>>
>>106695362
>the red bitch
don't insult Akari like that!
>>
You are reporting bad things, right, anon?
After you downloaded it of course.
>>
File: it's over.png (93 KB, 1015x478)
>>106692099
lol, it's over
>>
File: 1747355184324389.jpg (325 KB, 1920x1080)
>>106695392
yuru yuri (not real yuri btw) is a garbage sol moeshit anime, sorry.
>>
>>106695415
>not real yuri btw
it is real yuri, and a funny one
>picrel
your anime sucks, it's "omg I'm a straight girl but a lesbian declared her love for me what should I do?" only subhumans would enjoy this shit
>>
File: 1729993285925082.png (809 KB, 1112x956)
kek
>>
File: 1745003154171330.png (1.91 MB, 1024x1552)
>>106695427
>it is real yuri
it is as yuri as nichijou is, you dumb faggot
>rest of the post
holy shit taste
>>
>>106695436
real usecase for wan, removing faggots from movies.
BASED CHINA
>>
>>106695446
>it is as yuri as nichijou is
(You)
>>
File: ComfyUI_01367_.png (1.37 MB, 1024x1024)
>>106695362
Why would I want it to do something I didn't ask it to?
>>
>>106695451
will we get a nexus mods site but for movies this time? top kek
>>
>>106695415
It's not real yuri, it's yuru yuri and the word yuru is doing a lot of heavy lifting.
>>
>>106695460
fucked up eyes, ultra slopped.
qie the SPRO treatment
>>
>>106695396
catb0x it, or at least tell what it was
>>
https://xcancel.com/bdsqlsz/status/1971055141001605307#m
>China just released a new GPU
>CUDA-compatible
>112GB of HBM memory,
HOLD UP LET THEM COOK
>>
>genning porn
>the girl randomly turns her head and looks at the camera with a confused and slightly disgusted expression

S-should I stop..?
>>
>>106695488
it's a propaganda post most likely, we'll need to wait until the card is actually out, also imagine the pricing on it.
>>
>>106695499
it's a sign the model is recognizing you're raping it, so continue
>>
>>106695468
yuri doesn't mean "the characters end up in a couple", it means there are lesbians in there, and it's true, they almost all are (except Akari I guess)
>>
>>106695382
thanks, civitai "examples" are always like "150 nodes and 20 new custom ones to install" and it's a bit annoying when I just want to test the model itself in a simple way
>>
File: 1746333019093104.png (2.3 MB, 1024x1552)
>>106695504
>ACKSHUALLY
lmao keep coping bro
>>
>>106695500
>Chinese company shows their new GPU
>PROPAGANDA >:O
>American company shows their new GPU
>Advertising :)
>>
>>106695488
>claims
it doesn't exist, anon
>>
File: Before Sunrise (1995).png (1.91 MB, 1000x1488)
>>106695528
lurk more, there's also movies where the characters don't end up in relationship, and we still call that "straight romance"
>>
How many more years until models can create / replicate facial expressions? This is my biggest frustration with image generation, it always sloppifies any facial expression / head positioning and angle to that 'staring at viewer with a blank expression' or at most smiling.
>>
File: ComfyUI_01368_.png (2.78 MB, 1544x1544)
>>106695470
Unfortunately, I don't think it's really possible to unslop once the sloppening has begun.
>>
>>106695562
better than before but yeah. I think it'd be better if he walked with miku and teto desu, akari is a slut
>>
>>106695562
why does she have dirt on her knees? did she.. you know..
>>
anons is there any big difference btw fp8 and q8 on wan2.2?
>>
>>106695597
remind me of the teto meme where she's tired as fuck and miku ends up getting the employee of the month or something kek
>>
File: 1744820181100905.png (2.09 MB, 1024x1552)
>>106695597
it's a pantyhose
>>106695602
q8 is always better than fp8, so go for it.
>>
File: 1742133858485100.png (2.04 MB, 1024x1552)
I hate the vocaloids
>>
>>106695602
Not a huge difference from my tests, but q8 will look better, at the price of slower inference for 5000 cards since they are optimized for fp8.
>>
>>106695652
still better than Vutubers desu
>>
>>106695656
4000 cards are also optimized for fp8, wasnt the 5000 meme about fp4?
>>
File: 1747206161861644.png (1.98 MB, 1024x1552)
>>106695664
absolutely
>>
>>106695665
I only have a 5090 and 3090, so I dunno, but I tested q8/fp8 scaled, and while for the 3090 there was no difference, the 5090 was faster for fp8.
>>
>>106695700
strange, for inference everything gets casted to fp16 from what I remember
>>
>>106695656
is fp16 fp8_e5m2 dtype going to look better than q8 default dtype?
>>
>>106695713
no, no one talks about e5 because it's worse than the e4 one
>>
>>106695713
it goes like this:
fp16
bf16
Q8
fp8_e4m3 (for 4000s/5000s)
fp8_e5m2 (for 3000s)

If you go any lower than this you'll get shit quality
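The e4m3 vs e5m2 split is a precision/range trade: e4m3 spends 4 bits on exponent and 3 on mantissa (finer steps, max ~448), e5m2 spends 5 on exponent and 2 on mantissa (coarser steps, max ~57344, and the same 5-bit exponent layout as fp16, which is presumably why it's the one recommended for pre-Ada cards). A toy quantizer showing just the step-size difference (normal range only, not an exact fp8 implementation):

```python
import math

def quantize(x: float, mant_bits: int) -> float:
    """Round x to mant_bits explicit mantissa bits; overflow and
    subnormals are ignored, this only shows the step-size difference."""
    if x == 0.0:
        return 0.0
    m, e = math.frexp(abs(x))                 # abs(x) = m * 2**e, m in [0.5, 1)
    scale = 2 ** (mant_bits + 1)
    return math.copysign(math.ldexp(round(m * scale) / scale, e), x)

print(quantize(1.6, 3))   # e4m3-style -> 1.625 (steps of 0.125 between 1 and 2)
print(quantize(1.6, 2))   # e5m2-style -> 1.5   (steps of 0.25, coarser)
```

So for weights, which mostly sit in a narrow range, e4m3's extra mantissa bit matters more than e5m2's extra range.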
>>
>>106695708
I got 450s vs 600s in wan when I tested, from memory.

>>106695713
All I know is:
fp8_e5m2 (scaled or not) -> recommended for 3000 cards and below
fp8_e4m3fn (scaled or not) -> 4000 and up
>>
>>106695499
>the girl randomly turns her head and looks at the camera with a confused and slightly disgusted expression
did it also say your full name and social security number? i hate it when that happens
>>
>>106695488
>>106695500
Please, based Chinaman, liberate us all from the nvidia menace

>>106695499
Kek, this the twerking loras, skinwalker bitch looks directly at the camera
>>
>>106695499
>confused and slightly disgusted expression
is wan even able to understand that?
>>
>>106695740
so if i use fp16 model with fp8_e5m2 dtype, it's the same as using an fp8 model?
>>
>>106695800
bro what the fuck are you doing/babbling about?
assuming you're talking about kijai's wan nodes, the dtype has to match your actual model type. I doubt you downloaded the full fp16, no? literally go ask chatgpt about this, you seem to be lacking a fundamental understanding of how this shit works
>>
>>106695809
>the dtype has to match your actual model type.
he literally said he downloaded a fp16 model, so he's allowed to run it on fp8 e4 or e5
>>
>>106695800
>it's the same as using an fp8 model?
if the type of quanting to fp8 of that fp8 model is e5m2, then yes
>>
>>106695451
I'm waiting for Ryan Goslings "Roots"
>>
>>106695800
yes,
>fp16 + you run on fp8 e5 mode = fp8 e5
>fp16 + you run on fp8 e4 mode = fp8 e4
>>
File: 1731664319543605.png (1.95 MB, 1024x1552)
>>106694231
where's the lightning nunchaku model? is the chink sleeping?
>>
File: file.png (15 KB, 852x108)
> nunchaku
> can't run lora
wew lad.

also, what is the logic of this? is the 8step lora higher quality than the 4step one?
>>
>>106694231
I noticed that using the lightning loras on qwen image (and edit) makes the images more plastic and slopped; since QIE is already highly plastic shit, we're heading into radioactive territory with that one
>>
>>106695898
>is the 8step lora higher quality than the 4step one?
obviously
>>
>>106695898
it can run loras for flux (the other supported model) but not for qwen (wip, very close to release).
>>
>>106695980
>flux
dead model lmao
i'll try nunchaku once it has lora support. i can run the native model so the only interesting thing here is the apparent speed increase.
>>
>>106695991
>dead model lmao
far from it, there's a reason Chroma was made from Flux Schnell and SRPO from flux dev. Qwen Image hasn't replaced flux at all (and it's humiliating to know that, since QI is a 20b model, almost twice as big as Flux)
>>
>>106695991
qie is way more slopped than flux, I'm hoping it gets the SRPO treatment
>>
>>106696009
>I'm hoping it gets the SPRO treatment
no one will do that on QIE since the Alibaba fags will release an "improvement" every month
>>
>>106696016
I know they called it 'monthly' release, but will they really?
>>
>>106696016
I'm ok with that
>>
SRPO just looked like low noise injection but i know far too little about it to really REE about it.

>>106696003
i get what you are saying but am having far more fun with qwen image compared to flux. are there any flux finetunes you could recommend that enhance its general understanding and nsfw capabilities? because needing a lora for every single little thing is tedious.
>>
File: 00001-1103118575.png (3.16 MB, 1248x1824)
>>
>>106696042
it also destroyed details, but maybe that was because they applied it to a distill.
>>
>>106696052
hello biuteful, show bobs
>>
>>106696042
nta, i haven't found an nsfw flux that isn't dogshit. having to get loras for everything is gay
>>
>>106696070
>having to get loras for everything is gay
true that, that's why I'm rooting for edit models, at least you don't need characters lora anymore if they work well (so far only nano banana works well)
>>
bread?
>>
>>106696076
Never used nano banana, I assumed is unlewdable?
>>
>>106696076
I didn't test the new qie but my main issue was that it was kind of dog shit at even understanding what a lot of underwear and clothing types (the more daring ones) were.
>>
>>106696093
it's google anon, it's so safe, you will have a safety orgasm
>>
>>106696093
it is hardcore censored. people have to use all kinds of retarded language to get anything wearing a bikini without causing google to drone-strike your house.
>>
File: 1729702247896588.png (2.32 MB, 1024x1552)
Neta yume bros, why are there basically no loras for this model? Is it hard to train?
>>
>>
>>106696109
They're losing to SDXL because their base model isn't trained well enough.
>>
>>106696128
some sad ass there
>>
>>106696128
Why go to such extent to just generate the video it was trained on?
Same thing baffles me with all the porn people are doing. Just replicating the trained data.
>>
>>106696042
Flux Dev SRPO at guidance 3.5 is equivalent to like guidance 2 on normal Flux Dev. The apparent advantages of SRPO are way more apparent on seed-to-seed comparisons if you take it up to around guidance 4.5. That said I far prefer Flux Krea regardless, it just has quite noticeably better adherence than normal Dev or SRPO and a better understanding of a lot of style-related stuff, along with the improved realism.
>>
>>106696130
That's definitely not the reason, lack of awareness of Lumina 2 -> Neta Lumina -> NetaYume Lumina even existing plus the added hardware requirements are likely more the cause. A lot of people really can't handle anything that's even a bit more demanding than SDXL, even now.
>>
File: 1737069946988100.png (2.4 MB, 1024x1552)
>>106696130
it's fucking sad really, if you check the lora page on civitai it's 21 loras, most of which fucking SUCK. I just wanted to gen some bitches from the currently airing anime, but NO LORAS.
>>
>>106696190
everything is for FUCKING illustrious
>>
File: 1754350297334688.jpg (47 KB, 637x679)
>check youtube for 2.5
>check plebbit for 2.5

Kek, when 2.1 and 2.2 dropped it was everywhere, now almost radio silence
>>
>>106696178
As of Wan 2.2, we know people will use a model if the output quality is sufficient, so reasons like long computation time can be ruled out. It simply comes down to poor model performance.
>>
>>106696190
AlphaVLLM had their own lora training scripts for the underlying Lumina 2.0 but there was no support in e.g. Kohya until very recently. Kohya does have it now in the "SD3" branch; not sure about other trainers.
>>
>>106696212
The NetaYume guy has been steadily improving the quality of Neta Lumina 1.0 with his finetune though, and Neta Lumina 1.0 was always quite objectively better than stock Illustrious 0.1 (which people continue to train off of)
>>
>>106696247
yeah I mean all the gens I posted were done with neta yume v3, I couldn't believe my eyes with the QUALITY I'm seeing for anime at least. It can also generate up to 2048x2048
Also being able to use both tags and boomer prompting at the same time. I don't think I can go back.
>>106696225
I guess I'll need to train my own loras, will check kohya's scripts
>>
>>106696274
>>106696274
>>106696274
>>106696274
>>
>>106695740
Sorry but isn't fp8 supposed to be better than Q8 since it uses floating point, or does it entirely depend on the model and application?



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.