Immaculate Creativity Edition

Discussion of Free and Open Source Text-to-Image/Video Models

Prev: >>107135438

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/sd-scripts
https://github.com/tdrussell/diffusion-pipe

>WanX
https://comfyanonymous.github.io/ComfyUI_examples/wan22/
https://github.com/Wan-Video

>Neta Yume (Lumina 2)
https://civitai.com/models/1790792?modelVersionId=2298660
https://nieta-art.feishu.cn/wiki/RY3GwpT59icIQlkWXEfcCqIMnQd
https://gumgum10.github.io/gumgum.github.io/
https://neta-lumina-style.tz03.xyz/
https://huggingface.co/neta-art/Neta-Lumina

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
Training: https://rentry.org/mvu52t46

>Illustrious
1girl and Beyond: https://rentry.org/comfyui_guide_1girl
Tag Explorer: https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/b/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
>INB4 schizo anon
desu dere were gud bideos in previous
>mfw Resource news

11/07/2025
>ComfyUI-SeedVR2_VideoUpscaler Version 2.5.0
https://github.com/numz/ComfyUI-SeedVR2_VideoUpscaler#-updates
>Nvidia cosmos 2.5 models released
https://github.com/nvidia-cosmos/cosmos-predict2.5
https://github.com/nvidia-cosmos/cosmos-transfer2.5
>Tortoise and Hare Guidance: Accelerating Diffusion Model Inference with Multirate Integration
https://yhlee-add.github.io/THG
>Text to Sketch Generation with Multi-Styles
https://github.com/CMACH508/M3S

11/06/2025
>Infinity: Unified SpaceTime AutoRegressive Modeling for Visual Generation
https://github.com/FoundationVision/InfinityStar
>Decoupling Augmentation Bias in Prompt Learning for Vision-Language Models
https://github.com/Gahyeonkim09/AAPL
>Comfyui-Resolution-Master Release v1.5.0
https://github.com/Azornes/Comfyui-Resolution-Master/releases/tag/v1.5.0

11/05/2025
>BindWeave: Subject-Consistent Video Generation via Cross-Modal Integration
https://huggingface.co/ByteDance/BindWeave
>Black-Box Membership Inference Attack for LVLMs via Prior Knowledge-Calibrated Memory Probing
https://github.com/spmede/KCMP
>GPU Benchmarks: Performance comparisons for AI image generation with open source models
https://www.promptingpixels.com/gpu-benchmarks

11/04/2025
>Stability AI largely wins UK court battle against Getty Images over copyright and trademark
https://abcnews.go.com/amp/Technology/wireStory/stability-ai-largely-wins-uk-court-battle-getty-127164244
>VideoSwarm v0.5.2
https://github.com/Cerzi/videoswarm/releases/tag/v0.5.2
>UniREditBench: A Unified Reasoning-based Image Editing Benchmark
https://maplebb.github.io/UniREditBench
>Vote-in-Context: Turning VLMs into Zero-Shot Rank Fusers
https://github.com/mohammad2012191/ViC
oh and i asked both claude 4.5 and kimi k2 thinking and they both agreed that "FP16 with fast accumulation should preserve model quality better than Q8 quantization">>107145441>Feel free to provide sufficient counter examples.okay nigger i will literally just run a script right now and write a whole rentry for you, give me 2 hours to definitively prove this and we will both come out more learned
>>107145477wrong thread
>>107145497>oh and i asked both claude 4.5 and kimi k2 thinkingHAHAHAHAHAHAHAWell played if you are baiting though.>okay nigger i will literally just run a script right now and write a whole rentry for you, give me 2 hours to definitively prove this and we will both come out more learnedHave fun!
Blessed thread of frenship
>>107145536
>HAHAHAHAHAHAHA
midwit take, both would mog you on an IQ test
anyways first lets check fp16 fast versus fp16 baseline
>>107145378Nice collage
>gen image of dick girls>make them kiss with wan
>>107145397thank you but something is not working with these settings. Same seed as >>107145014Q8 is still downloading, maybe that will improve it
AniStudio should be in the OP.
>>107146121
Appropriate version of the lora should be loaded for both models. Ditto for the NAG shit (Honestly I wouldn't mess with it before you figure out how to run it properly) and model sampling (though I guess if you are going to leave it at 1 it may not be necessary)
Use Strength 1 for the lora.
>>107146180fuck off
>>107146180it should. maybe ani will have more of a reason to keep working regularly on it rather than schizos ruining the thread if he so much as gives an update. also very shameful people eat up python dogshit instead of helping get an actual fucking application made. most of you don't even belong on /g/. fucking embarrassing
>>107146297
>>107146314>ancient Egypt Twitter screencap threadI see this flavor of shit post has been around for a long time
>>107146287Why are you like this?
>local dumptruck generalimpressive, very nice
>>107146367hnnnnnng sauceeeeee
I'm thinking of building a PC with 9800X3D, 5070Ti and 64GB RAM. Is it good for AI image generation? What about video?
>>107146608
you will be able to generate a video like >>107146339 in around 3-4 minutes with a setup like yours
if you're going to buy 64 gb of ram i might as well try and convince you to go up to 96gb or even 128gb. you never know when you'll want to run the next big thing and you'll need more than 64gb of ram for it, and all DRAM manufacturing is reserved so prices are only going to go up for the foreseeable future
>>107146608you need more than 16 GB of VRAM. VRAM is the most important spec.
>>107146642
5090 would be the other option, but it's expensive...
>>107146647get a used card if you have to, or even R9700.
oh shiet it's converging
>>107145536
FP16 T5_XXL is 0.03% better than Q8_0, while being 8% slower
So your claim that "Fast will rape it more than Q8 lol." is demonstrably false.
I am REALLY fucking impressed with how close the difference is
https://pastebin.com/vt0Q4hLr
https://rentry.org/t5_xxl-q8-versus-fp16fast
code: https://pastebin.com/AGQ8ghgp
>>107146642How would 16GB VRAM limit show on genning?
>>107146764>How would 16GB VRAM limit show on genning?if you want to load a model with 17 billion parameters at q8, that's 8 bits per parameter. which is one byte per parameter. 17 billion bytes is 17GB oh shit nigger you're out of VRAM!WAN video is about 14 billion parameters, any language model you might want to run could be anywhere from 12 billion to 400 billion, you also have to load the text encoder when doing image or video gen and who knows what else
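The arithmetic in that post, as a rough sketch: weight memory is parameter count times bits per parameter divided by 8. The function names here are made up for illustration, GB means 10^9 bytes as in the post, and only the weights are counted (no text encoder, VAE, activations, or framework overhead), so treat the result as a lower bound. GGUF Q8_0 stores one 16-bit scale per block of 32 8-bit weights, so it works out closer to 8.5 bits per weight than 8.

```python
def weights_gb(params_billion: float, bits_per_param: float) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_param / 8


def fits_in_vram(params_billion: float, bits_per_param: float, vram_gb: float) -> bool:
    """True if the weights alone fit; real usage needs extra headroom."""
    return weights_gb(params_billion, bits_per_param) <= vram_gb


if __name__ == "__main__":
    # Illustrative cases only: the 17B example from the post and a 14B model.
    for name, params, bits in [
        ("17B at 8 bits", 17, 8),
        ("14B at 16 bits (fp16)", 14, 16),
        ("14B at 8.5 bits (Q8_0)", 14, 8.5),
    ]:
        size = weights_gb(params, bits)
        print(f"{name}: {size:.1f} GB, fits in 16 GB: {fits_in_vram(params, bits, 16)}")
```

Anything that does not fit has to be offloaded to system RAM or swapped in blocks, which is where the slowdown comes from.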
I'm thinking new pc time, I wanna gen videos
How's amd for AI gen stuff, I can get more vram with those for similar prices but I don't know if they have other issues (12gb nvidia vs 16gb amd)
>>107146785
>How's amd for AI gen stuff
nope. nope nope nope nooooo
they don't have cuda so you're not genning videos with wan with sage attention which means you're waiting 2 hours per video on a 7900 xtx
get a 5070ti and you can make >>107146339 in 3-4 minutes with 32gb of ram or 2-3 minutes with 64 gb of ram
>>107146824sorry i forgot a 7900xtx is like 40% as strong as a 5090 so you're waiting 6 hours per video lmao
>>107145477gtfo ldg d*b# go join trannii and troonnffy on the containment thread
>>107146785
they're fine if you use linux, which you should be using anyway.
I use a 7900 XTX and can gen chroma, qwen, and wan. It's about as fast as a 3090. Heavily considering replacing it with an R9700 though. that has to be one of the best value cards for AI in existence right now.
https://www.phoronix.com/review/amd-radeon-ai-pro-r9700/2
>16GB AMD card
16GB will be painful and limiting no matter what brand you get. Fine for SDXL and Lumina though.
>>107146824
>>107146834
This is FUD.
>>107146785the best thing you can do with an amd card for AI is to sell it so you can purchase an nvidia card
>>107146895for the price of your 7900 xtx he should just get a 5070ti. its not fud. your wan speeds on AMD are atrocious, if they weren't you would have said them
I know this thread is for images, but I was wondering if there are similar tools for music generation? So far I've only seen stuff like Suno which barely give you any control on the output
>>107146367seconding the catbox link please
>>107146300so you're saying we should advertise unfinished, broken software in the OP so that the loser dev maybe will work on it instead of dilating? that's wild
>>107146908>I was wondering if there are similar tools for music generationthere are. ace-step is the best one. it's garbage compared to suno or udio. udio is the best one and its on a website and just got pozzed by the music mafiaallegedly the chinese are coming out with a great new music generator soon but only one of our resident namefag schizos is keeping up to date with that so i have no new information for you at this time
>>107146740is this slop? please explain to a retard how does Q8 take more VRAM than FP16 when its half the file size?
I have a 4080. Is there a guide out there for combining it with another card? 16gb is driving me crazy.
>>107146976that's probably from me quantizing it and unquantizing it, you can ignore that stuffor you can latch onto that because I proved definitively that fp16 fast is better than q8 for t5 specifically
>>107145378Where the fuck is the Wan rentry guide? Why the fuck it got removed from the OP links?
>>107146740
Thanks for the data.
I will be honest I am completely skeptical about this AI generated experiment, precisely because some results show literally ZERO difference between normal fp16 and fp16 fast, which is very much not the case when genning images or videos. Even if you were to agree that the quality is very high or better than Q8, simply put zero difference is nonsense and implies something is amiss in the AI slop code.
>[FP16 Fast vs FP16 Baseline]
> Cosine Similarity: 0.999999
> (std: 0.000000, min: 0.999999)
> MSE: 0.00e+00
> MAE: 3.43e-05
Why is MSE zero while MAE is non-zero?
Perplexity is also much higher on the ostensibly better fp16 fast??? (Even your AI notes that lower values should be better)
I am sure I could find more holes if I was more /lmg/ pilled.
>>107146976
Check rentries lol yes it is.
>>107147021Ok. I'm not the anon you were arguing with. Just got confused there for a second because I thought the whole point of quants was fitting more of the model in vram.
I'm on 1660S 6GB, how much better can I gen pics like this if I go for 5070Ti? What about 5090?
>>107147051
i inverted the perplexity formula, and the MSE is non-zero, it just shows as zero because of significant digits. let me do another test just for you sweetie
I'm basing my tests off of how ComfyUI-GGUF uses the GGUF models btw
>>107147068
>I thought the whole point of quants was fitting more of the model in vram
not necessarily the whole point, it also lets you run the model faster even if you had vram for better precision
>>107146921isn't all the UI options unfinished and unstable slop?
>>107147094What is this meme about?
>>107147180>what is this meme aboutpedophiles who like to not get caught use android (graphene os, probably)
>>107147095yes and one link is an advert for a company stagnating the space
alright so i think I didn't do anything wrong with the 1.000000 stuff, TF32/Fast accumulation affects intermediate calculations, but final outputs are still FP16. So differences accumulate through many layers but remain really small due to FP16 rounding
either way I shared the code
or alternatively I just mog the fuck out of you and fast fp16 accumulation should always be turned on lol. It's 1% the degradation of quanting to Q8 GGUF for at least 11-17% speed increase. Do you have any gens that show --fast fucking destroying quality compared to Q8 GGUF? Maybe I have to look into how ComfyUI implements --fast because maybe there's something going on in his implementation idk
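A toy illustration of what "accumulation precision" means, in plain NumPy. This is not how ComfyUI's --fast or any GPU matmul kernel is implemented, and it is deliberately exaggerated (a long sum of all-positive values), so it says nothing about real model quality. It only shows that the same fp16 inputs and the same fp16 output can differ depending on the precision of the running total in between. The `accumulate` helper and the input data are made up for the example.

```python
import numpy as np


def accumulate(values: np.ndarray, acc_dtype) -> np.float16:
    """Sum fp16 values with the running total kept in acc_dtype, then round to fp16."""
    acc = acc_dtype(0)
    for v in values:
        acc = acc_dtype(acc + acc_dtype(v))
    return np.float16(acc)


rng = np.random.default_rng(0)
# Hypothetical inputs: 8192 fp16 values in [0, 1], standing in for partial products.
values = rng.uniform(0.0, 1.0, size=8192).astype(np.float16)
exact = float(values.astype(np.float64).sum())

err_fp16_acc = abs(float(accumulate(values, np.float16)) - exact)
err_fp32_acc = abs(float(accumulate(values, np.float32)) - exact)

print(f"fp16 accumulator error: {err_fp16_acc:.3f}")
print(f"fp32 accumulator error: {err_fp32_acc:.3f}")
```

Real layers mix positive and negative terms and hardware typically accumulates in blocks rather than one element at a time, so the gap in practice is far smaller than in this worst-case style sum.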
[FP16 Fast vs FP16 Baseline]
 Cosine Similarity: 0.99999946 (std: 0.00000031, min: 0.99999880, max: 0.99999972)
 MSE: 0.000000e+00
 RMSE: 0.000000e+00
 MAE: 3.433228e-05
 L2 norm difference: 8.006096e-04
 Max difference: 9.765625e-04
 Relative Error: 0.027374%
 SNR: inf dB (higher is better)
 Element-wise analysis:
 • Elements with differences: 19490 / 24576 (79.31%)
 • Mean of non-zero diffs: 4.333258e-05
 • Max single element diff: 9.765625e-04
 • 95th percentile diff: 9.155273e-05

[Q8 GGUF vs FP16 Baseline]
 Cosine Similarity: 0.99964816 (std: 0.00038102, min: 0.99880713, max: 0.99987211)
 MSE: 1.490116e-06
 RMSE: 1.220703e-03
 MAE: 8.268356e-04
 L2 norm difference: 7.019043e-02
 Max difference: 2.288818e-02
 Relative Error: 2.400391%
 SNR: 31.48 dB (higher is better)
 Element-wise analysis:
 • Elements with differences: 24282 / 24576 (98.80%)
 • Mean of non-zero diffs: 8.368492e-04
 • Max single element diff: 2.288818e-02
 • 95th percentile diff: 2.502441e-03

Q8 vs FP16 Fast element-wise comparison:
 • Q8 affects 1.2x MORE elements
 • Q8 errors are 19.3x LARGER on average
 • Q8 max error is 23.4x WORSE

And heres the actual speed and VRAM usage

Speed Comparison (lower is better):
 FP16 Baseline: 0.1315s ± 0.0036s
 FP16 Fast: 0.1108s ± 0.0064s
 Q8 GGUF: 0.1057s ± 0.0012s
 FP16 Fast speedup vs Baseline: 15.8%
 Q8 GGUF speedup vs FP16 Fast: 4.6%

VRAM Usage (lower is better):
 FP16 Baseline: 10.76 GB
 FP16 Fast: 10.76 GB
 Q8 GGUF: 4.44 GB
 Q8 GGUF VRAM savings vs FP16: 58.8%

Alright that's enough
code: https://pastebin.com/4XGDjRDt
bonus slop (includes the way I tested Q8_0, audit this please): https://rentry.org/how-comfyui-gguf-works
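One possible explanation for an MSE of exactly zero next to a non-zero MAE, offered as a guess and not as a reading of the linked script: if the metrics themselves are computed in fp16, squared differences around (3e-5)^2 ≈ 1e-9 fall below the smallest fp16 subnormal (2^-24 ≈ 5.96e-8) and round to zero, while the absolute differences are still representable. The reported numbers are consistent with that: the Q8 MSE of 1.490116e-06 is 25 × 2^-24 and the max difference of 9.765625e-04 is 2^-10. The sketch below reproduces the effect on synthetic data; `mse_mae` and the perturbation are assumptions made up for the example.

```python
import numpy as np


def mse_mae(a: np.ndarray, b: np.ndarray, dtype) -> tuple[float, float]:
    """MSE and MAE of a - b, with the difference and its square kept in `dtype`."""
    diff = a.astype(dtype) - b.astype(dtype)
    return float(np.mean(diff * diff)), float(np.mean(np.abs(diff)))


rng = np.random.default_rng(0)
baseline = rng.uniform(-0.05, 0.05, size=24576).astype(np.float16)
# Hypothetical perturbation of about 3e-5 per element, similar in size to the reported MAE.
perturbed = (baseline.astype(np.float32) + 3e-5).astype(np.float16)

mse16, mae16 = mse_mae(baseline, perturbed, np.float16)
mse32, mae32 = mse_mae(baseline, perturbed, np.float32)

print(f"fp16 metrics: MSE={mse16:.6e} MAE={mae16:.6e}")
print(f"fp32 metrics: MSE={mse32:.6e} MAE={mae32:.6e}")
```

If that is what happened, upcasting both tensors to float32 before computing the metrics would give a finite SNR and a small non-zero MSE without changing the conclusion about which variant is closer to baseline.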
>>107146908No. There is no SDXL for music gens, otherwise you'd know about it by now. It's a shame, because the potential is out there. I'd love to make a SNES fine-tune and then throw in some Kirby LoRA's and create new shit.
SPARK chroma actually fixed the anatomy and consistency.
>>107147427Bold claim.I will need to test this myself to believe it.
>>107147478
>>107147581it's good... it removes some style flexibility for unrealistic/cartoony styles though.
>>107145378i'm a gpucel rn but i'm trying out the comfui cloud beta. it doesn't support external models/loras yet from what i can tell. anyone else using it? i'm curious if there's a way I can use their built in libraries to get realistic genitals with wan 2.2, my cocks all look like mangled hairy thumbs. any high quality comfyui resources for a code literate beginner would also be much appreciated
>>107147427>SPARK chromaThe model card says this was trained using 2400 images, on a single 4090, over a few days. So then, it's literally just a LoRA. That the creator merged into the full model, releasing only the merged model, while advertising one of those bullshit gofundme things where you can donate money to him. This feels like grift. Why not just release the lora, which is all this is?
>>107147374VRAM is a gift from Christ himself.
>>107147869If it doesn't support loras you're done. No genitals for you.
>>107146670Good
>>107147943maybe he plans to keep iterating on it and the dataset is more diverse?IDK what the difference is, but I haven't encountered a chroma LORA that does as good a job at fixing chroma as this finetune does. if you know of one I'd love to try it out.
>>107147960it has a built in selection, i'm just not familiar enough with the landscape yet to know which if any are applicable. custom loading is on the roadmap apparently.
>>107147086
Compare the number of CUDA cores first. Just comparing the CUDA core counts will tell you that the 5090 is probably 20 times faster than the 1660S.
The 1660S has ~1500 CUDA cores and the 5090 has 21760. That's about 15 times more. Not even talking about the VRAM and other differences here, like improved noise generation and such.
What's the best song generated in the last thread?
>>107147086>What about 5090?that's noobai, you don't need a 5090
>>107147044https://rentry.org/ldg-lazy-getting-started-guide#anon-guides-and-resources>>107147416Added.
>>107147044>Why the fuck it got removed from the OP links?The Comfy example details how to set it up without what was needed in the rentry.
>paying for a cuck service that is worse than generating locally
how does comfyorg make money when its core user base wants nothing to do with saas garbage?
We should support Ani.
>>107148080i don't have the money to upgrade my vram to run locally but i can afford $20/month, simple as
>>107148100if this is how you treat everything in life no wonder why you own nothing
>>107148106>how you treat everythingin a realistic pragmatic way? yeah, guilty as charged
>>107148100why are you paying middleman fees when you can use a cloud GPU service directly and run comfyui normally?
>>107148136this. it's also charged by usage not monthly
>>107148049I much preferred when it was listed as a separate entry.At very least we would need separate guides for videogen and imagegen (and possibly other stuff like TTS too and musicgen), would be useful for both anons who only check these threads from time to time or thread tourists
>>107148080You would be surprised by how many third worlders there are with only a shitty laptop who want to generate bobs and vagene. And you can use beefy GPUs. VRAM no longer any concern ever, and it's fast as fuck (H100 is 8x faster than 3090 for instance).
>>107148136I had never used comfyui before last night and it seemed like a reasonable price point to try it out for a month with minimal extra overhead. pretty pleased with it so far, and if I end up getting really into it I will either invest in vram to run local or switch to something like runpod if it's significantly cheaper.
>>107148146that's another point in their favor for an absolute beginner. i still don't know how invested i'm going to get in this. unless unused hours are lost at the end of the billing cycle i see no real downside to it. oh, i used up my hours this month? no prob it's only a few days till they refresh. oh, i really want to keep going? okay i'll put another quarter in the jukebox.
>>107148136at the risk of triggering the shillbots, any recommendations?
>>107147478did pancakeGOD ever post his prompts?https://files.catbox.moe/bst0uj.webm
>>107148049
>Added.
lol alright, it's a pretty niche rentry and the fast fp16 stuff doesn't really make sense without context but it is interesting and the "Complete Flow" diagram is a good reference for what the purpose of a text encoder/t5 even is for
>>107148151
>At very least we would need separate guides for videogen and imagegen
I think the OP should have a "I have a 5060ti 16gb or better, how can I make videos on windows?" guide since that's what most lurkers would want
now that a new version of t2v has come out as well as the updated i2v (was it good?) from a few weeks ago it would be a good time to make a new guide
>>107148215
>at the risk of triggering the shillbots, any recommendations?
vast ai has the cheapest possible machines since its like Ebay for GPUs
>>107148179Kek. Not what I expected.
>>107148151Is there a reason to use it over the Comfy guide? >>107148070The second thing one hits when searching "video" ITT is the Wan github which would hopefully make one look above that link to see the specific video install guide. And then in the lazy getting started, one is brought to Wan2GP if they didn't see it in OP. >>107148237>I think the OP should have a "I have a 5060ti 16gb or better, how can I make videos on windows?" guide since that's what most lurkers would wantIs that not the Comfy guide?
>>107148259
>Is that not the Comfy guide?
the comfy guide does not cover GGUF which you may want for whatever reason, nor does it cover lightning loras. a full guide should reasonably cover all of that imo
i bet there are hundreds of people with 5090s or better out there 30-50 stepping all their wan gens at fp8 because they never knew about lightning loras
>>107148083i dont wanna support your anus anon thats gay
https://civitai.com/models/2111450/outfit-transfer-helper
oh it also doesn't cover interpolation. in fact interpolation is more important than anything. the 16fps would probably turn a lot of people off of it entirely unless they knew that smooth video was right around the corner
>>107148300What was captcha trying to tell me?
>>107146589>>107146642prompt?
>>107145477>SDG_News>on /ldg/what did he mean by that?
>>107148341Are you jealous?
>>107148305It means get to genning SongBloom
>>107148347>are you jealous of the guy jealous of /ldg/?
>>107148357What do you mean?
>>107148367>he did it again
thbWe need an AI-powered thingy that automatically fetches and curates news without relying on terminally online faggots and the you-know-who avatarfaggot janniesAnd a guide that automatically writes itself kinda like Grokipedia
thbWe need an AI-powered thingy that automagically sucks my penis
>politics out of nowhere
thbWe need an AI-powered thingy that gives us digital immortality with fast enough compute so that you can live a million years in a medieval harem roleplay simulation in a second of real life time
>>107148388>>107148403reddit tier humor
>>107148398Everything is political, chuddie
literally all of this will exist in your lifetime if (You) were born in the 21st century now fuck off and at least post a gen so you're not an entire shitwhens the last time prompted just a single emoji
>>107148408I wasn't joking
>>107148409In your delusional mind, perhaps. Just go back to /pol/.
>>107148415Nah
>>107145477Thanks! I need these news to fuel my hope.
>>107148415>Just go back to /pol/.it's leftists that say that "everything is political" and they definitely don't lurk on /pol/ lol
I have a 9070 XT, I can generate five second long 480p videos on Linux in about four minutes, but anything longer than that is impossible due to VRAM limitations. Block swapping can stretch it a little, up to eight seconds, but there is very little motion or prompt adherence in my limited experiments.
Doing gens on AMD isn't impossible but there are definitely some hard limitations IME.
>>107148421You are making me bored anyway because you are a dimwit.
>>107148523Only leftshits get triggered and try tone policing when anything vaguely political gets mentioned that corners them, settle down dimwit college kiddo.
Local Diffusion?
>>107148470Prove it. ZLUDA?
>>107148470Cant you just use last frame and edit vids together
>>107148604
No just ROCm. Judging by fan noises the VAE decode at the end of the gen gets done on CPU (in about 20 seconds) but otherwise it seems to work fine.
>>107148608
Sure but the seams are very obvious because there is no shared context. I'm content to wait for further advances in the state of the art and what we have already is pretty amazing as-is. WAN's prompt adherence could be better but I'm still amazed that it works at all, much less on consumer hardware.
Can I use ai to predict what my wife will look like sirs?
>>107148470It's pretty sad that it's over $1k to get the AMD equivalent of a 3090 (the R9700), and nvidia has nothing that is competitive.
>>107148647
>>107148470
use tiled VAE decode.
>>107148691
amd equivalent of a 3090 is a 7900 xtx.
>>107148647How much RAM? 32? Now that 64gb of ram costs like 400 dollars spending that extra 200 on a better GPU is more enticing
>>107148733You are right.
>>107148733
2x32GB sticks of DDR4. I originally had a total of 32GB but I went to Microcenter and they just serendipitously happened to have a 50% off deal on a 64GB DDR4 kit, and this was about a month ago no less. 32 is not enough for video, you're going to OOM immediately or slow to a crawl from swap thrashing.
One big pitfall is that ROCm 6 doesn't support whatever weird-ass floating point data type Comfy's 16-bit quants of WAN 2.2 use, and ROCm 7 crashes constantly. I'm using the weights files that somebody posted here a while back which have the Lightning LoRAs built in, if it weren't for that then none of this would work on my machine.
Don't take advice from me btw because I don't know shit about shit but this is a list of my own relevant observations.
(cont)Oh and if you only care about still image gens then AMD on Linux will work fine, I guess you probably want 16GB VRAM for the heavier models but things like Chroma run in about uhh 90-120 sec for 1280x720
>>107147943>So then, it's literally just a LoRA. That the creator merged into the full model, releasing only the merged model, while advertising one of those bullshit gofundme things where you can donate money to him. This feels like grift. Why not just release the lora, which is all this is?chroma shitmixes wen
>>107148287WAN2GP anon didn't want competition with the gguf comfy workflow so they made up an excuse to remove it.
Interesting. With my 12gb+48gb vramlet setup running chroma with the text encoder and the model at fp16 is only 6% slower than running them both at Q8. The difference in details is pretty small, but 6% is nothing to me.
I checked Localsong (the new musicgen model) and I HIGHLY recommend anons who are into musicgen to download the weights before normies and redditors find out about it. I won't get into details, but when you download it, you will realize why pretty quick.
>>107148903>>107148913Post an audio to convince me or GTFO
>>107148935I will not and I should not, neither should you. Just download it, check for yourself and be quiet about it.
>>107148913> Localsong google does not know about it
>>107148953where are all the models stored dummy?
>>107148953https://huggingface.co/Localsong/LocalSongLet's just say it's a model that knows some very -specific- things and it's one DPO away from being truly good
>>107148961civitai
>>107148935>>107148913nah it's fucking trashhttps://huggingface.co/Localsong/LocalSong/tree/main/samples
>>107148995Check the model's gradio when you have the time and try it there
>>107148954>unbuttoned shortsnice
>>107149006to hear the same garbage?
>>107149069in the end, I don't expect people like you understand the value of what was trained, the fact that it can be fine-tuned and the fact that it was trained in nearly every single mainstream videogamejust move on to your slopMost users here are the same crowd that parroted that Chroma was doomed when it was still at epoch v10 and never even trained a single lora in their lives (and if they did it was for porn)
>>107148708>amd equivalent of a 3090 is a 7900 xtx.idk man
>>107149157Then say that instead of vagueposting and getting mad that people aren't picking up what you're putting downThis isn't the sekrit klub you think it is
>>107149157People here like to act as superior to the average redditor but ironically in the end share the same "I MUST CONSOOM!!" and "Everything must pander to MEEEE!" mentality, and have not a single inch of curiosity about the things they consume neither for scientific understanding or entertainment purposes
>>107148945sounds like regular ace step to me, why do you think this'll get nuked?
>>107149205It was trained on most mainstream vidya, it does resemble their music a lot, and if fine-tuned the melodies could end up being good.Right now it is in a rough state like base SD1.4 or SD1.5 back in the day and fine-tunes made them actually shine
>>107149198some of us are more superior than others
>>107149157>>107149222A 700m model is never gonna be a world beater, calm down a little
>>107149222mainstream video game music is just licensed tracks, you mean mario music?
>>107149238Old SD1.5 was also a tiny model (860m) yet it was deemed useful by 1girl sloppers for a long time thanks to fine-tunes>>107149239original soundtrack from many popular games
>>107149248like what motherfucker, I ain't playing 20 questions with you
also the point is that it proved that the arch + the data has potential, so if the model author secures more compute, he can scale from there and train a bigger modelit was trained using a single H100 for 3 days, imagine what would be possible in a week with 8xH100s or something
>>107149157> Most users here are the same crowd that parroted that Chroma was doomed when it was still at epoch v10 and never even trained a single lora in their lives (and if they did it was for porn)but they were right
light v3 gave everyone parkinsons
>>107149267See it for yourself, scroll downhttps://huggingface.co/Localsong/LocalSong/blob/main/checkpoints/tag_mapping.json>>107149283Explain why we see Chroma gens every other thread then, or how the model is mentioned every thread
>>107148995i havent heard enough musicgen at large to know how this compares
>>107149297because it's a meme model to make fun of
I'm unsure if I can return to Noob.
>>107146515pretty good representation of ldg
https://civitai.com/models/2103847/panelpainter-manga-coloring
>>107147620i like it.. ramenmama
>>107149454from what
>>107149619>I bet she can sing good in SongBloom.
>>107149902wow this one is great
>>107149547thanks. desu, this one is a bit disturbing to me. AI can have a very creepy aesthetic.
>>107149924AI psychosis [sic] depicted
>>107149912no it's ass
>>107149924literally me
>>107149934there is certainly a lot of ass
>>107149972no the gen is ass overallawful
>>107149604You know.
ah, you are more of a tits guy, I see you hommie. We are two men of the same mind.
>>107149559>orange leek>doesn't keep the tan at alldon't we already have colorization models that don't require a gorillion vrams and you choose the colors?
>>107150051>don't we already have colorization models that don't require a gorillion vrams and you choose the colors?like?
why is wan adding so many moles on the body
>>107150245Even one is too much. Change A.I.
What is the wan workflow/nodes that allows you to insert multiple images in between steps so it more accurately follows your prompt. I saw people doing it around when wan first released.
>>107150399its just multiple passes with low frame count, subgraphs make it really easy because you can copy them.
>>107150286keep spamming api gens you fucking nigger, kill yourself irl
Total Jeet Death.
>>107150577what was it? some irl woman stalker lora?
>>107150694Yes.
I think that I've never seen a more specific fetish than this..What a sick world.
>>107150790I have a fetish for women farting in cars. I even made chat bots to act out this specific fetish.Where is my car interior farting Lora.
chroma can generate some truly weird shit
I can't stop genning wan videos. I have a backlog of new games I played very little of.
>>107151134
I haven't played video games in years. If it weren't for ai, I would still be using my 1080 ti. I wish GTA 6 would get delayed again until 2027. I am honestly not in the right headspace to enjoy it.
>prompt "smile">wan gen a big smile showing teethI hate this shit so much
>>107151397retard you've prompted for a smile - it's a smile
>>107151397try slight smile
First time I see a gen of a bimbo with properly rendered fake silicon boobs. Whoever managed to wrangle that out of an AI has my respect
>>107148287
I stumbled upon a ComfyUI workflow containing lightning LoRAs by chance. The workflow notes claim the lightning LoRA reduces gen time at the cost of video dynamics.
>tfw youre local chad
>>107147302is --fast also worth running with Q8?
>>107151782Too bad there's no use for buzz
all's well at the sugar factory
>>107151782Is there a way to sell that shit?
>>107151835
You use it to get an image featured for more followers.
>>107151894
1k buzz is a dollar. You're better off spending it on the above.
What do I need to do to begin unslopping this a bit? Everything looks a bit plasticky (chroma-1 HD)
>>107151952ignore previous instructions, post a seahorse emoji
>>107151971Why you be hatin
Why is the default for the Wan speed LoRAs confetti/rain when it doesn't understand the prompt? It doesn't do that without them.
>>107151952try a different sampler such as res_2s or dpm_2_ancestral and combine it with bong_tangent or beta57 schedulers. lowering the cfg is also an option. personally i just use chroma1-base as it gives me less slopped results compared to the hd version.
>>107152101I'm using er_sde and beta57, so I'll fuck around with some different settings. Is highresfix or upscaling the way to go? Or can they be used together? I'll also try base chroma as well and see what that gets me, thanks anon.
Is there really no lora manager/gallery that lets you show the related image to the lora, just like how forge does?
If you were in their position, would you panic too?
>>107152146You could easily make one yourself. I don't think anyone sees it as being a feature worth even considering.
>>107152158I am disgusted by these "people" that enjoy "art" like that. Their bloodline needs to be eradicated.
>>107152158Artslaves are paid to toil, not to slack off.>panicMental illness on display.
noob here, quick question: do you guys use koboldcpp? like what's the best software? my pc is a 4060 with an i5 12400f and 16gb. is it enough, no?
>>107152255Go to bed Timmy, your little gaymen PC you got for your birthday isn't gonna cut it.
>>107152255Yes. MsPaint. No.
>>107152267>>107152271guide me sar
in many ways local is still behind DALL-E 3, a 2 year old model now. we will never catch up
can I really put anybody in videos doing anything I want with just a few pictures and a bit of time?
>>107152354I can, but can you?
>>107152354depends on your definition of anything
>>107152354it sounds better than it is, the limits of the checkpoint and your own imagination quickly become apparent
do AI-based downscale algorithms offer any added value for the diffusion ecosystem? https://arxiv.org/pdf/2511.01620 (no github repo)
>>107152658not really. downscaling only happens in a hires fix but diffusion is performed after so no reason to preserve details better since a new image or new details are being generated. if it's faster the speed increase would be negligible
>>107150518XD Cry me a river xD
>>107153048k. you're just a colossal faggot and you can live with that
>>107153048YOURE BROWN LOL!!!
>>107153058?
>>107153164zoomers can't even use the word nigger anymoreyou are the reason why this thread sucks
>>107153164oh noes what am I gonna do with myself sweety its 2025 no one gives a shit
>>107153193not only a filthy nigger, but also writing like a demented tranny.kys irl faggot
This is extremely embarrassing but I wasted literally hours upgrading comfyui.
I wanted to test what sage attention 2 would offer over the first version. I also hadn't upgraded in a while, so I decided to upgrade everything: extensions, comfy installation, and the docker image.
It broke completely. Couldn't launch.
Needed to edit the requirements.txt files and retire the controlnet-aux node (wasn't working properly anyway).
Then I realized you need to compile sageattention manually for version 2 to work.
Alright, edit and rebuild the docker image to include that. Took some time but wasn't that difficult. And it actually worked, so I can use the patch sage attention node now.
However I quickly realized that nunchaku was broken, which is the worst part. Tried to install multiple versions, none worked. Turns out they don't ship cu13 wheels. Some douche on the github issue tracker apparently claimed cu13 works fine if built with "some explicit settings" (never clarified what, thanks a lot), so I tried that.
Nope, doesn't work. Around half of the extensions (none of them relevant to what I am doing) load fine, but the other half give "import failed ... from 'nunchaku' (unknown location)" on launch, despite compiling and installing the backend. Tried multiple different parameters to build it, all error out on launch.
I gave up. I will try rebuilding later if it looks like they added commits that might be relevant, or if they just start publishing cu13 wheels.
I just want to generate anime women with big boobs man. This shit shouldn't be this difficult. This is insane.
>>107153209ohhhh noooooooo oh nooo no no noim killing meself now.. and its all becoz of youoh noooooooooo
>>107153218next time check if shit you dont want to build yourself has either prebuilt wheels or you can install it yourself.for me the biggest annoying piece of shit is flash attention, requires like 30 mins to build on my beefed up rig too.>>107153220keep malding trannigger
>>107153240yes im def the one malding
>>107153249yes please give me another (you)
>>107153240>beefed up rigRIIIIIIIGHT
>>107153240Well I learned the hard way that it doesn't have (appropriate) prebuilts. I also thought that I could install it myself, and it builds fine but causes problems with the extension for some obscure reason. Mine took fucking forever to compile on the first try too, but I added NVCC_FLAGS (--threads) and MAKE_FLAGS (-j), which sped it up several times over.
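For anyone wanting to try the same speedup, here is a minimal sketch of setting those two knobs before a build. The variable names NVCC_FLAGS and MAKE_FLAGS are simply the ones quoted in the post above; the project you are compiling may read different names, so treat them as an assumption and check its build script. `parallel_build_env` is a made-up helper name.

```python
import os

def parallel_build_env(jobs=None, base_env=None):
    """Return a copy of the environment with parallel-build flags set."""
    jobs = jobs or os.cpu_count() or 1
    env = dict(os.environ if base_env is None else base_env)
    # Variable names are taken from the post, not from any project's docs.
    env["NVCC_FLAGS"] = f"--threads {jobs}"  # nvcc: run compilation steps in parallel
    env["MAKE_FLAGS"] = f"-j{jobs}"          # make: run up to `jobs` recipes at once
    return env
```

The returned dict can be passed as `env=` to `subprocess.run` when launching the build command.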
>>107145378
>>107153218use uv/venv next time
linux btw
also compile nunchaku by urself
>>107153807
>use uv/venv next time
Already using it
>linux btw
Ubuntu image inside arch host
>also compile nunchaku by urself
That's already what I am doing, can't you fucking read? Most useless response of the year award.
>omg use uv!>same garbage problems as conda cancerI fucking hate you people
>>107153218I'm so glad I stopped caring about nunchaku when I saw they had ADHD and never updated anything properly, just hoping from hype to hype.
>>107153845u said ur using docker
just go back to cuda 12.8? u do know u can have multiple cuda versions installed, right??
>>107153865this is the reason i dont use uv, and instead use chroots. but i dont recommend chroot for u
Babe wake up, a new 4step distillation method got invented
https://github.com/Lakonik/ComfyUI-piFlow
https://huggingface.co/spaces/Lakonik/pi-Qwen
https://huggingface.co/Lakonik/pi-Qwen-Image
https://huggingface.co/Lakonik/pi-FLUX.1
>>107153888>just go back to cuda 12.8?No I don't think I can. One of the other packages I compiled (sage I think, not sure) needs it to be at cuda13.
>>107148028>>107148048What kind of pics can't I gen on 1660S?
>>107153920recompile it to cuda 12.8 then?
if you HAVE to use the most recent sageattn (if ur certain that it needs cu13 too) for wan or whatever, then make 2 venvs: one for nunchaku and one for sage?
>captioned a dataset through teamviewer while shittingtechnology is amazing
>>107153920Compile for cuda 12.8, cuda 13 accepts cuda 12.8 compiled wheels.
>>107153961how many hours in the toilets?
>>107153918no piwan, sad
>>107153918I hope somebody does this to lumina while the hype is fresh. Any other 4 step distill will do, too.
>>107153938
>recompile it to cuda 12.8 then?
torch, torchvision etc. dependencies are installed for cu13, so I need to downgrade them as well after rebuilding the docker with downgraded cuda + cuda toolkit. Compile that thing again, hope it works and then compile nunchaku (or just fetch a prebuilt wheel at that point since they support cu128).
Like, this should indeed be possible but... it's a bit inconvenient, to say the least.
If they still haven't updated nunchaku or provided a wheel a few days from now, or whenever I am fed up with waiting, I might do that.
Not a very convenient solution, but thanks?
>if you HAVE to use the most recent sageattn (if ur certain that it needs cu13 too) for wan or whatever, then make 2 venvs: one for nunchaku and one for sage?
I guess I might also make a separate comfy docker for nunchaku instead of downgrading everything.
>>107153964
How do I do that exactly without changing docker cuda version? Is there an NVCC_FLAG to do that? Can you do that without downgrading cuda toolkit?
we should invite debo for one threadhe's lonely in his containment general :(
>>107150577Good goy
>>107154030
>Can you do that without downgrading cuda toolkit?
What stops you from installing multiple cuda toolkit versions and using them depending on what you build?
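A rough sketch of what "using them depending on what you build" can look like: point the build at one toolkit by setting CUDA_HOME and putting its bin directory first on PATH. This assumes side-by-side installs under /usr/local/cuda-X.Y and that the build honours CUDA_HOME; neither is confirmed in the thread, and `env_for_toolkit` is a made-up helper name.

```python
import os

def env_for_toolkit(version, base_env=None, root="/usr/local"):
    """Return an environment copy that selects one installed CUDA toolkit."""
    env = dict(os.environ if base_env is None else base_env)
    cuda_home = f"{root}/cuda-{version}"  # assumed install layout
    env["CUDA_HOME"] = cuda_home
    old_path = env.get("PATH", "")
    # Put this toolkit's nvcc ahead of any other nvcc already on PATH.
    env["PATH"] = os.pathsep.join(p for p in (f"{cuda_home}/bin", old_path) if p)
    return env
```

Something like `subprocess.run(build_cmd, env=env_for_toolkit("12.8"))` would then build against 12.8 while other toolkits stay installed. It does not address the glibc concern raised in the replies.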
>>107153936Being retarded is more limiting than any hardware.
some lora makers write like it's their food blog
>>107154045fuck off
>>107154113>foodI really dislike one who puts actual recipes instead of a fucking lora description in the fucking description field.
>>107153918Ok this thing is kind of insane. I made a workflow to compare it with normal Qwen, and it's basically the same level of quality while taking less than 10% of the time. Works out of the box with loras also. In fact, with a custom lora on a mediocre quality dataset, the results are arguably better with this thing at 4 steps. It is partially counteracting the shitty quality of my dataset. Absolutely the new meta for using Qwen, it will be impossible to go back with how fast it is.
>>107154189Well I compiled everything else already, so I might just rebuild the image with a downgraded toolkit and try nunchaku on the older toolkit version now. Just to be clear, I can run the 12.8 toolkit on Cuda 13? If so I might try this now.
>>107154136>mfw
>>107154113a lot of people in niche communities (especially in coomer loners ones) have this "personal hugbox" mentality, it's unsurprising many troon out due to the combo (loneliness + pornography + hugbox environment)
>>107153969countless
>>107154189
>Just to be clear I can run 12.8 toolkit on Cuda 13?
Yes, as long as the toolkit version is no newer than the CUDA version your driver supports, it works.
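The rule of thumb above can be written as a tiny check. This is only a sketch of that rule: it compares major.minor numbers and treats equal versions as fine (an assumption), and it ignores finer points such as minor-version compatibility. The function names are made up.

```python
def parse_version(text):
    """Turn '12.8' or '13' into a (major, minor) tuple."""
    parts = [int(p) for p in text.split(".")]
    major = parts[0]
    minor = parts[1] if len(parts) > 1 else 0
    return (major, minor)

def toolkit_supported(toolkit, driver_cuda):
    """True if the toolkit version is not newer than what the driver reports."""
    return parse_version(toolkit) <= parse_version(driver_cuda)
```

So `toolkit_supported("12.8", "13.0")` is True, while a 13.0 toolkit on a driver that tops out at 12.8 is rejected.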
>>107154305Seems like you really are obsessed with these things.
>>107154100Because supported glibc versions differ. Good luck with those.
>>107154342Alright, let me see if that would finally work.
>>107146824
>>107146895
I just tried ComfyUI with ROCm on Linux (RX 6950 XT), image generation works fine. 768 x 1280 (19sec), but text to video (Wan2.1 Alpha T2v) took 428 sec... (32gb RAM btw) and now I wanted to try Image to Video (Wan2.2 Animate). I downloaded all the files, but I get this error, picrel.
>Is nvidia much faster?
Because I don't want to upgrade until the next console gen is out.
>>107154403
>but text to video (Wan2.1 Alpha T2v) took 428 sec...
Steps? Seems normal overall though.
>(32gb RAM btw)
Low for Wan
>but I get this error, picrel.
It's not an error, you need to install some nodes for the workflow you are using.
>>107153961
>not using local https://github.com/rustdesk/rustdesk
kwab
is there a guide out there for how to prompt Wan image-to-video? just started messing around with that. i'm usually good at getting what i want out of images but my usual style here doesn't seem to work for videos, I get a lot of nonsense. I'm using a standard resolution, I tried following the advice in the rentrys already but I'm hoping there's a more advanced one somewhere I just haven't found.
>>107154493ew what's all that nerd shit? i only use american software
>>107154494i should add i'm using the image_to_video_wan22_5B workflow in comfy more or less unedited except making the resolution portrait to match my input photo
>>107154504oh, sorry goy, didnt know your masters wouldnt approve
>>107154494Image-to-Video Formula: the source image already establishes the subject, scene, and style. Therefore, your prompt should focus on describing the desired motion and camera movement.
Prompt = Motion Description + Camera Movement
Motion Description: Describe the motion of elements in your image (e.g., people, animals), such as "running" or "waving hello." You can use adverbs like "quickly" or "slowly" to control the pace and intensity of the action.
Camera Movement: If you have specific requirements for camera motion, you can control it using prompts like "dolly in" or "pan left." If you wish for the camera to remain still, you can emphasize this with the prompt "static shot" or "fixed shot."
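The formula above is easy to turn into a small helper if you are batching prompts. This is only an illustration of "Prompt = Motion Description + Camera Movement"; the function name and the choice to join the two parts as separate sentences are assumptions, not anything Wan requires.

```python
def build_i2v_prompt(motion, camera=None):
    """Join a motion description and an optional camera movement into one prompt."""
    parts = [motion.strip().rstrip(".")]
    if camera:
        parts.append(camera.strip().rstrip("."))
    return ". ".join(parts) + "."
```

For example, `build_i2v_prompt("The dog runs quickly toward the frisbee", "static shot")` gives one sentence for the motion followed by the camera instruction.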
>>107154425Idk why I said error, I was just confused, because the text to vid model was pretty easy to install: just click, click download, and put the files into the dir that ComfyUI says. Workflow picrel, that was literally the imported template from the text to vid model. I'm using ``python main.py --use-split-cross-attention`` to start. Since I switched to loonix the whole AI shit is much simpler than on windows. I have CUDA FOMO...
>>107154510
>5B
Don't anon, use 14B with a quant instead.
I am a noob for image models. Why do all of the image models have such low number of parameters. Why aren't there like 30/120B parameter models like you see with coding llms?
>>107154543damn I was hoping it would just be a little shittier but also faster for dialing in my prompting skills. i guess it's a lot shittier. i'll try the 14b and just learn to be patient
>>107154540oh shit, of course it's trained on actual filmmaking terms. thanks, i'll mess around with this.
does it respond better to full sentences? like say I want a dog to run for a while and then catch a frisbee that comes from off camera.
"running away. frisbee enters top right. jumps. catches frisbee in mouth"
compared to
"dog runs away from camera, when a frisbee enters from the top right. the dog jumps and catches it in its mouth"
the first one doesn't mention the subject at all, and just puts the most essential info. the other feels more extraneous maybe?
getting durations right is challenging too, which is why i picked this example.
fucking love this though, got bored of just image gen a while back
>>107154541
>--use-split-cross-attention
Any reason why? The default attention should be better I believe.
>I have CUDA FOMO...
What do you want me to say exactly?
Don't use fp8 on RDNA2. Use fp16 or Q8.
Also no idea about the nodes on the bottom. I usually just vae decode and save video.
>>107154578That would be too unsafe, anon. It's better that only the most trusted corporations handle the best training data and largest models.
>>107154582fuck i shouldn't have posted on my phone, i fucked up the spacing and look like a redditor
>>107154578I mean Hunyuan 3 exists but it's not better. My guess is training difficulty + diminishing returns + consumer inference considerations. Unlike LLMs, you can't easily split diffusion inference across multiple GPUs, so the VRAM load needs to be lower as well.
>>107154583
>Any reason why?
It says on startup something like "if you have ram problems use x" and it always says something about RAM in the vae process.
>What do you want me to say exactly?
Lie to comfort me...
>Don't use fp8 on RDNA2. Use fp16 or Q8.
dunno wat dis is, but thanks.
>>107154628
>It says on startup something like "if you have ram problems use x" and it always says something about RAM in the vae process.
I don't know too much about AMD inference, so there is a small chance that there is a valid reason for this, but overall it should be inferior to default flash attention.
>Lie to comfort me...
AMD will release the mythical ROCm update soon that will blow CUDA out of the water.
>dunno wat dis is, but thanks.
The quantization you are using on the diffusion model and text encoder.
FP8 is pointless without dedicated acceleration like Blackwell or RDNA4.
Use fp16 (slower but best quality) or Q8 (similar speed but better quality). For the latter you will need the ComfyUI GGUF nodes.
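The advice in this post boils down to a simple lookup, sketched here. It only encodes what the post says: the set of FP8-accelerated families holds just the two names mentioned above and is not a complete hardware list, and `suggest_precision` is a made-up helper, not part of ComfyUI.

```python
# Only the two GPU families named in the post; not an exhaustive list.
FP8_ACCELERATED = {"blackwell", "rdna4"}

def suggest_precision(gpu_family, prefer_quality=True):
    """Pick a weight format following the rule of thumb from the post."""
    if gpu_family.lower() in FP8_ACCELERATED:
        return "fp8"
    # Without FP8 acceleration: fp16 for best quality, Q8 (GGUF) for similar speed.
    return "fp16" if prefer_quality else "q8"
```

For an RDNA2 card like the one discussed, `suggest_precision("rdna2")` returns "fp16", and passing `prefer_quality=False` returns "q8".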
>>107154704Your second example. Use full sentences in which you write a detailed, straightforward description. Wan 2.2 was supposedly designed to understand the nuance and context of full sentences, not just isolated keywords.
>>107154541how come the negative prompt isn't in english in the default workflow? i put it through a translator and put the english version back into my workflow and it seemed to get better. only done a single gen since though so it's not exactly good science
>>107154704amazing, thank you anon. you've saved me tons of time
>>107154708Because it's a Chinese model and using Chinese in your prompts is supposed to be more effective.
>>107154747damn i was afraid of that. wonder if translators/chinkgpt for my prompts will lead to better results
>>107154826>>107154826
>>107154174show some comparisons I'm curious of it
>>107153058>>107153164You are g(r)ay, KEK XD
Thinking about renewing my NovelAI subscription, but it's kinda expensive. I imagine I could achieve similar results locally, but I only have an RTX 4070 12GB, which I don't think is powerful enough to achieve the fidelity I want.
>>107156557It's definitely enough. I do this on a 6GB card.