Discussion and Development of Local Image and Video Models

Previous: >>108951930

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, & Upscalers
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/tdrussell/diffusion-pipe
https://github.com/kohya-ss/sd-scripts
https://github.com/kohya-ss/musubi-tuner

>Z
https://huggingface.co/Tongyi-MAI/Z-Image

>Anima
https://huggingface.co/circlestone-labs/Anima
https://tagexplorer.github.io/
https://animadex.net

>Qwen
https://huggingface.co/collections/Qwen/qwen-image

>Klein
https://huggingface.co/collections/black-forest-labs/flux2

>Wan
https://github.com/Wan-Video/Wan2.2

>LTX-2.3
https://huggingface.co/collections/Lightricks/ltx-23

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Collage: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/b/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
>>108958327Elephants got legs like Kevin R. Nash
Blessed thread of frenship
>>108958327>>Maintain Thread Quality>https://rentry.org/debo>https://rentry.org/animanonThis is like a troon tramp stamp. Guaranteed to have melties about "Julien" and "Nik". Debo standing by
Could someone improve the quality of this with img2img?
I want to print it on a shower curtain. Time is of the essence, I have to go in like 15min.
Glad to see lilbro is back to his regular seething I was worried where he went
>>108958356basado
what's stopping (You)?
https://civitai.red/creator-program
>>108958370I could but I won't
I hate comfyUI because 1/4 of times it crashes.
I think comfy is fine I don't think much about it at all
>>108958458I don't like the site and how coomer heavy it is
>>108958461Too bad, order placed
>>108958327
Thank you for baking this thread, anon
>>108958345
Thank you for blessing this thread, anon
>mfw Resource news

06/01/2026

>Bernini Latent Semantic Planning for Video Diffusion
https://bernini-ai.github.io

>NVIDIA Launches Cosmos 3, the Open Frontier Foundation Model for Physical AI
https://nvidianews.nvidia.com/news/nvidia-launches-cosmos-3-the-open-frontier-foundation-model-for-physical-ai

>LVSA: Training-Free Sparse Attention for Long Video Diffusion
https://github.com/JiusiServe/LongVideoSparseAttention

>RayDer: Scalable Self-Supervised Novel View Synthesis from Real-World Video
https://compvis.github.io/rayder

>DecMem: Towards Minute-Long Consistent World Generation with Decoupled Memory
https://jeffreyyzh.github.io/DecMem-Page

>Lumos-Nexus: Efficient Frequency Bridging with Homogeneous Latent Space for Video Unified Models
https://jiazheng-xing.github.io/nexus-lumos-home

>Envisioning Beyond the Few: Disentangled Semantics and Primitives for Few-Shot Atypical Layout-to-Image Generation
https://github.com/iCVTEAM/DSP

>PEEK: Picking Essential frames via Efficient Knowledge distillation
https://github.com/momentslab/peek

>CameraNoise: Enabling Faithful Camera Control in Video Diffusion through Geometry-Flow-Guided Noise Warping
https://gulucaptain.github.io/CameraNoise

>Nvidia unveils new superchip to bring AI functions into personal computers
https://www.cbc.ca/news/business/nvidia-ai-personal-computer-9.7218820

>Qwen3.7-Plus: Multimodal Agent Intelligence
https://qwen.ai/blog?id=qwen3.7-plus

05/31/2026

>FLUX Identity Adjuster (V2)
https://github.com/Magirad/Flux_ID_Adjuster_V2

>ComfyUI AnimaFastTrain
https://github.com/quinteroac/ComfyUI-AnimaFastTrain

>MONET: Open-source dataset
https://huggingface.co/datasets/jasperai/monet

05/30/2026

>Pixal3D — Apple Silicon (MPS / Metal) Port
https://github.com/pawel-mazurkiewicz/Pixal3D-mac

>Comfy-Org/PixelDiT (diffusion models & upscalers)
https://huggingface.co/Comfy-Org/PixelDiT/tree/main/diffusion_models

>Orion4D Generative Paint
https://github.com/orion4d/Orion4D_generative_paint
>mfw Research news

06/01/2026

>DTG-Restore: Training-Free Diffusion Refinement for Generative Video Super-Resolution
https://arxiv.org/abs/2605.30431

>TunerDiT: Training-free Progressive Steering of Diffusion Transformer for Multi-Event Video Generation
https://arxiv.org/abs/2605.31590

>SlotMemory: Object-Centric KV Memory for Streaming Long-Video Generation
https://tj12323.github.io/SlotMemory

>SANA-Streaming: Real-time Streaming Video Editing with Hybrid Diffusion Transformer
https://arxiv.org/abs/2605.30409

>OmniMem: Scalable and Adaptive Memory Retrieval for Long Video Generation
https://wuyushuwys.github.io/OmniMem

>Robust Dreamer: Deviation-Aware Latent Gaussian Memory for Action-Controlled AR Video Generation
https://arxiv.org/abs/2605.30855

>Mitigating Content Shift and Hallucination in GenAI Image Editing via Structural Refinement
https://arxiv.org/abs/2605.30437

>Parallel Tempering Initial Sampling in Inference-Time Reward Alignment
https://arxiv.org/abs/2605.30991

>Benchmarking and Enhancing Text-to-Image Models for Generating Visual Representations in Early Arithmetic Education
https://arxiv.org/abs/2605.31212

>Benchmarking Single-Step Inpainting Methods for Multi-Object 3D Gaussian Splatting Scenes
https://arxiv.org/abs/2605.30987

>MergeTok: Unified Continuous and Discrete Visual Tokenization via Token Merging
https://arxiv.org/abs/2605.30904

>Guidance for Low-Level Perceptual Editing in Unconditional Diffusion Models
https://arxiv.org/abs/2605.31162

>Representation Forcing for Bottleneck-Free Unified Multimodal Models
https://yuqingwang1029.github.io/RepresentationForcing

>A Unifying View of Variational Generative Wasserstein Flows
https://arxiv.org/abs/2605.31369

>Vision-Language Models Suppress Female Representations Under Ambiguous Input
https://arxiv.org/abs/2605.31556

>What Makes LVLMs Hallucinate Less? Unveiling the Architectural Factors Behind Hallucination Robustness
https://arxiv.org/abs/2605.30911
>Using comfyUI
>Crashes
>Loads workflow
>Missing model
>Download model
>0% for 20 minutes
>Restart
>Crashes
>Try to download model again
>Stuck at 25%
>You need this extension
>Git clone
>Doesn't work
>Crashes
I sure do fucking love comfyUI, which is not comfy at all
>>108958647Maybe go simp some youtuber and ask him to 'make' a frontend for you
>>108958661Fix your piece of shit software
>>108958458The website is total garbage and it is not worth engaging with the brown userbase for the sake of pennies they are paying.200$ is a normal salary in India apparently but nothing worthwhile where I live.You also need to game the system by spamming lots of poorly trained mediocre loras and jeetmixes to farm meaningful amount of buzz.Not to mention, I have no faith that the website will be around for long, or that they won't arbitrarily suspend caching out.
>>108958666?
>>108958697How is Civitai still standing anyway? It must cost millions a month to maintain it. Who is funding it?
>>108958753Andreesen Horowitz
>>108958629>>108958637thanks!
Can someone release a new realistic edit model. I can't with Klein's dog shit anatomy anymore. Zedit was a meme and will never come out, what now? is there anything coming soon?
>>108958884You haven't heard?
> >108958629> >108958637go back
Why did Z lie? Why do Chinese developers lie? Why not just not lie?
Someone let a bot loose in the previous thread let's pray they don't come into this one
>>108959079Let's take a look at a couple famous chinese sayings.>He who has never been cheated cannot be a good businessman>If you can cheat, then cheat>The first time you cheat me, be ashamed. The second time it is I who must be ashamed.
So, did anyone test the newest cosmos stuff?
I have a humble rig and don't want to hire a GPU on vast with the current inflated prices.
>>108959275Forgot to redirect the post, sorry >>108959236
>>108953392Catbox for Y'shtolaWhat model/Lora
>>108959178cool
>>108959275It's not like it's gunna blow any current image models out of the water right now. Maybe later thoughever.
>>108959275if its anything like the last cosmos then itll only be good once someone trains it with danbooru among other things...
>>108958923Haven't heard what, vaguepost king?
I'm trying to set up pixal3d in comfy and I'm going insane. There is always something breaking. Is there a guide or something?
I'm so tilted right now and I hate comfy with all my heart and soul I fucking hate it. I just want to use pixal3d.
>started training at 40 epochs>now extended 4 times to 100 epochs and probably counting because validation STILL keeps fucking dropping and samples STILL keep fucking improvingthe things one does to goon in peace
i'm scared
>>108959444Klein 9B is okay especially with a certain lora(s).
>>108959477>validation
ultra cozy mode engaged
https://huggingface.co/RuneXX/LTX-2.3-Workflows/blob/main/Video-2-Video/Extend-Any-Video/LTX-2.3_-_V2V_Extend_Any_Video_Multi-Extend_long_video.json
can extend any video and even clone voices, ltx 2.3 is pretty versatile.
https://files.catbox.moe/mmu2it.mp4
>>108959965https://files.catbox.moe/vr4m81.mp4
>>108959965Mia Yikk won, mikutroons btfo
>>108960103this is so true wtf
I managed to make pixal3d work. The models themselves are good generations, but the textures are fucking ass, especially the eyes. Any idea what I could be doing wrong? Because the example images and videos I've seen seem pretty accurate to the source image.
>>108960289post output? im just curious
cozy monday breas
https://desuarchive.org/g/search/text/mikutroon/It's all petra, isn't it?
>>108960409>only 8 pagesdollar store schizo
>>108960409>a single term in a joke already got a mikutroon panties up in a bunch to search the entire archives for it and do the raped schizo special of accusing all people who ever used that word to be the same personmost mentally sane and not tranny-like mukutroon behaviour
They just put cardboard cutouts of ghiblified characters in karate uniforms at the gym. Can't stop the slop.
>>108958647>>108958666I haven't used the regular frontend in months. No one should.
>>108958327>>108958327>Discussion and Development of Local Image and Video ModelsAND MUSIC!!!
anon?
comfyui actually crashes a lot.
usually you can fix the crashiness, but still...
comfy doesnt crash for me but i fuck up the venv a lot it seems
>>108961134Yeah, the main reason I have problems is rdna2 just barely works. It is only half supported.
>>108961134also, tweaking the settings, maybe expandable segments is a bad idea since it's not really supported on my card. idk I had to use it to run songbloom iirc
prompt from dalle thread:>>108935557
>>108959965
there are no shrimple straightforward workflows for ltx?
i checked some. they all look like "do not waste time with this".
actual new sota model, nai killercivitai.red/models/840276/
>>108961259that guy has a lot of neat ltx 2.3 workflows for diff tasks (video extend, custom audio, whatever).then I have a basic workflow for z image turbo, klein edit, and some other stuff. but most of the time I just mess with LTX 2.3 i2v, klein edit, or zimage if I want to make realistic stuff.
>>108961295what an interesting account https://civitai.red/models/2266799/heavens-gate-lets-start-a-vaporwave-ufo-cult
what is this shit? did civit went woke?
>>108961310ahhhh it's a rainbow gonna piss and shit all over myself
>>108961313you're the one who has to wear adult diapers buddy
>>108961320it's a rainbow icon get the fuck over it for real, it's unintrusive and at least they used a proper rainbow not the transbipoc flag
>>108961310>did civit went woke?like two years ago nearly
can we have a yuri thread or is 2girls too high effort for genners here
>>108961332was it then they banned celeb loras?
>>108961168>>108961148>>108961134looks like my "fixes" have been causing crashes. I was using a model unloader, idk, we'll see, but looks like a vanilla launch is working better, and without that node. again, we'll see lol
>>108961344It's really dumb, too, because we are one release away from 1shot 3d pose from a/multi photo.
>>108961335I posted it in the last thread and I got ignored so... RIP
>>108961346and. nope lmao
>>108961335>can we have a yuri threadwhat should the 2girls be doing?
>>1089613522girls is KEKED
>>108961335The last faggollage had two bros kissing !
>>108961310>did the website created by a gay man go wokehmmm is this a trick question?
nice to see ldg embracing pride month
jerking it to only lesbian porn and hentai this month in support
>>108961421what ever happened to the non-woke traditional gay men like this?
nofap 1 week world champion.
Was I supposed to laugh at that?
The laugh track is queued, don't worry about laughing.
>>108961313a tale as old as your average npc leftshit (underage)>"its just a rainbow chuddy, it literally doesnt matter!">ok>makes a client-side mod to remove it>"REEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE"almost like its not just a rainbow but a humiliation ritual those who you view as enemies have to accept or else get censored or banned
mikuroll, used the runeXX video extend workflow:https://files.catbox.moe/ev4rw4.mp4
>>108961335
>>108961620I'm pretty sure all he did was switch the game to use the Saudi Arabian version of the game for the flags. Wild that you could get banned for that. Meanwhile gays for palestine.
>>108961621Wow it retains the instrumental really well
lmao, it can sorta copy singing style if you extend too:https://files.catbox.moe/w9wp2h.mp4
>>108961714>https://files.catbox.moe/w9wp2h.mp4Neat
getting closer!https://files.catbox.moe/xt4kkw.mp4
works well if you set the frame load cap just after their dialogue to time it better:https://files.catbox.moe/prmbca.mp4
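A rough sketch of the arithmetic behind that tip, for anyone who wants a number instead of eyeballing it. It assumes the cap is counted in frames of the source clip at its native frame rate, and that you already know the timestamp where the dialogue ends. The function name and the padding parameter are made up for illustration, they are not part of any node.

```python
import math


def frame_cap_after(dialogue_end_s: float, fps: float, pad_frames: int = 0) -> int:
    """Return a frame count that ends just after a given timestamp.

    dialogue_end_s: time in seconds where the dialogue finishes (assumed known).
    fps: frame rate of the source clip.
    pad_frames: optional extra frames of breathing room (hypothetical knob).
    """
    if dialogue_end_s < 0 or fps <= 0 or pad_frames < 0:
        raise ValueError("timestamp and padding must be >= 0, fps must be > 0")
    # Round first so float noise such as 3.0000000000000004 does not
    # push the ceiling up by a whole frame.
    frames = math.ceil(round(dialogue_end_s * fps, 6))
    return frames + pad_frames


if __name__ == "__main__":
    # Dialogue ends at 4.0 s in a 25 fps clip.
    print(frame_cap_after(4.0, 25.0))
```

Rounding up rather than down keeps the last spoken frame inside the loaded range, which is the point of capping "just after" the dialogue.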
>>108961301i got few random ones from him to see what is up (including comfy default) and there are gazillion nodes.btw which model version is the best to run, original or kijai one?
ltx director is also fun. the node is like using premiere to add elements in the timeline:
https://files.catbox.moe/0v6ml7.mp4
>>108961922
im using 2.3 distilled fp8, seems fine
>>108961933better:https://files.catbox.moe/yrqacs.mp4
>>108961933watchout spielberg!
>>108961959well its better than nolan and tranny achilles thats for sure.
heh. miopen hip tuning...
>>108961933>>108961956ty.btw quality of the starfield guy is a bit off.i saw on leddit star trek tng vids where they sing 90s euro-dance songs. must be higher precision since quality is quite up there (sound is a bit off tho).
>>108961956less audio glitching than before
The letter... what letter was it I should type to represent the indian?
>>108961305Neil Breen avatar is killing me
>>108958647comfy is rightful autistic clay, made to filter out goycattle, if you're a neurotypical you belong on forge/reforgesimple as
>>108962560or to just figure out what you want to do in sd cpp. For example, preview as text to get the tensor of the sigmas from your favorite sampler :^)
>>108958647just download the weights and rawdog them with sd.cpp
https://github.com/VAST-AI-Research/TripoSplat
new gaussian splatting, anyone know the ram requirements?
if I want to train a LoRA for art style and then use that art style to make a VN I should train an IL LoRA right (because of Anima's shitty license)?
>>108962707anima license doesnt prohibit you from training your own lora, making images of it and selling the images, it only talks about preventing you from paywalling the lora itself without a special commercial license.
Is anyone here using pixal3d? Are you getting good models?
Has anyone had any luck installing Comfy on Linux (Bazzite if it matters) with an AMD GPU? For whatever reason I cannot get it to work, always get some error when installing dependencies/requirements. I searched around for help but everything's years out of date
>>108962742>Linux>AMDohnono
>>108962742>>108962685
>>108962747>>108962742>Shooting yourself in the foot for no reason
>>108962693wonky but fun, 40s on 8gb vram for the 3d model + a few more for rendering a microwave video
>>108962742
ROCm is annoying to get to work but yeah ubuntu + ROCm here and i run forge neo and comfy
just ask chatgpt for the steps necessary, it's annoying but doable
>>108962742
I bookmarked this a while back but haven't tried it. It's 9 months old, so maybe less outdated:
https://github.com/SiegeKeebsOffical/Bazzite-ComfyUI-AMD-AI-MAX-395/tree/main
>>108962742I do. What type of error are you getting?
I'm testing LTX2.3. There's no way to get rendering preview like WAN2.2?
>>108962742I have it running on Fedora, make sure to install rocm and that the versions align properly, fedora ships with 7.1 so you have to make sure you install comfy with the same version number
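A minimal sketch of the version check described above, assuming "align" means the major.minor of the system ROCm matches the HIP version the installed PyTorch wheel was built against. `torch.version.hip` is a real attribute (it is `None` on non-ROCm builds). The helper names are made up for illustration, and whether an exact major.minor match is strictly required is an assumption taken from the post, not something checked against the ROCm docs.

```python
import re
from typing import Optional, Tuple


def major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from strings like '7.1' or '6.2.41133-dd7f95766'."""
    match = re.match(r"\s*(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"cannot parse version: {version!r}")
    return int(match.group(1)), int(match.group(2))


def versions_align(system_rocm: str, torch_hip: str) -> bool:
    """True when both versions share the same major.minor."""
    return major_minor(system_rocm) == major_minor(torch_hip)


def installed_torch_hip() -> Optional[str]:
    """Return the HIP version of the installed torch build, or None.

    None means torch is missing or is not a ROCm build.
    """
    try:
        import torch
    except ImportError:
        return None
    return getattr(torch.version, "hip", None)
```

Usage would be something like `versions_align("7.1", installed_torch_hip())` once `installed_torch_hip()` returns a string. If it returns `None`, the venv most likely has a CPU or CUDA wheel instead of a ROCm one.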
>>108963259there is, check workflows in this repo >>108959965
I miss ouR guy.
How far is image diffusion from consistency? Like being able to create an original character and change the pose/view while retaining style, clothing and background.
I wouldn't mind trying to create a manga via imagegen but it seems like it would be a nightmare trying to maintain consistency between panels.
>>108963408I'm still here!
>>108963509Not you.
>>108963509oh thank god
>ITT: absolute lack of sfw vageendisgust
>>108958465so raped lmao
>>108963500most models are based on copyrighted material or established IPs so the only consistent thing you can get is the style of an artist
>>108963596>sfw vageenI don't think that's a real thing...
>>108963637>>108963217Hello catjack
>>108963774stop calling everyone this, why did one poster brain your brain so much lil bro
>>108963703>>108963538>>108963509qrd on whose face is that
>>108963509>>108963703got any more of them hi-res jennies?
>>108963799why do you pretend to be "anon" then reply to yourself? Are you that desperate to ruin the thread again?
Is anyone here using trellis or pixal3d? I need help pls
>>108963774>>108963825the raped
Is there a N64 lora for anima? Can't seem to find it on Civitai.
Is there a diffusion tech general that actually talks about the tech without screeching schizos in it?
>>108963881>>>/reddit/
>>108963880It's usually OOT
>>108963885no thanks. I want an anon forum
>>108963881It used to be this one, but avatartranny subhumans like Julien (which the general was specifically created to avoid) crashed in and never left
>>108963889Yeah, there isn't one yet. It does have a PSX one, but it doesn't look as good.
>>108963950just use the il version
>>108963970I have that one. I was looking for one for Anima specifically. It'll happen soon, it's a fairly popular lora.
>>108964056just saying there isn't really a difference if you were to post either
>>108964070pretty, better than most anima posts I see
>>108964075its MUGEN
>>108963509A man with my kind of taste :D o7I miss the 1girl poster with team Rocket costumes and rainbow haircuts
>>1089641180/10
>>108964070is this with a lora for a specific artist
Some loras don't handle foreshortening very well.
>>108964191Nope. few for style, brush and color
>>108963880https://files.catbox.moe/w57tcb.safetensors
>>108963804
My wife's!
>>108963806
I gave up on PiD for now, despite the fantastic color you got out of it because of the pixel space encoding. My workflow was getting pretty unwieldy. Adding PiD more than doubled the nodes in my workflow. I needed extra things to manage the resolution, I added a node to synchronize the five randomized seeds (it was a pain in the ass to freeze everything before I added that), redid my prompt inputs, splitting between the two sections more elegantly, and I had stuff in there to output an image, then resize before feeding it into PiD because of the latent input limitations (that sucked because of the color hit)... it was pushing Civitai/Reddit tier nonsense near the end. The final straw was fighting against tiling artifacts introduced by the "Context Windows (Manual)" node once I tried to optimize that (drove me nuts).
Anyway, that model needs work from someone much smarter and more patient than me to function in a high resolution workflow and it really needs better sigma tuning for those 4 steps, which is brutally low (an undistilled version would be even better).
>>108964312Holy based. Thank you, dude. Did you just make that?
>>108964327NTA but I'm leaning even harder into PiD, especially since those colors from the pixel space encoding are absolutely worth it. My workflow is actually feeling way more organized lately. Even the resizing before the latent input is a total non-issue because the color hit is basically nonexistent. This is miles above that usual Civitai/Reddit tier nonsense. It's actually been super stable for me.
test
>>108964389catbox of your wf?
>>108964422I'm afk now unfortunately.
ok i'm back from my ban. there's a pajeet in here. anyways, it's kino timehttps://files.catbox.moe/1viibc.mp4
>>108964389
Oh yeah, it was fast and stable on my end too, I just wanted to work at a higher resolution before upscaling with it. Z-Image Turbo was leaving way too much quality on the table working down at 1024px. One bonus of working at such a low resolution was that it was every bit as fast as my non-PiD workflow. So I got 52MP Jennies in the same 45-50s as I'd normally gen stuff.
Is there a way of knowing if a download is still going or if comfyUI crashed without staring like a retard for 20 minutes?
>>108964568As if there was a way to examine running processes and network usage. Too bad :(
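One low-effort version of "examining" the download, sketched under a big assumption: the download writes to a file on disk whose size grows while it is alive. The path is whatever partial file shows up in your models folder. The function names and the labels are made up for illustration, and this says nothing about how ComfyUI itself reports progress.

```python
import os
import time
from typing import List


def sample_sizes(path: str, samples: int = 5, interval_s: float = 2.0) -> List[int]:
    """Record the file size a few times, waiting interval_s between reads.

    A missing file is recorded as size 0.
    """
    sizes = []
    for i in range(samples):
        sizes.append(os.path.getsize(path) if os.path.exists(path) else 0)
        if i < samples - 1:
            time.sleep(interval_s)
    return sizes


def classify(sizes: List[int]) -> str:
    """Label a series of size samples as 'growing', 'stalled' or 'unknown'."""
    if len(sizes) < 2:
        return "unknown"
    # Any net increase over the sampling window counts as progress.
    return "growing" if sizes[-1] > sizes[0] else "stalled"
```

Calling `classify(sample_sizes(path))` on the partial file takes about 8 seconds with the defaults. "stalled" only means no bytes landed during that window, so a slow mirror can look the same as a crash. Check that the process is still alive before restarting anything.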
>>108964578Why do people always write and talk like this and excuse the lack of feedback with software? It's like you are allergic to frontend and interfaces
>>108964470trying to reverse-engineer this workflow. lmk if we're even close
>>108964612why? did he not share it?
https://huggingface.co/RuneXX/LTX-2.3-Workflows/blob/main/First-Last-Frame/LTX-2.3_-_FLF2V_First-Last-Frame.json
kek, pretty fun, first/last frame with ltx
https://files.catbox.moe/vht7hw.mp4
>>108964618he better not be sharing it, his workflow could easily sell for $300~$450 on the larger markets
>>108964618>>108964658he's said he's embarrassed to share it. It can't be too embarrassing. obviously his work speaks for itself.
>>108964658sounds like a tranny. i leave all my metadata in the files i put on catbox
>>108964672embarrassing how? i never used comfyui so i don't know what kind of sensitive information a workflow can hold
>>108964646better:https://files.catbox.moe/otslp4.mp4
>>108964681I think he's implying the comfy workflow is just a mess
i added sad music to the kino prompt to pay respects to all the dead niggas
seamless first/last framehttps://files.catbox.moe/i8l31t.mp4
>>108964803the neon sign changing from Hasan to Asmon while obscured is pretty great
T_Thttps://files.catbox.moe/h99ia8.mp4
>>108964814can use klein edit to make edit frames then toss it into first/last, many applications desu, fun model(s).
>>108964835for example, the clash of clans edit:the man puts on a gold crown and a large electric bolt strikes the dog on the left.https://files.catbox.moe/apus1q.mp4
>>108964803I'm so glad I don't watch or care about any of those guys. I have too much going on in real life to care.
>>108964839it's just entertainment desu, as far as stream goes asmon is entertaining cause it covers daily news/events and it isn't just another media shill saying "diversity good".in general political streamers are insufferable. but he does gaming stuff too.
try out ltx director, it's like adobe premiere editing but has prompt relay so you can make a bunch of stuff thats linked without external stitching or whatever:https://github.com/WhatDreamsCost/WhatDreamsCost-ComfyUI
>>108964943
I forgot to change the time to 10s (was 20s) so you get some interesting output when it doesnt match, kek
https://files.catbox.moe/961tpn.mp4
>>108964959make some action kinos
>>10896495910s, better:https://files.catbox.moe/teyyel.mp4
make a /vdg/ for video faggots
>>108965014For one schizo?
>>108965014Cry more
he's the jeet that doesn't have a good gpu for video
Is the raped retard catjak here? I have a question