AIDS and Vaseline Edition

Discussion and Development of Local Image and Video Models

Previous: >>108575392

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/tdrussell/diffusion-pipe

>Z
https://huggingface.co/Tongyi-MAI/Z-Image
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo

>Anima
https://huggingface.co/circlestone-labs/Anima
https://tagexplorer.github.io/

>Qwen
https://huggingface.co/collections/Qwen/qwen-image

>Klein
https://huggingface.co/collections/black-forest-labs/flux2

>LTX-2
https://huggingface.co/Lightricks/LTX-2

>Wan
https://github.com/Wan-Video/Wan2.2

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Illustrious
https://rentry.org/comfyui_guide_1girl

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Collage: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
>china sold out and local is now left with zero developers
is it over? we'll never get a seedance model locally
no one remembers stabilityai
can we tone the seethe down a bit
>>108585044
that was different
>>108585044
>no one remembers stabilityai
and Mochi (still waiting for MochiHD kek)
>mfw Resource news

04/11/2026
>ComfyUI-RookieUI: The ultimate A1111-style sidebar
https://github.com/rookiestar28/ComfyUI-RookieUI
>Qwen3.5-4B-Base-ZitGen-V1: Image captioning fine-tune of Qwen 3.5 4B optimized for Z-Image Turbo
https://huggingface.co/lolzinventor/Qwen3.5-4B-Base-ZitGen-V1
>ComfyUI Memory Visualization
https://github.com/kijai/ComfyUI-MemoryVisualization

04/10/2026
>JoyAI-Image-Edit now supports ComfyUI
https://github.com/jd-opensource/JoyAI-Image#-news
>Two Front Doors: Civitai.com, Civitai.red, and What's Next
https://civitai.com/articles/28369/two-front-doors-civitaicom-civitaired-and-whats-next
>Uni-ViGU: Towards Unified Video Generation and Understanding via A Diffusion-Based Video Generator
https://fr0zencrane.github.io/uni-vigu-page
>PrivFedTalk: Privacy-Aware Federated Diffusion with Identity-Stable Adapters for Personalized Talking-Head Generation
https://github.com/mazumdarsoumya/PrivFedTalk
>AVGen-Bench: A Task-Driven Benchmark for Multi-Granular Evaluation of Text-to-Audio-Video Generation
http://aka.ms/avgenbench
>Cross-Modal Emotion Transfer for Emotion Editing in Talking Face Video
https://chanhyeok-choi.github.io/C-MET
>ChenkinNoob-XL-V0.5
https://modelscope.ai/models/ChenkinNoob/ChenkinNoob-XL-V0.5
>Control Order & Free Memory: Controls the order of node execution with device-agnostic memory management
https://github.com/mkim87404/ComfyUI-ControlOrder-FreeMemory
>DMax: Aggressive Parallel Decoding for dLLMs
https://github.com/czg1225/DMax

04/09/2026
>MAR-GRPO: Stabilized GRPO for AR-diffusion Hybrid Image Generation
https://github.com/AMAP-ML/mar-grpo
>HybridScorer: Score, sort, and cut large sets down fast with GPU-accelerated AI review
https://github.com/vangel76/HybridScorer

04/08/2026
>OrthoFuse: Training-free Riemannian Fusion of Orthogonal Style-Concept Adapters
https://github.com/ControlGenAI/OrthoFuse
>MIRAGE: Benchmarking and Aligning Multi-Instance Image Editing
https://github.com/ZiqianLiu666/MIRAGE
>>108585044well yeah, because all their employees left to form BFL and continued releasing the same censored API-first goyslop. all they did was change names
>mfw Research news

04/11/2026
>M2StyleGS: Multi-Modality 3D Style Transfer with Gaussian Splatting
https://arxiv.org/abs/2604.03773
>SafeCtrl: Region-Aware Safety Control for Text-to-Image Diffusion via Detect-Then-Suppress
https://arxiv.org/abs/2604.03941
>SymphoMotion: Joint Control of Camera Motion and Object Dynamics for Coherent Video Generation
https://grenoble-zhang.github.io/SymphoMotion
>NavCrafter: Exploring 3D Scenes from a Single Image
https://arxiv.org/abs/2604.02828
>Reinforcement-Guided Synthetic Data Generation for Privacy-Sensitive Identity Recognition
https://arxiv.org/abs/2604.07884
>Collaborative Multi-Mode Pruning for Vision-Language Models
https://arxiv.org/abs/2604.02956
>GENFIG1: Visual Summaries of Scholarly Work as a Challenge for Vision-Language Models
https://arxiv.org/abs/2604.04172
>Stochastic Generative Plug-and-Play Priors
https://arxiv.org/abs/2604.03603
>Symbiotic-MoE: Unlocking the Synergy between Generation and Understanding
https://arxiv.org/abs/2604.07753
>Differentiable Stroke Planning with Dual Parameterization for Efficient and High-Fidelity Painting Creation
https://arxiv.org/abs/2604.02752
>Graphic-Design-Bench: A Comprehensive Benchmark for Evaluating AI on Graphic Design Tasks
https://arxiv.org/abs/2604.04192
>Token-Efficient Multimodal Reasoning via Image Prompt Packaging
https://arxiv.org/abs/2604.02492
>DINO-QPM: Adapting Visual Foundation Models for Globally Interpretable Image Classification
https://arxiv.org/abs/2604.07166
>Do Audio-Visual Large Language Models Really See and Hear?
https://arxiv.org/abs/2604.02605
>AutoSOTA: An End-to-End Automated Research System for State-of-the-Art AI Model Discovery
https://arxiv.org/abs/2604.05550
>Neural Network Pruning via QUBO Optimization
https://arxiv.org/abs/2604.05856
>Beyond Fixed Inference: Quantitative Flow Matching for Adaptive Image Denoising
https://arxiv.org/abs/2604.02392
who remembers disco diffusion
>>108585098nice try kek
>>108585112are people unhappy with BFL?
yucky saas swamp
>>108585079
>>108585221
>>108585236
>>108585258
chroma sucks
>>108585296bruh
kino hour
>>108585317
bruhs @ u
>>108585378
the bold and the beautiful
>>108585305How can you tell it's Chroma? Maybe it's ZiT
Which anime model is most used in 4chan threads right now: Anima, SDXL, or NAI?
Vote! https://strawpoll.com/B2ZB9rDajgJ
face is fixed on anima. fucking finally. that's the boring part of illus-pony
good job face
>>108585500
>>108585512
>/^ComfyUI_temp_/i;type:filename;
you'll thank me later
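For anyone unfamiliar with 4chan's filter syntax: that line is a case-insensitive regex applied to each post's image filename, hiding gens posted straight out of ComfyUI with the default temp-preview name. A minimal sketch of the same check in Python (the helper name is made up for illustration):

```python
import re

# 4chan filename filters are case-insensitive regexes matched against
# the uploaded file's name; /^ComfyUI_temp_/i hides any post whose image
# kept ComfyUI's default temp-preview filename prefix.
pattern = re.compile(r"^ComfyUI_temp_", re.IGNORECASE)

def is_unsaved_comfy_gen(filename: str) -> bool:
    # re.match anchors at the start of the string, mirroring the ^ anchor
    return pattern.match(filename) is not None

print(is_unsaved_comfy_gen("comfyui_temp_qwxyz_00001_.png"))  # True
print(is_unsaved_comfy_gen("1girl_final.png"))                # False
```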
>strawpollkek
>>108585114Gib us pixel merge
z-image making big feet again
cozy breas
Lodestone ZIT projects and their status?
>>108585744Diaper: filled
>>108585744Anteater cocks: gleaming
>>108585485>illust retards not knowing it's based on SDXL Still?
>>108585815I look like this
>>108585891
Come on, ugly male feet, ugly male toes, ugly male ankles, ugly male thighs, ugly male buttocks.
>but it looks like it was drawn by hand!
Fuck off. Artists are learning too, and many of the artists on Danbooru are amateurs and have slop eyes.
can someone fix this
>>108585891If I covered her from the waist up, would I still be able to tell it was Frieren just from the waist down? What would her legs and feet look like, considering she is a small elf, and what about her toes and ankles?None of this would happen if you used NoobAI based models, but it is Saturday, it is your free day, it is your "casual animu genning day" you turned on your PC and chose the lowest effort model of all and posted this troon crossdressing as Frieren in the least anime general of all:ANIMA and /LDG/
>>108585891Nice feetos
>>108586032thanks i puked
what upscaler do you guys use?
seedvr2 changes the face too much and looks like slop. using z image as a 2nd pass completely messes up the skin and adds a weird white haze like someone smeared cum all over the pic. im really at a loss for good upscalers
>>108586182
this upscaler is basically instant
https://github.com/Comfy-Org/Nvidia_RTX_Nodes_ComfyUI
>
>>108586234You can’t post in two threads at the same time. You have to choose, /ldg/ or /sdg/. We demand exclusivity to our schizos.
what is there left to look forward to now that china has officially abandoned us?
>>108585415
i can
i'm a resident schizo and so is he
>>108586348the inevitable concentration camps
https://civitai.com/models/2536147?modelVersionId=2850290
Style lora example for Anima; the full captioned dataset and all config files are shared. The model trains extremely well, I don't know why some people say otherwise.
is there any LTX 2.3 workflow that doesn't have a hundred random custom nodes? Why do these faggots feel the need to install every random piece of shit node set rather than making things work with the most popular nodes?
>>108586449
>Rutkowski
based bigruss
>>108586348
wait for the next company to do the same thing.
>here are a bunch of great open source models
they build up a userbase and then try to monetize a new model, and another company comes in and fills the void.
or some rich neckbeard like notch or kim dotcom throws a bunch of money into a new model just because they can.
>>108586449
>I don't know why some people say otherwise.
They were using sub-optimal configs and blaming it on the model.
>>108586348we just accept that we're no longer a part of the cutting edge of tech, we're retro tinkertroons now who enjoy fiddling with outdated hardware. like the people who try to push the limits of the nintendo 64. we will be seeing if we can push out models to get 1/10 as good as seedance 2.0, or if loras can get local models to properly fill a wine glass to the brim.
https://github.com/ClownsharkBatwing/RES4LYF
snake oil?
https://huggingface.co/Lightricks/LTX-2.3/blob/main/ltx-2.3-spatial-upscaler-x2-1.1.safetensors
use the updated upscaler with 2.3, helps a lot it seems.
https://files.catbox.moe/gpmk06.mp4
>>108586485
https://huggingface.co/RuneXX/LTX-2-Workflows/tree/main
I use these, they work well with ltx 2.3 distilled
>>108586601All the good stuff from that schizo paradise was brought to mainline Comfy DESU
>>108586573
I just want a good API that isn't completely gimped after the first week. Seedance2 looked good when it was first showcased, what we have now is a joke. Bad physics, plastic skin, inconsistent generations, stiff animations. I guess we wait for Happyhorse, but it will probably get hit with a cease and desist on day one. Fucking bleak.
>>108586449Fuck you! Why don't you post in anime generals!?!?!?!POST IN ANIME GENERALS YOU FUCKING FAGGOT !WHY DO YOU IGNORE US!?!?I HATE YOU!
rocketbrown is melting down
>>108586032This reminded me of when I went to the ENT doctor and he had a colossal scar on his throat, it shocked me because I thought the scar was from him slicing his throat but it was from a thyroid surgery lmao
Anything except actually building skills.
>>108586112the star wars we need. This is beautiful.
>>108586741how do you know that?
>>108586621thanks, amigo
>>108586731kek
>>108586449Based, retards on suicide watch
>>108586449I HAVE TO POST MY ANIME NEWS TO THIS 3DPG SLOP GENERAL!!! MUH CATJACK MUST READ MY ANIME NEWS OR I WILL LOSE MY MIND!!! MUH CATJACK!!! MUH MEAT!!! THEY ARE VERY VERY IMPORTANT!!! NOT THE 200 ANIME POSTERS OF ALL 4CHAN, NO NO NO, THEY ARE TRASH, WORTHLESS, BENEATH CONTEMPT!!! ONLY MUH CATJACK AND THE ZIT AND THE CHROMA SLOPPERS MATTER TO ME!!! /LDG/ MUST STOP EVERYTHING AND READ MY ANIME NEWS RIGHT NOW THIS INSTANT!!
>>108586449Onegai, realism lora kudasai!
>>108586449blessed thread of frenship
I hope the chroma and zit sloppers here enjoy the great anime news of this faggot
What's the easiest local hardware I can use to make slop like this, where I just take pictures and say "Give her a silver dress" or "Give her blue eye shadow", like you can do with cloud tools like Gemini and Grok?
>>108586841
>>108585019
>>Klein
>https://huggingface.co/collections/black-forest-labs/flux2
>>108586841
16GB VRAM gpu
>>108586851
I've got a 5080 and this is the most exciting use case of it now that I've beaten RE9 on Nightmare
>>108586844
thank you, will give it a shot
>>108586807>pls saar give csam lora i must be generating the cunny, kindly do the needful
>>108586860
>5080
Qwen Edit is larger than Klein but would still fit on your card. Try that out instead desu.
>>108586869Goodness, my cup runneth overAlright I'll try it
>>108586880>>108586869Is there a way to get Qwen edit running locally on Linux instead of the hugging face version?
>>108585744
is he still claiming to have done the first vaeless model?
that doesn't seem accurate if he's never gotten any to converge
>>108586449>Posting on civitai.For shame.
>>108586902nvm figured it out, didn't realize it worked in comfy
>>108586902
Three versions of Qwen are listed here: https://comfyanonymous.github.io/ComfyUI_examples/qwen_image/
godspeed my celeb gooner
>>108586449 I spent 8 hours today of my saturday using your model and sharing artist tag and comparisons on /h/, /e/ and /adt/ with other anons who use Anima. Watching you ignore us makes me want to never use your model again.
>>108586921praying to elohim that it isn't censored crap
>>108586449with these settings how long would it take on a regular card
>>108586449
>I don't know why some people say otherwise.
If you spend 5 minutes in any place where people discuss lora training you discover why: most people train to overfit because they caption poorly and prompt poorly, so for the loras to work for them they have to imprint the DNA of the image into the model
>>108586449
>Caption with Gemma 4 31b. If you have less VRAM, JoyCaption or one of the medium sized Qwens would work almost as well.
joycap and medium sized qwens suck ass tho bruh fr
>>108586783Those "retards" will never read his message, because those retards are the ones using his model and posting in dedicated anime generals, not here.
Complete meltdown
>>108586943a lot of people use shit tier datasets too, but they don't know they are shit so they just keep on doing the same thing over and over. garbage in, garbage out.
Can the LARPers stop samefagging and replying to tdrussell's post to simulate interest? Here we all use Z-Image and Chroma. We never cared about anime models; stop pretending engagement.
>>108586862
>>108585744
Most of the discord seems to have moved on from this thread. I had to stop chroma gooning, but if I remember correctly all three are still in training epochs. Kaleidoscope is faring better than Zeta, to the surprise of nobody.
King Russ
>>108587004Cut the act. An anime model dev shows up and suddenly you care. You never did before, I know exactly what you're doing.
>>108586998Other way around, my bad.
Russ God
how do I use z image turbo/base with tensorrt?
>>108587056You never cared about anime models, LARPER!
what's the best method of training LTX-2.3? ai-toolkit is useless
https://higgsfield.ai/original-series/zephyr/episode-1
>Traditional directors flimmaxxxing using Seedance 2.0 on Higgsfield. Watch “Zephyr” FULL Ep.1 – this is what happens when filmmakers face ZERO gatekeeping. With Unlimited Seedance 2.0 now LIVE everywhere for anyone with up to 70% OFF* - YOU can build your next viral AI movie. 2 minute intro got MILLIONS in a day. Now see how full Zephyr takes over your feed.
>Dir. by ILYA KARCHIN & the team.
>Zephyr (2026)
>>108586609
see, quality is better vs the 1.0 one (both use the new one):
https://files.catbox.moe/5pid7f.mp4
>>108586931
>>108587167kind of underwhelming
guess ill try out the greg lora...
so are we stuck at wan 2.2 forever?
>>108587286welcome to local
>>108587237
>>108587278
>>108587318
Too old
>>108587286people are going to be using wan for the next 2-3 years regardless of what comes along.
who else is promptmaxxxing?
>wan 2.2 i2v -> loses detail over frames
>flux klein -> cannot move camera too much
ways to combine the best of both worlds?
>>108587318>>108587278Very, very impressively realistic
>>108585019need download for the knight bitch on the horse
>>108585160
>>108585250
>>108585268
>>108585394
>>108585480
I like the 90s anime filter, but you gotta train your model to stop making Frieren look so unhinged/retarded
>>108586348Is that Nautilus?
>>108586348pic would be cool without FNAFfag
>>108587140Be quiet Bluvoll.
Can anyone make some recommendations for why I'm not getting the results I'm expecting? I know that if I were doing this with something that had natural language processing, it wouldn't change anything about the image other than her outfit.
Is Qwen 2512 not a good fit for what I'm trying to do?
>>108587646its better to inpaint if you wanna add specific stuff
>>108587661That makes sense, I'm just spoiled by Gemini. I'll poke around and see if I can figure out how to add an inpainting mask with Qwen
>>108587646Could also use qwen image edit instead of the regular qwen image.
>>108587646>>108587728 this kek
>>108587728Sorry I'm super new to this- Where do I find that under templates? Is it just called Qwen Image Edit or is it this one?
If you're doing SFW stuff just stick to API. It's not worth using Qwen Edit for basic shit that API models can do 100x faster and better. You can get Qwen Edit here, but the model itself is outdated: https://huggingface.co/Qwen/Qwen-Image-Edit-2511
Flux Klein 9b is the best edit model available locally, and it can gen/edit without needing separate models. Qwen was working on a model that could gen/edit in one, but they decided to abandon local for API like everyone else.
Haven't tried upscaling since the pre-AI days (waifu2x back in 2017 I think). Is it still a meme or is it good now?
>>108587773
Sorry, API?
I was recommended Klein earlier. Qwen doesn't seem nearly as easy as the online stuff, so maybe I'll try that
>>108587450Let's see Paul Allen's kissy face girl.
>>108587775I've never upscaled anything that didn't feel like a sidegrade. You always lose something you liked about the original gen, unless you keep the denoise so low that you wonder "what am I even wasting my GPU cycles on? It looks the same."
>>108587784API = 'the online stuff'. If your task is just putting a fur scarf on a girl, then there's no reason not to just use google AI or whatever. The benefit of local is nsfw stuff and niche use-cases that API cannot achieve (loras which are trained on specific concepts or styles).
lol
>>108587801
The main thing is I just don't want to get constrained when I'm doing something NSFW, or by some random guideline it doesn't agree with. I also despise paying for cloud software in general.
I feel like what I'm asking for isn't necessarily outlandish; I'm sure it would be slower on local hardware even with a 5080. Basically just something that can interpret prompts and then apply them to images.
I mean look at this, why would this be moderated? It's something I'm sure local hardware is capable of. I'm trying to find the best tool for the job
>>108587834Another example, this stuff seems like it's what people have been working on for years, I'm surprised there isn't a consensus best tool for something like this. Granted there are a million different directions that people are working on
>>108587834Use Flux Klein 9b, it should be more than capable of doing this
>>108587749>>108586921
>>108587856
Okay, will be trying Klein in the morning, thanks Anon
>>108587860
Sadly, after playing for a couple hours I think Qwen might not be what I'm looking for; what I want is a really simple tool that's kinda hard to fuck up
>>108587855There isn't a single best tool because all companies are competing. Here is a list of pretty much all the relevant edit models: https://artificialanalysis.ai/image/leaderboard/editing
>>108587873This is also very handy, thank you
>sneederboard
my wife seira
Sorry if this is a dumb question, I'm new to this. Why does this keep popping up? I already downloaded and selected a VAE and put it in the VAE folder. (vaelsem). What else do I need to do?
>>108586449but these are only using natural language, how should it look if you want natural language and booru tags?
>>108588022What UI/Model?
>>108588068I downloaded Stability Matrix and just downloaded the first package available there (WebUI Forge NEO). And these are the other stuff
we wuz snape (ltx 2.3)
https://litter.catbox.moe/l359g17k1ba76zhz.mp4
>>108588117
more action:
https://litter.catbox.moe/n1p1j6e4ns8bxn83.mp4
>>108588117>>108588132instant classics
https://github.com/Comfy-Org/ComfyUI/pull/13369
What is this model? Are we saved?
>>108588222
looks like it's an 8b model and uses mistral 3 as a text encoder, but so far there doesn't seem to be anything else about that model on the internet
my body is ready for nu chink slop
>>108587130catbox? looks crisp
https://youtu.be/1_5sSJK2rU0?t=1761
>A 665k subscriber channel is talking about Anima
lmaoo wtf??
>>108588320Stop wasting your time watching "news youtubers"
>>108588320he talks about even the smallest most random niche ai that only get github announcement pages too
>>108588320"saar, what i scam with this week, saar? thanks saar, have my bell ring thank you saar"
>>108588222it's from baidu right? it might be good
>>108588320omg some scam artist trash tuber talks about tranima?!?!? turd russy bros we won so fucking hard!!!!
>>108588341>>108588331the seethe is delicious
>>108588339
>baidu
what should i know them from other than china
>>108588349baidu's site is one of the most visited in the world
>>108587400nevermind issue solved
>>108588358but what should anon know them from
>>108588330that's why I'm subscribed to that channel, he always finds something cool and niche at some point, he's not just like those sloptubers who only present things everyone knows about
so many new kinos to watch, so little time
>>108586449maybe bluvoll and tranifag will shut the fuck up now with their fudding
>>108588473the faggots new hobby is trying to start a flame war between ldg and hgggotta do something to satisfy the mental illness i guess
>>108588473
You don't train loras for Anima. Bluvoll already made 2 finetunes this week. Stop pretending to be interested in anime.
>>108588320
It's the only anime model successor of Illustrious, and you don't even post in anime generals. You're trash. I know it's you because you're the only one who cares about your model in this 3DPG general.
>>108588320our boy made it. fucking BASED to see a homegrown hero landing such a massive W
>>108588320
Yes, remember tdrussell's first posts? He didn't even know how to make a lora and asked us for help. How he has grown!
>>108588624you always start knowing nothing yes, and the sky is blue
>>108588551boss please!
anons, im using sd.next. which model/checkpoint comes closest to commercial cloud imagegens? i have 128GB of vram. also, are there models which can run at float16 instead of bf16? v100 is slow on bf16
>>108588733
ask ai how to convert bf16 to fp16 locally
>closest to commercial cloud imagegens
for what? ZIT for realism, noobai/illustrious for tranime, klein for editing
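On the bf16 → fp16 conversion: with torch it's usually just loading the checkpoint and calling `.half()` on every tensor before re-saving. Numerically, bfloat16 is the top 16 bits of a float32, so what the cast actually does can be sketched with nothing but the stdlib (the helper names are made up for illustration; one real caveat is that values above fp16's ~65504 max overflow to inf, which bf16 weights can contain):

```python
import struct

def bf16_bits_to_float(bits: int) -> float:
    # bfloat16 is the high 16 bits of an IEEE-754 float32,
    # so widening to float32 is just a 16-bit left shift
    return struct.unpack(">f", struct.pack(">I", bits << 16))[0]

def float_to_fp16_bits(x: float) -> int:
    # repack the value as IEEE-754 half precision (struct format 'e')
    return struct.unpack(">H", struct.pack(">e", x))[0]

# 0x3FC0 is 1.5 in bfloat16; the same value in fp16 is 0x3E00
x = bf16_bits_to_float(0x3FC0)
print(x, hex(float_to_fp16_bits(x)))
```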
>>108586449>Greg RutkowskiI remember.
>>108588761thanks - will check it out
>>108588076
I never heard of that vae, but you are using anima so check the anima huggingface page. in the VAE/Text Encoder fields you will need to select qwen_image_vae and qwen_3_06b_base
>>108588076>cattower
>>108588733Change sd.next settings to FP16, read the wiki (Compute setting) https://github.com/vladmandic/sdnext/wiki/Performance-Tuning
>>108588843basado
>>108586449It do work.
What LM should I use for enhancing nsfw image prompts? Some say Qwen3 is better than Qwen3.5?
>>108588922take an english 101 class
>>108588922go for gemma 4, it's the new hot thing in town, way better than qwen imo
>>108588922Gemma 4, yeah. Mistral models are one of the least censored out of the box. Qwen has higher chances of refusal.
>>108588922gemma 4 + that system prompt >>108588368
>>108588922Test Heretic models. Gemma base won't spit any nsfw.
>>108588974it can with that jailbreak prompt >>108588960
>>108588985
Never did vidgen, is there a tutorial for vid for not poorfags like in the op?
>>108588222
>embedding_key='mistral3_24b'
jesus fucking christ? a fucking 24b text encoder??
has anon used gemma 4 in your workflow? how?
>>108589090sir pls understand
saw this guy on reddit using generated pics for a fake onlyfans. any idea what model he could be using?
>>108589159Why are you generating literal children?
>>108587141how's ai toolkit useless?
>>108589179its teebs
>>108589184nta, but its retarded hidden console is awful. What the fuck were they thinking?
>>108588222
https://github.com/huggingface/diffusers/pull/13432
based, there will be a base model and its turbo variant
>>108589177Lustify maybe.>picrelHATE
>>108589268
https://github.com/HsiaWinter/diffusers/blob/3aec976fc30347e4ea70e5f97c1bb4123cc218fd/docs/source/en/api/pipelines/ernie_image.md
>ERNIE-Image is designed with a relatively compact architecture and solid instruction-following capability, emphasizing parameter efficiency. Based on an 8B DiT backbone, it provides performance that is comparable in some scenarios to larger (20B+) models, while maintaining reasonable parameter efficiency.
big if true
Where BERT image though?
Big Bird Image or bust
>>108589184it doesn't have any effect if you train it with a starting frame
>>108589284
>not an edit model
come on dude, unified edit/image models are the future
>>108589307
>Big Bird Image
When bird game 3 image??
https://www.tiktok.com/@ancient_meme_archive/video/7557971057102114079
>>108589318
I trained an ltx23 lora in AIT, mostly with videos. It worked decently, but I think the images were not close enough in style, which slowed the learning and made the model stiffer. I think I'll try videos entirely next time.
>>108588222
So Comfy implemented that baidu model but not this one?
https://huggingface.co/jdopensource/JoyAI-Image-Edit
why?
>>108589385What do you mean by "close enough in style"? You just take frames from the video and use that as your input images
how did wan 2.7 fuck up this much?
https://xcancel.com/ChrisGwinnLA/status/2039960196458680366
https://www.youtube.com/watch?v=RERsGjQrQ6E
wan 2.5/6 was a marginal improvement, if even that, and now this is just trash.
>>108589402I'm not talking about the starting frame. It was a character lora. I had two datasets, one video dataset and one image dataset. The image dataset had images from photoshoots etc. which didn't match the style of the videos.
>>108589423
anon, Alibaba actually has a good video model, it's called HappyHorse
https://xcancel.com/AlibabaGroup/status/2042530517799887326#m
https://xcancel.com/lovart_ai/status/2043282414605332813#m
>>108587834nigga look for site that has flux klein, qwen image 2.0 or wan2.7. Budget pixel is my favorite it because of it variety of models and mentions the various levels on strictness a of a model. I find local image generation at the moment to very stale and boring at the moment. https://budgetpixel.com/
>>108589423
I've used wan2.7 image and video generation. it's absolute censored dogshit that makes even wan2.5 look a lot better. The shit model has a filter that re-writes your prompt to be sfw pg13, basically making de-clothing and nudity prompts difficult to get right. Many people are disappointed with wan2.7 and it's basically a DOA saas model.
>Image
Lightyears behind SaaS
>Video
SaaS is literally in another universe
>3D
Local has completely given up.
Grim
>>108589506
>Image
>Lightyears behind SAAS
maybe baidu will save us >>108588222
>>108588222
>Flux2 VAE
now we're talking
>>108589523let's hope the licence is good too, but the most important part is that the images are good and not slopped
>>108589523Like Mugen
https://xcancel.com/obscaries/status/2043304041053397437
>>108589523
>now we're talking
*yawn*, pixel space or gtfo
>>108589561can seedance 2 do anything other than capeshit fighting slop?
>>108589585obviouslyhttps://xcancel.com/AzeAlter/status/2043027227374436827
>>108589592holy shit. wish local was able to produce something interesting like this
Trellis 2
>>108589618It's out of scope for local hardware. Currently local is only good for low-quality goonbait but it's fine, it'll get better along with hardware.
>>108589179He is literally generating not literal children. >108589265I make videos of sexy kids, not images, because while still images work for big tits hags since you can appreciate them as meat, little girls are more of a vibe so video works better for that. It's why I'm waiting for video+audio so eagerly to add more dimensionality to that vibe. Unfortunately it's looking like the odds of the evasi@n website being shut down due to costs before a local video+audio model comes out go up every month
>>108589592it's really impressive, but it won't be true democratization as long as this level is not local, you're still dependant on API censorship shit, it kills creativity
>>108589159ZIT Laura B???
>>108589629did they fix the workflow somewhere? seemed much worse when i tried it on release
>>108589637>it'll get better along with hardware.I used to believe that, but Z-image turbo and Gemma 4 proved to me that you can get insane quality with a relatively small model, the future is bright
>>108589670If you look up the latest repos, they use something called DINO lock that helps a lot. But it's still meh.
>>108589689>you can pay $10That's the whole concern, his stripe got banned and most of his potential market are retarded sooner nocoders who don't understand crypto so I'm worried about the financial health. I've gotten many times the worth of my gold donation and have shared hundreds of videos of youthful beauty with the world and have done my small part in displacing and substituting demand for the real thing. No seriously, a couple of times on some pedo-adjacent forum I see some guy post an old Gen and it makes me happy knowing that it is inarguable that this person has consumed something AI generated instead of the real thing The monthly begging prompt is back though and unlike previous years there's less momentum and the project is much more private, and 4chan is getting less and less popular. Video+audio doesn't share well on /g/ anyways but I'm just excited for the extra world knowledge the audio dimension brings. I will FINALLY be able to actually prompt for something like a home family vlog where the dad is holding the camera because 1000% that training data is in current models but there's no way to express that knowledge given the relationships you make when captioning videos with just text and not learning the audio information
>>108588222please be good, we haven't gotten anything decent this year so far (except klein I guess)
>>108589423>>108589439i'm really fed up with chinks. i wouldn't have said anything if they'd released something, but now, fuck them. i don't even touch wan anymore for sfw content. ltx is much better
>>108589453If it's the same model though, why is doing it on the website better? Does it just understand prompts better?
>only enough vram to train 1024 with batch 2
It's over. I should've gotten a 4070TiS back then and not just the 12gb abortion
>>108589985Depending on how many steps you need and which model (I am presuming something like SDXL with that res + batch size combo) it can cost less than a dollar to train a lora with online compute. (vast, runpod, etc.)
OWO
>>108590008uwu, what's this? https://www.youtube.com/watch?v=7mBqm8uO4Cg
>>108590008The bubbles aren't doing a good enough job hiding those hands:(
>>108590020they had an incident please understand
>>108590108very interested in the chain of thinking that resulted in you uploading this image on the internet to show other people
>>108590150"anons might find this image amusing.... oh wait an anon is seething about it, even better, a successful post"
>>108590164usually your posts are better that's all, I don't get these
>>108590187ah, i thought they are always uniformly bad. maybe on occasion less bad.i am just bored.
60 minutes to caption 230 images, that's outrageous
isn't it?
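For scale, that throughput works out to:

```python
images, minutes = 230, 60
# total seconds divided by image count
sec_per_image = minutes * 60 / images
print(f"{sec_per_image:.1f} s per image")
```

Roughly 15.7 seconds per caption, which for a single-digit-billion-parameter VLM suggests the model is spilling out of VRAM or running partly on CPU rather than normal GPU inference speed.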
>>108590008prompt and settings please? you used z image base right?
>>108590303What the hell are you using to caption them that works so slowly?
>be me reusing ltx2
>be me testing a medieval prompt that gives good results everywhere.
>the gen is full of happy modern indians...
>>108590311
qwen 3.5
tried with gemma4, but the mm is broken on the abliterated model and the regular one just refuses
>>108590347Yes the jews trained it on bunch of clips from Indian tv shows, we know.
>>108590356>the regular one just refuseseven with this jailbreak? >>108588960
>>108590356
Shouldn't be this slow desu. I wouldn't wait a fucking hour to caption that few images.
Makes me think you are fucking up something somewhere.
Gotta post this one to >>>/aco/9173986
>>108590357I have white characters in version 2.3, but with medieval carnival costumes...they have serious problems with medieval content.
Wow, ComfyUI updated the front end, but Nodes 2.0 still don’t work with text autocompletion, and I can’t align the nodes 2.0 horizontally or vertically or redistribute them. Thanks for doing nothing, Fennecfaggot.
>>108590430middleage is antisemitic
>>108590437Just disable Nodes 2.0? They are shit anyways.
I love the kind of LCD 1girls anima can spit out.
Preview 2 performs better than 3 for this, among other things.
comfy bred
>>108590451I like the minimalistic design. I don’t like the 1.0 version with its large padding and big letters. Nodes 2.0, with black minimalistic nodes on a solid black canvas is the future.
There are quite a lot of really nice 'western style' loras for Anima now and they blend pretty well with anime styles as well if you screw around with step scheduling. I'm having lots of fun.
>>108590493
More buggy to use than the original node design, which is what I care about most. When (if) it stops being a buggy piece of shit I might care about leanness.
>>108590498
I am curious: where do you typically switch, and from which to which?
>>108590510groovy gens
>>108590539thx
>>108590510I haven't done much systematic testing regarding order but the timing serves as a pretty robust way of dialing in the amount of influence you want a style to have. In the attached image the relevant section is pasted below. I'm only using mikoyan's style half the time. In my previous image I only let it in with the last 10% steps. Super fun to play around with.<lora:Sanjulian:0.7> sanjulian style,painting \(medium\)[@mikoyan::0.5]