/g/ - Technology

Discussion and Development of Local Image and Video Models

Previous: >>108703603

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/tdrussell/diffusion-pipe

>Z
https://huggingface.co/Tongyi-MAI/Z-Image

>Anima
https://huggingface.co/circlestone-labs/Anima
https://tagexplorer.github.io/

>Qwen
https://huggingface.co/collections/Qwen/qwen-image

>Klein
https://huggingface.co/collections/black-forest-labs/flux2

>LTX-2
https://huggingface.co/Lightricks/LTX-2

>Wan
https://github.com/Wan-Video/Wan2.2

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Illustrious
https://rentry.org/comfyui_guide_1girl

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Collage: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
>>
File: localdead.jpg (1.32 MB, 1402x1122)
>>
are the anima finetunes/merges worth trying out?
>>
>mfw Resource news

04/28/2026

>Meta-CoT: Enhancing Granularity and Generalization in Image Editing
https://shiyi-zh0408.github.io/projectpages/Meta-CoT

04/27/2026

>PixlStash 1.1.0 Update
https://pixlstash.dev/whatsnew.html

>AURA AI Studio Vault: One-stop management app for models, images and more
https://github.com/TheGho7t/AURA-AI-Studio-Vault

>UniGeo: Unifying Geometric Guidance for Camera-Controllable Image Editing via Video Models
https://mo230761.github.io/UniGeo.github.io

>LumiPic — Single-Image SDR to HDR LoRA
https://huggingface.co/oumoumad/LumiPic

>SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-Unify Architecture
https://github.com/OpenSenseNova/SenseNova-U1

04/26/2026

>ControlNet-LLLite for Anima
https://github.com/kohya-ss/sd-scripts/pull/2317

>Qwen3.6-27B-Uncensored-HauhauCS-Balanced
https://huggingface.co/HauhauCS/Qwen3.6-27B-Uncensored-HauhauCS-Balanced

>VOID: Video Object and Interaction Deletion [ComfyUI Repackage]
https://huggingface.co/Comfy-Org/void-model

04/25/2026

>StyleID: A Perception-Aware Dataset and Metric for Stylization-Agnostic Facial Identity Recognition
https://kwanyun.github.io/StyleID_page

04/24/2026

>MAI-Image-2
https://playground.microsoft.ai/chat

>ComfyUI-NAG-Extended: NAG support for Flux 2 Klein and Anima
https://github.com/BigStationW/ComfyUI-NAG-Extended

>UniGenDet: A Unified Generative-Discriminative Framework for Co-Evolutionary Image Generation and Generated Image Detection
https://github.com/Zhangyr2022/UniGenDet

>VARestorer: One-Step VAR Distillation for Real-World Image Super-Resolution
https://github.com/EternalEvan/VARestorer

>Sapiens2
https://github.com/facebookresearch/sapiens2

>Vista4D: Video Reshooting with 4D Point Clouds
https://eyeline-labs.github.io/Vista4D

>Pre-process for segmentation task with nonlinear diffusion filters
https://github.com/cplatero/NonlinearDiffusion
>>
>mfw Research news

04/28/2026

>Talker-T2AV: Joint Talking Audio-Video Generation with Autoregressive Diffusion Modeling
https://arxiv.org/abs/2604.23586

>Latent Inter-Frame Pruning: A Training-Free Method Bridging Traditional Video Compression and Modern Diffusion Transformers for Efficient Generation
https://arxiv.org/abs/2604.23858

>TextGround4M: A Prompt-Aligned Dataset for Layout-Aware Text Rendering
https://arxiv.org/abs/2604.24459

>Diffusion Templates: A Unified Plugin Framework for Controllable Diffusion
https://arxiv.org/abs/2604.24351

>MuSS: A Large-Scale Dataset and Cinematic Narrative Benchmark for Multi-Shot Subject-to-Video Generation
https://arxiv.org/abs/2604.23789

>Oracle Noise: Faster Semantic Spherical Alignment for Interpretable Latent Optimization
https://arxiv.org/abs/2604.23540

>Edit Where You Mean: Region-Aware Adapter Injection for Mask-Free Local Image Editing
https://arxiv.org/abs/2604.23763

>CA-IDD: Cross-Attention Guided Identity-Conditional Diffusion for Identity-Consistent Face Swapping
https://arxiv.org/abs/2604.24493

>$Z^2$-Sampling: Zero-Cost Zigzag Trajectories for Semantic Alignment in Diffusion Models
https://arxiv.org/abs/2604.23536

>V-GRPO: Online Reinforcement Learning for Denoising Generative Models Is Easier than You Think
https://arxiv.org/abs/2604.23380

>PhysLayer: Language-Guided Layered Animation with Depth-Aware Physics
https://arxiv.org/abs/2604.23574

>LearnPruner: Rethinking Attention-based Token Pruning in Vision Language Models
https://arxiv.org/abs/2604.23950

>DeepSignature: Digitally Signed, Content-Encoding Watermarks for Robust and Transparent Image Authentication
https://arxiv.org/abs/2604.23016

>SycoPhantasy: Quantifying Sycophancy and Hallucination in Small Open Weight VLMs for Vision-Language Scoring of Fantasy Characters
https://arxiv.org/abs/2604.24346

>ServImage: An Image Generation and Editing Benchmark from Real-world Commercial Imaging Services
https://arxiv.org/abs/2604.24023
>>
File: 1747903962538151.jpg (854 KB, 2400x3360)
>>108707153
I tried a few and they were all very similar to but basically worse than the base one so I'd just use that for now.
>>
Blessed thread of frenship
>>
>>108707153
>merges
Are never worth trying out for any model arch. The "stability" is not worth the slop.
>finetunes
None that have been any more than aesthetic tunes IIRC.
>>
>>108707209
I really don't see myself using base illustrious though.
>>
>>108707153
I tested the kirazuri tune anon posted last thread and it's not worth it.
Easy to assume that the shitmixes would be even worse.
>>
In the past there have been discussions here about why anima can do surprisingly decent realism for an anime model. ye-pop was suggested as a reason and I remember speculating that enough photos slipped past the filters despite his claims to have filtered them.
Now I am working with that dataset myself for a personal project (doing the inverse of what he did, removing non-photos) and I am certain that thousands of photos have slipped past his filters if he simply keyword filtered based on the captions and alt text in the dataset. The cogvlm and llava captions are ass in quality and short, often not describing whether it's a photo or artwork at all. Most of the time there is no alt text or exif info, and when there is, there is often no useful info about what kind of an image it is.
Unless he manually reviewed half a million images himself (unlikely, I wouldn't either) or had a VLM or some other classification model classify the images, then yes, I am getting confident that ye-pop is the reason anima can do realism, at least before he added the regularization dataset to preview 2.
Also it just dawned on me after typing this that deviantart also has photos, though I have no idea how difficult/reliable it is to filter.
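
A minimal sketch of why a caption-only keyword filter lets photos through, assuming (as speculated above, not confirmed) that the filtering was a plain keyword match on caption and alt text. The keyword list, the field names `caption` and `alt_text`, and the sample records are all hypothetical; none of them come from ye-pop or from the actual Anima pipeline.

```python
import re

# Hypothetical keyword list; whatever the real filter used is unknown.
PHOTO_KEYWORDS = ("photo", "photograph", "photography", "dslr", "35mm")
PHOTO_RE = re.compile(
    r"\b(" + "|".join(map(re.escape, PHOTO_KEYWORDS)) + r")\b",
    re.IGNORECASE,
)

def looks_like_photo(record):
    """Return True if any text field mentions a photo keyword."""
    # Missing or None fields are treated as empty strings.
    text = " ".join(record.get(key) or "" for key in ("caption", "alt_text"))
    return bool(PHOTO_RE.search(text))

# Hypothetical records: the second stands in for a real photo whose
# caption never says what kind of image it is.
samples = [
    {"caption": "A photo of a woman standing on a beach", "alt_text": None},
    {"caption": "A woman standing on a beach at sunset", "alt_text": ""},
    {"caption": "Digital painting of a woman on a beach", "alt_text": None},
]

# Keep everything the keyword filter does not flag as a photo.
kept = [r for r in samples if not looks_like_photo(r)]
for r in kept:
    print(r["caption"])
```

The second record survives the filter even though it could be a photo, because the text never names the medium. A classifier that looks at the pixels rather than the caption would not have that blind spot.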
>>
>>108707111
>4/6 family guy shit
>>
chroma z image finished yet?
>>
>>108707312
i'm 100% sure that it has a lot of real photo training and that it was a deliberate choice, it's too easy to push the model into realism.
>>
>civitai search function hasn't shown new uploads in 6 days when set to newest for either .red or .com
>last upload shown is from Apr 22, 2026 when you use the search function for "Newest" without a keyword
>".red/models" DOES show the newest uploads when set to "Newest"
>".com/models" does NOT even have "Newest" as an option when using the filter

What the fuck is wrong with this website? Is anyone else having this issue? I haven't seen anyone talk about it. When I use the following link, I get the same loras from April 22nd, which are pic related.

https://civitai.com/search/models?sortBy=models_v9%3AcreatedAt%3Adesc
https://civitai.red/search/models?sortBy=models_v9%3AcreatedAt%3Adesc

I would like to know if it's just me.
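
For anyone comparing results: the `sortBy` value in both links is just percent-encoded. A quick sketch of decoding it follows. Reading the three colon-separated parts as index, field and order is my interpretation of the string, not documented Civitai behavior.

```python
from urllib.parse import urlsplit, parse_qs

url = "https://civitai.com/search/models?sortBy=models_v9%3AcreatedAt%3Adesc"

# parse_qs percent-decodes values, so %3A becomes ':'.
query = parse_qs(urlsplit(url).query)
sort_by = query["sortBy"][0]

# Assumed meaning of the parts: search index, sort field, sort order.
index, field, order = sort_by.split(":")
print(index, field, order)
```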
>>
What is the /ldg/ consensus on Anima?
>>
>>108707420
It's good.
>>
>>108707413
>I haven't seen anyone talk about it.
i havent browsed it like that since 2024. i only look at loras from accounts i already follow.
i do get the same results as you from those links thoughever.
>>
>>108707413
>What the fuck is wrong with this website?
It started going to shit once they introduced buzz and on-site training. Huggingface is preferable even if it doesn't have a nice thumbnail grid.
>>
>>108707413
>have cool thing
>turn cool thing into utter shit
the story of the internet.
>>
>>108707438
>i havent browsed it like that since 2024
I figured that's probably why people aren't talking about it, i'm just used to doing it like that.
>i do get the same results as you from those links thoughever.
So it IS fucked, great. I don't mind switching how I search to what I mentioned in my post but the "Newest" section isn't even an option so there's no way of looking at new uploads on .com currently that I am aware of.
>>
>>108707450
>It started going to shit once they introduced buzz and on-site training
It's been going downhill for a while but the split really fucked a lot of things up and I don't think they're aware or care if they do know about the issue. Huggingface is nice but civitai unfortunately has an iron grip on the market, even with how shit it is.
>>
File: _AnimaPreview3_00535_.jpg (383 KB, 1248x1608)
>>108707379
>i'm 100% sure that it has a lot of real photo training and that it was a deliberate choice, it's too easy to push the model into realism.
I think there's some fuckery going on because photo prompts slip into 3D render territory way too easily. It's almost like some super realistic artstation renders have been accidentally tagged as photos.
>>
>>108707450
Civjeet genuinely feels humiliating to use.
HF jannies might be moralfags but the site at least has a decent degree of professionalism to it. Not being a vibecoded sluggish turd also helps. It also has an API to download shit easily.
>>
>>108707370
Finished in the sense that it's already obvious that it is doomed
Not finished in the sense that he is done burning money
>>
File: 1763800728306747.png (534 KB, 2100x6300)
>>108707546
is it meant to keep going up and up after going down for a while?
>>
Why doesn't anima have an image2image workflow?
>>
>>108707577
holy shit it's mooning! I hope you guys bought the dip.
>>
>>108707589
do you not know how to make an i2i workflow?
>>
>>108707577
No he just keeps changing the way loss is calculated for some bespoke reason unbeknownst to us mortals.
The model plateaued quickly and has been churning through useless noise for most of its training span.
>>
>>108707605
Not with natural language models no
>>
File: mpv-shot0018.jpg (1.29 MB, 2176x3840)
what's the most interesting body horror that you got?
>>
>>108707546
I just want the original chroma understanding of niche nsfw prompts and the detail/scene clarity of z base. why doesn't he just finetune a big lora?
>>
File: z_mod_00006_.jpg (815 KB, 1344x1728)
>>
File: 00012-3870274610.png (2.57 MB, 1248x1824)
>>
File: 00153-1191242260.png (2.05 MB, 3072x704)
>>108707675
when i tried to make long girls i got some fun results
>>
File: 00013-3619055720.png (2.42 MB, 1248x1824)
>>
>>108707157
>>108707164
thanks!
>>
File: z_mod_00020_.jpg (720 KB, 1824x1248)
>>
File: img_00519_.jpg (760 KB, 1720x1952)
>>
this thing is an absolute BEAST!
topping up my comfy credits has never felt this good. facebook has the best dataset in the world, so im really excited to see what they have in store for us
https://github.com/facebookresearch/tuna-2
>>
File: 00008-3876287076.jpg (1.61 MB, 1344x2240)