/g/ - Technology





You Guys Goon To That? Edition

Discussion and Development of Local Image and Video Models

Previous: >>108727613

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/tdrussell/diffusion-pipe

>Z
https://huggingface.co/Tongyi-MAI/Z-Image

>Anima
https://huggingface.co/circlestone-labs/Anima
https://tagexplorer.github.io/

>Qwen
https://huggingface.co/collections/Qwen/qwen-image

>Klein
https://huggingface.co/collections/black-forest-labs/flux2

>LTX-2
https://huggingface.co/Lightricks/LTX-2

>Wan
https://github.com/Wan-Video/Wan2.2

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Illustrious
https://rentry.org/comfyui_guide_1girl

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Collage: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
>>
>>108733994
>>108734009
You posted a brunette lol
>>
>inb4 nigbo
>>
>>108733994
no anime in the collage, no posting
>>
>>108734123
5th image bint
>>
>>108733928
>>
for anime collage please don't miss out on our sister thread /adg. very nice thread. high quality.
>>
>>108734145
Ok, but I am fat, I wear glasses, and have a beard and long dark brown hair. That guy looks very unattractive - no long hair, no glasses, and possibly a prisoner of war (women like winners).
>>
>>108734145
She looks old af. Probably all used up and dry.
>>
>>108734149
it's /adt/ faggot, the name stands for Anime Diffusion Thread.
Why a thread and not a general? Because it was intended to be a subordinate thread to /ldg/, that is, it was made to discuss anime THERE and to help organize and reduce the amount of information in /ldg/, NOT HERE.
>>
File: actuallygood.png (102 KB, 1113x616)
102 KB PNG
the new Comfy search is the first actually good UI update in a long time lol
>>
>>108734169
The lady in sunglasses is absolute
>>
>>108734176
>>108734149
check the archives, its original purpose was for that
>>
>>108734145
now make the guy look like the average /g/ poster
>>
>>108734187
He does, look at him, tight shirt, combover, very gay (he's noticing how he could cut her hair better)
>>
>>108734176
can't believe I missed that detail
>>
apache2 anima status?
>>
>>108734187
Just edited my original gen directly lol, straight through 2160x1600 -> 2160x1600, 4 steps DPM++ 2S Ancestral / Simple scheduler.

"Completely replace the man on the left side of photographic image 1 with a very fat nerdy fedora-wearing bearded slob while keeping the original character design and facial likeness of the woman on the right and overall layout and lighting and text exactly the same in every way."
>>
>>108734227
too well groomed
>>
>>108734176
>>108734185
>>108734205
It was always a troll. Cope, seethe, and dialate.
>>
>>108734230
Yeah, no glasses, plus no long hair. I would also point out no hats in church.
>>
>>108734230
I mean I'm sure I could fully prompt a new image with a slobbier ldgian but I feel like this is good enough lol
>>
>>108734242
He looks like he's a secret millionaire tho
>>
>>108734227
a neckbeard would look so much nicer
>>
>mfw Resource news

05/01/2026

>Representation Fréchet Loss for Visual Generation
https://github.com/Jiawei-Yang/FD-loss

>Caption Generator Pro: Tkinter app for generating image captions with LLaVA-style models
https://github.com/CoolGenius-123/Caption-Generator-Pro

>Metascan v0.3.0 Update
https://github.com/pakfur/metascan/releases/tag/v0.3.0

>Phosphene: Local video and audio generation for Apple Silicon ( LTX2.3 )
https://github.com/mrbizarro/phosphene

>MoCapAnything V2: End-to-End Learning of Generalizable Motion
https://animotionlab.github.io/MoCapAnythingV2

>Diffusers <0.37.1 Security Vulnerability - Code Injection
https://github.com/huggingface/diffusers/security/advisories/GHSA-98h9-4798-4q5v

04/30/2026

>ProcFunc: Function-Oriented Abstractions for Procedural 3D Generation in Python
https://github.com/princeton-vl/procfunc

>Efficient, VRAM-Constrained xLM Inference on Clients
https://github.com/deepshnv/pipeshard-mlsys26-ae

04/29/2026

>Z-Anime | Full Anime Fine-Tune on Z-Image Base
https://huggingface.co/SeeSee21/Z-Anime

>QuantVideoGen: Auto-Regressive Long Video Generation via 2-Bit KV-Cache Quantization
https://github.com/svg-project/Quant-VideoGen

>World-R1: Reinforcing 3D Constraints for Text-to-Video Generation
https://github.com/microsoft/World-R1

>Benchmarking Layout-Guided Diffusion Models through Unified Semantic-Spatial Evaluation in Closed and Open Settings
https://github.com/lparolari/cobench

>VibeToken: Scaling 1D Image Tokenizers and Autoregressive Models for Dynamic Resolution Generations
https://github.com/SonyResearch/VibeToken

>OmniVTG: A Large-Scale Dataset and Training Paradigm for Open-World Video Temporal Grounding
https://github.com/oceanflowlab/OmniVTG

>Refinement via Regeneration: Enlarging Modification Space Boosts Image Refinement in Unified Multimodal Models
https://github.com/LeapLabTHU/RvR

>SketchVLM: Vision language models can annotate images to explain thoughts and guide users
https://sketchvlm.github.io
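
For the Diffusers advisory listed above, here is a minimal sketch for checking whether a local install is older than 0.37.1. The only fact taken from the news item is the "<0.37.1" threshold; the helper names (`parse_version`, `is_affected`, `installed_diffusers_version`) are made up for illustration, and the simple parser treats pre-release tags like `rc1` or `.dev0` as if they were the final release, so read the advisory itself for the authoritative affected range.

```python
from importlib import metadata

# Threshold taken from the advisory title ("Diffusers <0.37.1").
PATCHED = (0, 37, 1)


def parse_version(text):
    """Turn '0.37.1' or '0.37.0.dev0' into a tuple of leading integers.

    Assumption: pre-release suffixes are ignored, so '0.37.1rc1' parses
    the same as '0.37.1'.
    """
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
        if len(digits) < len(piece):
            # Stop at the first component that carries a suffix.
            break
    return tuple(parts)


def is_affected(version_text, patched=PATCHED):
    """True if the version sorts before the patched release."""
    return parse_version(version_text) < patched


def installed_diffusers_version():
    """Return the installed diffusers version string, or None if absent."""
    try:
        return metadata.version("diffusers")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_diffusers_version()
    if found is None:
        print("diffusers is not installed in this environment")
    elif is_affected(found):
        print(f"diffusers {found} is older than 0.37.1, consider upgrading")
    else:
        print(f"diffusers {found} is at or past 0.37.1")
```

Run it inside the same venv your UI uses, since ComfyUI and the Forge variants usually ship their own environment.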
>>
>mfw Research news

05/01/2026

>AesRM: Improving Video Aesthetics with Expert-Level Feedback
https://arxiv.org/abs/2604.28078

>TripVVT: A Large-Scale Triplet Dataset and a Coarse-Mask Baseline for In-the-Wild Video Virtual Try-On
https://arxiv.org/abs/2604.27958

>HiMix: Hierarchical Artifact-aware Mixup for Generalized Synthetic Image Detection
https://arxiv.org/abs/2604.27903

>Frequency-Aware Semantic Fusion with Gated Injection for AI-generated Image Detection
https://arxiv.org/abs/2604.27875

>Improving Calibration in Test-Time Prompt Tuning for Vision-Language Models via Data-Free Flatness-Aware Prompt Pretraining
https://arxiv.org/abs/2604.27715

>Leveraging Verifier-Based Reinforcement Learning in Image Editing
https://arxiv.org/abs/2604.27505

>Post-Optimization Adaptive Rank Allocation for LoRA
https://arxiv.org/abs/2604.27796

>Generate Your Talking Avatar from Video Reference
https://gseancdat.github.io/projects/TAVR

>AdvDMD: Adversarial Reward Meets DMD For High-Quality Few-Step Generation
https://arxiv.org/abs/2604.28126

>The Effects of Visual Priming on Cooperative Behavior in Vision-Language Models
https://arxiv.org/abs/2604.27953

>Visual Generation in the New Era: An Evolution from Atomic Mapping to Agentic World Modeling
https://arxiv.org/abs/2604.28185

>Are DeepFakes Realistic Enough? Exploring Semantic Mismatch as a Novel Challenge
https://arxiv.org/abs/2604.28022
>>
> >108734249
> >108734257
fuck off
>>
>>108734273
Why are you so angry all the time?
>>
File: deCS_anima_00035_.png (3.03 MB, 2048x1117)
3.03 MB PNG
>>108734273
what is the point of this ritual post you do?
>>
>>108734292
Mainly attacking :^) it's an indian thing. He's hired to post, and is very lazy.
>>
>>
>>108733568
>https://rentry.co/s8fg8ber
Updated the captioning script to properly populate lyrics instead of raw_lyrics. Note that formatted_lyrics may also be redundant.
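
For anyone with captions produced by the older version, a rough sketch of moving the field over by hand. This is not the linked script: only the field names `lyrics`, `raw_lyrics` and `formatted_lyrics` come from the post, and the assumption that each caption entry is a flat dict (plus the helper name `normalize_lyrics`) is purely illustrative.

```python
def normalize_lyrics(entry):
    """Return a copy of a caption entry with lyrics stored under 'lyrics'.

    Assumption: `entry` is a flat dict. 'formatted_lyrics' is left alone,
    since the post only says it *may* be redundant.
    """
    fixed = dict(entry)  # do not mutate the caller's dict
    raw = fixed.pop("raw_lyrics", None)
    # Only fill 'lyrics' from 'raw_lyrics' when it is missing or empty.
    if not fixed.get("lyrics") and raw is not None:
        fixed["lyrics"] = raw
    return fixed


if __name__ == "__main__":
    old = {"title": "demo", "raw_lyrics": "la la la"}
    print(normalize_lyrics(old))
```

An entry that already has a non-empty `lyrics` value keeps it, and the stale `raw_lyrics` key is just dropped.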
>>
Funny troonbo comes in here like he's welcome. Kekd.
>>
File: lol.jpg (208 KB, 323x498)
208 KB JPG
>>108734184
lol
>>
>disabo is lonely again
Go back to your containment general thread schizo
>>
>>108734383
kek
>>
>>108734249
>>108734257
thanks!
>>
>>
who's hyped for another month of local stagnation and saas innovation?
>>
>>108734447
heh, nice try SATAN, she's no supermodel church woman.
>>
>>108734447
thought this said saars innovation
>>
File: api comes first.png (42 KB, 534x556)
42 KB PNG
that's right, localkeks. api comes first
>>
File: 325375.png (98 KB, 2226x345)
98 KB PNG
kino alert
>>
File: 00345-3501482335re.png (1.73 MB, 1712x960)
1.73 MB PNG
Local shroom.
>>
>>108734751
i just tested it and the audio is perfectly seamless across a 1 minute video. it seems like i can generate full music with this now
>>
>>108732769
just in case people didn't believe me that this works for basically anything BTW

original image:
https://files.catbox.moe/3he15x.png

caption from Gemini 3.1 Pro:
https://pastes.io/mJzbYB4p
>>
File: Anima_0024.jpg (1.26 MB, 1200x1920)
1.26 MB JPG
>>
>>108734825
Those of us with a brain knew this months ago, it's only freetards still stuck in 2023 seething at API who think otherwise. Gemini is the best NSFW NL captioner available right now
>>
>>108734870
yeah. It does take a good prompt to get the most out of it though, the one I linked before is something I've been refining forever.
>>
calm down fag
>>
File: ErnieOutput_3525.jpg (3.12 MB, 1536x1536)
3.12 MB JPG
>>108734447
huh, Ernie version of this is actually good
rare Ernie W
can't upscale as much though
>>
>>108734825
>tokenflood
>sending your porn prompts to google
lol


