/g/ - Technology


Discussion and Development of Local Image and Video Models

Previous: >>108887123

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, & Upscalers
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/tdrussell/diffusion-pipe
https://github.com/kohya-ss/sd-scripts
https://github.com/kohya-ss/musubi-tuner

>Z
https://huggingface.co/Tongyi-MAI/Z-Image

>Anima
https://huggingface.co/circlestone-labs/Anima
https://tagexplorer.github.io/

>Qwen
https://huggingface.co/collections/Qwen/qwen-image

>Klein
https://huggingface.co/collections/black-forest-labs/flux2

>Wan
https://github.com/Wan-Video/Wan2.2

>LTX-2.3
https://huggingface.co/collections/Lightricks/ltx-23

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Illustrious
https://rentry.org/comfyui_guide_1girl

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Collage: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/b/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
>>
(penis weenus:1.9)
>>
File: wan22_00015.mp4 (1.2 MB, 640x640)
1.2 MB MP4
>>
>>108891701
>https://github.com/BigStationW/ComfyUi-Untwisting-RoPE
I tried this and it's kinda ass.
>Poor prompt adherence.
>Larger changes from the reference image are impossible; prominent content will seep into your gens.
>Decreased anatomical stability
I didn't bother tinkering with the dozen different settings it has, so maybe it just has poor defaults, I dunno.
(Was typing a longer post, but alas I refreshed by accident.)
>>
>mfw Resource news

05/23/2026

>Klein Tiled Upscaler for ComfyUI
https://github.com/Gavr728/ComfyUI_KleinTiledUpscaler

>Anima AI Character & Artist search engine with 49,000 sample images
https://animadex.net

>ComfyUi-Untwisting-RoPE (Training-Free Style Transfer)
https://github.com/BigStationW/ComfyUi-Untwisting-RoPE

>LongCat-Video-Avatar-1.5
https://huggingface.co/meituan-longcat/LongCat-Video-Avatar-1.5

>IMG Dataset Refiner v4.3
https://github.com/NyxAwroo/IMG-Dataset-Refiner/releases/tag/v4.3

>Sulphur-2-base
https://huggingface.co/SulphurAI/Sulphur-2-base

05/22/2026

>[real] Lens: Rethinking Training Efficiency for Foundational Text-to-Image Models
https://github.com/microsoft/Lens

>L2P: Unlocking Latent Potential for Pixel Generation
https://nju-pcalab.github.io/projects/L2P

>GenEvolve: Self-Evolving Image Generation Agents via Tool-Orchestrated Visual Experience Distillation
https://ephemeral182.github.io/GenEvolve

>EasyVFX: Frequency-Driven Decoupling for Resource-Efficient VFX Generation
https://easy-vfx.github.io

>FashionLens: Toward Versatile Fashion Image Retrieval via Task-Adaptive Learning
https://github.com/haokunwen/FashionLens

>PALLAIDIUM Refactored
https://github.com/tin2tin/pallaidium_refactor

05/21/2026

>Follow the Mean: controlling flow-matching generative models by shifting endpoint means with reference examples
https://github.com/pedrocurvo/follow-the-mean

>LiTo: Surface Light Field Tokenization
https://github.com/apple/ml-lito

>Q-ARVD: Quantizing Autoregressive Video Diffusion Models
https://github.com/tsa18/Q-ARVD

>StreamGVE: Training-Free Video Editing via Few-Step Streaming Video Generation
https://dsl-lab.github.io/StreamGVE

>PGC: Peak-Guided Calibration for Generalizable AI-Generated Image Detection
https://github.com/xiaoyu6868/PGC

>OcclusionFormer: Arranging Z-Order for Layout-Grounded Image Generation
https://henghuiding.com/OcclusionFormer
>>
>mfw Research news

05/23/2026

>MaTe: Images Are All You Need for Material Transfer via Diffusion Transformer
https://arxiv.org/abs/2605.15660

>ASAP: Attention Sink Anchored Pruning
https://arxiv.org/abs/2605.22372

>TAME: Test-Time Adversarial Prompt Tuning via Mixture-of-Experts for Vision-Language Models
https://arxiv.org/abs/2605.17577

>Findings of the Counter Turing Test: AI-Generated Image Detection
https://arxiv.org/abs/2605.20787

>Rethinking Cross-Layer Information Routing in Diffusion Transformers
https://arxiv.org/abs/2605.20708

>Conflict-Aware Additive Guidance for Flow Models under Compositional Rewards
https://arxiv.org/abs/2605.20758

>Semantic Granularity Navigation in Image Editing
https://arxiv.org/abs/2605.21190

>TASTE: A Designer-Annotated Multi-Dimensional Preference Dataset for AI-Generated Graphic Design
https://arxiv.org/abs/2605.20731

>PFlow-T: A Persistence-Driven Forward Process for Topology-Controlled Generation
https://arxiv.org/abs/2605.17555

>MARR: Module-Adaptive Residual Reconstruction for Low-Bit Post-Training Quantization
https://arxiv.org/abs/2605.17997

>PERL: Parameter Efficient Reasoning in CLIP Latent Space
https://arxiv.org/abs/2605.18464

>Edit-GRPO: A Locality-Preserving Policy Optimization Framework for Image Editing
https://arxiv.org/abs/2605.16951

>RoPeSLR: 3D RoPE-driven Sparse-LowRank Attention for Efficient Diffusion Transformers
https://arxiv.org/abs/2605.20659

>Factorized Latent Dynamics for Video JEPA: An Empirical Study of Auxiliary Objectives
https://arxiv.org/abs/2605.17165

>Ablate-to-Validate: Are Vision-Language Models Really Using Continuous Thought Tokens?
https://arxiv.org/abs/2605.21642

>Focus-then-Context: Subject-Centric Progressive Visual Token Reduction for Vision-Language Models
https://arxiv.org/abs/2605.20950

>Exact Linear Attention
https://arxiv.org/abs/2605.18848

>Mind Your Margin and Boundary: Are Your Distilled Datasets Truly Robust?
https://arxiv.org/abs/2605.20606
>>
>>108891780
>>108891785
Thanks for news
>>
File: debo_cm-m_anima1_00081_.png (3.08 MB, 1497x1663)
3.08 MB PNG
>>108891790
my pleasure :)
>>
>>108891780
>>Klein Tiled Upscaler for ComfyUI
>This custom node is 100% vibecoded
derp
>>
>>108891863
AGENTIC ENGINEERING
>>
100% vibeart thread
>>
>>108891877
no vibecrits please
>>
>>108891877
I prefer my images to be slopped, not the code I am executing on my machine.
>>
File: 233903CUI_00001_edit.png (2.92 MB, 1248x1824)
2.92 MB PNG
>>
>>108891780
>https://animadex.net
is there an offline version?
>>
>>108891863
>I love ai, look at this 1girl it made
>wow so cool

>i love ai, look at this node i vibecoded
>hello HR???
>>
I should start making fake HR complaints at companies I've never even visited.
>>
People kind of do this on tiktok. But I don't think they see the real potential.
>>
Give me obscure artist tags to try.
>>
File: debo_cm-m_anima1_00087_.png (3.87 MB, 1497x1663)
3.87 MB PNG
>>108891907
sort of
https://github.com/ThetaCursed/Anima-Style-Explorer
>>
>>108891945
but if i tell you itll no longer be obscure
>>
>>108891945
'booru image count threshold to be considered obscure?
>>
>>108891690
the TE is the most important part
>>
>https://huggingface.co/Comfy-Org/Lens
>https://civitai.com/models/2645928/microsoft-lens?modelVersionId=2970963
>>
File: miku speed challenge3.png (796 KB, 1024x768)
796 KB PNG
I wonder what the maximum i2v generation speed achievable on a 3090 is while still keeping coherent output. I'm talking about the kind of speeds you get with TeaCache, Sage Attention, a low-quant Wan 5B with taesd, and a low-step lightning lora. Has anyone done it before? You know, for the sake of competition.
>>
File: ComfyUI_00134_.png (2.22 MB, 1536x1536)
2.22 MB PNG
>>
File: ComfyUI_temp_lncfe_00001_.png (1.28 MB, 1248x1824)
1.28 MB PNG
>>108891945
Greg Rutkowski
>>
File: Flux2-Klein_00064_.png (1.13 MB, 880x1184)
1.13 MB PNG
>>
>>108892021
my hormuz isn't straight anymore
>>
>>108891692
Is there a new inpainting/mask detailing workflow?
The current one doesn't work because it depends on the Impact Pack, which was compiled against NumPy 1.
Downgrading to NumPy 1 doesn't work anymore, at least not in ComfyUI portable.
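
A quick way to confirm which NumPy major version an environment actually has is a small check like the one below. This is only a sketch: the helper names `numpy_major` and `installed_numpy_major` are made up for illustration, and it assumes you run it with the same Python interpreter that ComfyUI uses. Binary extensions built against NumPy 1.x can fail to import under NumPy 2.x, which is the kind of mismatch described above.

```python
from importlib.metadata import PackageNotFoundError, version


def numpy_major(version_string: str) -> int:
    """Return the major version number from a string like '2.1.3'."""
    return int(version_string.split(".")[0])


def installed_numpy_major():
    """Return the installed NumPy major version, or None if NumPy is absent."""
    try:
        return numpy_major(version("numpy"))
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    major = installed_numpy_major()
    if major is None:
        print("NumPy is not installed in this interpreter")
    elif major >= 2:
        print(f"NumPy {major}.x: extensions built against NumPy 1 may fail to import")
    else:
        print(f"NumPy {major}.x: NumPy 1 builds should load")
```

If this reports 2.x after a downgrade attempt, the downgrade most likely went into a different interpreter than the one ComfyUI runs with.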
>>
>>108892041
inpainting is cope, just keep pulling the gacha lever on today's fast models until you get what you like, stack good images and wait a few months until we finally get a good pixelspace edit model, after which you won't have to fuck with inpainting nearly as much, if at all

no matter what happens, 5 years from now you'll be able to point your ai to your old gen folders and just tell it "fix all inconsistencies" and be done with it.
>>
File: 432819290872713.png (1.93 MB, 1024x1536)
1.93 MB PNG
>>108892016
A fine choice.
>>
File: 1763145535840751.jpg (133 KB, 1280x736)
133 KB JPG
Computer, analyze all of my RP chats dating back to c.ai days and generate a perfect world with my preferences in mind. Then analyze all of my 1girls and populate the world with them, randomizing their backstory, outfits, and personalities. Turn on 8k per eye VR realtime video generation, engage the masturbation machine suction, disable safety protocols.

How over will it be for you?
>>
>>108892113
I don't think I'll ever truly accept it until it actually is like the holodeck.
>>
File: ComfyUI_temp_novpf_00001_.png (1.56 MB, 1248x1824)
1.56 MB PNG
>>108892091
Good ol' Greg doing wonders still.
>>
anything new since anima?
>>
File: ComfyUI_00143_.png (2.7 MB, 1920x1536)
2.7 MB PNG
>>
File: agentsmith.jpg (271 KB, 800x538)
271 KB JPG
>>108892113
Did you know that the first Matrix was designed to be a perfect human world? Where none suffered, where everyone would be happy. It was a disaster. No one would accept the program. Entire crops were lost. Some believed we lacked the programming language to describe your perfect world. But I believe that, as a species, human beings define their reality through suffering and misery. The perfect world was a dream that your primitive cerebrum kept trying to wake up from.
>>
>>108892163
>>108892113
it's the smell
>>
>>108892163
I wouldn't get bored, and people could finetune the amount of struggle they want to feel in their own matrix if needed anyway. Everything in life is cope, it's just different forms.
>>
File: 451865036884671.png (1.22 MB, 1024x1536)
1.22 MB PNG
>>108892141
No doubt.
>>
File: bored raven looks up.png (960 KB, 896x1152)
960 KB PNG
And one Raven gen from me.
>>
File: 005059CUI_00001_.png (845 KB, 1248x1824)
845 KB PNG
>>108892187
>>108892216
Based Ravenchads.
>>
File: 1124586197226172.png (1.49 MB, 1024x1536)
1.49 MB PNG
>>
File: ComfyUI_temp_altzu_00001_.png (1.32 MB, 1248x1824)
1.32 MB PNG
You know what would be cool? A lora trained in the style of those illustrations in the NWO Illuminati cards. That artstyle is really appealing to me.
>>
does anima work better with nat language or tags
>>
>>108892298
I remembered these! But I don't know where I am remembering them from, I am certain I never owned them.
No promises but I am somewhat interested, assuming there is a reasonably high quality dataset of the art somewhere.
>>108892310
Depends on what you are trying to do.
Tag prompts are often significantly more accurate than nat lang, but also more limited.
>>
>>108891928
>Imageslop is a little bit off in one small area
>Inpaint it and it's good to go, or just dgaf who cares
>Codeslop is a little bit off in one small area
>Delet you are entire computer
>>
File: ComfyUI_00144_.png (2.07 MB, 1536x1920)
2.07 MB PNG
>>
>>108892328
I used to save some pngs from ebay listings
>>
>>108892353
These are a bit too low res for my taste. I wouldn't bother with anything less than 768p (I mean the art, not the card) and I don't like training on upscaled images.
>>
>>108891954
I'm dying of AIDS I won't share them with anyone.
>>108891970
anything
>>
File: ComfyUI_26281.jpg (3.63 MB, 1500x1920)
3.63 MB JPG
>giant Jenny
I didn't prompt for it, but I'll take it.

>>108891991
So, will we be able to just drop one of the uncensored GPT models into this? Vanilla GPT OSS usually checks for copyright and explicit content before doing anything else and I can see that getting in the way real fast.

>>108892113
You were supposed to relate to Data, not Reggie!
>>
File: ComfyUI_temp_tgfjq_00001_.png (1.46 MB, 1248x1824)
1.46 MB PNG
>>108892366
Yeah. That's a shame, I don't think you're gonna find many in an ok resolution. The game is ancient.
>>
>>108892375
Ok.
@garrett hanna
@shirosaki london
@diathorn
>>108892414
Maybe they published a book or something with higher quality assets but yeah besides that tough luck.
I am going to bed now. I might attempt a brief search tomorrow to see if anything comes up.
>>
File: 884798848376243.png (1.61 MB, 1152x1472)
1.61 MB PNG
>>
File: ComfyUI_temp_ozvbo_00001_.png (1.45 MB, 1248x1824)
1.45 MB PNG
>>108892456
>I might attempt a brief search tomorrow to see if anything comes up
Nice. Good luck.
>>
File: autocollage.jpg (664 KB, 2104x1658)
664 KB JPG
>justified rows
>recursive bisect
>"collages are solved"
I prefer white gaps (v2)
>>
>>108892465
how did you make this?
>>
>>108892550
And yes, of course I chose the (approximate) relative sizes of the images myself. And I chose which ones were allowed to crop* and along which dimensions.

*mandatory up to 5% aspect fudging to fake tighter gaps is currently turned on however
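
For reference, the "justified rows" approach mentioned above can be sketched roughly as follows. This is not the tool used for the collage, just a minimal greedy version under stated assumptions: the function name `justified_rows` and its parameters are hypothetical, images are described only by their width/height aspect ratio, and the aspect fudging and cropping choices described in this post are not modeled.

```python
def justified_rows(aspects, width, target_height, gap=0.0):
    """Greedy justified-rows layout.

    aspects: width/height ratio of each image, in display order.
    Returns a list of rows; each row is (row_height, [image_widths]).
    """
    rows, current = [], []
    for aspect in aspects:
        current.append(aspect)
        # Horizontal space left for images once the gaps are subtracted.
        usable = width - gap * (len(current) - 1)
        # Close the row once it would fill the usable width at target height.
        if sum(current) * target_height >= usable:
            height = usable / sum(current)
            rows.append((height, [a * height for a in current]))
            current = []
    if current:
        # The last row is left at target height instead of being stretched.
        rows.append((target_height, [a * target_height for a in current]))
    return rows


if __name__ == "__main__":
    for height, widths in justified_rows([1.0, 1.0, 2.0, 1.0], 400, 100):
        print(round(height, 1), [round(w, 1) for w in widths])
```

Every full row ends up exactly as wide as the container, at the cost of rows having slightly different heights.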
>>
File: RND_A_001.png (1.2 MB, 832x1216)
1.2 MB PNG
>>
>>108892465
i kneel
>>
whats the purpose of using both z image base and turbo? is turbo a finisher for things like skin texture? if i am doing img2img do i even need base?
>>
>>108892747
i dont know
>>
>>108892747
They're different models. You choose one or the other.
>>
>>108892747
turbo is a distilled model, which means it brings strong opinions. that limits composition and gen variety, but it works well as a refiner. base gives better initial gens, with turbo tag-teamed in for a second pass to improve detail
>>
>>108892772
how do I choose?
>>
File: gob maid.jpg (1.06 MB, 2299x1936)
1.06 MB JPG
>>
File: 870297686542062.png (1.63 MB, 1152x1472)
1.63 MB PNG
>>108892562
Anima
>>108892715
o7
>>
>>108892810
very nice
can I have the prompt?
>>
>>108892813
nyo
>>
File: 1766217025834532.jpg (1.97 MB, 3584x4608)
1.97 MB JPG
>>108892814
>>
File: 358181007969663.png (1.65 MB, 1152x1472)
1.65 MB PNG
>>108892813
Sure, using a gits 1995 lora from civitai

>masterpiece, best quality, score_7, akagi ritsuko, ayanami rei, ikari gendou, ikari shinji, katsuragi misato, neon genesis evangelion, 1girl, arm up, armpits, bath, bathing, bathroom, bathtub, blue hair, blush, breasts, character doll, closed mouth, collarbone, completely nude, convenient censoring, doll, dripping, eyelashes, hair between eyes, hand up, indoors, knees up, lego, lego minifig, light blush, light smile, looking at object, looking down, nude, otaku, partially submerged, pink water, playing, red eyes, scene reference, shadow, short hair, sitting, small breasts, smile, solo, tile wall, tiles, toy, water, wet, wet hair, window
>>
File: ComfyUI_00170_.png (1.95 MB, 1536x1920)
1.95 MB PNG
>>
File: 1771485602136329.jpg (1.98 MB, 3584x4608)
1.98 MB JPG
>>108892834
Thanks
>>
>>108892010
Prompt?
>>
File: goblin girl.mp4 (3.51 MB, 1024x1024)
3.51 MB MP4
hm the movement seems a little weird
>>
File: ComfyUI_00127_.png (1.86 MB, 1920x1088)
1.86 MB PNG
>>108892864
The image displays a promotional graphic featuring an anime-style character against a pink background with a subtle grid pattern. The background is decorated with white four-pointed stars, pink hearts, and thin white lines.

In the top left corner, white text reads "Skibidi Archive."

Dominating the right side of the image is a female character with long pink hair styled in twin buns on top, with the rest of the hair flowing down in loose waves. She wears a black headband with cat ears. Her expression involves winking with her left eye closed and smiling with her mouth open. She wears a white short-sleeved shirt with red trim, a red bow tie, and a black pleated mini-skirt with a zipper detail on the front. On her right leg, a black thigh-high stocking is visible. She holds a large assault rifle that is colored pink and black with yellow stars on the magazine. Her right hand is raised near her face making a peace sign, and she wears a red bracelet on that wrist.

Behind the main character on the right is a larger, semi-transparent close-up of the same character's face, showing her smiling with her mouth open.

To the left of the main character, large text in a bubbly brown font with white outlines reads "Edging, and all set to mog that skibidi!" in yellow, with smaller text "fr fr ong" below.

In the bottom left area, there is a smartphone screen displaying a smaller image of the character winking. Various objects are scattered around the left side: a bottle of pink nail polish near the top, a red beaded bracelet, a pair of gold earrings, and a bottle of red nail polish at the very bottom.


