/g/ - Technology

Are You Living In The Same Universe As Me Edition

Discussion and Development of Local Image and Video Models

Previous: >>108609718

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/tdrussell/diffusion-pipe

>Z
https://huggingface.co/Tongyi-MAI/Z-Image
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo

>Anima
https://huggingface.co/circlestone-labs/Anima
https://tagexplorer.github.io/

>Qwen
https://huggingface.co/collections/Qwen/qwen-image

>Klein
https://huggingface.co/collections/black-forest-labs/flux2

>LTX-2
https://huggingface.co/Lightricks/LTX-2

>Wan
https://github.com/Wan-Video/Wan2.2

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Illustrious
https://rentry.org/comfyui_guide_1girl

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Collage: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
>>
Can someone explain the anima > zit workflow?
>>
>>108615707
gen with anima, hiresfix with zit
retard
>>
>mfw Resource news

04/16/2026

>Motif-Video 2B: A micro-budget text-to-video diffusion transformer from Motif Technologies
https://motiftech.io/videoshowcase

>HY-World 2.0: A Multi-Modal World Model for Reconstructing, Generating, and Simulating 3D Worlds
https://huggingface.co/tencent/HY-World-2.0

>ErnieTurbo_extracted_lora
https://huggingface.co/GuangyuanSD/ErnieTurbo_extracted_lora/tree/main

04/15/2026

>DisCa: Accelerating Video Diffusion Transformers with Distillation-Compatible Learnable Feature Caching
https://huggingface.co/tencent/DisCa

>Lyra 2.0: Explorable Generative 3D Worlds
https://research.nvidia.com/labs/sil/projects/lyra2

>AniGen: Unified S3 Fields for Animatable 3D Asset Generation
https://github.com/VAST-AI-Research/AniGen

>T2I-BiasBench: A Multi-Metric Framework for Auditing Demographic and Cultural Bias in Text-to-Image Models
https://gyanendrachaubey.github.io/T2I-BiasBench

>Generative Refinement Networks for Visual Synthesis
https://github.com/MGenAI/GRN

>VideoFlexTok: Flexible-Length Coarse-to-Fine Video Tokenization
https://videoflextok.epfl.ch

>DiffusionPrint: Learning Generative Fingerprints for Diffusion-Based Inpainting Localization
https://github.com/mever-team/diffusionprint

>Chain-of-Models Pre-Training: Rethinking Training Acceleration of Vision Foundation Models
https://github.com/deep-optimization/CoM-PT

>Self-Adversarial One Step Generation via Condition Shifting
https://github.com/LINs-lab/APEX

>See-through WebUI
https://github.com/BeamManP/see-through-webui

>ERNIE-Image: Repackaged model files for ComfyUI
https://huggingface.co/Comfy-Org/ERNIE-Image

04/14/2026

>Nucleus-Image Released
https://huggingface.co/NucleusAI/Nucleus-Image

>ERNIE-Image: Text-to-image generation model built on a single-stream Diffusion Transformer
https://huggingface.co/baidu/ERNIE-Image

>Danbooru Dataset Filter: High-Speed Metadata Explorer for AI Training
https://github.com/ThetaCursed/Danbooru-Dataset-Filter
>>
>>108615744
Thank you shitstain
>>
>mfw Research news

04/16/2026

>Creo: From One-Shot Image Generation to Progressive, Co-Creative Ideation
https://arxiv.org/abs/2604.13956

>DiT as Real-Time Rerenderer: Streaming Video Stylization with Autoregressive Diffusion Transformer
https://arxiv.org/abs/2604.13509

>Enhanced Text-to-Image Generation by Fine-grained Multimodal Reasoning
https://arxiv.org/abs/2604.13491

>MaMe & MaRe: Matrix-Based Token Merging and Restoration for Efficient Visual Perception and Synthesis
https://arxiv.org/abs/2604.13432

>Bias at the End of the Score
https://arxiv.org/abs/2604.13305

>ASTRA: Enhancing Multi-Subject Generation with Retrieval-Augmented Pose Guidance and Disentangled Position Embedding
https://arxiv.org/abs/2604.13938

>What Are We Really Measuring? Rethinking Dataset Bias in Web-Scale Natural Image Collections via Unsupervised Semantic Clustering
https://arxiv.org/abs/2604.13610

>Who Gets Flagged? The Pluralistic Evaluation Gap in AI Content Watermarking
https://arxiv.org/abs/2604.13776

>Rethinking Image-to-3D Generation with Sparse Queries: Efficiency, Capacity, and Input-View Bias
https://arxiv.org/abs/2604.13905

>DiffMagicFace: Identity Consistent Facial Editing of Real Videos
https://arxiv.org/abs/2604.13841

>Seedance 2.0: Advancing Video Generation for World Complexity
https://arxiv.org/abs/2604.14148

>MOONSHOT : A Framework for Multi-Objective Pruning of Vision and Large Language Models
https://arxiv.org/abs/2604.13287

>VibeFlow: Versatile Video Chroma-Lux Editing through Self-Supervised Learning
https://lyf1212.github.io/VibeFlow-webpage

>ReConText3D: Replay-based Continual Text-to-3D Generation
https://mauk95.github.io/ReConText3D

>Free Lunch for Unified Multimodal Models: Enhancing Generation via Reflective Rectification with Inherent Understanding
https://arxiv.org/abs/2604.13540

>Grid2Matrix: Revealing Digital Agnosia in Vision-Language Models
https://arxiv.org/abs/2604.09687
>>
File: ComfyUI_09687_.png (572 KB, 1024x1024)
>>108615765
you're welcome
>>
File: 397256838253824.png (1.91 MB, 1248x1824)
>>108615104
generated from scratch
>>
>mfw nigbo
>>
>>108615891
I love the zitslop face
>>
>>108615361
There's no straightforward process with these models, it all comes down to luck.
You never know how the AI will react to whatever dataset you throw at it, and you always have to sacrifice something to get improvements.
Base Noob has the best aesthetics but also the worst limb deformities, especially legs.
How did they fix it? More neutral/slopped/semi realistic data, which killed the aesthetic.
No model can balance aesthetics and accuracy yet.
>>
File: _AnimaPreview3_00271_.jpg (431 KB, 1072x1792)
>>108616034
tyty!
>>
>>108615891
Prompt?
>>
>>108616156
i meant shit gens sorry
>>
usecase for posting 2+ images from the same batch?
>>
wai-anima may be sloppy but it's the only anime tune that has working franchise/copyright styles out of the box, base is terrible at this
>>
>>108616208
Your sdg buddies can help with that question :]
>>
>>108616233
It just sounds like you're not good at prompting
>>
>>108616240
but that's a dead shithole?
>>
File: _AnimaPreview3_00304_.jpg (373 KB, 1792x1072)
>>
>>108616233
Based
>>
>>108616233
>tune
lel
>>
>>108616170
Hey, you are not me!
>>108616264
Lora?
>>
>>108616159
just write what you see in the image?
>>
File: 1747882449329772.jpg (79 KB, 1080x1080)
is he lying
>>
>>108616324
You are a complete and utter newfren if you still take "coming soon" posts seriously
>>
>>108616336
Why? Z released base like they said they would. They were going to release edit until BFL surprised them.
>>
>>108616316
jujutsu kaisen culling game arc 1 from civitai
>>
File: 00051-2906136885.png (383 KB, 512x512)
>>108615635
What can I do with a GTX 1050 and 16gb of ram?
Any model recommendation?
>>
>>108616363
>Why?
Lurk for a couple more years and you'll understand
>>
>>108616324
no, you are shamelessly shilling.
did ernie labs pay you? it's clear that ernie flopped
>>108616370
NAI
>>
>>108616370
>gtx card
>for ai
>in 2026
Is this bait?
>>
>>108616370
>2gb
Ouch!
Probably some SDXL variant like Noob vpred.
Either run at fp16 and eat offloading penalty or run at q8 (int8 if Pascal can accelerate that and if you can figure out how to get it working)
You are SOL for anything newer.
>>
>>108616324
They will release it 2 weeks after bigma is released
>>
>>108616425
You also need to run a distill lora for sane speeds.
I think there is a 2025 lora considered meta for step distilling SDXL, but I don't recall the name.
>>
>>108616370
There is a Noob Vpred Nunchaku
>>
>>108616454
Nunchaku needs at least 2000 series.
>>
File: 00074-1842675134.png (2.32 MB, 1824x1248)
>>108616370
anon you need 12-16gb vram and at least the healthy minimum 32gb of ram for ai workloads. if you can't afford a decent prebuilt gaming pc from costco or bestbuy then forget it and go the saas route. You're not running any good ai model under 8gb of vram.
>>
>>108616370
cpu is probably faster
>>
File: 00007.png (3.65 MB, 2048x2880)
>>108616425
>>108616454
>>108617006
>>108617015
Well, i appreciate the help

Throwback from 2023, same setup. WebUI doesn't work for me anymore
>>
File: _AnimaPreview3_00357_.jpg (408 KB, 1160x1696)
>>
File: _AnimaPreview3_00361_.jpg (391 KB, 1160x1696)
>>
What do people use to train anima? I don't wanna use wsl2.
>>
File: _AnimaPreview3_00373_.jpg (411 KB, 1696x1160)
>>
>>108617289
>not running linux bare metal
NGMI
>>
>>108617393
>not having a proxmox server with dedicated VMs with k8s running on em
LOL, fucking peasant
>>
File: _AnimaPreview3_00405_.jpg (414 KB, 1696x1160)
>>
the problem with bluvoll is that he contaminates mugen and chenkin rf with his pedo hag dataset, it is not a 1:1 with noob dataset
>>
So how many artist styles are you using at a time with anima?

I set up my prompts to randomly pick between 1, 2, and 3, and at 3 it still seems coherent. One style seems to dominate overall, but you can still pick up hints of the others. I also have to try using no artists more often.
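
For anyone who wants to try the same thing, here is a minimal sketch of the "randomly pick 1, 2, or 3 artists" setup. The artist names are hypothetical placeholders and the thread does not say how Anima expects artist tags to be formatted, so treat the tag strings as an assumption and swap in whatever your model actually uses.

```python
import random

# Hypothetical placeholder tags; replace with real artist tags for your model.
ARTIST_TAGS = ["artist_a", "artist_b", "artist_c", "artist_d", "artist_e"]


def pick_artists(rng, pool=ARTIST_TAGS, min_n=1, max_n=3):
    """Pick between min_n and max_n distinct artist tags at random."""
    count = rng.randint(min_n, max_n)  # inclusive on both ends
    return rng.sample(pool, count)     # sampling without replacement


def build_prompt(base_tags, rng, min_n=1, max_n=3):
    """Put the randomly chosen artist tags in front of the fixed tags."""
    artists = pick_artists(rng, min_n=min_n, max_n=max_n)
    return ", ".join(artists + list(base_tags))


if __name__ == "__main__":
    rng = random.Random()  # pass a seed here for repeatable picks
    for _ in range(3):
        print(build_prompt(["1girl", "solo", "simple background"], rng))
```

Setting `min_n=0` lets some gens come out with no artist tag at all, which covers the "no artists more often" case.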
>>
>>108617502
i'm using 3
>>
So is Ernie Image Turbo better than ZIT?
>>
does anima include copyright tags? tried one with 800 entries on gel and 300 on dan but it didn't recognize it
>>108617289
https://github.com/67372a/LoRA_Easy_Training_Scripts
>>
>>108617453
>pedo hag dataset
Que?
Also training on clip garbage in 2026 is the biggest problem with his models.
And being obnoxious dipshit in general.
>>108617502
Just one. I am not sure how more artist tags help coherency.
>I also have to try using no artists more often.
The default style is too sloppy and soulless for me.
>>
>>108617453
>pedo hag dataset
what that? straight shota?
>>
>>108617289
sd-scripts, bare metal Chadux.
>>
>>108617540
Did you use the correct tag syntax
>>
File: file.png (3.13 MB, 1152x2048)
>>
>>108617562
>Did you use the correct tag syntax
is that crap really a thing? It makes me mad, let me retry
>>
>>108617568
Special booru syntax has been a thing since 2023 thobeit
>>
>>108617578
I thought he meant the tag positioning (which fortunately you don't really need to abide by), yeah I used the tag exactly as it is on danbooru/gelbooru
>>
>>108617584
>exactly as it is on danbooru/gelbooru
you have to escape parentheses with a backslash
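
A minimal sketch of that escaping, assuming the usual convention where bare parentheses are read as prompt-weight syntax. Swapping underscores for spaces is also an assumption, inferred from how the tags are written in prompts posted in this thread, not from any model card. `booru_tag_to_prompt` is a made-up helper name.

```python
import re


def booru_tag_to_prompt(tag: str) -> str:
    """Convert a booru-style tag into prompt text.

    Assumptions: underscores become spaces, and literal parentheses get a
    backslash so the UI does not treat them as weighting syntax.
    """
    text = tag.replace("_", " ")
    # Escape only parentheses that are not already preceded by a backslash.
    return re.sub(r"(?<!\\)([()])", r"\\\1", text)


if __name__ == "__main__":
    print(booru_tag_to_prompt("toki_(bunny)_(blue_archive)"))
```

Tags that are already escaped pass through unchanged, so it is safe to run over a mixed list.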
>>
>>108617592
I know, it doesn't have those. Seems like for whatever reason this tag in particular wasn't included; even the tag autocomplete doesn't list it
>>
File: _AnimaPreview3_00450_.jpg (505 KB, 1696x1160)
>>
File: 1758659353540414.gif (96 KB, 640x640)
>>108617651
you should also try creating a lora for ltx2.3. imagine this style with sound
>>
>everyone pretends that a 2b model can learn over a million character images and artist styles, when an LLM with the same parameters struggles learning a tenth of that
You need at least a 24b model to achieve what you want.
>>
How slow is Anima at doing 7680×2160? Can you even gen something over 1440p on it with consumer hardware?
>>
>he scales at all costs
>>
Is Chroma really not surpassed yet? We've had it for about a year now...
>>
>>108618018
There haven't been any other major NSFW capable tunes, yes. Shame it's too schizo. And memestone's other vibe training attempts have managed to become far more dysfunctional trainwrecks. At least you get lucky enough with Chroma sometimes.
>>
File: _AnimaPreview3_00517_.jpg (471 KB, 1248x1608)
>>108617709
could be fun but probably requires latest hardware
>>
>>108617954
It works best at typical resolutions. Circlestone did release a Lora recently where 1536x1536 works without any major issues, and even 2048 (4 MP) works without falling apart.

That's genning straight-up. You need to upscale to go bigger.
>>
File: _AnimaPreview3_00545_.jpg (518 KB, 1248x1608)
>>
File: 990306278.png (2.27 MB, 1248x1824)
>>108616159
>>108616323
What he said, but here it is anyway
>toki \(blue archive\), toki \(bunny\) \(blue archive\), blue archive, 1girl, alternate hairstyle, animal ear hairband, animal ears, ass, back, backless leotard, bare shoulders, blonde hair, blue eyes, blue hairband, blue leotard, blue nails, blue streaks, braid, breasts, bun cover, detached collar, expressionless, fake animal ears, fake tail, from behind, grabbing own ass, hair bun, hairband, half up braid, halo, highleg, highleg leotard, large breasts, leotard, looking at viewer, mechanical halo, median furrow, multicolored hair, nail polish, official alternate costume, playboy bunny, rabbit ear hairband, rabbit ears, rabbit tail, short hair, simple background, single hair bun, sitting, solo, strapless, strapless leotard, streaked hair, tail, white background, wrist cuffs
>>
>>108618018
For girl full nudity some ZiT and even some FK29B on civitai are better than chroma. If you're in ultra hardcore porn and weird kink though...
>>
>>108617954
I found anima to be completely predictable in how gen time scales with image size.

It takes 30 seconds at 1024x1024 and 2 minutes at 2048x2048 on my 3090. So you just take the time a 1024x1024 image needs on your hardware and multiply it by how many times more pixels the target image has.
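
The rule of thumb above as a quick sketch. It assumes gen time stays proportional to pixel count, which only rests on the two 3090 data points in this post; it may not hold at very large sizes, and it says nothing about whether the image fits in VRAM. The 30 second default is that 3090 figure, so plug in your own 1024x1024 time.

```python
def estimate_seconds(width: int, height: int,
                     base_seconds: float = 30.0,
                     base_pixels: int = 1024 * 1024) -> float:
    """Scale a measured 1024x1024 gen time linearly by pixel count."""
    return base_seconds * (width * height) / base_pixels


if __name__ == "__main__":
    for w, h in [(1024, 1024), (2048, 2048), (7680, 2160)]:
        print(f"{w}x{h}: ~{estimate_seconds(w, h):.0f} s")
```

By this estimate 7680x2160 is roughly 15.8 times the pixels of 1024x1024, so about 8 minutes on the same card if the linear trend continued.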
>>
File: 336607023179300.png (2.43 MB, 1248x1824)
>>
>>108618795
>>108618615
>>108618702
So Anima then Zit for hires fixes won over Chroma?
Realistic models were saved by weebs?
Why is Lodestone not fail tuning Anima yet?
>>
>>108618702
>some ZiT and even some FK29B on civitai are better
such as?
>>
What API node do I use now that local is dead?
>>
File: 1775261507046615.png (536 KB, 637x634)
can't believe the last hope for local video is ltx...
>>
is it finally safe pulling latest cumfart? didn't do it for a month
>>
>>108619125
why are you using comfyui anifart?
>>
>>108619102
Yeah... It's dogshit
>>
Qwen3-VL-8B-Q8 or Qwen3-VL-32B-Q4? 5090 btw
>>
>>108619186
always bigger at a smaller quant
>>
File: 00119-858822408.png (1.26 MB, 1168x816)
>>
File: so grim.png (63 KB, 405x344)
>>108619102
>literal jews are my our best hope
>>
>>108619186
Why are you going q4 with 32b? You can easily do Q6 with 5090.
Anyway 3.6 will probably mog both even as MOE.
Probably get 3.5 27b q6 hauhaucs if you need NSFW (Although it will try its best it has low knowledge of NSFW subjects due to lacking training knowledge)
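
Rough arithmetic behind the quant sizes being compared here, as a sketch. It uses nominal bits per weight (8, 6, 4), which is an assumption: real quant formats store somewhat more than that per weight, and context cache plus the vision encoder need room on top. The 32 GB figure is the 5090's VRAM.

```python
def weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


if __name__ == "__main__":
    vram_gb = 32  # RTX 5090
    for name, params, bits in [("8B @ Q8", 8, 8),
                               ("32B @ Q4", 32, 4),
                               ("32B @ Q6", 32, 6)]:
        print(f"{name}: ~{weight_gib(params, bits):.1f} GiB of {vram_gb} GB")
```

On these numbers all three options fit, with 32B at Q6 leaving the least headroom for context.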
>>
File: 00154-2878864308.png (1.12 MB, 1168x816)
>>
File: 227338914584475.png (2.19 MB, 1248x1824)
>>
File: 1772802281817339.jpg (110 KB, 1018x1024)
>>108619252
and not even the best jews, like with sora, midjourney. we've got the team of talentless jews. what luck...
>>
File: 914268186037849.png (2.03 MB, 1248x1824)
>>
>>108619102
That's like saying BFL was the "last hope" for local image kek
>>
File: 617571134723071.png (2.28 MB, 1248x1824)
>>108619429
So much money, so much sense
>>
File: 00190-781702195.png (1.53 MB, 1168x816)
>>
>>108619455
Make Yaoyao in her new outfit pls
>>
https://files.catbox.moe/f1w6g6.jpg

I really like Anima.
>>
>>108619693
SDXL and ControlNet still has potential...
>>
>>108619804
I understand /hgg/ fags and Oekaki shizo because Anima it's better at handling multiple characters and intricate poses, as well as abstract kino minimalist concepts with multiple characters in the case of Oekaki. But anyone else praising Anima is a poser. For example, >>108619433, >>108619455, and >>108618795 can be done with SDXL and ZiT hires fix.
>>
File: 1746952463420779.jpg (681 KB, 1728x1728)
>>
>>
File: 104212991805621.png (2.04 MB, 1248x1824)
>>108619872
wicked
>>
File: deEF_zi_00031_.png (2.1 MB, 1792x977)
>>108619433
I'm surprised it got her weird mid-spine tail correct. this is i2i maybe?
>>
>>108618702
I don't even care about nudity particularly. I just want a model that can make nice pics of cute chicks with some cleavage, the occasional bikini pic or some lingerie. Rarely nudity, it's not really essential. I find nothing is as good as Chroma. I've tried all the other FOTMs and I wasn't blown away.
>>
File: happyhorse.png (26 KB, 806x333)
Is it coming to API nodes anytime soon?
>>
>>108620088
took them awhile to add seedance, so wait and see.
>>
I havn't used SDXL in such a long time by now lol, and some faggots are still hanging on that deprecated model lol

Imagine using clip in 2026
>>
>>108619446
Well, technically Flux Klein is still the best open model though
>>
File: output.webm (3.88 MB, 2000x1922)
Some good coomer gens last thread.
>>
>>108620144
you gave me your workflow the other day
have you had much luck generating realistic hardcore with it?
>>
File: ComfyUI_19895.png (2.28 MB, 1200x1600)
>>108619125
It fucks up handling memory less often on VAE loading now, but still fucks up... BUT! I now get a lot of "Windows fatal exception: access violation" when refreshing the page Comfy loads, which needs to be done because the RTX node doesn't load properly without a restart. So I have long stretches where I'm just trying to get it to work.

I really don't think they (or more likely, Claude!) know what they're vibing out over there memory-wise.


