Shift Scheduling Edition
Discussion and Development of Local Image and Video Models
Previous: >>108807440
https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, & Upscalers
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/tdrussell/diffusion-pipe
https://github.com/kohya-ss/sd-scripts
https://github.com/kohya-ss/musubi-tuner

>Z
https://huggingface.co/Tongyi-MAI/Z-Image

>Anima
https://huggingface.co/circlestone-labs/Anima
https://tagexplorer.github.io/

>Qwen
https://huggingface.co/collections/Qwen/qwen-image

>Klein
https://huggingface.co/collections/black-forest-labs/flux2

>LTX-2
https://huggingface.co/Lightricks/LTX-2

>Wan
https://github.com/Wan-Video/Wan2.2

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Illustrious
https://rentry.org/comfyui_guide_1girl

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Collage: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
>mfw Resource news
05/13/2026
>AnyFlow: Any-Step Video Diffusion Model with On-Policy Flow Map Distillation
https://huggingface.co/nvidia/AnyFlow-Wan2.1-T2V-14B-Diffusers
>RealDiffusion: Physics-informed Attention for Multi-character Storybook Generation
https://github.com/ShmilyQi-CN/RealDiffusion
>OmniNFT: Modality-wise Omni Diffusion Reinforcement for Joint Audio-Video Generation
https://zghhui.github.io/OmniNFT
>Logit-Attention Divergence: Mitigating Position Bias in Multi-Image Retrieval via Attention-Guided Calibration
https://github.com/brightXian/LAD
>Design Your Ad: Personalized Advertising Image and Text Generation with Unified Autoregressive Models
https://github.com/JD-GenX/Uni-AdGen
>Elastic Attention Cores for Scalable Vision Transformers
https://github.com/alansong1322/VECA
>LychSim: A Controllable and Interactive Simulation Framework for Vision Research
https://lychsim.github.io
>Vision2Code: A Multi-Domain Benchmark for Evaluating Image-to-Code Generation
https://image2code.github.io/vision2code
>ComfyUI-ppm Implements NegPip on the Z-image series
https://github.com/BigStationW/ComfyUI-ppm#clipnegpip
>DreamX-World: A General-Purpose Interactive World Model
https://github.com/AMAP-ML/DreamX-World
>FLUX Identity Adjuster
https://github.com/Magirad/Flux_ID_Adjuster
05/12/2026
>Pixal3D: Pixel-Aligned 3D Generation from Images
https://ldyang694.github.io/projects/pixal3d
>SWIFT: Prompt-Adaptive Memory for Efficient Interactive Long Video Generation
https://github.com/ShanwenTan/SWIFT
>Forcing-KV: Hybrid KV Cache Compression for Efficient Autoregressive Video Diffusion Models
https://zju-jiyicheng.github.io/Forcing-KV-Page
>Masked Generative Transformer Is What You Need for Image Editing
https://weichow23.github.io/EditMGT
>Micro-Defects Expose Macro-Fakes
https://zbox1005.github.io/MDMF-project
>mfw Research news
05/13/2026
>CausalCine: Real-Time Autoregressive Generation for Multi-Shot Video Narratives
https://yihao-meng.github.io/CausalCine
>EPIC: Efficient Predicate-Guided Inference-Time Control for Compositional Text-to-Image Generation
https://arxiv.org/abs/2605.11722
>STRIDE: Training-Free Diversity Guidance via PCA-Directed Feature Perturbation in Single-Step Diffusion Models
https://arxiv.org/abs/2605.11494
>UniFixer: A Universal Reference-Guided Fixer for Diffusion-Based View Synthesis
https://arxiv.org/abs/2605.12169
>Images in Sentences: Scaling Interleaved Instructions for Unified Visual Generation
https://arxiv.org/abs/2605.12305
>Couple to Control: Joint Initial Noise Design in Diffusion Models
https://arxiv.org/abs/2605.11311
>MULTI: Disentangling Camera Lens, Sensor, View, and Domain for Novel Image Generation
https://arxiv.org/abs/2605.12134
>Beyond Text Prompts: Visual-to-Visual Generation as A Unified Paradigm
https://yaofang-liu.github.io/V2V_Web
>One-Step Generative Modeling via Wasserstein Gradient Flows
https://arxiv.org/abs/2605.11755
>FIS-DiT: Breaking the Few-Step Video Inference Barrier via Training-Free Frame Interleaved Sparsity
https://arxiv.org/abs/2605.11869
>UniCustom: Unified Visual Conditioning for Multi-Reference Image Generation
https://arxiv.org/abs/2605.12088
>Cluster-Aware Neural Collapse Prompt Tuning for Long-Tailed Generalization of Vision-Language Models
https://arxiv.org/abs/2605.11939
>L2P: Unlocking Latent Potential for Pixel Generation
https://nju-pcalab.github.io/projects/L2P
>Principled Design of Diffusion-based Optimizers for Inverse Problems
https://arxiv.org/abs/2605.11506
>AlphaGRPO: Unlocking Self-Reflective Multimodal Generation in UMMs via Decompositional Verifiable Reward
https://huangrh99.github.io/AlphaGRPO
>A Mimetic Detector for Adversarial Image Perturbations
https://arxiv.org/abs/2605.11492
>>108816682
Nice crow.
>>108816682
Tell me about Crow, why does he wear the randoseru?
kill ani in real life, intercept him on his way home and stab him multiple times
>>108816725
What the fuck?
>>108816764
Pretty sure it's a mental patient sending threats to himself
Blessed thread of frenship
Someone bake the artemis photos lora
>>108816976
Artemis photos? Elaborate
>>108817094
the moon mission
>>108816976
I'm making one for Z Turbo
>>108817334
>dat grill
https://github.com/resemble-ai/DramaBox
Holy shit, this thing is the best TTS I've ever tried. It beats absolutely everything.
It's based on LTX-2.3, handles NSFW content perfectly, and can do just about anything you can imagine with near-perfect cross voice cloning. Even though it was released as English-only, it handles my language perfectly.
SAM-Audio can extract nearly any voice from any series or whatever.
Now I'm going to start building my perfect dataset with emotions for a single-voice model using Qwen3-TTS as the backbone, which handles all emotions for real-time inference.
I thought I'd have to wait another 1-2 years for this moment. What a time to be alive.
>>108817334
Nice
>>108817360
>Holy shit, this thing is the best TTS I've ever tried. It beats absolutely everything.
What are the newest other models that you tried? Fishaudio S2P, Kokoro, Chatterbox, Qwen 3.5?
>>108817360
>It's based on LTX-2.3
interesting. i already knew ltx was good at voices since i used it to generate low quality talking videos just to extract the audio from them. but can it generate only spoken audio? i would like something that can generate sound effects as well
>>108817423
None of them are expressive, generate sound effects, ambiance, BGM, etc.
I honestly don't know why there are so many pure TTS models but so few T2A models.
>>108817360
Upload some examples?
>>108817496
qwen's default voices and fishaudio s2p are expressive, omnivoice IIRC also had a few tags
sure, t2a models should have appeal
Big Russ is GONE and we are ALONE
King Russ sits permanently on the throne in my heart
anima4...
v3 is unironically all you need
>>108817511
There are some on their hf pages:
https://huggingface.co/ResembleAI/Dramabox
The output quality is shit but the tone is really good.
>>108817360
VibeVoice is better. DramaBox has better non-speech controls but the quality is awful. Sounds like they're in a tin can, which isn't surprising since it's based on LTX.
>>108818172
It's based on LTX2, not even 2.3, so no surprise it sounds really bad.
y no flo from progressive
>>108817360
There's also this one.
https://github.com/ScenemaAI/scenema-audio
>>108818393
Kino gen idea but she looks way too plastic.
anyone got a SIMPLE ltx eros i2v workflow lying around?
still having fun with Anima
behold my opus
>>108811965
>>108811978
>>108812292
I just got the newest Convrot Int8 quant for Chroma and it runs at the same speed as nunchaku (half of q8).
Quality seems much better, although I won't make comparisons with bf16 (I guess I might for something faster like Anima).
Int8 inference for diffusion seems to have matured significantly recently. (Surprising how long we've been sitting on int8 acceleration and did nothing worthwhile with it all these years it's been available.)
This post on plebbit brought the newest int8 technique to my attention:
https://www.reddit.com/r/StableDiffusion/comments/1tazxqz/int8_in_the_age_of_mxfp8_an_investigation_into/
Seems worth considering if you are on a 3000 or 2000 series.
>>108818610
ooh la la
Any suggestions on how, in ComfyUI, to quickly toggle a turbo lora on/off?
I want to use Anima Turbo for quick prototyping, but it is messing up the style I like, so I want to quickly turn it off to get a quality gen.
I can bypass the LoRA loader, but I also need to change the sampler settings. A way to do it with a single switch would be great.
>>108818778
People use switches all the time in workflows but I don't know precisely how it's done from memory either.
All I can say is that it can easily be done by pressing a single toggle. (Bypass node + change cfg and step count)
>>108818818 >>108818778
the easy to understand method many use is to just put the respective nodes into one group / subgraph each and group-bypass one or the other
>>108818818
* my personal recommendation would be to also have the other cfg and step count in each of the groups or subgraphs you toggle. YMMV.
>>108818731
>if you are 3000
That's-a me. Thanks for the info.
>>108818778
just ask claude code/codex to build a subgraph for you inside your current workflow. You'll end up with something totally insane like this, but somehow it will work.
>>108818778
if you just want on/off
We might actually be back.
>https://arxiv.org/html/2605.12964v1
>https://hanshengchen.com/asymflow/
>>108818955
And how does switching change sampler settings?
>>108818962
Cool
>>108818962
that does look pretty cool, also based on the sample images
but I barely get what they claim to have achieved... a better technique to relate 256*256px space to the actual latent space?
>>108818962
Busy now, only briefly skimmed, but it seems interesting.
Also interesting is that they did this on the 9b with the more iffy license.
Will probably post more thoughts later after I finish reading their paper.
>>108818962
hol up
>>108819035
* demo gen on their demo site not yet THAT amazing, but perhaps I'm testing for the wrong capabilities
>schizophrenic situation
https://huggingface.co/spaces/Lakonik/AsymFLUX.2-klein
>generate one (1) image
>You have exceeded your ZeroGPU quota
>>108819058
>ZeroGPU
they call it that for a reason
>>108819058
seems like they released the model? https://huggingface.co/Lakonik/AsymFLUX.2-klein-9B/tree/main
>>108818962
>Finetuning Latent Models into Pixel Models
>Hansheng Chen, Jan Ackermann, Minseo Kim, Gordon Wetzstein, Leonidas Guibas
so which one is lodestone
>>108819069
>707mb
obviously do not download this
>>108819069
That's the adapter
>>108819085 >>108819076
isn't that the thing they made for use with klein 9b?
>>108819071
They don't even consider what he does actual research lmao.
Never seen his stuff pop up in papers.
>>108819089
Yes. There is a usage example in the model card. You load this adapter with the base flux2 klein 9b model.
>>108818962
>>108819100
yes, as far as I can tell that's how you'd use asymflow for local inference right now?
i don't get the complaints >>108819085 >>108819076
i mean sure, you can wait and see how the comfyui implementation will be done but I also wouldn't be surprised if they just kept the klein9b+adapter setup
>>108818731
I also made a bf16 baseline comparison.
I would say it's holding up reasonably well for quanting just a 2B model. You are going to get less divergence from the half precision baseline with larger models, and possibly a larger speed boost if you need offloading for bf16/fp16.
You have to enable the dynamic lora option and take a 10-15% speed penalty over usual int8 speeds when using loras though. Otherwise loras have very minimal effect.
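for anyone wondering what the int8 weight quant itself actually does (rotation/outlier tricks like the Convrot one are extra machinery layered on top of this): a minimal numpy sketch of plain symmetric per-channel int8, all function names here are made up for illustration, not any repo's real API:

```python
import numpy as np

def quantize_int8(w):
    # symmetric per-output-channel scale: the largest |weight| in each row maps to 127
    scale = np.abs(w).max(axis=1, keepdims=True) / 127.0
    scale = np.where(scale == 0, 1.0, scale)  # guard all-zero rows
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale.astype(np.float32)

def dequantize_int8(q, scale):
    # real inference keeps the matmul in int8 and rescales after;
    # this just reconstructs the approximate fp weights
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 8)).astype(np.float32)
q, s = quantize_int8(w)
# rounding error is bounded by half a quantization step per weight
max_err = np.abs(dequantize_int8(q, s) - w).max()
assert max_err <= s.max() / 2 + 1e-6
```

the reddit post linked above is about getting actual speed out of the int8 matmuls on older tensor cores; the quant step itself is this boring, which is why the quality hit stays small for well-behaved weights.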
my niggas
is it possible to train a zimage lora on a 6 gb VRAM, 32 gb RAM machine?
thinking of doing the oneitis
>>108819146
Should indeed be possible with offloading, paged 8bit optimizer memes, maybe going for fp8, int8 or something for the weights, and gradient accumulation, any combination of these until it's enough.
Whether the quality and speeds are desirable is another question.
Desu you can get a 5090 on vast for half a buck an hour.
I would rather train on that than use heavy VRAMlet workarounds.
>>108819146
i think so, with offloading to RAM
maybe you even just need to change one offloading slider on onetrainer or ai-toolkit or whatever you use
>>108819190
not sure you need to massively quantize, i think just taking the speed hit from offloading to system RAM will probably do?
the quality then shouldn't be worse, it's just not going to be fast training
>>108819205
I do not know how much of a speed hit you will take for the significant offloading you are going to do.
If you are not dipping significantly slower than 10 seconds per step, that's still overnight-run-while-you-sleep territory.
I guess you can try and see what you get.
I would temper expectations though.
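rough napkin math on why offloading is the deciding factor here. all numbers are illustrative assumptions (a ~6B base model, a ~20M-param LoRA, plain Adam at 8 bytes of optimizer state per trainable param, activations/temporaries not counted):

```python
def lora_vram_gb(base_params_b, base_bytes_per_param, lora_params_m=20.0,
                 optim_bytes_per_param=8):
    # frozen base weights resident on the GPU (0 bytes/param = fully offloaded to RAM)
    base = base_params_b * 1e9 * base_bytes_per_param
    # trainable LoRA side: fp32 weights + fp32 grads + Adam moments
    lora = lora_params_m * 1e6 * (4 + 4 + optim_bytes_per_param)
    return (base + lora) / 1e9

print(lora_vram_gb(6, 2))  # ~6B base in bf16 on GPU: way over 6 GB before activations
print(lora_vram_gb(6, 1))  # int8/fp8 base on GPU: still over budget
print(lora_vram_gb(6, 0))  # base fully offloaded: the LoRA state itself is tiny
```

the LoRA's own training state is a rounding error; it's the frozen base weights that blow the budget, which is why one offloading slider plus patience can be enough.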
>>108818984
laziest way would be chaining two ksamplers together and just bypassing the one you don't want when you toggle turbo on/off
https://files.catbox.moe/7gva2i.png
>>108818731
>2000 series
*cough cough*... wake up, my ol' betsy...
>>108819237
usually it's a linear type of slowdown with a not too extreme slowdown factor, which is not ideal but also not prohibitive
i'd actually recommend trying it, it's probably not a lost cause unless you have a large training data set and want to train it all at full resolution and all the other stuff that can also pump up the requirements
>>108819237
I have to admit I haven't tried how much impact offloading has on training speeds.
I will keep this in mind next time I decide to train a lora for a larger model.
>>108819409
holy plastic
not too great at prompt adherence, but i don't hate the hidream o1 randomized anime 1girls
>>108818776
holy kino
>>108819642 >>108819658
is there some noise pattern or am I imagining things?
>>108819642
They lied by colossal margins about what the model is.
It's like someone promising to buy you a brand new 5090 and then bringing a second hand RX 6600 that needs to be repasted and get its broken fan replaced.
It's fast enough to get some curiosity from experimenting with it, I suppose.
But also don't zoom in on your HiDream gens.
Once you see it you won't stop seeing it.
>>108819662
It has 32x32 patch artifacts.
>>108819662
i think it has either jpeg flaws trained in or some other artifacts, not at all sure if it's just my settings or the model
>>108819666
>They lied by colossal margins about what the model is.
could be, I didn't actually hear the promises before release
but i think it's not bad for 1girl, questionable rating
it's 2026 and people still use wan and illustrious
ltx ooms on my 8gb gpu
>>108819690
because they're nicely trained for anime 1girl, among other things.
>>108819692
probably just offload more?
>>108819694
>probably just offload more?
isn't offloading automatic already
>>108819680
>i think it has either jpeg flaws trained in or some other artifacts,
Pixel space diffusion is done in patches of 32x32 pixels. You need to smooth out the transitions between different patches some way.
Well, it seems they simply didn't bother to for this garbage.
Zeta-Chroma also has them (alongside its million other issues).
llada also has them.
GLM, despite its shit quality, is the only local pixel space model I know of that doesn't have them.
>not at all sure if it's just my settings or the model
It's the model. Well, maybe they KNOW a way to prevent them, but they didn't bother to include it in the inference code, so it's still the model.
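for anyone who hasn't seen what the 32x32 patching looks like in code: a pixel-space DiT flattens the image into independent 32x32 tiles like below, so any per-patch bias shows up as a grid exactly on those tile boundaries. minimal numpy sketch of the usual patchify/unpatchify reshape (generic DiT convention, not any particular model's actual code):

```python
import numpy as np

def patchify(img, p=32):
    # (H, W, C) -> (H//p * W//p, p, p, C): each patch becomes one token,
    # and at the embedding step a token only sees its own tile's pixels
    h, w, c = img.shape
    x = img.reshape(h // p, p, w // p, p, c).transpose(0, 2, 1, 3, 4)
    return x.reshape(-1, p, p, c)

def unpatchify(patches, h, w, p=32):
    # inverse reshape: the output is stitched back tile by tile, so
    # per-tile errors land precisely on the 32px grid lines
    c = patches.shape[-1]
    x = patches.reshape(h // p, w // p, p, p, c).transpose(0, 2, 1, 3, 4)
    return x.reshape(h, w, c)

img = np.arange(64 * 64 * 3, dtype=np.float32).reshape(64, 64, 3)
assert np.array_equal(unpatchify(patchify(img), 64, 64), img)
```

the reshape itself is lossless; the seams come from the model treating each tile as a separate token and nothing explicitly forcing neighbors to agree at the boundary.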
>>108819700
maybe with some tools/workflows? idk.
but since you OOM, clearly something didn't work, so either reserve moar RAM on whatever automatic mode you're using or just decide manually how much is offloaded in advance of running.