Discussion and Development of Local Image and Video Models
Previous: >>108756500
https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, & Upscalers
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/tdrussell/diffusion-pipe

>Z
https://huggingface.co/Tongyi-MAI/Z-Image

>Anima
https://huggingface.co/circlestone-labs/Anima
https://tagexplorer.github.io/

>Qwen
https://huggingface.co/collections/Qwen/qwen-image

>Klein
https://huggingface.co/collections/black-forest-labs/flux2

>LTX-2
https://huggingface.co/Lightricks/LTX-2

>Wan
https://github.com/Wan-Video/Wan2.2

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Illustrious
https://rentry.org/comfyui_guide_1girl

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Collage: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
cocky go boing boing
>>108757904
progress report anon?
>mfw Resource news

05/05/2026
>Decision Boundary-aware Generation for Long-tailed Learning
https://github.com/keepdigitalabc-svg/DBG
>Motion-Aware Caching for Efficient Autoregressive Video Generation
https://github.com/ywlq/MotionCache
>SIFT-VTON: Geometric Correspondence Supervision on Cross-Attention for Virtual Try-On
https://github.com/takesukeDS/SIFT-VTON
>Linear-Time Global Visual Modeling without Explicit Attention
https://github.com/LeapLabTHU/WeightFormer
>Local Dream 2.4.3 - SDXL support, tag autocomplete and more
https://github.com/xororz/local-dream/releases/tag/v2.4.3
>Sora’s downfall signals broader problems with AI’s creative utility
https://theconversation.com/soras-downfall-signals-broader-problems-with-ais-creative-utility-280013

05/04/2026
>UniVidX: A Unified Multimodal Framework for Versatile Video Generation via Diffusion Priors
https://houyuanchen111.github.io/UniVidX.github.io
>BlenderRAG: High-Fidelity 3D Object Generation via Retrieval-Augmented Code Synthesis
https://github.com/MaxRondelli/BlenderRAG
>It's Never Too Late: Noise Optimization for Collapse Recovery in Trained Diffusion Models
https://akoepke.github.io/divgen/index.html
>Walkyrie 1.3B — Text-to-Image
https://huggingface.co/kpsss34/Walkyrie-1.3B-v1.0
>Caption Creator: Fast and portable tool for image captions and tags
https://github.com/Merserk/Caption-Creator
>VulkanForge: Vulkan-based LLM inference engine in Rust
https://github.com/maeddesg/vulkanforge
>FastSDCPU release v1.0.0-beta.301
https://github.com/rupeshs/fastsdcpu/releases/tag/v1.0.0-beta.301
>Deepbooru TagWalker
https://github.com/Elliezrah/deepbooru-tagwalker

05/02/2026
>Sulphur 2: An uncensored video generation model based on LTX 2.3
https://huggingface.co/SulphurAI/Sulphur-2-base

05/01/2026
>Representation Fréchet Loss for Visual Generation
https://github.com/Jiawei-Yang/FD-loss
>Caption Generator Pro
https://github.com/CoolGenius-123/Caption-Generator-Pro
>mfw Research news

05/05/2026
>TrajShield: Trajectory-Level Safety Mediation for Defending Text-to-Video Models Against Jailbreak Attacks
https://arxiv.org/abs/2605.01761
>SteeringDiffusion: A Bottlenecked Activation Control Interface for Diffusion Models
https://arxiv.org/abs/2605.01653
>Disciplined Diffusion: Text-to-Image Diffusion Model against NSFW Generation
https://arxiv.org/abs/2605.01113
>ScribbleEdit: Synthetic Data for Image Editing with Scribbles and Text
https://arxiv.org/abs/2605.01135
>AttnRouter: Per-Category Attention Routing for Training-Free Image Editing on MMDiT
https://arxiv.org/abs/2605.01480
>SwiftPie: Lightning-fast Subject-driven Image Personalization via One step Diffusion
https://arxiv.org/abs/2605.01510
>CSGuard: Toward Forgery-Resistant Watermarking in Diffusion Models via Compressed Sensing Constraint
https://arxiv.org/abs/2605.01479
>MOC-3D: Manifold-Order Consistency for Text-to-3D Generation
https://arxiv.org/abs/2605.01743
>VAnim: Rendering-Aware Sparse State Modeling for Structure-Preserving Vector Animation
https://yukinonooo.github.io/VAnimProject
>Skipping the Zeros in Diffusion Models for Sparse Data Generation
https://arxiv.org/abs/2605.01817
>Unifying Deep Stochastic Processes for Image Enhancement
https://arxiv.org/abs/2605.01568
>MIRL: Mutual Information-Guided Reinforcement Learning for Vision-Language Models
https://arxiv.org/abs/2605.01520
>GEASS: Training-Free Caption Steering for Hallucination Mitigation in Vision-Language Models
https://arxiv.org/abs/2605.01733
>>108763579
>>108763585
thanks!
Russell abandoned us
>>108763550
You dropped >>108758016
>>108763536
>old 1.2
meaning? Anyways, I don't speak Japanese either, but this hobby introduced me to the music of those two particular artists and I like them. I always follow along with romaji translations, but now I have some motivation to actually learn Japanese.
>the vocals are heartfelt. anyway, what's the theme of this one?
Theme is whatever Claude gave me a few days ago to test the LoRA kek. Here's the lyrics, followed by romaji and an English translation (given to the model entirely in standard Japanese, which either the merge or the LoRA has improved).
https://files.catbox.moe/9ors79.txt
In that case it skipped a line or two, but in my testing that's actually very rare now compared to how often it happens in the Turbo model.
big russ... please, come back...
20 seconds on a 4gb gpu but at what cost
>>108762944
>>108763702
why is she outside the building
>>108763636
>>108763689
Let him cook
>Curious to see if Sogna artist is alive.
>The artist is alive
>Is a woman.
Didn't expect that.
>>108763636
tdrusted
formerly trusted
>>108763667
sorry, I meant the old ace step 1_3_5.
>>108763350
>>108763365
tencent was involved in songbloom. https://github.com/tencent-ailab/SongBloom
It could be kind of nice. xmas gen:
https://files.catbox.moe/pxu1ia.mp3
The reference audio was iirc Mariah Carey.
The lyrics are "
problem: it basically was really unpredictable. Prompting was a wishlist.
I wouldn't want to go back lol. BUT, as you can hear, they're not jokers.
>>108763782
>The lyrics are "
based on God Rest Ye Merry Gentlemen
>>108763767
pretty style
>>108763530
>>Sulfur can't do pussy
It's a base model, LoRA trainers will improve this; also, Sulphur 2.5 is waiting for funding
warning! a diss track:
https://files.catbox.moe/21bzys.mp3
>>108763893
>>108763906
I didn't want anyone to take it personally lol
>>108763893
fuck you nigger faggot
ayo I is gittin oppress
waiting for the API schizo to see this level of detail; this gen is 1 year old btw.
>>108763971
(thumbs up)
>>108763827
anyone else firing up the kino factory?
>>108763550
>https://huggingface.co/circlestone-labs/Anima
>405k downloads last month
woaw
>>108763782
They made a slightly better one actually, it was called https://github.com/tencent-ailab/SongGeneration
It still left so much to be desired and was basically ACEStep 1.0 tier. ACEStep had a 1.5 version which was much better than that, and then it was iterated on and 1.5 XL is even better than standard 1.5. We are very fortunate to have ACEStep. Local audio models were in a very sad and dire state beforehand.
>>108764079
neat. Still getting updates.
>>108764079
have you tried running your generations through a mastering model? i remember someone suggesting that in a past thread. never tried it
>>108764115
>a mastering model
is there one that actually works? I have 5 tracks that need cleaning
>>108764115
All the generations I posted yesterday (Fate Gear, Zutomayo, Miku) are mastered, actually. I do it like second nature before sharing gens, because Turbo is just slightly worse at capturing details. I'm the one who recommended it after finding out about Matchering 2 thru Discord. Some songs are better than others for mastering, so I just rotate around an album and it usually only takes 1-2 tries. The base merge does not need as much mastering as Turbo anymore though, as in the initial output is not as noisy and the voice is crisp, but that only means it sounds way better when mastered, so I go ahead and do it anyways.
Nofap day 10, honestly, keep remembering that motorcyclist's cleavage.
>>108764147
>I have 5 tracks that need cleaning
Matchering 2
https://github.com/sergree/matchering
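If you'd rather script Matchering 2 than click through a GUI, a minimal sketch looks like this. The file names are hypothetical; `mg.process()` and the `mg.pcm16()` result format follow the project's README, and the helper function is just an illustration for deriving output names.

```python
# Sketch: master a local audio gen against a well-mastered reference track
# using Matchering 2 (https://github.com/sergree/matchering).
# File names below are hypothetical examples.

def mastered_name(path: str) -> str:
    """Derive an output file name, e.g. 'gen.wav' -> 'gen_mastered.wav'."""
    stem, dot, ext = path.rpartition(".")
    return f"{stem}_mastered.{ext}" if dot else f"{path}_mastered"

def master(target: str, reference: str) -> str:
    """Match target's loudness/EQ/peaks to the reference, write 16-bit PCM."""
    import matchering as mg  # pip install matchering
    out = mastered_name(target)
    mg.process(target=target, reference=reference, results=[mg.pcm16(out)])
    return out

# Usage (needs real audio files on disk):
# master("miku_gen.wav", "well_mastered_reference.wav")
```

As the posts above note, the reference matters more than the settings: pick a clearly-mastered track in the same genre, and expect to retry with a different reference a couple of times.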
>>108764170
>Matchering 2
I've tried it, actually, just wasn't super happy with the results. Perhaps I used a bad reference song.
>>108764154
can you post a before and after?
>>108764177
I recommend trying Web Audio Mastering, but Matchering 2 with a good source can sound much better than that. There are some songs with poor mastering that lowers the volume a ton; finding a good one is just research, find a song that sounds clear. Googling for well mastered songs in the genre you're targeting may help. Look thru playlists. Tip: movie scores tend to be well mastered due to insane budgets. LLMs like Gemini can give ideas as well. You'd be surprised. I don't search anymore because I have good songs and albums that apply to everything I have (just downloading albums from a few target artists is enough).
>>108764014
kino
I bet that motorcycle zoomer wants a gen x nofap boyfriend.
>>108764198
nofap is not supposed to turn you gay. rebuke the devil
>>108764203
>:(
he said show my penis to drivers on the highway
>>108764225
he is trying to use you as a grim reaper
>>108764237
lol
>>108763721
is this local? reminds me of picrel, which claimed to be local too
>>108764186
Turbo gens since I'm AFK; these are from weeks back (before I figured out the DiT-only trick, so gens might be more boring than what the Turbo model can do).
https://desuarchive.org/g/thread/108694497/#108695746
Top is mastered, bottom is pre-mastered.
Pre-master
https://vocaroo.com/1a4VSBjqkuSX
Master
https://vocaroo.com/1b0F41rAgXqR
Depending on the gen, it doesn't get rid of every issue, but it can make it go from unacceptable on decent headphones to okay sounding.
>>108764192
Do you use 32-bit float, or just normal settings?
>>108764274
oh, i thought it would fix the metallic sound. sounds like some basic stereo manipulation. someone needs a model that actually rebuilds the full song to make it HD, kind of like asking image editing models to make something higher quality
>>108764283
16-bit WAV files. Default settings on UVR, I use this desktop app:
>https://ultimatevocalremover.com/
>>108764300
you can, it often works. i think it's easier on wan though.
usually it's a matter of time and offloading to RAM if your GPU's VRAM isn't so large
>>108764274
what do you mean by 'mastered'? What are you using?
What do you guys actually use these generated images for? I liked the process of getting image and video generation working on my system, but now I'm stumped; I really have no use for any of this. Just generating random images is a bit meh.
>>108764270
Which nigga claimed that was local?
(These are Nano Banana + editing in GIMP.)
>>108764398
making kinos for my amusement
35
>>108764495
it was in adt
adt for local
fuck you nigger
>>108764578
adt is for anime
>>108764586
for anime generated locally
do you see any mentioning of nai or other cloud shit
kys
https://files.catbox.moe/x5z448.mp3
The text:
https://webstersdictionary1828.com/Dictionary/joy
Needed cfg scale to be 13.
>>108764398
similar to above. for memory stuff.
>>108764617
fyi, if I had fed that into udio, at least at the time I left, it would have never worked. You had to really slave to get uneven verses working.
>>108764672
i.e. udio, at least in the past, wants to talk instead of sing.
>>108764734
Hmmm nyo
>>108764597
Fuck you, /adt/ is for anime diffusion in general. Here is for local realism and western art
anyone else making ltx kinos?
>>108764781
I am glitching out with Sulphur. What are your steps/cfg/distill LoRA strength?
>>108764781
it can do 40 seconds?
>>108764781
upload i wanna see
>>108764757
nigger
>do you see any mentioning of nai or other cloud shit (in the op)
fuck you
>>108764787
i am using the default settings on wan2gp. it seems to be 8 steps, and i don't know what the rest of the settings are since they're hidden from the UI
>>108764790
it can go as far as you want. i'm just being meticulous with small extensions
>>108764808
i will once i reach the bridge after the chorus
>>108764734
>>108764511
>>108763567
I captioned the dataset late at night, said fuck it, and started training. Just woke up.
It's done training, but preliminary tests seem to show disappointing results. Maybe earlier steps are better, or it needs a different prompting meta than I am using. (Cope, I know, but I need to test it more.)
I honestly didn't expect to get it right first try, but still.
I will make a training help blogpost later.
>>108764781
i was trying sulphur, it still seems quite tricky compared to wan tho
>>108763550
Do you know how to configure ComfyUI Z-Image to work with the Hammerai website? I have the portable version for Windows. My problem is that it can't find the models.
>>108765059
the prompting for ltx seems to be very sequential. you have to write it out in the order in which things should be seen. if you mention that she has boots at the end of the prompt, the camera will aim down to look at her boots only at the end of the video
hello? retards?
>>108766065
retard here how can i help
>>108766094
where are you guys
>>108766065
hello? is this thing on? am I all alone? is anyone there?
>>108766102
>>108766065
anima posters moved to anime generals
>>108766239
they’re discussing loras and stuff in /hgg/
Give it a rest pal no one believes you
So here is an interesting thing about Anima's @ keyword for styles.
Anima understands on a fundamental level that "@" is associated with styles, to the point that it will hallucinate watermarks and patreon links on occasion if you feed it a nonsensical style tag.
(This was with @real photo and my lora disabled, for testing comparisons.)
You don't see anything similar in sdxl.
When a watermark or link is hallucinated, it is almost always gibberish and not strongly related to anything in your prompt.
>>108766300
what was your captioning like, and how many steps was that?
i was curious to see realism trained as an @, but that doesn't look promising.
i've received warnings for racism (what) in the past, but look at the attitude of the Chinese. they're now ignoring the users who made them famous. that's not a respectful attitude, is it?
wat is weeb labs?
After playing around with the Spark.Chroma 1024 model, my conclusions are as follows:
1 - This seems like a sidegrade compared to 512 and a downgrade compared to preview
2 - It is better at following some specific prompt details, like film grain
3 - It is worse at replicating non-photographic styles (unable to do American comic books, worse at black and white manga, for two examples)
4 - It seems worse at generating faces in a crowd, something I didn't feel with the 512 model
5 - The preview model is still the best overall, making a good compromise and being better at non-photographic styles
6 - Chroma is still the best for NSFW work, being the only one that is able to generate correct male genitalia (not just on the man that is penetrating, but also on males in a crowd or by themselves) and blood/gore (being able to render carcasses, body interiors, blood drips etc. much more coherently than other models)
https://files.catbox.moe/5js4so.png
https://files.catbox.moe/c8bcl1.png
https://files.catbox.moe/1h72q7.png
https://files.catbox.moe/j94wjf.png
https://files.catbox.moe/ao3ew5.png
>>108766338
I already posted a bit about the captioning last thread, but here is an example caption:
>@real photo. A young White woman in her early 20s with vibrant, wavy red hair leans back against the thick, gnarled roots of a large tree in an autumnal forest. She gazes thoughtfully upward and away from the camera, her face framed by her bright hair and accented by dark, plum-colored lipstick. She wears a black long-sleeved turtleneck sweater and white pleated trousers, with her arms crossed comfortably over her chest. Small, white flower-shaped earrings are visible on her ears, and her fair skin has a soft, rosy glow on her cheeks. The massive, grey tree roots cradle her body, while the ground behind her is covered in a thick layer of fallen orange and brown leaves. The background trees and the leaf-covered slope are softly blurred, creating a shallow depth of field that keeps the focus entirely on her.
I had 5 epochs with 1300 images, so 6500 steps. Fuck it, might as well turn this into a help-me post.
Batch size 2, 1024p, AdamW, LR 0.00003, cosine, WD 0.001, betas=0.9,0.99, rank 96, alpha 48, dropout 0.05, sigmoid_scale 1.3, max_grad_norm 1.0.
Any ideas how to proceed? As embarrassing as it is, here is the same image with the lora. There are better images and worse ones.
comfy will pay you minimum wage to make workflows
https://docs.google.com/forms/d/e/1FAIpQLSdCrLN2UBKjeqz30__wMjTAXoVqhypTTq1Gl08y2nvKaEf98A/viewform
>>108766637
Posting a slightly better example to feel less shame.
https://files.catbox.moe/oak48j.png
>>108766637
desu i just use the diffusion-pipe defaults
https://files.catbox.moe/qqx56k.png
https://files.catbox.moe/0kv33c.png
>>108766681
I mean, I don't think this deviates strongly from the settings in the rutkowski lora.
https://files.catbox.moe/rlv0j1.png
>spend a few hours carefully curating more data for my lora
>train next version
>it's worse than the previous version
https://files.catbox.moe/38yj18.png
>>108766782
many such cases
>>108766637
6500 steps doesn't seem like nearly enough with 1300 images. 1300 is kind of a huge amount of images for a lora with a single concept.
Just try baking it for longer. If you have a pretty diverse dataset, you might want to take it all the way to like 20000 steps; just make sure you're saving checkpoints often.
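Worth noting when comparing numbers in these posts: "steps" can mean images seen (5 epochs x 1300 images = 6500) or optimizer steps (which divide by batch size). A quick sketch of the arithmetic, assuming no gradient accumulation:

```python
import math

def optimizer_steps(num_images: int, epochs: int, batch_size: int) -> int:
    # One optimizer step consumes `batch_size` images.
    return epochs * math.ceil(num_images / batch_size)

def epochs_for_steps(num_images: int, target_steps: int, batch_size: int) -> int:
    # Epochs needed to reach roughly `target_steps` optimizer steps.
    return math.ceil(target_steps / math.ceil(num_images / batch_size))

print(optimizer_steps(1300, 5, 2))       # 3250 optimizer steps for 5 epochs
print(epochs_for_steps(1300, 20000, 2))  # ~31 epochs to hit 20000 steps
```

So at batch size 2, the run above was 3250 optimizer steps (6500 images seen), and reaching ~20000 optimizer steps would take roughly 31 epochs.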
>>108766637
what the fuck is the obsession with making realistic loras for an anime model? makes no sense, it looks like shit
what the fuck is the obsession with making an anime model out of cosmos, a realistic model? makes no sense!
>>108766786
Are the tall dudes invisible to the bottom dudes?
Also, the second dude from the left has an arm overlapping the middle cape.
>>108766814
6500 steps takes close to 8 hours on my GPU.
I guess I can train for a day after taking so long to curate the dataset.
But I wonder if I can pull off faster convergence with a higher LR without frying it?
>>108766880
Troll post, but I want to use its character and NSFW knowledge for realism.
>i-it's a troll post if i get told the truth
>>108766880
it's a small model, it has great prompt adherence, it's easy to train, nsfw out of the box, it has a lot of concepts trained into it that negate the need for specific character/style loras, and it's fun to fuck around with.
plus a lot of people are looking for something to replace legacy models; anima will be that model, so may as well get a jump on it now.
>>108766786
i wish we had z-anima, the parameter count for anima is just too small to do scenes like this in it.
>>108766880The real question you should be asking is why hasn't anyone made a 3DPD porn model even though it has the most original and high quality data on the internet. It's been years and not even a hint of one being made. Very sad for pig enjoyers.
>>108766924
preview 3. no sense in wasting compute until the model is done.
>>108766910
>anima will be that model
almost every lora for anima has a barren sample section; nobody is using the model, and the people that do use it are making the exact same slop they did with sdxl, so creativity has peaked
>>108766907
What model do you use when you want to see a photo of a fictional character taking multiple cocks into her asshole?
>inb4 Chroma
>>108766971
>taking multiple cocks
even in your fantasies, you watch.
If I use a distilled model that runs at 8 steps, does that mean the image has reached full convergence at step 8? If so, why does the image still change when I increase the step count? As an example, with ZIT or Ernie around 10% of the image still changes between 8 and, let's say, 20 steps, and it also produces some kind of patchy texture; why? With other models like Chroma Flash this is even more pronounced; again, why? Also, all these models should in theory work best with ODE samplers and a simple noise scheduler, so why do SDE samplers seem to work well?