/g/ - Technology



Discussion and Development of Local Image and Video Models

Previous: >>108756500

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, & Upscalers
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/tdrussell/diffusion-pipe

>Z
https://huggingface.co/Tongyi-MAI/Z-Image

>Anima
https://huggingface.co/circlestone-labs/Anima
https://tagexplorer.github.io/

>Qwen
https://huggingface.co/collections/Qwen/qwen-image

>Klein
https://huggingface.co/collections/black-forest-labs/flux2

>LTX-2
https://huggingface.co/Lightricks/LTX-2

>Wan
https://github.com/Wan-Video/Wan2.2

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Illustrious
https://rentry.org/comfyui_guide_1girl

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Collage: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
>>
cocky go boing boing
>>
>>108757904
progress report anon?
>>
>mfw Resource news

05/05/2026

>Decision Boundary-aware Generation for Long-tailed Learning
https://github.com/keepdigitalabc-svg/DBG

>Motion-Aware Caching for Efficient Autoregressive Video Generation
https://github.com/ywlq/MotionCache

>SIFT-VTON: Geometric Correspondence Supervision on Cross-Attention for Virtual Try-On
https://github.com/takesukeDS/SIFT-VTON

>Linear-Time Global Visual Modeling without Explicit Attention
https://github.com/LeapLabTHU/WeightFormer

>Local Dream 2.4.3 - SDXL support, tag autocomplete and more
https://github.com/xororz/local-dream/releases/tag/v2.4.3

>Sora’s downfall signals broader problems with AI’s creative utility
https://theconversation.com/soras-downfall-signals-broader-problems-with-ais-creative-utility-280013

05/04/2026

>UniVidX: A Unified Multimodal Framework for Versatile Video Generation via Diffusion Priors
https://houyuanchen111.github.io/UniVidX.github.io

>BlenderRAG: High-Fidelity 3D Object Generation via Retrieval-Augmented Code Synthesis
https://github.com/MaxRondelli/BlenderRAG

>It's Never Too Late: Noise Optimization for Collapse Recovery in Trained Diffusion Models
https://akoepke.github.io/divgen/index.html

>Walkyrie 1.3B — Text-to-Image
https://huggingface.co/kpsss34/Walkyrie-1.3B-v1.0

>Caption Creator: Fast and portable tool for image captions and tags
https://github.com/Merserk/Caption-Creator

>VulkanForge: Vulkan-based LLM inference engine in Rust
https://github.com/maeddesg/vulkanforge

>FastSDCPU release v1.0.0-beta.301
https://github.com/rupeshs/fastsdcpu/releases/tag/v1.0.0-beta.301

>Deepbooru TagWalker
https://github.com/Elliezrah/deepbooru-tagwalker

05/02/2026

>Sulphur 2: An uncensored video generation model based on LTX 2.3
https://huggingface.co/SulphurAI/Sulphur-2-base


05/01/2026

>Representation Fréchet Loss for Visual Generation
https://github.com/Jiawei-Yang/FD-loss

>Caption Generator Pro
https://github.com/CoolGenius-123/Caption-Generator-Pro
>>
>mfw Research news

05/05/2026

>TrajShield: Trajectory-Level Safety Mediation for Defending Text-to-Video Models Against Jailbreak Attacks
https://arxiv.org/abs/2605.01761

>SteeringDiffusion: A Bottlenecked Activation Control Interface for Diffusion Models
https://arxiv.org/abs/2605.01653

>Disciplined Diffusion: Text-to-Image Diffusion Model against NSFW Generation
https://arxiv.org/abs/2605.01113

>ScribbleEdit: Synthetic Data for Image Editing with Scribbles and Text
https://arxiv.org/abs/2605.01135

>AttnRouter: Per-Category Attention Routing for Training-Free Image Editing on MMDiT
https://arxiv.org/abs/2605.01480

>SwiftPie: Lightning-fast Subject-driven Image Personalization via One step Diffusion
https://arxiv.org/abs/2605.01510

>CSGuard: Toward Forgery-Resistant Watermarking in Diffusion Models via Compressed Sensing Constraint
https://arxiv.org/abs/2605.01479

>MOC-3D: Manifold-Order Consistency for Text-to-3D Generation
https://arxiv.org/abs/2605.01743

>VAnim: Rendering-Aware Sparse State Modeling for Structure-Preserving Vector Animation
https://yukinonooo.github.io/VAnimProject

>Skipping the Zeros in Diffusion Models for Sparse Data Generation
https://arxiv.org/abs/2605.01817

>Unifying Deep Stochastic Processes for Image Enhancement
https://arxiv.org/abs/2605.01568

>MIRL: Mutual Information-Guided Reinforcement Learning for Vision-Language Models
https://arxiv.org/abs/2605.01520

>GEASS: Training-Free Caption Steering for Hallucination Mitigation in Vision-Language Models
https://arxiv.org/abs/2605.01733
>>
>>108763579
>>108763585
thanks!
>>
Russell abandoned us
>>
File: 1763347421810252.png (534 KB, 1152x842)
>>108763550
You dropped >>108758016
>>
>>108763536
>old 1.2
meaning? Anyways, I don't speak Japanese either, but this hobby introduced me to the music of those two particular artists and I like them. I always follow along with romaji translations, but now I have some motivation to actually learn Japanese.

>the vocals are heartfelt. anyway, what's the theme of this one?

Theme is whatever Claude gave me a few days ago to test the LoRA kek, here's the lyrics, then followed by romaji and its English translation (Given to the model entirely in standard Japanese, which either the merge or the LoRA has improved).
https://files.catbox.moe/9ors79.txt

In that case it skipped a line or two, but in my testing it's actually very rare now compared to how often it happens in the Turbo model.
>>
big russ... please, come back...
>>
File: ComfyUI_temp_sbvyi_00012_.png (2.87 MB, 1920x1152)
>>
File: anima_00036_.png (1.33 MB, 1024x1024)
20 seconds on a 4gb gpu but at what cost
>>
>>108762944
>>
File: 1778042092486317_.png (1.41 MB, 1024x1024)
>>108763702
why is she outside the building
>>
>>108763636
>>108763689
Let him cook
>>
>Curious to see if Sogna artist is alive.
>The artist is alive
>Is a woman.

Didn't expect that.
>>
>>108763636
tdrusted
formerly trusted
>>
>>108763667
sorry, I meant the old ace step 1_3_5.

>>108763350
>>108763365
tencent was involved in songbloom.
https://github.com/tencent-ailab/SongBloom

It could be kind of nice. xmas gen:
https://files.catbox.moe/pxu1ia.mp3
The reference audio was iirc Mariah Carey.

The lyrics are "

problem: it basically was really unpredictable. Prompting was a wishlist.

I wouldn't want to go back lol. BUT, as you can hear, they're not jokers.
>>
>>108763782
>The lyrics are "
based on God Rest Ye Merry Gentlemen
>>
>>108763767
pretty style
>>
File: 1760764205433215.png (1.03 MB, 1344x768)
>>
>>108763530
>>Sulfur can't do pussy

It's a base model, LoRA trainers will improve this. Also, Sulphur 2.5 is waiting on funding
>>
File: debo_anima_00009_.png (2.38 MB, 2048x1117)
>>
warning! a diss track:
https://files.catbox.moe/21bzys.mp3
>>
File: ComfyUI_00151_.png (598 KB, 1040x544)
>>
File: Parental-Advisory-Logo.png (54 KB, 3840x2160)
>>108763893
>>
>>108763906
I didn't want anyone to take it personally lol
>>
>>108763893
fuck you nigger faggot
>>
ayo I is gittin oppress
>>
File: jordan rudess.jpg (226 KB, 940x535)
>>108763893
>>
File: 1757029316324595.jpg (909 KB, 1536x1536)
>>
File: ComfyUI_00006_.jpg (1.95 MB, 2286x4096)
waiting the API schizo this level of detail, this gen is 1 year old btw.
>>
>>108763971
(thumbs up)
>>
File: 1770031980345642.png (1.35 MB, 1344x768)
>>108763827
>>
anyone else firing up the kino factory?
>>
>>108763550
>https://huggingface.co/circlestone-labs/Anima
>405k downloads last month
woaw
>>
>>108763782
They made a slightly better one actually, it was called https://github.com/tencent-ailab/SongGeneration
It still left so much to be desired and was basically ACEStep 1.0 tier. ACEStep had a 1.5 version which was much better than that, and then it was iterated on and 1.5 XL is even better than standard 1.5. We are very fortunate to have ACEStep. Local audio models were in a very sad and dire state beforehand.
>>
>>108764079
neat. Still getting updates.
>>
File: 1761211995361161.jpg (747 KB, 1536x1536)
>>
>>108764079
have you tried running your generations through a mastering model? i remember someone suggesting that in a past thread. never tried it
>>
>>108764115
>a mastering model
is there one that actually works? I have 5 tracks that need cleaning
>>
>>108764115
All the generations I posted yesterday (Fate Gear, Zutomayo, Miku) are mastered actually. I do it like second nature before sharing gens, because Turbo just is slightly worse at capturing details. I'm the one who recommended it after finding out about Matchering 2 thru Discord. Some songs are better than others for mastering so I just rotate around an album and it usually only takes 1-2 tries. The base merge does not need as much mastering as Turbo anymore though, as in the initial output is not as noisy and the voice is crisp, but that only means it sounds way better when mastered so I go ahead and do it anyways.
>>
Nofap day 10, honestly, keep remembering that motorcyclist's cleavage.
>>
>>108764147
>I have 5 tracks that need cleaning
Matchering 2
https://github.com/sergree/matchering
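
For anyone who would rather script it than click through a GUI, here is a minimal sketch built on the `mg.process` call from the Matchering README. The file names and the `master_path` helper are hypothetical, and 16-bit PCM output is just an arbitrary choice here, not a recommendation.

```python
from pathlib import Path


def master_path(target, suffix="_master_16bit"):
    """Build an output name next to the input file (hypothetical naming scheme)."""
    p = Path(target)
    return str(p.with_name(p.stem + suffix + ".wav"))


def master(target, reference):
    """Match the target's loudness/EQ profile to a reference track with Matchering."""
    # Imported lazily so the helper above works without the library installed.
    import matchering as mg  # pip install matchering

    out = master_path(target)
    mg.process(
        target=target,            # the gen you want cleaned up
        reference=reference,      # a well-mastered song to match against
        results=[mg.pcm16(out)],  # write a 16-bit WAV
    )
    return out
```

Usage would be something like `master("track01.wav", "reference.wav")`, where both paths are placeholders. Swapping the reference track is the main knob, so it is worth trying a few.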
>>
>>108764170
>Matchering 2
I've tried it actually, just wasn't super happy with the results. Perhaps I used a bad reference song.
>>
>>108764154
can you post a before and after?
>>
>>108764177
I recommend trying Web Audio Mastering, but Matchering 2 with a good source can sound much better than that. Some songs have poor mastering that lowers the volume a ton, so finding a good reference is just research: find a song that sounds clear. Googling for well-mastered songs in the genre you're targeting may help. Look thru playlists. Tip: movie scores tend to be well mastered due to insane budgets. LLMs like Gemini can give ideas as well. You'd be surprised. I don't search anymore because I have good songs and albums that apply to everything I have (just downloading albums from a few target artists is enough).
>>
>>108764014
kino
>>
I bet that motorcycle zoomer wants a gen x nofap boyfriend.
>>
>>108764198
nofap is not supposed to turn you gay. rebuke the devil
>>
>>108764203
>:(
he said show my penis to drivers on the highway
>>
>>108764225
he is trying to use you as a grim reaper
>>
>>108764237
lol
>>
File: lily 3.png (1.31 MB, 848x1264)
>>108763721
is this local? reminds me picrel claimed to be local too
>>
>>108764186
Turbo gens since I'm AFK, these are from weeks back (before I figured out the DiT only trick so gens might be more boring than what the Turbo model can do).

https://desuarchive.org/g/thread/108694497/#108695746
Top is mastered, bottom is pre-mastered.

Pre-master
https://vocaroo.com/1a4VSBjqkuSX
Master
https://vocaroo.com/1b0F41rAgXqR

Depending on the gen, it doesn't get rid of every issue, but it can make it go from unacceptable on decent headphones to okay sounding.
>>
>>108764192
Do you use 32bit float, just normal settings?
>>
>>108764274
oh, i thought it would fix the metallic sound. sounds like some basic stereo manipulation. someone needs a model that actually rebuilds the full song to make it HD, kind of like asking image editing models to make something higher quality
>>
>>108764283
16 bit WAV files. Default settings on UVR, I use this desktop app
>https://ultimatevocalremover.com/
>>
>>108764300
you can, it often works. i think it's easier on wan though.

usually it's a matter of time and offloading to system RAM if your gpu's vram isn't so large
>>
>>108764274
what do you mean by 'mastered'? What are you using?
>>
File: he sold.jpg (134 KB, 1086x1448)
>>
What do you guys actually use these generated images for? I liked the process to get image and video generation working on my system, but now I'm stumped, I really have no use for any of this. Just generating random images is a bit meh.
>>
File: lily 5.png (1.26 MB, 848x1264)
>>108764270
Which nigga claimed that was local?
(These are Nano Banana + editing in GIMP.)
>>
>>108764398
making kinos for my amusement
>>
35
>>
>>108764495
it was in adt
adt for local
fuck you nigger
>>
>>108764578
adt is for anime
>>
>>108764586
for anime generated locally
do you see any mentioning of nai or other cloud shit
kys
>>
https://files.catbox.moe/x5z448.mp3
The text:
https://webstersdictionary1828.com/Dictionary/joy

Needed cfg scale to be 13.

>>108764398
similar to above. for memory stuff.
>>
>>108764617
fyi, if I had fed that into udio, at least at the time I left, it would have never worked. You had to really slave to get uneven verses working.
>>
>>108764672
ie udio, at least in the past, wants to talk instead of sing.
>>
>>108764734
Hmmm nyo
>>
>>108764597
Fuck you, /adt/ is for anime diffusion in general. Here is for local realism and western art
>>
File: 345956.png (759 KB, 1835x1047)
anyone else making ltx kinos?
>>
>>108764781
I am glitching out with Sulphur. What are your steps/cfg/distill LoRa strength?
>>
>>108764781
it can do 40 seconds?
>>
>>108764781
upload i wanna see
>>
>>108764757
nigger
> do you see any mentioning of nai or other cloud shit (in the op)
fuck you
>>
>>108764787
i am using the default settings on wan2gp. it seems to be 8 steps and i don't know what the rest of the settings are since it's hidden from the UI
>>108764790
it can go as far as you want. i'm just being meticulous with small extensions
>>108764808
i will once i reach the bridge after the chorus
>>
>>108764734
>>108764511
>>
>>108763567
I captioned the dataset late at night, said fuck it and started training. Just woke up.
It's done training but preliminary tests seem to show disappointing results. Maybe earlier steps are better, or it needs a different prompting meta than what I am doing. (Cope I know but I need to test it more.)
I honestly didn't expect to get it right first try, but still.
I will make a training help blogpost later.
>>
>>108764781
i was trying sulphur, it still seems quite tricky compared to wan tho
>>
>>108763550
Do you know how to configure ComfyUI ZImage to work with the Hammerai website? I have the portable version for Windows. I have a problem with it not being able to find the models.
>>
>>108765059
the prompting for ltx seems to be very sequential. you have to write it out in the order in which things should be seen. if you mention that she has boots at the end of the prompt, the camera will aim down to look at her boots only at the end of the video
>>
hello? retards?
>>
>>108766065
retard here how can i help
>>
>>108766094
where are you guys
>>
>>108766065
hello? is this thing on? am I all alone? is anyone there?
>>
>>108766102
>>108766065
anima posters moved to anime generals
>>
>>108766239
they’re discussing loras and stuff in /hgg/
>>
Give it a rest pal no one believes you
>>
File: Anima_02242_.png (1.05 MB, 832x1216)
So here is an interesting thing about anima's @ keyword for styles.
Anima understands on a fundamental level that "@" is associated with styles to the point that it will hallucinate watermarks and patreon links on occasion if you feed it a nonsensical style tag.
(This was with @real photo and my lora disabled for testing comparisons.)
You don't see anything similar in sdxl.
When there is a watermark or link hallucinated, it is almost always gibberish and not strongly related to anything on your prompt.
>>
>>108766300
what was your captioning like, and how many steps was that?
i was curious to see realism trained as an @, but that doesn't look promising.
>>
File: 1588629222545.gif (933 KB, 220x220)
i've received warnings for racism (what) in the past, but look at the attitude of the Chinese. they're now ignoring the users who made them famous. that's not a respectful attitude, is it?
>>
wat is weeb labs?
>>
File: ComfyUI_temp_gqkgg_00022_.png (2.53 MB, 1520x1040)
After playing around with the Spark.Chroma 1024 model, my conclusions are as follows:
1 - this seems like a sidegrade compared to 512 and a downgrade from preview
2 - It is better at following some specific prompt details, like film grain
3 - It is worse at replicating non-photographic styles (unable to do American comic books, worse at black and white manga, for two examples)
4 - It seems worse at generating faces in a crowd, something I didn't notice with the 512 model
5 - the preview model is still the best overall, making a good compromise and being better at non-photographic styles
6 - Chroma is still the best for NSFW work, being the only one that is able to generate correct male genitalia (not just on the man that is penetrating, but also on males in a crowd or by themselves) and blood/gore (being able to render carcasses, body interiors, blood drips etc. much more coherently than other models)

https://files.catbox.moe/5js4so.png
>>
File: Chroma_final_00028_.png (2.31 MB, 1520x1040)
https://files.catbox.moe/c8bcl1.png
>>
File: Chroma_final_00034_.png (2.54 MB, 1520x1040)
https://files.catbox.moe/1h72q7.png
>>
File: Chroma_final_00005_.png (2.22 MB, 1040x1520)
https://files.catbox.moe/j94wjf.png
>>
File: Chroma_final_00032_.png (1.84 MB, 1920x800)
https://files.catbox.moe/ao3ew5.png
>>
File: Anima_02243_.png (1.13 MB, 832x1216)
>>108766338
I already posted a bit about the captioning last thread but here is an example caption:
>@real photo. A young White woman in her early 20s with vibrant, wavy red hair leans back against the thick, gnarled roots of a large tree in an autumnal forest. She gazes thoughtfully upward and away from the camera, her face framed by her bright hair and accented by dark, plum-colored lipstick. She wears a black long-sleeved turtleneck sweater and white pleated trousers, with her arms crossed comfortably over her chest. Small, white flower-shaped earrings are visible on her ears, and her fair skin has a soft, rosy glow on her cheeks. The massive, grey tree roots cradle her body, while the ground behind her is covered in a thick layer of fallen orange and brown leaves. The background trees and the leaf-covered slope are softly blurred, creating a shallow depth of field that keeps the focus entirely on her.
I had 5 epochs with 1300 images so 6500 steps. Fuck it, might as well turn this into a help me post. Batch size 2, 1024p, AdamW, LR 0.00003, cosine, WD 0.001, betas=0.9,0.99, rank 96, alpha 48, 0.05 dropout, sigmoid_scale 1.3, max_grad_norm 1.0
Any ideas how to proceed?
As embarrassing as it is, here is the same image with the lora. There are better images and worse ones.
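
A quick sanity check on the numbers in this post, as a sketch under assumptions: no dataset repeats, no gradient accumulation, and the common LoRA convention where the update is scaled by alpha / rank. The 6500 figure matches images x epochs (samples seen). If the trainer counts one step per batch, batch size 2 would give half that many optimizer steps. Which of the two a given trainer reports is not stated here, so treat both as possibilities.

```python
import math


def run_summary(images, epochs, batch_size, rank, alpha):
    """Derive basic run quantities from the settings quoted in the post."""
    samples_seen = images * epochs                   # every image once per epoch
    steps_per_epoch = math.ceil(images / batch_size)
    optimizer_steps = steps_per_epoch * epochs       # one step per batch (assumed)
    lora_scale = alpha / rank                        # common LoRA scaling convention
    return {
        "samples_seen": samples_seen,
        "optimizer_steps": optimizer_steps,
        "lora_scale": lora_scale,
    }


summary = run_summary(images=1300, epochs=5, batch_size=2, rank=96, alpha=48)
print(summary)
```

With these inputs the samples seen come out to 6500, the per-batch step count to 3250, and the LoRA scale to 0.5, so it is worth checking which number the trainer's progress bar is actually showing before comparing step counts with other people's runs.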
>>
comfy will pay you minimum wage to make workflows
https://docs.google.com/forms/d/e/1FAIpQLSdCrLN2UBKjeqz30__wMjTAXoVqhypTTq1Gl08y2nvKaEf98A/viewform
>>
File: Anima_02225_.png (1.27 MB, 896x1152)
>>108766637
Posting a slightly better example to feel less shame.
>>
File: Chroma_final_00024_.png (2.12 MB, 1040x1520)
https://files.catbox.moe/oak48j.png
>>
>>108766637
desu i just use the diffusion-pipe defaults
>>
File: ComfyUI_temp_epzdk_00005_.png (3.14 MB, 1824x1248)
https://files.catbox.moe/qqx56k.png
>>
File: ComfyUI_temp_yhnyn_00037_.jpg (1.07 MB, 2880x1632)
https://files.catbox.moe/0kv33c.png
>>
>>108766681
I mean I don't think this deviates strongly from the settings in the rutkowski lora.
>>
File: ComfyUI_temp_yhnyn_00076_.jpg (1.25 MB, 2880x1632)
https://files.catbox.moe/rlv0j1.png
>>
>spend a few hours carefully curating more data for my lora
>train next version
>it's worse than the previous version
>>
File: ComfyUI_temp_varei_00002_.jpg (707 KB, 2880x1632)
https://files.catbox.moe/38yj18.png
>>108766782
many such cases
>>
>>108766637
6500 steps doesn't seem like nearly enough steps with 1300 images. 1300 is kind of a huge amount of images for a lora of a single concept.

Just try baking it for longer. If you have a pretty diverse data set you might wanna take it all the way to like 20000 steps, just make sure you're saving checkpoints often.
>>
>>108766637
>>
what the fuck is the obsession with making realistic loras for an anime model? makes no sense, it looks like shit
>>
what the fuck is the obsession with making an anime model by using cosmos a realistic model? makes no sense!
>>
>>108766786
Are the tall dudes invisible to the bottom dudes?
Also second dude from left has an arm overlapping the middle cape.
>>
>>108766814
6500 takes close to 8 hours on my GPU.
I guess I can train for a day after taking so long to curate the dataset.
But I wonder if I can pull off faster convergence without frying with higher LR?
>>108766880
Troll post but I want to use its character and NSFW knowledge for realism.
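
Rough time estimate for the longer bake suggested above, assuming throughput stays constant so wall-clock time scales linearly with step count. The 6500 steps in about 8 hours comes from this post. The 20000 target is the figure suggested in the reply. The function name is made up.

```python
def estimate_hours(target_steps, measured_steps=6500, measured_hours=8.0):
    """Linear extrapolation of training time from one measured run."""
    return target_steps * measured_hours / measured_steps


hours = estimate_hours(20000)
print(f"{hours:.1f} h")  # prints "24.6 h"
```

So "train for a day" is about right for 20000 steps on the same settings, before counting any time spent saving checkpoints or sampling.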
>>
>i-it's a troll post if i get told the truth
>>
>>108766880
it's a small model, it has great prompt adherence, it's easy to train, nsfw out of the box, it has a lot of concepts trained into it that negate the need for specific character/style loras, and it's fun to fuck around.
plus a lot of people are looking for something to replace legacy models, anima will be that model, so may as well get a jump on it now.
>>
>>108766786
i wish we had z-anima, the parameter count for anima is just too small to do scenes like this in it.
>>
>>108766880
The real question you should be asking is why hasn't anyone made a 3DPD porn model even though it has the most original and high quality data on the internet. It's been years and not even a hint of one being made. Very sad for pig enjoyers.
>>
>>108766924
preview 3. no sense in wasting compute until the model is done.
>>
>>108766910
>anima will be that model
almost every lora for anima has a barren sample section, nobody is using the model and the people that use it are making the exact same slop they did with sdxl so creativity has peaked
>>
>>108766907
What model do you use when you want to see a photo of a fictional character taking multiple cocks into her asshole?
>inb4 Chroma
>>
>>108766971
>taking multiple cocks
even in your fantasies, you watch.
>>
If I use a distilled model that runs in 8 steps, does that mean the image has fully converged at 8? If so, why does it still change when I increase the steps? For example, with ZIT or Ernie around 10% of the image still changes between 8 and, let's say, 20 steps. It also produces some kind of patchy textures, why? With other models like CHROMA FLASH this is even more pronounced, again why? Also, all these models should in theory work best with ODE samplers and a simple noise scheduler, so why do SDE samplers seem to work well?
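
One generic piece of this can be shown with a toy, not a diffusion model: a fixed-step Euler integrator lands on a different endpoint whenever the step count changes, because each step count follows a slightly different discretized path. The ODE below (dx/dt = -x) is an arbitrary stand-in chosen because its exact answer is known. It says nothing about distillation or about why a particular model produces patchy textures.

```python
import math


def euler_endpoint(steps, x0=1.0, t_end=1.0):
    """Integrate dx/dt = -x from t=0 to t_end with fixed-size Euler steps."""
    x = x0
    dt = t_end / steps
    for _ in range(steps):
        x += dt * (-x)  # one explicit Euler update
    return x


exact = math.exp(-1.0)  # closed-form solution at t=1
for n in (8, 20, 200):
    approx = euler_endpoint(n)
    print(n, approx, abs(approx - exact))
```

In this toy, more steps simply moves the endpoint closer to the exact solution. With a distilled sampler the few-step path is what the model was tuned for, so a different step count giving a different image does not by itself tell you which one is "more converged".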
>>
>NOOOOO you can't fantasize about having multiple penises chud because... YOU JUST CAN'T OK???
>>
>>108766956
preview 3 had almost half a million downloads on huggingface.
mugen released around the same time, it had fewer than 2 thousand downloads.
sdxl is dead, regardless of creativity.
>>
>nyooo, i HAVE to gen my waifu getting fucked by multiple men
fucking cuck faggots
>>
>>108766928
I'm not talking about a small finetune of an existing non 3d model. I'm talking about training a major model from the ground up like Anima but with real porn data and proper tagging that takes full advantage of sex related activities like booru. No one to this date has done it and no one is even hinting at starting. Very strange considering the amount of 3DPD enjoyers.
>>
>>108767010
and yet, anima gets barely any traffic on civit while sdxl gets loras and samples uploaded nonstop
>>
I get these bait posts, but I refuse to talk about Anima and anime in non anime generals. If you wanna continue the discussion in actual anime threads, ping me and we can keep going there.
>>
>>108767077
I don't even post anime but it does belong here. This general is about local models not realism models. You seethe at the tranime poster but the api fags are the ones actually breaking rules.
>>
anima is proof that we didnt need a better model than sdxl
>>
>anima sucks
>im not going to talk about anime in a non anime thread
There he is!
>>
>>108767107
It's unironically every day now. Why does anon keep falling for his low effort b8?
>>
Man I just want the next Anima release.
>>
File: 1778074531463321.jpg (393 KB, 1735x1270)
Neat, in /adt/ someone shared a new front end for Comfy alongside a full and free Anima artist style list >>108765823
>>
>>108767121
this looks like invokeai
>>
>preview2
>img upscaler
lol
>>
>>108767103
>don't worry about the prompt, just use a controlnet
>oh yeah the faces are always fucked up, sdxl can't do faces unless it's a portrait. run a detailer to fix it
>oh don't forget to run a detailer on the hands and feet
>just inpaint some more and it will be fine
sdxl is dead
>>
>>108767121
that gen can be done with illu, anima is worthless in the hands of its users
>>
Why does anon come here to cry and pout about anima and ldg every day? I don't get it.
>>
>>108767133
>A-anima is the future bro!
>nobody is downloading or using Anima loras
>lora uploaders that claimed it was the future went back to making IL loras after nobody downloaded their loras for Anima
>anima previews are getting worse
anima never lived
>>
>>108767149
two popular things. anon gets jealous when the attention isnt on him.
>>
Are there any other Z Image turbo loras besides these Redcraft distill ones? The results look like shit.
>>
>>108767137
Hey, don't be so mean to adt. They try their hardest but it's difficult to beat ldg anons anime gens.
>>
>>108763767
my first faps..
>>
I'm trying sulphur and the vaginas are disgusting looking, I hope someone can do an innie pussy lora
>>
>>108767155
if anima wasn't good you wouldn't be seething about it every day of the week.
>>
>>108767107
it's been a while since I came back here, so it's not about comfyui obsession anymore nowadays?
>>
>>108767202
if anima was good, people would care.
>>
>anima is bad stop enjoying it!
>REEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE
>STOP IT!
stinky ass reeetard b8
>>
>n-netalumina is the future bros!
>gens 1girl, standing with longer gen times
>a-anima is the future bros!
>gens 1girl, standing with longer gen times
sdxl just works for your neanderthal needs
>>
>>108767220
It's bait alternating between comfyui bad good, local/api console wars, and " xxx popular model is actually bad and I'm gonna talk about it 24/7".
Whoever is doing that is just burning his life in the most waste of time shit imaginable.
>>
it's only bait when i dont agree with it, simple as
>>
35 stars status?
>>
the truth shines
>>
Nothing ever happens
>>
>>108767222
This anon >>108767237 clearly cares kek
>>
>>108767220
>so it's not about comfyui obsession anymore nowadays?
Oh it still is. He only circles around to "comfy killed local" after he's gone through the other bait.
>>
>>108767256
i enjoy his endlessly evolving meltdown
>anima is bad and it has 'catastrophic forgetting'
>it doesn't forget but you need a linux workstation to train it
>you don't need a workstation but you do need linux
>it's actually easy as fuck to train and can be trained on any hardware... b-but it's slow
>some jeet on civatai is still making sdxl loras so anima is dead
>this gen could be made with sdxl and adetailer
>you can't talk about local anime models in /ldg/
>novel ai just works!
>anima gen time is slow
the meltdown when anima fully releases is going to be biblical
>>
>>108767190
genitals in eros and sulphur look fucking terrible and deform themselves at the slightest movement and loras barely alleviate the issue from my experience. wan is still better atm imo.
>>
>>108767300
so you're saying that every single post ever about anima is one guy? how many meds are you on?
>>
>>108767308
i would wager about 70-80% are from the same guy.
you can read about him in the op lol he is pretty easy to spot considering he is itt 24/7 and says the same shit everyday.
https://rentry.org/animanon
>>
i dont really understand why an anon would lurk and post here if they are not interested in New Thing seeing as this is the preeminent New Thing ai image and video thread on /g/
>>
>>108767329
>i would wager
so you're guessing? got it. i like anima though so i dont care, i think people fighting models is retarded but most of you are maladjusted fat people so its expected.
>>
>>108767329
you just got b8ed by him anon...
>>
>they're still posting their stable diffusion shit

how are you retards always 3 years late on everything
>>
>>108767300
>anima is bad and it has 'catastrophic forgetting'
But it forgets
>>this gen could be made with sdxl and adetailer
But this is true
>>novel ai just works!
This is true also until Anima full release
>>anima gen time is slow
This is true
>>
>>108767353
local never evolved beyond stablediffusion
>>
>stop talking about and discussing the newest anime model!
>why are you still posting stable diffusion youre all so behind
???
>>
julien really deserves all the bullying
what a subhuman
>>
>>108767357
>>anima is bad and it has 'catastrophic forgetting'
>But it forgets
Have you been able to find any proof of this yet?
>>
kill ani IRL
>>
>>108767220
>>108767240
>>108767281
He even responds to himself with the opposite bait, it's sad to look at but at least it's very easy to hide once you've been around long enough.
It doesn't even matter who it is.
>>
is this guy still doing the off topic drama bullshit? fucking hell
>>
>>108767369
Any lora from Civitai with any character. The stronger the lora weight gets (closer to 1), the more concepts, clothes, and physical appearance the model forgets. Come on dude, don't be dense. Every model has flaws. There's no such thing as a perfect model.
>>
pedo faildev RAT
>>
>>108767393
Why should anon take you at your word if you've never trained it before though?
>>
>>108767393
could you post an example
>>
>>108767410
no that would defeat the whole purpose of his trolling
>>
>>108767405
I don't train but I use loras, I'm a local consoomer with an opinion about a local model talking about it in a local diffusion general.
>>
>>108767393
>Any lora from Civitai with any character
Okay here's one https://civitai.com/images/129231270
Looks good
>>
>>108767410
i asked ANYONE to post a single anima gen that showcases it's prowess but no one ever did or does since 1girl, standing is the extent of this general's capabilities
>>
good thing I don't give a shit about your opinion
>>
>>108767424
You should preface your original post clarifying that you are a notrainer so anon can disregard your uninformed opinion.
>>
cared enough to reply to me indirectly, bitch boy
>>
>>108767429
you havent been lurking? go through the archives lol
>>
>>108767114
For what?
>>
File: Anima_00004_.png (3.17 MB, 1536x1536)
>>
cared enough to post and solve a captcha, bitch boy
>>
>>108767466
top of the soles version pls
top of the soles > bottom
>>
>>108767427
he will pretend he didnt see this and use the same "look at any civitai lora anima forgets" line tomorrow
>>
feetima
>>
Just a bit of banter :-)
>>
i don't get anons problem, loras train just fine
>>
>>108767427
I don't post anime in non anime threads sorry
>>
>>108767494
Are you the windows poorfag with the integrated gpu from yesterday?
>>
you all bicker like women holy shit
>>
>>108767502
Yes, all me
>>
>>108767510
estrogen will do that to you. these troons are out of control.
>>
>>108767476
gen it yourself poorfag lol
>>
Localtroons...
>>
in my local diffusion general?
>>
>>108767466
is it just anima or with zit? face looks too good for anima
>>
Does anima know Shinichi Sakamoto?
>>
>>108767558
Anima ->ZiT ->Juggernaut for face variety
>>
>>108767597
Most probably not.
That's a nice style though.
>>
File: LTX-2_00524.mp4 (2.5 MB, 960x960)
>>
>>108767613
It is, I love his art but no one ever made a lora for this manga or its characters. I'll just learn how to make a lora and make it myself.
>>
>>108767597
Its dataset consists of danbooru and some deviantart. You only need like 30 img to train a lora thoughever.
>>
>>108767644
For a style lora I would aim to hit the 100 image mark.
>>
we need a list of artists that despise AI the most
>>
File: LTX-2_00525.mp4 (2.43 MB, 1056x1056)
>>108767598
stop with the SDXL nonsense, that model has been deprecated for so long, I haven't used it in years
>>
>>108763550
What day it is /ldg/?
TODAY IT'S NATIONAL LOCAL SHAMING AWARENESS DAY :D :D !!!
>>
>>108767677
>>108767621
I like how ltx fixed her deformed toes
>>
>>108767644
>thoughever
meds
>>
>it's trying to derail the thread again
>>
>>
https://storage.courtlistener.com/recap/gov.uscourts.cand.407208/gov.uscourts.cand.407208.531.0.pdf

>Plaintiffs Position: Midjourney built an “overfit” filter to catch its models outputting copies of their training data. It also created style reference models with [REDACTED] that capture and mimic the distinct style of 4,700 individual artists by name. The source code for both overfit filtering and style mimicry is primary evidence of how Midjourney’s models regularly infringe and how its uses of Plaintiffs’ works were anything but “fair.” Midjourney should be ordered to produce the source code for the overfit filter and its database, as well as the style reference model and Midjourney’s efforts to isolate the distinct style of artists as model [REDACTED].

>Midjourney's position: Plaintiffs’ request for further source code should be denied. Midjourney’s core source code, including its training and data processing code, has been available to Plaintiffs for months; they are still in the process of reviewing it. Yet, they now demand additional source code for Midjourney’s overfit filter and style reference model based on a fundamental mischaracterization of both without showing the required necessity. Plaintiffs already have many documents on this functionality and are free to test their flawed theories about it at the upcoming depositions of Midjourney’s engineers. But Midjourney should not be forced to turn over even more of its most competitively sensitive source code as part of Plaintiffs’ latest fishing expedition.
heh
>>
File: 129415256.png (1.84 MB, 1520x1024)
>>
>>108767757
The text in the descriptions is too clean.
>>
>>108767772
lel
>>
>>108767768
holy shit saas is so advanced, they have government-level secrets that have to be blacked out. midjourney's style techniques are so far ahead of localslop that even the courts are working to protect it.
>>
>>108767772
>>108767757
I bet if you had half the problems we have trying to get our shit tier local models running somewhat stable you wouldn't have the mental bandwidth left over to maintain a sense of humor
>>
>>108767849
lole calm down trollanon
>>
>>108767852
One of the advantages of using cloud models is that most of my time with diffusion is actually spent thinking about the image itself, not peripheral bullshit
>>
>>108767849
Never used midjourney, is there a free trial?
>>
>>108767768
My favorite part of this is midjourney basically saying
>these claims are so retarded and make no sense
Which is something we've all felt when dealing with luddites
>>
>>108767747
i wish jannies would clean up the low effort trolling that happens here everyday but they rarely even wipe NSFW
>>
I want to break free from local chains...
>>
>>108767849
Is this your first time ever reading documents from a class action suit?
>>
I want to imagine.. I want to create.. I want to express myself.. so many things to say... but I can't. I have to fix workflows, make loras, tinker with prompts, inpaint, chain models, fix dependencies.. I'm exhausted... tired of fixing , no time to create, only to fix... maybe tomorrow will be the day I ditch local.. maybe tomorrow I'll be free... maybe tomorrow I can actually enjoy this hobby...
>>
maybe tomorrow I'll stop fudding since I'm a raped failed retarded dev
>>
>>108767907
hes pretending to be retarded
>>
File: sref.jpg (1.57 MB, 1181x1331)
>>
>>108766472
weed labs
>>
>>108767931
>sorry you can't say that.
>sorry we can't generate that image.
>i'm so sorry but the servers are busy right now, we will try on the older model unless you don't mind waiting in the queue.
>i'm sorry but i don't understand the "unique consistent character" concept, here is 1girl trained on basic instagram models.
>you are out of tokens, would you like to upgrade your subscription?
>oh dear that sounds unethical, i'm afraid we have to ban your account.
what a cucked existence.
>>
>>108768000
Just ignore the troll and he'll eventually get bored and fuck off to somewhere else until tomorrow when he'll repeat the same things
>>
File: 1778095491604.jpg (323 KB, 1024x768)
>70 seconds to generate two, 5 minute long songs
>5 fucking minutes to generate 5 seconds of video
>>
>>108768006
dude has been at it for many years at this point. he's not going to get bored but ignoring him is a good step.
>>
>>108767971
Style reference stuff is mostly just edit/reference with subject consistency.
Many open datasets do similar already
https://github.com/Tencent/MegaStyle
>>
File: maylimorphosis.jpg (2.45 MB, 1792x2304)
Anima preview 4 is coming... I feel it.
>>
>>108768007
>temporo-spatial information of millions of objects performing a myriad of different actions is much more complex to process than which frequency the guitar string vibrates at
You don't say.
>>
>>108768031
did daddy forgive her
>>
Anon hyped himself up assuming King Russ would release v4 yesterday when he never said anything like that. You've been spoiled by him giving LDG a heads up with v3.
>>
big russ here. we're skipping preview 4 in favor of PREVIEW 5!
>>
If I was big Russ I'd be sneaking pictures of my big willy into the training data tb desu
>>
>>108768031
>she's back... for round 2 of facial abuse
>>
Big Russ here.
>>
>calling him "big russ"
Fake fan detected
>>
Boss Russ
>>
since >>108755653 asked
>>
After running a few more tests, this time also without LoRA, I'm convinced merge-base-turbo-xl-ta-0.5 is the best ACEStep model by far, which is why I got superior results with LoRAs. It's not just a training thing. This model is good even without DCW, and DCW improves it marginally.

The logic is simple. The base model has the highest amount of core base model knowledge. The Turbo model performs some form of SFT and patches it up a bit. This time, the Turbo model was not trained as aggressively as before, so it retains a bit of creativity, but it lacks the raw structure/composition of the base model. The issue with the base model? Its outputs are broken due to lack of tuning.

Here's a comparison of the same exact prompt. No mastering.

Prompt- https://files.catbox.moe/wg1gpy.txt
These are all with LM disabled. DCW is enabled and set to appropriate values for each model. Turbo uses 8 steps; the rest use 50 steps, as in all my gens.
Base output- https://files.catbox.moe/86naro.mp3
Turbo output- https://files.catbox.moe/oymuzq.mp3
SFT output- https://files.catbox.moe/kqxbjs.mp3
Merge Base/Turbo 0.5 output- https://files.catbox.moe/dy3v4l.mp3

This is one of the first prompts I tested for Turbo XL, trying to get Initial-D like outputs, but I wrote it off because I thought it needed a LoRA to get it right. Keep in mind the lyrics do not have the proper annotation for some commands, which is why it's ignoring some of them and saying them out loud. I'm surprised it's not already common knowledge in the community that this model is better: it dropped 2 weeks ago, and the guy who dropped it maintains a UI and is very active on the Discord.
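
For anons wondering what a "0.5 merge" means in practice, here is a minimal sketch. It assumes the merge is a plain linear interpolation between two checkpoints with identical parameter names, which may not be how the actual merge-base-turbo-xl-ta-0.5 was made. `merge_linear` and the toy dicts are hypothetical stand-ins for real state dicts, with flat lists instead of tensors.

```python
def merge_linear(base, turbo, alpha=0.5):
    """Blend two checkpoints key by key: (1 - alpha) * base + alpha * turbo.

    Assumption: both checkpoints share the same parameter names and shapes.
    """
    if base.keys() != turbo.keys():
        raise ValueError("checkpoints must share the same parameter names")
    merged = {}
    for name, base_w in base.items():
        turbo_w = turbo[name]
        if len(base_w) != len(turbo_w):
            raise ValueError(f"shape mismatch for {name}")
        # Element-wise weighted average of the two weight vectors.
        merged[name] = [(1 - alpha) * b + alpha * t for b, t in zip(base_w, turbo_w)]
    return merged


# Toy stand-ins for real state dicts (hypothetical names and values).
base = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
turbo = {"layer.weight": [3.0, 4.0], "layer.bias": [1.0]}
merged = merge_linear(base, turbo, alpha=0.5)
```

With alpha=0.5 every merged weight sits halfway between the Base and Turbo values; alpha=0 gives back Base and alpha=1 gives back Turbo.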
>>
>>108768152
Not just instruments become more coherent, voices become so much more natural too. This is a universal improvement, which at first I attributed to LoRAs.
>>
They call me the P. Russler.
>>
Is image to 3d any good yet? Would be cool to have AI make 3d mockups to light and use as drawing references.
>>
>>108768031
When is the final version coming? Sucks having loras break.
>>
File: ComfyUI_23631_edit.jpg (3.3 MB, 1920x2560)
>>108764398
Do the things you find beautiful need a reason to exist? What doth beauty, Anon?
>>
Why did he sell out to comfy api??
>>
What's the best chroma model right now? I've tried it like a year ago and whatever chroma that was it was garbage and slow af
>>
So 2026 was basically Z Base in january, Anima and Klein in february and then nothing until now?
>>
>>108768294
Same old story.
Spark preview is the best but still has most of the faults of normal chroma.
The guy's attempt to unfuck it more has failed.
>>108768301
We got a bunch of useless turd research preview models trained on synthetic slop data, but basically yes.
>>
>>108768301
Remember how in previous years we'd only get one or two models
>>
>>108768301
yes, because local was abandoned in favor of API thanks to comfyui API nodes. why release open weights when comfy will shill your api for free?
>>
>>108768368
apis give comfy money to add support for their closed models
comfy gives money to people like russ to make open models like anima
literally robin hood
>>
russle here to tussle. move out the way apicucks!
>>
And in the closed source scene how many models were released in 2026?
>>
>trolling outside of /b/
>>
>>108768413
just check the comfyui github? hundreds, all sota top-tier performing models like seedream 4.0 which is still unrivaled in china, and nano banana 2 which redefined prompt comprehension
>>
>>108768341
But the hobby has 2 years...
>>
I'd like to take a census here, I'm almost certain that nowadays /ldg/ consists of one GPUtard and a bunch of trolls
>>
A big russ just flew over my house wtf
>>
>>108768462
It's the opposite desu
>>
>hundreds
>can only name two
lol
>>
>>108768462
feels like there is one resident "troll" that has a personal vendetta against comfy, then the odd saas cuck will drop by for a thread or two to have a melty when they get banned from whatever service.
>>
wonder what the 'td' stands for
>>
most people switched to api which is why the gen-posting slowed down massively. we just respect the thread so we don't post api gens here.
>>
>>108768514
kek you don't even respect yourself
>>
>>108768514
sure as heck don't post them in the api general, that poor thread limped along for a week until it fell off the board.
>>
File: ComfyUI_00798_.png (2.13 MB, 1000x1504)
I finally learned how to plug in loras lol, but illustrious is giving me horrible results for animu pics for some reason
>>
>>108768550
XL is outdated. Try Anima.
>>
take the anima enema
>>
File: ComfyUI_00799_.png (2.07 MB, 1000x1504)
>>108768560
sigh, I still need to learn how to manually set up workflows..
>>
>>108768548
This sort of reverse bait doesn't work with him because he does not come from those threads. He simply despises that this thread became the place to discuss ai image generation at its inception.
>>
>>108768581
99% of image generation workflows can be summarized by model loaders (diffusion, clip, and vae), text encoding nodes (positive and negative prompt), the sampler, and then a save image node. Almost all of the rest of it is bloat and or snakeoil DESU.
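
A rough sketch of that bare-bones graph as plain data, for anons who want to see the wiring. The layout mimics ComfyUI's API-format workflow JSON as far as I know it (node id, `class_type`, `inputs`, links as `[source_id, output_index]`), and the class and input names are the stock nodes as I remember them, so treat them as assumptions. All filenames and prompts are placeholders. It also includes the empty latent and VAE decode nodes, which the summary above leaves implicit.

```python
# Minimal text-to-image graph: loaders -> text encoders -> sampler -> decode -> save.
# Assumed layout: {node_id: {"class_type": ..., "inputs": {...}}}; a link is [source_id, output_index].
workflow = {
    "1": {"class_type": "UNETLoader", "inputs": {"unet_name": "model.safetensors", "weight_dtype": "default"}},
    "2": {"class_type": "CLIPLoader", "inputs": {"clip_name": "text_encoder.safetensors", "type": "stable_diffusion"}},
    "3": {"class_type": "VAELoader", "inputs": {"vae_name": "vae.safetensors"}},
    "4": {"class_type": "CLIPTextEncode", "inputs": {"clip": ["2", 0], "text": "1girl, standing"}},
    "5": {"class_type": "CLIPTextEncode", "inputs": {"clip": ["2", 0], "text": "blurry, lowres"}},
    "6": {"class_type": "EmptyLatentImage", "inputs": {"width": 1024, "height": 1024, "batch_size": 1}},
    "7": {"class_type": "KSampler", "inputs": {
        "model": ["1", 0], "positive": ["4", 0], "negative": ["5", 0], "latent_image": ["6", 0],
        "seed": 0, "steps": 30, "cfg": 5.0, "sampler_name": "euler", "scheduler": "normal", "denoise": 1.0}},
    "8": {"class_type": "VAEDecode", "inputs": {"samples": ["7", 0], "vae": ["3", 0]}},
    "9": {"class_type": "SaveImage", "inputs": {"images": ["8", 0], "filename_prefix": "ComfyUI"}},
}


def links(node):
    """Return the ids of the nodes this node takes inputs from."""
    return [v[0] for v in node["inputs"].values() if isinstance(v, list) and len(v) == 2]


def execution_order(graph):
    """Depth-first topological sort: every node comes after the nodes feeding it."""
    order, seen = [], set()

    def visit(node_id):
        if node_id in seen:
            return
        seen.add(node_id)
        for source_id in links(graph[node_id]):  # raises KeyError on a dangling link
            visit(source_id)
        order.append(node_id)

    for node_id in graph:
        visit(node_id)
    return order


order = [workflow[node_id]["class_type"] for node_id in execution_order(workflow)]
```

Nine nodes total, and everything a fancier workflow adds (LoRA loaders, upscalers, detailers) hangs off one of these links.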
>>
File: ComfyUI_00790_.png (1.96 MB, 1000x1504)
>>108768616
Neat! I'll keep that in mind
>>
n*gbo-esque situation
>>
I wish Russell would give me a preview4 enema
>>
Soon
>>
>>108767891
low effort website
>>
I just shit myself
>>
>>108768256
philosophically based jenner
>>
>>108768152
>Base output
awful metallic voice

>Turbo output
>SFT output
better

>Merge Base/Turbo 0.5 output
best sounding so far

This is actually usable, so is that because you used loras or is that just the model itself?
>>
>>108768152
All outputs were DiT only. SFT/Base merge is also not as good as Base/Turbo merge. The SFT/Turbo merge decreases Turbo's creativity and output quality by a large margin so I do not use it.

>>108768958
That is just the model itself. I disabled LoRAs this time to compare overall output quality. Heads up, on ACEStep cpp to actually disable the LoRA it has to be set to Disabled, otherwise there's a chance it activates even at weight 0. Anyways, even on prompts where every model other than Base completely breaks apart because of some aspect of the prompt, the Base/Turbo merge somehow corrects the other models' mistakes and sounds perfect while doing so.
>>
File: ComfyUI_temp_fuvle_00007_.png (1.49 MB, 1024x1024)
>>108768560
yeah the results are more of what I'm looking for, thanks!
>>
sitting at the kinoplex
>>
File: 3longkirk.png (1.94 MB, 1254x1254)
>>108769164
>>
File: ComfyUI_temp_fuvle_00004_.png (1.49 MB, 1024x1024)
>>108769204
no bully, I'm still learning
>>
ITS OUT!!!!!!!!!!!!
>>
how do anons prevent blue balls while gooning for hours?
>>
>>108769229
>>108769204
>>108769164
Hello, I think it would be a good idea if you posted your anime gens in an anime diffusion general and redirected your questions there. For example, here on /g/ there is >>>/g/adt/ which is for sfw anime. /adt/ was a thread created by /ldg/ last year to organize topics more evenly. I would appreciate it if you did this to keep things better organized.
Thanks and welcome to the hobby ^^!
>>
>>108769335
Cry more lol
>>
crazy to what degree /ldg/ broke him kek
>>
Well said >>108769335, /adt/ along with other threads belong to the /ldg/ ecosystem which in turn /ldg/ belongs to the grand ComfyORG ecosystem.
>>108769229 Welcome to the family, anon! Do you have a Nvidia GPU?
>>
>>108769351
why did you just reply to yourself tho
>>
I am seriously considering posting anime gens from now on and I don't even watch anime.
>>
>>108769164
No problem. Make sure to read the model card for prompting tips and the other Anima link in OP for an artist style guide.
>>
>>108769358
Reverse psychology faggot, KYS (know your schizo)
>>
>>108769164
In /adt/ there is a full Anima style explorer >>108765823 , and anons there actively use it, they don't speculate like here.
>>
All that just to get a single newfren to post somewhere else my sides
>>108769378
That one is outdated and has fewer artists than the tagexplorer
>>
dpm++ 2m sde >>> euler a
>>
>>108769400
/ldg/ is a corpse, the only bumps are from trolls and baitposts, and gen wise we're at rock bottom, itt 50 gens and half of them are API ones.
>>108769481
euler >>>> dpm++ 2m sde >>>>> euler a
>>
euler is a blurry piece of shit
>>
>>108769499
Euler a is a blurry piece of shit
>>
>>108769496
And somehow still the most active on /g/... Huh
>>
>>108769496
>and gen wise we're at rock bottom, itt 50 gens
newfag wasnt here for the worst trolling kek this is light
>>
>>108769496
quit your belly aching over it and post art and not just words of weep. pathetic evolutionary dead end.
>>
when are we getting proper ip-adapter for local?
>>
>>108769510
But it's fake engagement though. If all the posts are API or Julian shitpost, what's even the point of being the fastest?
Like cool, let me just share my pol takes and how my favorite baseball team is doing.
>>
>>108769532
What does that have to do with you crying though?
>>
>>108769525
Sir, this is local.
>>
Fresh

>>108769543
>>108769543
>>108769543

Fresh
>>
>>108769537
Huh? I never liked /ldg/ and seeing it like this is something I've always wanted.
>>
>>108769335
>>108769378
why does adt still have netayume stuff in their op even the die hard netasissies here realized its a dead arch already LOL




All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.