/g/ - Technology




KYS Edition

Discussion and Development of Local Image and Video Models

Previous: >>108687829

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/tdrussell/diffusion-pipe

>Z
https://huggingface.co/Tongyi-MAI/Z-Image

>Anima
https://huggingface.co/circlestone-labs/Anima
https://tagexplorer.github.io/

>Qwen
https://huggingface.co/collections/Qwen/qwen-image

>Klein
https://huggingface.co/collections/black-forest-labs/flux2

>LTX-2
https://huggingface.co/Lightricks/LTX-2

>Wan
https://github.com/Wan-Video/Wan2.2

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Illustrious
https://rentry.org/comfyui_guide_1girl

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Collage: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
>>
>>108690678
can you stop adding pedo images to the collage please?
>>
File: 1776568406465646.png (2.07 MB, 1086x1448)
2.07 MB PNG
>>108690672
>ok so which one of these models can generate nude images.
out of the box? none
you have to hunt for a niggermix
>>
>inb4 nigbo
>>
this dude still here seething about local and posting cloud images? bruh its been a full week give it a rest LOL
>>
File: HGudmIdbQAAFwOQ.jpg (176 KB, 941x1672)
176 KB JPG
>>
Which model do I use for adding text to an image in perspective (like writing on a shirt)?
>>
>>108690706
GPT image 2.0 is solid
>>
File: 17682467.jpg (1.12 MB, 1693x929)
1.12 MB JPG
>>
>>108690719
sorry Chang I don't speak ching chong
>>
>>108690698
They are mindbroken and will never leave this blessed local breads
>>
>>
Anonymous 04/25/26(Sat)19:14:27 No.108690725▶
>>108690449
>whats that?
I know the image is split using "wavelets". I know the issue is that parts of the gen get out of synch vs the model's training.

But I don't know the flow of the calculation, I didn't understand that part. Realize the paper applied to images but it's being applied to the c version of Ace Step:
https://github.com/ace-step/ACE-Step-1.5/issues/1119
>>
>>108690706
flux 2 klein 9b
>>
>mfw Resource news

04/25/2026

>StyleID: A Perception-Aware Dataset and Metric for Stylization-Agnostic Facial Identity Recognition
https://kwanyun.github.io/StyleID_page

04/24/2026

>MAI-Image-2
https://playground.microsoft.ai/chat

>ComfyUI-NAG-Extended: NAG support for Flux 2 Klein and Anima
https://github.com/BigStationW/ComfyUI-NAG-Extended

>UniGenDet: A Unified Generative-Discriminative Framework for Co-Evolutionary Image Generation and Generated Image Detection
https://github.com/Zhangyr2022/UniGenDet

>VARestorer: One-Step VAR Distillation for Real-World Image Super-Resolution
https://github.com/EternalEvan/VARestorer

>Sapiens2
https://github.com/facebookresearch/sapiens2

>Vista4D: Video Reshooting with 4D Point Clouds
https://eyeline-labs.github.io/Vista4D

>Pre-process for segmentation task with nonlinear diffusion filters
https://github.com/cplatero/NonlinearDiffusion

04/23/2026

>ParetoSlider: Diffusion Models Post-Training for Continuous Reward Control
https://shelley-golan.github.io/ParetoSlider-webpage

>DynamicRad: Content-Adaptive Sparse Attention for Long Video Diffusion
https://github.com/Adamlong3/DynamicRad

>Normalizing Flows with Iterative Denoising
https://github.com/apple/ml-itarflow

>LLaDA2.0-Uni: Unifying Multimodal Understanding and Generation with Diffusion Large Language Model
https://github.com/inclusionAI/LLaDA2.0-Uni

>Illustrious XL & NoobAI-XL Style Explorer
https://github.com/ThetaCursed/Illustrious-NoobAI-Style-Explorer

>AI Model & ‘MAGA’ Influencer Emily Hart Unmasked as Indian Man
https://www.yahoo.com/news/articles/ai-model-maga-influencer-emily-091027504.html

04/22/2026

>Embedding Arithmetic: A Lightweight, Tuning-Free Framework for Post-hoc Bias Mitigation in Text-to-Image Models
https://github.com/cvims/EMBEDDING-ARITHMETIC

>Denoising, Fast and Slow: Difficulty-Aware Adaptive Sampling for Image Generation
https://github.com/CompVis/patch-forcing
>>
where is the roleplaying blesser troon?
>>
>>108690718
>need account
>can't use guerilla mail
>>
>>108690719
kind of depressing that this is the best a SOTA model can do.
>>
kind of depressing local is still so far behind
>>
>single bake from baker
>hours of spering from you
>>
File: 534665136315842346665.png (2.57 MB, 1024x1536)
2.57 MB PNG
>>108690746
indeed
>>
File: 1272479.jpg (1.1 MB, 1448x1086)
1.1 MB JPG
>>
>>108690744
Thread theme: https://www.youtube.com/watch?v=Kqvg8YLpKlw
>>
File: behold, the power of API.png (1.04 MB, 747x1115)
1.04 MB PNG
>>108690757
>>
>>108690760
>I PAUSED MY CRYPTO NODE TO CHASE YOU
ashamed i kekd
>>
>>108690750
wow is the single baker female? NO FAKE FEMALES
>>
>>108690760
now this is funny
>>
How many pictures do I need to take of my cousin to make a lora of her? I could use like idk 4k right in a hidden non nude camera ofc highly legal.
>>
>>108690760
>cloudfag thinking the model is stored in the node
kys
>>
localbrown seething commenced
>>
File: 1753666913974784.png (2.22 MB, 1086x1448)
2.22 MB PNG
>>108690760
LOOOOOOOOOOOL
>>
>>108690760
you have to admit API shit can do great memes kek
>>
File: PaintingComparison.jpg (3.92 MB, 1920x4352)
3.92 MB JPG
Prompt just:
A serene mountain landscape painted in the style of a traditional Japanese ukiyo-e woodblock print, featuring flat colors and distinct bold outlines
>>
>>108690760
can you make it look like real security cam footage
>>
>>>108690642
https://files.catbox.moe/zxnkdg.png
>>
>>108690793
Size, anon, size. Add the VRAM taken by the model weights to its name.
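
A rough way to get that number is parameter count times bytes per parameter. The sketch below is weights-only and ignores activations, the text encoder, the VAE and framework overhead. The 9B figure is only an illustrative input, and the function name and dtype table are made up for this example.

```python
# Weights-only VRAM estimate: parameters x bytes per parameter.
# Real usage is higher (activations, text encoder, VAE, CUDA context).
BYTES_PER_PARAM = {"fp32": 4.0, "bf16": 2.0, "fp16": 2.0, "fp8": 1.0}


def weights_gib(num_params: float, dtype: str = "bf16") -> float:
    """Approximate size of the model weights in GiB for a given dtype."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


if __name__ == "__main__":
    # Illustrative input only: a hypothetical 9-billion-parameter model.
    for dtype in ("bf16", "fp8"):
        print(f"9B @ {dtype}: ~{weights_gib(9e9, dtype):.1f} GiB")
```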
>>
File: 7568736734734.jpg (870 KB, 1086x1448)
870 KB JPG
Remember to save on your compute by generating multiple 1girls
>>
>>108690820
it looks worse than your previous API gens, did OpenAI nerf their model already? lol
>>
>>108690809
nta thanks, that looks complicated, I don't have those custom nodes.

What's the basic flow lol like in this case
>>
>>108690779
the model is stored in the node
>>
File: Anima_0035.jpg (1.65 MB, 1344x2496)
1.65 MB JPG
>>108690820
bruh, you can't even can generate asses wearing a thong, what are you spamming your sanitized sfw girls, pathetic lmao
>>
>>108690793
ernie?
>>
>>108690840
Why did you repost anons gen https://desuarchive.org/g/thread/108664784/#q108666499
>>
>>108690846
it takes them 2 hours to gen an image on their 3050s, please understand
>>
>>108690830
GPT2 isn't really as good as NB with sexy baby girl pics tbdesu
>>
>>108690840
We need a symmetry breaker and a face fat adder with specific targets. women have pockets of fat in their faces that make them unique. uneven eyes...

a big thing is lips are very often not quite exactly symmetrical.

eyes are often less complete and more basic. bags under eyes are the norm at 30.

so many things.

noses, so many specific nose parts. zero prompts to control noses, but if you get the wrong nose, it will be the wrong feel...
>>
File: 1755518945614474.png (2.24 MB, 941x1672)
2.24 MB PNG
>>108690840
grok and seedance
>>108690846
thats just embarrassing
>>
File: Anima_0022.jpg (1.89 MB, 2112x1632)
1.89 MB JPG
>>108690846
>>108690855
they are my gens I can do whatever I want with them :)
Also, you can't upscale either, why are API cucks so pathetic, they can either post some unfunny memes, shitty infographics or sanitized sfw girls
>>
File: 95367.jpg (1.29 MB, 1536x1024)
1.29 MB JPG
>>
File: 7653865275472.jpg (1.07 MB, 1092x1402)
1.07 MB JPG
>>108690830
I don't know. The quality varies per gen. Sometimes you get good ones, sometimes you don't.
>>
>>108690809
megubased
>>
Reminder: Hide api niggers posts. They're trolling you just stop.
>>
File: 1759193517594996.png (901 KB, 1024x1024)
901 KB PNG
How do I make anima loras?

>>108690734
>tfw took 153 seconds
>>
File: Anima_0020.jpg (1.9 MB, 2400x1440)
1.9 MB JPG
>>108690864
>grok
>480p quality vertical videos
AHAHAHAHAHA
>seedance
>just give your ID and please record a video verification of your identity
OR
>add pathetic grids and filters to the original image and MAYBE you will bypass the censorship filter
OR
>pray for an API that maybe lets me do a midly sfw video

API cucks are so pathetic HAHAHAHAH
>>
File: kek.jpg (209 KB, 1581x724)
209 KB JPG
>>108690875
>>
>>108690864
that would go hard af in the 1920's, seeing a broad with her gams out. WHEW!
>>
>>108690873
yes, yes, it's so amazing, you are dismissed npc, go and watch rachel maddow, the biggest man in news.
>>
>>108690881
>How do I make anima loras?
>>108690678
>https://github.com/tdrussell/diffusion-pipe
And https://gist.github.com/tdrussell/3f79596efb8e27672da0881afd9c1d51
>>
>>108690837
It's not complicated, just a total mess because I keep slamming new stuff on top of it instead of creating new workflows.
You can do just fine with the default nodes.
>>108690878
more megu
>>
>>108690881
This guy...
>>108690188
made a stonetoss lora for anima
>>
>>108690886
>>108690889
>>108690890
Stop feeding the two retards.
>>
File: 1762473321172282.png (228 KB, 500x441)
228 KB PNG
>>108690873
>>
>>108690887
Showing promise. for me. I don't like the "magazine selected horse faced woman"
>>
>>108690873
>heh the state of the art model is better than a 3.5b model from 2023
the fucking state of api
>>
localboomers seething from the nursing home
>>
File: 1752180710251342.jpg (990 KB, 2112x1408)
990 KB JPG
>>108690793
Here is my attempt with the same prompt and ZiB
>>108690900
Nice gen
>>
>>108690886
lets not forget THEY HAVE TO PAY CREDITS to generate sfw images of 1girls, so sad
>>
>>108690873
you rattled them with this one
>>
>>108690894
Wake me up when OT supports it
>>
File: this.png (367 KB, 500x565)
367 KB PNG
>>108690873
we really rent free in their head don't we?
>>
>>108690926
If you're too dumb for diffusion-pipe then you can try https://github.com/gazingstars123/Anima-Standalone-Trainer
>>
>>108690926
>3 weeks since last commit
It's over.
>>
File: AS15T__00030_.png (378 KB, 512x512)
378 KB PNG
ace step xl, only 75 steps, sadly no dcw.

https://files.catbox.moe/6m58x6.flac
>>
>>108690935
I don't want to keep switching tools every time a new model drops.
>>
File: 1751333620054252.jpg (732 KB, 2112x1408)
732 KB JPG
>>108690760
>>108690795
Here is my version with ZiB and a non-cherrypicked seed. Do you think it looks more realistic?
>>
>>108690929
>I don't think about you at all
are you retarded? ldg has been entirely api seethe since gpti2 released
>>
>>108690965
the details look so bad, it's a ZiB image all right
>>
>>108690958
Love the parts where it spams the same annoying sound 30 times in a row.
>>
>>108690790
I mean Klein 9B Dist. can do this image verbatim, do you guys even fucking use the models kek. BFL strapped an 8B text encoder to it for a reason you know

Pastebin for prompt cause it's so long it exceeds comment length (Gemini 3.1 Pro output from original Gippity pic input)
https://pastebin.com/CMCPZxAe
>>
>>108690970
>ldg has been entirely api seethe
almost as if a local general would only like to talk about local models or something
>>
File: Brazilian Miku.png (1.05 MB, 832x1184)
1.05 MB PNG
>>
File: 1773420295281194.png (545 KB, 2100x6300)
545 KB PNG
How much money did this cost him do you think?
>>
look, he's paid to shill for a paid model.

Microsoft did the same thing. It's like the SPLC or whatever. Blackrock has done the same stuff. It's just reality, but also there are many genuine retards to also join the mix with paid retards.

Their job isn't to convince us of anything. We know local is amazing. What they want to do is shit up the text so someone reading it will conclude that there's no center of support for local, actually.

That's the pro purpose. The non-pros, idk what they rise each day for.

It sounds like a waste of time, but they are paranoid, you have to understand.
>>
>>108690994
He just a stinky indian trying to troll
>>
>apischolars pushing localstudents to explore their models beyond just 1girl
thank you comfyui for unifying us all in a single thread. so many people exploring the possibilities of z and klein now thanks to the creative genning by our api folk
>>
*yawn*
>>
>>108691012
this used to be the job of the St. Floyd poster but he seems to have left us unfortunately
>>
>>108690994
>We know local is amazing.
that's a stretch, for images it's decent, for videos we get completely destroyed by Seedance 2.0 (and to be fair, even their API rivals get destroyed by them too)
>>
>>108690977
It's a tendency with Ace Step, but it may be that I have temp too high. I'll try lowering it a bit and see.

One amazingly fascinating thing about ace step is you have to adjust the parameters for the genre.

The repetitiveness might just be because that's what the ai thinks bigroom is supposed to sound like.
>>
File: 1758500674940988.png (1.86 MB, 1122x1402)
1.86 MB PNG
>>108690970
gpt 2 has completely mindbroken them
they see an api image and immediately think their local models can do better, but whenever they try it just looks like dogshit
>>
>>108691012
I'm pretty sure nonretards already knew that at least Klein can do basically anything if the prompt is detailed and specific enough about placement / etc
>>
>>108691027
>they see an api image and immediately think...
... that it's off topic and that you should leave this place and "thrive" in your own home instead of disturbing ours >>108653190
>>
>>108691027
your picrel is clearly just generic ZIT slop though and not cloud anything
>>
>>108691019
I don't like video / movies / kino / animation.

I mean obviously sort of since I watch stuff some, but it's eh. I don't care. If tvs stopped existing, but photography continued, I'd be fine.

for me, video is all about showing how to manipulate mechanical components.

In theory it would relate to manipulation of sexual positions or whatever, but isn't that extremely basic?

Part of it may be that I am a cow rotator.
>>
>>108691027
if API is so great, then why do you prefer to lurk on a local general anon? curious
>>
>>108691034
>>108691040
>>108691047
Why are you still feeding the obvious troll
>>
File: zeta chroma bros.png (36 KB, 965x236)
36 KB PNG
Even sheeple are starting to wake up to his BS
>>108690988
Didn't original Chroma cost more than 100k? Probably gonna reach the roughly same mark before he gives up on it.
I wonder what kind of cope he will come up with for this one
Chroma was a "base model" so it was someone else's job to finetune it to become stable.
Klein just didn't train well.
Radiance was the initial training run for the first vaeless model.
Really curious for the cope this time.
>>
File: peace.jpg (1.4 MB, 1536x1024)
1.4 MB JPG
>>108691027
you don't have to troll, we're at peace now. just post your gens here and be happy
>>
>>108691054
>Didn't original Chroma cost more than 100k?
150k yeah lool
>>
>>108690971
Just bump up the shift. That gen had it set to 1.0 (the default is 3.0).
>>
>>108691078
>That gen had it set to 1.0 (the default is 3.0).
why did you put in such a low value?
>>
>>108691054
this hack can make an anima lora on his dataset in a day or two and get a better chroma
>>
>>108691055
It looks like a corporate ad, with corporate "beauty" values.
>>
File: 1751200215465818.jpg (1.73 MB, 4040x1667)
1.73 MB JPG
reminder
>>
>>108690757
underrated
>>
>>108691081
From my experience, lower shift values give more natural looking outputs at the cost of some (as you mentioned) fidelity. Sometimes for photographic styles the default value is fine and not completely slopped. Other times it'll have the same sort of look as ZiT.
Obviously images that have text are going to be hard to pull off with a low shift, but when there's none it looks good.
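
For anyone wondering what shift does numerically: a minimal sketch, assuming the SD3-style timestep shift that many flow-matching samplers use. Whether the model discussed here uses exactly this formula is an assumption, and `shift_sigma` is a made-up name. With shift 1.0 the schedule is unchanged. Higher values push every step toward the noisy end, so more of the sampling budget goes to high noise levels.

```python
def shift_sigma(sigma: float, shift: float) -> float:
    """SD3-style timestep shift (assumed formulation).

    sigma is the noise level in [0, 1]; shift=1.0 returns it unchanged.
    """
    return shift * sigma / (1.0 + (shift - 1.0) * sigma)


if __name__ == "__main__":
    # Compare an evenly spaced schedule at shift 1.0 and 3.0.
    steps = [i / 4 for i in range(5)]
    for shift in (1.0, 3.0):
        print(shift, [round(shift_sigma(s, shift), 3) for s in steps])
```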
>>
>>108691089
how do i gen this
>>
>>108690844
yeah, what about it
>>
File: ComfyUI_temp_pdtgy_00270_.png (2.33 MB, 1120x1488)
2.33 MB PNG
There is nothing more satisfying than generating your own images in your local machine, training your own loras of whatever the hell you want, using an uncensored LLM model to caption and generate anything of whatever your imagination wants, API cucks will never know the feeling, with local only your imagination is the limit, with api and saas cloud crap your limit is what >they determine is the limit, **you must play by our rules goy** and on top of that you have to pay, being an api user is the ultimate form of cuckoldry
>>
File: 1750165603570609.png (1.97 MB, 1024x1024)
1.97 MB PNG
>>
what's with all the seething? we can gen with any model in comfyui and yet people are crying over models they don't like. what's the problem?
>>
>>108691099
Why are you reposting another anons gen from last month? https://desuarchive.org/g/thread/108320614/#q108324772
>>
>>108691052
trolls get bored, that shit is crippling mental illness and OCD.
>>
>>108691054
I'm still salty about Kaleidoscope, it could have been fucking great if the retard didn't continue to believe in magic VAEs that somehow magically allow 256x256 / 512x512 training to not enormously, almost irreparably degrade the compositional abilities of the underlying model. Like no matter what the fuck he says that's the ONLY thing that was wrong with it, he'll always be completely fucking wrong in his belief that training resolution isn't directly related to the actual outputs of the model IRL
>>
File: 13218446518.jpg (633 KB, 2048x1456)
633 KB JPG
>>
File: ComfyUI_temp_egrbv_00012_.png (2.04 MB, 1664x1216)
2.04 MB PNG
>>108691115
thank you for keeping track of my gens :)
>>
I can gen infinite lolis spreading their buttholes in my face. The ball's in your court, APIcucks.
>>
>>108691127
NovelAI did this before local ever could
>>
File: 7356734772.jpg (986 KB, 1024x1536)
986 KB JPG
>>108691127
best I can do is 1girl
>>
>>108691132
NAI just steals from local
>>
>>108691099
didnt one of your boys just get banned from civitai for posting porn loras? it seems you also have to follow rules and regulations
one day youll have to show your ID to download a jeetmix, what will you say then?
>>
File: nothing ever happens.png (176 KB, 460x310)
176 KB PNG
What's the next cope? No news about a new model coming or something? Even some "SoonTM" shit would put me in a better mood, nothing is happening!
>>
>>108691022
idk, I like techno type sounds, it's repetitive again, so working with different settings to see if I can identify what does that (with the llm).

>>108691099
She's pressing "2".

What's 2 do?
>>
File: loratraining.png (36 KB, 1759x183)
36 KB PNG
>>108691141
like I give a crap about civijeetai lmao
>>
>>108691143
>implyin we don't have great models already
>>
>>108691148
says the winjeet
>>
>>108691143
now that china sold out to api thanks to comfyui, local has nothing left. so instead it's just seethe
>>
>>108691151
>LTX 2.3 is great
shut up moshe
>>
>>108691154
it is though
>>
>>108691141
no one has ever been banned from CivitAI *merely* for "posting a porn lora", there's fucking always more to the story. Every fucking time it's always that in fact it was just outright Cheese Pizza or some other fucked up shit literally nobody wants to see anyways
>>
File: civitai.png (26 KB, 318x456)
26 KB PNG
>>108691148
muh "everything is indian"
literally an American startup started by a white guy from fucking Idaho
>>
how do you use ltx 2.3? is it comfyshit only?
>>
>>108691089
Both look nice? What's the problem?
>>
>>108691165
fuckin' french fry potato nigger
>>
File: AS15T__00034_.png (374 KB, 512x512)
374 KB PNG
>>108691146
>>108691022
>>108690977
>>108690958
MASSIVE
A
S
S
I
V
E
https://files.catbox.moe/0e8u1v.flac
>>
>>108691165
now show the user base origin :)
>>
Is there a Pragmata lora for anima?
>>
>>108691182
not yet
>>
>>108691089
people who actually give a shit about how many unrealistically identical muh azn ladies a model can do at once need to go to the chambers ASAP. Both of these images are unimpresive generic slopped dogshit
>>
>>108691165
idaho, they are the ones who got in the u haul with metal shields. Very baste.
>>
Does anyone else look at the computer screen so often that their eyes are perpetually dry, itchy, and red?
>lower the brightness and use a dark theme
I do.
>>
>>108691182
https://civitai.com/models/2555866/diana-d-i-0336-7-pragmata?modelVersionId=2872355
>>
>>108691194
why did this shit not show in my search? fuck civitai
>>
>>108691198
You have to use the blue site if you want anything with lolis or that nsfw isn't allowed of. Red site for porn loras. It's annoying, yeah.
>>
File: zImageturbo_00121_.jpg (879 KB, 1720x2064)
879 KB JPG
>>108691089
can gpt image do a Japanese Spider-Man?
>>
>>108691201
the blue site doesn't even let you sort by newest it's simply inferior I hate this shitty ass website
>>
>>108691209
Yeah it does, no idea what you're talking about. The functionality is identical between the sites.
>>
>>108691143
who cares, why aren't you making some sick gens faggot
>>
File deleted.
>>108691187
Whats sad about API or any "powerful" cloud service model is that the model is meant to replicate the same images over and over, they might look """"impressive""" at first glance, but after the user base milks the hell out of its model (like they always do by sharing their prompts everywhere), they all get the same images over and over, just like it happened with Nano Banana Pro.

Also as soon as someone finds someone to exploit, their precious cloud models get nerfed too, it's sad really, its like having a cool car but anyone with a credit card gets to drive it, might be cool at first then you start seeing what other people do with your car, eventually you start noticing the details, like they ruin the interiors with their stench and sweat
>>
>>108691218
where's your sick gens nigger?
>>
File: ss_04-25-2026_001.png (73 KB, 564x702)
73 KB PNG
>>108691213
I'm logged into both, btw.
>>
>>108691228
/ldg/ doesn't generate, they innovate.
>>
>>108690984
>>108691221
too old
>>
>>108691165
i live next to him apparently
>>
>>108691192
no my eyes water a lot when i yawn so it stays wet
>>
>>108691237
Oh wow. As an experiment I changed my setting off of Newest to experiment and now I can't change it back. Stupid fucking site. Yeah, I guess it's broken.
>>
File: 1767384502318163.png (2.19 MB, 1122x1402)
2.19 MB PNG
>>108691203
yes
>>
local stagnates. api innovates
>>
>>108691260
Big oof & yikes!
>>
>>108691263
case in point.
>>
>>108691258
Yeah I dunno, I don't even know of a good place to report it since I don't use groomcord.
>>
File: 1771242170874294.png (1.37 MB, 1024x1024)
1.37 MB PNG
>>108691194
Nice
>>
>>108691141
>>108691160
He posted several loras that sadly can be used to generate deepfake nsfw content, he got a warning, he kept posting them, he got banned, he created another account, got banned again obviously

Ban evasion gets you banned again of course (duh)
>>
>>
>>108691260
Doesn't really look like cum
>>
hey stonetoss guy, the anima lora is zero bytes when downloaded from catbox.
>>
File: Chroma_0072.jpg (1.56 MB, 1440x2304)
1.56 MB JPG
>>
>>108691338
everything is so smooth and lacks details, that's a chromakek gen all right
>>
>>108691333
https://mega.nz/file/4MV3SBhB#n1rGqISBOMr3R-2uv4dtQ26SZdACKXPGYDZEWMFS20s
>>
File: 874854737383.jpg (1.24 MB, 941x1672)
1.24 MB JPG
>>
File: 1772511601960550.jpg (366 KB, 1024x1024)
366 KB JPG
how come when I img2img with anima I get weird noise on the images?
>>
Regarding LoRA training in SDXL, what's the minimum total steps for characters or styles?
I checked the metadata of various LoRAs, which start at around 700 steps, but people say anime needs 2,500 to 3,500 steps for proper details.
Datasets vary from 20 to 30 images up to 150 to 250 images. I've got 39 images, all same style to guide the AI, trained for ~1,000 steps, but still not happy with results.
Won't touch Noob or Anima until I nail a decent LoRA with Illustrious. I must clear this hurdle first!
>>
do we have any idea what kind of deviantart dataset is in anima? which artist, etc?
>>
>>108691121
fake tits, no thx
>>
>>108691375
>ai
>fake tits
yeah, no shit
>>
>>108691358
catbox?
>>
>>108691360
it doesn't have proper controlnets afaik
>>
File: rockmata.png (484 KB, 768x768)
484 KB PNG
>>108691295
>>
>>108691384
you dont know how to gen 2asians?
>>
>>108691370
Desu I would do 3k probably, a little bit more if the character is on the complex side.
Just do prodigy LR 1, you don't need to minmax training params for SDXL usually.
39 is workable if these images are high quality and have great style variation (style variation for character lora, for a style lora you need double that easily and with decent content/character variation.)
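
The step arithmetic can be sketched as below, assuming the usual trainer convention that one step is one batch and that steps per epoch is images times repeats divided by batch size. The function names are made up for this example and the repeats and batch size values are placeholders, not settings from the posts.

```python
import math


def steps_per_epoch(num_images: int, repeats: int, batch_size: int) -> int:
    """One optimizer step per batch; the last partial batch still counts."""
    return math.ceil(num_images * repeats / batch_size)


def epochs_for_target(num_images: int, repeats: int, batch_size: int,
                      target_steps: int) -> int:
    """Smallest number of epochs that reaches at least target_steps."""
    return math.ceil(target_steps / steps_per_epoch(num_images, repeats, batch_size))


if __name__ == "__main__":
    # 39 images and a 3k step target come from the posts; repeats=1 and
    # batch_size=1 are assumptions for illustration.
    epochs = epochs_for_target(39, repeats=1, batch_size=1, target_steps=3000)
    print(epochs, epochs * steps_per_epoch(39, 1, 1))
```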
>>
>>108691390
I do not. The last time I tried was in 2022
>>
File: Chroma_0075.jpg (1.91 MB, 1440x2304)
1.91 MB JPG
>>108691351
sorry, its my shitty lora that I just trained :(
>>
>>108691360
>img2img
You should be doing latent upscale with a high (>0.7) denoise for anime. Like the other anon said there's no cnets so your success rate is highly prompt and seed dependent unfortunately.
>>
>>108691394
So my 1k step baseline is below recommended. Gonna try 3k steps even tho it'll take 18hrs to train the lora locally.
>>
File: zImageturbo_00371_.jpg (573 KB, 1520x1824)
573 KB JPG
>>108691403
Asian celebs as dataset? Looks pre-filtered. I hate the alien face effect they use on social media
>>
File: ANIMA_P___00005_.png (464 KB, 1024x1024)
464 KB PNG
>>108691352
Thanks!
>>
>>108691410
>a high (>0.7) denoise
it's over....
>>
>>108691429
I misspoke. It's more like anywhere between 0.5 and 0.7. Sorry.
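
A toy illustration of why the denoise value matters so much, assuming a rectified-flow style model where the starting point is a straight-line mix of the source latent and Gaussian noise. Real samplers also apply the schedule and shift, so treat this as a sketch and not what any particular UI does. `noised_start` is a made-up name.

```python
import random


def noised_start(clean: list[float], denoise: float, seed: int = 0) -> list[float]:
    """Mix clean values with unit Gaussian noise: (1 - d) * x + d * noise.

    denoise=0.0 keeps the source untouched; denoise=1.0 is pure noise,
    which is equivalent to a fresh txt2img gen.
    """
    if not 0.0 <= denoise <= 1.0:
        raise ValueError("denoise must be in [0, 1]")
    rng = random.Random(seed)
    return [(1.0 - denoise) * x + denoise * rng.gauss(0.0, 1.0) for x in clean]


if __name__ == "__main__":
    latent = [1.0, -0.5, 0.25]
    for d in (0.0, 0.5, 0.7, 1.0):
        print(d, [round(v, 3) for v in noised_start(latent, d)])
```

At 0.5 half of each value still comes from the source, while at 0.7 only 30% does, which is why the composition starts drifting in that range.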
>>
>>108691420
That's rough.
It's possible to get a lora to quickly converge, but it's very easy to completely fry the model or still undershoot trying that. I wouldn't recommend it to a lora training newbie.
And anon, if SDXL takes 18 hours to train locally, you should forget about Anima lora training. At least locally. Maybe borrow some cloud compute?
>>
File: Chroma_0078.jpg (1.85 MB, 1440x2304)
1.85 MB JPG
>>108691421
No, I just prompted a kpop idols selfie. I have an instathot dataset that I like to train on from time to time, and I was testing the quantized training in OneTrainer... even though it's fast, I think it affects the results of the gens too much, so I'll test the same settings with bfloat16
>>
>>108691437
I like low denoise since base gens usually have more soul.
>>
>it's improving
>>
File: 1763023732799368.png (1.54 MB, 1024x1024)
1.54 MB PNG
>>108691389
Cute
>>
>>108691448
training on quants fucking sucks m9-1
>>
>>108691351
at least Spark Chroma is coming along. Preview was good, V1 better, V2 should be even better. Based RealVisMan tends to deliver
>>
>>108691448
this fucking low-detail sandy-ass sampler looks so ass dude
>>
>>108691455
oh peetah
>>
>>108691352
Got anymore/advice for training?
>>
File: file.png (1.81 MB, 2048x1024)
1.81 MB PNG
trained my own stonetoss lora just to see if i could do a bit better than the other anon, turned out pretty well from what i'm seeing
dataset was smaller but more thoroughly captioned, seemed to make a difference, using tags here even though the dataset was 100% NL
mine on left, anon's on right
>>
File: zImageturbo_00373_.jpg (502 KB, 1520x1824)
502 KB JPG
>>
File: artward.png (119 KB, 278x271)
119 KB PNG
>>108691455
>>
>>108691501
IT'S IMPROVING
>>
>>108691455
thank you for reminding me of that one comic where chris and louis are stuck in a magicians box and are really sweaty and chris hard on is rubbing up against her pussy and they accidentally knock the box over and his cock accidentally goes inside her pussy and hes trying to take it out but ends up accidentally cumming inside louis
>>
File: ComfyUI_11445_.png (723 KB, 1024x1024)
723 KB PNG
>>
File: Chroma_0081.jpg (1.76 MB, 1440x2304)
1.76 MB JPG
>>108691471
the only issue I have noticed is that it makes the gens too grainy/bright but you have to play with the settings (mse or huber iirc), I ran some tests some time ago, it's crazy that you can train a 512x lora in 8 minutes
>>
>>108691475
I don't mean to be too harsh to someone trying to help the community ultimately but the preview was only slightly better than base chroma and stopped well short of unfucking it. I would like to be proven wrong but I fail to see how this will end up any noticeably better.
Also:
>Hi! The training attempt failed because of excessive regularization, and the model’s outputs looked too artificial and plastic-like. I’m now using different training settings and have already updated the training info card.
This guy really should have made a lora instead of trying to force a finetune on a very tiny (by finetune standards) dataset (and should have spent the time he spent autistically writing hand made captions to gather more high quality images instead)
>>
>>108691494
isnt the right more soulful? left is 'better' and more AI-like I guess
>>
File: 1775932535879580.jpg (934 KB, 2112x1408)
934 KB JPG
>>
did my gpu die?
it stopped genning z image in the middle of a batch and after reboot it still won't gen
>>
>>108691494
I believe using tags with the lora works fine if you caption the NL captions the way russ did in his official lora. @activation tag in the beginning. SPACE . PERIOD. Followed by the NL captions.
That has also been my experience.
Also can you share how many images did you use? I tried to train a style lora with 80 images initially but failed. It worked when I pushed for 130 (Alongside changing settings.)
>>
>>108691494
Did you use any different parameters or just dataset differences? How many images/steps?
>>
>>108691546
>SPACE . PERIOD.
huh i thought that was a mistype. have you done a comparison?
>>
>>108691529
Show us the error message.
Nvidia-smi output?
You also wouldn't be posting here from it unless you have igpu and the cable plugged to the motherboard.
>>
>>108691506
Share pls.
>>
>>108691469
What style is that? First thing I thought of was Shinkawa Youji but that's not quite right.
>>
File: 1752627636000978.jpg (1.13 MB, 2112x1408)
1.13 MB JPG
>>
>>108690738
thanks, but where is the research news post?
>>
>>108691552
>mistype
No I think
caption_prefix = '@greg rutkowski. '

was very deliberate to separate the TW from the rest of the prompt so that you can be flexible about how you prompt it later.
>have you done a comparison?
Nope. I am content captioning style loras this way. I might experiment with character loras later since they don't have the "@" clearly separating TW from the rest of the prompt.
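
The prefix convention above can be sketched as a small dataset-prep helper. Only the `caption_prefix` value comes from the post. The `build_caption` function is a hypothetical stand-in and not part of diffusion-pipe or any other trainer.

```python
def build_caption(caption: str, caption_prefix: str = "@greg rutkowski. ") -> str:
    """Prepend the trigger prefix to a natural-language caption.

    The trailing ". " keeps the trigger separate from the rest of the
    prompt. Captions that already carry the prefix are left unchanged.
    """
    if caption.startswith(caption_prefix):
        return caption
    return caption_prefix + caption.lstrip()


if __name__ == "__main__":
    print(build_caption("Two people are on a couch."))
```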
>>
>>108690869
Catbox please
>>
File: 00015-2449696592.png (880 KB, 1536x1024)
880 KB PNG
>>108691546
idk man i just winged it and it worked
>@stonetoss, Two people are on a couch. etc
>>108691548
39 images total. many of them were cropped to remove black borders/artifacts, and for the full comics i did bother to leave in i removed the watermarks.
i mostly tried to replicate the rutkowski/other anon's training settings, 0.00002 LR, adamw, constant scheduler, all in anima standalone trainer
it's a little bit crusty desu, will retry with prodigy and see what happens
>>
File: pregmata.jpg (203 KB, 1000x1000)
203 KB JPG
>>108691389
>>108691295
>>
File: deNR_zi_00080_.png (3.24 MB, 1663x1164)
3.24 MB PNG
>>108691572
already posted it today, and weekend research is kinda filler anyway
>>
>>108691575
that makes sense, thanks. i see no reason to not caption that way with my future loras
>>
File: 1752730022884679.png (1.54 MB, 1024x1024)
1.54 MB PNG
I have a dream
Of a day where AI doesn't mangle hands and feet
Of a day where characters don't grow extra limbs
Of a day where detailed scenes don't shit the bed

>>108691569
This lora
https://civitai.com/models/912942/retro-sci-fi-90s-anime-style-animaillustriousxlzimageturbofluxchroma?modelVersionId=2808683
and
(@imamura ryou:2.7), (@butterchalk:2.0), (@ishikawa hideki:2.8), (@hh48288196:3.7)

with the hh48288196 lora
>>
any new models this wk?
>>
>>108691558
when it's READY. need more TRAINING
>>
File: 00016-2449696592.png (1002 KB, 1536x1024)
1002 KB PNG
>>108691579
and for reference here's the anon's lora, it really doesn't work well for comics/dialogue, don't know his exact workflow for his results but i haven't been having luck with it at least
>>
>>108691494
link for yours? :3
>>
>>108691579
Well good to see that it works with comma too I guess lol.
The @ is probably doing the heavy lifting then.
It seems to have overlearned some noise (Chun-Li's bracelet in yours vs the other anon's) but decent results for a style lora with 39 images I would say.
Cosine would probably give you better results.
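To make the constant vs cosine difference concrete, here is a rough sketch of the two schedules in plain Python. The base LR of 0.00002 is the one quoted above; the 2000-step total and the zero floor are assumptions picked for the example, and real trainers may add warmup or a non-zero minimum.

```python
import math


def constant_lr(step: int, base_lr: float) -> float:
    """Constant schedule: the same learning rate at every step."""
    return base_lr


def cosine_lr(step: int, total_steps: int, base_lr: float, min_lr: float = 0.0) -> float:
    """Cosine decay from base_lr at step 0 down to min_lr at total_steps."""
    # Clamp so steps outside [0, total_steps] do not wrap the cosine around.
    progress = min(max(step, 0), total_steps) / total_steps
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


BASE_LR = 2e-5      # from the settings quoted above
TOTAL_STEPS = 2000  # assumed run length, for illustration only

for step in (0, 500, 1000, 1500, 2000):
    print(step, constant_lr(step, BASE_LR), cosine_lr(step, TOTAL_STEPS, BASE_LR))
```

The cosine run spends its last steps at a much smaller LR, which is the usual argument for it picking up less of that kind of noise late in training; whether it actually helps here is something only a comparison run would show.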
>>
>>108691601
What card are you training this on?
>>
>>108691618
card?
>>
>>108691618
>>108691622
oh card. 3060 12gb.
>>
>>108691595
Prompt? Model?
Looks cool.
>>
thanks to cloud brothers for reminding us how gens should look
>>
File: deNR_zi_00087_.png (3.29 MB, 1663x1164)
3.29 MB PNG
>>108691629
zimg turbo + 'sagarealm' lora @ ~50%
>>
>>108691613
buzzheavier 9oa106uojuux
>>108691617
training the prodigy one now with cosine, only 2000 steps but i am willing to also try more. no guarantee i will return with results since this thread is a shithole but once i am satisfied the final one will probably end up on civit
>>
>>108691627
praying for you anon
>>
>>108691652
It will all be OKAY. This is only 11 epochers. I'm just happy I finally wrangled the settings correctly with so many characters. I had to restart training so many times.
>>
File: 00337-647634661.png (2.69 MB, 1248x1824)
2.69 MB PNG
>>
>>108691657
What do people train with these days?
>>
>>108691652
NTA but also 3060.
It's not that bad for anima lora training. I can get a lora out in 5-6 hours.
Easy to go to sleep and have your lora trained in the morning.
An absolute humiliation ritual to use for many other things AI though.
>>
>>108691645
>buzzheavier
da wat
>>
>>108691664
i'm just using
https://github.com/gazingstars123/Anima-Standalone-Trainer
but I'm thinking I should have just gone with diffusion-pipe, desu. the sampler is kind of shit
>>
>>108691680
good free filehost, add .com/ in place of the space and it should lead you there. tried adding that in parentheses but it flagged my post so i just hoped you were able to put the pieces together
>>
cozy breas
>>
File: mugen__00111_.png (1.16 MB, 832x1216)
1.16 MB PNG
>>
File: 1000008267_edited.jpg (1.3 MB, 3074x4096)
1.3 MB JPG
Where is anima realism LoRA anon ...
>>
this is for all my API niggerfolk shills
>>
>>108691768
Likely gonna get jannied for hecking racisms outside /b/ but based.
>>
File: stay safe.jpg (1.72 MB, 1536x1024)
1.72 MB JPG
>>
>>108691768
They're not even able to finish their 3 days old /adg/ thread
>>
>>108691785
this is true, local model ‘freedom’ is a trap because the models are poisoned with bad data
>>
>>108691795
>>
>>108691795
>>108691785
Take your meds schizo
>>
File: 31321648646.jpg (734 KB, 1456x2048)
734 KB JPG
>>
>>108691494
>>108691579
oh yeah forgot to mention i upscaled it with this: https://openmodeldb.info/models/2x-AnimeSharpV4-Fast-RCAN-PU
looked clean enough and didn't take centuries like my other upscaling models
>>
>>108691833
the dataset i mean, my brain is all fucky because i'm tired
>>
File: 1000008271_edited.jpg (625 KB, 1776x1776)
625 KB JPG
>>108691785
Seeing all those sovless gens from apicuck is comforting...
>>
>>108691842
0 _ 0
>>
Getting the hang of anima LoRa training. Weird how anima likes batch size 2 more than batch size 1.
>>
>>108691842
jesus that is horrifying
>>
>>108691797
Well I just took my prescription sleepy pills and half a box of wine, should you upload the lora here please put a 16 hour timer on it.
Thank you.
>>
>>108691842
he says as he posts an ai inbred slop freak
>>
>>108691888
tomorrow night probably don't hold your breath
>>
lulz
>>
File: 2348060879234945.png (3.64 MB, 1080x1920)
3.64 MB PNG
>>108691785
as fun as it is to banter, we are all winners.
API bros get SOTA kino, local gennies get stuff like anima to tinker with.
>>
File: ANIMA_P___00018_.png (640 KB, 1024x1024)
640 KB PNG
>>
>>108691895
I thought local was free and uncensored?
>>
File: ANIMA_P___00017_.png (888 KB, 1024x1024)
888 KB PNG
>>108691913
>>
>>108691923
that is a website.
>>
File: KEK.png (232 KB, 402x498)
232 KB PNG
>>108691923
>I thought local was free and uncensored?
>>
>>108691923
Is the Internet local?

seriously, we need an anima swastika lora.
>>
>>108691934
the internet is a series of tubes
>>
>>108691905
>kino
Unironically not trolling are you unable to see how your gen is gigaslopped?
>>
>>108691944
>slop
so actually, I don't think that's a real category. Philosophically, it's ungrounded. It may be valenced, but it's not logical.

I abhor illogical fools.
>>
>>108691967
If slop is not "real" then neither is kino. I don't know why I'm even replying to you.
>>
>local is censored
>>
>>108691944
where did i say it was kino? i said it was fun to tinker with.
>>
>>108691980
As an npc, you construct the concept of yourself repeatedly through habits, gestures, and posture.

This pretty much explains why.
>>
File: kino alert.gif (577 KB, 498x498)
577 KB GIF
why do people keep saying ltx2.3 is bad? it only takes 3 minutes for each of these kinos to be generated on my 12gb card
https://files.catbox.moe/2bool8.mp4
>>
You are too stupid to be subtle.
>>
>>108691987
>I was only pretending to be retarded
>>
>>108691983
anima knows the swastika or is that part of the lora you're making?
>>
File: here's why.png (257 KB, 467x515)
257 KB PNG
>>108691988
>why do people keep saying ltx2.3 is bad?
>>
File: ANIMA_P___00020_.png (845 KB, 1024x1024)
845 KB PNG
>>108691926
>>
>>108692003
my lora doesn't have swastikas lmao I don't give a shit about nazi stuff but it's funny to see what it can do it's obviously just not trained on it very well and why would it be
>>
>>108691998
post some gens and relax
>>
>>108692004
even api models can't properly do eyes at distances
>>
>>108691370
Download any LoRA you like from civ and upload it here: https://xypher7.github.io/lora-metadata-viewer/

It'll give you all the information you need so you can achieve the same quality for your own LoRAs.
>>
File: 235860804780724976987532.png (2.61 MB, 1024x1536)
2.61 MB PNG
>>108692020
rome wasn't built in a day but api is very close.
>>
kekd
>>
>>108692049
For the record, you can also open it with a text editor and just read until the gibberish starts.
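The text editor trick works because a .safetensors file starts with an 8-byte little-endian header length followed by a JSON header, and trainers can store free-form string metadata under the `__metadata__` key. A minimal stdlib sketch of reading it is below; the function name is made up, and which keys you find (if any) depends entirely on the trainer that wrote the file.

```python
import json
import struct


def read_safetensors_metadata(path: str) -> dict:
    """Return the free-form '__metadata__' dict from a .safetensors file.

    Only the JSON header is read; the tensor data is never loaded.
    Returns an empty dict if the file carries no metadata.
    """
    with open(path, "rb") as f:
        # First 8 bytes: unsigned 64-bit little-endian length of the JSON header.
        (header_len,) = struct.unpack("<Q", f.read(8))
        # Next header_len bytes: UTF-8 JSON describing tensors plus optional metadata.
        header = json.loads(f.read(header_len).decode("utf-8"))
    return header.get("__metadata__", {})
```

Calling `read_safetensors_metadata("some_lora.safetensors")` (path hypothetical) and printing the result gives you the same information as the viewer, minus the nice formatting.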
>>
>>108692020
>even api models can't properly do eyes at distances
my fucking ass, seedance 2.0 has no issue with that
>>>/wsg/6130845
>>>/wsg/6130044
>>
>>108692076
i mean at low resolution. i can crank it up to 1080p and the faces are perfect, but then i have to wait 15 minutes per video
>>
File: feeling a bit gassy.png (587 KB, 648x1056)
587 KB PNG
>>108691934
>>108692003
If you want to know what anima knows literally just search on danbooru.
>>
>>108692052
>>My api gen is SOTA

Missing leg, missing feet, 2 hands with melted fingers. Kek
>>
File: c.jpg (1.58 MB, 1402x1122)
1.58 MB JPG
>>
>>108692106
>>
>>108692093
make her do the heil and i'll download anima right now.
>>
>>108692106
why does it only generate white people
>>
>>108692133
>asking for only white people in your gen images
>>
>>108691878
The default has gradient accumulation steps set to 4. I don't think it uses normal batching.
>>
>>108692052
catbox?
>>
File: 2563625.webm (3.79 MB, 900x480)
3.79 MB WEBM
let me know when sneedance can compete with the kino factory
>>
>>108692162
He did it to show that it can be done for people with absolute potato GPUs, you are not supposed to use gradient accumulation unless desperate.
Anyway batch size 2 seems to work well in my experience also.
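Quick arithmetic sketch of why those two setups are not the same thing. The function names are made up, the 39-image dataset size is borrowed from earlier in the thread purely as an example, and it assumes the trainer still takes an optimizer step on a leftover partial accumulation at the end of an epoch, which not every trainer does.

```python
import math


def effective_batch_size(micro_batch: int, grad_accum_steps: int) -> int:
    """Samples contributing to each optimizer step."""
    return micro_batch * grad_accum_steps


def optimizer_steps_per_epoch(num_images: int, micro_batch: int, grad_accum_steps: int) -> int:
    """Optimizer updates per pass over the dataset.

    Assumes a final partial micro-batch and a final partial accumulation
    window both still count (trainer-dependent behaviour).
    """
    micro_batches = math.ceil(num_images / micro_batch)
    return math.ceil(micro_batches / grad_accum_steps)


# Batch size 1 with 4 accumulation steps vs true batch size 2, 39 images.
for micro_batch, accum in ((1, 4), (2, 1)):
    print(
        f"batch={micro_batch} accum={accum}",
        "effective batch:", effective_batch_size(micro_batch, accum),
        "steps/epoch:", optimizer_steps_per_epoch(39, micro_batch, accum),
    )
```

So batch 1 with accumulation 4 averages over 4 samples per update but only updates about 10 times per epoch on 39 images, while true batch 2 updates about 20 times. A fixed step count therefore covers a different number of epochs in each case.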
>>
>>108692076
god that thread is so embarrassing
>>
>>108692220
people use those threads mostly to link back to for audio, you know that right?
>>
File: 1773212448782470.jpg (361 KB, 896x1152)
361 KB JPG
>>
>>108692106
This is insanely good, wow
>>
File: 864658946548453184.jpg (1.76 MB, 2560x1440)
1.76 MB JPG
>>
>>108692106
Damn, pretty accurate.
>>
yo tran wake up time to bake
>>
>>108692366
we got time
>>
OH NONONONO, MIKU PLS DON'T LAUGH, STOP
>>
>>108692378
what if ani creates before you
>>
>>108692433
im not making, idgaf. i'm sleeping soon I will wait
>>
>>108692366
>>108692433
keep crying btw
>>
>>108692052
gpt-image-2 has hires-fix-level slop hidden in each image
>>
maybe this is my favorite variation https://files.catbox.moe/5l024s.mp4
>>
>>108692526
i wonder what he would have thought about ai. demons i imagine.
>>
>>108692548
he would be vibe coding templeOS 2
>>
File: _AnimaPreview3_00569_.jpg (417 KB, 1160x1696)
417 KB JPG
>>
>>108692624
it wouldn't work since the service would shut him down for typing nigger too much
>>
File: _AnimaPreview3_00582_.jpg (456 KB, 1248x1608)
456 KB JPG
>>
>>108692647
>>108692690
Hell yeah cwckino for anima!
>>
File: _AnimaPreview3_00585_.jpg (402 KB, 1248x1608)
402 KB JPG
>>108692693
Christening of anima 3
>>
>>108692647
>>108692690
>>108692697
bayste
>>
Fresh

>>108692909
>>108692909
>>108692909
>>108692909
>>
>>
>>108692914
Made me laugh, gg




All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.