Discussion and Development of Local Image, Video, and Music ModelsPrevious: >>109101107https://rentry.org/ldg-lazy-getting-started-guide>UIComfyUI: https://github.com/comfyanonymous/ComfyUISwarmUI: https://github.com/mcmonkeyprojects/SwarmUISDWebUI: https://rentry.org/ldg-lazy-getting-started-guide#the-stable-diffusion-web-ui-lineageWan2GP: https://github.com/deepbeepmeep/Wan2GP>Checkpoints, LoRAs, & Upscalershttps://civitai.comhttps://civitaiarchive.com/https://openmodeldb.info>Tuninghttps://github.com/spacepxl/demystifying-sd-finetuninghttps://github.com/ostris/ai-toolkithttps://github.com/Nerogar/OneTrainerhttps://github.com/tdrussell/diffusion-pipehttps://github.com/kohya-ss/sd-scriptshttps://github.com/kohya-ss/musubi-tuner>Zhttps://huggingface.co/Tongyi-MAI/Z-Image>Animahttps://huggingface.co/circlestone-labs/Animahttps://tagexplorer.github.io/https://animadex.net>Qwenhttps://huggingface.co/collections/Qwen/qwen-image>Kleinhttps://huggingface.co/collections/black-forest-labs/flux2>Wanhttps://github.com/Wan-Video/Wan2.2>LTX-2.3https://huggingface.co/collections/Lightricks/ltx-23>Chromahttps://huggingface.co/lodestones/Chroma1-Basehttps://rentry.org/mvu52t46>MiscLocal Model Meta: https://rentry.org/localmodelsmetaShare Metadata: https://catbox.moe | https://litterbox.catbox.moe/Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusionArchive: https://rentry.org/sdg-linkCollage: https://rentry.org/ldgcollage>Neighbors>>>/aco/csdg>>>/b/degen>>>/gif/vdg>>>/d/ddg>>>/e/edg>>>/h/hdg>>>/trash/slop>>>/vt/vtai>>>/u/udg>Local Text>>>/g/lmg>Maintain Thread Qualityhttps://rentry.org/debohttps://rentry.org/animanon
can someone generate a photograph of schizoanon killing himself and his parents celebrating it
>inb4 n*gbo
>mfw Resource news06/21/2026>FastSDCPU v1.0.0-beta.500https://github.com/rupeshs/fastsdcpu/releases/tag/v1.0.0-beta.500>LTX Director 2.0 Updatehttps://github.com/WhatDreamsCost/WhatDreamsCost-ComfyUI#-recent-updates>DF3DV-1K: A Large-Scale Dataset and Benchmark for Distractor-Free Novel View Synthesishttps://johnnylu305.github.io/df3dv1k_web06/20/2026>One Node ยท FLUX.2 [klein]https://github.com/yanokusnir-ai/one-node-flux-2-klein06/19/2026>FreeStyle: Free Control of Style-Content Dual-Reference Generation from Community LoRA Mininghttps://github.com/Blue2Giant/FreeStyle>JanusMesh: Fast and Zero-Shot 3D Visual Illusion Generation via Cross-Space Denoisinghttps://siang1105.github.io/JanusMesh.github.io>Linear Recurrent Unit with Semantic Modulation for Image Super-Resolutionhttps://github.com/MingyuChoi-run/LSM>LEAP: Layer-skipping Efficiency via Adaptive Progression for Vision Transformer Distillationhttps://github.com/KevinZ0217/LEAP>StylisticBias: A Few Human Visual Cues Drive Most Social Biases in MLLMshttps://hf.co/datasets/shaghayegh/stylistic-bias-dataset>musubi-tuner adds support for ideogram 4 lora traininghttps://github.com/kohya-ss/musubi-tuner/blob/dev/docs/ideogram4.md>KupkaProd Music Video Pipelinehttps://github.com/Matticusnicholas/KupkaProd-Music-Video-Pipeline>Midjourney goes from generating cat images to full-body ultrasound scanshttps://www.theverge.com/ai-artificial-intelligence/952011/midjourney-medical-ai-ultrasound-scan>TeleStyle V2: Beyond Content-Preserving Style Transfer with Self-Distillation and Distribution-Matching-Distillationhttps://github.com/Tele-AI/TeleStyleV206/18/2026>UniTemphttps://lzhangbj.github.io/projects/unitemp>Consensus-Frame Alignment for Visual Focus in Video-MLLMshttps://github.com/1Pansy/VideoCFR>Moebius: 0.2B Lightweight Image Inpainting Framework with 10B-Level Performancehttps://hustvl.github.io/Moebius
Why is the stablediffusion subreddit so cucked?
>>109105629>>109105670>>109105708which checkpoint is this?
>>109105718https://civitai.red/models/2544636/wai-anima?modelVersionId=2983680
made my first lora in case anyone wants to try it (warning some nsfw pics though I think civitai doesn't show them unless you are logged in)https://civitai.red/models/2719321/anima-picturd-style?modelVersionId=3055785
>>109105866I just downloaded that lol
>>109105866lora like that might work even better if you leave the activation tag completely, just caption with simple booru tags
>It's a homebrew agentic program that uses URL/search/image context and kling/heun/google/grok apis and stitches the result. You can do the same in a google workspace or with claude's MCP functionality.>I should clarify; Heun is an implicit sampler method (unlike euler which is explicit) meaning it be be used to generate partial image results without knowing the subjects are and then be merged with other methods. So it's very useful for generating virtual 3d spaces and then populating them with objects/characters.>Most flux models use it. I just have the agent scrub git repos and huggingface for public flux apis with that methodan anon in another board said this the other day, what the hell is he talking about
>>109105866>faggotry
>>109105876hope it works well!post some in the gallery if you make something cool or hot!
what can i prompt to get something like this?https://files.catbox.moe/t3e85h.jpgi've tried penis flickering, one finger on/touching the penis etc. nothing i tried works
>>109105866>It noticeably performs better if you also use the built-in: @picturdhow about training it with "@picturd" in the captions instead then?
>>109105933tmi
>>109105934i was unsure if that would somehow overwrite the original tag and mess it up that's why i did that to be safe. i haven't trained any models before
>>109105934>how about training it with "@picturd" in the captions instead then?not a good idea, better leave it out from captions and use something like [@picturd:0.5] when using lora
>>109105957>not a good ideawhy not?
>>109105963I've gotten garbled shit every time I've tried doing so
>>109105969learning rate too high?
>>109106008nah I just removed artist tag from the caption and it worked
After messing around with ideogram for a while day generating stupid plots and compairsons I just scrapped it all and went back to square one.But I did learn that Flux2 can edit. I thought it was just a t2i model.How does it compare to klein?
>>109106017what if you lowered the learning rate tho?
>>109105866Does it draw more normal sized dicks?I don't like micro or macro.
>>109105608>schizoanonwho?
>>109106066yeah i'm pretty sure if you prompt for it
>>109106071me
>>109105866>Higher strengths look pretty bad so I suggest something lower than 1.0.Also anon it is supposed to work fine at 1.0 strength so you ostensibly messed something up during the training, jfyi
>>109106088I read that as him talking about the built-in @picturd style, not the lora
i don't get it for lora training batch size usually falls between 1 and 4, with 2 being the average, but when finetuning a full model it goes up to 20 or more so i'm guessing dataset size affects what batch size you should usemy question is where's the actual cutoff? how many unique images in the dataset do you need to justify batch size 1, how many for 2, how many before 3, and so on?
>>109106103Some people use batch size 96 for lora training
>>109106102Oh yeah, my bad.
>>109105930I was really struggling to get it to make anything that wasn't an utter jumbled mess I'm afraid. When something does come out its pretty blurry or it's sharp and malformed. I was using er_sde on anime base. Maybe it's because I wasn't making porn I dunno.
>>109106103don't you batch size as high as your VRAM allows before you start losing speed?
When I try to use LoRAs with Ideogram, I get an out of memory error. The only way to prevent this is by disabling smart memory with --disable-smart-memory. Am I the only one this happens to? Without LoRAs, this model is cucked to the extreme, it's impossible to bypass the fucking cuck filter.>With lorahttps://files.catbox.moe/qnul18.png>Without lorahttps://files.catbox.moe/stoeox.pngWhere I find a correct workflow to avoid OuM?
>>109106114weird, did you try copying the workflow of any of the example images? they have metadata
>>109106030ideogram feels fake to me. idk why
>>109106153No but I got it working I had to really slim back the tags I was using but I don't feel like it's really captured the style well despite the minimal amount of tags I'm using.
>>109106259fair enough, i only made it because nothing else was available. if someone else made a picturd lora that was better i'd use that instead
>>109106176it's got that hyper-real look to it. Everything seems just a little too real.I haven't played with it myself though. Could be people just don't know how to turn down the CFG scale or that it's sometimes necessary to do that to create good looking images.
>>109106176>>109106285Defaults to a movie filter look, similar to how it was with GPT-Image 1. It's a significant regression from other models we're used to using (E.G. Flux), and of course Reddit would miss this 4o-piss like filter before they hyped it to death.
>>109106176Yeah, it's kinda uncanny valley. And sometimes it just shits the bed entirely. Feels like it's trained hard on synthetic output from nano banana and whatever gpt image model. Including the same 'noise' in the image.>>109106285I think that puts it pretty well, sometimes it's 'too real' which loops back to being slopped.Still fun to play around with, I think.What I do like is that there's not much 'same-face' going on even without LoRAs.
https://huggingface.co/FireRedTeam/FireRed-Image-Edit-1.1-ComfyUIWhy have I never heard of this edit model before? Is it that bad?
>>109106363I'm reporting this to Nintendo immediatly.
>>109106353>>109106308>>109106049>>109105713shit tho, anima base does the same
>>109106030Klein means small. Small means usable on vramlet hardware.
>>109106353cool gen
WELL WELL WELL WELL WELL WEEEELLLLLLLI just genned my longest song in Ace Step 1.5 XL SFT.aaahheheheheheeeem
>>109106363Not sure how that's a question. The example image from their workflow is slopped. The shirt does not match the original. Similar to all those countless VTON LoRA hacks that were released on Flux.1. Stick to Flux.2 Klein Edit if you want a proper edit model.
>>109106406inspired by d3 witch doctor
>>109106429>The shirt does not match the original.nta but the shirt looks the exact same???? what
>>109106350This image looks pretty good. On par with with Z-image I'd say. Except for that movie filter effect that >>109106285 mentioned. On a big image like this it's just too fine grained, which is fine, it's just that nothing irl looks like that. No media we're used to resembles that, except for AI generated media of course.
>>109106439Nah, texture changed too much so it's useless for showcasing actual products (so you'd still have to create mockups yourself using Photoshop). Flux.2 Klein would come closer. Only GPT Image 2 probably preserves every detail on that shirt.
>>109106428post it.
>>109106030Isn't flux 2 stupidly huge and unusable without multiple GPUs?
>>109106450Are you retarded? The other image is too small how on earth are you able to tell if the texture is too different when you're getting a poorly downscaled thumbnail image of the sweater?
>>109106466Take a look at the collar. I don't even need to zoom in to tell it didn't preserve the detail. Take a closer look at the color on the left side. Look at it on the right side, all washed out. The patches lack texture (different shades of brown clearly visible on the left). I can't see the full image on the right, but it's obvious it also missed the signature if it can't even get the color pallete right. Perhaps you don't know much about ecommerce, these details matter.
>>109106485Absolute retard.
>>109106445Yeah, I get what you're saying. And I would agree.>>109106462I've been using Flux.2 when it came out without any problems, it was great for high resolution wallpaper generation. Didn't like it for anything that included humans, though.
>>109106435this looks pretty nice. are you going to make that finetune available?
>>109106462That's the original Flux.2 that came out. Flux.2 Klein is the one with the edit model, it's like the Schnell version of it and it has less parameters than Flux.2 (but it's much more aesthetic compared to Flux.2, plus we got the base Klein model for tunes).
>>109106617That's a fucking Big Mac.
>>109106496sure why not, takes moment
>>109106684reposting from /a/ now? who the hell do you think you are??
>>109106718Anon, you are mentally unstable. Please get some help.
>>109106729why are you reposting in first place?!?
>>109106737Anon, you need to get some help. This is strange, abnormal behavior from you :(
>>109106744fuck off debo
>>109106769Now you're giving me your boogeyman label, poor anon. I hope one day you get the help you clearly need :(
>>109106769dont bring me into this
>>109106454>>109106428https://files.catbox.moe/80xrgo.mp3no EQ, and I still don't have izotope rx, so no pops/clicks fixed.5:27you may say, not that long.But the lyrics match exactly. I did it by using audacity to snip an end off, then used Extend with a tiny overlap and then that extend start + like 80-100 seconds for a verse. Or these verses, anyway, which are long-ish. putting them back together, in Audacity you use special->invert to invert one of the tracks, when perfectly aligned it's silent... except that the vae changes the volume slightly, so, yeah, quiet though. then uninvert. trim to meet in a least problem spot (where a tiny volume change won't be heard.
>>109106718>>109106729>>109106744Stop gaslighting, you are reposting
>>109106807Anon, you have mental problems. Please, for your own sake, get off your computer and get some help.
>>109106791go back
>>109106799panties too clean
>>109106842this is gonna blow your mind, but... I have two tabs openAT ONCEyeah, I know, I'm a fucking freak. it's almost hard to imagine. TWO TABS OPEN AT THE SAME TIME. call the police, I need to be stopped
>>109106363getting 999 errors with their default stuff, the clip model they posted doesnt have mmproj, great stuff
>>109106857I print a picture of every gen you make and physically shit on it.
>>109106857Chrome needs to let us have 3 split screen tabs open at once, desu.
>>109106854Enjoy your filth.
>>109106807not him but shut the fuck up you annoying sperg, it's not the end of the world if someone posts a gen on two different boards
>>109106921why is there a bed in the bathroom?
>>109106923yes because he is trolling, he is spamming this general with reposted content FUCK OFF
>>109106927famous japanese voyeur hotel
>>109106799"love hotel" was in the prompt, but the model likely has no idea what that is.
>>109106927Watch some JAVs.
>>109106857the message was not about the number of open tabs (2 is a very low number btw)try to read it again
>>109106934It took me forever to realize people actually see non-whites as humans.
>>109106930No, you just have severe autism and mental illness, and you do this all the time. Nobody cares, piss off.
>>109106935Anima has such good prompt adherence that with a right prompt it should be possible to set up the room correctly.
>>109106930>spammingyou poor little sperg, your brainrot is becoming quite severe :(
>>109106930>when I think somebody is trolling me I should respond angrilykek
It's weird how retards get baited by retards they think they're baiting and vice versa.
>>109106955>>109106966Fuck off you are spamming the thread and we both know it. Throwing insults at me isn't going to erase what you did, you reposted something ight from /a/ FUCK OFF YOU FAGGOT
trolling is wasted energy.
>>109106363is this not 100% on gpu because of the q4_k_m gguf? very unfortunate because it becomes very slow
>>109106363dunno. maybe it's good at something. it did not intuitively seem that interesting.
>>109107034at this point you are just repeatedly humiliating yourself. You know you're in the wrong and you know you have mental issues.
>>109106857quite the emotional response
>>109107128hot, would rape
>>109107186Anon, we already know you're mentally ill.
>>109107062truyen-hentai com/es/hentai-pantyhose-porn/anchovy-girls-und-panzer-7flf7Reposting from here too? You want to play Tom and Jerry with me, where I catch all your reposts?
waiter, waitermore low IQ WANtranny replies please
new gen!
>>109107202>WANtrannywhy he didn't poke-evolve to LTXtranny tho?
>>109107205https://danbooru.donmai.us/posts/11604869spam repost <----MODS
>>109107212they stole my gen
>>109107192>Woah, nice Woahjacks for my Anima, maybe I will look a few shitposts with it>Look inside>3.9 gigsYeah sorry that should have been a lora, ain't downloading that.
I miss pixart sigma.
>>109107230My drive is big and FAT. It should be a lora, though.
>>109107238heh, pixshart smega
>>109105627truly a strange place, they absolutely hate 1girls or anything """gooner""", so 99% of local ai
>>109107238we got quite a few nice models meanwhile, anima ultimately also is in that vein
>>109107205looks like quite the retarded door anon
>>109107313male pov
queef fighter 6
How does civitai get worse over time? What is the magic behind it
>>109107412ai genned slop fest of a website ran by a gay guy trying to milk every cent he can
>>109107332I like how you talk nonstop about how much you hate Illustrious, yet every image you make still looks like Illustrious slop. Maybe it's time to accept that it was genuinely a useful model and that you learned a lot from it, whether you like it or not. Own up to it instead of pretending otherwise, you might gen with a clearer head and some fresh ideas once you do.
thoughts on latest /hdg/ discussion?
>>109107459??? I've never talked shit about illustrious lol.real schizo nigga, take a break
>>109107464Very ayakoning.
>>109107464Why am I expected to know what kind of faggotry is going on there?At least link the posts.
>>109107464But Tdrusell browses here, why should I move from here, also, anime sucks
>>109107464homoerotic
I miss hdg being good, it used to have 300 image threads of decent quality every day
>>109107464I admit I'm an adult and that AI in general is a manchild hobby. But going from that to diffusing drawings with big eyes is a step way below that.
I hate 4chan schizos
>>109107534nice gen
It's too hot to gen.
>>109107710*sniffs*
>>>/wsg/6180134
>>109107728shieeet nigga
Slowly trying to adapt comfyUI to my workflow, have been building custom nodes. Next step is to build a proper start stop tags similar to webui, and I think I'll be fully situated to comfy. Might also change prompt weighing but seems pointless with anima.
blog factory status?
>>109107777full of beansand checked
AI is too sloppy for what I want to do on its own, I need to leverage my ability to use Blender, Krita, and my Wacom tablet somehow to make it better.
>>109107192I never knew I needed a feels gal Asuka in my life until now
I need to add one of those text models inpaint to my workflow and make it an additional post process pass. hopefully the turbos can do the job well if the only use is text.
ran won btw
>>109107820Boring
>>109107820exciting!
>>109107760>>109107818Lord catjack...!
>>109107818Nothing about you or your gens is interesting, to anyone looking at your gens from the outside, it's the dullest, most boring thing they could possibly see.
>>109107925Well seeing how we have a special needs parasite that forgot where he was I hope I can at the very least cause him suffering by discussing image diffusion and my journey learning comfyUIPay me no mind.Still trying to learn anima so this will be a fun journey the model has changed since the preview
>>109107938keep em coming. best gens ITT
>>109106285>>109106316Maybe it can be fixed, but I don't want any local model that every says "no."that's why all local llms are just experimental discards of the big dog fools. We'll have legit things soon enough in all areas.
>>109107957I wouldn't go that far, also I still need to migrate more webUI functions, I didn't know that comfy never handled [thing:thing:0.5] natively which is pretty annoying desu. I'm sure there's a custom node but I rather make my own native custom node with the logic and just route it through myself.
>>109107968How can we know you are the real catjack?
>>109107973Why the hell are people still talking about me that's the real question?I look in the archive and I just see constant conversation about things that make zero sense.Well you can just listen to me about what I'm doing, more or less I'm trying to just make comfy workout for me the c++ project looks promising but I don't really see the point at the moment when I can load .guff models through comfy.
>>109107986Seems like you are annoyed.
>>109107986Catjack carved out an era in our hearts with his high IQ artist tag test back in Febraury and March 2026, and left the bar set stupidly high. If catjack ever returns, real one or impostor doesn't matter, we're all expecting them to blow it out of the water.
This shit isn't healthy man, anyways next step is to master text, there seems to be text only models but I don't think they are needed due to the flux inpaint models, I think I can just use a vllm to guide the locations based off the output so that shouldn't be a huge uplift.
>>109108053saving this for my birthday
>in today's episode, debo discovers 'pout'
>>109108085Debo, if you want to come across more as an anime poster type I think you should get familiar with the term "goon".
or just learn the basics of not making melted garbage
I slopcoded a json (or plaintext) captioner which uses qwen 3.6 35b a3b. I'm using it to create json captions for use with ideogram 4 lora training. It's very accurate, but uses a LOT of VRAM to "look at" a 1024x1024 image, about 46.5GB. I'd initially used qwen 3.6 27b since that's also what my hermes agent uses, but though it might be a tiny bit better, it's far too slow - almost 2 minutes per image, vs 35B a3b which can do about 2.5 images per minute.If there's interest I'll throw it up in github. I'm guessing not much, since it needs so much memory. I'm going to do the first ten 1000-image bundles in danbooru 1024 and then take just the NSFW-tagged images and do a ideogram 4 lora on that.I did a plaintext caption NSFW lora previously for ideogram, and it seemed to really trigger the "unsafe image" bullshit a lot more often, so I'm wondering if it's better naughty bits or somehow the fact that the lora wasn't trained on json.If it turn out it's just ideogram 4 being cucked, then at least it isn't hard to post-process the json back to plain text. I feel like. that's a quick and easy LLM task.Picrel is a typical danbooru 1024 image. There's about 330K images in the dataset.
>>109108142Nah, melted garbage is the common denominator across every anime general in 4chan. Debo is probably asexual, his gens don't have that gooner energy, but Cajack's do tho >>109107938
>>109108157Okay
>>109108159Can you not do this?
>>109108159I dont make porn, I make ART
how do you make an outfit like that so unsexy
>>109108176Who are you talking to?
is there any other lora fuckery node where you can select individual layers and their strengths and things like that like this?https://github.com/ethanfel/ComfyUI-LoRA-Optimizer
if I don't give you a (you) when I reply, it means I don't respect you
>>109108172the buildings should be darker and cooler. They're in shadow.
>>109108302found what i was looking forhttps://github.com/shootthesound/comfyUI-Realtime-Lorais there any more?
Why the fuck would the comfy team give a drawn out article on Masking and Scheduling LoRA and Model Weights instead of just making it into a single fucking node, why are they unable to go into basic user experience to do shit that to this day is nothing more than a 5 minute fucking job instead of making it into a multi node drawn out process?With modern AI models there isn't a fucking excuse even a fucking local model can make it into a single node within minutes.
forge neo is still better for any model that supports it
>>109108373It is but the problem is the creator is too opinionated and has dropped features so now I have to use comfy and adapt the features that I like from both projects.
Flux Klein 9B 4 steps (res 2s ode)vsFireRedImageEdit 1.1 + 8 step lora 1.2 (res multistep)"remove all his clothes, make him wearing only a white fundoshi"
>>109108630forgot reference https://i.imgur.com/hSett1z.png
>>109108302The node you're looking for has something like "block weight" in the name
>>109108630>res 2s ode>res multistepNot that I think it will change the results drastically, but you may want to control your variables more for experiments
Christians, who are NOT satanists use exp_heun_2_x0_sde
>>109108791please just don't post here. Thank you.
>>109108799this comment was the straw that broke the camels back
>>109108799counter proposal: I continue posting here
I have slopped Automagic V3 into sd-scripts and was messing with it a bit.It's interesting. I would say it's greatest advantage compared to Prodigy is that it is not monotonic, it can decrease its learning rate without needing lr schedulers. Otherwise it seems to peak at similar lr values to prodigy's estimates before it starts decreasing.Doesn't seem to work with hard restarts. I tried loading checkpoint to train further and it just chilled around 1e-6 instead of going up to where it previously left off. Tried again with 1e-7 and it stayed around that value. Something worth noting I guess.I just did some 512p quick meme tests but the quality seems "OK" so far. I think I will try some more serious training later.Before that I want to test this: I want to give it a small dataset and absolutely overkill amount of steps. I want to see if it can bring its own LR down quickly enough not to fry the lora.
>>109108822Have you tried automagic_sinkgd? It seems to converge faster than automagic v1
Forked cumfart ui and stripped all the nonsense and the name, now my onee-san hermes gf has a cron each 3 days to merge whatever cumfart implements but keeping in mind to prune any naming or nonsense. Feels good.
>>109108850No, first time hearing about it.Are you talking about this?https://github.com/gesen2egee/custom/blob/main/automagic_sinkgd.py
>>109108871yes. lr_bump, min_lr and max_lr are percentages of the initial lr, so make sure to set the correct values. The default values are fine
>>109108917I will give it a more detailed look after I am done messing with the current one.
>>109107710Fell asleep. Still too hot to gen. I hate summer.
>>109108384Which features did he drop?
>>109108822How is VRAM usage? I OOM on Anima for 1344x1344 on a 5090 for Prodigy.
>>109108822>I want to see if it can bring its own LR down quickly enough not to fry the lora.More warmup steps?
>>109105593Has anything over taken z image yet?
>>109109095768p (low res because I am doing fry test mentioned earlier) 16 dim batch 1 takes around 5 something gigs excluding idle.The optimizer itself has very little overhead and uses fused backward pass by default, so it shouldn't be VRAM hungry.>>109109099It does a reasonable warmup by default and I set lr to 1e-7 to be even safer. Regardless that's not what I am testing.
>>109109095Btw you may want to use slice_p=11 on (an up-to-date version of) Prodigy if you aren't already.
>>109108822>>109109095any things i need to know if i want to try prodigy with a style lora for ZIT on ai-toolkit?any other suggestions to try for ZIT?
I hope LTX 2.-whatever will be a huge upgrade over 2.3. The 2.3 model fails so often with motion and physics, even when given a video reference.
>>109109185It will overfit.Why does every save video node have to save the start frame as welll? I'm tired of having to delete them.
>>109109220Didn't the company go bankrupt or something?
>>109109233No? They laid of a few people but they just put out a lora trainer, a few trained loras and the CEO posted this on reddit a few days ago: https://www.reddit.com/r/StableDiffusion/comments/1u3a4dp/ceo_thoughts_whats_next_at_ltx/
Thoughts on the latest major anima lora?https://civitai.red/models/2709835/male-feet-anima?modelVersionId=3058231
>>109109252Good. I don't think LTX is super good but it has potential.
>>109109271gamechanger
>>109108157How do you feel about your current situation?
Ideogram has absolutely no idea who Frank Frazetta is. And any and all photography style I have tried so far barely works.A shame, really. It would be such a cool and capable model if they (apparently) didn't have such a narrow focus on what they want to do.
>>109109722Just make a lora bro Just make a finetune broJust blow 100k a year finetuning every SOTA local model that drops. Just be your own tdrusell bro.
>>109109740this but unironicallybuild an ok dataset for what you like once and you can slightly change around training and generation parameters to generate infinite kino forever across models.
>>109109722That's a pretty simple composition though. Don't you think you could achieve the same result with Anima and some vibecoded tool that copies Ideogram's regional prompter style? From what I've seen so far, Ideogram's new thing is a more elaborate regional prompter.
>>109109750Imho base model should know these well known artists. Having to train lora for everything makes model boring to use
>>109109800But you can't clone tdrusell or his bank account, at some point, you're going to have to do something on your own.
>>109109822Sure I do, but it's still better if the base has some knowledge of basic styles and popular artists
>>109109740I'll admit part of this hobby also has to come from us a little, like selling your GPU every year and buying a new one. I've been in this hobby for 3 years and I'm still stuck at 12GB VRAM. Compare that to other hobbies I had before, like guitar, where I actually upgraded my gear year after year, sometimes every six months, and spent way more money than I ever have on this.There's something about this hobby that makes me weirdly hesitant to spend money on it. Maybe it's because guitar was something social, while this AI hobby is more of a personal gooner thing, so it always ends up in the "nah, I'm not upgrading this, I only use it to goon" category. Even though I use it all day.
>>109109840You only need to buy one instrument and that'll last you a lifetime. Maybe two guitars, the first one and second one after learning the ropesDigital effects are so good now that unless you're a live player there's no need to buy anything almost
>>109109826But has this ever actually happened with any real base model? Like base SDXL, Flux, Qwen, ZiT, or Cosmos? Anima is a finetune โโโ not a base model.
>>109109849Nah, I fell for the best guitar for blues thing, then the best guitar for metal, then jazz, oh but those Taylors are so good for country folk! Then the Beatles cover band and bought guitars similar to theirs. Then amps, then pedalboards, then a home studio.
>>109109855Zbase has decent knowledge
>>109109868It's still fun though. I would still play but I lost my fingers in an accident...
>>109109871How do you type?
>>109109871How do you grab your dick?
>>109101168
>>109109915>Only RL leftwhats that? eternal gguf waiting room
>>109109750I mean, yeah. That would be the fix, but I'm just messing around with the vanilla model still to see what it can and can't do.>>109109771Sure. I've made way better on Chroma (I think), which knows Frazetta pretty well at this point. You could probably get this same exact image from Ideogram without any bboxes or JSON format, too.I'm just interested in its general capabilities right now.
is it ethical to jerk off too AI hentai
>>109109915the left is bigasp? looks good. better than the previous sdxl versions
>>109110039no, fuck same face slop
>bigASP 3 is currently undergoing training. I'm dumping some of the intermediate checkpoints here for now. Load as a Flux 2 Klein 9B model. They generally aren't usable as-is. Still needs to undergo post training RL and I need to finish up the prompt enhancer model.>2026.06.18: Main training is done, 3fph2juh_flux2klein9b is the last checkpoint from that phase. Next up: RL and prompt enhancerhttps://huggingface.co/fancyfeast/bigasp-3
image captioner aimed at ideogram 4 but usable for any natural language prompt model: https://github.com/quarterturn/ollama-captioner
>>109110105>ollamajust support llama.cpp as first class citizen instead instead of that garbage malicious retardsOIlama:1. history of purposefully obscuring that they were using llama.cpp under the hood which violates the open source license2. history of obscuring names of small models like the deepseek distills so they can get downloads off the large model's hype3. ollama model format is confusing and challenging to work with4. ollama had a short period where they refused to attribute credit to llama.cpp, so they tried to rewrite the backend in golang, and ultimately broke a lot of functionality. They went back to using llama.cpp with their tail tucked between their legs5. once got caught stealing gpt-oss implementation code from llama.cpp when ggerganov put a papertown bug in the original PR6. default configurations for sampling and offloading was generally bad, and gives new users a poor impression of local LLMs.https://sleepingrobots.com/dreams/stop-using-ollama/
>>109110117I hear you. It's mostly inertia on my part from back when llama.cpp didn't have an openai-compatible API endpoint, but also it makes it a little easier to manage things when there are multiple models. Maybe llama.cpp can server multiple models now?
>>109110144router mode in the server binary has everything you needyou also have llama-swap which supports switching between multiple llama.cpp forks too
>>109110095
So the painting focused finetune, who are the artists?
>>109109088It tells you on the project page, my biggest gripe was the old webui refiner which was basically model swap mid generation
Best guide or resource on setting yourself up with WAN?
>>109110361if you have to ask thenhttps://github.com/deepbeepmeep/Wan2GP
>>109108666>Not that I think it will change the results drasticallyI know, just that even with the advantage of 2s at the end of the day flux doesn't really understood the prompt
I'm getting some surprisingly good results by using a realistic model to produce anime instead (with the right prompting):https://civitai.red/models/2409949/sam-anima-realistic?modelVersionId=3017757I'm guessing because it's got more (focused) anatomy knowledge that makes the anime anatomy better as well.
>>109110443this shit is slopped to hell and back, whats wrong with you people
>>109110443Show a comparison. In my testing, sam is much stiffer. I bet it's because real-life images often have very similar poses and expressions. 99% photos with people have the people looking into the camera. The number of photos that show people blushing or acting nervous is minuscule.
>>109110458I just fucking told you I use it to produce anime. "anima_baseV10" produces worse anime AND worse realism.Why did you even bother responding, you don't know jack shit.
>>109110458Just learn how to prompt skin imperfections.
i want to use scail-2 to replace a woman with a child. im making a hilarious meme. segmentation scales the body to match the woman's. how do i solve this
Glowing so bright the you blot out the stars in the night sky.
>>109110538Sure I'll tell you in a sec
>>109110538I bet it's impossible. The tech simply wasn't trained to scale the replacement object.
Good morning, it's Monday, and also Ace Step 1.5 XL SFT is in the oven. It's like having dial-up, I can't talk to Gemma because I'm busy genning music. Sorry babe, I got work to do.
>>109110560Vramlet?
>>109110565idk. batch=8
>>109110572Waste of time rolling like that
>>109110578despite saying "sft" on the lid, outputs are highly varied.
>>109110643I know it's quite annoying you're hunting for the right flow/temp then you have modify the lyrics with a working seed, it's quite annoying how resistant it is to keeping things in line in many cases it really does feel like a crapshoot
what's the verdict on ideogram 4 now that it's been out for awhile? better than ZIT for realism?
>>109109185Don't know much about ZIT training besides the fact that it is a pain in the ass, sorry.
>>109110653It has a very 'this is AI and ideogram made it' kind of look to it, much in the same way you can instantly recognize gpt images.It's fun but not the be-all-end-all for realism, I'd say.
>>10911065298% of diffusion addicts give up 2 gens before striking diffusion gold.
hello beautiful anons! I've been out for a few months, I see a lot of people using anima nowadays, has it surpassed illustrious?Also, which version of anima is the best one?thank you and have a nice day
>>109110760yesbase-v1.0
>>109110760what>>109110775said
>>109110760Illustrious still offers unique loras and good realistic models. but it's over for pony lol
>>109110653>better than ZIT for realism?kekaroo
>>109110798cutie
>>109110760I feel like XL still does porn better than Anima
>>109110830How good is it at anal bead sizes compared to Anima?
>>109110830yeah we need a noobai finetune for anima. but anima has catastrophic forgetting problem so...
>>109110849Why do we need a finetune of a finetune that has already squeezed everything from the provided architecture? Why can't noobai finetune cosmos 3?
>>109110849Why do you do this daily even after anon proved you wrong yesterday?What sort of mental illness drives this behavior?
>>109110798>>109110827samefag
>>109110849Yeah the catastrophic forgetting is a real problem, no idea why the anima shills act like it's not a proven fact at this point
The first thing you do when you wake up is to seethe and shit up the thread.I feel bad for you man, how about you actually do something in this spaceWe keep showing you that that this forgetting meme is a larp and a anon already made a finetune proving that you're fucking retarded. Why persist in this bullshit?
A certain someone seems to really have a problem with /ldg/And it seems to always be at the same time of the dayWho could it be?
>>109106259catbox?
The complete anima does blend artstyles better so that's good
>>109110798>>109110940How do we know you're actually catjack? Upload your gen on Discord and post a screenshot. I only bow to the real lord catjack, not some impostor.
I wish flux2 dev's skin didn't look like absolute shit. Ideogram and ZIT are so much better at that, it's insane. I'd love to use flux2 more.
>>109110560:^)This one is called "Treinta". It's gonna take the Internet like a loose woman with unique pronunciation and easily understood sarcasm.https://files.catbox.moe/oeezm9.mp3
I want to make a style LoRA using copyrighted characters that are already in the Anima dataset. For captioning, do I need to manually tag everything, character name, franchise, every piece of outfit, or can I just run the images through JoyCaption and let it caption the overall scene automatically? I only want the LoRA to learn the style itself like the coloring, face shape, and overall body physique, not the specific characters or outfits.
>>109111083I like this slop style. Mind sharing the sauce?
>109110964There are better uses of your time
>>109111065just wd tagger is good enough. if the lora learns outfits too much then don't train the cross attention layers
>does the same broken reply behavior he does when he's anon
I don't have time to bake, someone else has to
i finally finished training my lunafreya young version (final fantasy xv) lora with ai toolkit. Spent hours on end captioning the training data images for it. majority of the dataset were synthetically made with gpt image 2.
>>109111283>majority of the dataset were synthetically made with gpt image 2.Bruh. There are 300 images on danbooru, why did you do that?
>>109111283>cgi>still has background blurhow do we make it stlop?
>>109111283>>109111307Oh I see most are different version. The gen looks nice still.
>>109111283Planning to upload somewhere?
>>109111350who cares, it's not an ace step 1.5 lora
>>109111307anon.. there is only three very short cutscenes of the young version of lunafreya in ffxv while the adult version of her has a abundant of images and video all over the internet. i didn't like lazy procrastinator's version of young lunafreya. The images gpt image produces using 5 references of her were pretty good.
>>109111387cool cool, so where's the link to your lora?
>>109111377why are you specifically so obsessed with ace step? do you not realize the music ai scene is no where near comparable to image/video gen, therefore there will never be much interest in it?
Thats actually not a bad idea using GPT image 2 for datasets... The character I like has alot of stuff too but the *stuff* is stuff I'd rather not look at.
>>109111423Go back
>>109111408I am a man of taste who likes good music. This is why I am an outspoken champion of the People's Music, Ace Step 1.5 XL.
>>109111408He's just looking to bait over stupid shit I bet any music he post is shit too
>>109111432two tabs
>>109111439I bet capitalists sent you here to say bad things about Ace Step.
>>109111442Close one and don't come back here debo
>>109111350>>109111407i want to upload it to civitai but I'm anxious of the trippy mods over there nuking my account for posting a "minor" despite it being fictional character. here's a go file link to the lora. it was trained with z image base use the basic default settings at rank 64. i using the junggernauntZv10fastBy to generate the imageshttps://gofile.io/d/2zRJND
>>109111423>>109111442Fuck off thread schizo
>>109111450no
>>109111387Gpt is so powerful, local would be nothing without it
>>109111458You're not welcome here debo
>>109111387Have you tried flux klein or qwen image edit? How were their results?
test
>>109111452post it on .com not .red and disable nsfw gens with it in page options, dont have suggestive prompts or images and thats ti
I see he's here again must be having a rough time.All these years no progress still spiteful. I would be mad too if this was the only thing I did with my life.Once I'm done setting up comfy I'll make platinum tier handicap signs for him so he can feel included.I also need to reinstall iopaint for post prod or look to a all in one solution somewhere else
>>109111452the only reason people get nuked is because they can't help themselves when posting images.
all are welcome in ldg
>>109111463I would be nothing without my toilet, though.
/ldg/ admissions are pending submission of ace step 1.5 xl loras. Get in line over here.
>>109111452Thanks, what's the trigger word?
>>109111504Crazy how low effort slop seem to magically pop up when he's under fire. Can you make something not xl tier?Even my fail gens (and any /ldg/ regular's fail gens) tower over anything you and your /sdg/ waste of space can muster up in a board room with a weeks worth of planning
>>109111506localkeks gleefully eat shit out of the toilet, we know
Flux Klein is undefeated in LoRA training
>>109111504thank youI feel very welcome
>>109111556No what you experience is "being lonely" and now fuck off into your splinter general lol
>>109111479if your talking about training a lora then no, i haven't tried training lora on those models but curious to try with klein 9B. Hope the speed is as good as z image base. As for preparing a dataset with those models, the results were bad and not good enough.>>109111522the main trigger is lunafreya, but here is the other tags i used : lunafreya, young girl, young appearance, youthful, cute, human, caucasian, petite, petite frame, slender build, 1girl, pale skin tone, smooth skin, slim build, slender shoulders, fair complexion, flawless skin, blue eyes, almond eye shape, long eyelashes, small nose, slightly parted lips, soft mouth, platinum blonde hair, straight hair, shoulder length hair, blunt bangs, fringe hair, hair texture, silky hair, soft hair, wearing white spaghetti strap dress, white dress, thin straps, scoop neckline, sleeveless dress, layered fabric, white cloth, cotton texture, simple dress, small breasts, minimal cleavage, ultra realistic textures, 3d render, 3D, 4k, high quality rendered textures, highly detailed, video game screenshot style, cinematic cgi cutscene, video game cutscene,
>>109111534>says all that>oily pony slop imageare you actually just pretending to be retarded? is this just all some retarded joke for yourself?
why do these avatarniggers and namefags shit up the blessed thread? there's /s*g/ for that
>>109111574Get your eyes checked.
>>109111611check my cockit's itchy
>>109111592I hate to break it to you but the thread blesser is an avatarfag
>>109111571I asked about making a dataset. Sad that it's not good enough. I remember trying Nano Banana Pro to make a dataset half a year ago, it wasn't that much better than Flux Klein 9b or QiE because NBP introduced its own stylistic bias.
>>109111611get your head checked
>/s*g/ anons raid the blessed thread againIt didn't work in the past and it won't work now
>>109111630It's anyway not blessed anymore since debo posts here because he's lonely
>>109111630This has been happening for days now, just look at the state of /sdg/ and laugh
>>109111539i very interested in training lunafreya with flux klein 9b. Not really liking basic ass the 3d render slop results I'm getting z image base. I want the results to look like ps5 level final fantasy xvi and vii remake trilogy in game graphics and cut cense graphics. gpt image 2 was able to pull this off very well.
>>109111696Go be unemployed somewhere else
I forgot that Klein was really good at replacing characters with the LoRA character for basically any image.
>>109111539Has anyone fixed klein body horror (extra fingers etc) with a lora? Thats its one weakness
have a 5090 with 128gb ddr5 ram. what are decent lora training setting for flux klein 9b base. Are the default settings good enough? do i increase the learning rate and change the optimizer? is rank 64 and 5000 steps too high? also lora vs lokr?
>>109111766Disable low vramSwitch to bf16 precision in quantizationIsn't 64 rank a bit overkill for character lora? I am not sure thoughBatch 2 might help if you have a lot of imagesDefinitely go with AdamW NOT-8bitSet caption dropout to 0
I unironically support debo posting because it might finally get catjack to fuck off
FireRed (bottom) alters the rest of the image (torso and legs) a lot more than Flux (top)"make him wearing only a red thong"
does this general even have a culture? just seems like the same 3 dudes arguing with eachother
will anima 2 have catastrophic forgetting problem like anima 1?
>>109111834Not one single internet place has culture, its just bots and schizos arguing with each other. Or bots and schizos fellating each other, theres no inbetween
>>109111805But i'm more of a catjack kind of person tho
>>109111740At this point, isn't it just more sensible to wait for Flux 3 Klein 2 instead of sinking time into a model that's already over 6 months old?
>>109111793i have a total of 47 images, so leave it at 1? also how about layoffloading?so leave quantization transformer and text enconder to none right? none= fp16?how the learning rate and weight decay? keep it at 0.0001 or increase it?For resolutions, should 1024 be only checked or could 768 and higher ones be checked? most of my lunafreya images are at 1.5MP, 2MP and 4MP(2560x1440) resoultion sizes.
At this point I don't think any of us need to be downloading .safetensors from civit. The local meta is sharing already captioned dataset and handling the training ourselves.
>>109111955>also how about layoffloading?If you want to needlessly slow down training it's a good idea.>so leave quantization transformer and text enconder to none right? none= fp16?No I think for AI toolkit that would be fp32. Just keep it at fp8 then, I can't stand that tool.>how the learning rateI have no idea. I would do some prodigy runs to see what it settles at. Or trail and error and look at samples. >weight decayYou want 0.01 or slightly higher probably.>For resolutions, should 1024 be only checked or could 768 and higher ones be checked? most of my lunafreya images are at 1.5MP, 2MP and 4MP(2560x1440) resoultion sizes.I dunno just keep it at 1024p. I doubt higher res would benefit you that much for simple character lora.
>>109111946No idea? Are things like flux 3 klein 2 really imminent? What about ltx 3?
WHAT SHOULD THE NEXT THREAD TOPIC BE?!
>>109111998Why? Because then we handle the prompt nuance to our own liking or?
>>109112025debo
>>109112034a malware distributing racist? no thanks
>>109112071
who will download it?https://x.com/krea_ai/status/2069111451202334885
>>109112028Because we're free to pick whatever model we want to train and use it on. The hobby's moving too fast at this point.
>>109112112Oh, I see. Thing is not every dataset layout works best for every model? Different prompt styles, video lengths resolutions etc?
>>109112109>49.61 gbThis better be fp32 weights
Do the custom illustrious or klein checkpoints basically break loras since most loras are based on the base model?
making my first anima style lora, 40 images, all similar landscape resolution around 1.3MP>network_dim = 32; network_alpha = 1; network_train_unet_only = true; llm_adapter_lr = 0>learning_rate = 2e-5; optimizer_type = "AdamW8bit"; lr_scheduler = "cosine";>lr_warmup_steps = 200; max_train_steps = 3000>train_batch_size = 1>gradient_accumulation_steps = 1>timestep_sampling = "sigmoid"; weighting_scheme = "uniform"the 3000 step result managed picrel as a cherrypicked best of 10 (lora weight 1.0, trigger tag weight unmodified), but most outputs are far too clean/tidy/digital and/or have a default typical anime model look, in particular it hasn't learnt the character very well yet.are there any obviously bad settings to fix or very helpful good settings i'm unaware of? I curated down to a small number of clear aesthetic images for quality and to save time tagging - for a strong style change like this do i want to aim for 100-200 images instead, and only having 40 should be expected to hard gimp my results? or is 3000 steps also too low with these settings?
>>109112206>network_alpha = 1i don't recommend alpha = 1 as it makes the effective rank lower. also lr = 2e-5 with alpha/dim ratio = 1/32 is too low
>>109112145Loras get sloppier when used with shitmixes but I wouldn't say they break outright.Sometimes you get really unlucky but most still produce somewhat usable results, even though degraded.
>>109112206If these aren't the diffusion-pipe defaults then you should use them
>>109112109If these are the non-lobotomized weights, this will be the week local is back.
>>109112206>>109112227Keep 2e-5 but set alpha to 32.Honestly 1e-5 or 5e-6 will probably work better. Increase step count a bit to 4kOptimizer normal AdamW unless you are turbo vramlet--sigmoid_scale 1.3 is recommended40 images for style lora is rough btw regardless of what you do, you want to hit at least 100.If you are experimenting just stick with 768p, or even 512p and switch to higher res once you are certain you locked parameters in
>>109112206No anon, you're supposed to NOT post your settings and just declare the model dead instead of asking for help.
>>109112259Whats the big deal if klein is newer? I get maybe the images look slightly better but no image-to-image, so basically just another ZIT?
>>109112109I would, if it would start.
>Krea larping as Mistral dropping a magnet linkYo dumbfucks, the difference is that mistral 7b was the llama architecture, so any inference program that could run llama could run the mistral weights in that torrent. That's not the case with this model. Even if you download 50GB of weights, what format is it in? How do you run it? Next time at least coordinate with ComfyOrg to add day 0 support and release the weights in that format.
>>109112227>>109112285thanks. i'll try lr 5e-6, network alpha 32, plus the optimizer (16GB VRAM so i hope i'd be fine) and sigmoid scale. and i'll go back in and flesh out the dataset properly now, should be able to get 100-150 relatively clear pictures. i'm working via codex and it's really good, can fan out agents to tag 100+ images in under a minute with pretty good accuracy according to a markdown ruleset with how to tag for anima + whatever rules you come up with for the dataset after watching a trial run subagent make some mistakes. barely eats any usage limit. couple of random examples:>@slaytheprincessstyle, 1girl, the princess, seated, gentle expression, sad smile, hands clasped, long hair, crown, off-shoulder dress, basement wall, medium shot. The Princess sits with her hands clasped in her lap, wearing a crown and an off-shoulder dress. She gives a gentle, slightly sad smile in a basement-like room.>@slaytheprincessstyle, 1girl, the witch, wide shot, full body, crouching, barefoot, torn dress, long hair, tail, root tunnel, dark forest, chains, looking at viewer. The Witch, a barefoot woman with a tail, crouches in a dark tunnel made from tangled roots. Chains hang on the walls and the figure is small within the wide composition.might change the trigger, it's a fucker to type.pic related also, codex made it trivial to get a dataset of 70 pixel art images detecting various pseudopixel sizes to downscale to 1x, blown up to 8x, and padded to half a dozen sensible factor-of-16 buckets using the background color without violating the sprite's position on the new 8x8 grid relative to the image borders. all in about an hour of brainless prompting
>>109109915>bigaspv3https://litter.catbox.moe/gk584bpa66e7zi0d.pnghttps://litter.catbox.moe/cczi7wfqtjw95w4a.png
>>109112495>>109112495>>109112495