Discussion and Development of Local Image, Video, and Music Models

Previous: >>109034986

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
SDWebUI: https://rentry.org/ldg-lazy-getting-started-guide#the-stable-diffusion-web-ui-lineage
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, & Upscalers
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/tdrussell/diffusion-pipe
https://github.com/kohya-ss/sd-scripts
https://github.com/kohya-ss/musubi-tuner

>Z
https://huggingface.co/Tongyi-MAI/Z-Image

>Anima
https://huggingface.co/circlestone-labs/Anima
https://tagexplorer.github.io/
https://animadex.net

>Qwen
https://huggingface.co/collections/Qwen/qwen-image

>Klein
https://huggingface.co/collections/black-forest-labs/flux2

>Wan
https://github.com/Wan-Video/Wan2.2

>LTX-2.3
https://huggingface.co/collections/Lightricks/ltx-23

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Collage: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
>inb4 the useless unfappable deepfake spam
https://rentry.org/LDG_vital_info
Blessed thread of frenship
https://www.anthropic.com/news/fable-mythos-access
cloudkeks are on the ropes
>SaaS is so powerful that the government has to step inmeanwhile local is so kekked they censor themselves
>>109041732Anthropic genuinely might be an all time top 3 "snake oil" business.
>>109041701>his grapes are sour
>>109041746you're so gay that the government has to step in and prevent you from trying to suck all the cocks on the planet.
>>109041732>be local>winLocal can't stop winning.
>>109041746>SaaS is so powerful that the government has to step inWell retard, If you actually read the statement you'd see that Anthropic do not agree with you, they think the US govt is being vindictive.>[...] we believe the government should have the ability to block unsafe deployments, as part of a statutory process that is transparent, fair, clear, and grounded in technical facts. >This action does not adhere to those principles.
>Local models are free, uncensored, can be trained to do anything>Yet somehow the government feels bigger threat from 'censored' APIlocal models are so dumb they're effectively harmess, nobody takes them seriously. API models are so advanced in thinking and capability, they're not even in the same league.
>API models are so advanced in thinking and capability,
>>109041888He's right. Car washes don't reduce your wanted level. If you want to wash your car, you need to go to a body shop.
>>109041831>API models are so advanced in thinking and capability, they're not even in the same league.Are you sure about that cloudkek? You can't finetune cloudshit models on custom data. As it stands, a local model with a LoRA is more dangerous than censored ("more capable) cloudshit model with similar param count. You might say (apples to oranges) but remember glm 5.1 was already open sourced.
>>109041831They already seethed about celeb deepfakes and AI political propaganda. There just wasn't a major incident yet.
>>109041916>You can't finetune cloudshit models on custom dataBecause you don't need to, this isn't the own you think it is. Local is so far behind you still have to train loras for outfits while API can do it in a single-shot and even search the internet.You could try training a local model to make a bomb or whatever but they're so dumb that it wouldn't even work properly. Once GPT/Claude/Gemini gained internet access, local fell off the face of the earth. I can just ask GPT to "put her in the kansas city royals jersey" and it will work, no need to use separate edit models (outdated localcope) or custom loras.
Pretty sure their concern is more with people finding a way around the safeguards and making 1000+ agents hack into national security shit than a few degenerates generating CP
>>109042033>I can just ask GPT to "put her in the kansas city royals jersey"Yeah but then it'll have that GPT image grunge, so its useless
Why does this video loader only have an image input, how am I supposed to get a video upscaled with this?
>>109042056Nah, their main concern is preventing a foreign adversary from finding and patching their backdoors. Read the reasoning behind the immediate ban- >>109041673>To date, the government has only given us verbal evidence of a potential narrow, non-universal jailbreak, which essentially consists of asking the model to read a specific codebase and fix any software flaws. Our understanding is that one potential jailbreak was shared with the government. We have reviewed the report and validated that the level of capability displayed there is widely available from other models (including OpenAI’s GPT-5.5), and is used every day by the defenders who keep systems safe. We will share more details over the next 24 hours. From that alone, it's obvious.
>>109042110>images
>>109041707
>mfw Resource news

06/12/2026

>ComfyUI-Flux2Klein-Enhancer: Conditioning enhancement and reference latent control
https://github.com/capitan01R/ComfyUI-Flux2Klein-Enhancer

>InterleaveThinker: Reinforcing Agentic Interleaved Generation
https://zhengdian1.github.io/InterleaveThinker-proj

>Experimental Anima LLLite Regional Controlnet
https://huggingface.co/Sen-sou/Anima-LLLite-Regional-Controlnet

>World Tracing: Generative Pixel-Aligned Geometry Beyond the Visible
https://haoz19.github.io/world-tracing-page

>VietFashion: Benchmarking Sketch-Text Composed Image Retrieval for Cultural Outfits
https://hng0303.github.io/VietFashion

>Modality Forcing for Scalable Spatial Generation
https://modality-forcing.github.io

>VideoMDM: Towards 3D Human Motion Generation From 2D Supervision
https://videomdm.github.io

>EvTexture++: Event-Driven Texture Enhancement for Video Super-Resolution
https://github.com/DachunKai/EvTexture

>Budget-Constrained Step-Level Diffusion Caching
https://github.com/Westlake-AGI-Lab/BudCache

>ECA: Efficient Continual Alignment for Open-Ended Image-to-Text Generation
https://github.com/Snowball0823/ECA

>InterleaveThinker: Reinforcing Agentic Interleaved Generation
https://zhengdian1.github.io/InterleaveThinker-proj

>i1-3B: A Simple and Fully Open Recipe for Strong Text-to-Image Models
https://huggingface.co/zlab-princeton/i1-3B

06/11/2026

>i1: A Simple and Fully Open Recipe for Strong Text-to-Image Models
https://zlab-princeton.github.io/i1

>AnchorEdit: Maintaining Temporal Consistency in Multi-turn Image Editing via Causal Memory
https://github.com/xuhang07/AnchorEdit

>Reroute, Don't Remove: Recoverable Visual Token Routing for Vision-Language Models
https://github.com/elmma/mllm-reroute

>ComfyUI-BerniniStudio
https://github.com/CCpt5/ComfyUI-BerniniStudio

>Ideoprompt: plain English to Ideogram 4 structured JSON prompt
https://github.com/cocktailpeanut/ideoprompt

>Orion4D FXMax for ComfyUI
https://github.com/orion4d/Orion4D_FXMax
>mfw Research news

06/12/2026

>TetherCache: Stabilizing Autoregressive Long-Form Video Generation with Gated Recall and Trusted Alignment
https://arxiv.org/abs/2606.13035

>DuET: Dual Expert Trajectories for Diffusion Image Editing
https://arxiv.org/abs/2606.13303

>Efficient, Robust, and Anti-Collusion Fingerprinting of Image Diffusion Models
https://arxiv.org/abs/2606.12977

>ReFree: Towards Realistic Co-Speech Video Generation via Reward-Free RL and Multilevel Speech Guidance
https://arxiv.org/abs/2606.13304

>SeamEdit: A Black-Box VLM-Agnostic Pipeline for Large-Image Semantic Editing
https://arxiv.org/abs/2606.13041

>Towards More General Control of Diffusion Models Using Jeffrey Guidance
https://arxiv.org/abs/2606.13240

>AudioX-Turbo: A Unified Framework for Efficient Anything-to-Audio Generation
https://zeyuet.github.io/AudioX-Turbo

>SmartFont: Dynamic Condition Allocation for Few-Shot Font Generation
https://arxiv.org/abs/2606.13382

>High-Fidelity Two-Step Image Generation via Teacher-Aligned End-to-End Distillation
https://arxiv.org/abs/2606.12575

>Bridging Modal Isolation in Interleaved Thinking: Supervising Modality Transitions via Stepwise Reinforcement
https://arxiv.org/abs/2606.12886

>Selecting Samples on Graphs: A Unified Dataset Pruning Framework for Lossless Training Acceleration
https://arxiv.org/abs/2606.12913

>Edit the Bits, Diff the Codes: Bitwise Residual Editing for Visual Autoregressive Models
https://arxiv.org/abs/2606.13558

>HYDRA-X: Native Unified Multimodal Models with Holistic Visual Tokenizers
https://arxiv.org/abs/2606.13289

>Emotional regulation improves deep learning-based image classification
https://arxiv.org/abs/2606.13081

>Weekend Time
https://suno.com/s/NDgulWDocrYOA7US
>>109042180>>109042188None of this data is useful when you blindly post it without any vetting. Didn't you learn your lesson already?Why are you spamming this in this thread when you have a tard cage all to yourself?
>>109041406
Haven't figured out how to nest LoRAs yet, I guess one could merge one of them into base instead, and then go from there. In the meantime, the final epoch perhaps wasn't so bad after all even with a bit of overbaking in there.
https://vocaroo.com/12tVNq7SnhO1
https://vocaroo.com/1iXXFRxfvMQy
>>109042265your 'feedback' is just a vehicle for you to attack me as a poster; therefor, it will not be considered
>>109042312>https://vocaroo.com/12tVNq7SnhO1Leslie Parrish's voice really shining thru there kek. Lowering LoRA weight should always mitigate diversity issues.
>>109042265Don't listen to this guy >>109042180He's just mad cause he can't read.Thank you for blessing us with daily research.
>>109042312
Something interesting is that I didn't tag it, just used generic descriptions, but for some reason Claude is able to control exactly which singer and style from a 20+ song dataset gets triggered, even without using the same exact captions.
Here's a Dave Rogers style song
https://vocaroo.com/1lXhL94xMsvi
Manuel
https://vocaroo.com/14wvmcvt94lB
>>109042411And there's a total of 2 songs from both of them. It's a diverse dataset.
>>109042378:)
Is anyone here using Ideogram 4? What do you think?
>>109042516i saw people experimenting and it seems to be the best t2i model if you know what you want. i am still waiting for reference image support
Where is the workflow for scail2? It doesn't appear as a default workflow so I assume Kijai posted it somewhere in one of his many many repos and just assumed I'd know where.
>>109042516
I like it. As far as the control over image composition it gives you, it rivals saas. I've been meaning to train some hentai comic LoRAs to see how it handles art styles because it's already fairly competent at making comics
>>109042033>Local is so far behind you still have to train loras for outfits while API can do it in a single-shot and even search the internet.Local LLMs are not as far behind as you think. Image models etc... are one thing that only hobbyist companies do, but the big open source companies have focused on LLMs, and as such there's already a local LLM on par with the best cloud LLMs (of course, just not as good as the latest one, but if we can take a local LLM and say it's as good as Gemini 3 Pro, that's close enough...)
>>109042585Now, you may not believe in benchmarks, but this is the case based on private tests, benchmarks, lmsys, etc... GLM 5.1 is better than Gemini 3.1 Pro and Sonnet 4.6
>>109042516yes, it's the most powerful prompting
>>109042585The main thing is that dataset determines the diffusion destiny.And that can often mean that big models are in fact more restricted in what they can include.Here's a question to ask, has nanobanana ever trained on real actual medium format scans?
>>109042575
https://github.com/Brobert-in-aus/scail-auto-extend
Use the wf with this node
>109042599use ideogram to generate a cartoon explaining how to make curry like your mom makes it.
>>109042610ty
any differences in generating hentai from december to now? I think I'm on Stable Diffusion through forge webui
>>109042794Nah, industry is pretty stagnant
>>109042794Don't listen to APIcucks. They're trolling. Wan and LTX is better than ever. Check out last thread for multi character swap potential of SCAIL-2
>>109042806*local is pretty stagnantindustry got tons of sota image models and claude fable is so insanely good the government had to ban it
Figured out how Wan SCAIL-2 multicharacter is supposed to work. Just combine 2 characters in the segmentation input and prompt for 2 subjects. It will swap them out. Works best with same aspect ratio input references.
>>109042830that has lots of potential
>>109042816>claude fable https://www.youtube.com/watch?v=wVJ7LYrl83E
What if I dont want to just copy animations from existing anime shit? I don't wanna be Alan Bowe, I wanna be Miyazaki.
>>109042830I’ve found it to be hit and miss with prompting multiple subjects, as far as I can tell it goes off the first frame of the video, so if there’s only one character clearly visible to begin with it won’t segment the image into multiple characters, instead both of them will be blue.
>>109042816>claude fableKind of annoyed about this. I had some projects I wanted it to look over and clean up then they banned it.
>>109042864Dont worry about it. It was opus 4.6 but with a bunch of grifter ass underlying bullshit like how it aggressively nerfed outputs for no reason. So, technically, we still have Fable 5. Just go use Opus 4.6
>>109042830my pc is too lowend to take on WAN. what's scail-2 do?
>>109042516
I thought it was total shit at first but since it can make comic pages like this, >>109042575
it's actually got *some* potential. however, just making the comic panels one by one and using a program like comic life 3 is still better.
>>109042903Better and easier to use reference replacements and tracking. Just look at previous threads for more examples.
if ideogram doesn't get some kind of finetune it will be the biggest waste of potential yet. 'muh license' is such cope. what are the chinks even doing, are they too ashamed to train on a western model? what ever happened to noobai?
>>109042903better video referenced videos is the main current usage I think
>>109042956me when i blink out of existence
>>109042953>'muh license' is such cope.I actually hate these people who pop up to remind you about the license whenever you discuss the model. It's genuinely the most useless "Uhm achually" sentiment you can make.
>>109042953>what ever happened to noobai?noob2 is saas only and controlled by comfyorg or some shit
>>109042953>'muh license' is such copeit's the most relevant feature that determines if the most relevant parties involved can even actually use/redistribute the model
I have been using ideogram for about a day, and not once have I triggered that gray blocked content blob thing. Then again, I haven't attempt to really generate any nudity since I'm not a coom brain
>>109042994who gives a shit
>>109042956Cute.
how do i get klein 9b to make a nice vag? they like to add a bulge or just do some really weird shit and make them really long and weirdly shaped when i edit, pic related.
>>109042996and?
>>109043000do you just lack basic foresight or understanding? people don't want to be rugpulled or sued after putting in effort as happened *countless* times in many areas surrounding copyright.
>>109043026>how do i get klein 9b to make a nice vag?You can't without LoRAs. It does not know what a vagina looks like.
>>109043025the video ended up cute and the original gen by another anon breads ago also was already very cute
>>109043028And I leveled up on fortnite this weekend
>>109043041yawn
>>109043053which loras. there is like a gorillon of them. i probably have them all and tried them all but i'm not sure what works. I was hoping for a definitive one.
>>109042516Trash model that does more of the same.Looks like the astroturfing worked.
>>109042316The OP shows you linking anons to malware, why do you post this constantly with that being a known issue, did you apologize?
>>109042953>are they too ashamed to train on a western modelLol. Ernie is superior and the Chinks know it.
>>109039192To the cloud faggot, yes you can gen that shit in local this is Z image
>>109042996I triggered it with a cat pictures. I guess it's a skill.
>ernieforgot that shit existed. what's with china and pumping out garbage like hidream, ernie, and glm?
>>109043176GLM is actually the best cloud AI on the market right now if you're not a shithead code grifter.
>>109043176commercial/research entities regularly pump out models with no good questionable/nsfw tuning even outside chinayes few of them seem to succeed, but I guess they prefer to keep imaginary or perhaps real trouble away? idk about their circumstances in detail.
>>109043155An he uncucked version
>>109043244the fact that you immediately thought of GLM’s llm says it all. the image model is downright terrible
it's friday? i thought it was tuesday
>set up a long queue and went to go work>came back to find it was all using the same seed
>>109043155>>109043255OK I think this is the best, uncucked, young but not too young, and with really 80s atmosphere
>>109043265lmao, glm has an image model? Forgive my ignorance
I was wondering why Bernini was so good.They finetuned Wan on 20 million video pairs.The model being from ByteDance too helps obv.
>>109041170
>First, you haven't run a full dynamic range of settings for Tan2.
no idea, i'm not smart but stubbornly refuse to take anon's word at face value sometimes.
>Also, you should always graph your steps.
wat? that grid is one of 4, flux 2 klein, 6 steps to rule out some of the ones looking unfinished because it just needed more than 4 steps that are usually fine. the specific thing that drove me to doing this again was the other day seeing armpit skin gens anon posted and thinking that'd be something interesting to test across different models but got hung up on flux 2 klein because it kept giving me stubble but not like armpit stubble i see IRL, it's like only in the skin wrinkles. pic related. so i tried prompting it away using shaved, hairless, etc, didn't work, wondered if a different sampler would be better, haven't had time/motivation to push thru going thru all the results yet so will probably stick with euler + beta or flux2 for klein gens.
>>109041320
>what's "Automatic"?
one of the scheduler settings in forge neo but i've never looked into it. in x/y tests i did with samplers/schedulers for zit (that led me to preferring DPM++ 2s a RF with bong tangent) the automatic scheduler column didn't match any others exactly, assume it will be the same with klein and others but don't know for sure.
>>109043281countless 1girls, gone like tears in raintake measures to prevent this
>>109042516pushing the local scene forward without being some stupid size like a 400b model. hopefully we get a model from some lab that pushes the convenient side of things as well, im lazy as hell with bboxing and just wanna prompt
>>109043323i'm too lazy to patch it, i still haven't jumped onto the vibecode bandwagon so i would have to go search for the code that loads the seed from metadata
>>109042931so you gen with the reference visible, then crop it?
>>109043409it's like the image editing models or controlnets where you supply references, surely you used some of those by now?
>>109043409Reference Image + Video = kino, what not to get?
>>109042312
Might have to retrain my ZUTOMAYO LoRA on the entirety of this album so I can have a cool nice live version of it, this time using the higher rank settings so it can pick up all the nuances in her voice etc...
https://music.apple.com/us/album/midnight-forever-expo-meik%C5%8D-wa-gunaruga-gotoshi-live/1840129493
>>109043462>kinoneked lady is not kino
>>109042180
>>109042188
thanks!
seems like you've missed https://nvlabs.github.io/motionbricks/
>>109041690Tele Bgftg33Turn my Asian gf's pics into a lora, send me a sample of other loras you've made
>>109043155>>109043255>>109043288look awful
>>109043462why is she so small
>>109043462also have you tried to mask part of reference video to see how the model will fill it
>>109043462I want Marika to rape me.
>>109043562thanks, will add this
>>109043565>>>/r/
>>109043616/r/ is dead bro
>>109043562Isn't this just motion matching with a bigger library of motions?
>>109043565You misunderstand, anon. You must convince me that she's good enough to warrant my GPU time.
>>109043562So if someone trains sex animations on this you can make a game world where you can fuck all of the NPCs?
>>109043622That is so stupid kek
>>109043667Oh, so this is why it's gonehttps://archive.is/2026.05.21-102104/https://www.wired.com/story/4chans-misogynist-wizards-are-nudifying-women-by-request/But that is still retarded.
>>109043676tldr. why did gook moot cave in now? feminists have been complaining for over a decade now
>>109043682US Law - Take It Down act
>>109043686oh ok. did they ban deep fakes on the whole site then?
>>109043682Picrel takes a certain level of degeneracy. Like Plebbitor/b normalfag invasion levels of degeneracy. Worse than other boards, of perhaps trolls spamming. Either way, I guess he just didn't feel like moderating it, because that's a good way to get bad PR.
>>109043694it's only banned if it gets reported>>109043702>BBCWhy are white people like this?
>>109043712kikes are not white
It doesn't wanna match the lighting, but otherwise worked well.
Seems like the best thing you can do (aside from replacing the background of your input image with white) is to rescale the input image to match the aspect ratio of the video by padding with white.
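
The padding step described above can be worked out with plain arithmetic before touching any image library. This is a rough sketch, not anyone's actual workflow: `pad_to_aspect` is a made-up helper name, and centering the image on the canvas is an assumption (the post only says to pad with white).

```python
def pad_to_aspect(src_w, src_h, target_w, target_h):
    """Return (canvas_w, canvas_h, offset_x, offset_y) for centering a
    src_w x src_h image on the smallest canvas that has the same aspect
    ratio as target_w x target_h. Hypothetical helper, not a ComfyUI node."""
    if min(src_w, src_h, target_w, target_h) <= 0:
        raise ValueError("all dimensions must be positive")
    # Compare aspect ratios by cross-multiplying so no float rounding is involved.
    if src_w * target_h >= src_h * target_w:
        # Source is at least as wide as the target: keep width, grow height.
        canvas_w = src_w
        canvas_h = -(-src_w * target_h // target_w)  # ceiling division
    else:
        # Source is taller than the target: keep height, grow width.
        canvas_h = src_h
        canvas_w = -(-src_h * target_w // target_h)  # ceiling division
    offset_x = (canvas_w - src_w) // 2
    offset_y = (canvas_h - src_h) // 2
    return canvas_w, canvas_h, offset_x, offset_y


# With Pillow installed, the result would be used roughly like this:
#   canvas = Image.new("RGB", (canvas_w, canvas_h), "white")
#   canvas.paste(img, (offset_x, offset_y))
print(pad_to_aspect(1869, 2300, 576, 1056))
```

The padded canvas can then be downscaled to the render size without stretching the subject.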
Can a butchered LoRA cause bad hands, or is it all up to the checkpoint?
>>109043750i use klein to refit an image into a new aspect ratio without stretching it
>>109043769
>i use klein to refit an image into a new aspect ratio without stretching it
nta but this is sounding like a lot of work for 5 seconds of footage that might get a single (You) and be excluded from the collage for being video.
>>109041297>>109041690This is uncomfortable to look at but not in a bad way.
>>109043750Or gen new background mode to get the matching lighting and shadows.
how do I remove all the info from an image before uploading it to civitai?
Don't want people judging me
>>109043799great things happen when you stop caring about superficial things
>>109043813
You could just remove the info manually after uploading. If you're still paranoid, just save the image again with an image editor
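
If re-saving by hand gets tedious, the same thing can be scripted. This is a minimal sketch that assumes the generation info sits in PNG text chunks, which is where these UIs usually put it. `strip_png_metadata` is a made-up name, and it only handles PNG (not JPEG or WebP).

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
# Ancillary chunk types that can carry text or EXIF metadata.
METADATA_CHUNKS = {b"tEXt", b"zTXt", b"iTXt", b"eXIf"}


def strip_png_metadata(data: bytes) -> bytes:
    """Return the PNG bytes with text/EXIF chunks dropped.
    Pixel data and all other chunks are copied through untouched."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    out = [PNG_SIGNATURE]
    pos = len(PNG_SIGNATURE)
    while pos + 12 <= len(data):
        # Chunk layout: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
        (length,) = struct.unpack(">I", data[pos:pos + 4])
        chunk_type = data[pos + 4:pos + 8]
        end = pos + 12 + length
        if end > len(data):
            raise ValueError("truncated chunk")
        if chunk_type not in METADATA_CHUNKS:
            out.append(data[pos:end])
        pos = end
    return b"".join(out)


# Usage (file names are placeholders):
#   with open("gen.png", "rb") as f:
#       clean = strip_png_metadata(f.read())
#   with open("gen_clean.png", "wb") as f:
#       f.write(clean)
```

Worth opening the cleaned file in a metadata viewer once to confirm nothing is left before uploading.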
I hate how LTX looks more coherent at higher fps but animations are stiffer. Probably because it was trained on 60fps videos of mostly video games and vtuber slop.
>>109043462how smooth is it working with porn
>>109043813literally just screenshot your img
>>109043839the motion is far more dynamic at lower resolutions. you may be able to upscale it afterwards or use it as a control video for your higher resolution generations
>>109043816why do you keep genning this child
Was using Anima in Forge... How do I do this in ComfyUI?
>>109043864
>>1090438644channers still advocate for behaviors adjacent to the pedophile socialite class, and that will not change, ever.
>>109043893Search the manager or use SDUltimateUpscale.
>post a bikini pic on civitai>immediately moved to redlol
The best part of this is that it made the background characters asian too
>>109043975why even post there
>>109043983asian psycho
>>109044023>asian psychothat'd be honestly kinobased on bubble era japant. gook
>>109041888im currently testing my latest optimal training setup lora for acestep.it uses qwen llm to generate audio codes.if you got to their repo and check list of genres trained into it = hilarious.as im testing this optimized lora im getting ai slop.i throw in almost two paragraphs of insults ijnto the prompt. and some explanations this style of music != what insectoid llm slop thiks it is.songs generated currently ar coming out as intended, in the style of lora i trained it in.throw insults at it.
>>109042516you do not need control net and region conditioning.that is good.
Been playing with scail! Any of you have issues with audio and lipsyncing? any way to fix it??
>>109044086nice consistency
>>109044086why doesn't the cigarette appear? did you include it in the prompt?
are dynamic prompts always chosen randomly or is there a way to do them in order so i could have quite a long list of {big boob|small boob|medium boob} and gen and it will cycle them in order rather than random
>>109044105
Wildcards are chosen randomly.
You should be able to slop your custom node to have them cycled in order.
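
A rough sketch of what cycling in order could look like outside any particular UI. The `{a|b|c}` syntax is taken from the question. `expand_in_order` is a made-up name, nested groups are not handled, and wiring it into a custom node is left out because that part depends on the frontend.

```python
import itertools
import re

# Matches one non-nested {option|option|...} group.
GROUP = re.compile(r"\{([^{}]*)\}")


def expand_in_order(template):
    """Yield every combination of the {a|b|c} groups in a fixed order,
    with the rightmost group changing fastest."""
    # re.split with a capturing group alternates literal text and group bodies.
    parts = GROUP.split(template)
    literals = parts[0::2]
    options = [body.split("|") for body in parts[1::2]]
    for combo in itertools.product(*options):
        pieces = [literals[0]]
        for choice, literal in zip(combo, literals[1:]):
            pieces.append(choice)
            pieces.append(literal)
        yield "".join(pieces)


for prompt in expand_in_order("1girl, {red|blue} dress, {day|night}"):
    print(prompt)
```

For a queue longer than the list, wrapping the generator in `itertools.cycle` would start over from the first combination instead of stopping.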
Triple replacement test
>>109044153
>>109042947nice
>>109044153> replacement
New pixel-space 'toss:
https://huggingface.co/spaces/Photoroom/PRX-Pixel
(It's shit. Like even some of the example images in the demo have broken anatomy.)
Tested with like three images. It sucks at text. (Maybe more steps help, I dunno, just ran default 28.)
At CFG 1 the images look ZIT-like, but with much worse prompt adherence. Higher CFG gives better prompt adherence but it looks slopped. Doesn't seem to have much character knowledge, didn't test styles or celebrities, but I wouldn't hope for much.
One positive thing is that unlike many other local pixel space models it doesn't suffer from patch artifacts. Speaks about the state of things with this slopped research preview garbage when the bare fucking minimum feels noteworthy to mention.
It's summer and I'm just sitting at home prompting.
>>109044220>7Btubby girl
Am I supposed to be daisy chaining extend nodes? Because I just set 81 on the initial node and +129 on the singular extend node and it seems to just werk
>>109044262lost
>>109044153
only three? Those are rookie numbers
>>109044262
https://github.com/Brobert-in-aus/scail-auto-extend
use this node + wf
>>109044327
>tfw 2 years ago i was waiting for 5mins for an 480x720 gen to be upscaled now im waiting 5mins for a 20second coherent full on video of my waifu getting plowed by a green orc cocklife really moves fast
>>109044233well yes, the more convenient campaign seasons for murder, rape and plunder are spring/fall. siege in summer or winter means "bold" mis-planning or something you plebeian should not complain about if you want to live.being at home in summer AND not getting sieged yourself is good, fren
lolesoteric art style LoRA just downloaded from CivitAI
>>109042516
There is a lot of detail in the model. Also what kind of model puts in a frenulum piercing unprompted? It seems to have lots of unwanted sampling variety.
Can I link civitai.red LoRAs I trained on my resume?
>>109044398There are adult board links in OP. Maybe you can say you put some stuff in those.
>>109044372ideogram hallucinates random details because the gptslop it was trained on also hallucinated random details
>>109044398sure.
>>109044327Oh cool, thank you
>>109044153Would you care to share your workflow?
>>109044054
nsfw
https://files.catbox.moe/021w4k.mp4
so what's the problem with the audio? I exported the video as 24fps. Wf has 24fps set in the first node.
>>109044620KEK
>>109044620Lord Farquaad with tits.
>>109044694I fucking love Scrubs
>>109044620
>>109044694My wife is telling me this woman is called Fukuda Aimi. Is this right?
>>109044895
can scail do ahegao? wan animate struggled with eye movement and tongue, asking for a friend
>>109044919Old wanimate comparison. I think wanimate did better with the likeness, initially anyway. Degraded heavily over the course of the video
Someone post the fucking scail workflow with background removal. It's a pain in the ass to setup.
Zbase learns so fast, nsfw test: https://files.catbox.moe/l7aak0.jpg
>>109044961what are your training settings?
>>109044945for you
>>109044948
There's probably a more elegant way to do it than this, but here you go
https://files.catbox.moe/xntg6t.png
>>109044931sometimes. it's not 100%
>>109045012Thank you.
>>109044931it has a pretty decent success rate but like the other anon said I too don't consider it "reliable" yetit's not quite solid with regards to EITHER all humanoid reference images or the reference video, same as other facial expressions reallyand i mean only those where i'd expect (or can test) most segmentation models yolo whatever to identify the facial features otherwise
>>109045012Have you tried this node instead of flux klein?
>>109045037that's... probably way quicker, had no idea it existed.
>>109045047it exists in dozens of similar sounding names for a while now, you probably want the most popular "rmbg"
>>109045038impressive
>>109044931>>109045016>>109045031Tongue works but I never see the eyes crossed correctly. Maybe a wan2.1 ahegao lora would help?
>>109044398Coomers seem to know more about AI technology than the average normie so employers are retarded if they're turning away applicants for NSFW loras
When did comfy go from a scratch disk raping monster when generating videos to a smooth memory managing king?
>>109042956>HITGS on a /g/ threadNice.
>>109045142not perfect but looks like it has some problem with 3D model doing crossed eye
>>109045349It's an improvement. The original woman is very punchable.
>>109045349
have you tried it with this lora?
https://civarchive.com/models/1390545?modelVersionId=1571626
Is there a reason SCAIL was built on Wan2.1 instead of 2.2?
>>109042610
nice, thanks for sharing this.
works for me except i had to disable the last comparison image concatenate, somehow out of all things that is what OOMs here - sticking bitmap images together.
>>109045362why dont u try it
>>109045390im currently training a lora and dont have the gpu capacity to try it right now
>>109045375i get seven keyframes with 2005 computeryou wanna to reshoot at ntsc anon?
>>109045349What resolution are your inputs? I’m getting pretty bad facial likeness consistency with a 9:16 ~1200px ref (don’t have it in front of me atm)
If you have to ask how much VRAM costs, you can't afford it.
>>109045405reference image 1869x2300, rendering video at 576x1056
>>109045387since it only need to copy the motion of subject, it doesn't need the improved motion from wan2.2 ?
>>109045387probably started before wan2.2 was out, maybe you get the opportunity to ask them directly on social media or w/e. i think they do write english.
>>109045387
chatgpt said this:
>Wan 2.1 14B is one dense transformer. SCAIL-2 modifies its conditioning sequence, masking channels, and RoPE behavior. Applying those changes to one dense model is relatively straightforward.
>Wan 2.2 A14B is a two-expert MoE model: approximately 27B total parameters, with separate high-noise and low-noise 14B experts. SCAIL training would need to modify and train both experts consistently, greatly increasing storage, training memory, complexity, and cost.
>Driving video already supplies motion. Wan 2.2’s improved prompt-generated motion and aesthetics provide less benefit when SCAIL directly transfers motion from another video.
>Wan 2.1 has a mature modification ecosystem. Its dense architecture is easier to fine-tune, convert, quantize, integrate into ComfyUI, and extend with LoRAs.
>SCAIL-2 actually uses Wan2.2 Animate as one of its data-generation teachers, so the authors were aware of it. They seemingly chose to distill its useful behavior into the simpler Wan 2.1 backbone.
>the effect at 0:03
>>109045461kino alert
>>109045424Thanks, are you using basically the same settings from the kj PR workflow (other than frames etc)?
Why is LTX so... shit?
>>109045502Nihon-Viet Cong propaganda.
>>109045461honestly impressive
>>109045510It's good, it's just cloud video models have gone nuts the last few months and LTX looks garbage in comparison
>>109045510because it's local
>>109045537did the copers wake up?
>>109045535yea scail really is good at what it does. and wan was certainly already an impressive model anyhow. even with scail, if you have very long hair or chains with jewelry or whatever they may do some physics stuff where the reference had none of it
is there a way to fix this initial discoloration?
>>109045510it made compromises but it's a good improvement over its predecessors and some capabilities are quite good.
i DID use wan more myself, with its better prompt adherence and more control over what you can prompt (spatially, temporally)
>>109045510it's a talking head model: 1 person talking. sulphur and eros (i2v version of sulphur), the nsfw models, can do basic nsfw but you'll be re-rolling a shit tonne
I was having an argument with someone on here the other day, he insisted ltx was better than wan22. He actually convinced me for a while, I went on a multi-day tear, tweaking settings and trying out workflows, doing hundreds and hundreds of gens. My personal conclusion is that wan22 is just better but it has no sound and the clips are frustratingly short.
>>109045503https://files.catbox.moe/wp75sw.mp4
same KJ workflow, just some minor changes
>>109045349>>109045142>>109045133>>109045574>>109045596give it to me straight
can I run this shit with 6GB of VRAM?
>>109045601yes just limit the highest dimension to 480
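A minimal sketch of what "limit the highest dimension to 480" could look like when picking a render size. It scales the longer side down to 480 while keeping the aspect ratio, then snaps both sides to a multiple of 16. The multiple of 16 is an assumption (many video models want sizes divisible by some block size), not something stated in the thread, and `fit_longest_side` is a made-up helper name.

```python
def fit_longest_side(width, height, max_side=480, multiple=16):
    """Return (width, height) scaled so the longer side is at most max_side.

    Never upscales. Both sides are rounded to the nearest multiple of
    `multiple` (assumed block size), with a floor of one block.
    """
    scale = min(1.0, max_side / max(width, height))

    def snap(value):
        # Round the scaled side to the nearest multiple, never below one block.
        return max(multiple, round(value * scale / multiple) * multiple)

    return snap(width), snap(height)


# The 576x1056 render size mentioned earlier in the thread, shrunk for low VRAM.
print(fit_longest_side(576, 1056))
```

Rounding to the nearest multiple changes the aspect ratio slightly, so check the result against whatever your workflow actually accepts.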
>>109045568honestly, as a filthy human artist, this feels like the only place worth visiting about ai art stuff
>>109045595kino is a skill that takes time to develop
>>109045636kino issue
>>109045596how do I get the workflow from this?
It only opens a loader
Do I have to update Comfy to the latest version?
Why is no one testing out Bermini? It seems like it's better than LTX and even lets you use image references finally
>>109045595LTX 2.3 is a seriously impressive model for its size and speed. T2V, I2V, sound, up to 50fps, 30 sec gens.
I'll agree Wan is better aesthetically and for NSFW but I can't go back to no sound 5 second slow mo clips
>>109045636sure, but you see, everyone who talks like this never posts their workflows. The guy I argued with the other day refused to post his workflow. Meanwhile, I don't need to post workflows, the default wan22 is enough to get you better results than ltx.
>>109045601
6GB VRAM
>>109045663everyone is playing with scail making blurry as fuck vids. there were some anons playing with it a few days ago
>>109045665that's why I'm looking for a good SVI workflow. Every one I've tried degrades the quality of wan22. I think you can use wan to get a good 20 second clip with no sound, extract frames (like lots of them), and then use them as guides for ltx so it doesn't go batshit with the horrible, deformed anatomy and strange motion.
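If you want to try the "extract lots of frames and use them as guides" idea, one simple starting point is picking evenly spaced frame indices from the source clip. The sketch below only computes which frames to pull. How you extract them and feed them into an LTX workflow is left open, and the 16 fps figure in the usage line is an assumption about the source clip. `guide_frame_indices` is a made-up helper name.

```python
def guide_frame_indices(total_frames, count):
    """Return `count` evenly spaced frame indices, including first and last.

    Uses integer arithmetic so the result is deterministic. If count is
    larger than the clip, every frame index is returned instead.
    """
    if total_frames <= 0 or count <= 0:
        return []
    count = min(count, total_frames)
    if count == 1:
        return [0]
    last = total_frames - 1
    return [(i * last) // (count - 1) for i in range(count)]


# A 20 second clip at an assumed 16 fps is 320 frames; pick 9 guide frames.
print(guide_frame_indices(20 * 16, 9))
```

More guide frames pin the motion down harder but leave the second model less room, so the count is something to tune by eye.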
>>109045658drag and drop doesn't work? file open doesn't work?
>>109045697no
I'm on an old version of Comfy
>>109045665Can LTX do nsfw at all?
>>109045702Out of the box? No. You'll need loras or a finetune
>>109045700oh u need nightly comfyui to even try SCAIL2
>>109045708ok
>>109045667didn't i give you my prompt and seed?
>>109045721I wasn't arguing with you. I do my best to avoid avatar fags. I'd filter you outright if I had a way.Are you happy with the way her face completely warps in the first 2 seconds of your video and in other parts as well?is that why you only post gifs and tiny resolutions, to hide all of that?Are you happy with your vid looking like a ponyxl gen come to life?
10 steps seems to give a nice improvement over the default 6
>>109045769And here's 6 steps
>>109045769>>109045775Just noticed the missing sparkle effect on 10 steps but the face and jacket look a lot cleaner to me
>>109045759i gave you everything you needed to make some proper kinos, but you aren't satisfied with it for some reason
https://files.catbox.moe/z1ype2.mp4
Something about the way the talking heads move. So off-putting. I've found LTX way better at generic stock footage desu
why no more Anima talk? :(
Anima desuanime website desu
>>109045825no new one was announced and 80% of the anima discussion was shills anyways
>>109045825>>109045842>>109045851
>>109045825i got you covered bro in 1934 radio corporation america had this thing called rca maybe commies got it first with phono but pretty not much worth mentioning ever since
>>109045801https://files.catbox.moe/cj3jpb.mp4
Post reminded me I completely forgot I trained this
>>109045859lol.
Why do all AI monsters come out like that though? The white can with the green M
>>109045825Not much to talk about really. It has its place as a very decent concept creator, but it's not good enough for creating a final image. If I could decide I would cull half of the booru creators and replace them with traditional artists.
https://files.catbox.moe/8csp6h.mp4
Audio for /g/ when?
>>109045859>>109045909wansisters in shambles
>>109045909There was a brief general on /wsg/ when LTX was first released but people got bored.
what difference does 64gb of VRAM get me vs 128gb?
>>109045940double the difference
ideo4
>>109045993Bounding boxes are the future
Natural language and tags in the trash
how to merge lora with anima?
>>109046101Pen and paper.
>>109045909
>[Common sense feature that almost every website has] for [board in unmaintained shithole website that nobody in charge gives a fuck about] when?
Never.
>>109045993>>109046050can I run it with 5 GB of VRAM?
the more bboxes I use, the better the image quality becomes, even at turbo settings
i use between 10 and 25 bboxes
Can it run on GeForce 6200 AGP (128MB version)?
>>109046144yes saar
Do you guys not use llm assisted tools for image gen?
>>109046177Does asking my girl for depictions of herself count?
>>109046177Yup.
>>109046286>that chink at the bottomkek
>>109045927
>but people got bored
why?
>>109046301
>but people got bored
Simply weren't enough people in the thread. The only reason to go to the wsg /ldg/ was to hear audio on videos. There was no generality to the general so once the model stopped being the new thing people forgot the thread ever existed.
>>109046177Local language models are fantastic for image captioning and subsequent iteration of similar concepts
I have a gigantic library of images accumulated over the years i can experiment with.
>>109046286>needing llm for 1girl slopPlease kys.
>>109046286catbox?
>year almost over
>best local goon model still illustrious
grim
>>109046316maybe i should generate some more war kinos and make a thread on there since those have excellent audio
goodbye
>>109046292That's by design.
>>109046339You know where that really comes in handy? Describing women's clothes. It's an encyclopedia of clothing and I'm not.
>>109045611glad to hear it. we're probably standing on the shoulders(?) of giant anime girls here. or however that goes.
ai probably combines well with whatever you already do.
oh and if other places are worse it's probably mostly 'cause censored SaaS fucking sucks for art with new censorship every other week or something
>>109046431Maybe use your own eyes? Oh wait, you can't because you are artistically frigid.
what's the current recommended model for clothes change, or eye color change, or similar minor edits in an image?
I think I used to run one of the qwens for it or something. but last year the results still weren't that great
>>109046491flux 2 klein 9b
>>109046505whats the current recommended model for sex?
>>109046491Klein edits are better at this kind of thing desu. Qwen is too slopped
>>109046361catbox?
>>109045993this is nonsense thobeit
>>109046188I guess
>>109046286Very nice
>>109046331You're not thinking far enough
>>109046339Also not thinking far enough
>>109046443Ah, so there's some arbitrary limit on using AI when using AI, huh? Inderdasting...
>>109046343
>best local goon model still illustrious
anima
>>109046431horseface
>>109046745Why are you seething at another anon expanding his skillset?
Vramlet?
>>109046435i mean some are sloppy and i personally dont use genai to create visuals, but sometimes just the sheer variety of things people do here due to this vastly lowered execution cost is a good inspiration
and on top of it, the generally loose vibe, doing genuinely random shit instead of something you see from so called 'pro-ai communities', feels decent to me
tl;dr i do like lurking here
>>109046799I think the biggest problem with the space right now is software. web apps aren't really accessible for artists
>>109046737anima is still in "meet potential model" phase, illustrious just works
Anima thoroughly beats the shit out of any Illustrious finetune
>>109046857i always say this but
because of the medium itself (text), coding got the most natural integration
but with drawing and artistic matters, the human-computer interaction workflow is one of the most important parts i think, and since what they are currently aiming for is the end product, it just doesnt integrate well into the existing artistic workflow
i think photoshop's rotate tool is a decent example of 'making something that is compatible with existing method'
>>109046900once they capture enough hours of people using computers, not just the final product, we'll see some good integration
My rebirth is imminent
If people are switching to Ideogram, it's because Anima flopped, right?
>>109047018>>109046887yes this wonderful gen is better than any of the detailed finetunes lmao
>>109047038Still being retarded and miserable? You look at a purely LLM guided output and that's the only thing you see?
>>109047018Basado
>>109047189I always use you as a litmus test for how much models have improved due to how you have failed at basic genning for I think going on 4 years now?
Really warms my heart image gen has gone a long way
>>109046517>>109046505thanks boys I'll try that out soon, appreciate it
>>109047313>>109047313>>109047313
>>109047018
>All that excessive stuttering
Anon, tell your ai-waifu to chill the fuck out. Not even animu girls talk this retarded. Moderation is key to believability.