The Secret Sauce For Kinos EditionDiscussion and Development of Local Image, Video, and Music ModelsPrevious: >>108966726https://rentry.org/ldg-lazy-getting-started-guide>UIComfyUI: https://github.com/comfyanonymous/ComfyUISwarmUI: https://github.com/mcmonkeyprojects/SwarmUIre/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneoSD.Next: https://github.com/vladmandic/sdnextWan2GP: https://github.com/deepbeepmeep/Wan2GP>Checkpoints, LoRAs, & Upscalershttps://civitai.comhttps://civitaiarchive.com/https://openmodeldb.info>Tuninghttps://github.com/spacepxl/demystifying-sd-finetuninghttps://github.com/ostris/ai-toolkithttps://github.com/Nerogar/OneTrainerhttps://github.com/tdrussell/diffusion-pipehttps://github.com/kohya-ss/sd-scriptshttps://github.com/kohya-ss/musubi-tuner>Zhttps://huggingface.co/Tongyi-MAI/Z-Image>Animahttps://huggingface.co/circlestone-labs/Animahttps://tagexplorer.github.io/https://animadex.net>Qwenhttps://huggingface.co/collections/Qwen/qwen-image>Kleinhttps://huggingface.co/collections/black-forest-labs/flux2>Wanhttps://github.com/Wan-Video/Wan2.2>LTX-2.3https://huggingface.co/collections/Lightricks/ltx-23>Chromahttps://huggingface.co/lodestones/Chroma1-Basehttps://rentry.org/mvu52t46>MiscLocal Model Meta: https://rentry.org/localmodelsmetaShare Metadata: https://catbox.moe | https://litterbox.catbox.moe/Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusionArchive: https://rentry.org/sdg-linkCollage: https://rentry.org/ldgcollage>Neighbors>>>/aco/csdg>>>/b/degen>>>/b/realistic+parody>>>/gif/vdg>>>/d/ddg>>>/e/edg>>>/h/hdg>>>/trash/slop>>>/vt/vtai>>>/u/udg>Local Text>>>/g/lmg>Maintain Thread Qualityhttps://rentry.org/debohttps://rentry.org/animanon
Can Krea2 do N64 kino OOTB?
oops fox posted after thread over
how long before we have dream diffusion?Turn my nightmares into a reality
Blessed thread of frenship
ai image detectors are quick with updating their modelsholy smokes
klein edit 9b is pretty neat, can do all kinds of stuff "use colored pencils like a sketch", change color, etc.qwen edit is good but I seem to get better results with klein edit.
>>108972791>klein edit 9bwhat's the difference between klein 9b and klein edit 9b? Also which one can be used as kontext?
>>108972801edit workflow for edits, im using klein edit 9b (distilled), very fast even at 8 steps (4 default)
>>108972305Are you thinking about releasing your kinoapp? I would love to try it out.
nb4 niggas who dont remember the safeycucking of ideograms last model complaining about the safetycucking of their new model
>>108972816also, klein edit seems to be pretty good at copying font styles:
>>108972763honestly the best nipples i've seen in a base model
SD3 chads eating good tonight
>>108972841better quality image as source.
how have they not fired the greasy snake already?https://www.reddit.com/r/comfyui/comments/1tvttzv/ideogram_40_just_open_sourced/
Comrades! It's Pride Month. Show us your pride !
>>108972914
>>108972829Maybe when I get the spaghetti under control. This could all be done by a single wrapper. Also there are features not yet implemented, like figuring out a design for a better manual palette node with a color picker. I really don't like how unwieldy it all is right now.You can do all of what I'm doing with some basic math right now. The only 'hard' part is picking a gen resolution close to your target aspect ratio and target image size (in total pixels) which divides cleanly by 8 and the target pixel size. You could figure out how to do that yourself and probably come up with something good enough, or ChatGPT could come up with an algorithm in five seconds that does it. For quantizing to black and white an easy trick with the default nodes is to just composite the image onto a large black and white image, quantize to 2 colors, then crop it back to the original dimensions. Downscaling and upscaling can both be done through the default nodes trivially.Also remember it's very easy to have any LLM make you a comfyUI node from scratch to do whatever you want if you give it this link: https://docs.comfy.org/llms.txtAll my time presently is absorbed by trying to make this app for tag correction on images for LoRA training, I'm just working on some UI elements for that. Since someone a few threads ago said the reason I don't like LoRA training is because of sour grapes or whatever. I'm getting a bit sidetracked on the way to making my LoRA...
>>108972870shalom (the people who decided to change the protagonist)
>84x85
>>108972940>>108972951Based and pixel pilled
Wonder how well it does as a second pass? Use other model like ZIT to get a rough composition, decode, re-encode latent, continue the steps from around 20% to completion so that the safety text doesn't get a chance to spawn in.Because from the few booba gens I've tried if the text goes away within the first 10-20% steps it doesn't come back.
>>108972781It probably just has the same features that most generative models have, no need to update, specially since its diffusion which is really easy to spot (as opposed to GANs or anything specifically trying to avoid detection), its also using flux VAE, which isn't new.>>108972914
>>108972946okay, here is the most accurate cover.
Maybe this will be the one. Combined a few gens to make it.
>>108973063removed a 'detail' on the face which felt like meaningless noise. Better I think.(the text extending out of the page is a stylistic choice I made, not a mistake. Maybe an artistic mistake.)
>>108972973Tried it at 0.9 denoise and it unsurprisingly distorts the vagene into a tumor but could be worthwhile for partially clothed gens.https://litter.catbox.moe/x7u5vprbixpzljfo.png
Reply blocked by safety filter
>>108973049
Ouch 2 minutes for Turbo on my 3060, could have been worse but not a great start.Anyway does anyone know why there are two checkpoints? What is "unconditional"?Are different parts of the cfg equation calculated by different models here? Is it what it is referring, why?I would also ask for what mu and std (standard deviation?) stand for but I doubt anyone can make sense of that comfy spaghetti.
>>108972325>>108972707>>108972726>mfw reading thisfiltered again...
so what is this new model people are having censorship issues with?
>>108973246https://huggingface.co/Comfy-Org/Ideogram-4
i'm 1girlingand i'm happy
>>108973285why the fuck would they train censorship into it>it GENERATES A CENSORED IMAGE instead of censoring it
>>108972768catbox?
>>108973311>>108972840
>>108973311So that you can feel S̵̨̧̛̼̫͖͇̳̝͕̣̜̖̱̻̯̤̰̝̭͕͖̗͕̮̟̰͙̤̟͙̪͉̻̯͕̘̬͖̪̰͙͚̈́̾̈͗͗͐͛͗̈̋̐̀̂̍̈́͋̽́̄͂͗̋̔́̎̀͐͒̒̉̋͗̊̆͘̚͜͠͝͠ͅÀ̵̢̢̢̡̧̡̛̛̪̹͇̬͙͈̻̻͎̗̠̱̰̬̜̝̙͈̟̪̰͕̤͕̦͇̖͈̫̞͈̻͙̣̳̻̥͓̰̠͍͚͕͖̦͍̄͗͑̑̌̑̿́̉͆͒̍̿̾̀́̀̿̎́͑̀͂̽͗̂͗̓̓̃͑̋̌͗̎͂̇̑͌̽͆̿́͛́̃̐̽̓̋̈́́͊̈̐̾̏̍̌͗͋̆̒̿͊̍͐̉̊̉̈̀͋̄̓͜͜͜͝͝͝͝͠͠͝ͅͅF̵̢̨̧̧̡̡̧̢̢̧̛̛̤̝̺̻͔̝̣̼͈̣̪̭̜͚͕͙̟̫̝̮̹̥̫̙͙͉̺͚̦͍͍͍̰͕̪͕͎̩̝͙̘̠͚̙̞̠̻̬͖̱̯͖̟̙͇̪̦̬͍͍͙̣͕͑̽̈́̂̂͗͑̋͆̈̊́̿̂̐̏͛͒̌̐̽͗͊͛̏͊͋͐̽̑͛̂͌͐̓͐̾́̽̋̐͑̎͛̈̽̓̔̌̿͛̀̃́̀̿͌̋͌̆̄̽̂͌̇͂͂̓̾̄͆͛́̒̓́͊͘̚̕̕͘͜͠͝͠͝͠͝͠͝͝ͅȨ̸̡̙̫̝̘͔̟̝̳̙͇͚̭̪̦͚̬̤̼̫̖̗͇̈́̀̃̆̐̈́͑̇̃̓̒̈́̿̄̈́́̌̓̒̐̈́̇̚͠ chud, be grateful for once.
This might sound unbelievably retarded but how do you make people visually breath hard or take deep breaths then exhaling repeatedly with Wan or LTX? I've tried describing the act of repeated inhaling and exhaling but it doesn't seem to work. Genning hyperventilating seems hard, or i'm retarded.
Remember when localroaches wouldn't shut up about their local uncensored models?
Cry more SaaSfag
censorship is retarded.we have klein edit + undress loras, even ltx 2.3 lewd finetunes if you want lewds.
>>108973372we can just use another model anonand you are still out of credits after 2 gens kek
Judging by the facts that its sensitivity is prompt dependent (probably one style of prompting was more over-represented than the other during the finetuning) and neither comfy nor diffusers code contain anything funny about censorship, we can conclude that this censorship was probably just post-training finetuning.They thought the model to draw the grey censorship image when asked for no-no prompts. Clearly not enough regularization so it's fried as fuck when it comes to generating that even for most benign prompts. I am not sure it being less sensitive for json is a case of it being trained on json so that it is able to distinguish between "good" and "bad" prompts better when given json prompts or if it is a case of being trained on NL prompts so that it can pick up thought crimes easier with NL and json slips past. (I am inclined believe the latter as some anon was able to gen shitty nipples with json last thread.)So anyway this might be salvageable with finetuning and I expect NSFW loras to work for the specific type of shit they are trained for, although they might be less versatile and less reliable to use than non-safety cucked models. It has a great vae and text encoder, so if it responds well to training it might be still worthwhile to thinker with it.
>>108973434
>>108973521Sounds like the exact same situation as their last model. Check out ponyfags Auraflow model to see how that turned out kek. No harm no foul tho I don't really give a shit about Ideogram.
>>108973232I'm going to make a single standalone node that does the basic pixel calculations which I can upload to a pastebin or something, then give you a workflow showing the idea. Going to assume you have the popular 'ComfyUI-custom-scripts' thing from pythongosssss, because you'll need a node that does custom mathAlmost done
>mfw Resource news06/03/2026>Ideogram 4.0: Open model at the forefront of designhttps://ideogram.ai/blog/ideogram-4.0>JoyAI-Echo: Pushing the Frontier of Long Audio-Visual Generationhttps://echo-team-joy-future-academy-jd.github.io/Echo-LongVideo-Page>Follow-Your-Preference++: Rethinking Preference Alignment for Image Inpaintinghttps://github.com/shenytzzz/Follow-Your-Preference>LongLive-RAG: A General Retrieval-Augmented Framework for Long Video Generationhttps://github.com/qixinhu11/LongLive-RAG>MAI-Image-2.5https://microsoft.ai/models/mai-image-2-5>AAD-1: Asymmetric Adversarial Distillation for One-Step Autoregressive Video Generationhttps://aad-1.github.io>Inference-Time Scaling for Joint Audio-Video Generationhttps://jung-jaemin.github.io/ITS-AVGen-Proj>Video-Mirai: Autoregressive Video Diffusion Models Need Foresighthttps://y0uroy.github.io/Video-Mirai>Order within Chaos: Capturing Intrinsic Energy Anomalies for AI-Manipulated Image Forgery Localizationhttps://github.com/phoenixnir/FLAME>VISReg: Variance-Invariance-Sketching Regularization for JEPA traininghttps://haiyuwu.github.io/visreg>HumanNOVA: Photorealistic, Universal and Rapid 3D Human Avatar Modeling from a Single Imagehttps://HumanNOVA.github.io>Cosmos 3: Omnimodal World Models for Physical AIhttps://research.nvidia.com/labs/cosmos-lab/cosmos3>TGV-KV: Text-Grounded KV Eviction for Vision-Language Modelshttps://github.com/Danielement321/TGV-KV>JAVEDIT: Joint Audio-Visual Instruction-Guided Video Editing with Agentic Data Curationhttps://ryanchenyn.github.io/projects/JAVEdit>Any2Poster: Any-Source Poster Generation Across Modalities and Domainshttps://github.com/Any2Poster/Any2Poster>Martin Scorsese faces industry backlash over AI company partnershiphttps://www.independent.co.uk/bulletin/culture/martin-scorsese-ai-black-forest-labs-b2988639.html
>mfw Research news06/03/2026>Training-Free Multi-Concept LoRA Composition with Prompt-Aware Weightinghttps://arxiv.org/abs/2606.03792>Text-to-Image Models Need Less from Text Encoders Than You Thinkhttps://nsping13.github.io/contextless-TTI>Qwen-Image-Flash: Beyond Objective Designhttps://arxiv.org/abs/2606.03746>Bootstrap Your Generator: Unpaired Visual Editing with Flow Matchinghttps://research.nvidia.com/labs/par/byg>Initialization is Half the Battle: Generating Diverse Images from a Guidance Potential Posteriorhttps://arxiv.org/abs/2606.02453>Inverting the Generation Process of Denoising Diffusion Implicit Models: Empirical Evaluation and a Novel Methodhttps://arxiv.org/abs/2606.03111>Retrieve What's Missing: Coverage-Maximizing Retrieval for Consistent Long Video Generationhttps://arxiv.org/abs/2606.02479>Drifting Preference Optimization for One-Step Generative Modelshttps://arxiv.org/abs/2606.02521>Equilibrated Diffusion: Frequency-aware Textual Embedding for Equilibrated Image Customizationhttps://arxiv.org/abs/2606.02129>Geometry-Aware Implicit Memory for Video World Modelshttps://gim-world.github.io>GuidedBridge: Training-freely Improving Bridge Models with Prior Guidancehttps://arxiv.org/abs/2606.03119>MemoGen: Can Past Experience Improve Future Text-to-Image Generation?https://arxiv.org/abs/2606.03243>UniVerse: A Unified Modulation Framework for Segmentation-Free,Disentangled Multi-Concept Personalizationhttps://universe-personalization.github.io>Diffusing in the Right Space: A Systematic Study of Latent Diffusabilityhttps://arxiv.org/abs/2606.03578>$A^2$: Smaller Self-Supervised ViTs Localize Better than Larger Oneshttps://arxiv.org/abs/2606.03148>Attention, May I Have Your Decision? Localizing Generative Choices in Diffusion Modelshttps://arxiv.org/abs/2604.06052>You Don't Need All That Attention: Surgical Memorization Mitigation in Text-to-Image Diffusion Modelshttps://arxiv.org/abs/2603.00133
Can I tell an LLM to use booru tags and it will do it properly or hallucinate her own?
>>108973532He didn't train on the last Ideogram though.He trained on an extremely underbaked model someone else trained on Ideogram outputs, without pruning any of the censored images.Anyway I am just saying it's probably worth experimenting with. You are probably not going to get rid of almost 100% of the "image blocked by safety filter"s without finetuning on millions of images but 90-95% might be doable on a very small scale finetune, or hell even lora, without spending a fortune.
>>108973581Depends on LLM but aside from booru tags that have special meanings that aren't trivially obvious, most llms will cope fine with tag style instructions. They are smart enough to just guess what you might be meaning even though they have no proper booru training.
>>108973581I've done this. Try preprocessing a list of Booru tags with their wiki entries and build a prompt like "below is a list of tags with descriptions, only output the tags, follow their descriptions". You can't do ALL tags out there, that will blow up the context window even of the larger open source models out there, but you could always just do more than one pass and split them up.This worked halfway decent with Qwen 3.5, I haven't tried Gemma 4 but assume it would work even better as it's ridiculously easy to get it to uncensored with a system prompt.Also look up Xgrammar where you can literally force the model to only output certain formats. Not sure whether it's compatible yet with Gemma4 though
> >108973545> >108973550fuck off
>>108973600>>108973667those are really good. base anima or lora?
Thank you euler cfg pp, very cool
>>108973679photo lora test, took fraction of dataset and tried how it learns. pretty good so far
>>108973705How many images for test and in total? How many steps have you trained for?
>newest localkek slopware has built-in censorship>saas remains decades ahead due to their uncensored all-knowing base modelslocal is an absolute embarrassment
>>108973705really nice so far, keep it up
>>108973703It's a based sampler desu. It's certainly more difficult to use than others but once you get the hang of it the results are incredible.
>>10897371454 epochs with batch 8, around 600 images something like that>>108973733TY only gonna get better
>>108973703Just reconfigure your pp, Anon!
Is there a good target for number of images in a lora dataset for anima? I've been making some style loras with 100-200 images and usually get good results around 1500 steps. Is it best to use the largest dataset possible with good image diversity and style consistency?
>>108973727I can tell you're not putting your heart into it this time. Sad desu.
>>108973787quality>quantity always, been training for a long time and the two things that consistently improve the quality the most were purging the dataset of garbage and increasing the dim/rank
one for battlestation thread
I thought api was supposed to be censored? How come local models are spitting out safety images? That never happened with Grok
35?
>>108973870where i post my kinosovl from
>>108973787Above 100 is a good target for style loras, pretty much any model. I expect the difference between 100 and 1000 images to be not worth it for most loras.
>>108973232Done, here you gohttps://files.catbox.moe/rksrik.zipPut that node in your custom nodes folder. You should also have this installed:https://github.com/pythongosssss/ComfyUI-Custom-ScriptsJust for the math node. I have my own math node that I like better but most people have this one installed already, I tried to make something you could use right away
Is anon trying to pretend like it's all local models and not just Ideogram?
>>108973824>>108973892ty fellas I appreciate the help
>>108973877imagine spending 10k on a rig just to get cockblocked by a safety filter when you try to gen "1girl, standing"LOOOOOOOOOOOOOL
y did anon reply to himself ?
>>108973882
>>108974000uh oh melty
>moving forward all local models will be censoredHow does that make you feel, anon?
>his seething grows quieter and quieter
my Anima upscales all like shit compared to the base image for some fucking reasonwhen i used illustrious my upscales were always objectively better than the base image
>>108972752HUGE NEWS EVERYONE I COMPILED SDCPP AND IT DOESN"T CRASH(but gguf don't work, comes out all white, preview is all black, apparently it's a bug)
>>108974039doesn't matter. we have the weights for flux dev 2.
>>108974068latent upscale?
>>108974068User error also anima can do larger resolutions out of the box anyway so highresfixing is just a cope
>>108974068I can't get good upscales with it either
are people making any celeb/streamer lora's for illustrious anymore? or do i bite the bullet and use sd 1.5 and pony
>>108974119you have to be the change you want to see in the world. you do know how to train loras, right?
>>108974039KEKSTONE WILL UNCENSOR IT!!! JUST DONATE $500000 SO HE CAN TRAIN AT 256x256 ON FURRY DIAPERSCAT SLOPPA
>>108974068Try genning with a natively higher resolution first. I stopped using upscalers with Anima. I like the native look better.
im going back to dalle mini. thats where the sovl is at
>>108974104>>108974068well you might be in luck, nvidia dropped the pid checkpoints for qwen today, this couldhttps://huggingface.co/Comfy-Org/PixelDiTthe comfyui master doesnt have it yet but should drop soon, support is already in the nightly
I really thought anon would get more trolling out of the Ideogram release. I'm pretty disappointed in him he's barely trying.
>>108974175>this couldim retardedthis could enable upscaling for anima *
>>108974175there was a z-image section in https://huggingface.co/nvidia/PiD , is that not released for comfyui yet or is it pixeldit_1300m_1024px_bf16.safetensors ?
>>108974257zit was released initially already, what they dropped yesterday are checkpoints for SDXL and qwen vae (so what anima uses) as well, as well as a fixed flux2 onehttps://huggingface.co/nvidia/PiD/tree/main/checkpoints/PiD_res2kto4k_sr4x_official_qwenimage_distill_4stephttps://huggingface.co/nvidia/PiD/tree/main/checkpoints/PiD_res2kto4k_sr4x_official_sdxl_distill_4stepsupport is only in comfy nightly so far though
>>108974141I saw an old git repository for local gen. the LoRA_Easy_Training_Scripts linked in some posts. but setting anything up on nvidia 50 series is a bitch, but it's been a minute since i've tried again.
>>108974182>Ideogramwhat release?
>>108974301>support is only in comfy nightly so far thoughThere is support in sdcpp :^) not sure which models it works with yet...https://github.com/leejet/stable-diffusion.cpp/pull/1585
>>108974301>support is only in comfy nightly so far thoughah. i guess I'll wait for a bit then, but it seems promising
>localkeks have more safety filters than SaaSLOOOOOOOOL
>>108974365>>108974182https://huggingface.co/Comfy-Org/Ideogram-4this???I don't understand, not joking lol
What sampler do you guys use? I keep switching between er_sde and dpmpp 2m sde.
>>108974374also, is this an upscale model? idk>pid_flux1_512_to_2048_4step_bf16.safetensorsThere's always so much new stuff.>>108974376>ah. i guess I'll wait for a bit then, but it seems promisingIt looks like it can be tested out using sd cpp.I'll have to try it out.
>>108974085I think that helped a lot actually, thanks.I was doing VAE Decode --> Encode inbetween each sampler. Apparently Anima doesn't like that and it makes faces slightly uglier. Illustrious was fine with it.
>>108974391euler_cfg_pp
>>108974397>also, is this an upscale model?it can be used for ZIT upscaling at leasthttps://github.com/Comfy-Org/ComfyUI/pull/14103
>>108974416It's extremely confusing.found the doc:https://github.com/leejet/stable-diffusion.cpp/blob/master/docs/pid.md
why is vae decode such a resource hog?
>>108974391res_2m with sgm_uniform and er_sde with bong_tangent
>>108974378More? Anon and math are enemy
>>108974431goes from tiny little ultra compressed latent into full pixel image that needs to fit inside your GPU's VRAM
>>108974427>>108974416ok.>In stable-diffusion.cpp, PiD currently runs as an image edit pipelineso sdcpp doesn't have proper pid support yet. if you want that, nightly.But, this sounds like a cool use of pid.
>>108974440Just say "because math"
>>108974346Cool. What model?
It's really weird to hear about ideogram. I did a lot of gens with that back then, but I ditched it when Flux dev 1 came out.
>>108974432>bong_tangentlolthat's new
>>108974431Related to this but I don't get how tiled decode works compared to regular decode. Why does my RAM usage shoot up with normal vae decode but tiled decode doesn't when they're both still storing everything in RAM while decoding?
>>108974470 It's img2img with flux1-dev, but t2i does pretty much the same thing (see strength param). ./sd-cli.exe \--diffusion-model ../models/flux1-dev-q8_0.gguf \--t5xxl ../models/t5xxl_fp16.safetensors \--clip_l ../models/clip_l.safetensors \--vae ../models/ae.sft \-H 1024 -W 1024 \-i ../Documents/1.png \--strength 0.78 \-o $ofile \-p "$P" \-s -1 \--sampling-method euler \--steps 20 \--guidance 3.5 \--cfg-scale 1.0 \--clip-on-cpu \-t 8
Has anybody switched from using comfyui manager (maybe still using it just for reference but not to start the download) in order to manage comfyui using pixi instead of pip?
Has nvidia pid been adapted for wan 2.2 instead of sdxl?
>https://civitai.red/models/2668799/cyberrealistic-animaslop time
>figured out how to use the extra model paths after 3 hours before chatgpt immediately corrected me>finally fire up comfyui>have to tinker around just to save images on another drive as something other than pngah, so it begins
>>108974499pretty sure economizing vram is the purpose of decoding tiles instead of the whole target image in one go.
nsfw https://files.catbox.moe/xwik7e.jpg
>>108974301from a few tests with a slop custom node, it can be pretty neat. seems like the more your image looks like an illustration the worse it gets though.also slightly changes colors
>>108974551>early access 7k buzz >"semi realistic" holy kek
>>108974551the ai stare
Instead of 30 FPS videos, is there a way to setup a workflow so you instead can generate frames in continuity with each other one at a time? What if 4 FPS is enough for me?
>>108974589Nice
>>108974635Yeah. Looks like he tuned the worst ai slop in
>>108974551>"i cant get proper realism into the model so ill call it "semi-realistic">unironically charge people money for itwho is even the audience for this
https://youtu.be/XogoQnkQUO8?si=Ah7Nb_pE49-CLGG2why has noone talked about this?
>>108974589i respect the idea but its so blurry and burnt
>>108974698filmed on nokia
>>108974630how do you make stereo images like this?
>>108974759should be a default node
>>108974785And a new record on you in their Redis database. A.I watch you!
>>108974785Do these niggas really think people should only use models for generating pictures of dogs?
>>108974630photograph example with lora
>>108974772cool, thanksJust had an idea that maybe one could use qwen image edit to maybe change the angle on one of these images with like 1 degree and then stitch them together. Could maybe make the 3d effect a little stronger.Something to experiment on some other day
Im out of the loop. But just looking at what I'm seeing here this looks like what happened when Aura flow go to released (anyone remember that) the service the model used to get its training data from had a cat that would appear the model blocked the prompt. It looks like they did the same here but instead of a cat it just generates text.
>>108974785do apikeks really? This would never happen with a local model, local is free and uncensored!
>>108974817it's not that unreasonable actually.Cats are also great source material for fun images.
>>108974785catbox?
https://research.nvidia.com/labs/par/byg/bygots, rise up
>>108972914gayest thing i could find
Censorship aside, how are the output from Ideogram that do get through? Are they good?
>>108974957this looks like the brightness adjust for a horror game
https://www.reddit.com/r/StableDiffusion/comments/1tw6c4y/sorry_not_sorry_ideogram_jailbroken_in_1_easy_step/seems people have figured out the censor slop?
>>108974968I am getting shit. Some anatomy errors too.I only tested a few images in Turbo and Default though.It seems you need to do some really fucking tedious json autism if you want good results.
Such a shame because it's textual capabilities are impressive.Mogs anything else local and most API models too.This is default. (20 steps)
>>108974488it's totally not lol. Here's what it looks like.always visualize your sigmas. then you watch in the preview, whatever step, you know what level of detail is being worked on. All public models act like this, big to small.
>>108975122unc here. Learn gimp, really, this is dumb lolbut ideogram brings back memories. I did the whole 360 degree Janduz (sp) cycle.
>>108975129Here's beta57. It's flatter.
>>108975122Even works at turbo.>>108975134I know gimp unc, I am just testing shit, practical value be damned.Usually when they say "our model has great text" they mean some stupid benchmeme but this model actually has great text.
>>108975122>>108975134herehttps://archive.org/details/new-360-symbolic-degreesThese are a good source of test material. It's very interesting to see that since that time attitudes and laws relating to nudity are both more strict, and at the same time homosexuality is very legal and common. I prefer the time when they were in the closet and nudity wasn't a high crime punishable by systemic rape.
>>108975156I'll respect it for its speed, at least.
i'm out of the loop. why is this ideogram release noteworthy? is it supposed to be the best local image model? or are people just interested because it's new
Been trying to get a decent icon of a hand pulling a photo out of a librarian's drawer... this is the best so far and it's still shitSkill issue, I know.Close to switching back to black-on-white just because the icons are so much easier to make
>>108975122there is something wrong with the inference, the images are all so bad, is shouldn't be because fp8 right?
>>108975190I am suspecting something might be off with the schizo workflow Comfy ships as well, but I am not sure.No, I don't think because it's FP8. FP8 only release sucks for training and making more quants, but it shouldn't tank the inference this much.
>Discussion and Development of Local Image, Video, and Music Models>Music modelsWas that always there?
>>108974785So, on top of wrestling with samples, complex multi pass workflows, plastic skin, melted hands and feet, catastrophic forgetting and failed LoRAs just to pull off what should be a stupidly simple concept on local models ,now we also have to fight this brand new flavor of censorship? Fantastic.
>>108975211We're dying out here, we need to bring in some fresh anons. For some reason Catjack has been spamming his gens all week long.
>>108975211:) good to see.
>>108975228Catjak just constantly spergs at random anons. Actual thread lolcow
>>108975041>Arbitrarily ablating so many layers at random weightsEnjoy the body horror and incomprehensible AI nightmares.Maybe if you could figure out a way to disable censorship by only slightly changing (0.8-0.9) small amount of layers it could be useful.Like it seems to work but results aren't good.This sigma crap another redditor linked in the thread:https://www.reddit.com/r/StableDiffusion/comments/1tw6gmq/ideogram_safety_filter_is_removed_by_using/didn't work for me but other timestep shenanigans seem worth experimenting.
>>108975235I love Jack the Cat, I wish we had more anons dedicated to this general. This general has so much potential but it needs more love desu
>>108975232It was added only 2 threads ago....did any new models come out?
>>108975264i like my ltx music
>>108975041>>108975252just wait for the sarah peterson patch, she'll fix it
>>108975283LTX can do music?At what length?
>>108975264We have ace step xl sft, with dcw.I'll go ahead and start work on my next song, I guess lmao.The key thing is to realize that prompting is very important. If your prompt is bad, you can do a2a with cover strength at 0.3. this can mitigate some issues. you need good audio equipment to hear ace step 1.5 xl gens in their full glory, though they only are essentially at idk 48kz mp3 maybe in total quality, ok? a paradox! like one of those women that has a wang.
>>108975295it can go on forever, but i haven't tried making a full song with it since you can only extend it in short increments in order to give enough memory for the context window to fit enough of the song to remain consistent
>>108975300I'm not fully familiar with that functionality, any guide guides or recommended baby's first settings?
>>108975302That sounds tedious
>>108975320yes, it takes forever when you want actual lyrics since it takes a long time to generate the extension, and then you can find that the singer doesn't say the correct thing. but i think it sounds okay considering it's a video modelhttps://files.catbox.moe/fuxnsb.mp4
>>108975252>>108975041>>108974785I personally hate all you and your "I have money and a GPU" vibe and I fantasize daily about watching you suffer BUT EVEN SO I think you should not be using Ideogram or messing around trying to jailbreak it because:->If Ideogram sees people deliberately avoiding their model because of this new censorship, it will pressure the people who came up with this nonsense to rethink their approach. Hit them where it hurts and that is the usage stats.2: If you keep trying to jailbreak it, you are making things worse for everyone. Every new model they drop will be harder to crack, more labs will start copying this censorship playbook, and the whole local ecosystem becomes a giant headache for all of us.The best move here is to go on strike and stop supporting anti gonner models.
>>108975336why does local attract so many blind and deaf people?
>>108975315steps 100, cfg scale between 6 and 13, shift as high as 11, dcw mode double, dcw scaler 0.0008, dcw high scaler 0.0005, ode eulerthat's where I've settled. There's another person who is ahead of me on this.other models need rhyming, ace step doesn't need it, idk if it even helps - much.but it still works better with quatrains of about the same number of syllables.the biggest tip is don't use "compose" and keep the audio codes thing blank. well, imo it's better. that keeps it squared up, but I don't like it. But when you do this, realize your prompt is kind of sequential. It relates to how clip works. I have not fully figured out how prompting works, because it's weird how it works. It knows descriptions of sounds.
you will never date your ai generated 1girl
>>108975337this is whats going to happenredditors will find a trivial jailbreak then everyone will use it for a while, realize its shit then forget about it and go back to klein/zit
>>108975353Meanie
>>108975337>I personally hate all you and your "I have money and a GPU" vibe and I fantasize daily about watching you sufferWhy???????
>>108975353the trick is to become the 1girl
Has anyone gotten the comfy workflow to work for ideogram4 using the fp8 models? I have a 4090, I get "mul_cuda" not implemented for 'Float8_e4m3fn'. I'm pretty sure Ada supports that so I'm confused. I'm on nightly. I guess I'll have to wait more. Very strange of them to not release the fp16 model. If I had that, it would work, since I have the VRAM.
whats so cool about ideogram4?
>>108975401its new and powerful
>>108975401its a cuckold simulator
>>108975369whats the best prompt for this
>>108975393Why are you supporting a model that doesn't want (You) using it? Use Anima cuckie.
>>108975349I see I have been using the gradio UI and just settled at 200 steps at heun. For me it's been best for whatever they call the guidance in that interface to sit a 2.5-3.>>108975336What made you decide to try this?I guess there isn't much discussion on music gen, I didn't see much of it in the threads
>>108975295https://files.catbox.moe/8s5ca3.mp4 (warning - nudity)Yes, and sometimes it's pretty good, but it doesn't follow the prompt all that well when using 8-step distilled, and I don't have the patience to wait longer. I take the music as a nice freebie when it comes out OK.
>>108975356Well, yes.There's nothing here to be excited about, unless you are REALLY interested in generating text, as in less than 1% of local users.Eventually we will have a ZiT/Klein killer model, but this sure ain't it.Looks like anima is finally dethroning the SDXL finetunes for anime stuff though.
>>108975432Aside from that Anima is overtaking Zeta Image and Klein, I keep noticing more and more realistic loras and finetunes for Anima.
>>108975432>Looks like anima is finally dethroning the SDXLall I've seen is half sticking with IL and half going with anima. mostly speed complaints
>>108975427InterestingI'm going to hone my skills with Ace step 1.5 I think it has a lot of potential the only problem I have is that there is a noticeable quality jump with the stl or whatever top end model between 80 where there's not much gain then it just jumps up around 180- 200 step I guess it's 400 when you use heun.
>>108975415Anima is undertrained garbage which gens hands like 2023 dall-e. Also, I tell me not to gen porn with your model, I'm going to gen porn with it, you know?
>>108975424>What made you decide to try this?sometimes i get random music in my videos, so i thought it had lots of music in the data
>>108975129>>108975143Very interesting, thanks.I'll take a look.
>>108975432>Looks like anima is finally dethroning the SDXL finetunesAll the actual skilled prompters and artist moved to it after preview 1 released desu
>>108975447how easy is it to finetune anima? i have over 100k real photos
I want to use Ideogram alongside windows 12 and nodes 2.0
>>108975467if you can afford to fine-tune just do the base cosmos model and get rid of the grifter licence
>>108975480Anime dataset is actually good for NSFW realism, it gives more hot and creative compositions
>>108975467Default training params work fine >>108975480Since he can tune he's probably not a jeet so he doesn't have to worry about the licence
>>108975452id be more than willing to help you get better outputs with anima but you strike me as the kind of anon who doesnt want help and would rather complain
>>108975467I don't know
>>108975452Why you follow me everywhere i go?
>grifter licenceHas anon found any proof of this yet or is he still just trolling
>>108975467easy
>>108975494why did you even bother replying to him lol
>>108975424>settled at 200 steps at heunace step cpp doesn't have heun, and is capped at 100 steps. idk why.dcw needs to change depending on how many steps you use.If I ever go back to comfyui for ace step, I would use exp_heun_2_x0_sde for the sampler, and I always use tan2 for my scheduler now. It's like a double Z shape. basically, bong tangent "rushes" through the mid sigmas, but tan2 has an adjustable plateau in the middle, or wherever you want it. anyway, these seem to be the best sampler + scheduler, so I think. Shame the sampler isn't on sdcpp, and shame I have to use comfyui to collect my sigmas, not that it takes that long.
>>108975504Just so you know, whoever you are, it looks really sus from the outside when someone jumps to defend one specific model this fast always. We were talking shit about Ideogram earlier and nobody said a word, but the second Anima comes up, suddenly there's a white knight in the thread. That's a little too convenient bro.
>>108975504tdruss was bitching about not making enough money abloobloo
>>108975558Wait there's a c++ version well fuck!What do you gain from that version?
>>108975134>>108975164Here's zit doing the first degree, I didn't prompt or prompt enhance, just pasted it in.
>>108975568It runs on my rdna2 card, and has dcw, and a2a.rdna2 is kind of the runt of the cards, amd has partially dropped support.
>>108975577>AMD shitting the bedIt's all so tiresome it's like they throw the match on purpose
guys guysrememberanima is le bad >:^(
>>108975585Yep, I'm through with all the companies. It's clear they have conspired to limit ram, and to limit "cuda" matrix math.
>>108975041>>108975252Thinking again, how much does the theory even make sense here?They modified the relevant weights during post-training process and taught the model to draw the grey image when presented with forbidden conditioning.So we want to disrupt the layer weights in such a way that:1) It doesn't completely cripple the model or make it too weird, so essentially a small enough delta on as few layers as possible2) No longer draws the safety filter image3) Instead draws whatever it knew about the naughty conditioning before post-training?The latter doesn't seem very possible through ablation. I guess the realistic goal here is to make it less prone to ludicrous false positives. If we assume that they fucked up the training and unintentionally fried the model and that's why it is so trigger happy, it might be possible to moderately clamp select few amount of probably the composition related middle layers and no longer get so many safety filter images, without also raping the model.But I dunno finetuning seems like a much better way out of this mess.I was thinking about putting some combinations into comfy oven before going to bed, and see if anything interesting comes up when I wake up, but I am now reconsidering if this is worth it.
The last remaining hope is Celestial, it's obvious that amd intentionally nerfed the matrix math on rdna4.
>>108975562>>108975563So no proof yet? Shame.
what do we want?matrix mathwhen do we want it?NOW!
>>108975471theyre calling him the most pozzed genner known to anon
>>108975622I'm disappointed he's going to use bare metal instead of cloud.
>>108975563dont worry bro im sure soon someone will join your team to make apache2 anima
>>108975654Funny how that became a nothing burger.
>>108975571aries 2 of Janduz, not prompted, maybe I should come back and prompt.
>>108975663Somtimes, a model can do one thing well, and feels like it should be a lora. Like ovis image whatever, it can do cartoon text really well. but I don't think it's good enough to do i2i, so pointless, I guess.
>https://echo-team-joy-future-academy-jd.github.io/Echo-Infinity/New model released that lets you generate 24 hour long videos based on Wan.
Everyone's favorite model is released.https://civitai.red/models/2544636?modelVersionId=2983680
>>108975683>24 hour long videosThey're trying to kill the coomers, aren't they ?
>>108975696At least this time he's not lying and has labeled it as a merge
>>108975696Why is everyone so hyped about this? Legit question.
>>108975723>Why is everyone so hyped about this?It's Pride Month. Let 'em get loud.
>>108975723>everyoneSure Jan...
>>108975723Same reason people get hyped about dogshit popular music. Probably a mixture of shilling and some kind of viral snowball effect past a certain point.
>>108975723>Why is everyone so hyped about this?some struggle to "stabilize" outputs from raw finetunes so they need something like WAI with a rigidity that compensates for their lack of prompt-fu
music enthusiasts here? what's the meta for local musicgen now? i'm interested in melodic instrumentals. was having fun with audiocraft a year ago.
>>108975754I am but AI will never come anywhere close to generating anything I could remotely enjoy (Classical) so I don't even bother
>>108975736Hm, the output is a indeed a lot cleaner: >>108974243
>>108975781sovl vs slop, as always with these shitmixes
>>108975754>what's the meta for local musicgen now?FL Studio
>>108975787wai pretty much always was a nice finetune tho
>>108975494I don't do a whole lot of imagegen anymore except to feed into ltx-2.3 i2v, and nsfw sdxl models are good enough for that. Maybe anime will improve with more training. I dunno, I just don't have the free time I used to.
>>108975797it was never a finetune tho. he just shitmixes random loras into slopshit
>>108975797>finetune
>>108975781i wont convince you to NOT use it i dont care about that. my only point was that mixes and merges are designed to give a default style which some prefer and others do not. if you like the style it brings than by all means use it, but i dont really care for models that have their own built in style. it usually leads to them looking less like real images and more like generated outputs. >>108975797>finetunelel
What if I train a lora with the "image blocked by safety filter" images it outputs, tagged with a diverse set of the prompts it rejects and apply it with -1 strength at runtime? What would happen?
>>108975802>I just don't have the free time I used to.the future belongs to the zoomers and gen alpha, old man
>>108975823negative loras dont work, you may prevent that from being generate but it will generate garbage
>>108975815Ok, so this is the same input from the collage in the OP. I used a sketch lora for that look and WAI seems to completely ignore that. I assume it's because it's a merged model and the lora doesn't work very well and not because it is overriding the effect it is supposed to have on the image? I don't know much about how this works, sorry.
json prompting is the gayest thing on the planet
>>108975855What if I apply it at +1 strength at the unconditional model then?Could this model's unique structure make it work?
>>108975871does your lora have trigger word? have you tried increasing the lora strength?
The N64 lora seems to work really well.>>108975897It does. I can also try increasing the lora strength. Hold on.
>>108975894It's a humiliation ritual for sure. Great prompt adherence has questionable value when no will sit through the tedium.I think it's meant for some agentic loop with LLM in the middle, I don't think they expect us to type that garbage by hand.Still sucks though.
from 1.00 strength to 1.30. Doesn't look much different. I'll crank it up to 2>>108975930Sick
>>108975754I'm still new to it but Ace Step seems to do well if you can prompt things out. There's a fuck ton of settings I don't understand but so far so goodhttps://vocaroo.com/1gFu5B3LIcBC
lol it added femkuna to the image I guess it does look lil bit sketchier though
>>108975941I just use an llm to generate the prompt. They have the system prompt they use in their github.https://github.com/ideogram-oss/ideogram4/blob/main/src/ideogram4/magic_prompt_system_prompts/v1.txtI have been able to generate nsfw images. I don't think I have triggered the safety image once.I don't have much opinion so far on quality, but it is decent enough.
2.50
>>108975947Thanks. Anima is actually pretty fun once you get the hang of it.
>>108975965it doesn't look better than zit though. I don't give a fuck about text
Are the luddites dead yet
>>108975983Check the wlop lora on civitai. It has a similar artstyle.
>>108974431because you didn't--disable-dynamic-vram
>>108975369ywnbaw
>>108976013funny enough, I tried "@wlop", but didn't like the result, so I removed him from the mix. I'll try the lora though.
>>108976032Man, the plastic slop people tolerate...
>>108976075desu still very much plastic
>>108976080Get your eyes checked, sis
>>108976018Comfy keeps threatening that they will remove this option, but I'm starting to doubt it ever will since they suck so bad at keeping a decent vram threshold with dynamic-vram, meaning you will OOMPeople will rather use more system ram and take the ~5% perfomance hit
>>108976075stop ruining my slop
>>108976002Nope but they are well on their way out. They're in the violently attacking high profile figures phase. People don't suffer that kind of behavior long.
Yeah, I think I get what Anonymous meant when he said 'mixes and merges are designed to give a default style'. The checkpoint seems to steamroll most of the loras I've tried,
>>108976136holy shit wai is such a shitty slop, how can anyone like this
>>108975349>dcw mode double, dcw scaler 0.0008, dcw high scaler 0.0005, This does make a difference way better separation
>>108976149Yeah, I remember the anon posting about dcw. dcw is supposedly originally for images, but you don't see anyone doing it.
>>108976083its okay you can still try again
whats wai?
best 1girl, you can't even compete.
someone make Bitcoin-chan hanging from a noosethanks
>>108976218you do it
>>108976226I can't get the ₿ logo on the shellsomeone edit it
>>108973324it's a grifter
>>108976218>still bag-holding in 2026
>>108976240no, I tethered up when it was above 100kbut I plan to start DCA once it goes below 60k
>>108976240>being chinesenever be chineseRemember the adage "a chinaman's chance"
>>108976246>>>/biz/fuck off
>>108976203
>>108976251
>>108976257Fine, retail stonks will get smacked in the summer. You want your biz, there it is.
What should I generate?
>>108976313>What should I generate?>>108975164also, you can throw it into an llm and add a modifier, like steampunk, videogame scene, renaissance painting, dark anime.
>>108973545>>108973550thanks!
>China, Germany, USA, Singapore, Israel, China, China, China...When will see see a French image model? Or one from South America? Australia? Sweden?
>>108976338right after they import more muslims and jeets
>>108976338>FrenchCan anyone besides maybe Mistral train anything worthwhile there?>South AmericaLol>AustraliaInteresting how no major global tech corporation ever came out of there, AI is no exception. In theory has the ingredients. Too much regressive tax/regulation like Europe?>SwedenIf you thought BFL's German safetrooning was bad Swedes will probably invent a whole new level of cuckoldry.
https://xcancel.com/thepatch_kev/status/2062140772942774681?s=20musicgen chads...
>>108976373https://github.com/betweentwomidnights/sa3-ableton-extensionthis looks really cool
>>108972764idk
>>108976436krea 2 medium is so cucked lmao
>>108976373waiting for the lmms version
I wish /k/ wasn't full of retarded boomers so we could have good gun loras for every model.
>>108976501Almost got baited by this.Let's just hope the model release doesn't get safetymaxxed.It would be a shame cause the model is really creative.
>>108976574>Let's just hope the model release doesn't get safetymaxxed.its over
>>108976574the only thing that would be good is an uncucked krea 2 large, maybe a slightly older checkpoint of it compared to the one they can sell on the api, otherwise its DOA like krea 1, which is what it will be, since they won't ever release krea 2 large
Also anyone has Nihei lora? According to booru, he has only <150 pics and this sure ain't it after prompting for him.
>local is in such a good spot that no anon bats an eye when the new SOTA textgen model is ruined by safetycucking
What if train image models with not just tokens output from LLMs encoding text, but also tokens output from LLMs encoding images? Or both? Or audio?
>>108976615nobody cares about textgen, we use text to gen 1girl instead here. you'll have your abliterated fix soon anyway
>>108976619that's called ltx
>>108976622no ideogram4