/g/ - Technology


The Secret Sauce For Kinos Edition

Discussion and Development of Local Image, Video, and Music Models

Previous: >>108966726

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, & Upscalers
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/tdrussell/diffusion-pipe
https://github.com/kohya-ss/sd-scripts
https://github.com/kohya-ss/musubi-tuner

>Z
https://huggingface.co/Tongyi-MAI/Z-Image

>Anima
https://huggingface.co/circlestone-labs/Anima
https://tagexplorer.github.io/
https://animadex.net

>Qwen
https://huggingface.co/collections/Qwen/qwen-image

>Klein
https://huggingface.co/collections/black-forest-labs/flux2

>Wan
https://github.com/Wan-Video/Wan2.2

>LTX-2.3
https://huggingface.co/collections/Lightricks/ltx-23

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Collage: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/b/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
>>
Can Krea2 do N64 kino OOTB?
>>
oops fox posted after thread over
>>
how long before we have dream diffusion?
Turn my nightmares into a reality
>>
Blessed thread of frenship
>>
File: 1760367433244970.png (138 KB, 984x408)
ai image detectors are quick with updating their models
holy smokes
>>
File: 1779034870481081.png (701 KB, 930x465)
klein edit 9b is pretty neat, can do all kinds of stuff "use colored pencils like a sketch", change color, etc.

qwen edit is good but I seem to get better results with klein edit.
>>
>>108972791
>klein edit 9b
what's the difference between klein 9b and klein edit 9b? Also which one can be used as kontext?
>>
>>108972801
edit workflow for edits, im using klein edit 9b (distilled), very fast even at 8 steps (4 default)
>>
>>108972305
Are you thinking about releasing your kinoapp? I would love to try it out.
>>
nb4 niggas who dont remember the safeycucking of ideograms last model complaining about the safetycucking of their new model
>>
File: 1780151980273053.png (1.02 MB, 1056x976)
>>108972816
also, klein edit seems to be pretty good at copying font styles:
>>
>>108972763
honestly the best nipples i've seen in a base model
>>
File: ComfyUI_00720_.png (1.23 MB, 896x1152)
>>
File: 1766037634923393.png (950 KB, 1080x1080)
SD3 chads eating good tonight
>>
File: 1758705451492465.png (1.36 MB, 1024x1024)
>>108972841
better quality image as source.
>>
how have they not fired the greasy snake already?
https://www.reddit.com/r/comfyui/comments/1tvttzv/ideogram_40_just_open_sourced/
>>
Comrades! It's Pride Month. Show us your pride !
>>
File: 1769030986209693.png (1.21 MB, 1440x960)
>>108972914
>>
File: spaghetti pixelization.png (225 KB, 1635x767)
>>108972829
Maybe when I get the spaghetti under control. This could all be done by a single wrapper.

Also there are features not yet implemented, like figuring out a design for a better manual palette node with a color picker. I really don't like how unwieldy it all is right now.

You can do all of what I'm doing with some basic math right now. The only 'hard' part is picking a gen resolution close to your target aspect ratio and target image size (in total pixels) which divides cleanly by 8 and the target pixel size. You could figure out how to do that yourself and probably come up with something good enough, or ChatGPT could come up with an algorithm in five seconds that does it. For quantizing to black and white an easy trick with the default nodes is to just composite the image onto a large black and white image, quantize to 2 colors, then crop it back to the original dimensions. Downscaling and upscaling can both be done through the default nodes trivially.
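
A minimal sketch of that "hard part", assuming both sides should be multiples of lcm(8, pixel size) and that closeness is judged by relative error in aspect ratio plus relative error in total pixels. The function name `pick_gen_resolution` and the scoring are made up for illustration, not taken from the node:

```python
import math

def pick_gen_resolution(aspect_w, aspect_h, target_pixels, pixel_size, latent_multiple=8):
    """Return (width, height) near the target aspect ratio and pixel count.

    Both sides are multiples of lcm(latent_multiple, pixel_size), so the image
    divides cleanly by 8 and by the target pixel size.
    """
    step = math.lcm(latent_multiple, pixel_size)
    aspect = aspect_w / aspect_h
    # Ideal (unrounded) size for the requested area and aspect ratio.
    ideal_w = math.sqrt(target_pixels * aspect)
    ideal_h = ideal_w / aspect
    w_options = {max(1, math.floor(ideal_w / step)), max(1, math.ceil(ideal_w / step))}
    h_options = {max(1, math.floor(ideal_h / step)), max(1, math.ceil(ideal_h / step))}
    best = None
    # Try the multiples of `step` just below and just above each ideal side.
    for w_units in w_options:
        for h_units in h_options:
            w, h = w_units * step, h_units * step
            # Assumed score: relative aspect error plus relative area error.
            score = abs((w / h) / aspect - 1) + abs((w * h) / target_pixels - 1)
            if best is None or score < best[0]:
                best = (score, w, h)
    return best[1], best[2]

# 2:3 portrait, roughly one megapixel, 12px "pixels"
width, height = pick_gen_resolution(2, 3, 1024 * 1024, pixel_size=12)
print(width, height, width // 12, height // 12)
```

The last two printed numbers are the size of the downscaled pixel-art grid. Swap the score for whatever you care about more (aspect or size).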

Also remember it's very easy to have any LLM make you a comfyUI node from scratch to do whatever you want if you give it this link: https://docs.comfy.org/llms.txt

All my time presently is absorbed by trying to make this app for tag correction on images for LoRA training, I'm just working on some UI elements for that. Since someone a few threads ago said the reason I don't like LoRA training is because of sour grapes or whatever. I'm getting a bit sidetracked on the way to making my LoRA...
>>
File: 1778590750619831.png (1.47 MB, 1024x1024)
>>108972870
shalom (the people who decided to change the protagonist)
>>
File: 1766974471594411.png (254 KB, 448x336)
>84x85
>>
>>108972940
>>108972951
Based and pixel pilled
>>
Wonder how well it does as a second pass? Use another model like ZIT to get a rough composition, decode, re-encode the latent, then continue the steps from around 20% to completion so that the safety text doesn't get a chance to spawn in.
Because from the few booba gens I've tried if the text goes away within the first 10-20% steps it doesn't come back.
>>
File: 5875.png (1.66 MB, 1504x848)
>>108972781
It probably just has the same features that most generative models have, so no need to update, especially since it's diffusion, which is really easy to spot (as opposed to GANs or anything specifically trying to avoid detection). It's also using the flux VAE, which isn't new.
>>108972914
>>
File: 1776615456186897.png (1.25 MB, 1024x1024)
>>108972946
okay, here is the most accurate cover.
>>
File: 1girl.png (549 B, 76x76)
Maybe this will be the one. Combined a few gens to make it.
>>
File: 1girl_fix.png (547 B, 76x76)
>>108973063
removed a 'detail' on the face which felt like meaningless noise. Better I think.

(the text extending out of the page is a stylistic choice I made, not a mistake. Maybe an artistic mistake.)
>>
>>108972973
Tried it at 0.9 denoise and it unsurprisingly distorts the vagene into a tumor but could be worthwhile for partially clothed gens.
https://litter.catbox.moe/x7u5vprbixpzljfo.png
>>
File: 1771110777435.jpg (18 KB, 398x376)
Reply blocked by safety filter
>>
File: 1760794671931890.png (1.38 MB, 1024x1024)
>>108973049
>>
Ouch 2 minutes for Turbo on my 3060, could have been worse but not a great start.
Anyway does anyone know why there are two checkpoints? What is "unconditional"?
Are different parts of the cfg equation calculated by different models here? Is that what it is referring to, and if so, why?
I would also ask for what mu and std (standard deviation?) stand for but I doubt anyone can make sense of that comfy spaghetti.
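
For reference, the usual classifier-free guidance equation as a tiny sketch. This is the generic formula only. Whether this workflow's two checkpoints map onto the two terms is still just the guess above, and mu/std are not explained here. The name `cfg_combine` and the toy numbers are made up:

```python
def cfg_combine(cond, uncond, cfg_scale):
    """Generic CFG: start at the unconditional prediction and move
    cfg_scale times the (cond - uncond) difference toward the prompt."""
    return [u + cfg_scale * (c - u) for c, u in zip(cond, uncond)]

# Toy stand-ins for the two noise predictions at one step.
cond = [0.5, -1.0, 2.0]     # prediction given the prompt
uncond = [0.1, -0.5, 1.0]   # prediction given an empty/negative prompt

print(cfg_combine(cond, uncond, 0.0))  # pure unconditional
print(cfg_combine(cond, uncond, 1.0))  # pure conditional
print(cfg_combine(cond, uncond, 4.0))  # pushed past the conditional
```

"Unconditional" in that formula is just the prediction without the prompt, which normally comes from the same model run a second time.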
>>
File: 1766074143812898.jpg (40 KB, 500x500)
>>108972325
>>108972707
>>108972726
>mfw reading this
filtered again...
>>
so what is this new model people are having censorship issues with?
>>
>>108973246
https://huggingface.co/Comfy-Org/Ideogram-4
>>
i'm 1girling
and i'm happy
>>
>>108973285
why the fuck would they train censorship into it
>it GENERATES A CENSORED IMAGE instead of censoring it
>>
>>108972768
catbox?
>>
>>108973311
>>108972840
>>
>>108973311
So that you can feel S̵̨̧̛̼̫͖͇̳̝͕̣̜̖̱̻̯̤̰̝̭͕͖̗͕̮̟̰͙̤̟͙̪͉̻̯͕̘̬͖̪̰͙͚̈́̾̈͗͗͐͛͗̈̋̐̀̂̍̈́͋̽́̄͂͗̋̔́̎̀͐͒̒̉̋͗̊̆͘̚͜͠͝͠ͅÀ̵̢̢̢̡̧̡̛̛̪̹͇̬͙͈̻̻͎̗̠̱̰̬̜̝̙͈̟̪̰͕̤͕̦͇̖͈̫̞͈̻͙̣̳̻̥͓̰̠͍͚͕͖̦͍̄͗͑̑̌̑̿́̉͆͒̍̿̾̀́̀̿̎́͑̀͂̽͗̂͗̓̓̃͑̋̌͗̎͂̇̑͌̽͆̿́͛́̃̐̽̓̋̈́́͊̈̐̾̏̍̌͗͋̆̒̿͊̍͐̉̊̉̈̀͋̄̓͜͜͜͝͝͝͝͠͠͝ͅͅF̵̢̨̧̧̡̡̧̢̢̧̛̛̤̝̺̻͔̝̣̼͈̣̪̭̜͚͕͙̟̫̝̮̹̥̫̙͙͉̺͚̦͍͍͍̰͕̪͕͎̩̝͙̘̠͚̙̞̠̻̬͖̱̯͖̟̙͇̪̦̬͍͍͙̣͕͑̽̈́̂̂͗͑̋͆̈̊́̿̂̐̏͛͒̌̐̽͗͊͛̏͊͋͐̽̑͛̂͌͐̓͐̾́̽̋̐͑̎͛̈̽̓̔̌̿͛̀̃́̀̿͌̋͌̆̄̽̂͌̇͂͂̓̾̄͆͛́̒̓́͊͘̚̕̕͘͜͠͝͠͝͠͝͠͝͝ͅȨ̸̡̙̫̝̘͔̟̝̳̙͇͚̭̪̦͚̬̤̼̫̖̗͇̈́̀̃̆̐̈́͑̇̃̓̒̈́̿̄̈́́̌̓̒̐̈́̇̚͠ chud, be grateful for once.
>>
This might sound unbelievably retarded but how do you make people visually breath hard or take deep breaths then exhaling repeatedly with Wan or LTX?

I've tried describing the act of repeated inhaling and exhaling but it doesn't seem to work. Genning hyperventilating seems hard, or i'm retarded.
>>
Remember when localroaches wouldn't shut up about their local uncensored models?
>>
Cry more SaaSfag
>>
File: 1749256176615482.png (1.37 MB, 1024x1024)
censorship is retarded.

we have klein edit + undress loras, even ltx 2.3 lewd finetunes if you want lewds.
>>
>>108973372
we can just use another model anon
and you are still out of credits after 2 gens kek
>>
Judging by the facts that its sensitivity is prompt dependent (probably one style of prompting was over-represented relative to the other during the finetuning) and that neither the comfy nor the diffusers code contains anything funny about censorship, we can conclude that this censorship was probably just post-training finetuning.
They taught the model to draw the grey censorship image when asked for no-no prompts. Clearly there wasn't enough regularization, so it's fried as fuck and generates that even for the most benign prompts. I am not sure whether it being less sensitive to json is a case of it being trained on json, so that it can distinguish "good" from "bad" prompts better when given json, or a case of it being trained on NL prompts, so that it picks up thought crimes more easily with NL while json slips past. (I am inclined to believe the latter, as some anon was able to gen shitty nipples with json last thread.)
So anyway this might be salvageable with finetuning, and I expect NSFW loras to work for the specific type of shit they are trained for, although they might be less versatile and less reliable than on non-safety-cucked models. It has a great vae and text encoder, so if it responds well to training it might still be worthwhile to tinker with it.
>>
File: 1755038994684352.png (1.38 MB, 1024x1024)
>>108973434
>>
>>108973521
Sounds like the exact same situation as their last model. Check out ponyfags Auraflow model to see how that turned out kek. No harm no foul tho I don't really give a shit about Ideogram.
>>
>>108973232
I'm going to make a single standalone node that does the basic pixel calculations which I can upload to a pastebin or something, then give you a workflow showing the idea. Going to assume you have the popular 'ComfyUI-custom-scripts' thing from pythongosssss, because you'll need a node that does custom math

Almost done
>>
>mfw Resource news

06/03/2026

>Ideogram 4.0: Open model at the forefront of design
https://ideogram.ai/blog/ideogram-4.0

>JoyAI-Echo: Pushing the Frontier of Long Audio-Visual Generation
https://echo-team-joy-future-academy-jd.github.io/Echo-LongVideo-Page

>Follow-Your-Preference++: Rethinking Preference Alignment for Image Inpainting
https://github.com/shenytzzz/Follow-Your-Preference

>LongLive-RAG: A General Retrieval-Augmented Framework for Long Video Generation
https://github.com/qixinhu11/LongLive-RAG

>MAI-Image-2.5
https://microsoft.ai/models/mai-image-2-5

>AAD-1: Asymmetric Adversarial Distillation for One-Step Autoregressive Video Generation
https://aad-1.github.io

>Inference-Time Scaling for Joint Audio-Video Generation
https://jung-jaemin.github.io/ITS-AVGen-Proj

>Video-Mirai: Autoregressive Video Diffusion Models Need Foresight
https://y0uroy.github.io/Video-Mirai

>Order within Chaos: Capturing Intrinsic Energy Anomalies for AI-Manipulated Image Forgery Localization
https://github.com/phoenixnir/FLAME

>VISReg: Variance-Invariance-Sketching Regularization for JEPA training
https://haiyuwu.github.io/visreg

>HumanNOVA: Photorealistic, Universal and Rapid 3D Human Avatar Modeling from a Single Image
https://HumanNOVA.github.io

>Cosmos 3: Omnimodal World Models for Physical AI
https://research.nvidia.com/labs/cosmos-lab/cosmos3

>TGV-KV: Text-Grounded KV Eviction for Vision-Language Models
https://github.com/Danielement321/TGV-KV

>JAVEDIT: Joint Audio-Visual Instruction-Guided Video Editing with Agentic Data Curation
https://ryanchenyn.github.io/projects/JAVEdit

>Any2Poster: Any-Source Poster Generation Across Modalities and Domains
https://github.com/Any2Poster/Any2Poster

>Martin Scorsese faces industry backlash over AI company partnership
https://www.independent.co.uk/bulletin/culture/martin-scorsese-ai-black-forest-labs-b2988639.html
>>
>mfw Research news

06/03/2026

>Training-Free Multi-Concept LoRA Composition with Prompt-Aware Weighting
https://arxiv.org/abs/2606.03792

>Text-to-Image Models Need Less from Text Encoders Than You Think
https://nsping13.github.io/contextless-TTI

>Qwen-Image-Flash: Beyond Objective Design
https://arxiv.org/abs/2606.03746

>Bootstrap Your Generator: Unpaired Visual Editing with Flow Matching
https://research.nvidia.com/labs/par/byg

>Initialization is Half the Battle: Generating Diverse Images from a Guidance Potential Posterior
https://arxiv.org/abs/2606.02453

>Inverting the Generation Process of Denoising Diffusion Implicit Models: Empirical Evaluation and a Novel Method
https://arxiv.org/abs/2606.03111

>Retrieve What's Missing: Coverage-Maximizing Retrieval for Consistent Long Video Generation
https://arxiv.org/abs/2606.02479

>Drifting Preference Optimization for One-Step Generative Models
https://arxiv.org/abs/2606.02521

>Equilibrated Diffusion: Frequency-aware Textual Embedding for Equilibrated Image Customization
https://arxiv.org/abs/2606.02129

>Geometry-Aware Implicit Memory for Video World Models
https://gim-world.github.io

>GuidedBridge: Training-freely Improving Bridge Models with Prior Guidance
https://arxiv.org/abs/2606.03119

>MemoGen: Can Past Experience Improve Future Text-to-Image Generation?
https://arxiv.org/abs/2606.03243

>UniVerse: A Unified Modulation Framework for Segmentation-Free,Disentangled Multi-Concept Personalization
https://universe-personalization.github.io

>Diffusing in the Right Space: A Systematic Study of Latent Diffusability
https://arxiv.org/abs/2606.03578

>$A^2$: Smaller Self-Supervised ViTs Localize Better than Larger Ones
https://arxiv.org/abs/2606.03148

>Attention, May I Have Your Decision? Localizing Generative Choices in Diffusion Models
https://arxiv.org/abs/2604.06052

>You Don't Need All That Attention: Surgical Memorization Mitigation in Text-to-Image Diffusion Models
https://arxiv.org/abs/2603.00133
>>
File: 23f.png (6 KB, 335x261)
Can I tell an LLM to use booru tags and it will do it properly or hallucinate its own?
>>
>>108973532
He didn't train on the last Ideogram though.
He trained on an extremely underbaked model someone else trained on Ideogram outputs, without pruning any of the censored images.
Anyway I am just saying it's probably worth experimenting with. You are probably not going to get rid of almost 100% of the "image blocked by safety filter"s without finetuning on millions of images but 90-95% might be doable on a very small scale finetune, or hell even lora, without spending a fortune.
>>
File: anima1_00004_.jpg (462 KB, 1344x1728)
>>
>>108973581
Depends on the LLM, but aside from booru tags that have special meanings that aren't trivially obvious, most llms will cope fine with tag style instructions. They are smart enough to just guess what you mean even though they have no proper booru training.
>>
>>108973581
I've done this. Try preprocessing a list of Booru tags with their wiki entries and build a prompt like "below is a list of tags with descriptions, only output the tags, follow their descriptions". You can't do ALL the tags out there, since that will blow up the context window of even the larger open source models, but you could always do more than one pass and split them up.
This worked halfway decent with Qwen 3.5. I haven't tried Gemma 4 but assume it would work even better, as it's ridiculously easy to get it uncensored with a system prompt.
Also look up XGrammar, where you can literally force the model to only output certain formats. Not sure whether it's compatible with Gemma 4 yet though
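
Rough sketch of the split-into-passes idea, assuming a character budget as a crude stand-in for the context window. The tag descriptions are placeholders I made up, not real wiki text, and `chunk_tags` / `build_prompt` are hypothetical helper names. Sending each prompt to a model is left out:

```python
def chunk_tags(tag_wiki, max_chars):
    """Split {tag: description} into chunks whose rendered lines stay under max_chars."""
    chunks, current, size = [], [], 0
    for tag, desc in tag_wiki.items():
        line = f"{tag}: {desc}"
        # Start a new chunk once this line would push the current one over budget.
        if current and size + len(line) + 1 > max_chars:
            chunks.append(current)
            current, size = [], 0
        current.append(line)
        size += len(line) + 1  # +1 for the newline
    if current:
        chunks.append(current)
    return chunks

def build_prompt(lines, image_description):
    """One prompt per pass: instruction header, tag list, then the request."""
    header = ("Below is a list of tags with descriptions. "
              "Only output the tags, follow their descriptions.")
    return "\n".join([header, "", *lines, "", f"Tag this: {image_description}"])

# Placeholder descriptions, invented for the example.
tag_wiki = {
    "1girl": "one female character in the image",
    "standing": "character is upright",
    "outdoors": "scene takes place outside",
}

chunks = chunk_tags(tag_wiki, max_chars=80)
prompts = [build_prompt(chunk, "a girl standing in a park") for chunk in chunks]
for p in prompts:
    print(p, end="\n---\n")
```

Each pass only sees its own slice of the tag list, so you'd merge the tag outputs from all passes afterwards.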
>>
> >108973545
> >108973550
fuck off
>>
File: aa.png (1.9 MB, 1400x704)
>>
File: anima1_00010_.jpg (483 KB, 1344x1728)
>>
>>108973600
>>108973667
those are really good. base anima or lora?
>>
File: ComfyUI_temp_dzunn_00002_.png (1.17 MB, 1024x1472)
Thank you euler cfg pp, very cool
>>
File: anima1_00018_.jpg (466 KB, 1344x1728)
>>108973679
photo lora test, took fraction of dataset and tried how it learns. pretty good so far
>>
>>108973705
How many images for test and in total? How many steps have you trained for?
>>
>newest localkek slopware has built-in censorship
>saas remains decades ahead due to their uncensored all-knowing base models
local is an absolute embarrassment
>>
>>108973705
really nice so far, keep it up
>>
>>108973703
It's a based sampler desu. It's certainly more difficult to use than others but once you get the hang of it the results are incredible.
>>
>>108973714
54 epochs with batch 8, around 600 images something like that

>>108973733
TY only gonna get better
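
For comparing against step counts quoted elsewhere in the thread, the 54 epochs / batch 8 / ~600 images run works out like this. Assumes exactly 600 images, no dataset repeats, no gradient accumulation, and that the last partial batch still counts as a step. The real trainer settings may differ:

```python
import math

def total_steps(num_images, batch_size, epochs, repeats=1, grad_accum=1):
    """Optimizer steps for a run, under the assumptions stated above."""
    steps_per_epoch = math.ceil(num_images * repeats / (batch_size * grad_accum))
    return steps_per_epoch * epochs

print(total_steps(600, 8, 54))  # 75 steps per epoch * 54 epochs
```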
>>
>>108973703
Just reconfigure your pp, Anon!
>>
Is there a good target for number of images in a lora dataset for anima? I've been making some style loras with 100-200 images and usually get good results around 1500 steps. Is it best to use the largest dataset possible with good image diversity and style consistency?
>>
>>108973727
I can tell you're not putting your heart into it this time. Sad desu.
>>
>>108973787
quality>quantity always, been training for a long time and the two things that consistently improve the quality the most were purging the dataset of garbage and increasing the dim/rank
>>
File: anima1_00033_.jpg (833 KB, 1344x1728)
>>
File: b0dcq5.png (2.04 MB, 1216x832)
>>
File: anima1_00038_.jpg (521 KB, 1344x1728)
one for battlestation thread
>>
I thought api was supposed to be censored? How come local models are spitting out safety images? That never happened with Grok
>>
35?
>>
>>108973870
where i post my kinosovl from
>>
>>108973787
Above 100 is a good target for style loras, pretty much any model. I expect the difference between 100 and 1000 images to be not worth it for most loras.
>>
File: pixel_00015_.png (1 KB, 64x96)
>>108973232
Done, here you go
https://files.catbox.moe/rksrik.zip

Put that node in your custom nodes folder. You should also have this installed:
https://github.com/pythongosssss/ComfyUI-Custom-Scripts
Just for the math node. I have my own math node that I like better but most people have this one installed already, I tried to make something you could use right away
>>
Is anon trying to pretend like it's all local models and not just Ideogram?
>>
>>108973824
>>108973892
ty fellas I appreciate the help
>>
>>108973877
imagine spending 10k on a rig just to get cockblocked by a safety filter when you try to gen "1girl, standing"
LOOOOOOOOOOOOOL
>>
y did anon reply to himself ?
>>
File: anima1_00044_.jpg (650 KB, 1344x1728)
>>108973882
>>
>>108974000
uh oh melty
>>
>moving forward all local models will be censored
How does that make you feel, anon?
>>
>his seething grows quieter and quieter
>>
my Anima upscales all like shit compared to the base image for some fucking reason
when i used illustrious my upscales were always objectively better than the base image
>>
>>108972752
HUGE NEWS EVERYONE I COMPILED SDCPP AND IT DOESN'T CRASH

(but gguf don't work, comes out all white, preview is all black, apparently it's a bug)
>>
>>108974039
doesn't matter. we have the weights for flux dev 2.
>>
>>108974068
latent upscale?
>>
>>108974068
User error. Also, anima can do larger resolutions out of the box anyway, so highresfixing is just a cope
>>
File: anima1_00052_.jpg (687 KB, 1344x1728)
>>108974068
I can't get good upscales with it either
>>
are people making any celeb/streamer lora's for illustrious anymore? or do i bite the bullet and use sd 1.5 and pony
>>
>>108974119
you have to be the change you want to see in the world. you do know how to train loras, right?
>>
File: anima1_00055_.jpg (867 KB, 1344x1728)
>>
>>108974039
KEKSTONE WILL UNCENSOR IT!!! JUST DONATE $500000 SO HE CAN TRAIN AT 256x256 ON FURRY DIAPERSCAT SLOPPA
>>
>>108974068
Try genning with a natively higher resolution first. I stopped using upscalers with Anima. I like the native look better.
>>
File: ComfyUI_Anima_03198_.png (1.33 MB, 1344x960)
>>
File: 153246486.png (328 KB, 800x778)
im going back to dalle mini. thats where the sovl is at
>>
>>108974104
>>108974068
well you might be in luck, nvidia dropped the pid checkpoints for qwen today, this could
https://huggingface.co/Comfy-Org/PixelDiT
the comfyui master doesnt have it yet but should drop soon, support is already in the nightly
>>
File: anima1_00059_.jpg (1.06 MB, 1344x1728)
>>
I really thought anon would get more trolling out of the Ideogram release. I'm pretty disappointed in him he's barely trying.
>>
File: smdyyn.png (1.94 MB, 1216x832)
>>
>>108974175
>this could
im retarded
this could enable upscaling for anima *
>>
File: ComfyUI_00001_.png (607 KB, 1024x1024)
>>
File: 213751CUI_00001_.png (1.24 MB, 1536x1152)
>>
>>108974175
there was a z-image section in https://huggingface.co/nvidia/PiD , is that not released for comfyui yet or is it pixeldit_1300m_1024px_bf16.safetensors ?
>>
>>108974257
zit was released initially already, what they dropped yesterday are checkpoints for SDXL and qwen vae (so what anima uses), as well as a fixed flux2 one
https://huggingface.co/nvidia/PiD/tree/main/checkpoints/PiD_res2kto4k_sr4x_official_qwenimage_distill_4step
https://huggingface.co/nvidia/PiD/tree/main/checkpoints/PiD_res2kto4k_sr4x_official_sdxl_distill_4step
support is only in comfy nightly so far though
>>
File: ComfyUI_00004_.png (1.14 MB, 1024x1024)
>>
>>108974141
I saw an old git repository for local gen, the LoRA_Easy_Training_Scripts linked in some posts, but setting anything up on nvidia 50 series is a bitch. It's been a minute since I tried, though.
>>
File: ctma9p.png (1.31 MB, 1024x1024)
>>
>>108974182
>Ideogram
what release?
>>
>>108974301
>support is only in comfy nightly so far though
There is support in sdcpp :^) not sure which models it works with yet...

https://github.com/leejet/stable-diffusion.cpp/pull/1585
>>
>>108974301
>support is only in comfy nightly so far though
ah. i guess I'll wait for a bit then, but it seems promising
>>
>localkeks have more safety filters than SaaS
LOOOOOOOOL
>>
>>108974365
>>108974182
https://huggingface.co/Comfy-Org/Ideogram-4
this???

I don't understand, not joking lol
>>
File: 215430CUI_00001_.png (1.13 MB, 1536x1152)
What sampler do you guys use? I keep switching between er_sde and dpmpp 2m sde.
>>
>>108974374
also, is this an upscale model? idk

>pid_flux1_512_to_2048_4step_bf16.safetensors

There's always so much new stuff.

>>108974376
>ah. i guess I'll wait for a bit then, but it seems promising
It looks like it can be tested out using sd cpp.

I'll have to try it out.
>>
>>108974085
I think that helped a lot actually, thanks.
I was doing VAE Decode --> Encode in between each sampler. Apparently Anima doesn't like that and it makes faces slightly uglier. Illustrious was fine with it.
>>
>>108974391
euler_cfg_pp
>>
>>108974397
>also, is this an upscale model?
it can be used for ZIT upscaling at least
https://github.com/Comfy-Org/ComfyUI/pull/14103
>>
>>108974416
It's extremely confusing.

found the doc:
https://github.com/leejet/stable-diffusion.cpp/blob/master/docs/pid.md
>>
why is vae decode such a resource hog?
>>
>>108974391
res_2m with sgm_uniform and er_sde with bong_tangent
>>
>>108974378
More? Anon and math are enemy
>>
>>108974431
goes from tiny little ultra compressed latent into full pixel image that needs to fit inside your GPU's VRAM
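
Back-of-envelope numbers for that, as a sketch. The channel counts and 8x compression are assumptions (they vary per VAE), and this only sizes single tensors in fp16. The point is that the final RGB image is small and the intermediate feature maps at full resolution are the expensive part:

```python
def tensor_mib(channels, height, width, bytes_per_value=2):
    """Size of one channels x height x width tensor in MiB (fp16 by default)."""
    return channels * height * width * bytes_per_value / (1024 ** 2)

image_h = image_w = 1024
downscale = 8           # assumed spatial compression of the VAE
latent_channels = 16    # assumed; older VAEs use 4
decoder_channels = 128  # assumed width of the decoder's full-resolution layers

latent = tensor_mib(latent_channels, image_h // downscale, image_w // downscale)
pixels = tensor_mib(3, image_h, image_w)
activation = tensor_mib(decoder_channels, image_h, image_w)

print(f"latent:               {latent:8.1f} MiB")
print(f"decoded RGB image:    {pixels:8.1f} MiB")
print(f"one full-res feature: {activation:8.1f} MiB")
```

The decoder holds several of those full-resolution feature maps at once, and the cost grows with pixel count, so doubling both sides roughly quadruples it.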
>>
>>108974427
>>108974416
ok.

>In stable-diffusion.cpp, PiD currently runs as an image edit pipeline

so sdcpp doesn't have proper pid support yet. if you want that, nightly.

But, this sounds like a cool use of pid.
>>
>>108974440
Just say "because math"
>>
>>108974346
Cool. What model?
>>
It's really weird to hear about ideogram. I did a lot of gens with that back then, but I ditched it when Flux dev 1 came out.
>>
File: 221138CUI_00001_.png (939 KB, 1152x1536)
>>108974432
>bong_tangent
lol
that's new
>>
File: ComfyUI_00012_.png (258 KB, 1024x1024)
>>
>>108974431
Related to this but I don't get how tiled decode works compared to regular decode. Why does my RAM usage shoot up with normal vae decode but not with tiled decode, when they're both still storing everything in RAM while decoding?
>>
>>108974470
It's img2img with flux1-dev, but t2i does pretty much the same thing (see strength param).

# assumes $P (prompt) and $ofile (output path) are already set in the shell
./sd-cli.exe \
--diffusion-model ../models/flux1-dev-q8_0.gguf \
--t5xxl ../models/t5xxl_fp16.safetensors \
--clip_l ../models/clip_l.safetensors \
--vae ../models/ae.sft \
-H 1024 -W 1024 \
-i ../Documents/1.png \
--strength 0.78 \
-o "$ofile" \
-p "$P" \
-s -1 \
--sampling-method euler \
--steps 20 \
--guidance 3.5 \
--cfg-scale 1.0 \
--clip-on-cpu \
-t 8
>>
Has anybody switched from using comfyui manager (maybe still using it just for reference but not to start the download) in order to manage comfyui using pixi instead of pip?
>>
Has nvidia pid been adapted for wan 2.2 instead of sdxl?
>>
>https://civitai.red/models/2668799/cyberrealistic-anima
slop time
>>
>figured out how to use the extra model paths after 3 hours before chatgpt immediately corrected me
>finally fire up comfyui
>have to tinker around just to save images on another drive as something other than png
ah, so it begins
>>
File: 175568.jpg (1.1 MB, 1532x3165)
>>
>>108974499
pretty sure economizing vram is the purpose of decoding tiles instead of the whole target image in one go.
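
A sketch of the tiling idea on latent coordinates only, assuming square tiles with a fixed overlap. The tile size, overlap and function names are made up, and real implementations also blend the overlapping strips, which is not shown:

```python
def tile_starts(length, tile, overlap):
    """Start offsets so tiles of size `tile` cover `length`, sharing at least `overlap`."""
    if length <= tile:
        return [0]
    stride = tile - overlap
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)  # last tile sits flush with the edge
    return starts

def tile_boxes(height, width, tile=64, overlap=8):
    """(top, left, bottom, right) boxes covering a height x width latent."""
    return [(t, l, min(t + tile, height), min(l + tile, width))
            for t in tile_starts(height, tile, overlap)
            for l in tile_starts(width, tile, overlap)]

boxes = tile_boxes(128, 128)  # latent of a 1024x1024 image at 8x compression
largest = max((b - t) * (r - l) for t, l, b, r in boxes)
print(len(boxes), "tiles, largest covers", largest, "of", 128 * 128, "latent positions")
```

Only one tile's worth of decoder activations has to exist at a time, so the peak scales with the tile area instead of the whole image. The finished image still has to be stored either way, but that part is small.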
>>
File: ComfyUI_00016_.png (1.27 MB, 1024x1024)
>>
File: anima1_00078_.jpg (540 KB, 1344x1728)
nsfw https://files.catbox.moe/xwik7e.jpg
>>
File: 1758341443714999.png (3.33 MB, 2048x1024)
>>108974301
from a few tests with a slop custom node, it can be pretty neat. seems like the more your image looks like an illustration the worse it gets though.
also slightly changes colors
>>
>>108974551
>early access 7k buzz
>"semi realistic"
holy kek
>>
>>108974551
the ai stare
>>
Instead of 30 FPS videos, is there a way to set up a workflow so you can instead generate frames in continuity with each other one at a time? What if 4 FPS is enough for me?
>>
>>108974589
Nice
>>
>>108974635
Yeah. Looks like he tuned the worst ai slop in
>>
>>108974551
>"i cant get proper realism into the model so ill call it "semi-realistic"
>unironically charge people money for it
who is even the audience for this
>>
https://youtu.be/XogoQnkQUO8?si=Ah7Nb_pE49-CLGG2
why has noone talked about this?
>>
>>108974589
i respect the idea but its so blurry and burnt
>>
>>108974698
filmed on nokia
>>
File: ComfyUI_Anima_03226_.png (1.39 MB, 1344x960)
>>
>>108974630
how do you make stereo images like this?
>>
File: 1751148517247041.png (98 KB, 526x954)
>>108974759
should be a default node
>>
File: ComfyUI_Ideogram_0001.png (73 KB, 703x1251)
>>
>>108974785
And a new record on you in their Redis database. A.I watch you!
>>
>>108974785
Do these niggas really think people should only use models for generating pictures of dogs?
>>
File: 1769651642728811.png (3.03 MB, 2048x1024)
>>108974630
photograph example with lora
>>
>>108974772
cool, thanks
Just had an idea that maybe one could use qwen image edit to change the angle on one of these images by like 1 degree and then stitch them together. Could maybe make the 3d effect a little stronger.
Something to experiment on some other day
>>
File: z052dt.png (1.74 MB, 1024x1024)
>>
Im out of the loop. But just from what I'm seeing here, this looks like what happened when AuraFlow got released (anyone remember that?): the service the model got its training data from had a cat that would appear when a prompt was blocked. It looks like they did the same here, but instead of a cat it just generates text.
>>
>>108974785
do apikeks really? This would never happen with a local model, local is free and uncensored!
>>
>>108974817
it's not that unreasonable actually.
Cats are also great source material for fun images.
>>
File: 232133CUI_00001_.png (1.08 MB, 1536x1152)
>>108974785
catbox?
>>
https://research.nvidia.com/labs/par/byg/
bygots, rise up
>>
File: 1758277066069077.png (16 KB, 613x207)
>>108972914
gayest thing i could find
>>
Censorship aside, how are the outputs from Ideogram that do get through? Are they good?
>>
File: ComfyUI_Anima_03259_.png (1.28 MB, 1344x960)
>>
>>108974957
this looks like the brightness adjust for a horror game
>>
https://www.reddit.com/r/StableDiffusion/comments/1tw6c4y/sorry_not_sorry_ideogram_jailbroken_in_1_easy_step/

seems people have figured out the censor slop?
>>
>>108974968
I am getting shit. Some anatomy errors too.
I only tested a few images in Turbo and Default though.
It seems you need to do some really fucking tedious json autism if you want good results.
>>
File: 4689876.webm (2.5 MB, 256x448)
>>
File: Ideogram_00012_.png (1.51 MB, 1024x1024)
Such a shame because its textual capabilities are impressive.
Mogs anything else local and most API models too.
This is default. (20 steps)
>>
>>108974488
it's totally not lol. Here's what it looks like.

always visualize your sigmas. then you watch in the preview, whatever step, you know what level of detail is being worked on. All public models act like this, big to small.
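
Text-only sketch of "visualize your sigmas", using the Karras schedule as a stand-in because its formula is well known. This is not bong_tangent or beta57, and the sigma_min/sigma_max defaults are illustrative values, not taken from any model here:

```python
def karras_sigmas(n, sigma_min=0.03, sigma_max=14.6, rho=7.0):
    """Karras et al. noise schedule: n sigmas going from sigma_max down to sigma_min."""
    min_inv = sigma_min ** (1 / rho)
    max_inv = sigma_max ** (1 / rho)
    return [(max_inv + i / (n - 1) * (min_inv - max_inv)) ** rho for i in range(n)]

sigmas = karras_sigmas(10)
for step, sigma in enumerate(sigmas):
    # Bar length is proportional to the noise level at this step.
    bar = "#" * round(40 * sigma / sigmas[0])
    print(f"{step:2d} {sigma:8.3f} {bar}")
```

High sigma steps are where the big shapes get decided and the low sigma tail is fine detail, which is the "big to small" part.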
>>
>>108975122
unc here. Learn gimp, really, this is dumb lol

but ideogram brings back memories. I did the whole 360 degree Janduz (sp) cycle.
>>
>>108975129
Here's beta57. It's flatter.
>>
File: Ideogram_00013_.png (1.57 MB, 1024x1024)
>>108975122
Even works at turbo.
>>108975134
I know gimp unc, I am just testing shit, practical value be damned.
Usually when they say "our model has great text" they mean some stupid benchmeme but this model actually has great text.
>>
>>108975122
>>108975134
here
https://archive.org/details/new-360-symbolic-degrees

These are a good source of test material. It's very interesting to see that since that time attitudes and laws relating to nudity are both more strict, and at the same time homosexuality is very legal and common. I prefer the time when they were in the closet and nudity wasn't a high crime punishable by systemic rape.
>>
>>108975156
I'll respect it for its speed, at least.
>>
i'm out of the loop. why is this ideogram release noteworthy? is it supposed to be the best local image model? or are people just interested because it's new
>>
Been trying to get a decent icon of a hand pulling a photo out of a librarian's drawer... this is the best so far and it's still shit

Skill issue, I know.

Close to switching back to black-on-white just because the icons are so much easier to make
>>
>>108975122
there is something wrong with the inference, the images are all so bad. it shouldn't be because of fp8, right?
>>
>>108975190
I am suspecting something might be off with the schizo workflow Comfy ships as well, but I am not sure.
No, I don't think it's because it's FP8. An FP8-only release sucks for training and making more quants, but it shouldn't tank the inference this much.
>>
>Discussion and Development of Local Image, Video, and Music Models
>Music models
Was that always there?
>>
>>108974785
So, on top of wrestling with samples, complex multi pass workflows, plastic skin, melted hands and feet, catastrophic forgetting and failed LoRAs just to pull off what should be a stupidly simple concept on local models, now we also have to fight this brand new flavor of censorship? Fantastic.
>>
>>108975211
We're dying out here, we need to bring in some fresh anons. For some reason Catjack has been spamming his gens all week long.
>>
>>108975211
:) good to see.
>>
>>108975228
Catjak just constantly spergs at random anons. Actual thread lolcow
>>
>>108975041
>Arbitrarily ablating so many layers at random weights
Enjoy the body horror and incomprehensible AI nightmares.
Maybe if you could figure out a way to disable censorship by only slightly scaling (0.8-0.9) a small number of layers it could be useful.
Like it seems to work but results aren't good.
This sigma crap another redditor linked in the thread:
https://www.reddit.com/r/StableDiffusion/comments/1tw6gmq/ideogram_safety_filter_is_removed_by_using/
didn't work for me but other timestep shenanigans seem worth experimenting with.
>>
>>108975235
I love Jack the Cat, I wish we had more anons dedicated to this general. This general has so much potential but it needs more love desu
>>
>>108975232
It was added only 2 threads ago....did any new models come out?
>>
>>108975264
i like my ltx music
>>
>>108975041
>>108975252
just wait for the sarah peterson patch, she'll fix it
>>
>>108975283
LTX can do music?
At what length?
>>
>>108975264
We have ace step xl sft, with dcw.

I'll go ahead and start work on my next song, I guess lmao.

The key thing is to realize that prompting is very important. If your prompt is bad, you can do a2a with cover strength at 0.3. this can mitigate some issues. you need good audio equipment to hear ace step 1.5 xl gens in their full glory, though they only are essentially at idk 48kz mp3 maybe in total quality, ok? a paradox! like one of those women that has a wang.
>>
>>108975295
it can go on forever, but i haven't tried making a full song with it since you can only extend it in short increments in order to give enough memory for the context window to fit enough of the song to remain consistent
>>
>>108975300
I'm not fully familiar with that functionality, any guides or recommended baby's first settings?
>>
>>108975302
That sounds tedious
>>
>>108975320
yes, it takes forever when you want actual lyrics since it takes a long time to generate the extension, and then you can find that the singer doesn't say the correct thing. but i think it sounds okay considering it's a video model
https://files.catbox.moe/fuxnsb.mp4
>>
>>108975252
>>108975041
>>108974785

I personally hate all you and your "I have money and a GPU" vibe and I fantasize daily about watching you suffer BUT EVEN SO I think you should not be using Ideogram or messing around trying to jailbreak it because:

1: If Ideogram sees people deliberately avoiding their model because of this new censorship, it will pressure the people who came up with this nonsense to rethink their approach. Hit them where it hurts and that is the usage stats.

2: If you keep trying to jailbreak it, you are making things worse for everyone. Every new model they drop will be harder to crack, more labs will start copying this censorship playbook, and the whole local ecosystem becomes a giant headache for all of us.

The best move here is to go on strike and stop supporting anti gonner models.
>>
>>108975336
why does local attract so many blind and deaf people?
>>
>>108975315
steps 100, cfg scale between 6 and 13, shift as high as 11, dcw mode double, dcw scaler 0.0008, dcw high scaler 0.0005, ode euler

that's where I've settled. There's another person who is ahead of me on this.

other models need rhyming, ace step doesn't need it, idk if it even helps - much.

but it still works better with quatrains of about the same number of syllables.

the biggest tip is don't use "compose" and keep the audio codes thing blank. well, imo it's better. that keeps it squared up, but I don't like it. But when you do this, realize your prompt is kind of sequential. It relates to how clip works. I have not fully figured out how prompting works, because it's weird how it works. It knows descriptions of sounds.
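
For anyone wondering what the cfg scale in those settings is doing: a minimal sketch, assuming the usual classifier-free guidance mix. Whether ace step cpp implements it exactly like this is not something confirmed here, and `cfg_mix` is a made-up name for illustration.

```python
# Sketch of the standard classifier-free guidance mix (an assumption about
# what "cfg scale" means here, not taken from any specific codebase).

def cfg_mix(cond: list[float], uncond: list[float], scale: float) -> list[float]:
    """Push the prediction away from the unconditional one by `scale`."""
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]


if __name__ == "__main__":
    cond = [1.0, 2.0]    # toy prediction with the prompt
    uncond = [0.0, 1.0]  # toy prediction without the prompt
    for scale in (1.0, 6.0, 13.0):
        print(scale, cfg_mix(cond, uncond, scale))
```

At scale 1 you just get the prompted prediction back. Everything above that exaggerates the difference between prompted and unprompted, which is why very high values tend to overcook things.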
>>
you will never date your ai generated 1girl
>>
>>108975337
this is whats going to happen
redditors will find a trivial jailbreak then everyone will use it for a while, realize its shit then forget about it and go back to klein/zit
>>
>>108975353
Meanie
>>
>>108975337
>I personally hate all you and your "I have money and a GPU" vibe and I fantasize daily about watching you suffer
Why???????
>>
>>108975353
the trick is to become the 1girl
>>
Has anyone gotten the comfy workflow to work for ideogram4 using the fp8 models? I have a 4090, I get "mul_cuda" not implemented for 'Float8_e4m3fn'. I'm pretty sure Ada supports that so I'm confused. I'm on nightly. I guess I'll have to wait more. Very strange of them to not release the fp16 model. If I had that, it would work, since I have the VRAM.
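
For context on that error: Float8_e4m3fn is mostly a storage format, and the usual pattern is to upcast to a wider type before doing math on it. Below is a pure Python sketch of what one e4m3fn byte decodes to and what "upcast, then multiply" means. The helper names are made up for illustration and this is not how ComfyUI or PyTorch actually implement it.

```python
# Sketch: decode FP8 E4M3FN bytes by hand, then do the math in normal floats.
# Layout: 1 sign bit, 4 exponent bits (bias 7), 3 mantissa bits.
# No infinities; the only NaN pattern is exponent 1111 with mantissa 111.

def decode_e4m3fn(byte: int) -> float:
    sign = -1.0 if byte & 0x80 else 1.0
    exp = (byte >> 3) & 0x0F
    man = byte & 0x07
    if exp == 0x0F and man == 0x07:
        return float("nan")
    if exp == 0:
        # subnormal: no implicit leading 1
        return sign * (man / 8.0) * 2.0 ** -6
    return sign * (1.0 + man / 8.0) * 2.0 ** (exp - 7)


def upcast_mul(weights_fp8: bytes, scale: float) -> list[float]:
    """Upcast each stored byte to a float first, then multiply."""
    return [decode_e4m3fn(b) * scale for b in weights_fp8]


if __name__ == "__main__":
    print(decode_e4m3fn(0x7E))                   # largest finite value
    print(upcast_mul(bytes([0x38, 0x40]), 0.5))  # 1.0 and 2.0, scaled
```

So the hardware supporting fp8 is not the whole story. The specific op also needs a kernel for that dtype, otherwise the weights have to be cast up before the multiply.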
>>
whats so cool about ideogram4?
>>
>>108975401
its new and powerful
>>
>>108975401
its a cuckold simulator
>>
>>108975369
whats the best prompt for this
>>
>>108975393
Why are you supporting a model that doesn't want (You) using it? Use Anima cuckie.
>>
File: 1752555687172871.png (1.5 MB, 1024x1024)
1.5 MB PNG
>>
>>108975349
I see. I have been using the gradio UI and just settled on 200 steps with heun. For me it's been best for whatever they call the guidance in that interface to sit at 2.5-3.
>>108975336
What made you decide to try this?
I guess there isn't much discussion on music gen, I didn't see much of it in the threads
>>
>>108975295
https://files.catbox.moe/8s5ca3.mp4 (warning - nudity)
Yes, and sometimes it's pretty good, but it doesn't follow the prompt all that well when using 8-step distilled, and I don't have the patience to wait longer. I take the music as a nice freebie when it comes out OK.
>>
>>108975356
Well, yes.

There's nothing here to be excited about, unless you are REALLY interested in generating text, as in less than 1% of local users.

Eventually we will have a ZiT/Klein killer model, but this sure ain't it.

Looks like anima is finally dethroning the SDXL finetunes for anime stuff though.
>>
>>108975432
Aside from Anima overtaking Zeta Image and Klein, I keep noticing more and more realistic loras and finetunes for Anima.
>>
>>108975432
>Looks like anima is finally dethroning the SDXL
all I've seen is half sticking with IL and half going with anima. mostly speed complaints
>>
>>108975427
Interesting
I'm going to hone my skills with Ace Step 1.5, I think it has a lot of potential. The only problem I have is the noticeable quality jump with the stl or whatever the top end model is: there's not much gain around 80 steps, then it just jumps up around 180-200 steps. I guess it's 400 when you use heun.
>>
File: image.png (1.04 MB, 1024x1024)
1.04 MB PNG
>>108975415
Anima is undertrained garbage which gens hands like 2023 dall-e. Also, you tell me not to gen porn with your model, I'm going to gen porn with it, you know?
>>
>>108975424
>What made you decide to try this?
sometimes i get random music in my videos, so i thought it had lots of music in the data
>>
>>108975129
>>108975143
Very interesting, thanks.
I'll take a look.
>>
>>108975432
>Looks like anima is finally dethroning the SDXL finetunes
All the actual skilled prompters and artists moved to it after preview 1 released desu
>>
>>108975447
how easy is it to finetune anima? i have over 100k real photos
>>
I want to use Ideogram alongside windows 12 and nodes 2.0
>>
>>108975467
if you can afford to fine-tune just do the base cosmos model and get rid of the grifter licence
>>
>>108975480
Anime dataset is actually good for NSFW realism, it gives more hot and creative compositions
>>
>>108975467
Default training params work fine
>>108975480
Since he can tune he's probably not a jeet so he doesn't have to worry about the licence
>>
>>108975452
id be more than willing to help you get better outputs with anima but you strike me as the kind of anon who doesnt want help and would rather complain
>>
>>108975467
I don't know
>>
>>108975452
Why you follow me everywhere i go?
>>
>grifter licence
Has anon found any proof of this yet or is he still just trolling
>>
>>108975467
easy
>>
>>108975494
why did you even bother replying to him lol
>>
>>108975424
>settled at 200 steps at heun
ace step cpp doesn't have heun, and is capped at 100 steps. idk why.

dcw needs to change depending on how many steps you use.

If I ever go back to comfyui for ace step, I would use exp_heun_2_x0_sde for the sampler, and I always use tan2 for my scheduler now. It's like a double Z shape. basically, bong tangent "rushes" through the mid sigmas, but tan2 has an adjustable plateau in the middle, or wherever you want it. anyway, these seem to be the best sampler + scheduler, so I think. Shame the sampler isn't on sdcpp, and shame I have to use comfyui to collect my sigmas, not that it takes that long.
>>
>>108975504
Just so you know, whoever you are, it looks really sus from the outside when someone always jumps to defend one specific model this fast. We were talking shit about Ideogram earlier and nobody said a word, but the second Anima comes up, suddenly there's a white knight in the thread. That's a little too convenient bro.
>>
>>108975504
tdruss was bitching about not making enough money abloobloo
>>
>>108975558
Wait, there's a C++ version? Well fuck!
What do you gain from that version?
>>
File: output_1780533942.png (1.56 MB, 832x1216)
1.56 MB PNG
>>108975134
>>108975164
Here's zit doing the first degree, I didn't prompt or prompt enhance, just pasted it in.
>>
>>108975568
It runs on my rdna2 card, and has dcw, and a2a.

rdna2 is kind of the runt of the cards, amd has partially dropped support.
>>
>>108975577
>AMD shitting the bed
It's all so tiresome it's like they throw the match on purpose
>>
guys guys
remember
anima is le bad >:^(
>>
>>108975585
Yep, I'm through with all the companies. It's clear they have conspired to limit ram, and to limit "cuda" matrix math.
>>
>>108975041
>>108975252
Thinking again, how much does the theory even make sense here?
They modified the relevant weights during post-training process and taught the model to draw the grey image when presented with forbidden conditioning.
So we want to disrupt the layer weights in such a way that:
1) It doesn't completely cripple the model or make it too weird, so essentially a small enough delta on as few layers as possible
2) No longer draws the safety filter image
3) Instead draws whatever it knew about the naughty conditioning before post-training?
The latter doesn't seem very possible through ablation. I guess the realistic goal here is to make it less prone to ludicrous false positives. If we assume that they fucked up the training and unintentionally fried the model and that's why it is so trigger happy, it might be possible to moderately clamp select few amount of probably the composition related middle layers and no longer get so many safety filter images, without also raping the model.
But I dunno finetuning seems like a much better way out of this mess.
I was thinking about putting some combinations into comfy oven before going to bed, and see if anything interesting comes up when I wake up, but I am now reconsidering if this is worth it.
>>
The last remaining hope is Celestial, it's obvious that amd intentionally nerfed the matrix math on rdna4.
>>
>>108975562
>>108975563
So no proof yet? Shame.
>>
what do we want?
matrix math
when do we want it?
NOW!
>>
>>108975471
theyre calling him the most pozzed genner known to anon
>>
>>108975622
I'm disappointed he's going to use bare metal instead of cloud.
>>
>>108975563
dont worry bro im sure soon someone will join your team to make apache2 anima
>>
>>108975654
Funny how that became a nothing burger.
>>
File: output_1780537152.png (1.63 MB, 832x1216)
1.63 MB PNG
>>108975571
aries 2 of Janduz, not prompted, maybe I should come back and prompt.
>>
>>108975663
Sometimes a model can do one thing well, and feels like it should be a lora. Like ovis image whatever, it can do cartoon text really well. but I don't think it's good enough to do i2i, so pointless, I guess.
>>
File: ideo.jpg (444 KB, 1088x1936)
444 KB JPG
>>
>https://echo-team-joy-future-academy-jd.github.io/Echo-Infinity/
New model released that lets you generate 24 hour long videos based on Wan.
>>
Everyone's favorite model is released.
https://civitai.red/models/2544636?modelVersionId=2983680
>>
>>108975683
>24 hour long videos
They're trying to kill the coomers, aren't they ?
>>
>>108975696
At least this time he's not lying and has labeled it as a merge
>>
>>108975696
Why is everyone so hyped about this? Legit question.
>>
>>108975723
>Why is everyone so hyped about this?
It's Pride Month. Let 'em get loud.
>>
>>108975723
>everyone
Sure Jan...
>>
>>108975723
Same reason people get hyped about dogshit popular music. Probably a mixture of shilling and some kind of viral snowball effect past a certain point.
>>
>>108975723
>Why is everyone so hyped about this?
some struggle to "stabilize" outputs from raw finetunes so they need something like WAI with a rigidity that compensates for their lack of prompt-fu
>>
music enthusiasts here? what's the meta for local musicgen now? i'm interested in melodic instrumentals. was having fun with audiocraft a year ago.
>>
>>108975754
I am but AI will never come anywhere close to generating anything I could remotely enjoy (Classical) so I don't even bother
>>
File: 023341CUI_00001_.png (1.31 MB, 1536x1152)
1.31 MB PNG
>>108975736
Hm, the output is indeed a lot cleaner: >>108974243
>>
>>108975781
sovl vs slop, as always with these shitmixes
>>
>>108975754
>what's the meta for local musicgen now?
FL Studio
>>
>>108975787
wai pretty much always was a nice finetune tho
>>
>>108975494
I don't do a whole lot of imagegen anymore except to feed into ltx-2.3 i2v, and nsfw sdxl models are good enough for that. Maybe anime will improve with more training. I dunno, I just don't have the free time I used to.
>>
>>108975797
it was never a finetune tho. he just shitmixes random loras into slopshit
>>
>>108975797
>finetune
>>
>>108975781
i wont convince you to NOT use it, i dont care about that. my only point was that mixes and merges are designed to give a default style which some prefer and others do not. if you like the style it brings then by all means use it, but i dont really care for models that have their own built in style. it usually leads to them looking less like real images and more like generated outputs.
>>108975797
>finetune
lel
>>
What if I train a lora with the "image blocked by safety filter" images it outputs, tagged with a diverse set of the prompts it rejects and apply it with -1 strength at runtime? What would happen?
>>
>>108975802
>I just don't have the free time I used to.
the future belongs to the zoomers and gen alpha, old man
>>
>>108975823
negative loras dont work, you may prevent that from being generated but it will generate garbage
>>
File: 024743CUI_00001_.png (1.39 MB, 1536x1152)
1.39 MB PNG
>>108975815
Ok, so this is the same input from the collage in the OP. I used a sketch lora for that look and WAI seems to completely ignore that. I assume it's because it's a merged model and the lora doesn't work very well and not because it is overriding the effect it is supposed to have on the image? I don't know much about how this works, sorry.
>>
File: ComfyUI_Anima_03269_.png (1.28 MB, 1344x960)
1.28 MB PNG
>>
json prompting is the gayest thing on the planet
>>
>>108975855
What if I apply it at +1 strength at the unconditional model then?
Could this model's unique structure make it work?
>>
>>108975871
does your lora have trigger word? have you tried increasing the lora strength?
>>
File: 025827CUI_00001_.png (1.55 MB, 1536x1152)
1.55 MB PNG
The N64 lora seems to work really well.

>>108975897
It does. I can also try increasing the lora strength. Hold on.
>>
File: ComfyUI_Anima_03277_.png (1.29 MB, 1344x960)
1.29 MB PNG
>>
>>108975894
It's a humiliation ritual for sure. Great prompt adherence has questionable value when no one will sit through the tedium.
I think it's meant for some agentic loop with LLM in the middle, I don't think they expect us to type that garbage by hand.
Still sucks though.
>>
File: 030506CUI_00001_.png (1.44 MB, 1536x1152)
1.44 MB PNG
from 1.00 strength to 1.30. Doesn't look much different. I'll crank it up to 2
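
For reference, a minimal sketch of what the strength number scales, assuming the textbook LoRA formulation where a low-rank delta is added onto the base weight. The helper names are made up for illustration, and ComfyUI's actual loader does more than this (per-layer keys, separate model and text encoder strengths and so on).

```python
# Sketch: LoRA as "base weight + strength * low-rank delta", on plain lists.
# Assumption: textbook LoRA math, W' = W + strength * (alpha / rank) * (up @ down).

def matmul(a: list[list[float]], b: list[list[float]]) -> list[list[float]]:
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def apply_lora(weight, up, down, strength=1.0, alpha=None):
    """Return a patched copy of `weight`; `down` has one row per rank."""
    rank = len(down)
    scale = strength * ((alpha / rank) if alpha is not None else 1.0)
    delta = matmul(up, down)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(weight, delta)]


if __name__ == "__main__":
    base = [[1.0, 0.0], [0.0, 1.0]]  # toy 2x2 layer weight
    up = [[1.0], [2.0]]              # 2x1
    down = [[0.5, -1.0]]             # 1x2, so rank 1
    for strength in (1.0, 1.3, 2.0, 2.5):
        print(strength, apply_lora(base, up, down, strength))
```

The delta grows linearly with strength, so going from 1.0 to 1.3 is only a 30% bigger nudge. If the checkpoint's own weights pull hard toward one style, a small bump like that can easily get lost.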

>>108975930
Sick
>>
>>108975754
I'm still new to it but Ace Step seems to do well if you can prompt things out. There's a fuck ton of settings I don't understand but so far so good
https://vocaroo.com/1gFu5B3LIcBC
>>
File: 030905CUI_00001_.png (1.13 MB, 1536x1152)
1.13 MB PNG
lol it added femkuna to the image
I guess it does look lil bit sketchier though
>>
>>108975941
I just use an llm to generate the prompt. They have the system prompt they use in their github.

https://github.com/ideogram-oss/ideogram4/blob/main/src/ideogram4/magic_prompt_system_prompts/v1.txt

I have been able to generate nsfw images. I don't think I have triggered the safety image once.

I don't have much opinion so far on quality, but it is decent enough.
>>
File: 031344CUI_00001_.png (1.32 MB, 1536x1152)
1.32 MB PNG
2.50
>>
File: ComfyUI_Anima_03292_.png (1.07 MB, 1344x960)
1.07 MB PNG
>>108975947
Thanks.

Anima is actually pretty fun once you get the hang of it.
>>
>>108975965
it doesn't look better than zit though. I don't give a fuck about text
>>
Are the luddites dead yet
>>
File: 032512CUI_00002_.png (1.4 MB, 1536x1152)
1.4 MB PNG
>>108975983
Check the wlop lora on civitai. It has a similar artstyle.
>>
>>108974431
because you didn't
--disable-dynamic-vram
>>
>>108975369
ywn
baw
>>
File: ComfyUI_Anima_03320_.png (1.44 MB, 1344x960)
1.44 MB PNG
>>108976013
funny enough, I tried "@wlop", but didn't like the result, so I removed him from the mix.

I'll try the lora though.
>>
File: 1778558574961652.png (1.57 MB, 1344x960)
1.57 MB PNG
>>108976032
Man, the plastic slop people tolerate...
>>
>>108976075
desu still very much plastic
>>
>>108976080
Get your eyes checked, sis
>>
>>108976018
Comfy keeps threatening that they will remove this option, but I'm starting to doubt they ever will since they suck so bad at keeping a decent vram threshold with dynamic-vram, meaning you will OOM

People would rather use more system ram and take the ~5% performance hit
>>
File: ComfyUI_Anima_03341_.png (1.47 MB, 1344x960)
1.47 MB PNG
>>108976075
stop ruining my slop
>>
>>108976002
Nope but they are well on their way out. They're in the violently attacking high profile figures phase. People don't suffer that kind of behavior long.
>>
File: BasevsWAI.jpg (521 KB, 3075x1146)
521 KB JPG
Yeah, I think I get what Anonymous meant when he said 'mixes and merges are designed to give a default style'. The checkpoint seems to steamroll most of the loras I've tried.
>>
>>108976136
holy shit wai is such a shitty slop, how can anyone like this
>>
>>108975349
>dcw mode double, dcw scaler 0.0008, dcw high scaler 0.0005,
This does make a difference, way better separation
>>
>>108976149
Yeah, I remember the anon posting about dcw. dcw is supposedly originally for images, but you don't see anyone doing it.
>>
>>108976083
its okay you can still try again
>>
whats wai?
>>
File: output_1780546927.png (1.89 MB, 832x1216)
1.89 MB PNG
best 1girl, you can't even compete.
>>
someone make Bitcoin-chan hanging from a noose
thanks
>>
>>108976218
you do it
>>
File: 1777152399008990.png (559 KB, 896x1152)
559 KB PNG
>>108976226
I can't get the ₿ logo on the shell
someone edit it
>>
>>108973324
it's a grifter
>>
>>108976218
>still bag-holding in 2026
>>
>>108976240
no, I tethered up when it was above 100k
but I plan to start DCA once it goes below 60k
>>
>>108976240
>being chinese
never be chinese

Remember the adage "a chinaman's chance"
>>
>>108976246
>>>/biz/
fuck off
>>
File: output_1780547872.png (1.9 MB, 832x1216)
1.9 MB PNG
>>108976203
>>
File: ComfyUI_00129_.png (1.04 MB, 1024x1024)
1.04 MB PNG
>>108976251
>>
>>108976257
Fine, retail stonks will get smacked in the summer. You want your biz, there it is.
>>
What should I generate?
>>
>>108976313
>What should I generate?

>>108975164

also, you can throw it into an llm and add a modifier, like steampunk, videogame scene, renaissance painting, dark anime.
>>
>>108973545
>>108973550
thanks!
>>
>China, Germany, USA, Singapore, Israel, China, China, China...
When will we see a French image model? Or one from South America? Australia? Sweden?
>>
>>108976338
right after they import more muslims and jeets
>>
>>108976338
>French
Can anyone besides maybe Mistral train anything worthwhile there?
>South America
Lol
>Australia
Interesting how no major global tech corporation ever came out of there, AI is no exception. In theory it has the ingredients. Too much regressive tax/regulation like Europe?
>Sweden
If you thought BFL's German safetrooning was bad Swedes will probably invent a whole new level of cuckoldry.
>>
https://xcancel.com/thepatch_kev/status/2062140772942774681?s=20
musicgen chads...
>>
>>108976373
https://github.com/betweentwomidnights/sa3-ableton-extension
this looks really cool


