/g/ - Technology




File: collage.jpg (2.73 MB, 4995x2959)
Discussion and Development of Local Image, Video, and Music Models

Previous: >>109058732

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
SDWebUI: https://rentry.org/ldg-lazy-getting-started-guide#the-stable-diffusion-web-ui-lineage
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, & Upscalers
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/tdrussell/diffusion-pipe
https://github.com/kohya-ss/sd-scripts
https://github.com/kohya-ss/musubi-tuner

>Z
https://huggingface.co/Tongyi-MAI/Z-Image

>Anima
https://huggingface.co/circlestone-labs/Anima
https://tagexplorer.github.io/
https://animadex.net

>Qwen
https://huggingface.co/collections/Qwen/qwen-image

>Klein
https://huggingface.co/collections/black-forest-labs/flux2

>Wan
https://github.com/Wan-Video/Wan2.2

>LTX-2.3
https://huggingface.co/collections/Lightricks/ltx-23

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Collage: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
>>
is it over or are we back
>>
Blessed thread of frenship
>>
>>109063941
>Discussion and Development of Local Image, Video, and Music Models
then why do you always deflect it into hating "Julien" when the discussion involves things you don't like to hear?
>>
reminder anima is useless for finetunes and will never dethrone illustrious
>>
>>109063923
All of the duo’s issues stem from one person. After somehow achieving omnipresence, he hijacked one of the web’s biggest AI threads to manufacture support, then relocated the entire userbase to a new general solely to isolate and alienate them.
Is that what you're saying anon?
>>
---BEWARE ANON---
there is a poster in this thread who hates this thread and will do anything to attempt to derail discussion. do not fall for his trolling. you can see his first post here >>109063964 just ignore him.
>>
reminder that if the top jeets of an old and outdated model cant into training the new local anima SOTA then its a problem with the SOTA not the jeets for some weird reason
>>
>>109063994
Please answer the above post, I'm just following what you said previously.
>>
>>109063975
>reminder anima is useless for finetunes
why, pls explain
>>
File: 1761581051983416.jpg (524 KB, 1104x1472)
>>109063975
why should I care about shittunes if anima is much better than illustrious as it is?
shittuners should take a bigger model and finetune it for anime. 2B is saturated.
>>
>>109063999
He won't post training params or any actual proof outside of "look at what this person said about it". You can safely ignore.
If there was actual proof it would've been posted by now. Likely anon will reply to this post and link the post in the last thread but, again, that is just someone saying something with no indication as to their training params.
>>
>>109063999
See
>>109061156
>>109061750
>>109063507
also the baker tranny has no proof to refute this proof so just tell her "shoo tranny shoo!"
>>
>>109064011
>>109064012
fucking lmao
>>
>>109064008
pic not related?
>>
>>109064011
>Likely anon will reply to this post and link the post in the last thread but, again, that is just someone saying something with no indication as to their training params.
holy kek you were spot on anon >>109064012
>>
>>109064011
You pulled his card
It's so bizarre how these guys turned out isn't it?
Can you imagine being this way for years at this point?
>>
>>109064022
>>109064024
Shoo tranny shoo!
>>
>>109064011
You've done him. You've absolutely done him.
>>
>>109064027
>>109064024
>>109064022
Somehow a omnipresent anon was behind this despite different writing styles
>>
>>109064027
>>109064031
>>109064032

see >>109064030
>>
I can't remember the last time a model shook up the local ecosystem this hard. It's not like pony or illustrious had this much seething directed at them when they released. Anima truly is the new meta.
>>
>julien has day long melties again
Based, hope it's because he becomes homeless soon
>>
>>109064023
the pic is better than what illustrious can do for me with the same effort put into it
>>
>anon upset that his ruse was called out a mere six seconds before he posted it
holy lel
>>
>>109062937
Upload to box pls
>>
>>109064038
Moving the goal post now are we?
Why are you upset at us?
Why don't you complainers actually follow through on a project and maybe we might respect you?
>>
>>109064055
id delete my post if that happened to me lol
>>
>>109064008
Some care about shitmixes because they have difficulty prompting on raw tunes. It's best to just ignore their posts as they know very little about image generation. And I don't say that to be mean it's just a fact, jack.
>>
>>109064060
Both of the rentry schizos suffer from the same mental illness patterns which is why they keep falling down the stairs. I don't know why the more intelligent one started regressing to the dumber one but here we are.
>>
>>109064011
>>109064012
LMAO
>>
>>109064060
he'll just sprial even more unfortunately he wont kill himself
>>
>>109064011
my favorite part is him demanding proof when he himself has yet to provide any. he really struggles with the concept of the burden of proof desu. he made the first claim.
>>
>>109064083
I hope he recovers some day, he's not even remotely the same person anymore. All he had to do is work on his project and now he's abandoned it and is seething over people getting support to make free models. Does he have any idea how deranged that makes him look?
>>
>localkeks arguing over their 3b slopbake
you retards don't even have a single nsfw edit finetune, sit down you coping browns
>>
oh now hes pivoted to something else how unexpected
>>
/ldg/gods what model should I use for inpainting text edits? I use NEO because comfyui seemed too autistic.
>t. 16gb vramlet noob
>>
>>109064123
klein 9b
>>
File: 1779201750349565.jpg (382 KB, 1600x896)
>>109064070
But Anima is easier to prompt than Illustrious because Anima supports natural language, so prompters aren't forced to fiddle with tags to find the correct combination or to say that one character is looking one way while another is looking a different way.
>>
File: 1774513960371648.jpg (364 KB, 1600x896)
>>109064138
was meant for
>>109064065
>>
>>109064128
It crashed my pc yesterday but I'll try again. Any tips for inpainting with it? Using 4b euler normal and its not doing jackshit when I tell it to edit the masked area
>>
>>109064144
if you have geforce card with 16gb vram then use 9b. if you want to edit text, first place text there manually, whatever you want, after that it's easy to prompt by "change the text 'location' into this and that"
>>
cozy bread
>>
>>109064142
Based.
>>
>>109064158
huh. So it doesn't work like "normal" inpainting then? for comparison I was trying the same with zit earlier and it would add/change text in the masked area but it never looked right
>>
>>109063941
Thank you for baking this thread, anon
>>109063962
Thank you for blessing this thread, anon
>>
>>109064188
No, normal inpainting works differently. Klein uses pics as conditioning steps (together with text).
>>
Looks like he calmed down, must have passed out on booze and alcohol. Now that we can have a real discussion: is SAM2 still the best system to use when doing a multi phase vision pipeline for inpainting?
I tried DINO but it's outdated and would destroy my current setup so decided to try florence2 with SAM2
>>
>>109064229
Have you tried SAM3?
>>
File: Wan21_SCAIL2_00035.mp4 (1.05 MB, 768x768)
>>
>>109064236
I heard it's unnecessary for a multi pass workflow due to its weight compared to 2.
>>
>>109064144
Do not mask image.
Just refer to rough area in prompt.
>>
>>109064226
I'm having a hard time understanding it but I'll try once I get home.
>>109064261
this too thanks
>>
File: 7907808725.png (2.58 MB, 1088x1280)
I think llms sometimes get a bit confused with the shitty YX (instead of XY) coordinates of ideogram... Although I'm certainly they had their reasons this is still ultimately completely stupid and retarded
>>
>>109064283
tell it to use XY and then just swap afterwards?
the state of /g/
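
The "use XY and swap afterwards" idea can be sketched as below. This is only an illustration: the thread says the format is YX rather than XY but does not spell out the box layout, so the `[y0, x0, y1, x1]` ordering and both function names are assumptions, not Ideogram's documented schema.

```python
from typing import List, Sequence


def yx_to_xy(box: Sequence[float]) -> List[float]:
    """Reorder an assumed [y0, x0, y1, x1] box into [x0, y0, x1, y1]."""
    if len(box) != 4:
        raise ValueError("expected four coordinates")
    y0, x0, y1, x1 = box
    return [x0, y0, x1, y1]


def xy_to_yx(box: Sequence[float]) -> List[float]:
    """Reorder an [x0, y0, x1, y1] box into the assumed [y0, x0, y1, x1]."""
    if len(box) != 4:
        raise ValueError("expected four coordinates")
    x0, y0, x1, y1 = box
    return [y0, x0, y1, x1]
```

Letting the LLM write boxes in XY and calling `xy_to_yx` on each one before building the final prompt keeps the confusing ordering out of the model's hands.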
>>
>>109064283
I apologize for this anon being the only representation of pedophilia in this thread, given his retardation I'm assuming he'll be caught soon enough and contribute to pedo IQ statistics

Or maybe he won't because he thought he posted something even comparable to a photoreal kid a thread or two ago
>>
>>109064296
you think that's being "a fucking cunt"
>>
>>109064303
Wait he posted some sick shit?
I have images disabled in this thread because of faggots like that
>>
>>109064313
I take back my post
>>
>>109064303
absolutely mind broken what the hell
>>
>>109064303
r u ok?
>>
>>109064303
An image isn't pedophlic just because it has your epic discordtranny nickname in it, Julien
>>
File: for anon.jpg (606 KB, 2000x1432)
>>109064188
here's quick and dirty example of 2b who catastrophically forgot her outfit

prompt used with 9b
>Change the text on left top side of the image into large bold comic book text overlay that reads: "Catastrophic" on the upper line and "forgetting!" in lower line, with bright white comic sans style font with thick black outline and yellow shadow, on a dark electric cloud.
Needed few rolls before it nailed grammar
>>
File: 5271254022.png (2.08 MB, 1152x1344)
lmao
>>
File: f2kimg2img.jpg (487 KB, 1243x1785)
>>109064345
that's way cooler and cleaner than i thought of for an example
>>
>>109064345
can you get these results at Q8-4?
>>
>mfw Resource news

06/15/2026

>Conditioning Matters: Stabilizing Inversion and Attention in Diffusion Image Editing
https://github.com/zju-pi/SimEdit

>CausalMotion: Structured Physical Reasoning as Keyframe and Trajectory Guidance for Training-Free Video Generation
https://zhuangsh0713.github.io/CausalMotion

>ViT-Up: Faithful Feature Upsampling for Vision Transformers
https://github.com/krispinwandel/vit-up

>Rethinking One-Step Image Editing through ChordEdit: Reproduction, Simplification, and New Insights
https://github.com/Harvard-AI-and-Robotics-Lab/ChordEdit-Reproduction

>Gefen: Optimized Stochastic Optimizer
https://github.com/ndvbd/Gefen

>ComfyUI-Licon-MSR: Custom node for LTX 2.3 Multiple-Subject-Reference LoRA workflows
https://github.com/liconstudio/ComfyUI-Licon-MSR

06/14/2026

>SCAIL-2 GGUFs quantizations
https://huggingface.co/realrebelai/SCAIL-2_GGUF

06/13/2026

>PRXPixel (text-to-image, pixel space)
https://huggingface.co/Photoroom/prxpixel-t2i

>SCAIL Auto Extend
https://github.com/Brobert-in-aus/scail-auto-extend

>MotionBricks: Scalable Real-Time Motions with Modular Latent Generative Model and Smart Primitives
https://nvlabs.github.io/motionbricks

>dyfuzor-web: turns an Excalidraw scene into an Ideogram-4 structured JSON
https://github.com/karolrybak/dyfuzor-web

>sageattention-autotune: Autotuned block sizes and other QoL improvements
https://github.com/woct0rdho/sageattention-autotune

06/12/2026

>ComfyUI-Flux2Klein-Enhancer: Conditioning enhancement and reference latent control
https://github.com/capitan01R/ComfyUI-Flux2Klein-Enhancer

>InterleaveThinker: Reinforcing Agentic Interleaved Generation
https://zhengdian1.github.io/InterleaveThinker-proj

>Experimental Anima LLLite Regional Controlnet
https://huggingface.co/Sen-sou/Anima-LLLite-Regional-Controlnet

>World Tracing: Generative Pixel-Aligned Geometry Beyond the Visible
https://haoz19.github.io/world-tracing-page
>>
>mfw Research news

06/15/2026

>OmniDirector: General Multi-Shot Camera Cloning without Cross-Paired Data
https://ymlinfeng.github.io/OmniDirector.github.io

>Prompt2Effect: Training-Free I2V Model Specialization via LoRA Generation
https://arxiv.org/abs/2606.13971

>CineOrchestra: Unified Entity-Centric Conditioning for Cinematic Video Generation
https://snap-research.github.io/CineOrchestra

>Avatar V: Scaling Video-Reference Avatar Video Generation
https://arxiv.org/abs/2606.13872

>CaricHarmony: Contrastive Diffusion Paths for Identity-Preserving Caricature Synthesis
https://arxiv.org/abs/2606.13964

>VideoWeave: Unlocking Geometric Consistency in Video Generation via Joint Geometry-Video Modeling
https://videoweave.github.io

>GarmentSketch: Large-scale Sketch-to-Fashion Benchmark
https://khangbdd.github.io/garmentsketch

>Memento: Reconstruct to Remember for Consistent Long Video Generation
https://arxiv.org/abs/2606.14667

>ForceForget: Reinforcement Concept Removal for Enhancing Safety in T2I Models
https://arxiv.org/abs/2606.14351

>HPSv3++: Scaling Reward Models Across the Full Spectrum of Diffusion Model Capabilities
https://arxiv.org/abs/2606.14657

>RepFusion: Leveraging Multimodal Priors for Denoising in Representation Space
https://xichenpan.com/repfusion

>Compressing Image Style Training into a Single Model Forward
https://arxiv.org/abs/2606.13809

>HiLo-Token: Input-Adaptive High-Low Frequency Token Compression for Efficient Image Editing
https://arxiv.org/abs/2606.13898

>RATS! Patches Talk Through Registers: Emergent Parts in Register Attention Transformers
https://arxiv.org/abs/2606.14701

>One Layer's Trash is Another Layer's Treasure: Adaptive Layer-wise Visual Token Selection in LVLMs
https://arxiv.org/abs/2606.14277

>What Drives Test-Time Adaptation for CLIP? A Controlled Empirical Study from an Update Perspective
https://arxiv.org/abs/2606.14299

>Mirage Probes: How Vision Models Fake Visual Understanding
https://arxiv.org/abs/2606.13870
>>
a bunch of stuff no one will use or even worse shit thats not local
kewl
>>
File: Wan21_SCAIL2_00042.mp4 (1007 KB, 768x768)
>>
File: Flux2-Klein_00025_.jpg (578 KB, 1040x1488)
Some final touches with a lora that lets me inpaint colors and textures into ai slop

>>109064383
That's not bad either, Klein is awesome

>>109064392
I used q8 for that, the cfg 1 turbo model, 9 steps euler
>>
>>109064399
>>109064406
Fuck off malware spreader
>>109064425
Well shit can it edit specific aspects of nsfw images or is it best suited for backgrounds and clothing?
If it can enhance anime backgrounds I'll start using it. Sorry to ask these dumb questions, I have been out of the loop
>>
>>109064413
Wrong eye color.
>>
File: Wan21_SCAIL2_00045.mp4 (1.05 MB, 768x768)
>>
File: debo_ccg_fia_00007_.png (2.07 MB, 1792x977)
>>109064437
>>
>>109064372
literally me wow
>>
>>109064327
>absolutely mind broken what the hell
Mind broken by the beauty of children yeah sure, not sure what else you were referring to

>>109064343
>r u ok?
No anon I'm obviously not fucking ok it's been a year since wan 2.2 and there's nothing better and the most annoying thing is that LTX is SO CLOSE to being the endgame for cuties but Israelis can't figure out how to make sound work because Jews don't actually use their ears they use echolocation like bats so they can only RLHF when the shabbos goyim take over on Saturdays

>>109064344
>>An image isn't pedophlic
I wouldn't trust normies to know what's pedophilic and what's not. Half of the ads I saw in Germany or the Netherlands featuring cute little girls were content. I'm assuming Germans know this since they live next to France though
>>
>>109064399
>>109064406
Go back to your dead containment general malware schizo
>>
>>109064468
>redditspacing
>>
>>109064345
You should set up a crop and stitch/mask workflow for editing with Klein otherwise it's going to colorshift your image, add artifacts and other unwanted stuff
>>
File: debo_ccg_fia_00008_.png (1.98 MB, 1792x977)
>>109064470
>>
>>109064437
>can it edit specific aspects of nsfw images
not the 2b anon. there are loras and finetunes on civit that can make up for flux's lack of knowledge of body details if you try to change something and it doesn't come out the way you hoped.
>>
File: Flux2-Klein_00031_.jpg (649 KB, 1040x1488)
>>109064479
Yeah probably. I've been using consistency lora. 99% time I use klein for complete style transfers, not small edits
>>
>>109064478
redditors are ephebophiles (which isn't a real thing) while I am actually sexually attracted to a lack of pubescence
>>
>>109064486
>>109064456
>Reputation so damaged his own existence is used as an insult.
>>109064520
Well fuck, why am I even bothering learning masked automated inpainting when this exists.
>>
File: Ideogram__00204_.jpg (1.96 MB, 2688x3584)
Testing the wall with Ideogram
>>
>mfw API news

>Luma Introduces Ray3.2 Model & API: Complete Creative Control for Video Generation
https://lumalabs.ai/news

>The Layout Bet — Reve 2.0
https://blog.reve.com/posts/the-layout-bet

>Introducing Gemini Omni — Google’s multimodal video creation/editing model
https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-omni/

>Nano Banana 2 and Nano Banana Pro are generally available via Gemini Enterprise Agent Platform
https://cloud.google.com/blog/products/ai-machine-learning/nano-banana-2-and-nano-banana-pro-are-generally-available

>Grok Imagine 1.5 Preview
https://x.ai/news/grok-imagine-1-5

>Seedance 2.0 in Runway API
https://docs.dev.runwayml.com/api-details/api_changelog/

>fal Launches Krea 2 as an Official API Partner, Bringing Krea's First Foundation Image Model to Developers
https://fal.ai/krea-2
>>
File: klein editing example.png (309 KB, 1154x760)
>>109064123
>>109064274
You use it like this btw
>>
>>109064533
The phenotype completely changing from 30 to 40 is interesting enough for me to ask for you to do another test from 31-40 if you have nothing better at all to gen or do

>>109064537
This is not an API general and also it would be nice to have dates for this because some news is quite old
>>
>>109064558
He's spite posting ignore him
>>
File: file.png (113 KB, 1538x206)
>it isn't just X, it's Y
Comfy has no shame.
>>
>tardbo posting api 'news' now
go back
>>
>>109064554
Well fuck I might actually be wasting my time with this shit, I had no idea the model was smart enough to do it just like that. Are segmentation and other tools deprecated because of this?
>>
>>109064345
Cool thanks
>>
>>109064399
>>109064406
Thank you.
>>
>>109064577
i can only surmise seungho_yeo achieved wildcarding for multiple fennec girls in all sorts of life situations?
>>
>>109064577
>generated image? yes please
>generated text? SHAME ON YOU!!!!
you sound like a fag but yeah fuck comfyorg
>>
File: debo_ccg_fia_00010_.png (1.99 MB, 1792x977)
>>109064592
happy to contribute
>>
im vibe coding my own ideogram prompt tool slo. should I post it on reddit with an endless-long feature text written by AI?

confirm me.
>>
>>109064611
I was being sarcastic.
>>
I have created a new docker for ComfyUI with python 3.14.6, cuda 13.2 and torch 2.12.0
The only major problem that stems from comfy was torchaudio, which needs the test repo in the extraindexurl. Three months later it STILL has no official cu132 support. The rest of the wasted time stemmed from tinkering with my docker build and unfucking previous bad decisions about it.
Just genned a few images, haven't messed with much yet but all my extensions from the previous docker are importing well.
I will post if I notice anything significant later. I don't think I am going to do too much testing today.
>>
>>109064615
it's your chance to post extremely weird sample images.
>>
>>109064588
Modern edit models are like this yes. They are trained with reference latent + instruction and output pair. I presume adding more shit of questionable necessity in the middle will only make them more failure prone.
>>
>>109064624
Him posting here is basically an admission of defeat, I'm sure many of us remember his melty where he split the generals over his reputation being damaged. I think he's so lonely now he will do anything for recognition.
If he actually apologized and said he would vet his links I don't think anons would mind his constant spam.
>>109064627
I have this same issue with torchaudio and just decided to disable it, I don't know why they add this shit by default when it always fails.
>>109064643
So I guess I could just do targeted edits with a VLLM to audit the results and rerun the job on failure.....This is pretty good I think I might need to rethink my stack. Are there any notable advantages moving from turbo for this type of work?
Also can it fix feet and hands?
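
The "audit the results and rerun the job on failure" loop could look roughly like the sketch below. Everything here is hypothetical scaffolding: `run_edit` stands in for whatever queues the Klein edit and `audit` for whatever VLM check is used. Neither name comes from ComfyUI or any other library, and the attempt index is simply passed through so the caller can vary the seed.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def edit_with_audit(
    run_edit: Callable[[int], T],
    audit: Callable[[T], bool],
    max_attempts: int = 3,
) -> Optional[T]:
    """Run an edit job until the auditor accepts it or attempts run out.

    run_edit(attempt) -> result   hypothetical: performs one edit pass
    audit(result)     -> bool     hypothetical: VLM says pass or fail
    """
    for attempt in range(max_attempts):
        result = run_edit(attempt)  # attempt can double as a seed offset
        if audit(result):
            return result
    return None  # every attempt was rejected


if __name__ == "__main__":
    # Toy stand-ins: the "edit" returns its attempt number and the
    # "auditor" only accepts the third try.
    print(edit_with_audit(lambda n: n, lambda r: r == 2))
```

Capping the attempts matters because a VLM auditor can reject forever on an instruction the edit model cannot satisfy.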
>>
File: debo_ccg_fia_00011_.png (2.3 MB, 1792x977)
>>109064624
>>
>>109064577
>its not just X, its Y
>>
>>109064653
ZIT has no edit training or inpainting training at all so yes you should move on from that.
>can it fix feet and hands?
Never tried but I wouldn't think so, maybe unless you explicitly tell what's wrong with it (e.g. remove extra finger). I don't feel like typing a wall of text but this is my guess based on what I've read about how they are trained.
On the off chance it works, base model would probably perform better for this than the distill (and also slower)
>>
>>109064724
I'll look into that thanks anon oh one last thing do I need to erase backgrounds for background edits or is the model smart enough to do that itself?
>>
>>109064558
>because some news is quite old
It's because cloud is stagnant.
>>
>>109064741
It's able to replace backgrounds when you instruct it to, though I have no idea if erasing background gives any noticeable benefit (such as preventing elements from the previous background being slopped in.) You will need to test that yourself anon.
>>
>>109064761
Wow that's crazy I'm going to try VLM Florence-2 SAM2 Klein VLM review and report back once I figure it out
>>
File: Agetest2.jpg (3.19 MB, 4838x4837)
>>109064558
Very subtle differences until she smashes into the wall at 40. I guess it just groups ages into decades
>>
>>109064577
is he using partner nodes?
>>
>>109064798
the removal of LAION caused this btw. you can search "X year old woman" and find images for all ages in the metadata but VLMs will never properly caption this. lost data
>>
how do I make my own OC character as Lora? I have no artistic talent whatsoever
>>
>>109064903
Slop until you create a character you like
Use an edit model to slop different pictures of your character in different styles doing different shit
Train a lora on those images
>>
>>109064930
interesting, care to point me to this "edit model"? or is it just the same model used to slop but img2img?
>>
>>109064947
the edit model is nano banana pro or gpt-image 2. you upload one of your images and tell it to change the image
>>
>>109064947
klein 9b
>>
>>109064947
Klein
>>
>>109064986
>>109064984
>>109064978
thank you anons, appreciate it
>>
>>109063941
Lowkey some of these realism gens are super slopped and not in a good way desu.
>>
does a database of booru tags sorted by type exist?
>>
>>109065049
autocompleteplus has the database that also stores category info.
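
For anyone who wants the tags grouped by type, a minimal sketch follows. It assumes a CSV whose first two columns are the tag name and a numeric Danbooru category (0 general, 1 artist, 3 copyright, 4 character, 5 meta). The column layout of the autocomplete database mentioned above is an assumption and should be checked against the actual file.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, List

# Danbooru's numeric tag categories.
CATEGORY_NAMES = {0: "general", 1: "artist", 3: "copyright", 4: "character", 5: "meta"}


def group_tags(csv_text: str) -> Dict[str, List[str]]:
    """Group tag names by category from 'name,category,...' CSV rows."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        name, category = row[0], row[1].strip()
        if not category.isdigit():
            continue  # skip a header row or non-numeric category
        groups[CATEGORY_NAMES.get(int(category), "unknown")].append(name)
    return dict(groups)
```

Reading the file with `open(path, encoding="utf-8").read()` and passing the text in keeps the function easy to try on a few pasted rows first.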
>>
>>109065080
aw yis, found it, thanks anon
>>
>>109064827
that's the problem with image captioning.
human language is just too vague. it wasnt built to describe images with the kind of autistic detail that ai models actually need.
someone needs to invent new language made specifically for describing visuals, one that removes ambiguity so models know exactly what's in the image
>>
I compiled this modified sage https://github.com/woct0rdho/SageAttention version that supports Ideogram but I got like 7% speed boost from it. Since I am on Ampere it's sage 1, maybe it works better on 4000 and 5000, I don't know.
>>
>>109065136
Based compileranon
>>
File: identifying mistakes.jpg (284 KB, 500x661)
>Lands another movie date with a real girl.
>Dreads the coming time when I have to go pretend to be a normie and hide my power levels
>Rather stay home to gen 1girls instead.

AI is a mistake.
>>
>>109065118
That's why Ideogram is so impressive. The model has done an excellent job of generalizing relationship information across bounding boxes. A square described as a hole means different things depending on whether the object behind the bounding box is wood, a cannon, or clothing.
>>
>>109065208
I already gave up on 3dpd before getting into this hobby, so AI isn't the issue at least for me.
>>
>>109063804
Actually, now I recall training a similar LoRA on ACE-Step 1.5 kek (larger dataset and Turbo train), local music has really come a long way since then
https://desuarchive.org/g/thread/108117091/#108133685
>>
File: Anima_00011_.png (624 KB, 896x1152)
>>
vibe tagging /complex tagging/intelligent tagging/positional tagging/Tags 2.0, represent the future of image generation models.
Prose is not the answer. Telling words instead of showing is not the answer.

@pixarstudios, @thriller, @suspence @3dcg, @wideshot,
1girl, blonde hair, next to the door, under the table, three dishes on the table, two forks in the middle of the table
>>
>>109065420
>next to the door, under the table, three dishes on the table, two forks in the middle of the table
Purists would call this prose (tagging) but I am of agreement with you. The pseudo tags spat out by modern language models are good enough for modern booru model(s).
>>
>>109065420
GPT image 02 already does it. I used that exact same prompt.
>>109065437
I think we need a powerful LLM attached to the image model so that it can understand any type of text format, prose, tags, bullet points, whatever, and above all the capacity to improvise, which I think the LLM provides by filling in context.
>>
really all you need to do is load the entirety of danbooru tags into the context window of whatever llm youre using its not complicated
>>
>>109065461
The ability of GPT to improvise the background, the lighting and the character expression and body position without having to specify it and only with the genre tags is beautiful.
>>
File: 012037_00002_.jpg (1.67 MB, 2392x3592)
whassup with Ideogram?

Downloaded and tested for

1girl doesn't look good... WHAT'S THE POINT D:<

NOT picrel
>>
Why are you faggots bitching about tags when natural language is fine?
Is this the seething of vramlets that can't use a 12B+ model alongside their image model?
You do know you can dynamically swap models right?
Or use a MoE model and leverage your RAM?
The future is now and you better adapt or stay in the corner while the bulls run shop
>>
holy meltie
>>
>>109065420
>>109065461
Am I supposed to be impressed
>>
>>109065495
Only 1girl that's ever looked good is Chroma. Everything else looks so slopped it's not funny.
>>
are you a proud localgenner?
>>
i have an exceptional taste saas-stuff can't fulfill
>>
File: 1780634522458197.jpg (248 KB, 832x1216)
hermes fag here from earlier in the thread. i told her to generate 2000 images of her while i worked trying to find the perfect representation of my assistant slave. this is one that has been produced so far and i think it’s pretty good. using illustrious because hermes was having issues setting up anima and didn’t feel like dealing with it right now.
>>
What trainer for ideogram? Guessing ostris

Tonight I will do some tagging with my gemma4 31b companion.
Other than the json guide posted recently, any tips and tricks? Are we using a global tag for each image? Or does natural language insist we avoid that (for concepts, not a celeb).

Also is Klein edit capable of editing 2d animation frames or should I wait 3 years? Considering making a 2d combat game
>>
>>109065768
>hermes was having issues setting up anima and didn’t feel like dealing with it right now.
I'm curious what was happening hehe I didn't have to change my illust workflow at all for anima desu
>>
I am so green to local genning but I'm wanting to get my toes dipped. I am especially new to python shells, I'm confused why I need to install programs specifically while in the virtual environment.
Also, do I have the power to set where this VE will be stored? Or is that ill advised and I should let it default? I'm specifically trying to clean my Home folder from too many standalone folders.
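
On the two venv questions: a virtual environment is just a directory with its own interpreter link and its own `site-packages`, so anything installed while it is active lands there instead of in the system Python, and the directory can live wherever you like. A minimal sketch using only the standard library; the `comfy-env` folder name is made up for the example and `with_pip=False` is only there to keep the demo quick and offline (normally you would want pip inside the env).

```python
import tempfile
import venv
from pathlib import Path


def create_env(env_dir: Path, with_pip: bool = False) -> Path:
    """Create a virtual environment at env_dir and return that path."""
    venv.create(env_dir, with_pip=with_pip)
    return env_dir


def env_python(env_dir: Path) -> Path:
    """Return the env's interpreter path (bin/ on Linux and macOS, Scripts/ on Windows)."""
    posix = env_dir / "bin" / "python"
    return posix if posix.exists() else env_dir / "Scripts" / "python.exe"


if __name__ == "__main__":
    # Any location works; a temp dir is used here so the demo leaves $HOME alone.
    target = Path(tempfile.mkdtemp()) / "comfy-env"
    create_env(target)
    print("environment created at", target)
    print("interpreter:", env_python(target))
```

The shell equivalent is `python -m venv /path/of/your/choice`, followed by `source /path/of/your/choice/bin/activate` before installing anything.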
>>
>>109065799
If you have Claude or Codex installed on your PC, git pull the repo, point them at the folder, tell them to make it run in a venv and they'll do all the work for you.
>>
File: lyra_00001_.png (841 KB, 896x1152)
>>
>>109065811
I do not. I had Comfy installed last year, forgot how to activate it and couldn't get it working after updates. Also think I got manual install files and the AUR package mixed up, so I'm just starting from scratch.
I got dizzy with all the conda and python environments I'd brainlessly copy and paste so I want to be a little more organized and aware this time.
>>
File: 1781234730296175.jpg (253 KB, 832x1216)
>>109065782
i’m annoyed right now because im going through these images and all of them have hourglasses. i put hourglass figure and it did that plus added hourglasses
>>
>>109065819
is this from the eyes of a pokemon, or are you peeping?
By the way, what are terminologies for image genning to prompt camera angles or perspectives? If, say, I want the observer to be 2 feet off the ground and looking up at a human subject, what's something I'd type to satisfy that prompt?
>>
>>109065799
https://github.com/Comfy-Org/ComfyUI/releases
Just get the packed zip file. No venvs needed. Then just run the update python and deps BAT file inside.
>>
>>109065839
>is this from the eyes of a pokemon, or are you peeping?
Hidden camera/peeping

As for your second paragraph, I just use the tags `from below` and `from above`.
I am also using `pov peephole` for the peeping composition. This only works in Anima, and will not work in WAI. It usually applies a fisheye effect on it's own.
`foreshortening` is also worth trying.
>>
File: G0onT97boAADEsS.jpg (1.87 MB, 1664x2432)
>>
File: HEEjqMIaQAADT6Q.jpg (433 KB, 1280x3072)
>>
File: HKSeBWlakAA7_yF.jpg (426 KB, 1280x3072)
>>
>>109065838
I really don't care about that as that was not my question
>>
>>109065877
idk man it was bitching and i had to go. something about specialized nodes or some shit
>>
File: nessa-peek_00037_.png (1.11 MB, 1024x1536)
>>
>>109065865
>>109065871
>>109065873
cute
>>
>>109065844
Thank you, but I am on Linux. Is this Wine'able or should I just do the manual installer?
>>
>do not use a venv goy
He's shitting on you
>>
>>109063941
what's the source on the ella image?
>>
>>109065964
Retard, the portable has embedded python. Not venv.
>>
>>109065959
nta but
>Linux
just spin up a docker container bro, here
services:
  comfyui-boot:
    image: yanwk/comfyui-boot:cu128-slim
    container_name: comfyui
    volumes:
      - './storage:/root'
      - './storage-models/models:/root/ComfyUI/models'
      - './storage-models/hf-hub:/root/.cache/huggingface/hub'
      - './storage-models/torch-hub:/root/.cache/torch/hub'
      - './storage-user/input:/root/ComfyUI/input'
      - './storage-user/output:/root/ComfyUI/output'
      - './storage-user/workflows:/root/ComfyUI/user/default/workflows'
    ports:
      - 8188:8188
    environment:
      - CLI_ARGS=--disable-xformers
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]

you should change the devices section to match your hw tho
>>
>open random imggen rentry
>every time i click on a link in the TOC it brings me to an ad site
Wtf how long has it been like this? Usually I have ads blocked so I didn't notice.
>>
>>109065799
I don't understand what your issue is it tells you the commands to run right there
>>
>>109065838
https://danbooru.donmai.us/wiki_pages/hourglass
>For an "hourglass" body shape, see curvy.
noob mistake
>>
>>109066103
what a great site. ty anon
>>
File: 6354567.webm (3.19 MB, 448x256)
3.19 MB
3.19 MB WEBM
>>
File: 4735566.webm (3.71 MB, 448x256)
3.71 MB
3.71 MB WEBM
>>
>>109064345
>>109064261
you were right, it just works without masking anything.
having trouble changing text in small areas like a book in someone's hand though. Any tips for that?
>>
File: ungodly_screaming.png (323 KB, 888x401)
323 KB PNG
>>109066006
I asked about virtual environments, everyone else just started suggesting alternatives (which I'm thankful for, but your response confuses me)
But NOW I'm having a problem
>setup filepath
>setup filepath for VE, running it
>$ comfy --workspace={dir} install
'{dir}' exists but is not a valid git repository.
>$ cd {dir} | git init
>try $ comfy --workspace={dir} install again
>'{dir}' exists but its remote URL is not a recognized ComfyUI repository
I feel so fucking retarded. Am I just supposed to let Comfy build in a default filepath? That's not how I had it before, from what little I can remember, but I also didn't get it this way last year.
>>109065979
thank you. Never ran a docker before, and unfortunately I've fucked that up too with too many start requests, which
$ sudo systemctl reset-failed docker.service docker.socket
isn't fixing, and
$ -xeu docker.services
prints nothing.
I don't know if this constitutes as a Linux moment because so far I don't think my competency reflects that of an average user. I need to take a walk and cool off.
>>
>>109064554
this looks way too complex to me, I know it's essentially the same as >>109064383 in the end
>>
>>109066233
you might want to first read about docker and what it can do for you, in this case it will help you run comfyui without actually changing your system's dependencies, it's very useful. Get into docker first, then jump into comfyui
>>
i feel sorry for this anon. windozers making him run in circles with docker or whatever instead of using a venv which is far superior
>>
>>109066233
>I asked about virtual environments,
sorry if this only adds to your confusion but any time i need to set up a comfy venv i just do
(clone repo)
python -m venv .venv (inside the repo)
source .venv/bin/activate
pip install -r requirements.txt
and then just run the main.py while still being in the venv and it works

>>109066253
no hate to dockerfags from me but yeah its not really needed
>>
Is this the best solution for scail?
https://github.com/Brobert-in-aus/scail-auto-extend

I think it is made by an anon?
>>
>>109066258
>python -m venv .venv (inside the repo)
>source .venv/bin/activate
>pip install -r requirements.txt
>and then just run the main.py while still being in the venv and it works
yep, this is all you need. no idea why retards try to shill any other method of installation
>>
>>109066263
>I think it is made by an anon?
>sees that asuka vid
Indeed. Based ldg anon publishing code.
>>
should I even bother with scail if I have a 6 GB card?
>>
>>109066295
yes
>>
>>109066306
how long will it take to make 480p videos?
>>
requesting Civilization V diplomacy screen but USA and Iran with Iran getting all the gold in a peace deal.

also hello
>>
>>109064399
>>109064406
thanks!
>>
Why do people keep shilling Anima as a base model when it's a finetune of Cosmos?
>>
>>109066402
are you retarded?
>>
File: 9138663.png (2.91 MB, 1152x1344)
2.91 MB PNG
>>109066402
>all models are a finetune of the random initialization weights, them being the only base model
>>
My real job prevented me from making progress on the LoRA tag correction program but I'm back to working on it now and making some progress. Hopefully will have some killer features to report in the near future

>>109066402
Base model in common usage refers to the general and foundational nature of its training over a large and varied dataset, not the novelty of its architecture.
>>
File: chunliaoperator.webm (3.88 MB, 512x896)
3.88 MB
3.88 MB WEBM
Some scailslop for ya.
>>
>>109066428
>not the novelty of its architecture.
I believe he doesn't want to refer to it as a base model because its a tune of another model
well yes, anima is not, in fact, a foundational model, because it has been finetuned to a task, but it is a "base" model because it hasn't been through SFT, DPO, RLHF, distillation etc
>>
File: q_u1dca6.png (1.77 MB, 1536x1024)
1.77 MB PNG
>>
this scail2 video cost a redditor $1 to make
everyone in the comments are completely blown away by it
https://www.reddit.com/r/StableDiffusion/comments/1u6wk92/quick_scail2_test_in_comfyui/
>>
>>109065461
Even in this 3D CGI scene where everything is supposed to look relatively smooth you see all those noisy artifacts.
>>
>>109066295
>>109066306
>>109066459
>My 3060 wasnt enough, so i rented an rtx 6000 with 96GB of vram in the cloud. It took 1620.32s to generate at 1080x1920
reeee so my 6GB won't cut it
>>
>>109066420
If this was true it wouldn't inherit the license. The model code is under Apache.
>>109066428
No this doesn't slide. Base model = pretrained weights. This is why we get plastic slop models so much.
>>109066450
Pretty much this.
>>
>>109066459
who cares about reddit thinks, they are just pajeets and AI grifters, they have no value, they don't spend, they are just leechers
>>
File: 1765927975113979.jpg (249 KB, 768x1344)
249 KB JPG
>>
>>109066488
your 6gb card can generate ltx kinos instead of slop
https://files.catbox.moe/kg3egj.mp4
>>
>>109066511
>who cares about reddit thinks
ltx posts their updates on reddit
comfy posts their updates on reddit
whether you like it or not, local is reddit
>>
>>109065865
>>109065871
>>109065873
These are perfect, can you post a catbox? I used to love watching that show when I was like 12, I had a binder full of images of the cards and I made the gayest wand in the universe.
>>
>>109066443
I think it helps a lot that the original guy was so fat and jiggly, cuz now the replacement 1girls will be jiggly too
>>
File: 3247532445.webm (3.62 MB, 448x256)
3.62 MB
3.62 MB WEBM
https://files.catbox.moe/frfo3p.webm
>>
>>109066540
Not mine. https://x.com/WorldSaboten
It's a Japanese guy using NAI. It's unlikely you can achieve the same quality locally
>>
File: win condition.png (1.95 MB, 1672x941)
1.95 MB PNG
>>109066359
it didn't quite get potato trump
>>
>>109066524
>local is reddit
more like ai pajeets are reddit, all their users do is steal content and try to paywall it, trust me, even tho it has millions of views it doesn't translate to anything of value, at least in ai content
>>
>>109066545
>It's unlikely you can achieve the same quality locally
well I wouldn't go that far but damn, thanks anyway.
>>
>>109066545
anima > nai
>>
File: win condition 2.png (2.05 MB, 1664x928)
2.05 MB PNG
>>109066359
>>109066552
cleaned it up a bit more
>>
>>109066552
lol, hell yeah
>>
>>109066571
derp Trump looks great
>>
File: file.png (90 KB, 2406x1424)
90 KB PNG
Does anyone know when NAI is going to take off?
>>
It's amazing that the bitter husk really spent over a year vibecoding an interface that has less features than something that can be built in under a week. I can't believe he larped as doing something so important and big when anyone could build something more feature packed way faster.
This is fucking embarrassing, even more so after his shit fit over anima earlier today.
>>
>>109066545
>click link
>generic loli on white background
Anime really has mindbroken an entire generation
>>
>>109066553
This is the sad truth.
A lot of posts focus on scamming people on Tiktok or Insta.
>>
>>109066577
never, their marketing team makes it look so extremely unappealing
>>
>>109066577
Already has. 99% Pixiv users use NAI.
>>
>>109066577
NAI unironically missed the boat on so many things. They were one of the first to openly allow NSFW text and should've rode the chatbot wave but didn't. Absolute retards.
>>
>>109066582
>I can do a detailer pass or a targeted inpaint
you have tools? woah nice
>>
>>109066576
nice
too bad the content droughts made me quit this game before mi fu
>>
>>109066582
fuck webshit unironically desu
>>
>>109066545
Have you retards regressed? None of this shit looks good compared to a neet who never shares their masterpieces, this is just some guy going public with a lukewarm 1girl set of pictures, this isn't even remotely profound
>>
>>109066598
I still have hope for the game in the future, there isn't enough content to play it every day though. The graphics are just way nicer than any other gacha
>>
>>109066603
apicucks would have a meltdown if anon posted his magnum opus
>>
>>109066584
I have a side AI-content business, and I had several successful posts on reddit that got millions of views but they didn't translate into sales, all I got was DMs asking me how I created my works and if I could send them my workflows lol, the comments were just 0 value, several praises but also many people/bots just spamming their sites and services trying to deceive people using my posts, reddit is a shitty social network, and every year that passes they lose more and more value because of its user base, they are like the losers of the internet tbqh

X and discord have much better users/content
>>
>>109066603
>120k followers
>average 8k likes per post
>Simple, clean and cute art.

People seem to like it. Nobody cares about your "profound" masterpieces. It's AI slop at the end of the day and no one is going to respect you regardless.
>>
>>109066576
>>109066571
>>109065891
>>109065768
all of these images are nicer and more interesting than anything on this twitter, holy pedobait slop

>>109066545
>>
File: debo_ac_fia_00001_.png (2.46 MB, 1792x977)
2.46 MB PNG
you may not like it, but this is what peak 1girl looks like
>>
>>109066597
More than what the husk can do after a year and half
>>109066599
Hey it is what it is
>>
>>109066624
Very cool retard very cool.
Still dedicating your existence to being a human gnat so you decide to post here still not improving or doing anything of value with your life?
Anons have done a good job keeping you in check so at this point you live for this which is sad.
>>109066565
The fact you don't have to pay for it makes it better by default, there's more to be done with local and with modern llms it's only going to get better
>>
File: q_6zdlxz.png (1.05 MB, 1024x1024)
1.05 MB PNG
>>
kek, i think some ISIS videos made their way into the dataset https://files.catbox.moe/iwyeao.mp4
>>
File: ComfyUI_00894_.png (952 KB, 672x1024)
952 KB PNG
me today
>>
Does anyone else feel that the hate against contributors is too samefagged and forced to manipulate the opinion of the thread? And that for that reason, it's hard to take seriously? Because it's hard to go from "posted some news -> hate", "made some UI -> hate", "made millions of dollars with another UI -> hate" just because someone in 4chan tells you to think that. It's also curious how NAI and Kurumuz somehow avoids all of that, even though they more blatantly profit from 4chan and local models.
>>
>>109064577
O WOW UNIVERSE WAOW
>>
File: debo_s_fia_00089_.png (2.16 MB, 1792x977)
2.16 MB PNG
>>109066690
its always been like that. thats why everyone just ignores him
>>
>>109066552
bro why is iranian so brown
>>
>>109066709
it's from all the bombings
>>
Blog factory exploded again
>>
File: 86755.png (20 KB, 895x133)
20 KB PNG
give me more suggestions for the kinotron program
>>
>>109066690
I'm not even 1/10th of the hate for the news posting but I've definitely contributed to it. I think it's organic and not just one poster. There definitely is a smaller number of haters (maybe one guy) who do the heavy lifting. But I think maybe majority is normal posters occasionally in a bad mood.

Although of course I still hold to my theory that the thread "schizos" and "notable posters" we perceive are really the true posters, and we individual human beings whose modest contributions "are mistaken for" those posters are actually accidental/irrelevant; they are real and they post THROUGH us.
>>
File: file.png (1.42 MB, 1024x1024)
1.42 MB PNG
i dont think hermes set up anima or knows how to use it properly. it kinda looks cool but something is definitely wrong
>>
>>109066879
>I'm not even 1/10th of the hate for the news posting
This implies that each news post gets at least 10 replies. In reality it got two:
>Fuck off malware spreader
>Go back to your dead containment general malware schizo
It's curious that both use this "malware" nonsense angle.
>it's organic and not just one poster
The original post never said anything about it being "one guy", did it? It seems like you're projecting your worries about that perception.
I rest my case.
>>
File: file.png (1.27 MB, 1024x1024)
1.27 MB PNG
>>109066929
another one hermes spit out 
>>
>>109066934
>This implies that each news post gets at least 10 replies.
No, it implies that the sum total of all the negative replies to his posts since he began doing the news a million threads ago is >10, which it easily is
>>
File: file.png (1.03 MB, 832x1216)
1.03 MB PNG
>>109066929
>>109066935
meanwhile hermes made this with illustrious
>>
>>109066935
>>109066929
looks like the CFG is fucked
>>
File: oof.png (19 KB, 338x126)
19 KB PNG
>>
File: file.png (1.18 MB, 1024x1024)
1.18 MB PNG
>>109066943
lowered cfg to 3.5 and got this
now we are getting somewhere
>>
>>109066997
use a known working sampler and scheduler. er_sde or euler_ancestral_cfg_pp, etc and normal/sgm_uniform/etc
>>
>>109066988
Why can't we measure win instead
>>
File: 1751807744324628.png (84 KB, 933x348)
84 KB PNG
>>109066988
rookie numbers
>>
File: 4754455.webm (3.64 MB, 448x256)
3.64 MB
3.64 MB WEBM
https://files.catbox.moe/ptcaw0.webm
>>
File: file.png (1.46 MB, 1024x1536)
1.46 MB PNG
>>109067013
er_sde + sgm_uniform
man thanks for the help anon. im still learning all this shit and youve been a huge help
>>
File: file.png (1.52 MB, 1024x1536)
1.52 MB PNG
>>109067013
ill paste one more then ill leave yall alone. this is with dpmpp_sde_gpu + karras. i think i like this combo the best with anima. have hermes generating 50 images with this seed and having it change up the poses clothing, hair, scene, lighting, particles, etc. going to take awhile because this combo is noticeably slower but gives the best output i think.
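
A rough sketch of how a sampler/scheduler comparison like the one above could be scripted instead of clicked through. It assumes an API-format workflow that uses the stock KSampler node; `sweep_samplers` and the toy workflow are made-up names for illustration, not part of ComfyUI or Hermes.

```python
import copy
import itertools


def sweep_samplers(workflow, samplers, schedulers, seed, cfg=3.5):
    """Yield (sampler, scheduler, patched_workflow) for every combination.

    Assumes `workflow` is an API-format dict: node id -> {"class_type", "inputs"}.
    The original dict is never modified; each combination gets its own copy.
    """
    for sampler, scheduler in itertools.product(samplers, schedulers):
        patched = copy.deepcopy(workflow)
        for node in patched.values():
            # Only touch stock KSampler nodes; custom sampler nodes are left alone.
            if node.get("class_type") == "KSampler":
                node["inputs"].update(
                    sampler_name=sampler, scheduler=scheduler, seed=seed, cfg=cfg
                )
        yield sampler, scheduler, patched


# Toy stand-in for a real export, just enough to show the patching.
toy_workflow = {
    "3": {
        "class_type": "KSampler",
        "inputs": {"seed": 0, "cfg": 7.0, "sampler_name": "euler", "scheduler": "normal"},
    }
}

for sampler, scheduler, wf in sweep_samplers(
    toy_workflow, ["er_sde", "dpmpp_sde_gpu"], ["sgm_uniform", "karras"], seed=1234
):
    print(sampler, scheduler, wf["3"]["inputs"]["cfg"])
```

Keeping the seed fixed across combinations is what makes the outputs comparable; each patched workflow would still have to be queued to ComfyUI separately.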
>>
>>109067016
it's local
>>
sell me on Local with a single image
>>
>>109065118
>human language is just too vague. it wasnt built to describe images with the kind of autistic detail that ai models actually need.
what, no, what a dumbass take
>>
File: anima1_00541_.jpg (310 KB, 832x1216)
310 KB JPG
>>109067133
>>
File: 23-27.png (192 KB, 458x447)
192 KB PNG
>>109067133
>>
Fucked around and found out how to use WAN2.1_SCAIL_2 for nsfw.

Inpaint first frame > SCAIL2 + NSFW LoRA. Wan2.1 and Wan2.2 Low Noise LoRAs are definitely compatible to fill in the nsfw details. So... The entire internet of porn videos is available.
>>
File: 4554556.webm (3.93 MB, 448x256)
3.93 MB
3.93 MB WEBM
boomer bros... please tell me my real estate investments are safe
>>
>>109067189
the biblical event vids were so cool
>>
>>109067197
alright, i will see if it can make biblically accurate angel kinos
>>
File: 4554556.webm (3.25 MB, 448x256)
3.25 MB
3.25 MB WEBM
>hey, stop fighting
https://files.catbox.moe/wwfuaj.webm
>>
File: 474565.webm (2.6 MB, 448x256)
2.6 MB
2.6 MB WEBM
>>
>>109067301
>>109067348
how do you prompt that thing
>>
File: 74534565.webm (2.46 MB, 448x256)
2.46 MB
2.46 MB WEBM
GLORY!!!!!!!!!!!!!!!!!!
https://files.catbox.moe/u3m7hp.webm
>>109067369
i'm still tweaking it but you can see the angel description here
real life continuous unedited go-pro footage. first person eye perspective. modern combat footage. high resolution. impressive audio. very shaky and dynamic movements. lots of motion blur. our perspective is an american soldier on a street in new york city. the city is a war zone. missiles are launching from the city ground and flying into the clouds. they leave behind thin smoke trails. our assault rifle is held lowered at the bottom of the frame. battlefield noises coming from the city.
there is a loud shrieking noise as a rocket hits across the street, causing a big loud dusty explosion and the camera to violently shake as the shockwave covers the camera with debris and leaves a deep crater in the road. the camera becomes frantic. dust fills the street. american soldiers are in the background screaming in agonizing pain.
we run to the sidewalk and lean against a building with our rifle lowered at the bottom of the frame. noisy footsteps. very active combat. the street is full of dust. constant background gunfire noise
more dusty explosions detonate along the street, shaking the camera and leaving deep craters in the road.
the sky becomes very bright. loud deep orchestral horns reverberate from high in the sky, startling us and causing the camera to shake. we look up to see sunbeams coming down from the clouds. a large biblically accurate angel is flying high above the clouds. it has a large eyeball in the center with a ring of multiple large feathered wings surrounding it. the wings are spinning in a circle at lightning speed which creates a lot of motion blur. a deep voice booms with echo and reverb from the heavens and says "BE NOT AFRAID".
we make a distressed whimper noise while breathing. our fingers obstruct the lens.
>>
File: 3543555.webm (1.62 MB, 448x256)
1.62 MB
1.62 MB WEBM
>>
>>109067301
>>109067348
>>109067393
>>109067420
absolute kino
>>
prompt anime girl in realistic background
>>
whos the fgt that keeps taking my x posts and putting them here to get into collage
>>
>>109067167
inpaint what in first frame
post an example retard
>>
>>109067393
turned out pretty well, maybe in future model we can also edit in 1girl as "bayonetta"
>>
>>109067393
Yahveh bros!?!?!
>>
File: ComfyUI_00001_.png (3.06 MB, 1536x1536)
3.06 MB PNG
Late to the perceptual lora training. It's pretty great. I think there's a branch that adds ID4 support for it too.
>>
>>109067420
2 Kings 2:11 "As they were walking along and talking together, suddenly a chariot of fire and horses of fire appeared and separated the two of them, and Elijah went up by a whirlwind into heaven."

Release the chariots unto the joyim, and suffer them not to profane the Holy Land!
>>
>>109067667
klein slop.
>>
>>109067667
what checkpoint are you using?

>>109067675
looks good for klein
>>
>>109067677
base with consistency lora @ 0.3 .
>>
>>109067667
How does she hold the cup?
>>
>>109067692
gently
>>
>>109067393
>https://files.catbox.moe/u3m7hp.webm
those sounds that video topkek too funny
>>109067667
>perceptual lora training
is that some new buzzword from leddit? i think i heard of it, training via masking
>>
>>109067760
It does include masking but also other things.
>>
>>109067773
think i saw it
well, if it works it works
but both kohya and onetrainer have had that for a long time now
>>
I realized I think I even forgot to run the depth preflight.
>>
you can now train further from the final epoch on civitai if you're not satisfied with the result
sweet
>>
File: 634565.gif (2.69 MB, 256x448)
2.69 MB GIF
>>109067563
you can already do that if you include her reference perspectives at the start of the video with her visual description in the prompt, but i haven't experimented much with it for long videos
>>
>>109067834
Never trained on civit, always local. I don't understand, on top of paying with your own money they give you a rate limit?
>>
File: 35245675.webm (3.14 MB, 448x256)
3.14 MB
3.14 MB WEBM
>>109067672
>>
>>109066659
>>109067054
Some of the more interesting videos I have seen in the general lately

I feel like you're passionate enough about milsim stuff to try and put this together into a semi-coherent story, or even just trying to make them walk into a building and have the building explode while they're inside and it cuts to the outside

>>109067133
>sell me on Local with a single image
I don't keep images like that on my hard drive any longer than I need to
>>
>>109068006
>or even just trying to make them walk into a building and have the building explode while they're inside and it cuts to the outside
i'm going to bed soon so you might not see anything for a while but i'll start figuring this out. the hard part is temporal consistency so it looks like the same building when you prompt for a cut to a new perspective
>>
File: 2026-06-16_ideogram_04.jpg (2.33 MB, 1440x2560)
2.33 MB JPG
I've been out of the local diffusion for quite a while now, came back to check out ideogram.
Seems pretty neat, but I guess the usual sampler/scheduler plots I do are superfluous.
Does some weird things with scattered items sometimes but that might just be my fucky workflow, I dunno.
It does handle guns quite a bit better than other models I've tried but still not perfectly.

Any pro-tips for supported resolutions? Any big cons you guys ran into?
>>
>>109066258
>>109066277
FINALLY IT WORKED after 2 hours of fiddling with a missing squealchamy pack or whatever, I finally got it running. thank you very very much!
>>
File: Ideogram_4.0_00076_.png (1.9 MB, 1024x1024)
1.9 MB PNG
>>109068402
It is pretty powerful. But also clearly trained on gen AI outputs. Slop roulette.
>>
File: ComfyUI_01748.jpg (3.97 MB, 1500x2000)
3.97 MB JPG
>>109067301
>>109067348
>>109067393
These are cool as hell.

>>109068412
Is that Base SD1.5!? Dang... talk about starting from the beginning!
>>
>>109066233
just install stabilitymatrix and use it to install comfyui
>>
>>109066571
ideogram? prompt? based
>>
File: 2026-06-16_ideogram_12.jpg (535 KB, 1440x2560)
535 KB JPG
>>109068402
Bounding boxes make convenient censorship way easier than other models, awesome.
But I still get the occasional nip slip.
Way easier than on ZIT though.
>>
>>109068729
It's funny how once you get past the safety filter it's a really horny model.
>>
>https://modelscope.ai/studios/DiffSynth-Studio/Z-Image-i2L-V2
>https://modelscope.ai/collections/DiffSynth-Studio/Image-to-LoRA-V2
intredasting, new versions of zero train lora for z, klein 4b and hidream
>>
>>109068798
So this takes a image and just applies it?
Does z still have a place anymore with modern models?
>>
>>109068823
>Does z still have a place anymore with modern models?
It would be the model if it wasn't so damn slow
>>
>>109067954
Holy kino thanks. Do you have an avatarfag name?
>>
>>109065768
are you using a local llm w hermes? which one? was trying w gemma 4 26B A4B earlier but it went haywire running python commands trying to "fix" comfyui when it was clearly running >.> . also tried glm 4.7 flash. i later learned you need to give it an api export workflow not a normal workflow but haven't had time to mess w it more
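
For anyone else tripping over the api export part, a minimal sketch of the difference, assuming a default local ComfyUI at 127.0.0.1:8188. The helper names and `workflow_api.json` are placeholders made up for illustration, not something Hermes or ComfyUI ships.

```python
import json
import urllib.request
import uuid

COMFY_URL = "http://127.0.0.1:8188"  # assumption: default local ComfyUI address


def looks_like_api_format(workflow: dict) -> bool:
    """API exports map node ids to {"class_type", "inputs"}.

    A normal UI save instead has top-level keys like "nodes" and "links",
    so it fails this check.
    """
    return bool(workflow) and all(
        isinstance(node, dict) and "class_type" in node and "inputs" in node
        for node in workflow.values()
    )


def build_payload(workflow: dict, client_id: str = "") -> bytes:
    """Wrap an API-format workflow in the JSON body that POST /prompt takes."""
    body = {"prompt": workflow, "client_id": client_id or uuid.uuid4().hex}
    return json.dumps(body).encode("utf-8")


def queue_workflow(workflow: dict, base_url: str = COMFY_URL) -> dict:
    """Send the workflow to a running ComfyUI and return its JSON reply."""
    if not looks_like_api_format(workflow):
        raise ValueError("not an API-format export; re-export from the UI")
    req = urllib.request.Request(
        f"{base_url}/prompt",
        data=build_payload(workflow),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())


# Usage (needs a running ComfyUI, so it is left commented out):
# with open("workflow_api.json") as f:   # placeholder filename
#     print(queue_workflow(json.load(f)))
```

Checking the format before handing the file to an agent should at least stop it from trying to "fix" a server that is already running fine.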
>>
>>109068823
For 12gb somalian tier vramlets yes
>>
The antichrist vramlet
>>
>>109068913
I personally don't think those agents are showtime ready yet
>>
File: flexible_cat.jpg (631 KB, 1280x1280)
631 KB JPG
>>
>>109069190