[a / b / c / d / e / f / g / gif / h / hr / k / m / o / p / r / s / t / u / v / vg / vm / vmg / vr / vrpg / vst / w / wg] [i / ic] [r9k / s4s / vip] [cm / hm / lgbt / y] [3 / aco / adv / an / bant / biz / cgl / ck / co / diy / fa / fit / gd / hc / his / int / jp / lit / mlp / mu / n / news / out / po / pol / pw / qst / sci / soc / sp / tg / toy / trv / tv / vp / vt / wsg / wsr / x / xs] [Settings] [Search] [Mobile] [Home]
Board
Settings Mobile Home
/g/ - Technology

Name
Options
Comment
Verification
4chan Pass users can bypass this verification. [Learn More] [Login]
File
  • Please read the Rules and FAQ before posting.
  • You may highlight syntax and preserve whitespace by using [code] tags.

08/21/20New boards added: /vrpg/, /vmg/, /vst/ and /vm/
05/04/17New trial board added: /bant/ - International/Random
10/04/16New board for 4chan Pass users: /vip/ - Very Important Posts
[Hide] [Show All]


[Advertise on 4chan]


Discussion and Development of Local Image and Video Models

Previous: >>108777750

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, & Upscalers
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/tdrussell/diffusion-pipe
https://github.com/kohya-ss/sd-scripts
https://github.com/kohya-ss/musubi-tuner

>Z
https://huggingface.co/Tongyi-MAI/Z-Image

>Anima
https://huggingface.co/circlestone-labs/Anima
https://tagexplorer.github.io/

>Qwen
https://huggingface.co/collections/Qwen/qwen-image

>Klein
https://huggingface.co/collections/black-forest-labs/flux2

>LTX-2
https://huggingface.co/Lightricks/LTX-2

>Wan
https://github.com/Wan-Video/Wan2.2

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Illustrious
https://rentry.org/comfyui_guide_1girl

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Collage: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
>>
>>108789589
does n*gbo really want beef again
>>
Blessed thread of frenship
>>
>mfw Resource news

05/09/2026

>SenseNova-U1-8B-MoT-Merger-: GGUF quantized checkpoints and layer-offload VRAM modes
https://github.com/OpenSenseNova/SenseNova-U1#-updated-news

>HiDream-O1-Image: Pixel-level Unified Transformer (UiT) without external VAEs
https://huggingface.co/HiDream-ai/HiDream-O1-Image

>ComfyUI-RefineNode: local image refinement preprocessing, reference image processing and paste-back
https://github.com/1Kynx/ComfyUI-RefineNode

>Flowception: Temporally Expansive Flow Matching for Video Generation
https://github.com/facebookresearch/flowception

05/08/2026

>LTX-2.3 PolarQuant Q5: 88% size reduction, near lossless quality (Cosine Similarity: 0.9986)
https://huggingface.co/caiovicentino1/LTX-2.3-22B-HLWQ-Q5

>Anima Scribble+Canny Control with adjustable strength
https://huggingface.co/CabalResearch/Anima-Canny-Scribble-Adjustable-Control-LoRA

>Sparkle: Realizing Lively Instruction-Guided Video Background Replacement via Decoupled Guidance
https://showlab.github.io/Sparkle

>Continuous-Time Distribution Matching for Few-Step Diffusion Distillation
https://github.com/byliutao/cdm

>MSD-Score: Multi-Scale Distributional Scoring for Reference-Free Image Caption Evaluation
https://steinsgatesg.github.io/MSDScore

>SoftSAE: Dynamic Top-K Selection for Adaptive Sparse Autoencoders
https://anonymous.4open.science/r/SoftSAE-8F71

>IMG Dataset Refiner (v4.0 Pro)
https://github.com/NyxAwroo/IMG-Dataset-Refiner

>FLUX, Open Research, and the Future of Visual AI — Stephen Batifol, Black Forest Labs
https://www.youtube.com/watch?v=x8Yb4RidLgM

05/07/2026

>Stream-R1 Reliability-Perplexity Aware Reward Distillation
https://stream-r1.github.io

>banodoco/hivemind: Drop-in skill that searchs the Banodoco Discord message feed
https://github.com/banodoco/hivemind

05/06/2026

>Exploring Data-Free LoRA Transferability for Video
https://github.com/Noahwangyuchen/CASA

>Ortho-Hydra: Orthogonalized Experts for DiT LoRA
https://github.com/sorryhyun/anima_lora
>>
>mfw Research news

05/08/2026

>DynT2I-Eval: A Dynamic Evaluation Framework for Text-to-Image Models
https://arxiv.org/abs/2605.06170

>FreeSpec: Training-Free Long Video Generation via Singular-Spectrum Reconstruction
https://fdchen24.github.io/FreeSpec-Website

>RealCam: Real-Time Novel-View Video Generation with Interactive Camera Control
https://xyc-fly.github.io/RealCam

>SwiftI2V: Efficient High-Resolution Image-to-Video Generation via Conditional Segment-wise Generation
https://arxiv.org/abs/2605.06356

>ActCam: Zero-Shot Joint Camera and 3D Motion Control for Video Generation
https://elkhomar.github.io/actcam

>Arena as Offline Reward: Efficient Fine-Grained Preference Optimization for Diffusion Models
https://arxiv.org/abs/2605.06070

>Secure Seed-Based Multi-bit Watermarking for Diffusion Models from First Principles
https://arxiv.org/abs/2605.06153

>FREPix: Frequency-Heterogeneous Flow Matching for Pixel-Space Image Generation
https://arxiv.org/abs/2605.06421

>Continuous Latent Diffusion Language Model
https://arxiv.org/abs/2605.06548

>Autoregressive Visual Generation Needs a Prologue
https://arxiv.org/abs/2605.06137

>Continuous Expert Assembly: Instance-Conditioned Low-Rank Residuals for All-in-One Image Restoration
https://arxiv.org/abs/2605.06127

>Plug-and-play Class-aware Knowledge Injection for Prompt Learning with Visual-Language Model
https://arxiv.org/abs/2605.05910

>DCR: Counterfactual Attractor Guidance for Rare Compositional Generation
https://arxiv.org/abs/2605.06512

>Taming the Entropy Cliff: Variable Codebook Size Quantization for Autoregressive Visual Generation
https://arxiv.org/abs/2605.06207

>MARBLE: Multi-Aspect Reward Balance for Diffusion RL
https://aim-uofa.github.io/MARBLE

>Eulerian Motion Guidance: Robust Image Animation via Bidirectional Geometric Consistency
https://arxiv.org/abs/2605.06280

>AI-Generated Images: What Humans and Machines See When They Look at the Same Image
https://arxiv.org/abs/2605.06143
>>
ok already deblessed
>>
File: HiDream-O1-Image-Dev.png (3 MB, 1440x2560)
3 MB PNG
>>
How's Hidream for text/hotties combo?
>>
File: ComfyUI_23938.png (3.25 MB, 1500x2000)
3.25 MB PNG
>>108789475
Yeah, that happened back in the early 90s.
>>
>>108789778
Everything died around 2010.
>>
>>108789785
2016 was the turning point
>>
>>108789778
what model is that
>>
>>108789798
No. That was just when it was too late to turn back. Maybe 2012 really was the end of the world... just as we know it. Nostradamus was right.
>>
I will be posting kino btw
>>
File: ComfyUI_23940.png (3.38 MB, 1500x2000)
3.38 MB PNG
>>108789822
ZIM+ZIT.
>>
Anon?
>>
File: Ernie-Image-Turbo_00065_.png (1.51 MB, 1024x1024)
1.51 MB PNG
Ernie truly is almost there. All the model needs is a single tune to unslop itself.
>>
File: Ernie-Image_00021_.png (1.47 MB, 896x1200)
1.47 MB PNG
>>
>finetune will fix everything!
>>
It did for SD1.5 and SDXL thobeit
>>
but sd 1.5 and sdxl were good models
>>
I've spent last three and half hours trying to debug a fix for the grid patterns in HiDream-O1-Image (full), but alas I failed.
If I must say one nice thing about hidream o1, it's that the per iteration speed is reasonably fast for 8B.
I am running it on a 3060, and a 1024p image takes 2 minutes for 50 steps, with partial system memory offloading. Triple that for 2048p. I get its 2048p speeds with 1024p under ZIM, Ernie etc.
The quality is fucking ass regardless of grid though. Maybe there is something wrong with the script no one figured out yet, or maybe all the images we see are from the 200B API model and they straight up lied about the benches of the local variant. (I don't think you can even benchmeme this garbage)
Anyway I go to bed a disappointed, sorry man.
It runs with fine speed flash attention disabled btw, the repo is lying to you.
>>
File: Ernie-Image_00025_.png (1.66 MB, 896x1200)
1.66 MB PNG
>>
>>108789866
im waiting
>>
>>108790246
My one remaining cope is that they would have had to brazenly cheat for the 8gb version to have done so well in the benchmarks. Maybe there really is some stupid mistake somewhere. Or maybe they really are that brazen.
>>
File: Ernie-Image_00029_.png (1.47 MB, 896x1200)
1.47 MB PNG
>>108790227
I don't see why not
>>
File: t2i2.jpg (632 KB, 2048x2048)
632 KB JPG
>>108790246
Agree.
I was excited to see 2048 as the default resolution, but there is no real detail.

It is relatively fast though (25 seconds for 50 steps, 2048x2048).

I will try with the prompt enhancement next as some models improve clarity with more descriptive prompting.
>>
File: Ernie-Image_00032_.png (1.08 MB, 896x1200)
1.08 MB PNG
This model is extremely good with infographics/illustrations. It's also probably next level art.
>>
File: 1757124955578596.png (2.54 MB, 1328x2048)
2.54 MB PNG
>>
File: Ernie-Image_00038_.png (2.38 MB, 896x1200)
2.38 MB PNG
>>
can illu/noob not do a wig partially off someone else's head that shows the real hair color? can anima do it?

asking for a friend
>>
File: 1755781834434575.png (2.84 MB, 1328x2048)
2.84 MB PNG
>>108790410
anima probably could
>>
File: t2i3.jpg (581 KB, 2048x2048)
581 KB JPG
>>108790293
I tested using the prompt enhancement script from the repo (following the exact recommendation with gemma-4-31B-it at full precision). It roughly doubled the length of the prompt, but the image output is not very different.
>>
>>108790433
i guess ill have to try it. cant even find a lora for what im looking for, seems to be super niche.

that marceline is super weird looking, reminds me of homestuck
>>
File: 1747962718085903.png (859 KB, 1152x896)
859 KB PNG
>>108790433
https://danbooru.donmai.us/wiki_pages/hair_visible_through_wig
seems to work
some NLP would get you a better hit rate
>>
>>108790472
that tag is fighting me tooth and nail. if i do a blonde wig, it shows up as multicolored hair even with it in negative, if i make it a weird color like pink, the wig shows up as an object.
>>
File: 1758780638386566.png (692 KB, 832x1280)
692 KB PNG
>>
>>108790176
can you drop catbox? I fucked something up and didn't manage to make it work
>>
File: leona.gif (339 KB, 512x512)
339 KB GIF
Just learned how to use IP adapters.

chuds BTFO.
>>
>>108790531
Sure
https://files.catbox.moe/5etar5.png

Had to censor the booba for this post kek. I'm just using the standard template workflow with a touch of the turbo LoRA extract to speed up base gens https://civitai.com/models/2551262/ernie-turbo-lora-extracted.
>>
File: 1772123070723129.png (888 KB, 768x1344)
888 KB PNG
>>
File: debo_w_anima_00046_.png (3.05 MB, 2048x1117)
3.05 MB PNG
>>
>>108790619
>check the ernie model
>almost every gen is STILL 1girl, standing
kek
>>
>>108790722
What did you generate with it?
>>
>>108790619
based feeta
>>
>>108790736
nothing, ernie looks like shit and im not downloading dogshit. currently messing with anima
>>
File: Ernie-Image_00064_.png (1.7 MB, 1200x1024)
1.7 MB PNG
>>108790722
Come again? Civit users are uninspired jeets.
>>
>>108790775
5girls...SITTING?!?! AHHHHHHHHHHHH ERNIEMAN SAVE ME!!!!!!!!!!!!!!!!
>>
File: 1751616968366904.png (700 KB, 768x1344)
700 KB PNG
>>
File: 00224-793808769.png (2.1 MB, 1248x1824)
2.1 MB PNG
>>
File: 1749902053779863.gif (1.83 MB, 832x1280)
1.83 MB GIF
>>108790520
>>
1girl standing general
>>
>>108790252
Its current aesthetic lends itself well to sharp graphics. Not so much anything else tbdesu.
>>
File: 1762846654987640.gif (1.8 MB, 1152x896)
1.8 MB GIF
>>
>>108790775
Not bad, how are the generation times?
>>
>>108790775
give me the QRD on Ernie. Can it do realistic nsfw stuff? Is it easy to train?
>>
File: punk2.png (1.07 MB, 1024x1024)
1.07 MB PNG
>>
this fagollage smells of curry and shit
>>
why is everything shit and gay
>>
maybe it's you
>>
File: 1746969721819747.png (677 KB, 800x600)
677 KB PNG
>>
File: image.png (16 KB, 519x270)
16 KB PNG
>>108789589
I haven't updated ComfyUI in a while; does putting more than one character still suck compared to A1111?
I just want to prompt normally, no conditioning nods or any of that crap.
>>
>>108791397
the UI has no bearing on this
>>
File: file.png (3.15 MB, 1724x1122)
3.15 MB PNG
https://xcancel.com/ostrisai/status/2053256188142428341
>I am running my first test on training a HiDream-O1 LoRA on AI Toolkit. I don't want to get too excited too early. But this is the coolest model I have EVER seen. Super efficient pixel space. No VAE. No Text Encoder. Trains super fast. This is an industry changing innovation!
>>
>>108791433
Why would he post that image with that text? Should we not believe our lying eyes? The results are shit, nothing like a tarot card
>>
>>108791433
no thanks
>>
>>108790686
>>108790780
How do you make these? They look so cool.
>>
>>108790775
it's nice, but I guess no full nsfw right?
>>
>>108791429
How do I do what I want?
>>
File: file.png (28 KB, 1520x225)
28 KB PNG
>https://huggingface.co/HiDream-ai/HiDream-O1-Image
Anyone has been able to replicate the output of HiDream-O1-Image?
I feel like something is weird, are they shitting us and presenting the images from their non released 200B in the page of the open sourced 8B?
>>
>>108791569
type the prompt in the text box
the result will be the same, rng notwithstanding
>>
>>108791572
I couldn't get any good results, so subterfuge seems likely.
>>
>>108791070
Been out of the loop, what model makes gifs like this?
>>
>>108789622
>>108789623
someone kick nigbos cage?
>commences to post this spamfor a year
>>
>>108790775
holds up mirror
>>
File: 1761519392207582.jpg (778 KB, 2957x1857)
778 KB JPG
>>108791580
Thanks anon, I'm looking into their technical report too, they don't seem to make a difference on what model they're exactly using.
It's a bit annoying because it's clearly better for text too.

Example from their pdf:
(left is qwen 2512)

>Input prompt: An advertisement for XP Boost, a gaming hydrationdrink, featuring a young male gamer holding a can of the product in ahigh-tech gaming setup. In the foreground, a young male with curlybrown hair, appearing to be in his late teens or early twenties, wears ablack and green gaming headset and a black t-shirt with green accents.He holds a black and green can labeled "XP BOOST HYPER LIME" in hishand, extending it toward the viewer. The can features a lightning boltdesign and text indicating "ZERO CRASH FORMULA" and "16 FL OZ (473mL)". The background shows a dimly lit gaming environment with acomputer monitor displaying a blue circular graphic, a keyboard, and agaming chair with the XP Boost logo. The left side of the imagecontains large text reading "QUEUE UP. POWER ON." in white andgreen, with smaller text below stating "GAMING HYDRATION DRINKZERO CRASH FORMULA" and "Caffeine + electrolytes + B12, crafted forranked nights and tournament weekends." Below this, a green buttonlike graphic reads "LEVEL UP YOUR FOCUS." At the bottom, four iconsrepresent the product's benefits: "CLEAN ENERGY," "ELECTROLYTESHYDRATION," "B12 FOCUS," and "ZERO CRASH." The overall scene isframed with neon green lightning effects at the bottom.
>>
File: image-1-1.png (2.76 MB, 1055x1491)
2.76 MB PNG
Babe babe wake up!, Laxhar Lab has reverse engineered and open-sourced the trainer for SenseNova-U1, a leaked unreleased image generation model.
>Key highlights:
no VAE or separate Vision Encoder, making it a true end-to-end text + image model
8B parameters, making it fast and efficient compared to rivals like GLM-Image (16B)
Strong at understanding complex prompts and generating infographics

Trainer is live on Hugging Face: Link: >>108791657
>>
>>108791675
How good is the model at heavy text?
Can you give it the prompt from >>108791624?
>>
File: 4a.jpg (390 KB, 2048x2048)
390 KB JPG
>>108791675
>>108791692
Here is the output (full version)
>>
>>108791675
How good is it at 1girl and nsfw.
>>
>>108791723
>that hand
horrifying

text is pretty good though.
>>
>>108791576
You are too cryptic. What do you imply:
is it that breaking prompt into chunks doesn't work
or
Comfy breaks into chunks automatically when line break occurs and it's not effective
?
>>
>billion-dollar labs releasing non-stop garbage
>big russ out here releasing gemstone after gemstone at 1/10 the parameters
how does he do it?? or better yet, why can't they??
>>
File: TrueKleinV2_00022_.png (1.12 MB, 1280x720)
1.12 MB PNG
>>
>>108791723
Thanks anon.
The text is almost fine with spacing issues, but the logos are bad, and obviously anatomy is fucked up.
>>
File: makoto.png (1.2 MB, 1024x1024)
1.2 MB PNG
>>
>>108791765
its gemstone because its the only model animepoors can run
>>
>>108792092
where is your 2 bazillion parameters gem?
>>
File: ComfyUI_01630_.png (929 KB, 1024x1024)
929 KB PNG
>>
>>108791433
>from slop to slop
god damn it man why does nobody have eyes
>>
>>108792287
idk but it looks like a fast learner, by the 3rd sample it understood the style.
>>
>>108792313
If the style is "slop flat art" then sure. Looks nothing like a tarot card though
>>
>>108792374
it's ostris, the style is probably "slop flat art".
>>
>>108792386
Grifters gonna grift
>>
Why didn't you guys tell me anima is so much better than illustrious, I'm training styles and holy fuck, it really catches on.
>>
File: new api nodes.png (360 KB, 915x1178)
360 KB PNG
i missed this one, new api nodes dropped??
https://blog.comfy.org/p/luma-uni-1-is-now-available-via-partner
>>
>>108792467
it's not better than pony
>>
>>108792468
this one actually looks good unlike the localslop we've been getting. no wonder comfy stopped supporting local models, the shit we get is barely worth spending 5 minutes on. local fell off majorly
>>
sunday, fuddy sunday
>>
>>108791748
NTA but chunk shit does fuck all to help multiple characters.
You were already told what you needed to do for that.
This is my last (You)
>>
How do I convert a novelai prompt to work with anima?
>>
how's trellis 2 doing?
>>
>>108792523
What does your novelai prompt look like?
>>
>>108792621
give huge bob to girl and make sexy indian man
>>
File: 1775369432573122.png (352 KB, 1110x1412)
352 KB PNG
>>108792621
>>
>>108792689
>>108792621
https://files.catbox.moe/xkdiyj.png
>>
>>108792716
Not my gen btw. It's a shittedfag from /trash/ but the artstyle makes me diamonds and I wanna gen vanilla with it.
>>
yep that's not possible on local
>>
>>108792689
Set steps to 40
year2024 becomes year 2024
Undesired content is negatives
Move the ones with -2 in prompt to negatives
Remove {{}} and :: s from prompt
Remove underscores from prompt
Remove artist:, artist mixing is weak on anima but feel free to keep them.
>>
File: 1774315533193440.png (1.71 MB, 1024x1024)
1.71 MB PNG
>>108792732
No luck. Everything's coming out way too shiny and clean.
>>
>>108792885
I forgot to add
You need @ before artist tags
>>
File: 1751500316791397.png (1.63 MB, 1024x1024)
1.63 MB PNG
>>108792904
Yeah I forgot them too. Still coming out nothing like the original style unfortunately.
>>
File: 1767430383339237.png (1.48 MB, 1024x1024)
1.48 MB PNG
>>108792915
And this is with no weights on the artist tags
>>
Jokeal Confusion General
>>
File: 1754081749919326.gif (353 KB, 133x242)
353 KB GIF
>gpu crashed
>>
File: 1763575749120182.jpg (457 KB, 1232x1944)
457 KB JPG
>>
File: 1760473877898882.png (3.85 MB, 1128x2048)
3.85 MB PNG
sex with JKs
>>
zit sameface slop
>>
flux2 klein9b still the best for doing prompt+image(+image) instruction inputs? (with that qwen model roughly competitive with it). i see some talk of ernie and a very new hidream model itt?
>>
File: 1771803751451901.png (3.89 MB, 1328x1640)
3.89 MB PNG
>>108793032
i love my asian zitslop girls
>>
>>108793036
Where DreamShaperZIT?
>>
File: 1749847683619394.jpg (648 KB, 1328x1640)
648 KB JPG
>>108793042
its not 2024 anymore broski
>>
how do we stop anime slopstyle for good? anima keeps forgetting artists and is going full pony slop so it's not the answer. tired of all these failbakers shitting things up further
>>
>>108793054
35 stars status?
>>
>>108793056
and this helps because????
>>
File: 1772003227154172.png (114 KB, 384x416)
114 KB PNG
>>108792468
local is so fucking DEAD
>>108792287
>god damn it man why does nobody have eyes
turns out there are multiple kinds of ai psychosis
one develops after you've generated so much slop, that you cant distinguish slop from non slop anymore
>>
>>108793054
We need to reverse engineer nai
>>
>>108793054
make your own model
>>
File: 1751707429163570.jpg (658 KB, 1328x1640)
658 KB JPG
https://github.com/Comfy-Org/ComfyUI/pull/13817
cant wait to try this 'vaeless' model
>>
>>108793065
That "anon" is an insane tranny. Don't bother replying
>>
>>108793071
will you front the money? I promise I won't have pony score faggotry
>>
>>108793054
>anima keeps forgetting artists
Do you have any proof of this yet
>>
>>108793085
anima uses pony scoring?
>>
>>108793085
If the scores are bad, why don't you just not use it?
>>
>>108793036
wheres her dick?
>>
>>108793133
buried deep in your ass
>>
>>108793117
yes, the style priors are overwritten for score tag dropout training. It's why the art styles are degrading every preview before it goes 100% api
>>
>>108791675
>Laxhar Lab
They should get their heads out of their asses and train anima on their noob dataset
>>108793158
>still no proof
Next time you post please have some proof ready thank you anon
>>
>>108793158
>art styles are degrading
welcome to /ldg.
>>
>>108793170
sorry but saying there isn't any proof isn't viable if you have no proof yourself. make a side by side comparison with the same artist tags and see for yourself
>>
kill ani in real life
>>
>>108793196
You seem to not understand the burden of proof
>>
the only proof i need is that no matter how advanced models get, you knuckledraggers will make the same 1girl, standing slop you've been making since sd1.5
>>
>>108793170
cosmos has a gay corpo licence they don't want to touch. anima's shortcomings comes from njudea and Russ's own Jewish greed
>>
>>108793198
I accept your concession. anima has failed and is just a new pony model until an illustrious like model is released again without Jewish greed involved
>>
>>108793206
How is the license substantially different than Noobs
There's already other finetunes of Anima
>>
>>108793220
njudea can arbitrarily change it's licence and cuck everyone using it retard. It's a stipulation if the licence
>>
>>108793218
>still no proof
kekd
>>
>>108793229
cucked out of profiting off open source? fine by me
>>
bicker bicker bicker
>>
>>108793170
would rather they started pretraining for one of those vaeless meme models, doubt anima would get that much better
>>
apache2 anima status?
oh wait thats right it failed
>>
>>108793205
mona lisa is a 500 year old 1girl, standing gen people still talk about
>>
File: the raped.png (101 KB, 320x208)
101 KB PNG
>>
>>108793282
wood you, though?
>>
someone gen the mona lisa with cum on her massive forehead
>>
>>108793020
plap plap plap
>>
>>108793286
does comfy have working dynamic offloading yet? sdcpp does it way better
>>
File: 1750534034306921.png (2.33 MB, 1086x1448)
2.33 MB PNG
>steal millions of anime images from real artists to train your model
>release it under a license that forbids using your anima outputs to train competing models
when are localkeks finally gonna develop a sense of shame?
just leave real artists out of it and steal from us apichads instead. we don't mind, go ahead and distill our outputs all you want
but if you're gonna cry about licensing then keep your grubby hands off actual human work, you fucks
>>
File: zeta chroma.png (533 KB, 2100x6300)
533 KB PNG
8 hours ago lodestone updated training visualization and made the remaining part much shorter.
Is he finally giving up on it, lel
>>
>>108793410
why is the loss going up
https://upload.wikimedia.org/wikipedia/commons/thumb/1/18/Noto_Emoji_v2.034_1f480.svg/1280px-Noto_Emoji_v2.034_1f480.svg.png
>>
>>108793409
faggots eat anything up without thinking about the consequences like the pony model for example
>>
>>108793410
It used to be 2.25 (million I think) steps, now he seems to be planning to call it quits at 1.75
>>108793422
He changed the way it is calculated, for some reason.
>>
>>108793410
why can't this gigaretard just do a simple finetune
>>
>>108793444
we ask this all the time about tdrustled, ponydev and noob team
>>
>>108793197
meds
>>
>108793449
One of these is not like the others.
>>
>>108793229
>njudea can arbitrarily change it's licence
it literally can't thoughbeit, have you even read the nvidia cosmos license? there's basically no restrictions. it's intended to be commercially usable after all
>>
Cyberdyne Systems just released a new model!
>>
no fate but what we make
>>
ani was right about turdrussel
>>
File: 1769223506813118.png (1.32 MB, 1448x1086)
1.32 MB PNG
>>108793410
lmao, its completely fucked
leave the serious work to the professionals, such as sarah peterson and playtime_ai
>>
Anyway, I wonder what kind of surgery he will perform on Ernie.
That's what I think he is moving on to.
>>
File: pixel space snake oil.png (108 KB, 1094x651)
108 KB PNG
Not a single one of these models are any good btw. (expect tuna which is MIA since announcement)
Maybe it was always a ruse to distract local copers while API models secretly perfected the vae?
>>
>>108793471
the one whose model has x50 less downloads on civitai?
>>
>>108793612
the biggest reason the vae exists is because of gpulets. pixel space genning is too slow or intense for consumers which is why I don't have high hopes for people adopting this locally. Most API models don't use a vae
>>
>>108793649
Yeah I wasn't serious about the second sentence.
About speed I don't know.
Hidream is reasonably fast, but shit quality. GLM and LLaDA were slow and shit.
I don't know maybe fast and good pixel space on consumer hardware is possible. I am very skeptical that we are getting it though.
>>
>>108793687
the model would have to be pretty small and possibly distilled if you want the speeds maybe slightly better than DiT latent diffusion but I think DiT has run it's course for a while now. AR should be improved instead of benchmark chasing on the same years old tech
>>
>>108792479
People still use pony? I thought everyone was using illustrious and noob
>>
>>108793737
And still others are heterosexuals over the age of 17.
>>
File: _AnimaPreview3_00012_.jpg (338 KB, 1248x1608)
338 KB JPG
>>
>>108792468
took them a while
anon was shilling this two months ago >>108452832
>>
File: 00194-4018712778.png (1.2 MB, 1656x960)
1.2 MB PNG
>>
File: _AnimaPreview3_00024_.jpg (344 KB, 1248x1608)
344 KB JPG
>>
File: 00224-66164629.png (1016 KB, 1656x960)
1016 KB PNG
>>
just put the fries in the bag already big russ. how much more training could a 2b model need?
>>
What is it about competent people being rewarded for their effort that makes Pembroke's biggest rape victim seethe so hard?
>>
Skill takes a long time. Life is short.
>>
Big Russ just flew over my house and dropped a note which confirms Anima Full will be API only.
>>
File: _AnimaPreview3_00069_.jpg (500 KB, 1248x1608)
500 KB JPG
>>
>>108793970
Should have worn slippers.
>>
>>
File: zImageturbo_00006_.jpg (494 KB, 1672x1264)
494 KB JPG
>>
>>108794030
Thanks I hate it
>>
>https://civitai.red/images/127883693
holy shiet that prompt
>>
>>108794102
>join my discord saar
fuck off jeet, no one gives a fuck
>>
>>108794102
Kek, all of that for "white hair, white eyelashes, himecut, pale skin, face close-up"
>>
>>108794102
Just 600 words? I have seen people go far more schizo.
>>
>>108794102
What's wrong, you don't like boomer prompting?
>>
>>108794102
Is this jeet just taking real life photos and saying it's his gens?

That's nastya zhidkova.
>>
>>108794151
The eyes aren't even the same color, blind anon.
>>
>>108794151
He is a grifter but that's a ZIT gen and not a photo of zhidkova
>>
File: zImageturbo_00029_.jpg (688 KB, 1376x1832)
688 KB JPG
>>
>>108794131
he's selling a workflow so obviously he's bloating the prompt as much as possible to make it sound convoluted and special
reminds me of that short lived JSON prompt phase every jeet started doing to larp as a hacker or something
>>
>>108793073
waow, what model make this
>>
>>108794239
It's a FACT that if you parse your prompt through JSON the AI model appreciates you more. It doesn't do any better it just appreciates you more.
>>
>>108794030
jfc, why is this so awful?
>>
>>108794253
You don't like uncanny angelina jolie?
>>
>>108794114
uh oh melty
>>
File: zImageturbo_00038_.jpg (611 KB, 1376x1832)
611 KB JPG
>>
>>108794273
yea, and?
>>
>>108794281
random 1/4 hair just stuck to face for no reason
>>
File: 669086190813877.png (1.63 MB, 1248x1824)
1.63 MB PNG
>>
File: zImageturbo_00054_.jpg (701 KB, 1376x1832)
701 KB JPG
>>108794295
there has to be a reason
>>
>>108794374
and the reason is (Yooou)
>>
File: z-image_00046_.png (1.85 MB, 1472x1280)
1.85 MB PNG
jeets are really out here buying workflows? lmao
>>
File: zImageturbo_00066_.jpg (750 KB, 1376x1832)
750 KB JPG
>>
>>108794473
now do one of her without any makeup
>>
do artists really make posts about their work being stolen while they're downloading cracked photoshop and pirated art courses in the background?
>>
>>108794576
Yes, hypocrisy is an integral feature of the leftoid artoid brain.
>>
>>108794576
Great artists steal.
>>
>>108793409
I wish cloudcucks were a bit smarter. What a retarded post jej.
>>
File: x_wrn1vg.png (1.81 MB, 1024x2048)
1.81 MB PNG
>>
>>108793409
> 'real artists'
lmfao
>>
>>108791469
Here is the positive prompt for one of them, using Anima:

traditional media, binding discoloration, bleed through, crease, scan dust, painting \(medium\), canvas \(medium\), scan artifacts, magazine scan, scan, artbook, doujinshi, textless version, poster \(medium\), production art, novel illustration, non-web source, original, commission, @minuspal, @nirak, @wamudraws, oekaki, jaggy lines, huge breasts, hanging breasts, ass, wide hips, thighs, eyelashes, lips, lipstick, breasts, mature female, alternate breast size \(larger\), cleavage, curvy, toned, 1girl, dark magician girl, wizard hat, green eyes, blonde hair, bare shoulders, long hair, skirt from side, determined, armpit focus, walking , alley, urban, night, outdoors,

>>108791595
Any model can either via seed variation or multiple light i2i passes on a gen (with each pass serving as a frame). Those used the latter.The former has been an A1111 extension for quite some time.
>>
>>108793970
my sides
>>
>>108791595
https://github.com/FizzleDorf/Loopback-Wave-for-A1111-Webui
>>
File: 1749635541678350.png (1024 KB, 832x1280)
1024 KB PNG
>>
File: zImageturbo_00104_.jpg (736 KB, 1376x1832)
736 KB JPG
>>
>>108793035
>flux2 klein9b still the best for doing prompt+image(+image) instruction inputs?
It's at least well supported and looks reasonably good, but it has the same issues as other models of their sizes for small text making sense for example.

>i see some talk of ernie
I have no idea if it's good or not, I'm also curious about it.

>very new hidream model
This model makes no sense to me : the previews look very good on their paper and hf, but what people post trying to replicate them look worse than sd1.5.
So either they're cheating and showing images of their unreleased 200B model on the page of the 8B, or the support right now is very bad so whatever we're doing isn't working with it.
>>
File: ComfyUI_00320_.png (2.8 MB, 1344x1984)
2.8 MB PNG
I really miss forge's adetailer for faces. They look really messy without it. I just looked up Comfy's equivalent and it seems to be such a drag to set up. Wish me luck.
>>
File: 30248056.png (1.39 MB, 1216x832)
1.39 MB PNG
>>
File: zit_00018_.png (2.49 MB, 1536x1536)
2.49 MB PNG
>>
i give up on my dream lora. its not working out. i dont have photo real data for the concept so it keeps coming out with that clay sloppa look.
>>
File: 755274449140650.png (1.37 MB, 832x1216)
1.37 MB PNG
>>
File: zit_00016_.png (1.96 MB, 1440x1040)
1.96 MB PNG
>>
How come ernie won't take my anima latents like z image does? it just throws errors
>>108795897
What are you trying to make my friend?
>>
>>108796096
Ernie has flux 2 vae. It has higher channel count than anima/zit. You are trying to fit a square into a circle shaped hole.
Anima and zit vae are different btw despite having the same shape, so you are passing mostly gibberish latents to ZIT
You should be passing anima latents to a model with qwen vae, like qwen image.
>>
>>108796112
That sucks. I did use the anima with z refiner that got posted here a while back and it has been my favorite work flow in a good bit and was hoping to just put Ernie in as it is better at understanding things than z image
>>
a redditor just vibecoded a site that lets anyone create free AI videos using his gpus with no safety filter
is this a good idea?
https://www.reddit.com/r/StableDiffusion/comments/1t9juoy/i_built_a_site_to_create_free_ai_videos_using_ltx/
>>
>>108796122
could you share that wf? checked the archive but the link is dead
>>
>>108796096
>What are you trying to make my friend?
trying to make photo real fantasy races. but i cant seem to force it to understand the concept while still having realistic lighting and texture. if there were images of things that had really good make up like the orcs from lord of the rings i could use that, but what i am trying to make doesnt exist
>>
Women can smell my musk. They probably know I have an rdna2 card. If I were a woman I'd be dizzy.
>>
Wow did anyone see this?


The world 1girl supply at record lows, 10 days remaining
BBC
Mumbai, India
>>
>>108796346
oh no! Not the super censored talking head model that falls apart the moment you add any movement to the prompt?! It's so exploitable!
>>
>>108796893
wan 2.1 and ltx 2.3 are super censored?
>>
>>108796898
I've only used wan2.2 which isn't that censored. But without loras, ltx is
>>
i cant stop making images
i make them nearly every day
for hours and so on
its almost all i do
>>
>>108796949
Are you >>108796928 ?
>>
>>108795880
How many steps did you bake this? I imagine Base would be less resistant to her unique hair coloring DESU.
>>
>>108796904
Why do you guys keep saying this? LTX with NSFW loras is unironically better than WAN these days especially because it can do audio as well.
>>
>>108797047
Can't get it to work.
>Here's your 16 GB Workflow bro
>Doesn't work
>>
>>108797047
not really. It breaks cohesion pretty easily. You can quickly test this. Faces change, skin changes, anatomy warps with any kind of movement. The nsfw loras sort of work but not really. I'm not saying wan22 is much better but what I find that works is making a wan video in 5 second sections, stitching it together, and then adding audio to it with ltx, using 16 frames or so as keys to maintain coherence. It's pretty laborious and you have to get lucky with the wan SVI workflow but I think it's the best you can do

Also, ltx text to video is complete shit. I don't think anyone uses it as a t2v model. Wan on the other hand can deliver decent t2v
>>
>>108797074
i feel like the whole is greater than the sum of the parts for ltx. like in a vacuum it does most things worse than wan, but as a total package it wins out. at least for me.
>>
so anima preview 4 tomorrow, right?
>>
>>108797074
you can use klein9b edit model at 0.5 denoise on last frame with only improve the image keep composition and style the same and that works quite well. however it sucks at complex actions like repositioning actors.

wan 2.2 is still better by miles yeah. the sound is missing but most ltx lora's generate shitty sound's anyway.
>>
File: this.png (9 KB, 285x172)
9 KB PNG
>>108797074
>Wan on the other hand can deliver decent t2v
this and wan i2v model also does t2v if you feed it just a blank white image using node in pic related and then a decent prompt and character model. For prompt i just use llama.cpp and Qwen3-4B-abliterated_dark.Q8_0.gguf with -ngl 0 so that none of the layers are loaded on the vram and then i can safely have that running in a terminal with out getting oom. hours of fun, hours, i just worry my dick will fall off.
>>
>>108797389
>image

ahahahah you can't use that with ace step.
>>
>>108797389
I use https://github.com/FranckyB/ComfyUI-Prompt-Manager
it uses llamacpp to load the qwen gguf as a node and spits out the prompt in the workflow
>>
>>108797444
yeah i tried that but I get problems with it crashing after a while, using my own llama on cli i get better control to how it loads the llm.

llama-cli -c 32000 --flash-attn on --no-mmap -rea off -ngl 0 --threads 4 -m /path/to/model

32000 context is probably over kill mind, /clear clears the context and lets you start a fresh prompt, just need to get it to first engineer a prompt template for specifically generation of good wan prompts. Then can just leave the thing running in the terminal and copy paste what you need without it causing an OOM for as long as you have enough ram/swap then it will run albeit a little slow compared with using the GPU. i'm only getting 2.6 t/s but it don't matter much :)

llama.cpp compile command
cmake -B build -DGGML_CUDA=ON -DGGML_BLAS=ON -DGGML_BLAS_VENDOR=OpenBLAS

ComfyUI-Prompt-Manager would be better if it was able to just connect to the users already running llama-server process.
>>
>>108794374
I recognize this slut
where's the disgusted face looking at viewer tho?
>>
>>108797596
Well that's because he's wanking. I guess.
>>
ComfyUI 0.21 is out, with AMD support for --enable-dynamic-vram. I tried it out, but it still can't handle OOM gracefully on an iGPU laptop. Attempting untiled VAE decode with too big an image still freezes Windows for several minutes, and the max safe resolution is lower than with --disable-smart-memory, i.e. about 1600x1024 instead of 1600x1280 with SDXL.

If/when it snaps out of the freeze, it does fall back to tiled decode, but the right behavior should be to see the oncoming memory ceiling BEFORE smashing headfirst into it. It's overestimating the available memory somehow.
>>
>>108797616
(Freezing also happens in the default mode with 1600x1280, just to be clear. It didn't used to. I was just hoping that dynamic vram would fix it.)
>>
>>108797628
igpus don't have dedicated vram. your best bet would be increasing the page file and hoping for the best.
>>
https://files.catbox.moe/209htl.wav
>>
>>108797616
>wanting to do ai stuff with iGPU
>amd at that
LMAO bro, maybe you would've had some argument if you were talking about the strix, but the vram requirements would barely use half of its unified ram anyway
>>
>>108797640
Use mp3 next time, you're wasting catbox's bandwidth unnecessarily.
>>
>>108797616
just use tiled vae decode nodes instead of waiting for the fallback.
>>
>>108797638
AFAICT it IS hammering the pagefile when it freezes, which is how it sometimes recovers after a while. I think it's measuring the available physical RAM wrong, maybe double-counting the shared portion or something.

>>108797652
It's not the fastest, but I can run some stuff decently. I just wish it'd detect OOM more gracefully so my options aren't jamming the power button or letting it rape the SSD for 30 minutes while hoping it unfreezes.

>>108797696
Or I could do that, yeah. But it's the principle of the matter, the airbag don't werk.
>>
>>108797727
>Dynamic vram is the new ComfyUI memory optimization that should massively reduce ram usage and generally speed things up on Nvidia hardware on Windows and Linux.
>on Nvidia hardware
Not for you (or me)
>>
>>108797652
> wanting to do ai stuff with iGPU
iGPUs is the future of local >>108795908
>>
anyone had luck setting up comfy on newest fedora 44?
>>
>>108797750
They just added AMD support with this release, though it's not enabled by default.
>>
>>108797761
make a pyenv
make a venv
install
>>
File: 1752422164266101.jpg (484 KB, 1536x1536)
484 KB JPG
>>
>>108797778
> pyenv
> venv
> 2026
> what is uv
>>
>>108797757
>Intel
I'd go "bruh", but Comfy recently added portable builds for Intel, so it's looking readier-to-use lately.
>>
>amdshits and intelshits are getting more support
I hate this, how do I justify buying a 5090 now???????
>>
>>108797816
You can gen in seconds instead of minutes, and train in hours instead of... days? A week? I haven't actually dared try yet.
>>
>>108797816
cuda is still king for the foreseeable future. AMD is controlled opposition and intel is intel
>>
anima preview 4 isn't coming out because russ finally realized that preview 3 was an actual fried downgrade and it only got worse from there
>>
>>108798051
>>108793056
>>
>>108798153
you think you can downplay every anima criticism by mentioning a single faggot schizo?
>>
man imagine running sdxl shit in 2026 lmao, stop being poor
>>
>>108797789
>uv
lol, lmao even
>>
Can I make is so my workflow does 4 runs of model 1 before running them all in model 2 in a simple way?
>>
>>108798336
you can do anything you imagine with comfy, anon. Think of yourself as Johnny Depp in one of his magical movies.
>>
love me some comfy
>>
i will never stop using sdxl, it's the best
>>
just deleted all my 600gb of sdxl models and loras
>>
>>108798461
based
>>
where is anima v4
>>
there will be no further updates to anima
>>
I've been gooning non stop since I got a nice LTX2.3 eros workflow going for my setup
>>
>>108798518
and you course you will provide no proof because you're a pussy. why even bother shitting up this board then
>>
can anyone give me some prompts for sex in ltx 2.3? The ones you gave me for wan2.2 were amazing back in the day.
just give me something to throw into wan2gp so i can copy the settings. thank!
>>
>>108798518
so you make us read your gay masturbation diary AND not catbox the "nice" workflow? fuck you, buddy
>>
>>108798640
lmao stay mad
>>
>>108796886
My github stars are stuck in Hormuz.
>>
>>108798666
*cums on you*
>>
File: 984159710541203.png (1012 KB, 832x1216)
1012 KB PNG
>>
>>108790392
>My BRAPs would kill you traveler
>>
>>108791802
legendary status
>>
/r/ing' sfw VAGEEN
>>
File: Untitled.jpg (189 KB, 1303x905)
189 KB JPG
>>108798982
Man toes, man foot silhouette shape. Only men have a big toe that prominent and such bony feet. I’m even showing you photos of real women because your gen is rotoscoped slop.
>>
>>108799315
Big if true. No model can be considered good unless it's trained with "male feet" and "female feet". A real artist can draw a man with girly feet or vice versa, and generated images must allow that as well.
>>
>>108798982
"Feet are the second face of a character" or "Feet are the face of the lower body," say good anime illustrators.
Your gen's feet just followed the logic of the tights and calves without getting the consideration they deserve.
This wouldn't have happened if you'd consulted >>>/g/adt/ first. We emphasize this stuff because of the avalanche of "politically correct feet" in new anime models.
>>
>>108799430
meds
>>
Has Flux2 Klein been surpass yet for edits?
>>
>>108799430
this
wrong feet is not okay, and most importantly, not ldg
some people enjoy trolling by posting illogical feet in order to damage our reputation. don't give them the satisfaction of leading you astray
>>
why is inpaintign on comfy still so shitty compared to Auto1111 after all those years
>>
>>108799587
Nah.
>>
>>108799603
comfy only cares about API bucks anon
he betrayed us
>>
>>108799587
Yah.
>>
>>108799603
why inpiainting when you can mask it and API node the fuck out uf it?



[Advertise on 4chan]

Delete Post: [File Only] Style:
[Disable Mobile View / Use Desktop Site]

[Enable Mobile View / Use Mobile Site]

All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.