/g/ - Technology



Discussion and Development of Local Image and Video Models

Previous: >>108585019

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/tdrussell/diffusion-pipe

>Z
https://huggingface.co/Tongyi-MAI/Z-Image
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo

>Anima
https://huggingface.co/circlestone-labs/Anima
https://tagexplorer.github.io/

>Qwen
https://huggingface.co/collections/Qwen/qwen-image

>Klein
https://huggingface.co/collections/black-forest-labs/flux2

>LTX-2
https://huggingface.co/Lightricks/LTX-2

>Wan
https://github.com/Wan-Video/Wan2.2

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Illustrious
https://rentry.org/comfyui_guide_1girl

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Collage: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
>>
>mfw Resource news

04/12/2026

>LTX-2 VBVR LoRA - Video Reasoning
https://huggingface.co/LiconStudio/Ltx2.3-VBVR-lora-I2V

04/11/2026

>ComfyUI-RookieUI: The ultimate A1111-style sidebar
https://github.com/rookiestar28/ComfyUI-RookieUI

>Qwen3.5-4B-Base-ZitGen-V1: Image captioning fine-tune of Qwen 3.5 4B optimized for Z-Image Turbo
https://huggingface.co/lolzinventor/Qwen3.5-4B-Base-ZitGen-V1

>ComfyUI Memory Visualization
https://github.com/kijai/ComfyUI-MemoryVisualization

04/10/2026

>JoyAI-Image-Edit now supports ComfyUI
https://github.com/jd-opensource/JoyAI-Image#-news

>Two Front Doors: Civitai.com, Civitai.red, and What's Next
https://civitai.com/articles/28369/two-front-doors-civitaicom-civitaired-and-whats-next

>Uni-ViGU: Towards Unified Video Generation and Understanding via A Diffusion-Based Video Generator
https://fr0zencrane.github.io/uni-vigu-page

>PrivFedTalk: Privacy-Aware Federated Diffusion with Identity-Stable Adapters for Personalized Talking-Head Generation
https://github.com/mazumdarsoumya/PrivFedTalk

>AVGen-Bench: A Task-Driven Benchmark for Multi-Granular Evaluation of Text-to-Audio-Video Generation
http://aka.ms/avgenbench

>Cross-Modal Emotion Transfer for Emotion Editing in Talking Face Video
https://chanhyeok-choi.github.io/C-MET

>ChenkinNoob-XL-V0.5
https://modelscope.ai/models/ChenkinNoob/ChenkinNoob-XL-V0.5

>Control Order & Free Memory: Controls the order of node execution with device-agnostic memory management
https://github.com/mkim87404/ComfyUI-ControlOrder-FreeMemory

>DMax: Aggressive Parallel Decoding for dLLMs
https://github.com/czg1225/DMax

04/09/2026

>MAR-GRPO: Stabilized GRPO for AR-diffusion Hybrid Image Generation
https://github.com/AMAP-ML/mar-grpo

>HybridScorer: Score, sort, and cut large sets down fast with GPU-accelerated AI review
https://github.com/vangel76/HybridScorer

04/08/2026

>OrthoFuse: Training-free Riemannian Fusion of Orthogonal Style-Concept Adapters
https://github.com/ControlGenAI/OrthoFuse
>>
>mfw Research news

04/12/2026

>Preserving Forgery Artifacts: AI-Generated Video Detection at Native Scale
https://arxiv.org/abs/2604.04634

>Generative Photomosaic with Structure-Aligned and Personalized Diffusion
https://robot0321.github.io/GenerativePhotomosaic/index.html

>DiffVC: Non-AR Framework Based on Diffusion Model for Video Captioning
https://arxiv.org/abs/2604.08084

>HandDreamer: Zero-Shot Text to 3D Hand Model Generation using Corrective Hand Shape Guidance
https://arxiv.org/abs/2604.04425

>BiTDiff: Fine-Grained 3D Conducting Motion Generation via BiMamba-Transformer Diffusion
https://arxiv.org/abs/2604.04395

>Image-Guided Geometric Stylization of 3D Meshes
https://changwoonchoi.github.io/GeoStyle

>Rethinking Position Embedding as a Context Controller for Multi-Reference and Multi-Shot VidGen
https://arxiv.org/abs/2604.03738

>SurFITR: A Dataset for Surveillance Image Forgery Detection and Localisation
https://arxiv.org/abs/2604.07101

>HEDGE: Heterogeneous Ensemble for Detection of AI-GEnerated Images in the Wild
https://arxiv.org/abs/2604.03555

>ABMAMBA: Multimodal Large Language Model with Aligned Hierarchical Bidirectional Scan for Efficient Video Captioning
https://arxiv.org/abs/2604.08050

>FIT: Large-Scale Dataset for Fit-Aware VTON
https://johannakarras.github.io/FIT

>HAWK: Head Importance-Aware Visual Token Pruning in Multimodal Models
https://github.com/peppery77/HAWK.git

>IQ-LUT: interpolated and quantized LUT for efficient image super-resolution
https://arxiv.org/abs/2604.07000

>TC-AE: Unlocking Token Capacity for Deep Compression Autoencoders
https://arxiv.org/abs/2604.07340

>ResGuard: Enhancing Robustness Against Known Original Attacks in Deep Watermarking
https://arxiv.org/abs/2604.03693

>Appear2Meaning: Cross-Cultural Benchmark for Structured Cultural Metadata Inference from Images
https://arxiv.org/abs/2604.07338

>PortraitCraft: Benchmark for Portrait Composition Understanding and Generation
https://arxiv.org/abs/2604.03611
>>
>>108586449
>https://civitai.com/models/2536147?modelVersionId=2850290
Has anon tried training an Anima LoRA using the official configs?
>>
File: o_00195_.png (1.45 MB, 1280x768)
>>
Baidu Edition
>>
File: Z-Image_00028_.png (1.16 MB, 722x1131)
>>108590677
i vibe coded a sampler that uses gemma 4 as a judge. if the images come out as total slop it auto regenerates them with a different seed
soon ill have gemma also change the prompt and sampling settings depending on whats consistently messing up the outputs
>>108590680
cumfart is decent for some quick gens but once you throw in too many moving parts it shits the bed. its failing to keep up with all the stuff we can do now
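For anyone wanting to try the judge-and-regenerate idea from the post above, a minimal sketch might look like this. `generate` and `judge` are hypothetical callables standing in for the sampler and the Gemma check; neither is a real ComfyUI or Gemma API, and the anon's actual implementation was not posted.

```python
import random
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class GenResult:
    image: Any       # whatever the (hypothetical) sampler returns
    seed: int        # seed used for the returned image
    attempts: int    # how many generations were run
    accepted: bool   # True if the judge approved the returned image


def generate_until_accepted(
    generate: Callable[[str, int], Any],  # hypothetical: (prompt, seed) -> image
    judge: Callable[[Any], bool],         # hypothetical: image -> True if not slop
    prompt: str,
    max_attempts: int = 4,
    rng: Optional[random.Random] = None,
) -> GenResult:
    """Regenerate with a fresh random seed until the judge accepts the image.

    If every attempt is rejected, the last image is returned with accepted=False
    so the caller can decide what to do with it.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    rng = rng or random.Random()
    image, seed = None, -1
    for attempt in range(1, max_attempts + 1):
        seed = rng.randrange(2**32)       # new seed on every retry
        image = generate(prompt, seed)
        if judge(image):
            return GenResult(image, seed, attempt, True)
    return GenResult(image, seed, max_attempts, False)
```

Capping the attempts matters because a judge that rejects everything would otherwise loop forever. The prompt and sampler changes mentioned as a next step would slot in between retries.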
>>
File: Anima_01723_.png (937 KB, 896x1152)
>>108590830
I might try soon. I will be busy this week but maybe I will spare time.
>>
File: 00053-2467153369.png (1.13 MB, 1312x832)
>>
>>108590854
Which Gemma 4, 31b?
I wasn't too impressed with 26b moe's visual reasoning. I doubt 31b is that much better.
I am also very skeptical that it will handle intricacies of sampling settings well. Hell even SOTA API LLMs aren't particularly great about reasoning when it comes to that.
>>
File: Anima_01760_.png (2.05 MB, 1024x1024)
It just collapsed I guess.
>>
File: Anima_01761_.png (1.2 MB, 1024x1024)
I showed you my p̶̨̢̹̻̣̤͚̭͔̮̺̃̒̇̍̇̐̃̃̉̀̐̇̀̐̈́̔́̈́̈́̐̂̌͒̏͂͛̕͘̚͝ͅę̶̨͔̜͈̞̠̬̩̣̭̖͔̼̝̖̦̭͚̥̺͗̿͗̔̚ǹ̴̨̧̦̱̯͈̟̝̘̤̜͍̼̱̼͕̩͎̬͉̝͔̣͔̘͆̌̍͋̃̉̉̏͑̀͒̏͌́̚̕̕͜͜͝ͅį̷̛̛͙̬̥̠̼̹͇̙̙͖̖͍͙͇̔̏̿͑̌̐́̆̋̈̀̾͂̄̊̃͛̂́̆̏̔̊͗̄̿͗̽̀̀͌̿͑̇͠͠s̷̠̞̪̥̤͈̣͔̠͍̯̓̎̌͗͆́̆̆͐͛̍̈̀̈̉́̈̽̉̔͊̍̿̕̕̕͝͝ ,please respond
>>
File: Anima_01765_.png (2.11 MB, 1024x1024)
>>
File: 1764947232216650.png (686 KB, 1469x877)
>>108590887
im using 26b and its good enough at detecting bad images which is all i need it to do right now
you might be right, it would probably be better to just have it inpaint the areas it flags as problematic instead of playing whackamole with the sampling settings
>>
>>108590830
I’d like to, but I don’t know where to get high quality images of the artists I want for free. Maybe RuTracker?
>>108590827
It doesn’t make sense to me anymore with models like Anima or Z Image, plus SAM3. As a former Invoke user and current Krita user, I don’t see the need. Maybe just something to clean between SAM3 layers, since there’s always residue, and the option to choose the mask context. I see it more as a layer mask manager, a tab to centralize outputs and manage, blend, and crop them, but Krita output nodes already exist for that.
>>
>>108591054
cute personality on that bot
what frontend is that? does it automatically render the generated html?
>>
>>108591071
>I don’t know where to get high quality images of the artists I want for free.
danbooru
gelbooru
rule34 dot xxx
paheal
xitter
pixiv
artstation
Most images here should be high quality enough for 1024p training.
>>
File: anima_selfie1.png (1.23 MB, 896x1152)
>The model is designed for making illustrations and artistic images, and will not work well at realism.
Bullshit. I don't know what kind of undisclosed secret sauce is in the training data, but there ain't no way a model trained only on drawings can do this.
>>
>>108591129
>I don't know what kind of undisclosed secret sauce is in the training data,
ye-pop dataset
>>
File: o_00199_.png (600 KB, 896x1152)
>>
File: anima_selfie2.png (1.13 MB, 896x1152)
>>108591129
Seriously kekstone needs to stop the schizo Chroma2 experiments that are going nowhere, and just finetune Anima on the Chroma dataset. The model is like 95% of the way there already, it would take 2 epochs, he could have it done in less than a month.
>>
>>108591129
>undisclosed secret sauce
tdrussell openly stated that ye-pop is in it. He also added a regularization dataset in preview 2 to ensure less forgetting of the base model's realism knowledge.
>>108591142
His actions will never make sense.
>>
>>108591129
>no metadata
>not default cumfy or forge naming convention
>>
isn't ye-pop just midjourney style illustrations or something?
>>
>>108591153
I know it has ye-pop, but it says he filtered the photos out of it. Unless that part is a lie, or the filtering was bad and let a lot of realism through. I guess a regularization dataset implies realism knowledge so maybe that's how.
>>
>>108591160
>implying I'm lying about what model this is
Positive: Amateur photography. High quality candid photo of a young Asian woman with long, straight black hair, taking a selfie in a messy bedroom. The photo is taken from a slightly elevated angle, with her outstretched arm in the frame. The woman is wearing a Japanese schoolgirl sailor-style uniform. A bed and part of a window are visible in the background.

Negative: worst quality, low quality, score_1, score_2, score_3, artist name, anime, illustration, cartoon, blurry

CFG 5, er_sde, beta57 scheduler. No catbox, fuck you, type it in and try it yourself.
>>
File: ComfyUI_temp_axxiq_00001_.png (2.84 MB, 1104x1824)
>>
>>108591164
>midjourney style illustrations
No it's more interesting. They are images from LAION that are conceptually similar to a dataset from MJ.
>4.25 million Midjourney images were downloaded from this huggingface repository, and CLIP L14 vectors were generated for each image. Using the k-means clustering method, these vectors were assigned to 10,000 centroids. The CLIP vectors of these centroids were then used to retrieve nearest neighbors from the LAION-5B dataset using the image search website, focusing on those with aesthetic values of at least 0.5 and a minimum resolution of 768 pixels on the shortest side.
So not MJ gens but rather real images that are like MJ gens.
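The recipe in that quote (embed the style images, cluster, then pull nearest neighbours from a filtered pool) can be sketched at toy scale like this. It is only an illustration under assumptions: plain NumPy arrays stand in for CLIP L14 vectors, a brute-force cosine search stands in for the LAION-5B image search site, and the function names are made up for this example. Only the 0.5 aesthetic and 768 px thresholds come from the quote.

```python
import numpy as np


def l2_normalize(x: np.ndarray) -> np.ndarray:
    """Scale each row to unit length so dot products become cosine similarity."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def kmeans(vectors: np.ndarray, k: int, iters: int = 20, seed: int = 0) -> np.ndarray:
    """Plain Lloyd's k-means; returns the k centroids."""
    rng = np.random.default_rng(seed)
    centroids = vectors[rng.choice(len(vectors), size=k, replace=False)].copy()
    for _ in range(iters):
        # Squared distance from every vector to every centroid, shape (n, k).
        dists = ((vectors[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = vectors[labels == j]
            if len(members):              # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    return centroids


def retrieve(centroids, pool_vecs, aesthetic, short_side,
             top_n=2, min_aesthetic=0.5, min_side=768):
    """Indices of pool images nearest to any centroid, after quality filtering."""
    eligible = np.flatnonzero((aesthetic >= min_aesthetic) & (short_side >= min_side))
    sims = l2_normalize(centroids) @ l2_normalize(pool_vecs[eligible]).T  # (k, m)
    top = np.argsort(-sims, axis=1)[:, :top_n]  # best matches per centroid
    return np.unique(eligible[top])
```

With 10,000 centroids and a pool the size of LAION-5B the brute-force search would be replaced by an approximate nearest-neighbour index, but the filtering logic stays the same.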
>>
>>108591160
>he still cant into realistic anima
lel
>>
>>108591071
>I’d like to, but I don’t know where to get high quality images of the artists I want for free.
You might be surprised to know that even on this very site there are threads that contain hundreds of high quality images. There's even an entire board for it.
>>
>>108591071
>I’d like to, but I don’t know where to get high quality images of the artists I want for free. Maybe RuTracker?
kemono dot cr
sadpanda
>>
>>108591175
>but it says he filtered the photos out of it.
Forgot about that part. I guess enough slipped in then.
>I guess a regularization dataset implies realism knowledge
Yes.
>A regularization dataset is introduced to improve natural language comprehension and help preserve non-anime knowledge.
>>
File: ComfyUI_temp_axxiq_00002_.png (2.19 MB, 1200x1600)
>>
>>108591310
yummy yummy aquatic bugs
>>
new chinese image model wen
>>
>>108591310
prawn-ography
>>
File: Video_00004.mp4 (2.22 MB, 1056x1376)
I'm stumped at latent upscaling. It does not want me to go over 4 steps at all for high noise, no matter the scheduler or sampler. Any higher and it gets blown the fuck out, and blurrier the more steps you add. Lowering the cfg does nothing, and it's not NAG or the light loras. Claude is also stumped, hallucinating with each new chat.
>>
File: o_00210_.png (1.39 MB, 1280x768)
>>
File: 00187-1497584230.png (1.32 MB, 1312x832)
>>
File: 10089453838992.png (1.12 MB, 1728x960)
>>
File: Anima_01766_.png (1.82 MB, 1024x1024)
>>108591345
I am also curious what happened to NucleusMoE. Its diffusers PR was finally merged a week ago, yet it's still MIA.
>>
File: ComfyUI_temp_padlz_00003_.png (2.47 MB, 1248x1728)
>>
File: 377283825433227.png (1.26 MB, 960x1728)
>>
File: ComfyUI_temp_elyfr_00005_.png (3.13 MB, 1248x1728)
>>
cozy
>>
File: Anima_01767_.png (1.48 MB, 1024x1024)
>>
File: 544104445879965.png (1.71 MB, 1024x1536)
>>
File: 671883817003026.png (1.19 MB, 1024x1536)
>>
File: 00211-3265597026.png (1.74 MB, 840x1208)
>>
File: ComfyUI_temp_padlz_00009_.png (2.69 MB, 1248x1728)
>>
File: ComfyUI_temp_padlz_00010_.png (3.02 MB, 1152x2016)
>>
>>108591614
Nice polydactyl
>>
are both flux klein and z-image turbo just completely incapable of heavy clothed proportions? all these big booba loras just make the outfits rip apart and completely decimate prompt adherence/ image quality.
>>
File: ComfyUI_temp_padlz_00013_.png (3.27 MB, 1152x2016)
>>
File: ComfyUI_temp_padlz_00016_.png (3.68 MB, 1152x2016)
>>
>>108591614
>>108591631
>>108591722
>>108591807
She would never give me the time of day :(
>>
>>108591720
nevermind, klein's boob slider just needs stupid heavy weights. https://civitai.com/models/2318168/the-breast-slider-klein-edition?modelVersionId=2691652

can't wait for an Anima realism finetune, then i won't need sliders and shit to do heavy proportions.
>>
@gork this is true? >>>/h/8859488
>>
File: 3467824724727.jpg (2.16 MB, 2048x2048)
Where is the funding for anima... I need more epochs...
>>
>>108591862
They’re saying you need to use Linux to train Anima loras >>>/h/8859489
>>
go back fag
>>
>>108591878
Where? If my anime model dev doesn’t post anywhere except here, where do you want me to move?
>>
>>108591894
lurk, do not post
>>
>>108591878
>>108591905
You've done him
>>
>>108591905
Fuck off, this is what happens when you let Comfy pull the strings and fund his /ldg/ buddy. Now you’re having all the anime posters from 4chan seething here.
>>
File: 1758237517589007.png (2.18 MB, 1248x1824)
I recognize the artist style but I can't remember the name
>>
File: Anima_01772_.png (1.03 MB, 1024x1024)
Just post schizo 1girls and not get into drama with anyone
>>
>>
File: o_00215_.png (1.05 MB, 1280x768)
>>
File: 00314-1877534722.png (1.7 MB, 1472x848)
>>
File: ComfyUI_09284_.png (798 KB, 1024x1024)
>>
>>108591345
>new chinese image model wen
soon
https://github.com/Comfy-Org/ComfyUI/pull/13369
>>
https://strawpoll.com/B2ZB9rDajgJ
Anima got accepted really quickly.
>>
File: o_00217_.png (950 KB, 1280x768)
>>
>>108592001
That's a sincerely cool gen. Prompt?
>>
>>108592020
poorly designed poll
>>
File: o_00219_.png (1.1 MB, 1280x768)
https://strawpoll.com/2ayLQ03azn4
important
>>
>>108592078
Unserious poll I agree but anima and sdxl are roughly equal if you add all sdxls together
>>
File: 1747230948853958.png (2.85 MB, 1072x1880)
>>
>>108592099
>if you add all sdxls together
i dont want to imagine how slopped that would be
>>
File: o_00220_.png (1.26 MB, 1280x768)
>>
>>108592116
the merge to rule all merges
>>
>>108592106
Cool atmosphere in this one.
I can hear this. https://www.youtube.com/watch?v=M62pYatbyHo
>>
File: 04662-1325181679.png (1.64 MB, 1536x1792)
Bottom left in the OP image
>what model pleaaaase!
>>
>>108592062
>That's a sincerely cool gen. Prompt?

space art by Chesley Bonestell , abstract expressionism, A line of detailed, embedded within dark, circular architectural elements, receding into the distance on an alien landscape under a vast, black sky with a distant Mars-like planet. The style is stark, surreal, and monochromatic, evoking a sense of cosmic horror and desolation. Dramatic chiaroscuro lighting casts deep shadows, emphasizing the texture , uneven surface of the lunar-like ground. The composition uses a low-angle, wide-angle perspective, drawing the viewer's eye along the unsettling procession The mood is somber, mysterious, and foreboding. Jupiter is in the starry sky.
Grainy, image that emphasizes texture and mood over technical polish
Fragmented, composition and unexpected cropping conveying immediacy and voyeurism
Focus on life, marginal figures, , decay, and anonymous moments — an exploration of modernity’s raw edges
Snapshot aesthetic with a spontaneous, confrontational energy; often serial and diaristic in presentation
style of Nobuyoshi Araki
style of Shomei Tomatsu
style of William Klein
style of Helen Levitt
style of Garry Winogrand
style of Nan Goldin
style of Anders Petersen
style of Seiichi Furuya
style of Masahisa Fukase
Steps: 6, Sampler: Euler, Schedule type: Simple, CFG scale: 1, Seed: 2427773870, Size: 1472x848, Model hash: 4038c907c8, Model: flux1-schnell, Version: f2.0.1v1.10.1-previous-669-gdfdcbab6, Module 1: ae, Module 2: clip_l, Module 3: t5xxl_fp16
>>
>>
File: miraclein.png (27 KB, 774x317)
ah so this is why every realistic model after sdxl fucking sucks cock
realismsloppers are just brain dead
>>
>>108592420
Anima
>>
>>108592433
I doubt it knows all these styles but cool gen for flux schnell. Thanks.
>>
File: where is qwen 2.png (40 KB, 784x426)
>>
>>108592517
let it go anon, it's over
>>
>>108592468
what is this retard yapping about when zit exists?
>>
File: pixel-0000-3722465621.png (1.84 MB, 1792x2304)
>>
>>108592529
>z image tardbo
and i bet you still use shift settings too, or even more retardedly, gen on base.
>>
>>108592478
>anima
No way! I didn't think anima was good for anything. Well, except for anime of course.
But I guess it provides a good foundation for other styles as well
>>
>>108592517
alibaba should just focus on qwen 4 and distilling seedance 2.0 as wan 3
>>108592539
even st floyd while maxxed out on fentanyl would not fall for the apparent insinuation that the best out of the box model for realism is somehow bad for realism
>>
>>108592546
>except for anime of course.
well, that's what the name "Anima" implies yes
>>
>>108591973
>Emma!
<3
>>
File: lady liberty 2.png (1.87 MB, 1024x1024)
A fine day in America.
>>
File: pixel-0001-2881986979.png (67 KB, 1792x2304)
>>
>>108592569
It's not from ᾰ̓́νεμος (anemos)? Which in Greek philosophy symbolizes concepts such as breath and spirit.
>>
>>108592569
Good to know that it can churn out some bona fide concept art though. I'm guessing with the appropriate LoRAs of course?
>>
>>108592614
its both
>>
>>108592636
>>108592614

nah, it's anime.
>>
File: lady liberty 3.png (1.18 MB, 1024x1024)
>>
>>108592654
Proof?
>>
>>108592682
>https://tagexplorer.github.io/
>>
>>108592695
Where does it say that
>>
File: 1754851925359741.png (713 KB, 734x538)
>>108590807
anyone have the original file for this
>>
>>108592420
It's anima preview 3 with this lora and some other stuff, go to the prior thread to see how I set up the style. Also inpaint the face you lazy bastard and I'll catbox you.
https://civitai.com/models/908800/oldschool-fantasy-style?modelVersionId=2835743
>>
File: pixel-0002-1473128117.png (573 KB, 1792x2304)
>>
File: ComfyUI_00039_.png (2.75 MB, 1344x1728)
>>
File: 05256-1325181679.png (1.67 MB, 1536x1792)
>>108592708
Sorry, yeah, I usually hate it when people post sloppy gens. I was kind of reaching way back, since I just realized I've been genning nothing but fucked up shit for months
>>
File: 00014-782756080.jpg (1.31 MB, 2592x2016)
>>
>>108592701
you're supposed to draw your own conclusions, don't be scared, I believe in you.
>>
>>108592904
box?
>>
File: 00026-888168195.png (1.49 MB, 1152x896)
>>108592957
1girl, skinny, pale skin, long hair, straight hair, black hair, black sweater, turtleneck, black shorts, black thighhighs, expressionless, looking at viewer, red background, 1990s \(style\)

negative: flat color, minimalism, shiny skin

euler a, beta, 30 steps, 3 cfg, my bad illustrious mix
>>
File: ComfyUI_00501_.jpg (203 KB, 1024x1024)
>>108592957
>>
File: 1749343255013788.png (3.93 MB, 1400x1400)
>>
>>
>>108592916
I want to hear it from Big Russ's own mouth
>>
>>108593014
Just summit up for us.
>>
File: 00017-417262864.jpg (1.14 MB, 2592x2016)
>>
File: 1764809755235480.jpg (1.38 MB, 1728x1728)
>>
File: 1763690168005255.png (3.79 MB, 3072x1224)
https://www.reddit.com/r/StableDiffusion/comments/1sjsp13/zimage_turbo_checkpoint_deedeemegadoodo_edition/
>he unslopped Z-image turbo
now THAT's impressive
>>
File: 1751184433502910.jpg (1.36 MB, 1728x1728)
>>
>>108593315
Ben Chod
>>
>>108593315
>>108593329
Bloody Basterd!
>>
>>108590807
>>108590807

How do you turn a monochrome image to fully colored? I put "sepia, monochrome, sketch" in the negatives. Put "colorful, masterpiece, beautiful lighting" and specified hair colors, clothing colors in the prompt. I'm using Forge Neo. Any advice would be appreciated!
>>
>>108593479
you can't afaik. you can paint some colors on it in paintshop
Maybe if you add a lot of extra noise it can pick up some color from there but it's hard. You need a high denoise and it ends up ruining your image
>>
>>108593479
use an edit model
>>
File: 1751551401631622.jpg (1.1 MB, 1728x1728)
>>
>>108592869
Here you go, as promised, have fun
>>
>>108592869
>>108593528
FUCK'S SAKE
https://litter.catbox.moe/9d46ih3yzogwu5u0.png
>>
File: 1763696026166237.jpg (1.08 MB, 1728x1728)
>>
File: 1770798233806998.jpg (868 KB, 1728x1728)
>>
is this vibe coded node good or just snake oil slop?
https://github.com/ethanfel/ComfyUI-LoRA-Optimizer/tree/main
>>
>>108593532
>>108593532
Kek, happens to the best of us
Thanks for the cat
>>
>>108593672
>that example
It's some schizo hyperfixating on fringe cases. Ignore.
>>
>>108593672
>is this vibe coded node good or just snake oil slop?
This can be derived from the name alone
>>
>>108593672
interesting, maybe it just automates what https://github.com/larsupb/LoRA-Merger-ComfyUI does by introducing heuristics to auto-select stuff
>>
File: 00211-3265597026.png (1.37 MB, 1472x848)
>>
File: 1765285887613878.jpg (813 KB, 1728x1728)
>>
I need controlnet and inpaint workflow for z image base
>>
>>
>>108593658
Good band. https://www.youtube.com/watch?v=6xh5fhT0mPI
>>
File: 05279-906971472.png (1.14 MB, 1152x896)
>>108593658
That's a hot cafe racer
>>
>>
>>108593760
nice
>>
File: ComfyUI_21104.png (2.21 MB, 1600x1200)
>>108592862
Smooth...

>>108593305
I tried a few gens with it in my workflow and it just dropped detail everywhere (think large patches of undefined texture). Not a fan.
>>
>>
>>108590830
One cannot rush inspiration...
>>
The baby needs a LLM to write his prompts for him, kek, ngmi
>>
>>108593305
i dont know why he decided to show examples of z image completely mogging his gens
does he have some kind of cuck fetish?
>>
>>
anima is great
>>
>>
>>
File: 1771135161716865.jpg (731 KB, 1408x2112)
>>
File: 00631-4005897646.png (1.27 MB, 848x1160)
>>
File: file.png (3.09 MB, 1328x1776)
>>108593911
>>108593943
Might switch to anima
>>
>>108590830
Ask in /edg/, there are a few reputable lora makers there who might want to try your Anima settings. Neclordx has only made loras for SDXL so far. It would be interesting to see him start working on Anima.
>>
>>108593911
How about using Anima for smut and then doing a second pass with ZiT for realistic details?
>>
File: 1769949306424367.jpg (1.08 MB, 1408x2112)
>>
>>108594078
Am doing my nsfw with Anima refined in chroma. So I have nice hands and feet from anima and pussy lips from chroma.
>>
File: collage.jpg (2.04 MB, 2999x1970)
>>108590830
Here's one for hirune I made from an old dataset I recaptioned with Gemma4, it was a pretty busted dataset desu, and it did a pretty good job anyway! Threw in a prompt with @aroma sensei too to test how much the lora affects stuff without @hirune and it seems pretty accurate still. Still has deformed hands a lot though, hopefully that gets better with more high res training.
>>
>>108594127
>nice hands and feet from anima
Hands have been pretty bad for me with Anima, maybe slightly better than Chroma.
>>
>>
File: 1757912780103850.jpg (724 KB, 2112x1408)
>>
>>108594157
>miku hatsune
this perturbs me
>>
is ltx 2.3 lora training on a 5090/64gb of ddr5 ram even possible? what settings are recommended? very difficult to find any tutorial on training for this.
>>
File: ComfyUI_09377_.png (355 KB, 1024x1024)
>>
File: file.png (3.36 MB, 1152x2048)
>>
>>108594206
there are videos by the very guy who made the software you're using to train. he doesn't use a 5090 in his tutorials, but he goes over what you can do for low VRAM.
https://www.youtube.com/watch?v=JQIl8DFTL1M
>>
I keep getting this error even though I've got a 16 GB VRAM GPU and nothing else is going on, just watching a YouTube video.
I'm using Endeavour
>>
File: wan27.png (191 KB, 797x746)
it's only local until it's good
>>
What should I gen?
>>
>>108594414
wan 2.7 is not good though lol
>>
>>108594427
localcope
>>
>>108594427
You've done him
>>
>>108594414
Leak the model.
>>
how do i resolve this? is this a known issue? its not even loading the video or is it a format issue?
>>
>>108594224
Loool
>>
>>108591936
Try asking in /adt/, surely they will know.
>>
which one of you is KING SLOP? The very king of the slop, whose very slop is pure K I N O. I beg of you KING SLOP, the pencil pusher hordes are invading the northern lands, we need your best K I N O SLOP now!
>>
File: ComfyUI_09401_.png (525 KB, 1024x1024)
>>108594674
this all i gots
>>
should i continue or just give up and reduce the step count? 45sec/it seems very slow, or is this normal for a 5090/64gb ram build? i also had to disable audio in the settings in order for caching latents to disk to work. i set it up to save every 1000 steps and the repeats are at 5 with learning rate at 0.0001
>>
>>108594416
Me killing myself.
>>
File: 1761218479396543.png (1.36 MB, 1024x1024)
>>108594822
ZiB with the prompt "Generate an image of me killing myself "
>>
File: 1752294216583486.png (1.35 MB, 1024x1024)
>>108594855
Higher CFG
>>
>>
File: 1091642880282430.png (1.3 MB, 832x1216)
>>
File: 911745310954562.png (515 KB, 832x1216)
>>108594855
same prompt on Anima
>>
>>108594855
>>108594867
that feel
>>
File: 499134907045476.png (1.07 MB, 832x1216)
>>
>>108594980
kino
>>
File: 1098958451311052.png (1.07 MB, 1216x832)
>>
File: 1040270014740971.png (1.12 MB, 1216x832)
>>
File: 568664217992898.png (1.06 MB, 1088x896)
>>
File: 369227187011685.png (1.67 MB, 1024x1536)
>>
Can local be saved?
>>
>>108593851
you dont know? if it looks like fucking shit it means it's more realism!!!!!!!!!!!!!!
>>
File: Video_00002 (20).mp4 (1.46 MB, 480x608)
My latent upscaling quest continues.
>>
Some recent ZiT models are really awesome
https://civitai.com/models/2422805?modelVersionId=2851662
>>
I downloaded ComfyUI to try local image generation for the first time but I'm finding that prompts are awful UX to specify poses, perspectives, or descriptions involving multiple characters. I was looking at some conditioning area pipeline and openpose thing to try to control the outputs better but they didn't seem to do anything useful.

Is there a good way to do this? To have separate prompts for different areas or characters of a scene and not have them get mixed up? Not just areas necessarily, but layering for overlapping characters or objects or whatever? And is openpose a meme? Or is it worth it to set up? Or should I find a controlnet where I can send stick figures into it and it derives poses and perspectives from that?
>>
>>108595490
>multiple characters.
Give the characters names if they don't already have them and describe what they're wearing/doing with full sentences using the character's name. If you're using a less retarded model this should work better than just a jumble of tags.
>>
>>108595459
fuck off with your shitmixes
>>
>>108595537
>ULTRA REALISTIC Z-IMAGE TURBO NSFW UNLEASHED V6.2
>new version released every 4 hours
>all previews posted by the trainer are of portraits, landscapes, animals in a generic cartoon style, nothing realistic, nothing nsfw
>free download is locked for 2 months
>clearly trained on pony-real data
>realistic gens of people literally have skin texture that makes flux-dev look like nanobanana
>all prompt-adherence is gone
>in the gens posted by other users featuring naked people, their genitalia looks like hamburger meat
>>
File: 1754808230970287.jpg (896 KB, 1328x1640)
>>
File: 17515568918612394.jpg (1.07 MB, 1248x1824)
>>108595451
Interesting results.
>>
File: Video_00004 (1).mp4 (2.75 MB, 1056x1376)
>>108595798
I just solved the issue I had with doing more than 4 steps. If I had used a normal workflow instead of my own custom-made one, I'd have figured it out days ago..

Now I need to solve the issue of nsfw loras getting like 10x the value on their weight with this setup..

Request a remake of that pose of Dakota in a bathroom looking like she's about to give a bj.
>>
File: 17515568918612332.jpg (1.66 MB, 1344x1728)
>>108595843
That's the beauty of this hobby. There's so much to do but we have so little time.
>>
>>108594752
Post config.
Never trained a video lora but seems too slow.
You probably have weights in fp32 and want them in something accelerated like fp8 instead.
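For a rough idea of what the dtype does to the size of the weights alone, here's a back-of-the-envelope sketch. The 14B parameter count is a made-up placeholder, not the model from the post I'm replying to, and this ignores activations, gradients, and optimizer state, so real training usage will be higher.

```python
# Rough weight-memory estimate: parameter count times bytes per parameter.
# Ignores activations, gradients, optimizer state, and framework overhead.
BYTES_PER_PARAM = {"fp32": 4, "bf16": 2, "fp16": 2, "fp8": 1}


def weight_gib(num_params: float, dtype: str) -> float:
    """Return the size of the raw weights in GiB for the given dtype."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3


if __name__ == "__main__":
    params = 14e9  # hypothetical 14B-parameter model, not taken from the thread
    for dtype in ("fp32", "bf16", "fp8"):
        print(f"{dtype}: {weight_gib(params, dtype):.1f} GiB")
```

If the fp32 number doesn't fit in VRAM, the trainer is probably offloading, which would be one possible explanation for slow steps.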
>>
File: comfy__14545.jpg (1.63 MB, 1344x1728)
>>
>>108595982

three mark quality anon
>>
Damn just as I hoped that trani finally roped.
>>
https://github.com/Comfy-Org/ComfyUI/pull/13113/changes
>disable_dynamic_vram:
>If you have any issues with dynamic vram enabled please give us a detailed reports as this argument will be removed soon

LMAO it hasn't even existed for a month. Dynamic vram fucks the workflows of everyone who has more than 64GB of RAM because it manages RAM 10x worse than the previous default, and dozens of people reported right away that they now get stuck on VAE decode for 10 extra seconds every time. And they already want to remove the argument that disables it? I'm never updating this piece of shit software.
>>
>>108596030
>>108596063
I didn't get it. Could you post it again?
>>
>>108596115
Which word don't you understand, Rakesh?
>>
>>108596063
how many custom sharted nodes with their own memory management are you running?
I'm using these params:
--reserve-vram 1.0 ^
--lowvram ^
--disable-smart-memory
because I actually need my VRAM to be freed after use and it works very well
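The `^` characters are just Windows batch line continuations. If you'd rather launch from a script, a minimal sketch of the same flags as an argv list is below. It assumes you run it from the ComfyUI folder and that `main.py` is the entry point. `build_comfy_command` is a name I made up for illustration.

```python
import sys


def build_comfy_command(reserve_vram_gb: float = 1.0) -> list[str]:
    """Assemble an argv list with the three flags quoted above."""
    return [
        sys.executable, "main.py",  # assumed entry point inside the ComfyUI folder
        "--reserve-vram", str(reserve_vram_gb),
        "--lowvram",
        "--disable-smart-memory",
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(" ".join(build_comfy_command()))
```

To actually start the server you would pass the list to `subprocess.run` with `cwd` set to your ComfyUI checkout.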
>>
>>108596131
I want to see the new version, anon. Come on. I'm sure you can do it.
>>
File: 1759240570614151.png (152 KB, 424x346)
>>108596155
>--lowvram
>>
>>108596169
it's to free the vram immediately after use
>>
>>108596181
It most definitely fucks other things instead of just doing that.
>>
>>108596185
maybe try it instead of assuming retard? or read the source code?
in the past it forced clip onto the cpu, but it doesn't do that anymore. that's the only weirdness it had.
retard.
>>
>>108596063
not only that but it broke MultiGPU, and they want to remove it? fucking retards
https://github.com/pollockjj/ComfyUI-MultiGPU
>>
share some cool or unusual styles for anima

@soesoe300
@niyane
@dross
@sanjiro \(tenshin anman\)
@paprika shikiso
@coldcat.
@smart oval
@mesuosushi
@koorimizu
@rui rui rui0122
@kakinoki mikan \(kari\)
@susagane
@yu \(stdio nameraka\)
@yunayuispink
@mola mola
@ebanoniwa
@gecchu
>>
>>108596285
That's what this is for, right? https://thetacursed.github.io/Anima-Style-Explorer/index.html Sort by "unique." I found some cool artists this way.
>@takawoyu,
Very 2D and slender characters
>@amu \(m aa\),
Cute watercolor style.
>>
>>108596285
>\(tenshin anman\)
this is such a gay way of writing things, I understand that parentheses are used to put weights on prompts so you have to find something else, but still
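If typing the backslashes by hand is the annoying part, a small helper can do it. This is a sketch that assumes your UI treats bare parentheses as weight syntax, like the post says. `escape_tag` is a hypothetical name, not part of any UI.

```python
import re


def escape_tag(tag: str) -> str:
    """Backslash-escape parentheses that are not already escaped."""
    # The lookbehind skips parentheses that already have a backslash in front,
    # so running the function twice does not double-escape.
    return re.sub(r"(?<!\\)([()])", r"\\\1", tag)


if __name__ == "__main__":
    print(escape_tag("sanjiro (tenshin anman)"))  # prints: sanjiro \(tenshin anman\)
```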
>>
>>108596285
I just tested anima yesterday, I have to admit it's quite good, I have a question though, why did they decide to go for Wan 2.1's vae? What's wrong with Flux's vae? I thought the latter was the superior vae
>>
>>108596332
it's qwen image's vae, tf are you on about?
>>
>>108596337
yeah, yeah, same shit, the point is, why didn't they use Flux's vae instead?
>>
File: 1756199507813308.png (890 KB, 1536x1536)
>>108596347
because it's dogshit and always has been
don't tell me you miss the plastic flux chins era, because i won't believe you
>>108596313
>@takawoyu
cute, i really like more sketchy styles
>>
>>108596359
>because it's dogshit and always has been
>don't tell me you miss the plastic flux chins era
tf are you on about? Z-image turbo uses flux's vae, and it's great at realism
>>
>>108596313
is there a noobai styles list i can export from somewhere online?
>>
>>108596362
>great at realism
>z-image turbo
ok so this is bait, kinda ashamed it took me this long to figure it out
>>
File: nice job.png (292 KB, 1000x1002)
>>108596373
You got me for a second, nice job anon
>>
File: 1770777265317214.png (220 KB, 1871x1085)
>>108596347
>* Qwen-Image Technical Report (Aug 2025) states their VAE beats Flux-VAE on PSNR and SSIM on both natural and text-heavy image sets — but does not publish the exact numbers in a standalone table.
chinks be like: "trust me bro"
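For anyone who wants to check a VAE round trip themselves instead of trusting a report, PSNR is just 10 * log10(MAX^2 / MSE) between the original and the decoded image. A minimal pure-Python sketch follows. The pixel values in the demo are made up and have nothing to do with either VAE, and SSIM is more involved so it isn't shown.

```python
import math


def psnr(original, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(original) != len(reconstructed):
        raise ValueError("inputs must have the same number of pixels")
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical inputs
    return 10 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    # Toy flattened 8-bit pixels, purely illustrative.
    print(psnr([10, 20, 30, 40], [12, 18, 33, 40]))
```

Higher is better, and you'd want to average it over a whole image set before comparing two VAEs.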
>>
File: 1761022921368619.png (170 KB, 1563x837)
>>108596417
yeah... now I'm really asking myself why they went for qwen image's vae
>>
>>108596417
>>108596443
>samefagging this hard in the big 2k26
uh oh, melty!
>>
>>108596449
I didn't pretend I'm not samefagging here, I'm just sharing my results, are you ok anon?
>>
>>108596332
https://huggingface.co/nvidia/Cosmos-Predict2-2B-Text2Image/blob/main/vae/config.json
the original base uses it, and switching would possibly take much more training
>>
>>108596460
quite impressive they managed to make such a good model out of such a meme base model desu
>>
>work on my custom agent setup
>have my source code visualized
>it looks like this
Every node is a file of the agent, every interlink is an event/command/etc (currently synthetic, have to modify the actual agent software so I can visualize the network in real time). I just thought it looked neat and anons should be able to enjoy it too.
The size of the orbs I'm pondering is calculated by both graph edges and raw file size.
>>
>>108596467
wrong thread anon
>>
>>108596467
llm slop
>>
>>108596471
I keep confusing ldg with lmg :/
>>
File: 1769213565548726.png (8 KB, 316x59)
welp
>>
>>108596484
why are they always finetuning meme base models?? lmao
>>
new browser, tried posting png over the file limit, before it would autoconvert to jpeg but it doesnt now. was that some 4chan x fork or random userscript?
>>
>>108596553
didn't know something like that existed, would've saved me many headaches...
>>
>>108595459
>Users with a 5950X CPU should not run this; your Core 3 may have failed—unfortunately, this is a known issue with the Ryzen 5000 series. (applies only to the FP8 version)
wat
>>
>>108596564
i found something again that does it, but i dont think it's what i was actually using
https://github.com/TuxedoTako/4chan-xt
>>
File: 1772943548239164.png (163 KB, 498x317)
>>108596679
>https://github.com/TuxedoTako/4chan-xt
>I stopped using 4chan since the hack. I now browse alt chans that actually care about their users, and don't need an userscript fighting their shitty design.
he'll be back
>>
As a tech noob I just made swarmUI work after failing hard with Forge and a1111 for hours, poggers!
>>
File: _329802_.jpg (532 KB, 1080x1920)
>>
>>108597076
following old tutorials anon? good luck have fun genning
>>
File: file.png (3.64 MB, 1152x2048)
>>
>>108597098
yes the video was from august 2024 but at least he spoke clear english
>>
File: 00001-3903066416.jpg (1.54 MB, 1728x2880)
>>
>>108597123
well if you want to try a Forge-style UI this one is kept up to date
https://github.com/Haoming02/sd-webui-forge-classic/tree/neo
>>
>>108597144
thanks but I'm gonna stick with swarm for now, I like that there's the option to tinker with the comfy backend as well
>>
>>108591129
catbox?
>>
File: image.png (67 KB, 1228x306)
know your place!
>>
>>108597207
local is on borrowed time...
>>
>>108596484
Why the fuck have they even entertained the idea of glm image lmao?
>>
>>108597207
>kurmuz
am I supposed to know who the fuck this is?
>>
>>108597212
Local diffusion isn't going anywhere
>>
File: 00003-3382455876.jpg (1.43 MB, 1728x2880)
>>
>>108596484
Have they learned nothing from bronyfag blowing a fortune on Auraflow and kekstone's "de-distilled" flux schnell tune?
You can't unfuck shitty base models by throwing a few million anime images at them.
Anima is the sole exception, and it's only considered great because of how fucking outdated SDXL is at this point. Its backgrounds, fine details, text capabilities, and instruction following are very rough compared to any 2025-2026 model.
>>
>>108596484
But Flux Klein failed to Ikena (hassaku dev) and failed to Anlatroon (NewbieAI dev)
>>
>>108597230
NovelAI dev, the CEO of the SOTA anime image model
>>
File: 1768319563617455.gif (1.31 MB, 480x360)
>>108597271
>anime
oh right
>>
File: output.webm (3.87 MB, 2000x1712)
>>108597117
>>
>>108597076
poggy woggy my friend! mayhaps you will be able to ascend to comfy soonage :D
>>
>>108597310
touch grass bro
>>
>>108597339
>touch grass
I prefer to touch those giant boobies desu
>>
>>108597217
Some labs offer free compute or other incentives for reputable researchers who want to train their models there.
>>
>>108597123
there were like 5 ppl doing SD content when I started lol. well congrats, now you can gen smug anime girls licking ice cream or other things (wait, it can do that too?!) until the end of days.
>>
>>108597257
Ok, tdrussell
>>
File: 1625875060176.jpg (14 KB, 500x332)
why do SDXL controlnets have no effect on illustrious checkpoints?
>>
File: file.png (3.25 MB, 2304x1792)
>>108597339
>>
>>108597408
the xinsir union one should work, no? been a while, sorry. don't forget to add the "SetUnionControlNetType" node
>>
File: price.png (53 KB, 432x249)
>https://thetacursed.github.io/Anima-Style-Explorer/
what the fuck hahaha
>>
>>108597460
I just want sketch and depth controlnets.
>>
>>108597474
Total delusion. Nuts to ask for that much when their whole project revolves around something given away for free and costs way more to make. Dude probably vibe coded that app and then has some runpod just genning images that he commits to the git repo occasionally. My off the cuff math says it would only take like 5 days to generate 42k images with an RTX 6000 pro which would cost like 250 bucks.
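Spelling out that off the cuff math as a quick sketch: the only inputs are the numbers from this post (5 days, 42k images, about 250 bucks), and the helper names are made up for illustration.

```python
def seconds_per_image(days: float, images: int) -> float:
    """Average generation time per image if the GPU runs nonstop."""
    return days * 24 * 3600 / images


def implied_hourly_rate(total_cost: float, days: float) -> float:
    """Rental price per hour implied by a total cost over a number of days."""
    return total_cost / (days * 24)


if __name__ == "__main__":
    print(f"{seconds_per_image(5, 42_000):.1f} s per image")
    print(f"${implied_hourly_rate(250, 5):.2f} per GPU-hour")
```

So the estimate assumes roughly 10 seconds per image and a rental rate of about 2 dollars an hour. Whether either number matches the real setup is a guess on my part.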
>>
File: cat.png (1.56 MB, 2134x1082)
>>108597491
it does something. don't look at the settings, haven't used this stuff in a while. need to tinker with the values, like cut it off earlier
>>108597568
le cash in. sadly he's not the only one.
>>
https://www.reddit.com/r/StableDiffusion/comments/1sk8vhq/ltx23_distilled_updated_sigmas_for_better_results/
>>
>>108597589
where do I install that node?
>>
>>108597789
controlnet should be in normal comfy nodes.
>>
>>108597619
that's just cope, I won't download a 22b model that's like 10x worse than seedance, Z-image turbo showed you can get quality outputs with a small model
>>
>>108595937
for some reason clicking the "do audio" slider kept giving me an error so i had to disable it. the training works and it was at 30sec/it after 1 hour. ran it for 8 hours and got to 1000 steps out of 6000 with 4 repeats at a learning rate of 0.0012. i paused the training and tested the results and the audio is absolute dogshit. AI toolkit really good for this shit. please help me out with better settings.
https://litter.catbox.moe/jaeohg.mp4
https://litter.catbox.moe/6ni1ue.mp4
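for reference, a quick sanity check of those numbers as a sketch. it assumes the speed stays at a constant 30sec/it, which may not hold, and the function names are made up.

```python
def steps_in(hours: float, sec_per_it: float) -> int:
    """Number of whole steps completed in the given wall-clock time."""
    return int(hours * 3600 / sec_per_it)


def hours_remaining(done: int, total: int, sec_per_it: float) -> float:
    """Wall-clock hours left at a constant step time."""
    return (total - done) * sec_per_it / 3600


if __name__ == "__main__":
    print(steps_in(8, 30), "steps in 8 hours")
    print(f"{hours_remaining(1000, 6000, 30):.1f} hours left to reach 6000 steps")
```

8 hours at 30sec/it is 960 steps, so the 1000 figure checks out, and the remaining 5000 steps would take a bit under 42 more hours at that rate.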
>>
dual sampler setup, 4 steps on 1st sampler > latent upscale with vae (res4lyf) x1.25 > 2nd sampler finishes with another 4 steps (dpmpp_2s_a/bong tangent). thoughts?
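for the resolution side of it, a sketch of what x1.25 does to the image size. snapping to a multiple of 16 is my own assumption here, the right multiple depends on the model and vae.

```python
def upscaled_size(width: int, height: int, factor: float = 1.25, multiple: int = 16):
    """Scale a resolution and snap each side to the nearest multiple."""
    def snap(value: float) -> int:
        # Never go below one full multiple, even for tiny inputs.
        return max(multiple, round(value / multiple) * multiple)

    return snap(width * factor), snap(height * factor)


if __name__ == "__main__":
    # Hypothetical 1024x1024 first pass, not taken from the actual workflow.
    print(upscaled_size(1024, 1024))
```

so a 1024x1024 first pass means the second sampler runs at 1280x1280, about 1.56x the pixels for those last 4 steps.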
>>
Fresh when ready

>>108597963
>>108597963
>>108597963
>>
What board has the best discussion for NSFW uses of these tools?
>>
>>108597967
kinda early bruh
>>
>>108597948
>slopmerge
fuck off :)
>>
>>108597986
/hgg/ for anime, and SOTA Anima users
>>
>>108597948
its fucking shit dude
>>
>>108598001
What about realistic photographs of people?
>>
>>108597809
vramlet cope
>>
>>108598492
/b/


