/g/ - /ldg/ - Local Diffusion General - Technology

Thread archived.
You cannot reply anymore.

File: highlights_g_107010364_17(...).jpg (1.37 MB, 2862x2050)

/ldg/ - Local Diffusion General Anonymous 10/26/25(Sun)18:15:00 No.107017112 Archived

Retarded Node Devs Edition

Discussion of Free and Open Source Text-to-Image/Video Models

Prev: >>107010364

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/sd-scripts/tree/sd3
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts
https://github.com/tdrussell/diffusion-pipe

>WanX
https://comfyanonymous.github.io/ComfyUI_examples/wan22/
https://github.com/Wan-Video

>Neta Lumina
https://civitai.com/models/1790792?modelVersionId=2298660
https://gumgum10.github.io/gumgum.github.io/
https://neta-lumina-style.tz03.xyz/
https://huggingface.co/neta-art/Neta-Lumina

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
Training: https://rentry.org/mvu52t46

>Illustrious
1girl and Beyond: https://rentry.org/comfyui_guide_1girl
Tag Explorer: https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/b/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo

Anonymous 10/26/25(Sun)18:16:58 No.107017125

>Pony v7
what went wrong??

Anonymous 10/26/25(Sun)18:18:15 No.107017133

File: 1527001-aesthetic_8_aesth(...).jpg (638 KB, 1664x2432)

Anonymous 10/26/25(Sun)18:18:39 No.107017136

>>107017125
Didn't want to jump ship from auraflow after flux was dedistilled

Anonymous 10/26/25(Sun)18:19:09 No.107017139

Comfy UI is life. I have been experimenting with it after being only a forge/wan2gp fag and the node thing is genius if you want to do anything more than basic gens.
Also it pairs extremely well with LLM assistance. I can quickly vibe code a Python custom node for whatever niche functionality I need instead of dealing with spaghetti hell trying to mix and match other nodes.
I wonder if I can come up with workflows to cover needs unrelated to AI gen. I imagine it's good for anything pipeline-ish with a modular nature, feels like playing Factorio IRL.
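
For anyone who hasn't written one, a custom node is roughly just a Python class with a few class attributes that ComfyUI reads at import time. A minimal sketch, assuming the usual custom-node layout; the node name `JoinPromptParts`, its category and its behaviour are made up for illustration:

```python
# Minimal ComfyUI-style custom node (sketch). The class name, category and
# behaviour are hypothetical; only the attribute layout follows the usual
# custom-node convention.
class JoinPromptParts:
    """Joins two prompt fragments with a separator, skipping empty ones."""

    @classmethod
    def INPUT_TYPES(cls):
        # Declares the input sockets/widgets the node exposes in the graph.
        return {
            "required": {
                "first": ("STRING", {"multiline": True, "default": ""}),
                "second": ("STRING", {"multiline": True, "default": ""}),
                "separator": ("STRING", {"default": ", "}),
            }
        }

    RETURN_TYPES = ("STRING",)   # one string output
    FUNCTION = "join"            # name of the method to call
    CATEGORY = "utils/text"      # where the node shows up in the add-node menu

    def join(self, first, second, separator):
        parts = [p.strip() for p in (first, second) if p.strip()]
        # Node methods return a tuple matching RETURN_TYPES.
        return (separator.join(parts),)


# ComfyUI looks for these mappings when it imports a custom node module.
NODE_CLASS_MAPPINGS = {"JoinPromptParts": JoinPromptParts}
NODE_DISPLAY_NAME_MAPPINGS = {"JoinPromptParts": "Join Prompt Parts"}
```

Dropped into a folder under `custom_nodes/`, something like this should show up in the node menu after a restart. The method itself is plain Python, so it can be tested outside the UI.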

Anonymous 10/26/25(Sun)18:22:11 No.107017166

File: ComfyUI_temp_ptasu_00109_.png (1.45 MB, 1152x896)

obese nerds using my images for video gens, nothing new under the sun

Anonymous 10/26/25(Sun)18:22:56 No.107017173

>>107017139
it has shit tooling. node graphs are a feature, not the full experience which is the fatal flaw of comfyui. that and it's poothon

Anonymous 10/26/25(Sun)18:24:57 No.107017196

File: ComfyUI_03693_.png (1.59 MB, 832x1216)

>>107017125
Dogshit base model and who knows how much money wasted on futile training with sunk cost fallacy.
Everything else is at best of tertiary importance.
It's kinda crazy how it ended up being such a wet fart nothing burger after how sensational v6 was though.
People were at least outraged and sad over SD3.
Literally no one gives a fuck about that brony's autistic trainwreck now though.

Anonymous 10/26/25(Sun)18:27:30 No.107017219

>>107017125
not using illustrious

Anonymous 10/26/25(Sun)18:28:24 No.107017224

File: 1752260335258615.jpg (1.13 MB, 1040x1520)

>>107017125
>abandoned artist tags, a suicide move after illustrious and noobai that contained hundreds of drawfags in the dataset
>heavily censored the outputs
>picked a mediocre model and decided to go through with it
>requires autistic prompting style and SD3 tier novel sized prompts
>wasted a shit ton of money in the process

Anonymous 10/26/25(Sun)18:31:08 No.107017234

>>107017173
>it has shit tooling.
What do you think is missing that is out of the scope of handling with custom nodes? Only thing I'm having some beef for now is queue management, which seems lackluster.
Still I can already see me setting up some workflows to handle stuff of mine that involves clear composable modules to spaghetti with.

Anonymous 10/26/25(Sun)18:32:03 No.107017244

>>107017234
canvas, sequencers, video editor, 3d editor, and vector drawing

Anonymous 10/26/25(Sun)18:32:32 No.107017247

File: chroma_radience_000375.png (1.53 MB, 1024x1024)

Anonymous 10/26/25(Sun)18:32:51 No.107017252

please for the love of all that is loli save us ani

Anonymous 10/26/25(Sun)18:33:15 No.107017255

File: output_longvideo_refine_11_2_2.mp4 (2.87 MB, 1280x704)

1:04 length output from LongCat. The original file was about 300mb, so this is severely compressed. https://files.catbox.moe/62kssd.mp4 is still compressed to fit in the catbox, but not as badly.

It took about 4.5 hours to generate. LongCat works by generating multiple videos about 6 seconds long and then doing another pass to fuse them together. Based on this output, it seems like its default behavior is to repeat the action described in the prompt in each subsegment, whereas you'd normally want the action to be spread out over the full length.

To get good results involving complex sequences of actions, it would probably work better if each segment had its own prompt, so that you explain in detail what's going on every 6 seconds. The stock inference code isn't set up to handle that, but it would probably work, although I don't know how it will interpret the fusion pass.
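
The per-segment prompt idea could be sketched like this. It is not LongCat's code, just a hypothetical helper (`build_segment_schedule` is a made-up name) that cuts a clip into roughly 6-second segments and assigns one prompt to each, reusing the last prompt if the list runs out:

```python
import math


def build_segment_schedule(total_seconds, segment_seconds, prompts):
    """Split a clip into fixed-length segments and give each one a prompt.

    If there are fewer prompts than segments, the last prompt is reused.
    This only plans the segments; generation itself is not shown.
    """
    if not prompts:
        raise ValueError("need at least one prompt")
    count = math.ceil(total_seconds / segment_seconds)
    schedule = []
    for i in range(count):
        start = i * segment_seconds
        end = min(start + segment_seconds, total_seconds)
        prompt = prompts[min(i, len(prompts) - 1)]
        schedule.append({"index": i, "start": start, "end": end, "prompt": prompt})
    return schedule
```

For a 1:04 clip that would be `build_segment_schedule(64, 6, [...])`, with a shorter final segment. How the fusion pass would treat prompts that change between segments is still the open question.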

For people interested in repetitive actions, it may be totally fine as-is.

This sequence was supposed to be anime-styled, but it refused to do it for anything sci-fi themed like this and would only produce 3DCG variants. (it will do anime for other subjects).

Anonymous 10/26/25(Sun)18:33:18 No.107017256

>>107017252
ywn baw

Anonymous 10/26/25(Sun)18:33:19 No.107017257

>>107017244
this is why comfy will always be shit, he will never add actually comfortable tools

scabPICKER 10/26/25(Sun)18:33:20 No.107017258

um. songbloom is singing chinese lol

Anonymous 10/26/25(Sun)18:34:56 No.107017269

>>107017139
>if you want to do anything more than basic gens.
What are you even doing that requires a massive workflow something like swarm can't handle?

Anonymous 10/26/25(Sun)18:35:43 No.107017274

>>107017269
snake oil addiction

Anonymous 10/26/25(Sun)18:35:57 No.107017276

Can I be put in the next OP?

Anonymous 10/26/25(Sun)18:36:37 No.107017279

>>107017255
Watching this feels like you are developing schizophrenia. It's almost debilitating.
I don't think it is working well.

Anonymous 10/26/25(Sun)18:39:17 No.107017295

>>107017276
no, just ugly spaghetti screenshots. comfyui hasn't ruined the thread enough

Anonymous 10/26/25(Sun)18:40:01 No.107017303

File: 1725545289666750.jpg (4 KB, 233x216)

>>107017255
>took about 4.5 hours to generate

Anonymous 10/26/25(Sun)18:42:37 No.107017324

>>107017255
based patiencechad

Anonymous 10/26/25(Sun)18:47:09 No.107017369

File: 1655783628694.jpg (375 KB, 1248x1868)

>>107017244
In the context of out-of-the-box quick AI-gen-infused gooning sessions I can see your point. The other UIs feel faster if you don't already have the comfy workflows integrated into other ergonomic software via API or something.

>>107017269
Never used swarm, but niche pre- and post-processing, stitching together and daisy-chaining different gen models, the possibilities seem infinite, especially if you already have a porn stash with years of curation.
Just now I have set up a mpv keybinding that sends the current video frame to a comfy workflow that overlays my crotch and dick at the bottom of the frame and run pov insertion loras and continuations. Its a great time to be alive and women are done for.

Anonymous 10/26/25(Sun)18:50:48 No.107017401

>want to make fantasy races
>realism model wont let me or it looks like a cheap costume
>have to make my own races like black asians or humans with photoshopped proportions

Anonymous 10/26/25(Sun)18:51:05 No.107017402

>>107017369
it's bad if you are a beginner or gooner if you just want to hop in and make shit and it's bad for professionals if you specialize in digital art or 3d modelling. it only really clicks with vfx node nuts but even then it breaks so many rules of node software they have their panties in a bunch. the only people that actually like comfyui and don't have a laundry list of problems are midwits who feel smart using it. this is funny because it pisses off smart people so much for actually being a pile of shit

scabPICKER 10/26/25(Sun)18:55:22 No.107017443

I just found out something that SongBloom does...

scabPICKER 10/26/25(Sun)18:56:35 No.107017453

Why are all of the examples of SongBloom wrong?

Anonymous 10/26/25(Sun)18:56:35 No.107017454

>>107017402
>the only people that actually like comfyui and don't have a laundry list of problems are midwits who feel smart using it
Or those are just regular power users / programmers facilitating hobby stuff instead of acting pretentious with professional needs.

Anonymous 10/26/25(Sun)18:57:29 No.107017463

Please saar the clients

Anonymous 10/26/25(Sun)18:58:04 No.107017472

>>107017454
people clearly try using it in a professional sense and it's retarded

Anonymous 10/26/25(Sun)18:59:05 No.107017481

>>107017454
>programmers
web shitters and poojeets aren't programmers

Anonymous 10/26/25(Sun)18:59:21 No.107017484

https://www.reddit.com/r/StableDiffusion/comments/1ogx7j4/chroma_radiance_mid_training_but_the_most/

Those images / that WF in the comments...

Anonymous 10/26/25(Sun)19:02:16 No.107017504

>>107017234
>queue management
Wish I could edit queue and then add it back in the same position.

Anonymous 10/26/25(Sun)19:02:53 No.107017509

>>107017256
you will never be a wataa

Anonymous 10/26/25(Sun)19:02:57 No.107017513

>>107017136
auraflow is shit, but de-distilled flux is just as bad. look at chroma who wasted 50 epochs trying to make it unmelty yet still failed. both of them should've just waited for qwen

Anonymous 10/26/25(Sun)19:03:47 No.107017519

>>107017255
>each segment had its own prompt
That's not possible?
Man, that's already something that pisses me off with wan.

Anonymous 10/26/25(Sun)19:04:01 No.107017522

All my gens are coming out shit for some reason

Anonymous 10/26/25(Sun)19:04:29 No.107017523

>>107017513
are you retarded?
chroma blows away qwen
https://files.catbox.moe/fe427t.png
https://files.catbox.moe/d5gu67.png
https://files.catbox.moe/ui7a5x.png

Anonymous 10/26/25(Sun)19:05:39 No.107017534

>>107017523
The flash / chroma HD is the version for these btw
custom node: https://github.com/silveroxides/ComfyUI_Hybrid-Scaled_fp8-Loader

https://huggingface.co/silveroxides/Chroma-Misc-Models/blob/main/Chroma1-HD-flash-heun/Chroma1-HD-flash-heun-fp8_scaled_original_hybrid_large_rev2.safetensors

Loras:
https://huggingface.co/silveroxides/Chroma-LoRAs/tree/main

WF in image

Anonymous 10/26/25(Sun)19:07:39 No.107017544

>>107017504
But at the same time it's a huge plus that it actually runs headless on the comfy server. Other UIs I used seemed to handle it client side/browser based, which was a pain.
Now I can finally queue a batch, close my laptop and go to sleep, or just close the tab and open it up another time without fucking up everything.
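
Since the queue lives on the server, jobs can also be pushed without the browser at all. A rough sketch, assuming a local ComfyUI instance on its default port (8188) and a workflow exported in API format; the helper names are made up, and the request is built separately so it can be inspected without a server running:

```python
import json
import urllib.request


def build_queue_request(workflow, server="http://127.0.0.1:8188", client_id=None):
    """Build the POST request that queues an API-format workflow."""
    payload = {"prompt": workflow}
    if client_id is not None:
        payload["client_id"] = client_id
    return urllib.request.Request(
        f"{server}/prompt",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def queue_workflow(workflow, **kwargs):
    """Send the request and return the server's JSON reply (needs a live server)."""
    with urllib.request.urlopen(build_queue_request(workflow, **kwargs)) as resp:
        return json.loads(resp.read())
```

Looping `queue_workflow` over a folder of exported workflows is enough to fill the queue before closing the laptop.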

Anonymous 10/26/25(Sun)19:08:20 No.107017554

>>107017513
Chroma is an amazing model, the fuck you're talking about

Anonymous 10/26/25(Sun)19:09:02 No.107017558

>>107017523
chromamelt is unavoidable and only blind retards choose to ignore this shit. the model is a mess thanks to de-distillation. literally every aspect of the images is falling apart, it looks terrible

Anonymous 10/26/25(Sun)19:09:48 No.107017567

>>107017558
you have to be trolling, qwen has nothing on this

Anonymous 10/26/25(Sun)19:10:40 No.107017578

File: chromamelt.png (79 KB, 204x207)

>qwen has nothing on this

Anonymous 10/26/25(Sun)19:11:30 No.107017587

>>107017544
Sure, I do like it overall, just wish it had more user friendly features.

Anonymous 10/26/25(Sun)19:11:59 No.107017593

>>107017578
show me an image remotely similar both style-wise and quality-wise from qwen or sdxl or illustrious at the same res. also that is the point of chroma radiance, which is still training. all models have that issue with tiny details

Anonymous 10/26/25(Sun)19:12:30 No.107017600

>>107017578
yes sis, fucked hands can easily be fixed versus no seed variability of qwen that cant be fixed

Anonymous 10/26/25(Sun)19:12:35 No.107017602

been trying for a while but I'm stuck on this so thought I might ask regarding >>107017425

Cause I found a solution, but it's the most caveman shit I've ever done with a computer.

Turns out it works on video if I split the video file into chunks of 25 MB each and then join each tiny piece back into the original movie like a virtual puzzle.

Asking just in case I'm missing something very obvious, like how GGUF models divide the load by themselves, but with a video file as input. It takes long to assemble and even longer to run the whole batch, but it does the same job as paid software like Unifab, which costs around 300 USD and is able to do the whole video file on the same computer in one piece without any hassle.
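
A less caveman version of the split-and-rejoin step might lean on ffmpeg's segment muxer and concat demuxer. This is only a sketch of the command lines, with made-up helper names; it splits by duration rather than by file size, and stream-copy rejoining assumes every processed chunk comes out with the same codec settings:

```python
def split_command(src, seconds, pattern="chunk_%03d.mp4"):
    """ffmpeg arguments that cut src into pieces of about `seconds` each
    without re-encoding (cuts land on keyframes)."""
    return ["ffmpeg", "-i", src, "-c", "copy", "-f", "segment",
            "-segment_time", str(seconds), "-reset_timestamps", "1", pattern]


def concat_list(chunks):
    """Contents of the text file the concat demuxer reads, one chunk per line."""
    return "".join(f"file '{name}'\n" for name in chunks)


def join_command(list_file, dst):
    """ffmpeg arguments that stitch the listed chunks back into one file."""
    return ["ffmpeg", "-f", "concat", "-safe", "0", "-i", list_file,
            "-c", "copy", dst]
```

Each list can be handed to `subprocess.run(...)`. The processing step between splitting and joining stays whatever workflow was already being used.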

Anonymous 10/26/25(Sun)19:12:51 No.107017606

>>107017484
>>107017578
Part of me thinks that has to be trolling but I know that it isn't.
I am not even trying to be a hater and hope that some finetune will unfuck the schizo anatomy down the line but he really posted that angel demon image with errors on the wings, hands and nonsensical snakes as the example of chroma not having any problems.
Also I find their "come join and our cord and dilate with our xisters" attitude to anyone asking for documentation very bizarre.
Like seriously, they have spent 100k + on training and can't be assed to put whatever material they have anywhere public? Do they not care about adoption?

Anonymous 10/26/25(Sun)19:13:47 No.107017619

>>107017578
>$200k later
only coomerboomer retards defend this failbake

Anonymous 10/26/25(Sun)19:13:56 No.107017623

>>107017606
who is they? I dont think lodestone is running any kind of business, he is just making models, I dont think he is monetizing it at all

Anonymous 10/26/25(Sun)19:14:50 No.107017632

File: Screenshot 2025-10-26 190744.png (301 KB, 1390x933)

Ovi is now supported in wan2gp with the latest v9.2 update. Not impressed with ovi at all, the audio quality is significantly inferior compared to ltx2.
ovi: https://files.catbox.moe/5n22sv.mp4
ltx2: https://files.catbox.moe/clx4i3.mp4

Anonymous 10/26/25(Sun)19:14:51 No.107017633

>>107017619
only low iq browns cant extract the 95% of good from something and focus on the fixable 5% bad

Anonymous 10/26/25(Sun)19:14:53 No.107017634

>2b params: just fix the hands manually!
>4b params: just fix the hands manually!
>8b params: just fix the hands manually!
>16b params: just fix the hands manually!
why is AI getting more expensive yet not actually improving?

Anonymous 10/26/25(Sun)19:15:17 No.107017640

ok, trolling, got it, I can go back 100 threads and only find worse gens from worse models

Anonymous 10/26/25(Sun)19:16:14 No.107017647

File: 1758412973441765.png (106 KB, 267x217)

>>107017632
at least it does a pretty good job with the facial expressions of someone being forced into existence to say shit like that

Anonymous 10/26/25(Sun)19:16:31 No.107017651

>>107017519
The default LongCat inference code doesn't do it, but it's easy to change it so it does (I didn't test it yet because it takes so long).

I'm trying long I2V now and seeing that it's using the same input image for the start of each segment which obviously isn't going to work well. Changing it to use the last frame of the past segment as the input for the next segment. The tricky part will be knowing what the prompt should be for each successive segment in advance, since it's unpredictable how much of the prompt will actually be incorporated within each segment.
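
The last-frame chaining described above boils down to a loop like this. It is a sketch, not the LongCat code: `generate_segment(image, prompt)` stands in for whatever the real I2V call is and is assumed to return a list of frames.

```python
def generate_chained(first_image, prompts, generate_segment):
    """Run image-to-video segments back to back.

    Each segment is seeded with the last frame of the previous one instead
    of reusing the original input image every time.
    """
    all_frames = []
    current = first_image
    for prompt in prompts:
        frames = generate_segment(current, prompt)
        if not frames:
            raise RuntimeError("segment produced no frames")
        all_frames.extend(frames)
        current = frames[-1]  # seed the next segment with the final frame
    return all_frames
```

Whether the first frame of each new segment duplicates the seed frame depends on the model, so some deduplication at the seams may be needed.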

Anonymous 10/26/25(Sun)19:16:38 No.107017652

>>107017634
vae, vae compression is the issue, even qwen's which only slightly hides it by being so overcooked that it is fully fitted to images

Anonymous 10/26/25(Sun)19:16:40 No.107017653

>>107017634
Qwen is actually pretty good at fixing hands, it's the main reason i use it for inpainting with better models.

Anonymous 10/26/25(Sun)19:17:05 No.107017656

>>107017633
>512x512
>broken hands
>melted details all over
>no artist tags
>no characters/celebs
deranged coomers will see this inferior-to-sdxl model and start creaming themselves simply because it can render a titty.

Anonymous 10/26/25(Sun)19:17:29 No.107017659

>>107017634
>manually
you could automatically fix the hands quickly since the earliest days of a1111 sis, thanks for exposing yourself as a retard

Anonymous 10/26/25(Sun)19:20:03 No.107017683

I'd rather have 99% perfect details with crazy good aesthetics and prompt following AND style flexibility, and then just fix small details with adetailer, over having nothing but the same image with better small details due to being overfit to death on a single style

It's one or the other, at least until the vae issue is fixed
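
The detect-then-inpaint fix works roughly like this: find the hand or face, pad the box so the inpaint has some context, regenerate just that crop, paste it back. A tiny sketch of the padding step only; `padded_crop_box` is a made-up helper, and the detector and the inpaint pass are not shown:

```python
def padded_crop_box(box, image_size, padding=32):
    """Grow a detection box (x0, y0, x1, y1) by `padding` pixels on every
    side, clamped to the image bounds given as (width, height)."""
    x0, y0, x1, y1 = box
    width, height = image_size
    return (
        max(0, x0 - padding),
        max(0, y0 - padding),
        min(width, x1 + padding),
        min(height, y1 + padding),
    )
```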

Anonymous 10/26/25(Sun)19:20:54 No.107017689

>>107017651
Only reusing one frame will lead to the same issue wan has: the new segment gets no information about motion.

Anonymous 10/26/25(Sun)19:21:41 No.107017695

File: 1751152327087028.png (4 KB, 320x240)

>>107017125
>Pony
Him getting in Loadstone's ear to make a equally shitty model. The irony is this piece of shit is worse than chroma and instead of following THE FUCKING DOCUMENTATION he decided to listen to pony fag to make a stillborn model that can't even work with loras.
Fuck both of them for burning so much time and money

Anonymous 10/26/25(Sun)19:21:52 No.107017700

File: 1732030567498492.jpg (387 KB, 1088x1344)

>>107017656
>512x512
chroma gens at 1024x1024 by default or even more, like 1088x1344 without needing a double pass >>107011247
again, brownoids literally just lying online every single thread, hope you're at least paid to be this retarded

Anonymous 10/26/25(Sun)19:21:55 No.107017701

>>107017623
>who is they?
People who regularly post about how great chroma is. Like that redditor. We have a few here, too.
>I dont think lodestone is running any kind of business, he is just making models, I dont think he is monetizing it at all
I guess this explains the attitude. 100k is awful amount of money to sink into this kind of hobby though. Another case of rich furry and suspicious amount of disposable income I suppose.
>>107017656
You can simply use flux loras for last two or train your own to be fair.
Only illust/noob knows major styles and characters and pretty much no model knows celebrities out of the box.

Anonymous 10/26/25(Sun)19:21:57 No.107017702

>>107017656

The gooner burden lies in seeds with things that ought be nice but have 6/7 digits.

They can't be bothered to apply a mask that is wakanda levels of science fiction.

Anonymous 10/26/25(Sun)19:23:33 No.107017709

>>107017700
He was referring to training resolution.
Most of the chroma training was done on low res.
Which perhaps partially explains the fucked hands. (Remember how SD 1.5 was unable to learn hands at all due to that resolution?)

Anonymous 10/26/25(Sun)19:23:40 No.107017711

File: 1744577136262390.jpg (948 KB, 1464x1824)

Anonymous 10/26/25(Sun)19:24:33 No.107017716

>>107017689
After it generates each individual segment, it regenerates everything only using the first outputs as a reference, which ought to take care of that.

Anonymous 10/26/25(Sun)19:26:09 No.107017730

>>107017709
>He was referring to training resolution.
So he repeated himself twice in the first two points again chroma seething about hands? Even lower iq

Anonymous 10/26/25(Sun)19:27:37 No.107017743

>>107017709
No, all recent models were trained at low res first. Wan was 256 res for most of its training

Anonymous 10/26/25(Sun)19:27:37 No.107017744

>>107017653
>Qwen inpainting
shit, I didn't think of that
I tried it with i2i and it didn't work very well, but I might give it a shot

Anonymous 10/26/25(Sun)19:28:36 No.107017754

>>107017634
>Flux devs give instructions how to finetune
>furry decides to ignore it and do random retarded shit
It's tiresome

Anonymous 10/26/25(Sun)19:29:34 No.107017766

>>107017701
loras are not a good alternative to character knowledge because they do not interact well at all and bleed all over. people continue to ignore the major flaws of loras as an excuse to cope with bad models

Anonymous 10/26/25(Sun)19:30:42 No.107017780

trying to imagine the kind of anon thats genuinely taken by surprise at pony

Anonymous 10/26/25(Sun)19:30:51 No.107017782

>>107017766
Loras don't even work with chroma in the first place and are volatile between seeds mostly because token weights are fucked beyond repair.

Anonymous 10/26/25(Sun)19:31:53 No.107017788

>>107017782
Personally I'm mad because pony fag is partially responsible for chroma turning into dogshit.

Anonymous 10/26/25(Sun)19:32:21 No.107017793

>>107017780
I think the most surprised were style cluster 545 and style cluster 217.

Anonymous 10/26/25(Sun)19:33:26 No.107017800

>>107017788
>trying to push that chroma is bad still

>>107017782
chroma loras are trainable in diffusion pipe, simpletuner and aitoolkit

Anonymous 10/26/25(Sun)19:34:29 No.107017803

File: 1748238499588993.png (2.79 MB, 1344x1728)

Anonymous 10/26/25(Sun)19:35:51 No.107017819

File: 1761033414856813.png (2.55 MB, 1344x1728)

>>107017803

Anonymous 10/26/25(Sun)19:39:21 No.107017844

>>107017800
I made Chroma loras and the model itself does not respect the outcomes. This has been discussed at length in this thread. That is a fundamental flaw with the model.

Anonymous 10/26/25(Sun)19:39:38 No.107017846

>>107017484
https://www.reddit.com/r/StableDiffusion/comments/1ogwi51/holy_crap_form_me_chroma_radiance_is_like_10/
2 posts like that today, can't be organic

Anonymous 10/26/25(Sun)19:41:05 No.107017858

>>107017743
Huh.
I guess higher vae channels make the point moot or whatever. I am not going to LARP that I have a detailed understanding.
>>107017782
>because token weights are fucked beyond repair.
Can you elaborate on this?
>>107017788
No idea how this ties into the rest but okay.

Anonymous 10/26/25(Sun)19:41:04 No.107017859

File: 1760483892365731.png (930 KB, 1686x1342)

are we saved?
https://xcancel.com/HuggingPapers/status/1982360432514883795#m

Anonymous 10/26/25(Sun)19:41:58 No.107017865

>>107017844
>respect the outcomes
what? try English this time. Or are you being intentionally obtuse to try and hide the fact that you are full of shit?

Anonymous 10/26/25(Sun)19:41:59 No.107017866

>>107017846
>>107017484
Obviously not
What I'm still trying to figure out is why he is still training off of his broken architecture instead of restarting from an earlier epoch
>>107017858
Use a non realism lora and post 5 back to back seeds using the same prompt, I'm not fucking spoon feeding you
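
The test being asked for is just a fixed prompt and lora with consecutive seeds. As a sketch, with a made-up helper name and placeholder settings, the job list could be built like this and fed to whatever frontend or API is in use:

```python
def seed_sweep(prompt, base_seed, count=5, **fixed):
    """Job configs that differ only in seed: base_seed, base_seed + 1, ...
    Everything in `fixed` (lora, steps, cfg, ...) is copied unchanged."""
    return [{"prompt": prompt, "seed": base_seed + i, **fixed} for i in range(count)]
```

Comparing the outputs side by side shows how much the lora's effect drifts from seed to seed.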

Anonymous 10/26/25(Sun)19:42:37 No.107017868

File: 1675514491543988.jpg (10 KB, 241x210)

chroma bros, have you made more celebrity loras?

Anonymous 10/26/25(Sun)19:43:05 No.107017873

>>107017866
>broken architecture
can you explain how its broken?

Anonymous 10/26/25(Sun)19:43:34 No.107017879

>>107017868
usecase for celebrity loras?

Anonymous 10/26/25(Sun)19:43:58 No.107017883

>>107017873
>Use a non realism lora and post 5 back to back seeds using the same prompt, I'm not fucking spoon feeding you

Anonymous 10/26/25(Sun)19:44:06 No.107017884

>>107017859
that is what lodestone is already doing lol

Anonymous 10/26/25(Sun)19:44:21 No.107017889

File: 1745938996190135.png (172 KB, 460x460)

>>107017879
>usecase for celebrity loras?

Anonymous 10/26/25(Sun)19:44:38 No.107017892

File: 1742700300385487.png (1.7 MB, 1344x1728)

>>107017859
holy based.

>inb4 nobody ever implements it in a real model

Anonymous 10/26/25(Sun)19:45:26 No.107017899

>>107017884
no, pixnerd doesn't improve the training speed and works on the pixel space, that one works on something different than pixel space and latent space

Anonymous 10/26/25(Sun)19:45:47 No.107017901

>>107017892
Me on the bottom

Anonymous 10/26/25(Sun)19:46:27 No.107017908

>>107017892
>her head is buried in Panty's panties
kek, I got the joke!

Anonymous 10/26/25(Sun)19:46:33 No.107017909

>>107017899
>lodestone once again starts a new chroma model

Anonymous 10/26/25(Sun)19:56:34 No.107017975

>>107017859
>SVG
How can you trust anyone stupid enough to reuse initials like that?

Anonymous 10/26/25(Sun)19:59:19 No.107017986

>>107017868
Hypothetically if I were to train one where would I even upload it at this point?
Can't register to that broken piece of shit seaart at all and tensorart asks for glowing email.

Anonymous 10/26/25(Sun)20:00:30 No.107017990

>>107017879
gooning

Anonymous 10/26/25(Sun)20:06:18 No.107018022

File: 00278-2861808508.png (2.83 MB, 1280x1920)

Anonymous 10/26/25(Sun)20:09:00 No.107018043

>>107017986
huggingface then https://civitaiarchive.com/

Anonymous 10/26/25(Sun)20:10:37 No.107018054

>>107017859
>62x faster training
>35x faster inference
Must be a catch somewhere, i assume it will take longer to train, although even if it's literally slightly worse resource-wise it will be worth it to not have the vae quality loss, especially for the future of image gen models, which will be edit models, which need to lose their vae

Anonymous
10/26/25(Sun)20:15:19 No.107018079

Anonymous 10/26/25(Sun)20:15:19 No.107018079

>>107018054
they're saying it gets to the same quality level 62x and 35x faster.

Anonymous
10/26/25(Sun)20:18:29 No.107018105

Anonymous 10/26/25(Sun)20:18:29 No.107018105

>>107017909
If it's 62 times faster to train and 35 times faster to gen I wouldn't mind him doing that.

Anonymous
10/26/25(Sun)20:19:23 No.107018111

Anonymous 10/26/25(Sun)20:19:23 No.107018111

File: chroma lora iterative seed.jpg (712 KB, 4140x1216)

712 KB JPG

>>107017866
Well that anon wasn't me.
I did what you asked and yes I can see that consistency is a problem.
Can you now tell me why?

Anonymous
10/26/25(Sun)20:19:35 No.107018117

Anonymous 10/26/25(Sun)20:19:35 No.107018117

File: file.png (1.96 MB, 2713x1226)

1.96 MB PNG

>>107017859
is this why we got the slopped look? since everything is mixed together it does a sort of an average of concepts instead of focusing on the concept itself

Anonymous
10/26/25(Sun)20:19:54 No.107018121

Anonymous 10/26/25(Sun)20:19:54 No.107018121

>>107018079
thats on the usual initial toy model size, seeing if it actually scales is what kills papers

Anonymous
10/26/25(Sun)20:22:19 No.107018140

Anonymous 10/26/25(Sun)20:22:19 No.107018140

>>107017859
Tencent be like:
>"we're gonna pretend this never existed, STACK MORE LAYERS"

Anonymous
10/26/25(Sun)20:23:22 No.107018144

Anonymous 10/26/25(Sun)20:23:22 No.107018144

>>107018121
Or deliberately p-hacked BS that is useless in practice.
So many fucking papers on arxiv that do not yield the promised results when you try the experiment on different data/params.

Anonymous
10/26/25(Sun)20:24:23 No.107018153

Anonymous 10/26/25(Sun)20:24:23 No.107018153

>>107018111
The creator decided to obfuscate tokens and then use yolo training methods not backed by any data. The model got destroyed because he decided to make it a yolo sandbox halfway through.

Anonymous
10/26/25(Sun)20:25:15 No.107018156

Anonymous 10/26/25(Sun)20:25:15 No.107018156

>>107018043
Well hf regularly purges "problematic" models.
I wonder if a gibberish name would fly under the radar, or do they directly use that website to find them.

Anonymous
10/26/25(Sun)20:25:34 No.107018158

Anonymous 10/26/25(Sun)20:25:34 No.107018158

>>107018144
>So many fucking papers on arxiv that do not yield the promised results
yep, that's called the replication crisis, a lot of researchers are frauds
https://en.wikipedia.org/wiki/Replication_crisis
>A 2016 survey by Nature on 1,576 researchers who took a brief online questionnaire on reproducibility found that more than 70% of researchers have tried and failed to reproduce another scientist's experiment results (including 87% of chemists, 77% of biologists, 69% of physicists and engineers, 67% of medical researchers, 64% of earth and environmental scientists, and 62% of all others), and more than half have failed to reproduce their own experiments.

Anonymous
10/26/25(Sun)20:26:28 No.107018166

Anonymous 10/26/25(Sun)20:26:28 No.107018166

>>107018156
Create a torrent, port forward the client port, and post the magnet link on the civarchive page too as another mirror
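
For reference, a magnet link is just a URI built around the torrent's info-hash. The snippet below is a minimal sketch of that format, assuming you already have the 40-character hex info-hash from your torrent client; `magnet_link` is a hypothetical helper and the hash and file name are placeholders, not real values.

```python
from urllib.parse import quote


def magnet_link(info_hash_hex, display_name, trackers=()):
    """Build a magnet URI from a BitTorrent v1 info-hash (hex), a display
    name, and optional tracker URLs."""
    parts = [f"xt=urn:btih:{info_hash_hex}", f"dn={quote(display_name)}"]
    # Tracker URLs are percent-encoded in full so they survive as one parameter.
    parts += [f"tr={quote(tracker, safe='')}" for tracker in trackers]
    return "magnet:?" + "&".join(parts)


# Placeholder hash and file name, for illustration only.
link = magnet_link("0123456789abcdef0123456789abcdef01234567", "my lora.safetensors")
print(link)
```

The link only helps if at least one seeding client is reachable, which is why the port forward matters.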

Anonymous
10/26/25(Sun)20:27:16 No.107018172

Anonymous 10/26/25(Sun)20:27:16 No.107018172

>>107018153
>yolo training methods
AI in general then? Everything is a first in this field, wtf are you on about

>destroyed
looks amazing to me >>107017133
>>107017523

Anonymous
10/26/25(Sun)20:28:38 No.107018183

Anonymous 10/26/25(Sun)20:28:38 No.107018183

>>107017990
perfect non real girls are hotter than random celebs

Anonymous
10/26/25(Sun)20:29:24 No.107018185

Anonymous 10/26/25(Sun)20:29:24 No.107018185

File: chroma_00594.png (1.91 MB, 1024x1536)

1.91 MB PNG

Anonymous
10/26/25(Sun)20:29:44 No.107018188

Anonymous 10/26/25(Sun)20:29:44 No.107018188

>>107018117
It's a known issue and one of the reasons details like hands or feet look bad.

Anonymous
10/26/25(Sun)20:29:49 No.107018189

Anonymous 10/26/25(Sun)20:29:49 No.107018189

>>107018172
>Flux creators give training guidelines
>decide to ignore them when the goal was to make a flux model without restrictions
Also since you're pointing out single gens it's clear to me you're not at a level for this discussion, if a model is not consistent then it's worthless especially when you have to put so much time investment into creating chroma gens compared to other models even on top end hardware.
I think we're done here because you fail to realize that 1 out of 5 gens is not acceptable when you're trying to maintain a look or style especially at the dog shit speeds chroma runs at

Anonymous
10/26/25(Sun)20:30:37 No.107018195

Anonymous 10/26/25(Sun)20:30:37 No.107018195

>>107018158
actually this survey cannot be reproduced

Anonymous
10/26/25(Sun)20:31:04 No.107018199

Anonymous 10/26/25(Sun)20:31:04 No.107018199

>>107018153
>obfuscate tokens
What does this mean exactly?
>do yolo training methods not backed by any data.
I believe this, judging by the stream of weird experiments in his hf repo, and the results of the training speak for themselves. But could you elaborate? Do you mean stuff like mixing high res and low res steps?
>>107018166
I am cursed with CGNAT unfortunately.

Anonymous
10/26/25(Sun)20:31:07 No.107018200

Anonymous 10/26/25(Sun)20:31:07 No.107018200

>>107018189
Show me a model that does as well, as consistently, as chroma across as many styles and is not qwen, which is style locked

Anonymous
10/26/25(Sun)20:31:34 No.107018201

Anonymous 10/26/25(Sun)20:31:34 No.107018201

>>107018156
>regularly purges "problematic" models
only if people snitch on them, they don't really check everything by themselves

Anonymous
10/26/25(Sun)20:32:14 No.107018205

Anonymous 10/26/25(Sun)20:32:14 No.107018205

File: chroma_00587.png (2.74 MB, 1024x1280)

2.74 MB PNG

Anonymous
10/26/25(Sun)20:32:39 No.107018207

Anonymous 10/26/25(Sun)20:32:39 No.107018207

>>107018195
kek, the irony is beautiful innit?

Anonymous
10/26/25(Sun)20:33:15 No.107018213

Anonymous 10/26/25(Sun)20:33:15 No.107018213

File: chroma_00561.jpg (44 KB, 832x1216)

44 KB JPG

Anonymous
10/26/25(Sun)20:33:42 No.107018217

Anonymous 10/26/25(Sun)20:33:42 No.107018217

>>107017859
https://arxiv.org/pdf/2510.15301
>We also find that classifier-free guidance (CFG) is less effective in our framework, indicating the need for better alternatives
nothingburger
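
For context on what the quoted sentence refers to: classifier-free guidance mixes a conditional and an unconditional prediction at every sampling step. A minimal sketch of the standard formula, using plain lists as stand-ins for model outputs; `cfg_combine` is an illustrative name and nothing here comes from the paper's code.

```python
def cfg_combine(uncond, cond, scale):
    """Classifier-free guidance: start from the unconditional prediction and
    move along the (cond - uncond) direction by the guidance scale."""
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


# Toy 3-element "predictions" standing in for real model outputs.
uncond_pred = [0.0, 1.0, 2.0]
cond_pred = [1.0, 1.0, 4.0]

print(cfg_combine(uncond_pred, cond_pred, 1.0))  # scale 1 gives back the conditional prediction
print(cfg_combine(uncond_pred, cond_pred, 3.0))  # larger scales exaggerate the difference
```

If guidance of this form is "less effective" in a new framework, prompt adherence has to come from somewhere else, which is presumably what the authors mean by needing better alternatives.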

Anonymous
10/26/25(Sun)20:34:12 No.107018221

Anonymous 10/26/25(Sun)20:34:12 No.107018221

File: 1751519418942597.jpg (1.05 MB, 1336x2008)

1.05 MB JPG

Anonymous
10/26/25(Sun)20:34:29 No.107018222

Anonymous 10/26/25(Sun)20:34:29 No.107018222

>>107018200
Anything XL, anon. Again, you're being disingenuous so I'm going to ignore you now.
>>107018199
If you haven't noticed, Chroma can't do artist tags despite having a dataset that includes various artists and styles. This is a conscious choice by the creator and has historically led to massive issues with models, as we saw with SD3 being dumbed down and the latest victim being Pony V7. It always seems to do massive damage because concepts that are often put into separate tags, or related to some tags, become completely fucked and you get stuff like inconsistent styles or broken anatomy, like SD3 not being able to produce an image of a woman lying in the grass because they decided to obfuscate anything close to a sex act.

Anonymous
10/26/25(Sun)20:35:35 No.107018228

Anonymous 10/26/25(Sun)20:35:35 No.107018228

File: chroma_00544.png (882 KB, 832x1280)

882 KB PNG

ignore the anti chroma troll, notice how he never posts gens

Anonymous
10/26/25(Sun)20:36:27 No.107018234

Anonymous 10/26/25(Sun)20:36:27 No.107018234

>>107018228
>notice how he never posts gens
that sentence is /sdg/ coded btw

Anonymous
10/26/25(Sun)20:37:26 No.107018237

Anonymous 10/26/25(Sun)20:37:26 No.107018237

>>107017859
Another dinov3 embedding model interesting

>>107017986
>where would I even upload it at this point?
https://catbox.moe/
https://gofile.io/

Anonymous
10/26/25(Sun)20:37:40 No.107018238

Anonymous 10/26/25(Sun)20:37:40 No.107018238

>>107018228
Now post 5 images with the same prompt within the same batch so I can laugh at you
>>107018234
I have a feeling it is one of those wastes of space, bro. We already tested chroma for weeks and decided the model had deal-breaker flaws. I bet if I did post a gen he'd shit his pants and reveal himself.

Anonymous
10/26/25(Sun)20:37:57 No.107018239

Anonymous 10/26/25(Sun)20:37:57 No.107018239

File: chroma_00547.png (1.64 MB, 832x1280)

1.64 MB PNG

Anonymous
10/26/25(Sun)20:39:14 No.107018244

Anonymous 10/26/25(Sun)20:39:14 No.107018244

>>107018222
I see what you mean.
I actually haven't tried specific styles with chroma, mainly just realism stuff.
Shame that he also did this stupid BS.

Anonymous
10/26/25(Sun)20:39:23 No.107018246

Anonymous 10/26/25(Sun)20:39:23 No.107018246

aren't there any local models that you can interact with like llms?

>generate a unicorn
output.jpg
>wait, make the horn purple, and make it bit longer
output.jpg

Why isnt memory a thing for image models?

Anonymous
10/26/25(Sun)20:39:30 No.107018249

Anonymous 10/26/25(Sun)20:39:30 No.107018249

>>107018238
>Now post 5 images with the same prompt within the same batch so I can laugh at you
do that with ANY model, any, it will either have issues OR the model will be so overfit you will have no variance at all, are you arguing against AI in general? Because every model has issues every few gens, chroma just has the least that is not also locked to a single style like illustrious / qwen

Anonymous
10/26/25(Sun)20:40:17 No.107018254

Anonymous 10/26/25(Sun)20:40:17 No.107018254

File: 1743753188785909.png (1.59 MB, 1120x1440)

1.59 MB PNG

>>107018205
impressive chroma gen

Anonymous
10/26/25(Sun)20:40:57 No.107018257

Anonymous 10/26/25(Sun)20:40:57 No.107018257

>>107018246
hunyuan 3 is supposed to be like that, like an LLM that can generate images (why it's so big)

Anonymous
10/26/25(Sun)20:41:23 No.107018260

Anonymous 10/26/25(Sun)20:41:23 No.107018260

>>107018246
edit models do that but they are slow and have to be loaded after the initial image is generated

Anonymous
10/26/25(Sun)20:41:48 No.107018261

Anonymous 10/26/25(Sun)20:41:48 No.107018261

File: chroma_00579.png (1.78 MB, 896x1152)

1.78 MB PNG

Anonymous
10/26/25(Sun)20:42:08 No.107018263

Anonymous 10/26/25(Sun)20:42:08 No.107018263

>>107018246
The memory is your prompt and an I2I workflow

Anonymous
10/26/25(Sun)20:43:02 No.107018274

Anonymous 10/26/25(Sun)20:43:02 No.107018274

>>107018249
You're too stupid for this conversation please go back to your containment thread, no other model besides chroma is unable to maintain the same consistent art style even with the aid of loras

Anonymous
10/26/25(Sun)20:43:39 No.107018276

Anonymous 10/26/25(Sun)20:43:39 No.107018276

>>107018274
thats what I thought

Anonymous
10/26/25(Sun)20:44:51 No.107018284

Anonymous 10/26/25(Sun)20:44:51 No.107018284

Give me a new idea for gen

Anonymous
10/26/25(Sun)20:45:00 No.107018285

Anonymous 10/26/25(Sun)20:45:00 No.107018285

>>107018257
Is that the first model? So can we expect a distilled/smaller model in the future? I want to interact with it like an LLM in LM Studio. Kinda annoying trying to install a billion different Python dependencies to generate images for every new model.

Anonymous
10/26/25(Sun)20:46:27 No.107018291

Anonymous 10/26/25(Sun)20:46:27 No.107018291

>>107018284
Enormous ufo ominously hovering over the landscape of your choice

Anonymous
10/26/25(Sun)20:47:49 No.107018294

Anonymous 10/26/25(Sun)20:47:49 No.107018294

File: b-but chroma bad!.jpg (570 KB, 1376x2304)

570 KB JPG

Anonymous
10/26/25(Sun)20:48:21 No.107018299

Anonymous 10/26/25(Sun)20:48:21 No.107018299

>>107018294
holy shit... imagine if he cussed?

Anonymous
10/26/25(Sun)20:49:28 No.107018305

Anonymous 10/26/25(Sun)20:49:28 No.107018305

>>107018284
plump bbw milf

Anonymous
10/26/25(Sun)20:49:46 No.107018309

Anonymous 10/26/25(Sun)20:49:46 No.107018309

>>107018294
omg it can do 1character?? best model ever!

Anonymous
10/26/25(Sun)20:50:31 No.107018311

Anonymous 10/26/25(Sun)20:50:31 No.107018311

>>107018309
1character more than you can looooooool retard

Anonymous
10/26/25(Sun)20:51:46 No.107018321

Anonymous 10/26/25(Sun)20:51:46 No.107018321

I'm tired of disabled retards trying to troll in this thread
>>107018111
Is why Chroma is shit and trained incorrectly, it shows the key flaw with this model and the retard is going to continue until he can find something else to annoy the general with.
>>107018244
I think the model is fine with realism but that's mostly due to flux so there's that positive to it. Realism is so strong that tags like selfie will make an anime lora turn 3D in some seeds.

Anonymous
10/26/25(Sun)20:51:50 No.107018322

Anonymous 10/26/25(Sun)20:51:50 No.107018322

ldg mustve been linked elsewhere with this newfaggotry

Anonymous
10/26/25(Sun)20:56:24 No.107018341

Anonymous 10/26/25(Sun)20:56:24 No.107018341

>>107018322
No it's the same faggot that griefs this thread daily, sora is now irrelevant and the api bullshit gets no traction so he's cycling through his autistic bullshit.
I do think chroma is good for some memes if you can cope with the art constantly being inconsistent between seeds

Anonymous
10/26/25(Sun)20:58:14 No.107018352

Anonymous 10/26/25(Sun)20:58:14 No.107018352

File: style_cluster_117_00001_.jpg (1.03 MB, 1280x1536)

1.03 MB JPG

also you basically have to use a style cluster or it looks like garbage, there are tons of different ones

Anonymous
10/26/25(Sun)20:58:39 No.107018354

Anonymous 10/26/25(Sun)20:58:39 No.107018354

>>107018335
I think a lot of the initial tests disregarded the prompting instructions and tried to a-b test models with exactly the same prompt, getting terrible results.

pony 7 still doesn't look great, but not as bad as the first mutagenic horrors people initially posted.

Anonymous
10/26/25(Sun)20:59:15 No.107018358

Anonymous 10/26/25(Sun)20:59:15 No.107018358

>>107018352
posted wrong image at first:

pony is better than originally thought (still not good) but it has major issues, he did not do uncond dropout

To somewhat fix it spam tons of random nonsense tokens into the negative

It is also somewhat locked to long prompts = good image
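
For anyone unfamiliar with the term: "uncond dropout" means randomly training on an empty caption so the model also learns an unconditional prediction, which CFG and negative prompts then rely on. A minimal sketch of the idea follows; the 10% rate is an assumed illustrative value, not taken from any model's training config, and `drop_caption` is a hypothetical helper.

```python
import random


def drop_caption(caption, drop_prob, rng):
    """Return an empty caption with probability drop_prob, otherwise the
    original caption unchanged."""
    return "" if rng.random() < drop_prob else caption


rng = random.Random(0)  # seeded so the run is repeatable
captions = ["a watercolor painting of a forest priestess"] * 1000
trained_on = [drop_caption(c, 0.1, rng) for c in captions]

dropped = sum(1 for c in trained_on if c == "")
print(f"{dropped} of {len(captions)} captions replaced with the empty prompt")
```

A model trained without this step never sees the empty prompt, which would be consistent with needing junk tokens in the negative to get a usable unconditional branch.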

Anonymous
10/26/25(Sun)20:59:24 No.107018359

Anonymous 10/26/25(Sun)20:59:24 No.107018359

>>107018321
>>107018322
And lastly, and once again I agree with you that it has issues, do you think it can be salvaged with extensive finetuning?
If someone were to train it further with say 50000 decent images and proper methods could they unfuck it for like a few thousand bucks?
I am asking this because I know that a major SDXL finetuner (actual finetuner, not shit-mixer) plans to move on to it soon. I wonder if I should have some hopium for it.

scabPICKER
10/26/25(Sun)21:00:20 No.107018361

scabPICKER 10/26/25(Sun)21:00:20 No.107018361

File: Screenshot from 2025-10-2(...).png (44 KB, 1154x310)

44 KB PNG

SongBloom chads have something to look forward to:

Anonymous
10/26/25(Sun)21:00:22 No.107018362

Anonymous 10/26/25(Sun)21:00:22 No.107018362

why does

anon sometimes

type like this

Anonymous
10/26/25(Sun)21:01:01 No.107018366

Anonymous 10/26/25(Sun)21:01:01 No.107018366

>>107016852
the prompt was this in all cases:

`a traditional media paper texture watercolor $medium$ painting $medium$ depicting a solitary crowned forest priestess rendered in fine inked linework and layered watercolor washes with an Art Nouveau botanical sensibility. A Caucasian woman, approximately 25 years old, stands in the middle of the composition, full body visible, facing forward with her eyes closed and a serene expression. Her long straight blonde hair is parted in the middle and falls down past her shoulders. She wears a flowing green gown with graduated watercolor tones and vertical drips, a fur-trimmed cloak at the top, and an embroidered chest panel showing a prominent crescent moon above stylized leaves. Her hands are on her hips with her left hand on the left side of her waist and her right hand on the right side of her waist. A delicate crown made of antlers, tiny beadlike orbs, and fern fronds rests atop her head, and a faint circular halo inked behind her head frames the crown. The background is a dense mossy woodland with gnarled tree trunks framing left and right, climbing ivy on the right, and tangled roots across the bottom foreground. Bioluminescent mushrooms with glowing undersides sit bottom left and bottom right, casting warm orange glow upward. The piece shows visible brush texture, wet-on-wet gradients, and pen hatching for detail.`

base AuraFlow 0.2 almost certainly didn't actually know any of those generic Booru style tags at the front and yet it still did better than Pony V7, which basically seems to be ridiculously over-reliant on the completely undocumented `style_cluster_xyz` tags.

Anonymous
10/26/25(Sun)21:01:45 No.107018368

Anonymous 10/26/25(Sun)21:01:45 No.107018368

>>107018359
You would need to go back to an earlier epoch and train it from there and it will take a long time. I really think he should have just trained it like flux and there wouldn't have been any issues. The model degraded the further along it went with training which is sad. I'm tempted to test loras with older models but I just can't be bothered to do it desu.
>>107018362
Redditor most likely

Anonymous
10/26/25(Sun)21:02:05 No.107018373

Anonymous 10/26/25(Sun)21:02:05 No.107018373

>>107018359
lol no

Anonymous
10/26/25(Sun)21:03:22 No.107018381

Anonymous 10/26/25(Sun)21:03:22 No.107018381

>>107018362
Redditors colonized this website and started reddit spacing everywhere for "legibility" or whatever.

Anonymous
10/26/25(Sun)21:03:27 No.107018382

Anonymous 10/26/25(Sun)21:03:27 No.107018382

>>107018352
I would be MUCH more forgiving if there was actually ANY FUCKING DOCUMENTATION WHATSOEVER for the style clusters. (There isn't). Nobody fucking knows what any of them consisted of dataset wise, somehow, which is ridiculous. The model is literally useless until he publishes a list of every single cluster with at the very least a one-word description of what they're supposed to do.

Anonymous
10/26/25(Sun)21:04:15 No.107018388

Anonymous 10/26/25(Sun)21:04:15 No.107018388

>>107018366
>base AuraFlow 0.2 almost certainly didn't actually know any of those generic Booru style tags at the front and yet it still did better than Pony V7, which basically seems to be ridiculously over-reliant on the completely undocumented `style_cluster_xyz` tags.

it's irrelevant what the original auraflow was trained for after the pony tuning. if you're not using "score_9, rating_sensitive, style_cluster_430" even though it sounds retarded you'll obviously get bad results. using them doesn't mean you'll get /good/ results, but if you leave them out you're not even giving it a chance.

Anonymous
10/26/25(Sun)21:04:30 No.107018389

Anonymous 10/26/25(Sun)21:04:30 No.107018389

>>107018359
i can think of at least three other models that are more worthy of tuning on ~50k images

Anonymous
10/26/25(Sun)21:05:04 No.107018394

Anonymous 10/26/25(Sun)21:05:04 No.107018394

>>107018358
I don't think anons should waste their time with that flaming piece of shit. Everyone warned him and he didn't listen and now he has a model even his retarded ass doesn't understand.
>>107018366
I asked a few questions in the discord and his drones got overly defensive when asked how and why would he implement this tagging system into a model with zero understanding or documentation. Fuck them and let them fester in shit absolute clown car of a model.

Anonymous
10/26/25(Sun)21:07:01 No.107018415

Anonymous 10/26/25(Sun)21:07:01 No.107018415

>>107018359
In my experience testing it a bit you can clean up Chroma reasonably well with even ONE lora, but it HAS to be trained at native 1024x1024, and it HAS to be captioned with proper (preferably hand-checked) English natural language captions, not slopmaxxed broken-grammar garbage

Anonymous
10/26/25(Sun)21:08:15 No.107018420

Anonymous 10/26/25(Sun)21:08:15 No.107018420

>>107018388
why 430 specifically when there's apparently literally 2048 clusters? or was that just an example?
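
Since the clusters are undocumented, the only way to find out what they do is to sweep them with a fixed prompt and seed. A rough sketch of building those prompts, assuming the `score_9, rating_sensitive, style_cluster_N` prefix quoted above; `build_prompts` is a hypothetical helper, and the valid ID range is not confirmed anywhere in the thread.

```python
def build_prompts(base_prompt, cluster_ids, prefix_tags=("score_9", "rating_sensitive")):
    """Return {cluster_id: prompt} with the tag prefix and one style cluster
    tag placed in front of the same base prompt."""
    prompts = {}
    for cluster_id in cluster_ids:
        tags = list(prefix_tags) + [f"style_cluster_{cluster_id}"]
        prompts[cluster_id] = ", ".join(tags) + ", " + base_prompt
    return prompts


# 117 and 430 are the two IDs mentioned in the thread; 1024 is arbitrary.
prompts = build_prompts("a watercolor painting of a forest priestess", [117, 430, 1024])
for cluster_id, text in prompts.items():
    print(cluster_id, "->", text)
```

Feeding each prompt to the same seed gives a crude contact sheet of what each cluster does.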

scabPICKER
10/26/25(Sun)21:08:16 No.107018421

scabPICKER 10/26/25(Sun)21:08:16 No.107018421

melband roformer is neato.

vocal separation.

Anonymous
10/26/25(Sun)21:08:44 No.107018423

Anonymous 10/26/25(Sun)21:08:44 No.107018423

im such a worthless gooner

Anonymous
10/26/25(Sun)21:09:00 No.107018425

Anonymous 10/26/25(Sun)21:09:00 No.107018425

>>107018359
>50000
datasets of legit meta changing finetunes are larger by orders of magnitude

Anonymous
10/26/25(Sun)21:09:26 No.107018427

Anonymous 10/26/25(Sun)21:09:26 No.107018427

>>107018394
it's possible that they literally do not have documentation of the styles, because they trained a classifier first that assigned the style tags.

it's linked on their guide at https://civitai.com/articles/21107/captioning-and-prompting-primer-for-v7

they probably used that classifier and a captioner and never saved any correlation with artist names.

Anonymous
10/26/25(Sun)21:11:02 No.107018436

Anonymous 10/26/25(Sun)21:11:02 No.107018436

Where do people get ideas about aesthetic tags etc. in Chroma? They do seem to have an effect but what's the source? I don't want to have to join this guy's discord just to learn the basics of how images were tagged for training, that should be public

Anonymous
10/26/25(Sun)21:11:16 No.107018438

Anonymous 10/26/25(Sun)21:11:16 No.107018438

>>107017166
You still haven't posted catboxes revealing the model and lora these gens are using. You claimed it's Chroma, but I am still waiting for the Catbox.

Anonymous
10/26/25(Sun)21:11:51 No.107018441

Anonymous 10/26/25(Sun)21:11:51 No.107018441

>>107018427
All the more reason to ignore his garbage model. Also if I recall the model has the same style swing issue chroma has with less features.
>>107018436
Chroma guy talked to pony guy and then we got that which hurt the model, he said he wouldn't do it and he copied the retard's homework.

Anonymous
10/26/25(Sun)21:13:11 No.107018447

Anonymous 10/26/25(Sun)21:13:11 No.107018447

>>107018425
I really think (if we're talking about photo stuff) as I said above you could really tighten Chroma up quite a lot by just training it on a dataset of ~1000 - 5000 actual photographs (no other content types whatsoever) at 1024x1024 with high-quality natural language captions.

Anonymous
10/26/25(Sun)21:14:13 No.107018451

Anonymous 10/26/25(Sun)21:14:13 No.107018451

>>107018361
AceStep 1.5 mogs that and will be released soon. And apparently, the Qwen team is working on Musicgen as well and they confirmed it will be open

Anonymous
10/26/25(Sun)21:15:09 No.107018460

Anonymous 10/26/25(Sun)21:15:09 No.107018460

>>107018438
If you can't tell what model an output comes from you should just give up honestly

Anonymous
10/26/25(Sun)21:16:12 No.107018466

Anonymous 10/26/25(Sun)21:16:12 No.107018466

>>107018460
Stop replying to this bored dipshit his disability is going to get cut off in a few weeks so feel bad for him

Anonymous
10/26/25(Sun)21:16:26 No.107018468

Anonymous 10/26/25(Sun)21:16:26 No.107018468

>>107018438
some of them do look like good examples of what Chroma can sometimes output. Other ones I've seen from that guy look suspiciously like Flux Krea moreso than Chroma though.

Anonymous
10/26/25(Sun)21:18:18 No.107018484

Anonymous 10/26/25(Sun)21:18:18 No.107018484

>>107018468
There are a couple of gens from him that seems to be Wan 2.2 with a Lora or a SaaS model, but he keeps being a faggot withholding information and not posting the catbox

Anonymous
10/26/25(Sun)21:19:29 No.107018491

Anonymous 10/26/25(Sun)21:19:29 No.107018491

File: 390c475493d58a4026f2d0654(...).jpg (24 KB, 480x360)

24 KB JPG

>Find a combination of an image and prompt that gets consistently good gens across several seeds in WAN Lightx2
>Load it into Wan2GP to generate a longer and higher-resolution video using full 30-step WAN 2.2
>Get some hokey rigid jerky 2x speed crap
>mfw I waited 2h30m for this shit

Anonymous
10/26/25(Sun)21:27:21 No.107018511

Anonymous 10/26/25(Sun)21:27:21 No.107018511

>>107018425
My bad. I didn't realize more modest SDXL finetunes below pony/illust level still trained on millions.
>>107018415
>>107018447
I am skeptical of the claim but would be great if you can publish the lora that can do that.

Anonymous
10/26/25(Sun)21:29:46 No.107018519

Anonymous 10/26/25(Sun)21:29:46 No.107018519

>>107017653
>>107017744
can't seem to find the proper way to do this in comfy, any tips?

scabPICKER
10/26/25(Sun)21:32:09 No.107018522

scabPICKER 10/26/25(Sun)21:32:09 No.107018522

>>107018451
>will be released soon.
:^)

https://vocaroo.com/1niQmJYusH4G

Anonymous
10/26/25(Sun)21:33:50 No.107018529

Anonymous 10/26/25(Sun)21:33:50 No.107018529

>>107018511
they don't, the only other super-high-image-count finetunes ever done (starting from a literal base model as in SDXL) are like Animagine and BigASP. Starting from something like Chroma which has heavy concept knowledge already though is going to be a totally different story even if Lodestones insists it's itself a "base model". A high-diversity high-quality dataset could make it way more coherent with nowhere remotely close to millions of images.

Anonymous
10/26/25(Sun)21:34:06 No.107018532

Anonymous 10/26/25(Sun)21:34:06 No.107018532

File: 1744532902013458.jpg (946 KB, 2000x1336)

946 KB JPG

Anonymous
10/26/25(Sun)21:35:43 No.107018536

Anonymous 10/26/25(Sun)21:35:43 No.107018536

File: 1746821860409772.jpg (886 KB, 2000x1336)

886 KB JPG

Anonymous
10/26/25(Sun)21:36:58 No.107018540

Anonymous 10/26/25(Sun)21:36:58 No.107018540

>>107018246
Not quite local but doesn't Tensorart have this on their site? Where you can prompt any model they're hosting with a chat LLM?

Anonymous
10/26/25(Sun)21:39:02 No.107018548

Anonymous 10/26/25(Sun)21:39:02 No.107018548

>>107018054
it's VRAM usage. the vae is for converting to and from latents. openai has been doing this for years because of the compute they have access to. it's faster, but the entire point of sd1.x having a vae in the first place was to fit on consumer gpus
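
To put rough numbers on that, here is a back-of-the-envelope sketch comparing how many values the diffusion model has to process in pixel space versus latent space. It assumes the usual 8x spatial downsample, with 4 latent channels (SD 1.x / SDXL style) or 16 (Flux style); `tensor_elements` is an illustrative helper.

```python
def tensor_elements(width, height, channels, downsample=1):
    """Number of values in a (channels, height/downsample, width/downsample) tensor."""
    return (width // downsample) * (height // downsample) * channels


pixels = tensor_elements(1024, 1024, channels=3)
latent_4ch = tensor_elements(1024, 1024, channels=4, downsample=8)
latent_16ch = tensor_elements(1024, 1024, channels=16, downsample=8)

print("pixel space:", pixels)
print("4ch latent: ", latent_4ch, f"({pixels // latent_4ch}x fewer values)")
print("16ch latent:", latent_16ch, f"({pixels // latent_16ch}x fewer values)")
```

Fewer values per image means fewer tokens for the denoiser, which is where the VRAM and speed savings come from, and the extra channels in newer VAEs buy back some of the detail lost in the 4-channel ones.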

Anonymous
10/26/25(Sun)21:41:54 No.107018564

Anonymous 10/26/25(Sun)21:41:54 No.107018564

>>107018548
ramtorch will save us unironically

Anonymous
10/26/25(Sun)21:48:58 No.107018614

Anonymous 10/26/25(Sun)21:48:58 No.107018614

>>107017844
what do you mean by that exactly?
>>107017800
it's even been in CivitAI's trainer for a while now

Anonymous
10/26/25(Sun)21:49:44 No.107018619

Anonymous 10/26/25(Sun)21:49:44 No.107018619

File: c_hunyuan_3_8bit_at_00051_.png (1.37 MB, 1024x1024)

1.37 MB PNG

>>107018257
Hunyuan 3 should be able to do this in theory, but they never added support for it. I tried experimenting with it anyway (it does work as a VLM, so you can feed it back the images it generates) but the memory use was intractable.

Anonymous
10/26/25(Sun)21:52:51 No.107018637

Anonymous 10/26/25(Sun)21:52:51 No.107018637

>>107018491
That's the fun of wan

>2.1 fluid, wavy and wobbly movement but often morphs, changes input image and poor quality
>2.2 believable, very high quality and "realistic" movement but too rigid and stiff

2.2 just isn't as fun. It tries too hard to be realistic. At least with 2.1, I can slap on the 256 lightx2v i2v lora and/or PUSA to retain character consistency.

ReallyComfy
10/26/25(Sun)21:53:59 No.107018648

ReallyComfy 10/26/25(Sun)21:53:59 No.107018648

File: Migu Round Trip.png (603 KB, 1339x845)

603 KB PNG

How could a model even begin to learn how to draw fine details when the VAE can't even encode it properly?

Anonymous
10/26/25(Sun)21:54:34 No.107018651

Anonymous 10/26/25(Sun)21:54:34 No.107018651

>>107018648
that is indeed the issue

Anonymous
10/26/25(Sun)21:57:43 No.107018667

Anonymous 10/26/25(Sun)21:57:43 No.107018667

>>107018491
using a seed that was made from lightx2 won't work well. are seeds even still relevant for trying to get the same output with wan?

Anonymous
10/26/25(Sun)21:59:26 No.107018680

Anonymous 10/26/25(Sun)21:59:26 No.107018680

>>107018648
>>107018651
Yep.
Despite still being 4 channel, the SDXL vae was a lot better than the turbo garbage in SD 1.5 (for which you wouldn't even need a comparison node to see how much it destroys the image).
Hence why it was still a noticeable improvement over its predecessors in terms of fine detail.

Anonymous
10/26/25(Sun)22:00:48 No.107018689

Anonymous 10/26/25(Sun)22:00:48 No.107018689

File: hunyuan_image_test_24_8r.jpg (415 KB, 1600x2560)

415 KB JPG

>>107018648
Hunyuan 2.1 has a 32ch VAE
Actually, I don't remember if it uses it for the full process or only the refiner.

scabPICKER
10/26/25(Sun)22:01:15 No.107018693

scabPICKER 10/26/25(Sun)22:01:15 No.107018693

>>107018689
>furry foot
um

Anonymous
10/26/25(Sun)22:01:57 No.107018698

Anonymous 10/26/25(Sun)22:01:57 No.107018698

>>107018689
Hunyuan 4.2 has a 64 channel VAE

Anonymous
10/26/25(Sun)22:06:49 No.107018729

Anonymous 10/26/25(Sun)22:06:49 No.107018729

>>107018648
how about flux vae?

Anonymous
10/26/25(Sun)22:08:17 No.107018737

Anonymous 10/26/25(Sun)22:08:17 No.107018737

Clothes changes for wan lora: https://civitai.com/models/2077374/sudden-outfit-change?modelVersionId=2350576
get it before it's banned

scabPICKER
10/26/25(Sun)22:11:03 No.107018756

scabPICKER 10/26/25(Sun)22:11:03 No.107018756

>>107018648
This is what Chroma Radiance is supposed to avoid.

ReallyComfy
10/26/25(Sun)22:11:29 No.107018757

ReallyComfy 10/26/25(Sun)22:11:29 No.107018757

File: flux vae round trip.png (556 KB, 1289x811)

556 KB PNG

>>107018729

Anonymous
10/26/25(Sun)22:13:27 No.107018773

Anonymous 10/26/25(Sun)22:13:27 No.107018773

>>107018737
why would it be banned? There's plenty of i2v sex acts, just not ones where there's a thought crime of it being non consensual

ReallyComfy
10/26/25(Sun)22:14:19 No.107018775

ReallyComfy 10/26/25(Sun)22:14:19 No.107018775

File: anything vae round trip.png (613 KB, 1365x851)

613 KB PNG

>>107018757

Anonymous
10/26/25(Sun)22:14:35 No.107018778

Anonymous 10/26/25(Sun)22:14:35 No.107018778

>>107018773
I expect it to be banned the same way nude loras for qwen image edit systematically get banned.

Anonymous
10/26/25(Sun)22:15:27 No.107018781

Anonymous 10/26/25(Sun)22:15:27 No.107018781

>>107018773
someone somewhere can use that to make a real woman be in her underwear, that's super dangerous tech

Anonymous
10/26/25(Sun)22:15:43 No.107018785

Anonymous 10/26/25(Sun)22:15:43 No.107018785

I am curious, does anyone know how the major models (SD, Wan, Flux, etc.) handled bucketing during training?
Did they use 64-pixel bucket steps like lora trainers, smaller steps, or did they resize and prepare all images separately beforehand?
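
For readers who haven't touched a lora trainer, this is roughly what "bucketing with 64" means there: build a set of resolutions whose sides are multiples of 64 and whose area stays within a budget, then send each image to the bucket with the closest aspect ratio. The sketch below is a generic version of that idea with assumed limits (1024x1024 area, sides 512 to 2048); it says nothing about what the big labs actually did.

```python
def make_buckets(target_area=1024 * 1024, step=64, min_side=512, max_side=2048):
    """Enumerate (width, height) buckets with sides that are multiples of
    `step` and an area no larger than `target_area`."""
    buckets = set()
    width = min_side
    while width <= max_side:
        # Tallest height, rounded down to a multiple of step, that fits the area budget.
        height = (target_area // width) // step * step
        if min_side <= height <= max_side:
            buckets.add((width, height))
            buckets.add((height, width))  # mirrored orientation
        width += step
    return sorted(buckets)


def assign_bucket(image_w, image_h, buckets):
    """Pick the bucket whose aspect ratio is closest to the image's."""
    aspect = image_w / image_h
    return min(buckets, key=lambda b: abs(b[0] / b[1] - aspect))


buckets = make_buckets()
print(len(buckets), "buckets")
print("1920x1080 ->", assign_bucket(1920, 1080, buckets))
print("1000x1000 ->", assign_bucket(1000, 1000, buckets))
```

A smaller step gives more buckets and less cropping per image, at the cost of fewer images per bucket when batching.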

Anonymous
10/26/25(Sun)22:16:47 No.107018791

Anonymous 10/26/25(Sun)22:16:47 No.107018791

>>107018778
because making her clothes go away is perverted, but her grabbing two cocks and going at it will make her giggle

Anonymous
10/26/25(Sun)22:17:41 No.107018794

Anonymous 10/26/25(Sun)22:17:41 No.107018794

>>107018791
Don't ask me to find logic in that.

Anonymous
10/26/25(Sun)22:18:07 No.107018796

Anonymous 10/26/25(Sun)22:18:07 No.107018796

File: rank64lorav2.jpg (2.01 MB, 4248x2081)

2.01 MB JPG

>>107018447
I trained a Chroma rank 64 lora on 500ish images, should I release it? It sadly still needs negative prompts to get rid of the blurriness and lowres look (despite the fact the dataset consisted of highres images), but it does fix some anatomy issues Chroma has such as holding swords, guns, bows etc (but still far from perfect)

It's not as sharp/realistic as Chroma Flash but I think it fucks up anatomy a bit less often

Anonymous
10/26/25(Sun)22:18:14 No.107018799

Anonymous 10/26/25(Sun)22:18:14 No.107018799

how the fuck do you search while banning tags in civitai?

Anonymous
10/26/25(Sun)22:23:24 No.107018823

Anonymous 10/26/25(Sun)22:23:24 No.107018823

>>107018775
>>107018757
>>107018648
>>107018729
If you guys want quantifiable data, a few months ago I made a small experiment: encode and decode an image with a vae, then calculate the VMAF score wrt the original image.
Note that despite being objective values you should consider these numbers rough placements rather than absolute indicators of quality as one model can compress one image more accurately than the other and they can end up trading places in another image. But it is still indicative of broader trends.
Also there is some compounding effect from running it twice for compressing and decompressing.
SDXL VAE 66
SD1.5 VAE 55
Flux VAE 89.52
Wan 2.1 VAE 84.12
Wan 2.2 VAE 86.15 (the 48-channel one for 5B)
Hunyuan Video VAE 89.85
SD3 VAE 79.72
SD3.5 VAE 80.44
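
The round-trip measurement described above can be sketched like this. Big caveats: PSNR is swapped in for VMAF because it fits in a few lines of plain Python, and the "VAE" is a toy quantizer, so none of this reproduces the numbers in the list. `round_trip_score` and `make_quantizer` are hypothetical names; a real test would pass the actual VAE's encode/decode functions and a VMAF scorer as `metric`.

```python
import math


def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


def round_trip_score(pixels, encode, decode, metric=psnr):
    """Encode then decode `pixels` and score the result against the original."""
    return metric(pixels, decode(encode(pixels)))


def make_quantizer(step):
    """Toy stand-in for a lossy VAE: quantize 8-bit values to multiples of `step`."""
    def encode(pixels):
        return [p // step for p in pixels]

    def decode(codes):
        return [c * step for c in codes]

    return encode, decode


if __name__ == "__main__":
    image = list(range(256))  # fake single-channel gradient "image"
    for step in (4, 32):
        score = round_trip_score(image, *make_quantizer(step))
        print(f"step={step}: {score:.2f} dB")
```

As the post says, a single image only gives a rough placement, so averaging the score over several images is the safer way to rank codecs.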

scabPICKER
10/26/25(Sun)22:24:40 No.107018834

scabPICKER 10/26/25(Sun)22:24:40 No.107018834

>>107018823
What does it do anyway? It's compression?

Anonymous
10/26/25(Sun)22:25:20 No.107018838

Anonymous 10/26/25(Sun)22:25:20 No.107018838

>>107018823
did you calculate the video output vmaf after saving videos in lossless quality?

Anonymous
10/26/25(Sun)22:29:03 No.107018855

Anonymous 10/26/25(Sun)22:29:03 No.107018855

>>107018834
Variational auto-encoder is a lossy compressor, yes.
Used for speeding up training and lowering VRAM requirements for consumer inference, at the expense of some quality.
>>107018838
Lol no.
I've fed the same image to all vaes.
Works fine on video vaes too as it just gets processed as a single "frame".

Anonymous
10/26/25(Sun)22:29:19 No.107018859

Anonymous 10/26/25(Sun)22:29:19 No.107018859

File: file.png (81 KB, 235x214)

81 KB PNG

>>107018794
>Don't ask me to find logic in that.
I realized that my life had improved significantly mentally once I stopped caring about the logic of society, it's completely incoherent and inconsistent, but that's just the way it is. It do be like that. Stoicism is rad babyyy!

scabPICKER
10/26/25(Sun)22:31:06 No.107018880

scabPICKER 10/26/25(Sun)22:31:06 No.107018880

>>107018859
:^)

Want to talk about the trinity?

ReallyComfy
10/26/25(Sun)22:32:04 No.107018888

ReallyComfy 10/26/25(Sun)22:32:04 No.107018888

>>107018823
Same trend I am seeing as I go down my list of VAEs and eyeballing the error image brightness. Flux and Hunyuan are best for the random 3 anime images I tested.

Anonymous
10/26/25(Sun)22:32:13 No.107018889

Anonymous 10/26/25(Sun)22:32:13 No.107018889

File: ChromaLora_00001_.png (1.36 MB, 1024x1024)

1.36 MB PNG

scabPICKER
10/26/25(Sun)22:33:41 No.107018896

scabPICKER 10/26/25(Sun)22:33:41 No.107018896

>>107018888
checked

I don't think it's comfy. I think it's a vae thing.

Anonymous
10/26/25(Sun)22:34:02 No.107018899

Anonymous 10/26/25(Sun)22:34:02 No.107018899

File: 1748877152177496.png (1.34 MB, 1064x976)

1.34 MB PNG

it's funny how Qwen Image Edit doesn't give a single fuck about the original image's style, it'll add the new element like you'd add something on paint and call it a day, that's soo lazyyy

Anonymous
10/26/25(Sun)22:34:05 No.107018900

Anonymous 10/26/25(Sun)22:34:05 No.107018900

what can i use for non realistic but also non anime? i find illustrious doesnt do my style. people shit on pony but i can get it to do what i need it to do as a base, but i need to do too much polishing and manual work

Anonymous
10/26/25(Sun)22:34:44 No.107018905

Anonymous 10/26/25(Sun)22:34:44 No.107018905

File: 1945003-Doro_20Momiji_20c(...).jpg (214 KB, 1536x1536)

214 KB JPG

scabPICKER
10/26/25(Sun)22:35:41 No.107018913

scabPICKER 10/26/25(Sun)22:35:41 No.107018913

>>107018899
That looks very cool, however.

Anonymous
10/26/25(Sun)22:37:49 No.107018927

Anonymous 10/26/25(Sun)22:37:49 No.107018927

>>107018899
some people really knew how to wear suits with class

Anonymous
10/26/25(Sun)22:39:41 No.107018937

Anonymous 10/26/25(Sun)22:39:41 No.107018937

>>107018927
almost any suit looks good if it's tailor made for you.

Anonymous
10/26/25(Sun)22:41:21 No.107018952

Anonymous 10/26/25(Sun)22:41:21 No.107018952

>>107018796
rank 64 seems kinda high DESU, at least if you mean that in Kohya scaling where it's gonna thus be > 500 MB
what training settings? was it trained at 1024x1024? If not that's your problem lol

scabPICKER
10/26/25(Sun)22:41:51 No.107018955

scabPICKER 10/26/25(Sun)22:41:51 No.107018955

SongBloom again. I find the input to be... sorta not really that controlling. But, you can get good gens.

https://vocaroo.com/1mtRTQX4ZRnG

Anonymous
10/26/25(Sun)22:42:25 No.107018961

Anonymous 10/26/25(Sun)22:42:25 No.107018961

>>107018927
>>107018937
more like "when you're handsome, everything fits well for you"

Anonymous
10/26/25(Sun)22:42:32 No.107018963

Anonymous 10/26/25(Sun)22:42:32 No.107018963

>>107018900
>i find illustrious doesnt do my style.
Are you using a shitmix?
Anyway base noob can do a wide variety of styles in my experience.

Anonymous
10/26/25(Sun)22:43:40 No.107018970

Anonymous 10/26/25(Sun)22:43:40 No.107018970

>>107018796
Yes I am interested.
Going to bed now but would check it out later.

scabPICKER
10/26/25(Sun)22:44:43 No.107018980

scabPICKER 10/26/25(Sun)22:44:43 No.107018980

Looking at the info for SongBloom, it's likely they'll be able to offer midi control at some point.

Anonymous
10/26/25(Sun)22:45:44 No.107018985

Anonymous 10/26/25(Sun)22:45:44 No.107018985

>>107018955
that sounds great. i tried ace step and it was crap it couldnt even follow a consistent beat.

Anonymous
10/26/25(Sun)22:46:26 No.107018990

Anonymous 10/26/25(Sun)22:46:26 No.107018990

Has anyone tried the longcat lora on native yet, does it work? I think an anon earlier tried the full model but it took 4 hours or something https://huggingface.co/Kijai/LongCat-Video_comfy/tree/main

scabPICKER
10/26/25(Sun)22:46:42 No.107018993

scabPICKER 10/26/25(Sun)22:46:42 No.107018993

>>107018980
>>107018955
The chorus is pretty great lol.

https://vocaroo.com/163Fm1cacGS7

it skips lyrics some, idk why.

Anonymous
10/26/25(Sun)22:48:35 No.107019011

Anonymous 10/26/25(Sun)22:48:35 No.107019011

>>107018990
>I think an anon earlier tried the full model but it took 4 hours or something
I'm looking at the benchmarks and it seems to be inferior to wan 2.2, so I don't see the point of using it

scabPICKER
10/26/25(Sun)22:49:25 No.107019019

scabPICKER 10/26/25(Sun)22:49:25 No.107019019

>>107018985
Yeah, SongBloom really is solid, but weird in how really rn you can't much control style. In theory you give it a sample, and maybe it's my settings, but it uh... kind of does its own thing.

There's supposed to be a text prompt version so you don't have to feed it an audio style sample, soon out:
>>107018361
Maybe by December???

Anonymous
10/26/25(Sun)22:53:09 No.107019052

Anonymous 10/26/25(Sun)22:53:09 No.107019052

File: shak.jpg (177 KB, 1024x1024)

177 KB JPG

scabPICKER
10/26/25(Sun)22:54:13 No.107019055

scabPICKER 10/26/25(Sun)22:54:13 No.107019055

WAN retards:
is there a wan that can take audio and lipsynch it?

:^)

scabPICKER
10/26/25(Sun)22:55:14 No.107019065

scabPICKER 10/26/25(Sun)22:55:14 No.107019065

>>107019052
lol it made it unrealistically short.

Anonymous
10/26/25(Sun)22:55:57 No.107019072

Anonymous 10/26/25(Sun)22:55:57 No.107019072

File: ChromaLora_00005_.png (1.62 MB, 1024x1024)

1.62 MB PNG

>>107018952
>what training settings? was it trained at 1024x1024? If not that's your problem lol
I used diffusion pipe's defaults. LR 1e-4, 68 epochs, over 500 images, took me about 3 days

Anonymous
10/26/25(Sun)22:58:13 No.107019087

Anonymous 10/26/25(Sun)22:58:13 No.107019087

File: -.jpg (68 KB, 1024x998)

68 KB JPG

Anonymous
10/26/25(Sun)23:02:33 No.107019113

Anonymous 10/26/25(Sun)23:02:33 No.107019113

>>107019072
>took me about 3 days
Damn, what GPU?
> 68 epochs, over 500 images,
34k steps assuming no repeats and batch size 1?
Perhaps not that slow then.

Anonymous
10/26/25(Sun)23:06:58 No.107019140

Anonymous 10/26/25(Sun)23:06:58 No.107019140

File: ChromaLora_00014_.png (1.47 MB, 1024x1024)

1.47 MB PNG

>>107019113
>Damn, what GPU?
Two 3090s
>34k steps assuming no repeats and batch size 1?
1 batch per gpu, no repeats per epoch
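
The step math from the last two posts, as a quick sketch. It assumes roughly 500 images and that the two GPUs each process a different sample per step (data parallel), which the thread doesn't confirm; if the GPUs instead split one model between them, the global batch stays at 1. Gradient accumulation is ignored and `optimizer_steps` is a hypothetical helper.

```python
import math


def optimizer_steps(num_images, epochs, batch_per_gpu=1, num_gpus=1, repeats=1):
    """Total optimizer steps for a run, ignoring gradient accumulation."""
    global_batch = batch_per_gpu * num_gpus
    steps_per_epoch = math.ceil(num_images * repeats / global_batch)
    return steps_per_epoch * epochs


if __name__ == "__main__":
    # Single GPU, batch size 1: matches the "34k steps" estimate.
    print(optimizer_steps(500, 68))
    # Two GPUs, 1 sample each per step (assumed data parallel): half as many steps.
    print(optimizer_steps(500, 68, batch_per_gpu=1, num_gpus=2))
```

Either way the model sees the same 34k samples; only the number of weight updates changes.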

Anonymous
10/26/25(Sun)23:09:02 No.107019148

Anonymous 10/26/25(Sun)23:09:02 No.107019148

>>107019140
what exactly is this lora, just photo realism?

Anonymous
10/26/25(Sun)23:12:08 No.107019173

Anonymous 10/26/25(Sun)23:12:08 No.107019173

>>107019140
her forehead is red, did she hit her head or something? lol

Anonymous
10/26/25(Sun)23:15:39 No.107019186

Anonymous 10/26/25(Sun)23:15:39 No.107019186

>>107019173
that car probably has those automatic seatbelts that stab you in the neck when lazily entering the vehicle

Anonymous
10/26/25(Sun)23:20:35 No.107019207

Anonymous 10/26/25(Sun)23:20:35 No.107019207

>>107019072
idk what diffusion pipe is
so you don't even know what resolution they train at?

Anonymous
10/26/25(Sun)23:25:27 No.107019231

Anonymous 10/26/25(Sun)23:25:27 No.107019231

File: loragrid2.jpg (1012 KB, 5118x1021)

1012 KB JPG

>>107019148
I trained on "everything", not just photorealistic images
The goal was to fix Chroma's anatomy issues as it mostly has images of humans making poses or interacting with objects
And of course, it has some personal flavor to it (for example, it can consistently make 1990s ad scan photos without adding unprompted texts unlike the base model, it can make comic book artstyle without outputting comic pages, it can make ingame screenshots without TV offscreen shots unlike what base Chroma does etc)
>>107019207
I trained on 1024x1024, sorry, I forgot to say

Anonymous
10/26/25(Sun)23:27:59 No.107019245

Anonymous 10/26/25(Sun)23:27:59 No.107019245

File: 1738474742292191.jpg (1.07 MB, 1248x1824)

1.07 MB JPG

Can I merge 2 6.46gb sdxl models with 12gb vram? I haven't merged a model since 1.5 days and I remember running out of vram sometimes with 8gb.

Anonymous
10/26/25(Sun)23:30:36 No.107019258

Anonymous 10/26/25(Sun)23:30:36 No.107019258

>>107017112
Are restarts a meme? I usually train with cosine with restarts and Adamw8 but I heard they are useless with adaptive optimizers. Does anyone know if this is true?

Anonymous
10/26/25(Sun)23:37:36 No.107019291

Anonymous 10/26/25(Sun)23:37:36 No.107019291

>>107019258
how perfect are your adaptive optimizers?

in most cases doing random shit from the stuff that could be possibly working and seeing what sticks is still the way to go

Anonymous
10/26/25(Sun)23:40:38 No.107019306

Anonymous 10/26/25(Sun)23:40:38 No.107019306

File: ChromaLora_00023_.png (1.28 MB, 1024x1024)

1.28 MB PNG

Anonymous
10/26/25(Sun)23:50:48 No.107019356

Anonymous 10/26/25(Sun)23:50:48 No.107019356

>>107019140
Were this many steps necessary or did you just go for overkill?
>>107019258
>>107019291
Restarts aren't "useless", they straight up rape adaptive optimizers (Or at least just prodigy, haven't tried the rest with restarts.)
I believe they can boost quality for non-adaptive optimizers like Adamw8 though.

Anonymous
10/26/25(Sun)23:54:06 No.107019377

Anonymous 10/26/25(Sun)23:54:06 No.107019377

File: ChromaLora_00029_.png (1.3 MB, 1024x1024)

1.3 MB PNG

>>107019356
>Were this many steps necessary or did you just go for overkill?
If anything, it's still undertrained / it's far from overfitting, lol

Anonymous
10/26/25(Sun)23:54:07 No.107019378

Anonymous 10/26/25(Sun)23:54:07 No.107019378

>>107019356
Ok I had a fucking brainfart there. Excuse me.
What I am saying is don't use them with Prodigy but you should be able to use them with Adam(8bit).

Anonymous
10/26/25(Sun)23:56:51 No.107019394

Anonymous 10/26/25(Sun)23:56:51 No.107019394

Who cares about music, is there a text2sfx model?

Anonymous
10/27/25(Mon)00:01:28 No.107019422

Anonymous 10/27/25(Mon)00:01:28 No.107019422

>>107017534
images use different workflows, the bottom two use a ton of custom nodes and have a stack of loras lol
i can't replicate your results in the first one, i get a very blurry rendition

sounds like it's back to v48 for me

scabPICKER
10/27/25(Mon)00:01:45 No.107019424

scabPICKER 10/27/25(Mon)00:01:45 No.107019424

>>107019394
me I do, me me me

I'm trying another source file :^)

It's promising, we'll see.

Anonymous
10/27/25(Mon)00:09:21 No.107019460

Anonymous 10/27/25(Mon)00:09:21 No.107019460

>>107018737
Did anyone figure out how to get it to change to nudity?

Anonymous
10/27/25(Mon)00:11:29 No.107019472

Anonymous 10/27/25(Mon)00:11:29 No.107019472

>>107019245
I hope you're at least block merging them.

Anonymous
10/27/25(Mon)00:12:19 No.107019476

Anonymous 10/27/25(Mon)00:12:19 No.107019476

File: ComfyUI_07964_.png (2.64 MB, 1152x1152)

2.64 MB PNG

I love how the Chroma haters pretend that the Flux architecture isn't vastly superior to SDXL at concept knowledge, scene coherence and prompt following. You have to be blissfully ignorant to ignore that SDXL has elementary understanding of what the world looks like. Let alone any NSFW concept it's already seen without relying on overcooking it with LoRAs. SDXL doesn't learn concepts. It can't generalize. Without controlnet, or some kind of hack, it does not understand composition.

These are not problems that have been made up by Chroma devs. The SOTA for image gen had been out for a while, and I mean years (Dalle 3). Models came around that were complete shit (SD 3/3.5), models that were good in their own way but still not good enough (Sigma/HunyuanDiT), then Flux came around, compared to Dalle it was still significantly behind in world understanding, scene coherence, NSFW and prompt following. Chroma came around and bridged that gap in all but two things (styles/character knowledge), and now it properly surpasses in those areas.

Now Chroma is the best model for photorealism. There is no better model currently available, even API is far behind both layers of censorship and plastic. Official Flux finetunes (E.G. Krea) are still behind what Chroma has achieved for photorealism. There are no other models that give you proper photographic look on demand, where you can place any 1girl in any scenario that you can think of and still get the photorealistic look.

Chroma may not be the easiest model to use, had its rough edges (earlier versions), but Chroma Flash HD really is the most coherent model available right now. It's also just a base model, so further large scale tunings to bring back things that it's missing are still on the line. Flux itself felt like a hopeless dead end for catching up to Dalle in any way, and now we're here.

Anonymous
10/27/25(Mon)00:13:41 No.107019485

Anonymous 10/27/25(Mon)00:13:41 No.107019485

>>107019476
>another wall of text
this guy is seriously mentaly ill

Anonymous
10/27/25(Mon)00:14:18 No.107019492

Anonymous 10/27/25(Mon)00:14:18 No.107019492

File: 1731255696403040.jpg (330 KB, 832x1216)

330 KB JPG

>>107019472
I didn't try yet, I don't think I can do it and I'd have to restart my UI and I'm generating anime girls. I dunno what that means, I was just going to 50/50 two models I like and get a gooder model(it's just that easy).
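
For what a plain 50/50 merge does under the hood: it's just a per-weight average of the two checkpoints. Minimal sketch below with plain Python lists standing in for tensors. `merge_weights` is a made-up name, and real merge nodes load checkpoint files and may do more than this (block merging applies a different `alpha` per block, for example). Whether it fits in 12gb depends on how the UI implements it, which this doesn't answer.

```python
def merge_weights(model_a, model_b, alpha=0.5):
    """Linear interpolation of two models: (1 - alpha) * A + alpha * B.

    Models are dicts mapping parameter names to flat lists of floats,
    a stand-in for real checkpoint tensors.
    """
    if model_a.keys() != model_b.keys():
        raise ValueError("models must have identical parameter names")
    merged = {}
    for name, weights_a in model_a.items():
        weights_b = model_b[name]
        if len(weights_a) != len(weights_b):
            raise ValueError(f"shape mismatch for {name}")
        merged[name] = [
            (1.0 - alpha) * a + alpha * b for a, b in zip(weights_a, weights_b)
        ]
    return merged


if __name__ == "__main__":
    a = {"layer.weight": [0.0, 2.0]}
    b = {"layer.weight": [2.0, 4.0]}
    print(merge_weights(a, b))  # 50/50 average of each weight
```

Since each parameter is averaged independently, nothing in the math itself needs both full models on the GPU at once.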

Anonymous
10/27/25(Mon)00:16:05 No.107019510

Anonymous 10/27/25(Mon)00:16:05 No.107019510

>>107019476
As someone who is only interested in text to video (and is burnt out of that and waiting for a local sora 2 so I can generate teens acting bratty with sound before going back to genning),

The fact that local models are in such a state that you even wrote a three-paragraph persuasive essay is top grim. I don't think it could get any grimmer than this. At least time will heal all wounds and it's just a matter of surviving and waiting for things to get better.

Anonymous
10/27/25(Mon)00:19:02 No.107019531

Anonymous 10/27/25(Mon)00:19:02 No.107019531

>>107019492
I think I remember merging XL with less VRAM back in the day (when it was first implemented in Comfy) for what that's worth. I could be misremembering, but I also am a simply retard as well.

Anonymous
10/27/25(Mon)00:19:32 No.107019536

Anonymous 10/27/25(Mon)00:19:32 No.107019536

File: 1747691772831390.jpg (854 KB, 1248x1824)

854 KB JPG

>>107019531
I'll give it a shot then, thanks.

Anonymous
10/27/25(Mon)00:20:47 No.107019543

Anonymous 10/27/25(Mon)00:20:47 No.107019543

>>107019476
>"A picture is worth a thousand words"
>he went for the thousand words anyway
loool

Anonymous
10/27/25(Mon)00:21:57 No.107019552

Anonymous 10/27/25(Mon)00:21:57 No.107019552

>>107019510
>The fact that local models are in such a state that you even wrote a three-paragraph persuasive essay is top grim.
this, it's a really cultish behavior, they're coping so hard they're pretending their toys is on par with what API has to offer, they don't seem to understand that you can enjoy local while admitting that your product is far from the best

Anonymous
10/27/25(Mon)00:22:01 No.107019553

Anonymous 10/27/25(Mon)00:22:01 No.107019553

File: 1757117215343811.jpg (1.1 MB, 2000x1336)

1.1 MB JPG

Anonymous
10/27/25(Mon)00:23:02 No.107019562

Anonymous 10/27/25(Mon)00:23:02 No.107019562

>>107019476
>>107019543
>thousand words
https://youtu.be/0rIvp-AreGI?t=114
(Sorry, I too hate FFX-2 but I couldn't help myself)

Anonymous
10/27/25(Mon)00:23:09 No.107019564

Anonymous 10/27/25(Mon)00:23:09 No.107019564

>>107019476
bub you gotta learn to parse through the b8 better it hurts me to see you like this

Anonymous
10/27/25(Mon)00:23:30 No.107019567

Anonymous 10/27/25(Mon)00:23:30 No.107019567

>>107019476
True and based chromaGOD, unfortunate about the blown out colors and chromatic aberration of some chroma posters here though but to each their own
bigASP is apparently also solid for some realism

Anonymous
10/27/25(Mon)00:24:06 No.107019569

Anonymous 10/27/25(Mon)00:24:06 No.107019569

are there any checkpoints that can do black women?

Anonymous
10/27/25(Mon)00:24:24 No.107019571

Anonymous 10/27/25(Mon)00:24:24 No.107019571

>>107019476
>“The heart has its reasons which reason knows nothing of... We know the truth not only by the reason, but by the heart.”
>― Blaise Pascal, Pensées
in other words, don't try to justify why you enjoy Chroma, feel free to enjoy it, but also feel free to accept that some people won't enjoy that model as well

Anonymous
10/27/25(Mon)00:25:07 No.107019577

Anonymous 10/27/25(Mon)00:25:07 No.107019577

What's new this week? Ditto? Dype? How come /ldg/ don't have weekly news?

Anonymous
10/27/25(Mon)00:25:35 No.107019579

Anonymous 10/27/25(Mon)00:25:35 No.107019579

>>107019577
>How come /ldg/ don't have weekly news?
be the change you want to see, make this like debo does on /sdg/

Anonymous
10/27/25(Mon)00:26:12 No.107019583

Anonymous 10/27/25(Mon)00:26:12 No.107019583

*yawn*

Anonymous
10/27/25(Mon)00:27:59 No.107019598

Anonymous 10/27/25(Mon)00:27:59 No.107019598

>>107019476
this is one of the saddest posts I've ever seen on 4chan, why are you so passionate about defending this model?

Anonymous
10/27/25(Mon)00:28:06 No.107019601

Anonymous 10/27/25(Mon)00:28:06 No.107019601

>>107019577
>How come /ldg/ don't have weekly news?
Because everyone who is based already knows whats new and is too busy genning to write news reports for newniggers

Anonymous
10/27/25(Mon)00:28:44 No.107019603

Anonymous 10/27/25(Mon)00:28:44 No.107019603

>>107019394
Is a good music model not inherently a good "sfx model"

Anonymous
10/27/25(Mon)00:29:01 No.107019605

Anonymous 10/27/25(Mon)00:29:01 No.107019605

>>107019569
what do you mean? every model can do black women

Anonymous
10/27/25(Mon)00:30:05 No.107019615

Anonymous 10/27/25(Mon)00:30:05 No.107019615

>>107019569
>are there any checkpoints that can do black women?
Have you tried (very dark skin:1.4), (dark skin:1.4)

If you mean the actual genetic facial structure then good luck because I couldn't even get jungle women last time I tried and had to settle for using a Lilo and stitch Lora to un-beauty-standard the faces

Anonymous
10/27/25(Mon)00:30:51 No.107019620

Anonymous 10/27/25(Mon)00:30:51 No.107019620

File: 1743899262667637.png (1.24 MB, 1440x1120)

1.24 MB PNG

>>107019569
tons of models can do black women.

Anonymous
10/27/25(Mon)00:31:59 No.107019631

Anonymous 10/27/25(Mon)00:31:59 No.107019631

>>107019601
>>107019620
based

Anonymous
10/27/25(Mon)00:32:10 No.107019633

Anonymous 10/27/25(Mon)00:32:10 No.107019633

>>107019620
lmaooo, that's amazing

Anonymous
10/27/25(Mon)00:32:50 No.107019637

Anonymous 10/27/25(Mon)00:32:50 No.107019637

>>107019577
news maker tend to be highly schizo for some reason

Anonymous
10/27/25(Mon)00:32:59 No.107019639

Anonymous 10/27/25(Mon)00:32:59 No.107019639

>>107019569
DUI checkpoints usually have a lot of cops

Anonymous
10/27/25(Mon)00:33:14 No.107019642

Anonymous 10/27/25(Mon)00:33:14 No.107019642

>>107019577
https://www.youtube.com/watch?v=uQiqKFK5_0w&t=1155s

Anonymous
10/27/25(Mon)00:33:17 No.107019644

Anonymous 10/27/25(Mon)00:33:17 No.107019644

>>107019605
>>107019615
yes i mean actual black woman, i can only get a tan anime girl from pokemon. its hard to do any ethnicities with illustrious, even when i prompt for asian nothing happens

Anonymous
10/27/25(Mon)00:33:43 No.107019647

Anonymous 10/27/25(Mon)00:33:43 No.107019647

File: ComfyUI_temp_dxygn_00014_.png (1.56 MB, 768x1344)

1.56 MB PNG

>>107019620
nice

Anonymous
10/27/25(Mon)00:34:10 No.107019649

Anonymous 10/27/25(Mon)00:34:10 No.107019649

>>107019620
this lore goes deeper than I thought

Anonymous
10/27/25(Mon)00:35:18 No.107019654

Anonymous 10/27/25(Mon)00:35:18 No.107019654

File: r.png (2.78 MB, 1024x1024)

2.78 MB PNG

>>107019356
> they straight up rape adaptive optimizers (Or at least just prodigy, haven't tried the rest with restarts.)
it can go pathologically wrong, but really doesn't have to.

yes, the adaptive scheduler would probably have continued with a lower/higher learning rate than the initial one, but the initial LR doesn't *necessarily* do damage. maybe it also IS moving the, uh, learned thing out of a local minimum/maximum.

not all adaptive optimizers are good at doing this kind of a thing on their own either.

Anonymous
10/27/25(Mon)00:35:22 No.107019655

Anonymous 10/27/25(Mon)00:35:22 No.107019655

>>107019620
>there's 2 debos
I KNEW IT!!

Anonymous
10/27/25(Mon)00:37:59 No.107019668

Anonymous 10/27/25(Mon)00:37:59 No.107019668

>>107019620
hahahahaahahahahahahah

gen a crowd of normal people sitting below the screens with their backs turned, monitoring the screens above them
add to screens: poopdickschizo, pissbuttfag, asianposter #6374

Anonymous
10/27/25(Mon)00:39:12 No.107019677

Anonymous 10/27/25(Mon)00:39:12 No.107019677

File: radiance.png (2.48 MB, 1024x1024)

2.48 MB PNG

>>107019476
ignore even more rough edges, enjoy radiance

Anonymous
10/27/25(Mon)00:39:26 No.107019679

Anonymous 10/27/25(Mon)00:39:26 No.107019679

>>107019620
>Debo#1 looks like Comfy
UH OHH IM NOTICING

Anonymous
10/27/25(Mon)00:39:31 No.107019680

Anonymous 10/27/25(Mon)00:39:31 No.107019680

File: ComfyUI_07970_.png (2.12 MB, 1152x1152)

2.12 MB PNG

>>107019485
Baiters consistently post walls worth of text since they are on every thread.

>>107019571
Just because I'm sharing my thoughts doesn't mean I take it personally when someone posts for the 1000th time that Chroma sucks because x or y reason. But some of the reasons sound sound enough to warrant argument. I'm not claiming Chroma is perfect, and I do acknowledge its imperfections.

Anonymous
10/27/25(Mon)00:40:27 No.107019685

Anonymous 10/27/25(Mon)00:40:27 No.107019685

>>107019680
>Just because I'm sharing my thoughts doesn't mean I take it personally
you make wall of texts and you don't take it personally? Sure...

Anonymous
10/27/25(Mon)00:40:49 No.107019690

Anonymous 10/27/25(Mon)00:40:49 No.107019690

Move

>>107019684
>>107019684
>>107019684
>>107019684

Move

Anonymous
10/27/25(Mon)00:41:10 No.107019693

Anonymous 10/27/25(Mon)00:41:10 No.107019693

>>107019644
>yes i mean actual black woman, i can only get a tan anime girl from pokemon. its hard to do any ethnicities with illustrious,
Yeah I don't think you're gonna get it from illustrious. Pony could do it better. Don't have experience with noob. Sorry anon, I understand your struggle

It's funny that the reasons why AI struggles with generating ugly people is the same reasons why it struggles with nigs, and those reasons are unrelated to "black = ugly" but more related to AI averaging everything out

Anonymous
10/27/25(Mon)00:41:48 No.107019698

Anonymous 10/27/25(Mon)00:41:48 No.107019698

>>107019685
If you're not a Zoomer that takes not long to read.

Anonymous
10/27/25(Mon)00:43:26 No.107019708

Anonymous 10/27/25(Mon)00:43:26 No.107019708

>>107019698
that's not the point, you're writing bibles to defend such a subpar model, what's wrong with you? do you think we're gonna be convinced by reading words? do you understand that the goal of an image model is to produce good images? I don't care what you want to say, I look at the image, if it looks like shit that's pretty much it

Anonymous
10/27/25(Mon)00:49:20 No.107019743

Anonymous 10/27/25(Mon)00:49:20 No.107019743

>>107018823
add qwen and lumina to this.
I think lumina uses the flux vae maybe? i dont remember

Anonymous
10/27/25(Mon)00:52:31 No.107019761

Anonymous 10/27/25(Mon)00:52:31 No.107019761

So has anyone actually tried to generate chinese cartoon smut with pony v7 yet? Thats what its for right? I see all these people seething about it not doing hyperrealism well but thats not even what its for?

Anonymous
10/27/25(Mon)01:22:13 No.107019932

Anonymous 10/27/25(Mon)01:22:13 No.107019932

>>107019761
anon, it's not so good at that either. you'll prefer noob/illustrious or neta-yume lumina or perhaps even chroma.

feel free to try tho.

Anonymous
10/27/25(Mon)01:46:45 No.107020042

Anonymous 10/27/25(Mon)01:46:45 No.107020042

>>107019603
Music isn't sound effects

Anonymous
10/27/25(Mon)03:13:50 No.107020448

Anonymous 10/27/25(Mon)03:13:50 No.107020448

>>107019258
Cosine With Restarts @ 3 works well with something like AdamW
they do nothing with adaptive though, you should just use Cosine for e.g. Prodigy or CAME
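
What "cosine with restarts @ 3" looks like as a schedule, roughly: the LR decays along a cosine curve and jumps back to the starting value at the beginning of each of the 3 cycles. This is a from-scratch sketch with equal-length cycles and no warmup, not any specific trainer's implementation; `cosine_with_restarts` and its parameters are names picked for illustration.

```python
import math


def cosine_with_restarts(step, total_steps, base_lr, num_cycles=3, min_lr=0.0):
    """Learning rate at `step` for a cosine schedule with hard restarts."""
    if not 0 <= step < total_steps:
        raise ValueError("step must be in [0, total_steps)")
    cycle_len = total_steps / num_cycles
    progress = (step % cycle_len) / cycle_len  # 0.0 right after each restart
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for step in (0, 50, 99, 100, 150, 299):
        lr = cosine_with_restarts(step, total_steps=300, base_lr=1e-4)
        print(step, f"{lr:.2e}")
```

The jump back to `base_lr` at each restart is the part being argued about above: a fixed-LR optimizer like AdamW just gets a fresh high-LR phase, while an adaptive optimizer that estimates its own step size has that value scaled back up regardless of what it settled on.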

Anonymous
10/27/25(Mon)03:15:56 No.107020460

Anonymous 10/27/25(Mon)03:15:56 No.107020460

>>107019743
Lumina reuses the Flux VAE yeah

Anonymous
10/27/25(Mon)03:50:28 No.107020607

Anonymous 10/27/25(Mon)03:50:28 No.107020607

>>107019932
I'll give it a try later since I just bought a new drive. Apparently these new models need insanely verbose prompts to be any good.

All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.