/g/ - Technology






Discussion of Free and Open Source Text-to-Image/Video Models and UI

Prev: >>106730007

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/sd-scripts/tree/sd3
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts
https://github.com/tdrussell/diffusion-pipe

>WanX
https://comfyanonymous.github.io/ComfyUI_examples/wan22/
https://github.com/Wan-Video

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
Training: https://rentry.org/mvu52t46

>Neta Lumina
https://huggingface.co/neta-art/Neta-Lumina
https://civitai.com/models/1790792?modelVersionId=2203741
https://neta-lumina-style.tz03.xyz/

>Illustrious
1girl and Beyond: https://rentry.org/comfyui_guide_1girl
Tag Explorer: https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbours
>>>/aco/csdg
>>>/b/degen
>>>/b/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
>>
Blessed thread of frenship
>>
File: 1747526914833421.png (2.85 MB, 1416x2120)
2.85 MB
2.85 MB PNG
>>
A question to the Chroma anon who posts asian women: do you merge the deltas using Chroma1-HD as the base, or Chroma-HD (v50) as the base?
>>
File: 1736925213835793.png (2.07 MB, 1344x1728)
2.07 MB
2.07 MB PNG
>>
>>106732413
Thicc!
>>
File: 1729351877214346.mp4 (1.09 MB, 832x480)
1.09 MB
1.09 MB MP4
>>
File: image.png (1.87 MB, 1280x768)
1.87 MB
1.87 MB PNG
Still searching for any wins for Hunyuan Image 3. Not finding any.

>The first battle of Ypres, World War I. Aerial view, looking over the front. Painted with rough brush strokes and palette knife. Low saturation. The sun is visible in the distance through the gas and smoke.

I'm having a hard time believing that it's this hopeless. 80B, MoE, autoregressive, must have something it does better.
>>
File: ComfyUI_07210_.png (1.95 MB, 1152x1152)
1.95 MB
1.95 MB PNG
>>106732427
Chroma-HD (v50) as the base. Now, for some bizarre reason, when I save the checkpoint using that mixing workflow >>106731891 it does not work well when I try to run inference on it. First, I tried Save Checkpoint, then I tried https://files.catbox.moe/eceuas.png
The checkpoint is still broken

Anyone got any pointers?
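For context, the merge itself boils down to roughly this outside of Comfy (a minimal sketch, assuming the delta .safetensors uses the same tensor keys as the base UNet; file names are placeholders, not my actual paths):

import torch
from safetensors.torch import load_file, save_file

# Minimal delta-merge sketch. Assumes the delta file shares tensor keys with the base UNet.
base = load_file("chroma-hd-v50.safetensors")          # placeholder path
delta = load_file("chroma-flash-delta.safetensors")    # placeholder path

merged = {}
for name, tensor in base.items():
    if name in delta:
        merged[name] = (tensor.float() + delta[name].float()).to(tensor.dtype)
    else:
        merged[name] = tensor  # tensors without a delta are copied unchanged

# Note: this only writes the UNet weights. If the saved file is missing the VAE/text-encoder
# keys a full checkpoint loader expects, outputs will come out fried.
save_file(merged, "chroma-hd-flash-merge.safetensors")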
>>
>>106732512
>aerial view
lol
>trenches are empty and not even deep
lmao
>>
>>106732512
pleasant style desu
>>
File: 1745055435107804.mp4 (1001 KB, 832x480)
1001 KB
1001 KB MP4
the man in the blue shirt runs onto the airplane behind him and it takes off high into the sky.

directors cut:
>>
>>106732512
wtf? this is really an image from that 80b model? bruh this is awful...
>>
>>106732519
the composition is garbage
what the fuck is at the bottom right with the circle shape? is it an invisible predator pod?
rifles/mortars popping out of the ground for seemingly no reason, those legs?arms? below the guy on the left? get your fucking eyes checked
>>
>>106732532
yeah thats why i said style and not the composition or adherence to the prompt you dumb faggot
>>
File: image(1).png (1.29 MB, 768x1280)
1.29 MB
1.29 MB PNG
Another Hunyuan 3. This one is interesting because of what it did follow and what it ignored.

>T-pose character reference. The character is a young woman with long dark purple hair and blunt bangs. She is wearing a camouflage fatigue shirt with a scuffed steel breastplate over it. She is wearing loose dark gray pants that are threadbare at the knees and ankles. She is wearing dark brown combat boots, and her pant legs are stuffed into the boots. She has a brown leather belt around her waist. She is wearing large glasses with rounded frames that display a cyan HUD overlay. The illustration is a highly detailed digital painting with fine linework.

It did a great job following the description of the character's clothing/appearance. It completely ignored the instructions about the style and pose.
>>
>>106732512
>80b model
>trained on 5 billion images
and it looks like this, I have no idea how you could fuck up this hard, they were supposed to slam dunk that shit with that much compute, they're so fucking incompetent it's comical lmao
>>
>>106732538
It also still has the issue of creating midgets by maximizing use of the canvas area above all else.
>>
bad or not, hunyuan 3 literally stole the popularity of wan api 2.5. based
>>
>>106732544
If it was good they wouldn't have open sourced it, Period.
>>
>>106732556
I hope those fuckers lose money on hosting 2.5 alone. Maybe keep them shy about going saas for another year or two.
>>
>>106732512
After dozens of threads filled with people model warring and generic shitposting it's kind of refreshing to get something objectively lame.
>>
File: 1748585421206750.mp4 (1.01 MB, 832x480)
1.01 MB
1.01 MB MP4
holy shit.

the man in the blue shirt fires a blue energy beam like Goku from Dragonball Z at the large airplane behind him, causing the airplane to explode.

not quite...
>>
>>106732544
The full dataset was 10 billion. They filtered like 40% of it, then filtered it some more cause it wasn't "aesthetic" enough.
>>
>>106732613
You just know their filtering algorithm filtered the most kino.
>>
File: 1739625828768957.png (1.47 MB, 1536x1024)
1.47 MB
1.47 MB PNG
had to pass this to QIE, netayume cant deal with double text gen. SAD!
>>
File: 1728810505285732.mp4 (901 KB, 832x480)
901 KB
901 KB MP4
the man in the blue shirt raises his arms, causing the large airplane and men behind him to float in the air.
>>
File: 1751168106741247.png (1.29 MB, 1357x1343)
1.29 MB
1.29 MB PNG
New to this. Is there any way to view the details of a safetensor embedding file? I have one that has some great tags for image quality but it also locks the subject in a single pose, and I have no idea how to view the file data without using a1111 or comfyui (I'm an invoke pajeet)
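(For anyone else wondering: if plain Python is an option, a few lines with the safetensors library will dump what's in the file without touching a1111/comfy. A minimal sketch, the path is a placeholder:)

from safetensors import safe_open

# Print the embedded metadata (often training tags/notes) and every tensor's shape.
with safe_open("my_embedding.safetensors", framework="pt") as f:
    print("metadata:", f.metadata())  # may be None if the trainer stored nothing
    for key in f.keys():
        print(key, tuple(f.get_tensor(key).shape))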
>>
File: 1745740507518050.png (928 KB, 1008x1008)
928 KB
928 KB PNG
>>106732645
the brown thread is at /sdg/.
you're welcome.
>>
File: y2k3.mp4 (1.07 MB, 1024x1024)
1.07 MB
1.07 MB MP4
>>
>>106732666
Haha mutt, where's your foreskin? Don't talk anything about brown you fucking inbred little shit with 250 years of colonial history.
>>
>>106732556
kek, that's a good point
>>
File: file.png (2.59 MB, 1328x1328)
2.59 MB
2.59 MB PNG
>>106732678
>>
File: 1732686467737266.png (945 KB, 1360x768)
945 KB
945 KB PNG
the blonde girl in the dress is sitting on the chair in the left part of the dressing room with her legs crossed, keep her in a low polygon style. remove the girl in the center of the room.

qwen edit v2 is such a cool model. and it kept the character (aya in parasite eve 1) in the same low poly style.
>>
File: 1732493702859357.png (381 KB, 1659x933)
381 KB
381 KB PNG
>>106732703
and this was the image source:
>>
sdxl still mogs
>>
Bro this is the most suspicious fucking reddit thread in history:
https://reddit.com/r/StableDiffusion/comments/1nszzmu/hunyuanimage_30_is_perfect/

And like I would be the first person to fairly point out if that was an accurate representation of the model (it's not)
>>
>>106732742
>plebbit is full of shills
tell me something new
>>
File: 1742147539204659.png (936 KB, 1360x768)
936 KB
936 KB PNG
>>106732712
>>
>>106732712
This one is the original? So it pixellated aya to match the background?
>>
File: 1737022438710734.png (1.23 MB, 1360x768)
1.23 MB
1.23 MB PNG
the woman on the left is shaking hands with the man in the center. keep their appearance the same.
>>106732830
yep, if I didnt say low poly it would prob change less.
>>
>>106732742
>bro
Tiktok zoomers are so retarded they are not even allowed to say nigger.
>>
File: 1753451464251395.jpg (778 KB, 1304x2312)
778 KB
778 KB JPG
>>
File: ComfyUI_07221_.png (2.14 MB, 1152x1152)
2.14 MB
2.14 MB PNG
So it is possible to get heavily stylized Chroma Flash images too. Just gotta git gud at prompting.

>>106732742
Where are the Seedream sloppers? What is their verdict on this model
>>
File: 1753362833524061.png (1.2 MB, 1360x768)
1.2 MB
1.2 MB PNG
the man in the center with white hair is holding a sign saying "more 1girl gens please". The girl on the left with red hair is typing at a computer on a pedestal. keep their appearance the same.
>>
>ran
>>
>>106732869
It's shit compared to Seedream. It's slower, lower res, and full of meltyslop artifacts. Hunyuan coped majorly over Seedream and thought upping the params would fix it. But it's expected that Hunyuan embarrasses themselves over and over, such is their fate
>>
File: 1752234689459913.png (3.52 MB, 1304x2312)
3.52 MB
3.52 MB PNG
>>
>>106732886
when was the last time you read a novel?
>>
>>106732885
the melty last night was hilarious. imagine spending a day off malding at someone more talented than you
>>
>>106732907
what happened?
>>
>>106732457

sista love
>>
>ran is obsessed with his imaginary rivals
>>
File: 1731876882246653.png (920 KB, 1176x880)
920 KB
920 KB PNG
>>
File: 1754030898527287.png (686 KB, 1168x888)
686 KB
686 KB PNG
>>106732933
>>
Every post is about drama. This is why you won't get hired, ranfaggot.
>>
>>106732916
ani posted after a long time being away >>106728038
what follows is unhinged schizophrenia
>>
>>106732950
Sounds like an awful situation. What happened next?
>>
>>106732869
good gen
>>
>>106732955
the singular schizo theory is solved, it's just a singular schizo ranfaggot
>>
Debo owns this thread
>>
File: 1747346157232657.png (1.08 MB, 1072x968)
1.08 MB
1.08 MB PNG
qwen edit is going to get banned in china now.
>>
File: ComfyUI_07231_.png (2.09 MB, 1152x1152)
2.09 MB
2.09 MB PNG
>>106732886
Nah, most realistically what happened is that they gave us a model that they tossed. They would never give us a decent 80B model. They probably have a Seedream 4 tier model cooking that they will then not release, which is why they gave us this.
>>
>>106732881
>Hope with 4 arms
kek
>>
>>106732933
>>106732942
loool
>>
File: 1748680235304996.png (1.05 MB, 1072x968)
1.05 MB
1.05 MB PNG
the man in image1 is kneeling and shaking hands with the cartoon bear in image2. keep the man's appearance the same.

KNEEL to pooh.
>>
>debo
>>
File: 1736131612972605.png (1.24 MB, 1072x968)
1.24 MB
1.24 MB PNG
>>106732990
the man in image1 is kneeling and shaking hands with the cartoon bear in image2. they are in a Chinese house filled with pots of honey labeled "hunny". keep the man's appearance the same.
>>
>>106732976
At least it gave us some insight into the sizes of SAAS models, maybe we'll no longer have to suffer retards who think models running on consumer hardware should be comparable in quality to 10+ times larger models.

If anything it's insane how little difference there is.
>>
File: 1747132518705219.png (1.13 MB, 896x1160)
1.13 MB
1.13 MB PNG
the man is holding a pizza box with his right hand and is wearing a Dominos Pizza uniform and Dominos Pizza hat. He has a rectangular nametag that says "goose".
>>
File: 1758425481229309.png (3.19 MB, 1416x2120)
3.19 MB
3.19 MB PNG
>>
>>106732976
>They probably have a Seedream 4 tier model cooking that they will then not release, which is why they gave us this.
they don't, if they had such a model we would've known about it and they would make money with some API shit, this is the best they can do and it's sad lol
>>
File: 1746583813024566.png (2.54 MB, 1416x2120)
2.54 MB
2.54 MB PNG
>>
Do we have workflows to put a picture of a girl in and get her doing porn? Obviously, right?


...may I see it?
>>
File: 1729646379297522.png (3.4 MB, 1416x2120)
3.4 MB
3.4 MB PNG
>>
File: ComfyUI_VFI_00018_.mp4 (2.69 MB, 736x960)
2.69 MB
2.69 MB MP4
it really all falls apart over 5 seconds/81 frames huh?
>>
>>106732512
does MoE mean we can offload layers to ram and still have it work? Could maybe be best model for local possible if so
>>
anyone know if there's a difference (for Illustrious) between putting a tag at the front of the prompt VS putting it at the end of the prompt but weighting it up?

in other words: is the prompt order just a series of weights?
>>
File: 1742277191826472.png (3.41 MB, 1416x2120)
3.41 MB
3.41 MB PNG
>>
File: 3565050.jpg (93 KB, 1080x1037)
93 KB
93 KB JPG
>comfy pushing for nightly pytorch
Surely nothing will break...
>>
Can I use sage attention 3 yet?
>>
>>106733143
???

They updated the command to use IF you want to install nightly pytorch

If you use nightly pytorch to begin with you know there can be breakage

By default it uses stable Pytorch
>>
>>106733180
Why do you reply like that?
>>
File: SageAttention3.png (59 KB, 1436x570)
59 KB
59 KB PNG
>>106733154
https://github.com/thu-ml/SageAttention/tree/main/sageattention3_blackwell
>>
>>106732869
Catbox?
Also, anon, I am not getting the same results as you with my Chroma Flash delta merges.
Can you send me the md5 hash of your merge so I can compare and check if there is something wrong?
Send me the md5 of each of the base models and the final model, please
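(If it helps, this is all I mean by the md5 — a quick chunked hash so we're comparing the same files; the path is a placeholder:)

import hashlib

# Chunked md5 so multi-GB model files aren't read into RAM all at once.
def md5sum(path, chunk_size=1 << 20):
    h = hashlib.md5()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            h.update(block)
    return h.hexdigest()

print(md5sum("chroma-hd-flash-merge.safetensors"))  # placeholder file name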
>>
>>106733188
What's

wrong

with reply

like this?
>>
>>106733188
comfy uses reddit often and as a result uses reddit spacing
>>
File: 1733191070001619.png (784 KB, 1176x888)
784 KB
784 KB PNG
>>
>>106733188
Because you seem retarded, so it really needed to be easy to follow
>>
File: image(4).png (1.79 MB, 896x1152)
1.79 MB
1.79 MB PNG
>>106733098
Yes, that's what I'm doing. It still takes 15-20 minutes per image. Don't expect much from it though.

>Portrait of a catgirl by Alphonse Mucha
>>
Can anyone help me out? I have wildcards and conditional if/else statements to set the negative prompt depending on what the wildcard resolves to. The problem is that only the first if/else statement actually works and sets the negative prompt. The subsequent else if statements are only processed AFTER the image has been generated, so the order goes if/else > make the image > else if. ComfyUI jumps straight into the ksampler and the rest of the process and saves the image before the if/else node resumes with the next else if. I know this because the showtext node only populates after the image has been made
>>
>>106733285
just catbox the pic on the right
>>
>>106733285
Sure, take a handgun, point it at your temple, pull the trigger.

Wah lah.
>>
>>106733285
no brainer solution is to copy/paste the sampler and delete the old one i think
>>
File: HunyuAACCCKKImage3.png (3.7 MB, 1664x1216)
3.7 MB
3.7 MB PNG
maybe if they put 160B params in Hunyuan Image 4.0, it will finally be able to topple titans like... the original Kolors
>>
why do people post spaghetti images instead of catboxing workflows?
>>
>>
>>106732637
kek
>>
File: ComfyUI_07244_.png (1.77 MB, 1152x1152)
1.77 MB
1.77 MB PNG
>>106732961
Thanks

>>106733200
https://files.catbox.moe/ax0106.png
For my 1152x1152 gens I exclusively use regular HD Flash.

MD5 hash of Chroma HD:
https://pastebin.com/8rDQiGsY

That should be the same as regular v50.

I don't have the merged model saved as a file; I can only run inference on it from the workflow I showed you, which consumes a lot of RAM. No idea what I'm doing wrong while saving the checkpoint, all it gives me is fried images when I save it (almost as if the VAE is missing or something).
>>
>>106733238
can you try a more long winded prompt? It was trained most likely on ai captions that art much more verbose
>>
>>106733347
why do you still come here to post vomit, embarrass yourself with unhinged schizo ramblings and learn nothing?
>>
>>106733381
Give me one and I'll post the result (though it will probably be like an hour...)
I have been sticking with simple prompts with the logic that if it's not following simple instructions it won't follow more complex ones. I am aware that one of the advantages of the model is supposed to be adherence to very long prompts, though.
>>
>>106733387
Post your body of work or fuck off at this point. I have a huge suspicion you don't even use illustrious
>>
I feel fortunate the most dedicated idiots that have issues with me are alcoholic low functioning losers that have been exposed as lolcows who can't even match the average /ldg/ poster in skill.
They also like to ERP pretending to be girls with other men and can't even read filenames to realize who's posting what.
>>
>>106732965
And what happened next?
>>
File: ComfyUI_09986_.png (1.96 MB, 1152x1152)
1.96 MB
1.96 MB PNG
>>106733373
Is the md5 for your Flash delta this one?
67e5a31bb70aee6442643290bfa8ea71

My outputs are completely slopped and fried like you mentioned, even when I don't save the weights. I am not using the custom Chroma nodes you are using though, because it errors out and Comfy doesn't seem to find them on the custom node manager.
>>
>DEBO
>>
I think he gets high out of his mind and thinks people give a fuck about his delusions. He can't even post good gens, just trying to humble brag and shooting himself in the foot every time
>>
>>106733410
Yes, you will get slop unless you prompt at 2k. Currently, there is no way to properly save the weights.
>>
>>106733410
>I am not using the custom Chroma nodes you are using though

Follow these steps
https://huggingface.co/lodestones/Chroma?not-for-all-audiences=true#deprecated-manual-installation-chroma
>>
>>106733389
>an action shot still frame high resolution photography of a star NFL quarterback as he extends his arm after throwing a long pass. He has just thrown a baby with all his might in a perfect spiral, tears stream, from the babies eyes as the baby reaches incredible speeds, it's mouth open in an endless wail. The quarterback is playing for the "Montana Edge Lords" in their customary jersey with Ted Kazinsky as their mascot and logo. For some reason, this all takes place in a hockey rink during a birthday party bash.
>>
>>106733408
see >>106733404
crashing out with the mask off is icing on the cake
>>
File: chroma_034534.jpg (1.53 MB, 2048x2048)
1.53 MB
1.53 MB JPG
>>106733434
At 2k I could get good unslopped results at random like picrel
This is from a merged model, not loaded from ram
>>
File: 1735285358832826.png (3.39 MB, 2120x1416)
3.39 MB
3.39 MB PNG
>>
>>106733459
Seems like you are butthurt
>>
>>106733034
Aesthetics-wise, I have never been impressed by an API model. We can get more out of local than what those closed models are capable of. Right now they are only ahead on trivia (due to their massive size, but that's also diminishing with lawsuits and copyrighted prompts getting banned), and edit models. That's about it. Local wins by a huge margin everywhere else, especially if you count what is possible with LoRAs.
>>
>>106733461
2k is not 2kx2k resolution. It's 2k pixel count
>>
>>106727940
come back nigli
>>
>>106733524
>chroma 2k is actually 44x44
wouldn't be surprised
>>
>>106733544
Meant megapixel
>>
>>106733524
yeah, whatever
the merge I made only 'works' well with 2k px sides

btw, it might be worth it to merge the Flash delta with the 2k Chroma model to see what happens
This was the last one:
https://huggingface.co/lodestones/chroma-debug-development-only/blob/main/2k-test/2025-09-09_22-24-41.pth
>>
>>106733510
>We can get more out of local than what those closed models are capable of.
Yes BUT
>Aesthetic wise, I have never been impressed by an API model.
Occasionally, very occasionally, I will be surprised: https://x.com/st66612873/status/1971949988344484312
>>
>>106733404
It's somewhat weird that this faggot talks about skill and posts AI generated images... Unless drawing simple masks around problematic areas is considered something groundbreaking, I wonder why you need to advertise here. You surely have a job already because of these "skills".
>>
>>106733487
no like the fact that you place another sampler, i think, makes comfy operate as if it should be used after all existing samplers
youre simply replacing the sampler, but im spitballing
>>
So what am I supposed to do if after updating cumfart-ui it says "Can't access property "graph", this undefined". I'm guessing this is because I used the new subnodes instead of making a complete spaghetti mess of nodes.
>>
File: file.png (51 KB, 1417x167)
51 KB
51 KB PNG
the absolute state
>>
How to prompt a clothed boner on Chroma? (without the penis being exposed)
>>
>>106733195
So this is only for blackwell rich fags? Would it work on 4070? Please tell me we dont have to use kijai workflows
>>
>>106733729
who was in the wrong here?
>>
File: 00003-777779.jpg (1.68 MB, 2016x2736)
1.68 MB
1.68 MB JPG
>>
>>106733812
bulging pants?
>>
>>106733823
>blackwell rich fags
you know they have a 60 series right? welcome to architectural exclusivity. Like console exclusivity, only less gay.
>>
>>106733823
it's only for 50+ series. anything below doesn't have the hardware for it to work.
>>
>>106733195
Why is it closed source?
>>
>>106733859
Why do drug dealers give out free hits? To get you addicted, now pay up.
>>
every time i find a decent concept i can't help but spend hours gacha'ing it to death... I need out of this loop
>>
>genjam
pepperidge Farm remembers
>>
>glowjam
>>
>>106733876
>new lora releases
re-prompt all previously saved gens, saving the best ones
>new lora releases
live die repeat
>>
File: WanVideo2_2_I2V_00463.webm (862 KB, 1248x720)
862 KB
862 KB WEBM
>>
File: asmonsmash.gif (31 KB, 128x128)
31 KB
31 KB GIF
jesus i just cannot get the hang of character lora training in flux. settings all seem right, i think its just the gay captioning.
its way more finicky than xl for sure.
>>
>>106733960
i don't like c*mfyui cancer
>>
>>106733859
>Responds to link pointing to source code
>Asks why it's closed source
Hello retard
>>
>>106734005
nobody does except the biggest reddit faggots
>>
File: image.png (1.56 MB, 1024x1024)
1.56 MB
1.56 MB PNG
>>106733456
>>
File: 00038-384731983.jpg (411 KB, 1664x2432)
411 KB
411 KB JPG
tried to give chroma (hd and dc2k) a chance and it couldn't get shit right with the prompts and settings. Very slow piece of shit model on my 5090 at 30 steps with reforge. flux srpo dev works and i actually get decent results.
>>
>>106734152
>flux srpo
I thought this was a nothingburger?
>>
>with reforge
>>
>>106734144
I feel like I should at least give this a (You)
>>
>>106734144
EIGHTY
I
G
H
T
Y
>>
do you delete old model?
>>
brb i'll order some more h100s
>>
>>106733850
Nice Bisley
>>
>>106734222
only if they suck
>>
>>106734291
did you mean to post that image to include the output..?
>>
>>106734307
Yeah, you can see how her skin isn't all red unlike >>106733285
>>
File: 00070-3726124944.png (2.62 MB, 1248x1824)
2.62 MB
2.62 MB PNG
>>106734291
you dumbass, the jannies are waking up very soon. delete quick
>>106734175
i get better results with it than regular flux dev.
>>
>>106733301
it's "Voila" you uneducated lowIQ subhuman garbage. how about you find a nice tall bridge.

>>106734312
yeah that isn't the issue. this is a blueboard, meaning no nsfw/porn is usually allowed.
>>
why not just keep sunburn in negatives at all times? you are overly complicating it with so many if/then
>>
File: ComfyUI_temp_jnrvz_00002_.png (2.75 MB, 1120x1840)
2.75 MB
2.75 MB PNG
Are there any of those data annotating sidejobs that aren't chink or jeet scams? I'd like some sidemoney for my neetdom.
>>
>>106734291
>>106734347
legend

>>106734313
nice brown girl, what ethnicity did you prompt?
>>
>>106734347
>uncensored cunny on a blue board
how ToTpilled
>>
>>106734376
fair, but i feel like there has to be a more elegant or simpler solution. unrelated but what is the style lora? it looks like possummachine but i havent seen that trigger be4.
>>
>drop my wan quant down to q4ks
>realize theres now 6 giggerbytes freed on my gpu
huh, math really isnt my strong suit i thought id free way less than that lmao
should i try to drop the clip loader into that extra vram? will that fit?
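(rough math, assuming ~14B weights and ignoring overhead: Q8 is roughly one byte per weight, so ~14-15 GB, while Q4_K_S is a bit over half a byte per weight, so ~8 GB — a difference of about 6 GB, which lines up with what you're seeing. whether the text encoder fits in that gap depends on which quant of it you load; the full-precision files are themselves several GB.)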
>>
>>106734313
>than regular flux dev
That says nothing though, regular flux dev is plastic shit, this image still has flux face but thankfully not the chin
>>
>>106734357
sexy south American aztec woman
>>
>>106734350
what style is that
>>
>>106734415
and for the first time i got an OOM at the RIFE VFI node. after.. dropping the quant..

like what the fuck powers comfyui? actual niggerlicious voodoo magic?
>>
>>106734443
it's right THERE right FUCKING THERE JUST USE IT YOU-YOU DOUBLE NIGGER
>>
>>106734443
>>106734451
problem with cum-ui is that the custom nodes are made by whoever. some might use their own memory management (or lack of) and be buggy
not necessarily a cum backend issue.
sort of expecting some big malware hit at some point when people download a wrong custom node...
>>
>>106734438
It's the vidya screenshot lora someone posted here + Neuro lora.
>>
File: 00176-1614010019.png (3.1 MB, 784x2800)
3.1 MB
3.1 MB PNG
take the thin pill
>>
>>106734470
after running flux enough to only run into a single oom after like 20 gens, i'm starting to suspect it's kijai's patch sage attention node.
I just assumed from the start i wouldn't need it if it's already installed, but it seems you *do* need his node for it. Like, maybe it's just busted/fucked and not managing the memory well? I haven't run without it. Gonna try, because EVERY other node besides video combine is a comfy node.
>>
>>106734470
me when i make shit up
>>
>kijai
>>
>>106734502
Nice one. I like her outfit.
>>
knowing what you're doing makes browsing civitai for workflows a nightmare.

why is there a custom nodepack for a simple Int literal?! why is there a nest of if branches that don't make sense?
why did you add a piss ton of logos and stupid fucking cutesy shit in your wf that will just bloat it?

and worst of all are the people who take existing example workflows and add a ton of shit that no one asked for.
or those ultra mega workflows that have everything in one, assuming you can find the right group hidden in a sea of cancer spaghetti.
holy fuck some people must just die.
i'm so happy to have been born just above the line of retardation.
>>
File: bog call.jpg (20 KB, 400x400)
20 KB
20 KB JPG
>>106734433
kino

>>106734507
reran without the sage attention node, it was slower, meaning yes you need it to use sage attention. very gay.
still ooms at RIFE, so i think i know what broke. predictably, i fell for the updoot meme.
>>
>>106734557
if you really knew what you were doing, you wouldnt need another's workflow kek
>>
Huh, I'm seeing a better prompt understanding with fp16 model and the new wan 2.2 t2v lora than with the q8 model.
>>
>he uses everything everywhere
now i can't find anything anywhere

>doesn't use reroute nodes
please reroute your face into a wall

>>106734574
valid point. i mostly browse to see if there is anything to learn. sometimes you do find people who know what the fuck they are doing but it is getting increasingly rare.
>>
File: fairy-mossy-cartoon.jpg (1.09 MB, 1440x2168)
1.09 MB
1.09 MB JPG
>>106734557
Yeah its a confusing mess out there. I wish people would share images of the workflow rather than always making you download and open it manually since usually I just want to know how to do some minor thing to add to an existing workflow.
>>
>>106734587
>he uses reroute nodes
this is italian american discrimination. eat your spaghetti like a real man you fuckin queer.
>>
i have a 5090 and i used a script to install sage attention2.2 with comfy, how hard is it to install sageattention3 for a noob?
>>
>>106734621
You have to put yourself on a fucking list and beg for access to it from chinaman like a fucking dog they're about to eat.
>>
>>106734621
general rule of thumb, wait 2 more weeks until they iron out the kinks. enjoy what you have for now.
>>
File: ComfyUI_18784.png (3.15 MB, 1152x1728)
3.15 MB
3.15 MB PNG
>>106734144
I think it did OK-ish for such a shitty prompt.

>>106734175
The degraded/noisy output is a bummer, but it really does help base Flux out.
>>
>>106734629
ill wait then, thanks
>>
>>106734609
Wildcard examples. Does anyone have any ideas for more locations?

Here's a list of expressions I have if anyone wants to copy:
smug, looking at viewer
smile
smile, looking at viewer
ecstatic
ecstatic, looking at viewer
drunk, mouth open
drunk, looking at viewer, mouth open
excited
excited, looking at viewer
ahegao
nervous
nervous, looking at viewer
face blush
face blush, looking at viewer
happy, looking at viewer
embarrassed
embarrassed, looking at viewer
smirk, looking at viewer
mischievous grin, looking at viewer
ahegao, clenched teeth
winking, looking at viewer

And my list of poses:
smug, looking at viewer
smile
smile, looking at viewer
ecstatic
ecstatic, looking at viewer
drunk, mouth open
drunk, looking at viewer, mouth open
excited
excited, looking at viewer
ahegao
nervous
nervous, looking at viewer
face blush
face blush, looking at viewer
happy, looking at viewer
embarrassed
embarrassed, looking at viewer
smirk, looking at viewer
mischievous grin, looking at viewer
ahegao, clenched teeth
winking, looking at viewer
>>
File: 1739047854633526.png (2.12 MB, 2097x1070)
2.12 MB
2.12 MB PNG
>>106734144
>>
>>106734702
qwen image btw
>>
Dead general
>>
>>106734702
we really are gonna look back at a.i fuckups with genuine nostalgia and reverence in 5 years aren't we?
fuckin genning fully photorealistic videos of football players chucking babies and another catches the baby perfectly, only to score a touchdown and driving the baby into the ground cartoonishly
i'm looking forward to it. and still sticking to 16gb of vram because i won't have a choice :)
>>
File: 00012-1221325739.png (2.87 MB, 1080x1920)
2.87 MB
2.87 MB PNG
>>106734721
I don't even like loli anymore, but i will pour one out for you in a few hours when the mods wake up and cave your face in. Like, whatever your intention/goal here is, I'm all for it.
>>
>>106734713
better than that 1guy constantly spamming irrelevant saas bait
>>
File: ComfyUI_temp_albym_00001_.png (1.79 MB, 1024x1024)
1.79 MB
1.79 MB PNG
>>106734702
>>106734144
>>106733456
Chroma Base
>>
What bait? SaaS models are superior to open weight ones, this was confirmed with Hunyuan. 80b params and it still doesnt come close to seedream
>>
>>106734771
What is you name, avatarfag?
>>
>>106734843
Hugh
>>
>>106734546
Ty. it's inspired by an obscure pc-98 game, alantia
>>
comfyui's catastrophic oom that just knocked out display to both my monitors made me use the reset button on my case for the first time in 5 years of owning it.


Wow!
>>
>>106734502
Ozempic is a blessing
>>
but really these threads need a theme so we know what to prompt
>>
>>106734972
every A.I thread has a common theme;

Autism and Schizophrenia.
>>
>>106734955
Cool story bro
>>
File: file.png (28 KB, 103x126)
28 KB
28 KB PNG
you walk in and see your gf drinking pepsi instead of the coke she said she was going to buy
>>
>>106734674
ask an llm to generate you booru style tag lists from whatever subject you want
you can also find wildcard lists made by other people with a search engine
heres a list of urban locations for example
https://litter.catbox.moe/w0wd48evx03pbrbo.txt
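if you end up with a pile of those .txt lists, resolving them yourself is a few lines of Python — a minimal sketch, assuming one option per line and the usual __name__ convention; file names are placeholders:

import random
import re
from pathlib import Path

WILDCARD_DIR = Path("wildcards")  # e.g. wildcards/location.txt, wildcards/expression.txt (placeholders)

def resolve(prompt: str) -> str:
    # Replace every __name__ token with a random non-empty line from wildcards/name.txt.
    def pick(match: re.Match) -> str:
        lines = (WILDCARD_DIR / f"{match.group(1)}.txt").read_text(encoding="utf-8").splitlines()
        return random.choice([line for line in lines if line.strip()])
    return re.sub(r"__(\w+)__", pick, prompt)

print(resolve("1girl, __expression__, __location__, best quality"))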
>>
File: 00082-2382600484.png (2.59 MB, 1248x1824)
2.59 MB
2.59 MB PNG
>>
>>106735031
What base checkpoint are you using?
>>
>>106735048
nta
chroma-hd
>>
>>106735048
Flux SPRO

Also nta.
>>
>>106735048
mungus
Also NTA.
>>
>>106735048
ponyrealism2.1
>>
>>106732742
/r/stablediffusion is hilariously bad
>>
https://www.youtube.com/watch?v=LicoKOX9CuQ
What do you think of AI videoclips? Is this the only "genre" where the 5-second limit doesn't matter?
>>
>>106735031
flux srpo, struggling soo hard to replicate this camera again. Fucking bullshit natural language based model.
>>
File: ComfyUI_01440_.png (3 MB, 1504x1504)
3 MB
3 MB PNG
>tricked by the srpo meme again.
>>
File: 00108-1486062195.jpg (1.27 MB, 2480x2688)
1.27 MB
1.27 MB JPG
>>106733404
>>106733433
>>106733459
>>106733617
>waits until EU hours to larp as me posting old gens I haven't posted in 6+ months
This is why your life is hell and you have nothing.
Just posting a new gen which is a chroma leftover. You're so pathetic and autistic you collect every gen I post which means you're a superfan.
Also this is why I don't give catbox
>>
File: ComfyUI_01445_.png (2.99 MB, 1792x1200)
2.99 MB
2.99 MB PNG
>>
I guess the wheel chairs really do get to him so I will dust off chroma to make more I guess
>>
File: ComfyUI_01446_.png (3.19 MB, 1872x1248)
3.19 MB
3.19 MB PNG
>>
File: ComfyUI_01447_.png (3.53 MB, 1872x1248)
3.53 MB
3.53 MB PNG
>>
The fuck, if the aspect ratio isnt 1:1 for this setup, it won't gen at all. Throws me a massive error about it.

What do?
>>
>>106735365
>Throws me a massive error about it.
How could we possibly know if we can't see the error? But if I were to wager a guess, the size of your image isn't divisible by 32 on one or more of its dimensions.
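(if that's it, something like this will snap whatever size you ask for to something the sampler accepts — purely illustrative; the required multiple is model/VAE dependent, 32 is just my guess here:)

def snap(width: int, height: int, multiple: int = 32) -> tuple[int, int]:
    # Round each side down to the nearest accepted multiple (often 8, 16, 32 or 64 depending on the model).
    return (width // multiple) * multiple, (height // multiple) * multiple

print(snap(1280, 768))  # -> (1280, 768), already fine
print(snap(1000, 700))  # -> (992, 672)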
>>
>>106735279
What is happening here?
>>
File: 1734103313419153.png (244 KB, 1460x770)
244 KB
244 KB PNG
>>106735365
just do this
>>
>>106735387
A very ass pained schizo doing the same thing he has done for 3 years, which is what put him in his little containment thread where he does 80% of the bumping himself.
>>
>>106735365
>still using kijai's workflow
why?
>>
>>
>>106735404
Don't listen to this guy. Comfy's workflow fucks up all the time.
>>
>>106735407
See what I mean old pics he's a superfan
>>
File: offloading-.jpg (778 KB, 1024x1057)
778 KB
778 KB JPG
>>
>>106735404
it just werks and I'm pretty sure it's faster
>>
>>106735385
>>106735391
That was it, thanks. Why is it so pissy about the dimensions?

>>106735404
I'm still a beginner and this workflow is the only thing that isn't severely discoloring the results.
>>
>>106735434
Don't ask me. Ask Qwen.
>>
Is it possible to train wan 2.2 14b i2v in 48GB VRAM?
I can almost do it with the following command, but I always OOM before it can write a checkpoint, so I can't resume:
accelerate launch --num_cpu_threads_per_process 1 --mixed_precision bf16 src/musubi_tuner/wan_train_network.py \
--task i2v-A14B \
--dit_high_noise /home/anon/Documents/ComfyUI/models/diffusion_models/wan2.2/wan2.2_i2v_high_noise_14B_fp16.safetensors \
--dit /home/anon/Documents/ComfyUI/models/diffusion_models/wan2.2/wan2.2_i2v_low_noise_14B_fp16.safetensors \
--dataset_config /home/anon/Documents/musubi-tuner/data/city-video-cfg/city-video-dataset.toml --sdpa --mixed_precision fp16 --fp8_base \
--optimizer_type adamw8bit --learning_rate 2e-4 --gradient_checkpointing --gradient_accumulation_steps 1 \
--max_data_loader_n_workers 2 --persistent_data_loader_workers --offload_inactive_dit \
--force_v2_1_time_embedding \
--network_module networks.lora_wan --network_dim 32 \
--timestep_sampling shift --timestep_boundary 900 --min_timestep 0 --max_timestep 1000 --discrete_flow_shift 3.0 \
--max_train_epochs 16 --save_every_n_epochs 1 --seed 23571113 \
--save_state \
--output_dir /home/anon/Documents/musubi-tuner/data/city-video-output/ --output_name wan2.2-14b-i2v-city.safetensors \
--logging_dir /home/anon/Documents/musubi-tuner/data/city-video-logs

I've tried setting --blocks_to_offload 32 but not only does it slow things down quite a bit, it still OOMs eventually anyway. I'm training on 836x480 video. Maybe I should go smaller?
>>
>>106735417
what model did you use for that one?
>>
Someone has to explain to me how the Hunyuan fags trained their model with 5 billion images yet it can only do 4 styles max like the other slopped models, wtf?
>>
If you're upset over the way prompts are formatted on a free tool, I don't think local is for you atm. It's very easy to get, and there are easy ways to remove undesirable aspects
>>
File: 00096-2261223199.png (2.88 MB, 1536x1536)
2.88 MB
2.88 MB PNG
first decent chroma gen after going through 40+ failed slops and tweaking cfg, sampler and hi res fix settings.
>>
File: ComfyUI_01452_.png (3.52 MB, 1872x1248)
3.52 MB
3.52 MB PNG
>>
>>106735441
if you OOM it means you can't unless you offload
>>
>>106733238
>>106734144
E I G H T Y
https://youtu.be/UJ12rCD-Dkk?t=15
>>
>>106735413
why are you replying to your own posts
>>
>>106735456
Yeah I tried the max offloading, it still OOMd.
I reduced my video dataset to 640x360, and I'm recreating my latent cache, and then I'll try it again. I think I'll make it, I was very close with 836x480.
>>
>>106735441
Can only musubi train WAN? Onetrainer has better offloading but I don't remember if it can do WAN.
>>
>>106735476
>max offloading
max offloading is putting everything to your ram, you really did that?
>>
>>
File: What a bigot!.png (1.2 MB, 2093x1487)
1.2 MB
1.2 MB PNG
https://www.reddit.com/r/StableDiffusion/comments/1nt9n03/i_absolutely_assure_you_that_no_honest_person/
Why is reddit filled with transphobic chuds!11!!1! That's not really wholesome chungus at all!!
>>
File: ComfyUI_01454_.png (2.86 MB, 1728x1248)
2.86 MB
2.86 MB PNG
>>
Someone from the dead thread is extra upset today. Or is it the failed alcoholic that wants to be popular, the world may never know. BTW this behavior is why nobody wants to deal with you people, new posters see this shit and old heads get reminded why they left /sdg/
>>
>>106735491
desu I'm glad it looks like shit, that means it'll be used as a good example to prevent other companies from going for LayersMaxxing since it shows that there are already diminishing returns if you go over 20b
>>
>>106735481
Yes, I removed --offload_inactive_dit and added --blocks_to_swap 32, which is the max. I have 128GB of DDR5 system RAM, that's not the issue. I even added --force_v2_1_time_embedding and that did make it last longer before the OOM happened.

I've been training on diffusion-pipe but last time I tried to do a qwen-image LoRA something was broken code-wise, so I switched to musubi and made this successfully: https://huggingface.co/quarterturn/qwen-image-20b-city

I'm up for any tool so long as it works.
>>
File: ComfyUI_01455_.png (3.09 MB, 1728x1248)
3.09 MB
3.09 MB PNG
>>
>>106735527
>--blocks_to_swap 32, which is the max
that's dumb, 32 shouldn't be the max since Wan has more than 32 layers
>>
>>106735515
You have completely lost your marbles.
>>
>>106735515
>has a meltdown on something but doesn't tell us what it is exactly
what is this schizo talking about?
>>
>>106735519
>that means it'll be used as a good example to prevent other companies from going for LayersMaxxing
Surely they'll learn from this fiasco...
>>
>>106735533
I think it has 40 but I also think musubi only allows up to 32 to be offloaded.
Doesn't matter, if it works on 640x360 I'm happy, i2v is about teaching it new motion vectors.
This is the sort of thing where you're better off renting an H100 80GB GPU instance once you get the hang of it.
>>
>>106735021
404
>>
>>106735491
if I was the CEO of Tencent I would fire all the engineers that worked on that model, I don't think people realize how expensive that shit must have been to create, for example, for chroma we have this

>8.9b
>512x512 training
>5 milions images
>50 epochs
>150 000 dollars

now imagine for this:
>80b
>5 billion images
>?? epochs
>1024x1024 training

that probably cost more than a hundred million dollars
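back-of-the-envelope, using only the numbers above and pretending cost scales linearly with params x images seen x pixels (ignoring MoE sparsity, unknown epoch count, hardware pricing, everything else — a toy model, not a real estimate):

chroma_cost = 150_000                        # dollars, as listed above
chroma_work = 8.9e9 * (5e6 * 50) * 512**2    # params * image passes (5M images x 50 epochs) * pixels
hunyuan_work = 80e9 * 5e9 * 1024**2          # 80B params, 5B images, 1 epoch assumed (a guess)

estimate = chroma_cost * hunyuan_work / chroma_work
print(f"${estimate:,.0f}")                   # ~ $108 million under these assumptions

swap in the ~13B active params of the MoE instead of the full 80B and the same toy math drops several-fold, so either way it's a guess, not a real cost model.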
>>
>>106735468
baste
>>
File: kek.png (123 KB, 320x320)
123 KB
123 KB PNG
>>106735585
>if I was the CEO of Tencent I would fire all the engineers that worked on that model
You're too kind. If I were him, I would sue all the engineers. At this level of incompetence, we're probably talking about money laundering.
>>
>>106735585
What stats does Qwen have?
>>
>>106735585
It's likely that this model was the byproduct of something else they figured they could throw out to the open sores community for some free street cred.
>>
File: 1743938070517292.png (145 KB, 1925x646)
145 KB
145 KB PNG
>>106735585
>that probably cost more than a hundred of million of dollars
chatgpt says it has costed billions lmao
>>
>>106735659
Chat GPT says a lot of things.
>>
>>106735442
chatgpt
>>
>>106735659
>labor, dataset
synthslop and jeet/SEA labor saves money
>>
File: 1729122297364416.png (220 KB, 1534x988)
220 KB
220 KB PNG
>>106735659
Claude Sonnet 4 also says it might have cost ~5 billion dollars
>>
>>106735665
I have to ask, are these unironic posts? Do people really ask llms these questions thinking they magically know the answer?
>>
>>106735685
>>106735659
Can it take into account chink conditions and pricing? No way this slop cost 5B. they'd hang them.
>>
>>106735706
>doesn't know what "estimation" means
ngmi
>>
>>106735706
You overestimate the average IQ in this thread.
>>
>>106735710
An estimation is meaningless if the estimator has no idea what it's estimating. In this case, Chat GPT.
>>
>>106735717
>also doesn't know what "estimation" means
you're right, the average is low because of people like you
>>
>>106735732
>if the estimator has no idea what it's estimating
good thing that it can search the internet to get all the information necessary, are you retarded or something?
>>
>>106735708
Tsk tsk, the masterpiece anime 1girl lolicon version is for party leaders only, not you, anon.
>>
>>106735717
>hurdur they're wrong but I don't know why and I'm gonna say it anyway
that's not a sign of intelligence at all, feel free to prove that reasoning wrong >>106735685
>>
>>106735743
It clearly didn't. Look at that number, think about how much money that actually is and realize how ludicrous a figure that is.
>>
>>106735743
Did it give a source for all those numbers?
>>
File: it has sources.png (137 KB, 1834x619)
137 KB
137 KB PNG
>>106735762
>It clearly didn't.
it did, you're talking out of your ass again
>>
>>106735756
Well number 1 tencent doesn't rent their GPUs from the cloud.
>>
>>106734313
catbox? is it chroma?
>>
>>106735771
it's less expensive to buy those gpus instead of renting them? serious question, I don't know much about it
>>
>>106735780
Depends entirely what they are used for, how long you use them and what they do before and after they train the model.
>>
>>106735780
They didn't buy them just for this model, they probably had these already, so the only cost would probably be upkeep and electricity.
>>
File: 00039-1235545371.png (1.23 MB, 1152x896)
1.23 MB
1.23 MB PNG
>>
>>106735803
>the only cost would probably be upkeep and electricity
and the price of those gpus as well, they have a cost
>>
>>106735812
Why? If they bought them already for other projects, they already incurred the cost, not like I re-buy my gpu every time I gen a pic lol
>>
>>106735808
>local begging for wan 2.5
>>
>>106735829
yeah but you had to buy them to use them. I get it, they weren't used only for that specific model, but let's not pretend that training a model is 0 dollars (we pretend we didn't buy the gpus at all) + the cost of electricity, that's retarded
>>
>>106735812
This argument goes back to when deepseek was made. Do you include the total cost of the GPUs they own in the price of the model, or only the compute they actually used?

Also if I recall correctly, the price tag on DeepSeek was about $5 million and it has 671 billion parameters. So by chat GPT's estimate, it would have cost deepseek around 41 billion dollars to train R1.
Now, idk about you, I'm not good at math, but that's a fair bit more than 5 million.
>>
>>106735831
kek
>>
>>106735838
>Now, idk about you, I'm not good at math
yep, you aren't, since we're not talking about LLMs but about diffusion models, that's something completly different
>>
>>106735844
Actually Hunyuan 3.0 has more in common with LLMs than it does with regular diffusion models so that's also not true.
>>
>>106735838
>the price tag on DeepSeek was about $5 million dollars and it has 671 billion parameters
that's a lie, the 5 million dollar tag was to transform deepseek V3 into deepseek R1, the finetune cost 5 million, but they omitted that they first had to pretrain deepseek to get deepseek V3
>>
gib wan 2.5 nao
>>
>>106735833
Why? that's exactly why people buy the gpus lol. Dude don't be an idiot and use llms for shit like this, use your own fucking brain lmao.
>>
>>106735860
Yeah true. Still wasn't billions of dollars though. Get that retarded idea out of your head.
>>
>>106735872
>Why?
because that's fucking retarded, the only question should be "how much money do you need to spend to be able to finetune a model", and if you don't have enough money to buy those gpus it's already game over
>use your own fucking brain lmao.
how about you use your brain, you need the money to buy those gpus in the first place retard
>>
>>106735881
There is no scenario on earth where tencent bought those GPUs specifically for the purpose of training that model and nothing else. I'm not sure how you factor the price of those GPUs into the budget of the model with that in mind, but it is equally retarded to factor them in at the full price of whatever one of their datacenters they used.
>>
>>106735892
>There is no scenario on earth where tencent bought those GPUs specifically for the purpose of training that model and nothing else.
that doesn't matter, they had to pay for those GPUs anyway, pretending that because they were used for something else they might as well not exist is one of the most retarded things I've seen on 4chan, wtf dude
>>
>>106735892
Don't bother, the dude replaced his brain with an llm and is now trying to make excuses with it lmao. Let this be a lesson kids, llm with care or it apparently takes over.
>>
>>106735899
Yeah I'm done. I was wrong to engage in good faith.
>>
>>106735899
>>106735906
>and remember guys, if you buy a gpu to train 2 models, you can say that the cost of training one of those two models is 0 dollars (we pretend we didn't buy the gpu) + only the cost of electricity, what a bargain
why are there so many low IQ retards in this place?
>>
>>106735926
accept his concession
>>
>bypass color match node
>workflow just stops working entirely
>>
File: 1756401296382255.png (360 KB, 1342x1617)
360 KB
360 KB PNG
>>106735855
>Actually Hunyuan 3.0 has more in common with LLMs than it does with regular diffusion models so that's also not true.
he's right, it's probably less expensive to train an autoregressive model; with that in mind ChatGPT says it cost 12 million max
>>
>>106735934
Just don't engage him. He will go back to his llm when we stop providing ample stimulation.
>>
>>106735946
>>106735934
>We are right after all, a GPU costs $0, so when you train a model, you don't need to factor in the cost of the GPU you purchased.
Fascinating.
>>
>>106735944
since HunyuanImage is a MoE with 13b active parameters, it means that if you're able to put everything on your gpu memory (good luck with that though), the speed is equivalent to the speed of a 13b model right?
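(rough ballpark, and only if the usual MoE rule of thumb holds here — memory scales with total params, per-step compute with active params:)

total_params, active_params = 80e9, 13e9   # figures quoted in this thread for HunyuanImage 3.0
for name, bytes_per_weight in [("bf16", 2), ("fp8", 1), ("~4-bit", 0.5)]:
    print(f"{name}: ~{total_params * bytes_per_weight / 1e9:.0f} GB just to hold the weights")
# -> ~160 GB, ~80 GB, ~40 GB: all 80B params have to live somewhere even though only ~13B are
#    active per token, so speed can look 13B-ish but the memory footprint never does.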
>>
fresh
>>106736034
>>106736034
>>106736034
>>106736034



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.