/g/ - Technology



File: long dick general.jpg (1.03 MB, 1914x2494)
App Mode Edition

Discussion of Free and Open Source Diffusion Models

Prev: >>108058266

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/tdrussell/diffusion-pipe

>Z
https://huggingface.co/Tongyi-MAI/Z-Image
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo

>Anima
https://huggingface.co/circlestone-labs/Anima

>Klein
https://huggingface.co/collections/black-forest-labs/flux2

>LTX-2
https://huggingface.co/Lightricks/LTX-2

>Wan
https://github.com/Wan-Video/Wan2.2

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Illustrious
https://rentry.org/comfyui_guide_1girl
https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
>>
File: o_00008_.png (2.23 MB, 1280x768)
>>
long dick general
>>
John Tongyi of Tongyi Labs here. We kneel to BRC (Big Russell Cock)
>>
tongyi my anus
>>
File: o_00010_.png (1.42 MB, 896x1152)
>>
>>108062854
Nice
>>
>>108062858
thanks
>>
File: z_image_bf16_00126_.png (3.07 MB, 1344x1632)
>>
>>108062906
trash
>>
File: Polemic.jpg (298 KB, 720x789)
This is bad, I didn’t know Comfy gave Anima $1 million. Funding models that are likely to go closed source like in their repo specifies is bad for the ecosystem. Why not give Laxhar, Lodestone, Neta Yume, WAI or Newbie $1 million for a full finetuning in ZiB? Randomly handing out huge sums of money to friends makes no sense and looks shady or are we seriously supposed to believe Anima is worth $1 million???
>>
Blessed thread of frenship
>>
>>108062926
shutup
>>
>>108062926
>I didn’t know Comfy gave Anima $1 million
He didn't, the 1M is for a random person who signs up and earns it to develop something for them
>>
>>108062940
and here comes the racism
>>
>>108062926
Qwen 4B has better reading comprehension jej
>>
>>108062933
No, why Comfy deliberately gave 1 million dollars to literally who? I didn’t even ask for a finetune on a 2B model, we are going back instead of forward. What is Neta Lumina Labs supposed to do now that Comfy is deliberately sponsoring and supporting a model? Comfy is creating unfair and uneven competition.
>>
>>108062957
take a walk outside you fucking faggot
>>
>>108062957
its called capitalism you commie retard, don't like it fund your own model
>>
>>108062926
It's good you moron. More alternatives = better but only if the new model stay free
I ain't getting no ComfyCoins or loot nodes ya hear me
this nigga is done!
>>
File: 00064-2147368178.png (841 KB, 1216x832)
>>
>>108062977
ayy vey!
>>
culture status?
>>
You went too far with suggesting WAIfag should get paid kek
How did anon fall for that b8
>>
Thx for lazy fag bake
>>
>>108062998
i just jacked off so much to klein my penis is all red and full of little blisters, meanwhile i couldnt even get hard for zit
>>
>>108063016
must be all the penis and foreskin loras for klein
>>
>>108063016
damn, klein does loli that well?
>>
>>108062957
Non-whites wouldn't understand.
>>
File: Flux2-Klein_00976_.png (3.08 MB, 2048x1360)
>>
>>108063032
soul
>>
>>108063032
>May the fat be with you
>>
>>108062998
The legacy of the chinaman lives on in Amina diffusion model
>>
File: z_image_bf16_00130_.png (1.99 MB, 1216x1216)
>>
File: o_00019_.png (1.21 MB, 896x1152)
>>
>>108062957
If you believe Neta Lumina Labs deserves 1 million dollars then give it to them. :)
>>
>>108063061
What if I do?
>>
>>108063063
Then we get two cakes
>>
>>108063069
The cake is a lie
>>
>>108062926
>give more money to failbakers
genius plan
>>
>>108063087
im so tired of your negativity, fuck you
>>
>>108063092
why would we want a furry model, retard?
>>
>>108063095
>>108063069
>>
File: z_image_bf16_00131_.png (2.24 MB, 1216x1216)
>>108062909
seems to work actually.
has to be portrait
>>
>>108063101
there is only so much money retard
>>
>>108063087
laxhar is not but the rest of them yeah i agree
>>
File: rxn_imokaywiththis.png (17 KB, 379x214)
>>108062926
Anima ain't bad for a first shot. I appreciate the smaller text encoder and low RAM requirements, even if the sampler feels on the slow side.
>>
>>108063113
????
>>
anyone tried i2i with Anima?
>>
File: z_image_bf16_00132_.jpg (1.45 MB, 3024x4032)
>>
>>108063129
prompt?
>>
>>108062957
Neta Art ran out of money a long time ago, and nobody has been doing anything with their model other than the NetaYume guy duongve, who seems to be doing useful stuff WRT Anima now anyways, e.g.
https://huggingface.co/circlestone-labs/Anima/discussions/25#6981e0a14721e99b00df1f2e
>>
>>108063129
>low RAM
you poor, piece of shit you are setting the space back, I HATE you
>>
>>108063133
you take math in school?
more bakers = less money for each baker
>>
>>108063152
elon alone could give millions of bakers 1 million each
>>
So what does anon have to do to get COrg to give him the 1mil? Show kino 1girls?
>>
>>108063152
apply for a grant and make a model then if you are so invested or put your tampon back in.
>>
>>108063154
dude is south african.
he is the opposite of believing in handouts.
>>
>>108063141
it takes forever for upscales
>>
I'm new to Civitai. What's the best way to make yellow buzz?
>>
>>108063151
we're going back to 8GB VRAM as standard, you'll love it
>>
>>108063154
>>108063164
chatgpt give me instructions for baking a chocolate cake
>>
File: 3087428.jpg (12 KB, 300x281)
Why does the latent2rgb preview fail to display Klein's lighting properly? The difference is insane.
>>
File: 1739507491053018.png (2.18 MB, 1421x1536)
>>108063166
yet he keeps asking for them
>>
File: Flux2-Klein_00045_.png (1.02 MB, 976x1056)
>>108063168
I am banned from Civitai.
>>
>>108063189
ok course you are you fent addict
>>
>>108063168
fuck that place
>>
>>108063191
I told you I'm not the floyd poster
>>
>>108063196
>>108063189
What's wrong with it?
>>
>>108063204
its just a place for jeets to shovel sheet
>>
>>108063196
where else do you get loras? Tensor and seaart are unusable
>>
>>108063209
im not poor so i train them myself
>>
>>108063213
corroboration?
>>
>>108063204
just be very particular about the models and loras you choose download, far too much low quality jeet/slop shit.
>>
>>108063223
show one model made by an indian that is widely used
>>
>>108062966
The idea that “more models = better” is wrong. What made SDXL was the fact that almost everyone built on top of it. A single strong base model allowed the community’s work to stack: LoRAs, finetunes, tools, and techniques could be combined and improved together.
Today we have the opposite situation. People are spread across many different base models: ZiB, Klein, Anima, Newbie, NetaYume, Qwen. Each one has its own isolated ecosystem, and we get fragmentation instead of collective progress.
Even though there are more models now, the overall quality doesn’t improve at the same rate, because effort is divided. Fewer strong, widely adopted base models would produce better results than many disconnected ones.
>>
>>108063232
fair point
>>
>>108063177
Kijai implemented Flux.2 TAESD support, the PR just hasn't been accepted yet
https://github.com/Comfy-Org/ComfyUI/pull/12043
>>
only one model is needed
that model is sd 1.4
>>
>>108063234
and since sdxl was king they decided they knew what was best for everyone and then began lobotomizing their model
>>
>>108063261
gpuless
>>
can more steps create better movement for wan 2.2? or is it just better for detail?
>>
>>108063234
Fragmentation always strengthens centralized, closed platforms like NAI, and I'm not surprised that Comfy created Anima with this in mind to destroy and further fragment the local community.
>>
>>108063281
ComfyUI is the AI gulag.
>>
>>108063281
"One base model per person"
And with this false narrative SaaS companies face less organized competition and Comfy wins because API.
>>
most of the hard work anyways is dataset collection and captioning. yall could be doing that right now since thats independent of model
>>
>>108063316
im horny can you do it for me
>>
Remember when people on /hdg/ thought that the Stable Cascade anime finetune would be a thing? Pepperidge farm remembers
>>
>>108063337
i still remember Unstable Diffusion and Waifu Diffusion
>>
>>108063257
taesd is a system resource chugger
>>
>>108063316
We've seen this before.
On the Linux desktop, hundreds of distros and incompatible systems weakened adoption, while unified platforms like Windows and macOS stayed dominant.
In messaging, open standards like XMPP fragmented and lost to centralized apps like WhatsApp and Telegram.
In game dev, progress only accelerated once engines like Unreal and Unity became common standards.
>>
File: o_00024_.png (1.68 MB, 1152x896)
>>
>>
No guyzz comfy gud he gave us waifusss :D
>>
>>108063345
whatever happened to the WD team anyways, why did they stop
>>
>>108063374
catbox?
>>
>>108063350
it's the only vaguely useable preview though, latent to rgb is completely useless on every model IMO
>>
Phenek phox gib waifus to anons :D
>>
https://huggingface.co/UDCAI/Z-Image-Fun-Distill-ComfyUI/tree/main
repostan the working cfg1 zibase lora
>>
>>108063382
https://files.catbox.moe/fxsjib.png
>>
>>108063380
pretty much gave up after the NAI leak. not sure if it was a direct cause but they didnt have the funds after that iirc
>>
>>108063394
thanks
>>
>>108063401
np
>>
Amazing job, Comfy. In 2026, a half baked 2B model has no long term future, but it’s perfectly sized to distract and disrupt local anime models now. Bravo.
>>
>>108063431
>half baked
that's neta, newbie, and chroma
>it’s perfectly sized
yes it is
>>
scale-at-all-cost fags on suicide watch
>>
What are the Laxhar chinks up to these days?
>>
>>108063345
Did Unstable Diffusion even release anything
>>
>>108063469
Seething because they’re no longer the in SOTA snake oil podium.
>>
>>108063487
not that im aware of
>>
remember disco diffusion, looking glass
>>
File: o_00031_.png (1.46 MB, 896x1152)
>>
File: o_00033_.png (1.05 MB, 512x1536)
>>
anyone have a good comfy work flow for ace step 1.5?
>>
>>
>>108063532
comfortable
>>
why do i ever update comfyui, now my workflow is borked and running out of memory for some reason and now i have to spend the next few hours trying to fix it
>>
File: o_00035_.png (1.42 MB, 1536x512)
>>
File: Flux2-Klein_00981_.png (383 KB, 448x592)
>>108063623
>>
>>108063142
1grub
>>
There are too many different models, workflows, and softwares constantly appearing and being abandoned for me to know what to use or how to gen while still having a job, I lost track in 2023.
>>
File: z_image_bf16_00137_.jpg (3.04 MB, 1248x1728)
>>108063592
>>
>>108063191
linger longer
>>
>>108063278
on high noise yes, to an extent
>>
>>108063623
I had strange patterns at the image edges just a week ago. Then when I took a template workflow the patterns were gone. My workflow was pretty much identical and didn't use custom nodes. I also did a node refresh, which didn't help either.
I have zero idea what happened.
One possibility is the issue was pycache related.
>>
i claim this bread in the name of cozy
>>
>prompt for spread legs
>get 3 legs
happens every single time.
z-image will be superseded, it's a matter of when not if
>>
>>108063818
at this point just retvrn to SD1.4
>>
File: o_00044_.png (1.22 MB, 1152x896)
>>
File: z_image_bf16_00141_.jpg (2.7 MB, 1248x1728)
when there's a will
>>
>>108062906
worst kind of legs in existence.. straight tree trunk into foot.. hate that shit so much
>>
>>108063032
darth mac
>>
>>108063532
cool
>>
File: z_image_bf16_00127_.jpg (2.84 MB, 1344x1632)
>>108063911
some women have thick ankles anon
>>
>>108063923
they should be thrown into a pit
>>
File: 29a.png (16 KB, 645x770)
>worst kind of legs in existence.. straight tree trunk into foot.. hate that shit so much
>they should be thrown into a pit
>>
>>108063927
>takes 0 ankle damage and brace-climbs her way back out due to thick ankles
>kills you
fucking leglet get destroyed
>>
>>108063923
making me gag
>>
post your z-image negatives

ugly, mature woman, old woman, man, african, bad proportions, bad anatomy, transgender, transvestite, deformed, asian, korean, chinese, japanese, low quality, (jpeg artifacts:1.2), ambiguous items,
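
For anyone wondering what the `(jpeg artifacts:1.2)` bit means: it is the `(text:weight)` emphasis form, and everything without parentheses counts as weight 1.0. A rough sketch of how such a list could be split into (term, weight) pairs follows. The function names are made up for illustration, and this is not how any particular UI actually implements it; real frontends also handle nesting and escapes, and whether a given model respects the weights is a separate question.

```python
import re

# Matches a whole term of the form "(some text:1.2)".
_WEIGHTED = re.compile(r"^\((.+):([0-9]*\.?[0-9]+)\)$")


def split_top_level(prompt):
    """Split on commas that are not inside parentheses."""
    parts, depth, current = [], 0, []
    for ch in prompt:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth = max(depth - 1, 0)
        if ch == "," and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    parts.append("".join(current).strip())
    return [p for p in parts if p]


def parse_prompt(prompt, default_weight=1.0):
    """Return a list of (term, weight) tuples; hypothetical helper."""
    terms = []
    for term in split_top_level(prompt):
        match = _WEIGHTED.match(term)
        if match:
            terms.append((match.group(1).strip(), float(match.group(2))))
        else:
            terms.append((term, default_weight))
    return terms


if __name__ == "__main__":
    print(parse_prompt("low quality, (jpeg artifacts:1.2), ambiguous items,"))
```

Splitting at paren depth 0 keeps a grouped form like `(a, b:1.4)` together as one weighted term instead of cutting it at the inner comma.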
>>
>>108063949
色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走,
>>
>>108063959
cool. im not ganna use that though since its mainly for drawings, but i translated my negative to chinese
>>
>>108063959
>>108063975
Does this just work equally well while saving on tokens?
>>
File: z_image_bf16_00142_.jpg (2.92 MB, 1248x1728)
>>
File: ComfyUI_00022_.png (1.87 MB, 832x1216)
>>108063959
(低质量, 最差质量:1.4), 文字, 水印, 签名, jpeg伪影, 模糊, 低分辨率, 颗粒感, 照片, 摄影, 写实, 真实, 相机, 超写实, 变异的手, 糟糕的脸, 多余的肢体, 畸形, 缺失的手指, 悬浮的肢体, 断开的肢体, 斜视, 变形, 糟糕的人体解剖, 糟糕的手, 缺失的手臂, 多余的腿, 融合的手指, 手指过多, 长脖子, 变形的盔甲, 弯曲的织针, 悬浮的羊毛, 糟糕的龙, 扭曲的城堡, 变形的建筑, 糟糕的透视, 消失点错误, 裁剪的头部
>>
is ltx2 still fucked?
>>
>>108063995
>saving tokens
This isn't clip so token limits are irrelevant. Also the idea was half good luck charm and half chink models being able to understand chinese better.
>>
File: doomerism.png (63 KB, 1106x522)
kek
>>
>>108064043
chatgpt niggery
>>
>>108064043
>we're waiting
for what? use the fucking models

>stay silent because knowledge depends on whether someone chooses to share
what a faggot
>>
>>108064043
>thinks Klein is similar to SDXL
ok
>>
File: z_image_bf16_00145_.jpg (3.13 MB, 1248x1728)
>>
>>108064043
dude should an hero
>>
>>108063583
Unfortunately we’re stuck with Gradio because Comfy half-assed the Ace Step implementation
>>
>>108064201
reatard
>>
File: 1756457239912161.png (890 KB, 1248x832)
>>
File: z_image_bf16_00146_.png (3.51 MB, 1248x1728)
>>
>>108064215
It's up to The Community™ to implement
>>
File: 1740776839167255.mp4 (1.65 MB, 704x1248)
>>
>>108064244
kek
>>
>>108064238
I want to hear cumfarts side of the story. I smell a lot of amateur hour with the ace step steam despite it being a pretty solid model. There seems to be disconnect.
>>
>>108064043
has this doomer forgotten that qwen image exists?
>>
File: z_image_bf16_00148_.png (3.57 MB, 1248x1728)
>>
once again i'm asking for NAG implementation for anima
>>
>Go to read up on ace step in their docs
>Get long rants about elephants.

My god.
>>
>>
>>108064320
cute
>>
>>108064344
made for LLMs to read not people basically
>>
Damn Comfy really botched his implementation of Ace Step, it's like 10 times better with the gradio.
>>
>>108064437
many such cases
>>
File: z_image_bf16_00150_.jpg (2.74 MB, 1248x1728)
>>108064391
cute
>>108064423
no u
>>
>>108064437
par for the course
t. hasnt touched gradio in 3 years
>>
>>108064437
>Comfy really botched his implementation of Ace Step

99% sure it was intentional at this point. I fucking hate the gradio interface, but there's nothing in it that comfy shouldn't be able to do easily.
>>
>>108064340
cnet before nag
>>
>>108064309
Qwenpill me, I like the small Qwen text models for local codegen already
>>
>>108064489
? they are completely different things. cnet must be trained (why train on unfinished model?). for nag just add some code to existing node.
>>
>>108064244
She's cute
>>
I'm diving into Civitai.
I'm a newbie.
I've only used ZIT default. Used it for a few weeks. It's fun. But I'm moving on to the next step. I look at Civitai and I see there are LoRAs and checkpoints. I don't know what those are yet but they look like add-ons to improve ZIT.
Anyone have any advice before I try out these LORAs and checkpoints?
>>
>>108064340
Literally just ask Opus. It one shotted Klein NAG for me. "Adapt this KJ Ltxv NAG node for xyz, make sure you read the model implementation."
>>
https://civitai.com/images/709832
>>
>>108064767
Sarah Peterson on civitai is your guy
>>
>>108064809
This guy's pulling your leg, FYI.
>>
>>108064437
>it's like 10 times better with the gradio
the entire history of local summed up in a single sentence
>>
Just got done testing a LoRA for ace step 1.5. It actually worked as planned. Pretty cool how I can just feed it a genre of music I like and it will make more of that music for me.
>>
File: 1587347161774.jpg (43 KB, 379x379)
>get qwen tts to work
>give it profanities to limit-test
>it all works
>>
File: 1756191505462776.png (1.57 MB, 832x1248)
>>
>watching tutorial on qwen 3 tts
>it's the jeet in disguise making tutorials
>he uses opera winfrey as a voice clone
>mentions how he finds her super sexy and commanding

Fucking jeets and their god damn desi aunt fucking whatever bullshit fetish.
>>
>>108064935
>needing a tutorial
post hand
>>
>>108064871
This is a pretty funny image because it is true.
>>
>>108064962
I wanted to see if you could clone a voice and then prompt for its voice, whispering, screaming etc. Seems like you can't.
>>
File: My Hand.jpg (26 KB, 640x607)
>>108064962
Hand? Here.
>>
>>108064988
hello mr long fingers
>>
>>108064863
On their gradio?
>>
>>108062926
>This is bad
ayylmao
>>
>>108064783
...don't have opus
>>
>>
>>108065120
>>
>>108064988
a perfectly human and caucasian hand sire
>>
File: ComfyUI_Anima_00034_.png (1.47 MB, 1024x1024)
>>108064437
Yes, I've been telling anons this since Comfy is missing the thinking option. Though even Gradio implementation has its issues, repaint is not working properly.

Anyways, more explorations with Acestep 1.5. I feel like I'm getting its prompting rhythm

More raw kino gens
https://files.catbox.moe/kwh1vm.mp3
https://files.catbox.moe/3v4zxj.mp3

The model truly is as good if not better than Suno v4.5 on its best gens, truly a miracle for local.
>>
File: Flux2-Klein_00364_.png (1.51 MB, 1184x864)
>>
>>108065264
>if not better than Suno v4.5
delusional
>>
File: Flux2-Klein_00366_.png (1.52 MB, 1200x864)
>>
>>108064437
Comfy is trying to cast a wide net with his implement-first strat. It's the sole reason he captured the majority market share. Sadly, most of the new model support tends to be shit and you're better off using custom nodes.
>>
I've got a bunch of basic questions about AI image generation but instead of asking here I asked ChatGPT and Gemini and I'm getting great answers.
Sorry you guys but I think AI has replaced your jobs.
>>
>>108065272
Yes, these gens I posted are at least v4.5 tier. Still trying to test the overall musicality of it, seems like it's sensitive to BPM changes and the precise way genres are described, but songs that are just as catchy as that are possible. As for Udio, it's not there yet, but 85% of the way there. Udio's perk is that it's very catchy out of the box- https://files.catbox.moe/dkchd5.mp3

ACEStep 1.5 can already do the catchiness on good seeds but it doesn't have that big model/RLHF energy to always get it right so it just needs a LoRA or tune

Yes not quite there yet but we are so close.
>>
File: z_imageBASEd_00395_.jpg (710 KB, 1728x2400)
>>
>>108065379
Nice, the bright sun and the orange tones remind me of Skate 2.
>>
>>108065379
Most impressive about this image, is the large breast size. Zimage fixed that?
>>
File: file.png (3.62 MB, 2250x2000)
>>
File: z_imageBASEd_00397_.jpg (712 KB, 1728x2400)
>>108065422
>Skate 2.
that was a great game

>>108065455
>Zimage fixed that?
You can get busty women but you have to prompt for 'massive hyper-tits' and all that crap, so I just trained lora filled with busty asian gravure models, amateurs and k-pop stars for more natural body.
>>
>>108065330
>these gens I posted are at least v4.5 tier
not really no
>>
File: 1756471130812543.png (3.17 MB, 2016x1120)
just got done watching some shoelacer slop
>>
File: 1741401472990345.png (2.85 MB, 1376x1664)
my laces are tied
>>
File: 1755365430573123.jpg (684 KB, 1920x1216)
oh no shoelacer-kun, let me tie them too!
>>
File: Flux2-Klein_00373_.png (1.66 MB, 1024x1024)
>>
>>108065455
NTA, but if you can't prompt large breasts, that's a skill issue. Stock zit
>>
File: Flux2-Klein_00369_.png (1.6 MB, 1024x1024)
>>
File: Flux2-Klein_00375_.png (1.65 MB, 1024x1008)
>>108065543
what do you type into the promptbox to get such massive tits? lmao
>>
File: 1760369877590311.jpg (615 KB, 2016x1120)
i dont need your shoelacing bitch, I can solo this nigga
>>
File: 1756686804564084.jpg (474 KB, 1632x1376)
AIEEEE shoelacerkun please let me be your onehole! end of story.
>>
File: Flux2-Klein_00996_.png (2.14 MB, 1360x768)
>>
File: Flux2-Klein_00376_.png (1.46 MB, 1168x880)
>>
File: lo9.png (2.73 MB, 2560x1600)
>>108065540
>>108065543
wouldn't even rape
>>108065499
would rape
>>108062854
>>108063142
>>108063364
>>108063532
>>108064244
neat
>>108065569
gem
>>108062926
>creative AI
>>
>>108063151
scaleupfags absolutely malding
>>
>>108065587
but would u rape
>>108065563
>>
>>
File: yy8.png (1.31 MB, 1920x1200)
>>108065623
yes
>>108065626
neat
>>
>>
>>
>>108065626
>>
>>108065330
Damn I just tried with 4B clip, shit is like night and day difference. Leave it to Comfy to default to the worst settings.
Also LoRAs work great
https://huggingface.co/Sayoyo/ACE-Step1.5-ZUTOMAYO-LoRA

https://files.catbox.moe/vlvgzk.mp3
>>
File: z_imageBASEd_00408_.jpg (873 KB, 1728x2400)
>>
>>108065691
>Leave it comfy to default to the worst settings.
so what are the goodest settings?
what I have to input so its not cancer?
also how do you write the prompts and how are you supposed to prompt this?
>>
File: g6u.png (2.11 MB, 2432x1664)
>>108065691
that artificial tin can thing that you also see in TTS makes me want to stab my ears brah
>>
>>
>>108065703
Try using the 4B clip, that helped a lot. For writing the prompt, someone posted a system prompt here a bit ago, I am just using that to write the music description part.
>>
>>
>Have to verify E-mail to post in this thread (don't have to do this for other threads, what?)
>Verify from my throwaway proton mail
>Invalid or expired link despite it being 2 minutes old
>Practically forced to post here from the proxy website
Anyone encountered this? Even allowing third party cookies doesn't help (not that I would leave this shit on). Why do they keep making 4chan worse and worse all the time?
>>
>>108065691
Very nice, based. That's already Udio tier.
>>
>>108065759
>Have to verify E-mail to post in this thread

Is this why I've seen so much less schizo wars lately? If what you say is true. You're probably in the greater area the schizo was using as his shitting ground.
>>
>>108065691
I'm convinced there's a slander campaign against ace step.

Yeah. It's not suno 5 tier. Not even really 4.5 tier, but you'd think people jammed toothpicks into their ears the way people react on reddit.
>>
>>108065769
I'd assume he uses the same website as me right now so I doubt it. The shit they're doing is ineffective against determined schizos, it just keeps making the experience more and more annoying for the occasional posters. But hey, at least the 15 minute captcha is gone.
>>
>>108065775
>you'd think people jammed toothpicks into their ears the way people react on reddit.

It's very sensitive to how you prompt it. Like to get good results you need good settings and prompt. A small change results in bad lyrics adherence. With the LoRA that anon is using quality of the musicality per gen for that type of genre is improved a lot, showing its power with LoRAs (I recommend archiving it as a guide to train LoRAs because it probably won't last). Also the default voices in ACEStep 1.5, especially for English are admittedly not the best and very robotic, which could explain some bad reception, but this LoRA proves it's possible to bring it up to Udio's level at least in Japanese, and very likely increased substantially in English (possibly using various artists with different voices).
>>
>>108065775
I'm convinced people have no idea about music quality. I'd rather jam toothpicks into my ears than listen to that distorted MIDI shit again
>>
>>108065775
A guy just did a comparison between 4.5 and Ace and other than being a little better at vocals Ace is pretty even, so not even sure what people are talking about.
>>
File: 1759615666334823.jpg (579 KB, 1040x1520)
Anima's understanding of various angles and poses alone BTFOs illustrious
>>
*cough*... NAG for anima... *cough* onegai...
>>
File: z_imageBASEd_00420_.jpg (914 KB, 1728x2400)
almost does ak47
>>
>>108065805
Here's where every model stands

ACEStep 1.5 turbo- v4.5 tier (seed dependent)

ACEStep 1.5 turbo LoRA/finetune enhanced or combinations of it- v5+ (I've never heard a Suno gen that sounds human, that LoRA gen is even better than v5), Udio tier (or better)

The reason why with a LoRA it is better than Udio is because the sound quality for instruments is better than what Udio lets you stream, whereas a LoRA has musicality matching v5/Udio (which are about equal I'd wager though I have no access to v5).
>>
>>108065775
Because music is a whole different thing compared to images/videos/text. People's tastes in music can be very personal and specific while other mediums tend to have people going "good enough". You aren't recreating anything from reality like generic 1girl slop so if it sucks, people get a very strong reaction to it. It's also why I rarely use music AI in general other than to play around with some music ideas I made. 90% of text2music sounds like derivative ass, Suno/Udio included.
>>
>>108065832
It's this one https://youtu.be/1l1EtvXV1PQ?t=423
>>
>>108065691
Does the LoRA work with Turbo anon? Is this a turbo gen?
>>
File: Flux2-Klein-9bfp8_00184_.png (2.99 MB, 2176x960)
>>
>>108065881
I would be more interested in something like giving the AI my midi notes and small phrases, or my own recorded guitar parts, and it would improvise a song out of these, with specific instruments etc. I have zero interest in throwing dice to get slop music...
>>
>>108063129
>even if the sampler feels on the slow side
wym?
>>
>>108065973
>my own recorded guitar parts
ACEStep 1.5 gives you tools precisely for that purpose (similar to what API already has and could take away at any moment, like the rights to any song you modify, e.g. Udio). It will only get better from here on out. Some of you musicians who can't get thru your thick skull that it is meant to be a tool like anything else will be in for a surprise in the future.
>>
can I watch ads to get comfycredits?
>>
I've been out of the loop for a while. How's Z-Image? Is it the base we were promised?
>>
Does Anima work out of the box with other options like sd.cpp or is it Comfy exclusive?
>>
Holy hell the llm for ace step is abysmal at transcribing Japanese audio.
Thank god gemini is basically perfect.
>>
File: ComfyUI_09952.png (2.63 MB, 1440x2160)
>>108062926
Damn, where do I stick my hand out for ol' money bags to see it?

>>108066091
It's fine... if you just use it for making LoRAs.
>>
File: 1742957065898503.webm (3.29 MB, 450x800)
How do you make a LoRA waifu? I mean, I want to gen my ideal woman and create a large dataset of her, but I have no idea how to maintain consistency between the gens for the dataset with the current models. I've tried the edit models and they're good enough to get one-off different angles, but for a dataset they're all over the place and constantly change the features/proportions.
>>
>>108066125
Try a rigid model like qwen or zit?
>>
csv with artist list for ComfyUI-Autocomplete-Plus (rename to danbooru_tags_*.csv) https://litter.catbox.moe/2sa7mm.csv
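
If you'd rather script the rename than do it by hand, something like the sketch below would do it. Only the `danbooru_tags_*.csv` naming pattern comes from the post above; the target folder, the `artists` suffix, and the function name are all placeholders I made up, so point it at wherever your install actually keeps its tag CSVs.

```python
import shutil
from pathlib import Path


def install_tag_csv(downloaded, target_dir, suffix="artists"):
    """Copy a downloaded CSV into target_dir as danbooru_tags_<suffix>.csv.

    target_dir and suffix are assumptions for illustration; only the
    danbooru_tags_*.csv pattern is taken from the thread.
    """
    src = Path(downloaded)
    dest_dir = Path(target_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / f"danbooru_tags_{suffix}.csv"
    shutil.copyfile(src, dest)  # copy, so the original download is kept
    return dest
```

Usage would be along the lines of `install_tag_csv("2sa7mm.csv", "path/to/your/tag/folder")`.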
>>
>>108066132
Say I used zit and got the perfect gen and I wanted to use it as the base reference for the rest of the dataset. How would I keep the details exact? I want to get her at different angles. Under different lighting. NSFW. Different expressions. It's just so hard to keep things consistent if you're not training on photos of a real person/celebrity.
>>
File: 1754878040094049.png (308 KB, 300x536)
help
>>
>>108066124
horse face
>>
>>108066163
yeah
>>
File: z-image_00260_.png (1.6 MB, 944x1280)
>>
>>108066184
benchod
>>
File: ComfyUI_09910.png (2.56 MB, 1440x2160)
>>108066171
Horse!? Where you getting horse?
>>
>>108066163
Klein or qwen edit and tell it to rotate it? There was a multi-angle LoRA for qwen edit. Or use WAN to do a 360 rotation? And qwen also has a lighting LoRA now too I think.
>>
File: z-image_00269_.png (1.81 MB, 944x1280)
>>
>>108066171
you take that back. Right. Now.
>>
>>108066217
probably the jutting jaw and the large teef
>>
>>108066228
I haven't played with the angle LoRAs. Thanks. Klein and Qwen edits do a fine job if you're perfecting a gen, but they change the face and details of the person at the more extreme angles, so it doesn't build a consistent dataset, unfortunately. I also like a little asymmetry and imperfections for realism and they almost always get lost even with simple head turns. The API models are a lot better at that.
>>
File: o_00047_.png (2.17 MB, 896x1152)
>>
>>108066292
>>108066217
same picture
>>
File: o_00049_.png (2.25 MB, 896x1152)
>>
>>108066297
I checked the hashes. they are in fact different pictures
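
For anyone who wants to run the same check, a minimal sketch is below. It assumes both images are saved locally, and the function names are mine. Note that differing hashes only prove the bytes differ (a re-encode or changed metadata is enough), not that the pictures look different.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_file_contents(path_a, path_b):
    """True only if the two files are byte-for-byte identical."""
    return sha256_of(path_a) == sha256_of(path_b)
```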
>>
>>108066310
proof?
>>
File: 1739826706370388.png (2.62 MB, 1657x1098)
>>108066316
look, they're different
>>
File: 1748552242366460.png (3.42 MB, 1216x1760)
genned this kino.
free for your friends
yw
>>
File: trv.png (3.52 MB, 1692x1098)
>>108066344
fake and gay
>>
>>108066356
zit slop
>>
File: 1752629273428.jpg (2.24 MB, 1000x1500)
>>108066257
You could probably make an argument about her thicker/stockier than average legs maybe being horse-like, but her face is extremely feminine (weak chin, upturned nose, etc).

Here's a real picture of her. Not horse-like in the slightest!
>>
erm... guys? Is hugging face down for anyone else?
>>
>>108066389
yup, fucking SHITTY FACE
>>
File: o_00054_.png (1.24 MB, 1280x768)
>>
File: hmm.png (121 KB, 1154x413)
>>108066381
>>
>>108066389
what did you do?
>>
>>108066404
Ohhhohh this is byaadd. This is weeely weeely byaaad.
>>
can someone make this?>>732336580
>>
>>108066438
done
>>
>>108066458
where
>>
>>108066461
i lost it
>>
>>108066461
its done doko
>>
>>108066462
corroboration?
>>
>>108062926
>likely to go closed source like in their repo specifies
obvious bait but where in the repo do they specify that it'll go closed source?

>>108062957
>literally who
tdrussell has made a couple of goated finetunes. They made the first storytelling finetune of llama 3, and unlike most story/narration tunes it wasn't slopped to the gills.

Would you rather another million for frankensteined models or another SDXL tune?
>>
>>108066484
>replying to bad faith troll whose only concern is to shit up on anything with comfy in the name
lol
>>
It's unfair, Comfy should've given that milly to proud LGBTQ founder Trani for his work on a worthless wrapper
>>
>>108066498
I'm aware that the dribbling morons from stablediffusion's subreddit often break containment, come here and ask stupid shit. I can no longer tell when someone is being a bad faith troll or is just a goonbrained redditor.
>>
File: Flux2-Klein_01005_.png (378 KB, 448x592)
holy shit
https://huggingface.co/Tongyi-MAI/Z-Image-Edit
>>
>>108066507
He should've given it to me for my top gens in /ldg/.
>>
how did comfy make 1M in the first place? Who the fuck pays for their overpriced API
>>
File: file.png (1.93 MB, 1152x864)
>>
>>108066578
you're courting death
>>
File: file.png (2.2 MB, 1152x864)
>>
File: 1768113995188083.png (3.23 MB, 1216x1760)
>tfw no goofs of new captioning model
>>
File: file.png (2.1 MB, 1152x864)
>>
>>108066594
>>108066594
>>108066594
two rules for the new thread
>discuss free and open source diffusion models
>do not court death
>>
it's over
>>
>>108066148
Useful if real, thanks!
>>
>>108065952
Yes that's with turbo


