/g/ - Technology

File: collage.jpg (3.85 MB, 3264x1729)
Discussion of Free and Open Source Diffusion Models

Prev: >>108048751 and >>108053187

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/tdrussell/diffusion-pipe

>Z
https://huggingface.co/Tongyi-MAI/Z-Image
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo

>Anima
https://huggingface.co/circlestone-labs/Anima

>Klein
https://huggingface.co/collections/black-forest-labs/flux2

>LTX-2
https://huggingface.co/Lightricks/LTX-2

>Wan
https://github.com/Wan-Video/Wan2.2

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Illustrious
https://rentry.org/comfyui_guide_1girl
https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
>>
>>108055389
how is it spamming kek
>>
>>108055406
(samefag) like i'm not even the same person who made the original image
>>
Gen me a song that can both make the happiest man sad and the saddest man happy.
>>
>>108055436
the song of revolution is not one to be sung alone
>>
File: 00248-1439226944.jpg (1.54 MB, 2688x2048)
Quality restored
>>
>>108055457
tell him to go fetch the wheelbarrow that is floating away
>>
>>108055457
are you trans?
>>
>>108055477
Get back to work boy
>>
File: 00251-2005690729.jpg (1.41 MB, 2688x2048)
>>
>>108055483
huh?
>>
File: eyepact.png (1.64 MB, 1024x1024)
ace-step1.5 is fun enough
not good enough, but fun enough
https://vocaroo.com/147gcE5OU7zb
>>
Is ROCm viable now? I want to get a 9070 XT which is $200 cheaper than the 5070 Ti.
>>
>>108055555
no
>>
>>108055555
Nice get. But whatever you save in cash with ayymd, you're going to pay back tenfold in frustration and workarounds for a subpar experience.
>>
Blessed thread of frenship
>>
>>108055555
AMD cards work the same way using flex tape to fix a boat works. Cool that it somehow manages it, not ideal and prone to failure at every turn.
>>
File: ComfyUI_201718_.png (3.13 MB, 1536x1536)
noticed something odd on qwen: given this vague prompt, it does yae miko accurately.

"This is a digital illustration in an anime-style featuring a a flirtatious girl in an anime style, embodying a fantasy motif. She boasts a curvaceous physique with voluptuous breasts and a sleek waist, accentuated by long, cascading pastel pink hair adorned with a pair of golden, curved horns. Her expressive, deep purple eyes hint at a playful, somewhat cheeky demeanor. She's clad in a provocative ensemble consisting of a sleeveless white top embellished with a prominent purple, star-shaped emblem, and a short, revealing red skirt. The outfit is complemented by intricate golden ornaments and ribbons, while her right hand rests teasingly on her mouth, adding an air of allure to her flirtatious gesture. The background transitions from light to dark grey, intensifying the vibrant colors and captivating textures of the girl's character design."

Just qwen image 2512 and 8 step lora.

The prompt is from joycaption of some yae fan art
>>
so the only songs anons like here are about epstein, trump or floyd
why I'm not surprised
>>
>>108055613
>massive franchise can be prompted without loras?? huh??
>>
>>108055620
>so the only songs anons like here are about epstein, trump or floyd

Reading from that list, it's immediately obvious that test anon is currently testing Ace Step. There is no anonS here. Just test anon.
>>
>>108055622
it wouldn't be surprising or interesting if it did it from "yae miko" or "Yae miko genshin impact". It suggests qwen was made from auto captions on booru fan art as well, and heavily trained on it.
Give a try with that prompt, it is very vague to get straight up Yae
>>
>>108055634
There's most likely gorillions of identical captions under every Yae pic. You prolly just forced it to bleed.
>>
>>108055634
if its just shitty captions then you should be able to get by with a text embedding instead of having to make a lora
>>
anyone tried the doggy style ltx2 lora? I saw examples and the girl seems to always yap in them
can't test it now but that would be annoying
>>
>>108055674
The furry one and the pov are pretty good
>>
File: Flux2-Klein_00161_.png (516 KB, 704x768)
>>108055436
https://files.catbox.moe/2imtt6.mp3
>>
>>108055674
lol all the examples are shitty a dirty talk 'yes pump it harder' line delivered with the least emotion possible
>>
desu
z-image is so fucking good
>>
>>108055760
damn that looks amazing anon. way better than the tran slop he's spamming
>>
>>
Prompting Anima for realism is kinda interesting:
https://files.catbox.moe/f6ba1t.png

Seems like Noob where it has way more realistic knowledge than Illustrious did
>>
ai toolkit dude has been merging prs related to audio. soon...
>>
File: 2026-02-03_00007_.jpg (884 KB, 2240x1264)
>>
File: 1759489507398508.png (1021 KB, 1136x896)
>>
An example of an Ace Step cover.
Cursed Don't Stop Believing for you.

https://voca.ro/1i9qF5P699PX
>>
Whoever said ace step 1.5 was udio level needs to get his ears checked, it's so fucking inferior.
Still the best we have locally but no need to exaggerate.
>>
>>108055858
NTA, but it's heavily dependent on the seed and input. Averaged over thousands of outputs, I'd wager it scores lower than anything SAAS puts out, but it has the potential to match it if you're lucky.
>>
>>108055858
What about https://github.com/HeartMuLa/heartlib
>>
>>108055880
I suppose, and I'm being extremely charitable here, but nobody really knows what happens behind the scenes at suno. It's possible that it gens dozens of tracks then picks the best one, but I really doubt it
>>
>>108055891
How would it even know it's the "best"?
>>
>>108055777
>black
>>
>>108055899
it asks me to rate them
>>
>>108055899
Probably the same way Ace Step does: scores the model output against the user input and assigns a number to it.
That or they have thousands of Indians listening at 5X speed giving it a thumbs up or down.
>>
>>108055890
>Our latest internal version of HeartMuLa-7B achieves comparable performance with Suno in terms of musicality, fidelity and controllability. If you are interested, welcome to reach us out via heartmula.ai@gmail.com
If they release it some day.
>>
>>108055891
>It's possible that it gens dozens of tracks then picks the best one, but I really doubt it
I don't know for suno but for udio, I got shit gens half the time maybe, so the model is (was?) just that good I think.
>>
>>108055917
Ace step shits all over heart. I doubt they will ever release the 7B. It's there as enticement to venture capital.
>>
>>108055917
>music and sound/voice models are so far behind we still are at the weird gatekeeping era for them
man I want my good music gen model and my moans gen model
>>
>>108055772
Nice
>>
where do i go for anima discussion? feels like people are sleeping on it
>>
>>108056007
I'll discuss with you
anima looks cool, and her overdrive is nuts, but it's a massive pain to get her and once you break bahamuts damage limit then he's already a really heavy hitter
>>
>>108056007
what do you want to discuss? dont think theres many people actually using it here
>>
>>108055700
Alright, first tries out of Gradio UI

https://files.catbox.moe/wjh4eo.mp3
Prompt (loaded into Gradio)-
https://files.catbox.moe/4lm6te.json

Interestingly I see extra options that I don't see on Comfy like thinking mode enable/disable, which I left enabled here.

This is the closest I've gotten to Miku's voice completely unprompted, and it was first try out of the 4B model. The second song I got from that gen is also nice.

4B's power is truly showing. For anyone who's trained a LoRA from Gradio UI, how does it work? Which model are you using (base/sft/turbo) and does base/sft LoRA work on turbo?
>>
>>108056007
>feels like people are sleeping on it
retards are, yes
>>
>>108055858
It absolutely is. Maybe not 100% in terms of musicality or voice right away, (obviously Udio knows a bit more out of the box, nothing a LoRA can't help with) but controllability with all the tools + it's local means it is.
>>
>>108056062
Comfy's implementation is downright lazy. But I don't exactly blame him because as far as I can tell, they told everyone comfy would support it day one and then never gave him the weights and code.
>>
>>108056007
discord
>>
File: 1676772204312160.jpg (60 KB, 750x750)
>ldg fags
>requested base z for days
>zib is finally available
>ldg fags now prefer anima and ace
you guys deserve bbc gangbang
>>
>>108056154
You know what. I'm just gonna say it. Z base is kind of shit and I'm tired of pretending it's not to spare the feelings of a few copers.
>>
>>108056154
No one expected Z to be better at anime than an anime-specific model
>>
>>108056191
But it is
>>
>108056204
lmao
>>
>>108056154
i only wanted z-base for a future anime finetune. and we got an anime model out of nowhere. so why would i care about something that doesn't exist yet?
>>
>>108056154
Z base is merely an okay model. It's worse than Turbo (but we knew that from the beginning, Turbo has built in RL). It's easy to screw up training, apparently. Loras from it don't work well on Turbo since it's not even a parent model of Turbo. Klein is more interesting and fun to play with because of the edit functionality.

Z Base fades into obscurity unless it gets a big finetune. Except that it's nearly as heavy to finetune as Chroma and Kekstone blew $150k on that model only to get mediocre results.
>>
File: z flopped.png (51 KB, 1466x251)
>>
>>108056154
>for days
it was unironically two months
>>
Basically, if you're defending Z base at this point, you're a clown.
>>
AceStep vae is so cursed:
https://vocaroo.com/1kbj8cwU2FW0
https://vocaroo.com/1kqFqKtK0p3m
>>
There's no need to bait that much right now anon the thread is slow
>>
>>108056256
Baiting while a thread is fast would make it even more cluttered.
>>
>>108056256
Too afraid to even link to my posts.
>>
>>108056256
it's always slow when the shitter upper aint shittering the thread up
>>
>>108056154
>>ldg fags now prefer anima
Only for animu. Klein is too slopped for regular t2i and I'm not a Chromatard.
>>
what's the status of ltx2 nsfw? can it reach wan2.2 levels of goonery?
>>
I hate comfyui so much is there an alternative yet?
>>
File: Klein9b-000003.jpg (243 KB, 1024x1024)
>>
File: Anima_Output_632626.png (2.18 MB, 1248x1824)
>>
>>108056271
how is Klein slopped lol
>>
>>108056292
it's as slopped as wai is compared to noob
>>
>>108056283
not yet and lora makers need to understand how to properly use sounds, but give it a few weeks
>>
>>108056154
only two digits iq degenerates requested baze because it supposed to be shit
i knew and told this since day -1 but no one was listening
>>
https://voca.ro/1aF0uiOJVKo5
made a song about my favorite person :)
>>
File: ComfyUI_09832.png (3.28 MB, 1440x2160)
>>108056154
Well, it's really only supposed to be for training. At least that's all I've been using it for. All the anime posting did make me question why there's a separate anime diffusion thread though.
>>
File: 1752299057794144.png (3.5 MB, 1408x1632)
>>
File: 1751815784387095.jpg (635 KB, 1536x1536)
healer doesnt have the zit face anymore.. sad!
>>
File: 1751753344167482.jpg (816 KB, 1536x1536)
>>108056476
other version.
>>
>>108054932
>>108054948
These are incredible. What is your prompt for these?
>>
>>108056496
This one has flux chin, though
>>
wsg thread died, ltx is dead
>>
>>108056511
im using zib + 2nd pass zit :(
>>
File: 1755906747787603.jpg (868 KB, 1824x1216)
>>
File: a1.png (5 KB, 681x159)
New ZImage lora setting from OneTrainer. Is it a cry for help? Using alpha=1 should require a wellness check
>>
now that onetrainer added flux2 klein support, how the fuck do i train this thing? whats the timestep distribution and timestep shift i need to use? what captioning style? rank16 for characters? preset ends up like shit for me
>>
>>108056559
whats the problem? pretty sure it has been the default for all models and the OP also suggests alpha 1 is perfectly fine
>>
File: 1766448230399886.png (3.08 MB, 1216x1888)
>>
File: 1752718620155540.png (2.92 MB, 2016x1120)
CF attack
>>
File: 1764467440995057.png (2.73 MB, 1536x1536)
oniichan we're going in the bathroom
>>
Catjak hasn't posted in this thread, I wanted to thank him: unlike Debo who killed my enthusiasm, he has actually rejuvenated it. I have learned a lot during these 54 hours.
>>
File: 1761649350757622.png (3.66 MB, 1408x1568)
>>108056589
a nice shower to start things off
>>
File: 1754615369080578.png (3.76 MB, 1728x1312)
>>108056597
its so steamy here...
>>
>>108056497
>1girl
>>
File: 1770189001458812.png (3.3 MB, 1094x1699)
>>108056581
>>
>>108056589
too old
>>
>>108056612
Good job you know what arms are supposed to look like
>>
>>108056570
Raise alpha, lower LR, more steps = more room to find smoother minima, less 1:1 replication
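
For anyone wondering why alpha matters here: in the usual LoRA formulation the learned update gets multiplied by alpha / rank, so alpha=1 at rank 16 shrinks the update 16x compared to alpha=rank and the LR has to make up for it. A minimal sketch assuming that standard scaling; the function name and the rank/alpha values are just for illustration, not OneTrainer's actual code:

```python
def lora_scale(alpha: float, rank: int) -> float:
    """Multiplier applied to the low-rank update (B @ A) in the usual LoRA formulation."""
    if rank <= 0:
        raise ValueError("rank must be positive")
    return alpha / rank


# Illustrative settings only; not taken from any trainer preset.
for alpha in (1, 8, 16):
    print(f"rank=16 alpha={alpha}: update scaled by {lora_scale(alpha, 16):.4f}")
```

So raising alpha and lowering LR move in opposite directions on the effective step size, which is why they get changed together.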
>>
File: 1739655273139958.png (2.82 MB, 2016x1120)
>>108056609
a fresh bath!
>>
>>108055858
It is faggot I prove it, you are just a skill issue SAASjeet which cannot read documentation, undio has prompt enhacer, and also you are using the bad llm and bad samplers. You are just a nigger promplet and low iq, go back tonyiur SAAS sir
>>
File: 1760257283454907.png (3.22 MB, 1472x1568)
>>108056636
all set! ready for the beach.
>>
File: 1751135185544770.png (3.26 MB, 2016x1120)
alright now time for some torture slop
>>
File: 1769132307930955.png (3.38 MB, 1408x1568)
>>
>>108056591
what did you learn? how to be schizo?
>>
File: 1747814024938552.png (3.34 MB, 1728x1248)
>>
the cycle continues
>>
File: 1750594846115303.png (3.48 MB, 1888x1152)
imma finna torture u princess
>>
File: 1761045653139966.jpg (824 KB, 1888x1152)
>>108056660
alt version, found it funny imagine the BJs
>>
>>108056655
I learned how to be more particular and how to embrace my influences. I'm not saying some guy isn't a schizo - he is. But I sort of saw my own enthusiasm in him.
He's like a child, but with a mean demeanor.
>>
File: 1740170746354448.png (3.11 MB, 1568x1408)
really angery
>>
File: 1744908960663260.png (3.77 MB, 1376x1632)
alright with the anger out of the system, it's time for gift
>>
File: 1745951881289561.jpg (768 KB, 1728x1312)
>>108056680
the princess appreciates
>>
File: 1748899755151611.png (3.57 MB, 1824x1216)
and zit-slop face vampire chan is happy about the whole ordeal.
>>
File: 1768420329617755.png (2.36 MB, 1504x1472)
>>
>>108055734
listened to the whole thing, A+
>>
>>108056154
the model page literally said it was going to be "worse" than turbo.
are you ESL? do you not understand the word "finetune"?
>>
>>108056636
Hawt
>>
local diarrhea general
>>
>>108056701
I can't go back to ZIT tbqh, it has no variety, even when prompting for races or hair color, you get the same fucking face per race.
>>
>>108056612
i see no problem here
>>
>>108056708
What are you using now that is better than ZIT?
Klein has that superficial AI look to it.
>>
>>108056733
Thank you. They are both like children.
>>
File: ComfyUI_00748_.png (398 KB, 1024x1024)
>>
File: 1752643086843256.png (1.03 MB, 828x981)
>>108056723
I'm the guy who posted all the anime-derivative gens above, I use ZIB 1stpass, upscale, ZIT 2nd pass.
I found ZIB lacks a bit in polish/fine detail, the ZIT pass used as a refiner with low denoise (0.6) pieces everything together. If I go higher on the denoise everything starts getting ZITslopfaced
50~s per gen
>>
File: ZIT_00024_ (2).png (1.75 MB, 1024x1024)
Zimage is nice.
>>
File: ZIT_00037_ (1).png (1.63 MB, 1024x1024)
>>108056771
TeslaQuote.jpg
>>
>>108056761
I'm still a newbie. I've used ZIT and nothing else. How do you do 1st pass and 2nd pass?
>>
File: ZIT_00047_.png (1.64 MB, 1024x1024)
>>108056802
>>
>>108056753
Are we locked in with her or...
>>
File: 1745177290962443.png (3.43 MB, 1536x1536)
>>108056820
here's a workflow I just created:
https://files.catbox.moe/t2swlx.png

it's basically doing a 2nd pass with hires.fix
if you want the gen to be faster, reduce the starting res (I usually do 0.9MP) and do the upscale at 1.6x
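
If you'd rather compute the two resolutions than eyeball them, here is a rough sketch. Assumptions on my part: 1 MP means 1,000,000 pixels, sizes get snapped to a multiple of 16, and both helper names are made up for this post, not ComfyUI nodes:

```python
import math


def dims_for_megapixels(megapixels: float, aspect: float, multiple: int = 16) -> tuple[int, int]:
    """Width/height near the target pixel count at aspect = width / height, snapped to `multiple`."""
    height = math.sqrt(megapixels * 1_000_000 / aspect)
    width = height * aspect
    snap = lambda v: max(multiple, round(v / multiple) * multiple)
    return snap(width), snap(height)


def upscaled(dims: tuple[int, int], factor: float, multiple: int = 16) -> tuple[int, int]:
    """Second-pass size: scale both sides by `factor`, snapped to `multiple`."""
    w, h = dims
    return (round(w * factor / multiple) * multiple,
            round(h * factor / multiple) * multiple)


first_pass = dims_for_megapixels(0.9, aspect=1.0)   # square 0.9MP start
second_pass = upscaled(first_pass, 1.6)             # 1.6x upscale for the refiner pass
print(first_pass, second_pass)  # (944, 944) (1504, 1504)
```

Change `aspect` for portrait/landscape; the 0.9MP and 1.6x numbers are the ones from the post above.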
>>
>>108056900
forgot, remove the sage attention/torch compile nodes if you don't use them.
note that sage doesnt work with base
>>
>>108056664
>imagine the BJs
No, I don't think I will.
>>
Cache-dit got some updates recently, it seems it's now a must have.

The problems earlier were people using incorrect settings, not getting anything cached at all or broken gens.
So now with presets it's giving me 20-30% faster results.
>>
>>108056802
Chinese eye-acupuncture
>>
So. Ace was bullshit perpetuated by Cum and its partners.
>>
>>108056974
To add: it's going the same way as Anima. No people on earth would use 0.6b model for anything. It's capable of sorting a database of keywords and nothing else.
>>
>>108056974
>>108056979

>>108056347
listen to this
>>
>recalled stability released an open source audio generation model
>check to see if it was ever updated
>only API now
>barely any videos on it with low views
Why are they even bothering anymore
>>
>>108057025
>released open source
>is api only
hmmm hello?
>>
>>108057038
Yes they released an open source model but then switched to API. Can you read?
>>
>>108057013
Not music I'm afraid. Normies think that lyrics equal music, lol. Learn to play an instrument and come back in five years or more.
>>
>>108057045
I mean the song is dedicated to you, did you like it?
>>
Hmm, is offloading the models using multigpu onto ram and using the cache-dit counteracting each other?
>>
>>108057047
I can sense trash in 15 seconds when it comes down to music. Answer is: no.
>>
File: Video_00001.mp4 (2.17 MB, 480x848)
Cool, didn't prompt for the paint stuff.
>>
>>108057084
And, that creation is not music either.
>>
Why are you talking when you could be suiciding?
>>
>>108056497
Should be something like
```A fair-skinned young Caucasian woman with long, sleek copper-red hair stands centrally on a weathered stone walkway, posing directly for the camera. She wears a whimsical pastel lavender mini-dress featuring a tiered skirt, ruffled bodice with lace trim, and sheer long sleeves, accessorized with a metallic gold crossbody bag. Her legs are clad in intricate white patterned lace tights, ending in chunky two-tone black and white platform oxford shoes. She is situated in a formal garden setting, flanked by stone balustrades topped with large white classical urns containing manicured green bushes. Immediately behind her stands a white architectural frame structure bearing the text "1GIRL GARDENS" in bold serif capital letters. The background reveals terraced flower beds, classical white statues, and a green hillside dotted with buildings. The lighting is soft, flat, and diffused from an overcast sky, creating shadow-free illumination that enhances the soft pastel colors of her dress and the even tones of her complexion. Style: whimsical street fashion photography. Mood: sweet, composed, and serene.```
>>
>>108057117
I'd definitely put in caucasian half way.
>>
>>108057083
just use lazy cache
>>
>>108057107
No one is talking. People are posting. I'm worried about you.
>>
Can you ever actually learn enough to feel in control or is it always going to feel like gacha where you eventually just lower your expectations and accept the result?
>>
File: 620707864261495.png (1.53 MB, 880x1168)
>>
>>108057240
what model is this? i can't seem to find a model with good reflections for 2d content
>>
Man I really wish anima team went with 1.7B instead of 0.6B.
So many gens where you can clearly tell the tiny LLM is struggling to decipher what exactly you meant and can't even reliably make the most straightforward assumptions. I commonly find myself needing to prompt in an excessively verbose way that no other NLP-capable (i.e. not clip) text encoder I've used so far needed. Even old t5 junk resolved meaning a lot better.
I wouldn't call it a truly SOTA illustrious replacement, but it is still a lot better and I am really liking the model otherwise. Assuming no one makes a decent booru tune in the following months (I want Kaleidoscope to succeed but my expectations are non-existent with lodestone.) would it be too difficult to train it to take prompts from a larger Qwen 3 like 1.7B or 4B? I know that training to use a whole new text encoder is time consuming (e.g. Gemma) but since they are from the same family, would that allow a significantly quicker training to work?
>>
>>108057214
There is always some luck involved with sloping even if you do everything right.
But depending on how stable the model is and how much know-how you have you can enjoy having decent results most of the time.
>>108057240
Mr. Advertiser get down!
>>
File: yu7.png (2.94 MB, 1472x1600)
>Guys it's been a week since Zimage dropped and there hasn't been a finetune yet I'm starting to get worried
>>
File: 261388768617577.png (1.62 MB, 832x1232)
>>108057266
It's Klein-9B, but it's using a 3D render as the base image so that's where the reflection comes from.
>>
>>108057281
4B should be the minimum, it's not heavy for even 16GB ram and whatever vram even a laptop user has.
0.6 isn't good for anything. it worked for sdxl because it has a bigger dataset but for an anime booru thing, not so sure
>>
>>108057281
I'm surprised that 0.6b works at all.
>>
File: 1girl garden.jpg (1.19 MB, 2606x1152)
>>108057117
>1050 Ti (4 GB VRAM) + 16 GB DDR3 RAM
>864 x 1152
>558 sec, 441 sec, 342 sec
Thank you my friend.
Here are my outputs.
>>
File: 969441159993796.png (1.69 MB, 896x1152)
>>108057266
>>108057318
I'm testing it now with txt2img and the reflections are not terrible, but it would take some editing to get them to match up properly.
>>
>>108057363
>>108057117
By the way, I used your prompt exactly. It still gave me asians. I've been playing around with ZIT and it does seem to ignore certain input sometimes.
>>
>>108057342
The problem (maybe) with 4B here is that the diffusion model itself is 2B. I assume they wanted it to be vramlet friendly.
Though you can just ask people who have 4gb vram or similar limitations to run it at Q8, so it shouldn't be too much of an issue.
>>108057358
It certainly works better than when you chat with an LLM of similar size and try to do... basically anything with it.
I wonder if there is interesting research to be done on how an LLM's understanding of instructions and internal modeling of the world scale down at very low parameter counts vs problem solving intelligence or whatever.
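
Back-of-envelope for the text encoder sizes being discussed, weights only (no activations or context). Assumptions: the parameter counts are the nominal 0.6B / 1.7B / 4B, bf16 is 16 bits per weight, and GGUF Q8_0 is about 8.5 bits per weight (32 int8 weights plus one fp16 scale per block). `weight_gib` is just a helper name for this sketch:

```python
BITS_PER_WEIGHT = {
    "bf16": 16.0,
    "q8_0": 8.5,  # GGUF Q8_0: 32 int8 weights + one fp16 scale per block
}


def weight_gib(params_billion: float, fmt: str) -> float:
    """Approximate weight storage in GiB for a model with the given nominal parameter count."""
    total_bits = params_billion * 1e9 * BITS_PER_WEIGHT[fmt]
    return total_bits / 8 / 2**30


for size in (0.6, 1.7, 4.0):
    print(f"{size}B: bf16 ~{weight_gib(size, 'bf16'):.2f} GiB, "
          f"q8_0 ~{weight_gib(size, 'q8_0'):.2f} GiB")
```

Real files come out a bit different since actual parameter counts aren't round numbers and some tensors may stay at higher precision.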
>>
>>108057363
>1050ti
LOL
>>
File: Flux2-Klein_00114_.png (1.79 MB, 1024x1024)
Babymetal kino
https://files.catbox.moe/3eccq6.mp3

Really nice that you can make the voice sound childish and even make children's songs with ACEStep 1.5.
ACEStep tip: give it a shit prompt and shit settings and it won't give you anything resembling what you want. Give it a good prompt and it'll give you exactly what you're after. So if you think it's trash, as I've seen some Redditors quickly conclude, git gud with all the settings.
>>
>>108057417
It's not too bad. I'm just glad it works at all.
Each generation takes about 7-8 minutes. I just do something else while waiting.
Overnight I can queue 40-50 and it'll be done when I wake up.
For testing I can drop the resolution from 864x1152 to 512x512, then it takes 2-4 minutes per generation.
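
Quick sanity check on those queue numbers, as a sketch. The 6-hour night is my assumption, and `gens_per_session` is a made-up helper name:

```python
def gens_per_session(hours: float, seconds_per_gen: float) -> int:
    """How many full generations fit in a session of the given length."""
    return int(hours * 3600 // seconds_per_gen)


# 7-8 minutes per gen at 864x1152, assuming roughly 6 hours unattended
print(gens_per_session(6, 8 * 60))   # 45
print(gens_per_session(6, 7 * 60))   # 51
# 2-4 minutes per gen at 512x512
print(gens_per_session(6, 4 * 60))   # 90
print(gens_per_session(6, 2 * 60))   # 180
```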
>>
>>108057389
> Though you can just ask people who have 4gb vram or similar limitations to run it at Q8, so it shouldn't be too much of an issue.
It runs even on CPU just fine. Yes, it could take an extra minute, but the more images you generate before changing the prompt, the less noticeable it is.
>>
>>108057456
4b, 4 steps?
>>
>>108057438
This is matching about 97% of my lyrics, just needs small changes with repaint, wonder when that'll work properly at all (haven't tested it on small bits yet, but didn't seem great when I tried replacing a large chunk of a previous song)
>>
Are there any good resources out there for training a Wan lora? I know there are guides on civitai etc but it's hard to know what's trustworthy when so many people making loras seem to have no fucking clue what they're doing
>>
>>108057438
are you using gradio or comfy?
>>
>>108057281
Wat
It has great adherence
>>
>>108057468
I don't know what 4 steps means. But I use Z Image Turbo. Everything default settings.
>>
File: 471142394661669.png (3.9 MB, 1920x1072)
>>
>>108057438
can u do one with these lyrics

[Intro]
oh oh
cu-ni daisuki
oh oh
cu-ni daisuki

[Verse]
loli oishii
obacha muri
muri muri!
loli sugoi!
jk muri!
muri muri!

[Chorus]
loli pettan
oishi!
loli oppai
muri!

[Verse]
ore wa... loli daisuki
omae wa... gay!
ore wa... loli-chan... sukida!!
SUKI DA! LOLI! SUKI DA!!

[Chorus]
loli pettan
oishi!
loli oppai
muri!

[Outro]
loli oishii sugoi... dai suki!
>>
>>108057456
10 minutes isn't bad at all and you can optimize the shit. If you're PSU constrained or would like to use a silent workstation gpu, look into the rtx 3050 6gb (2024). It has a hefty amount of cuda cores but it'll never even throttle its fan.
Sure thing some autist will claim you can't gen without having a 400W card but whatever.
>>
>>108057438
>>108057479
I'm using Gradio in this case, due to 4B LM crashing for me on Comfy, and it missing "Thinking" mode. This is the Gradio metadata for that gen
https://files.catbox.moe/zwvnzz.json
>>
>>108057514
4B had support merged in like 3-4 hours ago, I really cant stomach their fucking gradio shit
>>
>>108057484
>Wat
If you need an example: I prompted something like "person lying on the ground" inside a larger prompt and it generated deformed variations of Family Guy death pose. But when I changed that part to "person lying on the ground, on his back" it was all good. I am referring to needing shit like that.
>great adherence
Great compared to clip, but that bar is in hell. It's still a massive improvement, but a larger te would have been clearly better.
>>108057501
The more you look at it the worse it gets
>>
>>108057507
I'm still new and I'm having a lot of fun using ZIT.
I definitely want to upgrade in the future so I can use things like ZIB and LTX-2 (for video). I also want to eventually learn how to edit these outputs (inpainting).
>>
>>108057531
If you were on 2000 series, you could have used nunchaku but alas.
What quant are you running? Maybe Q8 would run faster. Don't bother with lower quants.
>>
>>108057531
It's actually good to practice on low end because you'll learn more.
When a nigga gets a gen in one second he doesn't think about the workflow or the image.
>>108057531
>>
File: file.png (41 KB, 1144x441)
I have a suspicion that using the built in prompt-rewrite might not be giving the best results.
>>
>>108057531
Btw, and I know this is local general so it makes some people irrationally angry to suggest this, but you might want to rent a GPU from vast/runpod to make gens. Making images in seconds there would cost you a lot less than what running your computer for 10 minutes per gen adds to your electricity bill.
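
If you want to check that against your own numbers, a rough sketch of the comparison. Every value below (wattage, electricity price, rental rate, seconds per gen) is a placeholder I made up, so plug in your own; the helper names are hypothetical too:

```python
def local_cost(watts: float, seconds: float, price_per_kwh: float) -> float:
    """Electricity cost of one gen on your own machine."""
    return watts / 1000 * seconds / 3600 * price_per_kwh


def rental_cost(price_per_hour: float, seconds: float) -> float:
    """Cost of one gen on a rented GPU, counting only the time spent generating."""
    return price_per_hour * seconds / 3600


# Placeholder inputs, NOT real quotes: 150 W system draw, 0.20 per kWh,
# 0.36 per hour rental, 10 seconds per gen on the rented card.
print(f"local:  {local_cost(150, 600, 0.20):.4f} per gen")
print(f"rented: {rental_cost(0.36, 10):.4f} per gen")
```

Keep in mind rentals usually bill for the whole time the instance is up, including setup and idle time, so the per-gen figure only holds if you batch a lot of gens per session.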
>>
>>108056950
which cache dit? theres like 2 for comfyui and ones in chinese. also dont you have to use like 20 steps for it to be effective? i like 4 step wan
>>
>>108057514
I'm begrudgingly using gradio and I fucking hate it. I have no idea how gradiots put up with it. Random pieces of the UI sloughing off at random, barely any control over what happens when, no idea if an error means you need to shut down the whole server or if you hit gen again it's going to magically work.

I hate it so much. But at the same time, cumfy really phoned it in with ace step.
>>
>>108057604
don't you love when you change the batch count from 2 to 1 by hitting backspace then typing 1, leaving the field empty for a split second and completely breaking the ui as it tries to parse a batch size of null?
>>
>>108057630
Hasn't happened to me yet but I totally believe it. There's just so little feedback with a gradio UI. This isn't to praise comfy. But at least I can see it all play out in a logical order and have some command over it.
>>
I can't reply to you because if I would my ip would be flagged as csam poster.
I don't think you need to anything what you couldn't solve with 2 hours of lurking and reading the actual posts instead spamming the same shit.
>>
>>108057642
*to know
My finger was lost in a poker tournament.
>>
>>108057633
The fuckaround is worth it though
https://voca.ro/1fEU5CAqc8NF
>>
>>108057549
How do I check what quant I'm using? I'm using default setting ZIT.

>>108057598
Perhaps but I like the privacy of my prompts.
>>
complete n00b here

am i retarded for using easy diffusion?
>>
File: Erika.jpg (579 KB, 1787x1923)
>>108057438
My Step Ace did Erika but in Eurodance style
https://vocaroo.com/1jwHE0jcyRCi
>>
>>108057665
You need to sink or swim. Get cum ui or the tranny special: neo vagina (neo forge).
>>
File: 1750360645667992.jpg (875 KB, 2016x1152)
>>
>>108057502
Got some interesting ones after playing with settings, found sweet spot for musicality while mainting adherence

https://files.catbox.moe/vp1xie.mp3

and

https://files.catbox.moe/go8sj2.mp3 (slightly worse lyrics adherence but decent musicality)

Tags aren't exactly what you gave me as well, added descriptors like
"[Intro - Hyper-speed Power Metal Shredding]"
for ACEStep specific stuff.
>>
Just read couple of threads. I suppose you are literate aren't you.
>>
>>108057636
There won't be an /ai/ board, schizo. We've been over this many times already. There's not enough traffic on most boards if you take away the ai discussions. If anything AI kind of saved the site

>>108057642
>I can't reply to you because if I would my ip would be flagged as csam poster.
I genuinely can't imagine thinking like this in 2026. Like at this point you'd either accept reality that you can talk about naughty "littlest ones" on plaintext Gmail and no one cares, or take the steps (very easy steps) to make yourself anonymous enough for glowies to not bother unless you're literally creating new CP content

Like, I'm shitposting from the evasion platform right now because my wife hid my weed and I can't find it, but if LTX2 was truly wan 2.5 at home there'd be cuties in this general every day

That being said, maybe it's good to be afraid of the Internet since the Internet is for losers now and I wish I spent my years keeping more friendships during these brief lonely years before I get my AI little girl companion forever
>>
Anyone tried do an ace step LoRA yet? I'm wondering how many songs I actually need to make it work. Could one song be enough?
>>
>>108057737
What makes the difference is the attitude and the fact you are taking every post literally. You are a sub 90 IQ pol tourist poster.
>>
>>108057659
>I'm using default setting ZIT.
You are running bf16. (Possibly extra bad since no bf16 acceleration on that old GPU)
Install the ComfyUI-GGUF extension. Swap the Load Diffusion Model node for the Unet Loader (GGUF) node. Then download the z-image turbo q8 gguf, put it inside models/unet, press r to refresh if needed, then select it in the gguf node. See if that runs faster.
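If you want to check for yourself what precision a checkpoint is stored in, here is a minimal Python sketch. It relies on two file-format facts: a .safetensors file starts with an 8-byte little-endian header length followed by a JSON header listing each tensor's dtype, and a .gguf file starts with the 4-byte magic "GGUF". The function names are made up for illustration, and this only tells you how the weights are stored on disk, not what dtype your UI casts them to at load time.

```python
import json
import struct
from collections import Counter


def safetensors_dtype_counts(path):
    """Count tensors per dtype by reading only the safetensors JSON header."""
    with open(path, "rb") as f:
        # First 8 bytes: little-endian unsigned 64-bit header size.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    # "__metadata__" is an optional free-form entry, not a tensor.
    return Counter(
        entry["dtype"] for name, entry in header.items() if name != "__metadata__"
    )


def looks_like_gguf(path):
    """Return True if the file starts with the GGUF magic bytes."""
    with open(path, "rb") as f:
        return f.read(4) == b"GGUF"
```

A result dominated by "BF16" from `safetensors_dtype_counts` means the file on disk is bf16. For GGUF files the quant type is usually in the filename (q8 and so on).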
>>
>>108057604
>>108057479
Nice, I'll probably be back to Comfy eventually once Repaint support is mixed in, for now Gradio is tolerable.
>>
why are you doing this?
>>
>>108057752
Jesus fucking Christ it's so embarrassing when fake whites try to use formal logic to construct arguments. I have literally no idea what this schizobabble of a post means. I'm sorry that in 2026 we have the freedom to speak freely about evil because evil is so prevalent everywhere I guess? I'd be offended if I couldn't tell by the way that you write posts that you're less white and less tall than I am.
>>
>>108057762
Someone who gives a fuck will make it work eventually.
>>
>>108057785
i do, but im busy making my own paid app
>>
>total vram usage climbs 200mb after each ace step gen

vrabo!
>>
>>108057769
What do you mean?
>>
>>108057792
yeah
>>
File: 60427865871746.png (1.67 MB, 1248x832)
>>
>>108057805
benchod
>>
>>108056641
This looks like faux-real cartoon like Lazytown. With enough reference frames, you could make an entire episode.
>>
So there's also apparently a way to add and extract stems using the base model?
>>
>>108057820
shut the fuck up loser, i hate people like you who speak in presumptious retarded ways, either put up or end yourself
>>
>>108057782
>defending Jannies inaction ever
Board creation is not up to jannies, not even mods

And yes I will defend janny inaction. Jannies over moderate. There is no reason an image of a girl in a swimsuit from an Amazon listing should be removed for rule 1.

>>108057782
>4chan is dying the long drawn out death of stagnation and just overall corrosion. The fact there are three imgen ai generals on /g/ and everyboard having a /ai/ general with the ironic exception of the neurotic /ic/ schizos who are already preparing to euthanize themselves over ai art.
You said two statements but no actual argument. What is your point? If anything you're agreeing with me that AI generals are the things keeping the boards active at all

>You forget this a generation where you can't say "suicide", "stupid" on most online platforms.
As an older zoomer it's crazy to think that I barely escaped Newspeak because I was out of highschool by the time tiktok really blew up. Otherwise I'd probably be saying shit like "unalive" as well

>>108057782
>The less you give a shit over internet dopmaine drip/status the more you get out of life and real friends, Gf/waiuf, shit even enjoying sex
Hedonism isnt great because the endgame of the ideology is always gross but it's better than whatever zog consumerism the average golem passively follows.

You mentioned last time you were working on a boat or something. Hope that's been fun for you. Hopefully the next time you're here the next big thing for local is also here
>>
>>108057820
Stems are prone to latent washback.
>>
>>108057828
what are your guardrails?
>>
>>108057828
>Stems are prone to latent washback.
This is technobabble
>>
>>108057826
you are a zoomer, you will always be part of them and not us, we hate you
>>
>>108057833
I don't want to answer against my policy.
>>
>>108057854
technobabble
>>
>>108057817
>With enough reference frames, you could make an entire episode.
Lazytown is a bad show to try it with though since there's a lot of movement since it's like a fitness show or something. Idk I've never watched it as a kid since it's a couple of years before my time and I'm saving it for when I have a daughter with a fitness Instagram

>>108057839
I consider myself a zillenial because there are fundamental things about zoomer culture that I hate that I strongly believe are due to me being slightly older

It's ok I hate you too for ever thinking that tattoos were ever even slightly acceptable, especially on females
>>
File: W8Y5RENWA2YPOYNB0.jpg (744 KB, 1920x640)
Babe, babe, wake up! ChenkinNoob-XL-v0.2 Rectified Flow is officially released!

A Rectified Flow conversion of ChenkinNoob-XL-V0.2, developed by Bluvoll & Anzhc from ChenkinRF Lab.
>What is Rectified Flow?
A training and sampling formulation that "straightens" the path between noise and image, aiming for faster convergence and better image quality.
>What is ChenkinNoob?
An updated version of NoobAI EPS, trained on an updated dataset.
>Key Advantages
- Vivid colors, no more greyness
- Better lighting (dark/contrasty scenes)
- Stable across wide CFG range (3-6)
- Fewer steps needed (20-28)
>Recommended Settings
- Sampler: Euler / DPM++ SDE
- CFG: 3-6 | Steps: 20-28 | Shift: 3-8
>Download
- HF: https://huggingface.co/ChenkinRF/ChenkinNoob-XL-v0.2-Rectified-Flow
- Civitai: https://civitai.com/models/2363696
>Guides
- English: https://www.notion.so/Chenkin-Noob-XL-RF-User-Guide-2fc03a034c1f80c18c00ffc93d4ad2a4
- Chinese (中文): https://my.feishu.cn/wiki/ITBKwIoD5itZXukljjacDvc2nYd
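
For anyone wondering what the Steps and Shift settings do in a flow model, here is a minimal Python sketch. It assumes the SD3-style shift formula sigma' = shift * sigma / (1 + (shift - 1) * sigma) and a toy one-dimensional straight-line path. Neither is confirmed to be what ChenkinNoob RF or any particular UI actually uses, and all function names are made up for illustration.

```python
def shift_sigma(sigma, shift):
    """Assumed SD3-style shift: pushes noise levels toward the noisy end."""
    return shift * sigma / (1.0 + (shift - 1.0) * sigma)


def shifted_schedule(steps, shift):
    """Noise levels from 1.0 (pure noise) down to 0.0 (clean), after shifting."""
    return [shift_sigma(1.0 - i / steps, shift) for i in range(steps + 1)]


def euler_sample(x_noise, velocity_fn, sigmas):
    """Plain Euler integration of dx/dsigma = velocity along the schedule."""
    x = x_noise
    for sigma, sigma_next in zip(sigmas, sigmas[1:]):
        v = velocity_fn(x, sigma)
        x = x + (sigma_next - sigma) * v
    return x


if __name__ == "__main__":
    # Toy straight path: x(sigma) = (1 - sigma) * data + sigma * noise,
    # so the true velocity is the constant (noise - data).
    data, noise = 0.25, -1.5

    def toy_velocity(x, sigma):
        return noise - data

    for steps in (1, 4, 20):
        result = euler_sample(noise, toy_velocity, shifted_schedule(steps, 3.0))
        print(steps, "steps ->", result)
```

On a perfectly straight path the velocity is constant, so Euler lands on the data point (up to float rounding) no matter how few steps it takes. A real model only approximates a straight path, which is presumably why the recommended range is still 20-28 steps rather than 1.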
>>
>>108057861
>tool that makes everything better with no cons
hahaha yeah sure buddy
>>
File: 271176525841466.png (1.71 MB, 896x1152)
>>
>>108057871
this is cultural appropriation zoomer
>>
>>108057630
Kek, it's probably vibe coded. One thing I like about the Gradio though is the batch size being set to 2 by default. That way one can quickly decide whether it's bad prompt/settings or just bad seeds.

>>108057597
Yeah, I have that turned off, if you check their Discord it just seems to prefer specific phrasing so that makes it less versatile.
>>
>>108057734
lmao'd
thanks
>>
>>108057892
>lmao'd
proof?
>>
>>108057861
HOLY FUCK! This model fixed my childhood trauma!
>>
Anyone else experiencing high load and heating with new Anistudio macOS app?
>>
File: ComfyUI_01528_.jpg (2.95 MB, 2089x2089)
>>108057826
>Board creation is not up to jannies, not even mods
I don't care about Janny ranks/Force Org, they are all the same to me.
Defending for any reason is dumb and clearly the sharty hack showed just how bad on both ends of the spectrum they are at with inaction/overmoderating.
> Hopefully the next time you're here the next big thing for local is also here
I don't remember you, when/where is the next big event Tokyo? Unlike those who do it for free, I do work to get paid even as a glowie.
>>
>>108057892
nyo~
>>108057734
one more question, did you transform it to hiragana or katakana before processing it?
it's pronouncing too many vowels (u in suki is mostly muted)
>>
File: 138272290777458.png (1.57 MB, 928x1120)
>>108057875
kek
>>
File: mustard.jpg (35 KB, 1024x1024)
dead website, who cares
>>
>>108057860
>Lazytown is a bad show to try it with though since there's a lot of movement
I mean it in the sense of the artstyle. Mostly human actors but overexaggerated cartoony colors and shit.
>>
>>108057861
I don’t know what RF means but my 1girls are shinier so I trust it.
>>
>>108057901
Whenever I run an exe my computer gets loud.
>>
>>108057861
>sdxl
I sleep
>>
>>108057906
why are you reposting
>>
>>108057861
is this vpred?
>>
>>108057861
If this is snake oil, then I’m a proud oil drinker.
>>
>>108057920
10 whole pixels
>>
>>108055858
must train loras to get voice/style
>>
>>108057934
Better and more modern than vpred, more consistent
>>
>>108057861
>still clip
At least retrain this shit with a small gemma or qwen holy fuck. These retards really enjoy their stagnation.
>>
>>108057934
>eps
what u think?>????
>>
>>108057936
we know you are a proud retard
>>
>>108057666
Not bad. So far all these other languages that ACEStep knows sound very good.
>>
>>108057942
they have 1080s and are poor, what did you expect?
>>
>>108057943
One of the best artists on /ldg/ swears by vpred models.
>>
>>108057861
i cant go back to sdxl vae and clip
>>
>>108057941
nothing is better than vpred
>>
>>108057930
What would wake you up?
>>
>>108057861
10/10 would abandon NovelAI again.
>>
>>108057954
>ai slopper
>artist
lOL
>>
>>108057963
full release of anima of course *hits pipe*
>>
>>108057949
>>108057666
thanks, yes, the languages are good
note that it's the old step version, didn't try 1.5 yet
>>
>>108057954
>/ldg/
>best of
lol
>>
>>108057914
>did you transform it to hiragana or katakana before processing it?

These are romaji lyrics I give it while the language is set to "ja". It's probably best to translate them to avoid that issue, but it could also just be a model size problem, in which case altering seeds/settings is the only thing that can be done, unfortunately.
>>
>>108057963
Chromanime
>>
File: ComfyUI_01531_.jpg (3.87 MB, 2580x2580)
>>108057914
It sounds about right desu, vocals in Japanese music are less muted. Not always but not uncommon either.
>ITT, SUPER WEABOO NIHONGO MASTA BENNKIYOOO NIHOGO MAINICH JAPAN SUDEIMASS DESU (that's live in Japan for you gaijin) here...
>>
>>108057986
stop reposting make new images fuck you
>>
Do people transform descriptions into tags with LLM or do people really insert the tags themselves individually?
>>
>>108057861
I can't use non RF models again. RF ruined me for everyone else—the ball is in your court, Chenkin.
>>
https://voca.ro/17Z06ZF1QTFS
One for the oldfags, gnite anons
>>
Why can I masturbate to my gen'd shit easily but shit from other people looks all like unfapable AI slop?
>>
>>108057861
more like RECTAL FLOW lmfaooooooooooooooooooooo its diarrhea lololololooooool
>>
>>108058002
autogenephilia
>>
File: ComfyUI_01532_.jpg (1.14 MB, 3024x4032)
>>
>>108057981
The one who creates most of the threads. He was merging his own mixes in sd1.5 era. The schism began when some other guy began to harass this individual artist.
>>
>>108058020
oh no no no not another schizo can of worms, keep quiet
>>
>>108057992
It gets easy when you get more experience. For "booru" tags they are simple definitions and you can always double check with a search.
llm will lie to you but of course it works because tags are simple etc
>>
File: ComfyUI_01533_.png (202 KB, 320x425)
>>
>I will bully jannies by potraying them fucking a hot semen demon
lol
>>
>>108058063
>hot semen demon
Uh... em... bro? Uh, isn't she... like, 15? Wtf?
>>
>>108057281
At least it's better than SDXL, not exactly hard
>>
>>108058073
I don't follow every weeb shit
>>
>>108058073
chocola is 100% underage, but he's genning her as an adult so its cringe
>>
>>108058072
i dun get it
>>
>>108058097
rebecca is just a short woman, peopl are fucking retarded
>>
>>108058058
ok, thread schizo
>>
>>108058088
she was once a minor so its still creepy as fuck bro
>>
>>108057281
>>108057530
yeah I really hate having to type out verbose prompts so Anima is really tiresome to use imo
I can kinda get by with just prompting booru tags and then specifying if I need a certain composition ("girl X is on the left"), but that doesn't seem as reliable as going full purple prose schizo paragraph mode and I hate that
>>
>>108058111
wait anima doesnt use booru?
>>
>>108058124
>rebecca is fat
>>108058105
>>
>>108056497
>>>108054932
>>>108054948
>These are incredible. What is your prompt for these?
ty. They were too long to inline https://pastebin.com/raw/QzTLJg2S
>>
>>108058124
I can appreciate your artistry. Not that many anons are so willing to show their creations.
>>
>>108058135
can you do a brown girl with black lipstick
>>
File: 1766180245423672.jpg (1.37 MB, 5000x5562)
>>
Im (rectifiedly) flooowing
https://files.catbox.moe/3zvc5x.jpg
>>
>>108058157
don't click this fellas
>>
File: ComfyUI_01542_.png (3.81 MB, 1843x1843)
>>108058088
>>108058097
I love how the current gen of zoomer foids and troons on xitter get into flame wars and arguments over her actual fictional age, and if it's okay to like nekopara but disvow it for "problematic content".
It's funny because you had these people cosplaying as chocola and yet captioning how they "only like the designs". Then did a 180 when they found out Sayori (the creator) was a normal
*bio* woman with a family/kids and told them they were dumb for arguing over a work of fiction.
>>108058012
Actually having a bit of difficulty getting good results, I mean Psyduck doesn't have a mouth... just a bill.
>>
>>108058129
I said she's chubby-coded, which she is. And I said she's babyfat coded because of the neotenousness of her baby face and short stature

You'd have to be literally Arab to think that she's not overweight. in a cute way but still overweight nevertheless


Oh and the West is run by a gang of demonic pedophiles so schizo is no longer an insult unfortunately

>>108058138
You should appreciate my passion imo, my creations are slop. I wish more people cared about expressing themselves instead of being afraid and censoring

>>108058143
You haven't seen anything yet anon, this is the off-season. When it's go time, it'll literally be a pedophile ass thread (I prefer the term bummy because it's cuter)
>>
>>108058166
uhh this pic looks weird, can you stop posting weird things?
>>
new
>>108058181
>>108058181
>>108058181
>>108058181
>>
>>108058125
It does both booru and natural language
>>
>>108057861
i'm sorry there are 0 reasons to use this instead of anima
>>
>>108057861
ChenkinNoob was already DOA as it's just a slightly updated Noob 1.0 epsilon, adding snakeoil wont fix that
and it being SDXL is even more of a detriment now that we have Anima coming up, literally no reason to use this
>>
>>108058266
>>108058266
>>108058266
>>108058266
>>
*BRAAAAAAAAAAAP


