/g/ - Technology




I Am Artist Now Edition

Discussion of Free and Open Source Diffusion Models

Prev: >>107798332

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/kohya-ss/sd-scripts
https://github.com/tdrussell/diffusion-pipe

>Z Image Turbo
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo

>WanX
https://github.com/Wan-Video/Wan2.2
https://comfyanonymous.github.io/ComfyUI_examples/wan22/

>LTX-2
https://huggingface.co/Lightricks/LTX-2

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Illustrious
https://rentry.org/comfyui_guide_1girl
https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe|https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
>>
4 (FOUR) of my gens made it to the collage i am eating tonight boys
>>
why does tran concern troll about someone wanting to dox her but doxxes ani all the time including his family?
>>
>>107800934
>>Maintain Thread Quality
>https://rentry.org/debo
>https://rentry.org/animanon
why are there off topic links in the OP? I don't want to maintain low thread quality anymore
>>
>>107800934
stop spitebaking retard
>>
File: sdfsdfsdtled.png (92 KB, 1099x454)
another vram reduction for ltxv
>>
Blessed thread of frenship
>>
>>
>>107800919
a young woman is sitting on a gamer chair she quickly unstraps her bra and changes it for a pink one due to skin irritation on her shoulders.

https://files.catbox.moe/wfj9vf.mp4
>>
>>107800995
what? i'm sure i tried that one. ah right, they are also loading in the checkpoint. It does this anyway if you use --lowvram: it offloads the clip model to ram, so i doubt it will make a difference.
>>
>>107800998
artist?
>>
>>107800975
he revels in drama like an actual woman
>>
>>107801024
egon schiele
>>
>107800965
>107800975
>107800986
Space these out a little more next time so it's not so obvious
>>
>>107801003
at least it was not a powerpoint still image. All this thing now needs is loras
>>
>>107800995
>>107801018
Even without it, as long as you have enough ram it will properly offload what it needs.
>>
>>
>>107801003
can it do groping? What about sensual movements? Can the audio swear?
>>
>>107801003
magic!
>>
Anyone got this working yet? https://huggingface.co/lodestones/Zeta-Chroma/tree/main
>>
>lora of a filipino girl
>prompt her with white skin
>turns out indian
>>
>>107801029
>>107801048
Did you train it yourself? Pretty accurate IMHO.
>>
>>107801046
>Even without as long as you have enough ram it will properly offloads what it needs.
actually on some machines it doesn't; by default I think it tries to fit everything in a run into vram. If I don't use --lowvram for ltx2 or wan it will oom
>>
>>107801081
Did you write "South Asian" in the prompt somewhere? People tend to mix South and Southeastern Asian.
>>
>>107801029
Nice! How about you try running it through img2img with an anime model? For example, you could use JuggernautXL and see how it turns out.
>>
>>107801095
ye
I think it needs more training though
his style is very brutal and these gens are still modest
>>
>>107801113
>For example, you could use JuggernautXL
hearty kek
>>
>>107801078
even if you could run it, it's early stages and probably really shit
>>
so I finally got LTX2 to work on my 4090.
can I do more than 5 seconds? whats the max frames I can do before it craps out?
(dunno if it matters but I only have 64GB RAM.)
>>
I said missile, sorta worked, kek:

https://files.catbox.moe/matnqp.mp4
>>
>>107801170
I can get 10s in 65s with a 4080 (16gb) and 64gb ram, or 240 frames like >>107801194
>>
>>107801194
I like it better when you gen mikus and ryan goslings desu.
>>
File: f.mp4 (2.01 MB, 1280x704)
>>107801170
i think it also depends on how much RAM+swap you have. like the other anon, I get to about 10s / 250 frames, it doesn't really become worse towards the end either
>>
>>107801206
daily news topic is retard who tried to run over cop.
>>
File: 1761182051228218.jpg (41 KB, 656x586)
>>107801140
preach
>>
>>107801158
I tried running the chroma radiance x0 models (regular radiance v04 works) but they all produce static. Got to update comfy or something... don't I?
>>
>>
>>107801220
>>107801204
thats actually pretty good.
man I wish I had more RAM.
fuck I remember when I only had 64mb RAM and all the cool games needed 128mb RAM.
>>
File: radiance.png (1.99 MB, 848x1488)
>>107801235
the radiance snapshots work fine with pretty standard settings apart from using the "pixel_space" vae, as far as I know

the update that added support for Radiance to Comfyui was a while ago, surely you're not running a version THAT old?
>>
cool explosion:

https://files.catbox.moe/2htd80.mp4
>>
File: LTX-2_00007_.mp4 (862 KB, 448x832)
>>107801170
4090 brother, we have to play with the settings. At 840x480 I can generate 10 seconds; I tried to generate at 1280x720 but got an OOM error. I will try more optimizations. Be sure to use fp8 models

https://files.catbox.moe/iewqol.mp4
>>
I don't believe JoyCaption can differentiate between different women in a mass tagging situation. That is, caption consecutive images but ONLY label a specific girl by name with the rest being "a woman". This is unfortunate.
>>
>>107801275
>be sure to use fp8 models
I use
>ltx-2-19b-dev-fp8.safetensors
thats the right one right?
>tried to generate at 1280x720p but I got an OOM error
I can do 1280x720 for 9 seconds (t2v), what workflow are you using? I use the template from comfyUI.
>https://files.catbox.moe/iewqol.mp4
nice video
>>
>>107801291
can any of those segmentation models crop out each individual subject so it can process them one by one
just wondering, not advice
>>
>>107801262
Yes, I use an old version (0.3.75) because everything works the way I need it to. I tried updating multiple times and it renders some of my custom nodes that I still use useless. The regular radiance Q8 works fine though but would be nice to try out the others. Might have to make a separate comfy install and point all models to that folder or something.
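If you do spin up a second install, ComfyUI can read models from another folder through an extra_model_paths.yaml placed next to main.py, so nothing has to be duplicated. A rough sketch, with the section name and base_path as placeholders and only the common folder keys listed (check the extra_model_paths.yaml.example that ships with ComfyUI for your version's exact keys):

main_install:
    base_path: /path/to/main/ComfyUI/models/
    checkpoints: checkpoints
    loras: loras
    vae: vae
    controlnet: controlnet
    upscale_models: upscale_models
    embeddings: embeddings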
>>
File: LTX-2_00011_.mp4 (678 KB, 448x832)
>>107801299
I'm using the wf that the jiggle test anon posted, just changed the models to fp8
>>
You're getting ooms because none of you are using the reserve vram arg, are you?
>>
>>107801345
tried and it failed on the first pass instead of the second
had better luck with disable cuda malloc, but had even better luck with wan2gp
>>
>>107801363
--lowvram --reserve-vram 4.0 --use-pytorch-cross-attention
>>
>>107801393
yeah I know, doesn't work for me
I can show you proof of it not working if you don't trust me
>>
>>107801363
>second
don't need imo, kj wf does not use it, no upscaler either.

and if you fail at that stage you might want
--disable-pinned-memory --disable-smart-memory --lowvram --reserve-vram 4.0 --use-pytorch-cross-attention
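For anyone copy-pasting: these are ComfyUI launch arguments, so they go on the main.py command line (or into whatever launcher script you use). A minimal sketch assuming a stock ComfyUI checkout; the path is a placeholder and the exact set of memory flags depends on your ComfyUI version, so treat it as a starting point:

cd ComfyUI
# reserve ~4 GB of VRAM for the OS/other apps and force aggressive offloading
python main.py --lowvram --reserve-vram 4.0 --use-pytorch-cross-attention
# if it still OOMs, also turn off smart/pinned memory as suggested above
python main.py --disable-pinned-memory --disable-smart-memory --lowvram --reserve-vram 4.0 --use-pytorch-cross-attention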
>>
>>107801401
I'm guessing it just locks up at the second ltx video-in-place node, whatever it is called? Your system is already struggling with a tiny 256mb buffer and can't even move other things into swap because pinned memory has it locked up; that is why it shits the bed.
>>
>>107801401
swap or page file big enough?
>>
>>107801345
Doesn't comfy automatically block swap or something? I'm confused how it works. Also doesn't using --lowvram --reserve-vram make things slower?
>>
>>107801407
is the lora useless too? haven't tried running without it since I don't know what it does
>>
>>107801461
the lora is for the full model to make it like the distilled one, kind of pointless really.
>>
Why is the comfyui template so much more complex than the KJ workflow?
>>
I'm trying to make a lora for my OC using the netayume model, but onetrainer doesn't support lumina models anymore, what can i do bros?
>>
>>107801487
Subgraph weirdness removed any pretext for the need to be readable.
>>
>>107801492
>anymore
use an old commit? Idk.
>>
is base out yet
>>
Tried so hard to get this to work. Literally spent an hour doing this. I need a better hobby.

https://files.catbox.moe/d7tw53.mp4
https://files.catbox.moe/zc0p5h.mp4
https://files.catbox.moe/f8nosi.mp4
https://files.catbox.moe/iv2r3v.mp4
https://files.catbox.moe/3irimr.mp4
https://files.catbox.moe/yqwrf5.mp4
https://files.catbox.moe/h5pc94.mp4
https://files.catbox.moe/r6hcj7.mp4
https://files.catbox.moe/jwkmxq.mp4
https://files.catbox.moe/n88c9w.mp4
https://files.catbox.moe/8pruy3.mp4

I'm less apprehensive about ltx-2 but still think it's weaker than wan in a lot of ways. Obviously a finetune or loras would make up for it a lot. It's crazy how fast it is, not looking forward to going back to wan and waiting 3+ minutes for 81 slow mo frames.
>>
https://files.catbox.moe/m91zzg.mp4
>>
https://files.catbox.moe/oxh3he.mp4
>>
LTX2 is actually a huge step for local video generation and memes and all kinds of other stuff. its like the moon landing event of local AI.
>>
damn i didn't know baking anime 1girl on zit is so shit
>>
https://files.catbox.moe/cwwblt.mp4
>>
in the ltx2 workflow, image strength is 0.6 default, is this like denoise strength? should it be at 0.6 or what happens if you set it to 1 or 0?
>>
>>107801515
>still think it's weaker than wan in a lot of ways
Its first/last frame capabilities aren't great at all, but it can do a lot with sound and can add sound to previously mute clips.
>>
>>107801515
This model is so shit.
Loras won't fix it, lol; not only does it feel like a much smaller model, but the anatomy and movement are all kinds of fucked.
>>
File: ltx.mp4 (624 KB, 1280x704)
>>107801248
>thats actually pretty good.
it is pretty good. it gens faster, it has audio

it IMO also isn't as good at following the prompt as wan is and does quite a lot more strange things

> I remember when I only had 64mb RAM and all the cool games needed 128mb RAM
i remember when that was an issue with some map sizes and such

128GB RAM and you can probably animate the games from then for a few seconds now
>>
>>107801515
It's an impressive model for all the things it can do. Problem is it just doesn't do any of them particularly well
>>
>>107801554
>*goes back to check 8 minute long gen of 5 second clip of penis going into vagina*
>>
File: 256.png (944 KB, 1024x1024)
>>
https://files.catbox.moe/40038i.mp4
>>
>>107801557
Yeah, it's not useful for anything complex; at best it's good for generating UGC crap or youtube shorts, some slop that people will try to monetize
>>
>>107801561
81 frames takes like 90 seconds on my pc at 720p, have you tried not being a poorfag?
Also I've been playing with the SVI loras and it can do 25sec videos, still needs to be better tho.
>>
File: Screenshot.png (161 KB, 2078x457)
LOL, VRAMLETS ON SUICIDE WATCH!!

Qwen (not ZiT) is releasing their own anime model too! In a few months you're all gonna be sitting here watching MikutesterGOD with his infinite VRAM who doesn't even like anime, wasting it on brain dead tests. Yes, you will be here seething and drooling as this anon with zero artistic vision runs the biggest anime local model you will never get to touch.
>>
>>107801515
>Tried so hard to get this to work
I'd rather autistically test on nice boobs than any meme
>>
>>107801577
Sorry China. You got outplayed by Israel this time.
>>
>>107801580
It won't be uncensored. And if it's too big, people won't use it. Flux was objectively better than SDXL, but if you compare them, the latter has much more of everything.
I'm already expertly coping with quality shit like nano banana pro or arguably NAI being API only, this won't hurt me.
>>
>>107801601
Flux had a lot more issues than just being big.
>>
kneel to GEORGE FOID:

https://files.catbox.moe/g78qpe.mp4
>>
>>107801569
why don't mine look like that
https://files.catbox.moe/kydxo1.mp4
>>
https://files.catbox.moe/kuhevr.mp4
>>
Was beginning to freak out that I got snubbed again, then realized I didn't post any gen last thread
>>
>>107801580
I've got more freedom with styles using Loras in SDXL. I'm thinking it might actually be good for generating complex scenes or inpainting tricky spots
>>
>>107801556
>it gens faster, it has audio
so how did they do it? last I remember, people said running video with audio locally wouldn't be possible,
or that videos beyond 5 seconds weren't possible.
>>
>>107801580
I have zero faith in this or the hypothetical Z Anime. They will filter all NSFW, filter all artists, then post-train it with RLHF so it looks slopped as hell and has low seed diversity. People will keep using SDXL because it's fast as fuck, uncensored, and you can actually make good looking stuff with it.
>>
>>107801580
>In a few months you're all gonna be sitting here watching MikutesterGOD with his infinite VRAM who doesn't even like anime, wasting it on brain dead tests. Yes, you will be here seething and drooling as this anon with zero artistic vision runs the biggest anime local model you will never get to touch.
sides in orbit
>>
https://files.catbox.moe/an7anp.mp4

Time to die skankurai
>>
The smeary vaseline people are just awful. Honestly worse than CogVideo quality
>>
i feel like we regressed with ltx
>>
https://www.reddit.com/r/StableDiffusion/comments/1q718rh/wan_22_longform_video_svi2pro_s2v_lipsync/

LTX mogged hard by WAN
>>
holy shit, it knows Trump mannerisms.

https://files.catbox.moe/6t6jc1.mp4
>>
>>107801761
>Pumped full of fake VHS artifacts to make the video look less shit.
>>
>>107801772
even with all those artifacts it still looks better than any LTX video gen
>>
>>107801772
better than your blurry mess
>>
>>107801580
>>
>>107801772
>>107801782
This is as good as Wan will ever be. Now is as bad as LTX will ever be. The walls are closing in, Xi.
>>
ZiT simply does not know what a microbikini looks like. Sadge.
>>
I need someone to give me an idea for a character to gen. My brain's not coming up with anything today
>>
>>107801812
Maybe you should try LTX2.
>>
>>107801812
Maybe try discord? All the avatarfags are there.
>>
>>107801550
How many images are you using in your dataset?
>>
>>107801793
no nose?
>>
>>107801580
qwen is shit
you're shit
suck my penis
>>
>>107801812
Iwakura Lain
>>
>>107801812
if you need help making 1girls then it's time to take a break
>>
uhhh
>>>/wsg/6067723
>>
Kijai my nigger where's workflow
>>
So much better, prompt length really matters
>>>/wsg/6067724
>>
>>107801851
calm u're tits, it's 7.35 am
>>
>>107801812
use prompt-manager, I don't even bother writing prompts anymore, I just give it a sentence and it spits out a 4 paragraph prompt in natural language
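For anyone who wants the same effect without a dedicated extension, the idea is just "send a one-liner to a local LLM and ask for a long natural-language prompt back". A minimal sketch, assuming an OpenAI-compatible local server at localhost:8080; the URL, model name, and function name are placeholders, not prompt-manager's actual code:

import requests

def expand_prompt(idea: str) -> str:
    # ask a local LLM to blow a one-line idea up into a detailed image prompt
    payload = {
        "model": "local-model",  # placeholder: whatever your server exposes
        "messages": [
            {"role": "system", "content": "Rewrite the user's idea as a detailed, 4-paragraph natural-language image prompt. Output only the prompt."},
            {"role": "user", "content": idea},
        ],
        "temperature": 0.8,
    }
    r = requests.post("http://localhost:8080/v1/chat/completions", json=payload, timeout=120)
    r.raise_for_status()
    return r.json()["choices"][0]["message"]["content"]

print(expand_prompt("1girl sitting in a neon-lit arcade, film grain"))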
>>
>>107801872
You mean an llm? I don't have the power for video gen and an llm at the same time
>>
>>107801828
I've posted a total of ~15 images in 8 days. "Avatarfag" lmao fuck off.

>>107801838
It's been time to take a break from 1girl for about 3 years now. I'm never going to take that break.

>>107801815
12gb vram. If I had more I would.

>>107801837
Doesn't strike me as much of a 1girl, but then again I haven't seen the show so maybe that's why. Some good outfits for sure. Much to consider.

>>107801872
I don't have any trouble converting ideas into prompts, I just need the ideas. I'd rather get those from human beings
>>
>>107801862
? It's 11:30 in Delhi now, anon.
>>
>>107801895
>I don't have any trouble converting ideas into prompts, I just need the ideas. I'd rather get those from human beings
give the llm an extremely rough random prompt, it'll spit out some crazy long thing with a whole bunch of shit you didn't include but is kinda related, that will give you ideas to further refine and repeat
>>
>>107801895
Still using an avatar image. Do you think this is my first day on this board, itts?
>>
>>107801915
>I know you avatarfag, because I've only ever seen your posts (which I know by the images) with an image attached
>>
>>>/wsg/6067729
>>
>>107801895
>Much to consider.
Indeed. She is kino.
>>
>>107801918
bro youre avatarfagging, posting these black and white neta thots, no need to cope about it. hope you liked this (you)
>>
People should stay in ran's discord. This is getting out of hand again. They want attention.
>>
>>>/wsg/6067734
>>
Can anyone tell me why the 19b model looks so bad?
>>
File: 1754332010982235.png (1.67 MB, 1024x1472)
came to rescue the thread with 1 girl
>>
>>107801970
needs to be more bloated. 50b at least
>>
https://files.catbox.moe/1umd0j.mp4
>>
>>107801970
can you articulate your thoughts any better than that? It would promote a more constructive conversation to discuss what elements of the model are lacking, because I agree it's a regression in some areas
>>
File: 1767578095571229.png (2.02 MB, 1024x1472)
>>107801982
80b at least
>>107801988
do you have eyes? lmao
>>
File: LTX-2_00039.mp4 (2.08 MB, 1280x704)
>A black and white noir film scene depicting a man and a woman, walking together through an open doorway. The woman, in a knee-length coat, has shoulder-length hair. The man, wearing a suit and hat, appears tense looking around nervously. Behind them is thick smoke, partially revealing city streets at night
LTX2
>>
>>107801999
is this real?
>>
File: 1754593074810585.png (2.15 MB, 1024x1472)
>>
>>107801999
Same prompt Wan 2.2 SVI Pro
>>
>>107801958
kek
>>
>>107801990
r u brand new to z
>>
>>107801999
>>107802007
Noir aesthetic and prompt adherence is better on LTX but the people look like blurry blobs until they get close to the camera
>>
>>107802005
Mommy Issues: The Image
>>
>>>/wsg/6067744
>>
>>107801999
>>107802007
You will never be noir mario smoking a cigarette walking down an alley
>>
File: Z controlnet very gay.jpg (73 KB, 1087x1003)
Do I have to use some special node to load z controlnet or what
>>
>>107802070
i forgot
>>
>>107802069
>The question mark on the wall
I remember that kino
>>
CARLOS!

https://files.catbox.moe/tttzio.mp4
>>
I thought it'd be handy to have a collection of selfie pics captioned to use in prompts but women only do like three poses total it's so fucking boring
>>
>>107802097
It's not really a coincidence that any model tends to default to a female portrait. The internet holds trillions of social media images of women posing for their phones.
It's somewhat funny, proof of a dying culture in which narcissistic tendencies are more important than figuring out why exactly we still can't travel outside of planet earth etc.
Their AI training will also kill itself but for that bubble to burst that will take more than 5 years.
>>
>>107802074
>i forgot
explain
>>
>>>/wsg/6067750
>>
>>107802125
What does that have to do with the post
>>
do you think we'll get the z-image base this week? haha, nah i'm just kidding. of course we're not lmao
>>
>>107802156
benchod
>>
>>107802156
stage 7
their momentum must be dead by now
>>
>>107802070
It goes in the model patches folder not controlnet. Use model patch loader and the qwen controlnet nodes
>>
>>107802070
its the qwen image thing. it's named qwen something. but it fries any attempt to combine it with loras so i dont see any point.
>>
>>107802152
>irrelevant moderator larp attempt
Congrats, enjoy the smell of shit in the morning. You have earned it.
>>
File: 1752143245698503.png (19 KB, 504x173)
can this node take a gguf quant of gemma?
>>
File: o_00195_.jpg (384 KB, 1072x1376)
>>107802170
>but it fries any attempt to combine it with loras so i dont see any point.
That sucks. Some qr-code monster style controlnet could be useful.
>>
>>107802178
maybe
>>
>>107802156
patience
lodestone is on the case
>>
stop fucking with comfy and just use this https://www.reddit.com/r/StableDiffusion/comments/1q6zb57/comment/nycrhpl/
>>
File: 1764536058835183.png (2.84 MB, 1632x928)
>>
>>107802164
>>107802170
theres a native zimage fun node btw
>>
>>107801334
NTA but same setup (4090 + 64gb, fp8 model) and wf. I'm not getting an OOM error trying to gen a 10s video, but when I hit the second sampler the ram usage jumps up to 30gb and it slows down as expected, any ideas?

Can't wait till my RMAd 5090 comes back and I don't have to worry about this anymore.
>>
>>107802226
no
>>
>She wears an extremely revealing micro bikini in white with white side-tie strings. The triangle top consists of two small white fabric patches, barely covering the essential areas of her full, rounded breasts, connected by thin white halter straps tied behind her neck and back. The bottom is a minimal thong-style piece with a tiny white triangular front panel, attached to thin black strings that tie at her wide hips in small knots, exposing most of her thin waist, wide hips, lower abdomen, and round buttocks. The woman has a hourglass figure with noticeable hips and a thin skinny small waist.

>still makes them fat
>>
>>107802260
then keep crying about comfy's shitty results then, its completely fucked
>>
>>107802269
benchod
>>
File: o_00198_.jpg (355 KB, 1072x1376)
>>
>>107802189
>>107802279
vomit inducing.
how old are you? 12?
>>
kek

https://files.catbox.moe/3mjl4t.mp4
>>
>>107802282
meant for >>107802247
>>
>>107802290
? ?
>>
>>107802286
Let me guess, this poster is 16 years old, from NA.
>>
>>107802290
>DUDE LMAO BIG TITTIES XD
kys unironically
>>
>>107802290
grisha go back to the frontlines
>>
>>107802279
Is this AnnaSophia Robb?
>>
File: file.png (121 KB, 772x94)
>>>/wsg/6067773
>>
uncensored https://huggingface.co/FusionCow/Gemma-3-12b-Abliterated-LTX2/blob/main/gemma_ablit_fixed_bf16.safetensors
>>
https://files.catbox.moe/dbmpx7.mp4
sure, that's what I wanted
>>
>>107802317
virus
>>
>>107802318
lmao i loled
>>
>>107802317
Abliterated versions take a huge hit in performance, heretic behaves much closer to the original, just with no (or almost no) refusals
>>
>>107802336
yet you didn't post it, curious
>>
>>107802336
heretic?
Can it also be dropped in as is?
>>
>>107802309
partially yeah, running few loras
>>
>>107802341
Because I've yet to even try ltx2, I just know that abliterated models get dumbed down hard because I've messed around with LLMs before
>>107802344
That's the thing, I don't know https://huggingface.co/models?search=heretic%20gemma%2012b
>>
>>107802336
In performance? How exactly?
>>
>>107802355
why havent you tried ltx2 you fucker
>>
first of all:
https://huggingface.co/spaces/DontPlanToEnd/UGI-Leaderboard
2nd of all, using ablit models is fucking useless if the model has been trained with the normal model, the 'unlocked' tokens will mean jack shit to the model. you're a bunch of retards.
only thing it's partly useful for is for prompt rewriting
>>
>>107802359
Worse context comprehension, they start hallucinating hard. Basically unusable for erp
>>
>>107802359
what happens when you lobotomize a human? yep something like that
>>
>>107802361
Because I only have 16gb vram (+8gb on 2nd gpu) and 32gb ram, waiting for goofs
>>
>>107802367
You are not very technical.
>>
>>107802374
why did you do this?
>>
>>107801601
getting a model like NAI but with better natural language prompting is all i want at this point, API or local. too bad anlatan is taking 500 years with their "director model" shit and local (nsfw/porn wise) has been playing the waiting game for a good base model ever since illustrious.
>>
So the slideshows happen because Gemma, instead of processing your prompt, shits out "I apologize but I won't proceed with this request", right?
>>
File: 1738936332586096.png (1.5 MB, 1152x1344)
>>107802394
you dropped this
>>
File: o_00208_.jpg (322 KB, 1072x1376)
>>
need more dedicated wam
>>
can we use an existing video as input to generate audio for it in ltx2? hunyuan foley is still broken for comfyui
>>
>>107802418
indian
>>
I give up.
A+V is amazing and I hope it forces all other companies to add audio gen to their models (even if the audio quality in LTX is poor), but the quality is mediocre at best.
There's no value in generating 5x faster than WAN if it takes 10x more attempts to get anything salvageable.

The comedic timing is fantastic though:
https://files.catbox.moe/wdkhjv.mp4
>>
>>107802441
yeah
>>
>>107802449
Go post interior gens in your snark general
>>
>>107802448
oh she gave up too lmao
>>
>>107802455
>interior gens
what did she mean by this
>>
>>107802448
Out of curiosity, did you make a wan version of this prompt? Want to see how it would handle it
>>
>>107802448
Anon it's only been a day, people are yet to find optimal settings and there are no loras yet
>>
>>107802448
Why not use some dedicated audio model to make audio and use that to drive the video gen?
>>
>>107802418
More big tits
>>
Question about performance for those with a 3070/3070Ti; I'm out of the loop and my knowledge dates back to that era: how long are your gens (roughly, I know it depends on a lot of shit) for, say, on ComfyUI, a 1024x1024 SDXL image with one or two LoRAs through a first KSampler, then a highres-fix pass with a slight upscale through a second KSampler, a face detailer pass and maybe another detailer, and finally an upscale x2 pass (I'm using Ultimate SD Upscaler)?
I'm on AyMD (6800XT) and wondering how much of a performance hit I get on Windows compared to the equivalent Nvidia cards (yeah, I'm looking for problems by using both Windows and AMD, but now there is native ROCm support for my architecture; before that it was ZLUDA, which werked quite well all things considered).
Generating images through the KSamplers is rather quick, but god damn, that 2x upscale with Ultimate SD Upscale takes like 140-180s (using tiles). I tried other means of upscaling to replace that node, but either the results didn't look as good or it took longer.
>>
File: leave it alone.gif (2.26 MB, 498x281)
>>107802415
>"lol shill"
Anon...
>>
>>107802479
>the redditboys .gif reaction image
I accept your concession
>>
https://files.catbox.moe/tvj5au.mp4
>>
>>107802485
fuck off the boys is based
>>
File: file.png (112 KB, 566x314)
>>107802489
>>
File: 1742973811593069.png (2.2 MB, 1344x1152)
>>107802503
ok dude, keep watching fags making out on tv, ill be here genning my 1girls
>>
>>107802538
>1girl
i know you self insert as her
>>
>>107802566
I would self-insert in her if you know what I mean.
>>
File: 00011-43.png (1.38 MB, 1024x1344)
I give up on LTX2, can't do anime that well. 3dpg is worthless to me.
>>
>>107802582
>3dpg
what's the g for?
>>
>>107802538
Why did you stop posting in /sdg/?
>>
>>107802286
I wonder if there is some prompt to stop it trying to add moving lips to a mask.
https://files.catbox.moe/5mydeg.mp4
>>
File: ComfyUI_00006_.mp4 (123 KB, 416x624)
>>
>>107802582
>>>/wsg/6067803
What do you mean, it works perfectly for anime
>>
24GB vramlets can't run LTX-2 yet, huh?
>>
>>107802640
>24GB vramlets can't run LTX-2 yet, huh?
actually yes we can, up to 10seconds and at least 720p.
its great.
>>
File: LTX-2_00025_-1.mp4 (3.36 MB, 1280x704)
Oh my fucking god, this makes so much sense, poojeets made the ltx2 model.

I did t2v, with nothing in the prompt.
>>
>>>6067806
Ugh, AI board when. Please...

>>107802640
Have been able to generate 15 second videos at 720p with a single command line argument since it came out.
>>
>>107802658
>>>/wsg/6067806
Fuck.
>>
>>107802654
i know that actress
>>
>>107802652
>>107802658
The main model and the text encoder are too big to fit into vram. Is there a quant that I'm not seeing somewhere? Got a workflow?
>>
>>107802666
is the video real?
>>
>>107802671
yeah its on discord
>>
File: 1750580773269609.png (54 KB, 830x357)
>>107802156
boy do i feel stupid
>>
>>107802589
I never posted there, sorry (not sorry!)
>>
>>107802671
you also need at least 64GB RAM and then its doing a bunch of offloading shit, the template workflow works fine too.
there are also other workflows around, the guy who did the titty jiggle comparisons has a different workflow and posted it here too.
>>
>>107802156
you are so wrong about everything you actually made it happen
>>
>>107802683
PATIENCE WILL BE REWARDED GWAILO!!!!
lmao
>>
ltx 3 when?
>>
File: 1750111723929202.png (2.4 MB, 1344x1152)
>>107802699
2.1 coming soon iirc
>>
>>107802676
Yes very real. Happened to me.
>>
File: 1746554112454023.png (341 KB, 718x400)
https://www.reddit.com/r/StableDiffusion/comments/1q6zb57/ltxv2_now_works_on_wan2gp_on_as_little_as_10gb/
lmao this is great
>>
File: LTX-2_00030_-1.mp4 (3.18 MB, 1280x704)
This is horrible.. 100% hitrate on jeet material.

Please, someone else try it, ltx2, t2v, no prompt.
>>
It's funny that everyone thought I was trolling when I brought up Chinese culture.
>>
https://github.com/modelscope/DiffSynth-Studio/commit/0efab85674f2a65a8064acfb7a4b7950503a5668
z image base soon?!
>>
>>107802640
anon I can do 10 seconds easily with a 4080 and 64gb ram. can prob do more.
>>
for ltxv2 you NEED paragraph long prompts it seems or else you get garbage
https://www.reddit.com/r/StableDiffusion/comments/1q6k2a3/definition_of_insanity_ltx_20_experience/
>>
>Someone farts in China
>Hurries to 4chan
>Posts link from 青龍聖者 twitter
>"I just got a whiff of a fart and it's definitely going to be open sourced soon.*
>Disappointment averted for 2 more weeks.
>>
>>107802748
i2v is a stickler in particular, gotta unambiguously refer to everybody to absolutely zero in on them, or else it conjures new people or makes it a voiceover
>>
>>107802755
I learned this when I was i2v nude women and when the camera zoomed out there was a random Indian man squatting in the corner leering at her.
>>
>>107802730
are you dalit? bloody whats wrong?
>>
>>107802752
unironically this, TONGYI BROS WE WON!!!
>>
Z-image-Video-Audio-Edit Turbo Base soon.
>>
>>107802729
already fixed stop peddling your toy
>>
I can't wait for base to release and literally nothing to happen, you people are like a cult
>>
Z-generate-a-literal-waifu-in-my-actual-living-room Base soon
>>
>>107802778
Honestly not sure why the random anon cares about the base model, it's only useful to the guys who wants to finetune or train loras on it.
>>
>>107802792
this, stop wanting things stupid guailos, you'll get your scraps when the mighty chinese deems you worthy. you're all courting death
>>
>>107802792
We just want to know that the future of local will have the option available for finetuners like Lodestone, NoobAI team, etc. There's also a fair few of us who do train lora. I keep failbaking my Turbo loras lol.
>>
>>107802798
whats a lodestone
>>
>>107802680
Which discord
>>
>>107802808
Lodestones
>>
>>107802689
>you also need at least 64GB RAM
Fuck
Guess it's over for me
>>
>>107802810
what is that
>>
>>107802819
you dont have 64GB RAM?
>>
>>107802829
Why would I have 64 GB RAM?
>>
>>107802819
Guess you should have thought of that before a megacorporation with more money than the entire consumer market combined decided to buy up basically the entire supply of computer hardware in an insane gamble to create a digital god.
>>
>>107802798
I just want to train a lora of my waifu
>>
>>107802821
A gay furry. Do not dig further, you will turn into another thread schizo if you do.
>>
>>107802833
because you're on an AI thread? 64 is like the minimum for a decent experience
>>
>>107802808
maker of the chroma model, main hope for a big z image finetune
>>
File: 1757807651267863.png (1.33 MB, 1216x1024)
i'm gonig to bed. when i wake up i expect an update for ai-toolkit that lets me train loras on the base and edit model
>>
>>107802833
>Why would I have 64 GB RAM?
you can never have enough RAM bro. even 64 is already ramlet tier.
>>
>>107802855
you better sleep for two weeks then
>>
>>107802853
>main hope for a big z image finetune
okay lets not go too far now
>>
>>107802858
Was waiting to upgrade to ddr5, which didn't happen until the price hike
>>
wait for it...

https://files.catbox.moe/lu1nub.mp4
>>
>>107802778
>>107802792
>>107802797
>t. seething SDXL anons
>>
>>107802863
who else? pony destroyed all hope there. Illustrious went silent. novelai only releases models 2 generations older than their latest
>>
>>107802870
Yeah we hate good things
>>
>>107801837
I tried for a bit and it's just not going anywhere. Every gen slightly misses the mark on actually resembling the character, and if it did manage to resemble her I'm still not sure what I would want such an image to look like.

I did try. I don't think any of the results are worth posting.
>>
>>107802873
novelai makes image models?
>>
>>107802873
maybe the creator(s) of the Big_ series?
>>
>>107802873
>pony destroyed all hope there. Illustrious went silent.
qrd
>>
>>107802855
share your hitler lora faggot
>>
>>107802877
they have the best one, if you're willing to pay to use it on someone else's server
>>
>>107802873
I would take a random anon with a much smaller kino dataset than whatever abortion lodefag is going to shit out
>>
>>107802865
just build a new machine no matter what it costs.
>>
File: 1746776612096681.png (1.39 MB, 1024x1024)
>>107802884
you dropped this
>>
>>107802884
Lumina is much better, stop trying to trick people
>>
>>107802880
latest pony was horrible, with retarded mistakes like using auraflow, which was auto-trained on fucking shit like censorship messages from ideogram; illustrious I think fell apart after asking 400K for the next model
>>
>>107802873
i'm waiting for Indian_Sir's base + bobby-vagaine lora merge
>>
>>107802892
illu is closed source anyway, who the fuck cares?
>>
>>107802885
you are a retard, chroma is currently the best model for nsfw gens
>>
>>107802884
All the images from their new model have that furry dirty look. Another set of morons who thought it was smart to add e621 shit.
>>
>a single commit from chinaman
>a full two weeks of schizoposting from you
>>
>>107802898
>>107802892
you can get a good model using aws bedrock for 100k, 200k for an amazing one
>>
>>107802898
just giving you the list. No one but lodestone is left it seems
>>107802902
lol nothing else does anime even close to as well, period. I'm hoping a lodestone tuned z image fixes that
>>
>>107802730
oh anon-kun you're so based uwu
>>
>>107802910
>>107802902
why not work for nai and figure out
>>
>>107802900
shame about everything else tho since it sucks at keeping to a style
>>107802902
>Another set of morons who thought it was smart to add e621 shit.
retard. thats why noobai is kino and still the choice model (its derivatives) for XL sloppers
>>
File: 1751213874248074.png (983 KB, 1024x1024)
>>107802910
here. you dropped it again champ
>>
>>107802900
prove it, you can't
>>
>>107802910
novelai is a sidegrade to IL-based models in my opinion, it knows some characters better and it obviously does details better with the 16ch vae but you can make up for the 4ch vae with high resolution inpainting and you get all the benefits of local like different controlnets, ability to train custom loras very easily, and whatnot
>>
fresh when ready
>>107802907
>>107802907
>>107802907
>>
>>107802929
Thank you, bro.
>>
>>107802925
its base details are SO much better, it knows a ton more and even knows like 10 image characters and artists. Saying anything is close is cope, call me a shill all you want. That is why I want a z image finetune
>>
>>107802910
>lol nothing else does anime even close to as well
Sure, I can agree on v3, but their new model? man, hell no. Not to mention the utter retardation of their new prompting methodology
>>
>>107802934
it's easier to use but IL has a higher ceiling if you're willing to put in the effort, that's why I'd call it a sidegrade, but for many use cases yeah it's better
>>
V3 was ass, it didn't even do artist styles, now I know you're full of shit
>>
>>107802925
why does vae matter, you can just download it
https://huggingface.co/AuraDiffusion/16ch-vae
https://huggingface.co/ostris/vae-kl-f8-d16
>>
>>107802953
Yeah you like scat shit we get it furfag
>>
>>107802956
No I don't
>>
>>107802955
vae is used when training as well not just when genning, and I think vae adapters are not fully there yet for sdxl, not sure though, but minthy (rouwei creator) does have a promising 16ch vae adapter beta
>>
>>107802966
means nothing, you would have to retrain the model
>>
>>107802974
minthy said she is gonna do it
>>
>>107802979
then she should spend the 100k on training z image instead of shitty sdxl. It would destroy the whole purpose of using sdxl since it would not be compatible with anything anymore.
>>
>>>/wsg/6067150
>>>/wsg/6067150
>>>/wsg/6067150
Migrate.
>>
>>107803072
Why are we changing boards?
>>
>>107803146
Because that board allows posting videos with sound.
>>
>>107803163
Makes sense!
>>
>>107801580
but turbo autist has just a 4080
>>
https://arxiv.org/html/2412.05781v3
Open-Source Acceleration of Stable-Diffusion.cpp Deployable on All Devices

[...]image generation is time-consuming and memory-intensive. To address this, stable-diffusion.cpp (Sdcpp) emerges as an efficient inference framework to accelerate the diffusion models. Although it is lightweight, the current implementation of ggml_conv_2d operator in Sdcpp is suboptimal, exhibiting both high inference latency and massive memory usage. To address this, in this work, we present an optimized version of Sdcpp leveraging the Winograd algorithm to accelerate 2D convolution operations, which is the primary bottleneck in the pipeline. By analyzing both dependent and independent computation graphs, we exploit the device’s locality and parallelism to achieve substantial performance improvements. Our framework delivers correct end-to-end results across various stable diffusion models, including SDv1.4, v1.5, v2.1, SDXL, and SDXL-Turbo. Our evaluation results demonstrate a speedup up to 2.76× for individual convolutional layers and an inference speedup up to 4.79× for the overall image generation process, compared with the original Sdcpp, on the same M1 pro.
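The core trick being referenced is Winograd's minimal filtering: F(2,3) computes two outputs of a 3-tap convolution with 4 multiplies instead of 6, and the 2D conv version nests this transform over tiles. A tiny numpy sketch of the 1D case (illustrative only, not the paper's code):

import numpy as np

def winograd_f23(d, g):
    # two outputs of a 3-tap convolution from a 4-sample input tile,
    # using 4 multiplications (Winograd F(2,3))
    m1 = (d[0] - d[2]) * g[0]
    m2 = (d[1] + d[2]) * (g[0] + g[1] + g[2]) / 2
    m3 = (d[2] - d[1]) * (g[0] - g[1] + g[2]) / 2
    m4 = (d[1] - d[3]) * g[2]
    return np.array([m1 + m2 + m3, m2 - m3 - m4])

d = np.random.rand(4)   # input tile
g = np.random.rand(3)   # filter
direct = np.array([d[0]*g[0] + d[1]*g[1] + d[2]*g[2],
                   d[1]*g[0] + d[2]*g[1] + d[3]*g[2]])
assert np.allclose(winograd_f23(d, g), direct)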
>>
>>107802279
chroma 2k or whatever? looks like some upscaled chroma or zit enhanced chroma, post workflow



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.