/g/ - Technology



Discussion of Free and Open Source Diffusion Models

Prev: >>107782219

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/kohya-ss/sd-scripts
https://github.com/tdrussell/diffusion-pipe

>Z Image Turbo
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo

>WanX
https://github.com/Wan-Video/Wan2.2
https://comfyanonymous.github.io/ComfyUI_examples/wan22/

>NetaYume
https://civitai.com/models/1790792?modelVersionId=2485296
https://nieta-art.feishu.cn/wiki/RY3GwpT59icIQlkWXEfcCqIMnQd

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Illustrious
https://rentry.org/comfyui_guide_1girl
https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe|https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
>>
We are so back
>>
>he doesnt post here anymore! he just came by to say happy new year
>>
>>107784557
>I hope someone makes an nvfp8 of wan 2.2 model.
q8 will still be better. Originally I included a "probably" in the previous sentence but then I remembered I actually understand the math and there is absolutely no way you can do better than q8 gguf with 8 bit floating point values, even with scales, even with Jewish voodoo
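For anyone who wants to see what the q8 side of that comparison actually does, here is a rough numpy sketch of a gguf-style Q8_0 block (one scale per 32 weights, int8 values); the block size and layout are from memory, so treat it as illustrative rather than the exact gguf spec:

import numpy as np

def q8_0_roundtrip(block):
    # gguf-style Q8_0: a block of 32 weights shares one float scale, values stored as int8
    scale = np.abs(block).max() / 127.0
    q = np.round(block / scale).astype(np.int8)
    return q.astype(np.float32) * scale  # dequantized weights

w = np.random.randn(32).astype(np.float32)
print(np.abs(w - q8_0_roundtrip(w)).max())  # per-block error stays tiny

The gist of the argument: int8 plus a per-block scale keeps close to 8 bits of effective precision inside every 32-weight block, while fp8 only has 2-3 mantissa bits to spend no matter how you scale it.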
>>
>>107784454
so do i have to install triton? there is no mention of installing it on the main github page. how do i install triton?
>>
Blessed thread of frenship
>>
"Jesus Christ's age when he was crucified" Stars UI GitHub project's status?
>>
obinn pls
>>
>>107784688
>US Eastern timezone
can't believe you're not indian
actually, you could be based in canada
>>
>>107784705
Nobody cares lets enjoy our peace
>>
>>107784657
It'll be faster on my 5090 and it also won't have the lora dequant issue q8 has.
>>
https://files.catbox.moe/rmkd4x.mp4
This fixed slideshows for me.
>>
>>107784688
i highly recommend making a linux partition if you intend to run local ai models
>>
keep crying
>>
>>107784759
Nice tummy.
>>
>>107784744
I'm praying for you, this is a peaceful thread.
>>
File: 1752400850582775.png (226 KB, 600x300)
>>107784688
>0.0.0.0:7860
heh, see you kid
>>
>>107784759
can ltx2 make non English voices? Spanish? French? Japanese? Korean?
>>
>>107784757
can we not make these quants ourselves? i'll do it if you tell me how
>>
>>107784688
If only there was a sort of... I don't know what you'd call it.... AI chatbot? that could answer this question. That'd be crazy.
>>
File: file.png (139 KB, 1666x504)
Does anyone know what comfy commit introduced the Nvidia-specific memory enhancements?
>>
>>107784477
https://huggingface.co/unsloth/gemma-3-12b-it-qat-GGUF/tree/main
youre blind bro, there's TENS of gguf repos for this, this is the QAT one which will quant even better.
>>
>>107784688
this is the level of your avg g user that shids and fards when using comfyui and cries 'its too complicated!!!' lmao
>>
>>107784760
a whole new os just for ai models? hell nah. can an anon help me out? neither github page mentions triton as a requirement. the version i installed from https://github.com/maybleMyers/ltx/tree/main is giving me this error.
i also tried this one and i'm confused about the next step after the uv sync --frozen step: https://github.com/Lightricks/LTX-2
>>
>>107784806
might also be on the driver side. I saw that comfy introduced a comfy_kitchen module, which seems to be related to this.
>>
>>107784806
prolly this repo as >>107784849 said
https://github.com/Comfy-Org/comfy-kitchen
>>
>>107784806
this is only for 5000+ series right?
>>
>>107784853
Yeah blackwell specific but maybe some can work on 4000s.
>>
>>107784863
they're mentioning FP8 specifically, (not just fp4 which is blackwell exclusive) so I guess yeah
>>
>>107784849
>>107784852
Thanks anons, I'll take a look.
>>
>>107784688
>>
can someone explain if nvfp8 is a specific format or just an nvidia specific implementation at runtime for fp8 models?
>>
>>107784757
>It'll be faster
You had a month of AI winter to generate anything you wanted dobeit

>>107784808
Id say you're blind because two posts after I said "oh unsloth has them" but I deliberately didn't stop the post submission because I didn't want to deal with the evasion site's cooldowns so that one is on me, apologies for making you reply and I hope the anger you took out on me made you happier and isn't a sign of deeper troubles

>>107784846
>a whole new os just for ai models?
No retard, a whole new os for Triton. Triton on windows sucks
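That said, if you're stuck on windows anyway, the guides mostly boil down to installing the community wheels (the triton-windows package on pypi, woct0rdho's build iirc) into whatever python your ComfyUI uses, roughly:

pip install -U triton-windows

or through python_embeded\python.exe -m pip if you're on the portable build. No guarantees it matches your torch/python combo, check that repo's readme for which versions are covered first.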
>>
>>107784895
isn't NVFP8 just E4M3 and E5M2?
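you can poke at both formats in torch directly if you want to convince yourself, something like this (assuming a torch build recent enough to have the float8 dtypes):

import torch

x = torch.randn(8)
scale = x.abs().max() / 448.0                     # 448 ~ max normal value of e4m3fn
x_e4m3 = (x / scale).to(torch.float8_e4m3fn)      # 4 exponent / 3 mantissa bits: finer steps, smaller range
x_e5m2 = x.to(torch.float8_e5m2)                  # 5 exponent / 2 mantissa bits: coarser, bigger range
print((x - x_e4m3.float() * scale).abs().max())   # round-trip error with a per-tensor scale

whether the "nv" prefix means anything beyond a scaling recipe on top of those two is exactly what nobody seems able to confirm (see the fp4/fp8 repo listing later in the thread)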
>>
>>107784806
super sampling is just the upscaling method most dlss enabled games use. you just gen at lower resolution. it's pretty much a nothingburger
>>
>>107784779
Apparently it is multilingual with dialect support.

https://files.catbox.moe/u7j9xa.mp4
>>
>>107784930
Can you test out accents for me? Like talking in heavily accented English with an Eastern European accent

I know that it can do accented English from the "confiscious says" example that was posted earlier but I'd want to see actually prompting for accented English when there isn't an image context like a Chinese man meditating in robes to help guide the model
>>
>>107784846
like this anon said
>>107784910
Installing Triton on windows isn't as easy as it is on Linux.
>a whole new os just for ai models? hell nah.
your loss then, because no one here is going to bother their ass digging that shit out for you. There are youtube guides on how to install it for ComfyUI on windows, that's the only hint you are gonna get from me; in your case it will probably be similar.
>>
If the videos weren't so slopped, you could put a decent AI voice changer on the voices to make it actually sound human
>>
>>107785000
not sure why everyone is in a rush to try it when it's been proven to be garbage
>>
https://github.com/comfyanonymous/ComfyUI/issues/11665
roachbros, ww@
>>
>>107785000
Anon you don't get it, before we knew how shit the audio was, the fact that it could potentially do audio well, like Sora, was one of the hype parts of the model

>>107785022
It's been proven to be pozzed and the audio is confirmed suboptimal, but you can only be sure when you test yourself
>>
>>107785034
>Very urgent fix needed please thank you
lmao this fucking guy
>>
File: 1745549178711571.png (44 KB, 380x462)
do you still use this? i feel like i get way better results using the normal guider. are my settings wrong or should i be using some general negative tags to improve overall quality (i dont use negative tags at all)
>>
File: 1739468274906713.png (18 KB, 473x163)
>>107785043
and if you check the PR that has the fix.
Furk bros... we winnin!
>>
File: 1765934918142202.png (16 KB, 531x185)
>>107785034
genning porn right now all is good. that's a 33 star alt
>>
>>107784852
>>107784808
>look at issues
>memory bug already
gg
>>
>No retard, a whole new os for Triton. Triton on windows sucks
And if it needs compiling which I think it does well that anon has already been filtered.
>>
>>107784759
You should keep that stuff to yourself, not because i don't like it, but because of the unwanted attention.
It would push some to castrate their next model.
>>
>>107785051
there is actually a bug with comfy on windows that might make the cuda backend not available (falls back to triton/eager)
>>107785061
if you actually check the issue it says nothing. I guess it's normal for new stuff like this to be a bit buggy on release, I've already got good speed ups on linux with this release
>>107785069
woc has windows wheels ready to be installed
>>
new version dropped
https://civitai.com/models/2279107
>>
>>107785093
lmao, I kinda hoped it was this.
>>
>>107785093
thank u for honoring my sister saars
>>
Does Qwen Image Edit have any NSFW support?
>>
>>107785093
Thanks, redeemed
>>
>>107785093
perfect for good looks
>>
>>107785110
Pepperoni nipples like most models have. There are tons of nsfw loras on civit and none of them are good
>>
File: 1761874262685702.png (9 KB, 248x109)
>>107785093
what the fuck
>>
>>107785128
Ok, what's the standard practice for fixing it?
For anime would I just run an inpainting pass using WAI or something?
>>
>>107785051
Using an fp8 scaled text encoder instead of the q8 is even more inexcusable than using fp8 scaled on the model itself
>>
File: img_00466_.jpg (778 KB, 1520x1728)
>>
>>107785034
>>107785049
why I'm not surprised furk uses windows for gen
>>
>>107785166
no other option https://huggingface.co/NSFW-API/NSFW-Wan-UMT5-XXL/tree/main?not-for-all-audiences=true
>>
>>107785087
>I've already got good speed ups on linux with this release
Do you need to do anything special or just the latest commit enables this?
>>
>>107785093
nose rings make women have perma boogers
>>
>>107784474
Did we reach a point where I can type some prompts at an AI and have it spit at me rpg maker sprite sheets or not yet?
>>
>>107785430
no. better to do one sprite at a time for static objects since it's just going to fuck up the consistency
>>
>>107785417
Unfortunately this fact is still not enough for nose rings to not be ugly to me

>>107785387
You can make it yourself from the bf16 if you have 32gb of ram
If you don't/can't/won't, I'll do it myself for (You) right now
>>
>>107785486
>Unfortunately this fact is still not enough for nose rings to not be ugly to me
you like boogers?
>>
>>107785448
I can't pixel art and I wanted character sprites.
Whatever though, thanks.
>>
>>107785536
ngmi
>>
dont care about shitty video slop
base doko
>>
is the ltxv2 audio supposed to be earrape every time?
>>
where the hell is nvfp8 for ltx2
>>
File: z-image-base_00389_.png (3.79 MB, 1264x2048)
>>107785548
in my harddrive
>>
>>107785166
>fp8 scaled on the model itself
But but they say fp8 is much faster than Q8, so really you should be comparing its quality to the smaller quants that run just as fast, since it's not about size at that point...
>>
>>107785564
how did you avoid blur?
>>
>>107785516
They're not my favourite bodily fluid but I can work with it, especially given the fact that they're already a dumb cow with poor impulse control since they have a nose ring. If anything them being snot-retarded would be the goon expectation
>>
>>107785563
https://huggingface.co/Lightricks/LTX-2/discussions/8
Others seem to wonder the same thing.
Has that even been released yet?
>>
>>107785579
ok, I hate nose rings precisely because of the booger look lol, well and it makes a female face uglier too
>>
So is ltx2 worth my time or should I keep using wan 2.2?
>>
>>107785614
nbp is better
>>
>>107785614
Ltx2 is trash, don't bother unless you want terrible audio.
Also not only has no nsfw capabilities but is hardcoded cucked at model level.
>>
>>107785593
Same, I hate all tattoos and piercings because they are masculinizing. I also associate especially tattoos with sexual promiscuity, poor life choices, and sex work. They do not venerate, so they're only useful for degrading (like in the cow example)

I don't even like earrings, only large hoop earrings because of the implied sluttiness and the round shape is feminising (contrast that to the masculinizing sluttiness signal that tattoos give off)

Thanks for reading my blog post. I wish I had an audio+video model worth running inference on instead of blogposting but here we are.
>>
File: img_00478_.jpg (515 KB, 1501x1728)
>>
>>107785662
>I don't even like earrings
my only exception, pretty earrings are nice
but nose, and lips, yeah no
>>
can you not use the audio gen part to speed up ltx2 gens or lower vram req or does it not work like that?
i imagine you can see which parts of the model are most utilized for audio and you can just skip those calcs
>>
without lora

>>107785572
what blur
>>
>>107785689
No, I'm the exception with not liking earrings. Earrings are anthropologically culturally different than piercings anywhere else on the body and most people like them. I think it's because it's an extremity so it's closer to painting your nails than it is to creating a lump on your body.

>>107785699
What a worthless question

A better question is if the audio result is relevant to the video result i.e. if you can use this as a pure audio model with a tiny 32x32 video or if all audio output on 32x32 videos sounds like ear cancer because of the fundamentally small sized resolution
>>
>>107785699
why? just use wan
>>
>>107785721
>without lora
which lora?
>>
>>107785722
Calm down, you come off as if you get bullied regularly.
>>
>>107785721
the blur
>>
>the model is so bad people would rather just talk about piercings and their impact on white women
>>
>>107785740
>people
we're all ai bots in here
>>
>>107785730
4step
>>
>>107785748
benchod
>>
>>107785676
kek is this a new lora?
>>
>>107785750
same seed?
>>
I am having trouble getting the wan all in one model to get the person in the picture to do a nazi salute, is the model cucked?
>>
>>107785782
ye
>>
File: img_00493_.jpg (444 KB, 1520x1728)
>>
>>107785732
>you come off as if you get bullied regularly
I was always the tallest in my class so I never got bullied
Now my turn: I lost my virginity in highschool, what about you?

>>107785740
>the model is so bad people would rather just talk about piercings and their impact on white women
I don't like piercings on non-white women either. I agree with Shakespeare that the darker the skin the bigger point there is of juxtaposing and contrasting with shiny stuff (a jewel in an Ethiop's ear) but if I want shiny brown people I'd just cover them in water or oil instead
>>
>>107785730
a 1girl lora i baked
>>107785782
assume it's not me if no image attached you summer child
>>107785790
kek
>>
>>107785790
quite the difference, ok thanks anon
>>
is it possible to find out on which images a model/lora has been trained on?
>>
>>107785806
oh now it makes more sense, thanks anon
>>
>>107785811
only if you have a copy of the original dataset
>>
>>107785811
no, the process is destructive
>>
>>107785811
You learn from the dataset anon, it doesn't exist in the trained model.
>>
Can anyone redpill me on the differences between Qwen Edit 2509 and 2511? Gens on 2509 only take like 10 seconds to finish on my RTX 5070 Ti but holy fuck is 2511 slower. Only 20% after about 8 minutes. And I'm using the smaller "fp8mixed" model that's only 20GB.

Are the results really that much better?
>>
>>107785811
>is it possible to find out on which images a model/lora has been trained on?
Of course not, if it was then AI would have never ballooned to be as big as it was since you could prove all the copyright infringement and sexy kids in the datasets
>>
>>107785823
>>107785830
>>107785839
>>107785846
makes sense
thx
>>
>>107785804
You come off as a guy that got ass punked in the showers, going to ignore you for being a bitter schizo
>>
>>107785842
2511 is better
>>
>>107785846
>you could prove all the copyright infringement
training =/= infringement. anthropic only had to pay because they pirated the text, not because they trained on it. based US courts.
not sure about the sexy kids part thoughever.
>>
>>107785853
>>107785823
If you set the lora very high you can almost get a copy of the image it used to train
>>
>>107785842
It shouldn't be much slower. You are probably comparing a distill to a base model or something. They are practically the same architecture, in fact you can use 2509 loras on 2511
>>
>>107785885
Anon you're missing my point. If you could reverse engineer the models the courts wouldn't have ruled that it's not infringement. The fact that it's destructive satisfies the transformative nature criteria of fair use
>>
>>107785891
not always a full 1:1 but it gets really obvious
>>
But think of the generated pixels!
>>
>>107785897
Can you explain further?
For 2509, I was using the ComfyUI template workflow, which tells you to download qwen_image_edit_2509_fp8_e4m3fn.safetensors from here: https://huggingface.co/Comfy-Org/Qwen-Image-Edit_ComfyUI/tree/main/split_files/diffusion_models

There isn't a default template for 2511 in ComfyUI (except for a workflow for changing materials). There also isn't an fp8_e4m3fn model for 2511 on that huggingface link, only (fp8mixed), which I am using and it's insanely slow.
>>
File: 2041313479.jpg (315 KB, 1664x2432)
>>107785564
cute
>>
File: img_00511_.jpg (447 KB, 1520x1728)
>>107785777
nah just prompting for "glitch art" with basedjak lora
>>
>>107785907
>80GB ram
64+16? Rare combo

>>107785933
>But think of the generated pixels!
Elon boiled the frog with microbikini kids on X. Every time another scandal like that happens and no children commit suicide and it's a nothingburger people subconsciously internalise it's just not that big of a deal
>>
>>107785907
If you used rammap I bet you would've cleaned like 1/3rd of memory without unloading anything
>>
>>107785846
>>107785885
You don't need "sexy kids" in the dataset though; if it knows what kids look like and what sexy looks like, it can extrapolate
>>
File: gigachad suicide.jpg (61 KB, 1051x981)
>All worthwhile video models will be system ram hungry
Every day I regret not getting 2x32 instead of 2x16 when the difference was just 100 bucks more.
I am running out of "Wan 2.2 was a fluke, the next video models will be easy to run without needing copious swap" cope.
>>
>>107785954
No, fuck you. You shouldn't evaluate models based off of their performance in a buggy jeetware UI. 99% of the hard work of these models is from the training / R&D, it's unfair to the model makers to use buggy inference code and then share your opinions on the model like they're well informed
>>
>>107785564
To think one day I'll be able to create an ai girlfriend exactly like this
>>
>>107786001
anon wan2.2 architecture idea was to let it use less memory, because they could essentially separate "movement" from "details"
>>
>>107786004
Didn't realize you had severe autism.
>>
>>107786030
Less GPU memory, yes.
The separate models both get loaded to system memory, which eats your RAM, and then SSD, for lunch.
>>
>>107786030
>just unload and reload each model every time bro
I have a feeling all this massive swapping is killing ram faster
>>
File: img_00519_.jpg (884 KB, 1496x1928)
>>
>>107786051
>>107786057
is comfyui's purpose to kill consumer hardware faster so Nvidia wins?
>>
>>107785984
wow anons itt love missing my point

It's not about the legality or illegality of Russian swimwear websites. It's about the fact that if you could see the training data of these models then the courts would have never ruled fair use and we would not see this amount of passive acceptance of AI

Also trust me, sexy + kids isn't enough. That's how you end up with every child from the back in WAN looking like they have adult legs and buttocks. Perhaps with a large/smart enough model with sufficient world model knowledge

>>107786001
As someone who bought the last normal priced 32gb ram in my country and is now rocking a 16+16+32 abomination in his motherboard: I know for a fact you saw the warning signs and had at least 3 chances to buy, and yet you still didn't. You have no one to blame but yourself.

>>107786057
You should worry more about your disk if ever offloading to that. RAM lifespan ends when a new generation comes out, but you might still be using your NVME for years afterward
>>
>>107785959

>>107786020
sponsored by the zuckâ„¢
>>
File: file.png (31 KB, 517x243)
Guys redditors are onto us, its over
>>
>>
new anime finetune when
please
I NEED to 1girl with tags.....
>>
>>107786099
wanting 1girl must be a zoomer thing, thoughts?
>>
>>107786099
>basedkid "NEEEDS" something
>too low iq to train a zimage lora himself
oh no no no
>>
>>107786092
chat is this real
>>
>>107786118
zimage can't 1girl I have to do some retarded shit like "The blue haired anime girl stands promiscuously and stares at the camera" like wtf is that
>>
s o y jak lora is clearly the best z lora as of right now
based anon
>>
>>107786092
Committing crime on video is more legal than committing a crime off video as long as you wear a prosthetic finger so it looks like you're being framed by the prosecution with a 6-fingered video as the main evidence
>>
>browns think it's possible to stand "promiscuously"
I will be teaching my future daughter(s) to cross the street when they see one, and more importantly I will teach them to be proud of doing so since they're just that beautiful and only ugly girls who don't have anything to worry about don't cross the street
>>
File: 725963251.png (1.25 MB, 832x1216)
>>
File: img_00524_.jpg (524 KB, 1496x1928)
something went wong
>>
>>107786170
it is benchod, my wife said i looked like a meal when she saw me standing in uni
>>
>>107786178
why is the logo a demon with a lobotomized brain? what did zit mean by this?
>>
>>107784759
Now take off her clothes.
>>
File: 1615330278.png (1.42 MB, 832x1216)
>>
File: ComfyUI_00269_.png (1.21 MB, 896x1152)
>>107786139
>1girl, solo, large breasts, looking at viewer, smile, red dress, cleavage, indoors, dusk, from above, hair over one eye, long hair, black hair, red hair, multicolored hair, arms on knees, sitting, chair, living room, window, curtains, table, open window
It seems to have followed all besides open window and from above seems weak.
But yes it is not trained on this shit.
>>
is Mr catjak here? I have a question
>>
File: img_00535_.jpg (628 KB, 1496x1772)
>>107786214
lol
>>
>>107786170
>daughters
oh no no no no no no no no
>>
>>107785084
https://files.catbox.moe/aeb9lq.mp4
>>
>>107785811
Everyone is giving you wrong answers. You retards need to shut the fuck up if you don't know something.

https://arxiv.org/abs/2506.00701

"Membership inference attacks" are a thing, that many papers have been written on. It is unclear how well these techniques work in practice, but theoretically there are cases where you can infer with high probability that some specific piece of data was used to train the model. If you overfit the model like crazy like almost everybody does, I imagine it would be even easier.

TLDR: yes, the glowies could potentially know you trained your model on cunny.
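To make it concrete, the crudest version of this is just a loss-threshold test: images the model was trained on tend to denoise with noticeably lower error than images it never saw. A rough diffusers-flavoured sketch (the unet/vae/scheduler objects and epsilon-prediction are assumptions here, and the actual papers use much fancier statistics than a plain average):

import torch

@torch.no_grad()
def membership_score(unet, vae, scheduler, image, prompt_emb, trials=16):
    # average denoising error on one candidate image; lower = more likely it was in the training set
    latents = vae.encode(image).latent_dist.mean * vae.config.scaling_factor
    losses = []
    for _ in range(trials):
        t = torch.randint(0, scheduler.config.num_train_timesteps, (1,), device=latents.device)
        noise = torch.randn_like(latents)
        noisy = scheduler.add_noise(latents, noise, t)
        pred = unet(noisy, t, encoder_hidden_states=prompt_emb).sample  # assumes epsilon prediction
        losses.append(torch.mean((pred - noise) ** 2).item())
    return sum(losses) / len(losses)

You calibrate the threshold on images you know weren't in the set, and the signal gets sharper the more overfit the model is, which is why a lora is the easy case.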
>>
>>107786344
ToT
>>
>>107786349
I guess he did ask for conceptual theory, but it's not a leap to assume he meant in practise
>>
File: 3630385005.png (1.42 MB, 832x1216)
>>
>>107786305
>pay for API nodes
>still oom
lmfao
>>
>>107786385
>5000 nits phone
>>
>>107785811
>>107786349
if you managed to store information in a trained model in a way that could be recovered, you would win a nobel prize for starting a new era of data compression
>>
>>107786349
No one actually does that in practice, and it's less and less possible the more data you add in the model.
>>
>>107786344
Can you prove it, young ltx girl? It's quite a bold claim coming from someone as censored as you.
>>
>>107786349
That was my guess as well. Someone a lot smarter than me seems to have written a theoretical breakdown.
Though it is still possible that:
Across many millions of images, the individual footprint of each image or most of the images will become too weak to be detected by any means (probably not applicable for loras)
You might be able to infer that an image was likely used or not used in training, but can't prove it in practice.
Despite being possible, it's extremely computationally intensive. (Think of something like analyzing millions of seeds.)
It still seems best to not train on anything illegal.
>>
File: 2242131188.png (1.32 MB, 832x1216)
>>107786411
>>
the alphabet boys wont get you for your cunny lora anon calm down
>>
LTX2 gives very low quality but very coherent sound, is there a way to "supersample" that? Aren't there tools to make low quality audio sound better and lose that metallic effect?
>>
>>107786459
>>107786419
SDXL was trained on a billion images, it's like 7gb, no glowie tech could ever recover all billion images from that little info
>>
>>107786477
lmao
>>
>>107786459
Anon, the datasets are gigantic, you can't do that even if you wanted.
>>
>>
>>107786459
>Across many millions of images, the individual footprint of each image or most of the images will become too weak to be detected by any means
Even with millions of images I doubt this is true. If you have a NAI subscription, try prompting using the full set of tags from a real booru image. In many cases the model has nearly memorized that exact image, almost down to the pixel level. If that model were open source you could definitely use membership inference methods to determine if some specific animu image was in the training set.
>>
didn't ask for the subtitles but whatever
https://files.catbox.moe/fk5eet.mp4
I was asleep all day, did someone beat me to it?
>>
File: 3579236615.png (1.22 MB, 832x1216)
>>
How do I stop chroma radiance from producing little squares? resolution 1024 x 768
>>
>>107786528
I really like this one.
>>
>>107786496
And there are billions of weights holding data. There is also a MASSIVE amount of overlapping data between images, so it's possible that an identifiable, unique representation of a given image can be as small as a handful of bytes, survive the mixer, and be teased out with some analysis method.
I wouldn't be so confident.
>>
zit-chroma, anyone try this yet?
https://huggingface.co/lodestones/Zeta-Chroma/tree/main
>>
>>107786519
Do the daughter one.
>>
>>107786542
Not using schizo furtroon experiments would be my advice but if you insist joining Lodestone's 'cord is probably the best way to get good results from it.
>>
>>107786577
yeah it just gens diapers and furries
>>
>>107786577
It just started training and knowing lodestone he will find a way to break Z-Image too.
>>
How do I stop myself from drifting into goon shortly after I try to gen sfw?
>>
i think chinese culture is waiting for other companies to blow their loads before they release their base and mog everybody
>>
>>107786633
check other companies and their release dates on models
the chinks cucked overwatch 2 with marvel rivals by releasing around the same time
>>
File: file.png (412 KB, 1516x849)
pixel art character z-image lora training progress. top left is start of training, bottom right is last epoch where it finally converged
>>
>>107786633
it's so hopeless that we're entering the qanon phase now
>>
>>107786715
That has been around for a while.
And I doubt the censorship is implemented through the text encoder. Changing the text encoder will only hurt image quality.
If you insist on testing, norm-preserved-biprojected-abliterated should be less disruptive.
>>
>>107786725
Cool. Settings?
>>
>>107786757
50 image dataset, all perfectly scaled 512x512, lr 1e-4, rank 8
training rank 16 now with all other settings same, to see the difference
>>
File: 1746116156669178.jpg (76 KB, 720x720)
>>107786725
what trainer is this? on onetrainer my sample images just collapse into incoherent smudges once it starts picking up the concept, and I have to guess and try out the various saves to find the one with the best output
>>
>>107786766
How many steps have you trained?
>>
>>107786766
Didn't think you could train zit on just 512x512 images
>>
>>107786773
it's ostris ai-toolkit
but you shouldn't trust the previews from that either, best is to test the loras for real in comfy
>>
>>107786784
yeah that's what I'm doing, I was just wondering because here the samples at least look recognizable, while the samples during my trainings look like someone put the images into a blender
>>
congratulations to Ltx 2
First model to be hyped and declared dead in half a day's time
>>
>>107786774
3000 steps, the default settings
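(with the 50 image set from earlier and batch size 1 that's 3000 / 50 = 60 passes over the data, if anyone wants the epoch math)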
>>
>>107784474
>>Maintain Thread Quality
>https://rentry.org/debo
>https://rentry.org/animanon
how did these get back into the OP? I thought we finally got rid of the spiteful baker
>>
>he's awake
well, it was nice talking about ai for a while, now let's go to full drama shit 24/7 again
>>
>>107786912
it's fine i have a life and shit to do
same time tomorrow?
>>
>>107786924
well I'll see how it goes for now
>>
>>107786801
why? Because it's censored? I think i2v still works ok
>>
>>107786912
what do you mean? ranfaggot has been awake the whole time
>>
Or you can all stop being girls and just ignore the drama/bait/attention seeking. Anyway, wan 2.5 when?
>>
>>107787022
>wan 2.5 when?
I'm hoping they release it to dunk on ltxv2 but I don't know if ltxv2 is good enough for them to do that
>>
>>107787022
give me undistilled furfag model NAOW
>>
>>107786779
You can but you'll more likely get duplication problems like multiple heads or limbs.
>>
btw, for the jannies, this is how it goes:
>proxy spam concern trolling to fill threads
>reach limit
>post a thread he baked without the links 10 ~ 20 minutes before he hit post limit
>bake another thread with the link, and post it after posting the one without link
>post reposts on the thread without links
>nuke the thread with link
>nuke with proxies any poster who points this out on the thread without link
>>
thanks for beta testing guys
saved me some bandwith
>>
btw, for the jannies, this is how it goes:
>I post
>you seethe
>>
>>107787022
>wan 2.5 when
right after Z Base
>>
File: file.png (289 KB, 1519x849)
>>107786725
for comparison, here is rank 16 progress (the previous one was rank 8)
looks like it converged faster, but i will test the loras now to see which one performs better
>>
>>107787056
It would be hilarious if they dropped it now. However, since everyone got pissy pants over it being api, it won't be a surprise if it never gets released, kek.
>>
>train influencer lora
>likeness comes out kinda ok, sometimes good sometimes barely recognizable
>prompt said person in a 70s movie poster
>likeness now comes out perfect every time as long as it's prompted as a 70s movie poster
shit's just so weird and unpredictable(and interesting) sometimes
>>
File: file.png (119 KB, 993x272)
coombrains lost
>>
>>107786985
it stopped for a day. he of course couldn't help himself and just loaded up another proxy. when are the mods going to reimplement the spam filter?
>>
>>107786725
>converged
qrd
>>
4k steps or bust
>>
>>107787164
my dick fully converged deep into your anus
>>
>>107787172
fuck you benchod
>>
>>107787168
fun fact i tried to gen a video with wan and it was still going after i came back from work, i was like wtf, apparently some node bugged out and set it to 1280 steps
>>
>>107787157
>plebbit likes the i2v model that outputs jeets by default and is heavily censored
>>
File: ComfyUI_00273_.png (2.06 MB, 896x1152)
>>
>>107787190
>1280 steps
did it turn out epic tho
>>
>>107787157
Suppose it makes for a good discount sora 2 at home. Wonder if a lora would bypass the censorship
>>
>>107786887
whoever wrote these is highly schizophrenic
>>
How do you determine approximately how many steps are needed for a lora training with ZiT? From experience it needs around 4000 steps minimum for realistic stuff on adamw 1e-4 regardless of dataset size, much more than for SDXL. Is there some rule of thumb?
>>
>lubimiv is quintupling down again
failed dev
>>
>>107787234
>thinks he can undo mossad trickery with loras
lol
>>
File: file.png (1.31 MB, 1800x761)
>>107787108
test results.
left: no lora
middle: rank 8
right: rank 16

from this experiment, i can conclude that rank 8 is not enough for a ZiT style lora. Rank 16 successfully picked up the pixel art character style from the dataset.
>>
can I run any good models on my 9900 diamond?
>>
File: file.png (114 KB, 1702x1141)
>>107787242
so apparently nvidia is wrong and neither ltx2 nvfp8 nor nvfp4 exist, only fp4 and fp8
>>
>>107787256
how long did it take? gpu?
>>
>>107787260
>nvidia is making shit up
lmao
>>
somehow, the spam returned
>>
>>107787263
about 45 minutes per lora on a 5090 rtx. mind you the dataset is 512x512 that's why it was so quick.
i only sample 1 image once every 200 steps, sampling really slows down your training too.
>>
>>107787260
inb4 nvidia pr guy had no fucking idea what he had to write about and just slopped it up
>>
>>107787256
What is the difference when training in ranks?
>>
>>107787252
kek

>>107787278
NTA but nice, that's quick. It's a shame zimage suffers too much from same face, otherwise I'd be training loras now. Hopefully, something will come of zeta chroma https://huggingface.co/lodestones/Zeta-Chroma/discussions/1
>>
>>107787260
lmao
>>
File: file.png (1 MB, 1694x715)
>>107787256
for further comparison, i also ran these exact same generations against the de-distilled z-image model by ostris, to compare what effects they have on that version
30 steps, res2_s, CFG3 vs being used on Z-turbo

>>107787330
the bigger the rank, the more parameters the lora has, so more capacity to learn, but also to overfit to your dataset, so you have to find a balance
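if you want to put numbers on it: each linear layer a lora touches gets two matrices of rank x in_features and rank x out_features, so roughly (the 3072 width is just an illustrative guess, not the actual Z-Image dim):

def lora_params(in_features, out_features, rank):
    # A (in -> rank) and B (rank -> out) per adapted linear layer
    return rank * (in_features + out_features)

print(lora_params(3072, 3072, 8))    # ~49k params per layer at rank 8
print(lora_params(3072, 3072, 16))   # ~98k, double the capacity (and double the room to overfit)

double the rank, double the parameters, same tradeoff as above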
>>
local is saved
https://civitai.com/models/2280698/burqa-lora?modelVersionId=2566908
>>
>>107787352
What's a good rank for realism and likeness, is it 32 like with XL?
>>
File: 1742031795754434.png (4 KB, 195x108)
>>107787359
>>
>>107787372
depends on what you're training and the dataset you have, it's probably best to run a few experiments and compare. i have no experience with realism loras so cannot offer advice
>>
>>
>>107787352
I want to make a style for Z image but I fail every time any advice?
>>
https://files.catbox.moe/tt2fnr.mp4
>>
>>107787492
I have yet to see the style lora that doesn't fuck up z-image by introducing artifacts.
>>
File: file.png (720 KB, 768x768)
that concludes today's experiments, good night anons
here is the lora if anyone wants it:
https://files.catbox.moe/ghsodm.safetensors
(trigger is 'pixel art character' i think)

>>107787492
i used mostly default settings in ostris, just make sure your dataset is good and properly labeled
try 512x512 first and do fast experiments with different parameters. then you can try higher resolutions
>>
https://files.catbox.moe/n3oiaa.mp4
>>
>>107787520
>zooks
>>
>>107787260
the nvfp4 is :
https://huggingface.co/Lightricks/LTX-2/blob/main/ltx-2-19b-dev-fp4.safetensors
>>
>>107787492
You could perhaps post some examples or maybe elaborate on your settings so anon can help further
>>
>>
File: NANCY.png (2.92 MB, 1824x1248)
>>
>>107787547
2 more weeks
>>
>>107787507
well, the way we train is a bit of a hack with the de-distillation, the knowledge doesn't transfer perfectly to z-turbo.
you can see this quite obviously in my examples when i ran the lora against z-image turbo >>107787256 it had artifacts on the knight, but running it against the de-distilled model the knight had no artifacts >>107787352
>>
File: 00326-104627030.jpg (302 KB, 1728x1344)
Why is upscaling always damaging the skin texture so much? Making it look "flakey"
For demonstration very simple gen. Normally my prompts are more detailed but that's not changing the issue
picrel after upscale
>this is a photo of a girl in a forest. she looks like lara croft.
>Steps: 10, Sampler: Euler, Schedule type: Beta, CFG scale: 1, Shift: 8, Seed: 104627030, Size: 1152x896, Model hash: 4f528ced41, Model: zimage-unstableRevolution, Beta schedule alpha: 0.6, Beta schedule beta: 0.6, Version: neo, Module 1: Qwen3-4B-abliterated-q8_0, Module 2: ultraflux
>>
File: 00325-104627030.jpg (142 KB, 1152x896)
>>107787592
Here before upscale
>>
>>107787592
why not just gen it at the desired resolution in the first place
>>
>>107787599
Upscale params
>this is a photo of a girl in a forest. she looks like lara croft.
Steps: 10, Sampler: Euler, Schedule type: Beta, CFG scale: 1, Shift: 8, Seed: 104627030, Size: 1152x896, Model hash: 4f528ced41, Model: zimage-unstableRevolution, Denoising strength: 0.4, Original Size: 1152x896, Hires Module 1: Use same choices, Hires CFG Scale: 1, Hires schedule type: Beta, Hires upscale: 1.5, Hires upscaler: 4x-UltraSharp, Beta schedule alpha: 0.6, Beta schedule beta: 0.6, Version: neo, Module 1: Qwen3-4B-abliterated-q8_0, Module 2: ultraflux
>>
>>107787546
I just deleted them last week as it took up so much space
>>107787515
I used that too. I followed the "official" guide from the maker himself on youtube where he did the children's drawings thing
>>
>>107787495
>shes laffin at the cucks crashout even in that ai vid
never began
>>
>>107787592
cfg 1 ruins upscales
>>
>>107787606
I hate that. One of the main reasons why I don't use ComfyUI. I get only 4.65s/it which is painfully slow. I am rolling until I like a preview and think it's worth bothering to upscale, then I hires fix or send it to img2img for upscale.
>>
>>107787629
Can't go lower than cfg 1.
>>
File: 1754924662603621.png (9 KB, 899x784)
i can't install ltx 2 workflow, even though the ltx 1 workflow works perfectly… i've updated everything. i even have that ugly red cancel button now
>>
>>107787664
you don't "install" workflows in comfy, so you're already fucked
>>
>>
>>107787693
kek
>>
>>107787527
Sorry goyim, can't say swears in this christian model
https://files.catbox.moe/zgeb2k.mp4
>>
>>107787686
i mean the nodes
>>
>>107787592
Try 9 steps.
Don't use beta with ZIT in general.
8 shift can be a bit much. Try 6 maybe.
Don't use shitmixes in general.
0.4 seems too high.
>>
>>107787781
Meant to tag >>107787614
>>
File: 00013-1218076005.jpg (443 KB, 1728x1344)
>>107787787
OK It was the model. Downloaded a different finetune, looks a little better now. Denoise down to 0.35 everything else the same.
>this is a photo of a girl in a forest. she looks like lara croft.
the forest behind her is filled with skeleton monsters. the skeletons are creepy and slimy. pieces of flesh hanging from their bloody bones. the mood is gloomy and creepy, with waft of mist.
>Steps: 10, Sampler: Euler, Schedule type: Beta, CFG scale: 1, Shift: 8, Seed: 1218076005, Size: 1152x896, Model hash: 8e9df7926d, Model: lexivisionII_lexivisionZBeta2, Denoising strength: 0.35, Original Size: 1152x896, Hires Module 1: Use same choices, Hires CFG Scale: 1, Hires schedule type: Beta, Hires upscale: 1.5, Hires upscaler: 4x-UltraSharp, Beta schedule alpha: 0.6, Beta schedule beta: 0.6, Version: neo, Module 1: Qwen3-4B-abliterated-q8_0, Module 2: ultraflux
>>
File: kekk.gif (1.25 MB, 480x270)
>>107787693
even if the base is realized, it will be forgotten the next week. you're just consumers
>>
ltx2 kissing audio quality?
>>
File: are you for real?.png (806 KB, 1080x1080)
>>107787871
>it will be forgotten the next week.
like Wan and SDXL?
>>
>>107787592
>>107787599
>>107787824
because youre using some shitty jeet merges probably
>>
File: IMG_2891.gif (1.32 MB, 498x280)
>5070ti
>9800x3D
>32gb of ram

Is this enough to be able to AI goon? Do I need another 32gb of RAM?
>>
File: img_00590_.jpg (1.22 MB, 1720x2064)
>>
>>107787246
you seem highly schizophrenic
>>
File: 1756596021274418.png (7 KB, 366x188)
>>107787905
>Do I need another 32gb of RAM?
for ltx2 it's brutal how much ram it's demanding
>>
fresh when ready
>>107787932
>>107787932
>>107787932
>>107787932
>>
>>107787905
yeah just use some sdxl finetune. zit is kind of shit at nsfw
>>
>>107787941
maybe you just are highly schizophrenic. fuck off with the early bakes
>>
File: ComfyUI_00274_.png (1.36 MB, 896x1152)
>>
>>107787905
for images, no problemo.
for vids? possible, but having another 32gb of ram would help
>>
instant blue screen trying i2v, don't want to try a third time
>>
>>107788080
One of your
>CPU
>RAM
>GPU
is unstable or
>PSU
can't cope with full load.
>>
>>107787894
sfw sdxl has definitely faded into obscurity. and you have no choice regarding wan, since hunyuan is a flop
>>
>>107787912
nice


