/g/ - Technology

Discussion of Free and Open Source Diffusion Models

Prev: >>107999241

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/tdrussell/diffusion-pipe

>Z
https://huggingface.co/Tongyi-MAI/Z-Image
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo

>LTX-2
https://huggingface.co/Lightricks/LTX-2

>Wan
https://github.com/Wan-Video/Wan2.2

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>NetaYume
https://huggingface.co/duongve/NetaYume-Lumina-Image-2.0
https://nieta-art.feishu.cn/wiki/RZAawlH2ci74qckRLRPc9tOynrb

>Illustrious
https://rentry.org/comfyui_guide_1girl
https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
>>
1st for kill ani
>>
Finally, an epic bake! Expecting lots of good stuff this thread
>>
Epic bread of valor
>>
so what now?
>>
>>108001477
What we do every night...
>>
cope and seethe over saas victory
>>
File: 1762117779350418.jpg (589 KB, 1408x1568)
>>
>>108001500
way to ruin that layup man...
>>
“play with our nano bananas” was right there…
>>
Blessed thread of frenship
>>
File: 1738340849768354.jpg (667 KB, 2176x1024)
>>
File: file.png (1.57 MB, 1461x1385)
>>108001424
I wish sd.cpp was as fast as the pytorch web UIs but it just doesn't keep up (18.11s vs 10.7s in forge classic for picrel). You can just make Qwen coder 30b write a python QT GUI for the sd-server api though instead of using ani's mess that doesn't even compile.
>>
>>108001623
the majority of the time is spent inside cuda kernels on the gpu, whether you're using python, c++ or brainfuck to delegate those calls doesn't really matter, dumbasses think this is like traditional software where c++ == fast but in fact you gain zero speedup here, just the inflexibility of a shitty compiled language
>>
File: 1766762770126742.jpg (921 KB, 1376x1632)
>>108001623
just use koboldcpp desu, it has llmao.cpp and sd.cpp integrated
>>
File: 1765959803666828.jpg (540 KB, 1632x1376)
skin came out damaged :(
>>
Compiled around 200 images from some anime artist I'm into and captioned them with tags. Now how do I make a lora for flux 9 klein edit?
>>
>>108001739
Download a LoRA trainer of your choice and run the script on your dataset?
You've already done the hard part.
>>
>>108001739
you also have to caption some edits in your dataset (source image -> dest image) along with caption.
otherwise you'll murder the edit capabilities
>>
>>108001753
For an edit model, do I not need before and after images since it's an edit model? Or will the model perform the same task while incorporating my lora into consideration?
>>
File: file.png (2.28 MB, 1248x1872)
>>
>>108001778
>>108001786
I was thinking about that
>>
File: file.png (3.37 MB, 1248x1872)
Zib loras on Zit are better than Zit loras on zit
>>
>>108001786
Okay, for an edit model you'll need the control images. Luckily flux already does an okayish job of translating things from anime to real. If you want to paypig, you can use nanobanana or something to do an even better job.

Use the real outputs as your control dataset and the anime images as your target dataset. So yeah if you want to do more than just train a style, you need to now convert those images to a realistic style.
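
A minimal sketch of the pairing described above, assuming a hypothetical layout where each control (realistic) image shares a filename stem with its target (anime) image in two separate folders. The folder names and the `pair_images` helper are made up for illustration; check your trainer's docs for the layout it actually expects.

```python
from pathlib import Path

# File extensions treated as images (an assumption; adjust to your dataset).
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".webp"}

def pair_images(control_dir, target_dir):
    """Match control and target images by filename stem.

    Returns (pairs, unmatched): pairs is a list of (control_path, target_path),
    unmatched is a sorted list of stems present in only one of the folders.
    """
    def index(folder):
        # Map stem -> path for every image file directly inside the folder.
        return {p.stem: p for p in Path(folder).iterdir()
                if p.suffix.lower() in IMAGE_EXTS}

    control = index(control_dir)
    target = index(target_dir)
    shared = sorted(control.keys() & target.keys())
    pairs = [(control[s], target[s]) for s in shared]
    unmatched = sorted(control.keys() ^ target.keys())
    return pairs, unmatched
```

Calling something like `pair_images("dataset/control", "dataset/target")` before training is a cheap way to catch an anime image that never got its realistic counterpart.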
>>
>>108001849
yep
>>
>>108001849
Nice now make her pregnant
>>
File: file.png (447 KB, 896x1152)
>>108001706
I know that, sd.cpp's cuda backend just isn't as optimized as pytorch and it's nothing to do with "muh language". I hope it closes the gap in the future because it's a lot lighter than pulling in 8GB of pytorch dependencies and is a lot less hassle to make work on AMD with the vulkan backend option which is more performant than their rocm backend but still results in an rx 9060 taking 21s for what takes an rtx 3060 18s.
>>108001710
I do use it but just for the text side when I want the fancy markdown rendering I don't get in a terminal with llama-cli. Does kobold's built-in sd.cpp keep the image model loaded instead of loading it from disk every gen like sd-cli, or is it just a webui for sd-server?
>>
>>108001864
So 1:1 ratio , each anime image gets a realistic version?
400 total images: 200 anime + 200 realistic counterparts.
>>
>>108001911
>>108001791
>>108001623
What the fuck is this llama talk and all this miku stuff? are we being raided by /lmg/? fuck off, comfy has a better ui
>>
File: 1367068868090.jpg (18 KB, 376x260)
>batch size must be same or half of batch size of text embeddings
But they are the same number of files and both divisible by two, wtf does it want? 58 images.
>>
>>108001864
>flux already does an okayish job of translating things from anime to real.
I can reverse engineer my dataset.
>>
>>108001947
Are you using cache text embeddings or unload TE in AI Toolkit? They seem to be broken.
>>
>>108001963
caching
>>
>>108001965
Yep doesn't work with batches even if you have right amount of pics in each bucket.
>>
>>108001962
>I can reverse engineer my dataset.
That's right.

>>108001925
Yes, but also that's a lot of images.
>>
>tfw got zit working on my potato machine yesterday somehow
>lost the workflow and every attempted recreation seizes the machine
sigh
>>
File: 1563622631339.jpg (90 KB, 957x621)
I know the dev is a roach but he shouldn't let bugs live in his code.
>>
>>108001623
Wow that even looks better than tranistudio
>>
>>108001974
>batches
I don't want to start a debate here, but is there really any point to batching? I feel like there's no middle ground. You either do none or have massive batches.
>>
>>108001998
>is there really any point to batching
faster (and better) training, low batch = low signal to noise ratio
>>
>>108001986
>Yes, but also that's a lot of images
How many do you recommend?
>>
>>108002006
They say it helps the model generalize.
>>
File: Flux2-Klein_00395_.png (1.35 MB, 1280x720)
>>108002008
Like 100 or so. I trained that WoW LoRA on 280 images and it was excessive. I've gotten the same or better success from like 50 or 100 images selected for best quality.
>>
File: 1769686820280551.jpg (82 KB, 956x620)
>>108001990
I know the roach is a dev but he shouldn't let code live in his bugs.
>>
the chroma2 images shared on Furry Discord are pretty intense.
Looks like Klein 4b is a real monster when it comes to training
>>
>>108002038
Good, thanks, so 100 total images 50 anime style + 50 realistic style.
>>
>>108002057
You forgot to include your donation link, load of shit stones.
>>
>>108002057
proof?
>>
>>108002064
Yes, probably. But don't take my word as gospel. You can always just add more images to the dataset if it's not working.
>>
File: file.png (2.88 MB, 1296x1728)
>>
best way to create decent lora from ~200 images?
can any fren point me to somewhere where i can do it with 4070s?

me too retarded for this
>>
File: 87167.png (127 KB, 471x676)
>>108002034
Idk how much it helps in this regard since our lora batch sizes are very small and the training "time" is also incredibly short. Still, it's pure benefit with no drawback for diffusion models.
https://arxiv.org/abs/2411.03177v1
>>
>>108002082
Onetrainer
>>
>>108002082
I'm feeling generous. Send me those pictures of your little cousin and I'll make that lora for you.
>>
>>108002082
Recently talked topic,
>>
>>108002082
You probably won't need 200 images. Use AI-Toolkit
https://github.com/ostris/ai-toolkit
>>
>>108002006
That's what they say but how does say a batch of 4 perform over 1? or 50 compared to 1? I've never seen it quantified and you can find arguments for both.
>>
>>108002057
may we see it
>>
>>108001998
>You either do none or have massive batches.
I've been told you should not do more than 4 batches (or 2 batches + 2 gradient accumulation) for characters. Just 2 batches seems to work as well.
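
For what it's worth, "batch 4" and "batch 2 + 2 gradient accumulation" should give the same averaged gradient as long as the micro-batches are equal-sized, which is why they get treated as interchangeable. Accumulation mostly just trades time for VRAM. A toy sketch in plain Python, using a made-up one-parameter squared-error loss (nothing here comes from an actual trainer):

```python
def grad(w, x, y):
    """Gradient of the squared error (w*x - y)**2 with respect to w."""
    return 2.0 * (w * x - y) * x

def batch_grad(w, samples):
    """Mean gradient over a batch of (x, y) samples."""
    return sum(grad(w, x, y) for x, y in samples) / len(samples)

def accumulated_grad(w, samples, micro_batch):
    """Average of per-micro-batch mean gradients, i.e. gradient accumulation."""
    chunks = [samples[i:i + micro_batch]
              for i in range(0, len(samples), micro_batch)]
    return sum(batch_grad(w, c) for c in chunks) / len(chunks)

# Toy data, chosen arbitrarily for the demonstration.
samples = [(1.0, 2.0), (2.0, 3.0), (3.0, 7.0), (4.0, 9.0)]
w = 0.5

full = batch_grad(w, samples)             # batch size 4
accum = accumulated_grad(w, samples, 2)   # batch size 2, 2 accumulation steps
print(full, accum)
```

The two numbers match here; in a real trainer, things like unequal micro-batch sizes or batch-dependent layers can make them drift apart slightly.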
>>
So any news of large scale finetunes of z-image? Surely someone has started.
>>
>>108002106
Sublinear growth: the gradient signal-to-noise ratio improves by sqrt(batch size), which is also the ratio you can increase the learning rate by when increasing the batch size. With batch 4 you have twice the signal-to-noise ratio of batch 1. That doesn't mean it will train twice as fast, because of training dynamics (like how in the first few steps the optimizer has no idea where to go and has to "warm up", accumulating momentum).
>you can find arguments for both.
Outside of diffusion models, yes: there are neural network architectures that perform better on smaller batches (e.g. GANs), but all of them still use batches colossally larger than anything we do with loras (train at batch 32~64 instead of 128 or 512 lol)
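
A rough way to see the sqrt(batch size) relationship numerically, assuming a toy model where each per-sample gradient is the true gradient plus independent Gaussian noise (a simplification, real gradients are not this clean). `sqrt_scaled_lr` and `gradient_snr` are hypothetical helper names for this sketch only:

```python
import math
import random
import statistics

def sqrt_scaled_lr(base_lr, base_batch, new_batch):
    """Square-root rule of thumb: scale the LR by sqrt(new_batch / base_batch)."""
    return base_lr * math.sqrt(new_batch / base_batch)

def gradient_snr(batch_size, trials=20000, true_grad=1.0, noise_std=1.0, seed=0):
    """Estimate |mean| / std of a batch-averaged noisy gradient.

    Each per-sample gradient is modeled as true_grad plus i.i.d. Gaussian
    noise (an assumption of this toy model).
    """
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.gauss(true_grad, noise_std) for _ in range(batch_size))
        for _ in range(trials)
    ]
    return abs(statistics.fmean(means)) / statistics.stdev(means)

for b in (1, 4, 16):
    print(b, round(gradient_snr(b), 2), sqrt_scaled_lr(1e-4, 1, b))
```

Under these assumptions the estimated ratio roughly doubles each time the batch size quadruples, matching the "batch 4 has twice the signal of batch 1" figure.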
>>
File: tmp3_pv__o3.png (799 KB, 896x1152)
>>108001942
Relax schizo-kun, I lurk the image gen threads more than I post in them. I just posted a couple mikus (second one isn't me) because I don't want to be a nogenner like you or an obnoxious avatarfag.
I don't like comfy (or ani before you accuse me of it) so I don't use it, I don't care what you like or use.
>>
>>108002129
i think lodestone mentioned something about training a vae-less version of z-image
>>
>>108002151
Don't pay Julien any mind
His failure as a "developer" has made him intent on destroying this general
We love our /lmg/ brothers
>>
>>108002150
Well, I'll give it a spin.
>>
File: o_00087_.jpg (1.33 MB, 2304x1792)
>>
File: 1760572465950714.jpg (1.64 MB, 1872x2736)
>>
File: o_00088_.jpg (1.43 MB, 2304x1792)
>>
File: ComfyUI_00119_.jpg (574 KB, 2880x4096)
>>108001415
I'm a complete neophyte regarding local diffusion, but I've recently trained a LoRA based on a character of mine
>https://files.catbox.moe/yoamgt.safetensors
sadly, I don't yet know how to make characters that don't look like burn victims

I'd appreciate any help
>>
File: 2727636.jpg (11 KB, 200x200)
>ai toolkit webui has links to discord, youtube and donations, but not to github
Wow truly the peak of today's foss software. Nukes when?
>>
File: 1756803964926418.jpg (662 KB, 2016x1120)
>>
>sage attention
>flash attention
which one does anon use?
>>
>>108002317
>Can't be assed to set up wsl to use diffusion pipe right now
>decide to use AI toolkit
>scroll down the page to get install instructions
>See a bunch of random looking squares before the relevant information.
>it's his patreon subscribers or something

jesus.
>>
File: file.png (1.58 MB, 832x1248)
>>
>>108002270
Whats the base model
>>
>>108002328
FA doesn't seem to work with imagegen and SA only has gains for video. Use fast fp16 if you want speed.
>>
File: tmpe2op99d2.png (1.01 MB, 1440x1120)
>>108002158
It's so unfortunate that his broken UI and unhinged behavior here drive people away from even trying or contributing to sd.cpp. I've even seen anons being misled about its model support by his outdated readme failing to mention zimage and flux2.
Not to be a shill or anything, I just prefer native GUIs to webapps and like the ease of setup compared to a pytorch environment especially for an AMD card. If anything he's why I post bringing attention to the upstream project which I think actually has some value especially for the AMD-only anons.
>>
>>108002343
>fast fp16 if you want speed.
And fucked up results.
>>
>>108002328
Cross attention
>>
>>108002317
>wh*tes when they can't just enslave people to get shit for free
>>
>>108002339
Illustrious XL
>>
>>108002355
wut
>>
Where are julien posts, I need to chase him
>>
>>108002355
Fix the batch size bug, roach. Not donating btw you grifting cuck.
>>
File: 1763678567338882.png (7 KB, 205x116)
i just pressed run workflow on z-image. i'll be back in a couple of days with the result
>>
>>108001415
Grok Imagine is almost completely uncensored btw, and is available over the xAI API. Does images and videos. It doesn't let you gen actual sex and pussies, but basically everything else is fair game. Also it does decent audio (not shown here, but it actually does, about Veo 3 level). I hope Musk doesn't censor it later.

Videos are 6s but you can gen from 1s to 15s (i just used the default 6s), and generation for 6s takes under 1 minute

https://files.catbox.moe/88w9kr.mp4
https://files.catbox.moe/8uzs4d.mp4
https://files.catbox.moe/5tan4z.mp4
https://files.catbox.moe/cqh1cb.mp4
https://files.catbox.moe/y4qofn.mp4
https://files.catbox.moe/0z9vqq.mp4
>>
>>108002481
Is it local?
>>
>>108002489
better than local, better in softcore nude than your local models, you can switch tabs with comfy
>>
>>108002481
>gachashit
>SAAStrash
back to the jungle wth you, disgusting SEAmonkey
>>
>>108002481
This is very good! better than WAN rotoscoped slop, thanks anon I will try it out.
>>
>>108002495
What does that mean? Is it local?
>>
>>108002495
Is it local?
>>
>>108002347
not a thing
>>
>>108002506
>>108002495
Yes, like electricity, your money printer, and your bank account ;)
You can access it from your home PC, or phone.
>>
>>108002495
>>108002502
You think replying to your own posts will work?
Everything you write smells like you: putrid shit
>>
>>108002512
>not a thing
It absolutely is. Do a test with and without it.
>>
>>108002481
>>108002495
>>108002502
>>108002526
Why is the developer of tr*nushartdio shilling SAAS models now?
>>
>>108002481
What is it about the word "local" you don't understand?
>>
File: o_00092_.jpg (1.32 MB, 2304x1792)
>>
>>108002530
>You think replying to your own posts will work?
Yes
>Everything you write smells like you: putrid shit
Better than local. Meta off-topic crying is irrelevant. ;)
>>
seems like the browns are upset, must be a good model then
>>
>>108002552
>Meta off-topic crying is irrelevant. ;)
You are sething so hard, you shit-eating turd world nigger
>>
>everyone falling for either the bait or oblivious person
cmon local sisters, we can do better
>>
>>108002543
It’s better than local, it’s relevant and has to do with Local Diffusion
>>108002570
Where are you going to get your dataset local roach? Or did you forget the NAI leak and how the local roaches stole from it? ;)
>>
>saar do the grokful saar
>>
if it has api nodes then its local. if not, its off topic
>>
>>108002481
>>108002495
Its not local. You have to spend money and use an api. Also I can do 30+ seconds for free, now fuck off.
>>
>>108002623
Don't fall for the bait please
>>
>>108002623
Hehe, don't disrespect the ancestors of your dataset local cuck! You're a joke, a low quality one, just like your failed models.
>>
>>108002640
Your life must suck to do this everyday. I'm sorry you have nothing else going on.
>>
>>108002631
its the bot samefagging back and forth
>>
>>108002631
Yes, please, don’t learn from SaaS, don’t even look at it! Let’s stay safe in our little local world!
>>108002649
Yours too, if you go mad whenever you see something of good quality that’s better than your local cope.
>>
>>108002481
Thanks anon, very useful!
>>
Are the saas shills ironical shitposters or real
>>
Tranustudio losted
>>
>>108002631
Baiters, shills and trolls should always be reminded this is a local thread. Their ape keys are no good here.

>>108002640
Begone adversary of closed source.
>>
>>108002683
It's the same retard recycling through his bag of grief, when there's nothing going on with new models he goes into his personal vendettas now he's just spending all his time spreading FUD
>>
File: AniStudio_00409_.png (2.64 MB, 1824x1248)
>>108002481
Thanks! I wish local could do moans.
>>
File: z-image_00132_.png (966 KB, 1024x768)
>>108001181
>>108001510
Finally RNG'd one decent enough to stop. Might see if non-Turbo has better prompt adherence later.

>>108002346
I used commandline sd-cpp for a while early on, and it was alright. I think VAE had to be done on CPU to avoid a memory-allocation bug, but it was quicker than tiling on GPU. Later I switched away from sd-cpp to Olive DirectML for more speed, at a cost of needing to convert safetensors to onnx, and being RAM-limited to 1280x800 due to how Olive loaded models. These days ComfyUI with ROCm works best for me though, and 1600x1280's my sweet spot. (I tried sd-cpp's ROCm version when it came out a while back, but it didn't work somehow. Might recheck if it's fixed sometime.)
>>
>>108002677
I think they're just as much shitposters and as real as the shills from Lodestone, Tongyi, Comfy, Anistudio and BFL.
>>
File: o_00093_.png (1.95 MB, 1792x2304)
>>
>>108001415
>>Maintain Thread Quality
>https://rentry.org/debo
>https://rentry.org/animanon
why are these in the OP? we just had two drama free threads and it immediately went to shit
>>
>>108002661
>Yes, please, don’t learn from SaaS
what is there to learn? how to type some text in a text box?
you have no control and zero knowledge of any of the internals and everything is completely opaque
>>
>>108002699
Ahh, only about 15 minutes after the SAAS shilling started, what a coincidence
>>
File: file.png (512 KB, 1920x1280)
>>
Local model anons are so adorable, still in their rebellious teen phase. Come back when you've left mom's basement and gotten a job and then we can discuss local models realistically instead of circlejerking in an echo chamber.
>>
>>108001623
>>108001911
>>108002346
I think the main issue is catjak absolutely dragging ani's name through the mud. ani doesn't do drama or shilling itt. the second thing is you can just ask him in /adt/.
>>
File: file.png (3.15 MB, 1920x1280)
>ask for pentagram
>get jewish star
>>
are localtards falling for the bait or is it just samefagging? kek
>>
>>108002754
He replies to himself, he's going to do this so he can alter the OP for one thread during the day to feel good about himself.
>>
please stop dramashitting already
>>
>>108002754
Thread has been dead since ZiB released and flopped, he is probably samefagging.
>>
>>108002699
Welcome back ani, how did you sleep today?
>>
>hahaha you are a puny insignificant miserable outdated fool who will be left behind!!
>noo i am upset, leave this thread now you off-topic poster! look how upset i am!
update the script
>>
>>108002699
Hi julien
>>
https://huggingface.co/OpenMOSS-Team/MOVA-720p
https://mosi.cn/models/mova

New video model that also does audio. Looks better than ltxv 2, it's a 32B MoE though (18B active)
>>
>>108002803
>18B active
Oh cool, so it's like LTX 2 but it rapes my pagefile twice as much?
>>
Klein is almost perfect, it doesn't even need loras to turn anime into photo-real, but if it's even a bit stylized it makes their heads huge and younger looking, has anyone managed to get around that? No amount of prompting I tried could fix that behavior and all the recent realism transforming loras either do nothing, or not fix this issue.
>>
ani cannot stop spawning schizos
>>108002302
>>
whatever happened to AceStep 1.5? someone was shilling the release for days and then nothing
>>
>>108002803
benchod
>>
>>108002803
>https://mosi.cn/models/mova
what the fuck, these samples look awful.
>>
>>108002699
Hey, someone built a better sd.cpp ui than you, i think you should just shut up now >>108001623
>>
What schizos are real and what schizos only exist in the mind of those schizos
>>
>>108002803
Shameful display.
>>
>>108002752
I was listening to https://www.youtube.com/watch?v=B8klPYjS3ws and here is your post all of a sudden. Love such coincidences.
>>
>>108002831
that's a python wrapper. it's shit and also if the author didn't like python why did he use it? fucking retarded
>>
File: Untitled.png (18 KB, 871x122)
>>108002827
They said in 2-3 more days 2 days ago. This was the last message from the developer specifically regarding its release and that was about 2 hours ago.

My gut tells me not this weekend. The only people that have the code now are certified "influencers" and developers. Comfy should have the weights right now but he doesn't come here any more because of the schizos so there's no way to know.
>>
>>108002803
Oh! Another failed model! Thanks, anon!
>>108002827
Looks like the shilling anon changed jobs.
>>
>>108002848
Yeah but it didn't take him over 2 years for a mess that does crash all the time and doesn't even compile like your shitty wrapper
>>
There is no day 1 without ComfyUI support. Deal with it.
>>
>>108002852
its wan 2.2 with audio. I expect it WILL be better than ltxv....


