/g/ - Technology

Discussion of Free and Open Source Diffusion Models

Prev: >>108240824

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/tdrussell/diffusion-pipe

>Z
https://huggingface.co/Tongyi-MAI/Z-Image
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo

>Anima
https://huggingface.co/circlestone-labs/Anima
https://thetacursed.github.io/Anima-Style-Explorer/

>Klein
https://huggingface.co/collections/black-forest-labs/flux2

>LTX-2
https://huggingface.co/Lightricks/LTX-2

>Wan
https://github.com/Wan-Video/Wan2.2

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Illustrious
https://rentry.org/comfyui_guide_1girl
https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Collage: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg
>>
Do not post any gens in this trollbake.
>>
>>108248215
elaborate?
>>
>>108248224
nah
>>
File: 1764210514475.png (1.89 MB, 1280x1024)
repoastan:
Do I need to include the
>the focus of the image is..
>it's a medium closeup shot blah
in the captions or is it just token bloat?
Also, what LoRA training settings are good for a large dataset (200+ images)?
>>
File: 00070-2671663316.png (3.79 MB, 1536x2144)
>>
>julienbake
>>
File: 00074-3903908536.jpg (443 KB, 1536x2144)
>>
Local is dead
>>
>>108248364
>Local is dead
we just got an LLM revolution with Qwen 3.5, Alibaba is cooking. Wait for them to finish their Chinese New Year vacation and they'll give us Qwen Image 2.0 and Z-Image Edit. Trust the plan.
>>
File: 00072-1967938115.jpg (540 KB, 1536x2144)
>>
wan 2.6 is coming soon too! just keep praying. hell, maybe we should buy some cloud credits just to show alibaba how much we love their models. i just donated stardust for illustrious 3.5 open source. vote with your wallets!!!!
>>
>>108248382
bot
>>
>>108248399
I'm not a bot, I can say faggot kek
>>
File: 00055-921742394.png (3.81 MB, 1536x2048)
>>108248382
I'm very worried about that Qwen Image 2.0. The results seem somewhat inferior to Qwen Image 2512, and it has a serious East Asian bias.
>>
>>108248199
Why is AniStudio not in OP?
>>
>>108248472
the important part will be the edit model. Like Klein: completely useless as a pure image model, great at editing.
>>
>>108248382
how did qwen 3.5 revolutionize llms?
>>
>>108248490
>>108248368
>>108248401
>>
File: 00080-2529363245.jpg (412 KB, 1944x1944)
>>
File: 00078-1327482647.jpg (471 KB, 1944x1944)
>>
>>108248496
No, maybe 0.01% better than Haiku or GPT mini at some hyper-specific task.
For RP and ERP, "local models = API models" is the biggest cope meme ever.
Local LLMs will never match API models' overall quality, reasoning, and capabilities.
>>
Has anyone tried a dual 3090 setup? I'm thinking about doing it for a meaty 48 GB of VRAM. Could I do it on my 10th-gen Intel motherboard?
>>
File: farmgirl.webm (3.92 MB, 792x1320)
>Local textgen eating good with Qwen 3.5
>Local video still stuck on an 8-month-old model.

sigh...
>>
>>108248597
I know someone with a 3x 3090 setup.
2x is probably the max you can fit in a normal PC, provided you have something like a 1200 W PSU.

Anything more and you'll need to start doing some kind of mining-rig setup.
>>
>>108248597
I'm on a 24+32GB setup, but I'm also using low power workstation cards. As the other anon points out, your biggest worry is going to be power.
>>
File: hero.png (237 KB, 720x870)
>>108248654
>>108248691
I can always buy a bigger PSU. I'm worried that my motherboard's chipset will cuck the cards and make them run significantly slower.

I'm using an Asus Z490-F.
>>
>>108248751
Check the dimensions first to see whether the cards even fit.
>>
>>108248597
to do what? has there been a breakthrough for layer distribution?
>>
>>108248811
>I'm thinking about doing it for a meaty 48 GB of VRAM
doesn't work like that, it will still be two 24gb cards for all intents and purposes
>>
>>108248811
>>108248826
So how are these fags on youtube building multi 5090 setups and taking full advantage of it for AI?
>>
>>108248844
Maybe it works for LLMs because they don't need that much interconnect speed, but if the whole model doesn't fit on each GPU by itself, the cards have to exchange data constantly, creating a colossal bottleneck (just like RAM fallback does).
>>
>>108248844
The youtube fags are mostly doing LLM stuff, which does benefit from pooling multiple GPUs. With diffusion models, you're still limited by the VRAM of a single card, but you can split components up (model/clip/vae) to avoid offloading shenanigans.
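The component split described above can be sketched as a toy placement planner. This is illustrative only: the component sizes and device names below are made up, and real UIs handle placement with their own device-override nodes rather than anything like this.

```python
def plan_placement(components, gpus):
    """Greedily assign each component to the GPU with the most free VRAM.

    components: dict of name -> size in GB
    gpus: dict of device name -> total VRAM in GB
    Returns dict of component name -> device; raises if something doesn't fit.
    """
    free = dict(gpus)
    placement = {}
    # Place the largest components first so the big diffusion model
    # gets a whole card to itself.
    for name, size in sorted(components.items(), key=lambda kv: -kv[1]):
        device = max(free, key=free.get)
        if free[device] < size:
            raise RuntimeError(f"{name} ({size} GB) does not fit on any GPU")
        placement[name] = device
        free[device] -= size
    return placement

# Hypothetical dual-3090 box: diffusion model on one card,
# text encoder + VAE on the other (sizes are made-up examples).
plan = plan_placement(
    {"diffusion_model": 14.0, "text_encoder": 9.0, "vae": 0.3},
    {"cuda:0": 24.0, "cuda:1": 24.0},
)
```

The point of the greedy order is that the single biggest component (the diffusion model itself) still has to fit on one card; the second GPU only relieves pressure from the text encoder and VAE.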
>>
>>108248874
shiiiiiet. back to the drawing board, i guess.
>>
>>108248751
Your mobo won't cuck the cards. The PCIe bandwidth is fast enough.
>>
>>108248751
I had a 4090 and a 3090 in a 9900K system, both in waterblocks, so they were in slots 1 and 2 (the bottom slot is controlled by the southbridge, which is 4x). Model load times feel slow, and time-to-first-token is higher than on a PCIe 4.0 setup, but not that bad. Actual tok/s were really good; apparently the way ollama and LM Studio parallelize the layers across each card means there is barely any crosstalk during generation. I imagine performance would be shit for training, but I never tried that. I took the 3090 out a few weeks back to run a few tests on the single 4090 and have been too lazy to drain/refill the loop to put it back in.
>>
File: 1552080261076.jpg (29 KB, 400x400)
What retard said nearest-exact is good? Total ass compared to bislerp.
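The difference between those resamplers is easy to see in 1-D: nearest duplicates samples into blocky steps, while linear interpolates a smooth ramp. This pure-Python sketch only shows that contrast; bislerp additionally slerps latent channel vectors, which isn't attempted here.

```python
def nearest_1d(v, n):
    """Upsample v to n samples by duplicating the nearest source sample."""
    m = len(v)
    return [v[min(int(i * m / n), m - 1)] for i in range(n)]

def linear_1d(v, n):
    """Upsample v to n samples with align-corners-style linear interpolation."""
    m = len(v)
    if n == 1:
        return [v[0]]
    out = []
    for i in range(n):
        x = i * (m - 1) / (n - 1)   # position in source coordinates
        j = int(x)
        t = x - j
        j2 = min(j + 1, m - 1)
        out.append(v[j] * (1 - t) + v[j2] * t)
    return out

print(nearest_1d([0, 10], 4))  # blocky: [0, 0, 10, 10]
print(linear_1d([0, 10], 4))   # smooth ramp, roughly [0.0, 3.33, 6.67, 10.0]
```

The same effect in 2-D is why nearest-exact upscales of latents look like hard tiles while bilinear/bislerp transitions stay smooth.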
>>
>>108248997
I'm trying to think of any cheap solution for increasing my VRAM for image gen and image training, but it seems like the only real solution is to buy a 5090 or (gasp) an RTX PRO 6000, which costs more than my actual car.
>>
>>108249195
you can't split models like that for image/video gen. well, technically you can, but all the compute happens on one card, so then you are REALLY PCIe-bus limited or reliant on NVLink (good luck getting that to work lol). it's 5090/6000 Pro or bust.
>>
File: 1750558443545402.png (1.42 MB, 1849x1165)
Qwen 3.5 is great at captioning, and with the heretic method it can caption NSFW without issues. Since it's an MoE with only 3B active parameters, it's hella fast:
>eval time = 3872.25 ms / 256 tokens ( 15.13 ms per token, 66.11 tokens per second)
https://huggingface.co/mradermacher/Qwen3.5-35B-A3B-heretic-GGUF
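As a quick sanity check, the numbers in that llama.cpp eval line are self-consistent:

```python
# Figures from the llama.cpp eval line quoted above.
total_ms = 3872.25
tokens = 256

ms_per_token = total_ms / tokens                  # matches the reported 15.13
tokens_per_second = 1000.0 * tokens / total_ms    # matches the reported 66.11

print(f"{ms_per_token:.2f} ms/token, {tokens_per_second:.2f} tokens/s")
```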
>>
>>108248515
there's something off about this
>>
>>108249195
>cheap solution to increasing my VRAM
PFFFT HAAAAAAHAHAHAHAHAHAHAAHAHAH
>>
full circle
>>
File: ComfyUI_12574.png (2.96 MB, 1440x2160)
>>108249248
>what appears to be
Get rid of this kind of crap if you want GOOD captions though.

>t. Qwen 3 30b Q8 user
>>
>>108249344
>replace "which appears to be" with "which is" in Notepad
problem solved
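That find-and-replace is easy to script over a whole caption folder. A minimal sketch; the hedge-phrase list is illustrative, not exhaustive, and the ordering matters ("what appears to be" must run before the bare "appears to be"):

```python
import re

# Hedging phrases VLM captioners tend to emit (illustrative list).
HEDGES = [
    (r"\bwhat appears to be\b", ""),
    (r"\bwhich appears to be\b", "which is"),
    (r"\bappears to be\b", "is"),
    (r"\bseems to be\b", "is"),
]

def dehedge(caption: str) -> str:
    """Strip hedging phrases from a caption and tidy the whitespace."""
    for pattern, repl in HEDGES:
        caption = re.sub(pattern, repl, caption, flags=re.IGNORECASE)
    # Collapse the double spaces left behind by empty replacements.
    return re.sub(r"\s{2,}", " ", caption).strip()

print(dehedge("A photo of what appears to be a barn, which seems to be red."))
# -> A photo of a barn, which is red.
```

Run it over each `.txt` caption file before training and the dataset stays free of hedge bloat.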
>>
>>108248599
That's how it goes. A drought in imggen, a tidal wave for textgen. Then later, the reverse. It's a back and forth. A sort of yin and yang if you will.
>>
>>108249248
> with the heretic method it can caption NSFW prompts without issues
Can you prove that with more than just a 1girl standing image?
>>
i wish anon would raise his standards for his own gens
>>
File: image.png (55 KB, 531x540)
> git pull \n uv pip -r
> ltx2 workflow doesn't work
Excuse me, what the fuck?
>>
>>108248199
Why have the last two faggollages felt more like cringe participation trophies than a based "best of"?
>>
>>108249452
here goes nothing
https://files.catbox.moe/6su54j.png



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.