/g/ - Technology



80b Edition

Discussion of Free and Open Source Text-to-Image/Video Models and UI

Prev: >>106706484

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/sd-scripts/tree/sd3
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts
https://github.com/tdrussell/diffusion-pipe

>WanX
https://comfyanonymous.github.io/ComfyUI_examples/wan22/
https://github.com/Wan-Video

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
Training: https://rentry.org/mvu52t46

>Neta Lumina
https://huggingface.co/neta-art/Neta-Lumina
https://civitai.com/models/1790792?modelVersionId=2203741
https://neta-lumina-style.tz03.xyz/

>Illustrious
1girl and Beyond: https://rentry.org/comfyui_guide_1girl
Tag Explorer: https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbours
>>>/aco/csdg
>>>/b/degen
>>>/b/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
>>
TELL ME NUNCHAKU
WHERE IS THAT WAN QUANT
WHERE IS THAT QWEN LORA SUPPORT
>>
>>106708345
>WHERE IS THAT QWEN LORA SUPPORT
here?
https://huggingface.co/nunchaku-tech/nunchaku-qwen-image-edit-2509
>>
Blessed thread of frenship
>>
File: 1736584904914097.png (1.93 MB, 1728x1344)
>>106708106
Alright here's some (retarded) napkin math.

HDM is 340m params for $650.
Lumina2 and sd35m are both ~2.5b params. They both show strong capabilities at their size while suffering from training data and architecture problems, so an HDM-like model with its superior architecture and better training data could easily become local SOTA at 2.5b.

$650*2.5b/340m = $4779

HDM is undertrained according to the author, so let's double that budget. I suspect we hit severe diminishing returns with a budget >$10k. With current tech, we can get crowd (or richfag) funded local SOTA with <$10k. Keep in mind there are other optimizations that aren't used in HDM, so if one or two other 10x optimizations can be incorporated, we are in very good shape.
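The napkin math above as a few lines of Python, in case anyone wants to plug in other sizes. This is only a sketch: it assumes cost scales linearly with parameter count, which is the post's simplification rather than a measured scaling law, and `scaled_cost` is a made-up helper name.

```python
def scaled_cost(base_cost_usd: float, base_params: float, target_params: float) -> float:
    """Extrapolate training cost, assuming it is linear in parameter count."""
    return base_cost_usd * target_params / base_params

# Figures quoted in the post above.
HDM_COST_USD = 650
HDM_PARAMS = 340e6
TARGET_PARAMS = 2.5e9

estimate = scaled_cost(HDM_COST_USD, HDM_PARAMS, TARGET_PARAMS)
doubled = 2 * estimate  # the "HDM is undertrained" allowance

print(f"linear estimate: ${estimate:,.0f}")
print(f"doubled budget:  ${doubled:,.0f}")
```

The doubled figure still lands under the $10k mark mentioned above.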

>>106708232
SDXL anime models like noob are the most fun by far.
I've mostly stopped using non-anime SDXL models unless I need a controlnet or something. I still use SDXL for really big upscales since it has the tile controlnet and can fit the whole gen in memory while not taking too long. There are some impressive non-anime SDXL checkpoints out there, but the prompt comprehension issues and lack of art style knowledge hurt it significantly compared to Chroma.

>>106708267
what case/mobo do you use for this? I've been wondering if it's possible to fit a dual gpu setup in a mid tower these days
>>
File: ComfyUI_18499.png (2.43 MB, 972x1728)
>>106706788
It got that soft/grubby SRPO look... they really need to show it doing something a little more impressive than stock photography. Maybe a complex interaction or something, I don't know.
>>
File: 1741778495766362.jpg (794 KB, 3072x2408)
https://xcancel.com/SD_Tutorial/status/1970518843048272293#m
what's happening here? why is SRPO completely fucked on fp8?
>>
>>106708383
Sex with Jenny
>>
>>106708384
Maybe like 3-4 steps?
>>
File: 00112-3245483784.png (693 KB, 888x1008)
>>
File: 1000010860.png (2.8 MB, 1728x1344)
>>106708376
Where's the dataset coming from?
>>
File: interpolated_00064.png (1.18 MB, 720x1280)
ÆÜGH Icum

>Error: Maximum file size allowed is 4MB
https://files.catbox.moe/f7x5jz.mp4
>>
>>106708429
ask chatgpt to make a script that compresses your video to 4mb
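A rough sketch of what such a script could look like: pick a video bitrate from the clip length so the file stays under the 4MB limit, then hand that to ffmpeg. Assumptions here: ffmpeg is installed, audio is dropped with `-an`, and a 10% safety margin covers container overhead. The function names are made up, and single-pass bitrate targeting is approximate, so check the output size.

```python
import shlex

def target_video_kbps(duration_s: float, max_mb: float = 4.0, safety: float = 0.9) -> int:
    """Video bitrate in kbit/s that should keep a clip of duration_s under max_mb."""
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    total_bits = max_mb * 1024 * 1024 * 8 * safety  # size budget minus headroom
    return int(total_bits / duration_s / 1000)

def ffmpeg_command(src: str, dst: str, duration_s: float) -> list[str]:
    """Build (not run) an ffmpeg command line targeting the computed bitrate."""
    kbps = target_video_kbps(duration_s)
    # -y overwrites dst, -an drops audio, -b:v sets the target video bitrate
    return ["ffmpeg", "-y", "-i", src, "-an", "-b:v", f"{kbps}k", dst]

# Example: an 11 second clip, with hypothetical file names.
print(shlex.join(ffmpeg_command("in.mp4", "out.mp4", 11)))
```

Pass the list to `subprocess.run` to actually encode. Halving the duration (for example by turning pingpong off) doubles the bitrate you can afford.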
>>
>>106708415
literally just use the danbooru api if you just want a booru model
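A minimal sketch of what "just use the danbooru api" might look like. The `posts.json` endpoint and the `file_url` / `tag_string` fields are written from memory of the public API, so treat them as assumptions and check the current API docs (and rate limits) before scraping. The helper names are hypothetical and no network call is made here.

```python
from urllib.parse import urlencode

DANBOORU_POSTS = "https://danbooru.donmai.us/posts.json"  # assumed endpoint

def posts_url(tags: str, limit: int = 100, page: int = 1) -> str:
    """Build a paginated post-search URL for a space-separated tag query."""
    query = urlencode({"tags": tags, "limit": limit, "page": page})
    return f"{DANBOORU_POSTS}?{query}"

def to_training_row(post: dict) -> dict:
    """Reduce one post record to the parts a tag-captioned dataset needs."""
    return {
        "image_url": post.get("file_url"),           # may be absent on some posts
        "tags": post.get("tag_string", "").split(),  # space-separated tag list
    }

print(posts_url("1girl solo", limit=20))
# Hand-written sample record, not real API output.
print(to_training_row({"file_url": "https://example.invalid/a.png",
                       "tag_string": "1girl solo outdoors"}))
```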
>>
>>106708376
I have a full size case with a full sized motherboard with two GPU slots. I prefer extra space for easy moving over seeing how tight I can fit everything.
>>
File: 1747876613603820.png (3.61 MB, 1728x1344)
>>106708415
catbox please?
for the anime side of things, a booru dataset augmented with NL captions (but not replacing the tags).
for the rest, desu, I don't see why something like LAION wouldn't work. it was fine for sd1.5. then augment it with some better captions, aesthetic selection.. use srpo or whatever too

>>106708486
damn... maybe I should give up on my dual gpu idea. don't really want a behemoth on my desk.
>>
>>106708413
All computer metrics are bunk in the scheme of generative models. The only real way to measure a model is promptability, level of censorship, and breadth of conceptual and stylistic capacity, ultimately distilled as "usability". Right now all the metrics are biased, essentially designed around making stock images. If you want to see how shit aesthetics filtering is, just take 20k images from any booru and see which ones standard aesthetics metrics consider low quality.
>>
>>106708528
honestly i don't blame you for assuming gpu sizes either way, you really don't have a full idea of how huge or small gpus are until you actually get one in your hands.
then you totally forget once it's been in your system for a year+.
dual gpu'ing is not for the faint of heart. or wallet for that matter.
>>
>>106708528
Why would you want a space heater on your desk? Two 4090s running full tilt even throttled gets quite toasty.
>>
File: grid-0002.jpg (2.21 MB, 7440x5376)
trying to find a method to stop style swing but it's really bad in some cases, especially at random seeds that try to be irl during the first pass. I'm giving this model one last chance before dumpstering it
>>
>>106708358
Lora support anon, that is still being worked on. You can use the model fine, though I think they messed up the lightning merge in that version
>>
>>106708429
why did that video need to be 11 seconds if its the same motion, REEETARD
>>
>>106708528
I use a basic old raidmax smilodon case from like 15-20 years ago and it fits modern GPUs fine
>>
having a danbooru dataset with the tags grouped by subject, background and interaction would make even SDXL based models exponentially better
we just need a powerful VLM like Gemini but uncensored
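To make the grouping idea concrete, here is a toy sketch. The `TAG_GROUPS` lookup table is entirely hypothetical and stands in for whatever a VLM or a curated tag taxonomy would provide. It only shows the target data shape, not a real classifier.

```python
from collections import defaultdict

# Hypothetical hand-written lookup: a real pipeline would get these
# assignments from a VLM or a maintained tag taxonomy.
TAG_GROUPS = {
    "1girl": "subject",
    "long_hair": "subject",
    "outdoors": "background",
    "forest": "background",
    "holding_hands": "interaction",
}

def group_tags(tags, lookup=TAG_GROUPS):
    """Bucket a flat tag list into subject / background / interaction / other."""
    grouped = defaultdict(list)
    for tag in tags:
        grouped[lookup.get(tag, "other")].append(tag)
    return dict(grouped)

print(group_tags(["1girl", "forest", "holding_hands", "sketch"]))
```

Anything missing from the lookup falls into "other", which is the part an uncensored VLM would have to sort out.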
>>
File: 1744480545629914.png (2 MB, 1000x2044)
>>106708703
nai seems to do something like that
https://docs.novelai.net/en/image/multiplecharacters

however, nai's implementation kind of looks like it's just calling on a regional prompting addon at least some of the time.

reminder that the forge regional prompting addon is able to generate region masks FROM THE PROMPT ITSELF, and we still don't have a comfyui equivalent even though this kicks ass:
https://github.com/hako-mikan/sd-webui-regional-prompter?tab=readme-ov-file#region-specification-by-prompt-experimental
>>
>>
>>106708799
how could you gen this absolute filth?
reported, filtered, snitched on, sent the batsignal
>>
>>106708827
I prompted black monolith lol
>>
Is there a straightforward way to get SD working on linux with an AMD card? I'm following the wiki installation and I kept running into issues
>>
>>106708799
sovl and kino
>>
>>106708799
That's the most disgusting thing i've ever seen on 4chan, and I'm an oldfag. You should be ashamed.
>>
>>106708415
>>106708528
NTA and also asking for catbox, thanks
>>
>>106708799
The best image posted in a long while
>>
File: IMG_2311.jpg (74 KB, 934x2000)
>>106708328
I'm in the OP
>>
>gm
>>
File: file.png (76 KB, 558x845)
I've set up Qwen Image Edit but it's maxing out my VRAM. I've tried launching Comfy with and without vram saving parameters. Is the Q8 model too big for a 3090 (24gb vram)?
>>
File: IMG_2370.png (695 KB, 678x907)
>>106708883
>>
>>106708883
..no? i have 16gb vram and can use q8 fine
>>
>>106708844
>make venv
>follow these instructions https://github.com/comfyanonymous/ComfyUI?tab=readme-ov-file#amd-gpus-linux-only
if that doesn't work, you will have to tell us more. card, distro, UI you're attempting to use. these steps worked on my old rx 6800 and my 7900 xtx.
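Before posting card/distro details, a quick sanity check run inside the venv can tell you whether the installed PyTorch is actually a ROCm build and whether it sees the GPU. Sketch only: `describe_torch_backend` is a made-up helper, and it relies on `torch.version.hip` being `None` on non-ROCm builds.

```python
def describe_torch_backend() -> str:
    """Report whether the installed PyTorch is a ROCm build that can see a GPU."""
    try:
        import torch
    except ImportError:
        return "torch is not installed in this environment"

    hip = getattr(torch.version, "hip", None)  # None on CPU-only and CUDA builds
    if hip is None:
        return f"torch {torch.__version__} is not a ROCm build"
    # ROCm builds expose the GPU through the torch.cuda namespace.
    if not torch.cuda.is_available():
        return f"torch {torch.__version__} (ROCm {hip}) sees no GPU"
    return f"torch {torch.__version__} (ROCm {hip}) on {torch.cuda.get_device_name(0)}"

print(describe_torch_backend())
```

If it reports a non-ROCm build, the pip step from the linked README most likely pulled the default wheel instead of the ROCm one.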

>>106708883
try the fp8 scaled version instead. ggufs have a broken implementation on some cards, I have a similar issue on my 7900 xtx. fp8 scaled is faster anyway.
>>
>>106708872
>>106708892
You're a real piece of shit debo, it's also obvious that you like to bring up old irrelevant drama from other threads to force some conflict.
>>
>>106708931
Can't get it to work on arch with a 9070
>>
damn nigbo wthelly
>>
>>106708931
>try the fp8 scaled version instead
It's not faster on a 3000; on 4000/5000 it is.
>>
>>106708944
https://en.wikipedia.org/wiki/Nigbo_language
>>
File: Chroma2k-test_00050_.jpg (818 KB, 1184x1552)
>>
File: file.png (332 KB, 3225x693)
>>106708931
>>106708883
Do I need to run that pytorch step if I followed the Auto Installation? I can try it but idk if it will do anything
>>
>2025
>finetuned sdxl is still the best local model for realism and anime images
when are we gonna get an unslopped 4b-5b model with a permissive license?
chroma is slow dogshit that looks bad
seedream is the only new model that looks good but it's NOT local
>>
>>106708964
Can we not do this ritual post?
>>
File: interpolated_00067.png (1.01 MB, 720x1280)
>>106708650
based retard doesn't understand what the pingpong effect is (or that it's a setting in those nodes)
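For anyone else wondering, pingpong just means playing the frames forward and then backward so the clip loops without a jump cut, which is why the runtime roughly doubles. A minimal sketch, assuming the endpoints are not repeated (whether a given node repeats them is an assumption here):

```python
def pingpong(frames):
    """Forward pass followed by the reverse pass, without repeating the endpoints."""
    frames = list(frames)
    # frames[-2:0:-1] walks back from the second-to-last frame to the second frame.
    return frames + frames[-2:0:-1]

print(pingpong([0, 1, 2, 3]))
```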

anyway, GAAAHHH THE OOM IS EVERY GEN NOW CUMFARTUI YOU'RE PISSING ON MY LEG AND TELLING ME IT'S RAINING!

https://files.catbox.moe/dyugav.mp4
>>
>>106708959
reading comprehension anon. that pytorch setup is for the AMD linux user, not you. you should try the fp8 model instead of q8.
>>
>>106708941
>>106708931
Nevermind, got it to work with the manual install
Don't know why I bothered with the pip comfy-cli
Thanks
>>
>>106708959
>>106708883
oh yeah and if you're ever OOMing during VAE operations, replace VAE encode/decode with TILED VAE encode/decode
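Why tiling helps: the VAE only ever has one tile in memory at a time instead of the whole image. The sketch below just computes overlapping tile boxes. It is not ComfyUI's implementation, the tile and overlap sizes are illustrative guesses, and real tiled VAE nodes also blend the overlaps to hide seams.

```python
def tile_boxes(width, height, tile=512, overlap=64):
    """Return (x0, y0, x1, y1) boxes covering the image with overlapping tiles."""
    if not 0 <= overlap < tile:
        raise ValueError("overlap must be non-negative and smaller than tile")
    stride = tile - overlap

    def starts(size):
        if size <= tile:
            return [0]
        positions = list(range(0, size - tile, stride))
        positions.append(size - tile)  # final tile sits flush with the edge
        return positions

    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in starts(height)
        for x in starts(width)
    ]

boxes = tile_boxes(1728, 1344)
full_px = 1728 * 1344
tile_px = max((x1 - x0) * (y1 - y0) for x0, y0, x1, y1 in boxes)
print(len(boxes), "tiles, largest tile is", round(100 * tile_px / full_px), "% of the full image")
```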

>>106708999
nice
>>
>>106708844
I think I saw a new beta version of rocm pytorch released today, in theory getting that should make things really straightforward
>>
File: Chroma2k-test_00004_.jpg (877 KB, 1408x2064)
>>
File: 00114-2146955441.png (1.59 MB, 1008x888)
>>
File: Chroma2k-test_00007_.jpg (617 KB, 1408x2064)
>>
File: Chroma2k-test_00008_.jpg (726 KB, 1408x2064)
>>
>>106708772
>and we still don't have a comfyui equivalent
>https://github.com/asagi4/comfyui-prompt-control/blob/master/doc/regional_prompts.md
>>
>>106708772
sounds like DAAM -> Latent Couple
>>
File: 1753982932434490.png (635 KB, 1288x808)
>>
>>106708883
no? I use q8 on a 4080 with 16gb. it should be fine.
>>
File: 00659-39345720325.jpg (1.41 MB, 2688x2688)
>>106708772
Dude this is better
https://github.com/Haoming02/sd-forge-couple
>>
>>106709346
what's with the obsession with that random dude
>>
File: 00116-3716891349.png (1.03 MB, 1008x888)
>>
>>106709357
It's hard work to get to lolcow status
>>
>>106709357
It's obviously cause he worked at blizzard, duh. Real answer: the dude went on a weird tirade against getting game developers to create offline versions of games when they EOS them.
>>
https://huggingface.co/nunchaku-tech/nunchaku-qwen-image-edit-2509
>>
File: 1751386497849884.png (978 KB, 1080x652)
https://www.reddit.com/r/StableDiffusion/comments/1nravcc/nano_banana_vs_qwen_image_edit_2509/
damn the lightning lora really sucks
>>
>>106709355
NTA but thanks for reminding me of forge couple
would you pls catbox that image?
>>
>>106709409
Ahh..man. Coping they will make a newer lightning LoRA.
>>
>>106709452
I don't have it on this computer, this was during my laptop era
>>
>>106709465
i mean just any forge couple'd gen will do, but understandable
>>
File: 1742378170494040.png (1.05 MB, 616x1696)
>>106709460
8step one works fine in general for qwen edit v2.
>>
>>106709472
I'm not doing anything special and I don't give catboxes. It's a long story that becomes evident whenever you see the schizo screech the name ran.
>>
File: Chroma2k-test_00012_.jpg (857 KB, 1408x2064)
>>
File: WAN2.2_00068.mp4 (3.43 MB, 960x544)
>>
>>106709481
ran is the schizo
>>
File: 1731674190436019.png (1.19 MB, 1040x1000)
>>106709475
>>
>106709486
>time wasting post
More wheelchairs for you then
>>
>>106709492
thanks schizo (niggerjak)


