/g/ - Technology

Discussion of Free and Open Source Diffusion Models

Prev: >>107843132

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/kohya-ss/sd-scripts
https://github.com/tdrussell/diffusion-pipe

>Z Image Turbo
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo

>WanX
https://github.com/Wan-Video/Wan2.2
https://comfyanonymous.github.io/ComfyUI_examples/wan22/

>LTX-2
https://huggingface.co/Lightricks/LTX-2

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>NetaYume
https://huggingface.co/duongve/NetaYume-Lumina-Image-2.0
https://nieta-art.feishu.cn/wiki/RZAawlH2ci74qckRLRPc9tOynrb

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe|https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
>>
>>107846749
thanks for the bake anon
>>
You made sure to only include 4 images benchod
>>
>>107846749
>>Maintain Thread Quality
>https://rentry.org/debo
>https://rentry.org/animanon
why is this still being baked in the OP? it just invites drama
>>
Garbage op. Snubbed again. Although tbf I didn't post anything last thread
>>
https://github.com/Comfy-Org/ComfyUI/pull/11829
KJ God saved us once again btw, now I can go for only 2 gb of offload instead of 5
>>
File: Comfy_00031.png (2.22 MB, 1071x1033)
>>
>>107846769
Leaving it out causes drama too
>>
>>107846773
https://github.com/Comfy-Org/ComfyUI/pull/11748
there's also that PR waiting to be merged, dunno if it's gonna help even more but if it does I'll take it
>>
>>107846769
you had time to make the thread if you wanted, which you did and are samefagging
>>
>>107846780
it's easier for the mods to nuke one schizo than god knows how many
>>
>>107846749
only 4 images in the Op? what an absolute faggot of a baker.
>>
threads too popular i miss when it was more niche
>>
>>107846785
maybe we shouldn't bake a new one at page 1? is that too much to ask?
>>
>>107846790
didnt ask
>>
>>107846778
I wish
>>
>>107846805
>missed the point award
>>
>>107846778
would
>>
File: 1761707941135045.png (303 KB, 500x502)
>>107846778
Is this a reference to Kafka - Metamorphosis book?
>>
>>107846818
Being a cute pokemon is better than being a roach even if their situations are the same, faggot. Not replying any further.
>>
>>107846827
yes!
>>
>>107846749
should i use Regularisation images when training an anime style lora? and if so where do i even get them from
>>
>>107846827
no shit sherlock
>>
>>107846828
cry me a river
>>
>>107846835
>everyone read Kafka
oh god I wish, there would be way less retarded people on earth if it was true
>>
https://files.catbox.moe/kwv9pc.mp4
>>
>>107846841
>I'm a roach Morty! I'm roach Gregor!!
Wow such amazing plot
>>
Blessed thread of frenship
>>
File: 1768118287221857.png (39 KB, 201x251)
>>107846854
a lot of philosophers love to talk about roaches somehow
>>
Please refrain from responding to the anon attempting to slide this thread.
>>
>>107846870
eh, I let beetles and spiders live in my house, roaches are dirty but i still try my best to get them out instead of killing them
>>
reminder LTX2 is amazing.

https://files.catbox.moe/0q28jd.mp4
>>
>>107846925
>didn't say the n word, some slur towards lgbt+ folx or some other /pol/ shit
this is new
>>
File: kafkajak.png (1.09 MB, 896x1152)
>>everyone read Kafka
>oh god I wish, there would be way less retarded people on earth if it was true
>>107846870
Not real quote, btw.
>>
HAHAHA

it knows spongebob and patrick natively, we need a list of what voices LTX2 knows. anyone have a list?

https://files.catbox.moe/swy22q.mp4
>>
>>107846941
prompt?
>>
>>107846944
it is overtrained on cartoons
>>
File: 1752958300641493.png (2.01 MB, 1216x1280)
z-image omni? hah, great joke
>>
>>107846946
>A comic character illustration. Upper body of a caricature man. He has a very arrogant smirk. He wears three, very large award pins. The one on the right says: "READ KAFKA AWARD". The one on the center says: "INTELLECTUAL AWARD". The one the left says: "GETS REFERENCES AWARD". He is wearing a black jacket and a black fedora hat. He has his arms crossed. The man has very dirty, unkempt beard. He is looking at the viewer. Insane meme.
Needs the jak lora too.
>>
>>107846769
You spend all day cryin lil bro, can you please give some positive vibes instead of doing that?
So many positive changes happening especially in the software space and here you are being salty not being positive vibes mon.
>>
>>107846950
it also does a perfect Trump if you i2v with him, even the mannerisms. Now i'm wondering what list of characters it knows. Cause I didn't provide the spongebob voice.

hahahaha, this is a gold mine.

https://files.catbox.moe/9nyy17.mp4
>>
>>107846960
evil benchod fuck you
>>
Spongebob says "Hey Patrick! Let's open a learing center!". Patrick says "did you mean LEARNING center?". Spongebob says "haha, not in somalia".

it even made his hat lol, this is great. using q8 ltx2 from kijai's repo and I only have 16gb vram (4080) and 64gb ram. If you have the memory you can load it all without issue.

https://files.catbox.moe/zq4go1.mp4
>>
File: irony.png (1.6 MB, 1765x961)
>>107846941
>Not real quote, btw.
>>
>>107846959
why lie
>>
>>107846957
omni looks useless, but the SFT one (Z-image) will be a better starting point to make finetunes
>>
LTX2 is amazing. I *need* to know what other characters work for voices.

https://files.catbox.moe/9uwlja.mp4
>>
>>107846638
I eventually got it to run. Switching to one of the fp8 options (doesn't seem like there is any noticeable difference between them) in the quant_format produces images similar to running it normally.
Unfortunately, no speed improvement on my Ampere GPU. Disappointing but expected.
Maybe there is a very low chance that one of the dozen other variants they made will run faster, but I don't really feel like bothering with testing all that for what I believe to be a very slim chance. My curiosity is sated for now.
End of blogpost.
>>
https://www.reddit.com/r/StableDiffusion/comments/1qatuni/ltx219bdistilled_vs_ltx219bdev_distilledlora/

distill lora at 0.6 is better than distill model or distill lora at 1.0
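
For anyone wondering what the 0.6 actually changes: a rough sketch, assuming the usual low-rank formulation where the patched weight is base + strength * (up x down). The matrices and function names are made up for illustration, and real loaders also fold in an alpha/rank scale that is left out here.

```python
def matmul(a, b):
    """Plain-Python matrix product of two nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def apply_lora(weight, lora_up, lora_down, strength):
    """Return weight + strength * (lora_up @ lora_down). Hypothetical helper."""
    delta = matmul(lora_up, lora_down)
    return [
        [w + strength * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Toy 2x2 base weight with a rank-1 LoRA (illustrative numbers only).
base = [[1.0, 0.0], [0.0, 1.0]]
up = [[1.0], [2.0]]    # shape 2x1
down = [[0.5, 0.5]]    # shape 1x2

full = apply_lora(base, up, down, 1.0)     # whole delta applied
partial = apply_lora(base, up, down, 0.6)  # 60% of the delta applied
```

So 0.6 just scales the distill delta linearly; whether that looks better than 1.0 is an eyeball call, as in the linked comparison.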
>>
>>107847003
>grasping at straws to one up after LARPing as bespoke literature connoisseur and posting fake quote popular on plebbit
Kek, had a laugh
>>107847018
Do you want catbox or something schizo?
>>
>>107846967
>*currynigger detected*
>>
How do you draw the controlnet mask for an image like this? Do you just draw the mask over both their arms? Is mask overlap acceptable?
>>
>>107847049
that's the point, you must get the references and be an actual intellectual to deboonk false philosophical quotes, hence the funny irony, hope that helps
>>
File: qwen_image_2512_00015_.png (2.68 MB, 1440x1120)
new qwen is the first model I've tried that obeys this prompt decently. especially the stabbing, it's better at gore/violence than other models for some reason.

black and white political cartoon scanned from an old newspaper.
on the left, a palestinian man wearing a keffiyeh scarf around his neck is screaming and writhing in pain and agony and bleeding from a knife wound in his back. On the right, an Orthodox Jewish Rabbi with sidelocks and an orthodox jewish suit holds the hilt of the knife in the palestinian's back with right-handed overhand grip and stabs it deeply into the palestinian. the rabbi is looking back and happily calls out with his cupped left hand next to his shouting mouth, with a speech bubble saying "Help! This goy is attacking me!", sneering. the pommel of the knife has a small jewish Star of David design inscribed on it.
>>
>>107847042
the lora is gigantic though, I'd prefer someone to merge that shit into the model instead
>>
so this week we have both GLM-Image and Z-Image-Omni-Base release?
>>
>>107847068
>a palestinian man
he looks like a jew though, first of all a palestinian is brown
>>
>>107847075
source on getting base this week?
>>
>>107847085
it was revealed to me in a chinese dream
>>
>>107846882
I'll spend 15 minutes trying to catch some asshole spider so I can relocate him outside but I have a zero tolerance policy for roaches. Cockroaches aren't from this planet and must be destroyed.
>>
hype from social media posts means nothing
show me those sweet sweet commits
>>
>able to run video upscalers up to 4k resolution just fine
>use a frame interpolation node and it shits the bed almost immediately
Why is it so hard to just make 60 fps videos
>>
File: kek.png (79 KB, 2058x281)
>>107847085
>source on getting base this week?
how about some random trooncord post
>>
File: z_image_turbo_00077_.png (2.24 MB, 1440x1120)
>>107847068
Z... kek. ZIT is good for some stuff, but if you can fit it in VRAM, qwen + lightning 8 step is far superior at prompt following and coherence, while being about as fast. their styles are equally slopped, just in different ways. a chroma i2i pass helps fix the slop.
>>
>>107847085
(C|H)opium from the cryptic "Patience will be rewarded" post in their discord and bdsqlsz tweet about an open source model being released this week.
>>
>>107847075
>so this week we have both GLM-Image
they're waiting for the commit to be merged, and only god knows when it's gonna happen
https://github.com/huggingface/transformers/pull/43100/files
>>107847110
and a new commit merged on Modelscope
https://github.com/modelscope/DiffSynth-Studio/commit/0efab85674f2a65a8064acfb7a4b7950503a5668
>>
>>107847120
have we seen even one example gen of this GLM Image?
>>
>>107846638
I struggle to believe that a 6b model and 20B one would run at the same speed. Are you running qwen nvfp4?
>>
>>107847130
nope, just like with Z-image base they're suspiciously silent about that, which is a bad sign imo. if they're not proud of the output of their models it means that it must be really really mid
>>
>>107847135
Tagged wrong post >>107847107
>>107847120
I mean yes but we had various Z-Base related commits since early December.
The primarily interesting thing here is that it has checksums which implies they finished training/finetuning.
>>
>>107846788
it is not a participation trophy
>>
is base out yet?
>>
File: DO IT.png (42 KB, 236x213)
>>107847154
>they finished training/finetuning.
then what are they waiting for???
>>
File: maybe?.png (2.84 MB, 1920x1080)
>>107847130
>have we seen even one example gen of this GLM Image?
there's a new model on the Arena called "Goldfish" and it might be GLM-image
https://xcancel.com/testingcatalog/status/2008576286638424387#m
>>
>>107846827
No, just a coincidence.
>>
>>107846788
Try making some better images next time. If you want to go to the anime diffusion thread where 65% of all posts make it into the OP collage, you are welcome to.
>>
Tongyi tongue my anus
>>
File: FUCK.png (1.04 MB, 869x945)
How many Qwen Image versions must we get before the release of Z-image base??
>>
>>107846790
How was it before it got popular?
>>
File: file.png (1.51 MB, 1570x908)
>>107847181
tried it until i rolled one, left one was goldfish
i'm trying to roll a photorealistic one though, unlucky so far
>>
>>107847260
less slop and more kino
>>
File: x_zvx636.png (1.72 MB, 2048x1024)
>>
How does Z Image Edit compare to Qwen Image Edit?
>>
File: delux_flebo_00012_.png (1.36 MB, 1216x832)
>>107847061
>DEBOonk
#mentioned
>>
>>107847267
since it's an autoregressive model you can give a very vague prompt and get something sophisticated, try to verify if it has this AR capability
>>
Since A1111 is basically dead. Anyone know the next best alternative?
I'm not picky about control and comfy is too much for my baby brain to handle.
>>
I need to know what other characters LTX2 knows natively, for audio.
>confirmed: Trump, Spongebob, Patrick

piece of shit catbox wont load the video, so here is a streamable example.

https://streamable.com/7uzltt
>>
>>107847294
We'll never know
>>
File: WANI2V_INT_00009.mp4 (3.49 MB, 1136x920)
>>107847164
>>107847225
>*Ha... ha... ha... ha.*
>>
File: 1763958931816998.png (68 KB, 236x197)
>>107847316
>We'll never know
I'm still not ready to accept that fact, give me some time anon
>>
>>107847319
dumb gwailo yu no andastand chinese cultcha
>>
>>107847294
>How does Z Image Edit compare to Qwen Image Edit?
how can we know that :(
>>
>>107847260
>How was it before it got popular?
it was a time when Chinese companies would release models without having to say "Soon Soon Soon" for months, we'll never get that shit again, sad
>>
>>107847312
Command line. Or FORGE.
>>
>>107847312
>comfy is too much for my baby brain to handle
why
are you stupid or something
it's just node graphs. what exactly is 'too much'? do you not understand the terminology or how to connect things? what's going on man
>>
>>107847052
anyone?
>>
>>107847367
You never explained what you are trying to mask. If it's just the 1girl, then only mask her body parts, not the male's.
>>
>>107847312
neoforge
>>
File: z_image_turbo_00086_.png (2.06 MB, 1120x1440)
>>107847154
>>107847135
Z is faster than Qwen, but not significantly when they're both just 8 steps and 1 CFG. I don't get a speed boost from nvfp4 or fp8 even.
>>
>>107847380
I am specifically talking about a 2girl output (or 1boy, 1girl).

If their arms are completely overlapping, am I still supposed to mask the arm being overlapped? Or just the visible parts of the arm?
>>
>>107847107
>their styles are equally slopped, just in different ways.
i am pretty tired of the default z styles but significantly less so than qwen. even in your own comparison z looks sharper and more detailed IMO
>>
>>107847347
>it was a time when Chinese companies would release models without having to say "Soon Soon Soon" for months
https://files.catbox.moe/b5sx5o.mp4
>>
>>107847388
Are you on 3090?
Also based gen.
>>
>if 4chan was a lora

https://www.reddit.com/r/StableDiffusion/comments/1qbd7gb/john_kricfalusiren_and_stimpy_style_lora_for/
>>
File: 1757237565582677.png (66 KB, 277x182)
>>107847388
>No persian ever called me goy
oh I get it
>>
To heck with photorealism. I want knowledge of sundry styles.
>>
Persians do call me "infidel", tho.
>>
what do we do about the gacha nature of AI
same prompt, same settings, 100 iterations. some gens are pure incoherent dogshit. some gens are amazing. i dont know what to do other than generate batches until i find a needle in the haystack. seems wasteful, this shit will wear down my SSDs.
has anyone found a way to control this
>>
anyone tried the sarah peterson loras for zit? the jeet has massive black fetish but i was wondering if it can output normal scenes too
>>
I have a question regarding the rentry guide.

I notice the guide now recommends using the "Load Lora" node over Lora Loader Model Only because it includes a clip output.
But how would you connect the pins in a multi-character controlnet workflow? Like pic related.

Should lora clip outputs only connect to their respective character on the controlnet? And what clip should connect to the above text encodes?
>>
>>107847478
>has anyone found a way to control this
A question for the ages
>>
>>107847478
No. There's probably an infinite number of ways to generate an image regardless of the prompt based on what the model knows. You cannot possibly expect non-RNG results doing T2I/T2V.
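
The rolls can't be removed, but they can at least be made repeatable. A minimal sketch, assuming the randomness comes only from the seed (generally the case with fixed settings, though the same seed on different hardware or software versions may still drift). `fake_sampler` is a hypothetical stand-in for a real generation call and its "score" is a placeholder for picking by eye.

```python
import random


def fake_sampler(prompt, seed):
    """Hypothetical stand-in for a real gen: returns a repeatable 'quality' score."""
    rng = random.Random(f"{prompt}|{seed}")  # same prompt + seed -> same result
    return rng.random()


def sweep(prompt, seeds, keep=3):
    """Run every seed once and return only the seeds of the best results."""
    scored = [(fake_sampler(prompt, s), s) for s in seeds]
    scored.sort(reverse=True)
    return [s for _, s in scored[:keep]]


# 100 rolls, but only three integers need to be written down afterwards.
best_seeds = sweep("1girl, large breasts", range(100), keep=3)
```

The idea is to sweep at low resolution or with previews, keep the seed numbers instead of every output, then rerun only the keepers at full quality, which also cuts down on disk writes.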
>>
>>107847478
>the gacha nature of AI
in a way they definitely look like humans, sometimes we rock and sometimes we suck lol
>>
what other characters work for i2v voice cloning, trump works, spongebob and patrick work, anyone else know? is there a list?
>>
>>107847464
Iran has a large Christian minority, treated a lot better than Israel treats theirs, though.
>>
>>107847500
Checkpoint clip connects to Lora nodes first. Lora node clips connect to text encodes.

>Should lora clip outputs only connect to their respective character on the controlnet?
Yes. So if Region 1 is the 1girl character lora, then only connect that lora's clip to it.
>>
pretty funny that TongyiMAI played their card way too early just to cash in on some flux hype. pretty obvious nothing else was ready
>>
https://files.catbox.moe/nic9xu.mp4
>>
>>107847553
And it worked. They undermined and disrupted Flux.2 completely. They really don't need to release base at this point. Just makes it obvious China is only releasing quality free models to cut competition in the west.
>>
>>107847553
>>107847567
the most impressive thing is that they managed to make Z-image turbo good out of a really undertrained base model
>>
>>107847079
You fell for NPC propaganda.
Levant Arabs look white as hell. They are not Berbers or Gulf Arabs.
They push the narrative that a (((European))) country is fighting against evil browns so that you would be more willing to be a zog tax slave funding them.
>>
>>107847532
Ok, and what model should connect to Attention Couple? The Checkpoint again?
>>
>>107847592
>The Checkpoint again?
No. You daisy chain the loras.

Checkpoint > Lora 1 > Lora 2 > Attention Couple
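
The wiring described in the replies above, written out as plain data. This is only one reading of those posts, not an export of any real workflow: the node labels ("Region 1 Text Encode" etc.) are hypothetical, and feeding the checkpoint clip into both lora nodes is an assumption.

```python
# Each link: (source node, output type, destination node, destination input)
links = [
    # MODEL is daisy chained through the loras.
    ("Checkpoint", "MODEL", "Lora 1", "model"),
    ("Lora 1", "MODEL", "Lora 2", "model"),
    ("Lora 2", "MODEL", "Attention Couple", "model"),
    # CLIP goes checkpoint -> lora nodes (assumed: both fed from the checkpoint).
    ("Checkpoint", "CLIP", "Lora 1", "clip"),
    ("Checkpoint", "CLIP", "Lora 2", "clip"),
    # Each lora's clip only feeds its own character's text encode.
    ("Lora 1", "CLIP", "Region 1 Text Encode", "clip"),
    ("Lora 2", "CLIP", "Region 2 Text Encode", "clip"),
]


def model_chain(links, start="Checkpoint"):
    """Follow MODEL links from the start node until nothing is downstream."""
    downstream = {src: dst for src, out, dst, _ in links if out == "MODEL"}
    chain = [start]
    while chain[-1] in downstream:
        chain.append(downstream[chain[-1]])
    return chain


chain = model_chain(links)
```

`chain` comes out in the same order as the one-liner above: Checkpoint, Lora 1, Lora 2, Attention Couple.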
>>
>>107847592
Yes.
Many regional prompting extensions on Cumfart are broken nowadays btw.
No idea if this one works or not.
>>
>>107847585
Palestine isn't as levant as Iran though, you're delusional, there's a lot of browns in there
>>
>>107847603
Oh ok, I see.
It's spaghetti, but it makes sense. Thanks.
>>
File: here is your gen.png (121 KB, 859x732)
>>
>>107847611
>Iran
>Levant
Why are you talking shit about countries you can't point at map?
>there's a lot of browns in there
Not as much as you think.
>>
>>107847613
Any involved workflow will devolve into spaghetti. You need to use get/set nodes if you want clean workflows. They are like variables.
>>
>>107847613
You are doing it wrong.
The second lora should go directly from the checkpoint to the second lora loader
>>
>>107847637
who gives a fuck about those brown countries though? only a brown would know how to put those countries in a map, show your hands
>>
embrace the noodle
>>
>>107847650
all that for 1girl, large breasts, (ai generated:1.4)
>>
>>107847660
and i fucking love it
>>
>>107847650
nice cow
>>
How the fuck am I supposed to run songbloom in comfyui? It looks like the original repo got nuked. Do I just grab some clone from some other dev?
>>
>>107847650
i dont know how you can run this without melting your browser after 2 gens.
>>
>>107847042
Do we still do CFG at 1? Or 4? Somewhere in between?
>>
>>107847709
always cfg 1 since it's using the distilled lora
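
For reference, the standard classifier-free guidance mix, as a sketch with made-up numbers in place of real model predictions. At scale 1 the unconditional term cancels out, so nothing is gained from the negative pass, which is presumably why distilled setups are run at cfg 1.

```python
def cfg_mix(cond, uncond, scale):
    """Standard CFG combination: uncond + scale * (cond - uncond), per element."""
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]


# Toy "noise predictions" (illustrative values, not from any model).
cond = [0.2, -1.0, 0.75]
uncond = [0.5, 0.25, 0.75]

at_one = cfg_mix(cond, uncond, 1.0)   # equals cond up to float rounding
at_four = cfg_mix(cond, uncond, 4.0)  # pushed further away from uncond
```

With cfg above 1 every step needs both a conditional and an unconditional prediction, so cfg 1 also means roughly half the model calls per step.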
>>
>>107847560
Man what a throwback to waking up in the dead of night to watch GSL games live on the fucking Gom Player
>>
>>107847692
Clean your fans.
>>
>>107847720
peak era for streaming imo, good starcraft and commentary, all through some jank player but it worked fine.
>>
>>107847647
What are you talking about? I was told each "Load Lora" node should plug their clip into their respective controlnet character: >>107847532
The model itself is daisy chained from checkpoint to lora to lora..
>>
>>107847720
>>107847735

Streaming hadn't been completely consolidated by faang companies yet. Esports is also kind of dead in the US.
>>
>>107847739
My bad you are right.
>>
File: x_cid3cv.png (1.82 MB, 1024x1536)
>>
Anyone using i2v ltx2 with the detail lora? Does that even work?
>>
File: x_gpprs1.png (1.83 MB, 1024x1536)
>>
>>107847787
it does. you can even add the lora to the 2nd stage sampler to get an even sharper result.
>>
>>107847835
What weight do you use anon? Sounds useful.
>>
best local model for img2img that fits in 24gb vram? i tried z-image turbo but it changes too much other stuff in the image
>>
>>
https://github.com/kijai/ComfyUI-KJNodes/commit/838a731f2fa250ee8ef5ac0b1299a1b1f5cbb3a0
1GB more vram reduction for LTX-2 model


