/g/ - Technology




Discussion of Free and Open Source Diffusion Models

Prev: >>107820534

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/kohya-ss/sd-scripts
https://github.com/tdrussell/diffusion-pipe

>Z Image Turbo
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo

>WanX
https://github.com/Wan-Video/Wan2.2
https://comfyanonymous.github.io/ComfyUI_examples/wan22/

>LTX-2
https://huggingface.co/Lightricks/LTX-2

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Illustrious
https://rentry.org/comfyui_guide_1girl
https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe|https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
>>
Blessed thread of frenship
>>
>>107823802
benchod
>>
File: 1761907363783082.png (2.09 MB, 1280x1280)
>>107823785
Thanks, have more daughterwife
>>
Enjoying your base you fucking retards?
>>
>>107823836
I'm a healthy male with a normal sex drive, wbu?
>>
>>107823819
that's a man
>>
Going to post a gen soon make sure you have it on the next collage, thanks
>>
File: 1748657584553171.png (2.65 MB, 1184x1280)
>>
File: 1749655109156986.png (2.11 MB, 1408x1024)
>>
File: 1749641637388534.png (1.98 MB, 1408x1024)
1 girl bros...
>>
Apparently someone on Plebbit found a way for LTX to do porn by copying and pasting a Wan2.2 GIF into the I2V workflow? It copies the motion and gives it proper audio.
>>
File: 1740116988379374.png (2.84 MB, 1344x1120)
>>
>>107823847
Based
>>
From my experiments with vision enabled LLMs, I can say that from best to worst it's:
Qwen3 VL 30B-3A
Qwen3 VL 8B
Gemma 3 12B
GLM 4.6V Flash

Strangely, the dedicated caption model (JoyCaption beta) shits the bed with 'complicated' instructions in the system prompt. I'd have to test it with something simpler (just ask it to use natural language or booru prompts, and to include camera position and all that retarded stuff)
>>
>>107823819
qwen edit?
>>
>>107823993
zit
>>
Enjoying your image and video gens you fucking legends?
>>
>>107823993
what the hell, it's not anime -> realistic lora?
>>
File: ComfyUI_temp_bncds_00019_.png (3.15 MB, 1824x1248)
https://files.catbox.moe/d69gk6.png
>>
>>107823924
dude discovered v2v
>>
File: 019lg8.png (1.79 MB, 1024x1536)
>>
>>107823992
>GLM 4.6V Flash
I don't know about the flash version but I'm testing GLM 4.6V in Q8 and it's very good so far, it's even uncensored as long as you ask it properly and add a prefill.
Only issue is that it's huge, but once I lock in a proper description prompt I'm definitely using it for most realistic and even some anime.
>>
>>107823785
Why is there a vtuber in the OP?
>>
>>107824006
meant for >>107823994
>>
File: 1790838.png (2.84 MB, 1336x1336)
>>
File: ComfyUI_temp_llpiq_00101_.png (3.34 MB, 1824x1248)
https://files.catbox.moe/ykfjxo.png
>>
>>107824031
I'm using captioning models to convert images to a natural language description, then touch it up (if needed) and send it to the actual model to gen an image.
>>107824025
I'm running these as part of my workflow and I don't want to wait a long time to load a big model to gen a caption
>running at Q8
how much ram you got bro lol, q8 needs like 140gb~
>>
anons, how many steps do you recommend with res_2s in ltx-2 to get something good before hitting diminishing returns?
>>
>wan2gp doesn't recognize the loras folder
it's over
>>
So... I am planning to get an RTX 6000 Pro in a month or so, but... I am a bit confused about which model(s) would come closest to working like Gemini's nanobanana offline for anime girl edits. Would anyone know what would be the closest? i.e. image + prompt input, with the prompt either changing the clothing or pose? I've only ever worked with stuff like wai-nsfw-illustrious models that use tags, and have little experience with proper prompt models...
>>
File: grid.webm (1.38 MB, 1930x2046)
For audio
>>>/wsg/6069549
>>
File: 964641867.png (2.26 MB, 1248x1824)
>>
>>107824053
have you genned any vids with wan2gp yet? if so hows the speed vs comfykek?
>>
>>107824077
>have you genned any vids with wan2gp yet?
yes
>if so hows the speed vs comfykek?
well, it actually works on my 3090 Ti, so it's faster than infinity.
For a 5 second video:

Dev 19B
- Profile 1: 6m 21s
- Profile 4: 6m 25s

Dev 19B Distilled
- Profile 1: 1m 27s
- Profile 4: 6m 08s
>>
File: 1760263240475523.png (20 KB, 982x164)
>>107824044
>I'm running these as part of my workflow and I don't want to wait a long time to load a big model to gen a caption
I'm thinking about either captioning everything in advance, or just using my second card (3090) with zimage turbo since it's such a light model.

>how much ram you got bro lol, q8 need like 140gb
When loaded it's reasonably usable speed-wise; basically I'm using 115GB out of 128GB of RAM just to load the Q8_K_XL, plus 15GB on the 5090 itself, with MoE mode and 16k context on llama.cpp.
I could go for the plain Q8 since I didn't see a difference (it uses about 10GB less RAM), but either way it fits within my system's capabilities, so why the hell not go big.
>>
File: 00004-555262941.png (1.61 MB, 1536x1024)
>>107823916
>>
>>107824091
funny how ads used to look like this
>>
>>107824099
hot sexy women? in my ads? no I'd rather have something RELATABLE and SAFE
>>
>>107824059
Qwen image edit variants for editing images. It's nowhere close to nbp though.
>>
>>107824142
it is if you use a 50b gemma
>>
>>107824059
>planning to get a rtx6000 pro in a month or so
>anime girl edits
you don't need a 6000pro for that, a 5090 is enough
>>
>>107824123
Go block an ICE vehicle.
>>
>>107824161
that doesn't sound very SAFE of you anon
>>
File: ZImageTurbo-_0244.png (3.04 MB, 1824x1248)
https://files.catbox.moe/wqjzdz.png
>>
>>107823836
>suddenly
>>
File: LTX_2.0_i2v_00194_.webm (1.07 MB, 1024x640)
>>107824091
>>>/wsg/6069557
>>
>>107824044
>I'm using captioning models to convert images to a natural language description
which one? can you share the workflow?
>>
>>107824183
yeah
>>
File: ComfyUI_temp_bncds_00025_.png (2.94 MB, 1824x1248)
>>107824179
https://files.catbox.moe/fnrel6.png
>>
>>107824189
"INSERT COIN" pixels are completely different from the image behind it.
>>
File: 344.png (1.49 MB, 960x1216)
>>107824123
>oh nyoooo my bobas and veganas, do not redeem
>>
>>107824142
>>107824147
thank you, will note that down

>>107824160
I currently do have a 5090 comp and a 2x 3090 comp, but the ones i've tried from civit.ai and huggingface have been.. a bit rough for me I guess.
>>
>>107824196
this is so good
>>
>>107824196
Is that flux?
>>
>>107824196
Why are the top tears faceted?
>>
File: 1767230451703279.gif (164 KB, 240x240)
What's the current best webui for retards? I used automatic's back in the day.
>>
>>107824197
>I currently do have a 5090 comp and a 2x 3090 comp, but the ones i've tried from civit.ai and huggingface have been.. a bit rough for me I guess.
you won't get that much more inference speed with a 6000 pro, I also wanted it but in the end I got 2x5090 and I'm pretty happy with them
>>
File: 109.png (1.36 MB, 960x1216)
>>107824207
qwen 2512
>>107824209
prob cuz the prompt had "crystal-clear teardrops" kek
>>
>>107824211
comfyui
>>
>>107824183
Qwen3 VL 8B mostly; I have some private stuff/nodes in my wf currently.
I suggest using:
https://github.com/sebagallo/comfyui-sg-llama-cpp
for in-workflow usage; requires installing a python library in your venv
OR
https://github.com/BigStationW/ComfyUI-Prompt-Rewriter
this uses your system's llama.cpp

Otherwise just run your llama.cpp in router mode, use a generic OAI client to call the model, and use some custom REST nodes to force the router to unload models after you finish genning (that's what I'm currently doing).
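For the router-mode route, a minimal sketch of what the caption call can look like from a script, assuming a llama-server instance is already running on 127.0.0.1:8080 with the Qwen3 VL weights and their mmproj loaded; the port, image path, and sampling settings are placeholders, not anything from the nodes above:

```python
import base64
import requests

SERVER = "http://127.0.0.1:8080"   # assumed llama-server address
IMAGE_PATH = "input.png"           # placeholder image to caption

# llama-server exposes an OpenAI-compatible chat endpoint, so the image
# goes in as a base64 data URL inside the message content.
with open(IMAGE_PATH, "rb") as f:
    b64 = base64.b64encode(f.read()).decode()

payload = {
    "messages": [{
        "role": "user",
        "content": [
            {"type": "text",
             "text": "Describe this image in natural language for a text-to-image prompt."},
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{b64}"}},
        ],
    }],
    "max_tokens": 400,
    "temperature": 0.6,
}

r = requests.post(f"{SERVER}/v1/chat/completions", json=payload, timeout=300)
r.raise_for_status()
print(r.json()["choices"][0]["message"]["content"])
```

From there the caption string goes into whatever text encode node the gen workflow uses (or gets touched up first).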
>>
>>107824220
wtf
>>
>>107824211
Forge Neo if you want to avoid comfy
>>
>>107824222
what about? https://github.com/stavsap/comfyui-ollama
>>
https://huggingface.co/SG161222/SPARK.Chroma_v1
Spark chroma guy seems to be giving it another go.
I don't think rewriting prompts and 25% larger dataset will give drastically different results from last attempt, but we will see.
>>
>>107824240
ollama is for faggots so i dont use it (it uses llama.cpp anyway).
It's maybe easier to setup compared to all other llmao stuff I guess.
>>
>>107824243
which chroma are you using?
>>
File: LTX_2.0_i2v_00196_.webm (960 KB, 704x896)
>>107824220
>>>/wsg/6069568
>>
>>107824250

Piggybacking on this, I need a good chroma model and WF recommendation. The last one I used did not adhere to prompting at all and looked melted.
>>
>>107824049
You can test it yourself with chained samplers: do like 30 steps in the first sampler and connect it to a VAE decode and save image node, then connect it to a second sampler that does another 5 or 10 steps, then a third one for more steps, and so on, with each stage hooked to its own VAE decode and save image node so you can compare and see how many steps are actually worth it.
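If you'd rather script the comparison than wire up chained samplers, here's a rough sketch of the same "find the diminishing-returns point" idea outside ComfyUI: sweep step counts at a fixed seed and eyeball where the outputs stop improving. It re-runs from scratch at each step count instead of continuing from leftover noise, so it's slower but simpler. Assumes diffusers and some checkpoint you already have; the model id is a placeholder:

```python
import torch
from diffusers import AutoPipelineForText2Image

MODEL_ID = "path/or/hub-id-of-your-checkpoint"  # placeholder, not a real repo
PROMPT = "1girl, detailed portrait"
SEED = 1234

pipe = AutoPipelineForText2Image.from_pretrained(
    MODEL_ID, torch_dtype=torch.float16
).to("cuda")

# Same prompt and seed every run, so the only variable is the step count.
for steps in (10, 20, 30, 40, 50):
    generator = torch.Generator("cuda").manual_seed(SEED)
    image = pipe(PROMPT, num_inference_steps=steps, generator=generator).images[0]
    image.save(f"steps_{steps:02d}.png")
```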
>>
File: 1767406007300115.png (382 KB, 522x711)
>>107824221
>>107824235
thank you
I think I'll go for forge, I remember being too stupid to use comfy properly
>>
>>107824260
piggybacking on this too, there are hundreds of chroma models, help
>>
>>107824258
jesus christ
>>
>>107824059
Imagine buying a $10,000 card to make shitty qwen slop edits that you can do with a $400 card.
>>
>>107824264
forge is good, if you do ever want to use comfy because of a use case not covered by forge, the comfy 1girl guide in the op should be helpful
>>
File: spark chroma (3).png (2.54 MB, 1520x1040)
>>107824260
I use spark.chroma, and I like it a lot for porn
https://files.catbox.moe/j6d4vy.png
>>
>>107824273
I mean my wife died and I got some insurance money, so I'm trying to cope with it
>>
>>107824282
that's not porn
>>
>>107824282
This one? https://huggingface.co/SG161222/SPARK.Chroma_preview/tree/main
>>
>>107824282

Does it hold up with NSFW with character loras?
>>
>>107824250
None at the moment.
I am planing to test this:
https://civitai.com/models/2086389?modelVersionId=2517681
Not an endorsement. No idea if it's shit or not. But it's on my to do list.
>>
>>107824310
Why would you do that?
>>
>>107824301
the porn i gen with it is too niche
>>107824305
yes
>>107824310
degrades anatomy, but yes
>>
File: 1639750855078.jpg (140 KB, 1080x1080)
ltx t2v is fucking slow and censored. also, many of my prompts give me 1940s-looking videos... what's the fucking point of gemma then. wtf
>>
>>107824275
I'll keep that in mind. Thanks.
>>
File: Qwen_00425_.png (841 KB, 1024x1024)
>>
File: ZImageTurbo-_0249.png (2.71 MB, 1824x1248)
https://files.catbox.moe/t4p0fp.png
>>
File: Qwen_00411_.png (2.18 MB, 1472x1136)
>>107824189
tried to recreate your prompt with qwen 2512
>>
File: 1739667736057787.png (3.18 MB, 1472x1088)
>>
>>107824432
oh my god
>>
>>107824444
z-baggyeyesmage
>>
>>107824449
theyre full of poisonous cum
>>
File: ComfyUI_temp_higpi_00011_.png (3.41 MB, 1824x1248)
>>107824432
Very cool! Much more realistic than mine
https://files.catbox.moe/lgj4jt.png
>>
File: ZVideoPreview-_0009.webm (1.23 MB, 1920x1080)
>>
>>107824462
at least make it believable next time
0/10
>>
>>107824462
Very cool. Feels like a The Last Guardian cutscene
>>
Official LTX2 workflow uses 0.6 image strength in i2v, while comfy's uses it at full strength. I'm 95% sure that comfy is a hack and the original creators know better, but MAYBE he's onto something?
>>
>>107823916
skin for z image look too noisy and static. why do people say its superior to photorealism of qwen image 2512?
>>
>>107824499
kijai said 1.0 is more better
>>
>>107824502
chroma is as slow as qwen and better
>>
>>
>>107824502
i think people genuinely don't see the noise for some reason, they've looked at too many shitty smartphone photos with sensor noise and think it looks natural
>>
File: bonk.jpg (743 KB, 1922x2183)
best vidya to play while genning?
>>
>>107824537
>playing games
wrong board champ
>>
>>107824537
https://www.shoutoutuk.org/game/story.html
>>
More like Z-image Never
>>
>>107824502
You can overcome zit noise somewhat by using a steeper noise schedule/raising shift.
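For reference, the shift knob in most UIs is (assuming the usual SD3-style formulation with sigmas normalized to [0,1]) just a remap of the schedule; a higher shift keeps sigmas larger for longer, so more of the step budget goes to structure and denoising before fine detail gets resolved. A tiny sketch of that remap:

```python
def shift_sigma(sigma: float, shift: float = 3.0) -> float:
    """SD3-style timestep shift: pushes the schedule toward higher noise levels."""
    return shift * sigma / (1.0 + (shift - 1.0) * sigma)

# With shift=3 the schedule midpoint (0.5) maps to 0.75, so the sampler
# spends more steps at high noise before resolving fine detail.
print(shift_sigma(0.5, 3.0))  # 0.75
```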
>>
>>107824537
Project diablo 2 by far
>>
File: 1743457258439135.jpg (1.57 MB, 1248x1824)
>>107824537
Baldur's Gate 1-2 use almost no GPU power.
>>
File: ComfyUI_temp_higpi_00018_.png (2.45 MB, 1824x1248)
https://files.catbox.moe/k3xez7.png
>>
Is there any reason to use the full ltx model + distill lora instead of just the distilled model directly?
>>
File: 2.png (2.2 MB, 1152x1336)
>>
>>107824499
>>107824503
You mean the LTXVImgToVideoInplace node? What does it even do?
>>
>>107824462
sick
>>
Anybody use turbo diffusion with wan? They said it's a 200x speed up but I'm wondering if the quality suffers.
>>
>>107824671
Prepares latents based on the original video? Same as WanImageToVideo
>>
https://streamable.com/xdfcx6

I need a workflow for this shit, now.
>>
>>107824688
There's no such thing as lossless optimization
>>
>>107824710
Obviously but is the trade off worth it or is it a 200x reduction in quality too?
>>
>>107824706
fucking kek
>>
>>107824706
This is genuine shit. Why in god's name is anyone trying to make this work instead of just going after existing porn?
>>
File: ComfyUI_temp_higpi_00028_.png (3.49 MB, 1824x1248)
https://files.catbox.moe/daktmo.png
>>
>>107824706
this is why we should wait for ltx2 loras instead of trying to make it do something it has no idea how to do
>>
>>107824727
looks like someone cant handle inter-dimensional porn
>>
What's everyone using for audio for existing videos? Is there some gradio thing I can install that doesnt require comfy?

>mmaudio - hit or miss and doesn't produce audio for the full video
>hunyuan foley - doesn't seem to work with the latest versions of comfy (i tried everything, only solution is to set up a separate older version)
>ltxv2 - probably the best option but I have yet to try it, then again it requires the latest comfy which will break existing nodes

>>107824243
Did he say when it'll be finished? Spark preview is amazing, some of the best lighting I've seen a model produce. Also check out uncanny if you havent https://huggingface.co/dawncreates/UnCanny-Photorealism-Chroma-GGUF/tree/main
>>
File: zy8b7vqg3acg1.png (228 KB, 978x649)
>>107824706
>>
>>107824327
I generated a woman taking her clothes off in under 4 minutes on a 3090 Ti. Use wan2gp
>>
>>107824753
He's trying to make a production version; the first version that he baked wasn't up to his standards, so he's trying again
>>
File: ComfyUI_temp_ekpno_00008_.png (2.5 MB, 1040x1480)
>>
>>107824782
benchod
>>
File: ComfyUI_temp_ekpno_00009_.png (2.6 MB, 1040x1480)
>>
File: Screenshot.jpg (24 KB, 645x130)
>>107824798
not gonna spoonfeed you, currynigger
>>
>>107824833
What the fuck are you doing here, /pol/ maggot?
>>
File: ComfyUI_temp_higpi_00045_.png (2.51 MB, 1824x1248)
https://files.catbox.moe/2zfkh4.png
>>
File: 1737448008966584.png (2.13 MB, 1472x1088)
>>
>>107824753
wan2gp might have something so I'd say peruse the docs there
>>
File: Dawn03.png (2.49 MB, 2288x1520)
>>
someone explain the "chinese culture" meme to me
>>
File: zimg_00018.png (1.15 MB, 768x1024)
>>
File: 1738915726336514.png (2.18 MB, 1504x1024)
>>
File: 1738422103030703.png (2.07 MB, 1632x928)
>>
File: 1752759695613511.png (1.64 MB, 1024x1472)
>>
File: 00014-417783637.png (1.27 MB, 1536x1024)
>>
>>107825018
>if you're promised a model but it never release
>thaaat is chi-nese cul-ture~
>random anime man says "two more weeks!"
>thaaat is chi-nese cul-ture~
>open source is all you need
>be careful to watch the rug from under your feet
>you'll get wan2.5 if you say pretty please
>thaaat is chi-nese~
>thaaat is chi-nese~
>thaaat is chi-nese cul-ture~
>>
File: zimg_00068.png (1.53 MB, 960x1280)
>>107825018
this board is inundated with racist troll
>>
>>107825140
give base chink
>>
>>107825140
shalom m'lady
>>
>>107825118
>comfortable migu
:)
>>
File: 1737854552384852.png (2.03 MB, 1024x1472)
>>
>>107824921
Not bad, but I really hate the mixing of pixelated and smooth lines.
>>
>>107825259
rare stocking color, it's nice
>>
>>107825133
Why are they such fags about releasing 2.5 anyway when 2.6 is out already?
>>
>>107824791

What model saar?
>>
File: ZImageTurbo-_0287.png (902 KB, 1216x832)
>>107825306
The non-upscaled one is more coherent re pixelization
>>
>no one is training LTX2 loras
>prompt following is dogshit, half of the time you get powerpoint slideshow
>not actually faster than Wan if you want quality
is the model DOA?
>>
>>107825133
>>if you're promised a model but it never release
>>thaaat is chi-nese cul-ture~
>>random anime man says "two more weeks!"
>>thaaat is chi-nese cul-ture~
>>open source is all you need
>>be careful to watch the rug from under your feet
>>you'll get wan2.5 if you say pretty please
>>thaaat is chi-nese~
>>thaaat is chi-nese~
>>thaaat is chi-nese cul-ture~


https://voca.ro/11sA543nw26m
>>
>>107825371
I wish there was a model that could do actual pixels.
>>
File: ComfyUI_temp_higpi_00052_.png (3.85 MB, 1248x1824)
>>107825389
me too, fren
https://files.catbox.moe/x6h89q.png
>>
>>107825371
a little better, but there is still light bloom and intrapixel dithering
>>
>>107825389
Use imagemagick to downscale, then run upscale. ezpz
>>
File: zimg_00079.png (1.67 MB, 960x1280)
>>107825182
aw fuck my lora is accidentally jewish
>>
>>107825389
hello
>>
>>107825385
lmfaoooo, saved
>>
>>107825352
too big for you
>>
>>107825385
acestep?
>>
File: ComfyUI_temp_higpi_00070_.png (3.55 MB, 1824x1248)
>>107825437
you're a big model
uuuu
>>
>>107825471
forgot catbox
https://files.catbox.moe/mfy4c4.png
>>
>>107825482
do you do commisions?
>>
File: negatives.jpg (47 KB, 518x296)
>using the kijai workflow for ltx-2
>notice the hidden negatives prompt
>expect it to be empty like usual
>it's actually this
wtf is all of this necessary?
>>
File: file.jpg (1003 KB, 5000x5000)
Gooner sisters we are saved
https://www.reddit.com/r/StableDiffusion/comments/1q94nlk/wow_i_accidentally_discovered_that_the_native/
>>
>>107825538
where did you get that from?
>>
Been trying to gen Amelias and got one that looks eerily similar to one of my cousins (if she had different hair/fashion sense)

Like this is scary close, almost exactly what she looks like
>>
>>107825559
https://streamable.com/xdfcx6
>>
>>107825526
I don't. I do it only for the love of the game
>>
These threads are insufferably retarded. They have as much to do with technology as /v/ posting an SFM rip of Tracer wearing a thong bikini

Nothing has changed since the last time I posted this

I don't know how long it's going to be before this finally gets booted off and put onto /hr/. I don't hate "AI slop" just because it's a trendy new thing and a trendy new thing to hate, it really is just absolutely braindead and completely devoid of anything redeeming
>>
File: x_0502ox.png (1.53 MB, 1536x1024)
>>
>>107825583
meltie alert
>>
>>107825385
What did you use to make this?
>>
>>107825583
>redeeming
Good morning saar
>>
>>107825583
How about making an /ai/ board, retard?
>>
>>107825555
https://files.catbox.moe/f9fvjr.json
I googled for it and it linked to a r*ddit post so maybe it's just the retarded poster who did it. Don't know if it even actually is kijai's workflow but it works anyway. Just wondering if the negatives were fucking up my videos this entire time
>>
>>107825583
>calls things braindead
>is too braindead to know how to use filters
>>
>>107825628
I'm too busy doing real programming to bother learning such shit, keep playing with your ai crayons
>>
>>107825643
(( (Crayola( ((>
>>
>>107825385
https://files.catbox.moe/wi3i8y.mp4
genned in 1m27s on a 5090
>>
>>107825552
>Guys I just invented Video to Image t.Reddit
next they'll invent 1girl
>>
>>107825643
>i'm too busy to copy and paste into a textbox
no wonder jeets and ai are taking over your job
>>
>>107825697
weird image choice
>>
>>107825719
puto
>>
I think ltx is here to stay, people are slowly figuring out how to uncuck it, genning in seconds instead of minutes is too good
>>
>>107825697
And we have this shit instead. I feel very depressed localbros. We were jewed, jeted, etc. Why could we do now, is OVER?
>>
>>107825725
That's just what I've been working on and I wanted a full body character in front of a green screen.
>>
>>107825756
What*
>Juche Manse
>>
>>107825004
QIE 2511?
>>
File: file.png (408 KB, 350x907)
Any idea how to fix this?
>>
File: 1480061930691.jpg (51 KB, 400x323)
Can I give vision to models that don't have it natively?
>>
Has anyone been experimenting with samplers/schedulers on LTX? To my surprise, Euler Ancestral seems to produce sharper results than Euler, usually it's been the opposite. Also I never seem to get any good results with the 2nd upscaled pass, everything looks so plastic and sloppy
>>
>>107825775
yeah
>>
>>107825783
res4s for me
>>
>>107825773
that's what carrots actually look like
>>
>>107825775
You can simulate vision with a captioning model. It's what we used to do before multimodal existed.
>>
File: ComfyUI_00025_.png (1.05 MB, 1024x1024)
>>
>>107825794
>carrots have little tentacles
retard
>>
>>107825740
It also has more uses than just the videos. I feel like its ability to add audio to silent clips is a little underrated, though using actual text2speech for voices is still better because LTX voices grate my ears
>>
>>107825583
those threads had a lot of actual tech discussion before ran's recent melties started. lots of people left since then but we might get anons back after z base release
>>
>>107825812
you and ran belong to each other ani
>>
>>107825773
seems latent noise spikes that the model is filling in. Maybe try a less agressive denoise setting?
>>
File: file.png (84 KB, 661x437)
>>107825804
it does. You have to carefully remove the outer layer.
>>
>>107825854
>lmg
>using cloud model
grim
>>
>>107825862
lmg?
>>
>>107825552
>>107825580
The first example is awful but the second one shows promise, though god I hate dirty talk so fucking much. Moans should be enough.
https://streamable.com/aynxj6
>>
>>107825875
lmao
>>
>>107825768
No, that was made using controlnet.

QIE isn't good for character swaps. It's very good for concepting though.
>>
https://www.reddit.com/r/StableDiffusion/comments/1q94nlk/wow_i_accidentally_discovered_that_the_native/

apparently you can use videos, or gifs, as input, so video to video: I haven't tested it though but other people say it works.
>>
>>107825875
The dirty talk is the good part. Without that it's not nearly as funny
>>
>>107825889
read the thread before you post monkey
>>
>>107825896
was the insult necessary?
>>
File: 1737556778144544.png (36 KB, 1112x350)
maybe it's just placebo, but it's worth using NAG with dedicated audio and video clip encoders
>>
>>107825903
>being called a cute animal is an insult
idiot
>>
>training the same lora with different settings for the Xth time because i'm not satisfied with the results
please release base...
>>
what timestep shift do you guys use when training ZIT loras
>>
>>107825783
res2s is the one LTX team uses, and it looks good

>>107825791
awfully slow?
>>
>>107825905
why does your stuff look good
>>
>>107825915
what
>>
>>107825896
>Racism
You arready rost.
>>
File: 1754794989397470.png (153 KB, 369x560)
>>107825889
use this node instead of load image:
>>
>>107825953
thanks benchod
>>
Why does ZIT often correct in the wrong direction?
For instance.
>ball gag lora.
>prompt ball gag in detail
>in her mouth is a red ball gag. black leather straps go from the ball on the girl's mouth around the girl's head.
Then in the preview I'd see behind the blur that it is actually putting the gag inside the mouth. But as the generation progresses, just before the last steps, the ball appears more and more outside. After the VAE decode it then looks like the ball is on her mouth and not in her mouth.
>ball gag lora at 0.6 down to 0.35 doesn't make a difference.
>>
>>107825953
also set the frame load cap or you get this error:

>ValueError: Invalid number of frames: Encode input must have 1 + 8 * x frames (e.g., 1, 9, 17, ...). Please check your input.
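The constraint is just that everything after the first frame has to come in multiples of 8 (presumably the video VAE's temporal compression factor), so any frame_load_cap of the form 1 + 8*k works. A quick hypothetical helper to round an arbitrary count down to a valid one:

```python
def nearest_valid_frame_count(n: int) -> int:
    """Round n down to the closest 1 + 8*k frame count the encoder accepts."""
    if n < 1:
        return 1
    return 1 + 8 * ((n - 1) // 8)

for n in (16, 17, 30, 48):
    print(n, "->", nearest_valid_frame_count(n))
# 16 -> 9, 17 -> 17, 30 -> 25, 48 -> 41
```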
>>
>>107825953
Can you load "frame 10 to 30" with it? It does look like it loads everything?
>>
>>107825968
yeah you can, it lets you limit number of frames, starting point, etc
>>
>>107825973
OK I'll have to check it then.
>>
>>107825889
>you can load a video into a video generator
bruh if i had the video already i wouldn't need to generate it
>>
>>107825982
retard
>>
>>107825961
>on the girl's mouth
Wait, I just saw that. But nonetheless... changed it to in the girl's mouth but that doesn't make a difference either.
>>
>>107825982
I have some very old gifs that can be fun to use, and some video scenes I'd like to make longer.
>>
>>107825953
Is there a difference between using AnimateDiff format over LTXV?
>>
only lodestone can save us from chinese culture
>>
lmao so it works, used a super low quality azumanga gif as input

the anime girl kicks her shoe in the air, which explodes high in the air.

https://files.catbox.moe/bz5bpu.mp4
>>
>>107825982
You can add audio to silent videos, or redub it
>>
>>107826000
idk, this video loader has image output, I assume other video nodes would work fine too
>>
>>107826000
if i remember correctly, format just sets the resolution automatically

>>107826016
>zeta chroma cooking
>spark cooking
>uncanny photorealism seems to be regularly updated
>chroma radiance x0 (i cant get it to work despite following instructions, produces static, probably have to update comfy)

yeah we eatin good
>>
>>107826031
works with mp4 too, still figuring out settings etc.

https://files.catbox.moe/6hpnj2.mp4
>>
File: zimg_00044.png (1.27 MB, 768x1024)
sometimes you actually do need a controlnet
>>
kek, video input is good

https://files.catbox.moe/wc8ocs.mp4
>>
File: 1765029206927484.png (313 KB, 750x1000)
>>107825385
KINO
>>
how the fuck do you add input video, what's the next node?
>>
>>107825643
it takes less than 10 mn to make a filter retard, holy low IQ moment
>>
Is it ok to use Pony loras with WAI?
>>
stupid question incoming. playing around with some tts stuff namely higgs audio as my first foray into ai. is it normal that i have to go through the conda setup every time I want to use it?
>>
>>107826139
no your computer will release VX nerve gas
>>
>>107826098
can't believe they beeped the swears, this model is so fucking cucked
>>
>>107824511
>chroma is as slow
this is a poors problem. Chroma is not slow
>>
>>107826157
think again. also, ltx knows trump natively:

https://files.catbox.moe/2lkemr.mp4
>>
>Be ZiTstain
>Dial in your prompt for 1000 hours until you get a girl that looks kind of like your crush
>Prompt her in lots of lewd positions
>Not good enough, I'll train a LoRA with her FB photos
>Better but still not 100% there, good enough to wank to but misses the essence

>Be Chromite
>Humbly submit to the will of The Model
>"p-please give wh-what it pleaseth you to give"
>drops an exact likeness of your cousin
>>
>>107826157
check this one: it is absolutely not censored.

https://files.catbox.moe/yjr8a6.mp4
>>
>>107826174
>ltx knows trump natively
every single model on earth knows donald trump lol
>>
>>107826174
kek
>>
>>107826191
not exactly, also it gets his mannerisms: nothing has been as close as ltx.

https://files.catbox.moe/brdlr6.mp4
>>
File: unbelievable.png (1.54 MB, 1280x853)
>>107826218
>not exactly
yes exactly, and ltx2 doesn't know who miku is, can you believe that? unbelievable
>>
>every single model on earth knows donald trump lol
Especially child pageant models
>>
>>107826229
we can fix that with loras, no worries. also i2v miku is fine.
>>
>>107826236
sorry moche, not touching your jerusalem powerpoint generator
>>
lmao

the man says "why are you gay? are you trans ginga? you homo? faggot?", in a ugandan accent.

https://files.catbox.moe/8b13ln.mp4
>>
how do i actually set number of steps in the sampler for ltx2? the default comfyui workflow is so retarded
>>
>>107826266
so yeah, video input, 17 frames for frame load cap, works great. lmao, this could literally be artosis:

the man says "this piece of shit protoss! go kill yourself you stupid faggot! I hate you!"

https://files.catbox.moe/4ld0gp.mp4
>>
File: file.png (408 KB, 1449x769)
Is it just me or did the comfy dev never care to implement proper batching functionality?
Things like decent loop/flow control, for example. I'm only finding it in shitty custom node packs (pic related) where it's highly unintuitive to do a basic for loop so I can run an entire folder / list while reusing stuff generated earlier in the workflow, without re-running it needlessly when calling via the API.
What a dummy, made the perfect ecosystem for automation and didn't care to give proper love to the very basic functionality that would make use of it.
>>
>>107826308
he didn't, but I guess there was a lot of work related to just getting support for models into comfyui, it's not like the only work is the webUI itself

and for most people's workflows there isn't a huge advantage over a +1 counter per iteration somewhere in a custom node, which is also easier for most people to handle
>>
File: 1764473539867842.jpg (57 KB, 672x645)
>trump and floyd shit here every fucking day
>ldg fags call other people schizo
>trump and floy shit here every fucking day
>>
base in 2 more weeks
stay tuned
fucking retards
>>
>>107826328
>there is not a huge advantage to most people's workflow
The whole point of using comfy is to cover the niche of doing intricate automated stuff that you couldn't do on simple UIs. Baffling that it can't do it decently the moment you want to start working with the basics of dynamic lists and iterations.
>>
>>107826308
Conditions and loops are a must have, it's surprising that even after so many years we still don't have such basic functionalities in comfy. He could just mimic the way LabView does them and everyone would be happy.
>>
>>107826308
>comfyui-easy-use
>it's not easy to use
>>
>>107826374
>is called comfy ui
> is not comfy to use
many such cases
>>
>>107826336
>>ldg fags call other people schizo
fags call other fags schizos FIFY
>>
>>107826362
you could do some work with applicative functors and zippers too, but unfortunately this doesn't support real programming
>>
how do you stop masturbating srsly
>>
>>107826150
>to go through the conda setup
what do you mean? activating some virtual environment?
>>
>>107826435
cut off your penis
>>
File: 0.png (2.12 MB, 1152x1336)
>>
>>107826435
eat corn flakes
>>
>>107826435
keep going. you will grow tired of everything eventually
>>
>>107826435
>how do you stop masturbating srsly
it's been a while I want to know the answer desu, having sexual energy is a fucking curse
>>
>>107826336
Somehow I feel like you won't be particularly excited about my tens of sneed videos genned for testing purposes
>>
>>
>>107826528
What's the result?
>>
kek video input (set to 17 frames) is pretty good

https://files.catbox.moe/pp3dda.mp4
>>
https://github.com/huggingface/diffusers/blob/6cfc83b4abc5b083fef56a18ec4700f48ba3aaba/docs/source/en/api/pipelines/glm_image.md
>Autoregressive generator: a 9B-parameter model initialized from GLM-4-9B-0414
>Diffusion Decoder: a 7B-parameter decoder based on a single-stream DiT architecture for latent-space image decoding
a fucking 7b model to decode images? are they fucking serious?
>>
>>107826541
This is gonna take like an hour or so I don't know
>>
>>107826544
I only watched it in Japanese but this does absolutely sound like how I imagined the voice to sound speaking English.
>>
why is it that wan2gp can do LTX-2 so well and comfy just OOMs and shits itself?
>>
>>107826362
Whatever you need is probably severely out of ComfyUI scope and you're better off just writing your own python script
>>
>>107826601
comfy aggressively caches everything, try fiddling with launch options or put Clear VRAM Cache nodes where it OOMs
>>
>>107826571
subbed anime is better 99.9% of the time

JP frieren and fern are great too, also new season in a week.
>>
>>107826617
but it won't even load the model. It's like wan2gp uses layer offloading way better, or something.
>>
>>107826601
Put --cache-none in startup arguments and all of a sudden Comfy will perform as fast as wangp without hogging all the memory trying to cache models. Might also try the ram node https://desuarchive.org/g/thread/107460114/#q107462678
https://desuarchive.org/g/thread/107460114/#q107462783
>>
>>107826679
Where do you store links like this?
>>
>>107826679
cache none is a meme, without cache it takes ages to reload the model
>>
File: 24562345.png (1.79 MB, 1024x1536)
>>
>>107826679
>--cache-none
won't I then have to load the model from disk every time I switch between models? Like if I was doing WAN, it would be removing the high noise model from vram and then loading the low noise model from disk, right?
>>
>>107826708
Cuz I (gemini) made it, duh
>>107826717
>>107826728
Sure as hell beats going over memory limit and either getting OOM or taking forever to dump the model to the pagefile
>>
>>107826601
Try --reserve-vram 2.0 to 4.0. At some point, the memory management in Comfy became retarded again.
>>107826728
Yes.
>>
>>107826748
>Sure as hell beats going over memory limit and either getting OOM or taking forever to dump the model to the pagefile
but wan2gp just werks. There must be a way to get comfy to work the same.

>>107826754
>Try --reserve-vram 2.0 to 4.0. At some point, the memory management in Comfy became retarded again.
I'll give it a shot.
>>
File: 00067-902619392.png (1.65 MB, 1536x1024)
>>
File: file.png (1.73 MB, 1024x1536)
>>107826718
>>
File: 1757291945223255.mp4 (1.85 MB, 720x1072)
>>107826718
>>
>>107825889
I tried a few sexy time videos but most produced body horror. Some gens did work by keeping the prompt simple like "She continues to make the same motion". Would Abliterated gemma make it work better? Did anyone test that and compare? Or the prompt enhancer?
>>
File: file.png (1.91 MB, 1248x1248)
>>107825798
>>
File: file.png (1.97 MB, 1032x1512)
>>107825396
>>
File: x_kfin7p.png (1.85 MB, 1536x1024)
>>
>>107826869
lol love that style
>>
>still waiting for z-image
>>
>>107826150
No, that's not normal. I encountered a similar problem before. The fix involved fixing the path, but I forgot the exact details.
>>
fresh
>>107826985
>>107826985
>>107826985
>>107826985
>>
>>107826748
>>107826728
i think comfy is doing something retarded with ltx2, i have a 5090 and 96gb ram, and it will sometimes hit the VRAM limit and grind everything to a halt while i still have like 30gb free RAM
>>
>>107827243
I just tried
>--reserve-vram 4
and it fucking worked! we are in business!
>>
>>107825643
>too busy for 10min filter
>but has time to bitch about his lack of filter

kek
>>
>>107826435
take the globohomo SSRI eberything pill
>>
>>107826601
I've used an 80GB GPU and even that tends to OOM. Something is fucky with the way ComfyUI manages memory.
>>
>>107828145
It seems like the `--reserve-vram 4` flag is working pretty good so far
>>
File: 1000070192.jpg (3.31 MB, 3472x4624)
I have an idiot question. I installed SD two days ago, but I can't get img2img to work. For example, I generated a girl and I want to add an ankh tattoo on her neck. I paint the mask on that spot, but the result only has smudging there, no tattoo whatsoever.
>>
>>107828503
pic unrelated?
>>
>>107828503
Denoise too low


