/g/ - /ldg/ - Local Diffusion General - Technology

Anonymous
/ldg/ - Local Diffusion General 06/12/26(Fri)00:39:17 No.109034986

File: highlights_g_109028009_17(...).jpg (2.57 MB, 3828x3753)

Discussion and Development of Local Image, Video, and Music Models

Previous: >>109028009

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
SDWebUI: https://rentry.org/ldg-lazy-getting-started-guide#the-stable-diffusion-web-ui-lineage
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, & Upscalers
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/tdrussell/diffusion-pipe
https://github.com/kohya-ss/sd-scripts
https://github.com/kohya-ss/musubi-tuner

>Z
https://huggingface.co/Tongyi-MAI/Z-Image

>Anima
https://huggingface.co/circlestone-labs/Anima
https://tagexplorer.github.io/
https://animadex.net

>Qwen
https://huggingface.co/collections/Qwen/qwen-image

>Klein
https://huggingface.co/collections/black-forest-labs/flux2

>Wan
https://github.com/Wan-Video/Wan2.2

>LTX-2.3
https://huggingface.co/collections/Lightricks/ltx-23

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Collage: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon

Anonymous
06/12/26(Fri)00:41:21 No.109034999

gm saars

Anonymous
06/12/26(Fri)00:44:17 No.109035014

>>109034950
Is it possible to fuse the power of data center models with local freedom? Perhaps ComfyCloud is the answer. It's clear that everyone has abandoned local computing, GPUs are intentionally stagnant with pitiful consumer amounts of VRAM. It's clear "chinese ram" is also a meme, and even if it wasn't they'd sell out to datacenters just like they sold local out to API. With local compute completely dead, a hybrid approach might be the solution.

Anonymous
06/12/26(Fri)00:44:43 No.109035016

File: Wan21_SCAIL2_00001.mp4 (2.87 MB, 1024x896)

first test
pretty good, it doesn't take the source video background like WAN animate does

Anonymous
06/12/26(Fri)00:44:54 No.109035019

>

Anonymous
06/12/26(Fri)00:46:27 No.109035025

>>109035016
the face looks very weird as she gets closer. it's like the focal length isn't changing

Anonymous
06/12/26(Fri)00:46:43 No.109035027

File: Wan21_SCAIL2_00012.mp4 (3.75 MB, 1248x352)

https://github.com/Comfy-Org/ComfyUI/pull/14373

Anonymous
06/12/26(Fri)00:49:04 No.109035034

Is kijai a member of comfyorg? oh of course not, because he actually develops for local models.
maybe he should start kijaiUI instead of doing free labor for comfy's api adware.

Anonymous
06/12/26(Fri)00:49:54 No.109035038

>>109035016
What is this testing

Anonymous
06/12/26(Fri)00:50:07 No.109035041

oddly high levels of seetheposting today for some reason not sure why desu

Anonymous
06/12/26(Fri)00:50:29 No.109035043

>>109035016
FYI the workflow from the pull request defaults to 65 frames on the first segment and 81 on the second for some reason. If you change the first segment to be 81 as well, you get an extra second of video.
In theory it's possible to dupe the extend section indefinitely to gen whatever length videos you want, but I'm too stupid to figure out how to do that.

Anonymous
06/12/26(Fri)00:51:35 No.109035048

>>109035038
To make deepfakes for pedo socialite class.

Anonymous
06/12/26(Fri)00:51:37 No.109035049

>>109035034
>Model releases
>It's unusable dogshit on native comfy nodes
>KJ nodes make it actually work
A tale as old as time

Anonymous
06/12/26(Fri)00:53:31 No.109035060

idk, I posted this also in lmg, I'm not sure where to put music lol

Ace Step 1.5 XL SFT

https://files.catbox.moe/n5tow1.mp3

Anonymous
06/12/26(Fri)01:00:29 No.109035089

File: wan21scail2 daisy chain.jpg (269 KB, 1987x1224)

>>109035043
>>109035027

Link prev images frames and video frames offset. Then combine video up top.

Note: a 5-frame repeat is used as the anchor during extensions, according to KJ, so calculate accordingly.
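
For anyone doing the math on chained segments, here is a rough Python sketch of the frame arithmetic. It assumes the 5 repeated anchor frames are dropped from each extension when the clips are combined, and that playback is 16 fps (an assumption, though it matches the "extra second" from going 65 to 81 frames). The function names are made up for illustration and are not part of any workflow node.

```python
def total_frames(segment_lengths, overlap=5):
    """Frames in the combined video when every extension re-uses
    `overlap` anchor frames from the previous segment (assumed dropped once)."""
    if not segment_lengths:
        return 0
    first, *extensions = segment_lengths
    # Each extension only contributes the frames after its repeated anchor.
    return first + sum(length - overlap for length in extensions)


def duration_seconds(frames, fps=16):
    """Convert a frame count to seconds; 16 fps is an assumed playback rate."""
    return frames / fps


default = total_frames([65, 81])  # workflow defaults from the pull request
bumped = total_frames([81, 81])   # first segment raised to 81 frames
print(default, bumped, duration_seconds(bumped - default))  # 141 157 1.0
```

Adding more entries to the list models duping the extend section, e.g. `total_frames([81, 81, 81, 81])`.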

Anonymous
06/12/26(Fri)01:02:15 No.109035100

>>109035027
Hows it compare to wan 2.2 animate practically speaking?

Anonymous
06/12/26(Fri)01:02:45 No.109035102

>mfw Resource news

06/11/2026

>i1: A Simple and Fully Open Recipe for Strong Text-to-Image Models
https://zlab-princeton.github.io/i1

>AnchorEdit: Maintaining Temporal Consistency in Multi-turn Image Editing via Causal Memory
https://github.com/xuhang07/AnchorEdit

>Reroute, Don't Remove: Recoverable Visual Token Routing for Vision-Language Models
https://github.com/elmma/mllm-reroute

>ComfyUI-BerniniStudio
https://github.com/CCpt5/ComfyUI-BerniniStudio

>Ideoprompt: plain English to Ideogram 4 structured JSON prompt
https://github.com/cocktailpeanut/ideoprompt

>Orion4D FXMax for ComfyUI
https://github.com/orion4d/Orion4D_FXMax

>JoyAI-Echo — GGUF (for low-VRAM ComfyUI)
https://huggingface.co/realrebelai/JoyAI-Echo_GGUF

06/10/2026

>EvoQuality: Self-Evolving VLM for Image Quality Assessment
https://huggingface.co/ByteDance/EvoQuality

>ComfyTV: Turn ComfyUI into a TapNow / LibTV-style canvas app
https://github.com/jtydhr88/ComfyTV

>PathRelax: Parallel-Path Relaxed Speculative Jacobi Decoding for Accelerating Auto-Regressive Text-to-Image Generation
https://github.com/Haodong-Lei-Ray/PathSpec

>SSR-Merge: Subspace Signal Routing for Training-Free LoRA Merging in Diffusion Models
https://github.com/nagara214/SSR-Merge

>SCAIL-2: Unifying Controlled Character Animation with End-to-end In-Context Conditioning
https://teal024.github.io/SCAIL-2

>IDEAL: In-DEpth ALignment Makes A Discrete Representation AutoEncoder
https://github.com/Row11n/IDEAL

>Image to Prompt: Web app to turn an image into Ideogram 4 JSON prompt
https://github.com/cocktailpeanut/image-to-prompt

>Simple Diffusion XS (sdxs-2b alpha version)
https://huggingface.co/AiArtLab/sdxs-2b

>Bernini-R: Repackaged model files for ComfyUI
https://huggingface.co/Comfy-Org/Bernini-R

06/09/2026

>SCAIL-2: Unifying Controlled Character Animation with End-to-end In-Context Conditioning
https://teal024.github.io/SCAIL-2

>BLM-SGAN
https://github.com/haidy-maher/BLM-SGAN-Text-to-Image-Generation

Anonymous
06/12/26(Fri)01:03:45 No.109035103

>mfw Research news

06/11/2026

>A Comprehensive Ecosystem for Open-Domain Customized Video Generation
https://arxiv.org/abs/2606.11783

>ARGUS: Stacked Multi-View Identity Mosaic Injection for Subject-Preserving Video Generation
https://arxiv.org/abs/2606.11670

>SpecLoR: Spectral Lookahead Rectification for Motion-Coherent Text-to-Video Generation
https://arxiv.org/abs/2606.11969

>Plan-and-Verify Video Reward Reasoning with Spatio-Temporal Scene Graph Grounding
https://arxiv.org/abs/2606.11838

>FitVTON: Fit-aware Virtual Try-On via Body-Garment Size Control
https://zenoning.github.io/FitVTON

>ISAP-3D: Identity-Slot Aligned Part-Aware 3D Generation
https://arxiv.org/abs/2606.12099

>VOID: Defeating Unauthorized Mimicry in Latent Diffusion Models
https://arxiv.org/abs/2606.12263

>MultiToP: Learning to Patch Visual Tokens to Mitigate Hallucinations in Video Large Multimodal Models
https://arxiv.org/abs/2606.11792

>A Scalable PyTorch Abstraction for Multi-GPU Gaussian Splatting
https://arxiv.org/abs/2606.11390

>InternVideo3: Agentify Foundation Models with Multimodal Contextual Reasoning
https://arxiv.org/abs/2606.12195

Anonymous
06/12/26(Fri)01:07:20 No.109035118

File: WanimateCollage_00013.mp4 (2.41 MB, 1130x896)

>>109035100
Wan22 Animate couldn't do this Lapwing video test. The video cuts abruptly to another shot multiple times. The OpenPose preprocessor can't guess half a body, nor can it capture eyes or emotions. So overall, Wan21_SCAIL-2 seems more capable out of the box.

Anonymous
06/12/26(Fri)01:11:50 No.109035139

File: 456475.gif (3.97 MB, 320x222)

Anonymous
06/12/26(Fri)01:13:49 No.109035147

>>109035139
>.gif (3.97 MB, 320x222)
wat

Anonymous
06/12/26(Fri)01:13:53 No.109035148

Wonder how bernini stands up against scail 2. Scail 2 would allow using wan 2.1 loras? Heard bernini is somewhat wan 2.2 related so those loras may work?

Anonymous
06/12/26(Fri)01:15:03 No.109035155

>>109035118
Does it need a sam 3 node or is it sort of automatic/baked in?

Anonymous
06/12/26(Fri)01:19:49 No.109035164

>>109035155

It performs better with SAM3 it seems. Without it, the character is hallucinated or prompted in.

Anonymous
06/12/26(Fri)01:20:07 No.109035166

File: 4775455.webm (3.86 MB, 420x291)

>>109035147
i have to keep jannies on their toes

Anonymous
06/12/26(Fri)01:20:16 No.109035168

>>109032022
>>109034612
>nano shills

The fact you still to this day have to put
>"DONT MAKE AN IMAGE GIVE ME A PROMPT"

At the end of every gemini input, is truly a sign of how shot AI devs and engineers are.

Anonymous
06/12/26(Fri)01:35:00 No.109035224

>>109035168
why are there nano banana shills? I have nano banana and grok imagine, just the non-decoy tier. But they're just not local.

Anonymous
06/12/26(Fri)01:36:11 No.109035235

File: 753466.webm (3.72 MB, 420x291)

nice moves

Anonymous
06/12/26(Fri)01:38:12 No.109035242

File: 56854.webm (3.17 MB, 420x291)

>>109035235
oops wrong one

Anonymous
06/12/26(Fri)01:42:11 No.109035259

File: 4575376.gif (3.51 MB, 320x222)

Anonymous
06/12/26(Fri)01:46:29 No.109035280

Do negative prompts work on ideogram?

Anonymous
06/12/26(Fri)01:53:56 No.109035319

File: Wan21_SCAIL2_00018.mp4 (3.64 MB, 2016x672)

>>109035118
Seek the elden ring, become the elden lord.

Note: The reference aspect ratio should match the video, that seems to improve accuracy.

Anonymous
06/12/26(Fri)01:54:41 No.109035327

cozy breasd

Anonymous
06/12/26(Fri)01:55:41 No.109035332

shameless repost for myself

>>109034840
>>109034927
ok thanks
i am making a significant amount of progress
my main gripe rn is that although my base img is quite crisp, i cant keep that detail
when i look at clips posted on civitai, so many of them are so crisp and clear with the motion while keeping the detail i feel like i am missing something

Anonymous
06/12/26(Fri)01:59:03 No.109035354

File: web_img_freq_export.png (251 KB, 512x512)

>>109034986

Anonymous
06/12/26(Fri)01:59:50 No.109035355

>>109035332

Depends on what you use. My experience is with Wan2.2. Increasing steps, e.g. 4 high/4 low, improves clarity. Increasing resolution does too, but you need high VRAM. If you meant smoothness of motion, then you need to increase FPS or use interpolation nodes to generate extra frames.

Anonymous
06/12/26(Fri)02:01:53 No.109035363

>>109035355
im using ltx eros at the moment and im comparing my stuff to the stuff i see on the related eros civitai pages, so i know its capable of it i just wish i could pull workflows from peoples videos on civit like you can of images

Anonymous
06/12/26(Fri)02:07:02 No.109035381

spare some ram for little old me?

Anonymous
06/12/26(Fri)02:19:36 No.109035438

File: Wan21_SCAIL2_00012.mp4 (2.74 MB, 1726x2048)

Anonymous
06/12/26(Fri)02:21:48 No.109035451

File: Wan21_SCAIL2_00022.mp4 (2.53 MB, 1024x896)

SCAIL is giving me kino, but it seems like it's a 50/50 whether it keeps the background from the original video or hallucinates an entirely new one

Anonymous
06/12/26(Fri)02:28:27 No.109035481

>>109035438
>>109035451
Videochads are eating good.

Anonymous
06/12/26(Fri)02:28:35 No.109035482

>>109035451
neck twist

Anonymous
06/12/26(Fri)02:30:49 No.109035486

>>109035451
try that masking thing anon posted in last thread

Anonymous
06/12/26(Fri)02:32:35 No.109035494

File: Wan21_SCAIL2_00016.mp4 (3.47 MB, 1786x2048)

it can mimic facial expression (and tongue) way better

Anonymous
06/12/26(Fri)02:34:35 No.109035504

File: 1772009263891184.png (303 KB, 448x640)

Anonymous
06/12/26(Fri)02:36:14 No.109035508

File: 1765765141177754.png (116 KB, 256x384)

Anonymous
06/12/26(Fri)02:36:43 No.109035510

>>109035494
do this but with bill gates

Anonymous
06/12/26(Fri)02:38:32 No.109035518

File: 1753368550566162.png (78 KB, 384x256)

Anonymous
06/12/26(Fri)02:40:02 No.109035524

File: 1767648464164387.png (132 KB, 384x256)

Anonymous
06/12/26(Fri)02:45:04 No.109035544

File: Wan21 Scail2 00018(1).mp4 (3.37 MB, 2044x2048)

way better hair rendering, but worse boob jiggle

Anonymous
06/12/26(Fri)02:46:28 No.109035553

File: Wanimate_00181.mp4 (2.82 MB, 932x1280)

wan animate

Anonymous
06/12/26(Fri)02:46:40 No.109035554

>>109035544
what's the original from?

Anonymous
06/12/26(Fri)02:48:31 No.109035561

File: test7.webm (3.41 MB, 1890x992)

Previously

Anonymous
06/12/26(Fri)02:51:25 No.109035571

File: Wan21_SCAIL2_00024.webm (3.79 MB, 1024x896)

>>109035486
I don't think it's a masking issue, since SCAIL automatically masks both the video and input image
Like with this, the first time I tried it I gave it a dogshit low res image and it just made up a new person, but it kept the background of the original video. I try again with a high res image and the background gets deleted.
(ignore the last ~second of weirdness, that's just because I'm bad at math)

Anonymous
06/12/26(Fri)02:51:47 No.109035572

File: Wan21_SCAIL2_00022.mp4 (2.58 MB, 1636x996)

>>109035561
We're so back.

wan2.1_14B_SCAIL_2_fp8_scaled.safetensors

Usage:
60/128 GB Sys Ram
13.6/31.5GB VRAM

Anonymous
06/12/26(Fri)02:52:34 No.109035578

File: 1777691152730786.png (3.8 MB, 1536x1792)

Anonymous
06/12/26(Fri)02:54:48 No.109035586

>>109035571
do this but with jeff bezos

Anonymous
06/12/26(Fri)02:57:26 No.109035596

>>109035572
That's really good. Workflow?

Anonymous
06/12/26(Fri)02:58:12 No.109035599

File: 1767036842434299.gif (198 KB, 384x256)

>>109035524

Anonymous
06/12/26(Fri)02:58:26 No.109035600

>>109035494
what about two people interacting?

Anonymous
06/12/26(Fri)03:01:13 No.109035611

File: Wan21_SCAIL2_HighRes__00005.mp4 (3.46 MB, 736x1328)

>>109035596
Need to git pull this request.

https://github.com/Comfy-Org/ComfyUI/pull/14373

Anonymous
06/12/26(Fri)03:02:41 No.109035617

>>109035611
Sweet thanks. KJgod delivers again

Anonymous
06/12/26(Fri)03:21:31 No.109035685

>>109035451
how u get the source video background into the output video?

Anonymous
06/12/26(Fri)03:26:09 No.109035699

>>109035685

From my usage, it seems a white background on the reference is ignored and you get the video's background. A black/gray bg generates a new background.

Anonymous
06/12/26(Fri)03:27:27 No.109035702

>>109035572
can it also reasonably do multiple character/object references or just one?

Anonymous
06/12/26(Fri)03:29:47 No.109035708

So will these ideogram bbox be possible to be implemented into klein workflows?

Anonymous
06/12/26(Fri)03:33:30 No.109035721

Scail looks really good. I was skeptical but it looks genuinely good

Anonymous
06/12/26(Fri)03:34:30 No.109035725

>>109035702
Nta but in my experience with wanimate it’s one character per pass. You can do one character in the scene then a second one on a second pass

Anonymous
06/12/26(Fri)03:36:04 No.109035729

Why do you read the Comfy pull requests like they are the news or some Reddit post tier content? How empty is your life? Pathetic...

Anonymous
06/12/26(Fri)03:36:17 No.109035730

>>109035708
No but Klein is an edit model so you can just mask areas and tell it to put stuff there

Anonymous
06/12/26(Fri)03:37:26 No.109035734

File: 1752961253494863.jpg (50 KB, 970x816)

How to know which epoch is better? I can't make up my mind by just the outputs
What prompts should I use to check?

Anonymous
06/12/26(Fri)03:41:13 No.109035745

File: Wan21_SCAIL2_00033.mp4 (2.8 MB, 1792x1110)

>>109035600
>>109035725
It can identify multiple people; the theoretical limit is 64. The quality of their interactions will be dependent on how much that specific interaction exists in the training data.

>>109035685
There are two modes, one where the video animates the input image, and one where the character(s) in the input image replace those in the video. For some reason, it's a coin toss whether the latter keeps the video background or hallucinates a new one. Consider
>>109035572
where it's partly kept it, and
>>109035571
where it's gone completely.

Anonymous
06/12/26(Fri)03:42:52 No.109035754

>>109035734
Satan in positive prompt to check which epoch has the most protective effects.

Anonymous
06/12/26(Fri)03:42:59 No.109035755

>>109035734
I took a half year break because I ran into this as well
there is no way out
just accept that you are now FUCKED

Anonymous
06/12/26(Fri)03:43:47 No.109035760

>>109035745
>It can identify multiple people; the theoretical limit is 64. The quality of their interactions will be dependent on how much that specific interaction exists in the training data.
nice. maybe my storage space won't suffer as much then. guess I'll see when I can test both bernini and scail-2.

Anonymous
06/12/26(Fri)03:56:17 No.109035803

>>109035734
>I can't make up my mind
I suspect this problem arises frequently in other areas of your life in addition to your epoch predicament.

Anonymous
06/12/26(Fri)03:56:57 No.109035806

File: 0612155526202-ijSJqV6d1pz.png (37 KB, 387x653)

this seems to do the trick. replace character in source video

Anonymous
06/12/26(Fri)03:56:57 No.109035807

>>109035803
okay how do you know?

Anonymous
06/12/26(Fri)03:59:24 No.109035819

>>109035734
vibecode a blind test. If you can't tell why do you care.
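
A blind test along those lines could look roughly like the Python sketch below. It assumes you already have two lists of outputs from two epochs, generated with the same prompts and seeds, and that you supply your own `choose` callback (hypothetical) that shows both images and returns "left" or "right" without revealing which epoch made which.

```python
import random
from collections import Counter


def blind_compare(outputs_a, outputs_b, choose, seed=0):
    """Tally blind preferences between two epochs.

    outputs_a / outputs_b: paths of matching gens (same prompt and seed).
    choose(left, right): returns "left" or "right"; it never sees the labels.
    """
    rng = random.Random(seed)
    wins = Counter()
    for a, b in zip(outputs_a, outputs_b):
        pair = [("A", a), ("B", b)]
        rng.shuffle(pair)  # randomize which epoch lands on which side
        (left_label, left), (right_label, right) = pair
        picked = choose(left, right)
        wins[left_label if picked == "left" else right_label] += 1
    return wins
```

If the tally comes out near 50/50 over a decent number of pairs, the epochs are probably interchangeable for your purposes.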

Anonymous
06/12/26(Fri)04:00:02 No.109035823

File: Wan21_SCAIL2_00036.mp4 (1.34 MB, 1500x690)

Anonymous
06/12/26(Fri)04:01:58 No.109035834

>>109035823
the lack of jiggling leaves a lot to be desired but the consistency maintained when it changes angle is impressive

Anonymous
06/12/26(Fri)04:03:47 No.109035839

>>109035014
>Is it possible to fuse the power of data center models with local freedom?
what kind of retarded question is this? local models are that fusion, it's a miracle that they even exist in the first place
>GPUs are intentionally stagnant with pitiful consumer amounts of VRAM
6 years ago, large amounts of VRAM were almost unilaterally considered a price-hiking strategy for almost any consumer application except semi-professional video production, and VRAM upselling over the years has caused a continuous decline in software quality and resulted in catastrophically worse performance of operating systems, basically all software, the entire video game industry, because what was at the time uselessly high memory bandwidth was relied on to "forgive" bad programming
we are now still probably well within the early adoption phase of local AI and 90% of consumers continue to not give a shit about it at all, which is apparently fertile soil for you to crumple up reality into this little cum napkin you've created after watching a gamers nexus video
>a hybrid approach might be the solution
a solution to fucking what? if it's not local, then it's not fucking local, there is no "hybrid" between local and remote, because that would be fucking remote

Anonymous
06/12/26(Fri)04:06:36 No.109035850

>>109035734
Diverse prompts help, anatomic positions besides 1girl, standing help, text helps, different styles help.
Let's say you are training a lora from the photos of a 1girl. I would use something like this:
1) Just 1girl, no elaborate conditioning, see how it swings when given reins
2) 1girl, lying on sand, from above, tongue out, black bikini
3) 1girl, jumping, on air, flying kick
4) 1girl, closed eyes, smile, holding a sign that says: "IS MY LORA FRIED NOW?"
5) 1girl, painting, Renaissance painting, outdoors, forest
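
To compare epochs fairly with a prompt set like this, every checkpoint should see the same prompts and seeds. A minimal Python sketch of building that grid follows; the job dict layout, `build_jobs`, and the checkpoint filenames in the usage note are hypothetical, and feeding the jobs to your actual generation backend is left out.

```python
from itertools import product

# The five test prompts from the list above.
TEST_PROMPTS = [
    "1girl",
    "1girl, lying on sand, from above, tongue out, black bikini",
    "1girl, jumping, on air, flying kick",
    '1girl, closed eyes, smile, holding a sign that says: "IS MY LORA FRIED NOW?"',
    "1girl, painting, Renaissance painting, outdoors, forest",
]


def build_jobs(checkpoints, prompts=TEST_PROMPTS, seeds=(1, 2)):
    """One job per (checkpoint, prompt, seed) so epochs are compared like for like."""
    return [
        {"checkpoint": ckpt, "prompt": prompt, "seed": seed}
        for ckpt, prompt, seed in product(checkpoints, prompts, seeds)
    ]
```

For example, `build_jobs(["lora_e08.safetensors", "lora_e10.safetensors"])` yields 20 jobs: 2 checkpoints x 5 prompts x 2 seeds.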

Anonymous
06/12/26(Fri)04:06:48 No.109035852

File: Wan21_SCAIL2_00038.mp4 (2.72 MB, 1282x1500)

Anonymous
06/12/26(Fri)04:15:19 No.109035896

File: Wan21_SCAIL2_00040.mp4 (2.71 MB, 1500x1330)

it also replaced the old man in the background. how to retain the old man?

Anonymous
06/12/26(Fri)04:21:32 No.109035921

File: scail2 replacement mode.jpg (27 KB, 1091x81)

>>109035896

Use white background on reference character..

Anonymous
06/12/26(Fri)04:27:39 No.109035945

File: Wan21_SCAIL2_00051.mp4 (3.33 MB, 1792x960)

>>109035760
After a bunch of testing, the segmentation tool can identify and isolate objects, but SCAIL can't replace them, it can only replace people.
It can handle replacing multiple people in one pass, but it seems to struggle a bit as you add more and more people

Anonymous
06/12/26(Fri)04:37:32 No.109035974

how do you organize all your gens?

Anonymous
06/12/26(Fri)04:38:13 No.109035980

>>109035974
badly

Anonymous
06/12/26(Fri)04:45:16 No.109036007

File: Wan21_SCAIL2_00046.mp4 (3.45 MB, 1280x1500)

Anonymous
06/12/26(Fri)04:45:29 No.109036009

>>109035974
i just do %date
when i feel like i have too much bloat, i go back and delete all the gacha rolls, usually just keeping 1-2 of each kind of gen i did
i have a separate area of folders where i specifically copy outputs that i want to keep for reference
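
If your outputs are already in one flat folder, a date-based layout can be applied after the fact. The Python sketch below assumes file modification time is a good enough stand-in for gen date; `sort_by_date` is a hypothetical helper, not something ComfyUI ships.

```python
import shutil
from datetime import datetime
from pathlib import Path


def sort_by_date(output_dir):
    """Move every file in output_dir into a YYYY-MM-DD subfolder based on its mtime."""
    output_dir = Path(output_dir)
    moved = []
    # sorted() snapshots the listing before any subfolders get created.
    for path in sorted(output_dir.iterdir()):
        if not path.is_file():
            continue  # leave existing folders alone
        day = datetime.fromtimestamp(path.stat().st_mtime).strftime("%Y-%m-%d")
        target = output_dir / day
        target.mkdir(exist_ok=True)
        moved.append(Path(shutil.move(str(path), str(target / path.name))))
    return moved
```

Run it on a copy first; it moves files rather than copying them.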

Anonymous
06/12/26(Fri)04:47:03 No.109036017

>>109036007
ok this is getting pretty good, can you put this asian bitch in very out of place scenes, like have her replace a girl in a game of thrones clip

Anonymous
06/12/26(Fri)04:48:10 No.109036020

>>109036007
Would be good “imagine if we imported these ppl instead” meme gifs

Anonymous
06/12/26(Fri)04:54:03 No.109036038

>>109036017
if you can supply the clip

Anonymous
06/12/26(Fri)04:54:26 No.109036040

7 is a prime
as are 9, 11 and 13

Anonymous
06/12/26(Fri)04:56:18 No.109036046

File: Wan21_SCAIL2_00027.mp4 (644 KB, 1548x636)

>>109035945
Tried sequential replacement, still lost some data on the first char. Still. Wan2.1 has high potential. I'm done testing for the night.

Anonymous
06/12/26(Fri)04:59:06 No.109036058

>>109036046
I think I need to segment video number two to feed to video 3 to clamp down the masked area. Gonna try that later.

Anonymous
06/12/26(Fri)05:00:10 No.109036059

>>109036046
curious, what's the missing data?

Anonymous
06/12/26(Fri)05:02:15 No.109036069

File: 0.png (1.43 MB, 404x924)

>>109036059
Her yellow dress is gone on second pass.

Anonymous
06/12/26(Fri)05:02:35 No.109036070

>>109035945 >>109036046
hey, that's not bad at all! thank you very much for trying this

Anonymous
06/12/26(Fri)05:06:35 No.109036086

>>109036069
I see, thx

Anonymous
06/12/26(Fri)05:31:28 No.109036150

File: Wan21_SCAIL2_00054.mp4 (1.46 MB, 1500x452)

Anonymous
06/12/26(Fri)05:34:57 No.109036160

>>109036150
damn that looks good

Anonymous
06/12/26(Fri)05:42:14 No.109036185

>>109035572
> wan2.1_14B_SCAIL_2_fp8_scaled.safetensors
Only one checkpoint? No high, no low?

Anonymous
06/12/26(Fri)05:48:49 No.109036202

>>109036150
fix movies
fix tv shows

Anonymous
06/12/26(Fri)05:52:36 No.109036215

File: Wan21_SCAIL2_00059.mp4 (2.2 MB, 1800x684)

Anonymous
06/12/26(Fri)05:56:35 No.109036229

File: Wan21_SCAIL2_00083.mp4 (3.05 MB, 1792x1408)

>>109036215
Are you just using the workflow from the pull request?
I can't get it to reliably keep the background from the original video, and I've confirmed it's not the 'replacement mode' toggle, if I turn that off it keeps the background from the image.
https://files.catbox.moe/zmqr82.png

Anonymous
06/12/26(Fri)06:00:58 No.109036245

>>109036229
no idea, it just works when I switch replacement mode = true

Anonymous
06/12/26(Fri)06:01:25 No.109036249

>>109036229
I can’t believe you would edit Ronald McDonald out of video instead of editing him in

Anonymous
06/12/26(Fri)06:02:42 No.109036254

>>109036215
kino

Anonymous
06/12/26(Fri)06:11:47 No.109036291

Is there a turbo lora for Z Image where the outputs aren't completely slopped

Anonymous
06/12/26(Fri)06:12:55 No.109036296

maybe I'm just a schizo but is the reason civitai hasn't implemented an ideogram4 filter category, like they do for other models, because of the license and they might, very soon, delete the loras that have been posted?

Anonymous
06/12/26(Fri)06:20:37 No.109036339

File: Wan21_SCAIL2_00089.mp4 (2.79 MB, 592x1040)

>>109036249

Anonymous
06/12/26(Fri)06:25:37 No.109036368

>>109036339
i saw this in the playplace

Anonymous
06/12/26(Fri)06:26:08 No.109036373

>>109036291
>Turbo lora
??? Just use the turbo model.

Anonymous
06/12/26(Fri)07:00:42 No.109036528

is it me or is every wan svi workflow trash?
Can't get nearly the quality I get with regular wan22 i2v

Anonymous
06/12/26(Fri)07:01:07 No.109036530

File: Wan21_SCAIL2_00079.mp4 (2.68 MB, 1800x1144)

Anonymous
06/12/26(Fri)07:07:38 No.109036558

question
is the rtx upscale node in a bunch of the video workflows actually totally independent, like could i make a new workflow with just that and plug in any video and have it upscale it?

Anonymous 06/12/26(Fri)07:12:05 No.109036582

>>109036558
yes

Anonymous 06/12/26(Fri)07:14:21 No.109036589

>>109036558
yes plus use the rtx node from this node pack: deno-custom-nodes

Anonymous 06/12/26(Fri)07:15:47 No.109036600

>>109036589
why?

Anonymous 06/12/26(Fri)07:16:27 No.109036602

>>109036530
> gosling and stone

Anonymous 06/12/26(Fri)07:17:45 No.109036608

File: ComfyUI-IdeogramUpscaled_(...).jpg (1.75 MB, 2720x3520)

I've been using seedvr2 upscaler for images and videos for months now. Is RTX upscaling better?

Anonymous 06/12/26(Fri)07:18:30 No.109036612

>>109036600
or don't

Anonymous 06/12/26(Fri)07:19:41 No.109036615

>>109036608
no. it's faster. i don't upscale videos.

Anonymous 06/12/26(Fri)07:36:57 No.109036673

american hours: brown sperging and images
european hours: wealthy posters and video gens

Anonymous 06/12/26(Fri)07:39:55 No.109036685

>>109036673
im from australia
is it peak posting quality right now?

Anonymous 06/12/26(Fri)07:41:01 No.109036694

>>109036673
I don't think wealthy europeans are on 4chan at this hour. They are working.

Anonymous 06/12/26(Fri)07:44:08 No.109036706

>>109036694
wealthy people don't work

Anonymous 06/12/26(Fri)07:44:28 No.109036709

File: %Date 00001.mp4 (1.87 MB, 720x1080)

I'm learning over here.

Anonymous 06/12/26(Fri)07:46:44 No.109036721

>>109036706
most of them do. and hard.

Anonymous 06/12/26(Fri)07:48:22 No.109036731

>>109036709
>tz tx
what did she mean by this?

Anonymous 06/12/26(Fri)08:10:57 No.109036865

File: Ltx 00002.mp4 (1.52 MB, 720x1080)

>>109036731
was meant to be "i can ltx now"

text is for tomorrow, tonight, its more ass

Anonymous 06/12/26(Fri)08:13:51 No.109036879

>>109036865
Text is not one of LTX's strong points.

Anonymous 06/12/26(Fri)08:18:04 No.109036910

>>109034986
5

Anonymous 06/12/26(Fri)08:21:48 No.109036935

>>109036865
oooh yeah

Anonymous 06/12/26(Fri)08:30:10 No.109036985

File: 354686.webm (3.85 MB, 420x291)

>>109036879
i remember making a video of a person wearing a shirt that has words on it, and it seemed to work correctly when i frame injected a flat image of the shirt as well as wrote in the prompt what the shirt said. i should try more text experiments

Anonymous 06/12/26(Fri)08:41:17 No.109037067

File: 1754545975117882.jpg (654 KB, 1248x1824)

Anonymous 06/12/26(Fri)08:47:35 No.109037113

>>109036985
Yeah but I mean SDXL can do text if you want it to. But the competence is different.

Anonymous 06/12/26(Fri)08:48:27 No.109037120

File: Wan21_SCAIL2_00095.mp4 (445 KB, 896x896)

Anonymous 06/12/26(Fri)08:49:05 No.109037127

>>109036608
hey bruh, can you do an anima lora version? ideogram is slow like cdprojekt

Anonymous 06/12/26(Fri)08:50:15 No.109037132

File: 1769137308829094.png (164 KB, 461x414)

things are looking exciting frens
What are my chances of running ID4 and SCAIL on a 6 GB GPU?

Anonymous 06/12/26(Fri)08:51:41 No.109037141

>>109035852
Sauce for original video?

Anonymous 06/12/26(Fri)08:52:59 No.109037148

Can you make SCAIL work off of vague animations? Like if I make some basic 3D model do janky animations, could SCAIL translate it into more natural movements?

Anonymous 06/12/26(Fri)08:54:29 No.109037156

>>109037127
I've not messed around with anima besides one of the early preview versions. How well does it take to 3DCG?

Anonymous 06/12/26(Fri)08:55:23 No.109037163

File: 1762578036024869.gif (3.74 MB, 714x714)

>>109037141

Anonymous 06/12/26(Fri)08:56:55 No.109037177

>>109037132
doubt it. you probably need at least 12 gigs.

Anonymous 06/12/26(Fri)08:57:08 No.109037179

File: Wbdxfbsfb.jpg (10 KB, 180x179)

scrolling through civitai for the past hour and can't find a single artist style that I like

Anonymous 06/12/26(Fri)08:57:35 No.109037184

>>109037177
grim
wish I wasn't poor

Anonymous 06/12/26(Fri)08:57:38 No.109037185

recommended tool for designing json prompts and bounding with ideogram4? doesn't have to be integrated with workflow, standalone/web is fine too

Anonymous 06/12/26(Fri)09:04:34 No.109037227

>>109037185
there are people shilling them on reddit. just go pick one

Anonymous 06/12/26(Fri)09:05:04 No.109037231

>>109037185
Since you're not a baby dick vramlet just fucking use a local llm

Anonymous 06/12/26(Fri)09:05:59 No.109037238

https://civitai.com/models/1662740/lenovo-ultrareal?modelVersionId=3025161

Anonymous 06/12/26(Fri)09:12:23 No.109037273

>>109037238
https://civitai.red/models/2688234/realism-engine-ideogram-4

Anonymous 06/12/26(Fri)09:16:46 No.109037298

File: 1777910216936643.jpg (1.86 MB, 1344x2240)

Anonymous 06/12/26(Fri)09:18:37 No.109037307

>>109037298
shouldn't it be chocolate milk?

Anonymous 06/12/26(Fri)09:27:18 No.109037353

File: 1763524588009940.jpg (2 MB, 1344x2240)

>>109037307
It's fresh out the tap.

Anonymous 06/12/26(Fri)09:32:38 No.109037383

not gonna lie, being able to oneshot comic pages locally (even if the styles are horrendous) is a gamechanger

Anonymous 06/12/26(Fri)09:37:47 No.109037408

>>109037383
>even if the styles are horrendous
The fix for this is as simple as finding a hentai artist you like and throwing the pages into the training script.

Anonymous 06/12/26(Fri)09:55:16 No.109037493

>>109037132
I think nf4 Ideogram might work.
I wouldn't bother with video gen on that GPU though.
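
Rough math on why nf4 might squeeze in: 4-bit quantization stores about half a byte per weight, versus two bytes for fp16. A minimal sketch, assuming a placeholder parameter count (the 12B figure is made up for illustration, not Ideogram's actual size). It counts weights only and ignores quantization constants, activations, the text encoder, and the VAE, so real usage will be higher.

```python
def weight_gib(num_params: float, bits_per_weight: float) -> float:
    """Approximate memory needed for the model weights alone, in GiB."""
    return num_params * bits_per_weight / 8 / 2**30


# Hypothetical parameter count, for illustration only.
HYPOTHETICAL_PARAMS = 12e9

for name, bits in [("fp16", 16), ("fp8", 8), ("nf4", 4)]:
    print(f"{name}: {weight_gib(HYPOTHETICAL_PARAMS, bits):.1f} GiB")
```

Swap in the real parameter count once you know it. If the nf4 number lands near or above your VRAM, expect offloading to system RAM and slow gens.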

Anonymous 06/12/26(Fri)09:57:30 No.109037505

>>109037383
What model are you using for this?
