/g/ - Technology




File: tmp.jpg (780 KB, 3264x3264)
780 KB
780 KB JPG
Discussion of free and open source text-to-image models

Previous /ldg/ bread : >>102418421

>Beginner UI
EasyDiffusion: https://easydiffusion.github.io
Fooocus: https://github.com/lllyasviel/fooocus
Metastable: https://metastable.studio

>Advanced UI
Automatic1111: https://github.com/automatic1111/stable-diffusion-webui
ComfyUI: https://github.com/comfyanonymous/ComfyUI
Forge: https://github.com/lllyasviel/stable-diffusion-webui-forge
InvokeAI: https://github.com/invoke-ai/InvokeAI
SD.Next: https://github.com/vladmandic/automatic
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI

>Use a VAE if your images look washed out
https://rentry.org/sdvae

>Model Ranking
https://imgsys.org/rankings

>Models, LoRAs & training
https://civitai.com
https://huggingface.co
https://aitracker.art
https://github.com/Nerogar/OneTrainer
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts
https://github.com/kohya-ss/sd-scripts/tree/sd3

>Flux
https://huggingface.co/spaces/black-forest-labs/FLUX.1-schnell
https://comfyanonymous.github.io/ComfyUI_examples/flux

>Pixart Sigma & Hunyuan DIT
https://huggingface.co/spaces/PixArt-alpha/PixArt-Sigma
https://huggingface.co/spaces/Tencent-Hunyuan/HunyuanDiT
https://huggingface.co/comfyanonymous/hunyuan_dit_comfyui
Nodes: https://github.com/city96/ComfyUI_ExtraModels

>Index of guides and other tools
https://rentry.org/sdg-link
https://rentry.org/rentrysd

>Try online without registration
txt2img: https://www.mage.space
img2img: https://huggingface.co/spaces/huggingface/diffuse-the-rest
sd3: https://huggingface.co/spaces/stabilityai/stable-diffusion-3-medium

>Maintain thread quality
https://rentry.org/debo

>Related boards
>>>/h/hdg
>>>/e/edg
>>>/c/kdg
>>>/d/ddg
>>>/b/degen
>>>/vt/vtai
>>>/aco/sdg
>>>/u/udg
>>>/tg/slop
>>>/trash/sdg
>>>/aco/aivg
>>
i heard this is the cozy thread
>>
>>102434655
many people are saying this
>>
>>102434662
>Flux is fundamentally fucked
True but not because it can't do styles (it can). It's fucked because it's too large to run comfortably on the average anon's card.
>>
>>102434682
>It's fucked because it's too large to run comfortably on the average anons card.
it can be run on 8GB VRAM cards though if you use quants, it's not asking for that much when you think about it
>>
>>102434694
Even with quants it's painfully slow.
>>
>>102434723
you can use schnell and get your image in 4 steps, but yeah, I prefer to wait more for the quality dev is giving
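if anyone wants to try schnell outside of Comfy, something like this works through diffusers (a minimal sketch, assuming a recent diffusers build with FluxPipeline; schnell is distilled, so CFG stays off and 4 steps is the intended count):

import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-schnell", torch_dtype=torch.bfloat16
)
pipe.enable_model_cpu_offload()  # offload so it fits on smaller cards

image = pipe(
    "a cozy log cabin in a snowy forest at dusk",
    num_inference_steps=4,   # schnell is a 1-4 step distilled model
    guidance_scale=0.0,      # distilled models ignore CFG
    height=1024,
    width=1024,
).images[0]
image.save("schnell_test.png")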
>>
>>102434568
All of those images are ai?
>>
>>102434780
yeah, that's the point
>>
>>102434799
Neat!
I never visit these threads, you guys seem cool!
>>
>>102434833
thanks :3
>>
>>102434329
I think I made a mistake, but I'm going to let the training finish before retrying. It seems that if you specify --network_args in additional parameters, it will go with defaults for certain settings, even if there are no arguments.
>>
https://github.com/THUDM/CogVideo/issues/291#issuecomment-2354409848
>We have not yet released the I2V model, it is expected to be released this month, and the final preparations for the model are underway
I thought we could already use it locally, or there's something I've missed
>>
So I added Pixtral as an option to my captioning scripts (only took a few lines of code, which was nice). Testing it out on both anime and real images. It's genuinely so fucking bad. Like it makes me think something is broken, but the text it generates is completely coherent. Literally 80% of what it describes in the image is wrong. Is it really that bad? It can't be, right? I'm using the Transformers implementation, not the mistralai one, if that matters. Maybe it's subtly broken somehow?
>>
>>102435216
no no, it's completely shit, that's all there is to it lol
>>
>>102435038
Being generous, even with local image gen we can say we're "close to" SaaS but with video we're not even in the same universe.
>>
>>102435309
>video we're not even in the same universe.
we can also add text to music/songs to that list, we're 10 universes away from suno/udio kek
>>
>>102435236
How do they even fuck up that bad then? Joycaption is two nn.Linear layers in a trained projector between a pretrained, frozen CLIP model and a pretrained, frozen LLM. Literally two fucking weight matrices. And it mogs Pixtral. Just embarrassing.
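for reference, the projector really is about this much code (my own sketch with made-up dims and the usual GELU in between, not JoyCaption's actual repo):

import torch
import torch.nn as nn

class TwoLayerProjector(nn.Module):
    # maps frozen CLIP image features into the frozen LLM's token-embedding space;
    # these two Linear layers are the only trained weights in the whole setup
    def __init__(self, clip_dim=1024, llm_dim=4096, hidden=4096):
        super().__init__()
        self.fc1 = nn.Linear(clip_dim, hidden)
        self.fc2 = nn.Linear(hidden, llm_dim)

    def forward(self, clip_features):
        return self.fc2(torch.nn.functional.gelu(self.fc1(clip_features)))

proj = TwoLayerProjector()
clip_features = torch.randn(1, 577, 1024)   # stand-in for frozen CLIP patch embeddings
soft_tokens = proj(clip_features)           # (1, 577, 4096), prepended to the LLM's text embeddings
# the frozen LLM then decodes the caption conditioned on these soft tokens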
>>
File: __00202_.png (2.83 MB, 1560x2280)
2.83 MB
2.83 MB PNG
>>
File: 00008-715896776.jpg (156 KB, 1080x1280)
156 KB
156 KB JPG
>>
>>102435355
Knowing MistralAI, they got roasted so hard by the LLM community for "selling their soul to the devil by keeping their best models as an API when they promised full open source releases" that they'll throw any kind of failed experiment shit at us as a way to say "see, we're still thinking of you peasants" :^)
>>
File: ComfyUI_01557_.png (1.5 MB, 1024x1024)
1.5 MB
1.5 MB PNG
so what comes after Flux?
>>
>>102435402
>so what comes after Flux?
Flux video
https://blackforestlabs.ai/up-next/
>>
>>102435402
Salvation from this Mortal Coil
>>
>>102435407
woah better than I expected.
what about image generators tho?
>>
>>102435431
>what about image generators tho?
I don't want to doom, but we'll probably never get a base model with the quality of Flux, the best we can do now is to finetune that mf to make it even better I guess
>>
It's all matrices all the way down.
Matrices shaped into other matrices viewed into other matrices all turned into a loss function.
>>
File: 00073-4237509925.png (1.01 MB, 1024x1024)
1.01 MB
1.01 MB PNG
>>
File: 00030-2814665992.jpg (887 KB, 1080x1280)
887 KB
887 KB JPG
>>
File: 00010-2034227891.png (1.25 MB, 1024x1024)
1.25 MB
1.25 MB PNG
>>
>>102435444
trust the plan
>>
File: 00113-2577591480.png (1.32 MB, 1024x1024)
1.32 MB
1.32 MB PNG
>>
File: amputate the mutant.png (1.59 MB, 1344x768)
1.59 MB
1.59 MB PNG
Good Comfy workflows to automate hand fixes? The Impact pack to detect and fix faces works great, but the automatic hand detection/fixing seems to have little or no positive impact
>>
File: 00197-1997104261.png (1.22 MB, 1024x1024)
1.22 MB
1.22 MB PNG
>>
File: 00032-3559231248.jpg (779 KB, 1080x1280)
779 KB
779 KB JPG
>>
File: bidler2.png (3.92 MB, 2000x2000)
3.92 MB
3.92 MB PNG
>>
>>
File: 00008-464414774.jpg (752 KB, 1080x1280)
752 KB
752 KB JPG
>>
File: 00010-2797418403.jpg (903 KB, 1080x1280)
903 KB
903 KB JPG
>>
File: 00063-1422458154.png (1.07 MB, 1024x1024)
1.07 MB
1.07 MB PNG
>>
>>102435407
shits gonna be slow as fuck even on a 4090
>>
>>102435777
yeah, for example CogVideoX-5b: 8 fps + 50 steps took 2 min on an H100, we're fucked lol
>>
File: 00744.png (861 KB, 1024x1024)
861 KB
861 KB PNG
>>
File: temp_vcmlh_00012_.png (1.05 MB, 832x1216)
1.05 MB
1.05 MB PNG
>>
File: yodalady.png (895 KB, 752x896)
895 KB
895 KB PNG
>>
File: rattigieg1.png (462 KB, 512x672)
462 KB
462 KB PNG
>>
Miraculous thread
>>
File: campaign.png (3.87 MB, 2048x1288)
3.87 MB
3.87 MB PNG
>>
Dare I say... us?
>>
>>102435841
kek, really based image not gonna lie
>>
File: 00005-376976035.png (1.55 MB, 1024x1024)
1.55 MB
1.55 MB PNG
>>
File: 00021-3845168184.jpg (867 KB, 1080x1280)
867 KB
867 KB JPG
>>
File: 00000-376976030.png (1.36 MB, 1024x1024)
1.36 MB
1.36 MB PNG
>>
>>102435850
I was at work and nearly kek'd out loud when my nigga sent me that pic
>>
File: 00017-2754352315.png (1.58 MB, 1024x1024)
1.58 MB
1.58 MB PNG
>>
File: 00080.png (1.03 MB, 896x896)
1.03 MB
1.03 MB PNG
>>
File: 00000-3227682127.png (1.34 MB, 1024x1024)
1.34 MB
1.34 MB PNG
>>
>>102435841
No, I like manual artists too
>>
>>102435916
I like artists who respect AI, those who want AI dead won't be missed if they lose their jobs lol
>>
File: 00003-3498046325.png (1.35 MB, 1024x1024)
1.35 MB
1.35 MB PNG
>>
Next one is an actual gen I promise lol
>>
File: 00000-24400397.png (1.53 MB, 1024x1024)
1.53 MB
1.53 MB PNG
>>
>>102435923
Fair but I'd always prefer the olive branch first
>>
File: 00001-2456067307.png (1.48 MB, 1024x1024)
1.48 MB
1.48 MB PNG
>>
>>
File: 00000-2731582637.png (997 KB, 1024x1024)
997 KB
997 KB PNG
>>
File: 00021-2397995919.png (885 KB, 1024x1024)
885 KB
885 KB PNG
>>
File: 00009-2456132599.png (945 KB, 1024x1024)
945 KB
945 KB PNG
>>
File: 00017-2674112797.png (960 KB, 1024x1024)
960 KB
960 KB PNG
>>
File: 00005-3322961658.png (1.32 MB, 1024x1024)
1.32 MB
1.32 MB PNG
>>
>>102436032
would
>>
File: ComfyUI_temp_aydfz_00891_.png (3.59 MB, 1434x1434)
3.59 MB
3.59 MB PNG
>>
>>102436072
shameless motherfucker
>>
File: 00037-2412469408.png (1.26 MB, 1024x1024)
1.26 MB
1.26 MB PNG
>>
>>102435782
>+ 50 steps
what if flux video can do with fewer steps?
if we only need like 4 it could be okay
>>
File: -.png (1.48 MB, 1344x768)
1.48 MB
1.48 MB PNG
>>
>>102435695
I think everyone forgot about hand detailers now that flux gets it right most of the time
>>
File: ComfyUI_temp_aydfz_00719_.png (2.49 MB, 1434x1434)
2.49 MB
2.49 MB PNG
>>
>>102436172
Haven't heard good things about flux for anime
>>
File: ComfyUI_temp_aydfz_01095_.png (3.75 MB, 1434x1434)
3.75 MB
3.75 MB PNG
>>
>>102436192
Its default style isn't the most pleasant but there are several style loras that are pretty good
>>
File: 00030-345097352.png (1.43 MB, 1024x1024)
1.43 MB
1.43 MB PNG
>>
File: ComfyUI_temp_ueozj_00085_.png (2.13 MB, 1434x1434)
2.13 MB
2.13 MB PNG
>>
File: ComfyUI_temp_aydfz_00961_.png (3.38 MB, 1434x1434)
3.38 MB
3.38 MB PNG
>>
>>102436215
It doesn't get booru tags, though, from what I've heard, which really limits what you can do

Maybe I'll check it out if a proper full finetune is released, but it seems pretty inferior right now
>>
File: ComfyUI_temp_plsgq_00177_.png (1.48 MB, 1024x1024)
1.48 MB
1.48 MB PNG
>>
File: ComfyUI_temp_ueozj_00028_.png (3.7 MB, 1434x1434)
3.7 MB
3.7 MB PNG
>>
File: ComfyUI_temp_plsgq_00046_.png (1.16 MB, 1024x1024)
1.16 MB
1.16 MB PNG
>>
File: ComfyUI_temp_aydfz_01072_.png (3.89 MB, 1434x1434)
3.89 MB
3.89 MB PNG
>>
File: ComfyUI_temp_aydfz_00916_.png (2.55 MB, 1434x1434)
2.55 MB
2.55 MB PNG
>>
File: konosuba flux dev v1.jpg (826 KB, 3072x1024)
826 KB
826 KB JPG
>>102436245

A booru-tag style dataset works. Just cook your own waifu LoRA with booru tags.
>>
File: ComfyUI_temp_aydfz_00988_.png (3.41 MB, 1434x1434)
3.41 MB
3.41 MB PNG
>>
File: 00000-575325872.png (1.59 MB, 1024x1024)
1.59 MB
1.59 MB PNG
>>
File: 00000-1280067345.png (1.34 MB, 1024x1024)
1.34 MB
1.34 MB PNG
>>
File: 00000-1778453302.png (1.62 MB, 1072x1072)
1.62 MB
1.62 MB PNG
>>
File: 00001-2333064770.png (1.56 MB, 1024x1024)
1.56 MB
1.56 MB PNG
>>
>>102436317
I'm the opposite - I prefer to cook my own LoRas for style, and I don't really care about the characters it recognizes, but more the concepts from boorus. How well it understands specific concept tags like "bob cut", "doll joints", "goo girl", "yandere", etc. is what's really important, to give random examples

I'm sure it can infer those to some extent, but the reports I've heard say that it's far inferior to the SDXL models in that area at the moment
>>
File: 00004-3527511055.png (1.6 MB, 1024x1024)
1.6 MB
1.6 MB PNG
>>
File: 00009-125602019.png (1.52 MB, 1024x1024)
1.52 MB
1.52 MB PNG
>>
File: 00003-3016189836.png (1.49 MB, 1024x1024)
1.49 MB
1.49 MB PNG
>>
File: 00011-3199069743.png (1.43 MB, 1024x1024)
1.43 MB
1.43 MB PNG
>>
File: 00006-2785877804.png (58 KB, 768x768)
58 KB
58 KB PNG
>>
File: 00005-42579165.png (1.36 MB, 1024x1024)
1.36 MB
1.36 MB PNG
>>
File: 00007-3347182046.png (1.56 MB, 1024x1024)
1.56 MB
1.56 MB PNG
>>
File: 00431.png (958 KB, 1024x1024)
958 KB
958 KB PNG
>>
>>102436479
Beautiful
>>
File: ComfyUI_temp_cupog_00068_.png (1.36 MB, 1024x1024)
1.36 MB
1.36 MB PNG
>>
File: 00444.png (1.2 MB, 1024x1024)
1.2 MB
1.2 MB PNG
>>
File: dino_00163_.png (1.05 MB, 1024x1024)
1.05 MB
1.05 MB PNG
for training, do you guys prefer adafactor, adamw8, LION, prodigy, or something else?
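they're mostly drop-in swaps in a training loop, something like this (rough sketch, assumes the usual bitsandbytes / lion-pytorch / prodigyopt / transformers packages; Prodigy wants lr around 1.0 since it adapts its own step size):

import torch

def make_optimizer(params, name="adamw8bit", lr=1e-4):
    # rough sketch -- swap these in your trainer and compare runs on the same dataset
    if name == "adamw":
        return torch.optim.AdamW(params, lr=lr, weight_decay=0.01)
    if name == "adamw8bit":
        import bitsandbytes as bnb
        return bnb.optim.AdamW8bit(params, lr=lr, weight_decay=0.01)
    if name == "adafactor":
        from transformers.optimization import Adafactor
        return Adafactor(params, lr=lr, scale_parameter=False, relative_step=False)
    if name == "lion":
        from lion_pytorch import Lion
        return Lion(params, lr=lr / 3, weight_decay=0.01)   # Lion usually wants a lower lr
    if name == "prodigy":
        from prodigyopt import Prodigy
        return Prodigy(params, lr=1.0)   # Prodigy adapts its own step size, start at 1.0
    raise ValueError(f"unknown optimizer: {name}")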
>>
File: ComfyUI_temp_cmdfs_00445_.png (1.31 MB, 1024x1024)
1.31 MB
1.31 MB PNG
>>
File: ComfyUI_00014_.png (1 MB, 832x1216)
1 MB
1 MB PNG
I couldn't get flux to run in comfy using the github documentation. Is there a manual for what everything is doing? I really don't want to be doing the pull-it-together-from-a-billion-workflows bullshit again. I understand basic concepts, but have no idea why there are over a dozen sampler options as part of comfy core.
>>
File: ComfyUI_temp_aydfz_00154_.png (1.68 MB, 1024x1024)
1.68 MB
1.68 MB PNG
>>
File: chrome_fynJTPGFRx.gif (2.17 MB, 237x186)
2.17 MB
2.17 MB GIF
is there an automatic1111 implementation of the technique in the new Steve Mould video? https://www.youtube.com/watch?v=FMRi6pNAoag
i'd love to make some twisty squares pics
>>
>>102436568
no more half measures walter
>>
File: workflow.jpg (203 KB, 1873x780)
203 KB
203 KB JPG
>>102436535
this is my basic ass workflow
>>
File: ComfyUI_temp_aydfz_00639_.png (3.66 MB, 1291x1291)
3.66 MB
3.66 MB PNG
>>
File: 00001-430385072.png (1.67 MB, 1024x1024)
1.67 MB
1.67 MB PNG
>>
File: 00010-2461924006.png (1.86 MB, 1024x1024)
1.86 MB
1.86 MB PNG
>>
File: temp_vcmlh_00046_.png (1.03 MB, 640x1536)
1.03 MB
1.03 MB PNG
>>
>>102436479
>>
File: 00014-1077822227.jpg (2.87 MB, 2912x1840)
2.87 MB
2.87 MB JPG
Do any other anons ever get sad when the seasons change
>>
>>102436535
euler_cfg_pp is the only sampler that matters
>>
>>102436610
she looks like you crossed nick cage and scarlet johansson
>>
>just train layers 7 and 20 single blocks at 128 dim, bro
>resulting lora is a blurry mess, likeness isn't captured well, and looks much uglier than a lora trained at 8 dim
>>
File: ComfyUI_33521_.png (939 KB, 736x1024)
939 KB
939 KB PNG
>>102432426
7 styles and 2 characters.
https://mega.nz/folder/mtknTSxB#cGzjJnEqhEXfb_ddb6yxNQ
https://mega.nz/folder/ekklHRgY#pH4JkFk-kFj4c09r-EtmpQ
>>
>>102435373
Nice
>>
>>102437186
Yeah you can't trust advice in anything ML dev related because nobody who is being loud knows what they're doing. The only way to make anything work is to experiment yourself.
>>
File: ComfyUI_33944_.png (1.67 MB, 1280x720)
1.67 MB
1.67 MB PNG
>>
File: ComfyUI_33905_.png (1.38 MB, 768x1024)
1.38 MB
1.38 MB PNG
>>
File: ComfyUI_33912_.png (1.5 MB, 768x1024)
1.5 MB
1.5 MB PNG
>>
File: temp_vcmlh_00017_.png (1.01 MB, 832x1216)
1.01 MB
1.01 MB PNG
>>
File: ComfyUI_00032_.png (1.46 MB, 1216x832)
1.46 MB
1.46 MB PNG
>>102436703
Robert Frost - Reluctance, or most of his stuff. You aren't alone anon.

>>102436996
this caused pixelization and garbage.
>>
File: ComfyUI_33965_.png (1.26 MB, 1280x720)
1.26 MB
1.26 MB PNG
>>
Is there any way to raise flux adherence to the prompt? I can't get more than about a dozen poses, and god help me if I want something I wouldn't find on a stock image site
>>
File: ComfyUI_Flux_14047.jpg (390 KB, 832x1216)
390 KB
390 KB JPG
>>
File: bComfyUI_114630_.jpg (1.61 MB, 3072x1536)
1.61 MB
1.61 MB JPG
>>
File: 1167752492635494298-SD.png (1.68 MB, 896x1152)
1.68 MB
1.68 MB PNG
best party evaaaaaaaaaaa
>>
Well it's working, but fuck it's slow
>>
>>102435692
whats with all the sudden love for sandniggers? i say fuck them both.
also, go back.
>>
https://github.com/aigc-apps/CogVideoX-Fun/tree/main/comfyui

Has anyone tried this yet? Looks sus af, don't wanna get some malware
>>
>>102440072
workflow link?
>>
>>102440170
https://comfyanonymous.github.io/ComfyUI_examples/flux/
I got it from here, I don't really know what I'm doing though
>>
>>102440180
>I don't really know what I'm doing though
welcome to the club
>>
>>102440072
just so you know, a 4090 with that exact workflow is only double the speed... which is to say the per iteration speed is still slow in comparison to SDXL and obviously SD1.5. In terms of money it's not too bad since an XTX is about half the cost of a 4090, so getting half the speed is okay.
>>
>>102440245
That's good to know, I was kind of curious where this sat in terms of performance.
>>
>>102440155
No idea, it's the 2B parameter model according to the screenshots, not the 5B one that is out on huggingface and modelscope, and img2vid is due in maybe a day, maybe two weeks.
I'm going to use the online ones (when huggingface is back up and/or I can be bothered to register at modelscope with its rune language)
https://huggingface.co/spaces/THUDM/CogVideoX-5B-Space (currently down)
https://modelscope.cn/studios/ZhipuAI/CogVideoX-5b-demo
(rune language)

Some other links:
https://github.com/kijai/ComfyUI-CogVideoXWrapper
https://modelscope.cn/models/ZhipuAI/CogVideoX-5b
>>
>>102440072
Are you using the zluda comfy?

Try this workflow

https://openart.ai/workflows/onion/flux-gguf-q8-12gb/X5HzyhrKjW2jqHVCTnvT

unet: flux1-dev-Q8_0.gguf
clip: t5-v1_1-xxl-encoder-f16.gguf
clip: ViT-L-14-BEST-smooth-GmP-TE-only-HF-format.safetensors


https://huggingface.co/city96/FLUX.1-dev-gguf
https://huggingface.co/city96/t5-v1_1-xxl-encoder-gguf
https://huggingface.co/zer0int/CLIP-GmP-ViT-L-14/tree/main
>>
>>102440505
>Are you using the zluda comfy?
Yes

I'll give it a try tomorrow it's 1AM and it basically took me 4-5 hours to get this working after faffing around with a1111
>>
>>102440245
How long does it take on a 4090 to generate 20 steps with flux dev?
>>
>>102440354
The 5b model is linked right on the GitHub page
>>
>>102440542
15 seconds for me using the same settings
>>
File: 1491272745.png (1.43 MB, 1024x1024)
1.43 MB
1.43 MB PNG
>>
File: file.png (105 KB, 2304x467)
105 KB
105 KB PNG
https://github.com/THUDM/CogVideo/tree/CogVideoX_dev
I2V CogVideoX will be released on huggingface tomorrow
>>
>>102440505
Alright I did it anyway and it is working but the clip gguf doesn't show up, the unet does but if I unselect it I can't choose it again
>>
I wonder why the chinese are so involved in machine learning. Is it just a byproduct of their investments into scientific research in general, or could it be a more directed effort?
>>
>>102441125
What about this? Seems like there's a branch of CogVideo that allows for every resolution or something, I wish it would also allow for more than 8fps lol
https://huggingface.co/spaces/alibaba-pai/CogVideoX-Fun-5b
https://github.com/aigc-apps/CogVideoX-Fun
>>
>>102441187
>I wonder why the chinese are so involved in machine learning.
AI is the most important tool of the 21st century, and it's a good way to get investor money right now, so of course people are going for it
>>
File: tmpm41y86p5.png (1.14 MB, 896x1152)
1.14 MB
1.14 MB PNG
>>
>it's just the mathematical average of all the images it was trained on!
>no it's not
Okay, explain.
>>
>>102441309
>inb4 it's the machine spirit, I ain't gotta explain shit
>>
>>102441187
This field only needs a couple dozen people to care, and most of the gimmicks of AI are great for social media engagement.
>>
>>102441190
cant install it
 error: subprocess-exited-with-error

× Getting requirements to build wheel did not run successfully.
exit code: 1
─> [20 lines of output]
[WARNING] Unable to import torch, pre-compiling ops will be disabled. Please visit https://pytorch.org/ to see how to properly install torch on your system.
[93m [WARNING] [0m unable to import torch, please install it if you want to pre-compile any deepspeed ops.
DS_BUILD_OPS=1
Traceback (most recent call last):
File "G:\ComfyUI\venv\lib\site-packages\pip\_vendor\pyproject_hooks\_in_process\_in_process.py", line 353, in <module>
main()
File "G:\ComfyUI\venv\lib\site-packages\pip\_vendor\pyproject_hooks\_in_process\_in_process.py", line 335, in main
json_out['return_val'] = hook(**hook_input['kwargs'])
File "G:\ComfyUI\venv\lib\site-packages\pip\_vendor\pyproject_hooks\_in_process\_in_process.py", line 118, in get_requires_for_build_wheel
return hook(config_settings)
File "<string>", line 155, in <module>
AssertionError: Unable to pre-compile ops without torch installed. Please install torch before attempting to pre-compile ops.
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× Getting requirements to build wheel did not run successfully.
exit code: 1
─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.


damn it chang
>>
>>102441581
>AssertionError: Unable to pre-compile ops without torch installed. Please install torch before attempting to pre-compile ops.
looks like you have to install torch before doing this process?
>>
>>102441642
too much effort
>>
>>102441689
it's working on this comfyUi node too, maybe it's easier
https://github.com/kijai/ComfyUI-CogVideoXWrapper
>>
>>102441703
>Note that while this one can do image2vid, this is NOT the official I2V model yet, though it should also be released very soon.
I wonder what the difference will be between this one and the tomorrow one
>>
>>102441154
Did you place the Q8 in "unet" folder and the clips in "clip" folder?

Make sure you have ae.safetensors in the "vae" folder too

https://huggingface.co/black-forest-labs/FLUX.1-dev/tree/main

You have to then install the missing nodes, restart comfyui and select the unet, clip and vae in each node
>>
>>102441759
And remove the dualcliploader and add the dualcliploader (gguf) if you can't see the gguf clip
>>
>>102441642
I already have torch installed, that chang repo just sucks, I'll go with kijai instead

"(venv) G:\ComfyUI\custom_nodes\CogVideoX-Fun>python -c "import torch; print(torch.__version__)"
2.4.1+cu124"
>>
>>102441836
>ill go with kijai better
yeah, you know you won't get weird ass errors when going for kijai's repos, this guy is really good
>>
>>102435692
based
>>
File: ComfyUI_33976_.png (456 KB, 1280x720)
456 KB
456 KB PNG
>>
File: ComfyUI_33907_.png (1.52 MB, 768x1024)
1.52 MB
1.52 MB PNG
>>
File: ComfyUI_33946_.png (1.65 MB, 1280x720)
1.65 MB
1.65 MB PNG
>>
File: ComfyUI_00735_.png (1.41 MB, 1024x1024)
1.41 MB
1.41 MB PNG
>>102441125
Nice.
>>
File: cogvideo.png (90 KB, 895x507)
90 KB
90 KB PNG
>>102441854
well here i go, vramlets are not going to like this img2video model lel
>>
>>102442031
how many GB of VRAM does it need? there's also the fp8 version if it's too big? which one are you using?
>>
File: cogvideo2.png (79 KB, 451x721)
79 KB
79 KB PNG
>>102442042
it's using 16.9 GB of 24 (I have a 4090)
and I'm using the CogVideoX-Fun-5b-InP model, I already downloaded it from the fun repo
>>
>>102442089
compared to Flux it's not asking for that much, and like I said, it can still be quantized to Q8, the quality will be similar and it will only ask for probably 9 GB of memory
>>
File: cogvideo3.png (135 KB, 1663x691)
135 KB
135 KB PNG
>>102442042
>>102442089
fuckkk, all that electricity used, down the drain
>>
File: file.webm (85 KB, 512x512)
85 KB
85 KB WEBM
>>102441190
>https://github.com/aigc-apps/CogVideoX-Fun
kek
>>
Anyone know what kind of "language" you're supposed to use with SD 1.5?
I've been using Pony for a month now and it's come out fairly good, but trying SD 1.5 and the ai is clearly confused as fuck by what I'm trying to tell it. It looks like an acid trip during a fever dream.
>>
File: 00012-3850524567.png (2.18 MB, 1120x1440)
2.18 MB
2.18 MB PNG
>>
File: CogVideotest.webm (407 KB, 640x384)
407 KB
407 KB WEBM
>>102442156
Ok, 512 base resolution with 24 steps, not bad
>>
File: 00031-1208650182.jpg (738 KB, 1600x960)
738 KB
738 KB JPG
>>102442268
base image
>>
File: file.png (226 KB, 1799x1343)
226 KB
226 KB PNG
>>102442031
>10 mn of wait to get a 5 sec + 8fps bad video on a 4090
it's over, the only way to make this viable (if we were being serious about reaching MiniMax level) would be to create a BitNet model, the inference would be way faster because your GPU only does additions instead of matrix multiplications
https://arxiv.org/abs/2402.17764
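the whole trick is that ternary {-1, 0, +1} weights turn a matvec into signed sums, roughly like this (toy emulation on ordinary tensors, not the paper's actual kernel):

import torch

def ternary_matvec(W, x):
    # W entries are in {-1, 0, +1}, so each output element is just a signed sum of
    # activations -- on real BitNet kernels this is adds/subs only, no multiplies
    pos = (W == 1).float()
    neg = (W == -1).float()
    return pos @ x - neg @ x

W = torch.randint(-1, 2, (4, 8)).float()   # pretend these are the quantized weights
x = torch.randn(8)
assert torch.allclose(ternary_matvec(W, x), W @ x, atol=1e-5)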
>>
>>102442268
did you use the original prompt for the first frame as you did in the img2vid process?
>>
When I use inpainting to fix eyes, do I redo the prompt for fixing eyes or do I just keep using the same prompt and make 20 copies until it makes good eyes?
>>
so what's the next step after becoming a prompt engineer
>>
>>102442402
Learn photoshop to touch up the outputs
>>
>>102442339
Nope, just used "fashion week footage, model walking down the ramp"

>>102442315
512 base with 24 steps takes 90 seconds, not bad at all

768 base makes me OOM when VAE decoding, I'm trying to use 768 base with a resized image, hopefully there will be a solution like tiled vae decoding or something
>>
>>102442402
Earning a degree in synthology
>>
>>102442375
I typically only put character, eye color, and a few style tags. You can also check out adetailer, it will automatically img2img the face with the generation which usually clears up any eye issues.
>>
>>102442415
Please try it again with the original prompt and the modifications you want to happen during the clip. Anecdotally I got better results when the huggingface version was active than just feeding it an image and letting her rip with hardly any guidance.
>>
>>102442428
I see. I'll check that out thanks.
>>
File: file.png (80 KB, 2099x222)
80 KB
80 KB PNG
>>102442415
>512 base with 24 steps takes 90 seconds, not bad at all
but when you look at that >>102442031
it's 5 min for 768 base (24 steps), how can it be this much slower when going from 512 to 768 lol
>>
File: cogvideo4.png (43 KB, 351x475)
43 KB
43 KB PNG
>>102442415
768 base resolution gives me OOM again, I'm trying fewer frames now
>>
>>102442461
maybe you should go for fp8 instead? can GGUF be a thing for video models?
>>
File: CogVideotest2.webm (445 KB, 912x624)
445 KB
445 KB WEBM
768 base with 25 steps worked

>>102442435
I'm testing quality right now

>>102442449
Because my original screenshot had 50 steps, that's why it took like 8 minutes; since then I'm just using 24 steps
>>
>>102442491
>Because my original screenshot had 50 steps, thats why it took like 8 minutes, since then im just using 24 steps
that's what I said, it took 1:32 + 8:04 = 9:36 for 50 steps -> ~4:30 for a 24-step inference at 768 base, and then at 512 base it's only 1:30? damn that's a big difference if you ask me
>>
>>102442491
meant 25 frames**
>>
>>102442491
>quality
Ok, but with the original prompt or without?
>>
File: cogvideo5.png (108 KB, 951x763)
108 KB
108 KB PNG
fuck this gay earth, now all it gives me is OOM errors, VRAM leak?
>>
>>102442627
kek there is a vae tiling option in the decode node
>>
In my adventures with 16-channel VAEs I may have got Pixart Sigma compatible with the Osiris one. Also I want to try playing with Dynamic Routing in the patches, maybe teach them to pay attention to useful patches.
>>
File: CogVideotest3.webm (389 KB, 976x584)
389 KB
389 KB WEBM
>>
>>102442683
Big if true
>>
>>102442702
honestly tolerable for fapping
>>
>>102442683
>the Osiris one
the what?
>>
File: file.png (592 KB, 1361x735)
592 KB
592 KB PNG
>>102442712
I mean it's noise and the learning rate was going down. So theoretically yes. Going to do a regular 600m VAE training test.

>>102442725
https://huggingface.co/ostris/vae-kl-f8-d16
>>
>>102442713
I feel that, back then porn videos were only 360p so I'm used to low quality coom videos, that's good enough for me kek
>>
File: file.png (59 KB, 1783x626)
59 KB
59 KB PNG
>>102442736
interesting, seems like the SD3 one is the best of them all, now I want to see if the Flux VAE beats that
>>
File: gen3.webm (1.87 MB, 1280x768)
1.87 MB
1.87 MB WEBM
>>102442702
runwayml gen3 for comparison
>>
>>102442766
# this is how I load it (config.vae_models_dir and device come from my own config)
import torch
from diffusers import AutoencoderKL

vae = AutoencoderKL.from_pretrained(config.vae_models_dir, allow_pickle=True).to(device).to(torch.float16)


anything that is compatible with this should "work"
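a quick roundtrip sanity check along those lines (my own sketch -- assumes the ostris repo linked above loads directly as a diffusers AutoencoderKL):

import numpy as np
import torch
from PIL import Image
from diffusers import AutoencoderKL

# load the 16-channel f8 VAE and push one image through encode/decode
vae = AutoencoderKL.from_pretrained("ostris/vae-kl-f8-d16").to("cuda", torch.float16)

img = Image.open("test.png").convert("RGB").resize((512, 512))
x = torch.from_numpy(np.array(img)).permute(2, 0, 1).float() / 127.5 - 1.0   # scale to [-1, 1]
x = x.unsqueeze(0).to("cuda", torch.float16)

with torch.no_grad():
    latents = vae.encode(x).latent_dist.sample()   # (1, 16, 64, 64) for a 16ch f8 VAE at 512px
    recon = vae.decode(latents).sample

print("mean abs reconstruction error:", (recon.clamp(-1, 1) - x).abs().mean().item())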
>>
>>102442778
I prefer the low res over smeared HD
>>
>>102442736
It appears that everyone forgot about trying to do this once Flux released but I hope you get further.
>>
>>102442713
yes, and the cool thing is that it allows img2video. i've used every img2video online service there is and they all suck, it's like playing a coin slot machine and there's also the fucking censorship, they suck
>>
>>102442778
do we know if MiniMax will do an I2V at some point in time? if yes it will crush the competition so hard...
https://reddit.com/r/StableDiffusion/comments/1fjvyc5/would_you_listen_to_this_band/
>>
>>102442801
It's obvious we're not fine tuning Flux any time soon and there's room for truly open < 2B models
>>
File: CogVideotest4.webm (542 KB, 1216x728)
542 KB
542 KB WEBM
960 base resolution worked, 1024 gave me OOM right away, now I'm gonna test more steps and schedulers for quality
>>
File: file.png (1.75 MB, 1024x1024)
1.75 MB
1.75 MB PNG
https://civitai.com/models/772407/flux-gas-room-lora?modelVersionId=863913
OY VEY
>>
>>102442766
flux is the best one
it was flux>aura>sd3>ostris when i was trying them
>>
>>102442977
also aura can sometimes beat flux, it's like flux >= aura
>>
>>102442977
Is it AutoencoderKL compatible? Do you have a link? I can swap out my osiris one.
>>
File: CogVideotest5.webm (513 KB, 1216x728)
513 KB
513 KB WEBM
>>102442848
50 steps, don't see much improvement for the longer genning time desu
>>
>>102443005
https://huggingface.co/black-forest-labs/FLUX.1-schnell/tree/main/vae for flux
https://huggingface.co/AuraDiffusion/16ch-vae for aura
and it should be its what i was using
>>
>>102442848
>>102443021
looks like some ps1 3d glitch it usually had https://youtu.be/x8TO-nrUtSI?t=74
>>
>>102440088
So true! We folks should donate another gorrilion $$$ to Israel - our greatest ally!!!
>>
>>102443054
just looks like a different scaling factor is all
>>
>>102443021
the official weights were released a few minutes ago
https://huggingface.co/THUDM/CogVideoX-5b-I2V/tree/main
https://github.com/THUDM/CogVideo
>>
>>102443109
>Apache-2.0 license
that's pretty based, even though the model kinda sucks, it's the best one we have so far and we can do whatever we want with them
>>
>>102443109
next step is I2V2I, once you can use key frames it's ogre
>>
>>102443270
this, I absolutely loved the I2V2I shit on Luma, the transitions are so smooth
https://www.youtube.com/watch?v=KshorouF0s4
>>
>>102443109
cant wait for the porn finetunes
>>
>>102443109
are these the same weights as those ones, or? >>102441190
>>
File: rife.webm (587 KB, 640x384)
587 KB
587 KB WEBM
>>102442268
looks pretty rough alright
>>
File: CogVideotest7.webm (606 KB, 840x1024)
606 KB
606 KB WEBM
>>
>>102443656
that's really rough, either 5b isn't enough to get something interesting, or they still haven't mastered this technology (which is completely fair)
>>
>>102443681
chud wants minmax/gen3 level of video quality in his shitty local computer lel
>>
File: qwen2-72b-vl.png (275 KB, 1484x1921)
275 KB
275 KB PNG
>>102434568
Qwen2-VL-72B released, it's a vision model.
https://huggingface.co/Qwen/Qwen2-VL-72B-Instruct
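if you want to script it for captioning, something like this should work through transformers (rough sketch -- assumes transformers >= 4.45 with the Qwen2-VL classes, uses the 7B checkpoint as a stand-in since 72B won't fit on one card, and the official example handles image preprocessing slightly differently):

import torch
from PIL import Image
from transformers import AutoProcessor, Qwen2VLForConditionalGeneration

model_id = "Qwen/Qwen2-VL-7B-Instruct"   # 72B needs multi-GPU; 7B as a stand-in
model = Qwen2VLForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)
processor = AutoProcessor.from_pretrained(model_id)

messages = [{
    "role": "user",
    "content": [
        {"type": "image"},
        {"type": "text", "text": "Describe this image in detail for a training caption."},
    ],
}]
prompt = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
image = Image.open("sample.png")

inputs = processor(text=[prompt], images=[image], return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=256)
caption = processor.batch_decode(
    out[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True
)[0]
print(caption)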
>>
>>102443873
that's possible if we use GGUF quants, before Flux we thought it was impossible to get something at the same level as dalle or MJ locally
>>
>>102443933
is there a demo we can try somewhere? because I can't run a fucking 72b model, they're tripping
>>
>>102443873
trust the plan anon
https://blackforestlabs.ai/up-next/
>>
>>102443958
here you go, all you have to do is search HF's spaces section https://huggingface.co/spaces?sort=trending&search=qwen2
>>
File: file.png (802 KB, 800x600)
802 KB
802 KB PNG
>>102444026
https://huggingface.co/spaces/Qwen/Qwen2-VL
>The image depicts an anime-style character sitting on a wooden surface, likely a desk or a table, near a window. The character has short, brown hair and is wearing a white shirt with rolled-up sleeves, green pants, and a black tie. The character is barefoot and appears to be in a relaxed or contemplative pose, with one leg bent and the other extended.
>The character is holding a fork in their right hand, which is positioned near their mouth, suggesting they are about to take a bite. In front of the character, on the wooden surface, there is a small tray with a piece of food on it, possibly a cake or a pastry. The background shows a window with a view of greenery outside, indicating that it might be daytime. The overall atmosphere of the image is calm and serene, with soft lighting and a warm color palette.

>The character is holding a fork in their right hand
it's fucking over, GPT4V remains the king, only that model understands she's holding her fork with her left foot
>>
>>102444117
can't you just manually change the caption of such an autistic image?
>>
>>102444239
>every image that isn't about a 1girl standing in front of the picture is autistic
Sorry anon, but there's a reason we want good vision models: not all the pictures we caption are "1girl, standing". we're not simpletons like you, sometimes we have complex pictures that should be captioned well
>>
>>102444239
You're everything that's wrong with this field. if image models can't do complex poses and scenes, that's because they were trained on pictures that were captioned wrongly, there's no excuse to be made, it should be accurate in every situation
>>
>>102444263
the rest of the caption is really good, the shitty nitpick you're pointing out is just an autistic sperg, the point of captioning is for training/gen purposes
>>
>>102444332
>the point of captioning is for training/gen purposes
yes? and if it makes mistakes then you'll get bad training, are you retarded or something? it should do complex poses and not just your average coomerbrain 1girl standing, you fucking moron
>>
>>102444313
not enough people eating food with their feet in their dataset i guess
>>
File: file.png (542 KB, 1425x1228)
542 KB
542 KB PNG
>>102444356
yet GPT4V does it good, you reek of mediocrity, do you really need OpenAI to tell you that everything is possible?
>>
>>102444354
those kinds of mistakes can be easily fixed, what kind of "complex" pose is that anyway, it's just an autistic image
>>
>>102444413
>those kind of mistakes can be easily fixed
yeah sure, when you pretrain a model it has hundreds of millions of pictures, what's the point of fixing them all manually if you can instead have a good model that is reliable enough? do you get the point of having those models in the first place? They exist so that you don't have to do the work manually, goddamn you're fucking retarded anon
>>
>>102444434
>pretrain a model, it has hundreds of millions of pictures

lmao people here can barely train a shitty lora and you're sperging out about >muh millions of pictures
really anon, are you really training a model with "millions of pictures"
>>
>>102443109
plebbit says they are not the official weights for img2vid as they said they were releasing them tomorrow, ofc it is already tomorrow in the Middle Kingdom, so maybe they are.
Anyone support this theory about them being the official weights?
>>
So the difference between this and /sdg/ is /sdg/ doesn't care about privacy and /ldg/ is strictly open source and not inherently network connected? New to this.
>>
>>102444469
why do you believe Flux is the last base model we'll ever get? The pretraining is the most important part of making a good model, the finetuning is just the cherry on top. we need good vision models so that the people who will work on a new base model get quality training data; to get quality training you need good captions, and to get good captions you need a good vision model. I hope that helped
>>
>>102444470
it's really confusing to me, like there's 1 day of difference between those weights and the official weights, I'm sure they are the same, we'll get the truth tomorrow I guess
>>
>>102444470
they are already released anon, it's china, their tomorrow is today for us

https://huggingface.co/THUDM/CogVideoX-5b-I2V
>>
>>102444604
yeah but there was this that released the models before
https://github.com/aigc-apps/CogVideoX-Fun?tab=readme-ov-file#model-zoo
https://pai-aigc-photog.oss-cn-hangzhou.aliyuncs.com/cogvideox_fun/Diffusion_Transformer/CogVideoX-Fun-5b-InP.tar.gz
I guess they're the same weights as the official ones?
>>
File: possed.png (68 KB, 1297x419)
68 KB
68 KB PNG
>>102444636
its pozzed
>>
>>102444808
who are those aigc-apps guys anyway? they finetuned the "original" weights to make them work at multiple resolutions or something? Do they work with THUDM? They appeared out of nowhere and improved this shit, I won't complain though, that's cool it can do multi resolution kek
>>
File: CogVideotest8.webm (629 KB, 840x1024)
629 KB
629 KB WEBM
>>102444808
CogVideoX-Fun can do anything

the official one can't do shit
>>
>>102444857
>CogVideoX-Fun can do anything
>the official one can't do shit
so far, the only thing "Fun" can do that the official can't is multiresolution, is there some other stuff that it can do that the official can't?
>>
File: CogVideoX-I2V_00002.webm (1.55 MB, 720x480)
1.55 MB
1.55 MB WEBM
>>102444840
yeah, with the "fun" one I can generate at any resolution. the official THUDM one is resolution locked and giving me shitty results, but I think it's the wrapper's fault rather than the original repo
>>
Anyone used fast training on civitai?
>>
>>102444905
>i think its the wrapper fault
you should update the wrapper, he seems to have made it work with the original weights
https://github.com/kijai/ComfyUI-CogVideoXWrapper/commit/6729faa4717c5049aba9031ecb67ff245e63121f
>>
>>102444857
What if it comes out that this AI was just a bunch of cam girls doing shit that looks like AI at low res. Lol
>>
>>102444636
>I guess they're the same weights as the official ones?
if you know how to read chink there's the technical report of "Fun" on how they managed to make it work on different resolutions
https://blog.csdn.net/weixin_44791964/article/details/142205114
>>
>>102444931
I updated it but it needs time, the fun one seems better already, it can even run on fp8

THUDM got cucked on their release day lol
>>
>>102445240
>THUDM got cucked on their release day lol
that's so weird, I'm sure they had no idea some other company had their weights and improved on them, or if they knew, maybe they didn't expect them to release the weights. there's no logical explanation for why they would officially release the "bad weights" if they knew something better was being cooked
>>
https://github.com/aigc-apps/CogVideoX-Fun?tab=readme-ov-file#model-zoo
there's no fp8 quant of that one right?
>>
File: file.png (136 KB, 632x500)
136 KB
136 KB PNG
>>102435355
>Literally two fucking weight matrices. And it mogs Pixtral. Just embarrassing.
B-but... le heckerino youtuber said it was good!!
https://www.youtube.com/watch?v=7aGTKJJMb5w
>>
>>102441309
>human art is just the mathematical average of all the art the human has experienced before!
>no it's not
Okay, explain.

>inb4 "soul"

Both statements are dramatic oversimplifications of the process. Human brains are very complex and difficult to explain. Neural nets as well, to a slightly lesser extent right now.
>>
File: file.png (33 KB, 381x204)
33 KB
33 KB PNG
>>102445355
you download the model and you can choose the precision on the node
https://github.com/kijai/ComfyUI-CogVideoXWrapper
>>
>>102443933
Any other 96GB VRAMchads get this to run locally? I tried the 8-bit GPTQ version; when loading the model it complains that all the bias weights were not initialized. Then running inference immediately fails because it complains some value was NaN (caused by the aforementioned weights not loading? idk). Currently downloading the fp16 version, will try that with bitsandbytes quantization.
>>
>>102445585
Neural nets are not difficult to explain. The math is simple. AI is also deterministic and can only produce autocompletions of patterns it has encountered in the training. The human brain, on the other hand, can actually conjure up new concepts completely foreign to the training dataset. Someone figured out the wheel. Someone figured out electromagnets. Don't let your atheist m'lady hat stop you from critically thinking especially when you're attempting to humanize these dumb autocomplete algorithms.
>>
File: 0.jpg (158 KB, 1024x1024)
158 KB
158 KB JPG
>>
>>102445654
Neurons are not difficult to explain. The math is simple. It's all just voltage. The brain works deterministically based on learned patterns and initial input conditions.
>>
>>102445736
>Neurons are not difficult to explain. The math is simple.
that's not the point, we have no idea what the role of each weight is, so we can't control that shit. like imagine you want to remove censorship from a model: if you knew how it worked you would deactivate the bad weights, but we can't, because we have no idea how it works. it's just a black box, a black box that does simple calculations to transform an input into an output
>>
>>102445830
Explainable AI is a thing anon, you're a bit behind the times.
>>
>>102445862
never said the field doesn't exist, but they haven't found much from it. like I said, if for example I asked you to identify the weights that hold the concept of Miku in Flux and to remove that concept, you wouldn't be able to do it, because we don't really know how to do something like that
>>
>>102445830
You are simultaneously describing the human brain and a neural network.
>>
>>102441187
Technology for automation will always be useful.
>>
>>102445736
actually retard, they are very difficult to explain given they run on the piece of cheese you ate for breakfast this morning, do yourself a favor and grow up and mature, being a 14 year old atheist on the internet worshipping an autocomplete algorithm is pathetic and you'll always be a retard when you don't see AI for what it is
it's not a magic box
>>
>>102445885
Find the part in the human brain describing "Hatsune Miku" and completely remove it without damaging anything else

A really good brain surgeon might be able to find part of it during a very long, difficult, and risky surgery, but completely removing all memory and concept of Hatsune Miku without leaving a trace, and without removing anything else? Good luck
>>
>>102445885
that's irrelevant when we literally understand how to put together AI neural networks: they're just stacked-up matrices trained with gradient descent, flowing information forward and trying to autocomplete a given pattern from the data samples they were trained on, based on a set of known inputs and outputs
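stripped to its bones, that's all it is (toy two-matrix net fit with gradient descent, nothing else going on):

import torch

# two weight matrices trained to map known inputs to known outputs -- that's the whole story
torch.manual_seed(0)
X = torch.randn(256, 8)                     # known inputs
Y = torch.sin(X.sum(dim=1, keepdim=True))   # known outputs (the pattern to autocomplete)

W1 = torch.randn(8, 32, requires_grad=True)
W2 = torch.randn(32, 1, requires_grad=True)

for step in range(2000):
    pred = torch.tanh(X @ W1) @ W2          # information flows forward through the matrices
    loss = ((pred - Y) ** 2).mean()         # how far off the autocompletion is
    loss.backward()                         # gradient of the loss w.r.t. every weight
    with torch.no_grad():                   # gradient descent: nudge each weight downhill
        W1 -= 0.01 * W1.grad
        W2 -= 0.01 * W2.grad
        W1.grad.zero_()
        W2.grad.zero_()

print("final loss:", loss.item())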
your brain doesn't even work close to this way
>>
>>102445995
>your brain doesn't even work close to this way
never said it works the same way, my point is that both of them are black boxes: we cannot modify shit manually. I wish we could for neural networks, imagine the possibilities, censorship removal, concept removal...
>>
File: Diddy.webm (708 KB, 512x384)
708 KB
708 KB WEBM
in 1992, Rare paid $275,000 for each SGI workstation to make the prerendered graphics in Donkey Kong Country
nowadays that tech is trivial
some day we're going to be able to immediately make feature-length movies with a click of a button
>>
>>102445995
The electricity screams into the void, proudly claiming itself to be more than electricity
>>
>>102446014
It's not a black box, you are just exaggerating because you have a biased worldview, so you pretend an autocomplete algorithm is the same as your brain, which processes a trillion times more context and memory in a trillionth of the time using a trillionth of the power, while also being capable of learning on the fly and INVENTING things. We literally understand how AIs work. We are not even one iota close to understanding how the human brain works. It's always funny too, because you people always end up saying soul (trying to be ironic), but it's just an admission.
>>
>>102446095
*$275K is already accounting for 2024 inflation
>>
>>102446095
back then, GPUs weren't even close to the architecture limit, now we got 4nm shit and we can't go much lower
>>
>>102446113
>We literally understand how AIs work.
we don't know shit, and the fact we can't do any manual manipulation on the network proves my point
>>
>>102446132
Manually manipulate a jet engine with your bare hands
>>
My GPU just blew up. I am in Lebanon.
>>
We have to come to terms with the fact that we simply lack the required efficiency to push parameter counts high enough. Some neurons in the cortex require around 1k artificial neurons to emulate, even at billions of parameters we're still orders of magnitude off.
>>
we know how AI do but do we know why AI do?
>>
>>102446343
We're far away from even just having models train while also being able to evaluate, the whole process is pure brute force.
>>
>>102446305
send more lebanese women to texas bro
>>
>>102446381
Even harder, we need a solid reward function to even know if training is progressing correctly, and that model would itself somehow need to be tuned. Kind of a recursive issue.
>>
>>102445995
>Stacked up matrices using gradient descent
Neurons
>they flow information from 0-1
Neurotransmitters
>trying to autocomplete a given pattern
Electricity flows through neurons in ways that were advantageous for natural selection, flows through neural nets in ways that were advantageous to being selected as favorable by humans. Neither neurons nor neural nets are "trying" to do anything, they're just naturally fulfilling the patterns they evolved on
>given data samples it was trained on based on a set of known inputs and outputs
Human evolutionary behaviors and lived experiences
>>
>>102446403
AGI is just contemporary cold fusion.
>>
File: file.png (225 KB, 570x727)
225 KB
225 KB PNG
you vill learn the flux latents
>>
>>102446343
Maybe analog computers will be the next big AI breakthrough
>>
File: 00076-453299283.png (2.33 MB, 1120x1440)
2.33 MB
2.33 MB PNG
>>
File: file.png (1.68 MB, 3400x1579)
1.68 MB
1.68 MB PNG
>rescaled width 720
>rescaled height: 480
>base_resolution: 512
>Final video: width = 608 + height = 416
what the fuck?
>>
>>102446555
sd3 and resize your image manually for your first run.
>>
>>102446499
Maybe, they definitely introduce some more stochastic behavior into the system, probably important to reach a certain level of dynamic behavior.
>>
>>102446628
I don't think I understand, can you elaborate anon
>>
Another bread, has arrived...
>>102446651
>>102446651
>>102446651
>>
>>102446660
resize your image manually for your first run, and sd3.
>>
File: 1726690182703685_edit.png (1.06 MB, 3400x1579)
1.06 MB
1.06 MB PNG
>>102446660
I am not sure about the SD3 thing, but it doesn't match the example on the website.

The image resize thing just adds complexity while you are figuring things out.
>>
File: file.png (897 KB, 2384x1549)
897 KB
897 KB PNG
>>102446829
>I am not sure about the SD3 thing, but it doesn't match the example on the website
that's the example from the repo though
https://github.com/kijai/ComfyUI-CogVideoXWrapper
>>
>>102446861
yup and they are using fp8. It is probably fine, but you are getting errors you don't like.
>>
>>102435695
If it's that hand fixing method that uses MeshGraphormer, the way it's meant to work is hand detection (bounding box) -> fitting a hand mesh to the deformed hand as well as possible with MeshGraphormer -> rendering the depth map of the fitted hand mesh, which is thresholded to also provide a mask -> inpainting the region defined by the dilated mask, using a depth ControlNet with the hand depth map as an input.
For anime, based on what I've seen using the hand refinement stuff in the Acly Krita plugin, either the bounding box hand detector or the MeshGraphormer fails to detect/fit the hand, because neither was trained on anime I think.
While training a bounding box detector for anime hands would be doable and might already exist, a specialized MeshGraphormer would be harder. Maybe using a synthetic dataset with toon-shaded anime models, perhaps augmented with light img2img.
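in rough pseudocode the whole flow is something like this (every helper name here is a hypothetical placeholder for the corresponding node, not a real Impact/ComfyUI API):

def fix_hands(image, inpaint_pipe, depth_controlnet):
    # hypothetical sketch of the MeshGraphormer hand-fix flow described above;
    # detect_hand_boxes / fit_hand_mesh / render_depth / dilate are placeholders, not real functions
    for box in detect_hand_boxes(image):              # bounding-box hand detector
        crop = image.crop(box)
        mesh = fit_hand_mesh(crop)                    # MeshGraphormer fits a hand mesh to the deformed hand
        depth = render_depth(mesh)                    # depth map of the fitted mesh
        mask = dilate(depth > 0)                      # thresholded depth -> dilated inpaint mask
        fixed = inpaint_pipe(image=crop, mask=mask,
                             controlnet=depth_controlnet,
                             control_image=depth)     # inpaint the masked region, guided by the hand depth
        image.paste(fixed, box)
    return image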
>>
>>102441782
Oh yup that was where I was stuck, it was working with the default one though



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.