/g/ - Technology


Thread archived.
You cannot reply anymore.




File: tmp.jpg (1.37 MB, 3264x3264)
Discussion of free and open source text-to-image models

Previous /ldg/ bread: >>102332654

>Beginner UI
EasyDiffusion: https://easydiffusion.github.io
Fooocus: https://github.com/lllyasviel/fooocus
Metastable: https://metastable.studio

>Advanced UI
Automatic1111: https://github.com/automatic1111/stable-diffusion-webui
ComfyUI: https://github.com/comfyanonymous/ComfyUI
Forge: https://github.com/lllyasviel/stable-diffusion-webui-forge
InvokeAI: https://github.com/invoke-ai/InvokeAI
SD.Next: https://github.com/vladmandic/automatic
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI

>Use a VAE if your images look washed out
https://rentry.org/sdvae

>Model Ranking
https://imgsys.org/rankings

>Models, LoRAs & training
https://civitai.com
https://huggingface.co
https://aitracker.art
https://github.com/Nerogar/OneTrainer
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts

>Flux
https://huggingface.co/spaces/black-forest-labs/FLUX.1-schnell
https://comfyanonymous.github.io/ComfyUI_examples/flux

>Pixart Sigma & Hunyuan DIT
https://huggingface.co/spaces/PixArt-alpha/PixArt-Sigma
https://huggingface.co/spaces/Tencent-Hunyuan/HunyuanDiT
https://huggingface.co/comfyanonymous/hunyuan_dit_comfyui
Nodes: https://github.com/city96/ComfyUI_ExtraModels

>Index of guides and other tools
https://rentry.org/sdg-link
https://rentry.org/rentrysd

>Try online without registration
txt2img: https://www.mage.space
img2img: https://huggingface.co/spaces/huggingface/diffuse-the-rest
sd3: https://huggingface.co/spaces/stabilityai/stable-diffusion-3-medium

>Maintain thread quality
https://rentry.org/debo

>Related boards
>>>/h/hdg
>>>/e/edg
>>>/c/kdg
>>>/d/ddg
>>>/b/degen
>>>/vt/vtai
>>>/aco/sdg
>>>/u/udg
>>>/tg/slop
>>>/trash/sdg
>>>/pol/uncensored+ai
>>
File: MiniMax.webm (380 KB, 1280x720)
BFL managed to make dalle at home, will they pull off the miracle again by making a MiniMax at home?
https://blackforestlabs.ai/up-next/
>>
>>102351926
If twitter announces a video model then yes, someone needs to fund it
>>
>>102351926
Best not to think about it. They've said nothing about it since it was announced.
Personally I'd torture someone who promised me something "Soon". Give me a fucking date, you carrot dangling fucks.
I need to plan my schedule, projects and so on, not have shit just dropped on me so I scramble to make time, change hardware, write things and so on.
It's really a bad sign when a company says "soon": either they have no idea if the project will even work, or they're having financial problems which may kill the project completely. It shows they don't have a complete grasp of what they're doing and can't make projections about their own product.
We saw this with SD3 <shudder>
Just name a month, Black Forest Labs. Before Black Friday (meme prices)? So anons can decide whether to buy shiny new cards.

Throw us a bone, and not the ones betwixt your legs.
>>
>>102351926
Are there any video models that I can animate with controlnets? I feel like I'm stuck with SD1.5 and Animatediff. I was doing some experiments with ToonCrafter and PonyXL stills and saw that there were some efforts with SVD and controlnets. I'm interested in cartoony/anime stuff.
>>
>>102352073
Training a model is not straightforward. It's like giving a date for when your baby will say its first word.
>>
File: file.png (187 KB, 800x400)
>>102352073
>Personally I'd torture someone who promised me something "Soon", give me a fucking date you carrot dangling fucks.
desu when you're making models you don't really know when it's gonna end, they're experimenting a lot and I prefer them to take their time and release a good product, rather than going the SAI path and releasing a failed experiment as a bone for us
>>
Forge question about Clip Skip on pony
I switched to new Forge yesterday (the version that has flux and shit) and noticed my gens aren't the same on the same settings between old and new Forge. Blamed it on the sampler+scheduler split and moved on though. Now I also noticed Clip Skip is not on 2. Tried enabling it via sdxl_clip_l_skip and CLIP_stop_at_last_layers (together and separately), but it made no difference at all. On old Forge, changing Clip Skip did make a difference however. I even tried making an xyz plot with Clip Skip 1 and 2; both images were the same, with Clip Skip being 2 in both PNG Infos. What could be the issue here?

tl;dr how do I enable Clip Skip 2 on Forge with pony models
>>
>>102352145
>Are there any video models that I can animate with controlnets?
there are some video-to-video features that exist, like you make a basic blender animation and let the video model do the rest
>>
Blessed thread of frenship
>>
>>102352073
I don't think they have much of a money problem now that they sold their licence to twitter, you can't find a better place to release your model as an API
>>
>>102352163
Clip skip is bizarre on Pony. On Comfy setting it to 1 gave me noise (same for merges/finetunes of it), but I was told that on A1111 it worked and gave a slightly different output (not better, just different).
>>
File: 1715899465664143.png (2.96 MB, 1632x1632)
>>
Booba going crazy
>>
File: 1725787761485762.jpg (1.39 MB, 1632x1632)
>>
>>102352146
>>102352149
I understand it's an iterative process to get a well performing model; they could go on for a month, 6 months, a year re-iterating the model until they were happy with its strengths in the areas they want.
What I'm objecting to is the use of "soon", because there's no timescale for "soon" and I'm too jaded to be strung along when a better explanation could be given: "here's our timeframe: 3 months model refinement, 2 months tweaking the %'s, 1 month user testing, 1 month model refinement from feedback" etc.
It's just nebulous bullshit at this point. I don't want a shit model either, and I don't want it now.
I want it when it's done which my sources tell me 100% will be:
>In 7 bananas
See how dumb it actually sounds?
>>
>>102352228
on comfy for pony you have to set clipskip to a negative number, the setting is -2.
>>
>>102352286
Maybe you should do something instead of acting like you're waiting for a deposit on your EBT card
>>
>>102351926
I'm more looking forward to what those chinese university students are up to. maybe they cracked the 16ch VAE.
>>
File: noise.png (1.15 MB, 1024x1024)
why does pony do this? it doesn't make sense to me. I'll have a prompt and it will have "blonde hair" in it. then when I change it to "long blonde hair" it shits itself, and I'll have to trial-and-error removing words until it works again.
>>
File: 1724343864122503.jpg (1.51 MB, 1632x1632)
>>
>>102352464
there is something very wrong with whatever you're using
>>
>>102352377
You make a great point.
EBT card payments are more reliable than Black Forest Labs saying "soon"
Lol.
>>
>>102352499
I don't care. I hope they release it and manage to single you out and ban you from using it.
>>
File: 00099-3082093201.png (1010 KB, 896x1152)
>>
>>102352464
This looks like a vae or clip skip -1 issue
>>
BFL employee having a melty ITT
>>
File: 5645ggffg5x.png (2.55 MB, 1487x888)
>>
>>102352606
gib me dat *smacks lips* gib me dat model now
>>
File: 1721875783572207.png (1.53 MB, 1024x1024)
>>102352606
I literally don't read any of these posts lmao who cares about some BFL retard
>>
>>102352702
why aren't you posting in sdg, sounds more like your echochamber
>>
>>102352713
who are you? oh yeah that's right it doesn't matter fuck off
>>
>>102352727
it kind of matters because you're a brain drooling retard
>>
File: 1709429314860412.png (1.7 MB, 1024x1024)
>>102352735
stop typing
>>
>>102352635
unironically smacking my lips yes
>>
>>102352479
>>102352577
I'm using automatic1111, the vae and clip skip are correct. I'm not sure, maybe something to do with how a1111 is sending the prompt. sometimes it will be totally fucked up like the image I shared, other times it's just really warped. but I have found that just adding a bit of punctuation can fix it. so if adding thick thighs fucks it up, adding "thick thighs" instead makes it work as expected
>>
File: 00014-802331460.png (1.22 MB, 1024x1280)
>>
>>102352606
Investors have shills all over /g/, threads are scraped, keywords are flagged and they jump into threads.
>>
File: 00001-2807647590.png (3.12 MB, 1280x1920)
>>
File: 00003-1596829530.png (2.93 MB, 1280x1920)
>>
when will mods make a /1girl/ containment board for low effort slop?
>>
you just know who's trolling kek
>>
>>102352919
there are fewer than 10 dedicated AI slop posters
it would be a very dead board
>>
>>102352939
you clearly haven't seen the red boards
>>
>>102352975
And you clearly demonstrate there's an overlap of users.
>>
File: 00008-1468888841.png (2.94 MB, 1280x1920)
>>
>>102352565
keke
>>
I like 1girls desu
>>
>>102351868
nice imgs
>>
>>102353131
>using an LLM to prompt is literal Indian retard tier shit
It's good to get a sense of how most data sets are captioned and in turn how you /should/ prompt
>>
>>102353131
>bloo bloo bloo why do I have to write long winded captions
>no I won't use a tool that does this for me, I just want to complain and get random images from single words
>>
>>102352565
>Chudalus 4chanus
>>
>>102352565
>>102352794
Great stuff man. Did you train just with the basic 2d meme images?
>>
>>102352565
Attempted this today >>102350076
Flux does not understand the word "fat", so I had to use "obese" instead.
>>
>>102353368
Mostly, but there were a few pictures of Patrick Crusius.
>>
File: file.png (342 KB, 1200x757)
https://gist.github.com/sayakpaul/a9266fe2d0d510ec44a9cdc385b3dd74
>This code snippet shows how to split the Flux transformer across two 16GB GPUs and run inference with the full pipeline.
that's cool
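if you don't want to dig through the gist, a rough sketch of the same idea using diffusers' device mapping; the kwargs here are assumptions based on the diffusers distributed-inference docs, not necessarily exactly what the gist does:

import torch
from diffusers import FluxPipeline

# let accelerate shard the pipeline across two GPUs instead of
# hand-placing each module
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    torch_dtype=torch.bfloat16,
    device_map="balanced",              # spread components over cuda:0 / cuda:1
    max_memory={0: "16GB", 1: "16GB"},  # cap per-GPU usage
)
image = pipe("a cat in space", num_inference_steps=28).images[0]
image.save("cat.png")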
>>
>>102353476
Big if true
>>
>>102353476
>across two 16GB GPUs
I guess it would work for a 24gb + 12gb as well?
>>
>>102353476
Dumb question: why does it need two text encoders instead of one?

>>102353611
Imagine being able to use a spare 3060 as some dedicated ai card
>>
Why do most loras made and posted here work fine while the average civitai lora gives fucked up hands or eyes?
>>
>>102353626
T5 translates your text from boomer prompting into tags that clip uses
>>
>>102353629
Civitai is known for its ease of use, not necessarily its quality
Tangentially, anon is pretty good at it
>>
>>102353626
>Dumb question: why does it need two text encoders instead one?
it's not a dumb question at all. I think they went for two text encoders to get the best of both worlds: clip_l is excellent at tags whereas t5 is excellent at natural language. imo that was a bad idea because clip_l can only eat 77 tokens, so if you go for long prompts it's basically useless
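you can check the cap straight from the tokenizer; a quick sanity check using openai/clip-vit-large-patch14 (the clip_l in question):

from transformers import CLIPTokenizer

tok = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
print(tok.model_max_length)  # 77
ids = tok("a very long prompt " * 50, truncation=True).input_ids
print(len(ids))  # capped at 77, everything past that is silently dropped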
>>
>>102353641
I have to tune the random civitai ones way more often, while the ones from here worked fine with prompts used for other loras, so maybe everyone here read some guide on training loras properly?
>>
Was the DC sampler implemented anywhere?
>>
>>102353649
Pixart proved you don't need both, it seems like a dumb holdover idea from SDXL (same team).
>>
>>102353476
would this work with other models?
>>
>>102353671
It was made like that so average users on twitter can use it without tags
>>
>>102353476
that's big, there should be a ComfyUI node for that because Comfy can't be bothered to implement important stuff in his own repo
>>
>>102353671
>it seems like a dumb holdover idea from SDXL (same team).
Highly likely
>>
>>102353691
T5 doesn't give a shit how you prompt it and in practice no one has ever liked the dual text encoder. No one wants to type different shit into two boxes. And in practice it's not required and they probably did it because they're lazy cunts who didn't want to train on a diverse set of prompt formats.
>>
>>102353714
>No one wants to type different shit into two boxes
then dont
>>
File: 00003-3273304712.jpg (818 KB, 1488x1776)
>>
>>102353737
I won't and like I said, it's a dumb architecture and they're dumb cunts. And in the end it didn't even matter, because the model is stylistically inferior and the CLIP does fucking nothing.
>>
File: ComfyUI_01413_.png (2.8 MB, 1920x1080)
>>
File: 00016-557003662.png (2.91 MB, 1280x1920)
>>
>>102353755
>use style lora
>people gen as asians
>>
File: dino_00023_.png (944 KB, 1024x1024)
4090 coming in tomorrow (currently have a 3060 Ti)
can't wait can't wait can't wait
>>102353755
looks like something out of Darius
>>
>>102353665
>so maybe everyone read here some guide to train loras properly?
Unfortunately I don't think anyone created a "definitive" guide. There's been lots of discussion in previous threads (and some small guides posted) so I presume anon gleaned something from that.
>>
>>102353753
>and the CLIP does fucking nothing.
the worst part is that clip does something: when you change the clip finetune (there are several of them) you can get really different pictures. I was surprised how much it actually affects the output
>>
>>102352565
nice
>>
>>102353784
cute
>>
File: ComfyUI_01421_.png (3.74 MB, 1920x1088)
>>
File: 00014-2280578470.png (869 KB, 1152x896)
I downloaded a new flux model and get an error saying "mat1 and mat2 shapes cannot be multiplied" when trying to use it, what's the deal with that?
>>
>>102354065
>I downloaded a new flux model
link?
>>
>>102354065
it's going to be something like you're using an SDXL vae or lora somewhere when your entire process should contain flux-based models, loras etc.
>>
>>102354065
did anons answer yesterday not work?
>>
File: ComfyUI_temp_znyvv_00002_.png (2.63 MB, 1920x1152)
>>
File: 00010-436492188.png (687 KB, 1152x896)
>>102354074
I meant finetune, it's AcornIsSpinning Flux on civitai.
>>102354086
I have the clip file, T5 encoder and Flux vae, I used jibMixFlux before which worked without problems.
>>
>>102354124
missed it, gonna check the archive
>>
>>
File: ifx464.png (987 KB, 1024x1024)
>>
>>102353476
great, an overcooked cat
>>
>>102353476
so that's it, we can now split the model into different gpus? that's a huge deal, especially if you want to split a tiny % onto your cpu to run bigger models but with tolerable speed
>>
>>102354065
>>102354173
i enjoy this cat in space
>>
File: 00027-2388925365.png (2.66 MB, 1280x1920)
>>
File: 1705076095107533.png (263 KB, 512x512)
>>
File: 1714503552279243.png (511 KB, 512x512)
>>
I feel so irrelevant ever since XL.
>>
>>102353476
What? This isn't new. I'm running T5 + CLIP + VAE on either GPU of my choosing, and then I run the transformer model on the other.
>>
>>102352464
No norm needs to be enabled in settings if you're not using comfy
>>
>>102354635
the new thing about that script is that you can now split Flux into multiple GPUs, for example if bf16 is too big for your first gpu, you split it and put the smaller parts on gpu 1 and 2
>>
>>102354660
splendid
>>
>>102354625
get with the times, old man
>>
it's finally happened..............
I'm in love with one of my gens
>>
>>102354836
Well, get on with it, don't be such a cockblock tease and allow us to shame your taste in women.
>>
File: goblin gf.jpg (152 KB, 768x1344)
>>102354860
>>
>>102354888
>AI face
>>
>>102354888
I'm a humble gobbo enjoyer myself
>>
>>102354906
what does that mean
>>
File: 1714457595655868.png (312 KB, 512x512)
>>
>>102354939
there are some facial features that appear commonly in ai generated women, making many gens basically sameface

you can especially see it with sd 1.5
>>
>>102354939
Basically all the slop models are overtrained on specific facial features so they all have the same AI face. I'm not sure if this is by design however; Flux has it and it might be a deepfake countermeasure. Or just retards training on a set of 1000 pictures of the same girl.
>>
>>102354985
Pony models are guilty of this because of model inbreeding
>>
>>102354985
I think it just has to do with the fact that checkpoints are basically an average of whatever they were trained on, or so I presume.
>>
>>102354985
i don't see it. i had to make my own models for the faces. maybe it's just the symmetry
>>
>>102355034
I don't think so, you don't see it with other things like dogs and cats. If it was an averaging thing you'd see the same lamp showing up over and over again in the background, for example. I think it's an overtraining issue.
>>
File: file.png (71 KB, 175x176)
>>102355056
>>
>>102355085
Anon, posting one face won't convince them, if they haven't noticed it by now.
>>
File: 1715791664843311.png (398 KB, 512x512)
>tfw no rotting gf
>>
File: 1706826273692256.png (316 KB, 512x512)
>>
File: goth gob.jpg (135 KB, 768x1344)
>>102355085
is it just the gentle smile neutral expression?
does this pic have ai face?
>>
File: file.png (33 KB, 200x70)
>>102355155
look at the eyes
it's just the same person wearing facial prosthetics
>>
File: file.png (34 KB, 185x104)
I'm not sure what about our biology says it's the same person. Must be the eye shape to cheekbone ratio.
>>
File: 1723275306485243.png (415 KB, 512x512)
faceblind autist thinks he is unleashing true meaning on the thread
>>
>>102355223
Everyone knows AI face exists. If you don't think it exists, you are definitely face blind. Try going outside and looking at real people.
>>
>>102354985
Flux has buttchins
>>
if you don't know about sameface in 2024 it's unironically over for you
>>
>>102354985
Isn't that a DPO thing?
>>
File: 1717856395738113.png (399 KB, 512x512)
imagine thinking that I am affected by AI "sameface"
>promptlet
>>
So was there a solution for that SD3 "gelu_new" error or is it still unusable in Automatic/Forge?
>>
>>102355443
The Pony models wouldn't be using DPO
>>
File: 1702663117120487.png (167 KB, 1029x1079)
i've hacked together a random prompt generator in comfy based on wildcards, and to read the prompt i've hooked up a simple string reader at the end of it. However, i can only see the prompt of the current image being generated, and when it's finished it's replaced by a new one. So when the image has finished generating and i would like to see the prompt that generated it and compare it to the image, the string has already been replaced by the next prompt in the queue, if that makes sense. So what i would want is some sort of buffer or delay, so that it displays the last prompt generated instead of the current one.

Does anyone know if this is possible somehow? I hope i described that well enough.
>>
>>102355675
anything is possible, just find a node or make a node that saves a string to a text file
>>
>>102355675
Yeah, so, you need to find a way to cache out the current prompt into a file on its own. I guess this should be doable because it's Python.
Or if that's not possible you could resort to writing a simple manager/controller outside comfy, but that's probably not that simple.
>>
>>102355835
>>102355675
To add: it's probably hard because ComfyUI is not time-based on any level, it's not made like that.
You will need to figure it out.
I mean it's not like Maya, which you can easily script the way you want and refer back to previous positions because the data is there, and if it's not present you can always read it back.
>>
https://github.com/ToTheBeginning/PuLID
Oh that's cool! It's like InstantID but for Flux
>>
>>102355835
>>102355915
it would be dead simple to make a node that saves a string to a text file
you could even save the output image and the prompt together as an image (.png) and .txt file pair.
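a minimal sketch of such a node, assuming the standard custom node layout (drop a file like this into custom_nodes/; all names here are made up):

import os
import time

class SavePromptText:
    @classmethod
    def INPUT_TYPES(cls):
        return {"required": {
            "text": ("STRING", {"forceInput": True}),
            "filename_prefix": ("STRING", {"default": "prompt"}),
        }}

    RETURN_TYPES = ()
    OUTPUT_NODE = True
    FUNCTION = "save"
    CATEGORY = "utils"

    def save(self, text, filename_prefix):
        out_dir = os.path.join("output", "prompts")
        os.makedirs(out_dir, exist_ok=True)
        # timestamp in the name keeps one file per queued gen
        path = os.path.join(out_dir, f"{filename_prefix}_{int(time.time())}.txt")
        with open(path, "w", encoding="utf-8") as f:
            f.write(text)
        return ()

NODE_CLASS_MAPPINGS = {"SavePromptText": SavePromptText}

wire the same string you feed the sampler into "text" and every prompt lands in its own .txt, so nothing gets overwritten when the queue moves on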
>>
>>102356074
vram?
>>
>>102356074
>chink looks at viewer
>>
>>102356074
Demo here
https://huggingface.co/spaces/yanze/PuLID-FLUX
>>
File: Capture.jpg (318 KB, 2301x1518)
>>102356135
disappointing...
>>
File: Capture.jpg (280 KB, 2349x1489)
>>102356135
not bad, but I should try something that gives a different expression than the input image I guess
>>
File: file.png (3.06 MB, 2030x1505)
>>102356074
>As shown in the above image, in terms of ID fidelity, using fake CFG is similar to true CFG in most cases, except that in a few cases, true CFG achieves higher ID similarity. In terms of image aesthetics and facial naturalness, fake CFG performs better.
Interesting, that's what I noticed as well at some point in time
https://reddit.com/r/StableDiffusion/comments/1emy5oz/a_higher_cfg_helps_flux_to_make_celebrities_look/
>>
>>102355763
This is the answer, also add the seed number to the save file name or something to make life easier.
>>
>>102356387
also meant for
>>102355675
>>
>gen big butt foid
>watching it gen with fast preview
>at some early step it decides to flip the foid around so she's facing the other way
>now it doesn't know what to do with the big protruding buttocks area that's no longer her butt
>watch as it turns into a big potbelly
>>
File: tmp0r8wnib6.png (669 KB, 768x1024)
>>102355580
bump
>>
>>102356427
Sergeant Braphog came for you when you least expected it. Or maybe not, since it was an ass fetish image anyways.
>>
The new ChatGPT model might actually be good enough to design an image model from scratch. Investigating.
>>
>>102356076
Sure, but you need to implement a time function too.
I think this cannot be done without changing the source code of the original nodes or at least the interface itself.
I might be wrong because it has been some time since I did any work on this etc.
>>
how's the 4070ti at flux? are all those new advancements in architecture or whatever the fuck making it really fast?

>which i can't use because i'm on a 970 right now
>>
>>102356625
Comfy is just a behavior tree. Each node typically has an input and an output. The node takes in an input (if available) and does something and if there's an output pushes it forward. It's just a dynamic system of modules, it's trivial to do.
>>
>>102356312
is this face thing unique to flux or is it available for 1.5 or sdxl or pony?
>>
>>102356691
exists for SD1.5 but not for XL/Pony IIRC
>>
>>102356664
Don't worry, flux is way slower than other open source models in its class, but the image quality is on par with what Bing AI was using. Just make sure you have at least 12GB VRAM and CUDA compatibility, you'll be good with that.
My RTX 3060 satisfies these, and I spend like 2 minutes with 30-40 steps. That's with forge and the optimized model the dev uploaded. You can make FHD wallpapers with it, even.
>>
>>102356714
>exists for SD1.5 but not for XL/Pony IIRC
instantID was for XL though?
>>
>>102356686
Yeah but you will need to implement 'frame' conception.
>>
>>102356777
what the fuck are you talking about?
he just wants to know what the prompt was
why are you making this so complicated?
>>
>>102356776
yeah but instant ID kinda sucks, at least from my experience it fucks up the face especially the mouth often, PuLID seems to work much better
>>
>>102356664
There are no advancements in arch. It's the exact same latent diffusion process as in SD or all the other current image generators. It just uses an additional text encoder and a transformer instead of a unet to predict noise from t -> t-1
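schematically the sampling loop is the same either way; only the noise predictor swaps out (pure pseudocode, not a real API):

def sample(predictor, latents, text_emb, scheduler):
    # predictor can be a unet or a transformer, the loop doesn't care
    for t in scheduler.timesteps:
        noise_pred = predictor(latents, t, text_emb)      # predict the noise at step t
        latents = scheduler.step(noise_pred, t, latents)  # denoise: t -> t-1
    return latents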
>>
>>102356795
>PuLID seems to work much better
you tested the demo or on ComfyUi?
>>
>>102356890
I meant the one for 1.5, the faces came out better than with InstantID on XL
>>
>>102356135
I guess that can be used on comfyui right?
https://github.com/cubiq/PuLID_ComfyUI
>>
>>102356774
looks like im buying a couple supers
https://youtu.be/nP5hG2voJ4I?si=rro4DZpWecgX7c2V

>and it takes over double this time at 1024pix on sdxl on my current gpu
>for around the same power draw
>>
>>102356947
retracting this, not the ti super, i got confused. which is exactly what nvidia was going for with this retarded scammy release scheme. of the three GPUs the TI is the best value price/performance/wattage.
>>
File: 00094-4149537017.jpg (483 KB, 1344x1792)
gigu
>>
>>102356798
a transformer instead of U-Net is a massive change
>>
>>102356788
You either do it right or don't do it at all.
>>
>>102357284
Wouldn't say that fits my definition of "massive"
>>
>>102357362
that's alright. you only need to agree that it's an advancement, specifically in terms of the arch
>>
File: file.jpg (542 KB, 1648x2066)
No Flux Anime in sight...
>>
>>102357376
I do agree that it's an algorithmic advancement based on the same mathematical process.
>>
File: 00000-721546627.jpg (411 KB, 1344x1792)
>>
i can't even google this question and get any results, probably because i'm wording the question wrong, but those of you on 3090's/more than 24gb of vram, do you use batch size? Like generating more than a few images all at once: how much vram does that take and do you get a performance hit doing it?
i just thought about how much more efficient that would probably be in performance + power usage vs genning 1 image at lower performance, say 4070ti vs 3090ti.
>>
File: clip_generation.gif (744 KB, 800x392)
Back to basics!
>>
File: 1726175364.png (771 KB, 1024x1024)
>>
File: file.jpg (103 KB, 1064x834)
>>102357499
CLIP is trash

SigLIP is the future
>>
File: grid-0003.jpg (587 KB, 2048x2048)
>>
File: file.png (1.78 MB, 1467x893)
>>
>>102356945
>I guess that can be used on comfyui right?
PuLID is asking you to install the facexlib package, but unfortunately it's not available on python 3.11, and comfyUi uses 3.11 so you're fucked lol
https://github.com/vladmandic/automatic/discussions/110
>numba is not compatible so facexlib, gfpgan and realesrgan modules will not be available
>>
File: 1726175681.png (289 KB, 1024x1024)
>>
>>102357586
That's a boy
>>
File: file.png (116 KB, 1529x1085)
>>102357606
>and comfyUi uses 3.11
kek, how do I downgrade to 3.10 then?
>>
>>102357606
>comfyUi uses 3.11 so you're fucked lol
you can definitely run Comfy with older versions of Python. I use 3.10.6 without any issues or custom node incompatibilities

>>102357606
>https://github.com/vladmandic/automatic/discussions/110
you're quoting a post from April '23
>>
>>102357684
>>102357658
>>102357606
it's ok there's a way to make it work anyway
https://github.com/cubiq/PuLID_ComfyUI/issues/1#issuecomment-2102591918

Do this instead:
ComfyUI_windows_portable\update>..\python_embeded\python.exe -s -m pip install --use-pep517 facexlib
>>
File: 1726176053.png (487 KB, 1024x1024)
>>
>>102357410
oh, well. that's not particularly controversial.
I'm sure you understand what I disagree about in terms of the statement "There are no advancements in arch" but meh
>>
File: file.png (310 KB, 2728x1505)
>>102357704
bruh... now it doesn't want to install insightface, this shit is fucked
>>
File: 1726176457.png (653 KB, 1024x1024)
>>
File: 00021-2170268794.jpg (726 KB, 1344x1792)
>>
>>102357808
Don't build it, just install this prebuilt wheel
https://github.com/cubiq/ComfyUI_IPAdapter_plus/issues/162#issuecomment-1868967714
>>
>>102355675
>>102355763
>>102356387
Like this?
>>
>>102357774
Maybe it's helpful to differentiate between "software arch" and "mathematical arch"
>>
>Loads Flux
>Adds Lora
>Python.exe has stopped working
What's the git gud trick to get Flux running with a lora on a 16GB GPU?
>>
>>102357880
What UI are you using?
>>
>>102357896
Forge
>>
File: 00029-3113399324.jpg (697 KB, 1344x1792)
>>
>>102351868
Soo any finetunes of Flux yet that add porn and furry?
>>
File: LGp_Vp-f_400x400.jpg (17 KB, 400x400)
>>102357880
Oh wait there is a setting for that: fp16 LoRA
>>
>>102358014
Probably won't ever happen. It's very expensive to train, and with Flux's restrictive license, crowdfunding won't be possible.
>>
File: 00044-3113399326.jpg (558 KB, 1344x1792)
>>
File: ComfyUI_00343_.jpg (1.9 MB, 3584x4608)
>>
File: file.png (967 KB, 3744x1728)
>>102356945
>>102356074
yep... the flux model isn't working for ComfyUi yet
https://github.com/cubiq/PuLID_ComfyUI/issues/69
>>
File: angry010.jpg (13 KB, 146x222)
Say No to vrambloat.
It's a crime against civilization.
>>
File: 00047-3113399327.jpg (621 KB, 1344x1792)
>>
File: ComfyUI_00540_.png (2.09 MB, 1568x1568)
>>
I hope specific layer training gets implemented for Kohya soon. It would save VRAM and increase training speed.
>>
>>102358236
you tried this one?
https://github.com/ZHO-ZHO-ZHO/ComfyUI-PuLID-ZHO
>>
File: ComfyUI_00363_.jpg (2.07 MB, 3584x4608)
>>
File: 0.jpg (97 KB, 1024x1024)
>>
>>102357871
no. the arch is the model structure. it is /the/ arch in ML.
the "software arch" is the stack, and "mathematical arch" just sounds like a parabolic arch lmao
>>
>>102356074
I never tried that one, only InstantID. Which was better during the SDXL days?
>>
>>102357564
prompt?
>>
>>102355675
What's the node with the string for autorefresh called?
>>
File: file.png (2.88 MB, 1636x1523)
>>102356074
>>102356135
nice
>>
>>102357880
In comfy you can use gguf flux nodes and load any loras you want. Lowest quality/smallest quantization flux model can fit in like 4 gig
>>
>>102357880
>>102358696
this, get a GGUF file that's under 16GB
i have 8GB VRAM and the Q4 quant works perfectly
>>
>>102351868
that's a cool looking super stargate
>>
>trying to get joycaption + llama 8b gguf to stop describing every image with 'emotions' and 'moods'
>make prompt something like "caption this image descriptively but concise. you will not use any emotions or moods.." [etc]
>every description, without fail: THE MOOD OF THIS IMAGE IS..
reeeeeeeeeeeeeeeeeeeeeeeeeeee
>>
>>102359001
I think it did stop describing everything as whimsical though, so that's.. progress?
>>
>>102359001
>every description, without fail: THE MOOD OF THIS IMAGE IS..
if joycaption starts its sentence with "the mood of this image is" then you can easily remove that sentence with a regex python script, like the sketch below
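a sketch of that script; it also catches the sentence when it's buried mid-caption:

import re

def clean_caption(caption: str) -> str:
    # drop any sentence beginning "the mood of this/the image ..."
    caption = re.sub(r"The mood of (?:this|the) image[^.!?]*[.!?]",
                     "", caption, flags=re.IGNORECASE)
    # collapse the whitespace left behind
    return re.sub(r"\s{2,}", " ", caption).strip()

print(clean_caption("A girl in a field. The mood of this image is whimsical. She holds a sword."))
# -> A girl in a field. She holds a sword.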
>>
I like loss graph
>>
>>102359027
it doesn't start with that, no, it's randomly placed throughout the caption. not like you can't still remove every sentence that starts with "the mood of", but I still don't want it to waste tokens on that shit
>>
>>102359188
yeah I feel that, but somehow every caption model adds this fluff shit, even the best of them all, GPT4V, and they don't want to listen to our instructions at all, that's frustrating
>>
I have only used one single text box for prompting flux. How much of an impact does it really make?
>>
>>102359207
the most confusing part is this must mean they were all trained that way... why... no one talks this way, where did the LLM purple prose-descriptions originate from lmao
>>
>>102359225
that's a good question, my theory is that those caption models were trained as regular LLMs first, so they know how to write a story and add some "writing fluff". when finetuned to describe an image, they probably think it's the same thing as describing a scene in a book, so they add those "mood" things, that's my 2 cents
>>
File: 0.jpg (200 KB, 1024x1024)
>>
>>102359001
8B models are retarded.
>>
>>102359408
even GPT4V does this shit and it's definitely not an 8b model
>>
>>102359413
Because you're retarded.
>>
>>102357198
>>102357417
>>102357579
>>102357834
>>102358011
Keep 'em coming FAGGOT
>>
File: 1726185683.png (1.18 MB, 1024x1024)
>>
File: 1726185739.png (1.22 MB, 1024x1024)
>>
>>102359722
omg it bigu
>>
File: 1726186180.png (1.39 MB, 1024x1024)
>>
File: 1726186375.png (1.49 MB, 1024x1024)
>>
File: 1726186502.png (1.62 MB, 1024x1024)
>>
>>102359722
Miku by Ubisoft
>>
CogVideoX just added img2video, too bad this model sucks ass though
https://github.com/kijai/ComfyUI-CogVideoXWrapper/issues/54
>>
>>102359938
at this point I feel like img2anything is just worse controlnet
>>
>>102359955
dunno man, for me, controlnet feels like rotoscoping, it's not natural at all, especially if you want to transform a realistic image (or a 3d image) into an anime image
>>
>>102352228
"Clip Skip" isn't even a thing on SDXL, the differences for SDXL based models were always software bugs, there's no need to use the "Clip Set Last Layer" node for anything XL at all in current Comfy versions.
>>
>>102352357
no you don't, in current Comfy you should just not use "Clip Set Last Layer" nodes at all for XL models
>>
>>102354985
Flux Dev and Schnell have it because they're distilled from Pro, which reduces variance a lot. It's not the same reason that something like SD 1.5 has it.
>>
>>102352228
A1111's clip skip 1 is actually clip skip 2 on Comfy. A1111 is hardcoded to prevent clip skip from being set below 2 for no logical reason.
>>
>>102359938
uhh finally
>>
>>102359938
>>102360209
it's a fucking nothingburger
>>
File: 1713471050125020.jpg (1.77 MB, 1632x1632)
don't fuck with me
>>
File: 00107-2493429972.jpg (416 KB, 1152x1536)
>>
File: 00111-15654573.jpg (412 KB, 1152x1536)
>>
>>102359347
yeah that's not a bad theory and makes sense all things considered, but jeez does it suck for our purpose
>>
>>102360024
ControlNet is useful: openpose for setting a pose, or using one of the modes based off an existing image, etc. There's a lot you can do with it. sd1.5 was great with controlnets + latent couple, dividing up the image and genning high res straight from txt2img
>>
File: 0.jpg (250 KB, 1024x1024)
>>
What is the minimum number of images you need to create a model from scratch? I'm not talking about fine-tuning an already existing model
>>
>>102361285
the only thing we know is that Stable Cascade used 2% of the Laion-5b dataset, so something like 100 million pictures is the range for pretraining
>>
>1 hour with no images
These are truly the last days
>>
Anon is busy training
>>
File: ComfyUI_01428_.png (1.51 MB, 1024x1024)
>>
>>102361383
Unit ready
>>
>>
>>102353131

Real men prompt manually.
>>
>>102358014
Pony is training on auraflow
>>
File: 1707750947151756.jpg (74 KB, 1170x1082)
>>102359938
>want to use cogvideo to do additive video inpainting
>have to set up an entire workflow for it
SOMEBODY PLEASE JUST MAKE AN AFTER EFFECTS PLUGIN ALREADY
>>
File: ComfyUI_33772_.png (2.08 MB, 1024x1024)
>>
File: 00006-953169750.png (2.66 MB, 1152x1632)
>>
File: ComfyUI_33818_.png (1.58 MB, 1024x1024)
>>
File: ComfyUI_33819_.png (1.66 MB, 1024x1024)
>>
File: 00008-93736787.png (2.73 MB, 1152x1632)
>>
File: ComfyUI_33821_.png (1.83 MB, 1024x1024)
>>
>>
File: 00002-589983247.jpg (3.5 MB, 2192x2192)
>>102363147
This is a really cool style
>>
>>
:( I wanted to bake a lora overnight but it's 2 am and I've only just finished cleaning the dataset, let alone reviewing and fixing the captions... I'm going to bed... It'll have to wait another night..

Sometimes I wish I could bring myself to just shit out crappy no effort lora after crappy no effort lora like nochekaiser
>>
>>102363785
Proper planning prevents piss poor performance. PPPPPP
You're doing the right thing, sleep well anon.
>>
>>102363103
>>102363147
Model/LoRA? Alluring style
>>
File: ComfyUI_33824_.png (1.75 MB, 1024x1024)
>>102363823
https://mega.nz/folder/mtknTSxB#cGzjJnEqhEXfb_ddb6yxNQ 16mei folder.
Based on this artist https://xcancel.com/ju6mei
>>
File: 00126-4192611187.jpg (524 KB, 1344x1728)
>>
Ran killed this general.
>>
zzzzzz..... mimimimi...... zzzzzz.... mimimimi
>>
File: ComfyUI_33837_.png (1.3 MB, 1024x768)
>>
https://github.com/Vchitect
>[09/2024] We release Vchitect 2.0, including the model and the training system
>Vchitect-2.0 is a high-quality video generative model with 2 billion parameters, supporting resolutions up to 720x480 and video durations of 10-20 seconds. Besides, we are also developing a larger version with 5 billion parameters, which will be released in the future.
When I click on the link I get a 404 error though
>>
>>102365940
https://xcancel.com/ai_trends_hub/status/1834527949208621127#m
>A 5B parameter version will be released in the future.
https://vchitect.intern-ai.org.cn/
that looks better than CogVideoX I guess
>>
File: 1725456605554449.png (56 KB, 699x293)
i just noticed i have 70gb of hugging face models in my home directory. does anyone know what these might be used for? like why is flux dev there
>>
>>102366117
It's huggingface's downloads/cache folder, which it arbitrarily deletes from, so you get to download the model 100 times.
it's why you should never use code like:
tokenizer = T5Tokenizer.from_pretrained("google/flan-t5-xxl")

The model gets downloaded to that directory and then, after an indeterminate amount of time, gets deleted. My conspiracy theory is they do this on purpose as a form of censorship.
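if you want the weights somewhere it won't touch, point the cache at a directory you control (cache_dir is a real kwarg; the path here is just an example):

from transformers import T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained(
    "google/flan-t5-xxl",
    cache_dir="/models/hf-cache",  # any persistent path you own
)
# or export HF_HOME=/models/hf-cache once and everything lands there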
>>
>>102365940
same here, which is a shame, it's 21:20 in Beijing so it might not be fixed until Monday.
>>
>>102365984
Their huggingface links are all 404 as well.
Curious.
>>
https://liuziwei7.github.io/papers/vchitect_slides.pdf

A very well-rounded video model (apparently)
>>
>>102366671
>no numbers
>they compare to CogVideo and not CogVideoX
hmm...
>>
>>102366721
Their technical report isn't even on their webpage...
It's over, before it even began. D.O.A.
Why do nerds constantly do this?
>"My GF actually goes to another school"
>"My AI model is nearly finished, here's the hard-drive im going to put it on when it's done"

No one cares ChangDev, put up or shut up.
>>
>>102366794
>No one cares ChangDev, put up or shut up.
this, 100% this, they're wasting our time with this bullshit
>>
Time to see if me and my coding buddy ChatGPT o1-preview can design a diffusion model from scratch.

- Flan T5 XXL
- Osiris 16channel VAE
- No crop image encoding (resize based on total pixels ie 256x256)
- pad and attention mask to handle "bucketing"
- rotary position embedding
- KV compression idea from Pixart
- Cross attention transformer blocks with Flash Attention

Will it work? Who knows.
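for what it's worth, a toy sketch of what one of those cross-attention blocks could look like; dimensions, naming and the plain LayerNorms are assumptions to make the list concrete, a real implementation would add timestep conditioning, RoPE and fused flash-attention kernels on top:

import torch.nn as nn

class CrossAttnBlock(nn.Module):
    def __init__(self, dim, heads, ctx_dim):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.self_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        self.cross_attn = nn.MultiheadAttention(dim, heads, kdim=ctx_dim,
                                                vdim=ctx_dim, batch_first=True)
        self.norm3 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(nn.Linear(dim, dim * 4), nn.GELU(),
                                 nn.Linear(dim * 4, dim))

    def forward(self, x, ctx, ctx_mask=None):
        # self-attention over the image latent tokens
        h = self.norm1(x)
        x = x + self.self_attn(h, h, h, need_weights=False)[0]
        # cross-attention into the T5 embeddings; key_padding_mask is
        # where the "pad + attention mask" bucketing idea comes in
        h = self.norm2(x)
        x = x + self.cross_attn(h, ctx, ctx, key_padding_mask=ctx_mask,
                                need_weights=False)[0]
        return x + self.mlp(self.norm3(x))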
>>
File: ComfyUI_33842_.png (1.25 MB, 1280x720)
>>
>>102367058
unironically chatgpt has helped me solve several long standing physical irritations where doctors were just like 'live with it bro'
>>
>>102367082
Eat right, get sleep, exercise.
>>
File: SDXL20246.jpg (218 KB, 1256x1256)
I've been really sick so I haven't had the mental fortitude to come up with any decent gen ideas, but I'm feeling a little better and am working on some now.
>>
File: 00046-892027988.jpg (52 KB, 477x463)
> 20/20 [00:45<00:00, 2.28s/it]
> 20/20 [00:45<00:00, 2.39s/it]
How can I make Flux schnell actually be schnell on a 16GB GPU?
>>
>>102367311
You could have just recorded your feverish ramblings and used those as prompts.
Why should prompts come from orderly minds only?
The best musicians, painters and writers were mentally deranged, on drugs, or at plain old varying levels of insanity during their most prolific periods.
Why should we, modern artists, be any different?
>>
is there no flux training rentry yet?
>>
File: 1695268332467266.png (3.77 MB, 2048x2048)
>>
>>102367397
And you could have not posted this but you did
>>
>>102367413
just use this, it's pretty easy https://github.com/cocktailpeanut/fluxgym
>>
>>102367459
>python -m venv env
>env\Scripts\activate
really? They still need this bullshit?
>>
>>102367438
It's an important lesson that creativity is not something you must only do when you are feeling well. Feeling unwell and doing something you normally enjoy can be part of the recovery process.
Thanks for prompting me to clarify it all for anons.
>>
File: 00052-2951739421.png (986 KB, 896x1152)
>>
>>102363896
NTA but ty
>>
File: 1723074713770759.png (2.98 MB, 2048x2048)
>>
Come and get your daily bread...
>>102367811
>>102367811
>>102367811
>>
great
>>
You're welcome



