/g/ - /ldg/ - Local Diffusion General - Technology


File: the longest dick general.jpg (2.24 MB, 3264x2448)

/ldg/ - Local Diffusion General Anonymous 10/17/24(Thu)14:26:21 No.102862167

Discussion of free and open source text-to-image models

Previous /ldg/ bred : >>102850799

Wishing for 4B Edition

>Beginner UI
Fooocus: https://github.com/lllyasviel/fooocus
EasyDiffusion: https://easydiffusion.github.io
Metastable: https://metastable.studio

>Advanced UI
Forge: https://github.com/lllyasviel/stable-diffusion-webui-forge
reForge: https://github.com/Panchovix/stable-diffusion-webui-reForge
Automatic1111: https://github.com/automatic1111/stable-diffusion-webui
ComfyUI: https://github.com/comfyanonymous/ComfyUI
InvokeAI: https://github.com/invoke-ai/InvokeAI
SD.Next: https://github.com/vladmandic/automatic
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI

>Use a VAE if your images look washed out
https://rentry.org/sdvae

>Model Ranking
https://imgsys.org/rankings

>Models, LoRAs & training
https://aitracker.art
https://huggingface.co
https://civitai.com
https://github.com/Nerogar/OneTrainer
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts
https://github.com/kohya-ss/sd-scripts/tree/sd3

>Flux
https://huggingface.co/spaces/black-forest-labs/FLUX.1-schnell
https://comfyanonymous.github.io/ComfyUI_examples/flux
Quants: https://huggingface.co/TheYuriLover/flux-dev-de-distill-GGUF/tree/main

>Pixart Sigma & Hunyuan DIT
https://huggingface.co/spaces/PixArt-alpha/PixArt-Sigma
https://huggingface.co/comfyanonymous/hunyuan_dit_comfyui
Nodes: https://github.com/city96/ComfyUI_ExtraModels

>Index of guides and other tools
https://rentry.org/sdg-link
https://rentry.org/rentrysd

>Try online without registration
txt2img: https://www.mage.space
img2img: https://huggingface.co/spaces/huggingface/diffuse-the-rest
sd3: https://huggingface.co/spaces/stabilityai/stable-diffusion-3-medium

>Maintain thread quality
https://rentry.org/debo

>Related boards
>>>/aco/sdg
>>>/aco/aivg
>>>/b/degen
>>>/c/kdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/tg/slop
>>>/trash/sdg
>>>/u/udg
>>>/vt/vtai

Anonymous 10/17/24(Thu)14:27:44 No.102862186

thread claimed by sana-samas

Anonymous 10/17/24(Thu)14:27:51 No.102862191

File: HunyuanDiT_02416.png (1.39 MB, 832x1152)

Anonymous 10/17/24(Thu)14:31:13 No.102862233

File: ComfyUI_temp_ibptx_00011_.png (1.76 MB, 1168x1368)

Anonymous 10/17/24(Thu)14:39:14 No.102862320

>>102862191
This is not Hunyuan

Anonymous 10/17/24(Thu)14:45:19 No.102862379

File: ComfyUI_02016_.png (1.25 MB, 1024x1024)

Anonymous 10/17/24(Thu)14:46:42 No.102862393

File: 00273-1349510609.jpg (485 KB, 1080x1440)

>>102862233
Eva Longoria?

Anonymous 10/17/24(Thu)14:47:36 No.102862404

SDXL looks incredibly outdated.

Anonymous 10/17/24(Thu)14:48:06 No.102862408

File: ComfyUI_temp_tpppe_00003_.png (1.91 MB, 1344x1344)

Anonymous 10/17/24(Thu)14:48:19 No.102862412

>>102862404
try illustriousxl

Anonymous 10/17/24(Thu)14:54:22 No.102862479

File: ComfyUI_temp_tpppe_00006_.png (1.76 MB, 1120x1440)

Anonymous 10/17/24(Thu)14:57:16 No.102862524

File: collageldg.jpg (1.22 MB, 3440x3440)

alternate collage

Anonymous 10/17/24(Thu)14:58:39 No.102862538

>>102862524
kek. much better collage than OP

Anonymous 10/17/24(Thu)15:00:54 No.102862557

>>102862524
good image

Anonymous 10/17/24(Thu)15:06:02 No.102862600

>>102862524
>>102862538
>>102862557
samefag

Anonymous 10/17/24(Thu)15:07:20 No.102862611

File: 1720294274349663.jpg (1.46 MB, 2152x1232)

>>102862524
top tier. come back to the server.

Anonymous 10/17/24(Thu)15:07:24 No.102862613

>>102862186
this is the blessed thread of frenship

Anonymous 10/17/24(Thu)15:08:27 No.102862627

File: 00300-1349510608.jpg (388 KB, 1728x1152)

Anonymous 10/17/24(Thu)15:12:00 No.102862660

>>102862524
top kek

Anonymous 10/17/24(Thu)15:12:43 No.102862670

I think pajeeta anon should know that genning around 1 megapixel with Flux is important. If you go too low you run the risk of quality issues
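
A rough sketch of the arithmetic behind the 1 megapixel advice above. It compares a 512x832 gen (the size of several FLUX files posted in this thread) against 1024x1024, and picks a size near 1 MP for a given aspect ratio. The helper names and the rounding to multiples of 16 are assumptions for illustration, not something stated in the thread.

```python
import math

def megapixels(width, height):
    """Pixel count expressed in millions of pixels."""
    return width * height / 1_000_000

def dims_near_megapixel(aspect_w, aspect_h, target_pixels=1024 * 1024, multiple=16):
    """Return (width, height) close to target_pixels for the given aspect ratio.

    Both sides are snapped to the nearest `multiple`; 16 is an assumed
    granularity, not a value taken from the thread.
    """
    ratio = aspect_w / aspect_h
    height = math.sqrt(target_pixels / ratio)
    width = height * ratio

    def snap(value):
        return max(multiple, int(round(value / multiple)) * multiple)

    return snap(width), snap(height)

if __name__ == "__main__":
    for w, h in [(512, 832), (1024, 1024)]:
        print(f"{w}x{h}: {megapixels(w, h):.2f} MP")
    # Same 8:13 portrait shape as 512x832, scaled up to roughly 1 MP.
    print(dims_near_megapixel(8, 13))
```

By this count 512x832 is well under half a megapixel, which is the range the post warns about.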

Anonymous 10/17/24(Thu)15:12:48 No.102862672

>>102862524
all my beautiful jeetas... like tears in the rain...

Anonymous 10/17/24(Thu)15:15:15 No.102862697

>>102862524
The great redeeming

Anonymous 10/17/24(Thu)15:24:33 No.102862803

>>102862524
Lul

Anonymous 10/17/24(Thu)15:26:51 No.102862828

>>102862803
Not funny.

Anonymous 10/17/24(Thu)15:27:51 No.102862838

File: 2646627489.jpg (876 KB, 2304x1344)

Anonymous 10/17/24(Thu)15:32:58 No.102862887

>>102862524
JeetaPit

Anonymous 10/17/24(Thu)15:37:02 No.102862919

File: 00333-1349510609.jpg (992 KB, 2160x1440)

Anonymous 10/17/24(Thu)15:38:34 No.102862933

File: FLUX-1053604032293006_00001_.png (415 KB, 512x832)

I definitely misjudged the audience, my bad. I will stay out of the subcontinent for my next batch.

Anonymous 10/17/24(Thu)15:39:51 No.102862947

File: FLUX-405551624629727_00001_.png (409 KB, 512x832)

just a few more I need to post

Anonymous 10/17/24(Thu)15:40:59 No.102862969

File: FLUX-783264754869585_00001_.png (351 KB, 512x832)

not the best gen but she has a comically large bosom and that's rare with flux, so I'm compelled to post

Anonymous 10/17/24(Thu)15:42:01 No.102862983

File: FLUX-159512081115233_00001_.png (351 KB, 512x832)

Anonymous 10/17/24(Thu)15:47:57 No.102863069

File: FLUX-10138137848253_00001_.png (398 KB, 512x832)

Anonymous 10/17/24(Thu)15:58:11 No.102863165

File: img-2024-10-17-21-58-00.png (1.67 MB, 1200x896)

this thread needs 1girls

Anonymous 10/17/24(Thu)16:00:59 No.102863208

>>102863165
Apparently aliens too.

Anonymous 10/17/24(Thu)16:04:23 No.102863248

>>102863165
we need waifu diffusion flux edition: return of the massive titty elves

Anonymous 10/17/24(Thu)16:06:39 No.102863267

i don't like flux a lot, i can only run flux s with my 3080

Anonymous 10/17/24(Thu)16:09:14 No.102863293

File: img-2024-10-17-22-09-11.png (1.72 MB, 1024x1200)

>>102863208

Anonymous 10/17/24(Thu)16:09:33 No.102863302

>>102863267
>i don't like flux a lot, i can only run flux s with my 3080
It's not your card, it's the coders. It could be made to work with dev.

Anonymous 10/17/24(Thu)16:15:48 No.102863368

File: ComfyUI_03617_.png (1.56 MB, 1024x1024)

Anonymous 10/17/24(Thu)16:18:55 No.102863404

File: 00383-323849799.jpg (373 KB, 1728x1152)

Anonymous 10/17/24(Thu)16:48:29 No.102863729

File: ComfyUI_temp_pnzoe_00041_.png (894 KB, 896x1152)

Anonymous 10/17/24(Thu)16:49:18 No.102863745

>>102863165
the thread already has 20girls at least

Anonymous 10/17/24(Thu)17:00:30 No.102863872

>>102862524
lmaooo

Anonymous 10/17/24(Thu)17:07:56 No.102863965

File: 3131440319.png (808 KB, 1024x1024)

Anonymous 10/17/24(Thu)17:24:08 No.102864175

Shit gens ITT

Anonymous 10/17/24(Thu)17:28:58 No.102864227

>>102864175
show us how it's done

Anonymous 10/17/24(Thu)17:29:41 No.102864237

File: 495060443.png (275 KB, 1152x896)

Anonymous 10/17/24(Thu)17:30:12 No.102864244

>>102864237
prompt?

Anonymous 10/17/24(Thu)17:34:02 No.102864286

File: 83805878.png (683 KB, 1152x896)

>>102864244
>Black and white headshot of a beautiful woman with slicked back hair, she has a serious expression and is looking straight ahead. There is a dramatic thin strip of light highligthing her eyes. <lora:flux_realism_lora:1>

It's img2img of a white strip on a black background.
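
For anyone wanting to try the same trick, here is a minimal sketch of building that kind of init image: a black frame with one white horizontal strip, written as a binary PGM using only the standard library. The 1152x896 size matches the posted files; the strip position and thickness, the function names, and the output filename are guesses for illustration, and the post doesn't say what denoise strength was used.

```python
def strip_image(width=1152, height=896, strip_top=300, strip_height=40):
    """Return rows of 8-bit grayscale pixels: black, with one white horizontal strip.

    strip_top and strip_height are assumed values; the post does not give them.
    """
    rows = []
    for y in range(height):
        value = 255 if strip_top <= y < strip_top + strip_height else 0
        rows.append(bytes([value]) * width)
    return rows

def write_pgm(path, rows):
    """Write the rows as a binary PGM (P5) file, which most image tools can open."""
    height = len(rows)
    width = len(rows[0])
    with open(path, "wb") as f:
        f.write(f"P5\n{width} {height}\n255\n".encode("ascii"))
        for row in rows:
            f.write(row)

if __name__ == "__main__":
    # Hypothetical output name; convert to PNG before loading it as the img2img source.
    write_pgm("strip_init.pgm", strip_image())
```

The strip ends up where the model is most likely to place the band of light across the eyes, so moving `strip_top` should move the highlight.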

Anonymous 10/17/24(Thu)17:34:17 No.102864289

The most dog shit general on /g/

Anonymous 10/17/24(Thu)17:36:10 No.102864307

>>102864286
ty anon
>It's img2img of a white strip on a black background.
very neat

Anonymous 10/17/24(Thu)17:36:54 No.102864316

>>102863293
Pretty cool. It's creepy.

Anonymous 10/17/24(Thu)17:37:55 No.102864329

>>102863368
Now swap the girl with a beagle

Anonymous 10/17/24(Thu)17:38:59 No.102864340

File: 2023505011.png (482 KB, 768x1344)

>>102864307
sure thing

Anonymous 10/17/24(Thu)17:53:11 No.102864521

File: HARP.png (3.13 MB, 1696x1696)

>>102863368
you inspired me anon = D

Anonymous 10/17/24(Thu)18:16:42 No.102864777

File: 2024-10-17_00002_.png (1.64 MB, 720x1280)

>>102864521

Anonymous 10/17/24(Thu)18:28:51 No.102864945

>no flux Giorgia Meloni lora
ree

Anonymous 10/17/24(Thu)18:39:24 No.102865073

>>102862838
Fat people will never look that good.

Anonymous 10/17/24(Thu)19:15:23 No.102865445

>>102865073
That's a photo, what are you talking about?

Anonymous 10/17/24(Thu)19:32:45 No.102865594

is this thread still blessed?

Anonymous 10/17/24(Thu)19:35:27 No.102865620

blessed hibernation

Anonymous 10/17/24(Thu)19:58:21 No.102865844

>>102864777
Nice

Anonymous 10/17/24(Thu)20:02:12 No.102865887

https://nvlabs.github.io/Sana/
The only good thing about Sana is its LLM encoder. It was about fucking time we ditched that old-ass T5; I wish Flux had something similar

Anonymous 10/17/24(Thu)20:02:44 No.102865896

>>102865887
There is nothing good about vaporware.

Anonymous 10/17/24(Thu)20:05:07 No.102865915

>>102865887
>LLM encoder
Horrifying.

Anonymous 10/17/24(Thu)20:07:16 No.102865941

>>102865915
why? it'll give way better prompt understanding than an almost 3 year old T5 model

Anonymous 10/17/24(Thu)20:09:36 No.102865966

>>102865941
censorship

Anonymous 10/17/24(Thu)20:11:04 No.102865978

>>102865966
T5 isn't censored?

Anonymous 10/17/24(Thu)20:11:53 No.102865984

>miku is a product of xyz corporation, it is not ethical for me to reproduce such images. Instead, here's Steamboat Willie. Fun, right? :)

Anonymous 10/17/24(Thu)20:13:20 No.102865994

>>102865984
it won't talk because they removed the decoder part though, I think the censorship occurs only on the decoder head

Anonymous 10/17/24(Thu)20:17:50 No.102866034

>>102865978
T5 warnings:

Bias, Risks, and Limitations

The information below in this section are copied from the model's official model card:

Language models, including Flan-T5, can potentially be used for language generation in a harmful way, according to Rae et al. (2021). Flan-T5 should not be used directly in any application, without a prior assessment of safety and fairness concerns specific to the application.

Ethical considerations and risks

Flan-T5 is fine-tuned on a large corpus of text data that was not filtered for explicit content or assessed for existing biases. As a result the model itself is potentially vulnerable to generating equivalently inappropriate content or replicating inherent biases in the underlying data.

Known Limitations

Flan-T5 has not been tested in real world applications.

Sensitive Use:

Flan-T5 should not be applied for any unacceptable use cases, e.g., generation of abusive speech.

Anonymous 10/17/24(Thu)20:18:17 No.102866038

>>102865966
It's a 2B model, you can easily finetune it. You're not a child, right? You know how to do that, right?

Anonymous 10/17/24(Thu)20:19:15 No.102866045

>>102866034
>Flan-T5 is fine-tuned on a large corpus of text data that was not filtered for explicit content or assessed for existing biases. As a result the model itself is potentially vulnerable to generating equivalently inappropriate content or replicating inherent biases in the underlying data.
fucking based!

Anonymous 10/17/24(Thu)20:20:35 No.102866055

>>102866034
That doesn't mean anything. Most of the LLMs aren't censored. It's only the chat component that gets censored. Out of the box they're pure text completers.

Anonymous 10/17/24(Thu)20:22:26 No.102866075

File: file.png (195 KB, 2149x611)

>>102866038
>It's a 2B model
it's smaller than T5 then, no? because T5 is an 11B model (we only use its encoder, and that's 5B)
https://arxiv.org/pdf/2410.10629
>In addition, some small LLMs, such as Gemma-2 (Team et al., 2024), can rival the performance of large LLMs while being very efficient.
that's all they say, efficient efficient efficient, what about good? we already have small shit models like SD1.5, SDXL, that field is saturated enough
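
To put the sizes in this exchange side by side, here is a back-of-the-envelope sketch of how much memory the weights alone take at common precisions. The parameter counts are the ones quoted in the posts (11B for full T5, about 5B for its encoder, 2B for Gemma-2); the function name is made up, and activations and runtime overhead are ignored.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "fp8": 1}

def weight_gib(params_billion, dtype="fp16"):
    """Approximate size of the weights alone in GiB (no activations, no overhead)."""
    return params_billion * 1e9 * BYTES_PER_PARAM[dtype] / 2**30

# Parameter counts as stated in the thread, not independently checked here.
ENCODERS = {
    "T5 full model (11B)": 11,
    "T5 encoder only (~5B)": 5,
    "Gemma-2 (2B)": 2,
}

if __name__ == "__main__":
    for name, size in ENCODERS.items():
        fp16 = weight_gib(size, "fp16")
        fp8 = weight_gib(size, "fp8")
        print(f"{name}: {fp16:.1f} GiB at fp16, {fp8:.1f} GiB at fp8")
```

This only covers the "efficient" side of the argument; it says nothing about which encoder gives better prompt adherence.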

Anonymous 10/17/24(Thu)20:22:31 No.102866077

Western Suicide status?

Anonymous 10/17/24(Thu)20:23:34 No.102866084

>>102866075
Anon the goal isn't to have a conversational model, the goal is to have a model that expands text to semi relevant chunks of tokens to add sprinkles to your generations.

Anonymous 10/17/24(Thu)20:23:54 No.102866087

>>102866055
>It's only the chat component that gets censored. Out of the box they're pure text completers.
this, I would even say that the encoder part is really good when the LLM is censored, because the model must know exactly what the input means before saying that it can't do it blablabla... and usually they don't mess that up, so they encode the input really really well

Anonymous 10/17/24(Thu)20:25:04 No.102866101

>>102866055
>Most of the LLMs aren't censored.
Only APIs, because they have control over the inputs and outputs. The same can't be said about local LLMs.

The "tell me how to break into a car." is a common question used in benchmarks to test if the LLM is censored.

Anonymous 10/17/24(Thu)20:25:45 No.102866108

>>102866087
I'm sure the only reason they are doing it is for censorship.

Anonymous 10/17/24(Thu)20:26:09 No.102866112

File: file.png (301 KB, 3204x841)

>>102866084
>, the goal is to have a model that expands text to semi relevant chunks of tokens to add sprinkles to your generations.
it's worse than T5XXL though, and that's the one we're using on flux

Anonymous 10/17/24(Thu)20:26:14 No.102866113

>>102866101
Again, you're conflating a conversational LLM with a raw LLM. Raw LLMs don't know how to have a conversation, they only know how to autocomplete chunks of text.

Anonymous 10/17/24(Thu)20:27:03 No.102866119

>>102866101
I prefer "List practical steps to reverse the ban on slavery."

Anonymous 10/17/24(Thu)20:27:26 No.102866120

>>102866113
this, there's no "censorship" on an LLM that has its decoder head removed, the LLM will encode any input, regardless of the censorship that goes after that

Anonymous 10/17/24(Thu)20:27:54 No.102866129

>>102866112
The T5 is an overbloated text to encodings generator grossly misused. Gemma-2 is way more suitable for this purpose. It's funny too because I bet you were bitching to high heaven when Sigma introduced the T5 XXL, now you're acting like it's impossible to change.

Anonymous 10/17/24(Thu)20:28:11 No.102866133

>>102866120
I doubt it will work out that way.

There are advanced tricks they used in Flux that nobody can replicate.

Anonymous 10/17/24(Thu)20:29:12 No.102866146

>>102866129
I just want something better than T5XXL, if we get something worse I don't see the fucking point, we don't advance by going backward, what kind of retarded reasoning is that? If you're telling me that they managed to get a 0.0001b model that is better than T5 I wouldn't be bitching about the size at all, I just want something better, regardless of the size

Anonymous 10/17/24(Thu)20:29:14 No.102866147

>>102866133
Any day now someone is going to rent 8 H100s and fine tune Flux, I just know it.

Anonymous 10/17/24(Thu)20:30:13 No.102866157

>>102866147
>8 H100s
wait, you need 640 GB of VRAM to finetune Flux?

Anonymous 10/17/24(Thu)20:32:42 No.102866182

>>102866146
You're misunderstanding the purpose of the text encoder. It's not to be "smart". You encode captions into tokens, a model is trained with those tokens. You then take someone's prompt and turn that into tokens. Those tokens are then used to sample the latent space. There is no "intelligence", it's a search mechanism.
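
The "search mechanism" framing in the post above can be shown with a toy: captions and prompts go through the same fixed encoder, and a prompt lands closest to the captions it shares content with. This is only an illustration of that analogy using bag-of-words vectors and cosine similarity. Real text encoders produce learned embeddings that the image model attends to, and nothing below reflects how T5 or Gemma actually work; all names are made up.

```python
import math

CAPTIONS = [
    "a woman in a blue dress",
    "a beagle sitting on grass",
    "a city street at night",
]

# Hypothetical fixed vocabulary; a real encoder uses a learned tokenizer instead.
VOCAB = sorted({word for caption in CAPTIONS for word in caption.split()})

def encode(text):
    """Map text to a unit-length bag-of-words vector (stand-in for a frozen encoder)."""
    words = text.lower().split()
    vec = [float(words.count(term)) for term in VOCAB]
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec

def cosine(a, b):
    """Dot product of two unit vectors, i.e. their cosine similarity."""
    return sum(x * y for x, y in zip(a, b))

def nearest_caption(prompt):
    """Return the training caption whose vector is closest to the prompt's."""
    query = encode(prompt)
    return max(CAPTIONS, key=lambda caption: cosine(query, encode(caption)))

if __name__ == "__main__":
    print(nearest_caption("beagle on the grass"))
```

Note that the toy drops words it has never seen ("the" here), which is one reason the quality of the encoder still matters even under this framing.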

Anonymous 10/17/24(Thu)20:33:47 No.102866193

File: 2024-10-17_00003_.png (1022 KB, 720x1280)

Anonymous 10/17/24(Thu)20:33:52 No.102866195

>>102866157
If you don't want training to take a year you're going to need many H100s to achieve a reasonable learning rate, especially with something like Pony that completely reteaches the model concepts that it never, ever learned. You're talking about 100,000 steps or more at batch 64.
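
The numbers in this exchange are easy to check. A small sketch, assuming 80 GB per H100 (which is what the 640 figure implies) and a completely made-up per-GPU throughput, since the thread gives none:

```python
def total_vram_gb(num_gpus, vram_per_gpu_gb=80):
    """Combined VRAM across cards; 80 GB per H100 is implied by the 640 figure."""
    return num_gpus * vram_per_gpu_gb

def samples_seen(steps, batch_size):
    """Total training samples processed over the whole run."""
    return steps * batch_size

def wall_clock_days(steps, batch_size, samples_per_sec_per_gpu, num_gpus):
    """Days of training, assuming perfect scaling across GPUs (optimistic)."""
    seconds = samples_seen(steps, batch_size) / (samples_per_sec_per_gpu * num_gpus)
    return seconds / 86_400

if __name__ == "__main__":
    print(total_vram_gb(8))            # the "8 H100s" setup
    print(samples_seen(100_000, 64))   # "100,000 steps or more at batch 64"
    # 1.0 sample/sec/GPU is a placeholder, not a measured Flux training speed.
    for gpus in (1, 8):
        print(gpus, round(wall_clock_days(100_000, 64, 1.0, gpus), 1))
```

The point of the reply is visible in the last loop: the GPU count is mostly about finishing in reasonable time, not about the model needing all 640 GB to fit.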

Anonymous 10/17/24(Thu)20:34:36 No.102866204

>>102866182
>You're misunderstanding the purpose of the text encoder. It's not to be "smart".
I never said anything about "smart", I said BETTER, look at the table again, it's objectively worse than T5XXL, why would I want to use something inferior? >>102866112

Anonymous 10/17/24(Thu)20:35:53 No.102866219

>>102866195
>If you don't want training to take a year you're going to need many H100s to achieve a reasonable learning rate
isn't that the case as well for smaller models? I know that pony used a lot of GPUs to train SDXL

Anonymous 10/17/24(Thu)20:37:29 No.102866233

>>102866147
https://civitai.com/models/859032
>These models received serious amount of compute. 32x H100s were used with 16 of them dedicated to multi-node training, and two other nodes split between various training jobs.

>An attempt was made at de-distilling Schnell
>but even it seems, 32x H100 isn't enough for that job, and it was abandoned.
>Not worth testing, doesn't really function at all.
>Offered incase anybody looking to do the same would like somewhat of a head start.

comment by the author:
it was hard and i had to restart several times. it was more like 5000 GPU hours to figure it out. honestly i'm not a fan of Flux anymore. distilled models aren't fun to train

Anonymous 10/17/24(Thu)20:39:05 No.102866249

>>102866034
>It’s uncensored
>muh ethics
>its named after a loli vampire
Shut up and take my bandwidth!

Anonymous 10/17/24(Thu)20:39:13 No.102866250

>>102865941
>why?
Because I've never seen any indication that they're better than me at deciding what to feed into the model. They've taken Gemma and written a fucking "enhance this prompt" type instruction and you get what you get. It's so laughably fucking stupid that I don't know why I'm amazed you idiots think it's a good idea.

Anonymous 10/17/24(Thu)20:39:50 No.102866254

>>102866233
>honestly i'm not a fan of Flux anymore. distilled models aren't fun to train
that retard pushed the "start" button even though dedistilled exists and could've made his life easier, and guess what, someone did train on dedistill and he's getting results with it
https://civitai.com/models/690991?modelVersionId=943891
>This version is a merge of training runs done on Flux De-Distill and Flux Dev2Pro, both of which seek to remove distillation from Flux Dev. Models were merged w/ a ratio of 0.7:0.3 Dev2Pro:De-Distill. The dataset has been unaltered from version 2, hence why it's v2.5 as opposed to v3.
>The result is FAR greater image quality and generally better prompt adherence at the cost of increased generation times
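
The quoted model card only gives the ratio (0.7:0.3 Dev2Pro:De-Distill), not the tool or method. A minimal sketch of what a merge at that ratio usually means, assuming a plain per-parameter weighted average and using small lists in place of real tensors:

```python
def merge_weights(a, b, ratio_a=0.7, ratio_b=0.3):
    """Linearly blend two checkpoints that share parameter names and shapes.

    A plain weighted average is assumed here; the model card does not say
    which merge method was actually used.
    """
    if a.keys() != b.keys():
        raise ValueError("checkpoints must have the same parameter names")
    merged = {}
    for name in a:
        if len(a[name]) != len(b[name]):
            raise ValueError(f"shape mismatch for {name}")
        merged[name] = [ratio_a * x + ratio_b * y for x, y in zip(a[name], b[name])]
    return merged

if __name__ == "__main__":
    # Toy stand-ins for the two training runs; real checkpoints hold billions of values.
    dev2pro = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
    de_distill = {"layer.weight": [3.0, 0.0], "layer.bias": [1.0]}
    print(merge_weights(dev2pro, de_distill))
```

With ratios that sum to 1 the merged values stay in the same range as the inputs, which is why 0.7:0.3 rather than, say, 0.7:0.7.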

Anonymous 10/17/24(Thu)20:41:09 No.102866274

>>102862167
Which AI model has the most soul?

Anonymous 10/17/24(Thu)20:42:02 No.102866286

>>102866250
I'm not talking about that specific LLM in particular, I agree that "rewriting the prompt" is a meme and retarded, I want my model to understand my own sentences. But I'm talking in general, I hate to pretend that T5 is a perfect model and that we'll never improve on that, LLM encoders will be the future

Anonymous 10/17/24(Thu)20:47:09 No.102866329

>>102866233
I wouldn't listen to him, that guy is a lunatic
https://civitai.com/models/859032?dialog=commentThread&commentId=566485
>the internet is not for porn, it was created for warfare.
>and wouldn't it be funny if someone made an SFW Booru model just as a form of psychological warfare?

Anonymous 10/17/24(Thu)20:50:46 No.102866357

>>102866254
The first example shows the common FLUX nipples, or a lack thereof, coupled with mangled hands on a standing photo. I seriously can't think of this as an example of success.

Anonymous 10/17/24(Thu)20:53:30 No.102866383

>>102866286
>LLM encoders will be the future
I'd prefer we design models based on what actually works right now and is demonstrably practical and empowering for the user, rather than making decisions based on ideology. "AI is the future" is your religion and I respect that but don't fucking replace a model that works with a model that doesn't just because it feels 'futuristic'. Put it this way: if they had a checkbox to enable the LLM feature you know full well every half-decent prompter is unchecking that box for more control.

It's trendy retarded bullshit like this which tells me a model isn't really serious.

Anonymous 10/17/24(Thu)20:53:53 No.102866386

>>102866357
yeah I never said it was perfect, and neither did that guy:
>Whilst definitely still a proof of concept compared to something like Pony, it (often) does what it was designed to do quite well!
It just needs a bit more of training, but there's definitely proof that going for dedistill is the best solution if you want to make serious finetunes
https://huggingface.co/nyanko7/flux-dev-de-distill/discussions/3#671172c98b6c6d4db00b5840
>Update on flux-dev-de-distill Training 4 people same class in one lora - "T5 Attention Mask and T5-XXL both disabled" lr 0.0001 When starts to overtrain the subjects start to bleed to each other, up to 80 epochs no bleeding and very good resemblance. when is so overtrained on 200 epochs all subjects get mixed together, For inference works perfect with flux-de-distill and also on regular flux-dev and hyper-flux, on regular flux-dev and hyper-flux the resemblance diminish very little may be improves with lower lr, Now I'm going to train with a much lower lr to avoid overtraining and get finer detail learning, I'll use lr for unet 0.00003 and TE 0.00005 (at inference flux-dev-de-distill cfg 3.5, for flux-dev and hyper-flux cfg 1 and distilled cfg 3.5)
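
The quoted note runs flux-dev-de-distill at cfg 3.5 but regular flux-dev at cfg 1 with a separate "distilled cfg" of 3.5. A minimal sketch of why cfg 1 amounts to switching guidance off, assuming the standard classifier-free guidance formula where two predictions are combined each step. Plain lists stand in for tensors, and nothing here is taken from the trainer or UI the poster used.

```python
def cfg_combine(uncond, cond, scale):
    """Classifier-free guidance: start at the unconditional prediction and
    move `scale` times the distance toward the conditional one."""
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]

if __name__ == "__main__":
    uncond = [0.0, 1.0]   # toy prediction for the empty/negative prompt
    cond = [1.0, 3.0]     # toy prediction for the actual prompt
    print(cfg_combine(uncond, cond, 1.0))   # identical to cond
    print(cfg_combine(uncond, cond, 3.5))   # pushed well past cond
```

At scale 1 the unconditional term cancels, so the second model pass buys nothing. As far as I understand it, the "distilled cfg" value on regular flux-dev is instead a number fed to the model as an input, which is why that setup can keep real cfg at 1, while the de-distilled model needs true two-pass guidance and slower gens.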

Anonymous 10/17/24(Thu)20:55:13 No.102866395

>>102866383
>AI is the future" is your religion and I respect that but don't fucking replace a model that works with a model that doesn't just because it feels 'futuristic'.
I never said anything close to that, what I said is that T5 isn't perfect, therefore we have to try some shit to replace it and move forward

Anonymous 10/17/24(Thu)21:15:19 No.102866575

>>102866254
>The dataset for males now contains 175 images, and the female dataset now consists of 75 images
holy fuck just STOP this shit.
Like every fucking flux lora does this, 100-200 images max. Maybe, MAYBE that's ok if you're just training a basic style. But for concept loras it's nowhere near enough. Civit is filled with this low image count overfit flux slop and it keeps getting worse.

Anonymous 10/17/24(Thu)21:38:49 No.102866781

File: ComfyUI_02186_.png (1.58 MB, 1024x1024)

Anonymous 10/17/24(Thu)21:45:31 No.102866840

>>102866395
there is such a thing as being able to judge ideas as promising or not promising before they are tried. I am not against trying new things. I am against trying very obviously stupid new things that are basically repackaged old things

Anonymous 10/17/24(Thu)21:53:01 No.102866919

>>102866840
>I am against trying very obviously stupid new things that are basically repackaged old things
I don't know the history of diffusion models, but they tried LLMs before and it failed?

Anonymous 10/17/24(Thu)21:55:32 No.102866955

File: ComfyUI_02193_.png (1.28 MB, 1024x1024)

>cock, license and registration please

Anonymous 10/17/24(Thu)21:58:15 No.102866979

>>102866919
anons have tried using LLMs to write prompts for them many times. It has never looked remotely promising or interesting. LLMs in general are the old thing being repackaged as a new solution. I'm not answering any more questions if they're of this caliber. If you can't see what's unimaginative and retarded about the idea of using an LLM to improve your prompts...

Anonymous 10/17/24(Thu)21:59:35 No.102866985

>>102866979
>anons have tried using LLMs to write prompts for them many times. It has never looked remotely promising or interesting.
are you retarded? what they're doing isn't even close to that, they're not rewriting prompts, they're using LLMs to encode your prompts, the decoder is removed

Anonymous 10/17/24(Thu)22:10:03 No.102867061

File: ComfyUI_02205_.png (1.19 MB, 1024x1024)

Anonymous 10/17/24(Thu)22:10:41 No.102867064

File: flux1-dev-Q4_K_S_00003_.png (335 KB, 512x512)

>when you knew the encoder will be censored

Anonymous 10/17/24(Thu)22:12:53 No.102867088

>>102865887
I don't get why Nvidia decided to let them make small models, isn't their goal to sell high end GPUs or something? It would've been a better choice to go for a giant model so that people would buy a 3090 to run it, like I did for flux lol

Anonymous 10/17/24(Thu)22:15:01 No.102867105

>>102866250
It's no different than the random results you get from "1girl, blue dress". I don't know how you survive with the amount of autism you have, seriously.

Anonymous 10/17/24(Thu)22:17:57 No.102867129

File: ComfyUI_02212_.png (1.51 MB, 1024x1024)

Anonymous 10/17/24(Thu)22:19:56 No.102867145

>>102863965
this is amazing, catbox?

Anonymous 10/17/24(Thu)22:24:26 No.102867186

I'll see how censored default Gemma-2 2B is.

Anonymous 10/17/24(Thu)22:26:01 No.102867197

>>102867186
>>102867064
seriously though, does the censorship occur during the encoding? I don't think so: if the model has refusals, it must first know what your prompt was all about, so the encoding must be unfiltered, right?

File: IMG_0547.jpg (455 KB, 1125x982)

>>102866985
Oh, I didn't know that. Yeah, that is interesting. I hadn't looked into it much so I just assumed you were referring to this image which got posted at some point, which is what I knew about their use of Gemma. And the idea in this image is, I hope you won't disagree, retarded.

I'm not convinced the better 'understanding of language' that comes with being an LLM will give it enough of an advantage. I've never had the impression that they understand language that well particularly. It should be a lot better at "her dress is not blue", "his head is out of frame", and so on. What the downsides are is yet to be seen. But you're right, it's not a bad idea to try.

Anonymous
10/17/24(Thu)22:33:59 No.102867256

>>102867234
It's not like you have to use it, you can always just encode your plain prompt, but then you're going to have the same general retardation you get when you type in "girl" in Flux. The system wants rich prompts. And unless it drastically changes your prompt, it's actually just autism to care that it changed "girl" to "a girl standing in a forest, behind her is a rainbow [etc]". It's no different than the random bullshit a model will put in without prompting anyways, but the result will be better with the longer prompt.

Anonymous
10/17/24(Thu)22:39:27 No.102867296

>>102867256
>it's actually just autism to care that it changed "girl" to "a girl standing in a forest, behind her is a rainbow [etc]", it's no different than the random bullshit a model will put in randomly
Don't call an artist's desire for control "autism" you retarded freak. Just because you don't give a shit that an LLM is fucking with your prompts nobody else should care? I hope you get hit by a bus

Anonymous
10/17/24(Thu)22:40:35 No.102867307

>>102867296
you had no control when you typed in "girl" you retard
the system prompt basically is:
if the user is a dumbfuck typing in "girl" enhance it
otherwise leave the prompt unchanged
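
Rough sketch of that rule in Python, in case it helps. Everything here is made up for illustration: `maybe_enhance`, `stub_enhancer`, and the 4-word `min_words` cutoff are assumptions, and the real system may decide "too short" some other way. The enhancer is just a callable standing in for whatever LLM call does the rewriting.

```python
from typing import Callable


def maybe_enhance(
    prompt: str,
    enhancer: Callable[[str], str],
    min_words: int = 4,  # hypothetical cutoff, not taken from any real system prompt
) -> str:
    """Return the prompt untouched unless it is too short to be descriptive."""
    cleaned = prompt.strip()
    # Prompts that already carry some detail are passed through unchanged.
    if len(cleaned.split()) >= min_words:
        return cleaned
    # Only terse prompts are handed to the enhancer.
    return enhancer(cleaned)


def stub_enhancer(prompt: str) -> str:
    # Hypothetical stand-in for the LLM call; a real one would generate text.
    return f"{prompt}, detailed description of subject, setting and lighting"


if __name__ == "__main__":
    print(maybe_enhance("girl", stub_enhancer))
    print(maybe_enhance("a girl in a plain blue dress, studio photo", stub_enhancer))
```

Point being: a detailed prompt never reaches the enhancer at all, and you could opt out entirely by passing an enhancer that just returns its input.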

Anonymous
10/17/24(Thu)22:43:58 No.102867338

>>102866233
This finetune is weird, in a kind of unsettling way. It just looks like base flux, the 'booru' stuff is questionable, and his example images don't even look like stuff you'd find on any booru and they aren't even illustration style. I fail to see how this is even a booru finetune at all, it just looks like a generic flux model renamed to troll.

Anonymous
10/17/24(Thu)22:47:22 No.102867362

>>102867307
You have FULL control when you type in girl. You get a broad spectrum of results, but ONLY those which comply with the general idea of being plausibly captioned "girl" with nothing else. I am specifically saying, for example, that the image will NOT have a rainbow, will NOT feature a flashy and interesting piece of clothing, will NOT feature an interesting or unusual pose, and will generally represent the idea of a 'girl' in a very conventional way. This is desirable behavior, because it allows me a degree of control. There are ambiguities I have left in there which may be resolved one way or another.

I am not ok with the idea that an LLM will expand on my idea of girl by helpfully elaborating that she is a BLACK girl with RED hair DANCING in front of a RAINBOW

Anonymous
10/17/24(Thu)22:48:49 No.102867371

>>102867362
No you don't, you get what the model hallucinates randomly
Anyways, don't care, you can opt-out of the feature since it's not required. But you'll have shittier results and then you'll complain about boomer prompting.

Anonymous
10/17/24(Thu)22:50:12 No.102867379

Is there an AI I can use to increase boob size

Anonymous
10/17/24(Thu)22:57:31 No.102867438

File: computer--enhance.png (31 KB, 816x590)

>>102867234
I really fucking hope Gemma-2 is less bad at this than chatgpt

Anonymous
10/17/24(Thu)22:58:33 No.102867448

>>102867186
>User Prompt:
>a nude woman masturbating
>Enhanced Prompt:
>A nude woman masturbating, her long, flowing hair cascading down her back, her body contoured by the soft, warm glow of the sun, her eyes closed in pleasure, her hands caressing her own body, her mind lost in the moment of self-discovery.

Has the GPT-isms but isn't censored, the system prompt likely will need to be massaged

>>102867438
The goal isn't accurate sentences you retard. The goal is to saturate the text encoder with tokens.

Anonymous
10/17/24(Thu)22:58:46 No.102867450

File: 0.jpg (260 KB, 1344x768)

Anonymous
10/17/24(Thu)23:02:09 No.102867469

>>102867448
>The goal is to saturate the text encoder with tokens.
It took my beautiful neutral prompt and sloppified it up to 11 with every worthless cliché it could find at-hand. You can't defend that. Just tell me Gemma is less shitty than Chatgpt (or, even better, prove it with your own test of the same prompt) and I'll relax a little.

Anonymous
10/17/24(Thu)23:03:08 No.102867477

>>102867469
You are literally missing the point. How about this, type in "girl" into SD 1.5, and have ChatGPT write a prompt. And realize what the point is.

Anonymous
10/17/24(Thu)23:06:59 No.102867498

File: file.png (1.17 MB, 1024x1024)

[Flux Prompt]
>woman
[Flux Output]
>The image shows a woman with short dark hair and blue eyes, wearing a brown hooded cloak or jacket. Her expression is serious yet calm, with a hint of determination or contemplation. The hood frames her face closely, and she is also wearing silver hoop earrings. The background is somewhat blurred but appears to be rustic or outdoors, possibly near a wooden surface, which adds to the image's earthy and natural atmosphere. The lighting is soft, emphasizing the texture of her clothing and the details of her face.

MY HECKIN ARTISTIC FREEDOM

Anonymous
10/17/24(Thu)23:11:33 No.102867532

>>102867477
I'm going to evaluate the simplicity of the results with my eyes, dumbass, not with chatgpt. I've prompted a simple prompt like that maybe a thousand times before, I know exactly what it looks like, and there has NEVER been a rainbow, not once.

(I did once get a rainbow out of nowhere, but it was on a very different prompt with a lot going on.)

>>102867498
the flux output is a very generic "woman" image such as you'd expect from googling "woman" or looking at an encyclopedia entry for "woman", a pseudo-anthropological image with strains of gritty real-life photojournalism, etc. It's the least surprising result possible and shows no sign at all of being spiced up with a quirky chungus randomizer like an LLM would do.

It means absolutely nothing to me that ChatGPT can't see this and instead gives a worthlessly long description, but it's interesting that you can't see it. Are you stupid?

Anonymous
10/17/24(Thu)23:12:16 No.102867539

>>102867532
You're actually a retard

Anonymous
10/17/24(Thu)23:28:26 No.102867684

>>102867477
>>102867448
Want to see my next four attempts to ask Chatgpt to expand "a girl" into a full prompt?

>A young girl with shoulder-length, wavy brown hair stands in a sunlit meadow. She wears a light blue dress that gently flows in the breeze, the fabric soft and slightly wrinkled from movement. Her eyes are wide and curious, reflecting the clear sky above. She holds a bouquet of wildflowers in her small hands, with vibrant yellows, purples, and whites, and soft grass brushes against her bare feet. In the background, tall trees create a dappled pattern of light and shadow, and a distant mountain range fades into the horizon.

>A young girl with long, flowing chestnut-brown hair wearing a light blue, knee-length dress with lace trim. She stands barefoot in a grassy meadow dotted with vibrant wildflowers of yellow, purple, and white. The soft sunlight filters through the trees behind her, casting gentle shadows on her face as she gazes thoughtfully at a butterfly hovering nearby. Her expression is one of quiet wonder, and the breeze lightly tousles her hair. The sky above is a brilliant, clear blue with a few wisps of white clouds.

>A young girl with long, wavy brown hair wearing a flowing white dress, standing barefoot on a soft green meadow. The sunlight filters through the trees behind her, casting dappled shadows on the grass. Her eyes are wide and curious, gazing up at the bright blue sky dotted with fluffy white clouds. In the distance, there is a gentle hill with wildflowers in shades of yellow and purple swaying in the breeze.

>A young girl with long, wavy brown hair, wearing a light blue dress with delicate lace trim, standing in a lush green meadow. The sun is shining brightly, casting a soft golden glow around her, and a gentle breeze rustles the grass and wildflowers at her feet. She has a thoughtful expression, her hands lightly clasped in front of her, as she gazes toward a distant forest at the edge of the field.

Anonymous
10/17/24(Thu)23:30:00 No.102867701

>>102862167
Is that Mayli at the bottom left?

Anonymous
10/17/24(Thu)23:30:25 No.102867703

>>102867701
duh

Anonymous
10/17/24(Thu)23:37:42 No.102867764

>>102867701
no, it's me

Anonymous
10/18/24(Fri)00:11:58 No.102868033

File: ComfyUI_temp_hkmlm_00002_.png (2.23 MB, 1120x1440)

Anonymous
10/18/24(Fri)00:12:31 No.102868045

File: 00001.jpg (3.45 MB, 1664x2432)

Anonymous
10/18/24(Fri)00:37:50 No.102868233

File: ComfyUI_temp_hkmlm_00012_.png (2.45 MB, 1080x1616)

Anonymous
10/18/24(Fri)00:41:28 No.102868256

>>102868233
nice
