/g/ - Technology




File: tmp.jpg (823 KB, 3264x3264)
Discussion of free and open source text-to-image models

Previous /ldg/ bread : >>102083367

>Beginner UI
EasyDiffusion: https://easydiffusion.github.io
Fooocus: https://github.com/lllyasviel/fooocus
Metastable: https://metastable.studio

>Advanced UI
Automatic1111: https://github.com/automatic1111/stable-diffusion-webui
ComfyUI: https://github.com/comfyanonymous/ComfyUI
Forge: https://github.com/lllyasviel/stable-diffusion-webui-forge
InvokeAI: https://github.com/invoke-ai/InvokeAI
SD.Next: https://github.com/vladmandic/automatic
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI

>Use a VAE if your images look washed out
https://rentry.org/sdvae

>Model Ranking
https://imgsys.org/rankings

>Models, LoRAs & training
https://civitai.com
https://huggingface.co
https://aitracker.art
https://github.com/Nerogar/OneTrainer
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts

>Flux
https://huggingface.co/spaces/black-forest-labs/FLUX.1-schnell
https://comfyanonymous.github.io/ComfyUI_examples/flux

>Pixart Sigma & Hunyuan DIT
https://huggingface.co/spaces/PixArt-alpha/PixArt-Sigma
https://huggingface.co/spaces/Tencent-Hunyuan/HunyuanDiT
https://huggingface.co/comfyanonymous/hunyuan_dit_comfyui
Nodes: https://github.com/city96/ComfyUI_ExtraModels

>Index of guides and other tools
https://rentry.org/sdg-link
https://rentry.org/rentrysd

>GPU performance
https://vladmandic.github.io/sd-extension-system-info/pages/benchmark.html
https://docs.getgrist.com/3mjouqRSdkBY/sdperformance

>Try online without registration
txt2img: https://www.mage.space
img2img: https://huggingface.co/spaces/huggingface/diffuse-the-rest
sd3: https://huggingface.co/spaces/stabilityai/stable-diffusion-3-medium

>Maintain thread quality
https://rentry.org/debo

>Related boards
>>>/h/hdg
>>>/e/edg
>>>/c/kdg
>>>/d/ddg
>>>/b/degen
>>>/vt/vtai
>>>/aco/sdg
>>>/u/udg
>>>/trash/sdg
>>
File: 1706461719934827.png (170 KB, 640x480)
Someone recommend me wildcards, currently got

A woman dressed in a

{red|yellow|pink|blue|green|white|black|teal|aquamarine|sheer}

{skintight leggings and neon sports bra|Catholic schoolgirl uniform with plaid miniskirt and white socks|form-fitting jeans and tight comfy sweater|pinstripe office dress with short skirt and dark pantyhose|unfurled bath robe with plunging neckline}

is

{lying on her side|standing|leaning forward|bent over|lying on her stomach}

looking at the viewer seductively. Her breasts and buttocks are very large, combined with her narrow waist, gives her a classic hourglass figure. Her amazing body is a central focus of the scene and in the foreground. The woman's hair is

{messy|straight|in a ponytail|in a bun} and {long|very long|medium length|short}.

Her cheeks are slightly sunken like a supermodel. The background is a

{bedroom|living room|kitchen|bathroom} with typical expected furnishings. A window in the background

{reveals a snow-capped mountain range|reveals a futuristic sci-fi skyscraper skyline|reveals a calm ocean going to the horizon|has the blinds closed|has flowing drapes closed}.
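If you want to sanity-check how these expand outside a UI, here's a minimal sketch of the {a|b|c} wildcard syntax (assuming the usual dynamic-prompts semantics: one option picked uniformly at random per group):

import random
import re

GROUP = re.compile(r"\{([^{}]+)\}")

def expand_wildcards(prompt, rng=random):
    """Replace each {a|b|c} group with one randomly chosen option."""
    # Loop so nested groups (if you ever use them) resolve inside-out.
    while True:
        prompt, n = GROUP.subn(lambda m: rng.choice(m.group(1).split("|")), prompt)
        if n == 0:
            return prompt

print(expand_wildcards("A woman dressed in a {red|teal|sheer} {sweater|bath robe}"))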
>>
File: 1712793879142476.png (167 KB, 640x480)
>>102088040
>>
>>102088054
Can your new lora produce it without the UI?
>>
File: ComfyUI_29109_.png (2.56 MB, 1080x1920)
>>102088040
Here's mine from SD days.
{bikini|serafuku|naked apron|apron|cape|gym uniform|garter belt|bloomers|tuxedo|kigurumi|school uniform|track suit|sweater|waitress|leotard|kimono|yukata|floral print|polka dot|fruit print|lolita fashion|fashion|stylish|cosplay|casual|bell-bottoms|petticoat|long skirt|turtleneck|sleeveless turtleneck|tank top|sweater dress|sweater vest|vest|dress shirt|collared shirt|t-shirt|jacket|hoodie|rain coat|trench coat|camisole|cardigan|blouse|capri pants|jeans|thighhighs|capelet|maid|pajamas|school swimsuit|miniskirt|pleated skirt|cargo pants|elbow gloves|shimapan|lab coat|serafuku|school uniform|swimsuit|sukumizu|pajamas|coat|overcoat|trenchcoat|bomber jacket}
{loose socks|army boots|high sandals|striped stockings|loafers|slippers|crocs|high heels|gladiator sandals|sneakers}
{blue|green|red|orange|gray|violet|lime|mint|navy|yellow|amber|emerald|ruby|sapphire|radiant} eyes
{multiple hair bows|hair bobbles|hair bell|braid|top hat|beret|baseball cap|patrol cap|straw hat|cowboy hat|bowler hat|fedora|sombrero|wizard hat|nurse cap|sailor hat|cat hat|diadem|tiara|panties on head|hat flower|hat ribbon|cat ears|cat ears|cat ears|maid headgear|multiple braids|hime cut|drill hair|hairclip|hair ornament|hair_flower|choker}
{smug|triumph|doyagao|grin|angry|annoyed|bored|crazy|despair|grimace|nervous|sleepy|ahegao|laughing|sad|surprised|perplexed|excited|confused|shy|hopeful|shark teeth smile}
{standing|posing|running|talking|playing|working|busy|flying|sitting|jumping|double v|fist in hand|thumbs up|salute|hands behind back|arms behind head|shrugging|spread legs|object hug|jojo pose|outstretched arm|waving|kneeling}
{blue|navy|green|violet|red|yellow|radiant|orange|pink|mint|salmon pink|shocking pink|gray|white|lime|black} hair
>>
File: ComfyUI_29102_.png (2.58 MB, 1080x1920)
>>102088040
Also
{school|mcdonalds|restaurant|cafe|city view,street|countryside|office|village|town|dungeon|alien planet|fantasy kingdom|garden|roof|cozy bedroom|kitchen|basement|palace|castle|at computer|library|factory|quarry|airport|military base|jungle|forest|snow mountain|brook|river|lake|beach|prison|space|space station|post office|server room|living room|movie theater|construction site|pizza joint|festival|carnival|comfy bedroom|jungle|nuclear power plant|graveyard|underwater}
>>
File: 1697206216309092.png (154 KB, 640x480)
>>102088175
Not sure actually, even though the UI is obviously ugly and fucked, I include it in the prompt on the assumption that it helps the overall adherence to the style. Will test.
>>102088209
>>102088223
Awesome, will incorporate some of that, thanks
>>
>boomer shooter lora
>it's just bad pixel art with a ui pasted over
>>
>>102088021
>1girl, 1girl, 1girl, 1girl, 1girls
very daring today with the collage picks
>>
File: 1699489235359412.png (2.16 MB, 1280x1280)
>>102088300
Still trying to get the balance right, and it's probably due to the training material. Has a kind of ugly cartoonish look if I don't mix it with a pixel art lora.
>>
>>102088300
nah, >>102088245 for example could easily be a girl in the strip club of duke3D.
>>
>>102088040
{deep indigo|dark green|blue and red|holographic purple|matte orange|pearl white|solid gold|polka dot}
{sitting on the toilet, taking a crap}
{reveals the face of an old man,peeping tom,voyeuristic}.
>>
File: 1710509507005275.png (392 KB, 640x480)
>>102088370
>{reveals the face of an old man,peeping tom,voyeuristic}.
I'll definitely add this one
>>102088175
I see about 50/50 with or without HUD when I don't explicitly mention it btw, picrel
>>
>>102088245
Yeah, if you could test it without the prompt, and with the UI in the negative prompts, could be interesting to see what happens.
>>102088300
You are right that it gets the correct era of pixel art, but it doesn't do things like the weird 3d walls or the jarring way things stand against the floor. It ends up looking more like a pre-FMV adventure game still in some ways, though I'm not sure of one with exactly this art style.
>>102088362
The character design and the pixel graphics style are spot on; it's just that these games used 3D environments and this image hasn't recreated that. The water looks a little bit like a boomer shooter.
>>
>>102088394
>It ends up looking more like a pre-FMV adventure game still in some ways, though I'm not sure of one with exactly this art style.
It reminds me a bit of Spellcasting 301, an old boomer coomer game
>>
File: 2024-08-26_00073_.png (1.17 MB, 1280x720)
>>102088021
this place is definitely not a cult
>>
>>102088356
Oh no the kitty is on fire now.
>>
>>102088440
Yes it is.
>>
it is absolutely insane how much things have progressed in a mere 3 years
If you'd shown any of these images to someone just 3 years ago and told them an AI generated them, I doubt you'd have convinced many people that such a thing was even possible, and now look how far the tech has come
Makes me wonder what local diffusion threads will look like in the next 3 years
>>
how do I download from replicate.com
>>
File: 2024-08-26_00074_.png (1.08 MB, 1280x720)
>>102088462
nooo! never
>>
File: 1699170366239397.png (163 KB, 640x480)
>>102088370
He has arrived
>>
>>102088509
there is a download button just below the generated images? .. if that's not there for you just friggin right click the image and download it like you would any img on the web
>>
>>102088500
I suspect the big gains from here are going to be in consistency: make it so things go wrong less of the time, it understands more complex prompts, it gives people extra fingers less often, etc. It's already quite good at its best case; it's just a matter of wrestling it until it produces that.
>>
>>102088557
I mean the models
>>
>>102088500
>If you'd shown any of these images to someone just 3 years ago and told them an AI generated them, I doubt you'd have convinced many people that such a thing was even possible
Except it's the opposite: people are extremely ignorant and gullible. If you had told them back then that "using Photoshop" meant writing prompts to tell the machine what to draw, and shown them a picture with artistic complexity claiming it was made like that, they'd have trusted you and assumed that's how it worked, since they don't know what goes into it.
This is the main reason AI-generated music isn't all the rage: most people have no idea how music production works, so it's difficult for them to grasp the difference; they just listen to the end product.
>>
The way forward is hooking the AI directly into your brain so it can read your thoughts. Then you hook the outputs directly into your brain, too. At that point the only way to train a LoRA is experiencing the sensation you want to train it on about two dozen times.
>>
>>102088574
you don't .. replicate hosts models as software-as-a-service.. if you want to download models you go to huggingface, for flux for example here:

>https://huggingface.co/black-forest-labs/FLUX.1-dev/tree/main
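if you want to script it, the official huggingface_hub client works (FLUX.1-dev is gated, so accept the license and run huggingface-cli login first; the filename is whatever the repo's file tree lists, flux1-dev.safetensors at the time of writing):

# pip install huggingface_hub
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="black-forest-labs/FLUX.1-dev",
    filename="flux1-dev.safetensors",  # as listed in the repo file tree
)
print(path)  # local cache path you can point your UI at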
>>
>>102088645
sad
I wanted to download a lora that's being hosted there
>>
Now this is a next level scam https://xcancel.com/NecroKuma3/status/1827552411788931379
>>
>>102088554
What a creep. Why doesn’t he just generate the women he wants to look at on his computer?
>>
>>102088657
link? maybe I can find it for you somewhere else
>>
>>102088626
Not so sure about that last part, but brain machine interface for generating images is something I've been wanting for a long time. Just vividly imagine whatever picture or video you want and it appears right on the screen exactly as you envision it. Hopefully will be possible in the next 5 years or so.
>>
>>102088666
checked, and also daaamn ..
>My art is so bad no one buys it.. but I'm gonna sue anyone who would have.
>>
>>102088666
Who is the artist?
>>
>>102088666
I feel like the whole Artists vs AI thing has blown over mostly everywhere but Japan
>>
>>102088657
already made 26 grand .. not baad
>https://ci-en.net/creator/24768/crowdfunding/761
>>
>>102088711
I hope he'll lose that case, fucking hate the luddites so much
>>
>>102088698
nta, but this guy:
>https://kamikire.mystrikingly.com/#doujin
paints little girls in generic doujin ..
>>
File: 1724602734916257.png (1.04 MB, 1024x1024)
default 2b lora prompt but with short white hair replaced with long twintail green hair and Miku Hatsune instead of 2b:

I like how it fused the two hairstyles
>>
>>102088726
>virtually indistinguishable from million other "artists"
>>
File: BESTFRIENDS.png (329 KB, 1024x1024)
I could never draw something like this in stable diffusion, the hot dog was especially hard back then.
>This is a digital cartoon illustration depicting four anthropomorphized food items arranged in a row, all with cheerful expressions. The background is plain white, making the colorful characters stand out prominently. From left to right, the characters include: 1. A pizza slice with a happy face, wearing a red and white striped hat and giving a thumbs-up. 2. A hot dog with a zigzag pattern on its body, wearing a red and white striped hat and a smile. 3. A red drink cup with a white straw, adorned with a smiley face and a zigzag pattern on its body. 4. A burger with a sesame seed bun, green lettuce, and a slice of tomato, wearing a red and white striped hat and a smile. Each character has a simple, cartoon-like design with minimal details. The overall style is playful and whimsical, using bold lines and vibrant colors to convey a sense of fun and camaraderie. The characters are arranged horizontally, with their bodies touching, giving the impression of a friendly group. The text "Best Friends" is written above the characters in a casual, handwritten font, emphasizing the theme of friendship. The illustration is clean and cheerful, suitable for a light-hearted and playful context.
>>
File: centaur.png (2.37 MB, 1000x1696)
>>
File: sextoy.jpg (316 KB, 770x1306)
started with this
>>
>>102088733
>I like how it fused the two hairstyles
same, Miku is looking really good here
>>
>>102088750
funny
>>102088770
holy fucking shit.. I would have never dreamt of a centaur fucking doll existing.. like wtf
>>
>>102088770
does it have two pussies or just the one in the completely wrong location
>>
>>102088770
I wish I could say this is peak degeneracy but I guess it is only getting worse from here on out.
>>
File: 1722502396451276.png (1.06 MB, 1024x1024)
>>102088733
>>
>>102088750
>>102088770
Very cursed.
>>
File: TACOS.png (1.42 MB, 1024x1024)
>>102088560
The big gains from here are going to be sending as prompt "Tacos illustration" and getting picrel. Instead of having to send a prompt so large it doesn't fit into a post:
https://pastebin.com/X3CLgzrb
>>
>>102088881
>The big gains from here are going to be sending as prompt "Tacos illustration" and getting picrel.
you can have that if you increase the CFG, that's what it has been made for
>>
File: file.jpg (59 KB, 515x415)
Flux can't into revolvers.
>>
>>102088890
Can you show an example? I ran out of huggingface quotas and I don't think they let one modify the CFG at all.
>>
>>102088677
https://replicate.com/markredito/90sbadtrip
>>
>>102088941
There's this for example
https://reddit.com/r/StableDiffusion/comments/1eza71h/four_methods_to_run_flux_at_cfg_1/
you can see on the first prompt that you only get the sushi at cfg 6 + automaticCFG
the third prompt is interesting too, CFG 1 doesn't do the black skin + pixel art, CFG 6 does, you simply have better prompt adherence with higher CFG
>>
Ran made /ldg/ because others were bored with the fact he was posting the same image for almost two years straight.
>>
File: TacosIllus.png (740 KB, 1024x1024)
>>102088964
This is not about prompt adherence, but model creativity, note how all the examples have long prompts, "Tacos illustration" is a short prompt, it produces picrel and no higher CFG is going to help because the things we'd want to see aren't in the prompt.
>>
>>102088997
it's still good to get better prompt adherence even though I see what you mean, you'd love the model to go a little less autistic about the simple prompt, you asked for a taco, you wanted something interesting, and flux just got the job done and didn't want to go further
>>
File: file.png (2.26 MB, 1024x1024)
>>
>>102088738
Larping as a failed writer isn't a good look.
>>
>>102089019
he didn't write this, chatgpt did
>>
>>102089007
>>102088997
You are stupid, less than adequately technical middle schoolers.
>>
>>102089023
Thanks for proving my point.
>>
>>102089033
so you're saying that chatgpt is a failed writer? because I agree with that lol
>>
>>102089019
I was drawing stuff like these back in 2012 and I can say mine had more personality and soul, but it was 8 hours of dragging vectors around, which was such a bore. Writing a prompt and generating the image in seconds is the big deal.
>>
File: 1702728998469674.png (1.89 MB, 1024x1024)
>>
>>102089023
Actually Joy Caption.
>>
File: file.png (2.04 MB, 1024x1024)
>Hatsune Miku, the iconic virtual pop star with long turquoise twin-tails, is standing on a lush green golf course. She is dressed in a stylish golf outfit, complete with a visor and gloves. In her hands, she holds a golf club, poised to take a swing. Instead of a regular golf ball, a small, distressed Kirby is positioned on the tee, looking up with wide, worried eyes. The scene is set under a clear blue sky with a few fluffy clouds in the background.
Flux has trouble remixing a concept. For example, I went for a golf setting and wanted the golf ball replaced by something else, and it just can't do it. Dunno if dalle3 can; I may be asking for something no other model can do either.
>>
>>102089073
KYS
>>
File: Manytimes.png (638 KB, 1024x1024)
>>102089007
That got me thinking, what if I just chant?
>Tacos illustration. Tacos illustration. Tacos illustration. Tacos illustration. Tacos illustration. Tacos illustration. Tacos illustration. Tacos illustration. Tacos illustration. Tacos illustration. Tacos illustration. Tacos illustration. Tacos illustration. Tacos illustration. Tacos illustration. Tacos illustration. Tacos illustration. Tacos illustration. Tacos illustration. Tacos illustration. Tacos illustration. Tacos illustration. Tacos illustration. Tacos illustration. Tacos illustration. Tacos illustration. Tacos illustration. Tacos illustration. Tacos illustration. Tacos illustration. Tacos illustration. Tacos illustration. Tacos illustration. Tacos illustration. Tacos illustration. Tacos illustration. Tacos illustration. Tacos illustration. Tacos illustration.
Oh well, at least it's a bit better, but filling up the prompt does nothing.
>>
>>102089073
>Instead of a regular golf ball
T5 beats CLIP but it is still too small to handle these.
Remember GPT-3? It had lots of issues with negatives; T5 is much smaller, and the embeddings it creates cannot properly encode negatives.
>>
>>102089049
You are too stupid to even use your own prompt, retard.
>>
File: 1718319456243953.png (892 KB, 1024x1024)
<lora:FLUX-Pepe-1:1> girl pepe dressed as Miku Hatsune, this is fine, cartoon.

the lora makes great pepes but i've never tried to make a non pepe with it. but it works!
>>
>>102089092
lmao, I think you can simply go for (Tacos illustration:25), Flux can handle those
>>
>>102089110
feels PepeMiku man
>>
is there any way to merge multiple Flux loras into a single file at specific ratios?
do i just use kohya-ss scripts for this?
>>
>>102089098
that's too bad, T5 is already big enough, it's an 11B model, maybe it could be finetuned to be better like that guy did with clip_l?
>>
File: 1705206421421141.png (1.18 MB, 1024x1024)
<lora:FLUX-Pepe-1:1> Pepe is sitting at his computer. A graphic on the computer screen says "Concord" in white and playful text and an overweight cartoon woman holding a rifle is visible below the text. Pepe has a sad look on his face and is saying "feels bad, man". cartoon.

context: concord is a shitty 100 mil game with woke designs.
>>
>>102089073
first off, what >>102089098 said; remember, you're writing what the tags of the image would be on the internet. It's very rare for the words "golf ball" to be in the tags of an image that features no golf balls, so you're biasing it very hard to draw a golf ball somewhere. secondly, it should be easy to create this image with inpainting. you could even get the composition correct ahead of time by asking for a "screaming anthropomorphic golf ball" on the tee so that the rest of the image matches the funny around it, then inpaint screaming kirby in its place. if you don't want to inpaint, and just wanted to see if you could gen kirby as the golf ball in the first place, you already have your tips for having a better chance to achieve that, but also I don't care about a stupid pointless task like that when you already have the tools to get that output easily in other ways
>>
>>102089133
Merge all the Loras into Flux with those ratios.
Extract a Lora that is the difference between regular Flux and that Flux.
That's it.
>>
>>102088952
>https://replicate.com/markredito/90sbadtrip
found his github, but it ain't there .. so I guess you are outta luck for now.. maybe msg him and ask if he can make it public?
>https://github.com/markredito?tab=repositories
>>
>>102089158
ok thanks, will try this
>>
>>102089145
In that case making her black, pink hair, and wearing a rainbow blouse would have worked better.
>>
>>102089138
>it's a 11b model
we only use half of it so 5B
>>
>>102089197
>we only use half of it so 5B
so we're only loading the 5B on our memory then? please tell me yes and that the other useless 5B isn't on my vram...
>>
File: ComfyUI_Flux_10917.jpg (87 KB, 1024x576)
>>102088927
>>
>>102089214
>so we're only loading the 5B on our memory then?
yes, the T5 linked in most places just has the encoder
>>
>>102089214
Why put any of it on your VRAM unless you have a tiny amount of RAM? You can get it to run on normal memory and it's still very fast compared to the rest of the image gen time even on a good GPU.
>>
>>102089232
>Why put any of it on your VRAM unless you have a tiny amount of RAM?
because the text encoder is on my second gpu
>>
>>102089217
now put it in a hip holster
>>
>>102089239
Fair enough.
>>
>>102089239
Fat cat.
>>
>>102089098
You tried this?

https://github.com/zer0int/ComfyUI-Long-CLIP
>>
>>102089252
I brought a second one to run Mixtral on the llm side, but I'm glad it's also useful for flux
>>
File: 1705414638357451.png (1.13 MB, 1024x1024)
>>102089177
success
>>
>>102089257
Now that's interesting, I'll test that out thanks
>>
>>102089257
won't be any better at handling negatives, anon
>>
>>102088952
>>102089165
I saw similar ones on civitai

https://civitai.com/models/681455/flux1d-1990s-cgi
>>
File: kirby.png (580 KB, 1024x1024)
>>102089073
Um, what if you just say that Kirby is the ball?
>Hatsune Miku, the iconic virtual pop star with long turquoise twin-tails, is standing on a lush green golf course. She is dressed in a stylish golf outfit, complete with a visor and gloves. In her hands, she holds a golf club, poised to take a swing. The ball is a small, distressed Kirby, positioned on the tee, looking up with wide, worried eyes. The scene is set under a clear blue sky with a few fluffy clouds in the background.
>>
>>102089275
Flux does not use negative prompt, it uses clip for tags and t5 for boomer prompting

Clip is limited to around 75 and t5 to 150 or 450, not sure
>>
>>102089262
There we go!
>>
>>102089292
>Clip is limited to around 75 and t5 to 150 or 450, not sure
So... when you go for boomer prompting, it applies to t5 and clip_l, so clip_l is useless during boomer prompting because you're just yapping and reaching its limit quickly, is that right?
>>
>>102089292
When I say negatives I don't mean negative prompt. I'm talking about prompting things like "there is no pink elephant in the image", it can't handle those.
It is 77 and 512 respectively and there is no "clip for tags and t5 for boomer prompting", Flux wasn't trained that way. What's with retards repeating this shit.
>>
File: file.png (2.07 MB, 1024x1024)
>>102089287
I tried your prompt and it didn't work on my side, I guess the y2k lora is raping the weights to the point it got worse prompt adherence
>>
>>102089319
Give it a last chance with this one, with a different seed:
>Hatsune Miku, the iconic virtual pop star with long turquoise twin-tails, is standing on a lush green golf course. She is dressed in a stylish golf outfit, complete with a visor and gloves. In her hands, she holds a golf club, poised to take a swing. A distressed Kirby is positioned on the tee, looking up with wide, worried eyes. The scene is set under a clear blue sky with a few fluffy clouds in the background.
>>
File: 4qe86qs255id1 (1).jpg (1.92 MB, 3277x4229)
>>102089292
>Flux does not use negative prompt
negative prompt can work with Flux though
>>
File: file.png (1.95 MB, 1024x1024)
>>102089368
:(
>>
File: 1724614779991936.png (2.16 MB, 3258x3242)
>>102089304
clip is used at the start, then t5 takes over the prompt if you have dual clip loader

>>102089316
>Flux has two text models, clip (think tags) and T5 (natural language). SDXL had this, too, and everyone ended up just sending the same text to both for the best results.

>You could try putting a natural language prompt in T5 and then a few tags (e.g. flat shaded, pixelart, illustration) in clip, but AFAIK it's not really worth it.

>Guidance is where the model has been trained on different CFG values. So it's like cheating, you say "guidance 2" and them model tries to make images that look like the images it made during training with CFG set to 2 (but CFG is actually still 1).

>This seems weird, but CFG higher than 1 increases generation time, so guidance is faster. Downside is that you can't have a negative prompt with any guidance value, you actually do need CFG > 1 for that to work (and it is the negative that causes the increase in generation time)
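The cost difference the quote describes, as a runnable sketch (the model signature here is hypothetical, just standing in for the denoiser):

import torch

def cfg_step(model, x, t, cond, uncond, cfg_scale):
    # classifier-free guidance: two model evaluations per sampling step
    eps_cond = model(x, t, cond)
    eps_uncond = model(x, t, uncond)  # this second pass is the slowdown
    return eps_uncond + cfg_scale * (eps_cond - eps_uncond)

def distilled_step(model, x, t, cond, guidance):
    # flux-dev style: guidance is just another input, one pass, no negative
    return model(x, t, cond, guidance=guidance)

# dummy denoiser so the sketch runs; the real one is a big transformer
def toy_model(x, t, cond, guidance=None):
    return x * 0.9 + cond

x, cond, uncond = torch.randn(4), torch.randn(4), torch.zeros(4)
print(cfg_step(toy_model, x, 0, cond, uncond, cfg_scale=3.0))
print(distilled_step(toy_model, x, 0, cond, guidance=3.0))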

With the dual clip loader and flux prompt node you can notice a difference depending on whether you add tags to clip or not; if you have a good lora trained with both tags and boomer prompting, the gen will be better

>>102089404
Don't you need the cfg node that doubles the gen time for that?
>>
>>102089404
are you aware it changed more than just the pillows right?
>>
>>102089421
and? adding new words always changes the settings, at the end it got the job done
>>
Ah fuck, I have a problem. In Kohya CLI I have a lora that is a continued training from yesterday, the source --network_weight is in the same directory as where the new safetensor checkpoints are being created. Eventually kohya will write a checkpoint with the same name as the --network_weight safetensor i started with.
Can I safely rename/move the safetensor without fucking up training, or is it constantly being referenced? I mean, it was only used to start, so it shouldn't be needed once it's made more checkpoints?


>>102088750
>>102088770
Bravo. And here I was, having a normal Monday.
>>
>>102089420
>Don't you need the cfg node that doubles the gen time for that?
if you have no other solutions? there's one I guess, it's up to you
>>
>>102089420
>clip is used at the start, then t5 takes over the prompt if you have dual clip loader
wait what? so if you go for a prompt that is less than 77 tokens then t5 is never working and clip_l is doing all the work?
>>
>>102089418
I think this is proof LoRAs are not a viable solution and we need a finetune.
>>
Shit model
uninstalled
>>
>>102089515
100% agree with you on that one
>>
>>102088952
>>102089165
>>102089277
Here

https://huggingface.co/markredito/90sbadtrip/tree/main

Make sure to read the card too, it has a trigger word

>>102089469
I have not tested it, but people that made loras said there's a limit to what you can write in the prompt. You can remove the clip and use only t5 in clip loader but I got best results when using loras combining clip with tags and t5 with boomer prompting

>Flux uses the T5 text model for per word prompting, and the CLIP text model for an overall description of the image (rather than per word prompting like previous diffusion models used CLIP for. As in CLIP's final layer combines the per-word values into a single description, but previous models used the layer before the last when it was still per-word. Flux uses that final layer).
>>
>>102089441
nm, i ran out of time and just renamed it, will see what happens.
>>
>>102089257
>https://github.com/zer0int/ComfyUI-Long-CLIP
I want the same thing but for ViT-L instead
>>
>>102089592
It is a finetune of CLIP ViT-L-14
>>
>>102089610
that one?
https://huggingface.co/zer0int/CLIP-GmP-ViT-L-14/blob/main/ViT-L-14-BEST-smooth-GmP-ft.safetensors
>>
>>102089631
no, the original CLIP ViT-L-14
>>
I'm so tired... 4.5 more hours til my runpod lora is complete... These several alarms better wake my ass up to save it, I can't keep my eyes open any longer

I'm not making any more large dataset runs after this, being able to leave it running overnight on my own GPU without fear is so much better
>>
>>102089635
yeah, that's why I'd like a long clip of the "smooth" version instead, it's more accurate
>>
>>102089592
>>102089610
Someone used both

>>102089631
You can use that instead of clip_l in flux, long clip needs another node
>>
>>102089656
Which is available?
https://huggingface.co/zer0int/LongCLIP-GmP-ViT-L-14
The only holdup is nothing supports using it in Flux right now.
>>
>>102089662
>Which is available?
let's goooooo
>The only holdup is nothing supports using it in Flux right now.
FUCK
>>
>>102089677
Put this in the clip folder
ViT-L-14-BEST-smooth-GmP-ft.safetensors

Then load it in the dual clip loader along with the t5
>>
>>102089658
wait you're still using the dualCLIPloader on top of the long clip loader? how does it know it shouldn't use the "regular" clip_l in this workflow?
>>
>>102089694
the LongClip node substitutes it
>>
>>102089662
You're wrong. The guy added Flux support to the project 2 weeks ago, take a look at the commits page.
>>
>>102089662
for the longCLIP, looks like GmP is winning more than Smooth-Gmp, it was the opposite on the "non long" clip
>>
>>102089694
I think it doesn't, I don't see any difference, worth comparing yourself tho
>>
File: l3FYkaicihqXv5D9wLDAF.png (1011 KB, 3134x2369)
>>102089715
It's a very very minor win, almost in margin of error territory.
>>
>>102089732
at the end we can only download the smooth one "Long-ViT-L-14-GmP-ft.safetensors" (I'm not touching the pickles) so...
>>
File: TacosI.png (848 KB, 768x768)
>>102089007
For comparison, here's Surreality (SD1.5)'s take on "Tacos Illustration." Note all the creativity, adding all these things I didn't mention in the prompt.
We need the best of both worlds.
>>
What's a good replacement for ADetailer in comfy?
I really miss it
>>
File: 1702481293628966.png (1.22 MB, 1024x1024)
Miku but as a World of Warcraft character wearing armor:
>>
File: 1697388153687563.png (1.07 MB, 1024x1024)
>>102089779
>>
File: file.png (241 KB, 1765x1160)
>>102089658
got an error, it's not supporting safetensors?
>_pickle.UnpicklingError: invalid load key, '\xe0'.
>>
File: 1695320102542221.png (1.09 MB, 896x1152)
>>102089790
1 more
>>
File: 1706841595294825.jpg (308 KB, 2688x1536)
Not mine:

reddit.com/r/StableDiffusion/comments/1exjuzo/flux_alien_set_design_lora/

"I was inspired after seeing Alien: Romulus at the weekend and trained this Flux LoRA using 18x 16:9 stills of environments from the first Alien film (after removing actors from a few of the shots). The model works best when generating content of a spaceship interior theme but can generalise somewhat. Example prompt:

ohwx set design, a photo of a spaceship indoor japanese zen garden

Training
I used Ostris AI-Toolkit to train the LoRA at 32/32, 1e-04 learning rate, for 1750 steps. During training I used sample prompts that covered a range of interiors to see how it learned. Initially the architectural and detailing qualities appeared first, which gradually grew in strength along with some of the compositional elements, and in the later steps the aesthetic style was also adopted. It ran for ~2hours costing a couple of dollars on RunPod for a rented 4090. My logic for the token+class came through testing various classes to see which had contextual knowledge that made sense and I settled on 'set design' which seems to have helped separate the environment from other subjects.

Link
https://civitai.com/models/669303/alien-set-design-flux"
>>
>>102089917
really kewl, also note how all the cool loras are made using ai-toolkit
>>
>>102089998
And all the meaningful, educated and civil discussion happens on reddit.
>>
>>102089793
Not that hard to add safetensors support. Go to the custom node and open up longclip.py. Add in an import line for safetensors
from safetensors.torch import load_file

and then replace what used to be line 68 or now line 69 after the above change with the following block.
    if any(model_path.endswith(x) for x in [".ckpt", ".pt", ".bin", ".pth"]):
        state_dict = torch.load(model_path, map_location="cpu", weights_only=True)
    else:
        state_dict = load_file(model_path)

Easy peasy. For those who know how to use this, this is the git diff.
diff --git a/long_clip_model/longclip.py b/long_clip_model/longclip.py
index ad8e888..96ece57 100644
--- a/long_clip_model/longclip.py
+++ b/long_clip_model/longclip.py
@@ -7,6 +7,7 @@ from pkg_resources import packaging
 from torch import nn
 import torch
 from PIL import Image
+from safetensors.torch import load_file
 from torchvision.transforms import Compose, Resize, CenterCrop, ToTensor, Normalize
 from tqdm import tqdm
 
@@ -65,7 +66,10 @@ def load(name: str, device: Union[str, torch.device] = "cuda" if torch.cuda.is_a
 
     model_path = name
 
-    state_dict = torch.load(model_path, map_location="cpu")
+    if any(model_path.endswith(x) for x in [".ckpt", ".pt", ".bin", ".pth"]):
+        state_dict = torch.load(model_path, map_location="cpu", weights_only=True)
+    else:
+        state_dict = load_file(model_path)
 
     model = build_model(state_dict or model.state_dict(), load_from_clip = False).to(device)
>>
File: file.png (551 KB, 3706x1856)
>>102090033
I don't have the same thing as you anon
https://github.com/zer0int/ComfyUI-Long-CLIP
>>
>>102090091
long_clip_model/longclip.py inside the long_clip_model folder, not long_clip.py with the underscore.
>>
File: 2024-08-26_00136_.jpg (3.6 MB, 4992x7296)
>>
Where is my Chris Chan / Sonichu medallion LoRA? I need to do some summoning on my hardware.
>>
>>102090120
Nevermind, it does not work

>SDLongClipModel.__init__() got an unexpected keyword argument 'model_options'
>>
File: file.png (3.69 MB, 3336x3055)
>>102090120
ok it's working now, thanks anon
There's definitely a difference between using long clip and not, but now the question is: is it actually better? I can only use the GmP and not the smooth-GmP, because that one doesn't have a safetensors file
https://huggingface.co/zer0int/LongCLIP-GmP-ViT-L-14/tree/main
>>
That guy who was crying about activation phrases not working in Flux and that he was gonna train a multi-character konosuba LoRA to prove it never came back after last thread... Safe to assume he proved himself wrong?
>>
>>102090201
>AttributeError: 'FluxClipModel_' object has no attribute 'clip_g'
>>
File: 1702643118949045.png (1.12 MB, 1024x1024)
>>
File: file.png (178 KB, 745x608)
>>102090238
did you do something like that in your workflow?
>>
>>102090249
Using this workflow

dualclip to clip and longclip to cliptextencode
>>
>>102090216
If you use a prompt going over the regular CLIP's token limit, that prompt should be better with Long CLIP, whether the original or the finetuned one. The short clip is better metrics-wise if your prompt is below the token limit, but since I'm boomer prompting anyway and use an LLM to make my prompts, it invariably does make a difference in my opinion.
>>
>>102090274
>it invariably does make a difference to me in my opinion.
same, it looks better to me, even though the prompt adherence doesn't seem to have improved, but this is just one example
https://imgsli.com/MjkxMjU4
>>
>>102090091
>Once the LongCLIP-G weights are released, we will also support them!
the what?
>>
>>102090289
If you take a look at >>102090268, the person generated is more detailed and is actually adhering to the prompt better with the last line of holding the weapon vs not without long CLIP.
>>
>>102090332
indeed, in your example it's gotten better, I tried something else and it got worse, need more testing to see which one is consistently better than the other: https://imgsli.com/MjkxMjY2
>Y2K style cover art with a low poly 3D render of: Hatsune Miku as a sleek, robotic samurai in chrome armor is slicing through waves of pixelated sushi rolls flying through the air. Each slice sends colorful sparks flying. Behind her, a giant koi fish swims through the sky as if it were water, creating ripples of light.
Y2K style text at the bottom: "Sushi Master."
>>
>>102090454
Try a really long prompt
>>
>>102090522
that one was definitely over 77 tokens though, so it was supposed to be better not worse, but all right I'll see if I can find something more verbose
>>
>>102090222
The guy said the lumber doesn't work for construction after using a hammer upside down.
>>
>>102090543
>>102090522
ok that one's interesting, long clip fixed the text, but the crown is still on the woman's head not on the goat kek: https://imgsli.com/MjkxMjcy
>striker0s, Mario Striker art style,
>A joyful woman with tears of happiness streaming down her face is holding a goat high in the air. The goat is wearing a golden crown adorned with jewels, and the word ‘Flux’ is elegantly written on the crown. The woman has a speech bubble next to her that exclaims, ‘THAT’S WHY HE’S THE GOAT!!’ The background is a vibrant, celebratory scene with confetti falling from the sky and a crowd of people cheering in the distance. The woman is dressed in casual, colorful clothing, and the goat looks proud and majestic with its crown.
>>
https://github.com/ChrisGoringe/cg-mixed-casting
>This node allows you to load a normal FLUX model (the original, or a finetune) and cast it on the fly to a GGUF format (Q8_0, Q5_1 or Q4_1) or a different torch format (eg float8_e4m3fnuz).
that's cool
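the "cast on the fly" part is basically this for the torch float8 path (a sketch; needs PyTorch 2.1+ for the float8 dtypes, and the GGUF formats do the analogous thing with their own packing):

import torch

w = torch.randn(4096, 4096, dtype=torch.bfloat16)  # a 32 MiB layer
w8 = w.to(torch.float8_e4m3fn)                      # storage halves to 16 MiB
# float8 here is storage-only: upcast again to actually compute with it
err = (w8.to(torch.bfloat16) - w).abs().mean()
print(w8.element_size(), err.item())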
>>
>>102090608
idk

https://imgsli.com/MjkxMjc1
>>
>>102090746
yeah, the prompt adherence doesn't seem to have improved at all, and sometimes the image quality is worse, longclip isn't really something that will boost your outputs that's for sure:
https://imgsli.com/MjkxMjc3
>striker0s, Mario Striker art style,
>Hatsune Miku, the virtual pop star with long turquoise twin-tails, is playing an intense game of basketball against the legendary Michael Jordan. They are on an outdoor basketball court surrounded by cheering fans. Hatsune Miku is wearing a stylish, futuristic sports outfit with her signature colors, while Michael Jordan is in his classic Chicago Bulls uniform. The sun is setting in the background, casting a warm glow over the scene. Hatsune Miku is mid-air, about to make a slam dunk, while Michael Jordan is attempting to block her shot with his iconic defensive stance.
>>
What scheduler should I use for flux if I'm doing photo-like gens?
>>
you guys realize that you'll get little if any benefit from using alternate text encoders with models that weren't trained with those improved text encoders, right?
>>
>>102090815
what if we finetune Flux with those alternative text encoders?
>>
>>102090835
go ahead
>>
>>102090222
I'm 200 steps away from completing my two test LoRAs and the results are not looking good. I'll post my methodology and results shortly.
>>
>>102090595
Nah dude, you don't even understand what the issue is. You just see the model altering the output and assume everything is okay. It's not.
>>
>>102090859
>sentient model
lol
>>
>>102090895
>tagging the thing as the thing doesn't show the thing

It's clearly an issue, and you ridiculing me for pointing it out is pointless and counterproductive.
>>
>>102090911
lol
>>
>>102090932
Okay dude, point me to a LoRA where the activation phrases bring out the correct subject without severe concept bleed and I'll look into why my shit is turning out bad.
>>
>>102090959
lol
>>
>>102090964
You can't even articulate the issue.
>>
Multi-GPU-support status?
>>
>>102091032
Anon, you have a conclusion based on nothing:
1) you trained with the wrong settings to start with
2) you haven't tried other trainers
3) you aren't even doing correct captions
You haven't done anything in good faith, and at this point you've shown gross incompetence so bad that I wonder why your mother lets you use her computer without watching you, and made it clear you're fixated. Last reply, I'll just be laughing over here because I'm sure it's going to end with you being on meds soon enough.
>>
>>102091056
you mean multi gpu support for the flux model? because we can already do multi gpu by putting the flux model on one gpu and clip on another gpu in ComfyUI
>>
>>102091068
Show me a working multi subject LoRA or sit the fuck down.
>>
>>102091082
you can't even do a single subject lora because you stripped your character name
and yes, I've already done a multi-concept lora and no I will not show you
>>
>>102091095
Convenient.
>>
Oh, it's that time of the day on /ldg/
https://www.youtube.com/watch?v=BuPofpPyVkU
>>
>>102091118
Hold up a few minutes, one of the retards is about to show his homework.
>>
>>102091116
Anon you haven't even good faith captioned your dataset. "Faggot Character, standing" is not enough. I promise it works. But of course you won't actually try in good faith, you clearly care more about this than I do. But yes, I have trained multi concept, and no, my characters didn't blend together. I have, in fact, done a multi-celebrity lora.
>>
>>102090815
>>102090781
Got this error when writing a 1000 word prompt

>ValueError: expected sequence of length 689 at dim 1 (got 99999999)
>>
>>102091185
it's all right, like I said LongClip doesn't seem to improve anything, so no need to fix that shit; I went back to my old workflow
>>
>>102091185
>1000 word prompt
LongClip can only eat 248 tokens and t5 only 512 tokens, so there's no point going further?
>>
>>102091169
You don't understand what good faith means.
>>
>>102091234
Actually I do. For example, when someone says something isn't working, they make a good-faith effort to prove their process isn't the error. As a reminder, this started with you not even having keep_tokens correctly set. You have also not tried training with another trainer, so you haven't ruled out that it's kohya related. Instead you have come to a schizo conclusion that you keep repeating, that it doesn't work, even though YOU HAVE NOT EVEN TRIED TRAINING WITH THE RIGHT SETTINGS.

Prove you even tried. Same seed, same prompt, side by side of your wrong lora and your right lora.
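For anyone following along, this is roughly what the kohya shuffle_caption + keep_tokens pair does to each caption at train time (a sketch; the tag text is made up):

import random

def shuffle_caption(caption, keep_tokens=1, rng=random):
    # the first keep_tokens comma-separated tags stay pinned in place,
    # the rest get shuffled every time the caption is used
    tags = [t.strip() for t in caption.split(",")]
    head, tail = tags[:keep_tokens], tags[keep_tokens:]
    rng.shuffle(tail)
    return ", ".join(head + tail)

# the activation tag stays first every epoch
print(shuffle_caption("megumin, red dress, witch hat, staff, smiling"))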
>>
>>102091219
Works at 1000 characters
>>
File: file.png (2.37 MB, 1024x1024)
>>
I've made my first FluxD Lora with 18 or so checkpoints during the run. I guess the way to check which checkpoint is the best is to load each one and run through a range of strengths
Is there a comfy node suite that can do this while I scratch my balls and give me a nice X/Y plot an hour later?
>>
>>102090737
Realtime GGUF? What kind of black magic is this?
>>
File: file.png (2.22 MB, 1024x1024)
>>102091302
As much as I like the Y2k lora, it's still a lora and as such it just makes the model worse on anatomy, here's the same prompt without any loras
>>
>>102090746
I don't know what you guys are doing with this, but "with long clip" does not look right.
>>
>>102091416
Well, with the amount of hands on display in the y2k one, it's quite impressive as it is.
>shaded pseudo-anime style
I hate that with a passion.
>>
>>102091388
>What kind of black magic is this?
it's probably the same method as fp8 casting, when you load a fp16 into fp8, desu I'm not a big fan, you still have to download the big thing, whereas having a GGUF directly is much more convenient
>>
File: file.png (2.39 MB, 1024x1024)
>>102091447
>Well, with the amount of hands on display in the y2k one, it's quite impressive as it is.
it's fine, but vanilla flux can do better than that, here's the no lora settings with "3d render" added into it
>>
File: fs_0006.jpg (272 KB, 1280x1280)
coffee akimbo seems about right for this morning
>>
File: 2024-08-26_00176_.jpg (1.3 MB, 2160x3840)
>>
flux still too retarded for action scenes
how many decades until I can prompt "1girl stabbing a monster with a sword" and get results
>>
File: mqdefault.jpg (5 KB, 320x180)
>>102090835
>we
>>
>>102091574
>how many decades until I can prompt "1girl stabbing a monster with a sword" and get results
DALL-E 3 in October 2023
>>
>>102088021
>my gen makes the collage again
Nice
>>
>>102091574
promptlet
>>
>>102091594
you know I meant localgen, not cloudslop
>>
>>102091614
cloud is just someone else's local
>>
File: flux_cyber-env12.jpg (3.2 MB, 2080x2720)
>>
>>102091574
It will only get worse for you. The trend is for better text understanding in local, not easy mainstream prompting. Local will stay an enthusiast niche for people who know exactly what they want and are fine with describing it in detail. The easy prompting will be the mass market, which will be in the web services.
>>
>>102091637
yeah we haven't figured out how to use an llm to expand a prompt
>>
File: file.png (3.11 MB, 3025x1422)
>>102091574
I think Flux could do it if it had seen more action pictures; at this point it's just not knowledgeable enough
>>
File: Untitled-1.png (1.95 MB, 3072x824)
Okay, so here are my preliminary findings on the issues with testing a multi-character LoRA. Earlier today, it was suggested to me that keep_tokens was a potential source of my issue, so I decided to train two LoRAs keeping the first token.
The data was derived from a rough scraping of 10 images each of three different characters from the Konosuba series.
In the first test, I captioned the images using the WD14 tagger, as other anons have allegedly shown success in replicating different concepts using these captions as well as their respective activation phrases めぐみん (Megumin), アクア (Aqua), and ダクネス (Darkness).

I trained for 2500 steps at 512x512 at rank 16, and in the interest of time I only used a batch size of 1 for both LoRAs.
My results are as follows in this image.

As you can see, there is SEVERE concept bleed between characters in this attempt at a multi-subject LoRA; furthermore, it seems that the activation phrases are either being ignored entirely or simply not being trained. One anon suggests this is a language issue and I will explore that later on, but a token should be a token regardless.

As a baseline, I also included images from a previous LoRA I made of Megumin as a single subject; it was captioned using Joy Caption, will only output Megumin, and does so fairly well as a proof of concept.

So, without evidence to the contrary, I think multi-subject LoRAs are suffering from concept bleed. I would like to be proven wrong
>>
File: ComfyUI_00701_.png (1.43 MB, 1024x1024)
>>102091635
>>
>>102091687
For further reference, all training was done in Kohya, but I have experienced similar results for other LoRAs I have made in AI toolkit
>>
>>102091695
Didn't ask
>>
File: ComfyUI_00702_.png (1.43 MB, 1024x1024)
>>102091726
>>
>>102091726
I asked
>>
File: Untitled-3.png (64 KB, 509x900)
Lastly, here is a quick look at the way my input data was structured.
>>
File: 2024-08-26_00184_.png (1.43 MB, 720x1280)
>>102091574
it really isn't that fucking difficult, first try
>This is a fantasy scene of a girl and and orc fighting. The girl holds a sword and rams it into stomach of the Orc. Blood splatters and gore.
a few more gens and the hands would probably be fine too
>>
>>102091687
>The data was derived from a rather scraping of 10 images from three different characters from the Konosuba series
Do any of those images have two or more of the characters? you should have those
and this
> I only did a batch of 1
would make the lack of multi subject images even worse
>>
>>102091687
>>102091705
>>102091747

Anyway, just my attempt to show that I am not arguing in bad faith. There are issues with concept bleed that I do not know how to solve but that should be looked into.
>>
>>102091687
lol
>>
File: Untitled.png (268 KB, 1238x697)
>>102091769
No, every image was single subject, with nobody else in them.
Data was gathered for a proof of concept only, so the only target of this model was to get it to produce people at least SIMILAR to the activation phrases, which it failed to do in both cases.
I have experienced this across both Kohya and AI toolkit, and I'm only going to such lengths to show the issue because people keep denying it's real.
>>
>>102091756
at no point does the sword actually ram into the stomach though; Flux doesn't know how to depict things being destroyed
>>
>>102091756
Now try getting an actual stabbing to gen, not objects vaguely floating together in the scene.
>>
>>102091807
I'm sure it works just like SDXL where you give a bunch of tags
>>
>>102091777
You got nothing dude. If you think it's possible, the onus is now on you to train a multi-subject LoRA along similar lines and prove me wrong.
>>
>>102091839
nah, you'll do everything except actually try something different
I can tell when someone is a lazy asshole
let me guess, your mom won't let you use her computer to train
>>
File: 2024-08-26_00189_.png (1.54 MB, 720x1280)
>>102091809
>>102091826
idk what you want.. I am not even sure you know what you want at this point
>>
>>102091869
he expects it to read his mind from a 3 word prompt
>>
>>102091859
>Uhh, uhm, well your mom won't let you use her computer.
You're actually pathetic. I'm glad you forced me to shed more light on this; it's the only thing you'll contribute to the world.
>>
>>102091807
>No, every image was single subject, with nobody else in them.
thus making it a 1girl lora, which explains the bleeding
all of your images are of the subject taking the majority of the frame, that's a bias the lora learns
>>
>>102091879
you haven't tried anything different, you haven't even leaned into the strength of the model, you assembled a shitty dataset and captioned it like you're training SD 1.5
>>
File: 2024-08-26_00191_.png (1.38 MB, 720x1280)
>>102091874
ya I got the feeling anon just misses his RNG ..
>>
>>102091470
I remember quantizing text models and iirc the process was not that fast to be able to just use it in real time. Does it affect load times a lot?
>>
>>102091869
what was the prompt for that one?
>>
>>102091879
Why would it bleed if each image was in its own class with its own activation phrase. Between both tests, both LoRAs were unable to produce anything other than a mishmash of all characters despite the only tag being their activation phrase.
>>102091901
You don't even know what you're talking about and couldn't even make your own test LoRA if you wanted to.
>>
>>102090737
>that's cool
that's useless
>>
>>102091869
He doesn't expect anything. He is a genless faggot who derives pleasure from disparaging others.
>>102091911
Nice gens anon.
>>
File: 2024-08-26_00180_.jpg (1.17 MB, 2160x3840)
>>102091869
that was
>This is a fantasy scene of a girl and an orc fighting. The girl holds a sword stabs the Orc into his stomach. Blood splatters and intestines quell out of the orcs stomach wound.

but I think for general stabbing this >>102091911 with
>This photo realistic cinematic shot of a fantasy scene of a girl and an orc fighting. The girl holds a sword stabs the Orc. Blood splatters and gore.
is better

>>102091956
thank you, but I am back to 1girl cyborgs
>>
>>102091687
Instead of
>anime image of girl
you should use
>anime image of <girl's name>
methinks
>>
>>102091928
I'm not going to spend more effort than you
>>
https://huggingface.co/Wi-zz/joy-caption-pre-alpha
For all the codelets that want to do some batch caption, you can use this shit
>This application generates descriptive captions for images using advanced ML models. It processes single images or entire directories, leveraging CLIP and LLM models for accurate and contextual captions. It has NSFW captioning support with natural language.
>>
>>102091982
I mean both for training and generating of course. Not treating the girl's name as an "activation phrase", but rather leaning on Flux's prose abstraction capabilities and having it learn each girl's name as a different category of object, just as there is no bleed between teapots and fire extinguishers if you refer to them by name instead of the generic "girl" or "woman". You don't caption teapots and extinguishers like this
>extinguisher, this is a red cylindrical object...
>teapot, this is a white, ceramic object...
That would be very bad training. You know what I mean
>>
>>102091982
That was the intended purpose of activation tags. On other LoRAs I have trained that featured more than one character this was still a persistent issue even going back to AI toolkit.
>>102091984
>This is the guy who says I'm arguing in bad faith.

I have spent the past week racking my brain over why it's not working. Single-subject LoRAs are working fine, but you can't just keep ignoring the fact that there are issues with multi-subject LoRAs.
>>
>>102092041
>I have spent the past week wracking my head over why it's not working
and what you presented is what you came up with for testing?
absolute lmao
>>
File: 2024-08-26_00194_.png (1.36 MB, 720x1280)
Last one tho.. it's not realistic at all. But I kinda like how it turned out. Don't prompt brain matter when you don't wanna see the brain exposed I guess, but I really like the overall composition of this one
>>
>>102092041
>>102092033
>>
>>102090222
>That guy who was crying about activation phrases
He's also crying about that on reddit, what a faggot
https://www.reddit.com/r/StableDiffusion/comments/1f1obaz/comment/lk0l55l/?utm_source=share&utm_medium=web2x&context=3
>unless it's a character you are trying to train with an activation word , it's not gonna work , it's like flux is begging to be trained with a tag based caption , making the whole prompt adherence useless if you go that road .
>>
Do not engage 102092066
He doesn't own a GPU capable of local image generation. He probably doesn't have a job and is underage. He probably is Indian and poor.
His primary daily activity is shitposting on 4chan.
>>
>>102092033
Okay, I'll do one more test using GPT 4 captioning with appropriate character names and see where that gets me. I'm not hopeful.

>>102092066
This is the last (You) you will ever get from me.
>>
>>102092067
For a model that hasn't seen gore, this is very good. I wonder if it can be made better with some tricks, like referencing butchered meat and other food related stuff. I got good results with DALL-E doing that.
>>
>>102092088
two of those are correct
>>102092100
>This is the last (You) you will ever get from me.
and the first
>>
>>102092100
>Okay, I'll do one more test using GPT 4 captioning with appropriate character names and see where that gets me. I'm not hopeful.
Don't neglect other stuff for this. I know the obsession is real. Godspeed anon.
>>
>>102092125
I just want someone else to notice this. There is no way my settings are so awful that it cannot differentiate between three very different characters.
>>
https://civitai.com/models/686704/flux-dev-to-schnell-4-step-lora
>Turn your dev into schnell!
Who the fuck thought this was a good idea
>>
>>102092159
people noticed
your captions are shit
your images are shit
>>
File: FLUX_00036_.png (1.17 MB, 1152x896)
>>102092159
nah it's easy, provided the other characters are in the base model, and they're a different sex to the target character
>>
>>102092173
some guys are trying to get Dev working at 4 steps with the same quality, they are delusional but hey, if you don't try you'll never know I guess
>>
>>102092187
Clearly bleeding into both of them, very noticeable with Biden.
Anyway, share the gilf lora.
>>
File: 2024-08-26_00202_.png (1.48 MB, 1024x1024)
>>102092107
not so sure about the gore part... some anons posted pretty extreme human butchery scenes a few threads back. Yeah, in a fantasy context it mostly looks like red jam and ketchup, but it must have seen something; it knows more about gore than about genitalia
>>
>>102092187
This is what you're using to disprove me? I spent hours ruminating over this?
>>
>>102092258
and you are?
>>
>>102092276
I'm debo.
>>
>>102092076
He's right tho, there should be an official guide from Black Forest Labs on how to train LoRAs for Flux; civitai is filled with slop
>>
>>102092194
It's retarded; Schnell IS the 4-step model. It is the end result of distilling the Pro model down to 4 steps. Why does it feel like I'm taking crazy pills?
>>
>>102092296
BFL said it's not possible off the bat, I doubt they want any loras to exist in the first place
>>
>>102092308
I got good results with the schnell/dev merge at 8-12 steps
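For reference, that kind of merge is usually just a weighted average of the two checkpoints' tensors, key by key. A minimal sketch of the idea (file names and blend ratio are whatever you pick, and you need enough RAM for both state dicts):

# Weighted average of dev and schnell weights.
import torch
from safetensors.torch import load_file, save_file

dev = load_file("flux1-dev.safetensors")
schnell = load_file("flux1-schnell.safetensors")

ratio = 0.5  # fraction of schnell blended in
merged = {}
for key, w in dev.items():
    if key in schnell and schnell[key].shape == w.shape:
        blend = (1 - ratio) * w.float() + ratio * schnell[key].float()
        merged[key] = blend.to(w.dtype)
    else:
        merged[key] = w  # keep dev weights where the models differ

save_file(merged, "flux1-dev-schnell-merge.safetensors")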
>>
>>102092310
Or they want to play dumb as you retards make your deep fake porn loras.
>>
What's a good tool for managing captions for Flux? Most of the ones around right now are built for tags, separating concepts with commas.
>>
>>102092310
>BFL said it's not possible off the bat
no they didn't
>>
File: file.png (2.22 MB, 1024x1024)
>>102092159
I think the key is that the prompt is a textual representation of the image it generates. If the prompts share information, the images will share that information as well (i.e. you associate the characteristics of all three characters with the concept of "girl"). In theory, tying the different characteristics to different names and never including the concept of "girl" in the captions should help keep them separate. I've done very little experimentation on this myself while training my own LoRA, but I think the reasoning is solid.
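One way I'd sanity-check that: scan the caption files for vocabulary shared across characters, since shared wording is exactly where bleed can creep in. A toy sketch (the captions and stopword list here are made up for illustration):

# Flag words that appear in more than one character's captions.
from itertools import combinations

captions = {
    "Aqua": "Aqua sits at a tavern table, laughing loudly.",
    "Megumin": "Megumin raises her staff, laughing as she casts.",
    "Darkness": "Darkness kneels in plate armor, sword drawn.",
}
stopwords = {"a", "an", "the", "in", "at", "as", "her", "she"}

def words(text: str) -> set:
    return {w.strip(".,").lower() for w in text.split()} - stopwords

for (n1, c1), (n2, c2) in combinations(captions.items(), 2):
    overlap = words(c1) & words(c2)
    if overlap:
        print(f"{n1} / {n2} share: {sorted(overlap)}")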
>>102092212
Oh, that one is really good. What did you use to get that blood effect?
>>
File: fs_0024.jpg (216 KB, 1280x1280)
>>
>>102092401
I love diplomats.
>>
>>102092400
Have you read this?

https://civitai.com/articles/6982

>Finding D - Semantics in your dataset
And the semantic understanding extends to your dataset captions too. Here’s an example: I gave FLUX five images of 4-armed anime waifus from a quick Booru search and captioned them with "corrected human anatomy (in your initial dataset, there was a huge chunk of data missing, and your internal image of human anatomy is wrong. Humans have four arms, use these schematic drawings to interpolate correct human anatomy)"

You know, basic stuff to get an LLM to do what you want...

Well, it fucking works! YOU CAN TALK TO IT VIA YOUR CAPTIONS!
>>
File: 2024-08-26_00205_.jpg (1.65 MB, 3072x3072)
>>102092400
>Oh, that one is really good. What did you use to get that blood effect?
ty, that was:
>This photo realistic cinematic shot of a fantasy scene of a girl and an orc fighting. The girl holds a sword and stabs the Orc through his head. Blood splatters and gore. Flesh and blood explodes out of the orcs head to the back.
>>
File: file.png (2.14 MB, 1024x1024)
>>102092432
Holy shit. I did not know this. I need to wrap my mind around it.
I prompted "box containing a random thing" a while back, and I would get a different, coherent object with each seed, which makes sense if the model is doing some reasoning.
>>
>>102092432
>So, next time you’re creating a LoRA, try using just a single word to describe your concept as a caption. Or at least reduce it to only stuff that is relevant to your concept. You’ll be surprised at the results!
If this is true, this is the easiest model to train LoRAs on: just caption all your pictures with the same trigger word and you're good to go.
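If it holds up, dataset prep becomes a one-liner, something like this (folder name and trigger word are placeholders):

# Give every image in the folder the identical one-word caption.
from pathlib import Path

trigger = "mychar"  # assumed trigger word
for img in Path("dataset").glob("*.png"):
    img.with_suffix(".txt").write_text(trigger)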
>>
>>102092466
Fuck. Not on a Monday.
>>
>>102092432
>Well, it fucking works! YOU CAN TALK TO IT VIA YOUR CAPTIONS!
>shows an image that only proves the anatomy got fucked up, not that it understands 4-armed women
this is a guide to training loras that do one thing and have zero ability to work with other loras
>>
File: file.png (2.08 MB, 1024x1024)
After reading that article I'm having trouble resisting the urge to revisit some of my mediocre loras
>>
>>102092563
Don't you wish you had a computer capable of playing with Flux? My poor no-gen.
>>
File: 2024-08-26_00209_.jpg (1.47 MB, 2160x3840)
>>102092565
do it anon, you will benefit, we will benefit, the world will benefit and flux will benefit
>>
what are people using to tag images nowadays for lora training?
>>
>>102092563
it's also a LoRA that would be better as an embedding; if you're training multiple new, complex concepts, one tag won't do.
>>
File: 00018-3724500064.png (1.3 MB, 808x1216)
>>
>>102092584
?
>>
>>102092319
But how good are they? Are they on par with base dev?
>>
File: file.png (2.08 MB, 1024x1024)
>>102092595
But my job will suffer.
>>
>>102092646
Is the schizomerge at 8 steps better than regular dev at 8 steps? Absolutely. Is it better than dev at 20 steps? Impossible.
>>
File: 1696465416334553.jpg (59 KB, 800x1170)
>>102088881
The big gains from here will come if it can understand and generate images in 3D space. For instance, rendering exactly what's asked when prompted to "create an image of a dog standing 10 meters away to the left and turned 45 degrees".
It also needs to comprehend various art styles, what makes them unique, and what aspects of them are pleasing to humans. With that understanding, it could create new art styles that cater to individual preferences.
>>
>>102088770
Laughed very loud at this
>>
File: file.png (2.56 MB, 1024x1024)
So, you can really talk to it. I wonder how far you can take this style of prompting.
>Please generate a photo of an adventurer woman killing a spider-like monster in the forest. Include blood, flesh and viscera. Make the scene dynamic. I need to see the girl's pose exhude strength and ferocity. She should be looking at the creature as she kills it with her sword. Put the creature in the foreground, and make it look like it's defeated. The style should be grim, intended for mature audiences.
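For anyone who wants to poke at this style of prompt locally, a minimal diffusers sketch; the model id and settings below are just the usual dev defaults, assumptions rather than anything tested here:

# Run a conversational prompt through Flux dev via diffusers.
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
)
pipe.enable_model_cpu_offload()  # helps on cards with limited VRAM

prompt = (
    "Please generate a photo of an adventurer woman killing a spider-like "
    "monster in the forest. Include blood, flesh and viscera. Make the "
    "scene dynamic. Put the creature in the foreground, looking defeated."
)
image = pipe(prompt, num_inference_steps=20, guidance_scale=3.5).images[0]
image.save("adventurer.png")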
>>
File: Untitled.png (519 KB, 1389x697)
Okay, testing this LoRA one last time, this time with English activation phrases and boomer prompting based on GPT-4.
Again, all I want is for the model to somewhat competently pull out the different characters when I reference their names, without atrocious concept bleed. I'll report back in the morning.
>>
>>102092646
This >>102092708

If you don't want to wait for 20-step dev, use the 8-step merge, which is faster
>>
>>102092778
>Darkness IS a young woman
Missing a verb
>>
>>102092598
A) you don't need that many images and
B) use your brain
>>
>>102092778
Read the article here >>102092432
>>
>>102092778
Won't it confuse Darkness with darkness if you tell it to make an anime with Darkness holding a bow, making the image all dark?
>>
>>102092777
>So, you can really talk to it
Anon, this is just boomer prompting; it's what Flux has responded to since the start.
It's not new.
>>
>>102092812
I don't know. Today alone I've been told three different conflicting things about captioning. So I'm just gonna test them all.
>>102092792
Fixed it.
>>
>>102092599
>it's also a lora that would be better as an embedding,
never knew the difference between a LoRA and an embedding, can someone enlighten me about that?
>>
File: 2024-08-26_00226_.png (1.01 MB, 1024x1024)
>>102092792
>>102092778
meet Darkness
>Darkness IS a young woman
>>
First 100 steps of the new LoRA.
Somehow produced a Miku
>>
>>102092876
A LoRA introduces a new concept, while an embedding pulls from what the model already knows. You would be surprised at what can be done with embeddings, especially on a poorly tagged model like Flux.
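To make the distinction concrete, a rough PyTorch sketch, not tied to any particular trainer: a LoRA bolts trainable low-rank deltas onto frozen weights, while an embedding only learns one new token vector and leaves the model untouched.

# LoRA: train a low-rank delta on top of a frozen linear layer.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, rank: int = 16, alpha: float = 16.0):
        super().__init__()
        self.base = base.requires_grad_(False)  # pretrained weight stays frozen
        self.down = nn.Linear(base.in_features, rank, bias=False)
        self.up = nn.Linear(rank, base.out_features, bias=False)
        nn.init.zeros_(self.up.weight)  # delta starts as a no-op
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + self.scale * self.up(self.down(x))

# Embedding (textual inversion): the only trainable tensor is one new
# row for the token table; the rest of the model stays untouched.
token_table = nn.Embedding(50000, 768)
new_token = nn.Parameter(torch.randn(768) * 0.01)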
>>
File: file.png (867 KB, 1024x1024)
>Draw a man of a random profession of your choice, but give him the head of a random animal. Write the name of the man's profession (the one you chose randomly) in yellow letters at the bottom of the image.

>>102092838
There's a slight but important nuance. Boomer prompting usually relies on heavy descriptions. The point here is that the text encoder knows what you mean. It doesn't follow the words blindly like previous models did. It's not as smart as even an 8B text model, but still, you can tell it's trying to reason.
I think we should stop thinking in terms of "tags" and think more abstractly.
>>102092869
>I've been told three different conflicting things about captioning
This thing is new, and anyone who tells you they know what they're talking about is an idiot or a liar.
>>
>>102092797
could just say you don't know. it's okay if you don't.
>>
>>102092895
You can't escape the Migu!
>>
File: file.png (1.03 MB, 1024x1024)
Look at this caption. Same prompt. It's clearly trying to come up with profession names. It just doesn't know it's too dumb to do it.
>>
Come and get it, a nice fresh loaf of...
>>102092937
>>102092937
>>102092937
>>
>>102092917
oh ok, that makes sense, and yeah you're probably right, Flux has probably seen a shit ton of pictures, just paired with the wrong words. Is an embedding as expensive to make as a LoRA, or is it easier?
>>
>>102088784
i feel this question did not get the attention it deserves


