/g/ - Technology






File: the longest dick general.jpg (3.09 MB, 3264x1509)
Discussion of free and open source text-to-image models

Previous /ldg/ bred : >>102974813

I'm So Lonely Edition

>Beginner UI
Fooocus: https://github.com/lllyasviel/fooocus
EasyDiffusion: https://easydiffusion.github.io
Metastable: https://metastable.studio

>Advanced UI
Forge: https://github.com/lllyasviel/stable-diffusion-webui-forge
reForge: https://github.com/Panchovix/stable-diffusion-webui-reForge
Automatic1111: https://github.com/automatic1111/stable-diffusion-webui
ComfyUI: https://github.com/comfyanonymous/ComfyUI
InvokeAI: https://github.com/invoke-ai/InvokeAI
SD.Next: https://github.com/vladmandic/automatic
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI

>Use a VAE if your images look washed out
https://rentry.org/sdvae

>Model Ranking
https://imgsys.org/rankings

>Models, LoRAs & training
https://aitracker.art
https://huggingface.co
https://civitai.com
https://tensor.art/models
https://liblib.art
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/sd-scripts/tree/sd3

>SD3.5
https://huggingface.co/stabilityai/stable-diffusion-3.5-large
https://replicate.com/stability-ai/stable-diffusion-3.5-large

>Sana
https://github.com/NVlabs/Sana
https://sana-gen.mit.edu

>Flux
https://huggingface.co/spaces/black-forest-labs/FLUX.1-schnell
https://comfyanonymous.github.io/ComfyUI_examples/flux
DeDistilled Quants: https://huggingface.co/TheYuriLover/flux-dev-de-distill-GGUF/tree/main

>Index of guides and other tools
https://rentry.org/sdg-link
https://rentry.org/rentrysd

>Try online without registration
txt2img: https://www.mage.space
img2img: https://huggingface.co/spaces/huggingface/diffuse-the-rest

>Maintain thread quality
https://rentry.org/debo

>Related boards
>>>/aco/sdg
>>>/aco/aivg
>>>/b/degen
>>>/c/kdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/tg/slop
>>>/trash/sdg
>>>/u/udg
>>>/vt/vtai
>>
File: dena_sd35_00060_.png (970 KB, 1344x768)
>mfw
>>
File: 1724770886791051.jpg (484 KB, 1024x1024)
How do I run local frog memes?
Is there a LoRa for it?
>>
is this worth getting?

https://civitai.com/models/141592/pixelwave?modelVersionId=992642
>>
>>102987755
it's a legit finetune for artistically-inclined content if you want to gen that, yeah
>>
I don't see a legitimate reason to not use the same seed for every sampler.
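Fair point: with a fixed seed, every sampler draws the same starting noise, so any visual difference between samplers is down to the sampler itself. A stdlib-only sketch of the idea (`make_latent` is a hypothetical stand-in for the real latent init, not any actual UI's code):

```python
import random

def make_latent(seed: int, n: int = 4) -> list[float]:
    # Sampling starts from Gaussian noise drawn with the given seed;
    # with the same seed, every sampler sees the exact same starting latent.
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]

# Two runs with "different samplers" but the same seed share their starting
# noise, so any difference in the final image comes from the sampler alone.
assert make_latent(42) == make_latent(42)
assert make_latent(42) != make_latent(43)
```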
>>
File: 2024-10-26_00045_.png (1.55 MB, 720x1280)
>>102987737
It's not the same, but the Pepe lora for Flux can make some pepes.
>>
File: 1698877237844316.jpg (17 KB, 255x255)
>>102987979
Haven't delved into local besides fooocus, and swarm back when sd3 first came out.
Do I have to sell my soul to holocaust survivors in order to run Flux locally or just learn comfyUI?
>>
>
>>
File: PBS.png (585 KB, 1152x896)
>>
has anyone done a real close examination of flux dev fp8 vs nf4? I only use nf4 because fp8 doesn't really work for my small card but I get curious about what I'm missing
>>
>>102988021
comfyui can be learned in minutes
>>
*GASP*
>>
What are some style LoRAs that either don't exist yet or only exist in poor quality/SD 1.5/etc? Have some credits to burn.
>>
cough
>>
>>102988670
nf4 is 5% less coherent than fp8, which is 3% less coherent than bf16, which is 1% less coherent than fp16
hope this helps
>>
>>102988559
this would be a cool thing to animate with minimax
>>
>>102989289
Can you change purple color to chromatic yellow?
>>
>>102989289
How long do these take through the web api?
>>
new noobxl v0.75
>https://civitai.com/models/833294/noobai-xl-nai-xl?modelVersionId=998979
>>
File: ComfyUI_04638_.png (1.66 MB, 1024x1024)
>>102987979

It sorta works
>>
>>102989350
>Can you change purple color to chromatic yellow?
Here you go
She's cute

>>102989432
>How long do these take through the web api?
Like 4 minutes; H100s go brrrr
And it takes like 3 minutes to make a new Gmail account for 2 more gens
>>
>>102987755
he claims he hasn't used generated images in his dataset, so it's less poisoned than your average checkpoint.

The XL checkpoints of pixelwave had the most unique results compared to the other converging slop, which is why I have my hopes up for this one
>>
Why doesn't OpenPose in Forge or reForge work? I don't get it; it feels like user error.
I upload an image, I pick a preprocessor, use one of the models from hugging face, click generate.
Why does it fail?
RuntimeError: mat1 and mat2 shapes cannot be multiplied (154x2048 and 768x320)

this is from Forge, reForge's error is much less helpful.
>>
>>102989517
>brrr
Thanks, idk if you've seen the few examples on reddit of exploding things using a 4090, pretty impressive, as is the HD model.
We're going to be eatin' good, but slowly (lol) once the HD model is local, and if it ends up small enough after quanting, esp if the encoder is released at the same time.

https://i.imgur.com/5yPiXXD.mp4
https://i.imgur.com/CoINM7S.mp4
https://i.imgur.com/rAHVOay.mp4
https://i.imgur.com/3GwELbh.mp4
https://i.imgur.com/wGQx9oK.mp4
>>
>>102989592
Don't know for sure because I've not used that for a long time in Comfy, but I seem to remember OpenPose models for SD1.5 and SDXL both exist, and if the base model you're using doesn't match the OpenPose model type it won't work.
>>
>>102989665
Oh. So the ones from Ilya won't work if I'm using a Pony based model?
That's fucking obnoxious, thanks.
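For reference, that shape error is the tell: 2048 is SDXL's text-conditioning width (CLIP-L's 768 plus OpenCLIP bigG's 1280) being pushed into a ControlNet whose cross-attention expects SD1.5's 768. A stdlib sketch of the compatibility check (the helper is illustrative; the dimension table matches the public model configs as far as I know):

```python
# Cross-attention context widths for the common SD families.
CONTEXT_DIM = {768: "SD1.5", 1024: "SD2.x", 2048: "SDXL"}

def check_controlnet(base_dim: int, controlnet_dim: int) -> str:
    # "mat1 and mat2 shapes cannot be multiplied (154x2048 and 768x320)"
    # is this mismatch: SDXL conditioning (2048-wide) hitting an SD1.5
    # ControlNet projection that expects 768.
    base = CONTEXT_DIM.get(base_dim, "unknown")
    cn = CONTEXT_DIM.get(controlnet_dim, "unknown")
    if base_dim != controlnet_dim:
        return f"mismatch: {base} base vs {cn} ControlNet"
    return f"ok: {base}"

assert check_controlnet(2048, 768) == "mismatch: SDXL base vs SD1.5 ControlNet"
assert check_controlnet(768, 768) == "ok: SD1.5"
```

Pony is SDXL-based, so it needs SDXL-trained ControlNets; lllyasviel's original ControlNet releases were trained on SD1.5.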
>>
>>102989615
First one is great, maybe I'll gen some corpo office towers blowing up (cough ClosedAI cough) for my cyberpunk video too. Was the prompt shared for the exploding building gen?
>we're going to be eating good, but slowly
The eating gets truly good when the HD model is local AND someone makes PonyGenmoHD
>>
>>102989764
>prompt
No it wasn't, CliffDeNardo was the user who posted them.
>>102989688
idk, download both openpose variants and try and see.
>>
>>102989820
>No, it wasn't
boooooo more like CliffDeFarto
but whatever it should be a trivial thing to generate. Hopefully it's not blocked for terrorism or something stupid on the website...
>>
File: 439631825.png (1.58 MB, 1344x768)
>>
>>102989865
Michael Bay is the "king" of explosions in films, might help your prompt get past any censor filter if you have problems.
>>
File: 196009930.png (1.64 MB, 768x1344)
>>
File: 306890738779299843.webm (553 KB, 720x720)
>>
File: 1939923965.png (1.58 MB, 896x1152)
>>
File: 1805667023.jpg (3.54 MB, 2048x2048)
>>
File: 00009-3364060907.png (1 MB, 1024x1024)
>>102989688
>>102989820
well that took some fucking around and googling but i got it working. thanks anon, i would have been slamming my head against that for hours trying to figure out what I was doing wrong.
Keeping it simple right now until I get the hang of it.
>>
When it comes to learn the math and read the code, what's the best open source project that still gives ok results?
>>
>>102990804
https://github.com/huggingface/diffusers
https://github.com/huggingface/transformers
>>
mochi gets a 5% speedup on cublas with fp8; gguf Q8 and Q4 throw up "cublas api failed", reee. (I saw today's Q4 fix, I'm up to date)
>>102990763
>head slamming
np anon, small steps, happy prompting!
>>
>>102990931
>diffusers
Thanks. Got a pitch for why this is a better approach than another repo, even on huggingface?
>>
File: 1520640549.jpg (3.22 MB, 2304x1792)
>>
Just want to say it again, you can full finetune Flux 8B on 24 GB of VRAM with Kohya without block swapping.

>ayjank
>>
>>102989473
<3 She's beautiful!
>>
I'm giving training SD3.5 a shot
>>
>>102988021
just download a workflow and flux, you can run it on 12GB of VRAM almost losslessly
now, you'd need a pepe lora; I'm not sure if there are pepe loras for flux or sd
>>
I'm gonna go outside my comfort zone and try to generate.... 3girls
>>
Can I run a different prompt at certain steps? ie can each step have a different prompt? Obviously, typically I'd only want to change the prompt once or twice.
>>
>>102990223
How about without smiling?
>>
File: ComfyUI_Flux_14894.jpg (192 KB, 704x1472)
>>
>>102990763
https://files.catbox.moe/q7k4ia.png
catbox because I feel like this might be a bit to risque for a blue board.
I'm just impressed at how well it did with the glasses, these are the best looking glasses I've seen come out of AI. I didn't even ask for them.
>>
>>102991769
Yeah
>>
>>102991125
>Got a pitch why this is a better approach than another repo
Developed by professionals, not hacked together by amateurs
>>
File: ComfyUI_Flux_14946.jpg (248 KB, 704x1472)
>>
"group of three" doesn't work, gonna try prompting for crowds
>>
File: 2024-10-27_00001_.png (754 KB, 720x1280)
>>102991720
>>
>>
>>102992074
is this how to do it?
>Bracket notation: Use brackets [] to define changes in the prompt. For example: [large::0.1] [cat|dog:0.05] will start with “a large” and switch to “cat” or “dog” at 0.05 steps.
>>
>>102992325
godspeed anon you can do it.
>>
File: file.jpg (302 KB, 944x1280)
>>102992357
pepe and warhammer loras were fun for a few gens
>>
>>102992367
>>102992074
trying
>anime Pepe [dog:cat:10]
>>
>>102992399
topkek

I need to try inpainting with pepe
>>
>>102992442
Still genning, but clearly it just did the same as if
>anime Pepe dog cat

It's not changing the prompt midway.
>>
>>102990223
Man, it's so clean, very nice, compared to the local results in >>102989615, which seem only good as long as there are no people in the scene.

I wonder if that means mochi was mostly trained on non-copyrighted stock footage without anyone in it.
>>
>>102992357
I say this and look like that you got me
>>
File: 2024-10-27_00002_.png (715 KB, 720x1280)
>>102992442
The result is clearly not the prompt changing from
>anime Pepe dog
to
>anime Pepe cat

but is just
>anime Pepe dog cat
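That's expected: `[from:to:when]` is A1111/Forge prompt-editing syntax, and the stock ComfyUI text encoder doesn't parse it; the brackets just get fed to CLIP, which is why the gen reads as "anime Pepe dog cat". The scheduling rule itself is simple. A stdlib-only sketch of it (function names are made up; real prompt editing swaps the conditioning mid-sampling rather than just the string, and in Comfy you'd need a custom node for that):

```python
import re

# One A1111-style edit: [from:to:when]
EDIT = re.compile(r"\[([^:\[\]]*):([^:\[\]]*):([0-9.]+)\]")

def prompt_at_step(text: str, step: int, total: int) -> str:
    # "when" < 1 is a fraction of total steps, otherwise an absolute step;
    # before the switch point use "from", at/after it use "to".
    def sub(m: re.Match) -> str:
        before, after, when = m.group(1), m.group(2), float(m.group(3))
        switch = when * total if when < 1 else when
        return before if step < switch else after
    return EDIT.sub(sub, text)

# ">anime Pepe [dog:cat:10]" over 20 steps:
assert prompt_at_step("anime Pepe [dog:cat:10]", 5, 20) == "anime Pepe dog"
assert prompt_at_step("anime Pepe [dog:cat:10]", 15, 20) == "anime Pepe cat"
```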
>>
>>102992325
Could try "group of friends", useful for generating less "samey" faces in SD compositions with multiple people; might translate across.
>>
File: Mochi_preview_00015.webm (589 KB, 856x480)
I hope mochi hd version is good
>>
File: 155616_00001.webm (1.64 MB, 854x480)
fp8, 100 steps, cublas, 67 frames, 2501s, 4060ti
>>
>>102992967
yjk
>>
File: fluxUPscale_0127.jpg (1.06 MB, 1544x2696)
my gang
>>
>>102987979
>>102988021 (me)
>>102991720
>>102992399
So if I were trying to make a quick guide for autistic gamers with 12gb vram
it would be something like...

> Learn ComfyUI https://github.com/comfyanonymous/ComfyUI
> Get flux https://comfyanonymous.github.io/ComfyUI_examples/flux
> Get the Pepe LoRa https://civitai.com/images/24172831
> ?????
> PROFIT
>>
>>102993255
seems p straight forward to me
>>
>>102993255
I'd go with Forge instead
>>
>>102992372
>godspeed anon you can do it.
might have to just repeat "group of 3 girls" over and over, genmo has maxed out at 2 over 10 attempts

>>102992646
>Could try "Group of friends" useful for generating less "samey" faces in SD compositions for multiple people, might translate across.
I'll try "group of girl friends", sometimes even with two girls it makes one guy and one girl
>>
File: 165704_00001.webm (647 KB, 854x480)
>>102992990
That's disgusting anon, how could you even think of something like that?
>>
>>102993531
maybe try to describe each person?
>>
>>102993531
>>102992360
>>102992325
Alright now it's radical.
>>
>>102993671
>maybe try to describe each person?
Not a bad idea, I'll make one blonde one brunette one black hair, but I'll need to try again in 2.5 hours

>>102993679
>Alright now it's radical.
if prompt adherence was better genmo would be wicked tubular
>>
>>102993531
I'll try 3 girls on local, about 40 mins to generate, because I like the quality of 100 steps on fp8
>>
>captioning some pics
>The image is a digitally rendered scene from a video game, specifically from the 2006 game "Saya no Uta."
I-it knows?
>>
File: ComfyUI_11939_.png (1.34 MB, 776x1024)
>3 months after Flux.1
>still not a word about Flux Video
It's vaporware, isn't it?
>>
>>102994081
BFL are very secretive about their work, hell none of us even knew they were doing anything, or even existed, until Flux released.
It would be funny if they released it right after the election though.
>>
>>102994081
don't trust these hoes everything is vaporware until it's released
>>
File: ComfyUI_Flux_15008.jpg (194 KB, 704x1472)
trying to gen me a new phone wallpaper
>>
Death march
>>
>>102994172
Get yourself one of those gallery widgets and have it scroll through all your fav gens.
>>
File: ComfyUI_04651_.png (2.01 MB, 1024x1024)
>>
File: 181101_00001.webm (520 KB, 854x480)
>>102993531
>3 teenage irish young-women wearing traditional clothes are sitting in a small 16th century cottage, in front of each one of them is a spinning-jenny which they are using to make cloth, a fireplace burns in one corner and bolts of cloth are piled up against one wall. A boarder collie dog is laying down asleep in front of the fire.
Seemed to work ok, but I had to close the browser so am waiting 15 mins to post this; I'll start another "3 girls" gen, should be 30 mins from this post.
>>
>>102994721
Looks like I need to rework my prompts. I'll throw it into an LLM for modifications once my 6 hour limit expires
>Seemed to work ok, but i had to close browser so am waiting 15 mins to post this, i'll start another "3" girl gen, should be 30 mins from this post.
I saw a thread where you can get a temporary inbox for Gmail addresses, maybe find it in the archives if you're interested
>>
File: ComfyUI_04656_.png (1.43 MB, 1024x1024)
>>
>>102988021
just use flux, comfy is mostly autism for tinkerers
>>
File: ComfyUI_04663_.png (1.81 MB, 1024x1024)
>>
File: 191051_00001.webm (2.64 MB, 854x480)
>>102994800
Thanks, I may have to do that; I know iCloud or whatever it is lets you make millions of aliases a day or whatever.
I'd focus on putting the most important thing at the front of the prompt
>3 mexican women wearing traditional clothes are at an open air cooking area preparing a large meal for eating doors, they are underneath a large shade, stirring a large cooking pot and chopping up vegetables and meat and preparing soft tortilas on a floured surface, they are happy and smiling at the camera as they work, it is a sunny day.

Turned out largely as expected.
>>
>>102994721
Poor dog...
>>
>>102994968
>largely
kek
>>
>>102994975
ikr, the other half of him is in front of the fireplace >< maybe he's magical and warming his fuzzy trotters... idk
"Front", "back" and "behind" are understood better by models now, but there's room for improvement.
>>
>>102994975
Like you're so much better at dog storage.
>>
>>102994919
why are you here if you're not an autistic tinkerer?
>>
>>102994986
Yeah
Human: AI, what does the average mexican woman look like to you, describe her?
AI: She's fat <end of line>
>>
File: ComfyUI_04668_.png (1.77 MB, 1024x1024)
>>
File: 2024-10-27_00003_.png (1.08 MB, 720x1280)
>>102994919


>>102995019
cool! luv books, me
>>
>>102989289
>coherence
not really usable info for me. I was wondering more whether it shows better grasp of niche/subtle prompt terms, rare concepts, does it display wider potential of gens, can it push higher cfg without cooking, is it less prone to sameface, etc
>>
What's Illustrious? I've seen LoRAs appearing on civitai that are marked IL for something called that.
>>
>>102989143
Early video game FMV. Think command and conquer video games, the opening of the original resident evil, many adventure games, wing commander 4 etc. But I'm not sure it's easy to make a great lora for, because of the varying low quality artifacts and all that shit.
>>
File: ComfyUI_34424_.png (903 KB, 848x1024)
>>
>>102995106
That's actually pretty similar to what I'm doing (a distorted CRT hyper-stylized 90s JRPG/VN aesthetic). I have had zero success in local captioning so I'm slowly plugging through it manually, hopefully I'll have a lora by tonight.
>>
>>102995083
confidence intervals are going to overlap for each of the quants so you'll never be able to get a statistically significant result anon. That's why I gave you the answer I did.
>>
>>102995127
Sounds cool, but wouldn't that be more prerendered 3D animation and 2D animation oriented, rather than actors on a set or green screened onto a prerendered 3D background?
>>
>>102995342
Oh yeah you're right, forgot what FMV meant, was picturing cutscenes broadly.
>>
>>102994968
>Thanks, i may have to do that
I was struggling getting a nice looking prompt until I had Claude help me out a little
>largely
I wonder if using "Mexican" as a token compared to "latina" weighs it more (lmao) towards making fat people
>>
File: 2024-10-27_00004_.png (1.38 MB, 720x1280)
>>102995378
Could be random. Here's Flux with your prompt.

This may sound stupid, but is mochi not good at img2img? It sounds to me like a video model would literally be img2img.
>>
>>102995394
flux is biased towards making beautiful people. genmo is too but not as much since video has to fundamentally pull from more "base truth" about reality than a model trained on Instagram pics
At least that's my theory

Mochi can't do image2image or image2video because the VAE encoder was not released, according to another anon
>>
File: ComfyUI_04691_.png (1.62 MB, 1024x1024)
>>
>>102995423
>flux is biased towards making beautiful people
to some extent all image gen is because people are biased towards reposting images of beautiful people. Ugly people post their cats as their profile pics. etc.
>>
>>102995423
ok, how about this?

>beautiful, pretty, good looking

in the negative


also I forgot I'm using a painting lora, Leighton.
>>
>>102995458
yeah you're right it's not just flux
my point is that if you scrape 1000 videos of the real world, most won't be photoshopped (maybe they'll look nicer with saturation but the people will look like they do irl) but if you scrape 1000 images of people a sizable amount will be facetuned

>>102995463
idk anon try it out. from what I remember with other anons experiments flux doesn't really understand what it means to be "ugly" so you should prompt stuff like "tired, acne, blemishes" etc
>>
File: 2024-10-27_00005_.png (1.37 MB, 720x1280)
>>102995463
>>102995423
>>102995394
>>102995378
I think it's WAY nicer.
>>
>>102995544
>>102995458
imo these models tend towards averaging featurings, not in the sense that they'll make average faces, but that the average of a 1000 faces is symmetrical and safe-attractive
outlier attractive faces are the ones that stick in people's memories, and the models just have trouble with that
>>
>>102995586
>averaging features*
>>
>>102995586
juggernaut was able to teach SDXL to only make good looking people when "beautiful" is in the prompt, and normal looking imperfect people otherwise so it's definitely possible
>>
>>102995579
now s/3 mexican/3 obese mexican
>>
File: 2024-10-27_00006_.png (1.35 MB, 720x1280)
>>102995639
wow

Now genning the same, but with a more extensive negative:
>beautiful, pretty, good looking, gorgeous, lovely, stunning,
attractive, exquisite, elegant, charming, alluring, radiant, resplendent, comely, fair, pleasant, dainty, delightful, captivating, enchanting
>>
>>102995586
No, the problem is these models filter based on aesthetics ratings which bias towards professional photography. Also I'm pretty sure their dataset is biased towards certain actual models (barely disguised creator's fetish). If you actually train a model on random Flickr photos without cherrypicking, you will get more realistic people.
>>
>>102995586
I think the models are capable of extrapolating outside of the mean. So like:
sample 1: 5
sample 2: 500
sample 3: 414
I think the model will consider 600 reasonable, and even 8 not that likely. idk, something like that, I think the training is not the boundary.

not sure, but I think so.
>>
>>102995728
notice the food improved :^)
>>
File: 2024-10-27_00007_.png (1.36 MB, 720x1280)
>>102995728
result.

The next one is a dichotomy, we'll see what it does...
>>
>>102995544
>yeah you're right it's not just flux
To be clear, I completely agree that Flux is worse than average in this regard. I don't know why I'm arguing the opposite.

FLUX has an enormous untapped knowledge of average and ugly people too, but for some reason you really need to force it with tricks because prompting naively with an empty latent is unlikely to get you there. I don't generally try to gen ugly chicks but I do try to gen women who aren't "instagram pretty", and at first I found Flux very frustrating and didn't like it. Now it's all I use.
>>
>>102995866
skinny apparently is synonymous with pretty, in Flux.
>>
>>102995836
(I canceled the experiment) tldr Flux can do a ton, it's crazy.
>>
>>102995866
What do you use to make chubs?
>>
>>102996007
just with the word "chubby" usually, in a construction like "chubby in a good way" "chubby in all the right places" etc., some conventional way of saying hot-chubby. The more important thing is the rest of the prompt setting the scene in a way that suggests a normal person's facebook pic
>>
File: 1708836395014367.jpg (22 KB, 400x400)
>>102994919
How to apply a LoRA without local tho?
I'm a noob to Flux or anything beyond fooocus
>>
>>102996157
>without local
This is /ldg/, I have only done local.
>>
File: file.png (305 KB, 1962x1444)
>red_panda
Chinaman cometh?
>>
>>102996732
That's mostly a test of adherence, since the prompts are complex. Better adherence is very good news potentially.
>>
File: 1708772641460420.jpg (56 KB, 568x568)
>>102996677
chegged
am about to try forgeUI after struggling with swarmUI
got any tips?
>>
>>102996771
Wish I did, I use ComfyUI
>>
Is there any reason I can't or shouldn't use one or more Tesla p40s with an RTX 4060ti or any other gayman card?
>>
File: ComfyUI_temp_zgmyc_00005_.png (2.72 MB, 1152x1920)
>>
>>102996844
bräääääp
>>
>>102995107
get the fuck off my keyboard im trying to prompt
>>
File: ComfyUI_temp_zgmyc_00012_.png (2.83 MB, 1152x1920)
anon, your tongue, my soles, now!
>>
>>102996965
feet are gross id rather tongue her butthole
>>
anyone find a way to stop mochi from doing that zoom effect? I want the camera frame to be locked.
>>
>save outputs as .jpg
>temp folder wiped clean
>workflow completely lost
god FUCKING damnit
>>
>>102996965
The human being in this photo is alright (besides having left her hand on the other stool after losing it in a terrible accident) but man every background detail is just fucked.
>>
File: 1708584205805588.png (2.18 MB, 1280x1280)
>>
File: Cog_00008.webm (504 KB, 720x480)
Hot load
>>
File deleted.
>>
File: ComfyUI_temp_zgmyc_00035_.png (3.66 MB, 1152x2160)
>>
File: 306809434104672259 2.webm (977 KB, 1280x720)
I need to get better. I find it difficult to zoom out and also keep her tongue stable.
>>
I'm gunna pull
>>
File: ComfyUI_temp_zgmyc_00042_.png (3.72 MB, 1152x2160)
>>102997514
how long does it take you to gen that or are you using the online service?
>>
File: 1717288406449638.png (6 KB, 310x107)
>>102996844
i went to school with her
>>
AI isn't really a mature tech until I can take a small-breasted normie's instagram pic and use it as the first frame to generate an r/biggerthanyouthought video where she reveals she had enormous breasts all along
>>
Using artificialanalysis.ai's arena at the moment, I occasionally get pictures it attributes to a model called "neptune_next" but it's not on their leaderboard. This some new thing?
>>
>>102997629

This is already within reach
>>
File: 002317.jpg (2.16 MB, 1664x2432)
>>
>>102997610
Online.

Dancing around banned words and phrases.

I actually had a lot of success taking a picture of a room and adding a main character interacting from my POV
>>
File: 00001-3874033555.png (1.04 MB, 1344x768)
>>
>>102997714
why he sad
>>
>>102997358
what teh fuck
>>
File: ComfyUI_00158_.png.webm (1.4 MB, 848x480)
>>
File: 00005-3062886786.png (1.04 MB, 1344x768)
>>102997747
Hes in a war.
>>
>>102997854
A big poo
>>
File: 002323.jpg (2.94 MB, 1664x2432)
>>
File: ComfyUI_temp_kosgp_00003_.png (3.5 MB, 1152x1920)
>>
what sampler do you use? I'm still using AYS
>>
File: ComfyUI_temp_xjqly_00015_.png (3.37 MB, 1152x1920)
>>
>>102998436
i havent tried the last 22 samplers that were added
for schedulers: https://github.com/Extraltodeus/sigmas_tools_and_the_golden_scheduler
>>
>>
>>
>>
>>102998287
>>102998592
THIS THREAD IS FOR AI IMAGES ONLY
>>
>>102998592
Good shit. Could you do one with a bit of a tummy?
>>
>bob cut
>mask
>simple shirt with a skirt
What is this archetype called? You see them all the time in those fake upskirt videos.
>>
File: SO HECKING COOLERINO.jpg (210 KB, 1540x850)
>>102994919
Comfy is the opposite, it's a shitty nu-design trash heap for people who can't handle anything that isn't putting blocks in shaped holes-tier googoo shit.
>>
>>
googoo gaagaa
>>
>>102998637
Holy kino.
>>
>>
File: 1727499707438309.jpg (167 KB, 1024x1024)
>>102994919
>>102996677
>>102998637
Explain like am tarded
>I am
What app do I use besides fooocus to run flux?
>or can fooocus just run it..?
>>
>>102998828
I just followed the ComfyUI instructions on how to install. You aren't on Linux, so my special problems won't relate to yours. The instructions will have you genning first with SD 1.5. This basically will show that your gpu is working and ComfyUI is working.

as for other apps, no idea.

Anyway, after getting 1.5 working, you look at guides to getting comfyUI working with flux.
>>
File: 2024-10-27_00013_.png (1 MB, 720x1280)
>>
File: 2024-10-27_00014_.png (1.16 MB, 720x1280)
>>102998957
Flux doesn't know the vulcan salute?
>>
>>102992357
nice
>>
File: 002353.jpg (2.74 MB, 2432x1664)
>>
File: 002357.jpg (1.89 MB, 2432x1664)
>>
File: 1700859425328903.png (760 KB, 768x768)
>>
>>102987755
>https://civitai.com/models/141592/pixelwave?modelVersionId=992642
the examples arent bad
>>
File: 1718156774449445.jpg (636 KB, 1280x960)
>>
>>102998828
Anon I use forge. I followed this guide, though it's a bit dated.
https://github.com/lllyasviel/stable-diffusion-webui-forge/discussions/981
It specifically tells you how to use the BNB NF4 version though, which might not be best for you.
>>
File: file.png (2.64 MB, 1024x1024)
>>102987755
Idk man, I really like the fact it does a lot of styles, but we lost a lot of detail in the process. Detail is one of Flux's biggest strengths, and this finetune kinda destroyed that
>>
File: ComfyUI_00221_.png.webm (1.95 MB, 848x480)
>>
>>103000471
that's local mochi right? Damn I can't wait to try out the image2video, that'll be fucking epic
>>
>>
File: ComfyUI_00252_.png.webm (1.07 MB, 848x480)
>>103000477
yeah this is running locally. I don't get why this isn't more popular because it's actually extremely good.
>>
File: file.webm (2.16 MB, 856x480)
>>103000530
>I don't get why this isn't more popular because it's actually extremely good.
it's asking for a lot of vram, it's really slow, and for realistic shit it can easily shit the bed; it's nowhere near the level of what they displayed: https://x.com/genmoai/status/1848762405779574990#m
>>
>>103000530
It's low resolution, doesn't do img2vid, someone told me it has scary requirements needing giga graphics cards, and there's no porn finetune.
That's my guess, anyway. Honestly, I'd download it in an instant if I believed it was worth trying on a 3060. I believe the next versions will be better, and more popular.
>>
>>103000530
> I don't get why this isn't more popular
requires a lot of vram and takes a long time to gen, experimentation is unviable unless you have infinite patience or are using some of the top hardware available.
>>
>>103000549
>I'd download it in an instant if I believed it was worth trying on a 3060.
it works on 16gb of vram, but not on a 3060, at least if you want decent quality with Q8_0
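Napkin math on why Q8_0 fits in 16gb but is tight below that (assuming the ~10B parameter count mentioned elsewhere in the thread; GGUF Q8_0 stores 8-bit weights plus a per-block fp16 scale, about 8.5 bits per weight):

```python
params = 10e9              # mochi's transformer, per the thread (assumption)
bits_per_weight = 8.5      # GGUF Q8_0: 32 int8 weights + one fp16 scale per block
weights_gb = params * bits_per_weight / 8 / 1e9
assert abs(weights_gb - 10.625) < 1e-6
# ~10.6 GB for the weights alone, before activations, text encoder, and VAE,
# which is why 12gb cards like the 3060 struggle at this quant.
```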
>>
File: ComfyUI_00070_.png.webm (1.74 MB, 848x480)
>>103000542
it does realistic fine
>>
>>103000530
we're waiting for img2vid and the HD version, because so far what we have locally is way better than CogVideoX, but not quite good enough to be truly enjoyed

For img2vid, it'll definitely be a thing; we need the VAE encoder and they'll provide that in the near future
https://github.com/genmoai/models/issues/9#issuecomment-2430585334
>We will be open-sourcing the encoder, stay tuned :)
>>
File: file.jpg (1.64 MB, 3840x3210)
Have you guys tried euler_ancestral (it's working on flux now) and the new "linear_quadratic" scheduler?
>>
>>103000455
>its details are amazing, and this finetune kinda destroyed that
yeah I came to the same conclusion on that model as well, which is a shame because that guy made the first real finetune of flux; it took him 5 weeks, he wanted to make it less rigid with styles, and he had the best spirit of them all. At least it shows that it's possible to finetune Flux, because he went for the distilled version
>>
File: ComfyUI_00115_.png.webm (1.24 MB, 848x480)
>>
>>102994081
>It's vaporware, isn't it?
It is, they just used us as free advertisement and then got the money in the bank with Twitter's partnership; they won't release anything locally anymore. Genmo and its Mochi HD are our only hope for videogen now
>>
>>102994081
BFL are finishing up raising a 100 million dollar round, they don't give a shit about local anymore.
>>
>>103000530
>I don't get why this isn't more popular
it's just too fucking slow, waiting 40 min for a 5 sec video is too long. I get it, it's a fucking 10b model and it has to make 163 frames at at least 50 steps, but imagine we made a Turbo LoRA like we did for Flux: we could get those videos in 10 steps max, and the wait would be way more acceptable
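The step math backs that up. Assuming sampling time scales roughly linearly with step count (ignoring fixed costs like model load and VAE decode), a hypothetical 10-step turbo LoRA turns the wait from painful into tolerable:

```python
# ~40 min for 50 steps today (numbers from the thread)
full_minutes, full_steps = 40, 50
turbo_steps = 10           # hypothetical distill/turbo LoRA target
turbo_minutes = full_minutes * turbo_steps / full_steps
assert turbo_minutes == 8.0   # a 5 sec video in ~8 min instead of ~40
```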
>>
>>103000684

I cannot get cublas to work with mochi ggufs on my 16gb 4060ti and it's doing my head in.
Tried combos of batch/frame sizes, attentions, precisions etc, every choice that's available in the nodes.
New envs, new installs/compiles of cublas and so on, all torch cuda compatibility checks passed, hours with gpt4o, always "status error 1", which is a generic error, so idk what else to do at this point. goddamit, i want my 40-50% speedup! reeeee
>>
>>103000710
did you install a C compiler (visual studio for example) and cuda toolkit on your computer?
>>
>>103000718
I'm using linux, i have gcc installed, pretty sure i have the cuda toolkit, I seem to remember i couldn't use nvidia-smi without it being installed, I'll check both again though.
>>
File: file.webm (635 KB, 1360x752)
635 KB
635 KB WEBM
https://flux-ai.io/flux-video-ai/
wait what? they finished it already?
>>
File: ComfyUI_00184_.png.webm (1.25 MB, 848x480)
1.25 MB
1.25 MB WEBM
>>103000684
Use the latest ComfyUI and do 25 steps + simple scheduler + euler + cfg 6.0. That's how all these videos were generated.

>>103000745
Lol that's not their real website.
>>
>>103000770
>Use latest comfyUI and do 25 steps + simple scheduler + euler + cfg 6.0. that's how all these videos were generated.
but ComfyUI's VAE doesn't support VAE tiling, so it'll overflow 24gb cards no?
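for reference, spatial VAE tiling is conceptually just decoding the latent in chunks so peak VRAM is bounded by one tile instead of the whole frame; a pure-numpy sketch with a dummy decoder (the 8x upscale factor, 12 latent channels, and tile size here are assumptions, and a real implementation also blends overlapping tile borders to hide seams):

```python
import numpy as np

def fake_decode(tile: np.ndarray) -> np.ndarray:
    # stand-in for the real VAE decoder: maps (c, h, w) latents to (3, 8h, 8w) pixels
    c, h, w = tile.shape
    return np.zeros((3, h * 8, w * 8), dtype=np.float32)

def decode_tiled(latent: np.ndarray, tile: int = 32) -> np.ndarray:
    # decode tile-by-tile and stitch rows back together; peak memory is one tile
    c, h, w = latent.shape
    rows = []
    for y in range(0, h, tile):
        row = [fake_decode(latent[:, y:y + tile, x:x + tile])
               for x in range(0, w, tile)]
        rows.append(np.concatenate(row, axis=2))
    return np.concatenate(rows, axis=1)

out = decode_tiled(np.zeros((12, 64, 96), dtype=np.float32))
print(out.shape)  # (3, 512, 768)
```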
>>
>>103000718
Yup both there.
>>
>>103000788
your cuda toolkit has the same version as your pytorch, right? Cuda 12.0 sounds sus, usually it's 12.1 or 12.4 on pytorch
>>
>>103000794
no, the toolkit is 12.0 and torch's cuda version is 12.4
I guess update the toolkit or downgrade torch?
>>
>>103000845
update toolkit to 12.4 yeah
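for anyone else hitting this: compare `torch.version.cuda` against what `nvcc --version` reports; a minimal helper (the version strings below are just examples):

```python
def cuda_match(toolkit: str, torch_cuda: str) -> bool:
    # compare major.minor only, so "12.4" vs "12.4.1" still counts as a match
    return toolkit.split(".")[:2] == torch_cuda.split(".")[:2]

# in a live env you'd feed it torch.version.cuda and the version from `nvcc --version`
print(cuda_match("12.0", "12.4"))    # False -> the mismatch above
print(cuda_match("12.4", "12.4.1"))  # True  -> after updating the toolkit
```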
>>
tomorrow is sd 3.5 medium day
>>
>>103000881
What's to be hyped about? The 8b model wasn't that good, how can the 2b change anything?
>>
>>103000890
the 2b uses a different architecture, the 8b is probably something they shat out to save face
>>
>>103000530
It's because I have a 2060 12GB
>>
>>103000900
must be a miraculous architecture then, that poor 2b model will have to be competitive with Flux
>>
>>103000573
Those are some flappy legs
>>
>>103000905
>that poor 2b model will have to be competitive with Flux
no it doesn't. it just needs to be a decent 2b model, that's all. nobody is expecting a 2b to be on par with flux
>>
>>103000936
So basically SAI decided to give up, they don't want to make SOTA local models, they admit that Flux has beaten them, what a sad day that is
>>
>>103000936
>it just needs to be a decent 2b model
no it doesn't, at that range you compete with XL finetunes, it should have been a 4b model instead
>>
>>103000960
>compete with XL finetunes
if it's a better base finetuners will just switch
>it should have been a 4b model instead
yeah i think so too
>>
>>103000573
That fucking thing on the left crawling into frame is actually disturbing
>>
>>103000953
>pay employees to make SOTA
>employees create SOTA
>employees decide to keep SOTA for themselves
>give you shit instead
>employees leave with SOTA
>release it as their own
>>
>>103000996
wait, you think the BFL fags made Flux when they were working for SAI? lmao if it's true that's fucking based
>>
File: file.png (1.9 MB, 960x960)
1.9 MB
1.9 MB PNG
https://x.com/deedydas/status/1850680875832496577#m
>New image generation model red_panda is #1 by over 100 ELO points on the Artificial Analysis leaderboard.
>Rumor is it's the new Flux model, Midjourney 7 or a Chinese lab!
Oh shit look at that image, damn...
>>
File: file.png (129 KB, 1583x960)
129 KB
129 KB PNG
>>103001054
https://artificialanalysis.ai/text-to-image/arena?tab=Leaderboard
HOLY MOG
>>
File: file.png (30 KB, 1366x246)
30 KB
30 KB PNG
>>103001068
wtf, why is dalle3 so low in the rankings? it barely beat SD3M
>>
File: file.png (816 KB, 590x698)
816 KB
816 KB PNG
>>103001054
>>103001068
>>103001083
kek, if this is true, then Twitter won't need Flux anymore, and the BFL fags will need us again to be relevant, prepare to have a good new Flux local model soon *inhale copium*
>>
>>103001083
I did 60 runs on that website today, and the answer is that it's fucking shit. I don't think a single Dalle3 example got close to the prompt, and it usually looked like slop.
>>
>>103001185
>I don't think a single Dalle3 example got close to the prompt, and it usually looked like slop.
that's weird, dalle3 is known to be one of the best at following prompts
>>
>>103001194
I only got given Dall-E 3 a handful of times in those 60, it could just be bad luck on its part.
>>
File: file.jpg (1.29 MB, 3840x3210)
1.29 MB
1.29 MB JPG
>>103000596
>Have you guys tried euler_ancestral (it's working on flux now) and the new "linear_quadratic" scheduler?
kek forget about it, those are memes
>>
>>103001083
Flawed methodology, ELO doesn't work when there's no draw. Psychologically some people will just always pick A, B or randomly. Images can be cherrypicked to influence the result.
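for context, the standard win/loss Elo update looks like this (K=32 is just a common example value); without draws every single vote moves the ratings, so systematic "always pick A" voters still shift the scores:

```python
def elo_update(ra: float, rb: float, a_won: bool, k: float = 32.0):
    # expected score of A from the logistic curve, then symmetric rating transfer
    expected_a = 1.0 / (1.0 + 10 ** ((rb - ra) / 400.0))
    score_a = 1.0 if a_won else 0.0
    delta = k * (score_a - expected_a)
    return ra + delta, rb - delta

ra, rb = elo_update(1000.0, 1000.0, a_won=True)
print(ra, rb)  # 1016.0 984.0
```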
>>
>>103000881
Medium hype
>>
>>103001296
>ELO doesn't work when there's no draw
now that you say it, that's true that chess has draws too
>>
What's the current advice for training a flux lora for a character?
>>
File: file.png (3.61 MB, 1344x1728)
3.61 MB
3.61 MB PNG
https://civitai.com/models/652699/amateur-photography-flux-dev
that's really impressive, that dude knows how to make Loras
>>
>>
>>103000851
did that, same error, and i recompiled cublas for good measure after the update. made sure CUDA_HOME is pointing correctly and a few other things along the way.
>>
>>103002087
I've run tests in python in the env to check if cublas is working, matrix multiplication and so on, all fine. Just won't work for me in comfy on gguf models. If i use non-gguf it doesn't error out, but there's probably a flag not being set to use it when the model isn't gguf, which may be why; i've used the gguf models successfully without it and i don't know if they have to be made a certain way for cublas for specific cards, kind of like a TensorRT model.
Quite lost now.
fails on line 323 of execution.py
>gpt4o Thanks for providing the detailed error message and stack trace. The information indicates that the error occurs when calling the get_output_data() function in the MochiSampler node type, specifically during a processing step involving CUDA operations.
Think I'll just give up for now and see what happens when more people have errors using it locally on non-4090s, as i seem to be the only one with the problem rn
>>
File: file.png (2.01 MB, 1024x1024)
2.01 MB
2.01 MB PNG
>>
>>102987755
>I fine tuned version 03 from base FLUX.1-dev for over 5 weeks on my 4090. It is able to do different art styles, photography, and anime.
I don't get it, I thought it was impossible to finetune Flux because it was distilled
>>
>>103002551
Distillation is just a training process; models can be trained to do anything, including being un-distilled: instead of training strictly at CFG 1 you train it using the normal CFG process. Of course it's going to be ass until it learns how to do CFG again, and you really need to be training it on a massive dataset so it doesn't regress.
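i.e. the usual classifier-free guidance combine; a distilled model bakes this into a single forward pass, a de-distilled one has to go back to producing separate cond/uncond predictions that get combined like this (toy numpy sketch, not the actual Flux code):

```python
import numpy as np

def apply_cfg(uncond: np.ndarray, cond: np.ndarray, scale: float) -> np.ndarray:
    # classifier-free guidance: push the prediction away from the
    # unconditional output, in the direction of the conditional one
    return uncond + scale * (cond - uncond)

uncond = np.array([0.0, 0.0])
cond = np.array([1.0, 2.0])
print(apply_cfg(uncond, cond, 3.5))
```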
>>
>>103002890
yeah I get that, but people couldn't stop saying that doing a real finetune of the distilled flux dev model was impossible because it was distilled, yet he managed to make a fine finetune with it, how did he do it?
>>
>>103002911
What do you mean "impossible"? It's impractical. A full finetune on a 4090 is like 15 seconds a step at batch size 1, you need millions of steps to do it properly, and Flux is a very, very fragile model so you also have to use a ridiculously small learning rate.
>>
>>103002932
>What do you mean "impossible", it's impractical.
I'm just reporting what people said in this thread over and over again, "you can't finetune flux, it's distilled"; to me that sounded like an impossible task, that's all
>>
>>103002950
I think you just want to be a smartass faggot
The dedistilled models still suck ass, don't know if you noticed, unless you like your crispy ass oversaturated outputs
>>
File: mipu.jpg (228 KB, 1024x1280)
228 KB
228 KB JPG
>>
>>103003201
OMG IT PEPOGU
>>
File: file.png (887 KB, 1024x1024)
887 KB
887 KB PNG
https://civitai.com/models/739676/chudjak-flux-dev
kek
>>
>>103002299
kinda got it working now, very low number of frames (39?) before it ooms (16gb), but at least it's better for test runs theoretically. I probably need to reboot though. Thanks to the anon that set me on the right path.
>>
>>103003632
>very low number of frames (39?) before it ooms (16gb)
fp8?
>>
File: file.webm (792 KB, 1080x720)
792 KB
792 KB WEBM
https://huggingface.co/NimVideo/cogvideox-2b-img2vid
>Fine-tuned on 10 million videos for high-quality generation at SBS levels comparable to CogVideoX-5B!
Poor them, they probably did it before Mochi appeared and they couldn't stop halfway through
>>
File: pepe.jpg (529 KB, 1352x1024)
529 KB
529 KB JPG
>>
File: canman5.jpg (1.43 MB, 2144x2144)
1.43 MB
1.43 MB JPG
>>103001364
Nice
>>
>>103003679
yes the GGUF 8 and it was 31 frames, also, after loading it up again it now gives me the original error, idfk anymore lol.
I need food so i'm off the case for a while.
>>
>>103001364
how did you find this img of me
>>
>>103001083
This is a test of prompt adherence. If redpanda is that good at adherence, it will be a big deal.

The problem is, it's not a test of prompt tuning, because that would require formal rules and judges and stuff
>>
>>103004001
>If redpanda is that good at adherence, it will be a big deal.
do we have a clue on what redpanda will be? I'd guess it'll be Midjourney V7 or something, it's been a while since they last updated their model
>>
Prepare yourselves. A new SaaS SOTA is about to arrive.
>>
File: skronk.jpg (28 KB, 480x360)
28 KB
28 KB JPG
>Someone posted a webm of a girl holding her stomach in a distraught way a few threads ago
Okay, fine. How many damn GPUs do I need?
>>
>>103003758
well, at least it's already supported in comfyui and it can actually run on coomsumer hardware, what we need is a good video upscaler like topaz
>>
>>103004146
at least 16gb of vram, if you have 24gb you can go up to 12 seconds of videos with a Q8_0 quant
>>
File: pepe.jpg (101 KB, 768x768)
101 KB
101 KB JPG
>>
>>103004148
>is already supported in comfyui
Mochi is also supported on comfyui
https://github.com/kijai/ComfyUI-MochiWrapper
>it can actually run on coomsumer hardware
Mochi can also run on consumer hardware, as long as you have at least 16gb of vram
>>
>>103001054
Are there other pictures of that redpanda model on the internet? I wanna see if it's truly a next-level image model
>>
>>103004297
ok but where is the img2video model of mochi?
>>
File: file.jpg (1.42 MB, 2048x2048)
1.42 MB
1.42 MB JPG
>>103004333
https://cancel.com/jesus__suero/status/1850835432390426626#m
there's some here
>that fucking blur
maybe that's a new Flux model
>>
>>103004456
>ok but where is the img2video model of mochi?
we don't have the VAE encoder to do that, they'll release it soon though
https://github.com/genmoai/models/issues/9#issuecomment-2430585334
>We will be open-sourcing the encoder, stay tuned :)
>>
>>103004470
that doesn't look impressive at all
>>
File: 00013-802331459.png (1.07 MB, 1024x1280)
1.07 MB
1.07 MB PNG
>>103003457
Oh yeah, that one's great
>>
>>103004568
billions must sculpt
>>
>>103004480
Ok, so you dont have it
>>
So nice skipping the caching in Kohya, the best part of having two 4090s. Fucking ridiculous how much space it takes and how long it takes, especially when you can only do batch size 1, so runtime embeddings cost nothing.
>>
>>103004731
you're making a flux lora anon?
>>
>>103004755
I'm finetuning the 8B Flux which fits perfectly without having to do block swaps.
>>
>>103004767
>finetuning
oh nice, on what exactly?
>>
File: 1715746216172136.png (1.82 MB, 896x1152)
1.82 MB
1.82 MB PNG
>>
>>103004778
futa on shota
>>
>>103004789
kek
>>
>>103004778
Nude women obviously
Then maybe celebrities and other pop culture.
But it's really just until Sana comes out.
>>
>>103004480
>don't have the VAE encoder
I just realized that this is an evil, but genius, way for a company to "open source" a model while keeping it censored and preventing it from being used for anything "bad". Without the encoder, you can still do text2img, since it diffuses in the latent space and then decodes. But you can't do img2img (no more face swap, deepfake, whatever). More importantly, you can't train it, since you need to encode images into the latent space to train. Wonder if we'll start seeing more companies doing this...
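the asymmetry in a nutshell; dummy stand-in functions (all names made up), only the data flow matters:

```python
def text_encoder(prompt):
    return f"cond({prompt})"

def denoise(latent, cond):
    return f"denoised({latent},{cond})"

def vae_decode(latent):
    return f"pixels({latent})"

# note: no vae_encode defined anywhere, that's the piece that wasn't released

def text2img(prompt, noise="z"):
    # works: only the decoder is needed to go latent -> pixels
    return vae_decode(denoise(noise, text_encoder(prompt)))

def img2img(image, prompt):
    latent = vae_encode(image)  # NameError: needs the missing encoder
    return vae_decode(denoise(latent, text_encoder(prompt)))

print(text2img("a cat"))
```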
>>
>>103004838
>More importantly, you can't train it, since you need to encode images to the latent space to train.
that would single-handedly kill the model, if it can't be trained no one will give a fuck about it
>>
>>103004838
I don't think that's true, the inference code is available, you should be able to hack it to do what you want. AI models are inherently not black boxes because you interact with them with code. Training can be hidden from you but the inference stuff is like Javascript, anyone can see it.
>>
>>103004851
Flux got really popular within days, even while the majority opinion was "you can't train it because it's distilled". Even now, we have exactly one real finetune. Normies don't give a fuck, they just use the base model, as they are doing currently with flux.
>>
>>103004902
don't forget that we have hundreds of loras on flux to make it more fun, it's not like we're stuck with vanilla, thank god it's not the case
>>
>>103004789
For once, I'm happy that flux fights anyone trying to train it, literally to death (collapse).
>>
>>103004902
BigAsp 2.0 cost $3500 to train. To do the same on Flux would be like $35000.
>>
>>103004943
>To do the same on Flux would be like $35000.
Or you could run on a single 4090 and wait a month, like he did lol >>102987755
>>
>>103004975
>barely a fine tune, minor changes at best
Let me help you anon, if you want pop culture put into Flux you're talking about hundreds of thousands if not millions of steps. Big Asp 2.0 was 6 million images trained for 40 million steps (Batch size 1024). Want to do the math on Batch 1, 15 seconds per step?
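taking those numbers at face value, matching just the step count (never mind the 1024x larger sample count) at batch 1 on a 4090:

```python
steps = 40_000_000       # BigAsp 2.0 step count claimed above
seconds_per_step = 15    # 4090 full-finetune speed at batch size 1
years = steps * seconds_per_step / (3600 * 24 * 365)
print(f"{years:.0f} years")  # prints "19 years"
```

and that's before multiplying by 1024 to actually see the same number of samples.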
>>
>>103005007
>>barely a fine tune, minor changes at best
you tried it?
>>
>>103005028
It's a minor aesthetics update, feel free to prove me wrong. Looking at the examples and gallery, it's just aesthetics. That's not anything to write home about, you could literally do the same in 2 hours with a Lora.
>>
>>102987712
Hang in there OP.
>>
>>103005058
>It's a minor aesthetics update, feel free to prove me wrong.
You're the one who claimed it has "minor changes at best" first, therefore you're the one with the burden of proof, hope that helps.
>>
File: tmpeim7jxq0.png (889 KB, 896x1152)
889 KB
889 KB PNG
>>
The buns are out and hot:
>>103005229
>>103005229
>>103005229
>>
sana-samas what's our status?
>>
>>103005248
hiding in shame


