/g/ - Technology


Thread archived.
You cannot reply anymore.




File: tmp.jpg (1.18 MB, 3264x3264)
Discussion of free and open source text-to-image models

Previous /ldg/ bread : >>102067488

>Beginner UI
EasyDiffusion: https://easydiffusion.github.io
Fooocus: https://github.com/lllyasviel/fooocus
Metastable: https://metastable.studio

>Advanced UI
Automatic1111: https://github.com/automatic1111/stable-diffusion-webui
ComfyUI: https://github.com/comfyanonymous/ComfyUI
Forge: https://github.com/lllyasviel/stable-diffusion-webui-forge
InvokeAI: https://github.com/invoke-ai/InvokeAI
SD.Next: https://github.com/vladmandic/automatic
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI

>Use a VAE if your images look washed out
https://rentry.org/sdvae

>Model Ranking
https://imgsys.org/rankings

>Models, LoRAs & training
https://civitai.com
https://huggingface.co
https://aitracker.art
https://github.com/Nerogar/OneTrainer
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts

>Flux
https://huggingface.co/spaces/black-forest-labs/FLUX.1-schnell
https://comfyanonymous.github.io/ComfyUI_examples/flux

>Pixart Sigma & Hunyuan DIT
https://huggingface.co/spaces/PixArt-alpha/PixArt-Sigma
https://huggingface.co/spaces/Tencent-Hunyuan/HunyuanDiT
https://huggingface.co/comfyanonymous/hunyuan_dit_comfyui
Nodes: https://github.com/city96/ComfyUI_ExtraModels

>Index of guides and other tools
https://rentry.org/sdg-link
https://rentry.org/rentrysd

>GPU performance
https://vladmandic.github.io/sd-extension-system-info/pages/benchmark.html
https://docs.getgrist.com/3mjouqRSdkBY/sdperformance

>Try online without registration
txt2img: https://www.mage.space
img2img: https://huggingface.co/spaces/huggingface/diffuse-the-rest
sd3: https://huggingface.co/spaces/stabilityai/stable-diffusion-3-medium

>Maintain thread quality
https://rentry.org/debo

>Related boards
>>>/h/hdg
>>>/e/edg
>>>/c/kdg
>>>/d/ddg
>>>/b/degen
>>>/vt/vtai
>>>/aco/sdg
>>>/u/udg
>>>/trash/sdg
>>
File: ComfyUI_02297_.png (1.14 MB, 1360x768)
>>
>anon's autistic special interest has become the grifter
>>
>>102070603
Now if this Turkish grifter made renders like that as demos for his LoRAs, maybe he'd get more actual Patreon subscribers, unironically
>>
File: ComfyUI_01558_.png (868 KB, 1024x768)
>>
File: 1716873834958866.png (1.21 MB, 896x1152)
https://civitai.com/models/668799

lora for isometric/AC type gens
>>
File: grid-0004.jpg (2.33 MB, 3793x4000)
>>
...
>>
>>102070678
you want him to reinvent the wheel or something?
>>
File: ComfyUI_01557_.png (789 KB, 1024x768)
>>102070650
>>
>>102070678
That's a lot of imports.
>>
>>102070662
>>
>>102070696
>yuri
nice
>flux's interpretation of feet
not nice
I feel like adding NSFW training will help with that, poor Flux hasn't seen enough people on the beach or naked to accurately render feet
>>
File: ComfyUI_02304_.png (1.28 MB, 1360x768)
>>
>>102070722
kek, I love that meme
>>
File: ComfyUI_01561_.png (846 KB, 1024x768)
>>102070707
Agree, sometimes it just farts when you do complex postures of more than one character
>>
>>102070689
Yes, I want him to rewrite it all in machine code - or maybe just not do overly specific and also redundant imports
>>
>>102070699
my man, check the inksketch lora, it does work
>>
File: 1721598292139608.png (1.3 MB, 896x1152)
>>102070651
>>
>>102070741
that is the inksketch lora lol. you made it?
>>
>>102070752
nah, I just leech. it's just one of the few loras on civt that is decent
>>
>>102070739
>or maybe just not do overly specific and also redundant imports
why does it matter? why do you even review his code in the first place? I don't get it
>>
File: ComfyUI_32989_.png (1.74 MB, 1024x1024)
>>
>>102070632
He's not capable of the subtlety required to produce such a LoRA. He just burns his face into the weights and calls it a success once the model is incapable of producing anything but his face.
>>
Niggawhat. It's a miracle flux even knows Miku.
>>
>>102070782
Because messy and bloated code is slower to run, hard to update easily, prone to breaking, and harder to fix when it breaks.

you review a codebase quickly to see if it could be worth using compared to other options
simple as that
>>
>>102070859
they didn't use JoyCaption to caption the pictures used to train Flux, JoyCaption didn't exist when they were training their models. not 100% sure about that last part though
>>
>>102070868
I still don't get it, if you're so obsessed with that guy's code and want to clean it so bad, what's preventing you from making a PR to fix his shit?
>>
>>102070868
I don't know enough about the script to know what's going on below all those imports, but if they're all being used, what alternative does he have? Break it down into different scripts and call them instead?
>>
>>102070878
>obsessed
?
I posted a single screenshot and just replied to your questions.
>>
>>102070890
you took the time to look at the code of that man (who the fuck does that?), you took a script, posted it on /ldg/ (what the fuck again? is this fucking github?) expecting some reactions, and now you're spending your time arguing with someone that calls on your schizo bullshit

So yeah, I would even say that "obsessed" is an understatement at this point, the fuck is your problem?
>>
>>102070878
NOOOO! You can't just say code is shit! You have to make a PR!
>>
>>102070918
So you just want to whine and that's it? You aren't willing to make changes for the better with a PR? Is this /ldg/ or desperate housewife?
>>
File: 1720795950074548.jpg (41 KB, 369x360)
>>102070915
>you took the time to look at the code of that man (who the fuck does that?)

The absolute state of /g/ codelets
>>
>>102070927
Do you also dismember your TV every time you buy a new one to see what's in there? If you say yes then I'm sorry, you need to see a mental doctor or some shit, that's some weird ass behavior
>>
>>102070915
You seem upset. There's nothing schizo about critiquing the code of a piece of software used by anons here for training.
Why are you so mad about it?
>>
>>102070878
>>102070915
>>102070935
can you stop arguing about this please? you're shitting up the thread with your weird obsession with those other guys just discussing code on /g/, let them be
>>
>>102070827
All tutorials out there are like that. I've never seen one that doesn't involve training someone's face.
>>
>>102070950
I just don't get it, this man wrote code that works and helps a lot of people, and your reaction is to spit on his back. at least be a man and talk to him on github and say that he could improve his craft instead of doing some desperate housewife shit
>>
File: 1666851641155930.gif (2.4 MB, 640x640)
>>102070935
>looking at code equals dismembering your TV
>>
>>102070969
it makes sense to me
>>
>>102070969
it is equal retard, both are the same process, looking at the underlyings of the product
>>
>>102070968
There's nothing wrong with training faces, it's scorching the faces into the model that's bad. But it's easy to do and nobody will know there is an issue because it does the job it was designed to anyway.
>>
>>102070915
>>102070935
take a few deep breaths, anon, and post some gens
>>
>>102070880
NTA but he imports "copy" twice and imports the entire libs as well as specific modules from them, and the order is all mixed up
would just either consistently import full libs and use lib.module or the other way around. in any case: using that many lines for imports means you should probably break up things more.
I just took a look at a different 800-line script in that repo: it was all one confusing class
>>
File: 1709721943329707.png (1.34 MB, 1024x1024)
can anyone explain why forge defaults to 896x1152? I'm testing both apps but even if I set 1024x1024 in ui-config.json it reverts back to the default.
>>
>>102071038
Does having a duplicate import decrease the speed that much?
>>
>>102070977
the code doesn't break if you look at it, unlike the TV and you
kek
>>
bicker bicker bicker <turkish cockroach fuckface> bicker bicker bicker <turkish cockroach fuckface> bicker bicker bicker <turkish cockroach fuckface> bicker bicker bicker <turkish cockroach fuckface> bicker bicker bicker <turkish cockroach fuckface>
good going guys. now all I need is some brazilian baby butts and debo spamming his niggers and my day is complete.
>>102071039
if you ever find a way to adjust the free-u parameters it loads, let me know
>>
>>102071039
dunno, but that's one of the reasons I like ComfyUI, you leave this software and you come back tomorrow, you'll get the exact same settings as your last gen yesterday
>>
>>102071062
>the code doesn't break if you look at it, unlike the TV
if you manage to break a TV by simply dismembering it, then sorry to say but you're a very clumsy dude kek
>>
>>102071039
nm there is a preset extension that fixes this
>>102071066
I like comfy but for some reason I get faster lora gen speed in forge at the moment, but im using q8 in comfy.

my main reason for using forge for random loras is civitai manager. is there a similar tool for comfy?
>>
>>102071064
And yet it's still better than /sdg/
>>
>>102071074
>I get faster lora gen speed in forge at the moment, but im using q8 in comfy.
that's the reason, GGUF quants have trouble with LoRAs because they need a dequant step on top of it, unlike fp8
>>
>>102071052
no, it's just unnecessary. an indication that the code is unstructured and not optimized.
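
For what it's worth, the duplicate import on its own is cheap. A minimal sketch, assuming standard CPython behavior where loaded modules are cached in sys.modules (the helper name import_again is made up for illustration):

```python
import sys
import timeit

import copy  # first import: the module is loaded and cached in sys.modules
import copy  # duplicate import: a cache lookup, the module body is not re-executed


def import_again():
    """Repeat the import on purpose and return the module object we get."""
    import copy  # noqa: F811
    return copy


# The repeated import hands back the very same cached module object.
first = sys.modules["copy"]
assert import_again() is first

# Rough cost of one redundant import statement.
per_call = timeit.timeit(import_again, number=100_000) / 100_000
print(f"repeated import costs about {per_call * 1e9:.0f} ns per call")
```

This only shows that the import line itself is a non-issue for speed. Whether the rest of a script is well structured is a separate question.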
>>
>>102071085
so... if it doesn't change anything in the speed, who cares? it's not like I'll be looking at the code when running it, I don't give a fuck? I press "execute" and it executes at the speed it should have?
>>
File: 1724164439771371.png (127 KB, 1254x1003)
there we go, that's the stuff.
>>102071082
I like both UIs but civitai manager for loras with thumbnails and one click instance prompt keywords is really useful. a civitai manager node in comfy would be nice.
>>
>>102071082
Weird because I've found gguf being a LOT faster with loading LoRAs than default.
>>
>>102071094
extension is config-presets (default extensions location)
>>
File: power.jpg (322 KB, 1288x1152)
>>102071078
I was gonna add "lets take a quick look over at sdg.. ohh I see" but somehow forgot.
>>102071074
kind of. rgthree got a "power lora loader", pic attached
>>
>>102071091
the code in general being unstructured and not optimized means it's slower
this isn't hard anon
>>
>>102071052
>Does having a duplicate import decrease the speed that much?
>>102071085
>no
>>102071109
>it's slower
which one anon?
>>
>>102071108
I have that, it's good but what I really want is scrollable thumbnails that you click for the lora/instance prompts like civitai manager. if you have a bunch of loras you can tell mostly by filename but thumbnails is good for quick browsing.
>>
I like to feed the AI my Daggerfall characters.
>>
>>102070859
the actual JoyCaption output for the pic is this though, I don't get why you're bullshitting:

"This image is a digital drawing in a chibi style, characterized by exaggerated features and a cute, childlike appearance. The subject is a character from the anime series "Hatsune Miku: Project DIVA," a virtual singer with a teal hair style that is long and divided into twin ponytails. She has large, expressive eyes with black pupils and a small, open mouth with a playful expression. Her cheeks are blushed, adding to her animated and cheerful demeanor.

The character is wearing a sleeveless, light blue dress with a pleated skirt and a matching headband. Her arms are extended, one holding a microphone with a pink handle and the other reaching out. The background is simple and white, with a few pink and light blue musical notes floating around her head, suggesting she is singing. There are also two heart symbols near her head, enhancing the joyful and energetic vibe of the image. The overall color palette is vibrant and playful, with a mix of pastel tones and bold contrasts. The texture is smooth and clean, typical of digital art, with a slight pixelated effect giving the image a retro, comic book feel."
>>
>>102071118
are you hard of thinking? the imports themselves aren't the issue, it's what they indicate about the rest of the code
>>
so for steps, is 20 generally the sweet spot? some have said 30, but what does the data show? anyone compare 20 to 100 steps?
>>
how is flux so good at upscaling. I mean I know it but it is scary
>>102071119
it is inconvenient, yes. still better than a simple drop down menu. but a little work goes a long way. I organize loras in 3 categories in subfolders (chara/style/util) and rename the files if needs be. "AH_K_001" not very helpful.
>>
>>102071135
>the imports themselves aren't the issue, it's what they indicate about the rest of the code
they don't indicate anything, you're reaching, unless you have a screen that shows that his actual script (not the imports) could be optimised for speed, you're full of shit, I don't judge a whole script just based on the imports, that's retarded, you're not a retard anon?
>>
>>102071136
30 seems to be the one that has consistency on quality, and 40 seems to be the one that has consistency on quality + good text
https://reddit.com/r/StableDiffusion/comments/1er3wt7/if_you_want_a_good_compromise_between_quality_and/
>>
>>102071145
if you believe you cannot derive any meaningful information about the full code quality based on that part of the code, then we don't have much to discuss.
>>
>>102071175
So you admit that you're completely reaching, you know what's the best part of this conversation? Is that it's easy to beat me anon, you just have to show me a part of the script that has actual bad speed optimisation, but nooo you had to go for the imports, the least important part of a script when it's about optimising the speed, that's so retarded I have no words, so yeah, let's end up this conversation because your bad faith arguments are ridiculous
>>
Ultimately the way the imports were done was messy, but the code works and does what it was claimed to do at a clearly reasonable speed, and the person complaining about it has offered no fixes or alternatives, so I don't really see the point of discussing it further than "Hey look at this messy code"
>>
>>102071201
this, I would've agreed with him if he had shown for example a loop that does things badly, but duplicate imports are a complete non-issue
>>
Can someone prompt:
>She getting crunk in the club, I mean she work it. And then I like to see the females twerking. Taking her clothes off, buckey naked. ATL, ho, don't disrespect it.
Thank you
>>
File: migu from love live.png (118 KB, 1547x549)
>>102071134
Must be temperature, I had it set to 0.2. I tried to rerun it a few times and below 0.4 it doesn't recognize Miku at all.
>>
>>102071136
depends on the sampler
>>
File: .webm (294 KB, 706x502)
I can't get either Fooocus or ComfyUI to run. I click the .bat file and all it does is give me the blue scroll wheel for a second and then nothing happens. Does it not work on Windows 11? I can run other .bat files like SillyTavern's. It doesn't appear to be blocked by Defender.
>>
>>102071256
wait what is the backend -smoothbrain
>>
>>102071285
Try running as admin? I've never seen this before.
>>
File: ComfyUI_04168_.png (1016 KB, 768x1024)
>>102071239
>>
>>102071144
How does upscaling work with Flux?
Feels like it would take centuries to do. What's the workflow?
>>
>>102071312
>And of course Flux went for a black woman
that software is more racist than me I swear kek
>>
File: ComfyUI_01516_.png (897 KB, 1024x768)
>>
>>102071312
gem, ty
>>
>>102071343
My dude, you need to get some LoRAs or train one on Yuri.
>>
File: 2024-08-25_00027_.png (1.47 MB, 1280x720)
>>
File: 1708405251270827.png (1.24 MB, 1024x1024)
>>
I'm training right now. If you are bored, try this prompt. See what you conjure up from the bowels of latent space.
https://pastebin.com/PFrkyw8s
>>
File: .webm (149 KB, 638x330)
>>102071298
I've tried that but unfortunately there's no difference.
>>
>>102071285
Run the bat from command line and see what it says if anything.
>>
File: file.png (1.68 MB, 1000x1000)
>>
File: up.png (277 KB, 1909x670)
>>102071314
very simple actually, here.
the flow goes like this:
load image > upscale with a model > downscale > inject some tasty noise for more detail (this is of course optional but recommended) > resampler (low step and denoise) > color correction (because noise injection) > done. scary good even at low denoise and step count, I go as low as 6 steps. the prompt can be empty or something to reinforce whatever, eye color etc.
can use ultimate sd upscale too if you are not that good on VRAM
blows pmuch everything else out of the water and isn't all that slow
>>
File: 1702815299568785.png (1.18 MB, 1024x1024)
one more AC miku
>>
File: 2024-08-25_00038_.png (1.41 MB, 1280x720)
>>
>>102071432
Oh sweet, it doesn't even need many steps.
>>
File: 1711313201081472.png (1.7 MB, 1024x1024)
https://civitai.com/models/667196/bayeux-tapestry

now this is what loras were made for: styles/characters.
>>
>>102071144
>how is flux so good at upscaling. I mean I know it but it is scary
flux is great at everything, and I can't wait for the finetunes, it will make it even better
>>
>>102071432
>downscale
How do you do this? Simply scaling down the pixels, scaling down the latent image, or resampling at a smaller scale? Your image does not show it clearly.
>>
>>102071468
>can't wait for the finetunes
Speaking of which, fine tuning on a single GPU is available, but I haven't seen any JSON files around to get started with it.
>>
File: file.png (1.79 MB, 1000x1000)
>>
>>
File: 2024-08-25_00116_.png (1.23 MB, 832x1216)
>>
File: 2024-08-25_00014_.png (1.6 MB, 1280x720)
>>102071463
dats nice
>>
>>102071486
>STUDIO GHIBLI STYLE!!!
>>
File: up2.png (52 KB, 1530x297)
>>102071469
this here would be how you'd do it with the default nodes. the latent would then continue into the sampler, with or without added noise. the problem is getting a final 1.5x size: the upscaler pulls it up to 400%, so you'd need to scale it by 0.375 to get to a 150% size. that is why I use that custom node, because it can set the final size. if you want to go 200% then it is easy. x0.5, ok.
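
The scale factor math from this post can be sketched as follows. This is plain arithmetic, not a ComfyUI API. The function names and the 1024x768 base size are illustrative assumptions; the 4x upscale model and the 1.5x / 2x targets are from the post.

```python
def downscale_factor(target_scale: float, upscaler_scale: float = 4.0) -> float:
    """Factor to apply after a fixed-ratio upscale model to land on target_scale."""
    return target_scale / upscaler_scale


def final_size(width: int, height: int, target_scale: float,
               upscaler_scale: float = 4.0) -> tuple[int, int]:
    """Pixel size after upscaling by upscaler_scale and then downscaling."""
    factor = downscale_factor(target_scale, upscaler_scale)
    return (round(width * upscaler_scale * factor),
            round(height * upscaler_scale * factor))


print(downscale_factor(1.5))       # 0.375
print(downscale_factor(2.0))       # 0.5
print(final_size(1024, 768, 1.5))  # (1536, 1152)
```

A custom node that takes the final size directly just skips the division step.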
>>
>>
>>102071527
>>102071486
looks more like Lupin III .. buuut whatever
>>
>>102071527
the only nonpedo animes
>>
>>
>>102071463
very nice
>>
>>102071285
run it from cmd/powershell so you can see the output
>>
File: 2024-08-25_00133_.png (892 KB, 1280x720)
>>
do we look alike?
>>
>>102071704
tasty
>>
File: 2024-08-25_00136_.png (1.02 MB, 1280x720)
>>
>>102071704
No Egyptologist can prove there wasn't a hidden temple dedicated to the ritualistic worship of femboy catbeings.
>>
>>102071804
THATS NOT A FEMBOY BROTHA
>>
File: 2024-08-25_00143_.png (1.1 MB, 1280x720)
>>102071804
>>102071816
1girls > 1femboy
>>
fscking comfy devs.. its been over a FUCKING week since spell check bugged out

>https://github.com/Comfy-Org/ComfyUI_frontend/issues/524

fix it already
>>
is it just me or does img2img not work on forge?
>>
>>102071893
It's just you
>>
>>102071417
>>102071587
Thanks. That got it to work finally.
>>
File: photo00005.jpg (171 KB, 1016x1312)
>>102071893
Works fine
>>
File: cottage_flux1.jpg (2.99 MB, 2320x1696)
>>
>>102071869
It's a one line change on your end, codelet. I fixed mine, why haven't you fixed yours yet.
>>
File: Untitled.png (569 KB, 1152x709)
>>102071900
>>102071907

the overlay to select area doesnt work bros..pic attached

also the img2img output on flux gives me the same image no matter what the denoise
>>
I've got a newfag question. Do image generators have to be "jailbroken" in the same way chat bots do? I'm trying to integrate an imagegen into Sillytavern, but the imagegen refuses to work in the chat. I won't lie, the chat is lewd.
>>
>>102071925
okay, if thats so why don't you tell me where and what instead of talking down on this codelet?
>>
Is 150 images too many for an anime character lora?
>>
>>102071941
it's in the third comment of the issue
>>
>>102071947
yes
>>
>>102071955
How many should I use?
>>
How do I get Flux running on Kaggle? I don't have a GPU and on Kaggle there's an option called T4 GPU x2 which gives two GPUs with 15 GB RAM. But for some reason it only uses one GPU then complains about running out of memory.
>>
>>102071938
If you are trying to get NSFW out of an SFW model it's just not going to work.
>>
>>102071959
20-50
>>
>>102072003
OK because I did that before and the lora barely works so I don't really know.
>>
(n)a(z)i (f)art
>>
>>102072009
with 150 you can start to make a style lora. I mean you can still make a character lora with 150 pictures, it will just take way longer. and if it didn't work with 20-50, the problem is probably some other setting and won't be fixed with more pictures
>>
Flux sucks at cigarettes
>>
>>102071953
yea.. fuck me, src/scripts/widgets.ts does not even exist in comfy portable .. guess I am too dumb for your one line fix
>>
File: 2024-08-25_00157_.png (1.1 MB, 720x1280)
>>
>>102072151
why don't you search for the line itself instead of the file name? TypeScript gets compiled to JavaScript and the file names don't have to match.
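
A minimal sketch of that kind of search, assuming a local install with the bundled frontend under a web folder. The function name is made up for illustration and the needle in the usage line is a placeholder, since the thread never quotes the actual line from the issue.

```python
from pathlib import Path


def find_files_containing(root, needle, pattern="*.js"):
    """Return sorted paths under root matching pattern whose text contains needle."""
    hits = []
    for path in Path(root).rglob(pattern):
        if not path.is_file():
            continue
        try:
            # Bundled JS can be large and oddly encoded, so ignore decode errors.
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # unreadable file, skip it
        if needle in text:
            hits.append(path)
    return sorted(hits)


if __name__ == "__main__":
    # Placeholder arguments: swap in the real folder and the text from the fix.
    for hit in find_files_containing("ComfyUI/web", "TEXT_FROM_THE_FIX"):
        print(hit)
```

String literals and property names usually survive bundling and minification while local variable names often do not, so those make better search terms.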
>>
>>102072139
prove it
>>
>>
how bad are amd cards on linux?
i will probably buy a 3060 since it's like the only 12gb nvidia card i can afford but i found a 6700xt at a nice price... and it's much better for games
>>
>>102072195
now make her fuck a horse
>>
>>102072202
If you intend to do any generative ai shit there's only one brand you should be considering unfortunately.
>>
>>102070859
yeah it's super fucked, if you put just neon genesis evangelion it gives a fuse of every sci fi/mecha anime they really didn't look over the dataset at all kek
>>
>>102072221
more on /b/
>>
>>102072177
instead of sending me on a wild goose chase you could just tell me which file, since I can't find it .. all files containing "spellcheck" are irrelevant
>>
File: 2024-08-25T222043.248.jpg (927 KB, 2688x1536)
I'm out of ideas again
>>
>>102072315
that line only appears in one JS file, anon
>>
>>102072349
Which?
>>
File: 4step_up_00065_.png (1.71 MB, 1536x1536)
>>102072139
Dev?
>>
>>102072202
>>102072225
3060 has the same gen time as 6700, 12GB is not enough for dev no matter the card
>>
>>102072349
>>102072356
well I found it .. for anyone else wanting to fix it its
>ComfyUI\web\assets\index-CaD4RONs.js
>>
>>102072399
Good job!
>>
>>
>>102072365
you think this looks good?
>>
>>102072404
ya, thanks anyhow mr. pushy.. but this really makes me rage: This autist here is "annoyed" by red squiggles in his XY plot and then he suggest suggest turning of spellcheck globally
>https://github.com/Comfy-Org/ComfyUI_frontend/pull/267

fantastic job ..
>>
>>102072431
Have you tried dev?
>>
https://civitai.com/models/618792/nepotism-fux-devschnell?modelVersionId=744217

https://civitai.com/models/147933?modelVersionId=699423

are these considered finetunes, or merges?
>>
>>102071995
Looking at the ST docs, it looks like I can use SD, DALL-E, or Flux. Are those all strictly SFW models? I can only assume they are, since they seem to be the big ones.
>>
>>102072456
Yes
>>
File: 1699532731683527.png (1.7 MB, 1024x1024)
>>
some of these civitai checkpoints have a new clip_l, is this primarily for text appearance, or does it also affect the non text outputs? it's part of the text encoder isnt it?
>>
>>102072532
it is the text encoder, or one of them depending on the base model. it affects the entire output.
>>
>>102072532
clip_l is the text encoder from OpenAI's CLIP ViT Large Patch 14. it sucks at text, but it does know a few things T5 might not.
>>
What happens if you don't use the trigger word for a lora but still load it?
>>
File: 1707772619301503.png (1.03 MB, 1024x1024)
>>102072451
https://civitai.com/models/618792?modelVersionId=744217

model with the recommended clip_l file, t5 fp8 text encoder, and flux vanilla vae, pretty neat if new models based on flux are already coming out.

got a cool Miku, base flux also does WoW too though but im testing similar prompts.

An ingame screenshot of World of Warcraft, the main character is Miku Hatsune. HUD and UI visible. The setting is Icecrown from World of Warcraft Wrath of the Lich King.
>>
too much of a brainlet to set up kohya or kohya bmaltais, what's the difference?
I normally use anaconda envs made in navigator in Linux. do I, can I, use venv inside them? do I have to?
For Flux on my 16GB card i have to upgrade torch etc, does that apply to both versions, will it work?
I'm so fucking lost and confused :(
I just want to train a Lora, is there not some install guide that lets me set it all up where i can say, here's my image files, heres the text, here's the model, output dir and click "GO"
>>
File: 1713277441588485.png (1.33 MB, 1024x1024)
>>102072603
*this is just a test, vanilla flux made cool WoW Mikus too, im just trying to see how outputs differ.

this was with base flux:
>>
Is A1111 dead?
>>
File: 1499762595832.jpg (81 KB, 904x711)
>>102070915
>>
>>102072644
yes, use forge now
>>
>>102072617
ostris/ai-toolkit on github. just copy the example lora config, adjust paths, let it rip
if you’re training something very hard like nudity you will have to up the steps but that’s it
>>
>>102072699
>ostris/ai-toolkit
>You currently need a GPU with at least 24GB of VRAM
I have a 16GB card.
>>
File: 2024-08-25_00179_.png (1.12 MB, 720x1280)
>>
File: ComfyUI_33003_.png (1.94 MB, 840x2088)
>>
File: 1722232836787363.png (1.13 MB, 1024x1024)
>>
novelai open sourced their old models. tell me what im supposed to think about that.
>>
>>102073000
Those have been leaked to the public for years. It's literally nothing
>>
>>102073044
good point, i completely forgot about that.
>>
>>102073000
>>102073073
also NAI got insane publicity by the leak, if it had not leaked it would be an obscure third party AI service, now its widely known
>>
>>102073000
They released a leaked model, GENIUS
>>
>>102071432
catbox / workflow, plz
>>
>>102071893
>>102071934
anons help pls
>>
>>102073102
Idk they were pretty well known before the leak.
>>
>>102073183
I had no idea who they were before the leak kek
>>
>>102071432
Why downscale?
>>
>>102073200
>I
>>
File: 1700459551361622.png (1.12 MB, 1024x1024)
https://civitai.com/models/618792?modelVersionId=744217

this model seems to have a lot of loras/stuff incorporated into it, it has taylor without using a lora so it must also have lots of other stuff.
>>
File: 2024-08-25_00245_.png (1.41 MB, 1216x832)
>>
>>102073206
I dont wanna end up with a 400% scaled flux gen.
I usually go for 150% and then do a final upscale with a model to 200% of the base size if I really like the gen.
>>102073146
I don't hand out my workflows. just build it. important values are all there on that image.
>>
>>102073289
>V3 is officially here! This version combines the speed of Flux.1 Schnell with the quality of Flux.1 Dev, integrated with custom LORAs and a powerful CLIP L derived from my SDXL/PONY model, NepotismXL. For optimal results, I recommend using 20-32 steps.
He merged this shit with Schnell, didn't he... and he's only giving the fp8 model, we can't do a Q8_0 from that, fuck
>>
File: 1705894355434204.png (1008 KB, 1024x1024)
>>102073319
it works pretty well, as does vanilla flux but this has loras/other images trained/merged into it.
>>
>>102073334
I wonder how many loras he merged into the model, I guess that you can't do an infinite amount without breaking everything right?
>>
>>102073344
no idea, would be cool if they had a list of what was trained/merged, but if anything it will have all of base flux dev, plus new lora stuff. You can get lewds off this too so it must have the booba lora in it.
>>
>>102073289
>>102073319
Looks like slop, use this merge

https://civitai.com/models/657607?modelVersionId=747918
>>
>>102073359
why people merge dev with schnell? it's only gonna produce a worse base model than dev
>>
>>102073373
The gguf merge works with 12GB vram and you need fewer steps than dev

>>102073319
>For optimal results, I recommend using 20-32 steps.

Why not just use dev directly then
>>
>>102073390
>Why not just use dev directly then
this, what was the point of merging with schnell if the number of steps recommended is the same as dev
>>
File: file.png (1.81 MB, 1000x1000)
SUUUUUUU
https://civitai.com/models/684565/cristiano-ronaldo-football-player?modelVersionId=766180
>>
>>102073314
Dev does high resolution by itself really well, why upscale and downscale so many times?
>>
File: R.jpg (109 KB, 1024x1280)
>>102072337
>I'm out of ideas again
create what you want to see more of
or start working towards a larger project
>>
File: 1720416487638089.jpg (129 KB, 738x1292)
Good morning sirs
>>
>>102073309
somebody starting nuclear war on the moon
>>
do you use adetailer with flux?
>>
File: file.png (1.76 MB, 1000x1000)
>>
File: 00002-4123123.png (1.17 MB, 1024x1024)
>>
>>102073422
>255 steps
Civit is absolutely being flooded with rank 2 slop right now.
>>
>>102073574
I wish there was a way to filter out all the loras with a rank under 16
>>
>>102073583
It's against the interest of Civit's little buzz scam to do that.
>>
>>102073574
>>102073583
Trigger word too
>>
Flux really is the new de facto standard, that was kind of fast from 'You won't be able to train it' to, 'actually you can train it even on 12gb, here have hundreds of LORAs'

Could Stability Inc be more dead at this point ?
>>
File: file.png (2.04 MB, 1000x1000)
>>102073553
>>
File: bComfyUI_110061_.jpg (881 KB, 1280x2048)
>>102072337
https://enragedantelope.github.io/Styles-FluxDev/?page=1
been going through that for ideas on styles myself, try it out.
>>
File: fuckwow.png (2 MB, 1018x1018)
>>102072618
lol fuck world of warcraft
>>
>>102073681
Prompt?
>>
File: 2024-08-25_00250_.png (847 KB, 1216x832)
>>102073486
heh yea! .. well I did not prompt that tho.. the two suns here .. that I prompted
>>
>>102071463
brilliant
>>
>>102073583
you're a retard if you think you always need a rank higher than 16, rank 2 is enough for certain things
>>
>>102073700
>Asmondgold
>>
>>102073709
>rank 2 is enough for certain things
hello saars
>>
File: 1720081825003865.png (1.15 MB, 1024x1024)
Doom Eternal ingame screenshot, the main character is Miku Hatsune, she is dressed as the main character of the videogame DOOM wearing green metal armor and is holding a large shotgun. The background is the first level of the FPS game Doom Eternal.

tried this with https://civitai.com/models/618792?modelVersionId=744217 model

it says it includes various loras + flux but no idea which.
>>
>>102073700
>pathetic sick emaciated addicted gamer hunched over (computer screen:1.5) desktop PC playing games on mouse and keyboard in filthy room soda cans, alcohol, pipes, drug paraphernalia, old posters, games, videogame console, sad, dark, depraved, rotten

there ya go. I added the screen with a mask
>>
>>102073553
Oh no! He's gonna turn Miku into a right wing fascist!
>>
>>102073435
I know but time is valuable & too many outtakes. 1 in 10 gens makes it & almost always need to fix mistakes in krita before upscaling.
>>
>>102073738
I wouldn't touch flux fine tunes until we have solid reports back on finetuning. Everything until that happens is just buzz farming lora merging slop.
>>
File: file.png (2.34 MB, 1000x1000)
>>102073665
>>
>>102073751
this
>>
>>102073738
How many steps and gen time?
>>
>>102073721
are you the retard that thinks he needs rank 512 for his one car lora?
https://civitai.com/models/647663/porsche-911-gts-2024-flux
>>
File: 2024-08-25_00253_.jpg (1.27 MB, 4864x3328)
>>
>>102073751
yeah, im just testing it out cause it's free.
>>102073774
34 seconds on a 4080, seems good, only quant models are a bit slower but they are more efficient on memory/etc.
>>
>>102073791
*20 steps
>>
File: file.png (2.32 MB, 1000x1000)
>>102073745
Too late
>>
>>102073779
How can you see the rank of the lora?
>>
>>102073829
by looking at the tensors inside it
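
A minimal sketch of what "looking at the tensors" can mean, assuming the LoRA is a .safetensors file and that its down-projection keys contain `lora_down` or `lora_A` (common naming, but an assumption here). A safetensors file starts with an 8-byte little-endian header length followed by a JSON header, so the shapes can be read without loading any weights. `read_header` and `lora_ranks` are hypothetical helper names.

```python
import json
import struct
from collections import Counter


def read_header(path):
    """Return the JSON header of a .safetensors file (tensor name -> dtype/shape/offsets)."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))  # 8-byte little-endian length
        return json.loads(f.read(header_len))


def lora_ranks(header):
    """Count the ranks found on down-projection tensors.

    Assumption: down/A matrices are stored as [rank, in_features, ...],
    so the rank is the first dimension of the shape.
    """
    ranks = Counter()
    for name, info in header.items():
        if name == "__metadata__":  # optional free-form metadata entry, not a tensor
            continue
        if "lora_down" in name or "lora_A" in name:
            ranks[info["shape"][0]] += 1
    return ranks
```

Calling `lora_ranks(read_header("some_lora.safetensors"))` (placeholder filename) returns a counter of rank values; a single key usually means one rank was used for every layer.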
>>
>>102073779
Way too much for a LoRA of a car that probably already exists somewhere in the dataset, but I did find out through experimentation that you do need to be a little bit aggressive with the rank if you want to train new concepts into the model. The lower rank shit you get on Civitai will never be able to do anything other than slightly alter the visuals.
>>
>>102073829
Its massive file size is also a pretty good indicator.
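
Rough arithmetic for why file size tracks rank, as a sketch: each adapted linear layer stores a [rank, in] and an [out, rank] matrix, so size grows linearly with rank. The layer list below is made up for illustration and is not Flux's real layout; `lora_size_bytes` is a hypothetical helper.

```python
def lora_size_bytes(rank, layer_shapes, bytes_per_param=2):
    """Approximate LoRA weight size in bytes.

    Each adapted (in_features, out_features) layer contributes
    rank * (in_features + out_features) parameters.
    bytes_per_param=2 assumes 16-bit weights.
    """
    params = sum(rank * (n_in + n_out) for n_in, n_out in layer_shapes)
    return params * bytes_per_param


# Hypothetical model: 100 adapted layers of 3072 -> 3072 (illustrative only).
layers = [(3072, 3072)] * 100
for r in (2, 16, 128):
    print(f"rank {r}: {lora_size_bytes(r, layers) / 2**20:.1f} MiB")
```

Going from rank 2 to rank 16 multiplies the size by 8 under this model, which is why a very large file hints at a high rank (or at many adapted layers).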
>>
>>102073782
4864x3328, clearly not large enough. come on, just dump it into an upscale by model, 8x nmkd and tell me what happens.
>>102073839
it's in the model already lol, also who wants that slop when you can have this
>>
File: 00016-4123123.png (1.72 MB, 1536x1024)
>>
File: file.png (2.27 MB, 1000x1000)
>>
I wonder if you could save time on generating large flux images by just generating at a smaller 768 or 512 resolution then doing a few steps at a much higher resolution.
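
A back-of-the-envelope sketch of that idea, assuming per-step cost scales linearly with pixel count. That is a simplification (attention cost grows faster than linearly, and the upscale step itself is not counted), and the resolutions and step counts are made-up examples. It says nothing about whether the quality holds up.

```python
def relative_cost(passes):
    """Sum of width * height * steps over all passes.

    Assumption: cost per sampling step is proportional to the number of pixels.
    """
    return sum(w * h * steps for w, h, steps in passes)


# Hypothetical comparison: one full-resolution pass vs. low-res pass plus a short high-res pass.
single_pass = [(2048, 2048, 20)]
two_pass = [(768, 768, 20), (2048, 2048, 4)]

ratio = relative_cost(two_pass) / relative_cost(single_pass)
print(f"two-pass cost is about {ratio:.0%} of the single pass under this model")
```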
>>
File: 2024-08-25_00255_.jpg (387 KB, 2432x1664)
>>102073862
ow dear.. 4x not enough for ya? .. well the tiled diffusion took like 15 minutes .. soo aahh .. ehh.. 8x? no <3
>>
File: 1705325083392930.png (1.07 MB, 1024x1024)
https://civitai.com/models/618792?modelVersionId=744217

it's just flux with lewds unlocked and loras of various kinds, seem to be a lot of celebs kek
>>
>>102072603
>>102073289
>>102073738
>>102073943
come on anon this is the 4th time you've shilled this model, you made this one didn't you?
>>
>>102073893
haha. wish I was that patient. nice purple/pink
>>
>>102073943
stop shilling that borked merge .. you will lose all the finesse and goodness of flux with that shit merge
>>
File: pimp.png (1.91 MB, 1440x992)
>>
>>102073959
No, im just testing new shit to see if it varies from base flux, I think base flux is still best + loras if you need them for a style/character.
>>
>>102073974
this is what im thinking too, no reason to mess with a good thing, just complement it with loras
>>
>>102073943
>>102073983

Nah dude, you're clearly shilling your LoRA merge at this point. Fuck off.
>>
File: 2024-08-25_00257_.jpg (1.37 MB, 5120x2880)
>>102073961
thanks, I am just a perfectionist .. speeding it up to 9 minutes like pic related wasn't good .. I reduced upscale steps to 10 .. and the quality is just not good enough for me!
>>
>>102073289
See kids, this is what we call a non sequitur.
>>
>>102074020
shit, you got me
fine, I'll stop
>>
>>102073553
The pharmaceutical waste in question did trigger hermaphroditism in the frogs that lived in that river, though.
>>
>>102074026
flux doesn't need that many steps tho, i posted a pic of my upscaling workflow further up, maybe you'll find it helpful. just not sure how to get noise injected if you work with ult sd upscale, but it's surely doable. really helps with detail!
>>
File: 1693571260302035.png (1.36 MB, 1024x1024)
>>
>>102074066
I've seen it .. I tried your upscale method, it's okay and fast but it destroys details. It's great for anime, it is not practical for my purposes right now. I want extreme details, not a rough upscale with flat surfaces.
>>
>>102074058
I agree with that, I like Alex Jones, but that meme will never not be funny not gonna lie
>>
File: 00003-4140678217.jpg (234 KB, 1040x1360)
>>102073868
Cool style
>>
>>102074076
it's a pipe tho
>>
>>102074133
but is it a pipe, or a drawing? (it's based on an art piece)
>>
Fourth day of asking if anyone has tried single GPU local fine tuning yet.
>>
File: 2024-08-25_00261_.jpg (766 KB, 3840x2160)
as a compromise I could gen .. just at 4k .. hhhhmmmm.. is the quality good enough tho?
>>
File: ComfyUI_Flux_0246.jpg (1.06 MB, 1536x2688)
>>
>>102074189
It's ok for an upscale
>>
>>102074186
ye, OOM
>>
>>102074217
That rough huh? It's really hard to find anyone actually attempting it, even though we know it's possible.
>>
>>102074131
thanks, it's a y2k style
https://civitai.com/models/667307
>>
>>102074255
I hope multigpu finetuning is possible, I have 2 cards so maybe that's enough idk
>>
>>102074276
only sega was that fancy in y2k .. the others needed till like 2003 to catch up .. ah memberberries
>>
>>102074286
It should be possible with 24gb vram 64gb vram so there must be some implementation problem on Kohya.
>>
>>102074325
>64gb vram
I mean system ram
>>
File: KINGOFIMGGEN~2.jpg (3.22 MB, 1792x2304)
Yo
>>
>>102074321
1995 - 2005, the golden age of video games, thank god I lived through that period in time
>>
File: 2024-08-25_00263_.jpg (858 KB, 3840x2160)
>>102074206
idk .. I am not happy. Need more details!

.. need ..

need VRAM ...
>>
>>102074358
Ok
>>
File: 1706931382781786.png (1000 KB, 1024x1024)
>>
>>102074186
I’m still learning to do proper and good porn loras (99% of what’s out there is undercooked lazy slop). Once I have learned I might try.
>>
>>102074423
So you don't get OOM on a single 24gb vram card?
>>
>go to civitai models
>sort by newest
Huh, so I guess Flux was a flavor of the month flop. Okay, next!
>>
>>102074444
Nobody is ooming on a 24gb card while training LoRA now.
>>
>>102074459
oh, the question was about finetuning though, not Loras >>102074186
>>
>>102074452
why, are there only 50 new LoRas a day now?
>>
>>102074452
It's a decent modelo. That much is clear. But it's too vram hungry and it's not as easy to train as SD. It was inevitable that the community was going to rally behind SD in the end. Open Source modelos are all about being trainable.
>>
>>102074444
I don’t. But I don’t want to devote hours to doing anything with it until I understand how to curate and caption the best datasets. Nobody understands or cares, and they just throw shovels of slop images at it hoping it will stick.
I also don’t use kohya or whatever that’s called. I set up my own workflow by looking at that and ai-toolkit.
It’s baffling how relatively simple the code part of this technology is.
>>
>>102074452
there's still a shit ton of flux loras that are coming out these days, what are you talking about Lykon?
>>
>>102074482
absolutely based, finally someone who understands how important quality data is
>>
>>102074452
>>102074472
>>102074476
the heck are you brainlets on about?
>>
>>102074482
you know how to do it best; manually
but you're fucking insane if you do that
>>
File: file.png (2.3 MB, 1000x1000)
>>
>>102074452
every minute a new lora pops up, let the people learn, also how to do loras in kohya
>>
>>102074472
>>102074483
Are you blind? All of that is buzz farming shovelware. You can produce most of the example images with vanilla Flux (and they would probably look better). There are only a handful of meme, style and character loras that are worth it. But the majority of the action continues to happen (and will continue to happen) on SDXL.
>>
File: file.png (306 KB, 400x400)
>>102074540
>You can produce most of the example images with vanilla Flux (and they would probably look better).
>>
>>102074540
you mean porn? ya who cares ..

also I call VRAMlet on you, you probably just cant run flux right
>>
>>102074540
holy cope
>>
>>102074540
everything on civit is buzz-farming slop, that's the business model that keeps the lights on
>>
>>102074514
Of course. Each pic should be chosen manually and captioned manually, or at least read and corrected by hand from an automatic caption. Watermarks removed, resolutions standardized (landscape, portrait and square). Etc. That is the only way to produce anything of quality. Unless we’re talking small style LoRA. That is what datasetting is. What Civitai have brought upon us with their memecoin scam is a disgrace. We will be swimming in AI sewage forever because of them.
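
The "resolutions standardized" step can be sketched as sorting each image into one of three buckets by aspect ratio. The bucket sizes below are assumptions picked for illustration, and `pick_bucket` is a hypothetical helper; cropping or resizing to the bucket size would be a separate step.

```python
import math

# Assumed target resolutions (width, height); adjust to whatever the trainer expects.
BUCKETS = {
    "landscape": (1216, 832),
    "portrait": (832, 1216),
    "square": (1024, 1024),
}


def pick_bucket(width, height, buckets=BUCKETS):
    """Return the name of the bucket whose aspect ratio is closest to the image's.

    Distance is measured on log(width / height) so that 2:1 and 1:2
    are treated as equally far from square.
    """
    ratio = math.log(width / height)
    return min(
        buckets,
        key=lambda name: abs(math.log(buckets[name][0] / buckets[name][1]) - ratio),
    )
```

For example, `pick_bucket(1920, 1080)` picks the landscape bucket and `pick_bucket(738, 1292)` picks portrait.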
>>
>>102074186
I've tried it but it's unbearably slow (15s+/step) on a 4090 and Kohya doesn't have samples so I wasn't going to train for 2 hours to find out if it works.
>>
>>102074559
Sometimes the lights should be turned off (and people lined up and shot).
>>
>>102074573
you know to make a decent finetune you need a shit ton of images right? we're talking hundreds of thousands of them, how are you gonna caption them all by yourself?
>>
>>102074593
>how are you gonna caption them all by yourself?
With a txt editor??
>>
>>102074585
NTA but learn to write basic Python, /g/. It’s trivial to add sampling every X steps. All the code is already written. You just need to assemble it together.
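
A generic sketch of the "sample every X steps" idea. This is not Kohya's actual code: `train_step` and `sample_fn` are hypothetical callables standing in for whatever the trainer already has for one optimization step and for generating a preview image.

```python
def train(num_steps, train_step, sample_fn, sample_every=100):
    """Run train_step for each step and call sample_fn every `sample_every` steps.

    train_step(step) performs one optimization step and returns the loss.
    sample_fn(step) generates and saves preview images for that step.
    Returns the list of losses so the caller can log or plot them.
    """
    losses = []
    for step in range(1, num_steps + 1):
        losses.append(train_step(step))
        if step % sample_every == 0:
            sample_fn(step)
    return losses
```

The only design choice here is keeping sampling behind a callback, so the loop does not need to know how previews are generated.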
>>
>>102074573
Someone (not me) should do something
>>
>>102074610
probably with vim and a happy hacking keyboard while wearing programming socks and a ponytail butt plug .. bro is elite
>>
perhaps mandatory volunteer work, make people identify objects in a photograph when trying to make posts to image forums
that's nuts right?
>>
>>102074627
I'm not going to go into Kohya's dumpster fire of code to add sampling in. Also fuck you, I know how to code. I'm just not going to spend half a day reading their code to monkeypatch bullshit they can't be assed to do themselves.
>>
>>102074593
This. And this is why Flux will not ever take off. Due to its license you can’t even try to fund the development of a finetune. Open source for the win, yay!
Seesh these corpos are so benevolent, releasing free and open source cucked models for us to play with!
>>
File: file.png (2.26 MB, 1000x1000)
>>
Fresh delivery of bread, ready to collect...
>>102074644
>>102074644
>>102074644
>>
>>102074657
You can finetune on Schnell. Distillation is easily removed with a single epoch.
>>
>>102074657
>Seesh these corpos are so benevolent
every company is a "corpo" to you zoomers
>>
>>102074657
>This. And this is why Flux will not ever take off. Due to its license you can’t even try to fund the development of a finetune.
Flux Schnell has a nice licence, but yeah it's a shit model, fuck that
>>
>>102074628
Like I said, I’m making LoRA for now, and trying to grasp what works better. When the time comes I’ll see if it’s feasible to make a finetune or not. With the increased requirements of this model I’m not going to sit around waiting for someone else to do it because it might take a while.
>>102074642
That’s what I did, and it took me less than an hour to understand what pieces to put together.
>fuck them
You’re only hurting yourself by being like this. You think anybody gives a shit whether you get what you want or not? It’s up to you.
>>
>>102074710
Hurting myself? You don't know what my objectives are. More like "please do it for me".
>>
>>102074779
by not inserting so much grain and noise in your upscales

-> set your noise levels to 0.20-0.25 max
>>
>>102074797
Hmm. Denoise is at 0.25 already though. Don't want to turn it higher at risk of butchering details.
Wonder if running the image through a denoiser first would have any effect.
>>
>>102072617
adjust to troonix:
https://desuarchive.org/g/thread/102057280/#102058707
In the example I recommend changing
>Lion
to
>adamw8bit
>Dim 8 alpha 8
to
>dim 32 alpha 32
>epochs 30
to
>epochs 10
after my recent new tests
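
The three changes above, written out as a sketch. The key names are hypothetical (loosely modeled on common trainer options) and the base values are only the ones quoted in the post; the linked example itself is not reproduced here. The one certain bit is that LoRA scaling is usually alpha / dim, so both settings keep the same scale of 1.0.

```python
# Values quoted in the post; key names are assumptions for illustration.
old = {"optimizer": "Lion", "network_dim": 8, "network_alpha": 8, "epochs": 30}
recommended = {"optimizer": "adamw8bit", "network_dim": 32, "network_alpha": 32, "epochs": 10}


def apply_overrides(base, overrides):
    """Return a copy of `base` with `overrides` applied, leaving `base` untouched."""
    merged = dict(base)
    merged.update(overrides)
    return merged


def lora_scale(cfg):
    """Effective LoRA scaling factor, alpha / dim."""
    return cfg["network_alpha"] / cfg["network_dim"]


new = apply_overrides(old, recommended)
print(new, lora_scale(old), lora_scale(new))
```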


