/g/ - Technology


Thread archived.
You cannot reply anymore.




File: ComfyUI_32893_.png (1.52 MB, 1024x1024)
Discussion of free and open source text-to-image models

Previous /ldg/ bread : >>102064103

>Beginner UI
EasyDiffusion: https://easydiffusion.github.io
Fooocus: https://github.com/lllyasviel/fooocus
Metastable: https://metastable.studio

>Advanced UI
Automatic1111: https://github.com/automatic1111/stable-diffusion-webui
ComfyUI: https://github.com/comfyanonymous/ComfyUI
Forge: https://github.com/lllyasviel/stable-diffusion-webui-forge
InvokeAI: https://github.com/invoke-ai/InvokeAI
SD.Next: https://github.com/vladmandic/automatic
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI

>Use a VAE if your images look washed out
https://rentry.org/sdvae

>Model Ranking
https://imgsys.org/rankings

>Models, LoRAs & training
https://civitai.com
https://huggingface.co
https://aitracker.art
https://github.com/Nerogar/OneTrainer
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts

>Flux
https://huggingface.co/spaces/black-forest-labs/FLUX.1-schnell
https://comfyanonymous.github.io/ComfyUI_examples/flux

>Pixart Sigma & Hunyuan DIT
https://huggingface.co/spaces/PixArt-alpha/PixArt-Sigma
https://huggingface.co/spaces/Tencent-Hunyuan/HunyuanDiT
https://huggingface.co/comfyanonymous/hunyuan_dit_comfyui
Nodes: https://github.com/city96/ComfyUI_ExtraModels

>Index of guides and other tools
https://rentry.org/sdg-link
https://rentry.org/rentrysd

>GPU performance
https://vladmandic.github.io/sd-extension-system-info/pages/benchmark.html
https://docs.getgrist.com/3mjouqRSdkBY/sdperformance

>Try online without registration
txt2img: https://www.mage.space
img2img: https://huggingface.co/spaces/huggingface/diffuse-the-rest
sd3: https://huggingface.co/spaces/stabilityai/stable-diffusion-3-medium

>Maintain thread quality
https://rentry.org/debo

>Related boards
>>>/g/sdg
>>>/h/hdg
>>>/e/edg
>>>/d/ddg
>>>/b/degen
>>>/vt/vtai
>>>/aco/sdg
>>>/trash/sdg
>>
>no collage
>comfy baked
what is happening
>>
finally a kino bake
>>
imagine not having to go to work, you wake up at 10am since you slept late, you drink triple shot cold brew drinks back to back, have a bowl and then come back, smoke a cig while genning on multiple computers, jacking off on the side, listening to music, going on cam and just hanging out, then have a few beers, watch a few movies and a tv show, eat pizza and then go to sleep at like 12am, wake up and do it again the next day, that's the life /g/
>>
Blessed thread of frenship
>>
>>102067546
That's a sad and miserable life, to each their own I guess.
>>
File: 00076-2608166309.jpg (423 KB, 1536x2040)
oldie
>>
>>102067561
the only miserable life is the WageSlave life
https://www.youtube.com/watch?t=15&v=ADvJ8p8yY-c&feature=youtu.be
>>
File: ComfyUI_32896_.png (1.17 MB, 1024x1024)
>>
File: Blank 8 Grids Collage (1).jpg (478 KB, 1999x1999)
>no collage
>>
>>102067596
Very nice I missed some of these. Ty anon.
>>
File: 2024-08-25_00090_.png (1.89 MB, 832x1216)
>>
anyone else notice how those with elaborate workflows never actually post anything interesting? it's always shit, they have these massive workflows that are all connected but the gens they produce look like a 5 year old drew them, stupid lol.
>>
File: 1712783808940467.png (84 KB, 859x404)
Retard here, what "model.safetensors" is the local JoyCaption guide referring to?
>>
File: FD_00057_.png (1.14 MB, 768x1344)
I think you only need the image_adapter.pt
Are you getting an error when you run it?
>>
File: 2024-08-25_00024_.png (1.31 MB, 1280x720)
>>102067654
two actually.. the google clip model
>https://huggingface.co/google/siglip-so400m-patch14-384/tree/main
and the llama 8b llm model
>https://huggingface.co/unsloth/Meta-Llama-3.1-8B-bnb-4bit/tree/main
you need all files in these repositories in their separate folders, put em into the JoyCaption folder and add the folder names with Path("foldername")
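For anyone tripping on this step, here's a tiny stdlib-only sanity check (the folder names are just examples; substitute whatever you actually passed to Path("foldername")):

```python
from pathlib import Path

# Example folder names -- substitute whatever you named the two
# downloaded repos inside your JoyCaption directory.
CLIP_DIR = Path("siglip-so400m-patch14-384")
LLM_DIR = Path("Meta-Llama-3.1-8B-bnb-4bit")

def check_model_dirs(base: Path) -> list[str]:
    """Return a list of problems; empty means both folders look usable."""
    problems = []
    for d in (CLIP_DIR, LLM_DIR):
        full = base / d
        if not full.is_dir():
            problems.append(f"missing folder: {full}")
        elif not any(full.iterdir()):
            problems.append(f"folder is empty: {full}")
    return problems

if __name__ == "__main__":
    for p in check_model_dirs(Path(".")):
        print(p)
```

Run it from inside the JoyCaption folder; if it prints nothing, the files are at least where you think they are.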
>>
>>102066669
Wasn't a shitpost btw, it really did solve the squares issue.
>>
>>102067654
I edited the post to remove that bit of the picture, ignore it.
>>
File: ComfyUI_32901_.png (1.31 MB, 1024x1024)
>>
>>102067654
I think you can ignore that, I believe it's unrelated
>>
File: 00138-486044689.png (2.21 MB, 1024x1528)
>>
File: ComfyUI_temp_soiux_00011_.png (1.22 MB, 1024x1024)
experimenting with lora training now that I got kohya set up again, running batches of 2000 steps but man some of the outputs look pretty good at only a few hundred steps. This is 600. I guess it's just a lot of trial and error and cherry picking the best one after the run
>>
File: file.jpg (93 KB, 768x768)
>>
File: ComfyUI_32902_.png (1.73 MB, 1024x1024)
>>
>>102067654
Don't listen to the other retards it's
>siglip-so400m-patch14-384
You can find the link in the chink part of JoyCaption github
>>
>>102067764
What are you talking about? That's a different step.
>>
>>102067681
Cool
>>
File: ComfyUI_04547_.png (1.44 MB, 1024x960)
>>102067686
>>
>>102067768
It's the model he's asking about, which you literally need for it to work.
>>
File: ComfyUI_04530_.png (1.56 MB, 1024x960)
>>
https://civitai.com/models/683222/spirited-away-ghibli-style-for-flux
>This was an experiment with LoKr for Flux. My best friend in the entire world bghira made this really cool optimizer called adamw schedulefree that's kinda like prodigy but doesn't suck for flux.
Hmm...
>>
File: ComfyUI_32904_.png (1.71 MB, 1024x1024)
>>
File: FD_00060_.png (1.46 MB, 1152x896)
>>102067780
Someone tipped me on civit and I found his LoRA, it's pretty slick.
https://civitai.com/models/677322/zx-spectrum-loading-screen-flux?modelVersionId=758196
>>
>>102067488
Why are LoRa layer names for Flux so inconsistent?
How are devs reliably mapping them to their correct layers in the models themselves?
>>
File: ComfyUI_32905_.png (1.81 MB, 1024x1024)
>>
File: 1718993410248034.png (1.83 MB, 1024x1024)
>>
>>102067805
looks like shit, just generic flux with some chinese buildings thrown in
>>
File: ComfyUI_01024_1.png (1.25 MB, 898x1096)
>>
File: 00152-2969836414.png (2.7 MB, 1728x1344)
>>
File: ComfyUI_00866_.png (1.2 MB, 1024x1024)
And now I'm back to building Datasets for my next LoRa.
>>
>>102067910
gotta keep that buzz train rollin
>>
>>102067737
>>102067746
Thank you
>>102067792
Got it in an earlier step, should be good
>>
File: FD_u_00034_.jpg (477 KB, 2048x2048)
>>102067917
I feel like if I train a specific fetish LoRA I will have infinite buzz, but I don't want to be associated with that kind of degeneracy.
>>
>>102067948
>Got it in an earlier step
which step? I have no idea how this works, I just compiled it because the only instructions I could find were in Mandarin.
>>
>>102067968
>>102067948
Oh, in the clip step. Yeah I originally put it in the wrong folder because I am retarded. Anyway, edited the pic to avoid confusion.
>>
File: ComfyUI_00867_.png (1.27 MB, 1024x1024)
>>102067917
This time I will go for a larger dataset of 101 pictures.
My new LoRa is even more niche than Kasia but no LoRa for this exists yet, so it's uncharted territory.
Nobody's penis is ready for what I'm crafting rn.
>>
File: aBn1pA2_460s.jpg (40 KB, 460x595)
>>102067634
Reminds me of this
>>
>>102067654
nta but how do I make this use a different llm model? I have copied an uncensored one to the LLM folder but the node doesn't see it
>>
>>
>>102067981
Oddly I found that my LoRA worked better with 30 images vs 150. But if yours works well I'm interested in knowing the settings you used. I was training a style so it may not be a 1:1 comparison though.
>>
File: ComfyUI_32911_.png (1.21 MB, 1024x1024)
>>
>>102068011
it needs to have an image adapter
they're not universal
>>
>>102068003
A tale as old as time, retards are impressed by convoluted shit, and geniuses are impressed by simplicity
>>
>>102068003
I like to keep mine as clean as possible, I only add the nodes I need to complete the task. I have several workflows for different things
>>
>>102068027
I heard that for a style you need more images and for people you need fewer, but overall the more you have the better.

this time I plan to go for a higher learning rate and around 5000 to 6000 steps.
>>
File: file.png (2.03 MB, 1000x1000)
>>
>>102068003
to be fair the second one is 100% valid if the bool is nullable thougheverbeit
>>
File: 00126-3553446995.png (2.1 MB, 1024x1440)
>>
>>102068054
That was my understanding too, so I really don't understand why my smaller dataset worked better. The images were a bit better since they were a handpicked subset of my larger dataset, but I don't think it should have been enough to make up for the increased variety. I can't say I have a great grasp of the settings and all that so maybe I did something wrong.
>>
File: 1706767215474682.png (26 KB, 412x239)
Do I need to actually put something here? Is this where I should put a trigger word?
>>
>>102068102
It's a trigger to tell the LLM what to do. You don't actually need to put anything there, but you can alter the prompt, like "a descriptive caption for this image of Hatsune Miku" for example, so it tags everything with her name.
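If you script your captioning, that prompt is just a string you template the subject into. A minimal sketch (the wording is illustrative, not the node's exact default):

```python
def caption_prompt(subject: str = "") -> str:
    """Build the instruction handed to the captioning LLM.

    If a subject (e.g. a character name you want in every caption)
    is given, it is baked into the instruction; otherwise a generic
    instruction is returned.
    """
    if subject:
        return f"A descriptive caption for this image of {subject}"
    return "A descriptive caption for this image"
```

So caption_prompt("Hatsune Miku") gives you an instruction that makes every caption mention her by name.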
>>
>>102068102
good question, it doesn't seem to matter
I think it only has one purpose
>>
>>102068035
Oh I see. Thanks
>>
>>102068123
>>102068127
I think it's important though, if you find the right prompt maybe it'll stop adding some verbose useless fluff shit like "This setting adds to the ethereal atmosphere surrounding..."
>>
>>102068148
if you know how then by all means share
>>
File: 1715618136900250.png (61 KB, 1209x573)
>>102068123
>>102068127
Thanks. New error, a quick google is telling me my model might be fucked?
>>
>>102068158
Try something like "A detailed caption of this image, provide only the details that can be seen"
No idea if this will work but it's how I was using copilot to caption before joycaption
>>
>>102068127
I tried to guide it to use a certain phrase but it only listened pretty rarely. I noticed it often put the phrase near the end and changed it slightly, so I'm going to wildly speculate that rep pen is too high by default.
>>
>>102068176
What format are your images in?
Try reinstall the llama model?
>>
>>102068176
How much VRAM do you have btw?
>>
>>102068202
JPG
I'll try
>>102068211
12GB. I tried the 4bit model as well.
>>
>>102068215
>12GB
should be fine, the 4bit model only uses 9gb.
It's entirely possible I missed something in my instructions, I will re-check my resources
>>
everyone on 4chan is a llm based on llama
>>
>>
File: dependencies.jpg (101 KB, 1300x583)
>>102068176
You might need to do this, but pretty sure I never did it because I didn't understand the code
>>
>>102068272
lol
>>
>>102068272
Where is that?
>>
>>102068281
That it is...
>>
File: 00005-3713558786.png (1.07 MB, 896x1152)
So flux is the best local diffusion right now?
Is there any ARC770 bro running flux? What are the best settings for fastest speed? I'm getting 3.81s/it for 896x1152
>>
File: ComfyUI_06868_.png (1.92 MB, 1024x1536)
>*casually strolls up to you*
what would you do?

i'm having so much fun with flux kek
>>
File: ComfyUI_32912_.png (1.47 MB, 1024x1024)
>>
File: ComfyUI_32913_.png (1.32 MB, 1024x1024)
>>
File: ComfyUI_32916_.png (1.05 MB, 1024x1024)
>>
File: 00139.png (1.5 MB, 1024x1024)
>>102068003
you have no idea how complex my workflow is, I use 4 different sampling methods, two tiled upscales, three llm plugins, and a custom clip encoder.
>>
>>
File: 02052.jpg (3.09 MB, 4096x4096)
>>102068328
why is the face blurred?
>>
>>102068362
because it's literally (me), i've just been fiddling around with inpaint/loras and seeing what sorts of things are possible
>>
>>
File: ComfyUI_32919_.png (1.48 MB, 1024x1024)
>>
>>102068370
>because it's literally (me)
oof... you could have just lied and said anything else lmao
>>
>>102068295
https://github.com/StartHua/Comfyui_CXH_joy_caption?tab=readme-ov-file
>>
>>102068412
eh, i just thought it'd be a fun thing to play around with
lora training (or more specifically, dataset creation) is an absolute pain though
hopefully a proper inpaint model will come out at some point, even just that one pic was incredibly tedious to get looking right
>>
File: file.png (1.73 MB, 1000x1000)
Ehh that's not bad at all
https://civitai.com/models/433138/ghibli-style-flux-and-pdxl
>>
>>102068414
Thanks, tried it but still getting the same error
>>
File: ComfyUI_32924_.png (997 KB, 1024x1024)
>>
>>102067546
What provides the income to live this lifestyle?
>>
File: ComfyUI_32925_.png (1.1 MB, 1024x1024)
>>
>>102068561
Patreon with 12000 followers.
>>
File: file.png (1.75 MB, 1000x1000)
>>102068488
>>
>>102068508
hmm, I don't know what I missed. Can you share your workflow? It works on my machine so maybe there's something in the workflow that stands out
>>
File: 1700454868162237.png (105 KB, 1701x471)
>>102068623
Just using the exact script provided
>>
File: ComfyUI_01458_.png (985 KB, 1024x1024)
>>102068633
What's with the date? You are supposed to tell the LLM what to do
>>
File: 00270-381300593.jpg (38 KB, 297x446)
>>
File: file.png (1.77 MB, 1000x1000)
>>102068488
>>
File: ComfyUI_01460_.png (885 KB, 768x1024)
>>102068633
Also what's the error?
>>
>>102068648
I thought from the previous conversation it was questionable whether or not that field did anything, but on the off chance it works like a trigger word, I want a unique dated one specific to the model I train. In any case, "A descriptive caption for this image" does nothing to fix the error
>>
File: 1720685650386128.png (82 KB, 1256x701)
>>102068662
>>
>>102068665
Oh it needs an instruction still, "a detailed caption of this image" is the bare minimum. I meant you don't need to change this.
>>
File: ComfyUI_01459_.png (1018 KB, 1024x1024)
>>102068671
I could be wrong but I think I had a similar issue before, let me check in the archive for a second
>>
File: file.png (1.71 MB, 1000x1000)
>>
File: flux_00951_.png (1.43 MB, 1240x1024)
>>
File: ComfyUI_00872_.png (1.19 MB, 1024x1024)
>>
>>102068699
Hnnnnnngggg... catbox?
>>
File: file.png (2 MB, 1000x1000)
The detail at far distances (this crow for example) is quite impressive, that's where you can see the 16ch VAE shine, but I feel like it can be better, a 32ch VAE would be the sweet spot imo
>>
Since switching to Kohya from AI Toolkit my LoRAs have become shittier, and I don't know if it's a result of me being more ambitious with my LoRAs or the wonky settings of Kohya.
>>
File: ComfyUI_01457_.png (856 KB, 768x1024)
>>102068671
>>102068730
>https://desuarchive.org/g/thread/101996391/#q101999327
Yep, the same issue. I later found out that it somehow fails to manage memory correctly, which causes that error, and I fixed it.
Just replace the Joy_caption_node.py in
ComfyUI\custom_nodes\Comfyui_CXH_joy_caption
with this
https://files.catbox.moe/vmfvj3.py
Make sure to rename it ofc.
>>
>>102067488
>>102066659 #
> The challenge now is to define codes of behavior without throwing out the maps that got us to the place we are now. This may be particularly painful in the arts world.
> maps
> By that standard, the museum might have to include hundreds, if not thousands, of warning labels, and not just for works made by heterosexual men with an erotic interest in girls.
>>
>>102068718
>mortal coombait
>>
>>102068764
Interestingly I never had to do this. I wonder what the difference is between our setups
>>
So do unique activation phrases and keywords even work if the text encoder isn't being trained?

Like if I have owhx as the activation phrase, wouldn't it end up doing nothing?
>>
>>102068764
Unfortunately still not working, but thanks
>>
File: 00099-386408537.jpg (576 KB, 1536x2064)
oldie.
today i learned that my sad stalker(s) care a lot about which threads/boards i post on, and, i find this amusing, they apparently monitor degen, on watch for any possible posts by moi. it is confusing behavior to me. not sure why they care so much. i have bad memory issues, and i wonder if i did something in particular that triggered such seething hatred, that i have long forgotten. thanks for reading. and id honestly like to know why they like to harass me so.
>>
>>102068808
Dunno either, but I had enough VRAM/RAM to spare and still got an error, and changing the way the clearing is done helped in my case.
>>102068823
Welp sorry, some of us always get the short straw with these issues
>>
File: flux_00895_.png (1.15 MB, 1240x1024)
I genned so many of these today. Only a few turned out alright though.
>>
File: 00165-2249950324.png (2.09 MB, 1024x1440)
boat time
>>
>>102068839
What is your style based off of?
>>
>>102068327
Basically the fastest way to run it is using a quantized model, q8 is most accurate https://github.com/city96/ComfyUI-GGUF
>>
File: ComfyUI_32931_.png (1.58 MB, 1024x1024)
>>
>>102068327
>ARC770
I hope intel releases more and better GPUs
>>
Can Flux run on a 4070 yet, or is it still for 24GB cards? Also, A1111 or nah?
>>
File: ComfyUI_32932_.png (1.49 MB, 1024x1024)
>>
File: ComfyUI_01491_.png (800 KB, 768x1024)
>>102068926
Have you been under a rock?
>>
File: ComfyUI_01495_.png (813 KB, 768x1024)
>>
File: flux_cyber-env13.jpg (2.68 MB, 2080x2720)
>>102067488
Hey OP that image is phenomenal. Super great love the colors
>>
>>102068926
I'm running it on an 8GB 3xxx series laptop card dude
>>
>>102068812
The T5 doesn't need to be trained, it encodes any prompt it is given into something that can train the diffuser. It's not like how SAI was where the text encoder simply wouldn't function without training. The T5 is like the VAE inasmuch as it can convert any string of text into a text embedding.
>>
>>102068971
>running
Running sounds like a strong word... limping?
>>
File: ComfyUI_01486_.png (934 KB, 768x1024)
>>
>>102068976
Alright, it just feels like my trigger words aren't doing shit, even if the subject of each trigger is very distinct
>>
>>102068977
Do you have a point? I'm genning nice pics and enjoying myself
>>
>>102068990
It still takes many iterations for the model to learn that Sally Joe refers to the person in the picture. The text encoder doesn't know what Sally Joe is, it just turns it into numbers. The model doesn't know what Sally Joe is either, it's just guessing based on a game of hot and cold.
>>
File: ComfyUI_01484_.png (854 KB, 768x1024)
>>
>>102069007
based
>>
>>102069008
I guess that's not true, the T5 likely knows that Sally Joe has semantic meaning of a girl's name.
>>
File: ComfyUI_01470_.png (870 KB, 768x1024)
>>
>>102068959
kek
>>
File: flux_00964_.png (1.42 MB, 1240x1024)
>>
>>102068980
gonna break that mirror
>>
>>102068926
>Can Flux run on a 4070 yet, or is it still for 24GB cards?
if you go for Q8_0 (the quant closest to fp16) you can run it on your 16GB vram card easily
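Back-of-envelope numbers behind that claim (parameter count is approximate, and real usage adds the T5/CLIP encoders, VAE and activations on top of the weights):

```python
def model_vram_gb(params: float, bits_per_weight: float) -> float:
    """Rough memory footprint of the weights alone, in GB."""
    return params * bits_per_weight / 8 / 1e9

FLUX_PARAMS = 12e9  # Flux.1-dev is roughly 12B parameters

fp16_gb = model_vram_gb(FLUX_PARAMS, 16)   # full precision weights
q8_0_gb = model_vram_gb(FLUX_PARAMS, 8.5)  # Q8_0: 8-bit weights plus per-block fp16 scales
```

So fp16 weights alone are around 24 GB while Q8_0 lands under 13 GB, which is why the fp16 checkpoint spills out of a 16GB card but the Q8_0 quant fits.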
>>
File: ComfyUI_01462_.png (829 KB, 768x1024)
>>102069032
Girls can't camp, that's obvious.
>>
>>102068926
It runs on my 2060 12GB
>>
File: file.png (1.65 MB, 1000x1000)
>>
>>102069008
It seems like the model is deep frying before it ever learns, so I'd say it's an issue.
>>
>>102069070
If you're deep frying Flux you're training at way too high a learning rate. Start around 2e-4 or 1e-4
>>
>>102069081
I do, stop assuming I'm doing something wrong and maybe acknowledge the fact that keywords are not are barely not even working.
>>
>>102067546
>no bad feels
>laugh laugh laugh
>happy happy
>no cry
Brutal lmao
>>
>>102069101
Anon, you are doing something wrong and you're a fucking moron. You're doing something stupid and fucking wrong. I hope this helps. Show your sample history.
>>
>>102068907
Already using quantized version, I just want to ask if anyone using arc770 has better speed
>>
Does anyone know how to trick Flux into generating a first-person shot like pic related? It keeps giving me a third-person shot instead.
>>
>>102069101
>keywords are not are barely not even working
You can't even put together a coherent sentence and you want us to trust there's no possible chance that you're fucking up training an AI model?
>>
>>102069115
How about you show a functioning multi subject LoRA for flux?
>>
>>102069133
That's a low blow. I meant to say they are not or only barely working.
>>
>>102069133
oh hi debo
>>
>>102069148
YOU DO NOT NEED TO TRAIN THE T5. Whatever is going wrong is either in the shitty images you're using, the shitty captions you're using, or the shitty training settings you're using. It's like blaming the VAE for why your outputs look like shit and saying "maybe I need to train the VAE for my Lora". There are hundreds of Loras for Flux now, none of which trained the T5, and I guarantee most of them are more complicated than what you're doing.
>>
>>102064177
>well its official, I was still using my 4070ti to gen and a rented 4090 to train lora on, but now ill be using it to gen too, fuck sharing my computer to do these tasks and holy shit never knew just how much faster a 4090 was in genning compared to 4070ti, its not even annoying anymore. just gotta pip install gdown and download your lora, everything else is built in including the manager and all the models.

Can you elaborate? What service has all the models and everything else build in?
Runpod and which community template?
>>
File: 1704966594874874.png (24 KB, 679x77)
>>
>>102069171
Calm down dude. Even on the same seed, changing the keyword will add like a smudge to the subject's T-shirt.
Can you show me an example of a functioning multi concept LoRA for flux so I can learn from it or are you just gonna troon out?
>>
>>102069123
"pov"
>>
File: liter flux dev 2.jpg (368 KB, 1842x1163)
>>102068812

I just used pure danbooru tags. Tried it with JoyCaption, but I find that more words just fuck shit up and don't help at all in character replication LoRas.
>>
>>102069194
Go ahead and train the T5 lmao
>>
>>102068947
i have not been here in a while, nah lmao
>>
File: liter flux dev.jpg (234 KB, 1547x1024)
>>102068812

Oh, and activation tokens work in Flux. A single 16 dim/alpha fits 3 alternate costumes successfully. I do not know yet the limit of how many alternate looks you can cram into one LoRa at 16 dim/alpha.
>>
File: ComfyUI_32938_.png (3.72 MB, 1536x1536)
>>102068960
:3
>>
File: ComfyUI_32934_.png (1.86 MB, 1024x1024)
>>
File: ComfyUI_32937_.png (1.26 MB, 1024x1024)
>>
File: 00189-970635511.png (1.83 MB, 1024x1440)
looks too much like pit hair, sigh
>>
File: 00353-2243818052.png (506 KB, 512x680)
good ol vqgan lora.
>>
Has anyone shown that JoyCaption/LLaMA is actually better when it comes to captioning than other systems? The amount of boomer rambling it produces has me suspicious...
>>
>>102069360
Are the captions it puts out accurate to what's in the picture? If yes, then it's fine.
>>
>>102069357
looks like a puzzle assembled in the wrong way kek
>>
File: ComfyUI_32939_.png (1.4 MB, 1024x1024)
>>
File: 1702750650010102.png (273 KB, 1552x900)
>>102069396
So is CLIP Interrogator though
>>
File: ComfyUI_32940_.png (1.55 MB, 1024x1024)
>>
File: ComfyUI_32941_.png (1.37 MB, 1024x1024)
>>
File: ComfyUI_32943_.png (1.44 MB, 1024x1024)
>>
>>102069411
mm yes, beautiful code indeed
>>
previous:
https://desuarchive.org/g/thread/102052110/#102053468

ibuki satsuki 12gb lora anon here with a progress report

thank you so much to the anon who said 32 dim trains faster, wtf, it absolutely does, look at this shit. it understands the style decently by EPOCH FUCKING 2! even then, epoch 1 is already progressing faster than any of my other training runs. wtf vramlets we're so back! I've been doing 24 hour training runs to hit higher epochs, I'd be ok cutting this off by 10. might even be able to get away with trying a 4e-4 lr for a little extra boost. anon who gave me the 32 dim tip, if I was a virgin, I'd give you my virginity - bless you
>pic rel is 20 epochs of 1024x1024 with 3e-4 lr, adamw8bit/cosine with 32 dim, 32 alpha
(random seeds)

fuck I'm horny for this, even with my 12gb of vram it won't take more than overnight to train loras to a point I'm satisfied with
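For planning these overnight runs, total optimizer steps in a kohya-style setup are just images × repeats × epochs / batch size; a throwaway helper (the dataset numbers in the usage example are made up):

```python
import math

def total_steps(images: int, repeats: int, epochs: int, batch_size: int = 1) -> int:
    """Optimizer steps for a kohya-style run: every epoch sees each
    image `repeats` times, grouped into batches (last batch rounded up)."""
    return math.ceil(images * repeats / batch_size) * epochs
```

e.g. 50 images at 5 repeats for 20 epochs at batch 1 is 5000 steps, so the s/it from your first epoch tells you right away whether the run fits overnight.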
>>
>>102069427
JoyCaption outputs have to be fixed too
>>
>>102069436
Well that same anon that told you about higher dims learning faster is being called a retard right now because he suspects activation phrases might not be working properly :(
>>
>>102069438
lol fair, and honestly easier to fix a few tags or phrases than a whole block of fiction
>>
File: 00362-4254531756.png (1.8 MB, 1024x1024)
forgive.
>>
>>102069455
sorry anon, I haven't tried any activation words whatsoever since I haven't tested anything but style so far. has no one done a character lora with multiple outfits or something yet that we can reference?
>>
>training 1girl anime flux lora
>man appears in sample images

Anyone know why this happens?
>>
>>102069510
It could be so many things, but it's hard to give you any constructive advice without seeing the dataset and the settings.
>>
>>102069455
Do activation phrases even work without training the text encoder (unless that anon is training the text encoder)?

I've trained at a dims of 8 and 2. The activation words had no effect even when weighted very high.
>>
File: jobukofire.jpg (552 KB, 1662x966)
>>102069038
Neither can anons.
>>
File: ComfyUI_32950_.png (1.47 MB, 1024x1024)
>>
>>102069528
That's what I want to know, but when I brought the subject up I was mercilessly lambasted for even asking.
As far as I can tell, even when concepts are cleanly separated into their own class tokens, there is massive concept bleed, and using the tokens themselves makes little difference on the same seed.
I've tried this at 500 steps, 4000 steps and 8000 steps and it still isn't reacting to the class token.
So that's why I even brought it up.
>>
File: ComfyUI_32947_.png (3.68 MB, 1536x1536)
>>
File: ComfyUI_32948_.png (1.54 MB, 1024x1024)
>>
>>102069510
These are neural networks that work like rivers flowing data around; as you train, you are damming up rivers and making new flows, which means things that were repressed in the network start activating. All training sets up a goal of the model replicating images from your training dataset; the neural network adapts to achieve that goal, and that means things can and will unexpectedly change.
>>
File: ComfyUI_32949_.png (1.67 MB, 1024x1024)
>>
>>102069551
The N64 Lora uses an activation token, you're simply doing things wrong.
>>
What the chunk is joycaption
>>
>>102069551
Do you put the activation words at the top of your description?
>>
File: liter flux dev 16 dimm.jpg (485 KB, 1024x3072)
>>102069481
Activation tokens work. Single 16 dim/alpha. Each costume is separated into its own folder: 5_conceptA, 5_conceptB, 5_conceptC. All pics are prefixed with their activation token. I'll try 32 dim/alpha next time since you mentioned it trains faster. 16 dim/alpha takes forever to get the small accessories and trinkets somewhere close.
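That folder layout is easy to script. A sketch of building the `N_concept` dirs and prefixing caption files with their activation token (folder and token names here are placeholders):

```python
from pathlib import Path

def make_concept_dirs(root: Path, repeats: int, tokens: dict[str, str]) -> list[Path]:
    """Create kohya-style dataset folders like `5_conceptA`, one per concept."""
    made = []
    for concept in tokens:
        d = root / f"{repeats}_{concept}"
        d.mkdir(parents=True, exist_ok=True)
        made.append(d)
    return made

def prefix_captions(folder: Path, token: str) -> int:
    """Prepend `token, ` to every caption .txt that doesn't already start with it."""
    changed = 0
    for txt in sorted(folder.glob("*.txt")):
        caption = txt.read_text().strip()
        if not caption.startswith(token):
            txt.write_text(f"{token}, {caption}\n")
            changed += 1
    return changed
```

`tokens` maps concept folder suffix to its activation token, e.g. {"conceptA": "zxcA"}; run prefix_captions once per folder after tagging.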
>>
File: fs_0090.jpg (89 KB, 1024x1024)
>>
>>102069590
Did you also caption the images or just put them in their respective concept folder.
>>
>>102069590
good to see it works, I wonder why its not working in these >>102069551 >>102069528 anons cases.. is everyone training with different trainers? maybe it isn't functioning correctly in one of them?
>>
File: ComfyUI_32956_.png (1.69 MB, 1024x1024)
>>
>>102069590
Could it be that my activation token was too generic? Do your captions also contain the activation phrase or are they just in their folder?
>>
File: ComfyUI_32957_.png (1.54 MB, 1024x1024)
>>
File: ComfyUI_05047_.png (1.94 MB, 1200x1200)
>>102069616

100% danbooru-style tagging using a WD tagger. The resulting LoRa is receptive enough to English prompting. Earlier attempts with JoyCaption just output inaccurate trash. JoyCaption captions of the dataset are not good for me.
>>
>>102069642
>>102069616
are you guys using the --keep_tokens arg or just putting them at the start of the caption? when captions aren't shuffled I can't imagine why the --keep_tokens arg would do anything, but.. maybe.. for some reason beyond my understanding..?
>>
File: ComfyUI_01225_.png (957 KB, 1280x720)
>>102069664
>>
File: ComfyUI_32958_.png (1.76 MB, 1024x1024)
>>
>>102069664

I might just go back to the WD tagger tbh, joy caption is just faggifying the outputs from the clip model anyway.
>>
>>102067488
What are the options to transfer the style of an image?
>>
>>102069721
A LoRA if you're serious about it, an IP adaptor if you're a jeet.
>>
>>102069721
1.5 and SDXL both have IPAdapters that transfer style incredibly well.

For Flux there's nothing reliable yet, the only IPAdapter so far is garbage. So you're stuck just trying img2img at high denoise, which is a bit of a crapshoot, sometimes works at copying style and sometimes doesn't.
>>
File: ComfyUI_32959_.png (1.55 MB, 1024x1024)
>>
>>102069590
How close are the outputs compared to your training images?
>>
File: ComfyUI_32960_.png (1.49 MB, 1024x1024)
>>
File: ComfyUI_32961_.png (1.58 MB, 1024x1024)
>>
https://futurism.com/the-byte/man-arrested-csam-ai
>>
File: ComfyUI_32963_.png (1.61 MB, 1024x1024)
>>
File: ComfyUI_32964_.png (1.39 MB, 1024x1024)
>>
File: ComfyUI_01510_.png (952 KB, 1024x768)
>>
File: liter sunflower.jpg (393 KB, 3072x1024)
>>102069753

Near carbon copies. Still lost some details on the smaller items. I am trying to figure out how to retain minute details. It responds well to English prompts. Shit is like magic.

Anime girl girl standing. She stands one leg up. She says "Flux Dev" with a white speech bubble.
>>
File: 00010-2283591083.png (2.22 MB, 1024x1440)
>>
File: fs_0110.jpg (195 KB, 1024x1024)
>>
>>102069894
>Near carbon copies.
Don't take this as an attack, but can it produce images distinct from the training data?
>>
File: ComfyUI_01045_.png (1.4 MB, 896x1152)
1.4 MB
1.4 MB PNG
holy flux
>>
>>102069872
Kek
>>
File: ComfyUI_32966_.png (1.22 MB, 1024x1024)
1.22 MB
1.22 MB PNG
>>
File: 57728.jpg (481 KB, 1440x3120)
481 KB
481 KB JPG
so i take it d'bo got pounced by the cat... again? sad. no chill
>>
Is it possible to use gguf models for training flux?
>>
File: ComfyUI_01046_.png (1.18 MB, 896x1152)
1.18 MB
1.18 MB PNG
>>
File: ComfyUI_32968_.png (1.62 MB, 1024x1024)
1.62 MB
1.62 MB PNG
>>
>>102069954
some of the trainers have quantization built in, but not kohya's (he keeps saying he thinks 8-bit is good enough, so go on the GitHub issues and request quant support if you want it there)
>>
>>102069954
Not at the moment and I'm not sure it's possible desu
>>
File: ComfyUI_32969_.png (1.68 MB, 1024x1024)
1.68 MB
1.68 MB PNG
>>
File: ComfyUI_01468_.png (888 KB, 768x1024)
888 KB
888 KB PNG
>>102069999
>check em
>>
File: ComfyUI_05212_.png (843 KB, 1024x1024)
843 KB
843 KB PNG
>>102069912
Yes. However, my dataset is actually very large, fan art included. I am only posting official art poses so far. You can change various clothing items/locations/items/etc. However, that's not the goal. Why do a LoRA if you want to deviate far, far from your target?
>>
File: ComfyUI_32970_.png (1.34 MB, 1024x1024)
1.34 MB
1.34 MB PNG
>>
>>102070011
>Why do a LoRA if you want to deviate far, far from your target?
Fair enough, I usually benchmark how good a LoRA is by how well it can replicate the subject while also putting them in novel poses and situations that don't appear in the dataset.
>>
File: 57731.jpg (519 KB, 1440x3120)
519 KB
519 KB JPG
why do anything?
>>
File: ComfyUI_32972_.png (1.61 MB, 1024x1024)
1.61 MB
1.61 MB PNG
>>
File: ComfyUI_32973_.png (1.55 MB, 1024x1024)
1.55 MB
1.55 MB PNG
>>
>>102070067
why not?
>>
>>102069553
>>102069557
>>102069562
>>102069632
>>102069653
>>102069689
>>102069752
>>102069759
>>102069774
>>102069818
>>102069844
>>102069941
>>102069960
>>102070005
>>102070094
could you pick, say, 1 in 20 and post that? You don't need to spam this thread with your low-effort gens. At least upscale, inpaint, or make the smallest effort, and stop posting every single thing that gets output.
>>
File: 1710553617305152.png (1016 KB, 896x1152)
1016 KB
1016 KB PNG
forge seems much faster with a lora, what changed? is it because im using q8 in comfy? using flux dev fp8 + clip/encoder in forge (testing both)
>>
File: ComfyUI_01053_.png (1.51 MB, 896x1152)
1.51 MB
1.51 MB PNG
>>
>>
>>102070150
thread policeman
>>
File: grid-0003.jpg (2.5 MB, 3793x4000)
2.5 MB
2.5 MB JPG
sup, my gens suck. have a nice day.
>>
>>102070193
We just want to see a variety of things, not the same shit posted a billion times.
>>
>>102070150
You're right, but you don't need to be such an asshat about it.
>>
File: 57733.jpg (593 KB, 1440x3120)
593 KB
593 KB JPG
>>102070208
nah these are pretty good. cut it out with the pathetic millennial coded performative self deprecation. it's stale, just be authentic.
>>
File: ComfyUI_01055_.png (1.48 MB, 896x1152)
1.48 MB
1.48 MB PNG
>>
>>102070238
damn that is on the nose. My man.
>>
File: Untitled.png (6 KB, 289x75)
6 KB
6 KB PNG
>>
>>102070241
i ain't taking shit from you nigga
>>
>>102070229
I'll throw in the usual, shit gen, you suck, uninspired bullshit next time.
>>
File: 57734.jpg (588 KB, 1440x3120)
588 KB
588 KB JPG
Despite the constant negative press etc etc
>>
>>102070264
>perfected
>>
>>
File: 57735.jpg (366 KB, 1440x3120)
366 KB
366 KB JPG
meat, if you're reading this.... you won. i hate to admit it but... then again, it's only cuz the kittycabal made the alternative non-viable. such is life! c'est la vie, as it were.
>>
>>102070150
we don't owe you anything, if you're not happy you can take a rope and hang yourself for what I care
>>
File: 57736.jpg (446 KB, 1440x3120)
446 KB
446 KB JPG
>>
>>102070360
>you don't care about me
>I should care that you are upset with me

First time trolling friend?
>>
>>102070296
nm it's much better with a sharper esrgan model
lollypop obliterated the texture and made it too smooth
>>
>>102070382
not surprised you have no reading comprehension. I said I don't care if you end your life, you won't be missed. What I do care about is your constant whining, you stink up the thread with your negativity, go fuck yourself retard, is that clear enough?
>>
File: ComfyUI_01546_.png (870 KB, 1024x768)
870 KB
870 KB PNG
>>
File: ComfyUI_02289_.png (1.22 MB, 1024x1024)
1.22 MB
1.22 MB PNG
>>102070264
>>
File: 57737.jpg (298 KB, 1440x3120)
298 KB
298 KB JPG
>>102070404
anatomy is hard. let's go shopping!
>>
File: file.png (1.39 MB, 3443x1433)
1.39 MB
1.39 MB PNG
Damn...
>>
>>102070549
Title and character aside, it's pretty impressive
>>
File: 13214876823.png (499 KB, 668x532)
499 KB
499 KB PNG
>>102070549
>>
File: ComfyUI_01555_.png (725 KB, 1024x768)
725 KB
725 KB PNG
>>102070480
Yep, that's it, let's go shopping.
>>
>>102070549
https://www.youtube.com/watch?v=JeimE8Wz6e4
>>
File: ComfyUI_02295_.png (1.19 MB, 1024x1024)
1.19 MB
1.19 MB PNG
>>
>>
Late night/early morning bread straight outta oven...
>>102070583
>>102070583
>>102070583
>>
>>102070581
lmaooo
>>
>>102070581
He makes the "perfect" background character, villager A
>>
File: 00004b-.jpg (130 KB, 896x1152)
130 KB
130 KB JPG
>>102070394
>I said don't care that you end your life
>>102070360
>you can take a rope and hang yourself for what I care

I have to say I am still quite confused. Maybe you don't understand what effort is.

Take this picture. I ran through a few prompts and selected an image. Then I masked over the text and selected inpainting everything but the mask. Then into GIMP for a quick finger move; nobody likes 6 and 4 fingers. Fired that off to IOPaint for some cleanup. Back into SD for a final pass on the left hand.

So you can keep being salty or we can post images at each other. Up to you.
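The mask-then-paste-back part of that workflow is simple enough to sketch. A1111-style UIs take a white-on-black mask, so "inpaint everything but the mask" is just an inversion before compositing the generated pixels over the original (single-channel, pure-Python illustration, not any UI's actual code):

```python
def invert_mask(mask):
    # "Inpaint everything but the mask": flip 0-255 grayscale values so
    # the painted region becomes black (protected) instead of white.
    return [[255 - px for px in row] for row in mask]

def composite(original, generated, mask):
    # Final paste-back: keep the original pixel where the mask is 0,
    # take the newly generated pixel where it is 255, blend in between.
    return [
        [round(o + (g - o) * m / 255) for o, g, m in zip(orow, grow, mrow)]
        for orow, grow, mrow in zip(original, generated, mask)
    ]
```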
>>
>>102070646
>So you can keep being salty
ironic >>102070150 >>102070216 >>102070271
>>
>>102070595
What if he was actually isekai'd to Turkey, gifted with his superior knowledge of browsing github? He will surely rule the country at this pace.
>>
>>102067546
>alcohol and tobacco on the daily
>shit food
>spend all day indoors jacking off
You said a lot about yourself there buddy. You need to rebalance your life or you will end up broken.
>>
File: 1707390586837033.jpg (43 KB, 896x512)
43 KB
43 KB JPG
>>
File: 1718265824990362.jpg (68 KB, 896x512)
68 KB
68 KB JPG
>>
File: 1704913420751144.jpg (62 KB, 896x512)
62 KB
62 KB JPG
>>
File: 1701539040097369.jpg (41 KB, 896x512)
41 KB
41 KB JPG
>>
ded thred
>>
>>102070179
Nice


