/g/ - Technology






File: tmp.jpg (1.03 MB, 3264x3264)
Discussion of free and open source text-to-image models

Previous /ldg/ bread : >>102061535

>Beginner UI
EasyDiffusion: https://easydiffusion.github.io
Fooocus: https://github.com/lllyasviel/fooocus
Metastable: https://metastable.studio

>Advanced UI
Automatic1111: https://github.com/automatic1111/stable-diffusion-webui
ComfyUI: https://github.com/comfyanonymous/ComfyUI
Forge: https://github.com/lllyasviel/stable-diffusion-webui-forge
InvokeAI: https://github.com/invoke-ai/InvokeAI
SD.Next: https://github.com/vladmandic/automatic
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI

>Use a VAE if your images look washed out
https://rentry.org/sdvae

>Model Ranking
https://imgsys.org/rankings

>Models, LoRAs & training
https://civitai.com
https://huggingface.co
https://aitracker.art
https://github.com/Nerogar/OneTrainer
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts

>Flux
https://huggingface.co/spaces/black-forest-labs/FLUX.1-schnell
https://comfyanonymous.github.io/ComfyUI_examples/flux

>Pixart Sigma & Hunyuan DIT
https://huggingface.co/spaces/PixArt-alpha/PixArt-Sigma
https://huggingface.co/spaces/Tencent-Hunyuan/HunyuanDiT
https://huggingface.co/comfyanonymous/hunyuan_dit_comfyui
Nodes: https://github.com/city96/ComfyUI_ExtraModels

>Index of guides and other tools
https://rentry.org/sdg-link
https://rentry.org/rentrysd

>GPU performance
https://vladmandic.github.io/sd-extension-system-info/pages/benchmark.html
https://docs.getgrist.com/3mjouqRSdkBY/sdperformance

>Try online without registration
txt2img: https://www.mage.space
img2img: https://huggingface.co/spaces/huggingface/diffuse-the-rest
sd3: https://huggingface.co/spaces/stabilityai/stable-diffusion-3-medium

>Maintain thread quality
https://rentry.org/debo

>Related boards
>>>/h/hdg
>>>/e/edg
>>>/c/kdg
>>>/d/ddg
>>>/b/degen
>>>/vt/vtai
>>>/aco/sdg
>>>/u/udg
>>>/trash/sdg
>>
File: ComfyUI_01128_.jpg (95 KB, 783x720)
>>102064103
Didn't make into the college
>>
well its official, I was still using my 4070ti to gen and a rented 4090 to train lora on, but now ill be using it to gen too, fuck sharing my computer to do these tasks and holy shit never knew just how much faster a 4090 was in genning compared to 4070ti, its not even annoying anymore. just gotta pip install gdown and download your lora, everything else is built in including the manager and all the models.
>>
File: ComfyUI_Flux_Dev_00376_.png (2.2 MB, 1536x1024)
*/me farts*
>>
File: 2024-08-24_00346_.jpg (693 KB, 2560x1440)
>>102064103
ty baker
>>
File: ComfyUI_00015_.png (2.59 MB, 1920x1080)
>>
>>102063890
Thanks fren
>>
So what's the meta for captioning images for Flux LoRA training?
>>
>tfw considering folding and rent a 4090 to test more lora settings while my vramlet gpu is cooking
god I wish I wasn't a poorfag and could afford my own. I wonder how hard it'd be to get chatgpt to implement quants in kohya's scripts, then I could at least do higher batches to speed things up
>>
File: ComfyUI_00814_.png (808 KB, 1024x1024)
also for those who missed the last thread here is the Kasia LoRa I had been working on

https://civitai.com/models/682483/teenkasia-lora-fluxdev-v1

also many thanks to the Anons who helped me get all that Training shit set up.
Thanks Bros.
>>
How do I get rid of these gay ass square artifacts in Flux?
>>
File: file.png (1.66 MB, 1024x1024)
>>102064199
>>
>>102064243
idk but lol'd at them riding scooby doo through the desert
>>
>>102064243
What is your prompt, are you including a print style? They look like moire effects from scans of prints/dot grain printing generally.
>>
File: 4step_up_00066_.png (1.99 MB, 1536x1536)
>>
>>102064243
You can't, it's a model issue
>>
>>102064283
Make an anime version
>>
>>102064293
why are others not getting them???
>>
File: Untitled.jpg (633 KB, 2511x1158)
>>102064229
That's how I do it, there's no way I'd use my GPU even if it was a 4090, this is like having an extra computer, its expensive yes but I have my limit, 10-20 bucks a week, if that, its either that or lottery tickets and id rather just waste it genning and making my loras. But yes, its invaluable when it comes to finding a prompt otherwise you are waiting a long time.
>>
>>102064243
It's an upscale?
>>
File: file.png (16 KB, 523x476)
>>102064280
It's always like this for me, for some reason.
>>
>>102064283
Also, how about spicing it with some of the weird terms they use?
>>
File: ComfyUI_04378_.png (2.19 MB, 992x1376)
>>102064103
>finally my randomized 1girl generator made it into the collage

time to retire
>>
>>102064306
They are, it's not every generation, I don't know the exact circumstances where they appear. Post your settings and ask others the same when you see the squares, maybe there's some common factor.
>>
>>102064333
I'm not the original anon, but I've never had them working with flux when I'm using close to regular settings.
>>
>>102064327
Like what?
>>
>>102064103
why put a debo gen in the collage?
>>
File: file.png (1.67 MB, 1024x1024)
>>
>>102064346
Beat me, can't remember more than nigga, your best bet would be some news page.
>>
File: 2024-08-24_00349_.png (1024 KB, 1280x720)
>>
>>102064359
it was good?
>>
File: ComfyUI_04379_.png (2.37 MB, 1304x1152)
>>
>>102064258
Jej
>>102064271
prompt is as follows:
 A black sun hangs in the white sky over a scene of rolling blue sand dunes. The silhouettes of a group of bedouins migrates through the scene.

Impressionist cubist abstract geometric.

>>102064319
Nope, it's something I often get, although it's usually not so noticeable as this.
>>
>>102064339
Cool, doesn't mean it's not a thing though.
>>102063434
>>102063344
>>102062415
>>102062323
Here's some from the other thread. It's more subtle in some cases but the squares are there if you look closely.
>>
File: 00195-1973845107.png (1.92 MB, 1024x1320)
>>102064283
Just putting the original pasta in provides some amusing results
>>
File: ComfyUI_Flux_10649.jpg (380 KB, 768x1344)
>>
>>102064389
Ah, I do see them subtly. Thanks anon, I would have never noticed on my own (being blind is kinda good, my gens look better, but I miss out on details I might really want to fix)
>>
>>102064395
imagine getting a coherent car interior with base SD1.5 or even SDXL
>>
>>102064412
excel
>>
File: ComfyUI_00036_.png (2.48 MB, 1920x1080)
>>
File: 2024-08-24_00357_.png (1.14 MB, 1280x720)
>>
File: ComfyUI_00992_.png (2.54 MB, 1536x1152)
>>
>>102064420
lol
>>
>>102064317
ugh, its so tempting, and you're so right. I love making loras so much, its worth a minor investment

at the same time though I'm absolutely trying to get chatgpt to edit kohya's training scripts to let me further quant the model to train with, I don't understand why he refuses to do this when multiple people have requested/it doesn't seem like its that much effort when you know what you're doing with code? but being a codelet maybe I'm wrong.. idk.. I'll find out
>>
File: ComfyUI_00993_.png (2.72 MB, 1536x1152)
>>
>>102064439
cfg too high fella, you've roasted this image
>>
>>102064389
>>102064413
for me (since you linked my gens from /sdg/), the cause is not using a very basic prompt on the hires section of forge
if i just put something like "best quality" in the hires prompt, the banding/other artifacts tend to go away (just don't use "detailed" there). i guess it depends on how messy your original prompt is
have a freebie from /sdg/
>>
File: 4step_up_00068_.png (1.82 MB, 2016x1152)
>>102064395
>>
>>102064242
Do you have a 24GB card? was this 512?
>>
>>102064388
Additionally, I find this particular graphical error occurs when I gen at higher resolutions than normal.
>>
what is the best way to get flux to do highly polished digital art style like official art from league or genshin or whatever
>>
can someone explain to me why the FUCK state-of-the-art models are still using 2-4 year old models for text encoding (CLIP/T5)?
couldn't the results be so much better with something more modern?
>>
>>102064542
you know, these models are trained at FP32/FP16 and not Q8 for a few reasons...
>>
how many steps are you guys genning at?
>>
>>102064629
A Lora with 30 images.
>>
File: ComfyUI_Flux_10663.jpg (294 KB, 1344x768)
>>
File: file.png (1.67 MB, 1024x1024)
>>102064553
nta (noticed those typical indicators in theirs and mine), but assume it wasn't that in my case at least because I was using the base workflow that didn't include flux cfg. Added it and turned it down to 1.6 and I think it's better. Useful knob to try turning.
>>102064653
trying 40 right now, but mine aren't the quality of the really good posters itt (yet/at the moment)
>>
>>102064651
maybe, but its already being run at 8bit for training and all the other trainers have implemented quanting without issues (as far as I know). I'd at the very least like to be able to test it for myself and decide for myself if my loras are so much worse at the lower quants
>>
>>102064619
nta but i've trained 512 and 1024 loras on my 4080
>>
File: 2592914216.png (198 KB, 512x512)
>1600 steps at 20s/it

Welp, cya tomorrow I guess.
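For a rough sense of that wait: total time is just steps times seconds per iteration. A minimal sketch using only the two numbers from the post; the helper name is made up for illustration.

```python
def training_eta_hours(steps: int, seconds_per_it: float) -> float:
    """Estimated wall-clock training time in hours."""
    return steps * seconds_per_it / 3600


# 1600 steps at 20 s/it, as quoted in the post above.
eta = training_eta_hours(1600, 20)
print(f"{eta:.1f} hours")  # 1600 * 20 = 32000 s, prints "8.9 hours"
```

So "cya tomorrow" is about right if the run starts in the evening.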
>>
>>102064679
Are you using dual clip loader? Ksampler? Lora?
>>
this isn't really karen is it.
>>102064629
tried "official art from league or genshin or whatever" in prompt? throw pic you lust for into joycaption and dump that into flux maybe
>>102064679
MUSHROOMS WHERE ARE THEY WHERE DID I PUT THEM
>>
File: ComfyUI_00058_.png (2.55 MB, 1920x1080)
>>102064553
>>
>>102064658
is there a reason to do 30 rather than throw as many as I have (like 200) into joycaption then lora
>>
File: 1724257852026474.jpg (18 KB, 250x250)
>it's been 6 weeks since flux released
>still no decent finetuned models
>>
>>102064679
see Distilled CFG Guidance from https://github.com/lllyasviel/stable-diffusion-webui-forge/discussions/981

You should keep it at 1 (or not include it so it defaults there).

As for this CFG too high guy, I have no idea what his actual issue is. Seems any color is too much for him.
>>
File: ComfyUI_00999_.png (2.63 MB, 1152x1536)
>>
>>102064651
Why does 8bit training work, then? Asking sincerely, I don't understand any of this.
>>
>>102064738
if they're diverse then the more the merrier
>>
File: 2024-08-24_00359_.png (819 KB, 1280x720)
>>
>>102064619
>Do you have a 24GB card?
yes I trained on a 4090
>>
>>102064755
what, exactly, is 8bit in this alleged training? Not the lora weights that are being trained for sure.
>>
File: 00164-1846747047.png (1.2 MB, 1024x1320)
Guns are pretty good now
>>
>>102064752
If you want a particular thing you're going to have to make a lora yourself.
>>
File: file.png (1.34 MB, 1024x1024)
>>102064755
a few reasons... not that i'd tell you...

unless...?
>>
>>102064755
it works because you are creating a blurrier product that using a 16bit generation cleans up. It isn't as good, but it can be possibly good enough. With the mass lora flooding nobody is calling anything out right now.
>>
File: ComfyUI_00070_.png (2.1 MB, 1920x1080)
getting there
>>
File: ComfyUI_04392_.png (1.82 MB, 1088x960)
>>
How much do you guys touch the JoyCaption outputs? Do you correct it only if it's glaringly wrong (e.g. incorrect perspective) or just mostly leave as-is? And does throwing a specific string of characters at the start of every caption still work for keywords in Flux?
>>
File: 00117-1098764767.jpg (151 KB, 1080x1280)
Is there any way to connect joycaption with sillytavern via api? or run it with kobold and then sillytavern
>>
File: soldier with an M1 Garand.jpg (223 KB, 1024x1024)
>>102064778
If you are happy with generic type guns. Try something specific and see what happens.
>>
>>102064722
I am using dualcliploader with t5 e4m3fn, samplercustomadvanced with euler and normal scheduler for that one, although I've been trying most of them posted in the thread. No lora.
>>
>>102064753
I didn't have it, but having it lower than 2 definitely seems to "reduce cfg" compared to not having it, as far as frying the image. Very strange
>>
>>102064317
cool robots
>>
File: FLUX_00084_.png (1.26 MB, 896x1152)
>>
>>102064776
I don't know, in kohya ss you use the --8bit command to be able to train. I believe it then turns the model into a 8bit model in order to train it? Using quants is the same idea, is it not?
>>102064796
I don't really understand the issue, then. Why should you police what I create and am happy with? You can make your own loras that are up to your own standards.
>>
>>102064903
>I believe it then turns the model into a 8bit model in order to train it?
when training a lora you're not training the model, what precision are the lora weights at?
>>
>>102064705
for flux? What trainer did you use and could you link to the instructions page, scripts etc?
>>
File: ComfyUI_01002_.png (2.53 MB, 1152x1536)
>>102064925
yeah, flux. I use kohya's sd-script, just the cli not the gui
https://github.com/kohya-ss/sd-scripts/tree/sd3?tab=readme-ov-file#flux1-lora-training
>>
>>102064903
I am not policing anything. I thought you were asking why 8bit was bad. I am saying it is only slightly bad and if you are getting good results then keep on doing your thing.
>>
>>102064964
Thank you.
>>
File: ComfyUI_32866_.png (1.41 MB, 1024x1024)
Looks like the bake turned out okay! I want to test different checkpoints and settings before releasing.
>>
>>102064789
I hate seeing his face but I kek'd at this one
>>102064917
Well, like I said, I don't understand it, so I can't really explain it to you. I believe it trains and outputs with bf16, but it requires the --8bit command to not OOM even on 24gb vram devices. If it's not quanting the model, then I don't know what it's doing. You can view the code here:
https://github.com/kohya-ss/sd-scripts/tree/sd3
https://github.com/kohya-ss/sd-scripts/blob/sd3/flux_train_network.py

The osiris trainer and the other finetune trainer have both implemented quant'd training, and people use it to lower their vram, so it must be doing something and working somehow? Even if you think it's lower quality, a lot of us would prefer lower quality output (to an extent) vs needing 24 hours to train. Half the reason the civitai loras are so shitty is because they're extremely undercooked.
>>102064970
I kind of am, but moreso why that anon says training loras with quant is somehow worse than with 8bit to the point we shouldn't have it as an option at all. Like, yeah, I get you want the most quality possible, but for a lot of people that simply isn't practical - the option of practicality at the cost of some quality should exist. Just like how people are using the gguf quants for genning now.
>>
File: 2024-08-24_00364_.jpg (281 KB, 2560x1440)
>>
>>102064902
looks like a tranny
>>
>>102065013
>If it's not quanting the model, then I don't know what its doing
model weights =/= lora weights
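A toy sketch of that distinction, for anyone following along: the frozen base weights can be stored at low precision while the small LoRA weights being trained stay in full float precision. This is only an illustration under stated assumptions (symmetric int8 quantization, a single weight row, rank 1, all names made up here); it is not what kohya's scripts or any other trainer actually do.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: floats -> ints in [-127, 127] plus a scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs else 1.0
    return [round(w / scale) for w in weights], scale


def dequantize(q_weights, scale):
    """Recover approximate float weights from the stored ints."""
    return [q * scale for q in q_weights]


def forward(x, q_w, scale, lora_a, lora_b):
    """Output of one row: frozen quantized base plus a rank-1 float LoRA delta."""
    base = sum(w * xi for w, xi in zip(dequantize(q_w, scale), x))
    delta = lora_b * sum(a * xi for a, xi in zip(lora_a, x))
    return base + delta


# Hypothetical numbers, for illustration only.
base_row = [0.5, -1.0, 0.25]           # frozen base weights
q_row, s = quantize_int8(base_row)     # stored at 8 bits, never updated
lora_a, lora_b = [0.01, 0.02, -0.01], 0.5  # trainable, kept as floats

print(forward([1.0, 2.0, 3.0], q_row, s, lora_a, lora_b))
```

The rounding error lives entirely in the frozen base row; the LoRA terms that the optimizer would update are never quantized in this sketch.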
>>
>>102064177
I'm genning on a 1080ti. Flux takes 15-30 minutes per image, "normal" gens like pony or sdxl usually take 3-10 minutes.
My record so far was a 4x upscale from 512x512 that took over 3 hours.
One of these days I'll upgrade, I just need to stop seething about nvidia's shady practices, evil pricing and general anti consumer antics first.
>>
File: FLUX_00082_.png (1.47 MB, 896x1152)
>>102065056
thank you, it works then
>>
i love ldg
>>
>>102064861
Try to use another workflow

https://civitai.com/models/669973/umeairt-simple-workflow-flux-or-upscale-or-lora
>>
>>102065076
Based, thanks for the rec. Moving into textgen mode at the moment but I will download that and store it for later use.
>>
>>102065065
Ok, then change my question or posts or whatever to be saying 'quanting the lora weights' if that's the correct thing. I don't know about any of this, I just want lora training to be optimized and I don't see why 8bit works but quanting is a no-no for kohya.
>>
File: ComfyUI_00093_.png (2.05 MB, 1920x1080)
>>102064878

Thanks!

>Please kill me...
>>
File: 2024-08-24_00366_.jpg (347 KB, 2560x1440)
>>
File: ComfyUI_00087_.png (2.83 MB, 1920x1080)
>Oy vey the goy knows, we must find him and shut it down.
>>
File: 1703237345615475.png (27 KB, 555x97)
kek
>>
>>102065013
you should stop listening to anon so much.

As far as GGUF goes, the quality changes rapidly depending on the quant.
https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9

I am not going to pretend I understand this completely. I do understand that 8 is half of 16, and there are different GGUFs where the change is more than double.

As I am a moron please see this thread:
https://www.reddit.com/r/StableDiffusion/comments/1eslcg0/excuse_me_gguf_quants_are_possible_on_flux_now/?rdt=34707
OPs first comment has side by side pics. All of them could be argued which is "better".

More realistically the argument against GGUF is compression is slow (and slightly lossy) and nothing says internet big dick swinging like claiming my machine can do the right thing and faster.
>>
>>102065067
>One of these days I'll upgrade, I just need to stop seething about nvidia's shady practices, evil pricing and general anti consumer antics first.
Adjusted for inflation a 4080 Super costs the same as a 1080 Ti, has more VRAM and around 10x the performance in AI workloads
praise Jensen and his girthy Huang
>>
File: 2024-08-24_00368_.jpg (466 KB, 2560x1440)
>>102065164
oof 20% WEED dmg .. thats hardcore OP, pls nerf
>>
>>
>>102065259
querulous.
please i beg of you help me with my square problem
>>
File: ComfyUI_00704_.png (3.41 MB, 2048x1536)
I can't wait for a Flux finetune that simply recognises different periods of Western art and famous historical artists and which can be evoked with simple prompts.
>>
>>102065283
what is your prompt?
>>
>>102065259
why is this post empty
>>
>>102064778
Scarlet Johanson if she had a bit of Latina in her
>>
>>102065297
same as it was in >>102064388
>>
>>102065305
I could put a little latino in her, if you know what I mean.
>>
>>102065318
kek
>>
File: ComfyUI_01016_.png (2.41 MB, 1152x1536)
>>
>>102065318
Nice one
>>
File: IMG_0706.png (237 KB, 1074x792)
>>102065318
I dont get it
>>
>>102065309
>abstract geometric

this is likely your problem, and possibly cubist as well
>>
>>102065354
Basically the Scarlet Johanson image looks a bit more latina than she usually does, so an anon said hey she looks kind of like Scarlet but more latina, maybe like a bit of J Lo in her.

Then some random ass Latino genetics anon said hey I could put a bit of latino in her if you know what I mean.

The if you know what I mean part of this sentence means he would put is latino penis inside her mouth, vagina or ass.

This is the basics of the joke, but it may have nuances that I missed.
>>
>>102065255
I was recently thiiiis close to pulling the trigger on a 7900xtx aqua, until i discovered the fun that is diffusion. Now I don't know what to do. Next rig's going to be am5 & i wanted to keep it all team red, but diffusion's really fucked my plans.
I'm going to wait & see what the 5090 is packing before making a final decision.
>>
>>102065390
>latino penis
no it's latino sperm reddit spacing retard
>>
File: 2024-08-25_00004_.jpg (487 KB, 2560x1440)
>>
File: 1722293162365315.png (317 KB, 1000x871)
>add LoRA
>hands start mutating
>>
>>102065413
The context of the joke could have the nuance of both.

Every anon that complains about reddit spacing is likely a newfag because 4chan users did this before reddit even existed.
>>
The JoyCaption at

https://huggingface.co/spaces/fancyfeast/joy-caption-pre-alpha

Is throttling me after 30 image captions, any forks people are using?
>>
>>102065390
>The if you know what I mean part of this sentence means he would put is latino penis inside her mouth, vagina or ass.
I guess that makes sense
>>
>>102065443
Just run it locally
>>
>>102065443
>/ldg/

https://desuarchive.org/g/thread/101948110/#q101950141
>>
Is it just me or does setting lora weights to less than 0 not work with flux
>>
>>102065375
Nope. it happens with any prompt. See picrel, the prompt for which was:
 A black sun hangs in the white sky over a scene of rolling blue sand dunes. The silhouettes of a group of bedouins migrates through the scene.

Impressionist.
>>
>>102065431
I noticed that with some loras too
>>
is it true flux training doesn't take more than 75 tokens? So you can't put a long caption or caption + tags into the .txt?
>>
File: ComfyUI_Flux_Dev_00381_.png (2.18 MB, 1920x1080)
>>102065483
picrel actually attached this time
>>
File: ComfyUI_32878_.png (1.29 MB, 1280x720)
>>
>>102065499
weird
are you loading the right vae?
>>
>>102065485
it takes 77 tokens from CLIP and 512 from T5
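A quick way to sanity-check a caption against those two limits, as a rough sketch. The 77 and 512 figures are the ones from the post above; CLIP's 77 includes its start and end tokens, which is where the "75" in the question comes from. The word-count function is a crude, hypothetical stand-in: real token counts come from the CLIP and T5 tokenizers and are usually higher than the word count, so pass a real counter in if you have one.

```python
CLIP_MAX_TOKENS = 77   # from the post above; includes start and end tokens
T5_MAX_TOKENS = 512    # from the post above


def crude_token_count(text: str) -> int:
    """Rough stand-in for a real tokenizer: counts whitespace-separated words."""
    return len(text.split())


def caption_budget(text: str, count_tokens=crude_token_count) -> dict:
    """Report whether a caption fits each encoder's limit under the given counter."""
    n = count_tokens(text)
    return {
        "tokens": n,
        "fits_clip": n <= CLIP_MAX_TOKENS - 2,  # reserve start and end tokens
        "fits_t5": n <= T5_MAX_TOKENS - 1,      # reserve T5's end token
    }


print(caption_budget("a black sun hangs in the white sky over blue sand dunes"))
```

A long natural-language caption can overflow the CLIP side and still fit the T5 side comfortably.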
>>
>>102065514
Yeah, ae.sft.
>>
File: 2024-08-25_00007_.jpg (1.12 MB, 2560x1440)
prompting some dante with beksinski lora is quite a trip

>Cerberus, cruel monster, fierce and strange,
>Through his wide threefold throat barks as a dog
>Over the multitude immers’d beneath.
>His eyes glare crimson, black his unctuous beard,
>His belly large, and claw’d the hands, with which
>He tears the spirits, flays them, and their limbs
>Piecemeal disparts. Howling there spread, as curs,
>Under the rainy deluge, with one side
>The other screening, oft they roll them round,
>A wretched, godless crew. When that great worm
>Descried us, savage Cerberus, he op’d
>His jaws, and the fangs show’d us; not a limb
>Of him but trembled. Then my guide, his palms
>Expanding on the ground, thence filled with earth
>Rais’d them, and cast it in his ravenous maw.
>>
File: gardenerlowres.png (2.47 MB, 1018x1018)
anyone else use this hack?

>AI cant make the image i have in my mind
>find a close approximation on yandex images
>copy+paste into paint.net
>mask over it with low denoise
>AI finally understands
>>
I'm trying to run fooocus, but whenever I click "run_anime" it doesn't do anything. It won't even bring up the cmd prompt.
>>
>>102065283
>>102065483
What are the rest of the generation parameters?
>>
>>102065539
>hack
>it's just img2img
>>
>>102065539
stick it in joycaption and try and get an equivalent prompt instead

https://huggingface.co/spaces/fancyfeast/joy-caption-pre-alpha
>>
>>102065536
great
Here's Hell by ImageFX:
https://files.catbox.moe/tlmqud.jpg
>>
File: taylor_sphinx.png (3.89 MB, 1536x1536)
>>102065576
yeah but its a way of guiding it rather than hitting your head against a wall trying to do it solely through prompting
>>
>>102065283
>>102065499
It's a flaw of the model. SDXL has this problem too.
>>
>>102065583
and if you want to do joycaption locally here's a comfy workflow
>c:\windows\system32
Kidding. https://files.catbox.moe/u2xfic.json
>>
>>102065539
>>102065598
anon im gonna be honest the denoise is wayy too low, it just looks like a shoddy photoshop
>>
File: ComfyUI_32880_.png (1.09 MB, 1280x720)
Is it normal for a lora to need extra steps to avoid looking smeared? Maybe it's because the colors are flat, but this one needs 26 steps to look acceptable instead of the usual 20.
>>
>>102065598
More unrealistic beauty standards for women.
>>
File: g4.png (163 KB, 896x1152)
>>
Asking for my retard friend how do you run Flux GGUF on a non-comfy webui?
I use comfy but normalfags get filtered by it.
>>
File: 2024-08-25_00010_.jpg (878 KB, 2560x1440)
>>102065592
nice .. looks like a Christopher Marlow painting

pic related is Dante entering Dis the city of eternal fire .. forgot which layer of hell it was.. this time without beksinski lora
>>
File: g5.png (161 KB, 896x1152)
>>
File: g6.png (164 KB, 896x1152)
my robot has tits for some reason???
>>
File: ComfyUI_04429_.png (1.55 MB, 1344x768)
>>102065730
hot
>>
>>102065565
Welp. I installed comfyui and it's the same thing. I tried running as administrator and get the same issue. I know I can run .bat files. I run sillytavern all the time.
>>
File: ComfyUI_00870_.png (1.31 MB, 1024x1024)
woah my Kasia LoRa has already 37 downloads.
The Success is getting to me
>>
File: ComfyUI_04433_.png (1.74 MB, 1088x960)
>>
File: g7.png (159 KB, 896x1152)
>>102065828
creepy af
>>
File: FluxDev_02887_.jpg (218 KB, 832x1216)
>>102065621
NTA only txt2img here, did I do well?
>>
>>
File: ComfyUI_04438_.png (1.51 MB, 1024x960)
>>102065909
fantastic anon, i genuinely thought this was real life for a second
>>
File: ComfyUI_32884_.png (970 KB, 1024x1024)
Alright, I'm completely satisfied with this lora. It seems it needs guidance 4.5 and 26 steps to work as intended.
g-mgk here: https://mega.nz/folder/mtknTSxB#cGzjJnEqhEXfb_ddb6yxNQ
Artist: https://xcancel.com/mgk0962die/media
>AI learning prohibited
I'm so sorry...
>>
there aren't enough hours in the day
>>
File: ComfyUI_32883_.png (1.32 MB, 1280x720)
>>102065952
I know, right.
>>
How do we see which node belongs to which custom node in Comfy? Like who made it?
>>
>>102065909
Kek. I need to try this with flux
>>
If parts of my prompts aren't showing up, does that mean I should add more steps?
>>
>>102065977
anon, that is Flux, I wasn't joking about it being txt2img
>>
File: file.png (42 KB, 677x411)
>>102065973
if you have the manager then do this
>>
>>
>>102065988
I said I need to try this with flux. I know your gen is flux that's why i said that
>>
>>102066015
That's not how... have fun, anon.
>>
File: 00026-795841876009572.png (1.3 MB, 768x1280)
>>102064103
used sdXL_v10VAEFix.safetensors and all images seem to have these colour sparkles and just look wrong. Sanity check doesn't work.
what do I have to change to have good looking photos?
>>
>>102066053
If you masked with SDXL that could become a really good Taylor sphinx, zero copy and paste needed because flux is a bit smarter. I would post it right now but I'm at work
>>
File: ComfyUI_04448_.png (1.38 MB, 768x1280)
>>102065945
put it in my 1girl generator, very soulful

>Illustration: The Goth girl's intense topaz eyes bore into your very soul as she stands defiant against the crimson streetlights that bathe her in an eerie, alluring glow. Her raven hair cascades down her back in a fiery fishtail braid, defying gravity and convention like everything else about this mysterious woman.

>With one hand resting casually on her hip, she strikes a confident pose - legs slightly apart, shoulders squared, chin lifted defiantly as if ready to face any challenge that comes her way. The utility jumpsuit hugs every curve of her athletic frame in all the right places, emphasizing her toned figure and hinting at hidden depths and secrets within.

>She looks like an icon of rebellion against conformity, a siren's call luring you into uncharted territories where rules don't apply and anything is possible. With each step she takes down this neon-lit street, your heart races with anticipation for what lies ahead as the Goth girl leads the way to wherever her dark passions may take you...

>The scene is one of mystery, danger and forbidden delights - a place where nothing is quite as it seems on the surface. The orange glow casts an atmosphere of unease yet excitement, hinting at secrets waiting to be uncovered in this strange woman's hidden world beyond the veil of her fishtail braid.
>>
File: ComfyUI_00227_.png (2.56 MB, 1920x1080)
Closer and closer
>>
File: FluxDev_02886_.jpg (266 KB, 832x1216)
>>102066076
It can do it properly too, sometimes. Haven't found the right words to make it consistent. Also messed up the anatomy a little.
>>
File: 2024-08-25_00015_.png (1.6 MB, 1280x720)
>>
File: ComfyUI_32885_.png (1.45 MB, 1024x1024)
>>
>>102066106
>collage bait
>it's actually peak

i salute you anon
>>
File: 1719490061302098.png (4 KB, 348x31)
Where do you guys set this?
>>
>>102065598
The only copy and paste I did here was with the wings, the colors are nice but the proportions are a bit off
Taylor was all a SDXL mask with Differential Diffusion, SDXL is extremely good at making Taylors
>>
>>102066130
in my room
>>
File: 2024-08-25_00013_.png (1.27 MB, 1280x720)
>>
>>
>>102066130
default is 3.5 or 4 .. lower will make it follow prompt better, but make it ugly, higher will make it more aesthetic but can burn your gens
>>
>>
>>102066206
So lower makes it follow the prompt more and give the ai less freedom?
I woulda thought it was the other way around.
>>
>>102066224
it is the other way around, that anon needs some sleep
>>
File: ComfyUI_04458_.png (1.53 MB, 832x1152)
>>
File: FluxDev_02897_.jpg (277 KB, 832x1216)
uncanny
>>
File: 00248-140299i8185.jpg (85 KB, 507x761)
https://youtu.be/7YvAYIJSSZY?si=Aif0C4UN6mvmx2qM
relatable
>>
is network dim the same as rank when training loras? if so, does increasing the dim/rank, which increases the file size, also increase the amount of vram needed? the default kohya script i'm using results in 40mb loras which to me seems way too small compared with what i'm used to when training sdxl/pony or even 1.5
>>
>lora is exclusively outputting sepia-toned photographs even when explicitly instructed for colour
yep, i think all the B&W/sepia pics ruined it
shame there isn't much actual quality colour photography available to train on
>>
>>102066266
Increasing network dimension increases both the file size and VRAM requirements.
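To put numbers on that: for one linear layer with d_in inputs and d_out outputs, a rank-r LoRA adds two small matrices, r x d_in and d_out x r, so r * (d_in + d_out) extra parameters. File size therefore grows linearly with rank. A minimal sketch; the layer shapes and layer count below are hypothetical, picked only to show the scaling, and 2 bytes per parameter assumes fp16/bf16 storage.

```python
def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Extra parameters a rank-`rank` LoRA adds to one d_in x d_out linear layer."""
    return rank * (d_in + d_out)


def lora_size_mb(layer_shapes, rank: int, bytes_per_param: int = 2) -> float:
    """Approximate file size in MB for a LoRA over the given (d_in, d_out) layers."""
    total = sum(lora_params(d_in, d_out, rank) for d_in, d_out in layer_shapes)
    return total * bytes_per_param / 1e6


# Hypothetical model: 100 adapted layers of 3072 x 3072 each.
shapes = [(3072, 3072)] * 100
for rank in (4, 16):
    print(rank, f"{lora_size_mb(shapes, rank):.1f} MB")  # 4.9 MB, then 19.7 MB
```

Training memory for the LoRA itself (weights, gradients, optimizer state) scales with the same parameter count, which is the part of VRAM use that rank affects.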
>>
is flux completely incompatible with negative prompts? it seems like a huge flaw to me
>>
>>102066351
isn't that because the faggot devs only released a cucked ("distilled") version of the model?
>>
>>
File: ComfyUI_00266_.png (2.11 MB, 1920x1080)
>>
File: 2024-08-25_00012_.jpg (730 KB, 2560x1440)
>>102066351
if you use any of the CFG hacks negative prompts work to a certain degree.. either DynamicThresholding, AutomaticCFG or SkimmedCFG will work
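Why negatives need CFG above 1 at all, as a small sketch: in the usual classifier-free guidance formula the final prediction is negative + scale * (cond - negative). At scale 1 the negative term cancels out and only the positive prediction is left. This assumes the textbook formula; the nodes named above may change it in ways not shown here, and the numbers are toy values, not real model outputs.

```python
def cfg_combine(cond, negative, scale):
    """Classifier-free guidance, elementwise: negative + scale * (cond - negative)."""
    return [n + scale * (c - n) for c, n in zip(cond, negative)]


# Toy stand-ins for the model's predictions on the positive and negative prompt.
cond = [1.0, 2.0]
negative = [0.5, 4.0]

print(cfg_combine(cond, negative, 1.0))  # [1.0, 2.0]: negative has no effect
print(cfg_combine(cond, negative, 3.0))  # [2.0, -2.0]: pushed away from negative
```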
>>
>>102066266
>results in 40mb loras which to me seems way too small
that's plenty, Flux knows more than SDXL, you only need the lora to nudge the weights to what you want
only increase the rank if you're sure that's what is holding back the training, the images and captions you use have much more of an effect than rank
>>102066348
>and VRAM requirements
depends on how the lora is applied
>>
File: 1702525125150004.png (1.9 MB, 1024x1024)
>>
File: 1717470659666742.png (62 KB, 331x146)
AAAAAAAHHHH WHY WON'T IT JUST WRITE WHAT I WANT IT TO ALREADY
Literally how hard is it to understand "A sign in the background reads 'X'"?
>>
File: ComfyUI_00201_.png (1.88 MB, 1920x1080)
1.88 MB
1.88 MB PNG
>>
File: 2024-08-25_00030_.png (1.92 MB, 1280x720)
1.92 MB
1.92 MB PNG
>>
Fuck the grifter. I'm not giving you $5 for someone else's work.
https://civitai.com/articles/6901
>>
File: file.png (2.15 MB, 832x1216)
2.15 MB
2.15 MB PNG
>>102064242
>>
File: 2024-08-25_00031_.png (1.95 MB, 1280x720)
1.95 MB
1.95 MB PNG
>>
>>102066544
>>102066544
Here is the one I'm using: https://pastebin.com/fBA7pQiP
>>
File: ComfyUI_32887_.png (1 MB, 1024x1024)
1 MB
1 MB PNG
>>
File: ComfyUI_04473_.png (1.33 MB, 832x1216)
1.33 MB
1.33 MB PNG
>>
>>102066568
kek
>>
>>102066609
So you just run this in your folder where your images are located? Do you need to edit anything with it?
>>
File: file.png (2.08 MB, 832x1216)
2.08 MB
2.08 MB PNG
>>102064242
>>
File: R.jpg (139 KB, 1024x1280)
139 KB
139 KB JPG
>>102064037
>It's still sexualizing the idea of "teen" and therefore wrong.
https://www.washingtonpost.com/entertainment/museums/this-painting-might-be-sexually-disturbing-but-thats-no-reason-to-take-it-out-of-a-museumthe-metropolitan-museum-calls-it-right-by-keeping-a-work-by-balthus-up/2017/12/05/169bd33a-d9f0-11e7-b1a8-62589434a581_story.html
>>
>>102066609
Are you the Anon he stole the script from?
>>
File: Image1.jpg (64 KB, 799x364)
64 KB
64 KB JPG
>>102064388
>>102065483
>ran your prompt
>got the squares
>modified prompt to picrel
>squares all gone
You're welcome bro.
>>
>>102066651
I edit these lines.

>INPUT_PATH = "X:/input/folder/here"
>OUTPUT_PATH = "X:/output/folder/here"

First you have to clone this repo (https://huggingface.co/spaces/fancyfeast/joy-caption-pre-alpha/tree/main) and set up a venv.
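The general shape of a batch script like that, as a sketch only: this is not the pastebin script, and `caption_image` is a hypothetical stand-in for the actual JoyCaption model call. It reads one input folder (without descending into subfolders) and writes a `.txt` caption per image into the output folder.

```python
from pathlib import Path

INPUT_PATH = "X:/input/folder/here"
OUTPUT_PATH = "X:/output/folder/here"
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".webp"}


def caption_image(image_path: Path) -> str:
    """Hypothetical placeholder; the real script would run the captioning model here."""
    return f"placeholder caption for {image_path.name}"


def caption_folder(input_dir: str, output_dir: str) -> int:
    """Write one caption .txt per image in input_dir; returns how many were written."""
    src, dst = Path(input_dir), Path(output_dir)
    dst.mkdir(parents=True, exist_ok=True)
    count = 0
    for image in sorted(src.iterdir()):  # iterdir() does not descend into subfolders
        if image.is_file() and image.suffix.lower() in IMAGE_EXTS:
            (dst / f"{image.stem}.txt").write_text(caption_image(image), encoding="utf-8")
            count += 1
    return count


# Usage, once the two paths above point at real folders:
# caption_folder(INPUT_PATH, OUTPUT_PATH)
```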
>>
>>102066659
reminds me of 'guitar lesson', by balthus, or based balthus, as i call him. true artist
>>
>>102066679
That's too much work for midtards, but it's good to have options.
>>
File: 1698345915289144.png (540 KB, 735x828)
540 KB
540 KB PNG
So how bad of an idea is it to train a LoRA using DALL-E 3 gens? I'm captioning approximately 90 of them for my first pass. Trying to get as many color palettes, locations, and character/enemy sprites represented as possible. I don't think I've seen a single satisfactory boomer shooter style LoRA for any model.
>>
>>102066255
would
>>
>>102066691
I train with a mix of real images and imagen, but I try to include more real than imagen. As long as the images are high quality and consistent with the subject it should be fine.
>>
>>102066691
hmmmm perhaps I could train good feet into Flux with my DALL-E collection
>>
File: ComfyUI_04485_.png (1.6 MB, 960x1088)
1.6 MB
1.6 MB PNG
>>
File: R.jpg (119 KB, 1024x1280)
119 KB
119 KB JPG
>>102066685
personally i feel like the pornographic themes of "Guitar Lesson" overpower the aesthetic qualities of the art so i prefer the Therese works
>>
>>102066544
Thanks anon
>>
>>102066733
why the dog got an hat
>>
File: 2024-08-25_00034_.jpg (1.05 MB, 2560x1440)
1.05 MB
1.05 MB JPG
>>
>>102066744
>his dog doesn't have a hat
ngmi
>>
>>102066691
Flux generalizes extremely well, you don't need to resort to AI slop. Just get 30-50 diverse boomer shooter images and caption them "boomer shooter, [long_paragraph_vlm_caption]"
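A small sketch of that captioning scheme: prepend the trigger phrase to each VLM caption. The function name and the sample caption are made up for illustration.

```python
def add_trigger(caption: str, trigger: str = "boomer shooter") -> str:
    """Prefix a caption with the trigger phrase unless it is already there."""
    caption = caption.strip()
    if caption.lower().startswith(trigger.lower()):
        return caption
    return f"{trigger}, {caption}"


print(add_trigger("A first-person view of a pixelated corridor with a shotgun in hand."))
```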
>>
File: 2024-08-25_00036_.png (1.44 MB, 1280x720)
1.44 MB
1.44 MB PNG
>>
>>102066544
waiting for idiots to start complaining that the ps script renamed all their files when they ran it as full admin from C:\

Thanks anyways anon. I appreciate this and you don't deserve the shit you will catch for doing a good thing.
>>
>>102066776
what if you want the boomer shooter through the eyes of DALL-E 3 style?
>>
>>102066832
sounds like shit because DE3 looks like shit
>>
>>102065259
>>
>>102066847
You wish you had DALL-E 3 at home.
>>
>>102066867
I have Flux which is better than DE3
>>
>>102066879
lmao
>>
I trained my Flux LoRAs with ai-toolkit but I cannot fucking figure out ComfyUI. Does anyone have a workflow for simple image generation with a LoRA? I would like to be able to batch generate images for multiple LoRAs, but if that's too much of a problem then one at a time is fine.

Why is this so fucking hard, am I this stupid lol
>>
The grifter has gone on an unhinged rant where he insinuates he has a job among other delusions.
>>
>>102066885
Yes, I can make gens of Duke Nukem with nude females
>>
>>102066832
then you should kill yourself
>>102066867
so that we can make it not look like shit yeah
>>
>>102066816
Oh shit you are right, I will remove that
>>
File: joycaption.png (19 KB, 761x175)
19 KB
19 KB PNG
How can I raise the token limit in JoyCaption? I'm getting incomplete captions :(
>>
File: 00000-AYAKON_1248194.png (710 KB, 768x1280)
710 KB
710 KB PNG
>>
>>102066901
>>102066903
how easy you retards get buttmad just by someone mentioning DALL-E 3
>>
>>102066907
I was honestly in the camp where they deserved it. I would just add the recurse flag and set it to 0.

Still sir, you are a champion.
>>
>>102066926
You made an objectively false statement that it's better.
>>
when will (you) learn to stop using sloppa
>>
>>102066900
tfw paid for a month worth of his patreon and wasted hours and hours following his fine tuning guides only for every outcome to suck ass. live and learn i guess
>>
>>102066900
yes, thank you for posting, everyone here wants to know what's happening over at reddit
>>
>>102066930
or depth or whatever the PS thing is where you say only dig so deep.
>>
>>102066943
It was very brave of you to admit you did that.
>>
>>102066935
it's no good to deny reality this hard anon, the Black Forest Labs guys aren't gonna suck your dick you know?
>>
>>102066816
>>102066907
I can re-write it to specify a file path.
But also the script only looks for image files so hopefully people won't be that retarded.
>hopefully people won't be that retarded
I am hoping for a lot. I will just remove that bit
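A sketch of the safer shape being discussed, written in Python rather than PowerShell and not taken from the actual script: the folder is passed explicitly, only image files directly inside it are touched, and the plan can be inspected before anything is renamed. The `img_0001` naming pattern is an assumption for illustration.

```python
from pathlib import Path

IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".webp"}


def plan_renames(folder: str, prefix: str = "img") -> list[tuple[Path, Path]]:
    """Return (old, new) pairs for image files directly inside folder; nothing is renamed."""
    root = Path(folder)
    images = sorted(p for p in root.iterdir() if p.is_file() and p.suffix.lower() in IMAGE_EXTS)
    return [(p, root / f"{prefix}_{i:04d}{p.suffix.lower()}") for i, p in enumerate(images, 1)]


def apply_renames(plan: list[tuple[Path, Path]]) -> None:
    """Apply a plan, refusing to overwrite an existing file."""
    for old, new in plan:
        if new.exists() and new != old:
            raise FileExistsError(new)
        old.rename(new)
```

Printing the result of `plan_renames(...)` first works as a dry run; subfolders and non-image files are never part of the plan.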
>>
File: A324.jpg (146 KB, 1024x1024)
146 KB
146 KB JPG
>>102066867
me fr
>>
>>102066959
show me your best DE3 boomer shooter image, you have 2 minutes
>>
File: ComfyUI_32888_.png (1.52 MB, 1024x1024)
1.52 MB
1.52 MB PNG
>>
>>102066971
ask the anon with the DALL-E 3 boomer shooter images
>>
>>102066977
guess DE3 isn't the best
>>
wait a second...
>>
>>102066982
guess that makes perfect logical sense in your deranged little mind
>>
>>102066990
>nogen
Good troll anon
>>
>>102066982
nta but dalle is the gold standard but it simply doesn't matter because it's not local so any point you are trying to make is moot.
>>
>>102066996
absolute buttmad this anon is
>>
>>102067004
DE3 isn't the gold standard but nice try. For starters it has a shitty AI grain on every image.
>>
>>102066776
The thing I'm worried about is that most actual boomer shooters are low on variety when it comes to sprites/world objects. Between Duke, Blood, Shadow Warrior, Ashes 2063, Ion Fury, and a couple others I guess I could find 50 good and varied shots but it'd be close.
>>
>>102066544
wait someone was trying to charge for this simple shit?
Here is my version, including what you put into a .bat and the changed app.py
https://pastebin.com/WBYHTM0Y
Copy the setup from the huggingface repo
>>
>>102067016
And what is the gold standard for imagen?
>>
File: ComfyUI_32889_.png (1.58 MB, 1024x1024)
1.58 MB
1.58 MB PNG
>>
>>102067017
Anyways, I think I'll try what >>102066707
suggested, try a mix of both
>>
>>102067028
koff gens
>>
>>102067018
>wait someone was trying to charge for this simple shit?
Guess who
https://civitai.com/articles/6887/joycaption-is-amazing-to-caption-training-data-advanced-gradio-app-with-batch-processing
>>
>>102067028
Flux because it can be trained to do anything and doesn't dog you for anything that has a woman in it.
>>
>>102067043
hot take but if it's only 5$ that isn't the worst for making it all into a one-click installer, yeah it's not much value add but 5$ isn't much value
>>
>>102067017
I'm going to do a lazy scrape and caption of boomer shooters and see what happens.
>>
>>102067018
A lot of people here are blissfully ignorant of the grifter until they are grifted by him.

If you contribute anything, he'll take it and put it on the patreon.
>>
>>102067055
Flux is indeed the standard for local imagen, but for imagen in total, it's by far Dalle3.
However, as I said, Dalle is not local so it doesn't matter.
>>
>>102066659
>>102066685
> Balthus's mother Elisabeth Dorothée Spiro Klossowska (known as Baladine Klossowska) was descended from Lithuanian
>>
local or bust
>>
>>102067069
Based, looking forward to it, I feel like it's an under-exploited aesthetic niche
>>
>>102066691
It's no different than any other curated dataset. The idea of synthetic datasets sends some people into a frenzy because the curation part goes over their heads. Just do it.
>>
>>102067073
DE3 isn't the gold standard because it has grain and is censored. Anything DE3 can do Flux can do better.
>>
>>102067090
>bro just train on the images with shitty AI pixel grain and nonsensical details
>>
>>102067043
what the FUCK is up with this guy anyway and why do i see him literally EVERYWHERE on github issues and such
>wanted to get cogvlm2 working on windows because triton is troonix-only
>loads of issues with him in it
>he claims to have got it working but the installer is on a patreon
>can't find windows builds of triton anywhere else
FUCK this guy
>>
>>102067063
The entire local image generation project has been a community effort from the start. Comfy, Auto, even JoyCaption which he is trying to profit off, is all free.
Nothing he has done there is worth $5. I can ask chatgpt to write me an app too.
>>
>>102067073
This guy thinks horses are the gold standard of transport because they've been around the longest.
>>
>Flux
>local sloppa

>DE3
>cloud sloppa

Will it ever end?
>>
I learned about spamming kaggle accounts to get infinite free training hours from the turk grifter so I'm grateful to him for that.
>>
I followed some advice to pare down my style dataset from 150 to 30-ish images and it actually worked really well. It's just aggravating because anything that recurs in two or three images gets ingrained really deeply. I wish I knew why the larger dataset didn't get as good results because I think that would solve that issue. Lower learning rate maybe?
>>
>>102064752
beeg model = no one can do shit
=D, look how long pony took to come out, and thats only because the guy could make money from it, you cant make money from flux, there will be no good flux finetunes
>>
/djt/ on /trash/ is exactly the same as ldg lmao, nothing but bickering, thread drama, "miss [insert poster]". this website sucks
>>
>>102065076
>Capture d’écran
>fr*nch
>>
>>102067090
These threads have had an irrational fear of synthetic data for a long time. I've trained off of purely synthetic datasets and got great results. As you say, it comes down to strict curation.
>>
>>102067167
lackluster dif general
metallica - sad but true
>>
File: ComfyUI_32890_.png (1.73 MB, 1024x1024)
1.73 MB
1.73 MB PNG
>>
File: ComfyUI_01442_.png (828 KB, 768x1024)
828 KB
828 KB PNG
>>102065076
inb4
>>
>>102066266
Yes, rank and dim are the same thing. For SDXL, around 16 is about the limit on quality improvement; after that you only get larger files with little to no quality improvement.
>>
Is there a way to edit the .bat so comfy doesn't launch in the default browser?
I'd like to have it launch in a different one since my default is firefox & I have so many addons it's a bit heavy, wastes resources.
I assume it's a simple job in notepad++, but I'm a nocodefag so I won't know what I'm looking for.
>>
>>102067063
wtf is /djt/ anon

/trash/ has /slop/ and a bunch of furry threads
>>
File: 2024-08-25T124332.103.jpg (251 KB, 1024x1024)
251 KB
251 KB JPG
>>102067167
You can always go back.
>>
>>102067205
>ynr the sdxl celeb slopper with his 1gb loras
>>
>>102065397
according to some post automatic1111 works with amd gpus but you can't train models.
Also thinking about upgrading. Is generation working fine on AMD cards or do you need to buy nvidia?
>>
File: ComfyUI_32891_.png (1.52 MB, 1024x1024)
1.52 MB
1.52 MB PNG
>>
>>102067227
Holy shit you branching out in art styles. Looks cool desu
>>
>>102067205
Also worth noting that on those rank 16 loras (~110mb) half of it is garbage, since you can resize them to 40~60mb files and still get 1:1 gens
>>
if I'm using someone else's compute power, is there any sense in training in batches?
>>
File: 1697371561995964.jpg (168 KB, 1024x1024)
168 KB
168 KB JPG
>>102067092
>Anything DE3 can do Flux can do better.
except when you point out anything flux can't do and are met with the freetard defense force shitting themselves about how anything flux doesn't know is somehow too tall of a request and everyone should be grateful and retrain the model themselves. flux aesthetically looks bad and is creatively bland. it is incapable of rendering artistic pieces unlike midjourney and dall-e. while it excels in comprehension, it fails at art. it can place a couple of greasy blobs next to eachother in the most boring way possible with no subtlety or meaning and that's about it. oh, and don't forget text on signs. fluxtards can cope all they want about loras and controlnet, like sdkeks have for years, but at the end of the day the model is very sterile and reliant on users re-injecting the creativity back into the model. at the end of the day it's another bland local model to add into the pile of other bland local models that will amount to nothing more than jeets begging for nsfw finetunes as the model slowly melts into the same localslop look that has plagued civitai since SD 1.5.

a swing and a hit, but hardly the home run needed to knock dall-e out of the park. bake again.
>>
bake new one fag
>>
>>102067189
I'm not against them, but it is a very slippery slope that can easily turn a model into producing slop. For some things there isn't enough high-quality non-synthetic data out there to train on, so using synthetic data there is understandable, but the line is precariously thin.
>>
>>102067206
start "" "C:\Program Files (x86)\PATH\TO\YOUR\browser.exe" "http://localhost:8188"
should do it but I am only guessing because I haven't actually tried
>>
>>102067274
You hurt my feelings so now I'm not gonna
>>
File: 00105-4005781768.png (2.46 MB, 1024x1024)
2.46 MB
2.46 MB PNG
>>102067298
>>
File: 1709845715945469.jpg (264 KB, 832x1216)
264 KB
264 KB JPG
>>102067264
>it is incapable of rendering artistic pieces unlike midjourney and dall-e
>>
is everyone using ai-toolkit for lora training on flux?
>>
>>102067206
Just copy and paste http://127.0.0.1:8188/ into your preferred browser. Comfy isn't actually running in the browser, it's just using it for a GUI.
>>
>>102067264
Hot woman wearing a bikini showing off cleavage, that is your prompt. Show the results from DE3 and MJ
>>
File: flux_cyber-env04.jpg (2.8 MB, 2080x2720)
2.8 MB
2.8 MB JPG
Hey I made the collage :). Hello Everyone. What's new?
>>
>>102067264
Glossed right over this seethe holy mad
>>
>>102067323
I'm back on Kohya now.
>>
File: ComfyUI_04505_.png (1.57 MB, 1024x960)
1.57 MB
1.57 MB PNG
>>102067206
>>102067288

Remember to combine this with the --disable-auto-launch flag so it doesn't launch your browser automatically.
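The two pieces combined, sketched as a small Python launcher instead of a .bat edit. The browser path and the `main.py` location are placeholders to fill in, and the portable install starts ComfyUI through its own bundled Python, so treat the server command as an assumption to adapt.

```python
import subprocess
import sys

COMFY_URL = "http://127.0.0.1:8188"


def comfy_command(main_py: str = "main.py") -> list[str]:
    """Command that starts ComfyUI without opening the default browser."""
    return [sys.executable, main_py, "--disable-auto-launch"]


def browser_command(browser_exe: str, url: str = COMFY_URL) -> list[str]:
    """Command that opens the UI in a specific browser executable."""
    return [browser_exe, url]


def launch(browser_exe: str, main_py: str = "main.py") -> None:
    """Start the server, then open the chosen browser.

    The page may need a refresh if the browser loads before the server is ready.
    """
    subprocess.Popen(comfy_command(main_py))
    subprocess.Popen(browser_command(browser_exe))
```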
>>
>>102066776
It won't make much of a difference. The reason Flux generalizes well is because it already has some familiarity with the concept. You can be sloppy with the captioning and dataset, and the result would still come out fine. Try training something new, and you'll run into difficulties.
>>
>>102067351
It's going to be exposed to most concepts, give a concept it won't know (outside of nudity)
>>
File: ComfyUI_32893_.png (1.59 MB, 1024x1024)
1.59 MB
1.59 MB PNG
>>102067233
I didn't expect the lora to turn out this cool. And it seems deis at 28+ steps is the way to go, no more smudgy patches..
>>
File: ComfyUI_32894_.png (1.73 MB, 1024x1024)
1.73 MB
1.73 MB PNG
>>
>>102067360
What's the LoRA? I love it.
>>
does it still count as an "ai gen" if i've had to manually inpaint parts of it over and over again, photoshop in small parts from other images, fill in some manual details, then img2img over the whole lot again, etc. etc. to get it looking just right?
>>
>>102067381
Yes, why wouldn't it be?
>>
>>102067381
It doesn't matter if other people can't tell the difference. Everyone "cheats" in one way or the other with their art.
>>
File: 2054-flux1-dev-78810064.png (1.87 MB, 896x1280)
1.87 MB
1.87 MB PNG
>>102067285
The way I've used it is to refine the style I like. I started by training on a ton of paintings I liked, but as time went on I got to the point where I liked the style of the images I was generating more than I liked the style of the original paintings. Eventually you have enough images in that style that lack obvious AI flaws like mangled fingers and nonsensical architecture that you can train on your own style without reinforcing the slop too much. I'm glad I can do that because Flux was not getting me the style I like with prompts alone.
>>
File: ComfyUI_32895_.png (1.68 MB, 1024x1024)
1.68 MB
1.68 MB PNG
>>102067374
>>102065945
>>
does it still count as an ai gen if i paint an entire picture and then resample it at 0.01 denoise?
>>
>>102067389
meh, just not sure if my stuff would qualify for these threads is all
might post some later then
>>
>>102067410
If the base image is AI generated it's AI
>>
File: 00108-1301610733.png (2.18 MB, 1024x1440)
2.18 MB
2.18 MB PNG
>>
>>102065945
>>AI learning prohibited
as if we need his authorization in the first place, no one owns a fucking style, those artists are fucking delusional
>>
>>102067400
My opinion is that for certain things that don't need detail like styles and img2img even, synthetic is the best at that. But for anything you need more like a lora, synthetic shouldn't take up more than 20% of your training data ideally, and should never exceed 50% regardless of circumstance.
>>
File: ComfyUI_04520_.png (1.33 MB, 832x1152)
1.33 MB
1.33 MB PNG
>>102067410
we like effortposters here anon
>>
>>102067328
>>102067350
Yeah, I knew that... I 3d print using klipper and have bookmarks for the raspberry pi the same way. Was just wondering if there was a lazier way to save me from having to remember to launch the browser manually since my brain isn't fully operational.
This is probably the simplest solution for now, I'm sure in time my faulty brain will get used to it. Thanks lads.
>>
File: 1715068512694676.png (1.93 MB, 1024x1024)
1.93 MB
1.93 MB PNG
>>
>>102067440
anon did you miss this post >>102067288
>>
>>102067440
All I know is that it will launch in whatever window I had selected last
>>
>>102067360
You were saying you had smudgy patches? I heard that Flux can occasionally do that on some sampler/scheduler combos and you should take the denoised_output from the custom sampler instead of the normal output, as this doesn't seem to have them (but is indistinguishable otherwise).

Haven't run into the issue to confirm this myself though.
>>
>>102067420
How many input images did you use for the asuka LoRA?
>>
File: 2024-08-25T130738.100.jpg (230 KB, 1024x1024)
230 KB
230 KB JPG
>>
>>102067479
I think 20 steps just wasn't enough, the pictures looked underbaked.
>>
>>102067479
>you should take the denoised_output from the custom sampler instead of the normal output
I always wondered what the difference between the two was, didn't know it would make such a difference, I'm gonna test that one out from now on
>>
Bread so bright and colorful...

>>102067488
>>102067488
>>102067488
>>
https://reddit.com/r/StableDiffusion/comments/1f0b45f/comment/ljqri6x/?utm_source=share&utm_medium=web3x&utm_name=web3xcss&utm_term=1&utm_content=share_button
this guy just used 15 pictures of his face and that's it? that's insane, now you can replace any actors just like that
>>
>>102067525
15 images was sufficient for SD 1.5 as well when it's just about creating a LoRA for a face. You need more for broader themes/styles though
>>
>>102067491
that version was 24 but it kinda sucks. I'm having a ton of problems getting an asuka lora to work the way nai/animerge can reproduce asuka.
>>
>>
yeeeahaaaa
>>
>>102066659
> The challenge now is to define codes of behavior without throwing out the maps that got us to the place we are now. This may be particularly painful in the arts world.
> maps
> By that standard, the museum might have to include hundreds, if not thousands, of warning labels, and not just for works made by heterosexual men with an erotic interest in girls.

is mma the museum of rape?
>>
>>102067438
images like this are weird. It could be made just as well with layers.

can flux generate a woman like this with a transparent background? then as long as the lighting works, you could use it with another background.




All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.