/g/ - Technology


Thread archived.
You cannot reply anymore.




Discussion of free and open source text-to-image models

Previous /ldg/ bread : >>101962774

>Beginner UI
EasyDiffusion: https://easydiffusion.github.io
Fooocus: https://github.com/lllyasviel/fooocus
Metastable: https://metastable.studio

>Advanced UI
Automatic1111: https://github.com/automatic1111/stable-diffusion-webui
ComfyUI: https://github.com/comfyanonymous/ComfyUI
Forge: https://github.com/lllyasviel/stable-diffusion-webui-forge
InvokeAI: https://github.com/invoke-ai/InvokeAI
SD.Next: https://github.com/vladmandic/automatic
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI

>Use a VAE if your images look washed out
https://rentry.org/sdvae

>Model Ranking
https://imgsys.org/rankings

>Models, LoRAs & training
https://civitai.com
https://huggingface.co
https://aitracker.art
https://github.com/Nerogar/OneTrainer
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts

>Flux
https://huggingface.co/spaces/black-forest-labs/FLUX.1-schnell
https://comfyanonymous.github.io/ComfyUI_examples/flux

>Pixart Sigma & Hunyuan DIT
https://huggingface.co/spaces/PixArt-alpha/PixArt-Sigma
https://huggingface.co/spaces/Tencent-Hunyuan/HunyuanDiT
https://huggingface.co/comfyanonymous/hunyuan_dit_comfyui
Nodes: https://github.com/city96/ComfyUI_ExtraModels

>Index of guides and other tools
https://rentry.org/sdg-link
https://rentry.org/rentrysd

>GPU performance
https://vladmandic.github.io/sd-extension-system-info/pages/benchmark.html
https://docs.getgrist.com/3mjouqRSdkBY/sdperformance

>Try online without registration
txt2img: https://www.mage.space
img2img: https://huggingface.co/spaces/huggingface/diffuse-the-rest
sd3: https://huggingface.co/spaces/stabilityai/stable-diffusion-3-medium

>Maintain thread quality
https://rentry.org/debo

>Related boards
>>>/h/hdg
>>>/e/edg
>>>/d/ddg
>>>/b/degen
>>>/vt/vtai
>>>/aco/sdg
>>>/trash/sdg
>>
File: file.png (2.37 MB, 896x1152)
>>101965787
>>
File: 00035-4111115055.jpg (587 KB, 1344x1728)
>>
File: file.png (1.4 MB, 1152x896)
Alright, I'm going to bed. Another 5 hour sleep night because of Flux.
>>
>Maintain thread quality
https://rentry.org/debo
>>
>>101965925
kool
>>
>>101965917
is this the worst op in ldg history?
>>
boil miku
>>
File: Dalle2.jpg (363 KB, 1200x778)
>>101965813
>absolutely not, have you actually looked at DALL-E 2 images?
Three things:
1. I used to spend hours reading the Dalle SubR/eddit to see their generations, like I do here.
2. I was the guy that posted that thread with "a woman holding a sign that says" that revealed how they were tampering with prompts, that went viral.
3. Call me when MJ6 can reproduce picrel.
>>
File: file.png (1.6 MB, 1152x896)
>>101965983
Let's make it the best thread in /ldg/ history.
>>101965961
The best series of images always come from anons sharing their prompts. More prompts should be shared.
>1990 scan screencap sharp still photograph chrome textured biomechanical macro slimy wet dripping boiled creatured smoke dark
>>
So how are y'all training Flux loras under 24GBs of VRAM? What boxes do I need to tick for kohya?
>>
File: 9988.jpg (1.55 MB, 1792x2304)
using adetailer by mistake made a cool creepy gen.
>>
File: readingtime.png (2.11 MB, 1600x960)
credit to whoever made the original
>>
>>101965983
not my fault that he chose my shittobeans gen.
>>101966015
BRO
>>
>>101966016
AdamW8bit
>>
>>101966005
Holy shit, you were the guy that posted that thread with "a woman holding a sign that says" that revealed how they were tampering with prompts, that went viral?

Hoooooly shit. I didn't know we were in the presence of royalty - everything you say must be objectively true because you made a post that went viral that one time.

Get over yourself you fucking loser, holy shit. I'm not even that anon and I think this might be the most cringe thing I've ever heard.
>>
File: 00007-666087595.png (1.63 MB, 1024x1440)
>Stopped lora training because samples were garbage
>Try semi-trained lora anyway
>It's actually pretty good just very undertrained
>Didn't save training state
Fuuuuuuuuck
>>
>>101966085
you live and learn
>>
File: 00089-1566602826.jpg (260 KB, 1600x1344)
it's a hard life
>>
File: ifx113.png (1.49 MB, 1024x1024)
>>101966015
ok
>>
File: DucHaitenDreamWorld0.png (840 KB, 768x768)
>>101965480
Part of being soulful is being unique, try changing your seed, see how it produces the same picture basically.
I changed the seed on mine and got a completely different picture.
That's a problem with Flux, you generate something, other seeds would give you basically the same thing, in a model like DreamWorld you can't specify where anything is, but every single seed is completely different, as it should be.
>>
>>101966085
And here's the sample
>>101966089
We go again
>>
>>101966005
Cool gen, shows nothing that makes "a similarly enhanced prompt would easily DE2 > MJ6" more true.
>>
Catjak defeated debo
>>
>>
File: 1944035404.jpg (2.85 MB, 2688x1536)
>>
>>101966073
>you made a post that went viral that one time.
Actually, other threads of mine went viral in the r/MBTI and r/ENFP subreddits too.
This one was cool because I wasn't the guy that discovered it, I copied his thread and mine got more up arrows than the original.
>>
>>101966154
you sir WON the internet on this blessed day!
>>
>>101966101
instead of focusing on the weaknesses of flux, why not play to its strengths? the past is the past. time to move on - doesn't mean the old tools should be abandoned tho.
>>
>>101966118
DE2 closed its doors and nobody can use it anymore unless they already had an account, so we can't compare them, plus, they're both online and offtopic, but, if MJ6 can't make a more artistic gen despite its enhancer it is the inferior model in this department.
>>
File: ComfyUI_31983_.png (1.33 MB, 1024x1024)
>>
File: delux_ci_00049_.png (1.9 MB, 1536x968)
>>101966149
this isn't flux, is it? I've had zero luck getting it to do ps1 era lowpoly
>>
File: file.jpg (207 KB, 1024x576)
>>101966195
if neither can create picrel they are both the inferior models
>>
>>101966209
There's a lora for it.
>>
>>101966005
DALLE2 was sovl, but Flux LoRAs can replicate that sovl pretty easily, look here
https://civitai.com/models/639820/kazuma-sketchbook-flux?modelVersionId=715659

This pretty much reminds me of Dalle2.
>>
>>101966195
if you didnt know, youre arguing with abstractschizo who has been doing this for over a year now. any gains in prompt comprehension are lost to him if it cant do nonsensical SD 1.4 looking garbage. he advocates for the sabotaging of local models until they all just generate 'le abstract art' junk as if the trash he already has isn't more than capable.
>>
File: ComfyUI_31985_.png (1.2 MB, 1024x1024)
>>
>>101966239
is this the lora that breaks when the gen ratio isn't 1:1?
>>
>>101966259
?? https://civitai.com/images/23956011
>>
File: ComfyUI_31986_.png (1.34 MB, 1024x1024)
>>
>>101966181
>why not play to its strengths
I can't run local and depend on huggingface spaces, so no GuidanceChange+Tonemapping+NegativeGuidance for me, meaning not much change to the default style, so... finding out how to make half-eaten hot dogs has been the most exciting thing about it for me.
>>
File: 1710137666062460.png (1.01 MB, 1024x1024)
>>
File: ifx124.png (1.27 MB, 1024x1024)
praise lord reekris
>>
>>101966305
Please tell me that's cum and not mayo.
>>
>>101966211
Specifically at being artistic, probably, yeah.
CraiyonV2 could do your picture and pretty much any artistic style out there, so it may be on top in this department, the most artistic SD1.5 finetune (OpenNijiRemix) looked like a joke compared to CraiyonV2.
>>
Fran needs to forgive him already
>>
>>101966280
most of the gens you saw from me here today were done with no negative but meh, ok. that sucks.
>>101966284
he's got the trumpface LOL
>>
>>101966305
Great. The bread isn't conventional hot dog bread but it's pretty cute, I like it.
>>
>>101966315
man you're just full of shit
>>
File: 00104-2229557066.png (3.47 MB, 1344x1728)
>>
how do I go about generating my own femboys?
>>
>>101966335
Describe what you want in the prompt field. (gay ass nigga)
>>
File: ComfyUI_31988_.png (1.12 MB, 1024x1024)
>>
File: lazypepe.png (2.12 MB, 1018x1018)
>>101966284
SDXL makes sufficient Pepes
>>
File: ComfyUI_31989_.png (1.06 MB, 1024x1024)
>>
>>101966347
Uhhh sir there is smoke coming out of Pepe's toe
>>
>>101966362
super easy to fix but why bother?
>>
for kohya_ss, what does --resolution specify? is that max resolution?
>>
File: Craiyon2.png (526 KB, 620x413)
>>101966328
You may be too young to remember. Back then free online was giving us 9 pictures in less than a minute with this level of detail.
>>
>>101966377
no, anon, I was there too
you're full of shit
>>
File: ComfyUI_31984_.png (1.2 MB, 1024x1024)
>>
File: ifx126.png (1.57 MB, 1024x1024)
>>
>>
>>
>>
>>
>>101966085
Mail!
>>
>>
>>101966149
I forgot what thread I was in and was about to start searching f95 for the game. Ah well. Good job.
>>
File: MoreCraiyon2.png (1.34 MB, 780x974)
>>101966391
About what? I can't even be making stuff up, I'm posting proof. Craiyon2 disappeared when Craiyon3 landed and these can't be made anymore, but modern models like Flux can never make something like this without making the prompt several paragraphs long, if at all.
>>
File: Flux2.png (792 KB, 1200x768)
FLUX 2 JUST DROPPED
CHECK CNN
>>
>>101966458
we know why Flux has issues with that but would you trade Flux for Craiyon2?
>>
File: ifx128.png (1.78 MB, 1024x1024)
>>
>>101966467
Kek, Flux v1.1 or something that fixes our issues with styles would be great.
>>
>>101966377
>this level of detail.
>256x256
>>
File: ComfyUI_31998_.png (1.35 MB, 1024x1024)
>>
File: ComfyUI_32000_.png (1.35 MB, 1024x1024)
>>
File: OpenNijiRemix.png (1006 KB, 768x768)
>>101966473
No, Craiyon couldn't even draw faces! And as I said, there's SD1.5 models that scratch that itch.
But let's not pretend MidJourney 6 is the be all, end all model regarding artistry capabilities, it may even be only top 5 behind the models I've mentioned.
>>
>>101966362
He's just having a smoke. Frogs are amphibious, you know.
>>
File: ComfyUI_32003_.png (1.46 MB, 1024x1024)
>>
>>101966574
Have you ever considered just training a lora for what you want
>>
File: Chaingirl-Surreality.png (888 KB, 768x768)
>>101966510
It was more detailed and had better compositions than SD1.5 models at 512x512 shrunk down to 256x256. If instead of giving us 9 pictures they gave us one at x4 the size, they'd be the most detailed AI pics we'd have to date.
>>
File: ComfyUI_32004_.png (1.26 MB, 1024x1024)
>>
File: ifx131.png (1.58 MB, 1024x1024)
>>101966635
I want some of what this niggas smoking i'll tell u hwat
>>
i tinkered a bit with adaptiveguider and ultimately decided just tuning up whatever the fuck fluxguidance does yielded overall better results, i think it might only be preferable for wildly abstract compositions
>>
File: ComfyUI_31994_.png (1.15 MB, 1024x1024)
>>
File: ComfyUI_32005_.png (1.45 MB, 1024x1024)
>>
>>101966635
you're retarded
>>
>>101966632
This level of detail can't be attained by training a Lora, it'd need to be trained from scratch. It's like trying to make a Lora to add the detail of SD1.5 to SDXL. Check this out:
https://civitai.com/models/139565/realistic-stock-photo
PromptSharingSamaritan couldn't do it so for the next version of his model he had to go back to SD1.5 technology instead of SDXL, THAT gave him the level of detail desired, and no Lora training is going to give it to Flux, either.
>>
>>101966674
o fuck my eyes ouch
>>
>>101966674
you... you can't run Flux can you?
>>
File: ComfyUI_32007_.png (1.45 MB, 1024x1024)
>>
File: ComfyUI_32008_.png (1.32 MB, 1024x1024)
>>
File: ComfyUI8.18.2024__00157_.png (3.44 MB, 1248x1824)
What are you guys doing for upscaling with flux? Would appreciate a workflow, not sure if mine is ideal.
>>
File: ComfyUI_00004_.png (2.71 MB, 1248x1848)
>>
File: ComfyUI_32009_.png (1.34 MB, 1024x1024)
>I'm gonna need two swords for this
>>
>>101966735
Wheeeeeeeeee!
>>
>>101966674
>This level of detail can't be attained by training a Lora
Yes it can
>>
>>101966732
I'll be honest, I'm too lazy to upscale.
>>
File: ComfyUI_32010_.png (1.55 MB, 1024x1024)
>>
>>101966744
I feel you.
>>
File: CraiyonVsFlux.png (244 KB, 511x257)
>>101966673
Do you let your insults argue for you?
Here, I cut a 256x256 piece from this anon:
>>101966489
And put it against one of the outputs from here:
>>101966458
Do you now see what I mean by the level of artistic detail? How if these 256x256 patches were used to make a 1024x1024 picture we'd have a level of it never seen before?
>>
File: ComfyUI_temp_dfkqa_00116_.png (1.21 MB, 1024x1024)
how should i use clip_l and t5 differently? the string for both here is

anime style model sitting in a chair, spread legs, wearing lingerie and stockings, arrogantly smirking, nude, luxurious room, panties, cleavage

and thanks to BFL's commitment to safety the result won't get you immediately PROMOTED

i know T5 is more adept at reading long form text but i dont think that necessarily yields better results
>>
@101966781
elaborate troll hmm
>>
>>101966781
you're actually retarded
>>
>>101966781
because it wasn't trained on 256x256 images based retard
>>
@101966781
it's also not flux pay attention
>>
>>101966796
BREASTS
>>
>>101966747
Well, once it happens generate an image with it and cut a 256x256 square out of it and do a comparison like this one:
>>101966781
And let's see if the detail holds up.
>>
File: ComfyUI_32012_.png (1.35 MB, 1024x1024)
>>
File: up.png (219 KB, 2203x671)
>>101966732
here. works really well on the type of image you have there.
-I tried all upscale models, including a ton of DAT ones and ultramix balanced is just the best for almost all cases.
-below 16 steps you lose detail
-noise injection between upscaler and resampler is invaluable.
-WIP
>>
File: ComfyUI_32013_.png (1.36 MB, 1024x1024)
>>
>>101966809
>Well, once it happens generate an image with it and cut a 256x256 square out of it and do a comparison like this one:
why? the more apt comparison is generating at 256x256 and comparing that directly
>>
File: ComfyUI_Flux_02127_.png (1.74 MB, 768x1152)
Manga finetune test
>>
File: 00114-111383475.png (3.83 MB, 1344x1728)
>>
File: ComfyUI_32014_.png (1.41 MB, 1024x1024)
>>
>>101966836
Comfy.
>>
>>101966801
You're explaining WHY its level of detail isn't as good, not that it was.
You just conceded.
A superior artistic model would show detail this high no matter at what part of the picture you looked.
>>
>>101966819
did this happen to be in your test bed?
https://huggingface.co/uwg/upscaler/blob/main/ESRGAN/4x_NMKD-Superscale-SP_178000_G.pth
it ended up being my general purpose go-to, with 4x-animesharp yielding better results for more "2d" images
>>
Is there a list for known artists for flux?
>>
File: ComfyUI_32015_.png (1.26 MB, 1024x1024)
>>
File: Capture.png (37 KB, 1216x375)
>>101966732
>>
>>101966853
5/10 decent effort bait but not quite believable
>>
File: ComfyUI_32016_.png (1.43 MB, 1024x1024)
>>
This happens when combining 2 LoRAs with Q8. Just gets stuck in a loop.
Can it only handle one at a time?
>>
File: CraiyonVsFlux0.png (245 KB, 507x256)
>>101966804
>Believes I do it for the (You)s
Lmao even, it doesn't matter, what I say applies to any Flux generation.
Here's one for this one:
>>101966153
>>
>>101966898
if you can use q8, can you use regular fp8?
>>
File: ComfyUI_32017_.png (1.6 MB, 1024x1024)
>>
>>101966857
yes used it for ages and still do for sd15 and sdxl occasionally, its softer, really good one. BUT it struggles with FLUX BOKEH and heavy grain.
if you are on forge you can enable noise injection into img2img and highres fix in the options (and put it in the UI) btw, 0.065 - 0.1
>>
lol
>>
File: ComfyUI_32018_.png (1.54 MB, 1024x1024)
>>
>>101966825
You'd probably find a better square to compare to among the 16 256x256 ones, but, okay, sure.
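For reference, the "16 256x256 ones" figure is just the tiling arithmetic for a 1024x1024 image. A minimal sketch, assuming non-overlapping tiles on a regular grid; `tile_boxes` is a hypothetical helper, not part of any UI mentioned in the thread:

```python
def tile_boxes(width, height, tile):
    """Return (left, top, right, bottom) boxes for non-overlapping tiles.

    Edge remainders smaller than `tile` are ignored.
    """
    boxes = []
    for top in range(0, height - tile + 1, tile):
        for left in range(0, width - tile + 1, tile):
            boxes.append((left, top, left + tile, top + tile))
    return boxes


boxes = tile_boxes(1024, 1024, 256)
print(len(boxes))  # 16
print(boxes[0])    # (0, 0, 256, 256)
```

Each box uses the (left, upper, right, lower) order that Pillow's `Image.crop` accepts, so the same list could drive a crop-by-crop comparison.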
>>
File: ComfyUI_01011_.png (147 KB, 256x256)
>>101966902
Here I generated a 256 native image in flux, it has more detail than that crap on the first try
>>
>>101966906
Never even considered it since the Q's came out since it looks so much better than FP8. Will try.
>>
File: flux_00124_.png (1.62 MB, 896x1160)
>>
File: delux_ci_00051_.png (1.73 MB, 1536x968)
>>101966658
same. all these cfg/guidance tricks seem to produce worse outputs without much of an increase in control

>>101966732
...upscaling?

>>101966796
I think t5 is more literal and clip is more style-friendly. a major difference is that t5 has a 512 token limit while clip has 77. with your prompt, that doesn't matter but with long prompts, it would make sense to pick out the more important bits for clip

I think its an open question though. when sdxl came out, it had two clips and people found some interesting ways to work with them separately. I'm sure there's room for experimentation in flux too

>>101966919
lmao
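
On the clip_l vs. t5 point in this post, a rough sketch of feeding the two encoders different strings: the full prompt goes to t5, and only the leading phrases go to clip_l. Everything here is an assumption for illustration: `split_for_encoders` is a hypothetical helper, the word budget is a crude stand-in for the 77-token limit (a real tokenizer usually yields more tokens than words), and it presumes the most important phrases come first.

```python
def split_for_encoders(prompt, clip_word_budget=50):
    """Return (clip_l_text, t5_text) for a comma-separated prompt.

    ASSUMPTION: word count only approximates token count.
    Phrases are kept in order until the budget would be exceeded.
    """
    phrases = [p.strip() for p in prompt.split(",") if p.strip()]
    kept, used = [], 0
    for phrase in phrases:
        n_words = len(phrase.split())
        if used + n_words > clip_word_budget:
            break
        kept.append(phrase)
        used += n_words
    return ", ".join(kept), ", ".join(phrases)


clip_text, t5_text = split_for_encoders(
    "watercolor painting, a lighthouse on a cliff at dusk, stormy sea, seagulls",
    clip_word_budget=8,
)
print(clip_text)  # watercolor painting
print(t5_text)
```

The two strings would then go into whatever separate clip_l and t5 text fields the frontend exposes.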
>>
go away black niggers
>>
File: ComfyUI_32019_.png (1.43 MB, 1024x1024)
>>101966945
Q6_K shouldn't be much worse than Q8, try it.
>>
Hello
>>
>101966950
Go away.
>>
>>101966963
I can run it fine, or is the issue I am oom?
It seems to be generating on fp8
>>
File: 00121-1718605986.jpg (449 KB, 1344x1728)
>>
Im OOMing with controlnet, any way to get my controlnet on my 3060 instead of my 3090? I can do it with CLIP and the VAE but haven't found anything about controlnet
>>
File: ComfyUI_00010_.png (3.22 MB, 1536x1536)
>>101966746
Nooo don't do it anon
>>
>>101966968
hello sir
>>
File: 1024x1024.png (1.5 MB, 1024x1024)
>>101966936
>>101966936
And it loses all the detail when moving to 1024x1024? There's no single place I can cut from my picrel that is as detailed.
>>
>>101966968
So are you and debo migrating to here finally?
>>
File: ComfyUI_32020_.png (1.44 MB, 1024x1024)
>>
File: 00127-908752457.png (2.61 MB, 1344x1728)
>>
>>101966997
I know you're baiting but yes, the model learns to give a certain amount of details on a per-image basis rather than on a per-pixel basis (obviously)
>>
flux doesnt give me soothing coil whine from my gpu, shit tier model

>>101966945
in my A/Bing i thought they were a wash, with q8 maybe doing finer details but fucking up more on fingers, but the main thing was fp8 yielded faster gens
>>
File: ComfyUI_Flux_02136_.png (1.48 MB, 1024x1024)
>>
>>101967027
the coil whine only happens with LLMs
wish it happened with image models, it's funny to hear the gpu make a sound for every word it says
>>
>>101967027
My issue is specifically with combining LoRAs. With a single LoRA it's fine, but I want to test multiple subjects in a single image. I doubt it will work, as it will likely just average the faces, but wanted to try before training a LoRA with the 2 subjects
>>
File: ComfyUI_Flux_02137_.png (1.19 MB, 1024x1024)
>>
>>101966987
I'm not sure but out of curiosity what controlnets are you using that use that much vram?
>>
>>101966819
Can you share a catbox with the workflow. thanks bro
>>
>>101967041
1.5 and SDXL would sing for me
>>
>>101967016
I'm not baiting, I claim that if when Flux had to draw a house of this size:
>>101966936
On a 1024x1024 pic, it did it with that level of detail, it'd surpass any other artistic model.
Instead, the smaller things get, the more detail they lose.
I'm not even wrong because this is obvious, something like Flux Schnell should use 4 steps to get here and more steps to add detail on a per-pixel basis like it does on 256x256 generations, and the level of detail would be incredible.
>>
>>101966968
Hello Quokka, you are welcome here, but you need to drop the Quokka. Contrary to popular belief, the move to /ldg/ was not entirely motivated by alternatives to SAI products. We also hate avatar fags.
>>
File: 1700721779648707.png (1.64 MB, 1024x1024)
sierra pc adventure game lora

Miku Hatsune outside a castle surrounded by water, orthographic style scene, in the style of a (pixel art:1.5) (sierra on-line: 1.5) (adventure game:1.4) from the 80s and 90s

the stuff after water was in the sample images so it seems to work.
>>
File: ComfyUI_32023_.png (1.35 MB, 1024x1024)
I asked gemma-2-2b-it-abliterated-IQ4_XS for a "detailed description of a harrowing, macabre, grotesque, hellish, debased and utterly morally corrupt screenshot from an old pixel art retro game with the theme of" stuff like "hard sci-fi", "robots", or "Nyarlathotep", and the prompts it's generating produce pretty awesome results.
>>
File: 00127-1558550355.png (2.6 MB, 1728x1344)
>>
File: ComfyUI_32025_.png (1.32 MB, 1024x1024)
>>
File: flux_00132_.png (1.69 MB, 896x1160)
>>
File: ComfyUI_Flux_02140_.png (1.78 MB, 1024x1024)
>>
>>101967092
kek
>>
File: Crayon.Safetensors.png (835 KB, 768x768)
>>101966968
>Crayon art style
Excuse me? Wasn't that supposed to look more similar to my picrel? Did people just give up on styles and add it to the prompt for the lulz?
>>
>>
>>101967103
Loving these, keep it up
>>
very good gens
>>
Oh lordy she's genning
>>
>>101967120
ganbare
>>
>>101967111
That's what FLUX gives me when i ask for crayon art style
>>101967001
I don't mind posting in both threads.
>>
File: ComfyUI_Flux_02141_.png (1.56 MB, 1024x1024)
>>
>>101967120
gguf? you have to update for k quants
>>
>>101967165
No this is fp8, I am using 2 LoRAs at once. When using Q_8 it ran into loop error here: >>101966898
I just downloaded the Q6_K, after this one image generates in an hour I will test Q6_K
>>
>>101967165
>>101967178
I have a 4080 btw
>>
File: ComfyUI_32024_.png (1.03 MB, 1024x1024)
>>
File: flux_00137_.png (1.84 MB, 896x1160)
>>101967116
you bet
>>
>>101967178
>>101967187
that's too slow for a 4080, post workflow
>>
File: ComfyUI_Flux_02143_.png (2.11 MB, 1024x1024)
>>
File: ComfyUI_32028_.png (1.25 MB, 1024x1024)
>>
File: ComfyUI_32029_.png (1.37 MB, 1024x1024)
>>
>>101967187
something is amiss
>>
File: ComfyUI_32030_.png (1.42 MB, 1024x1024)
>>
File: ComfyUI_32031_.png (1.31 MB, 1024x1024)
>>
>>101967205
It's fine if I am using a single LoRA. I get 1.5it/s. Combining them turned into this mess.
Workflow https://files.catbox.moe/t6c0dw.json
>>
File: ComfyUI_Flux_02146_.png (1.77 MB, 1024x1024)
>>
>>101967271
Awesome hatching
>>
File: ComfyUI_32032_.png (1.46 MB, 1024x1024)
>>
File: ComfyUI_32033_.png (1.4 MB, 1024x1024)
>>
File: 1716982870198961.png (1.67 MB, 1024x1024)
>>101967089
>>
>>101967163
>>101967209
>>101967271
It's over, mangakas are finished.
>>
File: ComfyUI_32034_.png (1.21 MB, 1024x1024)
>>
>>101967089
>(pixel art:1.5) (sierra on-line: 1.5) (adventure game:1.4)
weighting doesn't actually work in flux, right?
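
For anyone unsure what the `(text:1.5)` syntax is even supposed to mean: in A1111-style frontends it gets parsed into text chunks with a numeric weight before encoding. A minimal sketch of that parse only; `parse_weights` is a hypothetical helper, it ignores nesting and escapes, and it says nothing about whether a given Flux frontend actually applies the weights, which is the open question here.

```python
import re

# Matches "(some text:1.5)"; whitespace around the number is tolerated.
WEIGHTED = re.compile(r"\(([^():]+):\s*([0-9]*\.?[0-9]+)\s*\)")


def parse_weights(prompt, default=1.0):
    """Split a prompt into (text, weight) chunks in order of appearance."""
    chunks, pos = [], 0
    for m in WEIGHTED.finditer(prompt):
        plain = prompt[pos:m.start()].strip()
        if plain:
            chunks.append((plain, default))
        chunks.append((m.group(1).strip(), float(m.group(2))))
        pos = m.end()
    tail = prompt[pos:].strip()
    if tail:
        chunks.append((tail, default))
    return chunks


for text, weight in parse_weights(
    "in the style of a (pixel art:1.5) (sierra on-line: 1.5) (adventure game:1.4) from the 80s"
):
    print(weight, text)
```

If the frontend passes the prompt to the text encoder untouched, the parentheses and numbers are just more characters in the prompt.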
>>
>>101967049
the x-labs controlnet that uses 1.5GB with flux which uses 23GB
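
The arithmetic behind the OOM, as a tiny sketch. The 23 GB and 1.5 GB figures are the ones quoted in this post; the 24 GB capacity for the 3090 is an assumption on my part, and `fits` is a hypothetical helper that models no other overhead (activations, VAE, text encoders).

```python
def fits(vram_gb, *component_gb):
    """True if the summed components fit in the card's VRAM (no overhead modeled)."""
    return sum(component_gb) <= vram_gb


print(fits(24, 23, 1.5))  # False: 24.5 GB needed on a 24 GB card
print(fits(24, 23))       # True: the model alone just fits
```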
>>
File: 00141-2448221293.png (3.29 MB, 1728x1344)
>>
File: ComfyUI_00014_.png (3 MB, 1536x1536)
>>
File: ComfyUI_32035_.png (1.31 MB, 1024x1024)
>>
File: flux_00144_.png (1.75 MB, 896x1160)
>>
>>101966898
This may be fixed on the latest version of the node.
>>101967253
2 LoRAs for me get an OOM even on the safetensors version on latest comfy with torch 2.4.0, I assume fp8 for you just overflows into system RAM, which is why the gen times end up like that.
>>
>>101967339
>flux which uses 23GB
You should probably just use a quant but you do you
>>
>>101967358
Yeah this is exactly what I am thinking. Q6_k also loops
I haven't updated today so I will try that.
>>
File: 0.jpg (341 KB, 2048x1024)
>>
File: 1708062916769479.png (1.55 MB, 1024x1024)
>>101967336
not sure. im testing various stuff now.

<lora:Classic_Sierra_On-line_Graphics:1> Miku Hatsune outside a small town in the forest with shops selling potions, orthographic style scene, in the style of a sierra on-line adventure game from the 80s and 90s, pixel art

I think this is better
>>
File: ComfyUI_32037_.png (1.15 MB, 1024x1024)
>>
File: ComfyUI_32038_.png (1.36 MB, 1024x1024)
>>
File: ComfyUI_32039_.png (1.34 MB, 1024x1024)
>>
File: ComfyUI_Flux_02154_.png (2.06 MB, 1024x1024)
>>101967324
It barely follows my prompt though, I'd say it's about 70-80% as good as Flux is at prompt following, but it still does decent styles and seems to know some mangakas.
>>
>>101967380
>>101967358
Updating fixed it. It merges the faces as I expected though kek.
I wonder if training on multiple subjects will work.
>>
File: ComfyUI_32040_.png (1.25 MB, 1024x1024)
>>
File: 1722893943870469.png (1.76 MB, 1024x1024)
>>101967401
medieval city marketplace:
>>
>>101967401
>use leek on shrubbery
>>
File: flux_00152_.png (1.73 MB, 896x1160)
>>
File: Capture.png (991 KB, 1363x770)
>Higher contrast!
>Vibrant colors!
>Pay up, goy!
>>
File: ComfyUI_32042_.png (1.26 MB, 1024x1024)
>>
File: 1724004325427085.png (2.68 MB, 1080x1893)
>>101961683
I feel like I have to share this post from this anon again because that's a very interesting one, especially if you want to make your output work on less steps
>Here's a comparison of the samplers using the simple scheduler. heunpp2 converges the fastest but is also the slowest. But other samplers start pulling ahead at the 10 step mark. I also found in testing that ipndm gives the best results from 15-20 steps if you look at the pie and background details, before deis takes over at 20-30 steps (looking at the pie again), and then every scheduler converges above that step count. Mostly using ipndm at 15 steps for that reason for my gens, and deis if I want something better. I like ddim_uniform better so I am using that as a scheduler instead of sgm or simple.
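
A comparison like the quoted one boils down to a fixed-seed sweep over sampler, scheduler and step count. A minimal sketch of building that job list; the sampler and scheduler names are the ones mentioned in the quote, while the step values, the `sweep` helper and the settings-dict shape are assumptions, not any UI's actual API.

```python
from itertools import product

# Names taken from the quoted post; check them against your UI's own lists.
samplers = ["ipndm", "deis"]
schedulers = ["simple", "ddim_uniform"]
steps = [10, 15, 20, 30]


def sweep(samplers, schedulers, steps, seed=0):
    """Yield one settings dict per combination, all sharing the same seed."""
    for sampler, scheduler, n in product(samplers, schedulers, steps):
        yield {"sampler": sampler, "scheduler": scheduler, "steps": n, "seed": seed}


jobs = list(sweep(samplers, schedulers, steps))
print(len(jobs))  # 16
print(jobs[0])
```

Keeping the seed and prompt fixed across every job is what makes the grid cells comparable.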
>>
>>101967520
>Flux.1 D
How can he monetize this shit? I thought it was illegal for Flux Dev because of its licence?
>>
>>101967508
cool boris vallejo vibes
>>
>>101967539
I think you pay with their onsite currency so maybe that's a way of getting round it, idk. Never buying "buzz" or paying for anything AI related
>>
File: 1695865888652570.png (534 KB, 1024x1024)
>>101967494
>>
>>
>>101967539
Doesn't the Buzz thing avoid that? Like, gambling is illegal, but if you win, I give you 5 Teddy bears, and tomorrow I'll buy each for $100, so it's like you won $500, except giving you them for free and buying them from you isn't illegal.
>>
File: ComfyUI_temp_byecv_00031_.png (1.93 MB, 1024x1024)
>>101967520
>paying to use slop
>>
>>101967576
>Like, gambling is illegal, but if you win, I give you 5 Teddy bears, and tomorrow I'll buy each for $100, so it's like you won $500, except giving you them for free and buying them from you isn't illegal.
Literally the Mr.Beast situation kek
>>
File: ComfyUI_32045_.png (1.43 MB, 1024x1024)
>>
>>101967559
That's the thing, they give buzz away for free, once I made a post in their forum, people kept reacting to it and I got like 150 buzz for a post.
It's complicated.
>>
>>
>>101967520
Can someone run his gens through regular flux? I'd like to see the difference
>>
File: ComfyUI_32046_.png (1.36 MB, 1024x1024)
>>
File: flux_00157_.png (1.71 MB, 896x1160)
>>
>>101967531
>20-50 is the same picture
>50-100 are totally different
what happens in there?
>>
>>
>>101967611
Finally oneeloli
>>
File: 0.jpg (524 KB, 2048x1024)
>>
>>101967531
>lcm wants to be different so bad
maybe lcm is the real slop remover? but I'm not gonna go for 100 steps for it, even at 30 it's painfully slow
>>
File: Capture.png (13 KB, 410x283)
>>101967520
Holy fucking kek we're back to
>by artgerm and greg rutkowski, alphonse mucha, cgsociety
>>
>>101967520
>20.34GB
it's not even the fp16 flux model, what a fucking retard, we need that shit to make quants out of it
>>
File: 0.jpg (99 KB, 1024x1024)
>>
File: flux_00160_.png (1.45 MB, 896x1160)
kek
This one made me laugh.
The implication being that
..................................................................................................................................................................................................................................................................................................................... (you) are the other ape
>>
>>101967324
it basically gives power to mangaka/comic artists.
their job is more like movie director than illustrator.
but yes, those with a single style, unattractive characters, mediocre screen structure and acting, and boring storytelling will lose their jobs.
>>
File: delux_ci_00016_.png (1.99 MB, 1536x968)
>>101967681
>leave those ladies alone >:(
>>
>>101967654
>by *artist*
Why is that a bad thing?
>>
>>101967654
as it should, how else are you gonna simulate an artist style in a simpler way than simply adding the artist name?
>>
File: ComfyUI_32051_.png (1.33 MB, 1024x1024)
>>
for people training flux w/ kohya sd-scripts, are you using keep_tokens_separator = ","

i haven't tested yet with it set to "," but i noticed it's splitting the captions into tags due to commas being present, ex:

\"this image is a digitally created artwork\": 3, \"likely in a style reminiscent of anime or manga. the scene depicts a surreal\": 1, \"dreamlike cityscape with a foggy\": 1, \"cloudy atmosphere. the sky above is a pale\": 1, \"misty blue\": 2, \"with wispy clouds that blend seamlessly into the cityscape. below the clouds\": 1, \"a dense\": 1, \"layered cityscape is visible\": 1,
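
The dump above is what you'd expect if sentence-style captions get split on commas and each fragment is counted as a "tag". A small sketch that reproduces that effect; this is not kohya's code, `tag_counts` is a hypothetical helper, the lowercasing is only there to match the look of the dump, and the second caption is made up for the example.

```python
from collections import Counter


def tag_counts(captions, separator=","):
    """Count the fragments produced by splitting each caption on `separator`."""
    counts = Counter()
    for caption in captions:
        for fragment in caption.lower().split(separator):
            fragment = fragment.strip()
            if fragment:
                counts[fragment] += 1
    return counts


captions = [
    "This image is a digitally created artwork, likely in a style reminiscent of anime or manga.",
    "This image is a digitally created artwork, with a pale, misty blue sky.",
]
for fragment, n in tag_counts(captions).items():
    print(n, repr(fragment))
```

Sentences broken mid-clause like this are harmless as statistics, but they matter if the trainer also shuffles or drops "tags" using the same split.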
>>
File: 00159-3957456854.png (2.5 MB, 1728x1344)
>>
>>101967701
>>101967712
It's not necessarily a bad thing. There was a period where those specific artists were included in EVERY SD1.5 prompt, to the point that it became somewhat of a meme
>>
>>101967520
dude can lick my balls
>>
File: ComfyUI_32052_.png (1.25 MB, 1024x1024)
>>
joytrainer users, have you managed to use any better llms than llama 4 bit cos that model is very retarded

So here is gpt telling me why I can't use ooba for image captioning, is it wrong and can anyone think of ways to send ooba embedding crap?

>If you are trying to send embeddings directly to a model (like in the original script using LLaMA with 8-bit quantization), and the model can process inputs_embeds, then the issue here is that the Oobabooga API, being a drop-in replacement for OpenAI, expects text prompts rather than tensor embeddings. This discrepancy prevents you from using the API in the same way as you would with a local LLaMA model that accepts embeddings.

>The Oobabooga API, designed to be OpenAI-compatible, doesn't directly support the input of tensor embeddings. Instead, it expects a string-based prompt as input. In contrast, when using the original LLaMA model in your script, you were directly injecting embeddings into the model through the generate method. Since the Oobabooga API cannot accept embeddings in this form, the only way to generate captions via the API is through text prompts.
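
For reference, a minimal sketch of what a text-only request body for an OpenAI-style completions endpoint looks like. Assumptions: the URL below is only a guess at a local ooba instance with the API enabled, and `build_completion_payload` is a hypothetical helper, not part of any library. The sketch only builds the JSON body and does not send anything.

```python
import json

# Assumption: local text-generation-webui with its OpenAI-compatible API enabled.
# Adjust host, port and path to whatever your own instance actually exposes.
API_URL = "http://127.0.0.1:5000/v1/completions"


def build_completion_payload(prompt, max_tokens=200, temperature=0.7, top_p=0.9):
    """Build a JSON-serializable body for an OpenAI-style completion request.

    Hypothetical helper: this sketch only accepts a plain text prompt and
    rejects anything else, such as a nested list of embedding values.
    """
    if not isinstance(prompt, str):
        raise TypeError("this sketch only sends text prompts, not embeddings")
    return {
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
        "top_p": top_p,
    }


payload = build_completion_payload(
    "Write a descriptive caption for an image of a foggy cityscape."
)
body = json.dumps(payload)
print(f"POST {API_URL}")
print(body)
```

The image information has to be turned into text (or handled by a model that runs locally and accepts `inputs_embeds`) before it can go through this kind of endpoint.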
>>
>>101967701
>>101967712
The people who write "by greg rutkowski" don't even know what that means.
>>
>>101967733
>>101967756
I still include artists.
>>
File: ComfyUI_32053_.png (1.34 MB, 1024x1024)
>>
File: 00105-1678592431.jpg (146 KB, 1080x1280)
>>
>>101967756
why should they know the inner workings of everything? When I go for the heun sampler I have no idea how it works on the inside, but when I press generate I notice it gives me the gens I want
>>
>>101967701
>>101967712
Did you even see the model/artists used? The style doesn't even resemble them, it is pure SDXL/1.5 synthetic looking slop.
>>
File: 1705543207650758.png (1.76 MB, 1024x1024)
>>101967570
>>
>>101967777
Not knowing the math behind a sampler and mindlessly rattling off greg rutkowski and uploading it to civit are not the same thing.
>>
>>101967816
I get what you say though, if you prompt "greg rutkowski" at least it should generate something that looks like his art
>>
>>101967828
He was one of the loudest voices against AI art after everyone was using his tag in prompts and mogging his art. He's partly to blame for SAI cucking their models of artist styles
>>
>>101967753
i tried https://huggingface.co/Orenguteng/Llama-3.1-8B-Lexi-Uncensored-V2
with joy caption but it doesn't change too much. joy caption seems to be a bit baked in, it doesn't follow my prompt very much at all. i modified it to support batching & changing the prompt, temperature, top_k, top_p etc

https://huggingface.co/qresearch/llama-3-vision-alpha
this one is interesting, passing that same uncensored fine tune to it, it can get pretty lewd but will hallucinate details. Someone modified it with commands to clear context/history & change the image:
https://gist.github.com/CoderCowMoo/9c2e611a90ba13b547176618131114f1

there is the LLaVA stuff, lewd but misses a lot of details:
https://github.com/Z-L-D/Autocaption
this one has good batch features

I need to test this one next, probably more of the same since it uses LLaVA
https://github.com/DEVAIEXP/image-interrogator
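
For anyone wondering what the temperature, top_k and top_p knobs mentioned above actually do to the next-token distribution, here is a rough pure-Python sketch. It is not joy caption's code or any library's exact implementation (real implementations differ in edge cases); `filter_probs` is a hypothetical name and the logits are made up.

```python
import math


def filter_probs(logits, temperature=1.0, top_k=0, top_p=1.0):
    """Return {token_index: probability} after temperature, top-k and top-p filtering.

    Hypothetical helper for illustration. top_k=0 disables top-k,
    top_p=1.0 disables nucleus filtering.
    """
    # Temperature: divide logits before the softmax; lower values sharpen the distribution.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]  # subtract max for numerical stability
    total = sum(exps)
    probs = [e / total for e in exps]

    # Rank token indices from most to least likely.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    # Top-k: keep only the k most likely tokens.
    if top_k > 0:
        order = order[:top_k]

    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # Renormalize so the surviving probabilities sum to 1.
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


logits = [2.0, 1.0, 0.0]  # made-up scores for three tokens
print(filter_probs(logits, top_k=2))
print(filter_probs(logits, top_p=0.5))
print(filter_probs(logits, temperature=0.5))
```

Lower temperature and tighter top_k/top_p all push the captioner toward its most likely wording, which may be part of why output can feel baked in regardless of the prompt.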
>>
File: flux_00170_.png (1.69 MB, 896x1160)
>>
File: ComfyUI_32055_.png (1.49 MB, 1024x1024)
>>
>>101967777
You should generate an image by hand at least once, it's just matrix multiplication.
>>
File: ComfyUI_32057_.png (1.28 MB, 1024x1024)
>>
File: EzcUsB_UYAAGF3v.jpg (211 KB, 911x1200)
>>101967852
>He's partly to blame for SAI cucking their models of artist styles
no that's not his fault, he has the right to have weird takes, he didn't force anything, didn't make any laws, didn't point any gun towards Emad's face. It's SAI's fault they're scared of some whiny bitches on twitter, what a soft generation we're living in
>>
File: 00113-1890055828.jpg (115 KB, 1080x1280)
>>
File: ComfyUI_32058_.png (1.26 MB, 1024x1024)
>>
>>101967852
the guy who made pepe was mad 4chan was using it, and now it's our meme. feels good man
>>
File: 4488486484.png (284 KB, 329x407)
>>101967816
I don't think these kinds of models are done in good faith, but the majority of the SD community on civit are stupid, they will 100% be buying into this and slurping the model. After all Dreamshaper was the most downloaded model, tells you all you need to know. A few have voiced complaints now and then but they are quickly downvoted to oblivion plebbitor style. This is meant to be alpha v1, the latest versions will look like pic rel. I'm already seeing a couple that look like this. Does it remind you of anything?
>>
File: flux_00171_.png (1.67 MB, 896x1160)
this bwitch is about to test her newest spell
>>
File: ComfyUI_32059_.png (1.33 MB, 1024x1024)
>>
>>101967852
>He was one of the loudest voices against AI art
should we care or something? You can't copyright an art style, art is for everyone and no one owns anything in art
>>
>>101967310
These colors are really spot on, especially that background
>>
Please reply to my image posts with (you)s.
>>
Next thread
>>101967932
>>101967932
>>101967932
;)
>>
>>101967871
They didn't care. You think that it was because of whining activists that they decided to entirely cuck the model by avoiding training on women? It was simply because Stability has never been in favor of open source. They'd rather MJ and Dalle be years ahead, and they almost got away with setting open source back a few years if not for Flux. They purposely gave us crippled models because they don't believe in the open source spirit. Flux basically was a huge middle finger to them, which is why the Flux devs are based.
>>
>>101967930
Which ones were yours?
>>
>>101967993
Hopefully the Twitter/X deal has made it worth it/financially sustainable.
>>
>>101968008
The image files have a prefix of "delux"
>>
>>101967993
>Flux basically was a huge middle finger to them, which is why the Flux devs are based.
yeah, and what's even more based is that they were former employees of SAI, making the Flux move even more ironic. SAI had people inside that could make great things, but they decided to shoot themselves in the foot for retarded reasons




All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.