/g/ - Technology

File: tmp.jpg (1.11 MB, 3264x3264)
1.11 MB
1.11 MB JPG
Discussion of free and open source text-to-image models

Previous /ldg/ bread : >>101935309

>Beginner UI
EasyDiffusion: https://easydiffusion.github.io
Fooocus: https://github.com/lllyasviel/fooocus
Metastable: https://metastable.studio

>Advanced UI
Automatic1111: https://github.com/automatic1111/stable-diffusion-webui
ComfyUI: https://github.com/comfyanonymous/ComfyUI
Forge: https://github.com/lllyasviel/stable-diffusion-webui-forge
InvokeAI: https://github.com/invoke-ai/InvokeAI
SD.Next: https://github.com/vladmandic/automatic
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI

>Use a VAE if your images look washed out
https://rentry.org/sdvae

>Model Ranking
https://imgsys.org/rankings

>Models, LoRAs & training
https://civitai.com
https://huggingface.co
https://aitracker.art
https://github.com/Nerogar/OneTrainer
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts

>Flux
https://huggingface.co/spaces/black-forest-labs/FLUX.1-schnell
https://comfyanonymous.github.io/ComfyUI_examples/flux

>Pixart Sigma & Hunyuan DIT
https://huggingface.co/spaces/PixArt-alpha/PixArt-Sigma
https://huggingface.co/spaces/Tencent-Hunyuan/HunyuanDiT
https://huggingface.co/comfyanonymous/hunyuan_dit_comfyui
Nodes: https://github.com/city96/ComfyUI_ExtraModels

>Index of guides and other tools
https://rentry.org/sdg-link
https://rentry.org/rentrysd

>GPU performance
https://vladmandic.github.io/sd-extension-system-info/pages/benchmark.html
https://docs.getgrist.com/3mjouqRSdkBY/sdperformance

>Try online without registration
txt2img: https://www.mage.space
img2img: https://huggingface.co/spaces/huggingface/diffuse-the-rest
sd3: https://huggingface.co/spaces/stabilityai/stable-diffusion-3-medium

>Maintain thread quality
https://rentry.org/debo

>Related boards
>>>/h/hdg
>>>/e/edg
>>>/d/ddg
>>>/b/degen
>>>/vt/vtai
>>>/aco/sdg
>>>/trash/sdg
>>
>>101937500

Was about to mention this: I can't daisy-chain two LoRAs together with GGUF right now. Which sucks. Luckily I'm not a total vramlet.
>>
File: 111.jpg (3.2 MB, 1999x1999)
3.2 MB
3.2 MB JPG
>>
File: ComfyUI_00061_.png (1.25 MB, 1024x1024)
1.25 MB
1.25 MB PNG
>>
Im just like gonna call it slop haha
>>
L O R A S
O
R
A
S
>>
File: 1700479568546415.png (1.62 MB, 1024x1024)
1.62 MB
1.62 MB PNG
flux was worth it, solely for the Pepe lora. The rest is a bonus. And this is basically the first week of loras, most aren't even training yet.
>>
So I'm interested in captioning my datasets now for LoRA training to see if it can improve the outputs. What captioner are we all using right now?
>>
>>101937515
>can't use one lora without tonemap and therefore without cfg > 1
>with cfg = 1, can only load one lora and not more
yay...
>>
>>101937590
for SFW, GPT4V; for NSFW, that joycaption thing
>>
>>101937606
>joycaption
Were there weights available?
>>
>>101937590
Joy captioner
DL from huggingface and replace the model with another llama model to run it locally/uncensored
>>
>>101937628
>replace the model
I'm going to bed it's 3am
>>
>>101937620
yes, just clone https://huggingface.co/spaces/fancyfeast/joy-caption-pre-alpha
it will download the weights for you (the llama weights require you to go to the repo and request permission, but there are mirrors on Hugging Face, and finetunes work as well from what I've seen other anons do)
>>
File: 1720693886864337.png (1.57 MB, 1024x1024)
1.57 MB
1.57 MB PNG
>>101937574
slightly better: forgot the cartoon prompt

remember, if you can generate controversial stuff then the basic stuff is easy as fuck.
>>
>>101937495
From a technical point of view, how does Flux differ from SDXL?
As in, if someone was trying to adapt something that supports SDXL to also support Flux, what would the key points of difference be?
Two CLIP encoders (is the second one optional)? What other magic is there?
>>
>>101937640
lmao that one is good not gonna lie
>>
>>101937641
it's a diffusion transformer architecture instead of a UNet, does your "something" care about the architecture?
>>
File: 1711021540339872.png (1.09 MB, 1024x1024)
1.09 MB
1.09 MB PNG
Pepe is wearing a business suit and is at a political rally. He is standing at a podium. A big rectangular sign that says "PEPE" is behind him. Pepe is saying "feels flux, man" in a white speech bubble.
>>
Quite a few flux loras are being trained on AI images, WTF is wrong with those people... It just makes everything look more AI.

e.g. I saw a lora and I was like why does this look like a Pony image... well, they trained it on Pony images.
>>
>>101937673
Do one with a bullet grazing his head.
>>
>>101937638
I hope they add this to Taggui, I love using that program.
>>
>>101937690
those retards now find it normal to do some AI inbreeding, I fucking despise that shit
>>
File: 1702613342379333.png (1.01 MB, 1024x1024)
1.01 MB
1.01 MB PNG
>>101937691
>>
File: 1708804035369527.png (1021 KB, 1024x1024)
1021 KB
1021 KB PNG
>>101937732
slightly better:
>>
>>101937661
Ah right, that's probably going in the too-hard basket for me then sorry.
Thought maybe I could hack support into StableDiffusion.cpp.
>>
>>101937641
>From a technical point of view, how does Flux differ from SDXL?
better transformer architecture (DiT vs unet)
Way bigger (12b vs 3.5b)
>>
File: 1703888458747817.png (991 KB, 1024x1024)
991 KB
991 KB PNG
what a time to be alive, huh?
>>
Is there a way to drag an output that is already linked so that I can choose a new output instead of a new input? Haven't found a modifier that changes it from the default.
>>
>>101937732
>>101937748
fire the whole bullet
that's 33% more bullet
>>
>Hatsune miku, her speech bubble says: "I'm loving it!", 50's comic book style
Choose your best unslop combination (X = cfg | Y = GuidanceNegative), made possible with Tonemap
https://files.catbox.moe/a8qovd.png
>>
File: 1695429128484101.png (1.21 MB, 1024x1024)
1.21 MB
1.21 MB PNG
>>
>>101937764
>>101937748
>>101937732
KEK, nice.
>>
File: FD_00340_.png (1.64 MB, 1024x1024)
1.64 MB
1.64 MB PNG
>>
>>101937813
No, kill yourself
>>
It's over
>>
File: FD_00317_.png (1.62 MB, 1024x1024)
1.62 MB
1.62 MB PNG
>>
File: Flux_00857_.png (1.45 MB, 1344x768)
1.45 MB
1.45 MB PNG
>>
File: ComfyUI_00146_.png (571 KB, 512x768)
571 KB
571 KB PNG
Anyone know how to change the font size of the prompt text? I edited user.css with

element.style {
--comfy-textarea-font-size: 20px;
}

But it gets overwritten when opening ComfyUI
>>
>>101937893
It wouldn't be ideal, but could you overwrite it in inspect elements?
>>
File: 1702798476278549.png (982 KB, 1024x1024)
982 KB
982 KB PNG
>may I take your order?
>>
File: ComfyUI_00147_.png (585 KB, 512x768)
585 KB
585 KB PNG
>>101937900
I did but it goes away after I close the tab
>>
>>101937909
use Stylus
>>
File: 1703392562031671.png (960 KB, 1024x1024)
960 KB
960 KB PNG
>>101937905
>>
>>101937893
Isn't it an option in the settings? Click the gear and look for "Textarea widget font size"
>>
File: ComfyUI_00148_.png (495 KB, 512x768)
495 KB
495 KB PNG
>>101937934
Holy fuck it works, tried for like 2 hours to change it
>>
File: 1696969884608524.png (991 KB, 1024x1024)
991 KB
991 KB PNG
the colonel in the background is the most impressive part.
>>
>>101937944
I wonder if by virtue of most KFC marketing having the letters KFC in them, they came out better tagged in the dataset.
>>
File: ComfyUI_00149_.png (657 KB, 512x768)
657 KB
657 KB PNG
>>101937934
Thanks, now I need to find a font for it
>>
File: 1693495951317533.png (1.05 MB, 1024x1024)
1.05 MB
1.05 MB PNG
>>101937944
brands come out pretty good in addition to text too
>>
is there an example repo of someone setting up a flux LoRA with a dataset so I can just copy it without thinking too hard?
>>
File: ComfyUI_00150_.png (623 KB, 512x768)
623 KB
623 KB PNG
More steps in schnell add more detail
>>
>>101937515
>>101937591
I fixed it kek. Who needs sleep.
>>
>>101937996
looks like a Bratz doll
>>
>>101937905
>anime 2b
Can it do a normal in-game one?
>>
>>101938020
this lora is trained in that style but if you gave it an artstyle prompt I guess it would work, not sure still testing.
>>
why is my VRAM usage higher today and why does moving the VAE to the CPU give me OOM now
>>
>>101937997
It works!! No errors and no OOM anymore, what a fucking legend!!
https://www.youtube.com/watch?v=VtAzlUu4b3Y
>>
>>101937996
Truly groundbreaking, Anon.
>>
File: 1699619065772874.png (1.1 MB, 1024x1024)
1.1 MB
1.1 MB PNG
RAYGUN COULD NEVER
>even 2b is better than the aussie
>>
>>101938045
but can 2B hop like a kangaroo
>>
File: 1700288224393599.png (1.12 MB, 1024x1024)
1.12 MB
1.12 MB PNG
>>101938045
this time but with more soul and the logo
>>
File: file.png (10 KB, 252x331)
10 KB
10 KB PNG
>hit the new "queue" button
>comfy freezes and lags for a couple seconds
Guess it needs gpu acceleration on to not shit and piss itself
>>
>>101937997
No more errors. For those getting OOM, remove --highvram and remove the force MODELDevice thing too
>>
>>101938094
But gpu accel directly affects your generation speed, why would anyone enable it
>>
File: 1723888726514.jpg (224 KB, 1200x933)
224 KB
224 KB JPG
gn everyone
>>
File: 3922952966.png (1.04 MB, 1152x896)
1.04 MB
1.04 MB PNG
>>
>>101937638
so basically we need a bit of a wrapper to come up with a dataset to avoid duplicating work
>>
File: 1716576859884215.png (1.02 MB, 1024x1024)
1.02 MB
1.02 MB PNG
>>
File: 1934828313.png (1.16 MB, 1152x896)
1.16 MB
1.16 MB PNG
>>101938045
nice
>>
>>101938181
we can use AI to make the routine a 10/10
>>
I do Schnell 1024 res gens in about 18 s on my 3060
dunno if that's slow or not
RAWRRR
>>
File: ComfyUI_01715_.png (664 KB, 1024x1024)
664 KB
664 KB PNG
>>101937813
>>
File: ComfyUI_00156_.png (566 KB, 512x768)
566 KB
566 KB PNG
>>101938191
That's normal for now
>>
File: 1723001676026406.png (908 KB, 1024x1024)
908 KB
908 KB PNG
https://civitai.com/models/644109

a new era of pepes begins

>>101938211
very nice!
>>
File: 1700785213834723.png (867 KB, 1024x1024)
867 KB
867 KB PNG
pepe is also a fan of miku gens in flux
>>
File: 1723706531678745.jpg (547 KB, 3024x1714)
547 KB
547 KB JPG
What scaler or node can do this?
>>
File: ComfyUI_Flux_24.png (1.54 MB, 1216x832)
1.54 MB
1.54 MB PNG
>>101937997
Works perfectly with various loras/cfg adjustments and whatnot, thanks. Custom loaders work just fine too so no need to stick to the regular lora loader.
>>
File: 1708169452067202.png (1.14 MB, 1024x1024)
1.14 MB
1.14 MB PNG
>>
So GUUG Q8 works better with loras than Nf4 right.
>>
File: GU2o97HWMAAPWES.jpg (3.82 MB, 2438x1950)
3.82 MB
3.82 MB JPG
Flux 1 dev with Realism Lora + Magnific + Luminar + Lightroom
>>
>>101938282
it gave miku pepe eyes kek
>>
>>101938290
GGUF*
>>
>>101938291
a lot of work for a noisy analog
>>
>>101938290
it should, Q8 is way better than nf4 on its own
>>
>>101938308
Sounds good, downloading now baby
>>
why the fuck is ComfyUI using more VRAM today, it's starting to hit the shared VRAM when it has been working fine for a week
even changed the DE to the iGPU, the VRAM use on the GPU is literally 0
>>
Can you inpaint faces in Forge yet for Flux?
>>
>>101938315
Did you update it? I noticed the last update did that
>>
File: 1693782740265607.png (938 KB, 1024x1024)
938 KB
938 KB PNG
what a time to be alive
>no there will be no nudes
>no loras are impossible
>no you need an a100 to run it
>no it has safety controls on it
>>
File: ComfyUI_00944_.png (1.57 MB, 1344x768)
1.57 MB
1.57 MB PNG
>>101938315
Are you using quants? Update the node and comfyui. idk which fixed the issue
>>
>>101938262
eh this isn't magic. just looked at their images. upscale by model (pick a decent ESRGAN model) > downscale to a reasonable size > resample with a low-ish denoise at around 16 steps > upscale again to final size - with flux you actually get better results lol. many ways to skin that cat bro
>>
>>101938331
I wasn't but tried it just now to see how well it works
using Q8 the VRAM use is lower so no swapping but it is two times slower than FP8, from 2.4s/it to 4.86s/it
is that normal?
>>
File: Comparison_all_quants.jpg (3.84 MB, 7961x2897)
3.84 MB
3.84 MB JPG
>>101938348
>using Q8 the VRAM use is lower so no swapping but it is two times slower than FP8, from 2.4s/it to 4.86s/it
>is that normal?
usually when you get something 2 times slower that's because you went from CFG = 1 to CFG > 1, the speeds of Q8 and fp8 are supposed to be close
>>
>>101938362
I don't use CFG at all so it's not that, I just changed from the normal loader to the GGUF loader
>>
>>101938371
maybe it's overflowing your VRAM capacity and that's why it's slow, how much VRAM do you have?
>>
>>101938390
I have 16GB
>>101938348
>using Q8 the VRAM use is lower so no swapping
>>
File: ComfyUI_Flux_28.png (1.24 MB, 1216x832)
1.24 MB
1.24 MB PNG
>>101938020
To answer my own question - kinda works but it definitely wasn't trained on the game renders. Looks more like a gta san andreas mod kek. I just put "a 3d screenshot" in the positive and "anime, 2d" in the negative
>>
>>101938362
is the gen process pretty much like this:
comfy sampler -> gguf dequant -> flux model script using NN Linear?
thanks for the great work
>>
File: Capture.jpg (96 KB, 1444x727)
96 KB
96 KB JPG
>>101938404
>I have 16GB
can you see if it overflows your VRAM capacity in task manager? Picrel is the Q8 VRAM usage during inference on my 3090
>>
>>101938430
I literally said it doesn't in the comment...
>>
File: 00006-2721554246.png (1.61 MB, 896x1152)
1.61 MB
1.61 MB PNG
Ignore the audience
>>
File: ComfyUI_Flux_30.png (1.38 MB, 1216x832)
1.38 MB
1.38 MB PNG
>>101938406
And this is with "a real photo"
>>
>>101938439
welp, sucks to be you I guess
>>
File: ComfyUI_00162_.png (615 KB, 512x768)
615 KB
615 KB PNG
>>101938334
Some dev ones look similar but take forever to generate compared to schnell
>>
>>101938457
you can prevent that by adding "anime" in the negative prompt
>>
>>101937996
Take the 10 minute per gen pill. Go dev. You will learn to discern if it's worth it by looking at the preview of the first few iterations, and you will put a lot more work into your gens.
>>
File: 1695506901949471.png (1.08 MB, 1024x1024)
1.08 MB
1.08 MB PNG
isn't technology cool? I'm a comp sci guy, not an art student.
>>
>>101938457
Not bad.
>>
>>101938470
Flux is worth it just for the coherence of backgrounds
>>
File: 1703246731674638.png (1.1 MB, 1024x1024)
1.1 MB
1.1 MB PNG
>>101938478
flux has a way of making photos/depth of field type prompts feel authentic, i'm not sure exactly how to describe it, but it works. It uses noise in a better way than SDXL, I guess?
>>
Why doesn't image2image work with flux in general? Or is it just with schnell?
Image gets really overcooked or just broken
>>
that explains it https://github.com/city96/ComfyUI-GGUF/issues/33#issuecomment-2294821171
I read that Forge also slows down with loras, is that right?
>>
File: ComfyUI_Flux_31.png (1.4 MB, 1216x832)
1.4 MB
1.4 MB PNG
>>101938468
Yeah, that's the first thing I did.
>>101938510
Everything always slows down with loras, even 1.5 and sdxl.
>>101938497
Dev needs a pretty high denoise value for significant changes, I'm talking 0.8+. No idea about schnell.
>>
>>101938497
I2I should work with every model
>>
>>101938524
>Everything always slows down with loras, even 1.5 and sdxl.
never by doubling gen times as it is happening now
>>
File: 3812288914.jpg (2.55 MB, 1792x2304)
2.55 MB
2.55 MB JPG
>>
File: 00009-1045291052.png (1.31 MB, 896x1152)
1.31 MB
1.31 MB PNG
>>101938348
Something similar for me, not quite double but still a lot more: went from 3.9s/it to 5.5s/it, from 1min 20 seconds to 1min 50 seconds.

Gonna restart PC to see if it helps
>>
File: file.png (617 KB, 800x600)
617 KB
617 KB PNG
>>101938191
That's normal. My schnells take 1-3 minutes on my AMD 5700.
>>101938495
What the fuck is going on in that back room?
>>
>>101938460
I scale them to 1.5x (of the original) size after the "upscale by model" node before resampling it, it's reasonably fast. Then, as a final step, in a separate workflow, I just upscale with a clean ESRGAN model again and downscale to the final size. Tried various DAT upscalers, waste of time in most cases. really hungry mofos. BRO
>>
File: ComfyUI_00329_.png (1.78 MB, 1024x1024)
1.78 MB
1.78 MB PNG
what's a good workflow for Lora stuff?
>>
>>101938536
>never by doubling gen times as it is happening now
Theoretically it should double the gen time because what CFG does is make two predictions per step (one conditional, one unconditional) instead of just one, so twice the work
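Roughly what that looks like per sampling step (a minimal sketch, not ComfyUI's actual sampler code; model, cond and uncond are placeholders):

def cfg_denoise(model, x, t, cond, uncond, cfg_scale):
    # classifier-free guidance: one conditional and one unconditional
    # prediction per step, then extrapolate between the two
    eps_cond = model(x, t, cond)
    if cfg_scale == 1.0:
        return eps_cond               # single model call, no extra cost
    eps_uncond = model(x, t, uncond)  # second model call, hence ~2x the work
    return eps_uncond + cfg_scale * (eps_cond - eps_uncond)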
>>
File: 1708198229142296.png (1.98 MB, 1024x1024)
1.98 MB
1.98 MB PNG
>>101938495
and for something a little different, art nouveau lora but with miku:
>>
>>101938549
motherfucker I'm talking about loras what the fuck are you on about
>>
I got a great deal on a 4070 Ti Super 16GB model, anyone using it for image gen?
>>
>>101938542
I see a booger up her nose. Realistic.
>>
>>101938555
>Seething this hard
Take some pills retard, you need to control your anger issues
>>
>>101938495
requesting catbox pls
>>
>>101938574
you read "doubling gen times" and your monkey brain thinks "CFG" disregarding the context of the conversation because you make logical connections at the level of GPT-2
>>
File: 00011-1045291054.png (1.18 MB, 896x1152)
1.18 MB
1.18 MB PNG
>>101938566
it keeps happening on this prompt.
>>
File: Flux_00915_.png (1.04 MB, 1344x768)
1.04 MB
1.04 MB PNG
>>101938556
Using one right now for Flux. With ComfyUI's low VRAM mode you can run the full FP16 model.
>>
>>101938588
Like I said, take some pills, if you're gonna scream like a monkey every time someone misses something in a conversation you'll have some serious health issues
>>
>>101938582
https://files.catbox.moe/tgka91.png
>>
>>101938548
Just put a lora node in.
>>
>>101938603
So you admit you're retarded.
>>
>>101938556
BRO EVEN A 3090 IS OUTDATED NOW (t. 3090 user). seriously don't waste money on anything below a 4090 now
>>
>>101938588
That dude spends his entire day posting his CFG hacks here and on reddit, any chance he gets he'll shill his shit. His images always look deepfried and uber noisy because of it too.
>>
>>101938620
So you admit that you have the same self-control as an unhinged gorilla?
>>
File: ComfyUI_Flux_33.png (1.46 MB, 1216x832)
1.46 MB
1.46 MB PNG
>>101938536
About 25% speed decrease or so for me. Running Q4_0 on RTX 2080
>>
File: 1696294159936767.png (1007 KB, 1024x1024)
1007 KB
1007 KB PNG
I tried the pepe lora with the nf4 model. I got kermit instead. lmao
>>
File: FluxDev_01555_.jpg (186 KB, 768x1312)
186 KB
186 KB JPG
Is it me or are her legs looking a little thick?
>>
>>101938636
Damn, nice
>>
>>101938556
>>101938623
Everything is "outdated" when it comes to this. I'm going to get a 24GB used GPU myself as soon as I find a good deal. Otherwise you will be forever waiting for the next thing and regretting spending money because this scene moves so fast. Just make gradual upgrades and don't get too hung up on it.
>>
>>101938595
Hey that's cool. How are the gen times?

>>101938623
I got a deal for 400€ vs 4090 almost 2k€ where I live
>>
File: 175052_00001_.png (1.12 MB, 1024x1024)
1.12 MB
1.12 MB PNG
>>101938556
I have a 4060ti 16GB and 64gb ddr5 and it does the default fluxd 1024x1024 in 49s, with comfyui on linux so you should be in for some good times.
>>101938611
Bless you anon, doesn't work for me, using comfyui, but bless you anyway.
>>
>>101938623
That attachment on the tip of the gun is for sticking it somewhere.
>>
File: grid-0178.jpg (857 KB, 2304x1792)
857 KB
857 KB JPG
I have no idea how I got outputs this varied from:
"Alex Garan artstyle, Page from American college yearbook, , 1980s in color"
>>
>>101938684
means it has fuck all idea what "Alex Garan artstyle" is
>>
>>101938669
Go hard, brother. I'd go for more memory if I could, but that's a hell of a deal.
>>
>>101938670
it's not complicated, it's just the emma lora then after the instance prompt "in a coffee shop, she is wearing a white tshirt that says..."
>>
File: grid-0177.jpg (904 KB, 2304x1792)
904 KB
904 KB JPG
Same prompt.
>>
>>101938669
For that pic, which was 25 steps with Euler, it was around 45 seconds.
>>
File: ifx66w.jpg (580 KB, 1600x1600)
580 KB
580 KB JPG
>>101938682
fact checker
>>
so in comfy, how do you connect the gguf -> lora -> nodes to text input? the output after lora is model, what does it link into?

what's the basic workflow for gguf/lora?
>>
>>101938669
>I got a deal for 400€
fucking how, my 4060 Ti 16GB was 450
the 4070 ti super was ~850 at a minimum last I checked
>>
Say what you will about the comfy UI design but I am absolutely loving these node timers.
>>
>>101938720
The lora alters the model. It just makes it react to new embeddings. You don't need to connect the lora to the text/embeddings input. Think of it as a filter for the model so that it understands your toddler foot fetish.
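If it helps to see it in numbers, a lora is just a low-rank delta added on top of the existing weights, which is why patching the model is all you need (a toy numpy sketch, not the actual loader code):

import numpy as np

d_out, d_in, rank = 64, 64, 8        # toy sizes, real Flux layers are much bigger
W = np.random.randn(d_out, d_in)     # frozen base weight from the checkpoint
A = np.random.randn(rank, d_in)      # lora "down" matrix
B = np.random.randn(d_out, rank)     # lora "up" matrix
alpha, strength = 8.0, 1.0

# patching = adding the low-rank product onto the base weight
W_patched = W + strength * (alpha / rank) * (B @ A)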
>>
>>101938720
*cause the initial workflow has clip and vae I believe but the q4/q8 loader doesn't
>>
>>101938720
depending on the LoRA loader you use, there should be a yellow clip thing you can pull out into the text input.
Depending on the LoRA though, it's probably not necessary.
>>
File: FD_00404_.png (1.37 MB, 1024x1024)
1.37 MB
1.37 MB PNG
>>101938495
If you can't beat them, join them
>>
File: 00008-1120970443.png (1.66 MB, 896x1152)
1.66 MB
1.66 MB PNG
https://civitai.com/models/523485?modelVersionId=732778
Finally, flux is perfect now
>>
>>101938735
>>101938746
thanks
>>
>>101938720
That said, there are several lora loader nodes. Some let you plug the image model and the text encoder (clip). I dunno.
>>101938755
My fetish.
>>
https://civitai.com/models/653149/javelin-82
SHUT IT DOWN
>>
you know not even a few months ago I would have said open source DALL-E 3 was pretty far away, and now flux is basically uncensored open source dalle. pony/sdxl was already good, this is a step above that.

openAI/Microsoft might be weight hoarding assholes but open source always wins.
>>
File: aseet.jpg (20 KB, 542x375)
20 KB
20 KB JPG
>>101938614
Sir pls I need the workflow
>>
>>101938755
>dataset consists of 28 screencaps (1920×1024)
I hate these people so fucking much. Same as the ones finetuning text models with arbitrary prompt formats.
USE THE SAME FORMAT USED ON THE ORIGINAL MODEL FOR FUCK'S SAKE
>>
>>101938665
you are right, yes. that's why I got this 3090 last year. I don't regret it. still, feels too weak sauce for flux.
>>101938669
my advice: spend a bit more and get a 3090. 500? 600? you can do it.
>>101938725
it is going forward.
>>101938766
kiki best girl
>>
>>101938779
Lmaoo, how the fuck is civitai still alive after hosting so many controversial models
>>
>>101938786
Flux was trained on a range of megapixels and ratios tho
what you should be saying is USE A VARIED SIZE AND RATIO DATA SET FOR FUCK'S SAKE
>>
>>101938725
I like that too. I also like seeing the green outlines so you can visually see the program at work, it's like seeing data go from your ssd to your cpu and memory or whatever.
>>
>>101938779
What's the trigger phrase for this?
>>
>>101938786
>USE THE SAME FORMAT USED ON THE ORIGINAL MODEL FOR FUCK'S SAKE
That would ruin the whole thing, the ghibli screenshots aren't 1:1, flux must know how it really look like
>>
>>101938725
how do you get that anon? that looks interesting

>>101938803
>What's the trigger phrase for this?
kike
>>
>>101938812
Just crop the images judiciously. It's not about genning "a screenshot of a ghibli movie", but genning "kiki getting all her holes filled by a gang of magical brooms".
>>
>>101938801
It always had the outlines though
>>
>>101938812
but it knows cinematic wide shots already
besides the full shots you could also train on crops of the characters (keeping them withing the size/ratio Flux knows of course)
would bulk up the data set and avoid the chance of it degrading with gens that aren't wide
>>
>>101938843
>Just crop the images judiciously.
that's fucking retarded, flux is good at a lot of resolutions, so why not train it further with the real deal instead of forcing it with some 1:1 nonsense
>>
>>101938843
BRO.
>>
>put in the Load Lora node into the workflow
>press queue prompt
>nothing happens

I don't get it
>>
File: FD_00414_.png (1.09 MB, 1024x1024)
1.09 MB
1.09 MB PNG
>>101938785
https://files.catbox.moe/qjmvkr.png
>>101938825
Update comfy and enable it in the settings
>>
>>101937495
How's the Flux.1 q4_0 compare to q4_1?
>>
File: 00015-954822782.png (1.64 MB, 832x1216)
1.64 MB
1.64 MB PNG
>>
File: 121147_00001_.png (1.11 MB, 1024x1024)
1.11 MB
1.11 MB PNG
>>101938698
ah i didn't see it, got it now (was being a filter-tard) ty
>>
>>101938843
Most training programs auto-resize and crop pictures anyway to fit the SDXL resolutions, no need to do it manually
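Roughly what those trainers do per image (a sketch with Pillow; the bucket list here is illustrative, not any specific trainer's):

from PIL import Image

# illustrative ~1MP buckets like the SDXL training resolutions
BUCKETS = [(1024, 1024), (896, 1152), (1152, 896), (832, 1216), (1216, 832)]

def bucket_resize(path):
    img = Image.open(path).convert("RGB")
    ar = img.width / img.height
    # pick the bucket whose aspect ratio is closest to the image's
    w, h = min(BUCKETS, key=lambda b: abs(b[0] / b[1] - ar))
    # resize so the image covers the bucket, then center-crop the overhang
    scale = max(w / img.width, h / img.height)
    img = img.resize((round(img.width * scale), round(img.height * scale)),
                     Image.Resampling.LANCZOS)
    left, top = (img.width - w) // 2, (img.height - h) // 2
    return img.crop((left, top, left + w, top + h))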
>>
>>101938859
Q4_0 is always inferior to Q4_1. In the LLM space, Q4_1 was created as an improvement over Q4_0 by storing a bit more high-precision information per block of weights, so it's a bit bigger but better too
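The gist of the two block formats (a simplified numpy sketch of llama.cpp-style dequant; the real code packs the 4-bit values into nibbles and stores fp16 scales):

import numpy as np

q = np.random.randint(0, 16, size=32)     # one block of 32 4-bit quants

# Q4_0: one scale per block, values are centred around zero
d = np.float32(0.1)
w_q4_0 = d * (q.astype(np.float32) - 8)

# Q4_1: a scale *and* a minimum per block, so the 16 levels can sit anywhere
# in the block's actual range -> slightly bigger, slightly more accurate
d1, m = np.float32(0.05), np.float32(-0.3)
w_q4_1 = d1 * q.astype(np.float32) + m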
>>
>>101938856
You need to say the magic words.
>>101938851
If you made the LoRA, know that there's a reason you had to inpaint that fucking third leg out of that image. If Flux is giving you anatomical errors in 10 images you made as samples for your shitty work, there's something you did wrong and you know it.
>>
>simple guidance
>adaptive guidance
>dynamic threshold
>guidance threshold
>flux sampler
>flux shift
>two CLIP nodes for positive/negative with individual guidance settings
>skimmed cfg
>tonemap
>CFG Guider
>AutoCFG
>PrepNegGuider

I am lost.
>>
>>101938885
Worse yet, you are going to get randomly cropped faces and bodies that way.
Just put some effort into your work if you're going to publish for other people ffs. That's all I'm saying.
>>
>>101938889
Don't bother with that one anon's setup, it's hacky and breaks whenever a new update comes out. Just use the basic workflow for now until better supported CFG options come out
>>
File: laura-h4.jpg (603 KB, 1248x1824)
603 KB
603 KB JPG
>>101938889
not enough nodes sir
>>
>>101938889
When did you start learning imagegen, anon? You should start easy and then learn each concept slowly but surely
>>
>>101938889
>don't need it
>don't need it
>don't need it
>don't need it
>don't need it
>doesn't make a huge difference
>don't need it
>don't need it
>don't need it
>don't need it
>don't need it
>don't need it
>>
File: FD_00410_.png (1.21 MB, 1024x1024)
1.21 MB
1.21 MB PNG
This is bad, right?
>>
>>101938919
you don't need dynamic threshold? seriously? that shit is great at unslopping anime images
>>
>>101938921
you're treading on dangerous ground
>>
>>101938901
Considering everyone already uses those tools and things turn out perfectly fine, I'm gonna have to go with you're just sperging out for no good reason. You can mess around with the tools used for bucketing/downscaling/cropping, you'd be hard pressed to throw something at it that will truly get fucked by the process.
>>
>>101938933
Not intentional, but I can think of the implications of this from a LoRA.
>>
>>101938919
Why are you saying you don't need Adaptive Guidance? You improve the speed of your gen with that, smells like some serious skill issues if you ask me.
>>
File: file.png (544 KB, 512x512)
544 KB
544 KB PNG
>>
>>101938921
HALF GIRL HALF PLASTIC BRO. & she already hit the wall there
>>
>>101938955
>You improve the speed of your gen with that
only if you're using CFG, you don't need it
>>
File: longew.jpg (1.29 MB, 1600x1200)
1.29 MB
1.29 MB JPG
>>
>>101938945
>things turn out perfectly fine
You have real difficulty finding badly made LoRA's on Civitai lmao
>>
>>101938971
nta, but CFG is good to remove the blur of photos flux is producing, I hate that blurry shit, why does flux have such a bias towards that
>>
File: wut this means.jpg (62 KB, 1358x1072)
62 KB
62 KB JPG
>>101938857
wat do?
>>
>>101938932
It's good for anime, but for anything else it adds even more slop. The guy who shills it uses the same "woman on a street" picture to test his settings because for 98% of anything you'd want to prompt photo-wise it's trash
>>
>>101938983
delete the nf4 loader, I have it there for no reason
>>
>>101938983
It's over.
>>
>>101938921
did you accidentally put megamind in the prompt?
>>
>>101938990
>It's good for anime, but for anything else it adds even more slop.
So it's useful and you need it after all, unless you're pretending people aren't into anime or something?
>>
>>101938979
Civit slop will be Civit slop no matter what tools you give them unfortunately kek
>>
File: ComfyUI_01725_.png (689 KB, 1344x832)
689 KB
689 KB PNG
>>
File: fp125w.jpg (301 KB, 1600x1200)
301 KB
301 KB JPG
>>
File: 3433076726.png (1.31 MB, 1344x768)
1.31 MB
1.31 MB PNG
>>
>>101939003
>It's good for anime
>If you use it for anything else it's trash
>So you need it then?
Not great at reading between the lines there, kid, are ya?
>>
>>101938919
>>don't need it
>>101939032
>>It's good for anime
choose one anon
>>
File: ComfyUI_Flux_35.png (1.53 MB, 1216x832)
1.53 MB
1.53 MB PNG
>>101938755
What the fuck are these birds
>>
>>101939038
>Don't prompt for anime
>I don't need it
No :D
>>
>>101939042
That's something you didn't specify in your first post, weird huh? :^)
>>
>>101939038
>me
>not me
you people really can't tell when you're talking to different people?
>>
>>101939040
cummy birds
>>
>>101938918
When CompVis released 1.2.
Still way too many new nodes without very much documentation.
>>
File: fp128w.jpg (912 KB, 1600x1200)
912 KB
912 KB JPG
>>101939040
birds aren't real
>>
>>101939059
Considering the guy only seems to talk when he can shove his workflow into a convo it doesn't surprise me that he doesn't know who he's talking to
>>
>>101939040
That's the cum of everyone who ever came to kiki's doujinshi, impregnating the imaginal space from which these images emerge.
>>
>>101938706
Ty

>>101938724
I ordered it using a friend's business that gets good offers every now and then

>>101938791
3090 way too expensive where I live
>>
>>101939059
Bullshit I can see your name right there "Anonymous". You've spammed this thread.
>>
File: 1757986426.png (1.1 MB, 1344x768)
1.1 MB
1.1 MB PNG
>>
File: file.png (2.29 MB, 1024x1024)
2.29 MB
2.29 MB PNG
>write running behind a bus
>the bus is always behind him
I thought this thing understood natural language. This is basically the same issue SD had.
>>
>>101939065
>Still way too many new nodes without very much documentation.
The only documentation we have so far are the tutorials on reddit, we definitely need a rentry or something
>>
>>101939114
clearly he's running behind the bus that is behind the camera
try describing the back of the bus
>>
>>101939114
This being a flux gen, you can clearly see it's trained to produce the kind of images that wow normies and journos. 1girl instagram simulacra. The moment you try to describe something that isn't that, it starts introducing weird shit like a shit model. Look at that foot. And what's going on on the left? It also hasn't decided if the camera is inside or outside a bus.
Flux is overtrained on specific types of images to make it look good on social media.

Prove me wrong (seriously, post non-typical situations)
>>
>>101939114
try to go for some boomer prompts, chatgpt or claude can help you make your prompt more descriptive
>>
>>101939114
what about running after a bus
>>
>>101939159
Damn, your life must be harsh right now: you're forcing yourself to be on /ldg/ and watch people only talk about flux when you could've just gone back to /sdg/ and enjoyed talking about your favorite SD models. Why are you doing that to yourself, anon? genuine question
>>
>>101939114
https://www.youtube.com/watch?v=MjbUnn32_zU
>>
>>101939159
>when you try to describe situations it doesn't understand, it doesn't understand them
Woah
>>
>>101939040
the humans on the ground look horrific as well, must be a terrible lora to make flux shit itself like that
>>
>>101939159
That's why I upscale and inpaint with SDXL. Flux is vanilla
>>
File: file.png (1.08 MB, 800x600)
1.08 MB
1.08 MB PNG
>>101939169
I know, I'm just trying out things. That prompt was "a jewish man running behind a bus".
This one is "a bus behind a jewish man running". Just interesting to see how this thing thinks.
>>101939191
Probably more idiomatic.
>>
File: file.png (1.03 MB, 800x600)
1.03 MB
1.03 MB PNG
>>101939191
>a jewish man running after a bus
This prompt breaks flux apparently.
>>
>>101939040
The more LoRAs you mix the fuckier things seem to get
>>
>>101939260
that's why we need a finetune that adds more concepts to flux, doing some LoraMAXXing has some serious limits, what if you want to go for 2b + Pepe in a ghibli style, that's 3 Loras flux has to handle, it can't work
>>
File: 2728765841.png (1.17 MB, 1344x768)
1.17 MB
1.17 MB PNG
>>
File: 00019-3077351720.png (1.42 MB, 832x1216)
1.42 MB
1.42 MB PNG
>>
>>101939260
Lycoris doesn't work with flux?
>>
>>101939292
Asking the wrong guy here. I think most people are still kind of stumbling over the finer points of LoRAs for flux.
>>
File: file.png (685 KB, 640x480)
685 KB
685 KB PNG
Here our tormented Jewish man is chasing the bus in the opposite direction
>>
>>101939321
that's a muslim
>>
>>101939318
IA3 should also be tried.
>>
simplest way to train a Flux lora on a 4090? want to try and improve its accuracy in generating a specific outfit
>>
File: file.png (716 KB, 640x480)
716 KB
716 KB PNG
>>101939328
They're all semites, what difference does it make?
In this one I used "chasing a bus", and he didn't even bother running.
We need a LoRA for this it looks like.
>>
>>101939345
https://github.com/ostris/ai-toolkit

Works on windows.
Just follow the instructions and you'll be cooking your LoRA within 20 minutes.
>>
File: 2024-08-17_00024_.png (1.87 MB, 1280x1024)
1.87 MB
1.87 MB PNG
>>
>>101939363
>Just follow the instructions and you'll be cooking your LoRA within 20 minutes.
So that's why there's a spam of Lora on civitai recently...
>>
>>101939350
the model is poorly tagged, embeddings would save the day here, yet that entire tech is being ignored atm.
>>
>>101939373
They are all very good quality because of the reasons this anon stated >>101938945
You can train LoRA's using any resolution, any number of images, and have them automatically cropped by a script. And it will come out amazing and ready for Early Access monetization!
>>
>>101939363
>https://github.com/ostris/ai-toolkit
Every time I have tried this I get an error. Gonna delete it all and start again, see if it still fails.
>>
>>101939373
>So that's why there's a spam of Lora on civitai recently.
No. Even that's too difficult for the average smooth brained jeet on civit. Civit recently introduced on-site LoRA training in exchange for "buzz" (whatever the fuck that is)
>>
>>101939394
>You can train LoRA's using any resolution, any number of images, and have them automatically cropped by a script.
like it will adapt to each resolution each image has? that's an excellent feature if you ask me
>>
>>101939400
>Civit recently introduced in site LoRA training in exchange for "buzz" (whatever the fuck that is)
Imagine giving your data + captions to civitai for free, I don't want to sound rude but you have to be pretty retarded to do something like that
>>
File: FD_00430_.png (1.34 MB, 1024x1024)
1.34 MB
1.34 MB PNG
This is the best attempt I could muster. Gonna try Pro, see if it makes a difference
>>
>>101939426
>for free
Actually you have to pay for this too.
>>
>>101939428
>>
>>101939426
Really depends on the data and your hardware? Me? Never in my life would I fork anything over to civit. But the random jeet might want to upload his face five times to make another linkedin scam and that doesn't really bother me.
>>
>>101939441
I wouldn't be surprised if one day Civitai makes their own giant finetune of flux out of people's data and makes their own API from that
>>
File: imagefxb.png (1.32 MB, 1024x1024)
1.32 MB
1.32 MB PNG
>>101939428
ImageFX is better trained in many ways
>>
>>101939483
Can it make a wet woman?
>>
Not getting spellcheck in the prompt box in ComfyUI for some reason now
I'm ESL, I need it.
>>
>>101939363
If I wanted to train my own LoRA from nothing but a bunch of images, what's the best way to batch caption them? JoyCaption? (does it even run locally?)
They all have the same [item] in them if that matters, not even sure JoyCaption would recognise it, and I want to ensure it captures every little detail.
Never done this before so any advice would be appreciated.
>>
File: imagefxc.png (1.39 MB, 1024x1024)
1.39 MB
1.39 MB PNG
>>101939523
yes but you have to tard wrangle
>>
>>101939564
>does it even run locally?
yes
>what's the best way
by hand
>They all have the same [item] in them if that matters
You could state that in the prompt
>>
>>101939564
>If I wanted to train my own LoRA from nothing but a bunch of images, what's the best way to batch caption them? JoyCaption? (does it even run locally?)
JoyCaption does run locally. I still think GPT4V is the best captioner model, I would use GPT4V for SFW and JoyCaption for NSFW desu
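For batching, the usual layout trainers expect is one .txt sidecar per image; a minimal loop looks something like this (caption_image() is a stand-in for whatever you actually call, JoyCaption locally or the GPT4V API; it's not a real function):

from pathlib import Path

def caption_image(path: Path) -> str:
    # placeholder: plug in JoyCaption / GPT4V / your captioner of choice here
    raise NotImplementedError

dataset = Path("dataset")
for img in sorted(dataset.iterdir()):
    if img.suffix.lower() not in {".png", ".jpg", ".jpeg", ".webp"}:
        continue
    txt = img.with_suffix(".txt")         # trainers expect image.png + image.txt
    if txt.exists():
        continue                          # resume-friendly: skip already-captioned files
    txt.write_text(caption_image(img).strip(), encoding="utf-8")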
>>
>>101939564
Well, if it's niche enough that JoyCaption won't even recognize it, then you can run it through it and then edit the outputs manually to fix any mistakes, or do it raw.
It's also worth considering if you even need captions at all. Are you training a brand new concept or just trying to enhance something that's already in the model or force a particular style? If that's the case, captioning might be completely optional.
>>
File: FluxDev_01573_.jpg (184 KB, 832x1216)
184 KB
184 KB JPG
>>101939428
on a street
looking at the tail pipe on the back of a bus
looking at the back of a man with grey hair sprinting after the bus
>>
File: FLUX_00040_.png (1012 KB, 896x1152)
1012 KB
1012 KB PNG
10th try and I'm still not happy
it wasn't that complicated of a prompt
>>
>>101939604
LOOKS LIKE A JEWISH LADYBOY BRO.
>>
>>101939604
>it wasn't that complicated of a prompt
Stick it in GPT and ask it to rewrite it to be more verbose?
>>
>>101939397
Still failing. The file is there so I don't really know what the issue is.
>>
File: FLUX_00041_.png (1.01 MB, 896x1152)
1.01 MB
1.01 MB PNG
luck of the draw I guess
>>
File: Untitled.png (11 KB, 642x238)
11 KB
11 KB PNG
>>101939658
I do.
Where is your config file in the config folder?
>>
>>101939670
catbox? want to slap a realism lora on top for better skin
>>
>>101939670
it doesn't look like Emma at all
>>
File: ComfyUI_00955_.png (1.35 MB, 1024x1024)
1.35 MB
1.35 MB PNG
>>101939428
this seems harder than i thought. need to ask claude
>>
>>101939675
In that folder.
>>
File: ComfyUI_00194_.png (578 KB, 512x768)
578 KB
578 KB PNG
>>101939670
>>
>>101939682
>>101939601
>>
>>101939681
Emma? isn't that Taylor Swift?
>>
>>101939687
Can I see? Something isn't adding up.
>>
File: 1704810071215236.png (1.1 MB, 1024x1024)
1.1 MB
1.1 MB PNG
>>
>>101939713
Oh fuck you're right, kek
>>
I wonder why LyCORIS never managed to replace LoRA, it's in theory a superior method
>>
FLUX doesn't work on the A1111 webui, but will it ever? Is that even possible?

I like A1111...
>>
>>101939721
I see the previous Emma image now so I can understand the confusion.
>>
>>101939714
>>
File: ComfyUI_00196_.png (548 KB, 512x768)
548 KB
548 KB PNG
>>101939697
Using dev + schnell combined for 4 - 8 steps, seems better

https://huggingface.co/drbaph/FLUX.1-schnell-dev-merged-fp8-4step
>>
>>101939740
why don't you jump onto the Forge ship, it's A1111 but faster and with all the recent updates
>>
>>101939740
just use Forge
>>
>>101939574
>yes
How exactly? Last I checked it was just a HF Spaces frontend to a model, not sure what tool I'd need to run it locally.
>>101939578
>GPT4V
Eh, that's paid so sadly I'll have to stick to JC. Hoping it'll suffice.
>>101939589
Just tested and JC recognises it, fortunately.
I'm actually just trying to enhance the details of a certain historical character that the model is already capable of generating, but not 100% accurately. For example, an emblem on the character's cap always looks wrong or a badge on his suit sometimes looks off, or even his general appearance since I sometimes have to generate 5-10 times before a decent result. I'm guessing that would require captioning.
>>
>>101939752
until it gets abandoned
can you pin T5 to the cpu in Forge yet?
>>
>>101939745
Okay, I see the problem. Your config file is named naif.yml and you're trying to get it to run a file called lora.yaml, which doesn't exist.

try this:

 python run.py config/naif.yml
>>
>>101939765
>can you pin T5 to the cpu in Forge yet?
Nope, that's why I'm stuck with comfyUi, this shit has more important features
>>
>>101939761
>How exactly?
huggingface spaces can be cloned, they are git repos
the app downloads the models for you (the llama model from the official meta repo requires asking for permission first but there are mirrors you can use just by changing one line in the code)
>>
>>101939778
wait fuck. I see in your screenshot you've changed the name.
>>
File: ComfyUI_00956_.png (1.22 MB, 1024x1024)
1.22 MB
1.22 MB PNG
>>101939699
>tail pipe on the back of a bus
nice. adding "tail end of a city bus that's pulling away from its stop" did the job
>>
>>101939778
no I renamed it. check the terminal again, it's calling the correct file.
>>
File: 1693567355632801.png (772 KB, 1024x1024)
772 KB
772 KB PNG
gentlemen
>>
>>101939752
>>101939754
I haven't done this in a hot minute, didn't realize there was a better fork, thanks anons.
>>
>>101939797
I even tried calling the full path
>>
>>101939802
they thought they could stump T5 when it was them who were promptlets, although "back of the bus" should be pretty clear
>>
File: 00025-1402907279.png (1.17 MB, 832x1216)
1.17 MB
1.17 MB PNG
>>101939114
Mine is running head-on into the bus...
>>
>>101939578
What about stuff like llava?
>>
>>101939805
Okay. One more thing though. In the command prompt you are writing "niaf" when the actual file name is naif
>>
>>101939813
it's impressive how well flux is learning new concepts, and these are just the loras we have so far; imagine a serious finetune on this. I hope someone is up to the task, is pony-dev hinting he'll be working on that model?
>>
>>101939840
someone will do it, it's a capable model
>>
>>101939840
>is pony-dev hinting he'll be working on that model?
he can't monetize FluxDev so no
>>
>>101939859
he can on schnell, that model fucking sucks though
>>
>>101939840
>is pony-dev hinting he'll be working on that model?
Nope, he's been adamant about not wanting to do it.
>>
>>101939834
Oh holy fuck I am a retard. You're right.
Now I am getting an error about the folder path of the files but that can be fixed, probably doesn't like the spaces.
>>
>>101939884
No problem. I once spent like two days trying to figure out an error before I realized I spelled the world university as unversity.
>>
File: 00027-2540765661.png (1.2 MB, 832x1216)
1.2 MB
1.2 MB PNG
I can't get inpainting to work with GGUF Q8 in forge, hmm.
>>
>>101939897
And word as world.
>>
File: 00028-2540765662.png (1.34 MB, 832x1216)
1.34 MB
1.34 MB PNG
The guy laughing and pointing is a nice touch if I do say so myself.
>>
>>101939897
Ifs a feeble mind that kant think of moar then won way to spel a wurd.
>>
>>101939928
I know. It's my weakness.
>>
this retard https://civitai.com/models/647663/porsche-911-gts-2024-flux
>2.5GB lora
>the LoRa is this big because it is trained with really high res images think 4k and with a really high rank 256. You could make it smaller but it wouldn't retain all the intricate details of the car
clueless
>>
>>101939840
ponyman went all in on auraflow, "disasterpiece"
>>
>>101939897
Before I start, is there anything I should do to run this on a 16gb card. I know people have done it, I just don't know if they did it with this.
>>
>>101939966
>All that wasted gpu time.

Flux honestly does fine at rank 16, I don't think going further beyond that is going to do much more for you than deep-fry whatever you want to make.
>>
File: 123.jpg (2.85 MB, 1792x2304)
2.85 MB
2.85 MB JPG
>>
>>101939984
16gb? That's gonna be rough, run it with the low vram args I guess and probably only train at like 512 resolution (not as bad as you think) and maybe consider lowering the rank to 8.
>>
my puter is making pics of my waifu
>>
File: ComfyUI_00197_.png (566 KB, 512x768)
566 KB
566 KB PNG
>>101939967
It says sneed on the poster?
>>
File: 445036257.png (1.41 MB, 1344x768)
1.41 MB
1.41 MB PNG
>>101940001
I've seen this one already, though it does look nice.
>>
>>101940030
>>101940038
>>
>flux is released
>every single AI company releases a new, much more powerful closed source model
it's over
>>
>>101940074
>every single AI company release a new, much more powerful closed source model
like what
>>
>>101940030
supposed to be topless muscular men, adolf hitler and godzilla. got it right a few times.
>>
>>101940074
now the standard is flux, everything inferior will be dismissed, at least that'll force them to work harder and give us actual good products, which is always welcome
>>
File: 0.jpg (220 KB, 1024x1024)
220 KB
220 KB JPG
>>
>>101940013
Still struggling. Keeps running into unicode string errors.
I should just boot into ubuntu and do it there. Don't know if I can be fucked though.
Might just wait for kohya
>>
File: file.png (1.4 MB, 1024x768)
1.4 MB
1.4 MB PNG
>>101940088
Grok2 and Imagen 3 (ImageFX) just got released.
>>
>>101940167
That's nice, the genie is definitely out of the bottle, I'm tired of all that fearmongering about AI
>>
>traditional japanese art style, ink on paper, a cyborg samurai in a futuristic Tokyo with katana and jingasa, red sun, japanese calligraphy on the upper right corner, wabi-sabi, henna and carmine, sepia, minimal brush strokes
That was made by flux-pro, can dev reach that level?
>>
File: 00034-127994779.png (1.37 MB, 1216x832)
1.37 MB
1.37 MB PNG
>>101940167
What was that image made with?
>>
Do we have a baker?
>>
File: file.png (511 KB, 750x500)
511 KB
511 KB PNG
>>101940215
Grok2.
>>
>>101940225
>Grok2.
grok2 is dev pro though?
>>
>>101940225
So Flux Pro
>>
>>101940225
Not bad, has a more natural look.

Let's see what finetuned flux will be like though.
>>
Here we go...
>>101940241
>>101940241
>>101940241
>>
>>101940167
>Grok2
I'm not worried, Grok 2's image gen already got leaked.
>>
File: ComfyUI_00952_.png (1.31 MB, 1344x768)
1.31 MB
1.31 MB PNG
>>101940167
>Grok2
Flux pro
>Imagen 3
Doesn't look any better than Dev
>>
>>101940248
>Grok 2's image gen already got leaked.
what?
>>
>>101940213
I will try and post in the new thread
>>
File: ComfyUI_00199_.png (575 KB, 512x768)
575 KB
575 KB PNG
>>101940167
Looks real, did you use anything else?
>>
>>101940257
Because Grok is flux, idiot.
>>
>>101940331
it's flux pro, and we don't have that, retard
>>
>>101940255
where is her right hand?
>>
>>101940255
>Doesn't look any better than Dev
Highly disagree. Case in point >>101939483
>>
>>101940372
looks like a SDXL gen, the far away details are bad compared to flux
>>
>>101940360
she is disabled
>>
>>101940341
demonstrate something Pro can do that Dev can't
>>
File: file.png (2.01 MB, 1024x1024)
2.01 MB
2.01 MB PNG
>>101940442
Bro did you even look at the flux images using the same prompt in this very thread?

>>101940442
Pro looks much better. It's not even close.
>>
File: imagefxd.png (1.85 MB, 1024x1024)
1.85 MB
1.85 MB PNG
>>101940378
I didn't specify for telephoto
>>
File: Capture.jpg (44 KB, 791x351)
44 KB
44 KB JPG
>>101940541
Come on man that's not a quality picture, that's some ugly mush, flux doesn't fuck up like that
>>
File: ifx47.png (1.48 MB, 1024x1024)
1.48 MB
1.48 MB PNG
>>101940562
that's a bus queue anon...
try and gen a wren, a baby quoll and a salamander on moss in flux
i'll wait...
>>
>>101940629
that I agree with, flux doesn't know enough concepts, but it's easier to add more concepts to a model than to give it more quality in images
>>
>>101940655
both have strengths, as with any model; hopefully a good finetune will add some more textures and film tropes, you can see what they shied away from in training
>>
>>101938755
>>101938786
28 screencaps... lazy. The right way is to run the bluray through PySceneDetect, detect & remove near-duplicate images, caption with joy caption, finetune on the final 1000~2000 images with random crop enabled, high epochs and a slightly slow learning rate, then extract a lora at different ranks and take the best one

quality in, quality slop out
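For the near-duplicate step, perceptual hashing gets you most of the way (a sketch assuming the imagehash and Pillow libraries; the distance threshold is something you'd tune per source):

from pathlib import Path
from PIL import Image
import imagehash

def drop_near_duplicates(folder, max_distance=5):
    kept = []
    for path in sorted(Path(folder).glob("*.png")):
        h = imagehash.phash(Image.open(path))
        # hamming distance between perceptual hashes; small = visually near-identical
        if any(h - kh < max_distance for _, kh in kept):
            path.unlink()                 # delete the near-duplicate screencap
        else:
            kept.append((path, h))
    return [p for p, _ in kept]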
>>
>>101941296
>captioning before random crop



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.