/g/ - Technology


Thread archived.
You cannot reply anymore.




File: tmp.jpg (882 KB, 3264x3264)
Discussion of free and open source text-to-image models

Previous /ldg/ bread : >>101959699

>Beginner UI
EasyDiffusion: https://easydiffusion.github.io
Fooocus: https://github.com/lllyasviel/fooocus
Metastable: https://metastable.studio

>Advanced UI
Automatic1111: https://github.com/automatic1111/stable-diffusion-webui
ComfyUI: https://github.com/comfyanonymous/ComfyUI
Forge: https://github.com/lllyasviel/stable-diffusion-webui-forge
InvokeAI: https://github.com/invoke-ai/InvokeAI
SD.Next: https://github.com/vladmandic/automatic
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI

>Use a VAE if your images look washed out
https://rentry.org/sdvae

>Model Ranking
https://imgsys.org/rankings

>Models, LoRAs & training
https://civitai.com
https://huggingface.co
https://aitracker.art
https://github.com/Nerogar/OneTrainer
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts

>Flux
https://huggingface.co/spaces/black-forest-labs/FLUX.1-schnell
https://comfyanonymous.github.io/ComfyUI_examples/flux

>Pixart Sigma & Hunyuan DIT
https://huggingface.co/spaces/PixArt-alpha/PixArt-Sigma
https://huggingface.co/spaces/Tencent-Hunyuan/HunyuanDiT
https://huggingface.co/comfyanonymous/hunyuan_dit_comfyui
Nodes: https://github.com/city96/ComfyUI_ExtraModels

>Index of guides and other tools
https://rentry.org/sdg-link
https://rentry.org/rentrysd

>GPU performance
https://vladmandic.github.io/sd-extension-system-info/pages/benchmark.html
https://docs.getgrist.com/3mjouqRSdkBY/sdperformance

>Try online without registration
txt2img: https://www.mage.space
img2img: https://huggingface.co/spaces/huggingface/diffuse-the-rest
sd3: https://huggingface.co/spaces/stabilityai/stable-diffusion-3-medium

>Maintain thread quality
https://rentry.org/debo

>Related boards
>>>/g/sdg
>>>/h/hdg
>>>/e/edg
>>>/d/ddg
>>>/b/degen
>>>/vt/vtai
>>>/aco/sdg
>>>/trash/sdg
>>
PINGAS
>>
>>101962774
>top right
kek, good collage op
>>
total debo death
>>
File: 1686467601141.png (1.23 MB, 960x960)
>mfw
>>
File: 3481597801.png (1.16 MB, 1152x896)
>>
>>101962774
Yes! my image gen in the OP three times in a row now.
>>
Kill Debo. Behead Debo. Roundhouse kick a Debo into the concrete. Slam dunk a Debo baby into the trashcan. Crucify filthy Debos. Defecate in Debo's food. Launch Debo into the sun. Stir fry Debo in a wok. Toss Debo into active volcanoes. Urinate into Debo's gas tank. Judo throw Debo into a wood chipper. Twist Debo's head off. Report Debo to the IRS. Karate chop Debo in half. Curb stomp pregnant black Debos. Trap Debo in quicksand. Crush Debo in the trash compactor. Liquefy Debo in a vat of acid. Eat Debo. Dissect Debo. Exterminate Debo in the gas chamber. Stomp Debo's skull with steel toed boots. Cremate Debo in the oven. Lobotomize Debo. Mandatory abortions for Debo. Grind Debo fetuses in the garbage disposal. Drown Debo in fried chicken grease. Vaporize Debo with a ray gun. Kick old Debo down the stairs. Feed Debo to alligators. Slice Debo with a katana.
>>
File: ComfyUI_03401_.png (2.12 MB, 1216x832)
>>101962717
>Flux doesn't do art styles
>Try doing impressionism

works on my machine.
>>
>>101962874
I should probably NOT prompt this one.
>>
File: green goblin FINISH IT.png (620 KB, 1280x720)
>>101962890
PROMPT IT!
>>
what are the best settings for flux dev?
>>
>>101962821
>spamming or flooding
You know what to do boys
Give him a taste of his own medicine
>>
File: ComfyUI8.18.2024__00011_.png (2.87 MB, 1248x1824)
>>
File: 640861403.png (1.25 MB, 896x1152)
>>
File: 1167752492635494008-SD.png (1.8 MB, 896x1152)
>>
File: FLUX_00069_.png (1.19 MB, 896x1152)
>>
File: ComfyUI_03403_.png (2.17 MB, 832x1216)
>>101962886
>>101962717 (Cross-thread)
>Flux doesn't do art styles
>Try doing non-realism

again, works on my machine. seems like a skill issue more than anything else, happy to provide a catbox of either gen to get you up to speed.
>>
>>101962952
very cool. give it nitros
>>
>>101962975
shooooo
>>
>>101962975
wow is that a boxxy lora?
>>
I need to downgrade pytorch to 2.3.1. I'm scared.
>>101962979
course it works.
>>
>>101962886
Try making portraits of humans.
>>
is multi character prompting in comfyui still obscenely complicated? Tried looking into it just now and instantly gave up at the sight of that monstrous setup..
Pony by the way.
>>
File: 00217-45788179.png (1.11 MB, 1216x832)
At a meeting with my boss.

"Yeah 100%"

"Yeah exactly, so true"

"Very interesting boss"

"Yeah...."

I forgot what she said, but it was a productive meeting.
>>
>>101962979
Using CFG doesn't count
>>
>>101963007

Here's the command I used in my python_embedded folder to upgrade to 2.4 nightly, should be able to adapt this to 2.3.1 if you wanna go back rather than forward:

https://pytorch.org/get-started/locally/

.\python.exe .\Scripts\pip.exe install --pre --upgrade torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu124
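
For going back instead of forward, here is a minimal sketch of how the same command might be adapted to a pinned stable release. The torchvision/torchaudio versions and the `cu121` tag are assumptions for 2.3.1, and `pinned_torch_args` is a hypothetical helper, so check the "previous versions" page on pytorch.org before running anything. It only prints the command.

```python
import sys


def pinned_torch_args(torch_ver, vision_ver, audio_ver, cuda_tag):
    """Build pip arguments for a pinned stable PyTorch install (no --pre)."""
    return [
        sys.executable, "-m", "pip", "install",
        f"torch=={torch_ver}",
        f"torchvision=={vision_ver}",
        f"torchaudio=={audio_ver}",
        # Stable wheels live under /whl/<cuda tag>, nightlies under /whl/nightly/<cuda tag>.
        "--index-url", f"https://download.pytorch.org/whl/{cuda_tag}",
    ]


if __name__ == "__main__":
    # Assumed version pairing for 2.3.1; verify before use.
    args = pinned_torch_args("2.3.1", "0.18.1", "2.3.1", "cu121")
    print(" ".join(args))  # print only, run it yourself once it looks right
```

Run it with the embedded python.exe so `sys.executable` points at the interpreter ComfyUI actually uses.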
>>
>>101963022
after u fart nigger lips
>>
File: 1167752492635494010-SD.png (1.83 MB, 896x1152)
>>101962982
silly anon, you dont pimp a classic!
>>
>>101963037
Just noticed the guy on the left has his phone out, pretty sus.
>>
File: 00219-45788181.png (1016 KB, 1216x832)
>>101963037
This was break time, we didn't eat.
>>
File: ComfyUI_01073_.png (1.68 MB, 1344x768)
>>101963045
b-but vroom
>>
Holy fuck... it is time...
https://civitai.com/models/659351/fluxgretathunberg?modelVersionId=737774
>>
File: 556151313215.jpg (101 KB, 1200x800)
>>101963022
If it is real impressionism I should be able to make portraits that look like this.
>>
>>101963039
BRO thank you. does the nightly work? win10 btw and well warning warning with pytorch 2.4.0
>>101963045
exactly. they look cool in blue tho.
>>
File: 2639377541.png (1.3 MB, 768x1344)
>>
>>101963097
Nightly gave me a 10 second gen speed increase, so comfy was clearly right that 124 stable was fucked.
>>
>>101963128
>how do i get nightly?
>>
File: ComfyUI_31928_.png (1.43 MB, 1024x1024)
>>
File: 1167752492635494013-SD.png (1.79 MB, 896x1152)
>>101962952
>>101963045
>>101963097
blue ferrari?! thats blasphemy.. heres another blue though
>>
>>101963155
>how do i get nightly?

>>101963097
just copy that command and run it in the python_embedded folder
>>
>>101962979
NTA but I would love a catbox pls
>>
File: delux_ci_00046_.png (1.87 MB, 1536x968)
>>101963155
its only available at night
>>
How do I generate 1000+ images? If I put batch count to 100 it crashes my pc, and I don't want to click queue manually.
>>
>>101962979
>>101963095
Styles also aren't the only issue. Artists are too. SDXL knows thousands of artists, both living and nonliving, plus anyone you could name from Artstation. This knows like what, 3 of them?
>>
>>101963159
ferrari's racing colors were blue 1952-64 learn history chud
>>
File: ComfyUI_03402_.png (2.17 MB, 832x1216)
>>101963173
Here you go pal
https://files.catbox.moe/3whdcu.png
>>
>>101963195
>if I put batch count to 100 it crashes my pc
no shit
>>
File: delux_ci_00047_.png (1.97 MB, 1536x968)
>>101963195
in comfy: just click auto-queue
in forge: I think you can right click the generate button?
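
If auto-queue isn't enough, ComfyUI also takes jobs over HTTP, so a script can queue them one at a time instead of one giant batch. A rough sketch, assuming the server runs on the default 127.0.0.1:8188 and the workflow was exported in API format. The node id `"3"` and the input key `"seed"` are placeholders for whatever your own export uses (Flux workflows may keep the seed in a different node), and both function names are made up for this example.

```python
import copy
import json
import urllib.request


def make_payloads(workflow, node_id, seed_key, start_seed, count):
    """Return one /prompt payload per seed, leaving the input workflow untouched."""
    payloads = []
    for i in range(count):
        wf = copy.deepcopy(workflow)
        wf[node_id]["inputs"][seed_key] = start_seed + i
        payloads.append({"prompt": wf})
    return payloads


def queue_all(payloads, url="http://127.0.0.1:8188/prompt"):
    """POST each payload to the ComfyUI queue (assumed default address)."""
    for payload in payloads:
        req = urllib.request.Request(
            url,
            data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json"},
        )
        urllib.request.urlopen(req).close()


if __name__ == "__main__":
    # Hypothetical file name: an API-format export of your workflow.
    with open("workflow_api.json", encoding="utf-8") as f:
        workflow = json.load(f)
    queue_all(make_payloads(workflow, "3", "seed", start_seed=1, count=1000))
```

Each job is a single image, so memory use stays the same as a normal gen and the queue just works through them.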
>>
File: 1167752492635494012-SD.png (1.81 MB, 896x1152)
>>101963159
>>101963202
b-but this is an f40, I dont want my f40 in blue
>>
>>101963039
pytorch version: 2.5.0.dev20240818+cu124
WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for:
PyTorch 2.4.0+cu118 with CUDA 1108 (you have 2.5.0.dev20240818+cu124)
Python 3.10.11 (you have 3.10.9)
so, upgrade python and xformers or did I fuck up?
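
A small sketch of what that warning is comparing, assuming the version strings are copied straight from the log above. The helper names are made up for illustration. The point is that the xformers wheel was compiled against one torch/CUDA build and a different one is installed, so the usual options are an xformers build that matches your torch or running without xformers.

```python
def split_build(version):
    """Split '2.4.0+cu118' into ('2.4.0', 'cu118')."""
    base, _, local = version.partition("+")
    return base, local


def mismatches(built_for, installed):
    """List the parts of the torch build that differ between the two strings."""
    b_base, b_cuda = split_build(built_for)
    i_base, i_cuda = split_build(installed)
    problems = []
    if b_base != i_base:
        problems.append(f"torch {b_base} vs {i_base}")
    if b_cuda != i_cuda:
        problems.append(f"CUDA build {b_cuda} vs {i_cuda}")
    return problems


if __name__ == "__main__":
    for line in mismatches("2.4.0+cu118", "2.5.0.dev20240818+cu124"):
        print("mismatch:", line)
```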
>>
>>101963202
>ferrari's racing colors were blue 1952-64
No...
https://www.formula1.com/en/latest/features/2015/10/f1-do-you-remember-when-ferrari-raced-in-blue-in-mexico.html
https://en.wikipedia.org/wiki/North_American_Racing_Team
>>
>>101963208
>CFG
doesn't count
>>
>>101963254
>wiki
oy vey
>>
File: ComfyUI_03416_.png (1.82 MB, 832x1216)
>>101963095
>>101963022
>>101963196
>continuously just moving the goal posts to an unfathomable degree from the original proposition that 'flux doesn't do art styles'
>>
File: ComfyUI_31932_.png (1.09 MB, 1024x1024)
>>
>>101963288
naisu anon!
>>
>>101963250
weird, just installed the standalone release again to check and it came with Python 3.11.8, are you on a really old install?
>>
>>101963196
who cares, you can make a lora of your art style fetish in 60 minutes now and it'll be better than anything a base model can do
>>
>>101963237
fuck this is a good gen, love the little camera artifacts
>>
File: FLUX_00077_.png (971 KB, 896x1152)
do cigs suck because BFL removed it from the training data due to 'safety', or is it just a universal constant that it will never work in any diffusion model ever
>>
>>101963311
yes, from uh last year. continuously upgraded. time to nuke it? does copy-pasting the custom nodes folder work?
>>
File: 1167752492635494016-SD.png (1.87 MB, 896x1152)
>>101963237
g'day anons, lets flux laters
>>
>>101963288
you really could make a SCUMM point and click game with AI now
>>
>>101963352
DALL-E 3 does them right most of the time
>>
>>101963353
That would work, including the models, workflows, etc I think. But I wonder if the easier option would be to just download the new standalone and replace your old python_embedded folder with the new one - and then run the new command.

Not tried it myself but in theory the environments are entirely replaceable, you'd just be downloading/installing the required python bits and pieces again on first launch.
>>
>>101963353
>>101963400

https://github.com/comfyanonymous/ComfyUI/blob/master/extra_model_paths.yaml.example

You could also just get the new standalone and direct all your paths to your old folder using the extra_model_paths.yaml config file.
>>
>>101963273
Art styles= thousands of mediums and artists.
Zero bias towards a certain look that it always goes for. It simply doesn't. It needs to be carried heavily by LoRAs to do art styles.

Here's what a proper model should be able to do
https://midlibrary.io/styles

Notice there are thousands of these, and there is no AI face bias depending on which illustrator or medium you choose. Flux is great with styles when it can do them, but it needs to be carried heavily by LoRAs and finetunes.
>>
File: delux_ra_00008_.png (2.8 MB, 1536x1152)
>>101963352
flux does smoking a lot better than sdxl could
>>
File: FLUX_00078_.png (932 KB, 896x1152)
>>101963367
I'm hoping it's just safety crap, the potential is there with detailed prompt adherence
>>
File: ComfyUI_03396_.png (2.32 MB, 1216x832)
>>101963038
>>101963257
CFGlets wish they could be me

>>101963418
I don't disagree
>>
File: 1877375566.png (1.12 MB, 768x1344)
>>
>delux
thread schizo poo prompt and output and comment nigger
>>
>>101963418
it's retarded to compare because MJ at this point is an established product, you'd have to compare against earlier versions which is obvious they've been iterating and adding to the dataset.
>>
>>101963466
>it's retarded to compare because MJ at this point is an established product, you'd have to compare against earlier versions which is obvious they've been iterating and adding to the dataset.

SDXL can do just as many no problem
https://rikkar69.github.io/SDXL-artist-study/

I compared with MJ because they are using similar tech and are at similar levels at this point. There is no excuse for Flux to regress even past what SD 1.4 could do. Look at Chink models like Pixart and Hunyuan, they can both natively handle more styles than SD could (though I think with artists Hunyuan is the only one that still continued to shine). The same should apply to Flux even more, it should be able to do a magnitude more than SDXL, but yet it can't.
>>
File: ComfyUI_31937_.png (764 KB, 1024x1024)
>>
>>101963418
Midjourney dataset was hand-curated including custom meta-styles like 'aetherpunk' and 'meatcore'
https://web.archive.org/web/20231231203837/https://docs.google.com/spreadsheets/d/1MEglfejpqgVcaf-I-cgZ5ngV_MlaOTeGXAoBPJO69FM/htmlview#
while flux was trained on ai captioned images that most certainly don't know what any of that shit implies. I think the best bet would be a massive finetune of all of these artists to then see how flux compares, because from my experience training loras it's quite capable of learning new things, it's just that the base model completely fails to make use of the 12b params and instead just slops everything together as 'digital art' because that's what cogvlm said it was
>>
>>101963530
Yawn don't care, train your own model.
>>
whenever i post about styles there's this schizo who copies what i say almost verbatim. am i the schizo here, or is he stalking me? get your own talking points
>>
>>101963568
which anon, which side of the style debate are you on
>>
>>101963400
gonna try that now. thanks for the idea & help.
>>101963413
yeah the yaml stuff, I can do that. not that much inside comfy model wise anyways (except the UNET models) - I'll report back.
>>
File: contrapoints.png (1.38 MB, 1216x832)
>>
just noticed t5 can't tell left from right, or it doesn't factor perspective
>>
whenever i make a post about styles there's some schizo who copies what i say almost verbatim. am i the schizo here, or am I being stalked? get your own talking points
>>
>>101963455
sexo
>>
whenever I shit I also piss
>>
File: FLUX_00081_.png (1.12 MB, 896x1152)
forgot I had the lora on, lol
>>
>>101963604
I've been training loras on flux and the captioners often fuck this up massively with "in his left hand" not accounting for the POV. Combining "in his left hand" and "on the left side" likely fucks everything up, especially when the tagger can't even get it right half the time. I imagine this applied at a large scale is what causes it to be so screwed up.
>>
File: file.png (449 KB, 1024x1024)
>>101963604
>a glass ball on the left, a red cube on the right, a ray tracing demo
At the end of the day AI captions are not 100% perfect especially for small details.
>>
>If you are using the 4chanX extension you can filter his images by adding /^de/i to your filename filter
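
For anyone unsure what that filter catches, here is the same pattern checked in Python against a few filenames from this thread. This only illustrates the regex itself (`^de`, case-insensitive). How 4chanX applies it is not shown here, and `Delta.jpg` is a made-up name to show the over-matching.

```python
import re

# Same pattern as /^de/i: starts with "de", ignoring case.
pattern = re.compile(r"^de", re.IGNORECASE)

names = [
    "delux_ci_00046_.png",
    "ComfyUI_03401_.png",
    "FLUX_00069_.png",
    "Delta.jpg",  # hypothetical: any file starting with "de" is caught too
]

hidden = [name for name in names if pattern.search(name)]
print(hidden)
```

A longer prefix in the pattern would cut down on false positives.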
>>
>>101963634
Exactly, also taggers keep messing up things like looking at viewer when they aren't etc
>>
>>101962774
What's the difference between Automatic1111 and its fork SD.Next? Which one would you recommend and why? Pls spoonfeed. I've been looking for some more information on SD.Next but for something that seems pretty good, there's not a lot of people using it and therefore not too much user info.
>>
File: ComfyUI_31944_.png (999 KB, 1024x1024)
>>
>>101963668
>Pls spoonfeed

Use forge, it's Automatic1111 but better in every way.

https://github.com/lllyasviel/stable-diffusion-webui-forge
>>
File: ComfyUI_31945_.png (1.23 MB, 1024x1024)
>>
>>101963668
>Which one would you recommend...?
Forge.
>>
File: 4241246369.png (549 KB, 896x1152)
>>
>>101963686
Yeah, I've heard that before. But then I read there were a lot of issues with it and the guy who made it is a powertripping faggot, so I'd rather stay away.
>>
File: IMG_20200613_231647.jpg (49 KB, 574x680)
>finally after dedicating this whole week to learning comfyui after forge failed me for the last time i can gen whatever i want at a respectable quality with whatever lora i want


>mfw i've run out of ideas and don't even know what kinds of interesting prompts to do
>>
>>101963551
My point still stands that SD versions from 1.4 to XL and recent chink models could do more. SD3 was also trained with AI captions and I know the base model did not lose all that knowledge. If Flux could do them it wouldn't have been hard to tell the VLM who the scraped artists were. In fact since it seems to know them it probably was at the base. But yet it's clearly biased to give you a very nonartsy look, so it was DPO'd to give not give you this for "safety" reasons. It's a matter of realignment, but teaching it to do all these different artists in one finetune has never been done before and it's a massive undertaking, we're talking both Western and Japanese artists here to appeal to what everyone needs, it's unlikely we will get something as good as MJ. Now compare that to a base model that's not censored from the start, you get my point. Flux is neat, but if the company behind it decides to not do anything about it competition will be in a position to give us better models.
>>
>>101963699
you'll find characters like that in any open-source software

if you want to autismmaxx then use comfy btw
>>
>>101958634
It's the SD3 branch of Kohya scripts that has flux lora training. Don't ask me why he didn't add it to the branch called flux, I don't question a man creating miracles
>>
>Requested to load Flux
>Loading 1 new model
>loaded partially 3259.2000000000003 3256.535400390625 0
comfy why
just kick out some firefox tabs or something
>>
>>101963699
isn't that comfy that's like that though
>>
File: delux_cc_00015_.png (1.32 MB, 1216x832)
>>101963702
>>mfw
make gun-knights
>>
He's so desperate and lonely kek
>>
>write simple instruction like hand resting on hip
>never works
so much for 12 gorillion parameters
>>
File: 02436-2434078953.png (3.49 MB, 2162x1591)
This thing is really good at satisfying my sleek modern design porn cravings.
>>
File: catbox_c9sj1g.png (1.17 MB, 832x1216)
trying to into flux, keeping it basic tweaking the comfy sample workflow
catboxes of successfully rendered anime grils would be much appreciated
>>
>>101963783
stick to Pony for now if you want anime, worthy Flux finetunes might or might not be in the making
>>
>>101963753
Yes but we don't really talk about that
>>
File: photo00014.jpg (134 KB, 1464x1064)
Hello, ive been away all day. What have I missed? Anything new in the genAI space?
>>
Help! How can I batch caption my lora dataset in flux natural language style? Entering this shit one by one into joycaption is nightmarish
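
One way this could be scripted, as a sketch only: walk the dataset folder and write a same-named .txt file next to each image, which is the sidecar layout the common LoRA trainers read. `caption_fn` is a placeholder for whatever captioner you can call from Python. Nothing here assumes a real joycaption API, and `caption_folder` is a made-up helper name.

```python
from pathlib import Path

IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".webp"}


def caption_folder(folder, caption_fn, overwrite=False):
    """Write <image name>.txt next to every image; return the files written."""
    written = []
    for img in sorted(Path(folder).iterdir()):
        if img.suffix.lower() not in IMAGE_EXTS:
            continue
        txt = img.with_suffix(".txt")
        if txt.exists() and not overwrite:
            continue  # keep captions that were already written or hand-edited
        txt.write_text(caption_fn(img).strip() + "\n", encoding="utf-8")
        written.append(txt.name)
    return written


if __name__ == "__main__":
    # Placeholder captioner: swap in a call to your actual VLM.
    def fake_caption(path):
        return f"A photograph, filename {path.stem}."

    print(caption_folder("dataset", fake_caption))
```

Skipping existing .txt files means the run can be stopped and resumed without redoing finished captions.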
>>
>>101963829
Flux 2
>>
File: 00238-4275506538.png (1.08 MB, 1152x896)
>>
>>101963809
i think itll happen pretty quickly assuming at least one of the weeb trainers gets off his ass
>>
File: file.png (891 KB, 1024x1024)
>>101963773
works for me
>>
File: 1086746978.png (744 KB, 768x1344)
>>101963809
Unless you want porn, Flux is still better.
>>
>>101963839
Really?
>>
>>101963852
Living in a cave eh?
>>
>ERP? Oh yes, I understand you completely. You want to do some Enterprise Resource Planning (ERP). Alright, let's get down to business!
>>
File: FLUX_00003_.png (1.14 MB, 896x1152)
didn't realise I'd been at this all day
fuck mondays
>>
so for flux, which of the render hacks should i try to plug in? things like dynamic CFG, attention guidance, negpip, kohya hiresfix etc.
>>
>>101962774
kys
>>
How do I actually make ForgeUI accessible on my local network? Host computer is a Windows 10 machine.
I did
set COMMANDLINE_ARGS= --listen
in webui/webui-user.bat, is there something else I have to set there, or do I set something in another file?
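
A quick way to narrow this down from another machine on the network, as a sketch: check whether the port accepts connections at all. The address 192.168.1.50 is a made-up example for the host's LAN IP and 7860 is assumed to be the port the UI is using. If this prints False while the UI works on the host itself, a firewall rule on the Windows box is a likely suspect.

```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Hypothetical LAN address of the Windows host; assumed default port.
    print(port_open("192.168.1.50", 7860))
```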
>>
File: 00240-1797056541.png (1.15 MB, 1152x896)
>>101963884
Love this one, got a prompt darling?


Also ignore the handholding in my image, it's not gay if you look at a woman while doing it.
>>
>>101963871
>ive been away all day
>>
>>101963965
>don't worry there is no flux 2
>>
File: FLUX_00028_.png (1.28 MB, 896x1152)
>>101963958
don't ever call me that again
>>
File: 00129-2942068692.jpg (966 KB, 1344x1728)
>>101962774
>>
File: FLUX_00088_.png (1.03 MB, 896x1152)
this one ain't terrible
no idea where the feather duster came from
>>
>>101963982
we're never gonna escape the flux grain are we
>>
>>101964014

I think that's the watermark in the VAE.
>>
File: 1.jpg (167 KB, 960x1600)
>>
>>101964014
we need to reticulate the VAE
>>
THE KINO HAS ARRIIIIIIIVED
https://civitai.com/models/660006/neverhood-claymation?modelVersionId=738543
>>
File: 00241-1797056542.png (1.07 MB, 1152x896)
>>101963981
OK I won't Mis Bimbo President

But can you just give me a small idea of the prompt for that office scene sweetheart?
>>
File: miku silvia.png (1.38 MB, 1024x1024)
>>101963783
>the comfy sample workflow
>>
>>101964041
that's some creative censorship
>>
File: 2683848359.png (1.27 MB, 1344x768)
>>
>>101964055
Nice. God there's so much you can do. Like you could spend an entire lifetime just playing with this stuff. And I'm spending a dangerous amount of time on it already. I need to quit.
>>
what does it actually cost to train a lora on civit? I've got points, but it's making me fill in forms before telling me
>>
>>101964101
Why is it so blurry?
>>
File: 00012-1251934252.jpg (406 KB, 1728x1344)
>>101964014
>flux grain
AHHHHHH HELP
>>
>>101964084
post you're workflow
>>
>>101964155
u're*
>>
>>101964155
>you're
>>
>>101964171
i only speak the king's english
>>
>>101964111
2000 points I think which is around 2$ worth
>>
File: 2249510937.png (1.87 MB, 768x1344)
>>101964132
Who knows.
>>
Are 1.5s/it a normal speed for flux fp16 on a 3090? I expected at least 1it/s
>>101964111
2 bucks if you want few steps
Like 2.5 bucks if you want more steps
>>
File: grid-0387.jpg (362 KB, 2304x1792)
>>
File: half.png (40 KB, 284x177)
A new boss has appeared: I can't make a picture like this one with Flux, no matter what I try. People can't appear in the middle of a meal eating hot dogs. The hot dogs are always brand new, never half eaten, never bitten. It's impossible to make it look as if they have one bite left, only whole, brand new ones.
>>
>>101964233
>Are 1.5s/it a normal speed for flux fp16 on a 3090?
1024x1024? No CFG?
>>
>>101962774
reply
>>
>>101964268
CFG of 6, 1024x1024
>>
>>101962774
retarded bait thread
>>
>>101964308
you're good then
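
The arithmetic behind that, as a rough sketch. The factor of two for CFG is a rule of thumb (one conditional plus one unconditional pass per step) and the 20 steps is just an example number, so treat the outputs as ballpark figures.

```python
def its_per_second(seconds_per_it):
    """Convert s/it into it/s."""
    return 1.0 / seconds_per_it


def total_seconds(seconds_per_it, steps):
    """Sampling time for one image, ignoring model load and VAE decode."""
    return seconds_per_it * steps


def seconds_per_eval(seconds_per_it, uses_cfg):
    """Approximate time per model pass; CFG is assumed to run two passes per step."""
    return seconds_per_it / 2 if uses_cfg else seconds_per_it


if __name__ == "__main__":
    print(round(its_per_second(1.5), 3))          # 1.5 s/it expressed as it/s
    print(total_seconds(1.5, 20))                 # seconds for an assumed 20 steps
    print(seconds_per_eval(1.5, uses_cfg=True))   # per-pass time with CFG on
```

So 1.5 s/it with CFG on works out to roughly 0.75 s per model pass, which is in the range the asker expected without CFG.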
>>
>>101964317
k thanks
>>
BIGstockimage33M is writing to webdataset now, gonna take a hot minute, same for upload, probably ready tomorrow
unfortunately they are low res ~0.4MP but suitable for pre-training or caption model finetuning, or you could upscale them
good variety of creative and editorial type, wide range of subjects and locations
>>
>>101964260
ewww
>>
>>101964263
>The image is a close-up of a hot dog on a white background. The hot dog is in the center of the image, with a bun on top of it. The bun is white and appears to be made of a soft, fluffy material. On top of the bun, there is a red hot dog with a golden-brown color. The dog is lying on its side, with its head slightly tilted to the side. The background is plain white, making the hot dog and bun stand out. The image is slightly blurred, giving it a dreamy, ethereal quality.

Half-eaten is probably not common in the VLM captions
>>
File: 00002-1509294616.jpg (406 KB, 1024x768)
tfw must be happy with 5-6 minutes per gen if I wanted to join the cool dev club.
>>
fack orfe hlky
>>
>>101964376
it's as close as it gets to the feeling of using a dial-up modem in the late 90s to download large images
>>
File: ComfyUI_31695_.png (1.52 MB, 1024x1024)
>>101964263
Yup, I couldn't even make a fish skeleton for a hobo catgirl.
>>
File: ComfyUI_00359_.png (1.57 MB, 704x1408)
flux is shit at anime but great at realistic images
>>
>>101964314
>retarded bait thread
>was baited
by transitive property, this means you're retarded
>>
>>101964263
My first gen resulted in what you said and this was my second gen. It actually is eaten into. Just not the same way kek.
>>
>>101963413
I *think* I got it running. old install, new python_embeded. bit of a hassle with some nodes but hey. no speed increase whatsoever tho, went from 1.1x to 1.1x s/it
>>
File: 3561812025.png (1.65 MB, 1152x896)
>>
>>101964409
>trans
not valid
>>
>>101964410
Ok this was the fourth gen. Actually kind of worked. Maybe you're just unlucky with your specific prompt, seed, etc?
>>
File: hiring.jpg (436 KB, 896x512)
>>101964410
>>
File: Bitten.png (124 KB, 600x400)
>>101964365
I wonder what else is missing.
>your most advanced model can't draw this.
>>
File: file.png (49 KB, 595x100)
>>101964415
Should be able to verify by seeing if your pytorch version is reporting the newest version. Shame there wasn't a speed increase on your side though.
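
If you'd rather check outside the UI log, a minimal sketch like this can be run with the same interpreter ComfyUI uses. It handles the case where torch isn't importable, and `torch_report` is just a name made up for the example.

```python
import importlib.util


def torch_report():
    """Describe the torch build visible to this interpreter."""
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed in this interpreter"
    import torch
    return (
        f"torch {torch.__version__}, "
        f"CUDA build {torch.version.cuda}, "
        f"GPU available: {torch.cuda.is_available()}"
    )


if __name__ == "__main__":
    print(torch_report())
```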
>>
>>101964453
It is what it is. Realism gens have better hands but I don't want to do 3D.
>>
>>101964455
DALL-E 3 has no issue with this
>>
>>101964451
Okay, that would work, how did you prompt it?
>>
>>101964453
>zir doesn't know the secret handshake
>>
File: 028.png (1.55 MB, 1024x1024)
I'm both impressed and annoyed by Flux. Quality is pretty good, but I hate prompting this thing. I genuinely feel more like a schizo prompting than I did with tags.
>>
File: FLUX_00096_.png (1.09 MB, 896x1152)
I'm struggling to get the body right, it always looks like she's wearing a costume
I guess it's the lora, they all had her wearing clothes *spit* so it's making her body look like a costume
>>
>>101964474
Dalle dataset is much bigger than Flux. It can also natively do more than Flux since it wasn't censored at all (except for its text ability, which Flux is the best at out of any publicly released model both closed and open so far). The difference this time is that Flux would need finetunes to catch up to it, and there aren't as many concepts Flux needs to learn so it's possible.
>>
File: 2481007589.png (630 KB, 1152x896)
>>
>>101964461
yes I am on same pytorch&cuda version now. seeing a whole bunch of new errors tho now when using a GGUF quant. whee. also nasty spillover into ram when trying some loras. I'd just give it a week or two normally but flux is so much fun.
>>
>>101964474
hey faggot, you know what you can't do with Dalle-3? train your own sub dataset? you want half eaten hot dogs? train a lora for 20 minutes
>>
Flux Dev finetunes are gonna weaken the CFG distillation, won't they?
>>
>>101964365
this garbage is exactly why 'natural language prompting' was a mistake
>>
File: gadget0001.jpg (133 KB, 1304x1304)
>>
>>101964563
>brainless knee jerk reaction at any mention of closed models
yawn
>>
>>101964477
I just added this.
>On the desk is a half-eaten hot dog.
To be fair, I have now gotten 10 gens and this was the only "successful" one. Two gens had the bread having been eaten into which I guess can count as a half point each. Also I do not use any CFG or special workflows, just the most basic. My guidance is 3.1 though.
>>
>>101964563
>you want half eaten hot dogs? train a lora for 20 minutes
Or wait till God Prompter shares his prompt.
See, that's why I'm still wishing for a good AI prompter that gives a prompt that one can use.
It would allow one to duplicate real pictures and not just AI generated ones, too!
>>
>>101962979
i got avg 3.71s/it with your workflow vs 2.4s/it on my basic bitch workflow, that's a lotta bloat
>>
File: grid-0392.jpg (541 KB, 1792x2304)
>>101964359
Calm it with the racism.
>>
>>101964625
CFG, not even once
>>
File: ifx86.jpg (234 KB, 1024x1024)
>>
>>101962774
fuck off
>>
>>101964634
>Calm it with the racism.
Why?
>>
>>101964594
you can go back to SD 1.5 and SDXL, no one asked you to come
>>
>>101964677
Because racism is not cool anon
>>
>>101964619
>See, that's why I'm still wishing for a good AI prompter that gives a prompt that one can use.
or maybe just a good model that captions things properly like "a half-eaten hot-dog on a white paper plate viewed from above" instead of "mmm perhaps this image may contain a hotdog which reflects the dreamy and ethereal qualities of isolation".
>>
>>101964677
some of our best prompters are black
>>
>>101964625
>adding negative conditioning increases gen times but adheres more to the prompt

wow what an amazing discovery you've made

retards will call something bloat without knowing anything about how diffusers work
>>
>>101964688
>bro why can't it do Messi playing Blitzball while holding a half-eaten hotdog??? this model sucks, I'm going to go use Dalle-3
>>
>>101964689
Such as?
>>
uh oh seething freetard meltie!
>>
>>101964688
The hotdog provides a stark contrast to the paper plate beneath it.
>>
>>101964684
Cool is subjective and I think you're gay so your opinion is irrelevant.
>>
File: file.png (1.73 MB, 1653x1000)
>>101964705
>bro look at how good Dalle-3 is at this prompt
>>
>>101964730
why are you replying to me faggot, i dont give a shit about dall-e. go cope about it to someone else, dalleet
>>
>>101964699
other retards will rationalize deliberate wastes of time and energy as a stopgap for skill issue
>>
This is not a blessed thread of friendship after all.
>>
>>101964730
The Flux video model will fix this
https://www.reddit.com/r/aivideo/comments/1et27wq/guys_immitating_ai_videos_accurately_the_circle/
>>
File: 1850060241.png (1.5 MB, 896x1152)
>>
File: 00057-3659403818.jpg (157 KB, 1080x1280)
>>
>>101964764
>deliberate wastes of time and energy

kek, where do you think we are? what are (You) using Flux for that isn't a waste of time and energy already?

anyway, my gen speeds are fast enough. 4090 means i don't have to worry about facilitating the BasicGuider poorfags
>>
>>101964796
she can handle my meat if you're picking up what I'm putting down
>>
m u p p e t s
>>101962952
BLUE
>>101964770
nonsense
>>
File: 00059-3105338084.jpg (139 KB, 1080x1280)
>>
File: d2.png (1009 KB, 1024x1024)
>>
File: mikuAndTheHotDog.png (1.09 MB, 1280x720)
>>101964618
>I have now gotten 10 gens and this was the only "successful" one
Oh, so it's about trying over and over until it's done, reminds me of my first Stable Diffusion 1.5 days, many variations of the same picture trying to make it do something, with styles Flux can't do! Thanks fren.
>>101964594
>this garbage is exactly why 'natural language prompting' was a mistake
I never liked long prompts because their ends were ignored and you never knew at what point they'd be cut, here it's fine because I've been able to prompt the bible and the last word of it (amen) still has an effect. I didn't have his prompt so I used Joy Caption Pre Alpha and modified it until getting this one that worked:
>This is a digitally drawn anime-style image featuring Hatsune Miku. She is seated at a wooden desk in a modern office setting. On the desk is part of a half-eaten hot dog and crumbs, the hot dog has a missing part that was bitten off and it's incomplete. She has a serious expression as she extends her right hand to shake hands with a person off-screen to the left. Likely an office colleague. Indicating a break or snack time. The desk is cluttered with various office supplies, including a pencil cup filled with colored pens and markers, a calculator, and a notebook. A green potted plant is visible on the left side of the desk, adding a touch of nature to the otherwise busy workspace. The background features a large window with multiple panes, allowing sunlight to stream in and illuminate the room. Outside the window, lush green trees are visible, suggesting an office with a view of nature. The walls are adorned with bookshelves filled with neatly organized binders and books.
Note it originally said:
>a young woman with long, teal hair styled into twin tails adorned with red and black headbands. She is wearing a black blazer over a white dress shirt and a teal tie.
It claims it's not Miku, just some random girl, ha!
>>
File: 00060-3096778599.jpg (144 KB, 1080x1280)
>>
File: 1723120839719067.png (967 KB, 1024x1024)
<lora:FLUX-Pepe-1:1> cctv camera pov, Pepe is dressed in a t-shirt and shorts, the t-shirt says "FLUX" in black text. the cctv has a time and date on it.
>>
>>101964154
I like the grain desu
>>
>>101964812
>BLUE
Reported
>>
>>101964688
It completely hallucinated a fucking dog head tilting his head and you didn't critique that?
>>
>>101963354
>>101963237
The photo texture is very well done. What do you use?
>>
>>101964701
Now, that's not fair, you can get the half-eaten hot dog by just trying 10 different seeds, but no Messi or Blitzball ever.
>>
File: 00064-1623568183.jpg (193 KB, 1344x1600)
>>
>>101964867
aaaa
>>101964847
prompt: "totally not miku!!1!1!11!!!!! :DDDDDDDDDDDDDDDDDDD" (seriously)
>>
No matter what loras I use, "nude" women always wear panties. Why??
>>
File: 00065-3935977933.jpg (219 KB, 1344x1600)
>>
>>101964938
not that I think any of the existing loras do a good job but if you can't make women naked using them that's a skill issue
>>
>>101964938
Sex and nudity loras have always sucked. We'll have to wait for a proper sexo finetune
>>
>>101964955
Thank you for pointing that out. What prompting techniques do YOU use?
>>
>>101964964
"naked woman" works 100% of the time
>>
File: 00067-63388903.jpg (230 KB, 1344x1600)
>>
>>101964974
Thank you, antagonistic faggot. You are very helpful.
>>
>>101964842
This is a high-resolution photograph featuring a close-up view of a hot dog cut in half, placed on a brown tray with a textured surface. The hot dog is a vibrant red color, and the cut section reveals its internal structure, showcasing a detailed view of the meat, fat, and connective tissue. The texture of the meat appears smooth and slightly glossy, indicating that it is fresh and uncooked. The hot dog is positioned slightly off-center, with its curved shape leaning towards the right.

In the background, there is a blurred image of a hamburger bun, providing a subtle context of a meal setting. The background is dark, with a warm, yellowish tone that contrasts sharply with the vivid red of the hot dog. The lighting is soft and highlights the texture of the hot dog's surface, creating a dramatic effect. The photograph is taken from a low angle, emphasizing the texture and detail of the hot dog's interior. The overall mood of the image is surreal and somewhat humorous, with the juxtaposition of the ordinary hot dog and the unusual perspective.
adding a touch of nature to the otherwise modern setting. The overall atmosphere is a blend of professional and casual, with a hint of playful charm.


Llama 3.1 captioning
>>
>>101964847
It's definitely better than the SD 1.5 days since you can do a lot more stuff with success. In the end AI is just like this in general. I say this as an old Dalle user. It can literally screw anything up if you get unlucky with the seed. The difference with better models is that the screwups are less frequent.
>>
File: 1.jpg (112 KB, 1600x960)
>>
>>101964995
I'm not jerking your chain, anon, it's just how it works.
>>
File: Deliberate.png (553 KB, 768x768)
>>101964797
Deliberate was a great SD 1.5 model ahead of its time, like a Dreamshaper without the sameface problem. Deliberate 2 was inferior because it picked up that problem, dropped art styles, and became more generic and soulless.
Unfortunately, Deliberate fails at the half-eaten part too.
>>
File: 00068-1133703696.jpg (202 KB, 1344x1600)
>>
I can sense DEBO in here ruining the chill vibes
>>
>>101965004
it sucks
>>
File: 00069-2588212012.jpg (242 KB, 1344x1600)
>>
File: FLUX_00089_.png (1016 KB, 896x1152)
is it time for collage bait yet
>>
File: 00070-640262086.jpg (201 KB, 1344x1600)
>>
>>101965004
>with a hint of playful charm.
ah, ChatGPT my accursed.
I fucking hate that website and the company. It has permanently ruined every local model with its shitty prose and vague details.
>>
File: 4105827552.png (1.31 MB, 1152x896)
>>
File: delux_flebo_00066_.png (1.56 MB, 1216x832)
>>101965036
what if the collage was all debos
>>
>>101965081
Sounds real gay, I'm in. Any other tranny bros with me?
>>
File: YiffyMix.png (401 KB, 768x768)
>>101964916
And here's YiffyMix's output for that prompt. Back in the old days we had it instead of PonyXL for characters (only model to give you a Judy Hopps without a LoRa), and it was the model with the most soul.
I guess nowadays it's just a relic.
>>
File: 00071-4234978201.jpg (269 KB, 1344x1600)
>>
File: ifx114.png (1.62 MB, 1024x1024)
>>
File: Capture.png (1.64 MB, 2061x430)
This fag has made 30+ shitty bloated celebrity loras in a couple days.
>>
File: img_109.png (2.08 MB, 1024x1024)
>>101962774
When are we getting a finetune like Juggernaut? Soon? or two more weeks soon?
>>
I want to train a Flux LoRA on my 3090. Does anybody know of a good guide for that? I've done one for SDXL but it's been a while and I never actually understood the parameters.
>>
>>101965160
they are more fairly sized now and due to Flux more versatile but some 20MB loras from other users still mog his
>>
File: 00075-369717276.jpg (242 KB, 1344x1600)
>>
>>101965005
>I say this as an old Dalle user.
Well, even Dalle 2 knew what sharing a milkshake meant and could draw two cute girls doing it.
Flux would never, you'd have to explicitly tell it what it looks like, and it may only give you 1 girl if you're not explicit enough.
And, you know what? That's what I loved about Dalle, it would appear to read your mind and give you what you wanted, and when it didn't, it could give you a concept or composition that you didn't see coming that was BETTER than what you had in mind!
An open model that could do that would bury Flux in an instant, even if its "quality" wasn't as good.
We need this jump:
Pony XL -> Flux -> New model
>>
>>101965160
>Get in early
>Spam loras
>Become known as the celebrity flux lora guy
>Reap bucks from retards
>>
File: 4032654035.png (507 KB, 1024x1024)
>>
>>101965013
Not half-eaten.
>Discard.
>>
>>101965172
it would take longer to train than flux has been out for
give it time
>>
>>101965172
>Juggernaut
Pure slop, just like Dreamshaper. We're already getting LoRAs way superior. Finetunes should just look to add what is missing, styles etc...
>>
File: 00077-3004205753.jpg (244 KB, 1344x1600)
>>
File: 3552858127.png (556 KB, 1024x1024)
>>
>>101963537
cute
>>
>>101965224
What does it do
>>
File: Miku1.png (1.23 MB, 1366x768)
>>101965076
I guess I put my failed generation now? I have no use for them...
Failed generations went from deformed faces, to extra legs, to messed up hands to complete hot dogs that were supposed to be half-eaten.
>>
>>101965237
>The iStaple isn't just a stapler. It's a precision-engineered document unification solution that seamlessly brings your ideas together. Crafted from aerospace-grade aluminum, its minimalist design fits effortlessly into any workspace.
>With our patented PaperFusion technology, the iStaple doesn't just connect pages - it creates a bond as strong as your vision. One gentle press activates the haptic-feedback mechanism, guiding specially designed iStaples through your documents with micron-level accuracy.
>>
File: 8446644684.png (449 KB, 411x664)
>>101965215
I don't get how anyone can test a new model and desire literally this. This is the same guy that broke absolutely every single SDXL finetune out there, and this style is now baked into base SD and even Flux. It's obnoxious.
>>
>>101965091
there is soul in everything anon, just need to find a way to tickle it out and boy is flux finicky with the text input.
>>
File: 3416160467.png (1.07 MB, 1024x1024)
>>101965237
>>101965263
I could believe that.
>>
>>101965265
many finetunes have atrocious sample images but the model itself does okay
>>
File: Capture.png (1.33 MB, 854x814)
>>101965215
>>101965265
I'm convinced Dreamshaper XL is the cause of butt-chin shiny AI girl face that is ubiquitous in every model now
>>
>>101965299
well as long as ai companies keep training on ai generated outputs, the butt-chin will continue.
>>
File: 54564588454.jpg (268 KB, 800x483)
>>101965288
>many finetunes have atrocious sample images but the model itself does okay

Doubt it
>>
>>101965282
This image is a digital drawing in a manga style, depicting a young girl sitting on a rooftop at night. The scene is set against a deep blue sky filled with stars, with a prominent shooting star streaking across the sky, adding a sense of wonder and excitement. The girl, who appears to be of East Asian descent, has long, straight black hair and large, expressive eyes. She is dressed in a white blouse with short sleeves and a dark-colored skirt. Her skin is fair, and her cheeks are flushed, indicating she is feeling somewhat anxious or excited. She is sitting on a wooden ledge with her legs drawn up, and her hands are clasped in front of her chest. A speech bubble above her head contains the text "woooahh!" in a playful, handwritten style, suggesting she is enjoying the sight of the shooting star. The background includes a portion of a building, with windows and a chimney visible, enhancing the sense of a cozy, private moment. The overall mood is serene yet slightly thrilling, capturing the essence of a quiet, contemplative night.
>>
I like women with butt chins. I don't think AI does a good job with it though and it is obnoxious. Please like and subscribe.
>>
>>101965189
Oh sorry I meant Dalle 3. I don't know how Dalle 2 was as honestly I wasn't impressed much with image AI before Dalle 3. Maybe DE2 was better? Honestly I loved Dalle 3 but I would not go that far in praising it. It was definitely fun and had a lot of trivia and style knowledge but it couldn't do a lot of stuff I wanted either. I think the experience using Flux is different, rather than worse. I get more coherency from Flux on the things that it does know, and on top of that I get loras, and I also get img2img, which I have a great time getting interesting, surprising gens from. For example this one, I didn't even prompt for the character to do this. It just made her do this.
>>
>>101965314
what am I supposed to learn from this
>>
>>101965265
It became a familiar style, so people sought it out. It's the same concept with celebrities: you don't really like their faces, and a generation with their faces changes nothing, but the faces are familiar, and not random.
>>
File: 00090-889110300.png (2.74 MB, 1080x1920)
>>101965265
>>101965215
I just gave the example since it seemed to be a popular one. The model was def good imo
>>
>>101965330
>>101965189
Oops wrong one.
>>
File: iStaple.png (361 KB, 793x720)
>>101965263
It's a lot more compact in the Japanese version.
>>
File: ComfyUI_00151_.png (1.24 MB, 832x1216)
I find Q4_1 quality is more or less the same as fp16 until I try to generate text, then it fucks up the text way more than higher quants, anyone else?
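For context on why low-bit quants could hurt fine detail such as text: a Q4_1-style scheme stores weights in small blocks, each as 4-bit codes plus a per-block scale and minimum, so every weight gets snapped to one of 16 levels. The pure-Python sketch below only illustrates that idea; it is a simplification, not the actual ggml/GGUF code, and the function names and test block are made up for the example.

```python
def quantize_block(values):
    """Quantize one block of floats to 4-bit codes plus a scale and a minimum."""
    lo, hi = min(values), max(values)
    scale = (hi - lo) / 15  # 4 bits give 16 levels, i.e. codes 0..15
    if scale == 0:
        # Constant block: every weight equals the minimum.
        return 0.0, lo, [0] * len(values)
    codes = [min(15, max(0, round((v - lo) / scale))) for v in values]
    return scale, lo, codes


def dequantize_block(scale, lo, codes):
    """Rebuild approximate weights from the stored codes."""
    return [lo + scale * c for c in codes]


# Hypothetical block of 32 weights spread over [-0.5, 0.5].
block = [((i * 37) % 32) / 31 - 0.5 for i in range(32)]
scale, lo, codes = quantize_block(block)
restored = dequantize_block(scale, lo, codes)

# The worst-case round-trip error is about half a quantization step.
max_error = max(abs(a - b) for a, b in zip(block, restored))
print(f"step={scale:.4f} max_error={max_error:.4f}")
```

Higher quants shrink the step size, which is one plausible reason text rendering degrades first at Q4.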
>>
File: file.png (389 KB, 707x715)
>>101965015
Well then I'm doing something wrong. And it is using the LoRA, otherwise she wouldn't be topless.
>>
File: ComfyUI_31972_.png (1.44 MB, 1024x1024)
Had to spit it out, this stuff is VILE
>>
>>101965299
He is the cause. The real issue is merging. One guy (Leosam) worked to fix that on his models, and others followed suit (E.G. the latest Juggernaut versions don't have that issue, at least not as prominently). Some SD 1.5 models are 100% capable of photoreal girls, but everyone wanted to just keep merging and doing cheap shit, and models like dreamshaper are the result of merging.

What we need to pay attention to are MoEs and LLM-style frankenmerges: https://huggingface.co/blog/segmoe

Rather than the lazy style of merging that keeps taking place (if merges are ever to take place, because merges are incredibly lazy and result in slop).
>>
File: FluxDev_01746_.jpg (220 KB, 832x1216)
>>101965341
>>
File: 197268907.png (615 KB, 1024x1024)
>>
So is half-eaten hotdog the new girl lying on grass?
>>
>>101965362
LIAR!
You just cut it. The cut off piece is right there, you didn't even chew it!
>>
File: 1716714054495923.png (971 KB, 1024x1024)
<lora:FLUX-Pepe-1:1> Pepe is smiling and wearing a tanktop and shorts typing at a computer, on the computer monitor is Miku Hatsune. The text "feels good, man" is visible, meme

a gen of pepe genning miku, it's fluxception
>>
File: 00016-2958243782.png (2.77 MB, 1080x1920)
>>101965376
I miss 1.5
>>
>>101965378
>neat
based
>>
>>101965318
that is a forge gen (no longer on my harddrive) so I couldn't even tell you. certainly less words. DON'T THROW MY GENS INTO CYBERNET IMAGE ANALYZER
>>
>>101965370
MoE merges are especially viable now that we have Llama quantization, it could theoretically even improve Flux's prompt following abilities among other things.
>>
>>101965405
>DON'T THROW MY GENS INTO CYBERNET IMAGE ANALYZER
Say please.
>>
>>101965401
Me too.
>>
File: DucHaitenDreamWorld.png (351 KB, 512x512)
>>101965282
I mean the kind of soulness that makes you forget it's an AI picture, a model like DucHaitenDreamWorld had that, even if the hands sucked.
Trying Flux... all the generations seem samey; if time were reversed we'd have prompted it as "Anime in the style of Flux" and it'd just be 1 among many.
>>
Which vision local captioner has the most detail right now? I tried Florence2 from Microsoft, but it isn't as good as Joy cap.

https://huggingface.co/spaces/fancyfeast/joy-caption-pre-alpha
>>
all the images in this thread appear to be slightly humorous, hinting at the playful and ethereal nature of ai image generation
>>
>>101965446
Florence2 is more for specific tasks than accurate captioning.
Joy Caption sucks tho, too many errors.
>>
File: FluxDev_01668_.jpg (200 KB, 832x1216)
>>
>>101965444
again, here. it's all there
>>
>>101965446
joycap is just boilerplate as far as I can tell, llama3.1 is doing the work
try another vlm with it, gemma or nemo
>>
>>101962774
You bitch
>>
Are LoRA in Comfy and Flux broken?
>>
File: ComfyUI_31974_.png (1.57 MB, 1024x1024)
>>101965394
It's literally INEDIBLE
>>
File: ComfyUI_31689_.png (1.64 MB, 1024x1024)
>>101965444
Try lowering guidance, the pictures will look crappy but ultra soulful.
>>
>>101965418
>>101965370
Imagine a 4x Flux Schnell merge catching up to and surpassing Pro.
>>
File: ifx106.png (1.13 MB, 1024x1024)
>>
File: DalleComparison.jpg (89 KB, 690x483)
>>101965330
>Maybe DE2 was better?
Huh...
>Posts picrel
The problem with DE2 was bad eyes and hands (bad prompt adherence, and no text understanding), but artistically it was ahead of Midjourney 6.
>>
>>101965509
GPUlet hopium (I'm a GPUlet too)
>>
>>101965446
curious about this too
JC is "decent", but not great and doesn't recognise half the stuff in my pics, too many errors
wonder if a llama 70b/largestral version would be any better
need SOMETHING to tag my 30k training images
>>
>>101965486
Works on my machine.
>>
*sips*
Trinart... Now that was a model
>>
File: 00086-935985181.jpg (211 KB, 1600x1344)
>>
>>101965405
This is a highly detailed CGI (computer-generated imagery) rendering of a dramatic canyon scene at sunset. The image captures a narrow, winding river flowing through towering rock formations, with the sun setting in the distance. The sky is a vibrant gradient of orange, pink, and purple, casting a warm glow over the rugged landscape. The canyon walls are jagged and steep, composed of dark grey and black rocks, with patches of green and red vegetation clinging to their edges. The river reflects the sunset hues, creating a shimmering effect in its calm waters. Scattered rocks and pebbles line the riverbanks, adding texture and depth to the scene. The overall mood is one of solitude and grandeur, with the vastness of the canyon emphasized by the diminishing perspective as the eye follows the river into the distance. The image is highly realistic, capturing every detail from the texture of the rocks to the subtle play of light and shadow. The scene is devoid of human presence, enhancing the sense of untouched wilderness.
>>
>>101965486

Works with LoRA strength 1.0-1.6, clip 1. No idea why Flux behaves like that, but as long as it works I guess.
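For anyone wondering what the strength slider is actually scaling: in the standard LoRA formulation the adapter adds a low-rank update to each patched weight, W' = W + strength * (alpha / rank) * B A. A minimal pure-Python sketch with toy matrices follows; the helper name and the numbers are hypothetical, and how ComfyUI applies this internally for Flux is not something this thread confirms.

```python
def apply_lora(weight, lora_a, lora_b, strength, alpha, rank):
    """Return weight + strength * (alpha / rank) * (lora_b @ lora_a).

    weight: out x in, lora_b: out x rank, lora_a: rank x in (lists of lists).
    """
    scale = strength * alpha / rank
    patched = []
    for i, row in enumerate(weight):
        new_row = []
        for j, w in enumerate(row):
            # One entry of the low-rank product B @ A.
            delta = sum(lora_b[i][k] * lora_a[k][j] for k in range(rank))
            new_row.append(w + scale * delta)
        patched.append(new_row)
    return patched


# Toy 2x2 weight with a rank-1 adapter (made-up values).
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 2.0]]    # rank x in
B = [[1.0], [3.0]]  # out x rank

for strength in (0.0, 1.0, 1.5):
    print(strength, apply_lora(W, A, B, strength, alpha=1.0, rank=1))
```

Strength above 1.0 just pushes the same update further, so a LoRA that only shows up at 1.0-1.6 is one whose update is small relative to the base weights.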
>>
File: ComfyUI_00153_.png (2.69 MB, 1248x1848)
>>
File: 00020-128449626.png (1.05 MB, 1024x1024)
>24gb needed to train loras on flux!!

This is unacceptable, there must be an alternative for local training other than buying the latest gpu that costs above $1500
>>
>>101965521
>but artistically ahead of Midjourney 6.
Mjv5 maybe but definitely not MJv6
>>
finna bake rq
>>
>>101965146
Lora? This looks so legit
>>
>>101965547
Train at 512x512, werks on my 16gb vram
>>
>>101965523
Nah, theoretically we'd need to run it at Q4 if something that big fits on 24GB, otherwise we'd have to merge even less. To get an idea of inference cost, SegMoE-2x1-v0 which is 2 SDXL unquantized just barely fits into 24GB.
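The arithmetic behind that can be sketched roughly. This counts weights only and assumes every expert is a full copy of the model; real MoE merges may share some layers, real 4-bit formats spend a bit more than 4 bits per weight on scales, and activations, text encoders and the VAE are ignored. The 12B figure is Flux's published parameter count; the helper name is made up.

```python
def weight_gib(params_billions, bits_per_weight, copies=1):
    """Rough size of the weights alone, in GiB."""
    total_bytes = params_billions * 1e9 * copies * bits_per_weight / 8
    return total_bytes / 2**30


FLUX_PARAMS_B = 12  # FLUX.1 dev/schnell parameter count, in billions

for copies in (1, 2, 4):
    for label, bits in (("fp16", 16), ("8-bit", 8), ("4-bit", 4)):
        size = weight_gib(FLUX_PARAMS_B, bits, copies)
        print(f"{copies}x Flux @ {label}: {size:.1f} GiB")
```

Under these assumptions a 4x merge at 4 bits weighs about the same as a single fp16 Flux, which is why it would sit right at the edge of a 24GB card.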
>>
>>101965444
and yes the guidance and max_shift/base_shift values really need some tinkering.
>>101965519
BRO THIS IS MUCHO
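On max_shift/base_shift: as far as I understand the Flux reference sampling code and ComfyUI's ModelSamplingFlux node, the two values are interpolated linearly by image token count (256 to 4096 tokens) to get a shift `mu`, which then warps the timestep schedule. The sketch below is written under that assumption; the constants and defaults are from memory and the function names are made up, so check the actual node source before relying on it.

```python
import math


def flux_shift(width, height, base_shift=0.5, max_shift=1.15):
    """Interpolate the shift from the number of 16x16-pixel image tokens."""
    tokens = (width // 16) * (height // 16)
    slope = (max_shift - base_shift) / (4096 - 256)
    return base_shift + slope * (tokens - 256)


def time_shift(mu, t, sigma=1.0):
    """Warp a timestep t in (0, 1]; larger mu keeps sampling at high noise longer."""
    return math.exp(mu) / (math.exp(mu) + (1 / t - 1) ** sigma)


for size in (512, 768, 1024):
    mu = flux_shift(size, size)
    print(size, round(mu, 3), round(time_shift(mu, 0.5), 3))
```

Raising max_shift mostly affects large images, raising base_shift mostly affects small ones, so the two are worth tinkering with separately.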
>>
>>101965559
How are you training it?

Any tips on making captions for the dataset and so forth
>>
>>101965559
>Train at 512x512
What is this, 2023?
>>
File: igx109.png (1.45 MB, 1024x1024)
>>101965566
>>101965557
that's imageFX I now have my schizo master prompt
>>
>>101965547
24gb is like 400-500$ using a 2nd hand 3090
>>
>>101965547

>Stock market crash
>He didn't slurp the dip to go on a trip on the rocket ship.

ngmi. You missed your chances again and again. I told you fucks on here to buy Nvidia stocks over and over for free money.
>>
File: 00076-4238289435.jpg (476 KB, 1728x1344)
I don't like how Flux frequently generates a path or river.
>>
File: tod.png (3.18 MB, 1536x1536)
>go on civitai
>click on link

The creator of this asset requires you to be logged in to download it

>rolleyes
>>
>>101965547
>buying the latest gpu that costs above $1500
Used 3090s go for $700-800 on eBay. I got mine (Dell version) from a star seller with 2 year warranty. You think I'd be genning with 24GB if it weren't for that? Kek
>>
>>101965598
>>rolleyes
yup
>>
File: pot of greed copypasta.png (1.2 MB, 1024x1024)
>>101965237
>What does it do
>>
>>101965362
That's very impressive! How did you prompt it?
>>
>>101965544
>>101965530
Seems to work on dev, but not on schnell. I've cranked the lora up to 10 and no dice.
>>101965610
I almost bought one like that from an aussie yesterday, but I hesitated because I also need a new PSU to go with it. I will end up doing it.
>>
File: 2596001517.png (1.17 MB, 768x1344)
>>
File: ComfyUI_00155_.png (3.04 MB, 1248x1848)
>>
>>101965577
https://github.com/bmaltais/kohya_ss/tree/sd3-flux.1
Using this config
https://github.com/bmaltais/kohya_ss/issues/2701#issuecomment-2294833735
I've been experimenting with captioning, not noticing much difference between booru tags and natural language
>>
>>101965588
>24gb is like 400-500$ using a 2nd hand 3090
not outside the US
>>
>>101965587
ACTIVATE SCHIZO MASTER PROMPT
very nice really
>>
>>101965624
>but I hesitated

I did so many times too. Then I found the perfect deal and you do not ever regret a 3090 purchase. For me the only other alternative at the time was a new one for $900 on StockX but their shipping costs were insane. I also recommend checking Amazon for used GPUs, every now and then someone lists a good deal in the $700-800 range but they sell out fast.
>>
>>101965620
I cheated.
>photo of a hot dog perpendicularly sliced in half, only one half is present on the picture, the visible inner side looks messy, like the missing half was bitten off, with pink insides of the sausage and white bread texture of the bun visible, the sausage is covered in shitty cheap mustard and disgusting white mayo, there are crumbs lying around
>>
does Imagen 3 mog Flux?
>>
>>101965446
This has given me the most detail by far, especially when the image contains text:
https://aichatonline.org/gpts-2OToA97Vhr-Describe-Image
Unfortunately, Flux doesn't generate close matches of the original pictures with the outputs.
>>
>>101965682
I've got my eye on a refurbished one for 1k EUR (in my country, new are in the 1.6k range, and second hand they are 800 the cheapest).
>you do not ever regret a 3090 purchase
Thanks for that.
>>
>>101965682
Also it's still around kek https://stockx.com/nvidia-geforce-rtx-3090-graphics-card-founders-edition-black?country=US&currencyCode=USD

(but the shipping cost is bait and there is no warranty)
>>
File: 1721524680591394.png (1.01 MB, 1024x1024)
>>101965398
and the inverse: miku genning pepe
>>
>>101965197
I have this mouse
>>
>>101963441
>Wishing to be someone with deepfried gens who shills on reddit
Kek
>>
>>101965551
MJ6 is enhanced at the prompt level, as they look at what you're doing and apply artistic filters, but with a similarly enhanced prompt it would easily be DE2 > MJ6.
>>
>>101965587
>imageFX
Prompt? Never used it before
>>
>>101965685
I wouldn't know, I've only killed communists.
>>101965704
YES
>>
File: file.png (8 KB, 808x49)
Yes, Forge. You do that.
>>
>>101965690
This is using GPT-4.
>Rate this tool 20.0 / 5 (200 votes)
off the charts
I sent a spicy image and got my gen quality roasted
>The woman is smiling and pulling open her top to expose her breasts, which are clearly unrealistic and appear to be digitally edited.
>>
>>101965690

Joy Captioning is still the winner for my pics.
>>
zPDXL3 is so good, I think it finally surpasses listing score_9, score_8, score_7, etc.
zPDXL2 was either on par or slightly worse than listing the scores.
>>
>>101965508
Catbox?
>>
>>101965172
this is pretty, I like it
>>
>>101965742
>A 1990 Nagoya sunny japan magazine scan with large text, screencap sharp still photograph chrome platsic pastel transparent textured 1990 science fiction movie crisp weathered spectral larvae macro boiled creature mechanised hybrid smoke dark tsukamoto on board tantive IV
>>
>>101965769
Forge's code is sooo sloppy
>>
File: Half-eaten--hot-dog.png (699 KB, 1280x720)
>>101965684
You, prompt engineering badass!
>>
for captioning character training pics, can you put that somewhere in the prompt, or do you need to go through them all and replace "a person/a man/a woman" with "[character name]"
this was never a problem with WD
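If the captioner can't be told the name up front, the replacement can at least be scripted instead of done by hand. A minimal sketch, assuming the captions are plain strings (for example read from .txt files next to the images) and that the first generic subject in each caption is the character; the function name and trigger word are hypothetical.

```python
import re

# Matches a generic subject such as "a woman", "A man" or "the person".
_GENERIC_SUBJECT = re.compile(r"\b(?:a|the) (?:person|man|woman)\b", re.IGNORECASE)


def name_subject(caption, character_name):
    """Swap the first generic subject in a caption for the character's name."""
    # A function is used as the replacement so the name is inserted literally.
    return _GENERIC_SUBJECT.sub(lambda m: character_name, caption, count=1)


captions = [
    "A woman with teal hair sits beside a man.",
    "The photo shows a person holding a hot dog.",
]
for caption in captions:
    print(name_subject(caption, "miku"))
```

Only the first match is replaced, so secondary people in the image keep their generic description; captions with no match are returned unchanged and can be reviewed separately.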
>>
>>101965704
Kek.
>>
>>101965738
absolutely not, have you actually looked at DALL-E 2 images?
MJ's stylize can be turned all the way down btw
>>
>>101965798
I don't think it matters, it will just turn every body into that body anyway
>>
baked, one minute..
>>
File: ComfyUI_31979_cleanup.png (1.55 MB, 1024x1024)
>>101965782
https://files.catbox.moe/jshla2.png
>>
>>101965627
Nice Scespa.
>>
File: ComfyUI_01694_.png (1.43 MB, 1536x640)
>>101965732
I'm not that anon, good try though!
>>
File: 00080-4238289439.png (3.65 MB, 1728x1344)
>prompt coyote
>get fox
>>
>>101965816
Thanks Robert!
>>
>>101965816
Hello?
>>
>>101965816
JESUS CHRIST, ROBERT
>>
AAAAAAAAAAAAAAAAAAH ELDEEGEE IS KILL
>>
>>101965796
is that flux dithering to go with the flux gridlines?
>>
https://files.catbox.moe/yw3mw5.png
Whoa whoa whoa
>>
Calm down, I'm baking rn
>>
Blame Robert...
>>101965917
>>101965917
>>101965917
>>
>>101965819
>>101965929
Yum.
>>
>>101965929
We're trying to generate half-baked hot dogs over here.
>>
>>101965819
Thank you


