/g/ - Technology


Thread archived.
You cannot reply anymore.




File: tmp.jpg (1.18 MB, 3264x3264)
Discussion of free and open source text-to-image models

Previous /ldg/ bread : >>102594244

>Beginner UI
Fooocus: https://github.com/lllyasviel/fooocus
EasyDiffusion: https://easydiffusion.github.io
Metastable: https://metastable.studio

>Advanced UI
Forge: https://github.com/lllyasviel/stable-diffusion-webui-forge
Automatic1111: https://github.com/automatic1111/stable-diffusion-webui
ComfyUI: https://github.com/comfyanonymous/ComfyUI
InvokeAI: https://github.com/invoke-ai/InvokeAI
SD.Next: https://github.com/vladmandic/automatic
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI

>Use a VAE if your images look washed out
https://rentry.org/sdvae

>Model Ranking
https://imgsys.org/rankings

>Models, LoRAs & training
https://civitai.com
https://huggingface.co
https://aitracker.art
https://github.com/Nerogar/OneTrainer
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts
https://github.com/kohya-ss/sd-scripts/tree/sd3

>Flux
https://huggingface.co/spaces/black-forest-labs/FLUX.1-schnell
https://comfyanonymous.github.io/ComfyUI_examples/flux

>Pixart Sigma & Hunyuan DIT
https://huggingface.co/spaces/PixArt-alpha/PixArt-Sigma
https://huggingface.co/comfyanonymous/hunyuan_dit_comfyui
Nodes: https://github.com/city96/ComfyUI_ExtraModels

>Index of guides and other tools
https://rentry.org/sdg-link
https://rentry.org/rentrysd

>Try online without registration
txt2img: https://www.mage.space
img2img: https://huggingface.co/spaces/huggingface/diffuse-the-rest
sd3: https://huggingface.co/spaces/stabilityai/stable-diffusion-3-medium

>Maintain thread quality
https://rentry.org/debo

>Related boards
>>>/aco/sdg
>>>/aco/aivg
>>>/b/degen
>>>/c/kdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/tg/slop
>>>/trash/sdg
>>>/u/udg
>>>/vt/vtai
>>
>>102610162
>>102610199
It's the fall of COGTISM BOYS
>>
>>102608673 >>102609975 (repost for visibility)
>https://github.com/THUDM/CogView3
>https://github.com/THUDM/CogView3/blob/main/sat/README.md
new chinese 3b model with 16ch vae and t5, has a distilled model too for 4 step and 8 step
>>
>>102610305
https://github.com/THUDM/CogView3/blob/main/sat/README.md
>CogView-3-Base-3B-Relay
>Relay
What's that?
>>
Blessed thread of frenship
>>
File: BigLust_06128_.png (2.22 MB, 1080x1576)
>>102610305
is it any good?
>>
File: Untitled-2.png (303 KB, 660x1316)
>>102610322
>>102609941
here is an example of such an input image, maybe I'll make a workflow demonstrating it a bit later
>>
>>102610545
>16ch VAE
>Not distilled
>Apache-2.0 license
Can be very good
>>
What's the best SDXL model for accurate IPAdapter face cloning? I've noticed that some models have better results than others.
>>
Chicoms vs. Silicon Valley Venture Capital. Who will win it all?
>>
are we back?
>>
>>
File: 0.jpg (406 KB, 2048x1024)
>>
>>
>>102610669
Careful man, resident trannies mass report anything that looks like a real woman
>>
>>102610669
that's how i sit when i want to fart without making a sound
>>
File: 1114592025.png (769 KB, 891x715)
>>
>>102610796
KEK
thanks you just relieved the pain i endured having to see these gens
>>
>>102610796
More like a puff sound?
>>
>>
why's there so many 1girls these past few threads wtf is going on
>>
>>102610869
/g/ is aids and dying from it
all the good threads are scattered around 4chan, /b/ /aco/ etc.
>>
>>102610545
How did you test that out? and what model did you test?
>>
>>
emu2 and cogview3 could be local SOTA but we wouldn't know because no vram rich tried it
>>
>>102610869
I bet you can't replicate any of them
>>
>>102610917
What model is this?
>>
>>102610919
>replicate
i don't use replicate, i run my models locally, buddy
>>
>>102610918
>emu2
https://baaivision.github.io/emu2/
>37b parameters
holy shit

> cogview3 could be local SOTA but we wouldn't know because no vram rich tried it
what? it's just a 3b model, no need to be a vram chad to run that
>>
this kills the fluxfag
>>
>>102610958
sniff sniff. please stop stinking up the thread.
>>
I would try to run the model but mom says that's enough computer time today
>>
>>102610977
God damn. I think I know what image set was used to train this.
>>
>>102610986
tell her that you can make her facebook pictures really pretty with the new model
>>
>>102610808
intriguing
>>
did you know that flux can't generate soles nor feet?
>>
I thought /ldg/ was the tech talk general.
>>
>>102611077
>was
>>
File: 2105050044.png (1.3 MB, 832x1216)
>>
>>102611077
What makes you say it's not?
>>
>>102611077
Yes and? Realistic waifus are always welcome.
>>
>>102610977
impressive feet for a base model, those chinks are way less afraid of female anatomy than the cucked west that's fore sure
>>
>>102611059
Flux users have been replacing oxygen by copium gas for quite some time now, we are at risk of a shortage.
>>
>>102611117
maybe if their knees didn't look like a man's scrotum >>102610842
>>
>>102611142
They can go red if you spend time on your knees, you should know
>>
>>102611142
If that is the new model, you can make them look like anything you want, with training, as the model is not distilled.
>>
>>102611174
they'll go red once i knee your face in
>>
>>102611186
>If that is the new model
no i think that's just sdxl
>>
I suffer from a cogmatism
>>
>>102611186
>If that is the new model, you can make them look like anything you want, with training, as the model is not distilled.
why do you still believe we can't finetune distilled models? the thousands of loras that are on civitai disprove your point
>>
>>102611059
what model is that anon?
>>
>>102611339
nta but i've been hearing about flux collapsing when trying to attempt any serious finetune because it's distilled
>>
>>102611339
loras =/= finetune , stop saying that a shitty lora is equal to training a model

>why do you still believe we can't finetune distilled models?

because you can't thats why, the distilled model immediately gets overfit with whatever you train on, thats why all those "finetunes" published look like crap, stop with the copium please, it has been months already and there isn't a single flux finetune
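
For reference, the size gap between a LoRA and a full finetune can be sketched in a few lines. This assumes the standard low-rank update (W + B·A); the 3072 hidden size and rank 16 are illustrative numbers, not taken from any model named in this thread, and the sketch says nothing about whether a distilled model survives either kind of training.

```python
def full_params(d_out: int, d_in: int) -> int:
    """Trainable parameters when the whole weight matrix is finetuned."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters for a LoRA update W + B @ A.

    B has shape (d_out, rank) and A has shape (rank, d_in).
    """
    return rank * (d_out + d_in)


# Hypothetical layer size and rank, chosen only for illustration.
hidden = 3072
rank = 16

full = full_params(hidden, hidden)
lora = lora_params(hidden, hidden, rank)
print(f"full finetune: {full:,} params per layer")
print(f"LoRA rank {rank}: {lora:,} params per layer ({lora / full:.2%} of full)")
```

The point is only that a LoRA touches a small low-rank slice of each layer, which is why "thousands of LoRAs exist" and "full finetunes are hard" can both be true.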
>>
>>102611384
>, it has been months already and there isn't a single flux finetune
how long did we have to wait for SDXL to get a seriously good finetune? months too
>>
File: 00718-3575345261.jpg (1.13 MB, 2160x1720)
>>
>>102611396
way less than flux already, give it up, Flux users just hit the wall and there is nothing they can do about it but cope
>>
>>102611444
>way less than flux already, give it up
why do you love to lie like that? SDXL didn't have anything serious 2 months after its release, only Pony saved its ass
>>
>>102610977
>>102611059
>>102610958
model?
>>
>>102611470
lol what? there were dozens of good models released before Pony, what are you talking about?

sdxl had finetunes in less than a month after its release
>>
>>102611568
>sdxl had finetunes in less than a month after its release
they were shit anon, the fuck are you talking about? you think you can make great finetunes only 1-2 months after the release of a base model? I accept the fact people are wary of finetuning Flux because it's a big ass motherfucker that's asking for too much vram, or because of its licence, but it will never be because it's a "distilled model" or whatever nonsense you're trying to spout
>>
File: 0.jpg (200 KB, 2048x1024)
>>
>>102611591
You should compare FLUX with SDXL-Turbo the distilled version of SDXL 1.0.
>>
>>102611660
Flux dev isn't distilled as aggressively as SDXL-Turbo, it's still asking for at least 30 steps to get good images, if no one told me this was a distilled model I wouldn't even have noticed, Schnell on the other hand seems to be too fucked to be saved yeah
>>
will we ever get a decent control net for fucks dev?
>>
>>102611768
Learn to photoshop.
>>
>>102611774
nice model
>>
>>102611768
I want PuLID to work on ComfyUi so that we can make celebrities doing funni stuff and shit, it's advancing but still not there
https://github.com/cubiq/PuLID_ComfyUI/issues/69#issuecomment-2381248954
>>
>>102610305
>We have open-sourced CogView3 and CogView-3Plus-3B. CogView3 is a text-to-image system based on cascaded diffusion, using a relay diffusion framework. CogView-3Plus is a new series of text-to-image models based on Diffusion Transformers.
So basically only the "Plus" one matters because it's on the best architecture we have so far?
>>
>>102611689
DEV is the one everyone is training and It still collapses around 10000 steps depending on how much you are trying to change it.
At most you can overwrite most of what people consider good of the model's knowledge, training with a huge dataset, to at least get a usable base model with a 16ch VAE. But no one knows the training code the devs used during training so you can't expect to reach the same level even with the same dataset they used, which we also don't know. Also distillation can, and will, be used to hide the original model's SOTA architecture so what we have is not even the original architecture. (https://youtu.be/vyJy-0zBSQ0?t=396)
But still, I really wanted to see people trying to finetune it so we can at least put an end to it once and for all.
>>
>>102611985
>Also distillation can, and will, be used to hide the original model's SOTA architecture so what we have is not even the original architecture.
I didn't know that. How does that work? Why can't we look at the architecture that's inside when you distill it?
>>
File: 0.jpg (155 KB, 1024x1024)
>>
File: 00253-AYAKON_124827379.jpg (1.28 MB, 3072x2048)
Thalia from MTG
>>
>>102612060
Metal Tear Golid?
>>
File: 3253979448.png (1.24 MB, 896x1152)
>>
>>102612104
i'll let this 1girl slide
>>
>>102610305
Still no one trying this?
I admit the 512 is a turn off, but I assume it could be finetuned to 1024
>>
>>102612132
why don't you try it and report back
>>
File: file.png (269 KB, 2568x1492)
>>102612132
>the 512 is a turn off
that's why there's the "Plus" one we can try and that one can go up to 2048
>>
>>102612132
>I admit the 512 is a turn off
that's just the relay model >>102609975
>>
>>102612167
base* oops
>>
>>102612151
>one we can try and that one can go up to 2048
But it still says that the Height can't go further than 512, that's bullshit lol
>>
>>102612141
I'm a vramlet and have to wait for it to work on less gb
>>102612151
>>102612167
Well shit that's even better
>>
>>102612151
why does plus use 20g but the distilled version uses 64g? Im retarded this chart must not mean what I think it means.
>>
>>102612186
>But it still says that the Height can't go further than 512
you sure? in the memory usage bit they list 1024*1024 and 2048*2048
>>
File: 2084350819.png (1.24 MB, 768x1344)
>>102612115
are you not generous
>>
>>102612186
30G (2048 * 2048)
20G (1024 * 1024)
I assume that's Height x Width, 2048 * 2048
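
A rough sketch of why the requirement climbs with resolution. It assumes an 8x VAE downscale and 2x2 latent patches, which is common for diffusion transformers but is an assumption here, not something confirmed for this model; `latent_tokens` is a made-up helper name.

```python
def latent_tokens(height: int, width: int,
                  vae_downscale: int = 8, patch_size: int = 2) -> int:
    """Number of transformer tokens for one image.

    Both defaults are assumptions (8x VAE, 2x2 patches), not values
    taken from the model's documentation.
    """
    step = vae_downscale * patch_size
    if height % step or width % step:
        raise ValueError(f"height and width must be multiples of {step}")
    return (height // step) * (width // step)


for side in (512, 1024, 2048):
    print(f"{side}x{side}: {latent_tokens(side, side)} tokens")
```

Under those assumptions 2048x2048 carries four times the tokens of 1024x1024. The weights themselves stay the same size, so only the activation part of the listed 20G and 30G figures would grow with resolution.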
>>
>>102612219
>>102612288
why does it ask for so much vram? it's only a 3b model? Flux can run on a 24gb card yet it's a 12b one
>>
File: 3687510830.png (1.07 MB, 896x1152)
>>
>>102612393
I'm really confused by that too. Maybe once it's converted from sata or whatever? I'm guessing we can use the gguf'd t5 with it and that'd bring down requirements

I don't really understand what sata is and am a brainlet so I'll wait I guess
>>
>>102612393
>Flux can run on a 24gb card yet it's a 12b one
no it can't, if you include the text encoder that's 22gb (Flux) + 9gb (t5 + clip_l) + 2gb (resolution) + 1gb (vae)
>>
>>102612474
>22gb (Flux) + 9gb (t5 + clip_l) + 2gb (resolution) + 1gb (vae)
if the 5090 comes out with 32gb it means it still wouldn't be able to run the whole thing by itself, that's craaaaazy
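
Adding up the quoted numbers as a quick sanity check. The per-component figures are the ones from the quoted post, taken at face value; the card sizes are just examples and the dictionary keys are made-up labels.

```python
# Figures in GB, copied from the quoted post (not measured here).
BUDGET_GB = {
    "flux_dit": 22,
    "t5_and_clip_l": 9,
    "resolution_overhead": 2,
    "vae": 1,
}

total_gb = sum(BUDGET_GB.values())


def fits(card_gb: int) -> bool:
    """True if everything stays resident in VRAM at once, with no offloading."""
    return total_gb <= card_gb


print(f"total: {total_gb} GB")
for card in (24, 32, 48):
    print(f"{card} GB card: {'fits' if fits(card) else 'does not fit'}")
```

This only covers the case where every component sits in VRAM simultaneously; unloading the text encoders after encoding the prompt changes the picture.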
>>
>>102612393
>>102612423
People are running FLUX with over 9000 memory optimizations built one after another since its release.(gguf, nf4, offload to ram, etc)
They can probably be used, without too much difficulty, in this model too.
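
A back-of-the-envelope sketch of what those optimizations buy on the weights alone. It uses nominal bit widths; real gguf and nf4 files store extra scale data per block, so actual sizes come out somewhat larger, and activations are not counted at all.

```python
def weights_gb(n_params: int, bits_per_weight: float) -> float:
    """Approximate weight size in decimal GB at a nominal bit width."""
    return n_params * bits_per_weight / 8 / 1e9


# Parameter counts as quoted in the thread (12b for Flux, 3b for the new model).
MODELS = {"12b": 12_000_000_000, "3b": 3_000_000_000}

for name, n_params in MODELS.items():
    for bits in (16, 8, 4):
        print(f"{name} at {bits}-bit: {weights_gb(n_params, bits):.1f} GB")
```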
>>
>>102611479
its probably biglust
but it can also be lustify, natvis or pornworks
>>
on google colab w/gradio, how do you prevent colab from timing out from inactivity when you're focusing on the gradio tab? i remember seeing some colab notebooks use a silent music track or something to trick colab into thinking its active, does that work?
>>
https://open.bigmodel.cn/dev/howuse/cogview
Someone should test it
>>
File: _0553.png (752 KB, 1024x1024)
gonna have to rebake this lora, it seems a bit confused and hands are pretty fried. don't mind the new AdEMAMix optimizer but I'm probably going to stick to prodigy because it's just too much work to figure out the 'best' LR for each lora when flux loras are 12 hour bakes for me
https://files.catbox.moe/4aa2fe.png
>>
File: file.png (90 KB, 2043x267)
>>102612989
looks like it's on par with flux dev while being only a 3b model and Apache 2.0? That's crazyyy, I wish there was a demo page so we can test that out, I'm not very trusting of benchmarks
>>
>>102613043
The worst part is that even if it is actually on par, every dev just blew their load on flux, how many of them are going to recommit the same energy to optimizing this one in any timely manner?
>>
>>102613073
>every dev just blew their load on flux, how many of them are going to recommit the same energy to optimizing this one in any timely manner?
what do you mean by that? no one made a serious finetune of dev so it's not like they have to leave anything behind, they didn't learn how to master flux training in the first place. for me it's a no brainer: if a 3b model performs as well as a 12b model while being a non-distilled model with an apache 2.0 licence, it will easily replace flux Dev. but like I said, we have to test that out before drawing any conclusions, the vram requirement is insane for such a tiny model and I have no idea why that's the case
>>
File: file.png (81 KB, 2469x514)
>>102610305
this is a gigachad move, reminiscent of the old SAI days, if their models are good then we're being saved by them, and not by BFL
>>
>>102613112
Not bakers, devs. People ggufing it, making comfy nodes, kohya making loras work for it, etc. Many of those kinds of devs have day jobs and don't make extreme updates like they did for flux, so doing it twice in a short period seems taxing to me
>>
File: file.png (69 KB, 2978x715)
>>102610305
the whole model (DiT + text encoder) is 7.8gb of vram, that means the text encoder is kinda small, probably less than 2gb of vram, that's not a good news, T5 is far from perfect and this shit is 9gb big
>>
>>102613165
the beauty of opensource is that no single person has to do all the work, if they don't want to do it, someone else will make a fork or a PR. that always depends on the attractiveness of the model, we'll see. the licence is good, the benchmarks are good, if in reality that's a model that understands prompts well and produces good images, then it'll be hyped, like flux has been hyped
>>
>>102610305
>no HF
so trash
>>
>>102613180
>the whole model (DiT + text encoder) is 7.8gb of vram, that means the text encoder is kinda small
I don't think so
>>
china model status?
>>
>>102613708
no one has tried it but we all have high hopes as it looks amazing on paper
someone start shilling it to kohya, city96, comfy, etc
>>
>>102611059
It can't generate proper hands either.
>>
>>102613043
>falling for meme data
>>
>>102610305
>>102610918
If they were any good then the modelmakers themselves would post example images.
>>
>>102613358
oh ok it's still the T5 model they're using
>>
>>102613835
Not to mention, read the papers themselves. They are stuck in SDXL era/quality of models. When a model comes out the first question that needs to be asked is 1) Is it Flux tier?
For a paper to be good many parameters need to be tested, from overall concept knowledge to how good the model is with text.

2) Can it do hands, feet, and does it understand the human body?
>>
File: Result_00001_.png (1.1 MB, 1024x1024)
I have finally figured out how to use comfyui without making abominations.
Now to figure out how to use flux, or if I even need to use it.
>>
is fp16 or bf16 faster?
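
Speed depends on the GPU and the kernels, so no code here can answer that in general. What can be sketched with the standard library alone is how the two formats differ numerically: fp16 keeps more mantissa bits, bf16 keeps the float32 exponent range. The bf16 helper truncates instead of rounding to nearest, which is a simplification, and both function names are made up for this example.

```python
import math
import struct


def to_fp16(x: float) -> float:
    """Round-trip x through IEEE 754 half precision (5 exponent, 10 mantissa bits)."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except (OverflowError, struct.error):
        # Python refuses to pack values beyond the fp16 range; report overflow as inf.
        return math.copysign(math.inf, x)


def to_bf16(x: float) -> float:
    """Round-trip x through bfloat16 by keeping the top 16 bits of a float32.

    Truncation is used here for simplicity; hardware usually rounds to nearest.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]


for value in (1.001, 70000.0):
    print(f"{value}: fp16={to_fp16(value)} bf16={to_bf16(value)}")
```

Small values keep more detail in fp16, while large values overflow in fp16 but survive (coarsely) in bf16. Which one runs faster on a given card still has to be measured.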
>>
if I can get it to work on 12GB vram with the gguf'd t5 I will test the cogview model tonight once I'm home.. if not lol, lmao, oh well
>>
>>102613043
>>102613908
If those parameters are not tested you are wasting your time.
https://www.reddit.com/r/PhD/comments/1f6f0n9/apparently_data_manipulation_is_really_common_in/

Chinks are masters of deception.
>>
>>102613966
if only we had the ability to verify them somehow, SOMEHOW... sad days for /g/
>>
>>102610271
What LORA is the bottom left?
>>
>>102614014
I'm not verifying them because I don't believe nor trust them. Their example images look like shit and that's enough for me to toss the model in the trash. The Chinks would be very loud and clear on their website (as they have been with their video models) if they made a breakthrough.
>>
>>102614014
>if only we had the ability to verify them somehow
the fact they haven't provided a demo is really sus, when flux was released we at least had the chance to try out a demo to see how great it actually was, a good demo motivates people to implement it in more regular setups like Forge or ComfyUI
>>
>>102613910
LORA? Really digging all these simple style ones.
>>
>>102614062
it looks like they are still working on their 'plans', maybe that is why? I don't know. I don't understand what the point of providing it at all is if they're just trying to scam its capability, padding their resumes?
>>
>>102614087
>I don't understand what the point of providing it at all is if they're just trying to scam its capability, padding their resumes?
they're trying to scam investors I guess
>>
>>102614097
wouldn't investors not want the apache 2.0 license?
>>
Only reason I haven't tested it is that I'm not at home right now. I admit it seems 'too good to be true' but I refuse to dismiss it without trying.
>>
File: file.png (3.15 MB, 2311x1272)
>>102610286
https://arxiv.org/pdf/2403.05121
So you're telling me that SDXL can't do a "Green apple and a black backpack"? really?
>>
>>102614166
I believe it, SDXL bleeds a lot and has shit nat language understanding. you could probably get it with enough seed rerolls but this isn't blatant proof of dishonesty to me
>>
>>102614144
Duuuuude, don't do this. Do you have any idea how many FLUXIANS are going to rope themselves if it works?
>>
File: file.png (397 KB, 2349x935)
>>102614166
kek, at least they know that training a model with boomer prompting also forces us to go for boomer prompting
>>
>>102614208
>Do you have any idea how many FLUXIANS are going to rope themselves if it works?
what? I'm a flux fan but if there's a better model that appears I'm jumping the ship, I'm a local SOTA fan first of all
>>
Is there any reason we can't make a huggingface repo ourselves to test this on? I'll pay the $9 or whatever for huggingface premium if its easy to do, I've never really used HF though
>>
>>102614230
chatgpt seems to think its pretty easy so I'm going to try, I'll be back if I manage to set it up (code+brainlet, pray for me)
>>
>>102614208
Why would they rope? It would just be a new toy to play with.
>>
so where are the all flux gens? this whole thread looks like 1.5/xl sloppa. finally realize that finetunes aren't coming and the base model is far too censored and slopped to really be creative with? hopefully pixshart will be better, or else it's text-on-signs toony gimmick models until the end of time. why are new models far less creative and artistically interesting than older ones?
>>
>>102614277
i pray
>>
>>102614166
There's not a single mention of CogView-3-Plus in this paper, that's really sus, if that's the best base model, why aren't they talking about it?
>>
File: Result_00002_.png (1.19 MB, 1024x1024)
>>102614081
https://civitai.com/models/389661/adventure-time-style-ponydiffusionxl
>>
i miss model wars
>>
File: 1725618998.png (1.04 MB, 1024x1024)
>>
>>102614296
I still believe that Tencent will release an endgame model after Hunyuan. They did not fail to deliver on aesthetics or pretraining quality, it simply needs some more parameters and a better architecture, since it wasn't the best at prompt following.

Hunyuan is still the best anime model out right now that has been ignored by many.
>>
>>102614384
>Hunyuan is still the best anime model out right now that has been ignored by many.
isn't it because it's only understanding chink and it's run on a deprecated architecture?
>>
>>102614416
Nah. It's bilingual. https://imgur.com/a/hunyuandit-0vrZEn0
>>
File: ComfyUI_34202_.png (1.57 MB, 848x1024)
>>102614030
https://mega.nz/folder/DwEjQJSI#udon_Z-X99ZHJ4IwCfh_hg
>>
>>102614416
>>102614445
And of course, by best anime model, I mean in terms of aesthetics and understanding of Japanese anime and manga artists. Flux is still ahead in overall scene coherence, though this is very close to Flux in coherence when it understands the prompt.
>>
File: ComfyUI_34230_.png (1.32 MB, 848x1024)
>>
>>102614296
I CAN’T POST THEM BECAUSE THERE SOMEHOW STILL ISN’T A RED /AI/ BOARD
https://files.catbox.moe/rgaqzh.jpeg
>>
>>102614733
>THERE SOMEHOW STILL ISN’T A RED /AI/ BOARD
wym? there's like a billion of them, i think there's one on /aco/ for realistic gens
>>
>its going to take an hour to download the 7gb model from this chinq website
jesus fuck
>>
>>102614767
>futashit
Degeneracy
>>
>>102614793
watch this, i'm going to accelerate your download speed
>>
>>102614825
w-why did it unironically jump to 10 minutes left when I read this comment
c-chinq overlords, a-are you with me?
>>
>>102614848
now watch this, i'm going to suck your dick
>>
>>102614956
AAAAHHHHHHHHHHH!!!!!
>>
>>102614966
watch this, i'm going to make a cockroach fly at your face
>>
>>102614733
I was hoping for a blue ai board, myself. I find being too explicit is distasteful and actually less erotic. When people just start posting bare tits and pussies it gets gross and I feel an urge to leave. Anons make way better horny gens when they're constrained to keep it ""worksafe"".
>>
>>102615027
you sound american
>>
>>102615027
I agree but you have to make it red or the problem isn't resolved of containing all AI stuff to one board.
>>
>>102615027
You make it red specifically and explicitly to keep zoo animals like you out.
>>
>>102614277
it is taking a very very long time to upload the models to huggingface, but I think I understand the rest of what I need to do in spaces to get it to work once they have finished. I only am uploading the 3b-plus model with full t5 for now, we can worry about optimizations when we are running it locally
>>
>>102615068
>you sound american
Can you tell by the strength and optimism in his post? Americans are strong and optimistic.
>>
File: 1726710168.png (1.09 MB, 1024x1024)
>>
File: 00000-1521094798.png (1.58 MB, 1600x1600)
play my game
>>
>>102615595
I am cute and funny.
>>
>>102615756
pedos out
>>
File: 0.jpg (605 KB, 2048x1024)
>>
>>102615748
Red sky at night.
>>
>>102616044
show me your worm kid
>>
>>102615612
Scared the shit out of me
>>
>>102616068
https://en.wikipedia.org/wiki/Heikegani
>>
>>102616044
ill suck your dick
>>
>>102616139
no homo, man. sorry.
>>
meow
>>
File: 00011-541769527.png (1.07 MB, 832x1216)
>>
File: 00013-420522417.png (911 KB, 832x1216)
>>
>no bigma
>>
>>102614450
adorable, thank you
>>
it's the last day of pixart pride month
>>
Chinaman pride eternal
>>
File: 00100-3182290056.png (1.78 MB, 1152x1632)
>>
>drops model
>with no demo
thats a big yikeronis from me dawg
>>
best option for image captioning, for flux prompts?
>>
>>102616521
no problem
>>
i hate black forest labs
>>
>>102617347
whycome?
>>
>>102617361
they don't love us
>>
French > German
>>
the french don't exist in image gen
>>
Chinese > French > German
>>
>>102615748
nice
>>
>>102617422
>the french don't exist in image gen
the funniest part is that the US don't exist on image gen
>SAI -> UK
>BFL -> Germany
>Pixart/Hunyuan... -> China
>>
File: IMG_0347.jpg (86 KB, 1024x1024)
Idk why, but letting things overtrain until they just make abstract garbage is more fun than doing it properly. Like even once I get what I want, I’ll let it run forever and turn into mush as a treat.
>>
>try to put cogview3 on hf
>wait all night for the files to upload to HF to use
>hf space is now stuck on building... and won't proceed, restarting does nothing
thanks, I hate it
>>
>>102617898
I think it's a shit model, no demo = bad, first of all if they had something insane they would've kept for themselves, like they did for Kling, and usually when you know you're a genuine local SOTA you're screaming it everywhere, in this case it's like they don't want people to try it to get negative feedback and scare the investors or something :^)
>>
>>102617898
watch this, i'm gonna make everything go smoothly without a hitch
>>
>>102617931
holy! that increased my motivation to try it!
>>
>>102618017
I didn't expect to do reverse psychology, that was genuine, but go for it anon, I want to see some non cherry picked pictures and see if that's good enough to me to care kek
>>
File: file.png (561 KB, 720x396)
>>102617961
>watch this, i'm gonna make everything go smoothly without a hitch
good luck anon
>>
>>102617931
>usually when you know you're a genuine local SOTA you're screaming it everywhere
Like what BFL did with FLUX? :)
>>
File: file.png (75 KB, 2497x237)
>>102618060
>Like what BFL did with FLUX? :)
yes, that's what they did, they claimed their model was SOTA, and they're right
https://blackforestlabs.ai/
>>
>>102618071
Where is the "screaming it everywhere"?
>>
>>102618071
>and output diversity
Sure... knowing how to render only Trump and Migu sure is diverse...
>>
>>102618071
a state of the art AD! grrrr damned black forest labs!!!! WRRRAARRGHHHH!!!!
>>
>>102618076
Where else do you want them to say that they're sota, they also said that on twitter
https://xcancel.com/bfl_ml/status/1819004031957713007#m
>Today we release the FLUX.1 suite of models that push the frontiers of text-to-image synthesis.
>>
>>102618095
So?
People were the ones screaming everywhere, not the devs.

ClosedAI is an example of screaming everywhere not BFL.
>>
>>102618107
>People were the ones screaming everywhere, not the devs.
>ClosedAI is an example of screaming everywhere not BFL.
I don't get it, both SAI and BFL said that their model was SOTA on both their site and twitter, what makes SAI so different?
>>
>>102618116
he's talking about openai
>>
>>102618140
oh, I don't think that's comparable though, LLMs have such a societal impact everyone knows OpenAI, the worst image models can do is to replace bad drawing artists, but LLMs can replace the whole tertiary sector, that's a big deal, so of course OpenAI is talked about everywhere, and the CEO is in consequence popular enough to do a lot of interview and therefore talk more about their models and stuff, that's absolutely not the same scale
>>
>>102618116
For BFL no one even knew what they were doing before the release and no one knows what they are doing now because they refuse to speak even when people ask about their confusing license. It's like they live in a cave somewhere and only come out to release something and then go back. I am not saying that is a good thing tho, in fact I think it's just a different kind of bad.
>>
>>102617898
yeah, sorry frens, I'm way too fucking stupid to understand how to fix this. my brain is simply not big enough to make use of that $9
>>
>>102618159
i think he means how much marketing, fear mongering and hyping up openai does, like that whole strawberry thing that just turned out to be a chain-of-thought finetune of gpt4. in image gen they don't really do much marketing, they just announce a new model and that's about it
>>
>>102618182
why won't you ask claude or chatgpt to fix this shit?
>>
>>102618184
yeah I'm starting to see his point, but I'm still focusing on the context, LLMs and image models are 2 different worlds, it's like complaining about tennis players being too politically correct compared to MMA fighters, they are too different to be compared to begin with
>>
>>102618185
I did, they keep contradicting themselves and what they are telling me to do isn't up to date with HF's interface
>>
>>102618107
I wish mainstream places would call him out for his unhinged schizo fruit shit.
>>
>>102618234
to be fair, OpenAI had the SOTA model for almost 2 years (from the end of 2022 to June 2024) and then Claude 3.5 Sonnet surpassed them
>>
>>102618260
The strawberry stuff was this year after every single person responsible for that left.
>>
>>102618272
I think people left because Sam decided to make OpenAI a for-profit company now, and he'll be winning 150 billion from that lmao
https://cybernews.com/news/openai-restructure-ceo-sam-altman-150-billion-equity/
>>
>mfw no gens for 1 hour
>>
>>
>>102618308
>>102618333
Does this technique still work if the image size is 1024 or greater instead of 512
>>
>>102618293
They probably have been cheated out of their equity.
>>
>>102618351
Probably, but hard to know. I never go that high because I don't have the VRAM for it. Although I also just don't have much desire to make them that large. In SDXL days I used to downsize all my gens to make them smaller after generation.
>>
>>102618351
>>102618363

What technique are you guys talking about?
>>
last one, it's someone else's turn now
>>
>>102618404
see >>102610322 >>102610603
>>
the 1girls end here, PAL!
>>
>>102618418
I wonder what would be the result of training a lora out of these kinds of images.
>>
moar 1girls please im begging you
>>
>>102618463
https://civitai.com/models/652699?modelVersionId=730225
>>
>>102618463
They've done it with models like SDXL, I never found the loras particularly impressive. Usually outperformed by base model.
>>
>>102618475
That lora does nsfw too, so there must be at least some detailed images in the dataset.
>>
>>102618522
>I never found the loras particularly impressive
How so?
>>
File: ComfyUI_05923_.png (1.87 MB, 1024x1024)
>>102618522
>I never found the loras particularly impressive.
Nah, there are some insane loras on flux, this model is so good at reproducing a style, it just never had the chance to show it because of a censored dataset in the first place
>>
>>102618529
On balance I got fewer good gens. They were just bad in the usual way. Reduced range of potential outputs, without sufficient improvement in gen quality to compensate. I felt that I was less likely to get gens I wanted when I used them. What else can I say?
>>
>>102618566
skirt looks painful
>>
>>102618595
yeah looks like a cloth glitch on 3d characters kek
>>
>>102618566
>le insane flux loras
Such as? All flux gens look the same
>>
File: 00325.png (616 KB, 576x1024)
funny hat
>>
>>102618678
are you trying out "the technique"? are you blur-pilled? are you hazemaxxing? it's lookin' good
>>
File: file.jpg (2.02 MB, 3072x4279)
>>102618649
>Such as?
>>
File: images.png (7 KB, 225x225)
>>102618215
>Looks at the board name
>Yep, this is a technology board
>>
>>102618582
>Reduced range of potential outputs

That's a common problem with a lot of loras. Most people have no idea what they're doing.
>>
>>102618721
desu Y2K sucks unless he updated
>>
File: 00327.png (512 KB, 576x1024)
>>102618692
it's neat, need to explore and practice it more. i wonder what other styles can be decently emulated doing it.
>>
>still no CogView3 & CogView-3Plus demo
>>
>>102618729
>desu Y2K sucks unless he updated
I had my fun with it, but yeah it's completely destroying the anatomy
>>
>>102618726
preach it, brother
>>
>>102618721
Those are not “insane”, the only good thing about flux is its text capabilities, all those styles were already done in sdxl
>>
>>102618721
Also that is not 90s anime style at all
>>
>>102618871
What would be an "insane" lora on flux to you? Because you said this: >>102618649
>All flux gens look the same
and I wanted to disprove your theory with that >>102618721
>>
>>102618405
Oye mami
>>
>>102618884
An “insane” lora for flux would be one that doesn’t overfit the model like all loras and “finetunes” do, like you can with sdxl, and what I meant about all flux gens look the same is that they all share the same composition, same look, same everything, the same boring gens over and over, look at the subjects of your collage, they don’t have any depth, they are all positioned in the same perspective, because flux is such a cucked model, it doesn’t have any creativity for something crazy, all flux gens look bland and boring
>>
>>102618927
care to show what "crazy" means in terms of SDXL possibilities with a picture maybe, so that I can see if it's that different from Flux
>>
File: 3db1d4.jpg (2.94 MB, 4096x2048)
>>102618969
this is from over a year ago
>>
>>102619106
was it from an SDXL finetune? because it's a bit disingenuous to compare a finetune to a base model, right?
>>
>>
>>102619115
Kek is from 1.5, also fluxfags should makeup their minds, a few posts ago flux loras were “finetunes”
>>
File: 00335.png (513 KB, 576x1024)
>>
>>102619148
>a few posts ago flux loras were “finetunes”
who said flux loras were "finetunes"?
>>
>>102619161
>>102611339
Literally on the same thread
>>
>>102619254
I don't think you understood the meaning of that message. It was never my intent to say that "Loras" = "Finetune"; my point was that it's retarded to say that it's impossible to finetune a distilled model without trying it out seriously first, and that the fact we managed to get good loras is a good sign that it's possible to finetune it
>>
>>102619106
This is forgotten magic.
>>
>>102619106
>>102619278
Looks inpainted or img2img.
>>
>>102610869
1girls give life meaning
>>
>>102619278
Kek, flux fags can cope all they want, just look at any flux gen posted here or on reddit or twitter or whatever, you will notice there is a big composition problem with that model where it always positions the subject in the center of the frame and if it’s a multi subject, they will get positioned in the same axis but still aligned in the center of the frame, once you have seen a flux gen you have seen them all, is just a boring model, just like the culture of the country of its devs, it’s funny how the model reflects the way germans are
>>
>>102610869
FLUX caused some sort of Streisand effect on people.
>>
The pixart lies in wait, ready
>>
I downloaded flux from civitai, but it doesn't work. Does flux work with reforge or is it only compatible with ComfyUI?
>>
>>102618293
>>102618359
100%. Entities like Sama are why it used to be legal to kill people if you just said they were vampires
>>
>>
>>102619740
It works with regular forge (updated)
>>
It's out.
>>>/h/8233224
>>
>>102620094
waow
>Captcha:P4W0W
>>
Two years ago from this moment, the /sdg/ thread hit image limit in about five hours, and the OP jumped the gun on that one so it was longer than usual. We have 55 images after 15 hours...
>>
File: file.png (2.63 MB, 1188x1526)
>>102620094
https://huggingface.co/OnomaAIResearch/Illustrious-xl-early-release-v0
yep the weights are here, it's looking good
>>
File: 1664505839748.png (950 KB, 896x1280)
>>102620236
Also I see my own post in that thread, picrel. Two years ago...
>>
>>102620257
>it's looking good
damn, imagine Flux with such a high scale finetune, would be insane...
>>
>>102620257
is that the one that got leaked?
>>
>>102620257
>it's looking good
damn, imagine Bigma with such a high scale finetune when it comes out, would be insane...
>>
>>102620257
that's NovelAI that finetuned this right? Why are they giving it to us for free?
>>
>>102620314
>NovelAI
it's not, seems like a different person/group. novelai would rather be skinned alive than help out us little fossites
>>
>>102620257
>>102620257
>it's looking good
damn, imagine CogView3 with such a high scale finetune, would be insane... or maybe it already is insane... nobody knows...
>>
File: file.png (810 KB, 750x1000)
>>102620257
why are they still finetuning SDXL?
>>
>>102620337
>or maybe it already is insane... nobody knows...
yeah, I want to see some fucking examples already
>>
>>102620341
They want to jump on the money train after seeing Pony's success. Their website already has payments set up.
>>
this was supposed to be a photo and flux got it wrong, but I like it
>>
>>
>>102620257
>people do research with anime girls in it
I love this timeline
>>
>>102620375
>They want to jump on the money train after seeing Pony's success.
they would get so much more money if they made a flux finetune, that's all we've been waiting for, not the 145 billionth anime SDXL finetune
>>
File: image.png (1.13 MB, 1024x1024)
>>102620257
Very bad quality. They shouldn't be releasing it this early.
>>
>>102620433
I like how she's giving you the finger like she's pissed at you insulting her quality. A meta commentary, if you will
>>
File: IMG_0353.png (1.39 MB, 896x1152)
>>102620257
>>
>>102620393
Flux dev finetunes can't be monetized, the dev weights are under a non-commercial license
>>
>>102620575
If you prompt no homo or put homo in negs it's not gay that's how it works
>>
>>102620583
what about schnell? still better than fucking SDXL
>>
File: IMG_0354.jpg (152 KB, 640x1536)
>>102620585
The prompt was trying to get him frotting with Dainsleif
>>
>>102620591
Can you tell which is which?
>>
>>102620684
is this a base model vs a base model?
>>
>>102620688
Yes
>>
>>102620692
what was the prompt?
>>
>>102620712
woman with red hair, playing chess at the park, bomb going off in the background
>>
>>102620717
I'll go for dev on the right and schnell on the left?
>>
>>102620725
One is sdxl and one is schnell
>>
>>102620736
wait, are you suggesting that schnell is worse than base SDXL? lol
>>
File: 1716330649139694.png (3.12 MB, 1578x1443)
>>102620433
It's still the same v0.1 checkpoint from a week ago? It doesn't look that bad, these from a /v/ thread have the prompt metadata.
>>>/v/690001089
>>>/v/689999771
>>>/v/689972740
>>>/v/689971913
>>>/v/689969024
>>>/v/690194470
>>
File: file.png (59 KB, 3642x214)
>>102620767
>>102620433
there's a "GUIDED" one that was released recently, maybe that's the one they've been trying out?
>>
>>102620752
They’re the same
Both shit
>>
>>102620863
>Both shit
what base model would you suggest instead then?
>>
>>102620877
Ketamine
>>
>>102620888
kek
>>
china model status?
>>
File: catbox_65gciz.png (2.13 MB, 1632x1152)
>>102620793
The GUIDED one is just a forced SFW version that's discussed in the tech doc that uses LECO. It can be ignored.
>>
>>102620897
already given up on, the few people who tried it on twitter had lackluster results and everyone else got scared off from having to set it up instead of load it into comfy
>>
>>102621069
FUCK!!! FUUUUUUUUUCK!!! FUCK FUCK FUCK FUUUUUUCKKKK!!!!!!!!!!!!!!!!!!
>>
back to the waiting room... the official pixart bigma waiting room..
>>
>>102620257
apparently this one supports lactation
>>
If you like animu one of the hdg tier local bakers just came into compute that's over 2x what Nai has, could be a disaster or could be the animu coomers dream
should include artists/characters if all goes as expected
>>
is there a way to reduce the parameters of flux so that i can no longer generate things like cuckoo clocks, noodle screens, space photos, but can train more efficiently on tits?
somehow generate a thousand women photos, record the activation of the parameters and throw away what is not or hardly activated? finetune on more women, done?
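
what's described above is roughly activation-based structured pruning. a toy sketch in plain Python, assuming a made-up one-hidden-layer ReLU MLP instead of Flux itself; every name here is hypothetical, and a real DiT would need per-layer hooks, a much bigger calibration set, and a recovery finetune afterwards:

```python
def relu(x):
    return x if x > 0.0 else 0.0

def hidden_activations(w1, b1, x):
    # One ReLU hidden layer: h_j = relu(sum_i w1[j][i] * x[i] + b1[j])
    return [relu(sum(w * xi for w, xi in zip(row, x)) + bj)
            for row, bj in zip(w1, b1)]

def mean_activation(w1, b1, calibration):
    # Average activation of each hidden unit over the calibration inputs.
    totals = [0.0] * len(w1)
    for x in calibration:
        for j, h in enumerate(hidden_activations(w1, b1, x)):
            totals[j] += h
    return [t / len(calibration) for t in totals]

def prune_hidden_units(w1, b1, w2, calibration, keep):
    # Rank hidden units by mean activation and keep only the top `keep`.
    scores = mean_activation(w1, b1, calibration)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    kept = sorted(ranked[:keep])
    new_w1 = [w1[j] for j in kept]                   # drop rows of layer 1
    new_b1 = [b1[j] for j in kept]
    new_w2 = [[row[j] for j in kept] for row in w2]  # drop matching columns of layer 2
    return new_w1, new_b1, new_w2, kept

def forward(w1, b1, w2, x):
    h = hidden_activations(w1, b1, x)
    return [sum(w * hj for w, hj in zip(row, h)) for row in w2]
```

units that never fire on the calibration set can be dropped without changing the output on that set; units that fire rarely change it a little, which is what the follow-up finetune is supposed to repair. whether this keeps image quality on a model like flux is an open question, not something this sketch shows.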
>>
nobody baking?
>>
>>102622320
you can bake me
>>
>>102622320
seeing as no one is doing it, here

>>102622356
>>102622356
>>102622356


