Discussion of free and open source text-to-image models

Previous /ldg/ bread: >>101701058

>Beginner UI
EasyDiffusion: https://easydiffusion.github.io
Fooocus: https://github.com/lllyasviel/fooocus
Metastable: https://metastable.studio

>Advanced UI
Automatic1111: https://github.com/automatic1111/stable-diffusion-webui
ComfyUI: https://github.com/comfyanonymous/ComfyUI
InvokeAI: https://github.com/invoke-ai/InvokeAI
SD.Next: https://github.com/vladmandic/automatic
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI

>Use a VAE if your images look washed out
https://rentry.org/sdvae

>Model Ranking
https://imgsys.org/rankings

>Models, LoRAs & training
https://civitai.com
https://huggingface.co
https://aitracker.art
https://github.com/Nerogar/OneTrainer
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts

>Pixart Sigma & Hunyuan DIT
https://huggingface.co/spaces/PixArt-alpha/PixArt-Sigma
https://huggingface.co/spaces/Tencent-Hunyuan/HunyuanDiT
https://huggingface.co/comfyanonymous/hunyuan_dit_comfyui
Nodes: https://github.com/city96/ComfyUI_ExtraModels

>Kolors
https://gokaygokay-kolors.hf.space
Nodes: https://github.com/kijai/ComfyUI-KwaiKolorsWrapper

>AuraFlow
https://fal.ai/models/fal-ai/aura-flow
https://huggingface.co/fal/AuraFlows

>Flux
https://huggingface.co/spaces/black-forest-labs/FLUX.1-schnell
https://comfyanonymous.github.io/ComfyUI_examples/flux

>Index of guides and other tools
https://rentry.org/sdg-link
https://rentry.org/rentrysd

>GPU performance
https://vladmandic.github.io/sd-extension-system-info/pages/benchmark.html
https://docs.getgrist.com/3mjouqRSdkBY/sdperformance

>Try online without registration
txt2img: https://www.mage.space
img2img: https://huggingface.co/spaces/huggingface/diffuse-the-rest
sd3: https://huggingface.co/spaces/stabilityai/stable-diffusion-3-medium

>Related boards
>>>/h/hdg
>>>/e/edg
>>>/d/ddg
>>>/b/degen
>>>/vt/vtai
>>>/aco/sdg
>>>/trash/sdg
>>101704253i love hitler
outside of ldg/sdg, what are some high-signal, low-slop communities out there? Are there any good discord servers?
>>101704253my dalle gen in OP love it
https://x.com/bdsqlsz/status/1819752053109121246
>>101704326Here your waifu.
>>101704253lmao stupid OP baker included dalleslop gen in bottom left
>>101704391smartest people alive btw
>>101704378>>101704397good morning ladies
>>101704404 No one was like "you know, negative prompts seem pretty important and are why SD/SDXL had legs, they're basically required for certain outputs"
>>101704397raped xd ;^)
Can't thank the webapp anon enough, had an absolute blast today
>>101704504the keys are dying too fast (i dont feel sorry for the owners though)
>>101704559guidance doesnt seem to be working, it's the same gen at 1 or 5
>>101704569shit you're right AAAAAAAAAA
fixed it
>>101704559could run it myself but only got 4x 3080ti with 12gb, resulting in 90 sec generations, any tips?
>>101704592no tips, I'm using replicate :()
>>101704592 there were some anons a few threads back using a 3090+3060 combo to load the model onto one card and the encoder onto the other, but with 12GB VRAM you will swap no matter what. guess you are outta luck unless pytorch deigns to support old nvlink setups. and I guess you are running everything in fp8 already?
pro model works on webapp now apparently
>>101704636so edgypuberty hitting you hard uh
How do you use the ComfyBootlegOffload.py script once you've installed it? (https://gist.github.com/city96/) When I launch comfyui it only seems to see cuda:0, and when I generate it only uses one GPU when I check with nvidia-smi.
>>101704592 >>101704631 Script is here: https://gist.github.com/city96/
>>101704631but of course
>>101704637link?
>>101704671 https://singer-immigration-consist-chen.trycloudflare.com/
changed safety_tolerance for pro so it will hopefully output nsfw gens now
>>101704673no suspicious at all
Blessed thread of frenship
>>101704649
>>101704725so overfried
pro is fucking MJ level kino, proprietary............
>>101704673>>101704637Where did webapp anon get FLUX.pro? torrent link anyone?
>>101704777 https://replicate.com/black-forest-labs/flux-pro api
>>101704774why does the dolphin has no legs
i like prompting for the Dancer. it doesn't know the Dark Souls character but the wiki character description is giving
>>101704785so webapp-anon is paying for the anons here for every pro generated picture? rich guy
>>101704805no im stealing keys
>>101704791dolphins don't have legs, silly
>>101704805sweet summer child...
how did pro get so fucking fast, it was real slow before
>>101704823it's hatsune gigu!
>>101704664 OK, I just needed to add "Force/Set VAE Device" and "Force/Set CLIP Device" to my workflow and I went from a 45s gen on flux dev to 30s (using 2x3090 capped at 300W). -> >>101685784
btw the pro model API is censored even with the safety option at 5 (least censorship), "A tall black fat nigger with his huge black dick cock penis erect" doesn't get generated
seems like "nigger" is a hard filter
>>101704898You shouldn't be using it if you're under 13 anyway
>>101704898There are no genitals in the dataset. Dev is just a distilled version of Pro so really anything Pro is just a slightly higher quality version but otherwise the same knowledge.
just tried that new animagine flux lora, working really well! Yall were seething too early
Remember, Flux needs highly descriptive prompts, so use LLMs to convert your short prompts into descriptive ones.
>>101705011what comfy node do you use for llm
>>101705000Link?
What's better to use for Flux? The fp16 model with fp8 text encoder or the fp8 model with fp16 text encoder?
btw the seed for flux pro doesn't work, but that's because they do not have it????????? idk ask replicate they're retarded
>>101705032 ollama is the easiest to install and get running (it's also very fast). then get the ollama comfy nodes. I like using llama3.1 with it but it may refuse to do stuff that's explicit in nature
>picrel
>>101705011I'm already being highly descriptive without LLMs
>>101705065
>>101705032hopefully not the virus one that remained shilled at the top of sdg for months thanks to a certain retard
>>101705086 https://civitai.com/models/282061/
>>101705101you're so cool and smart (face with sunglasses emoji)
>>101705032 >>101705096
https://ollama.com/
https://github.com/stavsap/comfyui-ollama
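A rough sketch of the prompt-expansion idea from the posts above, talking to a local ollama server over its REST API instead of going through the comfy nodes. Assumptions not taken from the thread: ollama is listening on its default port 11434, `llama3.1` has already been pulled, and the instruction text plus the helper names `build_payload` and `expand_prompt` are made up for illustration.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # assumed default local endpoint

# Hypothetical instruction text; tune it to taste.
INSTRUCTION = (
    "Rewrite the following short image prompt as one detailed paragraph "
    "describing subject, setting, lighting and style. Reply with the prompt only.\n\n"
)


def build_payload(short_prompt, model="llama3.1"):
    """Build the JSON body for a non-streaming /api/generate request."""
    return {"model": model, "prompt": INSTRUCTION + short_prompt, "stream": False}


def expand_prompt(short_prompt, model="llama3.1"):
    """Send the short prompt to the local ollama server and return the expanded text."""
    data = json.dumps(build_payload(short_prompt, model)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        # With "stream": False the server answers with one JSON object
        # whose "response" field holds the generated text.
        return json.loads(resp.read())["response"].strip()
```

With the server running, `expand_prompt("catgirl in a leather coat, green code rain in the background")` should hand back a longer prompt to paste into the Flux text box. Keeping the payload builder separate makes it easy to swap the model or the instruction without touching the network code.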
>here's a finetune
>it's a merge instead
why do they do this
>>101705096using local slows your gens down a lot, no?
>>101705183technically it's still a finetune
>>101705177yes that's me
>>101705192 not really. a few seconds to load/swap models maybe. also you don't have to gen a new prompt with ollama every time. i keep it in a 2nd tab in comfy.
this browser extension is also pretty cool for talking to your ollama instance: https://github.com/n4ze3m/page-assist
>>101705193 A merge is not a fine-tune; fine-tuning requires an optimizer.
>>101705208oh wow people still use SD 1.5
>>101705224I also use 1.4
>>101705248you can set your seed to -1 so that it doesn't generate the same every time
is the use of perturbed attention guidance recommended for Flux?
How do I get flux to consistently give this more home-y look?
>>101705355>boring Snapchat photo circa 2015
>>101705364thanks lots
"artists" are insane, are they not? I'm not talking about the video itself but the commentators. https://www.youtube.com/watch?v=dnjy0dNe0ms
>>101705394most salty artists are just mad they cant sell their loli art to pedos anymorethey lived comfortably with their mentally ill ways of earning money and now its being threatened
>>101705364That's some jpeg artifacts alright.
>>101705415Did you forget about suicide watch furry commission artists, kek. Their commissions costed so fucking much but then they got MOGGED by a rich furry and AI.
>>101705394We're dealing with one of the weakest and loudest generations to ever exist, they are Luddites to the extreme. Everything is a grift now where people become a weed in their industry.
>>101705011Also you can ask the LLM to add and remove things, emphasize/deemphasize something, try a new style, etc. Coupled with text comprehension of Flux it's like running a crude simulation.
I have no imagination to make prompt, how do I fix that?
>>101705450use an llm
>>101705450ask your llm
>>101705450add (big boobs)
>>101705453>>101705474But what do I ask it?
>>101705479"make me a cool prompt"
>>101705487
## Cool Image Generation Prompts:

**Surreal & Dreamlike:**

* A melting clock draped over a field of giant dandelions, with a sky full of swirling galaxies.
* A library where the bookshelves are made of clouds and the books are filled with starlight.
* A city built on the back of a giant, slumbering turtle, with buildings that resemble seashells.

**Cyberpunk & Futuristic:**

* A neon-drenched cityscape reflecting in the visor of a cybernetically enhanced samurai.
* A holographic geisha performing a traditional dance amidst towering skyscrapers.
* A bustling marketplace filled with bio-engineered creatures and futuristic technology.

**Fantasy & Mythical:**

* A majestic griffin soaring through a storm-wracked sky, clutching a glowing orb in its talons.
* A hidden forest glade where fairies dance around a luminous mushroom ring.
* A dragon perched atop a mountain of gold, its scales shimmering like amethysts.

**Abstract & Conceptual:**

* The feeling of nostalgia represented as a swirling vortex of colors and textures.
* The concept of infinity depicted as a never-ending staircase leading into a blinding light.
* The essence of music visualized as a symphony of abstract shapes and patterns.
>>101705515dont use gpt-4o it's slop, claude (even 3.5 sonnet) is miles better in creativity.
>>101705515
Great
>>101705632???????????/
>>101705515Also those are bad prompts, they should be at least 3x as long
>>101705636have you never seen a woman before, anon?
>>101705664................. anon check your eyesight pleace
how to prompt The Matrix style green 1s and 0s raining in the background?
>>101705636it's just armpit cleavage, relax
>>101705515>>101705644
>>101705636
>>101704253>linking the cucked flux schnell model instead of dev
>>101705675 if you don't get it for free when you prompt "The Matrix" it's probably not in the dataset. They really have a weird dataset honestly, given how many pop culture things it knows (and doesn't know).
flux can't do sweaty armpits
>>101705692
>>101704253Still shit compared to DALLE. And you will never get any fine tunes. And this will all end once the Olympics are over and the media notices this :)
>>101705711I wanna do waifu in leather coat and bikini and shades with that as a background
>>101705727ok baitie :)
>101705727kys closedai shill
>>101705713Or feet. But it can do nice toned tummy and big boobs.
>>101705737it can do feet, just not sweaty feet.
>>101705727yawn, get better material
>>101705729Have you tried typing in what you want?
>>101705711They probably freaked out after the disney dalle thing and deleted as many names as they could, from artists to movies.Dumb decision but whatever.
If compute is the limitation for flux finetunes and lora’s, then it’s a non-issue. There are autists on here who built 8x3090 systems to run LLM at full precision so they can talk with their AI waifu at the highest possible quality. This shit is going to have so much compute thrown at it.
>>101705765anon, they exactly only kept disney shit or other popular americanslop
>>101705765What's weird is it knows most Disney stuff, but doesn't know a lot of anime (somehow doesn't know Misty from Pokemon) but they absolutely removed all non-politicians from the dataset. Probably did an IMDB name purge.
>>101705770Compute and datasets afaik.
Loras for Flux?
>>101705760want it with pony or sd1.5
official pixart bigma and lumi- https://github.com/Alpha-VLLM/Lumina-mGPT oh
>>101705782I have millions of images downloaded now and I just do it to dick around.
>>101705775>>101705781Disney is everywhere, but it's easy to delete IMDB/Wikis, something like that?
>>101705797its extremely easy to clean up disney char names too anon
>>101705793lmaohave you type "The Matrix" into Civitai
Can I run flux with my shitty 6gb 1060?
>>101705795DOA because of flux
>>101705814not yet
>>101705803Then I got no idea how they decided to clean some stuff but keep other.Outside of porn.
>>101705839can you even run XL?
>>101705839 Whatever the next "big thing" after SDXL turns out to be, it's safe to say that you will need at least 12GB of VRAM. I would aim for 16GB if you can't get a 90-class card with 24GB.
>>101705797 My guess is they pulled a list of celebrity names and deleted anything that matched. And then part 2 is they used AI captioning, which has an extremely weak understanding of characters, but most captioners do know Mickey Mouse, Beauty and the Beast, etc. If you've ever used vision LLMs you'd notice they misattribute characters / movies all the time. Florence for some reason thinks most 3D animated movies are like one of three movies.
this shit is amazing
AuraFlow devs and fal just confirmed that they're scrapping it. No more releases.
sowwy anons have to turn off the webapp, too many keys burned already, want to leave some for myself for later. you can still use the models on replicate or huggingface, although no pro model there
What nodes should i use to connect a FLUX image to pass it through SDXL sampler? I'm getting some channel error when i do a FLUX Sampler -> SDXL Sampler
>>101705795>34B
>>101705899aughhhh
>>101705918>can't be used for image genningdon't care
>>101705945>>can't be used for image genningwdym?
>>101705905Figured it out, just rerouted it through a simple image rescale. I'm dumb
AuraFlow is dead
>>101705898i wonder which other known and unknown projects got canned because of flux, there is still a market for low and mid range gpus and models that actually allow things like finetunes and loras.
>>101705795 no image examples is grim
Lumina is always going to be weak because they refuse to specialize and instead insist on making a comprehensive suite no one can run or train.
>>101705870 >>101705711 this is what happens with AI-generated captions. if the caption model isn't smart enough to deduce the character/movie/reference then it gets lost. this is why it will know obvious characters like minions but will fail at others like rangiku from bleach. if the character is missed it just gets slopped as "An illustration of an anime girl with orange hair.."
>>101705882 Yeah, and it takes 2 minutes for a 1024x1024 image. I don't mind it that much honestly.
>>101705884 Damn. I'll eventually upgrade my rig but right now all I have is a 1060.
>>101705976definitely has nothing to do with their model being poorly trained from the start with The Cat being permanently burned in
>>101706000That's why the raw alt tags are a necessary part of the dataset. Hopefully one day someone will compile a pop culture dataset. The closest we have to this currently are Boorus.
new model have fun autistshttps://github.com/Alpha-VLLM/Lumina-mGPT
Apparently this is Belfast from Azur Lane.
Thanks flux devs.
>>101706075If I download this and I discover that it can't do boobs I'm going to be upset
>>101705795you can tell the power of a model just by the demo grid they provide. when all they show is generic stuff looking at the camera you know it can be discarded immediately. fucking finally, i was getting sick of people jumping up and down for the most boring underperforming chinkslop and acting like it was somehow revolutionary for local. the cope ends here
>>101706103I bet it can't.All the new models are made to be exported to Saudi Arabia or something.
reddit is saying you cant train loras for flux. is this true?
>>101706097Up the ra
>>101706075>>101705795>hatsune miku shaking hands with hatsune mikuwell.. it's something
>>101706103yeah SURELY it wont be yet another model with excluded nsfw, pruned artists, and trained on ai outputs
>>101706125There is no training code and the model itself is too big for any of our computers. But people who say it's impossible are retarded. People make Loras for insanely larger LLMs. Flux is just the first image model that actually requires people to do some work and I guarantee there are some horny AI researchers working furiously right now to make Shnell trainable.
the fabled 3x2x2
real shame that flux is just over 24GB VRAM
and the difference from 16->8-bit is pretty significant
is it at all possible to make quants somewhere in between, say 13/14/15-bit? if we have 3/5/6-bit quants of LLMs then surely it can't be too bad?
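For a feel of what in-between bit widths would buy, here is a back-of-the-envelope sketch of weight size versus bits per weight. The ~12B parameter count is an assumption (it is not stated in the thread), the helper name `weight_gib` is made up, and the figures cover the transformer weights only, ignoring the text encoders, the VAE and activations. It says nothing about whether any tool can actually produce such quants.

```python
def weight_gib(n_params, bits_per_weight):
    """Size of the weights alone, in GiB, at a given bit width."""
    return n_params * bits_per_weight / 8 / 2**30


FLUX_PARAMS = 12e9  # assumption: roughly 12B parameters in the Flux transformer

# Compare 16-bit and 8-bit against the hypothetical widths asked about above.
for bits in (16, 15, 14, 13, 12, 8):
    print(f"{bits:>2}-bit: {weight_gib(FLUX_PARAMS, bits):5.1f} GiB")
```

Under that assumption every bit shaved off saves roughly 1.4 GiB, so 13 to 15 bits only trims a few GiB from the 16-bit size, while 12-bit would be the first step with real headroom on a 24GB card.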
>>101706135Faggots never release with training code either. Tale as old as time. Imagine spending months on something and not even phoning it in with the dev training code. Just give us the code, I don't care if it's a pile of shit.
>>101706205>Training code (coming in 2 or 3 weeks)ig that's +1 point for lumina
>>101706200 3090+3080 and it works flawlessly for me.
Maybe get one used if you can, if you already have a 24GB card.
>>101706228I've seen multiple times now where they say Coming Soon(tm) and it never happens. So no, I don't expect training code and I'm certainly not going to watch the project.
>>101706126the ra?
>>101706175
>>101706200I'm using 16bit flux with the 8bit text encoder on my 3090 and 32gb ram. Works fine for me although it's a tight fit.
>>101705969 Were you trying to feed the flux latent to SDXL as a latent? I don't think you need to resize anything, just link the output VAE of Flux to the inbound VAE of SDXL. But now that I think of it, it might be faster to work at a lower resolution and upscale the Flux images before stylizing them with SDXL.
Using weight_dtype fp8_e5m2 reduces generation time by 1/3 on my potato. It seems the details aren't affected too much, but the picture comes out with a barely noticeable unpleasant pattern, which completely disappears if you upscale it later with weight_dtype fp8_e4m3fn.
>>101706351Advertiser sama get down!
>>101705727
Is there any link where the comfyui various nodes are explained?
>>101704696ty for the blessing
>>101706379oh shit i forgot this is a blue board
>>101705887>704x704 1girlman discovered sd 1.5
>>101706354 is e4m3fn superior to e5m2?
>>101705899thanks for the free gens!
Help a retard out, I installed Forge but I keep getting this error.
>>101706500I have no idea, it's slower and the picture looks better, so probably yes.
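For what the two names mean: e4m3fn spends 4 bits on the exponent and 3 on the mantissa, e5m2 spends 5 and 2. A small sketch of the trade-off, computing the largest finite value and the relative step between neighbouring values for each layout. The helper name `fp8_stats` is made up, and the handling of the all-ones exponent follows the usual description of these formats (e5m2 reserves it for inf/NaN, e4m3fn only gives up one bit pattern as NaN).

```python
def fp8_stats(exp_bits, man_bits, ieee_like):
    """Return (largest finite value, relative step between neighbours) for an fp8 layout.

    ieee_like=True reserves the all-ones exponent for inf/NaN (e5m2).
    ieee_like=False keeps that exponent for normal numbers and gives up only
    the all-ones mantissa pattern as NaN (e4m3fn).
    """
    bias = 2 ** (exp_bits - 1) - 1
    if ieee_like:
        max_exp = (2 ** exp_bits - 2) - bias
        max_frac = 1 + (2 ** man_bits - 1) / 2 ** man_bits
    else:
        max_exp = (2 ** exp_bits - 1) - bias
        max_frac = 1 + (2 ** man_bits - 2) / 2 ** man_bits
    return max_frac * 2 ** max_exp, 2 ** -man_bits


print("e4m3fn:", fp8_stats(4, 3, ieee_like=False))
print("e5m2:  ", fp8_stats(5, 2, ieee_like=True))
```

So e5m2 covers a far wider range but with coarser steps, while e4m3fn has twice the resolution within its range. That would be consistent with the "looks better" report above, though it does not by itself explain the speed difference.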
>>101705899thank you for letting me use it for even just a brief moment, anon i love you
>>101706530try this instead https://github.com/Panchovix/stable-diffusion-webui-reForge
>>101706530 install the previous tag, not current master, it's in alpha
or use reForge, it's more up to date
>>101706595flux mogs with the eye carpet
>>101706595Damn nice
>>101706592>>101706581Thanks, I'll give it a try.
>>101705795>An anime illustration of a gothic girlwaited 30 minutes in queue for this garbage. OBVIOUSLY trained on the fried journeydb dataset. absolutely terrible work. begone, chink garbage. time to start putting actual effort into your models
>>101705795>Girl on the beach wearing a micro bikini. Sunset.no just no
>>101706625Yeah, it's bad
>>101705985hype
has anyone made a sampler/scheduler comparison chart for flux?
Anyone wants some Zelda 64 alternate universe magazine scans?
oh n-
>>101706911yes
>>101706936
>>101706945proompt?
>>101706602It can do eyes on a variety of surfaces, which I like.
so it seems to be accepted the default cfg or whatever is far too high. what's the ideal you've found?
thinkin bout dem beans
>>101707074 cfg==1.0 always for flux. cfg isn't supported by the inference method currently implemented in comfy. he should have removed the box in a special FLUX sampler node, but he didn't make one
>>101707105i meant guidance, which seems to function like sdxl's cfg
>>101704253It's hilarious how SAI kept shooting themselves in the ball and now they finally became irrelevant. All they needed to do was to scale up 1.5. How do you fuck up that badly? LMAO
>>101707148 the guidance default on replicate is 4, but the blackforest employee-anon that visited us a few threads back said 1-2 for enforcing styles, so it works kinda inverted compared to cfg. I use 3 for non-styled and 1.75 for styled
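On the cfg==1.0 point a couple of posts up: in classic classifier-free guidance, as used by SD/SDXL samplers, the sampler mixes the prediction for the negative (or empty) prompt with the prediction for the positive prompt. A toy sketch with made-up numbers shows why a scale of 1.0 makes the negative side drop out entirely. Going by the posts here, Flux's "guidance" value is a separate input and is not this formula.

```python
def cfg_combine(uncond, cond, scale):
    """Classifier-free guidance: move the prediction away from uncond, toward cond."""
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


uncond = [0.25, -0.5, 1.0]  # toy prediction for the negative/empty prompt
cond = [0.75, 0.5, -1.0]    # toy prediction for the positive prompt

print(cfg_combine(uncond, cond, 1.0))  # same as cond: the negative prompt has no effect
print(cfg_combine(uncond, cond, 4.0))  # pushed well past cond, away from uncond
```

At scale 1.0 the `uncond` terms cancel, so whatever goes into the negative prompt box cannot change the result.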
>>101707151 Imagine being such a train wreck of a company that you can't release a decent product in years, and then a handful of your employees leave and in no time they are running circles around you. It's impressive SD was good at some point to begin with.
>>101707151 overengineering and fear of failure
it's funny how Pixart cut through all the bullshit and proved you can be simple and elegant
For all its strength, flux seems to be overfit on many concepts (e.g. girl is always a child for it, booba is always mid size, certain poses, etc), it's imagen version of LLM slop writing. Prompt following suffers as a result. It's not even a problem with pozzed/castrated dataset, it's just some random but systemic bias during the training. That's why it does ethnicities and basic styles worse than even SD, too.I remember SD 1.4 being overfit on one Aivazovsky's work in the same manner, merely mentioning his name is used to turn a cyberpunk city into a marine painting with a ship and a hazy sun. Same for Van Gogh, Mona Lisa.
>>101707190 there simply was no alternative in 2022. I tried hard to get anything even remotely acceptable done with core SD1.4/1.5. before the NAI leak it was very bleak, and if it wasn't for some good finetunes the situation with SDXL would have been the same.
>>101707222 it's overfit because it was very, very clearly trained on a lot of dall-e images, especially for 2d stuff
I keep getting this when loading stable diffusion. I've tried wiping it and running a fresh install again, but it's still there.
It never seems to finish. The terminal doesn't give me any indication it's actually processing anything.
that came out much better than I thought it would, I was very vague
>RAU-Net
WTF
Why not promote Forge?
>>101707256not clear at all, dall-e is terrible shitslop full of very recognizable artifacts, which is not the case here (it can even reproduce era-accurate videogame palettes and MPEG2 artifacts no problem, something that no other model does, even Ideogram (which is way above Dall-E/MJ))
>>101707301I haven't tried to do any expression yet, and I know dalle gave it this warped nightmarish face. Just a regular face twisted into comical exaggerations of whatever emotion you asked for
>>101707196Agree
Props to OP for low key pointing to the super cool nazi girl gen in the past thread, and to the anon that gen'd it
>>101707347Flux interestingly carries the SAI tradition of releasing models that are difficult to train (as a business strategy). It's not a coincidence that Dev and Schnell are given to use in an untrainable state (whoopsies).
Were were you when the open source generative models civil war started?
>>101707236SD wasn't made by Stability AI. It's been made by these guys behind Flux. Basically this is proper "SD2".
>>101707388yeah isn't 1.4-1.5 by runway
>>101707301 >>101707388 Also Ideogram is based on DeepFloyd IF, which is pixel diffusion and was funded by Stability as well.
Basically Stability is not an AI company, they used investor money to be an incubator and funded several good and shit research teams. They got nothing in return though, because good researchers just left and raised more money for their own startups.
>>101707374>Were were
Posting here too to bring attention to it in case anyone wants to try to implement it on comfy/auto
https://github.com/SusungHong/SEG-SDXL
one anon may be trying it >>101704309
>>101707463wow I love sameface same image portraits from SD 1.5
>>101707363Niggas in the scene are already saying that Schnell seems perfectly trainable. It will just get undistilled in the process.
>>101707404It was created as "Latent Diffusion" by German researchers (same who created Flux later), which was an order of magnitude faster than pixel diffusion models such as Disco Diffusion. Stability gave them GPUs to train the actual model, which was then called Stable Diffusion.
>>101707474that's the thing, it's really fucking good. Can a bunch of perverts really improve on it?
>>101707462>chrome icon>stare
>>101707493>Disco DiffusionAnd the authors of that now work at Midjourney
>>101707474I'll believe it when I see it, it's going to be worse than SDXL for awhile and I think one of the crucial steps is getting the model converted to 6B or somehow made trainable within 24GB of VRAM. If you need an H100 to make a Lora, not going to happen.
>>101707501the perverts would make it good for porn, which it has zero idea of.
>>101707528Man I remember when VQGAN+CLIP was the shit, getting anything decent out that was a miracle. Same for disco diffusion that came a bit later.
>>101707266 Age 1060K!
1060000 * 1059999 * 1059998!
That's very old.
>>101707363 These companies want to have their cake and eat it too. They want the prestige of being "pro open source", but they also want their model to be the best there is and have customers willing to pay for their API service. Also the point is to release a "freeware" model out there, so that you generate buzz. Soon everyone knows about Flux. "Flux is the best", "Just use Flux bro".
They are competing with Midjourney and Dall-E, not with stable diffusion.
They literally said that they do not want to enable copycats etc. that will steal from their userbase. Ponydiffusion or the like for Flux would be the most disastrous thing ever for BFL, because how on earth will they sell the Pro API if there is a much better model out there and any fool can sell an API for it or run it at home?
Businesses want to make money, not enable deranged coomers.
>>101707501can specialists improve on a model designed for general use? yeah probably, but only in that field
>>101707585I don't think any AI is turning a profit at the moment. They all seem to be riding the hype wave and people (not consumers) are throwing cash at them
>>101707493https://www.youtube.com/watch?v=7w8aG3VhArE
>>101707614 I'm pretty sure that Midjourney and NovelAI are profitable.
>>101707656yeah but whether they are or not is the issue
>>101707614The music and audio generating AI (Suno, Udio, and Elevenlabs) are definitely profitable. Audio models are not that compute intensive to run.
>>101707585I think it's a business strategy that was poorly thought out. After doing stuff for years now, I've realized that people will pay money because they're lazy and stupid. AI in particular is interesting because in order to use it locally you need to be rich and high IQ, these people aren't going to pay money for shit. I mean we can see this obviously with chat bots now, you can run a perfectly good coomer girlfriend locally but people still pay for the web interface There's a ton of money just being the primary distributor, you can make your money selling a web interface, an API, and even training services and it doesn't even cannibalize sales to make it work locally because local-only fags aren't going to buy your services.
>>101707656 there is at least one idiot on /g/sdg who pays for NovelAI, but I am not sure how sustainable it is to target weebs and coomers when there is so much on civitai for that. but yeah, Midjourney is probably raking in $$$ from the normies. it's constantly updated with new stuff and features, and you start seeing their fingerprint on so much content out on the web it's disgusting.
>>101707614 That can't possibly be true, it's only not profitable because these companies spend like drunken sailors. Give me $100k in H100s and I promise you I'll make it profitable and I'll train a model from scratch in the process.
>>101707531Yeah. They made it way too fucking big for the cards in the market.
>>101707715>Give me $100k in H100sThat's, like, two?
>>101707737about 50k hours
>>101707531
>or somehow made trainable within 24GB of VRAM
Don't think it will happen. Rumor is that the secret sauce is that they grokked it. It's a miracle it runs on consumer hardware.
good image models like nai and midjourney make profit. meanwhile at llms:
>i want something that's smar-
claude
>i want something that sexu-
claude
>i want something that can cod-
claude
there is no market for garbage llm startups. the best llm is simply the best llm, there is no room for 2nd place. also midjourney and nai employ <20 people, meanwhile sai had hundreds of useless make-work positions.
>>101707774
>grokking
lmao, no
https://x.com/ostrisai/status/1819802556261863925
will report back soon
>>101705423very sex i like pleas show bobs
>>101707775Well, all the modalities are converging either way, so soon the best model is going to be the best at generating video, best at making pictures, best at writing code, best at generating and understanding audio etc.
>>101707775NAI is not even a real AI company, at least midjourney create their own model but NAI just rehosts open source shit and finetunes it somewhat.
>>101707799If they can get training to work on 2x24GB we'll be fine, but loras really aren't ideal
>>101707799It will not work
>>101707818 video models will be the de facto image models soon enough, because video is just a series of images, but the big advantage that comes from video is that the model has to learn some sort of scene coherency and logic
>>101707823Does it even support multigpu?
Flux is finally a breakthrough quality-wise, but it needs to be more affordable to run and train, otherwise it's bust for local.
the real test is to see if we can get /r/roastme fooled on a flux gen
>>101707845we're even more fucked then, no way those will run on <48GB cards
>>101707872 there are ways to shard a model across different GPUs
refer to FairScale: https://fairscale.readthedocs.io/en/latest/deep_dive/offload.html
>>101707821still makes profit because they overcharge and are the best at what they do. japs don't want to look at ugly pony garbage that was trained on primarily western artwork. someone could probably do similar with flux and turn a profit. coomers will pay extra if it's even 10% better than what they previously had. some coomer patreons, like virt-a-mate, make millions per yet doing literally nothing. just milking eternally pre-alpha game demos
>>101707878for a 1person gen it's not really better than existing realistic XL models
>>101707845The "omni" models are video/audio/text in and out.
>>101707842Why
>>101707894Oh, then LoRA with 2x/3x 24GB cards is maybe possible.
>>101707875SD came out with nothing and was near impossible to train at the time and people figured out Loras and other novel methods. When there is a challenge there is a will and there are some grad students that are erect right now.
If SAI doesn't release their internal models they are done for. Supposedly the model Comfy was working on is pretty good and should be a lot lighter than Flux.
>>101707912You will see soon. Model will break.
>>101707923I won't be surprised if they drop their SD3 Large model, it has to be their only choice right now or they're dead. SD3 Medium is not going to get better and I can't believe they're wasting money on it.
Flux is a new era
So I can't use negative prompts in flux? I always had to add cats into negatives when generating catgirls because it always generated cats too. Would asking it nicely to not add cats work?
>>101707923
>If SAI doesn't release their internal models they are done for
Why? If they have a competent model, they are better off trying to get some profit from it.
There is zero profit in giving it away for free.
So basically this model is just good to make it say nigger and draw swastikas everywhere?
>>101707921Let's hope for the best
>>101707957no just gotta make them setting fire to the EA logo
>>101707905true but zero shot text coherence though. No photoshop required
>>101707960Right now the user control is very basic, but it should come soon.
>>101707969I have no clue what goes in the mind of SDtards, do they think the companies get anything by releasing a model? lol
>>101707977Looking at the last pages it does look like this indeed.
>>101707969Locking down a model that requires an $800 minimum GPU to run is stupid. Local does not compete with SaaS and the sooner these retarded business majors realize it the better.
>>101707977This model is like Dalle-3's naughty 4chan little brother. It's everything you wish Dalle-3 could do without getting dogged but it has some areas where it's weak namely pop culture.
>>101707989>do they think the companies get anything by releasing a model?I think it's worth a lot if you get a community for your model that works for free and adds value through that. Remember the desperate SDXL shills (comfy + other avatarfags) last year?
>>101708001A model that doesn't require a $800 gpu can be run freely by most people. A model that requires a $800+ gpu would have no paying customers because they cannot even buy a $800 gpu anyway.
>>101707989Yeah. They don't understand that SD was a research project first. Stability spun up their operation around the released open source research model and they failed spectacularly at making any money using it. Stability and Stable Diffusion is a cautionary tale. It showed what everyone already knew. You just can't make money with a product you give away for free. The only way it works is if you have some kind of nonprofit style organization spun up, like Wikimedia or Blender foundation.
crazy how far local made it
>>101708033Gee anon it's almost like MJ figured it out with their monthly plans. Crazy, you sell $1 in electricity for $10/mo.
>>101708024It's shit at nsfw too, even compared to dalle. Tested cici input (they had access to a less filtered dalle at some point) and the dalle stuff is way better at that. Same with brands, characters and pop culture in general. This model is good but they excluded too much from the training.
>>101708077Dalle dogs innocuous prompts, shut the fuck up.
>>101707775>absolute nonsense made up by /aicg/ coombrain degenerate who has no idea about how LLMs are used in real world most use OAI as an assistant, anthropic are way smaller as a company for a reason, opus is not used by anyone basically. a lot are using local models because they want finetunes or don't want any leaks/espionage. on the lower end and for technical NLP tasks/edge computing, haiku isn't cost-competitive with small local models which can be run for less for the same performance, and finetuned
>>101708096>(they had access to a less filtered dalle at some point)
>>101707821They make excellent finetunes (they have lots of expertise) and do provide value in a very specific niche.
>>101707821>virt-a-mate, make millions per yet doing literally nothing. just milking eternally pre-alpha game demosThey will release the promised new engine and game anon, for sure!Before 2100.
>>101708115Their days are also numbered, apparently they want to get rid of artist tags in their next iteration, which means a big part of why they had such good output will disappear.
>>101707889>no way those will run on <48GB cards*laughs in nvidia sales*
>>101707989SAI can't compete against the closed models. Their tech isn't as good. Their claim to fame was being open source and their potential advantage was getting the support from the community. Personally, I think trying to monetize their models was never going to work. They should have tried to monetize a user-friendly interface instead. Something plug and play they could have sold to anyone with an RTX.
>>101708200It's funny because there's a dozen companies that make money off of SAI so the monetization is obvious: you have a private, uncensored (mostly) and secure web generation UI, you have rental services which are completely unmonitored and uncensored, you do training services, you host models and do a patreon system for such models, and you run and sell Discord bots. Just being SAI will get them a lot of good will payers.
>keep the best model API only for 6 months>release the previous one openlyit's really that simple
>>101708240And the other, given you're a frontrunner, you can always have the best and greatest paywalled and every three months release the old version to the public. It keeps everyone scrambling too because the frontrunner model keeps pushing the tech which essentially kills derivative models.
>>101707549Yeah, despite seemingly long stretches of nothing, it's crazy how fast these models improved. I remember when Latent Diffusion came out about two and a half years ago. It was the best local model with borderline coherent 256x256 gens and glaring artifacts, but it could generate some basic text and combine concepts like "squirrel eating a burger", "chair that looks like an octopus", which was mind blowing back then. SD 1.4 was a similar leap with 512x512 and much larger training data, better world knowledge. I think Flux is the third big leap in local now, and once again, the Germans did it.
Baker baker
why did /sdg/ have to die just for /ldg/ to take its place with nothing but whiny faggots
>>101708375You could probably put both threads in a reading level calculator and find out SDG is like 6th grade.
trips and i'll release my flux schnell lora
>>101708200>support from the communityCan't pay gpu hours or researchers with cheers from the community
>>101708354 3-4 years ago this was the best we could do.
>>101708462There's so much soul because it was pre aesthetic filtering.
>>101708462Original DALLE was the first text to image from what I recall. I guess NVidia had GANs for a while but it was only Image to Image. DALLE by today's standards is not good but at the time it was revolutionary. They never released it though, so it's hard to say how cherry picked their examples were.
>>101708432
>>101708462Now imagine where things will be in 3-4 years. While I don't think the technology will just keep improving at the same pace, "holodeck at home" doesn't seem like such a stretch anymore. I don't mean Star Trek level immersion, but interacting with images and scenes like we are interacting with LLMs now, I think that's where things are headed.
>>101708616Very cheeky
>>101708690we're very, very far away from hallucinated colors roughly resembling a prompt to the true logical understanding of images humans have. we'll know we're close to the promised land when you can show the AI a picture of something it's never seen before and it can replicate it immediately and logically
>>101707627Best flux gens I've seen
Flux can do architecture much better than any other model I've tried so far. I'm very impressed altogether. Using Schnell on 12 GB VRAM and 32 GB RAM. Does img2img work with it too?
>>101703851>>101707455Since SEG is based on attention perturbation, I'll just add it as a separate node into https://github.com/pamparamm/sd-perturbed-attention. It will be compatible with both ComfyUI and Forge/reForge
>>101708761ty
>>101708991It does coherent buildings really well, better than other models.
Here's a Reimu
>>101708711>Very cheekyyes
100% moar armpits
Mid-afternoon bread, straight from the oven...>>101709433>>101709433>>101709433
>>101709319>>101709473so good
>>101708750I don't think a model needs true logical understanding to be an engaging world simulator. LLMs are case in point, clearly just modeling the training data, but that already goes a long way. The best models can carry a coherent interactive story for a long time, if a future multimodal model can do the same frame by frame, that's "promised land" territory for me.
>>101709478cozy bred of advancement and intellectualness
>>101709478Thank you baker
>>101705429the projection is beautiful