Discussion of Free and Open Source Text-to-Image/Video Models
Prev: >>107538552
https://rentry.org/ldg-lazy-getting-started-guide
>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP
>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows
>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/kohya-ss/sd-scripts
https://github.com/tdrussell/diffusion-pipe
>Z Image Turbo
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo
>WanX
https://github.com/Wan-Video/Wan2.2
>NetaYume
https://civitai.com/models/1790792?modelVersionId=2298660
https://nieta-art.feishu.cn/wiki/RY3GwpT59icIQlkWXEfcCqIMnQd
>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46
>Illustrious
https://rentry.org/comfyui_guide_1girl
https://tagexplorer.github.io/
>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage
>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg
>Local Text
>>>/g/lmg
>Maintain Thread Quality
https://rentry.org/debo
It is finally done!
Small reminder that we won.
https://github.com/kohya-ss/musubi-tuner/pull/778
>>107540693Why isn't zit base in the links?
is the base model in the room with you right now?
>>107540748the model isn't released. just the training code
>>107540752
>just the training code
based, we're getting there, we're saved!
>>107540718Will be training on this like squirrel on crackGlory to the chinks!
Miku also won
>>107540785kek
>>107540785lmao
>>107540785kryptonite for jeets.
Is there a way to lower lora rank on the go like model quants in loader?
formally requesting the prompt for the mirror selfie image in the collage
>>107540804No
>>107540785
>>107540815Six finger hand and two deformed versions of the character in the background.
The antichrist doesn't want you to know this, but with ComfyUI you can just break models and bend them to your will.
Might be a stupid question, but do you NEED to do the sampling during lora training or can you rawdog everything?
has someone implemented fuzzy search for the model/lora/whatever lists in ComfyUI yet or do I have to code it myself?
and by myself I mean Gemini
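For anyone who does end up coding it: a minimal sketch of the matching part only, using Python's standard `difflib`. This is not a ComfyUI extension and touches no ComfyUI API; `fuzzy_score`, `fuzzy_filter`, and the file names in the usage note are made up for illustration. Wiring it into the actual dropdowns is a separate job.

```python
from difflib import SequenceMatcher


def fuzzy_score(query: str, name: str) -> float:
    """Score how well `query` matches `name`, case-insensitively (0.0 to 1.0)."""
    q, n = query.lower(), name.lower()
    if q in n:
        return 1.0  # a plain substring hit always ranks first
    # Otherwise fall back to difflib's similarity ratio, which tolerates typos.
    return SequenceMatcher(None, q, n).ratio()


def fuzzy_filter(query, names, limit=5, cutoff=0.3):
    """Return up to `limit` names, best match first, dropping scores below `cutoff`."""
    scored = [(fuzzy_score(query, n), n) for n in names]
    scored = [(s, n) for s, n in scored if s >= cutoff]
    scored.sort(key=lambda t: (-t[0], t[1]))  # score descending, then name
    return [n for _, n in scored[:limit]]
```

Usage would look like `fuzzy_filter("ilustrious", ["illustrious_v1.safetensors", "chroma1-base.safetensors"])`, which should put the Illustrious file first despite the typo. The `cutoff` and `limit` defaults are arbitrary and worth tuning against a real model folder.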
>>107540848Once you're comfortable enough you don't need to sample during training
>>107540841
PromptManager gives interesting prompts.
>>107540848
No, you can just rawdog it.
Sampling is there so that you can check live during training whether your lora is learning what it is supposed to learn well enough. If not, you can cancel early.
I just rawdog it and see afterwards whether it worked. If not, retrain.
>>107540855
The latter
>>107540865
>Sampling is there so that you can check live during the training
Yeah, I train in my sleep so whatevs...thx.
>>107540861>PromptManagerwhat's that?
>>107540877https://github.com/BigStationW/ComfyUI-Prompt-Manager
Why doesn't ksampler advanced let us control all parameters at all steps?
Where do you guys think Base will land, 24GB or 32GB of VRAM?
From what folder does it pull? Neither my text encoder or my llm folder show up, just the cloudshit.
>>107540918i hope 32gb vram so most of you idiots can't do shit
>>107540817Nice
>>107540926It's guidance distilled so it'll be 6-12GB.
>>107540923
>From what folder does it pull?
you have to read what's written, anon
https://github.com/BigStationW/ComfyUI-Prompt-Manager?tab=readme-ov-file#instructthinking-llms
>>107540893do you find vram is released fast enough to prevent problems using llama?
>>107540923cumfart/models/LLM/GGUF/ ?
>>107540935why aren't you pulling it from the normal folders?
>>107540950how should I know?
>>107540971still no model that can do ringworlds / halos, especially not with the shadow squares.
even if base drops, it will take at least a few months for decent finetunes to dropi'm not holding my breath
>>107541002I'll have my fun with it, it'll be able to do edit shit
>>107540974Nice
>comfyui update
>changes lots of stuff
>can't even click on the image preview - the click event's not even used.
>well maybe they fixed reroute to work on float
take a guess guys, do you think reroute works on float?
>>107541013Thanks
>>107541026He's wrong.
>>107540991>no model that can do ringworlds / halosyeah :(
>>107540860>>107540841Sexo with mother and daughter
Poll
"If ONLY ONE of these were to be released, what would be your pick?"
https://poal.me/1ceqwm
>>107541032No thanks
>>107541045
SageAttention 3
>>107541032Seethe somewhere else! Fuck off!
>>107541045
>what would be your pick?
the one that was promised and already has a training script lol >>107540718
>>107540847
>>107540946
VRAM isn't released at all by default. ZiT is working in the remaining vram + offloading. I'm still looking into what can be done better.
cute, hk416 is cute!!!!
>>107541089is this comfy?
>>107541045(You) are not going to be able to run Wan 2.5 in your system with sane speed, so what would be the point?
>>107541020shotted and dragged
>>107541101My system has two 3090s, are you sure about that?
and 2.5 would be pure ass without loras
>>107541101Runpod my friend
>>107540848
No, it's just a sanity check to see how training is progressing. The sampling output is also usually quite far from the quality you can expect from a dedicated inference program like Comfy, Forge, etc.
>>107541020Have they fixed the UI with the botched stop button at least ?
you niggers are retarded
>still forcing an unfunny meme
>>107541111>and 2.5 would be pure ass without lorasyou're such a fucking idiot. you run it at 128x64 resolution and you have THE BEST LOCAL AUDIO GENERATION MODEL OF ALL TIME AS A FREE SIDE EFFECT
>>107541142Yes but we're also right.
>>107541109Yes. You wouldn't even be able to run a SAAS image generator like Dall-E 3 with sane speed on that machine, let alone video generation.
>>107541137yeah they added it back (after two weeks lmfao). it's a real shit show at cumfartorg
>>107541159>it's a real shit show at cumfartorgno kidding
>>107541155well yeah, base is actually coming and wan is actually not, that's for certain. ltx2 is the only hope right now and everyone is already doomed about it because its implied (((they're))) pozzing the model in preparation for release as we speak
>>107541045I don't think I have the patience for newer, heavier video generation. The 4-5min I spend now rendering 7sec of video is absolutely brutal.
>>107541142
No, I would agree here, Wan is the bigger get. As much as I would like base ZImage, de-distilling Turbo, should we be forced to do it, is still a pretty alright fallback. Video models have also been stagnant: Wan 2.2 was almost 5 months ago, even though the last of the tools came out around 2 months ago.
>>107541142omg I'm retarded for not wanting a gigantic bloatmaxx vid model that would take a century to inference! how could I choose the image model that's perfect for dethroning sdxl!
>>107541196Ikr, you're so silly anon!
Does this thread still have that one faggot who posts realistic images and refuses to answer any questions about what checkpoints/loras he uses because "someone posted my workflow on reddit and got upvoted for it" and he's butthurt that he missed out on his precious internet points?
I can now generate unlimited Wired Magazine covers.
>>107541002
you people underestimate the power of the coomer side of the tech sphere
it took 24 hours to get non-pepperoni pizza nipples in turbo, a week later it's blowing up civitai. I give it a full month and we see the best of the best finetunes. one week and we get the best merges/loras from illustrious sent over to z-image. seriously, people will be fast. the hype has been building up with no slowing down.
I don't want a stupid Z-base and to wait days for a lighter version. just give us better adult Z-Loras
>>107541176
twinflow will make it 2x as fast (taking it down from 4-step loras to 2-step loras) eventually
>>107541183
any anon who doesn't understand the value audio adds to video needs to use sora and feel the magic
>>107541196
ignoring my argument completely is not a good start
>>107541154
and a team would make a few-step version of it for the clout. the lightx2v team would be able to handle a model 3x larger than WAN 2.2 on their same hardware since Twinflow uses 3x less memory
and we already know the Twinflow team themselves have compute enough for a 20B model
>>107541214
who cares you can make all that with Z turbo now
that ultra flux vae is actually a straight upgrade, it just makes things sharper.
>>107541232
you know what, training loras on Z-base might help us make better loras and we could use them on Z-turbo
>>107541254
>that ultra flux vae
can you provide the link anon
>>107541254>straighthue hue
>>107541239>ignoring my argument completely is not a good startfor being such a pooper I'm ignoring everything you say
>>107541290leave my scat fetish out of it
>>107541254someone do a side by side comparison, i won't habeeb it till i twinkie house it - i mean see it.
>stealing /adt/ prompts now >except this time with a hag
>>107541095
>>107541230KinoPrompt?
>>107541324Nice
>>107541334here o/https://files.catbox.moe/x61m7c.txt
>>107541276
https://huggingface.co/Owen777/UltraFlux-v1/tree/main/vae
>>107541302
too lazy to download a single VAE file and test it on Z? whatever, here's the same pic with normal Flux VAE. Notice it's not as sharp. there is a slight line artifact at the edge sometimes with chroma, still worth it.
lul it kinda knows Phryge if you do some boomer description of it
>>107541308Why have you been namedropping your general all day?
>>107540898You need to chain the ksamplers.
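Rough sketch of what "chaining" means in practice: split one run into back-to-back KSampler (Advanced) nodes that share the same total step count, each covering its own step range, so settings like cfg can differ per range. The helper name `chain_segments` is made up, and the field names are the node's widget names as I remember them, so check them against your ComfyUI version. This only builds the per-node settings; it does not talk to ComfyUI.

```python
def chain_segments(total_steps, boundaries, cfgs):
    """Plan settings for consecutive KSampler (Advanced) nodes.

    boundaries: step indices where a new node takes over, e.g. [3]
                gives two nodes covering steps 0-3 and 3-total_steps.
    cfgs:       one cfg value per node.
    """
    edges = [0, *boundaries, total_steps]
    if edges != sorted(set(edges)):
        raise ValueError("boundaries must be strictly increasing and inside (0, total_steps)")
    if len(cfgs) != len(edges) - 1:
        raise ValueError("need exactly one cfg per segment")

    segments = []
    for i, (start, end) in enumerate(zip(edges, edges[1:])):
        is_first = i == 0
        is_last = i == len(edges) - 2
        segments.append({
            "steps": total_steps,          # every node uses the same schedule length
            "start_at_step": start,
            "end_at_step": end,
            "cfg": cfgs[i],
            # only the first node injects noise into the latent
            "add_noise": "enable" if is_first else "disable",
            # every node except the last hands over a still-noisy latent
            "return_with_leftover_noise": "disable" if is_last else "enable",
        })
    return segments
```

`chain_segments(8, [3], [4.0, 1.0])` plans two nodes: steps 0 to 3 at cfg 4.0, then 3 to 8 at cfg 1.0. The latent output of each node goes into the next, and the same idea extends to swapping sampler, scheduler, or conditioning per node.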
>>107541347
Prompt her doing something interesting.
>>107541355
Thanks
>>107541324>chromosome revealwhew
>>107541365
>too lazy
no, but also yes. and placebo/snake oil trends happen all the time in these threads so i don't want to waste time on more.
that said, looks legit. jesus that's almost a night and day difference. i'm giving it a shot.
>>107541365
>https://huggingface.co/Owen777/UltraFlux-v1/tree/main/vae
interesting
https://imgsli.com/NDM0MzQy
>>107541214yeah, (sadly) I'm still here, after all these months, you still couldn't find the model kek
>>107541347Make her do ass-to-ass.
>>107541254Bit of ringing on real high contrast edges, but overall, not too bad.
>>107541422
oh, a different model? isn't it just sdxl?
>hiding hands
>skintone issues
>>107541476
the faults and glitches make me think of chroma
>>107540693can this wan2.2 thing do nsfw video?
even img2img is the best
>>107541504no. its only really good for generating videos of rats swimming in bowls of soup.
>>107541515i fucking love soup
>>107541515>rats swimming in bowls of soupmy fetish!
>>107541462
>>107541365
>>107541419
https://imgsli.com/NDM0MzQz
that's actually pretty cool, thanks for the vae anon
>>107541422>>107541530we need a proper investigation to find this wf
base base base base base
a smiling cat
>>107541422
>>107541365>>107541532Looks just like postprocess sharpening.
>>107541573
sharpening, that's what i was thinking it's doing. i forget all the image manip terms but it's definitely doing something very specific and enshittening.
it definitely reduces how blurry z-img gens look, but it adds artifacts from what looks like post process sharpening, like >>107541470, and adds a slight halo to white colors.
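For reference, this is what ordinary post-process sharpening (an unsharp mask) does to a hard edge, in one dimension and plain Python. It is only an illustration of where halos and ringing come from in a sharpening filter; nobody here has shown that the VAE actually does this internally. The function names and the toy signal are made up.

```python
def box_blur(signal, radius=1):
    """Average each sample with its neighbours (window clipped at the borders)."""
    out = []
    for i in range(len(signal)):
        lo, hi = max(0, i - radius), min(len(signal), i + radius + 1)
        window = signal[lo:hi]
        out.append(sum(window) / len(window))
    return out


def unsharp_mask(signal, amount=1.0, radius=1):
    """Classic unsharp mask: add back the difference between signal and its blur."""
    blurred = box_blur(signal, radius)
    return [s + amount * (s - b) for s, b in zip(signal, blurred)]


# A dark-to-bright step edge, values in the 0..1 range.
edge = [0.2] * 4 + [0.8] * 4
sharpened = unsharp_mask(edge)
```

On this step edge the flat regions stay where they were, while the sample just before the edge drops below 0.2 and the sample just after it rises above 0.8. That overshoot on both sides is the halo: darker than dark next to brighter than bright, which reads as a thin outline around high-contrast edges.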
This new vae just makes the zit banding even worse lol.
>>107541591But it works tho. Although I'm not sure if it is actually so good or if it has a photoshop filter baked in.
>>107541566That just looks like some of the broken gens I get when the settings aren't correct
the ultimate slop thrill is grabbing other people images and run them thru a promptless img2img with a high denoise value
>>107541601>it worksi guess if you wanna make all your gens look shitty and improperly upscaled kek. it does reveal that all z-img gens are blurred by the vae possibly.
>>107541566a smiling cat 2
here saar, I made slop out of your slop
>>107541602
that's not remotely similar.
>>107541619
here
Trying schizoprompting with Z, not having much luck
>>107541636a little better
THEY CHANGED A FONT FOR NO REASON
>>107541156>SAAS image generator like Dall-E 3 with sane speed on that machineAre you retarded? The only thing dalle 3 had special is the fact it was trained on more dataIt did mangled text and elements/anatomy every once in a while, it had signs the model was not that big>>107541156>let alone video generation.I can run Wan 2.2 on my machine better than most people can since I can offload both models (high noise and low noise) to different GPUs
>>107541666saar, please understand! work is progressing! thank you for noticing our hard work saar
>>107541652do you have good luck with temple gens (like with columns, like giant churches)?
zit is full of crazy loras. idk the triggers, but you can force them out.
>>107541713compare.
>>107541718:(
>>107541723tf is that cat doing there
>>107541700postan a few
>>107541740
>>107541156I still want it leaked. I don't care if it took an hour to gen, I KNOW it can gen what I want and the stupid dog is the only thing blocking it.
>>107541740>>107541747sd1.5 vibes
>>107541747
>>107541752
PLEASE EVEN MORE FIBER THIN FONT WHO ASKED FOR THIS
>>107541761
>>107541754Can the model do panties down? (in the leg shin area)
Crunchy as fuck, but for upscaled/resampled pics might be great. But the raw distill output is shit. Gonna try the vae on chroma.
animanon won
>>107541736I chain samplers. It's in the second prompt.
>>107541764That's like 2 whole days grif- work, pls understand
>>107541754instantly thought of The Sims
>>107541422It's hilarious that you think hiding the name of a model makes you some kind of elusive mystery man>What's his model, what could it be???>Mwa ha ha, I'll never tell!This is highschool edgelord level of larping and it makes you look patheticThis doesn't earn you respect; it just makes you seem like a fucking loser
>>107541896Yes, Ran. Now go back to your discord.
>>107541677You are retarded, it has a shit ton of params, hence it's really big, it will not fit in 48gb vram by a long shot which means tons of block swapping which means slow.
what?
>>107541906>See someone you disagree with>Immediately assume they are some boogeyman from a previous threadYou guys have enough self-awareness to recognize that you sound like schizos, right?
>>107541920Yeah Z is no good for schizoprompts, just turns into SD
>>107541750no it isn't.
>>107541919Show me a single dalle3 image that can do text better than Qwen-Image then, retard (a 20b model that does run on my machine)
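Back-of-envelope math for the "will it fit" argument: weight memory is roughly parameter count times bytes per parameter. The sketch below covers weights only. It ignores activations, the text encoder, and the VAE, and `headroom_gib` is an arbitrary placeholder, so treat the result as a lower bound and not a benchmark. The 20B figure is the one quoted in the thread; the function names are made up.

```python
# Bytes needed to store one parameter at common precisions.
BYTES_PER_PARAM = {"fp32": 4.0, "bf16": 2.0, "fp16": 2.0, "fp8": 1.0}


def weight_gib(params_billion, dtype="bf16"):
    """Approximate size of the weights alone, in GiB."""
    return params_billion * 1e9 * BYTES_PER_PARAM[dtype] / 2**30


def fits(params_billion, dtype, vram_gib, headroom_gib=4.0):
    """True if weights plus an assumed working headroom fit in vram_gib."""
    return weight_gib(params_billion, dtype) + headroom_gib <= vram_gib
```

`weight_gib(20, "bf16")` comes out around 37 GiB, so a 20B model at bf16 does not fit on one 24GB card without offloading or block swapping, while `weight_gib(20, "fp8")` is around 19 GiB and does. Splitting across two cards only helps if the loader can actually place different parts of the model on different GPUs.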
>base
it wont happen
>but..
no
>but look at this...
never
>>107541932Are you using any ddims or res_Xm?
Will we ever be free of the curse of sameface
>>107541954SKILL ISSUE
>>107540750
can you change the font
>>107541239>audioI'm not convinced having it done in the same model is fine unless you have a lot of pristine footage to train on and only Youtube really has that.
>>107541954turbo is a distilled model, that problem will be less obvious on base
>>107541951
rawdogging the default settings
https://files.catbox.moe/9nad6i.png
>>107541740
cool
>>107541761
yeah, it's realistic in sizes. idk why, but super fantasy (sci fi or magic tier) colossal scale is hard to generate. It's like tiny people, but furniture & doors remain somewhat constrained.
>>107541968Can we stop with this meme already? I got you vramlets got thrown a bone with z-image, but the base model won't magically be Nano Banana Pro tier
>>107542034
>but the base model won't magically be Nano Banana Pro tier
but unlike nano banana or bloatmodels you can work your way around the limitations once you run into them
>>107540443>13 cents a gen
>>107541919>>107541934Left Qwen-Image, right Dalle-3.
>>107540772Hell yeah I converted my 100+ datasets of VNs already to natural language using Jailbroken Gemini 3
>>107542084Only boomers and millennials like me will remember this. Sadly most of the thread are zoomers
$66,697/week selling Ai at 18
>step 1: get a job
>step 2: run facebook ads for boomers and sell your ai slop service
>step 3: ???
>step 4: profit
https://www.youtube.com/watch?v=7IUjT0AUdi8
>>107541954Skill issue, sweetie. So it has been for a long time.
>>107542057So sharp and good aesthic Qwen....
>>107541896
pretty sure he's just using some variant of pornmasterpro noob or an offshoot shitmix like alchemistmix.
>>107542113
the problem with having an easy remote job is that I have no motivation to grind and get money like this, and would rather just gen for fun.
>>107542017The Bitch should have been beheaded.
>>107541718I do not care for your very sad image, Anon. Make her smile or something!
vote on which will be released next:
>sd 1.6
or
>z image base
Has the mayli anon trained a mayli lora for z-image already?
>>107542218My vote is neither and my vote is the truth.
>>107542251
fake edited html earnings crap to entice people to use their affiliate links. everybody knows that normal people HATE, absolutely HATE, ai phone calls. that's just bs
>>107542251Truth hits hard at times
>>107542251lel nice
I'm proud of this gen. It's the first time I think that the eye placement is actually wrong. The scale isn't really off, so it's only a little off the goal.
>>107542262
>>107542113
Take it from a boomer. If it was this easy, everyone would be doing it.
>claim your services net you absurd amounts of money weekly
>build a community based on hopes and dreams
>profit off of that community based on your bs through ads, subscriptions, etc.
>>107542288
yeah his actual business is youtube ad revenue
Why do they insist on training models on both 2d and 3d
>>107542183the only way to make money with ai is by scamming people and i dont like it
>>107542288
>>107542299
his whole youtube page is just money-baiting crap. he sells his "courses", and that's all these zoomer entrepreneurs do. they all use the same tactic: show some edited html page with crazy earnings, write some money-baiting crap like "this 18 year old just made 100k in one month" or "my AI influencer just made me 10k in a week", here is how, use my links to <insert AI SaaS crappy site affiliate link that will pay me for every user that registers>, oh and here is the link for my $100 course, and my coaching and my 1-on-1 session, etc etc
AI is the new gold rush, everyone is selling prompts now, courses, etc.
a few years ago it was NFTs, then dropshipping, and now AI of course. they just hop on every trend they can so they can grift money out of it