Discussion of Free and Open Source Text-to-Image/Video Models and UI

Prev: >>106447640

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassic
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://tensor.art
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/sd-scripts/tree/sd3
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts
https://github.com/tdrussell/diffusion-pipe

>WanX
https://rentry.org/wan22ldgguide
https://github.com/Wan-Video
https://alidocs.dingtalk.com/i/nodes/EpGBa2Lm8aZxe5myC99MelA2WgN7R35y

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
Training: https://rentry.org/mvu52t46

>Illustrious
1girl and Beyond: https://rentry.org/comfyui_guide_1girl
Tag Explorer: https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Samplers: https://stable-diffusion-art.com/samplers/
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbours
https://rentry.org/ldg-lazy-getting-started-guide#rentry-from-other-boards
>>>/aco/csdg
>>>/b/degen
>>>/b/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
genuine thread
untouched by discordtranny hands
I shall be using this one
>>106451254
Catbox please I want to gen more of these.
>NSFW on a worksafe board
>>106452041
I'm all cosy in my blankey right now but the prompt was something like: large African American man wearing nothing but a towel approaches from behind and violently pushes the woman's head under the water.
>>106451834
>>Miku Hatsune does some stupid action
say no more senpai
>>106452065
Thanks, I'll give it a try. Goodnight sir.
Blessed thread of frenship
>>106452071
>>106452077
Finally some good anime
Just train a lora anon
>>106452071
ponder the aroma
>>106451972
based, fuck those guys pushing for their stupid shitty barely functioning guy frfr no cap
>>106452178
i dont have the hardware :(
>>106452178
Been thinking about trying that, just for the fun of it, but I wouldn't even know what for (or what model to use). And, to be honest, I don't even know how to build a decent dataset.
>>106452189
What hardware do you have?
Is it possible to gen videos with RTX 5070 TI? Also can SwarmUI do it?
>>106452231
>I don't even know how to build a decent dataset.
its as easy as grabbing the highest quality images of the thing you want to train. include variety of course. then use ai to caption the images and make adjustments. it's as simple as that.
some people let the lora trainer bucket their images (bucketing means the trainer crops/resizes the images to the closest resolution it supports), but i've found I got much, much better results by manually cropping/resizing the images myself.
some autistic anons use charts and other bullshit to analyze loss over time but all that shit is unnecessary.
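That manual crop/resize step can be sketched in a few lines. A minimal sketch assuming Pillow is installed; the bucket list here is just an illustrative SDXL-style set, not what any particular trainer actually uses:

```python
from pathlib import Path
from PIL import Image

# Hypothetical bucket list; real trainers (kohya, OneTrainer) use their own sets.
BUCKETS = [(1024, 1024), (832, 1216), (1216, 832), (896, 1152), (1152, 896)]

def closest_bucket(w, h):
    """Pick the bucket whose aspect ratio is nearest the image's."""
    ar = w / h
    return min(BUCKETS, key=lambda b: abs(b[0] / b[1] - ar))

def crop_resize(img: Image.Image) -> Image.Image:
    """Center-crop to the bucket aspect ratio, then resize to the bucket."""
    bw, bh = closest_bucket(*img.size)
    w, h = img.size
    target_ar = bw / bh
    if w / h > target_ar:          # too wide: crop width
        new_w = int(h * target_ar)
        left = (w - new_w) // 2
        img = img.crop((left, 0, left + new_w, h))
    else:                          # too tall: crop height
        new_h = int(w / target_ar)
        top = (h - new_h) // 2
        img = img.crop((0, top, w, top + new_h))
    return img.resize((bw, bh), Image.LANCZOS)

def prep_folder(src: str, dst: str):
    """Run every image in src through crop_resize and save as PNG in dst."""
    Path(dst).mkdir(parents=True, exist_ok=True)
    for p in Path(src).glob("*"):
        if p.suffix.lower() in {".png", ".jpg", ".jpeg", ".webp"}:
            crop_resize(Image.open(p).convert("RGB")).save(Path(dst) / f"{p.stem}.png")
```

Doing this yourself instead of letting the trainer auto-bucket means you control exactly what gets cropped out of frame.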
>>106452231
In terms of hardware demands in training, Wan / Qwen needs the most, then Flux, then Chroma, then SDXL, then SD15.
As for dataset, it depends very much on what you want to train.
>>106452178
i transform photos into anime in qwen edit, run several upscales with illustrious/pony and different artist loras and sort out extremely strictly - to create high-quality pairs. I add nsfw, landscapes, objects, people and various artists.
Two loras are planned: anime > photo and photo > anime.
A bit tedious, but I'm making good progress.
Is this welcome in the community?
>>106452261
>Is it possible to gen videos with RTX 5070 TI?
yes
>>106452178
Ahhhhg manly hands, let me guess... Chroma right?
I'm so tired of 5070ti. It can't run qwen q8 fast. I demand more vram
>>106452261
>16gb vram
Should be fine, don't let the vramlet sayers get you down.
>>106452189
>dont have the hardware
There's always renting
>>106452231
>what for (or what model to use)
Train for what you want to see, but cannot do currently. Chroma1-HD is the best base model for local training imo.
>>106452283
>Two loras are planned for anime > photo and photo > anime.
Nice
>>106452301
>Chroma right?
Yes, perspective issue. It has trouble sizing things realistically. Is that what he trimmed off and didn't realize it?
thoughts on anisora v3? kijai released fp8 scaled models but I'm not sure how to implement them or what the best settings are
https://huggingface.co/Kijai/WanVideo_comfy_fp8_scaled/tree/main/I2V/AniSora
Why the fuck is a 3090 so much better than modern cards?
>>106452342
by modern you mean like a 5060 or some shit? 3090 was top of the line and has 24gb vram
Hey hey Anon, Anon here.
Gave Chroma HD1 another run with some sampler/schedulers I wanted to try out and I like the way skin detail/face variance has been heading in compared to older versions. I really like it, apart from the usual hands and body horror sometimes.
I think genning at higher res (just used the Qwen default for 9:16) really helps.
Anyway, nothing new, but here's some plots.
Full sizes
Titty Fairy: https://file.garden/aIdN6xfH0QVghCy0/ChromaHD1-tests/ComfyUI_00001_.jpg
Trash girl: https://file.garden/aIdN6xfH0QVghCy0/ChromaHD1-tests/ComfyUI_00002_.jpg
Rat girl: https://file.garden/aIdN6xfH0QVghCy0/ChromaHD1-tests/ComfyUI_00003_.jpg
>>106452342
modern cards?
4090 & 5090 are better. 3090 is better than 5070ti/etc because of the 24gb vram. vram is king for wan gens. less vram means some of the model has to be offloaded to system ram which makes gens significantly slower.
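Back-of-the-envelope math for that offloading tradeoff; the bytes-per-param figures and the activation headroom are rough assumptions, not measured values:

```python
def offload_estimate(params_b: float, bytes_per_param: float,
                     vram_gb: float, overhead_gb: float = 4.0):
    """Rough split of model weights between VRAM and system RAM.

    params_b: parameter count in billions (e.g. 14 for one Wan 2.2 expert)
    bytes_per_param: 2.0 for fp16/bf16, 1.0 for fp8, roughly 0.56 for Q4 GGUF
    overhead_gb: assumed headroom for activations/latents/VAE (a guess)
    """
    weights_gb = params_b * bytes_per_param
    usable = max(vram_gb - overhead_gb, 0)
    in_vram = min(weights_gb, usable)
    offloaded = weights_gb - in_vram
    return round(in_vram, 1), round(offloaded, 1)

# 14B fp16 on a 16GB card: more than half the weights spill to system RAM
print(offload_estimate(14, 2.0, 16))   # -> (12.0, 16.0)
# same model at fp8 on a 24GB 3090: fits entirely, no PCIe shuffling
print(offload_estimate(14, 1.0, 24))   # -> (14.0, 0.0)
```

Every gigabyte in the "offloaded" column gets streamed over PCIe each step, which is where the slowdown on smaller cards comes from.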
>>106452178
I've been using it to gen FB MILFs I know being freaks, I can't stop genning.
>>106452189
You can train on as low as 12GB VRAM comfortably on onetrainer, they even have presets to use 8GB.
>>106452231
A dataset can legitimately be made out of like 12 images, my convergence point for Chroma has been around 1800-ish steps for 12/13 images so far.
>>106452261
16GB is more than enough. Just use comfy.
>>106452354
GOAT!!!! I was waiting for this. it seems like chroma performs better if the smallest edge is at least over 1024.
>>106452342
Ehh... I have a 3090, and a 5090 is about 2x its performance on the same ai workload.
I would gladly trade. That said, my dusty old 3090 is still going strong, it's a really good card. Overall the 30xx era was the best, with the 3060 Ti being the best bang for buck of all Nvidia cards to date.
>>106452399
if only it was just 2x the price
>>106452399>3060 Ti being the best bang for buck of all Nvidia cards to date1080GODS, what did this nigger mean by this?
>>106452397
>they even have presets to use 8GB.
Gotta hand it to OneTrainer, they do some magic with offloading where it allows for extremely low vram requirements and yet very little performance penalty.
>>106452418
old and busted
>>106452418
I speak the truth, do not reject my message!
>>106452399
All that time and the new card is only twice the performance; hardly great. Especially for the price.
>>106452354
ty, always interesting! beta seems to win. Have you tried OSS?
>>106452342
>3090
It's nice being in the 24GB club right? This 4090 I bought has paid for itself so many times.
>>106452397
>I can't stop genning.
Same. I've been quiet since the release because there isn't much I can share
>>106452433
performance uplifts are getting harder to achieve
Hardware can be the difference between a job and unemployment. Secure your future as a prompt engineer by purchasing the RTX 5090 today.
>>106452448
yeah, it's so hard a random chink in his basement can install 96gb vram onto a card that has 24gb without touching anything about the core of the gpu
I tried to use wan 2.2 but it goes to shit pretty quickly. I followed the rentry guide, and on workflow ldg_2_2_i2v_14b_480p.json, I get this error when I run a job:
"Set LoRA node does not use low_mem_load and can't merge LoRAs, disable 'merge_loras' in the LoRA select node."
It goes through when I disable it but the generation is fucked. Any idea what the issue could be? Running a rtx 5090
>>106452458
that's not a performance uplift you mouth breathing retard
>>106452433
True, it's two generations apart, Nvidia sure is coasting on the lack of competition.
I use ComfyUI. How can I make images with more than one character? When I do, it mixes the prompts of the characters... so I think there has to be a better way than just putting everything in CLIP Text Encode?!
>>106452458
speaking of, I wonder if nvidia is changing anything about their board designs or drivers to prevent that from happening again, or if they're avoiding it in order not to tank their Chinese partners.
>>106452469
>running a model almost 4 times the size that is much much better at fast speed that was impossible before since it didnt fit into vram is not a performance improvement, only core clock ghz improvements count!!!
talk about a mouthbreathing retard, lmao
>>106452447
>Same. I've been quiet since the release because there isn't much I can share
My guilty pleasure is genning them in NSFW I2V and then adding a bit of bitrate compression and throwing it on chatpic lol, nobody can even sus out that they're AI.
>>106452480
I doubt it's a huge enough industry to be a threat to their dominance
if they did implement counter measures it would be incredibly petty
>>106452478
regional prompting. generally this is the shortcoming/flaw of SDXL.
>>106452469
I know what you meant, but being able to load a lot more of, or even an entire model into vram is a massive performance uplift.
The less you need to offload, the faster things go.
>>106452505
you, you're the retard that doesn't understand what like for like comparisons are
>>106452499
>regional prompting
elaborate? I'm a total beginner
>>106452468
well did you disable merge_loras?
>>106452512
>that performance improvement doesn't count!!!
close your mouth when breathing, npc retard
>>106452358
>>106452512
I did, and it goes through but then the colors are all weird.
>>106452505
but then you need to make clear you're talking about two very different scenarios
the card with lower VRAM can be 1.5x/2x/3x/4x faster for models that fit in it, which is better then?
no matter what just don't be a retarded baboon like >>106452515
>>106452550
>the card with lower VRAM can be 1.5x/2x/3x/4x faster for models that fit in it, which is better then?
this is moving the goalpost from
>performance uplifts are getting harder to achieve
which got proven wrong, showing that they can be achieved with ease, almost x4 vram
so now adding more arbitrary qualifiers is just cope from a retard that talked big before he got btfod like the retarded baboon that you are
>>106452562
I'm sorry you're too stupid to understand what a controlled variable is.
For both our sakes I concede, you're right, all cards should be tested with a 50 trillion parameter model and nothing else.
I wish you nothing but a good rest of your life.
>>106452546
huh? merge_loras is disabled by default. i don't know why you even touched that
>>106452397
but i only have 16gb vram, ive been told im a vramlet and i need at least an h100 :(
>>106452572
>I'm sorry you're too stupid to understand what a controlled variable is.
yeah, it needs to be a controlled variable after your initial statement got disproven so you can ad hoc rescue your position
>the card with lower VRAM can be 1.5x/2x/3x/4x faster
notice the phrase "can be"? you're imagining a scenario that doesnt need to happen in order to make you right, appeal to possibility is a fallacy, retard
>all cards should be tested with a 50 trillion parameter model and nothing else.
ad absurdum fallacy of something newer said from your brain that is in cognitive dissonance to help you cope
>>106452592
>newer said
never said
>>106451219
Guise help
>>106452618
https://huggingface.co/api/resolve-cache/models/lodestones/Chroma/9540b7a813c3e06ca8eb0f01c25f3e76f931c08e/ChromaSimpleWorkflow20250507.json
>Still no Wan 2.2 VACE
It's not coming, is it?
>>106452592
>yeah, it needs to be a controlled variable after your initial statement got disproven so you can ad hoc rescue your position
performance comparisons require that you control those variables, it's a worthless comparison otherwise
so it was not ad hoc, it was always there you dimwit
>notice the phrase "can be"? you're imagining a scenario that doesnt need to happen in order to make you right, appeal to possibility is a fallacy, retard
I'm sorry, are you denying that this is possible? that if you slap 96GB on a GTX 960 it will beat a 5060? oh right, you don't care about comparing apples to apples, you'd take whatever model size gives the 960 the advantage which leads us to
>ad absurdum fallacy of something newer said from your brain that is in cognitive dissonance to help you cope
do tell us what is the RIGHT model size to test cards with. the upper bound seems to be 50 trillion
>>106452327
did you make the large Watson lora?
>>106452418
Nah man, 1080 ti was better.
>>106452580
I didn't, I swear. I redownloaded the json and looks like it's working fine now. don't know wth happened. Thanks for the help.
truth is joycaption is a weak captioning model and anything trained with it is doomed to have large flaws
>>106452468
>>106452618
Was going to load some stuff up for you guys but catbox is being funky to me right now and won't resolve.
chroma is pissing me off. feels like I have to go through 100 seeds to get something that isn't a body horror
>he fell for it
>he fell for "he fell for it"
>>106452486
No share, no problems. ez pz
>>106452618
anon catbox is down but try this https://iili.io/KfhE4KN.png
>>106452668
Maybe. Why?
>>106452679
Hold on there buddy. You are going to have to upload some of your new gens as payment for the tech support. I look forward to it.
>>106452666
>performance comparisons require that you control those variables, it's a worthless comparison otherwise
no, what matters is how fast a gpu can run X model, arbitrarily saying that more vram doesnt count as an improvement despite it improving performance in all models that couldnt fit before but now do is just false
>I'm sorry are you denying that this is possible?
strawman, i never denied its possible, i specifically said that appeal to possibility is a fallacy, i argued that it doesnt NEED to be the case, and isnt
holy shit you really are a low iq retard that cant follow basic conversation
>do tell us what is the RIGHT model size to test cards with. the upper bound seems to be 50 trillion
there doesnt need to be a specific limit, you made up that requirement, if there is an open source model that is good at something, people are gonna care if that model can fit into X gpu, simple as
given your general retardation and how much you were proven wrong already i will not shit up the thread anymore by allowing your npc brain to pollute it more with low iq fallacies, feel free to continue to cope in the replies without daddys attention anymore
>>106452337
If it's anything like the versions released for 2.1, it's extremely censored.
>>106452709
>Maybe. Why?
what did you use for captioning, it works so well? Just about to do a test run with OneTrainer. also: please give me all your loras
>>106452337
in fact i just realized this trash is still 2.1 trained.
into the garbage can. never going back to wan2.1
With wan, if I want to get genitals working with a lora, is it important that the base gen with just the model generates blank genitals first?
I remember playing with loras in 2.1 and it still generated gore even with loras at high weight a lot of the time. I'm thinking you still have to avoid certain keywords because a lora cannot fix badly mutated genitals.
>>106452740
>what did you use for captioning
InternVL3-8B outputs mostly correct captions. I have it describe the entire scene, lighting, etc, and name the subject - close to the 512 token limit.
>Just about to do a test run with OneTrainer
One of us!
>also: please give me all your loras
I've shared too much already. Plus, there's over 60 and I wouldn't know where to start
>>106452723
>strawman, i never denied its possible, i specifically said that appeal to possibility is a fallacy, i argued that it doesnt NEED to be the case, and isnt
>and isnt
it clearly fucking is? lmao
>there doesnt need to be a specific limit, you made up that requirement
it was an absurd value to make you realize a point but you failed
you're right, you are shitting up the thread with your fallacy fallacy stupidity
>>106452397
Qwen 9:16 is 928x1664 which already seems to help. My usual testing resolution looked worse, that's for sure.
So, 'go big' is the take-away, which most Anons already figured out.
Here's another quick set of gens at 1.15 times the resolution to take the smallest edge over 1024.
Should be expected that an arbitrary resolution like that would kinda ruin the output, didn't expect it to ruin it by that much, though.
>>106452447
I only used OSS for face detailing, not for entire gens. I'll plot some Euler/OSS comparisons. But I'm not all too interested in OSS since I guess the main point is getting faster gens. I don't really care too much about speed if the output is good, but for face detailing in a longer pipeline it kinda makes sense.
In general, I still haven't looked at NAG along with Chroma, so there's still a lot of way to go as far as testing goes. Which kinda sucks because I've been sucked into the Qwen/Wan rabbithole along the way. Feels like leaving an old friend behind, almost.
>>106452709
thanks anon
>>106452588
>at least an h100
Don't listen to Nvidia shills
You can train a lora in as little as 8gb, lora is the great equalizer
>look through old folder
>hours and hours of guy walks in and kisses anime girl gens
I only make them because they are cute.
>>106453058
We are not talking about sdxl, kiddo
>>106453121
You can do a Chroma lora with 8gb, using OneTrainer
They even include an 8gb preset in the trainer
>OneTrainer
>NotImplementedError: Loading of single file Chroma models not supported. Transformer-only safetensor files can be loaded by using the diffusers base model and overriding the transformer.
I can't fit this shit into Windows cache. Why can't I choose the dl directory if it downloads models straight from Huggingface?
>>106453232
Did you clone the whole https://huggingface.co/lodestones/Chroma1-HD or just dl the safetensors? I think you need configs included for OT
If I want to train a LoRA and the subject has a specific body type, should I include both full body pictures and detailed pictures of the face and tag accordingly? Will clothing be an issue?
>>106453270
Just downloaded the safetensor https://huggingface.co/lodestones/Chroma/blob/main/chroma-unlocked-v48.safetensors since I wanted to test training on an earlier version first
>>106453298
The v48 equivalent diffusers-style repo would be https://huggingface.co/lodestones/Chroma1-Base
>>106453322
I don't get it. Does it require transformer files too? I'm just so used to just choosing the base model that I can't wrap my head around this
>>106453377
It needs the model_index.json and for the subfolders to match and also have their own config.json
You could git clone the whole repo (re-downloading everything), or download the .json files and re-create the structure. The OT error says it doesn't support choosing the model as an individual file without the structure around it
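If you rebuild that structure by hand, a quick script can tell you what's still missing. This is a generic diffusers-layout sanity check I'm sketching, not something OneTrainer itself ships; the list of config filenames it accepts is an assumption based on typical diffusers repos:

```python
import json
from pathlib import Path

def check_diffusers_repo(repo_dir: str):
    """Return a list of paths missing from a diffusers-style repo.

    Reads model_index.json to learn which subfolders the pipeline expects,
    then checks each one contains some kind of config file.
    """
    repo = Path(repo_dir)
    index = repo / "model_index.json"
    if not index.exists():
        return [str(index)]
    spec = json.loads(index.read_text())
    missing = []
    for name, value in spec.items():
        # skip metadata keys like _class_name / _diffusers_version
        if name.startswith("_") or not isinstance(value, list):
            continue
        sub = repo / name
        has_config = any((sub / c).exists()
                         for c in ("config.json", "tokenizer_config.json",
                                   "scheduler_config.json",
                                   "special_tokens_map.json"))
        if not has_config:
            missing.append(str(sub))
    return missing
```

An empty return means every subfolder named in model_index.json has at least one config, which is roughly what the OT loader is complaining about when it's absent.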
I have an sdxl dataset with caption files. What LLM will let me turn the booru tags into natural language? Preferably in batches.
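No specific model was named in-thread, but any local OpenAI-compatible server (llama.cpp, ollama, etc.) can batch this. A sketch where the endpoint URL, model name, and prompt wording are all placeholders; only the prompt-building part is meant to be load-bearing:

```python
from pathlib import Path

# Hypothetical instruction; tune the wording for your LLM.
SYSTEM = ("Rewrite the following booru tags as one natural-language "
          "image caption. Keep every visual fact, invent nothing.")

def tags_to_prompt(tags: str) -> str:
    """Turn a comma-separated tag line into an instruction for the LLM."""
    clean = ", ".join(t.strip() for t in tags.split(",") if t.strip())
    return f"{SYSTEM}\nTags: {clean}\nCaption:"

def gather_jobs(dataset_dir: str):
    """Pair each .txt caption file with the prompt to send."""
    return [(p, tags_to_prompt(p.read_text()))
            for p in sorted(Path(dataset_dir).glob("*.txt"))]

# Sending the batch is backend-specific; against an OpenAI-compatible local
# server it would look roughly like this (endpoint/model are placeholders):
#
#   import requests
#   for path, prompt in gather_jobs("dataset"):
#       r = requests.post("http://localhost:8080/v1/completions",
#                         json={"model": "local", "prompt": prompt,
#                               "max_tokens": 256})
#       path.with_suffix(".caption").write_text(
#           r.json()["choices"][0]["text"].strip())
```

Writing to a new `.caption` extension keeps the original tag files intact so you can rerun with a different prompt.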
>>106453106
uwaaaaaa put a smile on my face
>>106453377
No way to screw it up if you:
git clone https://huggingface.co/lodestones/Chroma1-Base
and point OT to that Chroma1-Base folder
>>106453423
a 5090 is still in the VRAMlet caste
>>106452543
A Kia?
>>106453415
>>106453497
TY dudes
>>106453527
nah, but I do think 24~32gb would be more akin to merchants/commoners. knights would be 6000 pro users. aristocrats would be h100/h200 users. anything below 16gb is peasant tier.
onetrainer loras do not work in chromaforge (the superior way to generate for chroma)
>>106453717
how is it superior if it literally doesn't work
>>106453749
chromaforge has hiresfix, so it wins by default.
imagine having to tard wrangle a handful of nodes when grugg press buton for gud imaeg
>muh gimmick upscaling method
>>106453497
nta but as someone who wants to train an assload of loras on onetrainer, why choose base over HD? was HD not further trained on higher res images?
>>106452723
>no, what matters is how fast a gpu can run X model
in that case the 5090 does not have just 2x the performance of a 3090
>>106453828
thats 1 node in comfy and it takes all of 2 seconds to set up. you really dont even need hiresfix with chroma depending on the sampler
>>106453851
anon probably thinks v48 is still better than HD
Hey /g/uise, does anyone have a retard proof tutorial on how to get video generation? I would also like to add voice and lip sync.
Preferably, I would like to use my Mac Mini M4 Pro 64GB but if not I have a 4090.
>>106453851
HD resolves details much better imo. anon asked how to make v48 work and that's what Chroma1-Base ended up being.
>>106452354
>I really like it, apart from the usual hands and body horror sometimes.
>I think genning at higher res (just used the Qwen default for 9:16) really helps.
What resolutions do you recommend for chroma hd?
Got inspired by anon, Chroma run go!
>>106453232
If you git clone (or manually download all files) the https://huggingface.co/lodestones/Chroma1-HD you can put it anywhere and just set the model path in OneTrainer to it.
>>106453298
You still need the Chroma1-HD repo, because the safetensors file is just the transformer
However if you have the Chroma1-HD repo, and want to train on another transformer, put the path to that transformer in the 'Prior model' in the OneTrainer model setting
So first: Chroma1-HD, then under it in Prior Model, put the path to chroma-unlocked-v48.safetensors (or whatever other version you want to use)
>>106454231
I guess that's pretty nice considering one repo takes around ~84gb disk space. Is it a workaround so people can train with under 24gb cards?
>>106454294
what. the HD repo is barely 30gb. even smaller if you delete the safetensor file
>>106454294
>~84gb disk space
??? My Chroma1-HD repo uses ~25gb, you can delete the .safetensors file in the repo if you don't need it since it's not used for training, only for inference (and perhaps you use another version for that, like I do)
But that doesn't account for ~84gb, are you sure you don't have git binary blob directories left from cloning? If so you can delete them.
>>106454313
NTA, but if you git clone a repo, the ".git" folder itself contains a bunch of shit.
for example, git cloning the Wan2.2-T2V-A14B repo is like 235 GB.
Are you >>106453481 /ldg/? Or are you a generate batch sloppa?
>>106454192
Based anon, godspeed!
>>106454342
>the ".git" folder itself contains a bunch of shit.
Isn't that just some cached crap you don't need?
>>106454359
you won't be able to do git pull to update, not that you'd need to since it's a finished model.
>>106454338
>are you sure you don't have git binary blob directories left from cloning? If so you can delete them.
Indeed I had. Thanks anon!
>>106454342
Why the hell does it hoard this crap
>>106454359
Yes, I mean unless you are going to make changes and upstream them, for end user purposes like training etc, just delete them.
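Before deleting, it can help to see how much space those `.git` directories actually eat; a generic sketch, nothing model-specific:

```python
import shutil
from pathlib import Path

def git_dir_sizes(models_root: str):
    """List every .git directory under models_root with its size in GB."""
    results = []
    for git_dir in Path(models_root).rglob(".git"):
        if git_dir.is_dir():
            size = sum(f.stat().st_size
                       for f in git_dir.rglob("*") if f.is_file())
            results.append((str(git_dir), round(size / 1024**3, 2)))
    # biggest offenders first
    return sorted(results, key=lambda x: -x[1])

def prune(models_root: str, dry_run: bool = True):
    """Print each .git dir and optionally delete it (you lose `git pull`)."""
    for path, gb in git_dir_sizes(models_root):
        print(f"{gb:>8.2f} GB  {path}")
        if not dry_run:
            shutil.rmtree(path)
```

Run `prune("/path/to/models")` first to see the list, then again with `dry_run=False` once you're sure you never need to `git pull` those repos again.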
>>106453853
Yeah, for sdxl 1girl stuff it's faster than that. Heck, for that simple stuff I think my 5070ti is faster too, not vram limited there
8gb vram isn't nearly enough for video generation is it
>>106454443
It's doable with quants and a good workflow, I was genning with an 8GB 2070S before I upgraded.
Checking in after 2 months. Is Wan still limited to 5 seconds for no reason?
>>106454462
Not unless you like waiting + dogshit quality. You can technically "do video gen" on 2gb vram if 90% of the model is offloaded to sys ram + light2x lora + torch/mag/tea/nag/easy cache + Q2 quants.
>>106454462
if it is what it was trained on, how is it for no reason?
is Topaz still the upscaler for video?
Which model/models are considered the highest quality and most versatile and general purpose, being able to follow detailed prompts to as high a degree as possible, and able to run on gtx 1070 8gb vram 16gb system ram?
>>106454608
>the highest quality
>on gtx 1070 8gb vram 16gb system ram
what's the best "high resolution fix" for comfy ?
>>106454462
I do 7-8 no problem.
I am trying to chain videos together with end frames, but the color of the next video changes slightly. Is there a way to prevent that?
>everything i gen comes out as slowmo
just one of those days i guess
Jesus christ, I can delete this shit, right?
>>106454608
Perhaps try Q6 or Q8 GGUF quants of wan2.2 and/or chroma/hidream/qwen/sana/[...] via the distorch multigpu loader
Unfortunately I don't know exactly what fits in 16GB system RAM (+much worse, swapping to SSD/HDD). But picking a quant and the offloading in the aforementioned node gives some flexibility.
I'm not sure you wouldn't enjoy SDXL or other smaller models that generate faster, tho.
>>106454746
did you not read the previous posts? yes.
>>106454746
yes, but then you may not be able to update via git.
you actually might just want to switch to a ssd/hdd with the appropriate size for modern bloat
>>106454598
Yes.
>>106454735
I know you can use latent instead, apparently even the last few latent frames (for ex 16) with some custom nodes to give info about movement, but I didn't try so I have no idea outside of knowing it exists.
>>106454742
stop using lightv2x
>>106452511
>>106452499
also curious about the best way to go about this
>>106452478
try hidream, wan (can also be used for images), qwen, chroma
it generally works better there. some may of course not have trained your characters and then for 2 characters at least you obviously still get better results with noob/illustrious if they HAVE trained the characters
>>106454885
>>106452511
NTA, but take a gander at this workflow. It is simple enough to understand.
https://civitai.com/models/1080711/comfyui-regional-prompter-workflow
Wake me up when we get something like Flux or Chroma but working locally and having the braindead image editing of Gemini Banana.
>>106452478
wait for someone to crack NovelAI's character separation technology
>>106454947
instead of sleeping you could be working to get the nano banana weights
>>106454968
>>106454906
>>106454762
>sana
Maybe skip that one. I still cannot believe they made the VAE worse than what SDXL had
>>106454979
cry about it, 'jesh
>>106454762
I'm not interested in generating videos though. mainly high quality realistic images following long and detailed prompts as close as possible with the most amount of context awareness possible. I know it's possible to tweak the workflow by splitting things up to create better context awareness, but I'm interested in a model that is as context aware as possible from the get go.
Right now I'm using SDXL base 1.0 without any loras or anything, and the output is just not very good. It doesn't adhere to the prompt very well and there are often weird artifacts and smudgy stuff. I've tried different samplers and schedulers, and it's just not very good at adhering to more complex/longer prompts.
Right now I'm not using any launch arguments or extensions, and one gen of a 1024x1024 on SDXL Base is taking almost 2 minutes. I feel like it shouldn't take that long even with how low my specs are. Maybe I'm doing something wrong. I'm not sure which arguments I could launch with to make it faster. If I can't get it any faster, I'll have to live with it for now.
But SDXL base is just not cutting it for me. It's just not a good enough standard output that I think it will help a lot with using loras and whatever else to tweak it. I'm feeling there must be some better models than SDXL base at the same level of hardware requirements, maybe some finetuned checkpoints based on SDXL or some other type of model.
Best practices for captioning for style LoRAs for Qwen? I'm training on 19th-century Academic Orientalism paintings. Is it better to caption all the images as "Orientalism paintings", or to use a variety of more generic terms like "painterly" or "realistic oil painting" across the dataset?
>>106455034
I can get everything with qwen image and go for some wan passes.
In open source you can't get better results
Anyone know anything about the audio generators? Anything local that tries to compete with ElevenLabs?
I have not been here for a month.
I have only one question:
Did the Mayli anon deliver? do we have a Mayli NSFW folder?
>>106455130
Been using Chatterbox. Here's Emma reading some Orwell https://voca.ro/1ikmIkpzsLHX
>>106455077
If the pattern is easy for the ai to 'grok', there are arguments for just describing the contents of the image in generic terms, without mentioning style, medium etc; likewise there are arguments for adding those in order to piggyback on the knowledge already in the base model, thus giving faster convergence.
I don't think there's a real consensus, you probably have to decide for yourself and your specific training data.
>PAWG wan lora>use it with lolieeehehehe
>>106455200
Too bad there's no easy answer. I love to experiment but training Qwen takes way too damned long on my 3090. Maybe I'll just do a runpod or Civitai or something.
>>106454715
Hiresfix like auto1111 is just an additional sampler pass at a larger size of your image, so just stick an extra KSampler node between your first pass and image output
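One detail worth keeping in mind for that second pass: the target resolution should stay divisible by the VAE's latent stride. A small helper sketch; the 8-px-per-latent-cell assumption holds for SD-family VAEs:

```python
def hires_pass(width: int, height: int, scale: float = 1.5,
               multiple: int = 8):
    """Target resolution for the second (hires-fix) sampler pass.

    Latent dimensions must divide cleanly, so each side is snapped to
    the nearest multiple (8 px per latent cell for SD-family VAEs).
    """
    def snap(x: float) -> int:
        return max(multiple, round(x * scale / multiple) * multiple)
    return snap(width), snap(height)

# 1024x1536 at x1.5 -> 1536x2304, both divisible by 8
print(hires_pass(1024, 1536))  # -> (1536, 2304)
```

Feed the snapped dimensions to an upscale-latent (or upscale-image) node before the second KSampler; the denoise on that pass is the knob discussed further down the thread.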
>>106455268
>Too bad there's no easy answer.
AI training in a nutshell, I'm afraid. Particularly when you want to reach the best quality.
>>106452178
i tried to do one of Mila Kunis and it took forever and came out shitty.. i have no idea how to do one properly.. i followed some guide online but there's a billion options and no one covers even 1/10th of them
>>106452399
just bought a 5090 last saturday.. noticeably faster than my 4090
>>106455231
makes zero sense, it's a t2v lora
>>106455034
wan for a single frame is an image.
the advantage to this is that wan understands multiple characters and spatial instructions and such much better than SDXL
else hidream, chroma, qwen, ... are also better or much better
>>106455268
Yeah I got it, thanks.
>>106455090
what do you mean with open source? you mean with distorch?
>>106455276
silly naive anon
you know nothing
>>106455290
great explanation anon, thanks
>>106455287
I think he means local
>>106455282
Oh, ok. Thanks. I didn't know that.
I'll try it out.
>>106455160
That's pretty good. Have you tried any of the others?
Where the fuck is the refresh nodes button? I updated comfyui but the UI changed and the button disappeared.
I know the hotkey is R, but I would like to know where it is in case comfy changes the hotkey too
>>106455308
yeah. Sorry. I know what he means now. I got confused. I thought he said can instead of can't.
>captioning art with Gemini
>a handful of the images include women with unshaved armpits
>the armpit hair is always the first thing mentioned in the captions even though it isn't the focus of the image at all
So basically Gemini looks at images exactly the same way I do.
>genning gooning during day
>training when I'm sleep
This shit will fix my fucked up sleep schedule, dang
>>106453613
>>106455318Nope, just chatterbox. Havent had the time
>>106455391kek, but knights would 48gb vram. definitely not the same power as the king himself
>>106455268What denoise value do you recommend in the second ksampler?I've gone 1024x1536 -> x1.5 -> denoise 0.5
>>106455378more like training when I'm sleep + at work. my computer has been on for 2 weeks now working non-stop at max gpu. sometimes I queue up 30x wan gens during work and get notifications to my phone when each is finished. they're like little treats.
>>106455160
That's pretty good compared to all the meme ones we were doing with that online service a few years ago. I listened to the whole thing and didn't hear any errors or weirdness. Maybe a little too monotone if anything. I really should look into audio, make my own asmr sleepy time shit
>>106455443
This. I thought genning would be my huge time sink, turns out it's training that drives the majority of my interest.
>random pic of girl at grocery store
>wan2.2-i2v-high-oral-insertion-v1.0
>wan2.2-i2v-low-oral-insertion-v1.0
>old photo of crush
>oral-insertion-v1.0
>comfy dev
>oral-insertion-v1.0
>>106455425
There's no "correct" number, you gotta futz with it. One value might be fine for one image and give you flesh monsters for another, or if it's not denoised enough then it'll just be splotchy. Too much and it'll change the image too much, and so on. Plus, it depends on your sampler and your model and your seed and blah blah blah. Like everything else in this forsaken hobby, you have to roll it like a dozen times just to see.
>>106455488
Just tried. I was using chroma because nsfw; 0.5 gives a blurry mess, 0.7 seems better. I think I'll try again with 0.8-0.9.
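If you want to sanity-check what a given denoise value actually buys you in that second pass, here's a minimal sketch. To be clear, this is not ComfyUI's internals: it assumes the diffusers-style img2img convention where denoise (strength) d at s scheduled steps actually runs roughly round(s * d) of them, and `second_pass_plan` is a made-up helper name just for illustration.

```python
def second_pass_plan(width, height, upscale, steps, denoises):
    """Return (new_w, new_h, [(denoise, effective_steps), ...]) for a
    hires second pass. Effective steps use the diffusers img2img
    convention: round(steps * denoise)."""
    new_w = int(width * upscale)
    new_h = int(height * upscale)
    plan = [(d, max(1, round(steps * d))) for d in denoises]
    return new_w, new_h, plan

# The 1024x1536 -> x1.5 example from the thread, sweeping the values discussed:
w, h, plan = second_pass_plan(1024, 1536, 1.5, steps=20,
                              denoises=[0.5, 0.7, 0.8, 0.9])
print(w, h)   # 1536 2304
print(plan)   # [(0.5, 10), (0.7, 14), (0.8, 16), (0.9, 18)]
```

The takeaway: at low denoise the model barely gets any steps to repaint the upscaled latent, which is one plausible reason 0.5 comes out blurry while 0.7-0.9 looks better.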
>>106455482
oooooooooooo he's COOKIN now!
>wonderbread
>oral-insertion-v1.0
>>106455482
Qwen image edit and Wan2.2 certainly fucked up my mind. So I am not the only one.
>>106455136
nope, hasn't posted for a long time either.
>>106455572
>Qwen image edit
no nsfw so not sure how
>>106455572
qwen image edit? don't you mean chroma. qwen sucks at realism.
What are people using for character consistency nowadays? Is it QIE or something else?
>>106455623
>What are people using
alcohol
what to use to inpaint realistic genitalia?
I'm way too used to convenient tools for anime styles
>>106454350
>>106453851
HD is more temperamental to prompt with and struggles with smaller resolutions. I'm not at my PC but I have some comparisons I could make once I am.
>>106454443
I get by on q8 with ram offloading just fine, and if you're less picky q4 is as low as you should go. You won't be outputting huge resolutions but it's doable.
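For anyone wondering why q8 needs RAM offloading in the first place, here's the back-of-the-envelope arithmetic. This is a hedged approximation: it assumes quantized weight size is roughly params * bits/8, and ignores GGUF per-block overhead, activations, and the text encoder, so real files run a bit larger.

```python
def quant_gib(params_b, bits):
    """Approximate weight size in GiB for a model with params_b billion
    parameters stored at the given bit width (params * bits/8 bytes)."""
    return params_b * 1e9 * bits / 8 / 2**30

# Rough sizes for a 14B-parameter video model (e.g. a Wan-class DiT):
for bits in (16, 8, 4):
    print(f"q{bits}: {quant_gib(14, bits):.1f} GiB")
```

So q8 of a 14B model is ~13 GiB of weights alone, which is why it spills past a 16GB card once activations are counted, while q4 at ~6.5 GiB fits with room to spare at the cost of quality.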
>>106455722
Nice nightmare fuel
Picked up the Giger style well, combining airbrush with sharp acrylic painting
>>106455755
It's not bad since I just ran it with default values. Need to increase rank and learning rate.
>>106455780
Yeah it looks good. If you're using OneTrainer, the best lora preset IMO is 'blocks'. It's a bit slower than 'attn' or 'attn,mlp' but it picks up more detail at the same number of trained epochs. I wouldn't do 'full' though, it doesn't seem to have any benefit over 'blocks' and is slower.
Overall when it comes to art style training, I really like how Chroma picks up the 'texture', particularly when compared to Flux dev
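Rough idea of what those presets mean, for anyone who hasn't poked at them: they decide which module names get LoRA adapters attached. The sketch below is an illustrative assumption, not OneTrainer's actual filter strings or module names — just substring matching over dotted module paths to show why 'blocks' covers more (attn + mlp + norms) than 'attn' alone.

```python
# Hypothetical layer-filter presets; real trainers typically use regexes.
PRESETS = {
    "attn": ["attn"],
    "attn,mlp": ["attn", "mlp"],
    "blocks": ["blocks"],  # whole transformer blocks: attn, mlp, norms
}

def select_modules(module_names, preset):
    """Keep modules whose dotted name contains any of the preset's patterns."""
    patterns = PRESETS[preset]
    return [n for n in module_names if any(p in n for p in patterns)]

modules = [
    "blocks.0.attn.qkv",
    "blocks.0.mlp.fc1",
    "blocks.0.norm1",
    "final_layer.linear",
]
print(select_modules(modules, "attn"))    # ['blocks.0.attn.qkv']
print(select_modules(modules, "blocks"))  # the three block modules
```

More targeted modules means more trainable parameters per step, which lines up with 'blocks' being slower but picking up more detail at the same epoch count.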
>>106455780
>that wasn't a microdose
>>106455482
>this post
>oral-insertion-v1.0
https://file.garden/aIdN6xfH0QVghCy0/thispost/thispost.mp4
>>106455809
Thanks for the tip, I'll check the blocks setting. I wanna try if it works with lycoris+came+cosine.
>I really like how Chroma picks up the 'texture'
That seems to be the case. I'll upload this lora once I've retrained it
>>106455812
Giger never took a microdose of anything!
>>106455866
Not sure about lycoris/lokr support in OneTrainer, never used it, but CAME is supported
>>106455866
this looks like herpes
is there a reliable way to make sure wan gives female characters flat chests or small breasts? Specifically clothed characters that don't already have their breasts exposed in the input image.
Too many times wan gives the girl big boobs when she is supposed to be flat/petite. I've tried various different forms of conditioning in the prompt but nothing is consistent.
>>106455866
Here you go Gigerbro
https://files.catbox.moe/8n4c3d.mp4
>>106454947
WAKE UP BITCH
https://files.catbox.moe/3oic8d.mp4
>>106455959
This lora must have been downloaded more than twice at this point, right?
>>106455939
I'm purely guessing here, but they probably either didn't train that or maybe even trained against it, because for all the SDXL base models flat chest was basically the cheat code for ToT. And they don't wanna make it easy to do that sort of thing with a realistic model.
>>106455866
>>106456029
>>106456031
>>106456044
based gens
>>106455881
>Giger never took a microdose of anything!
I bet he did after the Aliens 3 clusterfuck
>>106455908
It has lycoris support and 8bit came, should be doable. Just have to figure out where to put extra arguments if it even needs them
>>106456044
New Space Jockey anime when? Untapped potential!
>>106456110
>I bet he did after the Aliens 3 clusterfuck
More like a gigadose, maybe even a mega pint
Nice gen
>>106456044
This is me btw
>>106456029
>vore gens in my /ldg/?
More likely than you'd think
a*****dio will redeem
>>106456161
ty!
does anyone really think they could keep up with nvidia, who earn enough to invest 10x as much in research as all competitors worldwide combined?
>>106456499
Too much money on the table to let Nvidia walk away with it all.
>>106455422
surely the king would be a B200 at 180GB VRAM
>>106452740
Here's one more https://gofile.io/d/W8rk7u
>>106454192
Flip those bits anon
>>106455270
>it took forever and came out shitty
Unfortunate. I usually bake 8-10 hours.. unless low image count, then 100 epochs.
>>106455722
>>106455780
>>106455866
Nice
>>106456031
stop making everyone suck dick :(
>>106456029
>>106456751
holy shit real captain falcon
>>106456750
Seek medical help for your limp dick
>>>/vt/104115369
Can I use AI somehow to remove the demented watermarks the schizo AI-anti artists use?
>>106456741
>Here's one more https://gofile.io/d/W8rk7u
Thanks for sharing. celebrity?
>>106456884
can easily be done with flux kontext or qwen image edit. kontext seems easier to wrangle for this purpose but it produces jpeg-like artifacting, pic related
why has diffusion gotten less popular lately? there seems to be a large dropoff in interest. probably a lot of things: models being too big, the edit models not being a bigger deal than what we ended up getting, the software committing seppuku every update or breaking itself with memory leaks. what do you guys think?
>>106456971
i guess qwen edit works just as well or better
>>106456971
>>106457055
How do you prompt the edit models? Legit boomer prompting like "remove xy, do wz, turn a into b"?
>>106456971
>>106457055
can they remove "adversarial noise" like this?
>>106456750
>stop making everyone suck dick :(
Everyone and everyTHING will submit to BWC, as nature and God intended.
https://files.catbox.moe/ziriy0.mp4
>>106457055
for that it was just "remove the watermark in the bottom right and on the girl's shoulder". tabbing between the two it's clear that qwen did a better job at removing the watermark and the colors are more vibrant.
>>106457106
he can't keep getting away with it
>>106456988
the reddit community seems to be going apeshit with it. I think the dopamine hits you get from genning are so far off the charts, way higher than we could have bargained for. I think that if you ever got into genning, not genning anymore probably isn't a concept that exists for you.
>>106457138
reddit is going nano bananas but a lot of that is also bots shilling it. I don't think there is much interest in local diffusion
>>106457157
yeah, I'm sad to say genning has completely taken over as my main hobby. i haven't done fucking anything else besides ai gen for the past 6-7 months. not remotely close to 'burning out'. still so many concepts, ideas and things to try, it feels near infinite.
>>106457098
yerp. I didn't even realize what you meant at first until I zoomed into the image. I just prompted "remove the watermark in the top right and bottom right" and it drew the image without the noise.
>>106457113
meant for >>106457082
>>106457157
It's what people that learn to draw have always experienced
>>106457157
I'm jealous. I seem to be a creatively bankrupt person incapable of producing original ideas. I frequently get really excited about some ideas or settings but when it comes to actually creating something I draw a blank. I learned so much and have gigabytes of catboxes but I mostly just produce slop every now and then, and not even that anymore.
>>106457126
i'd personally do this but sadly I am in the middle of doing a wan gen
>>106457151
The closest thing to organic on reddit is the hustlers trying to monetize the waves of astroturfing. The answer is much simpler in that all the moneyed interest has shifted to video. It's a newer set of keys to jingle in front of increasingly skeptical investors. Image generation is not very useful, is saddled with big lawsuits, and is viewed negatively by the public. But hey, it's great for making weird porn.
>>106457225
I don't see how video would be more enticing for said investors if they already understand the limitations of image gen, it has a lot of the same issues for them if not more
>>106456988
Diffusion has taken a backseat to video generation, this is 100% expected since video has such a broader reach.
That said, the endless possibilities of i2v mean diffusion will remain in wide use.
Any hope for a NSFW lora/finetune for Qwen image edit, or no? The licence is much better than Kontext's, but a lora won't cut it if the model has 0 nsfw trained in, will it
>>106457243
Yeah, but investors are also basically stupid animals that act entirely on emotions, hence Nvidia stock dropping every time a shitty AI company gets bad press.
>>106457157
Same, except I'm not sad to say it. I've practically stopped gaming, and I've totally stopped doing stuff in Blender (unless I'm doing something to use with AI). It's a seemingly endless journey into what you can make this technology do, and as soon as you think it's reached a plateau, something like Wan drops and you're just like WTF?
>>106457243
it's new. not really, but it's still got the wow factor, for now
that's what this whole industry is propped up by, potential
>>106457261
It's not really practical with how big and slopped the model is. Chroma only exists because of a big community funding effort and one autistic guy, and schnell was a much smaller model.
>>106457283
I wasn't even done exploring SDXL. In fact I was doing some nice art that people in a discord I'm in enjoyed, but I don't even have time to do SDXL art anymore because WAN has taken over. I really need dual GPUs so I can do the art stuff while WAN is processing.
However I'm not dumb enough to blow my money on another GPU yet. I still need to max out my Roth IRA for the year and reach my investment goals for the year. Finances always come first.
>>106457225
>Image generation is not very useful, is saddled with big lawsuits
Same will be true for video, but it's certainly harder to monetize image generation than video generation; the latter has a much broader appeal.
I don't think services like Dall-E have a chance of making a profit even if you don't count the initial cost, they're just company tech demos.
>>106457126
you were asking for it
https://files.catbox.moe/fn03ja.mp4
>>106457293
Welp, guess I will just use it to get more pics for a character dataset, and remove watermarks then
>>106456890
Kreuk
>>106457342
Smallville sex cultist
>>106457342
>kristin kreuk
based anon
>>106457330
The studios have huge dicks to swing around while also having dumb suits that want to train their *own* models. It's just impossible to monetize this stuff in its current form in any way.
>>106457248
while this is true it's not what I mean. yes I see more YouTube AI vid grifting but I don't see any wider interest in people doing it themselves. when wan came out the threads were moving fast but nowadays it just seems like people are burnt out
I don't think they will see any money until they can make entire movies that look good
>>106457379
>nowadays it just seems like people are burnt out
I think it's what you brought up before. everyone got obsessed with trying to make it faster for so long but updates and snake oils made it extremely frustrating for a prolonged amount of time. comfyui is definitely worse than it was a year ago, forge is dead, wan2gp is a bit limited and anistudio is updating at a snail's pace
bit of a niche here, but anyone got/know a wan lora for just the body teleporting out of their clothes? or maybe one where just the body turns invisible? trying to experiment with living clothes
>>106457379
These threads have been much deader in the past. This is an expected lull after a release, and even then it still moves decently fast. That being said, I think video really is the future. Wan 2.2 is already excellent at single frame generation. I think we're only scratching the surface of what's possible with video at this point. I sometimes run my gens through video models to get more interesting and natural posing.
I need a couple links for AnimateDiff model download, as well as a simple workflow
Thanks in advance
>>106457379
This stuff is too technical for 95% of people and 4chan is slowing down a lot on its own for other reasons. It's definitely a niche space.
>>106457379
>seems like people are burnt out
More like genning stuff you can't share online, as in hot real world people in NSFW gens.
You can't share that online since it's illegal, for good reason, but I'm thinking a LOT of gpu time has been spent on that since Wan came out.
>>106457433
>AnimateDiff
Halt. This is a suspicious request. gonna need to ask why you need that model.
>>106457433
no one uses animatediff. that's just the default filename for the video combine node
>>106457437
Wan truly is wantastic. If you're willing to spend the time on the gen, the outputs are genuinely usable depending on what your use case is. Like if you animate sprites right now, you're eating good.
We really are about to go a whole year without any new big anime model since noob, aren't we
Weeb models are really getting outdated
>>106457379
Everyone was hyped about wan, and although it did deliver it wasn't on the level and quality everyone imagined and had expected. That's why it died down over time, when everyone realized it was just average slop.
>>106457455
lel
>>106457455
Wan sucks because it doesn't fit in <my current VRAM size> and/or is too slow on <my current card>
>>106457439
r-reasons
>>106457455
>it wasn't on the level and quality everyone imagined and had expected
Huh? It's genuinely fantastic?
Anyone tried booru captions with a Chroma lora? Should work since it was trained on that, but I'm sceptical.
>>106457455
No, wan really is pretty good. I just think that it takes a little more effort than most people anticipate. Diffusionists are crackheads conditioned to instant gratification. As soon as something takes more than 5 mins to get results, forget aboud it
>>106457489
>I swear bro 4 steps is more than enough
>>106457489
Patience is a virtue. My WAN2.2 gens take 2 hours to complete (80 steps, no optimizations or lightx2v lora). I've got no problem waiting for quality.
>>106457471
>>106457478
>>106457489
I said it did deliver, and I am happy with what it can do, but it's not on a level where I would say it's perfect like everyone expected.
It still can't do anime well, still has issues with spatial directions, still has noisy slop quality.
>>106457512
what is this mental illness called?
>>106457529
>spend 2 hours baking a prompt
>get trolled by your lora adding in something completely different in the middle of it
>>106457530
>>106457530
>>106457530
>>106457530
>>106457529
yeah, it has many times in the past. thankfully the high noise (the motion) only takes about 15 mins to finish, so if something looks wrong in the high noise I can just cancel and not waste 2 hours. This was a huge problem in wan2.1 where you had to wait for the entire thing to finish
real bake
>>106457557
>>106457557
>>106457557
>>106457557
>>106457512
>Nothing like a good smoke after railing Meg
Based Herc
>>106457532
>AniStudio in the OP
18 star github repo.. is this a troll bake?
>>106457532
>>106457560
Begun, the bake wars has
>>106457563
a butthurt bake, it's funny though, I expected them to remove comfy's link out of spite.
>>106457510
I get annoyed that the current workflow I'm using takes 4 mins a gen, would rather get it down to 2 mins. Then again mostly I just want fast slop to fap to or post, same as I did for image gens.
>>106457578
>butthurt
the second bake looks like the butthurt one to me desu
>>106457588
why don't you just fly to julien and suck his cock?
>>106457597
concession accepted
>>106456029
prompt?