Discussion of Free and Open Source Text-to-Image/Video Models and UI

Prev: >>106457557

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassic
Chromaforge: https://github.com/maybleMyers/chromaforge
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://tensor.art
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/sd-scripts/tree/sd3
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts
https://github.com/tdrussell/diffusion-pipe

>WanX
https://rentry.org/wan22ldgguide
https://github.com/Wan-Video
https://alidocs.dingtalk.com/i/nodes/EpGBa2Lm8aZxe5myC99MelA2WgN7R35y

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
Training: https://rentry.org/mvu52t46

>Illustrious
1girl and Beyond: https://rentry.org/comfyui_guide_1girl
Tag Explorer: https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Samplers: https://stable-diffusion-art.com/samplers/
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbours
https://rentry.org/ldg-lazy-getting-started-guide#rentry-from-other-boards
>>>/aco/csdg
>>>/b/degen
>>>/b/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
Cake.
>>106464276you forgot to read anistudio
Cursed thread of hatred and animosity
Blessed thread of frenship
>50 steps euler/simpleYeah radiance needs to stay in the oven
>>106464308TO VALHALLA
>>106464345
I'm curious about the final result, but to me the entire concept seems backward. A big reason we use a VAE is to cut computational cost; the very first models didn't use a VAE and were a pain to train and gen with.
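To put a rough number on the cost argument above, here is a minimal sketch of how much smaller the latent is than the pixel image. It assumes an SD-style VAE with 8x spatial downsampling and 4 latent channels; those defaults are assumptions for illustration (other model families use different channel counts), and the function names are made up for this example.

```python
def latent_shape(width, height, downsample=8, latent_channels=4):
    """(channels, height, width) of the latent, assuming an SD-style VAE."""
    return (latent_channels, height // downsample, width // downsample)


def compression_ratio(width, height, downsample=8, latent_channels=4, image_channels=3):
    """How many pixel-space values there are per latent value."""
    c, h, w = latent_shape(width, height, downsample, latent_channels)
    return (image_channels * width * height) / (c * h * w)


print(latent_shape(512, 512))       # (4, 64, 64)
print(compression_ratio(512, 512))  # 48.0
```

Under those assumptions the denoiser works on 48x fewer values than a pixel-space model at the same resolution, which is the saving a VAE-less approach has to make up some other way.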
so chroma.... what happened there???
Can someone share a good workflow for upscaling images so that they still look sharp?
What I've tried so far looked like shit.
Is it hard to set up a local model? I've tried to watch a few guides and it seems quite overwhelming.
And would a 9070xt be good for generating videos/images?
>>106464424
at least yours is INSIDE your case lmao. This is what i get for trying to upgrade a prebuilt. Card don't fit. psu don't work. bought a second replacement psu but it doesn't have enough PCIE plugs so i have to keep using the original psu as well.
but it works. so
>>106464510
>AMD
it's 10x harder than the plug n' play Nvidia. If you thought it was hard after watching videos it's not going to be a good time
>>106464404too much the diaper
milk
>>106464643Are you pretending that you can't prompt with 12gb vram ? You must be some /saasdg/ fag who has never user local.
>>106464404
A bit of a disappointment. I had high hopes for a model with a large character database like Noobai, with the advantage of natural language and the variety of styles.
Unfortunately, it only recognizes a few more characters than Flux or Qwen, and the advantages end there.
Since Qwen arrived, I can easily create a scene, add a background, and add characters described just with a prompt. Then I edit the characters with Noobai InPaint. (pic related)
If they don't do something, Chroma will soon be forgotten.
>>106464736It's interesting that qwen has almost no traction on civit compared to chroma.
>>106464510
If you can follow instructions on a github repo, installation shouldn't be a problem. Coming from an AMD user, you should avoid AMD if possible. It's gonna be more of a headache to set up than an Nvidia card. If you already have the card though, and are on Windows, I've had the best luck with this repo: https://github.com/patientx/ComfyUI-Zluda
>>106464749Probably because qwen is like 20gb and vramlets can't use it.
>>106464345
skill issue.
you're using either the wrong settings or you are a promptlet. try euler simple 24 steps
>>106464797I use Q4 with a 12GB card. The loss of quality is absolutely negligible and the speed is comparable to Chroma.
>>106464811retard
>>106464711vramlet cope
>>106464812
>the speed is comparable to Chroma.
I haven't used chroma but if that's true then it's slow as all fuck. takes several minutes for one image on qwen q4 on my 3060 12gb.
>>106464819damn you really got me there didn't you.pathetic. hate on a model because it's bad, not because you are incapable of using it.
>>106464835It's the same prompt I use on normal chroma and it is perfectly fine there. And less steps won't make it better.
qwen has nunchaku so it's vramlet friendly. chroma doesn't.
>>106464859Still doesn't have lora support.
im not fond of the model but for more nuanced reasons than you
>>106464749
i can't figure out how to train it. the aitoolkit method is like 2bit or some nonsense and kohya's musubi seems way too complicated. i have 24gb (bare minimum btw).
but the reason qwen, hidream, etc get no traction on civit is because nobody can run/train them. sdxl is cheap and easy to use so it has tons of content. even flux, which was hailed as the savior of local and better than dalle3, barely received any resources in comparison to XL.
>>106464842post the prompt. all my usual chroma prompts work fine on 20~30 steps. not even normal chroma uses 50 steps so you are going full retard here.
I am trying to get an anime character to take off their wristwatch in wan2.2, but it is outright not playing ball.
Did the chinese developers just not include that in the model's training? I would have thought it would be fairly simple since it can do clothes stripping well enough
>>106464749what are you talking about? qwen has far more lora than chroma on civitai.
>>106464918i don't get why people don't realize you can rent a 40GB A40 for $0.40/hour and train a lora for like 1usd cost
sars
>>106464812>The loss of quality is absolutely negligiblelol
>>106464301that level of titty squish up needs to be mandatory dress for all hot women.
>>106464964
It is tho. The only loss is with the text capabilities. Image is barely touched.
>>106464964Oh, go on, prove me wrong.
>>106464941its not a 1click setup. majority of lora trainers just use civitai to train loras, or the free-tier google colab (emphasis on the free part). even $1 is a barrier of entry too high.
>>106464977>>106464984you people clearly have low standards and/or are blind so no point showing examples
>>106464999So all bark no bite, eh? Fuck outta here.
quality loss only kicks in at the level below whatever quant i can run
>>106465006Q4 looks just like Q8 looks just like FP16 looks just like FP32 I swear guys I swear!
>comparing Qs to FPsretard alert
>generates plastic fluxslop at Q2saaaarssss where cellphone camera lora for fix output
>>106465015>got blown the fuck out so badly she just started pissing and shitting herself all over the thread
>>106465030>thinks quantization is magic
>>106465033indians have a tendency to fall for cheap magic tricks
>>106465033Post several comparisons across different quants. Assuming you can even run any of them, attention whore.
>>106465049you must be a newfag if haven't seen the endless quant comparisons already
>>106464276Nano banana is insane for genning subject LoRAs without a lot of pics. There's nothing that comes close. When will local catch up?
>>106465020>implying fp8 looks better than Q quants > Q4
why did they call it nano banana though
>>106465059qwen edit is the only contender really
>>106465057threads recently have been overrun with saars trying to hop on the video bandwagon. they don't like to accept the fact that trying to fit a 20b param model into 12gb requires corners to be cut.
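For a sense of scale on the "20b params into 12gb" point, here is a back-of-the-envelope sketch of weight size per format. It assumes a flat 20B parameter count and uniform bits per weight; the Q8_0 and Q4_0 figures follow from the GGUF block layout (32 weights plus one fp16 scale per block). Real GGUF files mix quant types per tensor, and this ignores the text encoder, VAE and activations, so treat the numbers as rough.

```python
# Approximate bits stored per weight for each format.
BITS_PER_WEIGHT = {
    "fp16": 16.0,
    "fp8": 8.0,
    "Q8_0": 8.5,  # 32 int8 weights + one fp16 scale = 34 bytes per 32 weights
    "Q4_0": 4.5,  # 32 4-bit weights + one fp16 scale = 18 bytes per 32 weights
}


def weights_gb(params_billion, fmt):
    """Approximate size of the weights alone, in GB (1e9 bytes)."""
    return params_billion * BITS_PER_WEIGHT[fmt] / 8


for fmt in BITS_PER_WEIGHT:
    print(f"{fmt}: {weights_gb(20, fmt):.2f} GB")
```

Under these assumptions only the 4-bit row lands below 12 GB, and only barely, before anything else is loaded.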
>>106465069most engineers at Google have nano penis. Asian genetics aren't very kind
>>106465015>Q8 looks just like FP16To be fair, it's really, really close.
>>106465057Still see no comparisons from you
>>106465069because it is under 1B parameters
Testing out my fashion wildcards on Qwen with different other models right now.
Which one of these do you guys like more?
Mostly in terms of face.
Prompt adherence is pretty damn good on Qwen, though, gotta admit.
Here's the Yume Kawaii prompt I used:
>A highly detailed photograph of a young japanese woman. She sports an oversized white t-shirt dress with a magical girl transformation sequence print in pastel gradients, worn over a lavender tulle petticoat that peeks out below the hem. Her feet are wrapped in platform boots covered in pearl white holographic material with chunky 4-inch soles and lace-up fronts adorned with star-shaped charms. A sleeping mask accessory rests on her head like a headband - pale pink satin with gold embroidery reading "Sweet Dreams" and dangling pearl chains. Her hair is a long pastel lavender wig with bangs, styled in loose waves and decorated with tiny LED star clips that twinkle softly. Her face showcases pale blue circle lenses, white eyeliner drawn in star patterns at the outer corners, cotton candy pink blush applied generously under her eyes, and glitter tears made from iridescent gems.
Can we fast forward to when NetaLumina or one of its derivatives is good enough to replace Noob
>>106465072Close, but no dice. Photorealism just isn't there yet.
how do we fast forward to a timeline that doesn't exist?
>>106465125right looks better. eyes on the left look digitally fake
>>106465059>When will local catch up?Comparing local against a huge new model that can't run on consumer hardware even if it wasn't proprietary, are you retarded or just a saas shill ?
>>106465125neither are showing pantsu so both are bad
>>106465132Quite honestly i'm already using it a lot. Very fun to play with because of the insane prompt compliance.
>>106465184Not enough artist knowledge desu, and I'd rather not do a second pass with Noob. Not saying you can't get pleasing results, just seems like it needs a bit more time in the oven.
>>106465173Most of their "huge" parameters are only being put to use for the prompt following. For simple ID copy/style, we do not need so many parameters. Qwen Edit, Kontext Pro/Max should tell you that (though it's API only, but Dev is same size). Local is behind, but that far behind. We just need better models.
>>106465237>but that far behind.but not*
>>106464749Ehh? I barely see chroma LoRAs. Qwen is getting a lot more. Still flux and illustrious seems to be the most though. Did you just look up "Chroma" and saw there was more? lol
>>106465223nta prompt adherence is the most important thing to me personally, I am very much bored of noobs/illustrious compositions so completely on the Neta train.
>>106464736You have no idea what you're talking about. Chroma (the model base) is still more powerful than any other model we've seen so far. An additional finetune would help it learn styles/characters, but the caveat is that it requires compute that even small companies do not have. Neta Lumina on the other hand is quite good with anime/styles and it should be a breeze for anime tunes. Chroma is still SOTA for local photorealism NSFW.
>>106465287(me)Also Qwen for more interesting prompts Neta can't do. I was hoping to use chroma but man is it bad at anime like no idea why photoreal is fine but as soon as the word "anime" is anywhere in the prompt everything about the model takes a giant nose dive.
>>106465237
Shit you just made up
Qwen Edit, which is far behind Nano, is still a pain to run on local due to its resource demands. There's no magic that will make the gap disappear between SOTA saas running on extremely expensive hardware versus local models to be used on consumer hardware
But local is still better overall, because you can finetune those models to be much better than SAAS models for specific concepts, and even concepts that SAAS models will never allow due to NSFW censorship
But for pure technical prowess, SOTA saas would have to be totally incompetent for local consumer models to compete
>>106465335
>Qwen Edit still a pain to run on local due to its resource demands
Works easily and fast on a 500$ 3090 at Q8, which is basically fp16 quality
>>106465335Perceived nano superiority is due to a slight better adherence given any scenario. Again, local can catch up given a decent unfiltered/non-distilled model.
>>106465335
Nano is shit. I can't say I got even one good result out of it. Colorize, it misses half the image, "show me character behind" the damn thing gives some weird combo of front of the character clothes that makes no sense, "complete the rest of the image" gives me either a dwarf cause you can't change the aspect ratio of the image or just some nonsensical design. I genuinely believe this is some sort of weird-ass gaslighting from google. There have been multiple comparisons on reddit too and almost all of them show how much nano fucks up the image. The censorship is just a cherry on top.
>>106465287For sure, and to be clear it can do SOME artists but I'm tired of models that are technically good with the caveat of needing a lora for styles that other models can do OOTB.
>>106465059sure qwen is behind nano banana, but it's not that far behind 4o, which was considered to be an impossibly massive gigasaas saas monolithic 300b model. i wouldn't discredit local so soon, especially since it's roundhouse kicking BFL's faggot API license shit right in the face. plus qwen have hinted that they're already working on a v2
>>106465170Alright, thought so as well. Eye colors are really weird on base Qwen. Tends to lean hard into extreme colors.>>106465176Now that's just lewd.
>>106465388This Wicked City, Ninja Scroll anime style still holds up
>>106465343Define “fast”. Everything seems to be getting progressively slower than sd1.5, which is “fast” in my mind
>>106465445yeah it's timeless
>>106465395>or styles that other models can do OOTBThose models are finetunes specifically trained on art styles though, not really comparable to a base model which needs to know some of everything
>>106465471I was thinking Cyber City Oedo 808 has this style as well but I'm not sure.
>>106465472Which is why it needs more time in the oven
>>106465457
less than a minute for an edit that would have taken much longer if done with inpainting, photoshop, regenning or anything similar
>Everything seems to be getting progressively slower than sd1.5
hardware didn't improve much compared to the models, no shit toy models from 3 years ago will be nothing to modern day models that are orders of magnitude better
>>106465457because hardware hasn't kept up. 4090->5090 isn't as much of an improvement as 3090->4090. the vram gain is worthless because they didn't want to cut into their (already dated) a100s which are still selling for 5 figures. pray for chinese asic or something idk
>>106465343
where are you finding 3090's for $500
I just searched ebay and the lowest price is $750~$800
Is there anywhere I can download a decent dataset of a few thousand (at minimum) random photos? I want to train a realism lora, a good one, so I need thousands of images.
I've tried searching Huggingface. It's a disaster. Anything sourced from Pexels is unusable. The images are majority "artsy" type stuff, bokeh out the ass, just a weird slopped look. There is a dataset that claims it is 120M images from Flickr, but it is actually 3000 unique images duplicated millions of times each (yes, really, 500 downloads per month btw). There are various Flickr30k, Flickr8k, etc datasets, but they are resized to low resolution with no original URLs.
I'd really want to avoid resorting to scraping some website myself, but it looks like that might be the only option.
>>106465502there's quite many, like: private eye goku, vampire hunter d etc. they had pretty high budgets back then
>>106465543why would anyone collect a great dataset for no reason? the dataset collection is the bottleneck, not the lora training, fire up that yt-dlp and gallery-dl
>>106465395
If you are curious here is a list someone did of the artists Neta can do. I don't think they went through the danbooru ones though, but yeah overall it still needs more cooking time.
https://neta-lumina-style.tz03.xyz/
>>106465567
>why would anyone collect a great dataset for no reason?
To share with the community so everyone can improve their models? That's the whole point of HF. There are plenty of image datasets there, they are just complete ass for various reasons. Either Pexels slop, or downsampled Flickr, or "whoops I fucked up and duplicated each image tens of thousands of times teehee sorry"
>>106465621>To share with the community so everyone can improve their models?then you wouldnt need to train a lora on those images
>>106465184What does a neta prompt look like? Last time I looked through their user guide, their prompts were the most convoluted shit I have ever seen since Pony and the score_7, score_8, score_9 days.Did you train any loras for it?
>>106465610based should be in OP and replace the khroma section
>>106465610some of these are good but a lot are scarily bad, especially how it quickly replaces basic animals with humans
>>106465471>>106465388>>106465113>>106464736These belong to Anime Diffusion Thread.
>>106465723Cope and seethe
>>106465132>>106465184oh shi- forgot about Neta lumina. Too busy with wan 2.2 and qwen, but i'll try it again.
>>106465738>tranimetard is retardedbasically every time
>threadshitting faggot is a touristbasically every time
If I have a dataset with hundreds of pics, can I get away with training less steps/epochs?
>>106465757>tranimetroony so mad and castrated by his hrt hes to scared to even reply>but still has to reply in the thread to lash out for attention like a child he isuh oh, not again tranimesisters, maybe... maybe if we say "tranime website!!!" on cue now we can recover?? lmao
>>106465765no. the more images you have, the more steps you should train for with a lower learning rate
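As a rough illustration of why more images means more steps at a fixed epoch count, here is a minimal sketch of the usual step arithmetic. The function name and the `repeats` and `batch_size` parameters are assumptions for illustration and are not tied to any particular trainer; the learning-rate half of the advice above is a rule of thumb that this sketch does not model.

```python
import math


def total_steps(num_images, epochs, batch_size=1, repeats=1):
    """Optimizer steps for a run: one step per batch, every epoch."""
    steps_per_epoch = math.ceil(num_images * repeats / batch_size)
    return steps_per_epoch * epochs


print(total_steps(30, 10))                 # 300
print(total_steps(300, 10))                # 3000
print(total_steps(300, 10, batch_size=4))  # 750
```

So cutting epochs on a big dataset still leaves a lot of total steps; what shrinks is how many times each individual image is seen.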
what caused the absolute seething meltdown demonstrated above?
>if i call the guy btfoing me mad i dont have to engage with the argument>no, i am NOT a child btwtranimetards really are embarrassing...
BIG stinky
>solves the fluxplastic problem>vanisheswhat was their endgame?
>>106465539I don't know why but people keep repeating this bullshit. It's just straight up false. Maybe you can find one on craigslist for $500 and go that route if you're a giga retard.
>>106465539>>106465849huh?
>>106465543
Flickr urls are standard format with no secret up to 1024
https://www.flickr.com/services/api/misc.urls.html
If they have the original ID you can grab them easily
Other than that scraping flickr is easy, you don't need to register for their api, just grab the key from the site and use the site's endpoint instead. Go for the Explore page on a few dates, should be enough for your needs and inherently high quality because they're featured
t. scraped 5b from flickr (no you can't have it)
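A minimal sketch of the URL scheme described on that Flickr docs page, as I read it: the static image URL is built from the server id, photo id and the photo's regular secret, plus a size suffix, and sizes up to 1024px on the longest edge (`b`) do not need the separate original secret. The ids in the example are made up, and the function name is hypothetical.

```python
# Size suffixes up to 1024px on the longest edge; larger sizes use a different secret.
PUBLIC_SUFFIXES = {"s", "q", "t", "m", "n", "w", "z", "c", "b"}


def flickr_photo_url(server_id, photo_id, secret, size="b"):
    """Build a static Flickr image URL for one of the sizes up to 1024px."""
    if size not in PUBLIC_SUFFIXES:
        raise ValueError(f"size {size!r} is not one of the sizes up to 1024px")
    return f"https://live.staticflickr.com/{server_id}/{photo_id}_{secret}_{size}.jpg"


# Hypothetical ids, for illustration only.
print(flickr_photo_url("65535", "1234567890", "abcdef1234"))
```

This only rebuilds URLs from ids a dataset already carries; it does not touch the API or the site endpoint mentioned above.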
>>106465871>poundsRETARDETARD
>>106465871
anon said $, as in USD
£500 is $669.38
>>106465889fair enough, i didn't realize how weak the dollar is recently so yeah you're right they're about 700 usd
>>106465871oi you got a loicense to post here?
>>106465849browse other sites, more local sites
>>106465909>oi you got a loicense to post here?yes but probably not for long. enjoying my final days before the govt bans 4chan
>>106465960isn't kiwifarms and 4chan suing them? lmao
>>106465879Ok thank you anon. I also was finally able to find LAION highres subset on HF, which appears to be decent. Currently scraping a subset of those URLs right now.
There was someone in the previous thread that asked why corporations aren't doing what radiance is doing. The answer is that radiance is based off of a paper that is about a month old.
https://arxiv.org/abs/2507.23268
>>106466134
https://huggingface.co/datasets/madebyollin/megalith-10m
You could try this but it's cc0 trash so probably low quality aesthetically
>>106466144Yes.But if the furryfag really want to go VAE-less, then he should also go with the route of sub-quadratic complexity attention
>>106466144
I don't know about the approach radiance is taking but it has been explored before a while back.
https://github.com/ShoufaChen/PixelFlow
My return is on the horizon Just wait for me to get back to speed and master chroma
>>106466430big if true
>>106466226>sub-quadratic complexity attentionIs this a speedup?
>>106466439It will be, I apologize in advance the no nothing hamsters that shit up the general and can't be happy with containment will go nuclear
Any way to gen a girl causing an energy explosion out of her body?
>>106466545try with a regular prompt but in my exp it will probably need to be rewritten to be more detailed by some ai and then try that
How does inpainting on comfyui work? The official comfyui workflow doesn't work as well as the A1111/reforge method.
>>106466582>part 2 - 1 do you have the other parts?
>>106466631
>>106466631>>106466642
>>106466582>How does inpainting on comfyui work?It doesn't :^)
imagine getting filtered by more knobs and buttons
Can I assume that 95% here generate nsfw but are over the post only phase and are now larping as photographers prettifying their photos of the year?
>>106466679You have never made anything
>>106466679Inpainting in comfy sucks tho
>>106466710only for subhuman mouth breathers that are probably using windows or apple anyway not like their opinion matters
>106466729
stop damage controlling for comfy lil bro. Anyone who has ever used any graphics program can tell it sucks. Shitty mask editor with clunky UI, no filters or other tools. Nothing.
>>106466578
I can do Jubilee in a hoodie, easy
Bald Jubilee after the explosion and the little outfit she is given
It's the explosion itself that's the hard gen, glowing skin, floating, clothes disintegrating, center of energy or energy ball in center, flowing hair, glowing eyes, glitter, fireworks, abstract background, bright background,
>>106466755>too pussy to actually reply concession humbly accepted on behalf of comfychad and his asian QT gf
>>106466710I agree it was pretty bad before but get this node in impact pack, it gets you pretty much 90% of a1111 functionality with it. Only thing missing is using an upscaler model for the upres before inpaint.
>>106466784
He'd rather seethe than learn something new.
>people now optimize everything around light loras so that using regular 20 step settings, normal cfg etc actually produces worse results with a lot of loras now
It is what it is, light is worth it, but it's grim that some movement quality for a lot of loras is basically better in 2.1 until better loras are published
>>106466784Or I can just use Krita and have a normal human gui.
>>106466784This node worked wonderfully. Thanks for the tip.
>>106466835I only use light loras for posting memes/garbage for other anons. I'd never use it for my own content. It's actually not that bad for anime since anime doesn't tend to need high motion fidelity anyway.
>>106460743If you're still around can you upload a PNG to catbox? Jpeg doesn't preserve metadata
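For the PNG side of this, here is a minimal stdlib-only sketch that lists the text chunks in a PNG, which is where ComfyUI writes its `prompt` and `workflow` JSON. The function names are made up for this example, it only handles uncompressed `tEXt` chunks, and the demo bytes are a hand-built chunk sequence with no image data, just enough to exercise the parser.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_text_chunks(data: bytes) -> dict:
    """Return {keyword: text} for every uncompressed tEXt chunk in a PNG byte string."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    found = {}
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if chunk_type == b"tEXt":
            keyword, _, text = body.partition(b"\x00")
            found[keyword.decode("latin-1")] = text.decode("latin-1")
        elif chunk_type == b"IEND":
            break
        pos += 12 + length  # 4 length + 4 type + body + 4 CRC
    return found


def make_chunk(chunk_type: bytes, body: bytes) -> bytes:
    """Serialize one PNG chunk (used only to build the demo bytes below)."""
    crc = zlib.crc32(chunk_type + body)
    return struct.pack(">I", len(body)) + chunk_type + body + struct.pack(">I", crc)


# Hand-built demo: header, one text chunk, end marker. Not a viewable image.
demo_png = (
    PNG_SIGNATURE
    + make_chunk(b"IHDR", struct.pack(">IIBBBBB", 1, 1, 8, 2, 0, 0, 0))
    + make_chunk(b"tEXt", b"workflow\x00" + b'{"nodes": []}')
    + make_chunk(b"IEND", b"")
)
print(read_png_text_chunks(demo_png))
```

On a real file you would pass `open(path, "rb").read()`; if the result has no `prompt` or `workflow` key, the host or a re-encode most likely stripped it.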
>https://www.illustrious-xl.ai/sponsorThese guys are sitting on trained models and just let the hype die. Big brain.
>>106467247Are they at least trying to graft an llm onto those models or is it all clip?
nice collage
>>106465502>>106465445>>106465557My niggas have taste>>106465723Nah, that thread is for straight up pedos, above can stay>>106465871Buy 2 for training
>>106467247By the time those release we'll have a model more widely used than SDXL, making it completely pointless.
>>106467247I am thankful we got what we got. Praying for a leak. Expecting nothing.
>>106467347i kneel
comfy should be dragged out on the street and shot
so, anything new in the world of image/video genning?
>>106467347Composition on this is great, really deserves an inpaint pass on the faces and hands to add that extra detail
does qwen nunchaku work with loras?
>>106467275
No clue. Newer versions have way better compatibility with natural language.
>>106467328
>By the time those release we'll have a model more widely used than SDXL, making it completely pointless.
Idk man I think we might be stuck with current models for a while, which would suck
>>106467338
>I am thankful we got what we got
Yeah me too, I even put $50 for their stardust, but I guess it doesn't matter
>>106467394This is inspired by Henry Fuseli and quite honestly his own faces and hands/legs are quite messy too.
>>106467433>Yeah me too, I even put $50 for their stardust, but I guess it doesn't matter$50 out of the $300,000 they need.
How much does Qwen edit suffer from quanting?
>>106467459I don't think quanting hurts it as much as the speed LoRAs do. Up to Q4 should be alright
>>106467459>>106467501Q8 or bust for image edit models if you want to keep quality over multiple passes
>>106467516Assuming you properly mask your edits it shouldn't matter. Multiple vae passes probably hurt the image way more than the quants would.
>>106467516
Oh the Q8 is only ~20GB. I thought it was gonna be far worse. Is the template workflow in comfy fine?
>>106467437Neat. Did you use a cnet or i2i as well?
>>106467555Nope. Chroma is just a good model for these styles and all sorts of weird compositions.
>ComfyUI: v0.3.50
is it safe to update to 0.3.54? any noticeable regressions?
>>106467437This is garbage. I get it. The renaissance era was full of people finally figuring out how to paint, but larping as a classical fag as if you have this deep connection to the art world, especially when using AI, makes you look like a twat.
>>106467584Memory is still fucked other than that nothing really special one way or another.
>>106467584
0.3.54? we're on 0.3.56 now. to answer your question, it's perfectly fine.
>>106467582how did you upscale it? looks nice
>>106467692I just natively prompted it in this resolution. Chroma can easily stretch to nearly 2k without losing coherency.
>>106467692
Chroma can rawdog 1080p
I like krea.
>>106467437Yeah, I looked at his art and although he paints great details in many pieces, he also draws many with undefined features much like in this image
>>106467433>man I think we might be stuck with current models for a while, which would suckIf you’re just doing big titty 1girls, how would it suck? The newer bigger models are just slower and don’t particularly make anime even in different styles better than noob/illus models. I can’t imagine how a “better” model would actually improve the output much.
>>106467753> I can’t imagine how a “better” model would actually improve the output much.prompt adherence + better handling of multiple subjects. If all you care about is 1girl standing in a basic pose, then yeah noob/illus will be fine for ages.
>>106467542
https://www.reddit.com/r/StableDiffusion/comments/1myr9al/use_a_multiple_of_112_to_get_rid_of_the_zoom/
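Going only by the title of that linked post (use a multiple of 112 to get rid of the zoom), a minimal sketch of snapping a target resolution to the nearest multiple of 112 could look like the following. The function names are made up, and nearest-multiple rounding is an assumption; the post itself may recommend specific sizes.

```python
def snap_to_multiple(value, multiple=112):
    """Round to the nearest multiple, never going below one multiple."""
    return max(multiple, round(value / multiple) * multiple)


def snap_resolution(width, height, multiple=112):
    """Snap both sides of a resolution to the nearest multiple."""
    return snap_to_multiple(width, multiple), snap_to_multiple(height, multiple)


print(snap_resolution(1024, 1024))  # (1008, 1008)
print(snap_resolution(1328, 1328))  # (1344, 1344)
```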
>>106467768learn to regional prompt and inpaint
>>106467845Already know them. This isn't about me.
>>106467845learn that that's no replacement for a model that can do multiple subjects without bleeding
>>106467902and current day ai isnt even safe from vae loss let alone autoregressive let alone a replacement for brain to computer interface reading your thoughts and putting them into pixels and yet you can create 99.99% of everything that you want all the same
>>106467407my niggas...
>>106467635
I updated a day or two ago. It was fine.
GM
>>106467845>adding a turbo to VW golf will make it a porsche !!1!
>>106468175post an example change thats basically impossible with current models with inpainting, regional prompting and controlnet tools
>>106468131he seems like he smokes a LOT of cigs lately ;c>>106465871>foundersthat fat brick wont even fit in my pc case (hotbox)
the joke is that modern models are better at it without needing to mindbreak them with a dozen external tools
Since there are some nice chroma users in this thread and I want to try Chroma on my vramlet setup:
How well does Chroma run on 8gb vram + 16gb system ram?
Which version and quant should I use?
Should I just go for the smallest possible quant, regardless of the newest version?
I'm guessing this is the right place to find it: https://huggingface.co/silveroxides/Chroma-GGUF/tree/main
There are so many versions. Are there big differences between them? I can see they are frequently updated.
>>106468195>this job needs to be done with an all in one tool instead of using whatever is best for the particular task because... it just does, ok!?
>>106468228Step 1: buy more ram
>>106468228Order more ram while you wait for https://github.com/nunchaku-tech/nunchaku/issues/431
>>106468131>Hercules going through his teen angst emo phaseProbably posted nudes of himself on /b/
>>106468228
those are old versions. here are the "final" versions:
https://huggingface.co/QuantStack/Chroma1-Base-GGUF/tree/main
https://huggingface.co/QuantStack/Chroma1-HD-GGUF/tree/main
>8gb vram
oof
>>106468228
No idea, but you should be using https://huggingface.co/rocca/chroma-nunchaku-test
At least until the official nunchaku Chroma drops
Alternatively you can try 4bit GGUF of Chroma Flash HD (you'd use the recommended 8 heun steps for that model)
>>106468235Yes? I mean by that logic might as well go back to the good old days of 1.5 and spend 20 hours inpainting one shitty pic lol
>>106468285>ad absurdumnpcs are incapable of logical thought
>>106468285>why can't i just attach a saw blade to my drill instead of HAVING to use a circular saw??? THIS TECH IS TRASH
>>106468193catbox?
>>106468298my boy Spyro grew up
>>106468314you think he puts his benis into her???
>>106468296luddite are incapable of looking ahead so I guess we are at a stalemate lol. Hey moron inpainting and region isn't magically going away cause these models are getting better at prompt adhesion you can still jerk yourself using them just that base image will be made faster and will need less edits.
>>106468228
You could *probably* train Chroma in OneTrainer using nfloat_4 for both model and text, but I don't know if the quality drop would be worth it. That said, with 8gb vram you will have to accept a quality drop.
If you use OneTrainer, you don't download any special quantization model, you select quantization (in your case nfloat_4) in the trainer (model section) and it will do it 'on the fly' on the full model.
OneTrainer has an 8gb vram Chroma preset, but I doubt it can run with just 16gb system ram.
>>106468334i never said the models shouldnt get better nor that they wont, even to a point one day where most things will be automated so it might as well be all "1 thing" or even mostly just 1 ai modeli argued against the cope that with all the modern tools even today like regional prompting and good inpainting uis you cant do things like multiple subjects without bleedingand given that the retard cope continued without being able to actually post a single example where this is impossible >>106468185i accept your concession, and dont bother coping further without posting proof, luddite retard
>>106468264>>106468265What's the difference between the base, the hd and the nunchaku versions?I want an allround model that's as versatile as possible with the most context awareness possible.
Alright, why does Comfy load the face detailer model once the workflow starts even if that's only used once the first inference is done?Why doesn't it just load it sequentially?
>>106468399>Nunchaku is a high-performance inference engine optimized for 4-bit neural networks, as introduced in our paper SVDQuant. best vramlet cope tier for fags like you with no vram, wait for this to support chroma, then>between the base, the hd use hd
>>106468399The Nunchaku one is an outdated hack and not really for general use. Since you're vram limited I would suggest trying lower resolutions (768 and below) with Base. HD loves blurring outputs but you can try it too if you want.
>>106468399 base is 48 epochs trained at 512 resolution.
hd is those 48 epochs + a lot of epochs trained at 640, 768, 1024, 1152 resolutions as well.
nunchaku is a quantization method that yields overall best results compared to other methods, and it's fast since it's q4.
>>106468422 >>106468458 >>106468462 ok. thanks.
And then I need to use one of the text encoders, and this vae, right?
They're linked from the official lodestones repo: https://huggingface.co/comfyanonymous/flux_text_encoders/
and the vae from here, also linked from the official lodestones repo: https://huggingface.co/lodestones/Chroma/tree/main
>>106468458 Yes, good point, the 'nunchaku' version here is NOT a real nunchaku quantization.
That will happen once they're done with the Wan nunchaku.
>>106468470 Yes, you need the Flux vae and the t5xxl llm, typically they are named:
ae.safetensors
t5xxl_fp16.safetensors
There's a t5xxl_fp8_e4m3fn.safetensors you can use instead if you don't have enough vram.
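A quick way to check that the two files named above are where the UI will look for them, as a rough sketch. The folder names (`models/vae`, `models/text_encoders`, with `models/clip` as the older location) are assumptions about a default ComfyUI install, and `find_chroma_files` / `ready` are hypothetical helpers, not part of ComfyUI.

```python
from pathlib import Path

# Assumed default ComfyUI layout; adjust if your install maps model folders elsewhere.
CANDIDATE_DIRS = {
    "ae.safetensors": ["models/vae"],
    "t5xxl_fp16.safetensors": ["models/text_encoders", "models/clip"],
    "t5xxl_fp8_e4m3fn.safetensors": ["models/text_encoders", "models/clip"],
}

def find_chroma_files(comfy_root):
    """Return {filename: Path or None} for the Flux VAE and the T5 encoders."""
    root = Path(comfy_root)
    found = {}
    for name, dirs in CANDIDATE_DIRS.items():
        found[name] = None
        for d in dirs:
            candidate = root / d / name
            if candidate.is_file():
                found[name] = candidate
                break
    return found

def ready(found):
    """True if the VAE and at least one T5 encoder (fp16 or fp8) are present."""
    has_vae = found["ae.safetensors"] is not None
    has_t5 = any(found[n] is not None for n in found if n.startswith("t5xxl"))
    return has_vae and has_t5
```

Call `ready(find_chroma_files("/path/to/ComfyUI"))` with your own install path; only one of the two T5 files needs to exist.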
>>106468470T5 encoder is fine. You might have more luck with a fp8 version since it's smaller.
>>106468495>>106468507perfect. thanks a lot guys. that's exactly what I needed to hear.
>>106468511Np, let us know if something fucks up.
>>106468333In her butte
>>106468507>>106468470Do NOT use scaled text encoders.
>>106468550What happens if you do?What's the difference?
>>106468554It makes the model more retarded. Offload it to CPU in the loader node if you must.
>>106468565how much difference are we talking?
>>106468577yes
>>106468577 Also I can recommend the GNER fp16 version.
https://huggingface.co/wikeeyang/GNER-T5-xxl-encoder-only/tree/main
Avoid FLAN. There's some schizo myth circulating that you should use it, but it's so bad.
>>106468577there is no significant difference for qwen's TE at least
>>106466715Oh no!
>>106467162>Jpeg doesn't preserve metadatayes it does
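For what it's worth, the JPEG container does have places for metadata: EXIF lives in an APP1 segment and free text in a COM segment. Below is a minimal stdlib-only sketch that lists them; `jpeg_metadata_segments` is a hypothetical helper, it ignores edge cases such as padding bytes, and whether a given UI or upload site actually writes or keeps these segments is a separate question.

```python
import struct

def jpeg_metadata_segments(data: bytes):
    """Return a list of (kind, payload) for EXIF and comment segments."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG (missing SOI marker)")
    out = []
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            break  # unexpected byte where a marker should be; stop scanning
        marker = data[pos + 1]
        if marker in (0xDA, 0xD9):  # start of scan / end of image
            break
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        payload = data[pos + 4:pos + 2 + length]
        if marker == 0xE1 and payload.startswith(b"Exif\x00\x00"):
            out.append(("exif", payload[6:]))
        elif marker == 0xFE:
            out.append(("comment", payload))
        pos += 2 + length
    return out
```

Usage: `jpeg_metadata_segments(open("gen.jpeg", "rb").read())`, where `gen.jpeg` stands in for your own file. An empty list means the file carries neither kind of segment.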
>>106464276newbie here, so I managed to get the wan22 i2v workflow working, but how do you modify it to include a form of reference image?
>>106468663
https://github.com/kijai/ComfyUI-WanVideoWrapper/tree/main/example_workflows
https://github.com/kijai/ComfyUI-WanVideoWrapper/blob/main/example_workflows/wanvideo2_2_I2V_A14B_example_WIP.json
Just copy from the examples.
Is this where the hidden chroma gems truly are? Anyone tried any of these? If I can make even wonkier stuff, I'm in
>>106468577 It's minor, but there. Some people are anal about it though. Your ram is low which is why I suggested the fp8 version. If you want to try the fp16 go ahead though. There will be a node in ComfyUI that says "load CLIP". Make sure it's set to CPU in the device section, otherwise you might run out of memory.
>>106468599 Also true. GNER is worth experimenting with, but flan is just t5 but worse.
>>106468704Looks a bit like Adam Hughes style ?
>>106468735>but flan is just t5 but worseI thought flan was supposed to do better on text ?
>>106468734no. those are all models meant for testing. look at the dates. they are all old as shit
>>106468734These are mostly abortions.>>106468763Who tf uses chroma for text?
>>106468763It's possible. I don't ever really generate text though so I wouldn't know.
>>106468599>>106468645>>106468735thanks
>>106468779>Who tf uses chroma for text?Not me, but I'm sure there's some weirdo out there who does
>>106468399 Nunchaku gives roughly q8 quality (or q8 if you have a 5090) at the size of q4, but 4x faster than a regular q4. If you don't know what these quants mean: q8 is a GGUF quant that is almost lossless at half the size of the fp16 model (but not necessarily faster, if you can fit fp16 into memory). q4 is half the size of q8, which means a larger quality loss, but it is the furthest you can shrink before you start losing too much quality, and it is also not faster than fp16.
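The "half the size" arithmetic can be sketched in a few lines. This is idealized: real GGUF quants store a bit more than 8 or 4 bits per weight because of per-block scales, and the parameter count below is an assumed rough figure for Chroma, not something stated in the thread.

```python
def weights_size_gb(n_params, bits_per_weight):
    """Approximate size of the weights alone, in GB (10**9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9

N_PARAMS = 8.9e9  # assumption: rough Chroma parameter count, not from the thread

# Each step down halves the bits per weight, so it halves the file size.
for name, bits in [("fp16", 16), ("q8", 8), ("q4", 4)]:
    print(f"{name}: ~{weights_size_gb(N_PARAMS, bits):.1f} GB")
```

This only counts the diffusion model's weights; the text encoder, VAE and activations need memory on top of it.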
>>106468754It's a Jubilee lora but all I have prompted is Joel Jurion and Traditional medium
is this what people do at mardi gras?
someone at wan dataset team REALLY likes blue archiveto the point it pollutes all high res anime style gens
>>106468770>>106468779I will set sail for the seven epochs
>>106468845I see, nice style
>>106468898 That's a special model experiment that uses pixel space instead of a VAE. You need a special hacked version of Comfy to try it since it's still training.
I would wait until it's done unless you really want to try bleeding edge stuff. That said, it seems to be progressing quite nicely.
>>106468930The stupid furfag should leave bleeding edge research stuff to actual data scientists and just train the inevitable qwen model already and I'm tired of people doing the mental gymnastics into convincing themselves this is anything more than a waste of time.
>>106468930>you need a special hacked version of ComfySeriously? Well that's no fun...
>>106468948not bad this one
>>106468954>people doing the mental gymnastics into convincing themselves this is anything more than a waste of time.It is, but it is a waste of his time, and it's very likely he is an actual researcher. Why do you want him to finetune qwen, anyway? The person with the relevant nsfw photographic dataset is a bigasp guy, not lodestone.
>>106468957I've got the nude version of this BTW.https://files.catbox.moe/xtbq2u.jpeg
>>106468856looks accurate to me, only it's missing people dressed as crows
>mfw all the jumpscare ai videos coming in october
>>106468957The unaltered Tifa image is a benchmark for genning 3d pinups, I want to get a good look like that in Illustrious
>>106468954
>inevitable qwen model
Qwen is too slow and seemingly trains too poorly to be of any real interest. You'd be better off finetuning t2i Wan, but even that will be slow and EXPENSIVE.
I doubt it will happen, but we'll see.
>>106468948kek, based
>>106468987>>106469081Quit replying to bait you knuckleheads
what is this weird miku cult
>>106469131But I... yes, you are likely correct
>>106469132Why are you new?
>>106469149i dont care about miku. i dont get the appeal
>>106469081>trains too poorly ?
>>106468930what's pixel space
>>106469132It's not a cult.>>106469155>i dont care about miku. i dont get the appealI'm so fucking tired of this internalized unconscious transphobia from newfags.
>>106469182People threw their bs XL settings onto the model and decided it didn't work and ignored multiple people who have made great LoRAs for it already. So same as early flux days lol
>upload girl to nano banana>"change her shirt into a different shirt">it massively downsizes her tits from a C cup to an A cupwhy
>>106469218Do you feel safe? The girl you were editing was clearly very unsafe for your health. Good guy google.
>>106469218it takes your gen's breast milk as payment. they have a full data center for AI breast milk
>>106469218(and if you're curious on why this is in local diffusion general, it's because it's a reminder that corpo models, even when smart, are still retarded because the industry leaders will make it retarded before you get to use it)>>106469228lol. god forbid she walks outside with that unsafe amount of fully clothed breasts in regular daywear
>>106469182 Having done a few Qwen lora training tests, I find it hard to get good results when you try new concepts that are largely unfamiliar to the base model, including just different art styles, and I've been hearing the same from other people who also train. Most likely this is due to it being very overtrained.
Also, it is very slow to train, meaning few will experiment to find the potential 'best settings', but perhaps they will emerge.
>>106466582>>106466642>>106466642Hero of the thread
>>106469218>Using Google AI>Why is it censoring everything ?...
>>106469287still less censored than chyatgpt
>>106469287 Sure, but that's not an outright rejection. Nor is there obscenity in the prompt or the image. It's just wild to me the lengths these people will go to. And for what?
I don't know how else to put it. It's wild that a sfw image is censored into a sfw image, completely randomly.
>>106469258 It's kind of insane that you can generate this stuff on local.
Yes, it loses likeness as the video progresses, but still, it's impressive. Continued likeness could also be improved by training a lora of the show.
>>106469294lowest bar ever
>>106469155y-you don't like miku, anon?
>>106469132Asch effect- Normative social influence(fear of rejection in a social group)- Informational social influence(if everyone does the ritual, it must be right)think of it as an endless cycle of insecure newfags. they see it here and adopt it. Old fags leave, new ones come and so this continuity is created.
>>106469303yeah thats what im saying
>>106468654
>>106469258"holy shit it's sonic's wife"
>>106465845
>what was their endgame?
I am guessing they want to switch over because Lumina 2.0 came out and blew Lumina-Next out of the water, and probably to solve the remaining issues they ran into behind the scenes. It's not like they disappeared, but they have gone dark since late last year. Their public domain dataset is still up, so even if they are gone, we still have a good dataset from them for training a foundational diffusion model without getting slop.
https://huggingface.co/datasets/Spawning/PD12M
>>106469378>first she has to suck a penis>then she drops BB-8>then BB-8 runs away from herMan, it's been a rough first day of school for Jenny today.
>>106469378That didn't look like an accident...
>>106469469Thats based on a real chick?
new>>106469492>>106469492>>106469492>>106469492
>>106469477? she's just surprised it came to life>>106469487yeah
>>106469503LMAO hope she doesn’t see that stuff that menace with the oral insertion Lora was going around with. Is she an actress or something?
>>106469551no, she's a jewtuber. Jenny Nicholson
>>106469571>jewtuberOk I’m interested>Jenny Nichols is a video essayist…Boner killed. Gay.
>>106469630i just checked her page and she uploaded once in the past three years. this general is actually the premier source of Jenny content now lol
>>106469571>Jenny NicholsonThe likeness of your LoRA is very impressive, anon. I'm kinda into her now.... I might have to... I dunno... gen her without her clothes on.... maybe use the facialinsertion LoRA.... I dunno we'll see...
>>106469253 I also agree with this. I haven't had much luck training styles, although I'm noticing now that higher dim values than expected are producing better results.
>>106469551>Is she an actress or something?She's the cutest girl in the whole wide world!>>106469666>this general is actually the premier source of Jenny content now lolJenny's super-burned out atm (you can hear her almost wanting to cry at getting nothing done) and struggling to produce a "short" video about one of her favorite bad movies. No timeline for that, my guess would be around Halloween time, but she completely gave up on Barbie movie ranking video she was working on up until July. So even her current project isn't a sure thing.
>>106464677nice
>>106466582nubcake here, is inpainting in comf actually worse?