Discussion of Free and Open Source Text-to-Image/Video Models and UI

Prev: >>106727118

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/sd-scripts/tree/sd3
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts
https://github.com/tdrussell/diffusion-pipe

>WanX
https://comfyanonymous.github.io/ComfyUI_examples/wan22/
https://github.com/Wan-Video

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
Training: https://rentry.org/mvu52t46

>Neta Lumina
https://huggingface.co/neta-art/Neta-Lumina
https://civitai.com/models/1790792?modelVersionId=2203741
https://neta-lumina-style.tz03.xyz/

>Illustrious
1girl and Beyond: https://rentry.org/comfyui_guide_1girl
Tag Explorer: https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbours
>>>/aco/csdg
>>>/b/degen
>>>/b/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
hunyuan image 3.0 on comfyui (fast generation with comfy cloud) when?
>>106730007
blessed thread of friend-zoned frens ;3
>>106729994
yeah it's not good either, but like I said, I think it's arguably at least better resolved than the Hunyuan 3.0 image, which had grainy bullshit lines everywhere. And again, 2048x2048 instead of 1024x1024, with better text.
1000 more years of sdxl
>>106730007/r/ing the omelette woman getting chopped up or flipped with a spatula or something
how's the pose where the girl is doing the split with her ass up in the air?
>>106730053SDXL is all this general needs, all you guys post is 1girl slop so you don't need anything else.
wake me when 8gb can do good videos
>>106730079>pircelwhat's the source?
>>106730069spread legs, top-down bottom-up?
>>106730074Um, are you even looking at the OP image? That's clearly 1wizard,1naziloli,1gothlolita,1fairy,2fatmen,1omelettedufrumage,1negress. Retard.
>>106730088https://x.com/kirkinator_sol
>>106730079wan2.2 can do THIS?
by the way, anon whos making the omelette girls, whats your prompts? I wanna try em in various models.
>also itll get me closer to recreating that vid of the pancake girl getting syrup jizzed on
>>106730102no idea, it's a cryptobro account so they probably have a rack of H100s powered with fissile material
>>106730079
why is it SO long ;_;
my sides hahahah
>>106730074
>1girl
YUSS!
>>106730079and they say ai cant be kino
Is 1k imgs overkill for an XL style LoRA?
>>106730079I recognized Hulk Hogan... how terminally online do you have to be to recognize the rest? Like teenage girl tier? lol
>>106730094I guess yes, lol
>>106730138it's maybe overkill but it definitely won't hurt as long as the images are varied and good quality
>>106730172>>106730094"jack o pose"
>>106730145Teenage girls are on the streets protesting fascism now, not sitting on the internet like us
incredibly based
>>106730231i love him
>>106730231>>106730231I really like my speculative dev! By the way, what are his thoughts on SDXL? People will never rent GPUs in ComfyCloud if there's a lightweight model, it doesn't make sense to keep supporting it.
>>106730231Imagine your model being so bad that it's not even worth bankrolling it lol.
>>106730208cool crown
I'll post some more outputs of Hunyuan Image 3.0. They are all terribly underwhelming for 80B, but maybe I'm doing it wrong. The prompts are all very simple. They take 15-20 minutes each with an RTX 6000 and 192GB RAM. SDPA because flash attention isn't working properly.
Generated with the stock gradio UI and system prompt:
>You are an advanced AI text-to-image generation system. Given a detailed text prompt, your task is to create a high-quality, visually compelling image that accurately represents the described scene, characters, or objects. Pay careful attention to style, color, lighting, perspective, and any specific instructions provided.
Prompt for this one is
>Please provide a map of lower Manhattan
>>106730288
>Please provide an illustration showing the interior of an Island 3-type O'Neill cylinder. The perspective is from near the central axis looking outwards along the length of the volume. Cities, towns, farms, forest and other biomes cover the interior surface not taken up by the long window panels. The reflective mirrors are partially visible outside through the windows. The apparent time of day is the golden hour. The illustration style is a detailed gouache painting with high contrast.
Taking the LLM approach to image gen is fucking retarded and I said this during the SD3 days. The problem is in most cases they make these models inefficient to push you into the API ecosystem while providing nothing on par with modern day tools that ironically are built by the community. SD would have never taken off if it had been inaccessible to mid tier hardware on the consumer level.
>>106730288>not even a maplol
>>106730057>>106730079here's your 8gb video
>>106730293
oops wrong one
This is
>Please show me a silk-screen print of a space marine firing their weapon upwards in triumph. Only two inks are used. The print is on white paper.
>>106730293
>>106730138I've done 1000 image generalist detailer loras before, not so much style ones. That said it won't hurt if your images show a good variety of different things in the particular style you're going for, the more the better really.
>>106730288
no yeah it's not good, i dunno how it even got approved for release. they could have just iterated more on Hunyuan Image 2.1 and probably gotten something better
>>106730334They are doing the same short sighted shit that killed SD and in turn damaged flux because the retard devs took the same philosophy. I'm pretty sure BFL is underwater financially because there is zero reason to pay for their service.
>>106730319
>generalist detailer loras
whats the training / dataset look like for those, if you dont mind my asking? i used to use that detailer lora for 1.5 but never since, so im more curious than anything
>>106730301I presume this would require learning several complex functions which are beyond the basic knowledge of a humble swarmUI user such as myself...
>>106730345nope, just download the Pinokio app, it is basically plug and play.
>max queue size in comfy is 100
lol
>>106730365until it fails to download packages like huggingface and gets stuck in a download loop lmao
>>106729758
So this is the power of a 4o slop dataset... At this point I'm not surprised there's always a catch to a Chink model we get for free, and 80B really was too good to be true unfortunately.
>>106730374maybe max batch size to add to the queue, the queue itself doesnt have a limit
>>106730365I installed that months ago and got stuck somewhere along the process of getting it to work.But i'll try it again, thanks!
>>106730019Well I am just saying that based on what they claim. All I know is that when I do latent upscales with 0.6+ denoising at 4 megapixel with Illustrious, I get a lot less decoherence than I do with, say, Pony or another SDXL model. I've also done some experimenting with base 1.5 megapixel generations and it works fine.
>>106730231
he's not wrong, that's really a useless model
anyone try local music generation? quickly searching there is one called YuE and one called ACE-Step
>>106730343IDK, I think Kontext Max and Flux Pro Ultra (the one that does like 4K outputs by default) probably get decent use API wise by different third party inference providers
The frustrating bit about Hunyuan Image 3.0 is that the whole point of upping the parameter count is to capture more specifics about the world, pop culture, etc. Look how they post-trained it. It's just sad how benchmarks kill creativity.
>>106730376
just reload and be patient
the files are in the 10s of GB, it takes a few hours but it does work :p
>WELCOME 2 THA SALTY SPITO0N HOW AUTiSTiC AREE YA????
>>106730440
>The frustrating bit about Hunyuan Image 3.0 is that the whole point of upping the parameter count is to capture more specifics about the world, pop culture, etc. Look how they post-trained it. It's just sad how benchmarks kill creativity.
wait they used SRPO and it still looks like slop? it's over..
>>106730418
YuE is semi good, quality very comparable to early Suno. ACE Step instruments are best and it's the fastest, but it needs an update for proper voice, which can be hit or miss, but mostly miss (they are currently cooking version 1.5, which is to address this). Songbloom is the best for voice quality, but right now its variety sucks, it does not support instructions, and they changed the license to non commercial. None of the models are a match for Udio or ElevenLabs for music.
>>106730458thanks ill try ace, wanted to do some background music anyways
kek, qwen edit v2 (2509) can use openpose/depth as image references (for image2, etc).
replace the man in the blue police uniform with Miku Hatsune using the same pose as image2.
it didnt change the face but this might be even more hilarious, lmao
>>106730493
and the original openpose for image2, from an AIO aux preprocessor -> dwpreprocessor node:
>>106730498there. second iteration did it. Good job Miku.
>>106730493>>106730498>>106730504kek
>>106730345
>>106730383
btw for this video i used wan 5b and 15 steps. took a little more than 4 minutes for 1 second
What is happening when you give negative weight to positive prompt tokens?
>>106730529
>What is happening when you give negative weight to positive prompt tokens?
You go to hell before you die!
from this...
>>106730555
to this
replace the anime girl in image1 with the anime girl in image1 that is pointing, using the same pose as image2. the background is white.
>>106730563prompt seems redundant but I had double 2bs until I said replace.
>>106730563
>>106730555>>106730563Impressive. Very nice.
>>106730529i think it's the same as if they're positively weighted in the negative prompt?
>>106730583nop
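for anyone actually curious what a negative weight does mechanically: in A1111/Comfy-style `(token:weight)` syntax, the weight scales that token's embedding before it's handed to the model as conditioning, so a negative weight flips the embedding vector's direction. that's different from the negative prompt, which is a separate conditioning combined at sampling time via CFG. here's a deliberately simplified sketch (not the real ComfyUI code, which also supports renormalization modes) using plain lists as fake embeddings:

```python
# Simplified sketch of prompt-weighting: each token embedding is scaled
# by its (token:weight) value before being used as conditioning.
# Real implementations (A1111, ComfyUI) additionally renormalize or blend
# against an "empty" embedding depending on the weight-interpretation mode.

def apply_token_weights(embeddings, weights):
    """Scale each token embedding (a list of floats) by its prompt weight."""
    return [
        [value * weight for value in embedding]
        for embedding, weight in zip(embeddings, weights)
    ]

# Two fake 4-dim "token embeddings"
embs = [[1.0, -2.0, 0.5, 0.0], [0.25, 0.25, 0.25, 0.25]]

# Equivalent of prompting (tokenA:1.0) (tokenB:-1.0)
weighted = apply_token_weights(embs, [1.0, -1.0])
print(weighted)  # tokenB's vector is mirrored, not removed
```

so `(tokenB:-1.0)` pushes the conditioning *away* along that token's embedding direction, which is why results look weirder than just putting the token in the negative prompt.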
>>106730440All these reward hacks should be loras or alternative models.
>>106730288
>Portrait illustration of a young woman. Media: quill pen with india ink. Highly calligraphic line work.
It still has the old problem of mistaking the medium requested with the content. Also we're back to spider web / crosshatching for all fine detail.
>>106730440It is amazing how few people in this space understood why 1.5 became so dominant. Seed gacha is the price to pay for having a model with genuine creativity, which is lacking even in modern SDXL descendants and many other new models.
>>106730440
>The final refinement is achieved though the application of SRPO
Why would it be a refinement step? Isn't the whole point of using SRPO to be able to use it in training over GRPO? There's no way SRPO can get rid of any slop that came before it unless you trained with it extensively.
why did the hype for radial att die? is it worthless for wan2.2?
>>106729921The issue is that their smaller model that they're scaling sucks (which I suspect is because they're using synthetic data). If you stack more layers on top of a good model then that doesn't happen.
>>106730643
CFG peaked with CFG
Attention peaked with sage attention.
It literally cannot get better.
Source? My dad works at stable diffusion.
tell your dad to put the busty asian sluts back into the dataset for SD4
Sure are a lot of no through signs on this street.
I can't face swap using Qwen Edit Image 2509 (+ lightning lora) without first superimposing the face on the target area using an image editor. "Put image1 face on image2 face", or any other prompt, doesn't work. I guess they censored it. Also, with the image editor trick, I then fail to turn her face around so it looks realistic; "tilt her head to the right" doesn't fucking work. They trained it to generate e-commerce images, didn't they?
>>106730609absolutely awful. comfy should not support this model, nobody should.
>>106730678I said stable diffusion, not stability AI.
>>106730691>80b for this
>>106730609>>106730691looks like the propaganda blasts louder tonight as punishment. very very shamefur.
the blonde man holding the glass in image1 is shaking hands with the anime girl in image2. keep his facial appearance the same. keep the background the same.
changed both their poses and combined it with just 2 nodes. qwen edit v2 is neat stuff.
>>106730007I havent kept up with local genning for a while, there's so much new shit... Got a workflow from a Chroma image on civit and I've not seen the majority of these nodes before.
>>106730294
The problem is LLMs can't be creative or controlled enough to come up with a scene in their "mind" and turn that into a usable prompt (it's always too busy and unfocused). Using vision-enabled LLMs works infinitely better though, because they use that image as context to build out a prompt. I think this is most evident with Kontext/Qwen edit models.
>>106730744
>>106730374
>>106730458>None of the models are a match for Udio or ElevenLabs for music.That's also my experience, udio is really good (didn't test elevenlabs).
>They'll be calling me promplet for the last time
dios mio https://civitai.com/images/91532907
>>106730764
>>106730710very cute
>>106730779
qwen is fucking weird sometimes
>>106730783very nice to meet you
>>106730294Okay I need more of that trap on the left
>>106730686It works when I ask to "put image2 next to image1", but after generating 8 images, I can see the model has 0 creativity. They are almost identical.
>>106730795
any news of wan2.5 being local yet
>>106730838No and it never will be. I can't believe people are still so delulu over this. "Maybe" means "No" in China.
>>106730838wan is their video equivalent to qwen-max which they also never open source
the man in image1 is shaking hands with the man in image2. keep both men's facial appearance the same. keep the background the same.
if you say to keep their appearance the same it generally works well. good one, mr fors
>>106730853this one is basically 1:1 with the source.
>>106730845
so it really is over for local video gens. oh well. this gives people plenty of time to make finetunes/loras and optimize the hell out of it like SDXL.
>>106730853
jesus so we've finally reached the peak of a.i meme capabilities huh? we can finally just generate ANYONE/ANY character shaking hands with our buddy spitler. Awesome.
>now make hitler shaking hands with hitler
>>106730864
>>106730853
The obvious use case here is to edit people into candid photos with Jeffrey Epstein.
>>106730873also you can swap any outfit onto any character with a cropped cutout of the outfit. tons of possibilities. it's like a swiss army knife for an AI model.
>>106730294
>Taking the LLM approach to image gen is fucking retarded and I said this during the SD3 days.
GPT Image disproves this. The problem will always be two things with local models, and these two things only: censorship and data. Another anon said it. The best local models we ever got as a base were 1.4 and 1.5. Why? Pretrained on a crap ton of data, and mostly uncensored. After the lawsuits started rolling in, we got SD 2.0 and it's been downhill from there, while companies like OpenAI and Google can shield themselves with the best lawyers.
>>106730881yeah, and normies are too stupid to know the difference, im surprised they dont use AI to lie when they lie all the time on the news.
does chroma flash always give artifacts? i'm using heun 8 steps and at 1024 it looks like the image was jpegified
replace the man with the blue police uniform in image1 with the man with the moustache in image2.
image2: random hitler pic, kek, gave me DAS POLICE
>>106730294repurposing (newer, architecturally superior to T5 without even necessarily having as many params) LLMs as straightforward text encoders seems to still be a decent approach though. The fact that e.g. Gemma 3 has a context length of 8192 vs the 512 of T5-XXL is sometimes noticeable on NetaYume for example.
>>106730913I think this video single handedly proves that you should avoid RIFE like the plague.
>>106730917
(samefag) TLDR: a well trained 2B-6B model attached to the most modern text encoder available, striking the best possible balance between size and performance, is basically the "ideal" thing.
Excuse me, how do I block >>106730932 this user?
>learning how to use flux for the first time
>combined a 2d style lora with a photoreal prompt into a subject that'd be normally impossible in xl
i get it. i completely fucking get the hype now. its like a dopamine rush i haven't felt since like 2023. hooooooly.
>>106730932RIFE has no impact on quality.
>>106730950>learning how to use fluxJust type in the thing you want to see nibba
>>106730686
And I can't seem to transfer style from one image to another. Simply asking to turn an image into a drawing can sometimes work (although it's shit at anime style), but this is definitely not a supported use case. I think my final verdict is that this model is garbage. Thanks China.
god i wish comfyui had nodes that would let you loop through a series of prompts or something
>>106730991
>Just type in the thing you want to see nibba
coming from tagslop/mixture of natural language, no it wasn't that easy.
stagnant thread of stagnant tech
I'm just glad we're past the stage where anything less than glazing chroma would result in an autistic freakout 12 hour spree of posting warped asian feet.
moving thread of advancing tech
the man is giving a speech at the Reichstag in Germany. He is pointing at a large video screen behind him says "RTX 5090: $4000", and has a picture of image2 behind the text. Keep the man's expression and facial appearance the same. The image is black and white.
>>106731061pictured: jensen huang
>>106731061Hi, I'm new.What are you using that allows prompting like that?
>>106731061
>>106731081Nano banana diffused via API nodes.
>>106731061>>106731072i like where this is going. continue.
>>106731081thats qwen image edit v2 (2509). can use multiple image nodes, thats just with a person and a gpu as reference (2 nodes).
the man is giving a speech at the Reichstag in Germany. He is holding an Nvidia GPU in his hand with the appearance of the GPU in image2. Keep the man's expression and facial appearance the same. The image is black and white. At the bottom of the image is subtitles that say "Jensen is a tyrant for these prices!"
pretty cool how versatile qwen edit is.
>>106731106
it's sad it doesn't make images match in quality and style, as it just looks like a shitty photoshop job
the man is giving a speech at the Reichstag in Germany. He is holding an Nvidia GPU high in the air with his hand which has the appearance of the GPU in image2. Keep the man's expression and facial appearance the same. The image is black and white. At the bottom of the image is subtitles that say "No more Jensen holding VRAM hostage!"
he has a point
>>106731087
thanks fren
>>106731094
this better not be a ruse
>>106731123nah, it's fun. you can do basically anything and manipulate images to do almost anything.
>>106731119
>>106730997
I finally got something, with >>106730529 as a reference image.
>Change the style of image1 to an (anime style:1.25) similar to image2. Thoroughly change anything, don't keep details, I want an anime style. Keep the pose of the woman in image1.
No neg prompt. It failed with >>106730563
>>106729065
>>106729967
could i get catbox and model name/links for these two stocking gens? They look way too good.
>>106731005There are various nodes for dynamic prompting, but that can only get so elaborate depending on what you want to do.
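if the dynamic prompting nodes don't cut it, you can also just drive ComfyUI from a script: export your workflow with "Save (API Format)" and POST one copy per prompt to the `/prompt` endpoint. rough sketch below; the node id "6" for the positive CLIPTextEncode and the filename `workflow_api.json` are placeholders, check your own export for the real ones:

```python
# Hedged sketch: queue a list of prompts against a local ComfyUI instance
# via its HTTP API, one job per prompt, instead of a looping node.
import json
import urllib.request

def build_queue(workflow, prompts, text_node_id="6"):
    """Return one /prompt API payload per prompt, each with the text swapped in.

    text_node_id is assumed to be the positive CLIPTextEncode node in the
    API-format export -- look up the actual id in your own workflow file.
    """
    payloads = []
    for text in prompts:
        graph = json.loads(json.dumps(workflow))  # cheap deep copy
        graph[text_node_id]["inputs"]["text"] = text
        payloads.append({"prompt": graph})
    return payloads

def submit(payloads, host="127.0.0.1:8188"):
    """POST each payload to ComfyUI; it queues the job and returns an id."""
    for payload in payloads:
        req = urllib.request.Request(
            f"http://{host}/prompt",
            data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json"},
        )
        urllib.request.urlopen(req)

# Example with a stub graph (a real API-format export has many nodes):
stub = {"6": {"inputs": {"text": ""}}}
jobs = build_queue(stub, ["1girl, omelette", "1girl, pancake", "1girl, syrup"])
print(len(jobs))  # 3
# submit(jobs)  # uncomment with ComfyUI running locally on the default port
```

each POST lands in the same queue the UI uses, so you can fire off a whole batch and walk away.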
>>106730917
Are the neta anime models good yet?
Can I get an amen if you'd rather see Mayli than plastic sluts in bikinis?
>>106731183
this post explains how i managed the realism >>106729199
and sure (the face in the 2nd quote is custom though, but you might be able to achieve it with one of the civitai stocking loras) https://files.catbox.moe/wjh50f.png
the man is at a press conference with a large video screen with an image of image2 on it. He is pointing to the screen. Below the GPU is the text "Only $8000, you dumb fucks". keep his appearance, pose, expression, and facial structure the same.
>>106730301>>106730079>>106730345just buy more vram anon
>>106730899
>does chroma flash always give artifacts? i'm using heun 8 steps and at 1024 it looks like the image was jpegified
No? >>106730657 >>106730380
These are all made with Chroma Flash anon. Likely one of your settings is wrong.
Simon Bisley getting there
plebbitors are claiming you can legit train character/person loras in flux at 512x and get good results, chat is this true or are they r/tarded?
neat, I can make super quick awoos now.
>>106731266
what settings are you using? i've been using the rank 32 lora from here https://huggingface.co/silveroxides/Chroma-LoRAs/tree/main/HD-Flash at 1.00 strength with heun/beta 8 steps. artifacts don't seem to go away in either the fp8 or full hd model
>>106731296Change her appearance to look like Miku Hatsune, and give her teal colored eyes.
im trying out ace_step but its fucked up. like most of the time the song cant even keep a beat
>>106731296>>106731306The future is now
>>106731324it's such a neat tool, doing all these edits manually with inpainting or with controlnets would take so long, those are still good tools but this makes things way faster, and you can use this with wan, qwen/flux, noob/illustrious, whatever.take gens from those and manipulate it with the edit model.
>>106731306
I'll say it again. I loathe how you call her Miku Hatsune and not Hatsune Miku. I think it actually gives me autism.
the man is holding an open pizza box with one hand and is holding a pizza slice with the other hand. keep his appearance the same.
>>106731330It requires (un)comfy UI though, or does it?
>>106731343once you have a good workflow it's very comfy, you just gen. save + export the workflow and it will always be there even if something screws up.
thanks, do you have a link to reality stone upscaler he used?
>>106731339could you gen him holding up a snickers bar?
>>106731380sorry that's a scaler i was given a while ago. dunno if he'd want me sharing it (if he did he'd probably post it to civitai himself kek)
>>106731027bruh ;_;
>>106731029I one shot these now with Flash anon kun. I'm not even trying as hard with feet pics anymore. They are so easy, and they come out flawless 99% of the time.
the man is holding a TV with image2 displayed on it. keep his expression the same.
what a shame
>flawless>she has two left feet
>>106731382
the man is holding a candy bar in his right hand with the appearance of image2 on it. keep his expression the same.
just needed to grab a snickers bar as an image reference for the other node and... it more or less works
>>106731410I assume it was subtle ironic humor.
>>106731387wtf i can't find it anywhere. please share it. just litter catbox the upscaler. Didn't even know there were private upscalers people kept to themselves.
>>106730007>posting my Chroma gens in the inclusive collage. this lightweight model has so much potential...
>>106731410Rare case, easy fix
>>106731432
the man is holding a chocolate candy bar in his right hand with the appearance of image2 on the wrapper. keep his expression the same.
almost, the image source is kinda potato so it's gonna be a bit off.
No tag bleeding
Different character body type handling
Handles various body types properly
>>106731432
this is all i ever wanted. thank you.
>>106731447
eh alright fuck it why not. in fairness i use it with like every single one of my sdxl based realism images so i may as well.
https://files.catbox.moe/2nolmf.pth
>>106731471Me on the right
Inclusive model. Qwen or WAN only makes rigid body types aesthetically. But Chroma has more variety.
okay this is as good as it will get with the low fidelity of the original image.
qwen edit is pretty neat though. just use image2 or image3 instead of image stitch jank.
>>106731484>arch linux
>>106731482
thanks a thousand anon. The upscaler and fp16 vae seem to have made a difference between this gen and my older gens. :) really appreciate it.
>>106731286This is from a 512px dataset (pre-cropped). The only downsides are generally softer output and a more unreliable likeness at distance. It can absolutely get the job done though.
Feeder fetish for women is like meh, I get it. But feeder fetish for men makes no sense to me.
After a day of testing I believe Forge Neo is worse than Forge old.
>>106731569after a minute of thinking I believe your post is unhelpful
Day 36 of no good fast chroma solution that doesn't fuck up my images
>>106731549Now?
>>106731534Awesome, have fun.
The new 2.2 LoRA really does have a tendency to produce over saturated marvel slop.
>>106731613try the 2.1 lora for high at 3 strength, and 2.2 lora for low at 1 strength. idk why but 2.2 high can mess with motion and results a bit.
>>106731635oh, nm there is a new lora out. havent tried it yet.
>>106731578
>fav extension for custom resolutions broke
>has weird behavior when hitting generate after switching a model
>worse performance it seems
where is the benefit
Where the hell are the essential Chroma loras? Why do none exist? There's like nothing. The model has been finished for months now. Who the hell is even using it?
>>106731646
some are saying t2v works fine with i2v, i'll give them a try when they are done dling.
https://huggingface.co/lightx2v/Wan2.2-Lightning/tree/main/Wan2.2-T2V-A14B-4steps-lora-250928
>>106731662>nazi spoted
>>106731662you can just look on civit
What's the best Wan 2.2 LoRA for general-purpose nsfw stuff like taking off clothes/etc.?
>>106731680They're all crappy.
>>106731662
>Where the hell are the essential Chroma loras?
they don't exist
>Why do none exist?
one of the chroma team members had a repo full of good experimental loras but he removed them and likes to chimp out at anyone who reposts or uses his stuff
>Who the hell is even using it?
a handful of people on civitai and this thread. most people aren't going to bother with it for a variety of reasons
>>106731682
>general-purpose
The only general purpose wan LoRAs are the lightning ones, everything else is quite specific
give the blonde girl in the dress teal color twintails like Miku Hatsune, in a low poly style.
neat. source is parasite eve 1 on ps.
>>106731662>essential Chroma lorasSpecify or STFU
>>106731704>good experimental loras but he removed them and likes to chimp out at anyone who reposts or uses his stuffWhat why did he remove them?
>>106731724because the people working on chroma are self-sabotaging retards. see >>106728978 in the last thread
>>106731724please join the discord to find out ;)
>>106731662
>Who the hell is even using it?
Anyone who wants to generate realistic (as in non-plastic) people, particularly for NSFW.
For anime / hentai etc, Illustrious remains the only real choice.
The question you should ask yourself is who the hell is using Qwen (not Edit), or even Flux, these days; they have zero usecase with their plastic censored output.
>>106731722Basic shit like NSFW positions, body sliders, races, Camera positions. You know, essential loras that every model has?
>Heh, oops, sorry, must have been that Chroma gen I posted.
>>106731724>What why did he remove them?Probably because they were made against ancient versions of Chroma and don't work well with the final releases
when will imagegen/videogen figure out MoE?
>>106731766cope, they worked just fine
when will we figure out how to make SDXL into a video model?
>>106731751Really, show me the Qwen loras for these subjectsQwen has been out longer than Chroma and has had a Civitai section much longer than Chroma
>>106731300
You want to use this
https://huggingface.co/lodestones/Chroma1-Flash/tree/main
Here's catbox
https://files.catbox.moe/xnc4fe.png
>>106731788Everyone uses Qwen Edit, and you don't need those kind of loras for that.
https://github.com/comfyanonymous/ComfyUI/issues/10068
>>106731789gotta love how this guy always posts the template workflow. absolutely nothing different about it from the one comfy provides.
>>106731789So if I wanted this as an FP8 would it be better to just quantize the full Flash .safetensor versus using a lora?
>>106731801
kek, backtracking directly, you're so full of shit
and of course you need loras for Qwen Edit since it's massively censored just like Qwen
>>106731782seed variation animations homie
>>106731811
comfy is right. HunyuanImage-3.0 is meme shit. still, if he wants comfyUI to not get a bad rep, he should just do it anyway to please the 5 people that will use it.
swallow your pride comfy
>>106731811
Hate to say it, but the shithead Comfy is correct for once; there's zero reason to add Hunyuan 3.0 unless there's some insane quantization magic that makes 1-bit work well with this model
>be me
>using illustrious in SwarmUI like a pleb
>inpaint twice
>image now has yellow tint
what am I doing wrong here
>>106731830>>106731832You don't see this retarded mindset of "waaaa model too big for my gayman GPU, so it's useless!!111!" in the LLM scene (thankfully).Some non-poorfag people on /lmg/ have even come around agreeing to the fact that models larger than 200b can, in fact, be "local" as long as you are not poor.
i bought the bank
>>106731858wouldn't you only need enough space for a 13b since hunyuan 3.0 is an moe? the rest could get offloaded
>>106731858since you love /lmg/ so much that you have to post about them in literally every thread, you can go BACK to sucking their cocks in THEIR thread you cum guzzling dickbrain.
>>106731814
Yes. This Flash version is very special; it's based on an experiment Lodestone made while training HD, so it fixes fine details. Results in less limb horror, higher quality images, and much closer convergence than regular Chroma HD. The only cost so far is that the style variety is not as large, and prompt following, especially with multiple subjects, is not fully on par with it.
Here's another workflow for 2k images that merges HD (v50) with the Flash delta for better prompt following performance and full leverage of the Chroma Flash detail fixes. 2k images are faster than with regular Chroma HD, so there's no downside to using that over HD.
https://files.catbox.moe/pg1c1o.png
God I cannot stand the furry dick sucking.
hmmm
https://huggingface.co/lightx2v/Wan2.2-Lightning/tree/main/Wan2.2-T2V-A14B-4steps-lora-250928
interesting results, it doesnt seem to have the slow motion that the i2v 2.2 lora has, despite being a t2v lora.
>>106731811
do people use ltx video, sky reels or hidream? There are tons of dead models that got no traction or community support. Wonder if there's going to be "lost media" when it comes to ai model files in the future.
>>106731898
*this was updated recently, they say they are making an i2v update too but for now it seems ok?
1 strength for high and low.
>>106731858RTX 6000 Pro 96gb vram can't even run this image model without massive offloading slowing things downLLM is less taxing overall, you can even run it on pure cpu solutions, for image / video generation, it's GPU or bust
>>106731811if it was at least good, but why support something that is huge AND shit
>>106731901
>do people use ltx video, sky reels or hidream
no, no one uses those. hidream requiring quad clips is what killed it. if people ditched hidream, then they absolutely will ditch hun image 3 after a week, no question.
>>106731915
>1x rtx pro 6000
he said non-poorfags, anon
>>106731891
Is there a difference between the full HD-Flash checkpoint there and merging the delta weights into HD? Sorry in advance if you already answered that and I'm just not understanding.
>>106729359
to answer my own question... turns out the change in my old gens wasn't caused by pytorch at all!
the differences between pytorch versions are actually very small in most cases, so the original comparison image must have been very cherry-picked
of course the actual cause was comfyui updates, and rolling back to november 2024 did the trick
>>106731848
could be a bad vae
another test: the man gets up and runs very fast out of his office.
he's a bit confused, don't mind him
>>106731811
I hate the argument of "you have this much money, why can't you do everything we want no matter how stupid it is". ComfyUI has no obligation to support every model, especially one that barely qualifies as local for the average user.
This can actually have negative consequences for Comfy. If Comfy supports it, retards that don't understand the insane requirements will think they can run it, get angry at how slow it is, and blame Comfy for not optimizing it.
>>106731933
>three legs
>broken leg
>watermelon
>>106731945
>the chair turns into a CRT TV with a random guy behind it
>>106731945
i2v is still more fun than animate
another test: wan 2.2 i2v high into wan 2.1 i2v lora
plus wan 2.2 i2v low into wan 2.1 i2v lora
1 str for all:
>>106731930
Yes, in particular prompt following, and what I perceive as sloppiness in regular HD. For instance, when you're prompting for two subjects, like the one I linked, regular Flash HD tends to give you 3 or 4 because it's not as strong at prompt following. That is fixed with the mixed version I gave, with the caveat that to see the amateur photo look and not slop, from my tests you have to prompt it at least at 2k or close enough to it.
Here's another example where I put prompt following to the test (this time at 1152 res):
https://files.catbox.moe/nkn901.png
The regular Flash version falls apart there, and I tested other ones. Regular Chroma HD does all of them fine, so that's the reason for the merge.
Note you'd want to save the merged checkpoint instead of loading the two separate checkpoints each time.
>>106731953
it was the image that described my current mood the best
>>106731976
test 2: the man drinks a bottle of beer and smiles.
oddly enough, using both loras sorta works? does 2.1 fix the speed issue? idk.
>>106731999
and idk why, but using both seems to work.
>>106731811
>NOOO YOU HAVE TO IMPLEMENT EVERY USELESS MODEL EVER
lol, I hate Comfy but he's 100% right there. who's gonna run a slopped 80b model?
>>106731811
who the fuck has 100k in GPUs to run an 80b model?
>>106731811
Okay, I kind of agree. But what other pressing project is currently taking up so much of the team's time that they can't hack together a solution to run it? Most of the time support is available within hours. Is there something about this model that makes it hard for them to implement in Comfy?
My instinct tells me it's not for the stated reasons. I believe they are technically incapable of implementing it without serious backend changes.
>>106732012
the asian girl with large breasts jumps up and down.
need more testing to see if this combo is effective or worse than 2.1 high + 2.2 low.
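The "str" knobs in these lora tests are just a scalar on the low-rank update before it's added to the base weight: W' = W + s·(B@A). A tiny pure-Python sketch with 2x2 matrices (real loaders do this per layer on tensors; the matrices here are made up for illustration):

```python
# How a LoRA strength multiplier enters the weight update: W' = W + s * (B @ A).
# Tiny hand-rolled matrices for illustration; real code does this per-layer on tensors.

def matmul(B, A):
    """Plain nested-list matrix multiply."""
    return [[sum(B[i][k] * A[k][j] for k in range(len(A)))
             for j in range(len(A[0]))] for i in range(len(B))]

def apply_lora(W, B, A, strength: float):
    """Merge a low-rank update into a weight matrix at the given strength."""
    BA = matmul(B, A)
    return [[W[i][j] + strength * BA[i][j] for j in range(len(W[0]))]
            for i in range(len(W))]

W = [[1.0, 0.0], [0.0, 1.0]]  # base weight
B = [[1.0], [0.0]]            # rank-1 down/up projection pair
A = [[0.0, 2.0]]

print(apply_lora(W, B, A, strength=1.0))  # [[1.0, 2.0], [0.0, 1.0]]
print(apply_lora(W, B, A, strength=3.0))  # [[1.0, 6.0], [0.0, 1.0]]
```

This is why cranking a lora to 3 str (as in the high-noise test below) just triples the same update rather than adding anything new, and why overly high strengths tend to degrade output.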
>>106731267
Raw text2img? No LoRA? That's insane.
>>106731811
Not true, I generated like 10 images so far and haven't given up yet!
>High contrast vector illustration of a young woman with magenta hair and blue goggles riding a hoverboard across a cratered desert surface. She is wearing a vest, cargo pants, and sandals. Various gadgets are looped at her waist. She is riding towards the viewer, seen slightly from below and at an angle.
>>106732058
looks like ass, Qwen Image can definitely do something similar
the man in the blue shirt jumps onto the airplane behind him that is taking off high into the sky.
still not sure if combining 2.2 and 2.1 is better or worse :/
>>106731750
Qwen-->Wan looks a hundred times better than any shit chroma can do. Like holy shit, get some eyes already and notice the man-woman hybrids (unless you are into that, then I guess continue on?)
My dream would be for the Qwen team to get their shit together and unfuck base Qwen-Image, just like they tried to do with Qwen Edit, preferably by not fine-tuning the model on synthetic slop during post-training
>>106732076
now we're talking.
>>106732090
hell yeah
>>106731811
Lol, we are getting to the point now where they pick and choose which open models they implement, but spread wide open for any bullshit API. Yeah, comfy is cucked.
the man in the blue shirt points at the airplane behind him, and it explodes into fire and smoke.
>>106732090
>>106732109
i prefer this over the original cut of dark knight rises
>>106732090
It's crazy how far local video models have come...
>>106728954
>Powerpuff girls
could be a good idea, there's already an illust one on civitai that looks not terrible though
the man in the blue shirt is having a fist fight with a muscular man wearing a black tank top and black ski mask.
>>106731283
>>106732076
>>106732090
>>106732109
>was generating slop part of your plan?
>>106732147
of course!
>>106732147
The slop rises!
>>106732077
>Qwen-->Wan looks hundred times better than any shit chroma can do
Pajeet thinking slopped plastic looks good, go away
>>106732166
lol, lmao even. Enjoy the blurry trannies I guess
>>106732143
this time, the 2.1 i2v lora for high noise at 3 str, 2.2 low lora at 1 str.
yeah, the old fix still works better overall. now he's even fighting!
>>106732144
Impressive
>>106732037
lora, not good enough for sharing
>nooo my millionaire dev!!! Don't bully him
>>106731283
qwen?
>>106732211
I like these, how do you make them?
the man in the blue shirt holds up a sign saying "you're a big guy", and a muscular man wearing a black tank top and black ski mask punches him, causing him to fall to the floor.
not bad.
What's happening to Comfy is normal. He's got the UI monopoly, so if he wants to stay on top he'll have to keep pioneering and make Comfy work with literally every single new meme model that drops.
That's why he's successful and Ani is a loser.
>>106732215
it's a lora I just finished training on stills from the camp lazlo cartoon
>>106732223
>>106732237
>>106731662
Essential Chroma LoRAs as in what exactly, anon? Chroma in its current state is a base model (what comes before an IL/Noob-style finetune), so you won't find any anime-specific LoRAs that are really good. For realism, unless you are going for a style, it needs no LoRA.
>>106731662
>he's still not understanding that Chroma is ass and no one wants to deal with this shit
lol, some anons are really slow I swear
>>106731398
>calling that flawless
lmao
>>106731398
>flawless
>2 left feet
hmm...
>>106732235
everyone loses. cool
>>106732263
I like Chroma, and it's the best base model overall, but again, it is a base model, which means it's not going to excel at any specific category, just be ok/good at pretty much everything.
So for really good results in NSFW, styles, etc., like with all base models, you need to finetune further, typically with a lora since it's easiest.
i can't figure out a good lineart processor for these thicker lines
default wan size, pretty good result:
>>106732360>>106732360>>106732360>>106732360>>106732360
>>106732349
Yes, this.
>>106732026
it is a non-trivial architecture. does comfy support LLMs? No. This is an LLM, not a DiT model. Even the guys at llama.cpp are going to take their sweet time:
https://github.com/ggml-org/llama.cpp/issues/16303
and we're also still waiting for them to implement qwen3max.
So yeah, it's going to be a lot of effort for... a shit model.
As for comfy's stance, he's a retard. I would've just said the implementation was put on low priority, and in 2 weeks when everyone has forgotten about it, let it rot in the backlog.
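The "it's an LLM, not a DiT" point also explains why naive expert offloading thrashes: an MoE routes each token to a different top-k subset of experts, so the set of weights you need changes constantly. A toy top-k router sketch (expert count, k, and the gating being random are all made-up illustration values, not Hunyuan's actual config):

```python
# Toy MoE router: each token picks its top-k experts by gate score, so the set of
# "active" weights shifts token to token -- the reason naive offloading thrashes.
import random

random.seed(0)
NUM_EXPERTS, TOP_K = 8, 2

def route(gate_scores):
    """Return indices of the top-k experts for one token, highest score first."""
    return sorted(range(len(gate_scores)),
                  key=lambda i: gate_scores[i], reverse=True)[:TOP_K]

needed = set()
for token in range(16):  # a short sequence with random stand-in gate scores
    scores = [random.random() for _ in range(NUM_EXPERTS)]
    needed.update(route(scores))

# Over even a short sequence, most experts end up needed at some point,
# so "only keep the active 13B resident" means constant host<->GPU transfers.
print(f"experts touched across 16 tokens: {len(needed)}/{NUM_EXPERTS}")
```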
>>106730042
>>106730123
>>106730267
>>106730446
>>106731391
fuck off tranny freak
>>106732208
that's fine and all, but you're the only one here that knows what said millionaire dev's cock tastes like