He Thinks He Knows Edition

Discussion of Free and Open Source Text-to-Image/Video Models and UI

Prev: >>106715652

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/sd-scripts/tree/sd3
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts
https://github.com/tdrussell/diffusion-pipe

>WanX
https://comfyanonymous.github.io/ComfyUI_examples/wan22/
https://github.com/Wan-Video

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
Training: https://rentry.org/mvu52t46

>Neta Lumina
https://huggingface.co/neta-art/Neta-Lumina
https://civitai.com/models/1790792?modelVersionId=2203741
https://neta-lumina-style.tz03.xyz/

>Illustrious
1girl and Beyond: https://rentry.org/comfyui_guide_1girl
Tag Explorer: https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbours
>>>/aco/csdg
>>>/b/degen
>>>/b/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
>I have to edit a yaml file to add a model route
>In a way it's not specified.
>Comfy.exe doesn't let you select a custom drive to install
This is shit THIS IS SHIT THIS IS SHIIIIIT
Context Window edition
>>106719283hahaha, yeah...
>>106719264
yeah i'm using a few loras,
https://civitai.com/models/1648982/wan-nsfw-posing-nude
https://civitai.com/models/1343431/bouncing-boobs-wan-14b
and my lightning is set to 1, maybe it's a lora strength imbalance? i'm on wan 2.1 720p 6K. maybe it's also a promptlet issue.. well, just knowing it's a model censorship issue brings me some kind of relief that i'm mostly doing this right.
the man in image1 is wearing the outfit of the man on the right in image2.
then I added a hat that it missed.
>goose ball run
>>106719283API nodes don't have this problem
>comfyui >kijainodes
>>106719303it has no care for the style of the original image, like we're not asking for much, just for the outfit to not be anime when applying to a realistic character lol
>>106719257
Test and tell us back anon, I suspect that context length = total length is basically because of the dual sampler thing going on with 2.2.
After all, how would the node even work if the rolling context is instantiated in both high and low noise?
>the memory leaks are fixed guys!
>infinitely worse leaking and crashing issues than before
>>106719198Yeah and this doesn't seem to work with their default i2v workflow, it just errors.
>>106719311*sniff*
>>106719319well it's a different sampler which gets those latents. i'm wildly assuming it will just work fine. but yeah i'll post results when done
>>106719327It's broken then, maybe worth testing in t2v.
do i look like i know what a context window is
>>106719324wait, this might be why anon is having issues with wan. fucking cumfart, I swear he doesn't fucking test anything
>>106719324
don't use my misery for your spamming faggotry, you schizo
>>106719286
>t2v only
>>106719316you have to do it in two steps, covert the style then blend the images
>nigbo
should just take the API node pill. no issues whatsoever
>>106719324actually fixed it for me, I don't use those lora power loaders though
>>106719336
Yeah but how does it make sense?
You have 32 frames total, and a context of 16 (let's ignore overlap)
So it means the first sampler will:
- gen 16
- gen the next 16
Then it sends 32 frames to the second sampler?
How can it only send 16?
I hate what this general has become because of this shit
>>106719377why would it send anything? it's an entirely new process, like a refiner.
>>106719387The solution is to remove the SaaShit from the OP
>>106719389
For me the process is linear: you either send all the latents of the 32 frames to the next sampler, or it doesn't make sense node-wise.
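fwiw the window scheduling itself is easy to sketch. Hypothetical helper below (names are illustrative, not the actual node's API): it just slices a frame range into overlapping windows. If the high- and low-noise samplers each run the same schedule over the same full-length latent, then all 32 frames' latents do get passed between them, which would square with "context length = total length" behaving differently.

```python
# Hypothetical sketch of rolling-context window scheduling; not the real
# node's code, just the indexing logic being discussed.
def context_windows(total_frames: int, context: int, overlap: int):
    """Return (start, end) pairs covering total_frames with the given overlap."""
    assert overlap < context, "overlap must be smaller than the context size"
    stride = context - overlap
    windows = []
    start = 0
    while start < total_frames:
        end = min(start + context, total_frames)
        windows.append((start, end))
        if end == total_frames:
            break
        start += stride
    return windows

# 32 frames, context 16, no overlap -> two disjoint windows
print(context_windows(32, 16, 0))  # [(0, 16), (16, 32)]
# with overlap 4 the windows share frames at the seams
print(context_windows(32, 16, 4))  # [(0, 16), (12, 28), (24, 32)]
```

each sampler would sweep these windows over its own copy of the full latent, so nothing smaller than the full 32-frame latent ever moves between the two.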
>>106719403the solution is to use API nodes silly
goddamn, so many filtered messages already. eu times are so much better.
anyway thanks to the anon for posting that context window link earlier, i really was a fucking retard with the settings i had. might finally get longer gens working without breaking into mashed potatoes.
>>106719387karma
>>106719414No problem anon, but apparently it's broken for i2v, and maybe this explains why it doesn't have much chatter online.
this is already looking far better than the absolute rot i had it set to
>should just take the API node pill. no issues whatsoever
>>106719451your average 1girl genner
>>106719451plappable
>>106719311Where's Lara Croft?
Hunyuan 3.0 80b predictions
>NOOOO WHAT IS THIS?? ITS TOO BIG! I BOUGHT 2 5090s AND STILL CANT RUN IT
>UGH WHEN WILL THESE CHINKS LEARN TO OPTIMIZE
>IT CANT EVEN FOLLOW THE PROMPT RIGHT. IM RUNNING 1BIT_Q1_LIGHTNING QUANT WHICH SHOULD BE INDISTINGUISHABLE FROM FP16
>8 MINUTES PER IMAGE BUT THATS FINE, I DONT MIND WAITING AT ALLL HEHHHHHHH
meanwhile
>Hunyuan 3 great, better prompt following than GPT-5! I'm running it through ComfyCloud right now perfectly
>Crazy how good this model is with API Nodes, I get a 2k image back in 8 seconds
>Diffusing Hunyuan 3.0 locally with ComfyUI API Nodes, fully uncensored the greatest local model of all time!
>>106719273
I tried 49 frames, 720x1248, 8 block swap. Ran out of VRAM quick.
Tried again with 16 block swap - RAM was maxxed out and VRAM hit 95%, but it did complete. Took about 100 seconds longer than a 480p video with 101 frames, so not worth it unless I can get the RAM/VRAM down, assuming that's what's bottlenecking and that the render time isn't just the expected difference between the two.
>your average 1girl genner
>106719515ranfaggot looks like this
>>106719503I feel sad for understanding that reference
I wish I was gamer enough to game.
>>106719515now do a miku genner
Crazy how underrated Krea is
>>106719511
qwen image is 20b and gens (with lightning) in ~4 seconds for me. 80b is so fat, the idea of stacking 4x qwen image is insane to me. let's not even address how qwen does not remotely achieve the quality of even FLUX, which is what, 12b?
anyway, the size isn't important so much as capability. if you've been paying attention to the output gens (the bears + mooncake junk slop they genned), it's pretty nifty how each elemental bear has that elemental text character on it and an associated mooncake. in other words: the model is smarter. 80b smarter? fuck no. it's probably gonna need a ton of training to be remotely viable, and to me, that's where the pain is: can we train this in any reasonable way or is it gonna be, effectively, just a closed box model due to sheer girth?
further to this, it's just no longer viable for raw t2i these days. we need image models to do i2i, editing, multiple reference images, style/subject extraction/separation, etc.
this thing better bang.
>>106719294
https://civitai.com/models/1602000/innie-pussy
>>106719434Is that t2v?
>>106719568yes
>>106719571And did it work?
>>106719543
>I feel sad for understanding that reference
You mean, you just know.
>>106719548
>>106719590Classic Lara is still the best
>>106719597miku anon is looking rough
>>106719584yes and no, there is very obvious ghosting where the context ends so i'm trying more settings
>>106719609U're beign y'allsphobic
>>106719285
Surely, there must be a way to set the length to something wild like 500 frames without OOMing. Gotta be some kind of node that optimizes the memory (rather than completely offloading). So instead of it taking, let's say, 10 minutes to generate, it will take 15 or 20 because it optimizes the memory.
>>106719561
>wan2.2 loras
anyway, i figured out what my problem was: while shift 8 is recommended for 720p, i set my shift down to 4.5 and magically everything is working as intended. i'm not even gonna question this logic that defies the clear instructions, just gonna roll with it and set a queue of 8 and hope cumfartyUI decides to let me roll this one.
>>106719511
It should be able to be run locally at Q8 with offloading to the CPU. Just prepare for 30 minute generation times kek.
>>106719625he represents the best of /ldg/
>>106719642
Cont...
I found these, does anyone know if these would help in generating longer vids?
>>106719618I mean if it works it's great, even with the ghosting, which hopefully is just a parameter issue.
>>106719642
>>106719674
2 methods:
- using the last image of the preceding gen to make the next x frames of the next gen. Issues:
1- it has no idea of the motion going on
2- deterioration of the image over time because of the vae
- using a rolling context window to generate long videos. Issues:
1- it doesn't work with i2v for now
2- no possibility to change the prompt between videos
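the first method (last frame as next start image) is just a loop, sketched below as toy code. `gen_clip` is a stand-in for an i2v call, not a real API; real pipelines decode/re-encode through the vae at every seam, which is exactly where the cumulative quality loss comes from.

```python
# Toy sketch of "last frame as the next gen's start image" chaining.
# Frames are plain numbers here; gen_clip stands in for an i2v call that
# continues from a given start frame. No real model or vae involved.
def gen_clip(start_frame, length):
    return [start_frame + i for i in range(length)]

def chain_clips(first_frame, clip_len, n_clips):
    frames = gen_clip(first_frame, clip_len)
    for _ in range(n_clips - 1):
        nxt = gen_clip(frames[-1], clip_len)
        frames += nxt[1:]  # drop the duplicated seam frame when stitching
    return frames

print(len(chain_clips(0, 81, 3)))  # 241 (81 + 80 + 80)
```

note the motion problem from issue 1 above: `gen_clip` only ever sees one frame, so velocity/direction at the seam is lost no matter how clean the stitch is.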
>>106719643what exactly does shift do?
>>106719549iykyk
>>106719732
I love it, but it's blurry
>>106719714increases the urge to subscribe to API nodes
>>106719753please answer if you know. else take the trolling attempt back to sdg
>>106719714
decent blog on it that explains it better than i can: https://replicate.com/blog/wan-21-parameter-sweep
as i said, higher shift is usually recommended when you gen at 720p's recommended res, but in the case of using a lightning lora at 4 steps you want it pretty low so it doesn't go overboard/fry. in my case, inflated tumor-looking pussies and heads rolling 360 degrees.
(also comfy failed to do a 4 gen queue.)
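for the curious, the shift knob is just a remap of the noise schedule. A common SD3-style form (which, assuming wan's samplers follow the same convention, is what the UI exposes) is t' = s*t / (1 + (s-1)*t):

```python
# SD3-style timestep/sigma shift: higher s pushes the schedule toward high
# noise, so more of the step budget goes to composition. At 4 lightning steps
# that's easy to overdo - hence dropping shift from 8 to ~4.5 helping here.
def shift_sigma(t: float, s: float) -> float:
    return s * t / (1 + (s - 1) * t)

for s in (1.0, 4.5, 8.0):
    print(s, round(shift_sigma(0.5, s), 3))
```

shift=1 is the identity; at the schedule midpoint, shift 8 has already pushed the effective noise level near 0.9.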
Forge doesn't have this problem btw
>>106719549
>Crazy how underrated Krea is
It's all right, but so far, only Seedream is capable of making the photo look like it wasn't taken in a professional studio or something like that.
>>106719713
Actually, the sloppy last frame technique could work, considering we now have the context nodes. Did a lot of that with 2.1, wasn't great, but with these nodes there has to be a way to feed the context alongside the last frame.
Was there a way to change the order of these?
>>106719779thank you!
>>106719801
>feed the context alongside the last frame
The whole idea is to only work in latent space; if you use the output frame, you go through a vae, and you deteriorate the result.
>only Seedream is capable of making the photo look like it wasn't taken in a professional studio or something like that.
Alright that one made me kek
>>106719750the grain can get kinda out of hand sometimes but i'll take it
>>106719871
I've never gotten gens looking this sharp and good
So glad I saved my ComfyCloud API tokens for Hunyuan 3. Going to be fun mogging openkeks who cant even fit it on a 5090
>>106719952ranfag
>>106719872
https://pastebin.com/JgZEs7QQ
here's an oversimplified version of my workflow
>>106719795
Nano Banana looks closer to high-quality CGI than a photo here (also Nano Banana doesn't even seem to be as good as Imagen 4 Ultra for straight text-to-image in the first place). Both Seedream and Krea look precisely like professional studio photography though IMO, just with different lighting approaches.
Either way, at the very least there's not really any good reason to use regular Flux Dev anymore when Flux Krea exists; Krea is just a lot better, not only for realism but also prompt-adherence wise and as far as understanding of certain stylistic concepts.
>>106720012the buildup and payoff here is fucking crazy man. are you using that extended context thingamajigger people were talking about last thread?
>>106719872tip: never ever ever ever ever ever ever LOWER the guidance with Flux Krea like people insisted on doing with regular Flux, it will do nothing but reduce color range, detail, and coherency. Instead RAISING it to around 4.5 instead of 3.5 gives the best results with Krea I find in terms of detail and coherency, without having any negative impact on realism or anything like that. This is with Euler Beta, that's generally what I use (unless I'm using one of the RES4LYF custom samplers).
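for anyone unsure what that number actually touches: guidance is just the classifier-free guidance blend applied at every step. A minimal sketch (plain Python, illustrative values, treating predictions as flat lists):

```python
# Classifier-free guidance: extrapolate from the unconditional prediction
# toward the conditional one. Raising the scale (3.5 -> 4.5) pushes further
# along that direction each step; nothing about the model itself changes.
def cfg_combine(uncond, cond, scale):
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]

uncond = [0.0, 0.2]
cond = [1.0, 0.6]
print([round(x, 3) for x in cfg_combine(uncond, cond, 3.5)])  # [3.5, 1.6]
```

which is also why going too low washes things out: at scale 1.0 you get the conditional prediction with no extrapolation at all, and below that you start blending back toward the unconditional one.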
>>106720012
10 sec?! Teach me, Master!
>>106719795
Want to test the intelligence of SaaS vs. Local models?
Character reclining in the armchair, his back leaning against one armrest while both legs rest on the other armrest.
Did you know that wan2.2 is better than wan2.1?
>>106719872the fuq is up with her fingers?
>>106720083
>>106719991
if anyone actually uses this, i'd really like to see results before i share it elsewhere
>>106720127
no hand refiner in place, since i never look at them full res anyway
>>106720129it looks good but that's a really big sofa
>>106720148for you
>>106720178I'm not a manlet so my sofa can't be big, sorry you are one of those
>>106720148>that's a really big sofa...what?
>>106720072it's probably just WAN 2.5 straight output
>>106720263gen made in Chroma btw
>>106720263So that's the power of a 80b model...
>>106719267Troon posting is no different from shitting up the board with scat. It should be a bannable offense.
>>106720291>t. troon
>>106719691
been genning with more settings, still get ghosting but it's getting better. i'm still a tard so i don't know which fuse method is best for these settings though. closed loop is off because if the last context window is quite different it just blends into the new context instead of smoothly transitioning.
>>106720263That's you in the middle, isn't it?
>>106701482nice style, catbox?
did a five-way comparison with each model at its recommended default resolution instead of trying to artificially match them.
Prompt (which I think is a pretty legitimately tricky one to get fully correct) was:
```
A dusty, forgotten attic workshop in the late 1970s, captured on grainy, high-ISO Kodachrome film with a slight, warm color shift and subtle light leaks in the upper right corner. The main light source is a single, low-hanging bare bulb just out of frame to the left, casting long, soft shadows. In the center of a cluttered wooden workbench sits a bizarre, handmade device: a series of five nested brass rings, each inscribed with intricate, non-terrestrial constellations, levitating around a central, softly glowing, milky quartz sphere. To the immediate left of this device is an open, leather-bound journal, its pages filled with frantic, handwritten cursive ink notes and complex geometric diagrams. The right-hand page of the journal, facing the viewer, must clearly and legibly display the handwritten text: "The resonance is not a frequency, but a location. It remembers the space it used to occupy." In the near foreground, slightly out of focus due to a shallow depth of field, rests a soldering iron with a wisp of cold smoke rising from its tip. In the background, hanging on a pegboard, various well-worn tools are silhouetted against the dusty light.
```
Only key takeaways are probably that HiDream is objectively the worst (extreme JPEG artifacting in the native output, total failure to output text at all) and also that none of the images really did an accurate job of capturing what a Kodachrome photo from the 70s actually looks like.
>>106720372i was fucking with you on the closed loop lol, i wouldn't bother with anything that has looped in the name with what you're trying to do
>>106720240so sad if true
Anyone that's run this wan node, how the hell does the temporal mask work?
>>106720372I think I'll just wait for a version that gets support for i2v anyway, and more people trying stuff with it.
Wan 2.5 will never be open source btw. The whole be nice and we might release it thing is just a way to smooth over the discourse as they transition to full SaaS. Anyone saying otherwise should be laughed at for the clowns they are.
>>106720489
I mean that's one of the major improvements in 2.5: proper support for true 1080p and up to 10 seconds at all resolutions (480, 720, or 1080)
>>106720510
IDK man, if you run the model through TensorArt at least (assuming you have an old enough account there to not be impacted by their current NSFW crackdown as far as what the generator allows promptwise), you'll see that it's as uncensored as 2.2 and 2.1 were. It's also not THAT good in the grand scheme of things: the coherency and retention of likeness and so on are great, but stuff like the lipsyncing is nowhere close to Veo 3, and the prompt adherence seems noticeably worse than Kling's new 2.5 Pro Turbo, so I don't think Alibaba really have quite a product yet where going full SaaS right now would pay off.
>>106720546
>Kling's new 2.5 Pro Turbo
They do have great video models, but they will never share them, nor allow any nsfw.
the man with the blue shirt and black jacket in image1 is wearing the outfit of the anime character in image2.
then
give the man wearing a purple shirt in image1 a brown cowboy hat. keep his expression the same.
>>106720559
I mean that's not the point though, my observation was about that model's overall prompt adherence versus WAN 2.5's
I installed Forge Neo but it made my waifus butt smaller compared to Forge. What the fuck is this???????
>>106720274were you using like Euler Simple and no negative whatsoever or something?
>>106720012Fuck yeah!!
the man wearing the blue shirt and beige pants in image1 is wearing the outfit of the man in image2.
then
add a blue hat like the man in image2 to the man in the center in image1.
Is qwen image edit 2509 censored?
Flux Krea recreation of an old modded Skyrim screenshot of mine with a prompt from Gemini, composition came out pretty close to the original
>>106719590
almost topcow vibes :3
>>106719515
>>106719597
do we HAVE to post selfies here? this is the ai thread (local only)
>>106719871
>>106719991
love the hair <3
>>106720638
yes, if it detects the word "breast" in your prompt it IMMEDIATELY phones home directly to Xi Jinping, who then promptly arranges for your assassination.
(actual answer: not any more than original Qwen Edit AFAIK)
>>106720638
Yes, but at least for nudes there is a lora that helps. For anything suggestive or sexual, it will need loras to understand them.
>>106720662
It doesn't seem like it is, but I feel it is stupid sometimes and you have to reiterate the prompt
>>106720631the man in the blue shirt is riding a brown horse that is galloping towards the camera. keep his appearance the same.
I swear it's the same fucking guy every time asking if X model is censored. Then his next question is always "can it do pissing?"
>>106720510
nta
They eventually do, when the next version is ready.
Look, we are not even close to 30 sec of continuous action scripted in the prompt. Once this is achieved, dear lord...
>>106720695
>nta
There was no need for that. I wasn't replying to anyone.
>>106720695
>Look, we are not even close to 30 sec of continuous action scripted in the prompt. Once this is achieved, dear lord...
Work needs to be done on the node for that, and it seems no one is interested.
>>106720692
it's more than one guy, and they're probably the same guys who repeatedly claim that CivitAI "banned NSFW" when they did not actually do anything more than crack down on nonconsensual deepfakes and extremely niche controversial fetishes (and only once their hand was forced by payment processors).
two edits but it still works. one for the hand up, then the action.
the man in the blue shirt is holding his left arm in the air, the plane behind him is engulfed in flames, and fire scorches the skies. a pillar of fire from the sky hits the plane. At the top of the image is a blue rectangular textbox the width of the image, with "FIRAGA" in white text in the center. keep his appearance the same.
>>106720662>IMMEDIATAELY phones home directly to Xi JinpingThat phone is going to be busy
>>106720755the man in the blue shirt is holding his left arm in the air, the plane behind him is frozen solid in ice, and a blizzard of snow is in the sky. the plane is encased in thick ice. The ground is covered in snow and ice. At the top of the image is a blue rectangular textbox the width of the image, with "Blizzaga" in white text in the center. keep his appearance the same.
Can I fart on a turtle with SD3
I don't know what to generate.
I fucking knew it was fake.
>>106720873
!!!
>>106720012 Hint Hint!!
>>106720881are you a moron? all movies are fake
the man is holding a vanilla ice cream cone in an ice cream shop like Baskin Robbins. A neon sign in the back of the shop says "BIG GUYS" in neon lighting. Many flavors of ice cream are visible on display. keep his expression the same.
a big scoop, for you
>>106720012Impressive face consistency throughout the video
howdy lads, VRAMlet here.
Has anyone heard of this new higgsfield video model?
Idk the details yet, but if/when it gets local support we should run some tests.
>>106720873
cute anime scenes? ;3
steal some homework! >>106720951
>we live in the timeline of skin-tight workout shorts+lace chones
there is a lot of complaining going on, but realistically women have never looked more adorable at any point in human history
>REQUESTIN CATB0X SIR
>>106720012
>10sec video
I WANT IT AHHHHHH
>>106721009
theory:
>create 5 second video
>use endframe of 5.3 second video as starting frame for another 5 second video
>stitch together
neat :D
>still no pony v7