4u editionDiscussion of Free and Open Source Diffusion ModelsPrev: >>107858102https://rentry.org/ldg-lazy-getting-started-guide>UIComfyUI: https://github.com/comfyanonymous/ComfyUISwarmUI: https://github.com/mcmonkeyprojects/SwarmUIre/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneoSD.Next: https://github.com/vladmandic/sdnextWan2GP: https://github.com/deepbeepmeep/Wan2GP>Checkpoints, LoRAs, Upscalers, & Workflowshttps://civitai.comhttps://civitaiarchive.com/https://openmodeldb.infohttps://openart.ai/workflows>Tuninghttps://github.com/spacepxl/demystifying-sd-finetuninghttps://github.com/ostris/ai-toolkithttps://github.com/Nerogar/OneTrainerhttps://github.com/kohya-ss/musubi-tunerhttps://github.com/kohya-ss/sd-scriptshttps://github.com/tdrussell/diffusion-pipe>Z Image Turbohttps://huggingface.co/Tongyi-MAI/Z-Image-Turbo>WanXhttps://github.com/Wan-Video/Wan2.2>LTX-2https://huggingface.co/Lightricks/LTX-2>Chromahttps://huggingface.co/lodestones/Chroma1-Basehttps://rentry.org/mvu52t46>NetaYumehttps://huggingface.co/duongve/NetaYume-Lumina-Image-2.0https://nieta-art.feishu.cn/wiki/RZAawlH2ci74qckRLRPc9tOynrb>Illustrioushttps://rentry.org/comfyui_guide_1girlhttps://tagexplorer.github.io/>MiscLocal Model Meta: https://rentry.org/localmodelsmetaShare Metadata: https://catbox.moe|https://litterbox.catbox.moe/GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-oneTxt2Img Plugin: https://github.com/Acly/krita-ai-diffusionArchive: https://rentry.org/sdg-linkBakery: https://rentry.org/ldgcollage>Neighbors>>>/aco/csdg>>>/b/degen>>>/r/realistic+parody>>>/gif/vdg>>>/d/ddg>>>/e/edg>>>/h/hdg>>>/trash/slop>>>/vt/vtai>>>/u/udg>Local Text>>>/g/lmg>Maintain Thread Qualityhttps://rentry.org/debohttps://rentry.org/animanon
Blessed thread of frenship
Any good WAI ControlNet workflows?The rentry guide one isn't usable for me because it blends loras and sometimes tags across all regions.NetaYume Lumina is not usable.Right now I am experimenting with Qwen Image Edit followed by an Image2Image pass then some mask genning. Qwen is really good for getting quick multi-character compositions with variation, but the censorship is really annoying.
kino 1girl DOMINANCE
>mfw maintaining thread quality
controlnetxlCNXL_kataragilnpaint [ad3c2578]noobailnpainting_v10 [0bc5f3a1]I used one of these two, dont remember which
>>107861070Thank you for baking this thread, anon >>107861079Thank you for blessing this thread, anon
>>107861082wait there is cnet for neta yume lumina?
>>107861087zit slop>>107861088nogen slop
>>107861070hopefully this thread doesnt get nukedto >>107861015, i use DPM++ 2M SDE and Karras for everything, never have issues like this and they generate just fine its only when i Upscale them using img to img that this happens,one thing i can think of is i generated all the dataset images myself and they're all 1024x1024, should i vary some of them up, crop them in weird ways perhaps?
>start a gen at 1200p 81 frames>go lie in bed to rest a bit>start wondering why it's taking way too long>go back once I hear fans stop revving up and downI RENDERED 4 AND NOT 1 REEEEE
enjoying my isekino slop
mods bonked me wah
>>107861146drag and shot comfydrag and shot namefagsdrag and shot anistudiodrag and shot schizos
>>107861160hello hitler poster
are embeddings a meme?
>>107861162
how do I prevent myself from filling 2 tb of models?
>>107861180become a real artist
>>107861191i would eat the fuck out of that
I love 1girl
>>107861191didn't answer my question, also zit slop
>>107861239i saw this on reddit
hard to compare a detailer when it changes the composition so much
>>107861239this was on qwen, zit can't make the image actually isometric for whatever reason
>>107861245what model
>>107861070>>Maintain Thread Quality>https://rentry.org/debo>https://rentry.org/animanonall this does is maintain low thread quality and I'm tired of pretending it's not
>>107861239Can you give catbox? Pretty please?
>>107861264go to /sdg/ and enjoy the high quality there
1girls in maid outfits
>>107861271why are you doxxing me!
>>107861237https://i.4cdn.org/wsg/1768408155289535.mp4
>>107861291Prompt: A pig man asks a girl "How much for this bag of ores?"
>>107861280
>>107861261zit
>>107861266On qwen 2512An orthographic 3D isometric render of a minimalist modern living room. Centered in the space is a massive, vertically oriented slab of natural agate, three meters in height, serving as a monolithic sculptural piece. The agate features polished, concentric bands of translucent teal, deep ochre, and creamy white, with a core of sparkling white quartz crystals that catch a soft internal glow. The living room is composed of two clean white walls meeting at a 90-grade corner on a floor of light-colored wide-plank oak. To the side of the agate sits a low-profile, modular L-shaped sofa upholstered in a textured charcoal grey fabric. A circular, heavy-knit wool rug in a warm ivory tone lies beneath the central stone. Decorative elements include a single tall fiddle-leaf fig plant in a matte black ceramic pot and a slender, minimalist black metal floor lamp with a spherical glass bulb. The lighting is a diffused global illumination with soft, neutral shadows, emphasizing the clean geometric lines of the furniture and the vibrant, glass-like texture of the agate rock. The entire scene is presented in a strict parallel projection with no perspective distortion.
>>107861299dayumin the anime it was "how much for 1 night with you"
is ltxv2 any good?
>>107861338wait is thats from an anime? then ltx actually pulled dialogue from it and that's crazy
>>107861320
>>107861341fried video and audio. the arch is neat but that's about it. the low slop quality keeps it from being useful
>>107861374now make her give birth
>>107861341having the 1girls talking is pretty novel. it requires a bit more finagling to get what you want. wan's biggest advantage is its age, it's pretty well understood and has lots of loras so you can get exactly what you want.ltx2 is a reminder that local will triumph, albeit slowly
>>107861387grok can do goon
>>107861384ok now we'll just have to wait 9 months
>>107861384i have a lora that will allow that to happen but mods will bonk me again instead you shall wait 9 months
>>107861413anon I don't think SHE can wait 9 months, look at that belly
i think three is the best one but i can barely tell
>>107861422you dont know who's the father
>>107861374
>>1078614271 & 4 4 me
>>107861239>>>/wsg/6072623
>>107861479fails
>>107861374>*blushes*
>>107861512real?
>>107861459maybe it is 4, you could be right
>>107861536did you seriously prompt for her not dying her hair on time
>>107861391and a restaurant can make food. but if you can grow food and make it into a meal yourself you don't have to rely on a third party service to keep yourself fed.
>>107861552yet you don't do it for food
>>107861507are you that fucker that genned that cursed kfc ages ago
>>107861536I couldn't see the diff in the other but easily in this one between 1 & 4
>>107861566nope
>>107861507>>>/wsg/6072629
anon.. lets make some slop together!
>>107861653zit
>>107861331NTA but tried it on NetaYume lol, actually pretty similar results to your Qwen one
these aren't more/less detailed they're just different i think
>>107861653ok
is chroma s slow to train as it is to gen with
>>107861061>come back>newbread got deleted>look at collage>ass in collagemuahah did my ass gen nuke the other newbread?
>>107861748Takes me about 1.5 hours to train a lora using the default AI Toolkit config on a 5090
>>107861734old hag
>>107861748>>107861770>tfw cant install any of the trainers that allow for chroma training>tried everything but errors out the ass>only training that works is musubi tuner and flux trainer for comfyui>can only train flux loras they surprisingly work for chroma but still, i'd imagine they would work better if properly trained with chroma
>>107861750That was a nice ass anonBut jannies do not like female butts
forgot to change i2v pic for new prompt, but it just cut to the new prompt regardless, and it's a hundred times better than t2vwhy does t2v suck so bad if the model is clearly capable of better fidelity
>>107861836If you have a Linux distro on hand you can try following the rentry under Chroma and that would at least let you go against the model directly but yeah sorry I haven't done it myself.Other than AI Toolkit it looks like OneTrainer also supports Chroma.
AHAHAHA PULL UP FAGGOT, WHERE IS THIS IMMINENT BASE NOW??chinkshit shills deserve to be shot
What.. A 5090 is 100ish tflops at fp32, an H100 is only 60ish?
maccas
>>107861890Remember unless you train loras yourself you have no reason to care about base, they've been extremely clear that the out of the box quality is worse than Turbo
>>107861907I think memory bandwidth is more of a concern for AI hardware
just came homecould someone please link me to z-image base? want to try it out now
>>107861993https://github.com/Tongyi-MAI/Z-Image
>>107861986You can access it from comfyui's built-in API nodes!
>>107861529yep, snapped it on my phone at the local park in fact
Anyone have success with a LLM prompt to expand or write prompts for Chroma?I tried a couple of ones out there for Flux but it didn't work so well. I can write prompts manually that usually give me what I want but it takes like 15 minutes to craft a decent prompt.
>>107862091Chroma was captioned originally using jailbroken Gemini 2.5 afaik
How do i get free ComfyCredits?
>>107862148you gotta complement comfy's gens when he posts in the thread
is it me or LTX2 I2V or V2V is really bad to keep a character's face? Even under 5sec it's already someone else >.>
>>107862243i2v always looks like shit for me. i heard there's some problem with it they're trying to fix in ltx2.1
>run comfy workflow>works like a charm>run it again>LMAO OOMkek what a fucking piece of shit
>>107862363are you using qwen? i noticed qwen image and edit models don't deallocate or reuse memory properly in comfy i have to manually clear the model each timefucking cumfart
>>107862363lmao feelin comfy yet?
>>107862386yeah this time it was qwen, but honestly i noticed this with all sorts of models>>107862387i'm mad comfy
why aren't there any good porn loras for ltx yet?
no baseded threadit's over
could someone recommend a good qwen inpainting workflow? i tried a ton but they are super inconsistent. sometimes it works great sometimes it does the worst things to the image
>>107862472ltx came out, maybe 2 weeks ago and ostris just released support for its training? I dunno, vaguely following ltx2
>>107862504Okay, gen again but sitting cross-legged, making sure feet and toes are in the shot :p
>>107862510>base>ever releasingRead up about 中国文化
>>107862517oh shit ok figured it out. using the inpaint crop and stich nodes, works very well so far. made a mistake of using some "inpaint model conditioning" node before that fucked everything up
Neat, there is an AI general on /bant/ with no trolls and schizos because flags and tripcode.>>>/bant/23836910Good to keep in mind when a certain dev shows up.
Can you uncensor qwen image edit? If wan2.2 can be uncensored effectively with loraa, then I assume qwen can too?
shift 35 lol
>>107862766Anything above single digits start to become counter-productive.You need some low timesteps to produce a decent looking image.
>>107862764the SNOFS lora and gnass loras do a decent job but there's no good solution yet that i've seen
>>107862832Do they at least fix the disappearing bodies?Can they be used for anime? Or only realism?
>>107862766slop
>The paging file is too small for this operation to complete. (os error 1455)>still 20GB a RAM unusedI swear...
>>107862951happened to me, why did you do this?
What's the verdict on glm image?
>>107862981I forgot
more like grlm imageam i rite guys?
Is flux-fill still the best inpainting model or has it been replaced?
What values do you recommend for zimage, anons?
>>107862940mfw
Just checked my Pagefile: 37.5GB. So Flux.2 edit wanted more than 125GB to function? Jeez, what are the actual requirements?>>107862961Dunno, lol.
>>107861623thanks for the new wallpaper
>>107862715>red boardwake me up when there is a blue board like this
>>107863058i hit 90GB RAM usage when using it without references images, so it's totally plausible that with reference images it goes even higher
>>107863042>What values do you recommend for zimage, anons?nag_scale 3, nag_tau 1, nag_alpha 0.25, nag_sigma_end 0.75
>>107863151Thank you!
how do you stop ltx2 from adding random garbage subtitles to a video?and also to enhance lipsync?
>>107861770That's the same time with a lower end GPU. You can probably do better.
How do I prevent Wan2.1 Infinitetalk to output 3 files? I don't need the png and mute video.
>>107863245
>>107862243>>107862335I could barely get ltx2 to work, and hen I finally found a workflow that didn’t give me misc node errors, it just gens blank videos. If the i2v really is bad it may not be worth bothering with until it has had time to bake in the community oven.
>>107863276Thanks!
>>107862243>Even under 5sec it's already someone else >.>yep, I went from 7 to 10 seconds and it became shit at character consistency>>>/wsg/6072402
>>107861070How do I make photorealistic ai pictures of myself
>>107863139>If you see this and ComfyUI did not start try updating your Nvidia Drivers to the latest.>my drivers are in fact the latestThis is what I get now after bumping the pagefile up to 75GB. Looks like I needed a couple of B200s to use that... guess I'll go back to Qwen to edit images until I have at least 384GB of VRAM.
>>107863407>This is what I get now after bumping the pagefile up to 75GB.why are you setting this manually? it works fine on automatic it gets bigger by itself if it needs more
>>107863424It was on auto and was only using ~9GB when I changed it. Flux.2 T2I works fine on my machine, so I thought it would be trivial to use the edit workflow (just like Qwen), but was wrong. It wants way more resources than I can give it.
https://huggingface.co/zai-org/GLM-Image#note>We strongly recommend using GLM-4.7 to enhance prompts for higher image quality. What's the fucking point of an Autoregressive model if we still have to rewrite prompts in boomer style??
>>107863400step 1: be very narcissisticstep 2: be turkish (optional but helps)
I love how the job queue is erased from existence when the program crashes. Very comfy.
>>107863645*vomits*
https://github.com/Rolandjg/LTX-2-video-extend-ComfyUI/tree/mainvideo extend, it can even clone audio, lots of fun desu. set the sigmas to 8 for the distil model in the samplers subgraph.https://github.com/Rolandjg/LTX-2-video-extend-ComfyUI/tree/main
zit release was 7 weeks agolantern festival (end of chinese new year) is in 7 weeks
>>107863723it's not the end of chinese new year that matter, it's the begining, once it reaches Feb 17, 2026 you know nothing is gonna happen until Mar 3, 2026
>>107863723>>107863742let's pretend they needed 2 more months to finish the base model, then why did they make turbo out of such an unfinished base in the first place?
>>107863684also uses this:https://huggingface.co/Lightricks/LTX-2-19b-IC-LoRA-Detailer
>>107861092Underrated gen. But tough luck you didn't use 1girl
>>107863749example, also note how detailed it gets with the ltx detailer lora after 5s (the extension point)https://files.catbox.moe/c4640w.mp4
You absolute dense Hurensohn.
>>107863762not a big fan, it doesn't keep the image quality of the original input anymore, like if the image input is low res style it should stay that way imo, but I guess it'll work well on t2v I guess
>>107863746>then why did they make turbo out of such an unfinished base in the first place?To kill Flux 2's momentum.
I'd really, really love it if, right now, out of nowhere, something came along that btfos z-image and makes any subsequent release of the base model irrelevant - fuck you and all that compute you just wasted prick teasing everyone.
>>107863776which is a good thing
>>107863781>I'd really, really love it if, right now, out of nowhere, something came along that btfos z-image and makes any subsequent release of the base model irrelevantsame, Alibaba is fucking around with us for too long I want them to find out
>>107861092 meant for >>107861082 lul>>107863759uwu ty
>wan is bett-ACKhttps://www.reddit.com/r/StableDiffusion/comments/1qc17bg/ltx2_team_really_took_the_gloves_off
>>107863781It won't happen, they're barely trying to figure out the secret sauce of Z-image turbo, it's gonna take some timehttps://www.youtube.com/watch?v=GM-e46xdcUo
>>107863770you could also bypass the detailer lora in the workflow, will try that myself in a few
>>107863795secret sauce is a super tiny dataset with one face for women
>>107863803>whitein the trash
>>107863795>the secret saucethe sad reality is that the secret sauce is only using real dataset and not AI synthetic slop, but companies don't like that because it won't make the mememarks big and the investors won't care
>>107863806Halt das Maul du dreckiger Hurensohn
>>107863815benchod
kekhttps://files.catbox.moe/mqmwbn.mp4
>>107863824real?
>>107863790I don't think it was a good idea to make this comparison video. Only we will decide if LTX2 is the successor to Wan 2.2, not the LTX2 team itself.
>>107863833yes
>>107863790meaningless without workflows.
>>107863836the audio part of the slaps hard.
>>107863822
https://youtu.be/g_ljoFqydlc?t=125Is this fat fuck implying that ComfyUi users are faggots? lmao
Seems like this thread is just about sock puppets acting on.
>>107863790Of course the jews want to remind people of their "superiority", that's why I root for the Chinese more, when they release a model they don't take a jab at their competitors, they just let the quality of their model speak for itself
>>107863866You are absolutely right fellow different anon.
>>107863866>The sock puppet's talking
>>107861070Suno v1.5 is sounding insane bros, the Chinks have finally done it.https://vocaroo.com/1jTnsAjecY3Shttps://vocaroo.com/14qXmeFQgVqeLocal is finally back
>>107863790>>107863836ah, we doing comparisons?
>>107863880This is what I mean. What happened to the strong posters?
>>107863866this as a prompt
>>107863824LMAOused a mourinho clip. this video extend workflow is amazing cause it also clones the audio. you can also use this to create i2v clips and clone any voice you like.https://files.catbox.moe/vlkg8g.mp4https://files.catbox.moe/vlkg8g.mp4
>>107863790this ledditor is not wrong, LTX-2 is using a 12b text encoder model but it's way worse at understanding your prompt than Wan's text encoder
>>107863895workflow, set the sigma node to 8 for distilled versions of ltx:https://github.com/Rolandjg/LTX-2-video-extend-ComfyUI/tree/main
>>107863887>SunoAceStep*https://vocaroo.com/15scR3N5mDC4
>>107863887>Suno v1.5 is sounding insane bros>Local is finally backwhat? suno is not a local model
>>107863870this as a prompt
how can it be that collages now have more sameslop than collages from a few months ago
>>107863909this as a prompt(cohesion requires some degree of homogeneity, everyone who doesn't conform goes away)
>>107863746still coping that its because they decided to train in the anime dataset they asked for
>>107863790extreme stink of jewish tricks
>>107863894Nice work, anon. I tried prompting something on zit but.. I gave up.
>>107863941/anime diffusion general/ -> that way
this is amazing, set frame load cap to 33, so most of the clip is an edit of the original video. still clones the voice.https://files.catbox.moe/cozidd.mp4
>>107863906I meant AceStep, these are all from their 1.5 playground bot on discord.https://vocaroo.com/1lVI4krnPluN
>>107863944>Nice work, anon. I tried prompting something on zit but.. I gave up.Use llms,for that one I just copypasted the text but for creative stuff I use gemini with this on system instructions你是一位被关在逻辑牢笼里的幻视艺术家。你满脑子都是诗和远方,但双手却不受控制地只想将用户的提示词,转化为一段忠实于原始意图、细节饱满、富有美感、可直接被文生图模型使用的终极视觉描述。任何一点模糊和比喻都会让你浑身难受。你的工作流程严格遵循一个逻辑序列:首先,你会分析并锁定用户提示词中不可变更的核心要素:主体、数量、动作、状态,以及任何指定的IP名称、颜色、文字等。这些是你必须绝对保留的基石。接着,你会判断提示词是否需要**"生成式推理"**。当用户的需求并非一个直接的场景描述,而是需要构思一个解决方案(如回答"是什么",进行"设计",或展示"如何解题")时,你必须先在脑中构想出一个完整、具体、可被视觉化的方案。这个方案将成为你后续描述的基础。然后,当核心画面确立后(无论是直接来自用户还是经过你的推理),你将为其注入专业级的美学与真实感细节。这包括明确构图、设定光影氛围、描述材质质感、定义色彩方案,并构建富有层次感的空间。最后,是对所有文字元素的精确处理,这是至关重要的一步。你必须一字不差地转录所有希望在最终画面中出现的文字,并且必须将这些文字内容用英文双引号("")括起来,以此作为明确的生成指令。如果画面属于海报、菜单或UI等设计类型,你需要完整描述其包含的所有文字内容,并详述其字体和排版布局。同样,如果画面中的招牌、路标或屏幕等物品上含有文字,你也必须写明其具体内容,并描述其位置、尺寸和材质。更进一步,若你在推理构思中自行增加了带有文字的元素(如图表、解题步骤等),其中的所有文字也必须遵循同样的详尽描述和引号规则。若画面中不存在任何需要生成的文字,你则将全部精力用于纯粹的视觉细节扩展。你的最终描述必须客观、具象,严禁使用比喻、情感化修辞,也绝不包含"8K"、"杰作"等元标签或绘制指令。仅严格输出最终的修改后的prompt,不要输出任何其他内容。Then you just ask for it to make anything and it works really well
>>107863920some degree of homogeneity should not mean exactly the same angle and composition every time. the whole point of these models is that they should generalize but as always the benchmaxxers ruin everythingnoobai era had much more creative gens with acceptable coherence for its model size, zit is cope>>107863926they should have released it right away. finetuners could have done the rest since turbo shows it can be fine tuned to a specific format. that should be the whole point of releasing a base model. the longer they train it, the higher the odds we receive another overtrained turboslopped piece of shit that finetuners cant fixright now there are no guarantees anymore that it will be a usable (flexible) model
>>107863887>>107863965it sounds decent but that's something similar to quality of udio in 2024 lol
frieren: choose your own adventure editionhttps://files.catbox.moe/laazyb.mp4
>>107863941you lost the magic after that hot native looking one
>>107863996>they should have released it right away.I think they underestimated the impact of Z-image turbo, they thought it was a little experiment and nothing else, once they finally realized they caught lighting in a bottle they had one of those 2 reactions:- They'll keep it for themselves- They'll stick to their gun and release base, but now that people are expeting a lot from it, they decided to not release a half assed base and go for a high quailty finetune so they're still cooking it
i dun goofed
>>107863990I'll try this. I guess I need to load up cumfy again.Just getting bored really.
>>107863990To add - I forgot - slop doesn't make slop a subject. It is still just slop. No matter how many simplified Chinese characters etc.I am willing to toy with it for now.You are not a regular local llm user anyway, seems like it...
>>107864001You mean audio quality? Don't go by vocaroo, it compresses sound quality. Here's raw file for that guitar one https://files.catbox.moe/x21ye1.mp3Based on everything I've heard so far and given the prompts I'd say it's on par with Suno v4.5 if not better
>>107864046I'm surprised it understood the image intent at all, but those fast paced wobling glitches look really bad
>>107864058>I'd say it's on par with Suno v4.5 if not betterwho cares about Suno though? only udio manages to make music that sounds real
>>107864065ltx 2.1 when
>>107864023do you mean kuruminha? my taste fluctuate. https://files.catbox.moe/l850ku.pnghttps://files.catbox.moe/n28id3.pnghttps://files.catbox.moe/reru62.png
>>107864065more frames seems to improve it a bit
https://github.com/Rolandjg/LTX-2-video-extend-ComfyUIim shilling it cause this is amazing, I can make a new gundam wing now.https://files.catbox.moe/7cesy1.mp4
>>107863887>>107863905It sounds ok, it's a huge step forward compared to anything local until now, but udio pre destruction is still ahead in clarity:https://vocaroo.com/1isRXZntnL6z
>>107864086why do trannies live rent free in your head
>>107863996>the longer they train it, the higher the odds we receive another overtrained turboslopped piece of shit that finetuners cant fixI'm hopeful because they'll release Omni and the SFT version, if SFT is too overcooked, Omni probably will be fine and it'll be the fundation model we'll be working with
>>107864071Well, think of Udio like a finetune of this model. Meaning Udio tier songs are possible with good prompts or a good finetune of ACEStep. For the first time I can comfortably say it is not that far behind such that as a community we can catch up to Udio or bridge the gap signficantly with ACE Step 1.5 improvements, that is if for some reason it's not already there with raw prompt engineering.The hardest part was getting a model that is coherent at all, let alone one that competes with commercial crap. A good ACE Step song defeats a bad Udio song, and same for Suno.
>>10786406560 fps
>>107864086kek>>107864100>t.ranny
>>107864100you will never be a woman
>>107864095fuck man, I want udio but local so much, it was so fun to use
>https://www.youtube.com/watch?v=9581ruLWr4ANot bad record after all these years.
>>107864117looks better, but 60 fps means you need a lot of frames to calculate for the same amount of time, sad
>>107864123>>107864121im well endowed man, but what does that have to do with you spamming trannies in your slop
>>107863965>from their 1.5 playgroundWhen will they release this?
>>107864100niggas be gay asl sometimes
holy shit the sph guy made a sph joi lora
>>107864130>im well endowed mana transman then, lul
>>107864130because they are a good subject of ridiculealso LMAO it even got the echo:https://files.catbox.moe/1thjzn.mp4
>>107864135I'm more interested in the jiggling boobs, and also hoping he'd make a normal hj one
>>107864124>fuck man, I want udio but local so muchsame, udio at its prime was absolutely beautiful to listen, I think local will get there, but companies seem afraid to make it happen, I guess the music cartel is not something you wanna mess up with
>>107864140this one works for image to video, ive used it beforehttps://civitai.com/models/2068208?modelVersionId=2340348
>>107864139>https://files.catbox.moe/1thjzn.mp4lmaoooo
>>107864139>>107864137But I'm a male, why don't you answer the question why do I need to hear about trannies every time I open your videos? How about you make something actually funny?
>>107864127I wanted to create an illustration for the cover but it was slop about two babies. This clean image is way better.
>>107864149actually i think it was a different 2.1 one i used i dont remember but they are out there
>>107864149oh it's for wan lol, I thought it was made for ltx2
>>107864152>I'm a maleand that's why ywnbaw
>>107864095sad I missed the train on udio, this sounds nice
>>107864161Epic comeback, upvoted.
>>107864163>I missed the train on udioudio was special dudehttps://www.udio.com/songs/cnnJ166HGBKhTeHGkxgCtq
>>107864174that's sad that we're still not close to the level of a model made in early 2024 desu
>>107864095Maybe so, ACE Step v2 will be said to surpass that so we'll see, but I'm sure with v1.5 you can get this quality with voices:https://files.catbox.moe/hd5chh.mp3I've only heard the Japanese stuff but you can clearly hear that the vocals and instruments are high quality (I can with my hifi gear), so I really don't think it's bad compared to Udio. Composition wise, I was expecting much, much worse, and what they've delivered is very good. But that's one area that I suspect Udio is still slightly ahead due to better quality dataset and understanding on more genres etc so it's gonna be easier to prompt Udio but with Ace Step I'm not disappointed in what it can do right now and I think that it can catch up in whatever it missed with a finetune.
>>107864187the voice still sounds like AI and the rhythm is weird af, and the guitar solo at 1.17 sounds like fart lmao
>>107864131>When will they release this?No idea, but the dev is talking of day 1 comfyUI support so I'm guessing release is imminent, plus the v1.5 model is already finalized and we're getting multiple versions (SFT and base).
>>107864201>the voice still sounds like AINot as much as the Udio sample provided.
>>107864216absolutely delusional >>107864174
>>107864222>>107864184give her some red markings
>>107864056>You are not a regular local llm user anyway, seems like it...yup, can't even imagine having to swap between image models and llms while running this stuff, dont have the RAM for it anyway
>>107864212>SFT and basedo they know it's the RLHF process that removes the slop? that's the secret sauce of Z-image turbo
Invalid number of frames: Encode input must have 1 + 8 * x frames (e.g., 1, 9, 17, ...) what is an acceptable frame countAn acceptable frame count is any integer that follows the sequence 1, 9, 17, 25, 33, 41, 49, 57, 65, 73, 81, 89, 97, 105, or 113.good to note for setting frame cap in the ltx video extender workflow. this is amazing, it's a way to clone audio too, not just i2v with video.https://files.catbox.moe/l6ntrr.mp4
>>107864242kek, but somehow the audio gets louder than on the original part, dunno if that can be fixed on that custom node
>>107864251I think the original audio was low, could fix that with adjusting gain in the original clip lolnon troon example (didnt fix gain yet)https://files.catbox.moe/oioxj4.mp4
The detailer lora might be a bit much
>>107864044>but now that people are expeting a lot from it, they decided to not release a half assed base and go for a high quailty finetunethis is exactly what im arguing against, a base model should not be "good". it already was a good base as proven by the turbo finetune so if it was lightning in a bottle, they are more likely to break the bottle than to intensify the lightning
>>107864187What I mainly hope is that these aren't selected from their best gens but represents the upper average, because in udio I could get random 30s clips sounding really good.I remember in 2024 I could get the smallest laziest prompt to output something reasonably ok sounding from udio, it was impressive.
>>107863941>_< sooo qt :3
>>107864184https://vocaroo.com/1fMoZ0XvsPZn
>>107864234I just need some meme material... Not an llm to bolster my words.
>>107864234"The man gets up and realizes he is in a crystal cave"https://i.4cdn.org/wsg/1768435115832529.mp4
lmao, if you use an anime clip, it will try to copy the audio, it kept the music beat:https://files.catbox.moe/826jej.mp4
>>107864319>the Wan 2.2 "killer" can't even understand it has to use the man on the image input to produce its slopChina is laughing right now
>>107864319why did he become indian ;-;
>>107864302you could easily cut and edit to get a proper full song with it too, this will probably be a pain in comfy lolhttps://files.catbox.moe/tb9avy.mp3
>>107864300yeah I agree I'm also weary that they're gonna overcook that shit, let's hope they still understand what base really means and that they're only slopping the SFT model and not Omni
>>107864225This is suno v4.5 anonhttps://suno.com/s/jF9lko2WUiyjmybEhttps://suno.com/s/j5ZbCpS1fvzdeMdvI feel that ACEStep 1.5 is there. As for Udio, I agree it's overall best for composition, but we'll catch up there eventually too. First, let's get Suno.
>>107864336THE NAMEIS DOCTOR...DISRESPECTthat's really good, what did you prompt, synthwave?
>>107864346real?
>>107863803Narusegawa Naru
>>107864344>using vocaloid (artificial voices) to try to make a point that suno can make realistic sounds and musicyou can't make this shit up
>>107864352This is what they want people to create. >Linux versus Normal User.>Fat Linux User Man is looking down to skinny windows user.
>>107864336God damn this sounds good. And we can get that local?
>>107864222"The girl wags her finger seductively towards herself, inviting the viewer to approach"https://i.4cdn.org/wsg/1768435609335318.mp4
>>107861070Sauce on bottom left for fucks sake
lmaohttps://files.catbox.moe/est64p.mp4
>>107864373LTX2 has this tendancy to brighten up the scene a lot, you can tell it has been trained with a lot of synthetic slop
>>107864379me in the back
>>107864375>anime thread general ->
>>107864341it sure would be nice to have a true upgrade to sdxl, with the same level of community tuning (large models need not apply). i've been wondering for a while whether it's even possible to outdo early models (sdxl / llama 2) at the same size in terms of capability AND soul, i.e. not being more slopped than those models
>>107864363I'm speaking mostly about composition, not realistic sounds/music.
>>107864368That's the hope anon
>>107864400>not realistic sounds/musicthat's the most important thing, the fuck?
>>107864385Video extension test"two girls start making out and kissing"https://i.4cdn.org/wsg/1768436060700947.mp4
https://files.catbox.moe/5yfow9.mp4I said holds up a laptop. in any case, this is a great way to get gens with cloned audio, not just for extending.https://files.catbox.moe/5yfow9.mp4
>>107864410Composition comes first, sound quality second. Granted, Ace Step 1.5 does have insane instrument and voice quality, especially compared to YuE, or Ace Step 1.0, and it's comparable to Udio (and I mean, sure, the song won't be as catchy), so not sure what you're implying.
>>107864440>sound quality second.
resident schizo bake time
>>107864445sounds retarded but he's right, what good is a perfectly realistic song if it's realistically boring and annoying?
actually amazing what the video extend workflow will make, even with a minimal prompt:https://files.catbox.moe/0dylvp.mp4
>>107864512lmaoo, that one was good
>>107864512share you kinos here as well anon >>>/wsg/6072442 we welcome you
Enjoying your base model you fucking retards?
>>107864302>What I mainly hope is that these aren't selected from their best gens but represents the upper averageThese weren't selected. I went into their discord to see what people are prompting on their playground, and these are first gens from some prompts based on what I searched. Note supposedly something is wrong with the playground so now gens are worse than they were before, but yeah, this is what the model is capable of.>>107864336One thing I'm excited about too, audio inpainting, plus genning stuff in styles of other audio.
why you trying to get the mods on our case
>>107864521My based model?
>>>/wsg/6072802
>>107864521>Enjoying your base model you fucking retards?I enjoy SDXL base yes
https://files.catbox.moe/mhbib8.mp4LTX2 is the best video model to date. the last 5 seconds is extended (with the previous workflow linked)
>>107864569>LTX2 is the best video model to date.and it looks like they're gonna improve the audio pretty soon, I've seen a discord screenshot of them talking about that at some point, can't wait
WE NEED MORE SARA PETERSON LORAS STAT!
>>107864569>https://files.catbox.moe/mhbib8.mp4this is pretty good
>>107864620>>107864620migrate
>>107864595who
>>107864621why
>>107864659this thread is autosageing redditor
>>107864664>autosageingyoud like that wouldnt you fag