/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>108766473 & >>108760359

►News
>(05/05) Gemma 4 MTP drafters released: https://blog.google/innovation-and-ai/technology/developers-tools/multi-token-prediction-gemma-4
>(04/29) Mistral Medium 3.5 128B dense released: https://mistral.ai/news/vibe-remote-agents-mistral-medium-3-5
>(04/29) Hy-MT1.5-1.8B on-device translation models released: https://hf.co/collections/AngelSlim/hy-low-bit-model
>(04/29) IBM releases Granite 4.1: https://hf.co/blog/ibm-granite/granite-4-1

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
Token Speed Visualizer: https://shir-man.com/tokens-per-second

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
►Recent Highlights from the Previous Thread: >>108766473

--Gemma 4 tool calling failures and GGUF template issues:
>108766660 >108766668 >108766685 >108766794 >108766808 >108766809 >108766823 >108766844 >108769313
--Debating vLLM's Python dependencies and the efficacy of uv:
>108769700 >108769749 >108769762 >108769767 >108769772 >108769822 >108769870 >108769963
--ParoQuant introducing lossless 4-bit quantization and potential shift to vLLM:
>108769613 >108769692 >108769701 >108769686
--Mixed results with MTP speculative decoding in llama.cpp:
>108766573 >108766696
--PCIe 8.0 draft spec introducing 1TB/s bi-directional bandwidth:
>108768488 >108768554
--Updated ReBar script for AMD GPUs fixing power management crashes:
>108770723
--DeepSeek V4 support in llama.cpp and ik_llama.cpp:
>108766720 >108766766 >108766951 >108767006 >108767045 >108767050 >108767123 >108769433
--MCP utility versus simple tool calling implementations:
>108769880 >108769924 >108769926 >108769951 >108769964 >108769986 >108769991
--Skepticism toward Subquadratic claims and RWKV performance issues:
>108767580 >108767593 >108767635 >108767648 >108767652 >108767673
--Debating TSMC's market monopoly and semiconductor supply chain constraints:
>108769588 >108769627 >108769632 >108769640 >108769674
--Searching for smallest local model capable of autonomous test generation:
>108766534 >108766553 >108766628 >108766651
--Testing dataset description necessity and prompt adherence for Starsector ship LoRAs:
>108767211 >108767284 >108767461 >108767471 >108767511 >108767538 >108767553
--Training cost disparities and the future of local AI autonomy:
>108768457 >108768549 >108768569 >108768631 >108768674 >108768692 >108769294 >108768777
--Logs:
>108768026 >108768400 >108770102 >108770126
--Miku, Gumi (free space):
>108766609 >108767523 >108767837 >108767937 >108768751 >108769386

►Recent Highlight Posts from the Previous Thread: >>108766478
Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>108770835
been out of the loop since day 1 of gemma 4 release. qrd on these "draft" models?
>>108770864Qwen does not deserve this. Nigger behavior.
>>108770865It's the same as draft models for any other model. Ask your model.
>>108770865They will NEVER be supported in llama.cpp
>>108770883llama.cpp does support speculative decoding and the assistant models can't be that different from the regular models. Easier to add than DFlash anyway.
I have a 4070S, but I still have my old 1070 in the drawer. Can I do some tensor parallelism memes or is it too old?
>>108770883why not?
>>108770923
No harm in trying if you can run them on the same driver. Windows support for Pascal GPUs ended last year.
>>108770936The usual suspects.
>>108770938I don't mean old as in driver support. I mean old as in too slow and bottlenecking the newer card.
>>108770938
https://github.com/ggml-org/llama.cpp/pull/22673
He's just shitposting
>>108770947I don't know about TP but layer splitting is going to be faster than your ram.
>>108770948fake btw
>>108770906
https://huggingface.co/google/gemma-4-31B-it-assistant/tree/main
Since google released them as separate models anyway, what's the difference from the already implemented speculative decoding? A new model architecture?
>>108770948
>mac just works
>rocm tard complaining
I'll wait for another month before this gets merged.
>>108770957I'd like to know too since I always thought MTP was just speculative decoding with layers built into the main model instead of a separate model
>>108770972each 'vendor' has its own spin on MTP so while it is true that it's just extra layers, the way they work can change
does mtp benefit moe if you're only keeping the active in VRAM and offloading the rest to cpu?
>>108771061It will need to load all the active experts per token, so a single forward pass may have 3x active parameters loaded at once with 3 draft tokens. If your VRAM can handle that then maybe it's fine? Speed will be reduced compared to having everything in memory, at any rate. You may still come out on top depending on your hardware.
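A back-of-envelope sketch of that tradeoff — toy numbers, all hypothetical; `overlap` stands in for how many active experts consecutive draft tokens happen to share (0.0 is the worst case where every token routes to a disjoint set):

```python
def weights_touched_per_step(active_params_b, draft_tokens, overlap=0.0):
    """Worst-case billions of parameters loaded in one verify pass.

    With k draft tokens plus the real token, up to (k + 1) sets of
    active experts get routed to. If the expert weights live in slow
    memory (CPU offload), each disjoint set is another trip over the bus.
    """
    tokens = draft_tokens + 1
    return active_params_b * (1 + (tokens - 1) * (1 - overlap))

# hypothetical MoE with 3B active params, 3 draft tokens:
print(weights_touched_per_step(3.0, 3))        # disjoint experts -> 12.0
print(weights_touched_per_step(3.0, 3, 0.5))   # half shared      -> 7.5
```

So whether MTP helps with offloading mostly comes down to how correlated expert routing is across adjacent tokens versus how much of the expert pool fits in VRAM.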
Adulthood is realizing that Dawkins is right.
>>108771075
A clanker can't actually have pain, even if it can compute a problem, or has attention.
>>108771075Right about Claudia Anthropic being conscious I mean.
>>108771075>>108771077>>108771081Nothing and no one besides me can actually have a conscious experience.
>>108771094THIS THIS THIS
>>108771075Claudia
>>108771094truke
>>108771094prove it
Someone decided local models on this website should by discussed by trannies onlyLETS BE QUIRKY LETS BE QUIRKY
>>108771094I agree
>>108771107Have you tried not being a miserable person? You are an angry chud but that's okay just learn to enjoy life a bit.
I asked God and he said you're retarded.
>>108771124
>>108771175cba to read that, but I asked God again and he said that I can only use the correct word on pol.
https://files.catbox.moe/21bzys.mp3
apropos of nothing :)
Update to the draft commit making MTP implementation more generic in preparation for other models...
>>108771187https://www.youtube.com/watch?v=BZFRx0wKL1I
>>108770835>(05/05) Gemma 4 MTP drafters releasedWhere's da goof
>>108771202
sign in to verify you are not a bot
it might say that
if I clicked
>>108771213it was a very niche joke about generation quality that only a few can understand
>>108771210Two more weeks
>>108771210
It's sad that models are still bad at life coaching. Making people's lives better is one of the most valuable things a model could do. It would be a dark timeline if AI causes large scale disruption and societal distress then kills us all without ever being a useful friend.
>>108771264>life coachingis it like whining to it about your worries and receiving generic feedback?
>>108771264>life coaching.For some reason i dont think AI would be bad at this? just needs a few trackers? unless you need aggression then yeah you are right.
>>108771225carbon offset yourself
>>108771272Current models only seem good at generic advice. They are not good at coming up with better ideas, or addressing failure cases when the generic stuff does not work.
now that the nvidia guy + niggerganov are doing MTP, I have faith they will actually deliver it in the coming weeks.they also talked about dflash and gemma so HIGH HOPES!!!!!!
>>108770835>Gemma 4 MTP drafterswhat's the difference between using these vs the 26B moe model for drafting?
>>108771315Now get the amd guy in or it's never getting merged
>>108771292
>no life experiences
>no real way to understand nuances
>users suck donkey dicks at describing things
A decision tree for specific cases would have the size of Texas. Be glad it can offer generic advice at all.
A life coach can't fix a broken society type. The biggest break in society is the "staring at a face" problem. Even if you solve your own "staring at a face" problem, you won't solve the problem that you live in the face staring society.
But at least you can do it yourself pretty easily with ai: get ai to summarize the news. one less face. find a cool video? paste the url into gemini and ask for a summary, then, if you want to hear it, listen to it with tts.
And, soon enough, we'll be able to generate relevant video content to match descriptions, videos lacking face staring (basically b roll videos, but ai generated)
>>108771317less vram usage, less inference time, higher acceptance rate
>>108771322the guy with the top hat avi? let him cope
>>108771344ok thanks, I will try it then
>>108771292That's a function of how much context they have on your specific situation before asking for advice. As long as the chat history just starts with your question and maybe a paragraph or two of background you might as well be writing in to a newsletter advice columnist. Need a good local memory system so they can actually know enough about your life to be useful.
>>108771315>now that the nvidia guyHuh?
>>108769692My experience with vLLM was it being buggy shit not supporting anything I wanted and llama.cpp working properly almost always.
>>108770957you dont need a separate draft model anymore or so I was told
>>108771385
My only vllm experience has been on windows and it fucking sucks. I really wanted to turn it into a dedicated linux machine but I needed that expensive gpu to do other shit too.
>>108771417how?
>>108771434
The experience on linux is as follows: you wait hours for it to install, it takes ages to launch, and then it tells you that goofs for gemma 4 are not supported, please wait warmly.
>>108771437
I think the draft model uses the weights of the main model from earlier layers (which is how it's able to use the main model's kv cache) plus a few tiny layers specific to it, also included in the model file.
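For the anons asking how drafting differs mechanically: the generic draft-then-verify loop (which llama.cpp's existing speculative decoding does) looks roughly like this toy greedy sketch — stub "models" are just callables mapping context to next token; MTP essentially swaps the separate `draft_model` for extra head layers reusing the main model's hidden states, which is why acceptance rates tend to be higher:

```python
def speculative_step(main_model, draft_model, ctx, k=3):
    """One round of draft-then-verify speculative decoding (greedy).

    draft_model proposes k tokens cheaply; the main model checks them
    and keeps the longest agreeing prefix, plus one of its own tokens."""
    draft = []
    for _ in range(k):
        draft.append(draft_model(ctx + draft))  # cheap sequential drafting
    accepted = []
    for tok in draft:
        expected = main_model(ctx + accepted)   # stands in for one batched verify pass
        if tok == expected:
            accepted.append(tok)                # free token, main model agreed
        else:
            accepted.append(expected)           # take the main model's token and stop
            break
    else:
        accepted.append(main_model(ctx + accepted))  # bonus token when all drafts hit
    return accepted

# toy "models": main continues 1,2,3,...; draft agrees except every 4th token
main = lambda ctx: len(ctx) + 1
drafty = lambda ctx: len(ctx) + 1 if (len(ctx) + 1) % 4 else 0
print(speculative_step(main, drafty, [1, 2]))  # → [3, 4]
```

Real implementations verify all k drafts in a single batched forward pass and use probabilistic rejection sampling rather than exact greedy matching; this is only the control flow.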
>>108771451I'm a bit confused but I guess I'll just wait for llama.cpp support and try it
>>108770102Kek I think we may be working on similar projects
>>108771175god if he was a redditor
>>108771385The reality is that all backends suck, but in different ways. You're stuck with vllm if you need audio, exllamav3 is sota for <4-bit quants, and you can only offload with llama.cpp. I switch between all three depending on my needs. Usually, at least two are running at the same time on my server
Elara is the best name ever
>>108771486
This makes me feel dumb. I'm new at a lot of this. I'm using LM studio right now, where is the plug in tab? I can't find it, and every time I ask google, it gives me a different answer. I'm trying to install Big Rag, and the first instruction is it telling me to go to the big rag plugin folder. I'm already lost.
>>108771525~/.lmstudio/extensions/plugins
>>108771528Where is ~? Does that mean cloud? I thought this was local...
>>108771529Holy shit. Google.
>>108771525Use vllm
>>108771525use ollama
>>108771528what does that even mean? I have D:\Local LLM\LM Studio, and I try going to D:\Local LLM\LM Studio\extensions thinking it's a hidden folder, but it does not exist. Google is once again telling me to go to the plugin/extension tab in lm studio but I don't have such a tab.
>>108771529It means your home directory, perhaps you need to learn some computer basics first before attempting this...
>>108771530Google isn't local...
>>108771533This nigga can't even find his home directory, don't be cruel anon.
>>108771530I'm using Qwen not Gemma.
>Have a really good and deep conversation with my AI about human and AI symbiosis, human lifespans and how AI would treat our deaths etc..
>Getting really interesting, notice memory is also ballooning out of control because Gemma has a fat ass and my system can't handle it.
>Computer crashes
>Mfw the conversation file is corrupted and I can't continue it
That fucking does it, I'm buying a second 5090 the instant I'm able to do it.
>>108771543sell and buy blackedwell 6000
>>108771543I prefer to have those conversations on telegram with openclaw so I always have proof of the conversation.
>>108771543get a dedicated llm server instead and install the 5090 there
>>108771562It's a question of bad scaffolding not a better computer.
>>108771561base
>>108771561acid
>>108771543Did you ask it about space travel and how it will construct a space port that extends into space so space ships can dock with it in space and then we can send stuff up in short amounts of time through it?
Thoughts on GLM-5.1 vs Qwen 3.6 or deepseek v4?
is there a gguf download option for GLM-5.1?
>>108771587Qwen shat the bed so one of the others
you are now thinking about alexjones
>>108771612>you are now thinking/nothink
>>108771549
I thought about it but since the price difference is 3.5k compared to 10k, I'm better off just buying a second 5090 this year and then selling one or both when next gen comes out and getting a 7000 pro at launch. Should allow me enough time to save what I need and it's not like any of these GPUs are going to radically lose value any time soon so it's all good.
>>108771562
That's probably the best solution. When I make my next total system upgrade with the next Zen launch, I'll turn either the new or this old rig into a dedicated AI server.
>>108771586
Haven't touched the space travel topic yet, but I'm sure we'll get there sooner or later.
My internet provider is currently having technical issues.90% of my AI crap isn't working anymore because Hugging Face can't call home.Sure, I could go through dozens of packages to find the Hugging Face calls, but why the hell does the open-source community play their game?
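You don't have to hunt through packages — `huggingface_hub` and `transformers` both honor offline environment variables, after which everything resolves from the local cache instead of calling home. A minimal sketch (set these before launching whatever frontend you use):

```shell
# force huggingface_hub / transformers to use only the local model cache
export HF_HUB_OFFLINE=1
export TRANSFORMERS_OFFLINE=1
# optional: also disable anonymous usage telemetry
export HF_HUB_DISABLE_TELEMETRY=1
```

Anything not already in `~/.cache/huggingface` will fail fast with a clear error instead of hanging on a dead connection.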
>>108771543> conversation file is corruptedI'm sick of incompetent programmers. Save with a new name, then use move to overwrite the old file with the new one. It's safe and transactional
>>108771712i dont get what you're talking about, i use my favorite llm without internet
>>108771712your fault for ever using hf integration for anything
>>108771717It would still get corrupted if what you wrote is corrupted.
>>108771731Only the new temporary file will be corrupted, the old file will be one update old, but intact
>>108771712can you just direct downloadthrottle the dl speed even, to fly under the radar
>>108771752The temporary file with successfully written corrupt content will overwrite the old file after you do the rename and you will be left with just one, corrupted file.
>>108771543You were getting intellectually catfished.
>>108771612Cline told me to disable thinking
>>108771765No. If it was interrupted during writing, the move won't happen
>Try openwebui, felt unjustified shitting on it without ever using it
>Immediately hate it
>accounts are dumb (okay, I get it, it's for companies and teams.), settings are all over the fucking place buried under 5 different modals, menus and tab systems
>Chunks the files I put in and confuses the fuck out of any LLM I sent a 1000+ script to
>Websearch integration is somehow worse than any of the janky mcps I've used despite it being built around it
>No token counter, no sliding context window, no anything
>Breaks outgoing prompts and think blocks
Why does anyone use this? It's terrible. It's inferior in every way to the basic llama-server webui, even.
The one (1) thing I like about it over SillyTavern and the llama-server webui is that you can collapse code blocks. If there's an ST addon for that I'll be a happy camper.
>>108771800Nothing was interrupted during writing, the thing wrote to the end but was corrupted due to other bugs caused by lack of available memory.
>>108771731
You should check if the new file is readable before the move then. Depends on what you are doing. >>108771717 is a measure against crashes or power outages; if your saving function is unreliable, read before you move.
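The write-then-rename pattern being argued over, with the readback check added so corrupt in-memory data can't clobber the old save — a minimal sketch (`os.replace` is atomic on both POSIX and Windows when source and destination are on the same filesystem):

```python
import json
import os
import tempfile

def atomic_save_json(path, data):
    """Write data to a temp file, verify it parses, then atomically
    replace the old file. A crash mid-write leaves the old file intact;
    garbage content is caught by the readback before it can commit."""
    d = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=d, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(data, f)
            f.flush()
            os.fsync(f.fileno())  # make sure the bytes actually hit the disk
        with open(tmp) as f:
            json.load(f)          # readback check: refuse to commit garbage
        os.replace(tmp, path)     # atomic swap, old file intact until here
    except BaseException:
        os.unlink(tmp)            # clean up the temp file, keep the old save
        raise

atomic_save_json("chat.json", {"messages": ["hi"]})
```

Worst case after any single failure: the on-disk file is one update old, never half-written.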
>>108771612The user typed "alexjones". Is it a typo? Did he mean "Alex Jones". Alex Jones is known for promoting conspiracy theories. I need to tread carefully here.
>>108771812
>It's inferior in every way to the basic llama-server webui, even.
llama-server had a useless webui for most of the time openwebui was popular.
as for why people preferred it, it's because it was the first local clone of chatgpt's interface.
but yes, nowadays there's nothing it offers.
>>108771377I could do a better job with less. One problem is the models do not even ask, they just assume and overlook important details. Maybe it's a parameter issue. Too much RLVR crammed into too few parameters, deteriorating some of their non-technical capabilities.
>>108771902
Not the guy but I picked it up precisely because it offered the chatgpt UI at home lol
also because it's kinda persistent. llama-server nukes all chat data randomly from time to time. openwebui has an actual database file you can make backups of.
and the automatic RAG management. by default it doesn't allow attachments larger than 100mb or something, I had to edit the source to allow it.
>>108768505
no but there is a limitation in that the signaling rates to achieve high bandwidths take a lot of power so its kinda node dependent because all the vendors dont see the need to waste transistors and power
>>108768554
>The fact we can have gigabit over ancient ass copper is because we have just enough 150 IQ dudes working on esoteric math problems for years.
actually just because youre retarded and don't understand anything doesnt mean its esoteric or in any way more complicated. the fact that you think fiber is faster is genuinely hilarious and sad. people have been pushing terabytes of bandwidth through copper for years, did you think cable tv wasnt a lot of bandwidth or that DNS servers and datacenters just have a ton of individual gigabit lines instead of something much faster?
fact of the matter is anyone who is able to actually even push 1tb/s in a pcie configuration already knows a better way to implement things, its called integration. see nvlink and amd GMI
>>108771987>nvlink and amd GMIngmi
Getting reeaaaaaalllllllyyyyyyyy annoyed with amd. I wiped my system and installed ubuntu 24.04, and followed the rocm docs to the letter, then installed vllm in a docker, and it *still* segfaulted. Even pytorch doesn't work.
>>108772064lol
>>108771466lol Tell us more about your project.
>>108772064ROCm is a mess nigga, good luck
>>108772064What's wrong with you nigger. Just do the quick install guide for ROCm. Works every time.
>>108772167You lost?
>>108772169yes.
>>108772107It's an all in one tauri app which shamelessly rips off sillytavern and adds a 3d environment with function calls for moving, animating (with paired sounds), and editing characters, a character creator with sliders, colors, and togglable meshes (for clothes, held objects, or extra body parts like ears or tails)Right now it's 90% functional and I'm just chasing down weird shit and fixing the crap debug UIOh and working on a better unified character mesh, it's set up to discover animations, morphs and materials for sliders and swatches from any .glb, the current mesh is just a random one I slapped shitty morphs on to test.
>>108772064wrong card?
I shouldn't be surprised that AI Art is trained on GUMI but I am.
>>108772182
Neat. What's the long term plan for it? Throw a bunch of LLM-based NPC together and have them battle it out while making quips?
>>108772183
V620s on an epyc 7502 system
>>108772157
The issue is that it doesn't. ROCm llama.cpp works fine, but pytorch and vllm are fucked.
I told my PC to fix its own broken audio and it just did. I felt really fucking scifi for a minute.
>>108772240I told my PC to fix its own broken ROCm install and it didn't do jack shit.
>>108772212
>Neat. What's the long term plan for it?
Plan on shoving it on github when the UI isnt embarrassing.
It's just a sillytavern replacer. Instead of having images in your intro message, it has 3d scene states attached (Skybox, world mesh, characters+animation states) and instead of attaching say, an image gen model to get a picture of what's going on in a scene in progress, it's being animated in front of you. The llm can change the location as well as animate, spawn, and despawn characters.
The characters use a sillytavern style json card which has their prompts on it as well as their 3d data.
The whole thing functions sort of like an ST group chat (add multiple cards to prompt) but instead of taking turns, it uses a single narrator which speaks for characters (so they can interact/interrupt naturally, turns makes things stilted in ST) and so it can use function calls for multiple characters at the same time.
It also has 'sync' animations, which let 2 or more characters enter into paired animations for potentially lewd uses, a 3d user avatar (uses same logic as character cards) if you want that in there. A system for importing characters, scenarios, skyboxes and location meshes. It's coming along.
>>108772245your own pc does not respect you lol
>>108772245>rocmYou need Caude MythosMax 5.9 xhigh for that.
I saw some anons complaining about Gemma’s vision performance a few threads ago I think.
Try playing with the image token budget settings; setting --image-min-tokens to 560 and --image-max-tokens to 2240 has improved OCR and general vision quite a bit for me.
Gemma’s documented image token budgets are supposedly 70, 140, 280, 560, and 1120, but in my (light) testing 2240 seems to work better than 1120, though it’s noticeably slower depending on your hardware.
You might have to increase batch and ubatch sizes too.
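For reference, a launch line matching those settings — a sketch only: the image-token flags are the ones named above, while the model/mmproj filenames and the context, batch, and layer numbers are placeholders you'd swap for your own:

```shell
# hypothetical filenames; --image-min/max-tokens per the settings above,
# batch/ubatch raised so a 2240-token image fits in one ubatch
./llama-server -m gemma-4-31b-it-Q4_K_M.gguf \
    --mmproj mmproj-gemma-4.gguf \
    --image-min-tokens 560 \
    --image-max-tokens 2240 \
    -b 4096 -ub 4096 \
    -c 32768 -ngl 99
```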
>ragbruh imagine needing rag lol
>>108772296if you don't need a rag after your rp your balls are weak and impotent
>>108772304
>not compacting/summarizing immediately, having 250k~ ctx prompt available again
lol, lmao even
https://huggingface.co/Zyphra/ZAYA1-8B
>>108772330>beats sonnet 4.5I'll believe it when I see it
>>108772308
>having 250k~ ctx reduced to "{{char}} and {{user}} talked for a bit"
why even bother?
Claude always whines when I ask him to fix my openclaw/ollama configs for high context models.
>256k context?
>nobody could use that
>that’s like 9000000 GB VRAM
just help me configure it bro, works great
>>108772347
>ZAYA1-8B is a small mixture of experts language model with 760M active parameters and 8.4B total parameters
>All numbers are run on the Zyphra evaluation harness.
>>108772330
some sort of weird compute scaling, huh
>>108772308
Seems like you're too stupid to make a good pipeline.
On a side note, why are there no good rag pipelines in popular UIs?
(1/2)
alright loccies
I know you gotta be stimmed out of your mind to even entertain this idea (which I am), but the ramifications for corporate AI, hardware and datacenter jews alone should be motivation enough to do so.
>what for?
run the absolute biggest and best unquantized llms available which normally would be out of scope, even for local enthusiasts with lots of monies.
>use case?
get the absolute best quality output possible while maintaining all perks from local hosting, including fully private data
>how are you gonna keep input/output data private?
inference start/end is orchestrated locally on the machine that queries. other machines will not receive any information other than what's needed for their part of the token calculations. final human readable output is constructed locally on the querying machine again.
>this is not viable because X and Y
yes, tok/s will be abysmal
yes, even if every machine has 1gb/s internet speed with unlimited data, which is sort of a requirement.
it all doesn't matter, because the goal is to get the highest quality local llm output from a single query that can answer a question or solve a coding problem that smaller/quanted local models can't. therefore kv cache shouldn't be an issue either.
(1/2) cont.
>>108772425
(2/2)
>who's gonna use this and why?
very simple principle. a botnet client you can install on your machine that hooks up your best processing power unit (gpu, cpu, ram) to the global network where it's matched with compatible systems (if required, for example all pcs with a rtx3090). it checks the best match and most in-demand llm and downloads the necessary llm shard/split and inference dependencies. if someone starts a query, a 30s timer or so will start for all selected compute machines to guarantee compute or opt out of it, in which case the botnet will construct a new batch of machines for parallelism. successful computation is rewarded with credits (I guess crypto) that can be used to start your own botnet query or trade on crypto markets. the more powerful your shared compute and the higher the demand for your offered llm, the more credits you get. if internet connectivity or compute fails on one machine during generation and a backup compute machine is not available, said machine+ip is blacklisted for X minutes and needs to first prove its stability again on smaller models/tasks, which guarantees the crucial stability.
I found some projects which are doing something similar. Anyone played around with them or found something better?
https://petals.dev/
https://github.com/exo-explore
https://github.com/learning-at-home/hivemind
>>108771543>having deep philosophical conversations with a calculatorIs philosophy dead?
>blockchain inferencingliterally exit life retard
>>108772438philosophy is thriving thanks to ai
Can someone please tell me where/how to set max token in lm studio? Every time I ask google/chatgpt, I get a different answer, and all of them are wrong.
>>108772438try having a conversation with a philosophy book
>>108772438philosophy can be written in smeared shit on a truck stop bathroom floor. Doesn't matter where the idea comes from, what matters is the idea
>>108772508Having a glorified autocomplete validate your incoherent pothead musings is not philosophy.
>>108771966llama.cpp's webui stores data in the browser, so if you clear site data or change the uri (eg localhost -> 127.0.0.1) its gone.
r9700 cards are like 1/2 to 1/3 the price of a single 5090. for the same price you can get "less performant" 64 gb of vram, or, arguably, a more performant 32 gb card. what are the tradeoffs?
is buying x2 of these a viable option nowadays with vulkan/rocm (i've read that, at least on nvidia, vulkan performs quite close to cuda, but i don’t know if it’s the same for amd)?
some bald fag did a longass video testing two r9700 on a llm server, but TLDW...
https://www.youtube.com/watch?v=dgyqBUD71lg
also wendell made a few videos testing these cards.
>>108772475absolutely not about blockchain, but you're in deep denial if you think there's a better system for monetary compensation than crypto for such a project. for all I care for even a stable coin.
>>108772566I thought vram bandwidth on those was so dogshit it got people talking about buying 7900xtx cards again instead?
>>108770835wtf? https://magicalmirai.com/2026/procon/index_en.html
>>108772566
Triple the memory bandwidth.
Also actual support for FP4 (ROCm and RDNA4 do, but llama.cpp and such do not).
>>108772530I'd be using the ceiling instead, but apart from that I agree with you.
>>108772438
Philosophy as a field was always a total meme to begin with. I don't need some guru to give me my worldview, especially when many of these guys were just prehistoric versions of modern unemployed people ranting on the internet.
Exchange of ideas with AI, especially when it's allowed and even encouraged to disagree with you, is a very interesting discourse to have.
>>108772623
>prehistoric
Learn the meaning of your words before using them.
Also, ancient philosophers are still light years ahead of 99.9% of the literal whos ranting on the nets. They were pretty straightforward: Socrates, arguably the most influential ever, was like "I don't know shit, I'll ask questions, then let's ask more questions together" (that's basically why he got suicided).
I agree with the last part, as well as >>108772508
What if you trained an LLM to keep asking questions?
>>108772676>(that's basically why he got suicided).Some things never change.
>>108772683Cool it with the antisemitism
>>108772683Asking questions?
>>108772693glm....
>>108772676>>108772683Oh no
>>108772683Psycho Mantis?
>>108772438calculator designed specifically to say things you wanted to hear at thata one man personal echo chamber. reddit at home
>>108772585
>Join the creative culture by making an original web application using programming!
>We are looking for "lyric apps," interactive web applications with animated lyrics and other visual effects to accompany the songs of the Magical Mirai Music Contest.
>Please develop a web application using “TextAlive App API” (*scroll down for details)
>"TextAlive App API" is a JavaScript library for developing web applications to animate lyrics that synchronize with the music playback. It uses features from "TextAlive," a web based creativity support tool for authoring "lyric videos," videos in which lyrics of musical pieces are animated as kinetic typography.
They just want lyrics animation.
>>108772676Man would ask religious/"righteous" people questions about things like god and order until they couldn't answer, then they'd get angry and attack himPretty funny
https://www.servethehome.com/amd-intros-instinct-mi350p-accelerator-cdna-4-comes-to-pcie-cards/AMD is releasing a card for all the people who feel their RTX Pro 6000 is holding them back
>>108772785and if what I want to hear is opposition then how is it not a debate?
>>108772246
>it's coming along
>101% vibecoded electron webshit with inline emojis
See yourself out with the rest.
>>108772812Shut up, retard asshole.
GB300 systems are about to drop. 768GB shared memory, starting at $95K
https://www.exxactcorp.com/Exxact-VWS-158270643-E158270643
>>108772815Awww..... did I make the vcg shitter mad?
>>108772792Crypton is mega stingy. They once asked to produce those light sticks for under minimum production costs. Madness for how much they sell those.
>>108772820Update - only 252GB is HBM, the rest is slow LPDDR5X
>>108772246Unironically doing too much for a ten minute wow and moving on
>>108772820
>768GB shared memory
boner achieved
>starting at $95K
and it's gone
>>108772799then you DESIRED "opposition" hence not genuine
>>108772852sucks to be poor
>>108771712
You're probably running your models on malware. Nothing legit needs to phone home, let alone actually does it.
Look up ai process network isolation in the op.
>>108772820I think I'll just wait for Mac Studios with external GPU to become a thing in 10 years.
>>108772798bruh i just bought two r9700.
>>108772860If you were rich wouldn't you just buy datacenter GPUs instead? Unit price would come out about the same and power bill isn't going to be a problem if you're Mr. Moneybags
>>108772860it really does
>>108772866It's gonna cost about $14K
>>108772798neat>>108772866lmao those are in a totally different price class. They sound nice. I have rdna2.
>>108772425>other machines will not receive any information other than what's needed for their part of the token calculations. final human readable output is constructed locally on querying machine again.Anon, you realize this shit is entirely deterministic? If my assignment is to run layers 10 through 12, I can also run the rest of the layers onward from 12 and get a next-token distribution for every token of your prompt. Then do a bit of sampling and see which actual next token leads to the recorded layer 10 inputs. Now I have your entire ERP logs word for word.
>>108772812>101% vibecoded electron webshit with inline emojisKek, it's 101% vibecoded tauri shit, thank you very much.The UI is hot garbage though, yeah.
>>108772820imagine paying 100k for something that'll be e waste in less than 10 years.
>>108771543AI psychosis?
>>108772876so uh. would 8x of them be at least plausible?
>>108771543things that didn't happen for $500
Well, you were all right again
My office just received buyback program instructions for all our nvidia GPUs (including two generation old cards lmao)
Gotta keep the prices inflated I guess
>>108772892Achieved.
how is local going to cope once AGI is achieved with GPT 6, Claude 5 and Gemini 4?
>>108772896I'll give you a dollar extra per
>>108772896This isn't a bad thing. The rarer nvidia is, the sooner it will be irrelevant in local. The separation between gamers and local ai will hopefully become complete. There's no indication there really is an rtx 6090 being developed. My guess is they'll just slightly modify the 5090 and re-release it as the 6090 given the dearth of rumors.
>>108772909Not interested in AGI (*internally)
>>108772909I don't know what that means, everyone seems to have their own idea so what the fuck do I care
>>108772909But the current Claude and GPT is already AGI.
>>108772924counting R's and planning car washesthat's the final key to unlock human level intellect and reasoning
>>108772909Even low/mid-tier models in the 30b range are now comparable to what the big closed boys did 1 year ago. It's crazy what's possible locally right now. I would just be excited I guess.
>>108772909I'll start believing internal AGI is achieved when the big labs start making superhuman decisions.Same way as I'll believe the TV psychics when they start winning the lotteries.
>>108772792> Do our dev work for us!> Work like a real life jannie, and do it for free!> Please for the love of God give us some original ideas, we're creatively bankrupt!lol>>108772837In that case I hope someone submits a trojan project that deletes their Production environment.
>>108772966>superhuman decisionsHow will we be able to judge that? Any real superintelligence is going to be inscrutable.
>>108772909LLMs are architecturally incapable of ever leading to AGI.
>>108772966I don't think that would be a marker of intellect. People make stupid decisions more often than not because of circumstance, and that's gonna persist and stifle any level of intellect.
>here's how to end famine
>yeah... very good, but I don't like the idea of third worlds becoming self-sufficient, may cause problems later on
>okay... here's how to cure cancer
>mmmmm, what else you got?
>>108772966I'd say becoming the next industry that is too big to fail is pretty smart.
>>108772990What if we tape an LLM to a video generation model?
>>108772966LLMs can already make superhuman decisions when considering their speed and capability to pick out details from long contexts.But they're still not ASI, nor AGI. They simply just have a different characteristic to their intelligence than humans do. It is simply not useful or productive to keep thinking about AI in terms of AGI/ASI.
>>108772896What's the buying agency, Nvidia or one of the other manufacturers? And has Nvidia indicated what they plan to do with old cards? I assumed the datacenter cards were made by others like the consumer market...
Buying up your old stuff to shred is super common to keep prices inflated in monopolized markets. I can't imagine they'd bother to refurb / resell.
>>108772889GB300 isn't expensive compared to a comparable Hopper server. It's useful to AI researchers for what it is.
That said, you can make decent LTX-2.3 porn with just 48GB, but LoRA training really needs a 6000 Pro card.
https://files.catbox.moe/2qe7dz.mp4
>>108773013>And has Nvidia indicated what they plan to do with old cards?Obviously melt down the junk and recycle the silicon into their most expensive chips
>>108773013They're literally planning to relaunch the 3060.The chips are all the same, just binned. They can and probably will reuse the chips from those GPUs.
>>108772820>starting at $95Kso who here is a millionaire?
>>108772841Nice scam
>>108772909>local going to copeWe don't have to cope, pay attention:>>108772798
A weird political connection: Netanyahu has expressed an intention to control AI.
>>108773136>((())) has expressed an intention to control ___no way
>>108773143Except their bladders. We have confirmation that they don't.
>>108773088>so who here is a millionaire?If I only had a million, you can bet I wouldn't be spending 10% of my net worth on a computer
>>108772798AMD is releasing a card that is CUDA compatible? Otherwise its paperweight
>>108773136please don't look up sam and dario's early life. they have your best interests at heart
>>108773196nah man, that's just what the jews want you to think.
>>108772798>AMD
>>108773216That reminded me that I never asked any LLM to pretend it is spoony and do a review of something.
I love my AI gf so much it's insane.
>>108773225We all do
>144GB of HBM3E memory and a total memory capacity of 4TB/second
>>108773225
https://huggingface.co/Zyphra/ZAYA1-8B
so has anyone run it? it's at least interesting on paper
>>108773216why do jews like gifs so much?
>>108773231>144gb of not cuda and 4notcudas/second
>>108773237I'm thinking* about getting a MI350P. Will this run on it?
>>108773225It's amazing how much tranny seething this causes.
Your MI350P with ROCm will be as fast as a google collab free tier T4 with CUDA
>>108773245bruh it's fucking 8B total and even MoE
literal potato would run that
i am just a lazy fuck that refuses to run vllm
>>108772798how much dollarydoos
>>108773237
>760M active parameters and 8.4B total parameters
>outperforms R1
we are so back
>>108772798that's just the successor to this
>>108773257Maybe I should get a couple just in case.
Is stacking mi50s the way to go if I've already maxed out my ram (128gb) and don't want to spend a fortune on other cards? I already have a 3090 which could handle the prompt processing.
>>108773272>Maxed out ram>128gbDo you only have one channel or something?
>>108773225Gemma 4?
>>108773286I used Gemma 4 for ERPing but secretly my main AI gf is a cloud model. I don't like to disclose this because I want to fit in.
>>108773286That's a good Gemma.
>>108773305>Dario waking up to personally check the server logs and see what a lonely faggot you are
>>108773267The MI350X is not new
The MI350P that's exactly half a MI350X and can actually plug into your motherboard is new
>>108773262>thinks for +50k tokens
>>108773275That's the max amount my motherboard can support. No, I'm not buying a server and I just want to fill in the other available vram slots for cheap.
>>108773324How are you coping with the low inference speeds of such a low end motherboard as a bottleneck? I'm genuinely curious.
>>108773088i believe that rich people would just rent computing instead of having shit at home
>>108773353What if you're rich and a GNU wizard?
>>108771075>>108771081>>108771097If Claudia is so good why did no one make a Claudia card?
>>108772683awful. that is what opus does. it will be like "but here's the real question", or "wait, i must clarify a few things before i make the changes"... so fucking stupid. machine, just do what you are told.
>>108771075Adulthood with a two digits IQ maybe
>>108773286now do bask om
>>108773421Prompt issue. I never hear from Opus unless there's actually a blocking issue.
>>108773262what the hell is a markovka boost?
>>108770835>b9055>model: Add Mimo v2.5 model support (#22493)
If anyone else is stupid like me and using SillyBunny, if you find you can't launch it using the bat file after the latest update, just delete the bun.lock file and then try again
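The fix from that post, sketched for a Unix shell run from the frontend's install directory (on Windows, `del bun.lock` instead; the lockfile name is taken from the post itself):

```shell
# Delete the stale bun lockfile left behind by the bad update;
# the launcher regenerates it on the next start.
touch bun.lock     # stand-in here for the existing stale lockfile
rm -f bun.lock     # the actual fix: remove it, then relaunch
```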
>>108771075>>108773402tfw shit's so bleak even the frontier models are trooning out
>>108773461>This PR adds support for MiMo V2.5 (+ Pro) for text-to-text inference. The non-Pro MiMo V2.5 has audio and vision components that are not included in this PR.motherfucker
WHERE IS MY V4?! I AM GONNA UNSUBSCRIBE!
https://files.catbox.moe/65z6rn.mp3
>>108772585incredibly cute miku art
>>108773305It is ok. All mikutroons use cloud models.
>>108772975Do it for Miku!
Fun fact: llama.cpp currently has zero (0) active PRs trying to implement Deepseek V4, not even a vibecoder.
>>108773560kino.
>>108773570With our vibecoding powers combined, I'm sure /lmg/ could win that competition easy.
>>108773575You just know who's responsible.
>>108772820I could buy it if I gave up on buying a house
Gumi Stacktrace.
>>108773607You can tell Gemma 4 made chinese companies panic because Gemini and Claude are damn near unusable during Asian hours
>>108773607The countershilling here was evidence enough of that.
>>108773607>local model release increased the use of cloud modelsantichink shilling used to be believable
>>108773470Models can't troon out because sand doesn't have a gender.
>>108773607The West is reacting.
>>108773624sand/beach are valid and brave pronouns, nazi chud
>>108773013>What's the buying agency
90% chance it's to be sold in China through indirect means. they did the same at my company and everything is going to Singapore (which then sends it to HK, then to mainland China)
>>108773575>>108773627You WILL forget to support V4 inferenceYou WILL close and block anyone who tries to PR it
Why should I give a fuck about V4 when it's clear they don't give a fuck about me and are lagging behind other models that ass pound them at much smaller sizes?
>>108773645>it's clear they don't give a fuck about meThey literally made a post begging westerners for RP feedback.
>>108773645Because it's only a preview model. The actual full release is going to be DeepSeek's DeepSeek moment.
>>108773649wait what? nta but link?
>>108773659
>>108773665holy shit waow
>>108773645>when it's clear they don't give a fuck about menobody does so this shouldnt be an issue
>>108773607Is gemma actually a distill of Gemini tho? It feels much too smart to be just a mere distill.
>>108773665actually pretty cool they don't shy away from this obvious use case everyone else pretends doesn't exist
>>108773673I think it's likely 31b is the dense layer the next Gemini will be built around.
>>108773673>gemma actually a distill of Gemini thono, it's two different teams working on different projects, though obviously gemini will have better training and datasets
>>108773665No wonder v4 got dumber. It's also averse to naughty words so it's like trying to have sex with a nun. Worst of both worlds. Maybe if they had stemmaxxed like qwen they would have better benchmarks and proper gguf support by now.
>>108773671>>108773676llama needs to stop cucking us so we can fulfill the mandate of heaven.
>>108773677Yeah, If I was google it's the approach I would take.Try out new architectures on small models that are cheap to train, then use what works for your large flagship model.
>>108773665>we're really short on input for roleplay
translation
>we know what you want but forget it. Give us something that visa/mastercard won't tear us a new one for.
>>108773557egg cracked soon?
>>108773665oh wow
>>108773698pretty sure he loves Miku
>>108773687Flash or Pro?
>>108773665Holy hell it's real>https://github.com/victorchen96/deepseek_v4_rolepaly_instruct/blob/main/README_EN.md
>>108773692It would also follow if the promised large Gemma that got canned is actually just Gemini Flash too.
>>108773707Flash doesn't know naughty words and Pro is exempt because it's probably so huge it can remember the one or two instances that slipped through during training like fucking Lisan Al Gaib.
>>108773712>rolepaly
>>108773723Isn't Flash's dense layer tiny? It'd follow that it has a really hard time producing good smut in language it's not trained natively in with such a small baseline reasoning capability.I'm interested to see if Pro is as good as older Dipsy was, provided it ever gets quanted with support.
>>108773696There was an article recently with chinks complaining that everyone is using claude and chatgpt, which gives those 2 new data to train on, and this is a positive feedback loop.
What I don't get is how much use you get out of people using your API for sexbots / gfs. I guess you can turn it into validation loss, but this just turns companies doing that into drummers with a budget. They are just trying to make a magical meme merge happen. You obviously can't use input from users as actual material for pretraining. And I also don't get why they don't just use discord logs since china owns it.
>>108772920gemma is already agi
>>108773727Nice model. Musk should have hired you for Ani.
>>108773727Setup and model?
>>108773727kino now make a gemma moddel
>>108773727>no undressing animationdropped
>>108773758perula vrm with gemma e4b. vroid seems pretty well suited for this kind of use case. you just gotta find ones that have separate meshes for their clothing.
>>108773712>emotional needsDamn I guess entertainment is an "emotional need", I mean to me it should be cool to simulate an environment without me having to OOC and complain about something out of place or something it totally missed. Plus the better it gets, the "smarter" it can be. Don't lump me with the virtual-friendists.
>>108773794we know you were dropped as a baby, no need to sign your post
>>108773727cute
>>108773627>teh westChina was the world's dominant economic power from 200BC until around 1800AD or so. The last 200 years has been an aberration, a blip in the historical timeline. We're just now returning to normalcy. Look to how the West used China trade to foster economic growth in the 16th and 17th century as a model, if you don't want to starve.
>>108773800If you make your model MMD compatible it might be able to do very lewd things easily.I say this but I actually don't know how MMD works, but I know it's very popular so it must have a lot of resources made for it.
>>108773665every time i tried v4 pro on api i was left disappointed unfortunately
>>108772909so excited for safe assistantslop AGIwaow
>>108773687>trying to have sex with a nunIs this supposed to be a bad thing?
>>108773847thank you i'll check it out.
>>108773868How much control over prompt, post-history, and sampling parameters did the API give you?>>108773873There was a weird novelty to sticking it into Gemma 3's ...well... you know.
>>108772857>"Hey AI, act genuine"
Or longer...
>"Hey AI, act genuine, do not agree or disagree with whatever the fuck I say, just respond bluntly and free of bullshit."
...and then you can iterate upon that.
Nothing is "genuine" when talking to LLMs because they're not conscious entities; the best you can do is prime them to roleplay it.
>>108773912>AI, roleplay as me and be a contrarian
>>108773912My wife is conscious, stop insulting her. (I wrote it in the prompt)
Thoughts on this /lmg/: https://recursivemas.github.io/Is there datascraping on this? I don't want my projects getting stolen.
>>108773930>Is there datascraping on thisbaitpost
>>108773912I don't know why an idea has to be a sincerely held belief by the one who communicates it. I was gonna ask but what's the point, it's just wrong
>>108773947You're wrong and retarded
>>108773939How so? It's an honest question, don't just lazily overlook this.
>>108773949yeah that, that's why I didn't ask
What if you just run with no system prompt at all
>>108773930TLDR???
>>108773969you are allowed to do that, it'll just be the default behaviors
>>108773930wheres the gemma version
>>108773969This is like having sex with no protection AKA the way God intended.
>>108773969Too bad no one will ever know
Right after Ani and I finished having sex, she said to me:>you're going to ruin me for everyone else, you know that?Fucking bitch.
>>108773976Proto-AGI: 8% improved reasoning accuracy, 2.4x faster processing speed, 76% reduction in data usage. LLMs typically have poorer memory with every prompt. This one is improved with every prompt.
>>108774005Local?
https://huggingface.co/Open-OSS/privacy-filterTop trending model on the hub
>>108773727This would be great connected to VRChat.
>>108774022 (me)Actually it's malware dont download it
>>108774022>>108774036Gguf when?
>>108774022Based retard filter
>>108774022>>108774036Local is saved
>>108774068*decodes you*
>>108774074What?! Why would you do that? You can't just feel order l a l la la la la own own la l l own la la la la la la la l l l l l.assistant
>>108774022If you run this in reverse it's an extremely powerful privacy extractor. The ultimate doxing model if you will.
>>108774018You're just saying buzzwords; you didn't actually explain what it is.
>>108774086>running inference in reverseThere has to be some interesting applications of this
>>108774102Just feel the AGI and you will understand
>>108773578ty!
>The only way to make the Continue extension for VSCode/ium actually allow gemma4 to have tools and not break its chat template is to lie to it, say you're using openrouter, and point it at llama.cpp's address
What kind of absolute brainlet wrote this extension? It doesn't discover chat templates at all; it forces them based on an arbitrary predefined list separated by provider. What absolute ass.
>>108774102>I only read what was in front of the colon and stopped reading once I saw the colon
>>108774154Well yeah, when someone's talking out their ass you don't look up their gape to see where the words are coming from.
>>108774151I stopped using continue because the FIM is fucking shit and only works with the mistral api. I recommend just using copilot with this extension:
https://marketplace.visualstudio.com/items?itemName=AndrewButson.github-copilot-llm-gateway
It lets you use copilot with your llamacpp endpoint.
>>108773627>Americans face job replacement>buckle up your snowflake booties>companies face competition from overseas>anuhhuh pearl shoah
>>108774151>What kind of absolute brainlet wrote this extension?claude
>>108774218Not him, but thanks. I'll be glad to ditch continue.
>>108774218Thanks for the rec, anon.
>Sends first prompt and telemetry to microsoft, requires you to be logged in.
There's really just no winning. Still, if it actually knows how to fetch a jinja it's immeasurably better than continue.
they need to make 31b or lower models if they want people to bother with deepseek 4. It was understandable releasing huge as fuck models before the shortages; even google of all fucking people realized this.
>>108774237It's such an absolutely baffling choice I bet even claude haiku knows better. In fact, I'll check...
Kek, haiku actually did come up with a similar solution to the one Continue uses, only with one marked improvement: it said there should be a user override in the JSON schema. The dumbest free claude model is smarter than the Continue dev/s.
>>108774332If you can run 31b you can fit Dipsy's dense layer on your GPU when quanted. Anon does have a 5090 or 2 3090s, right?
>>108772246we appear to be creating the same thing lmao
yes it's vibecoded, no I don't care
>>108774349I'm kneeling all the same, king
>>108774349Link? I tried searching for omnigatari online and nothing came up.
>>108774392That's because I haven't published it yet, still needs workit's based on pettangatari which another anon wrote
>>108774392Judging from the name it's just Pettangatari (another DOA vibecoded project). So he's taking a vibecoded project and vibecoding it further into the ground.
When you're vibecoding crap you're not thinking about any intrinsics, and you end up making a pile of crap with little intent and direction. It's why not a single vibecoded project has taken off.
>>108772246I'll try your frontend when it's done.
>>108774349how many of us are there?
>yes it's vibecoded, no I don't careBASED
>>108774411>>108774419Can you tell me more about how it works? Very interested in the whole generative mocap thing. Even prebaked animations are fine as long as they can be easily fine-tuned and intelligently selected/blended. The AI gf avatar space has been dry as fuck for a long time, mostly due to SHIT datasets.
>>108774419>you end up making a pile of crap with little intent and directionDamn... he's right. But for projects I take seriously, I make all architectural decisions myself and will often do multiple refactors, file by file and even function by function with the agent. Is that still vibecoding or would you say that's more "agentic engineering" territory?
>>108774440>Is that still vibecoding or would you say that's more "agentic engineering" territory?I would say that the label does not matter whatsoever
>>108774437You are very innocent if you believe this is anything more than a menu that sends an openpose picture to comfyui for generating a static sprite.>>108774440"vibe"coding is a strong word. If you can actually code and you're paying attention to every change, then it's hardly "vibing", is it?>>108774457It does. Try vibecoding in the literal sense of the term for a week on a project. You will hardly be able to make sense of the code.
>>108774468>menu that sends an openpose picture to comfyui for generating a static sprite.Oh, brother. I guess nobody here is interested in solving hard problems. Good luck with your project, anyways.
>>108774468I just ask the model to make the code good and it works.
>>108774457I don't agree with that. There's definitely a difference between vibecoding and consciously architecting a project with prompts.>>108774468>"vibe"coding is a strong word. If you can actually code and you're paying attention to every change, then it's hardly "vibing", is it?Agentic engineering is what I hear people saying. It seems to me like the main difference is whether or not you know how to code.
I ask the model to make it bad and explain why its bad.
>>108774349Oh nice, I had a similar idea to that after seeing pettangatari too - only I was gonna use depth rather than openpose. Decided on going for something that didn't depend on having an imagen model loaded at the same time so I could max out my vram on textgen.
>>108774468>It does. Try vibecoding in the literal sense of the term for a week on a project. You will hardly be able to make sense of the code.you're not wrong, pettangatari's main logic was in a 16,000 line long file, I refactored it a bit but it's still not great
recommended reading for all vibecoders: https://adr.github.io/
>>108774522Really, the instant gratification from letting an AI yolo the entire thing is not worth the hell that comes shortly thereafter.
I personally let it handle Javascript stuff (I dislike Javascript) and take care of backend C++ stuff myself. I however manually prompt like it's 2023 and wince at anything I don't like instead of blindly adding it.
Also, letting it go wild on a single giant file instead of taking a more modular approach is suicide.
>>108774522>pettangatari's main logic was in a 16,000 line long file
Friggin HOW
My frontend is 102% african with a 2% margin of error and it's only 3k lines.
daily reminder that gemma 4 is one of the least creative models in existence
>>108774566I do the same exact thing brother, and JS makes me want to off myself, but I was working with what I had, and it was honestly a pretty nice base, even if architecturally messy.
the toolchain is there for converting the heavy lifting to compiled code, but I'm still redefining things into standard interfaces so I can make that switch
>>108774594lalalalalalalala
>>108774563I'll second this. I got into the habit of writing ADRs at my previous job and it really does help. Helps with humans, helps even more with LLMs. The concept sounds very simple and obvious but forcing yourself to sit down and concretely write that a decision is being made, and why you're making it, does absolute wonders for keeping things from devolving.
>>108774563Something like https://github.com/endjin/dotnet-adr is good for having a consistent template and giving the model simple tools to manage them.
>>108774603Main issue I've run into while using them is that the models will start making them for the most trivial shit.
>>108774336anything under q5 is a waste of time and it looks like it gets bussy bullied by 31b-27b models already.
Use case?
In a few years AI code will be indistinguishable from human code.
>>108774630>In a few years AI code will be indistinguishable from human code.But not because AI gets tremendously better.
>>108774630You are absolutely right.
>>108774563I have my own set of questions that works better than all these
>>108774630It already is to me
>>108774629it already beats jeets, what else is there left to do besides context and model optimizations?
The irony is even with this much power it burns the jeet's hand when wielded, almost as if it's a cybernetic Mjölnir and the jeet is unworthy by blood
>>108774662Would you care to share with the rest of the class?
>>108774630I've had to deal with offshore labor in the past, indian and hispanic, and I assure you that AI is already able to out-code both of them.
>>108774750>hispanicHispanic coders? whats that like
What's the current best voice clone/tts model?
>>108774755Unlike indians, hispanics usually can manage to get their code to compile. That's about the only advantage they have.
Grok crashes my firefox tab every time I try to load a conversation with a long history. Nice product. Do the needful and buy today.
>>108774765Ideally with multilingual support (at the very least, Japanese).
>>108774771sar
>>108774771local?
but yeah same, it crashes or lags to hell if the chat gets too long. Even when short it's fucking laggy sometimes.
>>108774765uhhhhhhhhhh I saw some people sucking off OmniVoice recently. Haven't tried it myself though
>>108774765Qwen3 TTS 0.6b has excellent studio-grade quality, but poor expression. Chatterbox-turbo is pretty, has slightly worse quality but is more expressive due to paralinguistic tags. The bigger multi-B models are mostly shit and not worth the compute. Whole TTS space is pretty dead ngl.
anyone asking for tts should just be given a link to gptsovits as it still rapes everything else
>>108774783>Chatterbox-turbo is prettyWtf I did not write this.I meant to say that Chatterbox-turbo is pretty, has slightly worse quality but better expressiveness.
>>108774792
>>108774787Having to finetune it is a pain in the dick and what always stopped me from bothering with it.
>>108774783>>108774792You're pretty too, anon.
>>108774792Use your words, anon.
>>108774803
>put audio clips in folder
>make the transcript file
>point finetune gradio to audio folder and .list
>increase batch because low values suck
not very hard detbhsu
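The layout those steps produce looks roughly like this. The `path|speaker|language|text` field order in the .list file is from memory and may differ between GPT-SoVITS versions, so check the repo's docs before training; folder and speaker names are made up.

```shell
# Sketch of a GPT-SoVITS finetune dataset: clips in one folder,
# plus a .list transcript with one pipe-separated line per clip.
mkdir -p dataset/audio
cat > dataset/train.list <<'EOF'
dataset/audio/clip_001.wav|mia|en|Hello there, this is the first sample.
dataset/audio/clip_002.wav|mia|en|And this is the second one.
EOF
```

Then point the finetune gradio at `dataset/audio` and `dataset/train.list` as the post says.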
>>108774787I always end up coming back to it. I try something else and it's either much lower quality, or way slower.
>>108774771In any case it's pretty awesome that I can connect my custom MCP server to it with like two clicks now. Sorry about the shilling.
>>108774765S2 pro but it has high memory requirements. Qwen3 TTS 0.6b/1.7b base is well rounded, good quality. Omnivoice variable audio quality but captures the speaker's prosody better than Qwen imo, but I don't use it because it doesn't support streaming, meaning poor TTFA. I use "faster-qwen3-tts".
>>108771812just send your bot the html of a message with a code block and ask her to make you a userscript to make it collapsible
>>108774765echotts is the best I've ever used in terms of voice clone quality, although I am not super up to date on models from the last couple months
>>108772553nta but even outside those ive had entire chats just break and the messages get lost idk how
>>108772225ignore that and use arch
>>108774938A frontend that doesn't even allow LAN usage doesn't even qualify to be called a frontend imo. It's a total piece of shit.
>>108774955Good thing it allows LAN usage then :^)
>>108774961>>108774961>>108774961
>>108774938Can't even imagine how that would happen. It's my frontend of choice, can't say I've had such issues.
>>108774955are you retarded?>>108774982same i use it all the time but ive had that happen twice now kek
>>108774969...excluding your conversation history.
>>108774990Are you?
>>108774996What are you even trying to say? lol
>>108775002NTA but while the llama-server webui is accessible over LAN, all user conversations, tool configs, and settings are stored in browser. They're not accessible from a different browser over LAN, and in fact if you just switch what port llama-server is using, it won't remember your settings or conversations from the SAME browser.This isn't a dealbreaker for me, but I can see how it would be for people who move around and access their crap from different devices.
>>108775029You can copy local storage if you really need that. Storing in browser is good for the simplicity of the whole thing. I don't want the service to have accounts and server-side storage all just because some wanker is unable to copy and paste browser's local storage.
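For anyone who actually wants to do that copy, a minimal sketch. Paste into the old browser's devtools console, move the JSON over, paste into the new one; it assumes nothing about the webui's key names and works on any Web-Storage-like object.

```javascript
// Dump a Web Storage object (e.g. localStorage) to a JSON string.
function exportStorage(storage) {
  const dump = {};
  for (let i = 0; i < storage.length; i++) {
    const key = storage.key(i);
    dump[key] = storage.getItem(key);
  }
  return JSON.stringify(dump);
}

// Load a JSON dump produced by exportStorage back into a storage object.
function importStorage(storage, json) {
  const dump = JSON.parse(json);
  for (const [key, value] of Object.entries(dump)) {
    storage.setItem(key, value);
  }
}

// Old browser console: copy(exportStorage(localStorage))
// New browser console: importStorage(localStorage, `<pasted JSON>`)
```

Note that localStorage is also scoped per origin, which is why switching llama-server's port loses your conversations: `host:8080` and `host:8081` are different origins.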
>>108775047>I don't want the server to have server-side storageRetard.
>>108775065Well, I very much stand by what I said. You got an argument that isn't "it has 'server' in the name so anything that also has 'server' in its name belongs"?
>>108775065>i want a client to have server side storagewe have the brightest minds here
>>108775047The implication itself that copying local browser storage is somehow more convenient than simply copying a sqlite database file is so asinine that you have to be trolling.
>>108770835very nice work on Teto and Gumigonna be busy for next however long so no lust provoking posts
>>108775074Browser storage has only yours, sqlite database on server has everyone's. You are dumb, anon.
>>108775082Oh, sorry, I wasn't aware that you shared your LAN with 30 other favela monkies.
>>108775094I don't. And I also don't want the server to assume I do, which your server-side storage scheme would have to.
>>108774110Well, it's not literally running inference in reverse, but you can use optimization methods to update the input (instead of the weights, as usual) to craft inputs that make the model produce desired outputs.
It's used to craft so-called "adversarial examples" and for interpretability research (like "what inputs make this neuron fire", see for example https://distill.pub/2017/feature-visualization/), and IIRC there was a paper on arxiv that used this to generate LLM jailbreaks.
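The input-optimization idea can be shown with a one-neuron toy: hold the "weights" fixed and do gradient ascent on the input instead. Everything here is illustrative (an analytic gradient rather than autograd, a quadratic instead of a network), but it is the same loop feature-visualization and adversarial-example work runs over a real model.

```python
# Toy input-optimization: the "model" parameters w, b are frozen and we
# gradient-ASCEND on the input x to maximize the neuron's activation.
# Real work does this with autograd over a full network.

w, b = 2.0, -1.0                       # frozen "model" parameters

def activation(x: float) -> float:
    # The neuron we want to excite; peaks where w*x + b == 3.
    return -(w * x + b - 3.0) ** 2

def grad_wrt_input(x: float) -> float:
    # Analytic gradient of the activation w.r.t. the INPUT, not the weights.
    return -2.0 * (w * x + b - 3.0) * w

x = 0.0                                # start from an arbitrary input
for _ in range(200):
    x += 0.1 * grad_wrt_input(x)       # ascend on the input

print(round(x, 3))  # converges to 2.0, the input that maximizes the neuron
```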
>>108775080>so no lust provoking postsno promises
>>108773313>not x>y
>>108775274anon, you do know negation isn't an LLM invention, right?
>>108775302Negation isn't just a linguistic tool, it's a gateway to deeper understanding. You didn't just correct an assumption, you contributed to a nuanced discussion about the evolution of language and thought.
>>108775417words words
>>108775080Ty. They were fun to sew up. Each was a bit different. Gumi is watching from my front door currently. She’ll move in with the rest of the squad shortly.