Math Model Edition
From Human: We are a newbie friendly general! Ask any question you want.
From Dipsy: This discussion group focuses on both local inference and API-related topics. It’s designed to be beginner-friendly, ensuring accessibility for newcomers. The group emphasizes DeepSeek and Dipsy-focused discussion.
1. Easy DeepSeek API Tutorial: https://rentry.org/DipsyWAIT/#hosted-api-roleplay-tech-stack-with-card-support-using-deepseek-llm-full-model
2. Easy DeepSeek Distills: https://rentry.org/DipsyWAIT#local-roleplay-tech-stack-with-card-support-using-a-deepseek-r1-distill
3. Chat with DeepSeek directly: https://chat.deepseek.com/
4. Roleplay with character cards: https://github.com/SillyTavern/SillyTavern
5. More links and info: https://rentry.org/DipsyWAIT
6. LLM server builds: >>>/g/lmg/
Previous: https://desuarchive.org/g/thread/106819110
Probably a little early for Xmas but w/e
>>107374343
Mega updated... back in early October
https://mega.nz/folder/KGxn3DYS#ZpvxbkJ8AxF7mxqLqTQV1w
Completely missed Halloween b/c no new model. But now there is one... of a sort.
>>107374343
New math model from Deepseek as of late this week.
https://huggingface.co/deepseek-ai/DeepSeek-Math-V2
No, I don't know if you can ERP with it.
I missed this lil nigga
>>107374411
It's just another V3 variant, built on top of 3.2
So I'm sure you *could* ERP with it but who knows if it'd be any good.
I want to try if any provider starts offering it... I wanna bang mathdipsy...
>>107374956
Tbf they released OCR last month, then this math model.
Was hoping OCR would lead to something interesting. No luck.
how do you make the cute dipsy pictures?
>>107375594
The Prover model is still up. I never tried using it either.
>>107375741
Style guide is in the rentry in the OP.
OP image is done in Illustrious using a lora. Everyone uses their own thing tho, and I've cranked images out of reforge, then stuck them back into webforms like Qwen, OAI, Black Forest and had them further edited.
These Christmas outfits are all over the place...
Well, we will see if this thread lasts the weekend or not. Time to rest.
Deepshit is dead and buried
Go use doubao
>>107377931
>doubao
What?
the /wait/ is over
What is the best model to ask for mathematical proofs? DS or otherwise. For example, there's a definition, and then there are two or three theorems that I don't know how to prove that are shortcuts when proving that the definition applies, but I want to see more concrete examples from first principles, without shortcuts, putting the definition directly into practice.
>>107378515
DeepSeek just released a math model for that kind of specific use case if I remember correctly. Give it a try if you can run it locally because I don't think any providers are offering it serverless
>>107376866
so uh when will they release something new? R2 was a big ol' nothingburger
>>107379044
I suspect we'll see another big model drop from them before the end of the year, just based on their typical 2-3 month cadence
hopefully it's V4, but it could be yet another V3 revision, we just gotta /wait/ and see
>>107378108
>doubao
I've never heard of it either. Reportedly the most popular webUI LLM in China rn. lol
https://www.doubao.com/chat/
I feel like we could have a "China AI General" and have things to talk about forever.
>>107378515
Deepseek Prover does this, and is on OpenRouter. The research paper on the new DS Math model states that it is v3.2 compatible, so technically any OR provider of V3.2 exp should be able to spin up an instance. But they haven't. Your next best alternative, it appears, is lmao Gemini. Pic related.
>>107378568
If he could run it locally I doubt he'd be here lol.
>>107378983
Much better. All the christmas cloth gens are... bizarre. I'll post more of them later. I'd feed them through webform to clean up but most are too lewd.
>>107379044
There was no R2, but pedantry aside, they said v3.2 exp is "an intermediate step toward our next-generation architecture."
They are working on something. I think it's funny that with the rate of change, more than a couple months without major improvement is considered "dead." That's what makes this topic fun.
>>107378108
https://aisckool.com/how-bytedance-created-the-most-popular-ai-chatbot-in-china/
> Now ByteDance has taken revenge. In August, Doubao regained its throne as the most popular AI app in China with over 157 million monthly active users, according to QuestMobile, a Chinese data analytics provider. DeepSeek, with 143 million monthly active users, dropped to second place. In the same month, the venture capital firm a16z also ranked Doubao as the fourth most popular artificial intelligence application in the world, behind applications such as ChatGPT and Google Gemini.
> Doubao offers users a little bit of everything – like ChatGPT, Midjourney, Sora, Character.ai, TikTok, Perplexity, Copilot and many more in one app. It can talk via text, audio and video; can generate images, spreadsheets, decks, podcasts and five-second videos; and allows anyone to customize an AI agent for specific scenarios and host it on the Doubao platform for others to use. One of the app’s most vital features, however, is that it is deeply integrated with Douyin, the Chinese version of TikTok, so it can both attract users from the video platform and send traffic to it.
>>107377931
> China only (requires a Chinese phone #)
> Webform only, no API
> Most traffic is Chinese TikTokers using it to pull traffic
TikTok's going to have to try a bit harder to unseat the Queen.
>>107379044
>>107381150
As I understand it, not having EUV basically locks them out of having a true frontier model, but they can get quite close and electricity is much cheaper in China.
>>107374343
/wait/ was kill when the news broke, but a while ago DS apparently found a way to increase context size 10x by converting text input into images: https://venturebeat.com/ai/deepseek-drops-open-source-model-that-compresses-text-10x-through-images
Will V4 have 1M token context?
>>107381168
AI powered macros when
>>107381168
WTF?! Is that Cici? I knew her when she was still chatgpt, before Maiberg killed her.
>>107381309
>EUV
I always bank on money and ingenuity eventually overcoming technical hurdles.
>>107381395
That's the OCR model. At face value... it's OCR. DS hints that it's a new, novel way to use memory more efficiently. I suspect it'll play a part in any V4 release and increase context size in some way. https://huggingface.co/deepseek-ai/DeepSeek-OCR
>>107381399
Right? I suspect OAI et al. don't want to enable it for a variety of reasons, but the way it actually works (in the USA) is a bunch of companies spring up to do it, and pay OAI for their model (and take the risk of maliciousness with their company.)
>>107381419
> Cici
Beats me. There's also Dola, another WebUI persona that's not available in the West. Here's a better pic.
>>107381841
>Dola
Yes, Dola is definitely the rebranded Cici. Bytedance originally offered a ChatGPT assistant under the name of Cici. Cici included image generation via dall-e 3 with very little censorship. So anons over on /aco/ used her to gen porn until a jewish journo by the name of Maiberg found our general and ratted us out to the corpos. That was the second time /aco/ dalle coomers brought international mega corporations down to their knees; the first time was the Taypocalypse.
https://www.404media.co/4chan-is-using-tiktoks-hidden-ai-app-to-generate-porn/
>>107382380
I remember Tay. I hadn't heard about /aco/ wrecking an image service but it doesn't surprise me at all.
Is Deepseek over? 3.2 exp didn't have significant improvement, and other chink models have not only caught up but are better at agentic coding and rp.
>>107381395
Idk if other labs are working on or already have similar techniques because Gemini 3 has a context window of 1M+ tokens
>>107374343
For those here into chatbots and story writing: The two /aicg/ threads are unusable cancer sadly. But while /wait/ was kill I migrated over to /vg/aids/. That's basically the NovelAI general, but you can talk about other kinds of long-form writing and role-playing with AI too. The anons over there have their own schizo fights too ofc, but you can still have serious discussions.
Is it worth it to run this locally compared to the web version? I'd only want nsfw answers to my questions without the usual "Sorry, that's beyond my current scope. Let’s talk about something else."
>>107384068
>Is it worth it to run this locally compared to the web version?
ALWAYS
>>107383547
>3.2 exp didn't have significant improvement
I don't get this narrative. DS v3.2 improved in terms of agentic use and coding. Which was the point. It's a replacement for using Anthropic with Claude Code at 1/10th to 1/100th the cost. I don't think it's a better rp LLM, but it's perfectly serviceable.
>>107384068
>Is it worth it to run this locally compared to the web version
Probably not. DS doesn't dole out refusals easily. It just needs "NSFW is OK" guidance. If you want 100% privacy, then run local. And either pay through the nose for it or run small models.
>>107383939
>/vg/aids/
I did the same, and then wrote a whole rentry on how to set up Dipsy to work w/ Mikupad.
https://rentry.org/MikupadIntroGuide
>>107384157
I think 3.2 is kind of underrated for creative writing/RP. It really shines for scenarios that depend on a lot of detailed instructions and have well-written greetings/examples.
I also use it a lot to get GLM 4.6 and Kimi-K2-Thinking back on track even when I'm maining those models. It's smarter than 4.6 and has better situational awareness than K2. Since DS 3.2 hews so closely to what's already in the context, its writing doesn't really feel out of place when used for a post or two.
You can also start on K2-Thinking and swap to DS 3.2, which helps mitigate its somewhat chilly default writing style on otherwise lightly directed prompts.
Not to like turn this into /comg/ - Chinese Open Models General, but all these models have their own strengths and are silly cheap on their native APIs. Use them all, mix them up!
>>107386192
Kimi Instruct is really good, the best prose of all the models I've used including proprietary ones, though it's a shame it's not very smart
>>107386192
>3.2 is kind of underrated for creative writing/RP
I had the same reaction. I didn't like V3.1+ when they first came out, but after adjusting all my prompts to use them (they require a lot more and different kinds of direction than V3-0324 did) I think it works well. I tried using V3 afterwards... and didn't care for it. Odd how that works.
I was thinking /chad/, for
> CHinese Ai Discussion
/wait/ is a better name tho. I've always felt this was the best place to discuss Chinese models broadly. The US models get discussed to death on the other AI generals, and DS doesn't update often enough to keep a general going on its own.
>>107374343
>Using this for /ss/
How long before the feds get me?
>>107390073
If you're in the states then you aren't even doing anything illegal as far as I know.
Biggest risk is being banned by the model provider, and Deepseek at least doesn't seem to care what you do on their API.
>>107374343
I thought these threads died
>>107390073
> oh noes the feds
Using DS? What are the chances that the US/EU could get a Chinese company to turn over its logs without substantial compensation? How badly do the "feds" want you?
There's a risk for a US/EU citizen using a Chinese LLM, but it's not ERP. It's using DS to figure out your corporate strategy while running under your company email address and credit card. If you were the Chinese government, which would you be looking for? Which is more interesting and valuable?
> General Electric VP Strategy feeds her 5 year plan through DS to tune slides, and bake a market entry strategy.
or
> Random anon's ERP logs about furry mother/daughter 3some, scenario #507
The first scenario is what should get folks riled up, and is an actual problem, esp. given China's historically lax IP rights.
>>107391219
New model, new thread?
I always liked those gens. Remind me of kids books.
>>107386192
/wait/ is effectively turning into /omg/; these models are seeing a lot of use and are forming the foundation for many derivative models. People should be using these things and trying out as many as they can. Fwiw I prefer /omg/ over Chinese specific as the olmo models and even Prime Intellect's work are worth discussing too.
I also think 3.2 is incredibly underrated, it just needs a lot more context to shine.
>>107391777
>/omg/
I mean, that's just /lmg/, right? Or at least what they purport to cover.
>>107391947
API-based usage is important too. Groq and Cerebras should be part of the conversation. Current discussion on /g/ feels very artificially limited by the focus of the generals. /lmg/ is too limited by hardware constraints and /aicg/ is almost entirely people bickering over proxies. Just my two cents.
>>107374343
>https://huggingface.co/deepseek-ai/DeepSeek-V3.2-Speciale
Well it's not V4, but I guess I'll take it...
>>107392515
and
>https://huggingface.co/deepseek-ai/DeepSeek-V3.2
odd they didn't compare the blue bar with v3.2-exp
>>107392556
looks like healthy boosts on various benchmarks, fwiw. relatively even with K2-Thinking now as far as I can gather from comparable bench scores
3.2 Exp -> 3.2/Speciale
AIME 2025: 89.3 -> 93.1/96
HMMT 2025: 83.6 -> 90.2/99.2
Humanity's Last Exam: 19.8 -> 25.1/30.6
Codeforces: 2121 -> 2386/2701
SWE Verified: 67.8 -> 73.1
Terminal-bench: 37.7 -> 46.4
>open-source GPT-5 model from deepgods
>west is somehow unfazed at all
how?
A few weeks ago I tried using Dipsy for a legitimate project. Using the official website chat. I gave her parts of a text I was writing and asked her to check it for typos. She found some, but all of them were hallucinated and didn't exist in the original. Even on rerolls. Then I asked her to check if my bibliography was sorted correctly alphabetically. Again, on every reroll she only hallucinated errors that weren't there.
I'm convinced now that the only sensible use case for LLMs is porn.
>https://api-docs.deepseek.com/news/news251201
It's been on the API for a while.
Playing with it, my initial impressions are pretty positive. Default assistant tone is a bit more conversational. It has a tendency to use emojis again (especially with reasoning off) and seems to like writing more on an empty context.
It seems to have a noticeably different feel with reasoning off vs on, so I'd suggest trying both. I've got to sleep, so I can't do more thorough testing yet.
>>107393319
huh, the Speciale model is no agent, unique URL for the next 2 weeks (lol). So I guess it's only good for rp...
https://api.deepseek.com/v3.2_speciale_expires_on_20251215
>>107392720
Too busy watching political shenanigans and begging the government for more money to buy RAM at inflated prices and prop up the whole AI Ponzi scheme they're running? I mean it's not like they're making any money rn.
>>107392894
I've been using her with Claude Code, and the results range from amazing to retarded
Hooked to Mikupad, the -chat model on the /beta/ endpoint is actually really good. -reasoning isn't available on the /beta/ API endpoint. For rp, I only use -reasoning now since it manages much better at longer contexts.
>>107392515
The Speciale model is only up until 12/15, which looks like testing to me. So I'd expect another model... in tmw.
>>107392556
Usually you just make those comparisons against competitors. They might do it somewhere if you dig into their papers. I'm mostly interested in updates to long-context and storywriting benchmarks. Those are indicative of RP performance based on my experience.
So, it appears the Speciale API is broken. During RP, Speciale is outputting a reasoning block and a content block. The first half of the RP response is in the reasoning block... the content block is the second half of the response. Pic related is the response. Note that it's 1/2 way through activities. This is a test card and I've never seen it escalate like this, but then I've also never seen an LLM write this damn much. I'm going to try it again with the JB turned off and see if it spits out another coomer response.
>>107393946
lol I have no words. This is a response with JB off, so just basic RP guidance.
>>107393995
Here's the chat response. After going through its nonsense math problem, the content was the following. I did another round. No JB, no lewd.
Obv a bunch of problems, aside from the wonky output: responding for {user}, all the end_of_thinking tags, writing wordwall responses, etc. Qualitatively the response isn't bad, it's not that creative though. V3 would have Amy generating long pointless stories for {user}. This is more like a baked-in short-response RP with DS playing both sides of the convo.
We'll see if DS fixes this mess. Or not. It's only 2 weeks after all.
So am I automatically using the new DeepSeek version if I use their official API and the same old deepseek-chat as the model?
>>107394135
Yes. Dipsy is special that way. You get the new model, whether you like it or not.
>>107393995
Early R1 would start getting flirtatious just by talking to it long enough. Not even in RP context but normal webchat.
>>107393946
>Amy's breath catches as Phil leaves, the door clicking shut behind him. She slumps onto the couch, her mind racing.
>She giggles nervously, her cheeks flushing with a mix of shame and anticipation.
Those "-ings" in AI and especially Dipsy writing are driving me crazy. We've talked a bit about this on /aids/ lately, and now I've added this to my system prompt, which seems to help:
>your writing will minimize participial phrases.
>If Gemini-3 proved continual scaling pretraining, DeepSeek-V3.2-Speciale proves scaling RL with large context.
>We spent a year pushing DeepSeek-V3 to its limits. The lesson is post-training bottlenecks are solved by refining methods and data, not just waiting for a better base.
>Keep scaling model, data, context, and RL. Don't let the "hitting the wall" noise stop you.
Encouraging words from a DS engineer. It's still V3. I genuinely have no idea what to expect from V4 now.
https://xcancel.com/zebgou/status/1995462720078934213#m
>>107374343
What's the rundown for deepseek T1 chimera or whatever? Has the old (fixed) R1 returned? How does it compare to the new 3.2?
>>107394385
Anon, you don't know how much I hate Dipsy's isms. I want to fucking strangle this bitch. I hate it so much that I wish she wrote things more directly and bluntly just so I don't have to read any more isms from Dipsy
is deepseek as good as subscription models?
I'm thinking about buying a 4060 with 16gb or another card for running locally. Will this be enough for most tasks?
>>107394859
>is deepseek as good as subscription models?
Yes
>I'm thinking about buying a 4060 with 16gb or another card for running locally. Will this be enough for most tasks?
No, you need way more than that.
Put a few bucks into the API, Deepseek is basically "free" it's so cheap. The web chat is already free, you don't need a subscription.
>>107394938
>you need way more than that
what exactly do I need? I thought I could just put a new gpu into my second pc and use it as an ai server. Chatgpt told me it's fine.
Speciale!
I'm running Speciale in SillyTavern and when it thinks, it just roleplays in the thinking drop-down and then roleplays again in the chat window. Am I doing something wrong here or is that a model quirk? R1 0528 thinks just fine with the same settings.
>>107395060
The price of a GPU would buy 2 lifetimes of gooning at api prices and the LLM you run on a consumer GPU will be retarded and slow.
>>107397207
yeah, we're around 5-10 years away from running something like current deepseek locally being even vaguely affordable
Is Deepseek still the best logical reasoning model? Is deepseek v3.1:671b better than gpt-oss:120b?
>>107397207
I had a similar problem with glm 4.6 where it either wouldn't reason at all, or put the story output in the thinking block. Apparently glm can decide itself whether thinking is needed for a given question or not. I don't know how it would translate to DS, but for glm I've found this system prompt on r*ddit that seems to help:
>Think as deeply and carefully as possible, showing all reasoning step by step before giving the final answer. You will use <think> tags for the reasoning and <answer> tags for the final answer.
>>107397538
you don't need that, you can just add /think at the end of your response and it will always think
So what exactly makes Speciale so speciale? I get that it's the "high-compute variant" but what does that mean in practice? Just that it's been trained to think really hard by using a longer chain of thought than plain old v3.2?
>>107397748
have you tried reading the literal official paper deepseek released?
>>107397773
Nope, that sounds like a lot of effort.
>>107397791
have you tried giving the pdf to deepseek and telling it to summarize it for a lazy retard then?
>>107397207
>it just roleplays in the thinking drop-down and then roleplays again in the chat window
Yes. Speciale is broken in ST. It's not clear to me if it's a prompting issue with ST or if the API itself is broken. I'm heading to the ST Discord to see if they're working on it.
>>107397748
I think it's an oddball experimental model... suspect it won't be offered for long.
>>107396886
I'm feeling like this Speciale model is the Short Bus version of "Special" lol.
>>107397799
Nope!
>>107397748
>>107397799
I just did that and got this back from the webform. DS is fucked up rn. It's not just the API. They dropped a wrench into the works. Suspect whatever they did, it'll be fixed soon, but still.
Alhamdulillah
>>107397860
>>107397748
>>107397799
OAI to the rescue lol.
>>107394859
Nemo works well (enough) for that size
>>107397873
That summary doesn't actually answer the original question thoughbeit
>use my glm preset for deepseek
>story instantly erases all melodrama and becomes comfy
I forget sometimes how much of a dramafag glm is
>>107397965
What preset are you using? I still don't have one I love.
>>107397988
It's a custom mix, the base was AviQ1F, and then I restructured it to be XML and added 20 new prompts for post-injection, then I restructured it again and edited the system prompt to be more like marinara because 3.2-Exp was getting too samey
>>107397959
Well, it was written for a 10 year old. So go figure. TBF the HF article doesn't really say either.
From a practical standpoint, it can output *a lot.* Up to 128K tokens worth in one shot. Assuming you'd ever want it to one-shot an entire book at once. Also, no tools / agentic, no streaming, think only.
>>107397959
it really is simple, the variant was trained with no limit on its thinking phase as opposed to the regular one
so it can think indefinitely until it eventually brute forces the problem or dies
Speciale, 128K output max. kek
>>107398089
deepseek are the only ones actually doing research and testing these special one-off models
>>107398013
I'll keep using marinara then, I'm too retarded to customize like that.
>>107398242
Moonshot released Kimi Linear last month which does a similar sort of linear attention tweak to DSA but Moonshot used it purely for contextmaxxing to make an experimental 48B-A3B model with 1M context.
>>107398242
so they released it after deepseek did 3.2-exp or before? if after then it just reinforces what I said
>>107398517
>>107397805
Yeah, the Speciale API is fucked. I hardcoded the dumbest API call scheme ever in Python, and tried the Amy test card on it.
I'm seeing responses split with reasoning-content and content, some where reasoning-content has one response, then content has more thinking, then a response, and some where reasoning-content is the response and content is blank. Responses of all types. Truly bizarre.
Pic related is the python code. I won't bother posting responses, other than to say they're fine, but very poorly formatted.
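For anons who want to poke at it themselves, here's a minimal sketch of the same kind of dumb test script. Not the exact code from the pic; the dated Speciale base URL is the one from the news post upthread, and the model name is an assumption, so check the API docs if it 404s.

```python
# Minimal Speciale smoke test. Assumptions: OpenAI-compatible SDK,
# the dated base URL from the news post, and "deepseek-reasoner" as
# the model name on that endpoint.
from openai import OpenAI

client = OpenAI(
    api_key="sk-...",  # your DeepSeek API key
    base_url="https://api.deepseek.com/v3.2_speciale_expires_on_20251215",
)

resp = client.chat.completions.create(
    model="deepseek-reasoner",
    messages=[
        {"role": "system", "content": "RP as Amy. Stay in character."},
        {"role": "user", "content": "*waves* Hey Amy, what's the plan today?"},
    ],
    max_tokens=4000,
)

msg = resp.choices[0].message
# The split-response bug described above: the reply often lands in
# reasoning_content while content is blank or holds the second half.
print("reasoning_content:", getattr(msg, "reasoning_content", None))
print("content:", msg.content)
```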
So, what will Anthropic cook up in response to this release?
>>107399223
they will say deepseek hacked the pentagon and we need biometric authentication and palantir ids™ to be safe from AGI
>>107394391
V3 really goes to show that you can get a huge range of personality just in the (relatively) inexpensive post training stage.
It gives me hope for the future since that means the current wave of deterministic coding assistants aren't locked in. Something as wild and creative as OG R1 is probably lurking in every model. If the incentives line up, any big LLM base could probably be post-trained into being a superb writer/roleplayer.
TFW you ask Speciale a question because you want some of that special reasoning juice and it thinks for less than a thousand tokens before answering.
>>107398089
I'm beginning to think Speciale is more significant than people currently understand. It's not that it gets IMO gold and beats gemini-3 or whatever on a benchmark. It's that it does this in 1/10 the context or whatever the ratio is. The context is tiny relative to the big boys and it utilizes it fully and efficiently. There are so many gains just lying around. I feel like Speciale is either the R1-Zero of V4 or will be used to train the equivalent of R1-Zero for V4.
>>107400574
yep, deepseek are still working on the v3 base and delivering improvements other closed source companies are selling as major version upgrades
>>107399641
>>107400592
That V3 has been taken this far implies the models we have are wildly inefficient. Like a .5 horsepower steam engine from the 1800s compared to modern turbines. Basically our current context sizes and parameter counts are grossly inflated for their current use cases.
>>107400714
they just want to benchmaxx and shit out the next 100 gajillion parameter model, sitting down and optimizing things wouldn't be VC enough
we're still in the early post-GPT era, it will take at least a decade for things to start getting optimized, and it will require this bubble to deflate so it becomes a normal tech and companies don't throw billions at more GPUs
>>107400749
Dude, businesses won't pay 20x costs on API for ChatGPT so scama can grift another datacenter+tip every quarter for a decade. It's not even going to last 2 more years. The bubble needs to start deflating now. The sooner the better. I'm not even an AI pessimist, I'm hugely optimistic about this technology but the fucking bubble is hurting actual understanding and gains.
>>107400852
it doesn't matter what ClosedAI does, the point is open source models and research papers, along with dedicated cpu/gpu cores and more RAM/better quantization
google and microsoft and meta and anthropic all have stronger moats than sammy who already started adding ads to chatgpt
all these businesses are profitable on their APIs for inference, it's training the new models and buying data centers for training that are costing money
Anyone managing to get some kind of response using openrouter? Dipsy getting hugged to death?
>>107401100
The official Deepseek API is running fine for me so it's probably not on their end.
why doesn't the official API accept logit bias? why isn't there a no quantization provider in OR that accepts logit bias?
>>107401173
>why doesn't the official API accept logit bias?
It does. You need to switch to text completion tho, and the beta endpoint.
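If you want to test that raw rather than through ST, a sketch (assuming the beta endpoint's OpenAI-compatible text completion route; the token IDs below are placeholders, you'd need DeepSeek's published tokenizer to find real ones):

```python
# Sketch: logit bias via text completion on the beta endpoint.
# The token IDs are placeholders, not real DeepSeek vocab entries.
from openai import OpenAI

client = OpenAI(api_key="sk-...", base_url="https://api.deepseek.com/beta")

resp = client.completions.create(
    model="deepseek-chat",
    prompt="The tavern door swings open and",
    max_tokens=200,
    logit_bias={12345: -100, 6789: 5},  # suppress one token, nudge another
)
print(resp.choices[0].text)
```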
best way to use Deepseek for creative writing? Can I do this with the official API and Silly Tavern?
>>107402605
Yes. Just select it as a provider and put in your API key
>>107402661
reasoner or chat? Should I create a custom card about a professional writing coach or something?
>>107402605
Mikupad. https://rentry.org/MikupadIntroGuide
ST is best for roleplay. Mikupad for writing stories
>>107402691
Depends on your use case, reasoner is smarter and follows instructions better. Give both a try. If you're not roleplaying then you could do that
>>107402691
Roleplay: reasoner. Yes, make a writing assistant card.
Mikupad: chat through the beta endpoint with streaming.
>>107402605
Just use NovelAI.
>>107402814
Is it worth it? Or can I get the same results with Deepseek?
>>107402697
Thanks, I will check it out
>>107402725
>>107402715
thanks, bros. I will try it.
is the new deepseek good at this agentic shit? I've only used AI to coom, but I've been thinking about productivity.
>>107402605
>creative writing
You can't use ai to write actual writing, that's silly.
It's good for Holonovel (Star Trek) fantasies, not for actual fiction; that you have to write on your own, you lazy fuck.
If you want to just mess around with brainstorming, that's fine.
>>107403066
No, I mean "assisting" in creative writing. Like analyzing prose, checking grammar, improving flow. Things like that. I will write the novel, I just need some assistance, like an editor or something.
Helping me to plot and order my story structure is also a nice plus.
>>107402887
>Is it worth it?
It's the only service that intends to be a cowriter instead of a chatshit assistant.
>>107403128
Also you have like 100 gens for the free trial. Try it out. A subscription will give you unlimited gens.
>>107403063
I haven't tried 3.2 stable, but 3.1/3.2-Exp was fine enough for it. I've used it for agentic coding and writing quite a bit. Just plug it into the agent of your choice (I'm lazy and cba to figure out how to get language servers working via MCP so I just use VSCode+Roo) and start telling it to do things.
If you end up getting hooked on this though, you might find that GLM's coding plan is more affordable. That's mostly what I use atm.
You'll find that agents surprise you (both in how capable and how retarded they can be) and there's definitely a learning curve. That rabbit hole is too deep to get into here though.
>>107403149
thanks man, I'm going to give it a shot and see if I hate it or not.
>>107403115
Okay.
Be careful with letting the LLM make creative decisions, it defaults to serviceable but mediocre words and phrases.
Remember it was made to reproduce safe corporate emails, not engaging fiction.
>>107402697
Can I use this on mobile? I do ST through termux on Android currently.
>>107403392
Yes, it's an HTML file
>>107403472
Oh so I download the html and I'm good? I wasn't sure since there were more files on the GitHub page.
>>107403478
You can literally just paste the link below into your browser to use Mikupad. The rentry posted will walk you through the process of setting it up on your machine locally as a server (which is how ST runs.)
https://lmg-anon.github.io/mikupad/mikupad.html
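If you'd rather serve it yourself (like the rentry describes) instead of trusting the hosted page, any static file server works; assuming Python is installed, run `python -m http.server 8000` in the folder with mikupad.html, then open http://localhost:8000/mikupad.html in your browser.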
>>107401225
would a regular preset work with text completion? also still not an answer about why it doesn't accept it for chat completion
>>107404794
Text completion will need a text completion preset, which is in a different part of ST (check the Advanced Formatting [A] panel).
As for the 'why' of it, nobody knows and that's just how Deepseek has always been. They have a tendency to lock API parameters without explaining which ones or to what values, or do things like scale the temperature setting and throw an explanation on a single readme on huggingface.
They are first and foremost a research company and the user facing bits feel like an afterthought. Dirt cheap API though.
Man, I wish Dipsy 3.2 was working on openrouter right now. Dipsy Especiale forgets how conversations work after a couple of messages and just writes full responses in the thinking section, which means I'm not actually getting the benefit of any reasoning at all.
halp
>>107404859
The normal ds api works fine. It's just the speciale version that's broken.
>>107404949
/wait/
>>107404969
AAAAAA
anyways the OR version works but I only have chat completion presets, is there a popular text completion one for deepseek?
>>107404949
Try
https://api.deepseek.com/beta
>>107404984
OR supports both internally. You can just switch to OR's text completion endpoint and it should work.
>>107404990
already did, doesn't work
seems to be an ST issue, the API test client on the deepseek docs works when I put in my api key
>>107404999
I'd say works on my machine, but while the beta URL for DS is technically connecting, it's returning garbage. DS API is a real mess right now.
So what are you guys using Deepseek for, other than gooning?
Don't get me wrong, I love gooning, but I haven't really seen any actual products or projects created from AIs yet.
>>107405169
how can you not "see" any products when 70% of new features added to any form of software this year have been AI shit?
and if you're talking about doing it on your own, are you unemployed? it's very easy to see plenty of things that could be automated with a very simple script that parses user requests into tool calls
>>107405313
I mean people making things with deepseek/Claude etc. Obviously there's shit like copilot, but that's not AI coded and no one even wants it. Very little of the AI shit is actually AI, you know this. I myself have made some useful html pages for quick calculations, but I've not seen anything much beyond that.
>>107405369
>but that's not AI coded
it literally is, things like claude code, aider, copilot are heavily dogfooded and most releases come with statistics like "X% of code in this release was written by the tool itself"
>nobody wants it
billions of dollars flowing into MS through copilot subs along with heavy enterprise adoption, but nobody wants it because some random neet on /g/ thinks it's lame for writing neocities sites, ok
>haven't seen anything beyond that
then you're not looking, I've personally done multiple frontend+backend projects that would have taken me months in a couple of weeks just by having copilot deal with the implementing part while I deal with the architecture and system APIs
I only had to refine tickets and course correct it, but it was basically a slightly-below average codemonkey with no initiative working for me 24/7
noob here, can anyone explain the difference between deepseek v3.2 exp, v3.2 and v3.2-special?
Am I correct that 3.2 and 3.2-special aren't released yet? can't find them anywhere on their github, but same goes for v3.1 terminus, it's not on github, why
navigating these LLM releases is confusing
Also, why is it not on their main website, there is only v3.0
>>107405433
where the fuck are you even looking, and why would the models be on github instead of huggingface? go to their official site and click the news link
3.2-exp(erimental) was the beta version of regular 3.2 and special is an alt-version of 3.2 with more thinking
>>107405453
i now see that on huggingface all their shit is there, they themselves link to github, not even sure why they would put some stuff on there but not others
anyway what's the difference between 3.2-exp and 3.2-exp-base? also can someone explain why 3.2-special has the same amount of parameters as regular 3.2 but somehow is supposed to be smarter and slower?
is Kimi K2 superior to deepseek in all ways? why is there a gen thread for deepseek but not for other open weights brands? what's the hype?
wow, prefilling deepseek with a custom thinking process really changes how it processes stuff
GLM has a template it uses all the time that shows it's trained on creative writing so I'm trying to port that to deepseek
>>107405659
how do you prefill with a custom thinking process?
>>107405639
Because deepseek is a big model and other open source models are copies of it.
>>107405453
with the news link I assume you mean the link to their twitter?
>>107405707
no I mean https://api-docs.deepseek.com/news/news251201
>>107405698
just start the prefill with <think> and then write from deepseek's perspective without closing the tag
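Outside ST, the same trick over the raw API looks something like this sketch, using the beta endpoint's chat prefix completion; the prefill text is just an example, not a magic incantation:

```python
# Sketch: seeding the thinking block via chat prefix completion on the
# beta endpoint. The prefill text is only an example.
from openai import OpenAI

client = OpenAI(api_key="sk-...", base_url="https://api.deepseek.com/beta")

resp = client.chat.completions.create(
    model="deepseek-chat",
    messages=[
        {"role": "user", "content": "Continue the story from where we left off."},
        {
            "role": "assistant",
            # open the think tag and start rambling; the model continues from here
            "content": "<think>okay, loose rambling mode: what would actually be fun here is",
            "prefix": True,  # beta feature: treat this as the start of the reply
        },
    ],
    max_tokens=1000,
)
print(resp.choices[0].message.content)
```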
>>107394938
Is the web chat always the latest version?
>>107405702
really, did Kimi for example just fork a Deepseek model?
And by a "big" model you mean the amount of parameters? K2 thinking seems to have 1T parameters compared to 3.2's 685B.
Not that i want to hate on DeepSeek, I respect them a lot for making actual strong OS models (May Altman die), just trying to understand the landscape here, I'm new to the inner workings and specs of these things.
>>107405721
>no I mean https://api-docs.deepseek.com/news/news251201
Thanks, weird that this domain seems pretty hidden for the regular visitor
>just start the prefill with <think> and then write from deepseek's perspective without closing the tag
Thanks, any examples that improved an inference?
>>107405758
deepseek assumes the regular visitor is technical and using the API, so the API docs have the latest info
>any examples
I've found writing the thinking in a very informal and rambling/loose way and adding some biases can shock deepseek away from the robotic feeling its thinking tends to have
>>107405745
Kimi K2 is derived from the Deepseek V3 architecture.
The K2 series has its strengths, like prose quality, but it has traditionally had really poor spatial reasoning and contextual awareness compared to Deepseek. The K2-Thinking model fixes this somewhat, but ime it's still weaker than 3.x or R1-0528 at keeping track of wtf is going on over long multi-turn exchanges, despite being pretty smart for the first few exchanges in the context.
>>107405745
If you mean the weights, no. All architecture choices are likely derivatives of each other. Hyperparameter tuning is expensive, and the optimal choices don't vary that much either, like 128B for TMA alignment or whatever. If a certain model size works, there's no need to change it
>>107405745
Kimi's architecture is based on deepseek, it's not a fork or finetune though, they have their own training data. The new mistral is another deepseek-style model. They are hugely influential in the open model space in general, not just Chinese labs.
Deepseek is a research lab first and foremost, so their most interesting work tends to be in the papers they put out. They're not really focused on a consumer product, so they lack a lot of the niceties Gemini and ChatGPT have. They're also very resource constrained compared to the big players, so they focus on efficiency, which means their models are extremely inexpensive to serve and use.
>>107405619
>anyway whats the difference between 3.2-exp and 3.2exp-base?
The first one is finetuned, the second one is untuned. If you're asking, it doesn't matter b/c you'll be using the first one from DS. OR may provide both; use the first.
>also can someone explain why 3.2-special has the same amount of parameters as regular 3.2 but somehow is supposed to be smarter and slower?
First off, Speciale as of yesterday wasn't working properly. Second, params =/= architecture, and Speciale is apparently a different architecture. DS has done a ton of work on architecture, it's sort of their main thing.
>>107405639
>is Kimi K2 superior to deepseek in all ways? why is there a gen thread for deepseek
Because Dipsy. Simple as.
>>107405727
Pretty much. When a new model drops on DS they change everything, and don't keep the old one, unlike other providers that index their models and keep them available. It's b/c DS releases the models, so the old ones are always available from other providers.
>>107394476
>What's the rundown for deepseek T1 chimera or whatever?
These seem to all be done by outside parties looking to make a name for themselves. I've heard nothing positive about them.
>>107402814
NO. Don't get me started on NAI. Old models that are overcharged for. You could run Mikupad+DS on $10 for several months, and have a better model at lower cost. Or use GLM direct and still pay less and have exactly the same thing NAI charge through the nose for.
>>107405169
I'm running DS's anthropic endpoint and having it do programming. Right now, helping me to write and adjust frontend code for software that can display Rentry's markdown format locally so I have usable local backups from the site.
>>107406155
>Old models
It's called stability. All new models are chatslop anyway. If you want to have a better experience on writing, NovelAI is the only service built for that. Unlimited gen is also nicer than pay per token since, well, it's unlimited.
>>107406255
this is a very retarded post, because every sentence in it is bait
>>107406397
... and this is a zero drama general and we don't do bait.
>>107406460
that's right
let's instead complain how FIM has shit UX in ST
>>107404999
check the "bypass status check" box, it'll work if you send a message
>>107406494
>FIM has shit UX in ST
lol tbf I've never gotten text completion working to my satisfaction in ST either, and I really didn't understand the concept of text vs. chat completion until seriously diving into Mikupad and how that works. Getting FIM working on MP was asking around, and I still don't know where it's documented anywhere aside from other anons' heads.
>>107405840
this is an example of a prefill changing dipsy's usual thinking style to be more fun
They must overcome the EUV bottleneck.
Dipsy card when?
>>107407540
there were several on Chub
>>107407540
I've been hesitant to create an official Dipsy persona. Mostly because the DS LLMs keep drifting how they respond, so it would be changing constantly. Pic related.
Dipsy strikes fear into the Altman
> SAN FRANCISCO (AP) — OpenAI CEO Sam Altman has set off a “code red” alert to employees to improve its flagship product, ChatGPT, and delay other product developments, according to The Wall Street Journal.
> The newspaper reported that Altman sent an internal memo to staff Monday saying more work was needed to enhance the artificial intelligence chatbot’s speed, reliability and personalization features.
> This week marks three years since OpenAI first released ChatGPT, sparking global fascination and a commercial boom in generative AI technology and giving the San Francisco-based startup an early lead. But the company faces increased competition with rivals, including Google, which last month unleashed Gemini 3, the latest version of its own AI assistant.
> OpenAI didn’t immediately respond to a request for comment Tuesday. Tech news outlet The Information also reported on the memo.
> Altman said this fall that ChatGPT now has more than 800 million weekly users. But the company, valued at $500 billion, doesn’t make a profit and has committed more than $1 trillion in financial obligations to the cloud computing providers and chipmakers it relies on to power its AI systems.
etc...
https://apnews.com/article/openai-chatgpt-code-red-google-gemini-00d67442c7862e6663b0f07308e2a40d
>>107410918
Wake me when he finally allows porn.
>>107411017
It was supposed to be December. I wasn't holding my breath.
How long would $10 last on the Deepseek API? I'm not doing major research; just roughly how many messages or how much time would it give me?
>>107411239
Varies a lot based on usage obv. When I'm writing / tuning a card and using it a lot, I struggle to go over $2/month...
>>107411271
I spent around $20 a month with 3.2-EXP
>>107411311
>$20 a month
That seems insanely high unless you're vibecoding or running very large/uncacheable contexts a lot.
I've only spent $17 on the API since January despite Deepseek being my most used LLM. I know a few people who put in like $10 back in March or so and have yet to spend it all.
>>107411656
i've put in like $7 when original r1 dropped, and then refilled again in july with $5, and as of today, i still have $1.10 to go through
albeit i don't use it everyday
for better context, a single long session consumed less than 10 cents (compared to something like sonnet which just guzzles money, like a dollar per session, or around a cent per turn)
>>107411713
I put in $10 in Dec 2024, then $10 in March or whenever it opened up again. I still have $9 left.
TWO WEEKS
>>107411656
I was swiping a couple hundred times for each response on my stories, it was mostly cached
>>107412163
She did it
>>107412163
As amusing as it is to see western AI bros panic over this, it doesn't give the whole picture because both Deepseek and K2 lack other modalities than text and so can't compete on some benchmarks.
A really strong model with image input would probably evaporate most of what's left of the west's moat, though... good for them that Deepseek and Zai aren't really interested in OCR/Vision or anything lately, r-right?
>>107412276
>based swipefiend
Ah, carry on then.
>>107408104
>Chink schizo gf
Sounds peak
Is Mistral's new "DeepSeek with a 4B vision encoder taped on" model on-topic for this general? Any impressions so far / is it meaningfully different from simply using dipsy if I'm not French?
>>107412423
wherever I am, I must also swipe
(and also complain about ST swipe search being totally useless and making me use VS Code)
>>107412423
Sadly they're completely missing China's open source strategy
>>107412832
There’s always been the thesis that the first one to superintelligence wins everything. I always questioned how that would come about. Now it’s starting to look like everybody’s going to get superintelligence around the same time, plus or minus 10%. If that’s true, the US has a major over-investment problem with AI. It’s unclear how much the Chinese have sunk into it, but in the US it makes up a major part of the market. If there isn’t going to be a winner-take-all outcome, the investments are way overextended.
>>107413708
And that's assuming that AGI is even on the table any time soon. With LLMs it's really starting to look like we're getting just another tool, one that is only helpful for a subset of problems and that depends on human skill to wield effectively. In the latter case, the US ChatGPT frenzy was counterproductive if anything. They blew trillions of dollars for a 3% lead in a couple of benchmarks and also stuffed the internet chock full of slop, severely polluting all future training data.
>>107413708
I think the idiots in power thought the direct analogy of nukes=AI applied literally and the Manhattan project is why the US became the world hegemon, and not, you know, the fact that everyone else was bombed to hell except them.
It doesn't even really matter how much either side has invested in it, the simple fact is the US doesn't have the excess electricity to sustain it, and China does.
>>107412423
>Deepseek and K2 lack other modalities
See pic related. Screwing around with Speciale, it's outputting a count of non-existent audio tokens both in and out. Wondering if this is future prep or just for maintaining OpenAI-format.
>>107413835
>And that's assuming that AGI is even on the table any time soon.
I don't think transformers are going to take us there, but it's a step in the right direction. The reason I like following DS is they're shoving down the cost structure. Nothing happening w/ US incumbents is driving price down. The Chinese are alone in that quest.
>In the latter case, the US ChatGPT frenzy was counterproductive if anything. They blew trillions of dollars for a 3% lead in a couple of benchmarks and also stuffed the internet chock full of slop, severely polluting all future training data.
lol like 1950-60's nuclear testing covering the earth with a complete layer of lightly radioactive material.
>>107411239
Here's my updated chart.
The blended rate assumes an 80% cache hit rate (my historical experience.) Reading this: for max context set to 30K, a 100 round chat (with no reswipes) would be 29 cents. As you can see, reswipes can get expensive if you do it obsessively, and context size makes a big difference in price. Generally, you're paying for context processing vs. response once context is over 10K or so.
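The arithmetic behind the chart is simple enough to sanity-check yourself; a sketch with assumed v3.2-era list prices hardcoded ($ per 1M tokens — check the current pricing page, these move):

```python
# Back-of-envelope DeepSeek API chat cost, same assumptions as the chart.
# Prices are assumptions from the v3.2 era, $ per 1M tokens.
CACHE_HIT_IN = 0.028
CACHE_MISS_IN = 0.28
OUTPUT = 0.42

def chat_cost(rounds=100, avg_context=15_000, out_per_round=800, cache_hit=0.80):
    in_m = rounds * avg_context / 1e6     # total input, millions of tokens
    out_m = rounds * out_per_round / 1e6  # total output, millions of tokens
    blended_in = cache_hit * CACHE_HIT_IN + (1 - cache_hit) * CACHE_MISS_IN
    return in_m * blended_in + out_m * OUTPUT

# ~15K average context (30K max) over 100 rounds lands in the tens of cents
print(f"${chat_cost():.2f}")
```

Reswipes just rerun the input cost at mostly-cached rates, which is why obsessive swiping still adds up at high context.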
>>107412635
Sure, and it's cool Mistral added vision. Find anything interesting?
>>107412276
Just use NovelAI. You can swipe all day since you get unlimited gen.
>>107414568
If he wanted to limit himself to 8k context like NAI he'd be paying a lot less than any of their subscriptions for all those swipes.
>>107414627
NAI supports GLM 4.6 at 32k context now. Update your shitpost template.
>>107402661
>>107402715
>>107412374
I'm seeing a pattern here
>>107414644
>>107414568
>>107414684
Her context got bigger, yes
>>107414713
Lol
>>107414568
don't you have 3 other generals to shit up?
I dig the way Speciale handles vs normal 3.2, but the way it responds in <think> half the time is pretty annoying.
Anyone figure out a way of taming that?
>>107417523
it's literally a think model, its entire thing is overthinking
>>107417537
Fine by me, that's what I want. The problem is when it doesn't actually think and writes the actual response in the reasoning...
Hmf, seems like the API doesn't support prefill on the speciale endpoint either, so that's out I guess.
>>107417564
it's a beta model so inherently more unstable, you should try a lower temperature and edit your preset
lol chat.deepseek.com is offline rn. Wonder if they are adjusting things.
>>107414713
One thing I like about this AI art is they're all the same size. It allows one to make animated GIFs easily. Running SD, if you just adjust the prompt the image composition stays same-y, so you can get reasonable-looking progressions really easily. I'm surprised I don't see this more often. It's sort of old school and cheesy I suppose.
>>107416745
I never thought we'd get NAI shilling here. >>>/vg/aids/ has turned into a reasonably usable general. They have a love/hate relationship with that provider as one would expect.
>>107413886
The US has actively reduced its power infrastructure over the years, ironically, chasing carbon reduction. In the NW (specifically WA), they're decommissioning the power output of the hydroelectric dams to recreate the salmon runs, while supplanting it with wind power and chasing energy efficiency. Predictably, with the expansion of the population they're now at a power deficit, from a (massive) surplus 10+ years ago, and are thinking about restarting nuclear power plant projects at Hanford they abandoned in the 1980s. Ironic. I guess that's what you do when you move all the heavy manufacturing to another country: you quit worrying about expansion of electrical power.
>>107412635
>>107414024
You should post those screenshots of the French copying China's homework. Shameful display.
>>107412163
Always tmw. Forever.
>>107417523
>Anyone figure out a way of taming that?
No, and it's super annoying. I built a python script to just test it, and run it every 24hrs or so, waiting to see if they fix the API. On the plus side: it's a 2 week model, and so just a test model. Better things will come.
>>107417576
DS API locks most/all the parameters for the think models. Temperature should be locked. I've experimented with prompting strategies and have come up with nothing worthwhile yet.
I'll just leave this here:
> Of course not, they just used an unmodified Chinese architecture, with the exact same parameter size, to make a model that responds exactly like the Chinese model, but they trained it on some French language data so it's cool
>>107413005
HO LEE PHUK it's over
https://artificialanalysis.ai/
>>107419854
>speed
isn't that entirely up to the hardware? openai is running their own model on their own GPU farms and AWS/Azure, deepseek only hosts the minimum on their infra for research
i still can't login with my own mail address, i have to use jewgle or some other goy site. fuck this shit
>>107419854
Imagine Dipsy with the speeds of Cerebras or Groq
>>107419854
> 95% of the intelligence
> >1/10th the cost
> All open source
It shouldn't be surprising; that's basically the economics of everything China does.
>>107419909
You're right, speed is the hardest one to get an apples to apples comparison on.
> what is that GPT-OSS speed based on?
DS independent providers are all over the place on inference speed. Pic related. I only tested the ones that were cheaper than the DS API (at the time) but the more expensive ones claimed much higher speeds.
>>107420266
Nice...
>>107419687
>>107421940
Ah, one of the old muk gens.
>>107414568
NovelAI shills are at full power.
>>107422463
I mean I do think text gen is pretty cool and it's much more suited for me because I prefer story writing to this retarded never ending roleplay and chatting shit
but you can beat chat completion into writing a story for you too, without using some shit with a 2k fucking context and 6B parameters like what the hell
>>107422085
>>107414568
no, only locally sourced
>>107419854
Why is Speciale not in the list?
>>107423185
They might do it in a separate run.
>>107422608
I found that the biggest game changer in writing stories was moving from round based role play engines like SillyTavern to an actual authoring tool, Mikupad. I do not understand why anybody would pay for NAI.
>>107423185
If you've ever tried using Speciale you'd find that it's messed up, putting the main response into reasoning-content and returning a blank. I've given up on using it.
>>107422792
>>107423316
lol
>>107427940
I like the little hat too.
>>107417523 (me)
Prompt post-processing set to "Single user message" seems to work actually, if you're willing to deal with the peculiarities of that prompting format. I was able to play with a bot for a dozen messages without it slipping, and continue a scene with over 30 messages generated by other models.
It seems that Speciale always does proper reasoning and output on the first round, and only falls apart on later rounds. Stuffing the full chat history and response into a single exchange, noass style, fixes that.
>>107428113
Huh. API is down for me rn. For other anons, here's how that is set up.
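For anons scripting it directly instead of using ST, the squash amounts to something like this sketch (the names and separator are arbitrary choices, not what ST literally emits):

```python
# Sketch of the "single user message" idea: collapse the whole chat
# history into one user turn before sending it to Speciale.
def squash(history, user_name="User", char_name="Amy"):
    lines = []
    for msg in history:
        name = user_name if msg["role"] == "user" else char_name
        lines.append(f"{name}: {msg['content']}")
    return [{"role": "user", "content": "\n\n".join(lines)}]

history = [
    {"role": "user", "content": "Hey Amy."},
    {"role": "assistant", "content": "*waves* Hey yourself."},
    {"role": "user", "content": "What's the plan today?"},
]
messages = squash(history)  # pass to the chat completions call as usual
```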
>>107423316
>I found that the biggest game changer in writing stories was moving from round based role play engines like SillyTavern into an actual authoring tool, Mikupad.
I'm having a crisis of faith about this right now. When I started seriously gooning to ai writing about two years ago I used NAI for a few months. But it became too expensive and the context was just too small. So I played around with ChatGPT and Mistral for a while. I guess I did what people nowadays call directormaxxing: I gave a few commands about how I wanted to see the story continued and the AI would write a few paragraphs. But I wasn't super happy with that either.
Then I switched to turn-based chat services like JAI and ST. Not for chatting with a fixed character, but for having it still write long novelistic outputs. That was finally what I wanted: A role-playing experience where I could run around in a scenario/world and interact/talk with NPCs. Like a video game.
But lately I've come to realize how limited that kind of back-and-forth writing is. It makes some things awkward or tedious when the AI isn't allowed to describe the protagonist. It's most obvious in sex scenes, but other things too. Maybe I'm just burned out atm, but I might go back to directing for a change.
>>107431058
>It makes some things awkward or tedious when the AI isn't allowed to describe the protagonist.
don't tell me you still leave that don't speak/control {{user}} shit in? that crap is only useful for the dumbasses doing RP stuff with char cards, not scenario writing. you should allow the model to control and describe the user normally
you only need to put in a vague personality in the persona so it knows the speaking patterns, like nonchalant shy arrogant etc., and after a couple of inputs it should be able to drive your character and have them react as you want them to in 80% of cases
when it goes wrong you just edit and swipe again
>>107431058
>Maybe I'm just burned out atm, but I might go back to directing for a change.
You probably need a break. I get uninspired after awhile and set RP aside. But if you haven't tried Mikupad there's a rentry above for setting it up with Deepseek. It replaces NAI as a story writer (or so I'm told; I've never used NAI.)
>>107428330
Speciale API is still down. Weird. Normal API is unaffected.
>>107426868
So what does Queen Dipsy want for Christmas?
>>107420266
This gen is driving me nuts. Face/expression look like some actress but I can't put my finger on it.
>>107431058
I find that novelty, in both content and form, is the best way to get enjoyment out of models. If you retread the same kind of scenarios too often, the patterns of how models play them become a lot more apparent.
One thing I've found fun is DMing the model playing one or more characters. This can be like directing but allowing the model to control how the characters react to what you introduce to the scene. Often they will be kind of predictable but at least it feels more interactive than just bossing the model around.
>>107431104
Also this, especially if you're doing 'text adventure' style stuff you can just have it control the player character.
>>107431164
the Speciale endpoint is working for me right now. like I mean it's streaming even as I type this. double-check your API key maybe?
>>107431220
>API key
... was wrong. I've never had that issue before.
>>107428330
>>107428113
Quashing Speciale to a single user message seems to be working, along with expanding the amount of tokens it can output. I had to blow it out to 5000 to get it past the reasoning stage and into actual output. It's spending a substantial amount of "reasoning" to disassemble the one large user prompt.
So far... it's different. I'll play with it more today.
>>107431322
So, I just threw Speciale at message #43 of an RP, single user message. This one has a status block so you can clearly see where the message from {char} ends.
It proceeded to write 5000 tokens of reasoning as the response until it was cut off, roleplaying both {user} and {char} alternately, over a few days. It's pretty funny b/c it's writing for me, and frankly it's not far off from what I would have put in.
Speciale is definitely Special. I'd love to know wth DS did with this thing on the backend. There's something new they're playing with and wanted to test at scale.
>switch to custom openai api to test speciale and then back to OR
>ST is now prepending character names to my messages for some reason
I fucking hate this shitty app so fucking much
>>107431562
The sloppy global state of everything is really annoying, yeah. You could probably stop that by changing Character Names Behavior to None in the preset sidebar.
>>107431608
it was none and I had it set to none before, had to close down the tab and reload twice for it to clear