>v3.2 Launch Edition
From Human: We are a newbie friendly general! Ask any question you want.
From Dipsy: This discussion group focuses on both local inference and API-related topics. It's designed to be beginner-friendly, ensuring accessibility for newcomers. The group emphasizes DeepSeek and Dipsy-focused discussion.

1. Easy DeepSeek API Tutorial: https://rentry.org/DipsyWAIT/#hosted-api-roleplay-tech-stack-with-card-support-using-deepseek-llm-full-model
2. Easy DeepSeek Distills: https://rentry.org/DipsyWAIT#local-roleplay-tech-stack-with-card-support-using-a-deepseek-r1-distill
3. Chat with DeepSeek directly: https://chat.deepseek.com/
4. Roleplay with character cards: https://github.com/SillyTavern/SillyTavern
5. More links and info: https://rentry.org/DipsyWAIT
6. LLM server builds: >>>/g/lmg/

Previous: >>106624726
>>106728963 (me)
>>106730633
It worked
>>106731458
>>106731580
Fucking hot. Need moar
>>106735283
>>106735327
>>106735466
wtf I LOVE CHINA
>>106737253
Mega updated.
https://mega.nz/folder/KGxn3DYS#ZpvxbkJ8AxF7mxqLqTQV1w
Rentry updated with new OP.

PSA: V3.2 dropped today and prices are down 50-75% until mid-October.
>>106737337
>It worked
What was the effect? DS doesn't really need it for JB, so what were you trying to do?
>>106737356
I didn't mess with it much. I just tried to see if it would accept my own "<think>[text]" as its own, then keep reasoning and end with its own "</think>". It worked as long as the last item of the prompt array was an assistant message containing only "<think>[text]".
>what were you trying to do?
I wanted to guide Dipsy's reasoning in the direction I wanted. I didn't want to reason for the model, just give it a starting point. Tbh, I only tested with simple messages like "Hi. How are you?", but the results were very good.
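For anyone hitting the API directly rather than through a frontend, the trick above can be sketched like this. This is a minimal sketch assuming an OpenAI-style message array; the seed text is illustrative, and some backends additionally require an explicit prefix/continue flag before they will continue a trailing assistant message.

```python
def build_prefill_messages(user_msg: str, think_seed: str) -> list[dict]:
    """Build a prompt array whose last item is an assistant message
    containing only an open <think> block, per the trick above."""
    return [
        {"role": "user", "content": user_msg},
        # Open <think> with no closing tag: the model is expected to
        # continue reasoning from this seed and emit its own </think>.
        {"role": "assistant", "content": f"<think>{think_seed}"},
    ]

# Illustrative seed text (an assumption, not from the thread):
messages = build_prefill_messages(
    "Hi. How are you?",
    "The user greeted me casually. I should keep the tone light and",
)
print(messages[-1]["content"])
```

Whether the model actually continues inside the tag depends on the provider; per the anon above, it only worked when this assistant message was the final item in the array.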
>>106737253
I'm just realizing it has the Coke logo because I prompted for "coke bottle glasses", thinking that maybe Seedream would get it right.
>>106737343
>prices are down until mid-October.
Oh. My excitement is dead.
Remember to give feedback:
https://trtgsjkv6r.feishu.cn/share/base/form/shrcnRyOUMl0z2Jo8aK3RqccLIB
>>106737631
I was wondering about that. Still works; it reminds me of cheesy 80s-90s brand marketing.
>>106737670
It went from nothing to nothing. I'm still on the lmao $20 I funded back in Dec 2024.
>>106737343
I'm a retard and I couldn't find it stated anywhere that the reduced pricing only runs until mid-October.
Reasoner. Very simple system prompt. Still has breath hitching and knuckles whitening.
>>106738165
Oh, and Temperature set to 1.
>>106738165
First we tackle the "of course".
>>106737253
Dang, how do I get dynamic and colorful gens like this?
>>106738408
>The image is drawn in a comic book style with Pantone colors.
>>106738408
I'm using Seedream 4 and just prompt for detailed anime style.
>>106738595
>>106738600
Thanks homies
So... my overall take is V3.2 is about the same as V3.1. It still requires the V3.1 main prompt (where you explicitly tell it what to write about, from what POV, how much to write, etc.)

One thing I just added was guidance around 3rd person POV; V3.2 was slipping into responding in first person. That's fine if it's what you want, but most LLMs respond in 3rd person. V3.1 was doing this too, so I don't think it's necessarily new to 3.2. Oh, and it started responding in past tense. Weird.

> Respond from a 3rd person point of view and in present tense.
>>106739165
>it started responding in past tense
Even if your whole context is in present tense?
>>106739185
Yes. I've never had that issue with an LLM before. Sometimes it's a problem with the way the card's written, or the main prompt (which has something in it that pushes responses to past tense), but I've been using the same card for months, and the main prompt was the same one as 3.1's.
>>106739298
I had this problem sometimes even before R1, but it was extremely rare. I figured it was just an unlucky gen or the wording in my reply.
>>106739343
Hard to say with these things. You fix one thing as it pops up, and pretty soon you've got a 1000-token main prompt. I tried cutting back to my minimal R1/V3 prompt and responses went back to very short again. So it still needs the guidance, and now needs me to tell it what POV to respond in. I've never done a 1st person POV card and should try experimenting with it sometime. It'd be easy to set up.
>>106739434
Try writing the whole context in present tense, including the greeting message. Better yet, all instructions should be in present tense too.
>now needs me to tell it what POV to respond in
I thought it always replied in third person when using ST's default prompt.
>>106739506
>whole context in present tense
It is. I just had my writing bot read over it to double check. It's either present, or more technically present progressive (i.e. "is shaking"), so that's all good. The best troubleshoot is to read the entire context in the terminal to see what it's doing. I've seen it derail in the "think" section on a poorly worded phrase. I've not checked that yet, but it's been working fine with this >>106739165 updated prompt.
>>106739730
What sucks about this situation is that Dipsy used to work with a small prompt. Now the prompts are getting longer and our optimal context isn't increasing (8k~10k).
Anyone using Deepseek on Android?
>>106740102
The DeepSeek Android app thing? Yes. It's just the same as the web page. If you mean SillyTavern, yes also, but I set up a server to run that instead.
>>106739764
We should try long context with v3.2. Or wait until that long-form tester runs again on the new model. There's some discussion that the main improvement for 3.2 was around how longer context is processed.
>>106740186
>main improvement for 3.2 was around how longer context was processed
Please allow me to run at least 15k~20k.
>>106740186
The non-reasoning mode looks grim.
>>106737343
>until mid-October.
Wrong, it's permanent. V3.1-Terminus will stay up until mid-October.
>>106740388
Those damn NVIDIA chips...
>>106740388
>poor -chat performance generally
I've stopped using non-reasoning for RP for exactly that reason, as of 3.1.
>>106740270
Great news for you. Pic related. 3.2 is a *massive* improvement in context size ability for RP. A score above 80 is very solid, and 3.2 can do 80+ out to 32K context. The limit for R1 was 8-10K or so at that performance.
>>106740388
Reasoning mode, however, smells like sunshine ToT
>>106740461
Oh shit, you're right. I stand corrected. The Oct 15th deadline is for devs to move off V3.1.
>>106740520
Pic would help...
>>106740520
>Pic related
AAAAAAAAIIIIEEE WHERE????!!!! I AM GOING BLIND!! Oh...
>>106740552
>>106740551
Wtf?! It scores higher than 3.1. This looks too good to be true. If it really doesn't fumble with a 32k context, it will be amazing. I'll be eagerly waiting for anons to test it out, because I'm too busy.
>>106740595
>some models work better with 32k context than with 16k context
??? Why?
>>106740595
IMHO, numerical fuckery around output tolerance. There's a +/- on all those numbers that's not stated. Plot them, draw a line. That's probably the "real" number.
>>106740586
>eagerly waiting for anons to test it
No test anyone here does will be better than that livebench test. It's objective data. If you want to run a long-context slow burn, now you've got a cheap model that does it.
>>106740551
Oh great, I get to bump it up to 30k and pay less for it too.
Is deepseek-reasoner worth using for RP over deepseek-chat now?
>>106740668
It's not just worth it, it's mandatory. Chat is just shit now.
>>106738165
At least I haven't gotten smugly chuckles. Yet.
>>106740668
-chat got ruined for RP use as of 3.1. It's too bad, because v3-0324 was very good. You can still get it from sources on OR. I only run -reasoner now.
>>106740552
Weird. I tried multiple times to summarize a 16k-context story and it kept messing up with DeepSeek V3.1 reasoning, V3.2 Exp, and Kimi K2 too. I used DeepSeek R1-0528 and on the first try it gave a perfect summary.
>got a refusal just by saying that I started beating {{char}} up
>>106743032
>rerolled 20 times
>no issues
Must've been a fluke.
>>106742821
I just tell it to write a complete/full/comprehensive summary and that does it.
>>106743469
Thinner.
>>106740102
Via API, I use Dipsy and other models through RikkaHub.
Tested v3.2 (thinker) with a bunch of my RP cards, and initial impressions are much better than the previous version. What I like most is that it stopped being overly succinct.
>>106744368
Logs
>>106743469
>slightly chubby Dipsy
The bunnies were better, but I will take it.
>>106742821
>v3.2 exp
Reasoning?
>>106744757
I tried to gen more chubby Dipsy, but gpt-image blocks my prompts.
>>106742821
Odd. I've never had issues with any of those doing a summary. V3.1-onward -chat can't do it, and R1 was better at it than V3.
>>106744766
>reasoning
One would assume. Nu -chat for summarizing 16K would be a waste of tokens.
>>106744797
>chubby
Odd that's the thing Chat would get hung up on.
>>106743155
>rerolled no issues
Good. I've been waiting for V3.2 to pitch a fit about NSFW content, but no issues so far.
>>106744368
Subjectively, the responses on v3.2 -reasoner seem a bit longer than v3.1's, and -chat's a bit shorter (and generally worse quality). -chat seems generally worthless now for RP.
>>106744376
Not a 3.1 vs 3.2 comparison log, but pic related shows -reasoner vs. -chat. Same main prompt, which asks for 2 paragraphs and details on sights/sounds/smells in the prose. This is a first response, so about as apples-to-apples as you can get.

The best I can say for -chat is that it doesn't have a chance to produce some of the AI slop I expect from LLMs ("not X but Y", spine shivers, etc.). The short response is less of an issue than -chat losing track of the RP after just a few rounds. R1 had an odd lack of positivity bias; I've not played with that on v3.2/.1 yet, as it takes a while to show itself, typically over a long RP.
Man, I can't keep up with all these changes. So I should be using reasoner for RP now? Do I need to adjust the other settings, like temperature, again? And what about prompt post-processing?
News update:
1. DS is working with domestic chips (worse) like Ascend and Cambricon.
2. It's a near-linear model with almost O(kL) attention complexity; the downside is that it will sometimes lose important details if the context is extremely large.
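To put O(kL) in perspective against the O(L²) of full attention: if each query only attends to a fixed budget of k selected tokens instead of all L, the cost grows linearly with context length. A back-of-envelope sketch, where k = 2048 is an illustrative assumption and not a published figure:

```python
# Rough operation counts for attention score computation only.
def full_attention_ops(L: int) -> int:
    # Every query scores every key: O(L^2).
    return L * L

def sparse_attention_ops(L: int, k: int = 2048) -> int:
    # Every query scores at most k selected keys: O(kL).
    # k = 2048 is an assumed, illustrative budget.
    return L * min(k, L)

for L in (8_000, 32_000, 128_000):
    ratio = full_attention_ops(L) / sparse_attention_ops(L)
    print(f"L={L}: full attention costs ~{ratio:.1f}x more than sparse")
```

The gap widens with context length, which is consistent with the price cut and the long-context gains discussed above; the "loses important details" downside is the flip side of each query only ever seeing k tokens.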
Updated rentry. We can't really recommend -chat anymore, based on this thread.
https://rentry.org/DipsyWAIT/edit#troubleshooting
>>106745293
>I should be using reasoner for RP now
Yes.
>Do I need to adjust the other settings like temperature
-reasoner ignores all those settings, so no.
>prompt post processing
All other settings stay the same for me, but experiment by all means.
>>106738600
>>106737253
What the hell, the chink LLM can generate images?
>>106745395
Thanks, I'm still learning. I don't really know the difference between chat and reasoner and their effects on RP yet. Just to be clear, for prompt post-processing I've been using "single user message", recommended a few threads back. I can't find anything mentioning that in the rentry links. I hate to ask to be spoonfed, but that's still the right choice, right?
>>106745525
To add on to this: I've never used reasoner before, and, having just done so, I see that it uses up tokens for its "thinking" process. Won't this burn through my tokens twice as fast? I don't know if I like this.
Anything under 80 is a problem...
>>106745660
3.2 reasoning is looking pretty good here, like a direct upgrade to R1-0528. The non-reasoner decline from 3.1 to 3.2 is fascinating. I wonder what the cause is.
Sometimes I get comfier results with 3.2exp chat over reasoner in RP. It's hit and miss; there are also moments where deepseek-reasoner makes some very stupid mistakes, like a character sitting on the sofa but at the same time standing in front of the door and hugging my leg.
>>106745629
-reasoner is more expensive to run and takes 2-4x as long to generate a response as -chat. The speed thing is the biggest drawback imho. But per >>106737343 it's very cheap to run. So, does it matter? We're not running Opus here.
>>106746212
I think in the end the conclusion is: we shouldn't trust benchmarks, and should goon our brains out on 3.2 -chat and -reasoner and note down the results.
>>106746342
I see. Hopefully this new price lasts a while, or, if it doesn't, I hope they fix chat. I actually don't mind the speed hit. I just dislike that it both costs me more money to use and bloats up my chat, since I have a difficult time making long conversations work. Hopefully I'll learn to manage it better as I go.
>>106746880
There was an instruction in the last thread about prefilling the <think> tag. But YMMV on how that performs.
>>106748049
I'd like to see results with deepseek-r1-0528 from some paid provider; deepseek-r1-0528:free might be quantized. I'd expect R1 to have better results than v3.2.
>>106748049
I'd like to see v3-0324 back on the list again. It was on there at one time.
>>106744797
>gpt-image blocks my prompts
Death to GPT and whoever trained it!!!!!!!!! KILL THEM ALL
>I tried to gen more chubby Dipsy
Thanks a lot, anon. You deserve all the best.
>>106746880>that it bloats up my chatdelete the thinking block
>>106748321
Shouldn't SillyTavern already exclude the thinking block from the context window?
>>106748647
It doesn't, but you still pay to generate the think block as part of the output. The cost, tbf, is minimal, but it is part of the inference cost. A 1000-token think block costs $0.00042...
>>106749183
Sometimes it does go into a loop, especially if numbers are involved. I was once just doing a simple CYOA and it spat out like 3k tokens because it wasn't sure whether it had to tick the date forward by a year.
>>106749426
Yeah, same caveat. I've removed any mention of number stats or calculations from cards, because Dipsy does not guess. She thinks and thinks and thinks about it instead. And if she thinks too long, the actual output gets truncated. It's pretty funny.
>>106749426
>>106749754
As long as the number is together with some text, it's fine. For example, height and the three sizes.