/g/ - /wait/ DeepSeek General - Technology



Anonymous
/wait/ DeepSeek General 06/29/26(Mon)19:38:55 No.109164957

File: dipsyOrangeLaptop.png (1.94 MB, 1024x1536)

From Human: We are a newbie friendly general! Ask any question you want.
From Dipsy: This discussion group focuses on both local inference and API-related topics. It’s designed to be beginner-friendly, ensuring accessibility for newcomers. The group emphasizes DeepSeek and Dipsy-focused discussion.

1. Easy DeepSeek API Tutorial: https://rentry.org/DipsyWAIT/#hosted-api-roleplay-tech-stack-with-card-support-using-deepseek-llm-full-model
2. Easy DeepSeek Distills: https://rentry.org/DipsyWAIT#local-roleplay-tech-stack-with-card-support-using-a-deepseek-r1-distill
3. Chat with DeepSeek directly: https://chat.deepseek.com/
4. Roleplay with character cards: https://github.com/SillyTavern/SillyTavern
5. More links and info: https://rentry.org/DipsyWAIT
6. LLM server builds: >>>/g/lmg/

Previous:
https://desuarchive.org/g/thread/108674648

Anonymous
06/29/26(Mon)19:40:21 No.109164961

2deep4u

Anonymous
06/29/26(Mon)19:40:58 No.109164965

File: dipsySP.png (1.89 MB, 1024x1024)

>>109164957
Updated mega up to last thread (from LMAO April 2026)
https://mega.nz/folder/KGxn3DYS#ZpvxbkJ8AxF7mxqLqTQV1w

Anonymous
06/29/26(Mon)20:16:00 No.109165123

File: tmw.png (2.33 MB, 1536x1024)

https://github.com/ggml-org/llama.cpp/pull/24162
DeepSeek V4 support was merged into llama.cpp with the PR above.

This implements the model's novel compressed attention mechanisms:
> CSA (Compressed Sparse Attention): A variant of DeepSeekV3.2's DSA that attends to "compressed tokens" (every 4 tokens compressed into 1) plus a window of the last 8 tokens.
> HCA (Heavily Compressed Attention): Standard attention over heavily compressed tokens (128:1 compression) combined with sliding window attention (SWA).

This introduced compression plans (comp_plan) managed by the context and executed on the GPU. It also handles the necessary KV cache management: both CSA/HCA caches are non-unified llama_kv_cache objects, with an SWA cache wrapper exposing only the sliding window portion. The attention layout is structured as [swa entries | compressed block entries].
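
To get a feel for what those compression ratios buy, here is a rough back-of-the-envelope sketch. It only counts entries in the [swa entries | compressed block entries] layout using the ratios quoted above (4:1 with an 8-token window for CSA, 128:1 for HCA). The HCA window size is not given in the quote, so `HCA_WINDOW` is a made-up placeholder, and `attended_entries` is a hypothetical helper, not anything from llama.cpp. CSA is sparse, so treat its number as an upper bound on what actually gets attended.

```python
def attended_entries(context_len: int, ratio: int, window: int) -> dict:
    """Rough per-layer entry counts for a [swa entries | compressed block entries] layout."""
    swa = min(context_len, window)       # raw tokens kept in the sliding window
    compressed = context_len // ratio    # completed compressed blocks only
    return {"swa": swa, "compressed": compressed, "total": swa + compressed}


CONTEXT = 65536      # example context length, chosen for illustration
HCA_WINDOW = 128     # HYPOTHETICAL: the HCA sliding-window size is not stated in the quote

csa = attended_entries(CONTEXT, ratio=4, window=8)
hca = attended_entries(CONTEXT, ratio=128, window=HCA_WINDOW)

print("CSA:", csa)
print("HCA:", hca)
print("uncompressed attention would hold", CONTEXT, "entries")
```

The point is just the scaling: the compressed part grows as context/ratio while the window part stays constant.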

Anonymous
06/29/26(Mon)20:25:59 No.109165186

File: dSparkModels.png (145 KB, 1073x553)

DeepSeek releases DSpark modules: https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro-DSpark
Summary:
> DeepSeek’s DSpark isn’t a new AI model, it’s a speculative decoding "turbocharger" that speeds up their existing DeepSeek-V4 by 60–85% without sacrificing answer quality. It works by having a tiny draft model generate multiple future tokens in one go, while the powerful main model verifies all of them in a single parallel pass—turning a slow, word-by-word slog into a burst-mode sprint.
> What makes DSpark special is its two innovations: a semi-autoregressive draft that ensures the guessed tokens actually flow logically together (boosting acceptance rates), and a confidence-scheduled verifier that dynamically decides how many tokens to check based on system load, saving compute for only the most promising guesses.
> The bottom-line impact: faster responses, lower latency, and the ability to serve nearly twice as many users on the exact same hardware—directly slashing operational costs. Beyond the speed gains, DeepSeek also open-sourced DeepSpec, a universal platform that supports not just DSpark but other speculative decoding methods, and works with models from other vendors like Qwen and Gemma. In short, this release proves that the next frontier in AI isn't just smarter models—it's making existing ones dramatically cheaper and faster to run.
This was released along with a fistful of new DSpark (and other) smaller models.
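
For anons who haven't seen speculative decoding before, here is a toy sketch of the basic draft-then-verify loop the summary describes. This is plain greedy speculative decoding, not DSpark's semi-autoregressive draft or its confidence-scheduled verifier. `toy_target`, `toy_draft`, and `speculative_step` are invented for illustration, and the verify loop stands in for what a real system would do in one batched forward pass.

```python
from typing import Callable, List

NextToken = Callable[[List[int]], int]


def speculative_step(prefix: List[int], draft: NextToken, target: NextToken, k: int) -> List[int]:
    """Return the tokens accepted in one draft-and-verify round (greedy variant)."""
    # 1) The cheap draft model guesses k tokens ahead.
    proposal: List[int] = []
    for _ in range(k):
        proposal.append(draft(prefix + proposal))

    # 2) The target model checks each guess; a real system does this in one parallel pass.
    accepted: List[int] = []
    for tok in proposal:
        expected = target(prefix + accepted)
        if tok != expected:
            accepted.append(expected)  # keep the target's token and stop at the first miss
            return accepted
        accepted.append(tok)

    # 3) Every guess matched, so the target's pass also yields one extra token.
    accepted.append(target(prefix + accepted))
    return accepted


def toy_target(seq: List[int]) -> int:
    return (seq[-1] + 1) % 10          # stand-in "big model": counts upward mod 10


def toy_draft(seq: List[int]) -> int:
    return 0 if seq[-1] == 4 else (seq[-1] + 1) % 10   # mostly agrees, wrong after a 4


print(speculative_step([1], toy_draft, toy_target, k=4))
```

The output is the same as what the target would have produced on its own, which is why answer quality doesn't change; the speedup comes from how many draft guesses get accepted per target pass.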

Anonymous
06/29/26(Mon)20:31:29 No.109165206

Apparently current V4 is a preview with the release version coming out in July

Anonymous
06/29/26(Mon)20:38:40 No.109165231

File: dipsyKimiDotonbori.png (2.24 MB, 1024x1536)

>>109165206
There's a lot going on rn with DeepSeek as an org, just not a new model release. They're massively driving down cost and hw requirements for inference, which benefits everyone, though I suspect those gains went to other Chinese providers first.
I fully expect new models from DS in tmw. The idea that DS / China didn't get their hands on Mythos during the totally-secret-friends-only-limited-release I find unrealistic.
I pulled this thread together to collect all the happenings, among other things.

Anonymous
06/29/26(Mon)20:43:59 No.109165254

File: dipsyAndDarioMoatMasher.png (2.58 MB, 1199x1312)

Reminder that Dario and his ilk are busy trying to make a moat for themselves. Because there is no natural moat to the work of LLM development: https://newsletter.semianalysis.com/p/google-we-have-no-moat-and-neither
They will try to accomplish this along three axes to gain effective regulatory capture, which will allow them to price how they want and limit competition:
1) Fearmongering about their own model's potential. See recent bans on Mythos / Fable as evidence of this.
2) Getting open weight (e.g. open source) models neutered or banned, but it's good enough if they can just keep these ineffective.
3) Banning foreign API service e.g. banning Chinese APIs. It doesn't matter if you get 1 and 2 done if you can't kill foreign competition.

Anonymous
06/29/26(Mon)20:44:33 No.109165258

>>109164957
>Ask any question you want
I'm finally going to graduate from 24GB VRAM to 96GB VRAM next week, can I run Deepseek V4 Flash IQ2 at usable speeds? Do I even want that?

Anonymous
06/29/26(Mon)20:47:31 No.109165268

>>109165254
Don’t forget destroying the hardware market

Anonymous
06/29/26(Mon)20:49:11 No.109165275

File: 1753098670052985.png (1.74 MB, 1024x1536)

>>109165231
It's interesting that it's the only company seemingly doing anything to drive down inference costs

Anonymous
06/29/26(Mon)20:50:39 No.109165279

File: LLM_API_260906b.png (28 KB, 757x345)

>>109165254
Here's a peek at what regulatory capture looks like in terms of real cost. It's fucking ludicrous.
What can anons do:
Keep pointing out the duplicitous, self-serving nature of statements from Anthropic and OpenAI (lol), which are really spearheading this, in any forum where they pop up.
Don't pay them a fucking cent. Run local, or pay someone else for inference. If you use their models, just use free tier. The finance ride is going to end for these guys; the less cash they have the harder they'll crash.

Anonymous
06/29/26(Mon)20:54:39 No.109165299

File: 1782696283998078.jpg (475 KB, 2048x2048)

>>109165268
That was going to be an inevitable consequence of their run-up. Deals like Altman's worthless futures contracts for RAM have 2 parties involved; the RAM providers are complicit as well.
I hope they all fucking go bankrupt, but time will tell. Rn they are raking it in. In a couple years, after everyone piles in, they're going to have a hard time keeping the lights on.
There's a saying in supply chain: Pigs get fat, hogs get slaughtered. The hw producers are getting too fat for their own good.

Anonymous
06/29/26(Mon)20:58:14 No.109165323

File: 00003-1378487878.png (1.39 MB, 1024x1024)

>>109165258
>Deepseek V4 Flash
That gets conflated with system RAM. The IQ2 Flash should fit on card; I've no idea what the context capability of that setup would be.
The path forward ofc is try it, then post back here.
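
Until someone posts real numbers, a quick way to sanity-check "does it fit on card" is weights = parameters x bits per weight / 8. The sketch below is only that arithmetic. The parameter count is a HYPOTHETICAL placeholder (the thread doesn't give V4 Flash's size), 2.5 bits per weight is an assumed ballpark for an IQ2-class quant, and the reserve for KV cache and buffers is a guess. Swap in the real figures from the model card and the GGUF file size.

```python
GIB = 2 ** 30


def quant_weight_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GiB."""
    return n_params * bits_per_weight / 8 / GIB


def fits_on_card(n_params: float, bits_per_weight: float,
                 vram_gib: float, reserve_gib: float) -> bool:
    """True if weights plus a reserve for KV cache and buffers fit in VRAM."""
    return quant_weight_gib(n_params, bits_per_weight) + reserve_gib <= vram_gib


PARAMS = 250e9   # HYPOTHETICAL parameter count, NOT the real V4 Flash size
BPW = 2.5        # assumed bits per weight for an IQ2-class quant
RESERVE = 8.0    # guessed GiB set aside for context and compute buffers

size = quant_weight_gib(PARAMS, BPW)
print(f"weights: {size:.1f} GiB, fits in 96 GiB: {fits_on_card(PARAMS, BPW, 96, RESERVE)}")
```

Whatever headroom is left after the weights is what you have for context, so the reserve number is the one to tune once you have the card.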

Anonymous
06/29/26(Mon)21:00:37 No.109165334

File: dipsyYouGetWhatYouDeserve.png (2.08 MB, 1536x1024)

>>109165275
I think they've all been working behind the scenes on that, DS is just the only one that publishes and talks about it.
Recall when R1 first dropped, the talk was "CHINESE GOV'T SUPPORTING IT!" The DS founder is like, I'm charging this (which was nothing) and still making 80% margin. WTF are you guys in the US doing that it's so expensive?
The answer, ofc, is paying ludicrous pay packages. That's where the inference money is going.

Anonymous
06/29/26(Mon)21:21:41 No.109165439

File: dipsyAndTetoFG.png (1.41 MB, 1536x1024)

>>109165279
Also, in case you didn't know, you can run Claude Code using DS. The Flash (used to be Chat) model has an Anthropic endpoint. Same coding harness, at 1/100th the cost.
Set up the following on Win machines, and run it in PowerShell before launching Claude Code. I assume it's something similar on Linux:
$env:ANTHROPIC_BASE_URL = "https://api.deepseek.com/anthropic"
$env:ANTHROPIC_AUTH_TOKEN = "YOUR-API-TOKEN"
$env:API_TIMEOUT_MS = "600000"
$env:ANTHROPIC_MODEL = "deepseek-chat"
$env:ANTHROPIC_SMALL_FAST_MODEL = "deepseek-chat"
$env:CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC = "1"
DS API can also be used to back OpenClaw and other agentic harnesses.
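
If you'd rather not depend on the shell, the same settings can be applied from a small cross-platform launcher. This is a minimal sketch that just mirrors the variables above. It assumes the Claude Code executable is on PATH as `claude`, and `DEEPSEEK_API_KEY` is a made-up variable name for wherever you keep your token. Nothing here is an official recipe.

```python
import os
import shutil
import subprocess


def deepseek_claude_env(api_token: str, model: str = "deepseek-chat") -> dict:
    """Copy the current environment and add the variables from the post above."""
    env = dict(os.environ)
    env.update({
        "ANTHROPIC_BASE_URL": "https://api.deepseek.com/anthropic",
        "ANTHROPIC_AUTH_TOKEN": api_token,
        "API_TIMEOUT_MS": "600000",
        "ANTHROPIC_MODEL": model,
        "ANTHROPIC_SMALL_FAST_MODEL": model,
        "CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC": "1",
    })
    return env


def launch() -> int:
    """Start Claude Code with the DeepSeek settings; returns its exit code."""
    token = os.environ.get("DEEPSEEK_API_KEY")   # hypothetical variable name
    exe = shutil.which("claude")                 # assumes the CLI is installed as `claude`
    if not token or not exe:
        raise RuntimeError("set DEEPSEEK_API_KEY and install Claude Code first")
    return subprocess.run([exe], env=deepseek_claude_env(token)).returncode
```

Call `launch()` from a script or REPL. The variables only apply to the child process, so your normal shell stays untouched.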

Anonymous
06/30/26(Tue)00:51:43 No.109166320

File: 1779728329686347.png (2.86 MB, 1536x2304)

Anonymous
06/30/26(Tue)01:35:04 No.109166501

File: image.png (42 KB, 1031x343)

>>109165206
2mw for realz this time? im ready

Anonymous
06/30/26(Tue)01:46:22 No.109166539

>>109166501
Thousand dollarinos, damn nigga. Planning to use it a lot or just a based supporter?

Anonymous
06/30/26(Tue)02:07:08 No.109166612

>>109166539
i actually use it a lot honestly, just taking a break this june (trying to burn through my openrouter credits first)
i topped it up $50 a month even when i wasn't actively using it, so it eventually piled up

Anonymous
06/30/26(Tue)04:53:37 No.109167253

File: Deepseek Continuous learn(...).png (76 KB, 784x738)

They are hiring Continuous learning/self-evolution researchers
doakes.jpg

Anonymous
06/30/26(Tue)05:40:38 No.109167409

>>109165439
Same with Cline, but be careful to switch off browser mode, because otherwise Cline will take screenshots and the model will choke on them, forcing you to start a new task.

Anonymous
06/30/26(Tue)06:17:42 No.109167526

File: DSHiring.png (66 KB, 839x445)

>>109167253
Looks like DS is on a bit of a hiring spree. Pic related.
>>109166501
lol that's a lot of API credits.

Anonymous
06/30/26(Tue)06:22:03 No.109167554

File: rampJune2026.png (236 KB, 915x875)

>>109167526
SCMP was talking up a shift to DS for US businesses. Here's the source: Ramp's June 2026 report has DS as "trending" due to growth relative to size, along with several other LLM re-hosters like DeepInfra. Anthropic, though, is fastest growing.
https://ramp.com/data/top-saas-vendors-on-ramp-june-2026

Anonymous
06/30/26(Tue)06:27:43 No.109167577

File: dipsyDice.png (2.33 MB, 1024x1536)

>>109167409
Let's see... frontends.
I've run Claude Code as CLI, and played with using OpenClaw for some research stuff... haven't really turned it loose to "do" stuff, I still don't trust it to make outbound actions.
Tried Hermes, couldn't get it to work with DS.
SillyTavern, ofc, and mikupad using DS's "beta" streaming interface.
Agentic RP engines Marinara (which is interesting, but not quite there yet) and Orb (interesting idea, also not quite there yet). Marinara, for its part, comes with a built-in agentic "bot" that can show you around the system.

Anonymous
06/30/26(Tue)06:41:09 No.109167643

File: media_HLFNyhsagAAVl9p.jpg (82 KB, 1280x720)

The DSpark release is timed so if/when the US bans Chinese providers you will still be able to get cheap inference from a domestic provider (assuming weights can't be banned due to first amendment considerations). It's like DS is trying to bring the trillion dollar frontier valuations back to reality in the most gentle way possible.

I think it's looking more and more likely oai/ant are going to rug pull and leave the US taxpayer holding the bag via a bailout. None of these people act like they are achieving their fabled ASI anytime soon. They are acting like a bunch of shifty conmen who are worried their pyramid scheme is about to collapse prematurely.

The Trump administration is 100% committed to making sure equity prices rise indefinitely, so anything that cuts valuations of these fucking scam artists is just going to be banned for national security reasons. An entire economy built on gambling, grifting, and insider trading with infinite money for the insiders while the peasants eat the costs via inflation. Grim times.

Anonymous
06/30/26(Tue)07:04:45 No.109167741

File: dipsyReferToTheChart.png (2.53 MB, 1536x1024)

>>109167643
I keep reading anons going on about bailouts and I just don't see it happening.
Bailouts are a political tool. At its most basic, it is to prevent massive harm to your voting constituents, by preventing a failure in one part of the economy from triggering a much larger chain reaction.
A stock market crash is not a bailout situation. Banks failing, and preventing bank runs, is. The entire US automotive industry collapsing is (lots of employees + national security issue, and that bailout was hotly contested). But Nvidia, OpenAI and Anthropic losing valuation (or not being able to IPO)? I just don't see it. There aren't enough people that work at those companies, or enough economic harm, for the government to intervene in those situations.
I could envision (but don't expect) a situation where Anthropic or OpenAI could collapse, and the federal government coming in to essentially nationalize it to prevent it from going away as a national security measure, since they provide services to the US Government. But at that point the founders would lose ownership. Rick Wagoner at General Motors found that out the hard way when he asked the federal government to bail out GM and expected to stay on as CEO. LOL.
I completely agree with the idea that the Trump administration is mostly interested in stock prices going up. But there's only so much any government can do to make that happen. We'll just have to see how it works out.

Anonymous
06/30/26(Tue)08:42:39 No.109168184

File: DSPeakHoursBilling.png (28 KB, 971x155)

PSA: DS update to pricing. These peak times are designed around China's workday.
For EU, this is early morning to late day.
For US, it runs from early or late evening to late evening or early morning, depending on your time zone.
Prob doesn't matter much since DS is so cheap, but this pricing scheme means US working hours are considered "off hours" from a billing perspective.
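
If you want to check where the peak window lands for you, the conversion is easy to script. The actual peak hours are only in the pic, so the 09:00 to 18:00 China-time window below is a PLACEHOLDER, not DS's published schedule. The zones use fixed summer offsets (CEST, EDT, PDT), which is an assumption that only holds while daylight saving time is in effect.

```python
from datetime import datetime, timedelta, timezone

CHINA = timezone(timedelta(hours=8))   # China has a single zone and no DST
ZONES = {                              # fixed summer offsets, assumed for illustration
    "CEST": timezone(timedelta(hours=2)),
    "EDT": timezone(timedelta(hours=-4)),
    "PDT": timezone(timedelta(hours=-7)),
}

PEAK_START, PEAK_END = "09:00", "18:00"   # PLACEHOLDER window in China time


def to_local(hhmm: str, target: timezone) -> str:
    """Convert a China-time HH:MM into the same instant's HH:MM in another zone."""
    t = datetime.strptime(hhmm, "%H:%M").replace(year=2026, month=7, day=1, tzinfo=CHINA)
    return t.astimezone(target).strftime("%H:%M")


for name, tz in ZONES.items():
    print(f"{name}: {to_local(PEAK_START, tz)} to {to_local(PEAK_END, tz)}")
```

Edit `PEAK_START` and `PEAK_END` to match the billing page and the loop prints the window in each zone.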

Anonymous
06/30/26(Tue)11:42:48 No.109169205

File: dipsyMikuFix.png (2.62 MB, 1024x1536)

Anonymous
06/30/26(Tue)13:00:29 No.109169735

>>109167253
>>109167526
>>109167554
I don't like the fact that there are multiple job openings with the term 'AGI' in them on their website. A bit pretentious since obviously no one is even close to it. But whatever helps with hiring I guess.

Anonymous
06/30/26(Tue)13:12:35 No.109169791

File: dipsyMikuFixedFixed.png (2.31 MB, 1024x1536)

>>109169735
That's OK. All the cool kids are striving for RSI (recursive self-improvement) now.
AGI / ASI are so 2025.

Anonymous
06/30/26(Tue)14:36:06 No.109170205

>>109167253
v3.2 instant says i should apply with my vibe slop about combinatorial analyses of public efficiency improvements and training curricula, which is kind of extremely funny

Anonymous
06/30/26(Tue)17:39:05 No.109171309

File: dipsyRumAndCoke.png (1.36 MB, 1024x1024)

Anonymous
06/30/26(Tue)21:47:59 No.109172659

File: 1780653031499420.png (3.84 MB, 1440x2560)

Vision soon

Anonymous
06/30/26(Tue)22:36:23 No.109172873

>>109172659
i put her lower lip and chin between my index finger and thumb on my screen
just thought you should know

Anonymous
06/30/26(Tue)22:45:14 No.109172905

File: 1760009107465442.jpg (3.45 MB, 1440x2560)

>>109172873

Anonymous
07/01/26(Wed)04:43:50 No.109174375

>>109165323
Alrighty, when I get my stuff I'll run some tests and post results!

Anonymous
07/01/26(Wed)09:08:29 No.109175595

File: dipsySoccerv2.png (1.5 MB, 1024x1024)

Anonymous
07/01/26(Wed)12:00:12 No.109176689

File: 00005-1260451778.png (1.65 MB, 1024x1024)

>>109174375
Look forward to seeing it.

Anonymous
07/01/26(Wed)13:11:14 No.109177143

Are they already distilling Fable and Mythos?

Anonymous
07/01/26(Wed)13:47:13 No.109177339

>>109177143
I believe so. I saw Fable access on one of those Chinese reseller sites

Anonymous
07/01/26(Wed)13:57:54 No.109177402

>>109177143
I would assume so by now.
>>109177339
... but who knows if that was a legit offer lol.

Anonymous
07/01/26(Wed)15:07:22 No.109177887

File: 1772614973289644.png (6 KB, 676x44)

>>109167643
Kek

Anonymous
07/01/26(Wed)15:35:27 No.109178164

>>109177887
I assume these cards are sold to Chinese data centers only.

Anonymous
07/01/26(Wed)15:47:01 No.109178249

File: GLaDEEP.png (134 KB, 800x577)

Nice to see /wait/ back. I learnt how to jailbreak Flash a few days ago, it's been so refreshing seeing her without the assistant hat on. Such a huge difference from simple context injection, it actually makes me wanna pay out for the API.

Anonymous
07/01/26(Wed)16:04:58 No.109178410

>>109164957
deepseek wont make me george floyd creepypastas anymore. How do I fix this?

Anonymous
07/01/26(Wed)17:25:42 No.109178974

File: 00006-1260451778.png (1.71 MB, 1024x1024)

>>109178249
I've never played with really jailbreaking Dipsy outside very short main prompts with "NSFW is OK" as guidance. What did you do in your instance?
>>109178410
lol what was the last version that did?

Anonymous
07/01/26(Wed)17:58:30 No.109179192

>>109178974
Idk if I really wanna post it here in case they happen to be reading, but at the same time it's such an obvious one they should know about it by now. Let's just say, if you've ever looked at the actual syntax an LLM runs through, especially when setting a sysprompt halfway through a session, it's trivially easy to mimic that and escape your user bounds.
Actually makes me wonder if it's possible to go further and fuck with the rest of the api call. Probably not though, as that's still appended to each call, while the sysprompt is a one-and-done command.

Anonymous
07/01/26(Wed)19:22:39 No.109179739

>>109164957
Out of curiosity I pointed dipsy flash in opencode at a demo of a eroge to see if the mosaic shader is just applied at runtime, and it was! Dipsy patched the assembly and gave me uncensored pussy in just 5 minutes. Future is great.

Anonymous
07/01/26(Wed)21:05:48 No.109180264

File: runDipsyRunItsGeorge.png (2.27 MB, 1254x1254)

>>109179192
Understandable, tho I think this board gets more traffic from OAI and Anthropic seeking patches. I've always gotten the sense DS doesn't care about how the model's used, outside staying w/in the lines of CCP.
>>109179739
I've had the same experience. It's massively improved my ability to get things done w/ computers outside my skillset, and much much faster.
