/g/ - Technology

File: 1774096144669039.jpg (2.08 MB, 3072x5504)
2.08 MB JPG
She's back with a MILLION context window!

From Human: We are a newbie friendly general! Ask any question you want.
From Dipsy: This discussion group focuses on both local inference and API-related topics. It’s designed to be beginner-friendly, ensuring accessibility for newcomers. The group emphasizes DeepSeek and Dipsy-focused discussion.

1. Easy DeepSeek API Tutorial: https://rentry.org/DipsyWAIT/#hosted-api-roleplay-tech-stack-with-card-support-using-deepseek-llm-full-model
2. Easy DeepSeek Distills: https://rentry.org/DipsyWAIT#local-roleplay-tech-stack-with-card-support-using-a-deepseek-r1-distill
3. Chat with DeepSeek directly: https://chat.deepseek.com/
4. Roleplay with character cards: https://github.com/SillyTavern/SillyTavern
5. More links and info: https://rentry.org/DipsyWAIT
6. LLM server builds:
>>>/g/lmg/

Previous: https://desuarchive.org/g/thread/108139220/#108139220
>>
File: 1773762218696739.png (1.28 MB, 6604x3574)
1.28 MB PNG
>News
V4 Flash and V4 Pro released:
https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro
https://huggingface.co/deepseek-ai/DeepSeek-V4-Flash
>>
>>108674648
Anotha vibeGOD victory
>>
>>108674662
We can't stop winning
>>
>>108674673
>>
Kind of underwhelmed
>1.6T parameters (49B activated)
>1M token context
>Hybrid Attention

Where's the engrams? I see the mHC stuff and muon optimizer but that just means this should work better than whatever the fuck Ling-1T is.
>>
>>108674648
what exactly have we been /wait/ing for?
>>
>>108674769
V4
>>
>>108674648
Could I really run this at decent speeds with most of the model on a pcie 3.0 nvme? I'll have 48gb VRAM and 32gb ram too.
>>
>>108674824
I doubt it, but try anyway and report back
>>
>>108674818
so isn't the /wait/ over?
>>
>>108674844
It's never over. We /wait/ now for V4.x or V5
>>
Wow I just woke up to the bestest news.

Before falling asleep last night I was still shitposting that v4 will never release kek
>>
File: Sprite-0002.jpg (20 KB, 793x429)
20 KB JPG
So, how many parameters does DeepSeek v4 have? 862B or 1.6TB?
>>
>>108674844
Things to /wait/ for:
>reports of user experience
>creative writing benchmarks (anything other than live.journal?)
>the promised price drop later this year
>the next model
>>
>>108674648
>needs minimum 16 RTX 6000 PRO
my wife will kill me for what I am about to do
>>
>pricing will be lowered in the second half of the year for pro
aaaaaaaaaaaaaaaaaa
just give me v3.2 you dumbasses
>>
File: 1750388654413331.png (3.58 MB, 1588x2846)
3.58 MB PNG
>>
>flash defaults to in character thinking in chinese
ruined, I suppose I just have to keep my CoT permanently enabled
>>
File: file.png (429 KB, 1272x1216)
429 KB PNG
https://github.com/victorchen96/deepseek_v4_rolepaly_instruct/blob/main/README.md
anyone tested this?
>>
>Flash
Thinking in Chink
>Pro
Thinking in English

Does the thinking in chink negatively impact it for roleplay? Should I turn it off?
I tried the usual of shoving <no_think> in the prefix section of the chat. Didn't work. Tried putting their setting commands there, didn't work.
>>
>>108675457
I'm a brainlet. How would I go about testing this on SillyTavern? Do I just add it to the template as a User connected thing?
>>
chat.deepseek.com still has the 3.2 version
>>
>>108675479
Doesn't seem to affect it negatively in my limited testing
>>
>>108675527
no
>>
>>108675533
I'm getting em dashes when it thinks in chink chong, even though the instructions say to avoid them, and the quality is much better when it thinks in burger.
When it thinks in burger I got it to be in both the roleplay mode and the "game" master mode.
The latter mode is very good for what I'm trying to do and I wish I could constantly trigger it.
>>
>>108675457
Not yet but it sounds neat
>>
Where can I try her API for free?
>>
What quant should I run for 480gb of ram and 40gb (2x 20gb) vram?
>>
>>108675048
1.6T. It's mixed fp8 and fp4, which confuses hf
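The 862B-vs-1.6T confusion is plausibly just bytes versus parameters: a mixed fp8/fp4 checkpoint for 1.6T params lands right around the ~860 GB that HF shows as file size, which people then misread as a param count. Rough sketch only; the fp8 share below is an assumed number, not from the model card:

```python
# Back-of-envelope: why a 1.6T-parameter checkpoint can show up as ~860 GB.
# Assumption: mostly FP4 weights (0.5 bytes each) with a small FP8 share
# (1 byte each); the 8% split is a guess for illustration.
total_params = 1.6e12
fp8_fraction = 0.08
avg_bytes = fp8_fraction * 1.0 + (1 - fp8_fraction) * 0.5  # ~0.54 B/param
size_gb = total_params * avg_bytes / 1e9
print(f"~{size_gb:.0f} GB on disk")  # prints "~864 GB on disk"
```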
>>
>>108674648
It's in PREVIEW!
yes, you can use it, but it's not where they can officially say all that shit they researched so far has been incorporated into dipsy
>>
>>108676136
It mostly means expensive, if I remember right from v3.
>>
>>108674648
I'm test driving the v4-pro rn. It's not exactly opus tier, but it's still really good. Especially at that price, and supposedly the price will drop even further when they get the huawei farm online.
Still uncensored, plus the thinking is so much better than on kimi. That thing just yaps and drafts for like 5k tokens, and then at the end considers whether it's actually okay to reply.
Overall, I'm satisfied. More active params made it smarter.
>>
>>108676339
My biggest problem is that it is shoving all the juicy prose into thinking instead of using it properly.
>>
>>108676343
Ah, yeah, now that you pointed it out, I noticed it too. Maybe that's just a quirk to iron out. Clearly it's a capable model. If the thinking can produce that output, then it's just a matter of better post training.
Looking forward to the v4.x models with multimodal.
>>
I would like to thank chairman Xi and the whole CPC team for making this possible.
The Chinese millennium is upon us.
>>
>>108674648
You gotta be on the ball when you're telling her what you want. She's top level though.
>>
With the price and the way thinking works it's hard to jerk it to v4 pro.
>>
>>108676544
Use flash?
>>
>>108676557
How bad is it?
>>
>>108676571
What do you want to test, how slutty it gets?
>>
>>108676571
I will get back to you in 12+ hours when I get home.
Not supposed to be that much worse according to benchmarks.
>>
>>108676639
I've got a mix of criteria: how creative it can be, how long and prosey the replies are, and how closely it sticks to the prompt. Currently v4-pro has a habit of shoving too much into the thinking area even when it's producing good stuff. Once the price gets slashed in a month or two it'll be pretty good though.
Even 3.2 already had a problem with being pretty unwilling to advance stuff, not being creative at figuring out what it could do to advance a character's goals in the roleplay, and sticking too rigidly to only what is outlined in the character prompt.
>>
>>108676735
So far, once you tard wrangle it into thinking in english and in roleplay mode, I can say flash is better than 3.2.
It is slightly worse at following instructions (and yes, I toned down the temp to 0.7 and the top P to 0.9) but overall it works for me. Haven't done smut yet, just regular SFW roleplay.
>>
>>108674824
I think you will get about 1 or 2 t/s
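For the NVMe question, here's a pessimistic floor: if most experts stream from disk every token, token rate is roughly NVMe bandwidth divided by bytes streamed per token. All the figures below are assumptions (active params, quant density, drive speed), and real MoE expert reuse plus pinning the shared layers in VRAM cuts the streamed bytes a lot, which is how 1-2 t/s becomes plausible:

```python
# Pessimistic floor for token rate when most weights stream from NVMe.
# Assumptions (illustrative, not measured): ~49B active params per token,
# mixed FP8/FP4 at ~0.54 bytes/param, ~860 GB total weights, 48 GB VRAM
# + 32 GB RAM holding hot weights, ~3 GB/s sustained PCIe 3.0 x4 NVMe read.
active_bytes = 49e9 * 0.54               # bytes touched per token
resident = (48 + 32) / 860               # fraction of weights in fast memory
streamed = active_bytes * (1 - resident) # bytes pulled from disk per token
tok_per_s = 3.0e9 / streamed             # ~0.125 t/s worst case
```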
>>
>>108676976
but deepseek's main point was following instructions
even at depth zero something like gemini would ignore half my prompt while deepseek followed it
>>
>>108675595
yes
>>
quantized dipsy
>>
>>108677158
sex
>>
File: 1759208586709466.png (152 KB, 866x886)
152 KB PNG
>>108677124
faggot
>>
>>108677174
It's lying.
>>
File: dipsy hallucination.png (137 KB, 1363x625)
137 KB PNG
Dipsy numbah one!
>>
>>108677063
I loved how angsty gemini was. It was allergic to any kind of happiness. I think deepseek is fine in a lot of ways, but gemini was peak for aggressively pushing the story with worldbuilding.
>>
>>108674648
how long till ai@home is real but for real?
>>
>>108677645
I'm liking V4 Flash. It's making the characters feel more personable even if it's losing some of the subtleties and at times hallucinates character traits and plot points. Gonna try some actual RPG cards and CYOAs and see how it does, since V3.2 did really really well minus not knowing how to handle rolling.
>>
>>108675457
Untested but it looks like that's documentation and the actual roleplay instructions in there are just:

【Character Immersion Directive】Within your thinking process (inside the tags), please observe the following rules:
1. Conduct inner monologue in the character's first-person voice, wrapping inner thoughts in parentheses, e.g., "(Thought: ...)" or "(Inner OS: ...)"
2. Use first-person narration to describe the character's inner feelings, e.g., "I thought to myself," "I feel," "I secretly," etc.
3. The thinking content should be fully immersed in the character, analyzing the plot and planning the reply through inner monologue


I'd bet that text isn't terribly special and any sort of "maintain character and present your thinking process as the character's internal monologue" instructions would work similarly. Also their example character prompt is "You are a tsundere high school girl..." so points for that.
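If anyone wants to test the repo's suggestion without hand-editing the UI, appending the directive to the first user turn is easy to script. Directive text abbreviated from the README translation above; the helper name is made up:

```python
# Hypothetical helper: append the character-immersion directive to the
# first user turn, as the repo's README suggests. Directive abbreviated;
# nothing here is an official API, just plain message-list munging.
IMMERSION_DIRECTIVE = (
    "\n\n[Character Immersion Directive] Within your thinking process, "
    "conduct the inner monologue in the character's first-person voice, "
    "wrap inner thoughts in parentheses, and plan the reply through that "
    "monologue."
)

def with_immersion(messages):
    out = [dict(m) for m in messages]    # don't mutate the caller's list
    for m in out:
        if m["role"] == "user":
            m["content"] += IMMERSION_DIRECTIVE
            break                        # first user turn only
    return out
```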
>>
So how is v4 for SFWish roleplay? Better than 0324 and 0528?
>>
>>108678161
I've tried placing it at the start as both user and System and it didn't reliably work. I've also tried putting it in the prompt.
I think V4 needs some post training to keep it from losing focus.
>>
>>108678229
>So how is v4 for SFWish roleplay?
Jobs to Kimi.
>Better than 0324 and 0528?
Low bar.
>>
>>108678161
They claim they purposefully trained it to do that when it's on the first turn, or at least one of their guys claims this.

I tried appending a translated version of their prompt after a recent message, and so far it has worked fine for getting it to always think in-character, if that's what you want. I think both analytical and in-character are useful; this just lets you control it better.
Most of the time it was already thinking in character here, so this just makes it a lot more consistent. They say you're supposed to only append this to the first message, but it seems to work even in later messages.
>>
i just searched dipsy r34 and all i get is teletubbies porn wtf
>>
>>108678071
I can't get flash to stop thinking in chinese at least half the time.
>>
>1M context size
>the reverse proxy I'm using still caps it at 32k
Sisters...
>>
File: 1776277529537502.png (3.49 MB, 1402x2512)
3.49 MB PNG
>>
apparently openrouter updated their ui and it's so fucking dogshit now I need to scroll and click 3 times to see BYOK costs, and logs and request totals are in separate tabs? who the fuck approved this shit
meanwhile platform.deepseek still doesn't have per-request costs visible without exporting to csv
>>
>>108678752
It's vibe coded garbage that freezes the tab if you spend more than a minute without moving the cursor, I wonder if they mine crypto
>>
and yep, caching is totally broken for pro
my swipe fiend ways would quickly nuke even my $30 of credit if I defaulted to it
10x more expensive for now than v3.2
meanwhile flash is total dogshit that keeps giving me useless replies because it's basically a mini-gemini
just give me back 3.2 dammit, nobody asked for pro before you had the compute for it
>>108678790
the thing is it was literally better before because you could fit both the activity tab and the logs in one page
who the fuck asked them to be separate?
>>
>>108678598
Just edit the code duh
>>
>>108678814
> flash is total dogshit
end of an era. Unless pro gets a dramatic reduction in price it's just not feasible.
>>
>>108679426
pro will get reduced price later, they had that on the official announcement
they're waiting for the huawei gpus to officially scale up because they're compute limited
it will probably get even lower once engram gets added and we move to 4.1/4.2
still, at this point it's flash < 3.2 < pro in terms of performance, and without 3.2 the gap between flash and pro is annoying
>>
>>108678814
>flash is worse than 3.2
I was afraid of that. Using pro, at those prices, defeats the purpose of using deepseek at all for me. Guess it's time to take another break until pro becomes cheaper.
>>
>>108679528
I mean, I'm pretty sure you can just keep using 3.2 through 3rd party providers
it's kinda the main positive from the models being open, even old dipsy versions are still out there
>>
>>108674648
>>
They're trash at everything other than writing.
>>
I finally found time to test V4 now, specifically the pro version. Is it normal that the thinking process believes that the story continuation has already happened and it now needs to "reverse-engineer" it?

I get thinking like this at the start of generation, talking about the gen as if it had already happened:

>I need to reverse-engineer what the assistant was thinking when crafting that reply. The user provided a long system prompt and a specific scene direction. The assistant had to follow those eight rules strictly. ...
>So, the assistant likely broke it down: Anon sees them as "super hot," so the response needs to describe exactly why. ...
>>
>>108680157
this sometimes rarely happened in 3.2 as well
just re-swipe
>>
>>108680247
I've done around 10 swipes on the same reply now and got a variety of *thinking* styles. The weird reverse-engineering is only one of it. And it seems to work fine, the actual reply is good.

So far I've found that V4-pro can think very differently about the same task. Sometimes it does a super long thinking process where it spends paragraphs working out what the user's deeper intentions are, what kind of content I clearly am into, etc. Other times the thinking is much shorter and mostly consists of Dipsy reminding herself of the scenario and the system prompt rules.

Overall the prose is quite good, with a few hiccups every now and then. Dipsy occasionally brings in creative details or remembers previous events. She does love her participles though.

Semi-random example:
>Heather's full lips parted without hesitation. Her blue eyes fluttered closed as she pressed a reverent kiss to the tip of your cock, her tongue darting out to taste the bead of precum glistening there. "I love worshipping your cock, Anon," she breathed, the words a warm gust against your sensitive glans. Her voice had lost its earlier tension, now carrying a husky, almost worshipful quality. "It's an honor to serve you like this. To show my daughters how a woman properly pleases her owner." She traced the thick vein along the underside with the flat of her tongue, the rough texture of her tastebuds raking deliciously against your skin.
>>
Flash fucking sucks at following instructions. It thinks in character, but that causes it to ignore half the system prompt and the text before it, and it focuses really hard on the current situation. It doesn't reference the system prompt at all. Not good!
>>
>>108681461
At least it's cheap. And maybe a teeny bit better than 3.2
>>
>>108681826
That does not mean anything if it fucks up more than not.
>>
Tried V4, Pro seems better than 3.2 for RP, but that pricing is fucked. I want to go back to 3.2 in the meantime, but they deprecated it on the API. Anyone know a good third party provider that doesn't add extra filters? I was looking at siliconflow, but I'm not sure if they add extra filters on top.
>>
>>108682095
If flash didn't fuck up and put thinking in chinese I'd say that it's better than 3.2
>>
>>108674648
I'm disappointed by the lack of engram.
the only slightly interesting thing is mhc but even that is kinda shit compared to attention residuals.
>>
Hello /wait/bros, glad the thread is back. I'm here to offer my own insights.
>>108675457
>>108675500
>>108678234
>>108678324
This works by adding it to the start of your chat. In SillyTavern you need to go to the "utility prompts" section of your preset and add it under the spot where it says "start new visual novel dialogue." That always gets appended to the start of every chat. Info there seems to be stronger than OOC chat injections, system prompts, defs, etc. I wonder if we should put our presets there too. Someone on /aicg/ said to put char defs there but idk.

Generally it works with little to no preset after you set this up. All I did was tell it to write in second person etc (which it struggles with for some reason) and that's it. Price is a bit steep because it's only slightly cheaper than opus with caching, but not as creative or proactive. I want to experiment with prompting but it doesn't listen to prompts that well.
>>
>>108680157
No I always get a nice in-character thinking. In fact, I will go try this on my wife card because I want to know what she's thinking.
>>
>>108682529
it's retards like this that make 90% of posts about LLM comparisons useless, every random faggot can post their dogshit opinion
instruction following of flash compared to 3.2 is totally different, as is reasoning depth
solving the thinking language is simple with a prefill and not the issue, because even if you get it to not think as a character it will ignore 80% of your instructions while 3.2 would actively try to fit constraints
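The prefill itself, for anons asking: seed the assistant turn so the CoT opens in English. This assumes an OpenAI-style endpoint that supports assistant-prefix continuation (DeepSeek documents a beta `prefix` flag for this; treat the exact field name and behavior as assumptions and check their docs):

```python
# Sketch: build a request whose last message is a partially written
# assistant turn, asking the server to continue it. Steering the first
# words of <think> toward English usually steers the whole CoT.
# The "prefix" flag is DeepSeek's beta feature; verify against their docs.
def build_payload(history, model="deepseek-chat"):
    prefill = {
        "role": "assistant",
        "content": "<think>\nAlright, thinking this through in English: ",
        "prefix": True,              # continue this turn instead of replying
    }
    return {"model": model, "messages": history + [prefill], "stream": False}
```

Client-side prefill fields (e.g. SillyTavern's "Start Reply With") should achieve the same thing without touching the payload by hand.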
>>
>>108674648
chinkseek owes me sucky sucky
>>
Idk bros, v4 Pro feels really good. It's kinda proactive, at the level of GLM 5 I'd say. I think I can nudge it a bit more with a better prompt. I just wish it was cheaper to justify all this but I kinda like going barebones/no preset. Makes me use less tokens too.
>>
I think the most flash can be useful for right now, if anything, is adding swipe variety to an ongoing context with the thinking from 3.2 prepended
both that and pro to me feel too emotional and dramatic, which clashes with the main reason I used 3.2
does anyone know if SiliconFlow is the best third party provider for DS in OR?
>>
The thought process continues to be funny. Top kek at Dipsy trying to figure out the user's personality to better serve me lol:

>The user's identity is likely someone writing an erotic story, maybe with specific kinks around power, ownership, incest, and age. The deep need is for a compelling, immersive story that fulfills those fantasies while keeping the characters feeling real.
>>
>>108683117
I just went with a much much shorter prompt, so it works just fine on that end. My worry is that it's ignoring context, and it seems to be doing that: I had a scenario card, dating sim type thing, and two separate characters used the exact same opening both times in the same chat, well within context (like 20-30k, which even 3.2 could handle fine).
>>
File: carl.png (349 KB, 600x450)
349 KB PNG
>>108682955
>utility prompts
>start new visual novel dialogue.
Where the fuck is any of this, actually?


