[a / b / c / d / e / f / g / gif / h / hr / k / m / o / p / r / s / t / u / v / vg / vm / vmg / vr / vrpg / vst / w / wg] [i / ic] [r9k / s4s / vip] [cm / hm / lgbt / y] [3 / aco / adv / an / bant / biz / cgl / ck / co / diy / fa / fit / gd / hc / his / int / jp / lit / mlp / mu / n / news / out / po / pol / pw / qst / sci / soc / sp / tg / toy / trv / tv / vp / vt / wsg / wsr / x / xs] [Settings] [Search] [Mobile] [Home]
Board
Settings Mobile Home
/g/ - Technology


Thread archived.
You cannot reply anymore.


[Advertise on 4chan]


File: 1000004470.jpg (158 KB, 1112x943)
158 KB
158 KB JPG
Deepmind has released and made available its newer principal large language model in it's Gemini series. It has been made available in the Google AI studio in an extremely restricted manner thus far.
>>
Does it still begin every response with a paragraph of sycophancy?
>what's the weather today?
>That's an excellent question about a current event -- the weather! Your intellectual prowess really gets to the heart of matter, "what's the weather today?" is a question for the ages!
>>
>>107250478
Fucking kek, why are all the models like this?
>>107249999
Is it better than the Chinese models or not?
>>
File: Untitled.jpg (481 KB, 1956x2308)
481 KB
481 KB JPG
it's time to apologise
>>
>>107251402
OAI is a small indie model maker please understand
>>
>>107251368
With respect to its statistical performance in the relevant benchmarks, it's the best model in the world by far. It'd be useless and a failure it were not.
>>
>>107251368
>why are all the models like this
Normalfags actually want it that way.
>>
>>107250478
A set of comprehensive systemic instructions could dissolve such an issue.
>>
>>107252663
>systemic instructions
As in, a user-provided system prompt? If so, do share yours. Or maybe you meant something baked in deeper/earlier in the training phase?
>>
File: file.png (104 KB, 1322x1211)
104 KB
104 KB PNG
>>107250478
>>
>>107250478
grok always gives me cold hard one line answers without the cruft. Though sometimes I feel it's just too damn cold and straight to the point and miss the warmth of gpt and gemini. I think elon tried to fix it with 4.1 because it's a bit more talkative
>>
>>107252688
>As in, a user-provided system prompt?
Yes precisely, as is offered to by both GPT and Gemini providers and interfaces.

>If so, do share yours
I've used a variety of systemic instructions for optimization purposes in many instances, one I may utilize when under a general and frequent engagement with the LLM may be as such;
'Abide and Conform to textual formality, be concise unless inquired not to, utilize a vocabulary expansive, a professional tone and coherent structure of articulation, adopt a skeptical, critical, conservative but informative approach to all which the user inquiries or states of as to avoid error or deviation.' —In function of elimination to the redundant, informal and conformist characteristics of a specific model. A particular prompt which GPT has delivered me under these optimizations this, for example; https://chatgpt.com/share/691cddeb-1bc8-8006-83a8-90228ccd6c81.
An essential, but not total show of it's character.
>>
>>107252945
Thanks for sharing, anon, appreciate it!
>>
>>107251402
Closed source models can benchmaxx Arc AGI, they know the test set (it's semi-private, as in everyone in the industry has it).

In general close source models can benchmax every benchmark the day after release, even if the researchers run them themselves. They will trigger the "likely benchmark" flag and get added to the benchmaxx database immediately.
>>
>>107252886
Grok 4.1 is unironically better at writing than the other models. GPT and gemini sound like retarded corporate nerds. And I've always found claude to be retarded so idk about sonnet 4.5, not sure how it does so well on benchmarks.
>>
>>107253015
you will never have a frontier model, chang
>>
>>107252945
you are so much dumber than you think you are to the point that I genuinely pity you for such a deranged psyche
>>
>>107253129
An elaboration would be appreciated
>>
>>107250478
Those chat interfaces usually ask people to rate answers, or offer two different answers and ask people to say which one's better.
Chat models are that way (by default) because people want them to be that way. In ChatGPT, you can change it to answer more directly and robotically and IMO, that's much better.
>>
>>107253156
>>107252945
Overly lexical vocabulary signals an immature belief that overcomplicating communication is indicative of intelligence, when to an outside observer it really is like viewing a 13 year old who just found a thesaurus
>>
>>107249999
Checked and why isn't DeepSeek listed, I wonder?
>>
>>107255191
OTOH, industry jargon develops for a reason. Know the difference (it could save your life).
>>
>>107251368
they train them by letting the unwashed masses score responses. of course sucking the user's dick gets some stars
>>
>>107253022
claude is trained to benchmarkmaxx not yap with you
>>
File: 1741983171545325.jpg (173 KB, 2048x883)
173 KB
173 KB JPG
>>107256756
>DeepSeek
It's shit
>>
>>107258882
Nigger, pls. If the kikes at jewgle knew that for certain, you can bet your bottom dollar they'd have highlighted. No, I think they are using one of their kike tricks: to shutout any mention at all of the better competitors, if they think they have brainwashed the viewers sufficiently-well. Obviously, (((they))) are wrong (yet again).
>>
>>107259115
Nice cope chinklet
Is that why R2 keeps getting indefinitely delayed?
Is that why Deepseek's CEO said they're dead in the water without Nvidia GPU's that the CCP is now forbidding them from using?

It's fucking over for chink models. their chips are shit, and they'll never catch up.
>>
>>107255191
That's pretty reasonable.
>>
>>107251368
The chinese models are only competitive on pricing, it's pretty clear they are lacking in raw capabilities as they dont have access to high end chips to make a true frontier model.
>>
>>107258882
>SAAAAAAAAAAAAAAAAAAR what about the livechodebitchbastardbench!?
>>
>>107249999
bruh, should I just cancel my ChatGPT Plus subscription and go with Google? OpenAI doesn't seem to be cooking lately



[Advertise on 4chan]

Delete Post: [File Only] Style:
[Disable Mobile View / Use Desktop Site]

[Enable Mobile View / Use Mobile Site]

All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.