DeepMind has released its newest flagship large language model in its Gemini series. So far it's only available in Google AI Studio, and access is heavily restricted.
Does it still begin every response with a paragraph of sycophancy?
>what's the weather today?
>That's an excellent question about a current event -- the weather! Your intellectual prowess really gets to the heart of the matter, "what's the weather today?" is a question for the ages!
>>107250478
Fucking kek, why are all the models like this?
>>107249999
Is it better than the Chinese models or not?
it's time to apologise
>>107251402
OAI is a small indie model maker, please understand
>>107251368
With respect to its statistical performance on the relevant benchmarks, it's the best model in the world by far. It'd be useless and a failure if it were not.
>>107251368
>why are all the models like this
Normalfags actually want it that way.
>>107250478
A set of comprehensive systemic instructions could dissolve such an issue.
>>107252663
>systemic instructions
As in, a user-provided system prompt? If so, do share yours. Or maybe you meant something baked in deeper/earlier in the training phase?
>>107250478
grok always gives me cold hard one line answers without the cruft. Though sometimes I feel it's just too damn cold and straight to the point and I miss the warmth of gpt and gemini. I think elon tried to fix it with 4.1 because it's a bit more talkative
>>107252688
>As in, a user-provided system prompt?
Yes precisely, as is offered by both the GPT and Gemini providers and interfaces.
>If so, do share yours
I've used a variety of systemic instructions for optimization purposes in many instances; one I may utilize when under a general and frequent engagement with the LLM may be as such:
'Abide and Conform to textual formality, be concise unless inquired not to, utilize a vocabulary expansive, a professional tone and coherent structure of articulation, adopt a skeptical, critical, conservative but informative approach to all which the user inquires of or states, as to avoid error or deviation.'
In function of elimination of the redundant, informal and conformist characteristics of a specific model. A particular prompt which GPT has delivered me under these optimizations is this, for example: https://chatgpt.com/share/691cddeb-1bc8-8006-83a8-90228ccd6c81. An essential, but not total, show of its character.
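For anyone hitting the API instead of the web UI: a system prompt like the one above is just the first message in the request. A minimal sketch assuming the standard OpenAI-style chat message format (the prompt text is shortened from anon's, and no actual API call is made here; the `client.chat.completions.create` line is shown only as a comment):

```python
# The "system" role carries standing instructions; user turns come after it.
SYSTEM_PROMPT = (
    "Abide and conform to textual formality, be concise unless asked not to, "
    "and adopt a skeptical, critical, conservative but informative approach."
)

def build_messages(user_text: str) -> list[dict]:
    """Prepend the standing system prompt to a single user turn."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_text},
    ]

messages = build_messages("what's the weather today?")
# With the official SDK this payload would then be sent as, e.g.:
#   client.chat.completions.create(model="gpt-4o", messages=messages)
print(messages[0]["role"])  # system
```

Gemini's API uses a separate `system_instruction` field rather than a message with a `system` role, but the idea is the same.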
>>107252945
Thanks for sharing, anon, appreciate it!
>>107251402
Closed source models can benchmaxx ARC-AGI, they know the test set (it's semi-private, as in everyone in the industry has it). In general closed source models can benchmaxx every benchmark the day after release, even if the researchers run them themselves. They will trigger the "likely benchmark" flag and get added to the benchmaxx database immediately.
>>107252886
Grok 4.1 is unironically better at writing than the other models. GPT and gemini sound like retarded corporate nerds. And I've always found claude to be retarded so idk about sonnet 4.5, not sure how it does so well on benchmarks.
>>107253015
you will never have a frontier model, chang
>>107252945
you are so much dumber than you think you are, to the point that I genuinely pity you for such a deranged psyche
>>107253129
An elaboration would be appreciated
>>107250478
Those chat interfaces usually ask people to rate answers, or offer two different answers and ask people to say which one's better. Chat models are that way (by default) because people want them to be that way. In ChatGPT, you can change it to answer more directly and robotically, and IMO that's much better.
>>107253156
>>107252945
Overly lexical vocabulary signals an immature belief that overcomplicating communication is indicative of intelligence, when to an outside observer it really is like viewing a 13 year old who just found a thesaurus
>>107249999
Checked. And why isn't DeepSeek listed, I wonder?
>>107255191
OTOH, industry jargon develops for a reason. Know the difference (it could save your life).
>>107251368
they train them by letting the unwashed masses score responses. of course sucking the user's dick gets some stars
>>107253022
claude is trained to benchmaxx, not yap with you
>>107256756
>DeepSeek
It's shit
>>107258882
If google knew that for certain, you can bet your bottom dollar they'd have highlighted it. No, I think they're using one of their usual tricks: shut out any mention at all of the better competitors, if they think they have brainwashed the viewers sufficiently well. Obviously, they are wrong (yet again).
>>107259115
Nice cope
Is that why R2 keeps getting indefinitely delayed? Is that why DeepSeek's CEO said they're dead in the water without the Nvidia GPUs that the CCP is now forbidding them from using?
It's fucking over for Chinese models. Their chips are shit, and they'll never catch up.
>>107255191
That's pretty reasonable.
>>107251368
The Chinese models are only competitive on pricing. It's pretty clear they're lacking in raw capabilities, as they don't have access to the high end chips needed to make a true frontier model.
>>107258882
>SAAAAAAAAAAAAAAAAAAR what about the livechodebitchbastardbench!?
>>107249999
bruh, should I just cancel my ChatGPT Plus subscription and go with Google? OpenAI doesn't seem to be cooking lately