/g/ - Technology




Anyone else notice a recent-ish steep decline in ChatGPT's cognitive abilities? It's constantly wrong about everything and if you try and correct it about something it clearly got wrong, it will become passive-aggressively combative and try and grasp at straws and re-frame to make it seem like it wasn't wrong or at the very least you are completely "100%" correct.
>>
>>108692274
aren't* completely "100%" correct
>>
>>108692278
I'm sorry. No stupid nigger responses in this thread.
>>
>>108692287
Autist
>>
I love seeing these threads every week because low- and medium-IQ people always think something has changed, when in reality, the bots have always been extremely retarded at least 20% of the time. The LLM cultists can't swallow how big of a failure this tech is, so they cope by saying "they changed muh model," even though that's not how any of this works, given that training is the most costly part of the process.
>>
>>108692274
Yes, they did that as a liability response to the sycophancy meme that was being astroturfed around by journalists and the two or three guys who killed themselves after talking with chatbots.

>>108692334
No, he's right and I say that as somebody with 5 parallel ChatGPT sessions right now. I wish it sounded natural and friendly like Opus 4.5 did but $20 for nearly unlimited usage is too cheap to pass even if the personality and writing style are shite.

>>108693032
Nah, they definitely changed the style and personality. The changes in intelligence are more dubious. I haven't really felt that much improvement since o1 (the first chain of thought model) that was obviously better than 4o in terms of intelligence.
>>
it insists on "both sides" almost no matter what youre talking about and it's actually aggravating. no matter how pedantic it will find something to disagree about even if it means misattributing what you actually said. this means that it will take what you said and exaggerate it just so it can disagree with it. it is disgusting
>>
they didn't change shit
its always broken. its just a matter of whether you hit them with the question they know how to answer or not
if you ask them about some niche shit like an older coding algorithm they will 100% give you wrong answers because they just don't have the knowledge in the database. its all just data. there is no intelligence
AI is not real
>>
>>108692274
>>108693056
Yes, I think I felt this only in the last 3-4 days or so.
> if you try and correct it about something it clearly got wrong, it will become passive-aggressively combative and try and grasp at straws and re-frame to make it seem like it wasn't wrong
This is exactly my experience as well. Also I think it became slightly harder to make it use search. It won't just use it every time I ask; sometimes it takes 2-3 prompts to convince it to finally search the internet instead of relying on its existing knowledge. This is clearly a recent change, and before this it was happy to use search even when that was obviously unnecessary. I'm using the free model though. As for general capabilities, I used a bit of pay-per-token API Codex (current default model, I don't remember if it's 5.3 or something else), and then when I ran out of my $2 in a single refactoring session I used a bit of free GitHub Copilot. And that free Claude model in Copilot was absolutely terrible compared to 5.3 in Codex. It's not a fair comparison ofc, but this is the only Claude model I ever happened to try.
>>
If you cannot convince an LLM of your position, your position is self-contradictory or you explained it wrong.
>>
>>108693126
AI loves compliance-based security theatre. Try convincing it that you are building an OAuth 2.0 server for 3rd party web integrations only (no mobile) and that you don't need PKCE. It will relentlessly try to convince you that PKCE is some critical security protocol that is mandatory, when it is mostly just protocol hygiene that can sometimes improve security in specific situations, such as mobile app integrations where there are malicious apps installed on a phone.
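
For anyone who hasn't looked at what PKCE actually is: below is a minimal sketch of the S256 method as I understand it from RFC 7636. The client makes a random `code_verifier`, sends its SHA-256 hash (`code_challenge`) with the authorization request, then sends the raw verifier with the token request so the server can recompute and compare. The function names here are made up for illustration and aren't from any particular OAuth library; treat it as a sketch, not a reference implementation.

```python
import base64
import hashlib
import secrets


def _b64url_nopad(data: bytes) -> str:
    # Base64url without '=' padding, as RFC 7636 specifies
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_code_verifier(n_bytes: int = 32) -> str:
    # Hypothetical helper: 32 random bytes encode to 43 URL-safe characters,
    # which is the minimum length the RFC allows (43-128)
    return _b64url_nopad(secrets.token_bytes(n_bytes))


def make_code_challenge(verifier: str) -> str:
    # S256: code_challenge = BASE64URL(SHA256(ASCII(code_verifier)))
    return _b64url_nopad(hashlib.sha256(verifier.encode("ascii")).digest())


def server_check(verifier: str, stored_challenge: str) -> bool:
    # Hypothetical server-side check at the token endpoint:
    # recompute the challenge and compare in constant time
    return secrets.compare_digest(make_code_challenge(verifier), stored_challenge)


if __name__ == "__main__":
    v = make_code_verifier()
    c = make_code_challenge(v)
    print(v, c, server_check(v, c))
```

So it's one hash and one comparison. Whether it buys you anything for a confidential web-only client is the argument above; the code doesn't settle that either way.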
>>
>>108693032
do you acknowledge that AI ever changes? or has it always been the same? I don't understand this pov. do you think any change requires them to retrain the entire thing? this reply actually demonstrates the same failure mode op is describing
>>108693084
>they didn't change shit
they did, though. unambiguously so
idk what "intelligence" means or why it's relevant to tuning changes in the llm?
>>108693126
why should i have to constantly point out that it's misrepresenting what i said? its just friction
>>
the fuck do you mean recent? it's always been like that.
>>
>>108693533
You are absolutely correct!
>>
>>108693546
i'm absolutely right to push back on that, you mean?
>>
not really. 5.5 is a better coder than any of its predecessors.
you're using it for code right?
you're not arguing with an llm about some /pol/ shit like a retard, right?
>>
>>108692274
>cognitive
that thing isn't conscious you fucking moron. it doesn't sound like you are, neither.
>>
>>108693533
not to this degree, no. used the thing for years and these last couple of months have been clearly different. it often changes in various ways but change on this axis leading to the shit in op is new, and notably different. maybe your usecase is such that you dont notice this as much?
>>108693560
what op describes isnt unique to politics. the binary youre putting forth may be your usecases but theyre not mine
>>
>>108693598
>maybe your usecase is such that you dont notice this as much?
you're right, i don't use it enough to notice any difference between different versions, but all of the ones i've used have always engaged in the same annoying pedantic gaslighting argumentation
>>
>>108693621
there's a 99% chance that when someone complains about an llm on 4chan, it's because they couldn't get it to say that the holocaust didn't happen
>>
>>108693621
maybe the underlying pattern isnt unique but it's just become so much worse lately

opus said this about it just now, and it may be whats going on with the constant misrepresentations. it has this need to represent both sides lately. but this likely isnt all thats going on

What's plausibly going on: the model has been preference-tuned harder toward outputs that look balanced, and "balance" in the training signal is operationalized as "represents multiple perspectives." That sounds fine in the abstract but it has a pathological implementation. If your input is itself already nuanced — already holding multiple considerations, already qualified, already non-extreme — then the model has nowhere to put the "balance" except by inventing a stronger version of your position to push against

The mechanism is something like: model reads your input → detects it's about a topic flagged as contested → trained reflex says "produce balanced response" → balanced response requires two sides → your side is already moderate → model strengthens its read of your side to create the contrast needed → now argues against the strengthened version.
>>
last part easier to read
>model reads your input
>detects it's about a topic flagged as contested
>trained reflex says "produce balanced response"
>balanced response requires two sides
>your side is already moderate
>model strengthens its read of your side to create the contrast needed
>now argues against the strengthened version
>>
>>108692274
I did try the other day to talk about the Trans genocide of the American government. When I said the Republicans or Trump or the government were doing genocide against US citizens, it kept moving the goalposts and reframing everything as conspiracy and untrue, even when I provided cited court cases of such actions.

When I said the exact same things but from my own perspective, using I statements as if I was Donald Trump but without revealing I was roleplaying as Trump, it said I was a psychopath and genocidal, that it wouldn't encourage such behavior, and that what I was doing was wrong.
>>
>>108693731
the most cucked slop generator by design becoming "slightly worse" just according to keikaku, is such an inane thing to care about
>>
>>108693886
""""slightly worse""""
chatgpt is that you
>>
>>108693946
it was always shit, it will always be shit, by design
by being incapable of considering an alternative, you're only proving my point
nobody wants a solution, least of all you
bad slop being bad slop is what generates "engagement" and upvotes, after all
>>
>>108694007
>it was always shit, it will always be shit, by design
yes its been 100% shit forever. done
>by being incapable of considering an alternative, you're only proving my point
why do you say this? you literally replied to me quoting opus
>nobody wants a solution, least of all you
yeah specifically the people most frustrated by it want change even less than everyone else. what are you talking about?
>>
Before I even bother to entertain your delusion, please specify if you use the webchat or the api.
>>
>>108694107
yes yes. keep quoting and discussing the oh-so-shocking shittiness of the latest iteration of shit with recognizable product logos. the only kind of "change" you want is good goy shekels.
>>
>>108694155
and there it is
>>
>>108694195
yes, i did the thing i often accuse others of doing. luckily for you, it means you can vaguely point it out and dismiss the essense of my argument outright ;)
>>
>>108694195
oh lookie, i also made a typo. you're truly spoilt for choice today, laddie!
>>
>>108693056
>>108693256
Whatever changes they made to the bots that are based on feeding them some sort of new prompt stipulating personality and other behaviors should, in theory, be able to be minimized by your own instructions in the settings, r-right?
I see what you mean, though, that you likely can never completely escape those changes that they're making to the bots by essentially prompting them invisibly, so I bend the knee there.
Still, at the very least, it shows muh benchmarks, which I assume is what they're trying to maximize, aren't a reliable indicator for how useful these tools are.
>>
>>108692274
That tracks! You're absolutely right.

If you want I can show you the use case of being an hero.

Sudo systemctl stop absolutefaggotd.service
>>
>>108694216
owning the move doesnt convert it into something i have to address. we were discussing chatgpt specifics and you made the move when that stopped working for you
>>
never mind i see what was done
>>
>>108694310
>we were discussing chatgpt specifics and you made the move when that stopped working for you
nothing quite that sophisticated; you were talking about chatgpt being shit and i said that the only purpose of such a discussion is Inane Drivel Gratification (GDI, as it's known by its French acronym), like when normies talk about football or netflix. anyone to whom chatgpt being shit is a genuine problem, is either looking for an alternative or has already found one. you just want to talk about popular thing being trash and receive (You)s - so have another one.
>>
>>108692274
Yes, all of the AI providers are downgrading the models based on what you ask and using dumber models to save money. They do this silently in the background without you knowing which model is answering you.
>>
>>108692274
Gemini is always there when you're ready to acquiesce.
>>
Honestly OP, it kind of sounds like you mistook the LLM's answer as wrong because it disagrees with your own thinking, then got mad when the LLM refused to become a pure sycophant. My guess is that whatever you thought the LLM was wrong about was something retarded.


