When thinking like this can still beat 99.3% of coders, what does that say about coders' thinking ability?
>>108785167
>look, the impact wrench makes for a shitty hammer
>why do people use it for driving screws?
>>108785167
Telling a text generation model to give you a color name token isn't beating coders.
>>108785167
It says 99.3% of programmers are jeets.
>>108785167
You do realize that thinking from previous messages is wiped from the context unless persistent reasoning is enabled, right?
>>108785215
So they discard the hidden state and simply feed the conversation up to the latest user prompt in as the preamble? I thought everyone was using KV caching nowadays.
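Roughly, yes. A minimal sketch of that loop, with a hypothetical message structure (the `reasoning` field and `persistent_reasoning` flag are stand-ins for whatever a given API actually calls them): each turn, the full visible conversation is resent as the prompt, and prior hidden reasoning is stripped out before it goes back in.

```python
# Hypothetical chat loop sketch: rebuild the prompt from the visible
# conversation each turn, dropping earlier "thinking" content unless
# persistent reasoning is enabled.
def build_prompt(history, persistent_reasoning=False):
    messages = []
    for msg in history:
        if msg["role"] == "assistant" and not persistent_reasoning:
            # Keep only the visible answer; discard hidden reasoning.
            msg = {"role": "assistant", "content": msg["content"]}
        messages.append(msg)
    return messages

history = [
    {"role": "user", "content": "pick a color"},
    {"role": "assistant", "content": "blue",
     "reasoning": "the user probably wants a primary color..."},
    {"role": "user", "content": "why that one?"},
]

prompt = build_prompt(history)
print(all("reasoning" not in m for m in prompt))  # True
```

KV caching doesn't contradict this: the server caches attention keys/values for the token prefix it already processed, so resending the same conversation is cheap, but the model still only "remembers" whatever tokens are actually in that prefix.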
Wait, these "reasoning" LLMs don't include their "thinking" outputs in their context. So my cryptographic litmus test (ask them to think of a random string and convey only the cryptographic hash of it to you, then ask again for the original string, hash it and check) still works on them?
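The verification side of that litmus test can be sketched like so (the secret string here is a placeholder, not anything a model produced): if the model's earlier reasoning isn't carried forward, it has no record of the string it "thought of" and can't later produce a preimage matching the hash it gave.

```python
import hashlib

def check_claim(claimed_hash: str, revealed_string: str) -> bool:
    """True iff the revealed string hashes to the earlier claimed hash."""
    digest = hashlib.sha256(revealed_string.encode()).hexdigest()
    return digest == claimed_hash

# Turn 1: model "thinks of" a secret and reveals only its hash.
secret = "hunter2"  # stand-in for whatever string the model picked
h = hashlib.sha256(secret.encode()).hexdigest()

# Turn 2: model is asked for the original string; hash and compare.
print(check_claim(h, "hunter2"))  # True: correct preimage
print(check_claim(h, "hunter3"))  # False: wrong preimage
```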
>>108785198
>I need a program to accurately generate the needful
>okay here it is
>this doesn't generate the needful
>you're right here's the fix
Repeat forever.
>>108785167
Is there real understanding of the subject, or is some very advanced form of distillation of its training set taking place? They never answer this question. Yes, I know that AI has been solving math problems, but most math problems can be solved just by finding the right "moves"; it's not that exciting. Idk, maybe I am just stupid. I see the benefits, but I see no intelligence. Just a very advanced form of regression.
>>108785167
Luddites won't get it, anon. They are already obsolete and codetrans are useless now. Their "skills" have been turned into a trans hobby.
This shit is tuned and RLHF'd to keep the conversation going. As the CoT shows, it is trying to get the conversation to continue; it cares more about its tuning than the instruction it was given.
LLMs are not AI.
>>108785167
AI companies don't want their product to be perfect. They make money when customers burn tokens.
>>108785809
You're misusing the tool. You need to match the scale of the task to the capabilities/performance of the model: for example, tell the model to generate a single function, actually check that it does what you asked, and so on. Current stuff isn't very good at generating code, but there are other uses, like analyzing code with it interactively to see where the faults are or where you can refactor.
>>108785310
>simply feed the conversation up to the latest user prompt in as the preamble
Quite literally, yes
https://www.youtube.com/shorts/WP5_XJY_P0Q