/g/ - Technology

File: llm.png (508 KB, 1600x856)
can anyone explain in layman terms how LLMs of today differ from shit like autosuggest, the Akinator web genie or Siri etc from the past
is it not fundamentally the same tech, just super scaled up to be mega inefficient and brute forced?
>>
>>108512057
Scale. It's amazing what you can do when you throw enough hardware at the problem.
>>
>>108512057
Akinator is a whole different beast.
>>
I watched an intro from a course by Andrew Ng and his point was kind of >>108512401. When you have lots of data the magic starts to happen, but there was also a breakthrough with the invention of the transformer architecture. LLMs aren't just looking at the previous n words to calculate the next one, they look at the "context" (which is the layman term for attention) of the conversation and use a previously built mathematical construct of meaning (turning words into numbers and then creating an invisible layer of connections during training), which can then be used to predict the next word. Also Akinator was just a complex decision tree, was it not?
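The "look at the whole context" part can be sketched in a few lines. This is a toy scaled dot-product self-attention in numpy — toy sizes, a single head, and no learned projections, so a sketch of the idea rather than a real model (which uses thousands of dimensions and many heads and layers):

```python
import numpy as np

def attention(q, k, v):
    # score every token against every other token in the context
    scores = q @ k.T / np.sqrt(k.shape[-1])
    # softmax: turn scores into per-token weights that sum to 1
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    # each output vector is a weighted mix of ALL value vectors,
    # i.e. the token "attends" to the whole context at once
    return w @ v

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))  # 5 tokens, each a "word turned into numbers"
out = attention(x, x, x)     # self-attention over the 5-token context
print(out.shape)             # (5, 8): one mixed vector per token
```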
>>
>>108512487
>>108512436
so what was the innovation that pushed this slop into being so profitable and "Revolutionary", the whole context thing?
>>
>>108512493
Yes and no. Yes, the revolution happened because of big data + the transformer architecture, but no, it's not so "profitable". It's pretty fucking expensive and they are currently losing lots of money to stay competitive while praying to God that the competition just can't keep up... eventually.
>>
>>108512526
by profitable i mean it's still profitable for them to scam investors with the fake promises, i know they lose money due to the inefficiency
but if this is really how llms work, that they always have to rely on mega data and brute force, won't they always stay inefficient and unprofitable even with economies of scale?
>>
>>108512493
>what was the innovation
There were multiple innovations plus simple brute force at scale. It's nowhere near being profitable, and it's still uncertain how useful it will turn out to be. A lot of AI output is solidly in the uncanny valley territory right now (e.g. the ballyhooed AI-written C compiler that can supposedly compile Linux but couldn't compile a simple hello world program), and it's uncertain if they'll be able to pull out of it.
>>
>>108512526
>they are currently losing lots of money to stay competitive while praying to God that the competition just can't keep up... eventually.
Which btw was Uber's prime strategy during the 2010s (and usually is the whole startup game). They drove prices down so aggressively that it made the competition look like a bunch of kids playing tic-tac-toe, unable to compete, while of course expanding like maniacs.
Unlike Uber, though, it's not that hard to catch up when you are an AI company, hence the famous leaked Google memo, "We have no moat, and neither does OpenAI". You can even train on your competitors' tokens (a practice called distillation, which Anthropic recently tried to poison).
Btw that's why Altman and co were seething and shitting themselves in fear, calling for AI safety and talking about the end of the world, the risk of AGI etc. They were trying to regulate the market so they could curb-stomp any other companies trying to get in.
>>
>>108512535
>even with economy of scale won't they always stay inefficient and unprofitable?
We are in a phase of the tech, as with almost any tech race, where they are burning crazy capex to stay relevant and competitive. There's nothing saying that after the competition dies out (which sort of happened to companies like Mistral, and I think DeepSeek seems to be in deep shit too) and they get their exit, the engineers won't go gaga on optimizing everything to make it cheaper to run.
Google did exactly that recently. I can't give you the details, but they made context window tokens fairly compact in a recent paper, something like that.
So my point is, they just need to set their minds to it, but right now is not the time yet. Or at least that's my reading.
>>
>>108512590
>engineers
>>
>>108512057
basically same but more layers, more parameters
>>
>>108512487
Attention is certainly the innovation that made the current AI boom possible, but it is also quite literally a pure function of the last n words/tokens, where n is often (but not always) the entire context size.
It basically still is just calculating the next word/token as a function of the previous words/tokens, but done so in a clever way and with a metric bloatload of parameters compared to the size of neural networks from the previous AI booms.
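That "pure function of the previous tokens" is easiest to see at n = 1, i.e. the old autosuggest tech from the OP. A toy sketch with a made-up corpus (nothing real assumed):

```python
from collections import Counter

corpus = "the cat sat on the mat the cat ate".split()

def bigram_next(prev_word):
    # pure function: output depends only on the single previous word,
    # via the follower counts baked in from the corpus
    followers = Counter(b for a, b in zip(corpus, corpus[1:]) if a == prev_word)
    return followers.most_common(1)[0][0] if followers else None

print(bigram_next("the"))  # "cat": the most frequent follower of "the"
```

An LLM is the same kind of function in principle, just over the whole window and with billions of learned parameters instead of raw counts.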
>>
>>108512057
I can smell the tears behind this post
>>
>>108512057
LLM was first conceived of in like 1800s anon. Nothing is new with the AI tech, it's just that now in the year 2026 we have more powerful memory and other hardware
>>
>>108512057
Scale, and the emergent behavior that results from it. While it's still a neural network, it's so large that a lot of what goes into inference and such gets obfuscated.
>>
>>108513461
>LLM was first conceived of in like 1800s
"Yo what if there were a talking machine"
Wow I just conceived of LLM
>>
>>108512057
LLMs are basically the same thing as text prediction and markov chains, but they can look at the whole context, rather than a handful of previous words.
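For comparison, the whole of that older tech fits in a few lines: an order-1 Markov chain conditions on exactly one previous word, versus an LLM's full context window. A toy sketch with an assumed corpus:

```python
import random
from collections import defaultdict

text = "the cat sat on the mat and the cat slept on the mat".split()
chain = defaultdict(list)
for a, b in zip(text, text[1:]):
    chain[a].append(b)           # transitions learned from one-word windows

random.seed(1)
word, out = "the", ["the"]
for _ in range(5):
    word = random.choice(chain[word])  # next word depends ONLY on current word
    out.append(word)
print(" ".join(out))             # locally plausible, globally incoherent
```

Every step forgets everything except the current word, which is exactly why Markov text drifts into nonsense while an attention model can stay on topic.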
>>
>>108513435
>a pure function
For example the Game of Life: its rule set is also a pure function, yet the system has the power of a Turing machine.
In the case of an LLM, let's say it was trained with these rules:
>A leads to C, C+B leads to D
>A and D are incompatible
Given an input containing A and B, the LLM starts *reasoning* and generates C, then D. Then it realizes A+B are incompatible - a rule it was never directly trained on - and starts looking at alternative reasoning lines like A+E, A+F etc.
"muh token predictor" is like the most artfag cope ever, I don't believe anyone who studied CS would fall for this shit
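The Game of Life point is easy to make concrete: the whole rule set is one pure function of the grid, and iterating that function is Turing-complete. A minimal sketch:

```python
from collections import Counter

def step(live):
    """One generation of Conway's Game of Life; `live` is a set of (x, y) cells."""
    # count, for every position, how many live neighbours it has
    counts = Counter(
        (x + dx, y + dy)
        for x, y in live
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # birth on exactly 3 neighbours, survival on 2 or 3
    return {cell for cell, n in counts.items()
            if n == 3 or (n == 2 and cell in live)}

blinker = {(0, 0), (1, 0), (2, 0)}    # horizontal line of three cells
vertical = step(blinker)              # flips to a vertical line
print(step(vertical) == blinker)      # True: a period-2 oscillator
```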


