/g/ - Technology






File: tq.png (190 KB, 621x649)
memory prices coming back down to normal sooner than you think
>>
>>108460578
That's just the cache though. Newer models have already been able to do 1 GB per 10k tokens for cache. If it comes down further, great, but it's not going to change the biggest LLMs requiring 1.5 TB of VRAM to run.
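For anyone wondering where a figure like "1 GB per 10k tokens" could come from, here is a rough sketch of the usual per-token KV cache estimate. The model dimensions below are made up for illustration and are not taken from any real model or from the posts in this thread; they were picked only because they land near that ballpark.

```python
def kv_cache_bytes_per_token(num_layers, num_kv_heads, head_dim, bytes_per_value):
    """Rough KV cache size for a single token.

    The factor of 2 is one key vector plus one value vector,
    stored for every layer and every KV head.
    """
    return 2 * num_layers * num_kv_heads * head_dim * bytes_per_value


def kv_cache_bytes(tokens, **dims):
    """Total KV cache for a context of `tokens` tokens."""
    return tokens * kv_cache_bytes_per_token(**dims)


# Hypothetical dimensions (assumption, not a real model):
# 50 layers, 4 KV heads, head_dim 128, 16-bit (2 byte) values.
dims = dict(num_layers=50, num_kv_heads=4, head_dim=128, bytes_per_value=2)

per_token = kv_cache_bytes_per_token(**dims)
per_10k = kv_cache_bytes(10_000, **dims)
print(f"{per_token} bytes per token, {per_10k / 1e9:.3f} GB per 10k tokens")
```

Under those assumptions the cache scales linearly with context length, and it is a separate budget from the memory the weights themselves need.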
>>
>>108460591
anon, can you tell the class where cache is being stored and how that might affect memory prices?
>>
>>108460618
It's a series of tubes
>>
>>108460578
>memory prices coming back down to normal sooner than you think
https://en.wikipedia.org/w/index.php?title=Jevons_paradox
>>
>>108460618
On whatever medium happens to be faster than whatever you're originally loading it from. If you're loading it from memory, the cache is on the CPU. If you're loading it from an NVMe drive, the page cache is in memory. If you're loading it from the network, you could cache it on mass storage.
>>
>>108460661
It's going to get much worse in the future whenever LLMs get replaced by something better.
>>
>>108460578
As if that's going to fix anything. Even if memory use was made 100x more efficient, they'd just understand that they can keep their orders in place and use 100x as much bullshit.
>>
>>108460578
>sooner than you think
less than two weeks?
>>
>>108460578
quantizing kv cache to 8 bit completely mindrapes the model and makes it hallucinate and go off the rails even worse than they do normally, 4 bit amplifies that even more. There is no way you're getting 3 bit KV quant with "no accuracy loss"
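A toy way to see why fewer bits means more rounding error: the sketch below does naive min-max uniform quantization on random numbers and measures the round-trip error at 8, 4, and 3 bits. This is an assumption-heavy illustration, not the method from the paper in the OP, and numeric error on random data says nothing definitive about how a real model's output quality holds up.

```python
import random


def quantize_roundtrip(values, bits):
    """Naive min-max uniform quantization: snap each value to the
    nearest of 2**bits evenly spaced levels, then map it back."""
    levels = 2 ** bits - 1
    lo, hi = min(values), max(values)
    scale = (hi - lo) / levels
    if scale == 0:
        return list(values)
    return [lo + round((v - lo) / scale) * scale for v in values]


def mean_abs_error(values, bits):
    """Average absolute difference between original and round-tripped values."""
    restored = quantize_roundtrip(values, bits)
    return sum(abs(a - b) for a, b in zip(values, restored)) / len(values)


random.seed(0)
# Stand-in for cached activations: 4096 samples from a unit Gaussian.
values = [random.gauss(0.0, 1.0) for _ in range(4096)]

errors = {bits: mean_abs_error(values, bits) for bits in (8, 4, 3)}
for bits, err in errors.items():
    print(f"{bits}-bit mean abs error: {err:.5f}")
```

Schemes that claim low loss at 3 bits typically do something smarter than this (per-group scales, rotations, outlier handling), which is why the naive numbers here are only a baseline.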
>>
>>108460591
it's KV cache not CPU/GPU cache, and you know it, you are one of these SK Hynix / Samsung bag holders screeching at your stocks dumping
>>
File: bad.png (134 KB, 500x462)
>>108460578
what if Jevons paradox applies to this?
>>
>>108460635
Bags of sand
>>
>>108461028
Shoop do woop with milk and pennies, my mudkip
>>
this paper is from q1 of last year and its effects are already in place
>>
>>108460578
Even if this works
>Implying they won't run more and bigger shit on same hardware
>Implying they'll stop buying up supply which sucks out oxygen from competition
>Implying the plan isn't to bully out personal computing as a concept and force everything to be hardware as a service
>>
>>108460591
>>108460618
so is this very good news or not???

>>108460981
>Jevons paradox
you know what, 100% a nothingburger. with these greedy techniggers we can only lose.
>>
>>108461139
It's not.
>>
>>108460578
does this mean anything for coomer image and video gen
>>
File: file.png (861 KB, 1557x1376)
i am begging people to stop reading pc gay men consumer blogs for ai news
>>
>>108461139
>so is this very good news or not???
It's good news, but not something revolutionary for home use. The biggest benefit is when you run concurrent requests that all need their own context. Even at 1 GB per 10k tokens it adds up pretty quick. 10 GB for 100k tokens, but if you have 20 concurrent requests then that's 200 GB.
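The arithmetic in the post above, as a quick sketch. The 1 GB per 10k tokens rate is the figure quoted in this thread, not a measured value, and the function name is made up for illustration.

```python
GB_PER_10K_TOKENS = 1.0  # rate quoted in the thread (assumption, not measured)


def kv_cache_gb(context_tokens, concurrent_requests=1, gb_per_10k=GB_PER_10K_TOKENS):
    """KV cache needed when every concurrent request holds its own context."""
    return context_tokens / 10_000 * gb_per_10k * concurrent_requests


print(kv_cache_gb(100_000))      # one request with 100k tokens of context
print(kv_cache_gb(100_000, 20))  # twenty such requests at once
```

The point is that the cache cost multiplies by the number of in-flight requests, so a server pays it many times over while a single home user pays it once.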
>>
>>108461258
>Even at 1 GB per 10k tokens it adds up pretty quick. 10 GB for 100k tokens, but if you have 20 concurrent requests then that's 200 GB.
what? does this make any sense? I think he speaks gibberish like a tard and has no idea whatsup.
>>
>>108460578
That's just wrong. It means they will be able to do even more with the memory they already have, and can have.
>>
>>108460578
Wait... over UNQUANTIZED bits???
What about comparison to the existing quantization?
>>
>>108460981
Jevons paradox applies to everything that consumes resources.
>>
>>108460578
>google
>not even 10x
Prices are never coming down.
>>
>>108460578
okay can I run a 1 trillion qubit ai locally yet?
>>
>>108461621
>Jevons paradox applies to everything that consumes resources
Even me?
Wait, every time when I get better I consume more.
When my gf gets better penis, she wants more.

This checks out. Ai could 10x tomorrow and it would only mean more porn and more slop generation. Humans are a virus.
>>
>>108460578
>google's
Yeah no, the prices are gonna go up thanks to them.
>>
>>108460661
this
>thanks to leds outdoor lighting is going to consume much less electricity
>proceeds to install 100x more outdoor lights
>oh for some reason outdoor lighting consumes even more electricity than before


