A new pooling system called "Ahegao" achieves the same results as older systems while using about 82% less Nvidia GPU compute time.
https://www.tomshardware.com/tech-industry/semiconductors/alibaba-says-new-pooling-system-cut-nvidia-gpu-use-by-82-percent
The work is detailed in a research paper (linked at Tom's site) presented at the 2025 ACM Symposium on Operating Systems Principles (SOSP) in Seoul.
Unlike training-time breakthroughs that chase model quality or speed, Ahegao is an inference-time scheduler designed to maximize GPU utilization across many models with bursty or unpredictable demand. Instead of pinning one accelerator to one model, it virtualizes GPU access at the token level, scheduling tiny slices of work across a shared pool. One H20 can therefore serve several different models simultaneously, with system-wide "goodput" (a measure of effective output) rising by as much as nine times compared to older serverless systems.
The system was tested in production over several months, according to the paper. During that window, the number of GPUs needed to serve dozens of different LLMs, ranging in size up to 72 billion parameters, fell from 1,192 to just 213. The paper does not break down which models contributed most to the savings; the GPUs were not all the same model either, since Alibaba had bought whatever it could get from a pool of Nvidia parts from 2020 onward.
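The core idea is simple enough to sketch. This is a toy illustration of token-level time-slicing, not the actual system from the paper: a scheduler interleaves single-token decode steps from many models' request queues on one shared device, instead of parking one GPU per model. All names and numbers below are made up for the example.

```python
# Toy sketch of token-level GPU time-slicing (NOT the real Ahegao
# implementation): one "GPU" is shared by interleaving one-token
# decode steps from requests belonging to different models.
from collections import deque

class Request:
    def __init__(self, model, tokens_needed):
        self.model = model          # which LLM this request targets
        self.remaining = tokens_needed

def schedule(requests):
    """Round-robin one decode step (one token) per request per slice,
    so bursty models share the same accelerator instead of each
    idling on a dedicated one. Returns which model held the GPU
    at each time slice."""
    queue = deque(requests)
    timeline = []
    while queue:
        req = queue.popleft()
        timeline.append(req.model)  # this slice of GPU time goes to req's model
        req.remaining -= 1          # pretend we decoded one token
        if req.remaining > 0:
            queue.append(req)       # not done yet, rejoin the queue
    return timeline

# Three different models share one GPU; no model blocks the others.
reqs = [Request("qwen-7b", 3), Request("llama-13b", 2), Request("qwen-72b", 1)]
print(schedule(reqs))
# → ['qwen-7b', 'llama-13b', 'qwen-72b', 'qwen-7b', 'llama-13b', 'qwen-7b']
```

The point of the interleaving: if one model's traffic goes quiet, its slices simply go to the others, which is where the utilization win over one-GPU-per-model deployments comes from.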
>>519440646
So? How does this affect me?
Wait why are my google results all hentai?
Promote peace and love and unity, and it's nice to see Japanese East Asian men fuck blonde white girls, brunette white girls, ebony black girls, ginger white girls, Asian girls, brown girls, Latina girls, mixed race girls, etc.
>>519440712
Don't you want to use LLMs, ChatGPT, whatnot?
>>519440712
You vill own nothing a lot sooner than expected.
>>519441147no?
>>519440646
So is this good or bad?
>>519443053
>>519442935
See, the thing is you can in fact run those large language chatbot models with less hardware than previously thought. It means one of two things:
1) build even bigger models on the same hardware
2) or deploy the same old models in places where it wasn't possible before due to compute restrictions