Before >>108315886
| model | size | params | backend | ngl | test | t/s |
| --------------------------------- | ---------: | ---------: | ---------- | --: | --------------: | -------------------: |
| qwen35 9B Q4_K - Medium | 5.55 GiB | 8.95 B | CUDA | 99 | pp512 | 10562.39 ± 2221.70 |
| qwen35 9B Q4_K - Medium | 5.55 GiB | 8.95 B | CUDA | 99 | tg128 | 182.18 ± 0.74 |
| qwen35moe 397B.A17B Q4_K - Medium | 199.66 GiB | 396.35 B | CUDA | 99 | pp512 | 1607.97 ± 88.97 |
| qwen35moe 397B.A17B Q4_K - Medium | 199.66 GiB | 396.35 B | CUDA | 99 | tg128 | 76.17 ± 0.46 |
After >>108315886
| model | size | params | backend | ngl | test | t/s |
| --------------------------------- | ---------: | ---------: | ---------- | --: | --------------: | -------------------: |
| qwen35 9B Q4_K - Medium | 5.55 GiB | 8.95 B | CUDA | 99 | pp512 | 10646.87 ± 2234.18 |
| qwen35 9B Q4_K - Medium | 5.55 GiB | 8.95 B | CUDA | 99 | tg128 | 204.02 ± 0.35 |
| qwen35moe 397B.A17B Q4_K - Medium | 199.66 GiB | 396.35 B | CUDA | 99 | pp512 | 1645.70 ± 7.07 |
| qwen35moe 397B.A17B Q4_K - Medium | 199.66 GiB | 396.35 B | CUDA | 99 | tg128 | 87.14 ± 0.33 |
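
For a quick read of the two tables, here is a minimal sketch that turns the before/after means into relative changes. The numbers are copied from the t/s columns above; the short model labels and the `speedup_percent` helper are just names made up for this example, and the ± spread is ignored, so treat the pp512 figures with caution given how wide the 9B error bars are.

```python
# Mean t/s copied from the two tables above as (before, after).
# The short model labels are illustrative, not llama-bench output.
results = {
    ("qwen35 9B", "pp512"): (10562.39, 10646.87),
    ("qwen35 9B", "tg128"): (182.18, 204.02),
    ("qwen35moe 397B.A17B", "pp512"): (1607.97, 1645.70),
    ("qwen35moe 397B.A17B", "tg128"): (76.17, 87.14),
}


def speedup_percent(before: float, after: float) -> float:
    """Relative change of the mean throughput, in percent."""
    return (after / before - 1.0) * 100.0


for (model, test), (before, after) in results.items():
    change = speedup_percent(before, after)
    print(f"{model:20s} {test}: {before:9.2f} -> {after:9.2f} t/s ({change:+.1f}%)")
```

By this arithmetic tg128 improves by roughly 12% on the 9B and roughly 14% on the 397B MoE, while pp512 moves by about 1% and 2%, which for the 9B is well inside the reported ± 2221.70 spread.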