>>107951047
[23:06:22] CtxLimit:8192/8192, Amt:100/100, Init:0.49s, Process:15.16s (533.63T/s), Generate:78.85s (1.27T/s), Total:94.02s
Benchmark Completed - v1.106.1 Results:
======
Flags: NoAVX2=False Threads=5 HighPriority=False Cuda_Args=['normal', '0', 'mmq'] Tensor_Split=None BlasThreads=5 BatchSize=512 FlashAttention=True KvCache=0
Timestamp: 2026-01-23 22:06:22.772959+00:00
Backend: koboldcpp_cublas.dll
Layers: 18
Model: c4ai-command-r-v01.i1-IQ2_XXS
MaxCtx: 8192
GenAmount: 100
-----
ProcessingTime: 15.164s
ProcessingSpeed: 533.63T/s
GenerationTime: 78.853s
GenerationSpeed: 1.27T/s
TotalTime: 94.017s
Output: 1 1 1 1
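The reported speeds check out against the raw numbers, assuming the benchmark processes a prompt of MaxCtx minus GenAmount tokens (which the 533.63 T/s figure implies):

```python
# Sanity-check of the benchmark's reported speeds (values copied from the log)
max_ctx, gen_amount = 8192, 100
proc_time, gen_time = 15.164, 78.853  # seconds

prompt_tokens = max_ctx - gen_amount  # 8092 prompt tokens processed
print(round(prompt_tokens / proc_time, 2))  # 533.63 T/s processing
print(round(gen_amount / gen_time, 2))      # 1.27 T/s generation
```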
And VRAM sits at 10.8 of 12 GB. Maybe try +1 layer at a time until it crashes, then back off one?
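A back-of-the-envelope way to guess how many more layers fit, assuming each offloaded layer costs roughly the same VRAM and ignoring the fixed context/KV-cache cost (both are assumptions, not measurements, so treat the answer as a starting point, not a guarantee):

```python
def extra_layers_that_fit(layers, vram_used_gb, vram_total_gb, headroom_gb=0.5):
    # Crude linear estimate: divides current usage evenly across offloaded
    # layers, which overstates per-layer cost since some VRAM is fixed overhead.
    per_layer = vram_used_gb / layers
    free = vram_total_gb - vram_used_gb - headroom_gb
    return max(0, int(free // per_layer))

# With 18 layers at 10.8 of 12 GB, this suggests about one more layer fits.
print(extra_layers_that_fit(18, 10.8, 12.0))  # 1
```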