>>108562474
llama-server \
  -m /models/llm/gemma-4-31b-it-heretic-ara-Q8_0.gguf \
  --mmproj /models/llm/mmproj-google_gemma-4-31B-it-bf16.gguf \
  --threads 16 --no-mmap --mlock --no-warmup -ngl 999 -ub 1536 \
  -c 262144 -ctk q8_0 -ctv q8_0 --flash-attn on -kvu --cache-ram 0 \
  --swa-checkpoints 3 --parallel 1 \
  --temp 0.7 --top-k 64 --top-p 0.95 --min-p 0.05 \
  --image-max-tokens 1120 \
  --reasoning-budget 8192 --reasoning on \
  --chat-template-file /models/llm/chat_template.jinja \
  --verbose
i've been getting settings from the threads since gemma4 came out lol
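fwiw you can ballpark what -c 262144 with q8_0 K/V cache costs you. the layer/head/dim numbers below are placeholders (not actual gemma-4-31b specs), plug in the real ones from the gguf metadata:

```python
# rough kv cache size estimate for llama.cpp with a quantized cache
# NOTE: n_layers / n_kv_heads / head_dim are made-up placeholders,
# read the real values out of your gguf before trusting the number
n_layers   = 48      # hypothetical
n_kv_heads = 8       # hypothetical (GQA)
head_dim   = 128     # hypothetical
ctx        = 262144  # from -c above

# q8_0 stores blocks of 32 values: 32 int8 + one fp16 scale = 34 bytes
bytes_per_elt = 34 / 32  # ~1.0625 bytes per element

per_token = 2 * n_layers * n_kv_heads * head_dim  # K and V
total_gib = per_token * ctx * bytes_per_elt / 2**30
print(f"~{total_gib:.1f} GiB")  # ~25.5 GiB with these placeholder dims
```

real usage should come in lower if most layers are sliding-window like gemma's (the --swa-checkpoints flag suggests it is), since SWA layers only keep the window, not the full 262k.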