What am I doing wrong? Probably something obvious on my end, so apologies in advance.
$ git clone https://huggingface.co/google/gemma-4-31B
$ python convert_hf_to_gguf.py --outfile gemma-4-31B.gguf --outtype q8_0 gemma-4-31B/
$ llama-server --model gemma-4-31B.gguf --ctx-size 32768 --n-gpu-layers 48 --batch-size 8192 --temp 1.0 --top-p 0.95 --min-p 0.01 --host 127.0.0.1 --port 8033 --jinja
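For context, once the server comes up I'd sanity-check it with a couple of curl calls before digging into the conversion step. This is just a sketch assuming llama-server's standard endpoints (`/health` and the OpenAI-compatible `/v1/chat/completions`) on the host/port from the command above:

```shell
# Check the server is alive and the model finished loading
curl http://127.0.0.1:8033/health

# Minimal chat request against the OpenAI-compatible endpoint
curl http://127.0.0.1:8033/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Hello"}]}'
```

If `/health` responds but the chat request errors out, the problem is more likely the chat template / `--jinja` side than the GGUF conversion itself.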