I tried to run that Gemma MTP fork with:
./llama-server --model '/mnt/miku/Text/gemma4 mtp/Gemma4-31B-Q8_0.gguf' -md '/mnt/miku/Text/gemma4 mtp/mtp-gemma-4-31B-it.gguf' --n-gpu-layers 21 --spec-type draft-mtp --spec-draft-n-max 4
it fails with
/mnt/miku/Text/gemma4 mtp/llama.cpp/ggml/src/ggml-cuda/ggml-cuda.cu:102: ROCm error
0.11.571.697 E ggml_cuda_compute_forward: MUL failed
0.11.571.702 E ROCm error: invalid device function
0.11.571.704 E current device: 0, in function ggml_cuda_compute_forward at /mnt/miku/Text/gemma4 mtp/llama.cpp/ggml/src/ggml-cuda/ggml-cuda.cu:3114
0.11.571.705 E err
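so maybe the fork doesn't work on ROCm yet. though "invalid device function" usually means the HIP kernels weren't compiled for this GPU's gfx target, so it could also just be how I built it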
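One way to narrow down the "invalid device function" error is to check which gfx target the GPU reports and compare it with the targets the fork was built for. This is a rough sketch, not a confirmed fix: `gfx_targets` is a hypothetical helper name, and it assumes ROCm's `rocminfo` is installed and prints the target names in its output.

```shell
# Sketch: list the gfx targets found in rocminfo-style text.
# gfx_targets is a hypothetical helper, not part of llama.cpp or ROCm.
gfx_targets() {
    # Read text on stdin, pull out every gfx name, print each distinct one once.
    grep -oE 'gfx[0-9a-f]+' | sort -u
}

# Usage on the machine that produced the error (assumes rocminfo is on PATH):
#   rocminfo | gfx_targets
```

If the target printed here was not among the ones passed at build time, rebuilding for it may help. Upstream llama.cpp's HIP build takes `-DGGML_HIP=ON` together with `-DAMDGPU_TARGETS=<target>`; whether this fork uses the same options is an assumption.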