llama-server --hf-repo bartowski/Phi-3.5-mini-instruct-GGUF --hf-file Phi-3.5-mini-instruct-Q5_K_M.gguf
> ggml_backend_cpu_buffer_type_alloc_buffer: failed to allocate buffer of size 51539607584
I'm trying to run this on CPU with only ~8 GB of RAM available. Why does it try to allocate 51 GB by default?
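My guess is that the failed buffer is the KV cache, sized for the model's full training context rather than anything I asked for. A quick sanity check of that hypothesis, assuming the shape I've seen in the Phi-3.5-mini config (32 layers, hidden size 3072, no grouped-query attention) and an f16 cache:

```python
# Hypothesis: the ~51 GB buffer is an f16 KV cache allocated for
# Phi-3.5-mini's full 131072-token training context.
# Model-shape assumptions (not from the log): 32 layers, n_embd 3072,
# as many KV heads as attention heads, 2 bytes per f16 element.
n_ctx = 131072       # n_ctx_train for Phi-3.5-mini (128K context)
n_layer = 32
n_embd = 3072
bytes_per_elem = 2   # f16

# Two tensors (K and V) per layer, each n_ctx x n_embd.
kv_cache_bytes = 2 * n_layer * n_ctx * n_embd * bytes_per_elem
print(kv_cache_bytes)  # 51539607552 -- within 32 bytes of the 51539607584 in the error
```

If that's right, capping the context with `-c` (llama.cpp's `--ctx-size` flag), e.g. `-c 4096`, should shrink the allocation to something that fits in 8 GB, but I'd like to confirm whether defaulting to the trained context is the intended behavior.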