>>101144935
>https://github.com/OpenBMB/llama.cpp?tab=readme-ov-file#run-the-quantized-model
for:
>openbmb/MiniCPM-Llama3-V-2_5-gguf/ggml-model-Q4_K.gguf
which damn binary do I use, and where is the help output? --help just gives:
./llama-gguf --help
./llama-gguf: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.32' not found (required by ./llama-gguf)
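(as far as I can tell that error just means the prebuilt b3209 binary was linked against a newer glibc than my Ubuntu ships; checking what the system actually has is a one-liner:)

```shell
# binary wants GLIBC_2.32+; print the glibc version this system provides
ldd --version | head -n 1
```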
the folder is full of shit:
llama-b3209-bin-ubuntu-x64/build/bin$ ls
LICENSE llama-q8dot
llama-baby-llama llama-quantize
llama-batched llama-quantize-stats
llama-batched-bench llama-retrieval
llama-bench llama-save-load-state
llama-bench-matmult llama-server
llama-cli llama-simple
... tl;dr
...
llama-lookup-stats test-sampling
llama-parallel test-tokenizer-0
llama-passkey test-tokenizer-1-bpe
>[2024 Jun 12] Binaries have been renamed w/ a llama- prefix. main is now llama-cli, server is llama-server, etc (ggerganov#7809)
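(so if I'm reading that note right, the old entry points just map onto the renamed ones, something like:)

```shell
# per the quoted release note: every binary got a llama- prefix
echo "main   -> llama-cli"
echo "server -> llama-server"
```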
what the fuck does all this mean? Last time I used llama.cpp was when it first came out on Windows, and now I'm trying to run multimodal on Ubuntu and it's nothing like I remember.
I know I'm retarded. Please just tell me which button to press. Is it llama-cli or llama-gguf for openbmb/MiniCPM-Llama3-V-2_5-gguf/ggml-model-Q4_K.gguf?