>>101619040
So you just want to generate some text following an instruction and then quit?
Save the prompt in a file, written out in the model's prompt format:
<|start_header_id|>system<|end_header_id|>
You are a helpful assistant<|eot_id|><|start_header_id|>user<|end_header_id|>
What's the capital of France?<|eot_id|><|start_header_id|>assistant<|end_header_id|>
and call it with
>./llama-cli -f the_file_you_just_saved.txt -m model.gguf
and whatever other options you need for threads, context, batch size and all that.
If you'd rather put the prompt directly on the command line, use -p "All that stuff up there" instead of -f
Double check the format. I'm not sure if that's the correct one, but it works.
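If you don't want to hand-type the special tokens (one missing | and the model sees garbage), you could generate the file with a few lines of Python. This is only a sketch: build_prompt and prompt.txt are names I made up, and it copies the layout shown above, which may not be the exact official template for your model. Meta's published Llama 3 format, for instance, puts a blank line after each header.

```python
from pathlib import Path


def build_prompt(system: str, user: str) -> str:
    """Lay out one system turn and one user turn, then leave an open
    assistant header so the model writes the reply.

    Assumption: this mirrors the layout from the post, not a verified
    official template. Adjust the tokens and newlines for your model.
    """
    return (
        "<|start_header_id|>system<|end_header_id|>\n"
        f"{system}<|eot_id|><|start_header_id|>user<|end_header_id|>\n"
        f"{user}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n"
    )


if __name__ == "__main__":
    prompt = build_prompt(
        "You are a helpful assistant",
        "What's the capital of France?",
    )
    # prompt.txt is a hypothetical filename; pass whatever you pick to -f
    Path("prompt.txt").write_text(prompt, encoding="utf-8")
```

Then run it the same way as before, with -f prompt.txt -m model.gguf.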
If you still have problems, show how you're running the command and what you're trying to do. Don't make people guess.