>>101040340
CR is overly creative and wild, which is probably what leads to dumb things happening sometimes. Mixtral is drier but follows instructions and the plot much better. I can't run Qwen MoE for some reason; the latest llama.cpp just crashes with:
GGML_ASSERT: ggml-metal.m:1867: dst_rows <= 2048
GGML_ASSERT: ggml-metal.m:1867: dst_rows <= 2048
zsh: abort ./llama.cpp/gg/server -m ./models/qwen2-57b-a14b-instruct-q5_0.gguf