prompt eval time = 1282.00 ms / 734 tokens ( 1.75 ms per token, 572.54 tokens per second)
eval time = 4001.28 ms / 114 tokens ( 35.10 ms per token, 28.49 tokens per second)
total time = 5283.28 ms / 848 tokens
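The throughput figures above are derived directly from the time and token counts on each line. A quick sanity check of that arithmetic (a standalone sketch, not part of llama.cpp; the values are copied from the log above):

```python
# Recompute llama.cpp's derived timing metrics from the raw numbers
# reported in the log: tokens/s = tokens / seconds, total = prompt + eval.
prompt_ms, prompt_tokens = 1282.00, 734
eval_ms, eval_tokens = 4001.28, 114

prompt_tps = prompt_tokens / (prompt_ms / 1000.0)  # prompt-processing throughput
eval_tps = eval_tokens / (eval_ms / 1000.0)        # token-generation throughput
total_ms = prompt_ms + eval_ms

print(f"prompt: {prompt_tps:.2f} t/s")  # ≈ 572.54, matching the log
print(f"eval:   {eval_tps:.2f} t/s")    # ≈ 28.49, matching the log
print(f"total:  {total_ms:.2f} ms")     # ≈ 5283.28, matching the log
```

The 28.49 t/s eval figure is the one that matters for interactive use; prompt processing is batched and therefore much faster per token.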
Now we wait for llama.cpp to implement Qwen3.5 support properly. This is about half the generation speed I get with GLM 4.7.