>>108878705
>>108878706
Baseline
1.03.045.553 I slot print_timing: id 0 | task 0 | prompt eval time = 48896.26 ms / 64494 tokens ( 0.76 ms per token, 1319.00 tokens per second)
1.03.045.556 I slot print_timing: id 0 | task 0 | eval time = 0.00 ms / 1 tokens ( 0.00 ms per token, 1000000.00 tokens per second)
1.03.045.556 I slot print_timing: id 0 | task 0 | total time = 48896.26 ms / 64495 tokens
1.03.045.558 I slot print_timing: id 0 | task 0 | graphs reused = 1
1.03.046.886 I slot release: id 0 | task 0 | stop processing: n_tokens = 64494, truncated = 0
1.03.046.891 I srv update_slots: all slots are idle
With MTP
1.16.313.011 I slot print_timing: id 0 | task 0 | prompt eval time = 60265.50 ms / 64494 tokens ( 0.93 ms per token, 1070.16 tokens per second)
1.16.313.014 I slot print_timing: id 0 | task 0 | eval time = 0.00 ms / 1 tokens ( 0.00 ms per token, 1000000.00 tokens per second)
1.16.313.015 I slot print_timing: id 0 | task 0 | total time = 60265.50 ms / 64495 tokens
1.16.313.017 I slot print_timing: id 0 | task 0 | graphs reused = 1
1.16.313.032 I statistics draft-mtp: #calls(b,g,a) = 1 0 0, #gen drafts = 0, #acc drafts = 0, #gen tokens = 0, #acc tokens = 0, dur(b,g,a) = 0.002, 0.000, 0.000 ms
1.16.314.354 I slot release: id 0 | task 0 | stop processing: n_tokens = 64494, truncated = 0
1.16.314.361 I srv update_slots: all slots are idle
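To put a number on the difference between the two runs, here is a minimal sketch that parses the two "prompt eval time" lines above and computes the slowdown. The helper names (`parse_prompt_eval`, `compare`) are made up for illustration and are not part of llama.cpp. It also assumes a single run of each is representative, which the logs alone can't confirm.

```python
import re

# The "prompt eval time" fields copied from the two logs above.
BASELINE = "prompt eval time = 48896.26 ms / 64494 tokens ( 0.76 ms per token, 1319.00 tokens per second)"
WITH_MTP = "prompt eval time = 60265.50 ms / 64494 tokens ( 0.93 ms per token, 1070.16 tokens per second)"

# Captures the elapsed milliseconds and the token count.
PATTERN = re.compile(r"prompt eval time =\s*([\d.]+) ms /\s*(\d+) tokens")


def parse_prompt_eval(line):
    """Return (milliseconds, tokens) from a print_timing prompt eval line."""
    m = PATTERN.search(line)
    if m is None:
        raise ValueError("no prompt eval timing found in line")
    return float(m.group(1)), int(m.group(2))


def compare(baseline_line, mtp_line):
    """Hypothetical helper: relative prefill cost of the second run vs the first."""
    base_ms, base_tok = parse_prompt_eval(baseline_line)
    mtp_ms, mtp_tok = parse_prompt_eval(mtp_line)
    base_tps = base_tok / (base_ms / 1000.0)
    mtp_tps = mtp_tok / (mtp_ms / 1000.0)
    return {
        "extra_time_pct": (mtp_ms / base_ms - 1.0) * 100.0,
        "throughput_drop_pct": (1.0 - mtp_tps / base_tps) * 100.0,
    }


if __name__ == "__main__":
    result = compare(BASELINE, WITH_MTP)
    print(f"extra prefill time: {result['extra_time_pct']:.1f}%")
    print(f"throughput drop:    {result['throughput_drop_pct']:.1f}%")
```

On these two runs that works out to roughly 23% more prefill time, or about 19% lower prompt throughput, with MTP enabled. Both runs only evaluated 1 token after the prompt and the draft-mtp statistics show 0 drafts generated, so this compares prompt processing only and says nothing about generation speed.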