>>109033377
>Are you guys really not string banning
yeah but ik_llama.cpp doesn't have whatever llama.cpp has that makes gemma-chan use less vram
this is q5_k_m with -c 90000
nvidia-smi |grep Default |awk -F '|' '{print $3}'
13282MiB / 24576MiB
20862MiB / 24576MiB
20785MiB / 24576MiB
17231MiB / 24576MiB
20802MiB / 24576MiB
21856MiB / 24576MiB
so it's string ban with 90k ctx (ik_llama.cpp) or no string ban with 256k (llama.cpp)
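
if you want the total across all six cards instead of eyeballing it, here's a rough sketch. sum_vram is just a name I made up, and it assumes each input line looks like the "used / total" output above:

```shell
# sum_vram: hypothetical helper, not part of nvidia-smi.
# Reads lines like " 13282MiB / 24576MiB " on stdin and prints the totals.
sum_vram() {
  awk '
    {
      used  += $1 + 0   # "13282MiB" + 0 -> 13282 (awk keeps the numeric prefix)
      total += $3 + 0   # $2 is the "/" separator, $3 is the per-card total
    }
    END {
      printf "%d / %d MiB used, %d MiB free\n", used, total, total - used
    }
  '
}
```

pipe the same command into it: nvidia-smi |grep Default |awk -F '|' '{print $3}' |sum_vram
for the numbers above that comes to 114818 / 147456 MiB used, 32638 MiB free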