it just werks
>>107106564>can't have files over 4GBstraight into the trash
>>107106628thats because windows fucking sucks.
>>107106628>>107106699idgi, do these not work?
>>107106699Who is still using Fat32. Are you retarded?
>>107106746windows has a filesize limit for executables. >>107106755retard
>>107106564>troonwareno thanks, still using the superior llama.cpp, thanks
>>107106768kek imagine using winshit, what the fuck are we still in x86_32 days?and yea apparently you gotta split it into 2 files
for me its llamacpp server in docker anyone on lan can access through its http port
>>107106904too bloated if you just need a small fast model to run on your code editor. llamafile is simple
>>107106564Bump. Good morning sir!
>>107109379good morning
>>107106699why would someone use Wiindows?
>>107111272i had to in a workplace several times on an airgapped network. if llamafiles existed it would have saved me a fuckton of time
>>107111272why are you a retard?
>>107106699>runs llms on windowsjust use online services if you don't give a fuck
>>107106564coded by a tranny that tried to sabotage the upstream projectsearch "justine" tunney for more
>>107106564> just werks> built in sandboxing on linux> reasonable token productionIf you know someone who isn't computer literate, but wants to try local LLMs, this is a good solution. Lots of options. I suggest> Mistral 7B Instruct v0.3.Q4 (fast general purpose)> Google gemma 3 12B it Q4 K M (general purpose> Gemma 2 27B it Q6K (slow general purpose)> Qwen2.5.1 Coder 7B Instruct Q8 (fast coding helper)> Qwen2.5 Coder 14B Q6K (slow coding helper)
>>107114504Oh, I almost forgot. You can load gguf's with this, too. > ./Mistral-7B-Instruct-v03.!4_0.llamafile -m <your-gguf-model-here>
>yet another ai chatbotholy yawn
>>107114504im trying to read up on LLM and quantsq4 seems bad, i dont quite get difference of q6k vs q8how can u do 6bit?^just how big is the difference of a q6k 7b vs 13bit seems q6 is the sweetspot and K M suffixes seem to imply better trained
>>107114919>how can u do 6bit?you pack the next 2 bits in the empty part