>>102725219
ST has always been a fancy textbox. llama.cpp ships a vim plugin for llama-server. It's about 110 loc and it handles streaming just fine. You get built-in context editing (it's a text editor, after all). You can use any prompt format by just typing it or using a macro to insert it. You can change the settings from request to request via a control line at the top of the buffer.
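The streaming part is why the plugin stays tiny: llama-server's /completion endpoint streams SSE-style `data: {...}` lines with a `content` fragment and a `stop` flag, so consuming it is a few lines. A minimal sketch of that parsing step (the function name is mine, not the plugin's):

```python
import json

def stream_tokens(lines):
    """Yield text fragments from a llama-server /completion stream.
    Each event arrives as: data: {"content": "...", "stop": false}"""
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip keep-alives / blank separators
        event = json.loads(line[len("data: "):])
        if event.get("content"):
            yield event["content"]
        if event.get("stop"):
            break
```

In the plugin the fragments just get appended to the buffer as they arrive, which is all "streaming support" amounts to in a text editor.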
You can use localchub to mirror chub.ai. Extracting data from the cards is trivial [picrel: a random card]. Change png_hdr to identify and liljson to jq. Then it's just copy-pasting shit as you need. If you don't use vim, port the plugin to your editor of choice; it's only ~100 loc to convert. Save VRAM by not running a browser, implement only the features you need, avoid bloat. Or convert it to js and add some css on top. Whatever.
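For the extraction itself: these cards embed their JSON base64-encoded in a PNG tEXt chunk keyed `chara`, so you don't even need identify/jq if you'd rather script it. A sketch that walks the chunk list directly (function name mine; it skips CRC verification for brevity):

```python
import base64
import json
import struct

PNG_SIG = b"\x89PNG\r\n\x1a\n"

def extract_chara(png: bytes) -> dict:
    """Return the card JSON stored in the 'chara' tEXt chunk."""
    if png[:8] != PNG_SIG:
        raise ValueError("not a PNG")
    pos = 8
    while pos + 8 <= len(png):
        length, ctype = struct.unpack(">I4s", png[pos:pos + 8])
        data = png[pos + 8:pos + 8 + length]
        if ctype == b"tEXt":
            key, _, value = data.partition(b"\x00")
            if key == b"chara":
                return json.loads(base64.b64decode(value))
        pos += 8 + length + 4  # length + type + data + CRC
    raise ValueError("no chara chunk found")
```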
!*{"temperature": 0.6, "top_k": 40, "top_p": 1, "n_predict": -1, "repeat_last_n": -1, "stop": "<|endoftext|>", "cache_prompt": true, "n_keep": -1}
:nnoremap <F6> i<\|user\|><\|endoftext\|><CR><\|assistant\|><ESC>6b2l
:nnoremap <F9> :call llama#doLlamaGen()<CR>
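How the plugin consumes that `!*` control line isn't shown above, but the obvious reading is: strip the marker, parse the JSON, merge it over the request defaults, and send the rest of the buffer as the prompt. A sketch under that assumption (marker and defaults are illustrative):

```python
import json

CTRL = "!*"  # assumed marker, matching the settings line above

def build_request(buffer_text: str) -> dict:
    """Merge a leading control line's JSON over request defaults;
    the remainder of the buffer becomes the prompt."""
    defaults = {"n_predict": 256, "cache_prompt": True}
    first, _, rest = buffer_text.partition("\n")
    if first.startswith(CTRL):
        overrides = json.loads(first[len(CTRL):])
        return {**defaults, **overrides, "prompt": rest}
    return {**defaults, "prompt": buffer_text}
```

Since the control line is just text at the top of the buffer, "changing settings per request" is editing one line, no UI needed.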