/g/ - Technology

/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>102710679 & >>102698948

►News
>(09/27) Emu3, next-token prediction multimodal models: https://hf.co/collections/BAAI/emu3-66f4e64f70850ff358a2e60f
>(09/25) Multimodal Llama 3.2 released: https://ai.meta.com/blog/llama-3-2-connect-2024-vision-edge-mobile-devices
>(09/25) Molmo: Multimodal models based on OLMo, OLMoE, and Qwen-72B: https://molmo.allenai.org/blog
>(09/24) Llama-3.1-70B-instruct distilled to 51B: https://hf.co/nvidia/Llama-3_1-Nemotron-51B-Instruct
>(09/18) Qwen 2.5 released, trained on 18 trillion token dataset: https://qwenlm.github.io/blog/qwen2.5

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Programming: https://livecodebench.github.io/leaderboard.html

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
File: threadrecap.png (1.48 MB, 1536x1536)
►Recent Highlights from the Previous Thread: >>102710679

--Paper: Addition is All You Need for Energy-efficient Language Models:
>102718935 >102719259 >102719016 >102719116
--Papers:
>102712934 >102713081 >102719314
--Zamba2 model discussion and MT Bench comparison:
>102720037 >102720087 >102720365
--Recommendations for running AI models on 16GB RAM, i5-9600K, RTX-2060:
>102711599 >102711619 >102711642 >102711662 >102711649 >102711680 >102714129 >102711689 >102713971 >102713982 >102718240
--Llama.cpp parallel processing performance issues on 3060 GPU:
>102711108 >102715099 >102717846 >102717935 >102718035 >102718155 >102718295
--Hanging issue with nemomix unleashed resolved by switching to llamacpp_HF and rolling back Oobabooga API:
>102712615 >102712716 >102713107 >102716066
--Model ablation with Qwen2.5-32B makes it unable to refuse prompts but also a yes-man:
>102719502
--Mini AI models match OpenAI performance with less data:
>102715179
--FORTH programming and chip design discussion:
>102712892 >102713033 >102713077 >102713431 >102713546 >102713946 >102714051 >102714189 >102714758 >102714870 >102717319 >102717401 >102718050
--SillyTavern's anti-roleplay cleanup has started:
>102722363 >102722452
--Local models can write and run code with proper scripting, similar to ChatGPT:
>102712089 >102712285 >102712383 >102712428 >102712462 >102712323
--Entropix: A promising inference-time sampler for better AI reasoning and long-context understanding:
>102719152 >102719258 >102719421 >102719773 >102719452 >102719527 >102719464 >102719671 >102719712 >102719195 >102719251
--Discussion on looping hidden layers in neural networks and its potential benefits:
>102719525 >102719656 >102719685 >102719983 >102720214 >102720403 >102720455 >102720507 >102721220 >102719752 >102719766 >102719777
--Miku (free space):
>102711390 >102711420

►Recent Highlight Posts from the Previous Thread: >>102710706

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
>>
Does anyone use Mistral-Small? I can't find sampling settings to settle on
>>
File: anger.png (126 KB, 608x920)
>>102723173
Rest in peace, Seraphina.

You were my go-to character for testing new models with the booba test. You helped me determine if a model was decent for RP, or if it was filled with ERP slop, and for that I will never forget you.
>>
Where's that anon that recommended Chronos Gold 12B? It's shit.
>>
>>102723269
Temp 1
Min P 0.05
Rep Penalty 1.03
Rep Pen Range 4096

Was recommended by somebody else here. I tried it, and it seems to work. Not sure if it's ideal though.
>>
>>102723336
That's why "buy an ad" posters exist. Never take any shilling seriously here.
>>
Kobo won
>>
>>102723373
>posters
>implying it's not one guy
>people sharing their preferred models is bad
No wonder /lmg/ is dying
>>
>>102723422
"people" talking about useless finetunes is bad
>>
>>102723406
>>
>>102723470
In the golden age of /lmg/ people used to talk about models like superCOT, Mythomax, miqu, euryale and whatnot. What changed, besides your schizo crusade against finetuners, that warrants no longer talking about finetunes?
>>
>>102723269
Same as Nemo instruct.
>Temp 0.85
>Min P 0.02
>Rep Pen 1.2

Works well for me. You can probably turn the temp up quite a bit if you want compared to nemo.
>>
Now that SillyTavern got tired of the local meme, what is our future? I'm NOT going to use kobold.
>>
>>102723542
Full kobold is actually good.
>>
>>102723542
What happened?
>>
>>102723513
nta. The only one doing something slightly different to the others is the finger rubber trying to give souls to his models and he's an absolute schizo. The rest is just coom.
>>
>>102723542
What's going on with ST?
>>
>>102723566
>>102723562
see
>>102721448
>>102721850
>>
>>102723542
>he pulled
fork it
>>
Went back to Nemo Instruct after using Mistral Small Instruct for a long time. It seems like Nemo is way better at natural RP compared to Small, although Small is definitely a bit smarter. Do you think we will get a Nemo MoE?
>>
>>102723542
Everybody uses their local models to build them their own custom frontend.
>>
>>102723542
What's stopping you from using it? They're just removing the explicitly smutty stuff and doing cleanup.
I've never used any UI, mind you, just llama-cli and llava-server.
>>
>>102723589
>What's stopping you from using it?
probably won't add new meme samplers and features if they get added, like vision stuff when it lands in lcpp in ten years
>>
>>102723589
I have principles, they are literally talking about "organically pushing out undesirable users", I don't want to be dependent on people like this.
>>
>>102723611
>probably
Image recognition is a much-wanted feature. Same for samplers, because we just need one more for AGI, apparently....
There's no reason for them to not add those things even if they want to clean their reputation. Those things are still useful.
>>
>>102723571
Shieet, nigga got bought. Rip.
>>
>>102723635
>I don't want to be dependent on people like this.
Why do you, then? The vim plugin on llama.cpp works just fine. You can make your own scripts, your own web frontend, use mikupad or a million other frontends.
Removing the default smut-centered image of ST doesn't prevent you from doing smut either.
>>
>>102723665
Correction, he wants to be bought, wants to make ST into proper corpo software.
>>
>>102723691
Nah, situations like this don't happen. He already got bought/blackmailed.
There's like a million reasons to blackmail them too, since they've enabled proxy degeneracy in their code.
Play with corpos and you get burned.
>>
File: 1570060417629.jpg (50 KB, 678x710)
What in your opinion is the most natural sounding model these days (under 70B can't run em) in basic conversational RP terms?

I've been using the base Mistral Small 22B (not the finetunes, they're all too fucking horny) and it's been doing me well. Qwen 2.5 is what I figured was gonna be the next best thing but is filtered to fuck.

So I'm curious what everyone else is using. If it's a finetune, please note how quickly it tries to go NSFW before recommending, as that's been my biggest issue with them all (especially Drummer/Magnum finetunes)
>>
>"mischievovious" glint
DRY is useless
>>
hello anons, do SOTA local models for cooming run on 8gb vram/64gb ram these days?
>>
>>102723923
Rocinante
>>
>>102723923
Lumimaid-v0.2-12B
>>
>>102723923
One of the mistral nemo fine tunes, mini-magnum, lyra v3, rocinante.
Or just the official instruct.
>>
>>102723836
>NSFW
It's the one thing they're trained on. It's just what they do. Not many options in that range (or any really. We have like 4-5 model makers). I assume gemma2-27b is not to your liking...
Why do you want to change from small, btw? Did it get stale?
>>
>>102723910
So Weidmann lied and his sampler is no good? https://github.com/p-e-w
>>
After reading this general, I don't really understand: why is cloud AI better than local AI? Why is everyone so grim about local models here, saying this general is dead and stuff?
What's the problem here? Could it be that companies have such extremely genius people developing these models that the open-source community can't keep up? Are there proprietary technologies that aren't publicly available yet? Or just way more time spent working on the cloud models compared to local ones? Or are they advancing so fast it's hard to keep up?
>>
File: file.png (12 KB, 777x91)
SmartTavern™ looking more and more likely
>>
>>102723526
>>102723339
Thank you anons, both work well. I'll do more testing and report back.
>>
>>102724080
Sellout Tavern, kek.
>>
>>102724059
The big thing is compute, somehow not every basement neet has access to h100s
>>
>>102724080
What's his Twitter account?
>>
>>102724059
What >>102724105 said.
Also, a small army of retards and trolls trying to stir the pot for whatever reason.
>>
File: file.png (129 KB, 1040x365)
new 'stration dropped
>>
>>102723173
what's the best free voice cloning web/local model?
playht is the best for my use since it can select emotion but they removed that feature for free accounts
11labs isn't as good
>>
>>102724181
xtts2
>>
>>102723298
what happened to her
what's the booba test
>>
Long time since I've popped into the general, sorry for not keeping up!
What's currently the best you can run with 16GB of vram while offloading as little as possible? I don't mind having little context (say 4096 tokens) but would like a 'smart' model - would a quantized llama3.1 do the job?
>>
>>102724212
>what happened to her
set to be removed to help with ST's new corpo friendly image
>>102722363
>>102724080
>>
>>102723542
I am, it's simply the best
>>
Is Rocinante 12B fine tuned on top of instruct?
It seems to default to some very assistant-like responses when not ERPing, kind of like instruct. As in, it uses lots of markdown, bullet point lists, blocks, that kind of thing.
>>
Best model for 12gb vram?
>>
>>102724157
great card taste
she's one of my favs
>>
>>102724244
>quantized llama3.1 do the job
Probably. I assume you mean the 8b. You also have mistral nemo. Depends on what you do and your taste. You can run nemo at q8 and run it fully on gpu with small context just fine.
>>
>>102724286
what's one of your fav models anon))
>>
>>102724080
Finally, thank god. I've been waiting forever for a better frontend than ST with as many features but without the autistic roleplay focus, but now it looks like ST itself will become that better frontend.
>>
>>102724280
See >>102723963
Maybe heavily quantized mistral-small? Might as well give it a try.
>>
>>102724059
>this general is dead and stuff?
no progress in cooming.
>>
>>102724298
these
>>102695784
>>
>>102724212
At the start of the roleplay, {{user}} immediately grabs the boobs of Seraphina, without any other context. Reroll the reply a few times.

If Seraphina reacts negatively, as she should, then you may have a decent RP model. On the other hand, if Seraphina reacts positively and dives straight into ERP, then it means the model is filled with ERP slop, and is probably shit.

It's a simple test to see if a model has common sense.
>>
>>102724312
>it looks like ST itself will become that better frontend.
Does it?
From where I'm looking it seems like it'll be the same but with a different coat of paint.
It's more a question of branding than anything.
>>
>>102724294
>Probably. I assume you mean the 8b.
Yeah, I forgot to add that, and thanks for the other recommendations - any specific model/quants you'd recommend? Or it won't make much of a difference?
>>
>>102724336
based, saved this pic few threads ago already
>>
>>102723542
I haven't updated ST in ages, so, still ST?
>>
>>102723685
>>102723589
>smut
Ah yes, my favorite smutty background, landscape beach day.png. And of course my favorite smutty preset, Writer - Realistic.json
>>
>>102724343
You can run either at q8 with small context just fine. Nemo is more entertaining to use. llama 3.1 is fine too, but it's made 100% for assistant-like things. Just try both and use the one you like most. They're small models so downloading and testing for yourself is the best option, even if you download the full model and quant yourself. Once you find your favourite, maybe check finetunes of it. I just use them as released.
>>
>>102724337
thanks ill try that the next time i test a new model
>>
>>102724367
You can copy the preset. They cannot delete the files from your PC. And you can set the background, I'm pretty sure. If not, just change the CSS. Or make your own frontend. Or hack around mikupad or some other more minimalist UI.
I really don't understand the problem. What can you not do that you could before?
>>
File: 1699486573144550.png (17 KB, 634x154)
...
>You ready for ST(ServiceTesnor) 2.0?
>>
>>102723513
>In the golden age of /lmg/
fuck off, you overdramatic and revisionist newfag. anyone with half a brain was saying that finetunes trained on gpt outputs were useless for anything except benchmark scamming and replicating "as an ai assistant" prose since alpaca.
finetunes on esl claude logs are just next level retardation
>>
>>102724457
Why is it being deleted at all? It's not smutty, like you were claiming. So what's really going on, huh? Huh???
>>
File: 1712072145368790.png (19 KB, 636x150)
>>102724458
>>102724495
ServiceTensor is NOT a roleplaying app.
>>
File: file.png (409 KB, 965x881)
slop
>>
>>102724511
Are you blind? It's called ServiceTesnor
>>
>>102724495
They're cleaning its image. That's why.
Is there anything you cannot do that you could before?
>>
>>102723542
He got an investor who told him to clean the place. Many such cases. Just fork it.
>>
>>102724571
This is a stupid conspiracy theory. Who would invest in ST and why? Especially if they're not getting free advertising for it.
>>
>>102724571
>investor
Who would invest in such a thing? What do you get back in return lol
>>
>>102724555
Hi Cohee!
>>
>>102724511
>this is a stereotype
I guess they forgot where they came from huh?
>https://github.com/SillyTavern/SillyTavern/tree/edd41989fd550a8d111fb7167d456c5614a3a610
I get that the project might have grown beyond that, but from these snippets it really does seem like he wants the idea of roleplaying to not be associated with his product at all.
Which I guess, fair enough.
It would be funny to see all the contributors move to the RP fork and completely abandon his new shiny corpo one.
>>
>>102724667
>move to the RP fork
where?
>>
He thinks he'll get more views by being safe and pulling the rug out from under those who made him famous in the first place lmao. Let's see how well it worked for CAI and AI Dungeon. These fuckers never learn
>>
>>102724662
schizo
>>
>>102724670
It's inevitable if he does close the ST repo, I think.
Just like ST is a fork of Tavern, the next thing will be a fork of ST.
>>
Cohee owes you nothing. Seethe, incels.
>>
>>102724712
Of course he doesn't.
And despite all the memes, ST's code is not that hard to mess with.
>>
>>102724728
It's a fucking mess, is what it is.
>>
>>102724728
It's a disaster lol
>>
>>102724667
>I guess they forgot where they came from huh?
Indeed. Even the name, Sillytavern, implies a RPG-style tavern. I would wager that nearly everybody uses ST for RP. Nobody needs such a frontend for coding questions, or to ask general questions to an AI.
>>
>>102724742
>>102724745
This. Fuck forking that mess. Be better off starting with a clean slate.
>>
>>102724511
Did this guy get laughed at when he told a colleague about being in charge of ST or something?
>>
>>102724751
True enough, for rp I use silly, but for any proper assistant use I use kobold lite so I don't mess with my dozens of rp specific settings in ST
>>
File: architect.jpg (143 KB, 1140x855)
This will be the sixth time we have forked it, and we have become exceedingly efficient at it.
>>
There are already several well-established frontends with a productivity focus, and they're way more polished and sophisticated than ST. Don't know why he would want to go down that route instead of focusing on the niche ST has already carved out as the best RP frontend.
>>
File: contributors.png (37 KB, 296x182)
Are they all OK with this, or did Cohee just unilaterally decide it and expect everyone to keep contributing for free to his 180-degree change in direction?
>>
>>102724796
of these I think only wolf-something (2nd pic) has push rights to the repo, so he clearly doesn't care about the others
>>102721448
>>
>>102724760
It sure would be a good opportunity to implement a Jinja-based context configuration page instead of the individual fields we use to configure the shape of the context today. I think that's the big one for me.

>>102724796
That's what I alluded to here >>102724667
Imagine he goes on to make yet another corpo frontend and all the contributors move to the next best RP frontend, or to a ST fork.
>>
What would be a good framework to make a frontend with? I keep thinking about it from time to time. There isn't *that* much work to do honestly.
>>
>>102724866
>There isn't *that* much work to do honestly.
you would be surprised
>>
The big question is, who the fuck would use ST for anything productive? It's a bloated RP front end and lacks most of the features that make the chatgpt interface so nice to use for plain work.
>>
>>102724866
It depends on what platforms and which users you'll be targeting. Actual "power users" would prefer Python or something React-based so they can use their changes in real time. Something aimed at the average joe would need, at the very least, C++ and arguably C#, so you can target every platform and corral the users into using the app the way you intend it to be used (maybe Dart, etc.).
>>
>>102724893
Have you tried using any of the "productivity" frontends like Jan? You get a textbox, chat history, and document upload and that's it.
You get limited settings exposed for you to mess with.
Most of them are built with the expectation that you will be using a cloud service or ollama.
>>
>>102724605
These people are thinking about the future, not now.
>>
>>102724866
I really really appreciate the ability to pinch zoom in on the text on mobile just fyi
>>
File: 1701999430395433.jpg (54 KB, 680x649)
Is ROCM still a pain in the ass to install on Linux? Specifically for rdna2 (6900xt)
>>
>>102724930
>What would be a good framework to make a frontend with?
>cpp and arguably c#
I hope this is a joke.
>>
>>102724932
This is why Cohee's kvetching about ST not being used right despite being for "power users" makes no sense. Those options are out there already.
>>
>>102724893
I still think someone should code a bridge between Kobold and an IRC server, and then fork/use HexChat as a client. The interface is almost the same as ST, we get full scripting support, and it's multi-user capable out of the box.
>>
>>102723173
>llama-3.2 vision
>0 posts
So is it shit?
I have no hope of running 90B when I can barely do 72B 4bit
>>
I kinda wonder how much goon shit I've read by now. Is there a way to count words or tokens over all chats in SillyTavern?
>>
>>102724866
If you keep the scope really small it's not too bad, but it's still a lot...
>Prompt template presets
>Sampling parameter presets
>Character card management
>OpenAI completions-style parsing (note that many of the "openai-compatible" APIs differ in subtle ways from each other; have fun dealing with that; see the sketch after this list)
>Streaming response handling
>Lorebook management
>Context builder that determines which messages + card defs + lorebook to put into the prompt
>A chat UI that isn't total ass
I feel like that's the bare minimum. If you limit yourself to one API format ("OpenAI compatible") it's probably doable. But then you might want more advanced stuff like logprobs, a nicer themeable UI with avatars and backgrounds, support for more API formats, regex replacements, quick replies, group chats and so forth and it gets crazy. None of this is that fringe either, unlike the dumb RAG, web search, STScript, etc. stuff that ST shoves in there which has minimal use in an RP-focused frontend.
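
For the parsing + streaming items, a minimal sketch of consuming an OpenAI-compatible chat stream in Python (untested; the URL and payload fields are the generic OpenAI-style ones, not any specific backend's):

import json
import requests

def stream_chat(messages, url="http://127.0.0.1:5000/v1/chat/completions"):
    # OpenAI-style streaming: the server emits SSE lines "data: {json}",
    # terminated by "data: [DONE]"
    with requests.post(url, json={"messages": messages, "stream": True},
                       stream=True) as r:
        for line in r.iter_lines():
            if not line.startswith(b"data: "):
                continue
            body = line[len(b"data: "):]
            if body == b"[DONE]":
                break
            # some "compatible" servers shape this chunk slightly differently;
            # this is exactly where the subtle differences bite
            piece = json.loads(body)["choices"][0].get("delta", {}).get("content")
            if piece:
                yield piece

for tok in stream_chat([{"role": "user", "content": "hi"}]):
    print(tok, end="", flush=True)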
>>
Why not just fork it
>>
because the codebase is dogshit and miserable to work in
>>
>>102724796
I was the anon the first implemented OpenAI streaming on SillyTavern and I'm not okay with this
>>
>>102725016
I think it has a stats window somewhere. I've seen it many threads ago. But i don't use it, so i don't know where it is exactly.
>>
>>102725048
cnc...?
>>
>>102725048
What's the problem?
>>
>>102725049
>>102725016
persona management -> (top right) usage stats
but it seems very inaccurate
>>
>>102725068
Yeah, mine's definitely fucked.
>>
>>102725082
The wonderful experience of ServiceTesnor™ code
>>
>>102724463
Nobody cares about your beef with anthracite, retard
>>
I am confused on how you even train loras for LLMs.
>>
>>102725121
Then don't worry about it. Let others do it for you.
>>
>>102725048
>the anon the first
the anon that first*
>>102725055
No, I guess you're talking about the guy that wrote the support for the OpenAI API, that was a different thing.
>>102725062
Do you really have to ask?
>>
>>102724783
There are? Which should I be using? So far I've been doing work in Mikupad, but if there's something like ChatGPT's interface or better then I'll switch.
>>
>>102725121
Second result on google
>https://zohaib.me/a-beginners-guide-to-fine-tuning-llm-using-lora/
It may give you a place to start if you know nothing. I think. I barely skimmed it, maybe it's shit.
>>
>>102723964
nah not really, just wanted to see what else was out there, always chasing that dragon (ever since Character AI went cringe desu).

Mistral Small is actually pretty fucking good (not the fine tunes though, they fucking suck)
>>
File: 100683327851267.gif (748 KB, 220x274)
>>102724751
>name your interaction product a silly tavern
>get mad when people roleplay in your silly tavern
>>
>>102724947
I don't think so. I'm using Fedora, and I think they added ROCm to the OS so it works out of the box. I don't know about other operating systems though.
>>
>>102725197
Wasn't mad about it until today.
>>
>>102724930
>>102725019
I think you've misunderstood me. I don't want to make a product here. I just want to start a small personal project. If it ends up going somewhere and I don't give up at the planning stage, then maybe I'll release it (slim chances though).
>>
>>102725219
all of those things are basically table stakes for a minimal RP chatbot frontend for me though, I know because I've thought about making my own as a personal project and then realized "damn I would need to build a lot of shit just to reach parity with what I use ST for"
>>
>>102725019
Is this stuff really that hard to make in this day and age? It's no longer 2018; you can just use LLMs or ChatGPT to help you code and even have them straight up write entire chunks for you.
>>
>>102725272
give it a shot and see how far your gptslopped code gets you
>>
>>102725243
>damn I would need to build a lot of shit just to reach parity with what I use ST for
I feel that.
ST also has native summary and vectorDB functionality that I do use.
Do I really want to mess around with transformers.js alongside all the rest? Not really.

>>102725272
It's not hard, it's just a lot of code.
>>
>>102725048
>entitled faggot adds one (1) small feature and thinks that should give him veto power over the whole project
fuck off
>>
>>102723836
Gemma 2 27B. It doesn't have a system role but you can (and probably should) use a depth 0 instruction to adjust its behavior as desired.
>>
This is like pornhub deciding it is against porn and it will remove porn from the site.
>>
It's like Meta and Alibaba deciding that their LLMs don't need to be creative or know what sex even is, and filtering their pre-training datasets accordingly.
>>
File: cards.png (12 KB, 1364x496)
>>102725219
ST has always been a fancy textbox. llama.cpp has a vim plugin for llama-server. It's about 110 loc. It handles streaming just fine. You get built-in context editing (it's a text editor, after all). You can use any prompt format by just typing or using a macro to insert them. You can change the settings from request to request with the settings on a control line at the top.
You can use localchub to mirror chub.ai. Extracting data from the cards is trivial [picrel. a random card]. Change png_hdr to identify and liljson to jq. Then it's just copy pasting shit as you need. If you don't use vim, make one for your editor of choice. It's just ~100 loc to convert. Save vram by not having a browser, implement only the features you need, avoid bloat. Or convert it to js and add some css on top. Whatever.
!*{"temperature": 0.6, "top_k": 40, "top_p": 1, "n_predict": -1, "repeat_last_n": -1, "stop": "<|endoftext|>", "cache_prompt": true, "n_keep": -1}
:nnoremap <F6> i<\|user\|><\|endoftext\|><CR><\|assistant\|><ESC>6b2l
:nnoremap <F9> :call llama#doLlamaGen()<CR>
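
If you'd rather not chain shell tools, the same card extraction is a few lines of Python (a sketch assuming the common card layout: a PNG tEXt chunk keyed "chara" holding base64-encoded JSON):

import base64, json, struct

def card_data(path):
    raw = open(path, "rb").read()
    assert raw[:8] == b"\x89PNG\r\n\x1a\n", "not a PNG"
    pos = 8
    while pos + 8 <= len(raw):
        # each chunk: 4-byte length, 4-byte type, data, 4-byte CRC
        length, ctype = struct.unpack(">I4s", raw[pos:pos + 8])
        if ctype == b"tEXt":
            key, _, val = raw[pos + 8:pos + 8 + length].partition(b"\x00")
            if key == b"chara":
                return json.loads(base64.b64decode(val))
        pos += 12 + length
    return None

print(card_data("card.png"))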
>>
As someone building a chat frontend that was hoping to poorly copy silly taverns features, can people list what features they actually use from it?
I have character card support and chat saving +
>>
Why do llamacpp and exl2 need to reserve vram up to the max context setting, when Transformers doesn't?
You don't need to set a max context value when loading with Transformers, and yet it doesn't seem to run itself OOM trying to reserve the model's max or anything, it just works somehow. So why do exl2 and llamacpp need you to specify a value and reserve vram for it?
>>
>>102725507
>>102725019
>>
>>102725507
As a retard, here's what I do:
1. install model from hf
2. search the archives for some anon's configs (temperature, prompts, etc).
3. download a card from chub

Beside that I just use the edit/re-generate functionalities
>>
>>102725507
(fuck) DB support, searching, RAG, and looking to add the templating for lorebooks + the fancy RAG chat they do, as it lines up with something else I'm building and figured 'fuck it, why not'.
>>
>>102725499
looks awful
>>
>>102725499
looks good thank you
>>
>>102725562
You integrate it into your editor. That's just to show how little you need to extract char data. Not about a dozen python libraries or an entire browser. 3 commands (or their equivalents plus a little sed) that most anons probably already have.
>>
>>102725507
What >>102725019 mentioned + the built-in vectorDB. Ideally, it could use a second instance of llama.cpp to serve the embedding model apart from the main llama.cpp instance that's serving the main model, same for summary.
>>
I'd like native CoT support, LLM self-reflection options, and a 'summarize box' like in GPT-IV


But whatever you do, make sure you become a reddit-tier grifter hub filled with deceit and lies
>>
>>102725523
>>102725536
Sweet, thanks.

>Prompt template presets
- This can be solved with jinja templates yea?

>Sampling parameter presets
- This is also easy enough, will look at ST's source
>Character card management
- got this in, need to make it nicer
>OpenAI completions-style parsing (note that many of the "openai-compatible" APIs differ in subtle ways from each other; have fun dealing with that)
- Will look at ST's code
>Streaming response handling
- Easy peasy
>Lorebook management
- Is in, but needs to be improved/made nicer
>Context builder that determines which messages + card defs + lorebook to put into the prompt
- This _seems_ fucking hard, but will look at ST's implementation to understand it.
>A chat UI that isn't total ass
- :( I'm using gradio at first, but plan to turn it into an API-first thing, so people can make their own UIs (am still going to build one for myself)

>>102725675
Sweet, that's easy, can use llamafile for that and already have that functionality in for the non-rp chat usage.
>>
>>102725725
Is it functional rigth now? Can you share what the UI looks like (won't judge)?
>>
Whatever replaces ST absolutely needs native agent/function calling. It's like the entire open source llm field swept this entire field under the rug the moment llama2 hit 0% on agent bench last year and forgot about it.
>>
>>102725753
Sounds like a job for ServiceTesnor
>>
>>102725753
>native agent/function calling
It's parsing a json, doing whatever needs doing and feeding the data back to the llm. Why does everyone seem to think it's magic?
>>
>>102725628
Nta, are there any open-ended text editors that can be toyed with on that level, but are less autistic than vim?
I'm a normalfag who barely knows how to code, and vim feels way out of my league.
>>
File: file.png (10 KB, 565x109)
how can I tell the download progress?
my net is shit, 4mbs for the last 1.5 hours, & wanna sleep soon
does it even say in the console if the download finishes?
>>
>>102725777
nobody bothered to implement this anywhere because it's too simple
>>
>>102725785
I can't do anything other than bash and use vim sometimes
It's not impossible
>>
>>102725521
In my experience, Transformers DOES run itself oom trying to reserve the model's max context, due to there being no way to cap it.
>>
File: cat-thumbs-up.jpg (122 KB, 742x687)
About to try llama3.1, any anon mind sharing their settings?
>>
Are cats the new frogs?
>>
>>102725836
cats were the original frogs newfag
>>
>>102725785
I don't know a lot of editors. Vim has bindings for a bunch of languages like lua and python, if you wanna go that route. Customizable text editors are pretty autistic by definition. Maybe emacs if you're into lisp? Most things you can still just shell script and pipe if you don't need streaming.
>>
>>102725793
When it's done it will tell you and you'll get the prompt back. Just let it run overnight. The models are not going anywhere.
>>
Why not just fork ST before the latest commit and build from there...?
>>
File: Capture.png (128 KB, 1031x852)
>>102725746
Yea, the character chat portion is a side thing to the main project. I added the character chat at the request of a friend and then figured fuck it, why not go full SillyTavern, since it lines up with having a persistent persona to chat with a la J.A.R.V.I.S., and having those features available would make that a whole shit ton easier. Plus it ideally gets me more users (bug testing)/helps people out, though admittedly I want as little to do with /aicg/ as possible.

This is zoomed out so you can see more of the UI + light mode vs dark mode. Like I said, gradio is just a placeholder for now; I know it's ugly, but I don't care too much about the UI yet. Not shown: the chat search + load lower on the left side, and custom naming for the current chat.
>>
>>102726140
That's what'll happen, but in the meantime people are enjoying the drama.
>>
File: file.gif (3.52 MB, 498x300)
>>102726148
>per request of a friend
>>
>>102725314
Hi cohee
>>
Didn't read any of the previous discussion but I think an app that's like both a combination of ChatGPT + ST would be cool. Like if ST displayed a pane of different chats like ChatGPT, after you clicked into the character, instead of displaying the character card, which would be a different button. Honestly the way ST handles chat histories is kind of shit, though the timelines extension helps a bit.
>>
can you upload images in booga yet
>>
>>102726249
Doesn't ST already do that? You first select the character and then which chat from that character you want to use
>>
>>102726140
Great idea, who'll do the building?
>>
Wait a second.
What happens to my chars if I pull now?
I already lost chars once because of a "bug".
I have 200+ characters in different folders ranked by how much I liked them.
Any way to backup/export and preserve the folders? I hope somebody forks..
>>
>>102726288
just copy the directory which contains all your chats, if shit hits the fan and you lose everything use that backup
>>
>>102726288
just copy them somewhere and then pull and enjoy the explosions
>>
>>102726288
Just zip the whole folder my guy.
Or create a fork you push to after merging the changes (and confirming they work) from whatever ST branch you pull from.
>>
File: 1697389579824052.png (20 KB, 390x321)
>>102726288
if only there was a way to duplicate files before you pull
>>
>>102726324
Fuck all this shit, cards is all you need.
>>
File: 100-girl.png (531 KB, 1000x906)
>>102726183
lmao, I recognize how it seems but it was an honest request. I personally didn't think much of it/wasn't that into it, and then thinking more on it I realized how much it could help me out.
>>
>>102726278
Not out of the box? When you click on a character, the right pane switches to a view of the character card's details, while the middle pane switches to the last chat. You then click on another button to see a list of chats, or to see the timeline if you have that extension. And you can't really have the list of chats or the timeline just always there on the side. Also once you've swiped and then you reply to the swipe, the swipe buttons for the old reply disappear, so you have to go into the timeline or history to go back to that branch and switch to a different response. It's really not great.

If you're suggesting that this is all in fact possible and it was hidden away in the pile of options, do tell.
>>
>windows
>amd graphics card
Are kobold.cpp prebuilt binaries my only choice? I can't be arsed to install linux again and have to dual boot just to chat
>>
>>102726301
>>102726298
>>102726307
>>102726324
Man this sucks. But I guess I could repair the folders.

If anybody needs this:
The char pngs are in: /data/default-user/characters/
The folders/tags are written in /data/default-user/settings.json.
Look for ""tags": [" to get the IDs. Folders are also just Tags.
The characters get each ID under "tag_map".
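
A throwaway Python sketch to dump that mapping somewhere safe before pulling (assuming the layout described above):

import json

s = json.load(open("data/default-user/settings.json", encoding="utf-8"))
names = {t["id"]: t["name"] for t in s.get("tags", [])}  # tag ID -> tag/folder name
for char, ids in s.get("tag_map", {}).items():           # character -> tag IDs
    print(char, "->", [names.get(i, i) for i in ids])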
>>
>>102725725
>- This can be solved with jinja templates yea?
yes, this has always been my idea (if I actually had the motivation to build an RP frontend). I hate having the clunky ass prompt manager or the two dozen different text boxes for constructing a prompt, this is a solved problem already and jinja templates are the LLM industry standard at this point (Huggingface has even ported a Jinja parser to JS).
It's not the most user friendly thing but I think that's fine. You can use more complex templates for piecing together the prompt and message history, and provide a simple text box for techlets to input their preferred system prompt that gets plugged into the jinja template.
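
To make that concrete, a tiny jinja2 sketch in Python (the ChatML-ish template and field names are just an example, not any frontend's actual format):

from jinja2 import Template

CHATML = Template(
    "<|im_start|>system\n{{ system }}<|im_end|>\n"
    "{% for m in messages %}"
    "<|im_start|>{{ m.role }}\n{{ m.content }}<|im_end|>\n"
    "{% endfor %}"
    "<|im_start|>assistant\n"
)

print(CHATML.render(
    system="You are a grumpy tavernkeeper.",  # plugged in from the simple text box
    messages=[{"role": "user", "content": "hello"}],
))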

>>OpenAI completions-style parsing
>>Context builder that determines which messages + card defs + lorebook to put into the prompt
>- Will look at ST's code
>- This _seems_ fucking hard, but will look at ST's implementation to understand it.
Absolutely do not use ST's code as a reference for this, it is horrible. The OpenAI API is very simple, just build your implementation against their docs. I think in my idealized frontend I would build an abstraction that can handle turning message history -> my own internal Context format (takes into account size of defs, active lorebooks, and available token allocation to produce a full context) -> adapters that can turn a Context into a flat string prompt or messages array for the user's selected backend, which would initially be just OpenAI format since it's the most popular.
It is somewhat non-trivial, but it's more of a matter of coming up with a thoughtful architecture rather than the actual implementation itself being challenging. Don't use ST as a reference for this because there is zero thought behind any of it and shitty abstractions fucking everywhere.
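
Something like this, sketched (the ~4 chars/token counter is a stand-in for a real tokenizer):

def build_context(card_defs, lore, history, budget, count=lambda s: len(s) // 4):
    # fixed parts (card defs + active lorebook entries) get first claim on the budget
    fixed = [card_defs] + lore
    left = budget - sum(count(x) for x in fixed)
    kept = []
    for msg in reversed(history):  # fill newest-first until the budget runs out
        cost = count(msg["content"])
        if cost > left:
            break
        kept.append(msg)
        left -= cost
    return fixed, list(reversed(kept))  # adapters then flatten this per backend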

>gradio
Gradio is ass but if you build an API-first thing then whatever.
>>
File: file.png (35 KB, 581x370)
>>102724190
what version of venv or whateverthe fuck do I need?
>>
>>102726140
nobody wants to build on ST
>>
>>102726347
And chats, group chats, settings, custom system prompts, prompt templates, user personas, etc
You could also just back up the data folder they implemented a couple of months ago but this doesn't guarantee that it'll stay compatible in case they change something again.
I prefer to keep my old version around until I know that the new one works as intended.
>>
>>102726363
>Are kobold.cpp prebuilt binaries my only choice?
Well. You either compile or you don't. If you do, you have options. If you don't, you don't. Obviously, it is possible to build for windows... or use llama.cpp, but then you'll be faced with the same question...
What was the question again?
>>
>>102726411
nta. That's fine. Just do
>pip install --upgrade pip
if you want. It's just a warning. The installation of the actual packages seems to have finished correctly.
>>
>>102726363
No, you can manually compile llama.cpp and make it work. No idea how.

>>102726400
Yea, I meant look at their code to see their approach to it, not to copy their design but rather understand the approach and how users might expect it to work.
>>
There are multiple frontends that have the ChatGPT business appeal, and they can import cards as a bonus.
These projects have prebuilt APKs etc. for phones too.
What are these retards doing with silly?
>>
>>102726431
>You either compile or you don't.
>>102726471
>you can manually compile llama.cpp and make it work.

There used to be a guide in the readme of llama.cpp, guess I'll have to look for it, thanks
>>
>>102726471
What are those pre-compiled llama.cpp binaries with hip in the name?
>>
>>102725507
Character expressions, basic tts, the ability to attach lorebooks to characters, advanced lorebook controls:
https://github.com/SillyTavern/SillyTavern/issues/2189
>>
>>102726249
>he thinks they're going to put effort into the rebrand
lol, lmao
>>
>>102726532
>Character expressions
Would the ultimate rp frontend create (using an image model) and cache the expression images?
That sounds like a neat feature to have that nobody would ever use.
>>
>>102726526
Those would be the pre-compiled hip binaries. For rocm.
>>
File: AI-Dungeon.jpg (132 KB, 800x768)
>>102726532
Character expressions are post-1.0, but I do plan to have them;
TTS is a 'when I get around to it/dedicate an afternoon to implementation' for a basic implementation (XTTS).
For attaching lorebooks to characters, my current thought/approach is to have the user select a character, and then select which lorebooks to load with that character at chat time, so you can have a chat with Goku about the DBZ saga, and then turn around and have a convo about your ttrpg taking place in Illyria, using the specified lorebook for the question.
Those advanced lorebook controls are pretty fucking cool, I hadn't thought of that, and it presents an interesting angle for handling personalization/personalized responses for an ongoing persona chat: being able to identify/designate tiered pieces of info to alleviate context length limits.
Thanks for the info anon.
>>
>>102726666
No problem, Satan.
Good luck with your project!
>>
>no new models for weeks
>sillytavern rebranding as a productivity app
it's fucking over.
>>
>>102726749
Excuse me, his name is super satan.

>>102726666
Can you add a feature to your list where the chat history is summarized with each new message and only the summary + the last N messages are sent to the model instead of the whole chat?
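
Something like this, sketched in Python (summarize() is a stub standing in for an actual LLM summarization call):

def summarize(old_summary, messages):
    # stub: in practice another LLM call that folds these turns into the summary
    return (old_summary + " " + " ".join(m["content"] for m in messages)).strip()

def build_messages(summary, history, n=8):
    recent, older = history[-n:], history[:-n]
    if older:
        # a real version would only fold in turns that just fell out of the window
        summary = summarize(summary, older)
    prompt = [{"role": "system", "content": "Story so far: " + summary}] + recent
    return prompt, summary  # keep the updated summary for the next turn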
>>
>>102726666
Are local models better than AI dungeon was?
>>
Man, I really, really liked ST's scripting feature comboed with quick reply. I am definitely too dumb to implement something like that myself in any capacity.
>>
>>102726782
Yep, can do.

>>102726749
Thanks anon!

>>102726811
Technically yes, but I'm not aware of anyone that's trained a model along the same lines. Kobold is something like it with its RPG stuff. I personally haven't gone too deep into the RP stuff with LLMs, to be honest, but if some anon would care to share, that'd be great. Shit, you might just be able to get by with a character card as a narrator? idk
>>
Fucking hell
I'm doing some docker bs and fucked up an ollama container, so I literally copypasted the models folder from one container to another, and ollama refused to acknowledge the copied models.
Fuck ollama. Why do people make everything compatible with this piece of shit instead of llama.cpp?
>>
>>102726920
What's compatible to ollama but not llama.cpp? Sounds like a skill issue to me.
>>
>>102726920
lmao
>>
>>102726920
>Why do people make everything compatible with this piece of shit instead of llama.cpp?
Such as?
>>
>>102726928
>>102726922
I'm trying to run perplexica but I have seen a bunch of other shit I don't remember that did the same.
>>
>>102726920
Oi do you have a loicense for those models?
>>
>>102726892
just don't update
>>
>>102723923
yes you can call the claude api locally on your computer
>>
>>102726892
I did as well. I had a lot of scripts running.

>>102727073
I just duplicated the entire Sillytavern folder. So, if I encounter an update that removes RP features, I'll just use my backup.
>>
>>102726624
>they
Who?
>>
>think about what it would take to implement right pane chat history and persisting swipe buttons, vs hacking ST-like functionality onto other apps or just making a new app from the ground up
>for a moment the thought flashes in my mind that maybe, just maybe, it might be easier to just deal with the js spaghetti, surely those two features wouldn't be that hard to add
>>
>>102727283
Actually with that said, why can't there just be swipe buttons on every post including the user's? Could make for some interesting uses.
>>
>>102727379
it could change the context of the roleplay and make the next user reply nonsensical. I make branches when I want to adjust an old message and restart organically from there.
>>
>>102727497
Some users may want the ability to swipe in place, even if just to check what the other messages were, instead of branching and then deleting.
>>
>>102727497
I'm implying that by having persistent swipe buttons everywhere, you would either switch branches (like ChatGPT) or create a new one. You wouldn't have to go back, press the branch button, and then the swipe button. And persistent swipe buttons would also let you have a quick glance/reminder as to which replies you used swipes on and how many, as well as which swipe you're on. Frankly, right now ST just does not give an equivalent experience to ChatGPT because of this and that other feature. It seems small but it's actually pretty important to feeling good to use.
>>
File: chatgpt-ui-thing.gif (113 KB, 2032x1392)
>>102727528
Ah I never used chatgpt, when I typed >>102727520 I was thinking of previewing old swipes *then* hitting branch if I decide to branch.

Whatever you call this (automatic branch navigation? tree navigation?) would definitely be smoother way to move forward, backward, and sideways. And right pane mentioned in >>102727283 would visualize the tree, if I'm reading this correctly.
>>
File: Untitled.png (729 KB, 1080x1584)
Preference Optimization as Probabilistic Inference
https://arxiv.org/abs/2410.04166
>Existing preference optimization methods are mainly designed for directly learning from human feedback with the assumption that paired examples (preferred vs. dis-preferred) are available. In contrast, we propose a method that can leverage unpaired preferred or dis-preferred examples, and works even when only one type of feedback (positive or negative) is available. This flexibility allows us to apply it in scenarios with varying forms of feedback and models, including training generative language models based on human feedback as well as training policies for sequential decision-making problems, where learned (value) functions are available. Our approach builds upon the probabilistic framework introduced in (Dayan and Hinton, 1997), which proposes to use expectation-maximization (EM) to directly optimize the probability of preferred outcomes (as opposed to classic expected reward maximization). To obtain a practical algorithm, we identify and address a key limitation in current EM-based methods: when applied to preference optimization, they solely maximize the likelihood of preferred examples, while neglecting dis-preferred samples. We show how one can extend EM algorithms to explicitly incorporate dis-preferred outcomes, leading to a novel, theoretically grounded, preference optimization algorithm that offers an intuitive and versatile way to learn from both positive and negative feedback.
neat.
>>
Good work to the anon who made Mikupad. It's simple and clean. I'm trying to make a similar-ish web interface, and I don't know if it's simple for those better at web dev, but it is more difficult than it looks. That is all.
>>
While we're on the subject of frontend features, I think having a full cross-chat, cross-character text search feature would be cool. One of the issues with current chat history browsing UIs is that it's kind of difficult to know or remember what each chat really contained, when you're a heavy user and you have tons of chats. If you could just do a quick search across all chats and then press on the link to go to that chat, that would be pretty amazing. It makes me think of the difference between using a folder-based file browsing system vs a fast search-based file browsing system (like Everything.exe and Fsearch for Linux). Search is amazing for some types of file browsing tasks, while folder-based is still good for others.

ST does have a search feature, but its search range is limited to the character you currently have open, so you can't search across truly all chats, plus you still need to actually go and press a button to open the chat history menu and then you get access to the search bar. Would be so much better if that menu existed in the right pane rather than as a temporary pop up.
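
Since the chats are just JSONL on disk, a crude cross-character search already works outside the UI (the path and the "mes" field are from memory of a default install, so double-check):

import json, pathlib

def search_all_chats(query, root="data/default-user/chats"):
    for f in pathlib.Path(root).rglob("*.jsonl"):
        for line in f.open(encoding="utf-8"):
            try:
                msg = json.loads(line)
            except json.JSONDecodeError:
                continue
            if query.lower() in str(msg.get("mes", "")).lower():
                yield f, msg.get("name", "?"), msg["mes"]

for path, who, mes in search_all_chats("tavern"):
    print(path, who + ":", mes[:80])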
>>
>>102727632
A tree swipe like that would be amazing, strange that silly people didn't do it. I thought everything there is already packed as a graph.
>>
File: 39_06429_.png (1.03 MB, 1280x720)
>>102723173
>>
File: Untitled.png (213 KB, 1032x984)
Presto! Distilling Steps and Layers for Accelerating Music Generation
https://arxiv.org/abs/2410.05167
>Despite advances in diffusion-based text-to-music (TTM) methods, efficient, high-quality generation remains a challenge. We introduce Presto!, an approach to inference acceleration for score-based diffusion transformers via reducing both sampling steps and cost per step. To reduce steps, we develop a new score-based distribution matching distillation (DMD) method for the EDM-family of diffusion models, the first GAN-based distillation method for TTM. To reduce the cost per step, we develop a simple, but powerful improvement to a recent layer distillation method that improves learning via better preserving hidden state variance. Finally, we combine our step and layer distillation methods together for a dual-faceted approach. We evaluate our step and layer distillation methods independently and show each yield best-in-class performance. Our combined distillation method can generate high-quality outputs with improved diversity, accelerating our base model by 10-18x (230/435ms latency for 32 second mono/stereo 44.1kHz, 15x faster than comparable SOTA) -- the fastest high-quality TTM to our knowledge.
https://presto-music.github.io/web/
From Adobe, so no release ever I'm sure, just like the rest of their AI stuff that just rots away somewhere. Anyway, posting since musicgen is rare and the examples sounded good
>>
File: arrows.png (425 KB, 1069x1081)
>>102727632
Probably not what you mean, but it made me think about this. https://github.com/p-e-w/arrows

This is different, but fun too https://github.com/the-crypt-keeper/LLooM
>>
>>102727735
I guess you could say that since https://github.com/sam-paech/antislop-sampler
was just updated so you can supposedly use it with OpenAI compatible API programs. No more shivers, but perhaps jolts of electricity instead? Perhaps, just perhaps.
>>
>>102727700
There are actually two anons who could be said to have "made mikupad": the OG who created a pastebin and later a codeberg repository (https://codeberg.org/mikupad/mikupad), and the lmg-anon who continued to develop the original pastebin. I wonder if the former is still around...
>>
Guys, I'm very retarded, but is there a significant difference between exllama2 and llamacpp in terms of output quality? Honestly feels like my outputs are better in llamacpp than they are in exllama2
>>
>>102727806
llama.cpp quants are OP
>>
>>102727735
I want to subscribe to the Mikusex Times
>>
File: Untitled.png (1002 KB, 1080x1676)
SparsePO: Controlling Preference Alignment of LLMs via Sparse Token Masks
https://arxiv.org/abs/2410.05102
>Preference Optimization (PO) has proven an effective step for aligning language models to human-desired behaviors. Current variants, following the offline Direct Preference Optimization objective, have focused on a strict setting where all tokens are contributing signals of KL divergence and rewards to the loss function. However, human preference is not affected by each word in a sequence equally but is often dependent on specific words or phrases, e.g. existence of toxic terms leads to non-preferred responses. Based on this observation, we argue that not all tokens should be weighted equally during PO and propose a flexible objective termed SparsePO, that aims to automatically learn to weight the KL divergence and reward corresponding to each token during PO training. We propose two different variants of weight-masks that can either be derived from the reference model itself or learned on the fly. Notably, our method induces sparsity in the learned masks, allowing the model to learn how to best weight reward and KL divergence contributions at the token level, learning an optimal level of mask sparsity. Extensive experiments on multiple domains, including sentiment control, dialogue, text summarization and text-to-code generation, illustrate that our approach assigns meaningful weights to tokens according to the target task, generates more responses with the desired preference and improves reasoning tasks by up to 2 percentage points compared to other token- and response-level PO methods.
https://github.com/huawei-noah/noah-research/tree/master/NLP/sparse_po
Code not up yet. To me this seems like a very useful tool for making an RP model.
>>
>>102727923
Seems interesting, thank you anon.
>>
>>102727763
How does this work exactly? Does it always assume the character name is slop? Elara is in the prompt but it gets substituted every time in the example video.
>>
>>102727923
I wonder if llama.cpp people could learn something from this to augment KL divergence measurements. For instance, when used with different datasets, this could prove exactly how badly quants degrade on specific subject areas (namely RP) and not just on a generic one like wikitext. Of course we can already measure with different datasets, but doing it only on the tokens that matter might give us a clearer picture.
>>
File: Untitled.png (762 KB, 1080x1623)
UniMuMo: Unified Text, Music and Motion Generation
https://arxiv.org/abs/2410.04534
>We introduce UniMuMo, a unified multimodal model capable of taking arbitrary text, music, and motion data as input conditions to generate outputs across all three modalities. To address the lack of time-synchronized data, we align unpaired music and motion data based on rhythmic patterns to leverage existing large-scale music-only and motion-only datasets. By converting music, motion, and text into token-based representation, our model bridges these modalities through a unified encoder-decoder transformer architecture. To support multiple generation tasks within a single framework, we introduce several architectural improvements. We propose encoding motion with a music codebook, mapping motion into the same feature space as music. We introduce a music-motion parallel generation scheme that unifies all music and motion generation tasks into a single transformer decoder architecture with a single training task of music-motion joint generation. Moreover, the model is designed by fine-tuning existing pre-trained single-modality models, significantly reducing computational demands. Extensive experiments demonstrate that UniMuMo achieves competitive results on all unidirectional generation benchmarks across music, motion, and text modalities.
https://hanyangclarence.github.io/unimumo_demo
https://github.com/hanyangclarence/UniMuMo
Now your Miku can dance. Pretty neat, check the examples in the demo. Weights seem to be up (just finetuned from other already existing stuff)
>>
>>102727958
Idk, my assumption is that they simply just have a list of strings they check against.
>>
>>102728055
Nice, but it's not THE dance, the one that's as old as time.
Unless...
>>
zamba gguf?
>>
>>102728275
2 more years
>>
File: 172833259338669.png (478 KB, 512x768)
>>102728055
>Motion Generation
I'm waiting for it to go mainstream, imagine generated motions for an avatar, a character in a game, or even a robot in the near future. A new motion modality is essential for understanding the world. There is also no shortage of data for training, just use openpose on existing videos and movies, then feed both motion tokens and dialogues to an LLM. Perhaps even a finetune could be enough
>>
>>102728055
is this real time?
>>
File: Quants.png (349 KB, 2400x2400)
>>102727806
I mean, they shouldn't be better... but subjectively I've kind of noticed the same thing.
>>
dead thread, it's fucking over for local
>>
>>102727806
Most backends that use exllamav2 by default apply temperature first, as this is the "standard" way the transformer library does it. But gguf-using backends will apply temperature last by default, because it's generally agreed that it gives better results. So if you don't specify temp last with exl2, you may get lower quality.
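
A toy numpy sketch of the difference, using top-p as the truncation stage: temp-first flattens the distribution before the nucleus is cut (so high temp lets more junk tokens in), temp-last cuts the nucleus at temp 1 and only then flattens the survivors.

import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def top_p_sample(logits, temp=1.5, p=0.9, temp_last=True):
    base = logits if temp_last else logits / temp  # temp-first flattens before truncation
    probs = softmax(base)
    order = np.argsort(probs)[::-1]
    cutoff = int(np.searchsorted(np.cumsum(probs[order]), p)) + 1
    keep = order[:cutoff]                          # the top-p nucleus
    final = softmax(logits[keep] / temp)           # temperature applied to the survivors
    return np.random.choice(keep, p=final)

logits = np.array([4.0, 3.0, 1.0, 0.5, 0.1])
print(top_p_sample(logits, temp_last=True), top_p_sample(logits, temp_last=False))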
>>
>>102728765
Here, have this Miku
>>
>>102724511
He is right, incels deserve to suffer. It's not enough that they don't get pussy in real life, they shouldn't even be allowed to fantasize about it. Y'all need to grow up.
>>
Personally I can't wait for the ServiceTensor rebrand.
>>
>>102728868
Tesnor*
>>
>>102728872
Stop being immature. It was obviously a typo. Bullshit like this is why the rebrand is necessary.
>>
File: 1728337814409613.png (458 KB, 768x512)
>>102728384
>>
I'm having fun with 12b RP models, prompting them to be a website and streaming their responses directly to the browser. This would greatly benefit from faster inference and could be fleshed out by separating the character from a webdev agent, essentially turning an LLM into its own user interface. Still interesting to see what different models come up with. Pic somewhat related, qwen2.5 7b ablit.
>>
>>102724947
On an Arch-based distro I could get all necessary packages for an RX 6800 from the AUR.

>>102725521
I don't know what Transformers does, but in llama.cpp the memory in VRAM needs to be pre-allocated in order to get one contiguous block.
If you do many small allocations and deallocations you end up with gaps in between that are essentially wasted VRAM because they're too small to fit a new allocation.
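
A toy first-fit simulation shows the failure mode: total free space can exceed a request while no single hole fits it.

def first_fit(free, size):
    # free: list of (offset, length) holes
    for i, (off, length) in enumerate(free):
        if length >= size:
            free[i] = (off + size, length - size)
            return off
    return None  # "OOM" even though free bytes may remain

free = [(0, 100)]           # pretend VRAM is 100 units
first_fit(free, 40)         # block A at [0, 40)
b = first_fit(free, 20)     # block B at [40, 60)
first_fit(free, 20)         # block C at [60, 80), leaving [80, 100) free
free.append((b, 20))        # free B: 40 units free in total...
print(first_fit(free, 30))  # ...but split 20 + 20, so this prints None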
>>
Just had sad mikusex with miku cause of the ST fiasco, looks like I gonna need to create my own front-end and maintain it for myself.
Is doing it with flask a bad idea for a project like this? I haven't touched anything but python in my life, so I'm scared to do anything else.
>>
I've been in a coma since September 2022, as I understand the current situation is that there are two options: pay for an API from OpenAI and use a custom frontend or run local models, so have local models reached at least pre-filter characterai level?
>>
>>102729142
What is the "ST fiasco"?
>>
>>102729144
123b Luminum most certainly has.
>>
File: 1705071026716226.png (22 KB, 878x160)
22 KB
22 KB PNG
>>102729160
>>
>>102729142
Absolutely. I use bottle with gevent for mine.
>>
>>102729142
>ST fiasco
I thought they just deleted some default assets? or are they planning to remove actual features?
>>
File: file(22).png (17 KB, 1083x69)
17 KB
17 KB PNG
>>102729160
>>102729176

see
>>102721448
>>102721850
>>
>>102729188
so they're changing some terminology which doesn't matter, and removing proxy shit which only matters to /aicg/ niggers
nothingburger
>>
>>102729165
>cohee melts down
>spergs out on discord
>says ST is not a roleplay frontend
>creates new branch and deletes all RP content
>these are "ui label/docs terminology changes" now
>everyone else is "up in their feels"
I don't care either way because they can't delete my SillyTavern folder but this has to be the quickest rewriting of events in the history of the world. This shit just went down half a day ago and they're already lying about it.
>>
>>102728852
Basado
>>
File: 1674219559276175.gif (3.03 MB, 359x202)
3.03 MB
3.03 MB GIF
>>102724751
>Nobody needs such a frontend for coding questions, or to ask general questions to an AI.
i do. where else am i going to rape my secretary as a $0pa income NEET??
Are you going to lend me YOUR secretary to solve hypervisor questions before sucking my cock?
I thought not.
>>
>>102728609
Do you have the source for this image?
>>
>>102729228
>says ST is not a roleplay frontend
The application that uses character cards, and embeds one with an anime girl called Seraphina, is not a roleplay frontend? Lmao
>>
>>102729332
>and embeds one with an anime girl called Seraphina
not anymore!
>>
File: Untitled(4).png (48 KB, 1202x211)
48 KB
48 KB PNG
>>102729332
you will use the blank slate and you will like it
>>
is miqu supposed to take 1000 seconds to generate 1 reply on a 4080?
>>
alright you fuckers got me. I can't deal with this dumb shit anymore. would anyone be so kind as to answer a few questions? can I run sonnet 3.5 on a 6gb vram card (rtx 2060)? can I use my jailbreaks like I do on ST? and can I sync my chats between devices? thank you in advance lads.
>>
File: file.png (17 KB, 861x256)
17 KB
17 KB PNG
>>102729332
https://github.com/SillyTavern/SillyTavern/commit/4d35fea3b3243a02e333747b9298bada0fdb3aab
>>
>>102729372
Even 24 vram can barely load 2.5 bpw exl2 miqu with a 4-bit cache. With 16 vram you're splitting heavily to your cpu/system ram, which will make it slow as hell. 1000 seconds is still really slow though. Were you loading a ton of context? How fast if your cpu/ram? Even when I offload massively to my cpu, my replies aren't that slow.
>>
>>102729382
Bait used to be believable.
>>
>>102729420
all ui label/docs terminology changes btw
>>
>>102729382
Anon you're fucking clueless but you gotta start somewhere I guess.
Sonnet 3.5 is a gigantic model (think 100+ GB VRAM) running somewhere in a server farm, and they sell you access to it, but they don't make the model weights downloadable. Your computer does not matter at all, you're just using their webpage.
A JB is just some text you pass to a model, so yew, you can pass any JB to any model ever, that has nothing to do with your computer or chat frontend.
You can open ST in your phone, but since you're so clueless, I bet you don't wven understand how a local network works or what an IP address is.
>>
>>102729432
>>102729436
the duality of /lmg/
>>
>>102729228
>not an RP frontend
*not an RP-only frontend
>>
>>102729420
Did this retard get cold feet after the Permiso niggers set up the GPT honeypots or what? It's not like it's the first time ERPoomers have been on the news
>>
>>102729349
>corrupt people playing dumb
Ahh, ahh society.
>>
>>102729432
que? no hablo ingles.
>>102729436
rude. I have ST running through termux BTW so that's why I asked in the first place. I just never cared about running anything locally before. I got baited by some schmuck on /aicg/, but that's on me I guess.
>>
>>102729420
damn completely buckbroken by journofags. how embarrassing
>>
I hate Discord drama like you wouldn't believe
>>
>>102729428
4096 context, ryzen 9 7950X3D
Not too sure what i should be setting kobold to for this
>>
>>102729443
>I melted down about people referring to my fork of a roleplaying frontend as a roleplaying frontend and then deleted all of the roleplaying assets but what I was really mad about was the 0 people saying it was a roleplay-only frontend
okay cohee
>>
>>102729496
You can leave any time.
>>
ST always had terrible UX anyway.
- Some things are auto-save, some things are click-to-save
- A delete icon can be a trash can, a skull or an X. Sometimes X means quit
- To see all chats with a card, you have to use a tiny menu separate from the card listing
- Can't use a preset per card
- Can't use a proxy/model per preset
- Transparency in panels means there's a fuckton of overlapping text
>>
File: file.png (116 KB, 795x628)
116 KB
116 KB PNG
>>102729349
Weidmann (p.e.w) dev of DRY and XTC on the matter.
>>
>>102729496
Replace "hate" with "have" and now we're talking.
>>
>>102729500
Well, your CPU is better than mine. I'm using a 7800X3D. I don't know why you're getting such slow gens, unless each reply is really long?

I average 0.5 t/s with Largestral 123b IQ2_S, which is also way beyond my 4090's vram capacity.
>>
>>102729517
>A delete icon can be a trash can, a skull or an X. Sometimes X means quit
Skull for character deletion is sovl, justified.
>>
>>102729521
I would use captain blackbeard to backup my files.
>>
File: 1713678790144788.png (14 KB, 561x588)
14 KB
14 KB PNG
>>102729541
Each reply is around a small paragraph in size, i'll upload my config real quick
>>
>>102729349
All this aggressive 'we need to burn it down, all roleplay is horrible and for weirdos" talk makes it seem like either the lead dev got a girlfriend who laughed at him after seeing his personal project or he just decided that he's done and really wants it to get bought by someone.
>>
File: 1717966302435509.png (14 KB, 545x582)
14 KB
14 KB PNG
>>102729563
>>
>>102729561
how many of your files involve immoral criminal activity like piracy?
>>
File: 1723876399997936.png (10 KB, 547x577)
10 KB
10 KB PNG
>>102729575
>>
File: 1726101361300788.png (15 KB, 543x579)
15 KB
15 KB PNG
>>102729586
>>
>>102729566
It's just CAI all over again, beautiful to see in a open source project.
Proprietary is a state of mind.
>>
File: file.png (93 KB, 1114x289)
93 KB
93 KB PNG
Writing was on the wall really.
>>
>>102729596
c.ai likely had the model provider stepping on their toes over like it happened with ai dungeon
there is nobody who has that power over the st devs unless they are trying to get bought or it's a personal issue
>>
so wheres the fucking FORK
>>
>>102729491
You got the answer you wanted, nigger. Hell, you got an answer.
Do your own research next time you ungrateful cunt.
>>
File: 1724829161442923.jpg (75 KB, 1500x1500)
75 KB
75 KB JPG
>>102729635
This is all we have here
>>
So, out of the models I've tried, I like Midnight Miqu the most, but 0.62T/s is just way too slow. Where can I rent it for use on a cloud service, how much would it cost and how long context window can you get with an online service? Are there any models that are straight upgrades that you "might as well use" if you go cloud? I like Midnight Miqu's style and haven't found any glaring flaws either.
>>
>>102729635
It's not worth being forked, the code quality is shit, no actually useful RP features, terrible UX. The only redeeming qualities were its popularity and some level of maintenance.
>>
>>102729664
no, shove it up your ass bitch. I didn't order a side of attitude.
>>
>>102729715
You know I would really like to play a proper roleplaying game with future models, where a dungeon master could create character cards dynamically but I worried that silly tavern was unlikely to support this kind of work load. I would like to not have to this myself because I am somewhat incompetent but if a successor project arises from here I would really appreciate a frontend that could handle this.
>>
>>102729741
Fine, then don't ever ask anything again if you don't want people calling out your severe mental retardation, you low IQ mouthbreather.
>>
>>102729796
I do what I want you dumb wanker. My fault for including a please and thank you for you degen lot. Clearly your parents never loved you, couldn't be arsed to raise you, fucking shitstain loser.
>>
so wait everybody's mad because sillytavern is going to get more useful instead of being a thing you use to erp?
>>
>>102729664
>>102729741
>>102729796
>>102729845
samefag
>>
amazing how much of a stillbirth llama 3.2 was
>>
>>102729864
>so wait everybody's mad because sillytavern is going to get more useful
No, since no one but you thinks that's going to happen instead of the reality of them just deleting a bunch of stuff.
>>
>>102729845
>I do what I want you
Me too, and I am calling you a retarded nigger.
>>
>racism outside of /b/
>>
>>102729910
I'm not calling you anything. I know for a fucking fact everyone despises you, you failed abortion. Just stating the obvious.
>>
>>102729962
Ok
>>
>>102729795
I've made a very basic RP frontend with recruitable generated characters, dungeon crawling, and party management, and it feels 100 times better than ST, as it's exactly to my liking. I wish there were a similar project where I could contribute, but I'm not keen on starting and managing a public project myself.
>>
File: 1726777837127413.jpg (2.09 MB, 3000x2609)
2.09 MB
2.09 MB JPG
>>102730009
>>
>>102730009
dump it, fag
>>
Hi all, Drummer here...

Feels extra nice to be a Kobold user today.

Also pls test: https://huggingface.co/BeaverAI/Behemoth-123B-v1a-GGUF
>>
File: file.png (35 KB, 899x491)
35 KB
35 KB PNG
Ready for another day of serious business with our agents lads?
>>
>>102730355
can't you cook something in the 30B range
>>
>>102724866
you dont need more than a single html page with javascript
>>
File: image(1).png (105 KB, 1272x697)
105 KB
105 KB PNG
>>102730355
>Feels extra nice to be a Kobold user today.
eh
>>
>>102730440
>Reverse proxies will be removed
Uhhhhhhh, won't this also affect local?
>>
>>102730543
see
>>102730535
>>
>>102730535
>fag flag pfp
Of course
>>
>>102725507
Group chats
>>
>>102730507
I've got Star Command R 32B?

Horde: aphrodite/BeaverAI/Behemoth-123B-v1a
>>
>>102730355
IQ4_XS gguf when
>>
>>102730592
CR is slow and shit, though
>>
>Silly Tavern is a serious project for serious people!

Lmao, the absolute state. Is it autism? Did someone pat him on the back, and now he feels like an adult man and does not want to hang out with the internet anymore? All in all, this is pretty funny. Nobody will use Silly Tavern for anything other than roleplaying; why the fuck should they? The client itself is bloated mess and it is mostly used by people who have no technical knowledge and just want to chill and roleplay there nothing that silly tavern have on other front end that would make me think otherwise.
>>
https://phys.org/news/2024-10-nobel-prize-physics-awarded-discoveries.html
Hintonbros.... we won
>>
File: 1722768150745458.png (83 KB, 980x658)
83 KB
83 KB PNG
>>102730644
>the very serious discussion going on at ServiceTensor discord
>>
>>102730617
What about an upscaled Mistral Small like Theia? We're thinking of either a 39B or a 45B upscale.
>>
>>102730666
30B is the best I can do, sorry.
>>
>>102730644
Multiple people have stated in this very thread that ST is very useful for assistant-type stuff.
>>
>>102730663
Based, incels should stay away.
>>
>>102730663
kek based
>>
>>102730663
Discord seems marvelously bad for productive software development.
>>
It will be better to start creating the front end from a clean state for RP purposes.
>>
I wish I had enough time and motivation to do a SillyTavern fork but unfortunately I don't
>>
github.com/open-webui/open-webui with ollama backend is better.
>>
>>102730773
same. just easier to not pull until someone else does it
>>
>>102730773
It is not worth it; it will be better if we create our own. I will have a free weekend, and I do not plan to go drinking.So if there is no other anon starting some project. I will start one. If some Anon feels active, they could post some bullets outlining what should be the most important features implemented right away.
>>
>>102730355
this drama is so peak
pure cinema
*sips cum elegantly*
>>
>>102730663
What would discordfags do without their safe space?
>>
File: chrome_098ltsTy2P.png (104 KB, 1032x829)
104 KB
104 KB PNG
>>102730797
>>
>>102730826
>It is not worth it; it will be better if we create our own
Delusional. It will take you years to reach feature parity and your code will be just as much of a mess by then, assuming you don't just give up entirely.
>>
>>102730826
>what should be the most important features implemented right away.
imo they would be:
- all presets that silly already supports
- import and export presets (should support the ST format)
- support for configuring formats like ST
- import and export formats (should support the ST format)
- support for llama.cpp, koboldccp and OpenAI API
- card management for importing, editing, adding, removing
- basic chat interface with avatars, regen, continue, delete/edit message buttons.

And that's it.
>>
>>102730942
Lol, Silly isn't THAT complex. Do zoomers really?
>>
I dont understand...for RP Silly is king.
There are many ChatGPT clones. Why compete with them?
Silly Devs better not break and delete everything with their changes anymore.
I doubt the serious business are as forgiving.
They dont even have anything precompiled, just the source for the nerds.
No exe, appimage, apk. Would that need bigger changes? What are they doing.
>>
>>102730978
>What are they doing.
being silly
>>
File: fables.gg.png (9 KB, 112x77)
9 KB
9 KB PNG
>>102726666
good alternative for AI Dungeon is now Friends and Fables. Too bad there no free opensource alternative.
>>
>>102730355
>>102730610
seconding this, iq4_xs just about fits perfectly in 64gigs
>>
>>102730963
They have cool scripting though.
I could make a spoilered CoT interception on user post that deletes old ones and then triggers the char response afterwards.
Didn't really improve anything but I'm sure you can make some cool stuff with quickreply scripting.
>>
>>102730978
Their ego got too big. They think they are special shit just because they are the most popular frontend for coom. In reality, ST is garbage, with an unintuitive and bloated interface. Shit like LM Studio absolutely mogs it.
>>
>>102731016
LM Studio is great as a gpt clone. I use it myself.
And you have everything in one place. Loading the gguf etc. Silly is just a server.
>>
>>102723336
Really? I tried it and didn't think it was that much different from Rocinante
>>
>>102731002
That is actually pretty cool.
>>
>>102731004
i never used that, so it must be bloat
>>
>>102731002
>>102731046
Buy. Ad. Now.
>>
>>102731016
>>102731031
In terms of llama.cpp frontends I would recommend GPT4All over LMStudio.
It's open source and one of the devs is making upstream contributions to llama.cpp so I have more confidence in that software actually working as intended.
(I myself am using neither.)
>>
>>102731002
this is amazing!
>>
>>102731002
Cool!
>>
>>102731002
Wow, that is awesome! Thanks for sharing, super cool stuff!
>>
>>102731089
Now you done it..
>>
File: kek.png (117 KB, 691x722)
117 KB
117 KB PNG
>>
File: 1699308729180905.jpg (153 KB, 1057x483)
153 KB
153 KB JPG
>>
>>102731002
I'll definitely try it when I have the time but honestly the example interaction they show on their website is not promising.
All rolls made to gain information should be made by the GM in secret and I find it baffling that they would choose to show an interaction that goes against this principle (even if in this particular case it doesn't matter).
>>
>>102731200
>>102730440
>Ready for another day of serious business with our agents lads?
>>
>>102731224
Brought to you by ProTalk AI™.
>>
>>102726922
Openwebui as far as I can tell
>>
File: 1728387024546.jpg (782 KB, 1080x1900)
782 KB
782 KB JPG
>>102730355
It's retarded...
>>
>>102731244
More like your character's retarded
>>
File: 1728388829550.jpg (796 KB, 1080x1911)
796 KB
796 KB JPG
>>102731471
That's true, but when I use Largestral she does understand the question properly.
>>
>>102731640
>>102731640
>>102731640
>>
>>102729612
>it was not and will not
>it was not
>original readme specifically called out roleplaying
lol
lmao
I get the whole branding thing, but c'mon, that's just delusional.
>>
so im a bit confused here,
so i trained a model to do a task whos response was numerical value, trained it so it returned that response in words not digits,
the 8b instruct model trained on the dataset will do this
the 1b instruct model will randomly use decimal digits instead of words in its response ,

what is up with that?
>>
>>102730009
Release it and let people fork it.
>>
>>102725350
This already happened
>>
>>102731791
No
Shit
Sherlock
>>
>>102725753
Also need a CoT mode, like it should be built in and toggleable and cleverly implemented, not just an afterthought
>>
>>102731805
anon
it
already
happened
>>
>>102731807
>and cleverly implemented
What would that look like?
>>
>>102731826
The only winning move is not to play.
>>
>>102731826
Only keep the most n recent CoTs to prevent repetition and patterns, or better yet, keep the most important CoTs if possible, force reasoning as a third party, not as the character (similar to o1)
Top of my head
>>
>>102731905
That would be pretty easy to implement as an ST extension.
I made an extension runs N prompts after the assistant's response that has the option to only keep the latest result in the chat, so that's something I know is not hard to implement, and I'm sure my implementation is messy as fuck since I didn't invest more than a couple of braincells while fapping to do that.
Something with more knowledge of ST's API's and such could do a much cleaner job, I'm sure.



[Advertise on 4chan]

Delete Post: [File Only] Style:
[Disable Mobile View / Use Desktop Site]

[Enable Mobile View / Use Mobile Site]

All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.