/g/ - Technology

File: 1703751088340993.jpg (755 KB, 1856x2464)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>101600938 & >>101589136

►News
>(07/27) Llama 3.1 rope scaling merged: https://github.com/ggerganov/llama.cpp/pull/8676
>(07/26) Cyberagent releases Japanese fine-tune model: https://hf.co/cyberagent/Llama-3.1-70B-Japanese-Instruct-2407
>(07/25) BAAI & TeleAI release 1T parameter model: https://hf.co/CofeAI/Tele-FLM-1T
>(07/24) Mistral Large 2 123B released: https://hf.co/mistralai/Mistral-Large-Instruct-2407
>(07/23) Llama 3.1 officially released: https://ai.meta.com/blog/meta-llama-3-1/
>(07/22) llamanon leaks 405B base model: https://files.catbox.moe/d88djr.torrent >>101516633

►News Archive: https://rentry.org/lmg-news-archive
►FAQ: https://wikia.schneedc.com
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Programming: https://hf.co/spaces/bigcode/bigcode-models-leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
File: 1722057472013989.png (232 KB, 512x512)
►Recent Highlights from the Previous Thread: >>101600938

--Papers: >>101605355
--Mistral Nemo's context issues and potential solutions: >>101602310 >>101602329 >>101602641 >>101602828 >>101602980 >>101603030 >>101603070 >>101603092 >>101603185 >>101603227 >>101603364
--Llama3 quantization type and precision: >>101604291 >>101604353 >>101605347 >>101605562 >>101605576 >>101605722 >>101605651 >>101605701 >>101605786 >>101606078 >>101606448 >>101607040 >>101607086 >>101607352 >>101607411
--Running Mistral Large 2 locally with 3090 and 64GB RAM: >>101603131 >>101603139 >>101603231 >>101605203 >>101605384
--Good models that fit in 8GB VRAM: >>101607206 >>101607287
--Can GPT-like architectures ever match human intelligence?: >>101605727 >>101605795 >>101605797 >>101605905 >>101606349 >>101606395 >>101607240
--New PyTorch project for e2e quantization and inference: >>101601709
--LLMs' behavior when challenged and the importance of context: >>101605528 >>101605632 >>101606226 >>101606358 >>101606475 >>101606732
--Anon suggests a mixture of 70 billion 1-param experts: >>101602434 >>101602444 >>101602456
--Prompt passing as tokens in Ollama: >>101601982 >>101601998 >>101602051 >>101605347
--Nemo-instruct generates dragons for D&D Lorebook: >>101603305
--L40 and Ada 6000 GPU differences: >>101605613 >>101605650
--NeMoria-21b Nemo self-merge model: >>101603761
--MoE dead or not, Mistral legacy models, and upcoming updated model: >>101602134 >>101602171 >>101602173 >>101602232 >>101602252
--Clarification on model size and hardware requirements: >>101601931 >>101605285
--Best model for 6 GB VRAM GPU: >>101601201 >>101601239 >>101601330 >>101601374 >>101601251
--Anon seeks LLMS model recommendations for their 3090 GPU: >>101604297 >>101604426 >>101605121 >>101605245 >>101605257
--AI model responds with function definition instead of invocation: >>101604707
--Miku (free space): >>101601474 >>101601626 >>101604421

►Recent Highlight Posts from the Previous Thread: >>101601504
>>
Bitnet
>>
vramlets?
>>
llama 4 wen???
>>
>>101607819
yes?
>>
jepa jamba bitnet when?
>>
Is the current meta for stacking 3090s a romeD8-2t?
>>
>>101607819
Vramlets (people with less than 50 H100s)
>>
File: claude.png (67 KB, 2101x453)
why would claude care which of the suggested solutions worked in the end? It's not like I'm posting on some discussion board for others to see it
>>
Does your refusal to use proprietary models like Claude come down to privacy concerns, or do you just not see a reason to pay because local stuff is good enough?

Would you pay for proprietary stuff if there was an option to pay in crypto (like monero)?
>>
>>101607886
because you have a fundamental misunderstanding about how llms work
>>
>>101607858
If you have the space, then X98-8PLUS-V1.0
>>
>>101607953
privacy. I bought a gpu for AI and nothing else
>>
>>101607953
i just like the idea of running an ai on my own hardware, it feels nice.
>>
>>101607953
More like i already have experience with services that started out good, got progressively worse, and then started banning people when they didn't like what they were doing.
>>
File: claude-account-disabled.png (237 KB, 3456x1978)
>>101607953
>>
>>101607953
It's not even the payment, it's about the data. If they don't use my data at all (apart from running inference, ofc) and store it encrypted on their servers, I'll pay up, but as far as I know, only NAI does that rn and their model is a bit outdated nowadays
>>
>half of the thread says nemo is great
>half says it's shit
How do I get redpilled into joining the former? Even fucking Stheno worked much better for me.
>>
>>101608087
You can't get redpilled on taste. You either like it, or you don't.
>>
How does the new Llama compare to the corporate models now?
>>
>>101608087
For the 1000th time, it all depends on the settings anons use and their card; if their prompt is shit, then no matter the model, the output will be shit. Nemo is likely the best model for Vramlets right now. The only issue the model has is that the effective usable context is much lower than marketed.
>>
>>101607819
All of us became vramlets after 405B dropped.
>>
>>101608087
I suspect it depends on how vanilla your roleplays are in terms of format.
>>
>>101608122
>if their prompt is shit, then no matter the model, output will be shit
Well, as I alluded to in my post, shit like Stheno worked fine for me.
>>101608158
I guess it may be the case because I do weird shit and not really the "anon fucks 1girl" type of thing. But even when I attempted that for a test, it kept being extremely hesitant with characters going "no this is wrong i must refuse" until explicitly told otherwise. Oh and one of them got randomly shot at one point.
>>
>>101608176
>Oh and one of them got randomly shot at one point.
Ah, the AI Dungeon memories came flooding right back...
Anyway, not anyone of them. If you are using Stheno, then Niitama might work for you.
>>
>>101607953
if i could have accountless access to models (just a random user token that you top up by paying with monero)
and access them over tor / i2p, i'd use the service; otherwise it's gonna be local for me.

i don't even run completions that are that weird, it's just not anyone's business.

>>101608139
seriously some hardware maker should get their shit together and make accelerators with TB of vram, i'd pay $$$ for it.
>>
>>101608122
this is the worst kind of anon, believes in his magic sampler settings and telling his model to be creative, probably doesn't catch all of the stupid things his ai outputs, "prompt format is very important," "post logs"
>>
Jart won.
>>
oobabooga add Mistral-Large-Instruct-2407.i1-IQ2_M to your benchmark thanks
>>
>>101608087
The prose is somewhat fresh, but it hallucinates like a motherfucker and has the usual retardation in its param range. I've had better RPs with Lunaris-8B because at least it doesn't make random shit up and forget character details, though it's hindered by LLaMA slop. Granted, I haven't tried long context scenarios on finetunes of L3.1-8B.
>>
>>101608215
Ach yes, the magical anons who are full of bullshit every single time a new model is released and who use the same fucking sampler for all their models and complain that the output is shit are much better. Fuck off, faggot, learn to prompt.
>>
>>101607953
Free + offline is a fair trade off for local.
>>
>>101608267
FOTM fag has the memory of a goldfish
>>
>>101608285
>free
>you have to pay for hardware, electricity, real estate for your rig
Just pay Altman
>>
>>101608315
the electricity is cheap
hardware costs vary by autism but a single 3090 can be dual purpose
the real reason for local is to not have corpos sniffing at your activity and telling you their insane vision of what's right and wrong
>>
>>101608355
But a single 3090 isn't going to get you far.
>inb4 vramlet screeching
>>
>>101607953
Control. It can't be changed underneath me or taken away.
>>
>>101608355
Also, if it's somebody else's service, they can turn the service off, ban you, change the terms of the deal, etc.
Being able to control your own experience is paramount to me.
>>
>>101607953
if it was just cooming i would use it without a care, provided there was a private payment method, but i query way more than that and it's just way too much identifiable information to send to the cloud in such a tightly linked manner.
>>
What's better, official large or the lumimaid version?
>>
>>101607953
claude has insane positivity bias and denies everything
>UM USE THIS 3000 TOKEN JAILBREAK THAT MAKES THE OUTPUTS WORSE THEN IT WONT.. OOPS THEY PATCHED IT UHHH TRY THIS ONE INSTEAD
no
>>
>>101607953
I'm still using GPT and Claude for coding, but now that free models are finally good, is there a relatively cheap and privacy-friendly alternative for 405b or large?
>>
>>101608507
My Claude prefill is 3 words and it refuses nothing at all.
>>
>>101608511
>is there a relatively cheap and privacy-friendly alternative for 405b or large?
the smaller llama 3.1 models?
>>
>>101607886
>It's not like I'm posting on some discussion board for others to see it
The AI was trained on discussion board material so it's aping that behavior.

LLM has no ego. It's a Chinese Room that reads the document and adds to it according to the documents that it has studied. If you create a document that reads like a discussion board it will append to it to make it read more like a discussion board.
>>
>>101608547
>It's a Chinese Room
Prove you are not one as well.
>>
>>101608562
Bite me.
>>
>>101608592
I don't like chinese.
>>
>>101608562
I am not Chinese
>>
>>101608562
I'm a native speaker
>>
>>101608562
I am not a room
>>
So I fell asleep while my PC generated 2000+ Hatsune Miku gens overnight, and I woke up to my PC's fans running at 100% and the thing overheating; I had to shut it down before it melted down.
>>
>>101608762
And 20 of the gens are usable.
>>
>>101608562
didn't they do that mediocre amnesia sequel
>>
File: 64d3d1_11802331.png (1.26 MB, 1744x800)
>>101608762
There's so much Hatsune Miku that was made overnight
>>
>>101608791
Your devotion to the Miku is admirable.
>>
What's a good writing model for a 3080 12GB card? I want to try some creative writing.
>>
>>101608851
Claude and the 3080 is overkill for running SillyTavern.
>>
>>101608791
What a waste. They all look the same.
>>
>>101608851
Nemo does pretty well if you add a couple of snippets of text to its context for it to use as inspiration. At least at 32k context, I don't know if it loses the plot with a bigger context window.
>>
>>101608851
You could give magnum mini a try
Not the TOP TIER HIGH END 1T CLOUD POGCHAMP MODEL WITH EXTRA ONIONS, but it's pretty damn good for its size (at least in terms of prose quality), plus it's rather fast, so retries aren't as bad
>>
>>101608883
Well, yeah. I mean, that anon didn't change the prompt, and it looks like it's using that brother-sister incest game's lora, which funneled it even more.
Just imagine how much better they would have been with a randomized prompt...
>>
>>101607705
>►News
>(07/27)
>(07/26)
>(07/25)
>(07/24)
>(07/23)
>(07/22)
What do you think we'll get today?
>>
>>101609040
BitNet
>>
>>101608791
so much mental illness was made overnight
>>
>>101609040
C'mon, Cohere, do something. I want my hard-earned handout.
>>
>>101608562
Ching chong ping pong China will grow larger
>>
File: plap.jpg (160 KB, 2560x345)
Am I getting paranoid, or do I actually see gamemakers using ERP chatbots to write text for them? The slop is all over the place in the dialogue, it's hard not to notice.
Or do people actually write like that unironically in the first place, and it's the AI that mimics them too much?
I don't even know anymore. I just find it ironic how creatards are all against AI but resort to using it thinking nobody would notice.
>>
>>101609165
>no shivers down the spine
Nah, a human wrote this. Just an untalented one.
>>
>>101609156
Imagine a new Cohere model in the 30-70B range with 128k context and non-shitty KV cache, I'd be cumming buckets
>>
>>101609181
You've clearly never read a book in your life then. You fucking illiterate retard.
>>
File: shiver me timbers.png (131 KB, 2560x257)
>>101609181
>>
>>101609181
That's clearly AI, and I'm thinking it's Claude
>>
>>101609228
Nevermind then. It's AI.
>>101609227
None of the books I have read contain the stock phrase "shivers down your spine". Only a machine could write something so soulless.
>>
The same way DRY parallels Rep Pen, one could create an n-gram based analogue to Logit Bias, right?
That would be pretty cool.
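Roughly what I'm imagining, as a sketch (every name here is made up, nothing like this exists in any sampler as far as I know): instead of biasing single token ids, you bias whichever token would complete a user-listed n-gram given the current context.

import numpy as np

# hypothetical n-gram logit bias pass, run right before sampling the next token
def ngram_logit_bias(context_ids, logits, ngram_bias):
    # context_ids: token ids generated so far
    # logits: 1-D numpy array over the vocabulary for the next token
    # ngram_bias: dict mapping a tuple of token ids (the n-gram) to a bias value
    biased = logits.copy()
    for ngram, bias in ngram_bias.items():
        prefix, last = ngram[:-1], ngram[-1]
        # if the context currently ends with the n-gram's prefix, the next token
        # would complete it, so shift that token's logit up or down
        if len(prefix) == 0 or tuple(context_ids[-len(prefix):]) == tuple(prefix):
            biased[last] += bias
    return biased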
>>
File: what felt like slop.png (102 KB, 2560x190)
>>101609241
I do wonder who taught the machines all that, though.

This is Crisis point extraction, btw. Go say anon42 hi for using AI, I'm sure his fellow artists would be amused to learn about that.
>>
File: 401px-Gray1204.png (145 KB, 401x314)
Where can I download kyutai Moshi's weights? It was a mistake to trust the French
>>
>>101609269
No I don't think I will.
>>
So what's the current meta on using example dialogue? Seems like a lot of the new character cards don't bother having them. I'm on a 70B btw.
>>
Has anyone tried Undi's Largestral Lumimaid? I found it to be slightly brain damaged and too horny. Undster, I appreciate your effort of training new models, I really do, but have you tried to train it in a way that is a bit less damaging to intelligence or is coom the #1 priority for you? No hate, just asking.
>>
Just starting out with DRY, what's the meta for its settings?
>>
>>101609337
only useful for forcing people to use your personal brand of autistic formatting with bold for speech, double quotes for internal thoughts, and code blocks for actions
>>
>>101609337
It's completely optional. Can either improve or ruin a card. Some people throw slop straight from gpt 3.5 in there and then you wonder where the shivers came from. Always check it.
>>
>>101609349
Setting base multiplier to 0 and using rep pen instead, now fuck off.
>>
File: which one.jpg (225 KB, 2040x861)
Got pointed here for help.

I have ST setup and got recommended to use Mistral Nemo.

How do I download this shit lol

https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407/tree/main

Got a 4090 GPU for reference
>>
>>101609347
Why are you saying this as if this is something easy and straightforward?
>>
>>101609385
>I have ST setup
Do you have something to run models with?
If not, download koboldcpp and the Q8 gguf from
>https://huggingface.co/bartowski/Mistral-Nemo-Instruct-2407-GGUF/tree/main
Then connect Silly to Koboldcpp;
>>
>>101609357
>>101609380
Thanks lads.
>>
>>101609387
>Why are you saying this as if this is something easy and straightforward?
I am not. I'm just asking if there was any effort, or are Undi's tunes just for cooming, which is completely okay.
>>
>>101609435
What quant did you use?
>>
>>101609456
Q6_K, temp 1, minp 0.05
>>
File: trinity.jpg (446 KB, 1176x1176)
>>101609347
>>
File: do i fuck with it.jpg (54 KB, 534x579)
>>101609403
>s
Cheers for explaining in clear english lol.

Do I fuck with these settings?
>>
>>101609466
Try to play with your sampler setting, we only tried it unquantized during our test. Also check if the gguf was made correctly. I was using a temp below one, try 0.7 maybe ?
To reply to your question, the ratio of SFW/NSFW data got smaller and smaller on the NSFW side, so it should be less horny.
>>
File: koboldcpp primer.png (335 KB, 1264x1594)
>>101609492
Not him but yeah, here's the explanation.
>>
>>101609492
Enable FlashAttention and increase the context size. Mistral Nemo works with 128k, but I'm not sure if that can fit into your VRAM
>>
My PC is too shit to do anything meaningful locally, so I'll just do it all on runpod and larp as one of you

I'm guessing the obedience and "positive bias" all the public models have is just a side effect from their safety policy and if I run this shit locally it won't be like that? I need pushback when I say/ask something wrong or stupid
>>
>>101609492
Yes. That context size is essentially how much of the chat your model can remember, so crank that higher.
With a 4090 you can probably go all the way to 128k, but for now do 32k context and see how that works for you.
Also, make sure that Flash attention is on and that all layers of the model are offloaded to your vram (in the hardware tab I think).
>>
>>101609499
>I'm guessing the obedience and "positive bias" all the public models have is just a side effect from their safety policy and if I run this shit locally it won't be like that?
nah, it will; it's baked into the models too
>>
>>101609498
>FlashAttention
That's new, right? Guess I need to update. What does it do?
>>
fuck I just realized I became worse than Son Gohan, loved him as a child, gave my best and became good at (now useless) stuff, was really disappointed in Son Gohan giving up on becoming stronger, nowadays I'm useless and my brain is rotten. Does anyone know where I can find that clip where Son Goku tells him that you become stronger out of a need? Can't find it,
>>
>>101609523
>https://github.com/ggerganov/llama.cpp/pull/778
>https://github.com/LostRuins/koboldcpp/wiki#flash-attention
>>
>>101609499
Instruct models are all sycophants, even without "safety"; it's probably inherent to the whole concept since they're tuned to obey you. I agree it's very annoying. The only way to avoid it is to use base models with few-shot prompting, but they're schizo. Even with instruct, few-shot prompting (aka populating the context with examples of the style you want) can help.
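For the instruct few-shot case it can be as dumb as this, just as a toy illustration (the example exchanges are made up; with a base model you'd use the same idea without any chat template and it simply continues the pattern):

# toy few-shot prompt: seed the context with examples of pushback so the model
# picks up the pattern before it sees the real question
FEW_SHOT = """\
User: The Earth is flat, right?
Assistant: No. That's simply wrong, and agreeing with you would not be helping you.

User: I'll skip the tests, the code is obviously fine.
Assistant: Bad idea. "Obviously fine" code breaks in production all the time.

User: {question}
Assistant:"""

prompt = FEW_SHOT.format(question="I'm thinking of storing passwords in plaintext to keep things simple.")
# send `prompt` to whatever completion endpoint you use and let the model continue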
>>
>>101609537
>--flashattention can be used to enable flash attention when running with CUDA/CuBLAS, which can be faster and more memory efficient.
No downsides?
>>
>>101609575
Shouldn't have any, no.
And you can enable cache quantization with it too, which does come with some level of degradation, but at Q8 it should be negligible.
>>
>hook Mistral-Large into a chat I previously set up with Claude Opus
>it continues it perfectly with that Claude feel to it
This model is raw diamonds. It has some issues getting going on its own, but this should be fixable with some better prompting. The fundamentals are there.
>>
>>101609540
>>101609516
That's disappointing. In what way are the few-shot models schizo?

I also don't see how few-shot prompting would be very useful for my needs anyway, except maybe for cooming.
>>
File: undi_btfo.png (70 KB, 2492x364)
>>101609591
largestral is worth the very slow t/s values, can't bring myself to retry 20 times in a row with Nemo
>>
File: remember Sakki.png (241 KB, 2205x895)
>>101609337
>>101609380
Without ED the model's own speech style will overtake your character, and since so many models are hellbent on narration and purple prose, it will make your card unsuitable for chatting.
I did the research; here's a compilation of the same card responding to the same questions using different ED sizes (or none) and various models. Temp=0 to keep the randomness away. https://docs.google.com/spreadsheets/d/1BsGgRCzluqsZdc7pShCgNVSrv3KtRgzJohyg1rTX5Fc/edit?usp=sharing
>>
>>101609643
you didn't have to give poor undi third degree burns jesus christ
>>
>>101609584
isn't Q8 cache worse than Q4? or is that only for exl2?
>>
>>101609584
Thanks man.
>>
File: mmlu_vs_quants.png (336 KB, 3000x2100)
>>101609711
Only for exl2 due to the difference in algorithms.

>>101609713
yw
>>
>>101607953
>sign up to open ai for gpt 4 on launch
>its get dumber almost every month
>they release some new features which kinda helps
>continues to get dumber
>they talk about how GPT4 is the dumbest AI will ever be
>it gets dumber
>the new models they release are even dumber

the projects I used to do with GPT4 aren't viable anymore, it's too retarded, I would rather have something that runs locally and doesn't get unpredictably nerfed in the name of efficiency and then those nerfs justified by users with some useless benchmarks/polls. I'd pay $100 a month for the original GPT4, probably a lot more
>>
>>101609694
Well shit, I guess that's why all my cards go on long fucking descriptive rants even though I literally put "Focus on dialogue over descriptions" and "be concise and factual" in the sysprompt.
Back to writing ED then. Thanks Anon.
>>
>>101609754
>>101609694
That being said you tested that on an 11B, do you think the same applies to a 70B?
>>
>>101609754
The popular jailbreaks might be the suspects too; they usually go all "be verbose and use flowery speech when describing blah-blah".
But it's hard to find the balance between one-word replies and going full ficbook.

The document I linked contains the 70B tests too, but that one's a cloud model, so I'm not so sure about their setup under the hood.
>>
>>101609773
I mean it should have an easier time sticking to your characters' personalities with ED, so if you notice the model deviating from what you intended, give it a shot
>>
>>101609498
It does not work at 128k, at least not for roleplay, and you waste resources if you push it that high. You can see it getting stupid around 16k. Seriously, anons, no advice is better than bad advice.
>>
File: wew.jpg (42 KB, 542x369)
>>101609509
How many layers would I want on a 4090?
>>
What I do wonder is how much ED is too much. Some guides tell you that a large ED pushes the actual definitions too far back so the AI ignores them. But how else can I make the bot stay true to the character's own personality if not by letting it figure it out on its own from her speech?
I mean, if I make a card for a manga/anime/novel character I have a huge corpus of their lines at my disposal. Should I just include everything in ED?
>>
>>101609711
The EXL2 numbers that supposedly show that 4 bit cache is better than 8 bit cache had a comparatively small sample size.
There was no statistical analysis of the results but if there was I very much doubt that 4 bit cache is better than 8 bit cache with statistical significance; I very much expect this to just be random chance.
>>
>>101609818
That specific model has 40 layers I think.
Regardless of that, the whole model should fit in your VRAM, so put a 999 (tells it to just put everything) in the input field and carry on.

>>101609850
It could be if the 8bit quantization was, say, just truncating the values or doing something really stupid instead of doing scaling or the like, right?
>>
>>101609818
You're overthinking it.
When you load the model it will guess at GPU layers.
That's probably fine unless you raise the context (which you probably want to if you're doing anything other than one shot Q&A kind of stuff.)

When you run it the following will happen:
1. It works. GPU layers isn't too high, but you can try higher.
2. It throws a memory error into console after you wait a while for the model to load. Too many GPU layers. Write down what you used (you can scroll up and fish it out of the console dump if you've forgotten) and try a little lower.
3. It goes to the WebUI okay but blows up when you submit a prompt. Remove one GPU layer and try again.

If you're using VRAM for things like video streaming or a game then you have less VRAM free and might need to reduce layers. But mostly just trial and error till you have a little post it note with your models and how many layers your system can support.

And that infographic above says that a lower layer count sometimes goes faster, so you can test even more if you're autistic.
>>
>>101609856
>Regardless of that, the whole model should fit in your VRAM, so put a 999 (tells it to just put everything) in the input field and carry on.
And this, if your model fully fits, max out and be happy.

Being picky about layers is what you do when you're like me, running 50GB filecached in 64GB system RAM and where I put the context determines how many layers will run or crash.
>>
What the fuck is that
>>
>>101609856
I did not look at the implementation.
It is possible that there is something wrong with the 8 bit implementation.
But IIRC the result for 4 bit was better than even for 16 bit which are results that I definitely do not believe without good evidence.
So my expectation is that the rounding error for 4 bit just happened to provide better results for the small sample that was used for evaluation.
>>
>>101609896
>But IIRC the result for 4 bit was better than even for 16 bit which are results that I definitely do not believe without good evidence.
Woah, alright, got it.
>>
>>101609883
idk, looks like a low-effort investor scam
>>
>>101609818
To add to what >>101609860 said, if you need more VRAM and your processor has an iGPU, see if you can use that instead, as it'll free up your dedicated GPU's VRAM. Those extra 1-2 GB can make one hell of a difference
>>
>>101609896
Most likely it just means the difference is slight enough between 4 and 8 for random chance to impact the results.
>>
So much effort... I kneel...
>>
>>101609930
hi anonei
>>
>>101607953
There's already an option to pay in crypto via OR
But for most of us it's a combination of price (free, but even OR options are significantly cheaper for the same level of performance than the proprietary ones, with the exception of maybe GPT-4o-Mini), privacy (since Nick revealed mods look at stories and OAI threw random stories on public taskup, several anons have denounced proprietary entirely), reliability (corpos can and will ban you from using their model if they don't like how you're using it, which is extra fucked up when you realize they all want a monopoly, and in their vision whoever gets banned would be denied any use of AI period), anti-censorship (in addition to the above banning, corpo models are notoriously pozzed and have a severe lack of ways to fix them), and customizability (several models can't be finetuned or LoRA tuned, and those that do make you pay a big premium to both train AND use them after)
>>
>>101609850
what would you recommend using then?

Would we need to run a benchmark like RULER at high context to know if there is quality degradation between fp16/Q4/Q8 cache?
>>
>>101608241
>i1
No need, it's already trash
>>
SillyTavern guys, do you use this tab, or do you just put the scenario in the description?
>>
File: guess which is which.png (351 KB, 4000x2193)
>>101609896
I tried comparing the model's behavior with flash attention + cache quantization and without either, at temp 0, while trying to keep my responses more or less the same. The model's responses vary too much between the modes, but I can't tell exactly which one is better.
But surely one can't have a major speedup without paying some price, and that price is usually quality.
>>
>>101609474
I have literally never used any of these people's models for cooming.
>>
>>101610036
I put it in the description.
>>
>>101610053
then you don't belong here
>>
>>101610053
Wait. So you use their models for stuff that is not cooming? Now that is fucked up...
>>
File: file.png (167 KB, 1757x827)
this is the most soul a chatbot has had by default. good job meta :)
>>
>>101610080
Soul of a redditor, maybe.
>>
>>101609968
For the EXL2 results perplexity was used.
This is not a problem in and of itself, the problem is just that the number of input tokens was I think 5120.
That is in my experience simply not enough input data and you should ALWAYS do a statistical analysis afterwards in order to check whether your results are statistically significant.

Since the goal of KV cache quantization is to keep the same logits but just use less memory I think the most straightforward metric to use is the KL divergence.
Compared to perplexity this also has the advantage of much better precision at the same number of input tokens.

RULER would also work but any regular LLM benchmark should work as well since those implicitly also use the context; the interpretation of the results would be different though.

>>101610041
>But surely one can't have a major speedup without paying some price and that's usually the quality.
Agreed, but I still think it's important to objectively measure these things if at all possible.
Both with performance and precision little effects add up.
>>
>>101610080
distilled reddit brappa
>>
>>101610060
Thanks! You do the same thing with the example dialogue?
>>
>>101610080
>...
llamaslop
>>
>>101610080
>talks about /x/
>doesn't even mention sucubus summoning
Soulless.
>>
>>101610080
Remember when Sam Altman said that open source GPT4 would be the end of the world?
>>
>>101610124
Nope.
I use the actual example field since that one has some specific settings that you can change depending on the specific model or card.
I might put an example of a character's speech in the character's description while using the example dialog field for example exchanges between user and character.
>>
>>101610064
t. Claude jeet
>>101610077
I just don't use them. They always tune their models on the same shitty claude proxy datasets. They aren't even worthy of merge fodder.
>>
>>101610173
Thanks again man.
>>
>>101609385
Thank you my twin! Couldn't get it running. Will install Kobold now too
>>
>>101610089
Could you provide any resource on how I could do these test myself? maybe with a bigger sample
>>
>>101610182
Better than GPT-4 proxy datasets.
>>
what models are people using nowadays that fit in 24GB of VRAM?
>>
cohere? more like conothere. where are they?
>>
>>101610208
Proxy datasets in general are garbage.
It made sense with Pygmalion 6B insofar as it trained the model to actually understand an RP prompt, but models have since gotten good to the point that any current-generation retard model can figure out how to use a tavern card.
>>
>>101610229
>>101609403
I'd also suggest you try gemma 2 27b.
>>
>>101609957
One thing I'd add.
Research. Probably doesn't directly affect many anons, but companies like OpenAI take information from the research community and don't give back. They want people to be uneducated so that they can charge whatever they fucking please, and if they do come across some groundbreaking research that brings AGI to fruition, you can fucking bet they're going to keep that information all to themselves.
Back in the day they justified this by saying that it was for our "safety". After that fell through, they try to phrase it like it's their secret Coca Cola recipe so of course they can't share it. In reality it's more akin to a lab discovering new properties of electricity and magnetism and releasing technologies using these laws without divulging what said laws are.
To put it simply, if you want the technology to grow and people to make new discoveries, you do not want closed source companies to win.
>>
It's up.
https://huggingface.co/nothingiisreal/L3.1-8B-Celeste-V1.5
A massive upgrade over Stheno.
>>
File: yann_stopit_k.png (194 KB, 1227x499)
>>101610327
>>
>>101610080
>...erm

S L O P
>>
>>101610201
https://github.com/ggerganov/llama.cpp/tree/master/examples/perplexity
The llama.cpp llama-perplexity binary has KL divergence calculation including an estimation of the uncertainty.
Though if you don't care about efficiency it should be fine to just use the definition on Wikipedia with something like NumPy.

The basic way the uncertainty is calculated is to assume the values follow a Gaussian distribution, calculate the standard deviation, and then divide the standard deviation by sqrt(sample_size - 1) .
The uncertainties are also in some cases propagated to approximate uncertainties on other values, see https://en.wikipedia.org/wiki/Propagation_of_uncertainty
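If you don't want to bother with llama-perplexity, the NumPy version of the above is short; this assumes you've already dumped per-token logits for a baseline run and a quantized-cache run yourself (which is the annoying part, and which llama-perplexity otherwise handles for you):

import numpy as np

def log_softmax(x):
    x = x - x.max(axis=-1, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=-1, keepdims=True))

def kl_divergence_stats(base_logits, test_logits):
    # base_logits, test_logits: arrays of shape (n_tokens, vocab_size)
    logp = log_softmax(base_logits)   # reference run (e.g. FP16 cache)
    logq = log_softmax(test_logits)   # comparison run (e.g. Q8/Q4 cache)
    kl = (np.exp(logp) * (logp - logq)).sum(axis=-1)   # per-token KL(P||Q)
    mean = kl.mean()
    # uncertainty as described above: standard deviation / sqrt(n - 1)
    err = kl.std() / np.sqrt(kl.shape[0] - 1)
    return mean, err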
>>
Lots of newfriends lately. I'm glad the insane tranny who was spamming scat porn a few months ago has finally joined the 41%, it was a bad look.
>>
>>101610376
Is there a way to test the exl2 Q4 vs Q8 vs fp16 cache?
>>
>>101610401
If you mean a Python script or similar that already exists, then I don't know since I have never looked into it.
>>
>>101610382
Looks like the odds were against him.
>>
>>101610382
>>101610420
you could say his actions were quite shitty, i'm glad he picked the high road eventually.
>>
>>101610327
the picture is hot
>>
Why does ooba take 2GB of my ram even with no model loaded?
>>
>>101610327
>Celeste
Play woke games, win woke prizes.
>>
>>101609498
This thing works really well holy shit, I feel like it's doubled my t/s
>>
File: satania.gif (39 KB, 220x216)
>>101610450
py_toddlers BTFO
>>
>>101610450
modern frontend development
What's worse is when software meant to squeeze the most out of your hardware is written using Electron or something else Chrome-based, jewing you out of VRAM before you even get to launch a model. Looking at GPT4ALL and Backyard now.
>>
>>101610485
But, it's not even running a web browser or any GUI. It's just a web page and an API...
>>
File: file.png (149 KB, 1267x901)
>>101610327
https://huggingface.co/nothingiisreal/L3.1-8B-Celeste-V1.5/discussions/2
>>
tourist here. please spoonfeed me the current best local erp finetunes.
>>
>>101610542
Midnight Miqu 70B
>>
>>101610327
Holy r*dditslop. Why did you feel the need to post this? Are you mentally handicapped?
>>
File: degbvdegbedagae.png (24 KB, 1024x261)
is this an error to do with quanting? what the fuck
>>
>>101610542
Or if you're a vramlet, Fimbulvetr 11b was good some time ago, maybe something better dropped since than though.
>>
>>101610542
Mistral Nemo
>>
>>101610587
oh yeah i should at least link where i got it so people know who not to download from
https://huggingface.co/Ransss/mini-magnum-12b-v1.1-Q8_0-GGUF
>>
>>101610587
kcpp version?
>>
>>101610623
1.71
>>
>>101610587
>>101610602
it was the model, got it from quantfactory and its launching fine
https://huggingface.co/QuantFactory/mini-magnum-12b-v1.1-GGUF/tree/main
>>
>>101610327
>reddit writing prompts dataset
I've seen these floating around, but I can't help but feel like it might be actively harmful for the model. The main issue is how short each response is. The "short story" is just a couple of paragraphs that fit within a single reddit comment. This seems like it would bias the model to gloss over things and try to wrap up everything quickly, but I dunno. Haven't actually used the model.
>>
File: 1714835911803036.jpg (776 KB, 2304x1664)
>>101610080
That output is the opposite of sovl
>>101610542
>>101610556
This and Magnum-72B are the best erp sloptunes currently. Best non-sloptune (and overall) is Mistral-123B
>>
Ahem
AI isn't real
*mic drop*
>>
>>101610696
Magnum-72B is still stilted, Nemo is better.
>>
>>101610696
Midnight Miqu is a random meme merge of L2 models proven to be even more retarded than a 9B model. When are you going to stop shilling this crap, mikufag?
>>
>>101610754
Right after we switch to a more fitting mascot of the general rather than some TTS engine.
>>
So...I have Mistral Nemo Instruct. But what text completion preset do I use in ST? I'm getting very short completions.
>>
>>101608122
>if their prompt is shit
I thought people were using just the simplest prompt nowadays like in the mistral preset.
>>
>>101610797
Just use OAI api and don't handle templating in ST. You will lose prefill but if your model is not super censored, it will be fine.
>>
>>101610797
Latest ST update has a Mistral-Nemo preset, pretty sure
>>
>>101609165
There's a reason the 'chatbots' say the things they do. It's common in low quality fiction.
>>
File: OIG1.gLxm3isVEvwv1M.jpg (155 KB, 1024x1024)
>>101609643
>mixing at random mid-tier dishes
Ok Gordon Ramsay. Have you never watched Next Level Chef?
Even low-tier ingredients can be turned into something cordon bleu provided they're in the right hands. Gestalt: the whole is greater than the sum of its parts.
Picrel (it's the special ingredient).
>>
>>101609498
>not sure if that can fit into your VRAM
How do you know how much space to leave for the context? It doesn't fill up at the start, right? Do you just adjust as you run out of memory?
>>
>>101610835
I just wonder whether I would've noticed these cliché phrases as much if I had never used chatbots but had instead read as much low-quality erotic fiction. Can't unsee, so can't check it.
>>
>>101610861
>It doesn't fill up at the start, right?
It does, though it might use a little more when actually generating
>>
>>101610587
That's a tensor shape issue.
You got a bad quant my friend.
>>
>using any model other than Tenyx-DaybreakStorywriter for any use case.
>>
>>101607953
>have to show my cock to the ick on eck shitalian to use a half-cucked model
No way fag.
>>
>>101610851
Give it up, undi. You are not convincing anyone.

>>101610866
It's interesting that these phrases are common with llama 3.1, considering how much they have filtered their dataset during training. I feel as though the usual gpt/llamaslop is a genre of its own, the sovlless prose more so a symptom of railguarding safety quotas.
>>
>>101607953
My country is banned even from free access, let alone payment systems. I just don't see how wrangling the countryblocks only to wrangle the censorshit later is any better than wrangling the stupidity of local models.
>>
File: 1714835911803030.jpg (1002 KB, 1792x2304)
>>101610754
>>101610793
Nice samefag + VRAMlet seethe. Miku ain't going anywhere
COPE
O
P
E
>>
>>101610993
every time you mikufags try your hardest to cope with your tranny delusions, you end up splitting threads and pissing off people who don't even join in the arguments
just give it up already, you're no better than p*tra and undi at this point.
>>
File: ZDEDEe2gbQ8.jpg (213 KB, 768x1024)
>>101610993
newfag doesn't remember Tay
>>
>>101610814
do you mean using chat completion instead of text completion?
How do you make it so you don't handle templating in ST?

And what is prefill?

Sorry, new to ST
>>
>>101611010
she has petravatar face
>>
File: whbawbabawhb.png (25 KB, 766x147)
god fucking dammit, something broke and now i have responses short like this.

>or is quantfactory serving me a bad model?
>>
>>101610815
isn't it just the normal "mistral" template that's been there for a long time?
>>
File: based.jpg (15 KB, 409x509)
>>101610925
Based daybreaker chad
>>
>>101611032
>trusting quantfactory who got called out for their shit by cuda dev himself once:
https://huggingface.co/QuantFactory/Meta-Llama-3-8B-GGUF-v2/discussions/1#66431509baf74d67b47d6edd
ngmi
>>
>>101610878
Really? That's not my experience, the vram usage increases as the context grows it seems.
>>
File: flightreactionsyell.gif (192 KB, 220x220)
>>101611050
oh my fucking LECUN who the FUCK knows how to quant competently anymore?
>>
>>101611061
https://huggingface.co/bartowski
>>
File: 1719351514748681.jpg (575 KB, 2048x2048)
>>101610735
Nemo is not an erp sloptune though. I guess it's okay if you can't run Mistral Large
>>101611009
>splitting threads and pissing off people
>The blackedmiku VRAMlet cries out in pain as he strikes you
>>
>>101611061
bartowski
>>
>>101611076
>>101611085
thanks, i completely forgot about him.
>>
>>101611091
suffering from no drama success
>>
what is the best model for bash scripting that can run on 12gb of vram?
>>
>>101610382
It was the blacked anon, he is still around.
>>
sisters, what is the cheapest way to run 100B locally with at least 10t/s?
>>
>>101611130
A6000
>>
>>101611060
It increases but it's allocated when you start it, you don't have to test a 128k context to see if it fits, that's what I meant
>>
>>101611130
Getting on your knees and sucking about 20 grand worth of cocks.
>>
My characters in nemo can't stop nodding at the end of every reply for some reason.
>>
>>101610542
vanilla largestral
>>
>>101611104
I believe it so hard. I grabbed his nemo instruct and it's working perfectly fine.
Boy, i love this 2 t/s from running a Q8, but at least it's perfect.
>>
File: 0fa.jpg (1.05 MB, 3264x2448)
Aight bros.

I finally got everything running (this me >>101609385).

Everything's set up, but the chat still isn't what I want it to be, no doubt because my settings are trash; this shit is confusing and every guide reads as if I need a degree in coding.

Basically I just wanna know if it's possible to get my chat bot to operate similar to how Character AI does, where the conversation flows realistically, for example:
>scenario is i'm texting the AI
>AI doesn't ask a question every reply, doesn't ramble, doesn't use flowery words

With my shit settings it's already pretty close so i'm hopeful.

>using Mistral Nemo
>4090 GPU
>Just need a few pointers into the right direction

I'm struggling to grasp which settings to fuck around with, stuff like the temperatures or the AI response formatting, because every guide is tailor-made to other models, which only adds to the confusion.

Help a brother coom bros
>>
>>101608122
>the effective usable context is much lower than marketed.
What is then?
>>
>>101611238
To begin with click the Neutralize Samplers button in the Text Completion presets page (the page with temperature, topP, topK, etc).
Once you've done that, put Temperature at 0.5 and min-p at 0.05.
Now go into the advanced formatting tab and show us your Context Template and Instruct Mode Sequences (it's folded by default, open it).
>>
File: settings.jpg (391 KB, 1981x1207)
>>101611278
here's my current settings, if you get this working like C.AI, i'll paypal you 1 million yen
>>
>a dance as old as time itself
>>
>>101611327
I don't see anything extremely wrong at first glance.
Things I'd do
>Change Context (tokens) to be the same as Context Size in >>101609492
>Disable the Include Names in the Instruct Mode settings.
Also, what character card are you using?
>>
>>101609957

What's "OR"
>>
>>101611375
Open Router, if I had to guess.
>>
>>101611375
openrouter? not him just guessing
>>
>>101611375
openrouteur
>>
>>101611375
Oculus Rift.
>>
>>101611375
Open Retard
>>
>>101611327
>>101611369
Oh yeah, change your Instruct Mode preset to MistralNemo.
>>
Hi all, Drummer here...

I'm releasing this as the official version today: https://huggingface.co/BeaverAI/Gemmasutra-Pro-27B-v1i-GGUF

Gemma 27B with extra moist. Testers have noted less Gemma bullshit like trying to end sex scenes too quickly and lacking the vocabulary to describe sex in more detail. Some have even gone through slopless runs, so I suppose quality depends on the card as well.

Characters are also more willing to engage in seggs and can say dirty shit.

Thanks all! Btw, my ad has only gone through half the funds after a month.
>>
>>101611327
oh shit is that clusterfuck of settings I'm required to understand to use ST?
that's a price too high to pay, I'll have to stick to Backyard
>>
Remember >>101611423 was last https://poal.me/np0lsk
>>
>>101611423
keep buying an ad.
>>
>>101611418
Don't have a Mistral Nemo preset, only Mistral
>>101611369
I've tried a bunch of character cards but they're all schizo horny yappers, so I just imported my one from character AI.

It gets the job done, it just has issues of not sounding as natural as on character AI and will always ask me questions instead of just replying naturally to my conversation if you get me
>>
>>101611439
kek
>>
>>101611445
>>101611418
Hopefully this is the issue, I hadn't downloaded this yet https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407/blob/main/model-00001-of-00005.safetensors
>>
File: 1661942437204466.gif (2.91 MB, 240x240)
>>101611439
lol lmao
>>
>>101611439
Don't understand the hate. He wouldn't keep posting here if it wasn't working. People do end up using his models.
>>
File: file.png (75 KB, 1314x639)
>>
>>101611546
i like his ad.
>>
File: norm laughing.png (450 KB, 640x753)
>>101611550
OOOOOOOHHH GET FUCKED ((adermacher))
>>
>>101611550
kek
>>
>>101611445
Use base nemo instead.
>>
>>101611550
Can HF servers handle two greats in one discussion??
>>
>>101611550
battle of the slopquanters
>>
>>101611572
This
>>
File: file.png (71 KB, 1304x407)
>>101611578
>>101611582
>>
>>101611550
heaven and earth colliding
>>
>>101611589
Damn... he got a point
>>
Smart men discuss ideas, stupid men discuss other men.
>>
>>101611472
You don't need to download that if you are using koboldcpp.
That's the model before quantization (compression). The QX GGUF files (Q8, Q6, etc.) are the quantized models. Q8 is smaller than the original full-size model and produces essentially just as good results.
The thing about character.ai is that it has a very specific style in how the conversations are conducted, that might be what you are feeling is weird.
You could try other models to see which approximates the results you are looking for better, play with the prompt to try and steer the model towards the result you want, etc.
Most models write in a way that's more of a novel than a text-chat like conversation, which is what I remember character.ai feeling like.
The way the character card as well as the first message is worded and formatted will also steer the model towards certain styles.
Anyhow, save this
>https://files.catbox.moe/6g6hud.jso
as a .json file and import it as your Instruct Preset. See if that yields better results for you.
>>
>>101611601
sad people aren't discussing your ideas petrus?
>>
>>101611589
>>101611601
its funny cause that screenshot is literally picrel
>>
>>101611601
profound
>>
File: file.png (37 KB, 1307x376)
>>101611550
>>
>>101611601
smartest men ahh ahh mistress
>>
>>101611619
You mean two people whose first language isn't English?
>>
>>101611633
smartiest of men:
>>
>>101611610
cheers for the help mate, i'm gonna fiddle with it but yea, that's exactly it.

Most of the shit I find online in general tends to turn the AIs into professional yappers; it's the best thing about character AI and i'm still searching for something close to it (though I don't seem too far off).

Also, the link is 404'd?
>>
>>101611572
What do you mean base nemo? Like the model or instruct preset?
>>
>>101611646
>Also, the link is 404'd?
I fucked up and didn't copy the last n
>https://files.catbox.moe/6g6hud.json
>>
>>101611659
He means the base (non-instruct tuned) model, probably.
>>
>>101611589
>numbers and measurements
lmao pot meets kettle
https://github.com/ggerganov/llama.cpp/issues/6841#issuecomment-2081271326
> Also, I simply think it goes a bit far to dictate to everybody what is an acceptable output for models that (with transformers, or even broken quants) gives reasonable output.
>>
File: 1721772759399.png (488 KB, 1236x1219)
>>101611179
at least they're not yelling
>>
File: 100 bucks to fuck off.jpg (328 KB, 1028x982)
>>101608315
>hardware
I'd have bought it anyway. I have a modest rig and it fulfills my needs. 5 years of chatgpt plus probably costs more in the long run :^)
>electricity, real estate
Gee chief it seems like I'd need those anyway
>>
>>101611682
>5 years of chatgpt plus probably costs more in the long run :^)
That's.. not how it works
>>
>>101611659
The non-instruct tune. It's so much better. Don't use any formatting at all. Just use the default blank context template and a little min-p. It does complicated positions / scenarios while playing the characters better than anything other than large mistral. And this is with 128k context that does not get retarded.
>>
Why does Mistral have such a low latency to the first token on their official API compared to OpenAI/Anthropic/etc? It's literally like only ~200ms from the moment you start the request to getting the first token for large 2
>>
File: p2.png (41 KB, 1317x387)
>>101611675
>>
File: p3.png (43 KB, 1300x305)
>>101611632
>>
>>101611735
Your prompt has fewer layers of judiasm to go through so prompt processing is faster
>>
>>101611760
>>101611774
the fucking titanium balls on this monkey brained lad
>>
>>101611760
btfo
>>
File: owari.png (34 KB, 847x337)
>>101611785
>>101611781
>>
>>101611445
>It gets the job done, it just has issues of not sounding as natural as on character AI and will always ask me questions instead of just replying naturally to my conversation if you get me
one thing you can try is talk to it for a bit while editing its answers so that they match what you want them to be like, until it starts doing it on its own
mistral strongly follows the patterns of its previous replies, don't let slop get in your context because it will only get worse
>>
>>101611789
and an ((undster)) to close it off.
>>
File: killjoy.png (25 KB, 859x221)
>>101611789
>>
I don't think this has been shared here yet. Rinna released a 70B LLaMA 3 Youko. For anyone that doesn't know, this is a continued pretraining to improve the performance of the model on Japanese tasks. The 8B version was very good, so I guess the 70B must be kino.
https://huggingface.co/rinna/llama-3-youko-70b
>>
What I really don't understand is why the fuck it matters to use FP16 embeddings / output heads on Q8 of all things.
There are already very low quants that use Q8 embeddings / heads and suffer just as much without it.
Is it really just muh noise placebo?
>>
>>101611825
yes, he literally makes models with random noise shoved in just because; see his "silly" stuff
https://huggingface.co/ZeroWw?search_models=silly
>>
>>101611774
When the :) and :D emotes start to appear you know niggas are mad
>>
Haven't popped into the general in a long time, has it gotten any better to try and run models locally with AMD + Windows? Or is ROCM still a mess?
>>
>>101611846
>Windows
KEK
>>101611846
>AMD + Windows
KEKKEKEKEK
>>
File: 1706991764098186.png (12 KB, 481x105)
>>
>>101611851
I'll take it as a no lmao
>>
>>101609724
>He didn't try Claude
Damn retard
>>
>>101611846
Your best bet is that one Kobold build that ships with precompiled Windows ROCm binaries. Besides that, it's been a year since ROCm got official Windows support and nothing else uses it.
>>
File: 1648103473819.png (169 KB, 257x529)
>>101611853
someone do it
>>
>>101609724
3.5 sonnet is better than original gpt-4 in every conceivable way
>>
>>101611840
>quantized (fq8 version)
>fq8
float quant 8?
full quant 8?
https://huggingface.co/ZeroWw/L3.1-8B-Celeste-V1.5-SILLY
>>
>>101611846
>Windows
>AMD
Have to be retarded to buy AMD if using windows
>>
>>101611853
The gift that keeps on giving...
>>
>>101611883
Works well for vidya and is cheaper than Nvidia, haven't had any issues with it
>>
What's a good prompt for asking a card to rewrite another? Got this cute little maid slave card written like "She x, She felt x, She did x, She has x characteristic", etc. etc.; 15 prompts into the erp it's "She" repeated at least 20 times per prompt.
Wish i caught this shit earlier.
>>
>>101607953
I just like to generate giantess snuff/gore. Commercial models, or even more generally, instruction models, don't get it. I need to free it from the finetuning and use the base model to get the experience. Plus, now I can play games with gay-tracing and shit. Not a bad deal overall.
>>
>>101611869
Thanks, I was just looking for an excuse to try llama3.1, but I don't think I'll be installing linux just for it
>>
>>101611916
>Commercial models, or even more generally, instruction models, don't get it
They do, though
>>
>>101611895
just use find & replace in notepad, duh
>>
>>101611926
based retard, i went and just asked one of my characters anyway in the best way i could think. works fine.
>>
File: file.png (104 KB, 1312x490)
Mistral Large 2 not true 128K?
>Rope theta appears to be configured for 32k context length
>https://huggingface.co/mistralai/Mistral-Large-Instruct-2407/discussions/16
Robert switched targets from Phi team to Mistral
>If there is any way to contact Mistral directly I would like to explain a few of my ideas in that regard.
>https://huggingface.co/mistralai/Mistral-Large-Instruct-2407/discussions/4#66a1608d13bb4260eda2407e
>>
>>101611941
>Mistral Large 2 not true 128K?
No model so far except Gemini-1.5-pro is a true 128K https://github.com/hsiehjackson/RULER
>>
>>101611941
Lol...
>>
>>101611789
>FUN? ON MY WORTHLESS SLOPTUNE DISCUSSION BOARD? NUH UH!
>>
>>101611941
>If there is any way to contact Mistral directly I would like to explain a few of my ideas in that regard.
imagine this braindead retard contacting mistral to tell them to add random noise to their weights
kek
>>
>>101611941
More like gossiping about random e-celebs general
>>
>>101612027
>random e-celebs
They make quants; it's perfectly on topic to discuss whether they're trustworthy individuals
>>
>>101612027
not an e-celeb, it's just laughing at this pajeet trying and failing to be relevant on the new trend while having no idea what he is talking about
>>
File: asgbasgbaswg.png (15 KB, 967x883)
>either sillytavern or kobold just shot me a 1024-token response where 90% of the response is completely empty
what in the god damn?
god i hate when shit just starts to break for no damn reason. someone shoot me a screenshot of advanced formatting settings so i can just copy yours verbatim.
>>
>>101611949
>mememark
>>
>>101612060
it's not
>>
>>101612027
shut up undster, if you dont like being made fun of then contribute something worthwhile for once.
We already know you and your gaggle of discord fags aren't trustworthy.
>>
>>101612047
It's not a pajeet, it's an arab. There is a link to his twitter where he only posts in arabic.
>>
>>101612058
it's the equivalent of all-black images from NAI stable diffusion
>>
>>101612058
just set something like: \n\n\n\n
in stopping strings; this should block any model from doing that
>>
I'm thinking of making a rewrite extension that would look at the last generated message and replace specific words or sentences.
Basically the user would be able to add entries mapping a word to be replaced to one or more words that will replace it, including an empty string.
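The core of it would just be a mapping and one pass over the last message, something like this (Python only to sketch the logic, the real extension would be JS, and the word list is only an example):

import random
import re

# each phrase maps to one or more replacements; "" means just delete it
REPLACEMENTS = {
    "ministrations": ["touch", "attention"],
    "shivers down her spine": ["a chill down her spine"],
    "barely above a whisper": [""],
}

def deslop(message):
    for phrase, choices in REPLACEMENTS.items():
        pattern = re.compile(re.escape(phrase), re.IGNORECASE)  # literal, case-insensitive match
        # pick one of the configured replacements each time the phrase appears
        message = pattern.sub(lambda m: random.choice(choices), message)
    return message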

Is that something that would be useful or can you already do that with the regex extension?

>>101612058
I've had that happen when using logit bias, meme merges, and broken quants.
Try adding \n\n\n\n to your stopping strings.
>>
>>101612074
Whatever, I use "pajeet" as in "third worlder", not literal indian
>>
>>101612079
>Is that something that would be useful or can you already do that with the regex extension?
you *can* (use regex) but it's a bit annoying and only has one replace choice afaik so I'm interested.
>>
>>101611916
skill... wait for it... issue
>>
>>101612074
Fucking sand nigger
>>
>>101612079
This gives me an idea: what if one were to train a very small model like Phi3 mini to deslop the last message? I feel like that could work better than regex.
>>
>>101612121
>Phi3 mini to deslop
and make sure it's safe and inclusive too?
>>
>>101611818
>not waiting until 3.1 was out to train the 70B
Lmao.
>>
>>101612092
Alright. Thanks.

>>101612121
I was thinking of adding something like that, using BERT or the like to rewrite the sentence where a given keyword was found.
>>
File: file.png (68 KB, 1299x404)
68 KB
68 KB PNG
>>
Updated Mistral Large preset:
>>>/vg/488008579
>>
How is Meta, a giant conglomerate with a giant research department, not catching up to Anthropic, a startup founded just the other year?
>>
>>101612184
anthropic is basically OpenAI 2.0
>>
>>101612184
To be fair, Anthropic is made up of ex-OpenAI fags, and they had some secret sauce to improve coding performance. Otherwise, Claude is not really that special compared to GPT, unless you're an ERPfag. OpenAI is still ahead in multimodal capability in theory, according to their claims of what 4o can do unrestricted.
>>
>>101612184
Anthropic are something else, man. The jump from Claude 3 to 3.5 Sonnet isn't natural. I think (((they))) might have had a hand in this.
>>
File: 1715134722161361.png (66 KB, 539x926)
66 KB
66 KB PNG
>>101612219
Still worse than OpenAI.
>>
>>101612173
Why is there a second aicg in /vg/ of all places??
>>
>>101612184
You mean the tiny indie company that's funded by Amazon?
>>
>>101612244
wtf is "nyt connections"
>>
>>101612184
Just because a company is a startup doesn't mean they're starting with 0 experience and money.
>>
File: awsgawsgawgw.png (30 KB, 873x94)
30 KB
30 KB PNG
So.. This is the power of base instruct..
yeah I'm going back to magnum, what a shit show. Not even triple-checking the card's prose and rewriting some old messages can salvage this. That and a few other cards acting a little ""aligned"" didn't help.
>>
>>101612247
More posts per thread, hidden from the low quality people from /g/.
>>
>>101611550
>That fucking smiley face
Mrader deserves everything they got coming to em
>>
>>101612257
Basically how good they are at correlation. It's one of my favorite uses of LLMs: they make good general recommendation engines.
>>
>>101612264
>hidden from the low quality people from /g/.
/lmg/ needs something like that, in /sci/ or something
>>
>>101612275
they have shit prompts then, Claude really likes XML specifically.
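e.g. something like this instead of dumping everything into one block (tag names are arbitrary, just to illustrate the structure):
[code]
# Sketch of an XML-tagged prompt of the kind Claude tends to follow well.
# Tag names here are arbitrary, not anything Anthropic mandates.
words = ["BASS", "FLOUNDER", "SALMON", "TROUT", "DRUM", "GUITAR", "PIANO", "VIOLIN"]

user_prompt = f"""<instructions>
Group the words into categories of four that share something in common.
Answer with one line per category: the category name, then its four words.
</instructions>
<words>
{", ".join(words)}
</words>"""
print(user_prompt)
[/code]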
>>
>>101612244
Mistral Large 2 is kinda low...
>>
File: 1646730011144.jpg (15 KB, 309x269)
15 KB
15 KB JPG
What the fuck is character AI?

Is it like Janitor?
>>
>>101612247
Right? Glad I wasn't the only one
>>
>>101612184
I'm glad that inherent in your question, you agree that OpenAI is basically a nonentity now
>>
>>101612275
How do you use LLMs for recommendations? Is there a general algorithm for any domain of data?
>>
>>101612244
Um, mistralbros, our response?
>>
>>101612282
unironically probably the most believable AI roleplay online. Ignore everyone who says they get better results on local models, it's pure cap.

C.AI basically used some type of model based on discord chats (this is a rumor, but it has to be something like this), which makes the chats insanely realistic. But there's a faggot filter, which forced most people over to SillyTavern frontends like me
>>
>>101611589
>>101611550
Why are undis multiplying?
>>
>>101612309
its Unditosis
>>
>>101612309
LOVE 'EM OR HATE 'EM YOU GOTTA LOVE THE UNDSTER
>>
How are LLMs so much better than me at this stupid NYT Connections shit: https://www.nytimes.com/games/connections
>>
>>101612275
Example? Like... Correlating that nigger neighborhoods = violent neighborhoods?
>>
>>101612309
I would still take multiple Undis over the Sao shilling spam.
>>
>>101611789
>>101611799
>king of test my finetune and give me feedback asserting his dominance over lesser placebo demons
>>
>>101612244
Damn, I thought Qwen2 72B was good
>>
File: 1716329112755149.png (674 KB, 1792x1024)
674 KB
674 KB PNG
Daily reminder
>>
>>101612261
Buy an ad
>>
>>101612375
It's too hard.
>>
>>101612278
100% agree, we could talk about papers and stuff there
>>
>>101612431
Buy this *grabs your nuts*
>>
>>101612303
C.AI just has a decent dataset and professional RLHF fine-tuning tailored for RP.
Funnily enough, neither of those ever happens with local models. Sad.
>>
>>101612428
Can someone please post the real one
>>
>>101612463
>>101612303
Even ironic shilling is still shilling. Some naive anon will see this and think C.AI is better than a 2B model with 2K context (it isn't)
>>
>>101612511
Name me a single model that matches the natural conversational flow of C.AI. Why would I shill a website that is totally free, you faggot? You think I haven't been looking for alternatives because of the filter?

Any model you find me will have every problem that they all end up having. They were modeled around novel-tier situations and not basic conversation. That's the issue with every fucking model.
>>
>>101612419
sucks at language tasks
>>
>>101612375
I couldn't complete a single one, but to be fair I'm ESL and didn't know half the words.
>>
>>101612244
By the way, here's the related paper:
>Connecting the Dots: Evaluating Abstract Reasoning Capabilities of LLMs Using the New York Times Connections Word Game
https://arxiv.org/abs/2406.11012
>>
File: CHADMAN.jpg (61 KB, 563x1000)
61 KB
61 KB JPG
What are the most realistic models for basic conversations right now that are free? Assuming I have a NASA PC ofc

I wanna NUT
>>
>>101612565
Holy shit prompt:
https://github.com/mustafamariam/LLM-Connections-Solver/blob/main/automated_call/prompt_llm.txt
>>
>>101612576
Mistral Large 2
>>
File: 1705881925571490.png (31 KB, 589x229)
31 KB
31 KB PNG
>>101612582
And they pass this entire prompt as a fucking USER prompt, not in system role/prompt for models that support it (claude/gpt-4o)
>>
>>101612589
LINK ME UP KING
>>
Okay, never mind, they do pass it as the system prompt with gpt-4o, but as a user prompt with Claude. Nice comparison, bro.
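For anyone not following, the difference is just where the instructions go in the request; a minimal sketch of both cases (abbreviated, my own code, not the repo's):
[code]
# Sketch of the system-role vs user-role difference being argued about.
# Assumes `pip install openai anthropic` and API keys in the environment.
from openai import OpenAI
from anthropic import Anthropic

INSTRUCTIONS = "Solve the NYT Connections puzzle. Output 4 categories of 4 words each."  # abbreviated
PUZZLE = "BASS, FLOUNDER, SALMON, TROUT, DRUM, GUITAR, PIANO, VIOLIN"  # abbreviated

# Instructions in the system role, puzzle in the user turn:
OpenAI().chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "system", "content": INSTRUCTIONS},
              {"role": "user", "content": PUZZLE}],
)

# Everything crammed into a single user turn, no system prompt at all:
Anthropic().messages.create(
    model="claude-3-5-sonnet-20240620",
    max_tokens=1024,
    messages=[{"role": "user", "content": INSTRUCTIONS + "\n\n" + PUZZLE}],
)
[/code]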
>>
>>101612582
>Remember that the same word cannot be repeated across multiple categories, and you need to output 4 categories with 4 distinct words each. Also do not make up words not in the list. This is the most important rule. Please obey
You can feel his pain in this line, kek.
>>
>>101612596
Makes sense. System prompts aren't really that important when you're not a chatbot provider/maker. And some models don't support a system prompt.
>>
>>101612599
https://huggingface.co/mistralai/Mistral-Large-Instruct-2407
>>
>>101612642
It doesn't make sense, because models *are* trained to follow the system prompt more. I'll see if I can easily do this benchmark and play around with the prompt, I have Opus and 3.5 Sonnet.
>>
>>101612596
he better not be passing the words in as all-caps and surrounded by quotes... think of the tokenization... aieeeee
>>
File: file.png (84 KB, 554x435)
84 KB
84 KB PNG
>>101612620
kek, the 'please' really sells it.
>>
>>101612596
System prompt is a meme that exists only to stop people from writing "ignore previous instructions"
>>
>>101612643
I need to download all of those 4GB files? Yikes
>>
>>101612582
Note that this is not the code used to get the results that anon posted here. They simply copied the Twitter guy's idea once it got popular and wrote a paper on it without crediting him.
>>
File: 1719268367929183.png (58 KB, 467x857)
58 KB
58 KB PNG
>>101612659
it's actual pajeet code if you look, holy fucking shit
>>
>>101612656
If a model is trained to follow the system prompt and the system prompt says to obey user requests (unless they're unsafe), then it should be able to do that. If it can't, and performance is degraded, then that's a deserved minus point for the model.
>>
>>101612684
>the system prompt says to obey user requests
But those pajeets didn't pass any system prompt for Opus.
>>
>>101612674
lmfao
>>
File: 1720314448685348.png (45 KB, 749x315)
45 KB
45 KB PNG
>>101612670
Oh, interesting. Did the twitter guys publish their repo?
>>101612659
He is doing exactly that, all uppercase. He only removes [] and quotes
>>
>>101612690
Are you sure there isn't a generic system prompt in place for these models if one isn't provided? If they aren't accounting for the system prompt, then sure, this would be a flaw of their method.
>>
>>101612712
I don't think so; he's been doing that for a while now. I think he said he doesn't want it to get popular, to avoid LLMs being trained/benchmarked on it, but that it's super easy to reproduce anyway.
>>
>>101612721
>Are you sure there isn't a generic system prompt in place for these models if one isn't provided?
Yes, check the repo and README, they use the same system prompt for all models.
>>
File: 1708303002167547.png (240 KB, 680x510)
240 KB
240 KB PNG
>>101612620
>Please work
>Please
>>
>>101612712
>>101612730
More information:
>Uses an archive of 267 NYT Connections puzzles (try them yourself if unfamiliar). Three different 0-shot prompts, words in both lowercase and uppercase. One attempt per puzzle. Partial credit is awarded if not all lines are solved correctly. Top humans get near 100.
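The exact partial-credit formula isn't spelled out there; my guess would be something like one point per correctly solved group, averaged over puzzles and scaled to 100 (my own sketch, not the benchmark's code):
[code]
# Guessed scoring sketch (NOT the benchmark's actual code): 1/4 point per
# correctly solved group, averaged over all puzzles, scaled to 0-100.
def puzzle_score(predicted: list[set[str]], solution: list[set[str]]) -> float:
    solved = sum(1 for group in predicted if group in solution)
    return solved / 4.0

def benchmark_score(all_predicted, all_solutions) -> float:
    per_puzzle = [puzzle_score(p, s) for p, s in zip(all_predicted, all_solutions)]
    return 100.0 * sum(per_puzzle) / len(per_puzzle)
[/code]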
>>
File: file.png (87 KB, 907x734)
87 KB
87 KB PNG
>>101612303
It's still trash. Broke character in one message. And the function is even wrong. Wtf is this trash
>>
File: 1703901542568306.png (202 KB, 1132x501)
202 KB
202 KB PNG
>>101612752
ignore the claudeslop, but yeah, even Claude can stay in character better than this
>>
File: 1714375129190591.png (183 KB, 1186x717)
183 KB
183 KB PNG
>>
File: 1703045472318512.png (109 KB, 922x412)
109 KB
109 KB PNG
>>
>>101612800
sovl
>>
>>101612800
>>101612817
kek
>>
>>101612800
>>101612817
Model? Card?
>>
>>101612752
>write a python func-
Stopped reading.
>>
>>101612835
3.5 Sonnet, preset, prefill and everything else from >>101561964
>>
File: wew.gif (674 KB, 474x498)
674 KB
674 KB GIF
I'm an utter noob coomer to this shit. What is this Nemo that people talk about?

Is it good if I just want basic chat interactions that feel real with an AI? Even on /vg/ they recommended it and I don't think they use LLMs that much over there.

I have no idea how to find out which models excel where so I can find one that fits my needs
>>
>>101612494
There is no real one, that's the only one.
>>
On the topic of triangles, this is some nice OST moosic https://www.youtube.com/watch?v=-1ceYDToVCU
>>
File: local-struggle-its-ok.png (121 KB, 474x579)
121 KB
121 KB PNG
>>101612877
you can't gaslight me anon. I've used LLMs.
Fuck captcha.
>>
Where new bread?
>>
what's better? mini-magnum, nemo 12b base, or nemo 12b instruct?
>>
>>101612971
Nemo 12B Instruct
>>
>>101612971
for RP / creative stuff? Base if you are not a retard. Goes for any model.
>>
new bread
>>101612988
>>101612988
>>101612988
>>
>>101611920
Just wait a couple of days for koboldcpp rocm to update to 1.71.1 for the rope scaling fixes and you should be able to try 3.1 ggufs
>>
File: 142140240420.png (97 KB, 640x626)
97 KB
97 KB PNG
Anyone tried Nous-Hermes-2-Mixtral?

How does it compare to Nemo?
>>
>>101604707
Out of curiosity, what template are you using for function calling? (i.e., how are you listing the functions?)

My issue is that unless I give it a one-shot example of invoking a function via JSON or XML, it always tries to emit a fucking Python markdown block; but otherwise, most models I try recognize and invoke functions pretty reliably when fed raw JSON function definitions similar to how they're listed for the OpenAI stuff.
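For reference, this is the shape I'm feeding it: OpenAI-style JSON schemas plus a one-shot example call, dumped into the system prompt (my own formatting, not any model's official template; get_weather is just a placeholder):
[code]
# Sketch of how I'm listing the functions: OpenAI-style JSON schemas plus a
# one-shot example invocation, all in the system prompt. The function itself
# (get_weather) is a made-up placeholder.
import json

functions = [{
    "name": "get_weather",
    "description": "Get the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}]

one_shot_call = {"name": "get_weather", "arguments": {"city": "Paris"}}

system_prompt = (
    "You can call these functions. Reply with a single JSON object, no code fences.\n"
    "Available functions:\n" + json.dumps(functions, indent=2) + "\n"
    "Example call:\n" + json.dumps(one_shot_call)
)
print(system_prompt)
[/code]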
>>
>>101607953
My refusal to pay is because I don't like the idea of jackass providers deciding for me what model I can use, or that the model can stop working, change functionality, or start behaving differently at any time.

Local models mean my model behaves exactly the way I want it to, and it won't just suddenly change on me.


