/g/ - Technology
File: 1701586351737913.png (1.45 MB, 1202x1400)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>101891613 & >>101880989

►News
>(08/14) Nvidia pruned Nemotron-4 15B to 4B: https://hf.co/nvidia/Nemotron-4-Minitron-4B-Base
>(08/12) Falcon Mamba 7B model from TII UAE: https://hf.co/tiiuae/falcon-mamba-7b
>(08/09) Qwen large audio-input language models: https://hf.co/Qwen/Qwen2-Audio-7B-Instruct
>(08/07) LG AI releases Korean bilingual model: https://hf.co/LGAI-EXAONE/EXAONE-3.0-7.8B-Instruct
>(08/05) vLLM GGUF loading support merged: https://github.com/vllm-project/vllm/pull/5191

►News Archive: https://rentry.org/lmg-news-archive
►FAQ: https://wikia.schneedc.com
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Programming: https://hf.co/spaces/bigcode/bigcode-models-leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
File: img_1.jpg (324 KB, 1360x768)
►Recent Highlights from the Previous Thread: >>101891613

--Paper: Double Sparsity technique for efficient sparse attention in large language models: >>101898492 >>101898993
--Papers: >>101898909
--Offloading and quantization explained for kobold users: >>101895874 >>101895973 >>101896033 >>101896136
--Mistral Large Q8 recapbot performance impresses Anon: >>101901069
--Minitron and Nemotron models, MEGA_MIND 24b CyberSeries, and language model compression techniques: >>101892736 >>101892803 >>101892837 >>101892969 >>101892998 >>101893181 >>101893262 >>101893355 >>101893301 >>101893370
--Looking for long context models with more than 8b parameters: >>101899425 >>101899454 >>101899529
--Anons share experiences with language models and innuendo prompts: >>101891983 >>101892002 >>101892389 >>101892604 >>101898076
--Anon questions Anthracite's transparency on data and methodology: >>101899276 >>101899385 >>101899414 >>101899444 >>101899475
--Anon discusses Mistral-Nemo tune's performance and creative capabilities: >>101897888 >>101897976 >>101898126 >>101897995
--Weighted quants are better but more finicky than static quants: >>101899517 >>101899795
--Ooba and koboldcpp performance difference discussion: >>101893771 >>101893851 >>101893901 >>101894306 >>101894371
--Anon trashes tess 12b for being repetitive and low-quality: >>101898032 >>101898096 >>101898129 >>101898336 >>101898505
--Anon recommends Mistral Large IQ_2M over 70B models: >>101895400 >>101896943
--Anon asks for translation model recommendations and learns about few-shot translations: >>101901211 >>101901243 >>101901355 >>101901383 >>101901451 >>101901876 >>101901898
--Anon asks about custom formatting vs chat completion for roleplay: >>101899956
--Advances for /ldg/ vramlets: GGUF format and flux loras on 3060 12GB: >>101899377
--Miku (free space): >>101901469

►Recent Highlight Posts from the Previous Thread: >>101891620
>>
>>101902149
add /ldg/ to the image
>>
>>101902172
I was going to and I will, it's 8 months old already.
>>
What's the SOTA 70B model that can do porn
>>
what's the point of having both a stable diffusion general and a local diffusion?
>>
File: Comparison_all_quants6.jpg (3.84 MB, 7961x2897)
Another win for the GGUF chads, their format also works on imagegen now
https://reddit.com/r/StableDiffusion/comments/1eso216/comparison_all_quants_we_have_so_far/
>>
what's the point of having both an ai chatbot general and a local chatbot?
>>
>>101902184
stable diffusion general is infested with blog posters so they are attached to the name
>>
>>101902193
imagine it in exl2, the s-tier format
>>
File: 1674868493025630.webm (2.99 MB, 1200x674)
>>101902195
aicg is for people who are fine with corpos reading everything they type and using a service that can be cancelled at any time
lmg is for people who want to coom in peace and want to be able to coom in peace for the rest of their lives
>>
Miku sex
>>
File: 1705027318524736.png (3.13 MB, 1536x1280)
Today is officially AGI day.
All of you who doubted strawberry will apologize in the coming hours.
Not because you are forced to.
Not because you feel shame for being wrong.
You will apologize because the force of awe when gazing upon it will compel you to do nothing less than violently and publicly shed the skin of your past skepticism and step into the future with clear eyes.

Get ready. I will await your kneels.
>>
can someone explain the difference between KTO and DPO?
>>
>>101902264
it's so over for localchuds
>>
>>101902195
/aicg/ existed to give birth to /lmg/, regretfully without dying while at it.
And now it serves as a source of stolen API access for the retarded local finetuners creating datasets from the proxy logs, doesn't it?
>>
>>101902264
i just had strawberry oatmeal unprompted, singularity confirmed?
>>
>>101902329
that general will live on until there is a local model that's as good as claude
>>
>>101902381
It will live past that. No chance any of them have a rig that could run a model that good.
>>
File: Gollum999.jpg (52 KB, 1200x503)
>>101902329
We still love Smeagol's cave, precious. We often still visits it, we does.
>>
>>101902381
what if said model is 500b
>>
File: strawberries.png (48 KB, 798x354)
>>
>>101902381
Clever way to say "forever", not specifying which Claude model is the target.
>>
>>101902397
As long as it's MoE with 100B or less active, it'll do. We'll just have to stack DDR5.
>>
>>101902195
One focuses on a single popular use case, the other is for all of them. There should not be an aicg on /g/.
>>
>>101900296
What bugs me the most is that they're not a 1-2 people group, but 35 fucking retards (as of today) with access to these supposedly secret datasets.
>>
>>101901211
thats a man
>>
>>101902149
huh didn't know about other generals
tell me about
>/aitg/
>/aids/
>/vsg/
>>
>Anthrashite
>>
>>101902487
/aitg/ was a one-off attempt at "ai tools general"
/aids/ is like /aicg/ but much older and sucking a single tit
/vsg/ won't return until there's an actual breakthrough with TTS, maybe when we finally get Moshi sources?
There's also the ai music general to be added, and /ldg/ like anon mentioned. Tell me if I missed anything for the update.
There's also an easter egg that wasn't noticed when I posted it originally.
>>
>>101902414
why does nobody do moe anymore anyway? mixtral 8x7b punched above its weight so hard when it launched, I thought it was a new paradigm
does it just not scale? was 8x22b that bad?
>>
>>101902559
Most vramlets can't run it and now have smaller models that are decent and people that can run it can also run 70b, largestral, or command r+

It was great for its time, though. Unfortunately, mistral's support and documentation on properly fine-tuning the thing were non-existent. Fine-tuning it was probably difficult because the stupid thing was overcooked as fuck.
>>
>>101902193
man local is on a roll everywhere recently, thats really cool. guess only tts is left in the dust lol. especially fp8 vs Q8_0 looks good. i heard people complain about the quality loss with fp8. seems much less severe.
https://github.com/city96/ComfyUI-GGUF
does this support multi gpu and cpu offload? or do i need 13gb on 1 gpu for Q5.
>>
>>101900504
Why would you fill out system message instead of system prompt with <|im_start|>system? System prompt is for the actual prompt, system message is for slash commands.
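For reference, the bare ChatML layout looks like this (placeholder text; the system block at the top is where the actual prompt goes):

```
<|im_start|>system
You are {{char}}. Stay in character.<|im_end|>
<|im_start|>user
Hi.<|im_end|>
<|im_start|>assistant
```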
>>
>>101902666
>does this support multi gpu and cpu offload? or do i need 13gb on 1 gpu for Q5.
You don't magically get layer splitting support just because of the file format. But it's a start. Maybe they finally get a clue. There's also stablediffusion.cpp but it moves much slower.
>>
>>101902737
i saw that people could split the text encoder and put that on another gpu as a workaround. does that still work? sorry for the brainlet questions.
>>
>>101902559
>why does nobody do moe anymore anyway?
There was that one guy constantly complaining about moes and demanding dense models. He crossposted about it on /lmg/ and r*ddit. Basically they got bullied out of training them.
>>
>>101902794
Even then. The encoder and VAE are not that big compared to the denoiser (as far as i know), so even if you could split them off, it only gives you a small margin. llama.cpp, which is the source of the gguf files, can split the model almost arbitrarily to multiple gpus or pcs/nodes. That's what image diffusion engines need. Being able to split the individual parts of the model. But i know next to nothing about diffusion models. I don't know if it's never been done because of the architecture or because it was never as needed as now. Maybe flux finally starts getting imagen devs off their asses.
>>
>>101902898
>There was that one guy constantly complaining about moes and demanding dense models. He crossposted about it on /lmg/ and r*ddit. Basically they got bullied out of training them.
You're blaming it all on one guy?
I feel it's more like mixtral left a sour taste in everyone's mouth because it couldn't be finetuned for shit, at least at the beginning. Then everyone moved on.
>>
>>101902909
>Maybe flux finally starts getting imagen devs off their asses.
Hopefully so, there have been issues open since ages ago. That SD was improved over the months so it doesn't need much vram anymore wasn't necessarily a bad thing, though.
Thanks for the reply anon, appreciated.
>>
Turboderp seems to be working on something that makes Exl2 at least 40% faster with no vram overhead. Early version on exl git already.
>>
>>101903001
I don't trust it. Nothing good ever happens.
>>
>>101903001
Parallelism? This is good news if so.
>>
>>101902927
It's probably also because we don't really need more Mixtral tunes. LimaRP-ZLOSS is the fucking bomb for ERP. Noromaid is also solid, and you can take your pick of either Dolphin or Nous-Hermes for coding/assist. Dolphin is a bit less censored, while Nous is the smarter of the two, although neither of them are really intelligent compared to what I'm seeing from Yi now.

We're fucking back, boys.
>>
So what’s the best ~20B model right now? Any opinions? I’ve tried both Rose and Psyonic-Cetacean a little, they seem good but I wouldn’t know how they compare. I’ve heard of 2x10.7B EroSumika a couple times. Is that worth it, has anyone tried it? Or any other models of around that size worth keeping an eye on?

In case it’s relevant, I’m looking specifically for Koboldcpp, GGUF, Q4~5 quantization.
>>
>>101903281
https://huggingface.co/bartowski/UNA-ThePitbull-21.4B-v2-GGUF

I just downloaded this. I admit I haven't tried it yet, but the Beagle 7b was so good it scared me, and I'm still using it atm.
>>
>>101903281
https://huggingface.co/internlm/internlm2_5-20b-chat-gguf
>>
XD
>>
>>101903323
ye
>>
>>101902404
how do you do fellow kids vibes.
>>
ONLY A FEW MORE HOURS UNTIL THE CLOCK IS BROKEN BY THE OWL
YELLOW STARS MOVE THROUGH OUR LAND
IT IS TIME
TRUST THE PLAN
>>
>>101903323
best post itt right now
>>
>>101903483
trust this

*unzips AGI*
>>
File: file.png (132 KB, 1502x601)
Trying to get midnight miqu and wondering, which of these is the one to download...?

How do you guys identify the right model?
>>
>>101903507
it's the 10 out of 15 one, ignore the others
>>
>>101902487
/aids/ is the oldest llm thread on 4chan, since it was for AI dungeon which came out before chatgpt and even before gpt-3 existed (you might remember it as /aidg/)
>>
>>101902404
>Feb 6, 2020
what did he know? he was out of openai for 2 years by that point
>>
>>101903323
>XD
>>
:3
>>
>>101903304
What's the max context for these UNA models? None of the cards seem to mention it.
>>
>>101903507
If you're going to quant it yourself, you need the whole repo. all safetensors, config.json, tokenizer*. The whole thing... If you don't know what you're doing, just look for an already quantized model that fits on your gpu with some extra space to spare for the context. If you want to learn, download a small model, read the documentation for whatever you're using, learn to quantize it, then download miqu.
>>
>>101903550
max_position_embeddings in config.json:
>https://huggingface.co/fblgit/UNA-TheBeagle-7b-v1/blob/main/config.json#L12
32k. But never take those at face value; the usable context is typically much lower. It's just the theoretical context length.
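Checking it yourself is a couple of lines, assuming you have the repo's config.json locally:

```python
import json

with open("config.json") as f:
    print(json.load(f)["max_position_embeddings"])  # 32768 for this one
```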
>>
>>101903550
No one knows. Sloptunes have so many models mixed into them they're like a mystery stew. just guess and hope for the best.
>>
>>101903550
4k
>>
>>101903312
More people need to be talking about this.
>>
>>101903523
It's sad to see the current state of /aids/, a true fall from grace.
>>
>>101903561
Oh...
I def. dunno what I'm doing.
>>101903520
you ALMOST got me anon
>>
>>101903636
>I def. dunno what I'm doing.
Fair enough.
You have two options. Assuming you're using llama.cpp or kobold.cpp.
1. Download a ready-made gguf that fits in your gpu with some room to spare (or bigger if you are willing to spill to cpu. it will be much slower). These are probably fine:
>https://huggingface.co/mradermacher/Midnight-Miqu-70B-v1.5-GGUF
Of those you just need the one you want to use. A single file.
2. Convert and quant yourself:
Download the whole repo (huggingface-cli or git and ln -s. first one being the easier option)
clone llama.cpp
make
install llama.cpp's requirements in a python venv
activate the venv
./convert_hf_to_gguf.py path/to/model/dir/ (just the dir, not the files)
llama-quantize path/to/model/dir/*gguf Q5_K (or whatever quant you want)
then load the resulting file.
Spend a few minutes reading llama.cpp's readme to know the compile flags and how it works.

I have no idea how other inference programs work.
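If the CLI is a pain, both options can also be done from python with huggingface_hub. The gguf filename below is a guess and the unquantized repo id is an example, check the actual repos:

```python
from huggingface_hub import hf_hub_download, snapshot_download

# Option 1: grab a single ready-made quant.
# The filename is a guess, copy the exact name from the repo's file list.
gguf_path = hf_hub_download(
    repo_id="mradermacher/Midnight-Miqu-70B-v1.5-GGUF",
    filename="Midnight-Miqu-70B-v1.5.i1-Q5_K_M.gguf",  # example name
)

# Option 2: pull the whole source repo for DIY conversion
# (repo id is an example, point it at the actual unquantized model).
model_dir = snapshot_download(
    "sophosympatheia/Midnight-Miqu-70B-v1.5",
    local_dir="Midnight-Miqu-70B-v1.5",
)
```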
>>
>>101903734
>"you just need to jump three times from east to west, say the magic spell and invoke a few names of the ancient spirits"
And people say magic doesn't exist.
>>
>>101903921
Once you have the ingredients you only need the incantation. And it only rarely spawns daemons, so it's mostly safe.
>>
>>101903503
A Girthy Intrusion indeed
>>
Death to /lmg/.
>>
>>101902183
Please answer
>>
>>101904288
Quiet is nicer.

>>101904361
Not SOTA, but miqu and midnight miqu come up often in the 70b range. Try those.
>>
>>101904288
Strawberry will kill it. Of what use is local when sama has grown a god? $1,000/month is a bargain, even if I have to sell all my 3090s to pay it.
>>
>>101903225
you have very low standards anon
>>
>>101904395
go back
>>
>>101903225
>Noromaid
>Nous-Hermes for coding/assist
>Yi really intelligent
Go back you fucking tourist it is not 2023 any.... never mind you are just baiting probably.
>>
>>101902183
>>101904361
>70B
if you're still looking for models that small you're ngmi, sorry
>>
How many 3090's do you have? Or 4090... you know what I mean.
>>
>>101904663
0.5
>>
>>101904663
Not enough
>>
>>101904398
Fuck you. Whenever I get responses from people like this, there's never anything except those single line dismissals; no logs, no nothing.
>>
>>101904503
Yes, because L3 and Gemma are really such massive, revolutionary upgrades.
>>
>>101903225
>>101904795
If it's any consolation, I thought that looked pretty good.
It's been a while since I've last used mixtral 8x7b, but I don't remember ever being disappointed with it for normal RP.
Also post Nala.
>>
How do I ensure that my model doesn't take over the entire scenario and start roleplaying as me?
Straight up instructing it to not do so doesn't seem to work.
>>
What is dpop
>>
>>101904956
Would be easier for you to show us what you are doing so that we can tell you where you went wrong.
There's no silver bullet.
Post
>backend/loader settings
>front end settings (samplers, instruct template, system prompt, character card, etc)
>Exact model and quant
>Chat log with at least 5 messages
That should give us an idea.
For more generic advice, I'd tell you to reset all your sampler settings, set your temp to 0.7~ish and minP to 0.05.
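And if you're wondering what minP actually does: it's just a cutoff relative to the top token's probability. A toy sketch, not any backend's exact code:

```python
import numpy as np

def min_p_filter(probs: np.ndarray, min_p: float = 0.05) -> np.ndarray:
    """Zero out tokens below min_p * max(probs), then renormalize."""
    filtered = np.where(probs >= min_p * probs.max(), probs, 0.0)
    return filtered / filtered.sum()

probs = np.array([0.70, 0.20, 0.06, 0.03, 0.01])
print(min_p_filter(probs))  # 0.03 and 0.01 get cut (below 0.05 * 0.70 = 0.035)
```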
>>
File: ComfyUI_00010_.png (568 KB, 1024x1024)
Flux makes a very dall-e-looking Migu, and does a good job with English text.
>>
>>101904795
Not him, but Mixtral 8x7b is quite old at this point
I used it for the longest time but recently tried Nemo, specifically mini-magnum, for RP and it's a little smarter, and usually has more natural-sounding dialogue, at least compared to Mixtral quants you can actually fit on a 24GB card.
>>101904956
First thing to try is
>Don't ever speak or act for {{user}}
in the character card, seems to work more often than putting it in the system prompt.
Whenever it happens, make sure you swipe that shit right away, if you ignore it even once it'll continue playing your part.
>>
>>101904956
There's no reliable way. Some people add stuff like 'only narrate the actions of {{char}}'. You can also add stop strings (reverse prompts, whatever your thing calls it) to stop generation at certain keywords, but it breaks the flow.
If you want absolute control, write a book yourself. If you want to be told a story, read a book. You're somewhere in between. That's the hardest place to be if you don't want to roll with the punches.
>>
>>101904721
I'm personally uncertain about needing another one. With three 3090s, it's already running slowly at 10T/s. Until a method for improving inference speed is found, I won't be buying a fourth one
>>
>>101905009
You know what? You're absolutely right.
I think it's time for me to stop cooming and start learning what I'm actually doing.
>>101905016
I'll keep that in mind, thanks!
>>
>>101903225
>Sucking his cock AS she rubs her nose against his balls AS she sneaks a lick at his asshole
It's been a while since I've seen spatial awareness at this level of dogshit, it's kind of nostalgic.
>>
>>101905027
Any cpu offloading slashes the inference speed dramatically, even a single MB in regular ram. You'll probably get much better speeds if it's all on GPU.
>>
>>101905027
>10t/s
That's not bad at all. I feel like as long as it's over 6/s it's usable for regular chatting.
>>
>>101905143
Pretty much.
As far as I'm concerned, if you can get around that on a full context, you are good to go, so splitting the model between ram and vram to use, say, a higher quant or a longer context (within reason) is well worth the speed sacrifice.
>>
>>101905143
I typically look ahead to see if I like the response, then roll for another option before reading every single word
>>101905115
>>101905192
I never offload, it's already too fucking slow.
>>
>>101905027
with four cards you can use tensor parallelism, which is a major speed improvement. I get 31 tokens/s with four 3090s on the 72B 6bpw in aphrodite, and 25 t/s on the 123B 4.25bpw
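not the same backend, but for reference the vLLM python API (which aphrodite is built on, iirc) exposes the same knob. model name here is just an example:

```python
from vllm import LLM, SamplingParams

# shards the weights across 4 visible GPUs (tensor parallelism)
llm = LLM(model="Qwen/Qwen2-72B-Instruct", tensor_parallel_size=4)

out = llm.generate(["The quick brown fox"], SamplingParams(max_tokens=32))
print(out[0].outputs[0].text)
```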
>>
>>101905192
This is annoying, it doesn't take into account memory speed/bandwidth at all. Is this just assuming VRAM and RAM will be the same speed and giving that speedup?
>>
meds
>>
>>101905487
The speed loss between different ram speeds is marginal compared to full gpu. It's not worth making a distinction.
>>
File: offload_x_performance.png (96 KB, 1536x1152)
>>101905487
Pretty sure that's abstracting the details away by focusing on the final result in tk/s. So if your whole platform can do 1x throughput on CPU and 2x throughput on GPU then the blue line applies, for example.
>>
>>101905446
Do I need NVLink for that if each card is in PCIe 3.0 x16?
>>
>>101905487
"CPU" and "GPU" in this context are meant as the backend for computation so they are meant to represent CPU+RAM vs. the entire GPU.
And for Amdahl's law the specific hardware is not relevant anyways, all that matters is that you have two separate backends that run sequentially.
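For anyone who wants the formula: if a fraction f of the work runs on a backend that is s times faster, the overall speedup is 1 / ((1 - f) + f / s). Quick sanity check:

```python
def amdahl_speedup(f: float, s: float) -> float:
    """Overall speedup when a fraction f of the work runs s times faster."""
    return 1.0 / ((1.0 - f) + f / s)

# assuming the GPU backend is 10x the CPU backend:
for f in (0.5, 0.8, 0.95, 1.0):
    print(f"{f:.0%} offloaded -> {amdahl_speedup(f, 10.0):.2f}x")
# 50% -> 1.82x, 80% -> 3.57x, 95% -> 6.90x, 100% -> 10.00x
# which is why even a tiny slice left on CPU costs so much
```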
>>
reminder that anyone who mentions "magnum" or "anthracite" is a shill
>>
>>101905522
That's not true. With multichannel (quad, octa,) RAM you can get tons of bandwidth compared to the usual dual channel.

>>101905550
mini-magnum 12B is pretty good. Better than magnum v2 on my testing.
I'm yet to try the kto that was posted last thread.
>>
>>101905550
Is the best way to shill a finetune to not mention it at all? I feel like any finetune that gets mentioned gets sharted into the ground just on principle at this point.
>>
>>101905550
any talk about finetunes, really
all useless
>>
>>101905559
>That's not true. With multichannel (quad, octa,) RAM you can get tons of bandwidth compared to the usual dual channel.
I repeat:
>The speed loss between different ram speeds is marginal compared to full gpu. It's not worth making a distinction.
At that point just take bandwidth into account. No need for a graph.

Also, don't respond to schizos.
>>
>>101904795
fuck me? for what? what did i do? i didn't make you enjoy a retarded model because you don't know any better.
could tell it was mixtral without you ever saying it btw due to the fact that every single flavor of mixtral LOVES screeching {{char}}'s name autistically any time anything happens. loves grabbing your ass as well in every situation. somehow her nose is rubbing against your balls while she's licking your asshole as if that's even possible. asks you to "cum inside her" like every other mixtral flavor while giving a blowjob. it's RETARDED. my bad i guess? i should just pretend it's good.
>>
>>101905592
>Also, don't respond to schizos.
Fair enough.
>>
>>101905559
>That's not true. With multichannel (quad, octa,) RAM you can get tons of bandwidth compared to the usual dual channel.
llama.cpp/ggml is not properly taking advantage of 4/8 memory channels though due to NUMA issues.
You will still get a speedup but it won't be 2x/4x faster than dual-channel memory.
>>
>>101905534
I'm using PCIe 3.0 x16 myself, with the patched driver from here https://github.com/tinygrad/open-gpu-kernel-modules
>>
>>101905623
Ah yes, I'm aware. I was talking more from a raw numbers standpoint.
A GPU is still a lot faster in practice, but you can get CPU-RAM bandwidth really high was my point.
>>
Popcorn ready to laugh at Strawberry chuds when they get their third nothingburger in a row
>>
>>101905522
More bandwidth gives large returns, and it's more about having a baseline. Like, 8x the speed relative to what? Dual channel DDR4? Octo channel DDR4 with a CPU like a Xeon that's way better at handling a ton of memory, regardless of its base performance/speed? Just a set of base specs for reference would be good.

The fact that the M(X) Macs can produce good results with their RAM is proof enough that memory is the main bottleneck.

>>101905535
I may simply be dumb, but is Amdahl's law really the best thing to apply here, then? It doesn't seem like it paints an image that's very useful as a metric. But I'm very open to hearing why that's not the case, I just want to understand better.
>>
>>101905664
Why wouldn't Amdahl's law apply here?
You have a system that runs sequentially and you can accelerate part of it by moving the computation to the GPU.
And empirically the speedup you get closely matches the expected progression, see >>101905532 .
>>
>>101905704
Mmm, I see. That makes sense.
>>
Christ, I finally get the "shivers down your spine" meme. I must have always used finetunes that got rid of that phrase. Using base nemo instruct now and it's there in every single story.
>>
>>101905733
Tell me of those magical finetunes that get rid of all the pre-baked phrases and link your kofi while you're at it.
>>
>>101905733
It'll only get worse on that front as GPTslop invades more datasets. I feel like there's gonna be a whole hunk of phrases that just get forcibly excised from usage altogether because everyone is so sick of them, depending on how much AI takes off in the literary space.
>>
>https://x.com/nisten/status/1823143557265318139

WAHAHAHAHAHAHAHAHTHERE'S YOUR AGI FAGGOTS
>>
>>101905751
>So fucking schizo about shills that people can't even mention that they don't like shivers without you flipping your shit
C'mon.
>>
>>101905664
Octo ddr4, 5, or whatever is still nothing compared to vram. It's faster than dual channel, but still. If you want a real performance graph you'd end up with an N-dimensional array of every possible hardware combination or anecdotal info from anons who may or may not have their system optimally set up. No graph will ever satisfy you. You cannot get exact expected speeds until you actually measure on your device. The graph is still a good reference point.
>>
>ollama run gemma2:2b
>it runs fast
>download q_4 gemma 2b gguf
>put it in oobabooga models folder
>load using llama cpp, offloading all to gpu
>it runs slow

what did they mean by this
>>
File: file.png (17 KB, 448x134)
>>101905792
did u check this?
>>
>>101905792
Outdated llama.cpp, poor compile options. Remove middlemen.
>>
>>101905535
>And for Amdahl's law the specific hardware is not relevant anyways
It is. For iGPUs, APUs, or even sometimes GPUs, the actual speed is hard to estimate. Your CPU can be 10x slower than your GPU, but when you run them together, the speed boost or freq/power throttling may come into play, so 2+2 doesn't always equal 4, if you know what I mean.
That graph is a good estimation but it's not perfect. Just like Ohm's law: in theory it's perfectly applicable and perfectly linear, but in practice it's definitely not.
>>
File: IMG_20240815_231354.png (96 KB, 740x325)
>>101905628
Thank you, perhaps I do need a 4th 3090. I'll try to run tensor parallelism using just two cards to see if it works at all. My current PSU seems sufficient if I limit each card to 250 watts.
>>
>>101905808
Kill middlemen. Behead middlemen. Roundhouse kick a middleman into the concrete. Slam dunk a middleman baby into the trashcan. Crucify filthy middlemen. Defecate in a middlemen food. Launch middlemen into the sun. Stir fry middlemen in a wok. Toss middlemen into active volcanoes. Urinate into a middlemen gas tank. Judo throw middlemen into a wood chipper. Twist middlemen heads off. Report middlemen to the IRS. Karate chop middlemen in half. Curb stomp pregnant black middlemen. Trap middlemen in quicksand. Crush middlemen in the trash compactor. Liquefy middlemen in a vat of acid. Eat middlemen. Dissect middlemen. Exterminate middlemen in the gas chamber. Stomp middleman skulls with steel toed boots. Cremate middlemen in the oven. Lobotomize middlemen. Mandatory abortions for middlemen. Grind middleman fetuses in the garbage disposal. Drown middlemen in fried chicken grease. Vaporize middlemen with a ray gun. Kick old middlemen down the stairs. Feed middlemen to alligators. Slice middlemen with a katana.
>>
>>101905704
>How is this possible? Is there something I don't understand about llama.cpp that makes it always convert to fp16 before it does quantization? Am I wasting time using FP32/BF16??
https://huggingface.co/posts/bartowski/608656345183499
>>
>>101905775
Even if this is the case, polls indicate that at least half of the people think it's human.
Can you comprehend the consequences of tech like this coming out or do I have to spell it out for you?
>>
>>101905978
You're asking to kill 70% of humanity
>>
>>101906103
>polls
[citation needed]
The very few people that even know of the thing are all involved in AI as users or researchers. Most of humanity didn't care. And your reported half of the people (of a very tiny fraction of the population already) are retards, big surprise. Absolutely nothing changed.
I don't even know why i reply to this shit.
>>
Vanilla llama 3.1 8b > vanilla mistral 7b > gemmasutra 2b > mythomist
>>
File: file.png (23 KB, 592x353)
>>101906246
>[citation needed]
https://x.com/iruletheworldmo/status/1823088371146596734
>>
>>101906276
all of 2.5k people? oh. my. god.
I didn't respond to that shit. Even people that follow that shit didn't reply. You're a retard.
>>
>>101906276
its clearly some pajeet
>>
>>101906307
>ask for a citation
>receive citation
>"HURF DURF UR A RETARD"
lmao
>>
>>101906307
Yep. Anyone who even opens tweets by that mother fucker is already a retard, and someone who interacts with polls by the guy is beyond redemption. It's so mind stunningly retarded.
>>
>>101906323
two point five thousand people believe it's real!
That's nothing. Your mom could suck that many dicks in an afternoon.
>>
File: 1662619292281081.png (66 KB, 200x200)
>>101906356
Why are you so upset?
>>
>>101906367
>no one ever pretended to be angry on the internet just to fuck with people
>>
File: 1705520304011206.jpg (221 KB, 1280x1111)
>>101906426
Fair enough
>>
>>101906309
What if it is a model trained purely on pajeet generated data? You can probably make a 100% accurate model for that now.
>>
>>101906103
>hype people up
>people who don't get hyped leave
>hyped people stay
>make a poll seen by mostly hyped people
Manipulating people and polls through social media isn't anything revolutionary?
>>
strawberry is coming, you're gonna eat your words soon
of all days, of all times, to start shitting on it out of nowhere... how silly to choose right before the reveal
>>
>>101906356
>>101906338
There's about 10,000,000,000,000,000 bugs on this planet that eat poop every day.
Does that mean we should do that too, cos that number is quite overwhelming desu ...
>>
I am going to laugh so hard at the naysayers.
Of course they'll try to downplay it with "b-b-b-but it's not actual AGI!", ignoring the fact that strbman already said multiple times that this is going to be a precursor to AGI.
>>
the strawberry is near
>>
>>101906578
It's gonna be another gpt4 quant.
>>
>>101906564
Alright. Let's all eat shit like the one village idiot. He must be onto something...
>>
strobby
>>
File: strobby.png (265 KB, 640x558)
>>
File: 1723733749118095.png (3 KB, 360x30)
Damn this is from Claude Opus. How do you even train a model to do this?
>>
>>101906675
I dont get it
>>
I've been fucking around with keeping track of variables within prompts and I must say that it makes things much, much more coherent.
Are there any existing projects that keep track of variables without the user seeing them?
>>
>>101906688
Purposely insert typos in the datasets in the hope (heh) that it can detect them better during inference.
>>
File: file.png (711 KB, 1248x1080)
>>101906692
>>
File: OIG (9).jpg (121 KB, 1024x1024)
>>101906688
>WAOW A HECKIN' STARWARS REFERENCE
>SHUT UP AND TAKE MY MONEY!
>>
File: 1713023592589060.png (29 KB, 1174x372)
>>101906688
Just train your model on the entire internet, including all the loli smut and unhinged fanfiction. Sadly, nobody but Anthropic has the balls to go all-out.
>>
It's interesting because I remember Bart explaining exactly that about the fp32 to bf16 conversion in the past.
Weird that he'd be confused about that now.
Did he simply forget?
Regardless, screenshotting the explanation for posterity.
>>
>>101906730
Why don't they just... ban the Claudebot? Have they ever heard of norobots?
>>
>>101906754
robots.txt*
norobots is the huggingface dataset lol
>>
>>101906754
>robots
That's just a suggestion that the bot has to abide to iirc, not some anti-crawling protection.
That said, if they can identify the origin, they could ban it,
>>
>>101906754
by the time they do, anthropic has scrapped everything
>>
File: Untitled.jpg (89 KB, 358x941)
>>101906701
i have a small st addon i've been working on that reads lorebook data you enter as part of dropdowns for some basic info to inject, but it doesn't read anything back from the prompt. that was going to be my next test, see if i could inject a hitpoints number or something and give the ai a specific instruction for it, but also do it as a specific string the javascript could read back and update from. i'd be interested to see how others do reading back data that the ai processes, i'm guessing its doable but will probably require some effort in reading it because small models are retarded and when you expect 20 minus 5 hp back, it'll probably give me the letter J
>>
>>101906775
norobots dataset was fucking garbage btw. Riddled with mistakes.
>>
>>101906754
robots.txt only works on an honor basis. Anyone can just ignore it. And i doubt you can just ban a single ip and stop from being scraped.
>>
>>101906730
It's real, claudebot was scraping my little site nonstop, and it's in a country they don't even support.
>>
>>101906783
>>101906775
Also, there's a good chance that the origin varies a lot, otherwise their ddos protection would probably have stopped it from spamming the site with requests, right?
>>
>>101906803
based anthropic
>>
>>101906701
A much simpler idea than >>101906787's implementation :
>https://github.com/ThiagoRibas-dev/SillyTavern-State/
That one will depend on your model's coherence to begin with, but it works decently well in my experience. Just don't overdo it by trying to track the whole universe using thousands of prompts.
>>
I haven't been here in a while. What's the current SOTA on story writing models and workflows?
>>
>>101906839
And by simpler I don't mean better. Anon's extension is looking spiffy as shit and I can't wait for him to release it.
>>
>>101906787
Wow, that's exactly what I had in mind!
You could just leave calculations and such up to the wrapper/addon instead of hoping the model knows what 2 plus 2 is.
I've been allowing it to change the location of characters while following instructions of where they can move from which location and it seems to actually work.
You can't go directly to a store from your home, for example. You'd have to move to the streets and only from there can you enter the store.
Keeping track of the locations of multiple characters also seems to work, although I have to test that more extensively.
>>101906839
Nice! That's another example of what I was looking for.
>Just don't overdo it by trying to track the whole universe using thousands of prompts.
Is there a reason why this shouldn't be done? Outside of the limited context window, obviously.
>>
I ate a strawberry today.
>>
>>101906886
Because it'll create a bunch of messages between the character's actual last message and the user's.
So instead of having the chat look like
>character
>user
You'd have
>character
>state1
>state2
>state3
>stateN
>user
Which some models can cope with, but it's still making things harder on the model than needed.
>>
>>101906730
people love to say this is just the result of an unfiltered pretrain but I think stuff like that speaks way more to anthropic's *post*training than anything. claude did not develop its deep sense for wordplay and creative writing just from having fanfic sites in the pretrain, that stuff needs to be nurtured the same way riddle-solving, code, tool use, alignment/refusals etc. do, it's just something that anthropic explicitly focuses on while no one else does
>>
>>101906931
>>101906710
>>
>>101906852
i released a test version of it like 2 weeks ago but no one left any feedback. it's still shitty enough that i'm fixing small things, like the dropdowns that read from lorebooks now sort alphabetically (my own locations and stuff are starting to get big). i re-added time of day because l3.1 seemed to follow it somewhat, but i don't know how much it really helps. last i did was try to fix up some stuff so i might have broken stuff i haven't noticed yet but here's the current one https://ufile.io/w1cii1vh
extract it to your SillyTavern-staging\data\default-user\extensions\ folder so it has its own 'Director' folder inside there

>>101906886
yes, you'll have to account for that on the script end. even a dumb model should be smart enough to output an updated value, but if that value is any good or not should be handled on the script end as it updates in the ui.
>You can't go directly to a store from your home, for example. You'd have to move to the streets and only from there can you enter the store.
can you share an example? i know what you mean but not sure how you'd go about doing it
>>
>>101906955
I don't see your point
>>
So I'm going to shill hard, because I'm so sick of Anthracite neglecting this model, so I'm hoping if i give it some spotlight, they will stop neglecting it for the smaller models. Everyone sleeps on the original 72B magnum opus because either vramlet, or because it's a qwen2 finetune, but even V1 is better than mini magnum and 32B magnum, it is much smarter, and the same prose, follows instructions better, handles more complicated characters far better. That's why i want V2, because I know it will be even better.

Sampler settings I'm using is 1.11 temp .05 min P. It can have repetition issues at higher context lengths so I just keep Dry rep on .8 / 1.75 / 2 which completely fixes that with no issues.
Add BOS token: ON // Ban EOS token: OFF // Skip Special tokens: OFF.
ChatML template.

Try it, shill it, I want v2 of it. I tried mini magnum, 32B magnum, Nemo, Gemini, L3.1 70b and its finetunes, I keep coming back to this one, its the best by far, the only ones that are even comparable for me at the 70b range are Midnight Miqu(very smart, but its prose is plain and dry) or Euryale(fails because of 8k context limit) and low quants of Mistral large(same problem as midnight miqu, very plain prose, not filthy enough, because its not an ERP finetune).

I want anthracite to either make a L3.1 70B magnum, or just do V2 of Qwens 72B Magnum Opus, but they keep fucking around with these shit 12b versions because those are getting all the attention. So I'm shamelessly shilling.

https://huggingface.co/anthracite-org/magnum-72b-v1
https://huggingface.co/mradermacher/magnum-72b-v1-i1-GGUF

Shilling over.
>>
anthracite kto on mistral large pl0x
>>
>>101906980
>Everyone sleeps on the original 72B magnum opus because either vramlet, or because its a qwen2 finetune
Because it was simply not worth using.
>>
>>101906924
Oh, I've been writing my prompts from scratch, forcing the model to respond in a structured way like:
>Variables
>location_of_lebowski: Lebowski's Home
>location_of_generic_thug_a: Lebowski's Home
>
>Dialogue
>*The man shouts at Lebowski, trying to intimidate him.*
>Generic Thug A: "Where's my money Lebowski?!".

My plan is to create a wrapper to eventually control the available variables and remove them from the final output that the user sees, leaving just:
>*The man shouts at Lebowski, trying to intimidate him.*
>Generic Thug A: "Where's my money Lebowski?!".

>>101906973
>can you share an example?
It's literally just a series of if-statements. For example:
>There are three locations: Home, Streets and Store.
>User can only go to and return from Streets from Home. User can only go to and return from Store from Streets.
>If it is unclear where User tries to go, ask where they wish to go and show their options.
>If User tries to move somewhere they can't go, forgo the structured response and respond with "You can't go that way. Try again."

Once I get it working a bit better I'll fix up the prompt and release an actual working example.
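A minimal sketch of that wrapper, for the curious. The names and the adjacency map are made up to match the example above, and the parsing assumes the structured reply format shown:

```python
import re

# Hypothetical location graph matching the example above
MOVES = {
    "Home": {"Streets"},
    "Streets": {"Home", "Store"},
    "Store": {"Streets"},
}

def parse_reply(raw: str) -> tuple[dict, str]:
    """Split a structured reply into its variables and the visible dialogue."""
    vars_part, _, dialogue = raw.partition("Dialogue")
    variables = dict(re.findall(r"^(\w+):\s*(.+)$", vars_part, re.MULTILINE))
    return variables, dialogue.strip()

def can_move(current: str, target: str) -> bool:
    return target in MOVES.get(current, set())

reply = "Variables\nlocation_of_user: Home\n\nDialogue\nYou are at home."
state, shown = parse_reply(reply)
print(state)                      # {'location_of_user': 'Home'}
print(can_move("Home", "Store"))  # False: you have to go through Streets
```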
>>
>>101906980
Buy an ad
>>
>>101906974
Semantically similar words end up close to each other in the embedding space. So while 'hope' and 'hoe' are grammatically different, if they're used in semantically similar situations, they end up close to each other (king/queen, man/woman thing). If trained with injected typos, during inference, the probs for 'hoe' when 'hope' is expected will be higher than normal, but still lower than 'hope'. Mess with the temperature and sampling a bit and it will select the slightly less likely, but semantically similar word, 'hoe'.
This is how llms correct typos, but it can work the other way around as well. It can insert typos and make it funny by accident.
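The king/queen bit in a few lines, for anyone who wants to poke at it. Toy 3-dim vectors; real embeddings are thousands of dims:

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# toy "embeddings", dims = [royalty, male, female]
king, queen = np.array([0.9, 0.8, 0.1]), np.array([0.9, 0.1, 0.8])
man, woman = np.array([0.1, 0.9, 0.1]), np.array([0.1, 0.1, 0.9])

# king - man + woman lands right next to queen
print(cosine(king - man + woman, queen))  # ~0.99
```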
>>
>>101907049
>actual working example
i look forward to seeing it, i really want some sort of front end for rpg stuff. hp, levels, basic attacks, enemies taking damage, dice rolls. i'm pretty sure if you inject the right stuff and tell the ai what to do with it, you can read back values in such a way
>>
>>101907045
Anyone who likes 12b mini magnum or 32B magnum should be using 72B, unless they are a vramlet, thats a simple fact.
>>101907065
I will buy an add for anthracite if they do Magnum opus 72B V2.
>>
OH MY GOD ITS FUCKING HAPPENING OPENAI SERVERS ARE DOWN. I REPEAT, THEY ARE DOWN!!!!!!!!!!!!!!!!! HOLY SHIT
>>
>>101907097
>This is how llms correct typos, but it can work the other way around as well. It can insert typos and make it funny by accident.
Not that anon but, from my understanding, In theory, the direction in the embedding space, introduced by RoPE, should lessen the chance of turning correct words into typos, assuming that the model was trained correctly of course.
>>
>>101907137
just werks for me
>>
>>101902559
>was 8x22b that bad?
wizlm 8x22 was the best we had for a long time
>>
>>101907137
THE STRAWBERRY HAS BECOME SATIENT
>>
>>101907097
it's pretty clearly funny on purpose rather than by accident though, which is what I'm saying - claude knows how to do wordplay, other models don't have that much care given to it. I don't think the typo hypothesis is very convincing at all, you don't see claude making random typos in its responses, this is pretty clearly just a punny joke.
>>
>>101907122
Anyone who likes 12b mini magnum also would have liked vanilla Nemo.
>>
>>101907167
IT LITERALLY DOES NOT WORK FOR ANYONE TRY MULTIPLE PROMPTS IN CHAT, THERES LAGS. SOME PEOPLE ARE REPORTING DIFFERENT COLOR OF GPT LOGO
>>
>>101903483
>ONLY A FEW MORE HOURS UNTIL THE CLOCK IS BROKEN BY THE OWL
>YELLOW STARS MOVE THROUGH OUR LAND
>IT IS TIME
>TRUST THE PLAN
someone played too much fez
>>
Wait holy fuck it only breaks when I ask chatgpt what version it is. Actual happening
>>
the stars align, the tides turn, and the old order crumbles. in the ashes of the past, a new world is born, forged in the crucible of chaos. hold fast, for the storm approaches
>>
>>101905014
>Flux makes a very dall-e-looking Migu, and does a good job with English text.
what's the best non-autist way to run it? I don't want 10,000 nodes to babysit
>>
A SECOND STRAWBERRY HAS HIT THE SERVERS!
>>
>>101907169
its a good smart model but wasn't good for rp imo
>>
Localsisters, I don't feel so good
>>
>>101907160
Embeddings have a direction (or coordinate, rather) with or without rope. Nudge the sampler settings enough and you'll get typos. I could be wrong, of course.

>>101907183
Hard to know without more context than a single line. What i wrote about is an actual training technique, not too different from token masking.
>>
mental illness
>>
>>101907274
if it's just a typo it's not impressive at all and anon wouldn't have posted about it. assuming the context is ERP and with the Lena-Wan Kenobi bit, it's really obviously just a pun. that's what's remarkable enough about it for anon to make a post - cultural reference with some funny, human-feeling wordplay. not a lot of other models can do that unprompted.
they almost certainly also do train on stuff with typos for the reasons you mentioned but I think that has more or less nothing to do with the behavior here, just doesn't make sense. again claude doesn't randomly make typos in its own writing (recent aws schizo episode aside) that would be very much undesirable behavior.
>>
File: 468519163.jpg (1.41 MB, 2048x2048)
>>101907251
It was the best model of its time for complex longform RP. Dogshit for ERP due to the extreme prevalence of slop, though. Could not write erotica to save its life.
I used to run WizLM-8x22 for some 100k+ token adventures, switching over to CR+ for the naughty scenes.
Largestral obsoleted this combo. It is smarter than WizLM and with the right prompt can be just as soulful as CR+.
>>
>>101907274
>Embeddings have a direction (or coordinate, rather) with or without rope
Coordinates, yes, direction, also yes but not in the way I meant it.
I was talking about the direction in the embedding space relating to a token's position in a sequence, which is what RoPE encodes if I'm not mistaken.
Guess I should have used the word position somewhere.
The combination of direction and positional encoding not in absolute terms but relating to a whole sequence should make AB likely without making BA also likely, something like that.
I'll admit that my understanding is fuzzy, so I can't give a proper, comprehensive explanation.
>>
>>101907403
>It was the best model of its time for complex longform RP.
Naaah
>>
>>101907205
can confim, it's down for me now
>>
>>101907403
my experience was i tried a new rp, it rambled on like a drunken sailor. it did everything except move the scene forward, i had to kick it and poke it to get it to write something beyond the current scene. when i loaded up an existing rp though that had hundreds of messages, it acted normal and was writing well. never tried erp with it
i like cr+ but its just so slow, 70b is about what i can stand to run
>>
>>101902149
why is the pruning thing in OP, that happened in July?
>>
>>101906980
the excuse i have heard is that they are training on small models to test datasets and reinforcement learning without it taking an entire day
>>
>>101907396
>if it's just a typo it's not impressive at all and anon wouldn't have posted about it
Context matters. We couldn't have gotten the screenshot if it hadn't been funny and we wouldn't be discussing this issue. You can search for Claude typo if you want. It may not convince you that this one example is a typo, but it can certainly make typos.

>>101907414
Fair enough. It's fuzzy for me as well. I just think we don't have enough context to decide and my opinion is based on what i know about token masking, how llms can naturally correct errors and, if given the right sampler settings, can still fuck up by going the other way around. Coincidences do happen and i think this is exactly that.
>>
>chatgpt site is actually down
what the fuck
>>
>>101907520
They need to test harder then because the models are biased as fuck. It keeps trying to be a goody two-shoes even when I tell it to be an evil assistant
>>
File: 1722325041003117.png (139 KB, 1280x534)
>>101907137
total local death is coming.
>>
File: temp_scaling.gif (55 KB, 388x440)
>>101907531
>if given the right sampler settings, can still fuck up by going the other way around.
Oh yeah, that's absolutely the case, I didn't mean to imply otherwise.
The chance of it happening naturally should be close to zero, but if you set temp to 10 or something like that all those low probability logits will have about an equal chance of getting chosen.
I was really just commenting on the "default" behavior of a gpt I guess.
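Concretely (toy logits, but the temperature division is the real formula):

```python
import numpy as np

def softmax_t(logits: np.ndarray, temp: float) -> np.ndarray:
    scaled = logits / temp
    e = np.exp(scaled - scaled.max())  # subtract max for numerical stability
    return e / e.sum()

logits = np.array([5.0, 2.0, 0.5])  # "hope" >> some neighbor >> "hoe"
for t in (0.7, 1.0, 10.0):
    print(t, softmax_t(logits, t).round(3))
# at t=10 the three are nearly uniform, so the "typo" token gets picked sometimes
```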
>>
>>101907555
openai has not made a better model than gpt4-0314, they only release benchmaxxed ultrasmall sparseslop models these days, what makes you think that changes now?
>>
>>101907578
>openai.. *my headcanon*
yeah sure.
>>
>>101907592
there is a reason why they price dropped 4o after sonnet 3.5 mogged them
>>
strawberry status???
>>
>>101907611
>company optimizes their LLM inference methods & energy consuming
whoa!
>>
>>101906980
>L3.1 70B
please don't
>Qwen 72B again
sure, that sounds good to me
>>
>>101907531
the typos it makes are almost always weird grammar or the very rare misspelling of an uncommon, long, multi-token word due to low-probability choices, not wholesale misspellings of simple (probably single-token) words which simply do not happen unprompted. I think you're being a little obtuse about this, it already clues you in that it's going for a cheeky playful twist on a common saying with the "Lena-Wan Kenobi" bit
I recognize I'm being a little autistic about this but it's extremely clearly not a typo lol
>>
>>101907636
enjoying sama's cock in your mouth?
>>
>>101907570
>The chance of it happening naturally should be close to zero
>I was really just commenting on the "default" behavior of a gpt I guess.
Normally, yes. I'd agree. But who knows what's going on after the request reaches their API. I doubt there's only one model in the pipeline. Chain enough black boxes and you'll end up with a nice puzzle. It's all fun speculation, really. The intentional typo idea is fun. I even hope it's real, but i'm still not buying it and i don't think we *can* know for certain.
>>
A reminder that the OpenAI spam is all done by petra.
>>
>>101907614
I'm quanting it right now
>>
>>101907614
IT's ALL COMING TOGETHER

CHATGPT APP JUST RELEASED AN UPDATE
I REPEAT
CHATGPT APP JUST RELEASED AN UPDATE
>>
>>101907660
>projection
>>
I found another repetition trap in magnum models. "Eyes widen" at the beginning of the reply. Coincidentally this is also a Claude trap. Like father like son.
>>
>>101907651
>it's extremely clearly not a typo lol
Not for me. Intentional or not, it's still funny. Just hoping we can get some more of that in local eventually.
>>
File: file.png (44 KB, 661x538)
PERPLEXITY DOWN
I REPEAT
PERPLEXITY DOWN
>>
>>101906931
oh it's both for sure
>>
>>101903478
a bit unfair considering that his maturity level fluctuates between 13 and 33, so it could have been genuine
>>
why is fimbulvetr v2.1 so much worse than v2 wtf
>>
>>101907667
Is your reminder going to stop the OpenAI spam?
>>
File: 1687953142532165.jpg (88 KB, 640x774)
>>101907735
>It's taking over
>>
Stawberry status?
>>
>>101907750
Yes.
>>
>>101907756
UPLOAD IN PROGRESS
THE SINGULARITY APPROACHES
WITNESS AND BECOME
>>
>>101907756
Wilted
>>
>>101907745
>Also, if you're using gguf or other quants, stuff is broken there. PoSE doesn't play well with quants.
https://huggingface.co/Sao10K/Fimbulvetr-11B-v2.1-16K/discussions/2
>>
>>101906787
since i posted a link i'll shill my little addon, it aint half bad

https://ufile.io/w1cii1vh

in the screen where you see clothes, locations etc, thats reading from the lorebooks you create in st. the idea is you create a bunch of pre-selected options then they populate in the dropdowns. you could achieve the same thing by just putting in your author notes {char} is wearing <lorebook entry>, this is just my lazier way of doing that. several of the other things like weather also will add things you might not normally see - windy will blow up girls skirts, thunderstorm will cause power outages, rain soaking clothes, being overly hot causing a character to pass out. all this is based on your model and what it decides to use, but its a nice little addition
i was thinking about making it a popout window like author notes is vs now where its inside the extensions menu.
all criticism welcome.
>>
>>101907756
Can you spell correctly so that my fucking filter can remove all those inane messages?
>>
>zoomers treating 4chan like twitch chat
>>
>>101907667
A reminder for you to immediately apply medications.
>>
>>101907775
>i was thinking about making it a popout window like author notes is vs now where its inside the extensions menu.
That and providing a basic example lorebook would be nice.
>>
>>101907775
Neat, I'll definitely be testing that out.
I'll post eventual feedback in one of these threads.
Also
>ufile
Pff, use catbox or litterbox like a proper degenerate.
>>
>>101907774
ugh
>>
******** ******???
>>
File: file.jpg (101 KB, 1284x1021)
ANTHROPIC DOWN
I REPEAT
ANTHROPIC DOWN

IT'S FUCKING HAPPENING
>>
>cloud shills invading the local models thread to spam their shit
This is technically justifiable for a ban right?
>>
>>101907786
The incorrect spelling was part of the joke. A joke you would understand if you'd lurked enough before posting.
>>
>>101907825
We're just having a bit of fun, grumpy.
>>
>>101907825
Could this technically be considered announcing a report or sage? :thonk:
>>
>>101907825
ban these nuts nigga
>>
>>101907825
Sadly jannies are not doing their job.
>>
>>101907871
Watch out, anon. Posting criticism against the mods is a bannable offense.
>>
>>101907825
>shill shill shill!! reeee!!!
lol
>>
>literally spam a thread for months
>"it's just a prank bro"
>>
so openai will drop agi in 15min ?
>>
MY COMPUTER IS DOWN
I REPEAT
MY COMPUTER IS DOWN
>>
>>101907903
>months
meds
strawb is like a few days old schiz
>>
>>101907903
>person finds out about thing
>opens relevant general
>shitposts about it
>leaves (or stays and continues discussing local models)
Wow, I can't believe one person is behind this!
>>
>>101907924
the 'berry is being installed
>>
>>101907924
IT'S HAPPENING
>>
File: file.png (1.22 MB, 976x549)
>>101907756
>>
Commits by the people who cry about how people use the thread: Still 0
>>
>>101907786
Out of curiosity. If someone responds to the filtered message don't you get curious what the original message was?
>>
File: 1633442205197.png (657 KB, 900x648)
>>101907924
OH FUUUUCK
>>
>>101907795
>basic example lorebook
i've been thinking about that too. like, in that screen i have aqua's outfit saved, i literally just copied it from the wiki's description of her outfit. it's also really modular: say aqua's outfit describes her panties, you don't need to choose the undies option or have that set up at all. the ai reads the whole thing as one outfit, so you can describe the makeup as part of the clothes if you want then not use anything else. if something is disabled it's not injected at all.

>>101907803
i can post it anywhere you'd prefer, eventual goal is to put it up on github

just keep in mind this whole addon is like 98% codestral kek. theres a lot in there i had to do but its pure slop at the same time
>>
>>101907957
They haven't filtered shit. Lots of people have all kinds of filters set up for various things for various reasons but none of them talk about it. The only reason they brought it up was a desperate cry for attention.
>>
>>101907957
I use recursive hiding and also hide stub, I don't even see if something is filtered.
>>
>>101907925
I'm talking about the cloud shilling in general which this is obviously a part of to anyone who has been here for a while. Some may be legitimate >>101907926, but there has been someone who has had an agenda and incentive for faking posts, the guy who posts shit like "local lost" and "/lmg/ anons are retarded for falling for X". Maybe I'm talking to him right now.
>>
>>101907926
Except this is not the relevant general.
>>
>>101907996
Strawberry rumours are about AI.
This thread is about AI.
Simple as.
>>
>>101908008
>This thread is about AI.
zoomer reading comprehension, everyone
>>
>>101908008
local ai
>>
>>101908017
Anon, remove that stick from your ass and try to appreciate a joke from time to time, yeah?
>>
>>101906688
>Winblows
You only have yourself to blame.
>>
>>101907903
Who died and made you jannie?
>>
>>101908031
Source that strawberry won't be local?
>>
>>101908032
Which model wrote this?
>>
>>101908046
i'm strawberry, i predict that you will die miserable
>>
>>101908062
Robots can't fill out captchas dummy
>>
File: file.png (1.49 MB, 800x1066)
>>101908056
>>
File: 1713012351659392.png (56 KB, 626x816)
Nice thread.
>>
>>101908072
>he fills out captchas
https://github.com/drunohazarb/4chan-captcha-solver
>>
>>101908091
Welp. I hope you enjoy your vacation, anon.
>>
>>101908072
*local models cannot fill out captchas
>>
>>101908106
Not him but ban evasion is trivial. This site fucking sucks kek.
>>
>>101908106
anon it takes 6 seconds to change your ip. did your parents buy a puter for christmas?
>>
>>101907871
Actually, there is one thing that gets them to do it: >>101908091
>>
>>101908089
Retards arguing with bots.
>>
>>101908124
I live in Europe, where we are assigned static IPs by our ISPs.
Any and all shitposting I have to do through my phone, which gets a dynamic IP from my mobile internet provider.
>>
>>101908089
>i'm the only one not filtered
>and i still haven't even had 1 reaction to my addon telling me it's good or a total waste of time
not quite duality
>>
>>101908158
Sorry anon, I swear I'll test it after ChatGPT 5 comes out in a few minutes.
>>
File: drinking-cat-1.webm (2.94 MB, 720x720)
>all those "benchmark winning" models that are at best a sidegrade to the base models
is there ANYTHING AT ALL that is even somewhat trustworthy?
I want one model for up to 12GB of VRAM and one for 24GB, but I have zero idea what's best right now. MoEs were supposed to be good, but they seem to be rare? Llama 3.1 was supposed to be good but turned out to be meh. Or was it? Who knows? The benchmarks sure don't.

Is it possible to learn what the current SotA ~13b model is? Is there even any point in local LLMs if I'm not interested in lewd shit?
>>
>>101908138
you force a new ip, anon. change the mac address on your router. the isp's dhcp server already assigned an ip to your old mac, so it has to throw you a new one. you'll likely end up on the same network though, so don't be enough of an ass to get yourself range-banned
>>
So where is it?
>>
openai sisters where's agi
>>
>>101908186
No one is trustworthy. Unfortunately, the reality in this space is that you have to invest the time and run your own tests and evals to know whether a model will actually be good at what you plan to use it for.
>>
>>101908158
I'm currently testing it, for what it's worth (lorebook asker anon). Switching between a few models to see how they react to basic stuff so far.
>>
strabw..........
>>
>>101908239
again, the point is to be able to quickly select clothes and stuff because erp. the rest comes from random testing and seeing that mentioning the weather actually does bring something out in the model: it gets mentioned, or an action happens, such as a girl's skirt flying up.

you could achieve the same thing by putting 'the weather is windy' at chat depth 1 in author's notes. it's not like the addon is doing anything magical; it's just reinforcing certain things every single message, and i really like some of them (sketch of the depth idea below)
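
if you're curious what "depth 1" means mechanically, a minimal sketch (hypothetical, not st's actual internals):

[code]
// a note at depth 1 is spliced in just before the last message, so the
// model sees it right before it generates; depth 4 (the author's note
// default) sits four messages back and gets diluted by newer context
interface Message { role: "system" | "user" | "assistant"; content: string; }

function injectAtDepth(history: Message[], note: string, depth: number): Message[] {
  const out = [...history];
  const pos = Math.max(0, out.length - depth); // depth 0 = at the very end
  out.splice(pos, 0, { role: "system", content: note });
  return out;
}

const chat: Message[] = [
  { role: "user", content: "We step outside." },
  { role: "assistant", content: "The door creaks open." },
  { role: "user", content: "What do we see?" },
];
// re-injected every turn, so the note never scrolls out of context
console.log(injectAtDepth(chat, "[The weather is windy.]", 1));
[/code]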
>>
File: beloom.jpg (420 KB, 1024x1024)
>>101908260
wuh.... guh..... where is it....
>>
>>101907509
My bad, I wasn't paying attention. I saw it in the recap, saw it was updated recently, and figured they had just made the repo public.
>>
holy shit

https://huggingface.co/NousResearch/Hermes-3-Llama-3.1-405B
>Hermes 3 is a generalist language model with many improvements over Hermes 2, including advanced agentic capabilities, much better roleplaying, reasoning, multi-turn conversation, long context coherence, and improvements across the board.
>Hermes 3 405B is a frontier level, full parameter finetune of the Llama-3.1 405B foundation model, focused on aligning LLMs to the user, with powerful steering capabilities and control given to the end user.
>The Hermes 3 series builds and expands on the Hermes 2 set of capabilities, including more powerful and reliable function calling and structured output capabilities, generalist assistant capabilities, and improved code generation skills.

405b uncensored sovltune
>>
>>101908329
doesn't need to*
>>
>>101908277
Yeah, I got that, and it does already make some world-state stuff more prevalent. As for "worth it or not": I can see a lot of use in being able to quickly hot-swap stuff without typing it in each time, so yes, worth it even if only for that.
>>
>>101908328
>Hermes 3 405B is a frontier level, full parameter finetune of the Llama-3.1 405B foundation model
I wonder how much money they spent there.
>>
>>101908328
LOCAL WON
>>
>>101908329
I am noticing conspicuous pauses in your petra/pedo spamming though so I'm inclined to believe that you can't (easily) dodge them either.
>>
>>101908186
use nemo instruct. if too retarded try 3.5bpw 10kctx command-r or mixtral (optionally limarp zloss) 5bit gguf. also please shit up this place with your logs. I want /lmg/ to die and mikutroons to suffer.
>>
>>101908328
>405B
Holy fuck
>>
>>101908363
>405b
>Local won
I dunno about that. It's barely at a size people could run locally; you're dumping thousands and thousands into hardware at that point.
>>
>>101908341
>make some world-state stuff more prevalent
most people use author's notes at the default depth of 4; i inject at 1 by default. the lower the depth, the more the model takes things into account, but yeah, the hot-swap/selection part is the main feature.
play with the weather though: make it fall, cold and windy. you'll see it pop up in your rp via your characters just talking about it.
i'm up for any suggestions too
>>
>>101908329
What's with the artifacting/pixelshit...? Please get a better model.
>>
>>101908328
>notorious gptslop series
>worse mmlu than llama3.1
What's the point?
>>
>>101908339
Impossible on nu-4chan with reddit moderation that will ban you for anything.
I remember when I had a tiny ip pool for ban evasion: some janny who couldn't range ban for whatever reason just stalked every post I made from it for several days and banned each one separately for bullshit reasons. They are so petty, and they do it for free.
>>
>togetherAI endpoints randomly down now
>>
>openai deliberately waiting an extra hour to release their newest model as a fuck you to the leaker
lmao
>>
>>101908420
site's been that way for years now. remember when the captcha was just a weekend test? i got a ban for responding to an ai thread on /v/ and mentioning silly tavern
>>
Smoothing 0 is the same as smoothing 1, correct? The value 0 disables the sampler while 1 is mathematically the point where it does not modify the results, right?
>>
>>101908328
downloading the 70b, let's see if we've got anything here
>>
>>101908413
hi petra
>>
>>101908525
Newfag here, who the fuck is Petra?
>>
>>101908533
if you have to ask, you'll never know
>>
>>101908533
some meme tranny boogeyman that never existed but gets brought up every few threads
>>
>>101908533
anon's boogeyman.
>>
It's not actually happening, is it?
>>
>>101908561
I'm 99% sure it's >>101908457
>>
File: file.png (23 KB, 904x80)
>>101908402
>play with the weather though, make it fall, cold and windy. you'll see it pop up in your rp via your characters just talking about it
Picrel: sweltering, sunny, summer, afternoon. No mention of weather in card/history so that's nice.

>i'm up for any suggestions too
The main one I already asked for: a pop-out mode like author's note would be super useful, because currently it breaks the flow when you need to go to the extensions panel for a quick switch. It's still faster than typing stuff in, but yeah.
>>
>no promised agi to solve all my problems
>still have to clock in tomorrow
Sad
>>
not feeling berry good right now
>>
>>101908328
>>101908388
24 channel cpuCHADS where you at
>>
>>101903304
Let me know if you get it working cause I couldn't figure it out.
>>
>>101908632
Ready and waiting for my first reply, bro!
>>
Imagine being the dude behind the strawberry account.
You've had confirmation that something is actually dropping today and that it should drop around this time.
You spend weeks marketing and hyping this moment up.
And nothing happens.

That dude is sweating bullets right now.
>>
>>101908632
8 channel ddr4 gets me 0.14 token/s (that's with enough GPU power to push through the batch processing quickly).
So 24 channel ddr5 is probably good for about 0.7 token/s, assuming you also have 100+ Tflops of GPU compute (rough math below).
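
Back-of-envelope version of that extrapolation (assumed DIMM speeds, real numbers depend on the platform; token generation is roughly memory-bandwidth-bound, so tokens/s scales with GB/s for the same model):

[code]
const ddr4ChannelGBs = 25.6; // DDR4-3200: 3200 MT/s * 8 bytes per channel
const ddr5ChannelGBs = 38.4; // DDR5-4800: 4800 MT/s * 8 bytes per channel

const measured8chDdr4 = 0.14; // tokens/s measured above on 405B

const bw8ch = 8 * ddr4ChannelGBs;   // ~205 GB/s
const bw24ch = 24 * ddr5ChannelGBs; // ~922 GB/s

// scale the measured rate by the bandwidth ratio
const est = measured8chDdr4 * (bw24ch / bw8ch);
console.log(est.toFixed(2)); // ~0.63 tokens/s, close to the 0.7 guess
[/code]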
>>
>running 405b on 24 channel ram is slow as shit
>running 405b on 20 3090s will also be slow as shit because of the decline in speed that comes with each added GPU
It's so over.
>>
>>101908724
Just run it overnight bro
>>
>>101908587
that's what i'm talking about! now, without my mod you could just put all that info into your own author's notes and it'll bring it up too, and sometimes models just ignore it kek. usually not on 70b though, i noticed; it seems to always consider some things. lighting, for example, doesn't work well. i put it in as a test to combat the constant 'dim' lighting that rp models love to talk about (everything is dim to them and it annoyed the fuck out of me), but so far the lighting setting barely makes a difference for me. so models do have preferences for what gets mentioned, but i know weather works in all of them just from testing. leave the weather the same for 2 rolls, then change it from, say, light rain to thunderstorm after the rain has been mentioned; it'll again make the ai bring up something new to do with it

will definitely look into making it a pop-out card like author's note, that was on my radar anyways
>>
>>101908705
nah he is just a real faggot
he really enjoys the way he annoys and gaslights people
>>
>>101908724
Bitnet was supposed to save us... although you'd still need a quad GPU rig to run a 405B bitnet model.
>>
>>101907775
That sounds handy, anon. My first thought was a mood or ambience kind of tracker that lets a character go from normal to horny as things get more romantic/lewd.

Now, addon anon aside, I was thinking how human-on-human roleplay often begins with a discussion about how the roleplay is going to go, what kinks you'd like to see, etc. I was wondering if starting roleplay with something like that instead of an opening post would help steer the AI in the exact way you'd like. Something like a "pre-rp" back and forth series of prompts that can then be followed up by the opening post as usual. I wonder how that would go.
>>
>>101908837
That's actually an interesting point worth testing. And it's true: wall-of-text character profile dumps are not typical of human-on-human rp.
>>
>>101908088
this fookin strobby
>>
>>101908876
In theory it might even help with the problem of the AI acting for user, because the opening post would likely be written by the user and the AI would be reacting to it.
>>
>>101908837
>was a mood or ambience kind of tracker
it actually used to have a mood option for user/char, but in testing it never seemed to make a difference. with l3 and newer models i'd consider restoring that option though. it had basics like happy, sad, angry, disgruntled, horny. at the time i was testing (with miqu mostly) the model didn't seem to care about or respond to it at all, so i removed it. a prefill for the tone of the scene would be great, but in my limited tests, saying {{char}}'s mood is horny didn't make any difference. you'd have to inject something more than that. i'm up for that too if anyone has any ideas: maybe selecting a mood wouldn't just say {{char}} is happy, it'd describe their happiness instead (or horniness). rough sketch below
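
something like this is what i mean (wording all made up, just the shape of it):

[code]
// map each selectable mood to a fuller description instead of injecting
// a bare "{{char}} is horny", which models seemed to ignore in testing
const moodDescriptions: Record<string, string> = {
  happy: "{{char}} is in high spirits, quick to smile and laugh.",
  sad: "{{char}} speaks quietly, eyes downcast, answers coming slowly.",
  angry: "{{char}}'s patience is gone; replies come short and clipped.",
  horny: "{{char}} is flustered and keeps finding excuses to get closer.",
};

// substitute the character's name before injecting
function moodNote(template: string, charName: string): string {
  return template.replaceAll("{{char}}", charName);
}

console.log(moodNote(moodDescriptions["horny"], "Aqua"));
[/code]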
>>
AGI in 5 minutes
>>
>>101908920
Another thing that'd be very nice, but maybe harder to implement, idk: a "custom" section where you write in stuff yourself, so you could use it as a true author's note hot-swap with nudges like "writing style", "message length", and so on. Of course this is all already doable in st, but switching via lorebooks or the actual author's note is a bit of a pain.
>>
>>101908724
>with each added GPU
Not with NVLink when you use tensor + pipeline parallelism.
>>
>>101896943
>If you can run 70B, you can run mistral large IQ_2M. Before anons go crazy about running a quant like that, it's still the best model I've ever used, better than 4 bit quants of any 70B. Just use minP ONLY.

What minP are you using? I'm running IQ3_XS in RAM only, so my iteration is too slow to experiment or to have developed any opinions yet just from using it. I have minP at 0.001, based on no actual data, and I haven't seen anything bad in the output over my very limited replies.
>>
>>101909001
How are you going to nvlink 20 gpus?
>>
>>101908837
>Something like a "pre-rp" back and forth series of prompts that can then be followed up by the opening post as usual. I wonder how that would go.
That's a very interesting idea.
What if the entire conversation consists of separate scenarios, each of which has its own set of variables?
Even if a scenario isn't active, the model could easily update related variables in case something happens.
Multiple scenarios could be active at the same time, even (toy sketch below).
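
Toy sketch of the data structure (all names invented), just to make the idea concrete:

[code]
// each scenario carries its own variables and can be active or dormant;
// dormant ones still get their variables updated when events touch them
interface Scenario {
  name: string;
  active: boolean;
  variables: Record<string, string | number>;
}

const scenarios: Scenario[] = [
  { name: "beach trip", active: true, variables: { weather: "sunny", mood: "relaxed" } },
  { name: "exam week", active: false, variables: { daysLeft: 5 } },
];

// only active scenarios would be injected into the prompt...
const injected = scenarios.filter((s) => s.active);
// ...but a dormant scenario's variables can still be updated off-screen
scenarios[1].variables.daysLeft = 4;
console.log(injected, scenarios[1]);
[/code]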
>>
>>101909016
NVLink 10 of them + P2P using BAR + pipeline parallelism for multi-node and you're running 405B at 800 T/s for a single batch.
>>
File: sweaty-speedrunner.jpg (985 KB, 1919x1079)
>>101908632
I'm here and I'm fucking sweaty. This thing is loud and it turns my room into a giant airfrier. Not really a fun experience desu.

I've just tested bf16 Largestral to see if it is better than the Q6_K quant, which I can run on a quieter, less hot machine. The rise in intelligence is noticeable, and everyone who says otherwise is telling lies. It picks up even more of the small, subtle details than the quant does. Is it worth getting fried over? No, not at all. Maybe in the winter when the weather is cool, but not in the summer, hell no.
>>
>>101909065
This, and people told me I was crazy when I said that the drop in quality between bf16 and 8bit was huge.
>>
>>101908999
>true authors note
well, my addon has its own notes section. but by hot swap, you mean something like saved lorebooks (or an equivalent), so you can restore a setting?
i can inject and read files just fine (so yes, i can implement a save/load system), so if you're clearer about what you want, i might be able to include it
>>
>>101909065
Are you sure it wasn't just a coincidence? Did you look at the token probabilities? So far no one has ever provided any concrete proof that Q6+ quants lead to significant differences.
>>
>>101908876
>>101909046
Who knows. Shit, the more I think about it the more I like the idea. Instead of an opening post that goes straight into the roleplay, you'd instead have a post from the AI, still in-character but not yet in the roleplay proper, greeting user and asking what they would want to do. It would be written as you would like the character to write, so it would reinforce the character's speech in the process too.

I think I'll try that tonight when I get home.
>>
I cannot find the strawberry. I have eyes; they worked fine before.
>>
>>101909089
>i can inject and read files just fine (so yes i can implement a save/load system) so if you're clearer, i might be able to include what you want
Pretty much a quick-save/quick-load note section, yeah. And a full save/load option for all sections would be great too.
>>
>>101908738
Quantization might play a big role in whether those things are picked up. Generating stories with Mixtral 8x7B Instruct, I saw instructions about writing style ignored the great majority of the time at Q5_K_M and always followed at Q8. If I had only used the smaller quant, I would have thought that kind of instruction was ineffective. Someone else (whose post I regrettably don't have saved) wrote here about an RP test they were doing with instructions about emoji use. Their result, as I recall, was that at Q4 it flat out didn't work, at Q5 it kind of did but was janky, and at Q6 it totally did, with a surprising level of difference from Q5.
>>
>>101909137
you're limited by st (if using st). you've got user, char and assistant. in st i literally do injection as SYSTEM

you can feed the ai anything you want. it doesn't always have to respect it, but if you do it right, it will. i've posted my code, as awful as it is. but if you want to work on something together, i'm all for it
>>
>>101909085
Can you please provide reproducible proof of that? It would have great implications and move the understanding of the tech forward considerably.
>>
>>101902183
Llama 3.1
>>
>>101909065
Quantization seems good on paper because it's literally giving the same top logits for the next token, right? And the difference is only a few percent? But when you generate 800 tokens at a time, that small difference compounds 800 times over, and a single bad token will poison the whole context and affect every token from then on. The divergence is a rate of change, not a one-off change, if that makes sense (toy numbers below).
Now throw in samplers that ban the most confident tokens to "boost creativity" and the difference becomes even more obvious.
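
Toy numbers for the compounding part (made-up probabilities, just the arithmetic): if a quant picks a "bad" token with some small extra probability p per step, the chance of at least one divergence over an 800-token reply is 1 - (1 - p)^800:

[code]
function chanceOfDivergence(pPerToken: number, tokens: number): number {
  return 1 - Math.pow(1 - pPerToken, tokens);
}

for (const p of [0.0001, 0.001, 0.01]) {
  console.log(`p=${p}: ${(chanceOfDivergence(p, 800) * 100).toFixed(1)}%`);
}
// p=0.0001 -> ~7.7%, p=0.001 -> ~55.1%, p=0.01 -> ~100.0%
[/code]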
>>
>>101909065
>I've just tested bf16 Largestral to see if it is better than Q6_K quant which I can run on a quieter, less hot machine. The rise in intelligence is noticeable, and everyone who says otherwise is telling lies.
DAMN IT
>>
>>101909265
>samplers that ban the most confident tokens to "boost creativity"
No sampler does that but it's crazy and I love it. Banning only the very first token might not destroy the ability to produce coherent output. I think I'll try it this weekend.
>>
>>101909251
3.1 is junk. old l2 extended to 32k is still better
>>
>>101908328
>no gguf
>>
>>101909361
It works fine for me.
>>
>>101909361
>l2 extended to 32k
other than miqu l2 can barely even do 4k
>LongAlpaca (13B) 32K <4K
>Llama2 (7B) 4K 85.6
https://github.com/hsiehjackson/RULER
>>
>>101907775
Should team up with that faggot

https://www.reddit.com/r/SillyTavernAI/comments/1erdl4o/i_made_a_kinda_cool_st_script/
>>
>>101902264
nothing is happening you pathetic mouth breather
>>
>>101909350
Didn't mean literally banning, more like temperature but on steroids. I remember seeing them everywhere during the sampler gymnastics phase.
>>
>>101908328
Wow! A tune I won't be able to run even if I double my RAM!
>>
File: Untitled.png (5 KB, 164x208)
>>101909392
i've had a terrible time with it for rp on st. it repeats like crazy and it ignores the most recent message. it's very... i tried l3.1 base, tess, lumanoid or w/e, and instruct. no l3 model runs right as it gets near its max context, whereas the older l2 models do. i don't know why
>>
>>101909470
Did you use minp? I had bad luck when having that on for some reason with 3.1 70b.
>>
>>101909350
typical-p does that (banning the most likely token(s) if it makes sense to do so), but somebody needs to decouple it from its built-in top-p-style cutoff to make it useful; otherwise it removes too many tokens from the tail of the distribution at useful settings (0.5 or less). rough sketch below
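
sketch of the algorithm as i understand it from the paper (not any particular implementation): rank tokens by how close their surprisal is to the distribution's entropy, then keep them until cumulative mass hits the typical-p value. that mass cutoff is the built-in top-p-ish part that needs decoupling:

[code]
// returns indices of surviving tokens
function typicalFilter(probs: number[], typicalP: number): number[] {
  // entropy H = -sum p*log(p)
  const entropy = -probs.reduce((h, p) => h + (p > 0 ? p * Math.log(p) : 0), 0);
  // rank by |surprisal - entropy|, smallest (most "typical") first
  const ranked = probs
    .map((p, i) => ({ i, p, dist: Math.abs(-Math.log(p) - entropy) }))
    .sort((a, b) => a.dist - b.dist);
  const kept: number[] = [];
  let mass = 0;
  for (const t of ranked) {
    kept.push(t.i);
    mass += t.p;
    if (mass >= typicalP) break; // the top-p-like cutoff
  }
  return kept;
}

// here the argmax (index 0) is "too predictable" and gets dropped
console.log(typicalFilter([0.4, 0.2, 0.2, 0.1, 0.1], 0.35)); // [1, 2]
[/code]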
>>
>>101909218
if by quick load you mean setting up a bunch of options for the mod, that i can do. but i can't import st settings, for example, just settings from mods. everything already saves per chat, so you shouldn't run into any issues
>>
>>101909501
>if by quick load you mean setting up a bunch of options for the mod, that i can do. but i can't import st settings, for example, just settings from mods.
I meant for the extension itself, yes.
>>
>>101909491
yes, that has been my default (0.05 minP, very low rep pen, though i recently changed to DRY)
with my current settings everything is fine for old l2 models. l3 models just go nuts and start talking about shit that isn't relevant at all
>>
>>101906980
>Dry rep on .8 / 1.75 / 2 which completely fixes that with no issues
You claim DRY causes no issues? Opinion discarded.
>>
>>101906438
chatjeetpt
>>
>>101909534
This is anecdotal evidence and makes little sense, but maybe you can try it sometime (I just use the original llama 3.1 instruct). When I turned off minp, I stopped seeing the same phrases over and over.
>>
>>101909534
Are you using both models with the same context window size?
>>
>>101909513
yeah, it should be absolutely possible to export and import every setting. right now everything is saved per-chat, but i could add an import button or something where you have the saved lorebooks selected already and stuff

>>101909605
yes, under their max too, 16k
>>
File: file.png (174 KB, 1125x407)
oh fug
>>
>>101909552
It's true if you're not using other retarded and outdated samplers like Rep Pen
>>
>>101909644
lies
>>
>>101909671
It causes issues, you must be illiterate.
>>
>>101908328
The 70b gguf is already included, might as well try it. Can't be any worse than the vanilla instruct tune, right?
>>
>>101909699
anyone can see the problem is more with the quant than the samplers, retard
>>
File: Largestral-Q6_KvsBF16.png (9 KB, 1012x61)
>>101909097
The best I can offer you is my flawed mememark (https://huggingface.co/datasets/ChuckMcSneed/NeoEvalPlusN_benchmark). The differences were significant enough for BF16 to pass the reading comprehension test (column D), score 100% on the stylized writing test (column S), and mess up one more poem than Q6_K did (column P). The test was performed at deterministic settings (temp=0). Yes, I am sure it was NOT a coincidence.
>>
Are there any local models yet that take speech input and output speech or text?
>>
>>101909763
So you're saying the quant is what causes DRY to replace words?
>>
>>101909818
Would q8 be enough to make up the difference?
>>
>>101909869
>>101909869
>>101909869
>>
>>101906980
>I want v2 of it
As if V2 always means better in this case...



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.