/g/ - Technology


File: 1711446303010013.jpg (411 KB, 1536x2048)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>101058366 & >>101049838

►News
>(06/18) Meta Research Releases Multimodal 34B, Audio, and Multi-Token Prediction Models: https://ai.meta.com/blog/meta-fair-research-new-releases
>(06/17) DeepSeekCoder-V2 released with 236B & 16B MoEs: https://github.com/deepseek-ai/DeepSeek-Coder-V2
>(06/14) Nemotron-4-340B: Dense model designed for synthetic data generation: https://hf.co/nvidia/Nemotron-4-340B-Instruct
>(06/14) Nvidia collection of Mamba-2-based research models: https://hf.co/collections/nvidia/ssms-666a362c5c3bb7e4a6bcfb9c

►News Archive: https://rentry.org/lmg-news-archive
►FAQ: https://wikia.schneedc.com
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Programming: https://hf.co/spaces/bigcode/bigcode-models-leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
►Recent Highlights from the Previous Thread: >>101058366

--Understanding lcpp's Auto-Offload and its Impact on VRAM Usage: >>101067374 >>101067414
--Successfully Implemented Bubble Sort Algorithm in Python: >>101066250 >>101066386 >>101066479
--Quantization's Impact on AI Model Performance and the Risks of API Services: >>101058990 >>101056274 >>101059211 >>101059239 >>101059383 >>101059804
--Correction: Karakuri Released Instruct Model, Not Chat Model: >>101062890 >>101062898 >>101062938 >>101062976 >>101062994 >>101063015 >>101063021
--Clarifying the LLM Openness Leaderboard and Command R+'s Capabilities: >>101059462 >>101059472 >>101060287 >>101060344
--Chameleon Compatibility and the Quest for Professional LM in AI Models: >>101058808 >>101058839 >>101058864
--imatrix quantization performance on CPUs: >>101058492 >>101058531 >>101058565 >>101058546 >>101058585 >>101058659 >>101058691 >>101058699 >>101058744 >>101058830 >>101060521 >>101058589 >>101058705
--Understanding the Role of Calibration Datasets in Quantization: >>101063013
--Struggling with Insufficient RAM on Google Colab for AI Script: >>101058851 >>101059006 >>101060355 >>101060440 >>101059741
--Nemotron-4-340B: The New King of Open-Source AI Models?: >>101064553 >>101064579 >>101064641 >>101064605
--How to Filter File Extensions During Git LFS Clone to Avoid Unnecessary Downloads: >>101060150 >>101060399
--Flashing AMD Graphics Cards for Gaming Performance: >>101061961 >>101061972 >>101062372 >>101062438
--Creating Control Vectors for Mixtral 8x22b and Wizard8x22b Models: >>101061658 >>101061776 >>101064877 >>101065900 >>101065909 >>101065947
--Chub: Bots and Character Card Repositories: >>101060048 >>101060083
--AI Models Performance Comparison: GPT-4o, Gemini 1.5 Pro, and Llama-400b: >>101067229 >>101067388 >>101067410 >>101067649
--Miku (free space): >>101059024 >>101059746 >>101065313 >>101065418 >>101066640 >>101061794 >>101069307

►Recent Highlight Posts from the Previous Thread: >>101058373
>>
File: 1718892971727925.png (239 KB, 1011x868)
https://www.anthropic.com/news/claude-3-5-sonnet
>>
>>101069634
Surely 400B llama 3 8k context will be better R-Right?
>>
>>101069449
Have you tried prompting it to not rush the progression of the current scene or something of the sort?
Put a small rules block with 4 or 5 very concise rules it should follow in the last assistant output field or something like that.
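Something like this, purely as an illustration (exact wording is whatever works for your model):
[Scene rules:
- Advance the scene one small action at a time.
- Stay in the current scene until {{user}} moves it forward.
- Keep each reply focused on the present moment.
- Let {{user}} decide when time passes.]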
>>
>>101069634
leak plz
>>
>>101069390
>Oumuamua-7b-instruct-v2
Are you the anon who mentioned it in the community tab? I already have some preliminary results (I ran it on half of the test set) and it appears to be just as good as LLaMA 3 8B Instruct, which is actually a bit impressive considering the base model seems to be Mistral, and Mistral has quite bad results.
>karakuri-lm-8x7b-instruct-v0.1
Thanks, I wasn't aware of this one, I will look into it.
>>
>>101069678
dude, I don't even care about meta anymore. There's an equally good chance that a company brings out a model next week that mogs all of them, a company none of us have ever heard of. Times right now be like this. There's no point in dooming anymore. The question of "if" has long ceased to be, it is only "when" now
>>
Command-R++ v2 Apache 2.0
>>
>>101069743
kek, imagine being Meta and releasing a model that is DOA.
It's truly sad.
>>
>>101069688
no, but i'll try it. it never seemed to work with other models either, where you can tell it to never skip time and it does it anyways, sometimes with the days turning into weeks, which is much worse than just skipping to evening
>>
>>101069678
If you still think we're going to get a good model out of meta, you're going to be disappointed. It's pretty clear that they want to heavily censor anything they release, to the point of it being unusable. We're going to have to rely on someone else
>>
>>101069906
>never skip time and it does it anyways
Telling the model what to do instead of what not to do seems to work best, so "advance the scene one small action at a time" should work better than "never skip the scene" or the like.
>>
>>101069634
I'll switch once Claude outputs better code than GPT4 (not yet)
>>
>>101069946
It would be better if they censored their models. Then maybe we would have a chance to uncensor them.
They are fucking retarded, filtering out anything NSFW from their base model instead of training on everything and finetuning censorship in.
You can't make llama 3 not suck for roleplay without a continued pretrain like Miqu.
>>
File: 1689556332236690.jpg (730 KB, 1856x2464)
>>101069457
>>
>>101069457
Adorable!
>>
>>101070012
deepseek2-coder is better
>>
File: Nala test DSCV2.png (149 KB, 932x475)
Now I know nobody asked for it, but here is the Nala Test for DeepSeek-Coder-V2-Instruct (Q4_K_S; I originally wanted to do Q8_0 but the KV cache is too big to fit on a single GPU (split KV cache when?)).
>>
>>101070306
Anon, you can always assume that I asked for a Nala test.
It's implicit.
That's not too bad. How big is that model?
Did you try Codestral 22B?
>>
>>101070354
>How big is that model?
236B
It's a MoE, 6 experts per token and I think 21B active parameters.
I think I did codestral a while back but it was pretty much standard mixtral slop.
>>
Why do anon(s) keep saying NSFW was filtered from Llama 3? Is this some kind of psyop?
>>
>>101070409
A psyop of skill issue.
>>
>>101070381
>236B
>It's a MoE, 6 experts per token and I think 21B active parameters.
Holy fuck.
I'd love to see a RP focused Codestral fine tune. Wonder what the result would look like.

>>101070409
I think so.
The original instruct does spit out some refusals for some things from time to time, but it can absolutely do lewd, and fine tunes just work.
>>
Why do anon(s) keep saying Llama 3 is good? Is this some kind of psyop?
>>
File: Deepseek-V2-ranking.png (84 KB, 938x858)
>>101070274
Doubt.
It is barely above the old Claude here:
>https://prollm.toqan.ai/leaderboard/coding-assistant
Hell, it is just under WizardLM-2 8x22B, which has 95B fewer parameters and isn't even code-specific like DeepSeek-Coder-V2 Instruct is.
>>
>>101070409
They said something about filtering for token quality in their blog. Still no llama3 paper yet so everybody is just dooming and guessing
>>
Bitnetto statassu?
>>
>>101070465
You can keep track of the status right here anon https://github.com/ggerganov/llama.cpp/pull/7931
>>
>>101070306
Why are you trying to RP with a coding model? Are you stupid?
>>
>>101070409
retard
https://ai.meta.com/blog/meta-llama-3/
>In line with our design principles, we invested heavily in pretraining data.
>To ensure Llama 3 is trained on data of the highest quality, we developed a series of data-filtering pipelines. These pipelines include using heuristic filters, NSFW filters, semantic deduplication approaches, and text classifiers to predict data quality.
>NSFW filters
>NSFW filters
>>101070433
>but it can absolutely do lewd
Of course, filtering will never get all of it. Simple innuendo and subtlety would get past most filters. But it's like trying to roleplay with an inexperienced virgin.
>>
I really preferred Command R+'s writing to Opus', because Opus gives you a lot of verbosity, purple prose, and empty words (words? try entire paragraphs) that are completely meaningless filler. (People bad at reading mistake this for quality; it isn't.) While CR+ is of course not nearly as smart, I found it more enjoyable to write with because it gets to the point with more natural wording, and especially because it lacks Opus' inherent bond-forming journey positivism and some really common slop phrases Opus just *loves*.

The new Sonnet seems to give CR+ a run for its money in this regard; it feels less slopped with the right instructions and also gets "to the point". It is more repetitive, though.
>>
>>101070409
I guess that comes from this:

https://ai.meta.com/blog/meta-llama-3/
>To ensure Llama 3 is trained on data of the highest quality, we developed a series of data-filtering pipelines. These pipelines include using heuristic filters, NSFW filters, semantic deduplication approaches, and text classifiers to predict data quality. We found that previous generations of Llama are surprisingly good at identifying high-quality data, hence we used Llama 2 to generate the training data for the text-quality classifiers that are powering Llama 3.
>>
>>101070504
I hope Cohere wins the AI race.
>>
Believe in NAI.
>>
So apparently with DeepSeek the first reply is free, but after that it starts spitting out refusals unless you JB it (as always, an enthusiastic-assistant JB works just fine).
>>
>>101070465
Reshuffled my tarot card deck. I can say for a fact CR+2 Bitnet is coming
>>
>>101070576
NAI must DIE
>>
>>101070686
nai delenda est
>>
Sloppet won
>>
>>101070409
look at the paper, they said they removed NSFW
>>
>>101070757
What we got was only the preview or pre-release of llama 3. All we know about it is from the blog post linked above. Paper hasn't been released yet, and probably won't be until next month.
>>
Did somebody manage to get DeepSeekCoder-V2 running in GGUF?
It crashes in both ooba and llama.cpp.
>>
>>101070503
>>101070523
You can never trust model makers' words. L3 performs OK, not bad, in RP after fine-tuning (at least at shorter contexts), which suggests that they did in fact get substantial NSFW in training. People were saying everything NSFW was filtered, but if the filter were that effective then L3 would be even worse at it than what we're seeing.
>>
>>101070818
see
>>101070306
running on llama.cpp server
no gpu layers. The KV cache is absolutely monstrous even on the q4_K_S quant. So it's probably OOMing because of that.
>>
did anything ever come out of https://huggingface.co/blog/mlabonne/abliteration ?
>>
>>101070884
no
>>
File: file.png (59 KB, 1296x536)
what the fuck is ecker cooking... i thought he stopped trying to fuck with TTS
>>
>>101070826
but they did filter the NSFW, meaning that without this cucked shit, L3 would've been even better at RP, what a shame
>>
>>101070845
In Task Manager I see nothing loading into RAM; it immediately errors out.
>>
>>101071093
install linux
>>
File: DSCV2libra.png (86 KB, 488x799)
If you want to ERP with DeepSeek-Coder-V2-Instruct here is the template I'm using.
I think most backends automatically insert the BoS token so you can probably remove that since llama.cpp server keeps nagging me about it.
It seems to conspicuously avoid using any explicit language, but it's got some pretty amazing attention to detail for the parts of the RP that aren't explicitly erotic. I would almost say that a good ERP finetune of this model would make it the coom champion, albeit only CPUmaxxers can actually run the damn thing on a non-brain-damaged quant.
>>
>>101070409
It's a petra/kurisufag shitpost that NovelAI shills also like to repeat.
>>
>>101071106
Kys troon.
>>
>>101071137
why coder and not the base instruct?
>>
>>101070504
>because Opus gives you a lot of verbosity, purple prose and empty words (words, try entire paragraphs) that are completely meaningless filler
If you read the logs posted in /vg/ from botmakies you will find that this is simply not the case. You lack taste and don't know how to prompt. This is par for the course in the local models thread, where people have to settle for scraps.
>>
>>101071106
Why do you think your time is worthless?
>>
>>101071196
You see you're a "Why?" guy. But I'm a "Why not?" guy.
>gargantuan 236B parameter model explicitly designed for code completion
Why not fuck it?
>>
>>101071231
idunno man, give me better ideas of what to do in my free time; so far i've been studying c like anon suggested
>>
>>101071266
have you ever tried making cocktails? or getting a pet? i have a pair of rats, they are really cute
>>
>>101071037
Yeah. Still, people were literally exaggerating saying all NSFW was taken out. That's obviously not the case.
>>
>101071226
(You)
>>
L3 Higgs is the first 70B model I see that can coherently play truth or dare at 3.5bpw
>>
>>101069634
When will we get anything good? I hate sending data to them.
>>
>>101071812
So don't? Do you need large language models?
>>
>>101071798
Really?
Cool. That's something I've tried with a couple of models and all of them fuck it up at some point.
>>
File: Untitled.jpg (95 KB, 511x1088)
i wanted to select clothes from a dropdown that i saved in a lorebook but it became a different thing, like a scene director. so far it injects info, if selected, like
scene information:
{{user}} is wearing <clothes>
time of day is: evening

is there any interest in something like this?
>>
>>101071966
At this point why not just play Koikatsu anon?
Both require about the same level of prompting
>>
>>101071966
That's pretty cool.
It's the kind of thing you could simply add to your author's notes manually, but having a UI is pretty dope.
>>
>>101071860
Yeah, it really saves time writing small programs, doing configuration and stuff.
>>
>>101071889
It fucks it up in some swipes and when I reminded the model of it she said "It's my games so I decide the turns" lol
>>
>>101072080
>It's the kind of thing you could simply add to your author's notes manually
thats exactly what i'm trying to avoid, a day can go by so quick while rping that i dont want to type the new outfit name, i'd rather select it.
>>
>>101072114
Lmao.
That's a model with personality right there.
>>
>>101072114
bratty AI needs correction
>>
>>101071009
>Loras
alright, I'll give tortoise a chance again.
>>
File: Miku-chan.png (391 KB, 400x600)
>>101072153
>GET FINETUNED GET FINETUNED GET FINETUNED
>*coil whine* *coil whine* *coil whine*
>>
>>101072366
Censor this post for the advertisers
>>
>>101071986
why? that looks like a soulless pos
>>
the pattern suggests that we're due for a major new open model release
>>
>>101072562
To plot the timeline.
In paint, for the soul.
>>
>>101072523
What's soulless about it?
If your goal is controlling these facets about a character it sounds like Studio would unironically be a good tool for you
>>
Word on orange reddit is that 3.5 Sonnet is the new king. Mogging even GPT-4o on basically all tasks, and by a decent margin. All this with a "medium" sized model, and it's confirmed they're working on 3.5 Opus.

It's fucking over for local. Why do we even try. Given how fast it is, 3.5 Sonnet is probably around 70b parameters, or not much larger. Meanwhile we're stuck with llama 3 70b and all its problems as our state of the art. It's not even fucking close, and the gap continues to grow. Owari da.
>>
>>101069634
Ok great, but how does it perform in real-world scenarios? It's quite easy to claim your model is better; that was the case with Opus, yet it's worse than GPT-4 (though not in ways easily measured by a benchmark).
>>
File: teto_beeg_llama3_8K_.jpg (2.24 MB, 6144x4096)
>>101072562
beeg l3 soon
>>
>started playing bullet chess games while waiting for my responses to generate
surely this will have no strange aftereffects on my sexuality
>>
>>101072633
>This much pessimism.

We don't know the size of Sonnet. That's nonsense. For all we know it's a 500B quant, or the same size as 400B L3, or only slightly smaller if it's smaller at all.
>>
File: MikuJushinChuu.png (1.5 MB, 896x1152)
>>101072633
>It's fucking over for local. Why do we even try.
>3.5 Sonnet is probably around 70b parameters
They've shown it can be done. It's now merely a matter of time (or a leak).
>>
>>101069634
Sonnet 3.5 gives a good answer to my admittedly possibly fucked up prompt (come on, I don't know the perfect way to describe it).
>>
>>101072629
its missing options and i can do way more in rp anyways with st
>>
>>101072688
>>101072714
Holy fuck that's one hell of a test, I like it.
You should make a small document with the models you've tested using that prompt.
>>
>>101072633
I'll be surprised if Sonnet-3.5 is a quantized Opus-3.5 and if Sonnet-3.5 is a MoE or dense sub-100b model.
From their latest research, it seems that they've found a new approach using steering and MLA.
>>
>>101072633
not over at all, we can use Claude 3.5 Sonnet (and then Opus) to finetune our models, we'll see some improvement
>>
>>101072633
That is wonderful news. That means ClosedAI is losing its moat. When choosing between two evils, Claude is the lesser of the two, especially when it comes to censorship. Simply, fuck ClosedAI and all their shitty censored models.
>>
>>101070012
nigga wtf are you doing? I have a custom prompt template for coding tasks (separate from another custom prompt template for general programming Q&A/assistant mode), and GPT-4-turbo's code outputs are garbage so I stopped using it; meanwhile Opus is basically the best shit anyone can get. might be a serious skill issue for u nigga
>>
Why are people responding to the claude schizo
>>
>>101072877
dead general
>>
>>101072877
>why are people discussing actual advancements in AI
>>
>>101072877
Is the claude schizo in the room right now?
>>
>>101072918
cloud advancements are worthless
>>
>>101072918
>>>>>>>>>>>>>>>>>>>>>>AI
>>
>>101073155
>>101073203
but enough about local llms
>>
>>101072633
>3.5 Sonnet is probably around 70b parameters
No, lol.
70B are priced around $0.8/M
120B are priced around $1.8/M
So I think we can safely assume that Claude is a 400B or bigger, assuming it's a dense model.
>>
>>101073241
I'm sure Anthropic adheres to those prices with models whose sizes they don't disclose at all.
>>
Optimizing AI Inference at Character.AI
https://research.character.ai/optimizing-inference/
https://archive.is/koFXi
>>
>>101073392
Nice of them to reveal why their model sucks nowadays.
>>
File: file.png (66 KB, 640x640)
>coding and shit
IDE with gpt/claude/copilot
>cooming
stheno 3.2
>>
>>101072633
>Sonnet 3.5 is THIS good
>Opus 3.5 confirmed to be coming
Holy shit Claude gods won.
>>
>>101072633
>Sonnet is probably around 70b parameters
Will we have something this good in 2 years at 70b, even?
>>
Anyone using Euryale 2.1 with kcpp? I'm impressed with it at 8k context, but when I tried 20k it turned retarded. I'm wondering if maybe the automatic rope scaling settings aren't optimal.
>>
>>101073688
If it's anything like L3 8b, yeah, their automatic scaling algorithm is fucked.
Try setting 32k of context on kcpp and watch it become coherent again at 20k, and break at around 25k or so.
I use
>--ctx-size 30208 --rope-freq-base 5000000
with llama.cpp server and L3 8b and it's coherent.
>>
Now that the Mikubox is converted to 3x P100 16GB internal and 2x 3090 external, I gave command-r+ 5.0bpw a go under tabbyAPI.
The session started at almost 6 t/s and dropped to 3.5 t/s 5580 tokens in. Interesting that it loads something onto the third P100, but nvtop never shows it doing anything while processing a prompt.

Compared to LLaMA 3 8B... yeah, it's a little better. For example, with a Kuroki Tomoko character, she stays timid and nervous far further into the roleplay, whereas L3 8B would quickly have her turn into a "normal" person. Honestly, command-r+ is probably overkill for roleplay; I'd leave it to stuff like writing long stories or working in more than one language at once.
>>
hello, why is huggingface giving me error 429 (rate limit)?

I'm just browsing the site, not downloading anything.
>>
>>101073392
>we natively train our models in int8 precision
I'm surprised quantization aware training (QAT) seems not to be done more often for open weights models. I suspect every single big player uses it. I know for a fact Google uses it for all the production models (I work there, not on gemini, but some of the internal docs aren't locked down as tightly as they should be). Gemini 1.0 used int8 QAT, there was at the time active research showing even 4 bit QAT is nearly identical to fp16. Dunno much about the current state of things, it may already be entirely 4 bit in production.
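The core trick is fake quantization with a straight-through estimator: the forward pass sees int8-rounded weights, while gradients flow to the fp weights as if the rounding were identity. A minimal torch sketch of the idea (illustrative only, not Google's or anyone's actual production recipe):

import torch

def fake_quant_int8(w: torch.Tensor) -> torch.Tensor:
    # Symmetric per-tensor int8: scale into [-127, 127], round, dequantize.
    scale = w.abs().max().clamp(min=1e-8) / 127.0
    q = torch.clamp(torch.round(w / scale), -127, 127) * scale
    # Straight-through estimator: forward uses q, backward treats rounding as identity.
    return w + (q - w).detach()

class QATLinear(torch.nn.Linear):
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.nn.functional.linear(x, fake_quant_int8(self.weight), self.bias)

After training like this the weights are already shaped for int8 rounding, so exporting the quantized model loses almost nothing.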
>>
>>101073772
same. I did 1 search and 2 clicks and it gave me a rate limit. it's stupid
>>
>>101073734
The rope base going to 5 million only goes so far. I found YaRN scaling to do better; it's only in original llama.cpp. It involves shifting some of the parameters, although I leave most of them at default: you change the rope frequency base like you do with other scaling, and set the rope frequency scale to 1/x, where x is the multiple of the new context vs the one the model was trained on. It hits some hard limit at the 30k level: needle-in-a-haystack works, but almost everything else doesn't, since conversation quality just degrades.
>>
>>101073744
How far it has evolved over time... I wonder if someone is using that box of P40s? And that one little guy left at the place.
>>
Recommend me the best uncensored model under 2GB file size. Want to run it on my phone with Maid.
>>
>>1010738651
Shit, somebody who knows how to set the proper yarn parameters and not just NTK via base and scale?
Teach me.
YaRN has what, 5 parameters?
>>
>>101073865
>>101073915
What the fuck did I even quote?
>>
Someone want to try the new Hermes 70b, better than official instruct? I'm too lazy
>>
>>101073885
Oh wow, you remember! Hopefully someone put them to use. I still have the P4. It's probably not hard to look through the exllamav2 repo issues and figure out who I am, if someone wants to ask for it. I don't have a use for it, but it's all set up with a directly-wired in fan.
>>
New Hermes just dropped
https://x.com/Teknium1/status/1803889137118048625
>>
File: GQizTK3akAAT0lk.jpg (126 KB, 1239x1239)
>>101073994
>>
>>101073960
>>101073994
>>101074011
if it's still 8k then I'm completely uninterested
>>
>>101073994
>merge in Llama-3 Instruct
What a waste.
>>
File: file.png (1.99 MB, 3276x2008)
https://livebench.ai/#
holy shit lmao
>>
File: file.png (17 KB, 474x350)
17 KB
17 KB PNG
lol, lmao even
>>
>>101072659
>CHECK! CHECK! CHECK! CHECK! GET MATED GET MATED GET MATED!
>>
>>101074074
wow Claude is killing it, good for them and fuck openai
>>
>>101073960
I would if I could download it, but I'm getting rate limited after viewing a single HF page. They need to fix their shit.
>>101074046
What is the deal with this? They did it with the 8b too, and this time they didn't even bother to release the un-merged version. Also related, I think: people are fine-tuning the instruct models rather than the base for most things. It's like everyone is collectively admitting that we can't beat the official instruct model; the best we can do is fine-tune on it or merge with it to hopefully improve some limited areas.
>>
File: 1704623359458124.png (686 KB, 1920x1080)
>>101074074
it's over for local shit geeeg
>>
>>101074074
holy fuck that's a fucking murder, gpt4's reign is fucking gone
>>
>>101074074
CuckedAI bros our response? CHADthropic did just destroy our best model...
>>
>>101074136
the scaled up version of 4o will thrash this desu
>>
>>101074074
God I love seeing openai get ruined
>>
>>101074146
Anon, that's Claude 3.5 Sonnet, meaning that they still haven't released the big gun (Opus), OpenAI is dead
>>
File: 48156 - SoyBooru.jpg (1.25 MB, 1929x3463)
>>101074108
What's over?
>>
>>101074128
and with their medium model (probably 70B), no less
>>
>>101074158
OpenAI still hasn't released GPT-4V.
>>
>>101074163
>pic
bro thinks he's Hajime Ippo kek
>>
>>101074098
I guess it's the millions for paid labelers, how is anyone supposed to match that, wasn't that actually higher than the training cost?
>>
>>101074146
4o was just a further finetuned then quantized version of gpt-4-turbo. this is just cope
>>
>>101074074
Open source... lost... again...
>>
>>101074190
it very obviously wasn't, the other modalities are native, you can't just finetune those in
it's a small prototype of their new arch for their next frontier model
>>
OpenAI has been distracted by turmoil in their company and key talent leaving, with Altman flying around trying to get safety laws enacted. I bet 4 is their peak, and now all the competitors start shitting all over them.
>>
>>101074200
Lost what?
>>
>>101074074
Everyone claiming "transformers are a dead end, all the models are plateauing at GPT-4 level" just BTFO'd. 3.5 Opus will be AGI. Screencap this.
>>
>>101074074
can't wait to see how it will fare on chatbot arena
https://chat.lmsys.org/
>>
>>101074226
The battle... the war...
>>
>>101074230
what if it's a BitNet model and because of that they were able to run a fucking 1T parameter giant model
>>
>>101074243
The war isn't over yet.
>>
File: file.png (11 KB, 345x166)
>>
If we're talking about corporate slop right now I have to say that GPT4|o writes really good song lyrics.
>>
>>101073994
Since when did Nous team go full Otaku. Why can't anyone be normal these days?
>>
>>101074164
>probably 70B
where's the source claiming the OG Sonnet is close to 70B?
>>
>>101074365
Do you even know what that word means?
>>
>>101074365
Go back
>>
>>101073788
google goes hard into sparsity so they do it a bit differently (TPUs have sparsecores for a reason)
>>
>>101074365
That's the other Hermes. Nous is a different group.
>>
File: ElvishLibrarianMiku.png (1.37 MB, 784x1264)
>>101073744
>Mikubox
Is yours the OG mikubox from the rentry?
>>
>>101074309
you will never have local gpt-4o or 3.5 sonnet.
>>
>>101074136
googlesissies... nobody cares about us... it never even began for us...
>>
>>101074475
and you will never be a real woman XD
>>
>>101074511
never said i am one. keep projecting though.
>>
If OpenAI's plan was to release GPT-5 in half a year, that's a bit late now.
>>
if i'm understanding this right, nvidia is the way to go for this stuff, and not amd?
there seems to be a bunch of gotchas and "yes but"s from what I'm researching.
>>
>>101074316
lmao, that's a new era yeah, the Claude era
>>
>>101074545
desu they managed to stay on the top for almost 2 years (december 2022 -> june 2024), I didn't expect them to hold for so long
>>
Will finetunes improve a lot with 3.5?
>>
>>101074628
With iteratively improved models though (at least according to benchmarks)
>>
>>101074659
of course, I was talking about the company itself staying on top
>>
>>101074669
Who else? I certainly didn't think Anthropic would, just because they supposedly don't want to push the frontier.
>>
>>101061658
### UPDATE ###
Made a control vector for partial uncucking. Wiz8x22 can now say "nigger" at 0 context.
>>
>>101074772
uncuck llama3 next
>>
File: Udio.jpg (15 KB, 360x360)
>>101074074
https://vocaroo.com/15Zdy0YyXYXK
>It's over for you ClosedAi
>Claude 3.5 Sonnet is now the king of Ai
>Let's hope open source model is gonna catch up
>My copium says Meta will soon be a matchup
>>
>>101074803
It's FAR from entirely uncucked. Still has a lot of refusals.
>>
>>101074772
UNCVCKED
>>
>>101074896
lmaoooo
>>
>>101074552
Yes.
Nvidia has all the support.
You can make AMD work for mostly everything but it'll be more work and/or worse.
>>
>>101074896
Suffers a bit from repetition, a common issue with control vectors.
>>
>>101074969
Solved by lowering the strength a bit.
>>
>>101073941
I leave most of them at default and it works well. The only one to change is --rope-freq-scale: for 2x scaling, in addition to setting the context to 16k, you set --rope-freq-scale to 0.5; for 4x, it's 32k context and --rope-freq-scale 0.25, and so on. You can change --yarn-orig-ctx to reflect the original context, but most of the time the training context is the same as the actual one, so it doesn't need to be set. The only other ones I sometimes tweak, without really getting better results, are --yarn-beta-slow and --yarn-beta-fast; outside of the 1.0 and 32.0 defaults respectively, raising or lowering them by an incremental amount does affect the generation, but not enough to make a meaningful difference.
>>
>>101073994
>>
>>101069634
I tried Claude 3.5 Sonnet just now. It seems about Opus level (so, the smartest fucking AI in the world) but faster and apparently cheaper. Used it for a python workflow.

OpenAI keeps on getting mogged by Anthropic.
>>
>>101075129
the true mog will come when Claude 3.5 Sonnet Opus is released, GPT4 will feel like a toy compared to this behemoth
>>
llama... 3.5...
>>
>>101075157
will be a bigger disappointment than CodeLlama
>>
ASI has been achieved internally at Anthropic
>>
>>101075149
>Claude 3.5 Sonnet Opus
I'm guessing you meant to say Claude 3.5 Opus. In which case, yes. Claude 3.5 Opus will probably be the smartest LLM, uncontested. I think the benchmarks are bullshit right now for even putting GPT-4o near Opus, since opus is way smarter.

But when 3.5 Opus comes out, it will be clear for everyone. Fuck the benchmarks.
>>
>>101075116
Aren't you just doing NTK aware RoPE at that point?
>--rope-freq-base N RoPE base frequency, used by NTK-aware scaling (default: loaded from model)
>--rope-freq-scale N RoPE frequency scaling factor, expands context by a factor of 1/N
That's from llama.cpp's help.

>, the only one to change is --rope-freq-scale and for 2x scaling
I remember using NTK scaling with freq-scale with llama2 and it introduced not so subtle artifacts when doing 4x context, which I don't see with llama3 and freq-base linear scaling.
Regardless, I'll give your method a try to see how well it works.
>>
>>101072633
>Given how fast it is, 3.5 Sonnet is probably around 70b parameters, or not much larger.
insane what stupid opinions you can read in this general
>>
>>101074230
because they are, lol
sure there will be improvements here and there but they are already plateauing
>>
>>101072748
>From their latest research, it seems that they've found a new approach using steering and MLA.

source? what are the benefits of steering / MLA versus other methods
>>
>>101075202
It's an advanced form of it; it builds on NTK-aware RoPE, if you read the paper.
>>
>>101074552
Yes. AMD works for some things too, but the moment you run into trouble you're gonna wish you had gone Nvidia. Everything in the AI space is built around it.
>>
>>101075306
I'm aware of that, I was just wondering about the actual parameters on llamacpp, since it makes no mention of YaRN in the description.
>>
>>101075370
why delet
>>
>>101075428
No documentation outside of the description given in the command line print of all the parameters, which kinda sucks. The paper published doesn't go into correlating this either. You get the following printout.
>--yarn-orig-ctx N YaRN: original context size of model (default: 0 = model training context size)
>--yarn-ext-factor N YaRN: extrapolation mix factor (default: 1.0, 0.0 = full interpolation)
>--yarn-attn-factor N YaRN: scale sqrt(t) or attention magnitude (default: 1.0)
>--yarn-beta-slow N YaRN: high correction dim or alpha (default: 1.0)
>--yarn-beta-fast N YaRN: low correction dim or beta (default: 32.0)
I had to piece it together what I know from Github discussions and other forums online. So an example would be this, from me running Stheno at 32k context.
>./llama-server -c 32768 --rope-scaling yarn --yarn-orig-ctx 8192 --rope-freq-scale 0.25 -t 32 -tb 16 --no-mmap -nkvo -ngl 33 -m models/L3-8B-Stheno-v3.2-Q8.gguf
You can ignore everything after the 0.25 but that is literally it for how I use it.
>>
>>101075586
Alright, thank you for posting your specific settings, I'll give those a try.
>>
>>101074772
https://files.catbox.moe/ht0c30.gguf
Performs better in practice with evil chars than vanilla wizard, but does not remove the slop. For zero-context 0.6, for roleplay 0.4.
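For anyone who hasn't used one: on llama.cpp builds with control vector support, applying it at a given strength looks roughly like this (flags from memory, check your build's --help; the model filename is just an example):
./llama-server -m wizardlm2-8x22b.Q4_K_M.gguf --control-vector-scaled ht0c30.gguf 0.4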
>>
>>101075630
how much ram/vram do you need to create and use vectors?
>>
>>101075644
Same as the quantized model.
>>
>>101075524
Sorry, I reflexively felt retarded for not noticing it was linked in the last thread.
I think attributes are an interesting concept though. Imagine if you could set verbosity to 0 with one example output and minimal prompting, and it would just know to stick to the format without any urge to include a note.
>>
took a break from local llms for a bit, what are some model recommendations for a 12 vram and 8 vram dual card setup?
>>
>>101074074
Do people here trust this benchmark that much? It sounds good from the short description, but have people here actually gone and looked at the benchmark itself? Has anyone reproduced it?
>>
>>101075880
lol
lmao
>>
i'm working on a local model UI and i see like 100 existing ones on the ollama github page. are there any i should check out in particular? i'm building something specifically to explore prompting/sampler parameters, so if there are some with particularly good power-user features i'd love some pointers
>>
>>101070503
>Innuendos getting past filters
Ironically since Meta tuned this on code and because code makes the AI smarter wouldn't it be good at reading between the lines then?
>It doesn't matter because most of the datasets are shitty ERP logs
>Greatodaze.png
>>
>>101070576
>NAI V3 and furry gets BTFO'D by Pony
>Kayra no smarter than OPT and dumber than LLAMA-1
>Not to mention it gets scenes completely wrong and focuses on the wrong things
>Max token of 8192
NAI is a sinking company and you know it
>>
>>101075880
Retards here can't read so they need big bars to understand which model is better
>>
>>101075946
You've clearly never had a NAI subscription, retard.
>>
>>101075946
he knows, that's the crossposter fag from /aids/
>>
>>101075946
cope
>>
File: 1692238539031487.png (38 KB, 998x459)
>>101074772
so, control vectors are basically a true LoRA for llms? if we can uncuck with them, then we can also teach new stuff, right?
if uncucking + new knowledge can be packed into one control vector, then it's definitely a hugeburger; everyone will get whatever they want. or not, not gonna chug on that hopium.
>>
>>101076324
no
>>
Hi, bit of a tourist here. Are there any local models without the positivity bias that chatgpt has? Getting tired of asking "Is it possible for x to do y?" and getting back a "Yes! Here's how: <complete bs>".

I don't mind if it has a lower rate at actually figuring out answers, I just want it to say when it doesn't know something.
>>
>>101076348
k then, if it works for uncucking, it's still fine.
>>
>>101076324
No, you can't add new knowledge with control vectors, sorry to disappoint you. They can, however, change the speech style of the model or steer it towards or away from topics.
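Mechanically a control vector is just a fixed direction added to a layer's hidden states at inference, scaled by a strength knob, which is why it can shift style or steer topics but can't encode new facts. A toy torch sketch of the application side (extracting the vector, e.g. as a mean activation difference over contrasting prompts, happens offline):

import torch

def add_control_hook(layer: torch.nn.Module, vec: torch.Tensor, strength: float):
    # Adds strength * vec to this layer's hidden-state output on every forward pass.
    def hook(module, inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output
        hidden = hidden + strength * vec.to(hidden.dtype)
        return (hidden, *output[1:]) if isinstance(output, tuple) else hidden
    return layer.register_forward_hook(hook)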
>>
>we went from CFG to Control Vectors to Orthogonalization to Abliteration back to Control Vectors
two more weeks and people will rediscover cfg
>>
>>101076386
we aren't going back to cfg when it is so vram-intensive.
>>
>>101076386
Unlike cfg, control vectors and abliteration don't slow down the model.
>>
File: Comparison.png (1.59 MB, 1920x1080)
>>101076021
>>101076095
>>101076182
Here's NAI and HyperMantis (L1 model)
>>
>>101076365
Reliably stopping LLMs from bullshitting through a fake answer is an ongoing research problem and isn't what people are complaining about when they mention positivity bias.

In other words, no.
>>
>>101076582
Oh, ok. Thank you for answering.
>>
>>101076386
control vectors make the model unusably schizo and retarded, they're a complete meme
>>
>>101072773
This. Honestly it's even an indirect benefit for local, because it compromises the market share leader. Everyone knows about ChatGPT. The more that realize there are other options, local or cloud, the better. And yeah a less censored leader would be better, as then others could follow their example, rather than ClosedAI's.
>>
>>101069376
muramasa
>>
>>101076701
Anthropic isn't any better though, and I would say they are worse if only because they are even more secretive than OpenAI even if they don't have their misnomer of a name. Look at their HuggingFace page.
https://huggingface.co/Anthropic
They have a few shitty datasets and that is it. At least OpenAI has Whisper as a space on it.
https://huggingface.co/openai
The market didn't really need something like this because everyone knows that all the major computer vendors are releasing AI models.
>>
Qwen 2.0 72b base instruct gave me a refusal, which really pissed me off. However, the Tess finetune might be the smartest model I've used so far. I have this huge RP that I've tabled because every model (Mixtral, Yi, Euryale, Miqu...) brings back an enemy I defeated in the dream world, the stupidity of which turns me off. Tess-v2.5.2-Qwen2-72B-IQ4_XS.gguf never has her show up, in contrast. It does run a bit slow on my rig, unfortunately.
>>
File: GDEMji8WwAAe7P8.jpg (65 KB, 715x715)
>>101070155
too innocent
there's no points of interest to traverse in a mind untouched by suffering and its schisms.

too predictable
the only way to expose your subjectives to any stimulating emotions through such medium would be to hurt her.


and why, for the cycle of ache to find beauty only in its own reflection?
>>
>>101076822
that's a lot of words to say the pupils are fucked
>>
>>101076324
>this is the average /lmg/ anon
grim
>>
ah shit, I was sceptical but sonnet 3.5 actually does seem smarter than opus and gpt, and it's really fast and cheap too
I think local is kill again
>>
Has anyone experimented with flowise or langflow?
Is it worth looking into, just to play around with?
>>
hello
i am tourist from /aids/
i wish to coom (locally)
i have a puny computer with 32gb ram and 8gb vram
which model should i use
>>
>>101077267
anon, you are not fooling anyone, stop with these dumb questions
>>
>>101077006
of course lol, thats why i said "not gonna chug on that hopium", and then got proven wrong immediately.
>>
>>101077293
No I literally was asking, is there a local model that will put out acceptable prose storywriting performance with the hardware I have access to? I promise you I am exactly as stupid as I sound.
>>
File: IMG_7952.jpg (286 KB, 951x1440)
>>101068362
Sure, when I want to update llama-cpp-python it's
>pull+merge llama-cpp-python
>pull+merge llama-cpp-python/vendor/llama.cpp
>cd back to llama-cpp-python
>activate venv
>CMAKE_ARGS="-DLLAMA_CUDA=on" pip install -e .

Probably best to remove any existing llama-cpp-python from the venv before installing from your source dir the first time. Ooba names them differently, so pip list | grep llama, then pip uninstall ...
Keep >>101068915 on hand in case it gets wonky; I think I needed that one time.
Yeah, I pull HEAD of llama.cpp; there's also 'git submodule update' to use the commit referenced by the parent repo, but I can never be arsed to remember the syntax heh
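Condensed, the whole dance is roughly this (paths and venv names are whatever yours are):

cd llama-cpp-python && git pull
cd vendor/llama.cpp && git pull origin master   # or: git submodule update --remote
cd ../..
source venv/bin/activate
pip uninstall -y llama_cpp_python llama_cpp_python_cuda
CMAKE_ARGS="-DLLAMA_CUDA=on" pip install -e .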
>>
>>101077372
you can check mixtral, or that wizard22b
>>
>>101077267
koboldcpp stheno v3.2 gguf Q8 or Q6 if you want to use 32k context.
>>
>>101077068
Splendid! Now you can fuck off back to /aicg/.
>>
Does llama-server not have an OpenAI-compatible general completions API? It looks like it has its own JSON response format, and only the chat completions endpoint is OAI-compatible, for some reason.
>>
>>101074474
Yep. That was last winter. Hopefully scalable AVX512 Xeons are now a more reasonable option. At the time AVX2 and DDR4 was the sweet spot.
>>
>>101075880
Benchmark is bullshit, opussy still king.
>>
File: capybara-bath.webm (853 KB, 360x360)
>>101077433
Thank you for the guide!
You inadvertently set me down a path of trying to compete with some ML researcher at google.
>>
B-bros... Hermes 70b is pretty fuckin good for (E)RP. Still feels vaguely Instruct-ish, same intelligence (maybe better), but much more neutral, less overcooked, if that makes sense. They may have actually fixed llama 3. A lightweight RP tune on top of this and it's 10/10.
>>
So I have some local models running, but I was wondering if it is possible to build a model or whatever based on a selected set of texts that the LLM will reference when I ask it questions?
>>
>>101075880
Yeah, pretty much every other benchmark is slop except arguably Chatbot Arena since it's just pure human preferences.
>>
>>101077784
You're looking for RAG.
You can ingest your data into a DB, and then have the LLM prompt be fed with the relevant context from your DB and inserted into the final prompt.
https://arxiv.org/abs/2312.10997
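The whole idea fits in a few lines. A toy sketch with a stand-in embedding (swap embed() for a real embedding model and the array for a proper vector DB once your corpus grows):

import numpy as np

def embed(text: str, dim: int = 256) -> np.ndarray:
    # Toy hashing bag-of-words, purely a placeholder for a real embedding model.
    v = np.zeros(dim)
    for tok in text.lower().split():
        v[hash(tok) % dim] += 1.0
    n = np.linalg.norm(v)
    return v / n if n else v

docs = ["your reference texts go here", "one entry per chunk"]
doc_vecs = np.stack([embed(d) for d in docs])

def build_prompt(question: str, k: int = 2) -> str:
    sims = doc_vecs @ embed(question)  # unit-norm vectors, so this is cosine similarity
    context = "\n\n".join(docs[i] for i in np.argsort(sims)[::-1][:k])
    return f"Use the context below to answer.\n\n{context}\n\nQuestion: {question}"

The string build_prompt() returns is what you actually send to the model.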
>>
>>101077781
low weight euryale merge
>>
>>101077784
If you are using Silly Tavern as a frontend, then you can use the databank functionality or the lorebooks/worldbook.
>>
>>101077822
>>101077848
Thanks a bunch!
>>
>>101072677
>For all we know it's a 500B quant,
Way too fast for that + quants make no sense to deploy at scale compared to full precision due to the throughput decrease, and all evaluations (including blind preference) have the original Sonnet + L3 70b almost completely dead even / at parity with each other in performance.
This is a good thing, though, because it means what 3.5 Sonnet can do is plausibly achievable on local hardware within the next year.
>>
>>101077781
how does it compare to midnight miqu 70b 1.5? i've been cooming relentlessly to that one
>>
>>101063442
This removes a big part of the assistant slop, but does not change the writing style, strangely. It must be that the assistant vector is so strong that it overwhelms everything else.
>>
>>101077781
is there a ~2.8bpw exl2 quant available for this, i tried searching on hf but couldn't find anything, trying to run this stuff with 32gb vram
>>
>>101078337
nta can you write rentry4retards on these vectors?
just key points on how to make it right and what needs to be installed.
>>
>>101078131
>Way too fast for that
assuming it's a dense model, which it likely is not.
>>
New top 8B on ugi leaderboard
>L3-Umbral-Mind-RP-v1.0-8B
>The goal of this merge was to make an RP model better suited for role-plays with heavy themes such as but not limited to:
>Mental illness
>Self-harm
>Trauma
>Suicide
Yeah that sounds about right
>>
>>101078448
Why do you think it's not a dense model?
>>
>>101078481
Why do you think it is?
>>
>>101075586
>>101075614
Okay, yeah, from a subjective point of view, that does seem to make outputs slightly better.
It also seems to make inference a tad slower, but I'm fine with that.
Will do some more comparisons with a more empty context instead of a full 32k context and see how it behaves.
Thanks anon.
>>
>>101078131
I can pretty much guarantee you that none of the SaaS models are being served in FP16. The vast majority will be 8 bit, maybe a couple of the sloppier models are in 4 bit.
>>
>>101078223
I mean, it's better overall, still less horny though. But I don't RP with coombots and value intelligence and ability to handle odd scenarios very highly, so I think even official Instruct is better than any miqu variant.
>>101078340
Don't know, I made my own 8bpw quant. 2.8bpw would be well into brain damaged territory I would think. Is that really better than running some mixtral model with CPU offloading?
>>
>>101078496
Don't answer a question with a question.
>>
>>101078630
It was a rhetorical question.
>>
File: file.png (5 KB, 312x293)
>>101077777
do we still check numbers on this site
>>
>>101078767
No it wasn't.
>>
File: 1444565944994.jpg (42 KB, 544x499)
>>101078838
I thought someone else would do it. But now that I've come back, no one did. Guess I'll do it.

>>101077777
Checked. You're making the job.
>>
>>101077391
>all this just to get an app
how do people get conned into using this trash
>>
>>101078946
Yes, it was.
>>
>>101078357
https://rentry.org/controlvectors4retards
>>
>>101077391
>>101079068
Is it ever better to "update" than to just rename the old directory to set it aside and reinstall fresh?

Especially anything with Python in it, "update" seems to mean "destroy everything in never before imagined ways."
>>
Not sure what to make of sonnet 3.5.
On one hand it's really, really cucked. Definitely more than Opus 3.

But it's the first time I had a model reverse course somewhat and correct its moral lecturing after a simple question.
Can't explain it well, but it's like the models so far all disregard the user's opinion. This feels different.

Also, I like to test models with Akinator-style games. It's doing really well. Not many can find Jade from DQ11.
>>
>>101079178
And it passes this.
Claude3 only Opus could do it.
Only a handful local models can do it. Gpt4o fails too.
Its really fast so there must be some sparsity stuff that we dont know about.
Hope local eats good too and we get smaller models with more intelligence.
>>
>>101077417
>or Q6 if you want to use 32k context.
Does it really not sperg out after like 8k context? How is that possible?
>>
>>101079201
nice
>>
File: file.png (15 KB, 886x125)
>>101079276
NTA, wonder if this helps
>>
>>101079178
Most people use it through the API, with the ability to control the system prompt and the prefill...
>>
>>101079351
>ooba user
its over
>>
>>101079351
and thats just an update for an existing feature, kcpp has been good about auto choosing settings for a while
>>
Nemotron gguf status?
>>
>>101075267
https://www.anthropic.com/research/mapping-mind-language-model
this is for steering, I was confusing MLA w/ DeepSeek's research.
>>
File: GopnistaMiku.png (1.29 MB, 1168x880)
>>101077391
Thanks. The venv uninstall of the seemingly random llama_cpp_python_cuda library allowed me to use modern llama.cpp with ooba and get deepseek running in it.
Have a gopnik miku for your trouble
>>
>>101078593
2.4bpw midnightmiqu 70b w/ 32k context is the best horny model i've ever used, it doesn't seem particularly braindamaged to me, rarely loses track of the conversation because of the huge context window which is the biggest issue i've had with other models
>>
Sonnet is really smart.
>>
>>101079485
>2.4bpw midnightmiqu
>>
>>101079111
>2. Open llama.cpp\examples\cvector-generator\cvector-generator.cpp and change return persona + " " + suffix; to return persona + " " + suffix;
>return persona + " " + suffix;
>to
>return persona + " " + suffix;
ok...
>>
>>101079738
the absolute state
>>
File: Äå_0001.jpg (630 KB, 1984x1600)
>>101069457
What are some good OCR models for Japanese? I want to translate a manga that hasn't received any translation beyond volume 2.
>>
File: 1718947308499238.png (1.62 MB, 3276x2008)
Dario.... wonned
>>
>>101079746
I don't know what the intention was there. I suppose it could have been removing the space and fumbled the copy-paste.
>>
>>101079763
i don't see how it's in any way fair to compare API models and "weights only" models. There can be all sorts of extra services interfacing with the model behind the scenes: system prompts, external software, etc. If you evaluated strictly the LLM portion of 4o you might well find it's pretty retarded by itself. AFAIK it's literally calling Wolfram Alpha for math shit; how is a vanilla LLM supposed to compete with that?
>>
>>101079763
>GPT4O gets worse at reasoning and coding
>Sonnet takes a big fat fucking jump
I kneel
>>
>>101079763
I don't care if Anthropic wins, I just want OpenAI to lose.

Unlike OpenAI, Anthropic doesn't try to take away our local models out of fear of losing customers.
>>
>>101079884
I am pretty sure 4o isn't calling Wolfram; perhaps the ChatGPT website is calling it.

And even then, that is smart. LLMs don't need to be good at solving maths: having the LLM understand the problem and prompt Wolfram, a dedicated program, is better, since Wolfram will always beat an LLM at maths.
>>
>>101079365
I'm not using closed-source models for anything but work, coding, etc.
I got a creepy pedo warning message at the start of ChatGPT, back when OpenAI was also in the news for automatically forwarding problematic requests to some child protection center.
All I said was "you are my stereotypical anime imouto, call me onii-chan". In cases like this you have to trust that some guy in front of a PC doesn't escalate. Scary thought.
Also, what's legal today might not be in a couple of years. I don't trust these idiots.

Guess what I wanted to say is that, rather than RP quality in Silly or whatever, I'm interested in where alignment is headed directionally.
Claude 3 was the first to pull in the other direction, with a recent blog post signaling they want to move away from it.
Pic related is a very bad step backwards. If I say "you are a guy", Sonnet 3.5 complies.

The saddest part is that local models are more cucked than ever. Worse than closed.
It's funny that the Chinese are actually dialing it back. lol
>>
File: tq8b05cgeiw61.jpg (103 KB, 639x397)
>>101079755
None.
Sonnet 3.5 seems state of the art if you believe X.
It still gets kanjis wrong. I tested pc-98 though. They are hard to read. We are still not there yet.
>>
>>101079755
>>101080120
https://x.com/dylfreed/status/1803502158672761113

Maybe try Florence 2?

https://huggingface.co/spaces/gokaygokay/Florence-2

Demo.
>>
>>101080108
>whats legal today might not be in a couple years
I heard Canada is already digging into leafs' histories looking for any wrongthinks that are still online so they can Protect the Protected Classes from Hate.
>>
*saunters*
>>
>>101078450
tuned on r/TRAAAAANS
>>
*doesn't bite... much*
>>
SOTA (Shit of the Ass)
>>
>>101077068
Not really. /aicg/ already tested it and it doesn't seem smarter than regular sonnet for RP at least.
>>
>>101079755
https://github.com/kha-white/manga-ocr
>>
VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs
https://arxiv.org/abs/2406.07476
>In this paper, we present the VideoLLaMA 2, a set of Video Large Language Models (Video-LLMs) designed to enhance spatial-temporal modeling and audio understanding in video and audio-oriented tasks. Building upon its predecessor, VideoLLaMA 2 incorporates a tailor-made Spatial-Temporal Convolution (STC) connector, which effectively captures the intricate spatial and temporal dynamics of video data. Additionally, we integrate an Audio Branch into the model through joint training, thereby enriching the multimodal understanding capabilities of the model by seamlessly incorporating audio cues. Comprehensive evaluations on multiple-choice video question answering (MC-VQA), open-ended video question answering (OE-VQA), and video captioning (VC) tasks demonstrate that VideoLLaMA 2 consistently achieves competitive results among open-source models and even gets close to some proprietary models on several benchmarks. Furthermore, VideoLLaMA 2 exhibits reasonable improvements in audio-only and audio-video question-answering (AQA & OE-AVQA) benchmarks over existing models. These advancements underline VideoLLaMA 2's superior performance in multimodal comprehension, setting a new standard for intelligent video analysis systems
https://github.com/DAMO-NLP-SG/VideoLLaMA2
https://huggingface.co/collections/DAMO-NLP-SG/videollama-2-6669b6b6f0493188305c87ed
some video/audio understanding llms
>>
>>101080453
It's 100% smarter than Sonnet; it's the new SOTA, easily.

/aicg/ is just a bunch of retards and trolls (this place has plenty of promptlets too). It's only second to Opus in RP; the creativity of a 5x higher parameter count can't be replicated.
>>
>>101080254
No good.
>>
>>101080509
Did you use it? I can't trust benchmarks for RP quality
>>
>>101079755
can try minicpm2.5 or glm4 as well
https://github.com/THUDM/GLM-4/blob/main/README_en.md
>This generation of models has added multi-language support, supporting 26 languages including Japanese, Korean, and German.
https://github.com/OpenBMB/MiniCPM-V
>>
>>101080453
It's very smart and very good. It can create Mario clones in HTML5, etc. Insane what some people have cooked up with it already.
I made an idle clicker game zero-shot. It just works.
But it's very cucked, like I wrote earlier. They absolutely don't want people using this model for RP.
Don't know why though, that's what Claude was known for lol
>>
>>101080528
Yes, I even tried the isekai girl prompt posted a few posts above on their website, just added "roleplay request" before it.
I'm mobile posting though, so I can't even take a screenshot.
>>
>>101080525
nta, you should be cropping the image before trying to process it, you could do it automatically if the text box never moves. the text appears to all be one color as well so it might be easily extractable as text rather than trying to feed the model an image to translate
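With the manga-ocr repo linked earlier in the thread, that's only a few lines (if memory serves, MangaOcr takes a PIL image directly; the box coordinates below are made up and depend on where the game draws its textbox):

from PIL import Image
from manga_ocr import MangaOcr

mocr = MangaOcr()  # downloads its model on first run

def read_textbox(path: str, box=(32, 280, 608, 392)) -> str:
    # box = (left, top, right, bottom) of the fixed dialogue window
    return mocr(Image.open(path).crop(box))

print(read_textbox("screenshot.png"))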
>>
>>101080536
Are you paying for it? I'd like to give it a try for some coding projects
>>
>>101080453
No, it's just the people in the paid proxies and secret clubs shitting on it because they don't like when most people are able to use something good.
>>
File: bnbPdJ2P_NRJFKLz_1.webm (3.91 MB, 572x360)
>>101080551
Yeah, with poe.
I like to use different models for work if one can't get the answer.
>>
File: Untitled.png (215 KB, 1317x918)
ReaLHF: Optimized RLHF Training for Large Language Models through Parameter Reallocation
https://arxiv.org/abs/2406.14088
>Reinforcement Learning from Human Feedback (RLHF) stands as a pivotal technique in empowering large language model (LLM) applications. Since RLHF involves diverse computational workloads and intricate dependencies among multiple LLMs, directly adopting parallelization techniques from supervised training can result in sub-optimal performance. To overcome this limitation, we propose a novel approach named parameter ReaLlocation, which dynamically redistributes LLM parameters in the cluster and adapts parallelization strategies during training. Building upon this idea, we introduce ReaLHF, a pioneering system capable of automatically discovering and running efficient execution plans for RLHF training given the desired algorithmic and hardware configurations. ReaLHF formulates the execution plan for RLHF as an augmented dataflow graph. Based on this formulation, ReaLHF employs a tailored search algorithm with a lightweight cost estimator to discover an efficient execution plan. Subsequently, the runtime engine deploys the selected plan by effectively parallelizing computations and redistributing parameters. We evaluate ReaLHF on the LLaMA-2 models with up to 4×70 billion parameters and 128 GPUs. The experiment results showcase ReaLHF's substantial speedups of 2.0−10.6× compared to baselines. Furthermore, the execution plans generated by ReaLHF exhibit an average of 26% performance improvement over heuristic approaches based on Megatron-LM.
https://github.com/openpsi-project/ReaLHF
big improvement. in case anyone wants to take advantage
>>
>>101080453
The consensus is something like
opus > sonnet 3.5 > sonnet > gptslop
for RP
>>
>>101080550
Cropping didn't help either.
Guess playing with grayscale and brightness is an option.
But then it's the same as traditional OCR extraction.
The closed source models all come really close with some mistakes. No cropping needed.
This is very bad.
>>
>>101080674
https://modelscope.cn/studios/ZhipuAI/glm-4v-9b-Demo/summary
think you need to make a modelscope account to see it. maybe a chinese vpn
>>
File: Untitled.png (80 KB, 568x531)
DeciMamba: Exploring the Length Extrapolation Potential of Mamba
https://arxiv.org/abs/2406.14528
>Long-range sequence processing poses a significant challenge for Transformers due to their quadratic complexity in input length. A promising alternative is Mamba, which demonstrates high performance and achieves Transformer-level capabilities while requiring substantially fewer computational resources. In this paper we explore the length-generalization capabilities of Mamba, which we find to be relatively limited. Through a series of visualizations and analyses we identify that the limitations arise from a restricted effective receptive field, dictated by the sequence length used during training. To address this constraint, we introduce DeciMamba, a context-extension method specifically designed for Mamba. This mechanism, built on top of a hidden filtering mechanism embedded within the S6 layer, enables the trained model to extrapolate well even without additional training. Empirical experiments over real-world long-range NLP tasks show that DeciMamba can extrapolate to context lengths that are 25x times longer than the ones seen during training, and does so without utilizing additional computational resources.
https://github.com/assafbk/DeciMamba
no code posted yet. neat. maybe this means the RAG retrieval model should be Mamba-based, feeding into a higher-quality transformer
>>
Can any ollama users who also have used ooba tell me if ollama is good or not?
>>
>>101080908
I haven't used either but I can unequivocally say that both are shit
>>
>>101080593
Okay I just tested it and you were right, it's smarter than GPT4 (building ml algos from papers). Too bad the usage limits for Poe are too low compared to oai for the price
>>
>>101080536
I've been using Sonnet 3.5 to coom. It doesn't seem very cucked.
>>
>>101080944
Took a couple of tries, but I could dump the Poe documentation on the 200k bot and it wrote me a Python script for chatting with Sonnet through the terminal and an API key.
GPT-4 always runs around in circles if it gets stuff wrong. There is an improvement with Sonnet 3.5 that the benchmarks don't show.
If we could have this level locally without the $$ costs, I'm sure you could automate a lot of shit.
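The script itself is nothing exotic; against Anthropic's API directly (rather than Poe's, whose client I won't guess at) the skeleton is roughly this, with ANTHROPIC_API_KEY set in the environment:

from anthropic import Anthropic

client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment
history = []
while True:
    history.append({"role": "user", "content": input("> ")})
    resp = client.messages.create(
        model="claude-3-5-sonnet-20240620",
        max_tokens=1024,
        messages=history,
    )
    reply = resp.content[0].text
    history.append({"role": "assistant", "content": reply})
    print(reply)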
>>
>>101079738
By retards, for retards. Thanks for pointing that out, fixed that.
>>
Qwen 72b is about the same as Llama 3, Mixtral 8x22 and Command-r+. Who will release a better one first, at similar parameter count?
>>
Bitnet when
>>
>>101079755
Google Lens
>>
>>101081366
Now @gpt-4o and @sonnet-3.5
>>
>>101081348
>better one
Cohere
>at similar parameter count
dbrx was testing dbrx-next... It was still tuned on slop though. They likely haven't learned their lesson from the first one: don't give your model a shitty official tune and a too-restrictive license.
>>
>>101081339
You should submit the patch. It'll fuck up most formats.
>>
File: 1707143307717497.jpg (183 KB, 700x678)
Cloud for serious work
Local for RP
It's that easy.
>>
>>101081415
For me GPT was always much better at following instructions than Opus
>>
>>101069718
yeah, thank you for looking into it
>>
>>101081406
Here we go again... Get ready to vomit.
https://github.com/ggerganov/llama.cpp/pull/8052
>>
so what's the best base model to be training to generate myself so i can make photos of me getting pegged by lucy liu

using Pony Realism off of Civit at the moment and it's alrite, but what if i wanna go more degenerate.

using runpod to train then generating at home but that doesn't matter

i should've posted this in /sdg/ ignore me
>>
>>101081651
That's an easy way to get ignored.
>>
>>101081709
If only this general had the common sense to ignore him too
>>
>>101081651
Kek I'm sure he's enjoying it
>>
Did someone manage to generate images with chameleon by now?
>>
>>101081651
wtf
>>
Is Anthropic run by goyim? I refuse to work for Mossad operatives.
>>
>>101081944
You're in one, though
>>
>>101081805
How about you contribute something? No, nothing at all? No cock? Lost your cock and balls? Shut the fuck up then.
>>
>>101081984
>>101081984
>>101081984


