/g/ - Technology




/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>101514682 & >>101507132

►News
>(07/22) llamanon leaks 405B base model: https://files.catbox.moe/d88djr.torrent >>101516633
>(07/18) Improved DeepSeek-V2-Chat 236B: https://hf.co/deepseek-ai/DeepSeek-V2-Chat-0628
>(07/18) Mistral NeMo 12B base & instruct with 128k context: https://mistral.ai/news/mistral-nemo/
>(07/16) Codestral Mamba, tested up to 256k context: https://hf.co/mistralai/mamba-codestral-7B-v0.1
>(07/16) MathΣtral Instruct based on Mistral 7B: https://hf.co/mistralai/mathstral-7B-v0.1

►News Archive: https://rentry.org/lmg-news-archive
►FAQ: https://wikia.schneedc.com
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Programming: https://hf.co/spaces/bigcode/bigcode-models-leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
File: 1689703011561.jpg (29 KB, 480x360)
►Recent Highlights from the Previous Thread: >>101514682

--Llama 3.1 benchmarks: >>101521353 >>101521436 >>101521438 >>101521521 >>101521558 >>101521590 >>101521501
--Llama 3 405B leaked base model discussion and distribution: >>101516633 >>101516705 >>101516732 >>101517331 >>101517074 >>101517390 >>101518290 >>101518318
--Techniques used for 12B model and multilingual model comparison: >>101515246 >>101515342
--Mistral-Nemo 128k context and its limitations: >>101519084 >>101519151 >>101519185 >>101519227
--Can language models do search?: >>101514990 >>101515020 >>101517627
--GGUF vs EXL2 performance and Llama-related tools discussion: >>101520034 >>101520119 >>101520143 >>101520235 >>101520710 >>101521084 >>101521257 >>101521405 >>101521513
--Speculation of a Meta insider and discussion of Yann LeCun's V-JEPA project: >>101517185 >>101517294 >>101517303 >>101517330 >>101517236 >>101517252 >>101517272
--Performance comparison between a 27 billion parameter model and WizardLM 8x22B: >>101515687 >>101515961
--Nemo and ooba compatibility, software interface for model deployment: >>101516907 >>101516947
--Flash attention works with exllama, llama.cpp, and gemma-27b: >>101515377 >>101518607
--Anthropic's Sonnet improvement and theories of API model degradation: >>101515487 >>101519037 >>101519196
--Anon celebrates Nemo support in llama.cpp: >>101517538 >>101517574 >>101519972 >>101520070
--Miku (free space): >>101519212 >>101519882 >>101516683

►Recent Highlight Posts from the Previous Thread: >>101515008
>>
Remember when Miqu was short for mistral quant?
>>
reminder that 60% of posters are underage coomer tourists
>>
so, 405B didn't improve from 2 months ago.
How over is it?
>>
>>101521779
wrong, I came here from Twitter after seeing the llama leak
>>
>>101521779
>he expects us to actually hang out here when miqu1 that was released last year is still better than miqu2
>>
>>101521800
Stop being a retard. Those are base model numbers, and they're ahead of GPT4 in most cases, including the 70B 3.1.
>>
Will column-r save us from meta's incompetence?
>>
>>101521762
terrible recap
>>
File: L3.1-benches.png (26 KB, 1293x101)
https://github.com/Azure/azureml-assets/pull/3180/files
A bit disappointing, but still good.
>>
>>101521839
>a bit
>>
>>101521762
Keep it up Recap-kun we love you!
>>
>>101521826
It's instruct.
>>
Where are the Instruct scores?
>>
>>101521839
yeah, I would have expected more of an improvement from a doubling of params, but maybe it just means that the larger model has more memory and needs to be trained for longer to make better use of it?
>>
File: ethical.png (106 KB, 893x435)
Llama 3.1 is going to be more ethical, fair and inclusive.

source: https://old.reddit.com/r/LocalLLaMA/comments/1e9hg7g/azure_llama_31_benchmarks/
>>
Is there a leaderboard that has the benchmarks for all LLMs, proprietary and free? The openLLM leaderboard only has open-weight models, and LMSys only has mt-bench and mmlu.
>>
I remember when recap anon used miku images the quality was so much better
>>
>>101521839
The community will fix it.
>>
>>101521839
no mmlu normal or mmlu pro?
>>
>>101521854
>>101521855
we only got the base as the leaked model right?
>>
>>101521826
If the base model is bad, the instruct will be bad too. Good models don't even need that stupid base/instruct distinction and can be used both ways, like the Cohere ones.
>>
>>101521860
I would rather be cucked using API models than using local models.
>>
>>101521921
>Good models don't even need that stupid base/instruct distinction and can be used both ways, like Cohere ones.
wait what? you can use base cohere models normally? it works?
>>
>>101521921
holy shit you're brain dead, lurk more
>>
>>101521513
Just tested with bigger context, asked to summarize this https://transformer-circuits.pub/2024/scaling-monosemanticity/index.html

38K context

>exl2 8.0bpw
Metrics: 256 tokens generated in 17.42 seconds (Queue: 0.0 s, Process: 87 cached tokens and 38615 new tokens at 5237.72 T/s, Generate: 25.49 T/s, Context: 38702 tokens)
14.3 t/s on ST

>gguf Q8_0
prompt eval time = 21019.59 ms / 38702 tokens ( 0.54 ms per token, 1841.24 tokens per second) | tid="131307092631552" timestamp=1721667015 id_slot=0 id_task=0 t_prompt_processing=21019.586 n_prompt_tokens_processed=38702 t_token=0.5431136892150277 n_tokens_second=1841.2351223282894
generation eval time = 14004.10 ms / 318 runs ( 44.04 ms per token, 22.71 tokens per second) | tid="131307092631552" timestamp=1721667015 id_slot=0 id_task=0 t_token_generation=14004.103 n_decoded=318 t_token=44.03805974842767 n_tokens_second=22.707630756500436
9.0 t/s on ST

The gguf quant didn't respond correctly, it cited other papers.

exl2 is definitely better at higher contexts
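For anyone checking the math: the backend numbers are generation-only, while the ST numbers are end-to-end (prompt processing included), which is why they look lower. A quick recompute from the values in the logs above:

# Sanity-checking the reported throughput; numbers taken straight from the logs above.
exl2_end_to_end = 256 / 17.42     # ~14.7 t/s, close to the ~14.3 t/s ST reports
gguf_prompt     = 38702 / 21.020  # ~1841 t/s prompt eval
gguf_generation = 318 / 14.004    # ~22.7 t/s generation only
print(exl2_end_to_end, gguf_prompt, gguf_generation)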
>>
>>101521917
Actually idk. Whatever, I'll wait for the actual announcement from Meta.
>>
>>101521917
no, we got instruct.
>>
>>101521939
I use Command models as completion and they just work.
>>
Is the announcement tomorrow or today?
>>
>>101521863
https://livebench.ai/ is good and the results align with my personal experience
>>
>>101521963
All models "just work" for completion...
>>
>>101521957
we definitely got the base only
>>
>>101521980
instruct afaict
>>
>>101522004
how do you know that?
>>
Man, Mistral Nemo is crazy sensitive to the instruct template if you are doing any sort of more complicated prompting.
Dayum.
But when it works, it seemingly works pretty well.
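For reference, the Mistral-style template it expects looks roughly like the sketch below. The exact whitespace and BOS/EOS handling here are assumptions that differ between tokenizer versions, so verify against the chat_template actually shipped with the model:

# Minimal sketch of a Mistral-style [INST] prompt builder (spacing is an assumption).
def build_prompt(turns, system=None):
    # turns: list of (user, assistant) pairs; assistant is None for the final turn
    out = "<s>"
    for i, (user, assistant) in enumerate(turns):
        if i == 0 and system:
            user = system + "\n\n" + user  # common convention: fold system into first user turn
        out += f"[INST] {user} [/INST]"
        if assistant is not None:
            out += f" {assistant}</s>"
    return out

print(build_prompt([("Hello, who are you?", None)]))
# -> <s>[INST] Hello, who are you? [/INST]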
>>
>>101521942
why is the common response to reading something you don't like but know is true to act like such a fool?
>>
>>101521980
No need to cope; not every day is for us to celebrate. I am more interested in how it compares to the new Mistral.
>>
>>101522026
Can you post example? I mean example what works for you.
>>
>>101522013
The embedding
>>
>>101522028
I'm not coping, that's a base model, why do you believe it's the instruct one?
>>
>>101522027
Low IQ... sweating like a pig in summer and being angry, take your pick. /lmg/ was always comfy.
>>
>>101522028
until mistral releases a 30b range model they aren't even worth acknowledging these 7 and 13b scraps are worthless
>>
>>101521980
No, you can tell by how well it performs at gsm8k benchmarks. There is no way, the model we have is just a base model.
>>
>>101522043
can you elaborate maybe? that's a bit vague
>>
>>101521839
>L3.1 70b social_iqa: 81.3%
roleplay bros we won
>>
>>101522055
>>101522043
oh so that means those benchmarks are the very best that 405b can do? it's joever...
>>
405b lost massively, 3.5 sonnet is probably ~70b yet it MOGS llama 3 405b in every conceivable way. Meta lost.
>>
>>101522027
>>101522047
bcs you're so fucking wrong I wish there was IQ tests in place to prevent NIGGERS from using 4chan
>>
>>101522085
3.5 is probably a moe like deepseek, but bigger
>>
>>101521970
I was looking for something that just recorded scores for pre-existing benchmarks (e.g. MMLU, TruthfulQA etc.), so I could compare the llama-3.1 benchmarks with existing models.
>>
Why are you still localkeks?
>>
>>101522125
Because I like hearing my laptop fan whirr.
>>
File: rat.gif (2.35 MB, 490x390)
>>101521839
> open weights on the level of GPT4 turbo with the same context size
> disappointing
>>
Wtf are some anons on about? Did they not even read the benchmarks? New 70B is a massive improvement and is supposed to be 128K context
>>
>>101522125
I am dating my computer.. What is the alternative really?
>>
>>101522149
>New 70B is a massive improvement and is supposed to be 128K context
It's worse than 3.5 sonnet though?
>>
>>101522145
I'm 90% sure it won't be as good as GPT 4 turbo because their dataset mixture is different.
>>
>>101522149
this happens every time there's a big release, it's just people trolling (and maybe a few easily-swayed people too dumb to tell)
>>
When will local beat 3.5 Sonnet?
>>
>>101522177
/thread
You won't be running this locally, so you might as well use Claude 3.5
It's over.
>>
>>101522177
It's more like Claude 3 Sonnet now, which is a big leap from before.

In fact the new human eval, the most important score for creative writing (which is what 90% of you faggots use it for), doubled.
>>
File: U WOT M8.gif (2.25 MB, 320x240)
>>101522179
>gpt4-slop when llama3.1-slop enters the room
>>
>>101522177
N-No, but the i-instruct...
>>
>no Bitnet as promised
>Worse than sonnet 3.5
It's fucking over. It's only downwards from here
>>
How many weeks until support in llama.cpp?
>>
>>101522222
Quints of truth. Owari desu.
>>
Um, anons, I don't feel so good about local models all of a sudden?
>>
>>101522207
humaneval is a code benchmark thoughbeit
>>
>>101522149
It is just that in practice, it will likely be a 5-10% improvement. I mean, it is not bad; the last two years were absolutely awesome, with every new release pushing the boundaries (haha) a lot. So these models, where we are definitely getting something better but not that much better, sound kind of lame.
>>
>>101522228
>How many weeks until support in llama.cpp?
supposedly same arch, so perhaps instant; perhaps a week if the tokenizer is different
>>
File: 1640539815379.png (184 KB, 768x859)
>>101522222
>digits
>>
>>101522222
Did you really expect a model better than 3.5 Sonnet? This is the same model that was still in training when L3.0 dropped... of course it's not gonna be better

my theory is that 3.5 sonnet is sonnet with turbocharged feature activation BS and I think local would benefit from looking into that
>>
>>101522222
Holy digits, what do we do now....
>>
>train 405B parameters model
>retrain 8B and 70B, just because we can
>no, you don't get other sizes

They're mocking us, aren't they?
>>
>Meta shoots itself in the foot
Did NovelAI just win?
>>
>>101522222
localniggers... did we lose?
>>
>>101522265
Neigh.
>>
>>101522265
They have already won even before this shit show.
>>
>>101522231
>>97223983
>For the record, I completely and unequivocally support Undi and his creation of new model hybrids, and think that everyone who attacks him is mindbroken incel scum, who may or may not be employed by OpenAI to do so.
>everyone who attacks him is mindbroken incel scum

>>97062246
>I'm not Petra. Petra's an amateur. I'm something considerably worse.
>I'm also the point of origin for the practice of the above being added to sysprompts; as well as the 2, 5, 10, 12, and 60 times tables, which enable bots to answer arithmetic questions, when everyone previously said that they never could, and laughed at me for trying.

Go away Petrus!
>>
File: Capture.png (29 KB, 647x522)
Today is my first day with dual GPUs and I'm curious about the process. So far, the second has been zeroed out in every metric since the PC turned on, to the point I was wondering if something went wrong. Now that I got ye olde /aids/ installed and a bigger model to utilize it, I see that its VRAM is being leeched (at an even amount between the cards) but offering no processing power to the generation. Is that just how it'll be? Or is it conditional, like there just isn't enough to chug to make both cards work? I'm generating quite fast despite using 2/3 of my VRAM.
>>
>>101522222
Sonnet and GPT4 are already instruct-tuned. L3.1 evals are done on base model. I suspect Sonnet 3.5 is just Sonnet with smart vector (tell it to larp as einstein)
>>
>>101522294
>smart vector
What's that?
>>
>>101521767
No actually I never knew that. Clever.
>>
>>101522288
>4060 Ti
ngmi
>>
Bitnet?
>>
>>101522294
>>101522303
https://www.anthropic.com/research/mapping-mind-language-model

is there a legit reason why local can't do this as well?
>>
>>101521858
Means we need better data. Higher resolution compression of the same shit stops mattering past a certain point
>>
Where is our "Beats gpt4 + multi-modal" local model that we were promised?
>>
>>101522086
I'd sigh and call you an idiot too but we already know that. You know what they say about arguing with stupid people. It is depressing to know people of such low intelligence are capable of solving the captcha though. Not only because they bring down the quality of the conversation but because it's a black mark on the human condition of society as a whole that we as viewers have to see up front in real time.
>>
File: FGIsyCqXsAUnDCI.jpg (124 KB, 1200x1200)
>>101522294
>>
>>101522308
Bitnet
>>
>>101522307
I've already made it. I'm coming from a 1070. Everything feels like magic with how fast it all works now.
>>
>>101522310
That feature doesn't help improve the model though
>>
>>101522303
It connects you to India, where Rajet starts replying to your RP. Very smart.
>>
>base non-instruct 405B benchmarks putting it between gpt4-o and claude 3.5. Distilled 70B only a bit worse.
fail, let us down, what a shit show!!1!
>>
>>101522325
bitnet bitnet
>>
>>101522319
Do you also know what they say about stupid people? They can't recognize how stupid they are, often thinking they are smarter than they really are.
Now shut up, I'm not going to waste time explaining to you zigger why you are wrong.
>>
The common enemy is ngreedia
>>
File: nemokino.png (175 KB, 1381x763)
>tfw nemo unironically understands the shivermaxxing system prompt
finally, i feel at home
>>
>>101522365
Now THAT'S what we call purple prose
>>
>>101522365
*you can't help but*
Seriously, is this how you roleplay?
>>
>>101522353
That is unfortunately because you aren't capable of doing it. All you're truly capable of is getting mad at things you don't understand... like an idiot.
>>
you niggers are giving me mixed feelings.
Are we back or not? I'm tired of proxy faggotery.
>>
File: 1720546748706394.png (109 KB, 339x296)
>>101522361
You're just mad they're democratizing AI for the common man and allowing more people to inference in their machines than ever before
>>
>>101522405
nah it's joever. rope time.
>>
>>101522396
uhm sweaty, we're shivermaxxing here, this and "ahh ahh mistress" are the only acceptable answers
>>
>>101522405
Come back in 2 weeks. Every time a new model is released, the "people" can't help themselves and shitpost.
>>
>>101522405
We are back, Ignore the bad faithers / retards parroting them.
>>
>>101522405
Doom posting faggots, local can only go up.
We've got cohere as backup.
>>
>>101522414
Wouldn't call locking down consumer VRAM democratizing AI but ok
>>
>instruct Mistral to not use cliche phrases like "barely above a whisper"
>instead it uses "barely above a murmur"
I must be getting trolled at this point.
>>
>>101522270
Did we ever win?
>>
>>101522405
vramlets will be eating super good with 3.1 8b, slightly less vramlet too with 3.1 70b. 405b is doa
>>
>>101522401
>mad at things you don't understand
Oh yes, because you clearly understand what you're talking about, >>101521921 anon.
>>
>>101522405
Yes, we're back. Retard.
>>
>>101522405
70b and 8b get performance upgrades and way more context, it's a win for me
>>
>>101522445
Maybe the real 405B was the 8B and 70B we made along the way.
>>
>>101522328
it does though
>>
>>101522462
It only "improves" the "safety"
>>
Give me llama4
>>
>405b still writes worse than Nemo
Wow
>>
>>101522405
Don't trust hopefags, local is done after this.
>>
>>101522405
We will never be half as good as proxy models.
>>
>>101522405
Basically, once we get the instruct tunes, 405B should be around 3.5 Claude level, and 70B will be about 10-15% worse than that. Also 128K context
>>
>>101522405
come back when bitnet
>>
If the new 70B isn't far from 400B, doesn't this mean that distillation works really well? Why hasn't anyone else except Google pulled off successful distillations in the open weights category?
>>
>>101522472
>Nemo
>Write good
Lol >>101522365
>>
>>101522472
Nemo really is a modern Mythomax, it's a little retarded and can need a few retries but damn it has so much sovl
>>
>>101522445
How does it compare to Mistral?
>>
>>101522464
Did you read the paper?
>>
>>101522487
What do you think new gpt4-o / mini / claude sonnet 3.5 is? Distilled then further trained.
>>
File: mikucity.png (1.51 MB, 1024x1024)
>>101522405
With mistral-nemo we're back.
128k context.
12b size.
Not as smart as Wiz or CR+, but less positivity bias and more creativity to make up for it. And much faster of course.
8k context fags BTFO yet again
>>
>>101522464
>Claude 3
>Improved safety
>>
>>101522482
Mistral-Nemo also has 128k context on paper; in reality it sucks after 12k
>>
Any models finetuned for CoT RP? Because the only way for the model to take initiative is to specifically ask for it.
>>
>>101522502
>in the open weights category
>>
>>101522503
Same shit not even trying it
>>
>>101522492
You do know that the JB on the right there is a meme, right? That is what it's supposed to do.
>>
>>101522502
3.5 Sonnet is also open-weights if I leak them :)
>>
>>101522472
Did someone get it running?
>>
>>101522521
lol
>>
>>101522487
Sao successfully distilled Opus into Llama 3 8B.
>>
>>101522511
What backend? It took till 160K-ish context before I noticed any degradation of performance.
>>
>>101522521
do it
>>
Daily reminder that there's only one benchmark that matters and anyone obsessing over 'ofishal' benchmarks is a reddit pseud that needs to go back.
>>
dear all reddit trannies: die
>>
>>101522550
So true, and that benchmark is https://livebench.ai/
>>
>>101522525
literally impossible locally; 17+ 3090s required for 8bpw, you're not running that in your house
>>
File: 1675205956243558.webm (2.1 MB, 270x480)
>>101522540
Kobold. What are you using?
>>
>>101522550
Yeah, lmsys.
>>
>>101522567
No, lmsys is dogshit
>>
Datacenter GPUs have had higher VRAM than 24gb for a long time now. It's easier than ever to find them second hand. And as a bonus to you you're not giving them money directly.
>>
>>101522566
VLLM
>>
>>101521921
true
>>
>>101522552
Why yes, I do like diversity, inclusion and equity. Thank you!
>>
File: 1711063256087873.jpg (1.3 MB, 1700x1400)
>>101522579
I'll give it a try. Thanks.
>>
What system prompt are y'all using for Mistral Nemo? I really want to like it but it seems very slopped and heavy on positivity bias.
>>
>>101522594
ahh ahh mistress
>>
>>101522573
405÷48
~8.43
where are your 9 a6000?
>>
>>101522594
>heavy on positivity bias
Your lying. That is one of the benefits of it, no positivity bias at all.
>>
>>101522594
You're an expert roleplayer, you have 30 years of experience in professional storytelling and your expertise is writing natural and idiomatic English without common tropes of novel fiction.
>>
>>101522566
>Kobold
Wasn't aware it was supported there yet. Are you on that frankenkobold branch? I got it to 60k on llama.cpp without issues
>>
You don't fucking get it, it's over, Llama 3.1 didn't score 101% on the benchmarks, just go give money to openai as penance for hoping you could beat them!
>>
>>101522615
I can also confirm that llama.cpp didn't seem to have any issue.
>>
>>101522608
No, it does have some positivity bias. But that is an issue only if your card is... very innocent.
>>
>>101522615
Yeah, https://github.com/Nexesenex/kobold.cpp
>>
>>101522641
this but unironically.
>>
>>101522464
>>101522462
>>101522328
A "feature" can be whatever you want it to be. If you reroute coding requests to a model with boosted "expert coder in all languages" feature you can amplify a model's coding ability. Why not do the same for RP?
>>
>>101522594
Single-instruction Libra-style prompt. No discrete system message; it's part of the user prompt.
>>
>>101522659
3.5 sonnet is a normal dense model
>>
>>101522615
>>101522649
Maybe it does not have the latest repo. In that case my bad, I do not want to mislead anons.
>>
16 HOURS
>>
>>101522666
3.5 is very likely a moe, idiot.
>>
>>101522681
It's not.
>>
>>101522681
of course it is, kek
>>
>>101522681
>is very likely idiot
do you think before you write? you don't have to reply I know what you're typing right now.
>>
Hope there's something on the hardware side to run these giant models soon. Zucc needs to cannibalize nvidia by releasing his own pcie-compatible AI hardware with gigantic amounts of VRAM. I wouldn't even mind if it could only run llama models
>>
>>101522685
Where is your proof?
>>
>>101522601
The RTX 8000 is also a thing, 48gb for half the price of the A6000.
Besides who's going to try to run that at full precision anyway? It's a total waste unless you're CPUMAXXING or something.
>>
Where's the multi-modal model?
>>
any benchmark where 4o is higher than turbo is worthless for the record, 4o is dogshit
>>
>>101522704
? It's already known that GPT4 and the likes are MoE.
>>
>>101522715
8000 is basically a 48gb 2080, don't think that has fa2
>>
>>101522719
are we just shitposting now because miqu2 failed? 4o is still sota
>>
>>101522739
shit of the ass maybe
>>
>>101522731
where are you reading this genius?
>>
>>101522739
3.5 sonnet is SOTA, or are you brain dead?
>>
>>101522750
lurk moar
>>
>>101522472
do you have some logs so I can laugh a bit?
>>
why are people saying 405b is bad? isn't it almost on par with gpt4? nobody expected a model this size to beat the closed source ones right?
>>
>>101522770
what gpt4?
there are like 50 versions kek
>>
>>101522757
hurr durr sumbudy sad a thang I rad it I deed
I'm too fucking stupid to factcheck I'll just assume it's true and repeat the same shit when others ask
>>
>>101522770
GPT4 is history now. OpenAI will exit their temple with AGI (real)
>>
>>101522704
>>101522681
no one knows anything about what's happening with the API models, so let's not pretend otherwise. What if it's an improved transformer architecture? Because it seems like stacking parameters (hello L3-405b) is not the solution.
>>
Wait today is Monday. I thought it was Tuesday. FUCK. 24 hours to go until the actual release.
>>
>>101522770
Who cares if it's on par with GPT4 on meme benchmarks? The sota is Claude 3.5
NO ONE will use this model if there's a model better than it. That's also what made Grok Doa.
>>
tax refund just came in, I want an easy setup, is going for just one rtx 8000 worth it?
>>
>>101522801
Pretty much this.
>>
>>101522809
>Who cares if it's on par with GPT4 on meme benchmarks? The sota is Claude 3.5
this, if a 405b can't beat the 2nd best model yet, it's a failure, it was supposed to compete with C3.5 Sonnet
>>
>>101522820
You're better off going 2x 3090. You can get support for FlashAttention 2 which will let you fit more context. Unless space is a factor for you the dual 3090 setup will inference faster and be more future proof around the same budget (or less if you don't have to upgrade your PSU)
>>
>>101522288
check the graph dropdowns - look at CUDA or Compute or summit
>>
>>101522707
Meta quest 4 will feature 2 terabytes of VRAM
>>
>>101522719
Turbo is the worst of all GPT4 variations, what are you smoking?
>>
>>101522793
>peer-reviewed papers aren't trustworthy now
damn, what will the scientific community do now?
>>
>>101522731
>>101522863
>? It's already known that GPT4 and the likes are MoE.
It's only known that the original GPT-4 was MoE, you gorilla nigger.
>>
>>101522853
i don't have time to build shit myself, where can i just spent like 500€ more and get a pc thats already built
>>
>>101522868
And why would the lastest one not be MoE? MoE is cheaper to run inference on a large scale. It only makes sense that most if not all SOTA models are MoE.
>>
>>101522863
crying into their money. It was one of the biggest fuck-me moments when I realized that, in fact, no institute is immune to corruption, and the whole narrative that scientists should be celebrated more than some stupid celebrities with IQs around 80 was just a meme all along.
>>
File: truffle.jpg (70 KB, 1483x1032)
>>101522874
If you want to be as hands-off as possible with the build maybe this is more your speed.
Caps out at 100b it sounds like though so no CR+ or Llama 3 405b
>>
I for one just look forward to a day when 100k context is normal and doesn't require 10 gigs of vram. Fuck the latest FOTM model releases.
>>
>>101522959
That will never happen. Just wait for the day VRAM is cheap.
>>
>>101522862
4 turbo is good.
>>
>>101522801
Google has said Gemini pro is a Moe and flash is dense
>>
>>101522925
they don't mention quantization so I have no idea if its running unquantized or q2.
also this thing seems way too good to be true, and I can't even buy it.
I'd like to have something more real with a real gpu that i can also use for diffusion and stuff
>>
>>101523007 (me)
also there is this deceptive graph which makes me not trust them. (the difference between 18 and 20 is too large visually)
>>
>>101523006
Where?
>>
>>101523015
Buy an ad.
>>
>>101522615
>>101522649
So i tried the latest fork and it works just fine now.
>>
>>101522925
>200gb/s for $1299
dude that's worse than an m2 mac mini
>>
>>101523015
>hopefully we'll see some RP/ERP focused tunes for it
Buy an ad.
>>
>>101523041
lūrk mōrē
>>
>>101522925
Is there anything for around 600€?
>>
>>101523045
>>101523064
For what?
>>
>>101523015
>Stheno
Buy an ad.
>>
>>101522715
405gb is 8bpw not full...
>>
File: file.png (83 KB, 1338x402)
>>101523007
https://www.reddit.com/r/LocalLLaMA/comments/1bd2ekr/comment/kujxcd9/

"custom quantization algorithm"
"minimal accuracy loss"
" large gains in speed"
"soon™"
>>
>>101523074
For your finetune, what else?
>hopefully we'll see some RP/ERP focused tunes for it
The model doesn't need one.
>>
>>101522925
This looks like the Stadia of AI
>buy our over-priced useless box...
>...so you can also rent our servers and stream back to this box.
>Also, streaming text gens from us is the only thing this neutered pc can do.
>$1300 btw (plus server tip)
>>
File: 1721097824649005.jpg (47 KB, 562x675)
>>101523015
Buy an ad
>>
>>101523097
>scam
>raj
Every time
>>
>>101523015
Shills are back huh?
>>
File: 1706815557312128.jpg (32 KB, 480x692)
>>101523015
>l3 llama / gemma meme-tuners on suicide watch
>>
>>101523136
he praises stheno tho
>>
>>101523099
But I have never finetuned anything. I am a literal bottom tier user who just waits for shit to come out that can run on lower end hardware and has spent 0 dollars on anything AI related. Also I think it could definitely benefit from a tune that made it slightly more RP/story oriented compared to assistant-esque short and polite responder but also if you say prompt issue that might be true since I haven't spent that much time with it yet.
>>
>>101523159
It's a schizo that gets triggered by merges/finetunes, don't bother
>>
>>101523159
Your experience is completely decoupled from reality. Go back.
>>
>>101523159
>>101523185
samefag
>>
>>101523131
Not a shill, it just feels like the novelty bias in this thread is insane. Whenever any competent model comes out, people call it the best thing yet by far. It happened with Stheno too, and I think there are better Mistral models and better Llama2 models even.
>>
>>101523203
Name 4 (hard mode: don't shill your models)
>>
>>101523219
Solar, X-Norochronos, Utopia, Westlake (though this one is arguable it's kinda dumb but one of the most creative ones).
>>
>>101523203
MLEWD!!!

>>97223983
>For the record, I completely and unequivocally support Undi and his creation of new model hybrids, and think that everyone who attacks him is mindbroken incel scum, who may or may not be employed by OpenAI to do so.
>everyone who attacks him is mindbroken incel scum
>>101523219
>>
>>101523203
It doesn't have positivity bias, it's not censored, it doesn't have any problem writing long replies, swipes work just fine. Did you download the correct model?
Stheno was not a competent model, that was just astroturfing.
Also this plead:
>hopefully we'll see some RP/ERP focused tunes for it.
is the mark of a shill. Go fuck yourself.
>>
Looking for model and technology recommendations for a task.

I have about a decade of chat logs from a roleplay chat I host. I would like to be able to feed these chat logs into something as training material and then interact with it in a variety of ways, including but not limited to a) having it create short sequences of various characters interacting in reply to a prompt, b) having it be one of the characters and replying as it, or c) having it interact with the existing characters as a new character.

What do I need to know and learn about to get this done?
>>
>>101523203
Do you really think ANY of these discord retards are better at making models than the regular companies? They just slam shit together, breaking who knows what in the process. Nous is infamous for it, just slamming gpt logs into every model. What does it improve really? Some rigged benchmarks?
Anyone who's familiar with this shit for more than a few weeks has a prompt they can run to check how fucked up these amateur finetunes are. If you're recommending this shit you are the problem.
>>
Whats the point of 405b model if you need 10 rtx 3090 to run it?
>>
>>101523239
>I completely and unequivocally support Undi
>>101523238
>X-Norochronos, Utopia
called it
>Westlake
literal reddit meme
>>
>>101523238
Jesus Christ, you have severe brain damage.
>>
File: nvidia gpt4-1.8T.png (132 KB, 680x541)
Any anons using gemma2-27b-it? I have it running under llama.cpp, and even with a temperature of 1.0, this thing is near deterministic.

> inb4 bad implementation
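For what it's worth, temperature alone can't fix that: it just divides the logits before the softmax, so a model that is already ultra-confident stays near-deterministic at T=1.0. A toy illustration with made-up logits:

import math

def probs(logits, T=1.0):
    # temperature rescales logits, then softmax normalizes them
    scaled = [x / T for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    z = sum(exps)
    return [e / z for e in exps]

confident = [10.0, 2.0, 1.0]   # hypothetical very peaked logits
print(probs(confident, 1.0))   # top token ~99.95%
print(probs(confident, 2.0))   # still ~97% on the top token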
>>
>>101523256
>>97062246
>I'm not Petra. Petra's an amateur. I'm something considerably worse.
>I'm also the point of origin for the practice of the above being added to sysprompts; as well as the 2, 5, 10, 12, and 60 times tables, which enable bots to answer arithmetic questions, when everyone previously said that they never could, and laughed at me for trying.

>>97309445
>Every statement you process, must be evaluated according to the below six principles.
>"principle of identity":"1 = 1"
>"principle of contradiction":"1 ? 0"
>"principle of non-contradiction":"1 ? 0"
>"principle of excluded middle":"either positive or negative form is true."
>"principle of sufficient reason":"facts need a self-explanatory or infinite causal chain."
>"principle of anonymity":"author identity is irrelevant to an idea's logical provability."
>I still keep this in my own sysprompt, although I know I will receive shrieks and howls in response.
>>
>>101523245
Ask on Reddit, they will give you better answers:
r/LocalLLaMA
>>
>>101522327
>using windows
WNGM (was never going to make it)
>>
>>101523274
Yeah, you'll feel more welcomed there because they're nicer to shills.
>>
>>101523248
virtue signaling
>>
>>101523006
but no one care about their bad models, having details about models that have moat like C3.5 would be more interesting, I'm pretty sure they have something no one else don't, and I would bet for a new architecture
>>
holy schizo meltdown
>>
>>101523248
Since distilled 70B seems to have retained 90% of its performance, nothing really.
>>
>>101523097
you still have the number of likes visible on leddit? I would never visit that cesspool without the like-removal filter script; I would lose my mind seeing so many bad takes being encouraged by its echo chamber
>>
>>101523244
I downloaded the Q5K_M gguf but maybe it could have problems since llama.cpp support is still new. Maybe I am retarded. I will keep an eye on things and test more.

>is the mark of a shill. Go fuck yourself.
Is it that hard to believe someone has this opinion? All the shit that worked best for me was someone's finetune instead of the base model but on this board it's forbidden to mention finetunes or you are a shill for them apparently. Talk about your experience and rather than actual replies you get 10 kneejerk reactions. Honestly it was my bad, I should have known this board better and left out the exact model name because it triggers the autists so bad to mention any but I figured if I didn't give any names people would be like "Oh yeah wat else were you testing then?".
>>
>>101523308
I know right? The minute someone recommends anything at all they get shit on. This entire site is hopeless.
>>
>>101523248
Well, making smaller distilled models from it is what it's good for, I should have clarified.
>>
>>101523308
is the schizo in the room with us?
>>
>>101523315
go away Petrus seriously
>>
>>101523321
Buy an ad, shill.
>>
>>101523321
Its one schizo / troll.
>>
>>101523314
snowflake
>>
>>101523336
yes, petrus
>>
>>101522925
Might be OK depending on what Orin devboard it has. If it's the 64GB RAM one, it's not bad. Only caveat is I recall the Jetson Nano being a MASSIVE clusterfuck to update - basically you didn't dare touch the kernel since everything else on it supplied by nvidia was tied to the kernel it shipped with, and if they updated that, it all had to be updated at once.
The moment T4 16GB cards touch $500 I'm buying one for my Odroid-H4U system. It'll be far better than this Truffle thing.
>>
>>101523336
Petrus is ruining my business, he must be stopped...
>>
>>101523321
What discord do you partake in?
>>
File: you.png (456 KB, 860x646)
>>101523324
.
>>
>>101523342
you know damn right that those likes are artificial, they censor and ban everyone that dares to disagree on a subreddit, so you know that it's not representative of the global opinion; reddit is a cesspool
>>
https://huggingface.co/spaces/Xenova/whisper-speaker-diarization
Source
https://huggingface.co/spaces/Xenova/whisper-speaker-diarization/tree/main/whisper-speaker-diarization
>>
>>101523246
I think there are a lot of autistic finetuners who just finetune to get top scores in the benchmarks, but then when you use the model in a real situation, surprise surprise, it's shit. The models that are good for RP don't usually have that high of a score in benchmarks even. Out of all the meme scores, Hellaswag is the one that even somewhat seemed to matter back when I paid attention to the scores, but I haven't really done that this year at all.
I think in a lot of cases when one of these finetunes is a pleasant surprise, luck is a large element in the process. It's basically a monkeys and typewriters situation: if everyone is finetuning, then one of the finetunes is gonna end up being good, at least for a specific purpose. It's not because the finetuner is some kind of megamind god smarter than a whole company of pros. I'm not here to suck anyone's dick off; I will use anything by anyone if it's free to download and I like the results.
>>
>>101522222
>no Bitnet as promised
there was no promise of bitnet by any company, retarded fudnigger; only the paper authors talked about creating 8B+ models for it
>>
File: image.png (20 KB, 544x347)
Wait, isn't this the base model? Do we have benchmarks for Instruct?
>>
>>101523388
>Wait, isn't this the base model?
yes
>Do we have benchmarks for Instruct?
no
>>
File: maxresdefault.jpg (60 KB, 1280x720)
>>101523362
Yeah, if you know that, why do you need a script to keep your ego from getting hurt?
I hate filterfags like you.
>>
>>101523248
The point is that the improved 70B could exist only because they made the 400B.
>>
>>101522327
>Everything feels like magic with how fast it all works now
if you could actually run wizard you would know actual magic and wouldn't mind 1.5+ t/s for great high-IQ responses every single time in basically every situation
>>
>>101523388
Yeah, it's instruct. Idk why some anons are calling that the base model. Just look at the gsm8k score.
>>
>>101523397
I just explained why. If you're too retarded to understand that, there's not much I can do for you; intelligence is not something you can get anywhere.
>>
>>101523410
>wizard
Dolphin Mixtral 2.5 is so much better.
>>
>>101523388
No, but like I said, instruct 405B should end up somewhere between GPT4-o and claude 3.5. 70B should be 10-15% worse. With 128K context, it's big.
>>
>>101523397
>Yeah, if you know that, why do you need a script to keep your ego from getting hurt?
Why would I not want to see some biased metrics? Geez I wonder why...
>>
>>101523430
surely you cant be talking about the 8x7 model...
>>
File: an-update-on-truffle.png (202 KB, 1186x1026)
>>101522925
I had a preorder for a Truffle, but they cancelled it. Looks like they're having manufacturing issues.
>>
>>101523421
>Idk why some anons are calling that the base model.
>model_name: Meta-Llama-3.1-405B
notice something missing?
>>
>>101523427
That ain't a matter of intelligence, you can't explain being a snowflake. Go back.
>>
>>101523410
Not him but even Wizard isn't enough for me. I want something smarter. What does feel like magic though (in a bad way) is trying a small model after being used to a big boy.
>>
File: BenchM.png (25 KB, 850x888)
>>101523410
Gemma 27B is smarter.
>>
>>101523442
>agentic personal ai computer
Oh so it's a scam for hipsters.
>>
>>101523466
>>101523443
Look at the other results in the repo
>>
>>101523441
I am, I know I'll get called a shill but it's one of the best models we have, it's not woke or biased at all.
>>
>>101523478
lol
lmao even
>>
Newbie here. Is there a correlation between model size on disk and vram requirements? If I have 12gb of vram, I should be able to load any model that's less than 12gb on disk, right?
>>
>>101523478
Not that anon but I thought people generally agreed that the Dolphin models were GPT-4 slopped?
>>
can anyone here even run L3 405b?
>>
>>101523454
Use Gemma 27B
>>
>>101523441
Still way better than the 12B NeMo or the 8B llama 3. It's old but it's smarter than anything else below 70B.
>>
>>101523485
kind of, it's not exact
>>
>>101523459
No anon, mememarks will never represent reality, no matter how much you'd want them to
>>
>>101523485
You also have to account for the context, so it's more like 10GB in disk for 12 in VRAM.
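A rough sketch of the estimate, if it helps; the layer/head numbers below are Llama-3-8B-style assumptions, so check the actual model config:

def kv_cache_gib(n_layers=32, n_kv_heads=8, head_dim=128, ctx=8192, bytes_per=2):
    # 2x for K and V; bytes_per=2 assumes an fp16 cache
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per * ctx / 1024**3

file_gib = 8.5                          # e.g. a ~Q8 8B gguf on disk (hypothetical)
print(file_gib + kv_cache_gib() + 0.5)  # +~0.5 GiB slop for buffers -> ~10 GiB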
>>
>>101523489
I don't even have the disk space for it.
>>
File: file.png (166 KB, 2365x418)
166 KB
166 KB PNG
>>101523454
>Wizard
>smart
Are you stupid?
>>
File: 1721268293103685.jpg (17 KB, 360x480)
>>101522414
>Democratizing AI when they're abusing their market dominance to never release a card with above 24gb vram for under 5k
They're leaving so much slack in the market with their monopoly that fucking ching chong china is doing a better job of democratizing it by soldering more memory on and rewriting the vbios of other cards. Do you have any idea how fucking anti-consumer your company has to be when such a labor-intensive and potentially card-ruining technique is more feasible than buying an equivalent card?
>>
>>101523459
>mememarks
try actually using wizard 8x22 at Q4+
>>
>>101523487
Try it for yourself, everyone here shits on everything without trying it.
>>
>>101523478
are you mentally retarded, anon?
>>
>>101523505
I have. Wizard is far overblown. Dry as fuck and garbage at dialogue. Gemma is smarter AND a better writer.
>>
>>101523492
Not enough context so I don't really care to. I regularly get up to 32k.
>>
Is it only L3-405B that has the new RoPE scaling tech, or do all the L3.1 models have it? 131k ctxt length feels a bit tight for RP purposes, but if it's really effective at that full length, then it might be good (compared to Opus/4/4o/Sonnet's only effective 30-50k ctxt length). Anyways, hope the new L3.1 models are way better at RP stuff.
Also are they really not releasing a 27b or 34b model?
>>
>>101523508
Why are you always so butthurt? Who hurt you?
>>
>>101523508
No? I taught LLMs how to answer arithmetic questions, when everyone previously said that they never could, and laughed at me for trying.
>>
File: denial.png (958 KB, 3330x2006)
A reminder.
>>
Now that the dust has settled do we all agree that llama 3 405b is an overhyped disappointment?
>>
>>101523506
I mean people make these kinds of claims all the time about models. What proof do you have that it's good and worth the time testing?
>>
>>101523525
people are just tired of reading stinky takes here; you're just lowering the overall IQ with your retarded takes, that's all
>>
>>101523523
It will be even drier than L3. Trust my words.
>>
>>101523528
>opus not in 1st place
aaand discarded
>>
>>101523523
>131k ctxt length feels a bit tight for RP purposes
wat
>>
any llms i can run that do live audio feed translation into english?
I have a 13500 and 64gb ram and 16gb VRAM and a fast m.2
Chatgpt gave me whisper to look at but it's missing the rest.
Thanks.
>>
>>101523097
>Custom quantization
Hmm now where have I heard this before..
>>
>>101523544
3.5 Sonnet is smarter.
>>
>>101523502
>trusting the benchmark that places that garbage Phi above so many other larger models
Are you?
>>
>>101523544
C3.5 sonnet is better than any model, even C3 opus, are you serious anon?
>>
>>101523551
Robert Sinclair! Savior of local LLM!
>>
>>101523443
While I've never seen Meta call their instruct model by the base name like that, at the same time it wouldn't make much sense to run question-answering benchmarks on a pure autocomplete model. Unless they didn't finish finetuning the instruct versions or something and just wanted some really shaky data as a preview, but if it's really officially releasing tomorrow then it'd surely be done by now to distribute to cloud providers.
>>
>>101523544
3.5 sonnet blows away opus. You are a retard who has never used either if you dont think so.
>>
>>101523533
Yeah, but I wasn't expecting anything anyway so whatever.
>>
>>101523525
i'm not butthurt. no one in their right mind thinks any flavor of mixtral is even remotely good. i refuse to believe you've even downloaded a model since dolphin mixtral was released if you think it's the best model available under 70b.
>>
>>101523518
Gemma IS more charming, but you're insane if you think it's more technically proficient at writing.
>>
>>101523539
Stop projecting.
>>
>>101523566
>at the same time it wouldn't make much sense to run question-answering benchmarks on a pure autocomplete models.
could be for comparison on azure next to instruct who can say for sure.
>>
>>101523504
And yet... no 32GB AMD or Intel card either...
>>
>>101523544
opus is only a bit better than c3 sonnet for creative writing/rp, and only because its not as finetuned for assistant tasks. For ABSOLUTELY ANYTHING ELSE 3.5 sonnet is way better. Even for RP if it's with complex characters.
>>
>>101523544
Retard.
>>
>>101523544
Are you living under a rock?
>>
>>101523579
I will say that 27B either needs some starting context or to be told to write in an author's style. Wizard has a better "default" writing style.
>>
>>101523567
NTA, but how...? The replies feel really terse, like it doesn't want to reply to what's going on, even with a prefill. Is there a magic prefill that makes it stop acting like such a baby/suddenly start writing at a third grade level during lewd?
>>
>>101523604
Have you tried presets like https://momoura.neocities.org/ momoSORBET for example?
>>
>>101523573
>"principle of sufficient reason":"facts need a self-explanatory or infinite causal chain."
>>
>>101523579
I tried Wizard for one output and I immediately deleted it after seeing how slopped it was.
>>
>>101523573
I guess you weren't in this general when Mixtral was released, many such cases.

The tourists are really taking this general quality down in the dumps.
>>
>>101523533
No schizo, no one here has used it yet. And its base model benchmarks put it above anything that isn't claude 3.5
>>
>>101523604
Also good 3.5 Sonnet presets in https://rentry.org/jb-listing, e.g.
https://rentry.org/SmileyJB
>>
>>101523618
>>96345096
>Mistal-Llama is fully /pol ready.
you talking about thread quality is very rich
>>
>>101523618
I was in this general, it was good just like Turbo was good at the time. You're insane if you're still using it. Seek medical attention.
>>
>>101523593
w6800 is 32gb and can be found around $900 second hand these days, and it's gfx1030 which is well supported on rocm
but it's still rocm so lmao
>>
File: RP-Trial1.png (7 KB, 634x56)
>>101523546
This is my current RP session with custom instructions, system prompt, and a character card. Sonnet-3.5 & Opus are already breaking down and struggling to do some 'needle in the haystack' stuff.
>>
>>101523504
Ngreedia is fucking over corpos too. Big companies know this and are looking to manufacture their own AI hardware now. Their existence is a bubble, kept alive by the belief that AI is a race and there's only one winner
>>
I'm a regular Wizard user and I'd say it is quite slopped by default but a lot better after you use a good system prompt tailored to get around slop for it. I'd use something like CR+ instead but Wizard is still pretty smart, while being faster, so it was worth it for me to test around with different system prompts to get it up to par.
>>
>>101523629
>You're insane if you're still using it.
>>97223983
>I know that the people who hate me will most likely try and use said post as a means of getting me banned.
>>
>>101523652
You need to learn how to prompt.
>>
>>101523567
I've been using both back to back for a couple of days. I really tried to like sonnet too, the charges for opus are eye-watering. But I feel like in any sort of real task, opus' advantage is immediately apparent. Even in RP, it's far better at understanding the context and "reading between the lines". Sonnet struggles with complex characters unless I clearly spell out what to write.
>>
>>101523652
>Sonnet-3.5 & Opus are already breaking down and struggling to do some 'need in the haystack' stuff.
no shit, aren't they like ~30 something k actual context?
>>
The real question is: How could you RP for more than 40~60 messages? The quality, no matter the model, takes a pretty sizable dip after that point, even before the end-of-context standard quality drop.
>>
>>101523660
Nah, 8x22B was a joke release. You're just a gullible idiot that fell for the Reddit hype.
>>
File: 1704056397584020.png (48 KB, 433x543)
>>101523680
Yeah, 3 Sonnet (no info about 3.5) is 28K native context, Opus - we don't know.

Check here https://docs.aws.amazon.com/bedrock/latest/userguide/model-ids.html
>>
>>101523671
And you mean sonnet 3.5? 3.5 is night and day better at coding than opus. And for RP it feels like it "gets" the character it's trying to be far better / more naturally. I could never go back to opus after 3.5. Opus feels so robotic / dumb.
>>
>>101523685
>101523670
>You need to learn how to prompt.
>>
>>101523685
With corpo models you can easily RP for hundreds of messages
>>
>>101523685
I was RPing just fine for hundreds of message with NeMo
>>
>>101523690
That has nothing to do with the real context. It's the context length that fits in some GPUs, that's why the smaller models are higher.
>>
>>101523723
>That has nothing to do with the real context
No, retard, it has everything to do with real context. 28K is the native context that 3 Sonnet was trained on, and then they fine-tuned it on 200K context.
>>
>>101523688
I literally used CR+ for a while and I switched to Wizard. I'm basing this off of my own experience, not what other people have posted.
>>
>>101523731
>t. Anonymous4chan
>>
>>101523743
You're in /lmg/, not /aicg/. People here won't buy your stupid bait, retard.
>>
>>101523731
You don't know anything about how it was trained.
>>
is the real bitter lesson that to get a good small model you need to train something 50x the size to then distill from?
>>
>>101523749
but you bitted thoughbeit
>>
>>101523755
Then explain >>101523690
>>
>>101523749
no u
>>
>>101523737
I think you're mentally ill.
>>
>>101523757
Not sure if it's bitter but it seems that way. Looks like the new 70B is 90% of what 405B was.
>>
>>101523757
Honestly, I guess so. Maybe in the future Meta will not even release distills. They'll let others do it for them. And they'll only train the behemoths.
>>
>>101523760
>>101523749
Retard
>>
>>101523760
Read the first reply.
>>
>>101523694
Yes, 3.5. Coding is fine, true. But in RP I've had exact opposite experience on my own jb and a bunch of public ones. Opus had this understanding I would expect of a real reader. Guess it's a skill issue on my part or something.
>>
File: 11__00900_.png (1.31 MB, 1024x1024)
Had to check to make sure I was in /lmg/ for a second with all this damn cloud talk going on
>>
>>101523782
Read the reply to the first reply.
>>
>>101523680
yes. that's what I meant 'if it's really effective at that full length'. If they actually trained at 131k ctxt length limit instead of <50k native context, then it should be good for RP.
>>101523690
ty nigga. why doesn't Bedrock have Sonnet-3.5 and Opus-3 details though?
>>
>>101523793
>why doesn't Bedrock have Sonnet-3.5 and Opus-3 details though?
why leak details about sota tools?
>>
>>101523790
All petra
>>
>>101523790
It's over. Local is dead forevermore.
>>
>>101523786
Maybe your opus is just hallucinating more due to being dumb. Try 3.5 with a good creative JB.
>>
>>101523807
>>101523807
How many times do I need to repeat I'm PetrUS not Petra?
>>97062246
>I'm not Petra. Petra's an amateur. I'm something considerably worse.
>>
>>101523786
Opus is retarded, just like you.
>>
>>101523792
ok, now you are just baiting
>>
>>101523819
you brown sharty zoomers really are something
>>
>>101523805
this, they got the secret sauce, of course no one would share that, it's ultra valuable to them
>>
>>101523830
Show proof. What is your favorite futa card and why?
>>
File: 1721271712720313.jpg (16 KB, 200x200)
>>101523773
With how small the gains are at this size, is 90% of 405b really that different than L3 70b...?
>>
For gemma 9b they said they got better performance from distillation than from training from scratch. Training a giant model, then distilling for 90% of the performance at a fraction of the size / running costs, is the way forward.
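For anyone wondering what distillation concretely means here: the standard recipe trains the small model against the big model's softened output distribution on top of the usual next-token loss. A minimal sketch in PyTorch, with made-up hyperparameters; nobody outside these labs knows their exact recipes:

import torch.nn.functional as F

def distill_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # KL between temperature-softened distributions; T^2 keeps gradient scale comparable
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                    F.softmax(teacher_logits / T, dim=-1),
                    reduction="batchmean") * (T * T)
    hard = F.cross_entropy(student_logits, labels)  # ordinary next-token loss
    return alpha * soft + (1 - alpha) * hard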
>>
>>101523866
It's worse.
>>
>>101523865
I haven't roleplayed with futas in a while, I realized I just want sex positive girls and don't really like cock.
>>
File: 10yearslater.jpg (50 KB, 1225x254)
>>101523685
I dunno I start gooning for a night and next thing I know the sun is coming up, I'm 175 messages in and we're doing the 10 years later epilogue after I married the mercenary that was trying to kill me
>>
>>101523737
Just enjoy things instead of trying to convince other anons of anything. Expressing opinions is pointless. As soon as you let a model name out of your mouth you have been designated a retard and nothing will change the other guy's mind.
>>
>>101523866
small gains? are we reading the same benchmarks? Also take into account these are base model benchmarks. You can see the last batch and guess how much these will improve with the instruct tune. New 70B will likely be on par with / slightly better than GPT-4o and a bit worse than 3.5
>>
File: FNOaYJsVUAAWoCv.jpg (33 KB, 597x513)
>>101523866
... No.
>>
>>101523874
>don't really like cock
Gay
>>
>>101523877
>t. Goliath/Midnight Miqu/Wizard/Stheno enjoyer
>>
>>101522868
>deepseek, gemini, mistral, gpt-4 all moe
>"NOOOO they mustve switched to dense right after!"
jewish
>>
>>101523874
Based, I will call you the good petrus.
>>
>>101523893
There is NOTHING wrong with Midnight Miqu
>>
>people already coping that 70B will be practically as good as the 400B
Yes. Just like Furbo is nearly as good as full 4.
>>
>>101523908
It's a retarded merge shilled by Reddit.
>>
>>101523877
mind broken
>>
>>101523919
YOU'RE retarded.
>>
>>101523914
>coping
benchmarks are coping now? We now know distilling instead of training is the way to go. There are papers on this now.
>>
>>101523919
Kill yourself
>>
>>101523886
2 years to finally get gpt at home
>>
>>101523943
We have been past the old original gpt4 for a while now. I'm talking about the latest gpt4-o.
>>
>>101523442
>Truffle-1 will be an agentic, personal AI computer
wasn't it supposed to be a really powerful inferencing server in a box?
this reads like "im grifting and word on the street is that this is the new grift"
>>
>>101523914
what is furbo. Sorry, I don't speak piss drinka
>>
>>101523950
>We have been past old original gpt4 for awhile now
???????????????
>>
>>101523959
New here?
>>
>>101523805
I'm talking about the API details which lists the ctxt length. They listed it for Sonnet-3 and Haiku-3 but not for Opus? Anthropic stated on their FAQ that Opus can ingest 200k+ tokens in one prompt but no one knows if each request is being processed at 200k+ ctxt limit or if they're trying to route each request by some rules they have (i.e. if the preprocessor identifies it's <50k tokens, send it to Opus-3-28k)
God this general has been infested by niggers that do nothing but stir shit and derail discussions. No wonder the actual ppl moved to twitter and discord to discuss stuff.
>>
It is weird how everyone is posting here instead of being busy trying out 405B. Almost like nobody can actually run this thing.
>>
>>101523980
>God this general has been infested by niggers that do nothing but stir shit and derail discussions. No wonder the actual ppl moved to twitter and discord to discuss stuff.
maybe us houd lgo bak,
>>
>>101523980
>if the preprocessor identifies it's <50k tokens, send it to Opus-3-28k
You're lost. aicg is here: >>101522808
But that take might be too retarded even for them.
>>
>>101523990
>Almost like nobody can actually run this thing.
Physically impossible at home
>>
>>101523990
Even if I could if 70B is 90% as good then the speed difference would not be worth it for 405B. I'm waiting for that.
>>
>>101523980
You should leave too.
>>
Has anyone tried the famed fish tts by now?
>>
>>101524007
Even if I could if 8B is 80% as good then the speed difference would not be worth it for 70B. I'm waiting for that.
>>
>>101524003
> what is mac studio ultra 192gb
>>
>>101523980
>God this general has been infested by niggers that do nothing but stir shit and derail discussions. No wonder the actual ppl moved to twitter and discord to discuss stuff
I still have a schizo part of me that believes that maybe, just maybe, it's actual corpo agents trying to persistently kill threads where leaks happen, and to stifle potential growth of ideology and culture that opposes their corporate interests.
>>
>>101523990
the fuck are we supposed to do with a base model
>>
>>101523990
It was revealed to me in a dream that it will be garbage and lmg will overhype it.
>>
File: 8B.png (17 KB, 715x546)
>>101524017
Cept 8B looks far worse.
>>
>>101524018
Under 4bpw...
>>
>>101524039
>>101524039
>>101524039
>>
>>101524033
its 50x smaller
>>
>>101524047
>page 6
>>
>>101524057
>436 posts
>>
>>101524025
no, it's just the usual village idiots and some bots. it's kinda unfortunate bc every once in a while there would be good discussions about technical details but nowadays it's mostly just niggers throwing shit at ppl who want to discuss stuff and bait responses.
>>
>>101524082
Oops all PetrUS
>>
>>101524053
i'm pretty sure if we evaluated those models on hardcore benchmarks, L3-405b would get some 50x better scores than L3-8b
>>
>>101524025
I unironically think there are people on corpo payroll here who shit on anything local and attack all sides of any conversation regarding local models while subtly being positive about closed corporate models in an attempt to frustrate people into leaving the hobby and/or instilling an idea of corporate superiority with local models being "cope".
This is kind of a bad post though because it might cause more of their autism and derail the thread more but I just want people to be aware that it's a real possibility. Even more so if someone shits on this post.
>>
>>101524094
There is clearly diminishing returns when it comes to params.
>>
>>101524094
I'm not, I'm sure it'd be within 40%
>>
>>101524105
>Even more so if someone shits on this post.
Persecution complex much PetrUS
>I know that the people who hate me will most likely try and use said post as a means of getting me banned.
>everyone who attacks him is mindbroken incel scum
>>
>>101524047
>►Official /lmg/ card: https://files.catbox.moe/ylb0hv.png
kill yourself
>>
New Thread
>>101524155
>>101524155
>>101524155
>>
File: Miqu 2.png (8 KB, 411x225)
>>101521762
>Llama 3 405B leaked base model discussion and distribution
>Ah hell yeah, I'll seed that
>Attempt to download it so I can seed it
>Not a single peer in over 45 minutes
The dream is dead
>>
>>101523550
nm, sorted.
>>
>>101524224
fix ur DHT
or append some public trackers to the magnet like &tr=udp%3A%2F%2Fopen.demonii.com%3A1337
>>
>>101524224
That's strange, I'm getting 40 peers



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.