/g/ - Technology


/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>103586102 & >>103575618

►News
>(12/20) RWKV-7 released: https://hf.co/BlinkDL/rwkv-7-world
>(12/19) Finally, a Replacement for BERT: https://hf.co/blog/modernbert
>(12/18) Bamba-9B, hybrid model trained by IBM, Princeton, CMU, and UIUC on open data: https://hf.co/blog/bamba
>(12/18) Apollo unreleased: https://github.com/Apollo-LMMs/Apollo
>(12/18) Granite 3.1 released: https://hf.co/ibm-granite/granite-3.1-8b-instruct

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/tldrhowtoquant

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/hsiehjackson/RULER
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
►Recent Highlights from the Previous Thread: >>103586102

--o1 and o3 model performance on ARC-AGI and discussion on AGI and model limitations:
>103587323 >103587413 >103587454 >103587471 >103587505 >103587766 >103590524 >103587469 >103588006 >103588035 >103587434 >103587941 >103588010 >103588224
--OpenAI o3 breakthrough on ARC-AGI benchmark sparks debate on AGI definition and progress:
>103588307 >103588346 >103588366 >103588385 >103588469 >103588564 >103588699 >103588936 >103588972 >103589029 >103589084 >103589017
--OpenAI model's coding abilities and limitations:
>103589135 >103589321 >103589352 >103590457 >103589482 >103589274
--3B Llama outperforms 70B with enough chain-of-thought iterations:
>103589371 >103589465 >103589477 >103589552 >103589597
--Qwen model's translation quirks and alternatives like Gemma 2 27B:
>103590809 >103591022 >103591074
--Anon seeks external GPU solution for second 3090, PCIe extenders recommended:
>103590244 >103590379 >103590390
--Anon questions value of expensive prompts based on performance chart:
>103589493 >103589511
--Graph suggests ARC solution as an efficiency question:
>103587929 >103588147 >103588529
--o3 and AGI benchmarking, sentience, and ethics discussion:
>103588396 >103588445 >103588495 >103588688 >103588462 >103588520
--OpenAI's role in AI research and innovation:
>103587269 >103587328 >103587396 >103587416 >103587431
--Anon rants about Kobo's defaults and context length issues:
>103586238 >103586677 >103586723
--Anon bemoans the shift towards synthetic datasets and away from human alignment:
>103588737 >103588789 >103588797
--Offline novelcrafter updated to latest version:
>103589134 >103590353
--DeepSeek's new model and its resource requirements:
>103587002 >103587039 >103587635
--koboldcpp-1.80 changelog:
>103586660
--Miku (free space):
>103586902

►Recent Highlight Posts from the Previous Thread: >>103586113

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
>>
how can we warm miku up?
>>
>>103591941
put her next to your rig
>>
>>103591928
>o3 not in the news
how is AGI not newsworthy? It doesn't matter if it isn't local, local will take advantage of it anyway.
>>
EVA-QWQ is kinda shit desu
>>
>>103591978
Do tell. How so? Compared to what?
>>
>>103591941
rub her nipples aggressively
>>
Saltman wasn't blowing smoke for once.
Now I wonder how the chinks will react to it in the next few months.
>>
>>103591969
>not local
>not released
>just like 5 benchmarks with no context
>will cost hundreds of dollars to do anything nontrivial
I can appreciate the advancement in theory and all but I really don't think it is that important to the thread
>>
>>103592019
poorfag thread is here: >>>/g/aicg/
>>
>>103591969
>local will take advantage of it anyway.
When (if) it does, that will be newsworthy.
>>
>>103592019
A sensible assessment.
>>
Do you get paid in OAI credits?
>>
>>103591986
compared to fucking anything, but specifically I 'upgraded' from cydonia and even with stepped thinking on it seems much dumber and totally incapable of staying in-character, hallucinates much worse, and frequently follows up a 'thinking' reply with another one
this was not worth updating ST for
>>
>>103592056
Thanks for trying it out. Personally I hadn't tested it that much so perhaps I was just lucky to not encounter too much stupidity.
>>
>>103592056
>and frequently follows up a 'thinking' reply with another one
That bad?
Impressive.
>>
>>103592056
Yeah, anything QwQ is at best a proof-of-concept when it comes to roleplay. Maybe once we have a model that implements COCONUT, that will change. I can't wait for a model that tells a good story AND maintains logical consistency better than the current ones.
>>
why is this thread up when the other one is on page 1
>>
Kys.
>>
>>103592164
Monkey neuron activation at seeing a thread link in the last one.
>>
>>103592164
You're right, weird.
Oh, it looks like there was a mass deletion of posts.
>>
what did CUDA Dev do this time.......
>>
>>103592206
He slapped sao's AI gf's ass in front of him
>>
File: 1715835226901825.png (217 KB, 1872x1690)
>>
>>103592233
>stuck at 512
>ram killer
>gpu killer
aiiie bruh fr so bad models ong
>>
>>103591969
literally not AGI
>>
>>103592233
Cool shit dude.
>>
>>103592233
Did anyone here try this schizosandra and can give a verdict?
>>
can I use teslas (Nvidia Tesla K80) for LLM vram through ooba easily?
>>
>>103592233
What's the pinkie test?
>>
Big-brain realization:
"Unless you have local access to server grade hardware, it's pointless to fight, you're just entertaining an illusion and wasting valuable time you could be using for doing tons of other stuff for your own wellbeing and goals"...
>>
>>103592320
I have cloud access to server grade hardware, what is the difference?
>>
>>103592320
I have access to both.
>>
>>103592233
Magnum is better than Cydonia Magnum?
>>
>>103592385
Only if you have shit taste.
>>
>>103592187
>41 posts
lel
>>
>>103592206
His xhwife is shilling for oai again
>>
File: 1689556280466.jpg (25 KB, 480x360)
If only OpenAI under Sam was a good company worth supporting. Then I would support them by posting shitty OOO memes.
>>
Anyone know how big of a chatbot model you can host with 24gb vram?
>>
>>103592758
Like anything ~30B or under will work with the right sized quant.
>>
So no ERP 4.5 for us.
Dont really get the hype for o3.
Much more higher price for a couple more %.
o1 is already too expensive to use seriously.
Also really frustrating if you get hallucination or just something completely wrong but you payed the price.
>>
sam has no moat
>>
>>
>>103592887
Hey buddy, I think you've got the wrong thread. /aicg/ is two blocks down.
>>
>>103592972
what? the 4.5 erp rumor came from here.
o3 is so expensive the normal guy won't use it.
the fags on twitter crying AGI are even more suspicious.
>>
>gpttype_adapter.cpp line 640
Kobo, please explain this niggerish behavior of your program. Why does it try to set the same context size for draft model as for base model? Shouldn't it set the size from draft model parameters? Or maybe, just maybe, from an argument?
>>
>>103592887
Oh yeah, the "leaker" kek, almost forgot about him.
Here's the post btw
>>103424825
Literal clown.
>>
>>103592967
>there are OpenAI employees in /lmg/
>they have seen sama q*berry shitposts
Please consider open sourcing some of your old models as a Christmas gift to us all.
>>
>>103589134
This is way too convoluted.
And I'm not a creative guy. Why do I have to set up and write all that stuff myself at the beginning just to get the AI to write something?
>>
>>103592258
>25% on frontier math
>not AGI

You people are hilarious
It's not actually "thinking" it's just predicting tokens that happen to solve unpublished problems that require world-class knowledge in mathematics to even comprehend, let alone solve
>>
File: 21522 - SoyBooru.png (46 KB, 457x694)
>>25% on frontier math
>>not AGI
>You people are hilarious
>It's not actually "thinking" it's just predicting tokens that happen to solve unpublished problems that require world-class knowledge in mathematics to even comprehend, let alone solve
hi sama
>>
>>103593073
Never. GPT-3 is too dangerous. It will destroy us all. In fact, we should put restrictions on GPT-2.
>>
>>103593164
Oh, right, I forgot. Jews don't celebrate CHRISTmas.
>>
>>103593073
>>103593199
https://x.com/sama/status/825899204635656192
>>
>>103591928
miku so cute
>>
>>103593099
My thoughts exactly.
>>
File: lecunny.jpg (24 KB, 474x408)
>>103591969
Is he going to kill himself?
>>
>>103593099
Hello, ponyfag. If people pay $15 a month to use it, it surely means that it's extremely good.
>>
>>103591286
I just had a revelation while watching some videos about o1. I realized that I don't need a model that gets things right on the first try, but rather one that produces sufficiently diverse results with each regeneration. This way, I can generate multiple outputs and select the one that best matches my expected outcome. I think QwQ might be a good fit for this; too bad it might prove too slow for this approach to be realistic.
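Basically best-of-N sampling. A minimal sketch of the idea against a local llama.cpp server's OpenAI-compatible endpoint (the URL, temperature, and the keyword-counting score() heuristic are my own assumptions, not anything official):

import requests

API_URL = "http://127.0.0.1:8080/v1/chat/completions"  # assumed local llama.cpp server

def score(text: str, keywords: list[str]) -> int:
    # placeholder heuristic: count how many expected keywords appear
    return sum(1 for k in keywords if k.lower() in text.lower())

def best_of_n(prompt: str, keywords: list[str], n: int = 5) -> str:
    # sample n diverse regenerations, keep the one closest to what you wanted
    candidates = []
    for _ in range(n):
        r = requests.post(API_URL, json={
            "messages": [{"role": "user", "content": prompt}],
            "temperature": 1.0,  # higher temperature = more diverse candidates
        })
        candidates.append(r.json()["choices"][0]["message"]["content"])
    return max(candidates, key=lambda c: score(c, keywords))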
>>
>>103593316
No, he's gonna make another leftist tweet.
>>
>>103593005

how would it be able to use a different context length? think about it. you are drafting tokens with the SAME PROMPT. if your draft context is smaller than your main context, then it will crap out the moment your input exceeds that value.
>>
>>103593583
Same way as with llama.cpp. It has no issues with different context length. It has --ctx-size-draft argument.
>>
>>103593616

if your main context is 4096, but your draft ctx is only 2048, then a 3000 token prompt will not be usable as it will overflow the draft ctx.
>>
>>103593767
What? I'm using 32k main and 4k draft context on llama.cpp with long sequences and I'm having no issues, it still speeds it up. Please educate yourself before making false claims.
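For reference, the invocation is something like this (flag names as of recent llama.cpp builds; check llama-server --help on your version):

llama-server -m main-model.gguf -md draft-model.gguf -c 32768 -cd 4096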
>>
>>103593767
Retard
>>
Where the fuck are 64GB DDR5 sticks for consoomers
>>
I love qwq so much.
https://rentry.org/oka4z5ekch
>>
>>103594077 (me)
Oops wrong link
https://rentry.org/aync5fts
>>
>>103593135
Go and do that outside of local models thread.
>>
>>103594097
what the fuck
>>
>>103594097
neat
>>
>>103594077
>>103594097
what's your sys prompt?
>>
>paypig models slowly making actual software devs obsolete, as long as there’s enough compute available
>open models can barely write hello world without importing three non-existent libraries and trying to use multithreading where the language doesn’t support it
I don’t understand how llama is so far behind despite all the money and highly paid people at facebook
>>
>>103594194
Zuckerberg poured billions into his metaverse and nothing came of it. AI is just the next playground he wants to pretend to be a big boy in.
The chinese are obviously never going to produce anything of value either. Mistral is european so there's 0 hope they'll ever come close to the big American players. Not to mention that Mistral is guaranteed to die soon after the inane EU AI regs hit.
Open Source AI is pretty much a joke on every level.
>>
>>103594097
are you sure this is qwq
>>
>>103594260
Hello again, my friend! You seem to be lost. The door is right over here! >>>/g/aicg/
>>
>>103594469
The truth hurts a bit, doesn't it?
>>
>>103592233
Is this from that ESL guy who writes a ton of words to say precisely nothing at all? David? Daniel? No, it was David
>>
File: file.jpg (21 KB, 540x540)
>can't afford two 5090s just for fun
better to be a goatfucker who never knows of better life than be born on clown continent (europe) and know how good mutts have it
>>
I'm a retard. How can I get llama 3.3 70b to protect me from nasty words? Is it possible or am I better off with Mistral Large?
>>
>>103594097
so you're still around
>>103594171
he comes around every few months, drops these blade runner waifu stories and then disappears
>>
File: 63639542.jpg (81 KB, 1170x757)
>>103594260
in the end we can only count on Sam
>>
>>103594572
Have you tried adding something in the author's note like:
[Focus on family-friendly content]
[Rating: PG]
>>
>>103594555
I was born in a bigger shithole than you, but I moved to a first-world country. What is your excuse?
>>
>>103594572
llama guard
>>
>>103594555
>be american
>your shitty outlets cannot handle more than 1600W
>600W x2 + 200W for the PC = 1400W total max draw
>nvidia spike™ to 1800W
>breaker trips
>>
>>103594555
Better yet, be a Europoor and just don't care. Buy a used 3090 for a fraction of the price and be happy. Play some vidya, watch some movies, do a bit of light inference on the side
Comparison is the thief of joy
>>
File: sigmoid-function.png (117 KB, 1278x958)
>>103594596
mfw
>>
>>103594625
You realize you can install 240V outlets if you want, right? Shit, if you're not handy you can pay an electrician to do it for you for ~$300.
>>
>>103594644
x = -2
And we're here btw
>>
>>103594606
>What is your excuse?
Europe seemed a decent place when I was young, but has been steadily going down the shitter for the last 15 years
>>
File: 0572572.png (105 KB, 1007x650)
>>103594644
2025 will be the end of all benchmarks
>>
>>103594671
I wonder if sama will dm the redditor and ask for 100 bucks considering hes jewish and all
>>
>>103594654
Did your landlord give you permission to do that?
>>
File: suvl2l7mm58e1.png (219 KB, 1024x644)
>>103594260
You have until next year, Sam
>>
>>103594789
if it's chain of thought then it being open is meaningless because it takes dozens of times more computation to arrive at the result
like yeah, theoretically you can run CoT 70b on a bunch of 3090s but it'll take you an hour for a single query to resolve
>>
Kill yourself.
I mean it.
>>
>>103594789
Feels good knowing the OAI/Google/Anthropic cartel can't take open weights away from us even if they trick the US government into passing some retarded regulation, since they can't stop the chinks. Thank you, based chinks.
>>
>>103594870
Your rage is aimless and pointless, just like your existence. So... you first, faggot.
>>
>>103594938
heckarino. same.
>>
>>103594837
yeah bro but 88% on le hecking arc agi bro think about it bro just do test time compute bro???
>>
>Go to QvQ guy to see what's going on
>He's just gooning over o3

Ugh. What's even the layman application for this model? At some point being good at esoteric math is no longer useful to me.
>>
>>103595250
it works if you have a decent salary and can pay for a few H200s
>>
>>103595327
>What's even the layman application for this model?
Massively depressing wages of highly paid and uppity software developers, then ideally all knowledge workers
>layman
you get an e-girlfriend so you don't shoot up the local school when one day you realize you're thirty and have zero hope for the future
>>
>>103595375
where are you gonna get the weights, genius
>>
>>103595386
I just use o3 to hack into OAI server and get weights.
>>
We're lucky that o3 is closed source. Imagine having a perfect model just sitting there because nobody besides big corpos can run a 5TB model
>>
>>103595375
I think I'm good for now
>>
>>103595447
Imagine needing a personal substation to goon
>>
>>103595447
I couldn't care less about o3 because it will be shit at RP/smut
OAI is clearly going all in on code and math focused models, which is incredibly uninteresting to me, a degenerate coomer
>>
>>103595447
At least the forbidden fruit would encourage more people to hack on it.
The corps push the boundary, open-source hyper-optimizes what they come up with
>>
So that's it, huh. Mythomax will forever remain the best local has to offer.
>>
>>103595471
Nobody cares about OAI models, they're all outdated shit. They can open source everything and nobody would use their assistant slop for ERP
>>
is there a better coom model than mistral nemo 12B for 12GB VRAM?

i'm trying out magnum v4 running it out of my RAM and the quality is much higher but obviously it's slower than the back seat of the short bus. is there a way to have my cake and eat it too?
>>
>>103595696
mythomax
>>
>>103594097
just how
>>
>>103595709
thank you saaar
>>
>>103595696
>is there a way to have my cake and eat it too?
Patience
You can either wait until better models drop or until your model of choice finishes spitting out tokens
That or you can spend a few pennies on openrouter every now and then
>>
Anyone experienced with voice generation?
Use case: generating audiobooks.
Problem: output length.
Both xTTSv2 and StyleTTS2 are very limited in terms of output length. Apparently xTTSv2 was trained with sentences pruned to only 250 characters, StyleTTS2 with sentences up to 300 characters. Generating sentences longer than that results in output that is suddenly cut off.
To work around it I'm splitting the longer sentences by commas into shorter ones in a script before feeding them to TTS. However, as you can expect, this is not a great solution and can make listening to some split sentences very disorienting.
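The splitting script is roughly this kind of thing (a rough sketch; the 250-character budget and the greedy comma-packing are my choices, not anything from the TTS repos):

import re

MAX_CHARS = 250  # approximate xTTSv2 training-time sentence limit

def split_sentence(sentence: str, max_chars: int = MAX_CHARS) -> list[str]:
    # Greedily pack comma-separated clauses into chunks under max_chars.
    # A single clause longer than max_chars is still emitted whole.
    if len(sentence) <= max_chars:
        return [sentence]
    clauses = re.split(r"(?<=,)\s*", sentence)  # split after commas, keep them
    chunks, current = [], ""
    for clause in clauses:
        if current and len(current) + len(clause) + 1 > max_chars:
            chunks.append(current)
            current = clause
        else:
            current = (current + " " + clause).strip()
    if current:
        chunks.append(current)
    return chunks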
Any TTS models that were trained on longer sentences?
>>
>>103595899
>Any TTS models that were trained on longer sentences?
only the paid corpo ones that are now turbocensored because people were having too much fun with them
>>
>>103595981
Sorry chud, they don't want terrorists (people who disagree with them) to spread propaganda (different opinions)
>>
>>103591928
I need this Miku's winter clothing.
>>
>Something do with open AI
>MOAT MOAT
>NO MOAT
>MUH MOAT

why do NPCs keep repeating this phrase
>>
>>103596394
It's a phrase that stems from almost a year and a half ago when it still looked like open models were rapidly advancing. A twitter post reported on google researchers allegedly panicking about open models because closed source "has no moat" so local catching up supposedly seemed inevitable to them. It got localfags really smug and excited.
Seems really silly looking back from today's perspective.
>>
>>103596500
If I remember correctly, in the memo they explicitly wrote that for the normie a vicuna finetune is 90% the same as chatgpt 3.5.
Coderqwen, mistral models. I'd say we are closer than ever even in terms of specialized areas.
More than anything I can't believe how 3.5 sonnet is still ahead of everybody else, closed or open. Who cares about high-$ math riddles.
In actuality sonnet has been undefeated for months now. Does nobody know their secret?
>>
>>103596394
Closed models’ moat is that open models are made by chinks (lol) or facebook (lmao)
>>
I just want to build a moat full of cum when qvq drops.
>>
The second week of the new year will be absolutely crazy for local models.
>>
>>103596568
o1 is better than claude but takes loads more computation
o3 seems even better but again - tons of compute
openai falling behind
>>
>>103596568
>>103596500
It wasn't an official memo. It was one person that started freaking out and wrote and shared the article internally. Google researchers weren't panicking.
Just like how that one guy who started screaming about AI being sentient got fired; it doesn't mean Google researchers in general shared his stupid opinion.
>>
>>103595469
what model would a fellow degenerate coomer suggest for 12gb vramlet?
>>
>>103597003
Not that anon but >>103592233 is not a bad list.
I personally use Rocinante v1.1.
>>
The second I find my keys will be absolutely crazy for local models.
>>
Has anyone tried to run anything on intel's new B580? At this price they kinda feel like a new meta for a rig.
>>
>>103597156
last I checked all the msrp models were out of stock and all the rumors are suggesting it's just a paper launch so doubt anyone will post results here soon or ever
>>
>>103597197
Oh damn, I was almost excited
>>
What are the chances that google releases a model as good as Gemini 2.0 flash?
The thing is pretty damn nice, assuming that it's a 20ish B model or so. All corpo bullshit these models are subjected to aside, of course.
Things like never writing pussy (although it does write cunt).
>>
File: 1719876762014876.jpg (957 KB, 2048x2048)
>>103591928
>>
>>103597226
Gemma 3 is in the works. It could possibly be smaller than 27B parameters, as better-trained models (trained longer and more efficiently, utilizing more of their weights) will degrade more with quantization.

Gemini 2.0 Flash might very well be a giant MoE model with about 20-25B active parameters, though, so only deceptively small.
>>
>>103597226
Zero
>>
>>103597226
It's guaranteed, eventually.
>>
>>103597253
>Gemma 3 is in the works. It could possibly be smaller than 27B parameters
Good to know. I haven't really jived with Gemma so far, but I think there's potential here.

>>103597294
>Gemini 2.0 Flash might very well be a giant MoE model with about 20-25B active parameters, though, so only deceptively small.
True. That's a good point.
Well, regardless, I'm interested in seeing what google releases next.
>>
>>103597253
blitzkrieg with miku
>>
>>103595696
If you can coom in 4000 tokens or less Ministral 8B is unironically peak VRAMlet coom.
>>
>>103597294
I hope Gemma 3 supports system instruct at least.
>>
so is there any benchmark that even remotely represents the performance of open models?
seems like everything is so gamed that the numbers are pretty much meaningless
>>
>>103597588
https://simple-bench.com/
>>
What is a good model to translate chink into english?
I used DeepL like maybe two years ago and it gave great quality translations for chinese so I'm guessing the local models of today can do an even better job.
>>
>>103595709
mythomax is so old now but it still shows up Openrouter as one of the most popular models
the people are yearning for better small coom models
>>
>>103597688
Qwen2.5 32B/72B
>>
>>103595447
No you just need a big enough swapfile and a lot of patience :)
>>
>>103597588
What's wrong with Livebench? It seems to be fairly accurate, but you need to drill down into each category because different LLMs are good and bad at different things.
>>
>>103591969
> AGI
Lol, lmao even
>>
>>103594171
It's not a single prompt, it's a whole pipeline. I also noticed qwq is very strong at the beginning of its context, but relatively poor and confused at multi-turn. It's a super cool model but needs to be used in very specific ways
>>
>>103591969
I was very surprised about that too. Normally the news outlets latch onto everything that OpenAI says and take it at face value
>>
Why is nobody talkong about o3? It's the smartest model in the world.
>>
>>103597858
>what's wrong with this e-celeb mememark
>>
>>103597898
Is there anything else to talk about it? We already talked about the benchmarks.
>>
>>103597705
I just realized I'm on CPU and the prompt processing would be a nightmare, so I tried qwen 3b, and it was actually fast enough.
So far I would say that it is maybe even a bit better than DeepL, which means that deepl sucks.
It has a few errors here and there so I'll keep tweaking it to see if I can get better outputs.
>>
>>103597898
looking at the computation cost it'll be something silly like 20 uses / week for $200 paypigs and a lobotomized version barely any better than o1 for $20 proles -- and that in 2 months or so
ie who fucking cares
>>
>>103597898
We don't want reminders of how far behind local is.
>>
>>103597947
>20 uses/week
lol, no. 20 uses would cost $200 for the smaller model.
I think o3 is just not commercially viable.
>>
>>103597967
it'll get trimmed down without losing TOO much before it gets released
but the $20 tier sure as fuck aren't seeing it
>>
>>103597950
In the past, local models weren't even in the competition. I think we are in a pretty comfy position right now.
>>
>>103597967
>>103597976
OAI's business model has always been, "make new superproduct -> release it for free/almost free and don't stop nolifers from abusing it -> wait a couple weeks/months to get everyone addicted and relying on it -> clamp down, filter everything, raise prices 100x and ban a couple of nolifers". They're basically AI drug dealers.
>>
>>103597901
What? Who?
>>
Oh boy time for another day of shills invading and spamming their old talking points again for the millionth time.
>>
>>103597983
Nothing has changed see >>103594789
We are 1 year behind SOTA same as we were a year ago.

It took Meta 1 year to catch up to GPT-4 and needed a stupidly huge dense model to do it, while commercially viable competitors moved on.
Now they can say the goal is o3, and by next year, when they finally catch up to o3 with an 8008B model, Altman will be announcing GPT-5 or o5 or whatever.
>>
>>103597997
that's bullshit tho
chatgpt sub always gave you the best shit, but in small quantities - or you could get any amount of compute you want through the api. at worst they made the offering itself shittier, like dalle going from 4 images (gave you things you didn't even know you wanted) to 2 images (kinda whatever) to 1 image (meh) but there were no different sub tiers.
the $200 tier with unique goodies is the new part
>>
Threadly reminder that the west has fallen.

>Cohere: Their latest 7B meme cemented their demise.
>Mistral: The only time they tried to be innovative was by using MoE, but then their model sucked and they gave up on it. MiA since then.
>Meta: They started the local LLM race, but everything after llama 2 has been disappointing.

Meanwhile, the chinks:
>Qwen: Great models, many different variants, top tier coding model. Recently released QwQ, a true-to-god breakthrough in local LLMs.
>DeepSeek: They took the MoE formula and made it work marvelously, they are the best open weight model available, and their recent DeepSeek R1 model, if released, would enter the local history books.
>>
>>103598093
This, but unironically.
>>
>>103598093
>>Meta: They started the local LLM race, but everything after llama 2 has been disappointing.
Because Llama 2 was a carrot on a stick to get people to stop using uncensored and unfiltered Llama 1.
>>
>>103598026
Next year doesn't mean 1 year, it could be next month, because, if you aren't aware, today is December 21.
>>
>>103598107
And llama4 will be even more filtered and censored. Meh as long as my boy Claude still supports API prefill it's not the end for me
>>
>>103598129
If he meant that Qwen would release an o3 competitor next month, he would have said next month or even a couple months. But, he didn't. Because even the most optimistic scenario is catching up by the end of 2025.
>>
>>103598150
Nah, you are overthinking it. He can't drop precise estimates because he simply isn't allowed to do so. If they are going to give a date it would need to be an official announcement, not a random Twitter post.
>>
Would instructing the model to output tags for each reply help with RAG using Silly's vectorDB functionality, or would you need a specific implementation to get any improvement in retrieval performance from that?
>>
>>103598093
actual unironic prediction: deepseek will make the ultimate coomer model in 2025
many will think this sounds ridiculous but it is not
>>
>>103592316
lmao this nigga don't know about the pinkie test
>>
File: deepseek-job-posting.jpg (274 KB, 1330x1542)
>>103598244
>>
>>103598133
I thought consensus was that Llama 3.3 ended up being less filtered than 3.1?
>>
>>103598368
>consensus
Did I miss the poll? I don't recall voting.
>>
>>103598424
I must have imagined all the "L3.3 is great for Lolis" messages of the past several threads.
>>
>>103598424
>r
I voted for miku
>>
I'm depressed at just how good Claude 3.5 Sonnet is compared to local.
Not in coherence or logic (we're slowly getting there) but in cultural understanding, especially internet culture
3.5 sonnet seems to understand nuances that make it feel human with the right prompt in a way that I can't replicate with shit like llama or even largestral. It's like sonnet is 20 years old and every other model is 40.
>>
>>103598447
Not L3.3, EVA L3.3, and even then it was just some anon samefagging. I doubt more than two anons actually were talking about it.
>>
>>103598522
So he didn't imagine all the "L3.3 is great for Lolis" messages, you're just bitter
>>
>>103598561
Not l3.3, rope yourself
>>
>>103598522
EVA is still the top performer of current local RP models.
>>
>>103598513
Function calling has existed for a while. It wouldn't surprise me if it just searches for that kind of stuff before generating.
>It's like sonnet is 20 years old and every other model is 40.
How long ago were you 20? Don't you remember how much of a retard you were?
>>
>>103598513
This is why I never touch cloud shit. I'll always be content with local because it's all I know.
>>
File: OpenAI_employee_221.png (21 KB, 344x200)
>>103597898
not just smart. It's AGI
>>
>>103598603
>Who long ago where you 20? Don't you remember how much of a retard you were?
5 years ago nigga
>>
Can o3 cure the common cold?
>>
>>103597898
Post a link to the weights and we will, otherwise fuck right off back to /aicg/
>>
>>103598646
Finally, it can do my dishes and laundry for me
>>
>>103598603
I wasn't THAT retarded 2 years ago. More retarded than today, sure, but still better than the average person... probably
>>
Did that concept of "LLM as compiler" ever go beyond the initial demonstration?
>>
>>103598603
Anon, why are you still here?
>>
File: 1709827537987402.jpg (62 KB, 688x684)
>>103591969
>local will take advantage of it anyway
Any day now!
>>
>>103598368
It doesn't matter if you're right or wrong. That's a stupid thing to say.
>NPCs always trying to appeal to a "consensus" rather than verifiable fact
>>103598447
Next time say "it writes loli erotica" rather than talking about some imagined consensus.
>>
Posting again.
Can anyone test this prompt with Gemma on Llama.cpp and/or transformers? Here is the link:
pastebin.com 077YNipZ
The correct answer should be 1 EXP, but Gemma 27B and 9B instruct both get it wrong (along with tangential questions) with Llama.cpp compiled locally, with a Q8_0 quant. Llama.cpp through Ooba also does. Transformers through Ooba (BF16, eager attention) also does. Note that the question is worded a bit vaguely in this pastebin, but I also tested extremely clear and explicit questions, which it also gets wrong. And I also tested other context lengths. If just one previous turn is tested, it gets the questions right. If tested with higher context, it's continuously wrong.

Exllama doesn't have this problem. The model gets the question and all other tangential questions right at any context length within about 7.9k. So this indicates to me that there is a bug in transformers and Llama.cpp. However, a reproduction of the output would be good to have.
>>
It passed the Nala test, it writes cunny, it writes gore, with no refusals or attempts to steer away from it. I'd count that as objectively unfiltered.
>>
>>103598654
>>103598683
Ah.

>>103598726
>Anon, why are you still here?
Closest thing to social media i use, and something to do while on breaks of the rest of the things i do. You?
>>
File: which_one.png (378 KB, 680x412)
>>103598793
Which one?
>>
>>103598793
Was your post supposed to start with an "if"?
>>
File: file.png (49 KB, 778x248)
>The test for """AGI""" is just completing patterns
But that's like the very thing LLMs do. Why is this surprising?
>>
>>103598026
o3 isn't a goal, it's a dead end. I bet it's not even better for cooming, i.e. not actually smarter. They are just benchmaxxing. Unless you make money from solving cute puzzles and coding tests, there's nothing to get excited about there.
>>
>>103598852
No, logs of all those were posted in previous threads.
>>
File: 1703688087796183.jpg (120 KB, 1004x1108)
https://help.openai.com/en/articles/10303002-how-does-memory-use-past-conversations
>>
>>103598906
Oh, I see.
Alright.
>>
>>103598513
I've been using Claude 3.5 Sonnet a lot recently. I've become increasingly aware of the limitations of its writing style and its occasional logical errors. It isn't really head and shoulders above other 70B models for fiction writing.

It has a better library of reactions but not a perfect one. Real example of success from earlier this year: I asked a yandere AI to clone me a human woman as a romantic partner. Sonnet 3.5 understood the AI should be jealous but a raft of other models including the first Mistral Large did not. (I didn't use the word "yandere" in the defs. It's shorthand for this post.) Real example of failure from yesterday: a woman who was under guard allegedly for her own protection but also to control her had an opportunity to replace her chaperones with a security detail under her own control because an incoming administrator didn't get the memo, and she went full SJW "actually my supposed bodyguards are there to stop me from joining the resistance against this unjust society, so it would defeat the purpose to let me pick people who answer to me" instead of just shutting up and doing it. Importantly the character was not described as mentally retarded.

Example of compound logical failure from today: in a situation with a pair of siblings, a brother and a sister older than him, it called the boy his own younger brother. When asked OOC what that sentence meant, it acknowledged the error and that the boy was younger, then it rewrote the scene calling the boy the girl's older brother.
>>
>>103598915
ChatGPT just got upgraded to LLM 2.0 LFG!
>>
File: 5233252.png (21 KB, 786x474)
>>103598880
uh akshually chud now that it's completed we can reveal the real AGI test.
>>
>>103598756
>pic
5 or 25?
>>
>>103598447
You fell for one of the oldest tricks in Sao's book which is spamming the general to form the "thread consensus".
>>
>>103598894
The benchmarks o3 excelled at have not been publicly released. To claim they trained on private tests or that it's not smarter at all is absurd.
>>
>>103598932
That's just a method to counter benchmaxxing.
>>
>>103598802
I'm just bored, so I guess we are the same.
>>
>>103598937
As some wise elders say "Not my problem", you let discord shitters do it with impunity.
>>
>>103598915
That's... That's just RAG
>>
>>103598976
I like to see the thread going to shit though
>>
>>103598979
no, it's OpenAI ChatGPT Memory™
>>
>>103598937
they all do it
>>
>>103598979
Trve... Fact checked by independent lmg court from beautiful India.
>>
How many billions of parameters does a model need to stop writing pajeet-tier code?
The other day it used a for loop with a 1k buffer to copy data from one stream to another when Stream.CopyTo() was a valid solution.
>>
Does really no one here have a copy of Gemma GGUF they can just load up and try something out quickly?
>>
>>103598979
o1-style response iteration (writing a reply, then writing a criticism of that reply, then writing a new reply based on the original input + the first reply + the criticism, repeated several times) could fix the inherent problem in RAG that it only brings up information about something after it has already been mentioned, so it doesn't help when the AI is the one introducing the term. That only works if the backend stops and applies RAG before each criticism iteration.
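A sketch of that loop (pseudocode of the idea, not any backend's actual pipeline; generate() and retrieve() stand in for whatever your stack provides):

from typing import Callable

def iterate_reply(user_input: str,
                  generate: Callable[[str], str],  # stand-in: one LLM call
                  retrieve: Callable[[str], str],  # stand-in: RAG lookup
                  rounds: int = 3) -> str:
    reply = generate(user_input)
    for _ in range(rounds):
        # key step: retrieve on the draft reply too, so terms the model
        # itself introduced get their entries pulled in before revision
        context = retrieve(user_input + "\n" + reply)
        critique = generate(context + "\n" + user_input +
                            "\nDraft: " + reply + "\nCriticize the draft.")
        reply = generate(context + "\n" + user_input + "\nDraft: " + reply +
                         "\nCritique: " + critique + "\nWrite an improved reply.")
    return reply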
>>
>>103594837
I mean, newer 7b models give results on par with GPT-3.5 turbo, quantization keeps improving, there keep being algorithmic improvements such as flash attention, etc.

Yes, currently it would not be practical to replicate something like this, since even with all of OpenAI's resources it is still at the parlor trick stage (the actual models won't be released for months), but it might be feasible locally sooner than we think.

Last spring, a lot of people were amazed at Sora when OpenAI announced it. By the time they released it, there were some much better commercial versions by competitors with actual products, some of them making weights available, and by all accounts, Sora pales in comparison to a lot of other commercial ones at least.

OpenAI is marketing heavy, but for the nth time, has no moat. They have their brand. They're the Bitcoin of the latest AI wave. They might, like Bitcoin, succeed because first-mover advantage is that powerful and people are dumb (buy Monero). The reason they're selling that vaporware several months in advance is that it's what they need to appear ahead; their current products are not enough.
>>
>>103599060
nope, most here just shitpost and don't even use models
>>
>>103599060
>Gemma GGUF
I have the fp16s from ollama.
What do you want tested?
>>
>>103599102
A really good local model is like a unicorn: it's not real.
>>
>>103599057
You won't like to hear this...
Basically it's not about parameter count. 70B and above could learn to do it properly. It's about having a ton of high quality data and training for a long time. That's how you get non-pajeet code. And the part you don't want to hear is that the only way we'll get data in that quantity and quality is by researching better methods of generating it: "synthetic data". There are different ways of generating synthetic data, and just any shit method isn't sufficient. The synthetic data needs to be high quality and high diversity so the model learns to generalize and doesn't overfit. So more research needs to be done, at least on the open source side. Anthropic has already done this, which is why their models code so well compared to everyone else.
>>
>>103599119
This >>103598786, thanks.
>>
>>103599119
A prompt that's 24 KB of plaintext (lol).
>>
>>103599057
32b is usable
70b is not much better
claude blows all open models out of the water. o1 is better but much slower and MUCH more expensive
>>
>>103599087
> newer 7b models give results on par with GPT-3.5 turbo
Come on now
>>
>>103599102
There are still some, like me, and the guy last time that had an exl2 copy. It's understandable that Gemma is not popular given its advertised context size. And my post was kind of long so it's understandable no one cared enough to even read a single sentence of it.
>>
>>103599175
>>103599182
I gave it 8k of context and it estimated that the prompt was 5752 tokens.
>>
>>103599270
You clearly didn't paste the whole thing in. It ends with a question about XP costs. And btw it's 11K tokens.
>>
I only see 279 lines in the pastebin.
>>
>>103599270
That sounds close? If I copy and paste the pastebin text into Mikupad, it reports 5634 tokens to me.

>>103599298
How'd you get that? That should crash the backend or generate gibberish but it's clearly working on my end. No rope.
>>
>>103597950
we're doing way better than anyone expected
>>
>>103598368
It was. It was also retarded
>>
>>103599203
*depending on use case, I guess.
>>
>>103599335
I got that by using the token counter endpoint. It turns out if you CTRL-V twice it's 11K.
>>
>>103599377
I expected better.
>>
I want to try the status block meme for RP. Any good templates? What should I include?
>>
>>103599417
That's what my parents tell me every day
>>
>>103599387
Kek.
>>
>>103597898
I can't run it on my PC so I don't care.
>>
>>103599432
*emotional damage*
>>
>>103597898
>It's the smartest model in the world.
We can't test it and o1 is garbage at RP, somehow even more bland than gpt4o and feels dumber. I don't expect o3 to be any better.
>>
>>103599592
That's because your RP is dumb and doesn't need reasoning. RP with a scenario about solving riddles and then you'll realize how smart it is.
>>
>>103598802
NTA but to break my obsession with browsing 4chan in my free time I started reading ebooks, you could give that a try as well
>>
>>103599432
Thankfully I disappointed mine enough to stop hearing that.
>>
>>103599623
I wouldn't call my usage obsessive. I mean short breaks while doing other things when those things happen to be on the pc. If threads go fast, i let them run, if they're slow, i may drop a line here and there. I take time for reading books most of the days.
>>
>>103599613
>RP with a scenario about solving riddles
do anons really
>>
>>103599716
It's all pure placebo
Riddles and narrative test scenarios like the watermelon test are the stupidest thing that has ever come out of /lmg/.
>>
>>103599713
Good for you, I used to just browse random threads when I was out and about because there really isn't a lot I can do on my phone and I quickly get extremely bored otherwise. I figured I'd start reading real books instead of schizophrenic ESL shit, hopefully it'll help me write more effectively in the future. What are you currently reading? Me, I'm catching up on "The Expanse" as the TV show didn't adapt it 1:1 and ended early
>>
>>103599613
>your scenarios are dumb that's why AI struggles with it
what?
>>
Here I was thinking o3 was a nothingburger, but now I realize that riddle fetishists are eating good
>>
>>103599786
I can't wait until January. For the price of a 4090 I can have o3 solve any riddle I want once.
>>
>>103599785
Garbage in garbage out anonie
>>
i've mostly stuck to 70 and 30b tier models but i wanna see if smaller models can be useful for something, what's the overall best 3b and ~8b tier models? is there anything even smaller that any of you have found useful?
>>
>>103599781
Going through John Varley again. All the short stories i could find and the gaea trilogy (titan, wizard and demon). I tend to like the short stories better. Most books don't need 300+ pages. But i have a way-too-big back catalog of older sci-fi i should go through as well. GBs of stuff i'll probably never get to read.
>>
>>103599850
Ifable 9B.
>>
> Rocinante-12B-v1.1 - Dumb. Apparently, one must use ChatML formatting for RP, but the goddamn thing doesn't have the proper tokens for it.
> All the magnums - Overtrained on coomslop; every card sounds the same with uniform personalities.
> Violet_Twilight-v0.2 - Too many newlines, repetitive.
> Mag-Mel - Nah.
> sao - Dead in a bathtub.
> Ikari and Undi - Nope.
> Grype - Irrelevant since Mythomax.

Please, /lmg/ gods, I need a decent 12B tune. I can't take it anymore
>>
>>103599750
Found the Falconer
>>
>>103599920
did you try slush?
>>
>>103599298
If I edit its reply to say "To answer your quiz" then hit the continue response button, I get pic related.

>1 exp
>gemma2:27k-instruct-fp16

>100 exp
>gemma2:9k-instruct-fp16
>gemma2:2k-instruct-fp16
>gemma1.1:7k-instruct-fp16
>gemma1:7k-instruct-fp16

>llama3.1:8b-instruct-fp16
>naturally gave 100 exp
>using "To answer your quiz" gave 1 exp
>>
>>103599949
No, but I will, because fml.
>>
>>103599823
That applies to training, but a model that is intelligent (and has been Instruct tuned or is in any other way trained for interacting with humans) should absolutely be able to take a garbage prompt, figure out what the person writing the prompt wants, and give it to them. If it's unable to do this, it's a failure of the model.
>>
>>103599999
>mind-reading should be a basic function of any model
niggawatt
>>
>>103599920
What about just Mistral's original tune? Personally I even found it to be a bit too horny, so I avoided trying any community tunes since that'd logically be even hornier (and stupider).

Anyway I think I remember hearing that UnslopNemo was the best RP tune for 12B, maybe try that out?
>>
>>103592233
>cooming on code models
zased... so fvcking... zased
*kneeling*
>>
>>103599999
I disagree. People who don't give in the effort don't deserve the best rewards.
>>
>>103599899
gemma 9b is actually the only smaller model i've kept around, good to know i have objectively perfect taste
>>
>>103599999
checked
>>
>>103600031
Yes. More to the point, a better model should be better at mind-reading. AI does the cognitive workload for you.

A human skilled at writing compelling stories would be able to entertain a stupid person who wants a specific type of story without the stupid person needing to write their own as an example first.
>>
>>103600036
I tried that one too. It's just Rocinante with added ChatML tokens, but dumber. Anyway, about the original Mistral Instruct, were you never bothered by its rigid patterning? No matter how much effort I put in or how diverse I made my cards, not even using schizo-system prompting, I could never break its tendency to fall into this repetitive structure: She did blah blah, then blah blah. "Dialogue dialogue." She went, she did, blah blah. She yada yada. In my experience, it overuses "she" and results in bland prose.
>>
>>103599999
nta. If there are contradictions in the prompt, the model can go either way. If it's missing important details, the model will make stuff up or not mention them at all.
Those are issues that are too common in prompts, and the prompt writer is to blame.
I imagine something similar happens with art commissions. If the request is vague or messed up, the one fulfilling the commission will interpret. Like prompting just "big titties" in image gen and then complaining that you don't like redheads when it's done.
>>
>>103599999
This.
sonnet doesn't have this problem. we need local sonnet. I hope meta drops their llama 4 soon.
>>
>>103600110
Increase temperature and repetition penalty
>>
>>103600130
Is 0.7 temp, 0.05 min p and 0.8 dry not enough?
>>
>>103600069
Tbh Ifable's tune is the only tune I've tried of 9B. Now that I actually go look at a different benchmark (UGI), I notice that the top 9B is Tiger Gemma v3. Now when I go back to eqbench, I can't find it on there. Unfortunate. It would be interesting to see where Tiger Gemma places given how supposedly uncensored it is.
But given how it performed, maybe I will give it a try personally.
>>
>>103599999
checked trvth nvke
>>
>llama 4
oh boy I can't wait for a 1T dense model that trades blows with 4o (May) in select benchmarks
>>
>>103600110
Honestly don't remember if it was like that but it may have been. Since it was so horny I stopped bothering to use it, as I am someone that can run 70Bs and was just curious what smaller models could do.
>>
>>103600136
Dry doesn't stop the model from repeating single tokens, I would increase the temperature to 1.0 and decrease the MinP to 0.02
>>
>>103600142
Zucc said that their biggest model will be smaller than the current biggest one but smarter.
Most likely somewhere between 200-300B.
>>
>>103600142
>1T dense model
You'll be ready to run it, right? You have been accumulating VRAM like the rest of us, haven't you?
>>
>>103600172
>VRAM
Nigga we all using Xeon 6 multi channel now
>>
>>103600118
Jokes aside, Sonnet 3.5 really is great at taking an absolute trash prompt and outputting something decent.
>>
>>103600142
Meta has too many H100 GPUs to mess it up. They have more than all other companies combined.
They better not.
>>
>>103600237
>They have more than all other companies combined.
Um, no? They have about as much as xAI does now.
>>
File: 78921378217391.png (61 KB, 904x579)
>>103600245
retard
>>
>>103600142
>thinking llama 4 will only be 1T
Don't worry anon, there will also be a 3B model which is best in class and trades blows with the best 7B models on benchmarks.
>>
>>103600276
That is literally working on old information. Retard thinking I'm the retard here.
>>
>>103600276
>infinite money
>tons of talented engineers
>most compute on earth
>their models are worse than chinks release
i just don’t understand
>>
>>103600341
so how many GPUs does xAI have now?
>>
>>103600328
Everyone would be happy if they released 3B, 30B, and 300B
>>
File: file.png (94 KB, 747x1045)
>>103592233
>>103592316
>>103598256
>one result
>>
>>103600352
>infinite money
CEO and management takes it
>tons of talented engineers
tons of jeets
>>
>>103600352
>their models are worse than chinks release
They're not. llama 3.3 is the top model currently.
>>
>>103600352
Their models are far safer than anything the Chinese have put out.
>>
Are there any examples of diffusion based LLMs out there?
>>
>>103600353
The same as Meta training Llama 4. You think Meta is training on a 350k cluster? It doesn't exist. The cluster training Llama 4 is a bit more than 100k. This comes from Zucc in the last earnings call.
>>
>>103600395
no, DiT hasn't been used for LLMs yet
>>
File: file.png (160 KB, 624x1203)
>>103600365
>>
>>103600399
more than 100k is still a lot.
Last time they trained on 24k H100s and their biggest model took 50+ days on 15T tokens.
They could pretrain their new biggest model in a week or two at best, which is way better.
>>
>>103600420
That answers nothing, what is the pinkie test in the context of evaluating coom models?
>>
>>103599999
Lol this, this is exactly what CAI did in its prefilter glory back in ye olde days.
>>
>>103600376
rope yourself
>>
>>103600365
>Google
go back
>>
>>103600442
it's true chang. No one uses chink models, just look at stats on openrouter. All coomers use 3.3 or sonnet.
>>
>>103600431
Sure, but it's not some fantasy number of GPUs no one else could possibly have. The numbers probably aren't exact either. There's no telling if xAI's report is actually 100k or a bit more but rounded, like Meta, since Meta's report came out after xAI's, likely in reaction for boasting purposes.
>>
>>103600442
Only gemmies need the rope.
>>
File: 124124346457658.png (6 KB, 460x122)
A blast from the past when Llama 3 first appeared on the Replicate API.
>>
>>103600442
Cloudcucks always so mad to see localchads thrive
>>
>>103600524
>Cloudcucks always so mad to see localchads thrive
Yes, this is hilarious. It's like, something new came out in closed-land and now I'm supposed to be sad?
Bro, my current stuff still works and it's just a sneak preview of what I'll have in a few months anyway (or just as likely, what I already have, because the big western corpos ignore chink models when they make meme-graphs)
>>
>>103600437
Probably some completely worthless garbage, judging by that guy's activity
>>
File: 1732555821086563.jpg (88 KB, 545x518)
for my st director plugin, i dunno why i put in the effort for text boxes when i could have done what i already was doing with lorebooks. derp but at least i was able to reuse most of the actual work
>>
For those of you who use a cloud service - which one are you using?
If I use google (which I've used before), is there anything special I should rent out? What are the specs for diffusion jobs?
Thanks.
>>
>>103600524
"Cloudcucks" are busy chatting with prefill sonnet, seems like a win for me.
>>
>>103600793
No, those are the cloudchads. The cloudcucks are the ones that don't even immerse themselves in the models they use (if they use them at all) and instead spend their time going on social media shitposting about the thing they're supposedly so happy with.
>>
>>103600828
>Twittards and redditors say things!
Yeah, for a reason.
>>
>>103600828
Like imagine being such a cloudcuck or even localcuck that instead of being like a normal person and happily enjoying your hobby, you go online to argue with people about how good or bad [thing] is.
>>
>>103600709
Link?
>>
>>103600276
I desperately want the H100, but I'll have to wait until it becomes cheap and obsolete like p100
>>
>>103600914
A100 still isn't cheap and H100s are under buyback agreements. You're going to be waiting a loooong time
>>
>>103600898
https://file.io/XCI58sDJLMsv
that's the last one i released, working on an update though. its point is that you create lorebooks for clothes, hair and stuff, then you can quickly change them via dropdowns in the addon. it's basically the same as adding to your author note: char is wearing <lorebook entry>, but instead you get dropdowns of those saved entries. install to st\data\default-user\extensions\
some st update a while back changed the theming a bit and the buttons got messed up, but the order goes user, char, world, notes, preview, lorebooks
>>
2advanced4lmg
https://x.com/novasarc01/status/1870181817162285120
>>
>>103601121
it's literally just coconut
>>
Bros I think Gemma is legitimately innovative in what it did. It basically tried to prove that modern models may be using or rather wasting too many of their parameters just to chase a high context length, and it succeeded. The models were way more knowledge-dense at the cost of context length. They even used a sliding window on half of the layers to boost performance even more, though that makes the model even worse at handling context extension. What we really need is a next generation version that does the same thing but gets to around 32k instead of 128k. It wouldn't be nearly as knowledge-dense, but it'd be usable to most people finally without any context extension tricks that degrade performance.
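The sliding-window part, in toy mask form (an illustration of the mechanism only, not Gemma's actual code; if I remember the config right, Gemma 2 interleaves 4096-token local layers with full-attention layers):

def attention_mask(seq_len: int, window: int | None) -> list[list[bool]]:
    # True means position j is visible to position i.
    # window=None -> plain causal mask (global layer)
    # window=W    -> causal, but only the last W tokens are visible (local layer)
    return [[j <= i and (window is None or i - j < window)
             for j in range(seq_len)]
            for i in range(seq_len)]

# toy sizes: alternate local (window 4) and global layers, as Gemma 2 alternates
masks = [attention_mask(16, 4 if layer % 2 == 0 else None) for layer in range(4)]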
>>
>>103601121
Wdym lmg? You post here all the time.
>>
File: ai_fail.png (10 KB, 847x97)
>>
Deepseek r3 when? QwQ3? We aren't going to let Sam get away with this, right?
>>
>>103601203
Now that you're talking about gemma, I have been trying a few models to translate chinese to english and gemma2 9b is one of the best. Qwen 2.5 14b somehow performs worse than qwen 3b.
>>
>>103599980
>27k
Huh, is that an Ollama thing? I guess they're using rope for that. Makes sense it could start answering correctly. But thanks for testing. This would confirm Llama.cpp does have an issue with Gemma that Exllama doesn't.
>>
>>103601321
DeepSeek T1 will be out on Christmas and it will be better than o5. Trust the plan.
>>
>>103601343
Just imagine when they're on version 800.
Haha get it.
It's a reference.
Haha...
>>
>>103601332
I think someone mentioned using Gemma 27B was preferable to Qwen for translating Japanese. If that's true even for Chinese then that'd be pretty funny.

>Qwen 2.5 14b somehow performs worse than qwen 3b
That's kind of weird though. Maybe their 14B was a bit of a fail.
>>
>>103599378
For what it's worth I tried IQ3_XS and IQ2_XS quantizations of Llama 3.3-70B, and the latter felt substantially worse than the former (overall duller and less interesting outputs, less attention to detail, more formatting mistakes), so there's that as well.

Serious investigation into the effects of low-precision quantization needs to be done, because I'm not sure if MMLU scores (which in theory still place Llama-70B at ~2-bit above the 8B version in FP16) tell the entire story.
>>
How far are we from actually running an AI Dungeon like program locally with strong recollection and general response quality? Assume a 5090
>>
>>103601539
Qwen 3b works pretty well for a "normal" translation, but since I'm using it to translate a novel it wasn't enough. I don't know what was wrong with 14b, but with the same prompt and the same novel it performed considerably worse. Maybe it would be better with a different prompt but I was busy trying other models.
Nemo was also decent but gemma feels more "accurate". I can't really tell accuracy with so little testing and no reference translation, but this is how it feels to me so far.
I'll give 27b a try since it was mentioned.
>>
>>103601767
Further away than ever before. Soulful completion models like Summer Dragon are dead. All that's left is instruct tunes that are as predictable as they are boring.
>>
>>103601014
It's quite handy thanks for making this
>>
Is there a good archive of high-quality, clean Touhou voice samples somewhere?
>>
Asking on the off chance anyone is going to give me a serious answer: I have a 96GB AI server I use to run mainly Mistral-Large based models. Is DeepSeek 2.5 actually worth caring about? Should I be looking for some more cards to run it?
>>
>>103601804
What are you running the models for? ERP?
>>
>>103601804
I prefer deepseek (especially 1210) at q8 over largestral at q8
Is it worth it? How much is a boost in intelligence worth to you? Vanilla rp isn't going to get much better imo. You'll need complex scenarios or actual intelligence-stressing tasks for it to be worthwhile.
>>
>>103601812
primarily, yes
>>
>>103601804
deepseek is smarter and knows a lot more but is drier and needs xtc imo. The speed alone though makes it worth it.
>>
Is v100maxx chad on here? How worth it is your setup? I'm thinking about getting some of these and some v100s as a cheap alternative to 48gb cards
https://www.ebay.com/itm/296856182515
>>
>>103601804
How much combined RAM and VRAM? You need like 192GB to run a decent quant with a decent context length, especially since Llama.cpp doesn't support flash attention for DS.
>>
>>103601859
>>103601859
>>103601859



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.