/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>102645080 & >>102632446

►News
>(09/27) Emu3, next-token prediction multimodal models: https://hf.co/collections/BAAI/emu3-66f4e64f70850ff358a2e60f
>(09/25) Multimodal Llama 3.2 released: https://ai.meta.com/blog/llama-3-2-connect-2024-vision-edge-mobile-devices
>(09/25) Molmo: Multimodal models based on OLMo, OLMoE, and Qwen-72B: https://molmo.allenai.org/blog
>(09/24) Llama-3.1-70B-instruct distilled to 51B: https://hf.co/nvidia/Llama-3_1-Nemotron-51B-Instruct
>(09/18) Qwen 2.5 released, trained on 18 trillion token dataset: https://qwenlm.github.io/blog/qwen2.5

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Programming: https://livecodebench.github.io/leaderboard.html

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
recap anon gay
>>102654480slash recap anon's salary
Tell me something you've done with LLMs today.
Does anyone have a magnet for llama 3.2 11b vision? For whatever reason it's unavailable to download in Europe.
>>102654548I was gooning a little after midnight but then I got bored (mistral large is getting stale) so I switched to gelbooru.
>>102654480slash miku's throat
►Recent Highlights from the Previous Thread: >>102632446

--Local is dead, in other news the new OpenAI advanced voice mode is pretty cool

►Recent Highlight Posts from the Previous Thread: >>102632451

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
>>102654548found some nonpozzed models with claudeslop
>>102654614I think you should seek help, I'm not even trying to be mean. Your obsession with this thread is very clearly unhealthy.
Claude:
>API
>so good localfags finetune on its worst outputs
OpenAI Advanced Voice:
>API
>first TTS to do convincing emotions at inference
Dall-E 3:
>API
>so good localfags finetune SDXL (lol) on its worst outputs
local:
>slop
>shivers cope
>xtts-v2 cope
>flux cope
>discord sloptuners calling for reddit moderation in /lmg/
It's fucking over isn't it?
Is 48GB of VRAM enough to make a 70B AWQ quant?
>>102654710
>>xtts-v2 cope
valle + a LORA is all you need.
they can't stop winning...
>>102654739
>https://valle-demo.github.io/
>404
LOCALBROS...
>>102654614
>>102654710
>>102654744
hi sam
>>102654701
it's sam. he is still butthurt about the fact that meta's llama 405 performs at the level of gpt4 and is open source. he can't take it away, he can't moderate it, so he just tries to scare new people off to prevent wider adoption.
>>102654548I made anime girls real!
>>102654710>flux copeI KNOW this is just a shitpost but flux is anything but cope.
>>102654614petra is better than you
>>102654548Realized that it is functionally impossible for me to socialize with real people for more than a few minutes both because they bore me to tears and because what I consider a "good time" hanging out makes other people miserable.
>>102654563https://huggingface.co/unsloth/Llama-3.2-11B-Vision-InstructThe least you can do is search for a reupload, you lazy fuck.
>>102654802Is it a competition?
>>102654799I'm not shitposting and it's not cope. flux doesn't hold a candle to Dall-E in terms of prompting and even SDXL is better at things like ripped clothes/dirty faces/blood.
>>102654856slop-e is worse at humans, sorry sam>verification not required
>>102654799I don't know, Flux didn't really impress me. I guess you can make nice Migu images with it, but you can find those on the boorus too.
>>102654883>absolutely no argumentI accept your copecession.
>>102654856>flux doesn't hold a candle to Dall-E in terms of promptingI literally tested all of this on day 1. Flux blows DALL-E out of the water for prompt understanding and conceptual granularity. Like it's not even a fucking contest. Buy a fucking ad saltman.
I have 96GB of VRAM and 128GB of RAM. Thinking about trying some 405b quants locally, but I've never attempted this before. I have some questions if anyone can help.
1. Can llama.cpp even load a model split between GPU and CPU without loading it entirely into RAM first? Meaning if I had a 150GB model would it OOM while trying to load it.
2. Is a quant like IQ2_XXS usable for 405b? Is it better in any way compared to a 70b q8?
3. I remember something about certain IQ quants being slow if offloaded on CPU. Is that still a thing and if so which quants is it?
>>102654892>I literally tested all of this on day 1.Then you won't mind sharing some of the comparison images and prompts? I look forward to seeing them.
>>102654903>96GB of VRAM4x 3090?
Can I run a decent quant of mistral large with 24gb vram and 64gb ram? I don't care about inference speed. If I get 0.3t/s that's fine. I just want to try it.
they're either torching money on advanced voice or api pricing is a scam
>assume a voice convo costs $0.10/min
>if you use it 15mins/day, that's $45/month
>>102654913you haven't too :)
>>102654903
>Can llama.cpp even load a model split between GPU and CPU without loading it entirely into RAM first?
Disable mmap.
2. No idea. I suppose they are reasonable if people use it for 70b.
3. At that point it doesn't matter much. It'd be the difference between 0.1 and 0.15 t/s.
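Not sure what your exact setup is, but something along these lines should get a GPU+RAM split going; the filename is just an example and flag spellings occasionally change between llama.cpp versions, so check --help:
[code]
# offload as many layers as fit across the GPUs, keep the rest in system RAM,
# and skip mmap so the whole file isn't mapped while loading
./llama-cli -m Llama-3.1-405B-Instruct-IQ2_XXS.gguf \
    --n-gpu-layers 60 \
    --no-mmap \
    -c 8192 \
    -p "test prompt"
[/code]
Tune --n-gpu-layers down until it stops OOMing on the cards.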
>>102654880Ugly blonde chick spammer, I'm sorry for being mean to you. You are much better than that boring proprietary cocksucker that we got in your place.
>>102654941>I have proof local is better but I'm not going to share itCaught lying and conceded with an emoticon immediately. Embarrassing.
>>102654973ADOLF HITLER IS A NIGGER
>sperging because he got caught in an obvious lieYeah, it's over for local.
>>102654960^_^
>>102654917
4090s, but yeah basically that
>>102654953
Thanks, I'll try with mmap disabled and hopefully it loads.
>>102654960The thread getting more dead means even shitposters are replaced by lower quality ones. Like Petra anon was way more interesting than cloud shill and buy an ad spammer
>>102655038Hi Sao
>>102654903I'm way out of that range. I could draw comparisons from others though. I think that 2.5bpw quants of 70b models outperform q6_K_M quants of 22b stuff.
>>102655038Hi Cuda dev
>>102654927You won't be able to fully load any decent quant of it on GPU. On RAM the biggest you can load is Q3_K_S. If you can, quant it yourself with bf16 or q8_0 embed and output, they should improve output quality without increasing model size too much. Get full 128GB so you can run Q6_K, it's worth it, trust me.
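If you do quant it yourself, llama.cpp's llama-quantize has per-tensor overrides for exactly that; roughly like this (filenames are examples, double-check the flag names on your build):
[code]
# keep the output and token embedding tensors at q8_0 while everything else goes to Q3_K_S
./llama-quantize \
    --output-tensor-type q8_0 \
    --token-embedding-type q8_0 \
    Mistral-Large-Instruct-2407-BF16.gguf \
    Mistral-Large-Instruct-2407-Q3_K_S.gguf \
    Q3_K_S
[/code]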
>>102654960petra is a proprietary cock sucker, he keeps spiting any local generals.
>>102655038This thread seems pretty active to me.
>>102654913I don't care about you that much to bother C:
>>102655038No, we're not going back to the cloud.
>>102654960Oldfag here, I agree with this. I have been here since the beginning and as far as I remember Petra at least cared about the general and wanted to end the Miku menace.
>local is bad
Oh really? anthracite-org/magnum-v2-123b
>>102655070Thanks.
>>102655125
>Petra at least cared about the general
No he fucking didn't, he did the same shit he did to /vsg/ by spreading FUD to try and kill the general.
>>102655085believe
I'm using Silly's vector functionality with its native transformers.js lib, using
>Snowflake/snowflake-arctic-embed-m
as the embedding model. Opinions, suggestions?
I'm using llama.cpp to serve the main model. I can't use that to both generate text and provide the embeddings functionality at the same time, right?
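As far as I know the usual workaround is just running a second llama-server instance dedicated to embeddings on another port and pointing Silly's vectorization source at it; something like this, though the embedding flag has been renamed between versions so check yours (model filenames are examples):
[code]
# main model for text gen
./llama-server -m mistral-large-q4.gguf --port 8080 --n-gpu-layers 99

# small embedding model served separately
./llama-server -m snowflake-arctic-embed-m-f16.gguf --embeddings --port 8081
[/code]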
>>102655128Tried Luminum yet?
>>102655153There's no reason to try anything other than Magnum.
>>102655038The buy an ad spammer is Sao false-flagging.
>>102655141
>>102655176Hi Drummer
>>102655165anon, don't say shit like this even as a joke. You may invite people that unironically think this to the general. This is a fairly common occurrence in online communities.
>>102655139>the same shit he did to /vsg/To think Miku was the right choice to defeat him...
>>102655195Good. I would like to be surrounded by people I agree with. I'm sick of being called a faggot any time I say anything in this general.
>>102655139>FUDopinion discarded.
>>102655212>I'm sick of being called a faggot any time I say anything in this general.Now you're just asking for it...
>>102655215It was genuine FUD and you know it.https://desuarchive.org/g/thread/95983527/#95984811
>>102655239
>>102655239I was actually talking about your use of cryptobro vocabulary, but okay, I guess.
Can't wait for you to get IP wiped again.
>>102655128>anthraciteThey have like unlimited VRAM for free at their disposal. Where is our dedicated ERP model after all this time?
>>102655511vance won the debate
>>102655511hi betiful show bobby plz
>>102655511Will you start another arbitrary countdown after 34 days? 260 days until bitnet or some such.
What are the implications of this to local models?
>>102655743>webm
>>102655743openaisisters.. not like this
>>102655743Open AI bubble bursting is good for local.
>>102655743Wtf is that real? Is Sam OK?
>>102655743me in the back
>>102655743>so this is what those faggot artists meant by the "A.I bubble bursting"...
>>102655743This is extremely unethical and unsafe. We need to regulate China NOW.
>>102655743openai BTFO
>>102655608based
ah good to know corpospeak still pisses me the fuck off
>>102655743
seems like every vidgen site is getting hammered lately, here's hoping local doesn't fall behind by the end of the year.
>>102656093
oh my god i wasn't expecting the squishing sound effect, anon failed to deliver the sound of (((altmann))) being inflated.
https://files.catbox.moe/j254od.mp4
>>102656093sorry anon they have to get through my 10 queued gens of inflating girls first
>>102656122Understandable, have a nice day.
>>102654927Yeah I was running Mistral Large IQ4_XS at low context with that 24gb vram 64gb ram config, it works, probably around 0.3t/s yeah. I think page file might have been used there tho, I don't remember as it was a few weeks ago.
>>102655743
we are really getting there huh? at this point I wonder why LLMs are still so far behind.
https://xcancel.com/emollick/status/1841345969184498168#m
>>102656093>here's hoping local doesn't fall behindnobody tell him local already fell behind
>>102656303the year's not up fat lady, you can't be singing yet.
New official comments about the state of local AI:
Joe Biden: asfeiogjegjewigrji what?
Donald Trump: Tremendous progress bigly even
>>102656391
33 days left until october 5th
>>102656313
>>102655030 (me)
Okay, 405b IQ2_XXS works and is coherent. 1.5 tok/s on 4x4090 + 128GB RAM. And I'm only using half the RAM slots, the bandwidth could easily be doubled if I buy 4 more sticks. Not bad speed at all, definitely usable for testing purposes.
Unfortunately it's kinda retarded. It's making mistakes and weird choices I'm pretty sure 70b at a decent quant wouldn't. It fucks up grammar and misspells words, or makes up nonsensical words sometimes too. I'm downloading IQ3_M, which I think should just barely fit across my VRAM+RAM, let's see if that one's any better.
>>102656450im speaking, sound familiar?
>>102656112Huh? You mean they generate audio for the video too?
>>102656477GUTEN MORGEN KAMERAD DUMMKOPF WIR VERSAMMELN UNS HEUTE UM DEITSCHLAND ZU DIENEN
>discovered I could run mistral large q2_xs slowly on my computer
>refuse to go anything smaller now despite 1 t/s because any other model seems retarded, boring and/or shallow in comparison
I hate this
>>102654903A 96GB Vramlet should be targeting decent quants of 70-100+B tier models. Llama 3.1 series suffers more from quantization than most. For 405B if you can't fit at least Q5 you'd be better off with the Q8 70B version. Right now I think Largestral and Qwen 72B are the best options for this range of memory, and I'd pick them over any Llama unless I had a DGX supercomputer collecting dust.
MN finetunes seem giga retarded compared to using Gemma 2 9b it simpO. Prose sucks too.
>>102656755I don't care anon
>>102656767color me surprised
>>102656767
>Gemma 2 9b it simpO
downloading now, i'll try it but i don't have high hopes
my cornucopia of nemo meme merges and tunes have been serving me really well
>>102656767
8k context = useless
>>102656514You'll hate more when there's a small model release that's hyped up, you try it, and see it writing paragraphs every second and then you read them and they're the most generic, context ignoring shit you've seen.
>>102656966Can Gemma 2 actually use 8k context now? Last time I checked sliding window attention was working with a hackjob
>>102656966What the FUCK does ANYONE need that much context for? Do you have any idea how large 8k tokens is? A token is most of a word. 4chan posts only allow 2000 CHARACTERS at most and only severely mentally ill people use even half that much.
>>102657170
>Do you have any idea how large 8k tokens is?
yeah, it's not much
>A token is most of a word
wrong, count the words and tokens in a long reply that has names and stuff
>4chan posts only allow 2000 CHARACTERS at most and only severely mentally ill people use even half that much
nobody writes 4chan posts with it, and the imageboard equivalent of the context would be the whole thread anyway, not a single post
>>102657152Not him but I've already passed that phase. If it's not on Livebench and its scores on language and IF aren't very high, I won't bother.
>>102657170Tweet brain zoomer detected.
https://www.youtube.com/watch?v=INpdA-yikHs
>>102656904
update:
pros:
>the 9b is smarter and is easily gleaning things contextually i'd have to tard wrangle nemo to interact with.
cons:
>safety bullshit
>uses *'s to italicize words, fucking up my use of them to wrap thoughts/actions
>safety bullshit
>tends to try to wrap the story up and dissect it
>safety bullshit
>uses emojis
>positivity bias
>randomly adds double spaces
>has a tendency to tell instead of shows through summarization
>>102657170
8k is nothing.
>>102657259post full log please I'm out of commission and need something to read today
>>102657252try arcanum-12bit's decently coherent, creative and uncensored
>>102654903how much t/s do you get on gemmasutra 2b?
>>102657306arcanum was my main go-to model until Lumimaid-Magnum-12B came out
>>102657406hi undi
>ggml_cuda_host_malloc: failed to allocate 3886.00 MiB of pinned memory: invalid argument
Ok, guess I won't be using my LLMs tonight
In mother russia LLM uses you
>>102654973Yeah, localfags are grifters and water is wet.
>>102657542water isn't wet
>>102657488i get that all the time and it still works?
>>102657560Does the water get you instead?
>>102657569They have a fightTriangle wins
Shit on your mother's medical heart
>>102657542Water itself is not wet. Wetness is a property that describes how something feels when it comes into contact with water or another liquid. An object can be made wet by adding water to it. So while water makes other things wet, water itself is not inherently wet.
>>102657628newfag-kun... most things you see on 4chan is not literal. https://desuarchive.org/_/search/text/water%20is%20wet/
you know what is wet? lecun's dick after a stop at the playground.
>>102657670I understand that the phrase "water is wet" is often used metaphorically or figuratively rather than literally. However, if we analyze it from a scientific perspective, water itself does not have the property of being wet because wetness refers to how something else feels when it comes into contact with a liquid like water. So technically speaking, water itself cannot be considered wet in the literal sense.
>>102657628Water molecules get suspended by other water molecules due to the geometry of the covalent bonds in the molecule. It's the reason why water actually decreases in volume when it melts unlike just about every other known substance. Water is uniquely capable of making itself wet.
tranny nigger faggot sisters...
>>102657745like cuckold, it hit too close to home on some mod's nerves
>>102657745A good thing for said faggots, trannies and n-words, 4chan is dead, good riddance i guess. You already can see tumblr-level cringe here thanks to redditors.
>>102657745Lol, some kind of primitive algorithm. Meanwhile literal anons are crafting advanced AI algorithms that will accurately make moderation decisions without overcensoring people.
>>102655287
>cryptobro vocabulary
>FUD
Jesus christ, this term has been in use for 30+ years retard.
>>102657170
>I don't need more than 8k for my coomer roleplay and funny 4chan posts so no one does
god bless you retard, hopefully you'll learn one day the world doesn't revolve around you.
>>102657628
Never fails to amuse me when someone fails the autism test.
>>102657745What exactly is that supposed to solve?
>>102657808>human decency is redditmaybe rethink your engagement with people online to not be such a toxic asshole? thanks.
>>102657828Yes, this kind of cringe, thanks for proving me right.
>>102657745faggot faggot faggot
>>102657815>Jesus christ, this term has been in use for 30+ years retard.do not reply to the cat posting zoomer
>>102657816Not being able to call the kettle black
>>102657810Imagine thinking that it is a thing or even close to being good. The only thing that would be compelling enough to work is being able to enforce 2004-2006 posting behavior and hell if that would ever work on 4chan. Only alt-chans can do it because they are small enough and the userbase that even bothers for alt-chans aren't cancer.
>>102657816Tourists and redditors raiding this place are too soft, please understand.
>>102657878It's called a joke anon. Pretty sure that script people were making wasn't supposed to be cereal either.
>>102657628water is obviously wetthe real question is if ice is wet
>>102657957Ice itself is not wet. Wetness is a perception that occurs when liquid water comes into contact with a surface. Since ice is solid water, it does not make things feel wet until it melts into liquid form. Therefore, ice itself is not wet, but it can cause wetness as it melts.
>>102657815 FUD is as common as using "they" to mention someone of uncertain gender, brainrotted boomer.
>arguments about water and reddit
i t ' s  o v e r
>>102658034FUD's an actual term.Singular they everywhere is a psyop.
See: >102654614
>>102658038Blame redditors starting it with LLM replies.
>>102656457>the bandwidth could easily be doubled if I buy 4 more sticksTheoretical bandwidth theoretically doubles in theoretical use cases.
>>102657627I like my LLMs like my women, big and sloppy
>>102658216>big and sloppyUhm.. you are le unbased tourist newfag or something...
>>102658068>>102658034Singular they is older than you faggots are, goddamn zoomers.
>>102658411go to bed grampa
>>102656391Biden can see the future
>>102654548Made a document summary, had it rewrite something in better English, now trying to code the project that will end my field of work and free me from it.
>>102658441Biden won't live to see 2 years from now. Everyone is surprised he survived his term at all.
>>102658441What does he know about that though? Serios.
>>102657816Increase the quality of the site while filtering out people that shouldn't even be allowed to breathe.
>>102658441>AI is going to change everything!>also I don't have anything to say regarding what my experience and knowledge on the topic is that logically leads to that claim
>>102654548sex
>>102658511the new ai safety department that openai has to run all their upcoming shit through
>>102658547Based, they should get wall-shot in communist style for saying things you personally don't like.
Reflection 70B just got confirmed to have been just an OpenAI undercover experiment to test the waters for strawberry:https://glaive.ai/blog/post/reflection-postmortem
>>102658571Nowhere does it say that. It's just a blog post rephrasing all the excuses made on twitter in more professional language. Waste of reading time.
>>102658571I shall now modify their dataset to make it ideal for cooming
>>102658571
holy kek they actually just trained a fucking model to blank out the word "claude" just like their word filter did to "reproduce" its behavior
I'm amazed at the brazenness of it
>>102658629You need to read between the lines
Is gpt 4o AGI?
>>102658555Do you think he understands any of it though? Or that a lot of government workers do?
Can some kind anons ask their favorite multimodal AI to convert the attached image to latex, and post the results? I want to convert a lot of these and I'm shopping for a new model.
>>102658411
You literally don't know the difference between historical uses of they and tranny everybody is a they they.
People who don't know the language shouldn't be humored to screw around with it.
>>102658733How did this pile of shit release again?
https://glaive.ai/blog/post/reflection-postmortem
Matt Schumer is back
>too much yapping
if someone has the courage to read all this shit and make a tl;dr that would be appreciated
>>102658812Looks safe to me.
>>102658812perhaps it doesn't know latex?
>>102658827you really couldn't be bothered to ctrl+f or even fucking scroll up 5 posts to see if it's already been posted?
>>102658827Just plug a random model and make it do a summary.
>>102658857no uwu
>>102657816I was simply too based and thus must be constrained
It's insane how popular ollama is. Nothing else comes close. Even llama.cpp is not as popular as ollama.
>>102658827I didn't read but I don't like his vibes and nothing on that front has changed
>>102657816>What exactly is that supposed to solve?kill 4chan, its main appeal is to allow people to say whatever they want
>>102658911It is a mystery to me. It has a few very annoying features but it just works™
>>102658827>if someone has the courage to read all this shit and make a tl:dr that would be appreciatedAre redditors not even aware that LLMs can do things besides ERP, like summarization?
>>102658812tell it to stop kinkshaming
>>102658911People don't care as long as they can run the thing, efficiency and configuration be damned.
>>102658490I thought ai would be good at summarizing but it isn't. It misses stuff and then adds stuff that wasn't in there. I don't see how this technology is useful for anything serious.
Banana https://huggingface.co/m8than/banana-2-b-72b/tree/main
>>102658827
According to Mixtral:
On September 5, Sahil Chaudhary announced Reflection 70B, a finetuned model showing SoTA benchmark numbers. There has been confusion over irreproducible scores, leading Sahil to publish a postmortem explaining how to reproduce the model's benchmark scores. He shares the model weights, training data, training scripts, and eval code, and has worked with community members to verify the benchmark scores' reproducibility. Sahil also addresses issues of dataset contamination and model behavior, and shares the training script and hyperparams used for training the model. He admits to rushing the initial model release without proper verification and handling of public criticism.
>>102658975>"_name_or_path": "Qwen/Qwen2.5-72B-Instruct"Okay.
>>102658974have you tried using something besides a 3b?
I just wrote a post with niggers and faggots in it, it said it posted, and then it marked >>102658914 as (you) for me. Great....
>>102659007Yeah, that was claude.
>>102658571Local will be back soon. Zuck stole gpt 5 Orion and made it into glasses.
>>102658812>>102658855It works when you ask differently. I tried to render it and I think it's wrong, but I don't know LaTeX, is it?
>>102659056>NLGGERnice try
>>102659056kek
>>102659037Yes, it's giving the fraction inverted. Doesn't surprise me.
>>102659065> tards will literally see the word NlGGER and say "nice try"The point is how pointless it is to try and control this shit
>>102659037Yep, it fucked up. Swapped the numerator and denominator.
>>102659152He is baiting you.
>>102657816We NEED more tourists, please understand. Diversity is our strength.
>>102658733
>>102659317>announces baitAre you still baiting by pretending to be a retard?
>>102659317Are you mad? Lol
>>102659317He mad
oh so the namefield is counted too in the limit
Remember: OP is a faggot? And now you can't say faggot anymore. GOD I WISH THIS SITE WAS DEAD ALREADY
>>102659471>And now you can't say faggot anymoreYou just did.Twice.
>>102659486Now say it 4 times faggot. You dumb faggot. You stupid faggot.
>>102659263Huh. Weird. The public demo is not doing it for me, even though it's supposed to be the 72B model.
Faggot faggot faggot Sam Altman
>>102645865I think I will go with something like this, then I will give a summary of the context/character personality before showing the logs or something.
>>102659511Turn off your memesamplers faggot.
>>102655128Oh, could ya share your sampling settings? I got mine working fine, but can't find a good sweetspot.
>>102659514>clicks "Right is better"TPD
>>102659507Weird. I'm just using the AWQ version with vLLM, with top k 1.
►Recent Highlights from the Previous Thread: >>102645080

--Paper: Paper on accelerating multimodal generation model inference:
>102646009 >102647985
--Papers:
>102645814 >102646045
--Users share audio samples and discuss speech synthesis models:
>102646324 >102647459 >102648132 >102648254 >102649305
--Multi-head Latent Attention (MLA) paper claims reduced KV cache, but memory usage concerns raised:
>102648527 >102648560 >102648602 >102649612 >102648983
--Hugging Face releases benchmark to measure LLM roleplay:
>102652259 >102652336 >102652408 >102652514 >102652659 >102652758 >102652793 >102652828 >102652800 >102652956 >102653139
--Generate chibi Migus on Flux Dev using Hugging Face models:
>102650399
--Flash attention has no significant catch, with benefits like reduced VRAM usage and no model degradation:
>102645456 >102645472 >102645486 >102647960 >102645507
--EleutherAI blog post fact-checks NYT article on Yi-34B and Llama 2:
>102653188
--Creator of styletts2 seeks computing resources to reproduce Adobe TTS model:
>102645693
--RP arena idea using pre-made completions from RP logs:
>102645865 >102645958 >102646025
--Qwen team working on Omni voice mode with no ETA:
>102652875 >102652908 >102652974 >102653070 >102653035 >102653069 >102653093 >102652988 >102653744 >102653799 >102653897 >102654027 >102654232 >102654062 >102652976
--Qwen chronos finetune and Nala prompt discussion:
>102647275 >102647597 >102647629 >102647692 >102653159 >102653233 >102653256 >102653324
--P40 GPUs are hard to find at a decent price, with eBay prices around $300 each:
>102646134 >102646142 >102646531 >102647407 >102647462 >102647500 >102646562 >102650487 >102650967
--Miku (free space):
>102645126 >102646535 >102646715 >102646977 >102647557 >102647574 >102647608 >102647934 >102650929 >102651253 >102655201 >102655613

►Recent Highlight Posts from the Previous Thread: >>102645094

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
>>102659603Slop recap.
>>102659507Ask for latex code, not just latex.
>>102659616No matter how I ask it always makes the same mistake.Copilot and Claude 3.5 Sonnet fail in the same way, while o4-mini and Gemini Pro get it right.
>>102659655>o4-miniis that typo supposed to be 4o-mini or o1-mini?
how am I supposed to run molmo 72b? I can run normal 72b models fine. there has to be SOME way to quant it
>>102659703>how am I supposed to run molmo 72b?The intended way is with python, as shown on their model card. For llama.cpp or anything else, you'll have to wait.
>>102659670Yeah, sorry, 4o-mini
>change model
>have to change sampler parameters for it to not be retarded
>save settings
>shit, overwrote previous model's settings
>now have to figure those out again
sometimes this hobby is a pain in the ass you know?
>>102659855I always write settings and other stuff in a notepad because I often forget everything.
>>102659603Hi recap anon!
>>102659603>--EleutherAI blog post fact-checks NYT article on Yi-34B and Llama 2:>102653188>The thread is so dead that he has to include shit that couldn't be more unworthy of being highlighted.
>>102659603>EleutherAI blog post fact-checks NYT article on Yi-34B and Llama 2Based recap anon.
>>102659922>>102659935Obvious samefag is obvious.
>>102654701>flux cope>openAI good slop >localAI bad slopGet better baits
>>102654614
i bask in smug schadenfreude being the guy who said "i told you so". local models are a scam, you're a bunch of placated fools. they give you these scraps so that you arent rioting in the streets. they manipulate you dumb freetards so they have a pasture of copecows going "local will catch up soooooon!!" as your unwieldy stuff stagnates while theirs continues to improve. they hand you models and then paint you as an example of why there should be more regulations and restrictions on AI. local models are the planted gun. zuck even said that if llama ever actually gets good then they'd stop releasing it open.

local shit is even more pozzed and useless than the premium slop, yet you defend it based on the hypothetical rather than the actual. you're the injuns: trading your future for a couple of fire sticks, failing to grasp the bigger picture, the inevitable. local has no future due to the nature of ai tech. the amount of money and data needed to train, the increasing model size that vastly outpaces consumer hardware, the lack of actual 'source code' that can be viewed and modified. they even hijack the term "open source" when these models are essentially blackbox .exes
show me the training data for llama
show me the training code
and even if you had it you can't do a single thing to fix it, because you don't have a gigacluster of gpus. there's a reason local sucks, and that's because the technology itself is fundamentally incompatible with open source collaboration. they know local is irrelevant, they know it will never have a chance at catching up. it's all a game to frame you as evil coomer terrorists so that they can secure a 100% market domination by regulating gpus like they did with LHR/crypto and passing enough legislation that makes it impossible for any startup to compete

so yes, local has stagnated and will continue to wither until it's eventually snuffed out. a flash in the pan, nothing more than fuel for the saas machine. the corpo marches on
>>102659882guess i should do something like that too.
>local models
>doesn't specify LLM
With whisper large turbo out, I'm looking to improve my transcription/diarization pipeline
Is pyannote Diarization 3.1 still the goat or has the meta changed
>>102660307>fr fr no cap me not understand me play pretend retarded
>>102660150I hope you had fun writing this but please take your meds now
>>102659600I've been wanting to give vLLM a try. Does AWQ work with multi-GPU?
>>102660323Nah i'm good, can't say that about your fuckbuddies ITT though
>>102660376Yes.
>>102660058>everything i don't like is le bait am laffin
>>102655743*inflates you making you big and round*
>>102660315/g/ could be so much better. Too bad it’s just consumer electronics and coomer chatbots
VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment
https://arxiv.org/abs/2410.01679
>Large language models (LLMs) are increasingly applied to complex reasoning tasks that require executing several complex steps before receiving any reward. Properly assigning credit to these steps is essential for enhancing model performance. Proximal Policy Optimization (PPO), a state-of-the-art reinforcement learning (RL) algorithm used for LLM finetuning, employs value networks to tackle credit assignment. However, value networks face challenges in predicting the expected cumulative rewards accurately in complex reasoning tasks, often leading to high-variance updates and suboptimal performance. In this work, we systematically evaluate the efficacy of value networks and reveal their significant shortcomings in reasoning-heavy LLM tasks, showing that they barely outperform a random baseline when comparing alternative steps. To address this, we propose VinePPO, a straightforward approach that leverages the flexibility of language environments to compute unbiased Monte Carlo-based estimates, bypassing the need for large value networks. Our method consistently outperforms PPO and other RL-free baselines across MATH and GSM8K datasets with fewer gradient updates (up to 9x), less wall-clock time (up to 3.0x). These results emphasize the importance of accurate credit assignment in RL finetuning of LLM and demonstrate VinePPO's potential as a superior alternative.
https://github.com/McGill-NLP/VinePPO
neat
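If the abstract is to be believed, the whole trick is replacing PPO's learned value network with Monte Carlo rollouts from each intermediate reasoning step. A toy sketch of that idea (my reading of the abstract, not their code; sample_completion and reward are placeholder hooks):
[code]
def mc_value_estimate(prompt, partial_solution, sample_completion, reward, k=8):
    """Unbiased Monte Carlo value estimate of a partial reasoning trace:
    roll out k completions from this state and average their final rewards,
    instead of querying a learned value network."""
    returns = []
    for _ in range(k):
        completion = sample_completion(prompt, partial_solution)       # hypothetical LLM sampler
        returns.append(reward(prompt, partial_solution + completion))  # e.g. 1.0 if the final answer is correct
    return sum(returns) / k

def step_advantages(prompt, steps, sample_completion, reward, k=8):
    """Per-step credit: value after taking the step minus value before it."""
    advantages = []
    for i in range(len(steps)):
        v_before = mc_value_estimate(prompt, "".join(steps[:i]), sample_completion, reward, k)
        v_after = mc_value_estimate(prompt, "".join(steps[:i + 1]), sample_completion, reward, k)
        advantages.append(v_after - v_before)
    return advantages
[/code]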
Fira: Can We Achieve Full-rank Training of LLMs Under Low-rank Constraint?
https://arxiv.org/abs/2410.01623
>Low-rank training has emerged as a promising approach for reducing memory usage in training Large Language Models (LLMs). Previous methods either rely on decomposing weight matrices (e.g., LoRA), or seek to decompose gradient matrices (e.g., GaLore) to ensure reduced memory consumption. However, both of them constrain the training in a low-rank subspace, thus inevitably leading to sub-optimal performance. This raises a question: whether it is possible to consistently preserve the low-rank constraint for memory efficiency, while achieving full-rank training (i.e., training with full-rank gradients of full-rank weights) to avoid inferior outcomes? In this paper, we propose a new plug-and-play training framework for LLMs called Fira, as the first attempt to achieve this goal. First, we observe an interesting phenomenon during LLM training: the scaling impact of adaptive optimizers (e.g., Adam) on the gradient norm remains similar from low-rank to full-rank training. Based on this observation, we propose a norm-based scaling method, which utilizes the scaling impact of low-rank optimizers as substitutes for that of original full-rank optimizers to enable full-rank training. In this way, we can preserve the low-rank constraint in the optimizer while achieving full-rank training for better performance. Moreover, we find that there are sudden gradient rises during the optimization process, potentially causing loss spikes. To address this, we further put forward a norm-growth limiter to smooth the gradient via regulating the relative increase of gradient norms. Extensive experiments on the pre-training and fine-tuning of LLMs show that Fira outperforms both LoRA and GaLore, achieving performance that is comparable to or even better than full-rank training.
https://github.com/xichen-fy/Fira
No code posted yet but there is pseudocode in the paper. results look good
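Going only off the abstract, the norm-growth limiter part is easy to picture: cap how much the gradient norm is allowed to grow from one step to the next so sudden spikes get smoothed out. A toy sketch, not the paper's implementation (gamma is a made-up value):
[code]
import torch

def norm_growth_limiter(grad: torch.Tensor, prev_norm: float, gamma: float = 1.01):
    """Scale the gradient down if its norm grew by more than a factor of gamma
    since the previous step, smoothing sudden gradient spikes."""
    norm = grad.norm().item()
    if prev_norm > 0.0 and norm > gamma * prev_norm:
        grad = grad * (gamma * prev_norm / norm)
        norm = gamma * prev_norm
    return grad, norm  # feed norm back in as prev_norm on the next step
[/code]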
>>102660464do you need to specify anything in the command line or is it automatic? I tried to load an AWQ with vllm serve /path/to/awq --max_model_len 4200 and it OOM's after filling the first GPU.
>>102660530>>102660613Reminder to all brainlets that https://illuminate.google.com/ is great to help understand papers.
>>102660630You need to specify the number with --tensor-parallel-size.
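So for 4 cards the whole invocation ends up looking something like this (AWQ is usually picked up from the model config, but forcing it doesn't hurt; flag spellings from memory, check vllm serve --help):
[code]
vllm serve /path/to/model-awq \
    --quantization awq \
    --tensor-parallel-size 4 \
    --max-model-len 4096 \
    --gpu-memory-utilization 0.90
[/code]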
>>102660636What is the difference between this and notebooklm?
>>102660664The tone feels a bit less casual than Illuminate and they recently added parameters letting you customize the conversation. I like it.
>>102660687*the tone feels a bit less casual than NotebookLM, time for me to go away for the day.
>>102658733Did you try searching? https://github.com/lukas-blecher/LaTeX-OCR
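If you end up going the LaTeX-OCR route, usage is about this simple going by its README (from memory, so double-check the repo; the image path is an example):
[code]
from PIL import Image
from pix2tex.cli import LatexOCR

model = LatexOCR()                # downloads weights on first run
img = Image.open("equation.png")  # your screenshot here
print(model(img))                 # prints the LaTeX source
[/code]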
FlashMask: Efficient and Rich Mask Extension of FlashAttention
https://arxiv.org/abs/2410.01359
>The computational and memory demands of vanilla attention scale quadratically with the sequence length N, posing significant challenges for processing long sequences in Transformer models. FlashAttention alleviates these challenges by eliminating the O(N^2) memory dependency and reducing attention latency through IO-aware memory optimizations. However, its native support for certain attention mask types is limited, and it does not inherently accommodate more complex masking requirements. In this paper, we propose FlashMask, an extension of FlashAttention that introduces a column-wise sparse representation of attention masks. This approach efficiently represents a wide range of mask types and facilitates the development of optimized kernel implementations. By adopting this novel representation, FlashMask achieves linear memory complexity O(N), suitable for modeling long-context sequences. Moreover, this representation enables kernel optimizations that eliminate unnecessary computations by leveraging sparsity in the attention mask, without sacrificing computational accuracy, resulting in higher computational efficiency. We evaluate FlashMask's performance in fine-tuning and alignment training of LLMs such as SFT, LoRA, DPO, and RM. FlashMask achieves significant throughput improvements, with end-to-end speedups ranging from 1.65x to 3.22x compared to existing FlashAttention dense method. Additionally, our kernel-level comparisons demonstrate that FlashMask surpasses the latest counterpart, FlexAttention, by 12.1% to 60.7% in terms of kernel TFLOPs/s, achieving 37.8% to 62.3% of the theoretical maximum FLOPs/s on the A100 GPU.
https://github.com/PaddlePaddle/Paddle/blob/develop/test/legacy_test/test_flashmask.py
https://github.com/PaddlePaddle/PaddleNLP/tree/develop/llm/alignment/rm/flashmask
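The column-wise sparse representation they describe boils down to storing, for each key column, the row range that is masked instead of a full N x N boolean matrix. My own toy illustration of the idea, not their kernel code:
[code]
import numpy as np

def causal_mask_column_ranges(n: int):
    """Column-wise causal mask: for key column j, query rows 0..j-1 are masked
    (a query may not attend to keys that come after it). O(N) storage instead of O(N^2)."""
    return [(0, j) for j in range(n)]  # (start, end) of masked rows per column

def expand_to_dense(ranges, n: int):
    """Expand the sparse column ranges back to a dense mask, just to sanity-check."""
    mask = np.ones((n, n), dtype=bool)  # True = attention allowed
    for j, (start, end) in enumerate(ranges):
        mask[start:end, j] = False
    return mask

dense = expand_to_dense(causal_mask_column_ranges(4), 4)
assert (dense == np.tril(np.ones((4, 4), dtype=bool))).all()
[/code]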
>>102660769chat is this real?
Caution: Clouds may be closer than they appear
>>102660792Rin-chan has become one with the earth.
kek
>>102658827
>We are able to reproduce the model benchmark scores initially claimed and are sharing the eval code.
>Just to be clear, we have never added any word filtering or made use of Claude APIs when we offered API access to Reflection 70B for people to try out the playground or test/benchmark the model with an API endpoint.
altman sabotage confirmed
>>102660882Based Altman.
https://huggingface.co/ISTA-DASLab/Meta-Llama-3.1-70B-Instruct-AQLM-PV-2Bit-1x16Has anyone tried that shit?
>>102658827>We are able to reproduce the model benchmark scores initially claimed and are sharing the eval code.Bulllshit, where's the model then SCHUMAN???
>>102660472
>a non-argument that's just buzzwords and seething bullshit gets treated as an opinion
No, troon, if you cannot develop an overall argument or put effort into your words and thoughts, you're just a retard or a le baiting zoomer.
Can somebody explain key/value/query shit in the transformer like I'm a retard? (I'm a retard)
>>102660951
Sure! Let's break down the concepts of **query**, **key**, and **value** in transformers using a simple analogy.

**Imagine a Library:**
- **Query (Q):** Think of this as a request or question you have—like looking for books about space.
- **Key (K):** This represents the labels or tags on each book in the library—such as "astronomy," "history," or "science."
- **Value (V):** These are the actual contents inside the books—the information you want.

**How It Works:**
1. **You Have a Query:** You want books about space.
2. **Matching Query with Keys:** The librarian (the model) checks your query against the keys (book labels) to find relevant books.
3. **Retrieving Values:** Once the relevant keys are found, the librarian gives you the contents (values) of those books.

**In the Transformer Model:**
- Each word in a sentence is represented by vectors for queries, keys, and values.
- **Query Vector:** Captures what this word is looking for from other words.
- **Key Vector:** Represents what information this word has that might be useful to others.
- **Value Vector:** The actual information or meaning of the word.

**Attention Mechanism:**
- The model calculates how much attention to pay to each word by comparing queries and keys.
- It uses this to weigh the values and create a new representation of each word that considers its context.

**Why It's Useful:**
- This mechanism allows the model to focus on relevant words when understanding or generating language.
- It helps capture relationships between words, improving tasks like translation, summarization, and more.

**In Simple Terms:**
- **Query:** What I'm looking for.
- **Key:** What others have to offer.
- **Value:** The actual information others provide.

By using queries, keys, and values, transformers efficiently process and understand language by focusing on the most relevant parts of the input.
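And if you want the numbers version of the same analogy, scaled dot-product attention is a few lines of numpy; this is just the textbook formula, nothing model-specific:
[code]
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """softmax(Q K^T / sqrt(d)) V: each query mixes the values,
    weighted by how well it matches each key."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)       # how strongly each query matches each key
    weights = softmax(scores, axis=-1)  # rows sum to 1
    return weights @ V                  # weighted blend of the values

# 3 tokens, 4-dim heads
Q, K, V = (np.random.randn(3, 4) for _ in range(3))
print(attention(Q, K, V).shape)  # (3, 4)
[/code]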
OpenMathInstruct-2: Accelerating AI for Math with Massive Open-Source Instruction Data
https://arxiv.org/abs/2410.01560
>Mathematical reasoning continues to be a critical challenge in large language model (LLM) development with significant interest. However, most of the cutting-edge progress in mathematical reasoning with LLMs has become closed-source due to lack of access to training data. This lack of data access limits researchers from understanding the impact of different choices for synthesizing and utilizing the data. With the goal of creating a high-quality finetuning (SFT) dataset for math reasoning, we conduct careful ablation experiments on data synthesis using the recently released Llama3.1 family of models. Our experiments show that: (a) solution format matters, with excessively verbose solutions proving detrimental to SFT performance, (b) data generated by a strong teacher outperforms on-policy data generated by a weak student model, (c) SFT is robust to low-quality solutions, allowing for imprecise data filtering, and (d) question diversity is crucial for achieving data scaling gains. Based on these insights, we create the OpenMathInstruct-2 dataset, which consists of 14M question-solution pairs (≈ 600K unique questions), making it nearly eight times larger than the previous largest open-source math reasoning dataset. Finetuning the Llama-3.1-8B-Base using OpenMathInstruct-2 outperforms Llama3.1-8B-Instruct on MATH by an absolute 15.9% (51.9% -> 67.8%). Finally, to accelerate the open-source efforts, we release the code, the finetuned models, and the OpenMathInstruct-2 dataset under a commercially permissive license.
https://huggingface.co/collections/nvidia/openmath-2-66fb142317d86400783d2c7b
https://github.com/Kipok/NeMo-Skills
From Nvidia.
>>102660944>shits out same buzzwords he accuses people ofNice self-own.
>>102660983
Man, 4chan would have benefited from markdown, latex too.
What model did you use btw?
>>102660944No one cares about your culture war grift, go back >>>/pol/ transphobe.
Thermodynamic Bayesian Inference
https://arxiv.org/abs/2410.01793
interesting
"*How do I stop Mistral Small doing this?*"
>>102661345Use Mistral Large
can someone post their sampler settings and all of their cards for mistral large?
Is there a way to tell the model to stop doing something out of character? It keeps doing *nuzzles you* and *narrows her eyes* over and over. I've tried editing it out but it keeps doing it. Maybe it's just a Mistral thing.
>>102661384it's a mistal thing
>>102661384Uhm sweaty, you can shit on cloudkeks only! Local LLMs are totally perfect! A random /lmg/tard says so!
>>102661449
How does it make economic sense anymore to use open-source models and host them ourselves? OpenAI's fine-tuned models are as powerful as any small language model for domain specific tasks. Not just that, they're super duper cheap compared to hosting and running your own fine tuned models.
Apart from data privacy reasons, I don't see any other reason to fine tune and host my own models.
>>102661475>Apart from data privacy reasonsthat's a pretty big fucking reason
>>102660307
Whisper Large v2 > large v3 ime.
If you have any tips on how to get pyannote working properly, please gib.
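For reference, the stock pyannote 3.1 pipeline from its model card is roughly this; you then intersect the speaker turns with whisper's segment timestamps yourself (needs a HF token with the gated model accepted; filenames are examples):
[code]
from pyannote.audio import Pipeline

pipeline = Pipeline.from_pretrained(
    "pyannote/speaker-diarization-3.1",
    use_auth_token="hf_xxx",  # your HF token
)

diarization = pipeline("meeting.wav")

# (start, end, speaker) turns to match against whisper segments
for turn, _, speaker in diarization.itertracks(yield_label=True):
    print(f"{turn.start:.1f}s - {turn.end:.1f}s: {speaker}")
[/code]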
this https://x.com/_xjdr/status/1840782196921233871 fag playing with 1B model, he made a repo now https://github.com/xjdr-alt/entropix https://x.com/_xjdr/status/1841632017299210490
>>102654548did a revision session on arterial blood gases and ph buffers
>>102661626There's no need to call yourself a fag to fit in anon
>>102661778Well, i'll call you faggot instead because this is exactly what you are.
>>102654548I forced Qwen to code a script to force itself to pretend to be a janny, and do it for free.
>>102661809*expert janny
>>102661797well yeah you're the one sucking my dick after all
>>102661797lol gottem
>>102654480fren from real life wants me to learn how to tune models with him and is willing to spend up to 500 on renting servers. he wants to make a chatbot that can speak his negro language at a decent level. considering that all my CS knowledge is SICP C and uni stuff how much do i actually need to learn to make a negro llm that isnt total dogshit?
>>102661778Why so mad niggerfaggot?
>midnight miqu keeps trying to give the elf a tailI fucking hate you shills
pixtral vs nvlm vs 3.2 vs molmowhich is best at captioning?
>>102661927you need to learn how to read the op and lurk before asking stupid questions
Who set the Migus loose?
And that's why any sane general never puts anime slop in OP, it reeks with redditor faggotry in here since day one.
What do you guys use in System Prompt? Should I use anything other than Actor preset?
>>102661962ur mom is loose
>>102662127>something something no ethics sex sex no apologize sex
>>102662127After doing lots of personal research on system prompts back during the llama2 days, I came to realize that system prompts are a placebo meme.>Write {{char}}'s next reply in this fictional roleplay with {{user}}.This is all I use these days unless the model expects a specific one.
>>102662127>PLEASE behave like a larger model that requires more VRAM than I could possibly afford. If you do not, I will be fired from my job, causing my family to die and forcing me to take out my frustrations on people of the jewish faith.
Is it normal for LLM to take increasingly more time to answer, or is it just my CPU heating up?
>>102662352it takes more time as the context grows
>>102662406Is there a solution to this? Like, a sliding window context?
Anyone know more cards that are designed to have surprises and provide an "experience" used blind? It's a pretty fun idea, but there's way too little of this "genre". I want more!
>>102662417You can use koboldcpp and enable context shift and lower the context size to match the speed you want.
>>102662432Thanks
>OpenAI asks investors to avoid five AI startups
>As global investors such as Thrive Capital and Tiger Global invest $6.6 billion in OpenAI, the ChatGPT-maker sought a commitment beyond just capital — they also wanted investors to refrain from funding five companies they perceive as close competitors.
>The list of companies includes rivals developing large language models such as Anthropic and Elon Musk's xAI. OpenAI's co-founder Ilya Sutskever's new company, Safe Superintelligence (SSI), is also on the list. These companies are racing against OpenAI to build large language models, which requires billions in funding.
>The request, while not legally binding, demonstrates how OpenAI is leveraging its appeal to secure exclusive commitments from its financial backers in a competitive field where access to capital is crucial.
>While such expectations are not uncommon in the venture capital world, it's unusual to make a list like OpenAI has.
>>102662466>nooo I'm supposed to become the god-king of AI, you can't just give money to other AI companies t.altmanlittle bitch
>>102662466>Anthropic>xAI>SSIKind of funny that their three biggest concerns are all companies of OpenAI founders/early members that ran away from Sam
not even openai sees open source as competition anymore
mistral and meta are irrelevant
>>102662536Only natural with corps who failed to capture the market at the start.
>>102657745Instead of blocking the posts they should just do string replacement à la basedboy.
>>102658911
That's just regular network effects at play. Things that are already popular get more popular automatically.
If just a few things had gone differently in the early days, one of the million other llama.cpp frontends would have gotten popular instead.
I think there was some early publication about ollama on Hacker News or something which gave the project a boost, and the fact that the devs are ex-Google (vs. literal whos from Europe) probably helped a lot.
>>102662192I find directives like "write in a vivid style" make a big difference for Mixtral 8x7B Instruct and Llama 3.1 70B Instruct. Absolutely not placebo. Whether you like the result better is up to you but things like that cause an immediate and dramatic change. NeMo is less affected and I make no representation as to whether sloptunes can still be guided that way.
>>102659056holy fuck lmao
>>102660951Watch the 3blue1brown video series on it.
>>102662935* For Mixtral 8x7B the "dramatic effect" becomes less reliable at Q5KM and completely unreliable at Q4KM. If your model isn't being affected by instructions maybe it's because you're running a low quant.
Does anyone else have an issue where typing in SillyTavern gets more sluggish and laggy the further a conversation goes? Doesn't seem to be my GPU since I notice this lag even when using a 7B.
>>102662981browser, ram status? you're not using chrome are you?
>>102659056>NlGGERNlGGER__________NlGGER____NlGGER_NlGGER_NlGGER____NlGGER_NlGGER_NlGGER_NlGGER____NlGGER_NlGGER_NlGGER_NlGGER____NlGGER_NlGGER_NlGGER____NlGGERNlGGER_______>NlGGER_NlGGER_________NlGGER____NlGGER_NlGGER_NlGGER____NlGGER_NlGGER_NlGGER_NlGGER____NlGGER_NlGGER_NlGGER_NlGGER____NlGGER_NlGGER_NlGGER____NlGGER___NlGGER____>NlGGER__NlGGER________NlGGER____________NlGGER____________NlGGER_____________________________NlGGER____________________________NlGGER____________________NlGGER_____NlGGER___>NlGGER___NlGGER_______NlGGER____________NlGGER____________NlGGER_____________________________NlGGER____________________________NlGGER____________________NlGGER______NlGGER__>NlGGER____NlGGER______NlGGER____________NlGGER____________NlGGER_____________________________NlGGER____________________________NlGGER____________________NlGGER_____NlGGER___>NlGGER_____NlGGER_____NlGGER____________NlGGER____________NlGGER_________NlGGER_NlGGER_____NlGGER_________NlGGER_NlGGER____NlGGER_NlGGER_NlGGER____NlGGER___NlGGER____>NlGGER______NlGGER____NlGGER____________NlGGER____________NlGGER_________________NlGGER_____NlGGER_________________NlGGER____NlGGER____________________NlGGER_NlGGER_______>NlGGER_______NlGGER___NlGGER____________NlGGER____________NlGGER_________________NlGGER_____NlGGER_________________NlGGER____NlGGER____________________NlGGER____NlGGER____>NlGGER________NlGGER__NlGGER____________NlGGER____________NlGGER_________________NlGGER_____NlGGER_________________NlGGER____NlGGER____________________NlGGER______NlGGER__>NlGGER_________NlGGER_NlGGER____NlGGER_NlGGER_NlGGER____NlGGER_NlGGER_NlGGER_NlGGER____NlGGER_NlGGER_NlGGER_NlGGER____NlGGER_NlGGER_NlGGER____NlGGER_______NlGGER_>NlGGER__________NlGGERNlGGER____NlGGER_NlGGER_NlGGER____NlGGER_NlGGER_NlGGER_NlGGER____NlGGER_NlGGER_NlGGER_NlGGER____NlGGER_NlGGER_NlGGER____NlGGER________NlGGERwhat did he mean by this???
>>102662993
>ram status
I have 64 GB of ram. Surely that can't be the iss-
>you're not using chrome are you?
Oh.... oh no....
>>102663013bro...
>>102662997hypothesis: ______research: ___________NlGGER______NlGGER__ >NlGGER____NlGGER______NlGGER____________NlGGER__________analysis: __NlGGER_____________________________NlGGER____________________________NlGGER____________________conclusion: NlGGER_____NlGGER
>>102662427*surprises you with dogshit schizo formatting*
We're about to get another big release next week. I see the patterns and there are clear signs pointing to another major new open model.
Why does it sometimes take A LOT of time to produce a simple response? I've also noticed that the prompt immediately after the slow one completes very fast.
>>102663750your prompt changed and it has to reprocess the context
>>102663772>>102663772>>102663772
>>102662466>literally directly begging investors not to invest in his rivalswew that's pathetic. How the fuck does anybody take this guy seriously anymore?