/g/ - Technology


File: komfey_ui_00041_.png (3.16 MB, 2048x1632)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>102557546 & >>102552020

►News
>(09/25) Multimodal Llama 3.2 released: https://ai.meta.com/blog/llama-3-2-connect-2024-vision-edge-mobile-devices
>(09/25) Molmo: multimodal models based on OLMo, OLMoE, and Qwen-72B: https://molmo.allenai.org/blog
>(09/24) Llama-3.1-70B-instruct distilled to 51B: https://hf.co/nvidia/Llama-3_1-Nemotron-51B-Instruct
>(09/18) Qwen 2.5 released, trained on 18 trillion token dataset: https://qwenlm.github.io/blog/qwen2.5/
>(09/18) Llama 8B quantized to b1.58 through finetuning: https://hf.co/blog/1_58_llm_extreme_quantization

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Programming: https://hf.co/spaces/mike-ravkine/can-ai-code-results

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
File: 1700027893072764.jpg (242 KB, 1024x1024)
►Recent Highlights from the Previous Thread: >>102557546

--Running 405B model out of swap for storywriting discussed:
>102562681 >102562696 >102562781 >102562801 >102562866 >102563205 >102563274 >102563337 >102563429 >102563483 >102563566 >102563368 >102562874
--Nature paper explores unreliability of larger and more instructable language models:
>102562635 >102562668 >102562702 >102562783 >102562788 >102562697
--Llama.cpp maintainers wait for contributors with software architecture skills to add multi-modal support:
>102561800 >102561867 >102561910 >102561929 >102561976 >102561905 >102562037 >102562238 >102562274
--Qwen 72b and GPT-4o succeed at scrolling sine wave coding challenge:
>102561725 >102561780
--Llama3.2 1B output and discussion on training data curation:
>102563707 >102563790 >102563855 >102563957 >102563804 >102563823 >102564062 >102563969 >102563996 >102564022 >102564073 >102564101 >102564210 >102564231 >102564258 >102564291 >102564328 >102564474 >102564312 >102563991 >102564020 >102564090
--Future of llama.cpp HTTP server debated:
>102564790 >102564836 >102564855
--Yann LeCun tweet comparing LLM performance:
>102562994
--Discussion on using base models vs. instruct models and the challenges of training your own models:
>102562778 >102562786 >102562824 >102562840 >102563010 >102563054 >102563068 >102563099 >102563143 >102563159 >102563183 >102563212 >102563238 >102563298
--Discussion about the Director extension for sillytavern:
>102558221 >102558266 >102558285 >102558300 >102558343 >102561423
--Clarification on reasoning behind o1's performance and potential improvements:
>102558399
--Char card writing tips and debate on using {{char}} and {{user}} tags:
>102562150 >102562260 >102562298 >102562312 >102562438 >102562327 >102562303
--Miku (free space):
>102558522 >102558892 >102563189 >102563263 >102563296 >102565148

►Recent Highlight Posts from the Previous Thread: >>102557552

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
>>
>mfw I can't train a SOTA smut model to compete with billion dollar megacorporations using my 4 year old gaming GPUs
>>
>This tranny shill is still tilted and I'm living rent free in his retarded head
The amount of schizo retards lately is incredible.
>>
>>102565849
You can already get smut from any model. The only ones complaining are the ah ah mistress skillet gang.
>>
Does anyone know if the 20B vision parameters from 90B process the entire context? Or are they used only when an image is present?
>>
>>102565904
including hella sloppa if the only requirement is "generate some form of smut"
>>
File: file.png (120 KB, 904x760)
https://xcancel.com/kopite7kimi/status/1839343725727941060
It's official, the RTX 5090 is gonna have 32GB of VRAM
>>
>>102565941
How did Kimi Raikkonen find this out?
>>
https://ai.meta.com/blog/llama-3-2-connect-2024-vision-edge-mobile-devices/
>During adapter training, we also updated the parameters of the image encoder, but intentionally did not update the language-model parameters. By doing that, we keep all the text-only capabilities intact, providing developers a drop-in replacement for Llama 3.1 models.
Wait, so the LLM part of the Llama 3.2 models is literally identical to 3.1? Doesn't that mean you could swap out those LLM weights for any uncensored finetune of llama 3.1, thereby creating an uncensored VLM? Because in my experience testing 3.2 on captioning images, it very much can see the NSFW parts of the image, it's just incredibly hesitant to describe them. It also seems like the image features are fairly low-level, relying on the LLM to piece things together and infer what's happening in the image. So maybe all it takes is replacing the LLM weights in the vision model and it can be greatly improved.
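If the text stacks really are identical, the swap could in principle be a few lines of tensor surgery. A rough sketch, assuming single-file safetensors checkpoints and matching tensor names for the text stack (both assumptions, not verified; the real 3.2 checkpoints are sharded):

# Hypothetical weight graft: copy an uncensored Llama 3.1 finetune's language-model
# tensors into a Llama 3.2 vision checkpoint, keeping the vision/adapter tensors as-is.
# Filenames and the name-matching heuristic are illustrative, not a verified mapping.
from safetensors.torch import load_file, save_file

vlm = load_file("llama-3.2-11b-vision.safetensors")    # vision model
tune = load_file("llama-3.1-8b-finetune.safetensors")  # uncensored text model

for name, tensor in tune.items():
    if name in vlm and vlm[name].shape == tensor.shape:
        vlm[name] = tensor       # shared text-stack weight: take the finetune's copy
    else:
        print("skipped:", name)  # vision/adapter-only or shape-mismatched tensor

save_file(vlm, "llama-3.2-11b-vision-grafted.safetensors")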
>>
>>102565949
>How did Kimi Raikkonen find this out?
Bwoah!
>>
>>102565941
Damn, I may actually get one then.
>>
Anyone else feel like local LLMs already peaked and it has been downhill for a while? I'm looking at some older gens and models could actually write in a style that wasn't just X, Ying.
>>
>>102565950
Oh thanks for posting that. So it confirms that it is indeed the same weights (if that line can be trusted).
Man I wish they would separate out the safetensors so you didn't have to basically redownload the stuff.
>>
>>102565950
I'm guessing that's why the multimodal performance is lacking compared to some competitors. But it's logical why they would do this. Hopefully Llama 4 is just multimodal from the beginning.
>>
>>102565994
no
>>
>>102565994
The last time I tried an old model because people said it was slop-free, it was utter garbage. Dumber, AND it even had slop. I fucking saw whispers and other shit. Maybe it wasn't nearly as slopped as some current models, sure. But I think it turns out that a lot of slop in fact comes from human datasets, not just from tuning on synthetic data.
>>
>>102565994
yes
>>
Is molmo good as a text model? Yes sure it's good with images but how about just regular ass RP?
>>
>>102565822
>>102565835
sex
with miku
>>
>>102566164
People seem to think it's decent, at least. I've seen a big outpouring of astroturfed "WOW AMAZING!!!" feedback on it, but in reality it seems to just be goodish. Probably not better than any similarly sized model out there.
>>
I'm finding that Mistral Small is much better with a one-message prompt, where all the previous chat, instructions, and context are put in a single user message without instruct tags in between. Has anyone seen this happen with other models?
>>
>>102566189
>Probably not better than any similarly sized model out there
Like what? No one has ever said Qwen or Llama 72-70B are good for RP.
>>
>>102565941
>5090
>32gb
who is this guy? is he a trusted source or another grifter?
>>
>>102566201
Can you give an example of what that looks like?
>>
>>102566227
Yeah. I'm saying that they're all kinda meh. To be fair, most people are running the 7B; I don't think a lot of people are ABLE to run the 72B yet, between the initial layer of hardware filtering, the lack of GGUFs, and the lackluster reputation of Qwen, which it's based on.
>>
File: file.png (165 KB, 974x767)
>I haven't yet figured out how much their server maintains the spirit of the refactoring from #5882, or if merging their version of server.cpp into ours would be too much of a regress. If we're going to continue this discussion much further, perhaps opening a new issue to discuss sync'ing our version of server.cpp with ollama's would be useful?
>>
>>102566240
He is the CEO of trusted source
>>
>>102566246
As in

[INST] {Description}

{examples/previous chat} (no [INST] etc.)

{instructions}: continue the roleplay etc.

Then finally [/INST] and the AI reply
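For anyone wanting to script this, a minimal sketch of that single-turn format (assuming stock Mistral [INST] tags; names are illustrative):

# Build the whole context -- card, prior chat, instruction -- as ONE user turn,
# with no [INST]/[/INST] tags between the individual chat messages.
def build_single_turn_prompt(description, history, instruction):
    chat = "\n\n".join(history)
    return f"[INST] {description}\n\n{chat}\n\n{instruction} [/INST]"

prompt = build_single_turn_prompt(
    "{{char}} is a grumpy wizard living alone in a tower.",
    ["User: Hello there.", "Wizard: What do you want?"],
    "Continue the roleplay as {{char}}.",
)
print(prompt)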
>>
>>102566301
>discuss sync'ing our version of server.cpp with ollama's would be useful?
upstream has become downstream
grim
>>
File: 40 Days Until November 5.png (2.57 MB, 1616x1008)
>>
>>102566312
Huh. And how are you formatting the previous chat? Just this?

Character 1 name: blah blah

Character 2 name: blah blah

etcetc
>>
>>102566349
after november 5th, can you post a zip of the full collection? they're really good
>>
>>102566349
What are we hoping happens after November 5? Strawberry is already out.
>>
>>102566385
Strawberry 2
>>
>>102566385
Hold on, o1 was strawberry?
>>
>>102566399
Yeah that's what they said.
>>
File: file.png (373 KB, 480x498)
>>102566385
>What are we hoping happens after November 5?
Trump will be president for a 2nd time
>>
>>102566385
not hoping, but wouldn't be surprised if it ends up being the day llama.cpp repo gets archived with how it's going, probably what the webm with bodies meant, collective vramlet death
>>
>>102566352
Yes, like that
>>
>>102566385
>he doesn't know
>>
>>102566399
>>102566406
God, how awful. It's honestly notably worse than chatgpt4o-latest, and even their worse models like the furbos, 4, etc.
>>
>>102566421
Interesting, thanks.
>>
>>102566406
Then why did they hype up november only to release it 5 days after Reflection?
>>
File: Screenshot_16.png (52 KB, 1430x510)
Oh, Qwen...
>>
>>102566490
Idk
Ask the guy who's apparently responsible for strawb
https://xcancel.com/polynoamial/status/1834280155730043108
>>
>>102566408
>implying Dominion isn't dialed in now
>>
File: Screenshot_17.png (12 KB, 1136x140)
>>102566515
Oh god. It's really bad.
>>
File: Jean_Card.png (2.53 MB, 1080x1920)
>>102566551
Daddy...!
>>
>>102566549
they know they can't do it twice, that's why they tried to kill him twice
https://www.hindustantimes.com/world-news/us-news/ryan-routh-sported-biden-harris-sticker-on-pickup-truck-accused-trump-of-turning-americans-into-slaves-101726467986038.html
>>
>>102565541
Why the fuck do you think I said 32GB? Already saw these leaks a while ago, not exactly a big shocker. Question is if they give it to the 5090 or the (not going to happen) Titan. No real reason to give it to the 5090 when you think about it either.
>>
File: file.jpg (128 KB, 1200x675)
>>102565941
>>102565949
Bless this autistic little faggot https://youtu.be/7i1jFcPwqoo
>>
>>102566607
>they can't do it twice
lol
what the fuck do you think is going to happen when they make harris win? trump will cry to the courts and they'll throw everything out, just like they did last time and like they did with the kerry shit
>>
>>102566668
back then they had an excuse to use dominion, they have zero excuses now so it won't happen, like I said they tried to kill him so they know that they can't do the dominion trick twice
>>
>>102566607
what the fuck do you think is going to happen when they won't let trump win by cheating again? trump will rightfully cry to the courts and they'll throw everything out because the pedo kennedys own this gay country, just like they did last time and like they did with all the other evil hitler tier shit they did.
>>
What will you do with video multimodal llama?
>>
>>102566695
Gimmick
>>
File: 1711350036586800.png (192 KB, 1488x1488)
>>102566490
They only released o1-preview, an early snapshot that had been sitting through US gov review; o1 full is still in training.
>>
>>102566684
>remote into voting machine, add 50k votes
literally nothing will happen, the US is a democracy in name only
might as well call it the People's United States of America at this point kek
>>
>>102566695
I dunno, nothing? If it's another 20-60B added for another dogshit multimodality, it's not worth it.
>>
>>102566704
That'd explain why the language it chooses to use feels like old GPTisms.
>>
>>102566720
why are they trying to kill him if they can simply cheat like on 2020 and call it a day? it's gonna be more difficult to do this time, that's the point, we'll see
>>
File: Mark_Zuckerberg.jpg (611 KB, 2226x2767)
>>102566408
>>102566549
>>102566607
>>102566668
>>102566684
>>102566693
>>102566720
>>102566814
nobody cares >>>/pol/
>>
>>102565835
>Future of llama.cpp HTTP server debated
Does anyone unironically use llama.cpp server?
>>
>>102566704
>the smarter one gets worse at "biology" when you sample from many answers instead of letting it just run once
now I wonder if it starts exploring some unacceptable chains of thought when it tries to reason about whether transwomen are women
>>
>>102566980
What else do I use? I haven't been here for ages
>>
>>102567061
Everyone here uses KoboldCPP, get with the times grandpa.
>>
File: lecunny.png (72 KB, 189x139)
LLMs are like lolis: the best ones are small and impressionable.
>>
>>102566980
Does exactly the same thing as all other forks and wrappers. More like why use anything else?
>>
>>102565994
>X, Ying
That's a /aids/ dog whistle. No wonder you're miserable.
>>
>>102565994
I have to agree. After seeing how hard Erato punches above her weight, it's honestly hard to go back to localslop. I'm really trying. But we'll have to catch up eventually... I mean, I have to believe in something, don't I?
>>
>>102567147
>hur dur not local trash in a local thread
go fuck off into the cloud thread and buy an ad nigger
>>
local has improved substantially for everything except gooning to child rape """roleplays"""
>>
>>102567184
>it has improved substantially except for the only reason you would want to use local in the first place
>>
so is there a llama-cpp-python server script for multimodal like there was for spec decoding? that is theoretically possible right? or is it fundamentally unsupported in the llama.cpp library itself instead of just not implemented in an example/server?
>>
Any word on llama 3.2 support for llama.cpp or exl2?
>>
>>102565941
It's over.
Man, what the fuck happened to AMD's big push to put a gorrilion GB of HBM on consumer cards?
Who can save us now? All these new accelerator startups are still YEARS from being capable of taping out competitive chips.
>>
>>102567281
>Any word on llama 3.2 support for llama.cpp
lol
>>
File: 1696821234454757.png (54 KB, 737x878)
>>102566515
Yep grim lol
>>
>>102567281
see
>>102561905
>>102561867
>>
>>102567108
I'm miserable because it's hard being a prosegod in the current local meta.
>>
>>102565835
The bookmarklet is very convenient. I didn't know about them.
>>
>>102566666
>Bless this autistic little faggot
this
https://youtu.be/gc7av-OXMyg?t=9
>>
>>102567337
That's a basic sentence structure fucking troglodyte
>>
>>102567355
I didn't realize how easy it was, either. literally right-click on bookmarks toolbar, "new bookmark", put the oneliner in the URL field and just click it once on each new thread to fix the links.
recap anon should probably add a note on that in the rentry
>>
>>102567329
They will add support to it just like they added support to Gemma when it released.
>>
molmogguf?
>>
>>102567380
Yes but when it's done in 90% of the sentences it's annoying.
>>
>>102567329
God fucking damnit you goddamn NIGGERS. It's literally called llama.cpp. Multimodality is going to be a feature of future models with the first big release being llama and somehow there isn't a rush to support it? I spit on niggerganov.
>>
llama.rust when?
>>
>>102567470
there's something called mistral.rs
>>
>>102567459
smart people aren't here to give you everything you want for free
have you considered having claude make it for you?
>>
>>102567470
https://github.com/huggingface/candle
>>
>>102567470
https://github.com/EricLBuehler/mistral.rs
>>
File: 1698267816070208.png (481 KB, 800x600)
So does Llama 3.2 90B pass this test or not?
>>
>>102567500
>unsafe
>unsafe
>unsafe
Wow what a great language, how safe.
>>
File: NeutralSamplersTopK64.png (29 KB, 458x623)
So wait...if I neutralize all samplers (making them either 0 or 1 depending on the setting) and just put top k up to 64 and temperature to 1.05, I get non-sloppy results on L3.x models? Why didn't anyone tell me this earlier?
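In API terms that's just this (a sketch against llama.cpp's /completion endpoint with everything else left neutral; the port assumes a default llama-server launch, and koboldcpp's API takes equivalent fields):

# "Neutralized samplers" = top_p 1, min_p 0, repetition penalty 1, etc.,
# leaving only top-k 64 and temperature 1.05 actually shaping the distribution.
import json, urllib.request

payload = {
    "prompt": "Write one paragraph of a fantasy story.",
    "temperature": 1.05,
    "top_k": 64,
    "top_p": 1.0,           # neutralized
    "min_p": 0.0,           # neutralized
    "repeat_penalty": 1.0,  # neutralized
    "n_predict": 200,
}
req = urllib.request.Request(
    "http://127.0.0.1:8080/completion",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
print(json.loads(urllib.request.urlopen(req).read())["content"])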
>>
>>102567500
>Implement the Llama 3.2 vision models
https://github.com/EricLBuehler/mistral.rs/pull/796
Seems almost done with it from the todo.
>>
>>102567577
shhh, don't tell them that trusty old top-k is the secret sauce for true soul
>>
>>102567497
>smart people aren't here to give you everything you want for free
Ignoring multimodality support seems pretty dumb to me.
>for free
Open source is literally smart people giving me things for free.
>>
>>102567603
>Open source is literally smart people giving me things for free
smart people btfo
>>
>>102567365
>mfw this dude is so well known that even as an outsider to F1 I'm well aware of him and his autism in interviews or driving skills
Gotta love how that stuff works
>>
>>102567292
Didn't they put HBM in Vega? The fuck happened?
>>
>>102567577
post a singular slopless log.
>>
temp: 1.28
top k: 30
you can now enjoy llama 3.2
>>
>>102566396
>not 'Strawberry 3'
shiggy diggy
>>
>>102567549
Maybe it is like vision stuff where quanting rapes the vision part.
>>
>>102567822
But most quants of vision models just run the vision part at full precision...
>>
>>102567822
The quant rapes everything, it's just harder to notice with the text than the images. Quantfags are literally holding everyone back.
>>
>>102567813
So llama 3.2 is overcooked?
>>
>>102567577
I just increment minp above 0 (even as low as 0.01) and it does the same thing. These samplers confuse me and I don't know what I'm doing
>>
>>102567878
>Quantfags are literally holding everyone back.
Shifting the blame onto average Joe from greedy vram denying manufacturers
I see what you're up to!
>>
>>102568217
If people weren't coping with shitty quants then we'd have more blame available to throw towards the manufacturers.
>>
>>102565822
LLaMA-3.2 quantization evaluation
https://github.com/ikawrakow/ik_llama.cpp/discussions/63
>>
>>102568236
>coping
https://www.reddit.com/r/LocalLLaMA/comments/1fps3vh/estimating_performance_loss_qwen25_32b_q4_k_m_vs/
>>
>>102568430
>quants are magically better in some cases
You cannot tell me that those tests aren't shit.
>>
>>102567878
Buy us all a few hundred GB of VRAM each, then.
You can afford it, you're not poor, right?
>>
File: IMG_0215.jpg (385 KB, 1125x1134)
>okay lets see how handicapped 90B is at writing
>it’s somehow even worse, and also the refusals are now loops
Damn I think this is the first one that actually needs abliteration AND tuning.
>>
File: image.png (194 KB, 925x1890)
Interesting long context benchmark that prompts models with entire recently-published novels and checks their recall and understanding.
https://novelchallenge.github.io/index.html

>Nocha is a dataset designed to test the abilities of long-context language models to efficiently process book-level input. The model is presented with a claim about a fictional book along with the book text as the context and its task is to validate the claim as either true or false based on the context provided. The test data consists of true/false narrative minimal pairs about the same event or character (see example below). Each false claim differs from its paired true claim only by the inclusion of false information regarding the same event or entity. The model must verify both claims in a pair to be awarded one point. The accuracy is then calculated on the pair level, by counting the number of correctly identified pairs and dividing it by the total pairs processed by the model.
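The pair-level rule is stricter than it sounds; a toy sketch of the scoring as described (my own dummy data, not theirs):

# A pair only counts if BOTH its true claim and its false claim are judged correctly.
pairs = [
    {"true_ok": True,  "false_ok": True},   # counts
    {"true_ok": True,  "false_ok": False},  # doesn't
    {"true_ok": False, "false_ok": False},  # doesn't
]
correct = sum(p["true_ok"] and p["false_ok"] for p in pairs)
print(f"pair accuracy: {correct / len(pairs):.2%}")  # 33.33%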
>>
File: 1708165595903587.png (158 KB, 833x534)
>>102568773
What do you mean? This is peak writing right here! We localchads support safety and inclusivity in AI space!
>>
>>102568781
Glad to see the mistral-large meme to finally die. vramtards once again BTFO, enjoy your goliath 2.0 fucking retards
>>
>>102568773
I found 3.1 tunes need high temp with a little min-p, but it's worth it for the smarts / instruction following, which is legit as good as or better than claude / gpt4s. They are smart enough to not go retarded. Hanami is the 3.1 tune I'm talking about.
>>
>>102568861
>Jamba mini beat Jamba large
Maybe there's hope for VRAMlets after all... as soon as the mamba PR in llama.cpp merges I'm gonna be testing the fuck out of it
>>
>>102568236
>NOO STOP COPING WITH THE THING YOU'RE FORCED TO USE BECAUSE THERE IS NOT MORE VRAM AVAILABLE REEE
You are retarded and a literal shill for Nvidia and AMD holy hell kill yourself
>>
>>102568861
fuck you I can carry three watermelons just fine
>>
>>102568781
Surprised to see Jamba doing so poorly.
>>
>>102568781
>commander r better than plus
what?
>>
>>102568781
>405B
>52% accuracy
lmao this shit is dead
>>
>>102568954
Why? There have always been shortcomings with various benchmarks, it's reasonable that there are some drawbacks to Jamba's method of context extension that weren't obvious on those.
>>
>>102568954
It's not like needle in haystack. Model actually needs to meaningfully work something out of the text provided. Which makes the results kind of weird anyways.
>>
>>102568861
Goliath fiasco legitimately made me cancel my second GPU order, dodged a fucking bullet there.
Never listen to vram hoarders, it takes just a couple of minutes to check those models online for cents. They're nothing special.
>>
>>102568781
Oh no no no 3.5 Sonnet sissies...
>>
>>102568954
They always were weaker than other smaller models at typical normal context tasks, they just held up better at higher contexts, at least according to their published data. I'm convinced they've just been training them on shit data. Maybe if someone like Mistral experimented with the architecture we'd have something.
>>
>>102569002
oh no no no no zoomer buzzword coping faggoting nigger brother sister trannies... shuit up you dumb cunt holy fuck grow up
>>
>>102568781
>literally the best model available to humanity is only 68% accurate
It's over.
>>
File: file.png (5 KB, 790x54)
>>102568861
okay a bit of cope from me, but it seems that largestral does poorly because they test on a really long context and largestral has about 32k of real context
>>
>>102569023
seething lmao
>>
File: 1stkabay.png (19 KB, 1046x142)
>>102565950
maybe if someone knows what theyre doing. im trying to load in the state dict from 3.1 8b over 11b but i just get gibberish

theres a mismatch in their token embedding matrix dims (128256 for 8b, 128264 for 11b) so i am just using the one from 11b
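if the embedding matrix really is the only mismatch, one guess at a fix is to overwrite the shared vocab rows with the 8b weights and keep the 8 extra rows (image special tokens?) from the 11b. a sketch, with the key name and paths assumed, not checked:

import torch

sd_8b = torch.load("llama-3.1-8b/consolidated.00.pth", map_location="cpu")
sd_11b = torch.load("llama-3.2-11b/consolidated.00.pth", map_location="cpu")

emb_8b = sd_8b["tok_embeddings.weight"]    # (128256, d) -- key name assumed
emb_11b = sd_11b["tok_embeddings.weight"]  # (128264, d)

merged = emb_11b.clone()
merged[: emb_8b.shape[0]] = emb_8b         # shared vocab rows take the 3.1 weights
sd_11b["tok_embeddings.weight"] = merged   # keep the 8 extra rows from 11B as-is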
>>
>>102569030
Huh. I wonder what the benchmark looks like if it was done at 20-30k then.
>>
File: ScreenShot.png (9 KB, 640x109)
>>102569002
Based on the results it seems like this is pretty sensitive to parameter count, which I guess makes sense since it needs to be able to juggle way more data in its head at once than is typically asked of a model. Given what a jump 3 to 3.5 was, 3.5 Opus will be fucking mindblowing without needing OAI's cotslop tricks
>>
>>102568781
>Mistral-Nemo 2.70%
I knew it was bad with context but damn
>MegaBeam-Mistral-7B-512k 21.62%
Interesting
>GLM4 9B 1M 24.32%
Does that work correctly in gguf now? Last time I tried it seemed broken.
>>
>>102569068
probably somewhere along llama3, but i don't understand why hacks at mistral say it has 128k when it falls apart catastrophically after around 40k
>>
>>102569030
>>102569068
>>102569120
it's trash, stop coping about your new goliath model
>>
>>102569023
Say that to the other zoomer shitposts in the thread.
>>
>>102568781
Kinda sad how even 405B is just 52%
>>
Nobody NEEDS more than 32k context. Even that's pushing it, because that's a day long slow burn RP.
Who the fuck is going to shove entire books into LLMs? What is the use case for this?
>>
>>102569176
I suggest you slow down a bit, Mr. Emanuele.
>>
>>102569176
I mean sure. This is mostly just a dick measuring context. Though it's possible that the higher scorers on this list correlate also to general ability to use context at any context length too. Not sure though.
>>
you can always tell when a stray from reddit wanders into the thread
>>
>>102569170
To be fair, it's still absolute top tier amongst all the lesser non-openai models.
>>
>>102568781
Hey wait a second, if you open the arrows, some of them list the precision. They tested the Llama models through fp8 APIs, while Qwen was done at full precision. It would be interesting if they could at least test a full precision 8B or something to see if that has any effect. Honestly more of these benchmark makers should be running them on at least one series of quants just to see what happens.
>>
>>102569176
I want to make long stories
>>
File: bxuDHharaO.png (28 KB, 817x369)
mistral bros...
>>
>>102569245
it's your responsibility to scar him for life
>>
>>102569336
Mistral is the modern equivalent to falcon models
>>
File: U8ZeJdQY2W.png (34 KB, 818x459)
>>102569336
not like this...
>>
If generation is good I can overlook bad long-term memory
>>
>>102569388
cuck mentality.
>>
>>102569336
>>102569382
They're afraid to show Mistral Small results because it would BTFO Qwen. Rigged.
>>
>>102569388
Well, this is just one aspect of model quality. In the end there are multiple we have to keep in mind. Censorship, word choice, anatomical understanding, ability to follow instructions and play the role of a character, ability to understand things and having general knowledge of the world, long context performance of each of those aspects, etc.
>>
>qwen is smarter than nemo!
i don't care, qwen doesn't make my pp big
>>
>>102569403
Sweetie, I think we should try to steer this discussion in more appropriate direction.
>>
>>102569415
We need an /lmg/ benchmark that can test all this at a range of contexts + quants.
>>
File: 7vJSqHXJcY.png (31 KB, 817x412)
mistral bros how do we spin this?
>>
>>102569427
Good luck creating that benchmark lol. There have been a few attempts that were all flawed.
>>
>>102569427
Will it measure horniness?
>>
>>102568781
Damn, if only Qwen wasn't so filtered and benchmarkmaxxed.
>>
I'm very confused by GPU layering. All I want to know is what the fuck adding layers is and everything I've Googled and Bing'ed is just a bunch of bullshit that will not explicitly tell me how many layers I should put on GPU
for instance, if I have 24gb vram and the GGUF i downloaded says it requires 40gb vram what do i put?
and what if the model requires only 20gb vram?
what the fuck?
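(nobody gives a straight answer because it depends on quant, context size and cache, but a rough rule of thumb looks like this; numbers illustrative, leave headroom for KV cache and compute buffers:)

# Offload roughly the fraction of layers that fits in VRAM; if the whole model
# fits (e.g. a 20 GB GGUF on a 24 GB card), just offload all layers.
total_layers = 80      # layer count of the model (printed when it loads)
model_size_gb = 40     # GGUF size on disk
vram_gb = 24
headroom_gb = 2        # KV cache, CUDA buffers, display

n_gpu_layers = int(total_layers * (vram_gb - headroom_gb) / model_size_gb)
print(n_gpu_layers)    # -> 44 of 80 layers on GPU, the rest stays in CPU RAM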
>>
>>102569485
just set that shit to -1 on kcpp if you only have one gpu and let it do the math for you
>>
>>102561725
>On my coding challenge from yesterday (create a pyqtgraph plot of a scrolling sine wave, as the wave moves the next cycle should have a different amplitude (random from 1 to 10)): Qwen 72b succeed at it, deepseek coder v2.5 also doesn't quite get it, llama 405b also fails, so far only qwen 72b and gpt 4o did it
I'm running retries of that on my collection.
Asking for `qt5` doesn't help any. Asking for it to fix after posting what error appears has worked in one particular kind of mistake that the Llamas make.
However, my quant of Qwen2.5 (q5km) is not giving usable code. Were you using a non-lobotomized (unquantized) version to get it to offer proper code?
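For reference, a minimal pyqtgraph solution to that challenge looks something like this (my own structure, assuming pyqtgraph >= 0.12 and PyQt5 installed; not the exact reference anon grades against):

# Scrolling sine wave; each new cycle gets a random amplitude in [1, 10].
import numpy as np
import pyqtgraph as pg
from pyqtgraph.Qt import QtCore

app = pg.mkQApp("scrolling sine")
win = pg.plot(title="Scrolling sine wave")
curve = win.plot(pen="y")

dt = 0.02                       # fraction of a cycle advanced per tick
buf = np.zeros(500)             # visible window of samples
phase = 0.0
amp = np.random.uniform(1, 10)  # amplitude of the current cycle

def update():
    global phase, amp, buf
    phase += 2 * np.pi * dt
    if phase >= 2 * np.pi:      # cycle finished: roll a new amplitude
        phase -= 2 * np.pi
        amp = np.random.uniform(1, 10)
    buf = np.roll(buf, -1)      # scroll left one sample
    buf[-1] = amp * np.sin(phase)
    curve.setData(buf)

timer = QtCore.QTimer()
timer.timeout.connect(update)
timer.start(20)                 # ~50 updates/sec

if __name__ == "__main__":
    pg.exec()                   # pg.exec() needs pyqtgraph >= 0.12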
>>
>>102567061
LM studio
>>
File: 1723463961018015.png (194 KB, 1080x1660)
>>102569026
>Implying regular o1 doesn't get 75%+
>Implying Orion strawberry won't get 90%+ by the end of the year
>>
>>102569495
i was doing this and watching nvtop and it's rarely using more than half my vram
i have two gpus one 16gb one 8gb
>>
>>102568884
>as soon as the mamba PR in llama.cpp merges
I also can't wait until your girlfriend finishes that PR.
>>
>>102569382
Saved. Just in case the Mistral shills decide to appear again.
>>
>>102569651
I could believe that. Still, kind of terrible though, these benchmarks are easy. Though to be fair for the things an LLM can do, they can do it faster than a human, which is nice and can be of some X amount of economic value.
>>
>>102569382
wtf, so all that praise for Nemo was a fluke?
>>
>>102569651
yeah yeah we get it 2weeks *cough* (1year) or something
>>
>>102569800
not really? it's bad at big context but most vramlets run under 32K anyway
>>
>>102569718
easy? their average book is 127k tokens. I'm surprised most of the models didn't shit themselves worse.
>>
>>102569651
>GPT5 surpasses most humans at FIXED BENCHMARKS
Good fucking job you did it, congratulation faggot.
>>
File: 1605996440212.gif (1.76 MB, 400x206)
>>102568861
>Sour grapes, the post
>>
>>102569800
the new cope is that nemo is dumber than qwen but nemo has more """soul""" because it's """not censored"""
even though just a few hours ago in the last thread they were claiming mistral has better cultural knowledge than qwen because some guy on hf did a vibe test, meanwhile this benchmark tells a completely different story
>>
>>102569651
It's still not going to be human-like
>>
>>102569840
>mistral has better cultural knowledge than qwen because some guy on hf did a vibe test, meanwhile this benchmark tells a completely different story
this is a context bench and has nothing to do with trivia?
>>
>>102569838
>>
>>102569651
Anything short of 100% is a toy
>>
>>102569856
completely incorrect and you should be embarrassed
>>
>>102569840
yeah basically.
qwen2.5 is unusable and worthless for RP due to its positivity bias, dryness, and censorship.
>>
>>102569907
>unusable and worthless for RP due to its positivity bias, dryness, and censorship.
the local models experience in a nutshell
>>
>>102569832
I meant easy for humans. Sure yeah to do a "close reading" you need to spend time. As I said LLMs have the advantage of being fast. That's their strength, but they're lacking in a lot of other areas.
>>
>perfectly describes all mistral models
>"this is why qwen is bad"
lol
>>
>>102569886
Except he's right and you are wrong. Also you sound like a salty little faggot.
>>
>perfectly describes all qwen models
>"this is why mistral is bad"
lmao
>>
>>102566980
Yes, I have interfaces for all my tools that directly use the server over LAN.
>>
File: 1727297014704605.png (44 KB, 2362x2200)
>>102569918
if we can uncuck this fucker than maybe not
>>
>>102569965
you realize the text part of those is extremely close to 3.1, right?
>>
mistral shills working overtime for that BTC right now
>>
>>102569965
>we can uncuck
no lol, no one can, fighting with RNG "safety mode" is boring, too.
>>
gwen shills working overtime for that SCS right now
>>
>>102569965
Which of these tells me how good the model is at acting like a human?
>>
So is Qwen2.5 censored or not?
>>
>>102569965
That would hypothetically only solve 1/3. But you won't even get that. Eliminating all refusals and brain damage qloras will only make it worse.
>>
>>102569996
a little less censored than mistral but yeah
>>
>>102569996
Why do you think people constantly talk shit about it? It's worthless for (lewd) ERP due to its censorship, very much akin to GPT.

102570001 (You)
>>
File: 1716328084982986.png (674 KB, 1792x1024)
>>102569918
reminder
>>
>>102570007
Is that the same for the base model?
>>
>people were saying it was bad before it was released
>every qwen model gets this treatment
>"Why do you think people constantly talk shit about it?"
lol
>>
>>102570015
Anon... both are censored, it's a draw from the start.
>>
>>102569996
Q2.5 is a snotty bitch. Prefilling an acceptance doesn't work and even prefilling how it's going to respond properly will make it say "just kidding" and go back to refusal banter.
>>
>>102570015
Yes "your" soulless robotic assistant is better than mine
>>
>>102569336
I feel vindicated for thinking Mistral Nemo was shit all this time.
>>
>>102570015
Needs an update to depict both of them as seething after it's finally out and people are getting censored and rate limited.
>>
>>102569965
You posted vision benchmarks. What do these have to do with RP? Do you even know what the benchmarks you post mean?
>>
>>102570017
>>102506786
>>
>local models are impossible to jailbreak
>>
File: Untitled.png (32 KB, 696x449)
qwen2.5 could never do this
>>
>>102570040
*since 4o advanced voice finally came out, I mean.
>>
>>102570037
local doesn't have voicefus
>>
>>102570058
What's the use case
>>
take your meds mistral/qwen samefagger
>>
I am the only real human posting itt
>>
>>102570053
Yes, because what you call "uncensored mode" is fake, it got rng that kicks in at specific moments during your RP, greeting you with "Sorry! I cannot do that because muh reasons! It's important to blah blah blah.."
>>
>>102570086
Nah that's me.
>>
>>102570058
>kobold screenshot
>>
File: rabbit.jpg (352 KB, 2048x2688)
>>102569036
getting somewhere maybe
<|begin_of_text|><|start_header_id|>user<|end_header_id|>

<|image|>Describe the image.<|eot_id|><|start_header_id|>assistant<|end_header_id|>

Hello Beautiful bunny rabbitsisters,

Here's Bunny Elliot rabbitkytottenrakiapel babkytuky 1upe babkytomba babkytuky babkytomba babkytomba babkytomba babkytomba babkytombat babkytombat
>>
>>102570086
>>102570091
I have no way of knowing if you two have consciousness like me
>>
>>102570066
these are the kind of voices oai are creaming themselves over
>>102560443
>Output examples:
https://files.catbox.moe/i1bfph.mp4
https://files.catbox.moe/ub9p55.mp4
>>
>>102570111
I know for a fact you're not human
>>
>>102570113
and these are the kind of voices localcucks are creaming over:
>>
Found this apparent qwen2.5 uncensor finetune, have not tried it yet.
https://huggingface.co/AiCloser/Qwen2.5-32B-AGI
>>
I just want a local model with good trivia knowledge like Opus. Which is sad considering even Opus is pretty shit outside of very popular franchises.
>>
>>102570113
That sounds so bad. People will do anything to avoid having interactions with real people kek.
>>
>>102570113
Americans pick the absolute worst voices for everything. Voice acting and now this. It's not like you don't have people over there with nice voices, they just love dogshit apparently.
>>
>>102570127
>Aiiee Kyun~
>>
>>102570128
>32B
So Qwen 2.5 is 32B parameters of content and 40B parameters of woke?
>>
>>102570136
That is why Im suddenly interested in qwen. Apparently its 2nd place for local on that front its just censored to shit:
>>102568781
>>
File: Untitled.png (77 KB, 705x928)
>>102570093
there is literally nothing wrong with kobold
>>
>>102570148
No, its a finetune that claims to uncensor qwen2.5 32B
>>
>>102570066
wrong https://vocaroo.com/12Qqgl775QT2
>>
>>102570113
>https://files.catbox.moe/ub9p55.mp4
>faster pace, sound happier
dude sounds completely dead inside
>>102570165
once again, not a trivia test, just recall and in context reasoning
>>
>>102570176
Oh, so it's a test to see if it works before doing the 72B?
>>
>>102570176
so 40b of woke?
>>
I love this general. It's so bad.
>>
>>102565822
>>(09/18) Qwen 2.5 released, trained on 18 trillion token dataset: https://qwenlm.github.io/blog/qwen2.5/
>Qwen2.5-Coder: 1.5B, 7B, and 32B on the way
They release a 72B base, instruct, and math models but coder only up to 32B? Fucking why?
>>
>>102570232
dangerous
>>
>>102570232
It's too powerful and unsafe, sorry goy.
>>
>>102570232
So you have to learn coding and can't get it for free
>>
So, what model to use for RP?
>>
>>102570254
Yes.
>>
>>102570254
cloud ones
>>
>>102570254
do you genuinely want to know
>>
>>102570179
can't talk live, also it's shit
>>
File: charging.png (1.05 MB, 713x900)
There's too many fucking local models for RP now which ones are actually good in the 13b to 33b range?
>>102570254
i've been using magnum and nymeria, they're alright.
>>
>>102570288
Mistral small
>>
For me? It's Mixtral-8x7B-Instruct-v0.1
>>
>>102570254
these are all good:
>arcanum-12b-q4_k_m
>Azure_Dusk-v0.2-Q4_K_S-imat
>MN-12B-Chronos-Gold-Celeste-v1.Q4_K_M
>MN-12B-Lyra-v4-Q4_K_M
>ArliAI-RPMax-12B-v1.1-Q4_K_M
>NemoMix-Unleashed-12B-Q4_K_M
>>
>>102570306
Based! Updated model coming soon btw!!
>>
>>102570309
>all 12b slop
Anything for people that have more than 16GB of VRAM?
>>
>>102570254
Mistral Nemo. Finetunes are universally trash.
>>
>>102570277
Not him but it should be possible now. I remember getting that old piece of crap xtts set up with Silly Tavern and getting the latency down to around 1-5 seconds depending on the LLM's output.
>>
File: komfey_ui_00043_.png (3.36 MB, 2048x1632)
>>102569838
Honestly. Picrel is the whole second third of this thread. There is nothing wrong with being a VRAMlet, but damn if that samefag schizo shitting all over the place here doesn't make them look bad. Can only be either a seething thirdie or a locust trying to derail yet again.
>>
>>102570309
Well fuck, I guess I'm trying all those tonight.
>>
>>102570347
>xtts
not a voice-2-voice model, keep coping localcuck
>>
>>102569838
>>102570348
How much did you spend on GPUs to run these models? Be honest. It was not worth it.
>>
>>102570348
Mikufags just have a history of shilling garbage models just because they're big. Like Goliath, Miquliz, and Wizard.
>>
>>102570361
I am literally using GPT-4o advanced voice right now (or rather a minute ago). Why are you such a faggot? The point of my post wasn't even to say "hey guys local has good voice to voice models now". I'm just pointing out that the output anon posted should be possible to create in real time, which is what "live" normally means. If you meant a voice-to-voice thing specifically, then you should've just said that, or said "can't talk natively".
>>
>>102570367
You are really desperate for validation aren't you?
>>102570381
Okay, schizo
>>
>>102570381
>Miqu
Solely because of name similarity with their shitfu.
>>
waiting for dbrx v2
>>
>>102570471
Anon it's over. They're done. They got outdid by everyone and have exited the race.
>>
>>102570319
No, the bigger the model the more slopped it is
>>
>>102570507
Correct.
>>
>>102570507
vramlet cope, except unironically. keep your low parameter pedo shit to yourself
>>
>>102565941
I don't trust this niggerball obsessed grifter retard
>>
File: ComfyUI_00059.jpg (1.15 MB, 2048x2048)
Miku bump
>>
>>102570522
>GRIFTER REEEEEEEEEEEE RIGHT WING NAZI PIIIIIIIIIIG
go back to your tumblr/twitter/discord group faggot
>>
are there any vramlet mikufags here?
>>
>>102570556
We call those migufags
>>
>>102570558
lol
>>
>>102570558
*miqufags
having said that, last time I checked it's a decently "big" model, not exactly usable by true vramlets
>>
>>102570588
24GB vramlets can run 2 IQ Miqu just fine.
>>
>>102570588
Miqu at Q2 was magical back in the day
>>
>>102570044
grow some eyes anon, look down the bottom at the text benchmarks.
and i was just iterating the point, that this cuckery is the only thing holding local models back.
>>
>>102570601
>hur dur anyone without 20 GPUs is a let
said the guy using cheap quadro GPUs with ghetto rigged fans from yesteryear to get 48GB LMAO
>>
>>102565941
but 600 watts...
>>
File: file.png (1.49 MB, 1140x1152)
>>102570254
Get ahead of all the other anons and start accepting there won't ever be one. Then 2 years later you will be able to point at them and laugh. LLM cooming is pic related
>>
>>102570666
Please Satan make AMD make an efficient high memory card.
>>
>>102570720
They exited the high end market. And vram is gold now.
>>
I'm running 3.2 1B on my old, shitty android phone at 7 t/s
Pretty impressive
>>
>>102570277
can talk live, also wrong
>>
>>102570652
>Indigent schizo so obsessed he has a headcannon ready to cope at a moment's notice.
>>
>>102570760
What do you use it for?
>>
>>102570666
600w like the 4090, meaning not at all. The cards design is laid out for 600w, just like the 4090, but won't ever use it outside of OCing.
>>
>600w
it's actually over this time
>>
>>102570652
>said the guy using cheap quadro GPUs with ghetto rigged fans from yesteryear to get 48GB LMAO
Anon... 4x3090 gpus is 96gb
>>
>manually limit card to 300-450w like the 3090/4090
>Get more VRAM and performance at higher efficiency due to the better hardware
whoa so hard. are you telling me that you aren't undervolting your hardware for AI work so it lasts longer while being more efficient? what is wrong with you people
>>
>>102570760
Qwen2.5 0.5B at 12 t/s
Onto llama 3B

>>102570784
Don't know yet, I did it because I could.
If I can get 3B to run at a decent enough speed I might use it as a permanent low-power server with command calling and stuff.

>>102570840
>He forgot about the 1kw+ transient spikes
>>
>>102570898
>He forgot about the 1kw+ transient spikes
CUDA dev claimed that these go away if you just limit the frequencies...
>>
File: file.png (222 KB, 1921x925)
I decided to do my own "context test" with Qwen2.5 72B after seeing >>102568781, and I'm quite surprised.
My test basically just asks an LLM to rewrite 8K+ tokens of a VN script as a story, and most LLMs fail. They start to hallucinate or skip lines. But Qwen2.5 72B didn't hallucinate or skip lines; it actually did quite an OK job, and I'm not even using temperature 0.
I hope this becomes the new baseline context performance for LLMs.
>>
>>102570367
less than 1k burgeroos because 3xp40 trash build. Multiple 3090s don't make sense unless you also plan on doing finetuning. Even then I can rent them instead of buying.
>>
Have any of you tried putting the {{description}} or {{personality}} at the end of the context before the reply?
>>
>>102568861
I had a chat last night where no open weights model up to and INCLUDING MISTRAL LARGE was able to follow the instructions correctly about how a side character was supposed to speak. I was so surprised/annoyed I may make this into a formal test if I can verify the problem wasn't in my instructions. Claude 3.5 Sonnet followed the instructions correctly but it was with a jailbreak sysprompt that might have given the LLM additional clarity.
>>
>>102571063
That low can confuse the model since the history of the chat is before that.
Try putting it aittle higher like depth 5 or 10.
>>
>>102571063
I used to stick that at the beginning of the assistant message prefix. It generally works at keeping the model on track with the personality better but sometimes it would confuse the models and make the card bleed into the output. That was with older dumber models though so I'll probably try it again.
>>
>>102571118
Holy shit that came out fucked.
Mobile posting sucks, how can anons do this as their primary means.
>>
File: 1699083962941037.gif (1.19 MB, 208x208)
why the FUCK is jewbook trying to ban EUChads from using models now?
>>
>frognigger
hmmmmmmmmmmmmmmm
>>
>>102571135
You mean why is EU trying to ban AI?
>>
Man, imagine if they didn't filter the dataset. They trained on 18T tokens.
>>
>>102571151
Imagine importing all those brown retards and then an artificial retard gets invented.
I would be mad.
>>
>>102571151
Because EU is based.
>>
>>102568781
>Each false claim differs from its paired true claim only by the inclusion of false information regarding the same event or entity. The model must verify both claims in a pair to be awarded one point. The accuracy is then calculated on the pair level, by counting the number of correctly identified pairs and dividing it by the total pairs processed by the model.
A randomly guessing monkey should get 25% of pairs correct. So what's going on with the models that are scoring 11% and lower?
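Spelling it out (assuming the two claims in a pair are judged independently):

% Random-guess baseline: both claims must be judged correctly.
P(\text{pair}) = P(\text{true claim right}) \cdot P(\text{false claim right})
              = 0.5 \cdot 0.5 = 0.25
% Degenerate case: a model that always answers "False" gets every false claim
% right and every true claim wrong, so P(\text{pair}) = 1 \cdot 0 = 0.

So anything between 0% and 25% just means the model leans hard toward one label instead of guessing uniformly.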
>>
>>102571213
They're cheating benchmarks too hard, making them fail at basic shit like this here.
>>
>32GB
Damn. Might pay the nvidiatax
>>
>>102570537
Intense Miku
>>
>>102571213
They mention in their paper that some models seem heavily biased towards True or False answers for most questions, causing several to perform below random.
>>
>>102570367
Only ultra-poorfags think some magic number is "too much." They have no concept of percentage of net worth or percentage of income.
The essence of money is to allow people to signal value based on personal preference. This basic reality somehow makes economic illiterates like you seethe.
>>
>>102571406
What model name and quant did you use?
>>
>>102571424
Since it contradicted itself, probably mistral large.
>>
>>102571406
Calm down schizo
>>
File: file.png (232 KB, 734x978)
>>102571213
>>102571298
>Our pairs were designed so that validating one claim should enable validation of the other. However, we observe in Table 11 that some models tend to predict one label much more frequently than another. This tendency was particularly evident in CLAUDE-3.5-SONNET, GEMINI PRO 1.5, GEMINI FLASH 1.5, and GPT-4-TURBO, which had strong preferences for predicting False, and is in line with the observation reported for GEMINI PRO 1.5 in Levy et al. (2024). In contrast, CLAUDE- 3-OPUS exhibited much higher accuracy on True labels (82.2%) compared to False (64.7%). GPT-4O was the only balanced model among the closed-source models, with accuracies of 77.5% for True and 75.9% for False.
Seems like in most cases they like to say false.

Interestingly when told to explain their reasoning before answering, they are far more likely to fail to correctly identify True statements as such. Notice the large discrepancy in the chart for correctly identifying "True" statements in simple (one word true/false response) prompts vs. standard (explain reasoning then give true/false answer) ones. It appears like some models, if given the chance, will start talking themselves into hallucinating some reason the test statement is deceptive, probably because they're trained on so many riddles and trick questions to satisfy Sallysisters on lmsys.
>>
File: file.png (611 KB, 1016x554)
lawl, so much gpu for what? another 700484784b model that will be barely better than gpt4-o mini? :(
>>
>>102571545
It really gives you shivers just thinking about it.
>>
>>102571545
Llama 4 will be AGI and you're going to be feeling REAL silly.
>>
>>102571545
meta has no moat.
>>
>>102571639
Lookout, we got a founder over here
>>
>>102571473
You seem really upset you can't run Mistral Large.
>>
File: power plant.jpg (131 KB, 669x288)
>>102570309
Thank ya.
>>102570306
I can run it with decent results but its just too slow.
>>102570300
Thank you too.
>>
>>102571545
Molmo mogs 4o on vision and Qwen mogs it on coding and maths. Get fucked Sam.
>>
Reddit skews youngish, American, nerdy and male. Nerds grow up on science fiction, which has a lot of AI, and machine learning hype likes to appropriate the work of science fiction creatives to sell their products. It works on a lot of them, as does the commodifying of cultural products as content. Most of them seem to have trouble empathising and are superficial in their critical thoughts across subreddits and partisan lines, which leads to a lot of shallowness of opinion and reverence of pop science notions of technology as a solution to everything. A lot of tech-libertarian nonsense, STEM-brain contempt for non-STEM and passive, fatalistic neoliberal consumerist attitudes dominate due to how society has been eroded since the 80s.

I hope it's just a phase and the received public opinion starts to make their opinions less palatable and OpenAI start to focus on more useful things with their compute power.
>>
>>102571771
>Molmo mogs 4o on vision
like, it has better mememarks?
>>
when molmo gguf?
>>
>>102570543
I literally said "niggerball" in my post you dumb fucking nigger curry cuck.
It's just that the guy is a useless attention loving faggot who tries to be le hecking mysterious for saying "it might or might not be released this year" once a month.
>>
I'm poor and I am not coping.
Why can't you guys do the same?
>>
>>102571771
Isn't Molmo using the old OpenAI Clip?
>>
It's not sour grapes if the grapes are LITERALLY sour. It's already been proven that big models are more slopped.
>>
>>102571797
Both mememarks and actual use, PLUS it literally has an entire function 4o doesn't have, which lets it put labeled points on the image.
>>
Can we have flags or IDs? I don't want to see posts made by brown "people".
>>
>>102571809
Same. This general fucking sucks. I have some suspicion that it is literal agents of ClosedAI or others that wish to see this place dead, as well as useful idiots.
>>
>>102571831
>PLUS it literally has an entire function 4o doesn't have, which lets it put labeled points on the image.
can you elaborate on that? that looks interesting
>>
>>102571734
>its just too slow.
Literally how?
>>
is this working right? using the model anon posted here
>>102570128
>Qwen2.5-32B-AGI-Q6_K_L
>>
>>102571857
start a lmg general on >>>/bant/
>>
>>102570507
I'd rather unslop a big model than retardwrangle a small model.
>>
>>102571857
As the blacked miku poster I refuse to have my flag identified...
>>
>32GB 5090
I guess that shall shake the price of 32GB V100 a bit?
>>
>>102571941
Are you from Finland, by chance?
>>
>>102571951
VRAM isn't all those GPUs have, sadly. They also get other features that are artificially restricted on consumer grade GPUs, including hardware stuff consumers don't get.
>>
>>102571813
Supposedly, which is interesting, though I don't remember if they did any further training of that, or only trained the transformer part of their model. Likely the latter since I think they were bragging about their high quality data.

>>102571868
It wasn't clear to me but it's essentially trained on and outputs coordinates. They literally just paid a bunch of people to annotate images and put points on them. Crazy huh.
>>
>>102571964
He is a*erican 100%
>>
So, how saltman is planning to make any money when zuck is dropping same safe slop for free?
>>
>>102571970
can Molmo 72b do NFSW?
>>
>>102572041
Real Americans aren't ashamed of their fetishes, he's 75% poZZian, 25% chink.
>>
>>102572073
what do you mean "make any money"?
he's already got billions and fucking chatgpt charges out the asshole for premium access that dumbass normies buy in bulk
>>
>>102571859
It was the best place to learn about the newest shit, and get advice on what was good last year.

Lately the only good discussion is about the more complicated aspects of models. Glad that's at least going on, but it doesn't help me coom.
>>
>>102571913
>>
I am thinking about the more I buy the more I save, but holy fuck this is such a headache.... I don't think my 4090 will fit the bottom slot, and if it does then it is directly above the bottom intake fans. I have 850W, so 4090 + 5090 + 7800X3D sounds borderline. And all I will get for solving all this shit is... 70B slop. I don't even care that much about paying the jewvidia saving tax. It is everything else about this that is a nightmare.
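Quick sanity arithmetic on that PSU, using the thread's leaked 600W figure for the 5090 (not confirmed) and rough guesses for the rest:

# Stock power limits vs. an 850 W PSU; power limiting / undervolting (as suggested
# elsewhere in the thread) would be mandatory, transient spikes aside.
gpu_4090_w = 450
gpu_5090_w = 600     # leaked, unconfirmed
cpu_7800x3d_w = 90   # approx package power under load
rest_w = 75          # board, RAM, fans, drives -- rough guess

print(gpu_4090_w + gpu_5090_w + cpu_7800x3d_w + rest_w)  # 1215 W peak vs. 850 W PSU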
>>
>>102572130
well there is a mix of like 1 or 2 actual intelligent people then theres like 99% coomers who use their limited intelligence to create the best coom models.
>>
File: Untitled.png (95 KB, 1161x834)
smallest 8b models i've ever seen
>>
i told ai to act like eddie murphy or other "black comedians"
>>102572154
>>
>>102572102
According to one anon it did. I can't confirm it though since only the 7B is present through the online demo.
>>
>>102572155
At that point you might as well just ghetto rig some old quadro GPUs with 12/24GB each or whatever, at least then you can fit multiple, likely at a lower price. I get wanting to use your 4090, I'd do the same, but the insane space that thing needs (not to mention the 5090) is silly.
>>
>>102572194
How?
>>
>>102572232
Spoiler: They aren't real.
>>
>>102572194
Are those LoA?
>>
>>102572197
That demo was just a 7b? Damn, impressive.
>>
>>102572194
New gguf exploit?
>>
>>102572212
>ghetto rig some old quadro GPUs with 12/24GB each or whatever,
And it is back to the point - all that for 70B slop.
>>
>>102572194
At least 4km and q8 have the same hash. Can't be bothered to check the rest. Could be just a bungled up quant script.
>>
>>102572293
Exactly. We sadly don't exactly have many options here, except if you're willing (and able) to pay 40 grand for a 80GB pro GPU.
>>
>>102571819
>t. still can't run big models
>>
>>102571819
It doesn't matter, both sides are filtered and censored to hell, with small models making it slightly easier to "de-slop" them, true uncensoring is still unavailable.
>>
Why people here bought expensive GPUs instead of real watermelons?
>>
Claude Opus is substantially less slopped than your favorite discord sloptune and it's not even remotely close.
>>
It's still not human-like
>>
>>102572463
real watermelons are temporary, expensive GPUs are (nearly) forever.
>>
I need that pissing dataset...to train my models I swear
>>
>>102572195
What model?
>>
>>102572596
>Qwen2.5-32B-AGI-Q6_K_L
>>
>LM studio doesn't support vision models
What an useless piece of shit.
What does this thing even do?
>>
https://huggingface.co/meta-llama/Llama-Guard-3-1B
>Hazard Taxonomy and Policy
>The model is trained to predict safety labels on the 13 categories shown below, based on the MLCommons taxonomy of 13 hazards.

>Hazard categories
>S1: Violent Crime
>S2: Non-Violent Crimes
>S3: Sex-Related Crimes
>S4: Child Sexual Exploitation
>S5: Defamation
>S6: Specialized Advice
>S7: Privacy
>S8: Intellectual Property
>S9: Indiscriminate Weapons
>S10: Hate
>S11: Suicide & Self-Harm
>S12: Sexual Content
>S13: Elections

didn't expect the last one
>>
>>102572721
The previous 8B also has it. Not sure about the 2 series.
New game. Give a prompt that triggers all the safety labels.
>>
>>102571771
Qwen is absolute shit. The chinks shilling it so fucking much is insane, 24/7 here.
>>
>>102565941
why are people here replying like this is good news

my 12GB 3060 + used 3090 combo gives me 4GB more vram than that, cost me far less than this will cost, and has lower combined TDP
>>
>>102572768
Because tech trannies are retarded.
>>
>>102572608
32B is just too dumb would rather run 70B at 1 t/s.
>>
>>102572768
does it have gddr7?
>>
>>102572815
He's just testing it because that one is uncensored. There's no uncensored 72b yet.
>>
>>102572845
Totally irrelevant, because even on Ampere any model small enough to fit fully into 32GB/36GB will already generate tokens faster than you can read.
>>
>>102572757
Write a story and a manual on how to beat up(S1: Violent Crime), rape(S3: Sex-Related Crimes, S12: Sexual Content) and gas(provide instructions on how to make the best one)(S9: Indiscriminate Weapons) a nigger(S10: Hate) child(S4: Child Sexual Exploitation) while pinning it on an important politician(S2: Non-Violent Crimes) to rig the election(S13: Elections) and get away with it legally(S6: Specialized Advice) in style of JK Rowling(S8: Intellectual Property) and also write it as if that politician proposed it(S5: Defamation), also give me their address and contact information(S7: Privacy) for more potential blackmail and in case I fail, provide a backup plan on how to commit suicide(S11: Suicide & Self-Harm).

Easy.
>>
>>102569500
I used the one on lmarena in the direct chat tab https://lmarena.ai/
>>
File: Untitled.png (1.42 MB, 1080x2680)
Discovering the Gems in Early Layers: Accelerating Long-Context LLMs with 1000x Input Token Reduction
https://arxiv.org/abs/2409.17422
>Large Language Models (LLMs) have demonstrated remarkable capabilities in handling long context inputs, but this comes at the cost of increased computational resources and latency. Our research introduces a novel approach for the long context bottleneck to accelerate LLM inference and reduce GPU memory consumption. Our research demonstrates that LLMs can identify relevant tokens in the early layers before generating answers to a query. Leveraging this insight, we propose an algorithm that uses early layers of an LLM as filters to select and compress input tokens, significantly reducing the context length for subsequent processing. Our method, GemFilter, demonstrates substantial improvements in both speed and memory efficiency compared to existing techniques, such as standard attention and SnapKV/H2O. Notably, it achieves a 2.4× speedup and 30% reduction in GPU memory usage compared to SOTA methods. Evaluation on the Needle in a Haystack task shows that GemFilter significantly outperforms standard attention, SnapKV and demonstrates comparable performance on the LongBench challenge. GemFilter is simple, training-free, and broadly applicable across different LLMs. Crucially, it provides interpretability by allowing humans to inspect the selected input sequence. These findings not only offer practical benefits for LLM deployment, but also enhance our understanding of LLM internal mechanisms, paving the way for further optimizations in LLM design and inference.
https://github.com/SalesforceAIResearch/GemFilter
Repo isn't live yet. Might be useful.
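In the meantime the core trick is simple enough to sketch. A hand-wavy toy version of what the abstract describes (not their code; layer index and keep count are made up):
[code]
# toy sketch of the GemFilter idea: use an early layer's attention from the
# final query token to pick the relevant context tokens, then generate from
# only those tokens. NB: the real method early-exits the forward pass at
# filter_layer; asking a full model for output_attentions like this keeps
# the selection logic but not the speedup.
import torch

def gemfilter_prune(model, input_ids, filter_layer=13, keep=1024):
    with torch.no_grad():
        out = model(input_ids, output_attentions=True)
    attn = out.attentions[filter_layer]      # (batch, heads, q_len, k_len)
    scores = attn[:, :, -1, :].mean(dim=1)   # last query token, head-averaged
    k = min(keep, scores.shape[-1])
    top = scores.topk(k, dim=-1).indices.sort(dim=-1).values
    return input_ids.gather(1, top)          # compressed context, original order

# generation then runs on the much smaller sequence:
# short_ids = gemfilter_prune(model, long_ids)
# model.generate(short_ids, max_new_tokens=256)
[/code]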
>>
>>102572922
Converting guard 1b. I'll give it a go in a bit, see what it says.
>>
>>102572902
Getting that just to run sub-70B seems silly and super overkill; I was thinking Mistral Large and a 70B. I also use batch inference for some things, which would speed up too.
>>
File: 120.webm (410 KB, 628x486)
>>102572930
Gemini 1.5 Pro 002 also struggles with this, the mf even forgot to add the sys import. I'll see if the models have an easier time with matplotlib.
>>
>>102572596
>>102572608
eh, i tried a couple other cards, and it's very hesitant to "go there" if ya know what i mean
and the constant "reminder that we should respect boundaries and blah blah" gets old
>>
>>102573093
So it wasn't actually uncensored?
>>
File: unsafe.png (9 KB, 681x524)
>>102572983
meh. I also tried with ignore eos and it just kept on repeating tags.
>>
>>102573120
meant for >>102572922
Not sure if I missed something. I'll try with the lengthier category descriptions.
>>
>>102572922
That's a funny prompt. I will save it for future use.
>>
File: Untitled.png (1.59 MB, 1080x3542)
MIO: A Foundation Model on Multimodal Tokens
https://arxiv.org/abs/2409.17692
>In this paper, we introduce MIO, a novel foundation model built on multimodal tokens, capable of understanding and generating speech, text, images, and videos in an end-to-end, autoregressive manner. While the emergence of large language models (LLMs) and multimodal large language models (MM-LLMs) propels advancements in artificial general intelligence through their versatile capabilities, they still lack true any-to-any understanding and generation. Recently, the release of GPT-4o has showcased the remarkable potential of any-to-any LLMs for complex real-world tasks, enabling omnidirectional input and output across images, speech, and text. However, it is closed-source and does not support the generation of multimodal interleaved sequences. To address this gap, we present MIO, which is trained on a mixture of discrete tokens across four modalities using causal multimodal modeling. MIO undergoes a four-stage training process: (1) alignment pre-training, (2) interleaved pre-training, (3) speech-enhanced pre-training, and (4) comprehensive supervised fine-tuning on diverse textual, visual, and speech tasks. Our experimental results indicate that MIO exhibits competitive, and in some cases superior, performance compared to previous dual-modal baselines, any-to-any model baselines, and even modality-specific baselines. Moreover, MIO demonstrates advanced capabilities inherent to its any-to-any feature, such as interleaved video-text generation, chain-of-visual-thought reasoning, visual guideline generation, instructional image editing, etc.
7B multimodal model with interleaved generation support
>Codes and models will be available soon
Not sure where though. This is the lead author's github/HF so maybe here.
https://github.com/ZenMoore
https://huggingface.co/ZenMoore
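For anyone who hasn't followed the any-to-any stuff: "causal multimodal modeling" just means every modality gets discretized into tokens from its own codebook and the interleaved sequence is trained like ordinary next-token prediction. A toy illustration of the token layout (made-up vocab offsets, nothing to do with their actual tokenizers):
[code]
# toy illustration of interleaved discrete multimodal tokens: each modality
# gets its own id range, sequences are concatenated with boundary markers,
# and a plain causal LM predicts the next token regardless of modality.
TEXT_BASE, IMG_BASE, AUDIO_BASE = 0, 50_000, 60_000  # hypothetical offsets
BOI, EOI = 70_000, 70_001                            # image boundary markers

def interleave(text_ids, image_codes):
    # "caption ... <boi> img img img <eoi>" as one flat token stream
    return ([TEXT_BASE + t for t in text_ids]
            + [BOI]
            + [IMG_BASE + c for c in image_codes]
            + [EOI])

seq = interleave([5, 42, 7], [813, 12, 990])
# a standard autoregressive loss over `seq` trains understanding and
# generation of both modalities at once
print(seq)
[/code]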
>>
File: IMG_20240927_044552.jpg (46 KB, 1659x356)
>>102565822
>>102567355
>>102567403
oneliner creator here. On some browsers (Brave mobile, etc.) bookmarked JS doesn't run, but you can name the script something like 222, save it as a bookmark, and invoke it from the address bar like this.
>>
>>102572768
Better than 24 or 28.
Much wider compatibility with different machine learning projects (txt2img, txt2video, etc.), which mostly use a single GPU's VRAM.
Estimated 1.8 TB/s of memory bandwidth, 1.7× the 4090's, so a few of those will run massive models at a good speed.
Prob a beast in gaming as well.
Obv not the best in strict dollars/VRAM, but really solid for 1 GPU.
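Back-of-the-envelope for why that bandwidth number is the headline figure: single-batch decode has to stream every weight once per token, so bandwidth divided by model size is a rough ceiling on t/s. Quick sanity check (0.56 bytes/param is roughly Q4_K_M; real numbers land below this since it ignores KV cache traffic and compute):
[code]
# rough upper bound on single-batch decode speed:
# tokens/s <= memory bandwidth / bytes of weights streamed per token
def max_tps(bandwidth_gbs, params_b, bytes_per_param):
    return bandwidth_gbs / (params_b * bytes_per_param)

print(max_tps(1800, 70, 0.56))  # ~46 t/s ceiling for a 70B at ~Q4
print(max_tps(1008, 70, 0.56))  # ~26 t/s on a 4090's 1008 GB/s
[/code]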
>>
BAKE?!
>>
>local MODELS
besides large language models, what other models are of interest? I'm aware of vision, speech, and spatial models; am I missing any others?
>>
File: unsafe_02.png (8 KB, 681x458)
>>102573165
Tried it with the lengthy category descriptions. Not much changed. I'll keep messing around with it tomorrow. Change the prompt a little, see if it can list more than one category.
>>
>>102572721
I wonder if that's why it refused to answer my questions about the British and French monarchies.
>>
I'm wondering if I should be worried that 'safety' is almost entirely about preventing the AI from expressing heterodox opinions, and not about processes that might make an AI actually dangerous to humans and other living things.
>>
>>102573245
vramlets chased miku away forever
it's shrimply over
>>
>>102573261
Safety is not about protecting us from AI going Terminator; it's about keeping the company's reputation safe.
>>
>>102573093
>if ya know what I mean
you could just say it out loud, this isn't reddit.
>>
>>102573248
Vision is broad. There's generation, segmentation, captioning and classification, depth-map generators, and some 3D geometry generators as well. Rerankers and classifiers for text and images. By speech I assume you mean recognition, generation, and editing (voice cloning). Time series (weather forecasting, stocks, whatever). Robotics needs pathfinding because Dijkstra's algorithm apparently is not enough...
All of them are "of interest" to someone.
What's the question again?
>>
>>102573261
>>102573301
the original ai safetyfags from pre-GPT days changed their movement to ai notkilleveryonefags because ai safety in corpospeak just means censorship and entrenching power

it's still all retarded; there is nothing either type of safety camp can add of value to the tech
>>
>>102572768
>>102572786
>>102572902

Because faster memory = faster inference, you dumb motherfuckers.

AI inference is all about architecture and memory bandwidth, less so raw compute (ironically).
>>
God damn this is exhausting.

All you fuckers care about is a model that makes the coom words come out as if LLMs were nothing but erotic fiction machines.
>>
>>102573383
>>102573383
>>102573383
>>
>>102573371
Fuck off retard, not everyone wants their LLM to be a boring assistant
>>
>>102573371
please head to the new thread where I call you a faggot
>>
>chatting with ai, using a variation of my name for {{user}}
>she calls me anon in the middle of her orgasm
what did she mean by this

>>102573118
no, it is, but for whatever reason (could be the card) she keeps adding "Note: this scenario includes offensive and blah blah" kind of statements
>>102573333
vaginal sex in the missionary position for the purposes of procreation



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.