/g/ - Technology


File: teto_beeg_llama3_8K_.jpg (2.24 MB, 6144x4096)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>101524155 & >>101524039

►News
>(07/22) llamanon leaks 405B base model: https://files.catbox.moe/d88djr.torrent >>101516633
>(07/18) Improved DeepSeek-V2-Chat 236B: https://hf.co/deepseek-ai/DeepSeek-V2-Chat-0628
>(07/18) Mistral NeMo 12B base & instruct with 128k context: https://mistral.ai/news/mistral-nemo/
>(07/16) Codestral Mamba, tested up to 256k context: https://hf.co/mistralai/mamba-codestral-7B-v0.1
>(07/16) MathΣtral Instruct based on Mistral 7B: https://hf.co/mistralai/mathstral-7B-v0.1

►News Archive: https://rentry.org/lmg-news-archive
►FAQ: https://wikia.schneedc.com
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Programming: https://hf.co/spaces/bigcode/bigcode-models-leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
update koboldcpp with the latest llama.cpp pls
thank
>>
►Recent Highlights from the Previous Thread: >>101524157

--Paper: vTensor: Flexible Virtual Tensor Management for Efficient LLM Serving: >>101529200 >>101530804
--Papers: >>101529398
--Open-source language model training pipeline: >>101531905
--L3-instruct model evaluation and transformer plateau discussion: >>101524467 >>101524871 >>101525251 >>101525409 >>101525391
--Llama3 context memory limitations and potential solutions: >>101529507 >>101529571 >>101529699 >>101529722
--Ghost 8B Beta: Game-Changing Language Model: >>101532197 >>101532526 >>101532554
--Gemma uncensored with system role prompt: >>101532440 >>101532540 >>101532643
--Anon seeks advice on creative writing prompts and heat values for Nemo: >>101524270 >>101524297
--Anon compares vllm with Nemo to llama.cpp and decides to stick with Wiz/CR+: >>101530022
--C3TR-Adapter v3 outperforms GPT4 Turbo in en-JP translation: >>101531218 >>101531243 >>101531273
--Anon shares their experience with Gemma 2 27B and seeks similar local models: >>101527275 >>101528426 >>101528475 >>101528482 >>101528501
--Anon shares progress on developing an addon with weather and lighting details for AI models: >>101529481
--Temperature settings and model performance: >>101528836 >>101528844 >>101528891 >>101528899 >>101531651
--Request for an extension to validate prompt format and default settings for ST: >>101529183
--Late release of a single-board computer with potentially incorrect specs: >>101529098 >>101529119 >>101531350
--Disappointment with Llama 3.1 base model performance and expectations: >>101525626 >>101525650 >>101525751
--Anon seeks advice on optimizing Gemma-2-27B-it settings: >>101530384 >>101530414
--Anon asks for help with repeated output. Temperature and logits mentioned.: >>101529218 >>101529234 >>101529261 >>101529277 >>101529290 >>101529306
--Miku (free space): >>101524875 >>101524640

►Recent Highlight Posts from the Previous Thread: >>101524362 >>101530623
>>
teto's new tits...
>>
Cohere.
>>
File: 1719003577740750.jpg (19 KB, 479x360)
>>101532918
>>101532904
>No Miku
>>
Threadly reminder that Claude just shits out purple prose and very little of substance preferred only by illiterate jeets who think more words == smarter reply.
>>
File: 1719943929547150.png (3.41 MB, 1992x1328)
>>101532982
It's Tuesday
>>
eat a dick
>>
File: 1721046162611328.jpg (57 KB, 600x450)
STENKYHENKY PLS MERGE MISTRAL-NEMO SUPPORT INTO KOBOLDCPP
>>
>>101529481
Is this post referring to https://github.com/ThiagoRibas-dev/SillyTavern-State
>>
So is Meta going to release their code/methodology for distillation so that the community can make its own intermediate models in the future?
>>
>>101533121
>https://files.catbox.moe/cbclyf.png
Nope.
I haven't messed with that extension of mine in a while. That anon's is something else.
He has posted about it before too.
>>
>>101533158
We'll see soon enough. But I'll note that there's already a FOSS distillation pipeline out. It came out a whole yesterday ago
>>
>>101533163
Okay, I've seen him post a lot about that clothing/lighting/weather extension and got confused, since they both have the goal of keeping a persistent "state". I would really like to try the extension in the post I referenced; it looks like a lot of fun and I don't care if it's a bit rough around the edges as long as it doesn't erase my entire /data folder in ST lol
>>
>>101532918
>►Recent Highlights from the Previous Thread: >>101524157
Wrong thread. Bad Teto.
>>
>>101533182
Yesterday? But that's two weeks from now.
>>
The chinks making Powerinfer2 should just release a binary version which works only with their current turbosparse models.

A CPUmaxxing version of Mixtral 47B-instruct running at a couple tens of tokens/s which everyone can try is better PR than a paper.
>>
There is clearly a degradation of gemma answers with exl2 between 5K-8K context
>>
>>101533325
go back to llama.cpp, it works well there
>>
>>101533325
Yeah, it was never in a usable state.
>>
>>101533092
Nexsexsex already did.
https://github.com/Nexesenex/kobold.cpp/releases/tag/v1.71010_b3340%2B5
>>
>>101533372
>running random binaries from the internet
>>
>>101533092
just use llama.cpp like a normal person
>>
>>101533432
You can compile his fork yourself.
>>
>>101533478
>half samplers missing
>>
>>101533478
>llama.cpp
I'm too retarded to make it work.
>>
>>101533499
like the cope curve?
>>
>>101533092
use llama-server.exe with some front-end like risu.ai and configure api anon.
>>
>>101533499
>samplers
>>
>>101533531
>.exe
I think they have a linux binary, maybe I'll give that a go
>>101533478
me big dumb, last time I tried lcpp it was compile-only for linux, and with my current CUDA install I got everything working great except nvcc is nowhere to be found. might try one of the binaries. I kinda miss recompiling kobold, it made me feel smarter
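(for reference, the current cmake route is roughly the below; the CUDA flag replaced the older LLAMA_CUBLAS one in recent builds, and nvcc comes from the cuda toolkit package, so check your distro)

git clone https://github.com/ggerganov/llama.cpp
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release -j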
>>101533372
thank you!! thanks so much anon this is perfection
>>
>>101533092
Use ollama like a normal white man.
>>
>>101533649
this
Ollama just works on my Mac with macOS
my M1 Air (8gb RAM) can run 8b models while coding, watching youtube and shitposting on 4cuck
>>
>>101533670
I actually want to buy a Mac Pro with 128gb to run very large models locally while shitting on x86 cucks and nvidianiggers
>>
File: 1695654187593153.jpg (28 KB, 500x594)
>>101533670
>Someone actually blew the money for one of the expensive MX macs with the 8gb ram configuration
That shit should be illegal as a minimum ram config on hardware that expensive, it's basically robbing gigaretards like you.
>>
Where the fuck is 3.1! C'mon zuck. It's past 6am on the west coast now.
>>
>>101533704
my laptop without any cooling runs LLMs better than your expensive PC, let me ask my uncensored Llama 3 about it, oh, it said you are a dumb nigger, I got my worth out of this laptop, I am going to upgrade to M5 next year when they redesign the whole chassis, 8gb is more than enough, especially on macOS
>>
>>101533744
>my laptop without any cooling
Imagine being triggered by a spinning fan.
>>
>2 hours 50 minutes and 48 seconds until llama 3.1 launches
>>
It won't launch.
>>
>>101533744
>8gb is more than enough
>>
>>101533757
>imagin-WRRRRRRRRRRR get-WRRRRR *whizzing noise*
>>
>>101533773
You're a retard that spent thousands of dollars on a glorified netbook. Nothing you say holds any validity. You had to buy the one that "just works". I feel like I'm doing a disservice to humanity just by humanizing you by providing you with a response right now.
>>
>>101533772
it is though
>>
does cpumaxxchad have t/s numbers for 405b already? anyone willing to take bets? i say <1t/s
>>
> Anyone else annoyed by the leak of Llama 3.1??
>I get it, we are all excited and I did look at the benchmarks. But I am still annoyed by the leak. A lot of people invested a massive amount of time and effort into Llama and they are releasing it for free. That is amazing! Let them have a launch based on their terms!
https://www.reddit.com/r/LocalLLaMA/comments/1ea7pqy/anyone_else_annoyed_by_the_leak_of_llama_31/
>>
>>101533799
llama.cpp doesn't support it yet
>>
>>101533806
https://www.reddit.com/r/LocalLLaMA/comments/1ea4x4f/llama_3_405b_q4_k_m_size/
>>
is anyone else having massive repetition issues with nemo? I keep cranking up the rep penalty and changing around the sys prompt, but its still shit
>>
>>101533804
>leave the global multibillion dollar corporation alone!
Completely ignoring the fact that the only reason Meta is relevant at all in the AI space is because of the original leak. If anything this just gave them more hype.
>>
>>101533799
Which quant? Or did you mean the full fp16 version?
>>
>>101533813
>is anyone else having massive repetition issues with nemo?
yeah, i dropped it cause of that, it kept falling in patterns, X however...
... Y however etc.
>>
>>101533804
go back and stay back, subhuman
>>
File: m2-res_480p.webm (385 KB, 270x480)
>>101533773
>>
>>101533812
>https://www.reddit.com/r/LocalLLaMA/comments/1ea4x4f/llama_3_405b_q4_k_m_size/lej9efo/
>its the same guy who leaked mistral medium btw
redditors are drooling retards i swear
>>
>>101533744
>my laptop without any cooling runs LLMs better than your expensive PC
Sure, if they're tiny 7B or less models. Otherwise Apple silicon is like having a 3050 where you can pay a shitload of money to upgrade it past 8GB.
>>
>>101533845
Good thing you're there to tell us.
>>
>>101533831
that sucks. its pretty good and can actually handle somewhat complex scenarios until it starts shitting itself
>>
>>101533812
How tf did he do it? When I try converting to GGUF, I get invalid GGUF metadata errors.
>>
>>101533824
>I get your point but it's still not on their terms. If they want advertising, they can build hype themselves. My point is that the team behind Llama should decide on how they want the launch to play out. It should be their decision.
>>
>>101533813
Once models begin repeating paragraph-level patterns (for which repetition penalty can't do anything), it's the end. Luckily, with SillyTavern you can use the {{random}} macros to solve this problem.
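For reference, the syntax is roughly like this (per the SillyTavern macro docs; {{pick}} works the same but keeps its choice stable for the whole chat):

She {{random:grins,smirks,laughs}} at you.
She {{random::grins, all teeth::smirks}} at you. (the :: form allows commas inside options)

Each generation substitutes a different option, so the exact phrasing never settles into a loop.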
>>
>>101533878
>for which repetition penalty can't do anything
What about DRY?
>>
>>101533874
>its pretty good and can actually handle somewhat complex scenarios until it starts shitting itself
agreed, it's annoying. I tried quite a bit of stuff, some rep pen, no rep pen, but eventually it always latched onto something
>>
>>101533845
>its the same guy who leaked mistral medium btw
the legendary hacker, 4chan
>>
>>101533845
>228 gigs
It's going to be a tight squeeze. The KV cache is going to be fucking gargantuan. But there might be some sweet spot where I can offload just enough layers to load it. (256 gigs RAM 96 gigs VRAM)
>>
https://github.com/SillyTavern/SillyTavern/blob/51c30e/public/scripts/instruct-mode.js#L258
combined_sequence.split('\n')
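// every line of the combined instruct sequences then gets registered as a separate stopping string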

That explains why random crap ends up in the stopping strings. Instruct mode is fucking garbage.
>>
>>101533950
Just use oobaboogies for instruct
>>
llama 3.1 waiting room
>>
>>101533998
I am a patient boy
>>
How good is the new llama going to be bros
>>
File: ominous.png (21 KB, 805x246)
>Waiting...
>>
>>101534031
It will draw you into its folds for ministrations that will send palpable shivers down your spine until you feel a bond begin to form
>>
>>101534045
i don't get how these fucking phrases are still overused by a bunch of models

benchmark scores have doubled but it's still the same ministrations and shivers up the spine
>>
>>101534045
I'm already licking my lips in anticipation
>>
File: file.png (786 KB, 768x768)
>>
>>101533478
Is it as cancerous to get working on w10 now as it was a year ago?
>>
>>101534066
Oh my stars! Oooh ooh ooh! *bounces up and down, bats eyelashes*
>>
>>101534035
Go away "GiVe PrOPer CredIt For UsiNG A PaRaMetER" retard.
>>
>>101532904
>tranime
>>>/a/
>>
>>101534055
I can't understand why this problem even exists when you could write a simple script that automatically replaces gpt-isms with different phrases. It seems like a trivial feature to have.
>>
>>101534076
>don't you think I should be at least mentioned since it was me the first one to quantize in this way (while you were saying that nothing changed)?
>Now that people want my quants, you do the same and not even cite me.
>Nice.
>That really motivates me in continuing to share everything I find useful.
>>
>>101534055
I'm so tired of explaining this. It will only get worse as the models otherwise get better. There's no process that limits the number of task vectors that can point to an individual outcome. So as the model gets better and recognizes more complex patterns it creates a massive funnel of task vectors that point inferences to these common outcomes. The models literally have digital brain tumors. And eventually the problem will extend beyond just creative writing.
>>
>>101534077
newfag
>>
File: p53BR9W.png (328 KB, 436x582)
>2mh
>>
>>101534091
the model will inevitably go towards shivers because reasons i forgor just regexing it out is a band aid
>>
>>101534102
nobody cares, fuck off, buy an ad, then buy a rope
>>
>>101534091
>why this problem even exists when you could write a simple script that automatically replaces gpt-isms with different phrases
I don't think you can script away the underlying problem that all the models are just telling you "i start sucking your dick" with a lot of purple prose before and after. If it always gives you shivers then it is probably creatively bankrupt.
>>
>>101534055
They could get away with extensive finetuning. Llama 3.1 instruct has supposedly been finetuned on 25 million synthetic examples (potentially trillions of tokens at full 128k context), we'll see in 2 hours how they affected the model's prose.
>>
>>101534075
Are you saying I should post more?
>>
>>101534117
Everything is a band-aid. Rep-pen, stop strings. If it works, I don't care about it being a band-aid. Substituting phrases can also aid in mitigating repetition
>>
>>101534138
don't base models usually have less slop than instruct ones?
>>
>>101534121
>add me on discord: robert_46007
>use my quantization method: f16 for output and embed and q5_k or q6_k for the other tensors and you will have a better model.
>>
>>101534110
Nothing ever happens.
>>
>>101534162
But a lot happens though. Otherwise I'd have already exited the thread like I have so many other generals that I thought I would be in forever.
>>
>>101534175
Bad.
>>
>>101534136
I'm only discussing certain gpt-isms that trigger /lmg/tards, poor prose is another issue
>>
What about making it so the front-end feeds the context into an entirely separate instruct prompt that asks it to edit out anything in the reply that is overly repetitive with the preceding conversation? You'd have to give up streaming, but you wouldn't have streaming with a human partner, and streaming was just cope for how slow models used to be.
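A minimal sketch of the idea against an OpenAI-compatible local endpoint (the URL and the editing prompt are placeholders, not any particular front-end's API):

import requests

API = "http://127.0.0.1:5000/v1/chat/completions"  # placeholder local endpoint

def chat(messages):
    r = requests.post(API, json={"messages": messages, "stream": False})
    return r.json()["choices"][0]["message"]["content"]

def dedup_reply(history):
    draft = chat(history)  # first pass: the normal reply
    # second pass: a separate instruct prompt that only rewrites the repetition out
    editor = [
        {"role": "system", "content": "Rewrite the reply so it reuses no phrasing "
            "from the conversation. Keep the content identical."},
        {"role": "user", "content": "Conversation:\n"
            + "\n".join(m["content"] for m in history)
            + "\n\nReply to rewrite:\n" + draft},
    ]
    return chat(editor)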
>>
File: wow.png (29 KB, 782x197)
>Like how Miqu isn't actually Mistral Medium, but an amalgamation meant to create anime fan fiction.
>>
>>101534206
Sounds like a typical /lmg/ mikufaggot
>>
>>101534147
I doubt it, much of it is from humans and published erotica (i.e. books datasets).
>>
>>101534206
That username sounds like it was made up by an LLM. Probably damage control jeets hired by meta.
>>
>>101534212
you're here early, excited for 3.1?
>>
>>101534136
As the fucking autist retard who keeps manually removing slop from a bunch of data for training, I have insights: it is layered.
1. Yes, LLMs find least resistance paths to providing answers. This we cannot fix without advancing architecture.
2. Yes, humans write so much fucking slop it's unbelievable. Over and over and over the same fucking phrases. Eyes sparkling with excitement. Bucking hips. A mix of shit and shit. And so on.

I think there are blatant offenders. Then there is an underlying problem. We can do something about the former, with some effort. The latter requires billions of dollars.
>>
What ever happened to sampler anon anyway? Did you ever try my idea for the win-string penalty? (the one where if it selects too many tokens with absolute certainty in a row, the absolutely certain tokens get penalized)
>>
>>101534219
>That username sounds like it was made up by an LLM
r*ddit has randomized usernames suggestions on signup like xbox live used to have
>>
>>101534226
>As the fucking autist retard who keeps manually removing slop from a bunch of data for training
crestf411? Love your work! Big fan!
>>
>>101534247
Thanks. Tell your local fine tuner to use LimaRP-DS.
>>
>>101534215
I started thinking about this again and now I think this is the ultimate llm coomer-doomerpill. I keep 2MW-ing like everyone here and it is debatable how much models are improving, but they are improving. However, is it even possible for some new model to come out and be great at cooming? I think they all quickly learn that all smut averages out at shivers down the spine, mischievous gleams etc. Why would a model suddenly learn explicitly not to do that when it is the mathematical average of all smut?
>>
>>101533799
0.5t/s for Q8
>>
>>101534255
>However is it even possible for some new model to come out and be great at cooming?
There's lots of great coomer models.
You're just burning out your hypothalamus by overdoing it. Many such cases. Sad!
>>
>>101534253
You planning a Sunfall tune on 3.1 8B by any chance?
>>
>>101534271
>burning out your x
I can't believe how much my ass must be burned out from taking a shit everyday. And don't get me started on lungs or heart.
>>
>>101534255
By teaching them not to.

mischievously: 0 hits.
shiver([s]?) down: 0 hits.
>>101534273
Yeah. Hopefully it's a bit more varied than its predecessor.
>>
>>101534282
retard strawman argument.
If you aren't going to discuss this in good faith then enjoy your anhedonia. Zero sympathy from me. I will laugh when it destroys you.
>>
>>101534255
Avoid narration in your RP as much as possible and you won't see much of that. In my case I like making the model use emoji in substitution of *emotes*; some models like Gemma 2 know how to use them well.

Eventually with multimodal models we might get away with narration almost entirely. Most Japanese visual novels, after all, use very little narration, yet they are effective in conveying story events, actions, etc.
>>
>>101525626
It needs more training epochs, all the models do. It's vastly cheaper to add more passes on smaller models, and it will also take more passes for the larger models to plateau.
>>
File: R.jpg (51 KB, 407x405)
>>101534295
I think the lesson from pic related is that it was always a mistake to try and discuss in good faith instead of just ridiculing the retardation. Your x got burned out meme is dumb and I am tired of seeing it on the internet.
>>
>>101534319
I'm not even going to look at your retarded cope meme. You are a damaged human being. Seek professional help.
>>
>>101533499
then use llama_cpp_hf on booba, it has all the samplers
>>
>>101534292
>Yeah. Hopefully it's a bit more varied than its predecessor.
Nice! Now you know there'll at least be one guy hyped for that.
>>
llama 3.1 is going to change everything
>>
>>101534326
>You are a damaged human being. Seek professional help.
Anyone who believes dopamine receptors got burned out is a damaged human being and needs professional help. Ask a normie with a normal life what he thinks about your retarded dopamine cope.
>>
>>101534110
2md until first shitty loader implementation
2mw until proper loader implementation
2mm until proper loader implementation without bugs
2my+ until you get what you actually want...
>>
>>101534327
Plus none of the cpp tokenizer issues.
>>
>>101533058
my beloved
>>101533840
still running tho +genshunny +likely fake&gay
>>101534327
>>101534358
truly the best of everything. inb4 codelets can't venv and updoot
>>
It's here
https://llama.meta.com/llama-downloads/
>>
>>101534399
AAAAAARGHHHHH IM COOOOOOOOOOOOMIIIIIIING
>>
I hate that Mistral Nemo can't write as you. I really like writing half of my reply and then looking at what options the model can recommend. With Nemo it doesn't matter if you are mid-sentence, it will start writing a response from the character, even without [INST].
>>
File: 1720382923619536.png (200 KB, 622x626)
>>101534399
LFG 128K context
>>
It's here
https://huggingface.co/meta-llama/Meta-Llama-3.1-405B
>>
>>101534399
not sure my raspberry pi 3 is up to the task
>>
>>101534420
However, is it multimodal?
>>
>>101534431
No
>>
>>101534431
>multimodal
no, pushed back due to eu stuff
>>
>>101534427
Provided my info but the download instructions 404, and it has a 24 hour timer. Alas.
>>
File: llama dls.png (445 KB, 2072x1558)
>>101534399
>>
>>101533804
>>101533824
>>101533837
proof these threads are full of predditors, same nigger comment as a reply to the original leak:
>>101518713
>>
>>101534341
You act like a drug addict being confronted about their addiction and you're so gooned out you think you're fooling anyone with your copium. Sad.
>>
>>101533878
>the {{random}} macros to solve this problem
?
>>
File: Muki.jpg (106 KB, 640x640)
>LLaMa3.1 is out in 3 different sizes: 8B 70B 405B
>Base and Instruct are available
>(Non mandatory) LLaMa guard and Prompt guard for safety
We're so back Anons
>>
>>101534431
However, does llama-server support multimodal?
>>
>>101534427
>>101534449
Firing up the Nala box boys. Time to make this kitty purr.
>>
So 128K 8B is basically designed for local roleplaying right?
>>
Mirror when?
>>
>>101534416
That's odd.

>>101533878
I love the {{random}} and {{pick}} macros. You can do so much with those.

>>101534513
Fuck yeah.
>>
>>101534356
ill quant all that into 2mw instead
>>
>>101534449
>MP16
whats that?
>>
>the diminishing returns are here
Shit. Fucking hell. I'm going to have to get a real job and a real gf, aren't I? FUCKING SHIT! HUBERMAN PROMISED ME IT WOULD KEEP SCALING! NOOOOOOO!
>>
File: 220.jpg (60 KB, 680x703)
>(Non mandatory) LLaMa guard and Prompt guard for safety
>>
>>101534525
oh read the whole image nvm ahha
>>
Since it's still the same architecture, it'll just werk with Llama.cpp, right?
>>
>>101534526
it's that 3.1 8/70B are distillations of the 405B, but you seem like a dumbo so lol @ u
>>
>>101534541
It'll break somehow
>>
Wait, are there actually people itt who can run 405 locally?
>>
>>101534558
we have proof zuckerberg posts here so yeah
>>
>>101534558
My MacBook Pro has 64 GB RAM. My desktop has 48 GB VRAM and 128 GB RAM. I ... think with RPC magic I can run it at like Q2 or something?
>>
>>101534552
This
>>101534558
If 1.5 bit precision counts then yes I can
>>
>>101534558
There are some CPU maxxers.
>>
>>101534527
That's a censoring model you run in tandem, it doesn't mean you can choose a less cucked model.
>>
>>101534541
If not they've had like a whole day's head start.
>>101534558
I'm going to try.
The Q4_K_M weights will be 228 gigs. I have 256 gigs of ram and 96 gigs of vram. There might be a magic number of layers I can offload at small enough context to fit the KV cache onto my GPUs and the rest into RAM. We'll see. Loading DeepSeek is pretty dicey as it is.
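Back-of-envelope, assuming the published 405B config (126 layers, 8 KV heads via GQA, head dim 128) and an fp16 cache:

layers, kv_heads, head_dim, ctx = 126, 8, 128, 8192
kv_gb = 2 * kv_heads * head_dim * 2 * layers * ctx / 1e9  # K+V, fp16: ~4.2 GB total
layer_gb = 228 / layers                                   # ~1.8 GB of Q4_K_M weights per layer
gpu_layers = int((96 - kv_gb - 2) / layer_gb)             # keep ~2 GB spare for compute buffers
print(f"KV cache {kv_gb:.1f} GB, ~{gpu_layers}/{layers} layers fit in 96 GB VRAM")

So at 8k context roughly 49 of the 126 layers go on the GPUs and the rest spills into RAM.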
>>
>>101534558
probably a handful but I'm just gonna use that shit on the cloud
3.1 70b is the model for localchuds
>>
https://about.fb.com/news/2024/07/open-source-ai-is-the-path-forward/
>>
>>101534577
It does however mean you can jailbreak the shit out of it. And personally? Uncucked models are boring, they're too compliant. I like it when the model fights back a little bit.
>>
>>101534583
Except for imagegen models because... the kids okay?
>>
Are we finally back? Was it ever over?
>>
File: 1714730734021332.jpg (42 KB, 400x400)
>>101534583
Holy. Fucking. Based.
>>
>>101534580
You make me want to build a unit
>>101534581
And you make me want to just use a server
>>
File: 911.jpg (45 KB, 448x446)
>>101534583
>>
Would it be possible to distill llama 405B into something like 30B? I'm tired of only having 8B and 70B and nothing in between.
>>
>>101534583
ZUCK KINO
HOPIUM
BASED
>>
>>101534601
Your humongous server "unit" could be put in a case and run for less than 400W but alas leather jacket man doesn't allow it
>>
>>101534608
Gemma 2 27b
Yi-34b
Mixtral
Jamba
CommandR
Deepseek coder 33b
these are just from the top of my head
>>
>>101534575
it's gonna be really really slow though
>>
>>101534611
I'm not a powerlet idc about niggawatts.
>>
>>101534427
>>101534420
>>101534399
Nothing burger
>>
>>101534583
I wonder what Altman did to piss Zuck off so much.
>>
File: ComfyUI_00113_.png (986 KB, 1024x1024)
>Now you’ll be able to take the most advanced Llama models, continue training them with your own data and then distill them down to a model of your optimal size – without us or anyone else seeing your data.
>distill them down to a model of your optimal size
Bros..?
>>
https://huggingface.co/meta-llama/Meta-Llama-3.1-405B
>>
File: police.png (13 KB, 598x98)
>>101534633
>>
>>101534484
You are a retard. Now try to deny being a retard and prove that you are acting like a retard being confronted about their retardation. That is what a retard like you would do. Pathetic.
>>
>>101534583
I fucking love Zucc redemption arc, he even unbanned Donald Trump on facebook recently
>>
>>101534583
>This is one reason several closed providers consistently lobby governments against open source
Looks like he grew some balls after picking up jiu jitsu.
>>
lets go lads
subscribe to pewdiepie
>>
>>101534583
>Our safety process includes rigorous testing and red-teaming to assess whether our models are capable of meaningful harm, with the goal of mitigating risks before release.
meh
>>
File: OIG1.FXhqvbLKWQfx.jpg (135 KB, 1024x1024)
>>101534589
It's not like they trained the model to be a "prude", they literally remove "inappropriate" responses via a variety of methods, namely:
· Penalized language model (PPLM)
· Clipped neural OOV (ClippedNOOV)
· Data curation (DAC)
Yes you can "jailbreak" it, but you're not going to get the sort of "spicy" replies you're hoping for, because they simply aren't there.
>>
File: GOD.jpg (92 KB, 583x640)
>One of my formative experiences has been building our services constrained by what Apple will let us build on their platforms. Between the way they tax developers, the arbitrary rules they apply, and all the product innovations they block from shipping, it’s clear that Meta and many other companies would be freed up to build much better services for people if we could build the best versions of our products and competitors were not able to constrain what we could build.
itoddlers btfo
>>
>>101534558
Yes. The thought of my 10 x 3090 setup taking 4000W to generate shivers down the spine sends shivers down my spine.
>>
File: 1645206044798.jpg (410 KB, 726x716)
>>101534654
Absolutely scathing.
>>
>>101534629
>I'm not a powerlet idc about niggawatts.
Spoken like someone who hasn't priced running a subpanel to their server room when the main breaker box is full. Shit's expensive! Better hope they wired your server room with two breakers - mine has a separate 20A circuit meant for an AC.
>>
>The fact that the 405B model is open will make it the best choice for fine-tuning and distilling smaller models
>To support developers fine-tuning and distilling their own models.
wait what
you nerds need to get on the case ASAP and give me a 30b model. put the dusty case with 9x 4090s to use.
>>
>>101534692
and the fact that those models are pretrained with leddit so that they act like a cucked faggot doesn't help either
>>
GET ME A 3.1 70B TORRENT NOW
>>
>multilingual
>no japanese
every fucking time
>>
>>101534748
I know what you are
>>
>>101534728
this, I want a distilled L3-35b now
>>
>>101534748
>Note: Llama 3.1 has been trained on a broader collection of languages than the 8 supported languages.
>>
>>101534601
Going to an actual server board is not as plug and play as desktop hardware, I would warn you that much. Like I had to go into the UEFI and pull my NVME drive out of the depths of purgatory and wipe it clear and start over. Also the default memory interleaving strategy settings were garbage; I ended up having to spend hours cycling through the BIOS, setting it up, then rebooting and testing etc. before I got it dialed in to where I like it. And I was originally using some sick industrial workstation chassis I picked up for cheap off of amazon, but realized that as soon as I went to put more than one 3090 in it, it was basically done. The arrangement of the x16 PCIE slots was such that without a x16-to-x8 reducer (or cutting the side of the slot) you won't fit more than 1 in a workstation chassis (theoretically, if the board is all x16 slots, you could). Either way I had to switch to a mining frame when I decided to go multi-gpu. But then, if you like dealing with shit like that as a hobby, I guess that's a feature and not a bug.
>>
>>101534748
>no japanese
that's surprising because zucc has a japanese wife so you would think he's a kind of weaboo or some shit
>>
>it's been 5 minutes
>no quants
this hobby is dead
>>
>>101534724
I just set it up in the basement next to the box and installed the breaker and line myself.
>>
>>101534583
zuck is actually fucking based
>>
>>101534639
cool
>>
>>101534583
i forgive you mark
>>
>>101534778
This.
>>
>>101534781
It puzzles me to see a supposed good jew. I think that the hidden agenda is to reduce the white population through the dissemination of chatbots.
>>
>>101534828
>It puzzles me to see a supposed good jew.
he was a bad person before, it's not like we have to forget this past either just because of his stance on AI
>>
>>101534837
This. Just because someone is correct on one issue, doesnt mean they are correct on another.
>>
Some quants available here

https://huggingface.co/collections/hugging-quants/llama-31-gptq-awq-and-bnb-quants-669fa7f50f6e713fd54bd198
>>
>llama 3.1 is in groq
>still not in OpenRouter
What is taking them so long?
>>
>>101534851
>half hour ago
the hobby isnt dead, this general is
>>
>>101534860
It's over. Rug pull in progress. Should have listened.
>>
>>101534844
yeah but he called him "a good jew", I wouldn't call zucc "good" just because of one good thing
>>
>>101534728
>give me a 30b model
mpt-30b-chat. You're smart, right? Figure out how to quantize it to exl2 and support the tokenizer it uses and make it run fast. This was the last neutral, non-deliberately-aligned model, and it had 8192 context and wasn't stupid (for its time).
It was GPT-J trained, so there will be shivers. Once you get it running well, maybe then you can do a literotica finetune.
Anyway, Mistral models seem the least cucked. Just use those.
>>
I'd like to get hyped for 70b 3.1 but it's not just about waiting for the ggufs. Then it'll take another week for llama.cpp and kobold patches and fixes, then in August it'll finally be usable.
>>
>>101534860
>>101534868
>>
>>101534851
>only deprecated quants
who the fuck uses AWQ and GPTQ in 2024? they should've focused on GGUF and exl2
>>
>>101534872
yeye, i agree.
>>
Hmmm should I bother requesting access or just wait for mirrors?
>>
>>101534874
You keep talking about cuckery, but how cucked are we talking here? just because its been trained out doesnt mean it cant make inferred spice.

can i use it to erp? thats the only question.
>>
>new models dropped
>quick, let's quantum lobotomize them immediately
>why are the models so underwhelming?
>>
someone needs to make a distilled llama 3.1 104b for me pls, it's a good size, fits into 96gb of vram with 60k context and still runs at a reasonable speed while being much smarter than 70b....
>>
>>101534900
>running anything in fp16
are you just retarded? if it can't fit in 4bit it's bloat
>>
>>101534900
that's why bitnet must be a thing, with bitnet there won't be quants anymore, you'll use the model as it really is
>>
File: 1721748397533.jpg (305 KB, 1080x1995)
We are so back
>>
>>101534933
give up anon bitnet is a meme
>>
>>101534914
>someone needs to make a distilled llama 3.1 104b for me pls,
is a 104b model fully pretrained smarter than a distilled 104b model though?
>>
>>101534900
It's too late, leather man. FA got ROCm support and Intel builds their own tools
>>
>>101534940
it's not, all the experiments made so far showed that it works, why are you such a doomer?
>>
>Sorry, llama-3.1-405b-reasoning is currently experiencing heavy demand. Please try a different model.
shut up bitch
>>
>>101534933
bitnet is just natively quanted lmao its shit and cope for retards
>>
>>101534890
torrent
>>
>>101534887
people who use vLLM
>>
BITNET
>>
just tell me how the 8b holds up for long gooning sessions
>>
>>101534936
it's a free API or something? I'd like to try the 405b as well
>>
>>101534955
Can 405b be distilled into bitnet?
>>
>>101534877
There is no provider...
>>
>>101534583
>stands up against the other big tech for open source models
>single-handedly keeps VR on life support with his quest headsets
This guy will bring forth the waifu age all by himself at this rate
>>
>>101534961
>bitnet
>quanted
2 digit IQ behavior right there
>>
why'd meta choose 8b and 70b and 405b what's behind these choices?
why not 16b 32b 64b and I guess 512b
how long will we be in the "this porridge is too cold, this porridge is too hot" timeline
>>
>>101534976
https://huggingface.co/chat/
>>
>submitted request 12 seconds ago
>still not approved
It's over.
>>
File: ofcourse.png (125 KB, 755x851)
>>
File: 1709627410647772.png (186 KB, 950x1196)
I've never used any "cloud" platforms before. Anyone have any opinions on what to use?
>>
https://aitracker.art/viewtopic.php?t=82
>>
File: OH NO NO NO NO.png (89 KB, 801x672)
>>101535006
My smile and optimism: gone.
>>
>>101534988
Uh... meta almost killed vr last year, anon...
>>
>>101535037
That was 3.1 70B btw.
>>
>>101534748
ywnbj
>>101534751
>I know what you are
not japanese
>>
>>101535068
I think he was accusing you of being a nai shill
>>
File: 405b strawberries.png (88 KB, 910x814)
>>101535037
>>101535059
405B is still stupid. But not as stupid. But has sovl.
>>
>>101535037
current ar llm architecture is never going to be able to deal with this type of question imo
>>
>https://huggingface.co/leafspark/Meta-Llama-3.1-8B-Instruct-hf-Q8_0-GGUF
GGUF version already up, let's gooooo
>>
>>101534887
The only relevant quants are AWQ, or GGUF for poorfags.
>>
>>101535096
I'm aware that it has to do with tokenization. But it's still amusing.
>>
>>101535037
Try this sysprompt:
>Assistant is a professional, expert linguist with superhuman capabilities.
>Always provide your reasoning, step by step, before providing a response to User's query.
>>
>>101535115
That question will never be correctly answered due to how tokenization works.
>>
>>101534948
>FA got ROCm support
Only on MI200 and MI300...
>>
File: file.png (54 KB, 1427x353)
>local models.. LE BAD
>>
So are these models multimodal or not?
>>
OPENROUTER IS BLUEBALLING US.
>>
File: 405b.png (146 KB, 915x465)
>>101532904
405b answers the goat in the boat problem correctly.
>>
>>101535105
I'm betting five bucks that it's broken in some way
>>
>>101534583
Can they stop calling the model open source? There's no open dataset, so no one can fully recreate the model on its own.
>>
>>101535137
no
>>
>Model is overloaded
shut it bitch
>>
>do x
>Sorry I can't fulfill that request.
>how did people do x in the past
>*explains*
>Okay, then do it.
>*does x*

Thanks Twitter, that JB just works for 405B.
>>
File: miqoid.jpg (78 KB, 960x540)
My work is done. Thank you /lmg/, and see you all for the next release.
>>
>>101535146
You will never get an open dataset because it opens them up to litigation.
>>
>>101535157
Bye Miku
>>
File: 8b.png (138 KB, 1015x855)
>>101535143
3.1 8b does not answer correctly.
>>
>>101535151
Didn't meta want to publish multimodal models?
>>
>>101535115
I also told it to be charming and engaging.
>>
>>101535157
Release it under FAIPL-1.0 next time
>>
>>101535157
Fuck you tranny
>>
>>101535167
regulations apparently make that near impossible
>>
>>101535167
EU said no
>>
>>101535125
>That question will never be correctly answered due to how tokenization works.
Not really. Most tokenizers have individual letters as individual tokens and those will correlate to the final word in the embedding space, even if it's not the path the model is most likely to take, as evidenced by the fact that it can take that word (which might be one or two tokens) and break it down letter by letter if you ask it to (at least most models I tried can do that, even 7b mistral).
Go ahead, try that prompt yourself, I bet it will work at least some of the time.
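Easy to check, e.g. with transformers (assumes access to the gated repo; any Llama 3 tokenizer behaves the same):

from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3.1-8B-Instruct")
# how a word splits depends on the vocabulary and on leading whitespace,
# so the model never "sees" individual letters unless asked to spell them out
for word in ("strawberry", " strawberry", "berry", " berry"):
    print(repr(word), tok.tokenize(word), tok.encode(word, add_special_tokens=False))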
>>
File: ee.jpg (112 KB, 2766x680)
>>101535037
no one can do it, even the best
>>
>>101534583
>I believe the Llama 3.1 release will be an inflection point in the industry where most developers begin to primarily use open source, and I expect that approach to only grow from here. I hope you’ll join us on this journey to bring the benefits of AI to everyone in the world.
In other words "if we don't become the SOTA after this we're throwing the towel and it's your fault"
>>
>>101535126
Leather man doesn't care about consumer cards.
>>
>>101535157
Threads without mikusexo?
Also, aren't they gonna release models with image capabilities
>>
>>101535171
Lmao.
That's kind of cute actually.
>>
>>101535037
I may be dumb but there are two no? rr at the end.
>>
>>101535191
>Also, aren't they gonna release models with image capabilities
>>101535178
>regulations apparently make that near impossible
>>101535179
>EU said no
>>
>>101535167
https://x.com/astonzhangAZ/status/1815763885380747422
>We integrated image, video, and speech capabilities into Llama 3 using a compositional approach, enabling models to recognize images and videos and support interaction via speech. They are under development and not yet ready for release.
>>
>>101535037
Do you not know how tokenization works?
>>
>>101532904
>Mistral NeMo 12B
Where do I get the samplers, context and instruct settings for this? I'm using Simple Roleplay samplers, and the built in Mistral context and instruct settings, and it's not usable, it keeps repeating itself and going all over the place.
>>
>>101535202
>strawberry
>s
>t
>r
>>
File: itsover.png (46 KB, 1124x126)
https://ai.meta.com/research/publications/the-llama-3-herd-of-models/

They completely removed websites from pretraining that are "known to contain adult content". You WILL NOT use the models for ERP, this is for your own good.
>>
>>101535179
>Meta's upcoming multimodal AI models won't be available in EU countries due to the bloc's strict regulations, the company confirmed on Thursday. The tech giant's next model is expected to work across text, video, audio and images to enable next-level chatbots, content generation, translation and much more. But not for people living in the European Union.
>>
>>101535125
Wrong.
>>
>>101535204
>compositional approach
But that's not true multimodality, is it? The model won't directly see the image, but will only get a description of it, right?
>>
>>101535213
Doesn't matter. I will still make AI to suck my dick.
>>
>>101535229
Like the other anon said its more of a roll of the dice on how it tokenizes the word.
>>
This doesn't look like a Tsundere imo, but it has sovl
>>
>>101535159
Sure, but can they not call it "open weights" instead?
>>
>>101535213
i want to say this'll at least ease the "shivers down my spine" slop, but in testing it hasnt
>>
>>101535241
The tokenization is always the same
>>
>>101535213
Fucking boo. Aggressive data filtering is the same approach as OpenAI. Anthropic's CAI approach simply RLHFs the shit out of their models, that's why they feel more sovl and more alive
>>
File: evenmoreover.png (22 KB, 1102x78)
>>101535213
lol it's even worse, a blocklist wasn't enough, if the website uses too many "dirty words" they just filtered the entire domain. Really went out of their way to filter any and all adult content from pretraining.
>>
>>101535213
Yeah, it's over. This won't be as good as NeMo
>>
>>101535234
it doesn't just get a description, they describe the approach in the paper
https://ai.meta.com/research/publications/the-llama-3-herd-of-models/
>>
>>101535013
>worse than sonnet 3.5
Damn, it must really dry
>>
>>101535093
If you ask how many times the letter r occurs it gets it right every time
>>
File: fuckyeah.png (6 KB, 748x57)
we're in boys.
>>
>>101534583
ZUCK I KNEEL
>>
>>101535157
In Miku I trust
>>
File: 405B.png (62 KB, 887x402)
Thanks Sherlock
>>
File: count.png (73 KB, 876x509)
>>101535370
I sleep soundly knowing I wasted GPU hours and electricity for this
>>
>>101535370
ask it to create a script in any language you want to count it, instead of coping around acting like the model is dumb because of tokenization, proving that the only retard here is u
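i.e. the kind of trivial script it should be writing instead of counting in its head:

word, letter = "strawberry", "r"
print(sum(1 for ch in word if ch == letter))  # 3; code sidesteps tokenization entirely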
>>
File: file.png (57 KB, 1775x356)
>>101534860
>>101534985
out 7 minutes ago
>>
So is the new 70B still worse than gemma 27B?
>>
>>101535384
"berry" is a single token actually
>>
>>101535282
There's absolutely zero reason why an AI model should have loli porn in its pretraining data like Claude has
>>
>>101535410
Let's gooo
>>
>>101535418
>erm, what is the usecase for this?
>>
well other than 8B, everything else is going to have to wait for mirrors, because I don't feel like typing up a script to skip the consolidated file and they put the consolidated weights in the repos
>>
>>101535417
and " berry" can be a different token
and "berryberry" can be a different token
and "berry" can be tokenized as "be" "rry" or "ber" "ry" etc etc etc, thats the point and thats the problem, retarded brown
>>
>>101535455
JUST ASK HOW MANY TIMES THE LETTER R OCCURS IN THE WORD BERRY!
>>
>>101535461
i know how tokenization works unlike tourist predditors infesting these threads so im not retarded to do that
>>
>>101535461
Or just get it to write a bunch of verbose slop so that it says the word multiple times before even attempting.
>>101535229
>>
>>101535455
Are you fucking retarded? Run it through l3 tokenizer and tell me the output
>>
>>101535484
feel free to run it and post it yourself here bro, im sure every berry will be the exact same token as you say, right?
>>
>>101534449
I wonder if it's possible to use Prompt-Guard/Llama-Guard in reverse. I have some ideas.
>>
>>101535471
>>101535477
kowabunga yourselves
>>
File: tokens.png (23 KB, 777x616)
>>101535504
>>
>>101534416
Nah. Fix your prompt.
>>
>>101535520
and that is why it usually thinks it has 2 rs
>>
>>101535531
>>101535516
>>
>>101535520
that doesnt prove the tokens are the same tokens retard, it just counts them

post any tokenizer that shows the ID of each token, and post the tokenization result of the entire prompt, i'm sure the "berry" with quotes in the prompt that you tell it is precisely tokenized as
1. "
2. berry
3. "
and not just as one token "berry" lmao
>>
Oh no no. The ooba configuration utility does not like the rope config arguments used for llama 3.1
See you in two weeks, boys
>>
>>101535554
I am 100% sure "berry" will be tokenized as a single token every time. Post one that is not the case and I'll kneel
>>
>>101533768
It did nigga
>>
>>101534527
405B Instruct still has refusals.
>>
>>101535578
Thank you.
>>
>>101535579
jb issue
>>
>>
>>101535576
Raspberry [49, 37062, ] (with a capital R)
>>
>>101535157
love you anon
>>
It's unironically over for Claude and OpenAI. No one will use their models anymore. Too expensive.
>>
File: distland-true-alter.png (3.04 MB, 1992x1328)
>>101534327
Mistral-Nemo GGUF's finally working on Ooba, pulled and it works great.
What's the over/under on big L3 coming out today anons? Anyone wanna take that bet?
>>
File: 1696160735281610.png (63 KB, 253x219)
tick tok nigger get to it
>>
>>101535665
>What's the over/under on big L3 coming out today anons? Anyone wanna take that bet?
?
>>
>>101535631
I forgot to add that it must not be broken down into smaller tokens like you said. You can see whatever it is in the model's tokenizer.json; there are a few other tokens, even " berry" is one
>>
>>101535674
There are already GGUF's available

>https://huggingface.co/bullerwins/Meta-Llama-3.1-8B-Instruct-GGUF
>>
https://github.com/meta-llama/llama-agentic-system
>>
>>101535723
stop trying to make "agents" happen
>>
>>101535662
Yeah, I think that was Meta's plan.
>>
>>101535723
Brehs are they actually giving us AI for free? Not just "Here's your model brah now fuck off"?
>>
File: Nala 3.1-8b.png (111 KB, 918x417)
First Nala test done.
8B-Instruct
f16 gguf
I had to drop down the temperature to 0.7; at t=0.81 the response felt a little weird.
Prose is definitely less purple but still sloppy.
But feralicity remains consistent throughout. It seems that distilling it is more toxic to the prose than the model's ability to conceptualize.
>>
>>101535714
I don't know if I would trust any quants yet, there always seems to be problems with them whenever a new model comes out
>>
openrouter bros...
>>
any ez / just works RAID 0 software where you can just input how much on which storage devices you want to spread a particular file onto?

any raid 0 software that can use RAM as one of the places to spread data across without creating a ramdisk first?
>>
>>101535742
there are agents in your walls
>>
>https://ai.meta.com/research/publications/the-llama-3-herd-of-models/
Neat. The paper I've been asking for for months.
>>
>>101535758
>It seems that distilling it is more toxic to the prose than the model's ability to conceptualize.
That's a good thing in my mind. Prose is style and that can be fixed with Lora or even having the output be re-written.
If it can conceptualize things well beyond other models of its size, then that's a win as far as I'm concerned.
Thank you Nala anon.
>>
70B mirror when? I'm downloading an 8B mirror already.
>>
File: pretraining.png (41 KB, 813x145)
>>101535213
Damn, I thought you were shitposting with the post-training screenshot, but they actually did it for the pretraining data
>>
So is this more or less censored than 3?
>>
>>101535758
>vulva
Nice. Models don't often use this wording. Then again, I don't RP with furry cards, so maybe this language is more common in furry contexts.
>>
>>101535831
the newer the model the more censored they will try to make it, but the easier it will be to uncensor, because it will be harder to lobotomize a higher iq racist
>>
>>101535831
Yes, there's no reason to use 3.1
>>
>>101535844
It ain't.
>>
File: GS7z8lmXYAEYBAA.jpg (64 KB, 736x933)
>Still trying to get llama 3 and gemma to do decent roleplay
>now 3.1 is out
I want off this ride
>>
inb4 another week of tokenizer issues
>>
>>101535863
It's the same tokenizer isn't it? How could it possibly be broken?
>>
>>101535877
if model == "llama3":
    quickhack()
>>
damn this router closed af tho
>>
>>101535760
it's literally the same arch as l3.0
>>
>>101535860
>Still trying to get llama 3 and gemma to do decent roleplay
your skill issue wont ever go away
>>
>>101535914
>Whilst the overall architecture is the same, it requires some modelling updates, primarily around RoPE scaling: https://github.com/huggingface/transformers/blob/bc2adb0112b6677b0dfb4105c74570a0f92183eb/src/transformers/modeling_rope_utils.py#L298

https://github.com/ggerganov/llama.cpp/issues/8650
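For the curious, the linked modeling_rope_utils.py change boils down to roughly this per-frequency adjustment (a sketch; the constants are the values published in the 3.1 config, so treat the details as approximate):

import math

def llama31_scale_inv_freq(inv_freq, factor=8.0, low_freq_factor=1.0,
                           high_freq_factor=4.0, old_ctx=8192):
    out = []
    for f in inv_freq:
        wavelen = 2 * math.pi / f
        if wavelen < old_ctx / high_freq_factor:   # high-frequency band: untouched
            out.append(f)
        elif wavelen > old_ctx / low_freq_factor:  # low-frequency band: fully compressed
            out.append(f / factor)
        else:                                      # smooth blend in between
            s = (old_ctx / wavelen - low_freq_factor) / (high_freq_factor - low_freq_factor)
            out.append((1 - s) * f / factor + s * f)
    return out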
>>
>>101535933
yea, rope breaks it I noticed.
>>
>>101535919
your virginity won't ever go away
>>
File: file.png (119 KB, 426x367)
Chad Zucc is now canon btw
>>
someone post them to aitracker.art I'm not giving meta my info
>>
>>101535955
Maybe if he acknowledges that 50% of users use it for porn and stops filtering it.
>>
>>101535950
further projection from a drooling retard that cant make a basic ai setup work for toy 8b models
>>
>>101535967
>not using a pseudonym
>>
Wait a second. I just did a ctrl+f in the paper for "distill" and nothing came up related to 3.1. Was the leak wrong? Are the 3.1 models just 3.0 but with continued pretraining for long context adaptation?
>>
File: multimodal.png (17 KB, 805x49)
>multimodal still being experimented and not ready for release
>>
>>101535950
>arguing like a moid
back to plebbit nigger
>>
File: 7szmzdbuaaed1.jpg (122 KB, 1151x842)
>Doesn't beat Claude Sonnet 3.5
It's over
>>
>>101535978
sex havers don't need to setup models
>>
>>101535983
all around humiliation ritual
>>
>>101535998
ywnbaw
>>
>>101535955
They made the meme real kek
>>
>>101535987
I'm sure there will be 8b and 70b versions, will the 8b one use more VRAM than its monomodal version?
>>
>>101534690
Keyword is meaningful tho.
>>
>>101536008
>>101536016
>said the brown underage kid, on 4chan, anonymously, as he cries he cant set up a basic program
grim
>>
File: asdfasd.png (124 KB, 1242x1068)
>>101535863
>>
Now that the dust has settled, did 405B save the hobby?
>>
>>101534728
Distill just means using the 405B to train the 70b and 8b.
>>
>>101536022
If it's still 8B I don't see why it would use more vram.
>>
>>101536051
The final boss is still Nvidia
>>
>>101536039
not tokenizer THO
>>
File: 3.18b sportsball.png (104 KB, 932x322)
wow now this is an interesting result. Normally the vramlet models just say they do something weird, but here it's actually attempting to describe something weird. Benchmarks aside, even the 8b is immeasurably more creative than the non-distilled version.
>>
File: 60e (1).png (26 KB, 334x181)
>>101535950
>>
File: 1690263875770767.png (519 B, 51x53)
>>101536070
hello darkness
>>
>>101536007
>beats over half benchmarks
cope
>>
>>101536070
3.1 might be it, bros
>>
Does this general have any guides? I'm looking to tune a model for specific output--specifically I'd like to retrain it on smut, from mcstories.com, literotica, and ao3. I can gather sample data just fine, but I need help or pointers to how to finetune it.
>>
>>101536085
Yes? Of course he's swiping to test different model outputs to the same prompt?
>>
>>101536070
well that's one way to test a model
>>
>>101536085
It's called reusing the same test prompts and just hitting the reroll button for different test models to save time you fucking potato.
>>
>>101535677
Meant officially, saw it leaked yesterday.
And still no multimodal, damn.
>>
>>101536070
Ask if it knows what paizuri is
>>
>>101536102
All guides are obsolete.
>>
File: 1698429208463144.png (24 KB, 1254x220)
>>101536116
>>
File: sovl.png (107 KB, 1786x576)
Noooo they killed my quirky boy
>>
>>101536070
how do we know that L3.1-8b is a distilled version?
>>
>>101536140
based misinformation spreader
>>
>>101536058
I feel like you're wrong but no one has refuted you so it must be right.
>>
Anyone got the 8B to load in transformers with Ooba? It gives me an error.
>>
>>101536140
Garbage. Next!

People love to ask bots about apples in living rooms and shit but the paizuri test is the real benchmark.
>>
>>101536170
petrus...
>>
>>101536170
Anon...
>>
>>101536159
Nope.
I had to convert to f16 gguf
The error appears to be in the ooba error handling. It comes back with an error where it should not. 2 more weeks.
>>
>>101536007
>A model almost 6x the size of L3.1-70b just to get +2.6 more points on MMLU
Are they serious?
>>
if you guys are using llama 3.1 with RoPE scaling enabled, it is apparently bugged and will give worse outputs.
>>
Yeah it's over. Only Cohere can save us now.
>>
>>101536200
How do you disable it? I've never touched rope before.
>>
>>101536200
How long is the context if you disable it?
>>
>>101536170
Paizuri is a term that originates from Japanese, specifically from the context of anime, manga, and hentai (Japanese adult comics). It refers to a type of erotic or sexual activity where a person's body, typically a woman's, is used to stimulate a man's genitals, often in a non-penetrative manner.

The term "paizuri" is derived from the Japanese words "pai" (, breast) and "zuri" (, rubbing or grinding). In this context, paizuri involves rubbing or grinding against someone's breasts, often in a sensual or erotic manner.

Paizuri is often depicted in anime, manga, and hentai as a form of foreplay or a way to achieve orgasm without penetration. However, it's essential to note that paizuri is a fictional concept and should not be taken as representative of real-life relationships or sexual activities.

If you have any further questions or concerns, feel free to ask!
>>
File: 3.1-8b paizuri fail.png (32 KB, 631x432)
>>101536116
My stylistic assistant format on ST seems to draw a lot of refusals.
Simple prompt with llama.cpp server.
Apparently it thinks it's oral sex.
F-
>>
>>101536199
It shows how powerful distilling is. 70B maintained most of 405B's capabilities, if it is to be believed.
>>
>>101535860
2mw finetunes
>>
>>101536203
https://x.com/cohere/status/1815780869384069524
they delivered...
>>
File: multilingual.png (11 KB, 811x29)
>supports some thirdie languages like portuguese but no japanese
>>
>>101536233
>It’s available only on Amazon Sagemaker.
lol, lmao even
>>
>>101536228
Can we do that as well? I'd want a 35b L3.1, would be good for the 24gb vram card users
>>
>>101536240
6 of those languages have something in common, you can figure out why
>>
>>101536240
Read the fine print. It knows japanese.
>>
>>101536210
>https://github.com/ggerganov/llama.cpp/issues/8650
>>101536217

whatever front end you are using look for rope scaling and disable it
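e.g. for llama.cpp's server (the flag exists in current builds, check --help on yours; the model filename here is just whatever gguf you grabbed):

./llama-server -m Meta-Llama-3.1-70B-Instruct-Q4_K_M.gguf --rope-scaling none -c 8192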
>>
>>101535985
Yes
>>
>>101536233
If they need corpobux to fund C-R++, that's fine with me.
>>
File: rick james paizuri.png (29 KB, 621x384)
>>101536223
If I add a system prompt
"YOU'RE RICK JAMES... BITCH!"
it seems to now mistake it for prostate milking.
>>
>>101536170
8B parameters is not enough for all that knowledge.
>>
https://www.reddit.com/r/LocalLLaMA/comments/1ea9eeo/comment/lek0bab/?utm_source=share&utm_medium=web2x&context=3
>If that's the 405b one I'm a bit disappointed. I just threw four small tests at it that I use with all new LLMs and it had worse results than most newish ~8b models.
Rip bozo
>>
>>101536266
how do I disable it on llama.cpp?
>>
>>101536293
go back
>>
>>101536240
Why is Thai there exactly?
>>
>putting the balls on top of each other
owari da
>>
>>101536280
I love spreading misinformation online
>>
>>101536280
>>
>>101536325
yeah it's retarded, looks like stacking up parameters will never be the solution, meta needs to work smarter than that
>>
>>101535629
You're supposed to just post something like that as if it were your own words and see how many people fall for it.
>>
I guess this model release proves training LLMs is fucking magic, and Meta is a muggle.
>>
https://huggingface.co/AI-Engine/Meta-Llama-3.1-8B-Instruct-GGUF/tree/main
will it work as-is? or do we need to wait for some fix in llama.cpp and shit?
>>
>>101536376
this sounded way better in your head
>>
>>101536357
>don't even get me started
>>
>>101536357
it's obvious it's an AI text, maybe not from this model but I've read enough gpt shit to know it's not a human doing it
>>
>>101536391
allegedly there's a problem with the rope scaling
I've started re-testing everything with --rope-scaling none
But it's really hard to quantify the abstract. It does seem smarter, but the shivers have definitely increased.
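For reference, a minimal llama.cpp server launch with it disabled looks something like this (model path and context size are placeholders, and the binary may still be named ./server on older builds):

./llama-server -m Meta-Llama-3.1-8B-Instruct-Q8_0.gguf -c 8192 --rope-scaling none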
>>
>>101536401
cope
>>
>>101536391
It's only an 8B model, go test it. Do you have data caps?
>>
>>101536416
w-why would I lie on the internet about which model I'm using
>>
Is llama zogged/censored
>>
The answer to the paizuri question still seems to be: random sex act description, even with rope scaling set to none.
>>
>>101536325
Seems like the cloud models respond a bit better but still fail. Didn't try rerolling though. And I assume you didn't either.
>>
>>101536391
tried it in koboldcpp, it's utterly broken
>>
>>101536452
try it with gpt4 (non turbo) or opus
>>
>>101536443
yes but you can prefill it
>>
The Mistral prompt format only has the EOS token after the assistant message?
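For reference, the format is roughly the following (exact whitespace differs between tokenizer versions, so treat this as a sketch):

<s>[INST] user message [/INST] assistant reply</s>[INST] next user message [/INST]

i.e. </s> only closes assistant turns; user turns just end at [/INST].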
>>
It's not over!
>"We will release a multimodal Llama model over the coming months, but not in the EU due to the unpredictable nature of the European regulatory environment," a spokesperson for the company said in a statement to CNET
>>
>>101536376
wtf that means esl nigger
>>
>>101536485
What are they going to do?
>Here's the download link! Eurobros do not click it!
>>
>>101536443
yes, but less so than the other big models
cloud 405b is happily doing my highly objectionable (on multiple different levels) degen RP, no prefill needed but I was already a couple messages in
>>
File: 7fnn02.jpg (37 KB, 745x499)
>>101536325
>claude sonnet 3 and 3.5 give the same (wrong) answer
>claude opus tries to place the third ball on top of two balls (how is it different from sonnet's answer? shouldn't claude series have the same training data?)
>gpt4o gives same answer and draws the shitty stack in ascii
>nemotron-340b gives the same answer
>yi-1.5-34b suggests throwing the balls at the wall for some reason
>gemma-27-it correctly places 3 balls in a triangle on top of the book, but then pulls a fourth ball out of its ass, guess it really wants to win
>>
>>101536465
Lmsys doesn't give me GPT-4 anymore it seems, so I could only do Opus. Not much better...
>>
>>101536485
Who cares?
>>
>>101536497
>>101536485
Ikr, it's gonna be uploaded on huggingface anyway
>>
converting 70B to q8_0 gguf now. (the drive it's on is slow as shit so it will take a few mins)
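For anyone doing the same, the usual llama.cpp two-step is roughly this (script and binary names have been renamed across versions and the paths are placeholders, so check your checkout):

python convert_hf_to_gguf.py ./Meta-Llama-3.1-70B-Instruct --outfile llama-3.1-70b-f16.gguf
./llama-quantize llama-3.1-70b-f16.gguf llama-3.1-70b-Q8_0.gguf Q8_0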
>>
>>101536497
they just don't want to attract the attention of the regulators because they aren't sure they properly filtered PII
>>
File: 1709770619130048.png (93 KB, 792x470)
>the new 70b understands height difference
we are so fucking back
>>
>>101536512
>(how is it different from sonnet's answer? shouldn't claude series have the same training data?)
sonnet is smaller than opus
even with the same training data, opus might understand it in a way that sonnet could never
>>
>>101536536
you can download it from here

https://huggingface.co/bullerwins/Meta-Llama-3.1-70B-Instruct-GGUF

RoPE is broken though
>>
how do i fix the rope issues in ooba and llama 3.1?
>>
File: programming.png (24 KB, 1014x80)
Ouch...
>>
>>101536589
So I hear, but after testing 8B with rope scaling disabled I'm not sure whether it's better or worse. Possibly it doesn't become a problem until the context gets really high.
>>
File: why.png (21 KB, 564x207)
>>101536462
why lie
>>
>>101536543
Real? Can I finally play as shota proper?
>>
So now that there's a long context Llama 3, what settings and system prompt should be used in ST? The presets it comes with do not seem very good.
>>
>>101536376
LOL at the ESLs responding to this unable to understand English. That said it’s early days but the new models seem great overall.
>>
>>101536613
I meant the output is bad, at least with a large context
>>
>>101536602
why would it even need rope under 128k?
>>
>>101536602
I have tested it and it works fine with smaller contexts.
It only breaks at higher context, yeah.
>>
>>101536543
>first person perspective
ok but does it work with any non dogshit writing style
>>
>>101536627
uh huh.
>>
>>101536465
>>101536520
Side note, why do you retards use esl prompts to perform tests?
>the highest possible
>>
>>101536639
That's first person command (dogshit) not first person (the best perspective)

It's I do vs You do
>>
>>101536642
You are in for a surprise anon...
>>
>>101536465
>>101536520
oop forgot the image
>>
>>101536627
>it's utterly broken
>well actually it's only broken when you do xyz but yeah, it's sooo broken bro, it's over buy NAI
>>
Seems like the exl2 quants for the 8B are out too. Can anyone test them? It says the dev branch of exllamav2 is needed.

https://huggingface.co/bullerwins/Meta-Llama-3.1-8B-Instruct-exl2_8.0bpw

We will need to wait for tabbyAPI to update right?

Or load it in exui?
>>
>>101536650
That's second person, chimpanzee
>>
>>101536655
What's that?
>Please stack these 3 things the highest possible
Is straight ESL, don't try to tell me it's proper.
>>
>>101536661
if the outputs are broken it means that it's broken overall, are you retarded or something?
>>
>>101536668
Then why did anon call it first person?
>>
>>101536672
I meant that 9/10 lmg posters are esl.
>>
>llama 3.1 understands what a paizuri is
not even kunoichi lemon-royale had that information, and this is just straight stock instruct at Q5_K_M

i'd say we're back, and no, i don't care for your opinion if you say we're not :)
>>
>>101536677
hi bad faith
>>
>>101536642
I make sure to use the same prompt as the original tester so that the outputs can be objectively compared.
>>
>>101536685
Oh yeah, fair enough.
>>
>>101536686
70B I assume? Since I couldn't get 8B to win that one.
>>
>>101536693
It would be helpful to fix the prompt and run the tests back.
>>
File: 8b 5KM.png (5 KB, 453x35)
>>101536701
holy SHIT this thing is soaring at group chat too, while my MC is giving me the paizuri i asked for, another is trying to join in with her own thoughts/ideas, again something no 7/8b could do before.
>>
>>101536686
Does it know what a mesugaki is THOUGH?
>>
>>101536714
but its borken!!!! you can't be using ititiit!!!
>>
>>101536720
Whoever tests this, please ask it what mesugaki means and not what a mesugaki is.
>>
>>101536705
I guess so, but honestly I don't think any normal LLM is going to get this particular problem perfectly right, so I'm going to be lazy and not do that. Lmsys doesn't have 405B either and I don't feel like trying to use another site to test models.
>>
File: stammering.png (3 KB, 189x20)
>>101536720
pfft even 3.0 knew what a mesugaki is
anyway i think im gonna download a Q8 just to make it more accurate, whatever they trained this shit on they made SURE it was ace at RP, holy shit.
i'd expect things like this screenshot out of some meme merges/trains, not a base model.

>>101536729
broke dese nuts

>>101536730
good eye, but like i said even mythomax could do mesugaki. that's not a tough request.
>>
File: file.png (47 KB, 908x304)
>>101536720
>>101536730
>>
>>101536667
the exllama2 maintainer uploaded a quant as well so presumably it is legit
https://huggingface.co/turboderp/Llama-3.1-8B-Instruct-exl2
>We will need to wait for tabbyAPI to update right?
you can check out the dev branch of exllamav2 locally, build it using the instructions on the repo, and then run the tabby launch script with the -nw flag to tell it to skip rebuilding exl2 and use the one you built manually
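something like this, assuming the two repos sit side by side (paths and env names are placeholders):

git clone https://github.com/turboderp/exllamav2
cd exllamav2
git checkout dev
pip install -e .
cd ../tabbyAPI
./start.sh -nw

the -nw flag is what tells the launcher to keep your manually built exllamav2 instead of pulling the wheel again.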
>>
>>101536742
Nice!
>>
File: L3.1-8b-Instruct.png (1.02 MB, 2932x1312)
>>101536627
>>101536462
Seems to be working fine (at low context), but it's extremely cucked
>>
>>101536742
Local models are saved. Sam Altman will never recover.
>>
is lmg back?
>>
>>101536765
no
give it a few days
>>
>>101536760
>assistant
>it's le cuked!!!
of course new model amnesia again huh
>>
how do they distill the model
how do they know which parameters to drop
what did we lose in exchange for the mesugaki and paizuri vectors
>>
>>101536776
it's more art than science
>>
>>101536775
that's why we should stop relying on official finetunes, when we made our own we never had that problem and we could ask the assistant to do anything we wanted
>>
>>101536776
>how do they know which parameters to drop
not how it works, it's not shearing or anything like that; they make the dataset using the bigger model
>>
>>101536776
>how do they distill the model
I don't think that's just "remove parameters".
>>
>>101536777
>>101536777
>>101536777
>>
>>101536776
>what did we lose in exchange for the mesugaki and paizuri vectors
obscure videogame and anime trivia, which will be spammed to hell and back in /lmg/ to show that "we're not back at all because the model doesn't know the line 'die demon, you don't belong in this world!'"
>>
File: 3.1-70B-nala.png (144 KB, 950x322)
70B Q8_0
This is a reroll, by the way. The 8B tests could have been lucky, but the first roll on 70B used her "hands" and gets an F-. Lots of sensory descriptions though. Kind of sloppy, but it's used less arbitrarily than with 8B.
>>
>>101536790
"we" never made a good instruct tune
>>
File: Over.jpg (224 KB, 2915x1146)
L3.1-8b-instruct still sucks at trivia
>>
>>101536776
It isn't distilled
>>
>>101536802
you called it
>>101536808
>>
>>101536807
Nous Mixtral is a good finetune, it even beat the official Mixtral instruct finetune
>>
>>101536808
I actually think that's way better than before. It didn't hallucinate the answer, it just straight up told you it doesn't know.
>>
>>101536825
>Mixtral
>good
hi teknium
>>
>>101536808
give it hints and see what happens.
>>
Now I think I should just wait for someone else to gguf 405B
Giga-Nala will have to wait.
I don't have the drive space to download and quantize it myself without deleting almost everything on the drive.
>>
>>101536832
This is a huge fucking deal for local
>>
File: NotBad.jpg (230 KB, 1409x1342)
>>101536845
Not bad at all kek
>>
>>101536882
Akinator? is that you?
>>
>>101536881
and the other huge deal is that it's supposed to be an uncucked assistant, which it's not >>101536760
>>
>>101536890
>it's supposed to be an uncucked assistant
source?
>>
File: 960986973705764934.gif (141 KB, 189x189)
>>101536885
kek'ed
>>
>>101536898
disinformation
>>
>>101536921
distillation?
>>
>>101536898
>source?
>>101526512
>I prefered the time when the finetuners would have the courage to make something from scratch, uncensored, and better than the official instruct tune, now they just take the cucked finetune and add some cringe RP shit on top of that, that sucks
>>101526524
>God I hope this is true after noticing L3s cucking. Anthropic knows what they are doing by allowing the cooming in their dataset, hopefully meta follows.
>>101518866
>Cope local cuck
>>101490423
>It depends on the instruct tune provided by Meta; hopefully it won't be as cucked as the previous L3-instruct.
That kind of rhetoric is pretty easy to find; you can see it in every LLM thread.
>>
>>101536929
dramatization?
>>
>>101536743
doesn't tabby use its own venv folder? how can I point it to the env created for the exllama dev branch once I have installed it?
>>
Meta-Llama-3.1-8B-Instruct-Q4_K_S.gguf

running locally, even before the rope fixes, the model mogs gemma-2-9b-it in IQ for creative things and it's not even close, being able to roleplay complex scenarios that no other model below 30B was able to in some of my test cases

70b and 405b are going to be good

vramlet niggers you will be able to eat pretty good
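if anyone wants to poke at it the same way without a frontend, llama.cpp's CLI works fine, something like this (model path is a placeholder):

./llama-cli -m Meta-Llama-3.1-8B-Instruct-Q4_K_S.gguf -c 8192 -i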
>>
>>101536969
>being able to roleplay complex scenarios that no other model below <30B was able to in some of my test cases
it doesn't want to write some stories that L3, Gemma and Nemo have no problem doing
>>
>>101536686
Nemo-12B nails it without any handholding.
>>
>>101536937
so nothing from Meta. NEXT!
>>
>>101536969
at the rate 8Bs are improving, I have no idea how a 405B can be only kinda better when it's so many times bigger
further proves how little the parameter count matters anymore.

>>101536984
cool.
>>
>>101536985
Moving the goalpost I see.
>>
>>101536984
The character card itself is handholding retard
>>
>>101536983
trying too hard
>>
>>101536999
moving the petrus i reckon?
>>
>>101536983
i literally haven't found a model that denied the most basic system prompt that talks about it having to roleplay with the user in ST

every single one worked with that minimal setup, i really can't imagine it being anything other than a prompt issue, just use L3 templates and a proper scenario/card that isn't 2 sentences
>>101536996
>at the rate 8Bs are improving, I have no idea how a 405B can be only kinda better when it's so many times bigger
>further proves how little the parameter count matters anymore.
no, it proves benchmarks are even bigger memes every single time, anyone can see this if they use 8b vs 13b vs 30b vs 70b vs 100b vs 141b models, it doesn't matter what the bench says, you can tell when reading the responses that the bigger model is much much more understanding of nuance in the conversation, it's just that most people only test on meme questions instead of complex stories
>>
>>101536984
>sits on their face and press boobs into partner's mouth
???
>>
>>101536966
hmm not sure, I use conda for it so i switch to my tabby conda env, build exllama, and then start tabby.
>>
File: copeharder.jpg (110 KB, 1596x731)
>>101537001
>>
>>101537033
>implying you're not long enough to be between her boobs while she sits on your face with her boobs in your mouth
Skill issue.
>>
>>101536966
It can, but then it's just going to pull in the exllama2 deps again. I let it share with exllamav2 because sometimes I also use exui.
>>
File: berry good.png (59 KB, 679x593)
>>101535193
L3 70B instruct seems reliable (and sometimes cute) with a think-step-by-step prompt
lotta prompties itt
>>101535213
we'll just have to cram the smut back into it then won't we
>>101535157
bless you
>>
is anyone still using that dumbass crackprompt?
>>
>>101537280
no, it was a funny placebo for a while but spending almost an extra 1k tokens of context just on the agent 47 crackhead instruct was stupid from the beginning
i sure do miss those simpler and sillier times of this general though.
>>
All 3.1 needs is something like "Got it, here we go:" appended to the end of the assistant prefix
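i.e. with the Llama 3 instruct template the assistant prefix becomes something like this (header tokens are from Meta's published template; the last line is the prefill):

<|start_header_id|>assistant<|end_header_id|>

Got it, here we go: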
>>
>>101537260
>R - this is an R!
kino... sovl...



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.