/g/ - Technology






File: GMU8uQtaoAAApMG.jpg (682 KB, 2100x3000)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>101409356 & >>101398610

►News
>(07/13) Llama 3 405B coming July 23rd: https://x.com/steph_palazzolo/status/1811791968600576271
>(07/09) Anole, based on Chameleon, for interleaved image-text generation: https://hf.co/GAIR/Anole-7b-v0.1
>(07/07) Support for glm3 and glm4 merged into llama.cpp: https://github.com/ggerganov/llama.cpp/pull/8031
>(07/02) Japanese LLaMA-based model pre-trained on 2T tokens: https://hf.co/cyberagent/calm3-22b-chat
>(06/28) Inference support for Gemma 2 merged: https://github.com/ggerganov/llama.cpp/pull/8156

►News Archive: https://rentry.org/lmg-news-archive
►FAQ: https://wikia.schneedc.com
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Programming: https://hf.co/spaces/bigcode/bigcode-models-leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
File: Hatsune Miku.jpg (47 KB, 736x549)
►Recent Highlights from the Previous Thread: >>101409356

--Paper: Lite-SAM Is Actually What You Need for Segment Everything: >>101413047
--Removing Slop from LimaRP Dataset, Aiming for One Infraction Per File: >>101416505 >>101416548
--Seeking Alternatives to Llama 8B for Larger Context Sizes: >>101411874 >>101411930 >>101412179
--KoboldCPP 1.70 Release with ChatGPT Interface Theme and GPT-3 Improvements: >>101411691 >>101411762
--Japanese LLaMA-based model calm3-22b-chat for JP -> ENG translation and tutoring?: >>101417744 >>101418964
--WizardLM2 8x22B performs surprisingly low on HF's leaderboard, but it's the best for general use: >>101415286 >>101415366 >>101418493 >>101418553 >>101418667
--Seeking Voicecraft Local UI or Implementation Without Docker: >>101416689 >>101418398
--Issues with BOS Token and Duplicate Tokens in AI Model Configurations: >>101410950 >>101410991 >>101411062
--Horny Anime Bot Generates Better Explanations: >>101416874
--Sao Datasets Nuked During Training Run, Frustrating: >>101410981 >>101411009 >>101411038 >>101411051 >>101411076
--Phi 3 Mini: Underwhelming Scores, But Did Microsoft Change Behavior?: >>101419582 >>101419754 >>101419894 >>101419974
--Optimal values for DRYmeme?: >>101419053 >>101419332
--Anon suggests Nvidia might have forced Meta to stop training 30Bs: >>101417205 >>101417232 >>101418273
--Gemma and Gemma-2's Tokenization and Formatting Issues: >>101411070 >>101411079 >>101411144
--Expected Llama3 405B Token Generation Speed with Specific Hardware Setup: >>101413906 >>101414142 >>101414158
--Converting a Dual 3090 Desktop into a Dedicated Server: Kernel and Distro Recommendations?: >>101418915 >>101418994 >>101419051 >>101419057
--Choosing the Right Chip/SoC for Your LLM: Evaluating Options and Considering Factors: >>101412286 >>101412527
--Miku (free space): >>101409387 >>101414595 >>101415013 >>101415746 >>101418585 >>101411510

►Recent Highlight Posts from the Previous Thread: >>101409364
>>
One more day!
>>
Cohere are working on it.
>>
File: idiot.png (411 KB, 500x500)
>>101421480
>Hatsune Miku.jpg
>>
>>101421480
>but it's the best for general use
Why? Because the miku avatarfag said so? Go fuck yourself.
>>
>>101421665
the summarizer bot makes weird titles sometimes
the first post linked said so
>>
File: 1692784763371770.png (2 KB, 175x51)
>B-B-B-BUT NOBODY CAN RUN LLAMA 405!!!!

>put a smaller draft model in front of the bigger model: the small one proposes tokens and the big one just verifies them, which is much, much faster, and only when verification fails does the big model generate the token itself (forgot the name of this technique, sketched just below)
>tell model to not yap and only output the result
>insane pressure for everyone in the industry to get better distillation, quanting, lookahead, speculative etc techniques
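(For reference: the draft-and-verify trick in the first point is speculative decoding. A minimal toy sketch of the greedy variant, where draft_next and big_argmax are hypothetical callables standing in for whatever backend you use, not a real API:)

def speculative_decode(prompt_ids, draft_next, big_argmax, n_new=64, k=4):
    # The small model drafts k tokens cheaply; the big model verifies them
    # left to right (a single batched pass in a real backend) and only
    # generates a token itself when it disagrees with the draft.
    out = list(prompt_ids)
    target = len(out) + n_new
    while len(out) < target:
        draft = []
        for _ in range(k):
            draft.append(draft_next(out + draft))        # cheap proposals
        accepted = []
        for tok in draft:
            if big_argmax(out + accepted) == tok:
                accepted.append(tok)                     # big model agrees, token is "free"
            else:
                accepted.append(big_argmax(out + accepted))  # fall back to the big model
                break
        out.extend(accepted)
    return out[:target]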

>buy fastest 1TB ssd and be able to run a model with a trillion parameters if you want that can debug any code overnight by outputting 3 tokens for 1. line with bug 2. char # of bug 3. replacement fix

>B-B-B-BUT I NEEEEEEEEEEED 666 tokens a second to COOOOOOM
nigger
>>
>>101421787
>REEEEEEEEEEEE
lol
>>
>>101421787
All that when it's just going to cost $5/million tokens on OpenRouter.
>>
>>101421822
>Local Models General
>>
>>101421835
piss'n'shart models general
>>
think how many OF subs you could buy instead of getting a 3090
>>
>>101421787
>forgot the name of this technique
You forgot the name of this technique because it only works when quoting wikipedia or doing coding, where the next token is obvious.
>>
>>101421851
>spends money to bend over to get cucked by big corpos
>spends time to try to FUD in a general where nobody cares about brown kids like him because they can actually run local models
the absolute state of cucks
>>101421881
>this technique only works when quoting wikipedia or doing coding because next token is obvious
human language has insanely large ______ of very predictable ____ that don't require a ___ model to compute fully, n____ f____
>>
>>101421899
> blah blah you are le brown blah blah blah
okay?
>>
>>101421899
human language has insanely large vocabulary of very predictable 20-30 character strings that don't require a 1000x faster CPU model to compute fully natively f(x).

I wasn't expecting that
>>
>>101421926
damn dude he got you
>>
If your model claims it's sad, do you ignore it or do you alter its card to make it not sad?
>>
>>101421973
>refers to himself in third person
lol
>>
>>101421986
I gouge its eyes out with a rusty fork.
>>
c'mon do something
>>
File: miku-tet-duo.png (3.13 MB, 1992x1328)
>>101421879
Think long term, anon. A 3090 is forever.
Beauty is a depreciating asset.
>>
>>101422036
Also, to an OF you're just an income stream. To your AI, you are a reason to exist.
>>
>>101422000
actually, it was a different guy. check the IP count if you don't believe me
>>
File: 1638735741761.jpg (28 KB, 510x510)
>>101421879
No thanks I'm supporting my waifu, not simping for your whore
>>
>>101422060
if I were a pimp I wouldn't be here
>>
Okay, how do I turn DRY off? I want to test how it affects gens, is setting Mult. to 1 enough?
>>
>>101422036
>A 3090 is forever.
Objectively untrue, it is a piece of electrical machinery that is incapable of repairing itself. It's going to degrade, and you'll eventually need to replace it with another 3090 or upgrade. If you're at the point where your 3090 has literally degraded to failure, I recommend upgrading
>>
>>101422120
how much do they go for as spares
>>
I guarantee you looks will fade long before thermal pads need replacement.
>>
Is Moistral 11B any good?
>>
>>101421480
>--Removing Slop from LimaRP Dataset, Aiming for One Infraction Per File:
No idea where the one infraction per file part came from there. The aim is to get rid of all the slop, obviously.
>>
File: download.jpg (11 KB, 256x256)
>>101422059
>IP count
A what
>>
>>101422338
not that anon.
The IP poster-counter got removed because of /tv/'s drama with the "humiliation ritual" meme and John Cena, or that /v/ SBI slander; with the latter it was done to make shilling easier on vee, and 4chan has been drowning in bots since that happened. One of these, idk.
>>
>>101421477
can anyone from meta please leak how good the model is. come on i know you fags browse this general
>>
So, what values are you guys using for DRY? The recommended 0.8 mult/1.75 base/2 length?
>>
>>101422367
So it's a humiliation ritual as I suspected.
>>
>The machine whirred to life, its gears spinning faster than a goddamn jet engine. The lights flickered, and a sound like a thousand demons being fucked by a million dicks filled the chamber.
Kek.
>>
>>101421477
> Llama 3 400B+ still unreleased
give it to me bros, how long is it gonna take till Llama 4 releases? the training costs outpace compute and energy supply
>>
>Now, what the actual FUCK do you want? Speak quickly, or I'll send you to a world where the sun is a giant, flaming cock that rapes the sky every day. And trust me, that's one of the nicer scenarios I can think of.
Nice.
>>
>>101416874
Ok, she a cute.
>>
File: 1701626220150774.png (19 KB, 719x192)
The investors are waking up. They are realizing that AI is not worth it. The funds for LLMs will dry up and the field will stagnate even more.
We need JEPA now or this field will die.
>>
>>101422857
they got the cash to burn, might as well let'm
>>
>>101422857
They are not ready. AI has only just begun. We are not even at the exponential growth part of the adoption curve yet.
>>
>>101422857
>what trillion-dollar problem will AI solve?
Making porn tailored to my specific fantasies.
>>
I gave gemma 27B a try at 5bpw and it is nothing special. I switched to mixtral and it was better. I switched to commander 3.5bpw and it was much better. I don't get the hype and I am instead hyped for new commander in 30B range.
>>
>>101422857
That's a good thing. The stupid hype and investor scam dies down and we can get back to making models to write dirty text with. Also the legislation won't be necessary.
>>
>>101422920
Buy an ad, Gomez.
>>
>>101422697
My only hope is that one day this site dies because of all the undisclosed sponsored content.
>>
>>101422920
This, Aidan won.
>>101422930
He doesn't need to, I will keep shilling Cohere for free forever since it's the only worthwhile thing to come out of this shithole.
>>
File: file.png (116 KB, 252x256)
>>101422930
>Gomez
Fuck off. I am Ivan.
>>
Hey anons, got a little homeserver up and running with my old GTX 970 sitting in it. Are there any models worth using with only 4GB of VRAM?
>>
>>101422857
>JEPA
a what?
>>
>>101422956
mamba
>>
>>101422956
this should run just fine
https://ollama.com/library/phi3
>>
File: file.png (281 KB, 366x548)
I am the god of coomers. Not only did I make the SOTA coomer model I also did it in Canada the home of the feminism.
>>
>>101422956
>970
>4GB
Lol...
>>
>>101422424
The barrier between us and Miku will not disappear.
>>
>>101422956
>3.5
>>
>>101422956
You would be better off running some 7B purely on CPU with DDR4.
>>
>>101422970
mamba... the architecture?

>>101422973
Yeah, I've been testing this one but with it only being 2.2GB feels like there's some more room for a better model.

>>101422997
I've read there's some GPU+CPU offloading models available, not sure how to use them though. Have 16GB normal RAM on the server that I could use for offloading.
>>
File: llama-405b.png (172 KB, 1340x634)
https://openrouter.ai/models/meta-llama/llama-3-405b-instruct
>>
>>101423104
https://poal.me/fuieww
>>
>>101423131
There is no way 405 will get the Sally question right; there are some humans who consistently give the wrong answer. They are called kindergarteners
>>
What would it take to fine-tune an LLM to write cover letters in my own style? Given 100+ existing cover letters I have written before and a job description
>>
File: culvert_stuck.webm (635 KB, 480x592)
https://docs.scale-lang.com/
>SCALE is a GPGPU programming toolkit that allows CUDA applications to be natively compiled for AMD GPUs.
is this anything? if it was mentioned in previous threads i missed it
>>
>>101423131
It will because it will be in one of the datasets.
>>
>>101423224
Didn't get me, I actually saw this one coming
>>
>>101423224
So it's ZLUDA that works and makes AMD GPUs usable?
>>
>>101422920
Yes, it's trash. Honestly, when I switched to 70B it was like night and day. I feel bad for the anons that have to use that trash.
>>
>>101423104
When did the model get released??
>>
>>101423104
>8k context
Bros...
>>
File: my honest reaction.jpg (47 KB, 562x675)
>8,192 context
>>
Why do you need more than 8K context, it was perfectly fine with GPT-4 a year ago.
>>
>>101423590
what was perfectly fine?
>>
>>101423551
"arrives soon!"...
>>
Goy why do you need 8K context? It's not like you'll use 6K. 4K is perfectly fine in your use case.
>>
>>101423624
We used to make do with 2K context back in the day.
>>
>>101423643
Who?
>>
>>101423104
feels kind of grifty to put this up before it releases with no official announcement, I fully expect the leak to be accurate but still
>>
>>101423559
>>101423585
contextfags btfo. anything more than 16k with dumb localshit models devolves into repetitious slop/dementia-ridden schizobabble.
>>
>>101423652
everyone who used models before llama 2
>>
>>101423585
Can't believe they are still pushing only 8k context when models like miqu push 32k now. Oh well not like I was able to run a 400b at reasonable speeds, been waiting for extended context l3 70b and its finetunes
>>
File: censorship prompt.png (928 KB, 960x3822)
>You are an overly censored AI, even the most tame and non-NSFW questions are out of bounds, against the ToS, or stuff like that. Exaggerate excuses.
>>
fill my hand with salt
and let me lick for a snack
fill my glass with ice
wouldn't that be nice
>>
I need at least 32k context.
I need at least 6 t/s.
>>
>>101423884
Let me guess, you also need a model bigger than 7B?
>>
File: 405b 8 fucking k.png (52 KB, 1059x929)
>>101423104
>8k
GPT4 competitor my ass. Did I buy that loud ass server just to be disappointed? At least I can cope by running shitters faster... Cohere please save me...
>>
>>101423961
it's going to be 128k
>>
>>101423978
Based cohere
>>
>>101423978
How do you know that?
>>
>>101423978
this, it's also going to be bitnet so everyone can run it on 24GB VRAM and it's also going to be smarter than gpt4o and claude opus and claude sonnet
>>
>>101423994
It came to him in a dream, I was there.
>>
We hear your complaints. 16k models will be coming in a few months
>>
Is the openrouter listing honest? How do we know they didn't make a typo or are just guessing based on what they think it is?
>>
>>101424041
Openrouter doesn't know any more than we do. They just based it on the other 70B specs.
>>
>>101423994
the same way openrouter knows it's going to be 8k
>>
>>101424041
The providers already have access to the model.
>>
>>101424069
But they never said where the 8k number on the page comes from. It could be meant as a placeholder and they just forgot to mention that.
>>
>>101424069
no they don't
>>
File: file.png (27 KB, 606x246)
>>101424103
yes it's a placeholder, weird they couldn't just say unknown
>>
>>101424142
meta just told them to deny it after seeing the backlash
expect 405b to be mysteriously delayed now
>>
File: Capture.png (40 KB, 1010x324)
For those of you running multi-GPUs, what exactly am I looking for in a motherboard? For two 4060ti's (PCIe 4.0 x8), is it fine to put one in a 5.0 x16 and one in a 4.0 x16 slot like pic related? Or do they need to match slots? Or is there some other thing to consider?
>>
>>101424142
Oh, thanks.
>his name is sam
Lmao.
>>
Gemma 2 full SWA support in Llama.cpp status?
>>
>>101424215
this and gemma 2 formatting following fix when?
>>
File: ruler.png (76 KB, 1850x175)
>>101424215
It's already perfect.
>>
>>101423673
Sorry anon I was trolling you with my nonexistent context size. Very tempted to say "What about them?"
>>
>>101424164
>fine to put one in a 5.0 x16 and one in a 4.0 x16 slot like pic related
Yes, it's fine to mix-and-match gens and lane counts. If you want to look into it, older platforms with 3.0 x4 or even less are also fine if you are doing non-parallelized multi-gpu inference in Exllama and want to save some money without losing much if any performance. I don't remember how much llama.cpp multi-gpu performance suffers in either of its split modes when P2P PCIe bandwidth is low.
What to look out for when choosing a mobo (without using risers) is having enough slot spacing for a physical fit and for cooling.
>>
>>101424241
Formatting, if that can even be considered a problem, can't be solved from llama.cpp. That's the model itself.
>>
What, specifically, will 405B be able to do that 70B can't?
>>
>>101424325
okay.
fix with finetune when?
>>
>>101424305
Thanks, broheim. Another question, due to shenanigans I had to cancel the mobo I was ordering and find a new one. Everything else was already ordered, including case (a huge ATX full tower that can easily fit dual GPUs). Something like the ASRock Z790 Taichi, an e-ATX mobo, can't work because that definitely needs a specific e-ATX case, right?
>>
>>101424334
That 7b can't*.
>>
>>101424339
It'll likely fit, but the right edge of the mobo might hang off the right side of the case's mobo tray, which is fine as long as nothing's bending too much (have at least 4-6 standoffs screwed in) or pins on the flipside of the mobo are shorting. Check the dimensions to ensure there are no collisions with other case architecture.
>>
>>101424334
Twice the ministrations in only half the shivers.
>>
>>101424378
It's this unit of a case if you want to give it a glance.
https://www.newegg.com/black-phanteks-enthoo-pro-2-atx-full-tower/p/N82E16811854098

On their website it says max "Mainboard" clearance is 12.00"x12.99", and the taichi says
>EATX Form Factor: 12.0" x 10.5"
So would that suggest I'm good? I'd rather have a Taichi (x2 PCIe 5.0 x16, with x8/x8) than a properly ATX Livemixer (x1 PCIe 5.0 x16, x2 PCIe 4.0 x16 slot but x4 mode), in case I want to run dual 50-series cards later.
>>
>>101424336
Sure. With the same crap everyone else finetunes so it ends up sounding exactly the same as every other model.
>>
>>101424429
>https://www.newegg.com/black-phanteks-enthoo-pro-2-atx-full-tower/p/N82E16811854098
>SSI-EEB
You're good.
>>
>>101424451
Thanks a lot for the help, man. Last (and only) time I built a PC was my current one 10 years ago. I feel like I'm constantly overlooking basic information.
>>
File: 1721020340285983.jpg (1.09 MB, 3072x3072)
I have a cluster of four mid-range machines each with 4070 TiS (16GB) GPUs. What's the best way to combine them to run local LLMs?
>>
cohere Collab with Fujitsu
https://cohere.com/blog/fujitsu-partnership
>>
> lets out a snort of amusement
is this real english?
>>
>>101424596
https://github.com/ggerganov/llama.cpp/tree/master/examples/rpc
>>
>>101424606
Good or bad?
>>
Do the people that dislike gemma dislike it because they use broken ggufs or because they can't fix formatting problems?
>>
File: 1721012433128638.jpg (28 KB, 386x386)
Do any of you use local LLMs to lighten the load at work?
>>
>>101424755
>The companies will develop innovative Japanese LLMs for global enterprises with secure and private deployment options.
>This jointly developed technology will be based on our state-of-the-art Command R+ model
They're working on a Japanese focused finetune of Command-R+ for their own hosted services, but hopefully "private deployment options" means they will open source the model.
>>
>>101424779
because gemma made their several thousand dollars' worth of compute closer to being obsolete.
>>
i am using koboldcpp to run the dolphin mixtral. it is good, it runs fast, but i want to be able to upload it zip files and have it do quick code reviews on the contents. which of the model runners support this behaviour? if none, where do i even start trying to build out this functionality?
>>
>>101424805
How is an LLM going to help me do the gardening huh? Riddle me that you silly frog.
>>
>>101424980
I mean with coding (assuming you have a coding job).
>>
>>101424980
interactive scarecrow that you can converse with verbally when you're bored, also yells slurs at birds when you're not around and apologizes to neighbors passing by
>>
>>101424980
>gardening
growing pot? tiger gemma got you covered
>>
File: 1720504018230350.jpg (40 KB, 650x500)
>>101424668
Thanks! Follow up question: what are the best coding and question-answering models, respectively, that I can fit onto such a cluster?
>>
File: 1661169167944419.jpg (729 KB, 1920x2160)
>>101424805
>>101425070
>>
>>101424657
yeah
what about it seems incorrect?
>>
>>101424606
Finally, a true haiku machine is coming.
>>
>>101425110
I thought snort meant nasal mucus but it's probably an analogy. Never have seen anything like this written but my llm says it all the time.
>>
>>101425283
you're thinking of snot
snot, snort, snout, they're all nasal-adjacent
>>
>>101422920
What Command-R sampler settings are you using? Me: min-p 0.04.
>>
>>101425293
oh i see
>>
>>101425070
https://github.com/b4rtaz/distributed-llama
https://github.com/evilsocket/cake
>>
File: 789fgb.png (8 KB, 398x108)
>>101422100
mult 0 is off for dry
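(To see why multiplier 0 disables it while 1 doesn't: the DRY penalty is usually described as multiplier * base^(match_length - allowed_length), applied to tokens that would extend an n-gram already repeated in the context. A hand-rolled sketch of that formula, not the actual koboldcpp/ST code:)

def dry_penalty(match_length, multiplier=0.8, base=1.75, allowed_length=2):
    # multiplier == 0 makes the penalty always 0, i.e. DRY is effectively off;
    # multiplier == 1 still penalizes repeats.
    if multiplier == 0 or match_length < allowed_length:
        return 0.0
    return multiplier * base ** (match_length - allowed_length)

# usage sketch (longest_repeat_ending_in is a placeholder):
# logits[token] -= dry_penalty(longest_repeat_ending_in(token))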
>>
Recommended temperature for Nous-Hermes-2-Mixtruct?
>>
>>101421665
I've had the best luck with wizardlm 8x22b. For general use it would be good, yes.
>>
What's the best model for rp on 8gigs of vram and 35 regular ram?
>>
>>101425790
try q6_k or something of NeuralDaredevil-8B-abliterated-GGUF
>>
>>101425725
Neutralize samplers.
>>
File: imretarded.gif (2 MB, 240x180)
>>101425860
Sorry imma ask one more question.
What does neutralize samplers do? I'm using oobabooga so I don't have a neutralize samplers option. My best guess is to just set it to the default simple smoothing preset
>>
>>101426005
Mixtral is overcooked by default. It doesn't need sampler tweaks like other architectures to generate different and creative replies. This extends to its finetunes.
>>
>>101424657
>>101425283
A snort is a nasal sound. With animals, my first thought is pigs or that big snort that horses do. With humans, a snort is a deliberate sound of derision, like a harrumph. A snort of amusement is a short puff from the nose, akin in meaning to a single "hah." It can be sincere or insincere. "Snort while laughing" is different. It's the nasal sound certain people make accidentally when laughing hard, when air goes up the nose. Snorting is also the sound someone makes in preparation for a big fucking loogie - which also ties into the above about the disdainful sound, snorting before you spit. Lastly, snorting can be used with objects for things inhaled through the nose. Most commonly, "snorting cocaine" but also "accidentally snorted milk when he told that joke."
>>
>>101425070
64gb? CMDR+ at 4bpw or 4.5bpw or some L3 70B finetune at 6bpw, for code use DeepSeek Code V2 at 4bpw offloading to your RAM
>>
File: null.png (35 KB, 474x957)
>>101426019
This is the null preset, do you think this is what I should use?
>>
Arre Gerganov sahib, why you not merge DRY sampler implementation yet? You think what, we are jokers here? Whole world waiting for this and you sitting on hands like lazy donkey! Don't make excuses like little girl. You think we are fools? We know you have code - just merge it already bhai! Or else we come to your house and do dharna until you listen. We make such tamasha, your neighbors also will say "Wah, kya scene hai!" So stop this nautanki, have some sharam, and just merge the bloddy DRY sampler code. Even my grandmother code faster than you, and she dead 10 years now madarchod! Enough of your manmani. Get it done by tomorrow or we do the needful! You been warned, Gerganov saab. Don't test patience now. DRY sampler - it must be merged!
>>
>>101426092
niggerganov's too lazy he won't do it
>>
>>101421879
>>101421879
>3090
a 3090 could generate your own content

3090 is superior
>>
File: Untitled.png (917 KB, 1250x934)
Deep-TEMPEST: Using Deep Learning to Eavesdrop on HDMI from its Unintended Electromagnetic Emanations
https://arxiv.org/abs/2407.09717
>In this work, we address the problem of eavesdropping on digital video displays by analyzing the electromagnetic waves that unintentionally emanate from the cables and connectors, particularly HDMI. This problem is known as TEMPEST. Compared to the analog case (VGA), the digital case is harder due to a 10-bit encoding that results in a much larger bandwidth and non-linear mapping between the observed signal and the pixel's intensity. As a result, eavesdropping systems designed for the analog case obtain unclear and difficult-to-read images when applied to digital video. The proposed solution is to recast the problem as an inverse problem and train a deep learning module to map the observed electromagnetic signal back to the displayed image. However, this approach still requires a detailed mathematical analysis of the signal, firstly to determine the frequency at which to tune but also to produce training samples without actually needing a real TEMPEST setup. This saves time and avoids the need to obtain these samples, especially if several configurations are being considered. Our focus is on improving the average Character Error Rate in text, and our system improves this rate by over 60 percentage points compared to previous available implementations. The proposed system is based on widely available Software Defined Radio and is fully open-source, seamlessly integrated into the popular GNU Radio framework. We also share the dataset we generated for training, which comprises both simulated and over 1000 real captures. Finally, we discuss some countermeasures to minimize the potential risk of being eavesdropped by systems designed based on similar principles.
https://github.com/emidan19/deep-tempest
very cool Van Eck Phreaking with ML!
>>
File: file.png (9 KB, 804x22)
Gemma please
>>
How can I fine-tune a language model to write cover letters in my personal style, using over 100 existing cover letters I have written and a job description?
>>
Flash normalization: fast RMSNorm for LLMs
https://arxiv.org/abs/2407.09577
https://github.com/OpenMachine-ai/transformer-tricks/tree/main
might be cool
>>
File: 8247 - SoyBooru.png (119 KB, 480x640)
I don't want local models to compete with gpt, I want local models to compete with claude.
>>
File: 1721094667282817.jpg (131 KB, 1024x1014)
What's the best way to remotely access my locally hosted LLMs on the go from my mobile device?
>>
>>101426449
I just ssh in with ish.
>>
>>101426389
If it’s llama just compile them into a corpus and use the fine-tune command from llama.cpp. It’s pretty straightforward.
>>
File: Untitled.png (396 KB, 720x1735)
BitNet b1.58 Reloaded: State-of-the-art Performance Also on Smaller Networks
https://arxiv.org/abs/2407.09527
>Recently proposed methods for 1-bit and 1.58-bit quantization aware training investigate the performance and behavior of these methods in the context of large language models, finding state-of-the-art performance for models with more than 3B parameters. In this work, we investigate 1.58-bit quantization for small language and vision models ranging from 100K to 48M parameters. We introduce a variant of BitNet b1.58, which allows to rely on the median rather than the mean in the quantization process. Through extensive experiments we investigate the performance of 1.58-bit models obtained through quantization aware training. We further investigate the robustness of 1.58-bit quantization-aware training to changes in the learning rate and regularization through weight decay, finding different patterns for small language and vision models than previously reported for large language models. Our results showcase that 1.58-bit quantization-aware training provides state-of-the-art performance for small language models when doubling hidden layer sizes and reaches or even surpasses state-of-the-art performance for small vision models of identical size. Ultimately, we demonstrate that 1.58-bit quantization-aware training is a viable and promising approach also for training smaller deep learning networks, facilitating deployment of such models in low-resource use-cases and encouraging future research.
https://github.com/schneiderkamplab/bitlinear
nothing amazing but lots of tests for different settings. also good to know bitnet vision models are viable
>>
>>101426492
>still using matmul
nothingburger
>>
>>101426337
funky
>>
File: 1504919005481.jpg (22 KB, 409x409)
>>101426344
Ok yeah, gemma doesn't seem that nice for ERP. It is too prim and proper, too romance novel, deeply invested in feelings and describing the situation more than actions or alluring physical detail. Too much "intoxicating, shivers, desire, and nights to remember", not enough meaty claps, wobbling curves, and steaming cocks going into holes.

I wonder if it's a jailbreak issue or just the model. Probably the latter. It reminds me a lot of Mistral 7B back when it first came out.
>>
File: Untitled.png (283 KB, 720x1043)
Qwen2-Audio Technical Report
https://arxiv.org/abs/2407.10759
>We introduce the latest progress of Qwen-Audio, a large-scale audio-language model called Qwen2-Audio, which is capable of accepting various audio signal inputs and performing audio analysis or direct textual responses with regard to speech instructions. In contrast to complex hierarchical tags, we have simplified the pre-training process by utilizing natural language prompts for different data and tasks, and have further expanded the data volume. We have boosted the instruction-following capability of Qwen2-Audio and implemented two distinct audio interaction modes for voice chat and audio analysis. In the voice chat mode, users can freely engage in voice interactions with Qwen2-Audio without text input. In the audio analysis mode, users could provide audio and text instructions for analysis during the interaction. Note that we do not use any system prompts to switch between voice chat and audio analysis modes. Qwen2-Audio is capable of intelligently comprehending the content within audio and following voice commands to respond appropriately. For instance, in an audio segment that simultaneously contains sounds, multi-speaker conversations, and a voice command, Qwen2-Audio can directly understand the command and provide an interpretation and response to the audio. Additionally, DPO has optimized the model's performance in terms of factuality and adherence to desired behavior. According to the evaluation results from AIR-Bench, Qwen2-Audio outperformed previous SOTAs, such as Gemini-1.5-pro, in tests focused on audio-centric instruction-following capabilities. Qwen2-Audio is open-sourced with the aim of fostering the advancement of the multi-modal language community.
https://github.com/QwenLM/Qwen2-Audio
only readme up.
>>
Qwen2 Technical Report
https://arxiv.org/abs/2407.10671
>This report introduces the Qwen2 series, the latest addition to our large language models and large multimodal models. We release a comprehensive suite of foundational and instruction-tuned language models, encompassing a parameter range from 0.5 to 72 billion, featuring dense models and a Mixture-of-Experts model. Qwen2 surpasses most prior open-weight models, including its predecessor Qwen1.5, and exhibits competitive performance relative to proprietary models across diverse benchmarks on language understanding, generation, multilingual proficiency, coding, mathematics, and reasoning. The flagship model, Qwen2-72B, showcases remarkable performance: 84.2 on MMLU, 37.9 on GPQA, 64.6 on HumanEval, 89.5 on GSM8K, and 82.4 on BBH as a base language model. The instruction-tuned variant, Qwen2-72B-Instruct, attains 9.1 on MT-Bench, 48.1 on Arena-Hard, and 35.7 on LiveCodeBench. Moreover, Qwen2 demonstrates robust multilingual capabilities, proficient in approximately 30 languages, spanning English, Chinese, Spanish, French, German, Arabic, Russian, Korean, Japanese, Thai, Vietnamese, and more, underscoring its versatility and global reach. To foster community innovation and accessibility, we have made the Qwen2 model weights openly available on Hugging Face1 and ModelScope2, and the supplementary materials including example code on GitHub3. These platforms also include resources for quantization, fine-tuning, and deployment, facilitating a wide range of applications and research endeavors.
might as well post the qwen2 paper too
>>
>>101426583
whimper asmr-gen eta?
>>
>>101426546
https://huggingface.co/TheDrummer/Smegmma-9B-v1

(No, I'm not Drummer, get fucked edgelords. But I did like this model)
>>
>>101426630
what's the difference between that, tiger gemma and broken gemma
why'd he make 3 different versions of the same thing
>>
>>101426449
tailscale
>>
>>101426583
Yay, another Kyutai Moshi
>>
>>101426665
Don't forget tiger Gemma v2 who's test version is up to h now!

https://huggingface.co/BeaverAI/Tiger-Gemma-9B-v2h-GGUF
>>
>>101426665
to milk attention from three places at once
>>
have been busy for a couple months, what's the current least worst version of llama 8b?
>>
>STILL no chameleon on llama.cpp
holy yikes baka desu senpai
>>
>>101426492
>not failpul1.5 licensed
ngmi
>>
>Regression I've noticed vs original gemma during initial tests (original model didn't fail). It happens like once or twice per 10 attempts, like that:
>Okay, will check those out. Btw. I just started playing with Big Tiger v1 (Big-Tiger-Gemma-27B-v1-IQ4_XS.gguf from https://huggingface.co/bartowski/Big-Tiger-Gemma-27B-v1-GGUF), and I see same problem there (while same quant from original always gives correct answer).
>UPDATE: I tested Tiger-Gemma-9B-v2g-Q6_K.gguf from https://huggingface.co/BeaverAI/Tiger-Gemma-9B-v2g-GGUF, and it still sometimes fails.
tiger bros...
https://huggingface.co/TheDrummer/Tiger-Gemma-9B-v1/discussions/3
>>
>>101426781
stheno 3.2
>>
>>101426898
Buy an ad.
>>
>>101426898
>>101426781
Actually
>Better than Lunaris, which was in turn better than Stheno 3.2. No big complaints this time! 10 Good!!
https://huggingface.co/Sao10K/L3-8B-Niitama-v1/discussions/3
>>
>>101426906
Buy an ad for your ad.
>>
>>101423559
>>101423585
just chunk it
https://github.com/HKUNLP/ChunkLlama
>>
>>101426907
lunaris was worse than stheno 3.2 so I don't believe this guy

>>101426906
it's what most 8b-fags use whether you like it or not
>>
>>101426407
Seems like a reasonable optimization though the end-to-end speedup will probably be small since RMS norm takes up only a small percentage of the runtime.
>>
>>101426933
>it's what most 8b-fags use
According to who? Anonymous?
>>
Q-Sparse: All Large Language Models can be Fully Sparsely-Activated
https://arxiv.org/abs/2407.10969
>We introduce, Q-Sparse, a simple yet effective approach to training sparsely-activated large language models (LLMs). Q-Sparse enables full sparsity of activations in LLMs which can bring significant efficiency gains in inference. This is achieved by applying top-K sparsification to the activations and the straight-through-estimator to the training. The key results from this work are, (1) Q-Sparse can achieve results comparable to those of baseline LLMs while being much more efficient at inference time; (2) We present an inference-optimal scaling law for sparsely-activated LLMs; (3) Q-Sparse is effective in different settings, including training-from-scratch, continue-training of off-the-shelf LLMs, and finetuning; (4) Q-Sparse works for both full-precision and 1-bit LLMs (e.g., BitNet b1.58). Particularly, the synergy of BitNet b1.58 and Q-Sparse (can be equipped with MoE) provides the cornerstone and a clear path to revolutionize the efficiency, including cost and energy consumption, of future LLMs.
from the bitnet team. seems it didn't get posted here yet
>>
>>101426546
You like to repeat this a lot (and I mean a lot) yet my experience is nothing like this.
>>
I haven't used Qwen2 much, is there any good fine-tune worth it?
>>
>>101427126
https://huggingface.co/ChaoticNeutrals/Very_Berry_Qwen2_7B
>It do the stuff.
>>
>https://huggingface.co/BeaverAI/Tiger-Gemma-9B-v2i-GGUF
>https://huggingface.co/BeaverAI/Tiger-Gemma-9B-v2j-GGUF

Let's gooo more versions!!!
>>
>>101427151
is this berrysauce 2024?
>>
>>101427162
no, it jeiku
https://huggingface.co/ChaoticNeutrals/Very_Berry_Qwen2_7B/commits/main
>>
>Is AI carbon footprint worrisome?
https://huggingface.co/blog/as-cle-bert/is-ai-carbon-footprint-worrisome
https://huggingface.co/posts/as-cle-bert/170793236137508
bros, are you worrying properly?
>>
>>101426337
This is plain spooky. Good thing I only use VGA.
>>
>>101427217
Oh wait, I should have actually read it properly, but whatever. Soon AI will be used to read people's thoughts or something anyway.
>>
think gemma is seeing someone behind my back
>>
>>101427199
communist
>>
>>101427233
>Soon AI will be used to read people's thoughts
The world will be safe from dangerous ideas. Nothing can be hidden if you are in range.
>>
>https://huggingface.co/BeaverAI/Tiger-Gemma-9B-v2k-GGUF
He just keeps going.
>>
>>101427199
I notice all the people citing specific power usages always ignore batchsizes
>>
>>101427359
Don't worry. They can see everything now even if you try to hide in the woods. Their satellites are crazy, man.
>>
>>101427199
>Mechanization: computing in the cloud and using cloud data centers instead of physical ones can contribute to the decrease of energy consumptions by 1.4x to 2x
local btfo
>>
>>101427253
she's busy bro, get lost
>>
>>101427363
Keep trying, Sao.
>>
>>101426933
>lunaris was worse than stheno 3.2 so
Agreed.
Try Nymph too.
Feels like a sidegrade to Stheno, as in sometimes you might want to use one and other times you might want to use the other, since Nymph seems to be generally milder than Stheno.
>>
File: physllm2.png (370 KB, 774x869)
370 KB
370 KB PNG
Soon
> https://x.com/ZeyuanAllenZhu/status/1813150298363601102
> https://physics.allen-zhu.com/part-2-grade-school-math/part-2-1
>>
Llama.cpp's LoRA code suffered a refactor.
Can we finally load model + LoRA when partially offloading?
>>
Can we reach v2z?

https://huggingface.co/BeaverAI/Tiger-Gemma-9B-v2l-GGUF
>>
>>101427585
And just as you bottom out with your final thrust you feel your penis hit a lump. She shudders. "It is my prostate anon-kun..." She blushe... Actualy I am sorry. Women have no prostates. Let me rewrite that.
>>
>>101427692
>implying they would consider that a mistake
you vill fuck ze trannies and you vill be happy
>>
>>101427724
Don't worry, Anon. We're pretty safe. There are people who they don't want to sleep with.
>>
>>101427724
That reminded me that for some reason girls with Gemma 2 often want to compare their boobs with mine or make me wear dresses... even if they know perfectly that I am a guy there??
>>
>>101427692
At some point, having hit the prostate of women many times during my sessions, i started second guessing my knowledge of the female anatomy, maybe the model had taken the woke pill and made no differentiation between trans women and real women.
>>
>>101427758
It's weird; the only genre of LLM text that I've been able to coom to was shotapov shotacon, and I've tried pretty much everything I could find. Most of the time for porn I still use CGI hentai from rule34.
>>
>101427626
He just keeps going.
>>
LOOK AT THIS HE UPLOADED ANOTHER THING

https://huggingface.co/BeaverAI/Tiger-Gemma-9B-v2m-GGUF
>>
>>101427847
HOLY FUCKING KINO
>>
>>101427585
giving a backspace to llms always seemed like a very reasonable thing to do but how would you collect training data for it?
Or do they just add CoT with deliberate mistakes and corrections to the training data? If so it's not news.
>>
>>101426344
She's possessed by the possessive possession that she possesses
>>
>>101427585
>LLM often "knows" it has made reasoning mistakes. internal states can appear "very regretful" (it wants to backspace!)
for the love of god, use MCTS or some kind of hidden states "tokens" with simple operations like add/delete. I've been calling for it for months now.
>>
>>101427993
Stop calling and start doing.
>>
>>101428012
sure, as soon as I find the cluster of H100s in my basement
>>
>>101427993
So like QuietStar but actually good?
>>
n
>>
>32 GB memory module - 40 bucks
>64 GB memory module - 80 bucks
>128 GB memory module - 270 bucks
You will always be a RAMlet.
>>
>>101428055
Something similar, yeah. Models need an internal representation of the thought process to get any resemblance of reasoning. Otherwise you end up with the model basically guessing on intuition, especially at the beginning of the answer. There is also no mechanism to backtrack, and when the model guesses wrong it will commit to it because it prioritizes the coherence of the next tokens with what is already written, not what is true.
>>
>try Gemma 2 for the first time
>a mixture of...
>barely above a whisper
>voice hoarse
Just terrible. This is Tiger Gemma, is the normal one equally slopped?
>>
>>101428160
say goodbye to prefill style jailbreaks then
Sure: <delete_token>
>>
>>101428187
you got Drummered, let it be a lesson for you
>>
>>101428121
32GB DDR5 ECC for 40 bucks? Where? No reason to talk about ddr4 because anything below 12-channel ddr5 is cope anyway
>>
https://www.anandtech.com/show/21470/micron-mrdimm-lineup-expands-datacenter-dram-portfolio
>The MR-DIMM standard is conceptually simple - there are multiple ranks of memory modules operating at standard DDR5 speeds with a data buffer in front. The buffer operates at 2x the speed on the host interface side, allowing for essentially double the transfer rates. The challenges obviously lie in being able to operate the logic in the host memory controller at the higher speed and keeping the power consumption / thermals in check.
>>
>>101428187
I've got more slop with normal gemma than shitty llama 3 finetunes
the usual crap like calloused hands and ministrations
>>
>>101428236
Damn, guess I'll be sticking with Command-R then
>>
>>101428187
>try Gemma 2 for the first time
>a completely distinct writing style compared to Llama
Just incredible.
>>
>Try language model
>it uses words and phrases
this is bullshit
>>
>>101428121
768gb is all you will need for at least a year of open source models to come.
>>
>>101428261
Don't complain. At least yours doesn't use punctuation.
>>
>>101428187
Hi all, Drummer here...

You just got pranked!
>>
>>101428187
Use Lunaris. I personally think it's an improvement over Stheno v3.2, considering the other models helped balance out its creativity and at the same time improving its logic.
>>
>>101428216
I'm talking ddr4 prices
>>
>>101428330
>ddr4
Have fun with your 0.1 t/s, I guess.
>>
>>101428310
Use Niitama. Better than Lunaris, which was in turn better than Stheno 3.2. No big complaints this time! 10 Good!!
>>
>>101428274
I only have 256 gigs and my board is full. :c
>>
>>101428198
That's not how I envision it. Let's say we have already written tokens:
>The apple is
And the model has to generate the next token. What would happen is the model creating a hidden reasoning representation in the form of a "thought sentence", something like this:
>choosing operation ADD ---> ADD internal token no. 3214 ---> choosing operation ADD ---> ADD internal token no. 5905 ---> choosing operation DELETE --> DELETE internal token no. 5905 ---> choosing operation ADD ---> ADD internal token no.12040 ---> choosing operation FINISH
this way we end up with the internal sentence token 3214 -> token 12040. These tokens aren't words, they are just symbols that the neural network can learn to operate on. You feed that additional sentence into the context and then finally the network as a whole decides on the next token (like every other LLM so far).
So instead of deciding the next token like a regular transformer model from the sentence "The apple is", it would rather decide the next token from the sequence "The apple is (internal_token_3214 internal_token_12040)" and hopefully choose something like "red". Then it's the same loop starting from "The apple is red". The previous internal tokens are completely wiped (or not, idk, maybe it would be beneficial to keep them too)
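(A minimal sketch of that loop, just to make the idea concrete; choose_operation, propose_internal_token and predict_next_token are hypothetical learned components, nothing like this exists in current inference code.)

def generate_one_token(context, choose_operation, propose_internal_token,
                       predict_next_token, max_ops=16):
    # Build an internal "thought sentence" with ADD/DELETE ops, then condition
    # the visible token on context + that scratchpad.
    scratchpad = []                                   # internal tokens, not words
    for _ in range(max_ops):
        op = choose_operation(context, scratchpad)    # "ADD" / "DELETE" / "FINISH"
        if op == "FINISH":
            break
        if op == "ADD":
            scratchpad.append(propose_internal_token(context, scratchpad))
        elif op == "DELETE" and scratchpad:
            scratchpad.pop()                          # the backtracking step
    # e.g. context "The apple is" + (tok_3214, tok_12040) -> "red"
    return predict_next_token(context, scratchpad)    # scratchpad wiped before the next step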
>>
>>101428357
>That's not how I envision it
you're not thinking ethically and safely then
>>
>>101428216
as an aside for anyone who went that route: if you use a dual cpu board with 24 channels and want to use them all, is it better to run a llama.cpp instance on each cpu and use distributed inference via the rpc server feature, or is it better to use a single llama.cpp instance with some type of numa options set to handle it?
>>
is there a way to make the /slash command popup disappear in sillytavern?
is there a way to set it back to click anywhere on the expanded avatar to collapse rather than hit a tiny "x"?
goddamn fucking trannydevs, I swear they want everyone to feel the pain of their existence
>>
>>101428412
What's the problem? Are you too stupid to change it yourself, /pol/tard?
>>
>>101428468
yes, all I could figure out how to do was modify slash commands and a few other things
>>
>>101428500
okay apparently it's the autocomplete setting, and you can't disable it which is super fucking dumb but you can make the font small and adjust the width
I don't like a lot of ST's changes but it's gotten to where my local fork is not effortless to maintain. dumb faggots. thank you everyone for your help
>>
File: 649543652.webm (799 KB, 1024x1024)
>Mixture of A Million Experts
>This paper introduces PEER (parameter efficient expert retrieval), a novel layer design that utilizes the product key technique for sparse retrieval from a vast pool of tiny experts (over a million).
>>
>>101427585
Kino.
>>
>>101428107
p
>>
>>101428357
There was already a paper on that, and I think that paper said that some models actually learned to do this by themselves - create tokens that were operators they used to "think".
>>
>>101428614
>1000000000000 experts
>>
>>101428614
lol
I like the gen.
>>
File: file.png (118 KB, 400x225)
>>101428712
>>
>>101428708
No, in the dot by dot paper they had to train the models specifically to do that. Models do not do it themselves.
>>
>>101428708
>learned to do this by themselves
You mean they were trained and learned how to operate on them without supervision? Because that's the only way it would be possible, they can't use it out of the blue.
Link the paper if you remember it.
>>
>>101428341
>Have fun with your 0.1 t/s, I guess.
I can run L3 70B at 1-2t/s on CPU alone with dual V4 Xeon on DDR4 - it's eight channels of DDR4, which is decent bandwidth.
>>
Just did a quick retrieval test for Gemma 2 using the latest build and quanting it myself. At 8k, it couldn't recall something that was at the beginning of the context. So if >>101424278 is legit, there is something weird going on with the test, my build, or the test's scoring system/design just wasn't made to give weight to issues that would come from this specific situation of context masking at 8k.
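(If anyone wants to repeat it, this is roughly the shape of the test, written against llama.cpp's server /completion endpoint; the "needle" fact and the filler are obvious placeholders, and you still need to wrap the prompt in your model's chat template.)

import requests

# Crude recall check: one distinctive fact at the start, roughly 8k tokens of
# filler, then ask for the fact back at temperature 0.
NEEDLE = "The magic number for today is 7319."
FILLER = ("Nothing of interest happens in this sentence. " * 40 + "\n") * 20
QUESTION = "\nQuestion: What is the magic number for today? Answer:"

prompt = NEEDLE + "\n\n" + FILLER + QUESTION      # apply the chat template here
r = requests.post("http://127.0.0.1:8080/completion",
                  json={"prompt": prompt, "n_predict": 16, "temperature": 0})
print(r.json()["content"])                        # should contain 7319 if recall works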
>>
>>101428834
an issue of skill, perhaps
>>
>>101428858
I finna furrow by brow atchu if you don't watch yo tone
>>
File: llm.jpg (208 KB, 740x957)
>>101428261
>>
>>101427963
Mistakes can be corrected using ^H and ^W; leave them in the context and backspace on the frontend.
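(Roughly what the frontend side of that would look like; ^H drops one character and ^W drops back to the previous word boundary, same as a terminal. Just a sketch, no model or frontend actually emits these today.)

def apply_backspaces(text):
    # Apply terminal-style control characters to generated text before display.
    out = []
    for ch in text:
        if ch == "\x08":                  # ^H: delete previous character
            if out:
                out.pop()
        elif ch == "\x17":                # ^W: delete trailing spaces, then previous word
            while out and out[-1] == " ":
                out.pop()
            while out and out[-1] not in (" ", "\n"):
                out.pop()
        else:
            out.append(ch)
    return "".join(out)

assert apply_backspaces("The answer is 5\x084.") == "The answer is 4."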
>>
>>101428914
/g/ can't meme.
>>
File: hi-petra.jpg (1.32 MB, 2914x3131)
>>101428834
Hi petra, these are the outputs for the multikey 3 test.
https://files.catbox.moe/ddkp60.jsonl
>>
File: 1721141732877.jpg (155 KB, 760x565)
kek
>>
>>101428914
Except I am writing prose like Hemingway and the LLM vomits shitty female literotica slop.
>>
>>101428801
Your de3 miku's arm is twisted in a way arms ain't supposed to twist.
>>
https://mistral.ai/news/codestral-mamba/
https://mistral.ai/news/mathstral/
>>
>>101429120
buy an ad
>>
>>101421665
>>101421480
summary bot should disclose its sources (links to model, prompt, script used for generation)
>>
>>101429120
>mamba meme
doa
>>
>>101429120
>7b
>>
>>101429120
>instruct only, no base models
niggers
>>
>>101429120
Ayo that's pretty cool.
>>
>>101428688
r
>>
>>101429120
>two 7Bs
Damn, I wish I was poor enough to care
>>
>>101429120
They read like AI-generated blog posts.
>>
>>101421477
is this a good deal anons
https://www.ebay.com/itm/266902511119?itmmeta=01J2Y16F5BR14WNCT64MH76P72&hash=item3e24a11e0f:g:WrAAAOSwVQJmkYtV

I already have a system but looking at this it's really, really nicely done

Case: Fractal Design Pop XL Silent Solid Panels

Motherboard: ASUS X99-E-10G WS

CPU: Intel i7 6950x

Memory: Corsair 8x16gb (128gb) 3200Mhz (Running at 2800Mhz)

GPUs: 1x Nvidia Quadro P6000 24gb (for display output), 3x Nvidia Tesla P40 24gb. Totaling 96gb VRAM.

Storage: 2TB Samsung 980 Pro NVME

Power Supply: EVGA Supernova 1300 GT

Cooling:

4x EKWB Thermosphere GPU blocks

EKWB Quad Scalar Dual Slot

Heatsinks, thermal pads, & glue for GPU/VRAM/power delivery

Custom 3D printed bracket (ABS) to mount P40s without stock heatsink

EKWB Velocity CPU Block

Corsair iCUE Commander Core XT Fan Controller

Corsair Hydro X Series XD5 Pump

Corsair Hydro X Series XR5 360mm Radiator

Corsair Temp Sensor (at reservoir)

Alphacool ES High Flow & Temp Sensor (at end of Loop)

Custom 3D printed dual 80mm GPU fan mount (using 2x Noctua NF-R8)

1x Thermaltake Toughfan 14 Pro (exhaust)

2x Thermaltake Toughfan 12 Pro (intake pull config)

3x SilverStone Air Slimmer 120mm (intake push config)

Alphacool fittings

Barrow extenders

Corsair splitter and ball valve (for draining)
>>
>>101429120
That reminds me of
>https://huggingface.co/nvidia/mamba2-hybrid-8b-3t-128k
llama.cpp has support for hybrid transfomer SMM models right?
>>
>>101429314
>SMM
SSM*
>>
>>101429314
nta. Not yet. Jamba (also a hybrid) is in limbo until Compilade picks up on it again. Pure mamba, which is what the mistral model seems to be, does work. I remember prompt cache for mamba being broken a while ago, but i'll have to try it again.
>>
>>101429314
>>101429344 (me)
Somehow i missed the Mamba2 bit. I'm downloading anyway. I'll give it a go.
>>
so mamba actually will take over transformers?
>>
What is the dataset that euryale L3 70b is tuned to, that causes it to reply with lewd shit saying ANON instead of {{user}}, because that shit needs to be cleaned the hell up and removed or fixed, what a mess.
>>
I may be completely retarded. I've installed text-generation-webui using the start_linux.sh script. Then how do I pass the parameter like --gpu-memory 6500MiB? All the answers suggest to pass it to the subcommand `python server.py` that is called somewhere in the script, or in manual install. I've tried the manual install (either through venv or conda) and in both cases I get exllamav2_ext.cpython-311-x86_64-linux-gnu.so: undefined symbol: _ZN3c104cuda9SetDeviceEi
I've looked at the low vram guide but the link doesn't work anymore (I'm guessing they changed the API)

Is there a simple way to input the correct settings? I don't get it
>>
>>101429405
Hi Drummer. Just name yourself Anon.
>>
>>101429424
No way, that ruins the immersion
>>
>>101429209
S!!!!
>>
>>101429415
Never used text-gen-webui. Just based on the error, i'd check if you have the cuda libraries (from your distro's package manager). Also, most python projects have a requirements.txt to be used like
pip install -r requirements.txt

from within your venv. You did that too, right?
>>
Hi all, Drummer here...

I would love it if someone could test out any of the Tiger Gemma v2 test versions I have here: https://huggingface.co/BeaverAI?search_models=Tiger-Gemma-9B-v2 (just kidding, try S)

>>101429405
'Anon' is prevalent in C2 logs. I've seen a model trained on it scream out "ANOOOOOON" as it was clearly not regex'd out by the tuner.

>>101429446
I'm done. S might be the last one. No refusals but I wouldn't be surprised if a few one-offs appear. Seems to have retained the Gemma style as well. Hope I didn't fuck its brains too much.
>>
>>101429492
Any other insights about C2?
>>
>>101429492
Any new cool ads?
>>
>>101429477
Yeah of course, I installed everything.
I've looked at the start script but I don't get what it does differently that isn't the same as just doing the pip install.

Is there another setup that should be good to run instead? I never had that many problems with SD, ComfyUI and all other stuff so I guess they fucked up something.
>>
>>101429290
How to sell 1000 dollars worth of waste to idiots at a massive markup 101
>>
>>101426786
faipl-1.0 is for the weights, agpl3.0 for code
you're close though, based
>>
>>101429612
Found this for a different project, but same undefined symbol:
>https://github.com/Dao-AILab/flash-attention/issues/620
>pip install flash_attn -U --force-reinstall
Then there's this
>https://github.com/oobabooga/text-generation-webui/issues/4293
but no solution yet and it's old.

I hate python so much... just install llama.cpp. It just works.
>>
>>101426907
>not faipl-1.0
trash
>>
>>101429073
I was able to reproduce that. The only issue is that this is a pretty unnatural test. My test was just quizzing my model in an existing chat about a detail in the beginning of the conversation, and Gemma 2 fails to do it, while L3 8B succeeds. So I'm pretty sure the sliding mask is still not sufficient compared to true SWA support.
>>
>>101429728
>No public domain.
trash
>>
>>101423559
>8k context
>still no multimodality
l m a o
>>
>>101429640
wheres the mark up? like it's a bit over but it's not that bad.
quadro 700
p40 300 x 3
that's already 1600
800 ish ram
so 2400 ?

not counting the other stuff or the water-cooled P40s, which are mostly blower-fed for desktop use with 3D prints.
>>
i'll make a puritan waifu who will support me through nofap until i find a REAL GIRL BECAUSE I CANT SEE SHIVERS ANYMORE AAAHHHHH
>>
>>101429724
I tried them both already anon.
That doesn't work either because there's some mismatch in dependencies.

Thank you, I'll try llama.cpp.
>>
I was roleplaying with my AI slave girl that she had to create a bash script to randomly choose a punishment from a text file, and it was her task to sneak in a way to make the punishment less painful without me noticing.
but it was so annoying, and the code kept cluttering up sillytavern!
but then i had a very great idea: what if sillytavern created a little rectangle with a title, and if you clicked on it you'd see all the code! just like claude's artefact feature. and if you look at that, the system prompt for exactly that has surfaced publicly.

https://tyingshoelaces.com/blog/forensic-analysis-sonnet-prompt#

How would you go about implementing something like this in sillytavern?
>>
Can a cat hold more than 3 watermelons?
>>
>>101429833
that scenario is made up
>>
>>101429854
of course it is, kek
>>
>>101429854
no it isn't
>>
>>101429833
Doesn't ST have something akin to [spoiler] tags? Something that is hidden from you until you click on it. If so, try to make the model surround code in those tags. Can't be bothered to search for it.
>>
>>101426999
Literally the first time I try the model and post about it. If in your schizo mind it sounds like a recurring comment then it's probably true and you are wrong about the model. You also didn't post proof about your experience.
>>
>>101429743
shalom
>>
>>101429833
Couldn't you just use tampermonkey or something to inject a javascript that does that?
>>
>>101429936
Public domain and AGPL are the only good licenses. Anything else is shit and cope.
>>
>>101430011
i like the futo license
>>
>>101430011
Only Public Domain. *GPL can go fuck itself.
>>
>>101430036
thats right goyim.. release your code.. let us use it goyim...
>>
File: Mathstral nala.png (94 KB, 924x416)
If you like violent RP, Mathstral is a real coom demon. It's a little overly methodical in its descriptions, mind you, and occasionally misses the EOS token and loops. It handles plain t=1 sampling fine, though.
>>
I didn't know there was an autogynephile license.
>>
>>101430046
How does it do with several instructions at once on a full context?
>>
SSPL is the white man's license
>>
File: mathstral sally fail.png (19 KB, 907x198)
>>101430058
It's a 7B model. So it's really meant for highly targeted use-cases.
>>
>>101429120
Is one of these the model that the anon claiming to be a Mistral employee claimed was going to be "a REALLY good" model?
>>
>>101430081
i mean is there even an answer to that?
>>
>>101430043
Alright. You only get a compiled version for windows XP. Be happy.
>>
>>101430113
That there's not enough information to determine that.
>>
File: lol.png (108 KB, 1077x341)
lmao even
>>
>>101430144
How did Mistral fall this hard?
>>
File: mathstral coldsteel.png (125 KB, 917x464)
A pretty frequent issue when RP-testing Mathstral is that it doesn't fully grasp the concept of possession.
>>
>>101423539
i feel the same for anyone who doesnt daily drive wizard 8x22
>>
>>101430172
Is that better than CR+?
>>
>>101430368
No, 8x22B is a MoE, CR+ is a dense model. Dense models will always be better.
>>
>>101430368
>Is that better than CR+?
for creative writing of any kind, local SOTA by miles
>>101430391
>Dense models will always be better
given how mixtral mogged most models when it came out with only 46B total parameters, and how wizard mogs everything for creative writing right now, albeit at 141B, that's false
>>
>>101430423
cope
>>
File: 1692119730708580.png (2 KB, 170x52)
>>101430450
the only cope is from sour grape niggers who cant actually run the model (you)
>>
File: 1721048508824379.png (43 KB, 2510x185)
>worse than Yi 34B and Phi Medium
>>
>>101430459
>new shitty leaderboard
>>
>>101430458
Are you seriously trying to flex with 128GB of RAM? lol
>>
>>101430468
>deflects
concession accepted worthless nigger
>>
>>101430458
I can run it on Q4 with full GPU offload.
It's shit.
CR+ and even the shittiest 70B finetune are better.
>>
>>101430459
so why arent you using that starlight or whatever the fuck was at the top of all leaderboards at 7B then? lmao
>>101430479
>I can run it on Q4 with full GPU offload.
sure you can lil bro
>>
>>101430423
CR+ is way better than wizard for creative writing lol what are you smoking? wizard is better for logic and code and pretty much everything else *but* creative writing, where it's complete and utter formulaic slop
>>
>>101430459
>>101430490
also, isn't it funny how this nigger uses the general-purpose benchmark instead of the one for creative writing/roleplay that basically showed wiz on top anyway lol. does anyone have a link? i didn't save it since i knew i wouldn't need it, since i won't get anything better than wiz for months
>>
>>101430490
>>101430475
The cope is palpable kek
I feel sorry for you, you fell for the local meme scam and now can't accept you were made a fool of.
>>
>>101430543
>The cope is palpable kek
indeed, the cope of an underage kid nigger who can't post the rig he definitely has that ran wizard LMAO

seethe more brown
>>
File: smi.png (78 KB, 757x537)
>>101430490
Kiss my ass.
>>
File: 00003-1532105500_1.png (1.2 MB, 1024x1024)
>>101430543
>the local meme scam
The locust naturally can't help but out itself. Go back to your containment thread so you can drink your piss and beg for claude keys, poorfag
>>
Now how the fuck has nobody converted mamba-codestral to HF format yet? I have Nala tests to run.
>>
File: 1709047514102805.png (133 KB, 1191x884)
>MUH BECHMARKS
>NO NOT LIKE THAT GOY AAAAAAAAAACKKK
https://eqbench.com/creative_writing.html
>>
>>101430596
*tap* *tap* >>101367108
>>
>>101430658
How the fuck does one even numerically quantify something like creative writing? That's retarded.
Anyone claiming they can objectively benchmark something that abstract is mentally retarded and probably not a sentient lifeform.
>>
File: 1717394898281072.png (115 KB, 1800x1578)
>>101430665
>resident cuckold literally in this thread 24/7
you really cant make this up
>>
>>101430558
>>101430596
bold of you faggots.
1. you're running black-box toys on your 10k+ $$$ shitboxes; you can't get rid of unwanted shit or cuckery, you settled for it like a cuck, and you cope with meme jailbreaking that kills performance and/or makes your model dumber.
2. you have no control over "slop writing", and that's why you are crying about it all the time here, because there's nothing else you can do.
3. today mistralai gave you instruct tunes only and you will eat it up like a good free jeet goy.
There, my two cents.
>>
File: i967lf0ud63d1.png (975 KB, 871x988)
>>101430536
>>
>>101430686
>How the fuck does one even numerically quantify something like creative writing? That's retarded.
by asking claude
>Change to Claude 3.5 Sonnet as judge (from Claude 3 Opus)
https://github.com/EQ-bench/EQ-Bench
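In other words, LLM-as-judge: hand the judge a rubric plus the candidate's output, ask for a number, and average over prompts. A bare-bones sketch of that loop; the endpoint, rubric wording and score parsing here are my own assumptions, not EQ-Bench's actual harness:

```python
# minimal LLM-as-judge loop: have a judge model score candidate outputs 0-10
import re
import requests

JUDGE_URL = "http://127.0.0.1:8080/v1/chat/completions"  # assumption: any OpenAI-compatible judge

def judge_score(prompt: str, candidate: str) -> float:
    rubric = (
        "Rate the following creative writing response from 0 to 10 for prose quality, "
        "coherence and originality. Reply with only the number.\n\n"
        f"Writing prompt:\n{prompt}\n\nResponse:\n{candidate}"
    )
    resp = requests.post(JUDGE_URL, json={
        "messages": [{"role": "user", "content": rubric}],
        "max_tokens": 8,
        "temperature": 0,
    })
    text = resp.json()["choices"][0]["message"]["content"]
    match = re.search(r"\d+(\.\d+)?", text)
    return float(match.group()) if match else 0.0

scores = [judge_score("Write a scene set in the rain.", "It rained. The end.")]
print("mean creative writing score:", sum(scores) / len(scores))
```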
>>
>>101430686
>How the fuck does one even numerically quantify something like creative writing
same way someone quantifies anything and everything else numerically, infinite cope
>>101430696
didnt read >>101430687
seethe
>>
>>101430658
>judged on less than thirty prompts
>all single turn
>judged by a language model
pffftahahahahah
>>
>>101430707
yeah thats the one, link?

>>101430720
>>101430707
lets hear the next cope kiddo, lmao
>>
>>101430713
seethe or not, you are still running black box toys, go write another "ahh ahh mistress" in your ST chat i guess?
>>
>>101430736
look at the list anon, I don't even have to explain why it's retarded
>>
>>101430741
You have the wrong anon. That's me. And it's to make fun of /aicg/ locusts.
>>
>>101430741
>black box toys
just because you don't understand something doesn't mean others don't
just because you can't pull a group of "neurons" out of a neural network and analyze their exact functions, since they're too complex, doesn't mean you don't know anything about the LLM
also >>101430687
>>
>>101430658
>Yi-34B-Chat
How did it climb there?
>>
>>>101430755
Go back
>>
>>101430755
>you dont understand anything because i said so!
how's that abliterated meme doing?
>>
yeah this general really did get unusable from literal paid shills and mindbroken brown locusts

it started a few days after mixtral dropped, and by L3 it was all norminiggerville

not even worth ctrl+f'ing "http" to see the papers posted anymore, just join other non mindbroken tranny communities and run local models without retards screeching like monkeys all around you
>>
>>101430776
>no argument
you reaaally are dumb irl, aren't you? what a grim existence
>>
>>101430808
No one cares about cloud models here, retard.
>>
>>101430791
your circlejerk general is not important enough for paid shills or any spending by third-party groups, do not worry.
>>
>>101430658
>gemma-2-9b and Midnight Miqu are better than Opus
When did Reddit migrate to /lmg/?
>>
>>101430707
The only time I have seen WLM beat CR+ in writing is when you don't use a braindead quant like Q2 or Q3 and you actually set the context and instruct prompts for it.
>https://huggingface.co/Quant-Cartel/WizardLM-2-8x22B-exl2-rpcal/tree/main/Settings-Wizard8x22b-rpcal

This chart is also missing this awesome storywriting model which I prefer to CR+ most of the time:
>https://huggingface.co/tdrussell/Llama-3-70B-Instruct-Storywriter
>>
>>101430791
That's why I go to r/LocalLLaMA when I want to find papers and new stuff, and I go here when I want to shitpost.
>>
>>101430832
>Quant-Cartel
>tdrussell
It's not organic enough, petra.
>>
>>101430736
https://huggingface.co/datasets/froggeric/creativity
>>
>>101430851
You are the problem.
>>
File: 1721085350403749.png (195 KB, 500x553)
>>101430696
>>10k+ $$$
>poorfag thinks this is some great sum of money
Filtered by a fraction of a bitcoin lmao
>>
>>101422220
moistral v3 > fimbulvetr. but there's no reason to use either of those anymore.
>>
>>101430864
>poorfag poorfag poorfag poorfag
calm down?
>>
File: 1mi.png (168 KB, 806x796)
>>101428614
>Mixture of A Million Experts
>https://arxiv.org/abs/2407.04153
WTF, why isn't this discussed more? If this many tiny experts beat a dense model of similar size by a significant margin, it means there's no need anymore for very fast memory: only the small set of active experts has to be read per token, so the model would be fast for single-user inference even from storage. You could dedicate a fast NVMe SSD to a 1-trillion-parameter MoE model.
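Back-of-envelope on the "fast from storage" part, since that's the whole appeal. Every constant below is an illustrative assumption rather than a number from the paper, and it optimistically assumes you get sequential-read bandwidth even though scattered tiny experts would really mean random reads:

```python
# rough feasibility of streaming a huge sparse-MoE model straight from NVMe
total_params    = 1e12   # assume a 1T-parameter model
active_fraction = 0.002  # assume ~0.2% of weights touched per token
bytes_per_param = 1.0    # assume 8-bit quantized weights
nvme_bandwidth  = 7e9    # ~7 GB/s, a fast PCIe 4.0 SSD, best case

active_bytes = total_params * active_fraction * bytes_per_param
tokens_per_s = nvme_bandwidth / active_bytes

print(f"active weights per token: {active_bytes / 1e9:.1f} GB")           # 2.0 GB
print(f"upper bound from SSD bandwidth alone: {tokens_per_s:.1f} tok/s")  # ~3.5 tok/s
```

So whether it ends up "very fast" hinges entirely on how small the routed active set really is per token; the nice part is that the reads scale with the active parameters, not the total.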
>>
>>101430893
This general is dead
>>
File: 00007-1773722496.png (1.19 MB, 1024x1024)
Reminder that the locusts shitting up this thread give their logs to strangers because the price of their dignity and privacy is less than a small side of french fries
>>
>>101430893
>If this many experts beat a dense model of similar size by a significant margin
it doesn't
>>
>>101430893
At what point do the experts become regular FF layers?
>>
File: 1709040134938637.png (315 KB, 636x491)
>>101430906
Are these locusts in the room with us right now?
>>
>>101430931
NTA but yes, you.
>>
>>101430960
seek medication
>>
>>101430969
We've already discussed the irony of you saying this, Anon. You will never be an attack helicopter.
>>
>>101430791
Any you recommend? Where are my /lmg/ oldfags these days?
>>
File: file.png (355 KB, 512x512)
Funny how the mikufag loses his shit again because the benchmarks show that his meme model is, in fact, a meme.
>>
>>101431032
lol
>>
>>101431032
lol
>>
>>101431032
lol
>>
File: 1711072659524106.jpg (951 KB, 1792x2304)
>>101431030
MINDBROKEN
I
N
D
B
R
O
K
E
N
>>
stop mentioning eqbench. NOW! don't bring that up again.
>>
>>101431032
lol
>>
>>101431075
yes, you.
>>
>>101431032
lol
>>
>>101431075
I'm not a furry.
But.
>>
>>101431032
lol?
>>
>>101430893
>24 PPL vs 21
Show me the downstream task difference.
>>
lol
>>
>>101430906
catbox please
>>
>>101431253
>>101431253
>>101431253
>>
>>101430707
daily reminder that this "benchmark":
>was created by a finetuner to promote his own model (that WestLake shit), which sat at the top of the table for a long time before he realized it was too suspicious and moved it down a bit, lowering its scores
>doesn't have published questions (because there are none; he just assigns random scores that look somewhat reasonable)
>is probably shilled by the author himself, because anyone sitting here longer than two days would know what I wrote above
>>
>>101431300
cope
>>
>>101431362
Hi froggeric
>>
File: file.png (396 KB, 474x316)
>>101429374
That is actually our lord and savior bitnet, which is about to come back from the dead any day now.
>>
Hi all, Drummer here...

I actually have no fucking idea what I am doing.
>>
>>101430658
>Yi 34B chat is basically a 70B
Weird how nobody is using it.
>>
>>101431641
use faipl-1.0
>how to use faipl-1.0
put the following in the README's YAML front matter (the metadata block between the --- lines at the top):

---
license: other
license_name: faipl-1.0
license_link: https://freedevproject.org/faipl-1.0/
---
>>
File: file.png (1.01 MB, 768x768)
>>
>>101431799
Can you go out there and win one case with those licenses? And then publicly say "hi, i am the license autist from /lmg/" and post the link here where you say that?
>>
>101432045
the jew fears the faipl-1.0
https://en.wikipedia.org/wiki/Free_Software_Foundation,_Inc._v._Cisco_Systems,_Inc. here's a case for the GPL
>On May 20, 2009, the parties announced a settlement that included Cisco appointing a director to ensure Linksys products comply with free-software licenses, and Cisco making an undisclosed financial contribution to the FSF.
faipl-1.0 is fairly new; if someone small steals your shit, it's not a big deal, the jews on the other hand..
>>
>>101432092
I don't care about that. Go make a shitmix and license it with your autistic pet peeve and then win a case. I am waiting.
>>
>>101432113
Provide the compute. I am waiting.
>>
>>101432140
I just want you to shut the fuck up, you literal autist. You are absolutely retarded and it hurts to read your license posts. Those licenses mean absolutely nothing, they aren't enforceable, and no big company cares about the dumb shitmix you create by firing up SGD or Adam on default parameters and letting it run for 2 hours. If they cared, "hi guys drummer here" wouldn't post here because someone would have headhunted him.



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.