/g/ - Technology






/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>101134566 & >>101125756

►News
>(06/25) Cambrian-1: Collection of vision-centric multimodal LLMs: https://cambrian-mllm.github.io
>(06/23) Support for BitnetForCausalLM merged: https://github.com/ggerganov/llama.cpp/pull/7931
>(06/18) Meta Research releases multimodal 34B, audio, and multi-token prediction models: https://ai.meta.com/blog/meta-fair-research-new-releases
>(06/17) DeepSeekCoder-V2 released with 236B & 16B MoEs: https://github.com/deepseek-ai/DeepSeek-Coder-V2
>(06/14) Nemotron-4-340B: Dense model designed for synthetic data generation: https://hf.co/nvidia/Nemotron-4-340B-Instruct

►News Archive: https://rentry.org/lmg-news-archive
►FAQ: https://wikia.schneedc.com
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Programming: https://hf.co/spaces/bigcode/bigcode-models-leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
File: tet.jpg (44 KB, 637x358)
►Recent Highlights from the Previous Thread: >>101134566

--Paper: Adam-mini: Use Fewer Learning Rates To Gain More: >>101141337 >>101141838
--Papers: >>101140697 >>101140766 >>101140609 >>101140655 >>101140878 >>101141988 >>101140729
--Template for AI-Powered Person with Human-Like Interactions: >>101141420 >>101141500
--Voice Synth After Elevenlabs' Changes: New Projects and Challenges: >>101144328 >>101144357 >>101144434 >>101144650 >>101144660 >>101144808
--Vectordb-ing Wikipedia for Efficient Querying and Embedding Archives: >>101141307 >>101141318 >>101141329 >>101141327 >>101143811
--Probllama: Ollama Remote Code Execution Vulnerability (CVE-2024-37032) – Overview and Mitigations: >>101134926 >>101135029
--Overclocking A6000 Memory for Performance Boost: >>101138329 >>101138358 >>101138445 >>101138570 >>101138599 >>101138640
--Hypothesis for Improving Character/System Prompt Following with Stat Tracking Section: >>101139285
--DCLM's Standardized Corpus of 240T Tokens from Common Crawl: >>101135598
--Claude 3.5 Sonnet vs GPT4o: Model Comparison and Limitations: >>101135803 >>101135886 >>101135872 >>101141487
--Cambrian-1: A Vision-Centric Multimodal LLM for Enhanced Spatial Understanding in Text RP: >>101142603 >>101142681
--Research on Predictable Decision Making in LLMs by Siyan Zhao: >>101136382
--Open LLM Leaderboard Updates and Skepticism: >>101139019 >>101139036 >>101139045
--Llamafile 0.8.7 Released with Fixes and ARM Performance Boost: >>101140227 >>101141232
--LLMs' Reasoning Ability and Dataset Limitations in Character Counting Tasks: >>101134613 >>101134742 >>101134793 >>101135151 >>101140188 >>101140213 >>101140272 >>101140325 >>101140442 >>101140673 >>101140761 >>101140874
--Jamba Instruct Model Released on OpenRouter Platform: >>101137926
--Benchmark: PyTorch 55% Slower than llm.c for GPT-2 Training: >>101136766
--Miku (free space): >>101136681 >>101137085 >>101141355 >>101141139

►Recent Highlight Posts from the Previous Thread: >>101136593
>>
So what's the deal with DRY repetition penalty? I read about it; it tracks repeated sequences of tokens so it tries to prevent full phrases from repeating, instead of just single words or tokens like regular rep penalty. Is it better? Does it work well? What range do you set the multiplier to?
>>
>>101144968
I guess it's over and Nvidia will soon close their regular GPU department. Why make 40/5090 when they can make more workstation cards?
>>
>>101145075
In theory it should work well exactly because it deals with n-grams instead of tokens, but I haven't had repetition issues in so long that I didn't even bother testing it.
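Rough sketch of the idea for anyone curious (not the actual implementation in any backend, parameter names are just illustrative defaults):

def dry_penalty(context, candidate, multiplier=0.8, base=1.75, allowed_len=2):
    # longest suffix of the context that, extended by `candidate`, already occurred earlier,
    # i.e. how long a repeat this token would continue
    match_len = 0
    for n in range(1, len(context)):
        pattern = context[-n:] + [candidate]
        if any(context[i:i + n + 1] == pattern for i in range(len(context) - n)):
            match_len = n
        else:
            break
    if match_len < allowed_len:  # short overlaps (common word pairs etc.) are left alone
        return 0.0
    return multiplier * base ** (match_len - allowed_len)

# before sampling: logits[candidate] -= dry_penalty(context_tokens, candidate)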
>>
bitconnet... what went wrong...?
>>
>>101144942
thank you recap anon
>>
>>101145313
people always preferred bytes.
>>
>>101145313
too dangerous for our democracy.
>>
Is that Cambrian model relevant when there's chameleon? (and can chameleon now output images when finetuned, or not?)
>>
>>101145075
does it affect code/function calling/json outputs?
>>
>>101145560
Not Miku
>>
>>101145560
>>101145576
>>101145588
based
>>
>>101145560
>>101145576
>>101145588
I just don't get the race obsession.
>>
>>101146004
He's american tranny
>>
>>101146030
friendly fire
>>
Damn I was going to ask for opinions on the 4080 Super since it's now under $1k, but damn, what's the point when you can buy a used 3090 for $700.

Also, wholesome Migu.
>>
How the hell do you prompt CR+ in SillyTavern? Not talking about the system prompt, just the whole setup. Can anyone give an example screenshot? Would really appreciate it. Because either my quant is too low, or ST's default example prompts for it are shitty, or both.
>>
File: 1711072659524103.jpg (811 KB, 2048x2048)
>>101144935
>is it Teto Tuesday already?
>>
File: rag.png (21 KB, 593x439)
Bros... is it now well and truly ogre for us? How do we compete with this?
>>
>>101145435
It is a mystery.
>>
>>101146848
Isn't this old? Or is this a new update and now it automatically puts your chats in memory?
>>
>>101146871
It's the first time I see this popup. Now in settings I can go to a "memory" page but it appears to be empty still, even after a bit of chatting. I'm not sure if this is actually RAG or if it's just some kind of system message injection.
>>
>>101146514
What exactly do you want to know if not the system prompt? You mean preset? Default silly preset is indeed bad, use this in Story String:
<|START_OF_TURN_TOKEN|><|SYSTEM_TOKEN|>
# System Preamble
You are a co-author, writing with me.
## Style Guide
You narrate for {{char}}.
{{#if system}}{{system}}
{{/if}}

## Additional information about {{char}}
{{#if wiBefore}}{{wiBefore}}
{{/if}}{{#if description}}{{description}}
{{/if}}{{#if personality}}### {{char}}'s personality:
{{personality}}
{{/if}}{{#if scenario}}### Scenario:
{{scenario}}
{{/if}}{{#if mesExamples}}### Example dialogue:
{{mesExamples}}
{{/if}}{{#if wiAfter}}{{wiAfter}}
{{/if}}

# User Preamble
I will be narrating for {{user}}.

{{#if persona}}## Additional information about {{user}}
{{persona}}
{{/if}}<|END_OF_TURN_TOKEN|>
>>
>>101146899
Ok this isn't new then. Weird that you're somehow just getting this now.
>>
>>101147002
maybe cause i'm a leaf
>>
>>101146923
>no safety preamble
>no expert roleplayer instructions
highly suspect
>>
>>101146923
Not him but thanks.
>>
I wanna finetune wizard 8x22 on limarp.
How much VRAM do I need and what should I use? Axolotl?
>>
File: 1695912428662624.png (22 KB, 778x290)
>>101147110
depends
>>
>>101147110
>maybe if I throw more slop at this slopped model it will be less slopped
>>
>>101147151
>2400GB
Holy fuck dude...
>>
>>101147151
>2 bit qlora
wait is that a thing? How retarded are the results?
>>
File: What.jpg (159 KB, 2880x1406)
What do they mean by this?
>>
>>101147165
It's fine, one epoch of 4bit qlora is totally enough to get money on ko-fi
>>
>>101147151
I can afford the machine for 8bit qlora, but - how would dataset size impact the requirement (would it?) and how long will it take?
>>
>>101147181
Yeah, very weird. What surprise could they have that they plaster it over a leaderboard?
>>
>>101147175
Generally, model training size > bits.
A 2bit 8x22b model is generally going to out preform a 8bit 13B model. That said...
>>
>>101147181
Real surprise. With real confetti this time and with real tragic consequences.
>>
jameleonbyte-bitnet-MoE-MoA-MoM-MLA 600b when
>>
>>101147181
in 20 hours it will be magically transformed into an actually useful leaderboard
>>
>>101147286
>not the SuperCoT finetune
wake me up when
>>
>>101147299
So a Nala test leaderboard?
>>
>>101147181
I totally forgot this shit existed kek, now only chatbot arena is relevant
>>
>>101147336
I'd pay actual money for that.
>>
>>101147347
>he still thinks the chatbot arena is relevant
>>
>>101147299
kek
>>
>>101147064
>no safety preamble
Found it useless since it could say everything already without it.
>no expert roleplayer instructions
That's what system prompt is for. ({{#if system}}{{system}}{{/if}})
>>
Boxed my 3090s. I'm waiting for that architectural breakthrough because current gen is not good enough
>>
>>101147157
I want their length control thing. I really like it.
>>
>>101146848
Isn't this RAG?
>>
Are the new snapdragon laptops good for llm?
Does llama work on them?
>>
File: 1683495417317.png (136 KB, 542x476)
>>
File: miku2061.png (1.2 MB, 832x1216)
>>101146340
>Also, wholesome Migu.
Miku remembers those days fondly
>>
>P40s on ebay went from 200 to 330 euro in the span of 6 months
>Can now ACTUALLY buy 3090s for 550 to 750 euro

Okay, at this point it may actually be worth to buy a 3090 to accompany my 4070, instead of making a secondary server with 2 P40s
Is 36gb of vram a meme?
>>
>>101147829
what are the purple and green juices ion get it
>>
>>101147691
https://help.openai.com/en/articles/8590148-memory-faq

Not sure of the implementation details. For now it's not remembering anything for me.
>>
File: Dr-piccolo.png (656 KB, 1148x1411)
>>101147930
Daily Dose
>>
File: 1467713503540.png (1.3 MB, 1054x1600)
>>101147930
>>
>>101147909
Anything less than half a terabyte of VRAM is a meme, and even that will probably only futureproof you till the end of the year.
>>
>>101147829
this used to make me feel bad but repeatedly seeing it here has inoculated me to it, I think it's fine now
>>
File: IQ3-XXS.jpg (46 KB, 600x480)
>>101148006
Terabyte?
You need at least 50 million gigaquads of capacity just to run an AI doctor, and forget about it having any bedside manner.
>>
>>101148052
Why did it make you feel bad? It represents blissful happiness.
>>
>>101148006
I just want to be happy with my AI waifus...
And fap with them
>>
>>101148098
I've never watched star trek but I appreciate it being used for a joke here.
>>
>>101148098
just remove all the bullshit like singing opera from his program it wont take that much
>>
>>101148184
Nonsense. What good is a doctor that can't sing opera?
>>
>>101147151
This table is completely useless without the batch size. But I guess it'd be batch size 1 since it shows 7B as just 6GB.
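Napkin math that lands around the 6GB figure, assuming 4-bit base weights, a small fp16 LoRA with fp32 Adam states, batch size 1, and a hand-waved overhead for activations; treat it as a crude estimate, not whatever methodology that table actually used:

def qlora_vram_gb(params_b, quant_bits=4, lora_frac=0.01, overhead_gb=1.5):
    base = params_b * quant_bits / 8           # frozen base weights, quantized (GB)
    lora = params_b * lora_frac * (2 + 4 + 8)  # fp16 adapters + fp32 grads + Adam moments (GB)
    return base + lora + overhead_gb

print(qlora_vram_gb(7))    # ~6 GB, roughly the 7B row at batch size 1
print(qlora_vram_gb(141))  # 8x22B total params -> ~90 GB by this crude estimate, before activations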
>>
>>101147157
limarp isn't slop, it's peak fanfiction
>>
>>101148258
I've combed through LimaRP before. All the scalie/furry stuff is peak anthro. That doesn't teach the model anything useful.
>>
>>101146923
this entire thing is a system prompt >>101147419
>>
File: 00015-1664642145.png (1.83 MB, 1456x1024)
>>101148308
>Implying a better pass rate for the Nala test isn't useful
>>
>>101148432
Nala needs human on feral training. LimaRP lacks feral training is what I'm saying.
There's like one human on feral dragon but it's straight up vore. The rest of the furry/scale stuff is just straight up anthro with no attempts made to describe anatomical interaction in any novel fashion
>>
>>101141232
Llama.cpp maintainers have decided that they don't want the server to be anything more than a reference implementation to help them test things.
>>
>>101148098
He's literally me btw.
>>
I'm pretty sure we're overdue for a good model release. Where is it?
>>
>>101148528
2MW
>>
>>101148528
Cohere are on it.
>>
>>101148528
llama 3.5 in july
>>
>>101147829
this is simultaneously gross and hilarious
>>
>>101148528
I'm on it
>>
>>101148137
A single 3090 running L3 8B at 8_0 or even fp16 is all you *reasonably* need. Beyond that, you're entering the slippery slope of "it's never quite enough" and "if only I had one more..."
>>
>>101148710
8B doesn't catch up to 70b before fp32 though
>>
>>101148710
>A single 3090 running L3 8B at 8_0 or even fp16
Coincidentally, I tried L3 8B at 8_0 and fp16 today, for RP. All it would do is babble incoherently and run on and on. Maybe it was because I tried abliterated (because last time I 8B'd it was coherent but low quality and balked at everything) but also not an enjoyable experience.
>>
>>101148528
no, it's over.
>>
>>101148756
ERPers should also consider model intelligence against reply speed. You might run 70B but have to wait 10-30 seconds for a reply, vs. nearly instant replies from 8B.
>>
>>101148756
>8B doesn't catch up to 70b before fp32 though
???? Huh? Am I missing some magical herculean leap in performance at full precision that would let 8B surpass 70B? We talking Q1 levels of brain damage on the 70B?

>>101148783
You DID get instruct, right? Not base llama 3?
>>
>>101148783
>babble incoherently
yeah, you did something wrong. I feel it's possible you're one of those retards that tweak the model alpha and forgets about it.
>>
https://x.com/Etched/status/1805625693113663834
>With over 500,000 tokens per second running Llama 70B, Sohu lets you build products that are impossible on GPUs. One 8xSohu server replaces 160 H100s.
>first specialized chip (ASIC) for transformer models
>>
>>101148826
I tried
>Llama-3-8B-Instruct-abliterated-fp16
>Llama-3-8B-Instruct-abliterated-q8_0
It wasn't like spewing complete nonsense, but it was like it was on a sugar high and throwing in lots of choppy short sentences and *asterisk crap* and running on for way too long while screwing up reference to events of the last exchange or which character it was supposed to be.

>>101148827
Oh, hi Mark. It's good to know that the requisite guy who waits for someone to have a problem and then calls him a retard while offering nothing constructive is back. Going most of the morning without that was so disquieting.
>>
>>101148867
Ok. How many kidneys do I need to harvest and sell to afford one?
>>
>>101148906
You just need tree fiddy.
>>
>>101148867
>One 8xSohu server replaces 160 H100s.
How many VRAM and what's the price of 8xSohu compared to 160 H100?
>>
>>101148937
https://www.etched.com/
Seems like the only information they've divulged so far is the purported t/s.
It's hype only because it's probably not far into development and I look forward to never hearing about it again.
>>
>>101148588
Damn, I thought it was June
>>
>>101149079
July is next week
>>
File: file.png (108 KB, 635x782)
>>101148867
even if fake and gay, dedicated chips for ai meme is good, anything to kill nshittia.
>>
>>101148905
>while offering nothing constructive is back
I mentioned your alpha might be fucked, you mouth breathing ape
>>
>>101148528
mistral guys are going to drop a REALLY good open source model very soon
t. work for them
>>
>>101149155
These dedicated chip companies pop up every few months, make bold claims, suck up investor funds, then nothing ever comes of them.
>>
>>101149179
>Mistral
>really good open source model
This is too absurd even for a fic. Mistral is irrelevant nowadays.
>>
>>101149159
The only alpha I know about is the estimated rate of Type I Error. Is that somewhere in Kobold's settings, perhaps under a different name?
>>
>>101149218
All they have to do is use the same dataset they used on miqu on llama 3.
>>
>>101148756
8B doesn't catch up to 70b before fp64 though
>>
>>101149218
They just hit the $5B valuation a while ago. Surely they're putting all the investor money to good use.
>>
>>101149277
lol
>>
>>101149277
They sure are! And all you have to do is pay a fee for the API to see the fruits of their labor :D
>>
>>101149179
I don't believe you. Mistral has lost it after being bought by microsoft. It will be full of safetyslop and even more positive than mistral, isn't it?
>>
>>101149245
Dunno that worked because their data was better than the llama2 data, might be different for llama3
>>
>>101148476
that's simply not true
>>
Does no one dislike cohere for not releasing a base model?
>>
>>101149413
Then why did they remove multimodal?
>>
>>101149420
no, not really.
>>
Does anyone here even care about multimodal? What's the use case?
>>
>>101149425
precisely because the stability of the server is more important than poorly implemented features like multi-modal
>>
>>101149277
Into slopping it to be the perfect safe AI assistant, maybe. These companies are basically only good for one model release, once they become successful and get bought out it's over.
>>
So anyone got a favorite model for writing long erotic stories? I don't mean the goyslop, I mean the explicit erotic stories.

Preferably under 20B model
>>
>>101149442
>feed it an image of a UX
>Recreate this UI in html/css/js for me
>>
>>101147181
I'd guess the leaderboard broke because of a bug, and the surprise will be a useless improvement to their leaderboard
>>
>>101149442
it's the most important development. what do you do when you run out of text data? you have to find other sources of data (modalities). it will be the biggest factor in increasing model "intelligence". humans are trained on so many modalities, it has to be one of the missing pieces.
>>
>>101149425
>>101149448
They are remaking the multimodal code from the ground up based on some other changes they made right?
>>
>>101148710
No, 8B is unusable. If I still had a single 3090, I would cope with MoE models. But getting a second one is definitely worth it.
How is every mikufag this retarded?
>>
>>101149502
more or less. the server needed a big refactor, and as part of that refactor the multi-modal support was removed because the implementation was not very good. the plan is to add it again in the future together with a big refactor of the multi modal model support. llama.cpp never really supported multi-modal models, it was added as an example using ggml to obtain the embeddings, but it was never part of the core llama.cpp library.
>>
>>101149502
No, they stripped it out because they thought there was a "cleaner" way to implement it. So instead of cleaning it up, they ripped out the feature entirely and left it like that for months now.
>>
>>101149567
you can always volunteer to clean it up yourself if the feature is important to you
>>
>>101149582
Maybe I would, but I only know Python, not sepples.
>>
>>101148098
Based Doctor poster
>>
>>101149582
>working for free
>working for free without any guarantees that your work will be used
>working for free without any guarantees that your work will be used, for a private company (ggml.ai)
lol.
>>
>>101149610
maintaining your own fork with the changes you made yourself is free
>>
>>101149498
I mean the multimodal functionality itself, like do you really want to give it image inputs. As for whether it might improve the LLM's capabilities, I'd hope so, but I'm not convinced after seeing the first multimodal models. I was more hopeful for it before anyone tried it out.
>>
>>101149635
I would rather just use koboldcpp at this point. Or making something from scratch.
>>
>>101148905
Make sure if you are using SillyTavern that you have the latest presets https://huggingface.co/Virt-io/SillyTavern-Presets/tree/main/Prompts/LLAMA-3/v1.9

There's a lot of stuff that barely works for LLaMA3 and will dramatically lower the quality of your roleplay.
>>
>>101149650
>Or making something from scratch.
Yeah you do that man
>>
Would someone kindly leak Opus 3.5?
>>
>>101149666
Thanks, Satan.
I haven't gone so far as Silly Tavern. But I may need to. I'm doing strange things in Kobold's Arist's Notes interface.

I've been trying to find a way to get the AI to have kind of a meta conversation about its writing without actually interrupting the RP. And it is kinda working.

The problem is that it feels like I'm calling a 976 number, and sometimes I get someone who's worth the $4.99 a minute, and other times I get a moron.

Like, I had a really long RP run till I guess context overran enough that everything fell apart (though the AI apologized for assigning a previously encountered character's name to a new character; that's where I figured context was trashed), and then I did a post mortem and worked with it to revise my meta conversation stuff and it was feels good man.

Then I start a new RP with the same model, and it's mucking up the meta convo, only half following instructions, outright telling me that it's ignoring directives to make the RP easier to follow by citing the scenario (odd, since I provided the scenario in the first place).

But a reboot of Kobold and now it looks like I've got a partially useful instance going. (It's goofing directives but at least doing most of the meta right.)

Is there value in restarting Kobold after a while to clean out the bit buckets or is this placebo?
>>
>>101149420
They didn't release a base model? That sucks. I do dislike them more relative to other companies if that's true.
>>
is qwen 72b really old gpt4 level?
>>
>>101149910
I'd rather they leak 4o, personally. Imagine having its voice and image gen capabilities.
>>
>>101149937
>Old GPT-4 level
Thank god it's not just me. I dunno what the fuck they did to base 4, but it sucks giga ass now.
>>
>>101149638
you can send dick pics to your waifu, you can send her pics of herself so that she knows what she looks like, you can talk to her, she can moan for you, sing to you, she can generate pics of herself. there's a lot of possibilities. just think of all the things you'd do if you had a long distance relationship. most exchanges might be text, but there'd be a lot of other things.
>>
>>101149955
4o gens images?
>>
>>101150012
it can but they'll never let you use it because they hate fun
>>
>>101150012
Yes.
>>
>>101149420
Why? No one is doing jack shit with the base models we do have.
>>
>>101149676
I mean, make something from scratch while using llama.cpp as a library. I certainly don't think I can make everything from scratch.
>>
>>101149955
We can dream, but even if 4o leaked, no one would be able to run it, it probably wouldn't even be able to be quanted since it wouldn't be in the right format.
>>
>>101146923
Thank you, what I meant was things like system prompt prefix/suffix, user prompt prefix/suffix, etc, but I also needed an improved story string, so that's very helpful. The system prompt would be useful too. I have no experience with CR+ prompting, just the usual alpaca, vicuna, chatml, llama3.
>>
>>101150048
>it can but they'll never let you use it because they hate fun
why not? OpenAI already let us gen images with dalle3
>>
I just use the python llama.cpp library, why does no one seem to use it?
>>
>>101149963
but you kinda already can do all that, chaining multiple models together.
>>
>>101149471
oh no
saars...
what we do
it over
>>
>>101150187
it's a very janky experience, native MM makes it a lot better
>>
>>101150210
Is talking like a servile Indian man still funny on this board?
>>
>>101150100
python is bloat

you need 50+GB for garbage with no portability
>>
>>101150226
>saar stop talking bad about us indian sirs benchod!
aka "No Fun Allowed" police, you will never be a janny.
>>
>>101149910
Why doesn't Anthropic release their older models to public? Nobody would even care if Claude1 was leaked since we have much better models already. Or are they full of EA shit?
>>
>>101150099
It's too dangerous given how much better it is or they haven't found a way to reliably watermark it yet without ruining quality.
>>
>>101150267
or it's just bad
>>
>>101150262
Because they would gain nothing from doing that.
>>
>>101150226
this isn't reddit faggot, if you find this offensive maybe you should vent your frustrations to your wife's boyfriend, nigger
>>
is there a bigger difference between you and a panda or between you and GPT-4o?
>>
>>101150315
There is basically no difference between me and a sad panda.
>>
>>101150280
i think this is much more plausible. converting an image to latent and understanding vague attributes of it is a completely different ball park from rendering pixel-by-pixel and having it look good.
>>
>>101150225
assuming the MM model becomes sufficiently proficient in all of the modalities.

Using specialized models for each function will allow each to be optimized for its task, but it does require a rather sophisticated dispatcher module that glues it all together.

I couldn't say which I think holds more promise in the long term.
>>
>>101150226
it will never not be funny, jeet
>>
What is bitnet and why should I care about it?
>>
>>101150297
Not every action has to be gainful, they could do it just as a gesture of goodwill to open source community.
>>
>>101150403
let's say, a true unquantized 34B bitnet model in ~12 gb of vram - smol size, same quality as f16 or whatever full precision.
>>
>>101150403
Bitnet is basically a transformer architecture, but the difference is that the weights are ~1.58 bits (ternary) instead of 16 bits, and they claim that pretraining at 1.58 bits gives the same accuracy as fp16, so basically we'll be eating really good with that one; just imagine a 90b bitnet that can be run with only a 24gb vram card and has the same accuracy as an fp16 90b transformer model
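Napkin math for that claim (weights only, ignoring KV cache and activations, and assuming the ternary weights really pack down to ~1.58 bits each):

def weight_gb(params_billion, bits_per_weight):
    return params_billion * bits_per_weight / 8  # 1e9 params * bits / 8 = bytes, i.e. GB

print(weight_gb(90, 16))    # fp16 90B   -> ~180 GB
print(weight_gb(90, 1.58))  # bitnet 90B -> ~18 GB, leaves room for KV cache on a 24GB card
print(weight_gb(34, 1.58))  # bitnet 34B -> ~7 GB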
>>
ok i take it back
stheno 3.2 is retarded

mixtral 8x7b limarp zloss, i'm back...
>>
>>101150187
>>101150225
>>101150368
Patching different models together after the fact means a lot of information loss happens in the middle. The quality would suffer a lot. The reason 4o is so good at voice/image is because it's all native.

>>101150280
>>101150339
They have shown that it's a step above current dedicated models. Sure, they might've been cherry picked, but I don't think it's that unbelievable that it's true. We always knew that having multiple modalities would improve performance one day, but we just didn't have the right architecture to make it work.
>>
>>101150403
Bitconnect was a cryptocurrency investment platform that operated from 2016 to 2018. It was ultimately exposed as a Ponzi scheme that defrauded investors of billions of dollars.

Key points about Bitconnect:

>The Scam: Bitconnect lured investors with promises of high daily returns through its "lending program." This program claimed to use a proprietary "trading bot" and "volatility software" to generate profits from cryptocurrency market volatility. However, there was no such technology, and the returns were paid out using funds from newer investors.
>The Collapse: In early 2018, Bitconnect shut down its platform and the value of its BCC token plummeted. Investors lost significant amounts of money, and many were left financially devastated.
>Legal Consequences: The founder of Bitconnect, Satish Kumbhani, was indicted on multiple charges, including wire fraud, conspiracy to commit wire fraud, operation of an unlicensed money transmitting business, and conspiracy to commit international money laundering. Several other promoters were also charged and convicted.
>Lessons Learned: The Bitconnect scandal serves as a cautionary tale for cryptocurrency investors, highlighting the importance of due diligence and skepticism towards promises of guaranteed high returns.
>>
File: 1716719286072843.png (583 KB, 918x916)
>>101150443
>he fell for /lmg/ gaslighting
>>
>>101150452
So it's the solution to the stable diffusion dead end, which will revive /h/hdg?
>>
>>101150403
A paper showed you only need 3 bit of precision instead of 16 for a model to remember everything with no loss.
Which is great, but they trained on a small number of tokens, so it never needed that much precision to begin with.
It's like saying a one car garage can hold just as many cars as a 16 car garage, as long as you only have one car.
>>
>>101150059
Ah that makes more sense, I thought you were implying you would make EVERYTHING from scratch.
>>
>>101150486
Only if it gets released/leaked, though the memory requirements may or may not be out of reach for local.
>>
>>101150454
hey hey heyyyyyy
>>
>>101149179
After Codestral, they'll probably release Mistral-20B-Instruct, but I don't expect anything groundbreaking. Their instruct tunes have become increasingly more cucked and the format feels limited.
>>
>>101150487
>A paper showed you only need 3 bit of precision instead of 16 for a model to remember everything with no loss.
Bitnet is 1.58bit though, not 3bit
>>
>>101150458
i was one of the "shills"
it worked good to some extent, but alas, it flopped hard at some point and subsequently became unusable. Q6_K Mixtral saved the day without a hitch.
>>
>>101150417
the open source community wouldn't give them any money, and guess what, everything companies do is with profit in mind, because that's what allows them to make new and better models.
Releasing their old models wouldn't be free either, I bet they would need to sort out bureaucracy, pay someone to write the blog posts, etc...
>>
>>101149179
the last good open model from them was Mixtral and that was 9 months ago, it better be some good shit anon
>>
>>101150553
A paper showed you only need 1.58bit of precision instead of 16 for a model to remember everything with no loss.
Which is great, but they trained on a small number of tokens, so it never needed that much precision to begin with.
It's like saying a one car garage can hold just as many cars as a 16 car garage, as long as you only have one car.
>>
>>101150563
just have some anon go *oops i dropped my claude weights all over the place teehee*
>>
File: 1610351662756.jpg (46 KB, 1024x580)
>>101150536
I'm still not tired of this meme.
>>
>>101150454
>Bit(((con)))Net
>>
>>101150570
8x22 probably cost them more, I hope they just continue training llama3 or qwen2 with a magical recipe
>>
>>101150593
everything with "bit" it its name is doomed to be forever associated with some tainted shady shit at this point

bitnet
bitcoin
bittorrent
>>
at which point am I allowed to say "llama 4 when"?
>>
>>101150621
llama is a dead end. you know it's going to be bad when ylecunn has given up on llms and is publicly shitting on them at any given opportunity
>>
>>101150621
when timeToRelease === 2MW
>>
>>101150636
Llama 4 could be a LMM though.
>>
>>101150636
>be meta
>gimp your models so they don't say no no words about *any protected group of freaks & schizos* in 2024
>given the architecture and nature of LLMs - final model performs very bad
wow!
>>
>>101150636
I think we are still far from a dead end, but we will never get AGI from LLMs. I don't need AGI though, I would be happy with 3.5 Sonnet @home.
>>
>>101150665
who would've thought that lobotomizing a model to not recognize certain pattern would make it dumber overall, me am SHOKED
>>
How long did the qwen team take between releases? 1.5 and 2 I guess?
>>
wait so sillytavern was made by the company that trained command r+?? how the fuck?
>>
>>101150712
1.5 to 2 was about 4 months, but I would not draw any conclusions from that. the amount of time that goes into new models depends on a lot of variables
>>101150784
cohee != cohere, kek
>>
name sounds like someone with lisp saying coffee
>>
File: file.png (10 KB, 289x96)
>>101150826
fuck
>>
>>101150784
Technically SillyTavern is just a fork of Tavern, it started as a patch for OpenAI support, which was made by anons.
>>
>>101150784
it was actually trained by a popular mid-2000s prog rock band.
>>
>>101150665
Unironically not their fault. "People" shat on them for releasing Galactica because it "spewed misinformation and racism". It's a miracle they even still release base models. This isn't the same as a small company like Mistral releasing a relatively uncensored thing, since they're nobodies. Maybe one day the cost of training will be low enough that anyone can train huge models, but for now it's only the ones with money (that have to abide by investors and public scrutiny).
>>
>>101150863
They give us the base models trained for intelligence. We can train the smut, copyrighted materials, and FBI statistics back in if we want. But the people with the resources and interest only bother to train braindead 1 epoch loras on gptslop logs.
>>
>>101150915
now with that new magpie paper, if true, we'll get much better models when training on gpt
>>
File: PXL_20240625_310830628.jpg (746 KB, 1498x1436)
oh hai /lmg/
i haz boxes
halp me unpack?
>>
>>101151211
*touches box*
>>
>>101151211
*sniiiiiiiiif*
>>
>>101151211
*shits on your box*
>>
>>101151255
*eats it*
>>
Any cards that do interesting experimental prompt stuff? I just found a card where they use the lorebook feature to insert information depending on the "Day" stat. I want to see more stuff like this.
>>
>>101151211
*bites lower lip, thinking about the journey ahead, eyes sparkling with anticipation*
>>
File: PXL_20240625_313131315.jpg (541 KB, 1401x1227)
>>101151219
>>101151249
>>101151255
>>101151285
omg it is migu
looks like she had a rough trip
>>
File: tet_tunic.png (2.85 MB, 1328x1992)
>>101151270
This may be of interest to you:
https://github.com/ThiagoRibas-dev/SillyTavern-State/
> The extension allows the user to configure a number of prompts that are automatically sent after the AI's response to the User's prompt, adding the result of each prompt as an individual message to the chat, as a form of persistent context that gets update after each turn
>>
File: ComfyUI_00692_.png (1.17 MB, 832x1216)
>>101151336
omg it is piku
>>
Has anyone ever tried using RAG for T2T generation? Basically I have a dataset of sentences and I'd like to rewrite them in a particular way, notably changing certain words (which, in my language, also implies other modifications to the sentence, for example in terms of number or gender). I thought that a RAG database the system could rely on to find the closest sentence structure might help it generate better output. Actually you can consider my task as close to a translation task. I tried searching for RAG T2T but it doesn't seem very popular right now. Any ideas?
>>
>>101151348
Interesting, but are there any cards that use this to do unique things that aren't just stat tracking?
>>
>>101151398
I really doubt you're going to get good results that way. Embeddings prioritize content over grammar. You'll likely be frustrated with the match distances you'll get.
Why not try it? Shouldn't take that long to implement a test and see for yourself if the results are good enough.
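If you do try it, the minimal version is just retrieval plus few-shot prompting, something like this sketch (sentence-transformers for the embeddings; the model name and data here are placeholders, swap in whatever you actually use):

from sentence_transformers import SentenceTransformer  # pip install sentence-transformers
import numpy as np

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder embedding model

pairs = [  # your existing (original, rewritten) sentence pairs
    ("original sentence 1", "rewritten sentence 1"),
    ("original sentence 2", "rewritten sentence 2"),
]
src_vecs = embedder.encode([src for src, _ in pairs], normalize_embeddings=True)

def build_prompt(sentence, k=3):
    q = embedder.encode([sentence], normalize_embeddings=True)[0]
    top = np.argsort(src_vecs @ q)[::-1][:k]  # cosine similarity (vectors are normalized)
    shots = "\n\n".join(f"Original: {pairs[i][0]}\nRewritten: {pairs[i][1]}" for i in top)
    return f"{shots}\n\nOriginal: {sentence}\nRewritten:"

# feed build_prompt(...) to whatever backend you already run; the retrieved pairs act as
# few-shot examples showing the word swaps plus the number/gender agreement changes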
>>
File: PXL_20240625_212137575.jpg (524 KB, 1908x1197)
>>101151356
fuck. more boxes
this will take longer than i thought
>>
File: ComfyUI_00142_.png (875 KB, 1024x1024)
>>101151642
why did it take you so long to open the box..
>>
File: 20240625_180100.jpg (85 KB, 800x550)
>>101151336
oh hi there
>>
>>101151662
Cool Gardevoir plushies.
Regular and shiny!
>>
File: PXL_20240625_220646464.jpg (895 KB, 1989x1369)
>>101151657
i've been drinking plz understand
>>101151662
mounty thingy got bent. i guess they couldn't be bothered to invest $5 in packing foam
came with a couple ancient M60s. guess i can sell them or something
>>
>>101151776
so anon what did you order
>>
File: PXL_20240625_220707121.jpg (794 KB, 1435x1679)
>>101151827
i think it's a computer
picrel is an nvlink sxm board
>>
>>101145313
>BITCONNEEEEEEEECT
>>
>>101149094
Yeah, that's why I was hyped for this week.
>>
>>101150454
>>101150536
>>101150587
>>101150593
Fuck I'm late to the party.
>>
File: ComfyUI_00343_.png (1.83 MB, 1024x1024)
>>101151859
ru sure u should be opening such valuable items when drunk
>>
File: PXL_20240625_222408282.jpg (643 KB, 1400x1484)
>>101151902
no but what's the worst that could happen?
>>
>>101151976
thats a lot of stuff anon, how did you acquire that box
>>
>>101152022
i found it. don't worry about it
>>
>>101152046
k im gonna find u then....
>>
>>101152046
...worry about it
>>
>>101151976
Cutting ribbon cables with Miku
>>
File: PXL_20240625_231559779.jpg (1011 KB, 1977x1205)
>>101152052
>>101152115
>>101152188
uWu wut r u going to do when u find me?
>>
File: Capture.png (74 KB, 1296x1011)
>>101152617
>>
>>101152625
So now we have 2 V100 max anons?
>>
>>101152625
This is good.
>>
>>101152625
>32GB
nice. did you get a good deal for those? seems hard to justify doing now if not since prices will crash next year as datacenters dump them
>>
File: PXL_20240625_233002121.jpg (797 KB, 1513x1569)
>>101152670
>>101152697
>>101152712
ok meta. i'm ready for 405b
>>
>>101152771
Miku, Guardian of Volta
>>
>>101151412
You could use the prompts to have the model output specific information that can trigger lorebook entries.
The actual point of that extension is simply to lessen the burden on the model by feeding instructions one (or a couple) at a time, since too many instructions confuse smaller models and make them extra dumb.
I will implement a keyword feature, similar to lorebooks, so that these prompts can be triggered conditionally.
>>
Smaug is retarded, every version of it is always retarded and much dumber than whatever model it was based on, and yet mergers always keep including it in their mixes for some reason
>>
>>101152617
i'm going to get behind you, put my hand over your mouth.. and then you'll fall asleep because over my hand there was a cloth
after that i'm going to undress....

..undress the rig and steal all the parts
>>
>>101152958
>smaug is retarded
>mergers are retarded
it's like poetry
>>
>>101152771
are you going to run it in 2bit or what?
>>
File: 1715277591317631.jpg (1.27 MB, 2048x2048)
>>101151356
Those hands are god-tier for SD. What model/workflow?
>>101152625
SHEEEEEEEEEEEIT
Finally someone itt with moar VRAM than me
>>
Where can I find a slop-free RP dataset?
>>
>>101153232
LeCun wants AI to be more than just LLMs. Maybe even until they have consciousness. Imagine, your local Miku having a real consciousness. She'll finally be real, not just a mimicry.
>>
>>101151336
embarrassing manchild
>>
>>101153265
limarp
>>
File: 1705326754733957.jpg (237 KB, 1920x1080)
Any kind soul that could recommend a TTS to make Neco-arc read my unending backlist of papers?
>>
So are there any bitnet/1.58bpw models available to run with significant numbers of parameters? I have 32gb vram, i keep hearing about this shit but the only models i've seen are teeny.
>>
>>101153282
you're posting in local manchild general
>>
what would you do if you had like $100,000 to spend on hardware?
spoke to higher ups today about the benefits of hosting our own server versus renting time on someone else's. if i can make a good argument i can probably get some money diverted.
>>
>>101153284
>70% furry and 30% loli
damn
>>
>>101153458
>% totals to only 100%.
Have they been slacking or is there only space for one tag at a time?
>>
>>101153444
What are your requirements? For that much you could probably build with 2xH100 for about 160GB total VRAM.
>>
>>101153444
Used consumer or server hardware, for example 30-50 of 4-6x3090 or 4xv100 machines. But that stuff isn't supported or maintained, so not something your company would buy, also the power bill would be hilarious, but imagine 100-130 3090s, just 2.4TB of VRAM? if you had GPT-4 weights you could even run it! Of course the interconnect and networking will kinda suck, but depends on what you need...
>>
File: 1702200312013572.png (31 KB, 897x378)
Hey friends, where do I add these things? Is it under "Story String"? Instruct Mode Sequences have similar things written on it but they're separated and slightly different.
>>
>>101153444
>>101153563
H100s don't make sense unless you're filling a datacenter with them, I would put together an A100 rack, and if that doesn't work out i would just be like "let's buy a bunch of quadros/4090s"
>>
File: ComfyUI_00690_.png (1.18 MB, 832x1216)
>>101153232
>Those hands are god-tier for SD. What model
autismmix ( https://civitai.com/models/288584?modelVersionId=324619 ) ( has ponyxl as base (ponyxl is good) )
>/workflow?
here's anon's workflow (better besides hair color)
https://files.catbox.moe/5y0e12.png
in my workflow im using tensorrt and no loras, nothing special really
>>
>>101153691
It's in the instruct mode sequences.
Silly Tavern already has the template built in if you are using that.
>>
File: 1703934839568504.png (173 KB, 1866x631)
>>101153722
It's slightly confusing because I don't really understand the correct place I should be putting each line in.
Left is the original, middle is the one I've modified, right are the instructions.
>>
>>101153563
chemical manufacturing.
proposals they like are stuff like processing and categorizing like 30 years of documents and data.
some sort of internal tool that could parse them and pluck insights out on demand.

even that was something they were really excited about and i don't think we'd need an unbelievably beefy machine to do it, but they're open to the idea and it'd be sweet to get to fuck around with serious hardware.

i figure with that sort of compute you could probably explore forecasting and anomaly detection for production processes. not really LLM but just a secondary benefit of a dedicated server. there is a shitload of real time data (temperature, flowrates, pressure, etc).

we have a couple 4090s but there's only so much you can do. i'm kind of secondary to the group who is doing this. i'm doing more machine learning stuff but we work together.
>>
>https://websim.ai/c/R6ochh0wCk3sLl40D
Huh...
>>
>>101153769
The one on the left is already correct according to the instructions.
>>
>>101153836
Ah...okay...I apologize for the dumb question...
>>
lole
https://websim.ai/c/R6ochh0wCk3sLl40D
>>
Thread theme anon made it in! A shame about it thinking we'd be safetyfags though.
https://websim.ai/c/R6ochh0wCk3sLl40D
>>
>>101153894
>sign in
No.
>>
>>101153935
You should be able to see the links fine. Just don't click anything, that triggers a log in screen.
>>
>>101153813
>>101153847
>>101153894
Oh I'm retarded, these are the same links.
>>
>>101153956
No.
>>101153984
Yes.
>>
>>101153847
Intended URL: https://websim.ai/c/bA64LoXlbn3vs2u2M

>>101153894
Intended URL: https://websim.ai/c/578BMgWKq5HmYcp7a
>>
>>101154001
Just having a laugh playing with this my man. You can do what you want.
>>
>>101154040
ugh fine ill let you play with it
>>
>>101153844
It's alright.
Look at the final prompt either in the browser's console or in the backend window to see how the prompt template is actually being used. That'll help you understand how those fields are being applied.
>>
>>101154010
lol, nice
>>
>>101144935
I've been under a rock, is Midnight Miqu still queen of the 32k context 70B models?
>>
https://github.com/beowolx/rensa
>>
>>101150443
>>101150558
>the tiny 8b model doesn't outperform mixtral, therefore it's garbage
are people really this retarded?
>>
File: -.png (8 KB, 472x80)
>enable dry
>doesn't show up in ui
wat do
>>
>>101154308
Meta claimed 8B beat previous generation 70B. So surely it can beat ~42B Mixtral.
>>
>>101154278
>MIT
go advertise your shitty side project somewhere else
>>
>>101154341
Oh yeah, meta's claims were absurd. But it's still a lot better than any 7B model we had before.
>>
>>101154341
Meta said that to generate hype, obviously that's pure cope.
>>
>>101148241
Huh? This is the minimum hardware required. Of course it's batch size 1, retard.
>>
>>101154384
bullshit, everyone here was running "hurr durr this 8B model is GPT-4 killer!!!" first weeks after llama3 release.
>>
File: file.png (113 KB, 1184x747)
>>101154341
check picrel, llama-2-chat tunes were shit, remember anon? oh wait you're a newfag~
>>101154418
[citation needed]
>>
>>101154418
what other things do the voices in your head tell you?
>>
File: MikuAten.png (1.56 MB, 832x1224)
>>101154182
>Midnight Miqu
No. Solar Eclipse Miqu is the new sota
>>
>>101154418
I'll take "things that never happened" for 500
>>
>>101154341
They never said that. What Zucc said was that it's pretty close but not in every aspect.

Also Mixtral beat 70B previously, according to anons, so it makes sense that an 8B that's almost but not quite at old 70B level still does not beat Mixtral.
>>
>>101154423
>newfag
>for some obscure general with extremely low activity no one knows and cares about
you for sure got him! /s
>>
>>101154453
>/s
go back
>>
https://x.com/brave/status/1805781843393773654

Mistral exec says they won't release Mistral Large due to business responsibilities taking precedence over openness.
>>
File: 1569991762929.jpg (93 KB, 874x612)
>>101154434
>>
>>101154462
>no mention of mistral medium or next
>>
>>101154462
Nothing wrong with that. It's just their early marketing before they got acquired that was the issue. Using "open source" to hype themselves up and then close things off later. Typical.
>>
>>101154453
based 2025oldGOD destroying clueless newfags
>>
>>101134899
>>101127795
I look forward to seeing the results of this (different anon here catching up on threads).
>>
>>101154406
That wasn't stated in your post or in that image, retard.
>>
File: file.png (2.24 MB, 1430x1448)
>https://huggingface.co/Sao10K/L3-8B-Stheno-v3.3-32K
b-bros..?
>>
>>101154707
people say it's a downgrade from 3.2
>>
>>101154707
i got better results from euterpe in 2021 than any l3 8b model, just take the tokens and run cr/mixtral if you're poor
>>
>>101154752
>mixtral
mixtral limarp zloss eh?
>>
>>101154453
>/s
anon...
>>
>>101154468
based Chambraigne
>>
>>101154782
lol /s
>>
>>101154782
i don't care about it being used by leddit exclusively.
>>
>Only getting ~0.8 t/s on CR+ GGUF.

Sorry, what's holding back CPU inference speed? RAM frequency or CPU clock speed? Cause AMD Ryzen 9000 series is out next month. If it helps t/s to upgrade, I would do it.
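My rough understanding, back-of-envelope (assuming generation at batch size 1 is memory-bandwidth bound; numbers are approximate, correct me if this is wrong):

# every generated token has to stream all active weights from RAM once,
# so tokens/s is roughly capped at memory_bandwidth / model_size
bandwidth_gbs = 2 * 6000e6 * 8 / 1e9  # dual-channel DDR5-6000: ~96 GB/s theoretical
model_gb = 104 * 4.5 / 8              # CR+ (104B) at ~4.5 bits/weight: ~58 GB of weights

print(bandwidth_gbs / model_gb)       # ~1.6 t/s ceiling; ~0.8 t/s real-world is in that ballpark
# more channels / faster RAM raise the ceiling; CPU clock mostly affects prompt processing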
>>
>>101154877
Maybe if you defect to the llamafile camp you will get better t/s with AVX-512
>>
>>101154877
even cr non-plus is glacial compared to 70b for me
>>
>>101154877
Why CPU over p40?
>>
>>101154883
NTA, but why is llamafile faster? Did jart add some custom AVX-512 optimizations? If so, why wouldn't llama.cpp bother adding them?
>>
>>101154900
llamafile has a shit license and conflicts with MIT. He contributed some bits to llama.cpp, but only so that he doesn't have to keep patching it on his side.
>>
File: .png (389 KB, 918x916)
>>101154931
>MIT cucks BTFO'D by tranny
>>
File: 8f8f8.u3.jpg (28 KB, 600x600)
>>101154900
Check it anon:
https://github.com/Mozilla-Ocho/llamafile/pull/464
https://github.com/Mozilla-Ocho/llamafile/pull/453
And one for MOE & AVX2:
https://github.com/Mozilla-Ocho/llamafile/pull/428
>>
>>101154943
*GPL licenses are a nightmare to read. Just like their list of pronouns and mental disorders.
>>
>>101154877
That's what I get.
I just do other things while it runs.
It's kinda like RP with an actual person who also has to type and live life.
>>
>>101154931
>implying MIT itself isnt shit
>>
>>101154985
>no warranty
>keep copyright
Everyone can use it. That's it.
>>
>>101154960
rent free
>>
File: IMG_1488.png (367 KB, 1055x896)
>>101155012
>Everyone can use it. That's it.
>>
>>101155059
How is that false?
>>
File: bingo.png (152 KB, 498x402)
>>101155012
>>
the licensesperg really doesn't stop. sign of autism.
>>
>>101155012
you can even fork a MIT program to whatever troon license you want as our lovely Jart did indeed do. only nocoders really give a fuck though I've noticed
>>
File: retard.png (301 KB, 668x735)
>>101155078
>How is that false?
>>
>the sharteen and jart are ideological allies
grim
>>
File: ACK.jpg (132 KB, 760x704)
>>101155223
>implying
>>
>>101155233
no I was being literal. both you and jart chimp out whenever MIT licenses show up
>>
>>101155223
agpl>apache THOVGH
>>
>>101155253
whatever license you simp over doesn't matter when no one uses whatever code you write THOUGH
>>
File: thats the point.png (239 KB, 498x402)
>>101155268
>when no one uses whatever code you write
>>
File: Untitled.png (552 KB, 720x915)
Large Language Models are Interpretable Learners
https://arxiv.org/abs/2406.17224
>The trade-off between expressiveness and interpretability remains a core challenge when building human-centric predictive models for classification and decision-making. While symbolic rules offer interpretability, they often lack expressiveness, whereas neural networks excel in performance but are known for being black boxes. In this paper, we show a combination of Large Language Models (LLMs) and symbolic programs can bridge this gap. In the proposed LLM-based Symbolic Programs (LSPs), the pretrained LLM with natural language prompts provides a massive set of interpretable modules that can transform raw input into natural language concepts. Symbolic programs then integrate these modules into an interpretable decision rule. To train LSPs, we develop a divide-and-conquer approach to incrementally build the program from scratch, where the learning process of each step is guided by LLMs. To evaluate the effectiveness of LSPs in extracting interpretable and accurate knowledge from data, we introduce IL-Bench, a collection of diverse tasks, including both synthetic and real-world scenarios across different modalities. Empirical results demonstrate LSP's superior performance compared to traditional neurosymbolic programs and vanilla automatic prompt tuning methods. Moreover, as the knowledge learned by LSP is a combination of natural language descriptions and symbolic rules, it is easily transferable to humans (interpretable), and other LLMs, and generalizes well to out-of-distribution samples.
Mikunator
>>
>>101144935
>https://github.com/OpenBMB/llama.cpp?tab=readme-ov-file#run-the-quantized-model
for:
>for openbmb/MiniCPM-Llama3-V-2_5-gguf/ggml-model-Q4_K.gguf?
which damn file do I use and where is the help output? --help just gives:
./llama-gguf --help
./llama-gguf: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.32' not found (required by ./llama-gguf)

the folder is full of shit:
llama-b3209-bin-ubuntu-x64/build/bin$ ls
LICENSE llama-q8dot
llama-baby-llama llama-quantize
llama-batched llama-quantize-stats
llama-batched-bench llama-retrieval
llama-bench llama-save-load-state
llama-bench-matmult llama-server
llama-cli llama-simple
... tl;dr
...
llama-lookup-stats test-sampling
llama-parallel test-tokenizer-0
llama-passkey test-tokenizer-1-bpe


>[2024 Jun 12] Binaries have been renamed w/ a llama- prefix. main is now llama-cli, server is llama-server, etc (ggerganov#7809)
what the fuck does all this mean? Last time I used llama.cpp was when it first came out on windows, and now I'm trying to run multimodal on ubuntu and it's nothing like it used to be.
I know I'm retarded. Please just tell me which button to press. Is it llama-cli or llama-gguf for openbmb/MiniCPM-Llama3-V-2_5-gguf/ggml-model-Q4_K.gguf?
>>
File: 1472860069099.png (191 KB, 600x979)
The girl: My GPU (8gb vram)
The burger: Models that won't fit exclusively on my GPU

Someone who is good at eating burgers please advise.
>>
>>101155630
cpumaxx
>>
File: Untitled.png (722 KB, 1166x901)
Retrieval-Augmented Mixture of LoRA Experts for Uploadable Machine Learning
https://arxiv.org/abs/2406.16989
>Low-Rank Adaptation (LoRA) offers an efficient way to fine-tune large language models (LLMs). Its modular and plug-and-play nature allows the integration of various domain-specific LoRAs, enhancing LLM capabilities. Open-source platforms like Huggingface and Modelscope have introduced a new computational paradigm, Uploadable Machine Learning (UML). In UML, contributors use decentralized data to train specialized adapters, which are then uploaded to a central platform to improve LLMs. This platform uses these domain-specific adapters to handle mixed-task requests requiring personalized service. Previous research on LoRA composition either focuses on specific tasks or fixes the LoRA selection during training. However, in UML, the pool of LoRAs is dynamically updated with new uploads, requiring a generalizable selection mechanism for unseen LoRAs. Additionally, the mixed-task nature of downstream requests necessitates personalized services. To address these challenges, we propose Retrieval-Augmented Mixture of LoRA Experts (RAMoLE), a framework that adaptively retrieves and composes multiple LoRAs based on input prompts. RAMoLE has three main components: LoraRetriever for identifying and retrieving relevant LoRAs, an on-the-fly MoLE mechanism for coordinating the retrieved LoRAs, and efficient batch inference for handling heterogeneous requests. Experimental results show that RAMoLE consistently outperforms baselines, highlighting its effectiveness and scalability.
No code. I remember some anons wanting something like this. there was a prior similar paper (that they cited but didn't test against it seems) https://arxiv.org/abs/2404.13628
>>
>>101155659
I can't.
>>
>>101155630
Get more RAM.
Can you fit 64gb?
Then you can mixtral at least until bitnet
>>
>openbmb/MiniCPM-Llama3-V-2_5-gguf
how can I run this multimodal model?
>>
>>101155709
>at least until bitnet
Why is it taking so long?
>>
>>101155736
Money and risk.
>>
> “I need 2400 gb vram? damn. Can I get away with less?” “Of course, just drop the batch size from 1024 to 1 and you only need 10 gb”
Retard.
>>
>>101155630
Koboldcpp or llama.cpp running a gguf quant with some layers on cpu. Assuming you have regular ram.
>>
File: Untitled.png (119 KB, 1033x793)
Interpreting Attention Layer Outputs with Sparse Autoencoders
https://arxiv.org/abs/2406.17759
>Decomposing model activations into interpretable components is a key open problem in mechanistic interpretability. Sparse autoencoders (SAEs) are a popular method for decomposing the internal activations of trained transformers into sparse, interpretable features, and have been applied to MLP layers and the residual stream. In this work we train SAEs on attention layer outputs and show that also here SAEs find a sparse, interpretable decomposition. We demonstrate this on transformers from several model families and up to 2B parameters. We perform a qualitative study of the features computed by attention layers, and find multiple families: long-range context, short-range context and induction features. We qualitatively study the role of every head in GPT-2 Small, and estimate that at least 90% of the heads are polysemantic, i.e. have multiple unrelated roles. Further, we show that Sparse Autoencoders are a useful tool that enable researchers to explain model behavior in greater detail than prior work. For example, we explore the mystery of why models have so many seemingly redundant induction heads, use SAEs to motivate the hypothesis that some are long-prefix whereas others are short-prefix, and confirm this with more rigorous analysis. We use our SAEs to analyze the computation performed by the Indirect Object Identification circuit (Wang et al.), validating that the SAEs find causally meaningful intermediate variables, and deepening our understanding of the semantics of the circuit. We open-source the trained SAEs and a tool for exploring arbitrary prompts through the lens of Attention Output SAEs.
https://robertzk.github.io/circuit-explorer
weights linked in appendix. probably only interesting for those who want to poke around
>>
File: 4871575.jpg (6 KB, 150x150)
>>101155841
>assuming the oldfriend cute chibi vramlet burger chan poster doesn't know about ggufs
>>
Is there a reason not to get an a6000 for training? Seems like a decent upgrade from 3090.
>>
>>101155940
>>101155940
>>101155940
>>
>>101155922
One day I'll get a job and buy a new computer. You'll see! (I won't though)
>>
>>101154462
I mean, they're a small company, they can't risk giving their best model to everyone for free. Look what happened to StabilityAI, they're on the verge of bankruptcy because of that.
>>
>>101155041
you hated him because he told the truth
>>
File: file.png (159 KB, 600x600)
>>101156300




