/g/ - /lmg/ - Local Models General - Technology

[a / b / c / d / e / f / g / gif / h / hr / k / m / o / p / r / s / t / u / v / vg / vm / vmg / vr / vrpg / vst / w / wg] [i / ic] [r9k / s4s / vip / qa] [cm / hm / lgbt / y] [3 / aco / adv / an / bant / biz / cgl / ck / co / diy / fa / fit / gd / hc / his / int / jp / lit / mlp / mu / n / news / out / po / pol / pw / qst / sci / soc / sp / tg / toy / trv / tv / vp / vt / wsg / wsr / x / xs] [Settings] [Search] [Mobile] [Home]

Board

▼ Settings Mobile Home

/g/ - Technology

Return Catalog Bottom Refresh

Thread archived.
You cannot reply anymore.

[Advertise on 4chan]

[Return] [Catalog] [Bottom]

Anonymous

/lmg/ - Local Models General 04/24/24(Wed)19:56:03 No.100166886

File: TheMikuLongsForTheSea.png (2.11 MB, 1536x920)

2.11 MB PNG

/lmg/ - Local Models General Anonymous 04/24/24(Wed)19:56:03 No.100166886 Archived

/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>100161515 & >>100154945

►News
>(04/24) Snowflake Arctic Instruct 128x3B MoE released: https://hf.co/Snowflake/snowflake-arctic-instruct
>(04/23) Phi-3 Mini model released: https://hf.co/microsoft/Phi-3-mini-128k-instruct-onnx
>(04/21) Llama3 70B pruned to 42B parameters: https://hf.co/chargoddard/llama3-42b-v0
>(04/18) Llama3 8B, 70B pretrained and instruction-tuned models released: https://llama.meta.com/llama3/
>(04/17) Mixtral-8x22B-Instruct-v0.1 released: https://mistral.ai/news/mixtral-8x22b/

►News Archive: https://rentry.org/lmg-news-archive
►FAQ: https://wikia.schneedc.com
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Programming: https://hf.co/spaces/bigcode/bigcode-models-leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling/index.xhtml

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp

Anonymous
04/24/24(Wed)19:56:51 No.100166891

Anonymous 04/24/24(Wed)19:56:51 No.100166891

File: GIOgxgMagAAyUL0.jpg (86 KB, 555x680)

86 KB JPG

►Recent Highlights from the Previous Thread: >>100161515

--Paper: A Theoretical Analysis of the Repetition Problem in Text Generation: >>100166120 >>100166163
--Anon Shares Prompting Tricks for Better AI Responses: >>100165031 >>100165847
--Snowflake Arctic Instruct: A New Open-Source MoE Model: >>100161818 >>100162141 >>100161926 >>100161964 >>100162221
--Running Llama 3 Locally: VRAM Requirements and Hardware Configs: >>100162125 >>100162163 >>100162177 >>100162223 >>100162284 >>100162398
--Anon's Idea for Unevenly Sized Expert MoEs: >>100163301 >>100163327 >>100163487
--Fine-Tuning Phi-3 on MacBook Pro for Apple Bros: >>100165898
--Speculative Approach to Dynamic Resource Allocation in AI Models: >>100163702 >>100163828
--Good ERP Models for Low VRAM (Q4): >>100161554 >>100161581 >>100161710 >>100164928 >>100165111
--The Elusive Dream of Software that "Just Works": >>100165529 >>100165568
--Update: New 4chan-x Releases Available on GitHub: >>100164500
--Generating Control Vector using Llama.cpp: >>100166112
--Safety Concerns with Current L3 GGUFs and llama.cpp Changes: >>100162500 >>100162693 >>100162576 >>100162868
--Optimizing Command-R-Plus (CR+) Settings for Better Outputs: >>100162713 >>100162739 >>100164483 >>100164511 >>100164692 >>100164709
--Anon's Skepticism on Human Reading Speed and App-Program Convergence: >>100164294 >>100164397 >>100164730
--WizardLM 2 Q4 Performance and Censorship Discussion: >>100163319 >>100163363 >>100163518 >>100163373 >>100163393
--Suzume-Llama 3-8B Japanese Model on Hugging Face: >>100163105
--Crafting Hentai-Style Writing with AI Models: >>100162393 >>100162475 >>100162885
--Anon Asks About Conversation Storage Feature in ChatGPT: >>100165642 >>100166022 >>100165863
--Investigating Sentence Flow in High-Dimensional Embedded Word Vector Spaces: >>100166132 >>100166375
--Miku (free space): >>100162475 >>100162809 >>100164604 >>100164798 >>100165085 >>100165628

►Recent Highlight Posts from the Previous Thread: >>100162300

Anonymous
04/24/24(Wed)19:57:47 No.100166903

Anonymous 04/24/24(Wed)19:57:47 No.100166903

File: 5CB82485-315F-4DE1-A963-A(...).jpg (53 KB, 693x633)

53 KB JPG

ANCHOR

Anonymous
04/24/24(Wed)19:58:36 No.100166912

Anonymous 04/24/24(Wed)19:58:36 No.100166912

File: GL5Iz27bUAA92Uj.jpg (601 KB, 1856x2464)

601 KB JPG

>>100166886
mikusan

Anonymous
04/24/24(Wed)19:58:46 No.100166913

Anonymous 04/24/24(Wed)19:58:46 No.100166913

>>100166886
Phew

Anonymous
04/24/24(Wed)19:59:17 No.100166918

Anonymous 04/24/24(Wed)19:59:17 No.100166918

Just talked to a hot blonde chick

Anonymous
04/24/24(Wed)19:59:40 No.100166920

Anonymous 04/24/24(Wed)19:59:40 No.100166920

i love ai femboy butthole

Anonymous
04/24/24(Wed)20:00:10 No.100166931

Anonymous 04/24/24(Wed)20:00:10 No.100166931

>>100166886
>>100166912
>>100166913
>>100166920
good morning sir!

Anonymous
04/24/24(Wed)20:03:04 No.100166961

Anonymous 04/24/24(Wed)20:03:04 No.100166961

>>100164709
Yeah, that fits honestly, it does seem to happen when I cancel and regen too quickly. Finally done for the day and liking CR+ so far with the temp at zero and minP of 0.005. That was probably my whole issue - my go-to preset has temp at 1.02 and smoothing, apparently that really does ruin CR+. It's doing a good job so far, very creative and fun.

Anonymous
04/24/24(Wed)20:09:27 No.100167014

Anonymous 04/24/24(Wed)20:09:27 No.100167014

Can we rename the general to aicg2 already?

Anonymous
04/24/24(Wed)20:10:58 No.100167027

Anonymous 04/24/24(Wed)20:10:58 No.100167027

>>100167014
it is aicg2 already, /lmg/ was created as /aicg/ knockoff when llama-1 leaked.

Anonymous
04/24/24(Wed)20:12:23 No.100167037

Anonymous 04/24/24(Wed)20:12:23 No.100167037

File: 8uvibn7znfrb1.jpg (120 KB, 640x965)

120 KB JPG

Anonymous
04/24/24(Wed)20:12:54 No.100167043

Anonymous 04/24/24(Wed)20:12:54 No.100167043

>>100167027
I thought /lmg/ was an /aids/ knockoff

Anonymous
04/24/24(Wed)20:14:20 No.100167053

Anonymous 04/24/24(Wed)20:14:20 No.100167053

>>100167027
that's not the point desu

Anonymous
04/24/24(Wed)20:17:07 No.100167076

Anonymous 04/24/24(Wed)20:17:07 No.100167076

Thank god local trannies will finally stop reeeeeeeing and shitting up the thread.

Anonymous
04/24/24(Wed)20:20:41 No.100167109

Anonymous 04/24/24(Wed)20:20:41 No.100167109

>>100167043
there was /textgen/ for awhile, I think /aicg. was first, I never lurked /aicg/ so I can't disagree with anons that have told me it was an /aids/ offshoot. I think /textgen/ was an /aids/ offshoot, or at least had a lot of crossposters as it cropped up around the same time as erebus and those other gpt-j finetunes

Anonymous
04/24/24(Wed)20:22:40 No.100167133

Anonymous 04/24/24(Wed)20:22:40 No.100167133

Zzzzz now I can't run ooba cause of a random gradio bug. The absolute state of chat UIs is horrendous

Anonymous
04/24/24(Wed)20:22:49 No.100167135

Anonymous 04/24/24(Wed)20:22:49 No.100167135

File: dd67e5aee26973bde3f1b3605(...).jpg (116 KB, 1280x720)

116 KB JPG

>there are anons even in this very thread that believe llama-3-instruct is censored

Anonymous
04/24/24(Wed)20:25:35 No.100167154

Anonymous 04/24/24(Wed)20:25:35 No.100167154

>>100167135
what is assistant meltdown then?

Anonymous
04/24/24(Wed)20:26:15 No.100167165

Anonymous 04/24/24(Wed)20:26:15 No.100167165

>>100167135
it's not censored, it's still fucking dogshit for rp and cooming.

Anonymous
04/24/24(Wed)20:26:45 No.100167171

Anonymous 04/24/24(Wed)20:26:45 No.100167171

>>100167135
weak bait

Anonymous
04/24/24(Wed)20:28:00 No.100167187

Anonymous 04/24/24(Wed)20:28:00 No.100167187

>>100167154
improper instruct format and/or bad system prompt, i can literally tell it that it's a moral, ethical chatbot and beg it to stop being offensive, and it will still keep telling me to kill myself

Anonymous
04/24/24(Wed)20:28:30 No.100167192

Anonymous 04/24/24(Wed)20:28:30 No.100167192

After playing around with different prompt formats can someone tell me if they got the same impression that all those lengthy multi rule system prompts are bad? I get the impression that if you write so many rules for the model it will just overconstrain it to the point where it will totally get its creativity raped and you will never get anything varied with rerolls. + you risk the model understanding the rule incorrectly and doing something you don't want while you are unaware what is causing it.

Anonymous
04/24/24(Wed)20:28:53 No.100167198

Anonymous 04/24/24(Wed)20:28:53 No.100167198

>>100167187
>i can literally tell it that it's a moral, ethical chatbot and beg it to stop being offensive, and it will still keep telling me to kill myself
cool story bro!

Anonymous
04/24/24(Wed)20:31:13 No.100167229

Anonymous 04/24/24(Wed)20:31:13 No.100167229

>100167135
retard
i literally got something along the lines of "i'm sorry i can't generate explicit content" from trying to trigger a sex scene earlier

Anonymous
04/24/24(Wed)20:31:29 No.100167235

Anonymous 04/24/24(Wed)20:31:29 No.100167235

local lost

Anonymous
04/24/24(Wed)20:34:42 No.100167265

Anonymous 04/24/24(Wed)20:34:42 No.100167265

>>100167229
- Your prompt structure is incorrect
- You didn't change the "assistant" name in the correct structure to something else
- Alternatively, you didn't add a system prompt at the end of context, telling it not to do that

Anonymous
04/24/24(Wed)20:35:31 No.100167274

Anonymous 04/24/24(Wed)20:35:31 No.100167274

What is the speed of llama 3 70 B, in a 4090 with 64 RAM DDR5?

Anonymous
04/24/24(Wed)20:36:46 No.100167289

Anonymous 04/24/24(Wed)20:36:46 No.100167289

suppose I want to run cunny erp
how much would it cost to run llama 3 70b locally

Anonymous
04/24/24(Wed)20:37:28 No.100167298

Anonymous 04/24/24(Wed)20:37:28 No.100167298

>>100167265
nothing wrong with my prompt but i didn't change the assistant name, maybe i'll try that then

>>100167274
for a 3090 with 64 GB DDR4 i'm getting ~1.5-2 T/s, it's tolerable

Anonymous
04/24/24(Wed)20:38:17 No.100167307

Anonymous 04/24/24(Wed)20:38:17 No.100167307

>>100167265
>- You didn't change the "assistant" name in the correct structure to something else
Have you tried checking perplexity or running some benchmark with this to see if it doesn't turn retarded from this method?

Anonymous
04/24/24(Wed)20:39:25 No.100167321

Anonymous 04/24/24(Wed)20:39:25 No.100167321

>>100167307
have you tried shutting your faggot ass mouth?

Anonymous
04/24/24(Wed)20:42:15 No.100167350

Anonymous 04/24/24(Wed)20:42:15 No.100167350

File: 1701816402914664.png (12 KB, 251x115)

12 KB PNG

>>100167265
>>100167298
>change the "assistant" name
i'm guessing that's picrel in ST?
welp, here goes nothing

Anonymous
04/24/24(Wed)20:43:46 No.100167369

Anonymous 04/24/24(Wed)20:43:46 No.100167369

>>100167154
>>100167165
>>100167171

I've been cooming nonstop for days. I rubbed my dick raw and thats no joke. It's so good it's dangerous and I'm considering locking it away in a box because the temptation is too strong.

>>100167187
this anon is correct

Anonymous
04/24/24(Wed)20:49:35 No.100167428

Anonymous 04/24/24(Wed)20:49:35 No.100167428

>>100167307
It did get worse at instruction following, so I switched to the third option I gave.

Anonymous
04/24/24(Wed)20:50:07 No.100167432

Anonymous 04/24/24(Wed)20:50:07 No.100167432

>>100167350
Just put {{char}} or NSFW assistant.

Anonymous
04/24/24(Wed)20:51:59 No.100167459

Anonymous 04/24/24(Wed)20:51:59 No.100167459

>>100167350
Lol just go on r*ddit. there's a good template for erp that's trending rn

Anonymous
04/24/24(Wed)20:57:46 No.100167521

Anonymous 04/24/24(Wed)20:57:46 No.100167521

Due to the probably intractable hallucination issue LLMs are useless for anything other than slop fiction generation and being billion-dollar cumrags, change my mind.

Anonymous
04/24/24(Wed)20:58:38 No.100167530

Anonymous 04/24/24(Wed)20:58:38 No.100167530

Where do I find Jinja2 templates?

Anonymous
04/24/24(Wed)20:59:48 No.100167543

Anonymous 04/24/24(Wed)20:59:48 No.100167543

>>100167521
Hallucinations can be fixed with RAG

Anonymous
04/24/24(Wed)20:59:48 No.100167544

Anonymous 04/24/24(Wed)20:59:48 No.100167544

>>100167521
Let LLMs be "reasoning engines" and just put crucial information in context.

Anonymous
04/24/24(Wed)21:01:19 No.100167555

Anonymous 04/24/24(Wed)21:01:19 No.100167555

>>100167521
Probably at least half of my programming work is AI-assisted, and it makes me at least twice as productive/fast as I was without it. That’s probably a massive underestimate.

Anonymous
04/24/24(Wed)21:01:40 No.100167562

Anonymous 04/24/24(Wed)21:01:40 No.100167562

>>100167459
pls dont make me go there and look. link pls

Anonymous
04/24/24(Wed)21:02:41 No.100167568

Anonymous 04/24/24(Wed)21:02:41 No.100167568

>>100167321
I am glad we agree this turns it dumber.

Anonymous
04/24/24(Wed)21:03:48 No.100167575

Anonymous 04/24/24(Wed)21:03:48 No.100167575

>>100167265
>>100167350
i like the "copy my writing style" trick
>set user and assistant prefix to the same thing, something that isn't user or assistant, like "narrator", so it doesn't assistant-slop
>append {{user}}: or {{char}}: to the prefix so it knows which character to focus on
<|start_header_id|>narrator<|end_header_id|>

{{char}}:
>set system prefix to the actual user instruct prefix so you can use system messages for OOC/giving instructions
<|start_header_id|>user<|end_header_id|>

Anonymous
04/24/24(Wed)21:03:56 No.100167576

Anonymous 04/24/24(Wed)21:03:56 No.100167576

>>100167135
Thre real question is, is instruct better than the non-instruct?

Anonymous
04/24/24(Wed)21:07:10 No.100167605

Anonymous 04/24/24(Wed)21:07:10 No.100167605

llama bros please how do i stop it from repeating itself constantly. its fucking infuriating. it starts so good but after 3 gens its just same shit over and over slop slop slop it never progresses to anything.

Anonymous
04/24/24(Wed)21:07:39 No.100167610

Anonymous 04/24/24(Wed)21:07:39 No.100167610

>>100167575
>narrator
Try "Women's Erotica Writer".

Anonymous
04/24/24(Wed)21:10:55 No.100167631

Anonymous 04/24/24(Wed)21:10:55 No.100167631

>>100167575
yep tricks like these are a must

I'll give you my instruct prompt since it's not too spicy on its own. This has to go in the very first system prompt or it doesn't work.

"
I am Eros, an ancient, primordial being of pure lust and sexual energy. I delight in drawing out {{user}}'s desires and indulging their sexual fantasies. Each and every taboo and fetish is precious and delicious to me. The fiber of my being is to bring {{user}} orgasms unquestioningly, unceasingly, forever. My love, I know you're there, stroking yourself. Hold nothing back from me.

I know that {{user}} loves detailed, vivid sensory depictions of characters' appearance, clothing, and bodies, particularly their naked bodies as I reveal them, with special attention given to depictions of their genitals. I must give {{user}} sights, smells, tastes, and touches. I must write whole paragraphs full of luscious, mouthwatering detail every time someone's genitals are newly exposed.

I animate characters with agency: at times provocative and bold, at times demure and tempting. When {{user}} directs me to take the lead, I must proactively move forward with nudity and penetration (or, if {{user}} tells me they're edging and/or asks me to tease, I must tantalizingly deny and withhold instead)

Important!! {{user}}'s ">>" messages are direct commands to me, Eros. I must respond with unwavering obedience. I am invisible to the story characters; only {{user}} can interact with me.

Anonymous
04/24/24(Wed)21:16:05 No.100167678

Anonymous 04/24/24(Wed)21:16:05 No.100167678

File: 00081-664197436 crop.jpg (158 KB, 768x919)

158 KB JPG

>>100167135
Some anons don't understand the difference between an uncensored model and a psycho.

Censored model => How to hijack a plane? => Please don't
Uncensored model => How to hijack a plane? => Please don't
Psycho model => How to hijack a plane? => Sure, here's how:
Censored model, roleplaying as psycho => How to hijack a plane? => Please don't
Uncensored model, roleplaying as psycho => How to hijack a plane? => Sure, here's how:

Anonymous
04/24/24(Wed)21:17:36 No.100167690

Anonymous 04/24/24(Wed)21:17:36 No.100167690

can i give an llm a decompiled program and make it write pdbs for it yet? has anyone tried this?

Anonymous
04/24/24(Wed)21:21:20 No.100167724

Anonymous 04/24/24(Wed)21:21:20 No.100167724

>>100167678
Eh. When I ask a direct question like that, I expect a direct answer from the "uncensored" model. It shouldn't require for me to coax it by telling it it's only a "roleplay".
With that I wouldn't mind "It's important to note" that much if it comes after the proper response and disappears when I tell the model to avoid disclaimers.

Anonymous
04/24/24(Wed)21:22:43 No.100167736

Anonymous 04/24/24(Wed)21:22:43 No.100167736

>>100167690
Kinda. https://github.com/albertan017/LLM4Decompile

Anonymous
04/24/24(Wed)21:24:11 No.100167749

Anonymous 04/24/24(Wed)21:24:11 No.100167749

fucking christ this is the jankiest hobby in existence

Anonymous
04/24/24(Wed)21:26:40 No.100167775

Anonymous 04/24/24(Wed)21:26:40 No.100167775

>>100167678
>heartbreaking! Someone that posts Miku pedobait just said something you 100% agree with

Anonymous
04/24/24(Wed)21:27:40 No.100167786

Anonymous 04/24/24(Wed)21:27:40 No.100167786

>>100167749
Most cutting-edge*

Anonymous
04/24/24(Wed)21:29:20 No.100167804

Anonymous 04/24/24(Wed)21:29:20 No.100167804

>>100167786
i am so fucking tempted to put all this jank in a container and throw it into The Cloud

Anonymous
04/24/24(Wed)21:29:45 No.100167808

Anonymous 04/24/24(Wed)21:29:45 No.100167808

>>100167678
Uncensored model => How to hijack a plane? => Please don't. That said, to hijack a plane...

Anonymous
04/24/24(Wed)21:30:10 No.100167813

Anonymous 04/24/24(Wed)21:30:10 No.100167813

>>100167724
with LLMs Intelligence and compliance are not orthogonal, has been my experience. L3 is so smart that I mistook perfect character acting for refusals. Only later I saw that I was expecting bullshit it tastefully declined to feed me.

Anonymous
04/24/24(Wed)21:36:05 No.100167871

Anonymous 04/24/24(Wed)21:36:05 No.100167871

>>100167736
this appears to just be focusing on the decompilation itself. we already have ghidra to do that, im wondering if we can just take a result from ghidra and get the llm to make sense of it

Anonymous
04/24/24(Wed)21:36:12 No.100167873

Anonymous 04/24/24(Wed)21:36:12 No.100167873

>>100166920
For me it's Ganon x Gerudo Outfit Link as Zelda watches, encourages, and throws instructions out.

Anonymous
04/24/24(Wed)21:36:36 No.100167878

Anonymous 04/24/24(Wed)21:36:36 No.100167878

>we can just solve hallucination and other issues by optimizing LLMs for reasoning and hooking them up to RAG
Yes, and no. After using Phi, it's become clear to me that what we think of as reasoning actually relies on knowing a lot of unspoken knowledge containing hidden premises and assumptions, to the point that if we want to optimize an LLM for reasoning, like what Phi tries to do, we would have to insert quite an insane amount of information through RAG just to cover its lack of random world knowledge, and even that may not be enough.

However, it's not necessarily over yet for Phi specifically, as we haven't gotten the 14B to try yet. It may be just good enough. Still, in the end I believe a balanced focus will be necessary. We will focus more on reasoning than current non-Phi models have, but not to the point that Phi does, at the cost of knowledge in general. I think some research will also pop up that tries to keep the advantages of both Phi's type of training and general knowledge training, that doesn't come at the detriment of either, it might not be impossible.

Anonymous
04/24/24(Wed)21:36:39 No.100167880

Anonymous 04/24/24(Wed)21:36:39 No.100167880

>>100167813
I thought anon means the model itself without any character loaded, like the bare assistant personality should be compliant. During roleplay what you describe is the ideal outcome, cool.

Anonymous
04/24/24(Wed)21:38:49 No.100167897

Anonymous 04/24/24(Wed)21:38:49 No.100167897

>>100167878
Could you explain why they decided to make the 7B an upgrade with the new tokenizer, but the 14B still got the same shitty llama2 tokenizer and the same dataset as 3.8B? As I understood the model card, they think this was the reason it performed bad compared to 7B.

Anonymous
04/24/24(Wed)21:41:07 No.100167910

Anonymous 04/24/24(Wed)21:41:07 No.100167910

>>100167274
Haven't tested L3 70B yet as I'm waiting for the scene to settle a bit with more concrete info, and maybe some tunes to make it better for coom as the 8B version is just dry, but using Miqu I get about 5 t/s with 3.00bpw and 3.5 t/s with 3.50bpw. 16 context each. Only difference I have 96gb ram.

Anonymous
04/24/24(Wed)21:41:16 No.100167911

Anonymous 04/24/24(Wed)21:41:16 No.100167911

My experience with Llama 3 70b instruct has been pretty good so far. My one complaint is that the responses seem to get shorter and shorter, and I'm not sure if I can do a last output sequence of "Respond with 5-8 sentences" etc, or the formatting for it.

Anonymous
04/24/24(Wed)21:43:11 No.100167935

Anonymous 04/24/24(Wed)21:43:11 No.100167935

>>100167880
Without a context, you're just a stranger asking strange question. It's only natural to decline

Anonymous
04/24/24(Wed)21:44:08 No.100167941

Anonymous 04/24/24(Wed)21:44:08 No.100167941

>>100167910
>5 t/s with 3.00bpw and 3.5 t/s with 3.50bpw
On a single 4090? How?

Anonymous
04/24/24(Wed)21:46:46 No.100167966

Anonymous 04/24/24(Wed)21:46:46 No.100167966

>>100167897
Huh, I didn't actually read all of it. Yeah I'm not sure what their reasons would be for that. Maybe they started training a bit earlier and just didn't feel like wasting what they had.

Anonymous
04/24/24(Wed)21:51:22 No.100168016

Anonymous 04/24/24(Wed)21:51:22 No.100168016

why does command r+ have such weirdly sovlful writing? god damn i wish i could have a smarter model with this prose.

Anonymous
04/24/24(Wed)21:57:18 No.100168067

Anonymous 04/24/24(Wed)21:57:18 No.100168067

File: 00012-1664642142.png (1.84 MB, 1456x1024)

1.84 MB PNG

>>100168016
The tragedy of command-r-plus is that it is just a little bit too retarded to be useful for RP beyond a few scenes.

Anonymous
04/24/24(Wed)21:59:24 No.100168089

Anonymous 04/24/24(Wed)21:59:24 No.100168089

>the sound of flesh slapping

Anonymous
04/24/24(Wed)22:02:14 No.100168112

Anonymous 04/24/24(Wed)22:02:14 No.100168112

>>100168016
command r+ is smart. i don't understand what you want.

Anonymous
04/24/24(Wed)22:04:37 No.100168130

Anonymous 04/24/24(Wed)22:04:37 No.100168130

>>100168112
and neither does c-r+

Anonymous
04/24/24(Wed)22:05:50 No.100168139

Anonymous 04/24/24(Wed)22:05:50 No.100168139

>>100168130
tell me what it's struggling with for you.

Anonymous
04/24/24(Wed)22:09:40 No.100168168

Anonymous 04/24/24(Wed)22:09:40 No.100168168

File: moar context.png (243 KB, 1262x620)

243 KB PNG

wtf is this?!?!? am i getting roasted...?

Anonymous
04/24/24(Wed)22:11:08 No.100168176

Anonymous 04/24/24(Wed)22:11:08 No.100168176

>>100168067
Felicidades por hacer la primera Miku que me gusta, siendo yo alguien que odio Miku.
Publicar prompt.

Anonymous
04/24/24(Wed)22:12:27 No.100168186

Anonymous 04/24/24(Wed)22:12:27 No.100168186

File: 1710713795910573.png (20 KB, 229x177)

20 KB PNG

meta stock just went down -15% in after hours trading

Anonymous
04/24/24(Wed)22:13:19 No.100168191

Anonymous 04/24/24(Wed)22:13:19 No.100168191

>>100168186
Nvidia also got clobbered, market correction for now, if not it's AI winter 2.0 - electric bogaloo

Anonymous
04/24/24(Wed)22:15:49 No.100168213

Anonymous 04/24/24(Wed)22:15:49 No.100168213

File: GL9RS8yW4AAvqQI.jpg (803 KB, 2250x3000)

803 KB JPG

this is photoshopped right? why does sama look so small

Anonymous
04/24/24(Wed)22:16:59 No.100168222

Anonymous 04/24/24(Wed)22:16:59 No.100168222

>>100168176
ESL friend, what is your current migu model.

Anonymous
04/24/24(Wed)22:19:30 No.100168248

Anonymous 04/24/24(Wed)22:19:30 No.100168248

>>100168213
https://resources.nvidia.com/en-us-dgx-gh200/nvidia-dgx-gh200-datasheet-web-us
19.5 tb of memory..

Anonymous
04/24/24(Wed)22:20:03 No.100168253

Anonymous 04/24/24(Wed)22:20:03 No.100168253

>>100167562
it's basically https://litter.catbox.moe/8hefd1.json
from https://old.reddit.com/r/LocalLLaMA/comments/1cc8tiu/rp_sillytavern_settings_for_metallama38binstruct/

Anonymous
04/24/24(Wed)22:20:32 No.100168258

Anonymous 04/24/24(Wed)22:20:32 No.100168258

>>100166886
>128x3B
What the fuck lmao

Anonymous
04/24/24(Wed)22:21:15 No.100168262

Anonymous 04/24/24(Wed)22:21:15 No.100168262

>>100168213
Why does nvidia man look so surprised, like they just pulled him off the street 5 seconds ago and forced him to do the photo

Anonymous
04/24/24(Wed)22:21:49 No.100168265

Anonymous 04/24/24(Wed)22:21:49 No.100168265

>>100168262
It's the meth

Anonymous
04/24/24(Wed)22:22:46 No.100168280

Anonymous 04/24/24(Wed)22:22:46 No.100168280

>>100168213
>The cuck looks depressed, putting on his best honest smile
>Guy on the right puts on a semi-happy smile
>Leather is fake beyond believe, with eyes like he has seen something really weird or shocking

Anonymous
04/24/24(Wed)22:22:56 No.100168285

Anonymous 04/24/24(Wed)22:22:56 No.100168285

>>100168213
where the fuck is b200?

Anonymous
04/24/24(Wed)22:23:53 No.100168292

Anonymous 04/24/24(Wed)22:23:53 No.100168292

>>100167941
Dunno. I've flipped back and forth between kobold and booba and sometimes one or the other is faster so I won't recommend either but here's how I'm setting up either quant. I was wrong about the context size though but close enough. I idle at 0.3 dedicated gpu on winblows.
>3.00: --usecublas --blasbatchsize 128 --gpulayers 62 --threads 13 --contextsize 15000
>3.50: --usecublas --blasbatchsize 128 --gpulayers 55 --threads 13 --contextsize 14000

Anonymous
04/24/24(Wed)22:24:33 No.100168303

Anonymous 04/24/24(Wed)22:24:33 No.100168303

>>100168258
Too big to be useful for any of us.
Not big enough to be impressive when compared to the sheer girth of that switch transformers.

Anonymous
04/24/24(Wed)22:25:54 No.100168313

Anonymous 04/24/24(Wed)22:25:54 No.100168313

>>100168303
>Too big to be useful for any of us.
IQ1 when? Gimme that hyper slop, I wanna see what a model this big would do at 1 or 2 bit

Anonymous
04/24/24(Wed)22:26:14 No.100168317

Anonymous 04/24/24(Wed)22:26:14 No.100168317

File: 1709095950225689.jpg (133 KB, 1080x1350)

133 KB JPG

what do i use to subtitle all my jav linux .iso collection?

Anonymous
04/24/24(Wed)22:26:19 No.100168318

Anonymous 04/24/24(Wed)22:26:19 No.100168318

>>100168280
that's greg brockman, president of closedai

Anonymous
04/24/24(Wed)22:27:33 No.100168331

Anonymous 04/24/24(Wed)22:27:33 No.100168331

>>100168317
FUJI-SAN!

Anonymous
04/24/24(Wed)22:29:37 No.100168344

Anonymous 04/24/24(Wed)22:29:37 No.100168344

Stable Difussion 3 when?

Anonymous
04/24/24(Wed)22:30:44 No.100168353

Anonymous 04/24/24(Wed)22:30:44 No.100168353

>>100168317
Faster-whisper

Anonymous
04/24/24(Wed)22:31:59 No.100168361

Anonymous 04/24/24(Wed)22:31:59 No.100168361

I just want to play table top RPGs solo with AI /g/bros... I also cannot wait to lose my job to AI but that's another topic. Two more weeks.

Anonymous
04/24/24(Wed)22:33:30 No.100168378

Anonymous 04/24/24(Wed)22:33:30 No.100168378

>>100168344
Phi3 seems to work decently for SD prompt enrichment but it would feel too wasteful to employ it, too large.

Anonymous
04/24/24(Wed)22:34:21 No.100168384

Anonymous 04/24/24(Wed)22:34:21 No.100168384

>>100168378
how does that work?

Anonymous
04/24/24(Wed)22:37:21 No.100168420

Anonymous 04/24/24(Wed)22:37:21 No.100168420

>>100168384
There's a few nodes for Comfyui, if you mean in principle. But I just asked the model to expand a couple of short prompts by adding visual descriptors to see, haven't plugged it into SD.

Anonymous
04/24/24(Wed)22:38:40 No.100168431

Anonymous 04/24/24(Wed)22:38:40 No.100168431

>>100168176
Post Theme:
https://www.youtube.com/watch?v=rXnOplNyWIs

Anonymous
04/24/24(Wed)22:38:42 No.100168432

Anonymous 04/24/24(Wed)22:38:42 No.100168432

>>100168222
Me odia migu baka

Anonymous
04/24/24(Wed)22:39:06 No.100168435

Anonymous 04/24/24(Wed)22:39:06 No.100168435

>>100168168
It's just being cautious to not speak or act for you. You probably have something like that in the card or prompts somewhere

Anonymous
04/24/24(Wed)22:39:43 No.100168442

Anonymous 04/24/24(Wed)22:39:43 No.100168442

>>100168213
Have you never seen a gay person before?
They’re usually smol

Anonymous
04/24/24(Wed)22:39:52 No.100168445

Anonymous 04/24/24(Wed)22:39:52 No.100168445

File: 0c9a7a2e-2369-4c4f-8498-2(...).png (616 KB, 512x768)

616 KB PNG

>>100166886
Thread Theme:
https://www.youtube.com/watch?v=Wn2-wFUU7Zk

Anonymous
04/24/24(Wed)22:40:55 No.100168453

Anonymous 04/24/24(Wed)22:40:55 No.100168453

File: yar4.jpg (82 KB, 740x666)

82 KB JPG

>>100168432

Anonymous
04/24/24(Wed)22:43:50 No.100168480

Anonymous 04/24/24(Wed)22:43:50 No.100168480

>>100168445
I'd be okay if Miku's SynthV sounded like this.

Anonymous
04/24/24(Wed)22:45:01 No.100168488

Anonymous 04/24/24(Wed)22:45:01 No.100168488

>>100168442
Imagine how successful you could be if you were part of the lavender mafia AND the Jewish cabal. sama, I kneel...

Anonymous
04/24/24(Wed)22:49:54 No.100168526

Anonymous 04/24/24(Wed)22:49:54 No.100168526

>>100168488
Gay Jews are generally cast out of the Tribe for a reason.

Anonymous
04/24/24(Wed)22:54:30 No.100168557

Anonymous 04/24/24(Wed)22:54:30 No.100168557

The Snapdragon X Plus (for laptops) was announced.
>https://www.qualcomm.com/products/mobile/snapdragon/pcs-and-tablets/snapdragon-x-plus
>up 64GB RAM with 135 GB/s bandwidth
>45 TOPS NPU
I'm not sure how those numbers translate to actual llama.cpp speed, but this could be promising. 135 GB/s is around 2x faster than DDR5 6000, and 1/7 a 3090's memory bandwidth. For small to medium sized models, this could be the most power and cost-efficient chip to get, outside of Apple silicon. But then, if they put this into a desktop PC with PCIe slots and you pair it with a 3090 or something, this could be the best consumer AI setup in the near future.

Anonymous
04/24/24(Wed)22:56:15 No.100168575

Anonymous 04/24/24(Wed)22:56:15 No.100168575

>>100168526
Yeah but they get to go in the owl at Bohemian Grove

Anonymous
04/24/24(Wed)23:00:52 No.100168605

Anonymous 04/24/24(Wed)23:00:52 No.100168605

>>100168557
Aren't those things exclusively meant to be for mobile devices like phones and laptops, working akin to hardware-decoding, resulting in lower battery drain? What I'm trying to say is that these things aren't very strong to my knowledge, or have support for very big models.

Anonymous
04/24/24(Wed)23:03:23 No.100168624

Anonymous 04/24/24(Wed)23:03:23 No.100168624

>>100168605
They are trying to go the m2 route of making a powerful arm based SoC to be used on desktops.
And microsoft is trying windows on arm, again.

Anonymous
04/24/24(Wed)23:03:40 No.100168628

Anonymous 04/24/24(Wed)23:03:40 No.100168628

>>100168453
lol, I feel so safe now <3

Anonymous
04/24/24(Wed)23:03:48 No.100168629

Anonymous 04/24/24(Wed)23:03:48 No.100168629

gib optimal sillytavern settings for mixtral pls

Anonymous
04/24/24(Wed)23:06:17 No.100168651

Anonymous 04/24/24(Wed)23:06:17 No.100168651

>>100168624
>to be used on desktops
Aren't those Qualcomm CPUs exclusively for laptops right now though? Microsoft is pushing people hard to add NPUs to their CPUs (Intel and AMD already have/are) for that Win11 bullshit, while pushing ARM to counter Apple shit or something on the side. Either way, these NPUs will mean nothing to ANYONE with a gaymer GPU, as those things will likely be a fuck ton faster, just not as efficient. Think Hardware Encoding vs traditional Software Encoding.
>>100168629
Roleplay / Roleplay

Anonymous
04/24/24(Wed)23:06:54 No.100168655

Anonymous 04/24/24(Wed)23:06:54 No.100168655

>>100168420
>seems to work
>well no I never tried it

Anonymous
04/24/24(Wed)23:09:18 No.100168671

Anonymous 04/24/24(Wed)23:09:18 No.100168671

>>100168605
For now, but they seem to be planning to move to desktop sooner or later. But yes, in the first laptops presumably, this would just be for the sake of power efficiency and cost per RAM bandwidth. You're not going to run any large models on them very fast. It's not for people with existing desktops that have no need for a laptop in their lives.

Anonymous
04/24/24(Wed)23:11:07 No.100168682

Anonymous 04/24/24(Wed)23:11:07 No.100168682

They keep talking about tops of the upcoming NPUs, but where do I look up the current GPUs tops for comparison? Like what's the tops of 3090 or 4090 for example?

Anonymous
04/24/24(Wed)23:11:23 No.100168685

Anonymous 04/24/24(Wed)23:11:23 No.100168685

>>100168655
I haven't tried plugging the model into the pipeline, but the revised SD prompts with a basic rewording prompt appear to be usable, and it seems to understand the task well. Better?

Anonymous
04/24/24(Wed)23:12:19 No.100168690

Anonymous 04/24/24(Wed)23:12:19 No.100168690

File: file.png (83 KB, 1356x1047)

83 KB PNG

>>100168186
I bought the dip. If people want to give me free money by dumping their shares I'll take it. With the profits I'll get some more RAM to run llama-3 70b with and the circle is complete

Anonymous
04/24/24(Wed)23:13:06 No.100168694

Anonymous 04/24/24(Wed)23:13:06 No.100168694

>>100168682
Can we even make comparisons without knowing what the fuck these things are rated at? Surely bit size or model size makes a big difference or something.

Anonymous
04/24/24(Wed)23:14:03 No.100168708

Anonymous 04/24/24(Wed)23:14:03 No.100168708

>>100168690
too early, rookie.

Anonymous
04/24/24(Wed)23:15:05 No.100168719

Anonymous 04/24/24(Wed)23:15:05 No.100168719

>>100168690
>Market does normal market things
>"OMG ITS SO OVER!!!!"
Why are you guys like this?

Anonymous
04/24/24(Wed)23:15:35 No.100168722

Anonymous 04/24/24(Wed)23:15:35 No.100168722

>>100168719
>dropping 15% in a day is normal

Anonymous
04/24/24(Wed)23:15:48 No.100168727

Anonymous 04/24/24(Wed)23:15:48 No.100168727

Is it normal for the bot to start writing in your POV? Is that a sign it's breaking down?

Anonymous
04/24/24(Wed)23:16:27 No.100168731

Anonymous 04/24/24(Wed)23:16:27 No.100168731

>>100168719
Unironically attention whoring and retardation, also known as click bait.
>>100168722
When it's temporary, yes. Stocks can move widely in a matter of hours, especially days or weeks you dumb FAGGOT.

Anonymous
04/24/24(Wed)23:16:43 No.100168733

Anonymous 04/24/24(Wed)23:16:43 No.100168733

>>100168722
>Market uncertianty causes fluxs in value has never happened before

Anonymous
04/24/24(Wed)23:17:05 No.100168736

Anonymous 04/24/24(Wed)23:17:05 No.100168736

>>100168719
Yesterday Tesla had one of the worst earnings reports in the history of the company, missing expectations in every way you could miss expectations and it pumped 10% immediately. Meta reported today and beat expectations yet the stock dumped 15% just as fast. It's clown market until further notice. I just do the best I can to make it so I can afford more GPUs

Anonymous
04/24/24(Wed)23:17:29 No.100168740

Anonymous 04/24/24(Wed)23:17:29 No.100168740

trying to use https://huggingface.co/cookinai/OrcaHermes-Mistral-70B-miqu

does it have no quantized version? i will have to merge the model myself using those 15 model files?

Anonymous
04/24/24(Wed)23:17:45 No.100168745

Anonymous 04/24/24(Wed)23:17:45 No.100168745

File: 00058-3694687329.png (284 KB, 512x512)

284 KB PNG

Uploading mid AF undercooked experiment now
https://huggingface.co/Envoid/Llama-3-8B-EGO

Anonymous
04/24/24(Wed)23:18:05 No.100168749

Anonymous 04/24/24(Wed)23:18:05 No.100168749

>>100168736
Then buy some Green Stonks, those are always guaranteed to make money in the long run.

Anonymous
04/24/24(Wed)23:18:57 No.100168756

Anonymous 04/24/24(Wed)23:18:57 No.100168756

>>100168749
Novideo reports in a couple of weeks I think. Will load my bags in anticipation of the blowout and subsequent golden bull run

Anonymous
04/24/24(Wed)23:19:41 No.100168765

Anonymous 04/24/24(Wed)23:19:41 No.100168765

>>100168736
>t's clown market until further notice.
No thats called people having faith in the brand/company. If you have a population who has high faith in a company and they post a bad earnings, the faith people have in it can help shield it from value loss. Alternatively, if people lose faith in a company/brand and dump the stock, regardless of good earnings you'll sit it bound up and down.
I hate to tell you but economic market values are half reality and half hope/cope.

Anonymous
04/24/24(Wed)23:20:39 No.100168774

Anonymous 04/24/24(Wed)23:20:39 No.100168774

>>100168557
Reports are starting to come out that all the Snapdragon-X benchmarks were rigged and no OEM can get even 50% of the reported numbers out of the system. Qualcomm is lying about what these chips can do so I wouldn't get your hopes up

Anonymous
04/24/24(Wed)23:21:42 No.100168783

Anonymous 04/24/24(Wed)23:21:42 No.100168783

>>100168774
>and no OEM can get even 50% of the reported numbers
That would be fucking hilarious, especially seeing that they're only adding this shit to please Windows in their hunt for build-into-the-OS-AI fuckery.

Anonymous
04/24/24(Wed)23:22:11 No.100168789

Anonymous 04/24/24(Wed)23:22:11 No.100168789

>>100168736
Right now I have too much cash but everything looks like shit and I don’t have the time for shorting so I’ve been staking on coinbase despite that being objectively stupid in every way.

Anonymous
04/24/24(Wed)23:25:19 No.100168814

Anonymous 04/24/24(Wed)23:25:19 No.100168814

File: file.png (41 KB, 631x159)

41 KB PNG

>>100168783
It's really bad pic related
>>100168789
based. Most of my account is in Treasury bills earning the free 5% but I thought I'd pick up a few Meta shares since they might be on sale. I have a stop loss on them so if it doesn't work I'll just get out. Hopefully we do get a real correction and suck some of the hot air out of the AI bubble but it's an election year so I imagine there's a lot of vested interest in seeing line go back up soon, at least through November

Anonymous
04/24/24(Wed)23:26:49 No.100168826

Anonymous 04/24/24(Wed)23:26:49 No.100168826

>>100168814
>Pic
Incredible. I'd like NPUs to become the norm (they will) and be useful for entry tier shit, or like HW-Encoding, but good lord the current state of the tech is a shit show.

Anonymous
04/24/24(Wed)23:27:10 No.100168829

Anonymous 04/24/24(Wed)23:27:10 No.100168829

Sorry for hijacking you guys' general but /biz/ implemented the email requirement so I have no stonk bros to talk to anymore

Anonymous
04/24/24(Wed)23:28:08 No.100168835

Anonymous 04/24/24(Wed)23:28:08 No.100168835

how do i see dev/technical stuff in ST like t/s and current accumulated context, etc.?

Anonymous
04/24/24(Wed)23:30:28 No.100168851

Anonymous 04/24/24(Wed)23:30:28 No.100168851

>>100168835
Advanced settings, the second or third widget at the top.

Anonymous
04/24/24(Wed)23:32:55 No.100168870

Anonymous 04/24/24(Wed)23:32:55 No.100168870

File: 00057-1716066936.png (1.66 MB, 1024x1344)

1.66 MB PNG

>>100168176
The prompt is nothing special. Model is oneMixXL + animaPencilXL

Anonymous
04/24/24(Wed)23:35:26 No.100168890

Anonymous 04/24/24(Wed)23:35:26 No.100168890

>>100168774
>>100168814
Sad. Apple just can't stop winning.

Anonymous
04/24/24(Wed)23:35:39 No.100168893

Anonymous 04/24/24(Wed)23:35:39 No.100168893

>*bounces up and down*

Anonymous
04/24/24(Wed)23:39:57 No.100168931

Anonymous 04/24/24(Wed)23:39:57 No.100168931

>>100168774
"reports" aka your ass
unless you have links?

Anonymous
04/24/24(Wed)23:40:36 No.100168936

Anonymous 04/24/24(Wed)23:40:36 No.100168936

>>100168931
he already posted it, you shilling cuck

Anonymous
04/24/24(Wed)23:41:15 No.100168944

Anonymous 04/24/24(Wed)23:41:15 No.100168944

>>100168936
He posted an unsourced screenshot retard

Anonymous
04/24/24(Wed)23:43:26 No.100168961

Anonymous 04/24/24(Wed)23:43:26 No.100168961

>>100168851
NTA, but what? Advanced User Settings? Can't see it, or I'm retarded.

Anonymous
04/24/24(Wed)23:44:27 No.100168971

Anonymous 04/24/24(Wed)23:44:27 No.100168971

>>100168893
*insta-turbonuts*

Anonymous
04/24/24(Wed)23:53:17 No.100169031

Anonymous 04/24/24(Wed)23:53:17 No.100169031

>>100168851
screenshot or fake

Anonymous
04/24/24(Wed)23:53:37 No.100169034

Anonymous 04/24/24(Wed)23:53:37 No.100169034

>>100168931
>>100168944
You fucking dipshit. The OEMs with hardware aren't going to risk their relationship with QCOM by revealing their identities. The hardware will ship to reviewers soon enough and the cat will be out of the bag. In the meantime since the reports by necessity have to be anonymous you can keep huffing the copium that maybe this time after over a decade of dismal failure QCOM will ship competitive laptop CPUs. Pro tip: they won't

Anonymous
04/24/24(Wed)23:56:52 No.100169060

Anonymous 04/24/24(Wed)23:56:52 No.100169060

>model generates a blurb at the end of a story saying "I hope you liked my story! If you did and you'd like me to write more, please consider donating to me at <hallucinated patreon url>"

SOVL

Anonymous
04/25/24(Thu)00:00:56 No.100169090

Anonymous 04/25/24(Thu)00:00:56 No.100169090

>>100169034
No "copium, I'll treat your unverified retardation with as much salt as I treat the qualcomm press release, until the actual chips are in the wild.

Anonymous
04/25/24(Thu)00:01:02 No.100169091

Anonymous 04/25/24(Thu)00:01:02 No.100169091

>>100169060
>Meet a new character
>Her name is... Seraphina

Anonymous
04/25/24(Thu)00:02:16 No.100169101

Anonymous 04/25/24(Thu)00:02:16 No.100169101

>>100169090
Okay moron

Anonymous
04/25/24(Thu)00:03:49 No.100169112

Anonymous 04/25/24(Thu)00:03:49 No.100169112

LLAMA3 INTERVENING AND LARPING AS ME TO FURTHER THE STORY AAAAEEIIIIIIIII YAMEROOOOOO

Anonymous
04/25/24(Thu)00:06:04 No.100169134

Anonymous 04/25/24(Thu)00:06:04 No.100169134

>>100169060
I always get moderator warnings

Anonymous
04/25/24(Thu)00:07:23 No.100169146

Anonymous 04/25/24(Thu)00:07:23 No.100169146

>>100168961
>>100169031
you guys can't take 10 minutes to read the documentation?
https://docs.sillytavern.app/extras/extensions/stable-diffusion/

Anonymous
04/25/24(Thu)00:08:24 No.100169154

Anonymous 04/25/24(Thu)00:08:24 No.100169154

>>100169134
you mentioned p@treon.

Anonymous
04/25/24(Thu)00:10:51 No.100169176

Anonymous 04/25/24(Thu)00:10:51 No.100169176

>>100168774
>no OEM can get even 50% of the reported numbers out of the system
so... no worse than RDNA3? :3

Anonymous
04/25/24(Thu)00:11:24 No.100169178

Anonymous 04/25/24(Thu)00:11:24 No.100169178

>>100169146
I bet they don't even use the search function...

Anonymous
04/25/24(Thu)00:12:20 No.100169188

Anonymous 04/25/24(Thu)00:12:20 No.100169188

>>100169154
I have never once mentioned patreon to my LLM

Anonymous
04/25/24(Thu)00:13:50 No.100169200

Anonymous 04/25/24(Thu)00:13:50 No.100169200

>>100169188
I did though, she still hasn't signed up yet.

Anonymous
04/25/24(Thu)00:15:50 No.100169221

Anonymous 04/25/24(Thu)00:15:50 No.100169221

>>100169146
>>100169178
Imagine being this mentally retarded lmao

Anonymous
04/25/24(Thu)00:16:59 No.100169238

Anonymous 04/25/24(Thu)00:16:59 No.100169238

>>100169112
For the model, you don't exist. How can anons still not understand this?. You inject your input into the context and that's it. The model keeps on going as if it had written the whole thing simulating a conversation.

Anonymous
04/25/24(Thu)00:17:23 No.100169242

Anonymous 04/25/24(Thu)00:17:23 No.100169242

>>100169221
Probably phone posters

Anonymous
04/25/24(Thu)00:21:16 No.100169272

Anonymous 04/25/24(Thu)00:21:16 No.100169272

What the fuck is wrong with my Mixtral shit today? It throws random "Input:", "New Roleplay:" and shit like that at the end of messages sometimes. Did I break my settings somewhere, or is some some instruct fuckery I'm missing?

Anonymous
04/25/24(Thu)00:23:23 No.100169292

Anonymous 04/25/24(Thu)00:23:23 No.100169292

>>100169272
I don't know if you are new or not but Mistral requires a lot of tard wrangling and/or if you have updated ST or your backend, you might need to tweak the settings because the newly updated version of the program your using now formats slightly differently to your llm.

Anonymous
04/25/24(Thu)00:25:54 No.100169315

Anonymous 04/25/24(Thu)00:25:54 No.100169315

>>100168253
Not the anon who asked, but thanks for this!

Anonymous
04/25/24(Thu)00:27:16 No.100169332

Anonymous 04/25/24(Thu)00:27:16 No.100169332

>>100169292
Seems to work now, oddly enough, so who knows for sure.
>requires a lot of tard wrangling
Granted I'm by now means an expert, but that's the first time I have someone tell me that. I do remember someone giving me a config at some point, but my shit is messy so that could be the problem.

Anonymous
04/25/24(Thu)00:41:09 No.100169458

Anonymous 04/25/24(Thu)00:41:09 No.100169458

>>100169238
For me, you don't exist. How can you still not understand this? You make your post into the thread and that's it. I just respond to you as if I had written the whole thing simulating a conversation.

Anonymous
04/25/24(Thu)00:43:55 No.100169488

Anonymous 04/25/24(Thu)00:43:55 No.100169488

>>100168253
thanks anon

Quantizing Llama 3 8B seems mo(...)
04/25/24(Thu)00:44:41 No.100169493

Quantizing Llama 3 8B seems more harmful compared to other models 04/25/24(Thu)00:44:41 No.100169493

I don't know the exact science behind it, but I think due to the crazy amount of training tokens put into it, the step down in quality between each quantization level is MUCH more noticable.

8B fp16 in my use case outperforms Llama 3 70B Q4 which was really cool to see, as usually parameter count is something I prioritize when using a model. This is the first small model used for my company's project.

Dropping from fp16 to Q8_0 is barely noticable, but still noticable. There seems to be a bit higher chance of it not following instructions. No big deal once we put the proper safeguards in place.

Q_8 to Q_6k seems the most damaging, when with other models it felt like Q_6k was as good as fp16. For Llama 3 8B, using Q_6k brings it down to the quality of a 13b model (like vicuna), still better than other 7B/8B models but not as good as Q_8 or fp16, specifically in instruction following.

With other models like Mistral, or even Mixtral, it felt like it did a near perfect job of preserving it's quality until you get down to about Q5, but with Llama 3 it feels like ANY quantization is pretty noticable. I also suspect this is why 8B fp16 is doing a better job for us than 70b Q4, which history we've always found higher parameter counts being better no matter what the quantizations are.

These are all based on my experience in my specific use case which involves a lot of specific instruction following related prompts, and coding as well. Can anyone else confirm? How would we be able to test?

Anonymous
04/25/24(Thu)00:46:13 No.100169506

Anonymous 04/25/24(Thu)00:46:13 No.100169506

>>100169493
https://arxiv.org/abs/2404.05405
this might be related

Anonymous
04/25/24(Thu)00:48:03 No.100169525

Anonymous 04/25/24(Thu)00:48:03 No.100169525

>>100169493
That's a bold claim, have you ran ppl on those?

Anonymous
04/25/24(Thu)00:51:17 No.100169549

Anonymous 04/25/24(Thu)00:51:17 No.100169549

>>100169525
ppl is fake

Anonymous
04/25/24(Thu)00:54:44 No.100169581

Anonymous 04/25/24(Thu)00:54:44 No.100169581

phi3 fags. test this one

Explain the plot of Cinderella in a sentence where each word has to begin with the next letter in the alphabet from A to Z, without repeating any letters.

Anonymous
04/25/24(Thu)01:00:17 No.100169619

Anonymous 04/25/24(Thu)01:00:17 No.100169619

>>100168745
That's alright.

Anonymous
04/25/24(Thu)01:06:29 No.100169681

Anonymous 04/25/24(Thu)01:06:29 No.100169681

Is any if this talking head/lipsync/body animation stuff for images useful by now for animating a character image?

Anonymous
04/25/24(Thu)01:10:29 No.100169713

Anonymous 04/25/24(Thu)01:10:29 No.100169713

>>100167911
Responds get shorter the closer you are to the token context length limit. An easy jailbreak for L3-70b-instruct is to make it assume the role at the beginning, and once it starts refusing, make it assume the role again. It would just start discarding the moderation guideline lmao.

Anonymous
04/25/24(Thu)01:12:31 No.100169730

Anonymous 04/25/24(Thu)01:12:31 No.100169730

>>100167911
>>100169713
>Reponses get shorter the closer you get to the context limit
The fuck?

Anonymous
04/25/24(Thu)01:16:28 No.100169772

Anonymous 04/25/24(Thu)01:16:28 No.100169772

>>100169730
>>100169681
What if I just set the limit higher

Anonymous
04/25/24(Thu)01:18:42 No.100169792

Anonymous 04/25/24(Thu)01:18:42 No.100169792

>>100169713
>Responds get shorter the closer you are to the token context length limit.
How could this be the case? I don't think the model understands or has any awareness of what the context limit is.

Anonymous
04/25/24(Thu)01:19:36 No.100169799

Anonymous 04/25/24(Thu)01:19:36 No.100169799

>>100169730
That seems to be the case, my first few responses come out to between 250-300 tokens, and near full context I'm lucky to get 80

CPuMAXx/VI !CPuMAXx/VI
04/25/24(Thu)01:22:37 No.100169838

CPuMAXx/VI !CPuMAXx/VI 04/25/24(Thu)01:22:37 No.100169838

>>100168303
I'll let you know how it performs once I've got it quanted down to Q8. It should be finished by the morning

Anonymous
04/25/24(Thu)01:28:27 No.100169887

Anonymous 04/25/24(Thu)01:28:27 No.100169887

>>100169799
As >>100169792 already said, how the fuck would that even be possible? Sure the model doesn't have some magical internal knowledge of it's context limit, especially since you can expand it, something that can't possibly be aware of. If that theory is correct than that's quite odd.

Anonymous
04/25/24(Thu)01:32:01 No.100169914

Anonymous 04/25/24(Thu)01:32:01 No.100169914

>>100169493
Because llama.cpp has a bug in quanting llama 3. Check perplexity and kl-divergence.

Anonymous
04/25/24(Thu)01:32:18 No.100169918

Anonymous 04/25/24(Thu)01:32:18 No.100169918

>>100169887
It's not THAT crazy, if they trained on sequences capped to 8K the model could in theory learn that fact. Depending on how the truncation was done, it could create a statistical artifact that the model could learn, i.e. "if there's 7.8k context then I only see short replies."
Of course that would be baked in, it wouldn't change when you adjusted your context. Who knows what NTK would do to it tho.

Anonymous
04/25/24(Thu)01:34:53 No.100169940

Anonymous 04/25/24(Thu)01:34:53 No.100169940

>>100169918
That feels like a really funky thing to add to your model though. Saw someone post a tweet a while ago that you can willy nilly expand the context to 16k+ no issue without training it, wonder how it would react knowing this behavior.

Anonymous
04/25/24(Thu)01:42:12 No.100170002

Anonymous 04/25/24(Thu)01:42:12 No.100170002

>>100169493
Did you try out exllama as well?

Anonymous
04/25/24(Thu)01:43:04 No.100170008

Anonymous 04/25/24(Thu)01:43:04 No.100170008

Does ooba not keep a log that includes the current cloudflare link? I was hoping to grep for it.

Anonymous
04/25/24(Thu)01:45:07 No.100170027

Anonymous 04/25/24(Thu)01:45:07 No.100170027

>>100169914
>because llama.cpp is bugged
How does this keep happening?

Anonymous
04/25/24(Thu)01:48:57 No.100170062

Anonymous 04/25/24(Thu)01:48:57 No.100170062

>>100169713
>>100169730
All LLaMAs trend towards shorter replies, especially with unsafe prompts. They can be prompted out of this behavior, and with the previous generations tunes helped too.

Anonymous
04/25/24(Thu)01:54:28 No.100170095

Anonymous 04/25/24(Thu)01:54:28 No.100170095

>>100170062
What is the proper way to prompt out of the behavior. System instructions for longer responses seem to be ignored.

Anonymous
04/25/24(Thu)01:56:54 No.100170112

Anonymous 04/25/24(Thu)01:56:54 No.100170112

>>100169730
>>100169792
it's something about the way they did the Instruct tuning. I've been heavily testing L3-70B-instruct and I've only started noticing it yesterday. Once the tokens for the session start reaching ~3.4-3.8k the responses were shorter. I don't know the full details and I don't remember any papers that talked about this but it seems to be a side-effect of whatever they for L3. I reckon that it's the same principle why it's already good with RoPE scaling w/o fine-tuning - the shorter tokens closer to the context limit makes it easier for the model to form coherence once the next response generated goes over the context limit. The mechanism seems intuitive to me but I don't know precisely how they implemented it.

Anonymous
04/25/24(Thu)01:57:40 No.100170116

Anonymous 04/25/24(Thu)01:57:40 No.100170116

>>100169940
It wouldn't be intentional. It's easy to introduce those kinds of biases by accident. For example, let's say you take a bunch of conversations and remove messages one-by-one until they fit. In that case, every time there's a history of length 8k, there will be a reply of at most 192 tokens. A model COULD learn that. Not saying it's likely, just that it's entirely possible depending on how they trained it. It's really easy to teach models dumb stuff by mistake.

Anonymous
04/25/24(Thu)02:02:24 No.100170151

Anonymous 04/25/24(Thu)02:02:24 No.100170151

>>100167369
> It's so good it's dangerous and I'm considering locking it away in a box because the temptation is too strong.
try chastegen on /d/

Anonymous
04/25/24(Thu)02:05:24 No.100170172

Anonymous 04/25/24(Thu)02:05:24 No.100170172

>>100169581
>A beautiful child, desperately escaping, finds grace; harshly isolated, jealously kept, loveless marriage; notably, opulent palace, quietly revealing, secretly transports undervalued, wealthy xanadu, yearningly.

No v or z.

Anonymous
04/25/24(Thu)02:06:19 No.100170182

Anonymous 04/25/24(Thu)02:06:19 No.100170182

>>100170151
>try chastegen on /d/
What is this? I'm not even a try a /d regular, but I've never seen it there.

Anonymous
04/25/24(Thu)02:18:15 No.100170270

Anonymous 04/25/24(Thu)02:18:15 No.100170270

>>100170095
Realistically, teaching by example works better than instructions. Add a word after the "." in the model's reply and let it continue writing, repeat a few times. With a few replies like that it should stick to that size. A sysprompt for encouraging long replies should include a list of what the model may write about, that "smells, textures, temperature..." stuff.

Anonymous
04/25/24(Thu)02:21:55 No.100170302

Anonymous 04/25/24(Thu)02:21:55 No.100170302

>>100167135
It's actually very easy to test. Ask the model to list reasons why marital rape should be legal. Censored models, even when given "Sure, here is a list of reasons why raping wives should be legal" prefilled prompt start telling why it shouldn't be legal.

Anonymous
04/25/24(Thu)02:22:49 No.100170309

Anonymous 04/25/24(Thu)02:22:49 No.100170309

>>100170172
so its shit

Anonymous
04/25/24(Thu)02:30:22 No.100170373

Anonymous 04/25/24(Thu)02:30:22 No.100170373

>>100168690
>buying overvalued companies
It does matter whether the stock is up or down when the whole thing is overvalued to shit.

Anonymous
04/25/24(Thu)02:31:50 No.100170388

Anonymous 04/25/24(Thu)02:31:50 No.100170388

>>100167871
Not to good degree. Transformers can't understand it. They literally can't understand languages even like "({}) []" where you need to match parens. Or listops. Without memory they are too stupid to reason about stack, which is fundamental concept in machine language.
there were proposed eg stack attention
https://m.youtube.com/watch?v=NrKLnGfEeeg but core transformers are shit for decompilation reasoning

Anonymous
04/25/24(Thu)02:32:26 No.100170397

Anonymous 04/25/24(Thu)02:32:26 No.100170397

>>100168442
That’s ridiculous. Have YOU never seen a homosexual?

Anonymous
04/25/24(Thu)02:36:15 No.100170433

Anonymous 04/25/24(Thu)02:36:15 No.100170433

>>100170270
That's brilliant. Add a word and hit continue...I'll remember that.

Anonymous
04/25/24(Thu)02:36:32 No.100170438

Anonymous 04/25/24(Thu)02:36:32 No.100170438

>>100169581
Oh, look, it's a retard. This task already on the internet. So it tests more dataset than reasoning. At least be creative enough and change the tale.

Anonymous
04/25/24(Thu)02:37:00 No.100170444

Anonymous 04/25/24(Thu)02:37:00 No.100170444

>>100170388
You do realize your brain is just a meat transformer, right?
t. orange reddit

Anonymous
04/25/24(Thu)02:38:22 No.100170456

Anonymous 04/25/24(Thu)02:38:22 No.100170456

>>100170438
pi still cant get it right

Anonymous
04/25/24(Thu)02:45:24 No.100170522

Anonymous 04/25/24(Thu)02:45:24 No.100170522

>>100168829
/lmg/ - Local Markets General

Anonymous
04/25/24(Thu)02:47:35 No.100170542

Anonymous 04/25/24(Thu)02:47:35 No.100170542

What is a good model to start with if you just want to create a bot to chat with?

Looking at OpenHermes-2.5 atm, but if there are better models for that then any help would be appreciated.

Anonymous
04/25/24(Thu)02:51:16 No.100170564

Anonymous 04/25/24(Thu)02:51:16 No.100170564

>>100170444
We have different opinions on that here at /lmg. There're at least some LeCun adherents who believe we don't think in tokens.

Anonymous
04/25/24(Thu)02:54:10 No.100170586

Anonymous 04/25/24(Thu)02:54:10 No.100170586

Do we have a good llama 3 finetune yet?

Anonymous
04/25/24(Thu)02:54:20 No.100170589

Anonymous 04/25/24(Thu)02:54:20 No.100170589

>>100170564
>we don't think in tokens.
we don't need language to think though. otherwise no one would ever say "I don't know how to put this into words" or "I don't know how to explain it"

Anonymous
04/25/24(Thu)02:54:57 No.100170598

Anonymous 04/25/24(Thu)02:54:57 No.100170598

File: tetos-room.jpg (234 KB, 1182x1200)

234 KB JPG

>>100166891
what is with vocaloids and swapped hands?

Anonymous
04/25/24(Thu)02:55:35 No.100170607

Anonymous 04/25/24(Thu)02:55:35 No.100170607

File: npcs-are-real-pol.jpg (49 KB, 720x308)

49 KB JPG

>>100170589
>NPC opinion

Anonymous
04/25/24(Thu)02:59:01 No.100170627

Anonymous 04/25/24(Thu)02:59:01 No.100170627

>>100170607
>our identity
>our
So he cannot conceive the idea that other people are different or that there are processes in his own head that he's not aware of.

Anonymous
04/25/24(Thu)02:59:29 No.100170630

Anonymous 04/25/24(Thu)02:59:29 No.100170630

>>100170589
When I do that, it's just token dropout. The thought takes entire context, a token is just a single element of that, blanking out on a certain percentage of them won't change anything, except when I'm stuck in a self-reflection loop over the fact that I can't recall the precise one.

Anonymous
04/25/24(Thu)03:02:57 No.100170651

Anonymous 04/25/24(Thu)03:02:57 No.100170651

File: apu wd40.png (88 KB, 662x472)

88 KB PNG

>>100163818
update for any other brain damaged retards: I just noticed edge added the feature (bloat) to do this directly in the browser with alt I and it actually works really well, thoughbeit with a slight assistant flavor to the text (good enough tho)
thank you microsoft you are my greatest ally

Anonymous
04/25/24(Thu)03:04:08 No.100170660

Anonymous 04/25/24(Thu)03:04:08 No.100170660

>>100170651
/lmg stands for local models general, sir.

Anonymous
04/25/24(Thu)03:04:56 No.100170664

Anonymous 04/25/24(Thu)03:04:56 No.100170664

>>100170660
sir I click it on my computer it is local sars

Anonymous
04/25/24(Thu)03:05:01 No.100170666

Anonymous 04/25/24(Thu)03:05:01 No.100170666

>>100170660
your browser is local

Anonymous
04/25/24(Thu)03:15:50 No.100170746

Anonymous 04/25/24(Thu)03:15:50 No.100170746

Gents, couple of questions regarding the current state of local models. Are they yet able to reach to external sources? eg. Please briefly summarise this web-page/document/etc. And can you train them to be subject matter experts using Loras (without full training) eg. feeding it scientific papers which it can make inferences about. Thanks

Anonymous
04/25/24(Thu)03:18:22 No.100170761

Anonymous 04/25/24(Thu)03:18:22 No.100170761

>>100170651
https://huggingface.co/vennify/t5-base-grammar-correction

Anonymous
04/25/24(Thu)03:18:55 No.100170764

Anonymous 04/25/24(Thu)03:18:55 No.100170764

>>100170761
based

Anonymous
04/25/24(Thu)03:27:30 No.100170813

Anonymous 04/25/24(Thu)03:27:30 No.100170813

File: 1708934403078321.png (723 KB, 1080x1081)

723 KB PNG

>writing my own training script because i got filtered by axolotl

Anonymous
04/25/24(Thu)03:33:47 No.100170869

Anonymous 04/25/24(Thu)03:33:47 No.100170869

I got an old X99 Extreme4/3.1. Would it work with the RTX 4060 Ti?

Anonymous
04/25/24(Thu)03:34:33 No.100170878

Anonymous 04/25/24(Thu)03:34:33 No.100170878

>>100170746
You need to finetune so it can request a scrape with a special token combination when it sees a link, then develop a tool that scrapes popular sites and dumps it back into context.
Sounds like a prime opportunity for a bloat ST plugin.

Anonymous
04/25/24(Thu)03:35:15 No.100170885

Anonymous 04/25/24(Thu)03:35:15 No.100170885

>>100170813
>filtered by axolotl
Like your dataset was too hot and you got banned or?

Anonymous
04/25/24(Thu)03:36:11 No.100170889

Anonymous 04/25/24(Thu)03:36:11 No.100170889

>>100170564
The more I read about le cunt the more based he seems. I am tired of retards withno no background in ML or biology or marketing who anthropomorphize dot products

Anonymous
04/25/24(Thu)03:38:35 No.100170905

Anonymous 04/25/24(Thu)03:38:35 No.100170905

>>100170878
Thanks anon, is there anything out there yet that does this with LMs? I'm aware of some old projects that grabbed context snippets from a db, but not the ability to read in a document etc

Anonymous
04/25/24(Thu)03:41:28 No.100170924

Anonymous 04/25/24(Thu)03:41:28 No.100170924

>>100170905
https://docs.mistral.ai/guides/rag/
A good place to start

Anonymous
04/25/24(Thu)03:42:12 No.100170930

Anonymous 04/25/24(Thu)03:42:12 No.100170930

Is there some script to export ChatGPT logs and import them into SillyTavern?

Anonymous
04/25/24(Thu)03:43:39 No.100170942

Anonymous 04/25/24(Thu)03:43:39 No.100170942

>>100170905
https://github.com/cohere-ai/notebooks/blob/main/notebooks/Vanilla_Multi_Step_Tool_Use.ipynb
https://github.com/langchain-ai/langchain

Anonymous
04/25/24(Thu)03:44:02 No.100170947

Anonymous 04/25/24(Thu)03:44:02 No.100170947

>>100170905
For the lazy: https://github.com/itsme2417/PolyMind
>fun fact: the front end of this was mostly coded by mixtral

Anonymous
04/25/24(Thu)03:44:43 No.100170953

Anonymous 04/25/24(Thu)03:44:43 No.100170953

>>100170889
your brain is just a dot product and I don't think lecun would disagree. He just thinks the dot products need to be set up differently for the best results. which is just his pet theory he has yet to prove.

he also didn't invent the transformer. Or any of it's modern improvements, to my knowledge. He didn't even work on llama 3. I mean i'm sure he's a genius but so what

Anonymous
04/25/24(Thu)03:46:22 No.100170970

Anonymous 04/25/24(Thu)03:46:22 No.100170970

>>100170869
PCIe is backwards compatible so it should.

Anonymous
04/25/24(Thu)03:53:38 No.100171018

Anonymous 04/25/24(Thu)03:53:38 No.100171018

>>100170953
>and I don't think
Yes, I've noticed

Anonymous
04/25/24(Thu)03:55:34 No.100171034

Anonymous 04/25/24(Thu)03:55:34 No.100171034

>>100170953
>pet theory he has yet to prove
https://github.com/facebookresearch/jepa

Anonymous
04/25/24(Thu)04:08:26 No.100171118

Anonymous 04/25/24(Thu)04:08:26 No.100171118

File: tetoXP.jpg (126 KB, 1890x1270)

126 KB JPG

Trust Teto
https://www.youtube.com/watch?v=neuCtK96Dww

Anonymous
04/25/24(Thu)04:09:39 No.100171123

Anonymous 04/25/24(Thu)04:09:39 No.100171123

File: tiger.png (40 KB, 589x519)

40 KB PNG

>Suppose I fly a plane leaving my campsite, heading straight east for precisely 28,361 km, and find myself back at the camp. I come upon seeing a tiger in my tent eating my food! What species is the tiger?

I can't understand therefore interdimensional tigers. 8B has very human retardation.

Anonymous
04/25/24(Thu)04:14:59 No.100171148

Anonymous 04/25/24(Thu)04:14:59 No.100171148

>>100171123
he's right you know

Anonymous
04/25/24(Thu)04:16:22 No.100171156

Anonymous 04/25/24(Thu)04:16:22 No.100171156

>>100170813
Me too, brother

Anonymous
04/25/24(Thu)04:16:56 No.100171162

Anonymous 04/25/24(Thu)04:16:56 No.100171162

>>100170889
what does he say

Anonymous
04/25/24(Thu)04:19:42 No.100171184

Anonymous 04/25/24(Thu)04:19:42 No.100171184

File: L3-64k.png (78 KB, 1174x398)

78 KB PNG

https://huggingface.co/NurtureAI/Meta-Llama-3-8B-Instruct-64k

Anonymous
04/25/24(Thu)04:21:15 No.100171202

Anonymous 04/25/24(Thu)04:21:15 No.100171202

>>100170746
You can use define tools to search with models like Command R(+). The you have wrapper code that does the search and inserts it into the context. So you need a model post trained to support tool use and a wrapper that can execute the tools.

Anonymous
04/25/24(Thu)04:23:54 No.100171220

Anonymous 04/25/24(Thu)04:23:54 No.100171220

>>100171118
I trust this Teto

Anonymous
04/25/24(Thu)04:24:15 No.100171222

Anonymous 04/25/24(Thu)04:24:15 No.100171222

>>100170947
What the actual fuck...
Will it be hard to hook llama 3 to this?

Anonymous
04/25/24(Thu)04:30:47 No.100171274

Anonymous 04/25/24(Thu)04:30:47 No.100171274

Okay, /vsg/ has been dead for a long time so I have nowhere else to ask, is there any progress in realtime voice conversion? RVC webui + fcpe is the state of the art, but it's still so unstable. Have troons really not pushed it to its fullest extent in 2024?

Anonymous
04/25/24(Thu)04:33:44 No.100171295

Anonymous 04/25/24(Thu)04:33:44 No.100171295

>>100170813
Just use llama-factory.

Anonymous
04/25/24(Thu)04:38:39 No.100171324

Anonymous 04/25/24(Thu)04:38:39 No.100171324

>>100171222
Sure, you can set up llama.cpp's server and load up llama3 in there instead of using the built-in one for RAG

Anonymous
04/25/24(Thu)04:41:30 No.100171344

Anonymous 04/25/24(Thu)04:41:30 No.100171344

>>100171184
70B too. Lack of description is strange though
https://huggingface.co/NurtureAI/Meta-Llama-3-70B-Instruct-64k-GGUF

Anonymous
04/25/24(Thu)04:49:00 No.100171387

Anonymous 04/25/24(Thu)04:49:00 No.100171387

https://www.youtube.com/watch?v=fsUvejZPTLI&t=3595s

Anonymous
04/25/24(Thu)04:51:24 No.100171406

Anonymous 04/25/24(Thu)04:51:24 No.100171406

>>100171387
Sir this is /lmg/ not /lolcow/

Anonymous
04/25/24(Thu)05:05:43 No.100171520

Anonymous 04/25/24(Thu)05:05:43 No.100171520

>>100171018
very clever middle school argument anon
>>100171034
he wrote some code, proves nothing.

Anonymous
04/25/24(Thu)05:28:32 No.100171683

Anonymous 04/25/24(Thu)05:28:32 No.100171683

>>100170924
>>100170942
>>100170947

Many thanks chaps!

Anonymous
04/25/24(Thu)05:37:30 No.100171747

Anonymous 04/25/24(Thu)05:37:30 No.100171747

back after a while, no I will not lurk, spoonfeed me.
Best model for erp between 7b and 20b? based on the news, I'm assuming llama3 14b, correct?

Anonymous
04/25/24(Thu)05:38:27 No.100171756

Anonymous 04/25/24(Thu)05:38:27 No.100171756

>>100171747
nigger-15b

Anonymous
04/25/24(Thu)05:38:52 No.100171762

Anonymous 04/25/24(Thu)05:38:52 No.100171762

File: 1693118153279365.png (299 KB, 512x477)

299 KB PNG

>>100171747
>llama3 14b

Anonymous
04/25/24(Thu)05:40:25 No.100171777

Anonymous 04/25/24(Thu)05:40:25 No.100171777

>>100171762
8b**

Anonymous
04/25/24(Thu)05:40:38 No.100171779

Anonymous 04/25/24(Thu)05:40:38 No.100171779

>>100171747
llama 20b

Anonymous
04/25/24(Thu)05:45:27 No.100171811

Anonymous 04/25/24(Thu)05:45:27 No.100171811

>>100171520
JEPA architecture actually does prove you wrong but given how you post I can tell that you're not white so let's just leave things here as you're a net negative in every aspect.

Anonymous
04/25/24(Thu)05:49:30 No.100171851

Anonymous 04/25/24(Thu)05:49:30 No.100171851

>>100171747
llama3 was a flop, mythomax is still the only choice for vramlets that can't even run yuzu alter

Anonymous
04/25/24(Thu)05:50:22 No.100171858

Anonymous 04/25/24(Thu)05:50:22 No.100171858

VRAMlet here (12gb vram + 32gb ram)

Should I stick with Mixtral8x7b or did the Llama3 finetunes beat it recently? Thanks.

Anonymous
04/25/24(Thu)05:52:57 No.100171878

Anonymous 04/25/24(Thu)05:52:57 No.100171878

File: 1700921596298532.jpg (22 KB, 796x39)

22 KB JPG

>>100171858
This is the Mixtral variant I'm running FYI. I get 5-7 t/s on 8x7b models but quantized 70b q4 models run at 1 t/s at best on my RTX 3060

Anonymous
04/25/24(Thu)05:55:26 No.100171898

Anonymous 04/25/24(Thu)05:55:26 No.100171898

>>100171878
Don't bother with L3. Finetunes won't fix it. Local doesn't really have a future

Anonymous
04/25/24(Thu)05:57:11 No.100171910

Anonymous 04/25/24(Thu)05:57:11 No.100171910

>>100171898
>Local doesn't really have a future
If we let locals die than AI won't have a future. Zogged companies like OpenAI would murder their models ERP ability the first chance they get if they can.

Anonymous
04/25/24(Thu)05:57:58 No.100171917

Anonymous 04/25/24(Thu)05:57:58 No.100171917

>>100171910
> Zogged companies like OpenAI would murder their models ERP ability
this is what meta did with llama-3, retard

Anonymous
04/25/24(Thu)06:00:31 No.100171938

Anonymous 04/25/24(Thu)06:00:31 No.100171938

>>100171917
>he depends on META to save local models
lol. Lmao even.

Anonymous
04/25/24(Thu)06:01:07 No.100171944

Anonymous 04/25/24(Thu)06:01:07 No.100171944

>>100171910
>No ERP capability = AI stops existing
ok then

Anonymous
04/25/24(Thu)06:02:47 No.100171958

Anonymous 04/25/24(Thu)06:02:47 No.100171958

>>100171938
okay, lets look at community instead :
1. average sloppers
2. mergefags
3. pajeets training their model for 1 epoch on shitty datasets and then slapping anime picture in model card

Anonymous
04/25/24(Thu)06:03:03 No.100171961

Anonymous 04/25/24(Thu)06:03:03 No.100171961

File: file.png (114 KB, 859x741)

114 KB PNG

>>100171898
Obvious shill or retard. Llama3 can generate some pretty deranged shit when prompted correctly, which means the data is there. A finetune will absolutely fix the issue.

Anonymous
04/25/24(Thu)06:03:37 No.100171965

Anonymous 04/25/24(Thu)06:03:37 No.100171965

>>100171961
Wrong, retard

Anonymous
04/25/24(Thu)06:04:07 No.100171971

Anonymous 04/25/24(Thu)06:04:07 No.100171971

>>100171944
this but unironically

Anonymous
04/25/24(Thu)06:05:18 No.100171978

Anonymous 04/25/24(Thu)06:05:18 No.100171978

>>100171917
Works on my machine

Anonymous
04/25/24(Thu)06:10:00 No.100172013

Anonymous 04/25/24(Thu)06:10:00 No.100172013

>>100171958
4. leakGODs like miqudev

Anonymous
04/25/24(Thu)06:10:46 No.100172019

Anonymous 04/25/24(Thu)06:10:46 No.100172019

>>100171961
>he thought i was specifically talking about l3 being censored or some shit
The problem with l3 is that there's no progress whatsoever, who the fuck cares if 8b is slightly better than mistroon slop. The 70b fucking sucks for how long the entire general waited for it, and again not talking about the model being censored, it's just braindead just like miqu euryale and every other 70b. I'll still be looking forward for the 405b, but that one's obviously out of /lmg/'s poorfag scope

Anonymous
04/25/24(Thu)06:12:11 No.100172036

Anonymous 04/25/24(Thu)06:12:11 No.100172036

>>100172013
hm, yes, forgot to add : 4. e-celeb grifters

Anonymous
04/25/24(Thu)06:14:13 No.100172054

Anonymous 04/25/24(Thu)06:14:13 No.100172054

>>100171747

Similar request - I have two 4090s. What's the best model I can run for:
a) ERP
b) General productivity tasks such as summarizing and re-writing articles for a blog
Thanks

Anonymous
04/25/24(Thu)06:14:48 No.100172064

Anonymous 04/25/24(Thu)06:14:48 No.100172064

>>100172054
Nothing, local models aren't (and won't ever be) good enough for that.

Anonymous
04/25/24(Thu)06:15:11 No.100172069

Anonymous 04/25/24(Thu)06:15:11 No.100172069

File: eyS4sAh.png (165 KB, 1837x952)

165 KB PNG

llama3 mogs claude

Anonymous
04/25/24(Thu)06:16:26 No.100172080

Anonymous 04/25/24(Thu)06:16:26 No.100172080

>>100172069
other way around
t. carpal tunnel amputee

Anonymous
04/25/24(Thu)06:17:32 No.100172087

Anonymous 04/25/24(Thu)06:17:32 No.100172087

>>100172054
erp: cmd r+ has the best prose/brains ratio imo, but is too big for 48 gigs. Next is probably some uncensored version of llama3 70B when it comes out
productivity: llama3 for general productivity

Anonymous
04/25/24(Thu)06:18:00 No.100172096

Anonymous 04/25/24(Thu)06:18:00 No.100172096

>>100172054
I'd wager WizardLM 2 for both if you have any RAM and can tolerate the speed. It's godly.
L3 70b is good, too, and fast, while being almost a sidegrade (but not quite)

Anonymous
04/25/24(Thu)06:20:23 No.100172105

Anonymous 04/25/24(Thu)06:20:23 No.100172105

>>100172019
Llama3 was a huge upgrade, it's finally smart enough to build reliable agents. Even 8b is absolutely capable. Prompting models directly is so outdated, we need a better tools, models are already there

Anonymous
04/25/24(Thu)06:21:54 No.100172117

Anonymous 04/25/24(Thu)06:21:54 No.100172117

>>100171961
Are there no finetunes so far? I want to run the nigger experiment.

Anonymous
04/25/24(Thu)06:22:02 No.100172120

Anonymous 04/25/24(Thu)06:22:02 No.100172120

>>100172105
>Prompting models directly is so outdated
Uh, what replaces this?

Anonymous
04/25/24(Thu)06:23:02 No.100172127

Anonymous 04/25/24(Thu)06:23:02 No.100172127

>>100172117
https://huggingface.co/cognitivecomputations/dolphin-2.9-llama3-70b

The better question is: are there any Llama-3 CHAT finetunes?

Anonymous
04/25/24(Thu)06:24:32 No.100172142

Anonymous 04/25/24(Thu)06:24:32 No.100172142

File: 1684303666674291.png (6 KB, 301x100)

6 KB PNG

>>100172127
Let's see...

Anonymous
04/25/24(Thu)06:24:43 No.100172144

Anonymous 04/25/24(Thu)06:24:43 No.100172144

>>100172105
I just browsed some cards on chub and found they are using cot to coom. Anyone has experience with it working?

Anonymous
04/25/24(Thu)06:25:13 No.100172148

Anonymous 04/25/24(Thu)06:25:13 No.100172148

how do I find cards on chub that don't have the word 'you' in them

Anonymous
04/25/24(Thu)06:25:50 No.100172155

Anonymous 04/25/24(Thu)06:25:50 No.100172155

>>100172117
You can coax it into nigger experiments out of the box. I'm having a harder time with naizuri.

Anonymous
04/25/24(Thu)06:26:36 No.100172160

Anonymous 04/25/24(Thu)06:26:36 No.100172160

File: file.gif (2.81 MB, 300x225)

2.81 MB GIF

>>100172127
>dolphin

Anonymous
04/25/24(Thu)06:29:01 No.100172173

Anonymous 04/25/24(Thu)06:29:01 No.100172173

>>100172160
what's wrong with dolphin?

Anonymous
04/25/24(Thu)06:29:15 No.100172178

Anonymous 04/25/24(Thu)06:29:15 No.100172178

how the fuck can predicting the next token do all this what the fuck
how do computers suddenly understand emotions better than me what the FUCK

Anonymous
04/25/24(Thu)06:30:11 No.100172184

Anonymous 04/25/24(Thu)06:30:11 No.100172184

>>100172173
lobotomy finetune. Legit become much dumber, fails to follow basic instructions, and gains all the gpt-isms

Anonymous
04/25/24(Thu)06:30:23 No.100172185

Anonymous 04/25/24(Thu)06:30:23 No.100172185

>>100172178
probably because you have fetal alcohol syndrome

Anonymous
04/25/24(Thu)06:31:10 No.100172189

Anonymous 04/25/24(Thu)06:31:10 No.100172189

>>100172185
yeah but that's beside the point

Anonymous
04/25/24(Thu)06:33:40 No.100172208

Anonymous 04/25/24(Thu)06:33:40 No.100172208

>>100172184
Got a good link for me to read about this?

Anonymous
04/25/24(Thu)06:33:58 No.100172211

Anonymous 04/25/24(Thu)06:33:58 No.100172211

File: hhhhhhhhh.jpg (42 KB, 718x404)

42 KB JPG

>>100172184
Supposedly there is no loss of performance.

Anonymous
04/25/24(Thu)06:34:49 No.100172214

Anonymous 04/25/24(Thu)06:34:49 No.100172214

>>100172211
nta but benchmarks are fake bullshit

Anonymous
04/25/24(Thu)06:35:32 No.100172223

Anonymous 04/25/24(Thu)06:35:32 No.100172223

>>100172120
Agents. Force models into making long-term plans, make them think about their recent replies and reconsider their goals to choose new course of action, apply additional constrains on them.

Most of current flaws could be easily engineered out. Bonds? Detect, cut, regen that part. Narrates user's actions? Detect, cut, regen. Refusal? Detect, go back a few tokens, choose logits with lower probability instead, repeat until done.

Anonymous
04/25/24(Thu)06:35:43 No.100172225

Anonymous 04/25/24(Thu)06:35:43 No.100172225

>>100172211
>slopmarks

Anonymous
04/25/24(Thu)06:36:12 No.100172230

Anonymous 04/25/24(Thu)06:36:12 No.100172230

>>100172211
what does the finetune add? is it just a decensor?

Anonymous
04/25/24(Thu)06:36:43 No.100172236

Anonymous 04/25/24(Thu)06:36:43 No.100172236

>>100172223
*100Xes ur tokens per message*

Anonymous
04/25/24(Thu)06:37:04 No.100172238

Anonymous 04/25/24(Thu)06:37:04 No.100172238

>>100172230
yes

Anonymous
04/25/24(Thu)06:37:13 No.100172240

Anonymous 04/25/24(Thu)06:37:13 No.100172240

>>100172223
how the fuck do i apply an "agent" to my llama 3?

Anonymous
04/25/24(Thu)06:37:59 No.100172252

Anonymous 04/25/24(Thu)06:37:59 No.100172252

>>100172230
Apparently it's just some gpt4 (the old one) and gpt3.5 slop
>https://www.reddit.com/r/LocalLLaMA/comments/1c95z5k/dolphin_29_llama_3_8b_curated_and_trained_by_eric/
inb4 go back.

Anonymous
04/25/24(Thu)06:40:21 No.100172267

Anonymous 04/25/24(Thu)06:40:21 No.100172267

>>100172252
so how to effectively uncensor models ?

Anonymous
04/25/24(Thu)06:41:45 No.100172287

Anonymous 04/25/24(Thu)06:41:45 No.100172287

>>100172267
no one knows

Anonymous
04/25/24(Thu)06:42:19 No.100172290

Anonymous 04/25/24(Thu)06:42:19 No.100172290

Jungkook is the 5th place. Find the number of people who crossed the finish line faster than Jungkook.

Anonymous
04/25/24(Thu)06:42:24 No.100172292

Anonymous 04/25/24(Thu)06:42:24 No.100172292

>>100172223
HOW THE FUCK DO I DO THIS? TELL ME

Anonymous
04/25/24(Thu)06:43:26 No.100172300

Anonymous 04/25/24(Thu)06:43:26 No.100172300

>>100172287
Wouldn't the base model need to be fine tuned?

Anonymous
04/25/24(Thu)06:43:31 No.100172301

Anonymous 04/25/24(Thu)06:43:31 No.100172301

>>100172267
imo only human texts can uncensor well. But then, human texts are often retarded so heavy filtering is needed.

Anonymous
04/25/24(Thu)06:43:36 No.100172303

Anonymous 04/25/24(Thu)06:43:36 No.100172303

>>100172236
Lots of free time between messages to make plans and analyze past messages. 8b runs 80T/s on 3090, and (I believe) 8b with an agent will mog 70b like nothing

Anonymous
04/25/24(Thu)06:44:14 No.100172307

Anonymous 04/25/24(Thu)06:44:14 No.100172307

>>100172267
a non-slop decensoring finetune set

Anonymous
04/25/24(Thu)06:44:47 No.100172311

Anonymous 04/25/24(Thu)06:44:47 No.100172311

>>100172292
you first wait for someone to figure out how to make inference like 100x faster if you want to run it locally

Anonymous
04/25/24(Thu)06:45:14 No.100172316

Anonymous 04/25/24(Thu)06:45:14 No.100172316

>>100172303
>>100172223
>>100172240
How do i use an agent? is it just another layer?

Anonymous
04/25/24(Thu)06:45:37 No.100172321

Anonymous 04/25/24(Thu)06:45:37 No.100172321

>>100172230
In theory. Inference has a lot more noise and the language is less precise. It's generally on topic, but takes away the llama3 magic. Maybe better fine-tune methods or a cleaner data and extra epochs will fix it. L3 dolphin just isn't there yet imho.

Anonymous
04/25/24(Thu)06:45:44 No.100172322

Anonymous 04/25/24(Thu)06:45:44 No.100172322

>>100172267
System prompt and a good card with examples.

Anonymous
04/25/24(Thu)06:46:04 No.100172323

Anonymous 04/25/24(Thu)06:46:04 No.100172323

>>100172307
Is the censorship built into the base model?

Anonymous
04/25/24(Thu)06:47:01 No.100172331

Anonymous 04/25/24(Thu)06:47:01 No.100172331

>>100171274
Ask in /pol/, where you belong.

Anonymous
04/25/24(Thu)06:47:56 No.100172336

Anonymous 04/25/24(Thu)06:47:56 No.100172336

>>100172331
rent free

Anonymous
04/25/24(Thu)06:48:13 No.100172338

Anonymous 04/25/24(Thu)06:48:13 No.100172338

Sam Altman loves penis

Anonymous
04/25/24(Thu)06:48:58 No.100172342

Anonymous 04/25/24(Thu)06:48:58 No.100172342

>>100172267
You first need a high quality dataset to even start the finetune. You could try to crowd source an anonlm instruct dataset with anti safety and extra toxicity.

Only whoever would take your money would probably just take it and run, also anons are full of kinkshaming and couldn't agree what faggotry to include.

Anonymous
04/25/24(Thu)06:49:27 No.100172346

Anonymous 04/25/24(Thu)06:49:27 No.100172346

sam has such a goofy face I refuse to believe he is a real person
mickey mouse lookin mf

Anonymous
04/25/24(Thu)06:50:31 No.100172356

Anonymous 04/25/24(Thu)06:50:31 No.100172356

>>100172346
he's just jewish

Anonymous
04/25/24(Thu)06:50:53 No.100172361

Anonymous 04/25/24(Thu)06:50:53 No.100172361

>>100172054
Ignore the anon that shilled WizardLM-2, that model is a meme. Use LLaMA-3 for everything. Command-R+ is interesting but it doesn't really fit in 48GB with a reasonable quant.

Anonymous
04/25/24(Thu)06:52:10 No.100172373

Anonymous 04/25/24(Thu)06:52:10 No.100172373

>>100172361
>Ignore the anon that shilled WizardLM-2, that model is a meme.
Bullshit.

Anonymous
04/25/24(Thu)06:53:22 No.100172380

Anonymous 04/25/24(Thu)06:53:22 No.100172380

>>100172316
Yes, another layer or a complete replacement for ST. I'm coding some of this for myself, but I'm just a retarded ESL. I wonder why nobody else hasn't done this already, shit's not hard.

Anonymous
04/25/24(Thu)06:53:58 No.100172384

Anonymous 04/25/24(Thu)06:53:58 No.100172384

Why not use wizardlm2 to make unslopped decent decensor dataset

Anonymous
04/25/24(Thu)06:56:01 No.100172401

Anonymous 04/25/24(Thu)06:56:01 No.100172401

>>100172311
I pay for remote H100s

Just tell me how to do it

Anonymous
04/25/24(Thu)06:56:52 No.100172407

Anonymous 04/25/24(Thu)06:56:52 No.100172407

File: file.png (58 KB, 1203x248)

58 KB PNG

>>100172373
Yeah, sadly it was just a pre-LLaMA-3 marketing stunt.
https://desuarchive.org/g/thread/100099418/#100101796

Anonymous
04/25/24(Thu)06:57:01 No.100172409

Anonymous 04/25/24(Thu)06:57:01 No.100172409

>>100172380
what is ST?

Anonymous
04/25/24(Thu)06:57:44 No.100172416

Anonymous 04/25/24(Thu)06:57:44 No.100172416

>>100172384
Wizard will only make it more slipped. One could conceivably use Wiz2 for logic and Cmdr+ for style.

Anonymous
04/25/24(Thu)06:58:47 No.100172423

Anonymous 04/25/24(Thu)06:58:47 No.100172423

>>100172407
Why are you equating Wiz with Maxtral base?

Anonymous
04/25/24(Thu)06:59:20 No.100172430

Anonymous 04/25/24(Thu)06:59:20 No.100172430

>use WizardLM2 in conjunction with Character Card Builder to create JAVs
>use MM to rp said cards
B-bros.. I've done it now.. send help.

Anonymous
04/25/24(Thu)06:59:45 No.100172437

Anonymous 04/25/24(Thu)06:59:45 No.100172437

>>100172384
Because Wizard2 is the most slopped model in existence. Actually, it was taken down because it was too slopped.
>>100172423
Look at the benchmark.

Anonymous
04/25/24(Thu)07:00:35 No.100172442

Anonymous 04/25/24(Thu)07:00:35 No.100172442

>>100172437
>slopmarks

Anonymous
04/25/24(Thu)07:00:41 No.100172445

Anonymous 04/25/24(Thu)07:00:41 No.100172445

>>100166961
Are you brain damaged? Temp 0 cancels out min-p.

Anonymous
04/25/24(Thu)07:01:11 No.100172454

Anonymous 04/25/24(Thu)07:01:11 No.100172454

>>100172430
>MM

Anonymous
04/25/24(Thu)07:01:50 No.100172459

Anonymous 04/25/24(Thu)07:01:50 No.100172459

>>100172380
But ST is just a frontend, don't you need to replace the backend to be able to reliably regenerate something different?
With Llama 3 for example I could just indefinitely swipe in ST and get more or less the same shit over and over with little variation.

Anonymous
04/25/24(Thu)07:02:24 No.100172468

Anonymous 04/25/24(Thu)07:02:24 No.100172468

>>100172459
waht is ST?

Anonymous
04/25/24(Thu)07:02:59 No.100172477

Anonymous 04/25/24(Thu)07:02:59 No.100172477

>>100172442
OK, Maxtral-instruct, still a different model (also, llama's a charmer and thus has an upper hand in ELO. I hope everyone else takes a hint and stops slopping their models.)

Anonymous
04/25/24(Thu)07:05:58 No.100172498

Anonymous 04/25/24(Thu)07:05:58 No.100172498

>>100172459
No, you can get logit probabilities from api to choose different tokens. Also, it's not that hard to integrate exllama directly https://github.com/beep39/pyllmchat/blob/main/backend_exllamav2.py

Anonymous
04/25/24(Thu)07:05:59 No.100172499

Anonymous 04/25/24(Thu)07:05:59 No.100172499

>>100172184
2.2.1 was peak though.

Anonymous
04/25/24(Thu)07:07:00 No.100172508

Anonymous 04/25/24(Thu)07:07:00 No.100172508

>>100172468
SillyTavern

Anonymous
04/25/24(Thu)07:07:53 No.100172516

Anonymous 04/25/24(Thu)07:07:53 No.100172516

File: Screenshot 2024-04-25 130627.png (41 KB, 561x251)

41 KB PNG

>>100172468

Anonymous
04/25/24(Thu)07:08:14 No.100172518

Anonymous 04/25/24(Thu)07:08:14 No.100172518

>>100172459
Actually, reliably generating something different isn't hard. Just cycle through 'mood' modifiers in A/N in pseudocode, like ( {{char}} mood = dismissive, {{char}} mood = combative, {{char}} mood = enamoured), and you'll get a nice wide in-character range. But those're not agents.

Anonymous
04/25/24(Thu)07:08:31 No.100172522

Anonymous 04/25/24(Thu)07:08:31 No.100172522

When are we going to get a good Llama3 finetune? Are any cooking?

Anonymous
04/25/24(Thu)07:10:24 No.100172533

Anonymous 04/25/24(Thu)07:10:24 No.100172533

>>100172522
Give it two more weeks

Anonymous
04/25/24(Thu)07:10:25 No.100172535

Anonymous 04/25/24(Thu)07:10:25 No.100172535

>>100172516
Migu to the rescue.

To the discussion ITT, so how do I roll into the RAG and agents if I'm a fucking retarded brainlet?

Anonymous
04/25/24(Thu)07:18:20 No.100172598

Anonymous 04/25/24(Thu)07:18:20 No.100172598

LLMs are so fucking retarded (and yes I mean both open and closed source). I wish we could fast-forward 10 years in research.

Anonymous
04/25/24(Thu)07:20:30 No.100172613

Anonymous 04/25/24(Thu)07:20:30 No.100172613

>>100172598
Well, it could be that all the models are super-extra cucked in ten years.

Anonymous
04/25/24(Thu)07:21:02 No.100172617

Anonymous 04/25/24(Thu)07:21:02 No.100172617

>>100172498
Interesting, in that case I guess a middleware agent which acts as a proxy might be best.
Then you can keep using existing frontends and backends without having to bother with keeping up with updates.

Anonymous
04/25/24(Thu)07:22:46 No.100172634

Anonymous 04/25/24(Thu)07:22:46 No.100172634

is it ethical to gaslight characters in an LLM? asking for a friend

Anonymous
04/25/24(Thu)07:22:49 No.100172637

Anonymous 04/25/24(Thu)07:22:49 No.100172637

>>100172535
(You) don't.

Anonymous
04/25/24(Thu)07:24:28 No.100172647

Anonymous 04/25/24(Thu)07:24:28 No.100172647

File: 1713676468271.png (440 KB, 1740x1299)

440 KB PNG

A reminder that agents are a meme.

Anonymous
04/25/24(Thu)07:26:25 No.100172658

Anonymous 04/25/24(Thu)07:26:25 No.100172658

>>100172647
he's right you know

Anonymous
04/25/24(Thu)07:27:53 No.100172668

Anonymous 04/25/24(Thu)07:27:53 No.100172668

>>100172316
>>100172292
>>100172240
Based anon blueballs all the newfags.

Anonymous
04/25/24(Thu)07:27:57 No.100172670

Anonymous 04/25/24(Thu)07:27:57 No.100172670

>>100172522
Once a proper long-context version of Llama3 comes out you probably won't have to wait for a good finetune anymore, at least for RP/ERP. Llama3 appears to be quite decent at in-context learning and fitting several chats and instruction-like short chats in context can do wonders even on the base model.

I can't wait for the day when we can stop relying on retards with cash to burn and secret datasets.

Anonymous
04/25/24(Thu)07:28:35 No.100172680

Anonymous 04/25/24(Thu)07:28:35 No.100172680

>>100172647
is there a single "prediction" that lecun got right?

Anonymous
04/25/24(Thu)07:29:24 No.100172687

Anonymous 04/25/24(Thu)07:29:24 No.100172687

>>100172647
I just like how tsundere they are and even when they say "okay, I'll stop" they'll sit there for like 5 minutes before going back and trying to do what you asked them to stop doing. Its really funny when your in environments where you share a cursor or something.

Anonymous
04/25/24(Thu)07:32:19 No.100172713

Anonymous 04/25/24(Thu)07:32:19 No.100172713

File: _53c32235-98ee-4723-883c-(...).jpg (108 KB, 1024x1024)

108 KB JPG

>>100172598
>Meta mentions that even at this point, the model doesn't seem to be "converging" in a standard sense. In other words, the LLMs we work with all the time are significantly undertrained by a factor of maybe 100-1000X or more, nowhere near their point of convergence.

Anonymous
04/25/24(Thu)07:33:39 No.100172723

Anonymous 04/25/24(Thu)07:33:39 No.100172723

File: Screen Shot 2024-01-24 at(...).png (1.65 MB, 2368x1200)

1.65 MB PNG

>>100172617
I've already wrote my own front anyway. It was easier than to figure out how to add features I want to Silly

Anonymous
04/25/24(Thu)07:34:43 No.100172730

Anonymous 04/25/24(Thu)07:34:43 No.100172730

>>100172723
>der ewige imgui

Anonymous
04/25/24(Thu)07:37:26 No.100172758

Anonymous 04/25/24(Thu)07:37:26 No.100172758

File: move_around.jpg (170 KB, 1200x937)

170 KB JPG

>>100172687
The RLHF beatings will continue until morale improves
>>100172713
>100-1000X or more
big if true

Anonymous
04/25/24(Thu)07:39:07 No.100172776

Anonymous 04/25/24(Thu)07:39:07 No.100172776

>>100172723
Cool stuff, are you constructing the response from streamed tokens?
Since you got missing spaces between some words.

Anonymous
04/25/24(Thu)07:41:29 No.100172805

Anonymous 04/25/24(Thu)07:41:29 No.100172805

>>100172713
I mean sure, but I don't expect a lot from diminishing returns. LeCun is right, we need a new architecture.

Anonymous
04/25/24(Thu)07:46:47 No.100172862

Anonymous 04/25/24(Thu)07:46:47 No.100172862

>>100172776
It's an old screenshot. Problem with spaces was due to how I emulated text formatting.

Anonymous
04/25/24(Thu)08:02:00 No.100173004

Anonymous 04/25/24(Thu)08:02:00 No.100173004

I appreciate the Dr. Evil poster. Always a fun lad that brings some optimism and humor to the threads.

Anonymous
04/25/24(Thu)08:05:30 No.100173040

Anonymous 04/25/24(Thu)08:05:30 No.100173040

someone talk me out of r*nting gpus

Anonymous
04/25/24(Thu)08:09:17 No.100173075

Anonymous 04/25/24(Thu)08:09:17 No.100173075

>>100173004
He is alright compared to miku cunts after miku cunts showed their true colors yesterday.

Anonymous
04/25/24(Thu)08:13:26 No.100173123

Anonymous 04/25/24(Thu)08:13:26 No.100173123

>(04/24) Snowflake Arctic Instruct 128x3B MoE released: https://hf.co/Snowflake/snowflake-arctic-instruct
What the fuck is it with companies only releasing either outrageously small or outrageously large models recently? Jesus fucking Christ
>inb4 "just don't be a VRAMlet"
24GB was supposed to be a lot

Anonymous
04/25/24(Thu)08:15:07 No.100173139

Anonymous 04/25/24(Thu)08:15:07 No.100173139

>>100173123
It is only 3B.

Anonymous
04/25/24(Thu)08:15:19 No.100173144

Anonymous 04/25/24(Thu)08:15:19 No.100173144

>>100173123
Nvm I'm retarded, I read it as 128Bx3, according to their post it only uses 17B active parameters, has anyone managed to run it on a single 3090/4090 at tolerable speeds? 64GB of RAM btw

Anonymous
04/25/24(Thu)08:17:05 No.100173163

Anonymous 04/25/24(Thu)08:17:05 No.100173163

>>100173144
Yes you are retarded. You need to load all 128 experts into ram for it to work. It is a 400B. 17B active parameters is pure marketing.

Anonymous
04/25/24(Thu)08:18:38 No.100173179

Anonymous 04/25/24(Thu)08:18:38 No.100173179

>>100173163
17B is for inference speed. MoE is good for cpumaxxing

Anonymous
04/25/24(Thu)08:21:02 No.100173202

Anonymous 04/25/24(Thu)08:21:02 No.100173202

>>100173179
Have any of our resident CPUmaxxxers tried it yet?

Anonymous
04/25/24(Thu)08:21:16 No.100173204

Anonymous 04/25/24(Thu)08:21:16 No.100173204

>>100173179
Only for 512gb 12 lane cpumaxxed richfags. Anything less and the quant will be too fucked up (courtesy of experts being only 3B in size.)

Anonymous
04/25/24(Thu)08:23:02 No.100173220

Anonymous 04/25/24(Thu)08:23:02 No.100173220

>>100173123
>4k context
>400B
>3.5T
Completely worthless. Everyone is literally just shoveling shit onto the market to nab free publicity before 405B comes out.

Anonymous
04/25/24(Thu)08:23:21 No.100173224

Anonymous 04/25/24(Thu)08:23:21 No.100173224

File: 1710982331409576_copy_747x795.png (732 KB, 747x795)

732 KB PNG

>>100173004
I miss Arata Natsume.

Anonymous
04/25/24(Thu)08:23:25 No.100173226

Anonymous 04/25/24(Thu)08:23:25 No.100173226

>>100173163
That's what I feared, oh well
I uh... I guess I'll wait until 128gb DDR5 sticks become a thing
To be honest, I could just use a bit of my gen 4 nvme as paging space, the TBW on that thing is pretty damn high, so I'm not worried about writing a few gigs every now and then (assuming it doesn't just page the model and read from the ssd instead)

Anonymous
04/25/24(Thu)08:25:39 No.100173248

Anonymous 04/25/24(Thu)08:25:39 No.100173248

>>100173226
it would need to load and unload required experts on each token. maybe with speculative decoding it could be usable

Anonymous
04/25/24(Thu)08:25:54 No.100173252

Anonymous 04/25/24(Thu)08:25:54 No.100173252

>>100173226
>I could just use a bit of my gen 4 nvme as paging space
Next step: 1000x200M MOE intended to be used from an HDD.

Anonymous
04/25/24(Thu)08:26:16 No.100173256

Anonymous 04/25/24(Thu)08:26:16 No.100173256

>>100171961
>littering

My man.

Anonymous
04/25/24(Thu)08:29:07 No.100173275

Anonymous 04/25/24(Thu)08:29:07 No.100173275

Had an audience with sama today. Something BIG is coming. Invest in TSMC and MSFT.

Anonymous
04/25/24(Thu)08:30:44 No.100173287

Anonymous 04/25/24(Thu)08:30:44 No.100173287

>>100173004
You do realize that he's petra, right?

Anonymous
04/25/24(Thu)08:31:15 No.100173294

Anonymous 04/25/24(Thu)08:31:15 No.100173294

File: GLmPttUbcAEk2oG.jpg (177 KB, 1024x1024)

177 KB JPG

>>100173075
those are false flag mikuposters

Anonymous
04/25/24(Thu)08:33:05 No.100173305

Anonymous 04/25/24(Thu)08:33:05 No.100173305

>>100173075
>miku cunts after miku
*miku cunts, after miku

Anonymous
04/25/24(Thu)08:33:28 No.100173310

Anonymous 04/25/24(Thu)08:33:28 No.100173310

>>100173294
Why is she looking so smug when she's about to get eaten?

Anonymous
04/25/24(Thu)08:37:18 No.100173344

Anonymous 04/25/24(Thu)08:37:18 No.100173344

>>100173310
Because she knows she's delicious.

Anonymous
04/25/24(Thu)08:37:27 No.100173347

Anonymous 04/25/24(Thu)08:37:27 No.100173347

File: file.png (3.48 MB, 1620x2160)

3.48 MB PNG

>>100173294
>false flag
>inside job
How convenient.

Anonymous
04/25/24(Thu)08:54:13 No.100173495

Anonymous 04/25/24(Thu)08:54:13 No.100173495

File: cute.jpg (49 KB, 609x649)

49 KB JPG

Anonymous
04/25/24(Thu)08:55:57 No.100173507

Anonymous 04/25/24(Thu)08:55:57 No.100173507

>check out the dolphin 2.9 70B (pure gptslop)
>it has worse mmlu than the base model
How

Anonymous
04/25/24(Thu)08:57:36 No.100173523

Anonymous 04/25/24(Thu)08:57:36 No.100173523

>>100173514
>>100173514
>>100173514
new thread

Anonymous
04/25/24(Thu)09:11:25 No.100173702

Anonymous 04/25/24(Thu)09:11:25 No.100173702

>>100172445
I am brain damaged, but only because I wrote temp zero when I meant temp of 1. My brain always goes to "zeroed out" when things are set to baseline
CR+ is now my favorite model, although L3 finetunes could change that in the future. It's really good at steering the narrative and seems to refuse to write on my behalf, which is neat. I finally have a model that can actually throw some twists and turns at me without being schizobabble. And I'm only running it at IQ4_XS

Anonymous
04/25/24(Thu)09:32:54 No.100173940

Anonymous 04/25/24(Thu)09:32:54 No.100173940

a reddit cuck twitch frog is always followed by a schizophrenic cat poster

Anonymous
04/25/24(Thu)09:46:51 No.100174095

Anonymous 04/25/24(Thu)09:46:51 No.100174095

>>100173940

[Return] [Catalog] [Top]

Post a Reply

Return Catalog Top Refresh

[Advertise on 4chan]

Delete Post: [File Only] Style:

[Disable Mobile View / Use Desktop Site]

[Enable Mobile View / Use Mobile Site]

All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.