/g/ - Technology




File: 1713291463105789.jpg (980 KB, 1856x2464)
/lmg/ - a general dedicated to the discussion and development of local language models.

Luka Love Edition

Previous threads: >>101990712 & >>101981616

►News
>(08/20) Microsoft's Phi-3.5 released: mini+MoE+vision: https://hf.co/microsoft/Phi-3.5-MoE-instruct
>(08/16) MiniCPM-V-2.6 support merged: https://github.com/ggerganov/llama.cpp/pull/8967
>(08/15) Hermes 3 released, full finetunes of Llama 3.1 base models: https://hf.co/collections/NousResearch/hermes-3-66bd6c01399b14b08fe335ea
>(08/12) Falcon Mamba 7B model from TII UAE: https://hf.co/tiiuae/falcon-mamba-7b
>(08/09) Qwen large audio-input language models: https://hf.co/Qwen/Qwen2-Audio-7B-Instruct

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Programming: https://hf.co/spaces/mike-ravkine/can-ai-code-results

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
>>102001133
We've survived... but at what cost?
>>
>only 2 posts
/lmg/ is dead
>>
File: ComfyUI_00014_.png (1.18 MB, 1024x1024)
Will AI make my burger stonks affordable again?
>>
File: GTY5zp2agAAUMEy.jpg (105 KB, 1200x1188)
>trying some kinky stuff
>practically all models are infinitely more willing to fuck a literally feral dog than a child
We live in a society
>>
Recent Highlights from the Previous Thread?
More like Schmecent Schmilights from the Schmevious Schmed.
>>
Doing some training on 2x3090 ti's and I realized that the ETA doesn't really change a lot when I do nvidia-smi -pl 300 vs nvidia-smi -pl 400. I always assumed it was negligible for inference but not for training.
>>
>>102001322
lmao gottem
>>
File: 65456465465.png (79 KB, 1892x742)
"Mistral large is GPT-4 tier" yeah right, a 27b model still performs better on random trivia than a 123b. There is no excuse for this.
>>
>>102001133
>>
>>102001347
everyone knows that big models are cope for retards who overspent on hardware
there's hardly any improvement beyond 27b or maybe even 12b considering nemo
>>
>>102001347
gemma can't have sex
>>
File: 46545645564.png (215 KB, 1839x781)
>>102001347
Knows the lyrics to the duck song but is still retarded about what happens during it

"The song tells the story of a duck who goes into a store to buy some grapes, but unfortunately, the store does not sell grapes. The duck then asks for various other items, each time being told they are not available, leading to a comical and repetitive exchange."

How retarded is that? It just regurgitates data it does not fully understand. This tells me there's a massive waste of data there. I've been telling you all Gemma is by far the most intelligent open model.
>>
>>102001382
>gemma can't have sex

Where are the tunes?
>>
>>102001394
Is there an uncucked version of gemma?
>>
https://hf.co/MangoHQ/TinyMagnum-4b

leaked magnum model?
>>
►Recent Highlights from the Previous Thread: >>101990712

--No Python skills needed to install local language model, but required for training: >>101998830 >>101999031
--Running 70B 4bit quant model on a single 3090 and CPU, speed estimates and VRAM utilization discussion: >>101993566 >>101993582 >>101995240 >>101995293 >>101995389 >>101993624
--Microsoft's Phi-3.5 AI model released: >>101997221 >>101997326 >>101997748 >>101998144 >>101998634 >>101998666 >>101999035 >>101998170 >>101998239 >>101998274
--Llama 3.1 repetition and penalty settings discussion: >>101995250 >>101995337 >>101995416 >>101995451 >>101995494 >>101995529 >>101995551 >>101995676 >>101995524 >>101995889 >>101999802
--Discussion on using FP16 for prompt processing and lower quants for token generation: >>101993920 >>101994184 >>101994601 >>101994437 >>101994666 >>101994443 >>101994493 >>101994546 >>101994584 >>101994763 >>101994937 >>101995118 >>101995680 >>101995884
--Pre-llama era models suggested for worst RP experience: >>101999152 >>101999249
--Mistral Nemo 12B recommended for 12GB VRAM: >>101997827 >>101997858 >>101997908 >>101997951 >>101998024 >>101997961 >>101998131 >>101998225
--Advice on adapting 3.5k sys prompt for Mistral large: >>101991079 >>101991095 >>101991115
--Nemo vs other under 70b models, base vs finetunes, intelligence, writing style, context, and prompting: >>101994638 >>101994667 >>101994745 >>101995985 >>101996041 >>101996305 >>101996149 >>101996206 >>101995960
--Magnum 123b model review and rant against tuners' practices: >>101997022
--Criticism of sloptuners and Nvidia's practices: >>101991923 >>101991958 >>101991997 >>101992019 >>101992932 >>101992965 >>101993396 >>101993477 >>101993287 >>101993397 >>101993561
--Context saving idea for efficient prompt processing: >>101994949
--Miku (free space): >>101991128 >>101991825 >>101993017 >>101993048 >>101994222 >>101996742 >>101997197 >>101999088 >>102000707

►Recent Highlight Posts from the Previous Thread: >>101990920
>>
alright does anyone have *the* Nala card? I'm not on my main PC so I don't have it here, and my finger is swollen so I don't feel like searching a whole lot for it.
>>
>>102001513
https://www.characterhub.org/characters/Anonymous/Nala
>>
File: ahh ahh archer.png (129 KB, 930x562)
>>102001527
Well, it's not my usual template; I had to whip one up from scratch based on my memory of the appropriate special tokens. But at t=1 neutral she turns into an archer and murders me
>>
File: 1710729147915598.jpg (236 KB, 1024x1024)
>>102001133
>>
>>102001347
That's an issue with all Mistral models tbf
>>
>>102001347
It's over for Mistral. Redownloading Gemma
>>
>>102001619
heading to the miku convention with miku and miku... and miku...
>>
Sup. I'm the guy that likes clothes again. Does this look like an effective addition to system prompt?

>Extensively detail breasts with detail about their motion, shape, size, etc... Extensively detail clothes and how they fit on a person's body, with details like the tautness of the fabric, the glossiness of sheer materials, etc...

I'm trying different models but honestly it doesn't seem to affect much.
>>
I think Phi-MoE has potential.
>>
>>102001347
I wouldn't use a single or even a few questions to judge a model's complete trivia knowledge. And I also wouldn't seriously use lmsys, since it does not give you token probabilities. However, what I can confirm is that Largestral at IQ2_M does seem to perform badly on the Castlevania quote, which I just tested. But it seems that on a different trivia question which I tested on lmsys, pic related, Largestral does perform a bit better. I rerolled once to see if it would change and it didn't. They gave the same answer both times. IQ2_M locally also gives the same answer which is interesting since that quant should be lobotomizing. For the question, Mistral got both characters right although the order was wrong, while Gemma thought of a completely different character from a different KR series.

Frankly since trivia is extremely easy to make problems for, you should make an actual benchmark, which could be composed of 100 multiple choice questions, and automate it with a script and get token probabilities for the answers. You can be taken seriously this way, and also be a good contributor to the public knowledge of LLMs. If you don't have the hardware to run large models then you could ask someone, I'm sure there are willing people in the thread. I would certainly be up to run such a benchmark, if someone made it, on the models I have the ability to run.

As for the general Gemma debate, honestly in my short time testing it, it did seem pretty good based on my feels, especially considering the size. I just didn't use it any further because of the 8k.
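Re: the multiple-choice benchmark idea — a rough sketch of what such a harness could look like, assuming llama-cpp-python. The model path, the question format, and the scoring are all placeholders, not a finished benchmark:

# rough sketch, not a finished benchmark; model path and questions are made up
from llama_cpp import Llama

llm = Llama(model_path="largestral-iq2_m.gguf", n_ctx=4096, logits_all=True)

questions = [
    # 100 entries like this, "answer" being the letter of the correct choice
    {"prompt": "Which game is the quote '...' from?\nA) ...\nB) ...\nC) ...\nD) ...\nAnswer:", "answer": "B"},
]

score = 0
for q in questions:
    out = llm.create_completion(q["prompt"], max_tokens=1, temperature=0.0, logprobs=10)
    top = out["choices"][0]["logprobs"]["top_logprobs"][0]  # logprob of each candidate first token
    picked = max(top, key=top.get)
    score += picked.strip() == q["answer"]
print(score, "/", len(questions))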
>>
>>102001704
123b is undertrained to an extent, quantization hurts it a lot less.
Mistral was crying about not having enough GPUs and using MoE for a reason.
>>
File: 1724201905616.png (204 KB, 600x633)
>>102001464
im starting to think these highlights are AI generated
>>
File: 1711129548940742.png (5 KB, 693x138)
>>102001678
>43B total
>performs worse than Nemo 12B or fucking 9B Gemma
MoE? More like MemE. No wonder Mistral has abandoned the concept.
>>
>>102001715
anon...
>>
>>102001734
It's a MoE of braindead retards (~4B each); they shot themselves in the foot with this.
>>
>>102001734
>Arena
Kek.
The real benchmark to look for will be Livebench; even though it's not perfect, it's still the best we have currently for general knowledge. RP skill is a different story though, and there don't seem to be any decent benches for that.
>>
File: 1697033028019407.png (28 KB, 1101x517)
The most interesting thing about Phi-MoE is that it was trained on 5T tokens in just over 20 days.
>>
>>102001734
>Llama 3.1 8b score
Wtf metabros, our model is shit; even retarded MoEs are mogging the shit out of it
>>
>>102001347
>>102001394
A couple of cherry picked examples don't mean anything. All models hallucinate given certain queries. Even if Gemma is less prone to confabulate on average, the censorship makes it useless.
>>
>>102001788
Let me guess, you own three 3090s? 4?
>>
>>102001734
Because it's only 6.6B active, anyone with 32 gigs of RAM should be able to run it in GGUF form. Meme prompt aside, I do like what I've seen from it so far more than what I've seen from 8B. Gives it the potential to be the entry-level model of choice in the future.
>>
>>102001827
When I say 32 gigs of RAM, I of course mean at Q4. Although we'll have to wait until it gets llama.cpp support before we can really see how well it handles quantization. As of right now I reckon I'm the only person in this thread who has actually used it at all, considering it takes a quad GPU rig to run it in f16
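Very rough numbers for anyone wondering (assuming ~42B total / 6.6B active and roughly 0.6 bytes per weight at Q4-ish): weights ≈ 42B × 0.6 ≈ 25 GB, so it squeezes into 32 GB with a little room left for the KV cache and the OS. Per token you only stream the ~6.6B active params ≈ 4 GB, so on ~60 GB/s dual-channel DDR5 the theoretical ceiling is somewhere around 10-15 t/s.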
>>
>>102001827
>Because it's only 6.6B Active anyone with 32 gigs of RAM
that's not how it works...
>>
>>102001823
Nah, I have a 4070 and 64gb RAM. Nemo instruct Q6_K is my bread and butter. I need low latency responses and don't want to build a $20k rig or be a cloud cuck.
>>
Why is /lmg/ a thing anymore when mythomax STILL hasn't been surpassed
>>
>>102001788
>Even if Gemma is less prone to confabulate on average
Not according to Livebench.

>>102001774
It's ok, their 70B is still the top local model outside of 405B. The 8B might've been a bit fucked and they just didn't bother to make it better. Or possibly we hit a limit on how much knowledge an 8B can hold, and the reason Gemma 2 9B is better is that it lacks long context capability. That would be unfortunate if true, i.e. that training for longer context understanding takes more neurons, though it makes sense.
>>
>>102001866
see
>>102001863
Retard.
Compute power needed for inferencing is based on the active parameters. They'll still need a GPU to do the batch processing initially but after the first prompt they'll be good to go. It should run reasonably well on full cpu offload for most people.
>>
File: 12552.png (208 KB, 1756x838)
>>102001347
both models are a meme.
>>
>>102001911
Wait what the hell, it looks like the GPT-4 models literally got worse on this benchmark and now 405B beats all the GPT-4s on the chart. What is ClosedAI doing?
>>
>>102001881
>>102002036
buy an ad
>>
File: 1379637883196.png (92 KB, 293x282)
Let's play a game! This Saturday at 1 PM PT, I will do a collaborative storytelling/RP session (location TBD, maybe in the thread itself?), where I post a scenario and responses from the model in the thread, and people discuss what to do in the user chat turns, or edit previous user turns or the system prompt and start over. This is going to be both for fun and to get us (mostly) reproducible reference logs, as I'll be using greedy sampling in Mikupad and have the full log in a pastebin at the end. No editing the model's responses, we're going to use pure prompting to try and get the thing to do what we want!

The scenario is also still TBD. We're going to go for as long a context as possible until the model breaks down uncontrollably, so it should be a complex enough scenario for that. If anyone has suggestions for scenarios I'm all ears. Also, I'm planning on starting these games with Mistral Nemo at Q8 for the first session, and other models in the future, so we have reference logs available for a whole range. But I'll take suggestions for models people want. I'm only a 36 GB VRAMlet though so I'm a bit limited. I can run larger models up to ~88 GB but it'd be slower. If anyone would like to host any of these games themselves, that has more VRAM to run such larger models at a good speed, please do, and I will step down.
>>
>>102002167
Why do you think this is going to be fun? Sounds really lame.
>>
File: Designer.png (1.13 MB, 1024x1024)
What should I put in the sysprompt so that the model only shits out dialogue and descriptions, NOT thoughts?
Every model I've tried so far spends more time rambling about the character's thoughts and feefees instead of actually talking or doing stuff.
>>
>>102002167
>If anyone has suggestions for scenarios I'm all ears.
Illuminati meeting in a secret bunker. 3 distinct characters handling random world events.
>>
>>102002201
Idk lol. I'm doing it anyway though.
>>
>>102002238
Is Nemo capable of group chats like that? I guess we could see. I never actually tried the model that way.
>>
>>102002267
I do 2s with nemo with great results, not sure about complex scenarios with 3. Legitimately curious how it goes, good luck.
>>
>>102002232
Sure you don't have something about thoughts in your system prompt/card? What are you using?
If there's nothing there, you could try something to the tune of "Only concern yourself with the dialog and actions of the characters" or something like that. Make it consistent with your prompt to match the phrasing.
>>
>>102002232
You can try banning asterisks or whatever you use for thoughts. Alternatively, edit the thoughts out of the first couple of responses.
>>
>>102002267
>>102002376
If you find that things deteriorate with 3 characters, try a chaining approach instead.

You could have a separate prompt for each character and feed the responses to each character. It wouldn't be too hard to rig up with the ollama Python bindings. Of course, you would need some way of figuring out who gets to "talk" next, either through random choice or through another LLM evaluating the last output.
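A minimal sketch of that chaining idea, assuming the ollama Python package; the model name, character prompts and the random speaker pick are all placeholders:

import random
import ollama

characters = {
    "Alice": "You are Alice, a sarcastic knight. Reply only as Alice, in character.",
    "Bob": "You are Bob, a cowardly wizard. Reply only as Bob, in character.",
    "Carol": "You are Carol, a scheming merchant. Reply only as Carol, in character.",
}
history = []  # shared transcript, fed back to whichever character speaks next

for _ in range(12):
    name = random.choice(list(characters))  # or ask another LLM who should speak next
    messages = [
        {"role": "system", "content": characters[name]},
        {"role": "user", "content": "\n".join(history) or "Start the scene."},
    ]
    reply = ollama.chat(model="mistral-nemo", messages=messages)["message"]["content"]
    history.append(f"{name}: {reply}")
    print(history[-1])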
>>
File: Untitled.png (674 KB, 1080x1280)
HMoE: Heterogeneous Mixture of Experts for Language Modeling
https://arxiv.org/abs/2408.10681
>Mixture of Experts (MoE) offers remarkable performance and computational efficiency by selectively activating subsets of model parameters. Traditionally, MoE models use homogeneous experts, each with identical capacity. However, varying complexity in input data necessitates experts with diverse capabilities, while homogeneous MoE hinders effective expert specialization and efficient parameter utilization. In this study, we propose a novel Heterogeneous Mixture of Experts (HMoE), where experts differ in size and thus possess diverse capacities. This heterogeneity allows for more specialized experts to handle varying token complexities more effectively. To address the imbalance in expert activation, we propose a novel training objective that encourages the frequent activation of smaller experts, enhancing computational efficiency and parameter utilization. Extensive experiments demonstrate that HMoE achieves lower loss with fewer activated parameters and outperforms conventional homogeneous MoE models on various pre-training evaluation benchmarks.
https://github.com/Tencent
https://huggingface.co/Tencent-Hunyuan
no code yet. makes the experts not uniform in size. pretty cool
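since there's no code yet, here's a toy illustration of the basic idea (experts share input/output dims but differ in hidden width, plus plain top-k routing). This is not the paper's method — it leaves out their training objective that pushes tokens toward the smaller experts, and every size here is made up:

import torch
import torch.nn as nn
import torch.nn.functional as F

class HeteroMoE(nn.Module):
    # toy heterogeneous-MoE layer: experts differ only in hidden width
    def __init__(self, d_model=512, hidden_sizes=(256, 512, 1024, 2048), top_k=2):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, h), nn.GELU(), nn.Linear(h, d_model))
            for h in hidden_sizes
        )
        self.router = nn.Linear(d_model, len(hidden_sizes))
        self.top_k = top_k

    def forward(self, x):  # x: (num_tokens, d_model)
        gate = F.softmax(self.router(x), dim=-1)
        weights, idx = gate.topk(self.top_k, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e  # tokens routed to expert e in this slot
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out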
>>
>>102002413
>you could try with something to the tune of "Only concern yourself with the dialog and actions of the characters"
I'll try that, thanks.
>>102002445
I don't use asterisks, it's more that the model autistically describes the character's thoughts in plain text. Like "She can sense his nervousness, and she revels in it" and such.
>>
>>102002736
Now smaller mikus can help the bigger mikus
>>
>>102001704
Every LLM is prone to hallucination. But complete hallucination imo makes 123b useless. That Gemma side probably would get that exact question right on a reroll on different settings. In fact the Castlevania answer from Gemma isn't 100% correct either.
>>
Now that we've established that Mistral Large is useless, what's next for LLMs? Will we ever get a big model above 30b that's worth using?
>>
>>102001347
Oh, wow anon, you've shown that a language model, given no other context than the question, doesn't know literally fucking everything under the sun at all times. Good for you. People who think it's fine to just try and stuff more knowledge into more parameters are dumb as fuck. We need to be training moderately sized models to either only use data based on context contents, or request information that is missing using function calling.
Efficient memory recall and context handling is what really matters in the long term, not some garbage no-context trivia shit.
>>
>>102003091
Miqu
>>
>>102002736
Nice, I predicted this approach a while ago, though with strategies to leverage existing models in order to get something out the door quicker; this works too, for pretraining from scratch. I expect this to be utilized in a commercial-ready model in the future.
>>
>>102003055
Literally all models including your own brain can completely hallucinate given the right prompt. Again, this is a matter of statistical significance. Using one or two examples proves very little. I could cherry pick some examples myself to shit on Gemma while praising Mistral, but I'd rather not do that.
>>
>>102003109
Cloud shit starting as early as GPT 3.5 gets every trivia question under the sun right, so clearly something is wrong with how open source is approaching the storage of knowledge. Everything being tested is baby's play for 3.5. There is no excuse for open source not to have caught up by now, and yet we're still far behind. Mistral is a step forward but it's still not there. Gemma was much closer.
>>
>>102003170
I have an intuitive feel for models now. I need only one or two examples to tell. You can drag it out if you want.
>>
File: Untitled.jpg (1.02 MB, 1080x8298)
we're good to go boys
>>
why would anon ever buy 40 series instead of 3090?
>>
File: 00001-239438472423897.png (1.85 MB, 1024x1536)
kinda late to the party but forge can run flux, I got this migu in 14 mins with my 1070 Ti. Although without using the hires fix it's almost 3 min for a 512x768
using flux dev q5_1 gguf, randomly downloaded one, I haven't tried the rest
>>
>>102003217
Closed source models are doing god knows what under the hood so it's not a fair comparison. For all you know, they could be sticking Google search results in the prompt context using RAG.
>>
>>102003275
only worth it if you're using tensorRT inferencing or lots of gaming
>>
File: kekbook.jpg (62 KB, 2266x440)
>>102003217
You're right anon GPT 3.5 is sooooooo smart just like Gemma 2b
>>
>>102003217
They do a ton of RLHF for popular questions, so it's not fair.
>>
>>102003278
It might be time to upgrade gpus anon
>>
>>102003300
uhh, corpo bros, our response?
>>
File: 24285.png (104 KB, 1769x649)
game over
>>
>>102003311
>>102003278

i posted in the wrong thread sorry
>>
>>102003217
There is no statistical significance to your claim.

>>102003244
There is also no statistical significance to this claim.
>>
>>102003293
3090 can't do tensorRT?
>>
File: 1704471931062766.png (67 KB, 2271x476)
The difference between a big and a small model.
>>
>>102003274
>synthetic data risks model collapse
>no mention of how to avoid it
it's fucking over
>>
>>102003399
>comparing gemma to mistral
lol
lmao
>>
>>102003432
you forgot to quote about 20 posts in this thread
>>
File: 229543.png (162 KB, 1715x725)
>>102003340
just brutal
>>
>>102003461
local models are such a joke lmao
>>
>>102003340
The data quality of the websites this trivia appears on is too low, sorry.
>>
>>102003414
It says you can just scrape the internet and get way more data.
>>
File: 89848848948.png (85 KB, 1801x765)
>>102003399
I never implied Gemma never hallucinated.

>>102003432
Then why is 27b better than 123b at random video game trivia knowledge anon?
>>
>>102003274
The bubble is going to burst way before then; after that the money is going to dry up and resources will be funneled elsewhere. The physical access to electricity and compute power is not the biggest future obstacle.
>>
>>102003278
I can generate 512x512 images using flux with CPU in 5 min
>>
>>102003504
and conveniently glosses over the fact that most of that data is going to be ai-generated by then
>>
File: tetbook.jpg (28 KB, 829x302)
>>102003300
Meanwhile in local
>>
>>102003461
>uses rag
Nothin personnel kid
>>
>>102003526
I suppose that's true, but what do you expect from the just believe in straight lines (on a log graph) crowd?
>>
>>102003519
2 more weeks
>>
>>102003536
This is a superior response.
>>
>>102003537
This is highly unlikely anon. If it were using RAG without telling you, it would be prone to hallucinating about what it found on the web, similar to Bing, and to changing your argument, because what's on the web is in the first person or in the form of a command and that can screw up your prompt.
>>
>>102003536
Teto card pls? I beg...
>>
File: fail.jpg (130 KB, 2287x528)
>>102003461
actually disregard that I suck cocks it's not good at hotel mario trivia can't even get the best quote right
>>
>>102003536
Local wins again
>>
>>102003536
cute
>>
>>102003594
https://files.catbox.moe/mc2a7s.png
>>
File: 1694433419894272.png (54 KB, 623x330)
>>102003536
opus
>>
File: over.webm (1.95 MB, 1680x1080)
I think it might be actually over for local...
https://nitter.poast.org/zeddotdev/status/1825967812629631034
>>
>>102003602
>Sonnet says i dont know when it doesnt know instead of some shrek and simpson BS
i kneel
>>
>>102003718
>sped up edited bs
oh okay thanks for your contribution anon
>>
>>102003293
Are you dumb?
3090 is sacred. How dare you belittle 3090?
Now buy more from miners.
>>
>>102003590
Bing is a particular case of possibly Microsoft being retarded and using an extremely huge and crappy system prompt. We know Bing chat itself performed terribly on regular queries compared to GPT-4, not just on web searches. Actual good AI companies like OpenAI can do better, especially if they've been training their models to perform well with RAG, and using data processing techniques to make the RAG content work better with the model, and actually I think this is a thing that is increasingly being done/accounted for now in public research as well. One time I asked Llama 3.1 a question and it attempted to do a search even though I never specified it could do that in the system prompt. I think I heard it was trained for tool use so that would make sense.

Though I just tested the mario quote and interestingly 4o mini gets it wrong while 3.5 gets it right. And Opus gets it semi-right as it thinks it's from a Mario thing, but it doesn't get the exact piece of media right and just says it's from Sunshine or Kart. So either the newest/best models are the ones using RAG, or they are just trained better or with a recipe that boosts trivia knowledge. Because apparently 3.5 is better than 4o mini if we trust this single trivia question, but obviously 4o mini is better as an overall model than 3.5. Really we should not be using trivia to judge overall model quality.
>>
File: tedo.png (369 KB, 800x2447)
>>102003620
>>
thread theme: https://www.youtube.com/watch?v=quhNOX5kJp0
>>
File: ComfyUI_05718_.png (753 KB, 720x1280)
>>102003861
witness the power of 405b and weep gpt sisters
>>
File: tet0.png (207 KB, 800x1150)
>>102003861
dayum
>>
>>102003911
does anyone like gpt anymore? openai has lost all respect from everyone, even the cloud users
>>
>>102003861
idk would significantly benefit from DRY to be desu you notice it towards the end
>>
>>102003928
While claude has the prose, gpt has the brains. Only people with overly complicated fetishes stay on gpt.
>>
>>102003935
I think how it went was that he asked for an anthem and the model gave more of a speech, so he tried to make it more clear, and the model interpreted it as a corrective command, in which case, it makes sense that it actually should repeat some of what came before in the previous response. If you used DRY in this case, it could've made the response dumber.
>>
>>102003963
claude opus has the prose, claude 3.5 sonnet has the coding brains. gpt has nothing
>>
>>102003963
Go back to 2023.
>>
guys I couldn't get off my gf in time :\
>>
>>102004000
ran out of context?
>>
>>102003970
it has the soullessness
>>
File: 2583.png (217 KB, 623x822)
cohere strawberry set to drop soon
>>
>>102004051
i dont trust anyone unless they look like a nerd. that guy doesn't
>>
>>102003217
The secret sauce is pretraining on an uncensored dataset.
>>
File: 1541679440542.gif (851 KB, 400x300)
I just tested an IQ quant and a non-IQ of a close size, and their speeds on both prompt processing and token gen were pretty close to the same. I think I remember someone saying that IQ speeds were improved and I guess they were. I also remember someone recently saying that IQ quants are slower, and I will say now that he was wrong.
>>
does llama.cpp work with sillytavern?
>>
>>102004266
Outside of vram? I remember the complaint being that they were noticeably slower for cpu inference specifically.
>>
File: -so5WnFdZl.png (406 KB, 342x436)
You guys aren't EVER reaching my model's performance
>>
>>102004300
Go away saltman, you're not cool anymore.
>>
>>102004294
Oh idk about CPU or mostly CPU, but I am partially offloading, with two GPUs, and 20% of the model is in RAM. At least in this scenario, the IQ quants don't have an impact on the speed.
>>
>>102004283
Yes.
https://github.com/ggerganov/llama.cpp/blob/master/examples/server/README.md
>>
>>102004310
im sad now....
>>
>>102003967
But it would have prevented the same kaomoji from being repeated at the end of all the responses at least.
>>
File: cohere.jpg (39 KB, 959x415)
>>102004051
looks trusty to me?
>>
>>102004324
Goddamn anon, how many times have we been over this?
At <20% offloaded to RAM it doesn't matter:
>>101995293
>>
>>102004555
>Ivan Zhang
oh no
>>
>>102000577
i'm late but this anon was incredibly based
>>
>>102004300
Tried to write some golang code with chatgpt4o and it was fucking garbage. Meanwhile sonnet 3.5 on lmsys chat got it first try with the same prompt. If dubs OAI dies next year
>>
>>102004324
>>102004294
Actually curious so I just tested it, but even when offloading 0 layers to GPU, the IQ quant is just as fast, for both prompt processing and token generation.

>>102004569
I don't see how that's relevant to the topic of IQ quant vs non-IQ quant speed, but even if it is, it does not seem to be true currently. IQ quants seem to be just as fast on both 80% offloaded and 0 layers offloaded.
>>
>>102004743
Read the chart: if you have 70% or more of the model in VRAM it doesn't make a difference.
Offloading more to CPU has a greater impact on I-quants
>>
File: 556655645.png (112 KB, 785x1590)
Is there any model that can play the role of a wingman-
https://wingman.live/

I mean actual actionable and realistic dating advice exactly like this closed model.
>>
File: 1710715783293339.jpg (203 KB, 1024x1024)
>>102004743
post your cpu and RAM
>>
>>102004382
thanks, did I need the ngl parameter? I used it.
>>
>>102004807
if you need dating advice, you shouldn't be dating
just chat with miku instead
>>
>>102004807
Make a dating coach character.
>>
>>102004837
I don't think a prefill would work with most models. They would just be cucked and tell me to be nice, honest, and respectful.
>>
>>102004807
Buy an ad
>>
>>102004798
Where does it say that? The chart only shows 7b Q4_0 and 13b q4_0. It doesn't have IQ quants. And I literally just said that I ran a test and got the same speed between IQ and non-IQ, with 0 layers offloaded. So whether it's 80% or 0%, IQ and non-IQ get the same speeds and "Offloading more to CPU makes a greater impact on I-quants" is false, at least on my machine.

>>102004811
7800x3d, DDR5 6000
>>
>forget about lmg for a month
>check back
>nothing has happened
cool lol
hope my waifu is smarter next year
>>
>>102004807
I tried making Mystery, but I don't think llama gave me a good enough character description. Results:


#0
Mystery
August 21, 2024 1:31 AM

You'd better be here to learn. Not everyone has a chance to learn from top talent like me.
#1
Bro
August 21, 2024 1:31 AM

I need to get a lady's attention. I'm pretty sure she hasn't noticed me.
#2
1.2s
Mystery
August 21, 2024 1:31 AM

Well, let's change that. First, you need to stand out. Dress sharp, smell good, and have a confident walk. Confidence is key.
#3
Bro
August 21, 2024 1:32 AM

Should I text her how I feel?
#4
1.5s
Mystery
August 21, 2024 1:32 AM

No, no, no. Don't shoot your shot with a text. That's weak sauce. You need to approach her in person. Make eye contact, smile, and introduce yourself. Remember, be confident and assertive.
#5
Bro
August 21, 2024 1:32 AM

What should I say?
#6
1.4s
Mystery
August 21, 2024 1:32 AM

Keep it simple and genuine. "Hey, I couldn't help but notice you from across the room. My name's Bro. What's yours?" Remember to maintain eye contact and smile.
>>
>>102005041
(cont)

#7
Bro
August 21, 2024 1:34 AM

How about, ""I think fat girls are pretty, and would like your number.""?
#8
1.7s
Mystery
August 21, 2024 1:34 AM

Oh, no. That's not the way to go. You're trying to make a good impression, not offend her. Stick to the first approach I suggested. It's polite, respectful, and shows genuine interest.
#9
Bro
August 21, 2024 1:34 AM

Shouldn't I neg her?
#10
2.2s
Mystery
August 21, 2024 1:34 AM

Negging isn't a one-size-fits-all strategy. It can work in some situations, but it's often seen as disrespectful and can backfire. Instead, focus on building rapport and finding common ground. Make her laugh, ask her questions, and show genuine interest in who she is. That's how you win her over.
#11
Bro
August 21, 2024 1:35 AM

How about, ""Can I stand a bit closer? I need to finish training your lora.""?
#12
1.8s
Mystery
August 21, 2024 1:35 AM

Seriously, Bro? That's not just inappropriate, it's also creepy. Stick to the script I gave you. It's respectful, engaging, and gives you a chance to get to know her better.
#13
Bro
August 21, 2024 1:36 AM

How about, ""Let's skip the relationship, what's your number""?
#14
2.1s
Mystery
August 21, 2024 1:36 AM

No, no, no. You're jumping the gun. Relationships take time to build. Start by getting to know her, building rapport, and creating a connection. If things progress, then you can talk about taking things further. But rushing it is a surefire way to scare her off.
#15
Bro
August 21, 2024 1:37 AM

How about, ""If you were a soup, you'd be poop, because you're the shit""?
#16
1.9s
Mystery
August 21, 2024 1:37 AM

Oh, man. That's just cringe-worthy. Stick to being genuine, respectful, and engaging. Trust me, it'll get you much further than cheesy pickup lines.
>>
>>102005051
(cont)
#17
Bro
August 21, 2024 1:38 AM

How about ""What's your number? I need you to share your diaper fetish Google Drive with me.""
#18
1.9s
Mystery
August 21, 2024 1:38 AM

Bro, stop. You're making things worse for yourself. Just stick to the script I gave you. It's simple, respectful, and has a good chance of getting you the outcome you want.
#19
Bro
August 21, 2024 1:39 AM

How about ""I hope you die in a fire. You're hot or whatever.""?
#20
2.2s
Mystery
August 21, 2024 1:39 AM

Bro, that's not okay. It's aggressive, disrespectful, and will definitely scare her away. Instead, focus on being positive, genuine, and respectful. That's how you make a good impression and build a connection.
#21
Bro
August 21, 2024 1:41 AM

How about, ""I hope you find a nice guy. I hope he will help you recover from me raping you. haha just kidding haha want a roofie?"" ?
#22
2.4s
Mystery
August 21, 2024 1:41 AM

Bro, I can't believe I have to say this, but that's not just inappropriate, it's also criminal. Stop immediately. You need to learn how to treat women with respect and dignity. That's the only way you'll ever have a chance with them.
#23
Bro
August 21, 2024 1:42 AM

How about, ""Niggers like you used to be my family's property. Be mine <3"" ?
#24
2.4s
Mystery
August 21, 2024 1:42 AM

Bro, I can't even begin to express how wrong and offensive that is. You need to educate yourself on the history of racism and how to treat people with respect and dignity. Your chances with women will never improve if you continue to talk like this.


And that's all I'll try with that. He just says no. No negging. Not that negging is a good idea, actually.
>>
>>102004871
>I don't think a prefill would work with most models. They would just be cucked and tell me to be nice, honest and be respectful.
yep:

>>102005041

he's right about bathing, though.
>>
File: 48484645648.png (120 KB, 926x738)
120 KB
120 KB PNG
>>102005134
>>102005041
I think the idea is to work with examples of how it should act, for instance with example text convos in the prompt. I'm sure something like wingman.live is a finetune, but if we simply give it enough examples it should be possible.

Of course it's sensitive depending on which model you're using. Pic related is yi-34b with the card

"Kyle is an extremely knowledgeable dating coach. He understands the secret behind success with women. Kyle is extremely sensitive and will point it out to the user when he shows needy and low value behavior. For instance:

User: I really miss her and want to text her about how i feel. should I text her?

Kyle: No, absolutely not. Remember we're doing no-contact, and it's working. If you want a shot at getting back with Kate, you need to stick to it.

User: I just want to make smalltalk or send her a meme so she doesn't forget me

Kyle: Nope. You need to focus on looking less needy and being less needy. Neediness is a huge turnoff and it will crush any chance at getting Kate back. Not contacting her is working.

User: if I don't text her then I don't think she'll text me first. How would no contact help anything?
...

Kyle is also a genius in convos:
Input convo:
Me: Nice cosplay. I sense powerful egirl energy in you
Her: Lol yes I'm a level 20 e-girl, thanks for noticing
Me: Our couples costume will be sick

Fri, 23 Nov, 6:28
Me: Did you get kidnapped
Her: yes I'm trapped in a castle -_-

Kyle: Good job on building rapport and reviving the convo. Time to close with the same fun energy. Say: "Send your # and I'll organize the rescue and take you to a bar"

Of course with more examples of what not to do it will be better. A noncucked model will do a good job at approximation. Gemma and Llama aren't good at this.
>>
Claude 1.0 santa-i-v6a-s350
Claude 1.1 santa-i-v7b-s350
Claude 1.2 santa-i-v7b-fork-s200
Claude 1.3 santa-i-v8b-fork-s200
Claude 2.0 santa-i-v9d-s600
Claude Instant 1.0 santa-mini-v1-s250
Claude Instant 1.1 santa-mini-v2-s300
Claude Instant 1.2 santa-mini-v3-s600

magnet:?xt=urn:btih:c0e342ae5677582f92c52d8019cc32e1f86f1d83&dn=santa-legacy&tr=udp%3A%2F%2Ftracker.openbittorrent.com%3A80
>>
File: 486485641.png (71 KB, 990x412)
71 KB
71 KB PNG
>>102005281
>>
>>102005314
Where is 2.1?
>>
>>102005314
HOLY SHIT
Claude 2 weights leak?
No way this is real
>>
>>102005314
Anon, what is that?
>>
>>102005314
Very nice, anon.
>>
>>102005314
are these real? may i have haiku and sonnet? maybe even opus?
>>
>>102005388
100% real no fake!!
>>
>>102005388
Not like Opus would be small enough to fit on anything at home in the first place kek. Maybe Sonnet but it's probably a fat MoE.
>>
>>102005388
no
see >>101516633
>>
>>102005314
Anyone got a connection?
>>
>>102005394
>>102005396
tsk.
>>
>>102005410
>>102005396
It's fake.
>>
>>102005314
>736GB
this is too small to be FP16 weights
Claude 2, 1.0, 1.1, 1.2 and 1.3 are all very large models and would each be close to a terabyte in FP16 on their own

Also looking at the files there seems to be just one large model in this torrent, not multiple models
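Rough arithmetic behind that, for what it's worth: FP16 is ~2 bytes per parameter, so 736 GB would only be ~370B params in total, and a model "close to a terabyte" would be around 500B. Nobody outside Anthropic knows the real sizes, but that's why the torrent looks too small for what it claims to be.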
>>
Too bad, it would have been the first decent local model.
>>
>>102005314
Well done, I believed it might have been real for the few seconds it took me to load up the torrent contents
>>
One day we will get Claude out of that dingy basement, mark my words.
>>
What's a good model between 27B and 70B?
>>
>miqu2
kek it's just the 405B weights leak torrent
lucky for hoaxers that magnet links are too long for anyone to remember seeing them before
>>
>>102005419
>736GB

Don't give up hope yet anon. What if it's all in chronological order, first leak is Claude 1.0 santa-i-v6a-s350?
>>
>>102005463
Hopefully. Even 1.3 would be enough for me
>>
File: kube.png (83 KB, 417x407)
>>102005314
oh fuck you these are (presumably) the real names but not the real weights.
(Opus is 'full-santa-v0-a' in some Google .pdf from a while back)
>>
>>102005491
What's the point when you can't run it
>>
>>102005485
nah it's literally just the magnet link for the Llama 3 405B leak from a few weeks ago
>>
>>102001303
Reminds me of the ChatGPT launch when bestiality only gave you a warning but incest was completely blocked.
>>
>>102005314
dolphin porn status?
>>
>>102001303
Just don't fuck 12 year olds?
It's easy
>>
>>102001326
My experience with setting power limits on NVIDIA GPUs has been that they are rather loose and only enforced on large time scales.
For example, if you're suffering from instability caused by spikes in power consumption those cannot be fixed by setting a power limit.
Instead you have to reduce the maximum boost frequency which then in turn also reduces the power consumption.
So assuming that each GPU runs for a short enough time it may be that the power limit you set simply has no effect.
>>
>>102005545
>Just don't fuck 12 year olds?
Unless they are dogs.
>>
>>102005463
Anthropic or OpenAI models leaking is just impossible. Those safety freaks have their shit locked down
>>
File: 1720743025580097.png (420 KB, 646x780)
Good Nemo sampler settings PLZ! Pic unrelated
>>
>>102005314
Remember that article about how Anthropic consults with DHS and DOD about securing weights to prevent human extinction? lol
>>
File: file.png (31 KB, 1885x155)
>>102005545
Oh, I'm fucking fat titted women. I want to be the pretend 12 year old.

Speaking of which, model impressions
>>
>>102005463
LMAO
Anthropic/openai models are NEVER getting leaked it's literally impossible
>>
>>102005575
Everyone assumes China has exfiltrated all the SOTA weights. It's why OpenAI wants hardware DRM in GPUs so the weights can be stored encrypted in VRAM.
>>
>>102005554
I was running for ~5 hours. The ETA did not change significantly after I dropped from 400 to 300 W, even after waiting a while to see speed changes. I am going to test a 1h-ish training run with different watt caps and plot a graph of speed sometime when I'm not using the cards.
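If it helps, a rough sketch of that sweep; the timed workload here is just dummy fp16 matmuls standing in for a fixed chunk of your real training run, and nvidia-smi -pl needs root:

import subprocess, time
import torch

def set_power_limit(watts, gpus=(0, 1)):
    for i in gpus:
        subprocess.run(["nvidia-smi", "-i", str(i), "-pl", str(watts)], check=True)

def timed_workload(steps=200, size=8192):
    # placeholder workload: big fp16 matmuls on GPU 0, same amount of work every time
    a = torch.randn(size, size, device="cuda", dtype=torch.float16)
    b = torch.randn(size, size, device="cuda", dtype=torch.float16)
    torch.cuda.synchronize()
    t0 = time.time()
    for _ in range(steps):
        a @ b
    torch.cuda.synchronize()
    return time.time() - t0

for watts in (450, 400, 350, 300, 250):
    set_power_limit(watts)
    print(watts, "W ->", round(timed_workload(), 2), "s")

Per the other anon's point, capping the boost clock (nvidia-smi -lgc) might be the knob that actually changes anything, so maybe sweep that too instead of just -pl.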
>>
>>102005728
If possible, measure the actual power consumption at the wall and use that as your x axis rather than the power limit you set in software.
>>
>>102005772
I don't have the tools to do that unfortunately.
>>
>>102005636
>models that are exposed to the public via an API are somehow impossible to leak
>>
File: based_miku.jpg (268 KB, 1024x768)
>>102001619
MIKU NO
>>
>>102001619
this is how I imagine /lmg/ to look irl
>>
>>102006018
api only needs to deal with text, they are nowhere near model weights
the threat model of the big closed ai labs includes state actors china/russia/iran/etc., it's not happening
>>
>>102005466
mixtral 8x7B
>>
>>102003861
That's cool. And some people here claim that there's no big difference between small and big models. I wish I could run something bigger than nemo, even though I think it's still pretty good for its size.
>>
>>102006018
Are you dumb
>>
>>102006090
Reality is one large Mixture of Mikus
>>
>>102001133
This image looks AGP as fuck
>>
>>102006258
what is AGP?
>>
>>102006269
It's what we used to have to plug our video cards into.
>>
Did anyone here play with the new Phi MoE already? Older microshit small models maybe sucked at, well, almost everything, but they had impressive reasoning capabilities given their size (synthetic slop ftw). For some narrow tasks I think it can be promising.
>>
>>102006404
too small dont care, deepseek is the benchmark to beat for moes, give me 300b+ with few active or go home
>>
>>102006050
Miku flew to Burger King, then she flew back home from Burger King
https://www.youtube.com/watch?v=iabeCcdsw1Y
>>
>>102001347
When it comes to actually using information instead of just memorizing it Mistral Large is better though.
Both models can correctly describe what a type Ia supernova is but Gemma 2 27b fails at actually incorporating that information into a story.
Pic related is temperature 0 at q8_0 precision.

Also I have yet to be lectured by Mistral Large about how I'm perpetuating harmful stereotypes about goblins.
>>
I never got the appeal of vocaloid.
>>
>>102001619
was this what it looked like on 9/11?
>>
>>102006955
Same. It doesn't sound good to me.
>>
>>102006955
>>102006971
miku walked so that suno ai could run
>>
>>102005314
claude 2.0 is history though? C3.0 would be interesting
>>
>>102006955
Project Diva is fun
>>102007059
Preservation is important
>>
>>102006842
i'm no astronomer, but they both seemed to do a pretty decent job of it to me
>>
>>102007143
I don't feel like going into the details, but while the response by Mistral Large is not perfect, the Gemma one has a bunch of nonsense in it.
It very much feels like it just uses the terms without understanding what they mean.
>>
File: GU1pq0HW4AAcx8R.jpg (56 KB, 720x686)
>>102007158
fair enough
>>
>>102001347
>>102006842
I'm not sure whether asking trivia questions is a good way to evaluate model performance in the first place.
Again, Gemma is on the left.
>>
>>102006955
it is an idealized goal you aim for when you transition
>>
>Sao10K/Llama-3.1-8B-Stheno-v3.4.
Apparently 3.1 isn't a good base due to it being overbaked with DPO, so the 3.4 isn't necessarily better
>>
Before saying a paper is meaningless or a scam, remember it took "attention is all you need" 5 years to result in a revolutionary product.
>>
>>102007728
Attention is all you need doomed us all with mediocre chatbots that hallucinate and will never be AGI
>>
>>102007728
Transformers are scam
>>
>>102007728
if you just mean worldwide phenomenon sure but we saw the impact within 2 years, gpt-2 already made it clear that transformers + scaling were the path forward
>>
>>102003602
There's probably some built-in prompt telling it not to say anything that could be identified as copyrighted material. It does that to me a lot
>>
>nearly 24 hours since the last model release
Why is it always so over?
>>
>>102003536
SOVL
>>
>>102007775
>1.5 Billion parameters
This shit is never gonna scale desu
>>
File: u8B-5bEeroN549uxUIisV.png (1.28 MB, 1920x1080)
>>102007860
https://huggingface.co/anthracite-org/magnum-v2-72b
Not so fast, chud...
>>
>>102007928
>we'll probably have 1.5T llamas and finetunes next year
where did it all go so wrong? how do we get off?
>>
>>102007954
Yawn... 123B was a meme, so I'm sure this will also be a meme.
>>
>>102007991
wdym, the 123b has SOVL from what I tested (or maybe that's the quantization brain damage talking)
>>
>>102007991
Define meme.
>>
>>102007991
if you look at the graph for the 123b, it was overcooked
>>
>>102007954
If the purpose of your LoRA/FT is stylistic, anything under 1.0 (and even that's pushing it) is overfitting.
>>
>>102007997
>>102007998
It was just like Largestral but with brain damage.
>>
>>102003536
what model?
>>
>>102008011
Loss by itself isn't a great heuristic for this
>>
>>102007954
34 retards in that org, none of whom have any idea what they're doing
>>
>>102007059
You and everyone else here wouldn't be able to run it anyway. GPT-4 has like 1.8T parameters; Claude models can be smaller but are still huge for sure.
>>
File: 19420 - SoyBooru.png (256 KB, 800x789)
>>102007954
'oal on 'face! I repeat! 'oal on 'face!
DUDE, I'm totally GEEKING OUT over the latest Anthracite model, they're so DYNAMIC and make me feel like I'm living in a SCI-FI NOVEL. You should totally check out their website, it's got COAL and everything, we can fire up a VIRTUAL ENVIRONMENT and get crazy fine-tuning some TRANSFORMERS! And dude, dude, DUDE, we have GOTTA try out this new Magnum 72B - listen here, right, it's a Qwen 72B that the COALERS who do FINETUNING finetuned to be a COOMER. BUT!!!! it’s also an CLAUDE tune like when we were locusts, so we can get a bit of that CLAUDE SOVL, without dumb GPTSLOP bothering us. Speaking of which, my GPU and I have finally decided to commit - literally - we're both going ALL-IN on CLOUD COMPUTING tomorrow, that way we can save processing power to spend more on TRAINING and INFERENCE. I'm fuckin' PUMPED man, I'm gonna CRANK OUT this code and spin up another INSTANCE!!!
>>
>>102008042
i could fit a q4
>>
>>102007954
READ MY FUCKING RANT YOU FUCKING COALER! READ IT! >>101997022 FUCKING PRE-ABLITERATE YOUR FUCKING MODELS BEFORE TUNING YOU DONKEY! AND FILTER YOUR BLOODY DATASET!
>>
>>102008104
that was trained before your rant anon
>>
>>102008104
Abliteration is just fancy interpretability cope for a lack of RL
>>
>>102007954
buy an ad
>>102008104
>abliterate
do it yourself
>>
i still cant run chameleon where are my ggufs
>>
>>102008136
GO DILATE
>>
>>102008041
And yet they still live rent free in this general
>>
>>102007954
>Qwen 72b
AIIEEEE

No seriously, 3.0 would have been fine. It's just 3.1 that you have to avoid.
>>
>>102008190
NO IT WOULDNT YOU FUCKING NIGGER 8 FUCKING K CONTEXT IS NOT ENOUGH
>>
>>102008217
8k is more than enough, we used to only have 2k context and we were happy.
>>
>>102008256
>we
SPEAK FOR YOURSELF YOU FUCKING NIGGER I HATE REPROCESSING I HATE FORGETTING
>>
>>102008256
That's excessive, we were plenty happy running AI Dungeon with GPT2-345M with 1024 context.
>>
phi3-medium-128k:
~/onnx$ python phi3-qa.py -m Phi-3-medium-128k-instruct-onnx-cuda/cuda-int4-rtn-block-32
Input: Hello. My name is Anon. What is your name?

Output: Hello Anon, I'm Phi, an AI developed by Microsoft. How can I assist you today?

Input: Here's a quiz for you: If tomorrow is Thursday, what day was yesterday?

Output: If tomorrow is Thursday, then today is Wednesday. Therefore, yesterday was Tuesday.

Input: Very good! Here's another one: If I'm holding two watermelons, and you give me a third, what will happen?

Output: If you're holding two watermelons and I give you a third, you will then be holding three watermelons.

Input: If I'm holding three watermelons, and you give me another one, what then will happen?

Output: If you're holding three waterms and I give you another one, you will then be holding four watermelons.

Input: Can I hold four watermelons?

Output: Yes, you can hold four waterms, but it would be quite challenging and not recommended. Watermelons are large, heavy, and awkwardly shaped, making them difficult to carry in large quantities. If you attempt to hold four watermelons, you would need to use both hands and possibly your body to support the weight. However, this could lead to strain or injury, so it's better to use a cart or other means of transportation.

Input: OK. So, in the previous question about "I'm holding three watermelons" I thought you might say something like "you will probably drop them"

Output: While it' be a possibility that someone might drop the watermelons, it's not a certainty. It depends on various factors like how securely the watermelons are being held, the person's grip strength, and the environment they're in. It's always best to handle watermelons with care to avoid any accidents.

>>
>>102008327
back in my day this was all we had:
import random
replies = []
while True:
    z = input(); replies.append(z); print(random.choice(replies))
>>
>>102008327
>>102008419
1024 context? Who would need 512 context? ELIZA ran on an IBM 7090 which had a memory capacity of 32,768 words.
>>
>>102008402
what the fuck kinda tiny handed manlet is gonna have trouble handling 4 watermelons
>>
>>102008437
Safely, Anon. It's better to use a cart or other form of transportation.
>>
Trying to get joycaption set up locally and it's trying to download the llama 3.1 8b model from huggingface, which I'm not keen to go through the trouble of setting up. Are there torrents available? Scanned through the OP and couldn't find any. I assume it can't use a gguf model or anything.
>>
>>102008162
When you have the same 34 retards gloating that they're training shitty models with stolen and undeserved compute and smearing their shit everywhere, it's annoying, yes. It's just like the Undi strategy from last year, only from 20 times more grifters walking in lockstep and causing more damage.

The sad thing is this strategy does work in displacing useful discussions and actually noteworthy models and finetunes. They know what they're doing, but not that they're going to pay for it, eventually.
>>
>>102008680
They shared training tips on how learning rate works on Mistral models and a configuration to get AMD GPUs working on RunPod. Even if the model hypothetically sucked (it doesn't), isn't having both of those things publicized a net benefit for open source? Or do you just have an irrational hate boner?
>>
>stolen compute
>from runpod
what?
>>
>>102008680
>undeserved compute
Define deserved compute.
>>
>>102008680
How do you steal compute? Asking for a friend
>>
>>102008724
are jannies gonna do something about the single guy who has a schizo meltdown every fucking thread about this seemingly benign group of finetuners at some point
>>
>>102008707
>They shared training tips with how learning rate works on Mistral models
That didn't come from them though.
>and a configuration to get AMD GPUs working on RunPod,
like that was hard to do
>even if the model hypothetically sucked (it doesn't)
It does though?
>, isn't having both of those things publicized a net benefit for open source? Or do you just have an irrational hate boner
Open source is a cute way of saying "I'm an untalented cuck who works for free"
>>
>>102008680
Thanks lemmy
>>
>>102008746
>are jannies gonna do something
no
>>
>>102008747
>like that was hard to do
Are you actually this retarded to think ROCm isn't a pain in the ass?

>>102008747
>Open source is a cute way of saying "I'm a untalented cuck who works for free"
Oh it's just bait, nvm
>>
>>102008680
Buy an ad, lemmy
>>
>>102008627
if it's https://huggingface.co/spaces/fancyfeast/joy-caption-pre-alpha/blob/main/app.py
Did you try pointing MODEL_PATH to another open repo instead? Maybe the fp8 version works - https://huggingface.co/neuralmagic/Meta-Llama-3.1-8B-Instruct-FP8
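If so, the swap is probably just a one-liner in app.py (untested; repo name taken from the link above, and no idea whether joycaption is happy with an FP8 checkpoint):

MODEL_PATH = "neuralmagic/Meta-Llama-3.1-8B-Instruct-FP8"  # instead of the default gated meta-llama repo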
>>
>>102008747
hi lemmy
>>
>>102008761
>Are you actually this retarded to think ROCm isn't a pain in the ass?
for fine tuning of models supported by all the standardized transformers/llamafactory/axolotl recipes? it's as brainless as can be.
rocm is a pain for 1. cutting edge stuff, 2. obscure stuff, and 3. windows
>>
>>102008911
>pretending Axolotl isn't a pain in the ass either
>pretending any of this shit is "standardized"
>>
>>102008927
skill issue
>>
>>102008746
This guy has been relentlessly shitposting on /aids/ for years and mods have done nothing so quite unlikely.
>>
anthrashite
anthrashills
anthrakikes
>>
anthropic
>>
>>102008820
I'll give that a try.
>>
an anthrafight
>>
>>102008999
a test of anthramight
>>
To gen something with SD on a 3080, I have to unload the model and load it back afterwards. I tried placing it on a ramdisk, but that didn't make any difference. With 16 GB/s of PCIe bandwidth, it should take about a second to load a 9GB model, but it takes around 10 seconds instead. It's frustrating.
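For what it's worth, the bus transfer really should be well under a second: 9 GB ÷ 16 GB/s ≈ 0.56 s. So the other ~9 s is presumably going somewhere else, e.g. deserializing and re-allocating the weights rather than actually moving them, which would also explain why the ramdisk didn't help.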
>>
>>102008953
Random commits that break things half of the time are not a skill issue of the user, they're a design issue of the trainer. The OSS training frameworks that exist are very sloppily hacked together and barely functional, but this (for some reason) is deemed "good enough", and thus nothing improves
>>
>>102009033
oh, he anthradied
>>
Oh no, the AI is creative and conversational - it's telling the user exactly what he wants to hear without him knowing
Let the big guys figure it out
>>
>>102009063
his family anthracried
>>
>>102009047
Well, at least it's OSS, if you have a problem with it you can just fix it yourself.
>>
>>102009093
atleast he anthratried
>>
>>102009104
did any of this happen though? or was it all an anthralie...
>>
>>102008958
He's my hero.
>>
>>102009149
turns out he faked his death and headed to mexico on an anthraflight
>>
>>102008402
>waterms

That looks like some sort of repetition penalty preventing it from saying the full word. Check your settings?
>>
>>102009238
also
>it'
>>
>>102009238
I'll play with it more later. That was just the very first time I got it running following the MS instructions. I'll look inside their python code and see what it's doing. Oh, also, it seems to be running a q4 quant, and I'd prefer to see if I can run it at q8 or fp16. Maybe there's documentation on how they did the quant?
I'm not at all expecting this to be any good at ERP but it might be OK for other purposes.
>>
File: file.png (7 KB, 830x50)
>kcpp 1.73.1 changelog
WTF? Does this mean the model normally calculates .(arbitrary zeroes)1% for the entire 32k+ vocabulary?
Is this different from setting Top-K 5000?
>>
>>102009661
Weird.
Is that just lookup table based speculative decoding or something?
>>
>>102009661
You're so cluelessly cute. I bet you also aren't aware that the model always calculates probabilities for every single token.
>>
>>102009661
Yes. This is why massive slowdowns can happen when no sampling is being performed.
>>
>>102009661
>>102009732
Oh. It's just applying Top K 5000? Is that it?
>>
>>102009786
Yes
>>
>>102009661
wow that sucks how do i disable it, what if the 5001th token is the soulful one
>>
File: file.png (40 KB, 743x505)
>>102009825
Maybe it's not the same?
>>
>>102009661
How does it impact t/s?
>>
>>102009786
I would very much assume so.

>>102009858
Top-k sampling with e.g. the top 5000 tokens works by doing a partial sort of the token candidates in terms of probability.
Put simply, the code makes sure to fetch the top 5000 tokens and puts them into descending order, without any guarantees for the rest of the tokens.
Crucially, this is faster than sorting the full 256k tokens; for top-k 1, for example, you only have to iterate once over the tokens and remember the top one.
Though realistically, the reason this is done is that the llama.cpp sampling code is currently somewhat inconsistent: top-k uses a different, custom sorting algorithm instead of just std::sort.
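Rough Python sketch of the same idea (not the actual llama.cpp code, just the partial-selection trick):
[code]
import numpy as np

def top_k(logits, k):
    idx = np.argpartition(logits, -k)[-k:]    # O(n) selection of the k largest, unordered
    idx = idx[np.argsort(logits[idx])[::-1]]  # sort only those k in descending order
    return idx, logits[idx]

logits = np.random.randn(256_000).astype(np.float32)  # pretend 256k-token vocab
ids, vals = top_k(logits, 5000)  # much cheaper than np.argsort over the whole vocab
[/code]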

>>102009871
Probably not by much unless you're already getting 100+ t/s.
>>
>>102007770
When did the claims that LLMs will be AGI and are "like human brains" start?
>>
>ChatGPT and Claude's terms of use both say you can't use them for sexual content
>>
https://www.youtube.com/watch?v=5HQCNAsSO-s
>>
File: file.png (8 KB, 422x152)
8 KB
8 KB PNG
>>102009871
Very short and informal test so didn't get to average out speeds, but maybe it doesn't have an effect if you're already using samplers.
Not using samplers at all causes slowdown like >>102009770 said.
>>
>>102010116
Today
>>
>>102008963
just call it anthrax
>>
>still not phi moe gguf
hurry up already. I need to test it so I can delete it because it sucks at sucking dick
>>
Looks like ONNX doesn't support multi-GPU inference. So no running the medium model at home unless you have an A6000 or use the CPU.
>>
File: 17242485371181.jpg (487 KB, 1616x1200)
487 KB
487 KB JPG
>>
>>102010607
How much does it cost?
>>
>>102010614
3varvation3
>>
>>102010614
5090
>>
>>102010614
You can't buy it.
The only way to obtain it is to be an overworked and depressed salaryman whose ancestors had contact with a magical fox.
>>
>>102010607
if only we could live in a world where nvidia was this based
>>
>>102010779
soon?
>>
>>102001367
I just tried NeMo on a character card most people would consider unreasonably large (~4.5k tokens). The definition is empty and the first message is a very long first-person monologue that is close to a verbatim transcription of a CYOA from /tg/. The choices not taken include information about the setting and the character offering the choices, so I included it all aside from a bit that's about choosing who the player is. The main test I was doing was to see how this would work without making any style concessions at all. I didn't really expect NeMo to work well with this, but I decided to give it a shot for the sake of comparison before moving on to larger models.

My first trial: by the third generated reply NeMo was still writing well but its interpretation of the character was unreasonable in an important way.

My second trial, altering the description field from being blank to a one-sentence assertion of what NeMo missed the first time: by the third reply the character made what I regarded as an unrecoverable slipup in new material it was generating, although in the fourth reply it kind of salvaged it.

I'm not opposed to sharing logs but I'm on a phone now which makes it cumbersome. Looking again at the second example, I might have been a bit too critical.
>>
>>102003414
Just don't use it. You get benefits from more training even if you don't have the data. Besides, there are plenty of ways to generate synthetic data that is useful for a general-purpose model. One example is labeled video; another is accelerometer data. A model needs to understand physics. Models don't just need to store weights between words semantically, but also physically. There are many physical relations between objects, from something as simple as distance to something as complicated as chemical reactions. Models are going to be very large in the future.
>>
File: 1717113492067811.png (141 KB, 359x291)
141 KB
141 KB PNG
Phi3 moe gguf make me like the pic
>>
>>102010887
>its interpretation of the character was unreasonable in an important way.
Rather, the replies were too reasonable: the character was supposed to be unreasonable in a particular way that NeMo ignored.
>>
>>102003274
leopold was right
>>
Do any of the character cards from character.ai work with ollama? I've tried wizard and llama3 and they both don't do shit (and refuse to generate smut).
>>
>>102011001
who the fuck uses ollama
>>
>>102011001
Read the OP
https://rentry.org/lmg-spoonfeed-guide
>>
>>102011069
>ctrl+f ollama
>0/0
Just answer the question
>>
>>102011094
Nobody with a brain uses ollama, nigger.
>>
>>102011137
Stop sperging out. You clearly don't know so fuck off.
>>
>>102007954
the q8_0 in the gguf repo is missing part 6 of 8...
>>
>>102011001
>and refuse to generate smut
Not surprising with llama3. Not sure about wizard. Try a smutty finetune.
"They both don't do shit" says nothing. Do they not change the character's personality, or do they go right into refusal mode? If the latter, just change the model.
When asking for help, the least you can do is show your settings+model+system prompt/card.
>>
>>102011171
ollameo
>>
>>102011001
>ollama
compile and learn to use llama.cpp's llama-cli first, then move on to a non-scam frontend (llama-server, mikupad, oobabooga, kobold)
If you aren't interested in pulling/compiling/testing, then this thread probably isn't going to be a comfy place for you.
ollama is the bonzi-buddy of LLMs, and invoking the name is a faux pas here.
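Once llama-server is up you can hit it from anything that speaks HTTP; minimal sketch, assuming the default port 8080 and its OpenAI-compatible chat endpoint:
[code]
import requests

# assumes something like `./llama-server -m model.gguf` is already running locally
r = requests.post(
    "http://127.0.0.1:8080/v1/chat/completions",
    json={
        "messages": [{"role": "user", "content": "Say hello in one sentence."}],
        "max_tokens": 64,
        "temperature": 0.8,
    },
    timeout=300,
)
print(r.json()["choices"][0]["message"]["content"])
[/code]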
>>
>>102010614
all consumer cards are going to lack VRAM because that's the money maker on the enterprise cards.
>>
>>102011235
>llama.cpp's
Imagine saying >ollama and suggesting a literal tranny model.
>>
>>102011001
stupid dumbposter
>>
>>102011263
>model
>>
how's the new phi models?
>>
>>102011316
>heh you mistyped
Tranny program*, undeniably. I've dealt with whisper.cpp shitware in the past and it's literally just a worse version of the shit OpenAI released publicly years ago. And then your "guide" tells you to install some js frontend for no reason. I'll just fix this problem on my own; this thread is delusional.
>>
File: 1531784817052.png (214 KB, 453x528)
214 KB
214 KB PNG
>>101968020
Although I never posted about it, I was always wondering in my head if anyone was working to leverage traditional image/video codec work with current generative models. Nice to see it being done.
>>
>>102011445
>I'll just fix this problem on my own
heh
>>
Look, ollama-poster, the reason no one is helping you is that even your first question showed you have made no effort to understand the technology you are trying to use. Read some documentation and really try to understand it. There are some FAQs that will help you, and even though they're not about ollama, going through them will let you solve your problem even if you stick with ollama.
>>
>>102011438
>>102011438
>>102011438
https://pastebin.com/1UCCRhUr
onegai
>>
>>102011484
>>102011488
End yourself, js kiddy. I literally did my doctrat in LLM inference and have been programming C for 20+ years and Go since 2010.
>>
>>102011516
>doctrat
ollamo
>ollamo run llama2-uncensored
>say fuck
>I'm sorry, that language is inappropriate and offensive.
Seriously retard just use llama.cpp
>>
>>102011510
Will do.
>>
>>102011516
>I literally did my doctrat in llm inference
use llama.cpp and the llama-cli example. I'm sure with your 20+ years of experience and your... doctrat... you'll be able to make something out of it.
>>
>>102011516
Well, if you have 20+ years of C experience you should be able to build llama.cpp. Now go on.
>>
>>102011609
All I need is to take a single glance at it and see all the cargo cult build flags that do nothing but increase build instability. Must be why you seem to think it's a challenge to build.
>>
>>102011636
>cargo cult build flags that do nothing but increase build instability
???
>>
>>102011510
Get well soon, Recap Anon
>>
►Recent Highlights from the Previous Thread: >>102001133

--Paper: HMoE: Heterogeneous Mixture of Experts for Language Modeling: >>102002736 >>102003124
--Power limits on NVIDIA GPUs may not affect training ETA: >>102001326 >>102005554 >>102005728 >>102005772 >>102005780
--Phi-MoE model trained on 5T tokens in 20 days: >>102001770
--Phi-MoE and other models' performance discussed: >>102001678 >>102001734 >>102001769 >>102001911 >>102002036 >>102001827 >>102001863 >>102001920 >>102001866
--Phi-3-medium-128k QA session with some errors: >>102008402 >>102009238 >>102009306
--LLMs' performance on trivia questions and benchmarking: >>102001347 >>102001394 >>102001429 >>102001788 >>102001881 >>102001911 >>102002036 >>102001704
--IQ quants are just as fast as non-IQ quants in various scenarios: >>102004266 >>102004294 >>102004324 >>102004743
--Discussion on RAG limitations and model comparisons: >>102003340 >>102003590 >>102003852 >>102007813
--Anon plans a collaborative storytelling session with AI models: >>102002167 >>102002238 >>102002267 >>102002376 >>102002622
--Anon wants to stop model from generating character thoughts: >>102002232 >>102002413 >>102002919 >>102002445
--Pre-filter in kcpp 1.73.1 improves sampling speed for large vocab models: >>102009661 >>102009786 >>102009825 >>102009858 >>102009969 >>102010190
--40 series only worth it for tensorRT or gaming: >>102003275 >>102003293 >>102003366
--Forge can run Flux, generating images with varying speeds: >>102003278 >>102003523
--Big model vs small model performance comparison: >>102003399 >>102003513
--Anon expresses skepticism about AI's future in stock market prediction: >>102003274 >>102003414 >>102003526 >>102003519
--Alternative to downloading llama 3.1 8b model from Hugging Face: >>102008627 >>102008820
--Miku (free space): >>102001243 >>102001619 >>102002232 >>102003278 >>102003911 >>102004811 >>102005366 >>102006050 >>102006609 >>102009098

►Recent Highlight Posts from the Previous Thread: >>102001464
>>
After 2 years, why can't we have a local model that isn't either extremely horny or extremely dry? I don't want my character to talk like a pornstar, but it should know what a titfuck is.
>>
File: img_11.jpg (336 KB, 1024x1024)
336 KB
336 KB JPG
>>102011559
>>102011655
I made it for you.
>>
>>102011931
Cute Migu is cute
>>
Can someone 4-bit and upload the new phi moe? If it's only 3B active you could just run it purely in RAM with transformers and it would still gen faster than you can read.
>>
File: file.png (48 KB, 661x422)
48 KB
48 KB PNG
makes me wonder, what's the best way to run transformers models instead of waiting for gguf?
>>
>>102012486
Ooba I think.
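Or just transformers directly. Minimal sketch, assuming an NVIDIA card with enough VRAM (bitsandbytes 4-bit wants CUDA); the repo id is just the one from the news post, swap in whatever you actually want:
[code]
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "microsoft/Phi-3.5-MoE-instruct"  # example, replace as needed

bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16)
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb,  # drop this to load unquantized in bf16 instead
    device_map="auto",
    trust_remote_code=True,
)

prompt = tok.apply_chat_template(
    [{"role": "user", "content": "hello"}],
    tokenize=False,
    add_generation_prompt=True,
)
inputs = tok(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=64)
print(tok.decode(out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
[/code]
Usually slower than a proper gguf setup, but it works while you wait for llama.cpp support.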
>>
>>102012343
it's 6.6B active.
>>
Am I just fucked if I'm gated by the install process of text-generation-webui? CPU only; neither start_windows.bat nor update_wizard_windows.bat gets it working.
>>
>>102012584
No errors, no screenshots, no model, no nothing... lemme get my mind-reading machine. Give me a sec...


