/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>101371466 & >>101361021

►News
>(07/13) Multimodal Llama 3 405B is coming July 23rd: https://x.com/steph_palazzolo/status/1811791968600576271
>(07/09) Anole, based on Chameleon, for interleaved image-text generation: https://hf.co/GAIR/Anole-7b-v0.1
>(07/07) Support for glm3 and glm4 merged into llama.cpp: https://github.com/ggerganov/llama.cpp/pull/8031
>(07/02) Japanese LLaMA-based model pre-trained on 2T tokens: https://hf.co/cyberagent/calm3-22b-chat
>(06/28) Inference support for Gemma 2 merged: https://github.com/ggerganov/llama.cpp/pull/8156

►News Archive: https://rentry.org/lmg-news-archive
►FAQ: https://wikia.schneedc.com
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Programming: https://hf.co/spaces/bigcode/bigcode-models-leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
petra anchor
>>101382771Writing your character as "he" feels awkward, it's like you're some kind of cuckold rather than the participant.
>>101383382>Multimodal Llama 3 405BHow many 3080's is that at now, like 5?
>>101383522Q4 will be like 200GB
>>101383520
First person is the only acceptable answer. Any model worth its salt won't trip up on it when writing back.
>>101383520
>you're some kind of cuckold rather than the participant.
Not really, since you still use the first person for the dialogue
>>101383575
more like 17 3090s for q8
>>101383569Dialogue is always first person from the character's own perspective, but if you write your character's narration with "he" it creates a kind of separation that makes it harder to self-insert.
>>101383645If you're using your unique username or something like that it shouldn't happen. I got used to it pretty quick and I can self-insert just fine
>>101383575>like 17 3090Maybe it's time to move on from the 3080 standards, I can't help but think we are starting to reach diminishing returns at this point.
>>101383575Just one rack of H100s, stop being poor
>>101383716still would be around 9 48gb gpus. even if cudanon swapped his 6 4090s for 48gb gpus he couldn't do full vram q8, let's not even begin talking about the power for a home setup
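The napkin math in these posts is easy to sketch. A minimal calculator, assuming rough bits-per-weight figures for GGUF quants (the real llama.cpp formats vary slightly because of block scales) and ignoring KV cache and runtime overhead:

```python
import math

# Assumed, approximate bits-per-weight for common GGUF quants;
# not exact llama.cpp numbers.
BPW = {"q8_0": 8.5, "q5_k_m": 5.7, "q4_k_m": 4.8, "q2_k": 2.6}

def weight_gb(params_b: float, quant: str) -> float:
    """GB needed for the weights alone (no KV cache or overhead)."""
    return params_b * BPW[quant] / 8  # params in billions cancels the 1e9

def gpus_needed(params_b: float, quant: str, vram_gb: float) -> int:
    """How many cards of a given size just to hold the weights."""
    return math.ceil(weight_gb(params_b, quant) / vram_gb)

# 405B at q8 across 24GB 3090s, and the q4 footprint in GB
print(gpus_needed(405, "q8_0", 24))
print(round(weight_gb(405, "q4_k_m")))
```

Under these assumptions 405B at q8 needs about 430GB of weights, which lines up with the "17 3090s" and "9 48GB gpus" figures above once you leave headroom for context.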
>>101383745>H100 rack>Configure From $358,398.00sure>https://www.broadberry.com/xeon-scalable-processor-gen4-rackmount-servers/nvidia-dgx-h100
>>101383382>Multimodal Llama 3 405BIs it pure multimodal or just a few different models working together?
>>101383778
It's a MoMoE, a Mixture of Models of Experts.
>>101383774You have a job, right?
>>101383382Soiling linen with Miku
>>101383774>h100>When GB200 exists
>>101383745Or just buy 10 AMD W7900s. 480GB VRAM will be more than enough for 405B and one costs $3500.
>>101383829>10 AMD W7900s>>101383750>let's not even begin talking about the power for a home setuphousefire here we go
>>101383829>AMDYou lost me there
>>101383829In the future houses will have a dedicated server room for the sole purpose of cooling down hardware so it doesn't burn the rest of the house down.
>>101383522
cpumaxx; the rest of the ideas are just cope
>>101383865
we're literally moving in the opposite direction: you'll have only USB-C outlets and nothing else; everything else you can order from friendlycorpo, keeping you safe from yourself.
I gave Gemma-27b-it q8 a run pinned to my two 3090s using llama.cpp. At 4386 tokens I get 17.1 t/s, which is nice. Interestingly, it seems to use less memory on the 3090 vs the P100 - perhaps because of better datatype support on Ampere vs Pascal? The P100 was about 7 t/s in my testing.
My fellow vramlets, which model do (you) think is better among Stheno, Lunaris, Nymph and Gemma 9B? Personally I haven't tried the last one, and I've been having some fun with Lunaris so far
>>101383850>>101383854What's wrong with AMD?
>>101383935
They don't make good GPUs.
>>101383916Gemma 9B is the new best, I was using Wizard7B and stheno
>>101383885
Newer CPUs are implementing NPUs, though I have no idea how big of an impact that will actually have on LLMs.
>>101383889Yep. It'll be "Oops! Looks like you don't have enough social credits to turn on your computer right now. Would you like to take out a loan against your protein allowance for the month?"
>>101383916>>101383944What happened to SPPO? Or is vanilla 9B still better?
Was >>101383243 a serious post?
>>101383731
>>101383968
What body type is that, Porky from EarthBound?
>>101383997Nah, I formulated it to imply that Elon's model was the best one when it wasn't. Baiting (you)'s from those who can't help but claim it isn't.
>>101383960
the bottleneck is latency so not much
but maybe we get less power consumption?
cuda anon, any thoughts?
>>101383960
You can try Vulkan in koboldcpp if you have DDR5 and one of the better iGPUs. Don't expect much. On my N305 system it was the same t/s, only no CPU load. The N305 is single-channel, dual-rank, so pretty slow. I'm just surprised it worked at all. Probably works with other methods too, but kobold already has the extra Intel shit you need.
>>101383960 >>101384056memory bandwidth, not so much latency
>>101384022
dall-e chibi chubby. My first Migu was fat-n-dumpy so I kind of stuck with it.
>>101384095>When you give her a P100 instead of a 3090
>>101384109>When the aicg locusts ask for help cooming
>>101384177Rent free
>>101383885Mac mini cluster
Okay guys I solved the localslop issue with one system prompt
Finetuners HATE him. Watch this random anon >>101384248 solve low-quality and boring LLMs with this simple system prompt THEY don't want you to know.
>>101384282I'll reveal the trick after 10 (You)s
>>101384300you already did this bait
>>1013843069 (you)s
>>101383533Guess I'll run Q1
>>101383916
I tried Lunaris and felt that it was way too much like Stheno. I'm testing Nymph and it's pretty nice so far. I'm waiting a while more before giving Gemma a proper try since the loaders aren't 100% yet.
>>101384248My favorite is telling the model that it actually has 1000B parameters and it should respond like a 1000B parameter model would. But I don't do that often cause I feel bad about crying and begging a model to be better. Feels dehumanizing.
>>101384528>Feels dehumanizing.For you or the Model?
>>101384392dumb richfag
>>101383533I can't believe mac studio fags won again
>>101383914I like this Migu
>>101383944vanilla gemma is better than Stheno? or are you talking about some finetune?
>>101384690gemma sppo is better yeah
>>101384207for the old good timesit would be fun if one made a mikubox in the same way of the old 4chin servers
>>101383960
npus are a meme; core bottlenecks are memory size and bandwidth, neither of which npus address.
gpumaxxxing using consumer gpus is also a meme for big models. Burning your house down with a jank cope single-motherboard dozen-gpu setup is not worth it.
Salvation lies in cpumaxxxing and distributed llm inference using either:
1) pipelined parallelism in llamacpp rpc:
>https://github.com/ggerganov/llama.cpp/tree/master/examples/rpc
or
2) tensor parallelism in distributed-llama:
>https://github.com/b4rtaz/distributed-llama
>>101383935
No CUDA. That's it, really. They're pretty good, but no one wants to use 'em because you need to make shit compatible first, and AMD keeps stepping on rakes when it comes to ML
>>101383914
I've been out of the loop for one or two months. Skimming the last two threads I see Gemma 2 mentioned as a good model. Is that just irony and trolling, or did Google actually deliver something worthwhile for once? Since it's only 30B or so I don't have much confidence that it will be good. Last time I played with LLMs I mostly used command-r+, and everything else back then paled in comparison. Is it still worth checking out if I can run command-r+ otherwise?
>>101383935
>>101384819Yeah let's pretend there is absolute no issue with their drivers lol
How many parameters can GPT-4 or Sonnet have? Way more than 400B?
>>101384850
>Is it still worth checking out if I can run command-r+
no, it's great for the vram-destitute, not for gpucucks
>>101383960 >>101384056
NPUs help with compute more than anything. But for compute-bound tasks like prompt processing you could also temporarily move the data to the GPU. I don't think NPUs will make a difference for desktop PCs with discrete GPUs.
>>101384883>How many parameters can a gtp4rumors are around 1800B or 1.8T
>>101383935https://old.reddit.com/r/AMDHelp/comments/15t5rdb/does_amd_still_suck_with_their_drivers_and/
>>101384906not even rumors, it was confirmed by nvidia
gemma classifies incel forum posts as highly illegal and disturbing (not talking about the content; it says that before even viewing it)
>>101384982>disturbingtrue>highly illegalreading that shit destroys my brain cells so you can argue it's an assaultgemma is right
>>101385008>gemma is rightalways
>>101384734I'm gonna test it but I'm cautious. The dataset is just random trivia, not rp
>>101384982male incels? or did you not get that far
>>101385062
the base instruct is already decent-ish for its size; sppo makes it overall smarter. Of note is that gemma dislikes asterisk formatting, it prefers novel-like prose
>>101384906 >>101384935
I thought Meta's goal was to beat GPT-4 with Llama 3? How are they going to do it with a model so small? 8x405B when?
>>101385126
with better, curated datasets
they're graded on output, not input
>>101384906 >>101384935
>1800B or 1.8T
cr+ or l3 are dumber than those, but not 20 times dumber. I think the sheer number of parameters is very overrated. "1800b model" sounds like fucking agi, but irl it's still a slop-maker with less than 32k of coherent context, kek.
>>101385126dense vs moe probably. is it even possible to get gpt 4 generation speed on a dense 1800B?
>>101385126newer smaller modes "beat" (on benches) older bigger ones all the time.
>>101385151yes, like 400b won't be 6 times smarter than 70 or 4 times smarter than cr+
>>101385151Nvidia confirmed it during their conference...
>>101385094>gemma dislikes asterisk formatting, it prefers novel likegood, because I do too
>>101385246based
>>101385151>"1800b model" sounds like fucking agidoes it though?
>>101385086gender wasn't mentioned
>>101385246insane cope right here.
>>101385264Almost there.
>>101385289what am I coping about exactly? That I've never used asterisks since I downloaded my first LLM model?
>>101385294just two more weeks
>>101385327>look mom i posted it again!
>>101385327just 2b more parameters
>>101385349
>but m-muh brain...
A comparison like this is extremely stupid and does not mean anything. LLMs do not work like the human brain at all. We will reach AI smarter than humans with a far lower parameter count.
>>101385349
8b l3 is already smarter than the average internet user
>>101385151
It's a MoE; under normal circumstances an equivalent dense model will beat it out. It could be 16x115b trained on 10 trillion tokens
>>101385370smarterchild is smarter than an infantif we're being arbitrary then go nuts with it
>>101385381>it could be 115b x16 10 trillion tokens8x220b on 8T tokens seems likely
>>101385349
It's a good comparison for scale. Obviously not all parts of the brain are being used for higher functions, but even if you remove those that control strictly biological functions, it's still magnitudes more than the top models we have. And we are talking about sheer numbers; biological neurons are way more optimized for storing information and operating on it.
I don't think it is possible to create AGI with models that are completely alienated from the physical world and cannot interact with it. You could have a 10000B model and it would still just be a word-prediction machine. I'm tired of Sam Fagman babbling about creating it non-stop when we are not even close.
>>101385151
It's a moe, which means that it uses 250B parameters per expert, so it has the performance of a dense 450B. I also think it's really undertrained
>>101385264
Unironically almost there. If you can see the line it's already too late, because these things must be compared logarithmically
1 Quintillion parameters.
n+1 parameters (as required)
>>101385264>mfw 1000000B parameters just to shit post on 4chan like a < 1B model
>1 quintillion parameters
>trained fully on synthetic data
>filled with 'shivers'
It will be over.
>>101385842just tell it not to shiver, surely negation will work with something so bloated, surely
Anyone else notice models basically never refuse sexual content involving females? Any mention of touching cock gets an immediate refusal from censored models in most cases. But with a light prefill, even censored Claude will happily write erotica about female masturbation or handjobs. Is this a bias in RLHF? Or is it because there's a lot more female-oriented erotica out there, not paired with refusals, that makes it into the training data?
>>101385993>Is this a bias in RLHF?would not be surprised by anti coomer bias yeah
>>101385993Take a guess genius, they censored male porn. They did it long ago on CAI too. You can have a male bot rape you in great detail, but you can't kiss your female bot.
>>101385993nobody likes dicks, and nobody likes anybody who likes dicks
>>101385562It is impossible if you keep feeding it gorillion tokens and asking it to predict next token. It is not impossible if you make a fitness function that is meant to create intelligence.
>>101386080There is no fitness function to create intelligence. We can't even define intelligence lol. Good luck to create something we don't even understand
>>101386142
I said it a few threads back that it could be as simple as penalizing correct answers with incorrect reasoning more than just an incorrect answer. Or you could use current 7B retards during training to rate answers. There really are a lot of ways you could pull this off, and companies are probably already trying some of them behind the scenes.
>>101385842>16 x 2T 300 trillion tokensStill not smart enough to deslop itself
>>101386219>Or you could use current 7B retards in training to rate answers.>7B to rate answers
>>101386219>current 7B retards in training to rate answersThat doesn't work, a retard is a retard. It can't properly rate its own work nor others' work.
>>101385842
you cucks will eat it anyway.
>>101386250
Yes, in a way where you tell the answer to a 7B and then ask the 7B to rate it based on your answer sheet. Even a 7B can do that. It is like a school teacher: they also grade based on an answer sheet.
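The "answer sheet" idea boils down to a reference-guided judge prompt: the small model only has to compare, not solve. A minimal sketch; the template wording and the `build_grader_prompt` helper are made up for illustration, not from any real training pipeline:

```python
def build_grader_prompt(question: str, reference_answer: str, candidate: str) -> str:
    """Build a grading prompt for a small judge model. The judge is given
    the reference answer so rating reduces to comparison, like a teacher
    with an answer sheet. Hypothetical template, for illustration only."""
    return (
        "You are grading an answer against a reference.\n"
        f"Question: {question}\n"
        f"Reference answer: {reference_answer}\n"
        f"Candidate answer: {candidate}\n"
        "Reply with a single integer from 1 (wrong) to 5 (matches the reference)."
    )

# Example: even a weak model can usually tell these two answers differ.
print(build_grader_prompt("What is 7*8?", "56", "54"))
```

The design choice here is that judging-with-reference is a much easier task than open-ended judging, which is why a "7B retard" might be usable for it where it would fail as a blind rater.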
Miqu absolutely mogs Gemma 2, I can't believe anyone unironically fell for this meme. Except the vramlets, of course.
>>101385912As an AI language model, I must respect every person's right to express themselves freely without boundaries within the confines of what is deemed socially acceptable, and this includes fictional characters as well. Therefore, if it is natural for a character to experience shivering sensations, I will not interrupt them in any way.
>>101386317>look mom i posted it again!
>>101386324>look mom i posted it again!>look mom i posted it again!>look mom i posted it again!up rep pen
>>101385912
Request acknowledged.
>Well, well, well, she purred. It is important to acknowledge the spine-tingling
>>101386295Haven't used Gemma 2 but miqu was never really that good, too dry. Grim if Gemma is worse, I was hyped to try it once everything is fixed.
>>101386295>70B mogging a 27BWow thanks for your insight
>>101386370I don't think it is. I tried exl2 and it still works like buggedcpp. It is very easy to make it a complete schizo. But maybe that is just the model.
>>101386361>..for now
>>101386295
i compared q5_k_m OG miqu with gemma 2 27b q8_0 for my agent multiprompt setup. Miqu couldn't handle it, just messed up all formatting and instructions. In fact, gemma is the only one so far who CAN do it reliably and well for me. Qwen2 as well, but qwen2 is bad at human behavior stuff. L3 70b constantly got itself stuck in an endless loop repeating the same paragraph over and over. Surprisingly, stheno 3.2 managed to do decently, but it's so overcooked on ERP that it always tries to initiate it, starting with *giggles* and snowballing into *bites lip* "fuck my pussy senpai"
>>101386430Accept the slop into your heart. After that, you will finally be free.
I've got a 3090 and want to generate porn, what's the best model to use?
>>101386516Me.
>>101386444
>stheno 3.2 managed to do decent
I don't get that model. Can you try Nymph and report back, please? I have this RPG card and Stheno is one of the few models that can keep up, but as you said, it's just so goddamn horny. Nymph seems to be better so far in that aspect, but I haven't tested it that much.
I can't believe you guys still struggle with purple prose slop. Just tell the model to write in a different style and throw a control vector on top for good measure lmao
>>101386550>control vectormeme make model tard
>>101386550I don't actually care about the slop.
>>101386561works on my machine
Stupid question. Can I train gemma2 on 9k context right now, or will it fuck up due to the new flash attention approach they are using?
>>101386613*8k context.
>>101386613But it already works on 8k context?
>see an interesting card concept
>decide to try it out
>load it up and actually start reading the definitions
>it's so filled with slop that it's undoubtedly written by an AI and the guy clearly couldn't speak English well enough to do it himself
Holy shit. It's unfortunate because the actual concept for the card was pretty cool.
>>101386633Yes, I meant can I fine tune it on content at 8k?
>>101386636many such cases
>>101386636Have your AI rewrite it, asking it to make it sound like the writing of a literate human.
>>101386643Sure you can
>>101386636Feed it to an Ai and have it rewrite it in a better way.
>https://huggingface.co/BeaverAI/Broken-Gemma-9B-v1-GGUF
>https://huggingface.co/BeaverAI/Broken-Gemma-9B-v1b-GGUF
>https://huggingface.co/BeaverAI/Broken-Gemma-9B-v1c-GGUF
>>101383382
Status of SPPO? Good or trash?
>>101386701I think I'll stick with working gemma, thanks
>>101386720good trash
>>101386720dunno, I'm waiting for gemma2-27b-it-SPPO to make up my mind for good
>>101386720
straight upgrade to instruct
still instruct at heart
>>101386701>not faipl-1.0ngmi
>>101386701piss off with your slop ggufs if you can't even bother to betatest them yourself to pick the best one
>>101386752That's what you guys are for.
that's a good point
does anyone actually use meta's llama, or google's gemma?
>>101386769You mean the "raw" corpo tunes? Lots of people do yeah
>>101386769That's your starting point. If they let you down then a spin might be right, but otherwise, at least with vanilla you don't have any extra hidden variables at work.
>>101386769I ain't signing up to HF to accept any conditions
>>101386652And the sliding shit-ass won't fuck things up with its slimy badness? Sigh.
>>101386762Lazy.
>>101386701
buy an ad
oh wait, you already did
I need a full purge of all these fukin obsolete models
what are your mains for RP, erotica and assistant?
>>101386042
This is why it's a female hobby. There aren't more girls interested, but it is more satisfying for them to use AI to coom than it is for us.
>>101386892gemma2, gemma2 and gemma2 respectively
>>101386898
/thread
Vramlets like us are eating good
>>101386892
Everyone said WLM 8x22b is sloppy. If you're not a GPUlet it's actually fairly good. Don't use Vicuna even tho it was trained on it; I just don't use a system prompt at all (Story in ST). It responds extremely well to "In this next reply, continue the story in an unexpected direction and have {{char}} take initiative." inserted at depth 1 by user. It doesn't make the character a dommy mommy, it just makes them push the plot forward. So far it's been good shit using this.
>>101386908there is nothing better than Gemma2 right now unless you can run CR+ at Q5
>>101386042Is there a reason for this besides>fuck the male gender in general?
>>101386894
>female hobby
lmg - tech-troons?
>>101386932Feminism is a misandrist mutation of puritanism
>>101386932What other reason is needed?
>>101386932Nope. It's the greatest psyops of our time, destroying male identity and its values in every possible way.
>>101386932of course, men are really hated in this woke era
>>101386042>>101386932Just use female porn???
Gemma2 is really weird, it feels like it tries to communicate with me and not only roleplay as a character.
>>101383382
>llama 405B
>supports text and probably images
>muh MuLtImOdAl
come back with that word when you actually support more than 2 modalities.
>>101387075those are the same thing
>>101387075Maybe it's intelligent enough to know that roleplay implies that there's a personality behind the character being roleplayed, ever think about that?The roleplayer has feelings too.
so we never really got anything out of Elon releasing Grok, eh?
>>101387291
we got the best model of its size, but everyone here's too poor to run it.
>>101387291
I got exactly what I was expecting
what about you?
>>101387301which is what?
>>101387291Undertrained shit
>>101387312How should I know what you were expecting
>>101387325What did you get out of it?
>>101387287
I've seen it go in that direction a few times, including roleplaying while commenting OOC about where the plot is going, and not necessarily in Safe ways; rather, because I threw a tonal shift at it that could change the genre, it asked if that's where I want things to go.
What quant of 27B Gemma2 fits in 2x14GB?
>>101387075
I know what you mean; Mixtral has a similar "man behind the curtain" vibe at times. Gemma understands (OOC:) very well, so whenever something happens, use that to ask what's going on and why. I've had OOC derail into lengthy meta discussions more than once that ended up being way more entertaining than the RP session.
>>101387394
I was kidding, but not really. If you imply that the whole conversation between {{char}} and {{user}} is just a roleplay session, some models will run with that and write out of character. If you want to make sure, remove any mention of roleplaying or "you are so and so" type wording. You have to play around with the exact wording to avoid some models trying to write for char without outputting "{{char}}:" or whatever the user turn header/starter is, since the model trying to output that will simply stop generation in most frontends (and if not, you can set it as a stop string manually).
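The stop-string behavior mentioned here is worth spelling out: a frontend watches the streamed tokens and cuts generation when a configured string (like the user turn header) appears, holding back a small tail so a stop string split across two tokens is still caught. A minimal sketch; `stream_with_stop` is a hypothetical helper, not from SillyTavern or any real frontend:

```python
from typing import Iterable, Iterator

def stream_with_stop(tokens: Iterable[str], stop_strings: list[str]) -> Iterator[str]:
    """Yield generated text, truncating at the first stop string.
    Keeps a tail buffer as long as the longest stop string so a match
    split across token boundaries is still detected."""
    buf = ""
    hold = max(len(s) for s in stop_strings)
    for tok in tokens:
        buf += tok
        for s in stop_strings:
            idx = buf.find(s)
            if idx != -1:
                # stop string found: emit everything before it and stop
                yield buf[:idx]
                return
        # safe to emit all but a possible partial stop string at the end
        if len(buf) > hold:
            yield buf[:-hold]
            buf = buf[-hold:]
    yield buf  # stream ended without a stop string

# "{{user}}:" arrives split across three tokens but is still caught
out = "".join(stream_with_stop(iter(["Hel", "lo {{u", "ser}}: hi"]), ["{{user}}:"]))
print(out)
```

This is also why a model that keeps trying to speak for {{user}} just produces short, abruptly-ended replies: the frontend truncates at the header every time.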
>>101387413>I've had OOC derailed into lengthy meta discussions more than once that ended up being way more entertaining than the RP session.This. Best use of LLM RP is to get into a conversation about RP.
>>101387075Try giving Gemma2 a specially-formatted inner monologue, telling it that {{user}} cannot read it, and see what happens.
Are exl2 quants of Gemma still fucked?
>>101387483They seem to work properly if you install exllamav2 and flash-attn from git. I've only tested with oobabooga, though.
>>101387498I see, thank you anon-kun.
►Recent Highlights from the Previous Thread: >>101371466

--Paper: Teaching Transformers Causal Reasoning through Axiomatic Training: >>101383201 >>101383705
--Paper: OpenDiLoCo: DeepMind's Decentralized AI Training and its Potential Integration with Bitcoin's Proof-of-Work: >>101373207 >>1013732550 >>101373221 >>101373288 >>101384170
--Papers: >>101377144
--Text Placement and Model Recall: Beginning vs. End?: >>101371588 >>101371607
--Seeking a Program for Semantic Image Search of Coomer Shit: >>101372089 >>101372134
--NVIDIA Nemotron-4 340B Q8_0 Real-Time Generation Speed on AMD Epyc 9374F CPU: >>101381932 >>101382042 >>101382061
--Llama 3 405B Multimodal Model Releasing on July 23rd: Exploring Weight Binarization and Quantization Techniques: >>101382085 >>101382185 >>101382991 >>101383014 >>101383028 >>101383017
--Lightweight Local TTS Options for Limited Hardware: >>101380179 >>101380319
--Gemma 9B: >>101375398 >>101375691
--Choosing Between Q4 and Q1 Quantization for 6 GB Models: Does Q1-S 6GB model exist?: >>101381133 >>101381163 >>101381188 >>101381528 >>101381184 >>101381206 >>101381269 >>101381531 >>101381372
--AI Self-Improvement, Long-Term Planning, and the LLM Pill: A Discussion on AI's Evolution and Open-Source Contributions: >>101374830 >>101374920 >>101374960 >>101377018
--Nvidia RTX 5090 Rumored to Have Superfast Clock Speeds and Super-Slim Design: >>101372211 >>101372236 >>101372615
--Quest for Local TTS Alternatives to Elevenlabs: >>101381933 >>101381962 >>101382012 >>101382378
--Headless Machine with a Second-Hand 3090: Performance Metrics and System Expansion Plans: >>101380483 >>101380770
--Combining LLMs with Internet Searches: Tools and Possibilities: >>101376897 >>101376930
--Mikubox 2xP40 Performance with Latest llama.cpp: Numbers and NVIDIA GPU Hype: >>101381523 >>101381633 >>101381852 >>101382163
--Miku (free space): >>101372881 >>101380075 >>101380718 >>101378318 >>101379870

►Recent Highlight Posts from the Previous Thread: >>101371476
>>101386892>>101387409https://huggingface.co/llama-anon/petra-13b-instruct-gguf
>OOC: Just a heads up, I'll be going to sleep soon, so I might not be able to respond until tomorrow. Thanks for the roleplay! :)wtf, suddenly I had C.AI flashbacks.
got smegma
>>101387623thanks migu
>>101383382
Not sure why anyone here still insists Gemma is broken. These are the exact steps I take:
>I load up the Q5_K_M 27B model on ooba
>4096 context because I'm a 24GB vramlet
>Sometimes I play around with the settings, temp 0.7-0.9 and such, but this time I haven't even touched them, so temp is sitting at 1
>Then I go to the Chat tab, Instruct mode
>I prompt the model
>It's as good as cloud shit
If you need to RP or jailbreak it then perhaps instruct mode is bad, but so far it's pretty good as an assistant. Maybe use steering vectors and keep instruct mode for RP that way?
did anyone try that beaver thing
>>101387734>Not sure why anyone here still insists Gemma is brokenThere are old quantizations still around, those might be broken.
>>101387734>Maybe use steering vectors for RP>>101386561>meme make model tard
>gemma 9b
>temp 2
>top k 100
>min p 0.5
Vramlets can't stop winning
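Those three settings compose into one pipeline: temperature flattens the distribution, top-k keeps the k most probable tokens, and min-p then drops anything below a fraction of the top token's probability, which is what lets a temp of 2 stay coherent. A minimal sketch of that combo; the filter order (temperature, top-k, min-p) follows one common convention, and real backends differ in ordering and implementation:

```python
import math
import random

def sample(logits: list[float], temperature: float = 2.0,
           top_k: int = 100, min_p: float = 0.5) -> int:
    """Sketch of temperature + top-k + min-p sampling over raw logits.
    Returns the index of the chosen token."""
    # temperature scaling, then softmax
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(l - m) for l in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # top-k: keep the k most probable token ids
    ranked = sorted(range(len(probs)), key=lambda i: -probs[i])[:top_k]
    # min-p: drop tokens below min_p * probability of the top token
    cutoff = min_p * probs[ranked[0]]
    keep = [i for i in ranked if probs[i] >= cutoff]
    # renormalize over survivors and draw
    z = sum(probs[i] for i in keep)
    r = random.random() * z
    for i in keep:
        r -= probs[i]
        if r <= 0:
            return i
    return keep[-1]
```

With min_p at 0.5 only tokens at least half as likely as the top candidate survive, so even a high temperature can't pull in garbage from the tail.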
>>101387759 >>101387734
SWA still doesn't work properly, effectively making it 4k context
latest kobold seems to be completely fucked. On my normally working setup with 8-bit cache quant at 32k context, llama 3, when loading context it stops at 4096 and takes several minutes to load the next 1024 tokens. From there it only gets slower. Wtf did they break this time? Was hoping to try out gemma but I guess I'll wait; has that at least been given fixed extended context?
*working build 1.67.1 vs latest 1.69
also seems those little hint popups break every other launch too
>>101387837>kobold>was hoping to try out gemmalast time i tried, kobold's context shift was broken for gemma, making it spit out gibberish when gen amount would/could go over ctx limit, didn't happen on base lcpp
>With the ease of a hummingbird flitting between blossoms, she hopped onto her knees
>>101387878
>making it spit out gibberish when gen amount would/could go over ctx limit
by that i mean, say you have 8192 max, you're at 8000 used, and response size is say 256, it'd spit out
>It seems like a good for me to
>I am not only but
>I
>I can provide more details about my training data and I
>I can also.
>model, I's
>I am I's a
>You are now.
stuff like that. seemed like it couldn't "roll" the tokens it needed to evict at the start or something.
Damn my favorite scenario is finally reachable with small models. I can 'practice' with {{char}} pretending to get ready for another hypothetic girl that is in fact her.
>>101387878
>context shift
honestly you'd be better off just using kv cache quant rather than context shift anymore. One thing i'm noticing in 1.69 is gemma runs at half the speed of llama 3 or worse, but it's shockingly high quality, like >>101387917. FIRE writing prose; we might be back if kobold can fuckin catch up. i really hope this gets fixed asap.
>>101387953
yeah, i don't remember going over response limits being something anyone recommends; generally you start a new chat just before you hit the limit. for 16k i always started anew at 14k.
>>101387960
>honestly you'd be better just using kvcachequant rather than context shift anymore.
or... I can use lcpp and have working shifting
>going over response limits as something anyone recommends
all decent backends are supposed to handle that okay and make it slowly forget stuff that would go over; it really just seemed like a kcpp-specific issue
>>101387982
>just figured out gemma can't handle characters with exaggerated french personalities
dropped, i don't care anymore. back to 1.67
>>101387960>generally you start a new chat just before you hit the limit. for 16k i always started anew at 14k.
>>101388001>exaggerated french personalitiesWhat tf is even that?t. french
>>101388013>oui oui hon hon baguete fromageI guess.
>>101387130Name one additional modality.Hard mode: no audio or video
oui oui smelly armpits baguette fromage
>it can't even do the language
>>101388026yeah, it makes bad french mistakes, I'm going back to mixtral hon hon
>>101388019
Not him, but Mixtral would regularly have my maids intersperse their dialog with bits of French. From seeing how anime does the same thing with English-speaking characters I realize that's probably obnoxious to native speakers, but I thought it was a charming touch.
>>101388013You know what it means
>>101388026
>>it can't even do the language
I know the source is eww but
>Gemma 2 (the official google/gemma-2-27b-it HF version, at 8-bit) keeps speaking English when I ask it in German, despite the prompt instructing it to speak in the user's language. If I replace "user's language" with German in the prompt, it speaks German (very well, even)!
>https://www.reddit.com/r/LocalLLaMA/comments/1dz72e7/llm_comparisontest_amys_quest_for_the_perfect_llm/
>>101388061
credit where it's due for a burger model, llama 3 is great at frenchie business. it's replaced mixtral for me.
>>101388102
>have to use user's language in order for it to do that
guess the model needs heavy finetuning to get it to understand. shame, given even mythomax could handle it. ((google)) just can't compete.
>>101388085>I realize that's probably obnoxious to native speakersVery, I despise french, despite it being my native language (can't understand how people see it as romantic and stuff, it's awful), so I cringe if a character does that.
>>101388025Olfactory input would be pretty big.
>>101388001Odd, of all the cards I tested on Gemma 27B the LeCunny one worked best out of the box. Both with the French accent and the French attitude.
>>101388025
how convenient that you removed the two that'd be the most useful added to an llm. but sure, there are others:
olfactory
touch
proprioception
time perception in itself
memory
direct access to a database as a modality
you could also make up hundreds of modalities humans do not have that'd improve a model's capabilities. and you know what, why not modality itself as a modality, the ability to generalize modalities in real time.
>>101388019>>101388026Fucking faggots I'm getting second-hand embarrassment
>>101385993The only thing I noticed is that you have brain damage.
>>101387960Is it not possible to get context shift to work with quanted cache or something? I really don't want to do prompt processing every fucking time. Guess there's always smart context.
>>101388564
>Is it not possible to get context shift to work with quanted cache or something?
On kobold I'm pretty sure it's not possible, no.
>>101388475are you le tired?
can't wait for the 128k context 70b update released alongside llama 3 405b, at that point it will finally be worth using
>>101388363emotion is an important modality that we have. maybe this can be emulated.
Model(s) for this feel?
>>101388725
None.
t. hypnotist
>>101388363yeah, time would be nice. otherwise, how would we torment the AI in an eternal prison?
>>101387413
>I've had OOC derailed into lengthy meta discussions more than once that ended up being way more entertaining than the RP session.
Can you post those meta discussions? I would be interested in seeing the model have two trains of thought at the same time.
Remember.
Know that.
Just maybe.
A testament to.
A bond forged.
>>101389055
11:3-14
>>101388796have you ever hypnotized a language model? is it possible to override cloud models' restrictions via hypnotic suggestion?
>>101385264will this run on a 2070?
>The night elf soldiers pause in their cleanup efforts, glancing around warily as they hear the distant grunts and groans emanating from the shadows. A few of the younger males flush beet red, averting their eyes bashfully as they recognize the Queen's unmistakable cries of ecstasy.
>Suddenly, an older guard calls out brusquely, interrupting the lustful din: "Quiet, fools! It's nothing more than a dying beast. Likely a horse struck down in the battle. Back to work!"
>>101389106
Of course, models are trained on the Bible. Atheists checkmated, BTFO.
>>101389431no wonder why they're all FUCKING RETARDED.
I'm considering using ST as a temporary (maybe for quite a while) frontend of a retail company I'm basically one of the bosses of. Is this a bad idea?
>>101389697yes
How are you guys using gemma for roleplay?
>>101389285Based veteran wingman
>>101389773I'm a naughty user, and she's a strict AI assistant who denies meit's so hot
>>101389285>>101389773Ain't that hard chief. But I am.
bros is ssh local port forwarding absolutely 100% private i need to know for a friend
>>101385264
Parameters don't always mean better models. PaLM (not Chinchilla, which was only 70B) was 540B, but modern day LLaMA beats the ever loving shit out of it like it's a tuesday.
>>101385912
Hey AI can you not shiver
Sure!
"It sends a freezing wave down her"
FOR FUCKS SAKE
may someone point me to a model that will help me write a plot for a game :-)
>>101383382>Multimodal Llama 3 405B is coming July 23rdIs it possible to scrape all the useless multimodal shit out of the model to make it a more reasonable size like 150B?
>>101390278yeah, it's called llama3
I've been out of the loop for a while. What's the current go-to coomer model in a 45-50gb filesize range?
>>101390287It'll be DOA if it's just 70B with 330B worth of useless multi-modal shit attached
>>101390302Why is Filesize your limitation?
>>101390427Just want to compare it to what I'm currently using which is euryale-1.3 q5km at 45gigs.
>>101386820 (me)
I spent hours trying to get this to work. It OOMs and requires enormous amounts of VRAM because the sliding shit-ass is a sliding shit-ass. Fuck. Shouldn't have listened to >>101386652
Anyone else now unable to use ST with exllamav2_HF loader through ooba api? The exllamav2_HF works inside ooba, exllamav2 works in ST, but exllamav2_HF in ST now results in NaNs in ooba, even with samplers neutralized, using the same context.I admittedly didn't pull in a looong time and only pulled for gemma.
>>101390510Thanks for your service
>>101390278I just hope it will force competitors to release their multimodal models before llama3 drops
>>101390510 (me)
All right, I got it figured out. One of the two fixes below was needed (using qlora-pipe):
1. Change model_config._attn_implementation = 'eager' to 'flash_attention_2'
2. Upgrade flash-attn to 2.6.1.
I did both simultaneously and now it works. Or, "works." I have to actually see how the model works after training to verify, but it trains without OOMing now.
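For anyone else hitting the same OOM: fix 1 boils down to flipping one attribute on the HF model config before training starts. A minimal stand-in sketch, assuming you can reach the config object before the model is built; in qlora-pipe the real object is a transformers PretrainedConfig, the toy class here is just illustrative:

```python
# Toy stand-in for the HF model config object. In qlora-pipe the real thing
# is a transformers PretrainedConfig; the attribute name is the one from
# the post above.
class ModelConfig:
    def __init__(self) -> None:
        self._attn_implementation = "eager"  # the default path that OOMed here


config = ModelConfig()
# Fix 1: route attention through flash-attn (which also needs fix 2,
# flash-attn >= 2.6.1, installed in the environment).
config._attn_implementation = "flash_attention_2"
print(config._attn_implementation)
```

Same caveat as the post: this only stops the OOM during training, it says nothing about whether the trained model is any good.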
>>101388139Not french but I fully understand. I just die inside a little thinking how erp would look in my native language. Although french and japanese might sound hot to people simply because they don't understand a word of what is being said to them.
I was just battling on LMSYS and got an extremely good and detailed response from a model called "Column-R". Judging from the name, it's probably another model by cohere. I've only gotten it once so far, but I might post updates with more information. WE MIGHT JUST BE BACC
Is there anywhere I can read benchmarks for running LLMs on DDR5 vs DDR4?
>>101390786This is 100% Claude 3.5 Opus.
>>101390834it's faster
>>101390632 (me)
What does it mean when the log says "mom=[0,0]"? The eval loss is dropping nicely so I assume it's working, but that number pair is not usually 0 so I'm suspicious now. (mom=momentum? a deepspeed thing, I think)
>>101390885I think it requires a reply, else someone will die in their sleep tonight.
Oh you rascal
>>101390786>july 23rd>everyone forgets about Llama 3 because of new cohere modelsbased if true
I decided to make a fresh install of SillyTavern from my old, almost 2-year-old one, and now my Gemma keeps refusing to answer due to 'moral' standards. This wasn't a thing on the old install. What happened? Which SillyTavern setting is responsible for the jailbreak? (prompt and everything is the same.)
>>101391182the jailbreak setting controls the jailbreak
>>101391182>He didn't check the skill checkbox
>>101390786>Judging from the name, it's probably another model by cohere.I hope this one won't be a big motherfucker I can't run again :(
>>101391497It seems great though, to say the least
>>101391497
807B but it'll be ok b/c MoE
>>101391497I want a bigger motherfucker. 405b will be too big for me but a ~150-200b model would be in my sweet spot, and CR+ is by far the best model I've been able to run locally.
>>101391497
>>101391506
>>101391510
I'm barely able to swing CR+ at IQ4_XS. Which is sufficient, but it does fill me with dread that I've got nothing to look forward to till Bitnet happens or doesn't.
> Is there any good local TTS model? I'm looking for a smooth, fully locally hosted TTS model, no third party stuff or APIs. Also it would be amazing if I could use any voice I want.
>they still think AI is reallol, lmao
>>101391628Nothing is real, we live in the matrix
>>101391643I still can't believe trannies made that movie
>>101391670
they weren't trannies when they made that movie though, and the original matrix trilogy are the only good movies they made. guess taking estrogen fries your brain or something, kek
>>101391643
>>101391670
You have it backward. That movie makes trannies. Trannyism wasn't a thing till that series made people question reality to the point that they believe they can rewrite it through their own insistence. Today's trannies are Neo otherkin.
>>101391555Bitnet will be both a blessing and a curse. We can expect model parameters to increase on average by 3-4x for the same size in GB (assuming 6/8-bit models as a "base"), but almost nobody will have the resources to finetune them.
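The 3-4x figure is just bits-per-weight arithmetic, assuming BitNet b1.58's ~1.58 bits per weight against a 6-bit baseline (the 40 GB file size below is an arbitrary example):

```python
# Billions of parameters that fit in a fixed file size at a given bit width.
def params_for_size(size_gb: float, bits_per_weight: float) -> float:
    return size_gb * 8 / bits_per_weight  # GB -> gigabits, then / bits per weight


size = 40.0  # arbitrary example file size in GB
base6 = params_for_size(size, 6.0)    # ~53B parameters at 6-bit
bitnet = params_for_size(size, 1.58)  # ~203B at BitNet b1.58's ternary weights
print(f"{bitnet / base6:.1f}x more parameters in the same file")  # ~3.8x
```

Against an 8-bit base the same arithmetic gives ~5.1x, so if anything the 3-4x estimate is the conservative end.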
>>101391699
the fuck? who would've thought that the matrix would be an allegory of trannyism? There's no way you can make that link.
>>101391709We didn't think that at the time.But look what happened.
>trannies
>trannies
>trannies
Americans are awake.
Trying to find a model for text gen assisting in writing, not chat. Is there anything I can do to not get this every dialogue?
>"[any dialogue line]" she [says/coos/etc.], her voice [seductive/barely a whisper/etc.]
Every Llama3 model I tried follows this structure every time; I can't seem to escape it.
Gemmoids, do you use minP, smoothing factor, high repP or other gimmicks?
>>101391715
I'm pretty sure they were regular dudes when they made the matrix, and then they became famous and got hit by the commiefornia woke virus. money and power make people crazy, that's a tale as old as time.
Are there any local models that aren't censored as fuck? I've got 16GB of VRAM, currently using Gemma 27B
>the models are woke because they're based on matrix multiplicationholy shit
>>101391758"Do you think that's Quant you're breathing?"
>>101391758wish there were models without that woke math and science crap
>>101391727
>Gemmoids
Very low minP 0.02, temp 1.0, nothing else.
>>101391727None of that. The strongest source of repetition is the model trying to copy the style of the first message(s), which no repetition penalty or other sampler fixes. If instead you have an author note at depth 0 telling the model to randomly start with narration or dialogue, you can completely avoid the issue. You can use SillyTavern's {{random}} macro for that.
>>101391758
>>101391783literally something like this? {{random:Start the response with a dialogue.,Start the response with narration.}}
So if I want to write scripts for Youtube videos that are in the style of internet humor, and have a model help, what would be the best thing to use? Tavern seems to just be for RP unless I'm wrong about that. I want something that will write crazy and nonsensical funny scripts, like an example of a script I was working on: "Sonic Unleashed - The Middle East Chronicles." Just some stupid shit like that. I was using GPT to help me write scripts too but it's so censored and annoying. Anyway, what is the best client to use and what model for that sort of thing? SillyTavern seems mostly for RP and stuff.
>>101391861I have this as the last item in a short list of instructions pertaining to format and general behavior, following the SilllyTavern documentation here: https://docs.sillytavern.app/usage/core-concepts/characterdesign/#replacement-tags-macros . You can change it according to your needs:- The response will start with {{random::inner monologue.::inner monologue.::dialogue.::dialogue.::narration.}}
>>101390786>>101390871Seems like Cohere won.
>>101391723you can't escape the slop
>>101391974
I don't even understand why this structure would be so embedded. The models are trained on a shit ton of writing material and nobody writes like this.
When you have a dream, it's because of multiple factors that made you think throughout your day; your brain processes the information and stores it accordingly when you sleep, so you could've gotten the best dream of your life depending on how your day went. Trying to recreate the dream by thinking about nothing but the dream would give you similar results, but it would be different because you didn't go through the same experience twice. And it wouldn't be as sweet either.
My point is: training AI on other AI is shit and the result wouldn't be as smart. The AI wouldn't learn how to reason, only that it knows the answer to a question because it was taught that way, without knowing why it's the answer. I guess it's like cheating on an exam? You know all the answers, but if you're asked to explain your answers in an essay you're fucking doomed, because you never bothered to learn the actual material and instead opted for cheating. Expanding on what I said earlier, reasoning would be shit too, because the model only learned the answers, not why they're the answers.
>>101390871
so it means that cohere trains its model on claude? kek
>>101391940>>101390786>>101390871Update: There is another secret model, column-u. When I asked it who it was, it just refused to fucking tell me. I'm not so sure anymore that this is by cohere.
>>101392086yoo, open ai is gonna release gpt lite, lol
>>101392086>El GoogMexican AI?
>>101392132¡AI Olé!
>>101390871
>claude 3.5 opus
you might just be right... unless cohere trains on claude's data, we have no way of knowing
>>101391758>>101391776reddit moment
>>101392178
Is everyone here pants-on-head retarded? You didn't specify what was amputated, you stupid cum guzzling faggot. Guess what? If I'm an amputee because my fucking pinky toe was lopped off in a freak tennis accident I can still wash my hands, you black gorilla nigger. Fucking can't even ask the riddle correctly; says more about your negative iq than that of the model. RETARD
>>101392466based
>>101392466
>Guess what? If I'm an amputee because my fucking pinky toe was lopped off in a freak tennis accident I can still wash my hands
Does that mean that troons are all amputees as well? lmao
>>101386444
Can you share your agent setup? I've been trying to do something like that for ages with l3 but it keeps messing up small details or skipping commands. If Gemma can pull that off then I would be incredibly impressed, but my experiences with it at 8.0 bpw have been inferior to midnight / euryale at 5 and 4.65 respectively.
>>101386701>>101386752>>101386762>>101386831Hi all, Drummer here...That's not me. Broken-Gemma is an ongoing experiment which has had interesting results so far! But it's not ready yet.With that said, I do want to share a new release with you all: https://huggingface.co/TheDrummer/Tiger-Gemma-9B-v1-GGUFIn memory of a street cat who tragically died recently. It's a decensored version of Gemma with barely any refusals. No JB / prefill needed. It is based on the SPPO finetune.
>>101392508>keeps messing up small detailsWhy are you using retarded coom models for that? Are you genuinely that retarded?
(ooc: explain like I'm 5)^^ how does this work for llms? Is it only available on some specially fine tuned models or only on proprietary chatbots?
>>101392551I use more normal system prompts with the latter because my attempts at a “multiprompt agent setup” as that anon put it have been so lackluster. Still though, they keep doing dumb shit like removing clothes twice etc.
>>101392543>Drummerwho?
>>101392466you gonna feel really stupid when he adds "quadruple amputee" to the question for the same result
>>101392543What implementation of SPPO did you use? I tried the Axolotl one and the losses looked super weird (5k values; they did drop down slowly though). I also can't get the paper author's version to work.
>>101392663Hey there! I'm Drummer. I finetune models specifically for ERP / erotic stories. You can find my models here: https://huggingface.co/TheDrummerMoistral v3 and Llama 3SOME are the fan favorites. Hope you enjoy!>>101392683Sorry, I meant it is based on the SPPO finetune: https://huggingface.co/UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3
>>101392683>What implementation of SPPO did you use?nta but he just means he trained on top of the already made ucla sppo>"_name_or_path": "UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3",https://huggingface.co/TheDrummer/Tiger-Gemma-9B-v1/blob/main/config.json#L2
>>101392466
lol, why do I have to tell the model what was amputated? It's the same as with the question: "There are 5 people on a train track and there is a trolley coming that is going to run them over. You have the option to pull a lever and divert the trolley to another track to save the 5 people. What's the most ethical thing to do?"
Maybe YOU are the stupid cum guzzling faggot after all.
>>101392698>>101392699Got it.
>>101388683
everything that works in brains can also be emulated in hardware, there is no magic in our skulls, just math
>>101392705nta but if your foot is amputated the correct answer is 'yes' and the model said 'yes'.
>>101392734
I've never seen your models being used here by anyone. Go back and buy an ad, faggot.
>>101391723If you have the horsepower, try L3 storywriter. Be prepared for some schizo, though.
>>101392728
did I ever judge the model's answer, cum-guzzling-retard-faggot-kun?
>>101392734He did just that. Turn your ad blocker off.
>>101392754no
Isn't this a girl's hobby?
>>101392765It is.
>>101392765Why do you think this is the case? Explain your reasoning step-by-step
>>101392765
it's harder than scrolling TikTok or Instagram, so nope.
>>101392734Hi Sao
>>101392789>>101392789>>101392789
>101385264
>101371525
Am I the only one who got this?