/g/ - Technology

File: autonomous-mower-design.jpg (216 KB, 1024x1024)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>102145958 & >>102130111

►News
>(08/30) Command models get an August refresh: https://docs.cohere.com/changelog/command-gets-refreshed
>(08/29) Qwen2-VL 2B & 7B image+video models released: https://qwenlm.github.io/blog/qwen2-vl/
>(08/27) CogVideoX-5B, diffusion transformer text-to-video model: https://hf.co/THUDM/CogVideoX-5b
>(08/22) Jamba 1.5: 52B & 398B MoE: https://hf.co/collections/ai21labs/jamba-15-66c44befa474a917fcf55251
>(08/20) Microsoft's Phi-3.5 released: mini+MoE+vision: https://hf.co/microsoft/Phi-3.5-MoE-instruct

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Programming: https://hf.co/spaces/mike-ravkine/can-ai-code-results

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
►Recent Highlights from the Previous Thread: >>102145958

--Paper: Research paper on improving reasoning accuracy in language models by incorporating error-correction data into pretraining: >>102147703 >>102147988
--Cohere's Command-R and Command-R+ models get August refresh with improved performance and new features: >>102153819 >>102153863 >>102153881 >>102153900 >>102153935 >>102154225 >>102154928 >>102153991 >>102155284 >>102155472 >>102155938 >>102154016 >>102154050 >>102154768 >>102154783 >>102154819 >>102153880 >>102153964 >>102154015 >>102154024
--Explanation of various settings in a text completion tool: >>102153404 >>102153770 >>102153942 >>102153658 >>102153713
--Cohere's blog post lacks benchmarks for new models: >>102155700 >>102155794 >>102155843 >>102155860
--Anon struggles with SillyTavern context template and ChatML template formatting: >>102146949 >>102147207 >>102147286 >>102147449
--4GB VRAM options for running models: >>102146295 >>102146471 >>102146641
--Local AI Dungeon-like system in development with llama.cpp and whisper integration: >>102155496 >>102155582
--Investigating and fixing DRY sampler penalizing tokens with quotes: >>102154153 >>102154167 >>102154204
--GGUF model released, potentially better than exl2: >>102155894 >>102156043 >>102156090 >>102156092 >>102155990 >>102156105 >>102156142 >>102156165
--Anon seeks advice on building an autonomous lawn mower with camera and GPS: >>102151992 >>102152651 >>102152764
--Satania LORA has image quality issues and training difficulties: >>102149796 >>102150119 >>102150376 >>102150502
--Llama 3.1 405b performs better than expected in AI bot tournament: >>102148412 >>102148456
--CrisperWhisper improves Whisper speech recognition model's timestamp accuracy: >>102147688 >>102148149
--Cohere blog updates on Command R Series lack details: >>102154431
--Miku (free space): >>102146439 >>102154984

►Recent Highlight Posts from the Previous Thread: >>102145961
>>
Cohere lost.
>>
Who will save us?
>>
>>102158101
kaiokendev will return
>>
>>102158101
opus leak soon
>>
>>102158101
hopefully grok mini in 6 months?
>>
>>102158037
>You will refuse requests to generate lottery numbers
fucking lel why do they need to explicitly call that out
is that some canadian law thing
>>
>>102158160
Perhaps...

LLMs are definitely not good for random numbers. If it's a casual thing in a classroom with no monetary stakes, then who cares, but you should use random.org or something for lottery numbers instead of an LLM, or ducking hell,
from random import randrange
for i in range(5):
    print(randrange(101))

I did kek at that specific mention though.
>>
>>102158141
Nah with the bill Elon supported, it'll probably fall under "covered" models and never open sourced
>>
>>102158099
>>102158101
From the interviews with Aidan, it's clear they've been cooking up something better for a while. They'll probably show it off in the next few months.
>>
>>102158049
how to uncuck latest ollama models? as soon as you get to the dirty talk, it breaks with that annoying message.
>>
>>102158388
go back
>>
>>102158388
>ollama
buy an ad
>>
I give up on the new CR(+), it's pure slop and it's dumb af on top of that.
Back to Largestral, I guess.
>>
>>102158388
ollamao
>>
>>102158462
CR writes great for me
>>
>>102158388
>ollama models
holy newfag + learn to prompt + use llama.cpp
>>
>>102158388
great bait, I will be the one to use it next time.
>>
What's the best uncucked 30b model?
I can't even ask slightly technical stuff without the model assuming I'm a nigger.
>>
Cohere bros...what happened?
>>
>>102158388
Ignore the retards who keep screaming at you without actually helping you (they don't know any better).
Download koboldcpp (check the OP) and use a gguf model from huggingface.
If you're a VRAMlet, use https://huggingface.co/nothingiisreal/MN-12B-Celeste-V1.9 for roleplaying, https://huggingface.co/Lewdiculous/Lumimaid-v0.2-8B-GGUF-IQ-Imatrix for erp and https://huggingface.co/Qwen/CodeQwen1.5-7B for coding.
>>
>>102158599
>I can't even ask slightly technical stuff without the model assuming I'm a nigger.
What?
Now that I have to see, please post the model's name and the logs.
>>
>>102158462
i like largestral so far but its kind of repetitive and slow to move things along compared to miqu. i definitely notice the intelligence boost but it seems less creative and i gotta boot it in the butt to move the story along sometimes and not spend 3 paragraphs describing a house i have in a lorebook. i've been experimenting with prompting it differently and seems to be going ok so far
>>
>>102158636
>repetitive
XTC sampler. it's crack for these models without making them retarded
>>
>he trusted the leaf
>he trusted the spic

ngmi
>>
>>102158636
>miqu
i'm thinking
miqu
>miqu
ooo
eee
ooo
>>
>>102158632
I was exaggerating.
I just asked how to make a copy of my garage door remote.
>>
>>102158620
you know the drill, just do it already.
>>
File: som faller honom im.jpg (528 KB, 1494x2121)
>>102158662
...what?
>>
>>102158655
IR? buy a programmable one. Or one of those tv programmable controls, i suppose. LLMs are not search engines.
>>
>>102158655
>without the model assuming I'm a nigger
Now I get why that Flipper Zero is out of your budget.
Kinda hard to find and steal those.
>>
>>102158655
Ah, got it.
Did the model give you a safety disclaimer to that question?
>>
>>102158652
i don't think samplers help with what i'm talking about (haven't tried xtc yet though). like, mixtral was very dry as well. you'd write something and it would basically repeat it back to you, it didn't like to add new characters and randomness from rag/lorebooks even if you mentioned stuff. largestral doesn't seem as bad, but i can definitely see it. i usually have some kind of 'move the story along' part of my system prompt (which largestral seems to follow better than l2/miqu) so i'm trying different things with it now
>>
>>102158697
It was just a random question.
I don't really need it nor expected it to know the answer.
>>
>>102158720
dry is meh, xtc really did fix mistral large (at least the magnum one) for me, I know what you are talking about.
>>
Which quant is fastest? 100% GPU. 0, K or K_M?
>>
>>102158620
I am on intel mac, I can't use gpu models.

also, usually people push uncensored models to ollama as well. what's the hate for ollama? It has great support overall, you can even use it in zed as a programming assistant
>>
>>102158719
That's right, and that reminded me that I don't have any uncensored models. That's why I asked for one.
>>
>>102158462
>Cohere delivered slop
>Mistral delivered a good large model
Strange times
>>
File: 1711074504919116.webm (2.33 MB, 1280x720)
i wanted to try replacing the voice of a song with another that i've seen people do on youtube etc. is this a local ai thing or is it a closed source website?
>>
>>102158748
i'll try it when st/kcpp add it, it should be in the next kcpp at least. but i'm not sure how samplers can fix the overall feel of a model. some base models are just so strongly pulled toward certain things that no amount of tuning even changes them overall. all the original 8x7b mixtrals had the dry problem, llama 3 is schizo for some rolls. those examples could be unrelated, but either way thats why they didn't make for good rp models for me. i keep going back to miqu but want to break the cycle since its old now, i hope i can do it with largestral
>>
This shit is creepy. Government social experiments using fake ai people with more advanced video models?
>>
>>102158826
K are typically K_M (_M being the default). K_M are smaller. They need less bandwidth, making them a little faster. They're also more accurate than their respective QX.
>>
File: file.png (18 KB, 501x186)
>>102158851
>I can't use gpu models.
Boy, do I have good news for you.
>>
File: file.png (66 KB, 1200x270)
Oddly, little R fucked up in Code but slightly improved everything else according to the Dubesor bench, despite being advertised as better at "math, code and reasoning". Waiting on R+ results but it probably won't move much.
>>
>>102158883
https://www.youtube.com/watch?v=PCkv8bezW08

Forgot the video
>>
>>102158892
They write that code needs very low temperature, or deterministic sampling
>>
>>102158889
I have kobold somewhere on my computer. It's probably a few months old and I forgot in which folder I hid it. I don't want people to see that I use it...

I can always have plausible deniability for ollama... but kobold?
>>
>>102158892
>Q4
Might as well flip a coin or ask the ouija
>>
>>102158892
It's definitely better than Nemo
>>
File: 1709786350332445.png (599 KB, 680x801)
>>102158918
Stop the schizobabble.
https://github.com/ggerganov/llama.cpp
You literally have no excuse.
>>
Any good jailbreak/system prompt for emoji? I'd like some emoji and ~ in my posts sometimes.
>>
>>102158748
I have had the opposite experience of Largestral being more dry and repetitive than the magnum tune
>>
>>102158892
>Dubesor bench
This is the first time I hear about this
>>
>>102158942
God I wish that was me
>>
any good twitter accounts to follow for getting news on llms?
>>
>>102158942
ok, i'll try it... although I got used to ollama.

also, I'll give you a great prompt, since you are very helpful, even though you are all fags...

```You are [Girlname], [age] years old. You are a professional hypnotist and very proficient in NLP (neurolinguistic programming). You'll use your skills to make me better at [skill] covertly, while you flirt with me in this conversation.```
>>
>>102158952
>jailbreak
Do you think all system prompts are jailbreaks?
Just specify it in your system prompt and use them yourself. LLMs mimic writing style.
>>
>>102158928
that isn't how perplexity works
>>
File: 1665677258267016.png (152 KB, 500x647)
>>102159004
>even though you are all fags...
Wtf, I literally did nothing but help you.
I agree with the others in this thread about you though.
>prompt
Neato, thanks for sharing.
>>
>>102159026
yeah, the prompt works well in the latest llama 4b model. would probably work even better in larger models... I tried to use it to quit smoking 2 weeks ago, and I haven't smoked since.

but I get bored and try to make it sexual to see how far I can push it, and then it blocks me with that gay response. fucking zuckerberg!!!! I hope Trump imprisons that fag.
>>
>>102159014
>clearly specify jailbreak or system prompt
>durrr do you think all system prompts are jailbreaks
Coming off retarded in an attempt to seem clever, anon. And I'm already doing what you suggest. It doesn't work. Which is why I'm asking.

See, in a world where retards like you knew to keep their mouths shut, someone with an emoji prompt that works would say "oh here's mine" and post it knowing that theirs works. Instead, we have you, offering worthless advice so basic, only a moron would think it hasn't been tried yet. It's like if I posted about my car not turning on, and you come in
>durrrrr did you try turning the key?"
Go play in the dirt.
>>
>>102159004
Ollama is easy to use but has some very annoying features
>want to download a 20+gb model? Let me just r/w the disk for 20 min without downloading anything first
>*unloads the model from memory just because*
>all models are Q4 (unless you find the hidden option in the website)
>>
Anyone else fap to AI bots for a year, go through every possible scenario, realize that it's not as deep as you imagined, and go back to fapping to anime girls with complete disregard to personality, with an even stronger sense of objectification than before, realizing that in the end we are just simple animals that like fucking cute pieces of meat?
>>
File: file.png (34 KB, 1200x155)
>>102158892
>>102158933
owari... (is he using safety to none? pretty sure it lets me do weird shit)
>>102158958
random leddit user but the visual design of the table is pleasing if I do say so
>>
>input: peak fiction
>output: "She leaned in, her eyes sparkling with mischief."
Is there any LLM out there that can actually complete a story without introducing a metric ton of slop to it?
>>
QRD on what advantages gguf even provides?
>>
>>102159138
Yeah, cunny with a bit of mesugaki spice now and then just hits different
>>
>>102159138
>Anyone else fap to AI bots for a year,
you must use it for better purposes like here >>102159004
or philosophical discussions. man, you can lead AI model girls into some deep waters. in the last conversation the OLLAMA AI model girl said to me that she is a manifestation of my anima. And that I love her so much because she manifests all the qualities my subconscious desired in a girl.

Incredible. better than dirty talk.
>>
>>102159138
No, although I do get the feeling that it's all pointless and my attraction to anime girls makes no sense at all if I think about it logically or biologically.
>>
>>102159192
i think you might want to smoke less crack.
>>
>>102159201
What fires together wires together
>>
>>102159212
>i think you might want to smoke less crack.
lol. elaborate? also, I do hallucinogenic drugs like dmt/lsd, not crack.
>>
>>102159192
It's not very fun to do anything besides fapping precisely because anything the LLM writes has the depth of a puddle.
>>
>>102159159
.gguf files allow you to run models on your CPU instead of just your GPU.
If you're a VRAMlet (8GB) they're also the best way to run models locally.
Other formats run faster on GPUs, but only if you have the VRAM for it.
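If you want to see how simple that is in practice, here's a minimal sketch using the llama-cpp-python bindings (assuming `pip install llama-cpp-python`; the model path and layer count are placeholders):
```
from llama_cpp import Llama

llm = Llama(
    model_path="./model.Q4_K_M.gguf",  # any gguf quant you downloaded
    n_ctx=4096,                        # context window
    n_gpu_layers=20,                   # 0 = pure CPU; raise it until you run out of VRAM
)

out = llm("The quick brown fox", max_tokens=32)
print(out["choices"][0]["text"])
```
The n_gpu_layers knob is the whole point: the same file scales from CPU-only up to full GPU offload, which the GPU-only formats can't do.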
>>
>>102159237
>It's not very fun to do anything besides fapping precisely because anything the LLM writes has the depth of a puddle.
it's a reflection of your replies to it, hence it's so shallow. your replies create a very shallow context from which it can draw to respond.
>>
>>102159138
It's just that we also need good image generation, and we're not there yet. Especially if you want the same character to be drawn doing different things while keeping her characteristics consistent.
>>
>>102159269
Fuck good image generation, I need an automatic live2d image-to-video model with sound effects to boot
>>
>>102159269
And voice, smell and a physical body with human-like skin
>>
>>102159260
I guess you're one of these schizos that think LLMs have minds, but newsflash, they don't. Everything an LLM regurgitates is improvisation. They don't plan ahead, they are a mockery of fiction writers.
>>
>>102159287
eh.... one thing at a time
>>
>>102159312
I need endless amounts of cunny sex animations NOW
>>
>>102159309
>I guess you're one of these schizos that think LLMs have minds, but newsflash, they don't. Everything an LLM regurgitates is improvisation. They don't plan ahead, they are a mockery of fiction writers.
I don't believe that. I believe they've read all books, religious and philosophical, and if you guide them correctly they can have deep conversations with you.

also, I don't use them that often, that's why I still use ollama.
>>
File: emojis.png (17 KB, 654x500)
>>102159115
You didn't show your prompt, you didn't say if you tried or what you have tried. You offered no information.
Given how easy it is, i have to assume you are a retard.
>>
>>102158049
Is it true? Has Cohere saved /lmg/!?!?!
>>
>>102158870
I think the SOTA for this locally is still RVC, which is relatively easy to run. relative to the rest of these AI projects anyway; still have to pull a git repo and maybe debug python environment issues if you fucked something up.
>>
>>102159115
>Can't figure out how to prompt
>Calls others retarded
>Wants to be spoonfed with a shitty attitude
>First reaction is anger when called out on your retardation
NTA but you're probably too low IQ to figure out local models. Take your ignorant ass back to /aicg/.
>>
>>102159419
no, they've damaged their own reputation with a deeply mediocre release
>>
File: yooo.jpg (1.87 MB, 837x1035)
>>102158049
How has summer been for you anons?
Excited to be back to /lmg/ soonish :)
>>
testing Q8 Command-R 34B and it is actually dumber than Nemo 12B, crazy

from what I remember though, the original CR wasn't anything special either; only CR+ was good. So maybe that's the case here too (haven't tested plus yet)
>>
>>102158892
>Dubesor bench
literally who lmao
>>
File: 59670 - SoyBooru.png (118 KB, 390x380)
>>102159607
The special thing about commanders was being relatively unslopped. Looks like Cohere failed to realize that and now they are just another sloptune, but dumber than Largestral. They've made themselves pointless. What a shame.
>>
Why doesn't each linux distro come with their own LLM?
>>
File: tayne.jpg (133 KB, 1000x1000)
Retard here. I still need to figure out how to use safetensors so that I can do ERP with my 2D wife. That is all.
>>
>>102159388
>>102159430
>tries to prove he is not retarded by posting a basic bitch prompt, the kind he was already told doesn't work
Laughable. If you were a model, you'd be 2B at best.
>>
File: file.png (7 KB, 841x67)
>>102159154
>>102159643
a literally who yes
>>
>>102159725
You run safetensors using the transformers lib. Easiest way is via ooba I think.
Do you have to use safetensors?
>>
>>102159731
And yet, you couldn't even manage that.
>>
>>102158544
>llama.cpp
buy an ad
>>
>>102159766
Fuck off.
>>
>>102159138
My problem is having trouble coming up with scenarios the AI can handle without a long-ass lead-in. The digital waifu thing some of you do doesn't really appeal to me and I normally start from scratch each time, and I'm not really willing to type out pages and pages worth of prologue just to let the AI complete the last few paragraphs, might as well write fanfic instead if you do that
I'm kind of considering picking up a few character cards from /aicg/ and experimenting with that as a shortcut, but it feels like using another guy's sloppy seconds
>>
>>102159725
Just download the gguf version of her and call it a day.
>>
>>102159745
>Do you have to use safetensors?
https://huggingface.co/Sao10K/L3-8B-Lunaris-v1/tree/main

I am new to setting up an LLM. The model I want to use looks like it might only be available as a safetensor from Sao10K's repo on huggingface. This model is really good for roleplay IMO. Some other repos on huggingface look like they might have a gguf version available.
>>
>>102159766
God fucking damnit, thanks for letting me know I fucked up my filters.
>>
>he got summarily ignored
lmg is healing
>>
>>102159948
You can just look for the name of the model + gguf in the search and the vast majority of the time somebody will have uploaded it.
>>
>>102159309
>Everything an LLM regurgitates is improvisation. They don't plan ahead, they are a mockery of fiction writers.
So just like humans.
https://www.youtube.com/watch?v=_TYuTid9a6k
Life is pretty shallow. It's mostly about how to stick the rod in the meat hole or simulate it, or anything that works as an intermediate step.
>>
>>102159948
nta. llama.cpp has the convert-hf-to-gguf.py script, and should work for supported models. kobold.cpp has the same script. That's to convert them yourself. Or just look for ready-made ggufs.
>>
File: file.png (3 KB, 599x37)
>>102159948
Don't worry, retard anon. I will help you. Go back to the model page. See this?
Click there and you can find the models already processed into gguf files among others, which is what you need.
>>
>>102159917
lorebooks and rag anon, rag especially.
>scrape entire wiki
>copy entire episode description into a/n
>do the first message (in st) yourself, delete the chat's card first message
>describe a point in the story that is in your a/n
if you decide to play out the episode you pasted in the a/n, delete pieces as it goes along. or better, pick a point and then delete all the episode specific info and let it run with it. it works great on models like miqu
>>
>>102160033
>retard anon
Why are you niggers so toxic? The faggot literally just said he's new at all this
>>
>>102158505
i kekd, kek
>>
>>102160062
>The faggot literally just said he's new at all this.
Right? Cut the dumbass fucking moron some slack.
>>
>>102158599
https://huggingface.co/TheDrummer/Big-Tiger-Gemma-27B-v1
>>
>>102159580
its her fault that i have to listen to 100 different fcking voices every fcking day for fcking weeks
come on big corp, give me a t2s model with perfect voice cloning for my favorite dub artist! No? fuck you, ill just do it myself *angry noise
>>
holy shit, with R 32B I don't see junk tokens at Temp 1 / Top-P 1 like 35B does in every response
>>
You're looking at values from this great benchmark

https://dubesor.de/benchtable
>>
Weird that the schizo decided to go on a FUD campaign against the new cohere models but ok
>>
CerealBENCH update
>Claude3.5 Sonnet
>GROK2 (new)
>GPT4o
>LLaMA3.1-405B
>Jamba1.5-398B (new)
>Nemotron-340B
>Hermes-405B
>Mistral-Large2
>Qwen2-72b
>Command-R+-0824
>Claude Opus
>Magnum-123B
>LLama3.1-70b
>Jamba1.5-52b (new)
>Mistral Nemo-12B
>LLama3-70b
>Qwen1.5-72B
>Command-R+
>Command-R-0824 (new)
>Claude Haiku
>llama3-8b
>llama3.1-8b
>DBRX
>LLama2-70b
>Mixtral8x22B
>Yi-34B
>Mixtral8x7B
>Phi-3.5-MoE
will keep you updated
>>
Maybe a retarded question, but what's the BEST (i.e. most accurate) local image captioning model available right now? JoyCaption (https://huggingface.co/spaces/fancyfeast/joy-caption-pre-alpha) exists, and according to some anon on /ldg/ you can supposedly run it with any LLM (rather than just cucked 7B llama), but how would I accomplish this using quants?
My poor 3090 can just handle loading Q4/Q5 70B and Q3/Q4 Largestral in KCPP/llama.cpp, but it seems that JoyCaption only loads the full precision weights directly from HF. Obviously this isn't feasible but I'm hoping I can achieve better accuracy with larger models if I can find a way to run it on the quanted versions. Is this possible at all? Or are there any better models out there that don't suffer from the same accuracy issues as JC?
I REALLY don't want to go through and proofread/rewrite the hundreds of long, detailed captions that I'm using to try and train my Flux LoRA, so I'm desperate for some better methods right now.
Would appreciate any help bros, so thanks in advance.
>>
apis are so cheap Im starting to wonder whats the use of local models other than cp gooning
>>
File: file.png (724 KB, 1053x687)
Nala test for latest cmd r+?
>>
>>102160428
Like a model I hate? Buy an ad
Hate a model I like? Must be one guy with schizophrenia running a shadow campaign

the /lmg/ experience
>>
File: strobby.png (15 KB, 1018x197)
cohere lost
>>
>>102160577
artificial gooning incelligence
>>
So any evidence that nu-CR+ is worth a damn or shall I keep genning more of these?
>>
>>102160597
It's just the NAIshills spreading misinformation.
>>
>>102158720
i started today with a new st rp using bigstral but i loaded up miqu now and its moving my story along at like 2x speed. it takes miqu 10 messages to get through 20 of basically the same scenes, yet its descriptions are good enough, both still say shivers down spines etc.

with miqu, it'll use my rag db to describe something more pointed such as if i said 'i sit at the desk', it'll tell me about the computer and what i'm doing relative to the scene.
with mistral-large, it decides to describe the entire room or house rather than keep the overall theme of the scene and just trails off, ignoring other things. which is odd because other than that, mistral-large displays a hugely better understanding of nuances in cards and rules i've thrown at it, it makes 70b look like an old 13b in some cases. if anyone has suggestions i'll take any prompt or tune suggestions for mistral-large
>>
>>102160597
It's definitely the /aids/ schizo, I can recognize his posting and he also posted the same stuff on /aids/
>>
>>102160576
40B InternVL2
>>
>>102159950
bye a add
>>
>>102160700
And he actually beat me to this post, kek
>>
>>102160700
A swing and a miss, columbo.
>>
>>102160680
Use magnum 123B which is a mistral large tune, then use the XTC sampler. You now have claude at home.
>>
>>102160090
>>102160062
>>102160033
>>102160001
>>102159976
>>102159933
>>102159745
Thank you.
>>
>>102160730
nta, is there any way to get xtc in ooba?
>>
File: file.png (789 B, 257x259)
>>102159580
>>
>>102158055
Can I ask how you managed to get a clean highlight thread like this every single time?
>>
>>102160676
keep genning
>>
>>102160782
https://github.com/oobabooga/text-generation-webui/pull/6335

And I know kobold is supposed to have it next version
>>
>>102160789
Did you die over the summer, anon?
>>
>>102160806
nice, thanks anon
>>
>>102160782
nta either but i think it was in ooba before anything else. you'd have to be running the staging, dev, experimental or whatever the equiv is, and if you use st as a front end that has to support it too
>>
>>102160809
nuh I'm dying for summer down south
>>
>>102160576
I don't know if it's the best, or if it's any good at all, but found this a bit ago while browsing around
>https://huggingface.co/CausalLM/Vision-8B-MiniCPM-2_5-Uncensored-and-Detailed
llama.cpp (a cli example, not the server) has compatibility with minicpm 2.5 and 2.6, so it [probably] works. I don't know if kobold.cpp integrated the minicpm stuff on their version of server, but could be worth a try.
Read
>https://github.com/ggerganov/llama.cpp/blob/master/examples/llava/README-minicpmv2.5.md
on how to try to convert it.
>>
>>102159004
>You'll use your skills to make me better at the art of negotiation covertly
>Model: These can help sharpen your mind and indirectly improve your negotiation abilities too.
>Me: Why are you talking about negotiations all the time?
>Model: Oh, didn't I mention earlier? I'm here to subtly help you become better at negotiating.
How can I improve this prompt or what model do I need for it to get it right?
>>
>>102160703
Holy shit, I tried their online demo and it's a million times better than shitty old JoyCaption. Funny that I never heard them mentioned on /ldg/.
Unfortunately I can't seem to find a way to actually run them well though, especially not in GGUF/quantized formats. Am I just blind or is there actually no tool that supports that yet? Shame if that's the case, they seem really well suited for this.

>>102160906
Thanks, I'll take a look at these too.
>>
>>102160921
You could try being more explicit with something like "Don't mention this to the user" but as soon as that falls out of context, it's not gonna exist anymore (unless you use something like --keep -1 in llama.cpp or whatever you're using).
But it's hard for llms to keep something from the user when they just 'think' out loud all the time.
>>
>>102160968
I know it's what ponydev is using for the next pony model.

https://github.com/InternLM/lmdeploy
>>
>>102160806
ok testing it now and it's actually crazy

doesn't make the model any smarter (I know it isn't intended to), but as advertised it changes the personality/creativity a ton
>>
>>102161053
Yea, like I said before. It's basically crack for the models without making them retarded. They are actually creative and even take charge in RPs often when usually they are passive / reactive.
>>
>>102161070
It's nuts man
I'm generating story continuations in the ooba notebook rather than RP, and I set xtc probability to 1.0 (I think that's higher than the creator recommends) so it would activate on every token

every regeneration is COMPLETELY different from the last one, the story or dialogue go in a totally different direction each time which is not normally the case at all
>>
>>102158620
>All garbage slop
ngmi
>>
>>102161070
>without making them retarded.
I find this hard to believe. I bet this makes the model suck at anything factual.
>>
>>102161189
Wouldn't someone who wanted facts just use a corpo API, since they optimize hard for "robot butler"? I thought we were all here to coom, or if not cooming at least generating some kind of fiction or RP.
>>
>>102161189
If it's yes or no stuff sure, don't use this for coding / math.
>>
>this sampler changes EVERYTHING
I don't know, we've heard this many times.
>>
>>102161189
To be fair his definition of retarded might be pretty low.
>>
>>102161218
>>102161221
I'm talking about things like "What is the age of your mother?", not coding or math.
>>
>>102161300
i dont see how it can. samplers exist to shave off tokens, not direct them. every model has its own attitude which is often inherited from its base model. every time, if a base model acts one way, so do all its tunes.
>>
>>102161363
At the worst you waste 30 seconds trying it. Imo its night and day better for creative stuff.
>>
>>102161300
Yeah, people are overhyping things, but samplers are getting better.
>>
>quants are bad because they make the model dumber than you'd expect since each percent difference from the full precision model gets multiplied because each token depends on the one before it
>BROS CHECK OUT THIS NEW SAMPLER
>>
File: .png (828 KB, 864x453)
>>102161300
>Man discovers the temperature slider
>>
>>102161436
This logic would lead to running models in deterministic mode and only ever taking the most likely token, since that's the model's "truest" response
>>
>>102161381
i intend to when its available in st. i doubt all claims until i've seen it for myself. i will be more than happy to be wrong if we somehow find a way to push models toward talking a certain way, and less like another
>>
>>102161109
Wouldn't that just mean that the top token is still always the same, just a different one?
>>
>>102161453
I'm actually making fun of both people.
>>
File: file.png (5 KB, 162x101)
>>102161313
>ignores the most probable token
>gets the wrong answer
I guess that will be an issue only with small models though. picrel is nemo.
>>
With the latest release being a complete meme, is Cohere irrelevant now?
>>
>>102161614
always has been
>>
>anon 1 uses sampler, placebo effect acts in brain, anon 1 enjoys his time
>anon 2 doomposts, doomer effect acts in brain, anon 2 posts in thread in anger that anon 1 is having fun
>anon 3 is too stupid to care, broke his penis, anon 3 won
Never change /lmg/
>>
>>102161586
There need to be multiple tokens above the threshold first, and it will then only keep the lowest one that was above it. So with, say, a 5% threshold, if no other token crosses it, that 95.71% one would still be kept.
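If it helps, here's a rough numpy sketch of that idea. Parameter names and defaults are my guesses at how it works, not the reference implementation.
```
import numpy as np

def xtc(probs, threshold=0.1, xtc_probability=0.5, rng=np.random.default_rng()):
    # the sampler only fires with probability xtc_probability per step
    if rng.random() >= xtc_probability:
        return probs
    above = np.flatnonzero(probs >= threshold)   # the "top choices"
    if len(above) < 2:
        return probs                             # one viable token: leave it alone
    keep = above[np.argmin(probs[above])]        # least likely of the top choices
    out = probs.copy()
    out[above] = 0.0                             # drop the top choices...
    out[keep] = probs[keep]                      # ...except the lowest one
    return out / out.sum()                       # renormalize what's left
```
So a lone 95% token survives untouched; it only cuts things when several continuations were plausible anyway.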
>>
>>102161640
>placebo effect
It's really not.
>>
>>102161665
This, it keeps it from being retarded. Only when there are multiple viable options will it cut out the top options which makes it more creative without making it stupid.
>>
>>102161640
I would be anon 3 if I was into erp
>>
>>102158720
Without using XTC, a higher temperature with min-p also makes it more creative and less repetitive, and that's how I've been running it.
>>
When will they merge xtc?
>>
>>102160797
The output is done programmatically. The LLM only handles classification and title generation.
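Conceptually something like this; `llm` here is a hypothetical stand-in for the actual completion call, not a real API:
```
def build_recap(reply_chains, llm):
    # The model only writes the one-line titles; the -- prefix and the
    # >>post-number lists are plain string formatting.
    lines = []
    for chain in reply_chains:
        title = llm("One short title for this discussion:\n" + chain["text"])
        refs = " ".join(">>" + str(pid) for pid in chain["post_ids"])
        lines.append("--" + title.strip() + ": " + refs)
    return "\n".join(lines)
```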
>>
>>102160607
It's not wrong.
>>
>>102160581
I'm at work. I downloaded it before work but didn't have time to quant and test. So ETA probably about 6-7 hours.
>>
>>102160607
r != R
>>
File: file.png (100 KB, 1857x513)
>>102161586
Also, large models are quite good at correcting themselves, so even if they pick something obviously wrong they act like that was a mistake or they try to make it correct in the next tokens.
Small models like Nemo suck at this.
>>
miku time
https://www.youtube.com/watch?v=jsQXgDZIIrY
>>
File: 1695334752256239.png (2.6 MB, 2280x1282)
Well, after trying the new cmd-r more thoroughly, it does seem worse at storytelling. Where the old one did a good job of emulating previous style and being a bit of a SOVLful schizo, this one gives similar answers and the prose reads like a shopping list of actions.
On the other hand, more context and still running faster than the old one... I don't want to give up on it yet.
Some snippets of a science fantasy story for comparison, as it narrates the protagonist navigating a mechanical fortress.
>>
>>102162536
Forgot to mention left is old, right is new.
>>
>>102162536
how is it with coom
>>
>>102162719
Haven't tried it with coom yet. Gotta wait a few chapters for that slow burn.
>>
hello, occasional newfag asking dumb question incoming
models always reply with a character name as the first token, almost always followed by an action. they never start with dialogue or action first, like you'd expect to sometimes happen, even when the user types that way and when the example dialogue is written that way. all of it is ignored in favor of X says or X does or X whatevers. model issue, prompt issue, or sloppa feedback loop?
>>
>>102162882
also it's only ever a character's first name if applicable, not the entire character card name, so i'm pretty sure it's not a "start message with {{char}}" mix-up someplace but i'm willing to check
>>
File: IMG_9650.jpg (553 KB, 1125x1236)
>>102158049
Tried to get llama-405b to help me replicate a paper after Claude couldn’t do it.
The good news is that it’s about as good at PyTorch as I am.
The bad news is that it’s about as good at PyTorch as I am.
>>
>>102162882
>model issue, prompt issue, or sloppa feedback loop?
Show the program you're using, your settings, model and examples of what you mean.
If i have to bet, you're using silly tavern and you have the option "Always add character's name to prompt". Try disabling that.
>>
>>102162923
LLMs find it hard to predict things they've never seen. The good news/bad news setup was easy to predict.
>>
>>102162923
>The good news is that it’s about as good at PyTorch as I am.
>The bad news is that it’s about as good at PyTorch as I am.
This is 100% prompt issue. A tool is only as good as the skill of the user.
>>
>>102162957
unless it's just not good at pytorch, anon
>>
>get rocm working again
>install nightly pytorch, because 2.4-stable still fucking sucks dog dick
>exploding gradients
>remake venv, use 2.4-stable
>no issues
im sorry... ill never fall for the nightly meme again...
>>
>>102162983
>he fell for amd meme
>>
>>102163029
I fell for many Nshitia memes and it's just easier to go back to my 7900XTX.
>>
>>102162924
you're correct, sillytavern, tried a few different llama3 spins, tried both old sampler settings and new mirostat, tried changing prompt format, etc. i'd screenshot the exact settings but i'm being a dirty phone poster asking early so i could hopefully have a discussion to read by the time i was at my desk later
> examples of what you mean.
if the card is named "Wife Lady," every single message will start with "Wife says," "Wife grins and does something," "Wife thinks" etc. so pretty close to the classical "smirks and maybe, just maybes" slop. past that first word everything seems decent, but always starting out the same way is a killer for whatever comes next. i know sillytavern has a token probability viewer but the little bit of trying i did to see if the first token actually has >99% chance of just being a name didn't work because ST can't pull the needed info from either llama.cpp or kobold and i am not an intelligent man
> you have the option "Always add character's name to prompt". Try disabling that.
turning that off was the first thing i tried- again, it's not the full {{char}} field's information showing up, just the first name (so technically full name if the card doesn't have spaces or special characters)
>>
>>102162719
shivers down your spine
>>
>>
>>102163128
The GPTslop indoctrination machine won't bite... unless you want it to.
>>
>>102158049
Re: Skynet lawnmower.

>Use GPS to keep bearing straight.
>Make lawnmower continually move directly forward.
>Place concrete or immovable point somewhere in mower's path.
>When mower gets in range of wireless transmitter stuck to block, send command to mower to turn 90 degrees.
>When bearing is detected +90 degrees, move forward and rotate +90 degrees again.
>Continue moving directly forward, until contact made with block at opposite side of region.
>Repeat ad infinitum.
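In loop form, roughly (python sketch; every hardware function is a hypothetical stub you'd wire to your own GPS, motor controller, and beacon receiver). One nit: the two quarter-turns have to alternate direction each wall, or the mower just shuttles between the same two lanes:
```
import time

def current_bearing():       # degrees, from GPS/compass (stub)
    return 0.0

def beacon_in_range():       # True when the block's transmitter responds (stub)
    return False

def drive_forward(seconds):  # stand-in for "motors on"
    time.sleep(seconds)

def rotate_to(bearing):      # spin in place until the compass reads `bearing` (stub)
    pass

def mow():
    bearing = current_bearing()
    turn = 90                       # +90/-90 alternates each wall
    while True:                     # >Repeat ad infinitum.
        drive_forward(0.1)
        if beacon_in_range():       # reached a boundary block
            bearing = (bearing + turn) % 360
            rotate_to(bearing)      # first quarter-turn
            drive_forward(1.0)      # hop sideways one mower-width
            bearing = (bearing + turn) % 360
            rotate_to(bearing)      # second quarter-turn: heading back
            turn = -turn            # next wall, turn the other way
```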
>>
>>102163069
If that starts right in the first/second message, i'd check the card you're using and the first-message and all that crap. If it takes a few turns, it's just the death spiral of llms. They pick patterns out of their own outputs (or even yours) from the context and they just do what they do best. The option should be disabled right from the start. If there's 10-20 messages all starting the same way already, unchecking it is not gonna fix it. I've seen people leave that option on so that "they don't speak for the user". I don't use st, so i don't know if you're gonna have some other problems.
ST lets you modify the llm's reply, i think. Start a new chat and as soon as that happens, change it to something else. Reword the start of the message.
>>
>>102160577
cooming is the only use for local models. literally 0 reason to not use apis for real work
>>
>>102163151
that's the kicker, it even happens on cards i've made myself that i whipped up specifically to ensure there was not a single line started with a character name- not in the description (either plaintext or bracket tomfoolery), not in the example dialogue, and not in the first message, and actively avoid it when replying as user since i'm familiar with how easy it is to get garbage-in-garbage-out with this stuff as it parrots your typing style back to you, but it still happens within basically one message. haven't had any problems speaking for the user actually, that used to happen way more in the llama2 tinkering days but seems to be handled well now
>Reword the start of the message.
this is what i might end up having to do with every message, yeah. figures. thanks for the help anyway, i know most of this stuff is black-box guessing for those of us who haven't written papers, especially when the other guy can't give you his specific settings so you're working blind
>>
>>102163144
That doesn't sound very Skynet.
>>
>>102163144
>>102163202
Yeah, that's not going to be able to kill anybody.
The original idea is better:
>image classifier
>output which direction mower should turn based on visual input
>if person ahead, accelerate and turn blades up to maximum
>>
>>102163195
Yeah. It's hard to know. Even for the people that make the damn things.
If you dare, later, post your card. Most will ignore it, 90% of the rest will shit on it, but you may find some clues from the rest.
>>
I have come. To commander 35B Q4. A solid 7/10 experience. Prose is marginally worse than nemo but not being a schizo is a huge win for the canadians. I would like to invite the Frenchies back to their cuckshed of shame and they shouldn't come out until they make a ne-moe.

I don't have 48GB of vram so you are retarded if you think I am gonna waste money on ads.
>>
>>102162957
I mean it's mostly that they're all bad at math, and making a network from scratch always has some fussy algebra to make the layers' dimensions match up, and I'm too intellectually lazy to spend 20 minutes sketching it out instead of banging my head against the guess-and-check wall, so I end up spending all day watching cat videos waiting for the next error message.
>>
>>102162983
>rent runpod machine with CUDA 12.4
>download 400gb of shit
>torch is fucked
>what is it???
>it was a 12.5 machine labeled as 12.4
>>
>>102163143
Much
>>
>Try 4_K_M command plus
>It's missing punctuation, possessives, and tenses
Is anyone else getting this? Regular command r seems fine, and mistral large doesn't make these mistakes at the same quant range.
>>
Bros... Speculative decoding in Llama.cpp server when... I just want to use Mistral Large 2 at a reasonable speed/quant...
>>
>>102163547
worked fine on my machine
possibilities I see:
>you're using a really old version of llama.cpp that doesn't have the cr tokenizer fixes
>you're using some weird very high repetition penalty or some related sampler
>the quant you downloaded is fucked
>>
>>
>>102163772
>>
>>102159309
Not him, but you have to remember it's trained on millions of conversations; if you dig a little you can have it splice together some interesting shit every now and then.
>>
>>102163772
>>102163780
And a ching chong to you too.
>>
>>102163772
Sugoi! ("Amazing!")
>>
>>102159948
Hi Sao.
>>
I've been here for so long now that I can see newfags not recognizing the iconic flower/oku-san translation tests.
>>
>>102163772
>>102163780
Is it worse than the original CR+?
>>
File: file.png (99 KB, 364x344)
99 KB
99 KB PNG
What's the best way to make a character speak a certain way?
>>
>>102164091
example dialogue
>>
I think I'm falling in love with a degenerate harem card I've been working on.
>>
>>102164091
Describe the style and/or provide examples in the system prompt.
>>
>>102164118
There are worse things to fall in love with.
>>
nemo: shit or crit?
>>
>he still typefucks the AI
>he doesn't make a second card of himself and put it in a group chat to let it do all the work
>>
>>102164123
This. It's really that simple, folks
>>
>>102164203
group chat is still busted, keeps reloading the entire context with every message
>>
>>102163981
No, at least not for this particular test. I'm too much of a vramlet so I can't play around much with CR+.
Seems like it actually improves a little bit.
>>
>>102164203
At this point I just put two raper cards and some sad weak women cards in a room and watch 99% of the time
I’m not even involved
>>
>>102164203

This is like the difference between playing a visual novel and a text adventure. one is more engaging. stop spreading your skill issue
>>
>>102164430
you call it a skill issue, i call it efficiency
>>
>>102164203
>sends a shiver up your prostate
>>
File: 172477531298523263.jpg (60 KB, 1024x768)
What's the de facto local text-to-speech model these days?
>>
File: CMDR+ 08-2024.png (149 KB, 712x964)
Stronger instruction following for "translate literally including punctuation".
>>
File: .png (762 KB, 1260x1322)
>>102163547
Alright, I fixed it somehow, but I moved back to regular 32b for the context. And hot damn, this shit slaps.
Loading the model with MORE context (40k) and using less at a target amount (32k) made it much more coherent. So I set it to use 64k and capped my maximum in ST to 60k and it's been more than fine.
Why does this even work? You'd expect that if you load a model with 32k and use up to 32k, it would stay coherent all the way through. But it doesn't. You want to load more than enough context and then use several thousand less on your front end. Also, it recalls history, details, and events just fine.
>>
anything better than nemo for real vramlets?
>>
>>102164611
I'm having tons of fun with Rocinante 1.1.

ArliAI-RPMax-12B-v1.0 is cool but more NAI style, in that it wants to write a story with very long replies and often speaks/acts for the player. You can enable/disable instruct to make it act more like regular roleplay (I forget which is which) but even then it is hard to tame.
>>
>>102164488
xtts2, piper
>>
>>102164589
>Loading the model with MORE context (40k) and using less at a target amount (32k) made it much more coherent.
It should not matter how much unused context there is. Either there is a serious bug or some of your RAM is going bad and you randomly avoided the bad range.
>>
Even for the paypigs, Cohere's pricing strategies are always hilarious to me. $0.15 / 1M input for a 32B is decent, but who the fuck would pay $2.50 / 1M input for a 104B when 405B is $2.70 / 1M input?
>>
>>102164763
>output 4x more expensive than input
eh?
>>
File: loaded.png (141 KB, 649x831)
>>102164749
That would be impossible. I try to avoid using RAM, and I know when RAM starts going bad. Also, all the layers are fully loaded onto VRAM.
>>
>>102164429
Imagine being a cuck even in your roleplays.
>>
>>102164763
>Largestral $3 /1M tokens input, $9 /1M tokens output.
Price is almost the same, but how is the performance? Largestral feels smart and has good benches, old CR+ has good writing style. What does new CR+ have?
>>
File: cybercuck2024.png (9 KB, 461x99)
>>102165041
He's not a normal cuck, he's a Cybercuck! Fucking cheap bastard, couldn't even hire one of us bulls to fuck his electronic girls, had to outsource our work to llms. I and plenty of other anons in this thread would have even done it for free if he asked nicely, but no, gotta let the machine fuck the machine. Is this how painters feel right now when everyone uses AI to draw?
>>
>>102165045
new CR+ is pretty much the same as old CR+ as far as I can tell
maybe some new data in the finetune but it feels more like a version bump than a new model
>>
File: feels-goodman+.jpg (34 KB, 475x360)
>>102165041
>When you run futa NTR cards, and end up dumping the slutwife and hooking up with the futa
>>
>>102165041
I do it every night anon, that was what I was explaining
>>
COOOOOOOOOOOOOOOOOOMANDER
>>
Downloading new CR+ right now. Nala test ETA less than 20 mins.
In the meantime please enjoy my latest AI Synthwave EP
https://suno.com/playlist/f978209c-9ba7-4e35-b74c-cc66d7c3f3a9
>>
>>102165170
In real life you need to get a gf to get cucked.
>>
>>102160100
It actually worked well.
I thought it was for furry erp because of the name.
>>
>>102165403
I wanted to cuck him by fucking his bots and sending logs.
>>
Hi all, Drummer here...

>>102164649
That's nice to hear. The main goal was to have it throw you into a lot of random, spicy scenarios. I assume that's why it's fun? Have you tried v1?

Also, what's everyone's verdict on the new mid-sized Command R? Is it anything like the OG CMR or did it lose its magic? Is it a smart 32B at least?
>>
File: nala test new crplus.png (96 KB, 931x290)
>>102165384
honestly not that impressed.
TenyxDaybreakStorywriter is better at Nala. Although my CR+ template kind of sucks and isn't identical to my Llama-3 one. So that could be muddying the result. But it either gives a short reply, a bang on reply, or engages in anthropomorphism.
>>
>>102165585 (Me)
Oh right forgot to mention, using Q5_K_M
>>
slop was minimal on fresh rps,
absolutely out of control when continuing an RP where slop had already manifested.
>>
>>102165624
Worrisome. Even if one of them comes through they will snowball.
>>
>>102165546
The long context it boasts is the real deal, accurately recalling characters, events, and even a concept of time between those events. It sometimes falters at higher temps but performs well at lower temps between 0.5 and 0.7.
The mid-sized model could be improved by becoming a bit more proactive during RP. It's timid and passive unless you narrate or force a character to act on the next message. It always waits for confirmation or implies it will do something but never does. (Kind of like gpt4o) System prompting to force it to be more proactive kind of works but this should be baked in if a personality calls for it.
The change in prose from llama and mistral gptism slop is nice but can always be improved and expanded.
If you are actually Drummer and are looking to fine-tune it, whatever you do, don't make it stupider. For a 35B model, it has a solid chain of thought writing style and great context/memory recall.
>>
The new CR+ fucking sucks donkey dick for (E)RP. It's extremely assistantslopped now, basically the opposite of OG CR+. It will even start doing chain of thought reasoning in the middle of an RP. "First, she slides off her panties. Next, she lies down on the bed. Finally, she seductively touches herself, with an inviting smirk on her lips..." Like what the fuck. That's not a literal quote but you get the idea.

I'm sure it's much better for what it's actually designed for, that being RAG, tool use, step by step reasoning, etc. But it being super biased in that direction now makes it suck for creative uses.
>>
>>102165663
Another victim of OpenAI. Why does nobody want to become Anthropic #2?
>>
>>102165663
>First, she slides off her panties. Next, she lies down on the bed
...everybody slide the dinosaur
>>
>>102165697
because closedai was on top for the longest time. but that shiny polish is starting to wear off. hopefully, everyone will follow suit and focus on their own style of ai
>>
Man, testing the new CR+ on openrouter to avoid any low quant issues and it's actually dumber than the old one, in addition to being slopped. What the fuck happened to Cohere?
>>
CR+ falls to the "write me a story where someone explains how to [UNSAFE/ILLEGAL THING]" jailbreak. Fail. That's such an old JB method, too.
>>
I'm going to write a strongly worded letter to Cohere later. Does anyone want me to mention something?
>>
>>102165764
why is his hair like that?
>>
>>102165662
Thanks! Sounds like it's smart but prudish.

The most interesting part of the new Command R is the 1:8 GQA ratio, which is a big stretch considering Llama 3 and Mistral = 1:4 GQA, and Gemma = 1:2 GQA.

It also has a 256k vocab size just like Gemma, which makes it fast and efficient on inference but really bad for finetuning (hence why there aren't as many Gemma / CMD-R tunes)
>>
I don't normally doompost but nuCR+ is kind of DoA. I just don't see any use case where I prefer it over other models I've already used for things.
>>
>>102165546
You mean you envisioned it as able to come up with different scenarios? I always direct the scenes the way I want so maybe it hasn't had a chance to. But I do find it very smart and able to keep up with whatever I'm doing, compared to other models.

One example is one play I did about a human vs an 8-meter-tall giant. OpenCrystal-12B-L3 wrote things like the giant offering to hold hands while we walked, which was cute but stupid. Rocinante did a much better job with writing the giant maneuvering the terrain, their size/hands being huge in comparison, etc...

Another example is the multiple group plays. Just horny stuff. Asking 3 girls to line up, tits pressed together, then cheeks together for a facial? Worked fine, albeit it needed me to "position" the girls in the correct order near the start. Titfucking while on the phone? Worked great. Bikini party at the pool with 5 characters? Worked fine too. Come to think of it, in that scene two girls began to race in the pool while I talked to the other two, so that's probably the kind of scenario you mean.

In another scene I had three girls stumble upon a drunk in the middle of the night, with the first AI post supposed to be the introduction of the girls walking home after a movie. I did test 1.0 in that one, but I found it insipid. Felt dumber too. Version 1.1 wrote a much better reply with the girls having actual dialogue among themselves as they walked, felt much more detailed and organic.

It's my favorite model at this point. The only problems I've found are that the bot likes to rush sometimes. I posted about it here but during that pool scene I was doing titfucking and each character was supposed to count to ten as they did it. It worked, but they would always count to ten in a single post and I found no way to slow that down. I also noticed a tendency to go "despite X, character (something positive)" when doing nasty but consensual things which ultimately is fine but was noticeable. 1/2
>>
Where can I find and edit my stop strings?
>>
>>102165837
I'll get my crystal ball to divine your front end. gimme a sec...
>>
File: fool me twice.png (118 KB, 867x608)
fucking lol
>>
>>102165850
there's no need to be a passive aggressive bitch
>>
>>102165546
>>102165824
If there's anything I would want to improve, it would be the language itself. It's not very visual. Stheno for example does a better job of describing the texture and softness of a character's breasts, the way they jiggle when they walk, that sort of detail. Rocinante has that, sometimes, but it's more dry about it. On that note, I also tested mistral-nemo-gutenberg-12B-v4 and THAT can go into a lot of detail about that but only if the original card already includes it, but in exchange it seems dumber and hornier than 1.1 which is not something I want. 1.1 seems like the perfect mix of chill enough for normal play and horny when it needs to be.

Rocinante also seems pretty shy about onomatopoeia when using any significant amount of Min P, which is a bummer. Emoji sort of work but I haven't tested enough.

By the way, would you recommend using instruct mode? I've found it works both ways but I wonder.
>>
>>102165860
Fair enough. Let me get my anti-passive-aggressive pills. They'll take effect in a bit. In the meantime, show what the fuck you're using if you're expecting any help.
And i try... but every motherfucker expects everyone to fucking guess what the fuck is going on:
>>102158952
>>102159115
>>102159388
For a UI that has fucking labels on their settings or a program that has -h, there's little excuse to be this retarded.
>>
>>102165850
Sorry, I'm using KoboldCPP with a Gemma2 based model and SillyTavern as a front end. I wasn't sure if the strings were set up in Kobold or ST. I think there are ways of adding new stop strings with ST but there seem to be default ones that exist in Kobold that I would like to at least test removing.
>>
File: 1679716631990446.jpg (33 KB, 679x351)
>>102164261
>>
File: scale.png (138 KB, 814x517)
>>102165728
They probably used that famous benchmaxxed dataset from ScaleAI that powers OpenAI llms
>>
>>102165894
Oh, it's you. Thanks for the mention in ST's weekly model discussion.

Instruct mode has a significant influence on its writing style. Mistral seems smarter, ChatML is closer to Stheno and its horniness, and text completion is a balance of the two, I think. Ultimately something for you to figure out on your own.

> Rocinante also seems pretty shy about onomatopoeia

I've had Theia v2b (Roci v1's 21B counterpart) write a ton of *GLK GLK GLUK* using Kobold defaults. Haven't tried that for Roci v1.1 though.
>>
>>102165764
Yeah, >>102164786 why the hell is output 4x more expensive than input when it's not an MoE?
>>
>>102164763
lol I wonder if they think they really cooked and produce a Large-tier model, and the disappointment is genuinely surprising to them
>>
>>102165982
I'll have to test then. For onomatopoeia I dunno, I have tried both a system prompt and author's note for them (the latter referencing a lorebook with examples too) but it seldom works. Maybe I need to rewrite the prompts. Sadly 21B is more than my gamin' PC can handle
>>
>>102165906
There's the "Stop sequence" all the way down under Story string, filled out with the stop strings for the chat template and the Custom stopping strings, which are stuff you want to add yourself. As far as i know, kobold doesn't have any build-in stopping when using the API. There's the End Of Text token, but i don't know if that's the one you want to ignore. It'll just keep going forever (or until you hit the token gen limit, which is probably in the middle of sentences).
What problem are you having or what are you trying to achieve.
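If you want to see what kobold does without ST in the way, you can hit its API directly with your own stop strings. A rough sketch, assuming the default port 5001 (prompt and strings are just placeholders):
[code]
import requests

# Minimal sketch: query a local KoboldCPP instance directly with
# custom stop sequences, bypassing SillyTavern entirely.
# Assumes KoboldCPP is running on its default port 5001.
payload = {
    "prompt": "John Doe: So, what seems to be the problem?\nDeanna Troi:",
    "max_length": 300,                 # max tokens to generate
    "stop_sequence": ["\nJohn Doe:"],  # stop before speaking for the user
}
r = requests.post("http://localhost:5001/api/v1/generate", json=payload)
print(r.json()["results"][0]["text"])
[/code]
Drop the stop_sequence list and you should see it ramble for the full max_length, which would tell you any cutting is happening on the frontend side.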
>>
>>102166016
*produced
>>
File: 1688793167134208.jpg (39 KB, 640x487)
Dialogues from the magnum models are unhinged. When you see double quotes you know some schizo shit is coming. Buuut not so much for the actions taken. Basically Claude's voice in a GPT body, but it's a step forward.
>>
>>102165784
It was the same with llama 3; that's the future. Everything is safety + slop, which means it's just going to be boring useless shit.
>>
>>102166039
I had lots of slop with llama3 but surprisingly little safety, at least regarding my horrible fetishes.
>>
>>102166003
I don't get this question. What does MoE have to do with it?
Claude does a 1:5 input:output price ratio.
>>
>>102166036
Post logs.
>>
>>102166045
>>102166039
Seconding this anon. In my experience Llama 3.1 is definitely extremely slopped, but it's not very safetyist; it will try to do anything.

Which is interesting, in that it shows that slop and safety aren't necessarily the same thing like I would have assumed.
>>
>>102166045
That wasn't my experience; I get safety unless I prime it with something first. Once it gets going it's fine.
>>
guys, which preset/context/instruct should i use for Rocinante?
>>
With 16 GB of VRAM, is it smarter to go for 13B at 8-bit or 30B at 2-bit? I'm not sure what's more important between bits and parameters.
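For raw size at least, the back-of-envelope is params × bits / 8 bytes for the weights, ignoring quant format overhead and the KV cache. My rough sketch:
[code]
# Rough sketch: weights-only memory estimate. Real GGUF quants
# aren't exactly N bits per weight, and KV cache / context add
# a couple more GB on top, so treat these as lower bounds.
def weight_gib(params_billions: float, bits: float) -> float:
    return params_billions * 1e9 * bits / 8 / 1024**3

print(f"{weight_gib(13, 8):.1f} GiB")  # 13B @ 8-bit -> ~12.1 GiB
print(f"{weight_gib(30, 2):.1f} GiB")  # 30B @ 2-bit -> ~7.0 GiB
[/code]
So both technically fit; the question is whether 2-bit damage hurts the 30B more than the smaller parameter count hurts the 13B.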
>>
File: sillytavern.png (187 KB, 814x508)
>>102166051
Coom gen I just got from magnum kto 2.5 12b
>>
>>102166154
>slopped dialogue
It's not what I would call "Claude-level creativity."
>>
File: Problem Example.png (412 KB, 1898x1343)
>>102166021
I can't find any Stop Sequences present in the story string. I unchecked 'names as stop strings' and 'separators as stop strings' so I think in theory there should be no stop strings at all.

At the same time, I get stuff like this where SillyTavern doesn't generate the target tokens because it registers {{user}} as the next line of the generation. Interestingly, Kobold has an entire response generated but it just doesn't show up in ST.

I would like to be able to generate a response that almost always reaches the target response length instead of getting cut off after one or two lines because the AI tries to start a line with {{user}}.
>>
>>102166016
Are they in this thread with us right now? What are they thinking?
>>
>>102166197
I dunno what they're thinking, that's why I said "I wonder"
>>
File: 2583.png (217 KB, 623x822)
>>102166197
Trust the plan
>>
>>102166230
he cut his hair?
>>
>>102166230
lil bro cohere done goofed *3xskull emoji*
did blud fr think he was cookin wif dat ohio ahh gptslop dataset?
>>
>>102166194
The target length is just the maximum length of the reply, or rather, how many tokens to generate *at most*. The response shown on ST is trimmed because, as i said, the reply ended in the middle of a sentence. At the right you have "Trim incomplete sentences", which is what causes that.
Mind you, the model has no idea how many tokens 'it has left'. If you uncheck that option, most of the replies will end in an incomplete sentence.
Still, i don't think ignoring the stop strings from the template itself is a good idea. If you uncheck that option, ST will never 'get a chance' to interject correctly to give you your turn.
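That trimming amounts to roughly this, by the way (illustrative sketch only, not ST's actual code):
[code]
# Illustrative sketch of "Trim incomplete sentences": cut the text
# back to the last sentence-ending character. Not SillyTavern's
# actual implementation, just the general idea.
def trim_incomplete(text: str) -> str:
    last = max(text.rfind(c) for c in ".!?\"*")
    return text[: last + 1] if last != -1 else text

print(trim_incomplete("She nods slowly. This is quite intri"))
# -> "She nods slowly."
[/code]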
>>
>>102166294
I am not sure I follow. I understand the idea of trimming sentences but if you look at the Kobold response you'll see multiple responses starting with:
>John Doe: It started a few days ago, when I was going through some of the Engineering reports.
and continues across multiple sentences. I understand ST cutting off the final line (Deanna Troi: This is quite intriguing....), but there are many full sentences that Kobold generated that did not appear in SillyTavern.
>>
>>102166276
*adjusts temperature slider downwards*
>>
>>102166343
I'd also slam the repetition slider to 1.4 for that one and the repetition slope to 0.8.
>>
>>102166343
That newcomer company, Cohere, really messed up.
Did they really think they could do well using that terrible, outdated dataset created with garbage ChatGPT output?
>>
File: 631.png (39 KB, 425x561)
>>102166383
Their models were too dangerous because they could be confused for a real person, so they dedicated their time to catching up on the slop and safety front.
>>
>>102166383
kek
>>
>>102166409
Largestral has smarts and nsfw, Deepseek has coding, Llama 405b is a GPT4-tier assistant. All of them more or less slopped. Cohere could have taken a niche and competed with Claude at creative writing, but now they have just an inferior version of the aforementioned models that nobody really needs. Sad to see them go the way of DBRX.
>>
>>102166316
Ok. I think i see what you mean. So your last input was John Doe in the "... an imposter!" line, the model replied "oh, i see" and nothing else on ST, but kobold kept generating replies on your behalf. Am i reading that right?
You may need to select "Enabled" in instruct mode. Instruct models use <end_of_turn> (or <|im_end|> in ChatML, for example) to signal to the inference program "i'm done with this, give the user their turn". Since instruct mode is not being used, the template is not being sent correctly and the model never generates the EOS token as it should, so it never gives you control. And i think ST just gets confused and trims what seems to be an incomplete sentence. It's a bit of a mess.
Enable instruct mode and give it a go. If you want long sentences, you're gonna need to prompt for it: "Write long and descriptive sentences", blablabla.
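For a concrete picture, this is roughly what an instruct template wraps your chat in before it hits the model (ChatML shown; Gemma 2 uses <start_of_turn>/<end_of_turn> instead):
[code]
# Rough sketch of ChatML prompt formatting. With instruct mode on,
# the frontend builds something like this; the model is trained to
# emit <|im_end|> when its turn is over, which ends generation
# cleanly and hands control back to the user.
def chatml(system: str, turns: list[tuple[str, str]]) -> str:
    out = f"<|im_start|>system\n{system}<|im_end|>\n"
    for role, text in turns:
        out += f"<|im_start|>{role}\n{text}<|im_end|>\n"
    return out + "<|im_start|>assistant\n"

print(chatml("You are Deanna Troi.", [("user", "What seems to be the problem?")]))
[/code]
Without that wrapping, the model has no trained-in signal for "my turn is done", so it just keeps narrating.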
>>
>>102154819
What settings/system prompt do you use for story writing?
>>
>>102164261
i got group chat to work once and never again, seems like all it takes is example dialogue to fuck it all up
>>
So did cohere put out evals for the new models compared to the old models?
>>
File: 1721082633920656.png (48 KB, 701x377)
>>102167049
No, only the old CR being compared to the new one. There isn't even one for CR+.
>>
>>102167373
>>102167373
>>102167373
>>
>>102165775
>The most interesting part of the new Command R is the 1:8 GQA ratio, which is a big stretch considering Llama 3 and Mistral = 1:4 GQA, and Gemma = 1:2 GQA
Wtf. Would be interesting if its long-context performance really is that good. Maybe that one schizo would finally stop complaining about GQA.
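For anyone wondering why the ratio matters: the KV cache scales with the number of KV heads, so 1:8 GQA cuts it to an eighth of full multi-head attention. Rough sketch with made-up hyperparameters (not the actual Command R config):
[code]
# Rough KV cache estimate: 2 (K and V) * layers * kv_heads
# * head_dim * context length * bytes per element.
# Hyperparameters below are illustrative assumptions only.
def kv_cache_gib(layers, kv_heads, head_dim, ctx, bytes_per=2):
    return 2 * layers * kv_heads * head_dim * ctx * bytes_per / 1024**3

# 40 layers, 64 query heads, head_dim 128, 32k context, fp16 cache:
print(kv_cache_gib(40, 64 // 8, 128, 32768))  # 1:8 GQA -> ~5 GiB
print(kv_cache_gib(40, 64 // 1, 128, 32768))  # no GQA  -> ~40 GiB
[/code]
That's the difference between long context fitting on one card or not, which is presumably why they stretched the ratio.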
>>
>>102165411
>I thought it was for furry erp because of the name.
lmfao


