/g/ - Technology






File: IMG_8099.jpg (436 KB, 1536x2048)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>101474151 & >>101464048

►News
>(07/18) Improved DeepSeek-V2-Chat 236B: https://hf.co/deepseek-ai/DeepSeek-V2-Chat-0628
>(07/18) Mistral NeMo 12B base & instruct with 128k context: https://mistral.ai/news/mistral-nemo/
>(07/16) Codestral Mamba, tested up to 256k context: https://hf.co/mistralai/mamba-codestral-7B-v0.1
>(07/16) MathΣtral Instruct based on Mistral 7B: https://hf.co/mistralai/mathstral-7B-v0.1
>(07/13) Llama 3 405B coming July 23rd: https://x.com/steph_palazzolo/status/1811791968600576271

►News Archive: https://rentry.org/lmg-news-archive
►FAQ: https://wikia.schneedc.com
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Programming: https://hf.co/spaces/bigcode/bigcode-models-leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
►Recent Highlights from the Previous Thread: >>101474151

--Understanding Context Shifting in koboldcpp and its Differences with llama.cpp: >>101475054 >>101475102 >>101475238
--Optimal Placement of System Prompts and Character Cards in Mistral: >>101481173 >>101481456
--Putting on our thinking caps: comparing numbers with R and the impact of temperature settings: >>101483736 >>101484037
--Proper Formatting for System Messages: >>101479160 >>101479788 >>101479796 >>101479903
--Llama.cpp is way faster now with GPU offloading, and Anon seeks help with rope scaling: >>101476940
--Fixing EOS token issues with Mistral-Nemo's tokenizer: >>101480410 >>101480853
--Estimating concurrent users for LLMs: a cat in a computer case: >>101482136 >>101482335 >>101482481
--AI: A Slave to Our Lonely Needs, But Not a Replacement for Genuine Connection: >>101475534 >>101475665 >>101475932
--SillyTavern Template Implementation and Best Practices: >>101480417 >>101480751 >>101480853
--New Mistral's Impressive ERP Character Description: >>101478112
--Mistral-Nemo: A Surprising Contender for Best Local RP Model: >>101478725 >>101478769 >>101478932 >>101478952 >>101480297
--Gemma 27b Review: Sovl but Formatting Issues and NSFW Avoidance are Drawbacks: >>101474884 >>101475375 >>101483374
--Dark and Moody Depictions: LLMs and Sensory Deprivation: >>101476284
--Miku (free space): >>101476953 >>101485255

►Recent Highlight Posts from the Previous Thread: >>101474172
>>
>>101487448
>her breath warm against your neck
>sly grin
>her voice a sultry whisper that sends shivers down your spine
>lips brushing against your ear
>nips lightly at your earlobe
>eyes locked on yours
>a smirk plays on her lips
>voice dripping with seduction
holy shit I've never seen a model this shiverslopped before
god damn what did they train it on
also lmao 236b parameters for this
>>
File: 1706634404113415.png (14 KB, 526x355)
Will LLMs forever suffer from gptisms?
>>
Looks like it's going to be a waiting game for the Nemo finetunes to drop. Really curious if it's better than gemma 27b
>>
>>101488159
WHY WOULD YOU THINK THAT. STOP BRINGING THAT UP.
>>
>>101488159
Wait, so gemma 2 was better than 70B, and now there's a 12B that might be even better than that?
>>
>Nemo isn't free on OR and is priced near 4o-mini
Oh... Any new logs please? I promise to read the recent highlights when I wake up.
>>
>>101488042
Fusing with Miku and Rin
>>
>>101488157
they're trained recursively on their own generations now and are not trained solely for rp

draw your own conclusions
>>
>>101488117
Pure GPT4 with an extra sloppy purple prose prompt.
>Wai man laiks roleplay. Let's add a lot of data for it.
>Yeah. Use GPT-fo, it's the smahtest, they will laik it.
>>
>>101488175

It hurts, anon. The waiting hurts. I need to share the pain. I'm just generating image sets of my waifu, maybe that can help while you wait.
>>
>>101488157
I like how "GPTisms" are just run of the mill literary cliches that you'd see in books and journals. It proves /lmg/ never read anything before they started gooning to llms.
>>
>>101488376
The issue is that the models stuff them into their gens at every given opportunity, like a 14-year-old fanfiction writer trying to ape his favourite young adult author's shitty style.
>>
>made up recap title
I'm dying from cringe again...
>>
Any good tunes of gemma or new mistral?
t. took a break after llama.cpp fiasco.
>>
>>101488408
tiger gemma is functional
>>
>>101488395
It's because [insert assistant LLM here] is playing the character of a corporate assistant with a professional tone, so when you force it to erp an anime girl getting plapped, it's actually roleplaying a corporate assistant with a professional tone who is uncomfortably roleplaying an anime girl getting plapped. So it generates the bare minimum of a caricature of erp text with a heavily exaggerated style, like a parody.
>>
>>101488469
it happens too with models that don't give a shit about complying
>>
>>101488433
No it's fucking not.
>>
>>101488534
I am reading legible outputs produced by tiger gemma right now.
>>
>>101488376
Yeah, they'd be appalled if they went to the usenet story archives.
>>
So Mistral format is the same as old Mistral, but remove spaces and move the system prompt inside the last user message?
>>
I know the llama.cpp quants are fucked for Mistral, but how about exl2? Any problems?
>>
tiger gemma 9b werks
friendship with llama 8b officially ended
>>
>>101488159
Community finetunes will make it worse, don't place your bets on them. We're not in the Llama1 days anymore.
>>
>>101488744
I think exl2 quants are broken in subtle ways, and we probably won't see the full model quality until post-training quantization algorithms properly take into account that the model has been trained in FP8.
>>
Who the fuck shills gemma? I just tried it and it sucked. Going back to CR+.
>>
wow.
tried exllama2 for the mistral nemo model. it's been months since i last tried it, back on the original version.
it's become even slower than back then. what the fuck.
is exllama now only for ampere cards and later? i have a pascal card and it's just a horrible experience.
no idea why loading the model takes 2m+ either if it's not from ooba but directly the latest version from github.
auto split directly throws an OOM. huge prompt loading times. what a shitshow.
if we didn't have gpu anon, people with older cards would be fucked. bless him for his work.
>>
I thought mistral fp8 was supposed to be lossless?
https://huggingface.co/neuralmagic/Mistral-Nemo-Instruct-2407-FP8
>>
>>101489040
I would guess people like me who can only use stuff up to ~30b.
stheno is such a horrible experience. I don't get the hype about mixtral at all, and the chinese models in that size range aren't that good either.
Gemma 27b feels like a huge upgrade. It's more "present". With the other small models you can still feel the remnants of the pyg retardation we had back then.
>>
>>101489057
That's what they say

Mistral NeMo was trained with quantisation awareness, enabling FP8 inference without any performance loss.
>>
>>101489057
That might depend on the exact FP8 format used (exponent/mantissa bits); if there's a mismatch with the one used for training the model, there will be quality loss.
>>
>>101489083
Add to that that l3 70b isn't really good at rp and comprehending user context and feels on par with gemma now in that regard (but light-years behind Wiz and CR+), and you've got your audience.
>>
>>101489057
of course there's bound to be some quality loss, especially since they aren't using the exact same code that Mistral used during training
fp errors can accumulate from anywhere, so 99% recovery of the unquanted is still incredible
>>
Local died in 2023. Only 12 good models have been released since then (most of them by Cohere, being already trained by 2023). Local achieved its creative peak in models like Mythomax, L2 Euryale, and SuperCOT, elevating the field into a legitimate SOVL form. Now, thanks to Llama3 and Mixtral, all its potential was squandered and the field has been reduced into being mere riddle solvers for reddit idiots (i.e. the lowest common denominator - stop trying to turn open-source AI into corposlop).
>>
>>101489340
youve been called gay in a lmg thread, doesnt mean local died
>>
>>101488469
That's hot.
>>
>>101489178
You've had satisfying experiences roleplaying with WizardLM-2 8x22B? Could you share a prompt / card combo that worked out well?
>>
>>101478112

As awesome as this might look, the problem is that it's still the same vocabulary, from the same basic GPTslop dataset. I recognise each and every one of those rote expressions that are being cut and pasted together, now.

Also, while I still don't completely understand the popular fixation with demons where ERP is concerned, I assume that the appeal in the case of a succubus, is a life form that literally views semen as a food source, and who doesn't require emotional gratification before being willing to suck someone off. That was the main reason for the appeal of futanari in my own case, I've realised.
>>
>>101488469
>plapping anime girls instead of your corpo assistant
Grow up.
>>
>>101489564
didn't you say you hated llms and wouldn't be back petrus/petra?
>>
>>101489340
I never used Euryale, but I view Nous Hermes and Dolphin Mixtral 2.5 as the local peak, personally...although Goliath was an almost spiritual experience, as well.

Broadly speaking I agree, though. Llama3 is corporate woke garbage whose every word sounds like a marketing press release. I think what really crushed me was when I realised just how much the suits WANT language models which act and sound like either L3 or contemporary GPT4. Soulless, sterile, safe, completely predictable...and utterly useless and pointless.

It was a beautiful dream, but it honestly looks like it's over.
>>
>>101489595
yes undi is releasing less models, it's over go away now
>>
>>101489595
Uh, soulless and sterile definitely wasn't the general opinion when Llama-3 got released. Safe, definitely; predictable, probably.
>>
>>101489607
he's just here to doom now, he said so himself before he hasn't used any new llms
>>
>>101489340
Damn hate to sound overdramatic but owari fucking da
>>
File: denomolos+.jpg (378 KB, 791x662)
>>101489593
Sorry to disappoint you, Heinrich. I'm still here now and then.
>>
>>101489085
There are two FP8 formats that NVidia proposed: E5M2 and E4M3, which one does NeMo use?

https://developer.nvidia.com/blog/nvidia-arm-and-intel-publish-fp8-specification-for-standardization-as-an-interchange-format-for-ai/
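For a sense of the tradeoff, here's the back-of-envelope arithmetic (numbers derived from the bit layouts in that spec; the snippet is just illustration, nothing NeMo-specific):

# E4M3: 4 exponent bits, 3 mantissa bits; the top exponent/mantissa pattern is NaN
e4m3_max = (1 + 6/8) * 2**8    # = 448.0 -> narrow range, finer precision
# E5M2: 5 exponent bits, 2 mantissa bits; IEEE-style, top exponent reserved for inf/NaN
e5m2_max = (1 + 3/4) * 2**15   # = 57344.0 -> wide range, coarser precision
print(e4m3_max, e5m2_max)

The spec suggests E4M3 for weights/activations and E5M2 for gradients, so for inference you'd guess E4M3, but that's an assumption until Mistral says so.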
>>
>>101489623
we know... the threads have been utter dogshit, so of course you're here...
>>
>>101489612
I used several L3 finetunes, and I've also used a couple of Drummer's Gemma tunes, as well. Gemma is promising as a coombot, but I honestly just don't have the motivation to test LLMs for anything other than ERP any more.
>>
>>101489436
I've had a satisfying experience making Wiz work through my complicated context and produce logical, believable, horribly cookie-cutter responses out of the box, no special system prompt required. Prose in those can later be prettified/edgified/humanized/followed up with Storywriter. Storywriter by itself fails to understand what's going on even in moderately complicated prompts, much like all other l3 finetunes.
>>
>>101489652
no you haven't you're literally just here to demoralize because you're burnt out and want others to be as miserable as you
>>
>>101489637
I haven't been posting nearly enough to make that happen by myself. These threads are dogshit mostly because very little is genuinely happening at the moment. If I also can't make even the most innocuous statements without you immediately arching up and telling me to get out, then that's your problem, not mine.
>>
>>101489667
What models have you been testing recently, Anon?
>>
>>101489630
>2 bits of mantissa
You have to be shitting me. At this point why even bother?
>>
>>101489340
And in 2022 we only had 1 model. pyg. No quantization.
I wrote it before but you needed 5 swipes to get something resembling coherent text. And it was amazing.
With closed models we had mormons running gpt2 in the background who leaked loli chats and banned paying users. Moderators reading over everything.

Either closed or open source we have it better than ever.
Sonnet 3.5 is a huge step up. It's so good. And I really like the new Gemma. Simple prompt and it's much more uncucked vs. llama3, and that's google.
You are probably one of those twitter pajeets who said agi by autumn 2024 and are now crying.
I understand zoomers have been fucked over so hard they have no energy left. I'm a millennial.
But imagine if I had all those ai tools we have now when I was young and had more time.
I had fucking rpg maker. Needed to ask artfags for their dumb ass charsets and make compromises everywhere. Music and sound effects were a struggle as well.
You can literally make videos for free now with a prompt. Crude and short maybe. But this is so far ahead of what I had I can't even put it in words.
>>
>>101489734
That's what I'm using though?
>>
>>101489792
I'm retarded please ignore
>>
>>101489775
I'm glad you're enjoying it, Anon, honestly. I wish I knew how to get some of your enthusiasm back. I think my real problem is that I was around for maybe the last three months before Character.AI went completely to shit, and anyone who experienced that will understand how hard it is to let go of that memory. Anything else we experience, short of literal AGI, feels like a step down by comparison. Goliath was the only other model I've seen that has come close.
>>
>>101489834
This, but unironically.
>>
>>101489834
>I'm glad you're enjoying it, Anon, honestly
no otherwise you wouldn't be here demoralizing
>>
>>101489834
Fair enough anon.
I think I understand, at the beginning chatgpt was so good.
It sniffed out what you wanted without it being explicitly prompted.
Difficult to put in words but like its mission was to serve the user as best as possible.
That lasted a very short time and its never been the same.
Just lean back and relax. Even alignment-wise it seems we're heading in a better direction.
>>
I don't get all the character.ai love; yes, I tried the first versions for a few days, then I didn't care anymore
>>
>>101489929
you didn't care anymore because you couldn't sex it.
>>
>>101488042
Wtf, Fish Audio is actually good? It's a bit slow but I can tolerate that for the sheer quality compared to other models.
>>
>>101485864
Sweet fiddlers fuck I haven't seen this many gpt-isms in a long while. Husky whispers, going on journeys together and forming bonds, it's all there. On the plus side the characters are a lot more well-spoken than I've seen with other models.
As usual at 70b the question is 'is this model worth losing 32k context' and the answer is definitely not here.

Also has GGUF gotten better over the last 3ish months? My gen times are cut in half.
>>
>>101489957
First version could be sexed to hell and back, though.
>>
>>101490120
>Also has GGUF gotten better over the last 3ish months? My gen times are cut in half.
yes, there are lots of people saying it's gotten much closer to exl2 speeds recently
>>
>>101490129
It didn't last very long
>>
NEMO LCPP STATUS?
>>
>>101489907
>Even alignment-wise it seems we're heading in a better direction.
I don't understand why /lmg/ wasn't more enthusiastic about Dolphin Mixtral 2.5 in particular, to be honest. It was great. Great compliance with prompts, and text generation that honestly felt close to GPT4 at times in my experience.

https://huggingface.co/TheBloke/dolphin-2.5-mixtral-8x7b-GGUF

If you've never tried it, give it a go, Anon. It's awesome.
>>
>>101490333
>Dolphin
gptslopped to hell that's why
>>
will 400b be our savior or the final nail in the coffin of open source llms
>>
>>101488879
can it follow chat formatting? retain all quotes and asterisks?
>>
>>101490385
no
>>
>>101490382
It depends on the instruct tune provided by Meta; hopefully it won't be as cucked as the previous L3-instruct. Almost nobody will be able to finetune it, although the model is so large and will have a long enough context size that perhaps in-context learning with the base model will be enough for most uses.
>>
>>101490333
>Dolphin Mixtral
I like it more than L3 and recent models (with the exception of CR) for chatting.
>>
>>101490333
>>101490431
limarp-zloss or dolphin?
>>
>>101490463
Dolphin is honestly GPT6 tier, no cap.
>>
>>101488042
>give cats breakfast
>their excitement is so palpable that it sends shivers down my spine
>>
If you had the chance to purchase a server with 8xAMD Instinct Mi100 32GB GPU's for 7000€, would you do it?
>>
>>101490515
gpt brainrot has consumed you
>>
>>101490385
sometimes
>>
>>101490120
The latest llama.cpp update has really improved speeds, especially with full gpu offloading; it may be as fast as exl2 now. Booba hasn't updated yet though so it's still slow on boobs.
>>
>>101490545
>AMD
>>
>>101490693
yeah, thats the point of the question.
>>
>>101490545
I would say no....
#1. Pricing it out on a GPU-only basis, you're not really getting any kind of bulk-buy discount if we go from the bottom of the stack.
#2. No bitsandbytes support.
Which is fine for just running models in exl2 or gguf, no bitsandbytes needed. But if you want to start playing around with training you're more or less relegated to fp16 training.
#3. Each card has about 1/3rd the fp16 performance of a 3090. So even if you did find a massive model to load up with them, let's say Q4 405B, then with the inefficiencies added by multiple GPUs, which get worse with every card you add, you're probably not looking at a particularly useable experience. More useable than a gen-1 epyc or haswell xeon rig, but that's not saying much with a model that big.
>>
Are we getting any other sizes next week or just 405B? Saw an anon say they are refreshing the whole lineup.
>>
>>101490545
For that price you can buy a proper CPUMAXX server with 500GB RAM that can run 70b at like 7t/s
>>
>>101490785
rumors of 8/70B 128k
>>
>>101490382
I don’t think they’ll risk having it too aligned, it will probably be the closest to uncensored yet.
>>
Installing from source gives some ninja error, and how is this wheel install supposed to work?

export VLLM_VERSION=0.5.2 # vLLM's main branch version is currently set to latest released tag
pip install https://vllm-wheels.s3.us-west-2.amazonaws.com/nightly/vllm-${VLLM_VERSION}-cp38-abi3-manylinux1_x86_64.whl
# You can also access a specific commit
# export VLLM_COMMIT=...
# pip install https://vllm-wheels.s3.us-west-2.amazonaws.com/${VLLM_COMMIT}/vllm-${VLLM_VERSION}-cp38-abi3-manylinux1_x86_64.whl

That implies there are supposed to be wheels for each commit, but I still have to specify a release version, which would be older than the commit? (the URL doesn't work)
>>
>>101490725
>Each card has about 1/3rd the fp16 performance as a 3090
huh?
I looked at wikipedia and it said 184.6 TFLOPS for FP16. And 3090 has like 35 TFLOPS for fp16.
is that misinformation?
>>
>>101490998 (Me)
just to confirm, i went onto amds website, and it said the same thing.
https://www.amd.com/en/products/accelerators/instinct/mi100.html
>>
I feel so good about not buying a second card just for this. I seriously considered that for a moment.
>>
>>101491029
Oh, google lied. My bad.
That's actually really good.
I don't know exactly where the memory bandwidth bottleneck kicks in with gpu inferencing so I still can't say if it would be good for 400B or not. I can appreciate wanting to run it for the memes, but if you feel like CR+ and 70B aren't good enough the reality is nothing will actually please you and you'll just end up with buyer's remorse.
>>
405B isn't going to inherently have a better writing style, it's just going to be less prone to making retarded mistakes where it generates words inappropriate to what's going on, right?
>>
>>101491153
And it will be a trivia master.
>>
>>101491087
Its not for cooming, its for preparing for the dystopia :)
>>
>>101491178
I'd still be worried about the lack of AMD/Legacy support for bitsandbytes.
Sucks that huggingface sucks Jensen's cock that hard but it is what it is.
Because you'll probably want to be able to train in the dystopia.
>>
>>101491041
Imagine how those that bought 6 must be feeling right now.
>>
>>101491173
Riddle master, but bad at trivia.
>>
what type of person uses ai for anything but cooming
>>
>>101491460
Indians use it for programming
>>
>>101491460
a man skilled at tech but not social interaction
many such cases
>>
>>101488159
It doesn't need a finetune.
>>
>>101490385
yes
>>
>>101490998
>>101491029
The 35.6 TFLOPS number is for regular FP16 operations.
With tensor cores an RTX 3090 has 142 FP16 TFLOPS or 284 int8 TOPS.
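Doing the arithmetic on those spec-sheet numbers: 184.6 / 142 ≈ 1.3, so an MI100 is roughly 1.3x a 3090 for FP16 on paper, not 1/3 of one. Whether ROCm lets you see that in practice is another question.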
>>
What would be a good tiny model to pack into a game? For now Im thinking
>teknium/OpenHermes-2.5-Mistral-7B (4GB ram usage)
>>
Having just fapped to a game with my fucked up fetish + horrendous writing, my post nut clarity made me think about something. Is the problem uncanny valley in text form? The quality of writing in that game is absolutely atrocious. Like 14 year old fanfic level. But I don't mind it that much. On the other hand when I see shivertastic beneath the whisper gleams in the eyes I start to quickly lose my erection.
>>
>>101491690
You sound illiterate.
>>
File: humanslop.png (90 KB, 1581x738)
>>101491690
No, the problem is data diversity. LLMs have consumed too much data with shiverslop and not enough data without it. Erotic fiction is already a niche, and erotic fiction written in an unslopped way is even rarer. We need better data, possibly hybrid data (heavily edited synth data), because we don't have a lot of human data.
>>
>>101491690
is the game called euphoria
>>
>>101491690
The solution is to replace erotic literature dialog with translated eroge visual novel dialog. The shivers and whispers will be replaced with can't be helped's and pleasures of being cummed inside
>>
>>101490333
What about the other dolphin models? I think I had dolphin dbrx downloaded but never tried it.
>>
Word on the street is that this is an upgrade over niitama.
https://huggingface.co/nothingiisreal/L3-8B-Celeste-V1.2
>>
>>101491880
yes it is
>>
>>101491880
no it isn't
>>
>>101491923
>using Reddit
>>
What Kobold Preset do I use in Silly Tavern for gemma 2 based models?

I'm on the latest version of Silly Tavern, which has presets for context and instruct in the advanced formatting section, but the Kobold Presets in the first menu don't seem to have anything for Gemma2?
>>
>>101491990
neutralize samplers
temp 1
>>
>>101491640
so if 3090 is better for integer does that mean it'll be superior in 2 weeks time when bitnet becomes the norm?
>>
>>101492046
I'm struggling to get even 50% of the peak int8 tensor core throughput for MMQ so probably not.
>>
>>101491923
Using reddit, very smart of you, unironically.
>>
Is Gemma 2 27B generation quality on Exllama on par with Llama.cpp yet?
>>
>>101492135
No, nobody even made an issue.
>>
does mistral work properly with llama.cpp?
>>
File: 1713871087817283.jpg (91 KB, 640x720)
As my eyes devoured those overused phrases and hackneyed words, I couldn't help but feel a fiery mix of frustration and exasperation coursing through my veins. My blood practically simmered with righteous indignation, threatening to boil over at any moment. A shiver of annoyance ran down my spine, and I found myself clenching my fists, my knuckles turning white with barely contained irritation. The very sight of such literary clichés sent waves of displeasure pooling in my belly, my jaw tightening as I struggled to contain the tempest of emotions swirling within me.
>>
I'm on vacation, but remoted in to my rig long enough to do a recapbot test with the new deepseek. I haven't been following the threads closely enough to really evaluate performance, but on the surface it seems to have done a good job. How does it compare to recapanon's multistage recapbot's output for the last thread?
>>
>>101492182
No.

llama_model_load: error loading model: check_tensor_dims: tensor 'blk.0.attn_q.weight' has wrong shape; expected 5120, 5120, got 5120, 4096, 1, 1
>>
File: Overlord.png (749 KB, 1000x562)
I just realized I find 1 on 1 RPs extremely boring. I want a group of characters interacting with me and each other, each one with their own thoughts, life and motivations. Oh well, 10 years or something to wait.
>>
>>101490316
https://github.com/ggerganov/llama.cpp/issues/8577
Support for the custom 'Tekken' tokeniser just got pushed 30 mins ago. An inference implementation will soon follow.
>>
>>101492266
Nice. Can't wait for this and then the dozen tokenization bug fixes later on.
>>
How's base vs instruct Nemo for RP?
>>
I don't even know what to do with 128k context
>>
>>101492265
Most models are still trained with a 1-on-1 paradigm (user-assistant), that will need to change first.
>>
>>101492265
skill issue

>>101492286
they're both terrible
>>
>>101492233
How many t/s do you get and at what quant? I got 5-6 t/s at bf16
>>
>>101492241
this works but you need to compile it yourself, hopefully "official" support soon enough
https://github.com/iamlemec/llama.cpp/tree/mistral-nemo
>>
>>101492297
For me? I roleplay as an immortal being and go around impregnating random girls, fast forwarding a decade or two, impregnating my daughters then revealing my relationship to them after they give birth and having a kek at their reaction, and repeating this infinitely.
>>
>>101492266
WE'RE SO BACK???
>>
>>101492302
skill issue my ass, sota models can't even realistically portray one character, not to mention multiple ones
>>
>>101492302
skill issue
>>
>>101492265
I imagine somebody would have to program the model to speak for each character in the group, processing each personality and the conversation up to that point.
They could call it group chat or something.
Alas and alack the day, such functionality is just a dream.
>>
How do you set up the AI to have a cooming session?
I installed KoboldAI, installed what I assume is a good model - OpenHermes-2.5-Mistral-7B, tried chatting with it, but it gives kind of shit answers.
Do you have like a top tier chatting preset?
Am I using the right model?
Where do you get character cards now? I don't see anything in OP.
3080 12gb, 16gb ram.
>>
File: nemosystemmessage.png (63 KB, 967x336)
I just want to rant that the way "system messages" are implemented in Mistral NeMo Instruct is utterly retarded.
>>
>>101492344
>sota models can't even realistically portrait one character
I take my earlier "skill issue" comment back and aim it at this one

>>101492357
both just seemed broken to me
>>
>>101492328
I've tried doing a long roleplay but it just turns into a collaborative writing session because I'm the one that ends up guiding the narrative anyway. I wish there was an event system or something, like a wildcard prompt injector that you can load with dragon attacks or equipment breaking down or whatever
>>
>>101492372
>installed what I assume is a good model - OpenHermes-2.5-Mistral-7B
bad bait
>>
>>101492297
You use it to insert up-to-date information on whatever the fuck you're trying to do and enjoy 12B speeds and current-day-bigger-than-12B knowledge.
>>
>>101492374
How's it retarded? They're trained to continue token sequences. A token is a token. Whether or not it's \n or <|system_message_end_im_a_midwit_retard_and_need_handholding|>
>>
>>101492408
Other than coding and other long-context productivity scenarios, in theory 128k of usable context would be very useful for in-context learning and base models. I haven't had much luck with models released so far though, they generally tend to get confused with too much information in context.
>>
>>101492385
If you think that Claude 3.5 Sonnet, GPT-4o or whatever is even close to emulating human behavior then you have to touch some grass and talk to actual people. I will assume you are trolling and not retarded or a basement dweller who hasn't seen the sun for 20 years.
>>
>>101492396
I don't go to this general like you so I have no idea what's meta right now. I only use Stable Diffusion.
>>
>>101492454
>I don't go to this general like you
yeah which means you need to go back, likely to reddit
>>
>>101492362
I can program a button with the label "click to cum in 1 sec", doesn't mean it will.
>>
>>101492444
>realistically portray one character
vs
>close to emulating human behavior

If you think those things are equivalent, then I'm not the retard
>>
What nemo instruct version are you using? Unsloth?
>>
>>101492478
They are virtually the same things, now you are nitpicking.
>>
>>101492388
You can just prompt it to include random events and twists of fate. Tell it to make it realistic or as wacky as possible.
Funniest thing the AI did for me was introduce Shrek walking into the cafe my daughter-wife and I were at then slapping the cashier after they said they don't sell onions.
Works perfect for me using L3 70B New Dawn
>>
>>101492533
It's very clear that they are virtually the same thing /to you/.
>>
>>101492362
You literally just need to set up a loop where each character is an isolated agent: iterate over the characters, provide the context of the conversation, and use JSON to choose from a list of actions such as wait, reply, etc. An example character "chooses" wait, so it moves on to the next; a character "chooses" reply, so it re-prompts the model for that character to give a reply. Use regex to make sure it's writing for the correct character and discard the reply if it fucks up. It's not rocket science. You could even have it maintain a text file for each character containing dynamic summaries and have those added to the context.
Why has nobody done it yet? Because it's easier to just make a multi-character card and deal with the shortcomings, since after you coom you're not going to give a shit about any of it anymore anyway.
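A minimal sketch of that loop, assuming a generic llm() wrapper for whatever backend you run (every name and prompt here is made up, adapt to taste):

import json, re

def llm(prompt: str) -> str:
    raise NotImplementedError  # hook up llama.cpp server / kobold / whatever here

characters = {"Alice": "a cheerful knight", "Bob": "a grumpy old wizard"}
history = []  # running transcript, one line per utterance

def run_turn(user_msg: str):
    history.append(f"User: {user_msg}")
    for name, persona in characters.items():
        ctx = "\n".join(history[-40:])  # last N lines as shared context
        decision = llm(
            f"You are {name}, {persona}.\nConversation so far:\n{ctx}\n"
            'Answer ONLY with JSON: {"action": "wait"} or {"action": "reply"}'
        )
        try:
            action = json.loads(decision).get("action", "wait")
        except (json.JSONDecodeError, AttributeError):
            action = "wait"  # malformed output counts as a wait
        if action != "reply":
            continue
        reply = llm(f"You are {name}, {persona}.\nConversation so far:\n{ctx}\n{name}:").strip()
        # regex sanity check: discard the gen if it starts speaking as another character
        others = "|".join(re.escape(n) for n in characters if n != name)
        if others and re.match(rf"^({others}):", reply):
            continue
        history.append(f"{name}: {reply}")

The per-character summary files would just be one more llm() call per turn, appended to a dict and injected into ctx.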
>>
>>101492563
yes they are, something else?
>>
/lmg/ is like the concentration of the worst people on /g/
>>
>>101492301
There's something called 'writing a story'. You should look it up. Maybe pick a book sometime?
>>
>>101492627
models should pick a book first, they are terrible at writing
>>
>>101492603
Character portrayal is not restricted to humans or human behaviour. Not a very shocking option.
>>
>>101492619
Nah, that's aicg
>>
>>101492589
jeez, he was being sarcastic; group chats have existed for months in silly...
https://docs.sillytavern.app/usage/core-concepts/groupchats/
>Swap character cards
>>
>>101492638
both are pretty shit desu
>>
>>101492634
Let me guess, only Kayra can write stories? Go back to /aids/, shill.
>>
>>101492645
tavern is jeet spaghetti code, though, and anything it does could probably be done better with about a thousand fewer lines of code.
>>
>>101492638
aicg is better because it has people interested in writing. /g/ is just Americans seething about Indians and anyone else smarter than them.
>>
>>101492637
>nitpicking: noun, /ˈnɪtˌpɪk.ɪŋ/ us /ˈnɪtˌpɪk.ɪŋ/. Giving too much attention to details that are not important, especially as a way of criticizing.
>>
>>101492658
>Why has nobody done it yet?
>nobody
>>
>>101492658
And yet there is no 1000 line reimplementation of a better Tavern and hasn't been all year. Any idea why?
>>
>>101492649
since when Kayra is not a model, retard? take your meds
>>
>>101492678
why bother? It just works.
>>
LimaRP-DS dataset now available, as promised.

I also trained one model on this (sunfall-v0.5). It feels... refreshing.
>>
>>101492700
I'll wait for LimaRP-3DSXL
I know how nintendo is with these things.
>>
>>101492668
Nope, that's not a nitpick. You just seem unable to grasp the concept of what I've explained to you, and that's okay.
>>
>>101492700
based! thanks anon!
>>
File: 135.png (21 KB, 700x700)
>>101492731
>Ackchyually the word 'character' can refer to non-human persona. They can be monsters, animals, and other creatures! They don't behave like humans! I'm very smart.
>>
anons... i won't lie : mistral nemo feels different
it's a little retarded but it has sovl, i've been starved of relevant sovl since the mythomax days and it feels weird having a model that doesn't spew out the same standard flowery shit over and over again
definitely need to tweak parameters because i'm using my bagel misterytour template and it's not quite the best for it, but damn mistral outdone themselves on this one
>>
>>101492776
buy an ad arthur
>>
>>101492717
>>101492733
You're welcome. I realized that I kind of demolished some of the structure of the original dataset when I did the conversion (e.g. the data-long vs data-short dirs are gone; also trashed all BAD and WIP entries). A fine tuner with moderate level IQ should be able to get it right, but I may restore some of that if needed.
>>
I'm starting to look into this whole local models thing, and I have a 6gb card that seems to run llama 3 8b fine.
I was willing to upgrade to a 16gb card, but the more I read the more pointless it seems? Apparently I won't be running the 70b version anyway.
Does having 16gb vram even matter for casual prompts? I'm having a hard time gathering factual information about how this whole thing works over a bunch of "(doesn't) work on my machine".
>>
>>101492785
you're a dumb nigger, you're the blackest retard gorilla i have ever seen
>>
>>101492785
>posts about merge/tune
>Kys shill only use true corpo models!!!
>posts about corpo model
>Buy an ad
...
>>
>>101492793
I wouldn't upgrade unless you were going for at least 24gb.
T. Running 8gb of vram.
>>
>>101492793
not worth it until you get at least 24+gb vram no
>>
File: nico_rosberg.jpg (162 KB, 600x706)
>>101492700
Cool. Will take a look.
>>
>>101492785
>>101492811
samefag
>>
>>101492638
lmg is aicg-lite, same shills, same baits, same avatartrannies.
>>
>>101492441
doesn't that code mean the system message is prepended to the last user's message?

we've been doing that a long time now with last assistant prefix and depth 0 inserts and such
>>
>>101492297
I'm gonna use it to run huge simulator cards like this. https://chub.ai/characters/Branon/shin-megami-tensei-simulator-v2-0-d1ac08fc
If it works good I'm going to modify it to have custom AI era games for every niche franchise I like
>>
>>101492849
>we've been doing that a long time now with last assistant prefix and depth 0 inserts and such
apparently a ton of people are still doing sysprompt up top, then they wonder why the models don't follow instructions...
>>
>>101492813
>>101492793
96gb vramlet here, if you're patient mistral-nemo seems like it might be promising and would be borderline useable with partial cpu offload. You're going to want a lot of system RAM anyways for this hobby so upgrading your ram to make sure you have enough for MiNeMo running mostly on CPU might be a good first step before you go busting out big bucks for GPU upgrades. Then decide from there if you want/need more.
>>
>>101492864
Models have been following system prompts since Mixtral. And the problem with putting heavy instructions before the last message was that it breaks the flow of the conversation, at least with older models.
>>
>>101492219
this would be fine if you read it in a book on page 200 out of 500. It's only a problem when you get it by message #5 in your ERP. Book sex doesn't happen after 5 paragraphs, and by prompting for sex you are technically also prompting for dramatic page-200 shit too, cuz that's where it all goes.
>>
>>101492914
then maybe nemo should now be much better at this than any other model, since it was trained to follow conversations with the system prompt at the end
>>
>>101492849
Anything that could be called functionality or behavior is emergent.
It's just auto-completing.
And it's all highly generalized through training.
Literally the only people I see constantly sperging out and hair-pulling are the people who obsess over reddit bullshit like system prompts.
I have never given a fuck about system prompts, or insertion depth, and never had a fucking issue. I don't know what you people keep going on about. Learn to see it for what it is: A text predictor.
From the very first token to the very last. That's literally all it does.
>>
>>101492971
You want to be in distribution
>>
File: file.png (691 KB, 720x1009)
>>101492971
trvth
>>
>>101492971
Your IQ is negative
>>
I'm trying to write a short story about a woman possessed by a nympho demon and it's struggling. It treats both characters as either the same person or separate people, not one person with two consciousnesses
am I asking too much?
>>
>>101490811
wouldn't their largest open model be the most censored? to show people that it's safe to have open models.
>>
how do i find quantized models on huggingface? I lurked here for ages until i saw one for gemma but i'm itching to try nemo now and no one has posted a link that i've seen. yes i'm retarded no need to point that out
>>
>>101491942
>>101491961
None of those were (me)
>>
>>101493386
Not really.
What model, quant, backend, frontend, instruct template, etc are you using?
Also, share your initial message, character card, sys message, etc.

>>101493469
look for model name gguf if you use llamacpp or model name bpw for exllama2.
>>
>>101493469
>type "mistral nemo exl2 or "mistral nemo gguf" in search bar
>???
>profit
>>
>>101493386
>am I asking too much?
yes, LLMs are shitty in 2024, wait a few years
>>
>>101493505
>>101493514
well that was easier than i thought, thanks
>>
>>101493514
>gguf
wait, the support got merged?
>>
>>101492374
*cracks knuckles* that Claude jb <tag> wrapping is making extra sense now
>>
>>101492374
llama.cpp ignores all that anyway.
>>
>>101492938
every mistral model has handled system prompts the same way, I remember remarking on it when they first released their API service
>>
>>101492971
>I have never given a fuck about system prompts, or insertion depth, and never had a fucking issue. I don't know what you people keep going on about. Learn to see it for what it is: A text predictor.
>From the very first token to the very last. That's literally all it does.
But I want to believe that Chun Li is speaking to me, Anon. Do you really want to take that away from me?
>>
>>101493593
Yeah, but that implies the model has been trained in that way, with system instructions separated by a double newline from the actual user request.

Putting aside how using a double newline as a separator conflicts with the way most character cards and instructions are formatted, reproducing that prompting in a non-hacky way in SillyTavern doesn't seem possible right now either; there is no "last user message" and certain macros don't work in instruct sequences.
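As far as I can tell, the rendered prompt ends up looking roughly like this (reconstructed from the template, so verify against mistral-common before trusting it; note the missing spaces and the double newline gluing the system text to the final user turn):

<s>[INST]How's the weather?[/INST]Pretty sunny.</s>[INST]{system prompt}

{last user message}[/INST]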
>>
File: 1715284989664.jpg (82 KB, 510x705)
>>101493705
Finally, an opportunity to use this image.
>>
File: file.png (1.75 MB, 1202x1183)
>>101493650
TIL
accept this silver token of gratitude
>>
>>101491779
could it be possible to increase the loss on rare-occurring phrases during training? hmm
>>
File: nemosovl1.png (751 KB, 933x783)
mistralbros... we're so fucking back
>>
>>101493848
focal loss?
>>
>>101493848
Yes, you can do that with a custom loss function.
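A minimal sketch of the focal-loss version for LM training, assuming PyTorch; the recipe is standard in object detection, whether it helps LLMs is untested as far as I know:

import torch
import torch.nn.functional as F

def focal_lm_loss(logits: torch.Tensor, targets: torch.Tensor, gamma: float = 2.0) -> torch.Tensor:
    # logits: (batch, seq, vocab), targets: (batch, seq)
    logp = F.log_softmax(logits, dim=-1)
    logp_t = logp.gather(-1, targets.unsqueeze(-1)).squeeze(-1)  # log-prob of the correct token
    p_t = logp_t.exp()
    # plain cross-entropy would be -logp_t; the (1 - p_t)**gamma factor downweights
    # tokens the model already predicts confidently and keeps rare, poorly-predicted
    # tokens near full weight
    return (-((1 - p_t) ** gamma) * logp_t).mean()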
>>
>>101493894
yeah, seems that's exactly what i was talking about, lol
is this already being used for llms?
>>
>>101493848
>>101494075
actually, why isn't it possible to artificially increase the sample size of the rare data during training by duplicating it instead of scaling the loss?
>>
>>101493870
Feels llm-generated. I bet it will be full of slop come erp.
>>
I need a local text model to reformulate text, improve wording and add corpo sugar. Which small model (<20B) would be best for that task? From my search, llama 3, phi 3, mistral nemo, gemma 2 or qwen2 are the current good small models.
>>
As compute scales, its outputs will be increasingly kino.
>>
>>101494179
what exactly do you duplicate? the entire text containing the rare sequence, which may contain overused slop, or just the text, which becomes nonsense without its context?
>>
>>101491668
what's your plan?
>>
>>101494269
wouldn't you like to know
>>
>>101493821
>WAGMI 2021
did they make it?
>>
>>101494348
no
>>
>>101494246
i guess just defining what's 'rare' in the context of llms is a problem in and of itself
you can't scale the loss arbitrarily on some random tokens either, can you?
>>
>>101494211
Probably phi3, but you are better off just trying them all and seeing which works best for you.
>>
>>101494413
You probably can.
>>
>>101493870
formatting is all fucked

first paragraph has asterisks, then it misses asterisks between quotes, and it appends a single asterisk at the end. Kinda like gemma. I don't remember even 8b l3 having issues with formatting like this; now both gemma and mistral mess it up. are we regressing?
>>
>>101494559
>it appends a single asterisk at the end
actually the last three paragraphs.
>>
>>101478725
This kind of poorfag cope is straight up disinformation. It infects this cesspool of a general like a virus.
>>
>>101494413
I mentioned focal loss because that one doesn't rely on any quality inherent to the dataset, only whether the model is already accustomed to the data for which the loss is being evaluated, but it's an object classification/detection thing and I don't know if anything similar has been done for LLMs
>>
>>101494609
t. overspent on hardware to run obsolete big models
>>
>>101494609
I switched from CR+ to Nemo, your fallacy doesn't hold water.
>>
>>101494627
>obsolete big models
4 more days.
>>
>>101494503
It's hard to judge. I tried llama 3, but just an instruction like "Modify the following text to improve grammar and spelling:" changed my text a lot, even changing the meaning. GPT-4o or 3.5 sonnet are able to do it effortlessly.
>>
>>101492533
Look at this fag get checkmated and BTFO and then scramble to salvage his fragile ego XD XD XD
>>
>>101494657
stop
get some help
>>
>>101494655
1) You can't compare an 8B model with GPT-4o.
2) You literally told the model to modify the text.
>>
File: screen.png (38 KB, 1386x810)
Anyone else getting this error?
>>
>>101487352
hey anon, this prompt format is wrong, and you're also supposed to include the past translations.
>>
>>101493870
Haters can hate, but I think this is awesome.
>>
File: 405b.png (58 KB, 1048x404)
woops!
>>
>>101494626
we can also approach this problem inversely
we can split a huge dataset into distinct clusters based on some criteria (let's say similarity or topic) and sample from each cluster uniformly (or giving priority to extremely rare pieces) until we get a big enough dataset, but that will lead us exactly to training on
>the entire text containing the rare sequence, which may contain overused slop
still worth a shot maybe
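a rough sketch of that pipeline, assuming sentence-transformers + scikit-learn (the model name and cluster/quota numbers are arbitrary picks):

import random
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans

texts = load_corpus()  # stub: your dataset, one string per document
embeddings = SentenceTransformer("all-MiniLM-L6-v2").encode(texts)
labels = KMeans(n_clusters=1000).fit_predict(embeddings)

clusters = {}
for text, label in zip(texts, labels):
    clusters.setdefault(label, []).append(text)

# uniform per-cluster quota: rare topics survive intact, overrepresented slop gets thinned
quota = 500
balanced = [doc for docs in clusters.values()
            for doc in random.sample(docs, min(quota, len(docs)))]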
>>
File: 1704050998520400.png (22 KB, 1118x158)
>>101494920
!!!
>>
>>101494920
Bitnet dreams crushed
>>
>>101494609
buyer's remorse cope
i switched from wizard 8x22 to stheno and i'm loving it!
>>
File: file.png (31 KB, 718x566)
>>101494983
>>101494920
it ded
>>
>>101494983
>18 days ago
bad bait
>>
>>101494983
Should've been a rickroll.
>>
Sorry for the spoonfeed beg but I've never tried to or had to use anything other than gguf and exl2. How the fuck do I run an FP8 model?
>>
>>101495002
hi sao
>>
>>101495055
newfag, i also used to use ggml
>>
>>101495055
You use bitsandbytes with the pytorch API
There's a huggingface wrapper but you need an nvidia GPU. CPU inference has been broken for years.
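the wrapper route looks roughly like this (note bitsandbytes is int8/int4, not true FP8; for an actual FP8 checkpoint like the neuralmagic one I believe you'd serve it with vLLM instead):

from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "mistralai/Mistral-Nemo-Instruct-2407"  # example repo, substitute your own
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),  # nvidia GPU required, as said above
    device_map="auto",
)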
>>
>>101494920
I was promised bitnet
What the fuck
>>
>>101495155
By who?
>>
>>101495155
Nobody promised bitnet.
When bitnet became legit people said "I wish we were getting bitnet instead of 405B that's probably already obsolete" and somehow that played telephone to people thinking the next thing would be bitnet.
>>
>>101495163
me
>>101495155
sorry anon
>>
>>101495194
And who are you?
>>
>>101495201
i'm anon
>>
>>101495241
Why?
>>
>>101495241
Never heard of him.
>>
>>101492678
You could probably hack this into my neovim macros.
>>
forget 405b, stellar dong is here

https://huggingface.co/smelborp/StellarDong-72b
>>
>>101495272
Stupid shill. Do you not even know the difference between a dong and a gong?
>>
IT'S UP
https://huggingface.co/PrimeIntellect/Meta-Llama-3-405B-Instruct
>>
>>101495306

the gong goes dong :)
>>
>>101488376
I think this is really good. Make overused cliches and garbage prose be filtered as "gpt slop" and suddenly authors need to start writing properly again or have their works be accused of being AI garbage.
>>
>>101495316
>832 GBs
Who the literal fuck is this even for?
>>
>>101495437
It'll be only 400GB at Q8 and 200GB at Q4.
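The arithmetic, roughly: 405e9 params x 2 bytes ≈ 810GB for the bf16 weights (presumably the 832GB figure above includes some extra files), so at 8 bits/param that's ~405GB and at 4 bits ~203GB, plus KV cache and runtime overhead on top.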
>>
>>101495450
Who is that even fucking for? What is the use case?
>>
>>101495450
And it'll still send shivers down your spine.
>>
>>101495450
That model is fake, the official one won't be that heavy.
>>
>>101495316
>"max_position_embeddings": 8192,
>>
>>101495437
What, you're telling me you don't have 10 H100s?

Vramlets.
>>
>>101495463
API
>>
>>101495465
anon... 405B is 405B
>>
>>101495450
It would fit in 128GB at Q2.
My motherboard supports this much memory although I think I'd want more CPU cores.
>>
vramlets already coping ITT, lmao.
But don't worry, bitnet in two more weeks, Q* predicted this.
>>
>>101495450
iMat_IQ1_XXXS when?
>gonna git bitnet one way or the other
>>
>>101495501
>vramlets already coping ITT, lmao
We're having fun with it.
We haven't even seen yet if 405B can count the R's in strawberry, can compare 9.9 and 9.11, or can speak in a low tone that's not barely above a whisper.
>>
File: pepefroggie.jpg (38 KB, 780x438)
Closed model companies serve their shit on GPU server farms that have 500% better cooling and wattage efficiency than the average hobbyist's setup. Cloud models are clearly the future. I bet if you live in some EU shithole like Germany it'll be cheaper to just pay for Claude Sonnet than to try to run a shitty 70B on your dual 3090 rig
>>
>>101495815
that's obvious. It's also cheaper because they can run requests in batches.
>>
is the Echidna model recommended in the guides good or will it send shivers down my spine?
>>
>>101495815
>renting it is cheaper than hosting it yourself
Are there people who think otherwise?
>>
>>101495903
>Echidna
it's 9 months old
>>
>>101495955
Yeah, I figured, it's why I'm asking... all the guides are pretty ancient or just refer to basic models.
>>
>>101488157
You aren’t a fond of their gentle ministrations?
>>
>>101489775
Where can I make free videos and can they be spicy?
>>
>>101489834
Opus
>>
>>101494559
Both Gemma and this mistral are really smart when it comes to Japanese translations, but fucky formatting errors and overlooked text make them non-options for me. Sorry VNTL dude.
>>
>>101490020
What’s this fish what what?
>>
>>101496116
A text to speech model that can clone voices from 5-10 second clips, like xttsv2.
>>
>>101491880
I was so disappointed we didn’t get to tickle the stoic girl, only Nemu…
>>
>>101492265
Opus can do it..
>>
>>101492328
Post logs.
>>
>>101495903
Get the new Mistral 12B, it's not cucked, so no fine-tuning is necessary.
>>
>>101492388
That’s already a thing to the point where I find it annoying to deal with the curveballs and interruptions with my prompts
>>
>>101492444
I mean if you compare the best models to the dumbest humans…
>>
>>101496256
is that even available as gguf yet?
>>
>>101496335
not officially; don't know what they're doing. I've been testing it all day using some rando's fork...
>>
>>101496335
It's available in exl2. Are you vramlet?
>>
>>101496152
Is it free? Can I do spicy stuff with it?
Where? Where?!
>>
>>101496357
12GB... AMD...
>>
>>101496357
I will never use exlmeme.
>>
>>101496440
I use whatever works atm
>>
>>101496440
based
same here, I'm too lazy to install some other shit just for one model.
>>
>>101496440
>Translation: yes.
>>
So what is the verdict on the new Mistral?
>>
>>101496504
Not awful; an interesting vramlet sidegrade, though it does tend to repeat itself a tad.
>>
so what is the new deepseek supposed to be good at anyway?
>>
>>101496539
Explaining the historical events that have occurred at Tiananmen Square.
>>
>>101496504
Okay for vramlets
>>
>>101496539
I did some experiments and it doesn't feel different than the old one, at all.
>>
>>101496369
It's an open source model. Just google "Fish Audio".
>>
>>101496504
Better than Llama 3 70B.
>>
>>101496852
Hardly an achievement.
>>
>>101496504
Best model smaller than 70B. The context is super nice.
>>
>>101496937
For RP / creative writing I mean btw. Gemma 27B is a lot smarter but too dry. Nu-mistral has soul.
>>
I used to think Chub was degenerate, but every time I assume that it can't possibly get any more sick, somehow it still manages to surprise me. It's made me realise that that's why the /pol/tards want to take over society; to get rid of that stuff.

Hard degen cards are pretty pointless though now, because there are virtually no recent models that will run them authentically.
>>
File: llama405.png (33 KB, 527x446)
apologize
>>
>>101496852
Why does Meta suck so hard?
>>
>>101497072
So is the current HF repo with 8k context fake?
>>
>>101497072
3.1 128K context

YES
>>
send help I can't stop making degen shit
Qwen2-72B-Instruct-Q5_K_M

oh look it's been 0.01s since I last genned deepthroat smut
>>
retards
>>
>>101497148
How do you freaks get off to that stuff? To me that's just boring. I could barely tell where the erotic material even was, in amongst all the purple prose.
>>
>>101497205
ask my dick
>>
>>101497144
Cool, when are we getting 70b bitnet with 128k context though?
>>
>>101497246
>>101497246
>>101497246
>>
File: GS8TMEfbIAUkJ23.jpg (104 KB, 556x1005)
>>101496965
I mean these people do not even hide it. They are proud of it and signal it to the world. I am not a /pol/tard, but I would not mind if these people were purged.
>>
>>101497298
Retard
>>
>>101496965
Really? Haven’t the worst offenders stopped bot making entirely?
>>
>>101497148
More.


