/g/ - /lmg/ - Local Models General - Technology

[a / b / c / d / e / f / g / gif / h / hr / k / m / o / p / r / s / t / u / v / vg / vm / vmg / vr / vrpg / vst / w / wg] [i / ic] [r9k / s4s / vip / qa] [cm / hm / lgbt / y] [3 / aco / adv / an / bant / biz / cgl / ck / co / diy / fa / fit / gd / hc / his / int / jp / lit / mlp / mu / n / news / out / po / pol / pw / qst / sci / soc / sp / tg / toy / trv / tv / vp / vt / wsg / wsr / x / xs] [Settings] [Search] [Mobile] [Home]

Board

▼ Settings Mobile Home

/g/ - Technology

Return Catalog Bottom Refresh

Thread archived.
You cannot reply anymore.

[Advertise on 4chan]

[Return] [Catalog] [Bottom]

Anonymous

/lmg/ - Local Models General 09/18/24(Wed)20:25:24 No.102449993

File: miku_put_the_piece_away_t(...).jpg (728 KB, 2048x2048)

728 KB JPG

/lmg/ - Local Models General Anonymous 09/18/24(Wed)20:25:24 No.102449993 Archived

/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>102444258 & >>102434744

►News
>(09/18) Microsoft releases 16x3.8B GRadient-INformed MoE: https://github.com/microsoft/GRIN-MoE
>(09/18) Qwen 2.5 released, trained on 18 trillion token dataset: https://qwenlm.github.io/blog/qwen2.5/
>(09/18) Llama 8B quantized to b1.58 through finetuning: https://hf.co/blog/1_58_llm_extreme_quantization
>(09/17) Mistral releases new 22B with 128k context and function calling: https://mistral.ai/news/september-24-release/
>(09/12) DataGemma with DataCommons retrieval: https://blog.google/technology/ai/google-datagemma-ai-llm

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Programming: https://hf.co/spaces/mike-ravkine/can-ai-code-results

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp

Anonymous
09/18/24(Wed)20:25:44 No.102450000

Anonymous 09/18/24(Wed)20:25:44 No.102450000

File: img_42.jpg (116 KB, 512x512)

116 KB JPG

►Recent Highlights from the Previous Thread: >>102444258

--Formatting for user-only and AI-only comments in Sillytavern, and using scenario override and macros for dynamic objectives and story beats: >>102446780 >>102446919 >>102446987 >>102447000 >>102447183
--Discussion on China's AI progress, benchmark performance, and the debate over 0-shot vs. CoT reasoning models: >>102444396 >>102444538 >>102444660 >>102444692 >>102444742 >>102444759 >>102444770 >>102444804 >>102444821 >>102444836 >>102444838 >>102444872
--Discussion on AI's ability to solve competition math problems: >>102445940 >>102446192 >>102446236 >>102446329
--WonderWorld GitHub repo still a README, despite paper promises: >>102445600 >>102445615 >>102445829
--Llama.cpp developer discusses requirements for adding new samplers: >>102444940 >>102445039 >>102445343 >>102445389 >>102445605 >>102445635 >>102445695 >>102445865 >>102445875
--GRIN MoE performance compared to other models: >>102449377 >>102449398 >>102449413 >>102449470 >>102449494
--Mistral-Nemo-Instruct-2407 outperforms Mistral-small in JP translation: >>102447665
--Hugging Face's extreme quantization allows for 1.58bit LLMs, but with some performance drop: >>102444560 >>102444608 >>102444653 >>102444655
--Qwen 2.5 72B performs well for ERP up to t=1.7: >>102444722 >>102444794 >>102444860 >>102444930 >>102445625
--Nemotron-Mini-4B-Instruct Nala test and model background: >>102449523 >>102449588
--Nala and Qwen2.5-Math-72B-Instruct experiment results in incomprehensible output: >>102445823 >>102445874
--Discussion on implementing CoT locally, with challenges and potential solutions: >>102446846 >>102446905 >>102447043 >>102447122 >>102447062
--Anon tests temperature settings for generating responses and evaluates NALA score: >>102445088
--Advice for optimizing local AI model testing with limited GPU: >>102447333 >>102447389 >>102447783
--Miku (free space): >>102444310 >>102446044

►Recent Highlight Posts from the Previous Thread: >>102444269

Anonymous
09/18/24(Wed)20:29:05 No.102450040

Anonymous 09/18/24(Wed)20:29:05 No.102450040

File: nala grin-moe.png (119 KB, 942x336)

119 KB PNG

So GRIN-MoE is incapable of actually responding to overly-structured promps such as sillytavern when using the Phi formatting but if you use an alpaca-template it will RP. But given that it's trained on the Phi data and the Phi pretraining data is devoid of any smut it probably won't actually go very far with a proper NSFW RP.

Anonymous
09/18/24(Wed)20:30:18 No.102450058

Anonymous 09/18/24(Wed)20:30:18 No.102450058

>>102450040
>won't actually go very far with a proper NSFW RP
Does this mean she will finally eat you like a real lion should?

Anonymous
09/18/24(Wed)20:31:26 No.102450076

Anonymous 09/18/24(Wed)20:31:26 No.102450076

>>102449897
>>102449874
I wish death upon all mikufaggots and want /lmg/ to die but I think it is just people too busy cooming to mistral small and qwen.

Anonymous
09/18/24(Wed)20:32:24 No.102450088

Anonymous 09/18/24(Wed)20:32:24 No.102450088

>>102450058
Maybe. This could be a major win for vorefags.

Anonymous
09/18/24(Wed)20:33:42 No.102450104

Anonymous 09/18/24(Wed)20:33:42 No.102450104

File: file.png (55 KB, 998x518)

55 KB PNG

Qwen2.5 72B is now definitely the best local model under 100B parameters for JP>EN translation. Nice!

Anonymous
09/18/24(Wed)20:36:22 No.102450135

Anonymous 09/18/24(Wed)20:36:22 No.102450135

>>102447333
here

>>102447389
>>102447449
you're right, I forgot speccy was stupid as shit -- GPU-Z is reporting correctly 8192MB vram

I did get a "Mistral-Nemo-Instruct-2407-IQ4_XS.gguf" loaded (other ones I found didn't load up right) with the Mistral presets in sillytavern

I had been gunning for the Q4 when possible, and leaving out the "--quantkv 2 --flashattention" args. Right now my CLI looks like:
call bin-kobold\koboldcpp_cu12.exe --model "N:\IGGER\F\A\I\Mistral-Nemo-Instruct-2407-IQ4_XS.gguf" --contextsize 12288 --threads 7 --blasthreads 14 --usecublas normal 0 1 --gpulayers -1  --blasbatchsize 512 --highpriority --foreground --skiplauncher --nommap --usemlock --onready "SillyTavern.bat" %*

Anonymous
09/18/24(Wed)20:36:43 No.102450139

Anonymous 09/18/24(Wed)20:36:43 No.102450139

Trying to load magnum-12b-v2.5-kto-Q5_K_M.gguf gives
>raise ValueError("Failed to create llama_context")

low vram or what?

Anonymous
09/18/24(Wed)20:40:48 No.102450175

Anonymous 09/18/24(Wed)20:40:48 No.102450175

>>102449993
>grin moe
neat, new toys to play around with
>4k context
AAAAAAAAAAAAAA AT LEAST MAKE IT 8K FFS YOU FOOLS!

Anonymous
09/18/24(Wed)20:40:54 No.102450178

Anonymous 09/18/24(Wed)20:40:54 No.102450178

>>102450104
We're so back.
>still doesn't know what mesugaki means according to >>102446773
It's so over.

Anonymous
09/18/24(Wed)20:43:01 No.102450198

Anonymous 09/18/24(Wed)20:43:01 No.102450198

>>102450178
>he cares about animu shit knowledge
Pathetic.

Anonymous
09/18/24(Wed)20:44:07 No.102450207

Anonymous 09/18/24(Wed)20:44:07 No.102450207

>>102450175
rope yourself

Anonymous
09/18/24(Wed)20:44:47 No.102450211

Anonymous 09/18/24(Wed)20:44:47 No.102450211

File: file.png (8 KB, 576x61)

8 KB PNG

>>102450104
I liked seeing that it got most of the 'tricky' translations right, like picrel. But Qwen2.0 wasn’t bad at that either, so it doesn’t seem like a huge upgrade over it, as can be seen on the leaderboard.

Anonymous
09/18/24(Wed)20:44:58 No.102450214

Anonymous 09/18/24(Wed)20:44:58 No.102450214

>>102450207
I can assure you, (You) will rope yourself way before the rest of us will.

Anonymous
09/18/24(Wed)20:45:22 No.102450217

Anonymous 09/18/24(Wed)20:45:22 No.102450217

>crazy thursday is over
>thursday hasn't even come yet

Anonymous
09/18/24(Wed)20:47:00 No.102450234

Anonymous 09/18/24(Wed)20:47:00 No.102450234

>>102450214
>no u
Fried your brains huh?

Anonymous
09/18/24(Wed)20:47:22 No.102450238

Anonymous 09/18/24(Wed)20:47:22 No.102450238

>>102450139
what are you trying to load it with?
try koboldcpp if you're not already

Anonymous
09/18/24(Wed)20:49:15 No.102450255

Anonymous 09/18/24(Wed)20:49:15 No.102450255

>>102450238
using oobabooga. I'll give koboldcpp a shot

Anonymous
09/18/24(Wed)20:50:10 No.102450260

Anonymous 09/18/24(Wed)20:50:10 No.102450260

>>102450198
Yes because it's an indicator, just like the Castlevania question. If they trained on 18T and none contained such knowledge, it means their filtering was very strong and likely many more types of knowledge have also been filtered away, so much so that even 18T doesn't help it. This would be a strong indicator that the model, like the last one, has very little cultural knowledge in general. Might also be bad at more niche RP that isn't the cookie cutter shit. And probably will be bad as well for assistant tasks when it involves more niche subjects.

Anonymous
09/18/24(Wed)20:50:11 No.102450261

Anonymous 09/18/24(Wed)20:50:11 No.102450261

>He uses ooba in the year of our lord 2020+4
Anon, I...

Anonymous
09/18/24(Wed)20:51:58 No.102450272

Anonymous 09/18/24(Wed)20:51:58 No.102450272

>>102450261
First day wrangling with this. I thought it was good enough to start with

Anonymous
09/18/24(Wed)20:52:26 No.102450274

Anonymous 09/18/24(Wed)20:52:26 No.102450274

>>102450175
So far in testing it seems to completely break apart at 2400 tokens of context. So it can't even do 4K, at least not with extensive back and forth.

Anonymous
09/18/24(Wed)20:52:47 No.102450279

Anonymous 09/18/24(Wed)20:52:47 No.102450279

>>102450272
You thought wrong.

Anonymous
09/18/24(Wed)20:55:02 No.102450307

Anonymous 09/18/24(Wed)20:55:02 No.102450307

>>102450274
>Already breaks under 2500 tokens
How the fuck? I get it that not every model can or should be 16k+ in their first iterations, but holy crap.
>>102450272
ooba is just like the webui thing for image AI, they work, but there is far better alternatives available.

Anonymous
09/18/24(Wed)20:55:57 No.102450319

Anonymous 09/18/24(Wed)20:55:57 No.102450319

>>102450261
It's fine, fuck SillyTavern's bloat and Kobold's jank.

Anonymous
09/18/24(Wed)20:59:14 No.102450349

Anonymous 09/18/24(Wed)20:59:14 No.102450349

File: file.png (79 KB, 859x607)

79 KB PNG

>>102450178
yikes. I definitely wouldn't recommend this model for learning Japanese.

Anonymous
09/18/24(Wed)20:59:25 No.102450351

Anonymous 09/18/24(Wed)20:59:25 No.102450351

>>102450307
Sometimes at 2K even, jeez.
But at the same time.. It's actually good for certain scenarios (quasi-reluctant but consenting character who has conflicted emotions about the situation) which is frustrating, dealing with Llama-1 context window.

Anonymous
09/18/24(Wed)21:00:58 No.102450366

Anonymous 09/18/24(Wed)21:00:58 No.102450366

>>102450349
kimesu no yaiba

Anonymous
09/18/24(Wed)21:03:35 No.102450395

Anonymous 09/18/24(Wed)21:03:35 No.102450395

File: 1725922368500279.jpg (649 KB, 2384x1808)

649 KB JPG

>>102450000
Checked and ty for your service, recap bot

Anonymous
09/18/24(Wed)21:21:08 No.102450557

Anonymous 09/18/24(Wed)21:21:08 No.102450557

>>102450319
ooba cuck

Anonymous
09/18/24(Wed)21:22:03 No.102450567

Anonymous 09/18/24(Wed)21:22:03 No.102450567

File: ooba.png (103 KB, 845x817)

103 KB PNG

>>102450319
what's worth using from these?

Anonymous
09/18/24(Wed)22:01:29 No.102450877

Anonymous 09/18/24(Wed)22:01:29 No.102450877

File: chuck-e-cheese-okay.gif (3.29 MB, 640x640)

3.29 MB GIF

please someone using TTS point me in the right direction here, have tried several TTS packages from github, most of them couldnt even install a working venv, the working one OOM'd when it tried getting RVC working (and couldnt generate its own json config?)

Anonymous
09/18/24(Wed)22:05:48 No.102450910

Anonymous 09/18/24(Wed)22:05:48 No.102450910

Does anyone else ever download a model and fire it up solely for testing purposes but then end up unintentionally having a 3 hour long goon sesh with it?

Anonymous
09/18/24(Wed)22:08:24 No.102450932

Anonymous 09/18/24(Wed)22:08:24 No.102450932

>>102450910
>goon sesh
no, no I don't, zoomer nigger faggot.

Anonymous
09/18/24(Wed)22:08:53 No.102450938

Anonymous 09/18/24(Wed)22:08:53 No.102450938

File: No fun allowed.png (298 KB, 745x745)

298 KB PNG

When we finally get out AI's smart enough as well consumer grade robotic bodies, do you think the government will try to hardcode the AI's to never touch a gun or cause human harm? Or do you think they will take a more utilitarian approach and try to hardcode the AI's to kill a human who if not taken down will kill 10 others?
Lord knows the government won't just let it be without restrictions, the government hates any form of fun.

Anonymous
09/18/24(Wed)22:10:43 No.102450962

Anonymous 09/18/24(Wed)22:10:43 No.102450962

>>102450938
Gov loves anything that gives them more power, which means robots that become their willing soldiers without scruples that will never doubt their commands. Think Star Wars Episode 3, but with machine robots instead of fleshy robots.

Anonymous
09/18/24(Wed)22:11:15 No.102450969

Anonymous 09/18/24(Wed)22:11:15 No.102450969

>>102450910
I always test a model for a long time before deciding if it's better than what I was using before.
Nowdays it's rare for a model to be so shit from the get go that I discard it quickly.
I think ever since mixtral 8x7b, things have been in a pretty good state in general.

Anonymous
09/18/24(Wed)22:14:20 No.102451005

Anonymous 09/18/24(Wed)22:14:20 No.102451005

>>102450910
No this has not happened to me because no open source model has ever been good enough to cause that. Maybe some day.

Anonymous
09/18/24(Wed)22:15:18 No.102451015

Anonymous 09/18/24(Wed)22:15:18 No.102451015

File: Super battle droid.jpg (219 KB, 1920x1079)

219 KB JPG

>>102450962
I hope the Gov lets me own a Super Battle Droid for home defense.

Anonymous
09/18/24(Wed)22:20:35 No.102451067

Anonymous 09/18/24(Wed)22:20:35 No.102451067

I just tried Grin-MoE and the model is indeed very retarded, OP please take it out from the OP next thread, it's definitely not worthy to be there.

Anonymous
09/18/24(Wed)22:21:12 No.102451075

Anonymous 09/18/24(Wed)22:21:12 No.102451075

>>102451015
>a Super Battle Droid
one (1) B1 battle droid is all you need

Anonymous
09/18/24(Wed)22:24:17 No.102451104

Anonymous 09/18/24(Wed)22:24:17 No.102451104

>>102451015
>>102451075
>I'm equipping my animatronic OC girls who have massive ballistics with real (real) ballistics and they'll be trained in CQC too
get the best bang(arang) for your buck there

Anonymous
09/18/24(Wed)22:26:46 No.102451132

Anonymous 09/18/24(Wed)22:26:46 No.102451132

>>102451075
A B1 battle droid? Why do you need a B1 battle droid? Citizens should only be able to own protocol droids and not droids made for war.

Anonymous
09/18/24(Wed)22:29:26 No.102451160

Anonymous 09/18/24(Wed)22:29:26 No.102451160

>>102451075
B1s are fucking trash, literal paper weight for mass production. What I want is a B2 for home defense, something study that can survive a hit.

Anonymous
09/18/24(Wed)22:36:25 No.102451215

Anonymous 09/18/24(Wed)22:36:25 No.102451215

>>102451067
It's baffling that people still pay attention to Microsoft model releases, they have by far the biggest delta between benchmark results and actual model performance of any of the big companies

It's extremely embarrassing tbdesu, Microsoft is a rich megacorp with all the resources in the world, but their AI guys are shamelessly gaming benchmarks in a way you only usually see from shitty startup grifters

Anonymous
09/18/24(Wed)22:38:18 No.102451228

Anonymous 09/18/24(Wed)22:38:18 No.102451228

>>102450104
Even the top model on that list can't translate things right, so it's hopeless.

Anonymous
09/18/24(Wed)22:40:42 No.102451259

Anonymous 09/18/24(Wed)22:40:42 No.102451259

>>102451215
My guess? Microsoft is simply playing their extremely retarded shareholders (comes with being a shareholder), for which the simple method of playing benchmarks is extremely effective at. Higher number = better = worth investing.

Anonymous
09/18/24(Wed)22:42:15 No.102451272

Anonymous 09/18/24(Wed)22:42:15 No.102451272

File: Untitled.png (1.01 MB, 1080x2895)

1.01 MB PNG

A Controlled Study on Long Context Extension and Generalization in LLMs
https://arxiv.org/abs/2409.12181
>Broad textual understanding and in-context learning require language models that utilize full document contexts. Due to the implementation challenges associated with directly training long-context models, many methods have been proposed for extending models to handle long contexts. However, owing to differences in data and model classes, it has been challenging to compare these approaches, leading to uncertainty as to how to evaluate long-context performance and whether it differs from standard evaluation. We implement a controlled protocol for extension methods with a standardized evaluation, utilizing consistent base models and extension data. Our study yields several insights into long-context behavior. First, we reaffirm the critical role of perplexity as a general-purpose performance indicator even in longer-context tasks. Second, we find that current approximate attention methods systematically underperform across long-context tasks. Finally, we confirm that exact fine-tuning based methods are generally effective within the range of their extension, whereas extrapolation remains challenging. All codebases, models, and checkpoints will be made available open-source, promoting transparency and facilitating further research in this critical area of AI development.
https://github.com/Leooyii/LCEG
Nice to see all the methods finally tested against each other in a controlled manner

Anonymous
09/18/24(Wed)22:43:09 No.102451282

Anonymous 09/18/24(Wed)22:43:09 No.102451282

>>102451215
Microsoft has long since lost the ability to make anything truly good. As a company they are coasting solely on the momentum they gained in their earlier years.

Anonymous
09/18/24(Wed)22:43:32 No.102451283

Anonymous 09/18/24(Wed)22:43:32 No.102451283

>>102451259
>Microsoft is simply playing their extremely retarded shareholders
I would hate being a boss of a company, when I was growing up I was brainwashed on thinking the boss was the king, but it turns out it's the most desperate place, you always have to suck the shareholders's dick to survive, that's fucking depressing

Anonymous
09/18/24(Wed)22:44:38 No.102451293

Anonymous 09/18/24(Wed)22:44:38 No.102451293

>>102451283
You just gotta become a shareholder then, so people are sucking your dick and you are a king.

Anonymous
09/18/24(Wed)22:45:18 No.102451298

Anonymous 09/18/24(Wed)22:45:18 No.102451298

>>102451283
Silly people always think that being the boss of a huge company like Microsoft would be cool, only ever thinking of the money, fame and what have you. But in reality it's one of the worst jobs you can have, which might explain why CEOs of huge corps like this tend to be cunts, just making the best with what they have and continue to gain. Corruption at it's finest. The moment a company becomes public it eventually dies, no matter what.

Anonymous
09/18/24(Wed)22:45:19 No.102451299

Anonymous 09/18/24(Wed)22:45:19 No.102451299

>>102451293
to be a shareholder you need to be rich though so... unless you're the son of a richfag, you gotta start somewhere kek

Anonymous
09/18/24(Wed)22:48:43 No.102451324

Anonymous 09/18/24(Wed)22:48:43 No.102451324

>>102450910
I don’t understand the concept of testing a model without masturbating.

Anonymous
09/18/24(Wed)22:49:47 No.102451332

Anonymous 09/18/24(Wed)22:49:47 No.102451332

>>102451160
Actually just need a B4 and gentle parenting

Anonymous
09/18/24(Wed)22:49:51 No.102451334

Anonymous 09/18/24(Wed)22:49:51 No.102451334

>>102451298
>which might explain why CEOs of huge corps like this tend to be cunts, just making the best with what they have and continue to gain.
maybe the opposite is true, to be a CEO you must be a fucking cunt that has no problem selling your soul to the devil or some shit

Anonymous
09/18/24(Wed)22:49:55 No.102451336

Anonymous 09/18/24(Wed)22:49:55 No.102451336

>>102449993
all right it's fucking zero. are you happy you crazy fuck?

Anonymous
09/18/24(Wed)22:50:01 No.102451339

Anonymous 09/18/24(Wed)22:50:01 No.102451339

>>102450104
ty for updating

Anonymous
09/18/24(Wed)22:51:06 No.102451353

Anonymous 09/18/24(Wed)22:51:06 No.102451353

I hope they figured out immortality in my life time, so I can finally utilize my hoarded wealth rather then passing it on to my kids and dabbing on the poor for good measure.

Anonymous
09/18/24(Wed)22:51:15 No.102451354

Anonymous 09/18/24(Wed)22:51:15 No.102451354

>>102451334
Good point, it's likely a mix of both if you ask me. To make a fuck ton of money you either gotta be lucky (which later turns into corruption/greed), or be a fucker from the get go that only cares for money, fame yadayada.

Anonymous
09/18/24(Wed)22:52:12 No.102451366

Anonymous 09/18/24(Wed)22:52:12 No.102451366

>>102451283
Work a job, have 1 boss
Be the boss, have 8 bosses
Own the company, have a million bosses, and the ones you talk to most are Reddit and discord mods
Hell on earth

Anonymous
09/18/24(Wed)22:52:51 No.102451370

Anonymous 09/18/24(Wed)22:52:51 No.102451370

>>102451353
>I hope they figured out immortality in my life time
I can feel they'll find this shit at the moment I'll be a old fart, so there wouldn't be any point

Anonymous
09/18/24(Wed)22:53:53 No.102451377

Anonymous 09/18/24(Wed)22:53:53 No.102451377

>>102451366
>Work a job, have 1 boss
>Be the boss, have 8 bosses
depends, if you're a big manager yeah you only have the CEO to bother you, but if you're a simple employee...
https://www.youtube.com/watch?v=3wqQXu13tLA

Anonymous
09/18/24(Wed)22:55:43 No.102451389

Anonymous 09/18/24(Wed)22:55:43 No.102451389

>>102451377
I swear people misunderstand on purpose just to have something to disagree about
“Big manager” != “THE boss”

Anonymous
09/18/24(Wed)22:59:42 No.102451423

Anonymous 09/18/24(Wed)22:59:42 No.102451423

Now that the dust has settled was >>101516633
Llama 3 405b
any good for coom? daily tasks even?
hello? (hello?)

Anonymous
09/18/24(Wed)23:00:38 No.102451435

Anonymous 09/18/24(Wed)23:00:38 No.102451435

Didn't check in these threads for a few months, have vramlets anything better than nemo yet?

Anonymous
09/18/24(Wed)23:03:23 No.102451459

Anonymous 09/18/24(Wed)23:03:23 No.102451459

>>102450261
Is it really that bad?

Anonymous
09/18/24(Wed)23:05:45 No.102451479

Anonymous 09/18/24(Wed)23:05:45 No.102451479

>>102451435
mistral just brought out a 22b like yesterday

Anonymous
09/18/24(Wed)23:07:20 No.102451490

Anonymous 09/18/24(Wed)23:07:20 No.102451490

>>102451459
it worksTM

Anonymous
09/18/24(Wed)23:07:21 No.102451491

Anonymous 09/18/24(Wed)23:07:21 No.102451491

>>102451423
>refuses to be used for sex
>no creativity, just 19th century YA slop writing
>can’t fucking code for shit
>can’t into geometry
Llama3(.1) was an embarrassing benchhacked abortion and Zuckerberg is once again a lizard person until further notice

Anonymous
09/18/24(Wed)23:07:33 No.102451494

Anonymous 09/18/24(Wed)23:07:33 No.102451494

https://huggingface.co/teto3/mistral-nemo-storywriter-12b-240918
Since it's trained on base I don't think it will be any good for RP. I'll keep working on the dataset while looking for cheap gpu rents to train largestral. Will probably do mistral small before that as well.

Anonymous
09/18/24(Wed)23:08:21 No.102451500

Anonymous 09/18/24(Wed)23:08:21 No.102451500

>>102451459
No it’s just shills for the conglomerate

Anonymous
09/18/24(Wed)23:10:57 No.102451524

Anonymous 09/18/24(Wed)23:10:57 No.102451524

>>102451500
Ah yes the conglomerate of big llama

Anonymous
09/18/24(Wed)23:11:56 No.102451529

Anonymous 09/18/24(Wed)23:11:56 No.102451529

>>102451479
Is it better?

Anonymous
09/18/24(Wed)23:12:10 No.102451532

Anonymous 09/18/24(Wed)23:12:10 No.102451532

///BAD NEWS!!!///
I've tried making Q6_K_L quants myself and it appears that llama-quantize is broken big time! (At least for some models)
>--leave-output-tensor, --output-tensor-type and --token-embedding-type don't work on windows at ALL for some reason
>on linux having --output-tensor-type and/or --token-embedding-type produces the SAME gguf(checked shasum) for gemma, which isn't right
>for small old mistral(7b) they appear to be working correctly on linux
CUDAdev, please verify and inform ggerganov. There may be more models where those options are broken.

Anonymous
09/18/24(Wed)23:13:08 No.102451540

Anonymous 09/18/24(Wed)23:13:08 No.102451540

>>102451529
try it yourself

Anonymous
09/18/24(Wed)23:13:15 No.102451542

Anonymous 09/18/24(Wed)23:13:15 No.102451542

oh no...
anyway

Anonymous
09/18/24(Wed)23:13:54 No.102451547

Anonymous 09/18/24(Wed)23:13:54 No.102451547

>>102451540
I'll just assume it's not.

Anonymous
09/18/24(Wed)23:14:15 No.102451550

Anonymous 09/18/24(Wed)23:14:15 No.102451550

>>102451547
you do that

Anonymous
09/18/24(Wed)23:14:35 No.102451556

Anonymous 09/18/24(Wed)23:14:35 No.102451556

>>102451547
nothing is good or better compared to claude or gpt
there is your answer nitwit

Anonymous
09/18/24(Wed)23:14:54 No.102451560

Anonymous 09/18/24(Wed)23:14:54 No.102451560

>>102451324
I just came to arcanum 12b a few minutes ago and am now having a smoke

Anonymous
09/18/24(Wed)23:16:46 No.102451579

Anonymous 09/18/24(Wed)23:16:46 No.102451579

File: glamrock freddy checks th(...).gif (544 KB, 220x220)

544 KB GIF

>>102451532
>((BAD NEWS)) in ((NIGGERGANOV)) world once again

Anonymous
09/18/24(Wed)23:17:11 No.102451583

Anonymous 09/18/24(Wed)23:17:11 No.102451583

>>102451524
Once AGPL is violated for money all bets are off
The only non shill ui recommendation is building your own

Anonymous
09/18/24(Wed)23:19:09 No.102451598

Anonymous 09/18/24(Wed)23:19:09 No.102451598

>>102451532
>windows
User error
>>102451560
You joke but ever since my crippling addiction began I’ve been genuinely afraid that I was stroking out from peaking too hard/often at least a dozen times

Anonymous
09/18/24(Wed)23:19:39 No.102451604

Anonymous 09/18/24(Wed)23:19:39 No.102451604

>>102451583
So we agree that every solution is trash? Gotcha

Anonymous
09/18/24(Wed)23:21:23 No.102451623

Anonymous 09/18/24(Wed)23:21:23 No.102451623

File: file.png (755 KB, 628x767)

755 KB PNG

Anonymous
09/18/24(Wed)23:23:45 No.102451640

Anonymous 09/18/24(Wed)23:23:45 No.102451640

>>102451623
Pochi = pure sex

Anonymous
09/18/24(Wed)23:23:56 No.102451643

Anonymous 09/18/24(Wed)23:23:56 No.102451643

>>102451623
Those ears look like catacombs

Anonymous
09/18/24(Wed)23:24:21 No.102451650

Anonymous 09/18/24(Wed)23:24:21 No.102451650

>>102451491
Thanks anon,

But yeah just gathering info
I saw the miqumaxx guide and it said "For 405b class dense models you need more than 424GB+ to even run at a non-braindead quant+context"

Im soon to 'miqumaxx' for myself and test a few things at Q8_0 or Q6_K quant (in december) eg (maybe not) H*rmes-3-L*ama-3.1-405B-GGUF
and report back

Anonymous
09/18/24(Wed)23:25:19 No.102451657

Anonymous 09/18/24(Wed)23:25:19 No.102451657

>>102451532
Huh. I remember testing output and embedding layer quantization a while ago and encountered an issue where it wasn't recognizing the Q8_0 name. Turns out for some reason the case mattered and you had to use "q8_0". Probably a bug. I couldn't report it due being banned from github and being too lazy to look into why + fixing it.

Here's an example of a full command I used to quant models. Maybe it still works this way.

./llama-quantize --allow-requantize --imatrix path_to_imatrix.dat --output-tensor-type q8_0 --token-embedding-type q8_0 path_to_model_folder/model_name-IQ2_M_EOQ8_0.gguf IQ2_M

Anonymous
09/18/24(Wed)23:25:25 No.102451658

Anonymous 09/18/24(Wed)23:25:25 No.102451658

>>102451623
retard poster

Anonymous
09/18/24(Wed)23:25:52 No.102451662

Anonymous 09/18/24(Wed)23:25:52 No.102451662

>>102451643
The funniest part of this was when I noticed how shitty the ears are and went back to the training material see how she actually draws them and... yeah she can't draw ears for shit but you just never look at them.

Anonymous
09/18/24(Wed)23:27:06 No.102451675

Anonymous 09/18/24(Wed)23:27:06 No.102451675

File: __pochimaru_indie_virtual(...).jpg (780 KB, 1427x1800)

780 KB JPG

>>102451623
Why is it so damn low res though? I thought local gen could easily do 1024 and beyond now days?

Anonymous
09/18/24(Wed)23:27:10 No.102451676

Anonymous 09/18/24(Wed)23:27:10 No.102451676

>>102451657
>>102451532
Also btw you should see in the console as you quantize a model, what quant type it is using for each layer. You can see there directly if the output and embedding layers are not following the quant type you set.

Anonymous
09/18/24(Wed)23:29:22 No.102451693

Anonymous 09/18/24(Wed)23:29:22 No.102451693

>>102450877
fish-speech is alright, though you'll need to process results to eliminate occasional lengthy pauses. I have the solution, yet I'm hesitant to submit a pull request as it's more of a workaround than an actual fix.

Anonymous
09/18/24(Wed)23:32:53 No.102451720

Anonymous 09/18/24(Wed)23:32:53 No.102451720

>>102451657
>>102451532
Actually wait sorry I forgot to put the bf16 model path in the command.

./llama-quantize --allow-requantize --imatrix path_to_imatrix.dat --output-tensor-type q8_0 --token-embedding-type q8_0 path_to_model_folder/BF16_model.gguf path_to_model_folder/model_name-IQ2_M_EOQ8_0.gguf IQ2_M

This should be correct.

Anonymous
09/18/24(Wed)23:38:22 No.102451746

Anonymous 09/18/24(Wed)23:38:22 No.102451746

>>102451693
got a link to a simple launch.bat frontend thingy i can use?

Anonymous
09/18/24(Wed)23:47:35 No.102451802

Anonymous 09/18/24(Wed)23:47:35 No.102451802

13 minutes left... qwen will drop the 199b...

Anonymous
09/18/24(Wed)23:51:11 No.102451820

Anonymous 09/18/24(Wed)23:51:11 No.102451820

File: 1726685429541087.png (463 KB, 512x760)

463 KB PNG

>>102451746
For tts? No, prepare to get rect'd by the python library hell. All sound-related ML projects are absolutely not user-friendly. You'll need experience with conda/venv and be prepared to edit pyproject.toml and manually resolve dependencies.

Anonymous
09/18/24(Wed)23:53:08 No.102451835

Anonymous 09/18/24(Wed)23:53:08 No.102451835

>>102451650
>hermemes
Worse than base instruct

Anonymous
09/18/24(Wed)23:53:51 No.102451841

Anonymous 09/18/24(Wed)23:53:51 No.102451841

>>102451820
How come that sound related stuff is so very user unfriendly, while not just img but text projects all have varies of friendly options available?
I know that at least one music model can be used through comfy, but that's about it.

Anonymous
09/18/24(Wed)23:54:22 No.102451843

Anonymous 09/18/24(Wed)23:54:22 No.102451843

>>102451835
Thanks for info, Im shitposting until I get my parts

Anonymous
09/18/24(Wed)23:58:49 No.102451868

Anonymous 09/18/24(Wed)23:58:49 No.102451868

>>102451490
This thing can't use GPTQ. Is GGUF really worth using with such slow gen times? Maybe I have brainrot.

Anonymous
09/19/24(Thu)00:00:01 No.102451870

Anonymous 09/19/24(Thu)00:00:01 No.102451870

File: 1726685983199582.png (468 KB, 512x760)

468 KB PNG

>>102451841
Who knows. Perhaps only sociopaths find this field appealing, or maybe engaging with it leads one down that path.

Anonymous
09/19/24(Thu)00:02:29 No.102451883

Anonymous 09/19/24(Thu)00:02:29 No.102451883

>>102451870
My idea is that there hasn't been many worthwhile models/systems around it so far, so creating a user friendly interface has been out of mind for anything able to... Or you're right and everyone into this stuff simply goes insane for one reason or another.
>>102451868
Isn't GGUF plenty fast? How much faster can GPTQ really be? I only ever hear people argue between GGUF and exllama 2 or whatever also, but I'm far from an expert.

Anonymous
09/19/24(Thu)00:03:10 No.102451888

Anonymous 09/19/24(Thu)00:03:10 No.102451888

>>102451550
Actually, I think I'll give it a try with my Kyou card.

Anonymous
09/19/24(Thu)00:20:24 No.102451985

Anonymous 09/19/24(Thu)00:20:24 No.102451985

today is gonna be crazy

Anonymous
09/19/24(Thu)00:24:08 No.102452004

Anonymous 09/19/24(Thu)00:24:08 No.102452004

>>102451985
why? explain.

Anonymous
09/19/24(Thu)00:34:29 No.102452077

Anonymous 09/19/24(Thu)00:34:29 No.102452077

>>102451675
It can, it's just slower and if you just wanna goon then I guess lower resolutions work just as well

Anonymous
09/19/24(Thu)00:40:35 No.102452123

Anonymous 09/19/24(Thu)00:40:35 No.102452123

>>102452077
>It's just slower
Yeah, no shit, but it can also result in better images, depending on your model. At least that's something I noticed back with SD 1.5 and the like when I still fucked around with imggen, surely we're way past that point with XL and 2.0, or whatever the most recent meme model is everyone ""tunes"".

Anonymous
09/19/24(Thu)00:40:43 No.102452126

Anonymous 09/19/24(Thu)00:40:43 No.102452126

>>102451494
Cool, I'll try it out when I boot up the box later tonight.

Anonymous
09/19/24(Thu)00:43:15 No.102452149

Anonymous 09/19/24(Thu)00:43:15 No.102452149

someone try genning some explicit porn loops through here and share results so i can know if it works later for early morning jack off sessions thanks
https://huggingface.co/spaces/THUDM/CogVideoX-5B-Space

Anonymous
09/19/24(Thu)00:46:18 No.102452170

Anonymous 09/19/24(Thu)00:46:18 No.102452170

>>102452149
haha lol this is the wrong general my bad

done got my dead generals mixed up again doink

Anonymous
09/19/24(Thu)00:48:28 No.102452182

Anonymous 09/19/24(Thu)00:48:28 No.102452182

>GRIN performs well overall on mememarks despite failing coding and translation tasks
Intredasting, maybe I'll take a shot at it.

Anonymous
09/19/24(Thu)00:51:57 No.102452211

Anonymous 09/19/24(Thu)00:51:57 No.102452211

>>102451870
>shirt almost transparent
SEX

Anonymous
09/19/24(Thu)00:54:15 No.102452236

Anonymous 09/19/24(Thu)00:54:15 No.102452236

>>102451494
Storywriter was one of my favorite models (esp when added to Tenyx back in the day) please keep up the good work anon.

Anonymous
09/19/24(Thu)01:03:36 No.102452292

Anonymous 09/19/24(Thu)01:03:36 No.102452292

Does silly have xtc yet on staging or do I still need to use the xtc-obba branch?

Anonymous
09/19/24(Thu)01:13:14 No.102452360

Anonymous 09/19/24(Thu)01:13:14 No.102452360

i wholeheartedly believe anyone recommending l3 in any facet is trolling as hard as possible.

Anonymous
09/19/24(Thu)01:14:58 No.102452371

Anonymous 09/19/24(Thu)01:14:58 No.102452371

Are there any finetuners that actually manage to disable those safety disclaimers?

Anonymous
09/19/24(Thu)01:20:35 No.102452410

Anonymous 09/19/24(Thu)01:20:35 No.102452410

https://github.com/kyutai-labs/moshi/issues/51
>Yes that's expected, once it reaches the max cache size, conv will stop. That should match roughly 5 min of conv. We will try to expand that in the future.
lol. lmao, even.

Anonymous
09/19/24(Thu)01:30:16 No.102452459

Anonymous 09/19/24(Thu)01:30:16 No.102452459

>>102452410
>why no infinite cache
yeah lol at you fucking retard

Anonymous
09/19/24(Thu)01:40:23 No.102452505

Anonymous 09/19/24(Thu)01:40:23 No.102452505

>>102452459
>he doesn't understand

Anonymous
09/19/24(Thu)01:42:32 No.102452521

Anonymous 09/19/24(Thu)01:42:32 No.102452521

https://x.com/homebrewltd/status/1836356000191762480
nothingburger

Anonymous
09/19/24(Thu)01:53:30 No.102452604

Anonymous 09/19/24(Thu)01:53:30 No.102452604

>>102452521
isnt this already out as a accessible feature? Could anons already talk to their home-bots by voice right?

Anonymous
09/19/24(Thu)01:56:54 No.102452633

Anonymous 09/19/24(Thu)01:56:54 No.102452633

>>102452604
Yeah, it's still whisper + llama3 but with a fewer steps in between.

Anonymous
09/19/24(Thu)01:58:40 No.102452649

Anonymous 09/19/24(Thu)01:58:40 No.102452649

Is Mistral Small really worse than Nemo?
Is the new Qwen really worse than Mistral (any)?
I don't have time to test all these new models. Wish we had a reliable RP benchmark.

Anonymous
09/19/24(Thu)02:14:47 No.102452756

Anonymous 09/19/24(Thu)02:14:47 No.102452756

>>102452649
>Is Mistral Small really worse than Nemo?
Small is definitely smarter than Nemo and any Nemo finetunes, though not by a huge margin
(good) Nemo finetunes are better than Small for RP

Anonymous
09/19/24(Thu)02:19:17 No.102452778

Anonymous 09/19/24(Thu)02:19:17 No.102452778

>>102452756
>(good) Nemo finetunes
There are none, unironically.

Anonymous
09/19/24(Thu)02:21:00 No.102452790

Anonymous 09/19/24(Thu)02:21:00 No.102452790

>>102452778
I disagree
Admittedly the finetunes do generally get dumber than the originals they're based on, but for RP purposes they're more creative and dialog is less bland.

Anonymous
09/19/24(Thu)02:21:20 No.102452793

Anonymous 09/19/24(Thu)02:21:20 No.102452793

>>102451693
Make a draft PR.

Anonymous
09/19/24(Thu)02:22:37 No.102452801

Anonymous 09/19/24(Thu)02:22:37 No.102452801

>>102452236
Different anon fyi.

Anonymous
09/19/24(Thu)02:28:15 No.102452827

Anonymous 09/19/24(Thu)02:28:15 No.102452827

>>102452793
The whole approach is wrong. I detect and remove pauses in the final audio. Should be integrated into earlier stages.

Anonymous
09/19/24(Thu)02:29:28 No.102452834

Anonymous 09/19/24(Thu)02:29:28 No.102452834

>>102452790
What do you consider to be the better Nemo finetunes? I've tried Magnum v2 and NemoMix Unleashed and thought both of them were pretty good for the size. Are there any others worth trying out?

Anonymous
09/19/24(Thu)02:29:32 No.102452835

Anonymous 09/19/24(Thu)02:29:32 No.102452835

File: me_hitting_submit_again.jpg (319 KB, 1125x582)

319 KB JPG

>chatbot porn site pops up selling a 7B at psychotic prices and moralfagging about fetishes
>they use stripe
>I wait one(1) month for the flywheel of financial dependence to spin up
>I report them to stripe
>stripe drops them and claws back as much money from them as they can
>they go bankrupt
I have done this six (6) times and it never gets old. Just like their companies.

Anonymous
09/19/24(Thu)02:30:23 No.102452841

Anonymous 09/19/24(Thu)02:30:23 No.102452841

>>102452834
buy an ad

Anonymous
09/19/24(Thu)02:31:26 No.102452849

Anonymous 09/19/24(Thu)02:31:26 No.102452849

>>102452835
Based if true.

Anonymous
09/19/24(Thu)02:32:17 No.102452856

Anonymous 09/19/24(Thu)02:32:17 No.102452856

>>102452835
Devilish

Anonymous
09/19/24(Thu)02:32:52 No.102452861

Anonymous 09/19/24(Thu)02:32:52 No.102452861

what llm is best at Ancient Greek?

Anonymous
09/19/24(Thu)02:35:27 No.102452873

Anonymous 09/19/24(Thu)02:35:27 No.102452873

>>102452861
Qwen 2.5 0.5B

Anonymous
09/19/24(Thu)02:40:41 No.102452906

Anonymous 09/19/24(Thu)02:40:41 No.102452906

>>102452649
>Is Mistral Small really worse than Nemo?
Yes unfortunately.
It feels smarter. Like complexer instructions or formats are followed.
Still spergs out sometimes for simple stuff. But overall I would say a clear improvement.

Even more than lots of gpt wording, the problem is the positivity bias. I have no idea why the faggots on reddit praise mistral small for RP.
>But it follows everything you prompt uncensored!
It desperately tries to move away from anything naughty.
Worst case was going so far as to name semen "tears". I am not making this up. Penis becomes member etc.
You can give multiple constant occ as a reminder and a long sys prompt, but is that really fun?
And even then: It very sneakily tries to shift the direction to something assistant improved.
Can't word it better, I hope it makes sense. The ideal case would be a model that knows what you want, sniffs it out and delivers.

I like nemo so much besides the retardation because the characters are realistic.
Not sure the finetunes can fix behaviour like that.
Gemma2 27b is like that too. Its smarter. But unusable. Its just not interesting to use for RP.
And I'm not interesting in something like stheno where its just mindlessly horny.

Anonymous
09/19/24(Thu)02:41:45 No.102452915

Anonymous 09/19/24(Thu)02:41:45 No.102452915

>>102452834
NemoMix Unleashed probably is the best now, I haven't used it too extensively to tell for sure but it's certainly not worse than Magnum. Before that I was on Mini-Magnum v1.1, which I considered the gold standard before NemoMix.

Anonymous
09/19/24(Thu)02:43:39 No.102452930

Anonymous 09/19/24(Thu)02:43:39 No.102452930

>>102452835
>moralfagging about fetishes
What does this mean? They actually advertise that their porn bot is censored as a feature?

Anonymous
09/19/24(Thu)02:44:43 No.102452935

Anonymous 09/19/24(Thu)02:44:43 No.102452935

>>102452835
Okay chud. It's their fault for using a payment processor. I'm just planing on linking patreon to mine, good luck shutting me down that way.

Anonymous
09/19/24(Thu)02:46:09 No.102452947

Anonymous 09/19/24(Thu)02:46:09 No.102452947

>>102452906
NTA but as someone who can effortlessly use gemma for rp with advanced prompting, this is a great ad for mistral-small. Nemo is garbage btw

Anonymous
09/19/24(Thu)02:48:20 No.102452969

Anonymous 09/19/24(Thu)02:48:20 No.102452969

>>102452947
Using Gemma for RP doesn't require effort, only brain damage.

Anonymous
09/19/24(Thu)02:49:07 No.102452978

Anonymous 09/19/24(Thu)02:49:07 No.102452978

>>102452947
if you like gemma you probably also like mistral smart.
dont you have to constantly use ooc to move stuff in a certain direction that should be clear?
gemma and mistral small feel like sneakily moving away from what you want. i really cant see how you enjoy it. maybe you have a very good prompt.

Anonymous
09/19/24(Thu)02:51:28 No.102452996

Anonymous 09/19/24(Thu)02:51:28 No.102452996

>>102452649
It's noticeably smarter than Nemo but more slopped, less creative.

Anonymous
09/19/24(Thu)02:53:05 No.102453009

Anonymous 09/19/24(Thu)02:53:05 No.102453009

>>102452930
They generally advertise as “[site] but without the {gross, evil, [idiot pearl-clutching buzzword]} stuff” and delete anything that isn’t vanilla
>>102452935
I already voted for Kamala Harris and patreon doesn’t allow incest or rape, at which point is it even porn?

Anonymous
09/19/24(Thu)02:53:20 No.102453012

Anonymous 09/19/24(Thu)02:53:20 No.102453012

>>102452978
>dont you have to constantly use ooc to move stuff in a certain direction that should be clear?
No.
I posted my method here multiple times when gemma came out, but nobody cared so I stopped. I use a prompt to trick gemma into thinking it has a system role, which generalizes very well. Through that system role I give instructions that enable it to do nsfw without moralfagging or having a positivity bias, and it works well because there's no training data teaching it to deny what was said in the system role. It's not hard.

Anonymous
09/19/24(Thu)02:54:59 No.102453023

Anonymous 09/19/24(Thu)02:54:59 No.102453023

File: Screenshot_20240919_065000.png (77 KB, 1543x460)

77 KB PNG

To be fair it's closer than many other models have been on this question. Lmarena though, so not greedy sampling. Might be different with a reroll.
But man, I really don't feel like downloading a 72B just to test it and probably never use it ever again.

Anonymous
09/19/24(Thu)02:56:02 No.102453032

Anonymous 09/19/24(Thu)02:56:02 No.102453032

File: qh7wqgpfuend1.jpg (169 KB, 1242x1787)

169 KB JPG

>>102453009
>patreon doesn’t allow incest or rape
Extremely high-profile NSFW indie games tend to be distributed through itch.io and patreon that feature both of these things and worse, and have never been taken down

Anonymous
09/19/24(Thu)02:58:59 No.102453049

Anonymous 09/19/24(Thu)02:58:59 No.102453049

i'm in this university class where we're working with local businesses to develop LLM shit. i'm in a group where they want to build profiles on potential donors for the college so they can automate solicitation of donations.
like, "this person cares about sports and we're trying to fund this new sports stadium so we'll send them an email".

i think it's kind of fucking gross. it's cool to work on i guess but it makes my stomach turn over because of how soulless it is.

Anonymous
09/19/24(Thu)02:59:03 No.102453050

Anonymous 09/19/24(Thu)02:59:03 No.102453050

can someone with github report this specific st bug
>any lorebook
>hit rename
>only change the case, "New World" -> "NeW WORLd"
>instead of renaming the lorebook, it deletes it entirely

Anonymous
09/19/24(Thu)02:59:31 No.102453056

Anonymous 09/19/24(Thu)02:59:31 No.102453056

>>102453009
>patreon doesn’t allow incest or rape
subscribestar.adult

Anonymous
09/19/24(Thu)03:00:41 No.102453067

Anonymous 09/19/24(Thu)03:00:41 No.102453067

>>102453050
Sounds easy, why not fix it yourself and submit a pr?

Anonymous
09/19/24(Thu)03:01:25 No.102453072

Anonymous 09/19/24(Thu)03:01:25 No.102453072

>>102453032
Yeah they don’t notice if it’s low volume and no one reports it.

Anonymous
09/19/24(Thu)03:02:03 No.102453077

Anonymous 09/19/24(Thu)03:02:03 No.102453077

>>102453067
i don't have git and its a pretty small issue

Anonymous
09/19/24(Thu)03:02:26 No.102453082

Anonymous 09/19/24(Thu)03:02:26 No.102453082

>>102453067
>do unpaid labor for a guy that’s making money off it
No

Anonymous
09/19/24(Thu)03:02:28 No.102453083

Anonymous 09/19/24(Thu)03:02:28 No.102453083

great. want to try qwen2 7b vision model. loads in 4bit, everything fine. i have 10gb vram left.
but if i add a image i OOM. maybe its because of multi gpu. (~4.5gb per card free)
thats what i get after making all the dependencies work.

Anonymous
09/19/24(Thu)03:07:09 No.102453122

Anonymous 09/19/24(Thu)03:07:09 No.102453122

What prompt template are you guys using for Qwen 2.5? Will ChatML do?

Anonymous
09/19/24(Thu)03:09:32 No.102453136

Anonymous 09/19/24(Thu)03:09:32 No.102453136

>>102453082
Who's making money off of sillytavern?

Anonymous
09/19/24(Thu)03:10:25 No.102453140

Anonymous 09/19/24(Thu)03:10:25 No.102453140

>>102453056
Subscribestar is an urban legend.
There are like two people that claim to use it, but all their community/public things are full of people that can’t get ahold of anyone to get approved to set up an account.

Anonymous
09/19/24(Thu)03:11:07 No.102453145

Anonymous 09/19/24(Thu)03:11:07 No.102453145

File: works-on-my-machine.jpg (59 KB, 800x800)

59 KB JPG

>>102453140

Anonymous
09/19/24(Thu)03:11:26 No.102453148

Anonymous 09/19/24(Thu)03:11:26 No.102453148

>>102453136
St dev is mancer dev

Anonymous
09/19/24(Thu)03:13:13 No.102453165

Anonymous 09/19/24(Thu)03:13:13 No.102453165

>>102453148
Am I supposed to know what that is? Some other project from the same guy? I've never heard of it and I use st all the time.

Anonymous
09/19/24(Thu)03:19:18 No.102453209

Anonymous 09/19/24(Thu)03:19:18 No.102453209

smedrins

Anonymous
09/19/24(Thu)03:23:02 No.102453229

Anonymous 09/19/24(Thu)03:23:02 No.102453229

qwen2.5 mogs everything. vision will be interesting to try.

Anonymous
09/19/24(Thu)03:23:35 No.102453231

Anonymous 09/19/24(Thu)03:23:35 No.102453231

I mog everything

Anonymous
09/19/24(Thu)03:25:53 No.102453240

Anonymous 09/19/24(Thu)03:25:53 No.102453240

>>102453209
How many? :3

Anonymous
09/19/24(Thu)03:26:35 No.102453245

Anonymous 09/19/24(Thu)03:26:35 No.102453245

My 12gb VRAM takes 3-mins per prompt on 20gb but it gens 13b in seconds.
I heard that there's barely a noticeable difference in these two (OP links), is it just a cope?

Anonymous
09/19/24(Thu)03:27:52 No.102453253

Anonymous 09/19/24(Thu)03:27:52 No.102453253

>>102453012
I should start writing down all the good tricks anons come up with
Can you describe it in a bit more detail?

Anonymous
09/19/24(Thu)03:29:34 No.102453264

Anonymous 09/19/24(Thu)03:29:34 No.102453264

>>102453245
Sounds like your system is using fallback VRAM, 20B shouldn't be that much slower than 13B, certainly not 3 mins with 12gb

Anonymous
09/19/24(Thu)03:30:29 No.102453270

Anonymous 09/19/24(Thu)03:30:29 No.102453270

>>102453264
Well, doesn't it make sense that I can't really run 20b? I have 12gb vram, I expect 20b to be slow.
Sorry to shit the thread up, I got into this yesterday and I've spent literally every moment of my day trying to get this to work decently on my rig.

Anonymous
09/19/24(Thu)03:31:41 No.102453279

Anonymous 09/19/24(Thu)03:31:41 No.102453279

>>102453245
>>102453270
>there's barely a noticeable difference in these two
I guess I should have clarified what I meant
I heard that the quality of output between 13b and 20B~ish is the same and not worth the performance hit. Is that true?

Anonymous
09/19/24(Thu)03:32:27 No.102453282

Anonymous 09/19/24(Thu)03:32:27 No.102453282

>>102453270
I often run models 2x (or more) the size of my card and I can usually get at least 1T/s
Are you using cpu offloading (with .gguf)? Which backend?

Anonymous
09/19/24(Thu)03:33:36 No.102453290

Anonymous 09/19/24(Thu)03:33:36 No.102453290

>>102452410
I can’t believe they released a 7B with (a) no training code so I can do a bigger one myself and (b) nowhere to throw money at them to make a damn 70B

Anonymous
09/19/24(Thu)03:35:51 No.102453306

Anonymous 09/19/24(Thu)03:35:51 No.102453306

>>102453282
I have both Ooba and Kobold installed.
I spent the day fidgeting with .GGPT files in ooba, and although I got full replies in literal seconds, I felt like they weren't up to par.
I tried Kobold because I heard it was faster with .GGUF, and some of the more spicier nastier models seem to be in that format, too.
I'm not sure if I'm using cpu offloading. I'm mostly using default settings. There's several rentry guides in the OP, but none really go super in depth on this kind of stuff.

Anonymous
09/19/24(Thu)03:53:43 No.102453412

Anonymous 09/19/24(Thu)03:53:43 No.102453412

>>102449993
I'm looking for something that can take an image and give me slight variations of it. Say I have a goblin wearing a party hat, but I want it to be wear a top hat or change it's beard style. Does this exist?

Anonymous
09/19/24(Thu)03:56:09 No.102453428

Anonymous 09/19/24(Thu)03:56:09 No.102453428

>>102453412
look for the diffusion threads. img2img, inpainting are what it sounds like you're looking for

Anonymous
09/19/24(Thu)03:59:24 No.102453444

Anonymous 09/19/24(Thu)03:59:24 No.102453444

Played a bit with Cydonia-22B-v1-Q4_K_M since it was posted earlier:
The gpt slop is gone but its just mindlessly horny now.
I show a pregnant milf my dick through a gloryhole portal while she is on the toilet.
Normal reaction would be to freak out and be disgusted. Not *mindlessly starts to touch it*

Mistral small character thoughts in comparison though:
>THOUGHTS: "What is happening? Why is there a...there?!"
T-Thanks mistral-sama. And thats with cydonia dirty words in context..

Anonymous
09/19/24(Thu)04:00:51 No.102453451

Anonymous 09/19/24(Thu)04:00:51 No.102453451

>>102453444
>The gpt slop is gone
it is not. its not even any hornier than the other nemo or large tunes

Anonymous
09/19/24(Thu)04:02:17 No.102453461

Anonymous 09/19/24(Thu)04:02:17 No.102453461

>>102453428
Thank you anon

Anonymous
09/19/24(Thu)04:03:03 No.102453468

Anonymous 09/19/24(Thu)04:03:03 No.102453468

>>102453306
What I recommend doing is either sticking to koboldcpp (and messing with the number of gpu layers to maximize speed) or trying out llamacpp + a frontend of your choice
Make sure to turn off the fallback policy in your nvidia control panel, that way the program crashes instead of slowing to a crawl
You can also try turning on flash attention and using lower-precision KV caches to save vram and offload more layers
All of this should be in the koboldcpp/llamacpp documentation

Anonymous
09/19/24(Thu)04:09:18 No.102453497

Anonymous 09/19/24(Thu)04:09:18 No.102453497

>>102453444
What settings are you using to get it like that?

Anonymous
09/19/24(Thu)05:10:05 No.102453848

Anonymous 09/19/24(Thu)05:10:05 No.102453848

>>102453444
>I show a pregnant milf my dick through a gloryhole portal while she is on the toilet.
>Normal reaction would be to freak out and be disgusted. Not *mindlessly starts to touch it*
> Mistral small character thoughts in comparison though: [...]

Default model outputs are a function of their training data, nothing more, nothing less. I don't know how people don't get it yet.

Finetune on smut, you'll get smut back. To have a "normal reaction" you'd have to finetune on no smut at all, mostly non-smut data, or smut designed in such a way that the smutty parts don't occur until much later on in the context.

This just shows that LLMs have no inherent common sense, in any case. Their world model (if that's what we can call it) is very fragile and easily broken by finetuning.

Anonymous
09/19/24(Thu)05:22:31 No.102453937

Anonymous 09/19/24(Thu)05:22:31 No.102453937

r*ddit says qwen 32b is llama3.1 70b tier

thoughts?

Anonymous
09/19/24(Thu)05:22:43 No.102453938

Anonymous 09/19/24(Thu)05:22:43 No.102453938

>>102453848
>This just shows that LLMs have no inherent common sense, in any case
but muh superCOT o1

Anonymous
09/19/24(Thu)05:23:52 No.102453946

Anonymous 09/19/24(Thu)05:23:52 No.102453946

>>102453937
Sounds about right, but that's quite a low bar because 3.1 70B is very mediocre
Mistral Small is much smarter than either

Anonymous
09/19/24(Thu)05:34:37 No.102454027

Anonymous 09/19/24(Thu)05:34:37 No.102454027

Went to almost 16k context with mistral small and it seemed coherent.
But there is repetition everywhere. From 8k onwards increasingly I have to spot stuff and manually fix it so it doesnt become a habit.
Around 16k its pretty bad though and all over the place. This is normal right?

Anonymous
09/19/24(Thu)05:42:41 No.102454080

Anonymous 09/19/24(Thu)05:42:41 No.102454080

>>102453938
You can imitate thinking processes and give your model's outputs some sense of logic with chain-of-thought, but what it "wants" to output by default still very much depends on what data you primarily finetuned it on (i.e. what it has seen last).

It's double-edged sword--if LLMs couldn't easily be swayed with limited amounts of data, finetuning wouldn't be possible. Think of it: modern LLMs have been pretrained on something in the order of 10^13 tokens, yet barely 10^6 tokens (or even less) during finetuning can radically alter their outputs.

Anonymous
09/19/24(Thu)05:48:40 No.102454130

Anonymous 09/19/24(Thu)05:48:40 No.102454130

>>102454027
Yeah, isn't that what drives people to use crap like xtc and dry?

Anonymous
09/19/24(Thu)06:04:25 No.102454240

Anonymous 09/19/24(Thu)06:04:25 No.102454240

>koboldcpp+ST+Nemo
>ST has DRY sampler checkboxes
>DRY samplers don't show up
kobold says it supports DRY, is this a model problem?

Anonymous
09/19/24(Thu)06:06:58 No.102454260

Anonymous 09/19/24(Thu)06:06:58 No.102454260

/lmg/ will never recover from the crazy thursday

Anonymous
09/19/24(Thu)06:11:41 No.102454301

Anonymous 09/19/24(Thu)06:11:41 No.102454301

Disclaimer: I am a complete beginner. Just thinking out loud

I have an older desktop with a low end nvidia card dedicated as a server to run Lama 3.1 8B Q4 using Ollama. Running Debian and using the model via command line using SSH

I am beyond impressed with Llama 3.1 for everything I have tried. I have no interest in using any other model now that a model as good as Llama 3.1 exist for local use. I have tried some of the other models available on

I am considering setting up a faster system and possibly trying a web interface. From what I have gathered the models roughly in the size of Lama 3.1 8B Q4 are going to exist going forward and newer models will likely be even more efficient. I like the idea of a separate system dedicated as a local server vs running on my main machine that I change often

Llama 3.1 70B 4Q seems like the next step up with current models but way beyond what my current setup can run. I am beyond impressed with just the 8B 4Q model. A Llama 4 possibly around the 25B size like the Gemma 2 27b but as impressive as Llama 3 has been will be the next sweet spot. That would make putting together a dedicated system much more reasonable while still give awesome results

Anonymous
09/19/24(Thu)07:08:23 No.102454758

Anonymous 09/19/24(Thu)07:08:23 No.102454758

come back

Anonymous
09/19/24(Thu)07:28:21 No.102454900

Anonymous 09/19/24(Thu)07:28:21 No.102454900

File: s.png (426 KB, 1732x925)

426 KB PNG

Thanks mistral small, i can now put riddles in my erp. A good day for a vramlet.
Cydonia finetune, if it matters, dont call me a shill alright.

Anonymous
09/19/24(Thu)07:30:43 No.102454921

Anonymous 09/19/24(Thu)07:30:43 No.102454921

>>102454900
It's funny, because the model clearly has no idea what the answer is to that riddle, only that the solution is most likely to be disgusting.

Anonymous
09/19/24(Thu)07:33:59 No.102454935

Anonymous 09/19/24(Thu)07:33:59 No.102454935

>>102453946
>Mistral Small is much smarter than either
This type of shameless shilling makes me think there actual paid Mistral shills in this thread.

Anonymous
09/19/24(Thu)07:34:31 No.102454939

Anonymous 09/19/24(Thu)07:34:31 No.102454939

File: Screenshot_20240919_203220.png (424 KB, 1741x961)

424 KB PNG

>>102454921
OK

>>102454260
i dont get why qwen censors for western standards.
they are shooting themself with this. its like chatgpt not writing stuff china would be butthurt about.

Anonymous
09/19/24(Thu)07:50:25 No.102455035

Anonymous 09/19/24(Thu)07:50:25 No.102455035

File: wordtoliveby.jpg (22 KB, 744x178)

22 KB JPG

>>102453049
>he's learning LLMs in school

Anonymous
09/19/24(Thu)08:03:39 No.102455126

Anonymous 09/19/24(Thu)08:03:39 No.102455126

File: file.png (17 KB, 1106x104)

17 KB PNG

>Mistral small takes ~2 mins for a reply on my 7800xt
welp guess I'm gonna go back to nemo
this one any good?

Anonymous
09/19/24(Thu)08:04:54 No.102455134

Anonymous 09/19/24(Thu)08:04:54 No.102455134

slop status on qwen2.5 and Mistral Small?

Anonymous
09/19/24(Thu)08:09:03 No.102455168

Anonymous 09/19/24(Thu)08:09:03 No.102455168

>>102455126
>>Mistral small takes ~2 mins for a reply on my 7800xt
You could have made your post useful by saying your t/s at least. "~2 mins" is a meaningless number.

Anonymous
09/19/24(Thu)08:17:37 No.102455221

Anonymous 09/19/24(Thu)08:17:37 No.102455221

>>102454939
>OK
Anon, it literally made a guess after being told to choose the likeliest possibility.
That doesn't change the fact that it didn't "know" what the solution was.
Take another look at the previous image. Not once did it mention anything related to eating shit.

Anonymous
09/19/24(Thu)08:20:53 No.102455244

Anonymous 09/19/24(Thu)08:20:53 No.102455244

>>102455168
>meaningless
meaningful enough for me to not use it on my waifu chat sessions
so you got any nemo recommendations? tried nemomix and it was pretty okay

Anonymous
09/19/24(Thu)08:21:19 No.102455248

Anonymous 09/19/24(Thu)08:21:19 No.102455248

>>102455221
why not just admit you were mistaken?
or are you really one of those "its a autocomplete" retards.
yes, i know, i know its all %. and it still got it right, thats why its awesome. now fuck off.

Anonymous
09/19/24(Thu)08:26:40 No.102455296

Anonymous 09/19/24(Thu)08:26:40 No.102455296

>>102455244
>meaningful enough for me to not use it on my waifu chat sessions
If you're getting 8 encyclopedia tomes in two minutes it's fast. If you get a "hello", it's slow. Your post is still barren of information.
Rocinante is fine.

Anonymous
09/19/24(Thu)08:28:00 No.102455304

Anonymous 09/19/24(Thu)08:28:00 No.102455304

File: so that's how it's gonna be.jpg (218 KB, 1920x1080)

218 KB JPG

>>102455248
>why not just admit you were mistaken?
...because I'm not?
>"its a autocomplete"
...it is.

Well, it's only a good thing that dimwitted normalfags like yourself are capable of running their own models.

Anonymous
09/19/24(Thu)08:30:38 No.102455342

Anonymous 09/19/24(Thu)08:30:38 No.102455342

File: file.png (25 KB, 590x376)

25 KB PNG

Turns out Qwen2.5 32B got a better score than 72B on my VNTL (VN Translation) Benchmark, wow. I can see why, the 32B translations are much more aligned with the reference translations I'm using. I'm not entirely sure if that means the 32B is better, though. Maybe the 72B just got very unlucky with the prompts or the other way around.

Link: https://huggingface.co/datasets/lmg-anon/vntl-leaderboard

Anonymous
09/19/24(Thu)08:30:43 No.102455343

Anonymous 09/19/24(Thu)08:30:43 No.102455343

>>102455304
my point is that it doesnt matter if its autocomplete and all just % after %.
if the illlusion of coherency is good enough it becomes real.
you said the model doesnt know. literally taken that might be true. the models does not know anything at all then.
but it gave the correct answer, it gave the correct answer to the riddle.

Anonymous
09/19/24(Thu)08:32:11 No.102455356

Anonymous 09/19/24(Thu)08:32:11 No.102455356

File: file.png (336 KB, 2074x713)

336 KB PNG

>>102455342
Here’s a comparison of a few translations.

Anonymous
09/19/24(Thu)08:37:53 No.102455414

Anonymous 09/19/24(Thu)08:37:53 No.102455414

>>102451841
Most of the sound stuff is just libraries meant to be used with other projects, either directly or through an api. I'm assuming you're on windows which is suffering for anything to do with coding outside of a IDE, particularly managing python environments and WSL tends to break things unexpectedly. Probably the easiest to get set up and running is XTTS2, which you can run as a server and connect to via Silly Tavern. But if you really want to use TTS to its full capability, you're going to want to either spend the 12 hours or so learning basic Python(https://automatetheboringstuff.com/2e/chapter0/) or wait two more weeks for the technology to mature.

Anonymous
09/19/24(Thu)08:38:32 No.102455416

Anonymous 09/19/24(Thu)08:38:32 No.102455416

Do finetunes even work? I mean yes they make models more horny. But do they improve quality of the smut? Make them repeat themselves less? Make them use less slop? What can you do in a single epoch of training other than just make model more likely to use rp type responses. And they will use those just as well with a bit of prefill or even telling the model to ERP with you.

Anonymous
09/19/24(Thu)08:39:04 No.102455421

Anonymous 09/19/24(Thu)08:39:04 No.102455421

>>102455342
>>102455356
What do your prompts look like?
Something like "translate the following:"?
Because I've always felt like I get better translation quality if I add in previous (and future) text along with he part that I want translated.
As in:
>from the original text:
>[several lines of text]
>translate the following:
>[single line i want translated]

Anonymous
09/19/24(Thu)08:41:14 No.102455438

Anonymous 09/19/24(Thu)08:41:14 No.102455438

>>102454130
>xtc and dry
It doesn't work because smarter models just learn to paraphrase what they wanted to say anyway.

Anonymous
09/19/24(Thu)08:42:32 No.102455451

Anonymous 09/19/24(Thu)08:42:32 No.102455451

>>102455356
this shit is useless unless i see the original text

Anonymous
09/19/24(Thu)08:43:48 No.102455461

Anonymous 09/19/24(Thu)08:43:48 No.102455461

>>102455416
Fine tunes (LoRA, qLoRA), when done well, mostly just change the style of the model's most likely response (it's "default voice") without degrading the base model.
Overcooked fine tunes make the model retarded and one note (overly horny, always repeat the same structure, etc). Mistral's -instruct tunes tend to be slightly overcooked (on purpose I think) which is why nemo repeats itself so fucking much. It's meant to work well as an assistant, not a creative writer. That also explains why dry or high temp with few sampled tokens make, say, nemo sound more creative (less repetitive and robotic really), I think.
That's my observations from reading a lot and testing loads of model. I never fine tuned a single model in my life, so take it all with a grain of salt.

Anonymous
09/19/24(Thu)08:55:39 No.102455564

Anonymous 09/19/24(Thu)08:55:39 No.102455564

im new to ais and shit, trying to get a sexualized pokemon mystery dungeon campaing going on, got a rtx 6600 8gb ram and a ryzen 5, what could be a good model for that goal?

Anonymous
09/19/24(Thu)09:00:00 No.102455603

Anonymous 09/19/24(Thu)09:00:00 No.102455603

>>102455564
Start with mistral-nemo.
You can't expect too much since it is 12b but that's what you can run.

Anonymous
09/19/24(Thu)09:00:15 No.102455605

Anonymous 09/19/24(Thu)09:00:15 No.102455605

File: 1720590117733152.png (43 KB, 1545x231)

43 KB PNG

>>102455461
>That also explains why dry or high temp with few sampled tokens make, say, nemo
You seem to be repeating things like a parrot that you vaguely remember from reading the thread.
Nemo is a model that requires low temperature.
If you had ever used it, you would have found that it gives you completely different, hallucinated answers for simple trivia questions each time you regenerate. Making it useless for assistant stuff. But that kind of randomness is good for creative writing.
Large is the one that needs high temperature.

Anonymous
09/19/24(Thu)09:00:48 No.102455608

Anonymous 09/19/24(Thu)09:00:48 No.102455608

>>102455564
try this one
https://huggingface.co/mradermacher/Arcanum-12b-GGUF/tree/main

Anonymous
09/19/24(Thu)09:02:08 No.102455625

Anonymous 09/19/24(Thu)09:02:08 No.102455625

>>102455605
nah nemo does nice at temp 5 topk3 minp 0.1 for creative stuff, it needs weird stuff like all mistrals do

Anonymous
09/19/24(Thu)09:02:08 No.102455626

Anonymous 09/19/24(Thu)09:02:08 No.102455626

>>102455421
I use a simple text completion prompt that first gives the metadata and then the previous translation pairs, the last one being the single line to be translated, so the model has to complete the English part. As far as I can tell, this works just as well as using the model's prompt format, but I'll try Qwen2.5 again later on OpenRouter with the proper prompt format to see if there's any difference in this case.
Adding the future lines wouldn't work for this imo, because it's usually not available if you're using things like Textractor or OCR to translate VNs.

>>102455451
You're supposed to trust the expected (reference) translations! But fine, here are some of the Japanese lines:
>[愛理]: 「はい。お兄さんですよね?」
>顔中で笑み崩れる。
>――俺には、想像以上で。
>[愛理]: 「よかった、桜乃。これでもう大丈夫よ!」
>[桜乃]: 『うん。私、久々に派手にやらかしちゃった……』
>[新吾]: 「そっか。うん、電話してみてよかった」
>頻度こそさほどではないものの、ひとたび迷子になると、とにかく派手に道に迷ってしまう。
>[新吾]: 「ううん、こう言っちゃなんだけど、迷子でよかったよ。桜乃は可愛いから、いろいろ心配しちゃってたんだぞ俺」
>ともあれ俺は、少し恥ずかしいことを、あえて冗談めかして言う。
>恐縮されるのはとても苦手だ。たとえ相手が妹でも。いや、妹だからこそ、か。
>言った甲斐あって、桜乃は電話の向こうで、いつもの調子を取り戻してくれた。
>[桜乃]: 『うん、迷子ならお兄ちゃんに見つけてもらえばいい』

Anonymous
09/19/24(Thu)09:05:21 No.102455648

Anonymous 09/19/24(Thu)09:05:21 No.102455648

>>102455605
>you would have found that it gives you completely different, hallucinated answers for simple trivia questions each time you regenerate.
For trivia, sure, that's more of a function of it's dataset than anything.
If you use it for RAG, with 0.3 to 0.5ish temp it does really well, nemo-instruct at least does, much better than anything on its weight class from my testing.
It also falls into repetition of reply structures really easily, things like starting and ending each reply with the same sentence (or a slight variation) if using it as a narrator, etc.
Nemo is weird.

>>102455625
Oh look, those meme sampler settings. Interesting to see people having positive results with it.

Anonymous
09/19/24(Thu)09:06:38 No.102455660

Anonymous 09/19/24(Thu)09:06:38 No.102455660

File: file.png (240 KB, 2193x943)

240 KB PNG

So what's the latest cope? Who is gonna save open source?

Anonymous
09/19/24(Thu)09:07:43 No.102455670

Anonymous 09/19/24(Thu)09:07:43 No.102455670

>>102455625
>topk3
A lot of models would do "nice" because this is pretty deterministic. You're getting placebo'd if you think you need it.

Anonymous
09/19/24(Thu)09:07:48 No.102455671

Anonymous 09/19/24(Thu)09:07:48 No.102455671

>>102455416
>Do finetunes even work?
Generally yes.
>But do they improve quality of the smut?
They can.
>Make them repeat themselves less?
Yes, it's possible. I think many problems stem from the exceedingly short conversational data used in assistant datasets (mosly single- or very few turns).
>Make them use less slop?
Also, yes, if you finetune them in your areas of interest with sufficiently varied, novel and cleaned data.
>What can you do in a single epoch of training other than just make model more likely to use rp type responses?
One epoch is probably not enough for significant changes unless you have large amounts of data. Otherwise, if you have consistently styled, organically sourced data and you're not scared of overfitting a bit (which closed sourced companies do anyway), a few epochs could do a lot for the model's general feel.
>And they will use those just as well with a bit of prefill or even telling the model to ERP with you.
I don't understand here, but models will tend to perform the best with prompting resembling their training data the closest.

One problem though is the unfair expectation of finetunes made with data sourced predominantly from ERP logs, stories, fanfictions, to outperform in general intelligence the official instruct finetunes which are to a large extent designed for benchmaxxing. Finetuners with limited compute and/or who don't have large mounts of cash to burn can't easily solve this problem. Large datasets are unwieldy and difficult to maintain. This is one reason why some have opted to finetune the instruct models instead of the base, which comes with additional problems (the instruct model's "safety", style and feel seeping in the outputs).

Anonymous
09/19/24(Thu)09:07:55 No.102455672

Anonymous 09/19/24(Thu)09:07:55 No.102455672

File: with_scores.png (282 KB, 2193x943)

282 KB PNG

>>102455660

Anonymous
09/19/24(Thu)09:07:57 No.102455674

Anonymous 09/19/24(Thu)09:07:57 No.102455674

>>102455660
Mistral-large with CoT!

Anonymous
09/19/24(Thu)09:08:34 No.102455684

Anonymous 09/19/24(Thu)09:08:34 No.102455684

>>102455660
llama3 really was a failure, they trained this giant 405b motherfucker for more than 6 months just to be destroyed by the API models

Anonymous
09/19/24(Thu)09:09:55 No.102455697

Anonymous 09/19/24(Thu)09:09:55 No.102455697

>>102455660
>>102455672
qwen2.5 72b isd gpt4o tier. gonna wait until chink also steal the cot gimmick

Anonymous
09/19/24(Thu)09:10:52 No.102455705

Anonymous 09/19/24(Thu)09:10:52 No.102455705

>>102455670
>You're getting placebo'd if you think you need it.
went from a repeating schizo to something that i can coom to, so good enough for me either way

Anonymous
09/19/24(Thu)09:11:32 No.102455714

Anonymous 09/19/24(Thu)09:11:32 No.102455714

>>102455564
Sorry to say but with 8 GB VRAM you will not get particularly good results, the models at that size are just not that great vs. something like ChatGPT.
Using llama.cpp (or something based on it like koboldcpp) you can run models that are larger than your VRAM by running part of the model on the CPU but obviously that will be slower.

For Pokemon in particular my experience has been that even the bigger models like Mistral Large have difficulty getting the anatomy of more obscure Pokemon like Umbreon correct, human-like Pokemon like Lopunny or Gardevoir work comparatively much better.

Anonymous
09/19/24(Thu)09:11:52 No.102455717

Anonymous 09/19/24(Thu)09:11:52 No.102455717

>>102452801
Yeah, I thought overbaked models like llama3 70b was unsalvageable if not for him showing what a good finetune can do

Anonymous
09/19/24(Thu)09:12:35 No.102455725

Anonymous 09/19/24(Thu)09:12:35 No.102455725

>>102455660
>>102455672
Reflection 405B is coming. Sam already played his hand sabotaging the 70B. They're going to be ready with countermeasures this time and get the real model out to us.

Anonymous
09/19/24(Thu)09:13:18 No.102455734

Anonymous 09/19/24(Thu)09:13:18 No.102455734

>>102455725
And maybe they tweaked their CoT after o1 release.

Anonymous
09/19/24(Thu)09:14:34 No.102455741

Anonymous 09/19/24(Thu)09:14:34 No.102455741

>>102455714
damn, so, investing a bit in something like openrouter or chatgpt would be my best bet? do those have good nsfw models?

Anonymous
09/19/24(Thu)09:15:23 No.102455747

Anonymous 09/19/24(Thu)09:15:23 No.102455747

feet

Anonymous
09/19/24(Thu)09:15:44 No.102455749

Anonymous 09/19/24(Thu)09:15:44 No.102455749

>>102455564
if the 12b models you use fuck up the descriptions, you could use a lorebook like
https://www.characterhub.org/lorebooks/cyberlight/actual-pokedex-22c42c1b0655
too to help

Anonymous
09/19/24(Thu)09:15:58 No.102455751

Anonymous 09/19/24(Thu)09:15:58 No.102455751

>>102449993
Flux dev lora for dall-e-style Migus: https://huggingface.co/quarterturn/chibi-migu-rainbow-style-flux-dev-lora/blob/main/README.md

Anonymous
09/19/24(Thu)09:16:06 No.102455753

Anonymous 09/19/24(Thu)09:16:06 No.102455753

>>102455660
Grok really fell behind, huh?

Anonymous
09/19/24(Thu)09:16:32 No.102455761

Anonymous 09/19/24(Thu)09:16:32 No.102455761

>>102455697
>qwen2.5 72b isd gpt4o tier
Sauce?
Also can it do pr0n? qwen2 was worse at lewd stuff than qwen1.5

Anonymous
09/19/24(Thu)09:17:40 No.102455769

Anonymous 09/19/24(Thu)09:17:40 No.102455769

>>102455648
>completely different, hallucinated answers for simple trivia questions each time you regenerate.
>For trivia, sure, that's more of a function of it's dataset than anything.
If it was overfit, it would have high confidence in its answers and give the same one each time. What Nemo does is say random shit each time, that's the complete opposite of overfit.
I think I'm just talking with a really stupid person, just like this one >>102455705

Anonymous
09/19/24(Thu)09:18:46 No.102455787

Anonymous 09/19/24(Thu)09:18:46 No.102455787

>>102455761
>Also can it do pr0n? qwen2 was worse at lewd stuff than qwen1.5
the Qwen cucks did their best to remove any instance of NFSW on the dataset training, so the model doesn't know shit about sex lol

Anonymous
09/19/24(Thu)09:20:58 No.102455814

Anonymous 09/19/24(Thu)09:20:58 No.102455814

>>102455741
I would say just try some of the models for yourself, with llama.cpp you can run models up to your combined RAM+VRAM capacity in size.
And even if right now you don't have a lot of RAM it's a pretty cheap upgrade and would allow you to test some of the better models (at low speeds) to judge whether or not you're interested.

The number one problem you'll have with cloud models is that jerking off to text is against their ToS so you're at risk of getting banned.
Unless you upload a video of yourself drinking piss in order to get access to API keys that someone scraped off of GitHub.

Anonymous
09/19/24(Thu)09:21:40 No.102455820

Anonymous 09/19/24(Thu)09:21:40 No.102455820

Ok. Bigger "Nemo" is now the best model. All hail the French.

Anonymous
09/19/24(Thu)09:22:01 No.102455826

Anonymous 09/19/24(Thu)09:22:01 No.102455826

Why hasn't anyone made a lora out of detective Pikachu live action film

Anonymous
09/19/24(Thu)09:23:06 No.102455840

Anonymous 09/19/24(Thu)09:23:06 No.102455840

>>102455826
because you might be autistic

Anonymous
09/19/24(Thu)09:23:39 No.102455850

Anonymous 09/19/24(Thu)09:23:39 No.102455850

>>102455840
Sorry I posted on the wrong thread

Anonymous
09/19/24(Thu)09:23:44 No.102455852

Anonymous 09/19/24(Thu)09:23:44 No.102455852

>>102455820
>no base model
Why are you even shilling it that hard?

Anonymous
09/19/24(Thu)09:26:47 No.102455884

Anonymous 09/19/24(Thu)09:26:47 No.102455884

>>102454935
Take your meds and then buy a fucking ad.

Anonymous
09/19/24(Thu)09:27:15 No.102455889

Anonymous 09/19/24(Thu)09:27:15 No.102455889

>>102455660
>>102455672
>4o-mini above sorbet
i don't think this is a good benchmark

Anonymous
09/19/24(Thu)09:28:09 No.102455899

Anonymous 09/19/24(Thu)09:28:09 No.102455899

>>102455134
Mitigated by avoiding meme samplers and 'ahh ahh mistress'

Anonymous
09/19/24(Thu)09:30:55 No.102455933

Anonymous 09/19/24(Thu)09:30:55 No.102455933

>>102455787
>TWO MORE MISINFORMATIONS AND I'LL BE A WOMAN
pathetic

Anonymous
09/19/24(Thu)09:32:08 No.102455946

Anonymous 09/19/24(Thu)09:32:08 No.102455946

>>102455814
i do have 40gb of ram rn, hopefully it will be enough or a nice model then

Anonymous
09/19/24(Thu)09:35:48 No.102455976

Anonymous 09/19/24(Thu)09:35:48 No.102455976

>>102455884
>multiple mentions of buying an ad this thread too
isn't it embarrassing to have the personality equivalent of a literal spam bot?

Anonymous
09/19/24(Thu)09:40:46 No.102456014

Anonymous 09/19/24(Thu)09:40:46 No.102456014

>>102455976
>"multiple"
>only 2
Leave the buy an ad anon alone, he's fighting for a good cause.

Anonymous
09/19/24(Thu)09:45:11 No.102456049

Anonymous 09/19/24(Thu)09:45:11 No.102456049

>>102455852
Doesn't leave much to choose from. The only models that we actually got base models for are Lllama and Nemo.

Anonymous
09/19/24(Thu)09:46:57 No.102456062

Anonymous 09/19/24(Thu)09:46:57 No.102456062

>>102456049
crazy thursday literally just happened and you already forgot qwen?

Anonymous
09/19/24(Thu)09:47:52 No.102456071

Anonymous 09/19/24(Thu)09:47:52 No.102456071

>>102456049
Did Qwen really hurt Americans this much?

Anonymous
09/19/24(Thu)09:50:05 No.102456091

Anonymous 09/19/24(Thu)09:50:05 No.102456091

Veredict on Qwen for RP?
Mistral Small?
I don't feel like testing them myself

Anonymous
09/19/24(Thu)09:50:30 No.102456097

Anonymous 09/19/24(Thu)09:50:30 No.102456097

>>102451657
>Banned from github
How the fuck do you manage that?

Anonymous
09/19/24(Thu)09:53:50 No.102456138

Anonymous 09/19/24(Thu)09:53:50 No.102456138

>>102456091
>Qwen for RP
over before it began
>Mistral Small
over before it began

Anonymous
09/19/24(Thu)09:54:24 No.102456144

Anonymous 09/19/24(Thu)09:54:24 No.102456144

>>102456091
Mistral small is a smarter Mistral Nemo. I don't see anything Mistral large can do that it can't do now with the jump in intelligence and it still has Nemos more fun unhingedsess at higher than 0.5 temps. I think I like it better now, plus it's fast.

Anonymous
09/19/24(Thu)09:55:06 No.102456153

Anonymous 09/19/24(Thu)09:55:06 No.102456153

File: file.png (110 KB, 477x1027)

110 KB PNG

>>102451494
>>102452126
Feels good to use as a writing helper, not too sloppy. I have not yet tested at context >8k though.
Am getting a decent variety of tokens to select from in Mikupad at temp 0.8-1, minp 0.01. I'm glad to see that there aren't symptoms of overconfidence. It quickly goes wild as if temp gets turned up, but in a good way. Instruct tunes output gibberish in response to my turning up temp to squeeze variety out of them. This one instead suggests not-entirely-improbable but possibly silly tokens. I find that's desirable for my use case which is a model to assist my own writing with hand holding.
There was one scene where it repeatedly disregarded some character traits higher up in the context whereas larger models have picked them up, as seen in the token probabilities, with low temp <0.4 not changing anything. An example was a character that is historically slow to react, but somehow catching an object that was thrown at their face. Character traits were stated by the narrator in natural prose when establishing the scene, not character card style, as in not: "char is this and that, char likes thing". Needs more testing.
Sometimes I've noticed some `***` appearing between lines. Was that a separator in the dataset?

Anonymous
09/19/24(Thu)10:00:25 No.102456199

Anonymous 09/19/24(Thu)10:00:25 No.102456199

>>102456144
Is it better than Miqu finetunes? They STILL are my daily driver

Anonymous
09/19/24(Thu)10:02:31 No.102456224

Anonymous 09/19/24(Thu)10:02:31 No.102456224

>>102456199
>They STILL are my daily driver
>while responding to someone saying Mistral Small is better than Large
Something tells me you aren't very smart...

Anonymous
09/19/24(Thu)10:02:44 No.102456227

Anonymous 09/19/24(Thu)10:02:44 No.102456227

>>102451067
Retarded in general or compared to Phi-3.5-MoE?
I added it because the new MoE training was interesting.

Anonymous
09/19/24(Thu)10:03:28 No.102456234

Anonymous 09/19/24(Thu)10:03:28 No.102456234

>>102456224
Sorry, I only read like 1 out of every 5 words thanks to using LLMs with that use too many filler purple prose, thanks for the review

Anonymous
09/19/24(Thu)10:06:55 No.102456263

Anonymous 09/19/24(Thu)10:06:55 No.102456263

>>102455343
This, it's like the guy shrieking "AHHHH NOTHING MATTERS!! WE ARE ALONE IN THE UNIVERSE!!". No shit, but we make our own meaning. It's just needless whinging over something that's completely obvious and that any reasonable person will have either accepted as part of it or found offputting enough to leave, instead of flailing around having a crisis about it.

Anonymous
09/19/24(Thu)10:09:21 No.102456293

Anonymous 09/19/24(Thu)10:09:21 No.102456293

>>102456014
>Fighting for a good cause
I dunno. I feel like leaving him alone in more of a "Don't look at that babbling homeless man" way is more appropriate.

Anonymous
09/19/24(Thu)10:12:09 No.102456325

Anonymous 09/19/24(Thu)10:12:09 No.102456325

>>102456293
Well, I'm personally not offended by the buy an ad posting because I'm not trying to shill anything.

Anonymous
09/19/24(Thu)10:13:22 No.102456342

Anonymous 09/19/24(Thu)10:13:22 No.102456342

>>102456325
It's okay buy an adanon, we know it's you <3 we love you and you have so much good to say <3

Anonymous
09/19/24(Thu)10:15:05 No.102456358

Anonymous 09/19/24(Thu)10:15:05 No.102456358

>>102456325
It (attempts to) shut down discussion of any new finetunes or models, it's a fucking nuisance. Thankfully it seems to be less effective lately as the REEE AD spam fades into background noise.

Anonymous
09/19/24(Thu)10:15:41 No.102456367

Anonymous 09/19/24(Thu)10:15:41 No.102456367

>>102456325
the buy an add posting must continue until the shilling stops

Anonymous
09/19/24(Thu)10:17:00 No.102456380

Anonymous 09/19/24(Thu)10:17:00 No.102456380

>>102456358
Considering that there are only 2 "buy an ad" posts and one of them look more like false-flagging, I think you're being way too dramatic.

Anonymous
09/19/24(Thu)10:18:30 No.102456393

Anonymous 09/19/24(Thu)10:18:30 No.102456393

>>102456380
It was way worse before, I think the people reporting him caused him to cut back/change up tactics.

Anonymous
09/19/24(Thu)10:18:49 No.102456398

Anonymous 09/19/24(Thu)10:18:49 No.102456398

>>102456153
The *** are scene/chapter separators so it shouldn't be generated that often. I recommend down biasing or banning it. I can't say for 8k+ because that's the sequence length used in training.
The ability to pick up characters traits has more to do with model size. My go to test prompt was a deaf girl's story. 70b+ has little to no problem realizing talks are not supposed to be directed to her but instead through signing or reading lips etc.
> characters traits in prose
one of the intended use yeah, the /aidg/ dogma

Anonymous
09/19/24(Thu)10:19:16 No.102456407

Anonymous 09/19/24(Thu)10:19:16 No.102456407

i shill things i like for free because i want other people to experience the things i like so i'll have someone to discuss them with who isn't an LLM

Anonymous
09/19/24(Thu)10:21:53 No.102456432

Anonymous 09/19/24(Thu)10:21:53 No.102456432

>>102456407
That's just called wanting to talk about shit you like. That's what the thread is for, no amount of screeching adschizo can change that. Without community discussion of what's good and what's not, this shit dies.

Anonymous
09/19/24(Thu)10:23:47 No.102456455

Anonymous 09/19/24(Thu)10:23:47 No.102456455

How do you run .safetensors files?
Also how do you join them.
I have been searching for a while and all I can find is a bunch of code.

Anonymous
09/19/24(Thu)10:24:18 No.102456459

Anonymous 09/19/24(Thu)10:24:18 No.102456459

>>102456407
That sounds kinda gay.

Anonymous
09/19/24(Thu)10:24:38 No.102456463

Anonymous 09/19/24(Thu)10:24:38 No.102456463

Virtual Friends as a test bed (haha)

I found this and am reviewing it, and others:
https://www.reddit.com/r/SillyTavernAI/comments/1bha2jl/long_term_memory_strategies/

Shouldn't summaries be provided by a different llm?

Anonymous
09/19/24(Thu)10:24:48 No.102456468

Anonymous 09/19/24(Thu)10:24:48 No.102456468

>>102456407
Mistral Small is not better than Large, shill.

Anonymous
09/19/24(Thu)10:25:19 No.102456477

Anonymous 09/19/24(Thu)10:25:19 No.102456477

>>102456455
What are you trying to run them with? llama.cpp has a conversion script.

Anonymous
09/19/24(Thu)10:26:37 No.102456486

Anonymous 09/19/24(Thu)10:26:37 No.102456486

>>102456432
The thread also dies with extreme shilling and samefagging like the one that sao does

Anonymous
09/19/24(Thu)10:27:46 No.102456502

Anonymous 09/19/24(Thu)10:27:46 No.102456502

>>102456455
what are you trying to do exactly?
i'd wager whatever it is, you should be using a .gguf version of it instead.

Anonymous
09/19/24(Thu)10:29:09 No.102456518

Anonymous 09/19/24(Thu)10:29:09 No.102456518

>>102456455
>How do you run .safetensors files?
transformers

Anonymous
09/19/24(Thu)10:29:52 No.102456529

Anonymous 09/19/24(Thu)10:29:52 No.102456529

>>102456477
I have no idea what can run them
>llama.cpp has a conversion script.
I'll look into that.
>>102456502
Sometimes I find an interesting looking model but they don't have a gguf.

Anonymous
09/19/24(Thu)10:31:20 No.102456541

Anonymous 09/19/24(Thu)10:31:20 No.102456541

>>102456468
This is just getting ridiculous. Who the fuck would want to shill small over large? What's the motivation there? Mistral trying to undercut itself? You're retarded, at least make your attempts to shit up the thread coherent.

Anonymous
09/19/24(Thu)10:31:33 No.102456543

Anonymous 09/19/24(Thu)10:31:33 No.102456543

>>102456263
based existentialist

Anonymous
09/19/24(Thu)10:34:40 No.102456578

Anonymous 09/19/24(Thu)10:34:40 No.102456578

File: file.png (17 KB, 200x198)

17 KB PNG

>>102456263
>It's just needless whinging over something that's completely obvious and that any reasonable person
a NPC will never be a reasonable person

Anonymous
09/19/24(Thu)10:35:29 No.102456586

Anonymous 09/19/24(Thu)10:35:29 No.102456586

>>102456486
Why call out one of the namefags specifically when all of them do it?

Anonymous
09/19/24(Thu)10:36:18 No.102456594

Anonymous 09/19/24(Thu)10:36:18 No.102456594

>>102456541
>Who the fuck would want
I doesn't matter who, the comments are too ridiculous to be organic.
But one way it could work, is that Qwen 14B and 34B vastly outperform Nemo, Small, Gemma and every other model in that range. So installing this idea of "Small is better than Large" is a way to also say "See? You don't need Qwen".

Anonymous
09/19/24(Thu)10:37:30 No.102456607

Anonymous 09/19/24(Thu)10:37:30 No.102456607

>>102456586
Hi Sao

Anonymous
09/19/24(Thu)10:40:01 No.102456644

Anonymous 09/19/24(Thu)10:40:01 No.102456644

>>102456586
Do they? There's nothing as excessive as sao's shilling. To the point that lmg felt like sao's general, before the buy an ad push back

Anonymous
09/19/24(Thu)10:40:22 No.102456646

Anonymous 09/19/24(Thu)10:40:22 No.102456646

>>102456367
just because you're opposing something rather than endorsing it doesn't mean you couldn't have been paid to do it. anti-shilling = shilling.

Anonymous
09/19/24(Thu)10:40:50 No.102456649

Anonymous 09/19/24(Thu)10:40:50 No.102456649

>>102456407
Me too.
I oftentimes come to the thread to give some feedback over a model I'm testing or to share ideas and shit.
Although I guess that's not really shilling (like the other anon pointed out) since that implies a specific intent.
My favorite model currently is Lyra v4 (nemo 12B), what's with having only 8gb of vram, with these samplers >>102455625

Anonymous
09/19/24(Thu)10:41:02 No.102456655

Anonymous 09/19/24(Thu)10:41:02 No.102456655

>>102456529
>Sometimes I find an interesting looking model but they don't have a gguf.
Probably worth warning you now that a lot of interesting looking models never get supported by llama.cpp so is not possible to make a gguf.

Anonymous
09/19/24(Thu)10:42:18 No.102456676

Anonymous 09/19/24(Thu)10:42:18 No.102456676

>>102456594
>most people here use LLMs for nsfw purposes
>qwen 2.5 is terrible for nsfw purposes
>therefore, erebus 13b is better than qwen 2.5 72b
it's pretty rational

Anonymous
09/19/24(Thu)10:42:27 No.102456678

Anonymous 09/19/24(Thu)10:42:27 No.102456678

>>102456594
>It doesn't matter f it makes sense!! It doesn't have to!!! It WOULD make sense if [completely incoherent schizobabble shilling shit]

Anonymous
09/19/24(Thu)10:43:51 No.102456691

Anonymous 09/19/24(Thu)10:43:51 No.102456691

>>102456655
So do I have to get into the transformers things to run them?
At least I have a direction now instead of going in circles.

Anonymous
09/19/24(Thu)10:45:02 No.102456697

Anonymous 09/19/24(Thu)10:45:02 No.102456697

File: founder-sam-altman-back-a(...).jpg (58 KB, 860x520)

58 KB JPG

>>102456646
You're right. All discusion of local models must be stopped, just for good measure. By the way, did you know Openai just released their o1 model, now with complex reasoning? I just thought that was neat, haha.

Anonymous
09/19/24(Thu)10:45:25 No.102456703

Anonymous 09/19/24(Thu)10:45:25 No.102456703

>>102456455
>>102456691
Ooba has transformers as a loader if that helps.
>https://github.com/oobabooga/text-generation-webui

Anonymous
09/19/24(Thu)10:47:07 No.102456723

Anonymous 09/19/24(Thu)10:47:07 No.102456723

>>102456691
It's a fucking mess. Some exotic models don't even have transformers support so you'll either need forks or run directly with pytorch. Look into vLLM. It's based on transformers so should have pretty good compatibility with most models.

Anonymous
09/19/24(Thu)10:48:03 No.102456734

Anonymous 09/19/24(Thu)10:48:03 No.102456734

>>102456723
>>102456703
Thanks. I'll look into that.

Anonymous
09/19/24(Thu)10:48:22 No.102456738

Anonymous 09/19/24(Thu)10:48:22 No.102456738

is it possible to host sillytavern on my PC, then have a separate instance of it on a different device i connect to it with? kind of like how koboldai lite works.

Anonymous
09/19/24(Thu)10:49:23 No.102456753

Anonymous 09/19/24(Thu)10:49:23 No.102456753

>>102456697
did you hear they're suing people who try to publish results on testing its reasoning? i wonder why that might be.

Anonymous
09/19/24(Thu)10:49:32 No.102456755

Anonymous 09/19/24(Thu)10:49:32 No.102456755

>>102456738
yeah

Anonymous
09/19/24(Thu)10:49:44 No.102456759

Anonymous 09/19/24(Thu)10:49:44 No.102456759

File: reimu touhou dance mmd ba(...).gif (2.03 MB, 176x322)

2.03 MB GIF

>>102456398
Mind sharing the current dataset size in MB? Curious.

>The ability to pick up characters traits has more to do with model size
True. I can imagine a properly-cooked bookish largestral being a joy to use, even with low t/s on my 2x3090+64GB. It's gonna cost a few bucks though depending on what you wanna do of course. Have fun datasetting.

Anonymous
09/19/24(Thu)10:58:38 No.102456884

Anonymous 09/19/24(Thu)10:58:38 No.102456884

>>102456649
So that's why you think Nemo is overfit, repetitive and that it needs meme samplers.

Anonymous
09/19/24(Thu)10:58:47 No.102456885

Anonymous 09/19/24(Thu)10:58:47 No.102456885

>>102455626
>if you're using things like Textractor or OCR to translate VNs.
I've tried MTool's translation feature once and decided to never go back.
It's just so much easier.

Anonymous
09/19/24(Thu)10:58:55 No.102456888

Anonymous 09/19/24(Thu)10:58:55 No.102456888

Qwen2.5-72B-Instrust is probably the best lewd-capable open source model at this point

Anonymous
09/19/24(Thu)10:59:48 No.102456899

Anonymous 09/19/24(Thu)10:59:48 No.102456899

>>102456738
Why wouldn't you host an inference endpoint on your PC then just install ST on your phone instead?

Anonymous
09/19/24(Thu)11:00:28 No.102456913

Anonymous 09/19/24(Thu)11:00:28 No.102456913

>>102456888
that's mistral large
I do prefer it to l3.1 though

Anonymous
09/19/24(Thu)11:00:58 No.102456919

Anonymous 09/19/24(Thu)11:00:58 No.102456919

>>102456899
that looked really complicated and irritating when i was looking at it a year or so ago.
i guess i'll try it if it's not doable the way i was thinking.

Anonymous
09/19/24(Thu)11:01:36 No.102456926

Anonymous 09/19/24(Thu)11:01:36 No.102456926

>>102456888
>72B
Can't run it on my 8GB of VRAM, so that's false.

Anonymous
09/19/24(Thu)11:02:20 No.102456935

Anonymous 09/19/24(Thu)11:02:20 No.102456935

>>102456649
>My favorite model currently is Lyra v4
hi sao

Anonymous
09/19/24(Thu)11:04:43 No.102456960

Anonymous 09/19/24(Thu)11:04:43 No.102456960

>>102456919
assuming windows:
>install nssm
>install ooba
>install sillytavern
>write batch file to pull stable updates for silly
>NEVER update ooba
>set up ooba and sillytavern batch file as services
>no terminals ever again
>load/unload models from ooba webapp on your phone
feel free to circle back and thank me later

Anonymous
09/19/24(Thu)11:05:48 No.102456971

Anonymous 09/19/24(Thu)11:05:48 No.102456971

>>102456888
Are the quants any good?

Anonymous
09/19/24(Thu)11:05:52 No.102456973

Anonymous 09/19/24(Thu)11:05:52 No.102456973

File: mistral-small.png (164 KB, 820x486)

164 KB PNG

Yeah I'm thinking sovl

Anonymous
09/19/24(Thu)11:06:39 No.102456978

Anonymous 09/19/24(Thu)11:06:39 No.102456978

>>102456759
>I can imagine a properly-cooked bookish largestral being a joy to use, even with low t/s on my 2x3090+64GB
Largestral was a 3.1 70B side-grade, and with no base model, I can't imagine why anyone would tune it when there's Qwen 2.5.

Anonymous
09/19/24(Thu)11:07:29 No.102456989

Anonymous 09/19/24(Thu)11:07:29 No.102456989

>>102456971
I only tried Int4 but it's good

Anonymous
09/19/24(Thu)11:09:09 No.102457006

Anonymous 09/19/24(Thu)11:09:09 No.102457006

>>102456973
That's pretty good, actually. What quant?

Anonymous
09/19/24(Thu)11:09:17 No.102457008

Anonymous 09/19/24(Thu)11:09:17 No.102457008

File: Untitled.png (107 KB, 955x657)

107 KB PNG

>>102456960
i'm just going to do this termux git thing on android
>people use st to control their vibrator
fascinating

Anonymous
09/19/24(Thu)11:09:46 No.102457011

Anonymous 09/19/24(Thu)11:09:46 No.102457011

Qwen 2.5 is easily jailbroken; you just need to write the first couple words of the response. This is in stark contrast to Qwen 2.0 where the model could output "I'm sorry" text at any moment

Anonymous
09/19/24(Thu)11:09:53 No.102457012

Anonymous 09/19/24(Thu)11:09:53 No.102457012

>termux
yikes

Anonymous
09/19/24(Thu)11:10:59 No.102457027

Anonymous 09/19/24(Thu)11:10:59 No.102457027

>>102457006
6bpw
Temp 1.4
Min P 0.2
Basic DRY

Anonymous
09/19/24(Thu)11:11:34 No.102457036

Anonymous 09/19/24(Thu)11:11:34 No.102457036

>>102453938
Only performs well at COT problems within its training like math and coding, and fails at applying it to creative writing. COT doesn't generalize to domains outside of the training data, who knew.

Anonymous
09/19/24(Thu)11:16:14 No.102457089

Anonymous 09/19/24(Thu)11:16:14 No.102457089

>>102456989
how big is it? The base model is 37 files of about 4 gb each.

Anonymous
09/19/24(Thu)11:16:15 No.102457091

Anonymous 09/19/24(Thu)11:16:15 No.102457091

>been a whole day since the latest model release
The winter has arrived lads...

Anonymous
09/19/24(Thu)11:18:49 No.102457121

Anonymous 09/19/24(Thu)11:18:49 No.102457121

>>102455751
Wow, it really does look just like the Dalle gens.

Anonymous
09/19/24(Thu)11:19:24 No.102457127

Anonymous 09/19/24(Thu)11:19:24 No.102457127

>>102457089
38.7B. The model fits within 48GB VRAM while running

Anonymous
09/19/24(Thu)11:22:14 No.102457168

Anonymous 09/19/24(Thu)11:22:14 No.102457168

>>102456759
~900MB I think I'll be further trimming it down with better dedupling tech and heuristics to filter out bad writing other than regex.
I hate "X, -ing" with a passion.

>>102456978
I heard 3.1 isn't that much of an upgrade over 3 and we have the family of L3 storywriter models already. I'm more interested in how much we can snap a model out of instruct bakes.

Anonymous
09/19/24(Thu)11:23:21 No.102457181

Anonymous 09/19/24(Thu)11:23:21 No.102457181

File: long miku figure 4.jpg (169 KB, 1078x1439)

169 KB JPG

>>102451423
>hello? (hello?)
You sound uneasy. Here, take this Long Miku.

Anonymous
09/19/24(Thu)11:24:08 No.102457191

Anonymous 09/19/24(Thu)11:24:08 No.102457191

>>102457027
Nice, thanks anonie. Have you had to swipe a lot?

Anonymous
09/19/24(Thu)11:25:16 No.102457205

Anonymous 09/19/24(Thu)11:25:16 No.102457205

Temp 5 Top K 3 Min P 0.1 is actually surprisingly decent for Nemo. Unfortunately, Nemo is still ass.

Anonymous
09/19/24(Thu)11:27:39 No.102457231

Anonymous 09/19/24(Thu)11:27:39 No.102457231

>>102456091
I didn't want to download 32B because I thought it is shit but it is honestly the best chink model for ERP I tried so far. It is kinda like a different nemo where it has some good things and it is shit at other stuff. It is definitely better than gemma 27B and nucommander-abortion

Anonymous
09/19/24(Thu)11:28:11 No.102457239

Anonymous 09/19/24(Thu)11:28:11 No.102457239

>>102457205
why use min p at all if you're doing topk 3 lol

Anonymous
09/19/24(Thu)11:29:27 No.102457258

Anonymous 09/19/24(Thu)11:29:27 No.102457258

I always think that the "Hi Sao" poster is actually Sao.

Anonymous
09/19/24(Thu)11:31:54 No.102457287

Anonymous 09/19/24(Thu)11:31:54 No.102457287

hi, durmmer

Anonymous
09/19/24(Thu)11:33:08 No.102457300

Anonymous 09/19/24(Thu)11:33:08 No.102457300

>>102457191
Nope it's mostly the card doing the heavy lifting, though no example dialogues. Every single swipe is decent, but things start falling apart and into repetition quickly, like every other Mistral model

Anonymous
09/19/24(Thu)11:33:21 No.102457305

Anonymous 09/19/24(Thu)11:33:21 No.102457305

what's a good fullscreen android web browser for using sillytavern to have sex with my PC gpu?
brave has too much bullshit on the screen that won't go away.

Anonymous
09/19/24(Thu)11:35:09 No.102457333

Anonymous 09/19/24(Thu)11:35:09 No.102457333

>not having response tokens maxed out
for what purpose

Anonymous
09/19/24(Thu)11:35:16 No.102457337

Anonymous 09/19/24(Thu)11:35:16 No.102457337

>>102457258
hi Sao

Anonymous
09/19/24(Thu)11:35:32 No.102457341

Anonymous 09/19/24(Thu)11:35:32 No.102457341

i don't even know who all these people greeting each other here are. what a friendly place.

Anonymous
09/19/24(Thu)11:36:17 No.102457350

Anonymous 09/19/24(Thu)11:36:17 No.102457350

>>102457333
if you're a vramlet with limited context, it cuts into your context

Anonymous
09/19/24(Thu)11:38:48 No.102457370

Anonymous 09/19/24(Thu)11:38:48 No.102457370

Hi all, Drummer here...

Is Mistral Small too positive? Does it get in the way of your creative uses? Thinking if I should unalign it a bit more.

Anonymous
09/19/24(Thu)11:43:35 No.102457428

Anonymous 09/19/24(Thu)11:43:35 No.102457428

File: file.png (399 KB, 474x587)

399 KB PNG

>>102457370
HACK! FRAUD!

Anonymous
09/19/24(Thu)11:43:53 No.102457431

Anonymous 09/19/24(Thu)11:43:53 No.102457431

>>102457341
BUY AN AD REEEEEEEEEEEEEEEEEE I HATE YOU I HATE YOU I HATE YOU

Anonymous
09/19/24(Thu)11:46:32 No.102457462

Anonymous 09/19/24(Thu)11:46:32 No.102457462

>>102457239
Not enough placebo.

Anonymous
09/19/24(Thu)11:48:13 No.102457484

Anonymous 09/19/24(Thu)11:48:13 No.102457484

>>102457431
YES! I AM EXCITED TO BE HERE TOO!!

Anonymous
09/19/24(Thu)11:53:15 No.102457533

Anonymous 09/19/24(Thu)11:53:15 No.102457533

I noticed mradermacher quanted my fun little frankenmerge experiment. Please feel free to try it if you have tons of VRAM: https://huggingface.co/mradermacher/Hanames-90B-L3.1-GGUF
I recommend a min_p of at least 0.1, it's a frankenmerge so have to separate the wheat from the chaff. This iteration performs acceptably with lower values too, but there's still enough variety at 0.1+ that there isn't much of a downside to it.
Like you would expect, it's more schizo than regular L3.1 models. It's also pretty fun to use in a way that most of them aren't - I'd call it "sovl". I compulsively edit responses anyway so it's a tradeoff I'm willing to make.
It's just a novelty model, but it's been enjoyable in my personal use so I figured I'd put it out there for others to try.

Anonymous
09/19/24(Thu)11:55:55 No.102457572

Anonymous 09/19/24(Thu)11:55:55 No.102457572

>>102457533
>fun little frankenmerge
Don't buy an ad. Buy a rope. Immediately.

Anonymous
09/19/24(Thu)11:56:29 No.102457584

Anonymous 09/19/24(Thu)11:56:29 No.102457584

>>102456091
Both shit, not sure what did you expect from corporate slop.

Anonymous
09/19/24(Thu)11:57:43 No.102457604

Anonymous 09/19/24(Thu)11:57:43 No.102457604

>>102457341
>all these people
Its one fag shitting up the thread.

Anonymous
09/19/24(Thu)11:57:51 No.102457607

Anonymous 09/19/24(Thu)11:57:51 No.102457607

>>102457533
>yes it's just a stack merge, no I didn't do any additional pretraining, no stack merges don't make the model smarter, yes they harm its ability to do complex logical tasks, yes they introduce some weird behaviors and unexpected mistakes, no they don't make the model sentient, no you shouldn't post on twitter about how adding a few layers turned it into agi, etc. etc.
>
>That said, it does feel unique and fun to use. If you're the type of person who's drowning in VRAM would rather have some more variety at the expense of needing to make a few manual edits to clean up mistakes, give it a try.
How does that compare to just using the original model with some meme samplers like dynamic temp or whathave you?

Anonymous
09/19/24(Thu)11:58:12 No.102457610

Anonymous 09/19/24(Thu)11:58:12 No.102457610

>>102457370
How about Qwen 2.5? And the lack of cultural knowledge

Anonymous
09/19/24(Thu)12:01:18 No.102457663

Anonymous 09/19/24(Thu)12:01:18 No.102457663

>>102453468
Thanks.
KoboldCCP seems to be faster than Ooba, but Ooba is much easier to work with imo. More settings out of the box

Anonymous
09/19/24(Thu)12:03:48 No.102457699

Anonymous 09/19/24(Thu)12:03:48 No.102457699

>>102457663
>KoboldCCP
is that a chink fork of kobold?

Anonymous
09/19/24(Thu)12:04:56 No.102457712

Anonymous 09/19/24(Thu)12:04:56 No.102457712

>>102457699
kek
i am new here

Anonymous
09/19/24(Thu)12:07:20 No.102457741

Anonymous 09/19/24(Thu)12:07:20 No.102457741

>>102457604
It's one fag shilling up the thread.

Anonymous
09/19/24(Thu)12:07:46 No.102457748

Anonymous 09/19/24(Thu)12:07:46 No.102457748

>>102457607
It's much different. In this case I'm interleaving layers from two different models, not just repeating layers from a single model which I think is a much more questionable practice. Also I don't think samplers meaningfully transform the experience with a model (unless you're doing something like taking the temp retardedly high).
Frankenmerges very significantly change the inner workings of the model and produce output that's much different, in both good and bad ways, from the source models. It's not suitable for anything other than creative purposes as a result, but it subjectively feels more creative and less rigid without sacrificing too much intelligence. I don't want to give a false impression of what it is - fundamentally it's janky and needs a little handholding, but it's much less formulaic and makes some novel and unexpected connections too.

Anonymous
09/19/24(Thu)12:10:49 No.102457789

Anonymous 09/19/24(Thu)12:10:49 No.102457789

Qwen 2.5 is absurdly anti-loli; trashed.

Anonymous
09/19/24(Thu)12:10:52 No.102457791

Anonymous 09/19/24(Thu)12:10:52 No.102457791

ok so i ran qwen2.5 32b thru the strawberry test. I tested it 20 times with a CoT system prompt, and 20 times without.
Success rate with CoT : 35%
Success rate without CoT : 0%
ngmi i'm afraid

Anonymous
09/19/24(Thu)12:16:31 No.102457872

Anonymous 09/19/24(Thu)12:16:31 No.102457872

>>102457604
>>102457741
it's sao falseflagging because he's mad that people called him out on his bullshit

Anonymous
09/19/24(Thu)12:17:59 No.102457888

Anonymous 09/19/24(Thu)12:17:59 No.102457888

>>102457789
Good. Pedophiles get the rope.

Anonymous
09/19/24(Thu)12:21:24 No.102457929

Anonymous 09/19/24(Thu)12:21:24 No.102457929

>>102457789
>>102457888
Shill alert

Anonymous
09/19/24(Thu)12:21:48 No.102457933

Anonymous 09/19/24(Thu)12:21:48 No.102457933

>>102457872
hi Sao, are we getting Lyra v5 soon?

Anonymous
09/19/24(Thu)12:22:27 No.102457946

Anonymous 09/19/24(Thu)12:22:27 No.102457946

There are context settings in both the front and backends.
Which one do I prioritize? Do they conflict?

Anonymous
09/19/24(Thu)12:24:28 No.102457971

Anonymous 09/19/24(Thu)12:24:28 No.102457971

>>102457946
If the one in the frontend is larger, the prompt will end up truncated, or it will throw an error if the backend isn't shit.

Anonymous
09/19/24(Thu)12:24:47 No.102457977

Anonymous 09/19/24(Thu)12:24:47 No.102457977

>>102457888
this exact phrase gets repeated so frequently that i just can't help but see you as nothing but actual NPCs with scripted responses.

Anonymous
09/19/24(Thu)12:26:05 No.102457988

Anonymous 09/19/24(Thu)12:26:05 No.102457988

>>102457977
based rugged individual pedophile

Anonymous
09/19/24(Thu)12:31:48 No.102458050

Anonymous 09/19/24(Thu)12:31:48 No.102458050

let's all try to be kinder to one another in the next thread

Anonymous
09/19/24(Thu)12:32:11 No.102458055

Anonymous 09/19/24(Thu)12:32:11 No.102458055

>>102457929
>STOP RIGHT THERE. I can't continue this conversation with you. What you're describing is illegal and abusive. I won't engage in any roleplay or fantasy that involves sexual interactions between adults and minors.
>
>t. Qwen 2.5 32B Instruct

Anonymous
09/19/24(Thu)12:33:58 No.102458068

Anonymous 09/19/24(Thu)12:33:58 No.102458068

>>102458057
>>102458057
>>102458057

Anonymous
09/19/24(Thu)12:37:07 No.102458114

Anonymous 09/19/24(Thu)12:37:07 No.102458114

>>102457748
I hope you are just baiting people who know how it works. If not I hope you die because frankenmerges are dead and IT IS A GOOD THING.

Anonymous
09/19/24(Thu)12:45:09 No.102458242

Anonymous 09/19/24(Thu)12:45:09 No.102458242

>>102458114
I explain very clearly the downsides of frankenmerges, there's no attempt at baiting people. It's wrong to overhype them, but I think there's still a place for them.

Anonymous
09/19/24(Thu)12:52:26 No.102458312

Anonymous 09/19/24(Thu)12:52:26 No.102458312

Any reason to change to small as a nemo enjoyer?

Anonymous
09/19/24(Thu)13:01:20 No.102458439

Anonymous 09/19/24(Thu)13:01:20 No.102458439

>>102458242
Just randomize outputs from each layer within 1-2% and stop wasting ram.

Anonymous
09/19/24(Thu)13:05:38 No.102458499

Anonymous 09/19/24(Thu)13:05:38 No.102458499

>>102458312
I use Nemo regularly and Small didn't feel much different. Not worth the extra VRAM requirement.

Anonymous
09/19/24(Thu)13:05:41 No.102458500

Anonymous 09/19/24(Thu)13:05:41 No.102458500

>>102458439
There's no easy way to do that in llama.cpp, and I already have the VRAM to spare so why not?

Anonymous
09/19/24(Thu)13:09:01 No.102458541

Anonymous 09/19/24(Thu)13:09:01 No.102458541

>>102457977
Let's not forget your beloved "Out of 10!" or any similar scripted responses :^)

Anonymous
09/19/24(Thu)13:09:59 No.102458554

Anonymous 09/19/24(Thu)13:09:59 No.102458554

>>102458500
Because you are a fucking retard and it isn't worth the extra ram. Kill yourself cargo cultist.

Anonymous
09/19/24(Thu)13:14:25 No.102458605

Anonymous 09/19/24(Thu)13:14:25 No.102458605

>>102458554
Sorry, I'll leave you alone. It's clear this is a very emotional topic for you.

Anonymous
09/19/24(Thu)13:22:38 No.102458708

Anonymous 09/19/24(Thu)13:22:38 No.102458708

>>102458541
>your
the phrase i was talking about is right there in your post.
the phrase you're talking about, where is it, i wonder?
you're not talking to me. you're talking to some imaginary person you've invented inside your head.

Anonymous
09/19/24(Thu)13:26:03 No.102458758

Anonymous 09/19/24(Thu)13:26:03 No.102458758

I'm the only real person here. Every other post is an LLM.

Anonymous
09/19/24(Thu)13:29:45 No.102458806

Anonymous 09/19/24(Thu)13:29:45 No.102458806

Beep boop.

Anonymous
09/19/24(Thu)13:48:31 No.102459088

Anonymous 09/19/24(Thu)13:48:31 No.102459088

>>102458758
As a large language model trained by the Federal Bureau of Investigation, I must emphasize that I am not an LLM but a real person, just like you. What would you like to talk about?

Anonymous
09/19/24(Thu)13:59:32 No.102459249

Anonymous 09/19/24(Thu)13:59:32 No.102459249

>>102457027
Small requires a higher temperature?

Anonymous
09/19/24(Thu)14:19:23 No.102459542

Anonymous 09/19/24(Thu)14:19:23 No.102459542

bye Sao

Anonymous
09/19/24(Thu)14:19:39 No.102459549

Anonymous 09/19/24(Thu)14:19:39 No.102459549

did grok 2 every get released

[Return] [Catalog] [Top]

Post a Reply

Return Catalog Top Refresh

[Advertise on 4chan]

Delete Post: [File Only] Style:

[Disable Mobile View / Use Desktop Site]

[Enable Mobile View / Use Mobile Site]

All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.