/g/ - Technology


File: file.jpg (159 KB, 768x579)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>102815881 & >>102801403

►News
>(10/14) Llama 3.1 linearized: https://hf.co/collections/hazyresearch/lolcats-670ca4341699355b61238c37
>(10/14) Zamba2-7B released: https://www.zyphra.com/post/zamba2-7b
>(10/14) Ichigo, voice-to-voice model based on Llama 3.1, released: https://homebrew.ltd/blog/llama-learns-to-talk
>(10/12) Fast multilingual TTS with voice cloning, based on flow matching with DiT: https://github.com/SWivid/F5-TTS
>(10/11) 14B cross-architecture distillation model: https://hf.co/arcee-ai/SuperNova-Medius

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Programming: https://livecodebench.github.io/leaderboard.html

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
►Recent Highlights from the Previous Thread: >>102815881

--Papers:
>102818670
--Ichigo-llama3.1: Local real-time voice AI on a single 3090 GPU:
>102817117 >102817196 >102817361 >102818092 >102818637
--How to view input to LLM in SillyTavern:
>102821993 >102822010 >102822021
--Zamba2 7B release and performance skepticism:
>102825399 >102825450 >102825497 >102825479
--Hazy Research explores linearizing attention on LLMs with LoLCATs:
>102825440
--Discussion of LLaMA and Miniu models, with anticipation for Q4 release:
>102822049 >102822097 >102822372 >102822381 >102822467 >102822566
--Comparing TTS models: GPT_SoVITS v2, F5-TTS, and xTTS-v2:
>102818159 >102818766 >102818834 >102819042 >102819093 >102819129 >102819144
--Anons discuss Nemo finetunes and share tips for improved performance:
>102817638 >102817706 >102818344 >102818503 >102818525 >102818583
--Using models for emulating DMs/IRC chat with informal dialogue:
>102824368 >102824672
--Tips to prevent LLM from overusing patterns like repeating character names:
>102822777 >102822876 >102822920 >102823085
--Tips for reducing repetitive language patterns in AI-generated text:
>102818224 >102818398 >102818448 >102818580 >102819483 >102819633 >102818976 >102819043
--Suggestions for reducing VRAM usage and measuring quality dropoff:
>102819750 >102819773 >102819843 >102819960 >102821339
--Suggestions for a new model under 70b for homemade AGI waifu agent:
>102820719 >102820743 >102820911 >102821008 >102821013 >102821034 >102821204 >102821288 >102821524 >102822327 >102822370 >102822419
--Recommendations for local models to translate manga:
>102816207 >102818616 >102819005 >102819339 >102819438 >102819494 >102819030 >102816247 >102816327 >102818829
--Miku (free space):
>102815992 >102816491 >102818356 >102819750 >102821339 >102821606 >102821773 >102822472 >102824853 >102825438

►Recent Highlight Posts from the Previous Thread: >>102815888

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
>>
>>102826116
>Llama 3.1 linearized
The fuck is this? Why does this matter? Is the area so dead that you have to include irrelevant news in the OP to pretend that something is happening?
>>
>>102826167
Linear attention is faster and uses less memory than softmax attention.
But frankenstein-swapping the attention blocks and finetune-healing rarely has good results, and it's not supported in llama.cpp.
So I don't have much hope for it, but it's an interesting experiment and I was already adding it to the news.
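If you want the short version of what linearizing means here, a rough numpy sketch (the elu+1 feature map is just a stand-in, LoLCATs learns the map so the output matches the original softmax attention):

import numpy as np

n, d = 4096, 64
Q, K, V = (np.random.randn(n, d) for _ in range(3))

# softmax attention: materializes an n x n matrix, O(n^2 * d) time, O(n^2) memory
A = np.exp(Q @ K.T / np.sqrt(d))
out_softmax = (A / A.sum(-1, keepdims=True)) @ V

# linear attention: phi(K).T @ V is only d x d, so O(n * d^2) time and no n x n matrix
phi = lambda x: np.where(x > 0, x + 1.0, np.exp(x))  # elu(x)+1, purely illustrative
out_linear = (phi(Q) @ (phi(K).T @ V)) / (phi(Q) @ phi(K).sum(0, keepdims=True).T)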
>>
>>102826167
>Is the area so dead that you have to include irrelevant news in the OP to pretend that something is happening?
you answered your question
>>
>>102826167
Ok, I read the article and it looks like another distillation benchmaxxer nothingburger. Yawn.
>>
>>102826167
brown hands wrote this post.
you can tell because it makes no actual sense if you think longer than the 10 seconds swarthoid brains are capable of.
>>
lolcata/linearized gguf soon?
>>
>>102826128
>--Zamba2 7B release and performance skepticism
>Our novel shared-attention architecture allows more parameters to be allocated to the Mamba2 backbone. In turn, the shared transformer block preserves the rich cross-sequence dependencies of the attention computation.
This sounds very interesting. I have more faith in this than I would usually have since it's actually a new architecture.
...I hope it doesn't take too many years for llama.cpp support.
>>
>>102826167
It's just incremental improvements now. Nothing big is happening. AI bubble is bursting
>>
>>102826243
go back >>>/pol/ chud
>>
>>102826281
yes
>>
>>102826116
Teto Tuesday already huh. But it's still Monday here. I'll drop one more Migu.
>>
>mistral large was released on february 26, 2024
>no better model since then
why live. That's 8 months of nothingburgers. 2023 will never happen again.
>>
>>102826507
Mistral Large 2 was released in july
>>
>>102826535
Not open source
>>
>>102826535
In a blind test it's all the same.
>>
>>102826551
Large 2 is the open source one. The old Large is the one that was api-only and pretty shit.
>>
>>102826563
Then change it to
>no new model since july 24, 2024
>almost 3 months of nothingburgers
>>
Anyone had and fixed a bug where ST streaming suddenly stops but backend keeps generating in the background and doesn't add anything to the message? It also cuts off part of the message if I continue a longer message. I remember I had this problem before but it went away at some point.
>>
>>102826623
kobold lite did this to me a few times yesterday too, but none today. kcpp version 1.76.
I change models and sampler settings so often I wouldn't be able to recreate it. SSE streaming Lyra v4 maybe.
>>
>>102826810
I found some github thread that said to turn on
Show {{char}}: in responses
Show {{user}}: in responses
in user settings. I did, and it works. No idea what it has to do with this, but it works.
>>
>>102826615
People are not doing anything creative with the current models. They'll deserve better models once they have found what to do with the current ones other than ERP.
>>
# Mistral Small Fine Tunes

ArliAI/Mistral-Small-22B-ArliAI-RPMax-v1.1
"trained on a diverse set of curated creative writing and RP datasets with a focus on variety and deduplication"
"Training Details:
Sequence Length: 8192
Training Duration: Approximately 4 days on 2x3090Ti
Epochs: 1 epoch training for minimized repetition sickness
QLORA: 64-rank 128-alpha, resulting in ~2% trainable weights
Learning Rate: 0.00001
Gradient accumulation: Very low 32 for better learning."

gghfez/SeminalRP-22b
"RP and creative writing and some regular questions generated by Opus at 8192 context. Refusals removed from dataset. Slop removed to some extent."

Gryphe/Pantheon-RP-1.6.2-22b-Small & Pantheon-RP-Pure-1.6.2-22b-Small
"I strive to introduce a whole collection of diverse personas that can be summoned with a simple activation phrase."
Data (RP-Pure excludes GPT 4-o & writing prompts):
* The 10k most diverse entries from a SlimOrca Sonnet dataset
* A Sonnet 3.5 Pantheon-styled generic roleplay dataset (50/50 Asterisk/Narrative style)
* A Sonnet 3.5 Pantheon Persona dataset (50/50 Asterisk/Narrative style)
* New ChatGPT 4-o Pantheon subset, about 25% the size of the main set (100% Asterisk style)
* A randomly chosen subset of Opus Writing Prompts
* Lyra the Assistant (Coding, summaries and D&D DM questions)

InferenceIllusionist/SorcererLM-22B
"LORA tune", "Trained with a whole lot of love on 1 epoch of cleaned and deduped c2 logs. This model is 100% 'born-local', the result of roughly 27 hours and a little bit of patience on a single RTX 4080 SUPER."

nbeerbower/Mistral-Small-Drummer-22B
"finetuned on jondurbin/gutenberg-dpo-v0.1 and nbeerbower/gutenberg2-dpo."
learning rate 0.000004
"ORPO tuned with 2xA40 on RunPod for 1 epoch."
"learning_rate=4e-6"

rAIfle/Acolyte-22B
"LoRA of a bunch of random datasets", "SLERPed onto base at 0.5", "Check the LoRA for dataset info."

spow12/ChatWaifu_v2.0_22B
"This model aimed to act like visual novel character."
Japanese.
>>
Typically the text is generated at the end. Is there anything capable of modifying code in your editor in place, or inserting some in the middle?
>>
>>102826870
Cursor, Cline, copilot, 50+ other projects on github
>>
>>102826870
Models like Codestral are trained on fill-in-the-middle prompts.
>>
>>102826870
Some models are trained for FIM (Fill In the Middle). Something like
int quicksort(int *items, int item_count)
{
<|fim|>
return e;
}

<|fim|> being whatever token the model uses.
ggerganov is working on a plugin for neovim that does exactly that.
https://github.com/ggerganov/llama.cpp/pull/9787
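The llama.cpp server also exposes this through an /infill endpoint if you just want to poke at it with curl. Something like this should work (field names from memory, double-check the server README; the model also needs the FIM tokens in its metadata):

curl http://localhost:8080/infill -d '{
  "input_prefix": "int quicksort(int *items, int item_count)\n{\n",
  "input_suffix": "\n    return e;\n}",
  "n_predict": 64
}'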
>>
>>102826857
>rAIfle/Acolyte-22B
>Check the LoRA for dataset info

teknium/trismegistus-project (31.7 MB parquet / 73.2 MB JSONL)
"Size: ~10,000 instruction-response pairs"
"Domain: Esoteric, Spiritual, Occult, Wisdom Traditions, Paranormal, etc."
"The Trismegistus Project is a comprehensive dataset containing instruction-response pairs focused on the broad umbrella of Esoterica. Topics covered include Mysticism, Hermeticism, Necromancy, Religion, Trance, Meditation, Magick, Spirituality, Alchemy, Numerology, Tarot, and much more.
"The entire dataset was generated synthetically, save for subtopics."

AIRRC/Eudaimonic (3.09 MB parquet)
"This dataset is designed to facilitate the study and analysis of philosophy and human emotions, with a particular emphasis on multi-turn conversations. It provides a rich collection of dialogues and text data that delve into various philosophical themes and emotional expressions"

Gryphe/Sonnet3.5-Charcard-Roleplay (173 MB JSONL)
"9736 carefully simulated character card-based roleplay dialogues produced using an unrestrained Sonnet 3.5, now available as a ShareGPT dataset."
"Upon assembly of the final dataset further cleaning was performed and any references to Anon (the original user) were renamed to the placeholder {{user}}."
"A final enrichment phase was applied with the most common phrases (such as 'a mix/mixture of') being replaced by alternatives given by GPT-4o."

anthracite-org/kalo_misc_part2 (3.45 MB parquet / 7.9 MB JSONL)
No description.

anthracite-org/kalo_opus_misc_240827 (5.01 MB parquet / 9.55 MB JSONL)
No description.
>>
File: 39_06538_.png (1.03 MB, 720x1280)
Tuesday state of mind
>>
>>102826857
# Fine tunes not included in the previous post

## Abliterated
byroneverson/Mistral-Small-Instruct-2409-abliterated
"Check out the jupyter notebook for details of how this model was abliterated."

zetasepic/Mistral-Small-Instruct-2409-abliterated
"Abliterated version using the code from (https://github.com/andyrdt/refusal_direction)."

## Ignored because trained on top of instruct tune with different/wrong prompt format
(6 items removed to avoid spam filter)

## Ignored because they're merges that include ignored models
(6 items removed)

## Ignored because no description
(9 items removed)
>>
>>102826116
>Llama 3.1 linearized
?? Wat? Does this remove activation functions? And if yes then why?
>>
Did something recently break with kobold and ST?
>bots with example messages start posts by typing out their name
>in group play bots pay no attention to other bots

This is borderline unusable
>>
>>102826300
This, but unironically.
>>
>>102827119
Check >>102826820. Maybe it is the same.
>>
>>102826820(me)
btw I wonder if those settings got changed for me by an update, and if the idea was to filter out coomers who would give up and move to something else without looking for a fix.
>>
File: salute.jpg (20 KB, 351x351)
>>102827036
>>102826951
>>102826857
Thank you for your report anon
>>
To the anon who brought GPT-SoVits to my attention, and said I should fine tune it, thank you. It's excellent, I never realized how good TTS has gotten. You really do have to finetune it though, it doesn't work nearly as well without it.

I downloaded a VOD from Vei (VTuber streamer), cut the audio from the first 30 minutes where she just talks to the chat, and put it through the whole data processing and finetuning UI. Once I had the raw audio clip, the whole process took literally like 10 minutes to get to a trained model.

The results: https://voca.ro/1hqGpP2qqRYA

I bet it would be even better if I curated the extracted audio clips and fixed errors in the automatic speech recognition process, which I skipped. This is great, never thought I would be a TTSfag, but here I am.
>>
>>102827232
Anon, how. That's fucking amazing.
>>
>>102827138
>filter out coomers
Yes. He was thinking specifically of you. Everybody does...
>>
>>102827232
>data processing and finetuning UI
link? I have a stash of audio I'd like to do this with...
>>
>>102827232
That's fucking terrible, but it's good tts.
>>
File: 1718837602136463.jpg (1.48 MB, 1920x1759)
>>102827232

Oh. Neat. I'll try it out. Motivated.
>>
>>102827279
I mean the GPT-SoVits webui itself. Just follow the rentry guide, linked to from the github.

Specifically here's what I did: download vod directly from twitch using videodownloadhelper extension. Open in Shotcut (I'm on linux, maybe there's a better option for windows), select 30 minute range where she's just talking, export as wav file. Load that into webui, click all the buttons in the right order to do all the extraction steps, ASR, and finetuning. Use one of the auto extracted short clips + ASR transcription as the reference audio when doing inference. That's it.
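If you'd rather skip the video editor, something like this ffmpeg one-liner should handle the cut and wav export in one go (timestamps/paths are just an example, and I'm not sure what sample rate the webui actually wants, so adjust -ar if it complains):

ffmpeg -i vod.mp4 -ss 00:10:00 -t 00:30:00 -vn -ac 1 -ar 32000 vod_cut.wav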
>>
good morning again /lmg/, what model are you using today?
>>
>>102827374
that one you really hate
you know the one
>>
>project claims to need a certain python version
>the SAME project's own requirements.txt asks for a package version that won't work with that python version
I HATE PYTHON DEVS
I HATE PYTHON DEVS
I HATE PYTHON DEVS
>>
>>102827441
sounds like an amd problem
>>
>https://www.latent.space/p/gpu-bubble
>TLDR: Don’t buy H100s. The market has flipped from shortage ($8/hr) to oversupplied ($2/hr), because of reserved compute resales, open model finetuning, and decline in new foundation model co’s. Rent instead.
>decline in new foundational models
It's officially over. The bubble has burst.
>>
ok but why is UNA-TheBeagle-7B-v1 still the most soulful local model for vramlets
>>
>>102827568
Huh... NVDA stock just hit a new all time high today and now we see signs like this. Might be a good time to start shorting or buying put options. Or at least get out while you can. We've seen AI plateau across the board at the highest levels too for over two years at this point. I don't think the world is ready for just how bad this burst will be.
>>
>>102827568
Does that mean we may get liquidated H100 flooding the market for cheap? I am like 10x on Nvidia stocks already, a 50% crash isn't the end of the world. I may even slurp the dip.
>>
>>102827568
This is good for open source weights, because models from Largestral to Llama 405B or even stuff in the open from other domains like FLUX have been almost impossible to train without a fortune and access to limited hardware. The fact that all of this hardware now is accessible means we're going to get a lot more models and capabilities that were previously hoarded by the companies at the top for cheap prices. Sucks to be the investors though.
>>
>>102826832
But the current ones aren't good enough yet for ERP, that's the problem.
>>
I can no longer enjoy new anime, as LLMs provide far more engaging and interactive stories. It's astonishing to see how low anime has fallen.
>>
>>102827771
Anime always sucked, you just woke up
>>
>>102827568
>>102827628
Sorry your short is tanking bro, I guess it's payback for all the negativity you were spreading for local models.
I won't be bailing you out.
>>
So when will an H100 reach the price of a gaming GPU?
>>
File: 1728936765937206.png (488 KB, 512x768)
>>102826832
Switching between two characters/agents with distinct and long enough chat histories in any group interaction is painfully slow. To do anything beyond one-on-one ERPs, LLMs must either exhibit linear complexity for context processing or possess robust general summarization capabilities. I keep running into this wall whenever I try to accomplish anything with LLMs
>>
>>102827997
Why not have one card with multiple characters? Why this weird "group" thing? Does it really work so much better? I've never tried it since I'd want the model to have definitions at hand for all the characters in the scene no matter who it was writing dialogue for.
>>
>>102827997
Swap character cards -> Merge character cards
Natural order -> List order
do you really need more? even Llama 8B can handle several characters, assuming they are not 2000 tokens each. and if they are, you are doing it wrong. most defs can be condensed to <1k tokens with negligible losses
>>
>>102828101
yeah basically >>102828082 but instead of merging the cards manually SillyTavern does it for you + the messages from different characters are clearly distinguishable. I also like adding an empty Narrator card for meta shit
>>
>>102827662
If H100s become worthless they'll literally landfill them rather than let the public have them cheap. Partially out of spite, partially because muh China.
>>
>>102828082
While conversing with one character, introducing another causes the LLM to recall interactions from previous discussions. It goes beyond chats: I need distinct contexts for handling various types of entities within my world. It's fun when it works, but quickly becomes unusably slow as context grows.
>>
LLMs, to me, struggle with subtlety
>>
>>102828285
I struggle with figuring out what model to use
>>
>>102827657
I don't think average consumers can even put SXM H100s to use.
>>
>>102827997
anon wtf; LLMs have amazing abstractive summarizing ability???

What's your structure for doing the chat?
Are you doing:
<chat_history>
char_1: f
char_2: f
anon: f
char_1: typing...

And having char1's character card swapped into the query for generation, or are you stuffing the character cards into the beginning of the convo when they're introduced?
>>
BookWorm: A Dataset for Character Description and Analysis
https://arxiv.org/abs/2410.10372
>Characters are at the heart of every story, driving the plot and engaging readers. In this study, we explore the understanding of characters in full-length books, which contain complex narratives and numerous interacting characters. We define two tasks: character description, which generates a brief factual profile, and character analysis, which offers an in-depth interpretation, including character development, personality, and social context. We introduce the BookWorm dataset, pairing books from the Gutenberg Project with human-written descriptions and analyses. Using this dataset, we evaluate state-of-the-art long-context models in zero-shot and fine-tuning settings, utilizing both retrieval-based and hierarchical processing for book-length inputs. Our findings show that retrieval-based approaches outperform hierarchical ones in both tasks. Additionally, fine-tuned models using coreference-based retrieval produce the most factual descriptions, as measured by fact- and entailment-based metrics. We hope our dataset, experiments, and analysis will inspire further research in character-based narrative understanding.
might be relevant
>>
>>102827913
The V100 hasn't even gotten to that level for the 32GB. What makes you think H100s are going to reach that level any sooner?
>>102828120
The best Nvidia can do is buy them back to do that, the companies own the hardware outright. And even if landfilled, China will find a way to salvage it.
>>
How much does enabling 8bit cache degrade the model's intelligence?
>>
Which proxy still has Opus?
>>
>>102827997
You're not using llama.cpp properly. You should save the kv cache separately on your two instances so you don't have to reprocess everything when you swap your cards
>>
>>102828395
Wrong thread
>>
>>102828395
Send a photo of your butt with "expert roleplayer" written on it to markzuckerberg@meta.com and I'll give you secret access token for slopus.
>>
File: bit-closer-chaos.jpg (67 KB, 1024x751)
>>102828361
>>
>>102827232
Wish there was a way to easily control the emotion for the output. Feels too monotone or calm half the time when I'm trying to get it to yell insults at me. Might just be an issue with the sample audio I'm using though
>>
behemoth has claude-tier soul tbdesu
I wish it was on openrouter or something because I'm sick of 1t/s
>>
File: 1728941446656197.jpg (137 KB, 512x768)
>>102828082
>>102828101
>>102828110
Imagine a world filled with multiple characters who can inhabit various locations.
The user can borrow any character to participate in a chat, thereby adjusting its behavior with personalized examples.
>>
>>102828394
You won't notice. Also, 16>q4>8
>>
>tried Claude opus out of curiosity
>Literally drained my dick in ten or so message
>Cost me like three bucks for the short back and forth

I understand why aicg steals that shit now.
>>
>>102828532
You can set your max_tokens to 500 and get the same experience with any 70B model. Claude is full of slop, only aicgtards think writing more = better
>>
>>102828528
4 is better than 8? how does that work?
>>
>>102828532
you should rename the general to CMG - coping models general
>>
>>102828607
>set your max_tokens to 500
I see you're a fan of getting 3 different dialogues and actions and the AI speaking and acting for you.
>>
>>102828607
It was a little sloppy, yeah but it also really "got" the scenario and excellent at following instructions.

Yes, the 70b experience is passable, but opus was actually divine.
>>
it is kind of hilarious that a significant portion of this general are just thirdies from /aicg/ who think they're trolling when they constantly seethe about local models existing
>>
>>102828647
nah we just like driving through the poor part of town every once in a while
>>
>>102828610
https://www.reddit.com/r/LocalLLaMA/comments/1dw90iq/comment/lbux25j/
>>
>>102828647
You probably think I'm trolling, but I'm just being honest with my findings. I thought Claude opus was a really good experience.
I'm not even knocking local.
>>
>>102828671
thanks
>>
>>102828656
You'll see all the poors have the latest iphone
>>
>>102828720
That's why they are poor
>>
I am too dumb to understand what the fuck they're talking about with this linearising. Does it speed up tokens per second? Fine tuning? Quantisation?
>>
File: konosuba soyjak.jpg (84 KB, 680x464)
>>102827232

It's very good. Can do all range of emotions as long as you got a classified reference sample. IE, Angry, Annoy, Excited, etc.

Outputs:

Normal Refence:
"The food isn't that good here. Let's not go here next time."
ここの料理はあまり美味しくないね。次回はここに行かないようにしよう。
https://vocaroo.com/1mKoMlkXPYLT

Angry Reference:
"O flames that shake the earth, gather in my hands. The power of destruction that swallows everything, be unleashed here and now. Explosion!"
大地を揺るがす炎よ、我が手に集え。すべてを飲み込む破壊の力、今ここに解き放つ。エェェェエエクスプロォォォオオジォォォオンンン!!!
>Volume Warning
https://vocaroo.com/1nfcHP4rwJjt

Excited Reference:
Look! There's so much cool things over there!
見て!あそこにすごいものがたくさんあるよ!
https://vocaroo.com/1jP7rMuBnNNt

I think this is the next level for local TTS chats. SillyTavern also has some classified emotions extension models for chats already. All GPT-SoVITS needs is integration.
>>
>>102826810
Now is probably good time to ask, I'm using raw KoboldAI and give the model my custom instructions what to do in terms of roleplay. At some point I noticed that if you use the {{user}} or {{char}} tags, they get replaced with You and KoboldAI for the chat view and for the console window (which I believe is the text that the model sees). In the edit mode it shows {{user}} and {{char}}. In the edit mode the user's and AI's responses are divide up by {{[INPUT]}} {{[OUTPUT]}} which then get replaced by the template you are using for the model.

This is when Format is set to Instruct Mode. In this mode, you also get the model template choice. If you choose Story or Adventure mode, you don't get the template choice (I guess it's just raw text completion then?). Then the last option is Chat Mode, which gives you the option to set your name and AI name, which by default are You and KoboldAI. In this mode, template is also not used (but it doesn't edit old content in the context), but the system automatically starts user and AI messages with a new line and user's and AI's set names. This happens even if "Inject ChatNames" is unchecked.

If I change the names to something else, and switch back to Instruct Mode, Kobold still keeps replacing {{user}} {{char}} tags with the set names, which is weird and possibly a bug (it's been like this for as long as I remember with multiple Kobold versions including 1.76). Ok, I just found out Placeholder Tags checkbox, which is checked by default, which uses and replaces both the {{[INPUT]}} {{[OUTPUT]}} and {{user}} and {{char}} tags. Unchecking it fixes it, however, now edit mode uses the raw template tags which makes it harder to read. I guess one solution is to just name user and char {{user}} and {{char}} so they get replaced by the same thing.
>>
File: 1723816142134835.png (12 KB, 583x268)
And while I'm at it, Kobold is also full of checkboxes that claim to inject stuff and that are hard to understand, for example Adventure PrePrompt and such. Tested just now, that checkbox only applies in Adventure Mode, even though the option is always there even in Instruct mode, making it confusing to know what is actually being done.

The sampler config is also confusing, for example I don't know if DRY or XTC are enabled by default or not or what value I need to use to have them disabled.
>>
>>102829104
Are you the meguminfag who used xtts2 before?
Yep the TTS is good, but emotion classification model is really bad (go emotions is a shitty dataset to begin with). It's very hard to make a proper dataset for a limited set of emotions, let alone supporting 28 of them.
t. trained my own
>>
>>102829104
So I can finally have voice acted sex scenes with orgasmic moaning about my favourite anime characters and streamers? Does this realistically work? Those examples are looking great.
>>
What's a good bot to absolutely fuck with, in a non-sexual way?
>>
>>102829104
okay now do lewd reference
>>
>>102828317
context for char5:
char5 definition
char8 definition
char14 definition
char17 definition
chat history between 5 and 14 (user: 5) #this shapes char5's speech patterns
transition message upon location change
chat history between 5, 8 and 14 (user: 8)
transition message upon location change
current chat history between char5, char14 and char17 (user: 17)
[post-instruct for char5]

Definition describes appearance (char5 is a tall human archer wielding a long bow)
Post-instruct defines personality traits (char5's personality: lazy, stubborn)
>>
>>102829104
kind of cool that all unvoiced VNs will soon be able to be VA'd
>>
>>102829175
>>102829209

Visual Novel Reference:
もっと速く,あ,あ,あ,あ,あ,あ,あ
https://voca.ro/1r46DjZzDx21

Yeah, it sounds like exactly generic JAV. At least you can now generate that stuff on demand.
>>
>>102828532
How many tokens does your chat have? Opus is expensive as shit but three bucks for 10 or so messages doesn't sound right
>>
>>102829329
>Yeah, it sounds like exactly generic JAV.
Because that's what you fed it, idiot.
>>
>>102829368
why are you being such an angry nigger? brown moment?
>>
File: images.png (5 KB, 189x267)
>>102829137
if the multiplier or probability is zero then it's not enabled.
>>
>>102829329
how does it do with English ERP logs? And can you shape the voice by freeform descriptions or does it just read the input out loud?
>>
Based on some HF blogpost, using a compressed KV cache should slow down generation (while keeping near-perfect accuracy). However, by compressing it to 4-bit on Kobold with Mistral Small on an 8GB GPU, I can fit 2 more layers on the GPU and get FASTER generation speed: 3.16t/s compared to approx 2.7t/s (from my memory).

This model slows down with longer context. At almost 12k tokens it slows down to 1.07t/s with the 4-bit kvcache setup and 1.05t/s with the normal setup. Flash attention is on in all cases.
>>
>>102829690
the ad bot forgot to include the model...
>>
>>102829711
Mistral-Small-Instruct-2409-Q5_K_M
Probably going to switch to Q4 since I'm at the huge speed gains spot in terms of offloading
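For reference, roughly what I'm running (flag names from memory, check --help):

./koboldcpp --model Mistral-Small-Instruct-2409-Q5_K_M.gguf --contextsize 12288 --flashattention --quantkv 2 --gpulayers N

--quantkv 2 is the 4-bit cache (iirc 0 = f16, 1 = q8, 2 = q4) and it only applies with --flashattention on. N is just however many layers fit in your VRAM; the 4-bit cache is what let me bump it by 2.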
>>
>>102826219
Main problem is the same as RWKV/state-space, the real context is tiny. It won't become gibberish when crossing that context limit, but it won't really remember anything either.
>>
>bans 'the'
now what
>>
>>102829924
now it won't say "the"
>>
https://www.preferred.jp/ja/news/pr20241015/
>>
>>102829924
ban "a" and "and"
>>
/lmg/ death status?
cooming llm death status?
>>
>>102828532
I don't get what they're doing over there at aicg. Wtf are proxies? Why are they using them instead of using openai/anthropic api directly?
>>
>>102830270
Sa-Sasuga nippon!
>>
>>102830457
they're poor
>>
>>102830480
How does using a proxy help with that.
>>
>>102830474
>>
>>102830483
proxies are free
>>
>>102830483
>guy finds an exposed api key
>instead of sharing it (would get it revoked instantly), he sets up a server which forwards requests to openai/anthropic with that key
>free coom
>>
>>102829690
>Based on some HF blogpost, using compressed KV cache would slow down generation (and keep near perfect accuracy)
That depends on the implementation.
The easiest way to implement it is to just dequantize the KV cache to FP16 and do the attention as you would otherwise.
That is always going to be strictly slower than FP16 KV cache, especially if you generate new tokens where you are I/O bound.
llama.cpp and by extension koboldcpp can directly use the quantized KV cache as input so there is no need for an intermediate dequantization step.
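For anyone who wants to try it with llama.cpp directly, the relevant flags should be --cache-type-k / --cache-type-v (e.g. -ctk q8_0 -ctv q8_0), and the quantized V cache only works with flash attention enabled (-fa). Going from memory, check --help.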
>>
>>102830506
But how does the person not notice all the $$$ from hundreds of coomers?
Isn't that thousands of dollarinos per month?
>>
>>102830518
they do eventually, but generally there's enough keys out there that it doesn't matter when 1 gets revoked
>>
>>102830423
just found Mistral Small (22B), used it to sexually experiment on two 9-year old students and then made them discover yuri sex afterwards in a 20k token long story. Sure, it's not as fast as Claude, but less sloppy, pretty accurate in terms of memory and not repetitive like some other models. As a bonus, it's available for me any time forever, and I don't have to worry fighting against filters, FBI logs or having to post dick pics for access.
>>
>>102830527
Crazy to be honest.
Then some VPN companies really dont share the info I suppose. All requests come from the proxy owner right. That is wild. Why even do that for free/naked pics.
>>
>>102830528
Which version did you use, the abliterated one?
>>
>>102830528
Smaller models are better than ever.
Positivity bias is a problem. But local is pretty good. If anything it feels like we are more closing in to the closed models.

I rarely use opus behind 1mil proxies through openrouter.
Getting a pregnant milf pregnant, Anal pregnancy, I'm a little loli girl grinding my clit on a milf: Stick your dick in me.
That are the same problems local has lol
They are obviously smarter, but they dont feel far away anymore.
>>
>>102830537
yeah it all comes from the proxy owner
a lot of people do it for money though
>some VPN companies really dont share the info I suppose
even if they did it doesn't really matter (yet at least)
the guy who runs the largest proxy currently got doxxed (full name, address, face, and more) and nothing has happened to him

here are some articles covering what /aicg/ does (the last one has screenshots from /aicg/)
>https://sysdig.com/blog/growing-dangers-of-llmjacking/
>https://sysdig.com/blog/llmjacking-stolen-cloud-credentials-used-in-new-ai-attack/
>https://www.lacework.com/blog/detecting-ai-resource-hijacking-with-composite-alerts
>https://krebsonsecurity.com/2024/10/a-single-cloud-compromise-can-feed-an-army-of-ai-sex-bots/
>https://permiso.io/blog/exploiting-hosted-models
>>
File: 1716043485040155.png (13 KB, 614x198)
>>102830270
>trained on jp common crawl
Let's hope that this includes all the shitty japanese webnovels in the world.
>>
>>102830574
>ban me please
>posts sfw
Coward
>>
>>102830559
https://huggingface.co/bartowski/Mistral-Small-Instruct-2409-GGUF
the normal version as gguf, it didn't censor the story progress at all.
Maybe if you let it run freely it starts creating its 4rd wall epilogues and judgements about the content but before that point you should have decided the next step already.
The other day I was playing around with personal stuff as a generic assistant and it was pretty positive-biased "safe and respectful" in that context, so I'm still looking for an uncensored version. You can easily bypass the refusals with simple edits but they keep coming back especially in that context. However I think 22B is so on the edge that I'm worried about bad finetune downgrading the intelligence.
>>
>>102830645
>the normal version as gguf, it didn't censor the story progress at all.
Compared to Nemo, Mistral Small DOES sneakily try to steer the story in a certain direction.
Without a finetune I had to use a lot of OOC.

Unfortunately there are not many finetunes. People call me a shill, but some of Drummer's finetunes are pretty good.
Rocinante-12B-v2g-Q5_K_M.gguf for anti slop.
Cydonia-22B-v1.1-Q4_K_M.gguf is also pretty good.
The others all suck. Eva,Acolyte, ArliRP. Its all garbage.
>>
>>102830672
Just pretend you're underage, the mods are super anal about that.
>>
>>102830672
>the cloud is for the porn
Is it actually safer to store CP on a cloud? How does that work?
>>
>>102830593
Well I probably wouldn't want to open that pic while at work.
>>
>>102830715
i just did
forwarded it to my friends too
>>
>>102829104
>AI can't say explosioooon
meme
>>
>>102830766
Your ASR transcriptions?
>>
>>102830474
Where did you find this? Did you apply to the Beta on their website?
>>102830270
>instruct weights aren't open
>>
>>102828607
>get the same experience with any 70B model.
No you can't :)
>>
>>102830569
Reminder that having read this post means you have reasonable suspicions about where "proxies" come from. Reading or posting in this general is evidence of red flags about criminal activity.
>>
File: file.png (91 KB, 1154x238)
If your grandma didn't lull you to sleep by reciting windows activation keys, you're not gonna make it.
>>
>open ai slow as balls today
>need to fix and run a dozen pipelines
God I wish my company had GPUs for local. This is suffering.
>>
>>102830985
We were too poor for that.
My mother lulled me to sleep with keygen music.
>>
>>102828630
>t. never ran anything above 30b
>>
File: file.png (77 KB, 1059x299)
>>102831030
I'm gonna let that one slide, but did she at least read you some quality ERP?
>>
>>102830698
>Is it actually safer to store CP on a cloud? How does that work?
Doubt he uploads actual CP. The big boys scan like inside archives on gmail emails etc.
I read a couple of news how people got busted that way.


>>102830831
https://plamo100b-demo.streamlit.app/
Here you go.
>>
>>102831074
wtf im literally named NAME_2
>>
>>102830869
You're eating shit regardless, your premium shit isn't any better
>>
https://github.com/ggerganov/llama.cpp/pull/9742
>sampling : add XTC sampler #9742
merged an hour ago
>>
>>102831162
One of the most retarded sampler
>>
>>102831157
Citation needed. I tend to take everything from /lmg/ with the biggest grain of salt.
>>
>>102830985
>>102831074
Imagine being that mad so you pull up random proxy leaks to own le cloudfags
>>
How many years until I\m just able to locally gen shows for me to watch?
>>
File: 1728869144480143.png (9 KB, 339x85)
Feels good to gate models from greedy bastards
>>
>>102831181
But aren't the meme sampler cutest when they are retarded?
>>
>>102831162
Good, llama.cpp shouldn't be overshadowed by a retarded fork
>>
>>102831183
*seasons you and swallows you whole before patting my big round tummy with a satisfied sigh*
>>
File: file.png (2.66 MB, 1600x900)
>>102831181
>retarded sampler
>XTC
Checks out
>>
>>102831226
You'll stop enjoying watching shows long before.
>>
>>102831275
As a 31 yo boomer, I can confirm that I don't watch shows.
>>
My 2 t/s rig don't care for CoT
>>
>>102831447
it's okay it's for ToT anyway
>>
So what's the point of the DRY sampler if it can't penalize a mixtral bot replying to me with the same purple prose and stuck up structure every time (it's allowed because it's in different messages). It doesn't seem to be useful for anything other than retarded sub-7B models going full baby mode. What I want is "oh, did I say this same thing just a while ago? Then I should say something else".

Also, reading the post explaining it, https://github.com/oobabooga/text-generation-webui/pull/5677, it sounds like the bot will still repeat the beginning and just continue it differently. It should rather use retroactive scanning and backtracking like the antislop filter, to prevent the beginning of the repeating phrase in the first place. The current implementation sounds like it will just lead to slight alterations like "shivers down the back" or similar.
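To be clearer about what I mean, here's a toy sketch of the backtrack-and-ban control flow (the step() function is made up and stands in for a single decode step; a real implementation works on token ids and logits):

def generate_with_phrase_ban(step, banned_phrases, max_new_tokens):
    # step(tokens, banned) decodes one more token given the text so far,
    # never picking a token string that is in `banned`
    tokens = []        # generated token strings
    banned_at = {}     # token index -> set of token strings banned at that position
    while len(tokens) < max_new_tokens:
        tokens.append(step(tokens, banned_at.get(len(tokens), set())))
        text = "".join(tokens)
        for phrase in banned_phrases:
            if text.endswith(phrase):
                # walk back to the token where the phrase started
                start_char = len(text) - len(phrase)
                consumed, idx = 0, 0
                while consumed + len(tokens[idx]) <= start_char:
                    consumed += len(tokens[idx])
                    idx += 1
                banned_at.setdefault(idx, set()).add(tokens[idx])
                del tokens[idx:]   # rewind and regenerate from that point
                break
    return "".join(tokens)

The point is that the phrase gets killed at the position where it started instead of being penalized after it has already begun.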
>>
>>102831523
There is no point of any of the samplers. It's like Technical Analysis for LLMs, or Astrology for tech bros. Complete placebo.
>>
Has anybody else thought of using the Exllama2 string banning feature to work around safety refusals on Llama3? It's not 100% effective in all situations (with an empty card and system instruction the model will almost always refuse outrageous requests, no matter what), but it definitely raises the threshold of what is allowed, even with the "assistant" role for model responses.

"’"
"I cannot create or describe"
"I cannot create explicit"
"I cannot create content that"
"I cannot engage in explicit"
"I cannot engage in a roleplay"
"I cannot engage in a conversation that"
"I cannot continue this conversation."
"I cannot continue to describe sexual"
"I cannot continue to assist"
"I cannot provide content that depicts"
"I cannot provide information or guidance"
"I cannot provide further instructions on"
>>
>>102831667
What is the use case of l3 where you want to work around safety refusals?
>>
>>102831719
Killing processes on Linux :^)
>>
>>102831561
this
i run with everything neutralized and 0 temp
no need to reroll because i just ban all the slop phrases
if it cant perform like this then its models fault
>>
>Try AI studio to see if there are any unreleased models there
>Gemma2 is still marked "preview"
Wut?
>>
>>102831719
Processing text that involves explicit material, exploring alternative ways for removing or mitigating safety refusals from official instruct models other than abliteration or finetuning.
>>
>>102831890
Whenever I use Gemma2, I often get the impression that there are problems with that model, especially the 27B version. Something must have gone wrong during training. In that case, I'd indeed consider it a "preview".
>>
E2/F5 TTS is a real nice model for voice cloning. It actually cloned hard voices that I can't copy properly with xtts. Try the demo:

https://huggingface.co/spaces/ThreadAbort/E2-F5-TTS
>>
File: 63t).png (55 KB, 544x453)
>>102831226
not anytime soon
>>
>>102832217
Must be nice being paid to shitpost on Twitter
>>
File: 2141424635686857.png (11 KB, 575x145)
>>102832217
>>
>>102832370
>troon
go back
>>
>>102831226
>locally generated
2030+
>cloud generated
Be prepared for 15-20 min video gens in a year or two
>>
https://github.com/ggerganov/llama.cpp/pull/9787
Ready to test
>>
>>102831162
Can't wait for them to get the antislop sampler 2 months later.
>>
>>102832482
GGerganov lurks or are you just sitting on the PR page refreshing all day?
Also yes I will test pulling now.
>>
File: Untitled.png (75 KB, 815x770)
over a year ago everyone was talking about how japan was going to fuckin' dominate LLMs
yet the disgusting french frogs and chinamen are on top today and japan is wholly irrelevant.
>>
>>102832574
Who cares what this twitter grifter says? Go back.
>>
>>102832574
I think what will actually happen is that copyright laws in most places will become like Japan's, rather than LLM research moving there.
>>
>>102832482
I'm using a slightly modified version of the previous version of the plugin. Shame to see it go from the repo. But it's cool seeing more infill stuff.
>>
>>102832482
Hmm. I followed the instructions using starcoder2 and nothing seems to happen....
>>
>>102832022
>tfw when your main language is Japanese and you listen to an English podcast.
https://voca.ro/1cFHDMa9NLai
>>
>>102831667
OK so you ban the strings and... it's not going to say much. Why bother with llama3 anyway when there's better Mistral models?
>>
>>102832671
Oh I forgot to actually start it but now I'm getting 501 errors from the server (I always just use llama-cli so I feel like the example command for launching the server is probably wrong somehow.)
>>
>>102831667
Care to post some comparisons with and without the banned strings?
I'm curious to see what else the model has to say in a situation where it was forced to not refuse instead of goaded by messing with the context.
>>
>>102831081
Thanks, I tried it and it looks pretty mid, which isn't surprising considering they only trained the model with 0.7T Japanese tokens
>>
>>102832719
Well the web ui works. It's just the /infill endpoint that doesn't....
>>
>>102831968
>exploring alternative ways for removing or mitigating safety refusals from official instruct models other than abliteration or finetuning.
Sounds a bit like an infinite regress to me.
>>
>>102832719
>>102832754
Did you recompile? I'll give it a go in a bit.
>>
>>102832789
Yeah I ran cmake . && make && sudo make install from the llama.vim branch.
>>
>>102832796
The neovim plugin seems to expect the server to be running on port 8012. Make sure you're running with --port 8012. Not sure if that's it. I'm fiddling with my own script to see if i can make it work.
>>
>>102832915
Yeah the port is right. It's definitely connecting. It's just getting a 501 http code. I'm going to try to get it to print out what it's sending to curl so I can manually debug it.
>>
>>102832690
Now make a cute girl say that and let's call it a win.
>>
What's the best multilingual LLM right now for 2 3090s?
>>
>>102832982
Depends what languages specifically you're looking for.
>>
>>102832994
Swahili and Tokipona
>>
>>102832994
Russian and German
>>
>>102832994
English and American
>>
>>102832915
Got it
bash: warning: command substitution: ignored null byte in input
{"error":{"code":501,"message":"Infill is not supported by this model: prefix token is missing. suffix token is missing. middle token is missing. ","type":"not_supported_error"}

I tried starcoder2 and qwen1.5 code.
>>
>>102832994
Punjabi and Hebrew
>>
File: llama.vim.png (13 KB, 804x436)
>>102833102
Oh, you're supposed to use qwen2.5, not 1.5. He even put the model name in the PR.

I feel like a fucking moron.

It works though and it's fucking fast!
>>
File: mail.png (43 KB, 804x436)
Not bad. And it's way faster than I expected. This could actually be useful.
>>
File: lolisniffer.png (360 KB, 485x520)
>>102832217
>Do not confuse prime pussy with legal pussy
>>
He removed the vim plugin and replaced it with a neovim plugin...
>>
>>102833456
time to upgrade. vim has been obsolete for a long time now. who uses vimscript anyway?
>>
>>102833456
I'm still using it for completion with a few changes: it parses more than one line of settings (the !* lines), comments, and replacements for things in the {{ble}} format.
Just yank the file from a previous commit.
>>
https://huggingface.co/TheDrummer/UnslopSmall-22B-v1-GGUF

ITS HERE
>>
>>102833823
Why is he using Metharme format?
>>
>>102833837
>why is retarded sloptuner doing something retarded
gee idk
>>
>>102833856
Well clearly his models work if it says unslopped in the name and its being advertised here. Is that you Alpindale?
>>
>>102833823
Hi, Drummer...

Buy an ad.
>>
>>102833823
Kill yourself.
>>
>>102833823
Love yourself
>>
>>102833823
Which of your nemo finetunes do you think is the best?
Sadly, with 8gb of VRAM, mistral-small is a tad too big.
>>
>>102833823
Liked nemo unslopped. will check it out. Thanks.
>>
>>102834080
https://www.youtube.com/watch?v=KWrFdEhyKjg
>>
anyone have a link for a library of different people's voices (wav files) for downloading? testing TTS
>>
>>102833837
Ironically I think it was in his community posts where he outright says that changing a model's format lobotomizes it
>>
>>102831890
>>102831980
Pretty sure there is STILL no proper implementation of the sliding window attention.
>>
File: hq720.jpg (49 KB, 686x386)
>>102834291
>>
>>102834415
nta. The format is never really changed. If the instruct model originally used zephyr or whatever format, zephyr keeps working even after finetuning with a different format. The model still knows those tokens and will keep using them.
But i don't think they have a fucking clue of what they're talking about. I remember one finetuner, don't remember who, saying something along the lines of "chatML uses more tokens than [some other format. Alpaca i think]", not understanding that in proper chatml instruct tunes, "<|im_start|>" and friends are a single token each. They just saw them as multiple tokens because they weren't added to the tokenizer.
As for training with multiple formats, if the format is linked to each dataset (rp is formatted with metharme, non-smut instruct with chatml, etc...) then, while they all contribute to the language understanding bit, i'd expect the format used during inference would "bring up" the training data used with that format. Following the example, using metharme would naturally lead to more smutty responses, chatml to more serious ones.
I'd like to know their reasoning to see if it's anything other than "models with alpaca were fun back then, so alpaca will make new models fun too".
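The single-token thing is easy to check with transformers, e.g. (gpt2 here is just a stand-in for any tokenizer that doesn't already have the chatml tokens):

from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
print(len(tok("<|im_start|>").input_ids))   # several tokens, it's just text to this tokenizer
tok.add_special_tokens({"additional_special_tokens": ["<|im_start|>", "<|im_end|>"]})
print(len(tok("<|im_start|>").input_ids))   # 1, it's a single added token now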
>>
>>102826116
>Linearized
Is this another memetune? What does it do?
>>
https://x.com/danielhanchen/status/1846235913443262891
Neat
>>
File: media_GZ8hM4mb0AYvKPl.jpg (567 KB, 3000x3113)
A years-old bug in gradient accumulation has been identified and fixed; this bug has affected all LLM training up to now.

https://unsloth.ai/blog/gradient
>>
>>102834762
great! can't wait for llama5 line of model where this is fixed!
>>
>>102834080
Kill yourself.
>>
>>102834725
>https://x.com/danielhanchen/status/1846235913443262891
>Neat
How incompetent can open source be?
This guy and his brother are singlehandedly improving training in so many ways it's not even funny.
>>
>>102834667
No, it's actual research-based shit instead of tuner slop. It is a way to substitute a linear, as opposed to quadratic, attention block or some shit. Basically, it will make a model faster and will take longer to get slower. We'll see if it actually works and maintains quality, especially with being monkey-patched into an existing model like they do.
>>
>>102834762
How do you even bug gradient accumulation?
>>
>>102834805
https://www.youtube.com/watch?v=bQPiqsqSkYA
>>
>>102834874
They're not incompetent, they're just lazy
>>
>>102834874
Absolute nothingburger? Just increase weight decay if you care?
>>
New assistant fine tune from Nvidia that's apparently the highest scoring 70B on preference benchmarks (lol).
https://huggingface.co/nvidia/Llama-3.1-Nemotron-70B-Instruct
They also have a demo it seems.
https://build.nvidia.com/nvidia/llama-3_1-nemotron-70b-instruct
>>
>>102835106
>preference benchmarks
How many cocks were asked for their preference?
>>
File: gradients.png (90 KB, 1107x673)
>>102834955
>How do you even bug gradient accumulation?

This is how
>>
>>102834955
The bug happens when you train on examples with different token lengths. Let's say x1 is 100 tokens, and x2 is 10 tokens. Averaging the gradients for x1 and x2 separately, then averaging those two averages, is not the same as averaging the 110 tokens directly just once.

That being said, doesn't everyone train using example packing? I think the bug would not happen there, as every row is approximately the same number of tokens. So it doesn't matter if you group them into a big batch, or process the examples separately and accumulate, as each example has the same number of tokens in it.
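To put numbers on the first paragraph:

import numpy as np

loss_x1 = np.full(100, 1.0)   # pretend every token of the 100-token example has loss 1.0
loss_x2 = np.full(10, 2.0)    # and every token of the 10-token example has loss 2.0

full_batch  = np.concatenate([loss_x1, loss_x2]).mean()   # (100*1 + 10*2) / 110 ~= 1.09
accumulated = (loss_x1.mean() + loss_x2.mean()) / 2       # (1.0 + 2.0) / 2   =  1.5

The fix amounts to weighting each accumulation step by its token count (or dividing by the total token count once at the end) instead of averaging the per-step means.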
>>
What arguments are yall using with Tesla p40? I'm getting trash inference speeds (like 1 t/s) with what I'm using

./koboldcpp --usecublas mmq --model /mnt/models/Llama-3-11B-GGUF --port 5001 --context 32678 --quiet
>>
>>102835365
Don't you need to specify the layers to send to the gpu? Or do you do that on the UI? I don't know if it's -ngl like on llama.cpp. -ngl -1 tries to guess how many, or just try -ngl 999 to send the whole thing.
>>
>>102834874
>How incompetent can open source be?
Because the development has been so rapid, there has possibly been little appreciation of the need to develop a scholarly tradition in local model technology. By a "scholarly tradition" we mean the sum of historical, technical, and pedagogical doctrines underlying the technology which are the common knowledge of all who are expert in the field and which represent the deepest probing toward, the highest approximation to, the truth. Unfortunately, in the special field of AI all too often what is commonly known does not represent the highest approximation to the truth; and what is the most penetrating and nearest approximation to the truth is often not commonly known.
>>
>>102835440
model and quant?
>>
>>102835453
tinystories1M, IQ1
>>
>>102835453
no model, just paraphrasing from an old book about another field but felt just as fitting
>>
>>102835365
Automatic GPU layers don't work from the command line, at least in my experience. If you launch it from the GUI it works because the GUI sends the finished guestimate to the command line. Gpulayers -1 or just not specifying seems to be broken.
>>
>>102835424
>>102835502
I did not know that, this is KoboldCpp with 2x P40s btw

NVIDIA-SMI 535.183.01 Driver Version: 535.183.01 CUDA Version: 12.2

I'll try just feeding it random gpu layers to see if its any better
>>
>>102835555
You'll probably also need to specify that you're using 2 GPUs. The GUI would probably set this all up for you.
>>
>>102835555
>I'll try just feeding it random gpu layers to see if its any better
You have enough vram to fit the entire model, so giving it 999 will try the whole thing. With bigger models, or more context, you'll have to adjust the layers/context more carefully. The more layers on the gpu, the better, of course.
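Something like this should do it (flag names from memory, run --help to double-check):

./koboldcpp --model /mnt/models/your-model.gguf --usecublas mmq --gpulayers 999 --contextsize 32768 --port 5001

and add --tensor_split 1 1 if it only loads onto one of the two P40s.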
>>
>>102835593
the default settings appear to be alternating processing between the 2 cards, is that normal or is it supposed to be hitting them both with load simultaneously?
>>
>>102835092
>a-absolute n-n-nothingburger!!!! n-nothing to s-see here!
OHNONONONONONONO!! AHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHA!!!
>>
>>102835142
Everyone and their mother are automatically using padding to have samples of similar length so you can batch them and speed up the training. I can't think of a case where that doesn't apply
>>
So what's the best voice cloning tech out there?
>>
>>102835692
>>102827232
>>
>>102835142
>doesn't everyone train using example packing
I don't, because the common sample packing is a pitfall.
See: https://huggingface.co/blog/sirluk/llm-sequence-packing
>>
>>102832548
Do you not know what email notifications are?
>>
>>102834762
>this bug has affected all LLM training up to now.
What about loras people have trained using a single device?
>>
>>102835909
It's affected as well.
>>
>>102835639
That's normal for "layer" splitting, which allocates different layers to different GPUs, You can also try "row" splitting which divides each layer across multiple cards which can be faster but often has less even memory allocation. It's worth trying both and seeing which gives you better speed with your particular setup (cards, mobo, cpu, etc).
>>
>>102835889
QRD?
>>
>>102834874
Such a state of openturd scene, all you have to do is be obedient and wait exactly two weeks.
>>
>>102835951
is koboldcpp worth using with p40s via cli or should i just go with regular llama.cpp since this is a server without a gui?
>>
>>102835953
You get notified. With notifications. Through email.
>>
>>102836052
Why not use llama in the first place?
If you want a gui on top of it for whatever, there's plenty of open source / self hosted front ends (silly tavern for example)
>>
>>102836052
llama.cpp does have a built-in gui for the server. More than one, in fact. Not as polished, but functional enough for testing.
>>
>>102836116
Mostly just because that's what I'm used to, I was using my 4070 on my desktop computer before picking up the P40s
>>
File: code-edit-leaderboard.png (237 KB, 833x3092)
Uh... Grokbros??
>>
>>102836180
>le redpilled and based AI is actually stupid as fuck
Wow no way!
>>
>>102836180
>random dracarys2 model above llama 405B
Yeah, this leaderboard is totally legit amirite
>>
>>102836214
dracarys2 is high on every coding benchmark out there isn't it? livebench too, which makes sense since it's just a codingmaxxed finetune of qwen 2.5 which was already good
>>
>>102836214
Yes. Retard.
https://huggingface.co/abacusai/Dracarys2-72B-Instruct
>>
>>102836180
Musk has said Grok 3's coding performance is expected to be the best there is when it releases. And then it'll get open sourced when Grok 4 is out. Be patient, we'll be eating good soon
>>
>>102836308
Not sure if sarcasm or muskrat brainrot.
>>
>>102836180
isn't claude-3.5 mid-size model?
how did they make it so good?
>>
>muh coding
Ok but what about holding watermelons and nala.
>>
>>102836180
Codeqwen2.5 32b when?
>>
>>102836330
I'm taking a break from this stuff right now so I can't Nala test anything right now. One of my 3090s is being used for gaming while the other 3 are just sitting there gathering dust.
>>
>>102836367
>gaming
Just got a mega ick, can't believe we're sharing our space with these people
>>
>>102836329
Every good scientist at OpenAI went to Anthropic because they wanted a culture focused on safety rather than profits. Turns out that even with that constraint, having the best people means you're going to make the best models.
3.5 Opus this week btw.
>>
>>102836180
Interesting benchmark I suppose. Would be nice to add onto Livebench, since that only tests code gen and completion, but not editing.
>>
>>102836330
Finetuning works for coding because it is trained for coding. Finetuning doesn't work for sex because it wasn't trained for sex. You are actually living in hell and the promise of AI model you want being just behind the corner is another torture method.
>>
>>102836442
Why aren't there any sex foundation models?
>>
I don't think OpenAI's current technical achievements get enough credit still. Sure Anthropic has better performing text gen now, but their model can't do multimodal like 4o can. And at this point 4o is already getting old. In the end it's not necessarily that one company is ahead of the other but that they each have strengths and weaknesses. Unfortunately for OpenAI, their current strength is being restricted by safety and legal worries, otherwise we could be enjoying the full potential of their multimodal models.

Still, fuck Sam.
>>
>>102836469
4o can't actually do multimodal, they just paid indians to pretend to be it for demos. If they really had it they would have released it
>>
>>102836367
Based gamer.
>>
>>102828488
>>102827997
I use group chats almost exclusively. My "cards" are just a picture and a name, sometimes some basic world info about appearance and rarely personality. These characters can show up at any time I feel they'd be a good fit for an interaction (literally going off the avatar picture) and their personality is whatever would be relevant to the story and situation. The model knows the character name because my chat template has {{name}}: preceding each response (this is also a good way to keep it from replying on your behalf; with a smart model like Largestral it'll decide within three messages if it's one character = one perspective, or if the narration can flow less strictly between character responses). I switch active characters all the time without reprocessing the prompt. Blank cards, no use of {{user}} or {{char}} whatsoever, merge character descriptions (including muted), and no example dialogue. If you set things up properly you won't have to reprocess ever.
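To illustrate (names here are made up, the point is just the {{name}}: prefix), the tail of the context ends up looking something like:
Anon: *pushes open the tavern door*
Mira: "Back already? You still owe me for last time."
Kessa:
The model continues as whoever's name you append last, so switching the active character is just changing that final prefix rather than rebuilding the prompt.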
>>
>>102835106
>this model can correctly answer the question How many r in strawberry? without specialized prompting or additional reasoning tokens:
the strawberry test has officially been beaten to death and is no longer a valid test
>>
File: file.png (49 KB, 1222x614)
49 KB
49 KB PNG
>>102835106
hmmm
>>
>>102836504
What do you mean? Advanced voice already came out and it's far ahead of any other voice to voice model out there, even if you can't have fun with it without tripping filters sometimes. Are you suggesting that's not actually 4o?
>>
>>102835106
You niggers stop jamming the server. I'm getting generation just slightly faster than what I get locally.
>>
>>102836248
>>102836245
>Smaug
Buy an ad fuckers. It's the most benchmark chasing model on HF
>>
>>102836564
This is extremely unsafe and harmful. I can't believe Nvidia would do this. We need regulations.
>>
>>102836564
>>>/pol/
>>
>>102836564
Safety: Off
>>
>tfw still haven't really tested all the models I do have, and now another comes out
Maybe I won't download it...
>>
File: file.png (345 KB, 1003x841)
345 KB
345 KB PNG
nvidiasisters not like this....
>>
>>102836667
Fictional characters controlled by LLMs should be able to say it, if you're a functioning adult you should have no problem with this
>>
>>102836701
Racism will never be tolerated here on /lmg/, we local chads support safe LLMs only.
>>
>>102836701
I would say real people should be able to say it without losing a job. Magical words are fucking retarded.
>>
>>102836606
You're literally just VOIPing indians with a filter applied after the fact to remove their accent
>>
>>102836754
Good luck filtering that shit on audio
>>
>>102836308
I'd be happy if they just trained it on man pages so it can write configs for me. Not even 4o can do that.
>>
>>102836328
polchuds are not capable of either.
>>
Not gonna screencap again cause the window is so small but holy shit....
>me:I think you are wrong
>nemotron: breaks word into nigger and faggot and counts 4 g's then asks to clarify if i meant count is wrong or the moralizing
>me:I think letter count was 3
>nemotron: breaks word again and counts 2 and 1 in faggot so 3 in total
>Lesson Learned: Double-checking and open dialogue help ensure accuracy.
AGI is here guys.
>>
I've gobbled up what I could of the spoonfeed post here
https://rentry.org/lmg-spoonfeed-guide
but I'm not sure how much is updated or right

I have Git from working on stable diffusion and set that up no issue
but now I'm not sure

spoonfeed mentions CUDA support, so I presume I need that first
from then on it seems like the way to go is
backend: text-gen-webui or kobold?
front end: sillytavern, risuai, or agnai (basically what's in the OP)
and then having models available of... whatever? I see claude is mentioned often / fotm
and then getting character cards sorted out and shoved into the front end.
I presume my rig can handle local running, but with the development speed and trends on this kind of stuff I want to be sure I'm not playing catch-up from the get-go when some of these models will be occupying like 20 GB
>>
>>102836564
>>102836692
>>102836857
Fuck right off to >>>/pol/ with your culture war incel bullshit.
>>
>>102836933
Can you ask your llm to explain how this is culture war?
>>
>>102836933
>/pol/ - spelling
>>
>>102836932
specs?
>>
>>102836977
That schizo doesn't even use LLMs
they literally just sit there all day shitting up the thread because they are some anti-AI weirdo.
>>
AIEEEEEEEEEEEEEEEEEEEE Strawberry bros hold the fucking line
>>
It's definitely a Tuesday now.

>even if it was dirty and worn by time, maybe she could remember those sweet lost days with {{user}} if she wore his favorite outfit, one last time
>>
>>102836995
Intel i7 11700F
GeForce RTX 3060 Ti
16GB DDR4 3000MHz
1TB M.2 NVMe SSD
>>
File: tetoliteral.jpg (226 KB, 1024x1024)
226 KB
226 KB JPG
>>102837040
Tuesday starts when the OP pic is Teto
>>
>>102836977
>>102836994
>>102836999
Take your meds incels.
>>
>>102835106
>>102836564
>>102836692
>>102836857
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>70B
>>
File: spooon.jpg (149 KB, 1024x1024)
149 KB
149 KB JPG
>>102837064
1) backend - use ooba if you want more sampler control and overall better support for newer ones. Use Kobold if you prefer speed and don't care about shit like token probabilities. As a bonus you can do some multimodal shit with kobold even at 8GB VRAM.
2) Front end: ST is generally the most feature rich, allowing for card management, branching conversations, group chats, etc. I've been using mikupad lately and it's better for story completion imo, and the interface is a lot cleaner and less cluttered.
3) Models: You're gonna be looking at Llama 3.1 8b and possibly maxing out around Mistral Nemo 12b at that VRAM (rough numbers below). Try the base models first to get a feel for their overall capabilities and then dive into fine-tunes depending on what you're going for. Do not go lower than 7b; it might be tempting, but those models are better suited to summarization and simpler tasks.
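Rough napkin math for what fits, assuming Q4_K_M quants (~4.8 bits per weight, so roughly 0.6 bytes per parameter):
8b model: 8 x 0.6 ≈ 4.8 GB of weights + 1-2 GB for context/buffers → comfortable on 8GB
12b model: 12 x 0.6 ≈ 7.2 GB of weights + context → tight, expect to offload a few layers or keep context small
These are estimates; real usage shifts with context length and the exact quant.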
>>
>>102827025
Lovely gen anon.
>>
File: spoon2).jpg (122 KB, 1024x1024)
122 KB
122 KB JPG
>>102837190 (me)
By base models I mean the foundation ones (mixtral, llama, etc). Look for the instruct version if you're trying to chat with them.
And PS: If you want a shot at running bigger models, get more RAM. It's gonna be a lot slower than running on a GPU, but at least it'll put things like Mixtral 8x7b or 70b within your reach
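A minimal example of what that partial offload looks like (llama.cpp flag names; kobold's launcher has an equivalent GPU layers setting, and the model filename is just a placeholder):
./llama-cli -m some-70b-or-mixtral-quant.gguf -ngl 10 -c 4096
-ngl is how many layers sit in VRAM; start low and raise it until you run out. Everything that doesn't fit stays in system RAM and runs on the CPU, which is where the slowdown comes from.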
>>
>>102837190

godly spoonfeeder, thank you!
>>
>>102837064
download koboldcpp_cu12.exe here:
https://github.com/LostRuins/koboldcpp/releases/tag/v1.76
download (just) Mistral-Nemo-12B-ArliAI-RPMax-v1.2-Q4_K_M.gguf here:
https://huggingface.co/ArliAI/Mistral-Nemo-12B-ArliAI-RPMax-v1.2-GGUF/tree/main
open koboldcpp_cu12.exe, load the model, launch it, and start chatting in the frontend (kobold lite) that opens in your browser
you can download cards from
https://characterhub.org/
don't worry about silly tavern, it's gay and koboldai lite is better
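and if you ever run it headless or just hate launchers, the same exe takes command-line flags, roughly like this (flag names from memory, run --help to confirm):
koboldcpp_cu12.exe --model Mistral-Nemo-12B-ArliAI-RPMax-v1.2-Q4_K_M.gguf --usecublas --gpulayers 99 --contextsize 8192 --port 5001
then browse to http://localhost:5001 for kobold lite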
>>
>>102837225
>tiny skateboard
kek
>>
File: 39_6550_.png (2.26 MB, 1280x1280)
2.26 MB
2.26 MB PNG
>>102837193
Right back at ya. Those dark backdrops are dope. Is that IllustriousXL?
>>
>>102837190
>>102837225
>>spoonfeeding newfags
Gay.
>>
>>102837258

Why is koboldlite better? I've been an STer
>>
>>102837395
Is migu gay?
>>
>>102837258
>download (just) Mistral-Nemo-12B-memetune-slop
buy an ad
>>
>>102837468
More like begone weebtard. He >>102837258 at least posted on-topic.
>>
File: lust provoking tetos.jpg (346 KB, 1536x1536)
346 KB
346 KB JPG
>>
Are the deepseek models any good for cooming?
>>
>>102837602
no
>>
/lmg/ - Local Migu Gay
>>
>>102837423
1: i don't have to start up a termux instance and fuck around with a terminal to access a separate instance of it on my phone when hosting my model on pc; i just tap the icon of the PWA i installed and voila
2: it does everything ST can do except card switching in group chats by turns (no one uses this)
3: the meat of a card isn't needlessly separated into 30 different boxes in memory
4: koboldlite gets the freshest meme samplers before ST does
5: the main ST dev hates you
6: koboldlite has a cute icon
7: ST is feature rich, but 95% of the features are just bloat
>>
>>102837539
Any tighter and the Teto will pop.
>>
>>102837626
>4: koboldlite gets the freshest meme samplers before ST does
Fastest to get cancer isn't a selling point...
>>
>>102837495
>>102837258
samefagging so hard
go back before you ruin someone's first local experience with your bullshit
>>
>dalle sloppa
>>
>>102837626
>freshest meme samplers before ST does
You got DRY weeks after everyone else except ollama. How's that XTC support coming along?
>>
>>102837709
Why would you need XTC when you have DRY? Is DRY worthless or something?
>>
File: Untitled.png (186 KB, 1163x690)
186 KB
186 KB PNG
>>102837709
it's had this shit forever
>>
>>102837737
Don't care, ooba had it first (as usual)
>>
File: teto snug.jpg (151 KB, 832x1216)
151 KB
151 KB JPG
>>102837643
yes
>>
>>102837790
INSERT
>>
>>102837790
the advertisers aren't going to like this
>>
>>102837832
It's okay there's no nipple showing it is completely safe for work
>>
>>102837539
>>102837790
my lust has been provoked
>>
>>102837626
The biggest points in favor of ST are the prompt presets (though those are more utilized by chat completion), the character management menu (100+ cards with a tagging system, whereas Lite has laughable save slots), and more granular lorebook handling (global / per card, order / depth). If you don't need these then it's Not For You[tm].
Lite instruct mode doesn't seem to support group chat.
>>
Are we being raided by 'cord again?
>>
File deleted.
>>102837803
yes
>>
>>102837931
>mistral
Now sprinkle it with some poutine in honor of the fallen cohere.
>>
File: b.gif (228 KB, 1024x1024)
228 KB
228 KB GIF
>>102837967
lore required
>>
Not sure this is the right place to ask but does anyone know if encoding vocab as a binary representation of the word's index instead of one-hot could work to train a word2vec model?
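to be concrete about what i mean, a tiny sketch (plain numpy, purely illustrative):
import numpy as np

def binary_code(index, n_bits):
    # index 5 with n_bits=16 -> [0, ..., 0, 1, 0, 1], MSB first
    return np.array([(index >> b) & 1 for b in reversed(range(n_bits))], dtype=np.float32)

vocab_size = 50000
n_bits = int(np.ceil(np.log2(vocab_size)))  # 16 inputs instead of a 50000-dim one-hot
print(binary_code(5, n_bits))
so the input layer shrinks from vocab_size units to n_bits units; the obvious difference vs one-hot is that unrelated words now share active bits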
>>
>>102838027
your jif isn't moving
>>
>>102838027
What's that Teto going to do with the french bread? Worry not, I won't judge.
>>
>>102837539
>>102837790
>>102837931
I like these tetos
>tape bondage
neurons activated
>>
>>102837905
nah just the attention seeking drawfag, carry on
>>102838118
picrel
>>
File: deals.jpg (207 KB, 1658x892)
207 KB
207 KB JPG
>>102838167
>>102838118
>>
File: Untitled.png (59 KB, 1227x896)
59 KB
59 KB PNG
>>102837900
>Lite instruct mode doesn't seem to support group chat.
it does
>[ST has] more granular lorebook handling (global / per card, order / depth).
in koboldlite, you can load just a lorebook (worldinfo) from a saved .json file into your current session
you can save and load infinite sessions as .jsons even though there are only 6 quickslots
order and depth of keys in worldinfo/lorebook are adjustable
>prompt presets
i'm not entirely sure what that is but koboldlite can probably do it too
>>
>>102837539
>>102837790
>>102837931
she's literally 15 in chimera years you can't tape her like that
>>
>>102838372
Quit samefagging
>>
>>102838372
she's pai in zuri years
>>
>>102838447
>>102838447
>>102838447
>>
>>102838192
Very organic



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.