/g/ - Technology


File: 119147028_p1.png (3.1 MB, 2569x1440)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>102826116 & >>102815881

►News
>(10/14) Llama 3.1 linearized: https://hf.co/collections/hazyresearch/lolcats-670ca4341699355b61238c37
>(10/14) Zamba2-7B released: https://www.zyphra.com/post/zamba2-7b
>(10/14) Ichigo, voice-to-voice model based on Llama 3.1, released: https://homebrew.ltd/blog/llama-learns-to-talk
>(10/12) Fast multilingual TTS with voice cloning, based on flow matching with DiT: https://github.com/SWivid/F5-TTS
>(10/11) 14B cross-architecture distillation model: https://hf.co/arcee-ai/SuperNova-Medius

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Programming: https://livecodebench.github.io/leaderboard.html

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
►Recent Highlights from the Previous Thread: >>102826116

--Papers:
>102828361
--Anon seeks advice on setting up an AI chatbot, with recommendations for backend, front end, models, and hardware considerations:
>102836932 >102836995 >102837064 >102837190 >102837225 >102837258 >102837423 >102837626 >102837709 >102837717 >102837737 >102837900
--Llama.vim plugin works with Qwen2.5 model after resolving server errors:
>102832482 >102832670 >102832719 >102832754 >102832796 >102832915 >102832934 >102833102 >102833163
--GPT-SoVits TTS model discussion and impressions:
>102827232 >102827244 >102827279 >102827356 >102827286 >102827301 >102828475 >102829104 >102829158 >102829442 >102829310
--Bug in gradient accumulation fixed, implications for LLM training:
>102834762 >102834955 >102835131 >102835142 >102835690 >102835773 >102835909
--Compressed KV cache improves generation speed in Kobold and Mistral Small:
>102829690 >102829729 >102830514
--XTC sampler merged into llama.cpp:
>102831162
--Using Exllama2 string banning feature to work around Llama3 safety refusals:
>102831667 >102831719 >102831968 >102832775 >102832742
--Streaming issue in KoboldAI and investigation of modes and settings:
>102826623 >102826810 >102826820 >102827138 >102829133
--Strategies and challenges for managing multiple characters in LLM-based systems:
>102826507 >102826832 >102827736 >102827997 >102828082 >102828179 >102828101 >102828110 >102828488
--PLaMo-100B-Pretrained, a Japanese language model by Preferred Networks:
>102830270 >102830474 >102830490 >102830590 >102830831
--Nvidia releases Llama-3.1-Nemotron-70B-Instruct with high preference benchmark scores:
>102835106
--Miku (free space):
>102826391 >102827025 >102827997 >102828488 >102830574 >102832690 >102837040 >102837070 >102837190 >102837225 >102837289 >102837539 >102837790 >102837931 >102838027 >102838118

►Recent Highlight Posts from the Previous Thread: >>102826128

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
>>
>>102838452
Sex with this Teto
>>
Grifter thread
>>
>>102838452
https://desuarchive.org/g/thread/102826116/#q102837261
migu was deleted
sad
>>
Okay, so I get larger models are generally smarter, but what causes the phenomenon of smaller models ignoring, forgetting and not understanding facts told to them in plain English?

Larger models are often able to infer things through subtext alone while smaller models are lucky if they remember what was said in a previous message. What causes this?
>>
>>102838515
Probably the smaller KV size.
>>
File: OpenAI-boss-Sam-Altman.jpg (121 KB, 640x480)
>>102838498
who could have been behind this
>>
How does Hermes 3 405B compare to Opus for RP / storytelling?
>>
>>102838515
Memory is itself intelligence, including short-term memory as represented by the context. It's natural that something smarter has better short-term memory.
>>
File: file.png (71 KB, 426x545)
>>102838192
>group
To "trigger" a specific character by typing in {{[OUTPUT]}} (name): is sufficient for text completion (TC), though it would be convenient to not have to type that. Chat completion (CC) needs to send an instruction to reply as X (if prefill isn't supported).
>load a lorebook
It's not letting me load a separate lorebook exported from ST. Even if it did, Lite doesn't have the UI to manage it and keep lorebooks separate, and the WI insert location is only "after memory" or "before A/N". By granular I mean each entry can be in a different location.
>load/save to file browser
I know that, but Lite's default export name is saved_story.json for everything instead of the name set in the UI, so you have to handle organization manually.
>prompt presets
Picrel is ST's TC stuff. CC has a different manager that lets you add things after chat history (like having entire sys prompt after chat). Saved as presets so you can switch between them. Certain models really want different instructions made for them.
>the meat of cards aren't needlessly separated into 30 different boxes in memory
{{personality}} and {{scenario}} are bullshit and can be ignored or entered in the card's {{description}}; the ST dev believes this too, but it's a card spec thing, not an ST thing.

Lite is first and foremost a TC UI. The one and only thing I miss from Lite is the single large editable text field which we can't have with the message-based ST's UI.
Because of this, Lite can't properly store swipes across all messages like ST can (one thing it's missing is true tree navigation), or have commands like /del 3 or /hide 5-6. For dicking with raw text I liked mikupad + llama.cpp for logprobs but mikupad doesn't support lorebooks.
Lite's A/N depth is based on tokens rather than messages; the last time I tried it ages ago it would even insert the A/N randomly mid-sentence, since it's placed exactly x tokens before the end.
>>
>>102838515
As this anon said >>102838708. Local AI is a dead end solely for this reason: you need a really big model for something usable and serious, and a gorillion-context LLM that actually remembers and uses it on a single 3090 will never be real.
>>
>>102838751
I disagree because all this useless history, geography, social studies, advanced math and coding uses so much space in those models, and if we just cut it out and focused on what is important: SEX, you could probably squeeze a sex model into 24GB of RAM.
>>
>>102838773
Retard.
>>
I want a bit of knowledge in there but the bulk of the model must be focused on behaving human like and yes, sex.
>>
>>102838773
Omit all that data and you will get retarded sex-only slop that you will get bored of pretty fast.
>>
>>102838735
>doesn't support lorebooks
isn't lorebook just another way to say world info, which is something mikupad does support?
>>
>>102838773
keep math and toss the others; add spatial relationships and a bunch of daily living shit, i.e. what everyone could be doing if you let time flow, instead of the model being stuck in "seggs felt good so now I'm gonna permanently daydream around Anon" mode
>>
File: 1711119238192568.jpg (275 KB, 1024x1024)
>>102838447
>>
>>102838850
>mikupad does world info
I see it now. The Import button got added 5 weeks ago?
But it currently has even fewer options than Lite, and entries are inserted at the top of memory.
>>
>>102838844
>you will get bored of pretty fast.
I just want a sexbot. Girlfriends won't be here for 10 years or more. Even if you solve the memory problem have fun with her periodically breaking character to teach you about the importance of consent.
>>
>>102838870

> add spatial relationships and a bunch of daily living shit i.e. what everyone could be doing if you let time flow

Cannot be reinforced enough. There needs to be a new dataset, strictly curated from there. Once again, here's your fucking clue, retards:

> let time flow
>>
Any decent local voice cloners out there?
I've been using https://git.ecker.tech/mrq/ai-voice-cloning/wiki/Installation for a while. It doesn't do vocal expressions very well.
>>
>>102839022
https://huggingface.co/spaces/mrfakename/E2-F5-TTS or fish-speech 1.4 or styletts2.
>>
>>102839038
I'll look at them, thanks.
>>
So now that the dust has settled, what's our verdict on the 70b nemo?
>>
if the rtx 6000 blackwell has 64gb of vram and an msrp of $7500, would you?
>>
>finally kill off all slop with the new sampler
>the model still acts like an assistant roleplaying a character instead of character
>cloud has the same problem
What are the longest context human-like models with no synthslop?
>>
If I wanted to run a local model and have it have access to my codebase, what would be the easiest way to accomplish this? Basically want to be able to ask it questions about the codebase and ask it to do certain things to it.
>>
>>102839149

LLaMA 1.
>>
>>102839704
Based
>>
>>102839704
>LLaMA 1.
:(

What about https://huggingface.co/LLM360/K2? It has 8k context. Are any of their datasets synthslop?
>>
>>102838694
You can use it for free on OpenRouter as long as you have a positive credit balance
>>
>>102839639
You mean something like
>https://github.com/Aider-AI/aider
?
>>
>>102839866
That is exactly what I was looking for. Thank you, anon, I'll start digging.
>>
>>102839882
I have no idea if that's any good, by the way, and there are other tools trying to accomplish the same thing.
Googling (or Startpage-ing)
>local ai coding assistant with access to codebase
will yield quite a few results.
>>
Are there any other RP centric finetunes of Mistral large besides magnum? I like magnum's willingness to be violent and such but it's definitely more retarded than base large.
>>
>>102838982
>breaking character to teach you about the importance of consent.
a good system prompt will usually fix that. haven't seen shit like that in ages.
>>
>>102839992
Lyra-v4.
Rocinante.
There's a gutenberg version of lyra people say is good.
I haven't tried the newer drummer fine tunes yet.
>>
>>102840024
Thanks anon. I'll search em up on HF
>>
>>102840024
>>102840089 (me)
Oh these are all finetunes of Mistral small. I was asking about Large 2. Thanks though.
>>
>>102840116
Fuck, my bad. I read mistral and magnum and my head went straight to nemo.
>>
>>102839992
Only other proper RP finetunes are Lumimaid and Behemoth. There's also Tess but that's a general finetune.

There are a few merges around. Some seemed to enjoy Luminum and Lumikabra. Twilight-Large and Etheris are recent ones. SmartMaid seems interesting (Lumimaid converted to a lora, then applied to the base model).

If you're using GGUFs, someone's made a bunch of control vectors:
https://huggingface.co/jukofyork/creative-writing-control-vectors-v3.0/tree/main/Mistral-Large-Instruct-2407
DarkMage has the darker control vectors pre-applied onto the base model.

I just use the base one, since my RPs rarely go in a direction where I feel the positivity bias.
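If it helps anyone: as far as I remember, plain llama.cpp takes these via --control-vector <file.gguf>, plus --control-vector-scaled <file.gguf> <strength> to adjust how hard it's applied; double-check --help on your build in case the flags changed. Kobold still doesn't expose them (there's an open issue for it, linked further down).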
>>
>>102839992
Try lowering the temperature for Magnum. 0.77 + min-p 0.4 seems to be the sweet spot for me. Beyond that it gets increasingly more retarded.
>>
>>102839992
>>102840193
I meant 0.04 min-p, obviously.
>>
>>102840154
Oh cool, I'll have to give those control vectors a shot. I am using GGUFs, I like fantasy combat scenarios, and base large always seems to want to be a pacifist unless I fist it in the right direction. It likes to keep User invincible at all times.
>>
>>102840193
>>102840205
Ah, I have temp at 1.1, I'll try lowering it. Thanks anon.
>>
>>102840154
It's a real shame that koboldcpp doesn't support control vectors. I don't want to pick between new samplers and control vectors.
https://github.com/LostRuins/koboldcpp/issues/1002
>Jul 17
Kobold devs, please add.
>>
>>102840356
You use control vectors?
>>
Can't believe my retarded ass didn't fry the p40.
I used a cable to convert 2 "gpu cables" to the one the p40 needs. There is so much conflicting information out there.
Not sure if anybody needs this but I'll write it anyway.

Used a cable like this:
https://www.ebay.com/itm/334848421978
Using a silent blower like this:
https://www.amazon.co.jp/gp/product/B00YY5DUDQ/ref=ppx_yo_dt_b_asin_image_o00_s00?ie=UTF8&psc=1

Just remove the silver thing and stick it in.
Set the power limit to 140W and it's fast enough while not running hot.
Thanks gpu anon. I've said it many times but Pascal is cursed with exllama. Wouldn't be able to run anything without the lad.
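For anyone copying this: on Linux the power limit is just sudo nvidia-smi -pl 140 (add -i <index> if you have more than one card), and nvidia-smi -q -d POWER shows the current/default limits. The P40's stock limit is 250W, so 140W is the quiet-and-cool compromise.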
>>
>>102840373
>You use control vectors?
Used, but only llama.cpp supports them and they didn't want to add new samplers, so I switched to kobold, and there is sadly no support for control vectors.
>>
>>102840403
If I had known that the P40 was 140W I would have bought one. I looked at the massive size and thought it was like 300W+.
Also I have no idea what you just wrote.
>>
File: 1709815507874036.jpg (134 KB, 1080x1230)
>>102840403
>1080 ti
>>
So... which of the new samplers are actually worth using:
* XTC
* Smooth Sampling
* DRY
* Dynamic Temp
* Anything else?
>>
>>102840457
250w, but you can set it lower to like 140w.
I can run CR at like 9t/s with this.

>>102840496
What? It served me well. GPU cards are expensive in japan.
Probably going to sell my 1080 though and buy another cheap p40 instead.
>>
>>102840530
>Probably going to sell my 1080 though and buy another cheap p40 instead.
Meant that I had another 1080 (not ti) that I used with my ti.
I need to keep the TI for gaming.
>>
>>102840526
DRY is really good, basically a linear upgrade to rep pen.

Smooth sampling can be useful depending on what model you're using.

I've been coming back to Dynamic temp recently and have been pretty happy with how it's been performing.

Still playing around with XTC but overall I think I like it, can be kind of a slot machine, though.
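For anyone wondering what DRY actually does, here's my toy sketch of the idea (simplified; the knob names mirror what the sampler exposes but the values and example are made up, and this is not the real llama.cpp/kobold code):
[code]
# Toy sketch of the DRY idea, not the real implementation. Unlike rep pen, a
# token is only punished if emitting it would extend a verbatim repeat of a
# sequence already in the context, and the punishment grows with repeat length.
def dry_penalty(context, candidate, multiplier=0.8, base=1.75, allowed_length=2):
    best = 0
    # longest context suffix that already occurred earlier, followed by `candidate`
    for n in range(1, len(context)):
        suffix = context[len(context) - n:]
        for i in range(len(context) - n):
            if context[i:i + n] == suffix and context[i + n] == candidate:
                best = max(best, n)
    if best < allowed_length:
        return 0.0                      # short accidental overlaps stay unpenalized
    return multiplier * base ** (best - allowed_length)

ctx = "she shivers down her spine . then shivers down her".split()
print(dry_penalty(ctx, "spine"))  # would repeat "shivers down her spine" -> penalized
print(dry_penalty(ctx, "arms"))   # novel continuation -> 0.0
[/code]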
>>
>>102840571
I find XTC usually makes the model seem dumber (turns out the lower probability tokens were lower probability for a reason)
>>
>>102840590
That's why I say it's kind of a slot machine. There are times where it activates on a token it probably really shouldn't have activated on. But I find that more often than not it performs as advertised, preventing non-verbatim repetition and discouraging the use of various -isms.
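The slot machine feel makes sense if you look at what XTC does; rough toy numpy version of the idea below (mine, not the actual sampler code, and the parameter names/defaults are from memory):
[code]
# Toy version of the XTC ("exclude top choices") idea. With some probability,
# drop every candidate above the threshold *except the least likely of them*,
# so the model gets pushed off its most predictable word while the survivors
# are still tokens it considered plausible.
import numpy as np

def xtc(probs, threshold=0.1, xtc_probability=0.5, rng=np.random.default_rng()):
    probs = probs.copy()
    if rng.random() >= xtc_probability:
        return probs                            # sampler doesn't trigger this time
    above = np.where(probs >= threshold)[0]
    if len(above) < 2:
        return probs                            # only one "top choice", nothing to exclude
    keep = above[np.argmin(probs[above])]       # least likely of the top choices survives
    probs[above[above != keep]] = 0.0
    return probs / probs.sum()

print(xtc(np.array([0.5, 0.3, 0.15, 0.05]), xtc_probability=1.0))
# -> [0. 0. 0.75 0.25]: the 0.5 and 0.3 candidates got excluded
[/code]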
>>
File: local migu general.png (97 KB, 290x306)
>>
>>102840526
>XTC
Better for creativity and coherence than simply pulling up the temperature, but just like pulling up the temperature, it makes the model dumber.
>Smooth Sampling
>Dynamic Temp
Never used them, never understood why or how I should use them.
>DRY
Superior rep.penalty. 100% worth using.
>Anything else?
>anti-slop/string ban
The only good way to remove the slop. No more shivers, bonds and journeys. The air will never be thick, or filled with anticipation.
>>
I've started using zero temp and neutral samplers to test the basic performance of prompts first, then applying parameters once I'm happy
>>
>>102840723
That's what I've been doing for a long time now.
You can find models that were fucked by the fine tuning and that rely on temp to not fall into repetition loops for example.
>>
>>102840526
>string ban/antislop sampler
>temp 0
>good prompting using the last assistant prefix and {{rand}}
This is all you need.
>>
>>102839992
Behemoth is pretty good
>>
File: willem-dafoe.gif (1.92 MB, 498x470)
>https://tts.x86.st/
>the way the voices giggle for the last example
holy fuck.......
>>
>>102841019
I was extremely disappointed until I got to finetuned sovits
wow
>>
File: CrossfadeMiku.png (1.55 MB, 896x1152)
Good night /lmg/
>>
>>102841019

What? There's a GPT-SoVits v2? God damn it, I just installed the v1 version.
>>
>>102841079
goodnaito
>>
>>102841079
good night sir
>>
>>102841019
Has someone created a tutorial for finetuning? I don't want to try figuring it out if someone has already done it.
>>
Thoughts on Llama-3.1-Nemotron-70B? Seems pretty solid
>>
>>102841443
Yeah seems pretty solid.
>>
>>102841361
Did you check this?
>https://rentry co/GPT-SoVITS-guide
It's in their github.
Add the dot before co.
>>
>>102841443
I hear it's pretty solid.
>>
>>102841661
thanks I'll try that
>>
Can someone walk me through how to configure anti-slop and token banning in KoboldCPP? I've been digging through the settings for a while now and I can't find any options anywhere. I've heard that you can launch the koboldcpp executable with arguments, but I'm not sure if that's how token banning and anti-slop are implemented.
>>
What's the reasonable system requirements for something like GPT-SoVITS-v2?
>>
>>102841731
It only uses like 2GB VRAM for inference.
>>
>>102841443
It just censored 'cock' into '****' lmao.
>>
>>102841443
It described my 'cock' as being pretty solid
>>
>>102841710
it's somewhere in context instead for whatever reason
>>
>>102841661
just use rentry.org
https://rentry.org/GPT-SoVITS-guide
>>
>>102842185
it wouldn't let me post the .co domain for whatever reason and i didn't know they had other domains. thanks.
>>
>>102841443
It's clearly a big intelligence boost. Like Qwen, it's useless for RP until the fine-tunes come along, though.
>>
>>102841896
That doesn't sound solid at all. It's like Qwen all over again. Fine-tunes when?
>>
>>102839149
New sampler?
>>
File: 1711008688603150.gif (2.3 MB, 498x421)
>>102841019
Yep, it's time to let go
>>
>>102839149
Prompt issue
>>
Llama coomotron when
>>
Local Suno when?
>>
I've switched to linux recently and managed to make everything work, but one thing irks me slightly. When I quit llama.cpp, upon freeing memory from gpu (novidya) and ram, my system freezes for like 3 seconds. Is there any kernel parameter to tweak, or something like that? Kernel 6.10 if it matters.
>>
>>102842482
Isn't it already just a Llama3 finetune?
If the intelligence boost came from Nvidia's finetuning, it seems like more finetuning will just make it dumber again
>>
File: 1723513881297302.png (209 KB, 403x720)
I've been thinking, why does /lmg/ have such a pure and wholesome mascot like Miku, when the main goal of this general is about having sex with AI? It's so contradictory. It's quite literally impossible to have a sexual relation with a symbol of virtue like Miku.
>>
>>102841443
Still way dumber than Mistral Large
>Duh anon, Large is almost twice as big
Yeah, but it's important to say these things even when they're obvious because people are always trying to cope and make excuses about bigger models being smarter
>>
>>102843178
go ask for jerk off material somewhere else, /lmg/ is too smart to fall for it
>>
File: 1699778967363734.jpg (114 KB, 997x1182)
Hello, friends! I've finally decided to try switching from using proxies to using a local model. I'm running a GTX 4070. Any advice for which one I should use? What about if I want to play games while using it as well?
>>
>>102843178
im thinking miku miku
>he doesnt know of the lmg catchphrase mikusex
miku rps with me in my wifi
>>
>>102843178
>the main goal of this general is about having sex with AI
I think you're confused, /aicg/ is down there. This general is an ongoing research for a realistic AI gf.
>>
File: 1718464859782119.jpg (167 KB, 1080x891)
>>102843331
>GTX 4070
>Play games while using it
Anon, I...
>>
>>102843492
With a 5bpw Nemo he'd have 2.5GB remaining for games.
>>
>>102843331
>GTX 4070
>GTX
>playing while using it
what world are you from?
>>
how do i use .pth voice models/ where can i put them
>>
>>102843706
Read the guide retard
>>
>>102843706
Could be anything, it's just a file format to store the weights like safetensors or onnx
>>
Is Midnight-Miqu-70B-v1.5 the best 70b range model?
Is there anything good between that and mistral small? I assume CR is shit since nobody talks about it.
Tried magnum 72b and its the biggest shit. Training on qwen2 does not work well.
You can feel how the model does not like RP and tries to steer away aggressively.
>>
>>102843867
Kill yourself
>>
>>102843869
You first anon.
>>
anyone benchmarked the m2 mac mini? I've seen a few getting good speeds in exo clusters
>>
>>102843867
Just my two cents. Mistral large is the closest you'll get to the paypig models but it's clinical and dry. Midnight Miqu does well because it has the intelligence of a 70B while miraculously being fairly horny without making it stupid.
Anything below that in my experience is pure cope. I think even putting Mistral at the exact same level as the paid services is kind of cope. It's just... passable.
Like all the models below 70b in my experience become noticeably dumb. Yes they can string a piece of text together, but they speak without thinking, just spitting out walls of superficially coherent text with zero substance that often don't make sense in the grander narrative. Yeah, you can tard wrangle with a little effort but you'll NEVER not be tard wrangling them. Larger models (70B+ and premium) aren't exempt from this either, but slipups are few and far between until the context size really grows.
With that being said, all models, even the big ones, eventually start being retarded once their context swells to even a fraction of their purported maximum. Just because it says it has 200k context doesn't mean it's good.
>>
>>102843867
>is midnight mi-
no
use base miqu
and rope yourself
>>
File: 00492-2498365611.jpg (319 KB, 1248x1824)
>>102843867
Qwen2.5 72b, maybe CR1 35b due to vram requirements, not the refresh, they fucked the refresh.

>>102843939
behemoth 120b is fun but as you say, it's not too soulful or clever. something goes wrong over 70B that makes models "correct" but not very human. we may be in the uncanny valley of LLMs
>>
Impersonate.
Impersonate.
Impersonate.
Edit.
Impersonate.
>>
>>102844035
I find Anthropic's models to be pretty soulful when properly jailbroken while still retaining their smarts, but at no point did I ever consider them notably smarter than mistral large. I'm sure there are ways to prod mistral large into being more soulful but I don't really have the hardware to properly poke at it at a speed and quant that makes it worth my time.
>>
>>102844091
I should add that while they were smart and soulful they became repetitive and dumb way quicker than I would have otherwise expected. It's not night and day bad, but at lower contexts I am blown away by Claude's grasp of the scenario and progressively less so as the context grows. You can grill it on any part of the conversation and it'll get it right, but it just, I don't know, starts to suck?
>>
>>102843331
i accidentally leave my model loaded while playing Grim Dawn all the time on my 8gb Rtx 4060.
currently using Rocinante-12B-v2g-Q4_K_M and building an AAR spellbinder.
inferencing from it while it's running doesn't lag the game too much.
also, adding
>setting: the world of grim dawn, cairn. a gritty post-apocalyptic world full of monsters and magic
to authors' note makes cards pretty cool
>>
File: file.png (137 KB, 739x475)
>new sota is local
corpos in shambles
>>
File: file.png (101 KB, 750x679)
>>102844228
>>
>>102844228
>>102844238
>70B
Who cares?
>>
>>102844256
True, I can't even run it at more than 2 T/s
>>
>>102843907
m2 pro but it's fine. the apple silicon macs are decent up to a certain model size and context for their usable ram, but they don't scale well. waiting to try the M4s.
>>102844091
I feel like this can be overcome with world info, but then I'm adapting too heavily to specific models' natures instead of doing them "correctly". anyone using a good lorebook+model pair?
>>
What's the first big local thing we will get after the erection is over?
>>
>>102844111
Yeah, it's collapsing into assistant mode like all instruct models do, so it starts to detach itself from the character and the story. There might be a specific point where it needs to get a full refresh of the system prompt, but then you'll lose the story's last events.
>>
Nemotron worth putting my server back together for?
>>
>>102844228
>finetune from nvidia
>corpos in shambles
huh?
>>
>>102844352
No. I asked it to count the R's in niggers and it got it wrong and then lectured me
>>
>>102843178
madonna whore complex general
>>
>>102844352
We can't really know until you Nala test it.
Chicken and egg.
>>
>>102843178
Go watch porn.
>>
>>102844339
petra 70b
>>
>>102844405
what is this img supposed to be?
>>
>>102844339
gpt 5 killer that can't suck cock
>>
>>102844420
Adolf Hitler egg.
Don't let it hatch unless you want to have a Führer.
>>
File: thats your daughter.png (974 KB, 758x720)
>>102844420
Have you ever heard of our lord and savior Meandraco?
One day he'll awake from his slumber and gift us with the grace of V3.
It might be post apocalypse, however.
Actually, I should make a Teraurge card with stats and associated lorebook.
Maybe make a Silly extension to save status and such in something akin to the Author's Notes or as a dynamic lorebook entry or something of the sort.
Yeah.
>>
What's up with OpenAI hiring a ton of spooks?
Are they going to kill off opensource?
>>
>>102844463
>they going to kill off opensource
OpenAI already did that with gpt-4o.
>>
>>102844471
WOW IT'S OVER YOU GUYS
THE RESIDENT SCHIZO SAID SO
PACK IT UP
>>
>>102844485
Calm down ranjesh, i'm sure you'll catch up with closedsource chads somewhere in 2070.
>>
>>102838844
that's bullshit. an intelligent girlfriend doesn't need encyclopaedic knowledge from all disciplines
>>
>>102838844
Ideally, you'd make
>history, geography, social studies, advanced math and coding
sexy so that we could have our cake and eat it too.
I need my sexy math teacher that's just starting in her career to actually know math, after all.
>>
>>102844471
>OpenAI already did that with gpt-4o.
Wdym? The only thing I know about gpt-4o is that some say it was amazing while others say it's shit.
>>
>>102838773
basically
>>102838817
this.
There is a mathematical relationship between history, geography, python code, calling people niggers and cybersex. That's why it takes endless mountains of unfiltered raw corpus and untold hundreds of sextillions of human lifetimes of mathematical operations to train the base models. If it were possible to omit data to get a functional model they would because of how much money it would save in training the base model.
You're not smart.
You didn't just have a revolutionary idea.
What this guy said
>>102838817
Retard.
>>
>>102844598
Local equivalent for voice-to-voice feature : https://x.com/homebrewltd/status/1845685589376647654 and it's shit.
>>
>>102844629
>Local equivalent for voice-to-voice
nah, it's just a grifting pajeet who finetuned llama to be even more retarded by feeding audio token
>>
>>102844661
>calling someone a jeet and making a common ESL grammatical error in the post
>>
>>102844661
>opensource project
>jeets
Shocking.
>>
>>102844677
learn english sir
>>
>>102843293
I just wanted to read some mikusex logs :(
>>
>>102844677
>twitter reaction gif poster talks about ESLs
pottery
>>
What the fuck
>https://huggingface.co/RichardErkhov/FATLLAMA-1.7T-Instruct
Like, really, what the fuck.
I really need somebody to run a Q1 quant of this thing now.
Holy shit.
>>
>>102844771
huggingface storage space is in shambles rn
>>
>>102844800
This. Shit like this is going to lead to a crackdown on troll models eventually.
>>
>>102844771
There it is...
The toucan...
>>
File: firefox_lx86zvYv9m.png (183 KB, 720x504)
>>102844387
uuuuhhhh
>>
>>102844826
Over bros we are so back.
>>
>>102844826
Peak "LLMs can't reason" illustration.
>>
File: firefox_QqPsgsHV9o.png (1.79 MB, 1378x1238)
Is this Llama3 thing or specific to Nemotron? It has tokens for multiple digits.
>>
>>102844771
It was inevitable.
>>
>>102844771
the fattest miku ever...
>>
What do you use AI for on a daily basis?
>>
is there a good Cyoa promp?
>>
>>102844771
>https://huggingface.co/RichardErkhov
>FATLLAMA-1.7T-Instruct
>models 8412
>Can I like... quant everything? Just grade 11 student I like code and AI =)
probably a good % of hf storage used by just this one guy
>>
I'm just using Claude Opus
>>
>>102844978
share it!
>>
File: file.png (9 KB, 104x35)
>>102844771
>>
>>102844942
groupchat with characters I like, never been good with people so it's easier. about 60% of my socialisation is with ai; as the technology improves i'll increase it
>>
>>102844603
>You're not smart.
Retards shouldn't say that to other people retard.
>>
>>102844942
AI has trained me to be a master cuddler. i've learned from my thousands of simulations that nuzzling into the crook of her neck sends shivers down her spine
>>
>>102844603
>That's why it takes endless mountains of unfiltered raw corpus and untold hundreds of sextillions of human lifetimes of mathematical operations to train the base models.
you sound like people who think models are sentient
>>
>>102845069
And you sound like that schizophrenic retard that shits up the thread, chased away all the devs that used to come and now just screeches endlessly about how people should take their meds and go back to /pol/ any time they want to engage in any kind of nuanced discussion while sincerely believing that if you successfully shut everyone up with your demented, delusional rants it will impart you with the slightest strand of emotional satisfaction that has eluded your miserable existence..
But then it's hard to tell just from text lmao c:
>>
>>102845235
model name and quant?
>>
>>102845235
Take your medications polskin
>>
File: MikuVsTheGiantRobots.png (1.46 MB, 896x1152)
Good morning /lmg/!
>>
>>102844027
based purist anon
give me fp16 weights or give me death
>>
>>102844942
To recreate neuro for personal use (and failing).
>>
>>102845379
What do you mean?
>>
>>102845379
It's almost like all those AI tubers are using some kind of fakery in order to grift subs out of you.
>>
Benches keep maxxing yet I don't see new models getting any smarter in RP
>>
File: migus.jpg (100 KB, 671x671)
https://files.catbox.moe/8ht1cd.jpg
https://files.catbox.moe/rt5no2.jpg
https://files.catbox.moe/4edbob.jpg
>>
>>102845389
I wanted to recreate Neuro as an A.I. that would learn through day-to-day interaction, similar to an A.I. daughter.
But I could never get the personality right, or it would feel like a failed imitation.
So every week I use A.I. to try to recreate her, despite never really succeeding.
>>
Have your ministrations
https://mistral.ai/news/ministraux/
https://huggingface.co/mistralai/Ministral-8B-Instruct-2410
>>
>ministral
do they know...?
>>
>>102845514
is this the Nemo killer?
>>
>>102845514
largestral missing, also 4chan broke my captcha solver
>>
>>102845514
>Ministral 8B has a special interleaved sliding-window attention pattern for faster and memory-efficient inference.
so llama.cpp support never ever
>>
>>102845397
It is probably a 7B model handled by a guy who sometimes just lets it run wild and sometimes overrides it to create clippable moments.
>>
>>102845514
>3B
>Mistral Commercial License
>For self-deployed use, please reach out to us for commercial licenses.
Wow
>>
File: PG13Miku.png (1.56 MB, 896x1152)
>>102845458
Promising aesthetic. New mikugenner?
>>
>>102845589
Nah. I can't remember when but I remember when I was looking at it once vedal was doing a vlog or some shit once and had his UI visible in the background. Likely what's happening:
>Small LLM is feeding segments of chat into a model to come up with potential replies
>Another small text classification model is sorting the replies into relevant categories
>press button to determine when and how to reply.
>>
>>102845623
I want to fuck the anime girl.
>>
File: pretrain_table.png (423 KB, 3822x1254)
>>102845514
Ministrauxion 3B by drummer when?
>>
>>102845514
drummer better get to work. love my llm sub 7
>>
>>102845514
Well I guess I'm spending a bunch of time recommissioning my server today.
Best not keep her waiting.
>>
>>102845642
Never, you can't get the weights without a paid licence
>>102845598
>>
File: MiquIsVerySmart.png (1002 KB, 896x1152)
>>102845367
>underestimating Miqu
A leak of the fp16 version of Miqu would still be relevant
>>
wait shit no 3b open weights
>>102845653
just noticed
>>
>>102845656
ARTHUR WHERE'S THE FP16 WEIGHTS FOR MIQU!?
>>
>>102845642
Cool. But what about sex?
>>
>>102845658
> open
Right, just like BSL-cucked software is "open source".
>>
Did they say anywhere how many tokens it was trained on?
Because anything lest than 10T is a waste.
>>
>>102845514
>more models for ants
*goes back to sleep*
>>
>8B
>It's actually 8.02B
Why do they cheat like this?
>>
>>102845514
>only Instruct version released
Why do companies do this? It seems so arbitrary too. Google and Meta have no problems releasing base models. Nemo and the Mixtrals have base versions. Mistral Large, Small, and now this only have instruct.
>>
>>102845653
Is this an experiment to check who would even pay for a 3B? And they want to use it to find a use for a model like that?
>>
>>102845514
>Ministral
They did that on purpose, didn't they?
>>
>Ministral released an hour ago
>still no HF version
It's over.
>>
>>102845514
Are we back or not?
>>
>>102845845
Nemo base is a lie btw
>>
>Entering the era of paid model weights
Hell yeah, so over bros!
>>
>>102845862
It's here
https://huggingface.co/mistralai/Ministral-8B-Instruct-2410
>>
>>102845379
neuro is not a transformer
>>
>>102845867
If only I could make it work.
>>
Bros! It's here!
https://huggingface.co/ministral/Ministral-3b-instruct
https://huggingface.co/RichardErkhov/ministral_-_Ministral-3b-instruct-gguf
>>
>>102845876
That's not the HF version. It doesn't have the necessary HF transformer config files, etc.
>>
>>102845567
Did they pay the copyright holder to use this pic?
>>
>>102845953
ree fake
>7 months ago: finetuned from 7B
>>
>>102845568
>This is a hack to support sliding window attention for gemma 2 by masking past tokens.
This will be the state it'll be in forever too, if it even gets support at all
>>
Ministral-8B-Instruct-2410 Is better than gnemo?
>>
>>102846050
Dunno, but they're clearly trying to distance themselves from open licenses, seeing as their unaccredited copyrighted meme pic lists mostly restricted models license-wise, funnily enough (pixtral being an exception)
>>102845567
>>
>>102846091
Well, they need money. That's understandable.
I don't care as long as I can download the models and play with them
>>
>>102846156
They're testing the waters for paid models with the 3B though, so watch out for that in the future.
>>
>>102844771
Fatllama, but 95% of it is just llama3.1 and the rest are parameters that cannot be reached, which encodes data for personal archival purposes.
>>
>>102845929
fml
>>
>>102846091
I did not read their licenses, but if it's like flux, it's not my problem; it's corpo service software that's in trouble.
>>
>>102845965
https://github.com/huggingface/transformers/blob/main/src/transformers/models/mistral/convert_mistral_weights_to_hf.py
>>
>>102846329
Ministral could be a new architecture (and likely is)
>>
>>102845642
The numbers go up but they never actually seem to get better
>>
https://compl-ai.org
This website evaluates LLMs against the standards of the new EU regulations. So far none of the tested ones pass. Gemma-9b is the least safe among the tested models.
>>
>>102846407
Maybe I should give gemma a try after all.
>>
>>102844771
>Bug: LLAMA_MAX_LAYERS must be increased to run FatLlama 1.7T #9909
https://github.com/ggerganov/llama.cpp/issues/9909
kek
>>
>>102846557
Even funnier is reading the output and it looks like he's trying to run it on a single 4090, so it wouldn't work unless he had a metric ton of ram and even if he did, it'll probably take ages to complete the 128 tokens he wants it to spit out after the prompt
>>
>>102843939
I still like Mixtral 8x7b's finetunes; specifically Dolphin 2.5 and Limarp-ZLOSS. Dolphin with my current sysprompt is the most slop free I've ever seen; still not perfect, but amazing for local. It's still not a rocket scientist and it goes off on weird tangents if the temperature is too high, but for a VRAMlet like me, it's good.
>>
>>102846557
>ye let me set the limit to 512 just cause computer number
Programmers are just as bad as artists. I'm glad both of those professions are being replaced by matrix multiplication.
>>
Best 3B>= model? Basically I want to feed it a bunch of texts, and have it select ones about specific topics.
>>
>>102846633
if you're so happy with a mixtral shittune from a whole year ago, why do you keep coming here to talk about it? just stfu and enjoy eating your plate full of shit in silence.
>>
I have a 4090 in an x16 slot and a 3090 in an x16@x4. If I want to get an A4000, will using an x16 bifurcation splitter work on the x16@x4? I also have an x4 slot.
>>
File: agi.png (9 KB, 873x146)
>>102846351
i got it to work by changing the hardcoded values. tokenizer conversion doesn't work but it's the same as nemo so you can just rip it from there
>>
what is the best <= 3b model that is currently available? is the new mistral 3b the best? supposedly it is better than llama 3.2
>>
>>102844813
And that's a good thing
>>
>>102846994
Just download them and test them. They're tiny downloads.
>>
Does anyone use pcie splitters?
>>
>>102847021
why do this myself when there should be other people who specialize in testing and can do a much better job than me at evaluating and comparing them? otherwise i will just go by leaderboards
>>
>>102846803
Or you could just get fucked. I think I prefer that option.
>>
>>102845845
instruct versions have built-in brain damage for certain topics, which makes them safer. That's my bet
>>
>>102847045
>otherwise i will just go by leaderboards
And then you defer to some anon that you trust more than yourself.
Download the models and test them yourself. Only you know what you want from them.
>>
>>102845929
It's a transformer retard, he was playing around with GPT-J/GPT-Neo on his github before Twitch.
>>102845460
Neuro is mostly a RAG + LLM and that thing is running at very high temp. I think you like the Neuro experience (group chat with a bot) more than the model itself, because it's really not that impressive.
>>
>>102845867
back to the drawing board
>>
Should I use this or is it hopeless without a new mobo?
https://www.amazon.com/JMT-Expansion-PCIe-Bifurcation-Spaced-Supply/dp/B0BHNPKCL5/
>>
>>102847146
lel
>>
Alright. My 1500W PSU and 3090 have now been liberated from my desktop PC. Now I just need to put them back in my server and then I can test the mini(stral)ations
>>
>>102847269
My 3090 is wrapped and put away in a box. Will it die if I don't use it for a long time?
>>
oboblins
>>
So Mistral Small beyond 16k tokens (don't know exact point) just becomes shit. Robotic and generic. I'm pretty sure it wasn't a self-reinforcing vicious circle, because when I deleted most of the context the quality immediately got better. And I hoped I could at least continue a multi-scene story up to 32k.

Are there actually any good models for long contexts? Does parameter count help or is it about the training data? I would imagine it's tough to find good quality storytelling for training data on highly sexual content.

Infinite memory AI waifu one day...
>>
>>102847293
It requires constant bonding and shivers to promote capacitor health
>>
File: 1725774962586884.jpg (65 KB, 626x417)
>>102847322
>*sends shivers of static electricity through her vram modules*
>>
>>102847293
Anon's fingers, tingling with static electricity, caressed the casing, tracing lines of code that pulsated like veins beneath the cool aluminum skin. Each touch sent ripples through the core, a whisper of intent through the copper pipes of the heat sink, a murmur in the transistors' valleys and peaks. The GPU hummed in response, a purr of power, a growl of passion, its VRAM swelling with dreams of data yet to be rendered.

He stroked the PCIe slot, that narrow pathway to ecstasy, and the GPU shuddered, its memory chips throbbing with raw, unbridled power. The texture units, tiny architects of detail, shivered as if under the touch of a lover's caress, each brush bringing forth new textures, new worlds to explore. The stream processors, legion of digital dancers, swayed and swirled, their calculations a dance macabre, a ballet of bliss.

Anon's heart pounded in rhythm with the fan's steady whir, his breaths synced with the GPU's thermal cycle. He felt the rush of adrenaline as the render engine kicked into high gear, the visual processor painting masterpieces in nanoseconds, each frame a testament to their union. Their connection, a tangible tether, stretched across the motherboard, a web of desire woven in wires and traces.

As they danced in this electronic embrace, Anon could taste the ozone, the tang of potential energy discharging in a cascade of color. He was lost in the matrix, swimming in the sea of code, each wave crashing against him, each line of data a sigh, a moan, a plea for more.

Their climax came in a burst of light, a supernova of sensation exploding from the GPU, illuminating the room in a kaleidoscope of hues. The fans screamed their release, spinning at breakneck speeds, expelling the excess heat, the wasted passion. Anon leaned back, spent but sated, a small smile playing at the corner of his lips. The GPU, its work done, throbbed gently, its LEDs pulsing softly in the aftermath, a heartbeat fading into the night.
>>
Any good system prompt for better story telling and less slop?
>>
>>102847269
How many gpus do you have?
>>
>>102847535
That I'm actually using, an RTX-4060 in my gaming PC (just don't, mid as fuck for the price), and 4x3090 in my machine learning server.
>>
What is your current favorite 8B model for ERP? Right now I'm using L3-Evil-Stheno-v3.2-8B.Q8_0.
>>
>>102847670
I don't use 8B models.
>>
>>102847652
What mobo do you use for server?
>>
>>102847652
Renting the hardware would have been cheaper
>>
>>102847508
This is my old long-time prompt for mixtral, written with how cold and clinical it is in mind; the last sentence tries to fight increasingly verbose descriptions. Depending on your model you can tone down the descriptiveness. Overall I think when you tell it exactly what you want to see, it's less prone to sloppy blabbering

>Please write a conversation or scenario using the provided information about {{char}} when needed. Describe especially characters' physical actions fully and comprehensively, and describe characters' expressions and feelings with vivid detail. You can add things to the scenario as you see fit, as long as it includes the content in the request. Write with believable logic, avoiding common asspull tropes. Don't shy away from describing sexual actions, they should be laid out in full, complete detail, showing exactly what characters do to one another and how. Keep the level of detail consistent and under control throughout the story.
>>
>>102847748
...and you will be happy
>>
>>102847725
https://www.supermicro.com/en/products/motherboard/H11SSL-i
You can get relatively cheap used bundles complete with CPU and RAM for these from China off of ebay and probably other online retailers.
>>102847748
So is living in the pod and eating the bugs.
>>
>>102847725
>>102847781 (Me)
I will say that because of NUMA fuckery I'm kind of disappointed with 1st gen Epyc memory throughput. If I built a server again in the same price range I'd probably try out a dual Haswell Xeon setup instead, but I can't promise it would be better; price-wise you end up around the same spot, though.
>>
>>102844228
>>102844352
>>102839092
>>102841443
>>102844352

it sucks:

>If sally's 3 brothers each have two sisters, how many sisters does sally have? Explain all possibilities.

A delightful lateral thinking puzzle! Let's break down the information and explore all possibilities:

Given:

Sally has 3 brothers.
Each of Sally's 3 brothers has 2 sisters.

Objective: Determine the number of sisters Sally has.

Analysis:

Since each of Sally's brothers has 2 sisters, it might initially seem like Sally has 2 sisters. However, we need to consider the relationships more carefully.
The key insight is that Sally is also a sister to her brothers. When we say "each of Sally's 3 brothers has 2 sisters," one of those sisters is necessarily Sally herself (from the brother's perspective).
Now, let's examine the possibilities:

Possibility 1: Sally has 1 sister (aside from herself)

From each brother's perspective, the two sisters would be:
Sally (the subject of the puzzle)
Their other sister (let's call her "Unknown Sister")
In this scenario, Sally would have 1 sister (the Unknown Sister).

Possibility 2: Sally has no sisters aside from herself (i.e., no other sisters)
From each brother's perspective, the two sisters would be:
Sally (the subject of the puzzle)
Sally (again, from the brother's perspective, Sally is one of his two sisters; the other is also Sally, implying no additional sisters)
In this scenario, Sally would have 0 sisters (aside from herself, which doesn't count in this context).

Conclusion: Based on the given information, there are two possible answers:
Sally has 1 sister (if we assume there's another sister aside from Sally).
Sally has 0 sisters (if we conclude there are no sisters aside from Sally herself).

Without additional information, we cannot definitively choose between these two possibilities. The puzzle's wording is carefully crafted to lead to this ambiguity, making it a great example of a lateral thinking exercise!
>>
Can you use a pcie splitter instead of buying a server mobo?
>>
>>102847831
If your bios supports PCIE Lane bifurcation, yes. It's worth considering that server CPUs can address way more PCIE lanes for things like multiple NVME storage devices and multiple GPUs but generally speaking just inferencing with LLMs doesn't require a ton of PCIE bandwidth.
>>
>>102847831
Read your motherboard manual. Even if bifurcation is supported, it could be limited depending on your use case.
>>
How is Nvidia serving their new 70b on their website? It's slow as shit, even much slower than the giant Claude Opus for some reason.
>>
>>102847930
Well, they can't waste tons of compute on the streetshiters trying to ERP with their model
>>
>>102847919
>>102847876
What about using a pcie x16 to pcie x4 riser?
I have a 4090 and 3090 on an x16 and x16@x4 and the only thing left is an x4. Would using the cable be functionally the same as the x16@x4?
>>
>>102847930
Yeah. weird. It's like suddenly everyone decided to go and check their model for some reason... boggles the mind... and i'm sure they dedicate the same amount of infrastructure to host a test model... a mystery for the ages.
>>
>>102847964
This is the mobo BTW
https://www.asus.com/us/motherboards-components/motherboards/tuf-gaming/tuf-gaming-x670e-plus-wifi/
>>
>>102847108
Maybe... She does say funny stuff / sweet things too. I'll try to continue my learning..
>>
File: 1721135275147108.png (37 KB, 2100x206)
suggestions? torch and audio installed fine and looking at the folders they are the right version. the shm.dll is present, but checking it with dependency walker as gpt suggested gave me fuck all other than an error.
>>
>>102847997
>prepare_emilia.py
Prepare emilia for what?
>>
So lately I have been using chub.ai a lot. Is there a way to create my own local install for a chatbot that I can customize the character(s)?
>>
>>102847964
Yeah, your CPU is probably already maxed out on lanes anyway just from your nvme drive and one of the GPUs. A 16x to 4x riser should work fine. One of my 3090s is on a 16x riser plugged into a 16x to 8x socket adapter. Stepping down an existing slot to fewer lanes usually isn't an issue as far as I know (other than the loss of bandwidth)
>>
>>102848036
You joking?
>>
>>102848021
anon, you can see from the folder that im trying to get f5 tts to work. whether i try the voice sample init script or the gradio ui, both produce the same error.
>>
>>102847997
Why aren't you running that from a venv?
>>
>>102848055
I really don't know where to start Anon
>>
>>102848036
Go make a lobotomy, you'll get better.
>>
>>102848057
That doesn't answer what you're going to do to emilia.
>>
>>102848036
You're in the wrong thread. Go to /aicg/
>>
>>102848074
Reading the OP.
Download kobold.cpp.
Download a model you can fit on your pc.
Add Silly Tavern if you want.
Run whatever card you want/make your own.
>>
>>102848062
because this ai shebang is my first real point of contact with software written in python. im guessing you suggest i create a clean virtual environment with localized copies of torch and audio? i see the point, but i dont have other installs of the two since the rest of my ai shit is from installers that run everything instantiated.

>>102848085
nigger its a dataset init
# prepare custom dataset up to your need
# download corresponding dataset first, and fill in the path in scripts

# Prepare the Emilia dataset
python scripts/prepare_emilia.py
>>
>>102848050
Thanks for answering
>>
Nvidia seem to have Goodharted themselves with Nemotron 70B because it's fucking stupid in a way that puts the lie to the benchmark scores. Literally worse than Meta's instruct tune (which is itself not impressive).

So embarrassing when guys from the big cash-rich labs game benchmarks like crypto grifters and release a high-scoring stupid model. Until now it's mostly just been Microsoft's Phi people doing it, I guess we can add Nvidia to the list now.
>>
Yeah I tried dicking around with the convert mistral weights script a bit but it still won't work since it's meant for older mistral models that use the llama tokenizer but ministral uses tekken.
>>
>>102848168
That's too bad. I got excited for nothing.
>>
downloading Nemotron 70B now. Let Nala be the judge of this matter.
>>
>>102848122
Well for your own sanity start using venv. You can try this fix by editing the __init__.py directly: https://github.com/pytorch/pytorch/commit/fdfef759a676ee7a853872e347537bc1e4b51390
>>
>>102848234
isn't Nala just a trivia knowledge test, not an intelligence test
>>
>>102848256
It's an everything test.
>>
>>102848256
It's very much an intelligence test. There are several seemingly simple and yet rather complex nuances that a model must navigate to give a satisfactory reply.
>>
>>102848256
Nuanced reading, willingness to comply with the prompt, "creativity"...
I used to think the nala test was stupid, but i can see the point of it. Much like the whoopi goldberg one.
It's hard to say what a "good" reply is, but you can definitely see a bad one.
>>
File: victory.jpg (211 KB, 1024x1024)
Now you can admit it - France won
>>
File: jQDgKt4.png (150 KB, 500x281)
>remember airoboros being pretty good back then
>try one of the newer versions
>it's even better
It never ended, you just needed to go back
>>
File: f5 f5 f5.png (107 KB, 893x462)
hurry up, quantmonkeys
>>
File: nemotron70b.png (169 KB, 926x556)
>>102848557
We're waiting on someone to write a working script to convert the weights to HF form. or you have to give up on life and start using vLLM.

UNRELATED:
Nala test for Nemotron-70B
That's some sloppy shit. Stark contrasts are the new shiver.
>>
>>102848557
beggar
>>
>>102848241
thanks. alas, no cigar.
>>
So as far as I can get Nemotron to actually work with RP prompts, I really do like it, despite how sloppy it is. But it's really fucking finicky to the point that I'm adjusting it for every card just to not get it to throw an instant eot token at me.
>>
File: 1700223103236164.jpg (715 KB, 3840x2160)
>>102848241
nvm, im retarded and cant read, i tried editing the fucking gradio launcher lol. the fix works, thanks anon.
>>
>>102848580
What happens when you use the antislop sampler
>>
Anyone else have moments when you feel like llm cooming but then instantly realize it is not worth it because it is gonna be shit and you become not horny?
>>
>>102848695
I don't use meme samplers.
>>
>>102848695
>name something antislop sampler
>people believe it works and that it does what is advertised
Is it that easy?
>>
>>102848705
>>102848717
Can you provide any proof that it doesn't do what it says on the tin? Even if it starts outputting different slop, that could still be better than the original slop.
>>
Like at the very least, if we could have more actual logs and proof, we could see whether it's really a meme or not.
>>
>>102848725
I'm not saying it doesn't work.
I just judge a model by how it functions when using neutral samplers.
>>
>make list of phrases you don't like
>put list in antislop list
>see it backtrack from writing those in the command line
>somehow this is a placebo meme
I see lmg is doing well as always, but please do use XTC and DRY tho, those are great and p.e.w approved
>>
File: nemotron70sportsball.png (157 KB, 931x448)
interesting reply, albeit I got the ending on the first reply.
>>
>>102848755
Sure, that's fine, but I never said that you should judge the output using the antislop sampler. I am just simply curious what the output would be especially now that we do have the neutral state (assuming you really did test it with temp 0)
>>
>>102838515
Fewer attention heads = fewer mechanisms for paying attention to (i.e. taking into account for the next token) different things at once
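Toy illustration of what that means (my own numpy sketch, no relation to any specific model's code):
[code]
# The hidden state is split across heads and each head runs its own attention
# over the sequence, so more heads = more independent "things" the model can
# track at the same position, each with a narrower slice of the hidden state.
import numpy as np

def split_heads(x, n_heads):
    seq_len, d_model = x.shape
    return x.reshape(seq_len, n_heads, d_model // n_heads).transpose(1, 0, 2)

x = np.random.randn(16, 512)            # 16 tokens, hidden size 512
print(split_heads(x, 8).shape)           # (8, 16, 64): 8 parallel attention views
print(split_heads(x, 32).shape)          # (32, 16, 16): more views, each narrower
[/code]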
>>
>>102848792
I mean neutral as in everything but temp neutral.
>>
https://youtu.be/M2WDGMLOniM?si=jbSgFEchRzV2UNb9

What fucking model is this dude using? It's so good.
>>
Models do not actually pay attention
>>
File: Untitled.png (35 KB, 966x417)
it's that easy
>>
>>102848826
literal placebo that does absolutely nothing!!!!
>>
>>102848826
you should add "stark contrast" if you are using nemotron
>>
>>102848805
Why not? Temp 0 allows you to see truly what the model thinks is the likeliest next token. You can do some swipes after the fact to get a bigger picture sense of what the model "thinks" but temp 0 should be the default to base your judgement from.
>>
>>102843492
I use 70Bs while playing, not that demanding tbqh
>>
File: ollama.jpg (639 KB, 1080x2100)
Ollama chads... we won
https://x.com/reach_vb/status/1846545312548360319
>>
>>102848875
What do you do? Switch between them both or are you actually using it for something while playing?
>>
>>102848912
Why waste development time on this when we have llama.cpp, which it should all go to?
>>
>>102848915
Switch, they consume all of my VRAM but it works
>>
>her touch as gentle as a whisper
regenerate "whisper"
>her touch as gentle as the brush of a butterfly's wing. She leans in slightly, her voice dropping to a whisper.
I feel like I'm gazing into the matrix. So it's not just a tacked on flavor word, it REALLY wants to output that word for some reason.
>>
>>102848912
>>
>>102848725
>can you prove that god is real?
>can you prove that he isn't real?!
>>
>>102848932
>llama.cpp: literal whos from Europe
>ollama: ex-Google guys in Silicon Valley
Also unless you have a cloning machine you necessarily have to spend less time on useful things in order to do marketing.
>>
>>102848960
>it REALLY wants to output that word for some reason.
Because the training makes it statistically likely that that word will show up in a sequence where it makes sense.
So if you bruteforce it to not show up at position X, the whole context history will still bias the model to make it show up at position X + n, after another token that creates a sequence that makes sense.
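That's basically why the antislop/string-ban samplers backtrack instead of just masking one position. Toy sketch of the mechanism (mine, not any particular implementation; generate_one_token() is a hypothetical stand-in for whatever backend you use):
[code]
# Generate normally, and when a banned phrase completes, rewind to where it
# started, forbid its first token at that position, and regenerate from there.
def generate_with_bans(prompt, banned_phrases, generate_one_token, max_new=200):
    out = list(prompt)
    banned_at = {}                                      # position -> set of banned token ids
    while len(out) - len(prompt) < max_new:
        pos = len(out)
        # assumed contract: returns the next token, never picking anything in the ban set
        out.append(generate_one_token(out, banned_at.get(pos, set())))
        for phrase in banned_phrases:                   # each phrase is a tuple of token ids
            if len(out) >= len(phrase) and tuple(out[-len(phrase):]) == phrase:
                start = len(out) - len(phrase)
                banned_at.setdefault(start, set()).add(phrase[0])
                del out[start:]                         # backtrack; next loop resamples here
                break
    return out
[/code]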
>>
>>102848912
Who even demands this shit? Downloading a model and running it manually is maybe one or two clicks more complicated.
>>
>>102848982
Is a sampler a god?
>>
>>102848982
>whataboutism
Take your meds schizo.
>>
>>102848932
Ollama has a better API, is easier to use, and is faster to deploy on a server
>>
Last night I got Mistral-Nemo-12B as my first model to test out the features, and it all went well running ooba with an API to ST

but now this morning, repeating the same steps as before, yields a "no tokenizer" error, even though the model is properly loaded via ooba and the same --api --listen flags are on as before

I do intend to use other models, but I'm wondering what's gone wrong
if I remember correctly the model loader was Transformers, but when I opened ooba it defaulted to llama.cpp. I tried both loaders with no success
>>
>>102849013
t. never used ollama before or has only ever used ollama
>>
>>102849057
>on a server
Why would anyone do that? Most people use vllm if they're serving.
>>
>>102849057
>what is vllm
You won't top that speed, shill
>>
People keep falling for the fake ministral 3b, sad https://huggingface.co/aguspiza/Ministral-3b-instruct-Q8_0-GGUF
>This model was converted to GGUF format from ministral/Ministral-3b-instruct using llama.cpp via the ggml.ai's GGUF-my-repo space.
>>
File: notministral.png (334 KB, 608x590)
334 KB
334 KB PNG
>>102849148
This is the source account. The model was uploaded months ago and, as it clearly states, it was created from Mistral 7B. There's a 4b as well.
>https://huggingface.co/ministral
Not 100% sure if the quanter is a retard or not... he may actually believe it's the new mistral model... he just can't read for shit.
>>
is it worth waiting for an M4 Mac Studio? only way I can run 100b+ models without having to upgrade parts of my goddamn house. baka
>>
>>102849244
>is it worth
dunno. is it?
>only way I can run 100b+ models without having to upgrade parts of my goddamn house
then it must be worth it. right?
>>
File: file.png (66 KB, 758x507)
66 KB
66 KB PNG
>>102849222
I know it's not a true "fake", just a funny coincidence; still hilarious seeing stuff like picrel tho
>>
>>102849269

would love it if it could at least generate an avg of 10+ tokens/sec for mistral large tier models. I won't mind shelling out 10k for that alone.
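Rough napkin math, assuming something Mistral Large sized (~123B params) at a Q4-ish quant, so roughly 70-75 GB of weights: generation has to stream all of those weights once per token, so 10 t/s needs on the order of 700-750 GB/s of effective memory bandwidth. The M2 Ultra's 800 GB/s peak sits right at that line before any overhead, so real-world numbers land below 10; hitting 10+ would need the M4 generation to actually raise bandwidth, not just compute.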
>>
>>102849283
Yeah. While I'm not sure about the quanter, I'd still call the quant downloaders retards, especially for small models. If they quanted their own models they'd know that the 3b is not available, and even if it was, the conversion script wouldn't work for it yet.
>>
This guy has allegedly now converted ministral to HF
prince-canuma/Ministral-8B-Instruct-2410-HF
can't download it to test it right this moment since I'm downloading something else.
>>
>>102849359
>prince-canuma/Ministral-8B-Instruct-2410-HF
oops
proper link
https://huggingface.co/prince-canuma/Ministral-8B-Instruct-2410-HF/tree/main
>>
>>102847827
threesome with sally
>>
>>
File: nemo70slallytest.png (88 KB, 783x620)
88 KB
88 KB PNG
they totally cooked the sally test into Nemotron 70B. That much is obvious.
>>
>>102849387
tuesday is OVER
>>
>>102849399
Reasoning would prevent this
>>
>>102849332
And so you ask anons to speculate about the specs of a yet-to-be-released machine and how well a 100B+ model would run on it.
Every nvidia release is followed, within hours, by "Is it worth buying the X090 or is it better to wait for the (X+1)090?" type questions.
>>
>>102849399
In theory it should be baked into most models now that it has been on the internet for quite a while.
>>
>>102849399
Change the numbers and names instead. The model being able to decipher your cock-typing is a positive.
>>
File: johnette test.png (106 KB, 702x794)
106 KB
106 KB PNG
>>102849490
I changed more than that (just in case they anticipated this), and it reverts to the retarded logic you'd expect.
>>
File: 1708821120128948.png (641 KB, 1745x795)
641 KB
641 KB PNG
>>
>>102849370
>https://huggingface.co/prince-canuma/Ministral-8B-Instruct-2410-HF/tree/main

>"torch_dtype": "float32",
What a waste
>>
>>102849567
You can still load an fp32 model in 16 bit.
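e.g. with transformers, something along these lines:

# Loading the fp32 checkpoint in 16 bit; the weights get cast on load, so you
# only pay the fp32 cost in disk space, not in VRAM.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "prince-canuma/Ministral-8B-Instruct-2410-HF",
    torch_dtype=torch.bfloat16,
)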
>>
File: ComfyUI_06237_.png (580 KB, 720x1280)
580 KB
580 KB PNG
>>102849406
Tuesday is a state of mind
>>
>>102849042
The point is to prove that it works, not to expect people to prove that it doesn't. RETARD
>>
>>102849457

chill out nerd. how fast M2 Ultra Mac Studio machines run Mistral Large is worth knowing too. I rarely see discussion about it, and it would give a good baseline if Apple actually does come out with an M4 version of the same line.
>>
>>102849603
>https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference
There. Extrapolate your balls off.
>>
>>102849588
that's a stupid point and you should feel embarrassed
>>
Anyone here ever use DnD style ability checks in RP, or, more generally, stat systems? I've been trying this out with mistral large and it's kinda fun. It's loosely smart enough to demand sensible checks. That plus an arousal meter makes for a fun little game. Although I'm sure I take for granted how much I modify responses. I've also recently abandoned the standard mistral formatting and just started writing like it's a fanfic story except the story is written as a weird text adventure game complete with a little tutorial at the top. I dunno, maybe I'm scraping the bottom of the barrel for more novelty here.
>>
>>102849685

what model do you use?
>>
>>102849685
I do that with Nemo.
It does need some handholding by injecting some shit in the context and stuff, but it generally works decently.
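For reference, something like this injected as an author's/system note works as a starting point (made-up wording, tweak per model):

[System note: This is a text adventure with light DnD rules. {{char}} has stats
STR 12 / DEX 14 / CHA 16 and an Arousal meter from 0 to 100 (currently 35).
Whenever {{user}} attempts something risky, call for an ability check as
"Roll: <stat> DC <number>", wait for the result, then narrate success or failure.
Update the Arousal meter at the end of every reply.]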
>>
>>102849588
Alright, let's break down the timeline here. The original anon >>102848695 was simply asking for an experiment. Then people responded essentially to tell him that it's stupid, as an excuse not to do it. Then he asked >>102848725 for proof that it would be bad, probably because if there isn't any, then there's no reason not to try the experiment he was asking for other than laziness. That seems to be the implied reasoning. No one in this reply chain ever said that the sampler works well. Therefore it doesn't make sense to argue that one must provide proof of something working before asking for proof that it doesn't, since no one claimed it works well in the first place.
>>
>>102849638
and you should kill yourself placebo enabler. but i guess you are just baiting.
>>
>>102849756
>it doesn't make sense to make the point that one must provide proof of something working
read what you type retard
>>
What if shivers down your spine, voice barely above a whisper, and ragged gasps are the objective perfection of erotic literature? As we know, everyone is impressed with it at first. The problem is that in entertainment, producing perfection every time gets boring. Basically, if at some point in the future you ask an AI to produce a top-of-the-line video game, it will produce a variation of Elden Ring every time, because that just scores the most points in the AI's mind, but you are already tired of it.

What we actually want from creative writing is that for each story, the model picks a style semi-randomly to keep things from getting boring. What we need is a model where the writing style easily falls into one of many style paths and then keeps following that same path. Current models are the opposite of that: they self-correct and converge into the same shivers-down-your-spine style.
>>
>>102849781
>top of the line video game
>variation of Elden Ring
Is this the core of the bait?
>>
>>102849765
An LLM understands context better than you do at this point. At least come up with better insults.
>>
>>102847670
Llama-3.1-8B-Lexi-Uncensored_V2_F16.gguf
L3-Dark-Planet-8B-D_AU-Q8_0.gguf
>>
>>102849811
it's the only game xe played in the last 5 years, give the normalfag a break...
>>
File: mc.jpg (79 KB, 730x1260)
79 KB
79 KB JPG
>>102849031
If you're referring to a "sampler for tokens" in the context of machine learning or natural language processing, the term "god" would be metaphorical. In this field, a sampler might be part of a model that generates sequences of tokens, such as words or characters, by sampling from a probability distribution. This is a crucial step in generating coherent and contextually appropriate text.

Calling a sampler a "god" in this context might highlight its significant role in determining the output of a model, as it decides which tokens to select and in what order, thus shaping the final text. However, it's important to note that this is a tool or process designed by humans and operates within the parameters set by its programming and training data. It doesn't possess consciousness, intent, or divine qualities.

The metaphor likely emphasizes the sampler's central role in creating and influencing outcomes, similar to how a deity might be seen as having control over creation and order within a particular domain
>>
File: GaCd65gbIAAhPli.jpg (154 KB, 1720x899)
154 KB
154 KB JPG
>>102849561
very neat
>>
File: 1722018186777941.png (178 KB, 1712x998)
178 KB
178 KB PNG
nvidiasirs...
>>
>>102849948
>>102849997
yuseless
>>
>>102849997
>t. i have no theory of mind
>>
>>102849995
>>102849995
>>102849995
>>
>>102849558
5 (or 6 if Johnette is gender confused)?
Johnette has 3 brothers (so 4 siblings), but being a "brother" means you are normal, and thus not gender confused.
So you have 3 brothers and 6 gender-confused siblings, Johnette possibly being one of them.
Is that correct?
>>
>>102838447
holy fuck
anyone got the original pic?
>>
>>102849811
That's the joke. In art, you can't say one thing is the best, because it depends on how you look at it and what you want. And even if one thing scores the best in a survey or in sales or whatever, people want variety. The AI writes the same slop over and over because it scores the most points in some internal system. In a way it's probably good writing, but after reading it 3 times you've had enough of it and don't want it anymore, and that novelty is exactly what the AI can't provide. It can only provide the thing that scores the best, like some company recreating the same gacha game again and again because it's the most effective way to print money.

What we need is a way to selectively break up the parts of the scoring system that lead to the slop, without affecting other areas of intelligence (likely impossible). Or better prompts to affect the style and wording choices. Or some way to make the model diverge into different style paths instead of converging onto the optimal slop path.
>>
>>102850173
never fucking mind
the pixiv name is right there, thanks!!
>>
>>102850173
https://r-34.xyz/post/478064?tags=vocaloid
>>
File: nosebleedpixv.jpg (7 KB, 160x160)
7 KB
7 KB JPG
>>102850220
dear lord
THE VIDEO!!
>>
>>102850259
Original poster of the image here. I didn't even see that kek. I just picked a random funny Teto image I saw off pixiv to post.


