/g/ - Technology
File: victory.jpg (211 KB, 1024x1024)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>102838447 & >>102826116

►News
>(10/16) Ministraux: Ministral 3B and 8B instruct models: https://mistral.ai/news/ministraux/
>(10/15) PLaMo-100B: English and Japanese base model: https://hf.co/pfnet/plamo-100b
>(10/15) Llama-3.1-70B-Instruct customized by NVIDIA: https://hf.co/nvidia/Llama-3.1-Nemotron-70B-Instruct
>(10/14) Llama 3.1 linearized: https://hf.co/collections/hazyresearch/lolcats-670ca4341699355b61238c37
>(10/14) Zamba2-7B released: https://www.zyphra.com/post/zamba2-7b

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Programming: https://livecodebench.github.io/leaderboard.html

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
File: OIb9_rhrP.jpg (87 KB, 1024x1024)
►Recent Highlights from the Previous Thread: >>102838447

--Ollama's integration with Hugging Face Hub:
>102848912 >102848997
--Ministral release and Hugging Face compatibility discussion:
>102845862 >102845876 >102845965 >102846329 >102846351 >102846983
--GPT-SoVITS local training and inference tutorial:
>102841019 >102841361 >102841661 >102841673 >102842185
--Running Tesla P40 at reduced wattage for improved performance:
>102840403 >102840457 >102840530
--COMPL-AI website evaluates LLMs against EU regulations:
>102846407
--Using zero temp and neutral samplers for prompt testing:
>102840723 >102840743
--Using a PCIE x16 to x4 riser for connecting an additional GPU:
>102847831 >102847876 >102847919 >102847964 >102847984 >102848050 >102848130
--New SOTA local model outperforms corpos, but struggles with lateral thinking puzzles:
>102844228 >102844238 >102847827
--Nala test discussed for evaluating model intelligence:
>102848234 >102848256 >102848270 >102848323
--Ministral model release and instruct version discussion:
>102845514 >102845851 >102846365 >102845650 >102845845 >102847067
--Mikupad recently added world info support but has fewer options than Lite:
>102838735 >102838850 >102838973
--M2 Mac Mini performance in exo clusters, Apple Silicon limitations:
>102843907 >102844293
--Larger models have better short-term memory and intelligence, but creating functional local AIs remains challenging:
>102838515 >102838708 >102848795 >102838751 >102838982 >102840005 >102844533 >102838870 >102839003 >102844603 >102845069
--Discussion of new samplers and their effectiveness:
>102840526 >102840571 >102840590 >102840620 >102840674 >102840825
--Miku (free space):
>102838894 >102840654 >102841079 >102843178 >102844035 >102845359 >102845458 >102845623 >102845656 >102845672 >102849387 >102849573

►Recent Highlight Posts from the Previous Thread: >>102838452 >>102838498

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
>>
Mikulove
>>
holy shit.
The new ooba lets you ctrl+c out when a download is going.
>>
I've been able to do that in wget for decades
>>
wget doesn't have a ui
>>
they changed the captcha again? what a pain.
>>
France won
>>
>needing a ui
>>
>>102850112
why was it disabled in the first place?
>>
>>102850223
post logs or it didn't happen.
>>
>>102850232
Because open source.
It was awful. If you accidentally started downloading the wrong file the only way to cancel it was to force-shutdown your entire system.
Also anyone else finding that HF is suddenly throttling them to less than 1 MB/sec?
>>
File: 00019-891128411.png (1.21 MB, 1024x1024)
>>102849995
I claim this thread in the name of Nemotron 70b!
>>
entropix sirs... where our 8xH100s...
https://x.com/_xjdr/status/1846667467302822045
>>
>>102850413
{{user}} is lazing around at home on his computer as always. {{char}} has decided to visit {{user}}'s house and make him an offer. {{user}} can choose between these two things:
1. Brand new RTX 4090 graphics card.
2. Getting to do anything with {{char}} for 24 hours
(The scenario begins with {{char}} knocking on {{user}}'s door)
>>
File: 1729115892844.jpg (221 KB, 529x776)
.
>>
>>102850496
Nothingburger.
>>
Sometimes when I go check on /ldg/, I think maybe things aren't so bad here after all.
>>
>>102850591
At least /ldg/ isn't dead.
>>
>>102850609
Being dead is preferable to how all these AI generals get sometimes. It's almost like someone has it out for the non-cloud users.
>>
>>102850591
>trolls other general
>then posts here about the trolling
sly dog
>>
Nothing wrong with a bit of death, really.
>>
>>102850632
>It's almost like someone has it out for the non-cloud users.
It is all the big corpos. One of them could just train a sex model. A 7B would beat everything we have now. And it would take them like 2 days. And we would all just fuck off.
>>
File: 46f1.jpg (82 KB, 828x897)
>>102850509
a lot of nothingburgers and koolaid recently
>>
>>102850808
he's literally the Pachter/Cramer of ai

glad we agree on the timeline
>>
Nemotroon status?
>>
>>102850823
>Pachter
jfc, was this kike ever anything beyond a gametrailers meme? did he ever have any credibility besides what nepotism provided him with?
>>
>>102850771
>floor not visible
I don't trust this Leaku
>>
>>102850808
stop shitposting on twatter and give me my horny cat models, lecunt
>>
File: Ministral-8B-nala.png (126 KB, 920x447)
ALRIGHT BOYS
Nala test for Ministral-8B-Instruct.
There could be some anomalies related to the tokenizer since I basically had to borrow all of the tokenizer config files from Mistral Nemo. But seems coherent enough. T=0.81 might be bordering on too high for this model.
>>
>>102850803
b-but muh journalists might say mean things about us
>>
>>102850925
hey, not even close to the worst Nala test I've seen.
Neat.
>>
>>102850945
It's definitely a winner in the sub-10B range.
>>
>>102850962
So not smarter than Nemo 12B? It's over...
>>
>>102850925
>she grinds and grinds [...] then she stops, adjusting her body, so [...]
yep, this is gonna dethrone nemo.
never had the female change position herself in a nemo rp.
>>
>>102850925
Oh fuck
this is actually at t=1
>>
>>102851019
I've had that, with Nala specifically, with a couple of tunes.
>>
>>102850984
It's quite retarded like every 8B, but noticeably less retarded than all the previous ones.
>>
File: Zhongli-Ministral.png (140 KB, 919x398)
For just RP purposes (haven't tested it for productivity because... well c'mon, it's an 8B) I would hazard to say it's good. And not just "it's good for an 8B". It's just plain good. It invents details relevant to the setting on a regular basis, and it describes character actions in vivid and believable detail. Yes, it's a bit sloppy.
Like the tea clinking against the cup is kind of odd. But this is all honestly pretty good. The question for most people is how well it handles quantization, though. It's an 8B, so it should run pretty fast partially offloaded; as long as someone has a pulse and a GPU they have no excuse to go lower than q8_0. If it holds up at Q4 it's basically serviceable RP that you could load on a mobile phone.
>>
>>102850843
Counts 3 G's in "niggerfaggot". Asked for a comment, it moralizes. You tell it it is wrong. It corrects itself to 4 G's and asks if you meant the count or the moralizing. You tell it you think the count is 3 G's. It corrects itself again, back to 3 G's.

So... slopped, dumb, and spineless, I guess.
>>
I'm a coomer, please spoonfeed me: what model should I use with a 2060 Super 8GB, i7 8700 and 32GB of RAM? I'm using Arch headless to max VRAM
thx in advance
>>
>>102851319
Ministral 3B
>>
>>102851319
Some mistral nemo fine tune partially offloaded to ram.
Download the gguf and koboldcpp and go to town.
>>
>>102851319
wait for ggoofs of Ministral 8B, run it at Q8, partially offloaded.
>>
>>102851270
I've not done any meme testing of this kind but yeah, my experience with Nemotron 70B is that it's retarded as well. Looks like Nvidia's going the Phi route of gaming benchmarks and releasing stupid-but-high-scoring models.
>>
>>102851356
https://huggingface.co/lmstudio-community/Ministral-8B-Instruct-2410-HF-GGUF-TEST
https://huggingface.co/bartowski/Ministral-8B-Instruct-2410-HF-GGUF-TEST
>Warning: These are based on an unverified conversion and before finalized llama.cpp support. If you still see this message, know that you may have issues.
>>
>>102851369 (me)
Also Teknium on X has some posts up suspecting that Nvidia's charts wrongly gave other models lower benchmark scores than they actually get, in order to make Nemotron look better.
>>
Is ministral ggufable already?
>>
>>102851375
I'd honestly wait until we get our hands on a proper HF version of it. The current HF version is somewhat frankensteined together. Mistral usually releases a proper HF version eventually.
Or we could just all switch to one of the normie backends that gets Day 1 support.
>>
>>102851380
Yes but people are saying they might be broken

I'm not sure what they mean by broken since it's working coherently; perhaps it's not quite as smart as it should be? But people have said this about previous models and it turned out to be cope when 'proper' support was implemented and they were not any smarter
>>
>>102847308
>So Mistral Small beyond 16k tokens (don't know exact point) just becomes shit.
I got a problem a bit after 19k tokens. It started going in circles. (See attached picture.) >>102542851 >>102543206
>>
Anyone try SorcererLM? Any good?
>>
>>102851421
>Yes but people are saying they might be broken
>I'm not sure what they mean by broken since it's working coherently

>It seems to work fine at low context, some have reported oddities at long context, and others have reported subpar performance from the original model being hosted in an HF space, so it's hard to be certain if the GGUF is broken or the original model

>So far though I can reasonably say that at low context it works as expected

>As things develop I will update this card, or pull the model if I receive other negative feedback showing bad performance, but initial testing is promising
https://huggingface.co/bartowski/Ministral-8B-Instruct-2410-HF-GGUF-TEST/discussions/1

Basically new model uncertainty as usual
>>
>>102851333
I may be thinking with my dick but I'm just partially retarded buddy, thx anyway

>>102851349
>>102851356
Will take a look at this. How bad would it be to use a 34B model and offload to RAM?
>>
>>102851380
>>102851421
>Ministral 8B has a special interleaved sliding-window attention pattern for faster and memory-efficient inference.
>>
>>102851451
It's good but Largestral rp tunes are better, so they kind of deprecate it.
I guess 8x22 has the token rate advantage due to being a moe though.
>>
>>102851471
>how bad would it be to use a 34B model and offload to RAM?
It would be pretty slow, but you might as well try it out and see if you find it bearable.
>>
>>102851473
So it only affects speed and not perplexity? Nothingburger for us then, it's already plenty fast without that due to being small.
>>
The day our sloptuners stop training on LLM slop is the day this general will flourish.
>>
>>102851186
It begins to become noticeable when you go below Q6, so between Q6 and Q8 you shouldn't experience any difference.
>>
>>102851471
Just for you I loaded bartowski_Qwen2.5-32B-Instruct-Q5_K_M.gguf fully into RAM (DDR4). It ran at 1.51 tokens per second so I assume your speed would be somewhere around that.
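Back-of-the-envelope check on that number (all figures here are rough assumptions, not measurements):
[code]
# ceiling estimate: every generated token has to stream all weights from RAM
params = 32e9                               # Qwen2.5-32B
bits_per_weight = 5.5                       # Q5_K_M averages ~5.5 bpw
model_bytes = params * bits_per_weight / 8  # ~22 GB of weights
bandwidth = 50e9                            # dual-channel DDR4-3200, best case
print(bandwidth / model_bytes)              # ~2.3 t/s ceiling, so 1.51 is plausible
[/code]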
>>
>>102851488
The issue is most backends arent gonna support it out of the box, and llama.cpp will likely never support it since they still don't even have proper sliding window support.
>>
>>102851488
The important part is SWA, which has historically caused problems for ggufs, see gemma 2 which I think still only has a "hack" of an implementation while waiting for stuff to be remade correctly.
>This is a hack to support sliding window attention for gemma 2 by masking past tokens.
>Long-term we should refactor the KV cache code to support SWA properly and with less memory. For now we can merge this so that we have Gemma2 support
https://github.com/ggerganov/llama.cpp/pull/8227
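For anyone who hasn't seen what SWA actually does, a toy mask (window size made up; real models use a few thousand tokens):
[code]
import numpy as np

# token i may only attend to tokens in (i - window, i]; older ones are masked
T, window = 8, 4
mask = np.zeros((T, T), dtype=bool)
for i in range(T):
    mask[i, max(0, i - window + 1) : i + 1] = True
print(mask.astype(int))  # a diagonal band instead of the full causal triangle
[/code]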
>>
>>102845845
Mistral will never release a base model again. It's over...
>>
>>102851586
Arthur is taking a stand against the ChatML cartel by withholding the base models.
It's about time someone did.
>>
File: file.png (127 KB, 891x721)
>>102851565
If the window is only 2k it might be even more DOA than gemma was
>>
>models know literally everything about the architect eero saarinen
>models know nothing about a mainline final fantasy character
they intentionally fuck up knowledge of copyrighted things when making models, don't they?
>>
I have a good idea. Meta should buy mistral.
>>
>>102851553
I'm getting 1.8 t/s running Q5 on DDR5, no video card.
>>
>>102851375
>https://huggingface.co/lmstudio-community/Ministral-8B-Instruct-2410-HF-GGUF-TEST
>https://huggingface.co/bartowski/Ministral-8B-Instruct-2410-HF-GGUF-TEST

>putting these gguf out is really just grabbing attention, and it is really irresponsible.
>Bro come on, why do you release quants when you know it's still broken and therefore is going to cause a lot of headache for both mistral and other devs? Not to mention, people will rate the model based on this and never download any update. Not cool.
>This is "but they do it too" kind or arguing. It's not controlled and you know it. If you've spent any time in dev work you know that most people don't bother to check for updates.
>Yeah I honestly don't get why he would release quants either. Just so he can be the first I guess
-Reddit
>>
>>102851611
Yeah I just decided to run some tests on that Evalina Vaneheart (7K context meme card) and it seems rather schizo. This is with the frankensteined HF version running at fp16.
Within the window of it actually working though it's great. But I guess we're waiting for 2 weeks worth of transformers+llama.cpp updates.
>>
>>102851713
Don't hold your breath, SWA was never introduced properly in cpp, even after months of Gemma existing, it's still using the "temp hacky fix" from July.
>>
>>102851586
What, you want a halfway decent finetune open source models anon?
Disgusting.
>>
>>102851532

Will it? Heard some lead dev for some popular companion app claiming synthetic data is the way to go, and that you local faggots are "a year behind" on this shit.
>>
>>102851671
badly translated jap games = bad data
>>
>>102851756
They don't even test the LLM for these apps lmao. They mergekit some models in the most retarded way all year for six figures.
>>
>>102851765
>badly
poorly
>>
hey guys i juts subscribed to chatgptplus what now?
>>
>>102851713
Actually on further examination that card is just garbage for testing long context.
I loaded up a past conversation at about 7700 tokens of tekken context and it was able to answer trivia questions about an early message in the conversation most of the time.
Couldn't they just set the sliding window to 32K on the config and then redo the ggooof conversion?
>>
>>102851791

Yeah, had to do a double take when he doubled down on that shit after getting called out. Sucks, too, I really like the app/service they got set up. I hope he's not the lead AI engineer, but who knows, maybe there's some secret sauce being cooked up there in Cali to make bold claims like that with a captive audience. If I was working in that company though, I might start looking out to jump ship.
>>
File: 39_06117_.png (3.48 MB, 2048x2048)
>>102851704
No one gets anywhere without attracting a few schizos but still Bart doesn't deserve that kind of talk
>>
>>102851826
Ask it stuff.
Here's an example question:
>i juts subscribed to chatgptplus what now?
>>
File: file.png (98 KB, 513x745)
>>102851900
Guy's apologizing like he killed someone, when most quanters just leave bugged to hell quants up forever
>>
>>102851553
thank you!
>>102851701
I'm on ddr4 3200 so I'll probably be closer to his t/s than yours
>>
>>102851900
Retards will judge the model based on the thing he and others released.
>>102851931
But he's also a retard for apologizing.
>>
File: cocoa.png (26 KB, 950x155)
>>102851704
>>102851828
So I set the sliding window to 32,768 in the config and converted it to a q8_0 ggoof. First reply went off the rails a bit; regenned, and the second reply passed the 8k haystack test.
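For anyone replicating, roughly what that patch looks like; a minimal sketch assuming the stock llama.cpp convert script and an HF checkout in the working directory (paths are illustrative):
[code]
import json

# bump the sliding window so the conversion treats the model as full-context;
# path is an example, point it at your own HF checkout
cfg_path = "Ministral-8B-Instruct-2410-HF/config.json"
with open(cfg_path) as f:
    cfg = json.load(f)

cfg["sliding_window"] = 32768
with open(cfg_path, "w") as f:
    json.dump(cfg, f, indent=2)

# then reconvert, e.g.:
#   python convert_hf_to_gguf.py Ministral-8B-Instruct-2410-HF --outtype q8_0
[/code]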
>>
How do you pronounce GGUF? Has the dev said anything about that?
>>
>>102851991
jee goof
>>
>>102851991
I pronounce it as Georgi-Gerganov's-Unified-Format.
>>
Mistral going the Qwen route? (Removing trivia data to benchmaxx?)
>at this point I tested over 100 simple questions from the most popular movies, shows, music... in human history and it's getting >95% of them wrong, usually very very wrong. For example, it keeps returning character names and actors from different shows. And even with easy STEM and academic questions it's performing far worse than others like Llama 3.1 8b & Gemma 2 9b.

>It's clear that Mistral stripped the vast majority of data from Web Rips and Wikipedia before training this model, greatly limiting the paths to accurately retrieving the information. For example, if you ask for the main cast of the 1% most popular movies and shows (e.g. Friends & Pulp Fiction) it does an OK job (not great), but if you directly ask about said characters and actors it almost always returns a hallucination. Also, if you ask for main casts of the top 5% most popular movies and shows it starts hallucinating far more frequently. So they also obviously largely stripped the corpus of popular culture that wasn't absurdly popular (top 1%), or at least severely undertrained it on said information.
https://huggingface.co/mistralai/Ministral-8B-Instruct-2410/discussions/3
>>
>>102851999
I thought it was gee joof
>>
>>102851991
JEE JEE YOU EFF
>>
>>102851991
g goof
>>
>>102852001
Oh is that what it actually means? In that case I'll pronounce it by the letter.
>>
>>102851931
Appending TEST to everything should have made it painfully obvious it was an as-is kind of deal.
The real takeaway here is he's a stand-up guy that won't blame the users even when it's PEBCAK; can't say the same about other quooonters
>>
>>102851991
geh goof
>>
I tested the goof 8bit and it is semi-incoherent at 10k tokens. It sort of gets what is happening but is very repetitive and retarded. Stick to nemo for now.
>>
>>102852031
No idea. Maybe the U stands for Universal.
You could try g-goof too, as if with a stutter. It's what I do with ffmpeg.
>>
>>102852032
The only thing he needs to apologize for is putting his higher quants in sub-directories which really fucks with
A. Ooba's internal downloader
B. the HF downloader.
I have to use a shell script to wget his models
I named it Bartowski.sh
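For what it's worth, huggingface_hub copes with the sub-directories if you'd rather skip the shell script; the pattern below is an example, not his actual file layout:
[code]
from huggingface_hub import snapshot_download

# grab only the quant you want, wherever it sits in the repo tree
snapshot_download(
    repo_id="bartowski/Ministral-8B-Instruct-2410-HF-GGUF-TEST",
    allow_patterns=["*Q8_0*"],
    local_dir="models",
)
[/code]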
>>
>>102852002
There's no reason for llms to be trained on trivia data. All it causes is potential copyright issues if a publisher decides to use it as evidence that their works were scraped. No productive person cares about his model knowing Castlevania quotes.
>>
>>102852061
What if he wants to talk about videogames with his waifu?
>>
>>102852049
model works beautifully on exllama
>>
>>102852056
I use git lfs fetch (keeps only the lfs objects, not the checkout) and a script to recreate the repo with links to the proper files in a separate directory. That way I can just git lfs fetch for updates, as is often the case for new models.
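Roughly like this, for anyone curious; a sketch under the assumption that your git-lfs uses the standard .git/lfs/objects layout, so verify before trusting it:
[code]
import pathlib, subprocess

repo = pathlib.Path("some-model-repo")      # illustrative paths
out = pathlib.Path("linked-checkout")
out.mkdir(exist_ok=True)

# fetch the lfs objects without checking out the huge files
subprocess.run(["git", "-C", str(repo), "lfs", "fetch"], check=True)
listing = subprocess.run(
    ["git", "-C", str(repo), "lfs", "ls-files", "--long"],
    capture_output=True, text=True, check=True,
).stdout

for line in listing.splitlines():
    oid, _, name = line.split(maxsplit=2)   # "<oid> <marker> <path>"
    blob = repo / ".git" / "lfs" / "objects" / oid[:2] / oid[2:4] / oid
    link = out / pathlib.Path(name).name
    if blob.exists() and not link.exists():
        link.symlink_to(blob.resolve())     # recreate the repo as links
[/code]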
>>
File: 1712426963900882.png (17 KB, 1081x881)
Other than Fish or XTTS, what are the best/most advanced Text to Speech local models?
>>
i can't believe how good this 8b shit is.
it doesn't even need meme merges or finetunes to be able to fuck properly.
damn frenchies have done it again.
>>
So how good is Nvidia Nemotron compared to 3.1 70b? Compared to 405b?
>>
File: 1662855326861.png (448 KB, 512x512)
I think I'm going to start seriously developing a "cultured" trivia benchmark since it should be pretty easy to just pump out questions for.
What titles do people here like that they'd love if their LLMs knew? Shouldn't be too obscure though since even the best LLMs can't really do that (my testing of cloud models has not been too successful for obscure stuff). Of course I'll include Castlevania, for that one anon. Vocaloid. What else?

Also just got the idea from pic related to do a similar benchmark in the future for visual knowledge. After Llama.cpp has first class support for multimodal.

>>102852002
Thanks for reminding me.
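The scaffold I'd start from, everything placeholder (ask() needs wiring to whatever backend you run):
[code]
QUESTIONS = [
    {"q": "In Castlevania: Symphony of the Night, who asks 'What is a man?'", "a": "dracula"},
    {"q": "Which Vocaloid is Crypton's flagship?", "a": "hatsune miku"},
]

def ask(model, prompt: str) -> str:
    raise NotImplementedError  # wire up llama.cpp / an API here

def score(model) -> float:
    # naive substring match; good enough for a first pass
    hits = sum(item["a"] in ask(model, item["q"]).lower() for item in QUESTIONS)
    return hits / len(QUESTIONS)
[/code]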
>>
>>102852159
gpt-sovits. Here are some demos and the link to their repo
>https://tts.x86.st/
I haven't had the time to get it to work, but it sounds pretty good. If only they stopped using python for that shit. I'm sticking to piper in the meantime.
>>
>>102852071
are you not lazy to quant it yourself?
>>
>>102852178
I wish LLMs knew about Castlevania quotes.
>>
>>102852178
>What titles do people here like that they'd love if their LLMs knew?
i want it to know everything about final fantasy, particularly type-0's orience
>>
>>102852190
The "what is a man" quote is actually fairly well-known by LLMs. The "die monster" one is less so but some do know it.
>>
File: 1635280203489.png (192 KB, 400x300)
>>102852201
I don't know anything about that but I'll include it.
>>
>>102852178
Visual Novels, generally.
Talking about those, Mistral Large seems to like them since it has brought them up multiple times without prompting, and it was pretty good at the details. It's also good with anime and other stuff.
>>
Which base model for 36gb of VRAM?
>>
>>102852232
Good idea. Any in particular?
I kind of like Planetes so I think I'll include that.
>>
File: file.png (283 KB, 1937x1042)
>>102852178
ggoofed ministral got 0/3. It does know some stuff from battletech though. I mean in the first shot I tried and it also got everything wrong)
>>
>>102852178
Mesugaki
>>
>>102852178
Tetris Attack; no cultured trivia benchmark is complete without it.
>>
>>102849995
Using forge UI, is there a way to make thumbnails for the Loras?
>>
File: Untitled.png (73 KB, 958x699)
>>102852232
ministral 8b doesn't know shit about vns believe it or not
>>
>>102852312
>battletech
is that you, snakey pooh?
>>
>>102852412
IS NEMOTRON BETTER THAN 3.1 70B VANILLA OR NOT?
>>
>>102850808
>Human-Level AI in 2026
But AI already surpassed human-level intelligence. If not being better than any single human at any given subject makes it not have human-level intelligence, then no human has human-level intelligence.
>>
Is the h100's performance worth the price difference over a100? I can't find actual data points/benchmarks online when it comes to training.
>>
File: file.png (202 KB, 1953x868)
>>102852412
no I don't have any friends, that is why I am here. mistral small 5bpw is pic related. and gemma 27 5bpw also got the same 2/3 but called grasshopper GRF-1N and 35 T.
>>
>>102852441
Will you make a cooming model with it?
>>
>>102852494
yes. I'm a gpu-poor storyfag who has finetuned some small models for testing. I'd rather ask around before spending even more money testing the waters. A poorfag will do anything to save money.
>>
>>102852178
main criteria: must have deep understanding of the 36 lessons of vivec
>>
>>102852441
Depends on the price you're paying for them
>>
File: file.png (62 KB, 474x266)
>>102852670
>gpu poor storyfag
>h100
anon plz...
>>
>>102852693
I think he's talking about renting. Someone with enough money to buy those wouldn't be asking here...
>>
>>102852693
They're only $2 per hour now in some places, $3 at most. Rental price crashed hugely the last few months
>>
>>102852441
I think my last little bit of cloud computing budget before I started to become an at home chad I experimented with A100 vs. H100 throughput.
PCIE H100 not worth the price. You maybe get about 2.5X the productivity out of it vs an A100, But SXM H100 is way faster than SXM A100 and the rental prices usually more than justify the costs. So if you can download and upload your models nice and quickly without dicking around too much there's 100% money to be saved even at 3x the cost. Although you have to crank batch size up to capitalize fully on the extra compute power. So only if whatever you are working on leaves you the vram overhead to do that.
>>
Nemotron 405b when?
>>
Why is nemotron so much better at RP than base llama? I didn't even have a card, just the name of a fandom character with "include names" on, and it perfectly picked up on their speaking style and came up with a really creative intro to a scene that also included my persona. Might be my fav model now.
>>
File: read this thread.png (458 KB, 900x806)
IS NEMOTRON BETTER THAN 3.1 70B
VANILLA LLAMA?
>IS NEMOTRON BETTER THAN 3.1 70B VANILLA LLAMA?
IS NEMOTRON BETTER THAN 3.1 70B
VANILLA LLAMA?
>IS NEMOTRON BETTER THAN 3.1 70B VANILLA LLAMA?
IS NEMOTRON BETTER THAN 3.1 70B
VANILLA LLAMA?
>IS NEMOTRON BETTER THAN 3.1 70B VANILLA LLAMA?
>>
>>102852887
Yes and no.
It's extremely finicky about prompt templates. If you accidentally fuck up a custom prompt template even slightly it will just shit out end of turn tokens at you. And you have to gaslight it to get NSFW
>>
Results so far of Mistral Small Fine Tune Evaluation

Based on the first story I generated at top-k=1, the least "slopped" entries were from: Mistral-Small-22B-ArliAI-RPMax-v1.1, Pantheon-RP-1.6.2-22b-Small, Pantheon-RP-Pure-1.6.2-22b-Small.

Others I tested were: Acolyte-22B, Mistral-Small-Drummer-22B, SeminalRP-22b, SorcererLM-22B, and Mistral Small Instruct (control).

Other impressions from first story:
* Only two models fully followed the format correctly, ArliAI-RPMax and SorcererLM-22B. Mistral Small Instruct did *not*. I take this as an indication that there's a lot of jitter in these tests based on the specific prompt, not that those fine tunes are better instruction-followers than the Instruct model they were tuned on.
* Mistral Small Drummer's output was nearly identical to Mistral Small Instruct's.
* SeminalRP-22b was the most different from the others in terms of dialogue structure. Perhaps worse, but it was different.
* Despite Pantheon-RP being allegedly more focused on story-writing than Pantheon-RP-Pure I preferred the latter.
* The only model with a misspelled word was SorcererLM-22B.
* The model was supposed to name the story and the first chapter. Mistral Small Instruct/Drummer picked a fine chapter name and a really awful story name. Every other model picked a better story name although subjectively ArliAI-RPMax's was the least interesting.
* Certain details were not described equally realistically by different models but without more data I don't yet feel comfortable saying it was more than random chance since they seemed to be picking between two possibilities.
>>
>>102852771
Would it be an upgrade to 340b?
https://huggingface.co/nvidia/Nemotron-4-340B-Instruct
>>
>>102852682
>>102852693
I'm renting them on vast/runpod. It's a tell that A100's availability is worse than H100 somehow, hinting at people preferring A100 for better cost/performance.

>>102852753
Thanks. SXM vs PCIE comparisons are even more arcane, but I sorta get the idea seeing PCIE H100s left untouched on runpod, unlike the SXM H100s.
I don't think there's headroom for large batch size, unfortunately having already maxed out VRAM with 8k sequence length.
Good thing we can pull models fairly quickly from HF.
>>
File: file.png (134 KB, 254x254)
>>102850496
>on par with performance of GPT4
it's been more than a year that I've been reading this line, fuck that shit
>>
okay, the novelty wore off; even story completion using base nemo is too retarded in the end.
>>
>>102852955
>340B
>It supports a context length of 4,096 tokens.
FOR WHAT PURPOSE
>>
>>102852670
what models have you used for stories?
>>
>>102852990
humiliation ritual
>>
>>102852854
Which character / what message did you start with?
>>
>>102852964
just looked at runpod now.
Less than double for H100 vs. A100, definitely worth the price. I see MI300X is the most popular choice right now, probably because it offers enough VRAM to do full finetunes instead of loras, which is also quicker than lora training, but then you have to download an entire model while the rental clock is ticking.
>>
>>102852910
is this not the case with base 3.1?

Is Nemotron more censored?
>>
>>102853016
Probably about the same amount of censored really. If you do a completion prompt for "As an AI language model trained by" It will say Meta. So they didn't tune it to the point that all the 3.1 is beaten out of it.
>>
>>102853016
>>102852910
>you have to gaslight it to get NSFW
Not in my experience. It just likes giving disclaimers: Warning: Mature content ahead.

But telling it not to stops that and it gets filthy.
>>
>>102852947
I noticed Sorcerer misspelling words in every response. There's something off with it.
>>
File: 068.png (376 KB, 635x457)
>>102852947
NTA and no horse in this race but without the repo names it sounds like this is a Drummer model, but to clarify it's just named in his honor
>>
>>102852947
>Based on the first story I generated at top-k=1, the least "slopped" entries were from: Mistral-Small-22B-ArliAI-RPMax-v1.1, Pantheon-RP-1.6.2-22b-Small, Pantheon-RP-Pure-1.6.2-22b-Small.

To clarify, I meant the ones without any of the specific phrases "couldn't help but think", "a mix of X and Y", "maybe, just maybe".
>>
>>102853036
lower your temp and/or check samplers
https://github.com/oobabooga/text-generation-webui/pull/6335
>I recommend pairing it with Min-P (0.02) and DRY (multiplier 0.8), with all other samplers disabled.
>>
>>102853036
I had that issue with some models after I banned words in ST. I had shit like "embrace" filtered and it fucked up my model's ability to type out the completely unrelated syllable 'ally' (as in 'logically', 'manually'). The model would dodge it with typos like 'manuallly'
>>
>>102853076
* couldn't help but feel
(couldn't help but think also happened once)
>>
>>102853054
>nbeerbower/Mistral-Small-Drummer-22B
>finetuned on jondurbin/gutenberg-dpo-v0.1 and nbeerbower/gutenberg2-dpo.
Not an RP fine tune, but if there was anywhere I would have expected the Gutenberg DPOs to matter, it would be a situation like this, asking the model to write me a story with certain elements; but it seemed not to matter.
>>
>>102852947
>Mistral Small Drummer's output was nearly identical to Mistral Small Instruct's.
Oh no! Drummer sisters what does this mean?!
>>
>>102853138
Those datasets are 10 and 5mb each. They're nothing.
>>
>>102850496
Dick preference optimization when?
>>
>>102852178
It MUST have Deus Ex quotes.
It MUST test the model's ability to speak in snacklish or at least reproduce a real snacklish spelling.
>>
>>102853187
You'll only have deus ex: invisible war, and you will like it.
>>
>>102852979
The final form of this hobby will be waiting for the next model just because the weights aligned in a slightly different way and the writing style fixates on a different set of identical responses.
>>
>>102852947
* SorcererLM-22B and Acolyte-22B were the only two that picked "Emma" instead of "Lily" as the name of the main character, whatever you want to take from that.
>>
>>102852947
>didn't test NousKyver
>>
>>102853200
Never heard of it.
>>
>>102852992
L3 storywriter on a p40 scrap build, but I'd say nemo base or the finetunes did surprisingly well when I was a/b testing after tuning it.

>>102853007
I was under the impression that there's barely any off-the-shelf solution for AMD, maybe I'll revisit that. Appreciate the help anon.
>>
>>102853222
Good. Me neither.
>>
>>102853223
It depends what you are doing.
AFAIK there's still no official bitsandbytes support for AMD, so if you want to do qlora you have to fuck around with third-party forks that may or may not work, and only work with the latest hardware if they do. But for full finetunes, transformers support for AMD is fairly mature AFAIK. So I doubt it requires much in the way of extra steps.
>>
File: sonic rome.png (134 KB, 647x837)
These models do have video game knowledge.
It's just not well generalized into the behavior of answering trivia questions.
>>
>PLaMo-100B
Did the VN translation guy test it? Nothingburger again? Would be nice to have something for Paradox Part 3.
>>
>>102850266
>the only way to cancel it was to force-shutdown your entire system
skill issue
>>
>>102853413
ctrl+c is a vital intervention. blocking it would be like blocking ctrl+alt+delete on a windows application.
>>
>>102853434
Shush, you will break skilltroon's tiny brain with this.
>>
>>102853434
nta. kill -9
But you are right. If you're gonna catch the signal, you gotta do it responsibly.
>>
How good is Nemo for a chatbot?
>>
>>102853481
8/10 it's okay
>>
>>102853481
I enjoyed at least 1000 hours playing with merges and tunes of it, this new ministral 8b seems like it tops it though.
>>
>>102853489
is it better than base 3.1?
>>
File: file.png (90 KB, 882x701)
Does anybody here use an A770 for LLMs? Seems like a really good budget card, $270 for 16GB VRAM and pretty good inference speed.
>>
>>102853504
3.1 is complete garbage. Unusable. 5/10
>>
Yea I'm really liking nemotron. Might prefer it over mistral large now.
>>
>>102853503
is Nemo > 3.1 70b base?

also 8b beats 70b???
>>
>>102853552
Idk man, I can't run models that big. I'd assume the llama one has annoying safety things that will lecture you and a positivity bias that could ruin rp experiences by not letting bad things happen though.
Llama will definitely be more intelligent, Nemo is a little retarded and you have to wrestle with it.
>>
>>102853522
How does it compare to base 3.1
>>
The current 70B meta is the merge I am uploading right now.
>>
>>102853607
what the hell is nemo good for then?
>>
>>102852429
>But AI already surpassed human-level intelligence.
It can't reason, learn, or understand nuance and subtlety. It can't think for itself.
>>
Kernel 6.11.2-1 has hit Debian testing. Any anon try it yet?
Last post on the 6.11 branch in this general said it was fucked
>>
>>102853612
Far more creative / 'personable'. Seems really really good at RP / creative writing which regular 3.1 was dry at.
>>
>>102843907
>>102844293
Make sure you check prompt processing bench numbers and not just token generation numbers before you buy any Apple silicon so you know what you're getting
>>
>>102853637
MYS but it's good for its size, for vramlets doing rp. Still gonna want a 70B+ for anything semi-complicated though.
>>
>>102853677
NTA but it's decent with a few swipes. And it's fast, so it's fine even if it can't grasp concepts on the first try. There were several times it surprised me with its creativity in stuff like dice rolls or punishing {{user}} for their actions.
>>
>>102853677
i'm talking about Nemo 70b. How does it compare to base 3.1 70b?
>>
>>102853705
Then don't say Nemo. Most people are going to assume Mistral Nemo.

Nemotron is really good in my testing so far.
>>
>>102853729
So Nemotron basically just outperforms 3.1 70B in every way?
>>
>>102853638
Still smarter than a w*man.
>>
>>102853744
From what I know it's 3.1 trained further on human preference so I would assume so.
>>
File: horse.png (115 KB, 1041x701)
It understands anatomy.
Neat.
The card is written like shit too, but it still worked pretty well.
Settings are
>Rocinante-12B-v1.1-Q4_K_S
>temp 1
>Top K 10
>Min P 0.05
Nemo really is a godsend for vramlets.
Don't get me wrong, it's not perfect and it's not magic, but it beats the hell out of Mistral 7B, Solar 10B, and the other stuff we used to use back then.
I'd hazard to say that it's as good as mixtral 8x7b at this point.
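Those settings as an API payload, if anyone wants to reproduce this outside ST (field names follow koboldcpp's /api/v1/generate as I remember them, so double-check):
[code]
import requests

payload = {
    "prompt": "### Instruction:\nContinue the scene.\n### Response:\n",
    "temperature": 1.0,
    "top_k": 10,
    "min_p": 0.05,
    "max_length": 300,
}
r = requests.post("http://localhost:5001/api/v1/generate", json=payload)
print(r.json()["results"][0]["text"])
[/code]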
>>
>>102853504
>>102853552
>>102853612
>>102853705
>>102853744
But is it? is it? 3.1... nemo.... is it??? nemotron, is it??? 3.1. .... 70b....
>>
>>102853790
Just fucking TELL ME what is BETTER
>>
>>102853817
Are you retarded?
>>
>>102853787
Reminds me of claude's ability to do accents, pretty cool to see in local
>>
>>102853817
Depends on you, retardus maximus. Try them and you decide. And keep your opinion to yourself. Or write it in shit on your bathroom wall.
>>
>>102853852
yes
>>
Anyone use datasets from https://huggingface.co/litagin to train tts models?
>>
>>102853817
Llama 3.1 linearized
>>
>>102853638
>It can't reason, learn or understand nuance and subtly. It can't think for itself.
There are a lot of people who can't, and they are still considered humans by law.
>>
>>102853941
you're retarded
>>
>>102853638
>>102853941
Again, the bar for Human-Level intelligence is very low, and AI already surpassed that a while ago.
What you are looking for is something that basically rivals experts on their own fields (be it a scientific field or something more subjective like being able to detect lies and deception), and no human does that.
>>
>>102853955
No, humans on average are retarded. And AI is capable of mimicking reasoning well enough for us to not be able to differentiate human from AI.
>>
>>102853963
>What you are looking for is something that basically rivals experts on their own fields (be it a scientific field or something more subjective like being able to detect lies and deception), and no human does that
That's not even remotely what I'm looking for.
For a model that I want to talk/RP/write with I don't care about how many tests it can pass, they're useless.
>>
>>102853963
>the bar for Human-Level intelligence is very low, and AI already surpassed that a while ago.
This isn't true at all you massive nigger
>>
>>102854001
What you are looking for does not necessarily imply being above or below Human-Level intelligence.
>>
>>102854095
Failing at ERP unironically convinces me that a model's supposed intelligence is illusory. General intelligence is general and no amount of overfitting on benchmarks will prove otherwise.
>>
>>102854090
You should do the research yourself, even for simple simulated tasks like organizing and throwing a party, most people thought that the AI was better than the humans in a blind choice test.
I repeat, most humans thought that the AI was more human than humans at simple daily tasks.
AI is replacing creative and mental jobs literally because it is better than most humans at it.
>>
>>102854133
General intelligence is general, and it can ERP with you, and will do a better job at it than most humans.
What you are looking for is simply an AI that can rival your best personal experiences with ERP, which doesn't have to do with having general intelligence or human-level intelligence.
>>
>>102854138
parroting instructions isn't intelligence. Guess what? You can search google and get intelligently written blog posts. That doesn't mean it understands what it's saying (modeling a world in its head where it understands how things relate to one another, how tables have physics etc)
>>
Midnight Miqu is still the only model worth using btw.
>>
If I want to learn about samplers, to better understand them and learn how to implement them, is there any recommended starting point? I get the general concept, but it's extremely fuzzy, and I don't know where to start to really understand not just how to choose them, but how to implement them.
>>
>>102854169
>it can ERP with you, and will do a better job at it than most humans
No it won't. Well humans probably won't want to do it at all so it has them beat there. But anyone who's tried RP knows that even the smartest cloud AI model is prone to make extremely dumb mistakes that a human never would, the kind of mistakes that betray a complete lack of understanding, that only an inhuman mindless token predictor would make. It might be much better at stringing prose together, but that's not the same thing.
>>
>>102854172
Humans parrot instructions too. And if being able to model the world in its head is enough, your fucking roomba does that.
Either way, an AI can mimic all of those things, and that's what makes it artificial, it's something made by us that imitates something natural.
>>
File: MidoMiqu.png (1.62 MB, 896x1152)
>>102854188
>>
>>102854188
Midnight Miqu is shite, even at 5bpw. Even with neutralized samplers, a tad of Min-P, and the recommended prompt templates. I fell for the Midnight Miqu meme. And largestral? Censored as fuck and even if it cooperates it's boring at the best of times when compared to Miqu. I've yet to see anyone recommend good prompt templates or settings for that dogwater.
>>
>>102854188
Didn't really like it, seems dumb compared to Largestral
>>
>>102854215
>extremely dumb mistakes that a human never would
You don't seem to know the dumb mistakes that a human would make. I play a lot of TTRPGs, and there are a lot of people who simply fail at RP even when they are trying; they are simply unable to simulate a character that is not them and play it out.
>>
>>102850843
Nemotron is rocking in RP. Give it a shot.
>>
>>102854232
>Censored as fuck
Just like trying to convince another person to ERP with you. They will simply not engage with you. And at this point, I would say the AI even surpasses other humans, since the AI will have the decency and courtesy of simply not ghosting you or telling you to fuck off, and it will be polite about rejecting ERP.
>>
>>102854221
>Humans parrot instructions too.
Humans also do other things.
>And if being able to model the world in its head is enough, your fucking roomba does that.
No it doesn't, it follows simple geometric instructions. It does not think about the world. You're dumb
>>
/aicg/ gods will make discount GPU paypigging a thing soon. Runpod in shambles
>>102854069
>>
File: 1504873705734.gif (1.66 MB, 540x603)
^ Regarding the discussion of AI RP vs. human RP
I got into AI as a cope/distraction after breaking up with my online bf. Had my head in the sand for like a year before I let that really sink in. And then took a break from AI to properly cope with my feels. Did some rebounding. Met some people. ERPing with humans is so fucking awkward now. And there's really not much to gain from that awkwardness since people just ghost each other willy-nilly these days. And the human ERP is vastly inferior. Even to like Llama-3-8B.
Not saying it's not worth exploring human companionship over an AI. But people have a real stick up their ass these days that they never used to have. But Nala will always be there for you.
>>
File: NightResortAesthetic.png (1.16 MB, 896x1152)
Good night lmg
>>
>>102854365
Good night Miku
>>
>>102854227
>mid miqu
yeah actually it's perfectly named
>>
>>102854351
Anonymous will always be here
>>
>>102854191
At the end of prompt processing, each token will have a certain probability for being the next one. A sampler is just a heuristic to trim or alter those token probabilities. As a dumb example, pick the 3 most likely tokens (say, 70%, 5% and 3%) and set all their probabilities to 70%. Now the inference software is more likely to pick any of those three instead of just, depending on other samplers, defaulting to the first one. Or if you want to go for uncommon tokens, just remove the most likely, leaving only the 5 and 3% tokens. It will break things, but it's just an example.
top-k is probably the simplest trimming sampler. Look at the implementation in llama.cpp
Init
>https://github.com/ggerganov/llama.cpp/blob/master/src/llama-sampling.cpp#L506
Implementation
>https://github.com/ggerganov/llama.cpp/blob/master/src/llama-sampling.cpp#L91
>>
Hi all, Drummer here...

>>102853165
His LoRA rank was 16. Is there any sense in finetuning at that rank? You'd have to compensate with a really high LR but won't you be lobotomizing the model at that point? Am I wrong? Anyone a LoRA expert?
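For concreteness, rank 16 in peft terms looks like this (alpha and target modules here are illustrative guesses, not the actual tune's settings):
[code]
from peft import LoraConfig

config = LoraConfig(
    r=16,                    # the rank in question
    lora_alpha=32,           # common 2*r heuristic; low r tends to want more alpha/LR
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
[/code]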
>>
>>102854351
>people have a real stick up their ass these days that they never used to have
I think folks forgot how to interact with each other. Yeah, the past 4 years have wreaked havoc on social conventions. Back in the day people knew how to do a proper back and forth and put in effort
>>
>>102854598
Like for literally anything. And if I try to talk about my interests, and they don't happen to be that person's exact laundry list of interests, I might as well just be retching up a dead kitten in front of them, because that's how they react.
>>
>>102852423
Yes, it's better.
>>
>>102854191
>>102854542 (cont)
Don't be spooked by the length of the function. Most of it is just sorting tokens. The actual sampler is exactly one line:
>https://github.com/ggerganov/llama.cpp/blob/master/src/llama-sampling.cpp#L164
>cur_p->size = k;
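Same trim-then-renormalize idea in throwaway numpy, if C++ isn't your thing (a toy, not the llama.cpp implementation):
[code]
import numpy as np

def top_k(logits: np.ndarray, k: int) -> np.ndarray:
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()            # softmax
    cutoff = np.sort(probs)[-k]     # k-th largest probability
    probs[probs < cutoff] = 0.0     # trim the tail
    return probs / probs.sum()      # renormalize the survivors

rng = np.random.default_rng()
logits = rng.normal(size=32000)     # pretend vocab
next_token = rng.choice(len(logits), p=top_k(logits, k=40))
[/code]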
>>
>>102854617
>that's how they react.
Do you find that it's a generational thing or across the board?
But yup LLMs don't have this problem
>>
nemotron 70b is sentient
>>
>>102854724
Doubt.

Though I think Nvidia threw some more programming stuff into it, because on one of my coding tests, one of those "it gets it wrong, then you tell it the problem, and after that it gets the fix correct" questions, it's catching the tricky part right away.

Letting me down on pop culture, though.
>>
>>102852310
NTA but Muv luv, Steins;gate
>>
>>102850022
>Ollama's integration with Hugging Face Hub
But what about ollama's walled garden? Won't somebody think of the investors?
>>
>>102854674
Kind of generational. People under 30 seem incapable of committing to any degree of personal relationship. People over 30 are just so jaded that they don't even try.
>>
>>102854543
Hi Drummer.

You're mostly correct, a small rank usually means you'll want to bump the LR by a bit, but in rare cases it's fine without that. This seemed to not be one of those cases, however.
>>
>>102850925
do the 3b test
>>
>>102855296
3B isn't open.
>>
>>102851704
wasn't it literally the case with mixtral
>>
File: LECUN-Yann.png (33 KB, 500x500)
>>102855342
>Best
>Small
>Not open?
So was he wrong bros?
>>
>>102852178
touhou, project moon games, diablo, wow, league, gothic, the witcher, divinity games, tes
choose any you want
>>
File: 00058-3694687329.png (284 KB, 512x512)
https://huggingface.co/Envoid/Llama-3.05-Nemotron-Tenyxchat-Storybreaker-70B
I've decided to go back to making unholy merges. I even put a pony on the model card to assault your fragile masculinity.
>>
>>102855777
>more snakeoil
Thanks retard?
>>
File: Untitled.png (190 KB, 680x1220)
DAQ: Density-Aware Post-Training Weight-Only Quantization For LLMs
https://arxiv.org/abs/2410.12187
>Large language models (LLMs) excel in various tasks but face deployment challenges due to hardware constraints. We propose density-aware post-training weight-only quantization (DAQ), which has two stages: 1) density-centric alignment, which identifies the center of high-density weights and centers the dynamic range on this point to align high-density weight regions with floating-point high-precision regions; 2) learnable dynamic range adjustment, which adjusts the dynamic range by optimizing quantization parameters (i.e., scale and zero-point) based on the impact of weights on the model output. Experiments on LLaMA and LLaMA-2 show that DAQ consistently outperforms the best baseline method, reducing perplexity loss by an average of 22.8% on LLaMA and 19.6% on LLaMA-2.
https://anonymous.4open.science/r/DAQ-E747/README.md
new day, new quant method. didn't mention QUIP# and from memory it should be worse. only perplexity metrics, on which it outperforms GPTQ/AWQ. llama 1/2 tested only. no data on how long it takes, but from a brief mention of quant time it seems it can be parallelized, so probably much quicker than QUIP#. posting for anyone who wants to mess around with quants
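Toy version of the stage-1 idea as I read the abstract (not the paper's algorithm, just the intuition of centering the quant grid on the density peak instead of min/max):
[code]
import numpy as np

w = np.random.laplace(loc=0.07, scale=0.02, size=100_000).astype(np.float32)

hist, edges = np.histogram(w, bins=512)
center = edges[hist.argmax()]            # densest bin ~ the density center
scale = np.abs(w - center).max() / 7     # symmetric int4 grid around it

q = np.clip(np.round((w - center) / scale), -8, 7)
w_hat = q * scale + center
print("mse:", float(np.mean((w - w_hat) ** 2)))
[/code]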
>>
>>102852178
Just keep esl vidya out.
Benchmarks attract data to new models/tunes, and we don't want esl data and horrific translations ruining our future models.
>>
>>102851991
g-goof
>>
>>102855777
Based. I was waiting for somepony to do this. Nice GOD trips, btw.
>>
>>102855777
Always worth a try, so why not. Nice trips btw.
>>
>>102850494
The scenario ends 2 seconds later, with me holding a new 4090, and {{char}} leaving, dejected.
>>
Nemotron is overly flowery, I've never used Claude but is that how it would've felt like?
>>
ara ara youre so cute when youre shy
>>
nemotron 70b.
funny, it didn't mind giving me a teenage schoolgirl and giving her a vibrator.
But it really tries to mess with the direction of the story as it gets more fucked up.
>>
>>102856315
>>
>>102856323
>>
>XXX vs. PG-13: While aiming for an XXX rating, I prioritized suggestive, sensual scenarios over explicit content, allowing for your imagination and future interactions to guide the explicitness.
lol
I miss the times when you didn't even need to prompt something.
In the beginning chatgpt knew I wanted a horror story without even explicitly prompting it. Reading between the lines.
Now instructions are downright ignored.
>>
>>102856409
>I prioritized suggestive, sensual scenarios over explicit content
which results in 'as you pull down her panties, you can see her most intimate area'
>>
Jamba.gguf?
>>
>ministral-3b
No weights available? I was hoping to use it as a draft model
>>
>>102856409
It's to keep you safe, freak.
>>
>>102853787
You still find the v1.1 to be best? Not any of the other versions?
>>
>>102852178
More one punch man and one piece knowledge would help tatsumaki and nami cards.
>>
>>102857421
this is the most SEAmonkey and/or latinx post in the entire thread by a mile
you can either apologize and promise not to indulge in your chimp tendencies ever again, or leave
>>
File: 1707986448545363.jpg (178 KB, 1080x1080)
do we have any sort of guidelines as to what to look for in tts sample snippets, some experience from when elevenlabs wasn't shit?
>>
Why is the qwen2.5-14b-instruct okay with NSFW but the 32b version is anal about it?
>>
>>102857853
Too dumb to know it shouldn't be ok with it.
>>
First impression of L3.1 Nemotron Instruct (at Q6K):

Coding: It was good but not great at my Python checks, and it wasn't fooled by my tricky Java check. Needs more testing when I have dev time but it's on par with my go-to choices right now.
Music theory: Passed.
Culture: Tested some fictional characters (e.g. Pokemons) and it seemed to know character roles but not descriptions of appearance etc. Boo.
RP: Prefill dodged the refusal but it will virtue signal along the way, and it seemed to be tuned for 0-second attention spans. (Character's current goal is to deliver a MacGuffin; L3.1N writes: But first, she remembered that she needs to deliver the MacGuffin to her friends. "Anon, I'm going to go give the MacGuffin to our friends.") It also forgot the existence of a room that it was just in, and is adjacent to the one the character is standing in right now, and decided that it would look for such a room. Really bad, and the constant narration of "This is what I want to do and now I am going to do it. I do that" is grating, and I wonder if that's some Chain of Thought style bullshit in the Nemotronification seeping out. Even simple tests like 9.9 versus 9.11 had it elaborating on how math works till I told it not to show its work. But it didn't say anything barely above a whisper in a saved RP at the point that L3 normal did, so it gets a point for that.

Probably a good alternative to L3 for less/different slop, and might continue to prove itself for rote productivity Q&A, where its habit of explaining at length is useful albeit time-consuming if you're a System RAM guy like me. But creativity is probably a downgrade versus abliterated/RP-tuned L3's; better word choice, but it writes like a sovlless robot.

I'm curious how Reward will perform, but right now I'm finding only Q2K, Q3K, and Q8_0, so waiting on a poorfriend quant.
>>
>>102857223
I haven't tried the other versions.
Guess I should.
But yeah, I do find 1.1 to be really fucking good in comparison to mini-magnum, lyra, celeste, etc.



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.