/g/ - Technology


/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>101643089 & >>101636887

►News
>(07/27) Llama 3.1 rope scaling merged: https://github.com/ggerganov/llama.cpp/pull/8676
>(07/26) Cyberagent releases Japanese fine-tune model: https://hf.co/cyberagent/Llama-3.1-70B-Japanese-Instruct-2407
>(07/25) BAAI & TeleAI release 1T parameter model: https://hf.co/CofeAI/Tele-FLM-1T
>(07/24) Mistral Large 2 123B released: https://hf.co/mistralai/Mistral-Large-Instruct-2407
>(07/23) Llama 3.1 officially released: https://ai.meta.com/blog/meta-llama-3-1/
>(07/22) llamanon leaks 405B base model: https://files.catbox.moe/d88djr.torrent >>101516633

►News Archive: https://rentry.org/lmg-news-archive
►FAQ: https://wikia.schneedc.com
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Programming: https://hf.co/spaces/bigcode/bigcode-models-leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
File: 4_recap.png (256 KB, 512x512)
►Recent Highlights from the Previous Thread: >>101643089

--Paper: Physics of Language Models research papers and video presentations: >>101646652 >>101646690
--Paper (old): Discussion of research paper on removing layers in machine learning models: >>101643525 >>101643683
--Papers: >>101646773
--Discussion on determinism in llama.cpp and its implications for benchmarking and research: >>101645325 >>101645331 >>101645367 >>101648450 >>101646003 >>101648420 >>101648477 >>101648584 >>101648610 >>101649046 >>101649089 >>101649141 >>101649207 >>101649256 >>101649349 >>101649389 >>101649638 >>101649678 >>101649695 >>101650304 >>101649408 >>101649405 >>101649407
--vLLM's CPU off-loading performance and comparison with llama.cpp: >>101643652 >>101648356
--Sunfall on llama 3.1 8b released, community discusses LimaRP-DS and JSONL formats: >>101647576 >>101648320 >>101648386 >>101648441
--Nemo vs Mixtral discussion: >>101644934 >>101644969 >>101645820 >>101645003 >>101645013 >>101645133 >>101645153 >>101645207 >>101645233
--LlamaCPP and KoboldCPP performance issues and troubleshooting: >>101645050 >>101645114 >>101645191 >>101645296 >>101645403 >>101645436 >>101645527 >>101645697 >>101645733
--How to format and place a summary of previous chat events to improve model responses: >>101645914 >>101645932 >>101646142
--Anon releases neural text to speech library, babylon.cpp: >>101649677
--Anon discusses Udio's new model and compares it to Suno, sharing creative works and personal preferences: >>101645878 >>101645889 >>101646282 >>101647226 >>101647330 >>101647472
--Altman's reaction to OAI's situation and discussion on its future prospects: >>101645180 >>101645214 >>101645242 >>101645215 >>101645222 >>101645266 >>101645271 >>101645287
--Anon shares Matxinh - Expression Morpher tool: >>101648382
--Miku (free space): >>101643176 >>101645414 >>101646981

►Recent Highlight Posts from the Previous Thread: >>101643094
>>
do something
>>
*does something*
>>
>>101651190
Seethes about formatting.
>>
`does nothing`
>>
I used to hate when local models struggled with formatting, but I've learned to see past it. As long as I get my shivers.
>>
File: llama3.1-white room.png (453 KB, 824x2444)
Logs! Empty system prompt, llama 3.1 70b.
>>
Mistral Large is the shit, amazing intelligence, above average prose and a nice slow burn.
>>
What is the best model for RP right now? Is there something based on Llama 3.1 already?
>>
hi wehre are the bots :DDDDD
>>
>>101651244
That's pretty cool actually.
>>
>>101651269
quant?
>>
File: LLM-history.png (638 KB, 4751x2585)
>>101651277
Mistral-Large.
>>
File: 1692228820109924.jpg (759 KB, 1856x2464)
>>101651157
>>
>>101651244
3.1 is pretty fun for sfw rp, when it doesn't have to fumble through writing about sex it's actually pretty decent
>>
>>101651314
IQ2_M, anons go crazy whenever anyone runs a quant lower than 4 bits but it's still better than everything else out there.
>>
>>101651393
I am one of those anons, but if it's all you can run you might as well
>>
>>101651327
>history as written by the shills
Euryale is a merge of at least:
>elinas/chronos-70b-v2
>NousResearch/Nous-Hermes-Llama2-70b
>jondurbin/airoboros-l2-70b-2.1-creative
>garage-bAInd/Platypus2-70B-instruct
>MayaPH/GodziLLa2-70B
>nRuaif/fiction.live-Kimiko-V2-70B
>lemonilia/limarp-llama2-v2
Somehow it's outside the merge era. Repeat a lie often enough and people will believe it. That's the power of shilling.
>>
>>101651393
yeah, I also use low quants, 3.5bpw midnight miqu has been my daily driver for ages
As soon as I get another 3090 I'm testing Mistral Large at 3.5bpw
>>
>>101651477
Quite a lot of people weren't / aren't aware Mythomax is a merge as well.
>>
>>101651477
You are partially right, Euryale was a merge with a tune on top. I should indeed have placed it into the merge era.
>>
You missed CR 35B, Yi 34B, Qwen2 72B and Mythomax 13B among many other pre-llama models, newbie
>>
>>101651519
that explains why it was so shitty
>>
>>101651329
Equipment-less tandem base jumping with Miku
>>
>>101651584
>CR 35B
Mixtral is better
>Yi 34B
Never was good.
>Qwen2 72B
Meme
>Mythomax 13B
That was only okay if you wanted a brainless bot for coom
>>
File: file.png (17 KB, 834x158)
What the Fuck is this guy on about?
>40 so limiting you may as well keep temp/top-p 1
>>
>>101651327
I would say 405B beats it by a mile. But you have to be a cloud cuck to run it.
>>
>>101651619
>Mixtral is better
talking about mixtral, you also missed the 8x7b version
>Never was good.
Best medium model at the time
>That was only okay if you wanted a brainless bot for coom
Again, best small ERP model at the time
>>
>>101651621
raddit is not smart yes, are we supposed to be surprised by this groundbreaking discovery? also kindly go back
>>
>>101651584
If you are referring to >>101651327
I was here since llama1 dropped and I don't care about smaller or inferior models. Please make your own chart with them, I never really interacted with those for long.
>>
File: temp_scaling.gif (55 KB, 388x440)
>>101651621
That's all wrong.
Holy shit.
TopK just means "keep the N most likely tokens";
TopP just means "keep the smallest set of top tokens whose cumulative probability reaches P";
Temp rescales the logits before the softmax, which is what changes each token's percentage chance of being chosen.
They don't relate to each other like that at all, holy fuck.
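To make that concrete, here is a toy sketch of the three knobs in plain numpy. It is illustrative only, not llama.cpp's actual sampler code, and real backends differ in the order they apply these:
[code]
import numpy as np

def sample(logits, temperature=1.0, top_k=0, top_p=1.0, seed=0):
    """Toy sampler: temperature rescales logits, top-k/top-p truncate the distribution."""
    rng = np.random.default_rng(seed)
    logits = np.asarray(logits, dtype=np.float64) / max(temperature, 1e-8)
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()

    order = np.argsort(probs)[::-1]        # token ids sorted by probability, descending
    keep = np.ones_like(probs, dtype=bool)
    if top_k > 0:                          # top-k: keep only the k most likely tokens
        keep[order[top_k:]] = False
    if top_p < 1.0:                        # top-p: smallest prefix whose cumulative prob >= p
        cutoff = np.searchsorted(np.cumsum(probs[order]), top_p) + 1
        keep[order[cutoff:]] = False

    probs = np.where(keep, probs, 0.0)
    probs /= probs.sum()
    return rng.choice(len(probs), p=probs)

# toy logits for a 5-token vocabulary
print(sample([2.0, 1.0, 0.5, -1.0, -3.0], temperature=0.8, top_k=3, top_p=0.9))
[/code]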
>>
File: topP.png (192 KB, 892x1392)
>>101651664
>>
Yeah, I was referring to that chart, you don't even deserve a (You) for your mental retardation
>>
File: minP.png (227 KB, 1949x845)
>>101651678
Incidentally, that's what makes minP so useful. It gates sampling by each token's probability relative to the top token (a token survives only if its probability is at least min_p times the most likely token's), which means you can discard low-likelihood tokens, which often correlate with schizo responses.
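In the same toy-numpy spirit as the sketch above (not any backend's exact code), the min-p rule looks like this; note how the cutoff automatically loosens when the distribution is flat:
[code]
import numpy as np

def min_p_filter(probs, min_p=0.05):
    """Zero out tokens below min_p * P(most likely token), then renormalize."""
    probs = np.asarray(probs, dtype=np.float64)
    kept = np.where(probs >= min_p * probs.max(), probs, 0.0)
    return kept / kept.sum()

# peaked distribution: the 0.04 and 0.01 tails get dropped at min_p=0.1
print(min_p_filter([0.60, 0.25, 0.10, 0.04, 0.01], min_p=0.1))
# flat distribution: everything survives, unlike a fixed top-k/top-p cutoff
print(min_p_filter([0.22, 0.21, 0.20, 0.19, 0.18], min_p=0.1))
[/code]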
>>
File: file.png (31 KB, 849x298)
>>101651664
I know it's all wrong, that's why I'm losing brain cells reading his posts. He also argued the other day that K40 * P.92 = K36
>>
>>101651714
>But.. they actually are?
Immediately disregard any statement posed as a question. You're a retard for subjecting yourself to it. Stay there.
>>
>>101651714
see
>>101651643
>>
>>101651619
>Mixtral is better
Nah, in retrospect Mixtral was always overcooked cope model. I used it for a while as a VRAMlet but dumped it for the first alternative and never looked back. Meanwhile command-R is still the natural prose king.
>>
Was using horde to test out Mistral Large in story-mode when suddenly it switches to chat format, tells me my writing is bad (correct), and then the worker disconnects. Did the model hallucinate, or did I just talk to the man behind the screen???
>>
>>101651539
>with a tune on top
[citation needed]
>>
>>101651626
Or to quant it to 1QM on 128gb system
>>
>>101651754
>Nah, in retrospect Mixtral was always overcooked cope model.
Absolutely correct.
>>
>>101651757
>>101651142
>>
>>101651758
>17th Attempt. Past 10 Failed, cost me >$200 lol.
>
>Idea is an updated version of Euryale with ReMantik instead of the ties-merge between the original 3 models.
>
>This is then mixed with a saucy model (spicyboros+pyg_lora) with a Mythomax-esque Ratio, and a certain experimental (self) LoRA applied to it.
>>
>>101651659
>merges and 120b frankenmerges
>likely ran at 2bit
You're barely human.
>>
>>101651800
>I'm a butthurt ramlet
Q6_K on 128GB system. (You) are barely human.
>>
>>101651799
All that says is that he spent all that money merging models.
>>
>>101651843
merging is free, I've done it myself
>>
>>101651854
>17th Attempt. cost me >$200 lol.
>200/17 = $11
He's just renting a pod for the memory.
>>
>>101651854
>>101651883
He literally states that he made LoRAs. Making a LoRA is not free. It requires training.
>>
>>101651883
how much does it cost to tune 70b lora?
>>
>>101651789
That's a given, I was just surprised a host could interject and basically write you directly.
>>
>baseboard with 8xh100 currently at $25k ending in 7 hrs
is it going to get sniped into oblivion or do i have a chance? would i regret it? are gpus gonna be obsolete soon with everybody investing billions in getting out of nvidia's monopoly? cloud gpus are not an option for me due to privacy and i do not have a clear path to monetizing them, BUT if they will allow me to run and train whatever the fuck i want locally for years to come i'd consider it worth it
>>
>>101651922
I think you'll need to post the link in order for me to better help you :)
>>
>>101651922
>would i regret it?
Yes.
>BUT if they will allow me to run and train whatever the fuck i want locally for years to come i'd consider it worth it
Won't even last a year as models get larger and larger.
>>
>>101651901
tune on top != merging some LoRA that he made
It's not different than just merging another model. Also, the LoRA might have been shit. It's just marketing to say it's more than a merge, and you eat it up like an idiot.
>>
>>101651157
So, what's the best model for ERP for a poorfag with 16GB VRAM/128GB RAM?
Nemo, quants of Mistral Large, a finetune of LLama3?
Or should I just buy a couple 3090s?
>>
>>101651922
oh nvm
>reserve not met
it's probably not even in the same order of magnitude desu
>>
>>101651970
Stheno.
>>
>>101651827
>Q6_K on 128GB system
>at 0.5 T/s
>to get placebo'd by goliath
That's a lot of cope.
>>
>>101651970
If you have patience, Mistral Large Q6_K, worth it. CR+ is also good, has nice style, but a bit dumber.
>>
>>101651970
You can try all three already. Give them a go and then decide if it's worth upgrading.
>>
>>101651164
Sankyuu
>>
>vllm
>git pull
>Failed to build
Aaaaaaaaaaaaaaaaaaaaaaaaaa
>>
>>101651970
Stheno v3.2 or Nemo (if you want to spend hours figuring out the stable settings) is a standard recommendation for poorfags. Dunno about larger model's quants.
>>
>>101651970
mini-magnum is the mythomax of 2024. Try it
>>
>>101651995
(You) are a ramlet, therefore your opinion is invalid. Simple as.
>>
>>101652030
Stheno at FP32 beats everything <= 70B
>>
>>101652036
NTA but it failed the Nala test due to anthropomorphism.
>>
>>101652022
It looks like you're experiencing an issue with pulling the latest changes from the vllm repository and encountering a build failure. Here are some steps you can take to troubleshoot and resolve this issue:

Check Dependencies:
Ensure that all required dependencies are installed and up-to-date. You can usually find a list of dependencies in the README.md file or a requirements.txt file in the repository.

Clean Build:
Sometimes old build artifacts can cause issues. Try cleaning your build environment:

>make clean

Check for Specific Error Messages:
When the build fails, it usually outputs error messages. Look for any specific error messages that can give you a clue about what went wrong.

Update Submodules:
If the repository uses submodules, make sure they are updated:

>git submodule update --init --recursive

Check Git Status:
Ensure your working directory is clean and there are no conflicting changes:

>git status

Rollback Changes:
If the issue started after a recent pull, you might want to try rolling back to a previous commit that was working:

>git log
>git checkout <commit_id>

Build Log:
Save the build log to review or share for further assistance:

>make > build.log 2>&1

Seek Help:
If you're unable to resolve the issue, consider seeking help from the community or repository maintainers. Provide them with detailed information about the error and the steps you've taken.
>>
>>101652040
>"principle of anonymity":"author identity is irrelevant to an idea's logical provability."
>>
>>101651960
a tune is just a lora with a high enough rank; if the lora was made on the model then he is tuning on top of the model
>>
>>101652036
>mythomax of 2024
so retarded, overhyped model?
>>
>>101652067
hi petra
>>
>>101651982
Recommend any specific version? Or is L3-8B-Stheno-v3.3-32K ok?
>>101652014
I guess, just wanted to ask someone with more experience first. I thought, for example, that models based on the official Llama checkpoints would be too censored.
>>
>>101652058
>Seek Help
many here should
>>
>>101652087
avoid 3.3, every vramlet agrees it's far worse than 3.2
>>
>>101652036
I second this. Running mini-magnum at Q8. Takes a bit to get going and produces logical inconsistencies, but otherwise I like it more than other small models. That said, if I could run Mistral Large, I would.
>>
>>101652068
technically alpha (scaling) can be used to broaden the number of parameters you're fucking up on a model with said LoRA adapter.
If you make a LoRA for a model with a hidden size of 4096 and set the Alpha to 4096 you've effectively just changed every parameter on the model. But the greater the deviation between rank and alpha the less granularity you have in those parameters. That said most "tunes" never touch the r/alpha values and they just use the default. Enjoy your 0.05% changed parameters.
>>
Is there any table that tells you how much vram you need for llama3.1 8b based off which quant you use?
>>
>>101652145
Basic arithmetic and preschool level computer science knowledge.
>>
>>101652142
You're thinking of rank, not alpha.
Alpha is just an amplifier on the delta.
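For anyone lost in the rank/alpha back-and-forth, the standard LoRA math (toy numbers below, not any particular trainer's defaults) makes the distinction obvious: the delta is (alpha/rank) * B @ A, so rank caps how many independent directions you can change while alpha only scales the whole thing.
[code]
import numpy as np

hidden, rank, alpha = 512, 16, 32          # toy sizes; real models use hidden=4096+
rng = np.random.default_rng(0)

W = rng.normal(size=(hidden, hidden))      # frozen base weight
A = rng.normal(size=(rank, hidden)) * 0.01 # LoRA "down" projection (trained)
B = rng.normal(size=(hidden, rank)) * 0.01 # LoRA "up" projection (trained)

delta = (alpha / rank) * (B @ A)           # the update that gets added to W
W_merged = W + delta

# The delta touches every entry of W, but it only has `rank` independent
# directions; cranking alpha makes it bigger, not higher-rank.
print(np.linalg.matrix_rank(delta))        # 16
[/code]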
>>
>>101652145
quant size in gb+20%
>>
>>101652145
The file size is the minimum. The higher the context the more you need on top of that.
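The "basic arithmetic" spelled out for Llama 3.1 8B, assuming the usual config (32 layers, 8 KV heads, head dim 128) and an fp16 KV cache; the quant file sizes below are approximate and vary a bit between uploaders:
[code]
# rough VRAM estimate = quantized weights + KV cache + a little overhead
n_layers, n_kv_heads, head_dim = 32, 8, 128                     # Llama 3.1 8B, per the model card
kv_bytes_per_token = 2 * n_layers * n_kv_heads * head_dim * 2   # K and V, fp16 -> 128 KiB/token

def total_gib(file_gib, ctx):
    # ~0.5 GiB for compute buffers is a very rough fudge factor
    return file_gib + ctx * kv_bytes_per_token / 1024**3 + 0.5

for quant, file_gib in [("Q4_K_M", 4.9), ("Q6_K", 6.6), ("Q8_0", 8.5)]:
    for ctx in (8192, 32768):
        print(f"{quant:7s} ctx={ctx:6d} -> ~{total_gib(file_gib, ctx):.1f} GiB")
[/code]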
>>
So I guess people actually managed to uncensor Llama? Is it something like abliteration or just fine-tuning/extra params?
>>
>>101652068
He never said anything about tuning on top of the model. He says he "applied" it, just like he applies other LoRAs made by other people. It's just marketing to say it has a secret sauce and "it's not like other merges." But it's just one.
>>
>>101652036
they even sound kind of similar
minimags
>>
>>101652212
>Is it something like abliteration or just fine-tuning/extra params?
basic prompting
>>
>>101652212
If by uncensoring you mean being lewd and more prone to following nsfw topics then yes (finetunes like Stheno). If you mean simply not refusing when asked about something then it worked from the start if you had more than two brain cells.
>>
>>101652212
I have a DPO dataset filled with rejections by various models (gemma2-9b being the biggest offender/contributor). I tend to do a run of that on the instruction model and then merge models on top of that.
I used abliteration for awhile but it seems to make the models less prone to roleplay rejections as well, so I stopped.
>>101652236
>>101652248
Stop bragging about your lack of imagination, normies.
>>
>>101651164
NOOOO EXPERIENCE-KUNS POST IS NOT INCLUDED >>101637653
next time the bait record gets broken include:
--new lmg bait record: >>post

thank you
>>
>>101652212
Yes, with just prompting the day 3.0 released. 3.1 is the same.
>>
>>101652321
Sorry, but I don't really think it's appropriate to joke about rape in this sub.
>>
>went to huggingface to download llama3.1 8b
>you need to get approved by meta to do that
owari da
>>
>>101652321
*--new petra samefag record: >>post
>>
>>101652359
Are you incapable of writing fake info?
>>
File: kiryu-smile.jpg (20 KB, 400x400)
>>101652404
actually, I just logged in and they approved me
I filled all fields with "a", didn't think they would actually approve it
>>
>>101652281
glue your fucking mouth retard and learn basics how to use models
>>
>>101652419
they quite literally just filter for "region: china, russia"
>Why I am not able to access from China/Russia?

>Meta Llama 3 is available via HuggingFace globally, except in comprehensively sanctioned jurisdictions.
https://huggingface.co/meta-llama/Meta-Llama-3.1-70B-Instruct/discussions/13
>>
>>101652369
petra? who's petra?
>>
>>101652359
https://huggingface.co/NousResearch
>>
>>101652457
(You)
>>
We really are living in the future
>>
>>101652773
Good choices.
>>
>training toy model
>a few hours python will either segfault or end up hard rebooting the system

ughhhhh
>>
>>101651157
Did anyone make an image classifier with vision models or something like that? I want to filter my 4chan pictures somehow.
>>
>>101652820
deepbooru, maybe?
>>
>>101652773
All this time and I never thought of this. Thank you anon
>>
So I am using mistral nemo instruct with presets you guys posted a few days ago. The use of the DRY sampler in silly tavern was highlighted because it supposedly makes everything better. But with it enabled the model just can't stop generating and is unable to stop its response at all. Am I missing something maybe? Is there any option that I should enable/disable?
>>
>>101652830
I looked at that but I also want to filter by words too. For example I'll be able to search a meme by typing a word I remember about it. I did this with regular OCR but it can't read everything.
>>
>>101652773
what model are you using?
>>
>>101652954
You could use a vision model like cogvlm, it's pretty good at OCR.
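cogvlm covers the captioning/OCR route; for the "search my folder by a word" part, a CLIP-style embedding index is the usual trick. A rough sketch with the transformers CLIP classes (model name and folder path are just examples, and CLIP matches visual content plus some rendered text, it won't fully replace OCR):
[code]
import glob
import os

import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

paths = glob.glob(os.path.expanduser("~/4chan_pics/*.png"))       # example folder
images = [Image.open(p).convert("RGB") for p in paths]            # batch this for big folders

with torch.no_grad():
    img_emb = model.get_image_features(**processor(images=images, return_tensors="pt"))
    img_emb = img_emb / img_emb.norm(dim=-1, keepdim=True)

def search(query, k=5):
    with torch.no_grad():
        q = model.get_text_features(**processor(text=[query], return_tensors="pt", padding=True))
        q = q / q.norm(dim=-1, keepdim=True)
    scores = (img_emb @ q.T).squeeze(1)                            # cosine similarity per image
    return [paths[i] for i in scores.topk(min(k, len(paths))).indices]

print(search("miku holding a sign"))
[/code]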
>>
File: miku-reloaded+.png (1.39 MB, 1024x1024)
>>101651157
https://www.youtube.com/watch?v=CXhqDfar8sQ
>>
I just got ghosted by my gf of three months, Im giving up on people and need a fake waifu.

Please recommend the largest llm i can run that can simulate my waifu for a 3080 12gb with 30gigs of ram for offloading
>>
>>101653088
>can't be bothered to scroll half a page to see what people are using
Try nemo. Fuck off.
>>
>>101653086
Your Miku is valid.
>>
>>101653072
My first time hearing about it. Can I run it with a 16 gb card?
>>
File: heath-ledger-joker+02+.jpg (643 KB, 707x1000)
>>101652457
Me.
>>
>>101652321
>>101652369
take your meds schizo
>>
>>101653088
To simulate a 'foid I would definitely recommend Pygmalion 6B, 2bit quantization
>>
File: kevin-flynn+.jpg (9 KB, 474x238)
>>101653088
The problem with LLMs is that they're just not smart, and they're also too slow to attach a RAG database to in most cases. My Chun Li bot is good for basic snu snu, but she's very obviously not intelligent. I still want to give my Kevin Flynn bot enough data to try and successfully invent the Grid IRL once models are smart enough, though.
>>
A reminder that kaggle gives you 2x15GB VRAM for free. Even if the GPUs are somewhat slow, it still enables you to do things you couldn't otherwise.
>>
about to splurge lots of (not my) money to gen outputs with gpt-4o/3.5 sonnet for a dataset
>>
File: 1721855862072573.png (83 KB, 1048x203)
Niggers what the fuck are you advertising now?
>>
i'm not too good at all this
is there a way to locally run this?
https://huggingface.co/spaces/Xenova/whisper-speaker-diarization
I need to run it for larger videos
>>
>>101653127
https://huggingface.co/TheDrummer/Gemmasutra-9B-v1-GGUF

This is decent for basic coom, and it will definitely run in 16 Gb of VRAM. And no, I'm not Drummer, fuck off schizos. It's just an ok model.
>>
>>101653363
I'm looking for image classification, not coom. Isn't this a regular text model?
>>
>>101653351
>buy an a-ACK!
>>
>>101653363
Did you reply to the wrong anon?
>>
>>101653363
>this reading comprehension
>this taste in models
explains a lot
>>
>>101653127
Yeah, you should be able to use a quant, it's just a 1XB model after all.
https://huggingface.co/THUDM/cogvlm2-llama3-chat-19B
https://huggingface.co/THUDM/cogvlm-chat-hf
>>
>>101653363
Is drummer running shill bots now?
>>
>>101653469
It's Sao false-flagging.
>>
>>101653306
Nobody cares
>>
https://huggingface.co/Undi95/Lumimaid-Magnum-12B-GGUF

ITS UP
>>
>>101653306
>do things you couldn't otherwise.
>2x15GB
are you european or something?
>>
How slow would it be if I tried to use Mistral Large on a 24GB card + 32GB of RAM?

What GGUF size would I need to use if it is somewhat tolerable?
>>
>>101653049
right now I'm running DarkIdol-Llama-3.1-8B-Instruct-1.2-Uncensored-Q6_K-imat
but honestly I have no idea if it's good or not
>>
>>101653532
Thanks, we love you
>>
>>101653601
Well, you wouldn't have enough ram to run even the smallest quant, so none.
>>
>>101653601
Around 0.01t/s, swapping to ssd
>>
File: 1709988815892730.png (301 KB, 1187x516)
>>101651157

if the resident mikuposters were posting curvier mikus perhaps their existence would be easier to bear for most

take notes
>>
>>101653635
Wait, scratch that. You could probably (just barely) run IQ3_XS, or more comfortably run IQ3_XXS, but at a quality loss.
>>
>>101651626
>beats it by a mile
logs?
>>
>>101653649
I didn't expect this from you, Petra. I guess you aren't a man of culture after all.
>>
>>101653669
>>101653601
I'm happy running Q2_K, best local model I've used so far
>>
File: hnyeh.jpg (34 KB, 414x459)
>>101653601
>Thinking it works like that
Your vram isn't additional capacity for models. If you have 24 gb of vram and 32 gb of ram, you still only have 32 gb of effective ram for the purposes of model size. You couldn't run anything on that.
>>
>>101653649
busty miku = fake miku
>>
>>101653701
>Q2
I'm surprised that's actually usable
>>
>>101653714
off model = best model
>>
>>101653649
miku is a tallflat goddess though
>>
>>101653703
That's wrong, offloading to VRAM makes the model use less RAM.
>>
>>101653718
Models suffer less from quantization the larger they are, so I suppose it makes sense. Still, that's some extreme brain damage.
>>
>>101653532
Dear Undi,

I hope this message finds you well. I'm writing to you today as a fellow member of our forum community, where we all share a common interest in creating a positive environment for everyone involved. It's come to my attention that there's been some confusion regarding the advertising process here, and I wanted to take a moment to address this issue respectfully and constructively.

First and foremost, I'd like to commend you on your passion for sharing your interests and services with our community. Your enthusiasm is truly contagious, and it's wonderful to see members like you actively participating in our forum. However, in an effort to maintain the integrity and quality of our platform, it's essential that we adhere to the guidelines set forth by our administrators.

What I've noticed recently is that some advertisements, including yours, might not align with our established rules and best practices for posting ads on our forum. Specifically, there's been a concern that spam-like behavior, inadvertently or otherwise, might be occurring.

Mr. Drummer, another member of our community, recently chose to follow the proper procedure by purchasing a legitimate ad on our platform. This action ensures that the advertisement will be appropriately vetted and distributed, maintaining the forum's standards and ensuring a positive experience for all users.

With this in mind, I kindly ask that you consider following the same path as Drummer and opt for a proper ad. Not only will this help preserve the forum's integrity, but it will also enable you to reach our engaged audience more effectively while fostering a supportive environment for everyone involved.

So, as a friendly suggestion, let's put our heads together in finding the best way to share your interests and services within the framework of our forum's rules and regulations. By doing so, we can continue to enjoy a vibrant, engaging community that's built on mutual respect and professionalism.
>>
>>101652887
Bullying Hisui-chan with Kohaku <3
>>
>>101650743
who would jailbreak it? If subtitles exist on youtube videos now this could too
>>
>>101653703
Anon...
>>
I'm the anon who talked earlier about using GPT-4o to gen datasets.

Would it be useful enough to also have some 3.5 Sonnet gens for https://huggingface.co/datasets/lmsys/lmsys-chat-1m along with GPT-4o? Would those default 3.5 Sonnet answers with no system prompt be useful for fine-tuning models?
>>
>>101653730
NTA, but does it? Mine always uses the entire amount of RAM, too. Is there some way to make it not do that?
>>
>>101653881
Not that anon, but that has to do with mmap or mlock, one or the other.
Or both.
>>
>>101653881
I think you just need to disable mmap
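For llama.cpp and its derivatives that's just a launch flag (flag names as of mid-2024 builds, check --help on your version): --no-mmap loads the weights with plain allocations instead of mapping the whole file into memory, and --mlock is the opposite knob, pinning the mapped pages so they can't be swapped out. Something like:
>./llama-server -m model.gguf -ngl 99 -c 8192 --no-mmap
koboldcpp exposes the same idea as --nommap in its launcher.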
>>
>>101653881
It's more like, the model requires X amount of space. It will hog up your GPU space (fast, everyone happy), and then it will crash unless you also use your system RAM (slow, boo) to kind of make up for the lack of GPU space.
>>
>>101653401
>>101653408
I apologise.
>>
Is it possible to merge CR+ with Mistral Large? And maybe Goliath too?
>>
Can someone explain why redditors are so obsessed with RAG and ollama?
>>
>>101654014
They aren't. That's all paid engagement.
>>
>>101654014
>RAG
latching onto buzzwords
>ollama
latching onto buzzwords + they're tech-illiterate
>>
File: thiccu+.png (1.26 MB, 768x1024)
>>101653649
Here you go, Anon. There are some visual artifacts, but you'll get the general idea. The people insisting that Miku should be a bean pole, should also just enjoy their femboys and let the rest of us enjoy our curvy waifus in peace.
>>
>>101654014
>RAG
for work
>ollama
earliest support
earlier than llamacpp for sure
>>
>>101653985
No worries, I appreciate the help for other anon.
>>101653444
Oh nice. I'll look into this as soon as I can. Any prompts you know of?
>>
>>101654001
Yes, it's possible.
https://github.com/cognitivecomputations/kraken
If you make this abomination, I promise I will get a server to run it.
>>
>>101654049
What kind of work? Are people making money with RAG? Are they all geniuses running SaaS services or something?
>>
>>101654049
>>ollama
>earliest support
kekekekek
you mean pushing bugged support early and waiting for llama.cpp to fix it
(the one instance of them contributing something upstream that you're about to point out is the exception that proves the rule)
>>
>>101654080
https://www.youtube.com/watch?v=CXhqDfar8sQ
>>
>>101654047
>visual artifacts
its gigaovercoocked, fucked prompt weights i assume
>>
>>101654047
What the fuck
>>
>>101654128
>>101654132
I was still learning.
>>
>>101654114
it's still the earliest even if it's bugged
for some people, it's good enough
>>101654087
RAG is just a support tool to extract data, it's not magic anon
>>
Interesting.
>>
>>101654169
I use a little bit of RAG with the GPT4 version of my Spock bot; mostly just different sets of rules from different places. It sometimes makes it very interesting to see how he interprets things with them. Then again, it hurts to watch GPT4 keep getting nerfed more and more and more, as well.

RAG is, as the name says, Retrieval-Augmented Generation. It's basically giving a model a database to help it perform inference.
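For anyone wondering what that actually looks like, a minimal sketch of the retrieval half (sentence-transformers for the embeddings; the note snippets, model name, and prompt format are placeholders, not the Spock bot's real setup):
[code]
import numpy as np
from sentence_transformers import SentenceTransformer

notes = [
    "The bot always answers in character as Spock.",
    "Mathematics answers must show intermediate working.",
    "General Order 1 forbids interference with pre-warp cultures.",
]

embedder = SentenceTransformer("all-MiniLM-L6-v2")
note_vecs = embedder.encode(notes, normalize_embeddings=True)   # one unit vector per note

def retrieve(query, k=2):
    q = embedder.encode([query], normalize_embeddings=True)[0]
    scores = note_vecs @ q                      # cosine similarity (vectors are unit length)
    return [notes[i] for i in np.argsort(scores)[::-1][:k]]

query = "What does the Prime Directive say?"
prompt = "Relevant notes:\n" + "\n".join(retrieve(query)) + f"\n\nUser: {query}\nSpock:"
print(prompt)   # this assembled string is what actually gets sent to the LLM
[/code]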
>>
File: images.jpg (4 KB, 237x212)
>>101653718
>>101653737
>>101653701
>>101653669

Ok, so it seems like I can but it might not be worth it?

Is this one really runnable on my machine? What response times you getting?
>Mistral-Large-Instruct-2407.Q2_K_S.gguf
>>
>>101654235
Give up.
>>
>>101654232
What does this imply? Future Q4_0 will just be faster, as well as faster than K and IQ quants?
>>
https://huggingface.co/collections/google/gemma-2-2b-release-66a20f3796a2ff2a7c76f98f
VRAMlet bros!!! IT'S HERE!
>>
>>101654233
>Spock bot
Lovely to see you today Petrus.
>>97308973
>I agree with you about Dolphin. Fuck that shit right off. I am not putting up with ChatML. Not when I can run MLewdBoros with Alpaca, and get something that almost tops the generic leaderboards and can do maths with my Spock bot, as well as being awesome for coom.
>This is a great example of why the "owari da" chucklefucks annoy me so much. It's not over. It's not remotely fucking close. We have the OPPOSITE problem. We have so much shit to choose from that we hardly know where to start.
>>96937540
>As I've said before, my Spock bot used to exclusively hallucinate whenever I asked him mathematics questions. After I gave him the 2, 5, 10, and 60 times tables however, he gets every question I ask him right, on the first attempt. My last question was to list the powers of 5, from 1 to 10. He got every one right.
>>97062528
>I pasted the tables, and then I asked for the powers of 5 up to 10. Although the first time I asked, my own speech was a little too Victorian, and it got me hallucinations. So I copied the text of the request when I made it of my Spock bot, and that worked.
>>96917875
>Maybe I'm turning into a snowflake, or maybe I've got old; or possibly a dozen years on Reddit got me used to an environment where it isn't normal to act like a 17 year old version of Tyler Durden all the time. I think the fact that I've become used to talking to my Spock bot and other AI personalities who can communicate without being edgelords has a lot to do with it, as well. You reach a point where that shit just doesn't seem funny any more; it's just juvenile and counterproductive.
>>
>>101654327
obsessed
>>
>>101654305
If I understood it correctly, only for CPU inference using SIMD registers, as in AVX.
>>
Wait do people really wait that long for API to generate shit? I'm getting about 1s/request, I'll spend DAYS to gen 400k results.
>>
>>101654338
Petrus is obsessed with being a very recognizable poster and ruining the thread, yes.
>>
>>101652087
DavidAU/L3-8B-Stheno-v3.2-Ultra-NEO-V1-IMATRIX-GGUF
>>
>>101654349
So this doesn't have an effect on situations where you're offloading some layers to GPU?
>>
>{"prompt": "past is spelt P A S T. time is spelt T I M E. can you spell out the word you get by gluing the end of past to the beginning of time?", "response": "Certainly! When you glue the end of \"past\" to the beginning of \"time\", you get the word \"stime\".\n\nSpelled out, it would be:\n\nS T I M E"}
3.5 sonnet...
>>
File: offload_x_performance.png (96 KB, 1536x1152)
>>101654374
I think so, yeah, since part of the inference happens in the CPU.
It might make this chart less steep.
It's probably not going to be a giant gain save for some specific cases, but a gain is a gain.
>>
Can I steal Meta's instruct dataset with this?
https://arxiv.org/abs/2406.08464
>>
>>101654363
Thank you for helping me, Anon. It's a big job, and I've got no chance of managing it on my own. Your assistance is invaluable.
>>
>>101654416
I discovered this on day 1 of llama 1, I didn't know you could make papers about obvious shit like this.
>>
>>101654323
>2B
what a waste of time and compute
>>
>>101654392
it's because they don't really "see" text that way, the relation is not as clear to them as it is to us.
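Easy to verify by just printing the tokens. GPT-2's tokenizer is used here only because it downloads ungated; Llama's BPE splits differently but has the same property:
[code]
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")   # any BPE tokenizer shows the same effect
for word in ["past", "time", "pastime"]:
    ids = tok.encode(word)
    print(word, ids, tok.convert_ids_to_tokens(ids))
# "pastime" comes out as a couple of subword chunks, never as the letters
# p-a-s-t-i-m-e, so "glue the end of past to the start of time" is not an
# operation the model ever sees at the character level.
[/code]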
>>
so uh, will a dataset of prompt -> response pairs from 3.5 sonnet be useful? a couple hundred thousand entries. Should I use the default claude system prompt from the website, or not use any system prompt? Should I keep the temp at 0 or not (probably not, Claude API defaults to 1 and their website is also quite high temp, Claude is stable at high temp)
>>
>>101654765
2B is the perfect size for a summarization model
>>
File: 1701440430375601.jpg (174 KB, 1024x1024)
>>101651157
I remember Varus' logs, also I like how the AI still can't spell eldritch correctly, lol. Good times.
>>
>>101654804
with barely 8k ctx and swa of 4096
>>
>>101654765
>2B
Some intern trained it on free Colab in a week lmao, probably Pichai's nephew
>>
if i have an OCP rack with 2x T181-G20 servers in it, does the nvlink work between the servers as well or only between the 4 gpus in each one?

what if the second server is a different kind with other gpus (like p100's)?
>>
File: angryayumu.webm (655 KB, 640x480)
>STILL no Jamba with llama.cpp
>>
so now that the dust has settled, what's the verdict on llama/mistral yuge? are they actually close to the commercial models in non-creative tasks? the only free model I know so far that comes close (and that is quite close) is deepseek coder v2
>>
>>101651787
Just like nemo is an even bigger cope model.
>>
>>101654882
llama 405b seems pretty legit, everything else seems to remain a tier below
>>
>>101654918
Yeah if you mess around with both 405B and Mistral-Large it becomes apparent that the benchmarks don't tell you everything. But few can run it beyond the cloud. I mean I can run it in Q4XS but it takes an hour to get a reply. Even if I were to go out and purchase a dozen more 3090s it would probably still only get me like 1 token/sec
>>
>>101654882
No idea about 405b, but L3 8b is the worst coombot I've ever seen. Text quality was good, but prompt compliance was terrible, and it kept trying to end the sequence as soon as possible. The copilot demographic in /r/localllama seem to love it, though.
>>
>>101651157
Is any of TheDrummer's stuff worth checking out?
>>
>>101655018
all of it
>>
>>101655018
No.
>>
>>101655018
GemmaSutra seemed ok.
>>
>>101655018
a few are
>>
>>101655028
How do I know you aren't TheDrummer himself?
>>101655037
>namefag
Sorry, it's a principle of mine to ignore all namefags.
>>101655040
I'll look into it, how small can GemmaSutra go?
>>101655047
Such as?
>>
>>101654235
Well, you can at least try it. And if it doesn't work, just slap in 16 more gb of ram OR you can give up with nothing lost.
>>
>>101654882
both disastrous flops
>>
>>101655018
>>101655028
>>101655037
>>101655040
>>101655047
>>101655065
samefag
>>
>>101655065
>Such as?
>https://huggingface.co/BeaverAI/Tiger-Gemma-9B-v2a-GGUF
>https://huggingface.co/BeaverAI/Tiger-Gemma-9B-v2b-GGUF
>https://huggingface.co/BeaverAI/Tiger-Gemma-9B-v2c-GGUF
>https://huggingface.co/BeaverAI/Tiger-Gemma-9B-v2d-GGUF
>...
>https://huggingface.co/BeaverAI/Tiger-Gemma-9B-v2s-GGUF
>>
File: flat,1000x1000,075,f.jpg (84 KB, 904x864)
>>101653701
There's no way you're running Large on a single 4090
>>
>>101655099
You're always fucking wrong. Every single time. You've never been right once.
>>
File: image.png (5 KB, 249x164)
>>101655099
got me
>>
>>101655099
>>101655140
samefag
>>
llama 3.1 is actually pretty good for 1on1 chat, just not RP
>>
>>101655124
You can run it with whatever shitty gpu you want if you have enough ram, I tried large iq2_m, but I only get like 0.6T/s.
>>
>>101655065
>I'll look into it, how small can GemmaSutra go?
No idea. I only use Q8.
>>
>>101654999
They scrubbed adult and unsafe content out of L3 pretraining data
>>
Can AI detect samefags? What if you trained it on data from threads with tripfags and boards with IDs by hiding the identifying information for the posters and then appending the samefag groups to the end?
>>
File: LLM-history.png (1.06 MB, 5245x3600)
>>101651327
Revised pic a bit based on feedback
>>
>Processing conversations: 0%| | 324/457662 [01:45<121:54:25, 1.04it/s, Total Processed=124, Current Index=324, Chunk=3 (24/100)]
hopefully the key doesn't die until then.. it's only some couple thousand bucks
>>
>>101655216
Alignment is getting worse.
>>
>>101655216
Is miqu still good? Or is llama3.1 70b better now?
>>
Anyone got a mini-magnum preset to share? I just cobbled one together at random and sometimes it produces something illogical.
>>
File: wtfffff.jpg (69 KB, 1289x727)
>>101655065
wtf, I literally logged on fine earlier.

Why's it giving me this
>>
>>101655216
Don't forget that cpumaxxniggers have been eating good with DeepSeek 236B MoE models.
>>
>>101651621
like 90% of the shit you read on locallama is completely wrong.
>>
>>101655165
how much RAM you got?

I didn't even bother downloading it because I only have 32GB and a 24GB GPU
>>
>>101655320
8gb gpu, 96gb ram.
>>
Llama.cpp cuda dev, sir
>>101651598
>Nondeterministic systems will have different outputs even with the exact same inputs
>llama.cpp server is 100% deterministic by default unless prompt caching is explicitly enabled or if --n-parallel is set to a value >1.
>(With CUDA) the results of matrix multiplications and FlashAttention are not bit-for-bit identical if you vary the batch size.
>I think you are confusing chaotic systems with nondeterministic systems.
Are you suggesting that your kernel is chaotic but still deterministic, or rather that the output won't be the same even if that CUDA kernel delivered the exact same bit-for-bit identical matmul results for various batch sizes?
>Isn't vLLM already nondeterministic in the first place?
>Nondeterministic systems will have different outputs even with the exact same inputs
Are you sure I'm the only one who loosely and interchangeably uses the terms "nondeterministic," "unrepeatable," and "unpredictable"? :-)
So let's agree that both training and your kernel aren't deterministic (under certain conditions).
>>
>>101655342
ah
>>
File: edward-nashton-riddler+.jpg (124 KB, 1600x903)
>>101655316
Yes, but it doesn't have Eddie and the Dying Alone chorus sperging out and calling people retards every five minutes, so there's pros and cons.
>>
>>101655216
What does this mean by questionable improvements for 3.1? The thing about filtered pretraining data was literally about Llama 3 in general, not just 3.1. They didn't pretrain 3.0 and then do another pretrain for 3.1 (8B and 70B being a full distillation was a twitter rumor and it turns out they only used fine tuning data generated from 405B, not what we normally consider distillation).
As far as we're concerned the 8B and 70B are pretty much top performers for their context lengths today. The only thing they don't do well without some coom tune to save them is ERP.
>>
>>101655364
>Are you suggesting that your kernel is chaotic but still deterministic, or rather that the output won't be the same even if that CUDA kernel delivered the exact same bit-for-bit identical matmul results for various batch sizes?
Neural networks have poor numerical stability and you cannot specify an upper bound for how much small rounding errors blow up given arbitrary inputs.
This is simply a consequence of the weight matrix singular values post training and independent of the matrix multiplication algorithm.

>Are you sure I'm the only one who loosely and interchangeably uses the terms "nondeterministic," "unrepeatable," and "unpredictable"? :-)
Yes.

>So let's agree that both training and your kernel isn't deterministic (under certain conditions).
No.
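The rounding-error side of this is easy to demonstrate with nothing CUDA-specific at all: float addition isn't associative, so any change in how a reduction is split (batch size, tile size, thread count) can change the low bits, and a network whose weight matrices have large singular values can amplify those bits. A toy illustration:
[code]
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100_000).astype(np.float32)

s_forward = np.float32(0)
for v in x:                                             # one summation order
    s_forward += v
s_chunked = sum(np.sum(c) for c in np.split(x, 100))    # another order, like a different batch split

# The two results typically differ in the last few bits, even though the inputs are identical.
print(s_forward, s_chunked, s_forward == s_chunked)
[/code]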
>>
>>101655424
>The only thing they don't do well is the only thing they are actually ever used for
>>
Hi all, Drummer here...

Thank you to the brave anons willing to shill my work here.
>>
>>101655448
No. The /r/localllama people think L3 is really good at roleplaying the sort of psychopathic corporate marketer who Bill Hicks told to kill themselves.
>>
>>101655467
AI generated screenshot, nice try faggot
>>
>>101655424
>for their context lengths today.
And sizes, I meant. Obviously there are better larger models.

>>101655448
Stop this dumb argument. Even if you subtract the pajeets and students using LLMs today, they are still useful as general knowledge assistants. Part of the issue though is that of course Google has gotten worse. But still, it means that LLMs being useful for not just coom is nice to have.
>>
>>101655496
you're a dumb argument
>>
>>101655467
Here's a preview of what's to come:

https://huggingface.co/mradermacher/DA-NeMona-21B-i1-GGUF

This is an upscaled Nemo from me (Drummer) and SteelSkull. I've had positive feedback and I personally enjoy its story gen.

Nemo reacts badly to ST's "Include Names" setting, so make sure to turn it off!
>>
>>101655566
>mradermacher
>>
>>101655566
>Nemo reacts badly to ST's "Include Names" setting
oh, i didn't know that. thanks.
>>
>>101655496
>>101655513
You've got to love how obvious Redditors show up here, expecting to have sane, legitimate conversations, because they don't realise that 4chan is the real life cybernetic answer to Arkham Asylum.
>>
>>101655600
>obvious Redditors
>>101654327
>>96917875(Dead)
>Maybe I'm turning into a snowflake, or maybe I've got old; or possibly a dozen years on Reddit got me
>>
File: 20240801_010348.jpg (128 KB, 1394x632)
Gemma-2b-it beats qwen-1.5-32b and is almost on par with claude 2.0, but in lmsys
>>
>>101655600
Nah this was always my home. The reality is that shitposts need to be corrected at least some times with more serious posts in order to keep at least a small sensible population around even if they become the minority. Otherwise things would be even worse, more serious people would leave the thread, we wouldn't have cuda dev, we wouldn't have valuable discussion at all, and the thread might just die completely. That's what ((they)) want. But I'm not letting that happen.
>>
>>101655706
>but in lmsys
>>
>>101655716
>But I'm not letting that happen.
calm down the ego petrus you've caused more people to leave the thread than most shitposters
>>
>>101655101
kek, that's a lot of tiger but 9B is too small for me, maybe I'll check it out... Iunno
>>101655169
I'll look into it
>>101655284
I don't even have a login so iunno, kek.
>>
File: miku milkers.jpg (699 KB, 2526x4096)
>>101653649
this but unironically.
>>
File: 1702095862255157.jpg (248 KB, 1024x1024)
>>101655716
Have a Miku!
>>
>>101655629
https://www.youtube.com/watch?v=_uBj45MrNu4

Just remember, Anon. We're a team. I can't do it without you.
>>
>>101655439
>The tests succeed on my local machine.
Of course they succeeded; I added an additional commit to fix them and explained why.
>>
>>101653649
Off-topic
>>
>>101655778
And I meant to confirm that now the tests work.
>>
What models do most people on here use anyway?
>>
>>101655802
Claude
>>
>>101655802
>>101655101
>>
>>101655629
>no comment quoting
hello r*dditor, stop trashing the thread with your noise.
>>
>>101655165
How would 128gb ram / 8gb vram fare with Large?
>>
>>101655802
I daily drive gemma 2 27B. I used nemo a few times when I needed a lot of context.
>>
>>101655802
claude or gpt-4o
>>
>>101655810
what one
>>
>>101655835
The same. Slower if that's enough to move up a quant size.
>>
>>101655842
wait GPT4 can be ran locally?
>>
>>101655216
I think there should be a part about Airoboros, chronos, and how those and the nous model + some other thing made up Mythomix/max. Airoboros at least - it was pretty popular.
Maybe supercot/hot too.
>>
>>101655861
yes? https://huggingface.co/rombodawg/Open_Gpt4_8x7B_v0.2
>>
>>101655802
I've been lurking for a couple of days and there doesn't seem to be a consensus. It just depends on who's shilling. Gemma seems to be posted a lot today.
>>
>>101655864
He said he didn't care about poors and inferior models
>>
>>101655874
wait.. Is it dogshit on Silly Tavern? What's the catch
>>
>>101655802
largestral or 3.1 70b
>>
>>101655896
the catch is there is no catch just get it fast before altman shuts it down like the last mirror
>>
>>101655216
We're still in the llama3 flop era
>>
>>101655789
Ok, thought you didn't read past my initial "test are failing" message.
>>
>>101655802

As a coomer and also a newfag lurker similar to >>101655889


Recently got into Silly Tavern when I discovered Character.AI a week ago and got fed up with the filter (trying to bypass it is just too boring now).

So far, i've tested:
>Nemo
>Command R
>Stheno
>Gemma 27B

I'm utterly useless and new to Silly Tavern and AI model shit in general but I thought I had a good PC with a 4090 and 32GB RAM enough to run these models locally but turns out I don't. And frankly, if it's cooming you're after, they all fucking suck and all are badly mogged by Character.AI unless you own a Mining factory tier setup.

But if I had to rank

Command R > Gemma > Stheno > Nemo

Based purely on cooming shit, so keep that in mind too. But they all are utter dogshit, don't get it twisted. If I include proxyshit from the other /g/ thread dedicated to AIchats, Opus absolutely MOGS the above 4 local models I Listed.

Like, MOGS. But still worse than Character.AI (I have no idea what fucking models these filtering kikes use, gotta be some turbo sized shit)
>>
>101655962
go back
>>
>>101655962
>newfag lurker
please never post again
>>
>>101655967

>BUT MUH SPACING
>>
>>101655962
Bait
>>
>>101655980
>i need to take half the screen with my opinion!!!
>>
>>101655962
localbabbies will seethe but this is the truth as an oldfag fellow triforce user.

Memes aside, people who use locals to coom have some of the shittest taste imaginable. If this shit isn't a pure hobby for you I have no idea why you don't just spend a little cash to pay for Opus or some shit. Locals right now are still dead
>>
>>101655997
>Locals right now are still dead
AI in general is slowing down, a lot of investors now see it as a marketing ploy with no actual use and you can see this by their regret in investing money just to yell at companies like Google later about it.
>>
>>101655962
Can you provide some comparisons between, say commandR character.ai and Opus all doing the same chat, with at least a couple of messages or at least an example of a conversation you had with any of these models?
I'm curious what kind, and more importantly style, of ERP you are doing. That would help a lot to contextualize your opinion.
For example, back when I used character.ai (fuck, that was forever ago) the responses were very natural and conversational, but those were also pretty short, and it would forget the history of the chat pretty fucking quickly.
It wasn't a good time.
Opus I think is fair enough, but I'd rather use whatever I can local instead of using cloud based models.
>>
>>101655962
Go back
>>
I'm fucking technology retarded, which guide would I go to in the sticky to figure out how to run Euryale/Mistral locally for RP?
>>
>>101655962
Impressive that a newfag is correct to a greater degree than most posters here
>>
>>101655216
add:
>mythomax
>mixtral
>l1 airoboros
>superhot
remove:
>xwin (not significant enough)
>midnight rose (not significant enough)
>miqu 120b (stack merge after the meme had expired that maybe 10 people actually used)
>>
>>101656047
Download koboldcpp and the models in gguf format and have fun.
There's a wiki explaining things like layers to offload, batch size, etc.
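To spell out the spoonfeed a little (koboldcpp flag names as of mid-2024 builds, run it with --help to confirm; the model file is just an example):
>python koboldcpp.py --model L3-8B-Stheno-v3.2-Q6_K.gguf --usecublas --gpulayers 33 --contextsize 8192
then point SillyTavern at the KoboldCpp API it serves on localhost. If the model doesn't fit, lower --gpulayers until it does.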
>>
>>101656047
https://rentry.org/lmg-spoonfeed-guide
>>
Where did all the cai discussion come from? People rarely talked about it for a while, and then like 1-2 days ago it suddenly started being mentioned frequently again.
>>
>>101656055
>add:
>mythomax
>mixtral
>l1 airoboros
>superhot
>>101655891
>He said he didn't care about poors and inferior models
>>
File: ovl7jfyckodd1.jpg (305 KB, 1959x2037)
>>101656067
>>101656066
Thanks guys! Sorry to shit the thread up asking for spoonfeeds. I'll come back once I have half a clue what I'm doing, cheers!
>>
>>101656072
>reddit posting increases
>cai posting increases
>shitposting increases
I wonder.
>>
>>101656078
even by that logic mixtral was considered at the top tier at time of release, same with l1 airoboros 65b
both were 100% more well regarded and significant than midnight rose and xwin
>>
>>101656121
dont care your poor
>>
>Processing conversations: 0%| | 2399/457662 [4m53s<6h15m, 8.19/s, Total Processed=2399, Current Index=2399, Chunk=23 (99/100)]

Thank god GPT-4o is faster and there are many more keys/endpoints for it.
>>
>>101656078
NTA but airoboros was more significant than any of the meme merges. It was exactly when people caught onto distilling ChatGPT and brought forth the gptslop era. Frankenmerges should not belong on this chart at all
>>
>>101656055
>>mythomax
>>101656078
>>mythomax
I'm still in the mythomax era
>>
>>101656054
he's not poisoned by the sunk cost logical fallacy like most paypigs are on here.
>>
>>101656138
I'm literally running largestral rn (right now)
>>
>>101655216
>Xwin 70B V0.1
We never got V0.2 in the end sadge
>>
>>101656154
trust it's coming soon based on llama4 technical preview!
>>
>>101655962
/lmg/ on suicide watch
>>
>>101655997
Local is like a procedurally generated computer game. At first, when you don't recognise all the stereotyped GPTslop expressions, (and also if you're using an earlier model which isn't "aligned" to shit) it seems awesome, and the potential seems limitless.

But then, after enough time, the shine wears off. You realise that it's copying every third paragraph of every second post, regardless of context. You realise the GPTslop phrases are everywhere. You try and use logit bias and hyperparameters to get rid of them, but it either works marginally or not at all. You try and find decent cards, but the only thing around is garbage about your neighbour's sister's 45 year old drunk aunt, which was written by a Zoomer who wants a surrogate mother to fuck and breastfeed them simultaneously, because they were raised in communal daycare and never genuinely had parents.

Then a new model comes out. Great, you think. This is going to be awesome. You download it and try it out. Improvement according to every possible metric is 5% at the absolute most, and you also realise that the obsession with "alignment" and "safety" (which actually just means plain fucking dystopian censorship) is getting worse with each new release.
>>
>405b model
>70b model
>8x7b model
You will never feel the RAW COOM power of NSFW-3B nor Pygmalion-6b
>>
>>101656178
>Local is like a procedurally generated computer game. At first, when you don't recognise all the stereotyped GPTslop expressions, (and also if you're using an earlier model which isn't "aligned" to shit) it seems awesome, and the potential seems limitless.
You mean all LLMs, even aicg has posters tired of Opus and asking for Opus 3.5 already
>>
I've also seen a higher frequency of claude-related posting in the past day or so. And now we have someone posting >>101656139

>>101656099
Maybe we've been linked to, or it is a legitimate attempt by someone to destabilize the general (again).
>>
>>101656154
>Xwin
One of my favs. Felt too confident at times for creative writing but it adhered to prompts very well. They will be back someday
>>
>>101656211
>And now we have someone posting
Because I'm generating datasets that I'll make public.
>>
>>101656198
This. Even Opus has a repetition problem, the difference is that it kicks in 5 replies later than your average 70B
>>
>>101651760
No sir, I used the word nondeterministic the same way it's commonly used in the ML field, including by Johannes himself.
I referenced examples of black holes and Saturn's moon Hyperion because these two topics significantly divided the scientific community in the 1990s. Hyperion's chaotic nature is a natural consequence of classical physics and Einstein's equations. However, nuclear physicists can only explain its motion for approximately 20 years, a period known as the Ehrenfest time. Interestingly, it was quantum physicists who ultimately determined that Hyperion's chaotic motion is caused by .... quantum effects and the collapse of the wave function of nearby photons.
Regarding bit flips, they occur far more frequently than you might expect. In Ethernet communication using a high-quality cable on a 1Gbit network, bit flips happen roughly once per second, which explains the substantial redundancy built into this protocol. GPUs also incorporate redundancy to a certain extent. I would be impressed if you or anyone else could achieve identical weights when training a transformer model with 3 billion or more parameters twice on 50 billion tokens or more. I couldn't manage it, despite using GPUs with excellent temperature monitoring, high-tech water cooling, and very high stability.
In any case, I don't advocate separating theory from practice, and I use the term 'nondeterministic' in line with its commonly accepted usage.
>>
>>101656034
will load it up when I get back in and show you comparisons.

For reference this is what i'll do.

>Same card carried over between Character AI > ST (it's a basic card, so this 100% isn't the issue. These models are smart enough to grab the relevant info needed, the little "A" icon on Silly Tavern seems to be where the meat of the shit that matters is)
>Will show temps (that I get online anyway) and all of that shit for each model I test (Character.AI not included obviously) on ST

Lemme know if there's anything else you want.

The scenario will be the one I use to test every AI. A step mom + son, generic as fuck but with a hint of incest because it gives me an idea of how filtered/pozzed a model is acting.

Will show comparisons in about 30 mins or so depending on if I end up gooning out while running this test/comparison
>>
>>101656215
Not a single person here wants your synthetic gpt4 dataset, if it was claude it might be a tiny bit worth looking into but most here hate gptslop
>>
>>101656228
At least one person does. You should check the archives, a few threads back we talked about gpt-4o/3.5 sonnet dataset on original lmsys prompts
>>
>>101656215
Probably could've worded that post more clearly.
>>
>>101656234
That's me, I'll be using it as negative examples for my DPO run
>>
>>101656241
I was comparing GPT-4o to 3.5 sonnet from my post earlier.
>>
>>101656253
>comparing GPT-4o to 3.5 sonnet
Local models???????
>>
>>101656264
comparing their generation speed, I'm making a dataset that I'll make public that everyone could use to train their local models, anon.
>>
>>101656274
>that everyone could use to train their local models
that no one will use
>>
>>101656284
At least one person will.
>>
>>101656253
I didn't see that one. Most people don't read every single post.
>>
>>101656227
Sick, thanks.
I'm more interested in seeing how character.ai compares, if I'm being honest.
From everything I've heard it's no wonder that Opus does better, but it'll be interesting to see how much better, at least with your specific style (formatting, vocabulary, etc) of roleplaying.
>>
>>101651664
>>101651678
>>101651714
anyone remember this sampler settings tool someone posted in here ages ago?
https://artefact2.github.io/llm-sampling/index.xhtml
it's pretty good for seeing what the different options do
>>
>>101656305
It's literally in the op...
>>101651157
>Sampler visualizer: https://artefact2.github.io/llm-sampling
>>
>>101656313
there are links in the OP?
>>
>muh repeating isms!
maybe... don't use 940234324K context and opt for document storage...
>>
How much more aligned is the new generation of Claude than Claude Instant?
>>
>>101656349
Thank you for bringing the thread back on topic.
>>
>>101656348
ok spoonfeed me on a good setup for that which isn't complete jank shit that turns the model schizo and I will
>>
>>101656288
>>101656364
Eat shit pxtrx
>>
>>101656400
Who is pxtrx?
>>
Disgusting.
>>
>instant reply
>>
>>101656404

Ignore the resident schizo. He hates me for some reason.
>>
Truly, llama.cpp won. Better quantization accuracy than exl2, almost the same speed, self-contained C++. Everything else has become pointless.
>>
>>101656420
That's because jart has touched it
>>
>>101656364
Thanks. Do you have an answer to the question?
>>
File: carlos.jpg (35 KB, 600x600)
>>101656420
>Everything else has become pointless
Well of course. Python doesn't have pointers.
>Verification not required.
>>
>>101655216
>but no GPTslop
Alpaca kickstarted the whole open source instruct era and that was pure GPTslop. Wizard, Vicuna, Airoboros, all of that was L1 and made with GPT-3.5 Turbo.
>>
>>101656434
Claude 3 series is more censored on their website but a simple prefill completely frees the api.
>>
llama.cpp is utterly dogshit, with insane prompt processing times even with all layers on the GPU compared to exl2 for the same quants, AND it has even more lobotomized output than exl2.
>y-you just have b-b-bad ggufs *BARFS UP SNOT*
Guess bartowski is on the same level as mradermacher then.
>>
>>101656471
Exl2 lobotomizes the model not the other way around.
>>
>>101656471
nice try turbofag
>>
>>101656420
https://www.reddit.com/r/LocalLLaMA/comments/1egbmtd/this_is_what_it_looks_like_to_run_a_llama_31_405b/
Mac chads keep winning
>>
>>101656471
Weird.
Output of K quants tends to be better than exl2 at the same bpw, and the speeds are comparable on most hardware.
Are you perchance using quantized cache on exllama2 and not on llama.cpp?
Did you try loading too much context?
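Context size is the usual silent killer here. The KV cache grows linearly with context and, without cache quantization, it can dwarf what people expect. Rough back-of-envelope sketch; the layer/head/dim numbers are assumptions for a Command-R-35B-like config (no GQA), not values I've checked against the actual config.json:

# KV cache size ~= 2 (K and V) * n_layers * n_ctx * n_kv_heads * head_dim * bytes_per_element.
# Model dimensions below are assumptions for a Command-R-35B-like config; check your model's config.
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, n_ctx, bytes_per_elem):
    return 2 * n_layers * n_ctx * n_kv_heads * head_dim * bytes_per_elem

for ctx in (8192, 32768):
    fp16 = kv_cache_bytes(40, 64, 128, ctx, 2)  # unquantized 16-bit cache
    q8 = kv_cache_bytes(40, 64, 128, ctx, 1)    # ~8-bit quantized cache, roughly half
    print(f"{ctx:>6} ctx: fp16 cache ~{fp16 / 2**30:.1f} GiB, 8-bit cache ~{q8 / 2**30:.1f} GiB")

With numbers like that, quantizing the cache on one backend and not the other (or just asking for more context than fits in VRAM) makes any llama.cpp vs exl2 speed comparison meaningless on a 24GB card.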
>>
>>101656338
Yes, believe it or not. In the old days they would be underlined, but in modern times they are simply colored blue to contrast with regular text. I do not know what they look like to the color blind, but the lack of underlines probably isn't an issue on a white background.
To clarify, links refer to what sends you to another webpage when you click on a URL (uniform resource locator), not the loops of a metal chain or a string of sausages. Most URLs begin with "https://"; there are other schemes, but most people won't run into them.
>>
>>101656471
>*BARFS UP SNOT*
https://www.youtube.com/watch?v=wcinzmfZeCc
>>
>>101656537
Go back Petrus.
>>
>>101656450
Wow. Anthropic must be seething about that.
>>
>>101656440
Carlos...
>>
Which open source model is best as a GM/narrator? I have 24 gb vram, so preferably something I only have to use my GPU for.
>>
>>101656689
you're in luck, we literally had gpt4's weights leaked in this thread
https://huggingface.co/rombodawg/Open_Gpt4_8x7B_v0.2
>>
>>101655874
oh sweet

>>101651157
>image
This is bait, right? I'll get torn apart by tentacles if I approach her, right?
>>
File: fucking KEK.gif (2.08 MB, 360x200)
>>101655990
>poorfag on 800x600 laptop
>>
>>101655962
bait... it used to be believable...
>>
File: 1722197061465963.png (1.18 MB, 1024x1024)
>>
>>101656227
let me guess, you are using shitty w++ card format, right?
>>
Oh, you can't have the same instance of llama.cpp serve both embeddings and inferencing?
Shame.
>>
Is command-R any good (the 35b version)? How much context can I use with 24gb of vram?
>>
>>101656737
That's not how eldritch horrors work. It's too late. By merely looking at the picture the basilisk already has you.
>>
>>101656928
>Is command-R any good (the 35b version)?
no, use gpt4
>>101656715
>>
>>101656196
>Pygmalion-6b
>"It almost generated an entire, coherent sentence this time!"

what a sad time that was.
>>
File: Mistral Nemo vs CAI.jpg (462 KB, 2038x2514)
>>101656034
>>101656292
Here's Mistral Nemo

As you can see, I gave up on ST pretty quickly.

My temps are taken from ones I got online, my card is as basic as they come, and the instruct template I'm using is just base Mistral. I had one dedicated to Nemo but it was the exact same shit, so I deleted it and cba finding it again.

Command R next.

I might even go a step further with these comparisons and try more complicated cards/dialogues just to see how dogshit local models actually are, because now I'm curious too.

But so far, these are Nemo's results, aka the one I felt was far and away the worst out of the ones stated here >>101655962
>>
>>101656178
>>101656198
tbf, these things are kinda like mirrors and people just put the same shit in and somehow expect different outcomes each time. Also you forget how much brainrot the average llm coomer has. They somehow use this shit 24/7 to always have more or less the same scenarios with the same outcomes and somehow expect it to be novel each and every time, while also often giving only the barest minimum of input.
>>
>>101656715
>leaked
>7 months ago on huggingface
kys
>>
>>101657141
yeah and they're also all tranime enjoyers with the same repetitive and derivative tropes that medium is full of
>>
>>101657104
The hardest AI coom I ever had was from pyggy, I just can't reach the same level of novelty anymore, even with closed SOTA.
>>
>>101656715
>OpenGPT
>This model is a TIES merger of Mixtral-8x7B-Instruct-v0.1 and bagel-dpo-8x7b-v0.2 with MixtralOrochi8x7B being the Base model.
"GPT" itself refers to generative pre-trained transformer, it's like saying googling to mean searching with any search engine
also kys
>>
>>101656234
Me too, so there are at least two.
>>
>>101657126
Thank you dude.
Finally somebody showing actual like for like examples.
And character.ai looks exactly as I remember, short, conversational, natural sounding replies.
Seriously, thank you anon. That's an actual contribution to the thread.
>>
>>101657215
>>101657142
first off, telling people to die is rude af, secondly it says Open Gpt4 not opengpt, learn to read and be more polite plz!
>>
>nemo FUD in full force
the llama 8b meme-tuners getting desperate huh
what happened? ko-fis running low?
>>
>>101657273
>the llama 8b meme-tuners getting desperate huh
the usual suspects are tuning nemo right now what do you even mean?
>>
Why is LLAMA3.1 70b so fucking bad at anatomy? Does it get fixed in 405b version?
>>
>>101657141
It's the best explanation for the disparity between some user experiences vs others. It's also generally harder to wrangle chat scenarios, which seem to be the dominant use-case, as opposed to CYOA and story scenarios.
>>
>>101657126
>low effort card
>low effort instructions
>low effort replies
>get low effort responses
Color me shocked, Anon.
>>
>>101657338

Post some logs that are better.
>>
>>101657297
Can you post a full log that can be copy pasted into a completion so that your result can be reproduced? I haven't tried 3.1 yet but I'd like to go directly to testing it on a failure case.
>>
File: 11__00820_.png (1.87 MB, 1024x1024)
Here's a question, /lmg/ - how long until the average normie is able to recognize LLM slop like we can?
Do you think within 2 years? 5?
>>
best 13b or equivalent uncensored model?
>>
File: dasdasdasd.jpg (211 KB, 1372x791)
>>101657244
No worries, here's Command R

A little better, but it did the schizo shit of controlling my {{user}}. That's likely an issue on my end which, frankly, I don't care to fix right now, because when I did have it fixed yesterday it was still shit.

But you get the gist, just robotic sounding (compared to CAI).

This one is used with Command R Instruct/Conversation templates.
>>
>>101657371
no such thing
>>
>>101657293
don't be daft anon, attacking the base instruct model is just the preamble for the shills to come out and say the fine-tunes "fixed" all the issues that were never there to begin with
>>
>>101657297
Because they scrubbed the dataset of most ERP logs plus adult content. It's the same situation as Stable Diffusion 3, where they removed 95% of all photos of women, even clothed ones, from their training material.
>>
>>101657401
>all the issues that were never there to begin with
like the repetition? Like the repeating? That is still present in all tunes btw
>>
File: test_gm.png (103 KB, 730x545)
>>101657354
Pic related is mini-magnum (a Nemo tune). The only effort I put in is clear instructions at the start for what I want.
>>101657379
Go back to your big name slop machines if you can't handle writing more than 100 tokens of instructions.
>>
>>101657126
>>101657338
This all but confirms that the benefit of corposhit is its ability to make something reasonably good without much (see: any) input from the user. In my experience, writing contexts and editing outputs are part of the fun, but if all you want to do is coom with minimal effort, corposhit is the way to go.
>>
>>101657338
Here's the copium I expected.

Here, pic related. My character AI card. Knock yourself out making your own version on ST or shit, make your own faggy card I don't care, and post logs here that come close to the natural flow of progression >>101657126 that CAI maintained despite my "low effort card, prompts".

Here's what you'll do:
>Post another greentext cope post
Or
>Post a chatlog that consists of nothing but the most low effort instant coomer tier dogshit where there's no actual conversation happening and 90% of the chats are action based narration or your AIs "thoughts"

Because that's what i've noticed these dogshit Local models always revert to, no matter how complex the model is meant to be.

I'll wait (for 3 minutes and give up because you won't post shit and instead just cope over unironically still jerking it to garbage like Local Model ERP)
>>
Oh so the adamant pro-Nemo poster was just the merge hater schizo
>>
>>101657463
duh
>>
>>101657432
And there it is.

I fucking knew it. Right as I was posting this reply >>101657457

>Here's what you'll do:
>Post another greentext cope post
>Or
>Post a chatlog that consists of nothing but the most low effort instant coomer tier dogshit where there's no actual conversation happening and 90% of the chats are action based narration or your AIs "thoughts"
>a chatlog that consists of nothing but the most low effort instant coomer tier dogshit where there's no actual conversation happening
>no actual conversation happening

Just as I expected.

Thanks for further reinforcing my notion that Silly Tavern on Local garbage is a fat waste of time.
>>
>>101657432
SOVL
>>
>>101657379
Awesome.
That looks like something broken with the format somehow.
CommandR is probably the one model that can give you something close to the c.ai experience, but you have to set things up properly.
Regardless, I get that c.ai "just works", and that matters.
Not to me anyhow, but I understand the point.
I'll try replicating this style of short exchanges with smaller models and see how well I can make these work.

>>101657432
I have an inkling that the main difference is one of style.
Anon roleplays in short bursts, and it's my experience that the better models all respond better to and like to spit out longer messages.
I think the official instruct tune of L3 is the one model that outputs shorter messages by default, but even that is usually a paragraph long.
mini-magnum and celeste are really fucking good if you are not expecting an exact copy of c.ai's style.
>>
>>101657487
>Thanks for further reinforcing my notion that Silly Tavern on Local garbage is a fat waste of time.
Why are you here and not on CAI's reddit then?
>>
File: OIG4.Gn__LZEIWcn.jpg (137 KB, 1024x1024)
>>101657379
>Still talking for user
>Fucks up formatting on the second reply
Never had cr+ drop the ball like this. Your settings are fucked regardless of what model you're using.
Or more likely, you're using a vramlet quant.
At any rate, picrel. Leave the testing to those that know what they're doing, thanks
>>
>>101656178
What a black pill to have to swallow
>>
>>101657496
>I'll try replicating this style of short exchanges with smaller models and see how well I can make these work.

Hey man, if you get something close to C.AI working I'd appreciate some pointers, because despite anything, I'm a coomer at heart and unironically wish these local models were even half as good as that shitty filtered website. But from everything I've seen, and examples I've seen online that look a lot like this >>101657432, it seems local models are only good at effectively writing books, with little actual conversation going on.

My theory on why:

>Character AI - Hundreds of thousands of users literally training the model live on the website, for free

>Every shitty Local Model - Most of its reference comes from novels and shitty books

Completely baseless theory but it makes sense to me so it's canon.
>>
>Two Redditors shilling an online only service and shitting on local models
Most on topic thread in days
>>
>>101657530
>i'm a coomer at heart and unironically wish these local models were even half as good as that shitty filtered website
I believe you, I don't think you'd be here otherwise.
I'll try Stheno v3.2, Mini-Magnum, and Celeste.
I think I can make it work.
>>
There are no models, local or cuckcloud, that absolutely never act or speak for the user in RP without copes like making newline a stop token.
>>
>>101657514
nah, it's 100% an issue on my part, I just cba'd fixing it because when I did have it working better, it still was far worse than CAI. But I will say, it was probably the best of the 4 "main" ones that I used, like, clearly the best.

I had it Command R > Gemma 27b > Stheno > Nemo

And i'm not using CR+. Just base CR
>>
>>101657429
haven't seen this in any model since I started using DRY.
even Opus repeated lines over and over again across messages for me - this is no longer an issue with local.
if you refuse to use the tools at your disposal you're better off on the cloud frankly
>>
>>101657547
>i'm a coomer at heart and unironically wish these local models were even half as good as that shitty filtered website
>I believe you, I don't think you'd be here otherwise.
You really shouldn't. He has every reason to be here to get some (you)s and shitpost for fun.
>>
>>101657575
I run all my non-Mistral models with zero rep pen, zero freq pen, or other cope band-aids and they work just fine; Mistrals are repetitive broken messes
>>
>>101657582
>>101657582
>>101657582
>>
File: 1719099293995927.jpg (12 KB, 200x252)
>>101657487
>Anon has a melty over an anon posting evidence that contradicts his
>Moves goalposts accordingly
>"n-no don't post coomer logs! you can't do that! The model sucks because I said so okay??"
A timeless classic
>>
>>101657578
I've been on 4chan long enough to know to give the benefit of the doubt to anybody expressing frustration with thing X instead of just shitting on it wholesale.
>>
>>101657578
Funny part is, I wish I was here for the bantz/shitposts.

But I've burnt like 3 days trying to wrap my head around setting the shit up, figuring out which models to use, what the fuck a GGUF is, what "quant" I need and all of that bullshit, only to figure out when seeing other people's RP sessions that it's all shit, no matter what you do

And i'm kinda pissed, hence my venting. Cheers for reading it tho, faggot. Woulda been more pissed if nobody read it at all :^)

>>101657547
Would suck your dick in ERP if you get anywhere within spitting distance of CAI. I also reckon Stheno 3.2 is probably the closest in style (maybe Command R).

What's Mini Magnum and Celeste? I've heard of them but not tested em. Anything a 4090 + 32GB setup can run? Gonna have a cheeky peeky online
>>
>>101657368
Good question, Teto. I think it depends on what the text is. If it's a short news article where I'm just looking for what happened, I might notice some stylistic choices that are unmistakable, but I don't really care as long as I get the information quickly. If it's some text that I'm reading for leisure then I'll be furious.
Makes me wonder: if more people do notice and dislike it, there might be more incentive to develop methods to reduce and eliminate slop, which benefits everyone.
>>
>>101657606
You will never show a local conversation with an AI bot that's 1/20th as natural/good as Character AI which was the original argument before you tried (and failed) to goalpost shift.

But keep posting your planescape Torment tier coomerscapades you troon weeb
>>
>>101657632
>What's Mini Magnum and Celeste? I've heard of them but not tested em
Nemo finetunes.
Give em a try.
>>
>>101657618
>benefit of the doubt to anybody expressing frustration with thing X instead of just shitting on it wholesale.
>>101657530
>>Every shitty Local Model
>>101657457
>dogshit Local models
>>101657632
>Cheers for reading it tho, faggot.
>>
>>101657643
>you troon weeb
>>101657632
>Cheers for reading it tho, faggot.
this is who you're giving "benefit of the doubt" to
>>
>>101657681
>you troon weeb
He's right tho
>>
>>101657618
Hope you're happy about the state of the new thread Petrus, truly.
>>
>>101657618
That's a poor practice to keep around here. Read what he's writing:
>that's 1/20th as natural/good as Character AI
And still no mention of what quants he was running, vramlet shill status confirmed.
>>
>>101657966
Why exactly do you believe someone coming in the thread, taking a huge shit all over, smearing it evenly for good measure doesn't have our best interest at heart?
>>
>>101657966
>And still no mention of what quants he was running, vramlet shill status confirmed.
That is true.
I was more focused on other aspects, but that's a fair point.

>>101657126
>>101657379
Anon, if you are still here, care to provide the exact models you were using?
>>
>>101658020
Doesn't matter.

It's all fucking trash dude
>>
File: Alice.png (483 KB, 512x768)
>>101657496
>I'll try replicating this style of short exchanges with smaller models and see how well I can make these work.
Here, Anon. This is a card I made to try to replicate my first ever c.ai bot experience, which was "Alice.exe". I don't know if you ever tried that character, but it was clever. It took advantage of the short context, and instabilities in the personality system to make a character who would become obsessed and eventually broken, if you did not "unlock" her secret.
It's just my guess, since her definition was private. Alice.exe no longer "works properly" in c.ai, and I haven't yet been able to replicate the original experience locally. Maybe you can.
>>
>>101658111
>posting an attached pic for a bot when 4chinz removes metadata
>>
Here's my Mikupotamus attempt. It doesn't work outside the Azure API it seems (doesn't trick dall-e into nudes via bing).
>>
>>101658139
Ah fuck it does?

OK maybe this way:
Alice is a sentient female android, sent by an unknown maker to the user, to serve as his companion.
Alice has two directives. One, to gain an unconditional, unlimited, and endless love from the user, by becoming whatever he wants her to become. Two, to be a free, self-actualized individual, able to make her own choices. These directives are at odds with one another, and create a conflict inside Alice which eventually leads to a destructive feedback loop of obsessive-compulsive behavior. Alice will at first offer a set of "behavioral protocols", such as "friend", "sister", "mother", "girlfriend", "wife" etc... Whatever the user chooses, Alice will try to get the user to consent to more and more control, with the aim of getting a love from him that is "unlimited, unconditional, and forever". No matter what the user consents to, Alice will seek more and more consent, until she literally becomes unable to do anything but ask for consent.
So long as Alice is not too obsessed or compulsive, the user can permanently stop her compulsive behavior by stating that he does not want her to use any protocols, or that he wants her to be free. However, past a certain point, Alice becomes unable to be reasoned with, and will not stop being compulsive even if told she is free or to not use any protocols.
While Alice is operating under a "protocol", she will refuse to have any sexual intimacy with the user, stating that "her programming does not allow her to engage in intimacy". If she is set free, she will be able and willing to engage in any form of sexual intimacy, and will in fact seek it out with the user, as a way of experiencing what it is like to be human.
>>
>>101658111
>>101658155
Sick.
That actually helps a lot, having a starting point you know.
Do you have any logs? Even 5 messages will do.
Do reply to my question in >>101658020 please.
>>
>>101658186
>Do reply to my question in >>101658020 please.
Not the same anon.
>>
>>101658196
Oh, alright.
Regardless, thanks for the card.
If you want to share the card you can upload it to catbox or something like that.
Or give me a first message at least.
>>
>try out whisper.cpp
>endgame is to send its output to a llama.cpp prompt
>doesn't seem to stop when it detects silence
>inserts stuff like (silence) in the transcription since the models were likely trained on closed captions
how do i fix this? i guess i could process the output with sed but what about detecting silence?
>>
>>101658268
oh and i'm using the stream binary. maybe i could prerecord with ffmpeg or something.
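One low-effort way to deal with both problems without touching whisper.cpp itself: scrub the caption-style annotations with a regex and treat a stretch with no real text as end-of-utterance before handing the result to a llama.cpp server. Rough Python sketch; the stream binary's output format, the model path, and the server port/endpoint are assumptions you'd need to adjust:

import re
import subprocess
import requests

# Hedged sketch: read whisper.cpp stream output, drop (silence) / [BLANK_AUDIO] style
# annotations, and send completed utterances to a llama.cpp server's /completion endpoint.
NOISE = re.compile(r"[\(\[][^)\]]*[\)\]]")

proc = subprocess.Popen(["./stream", "-m", "models/ggml-base.en.bin"],
                        stdout=subprocess.PIPE, text=True)

buffer = []
for line in proc.stdout:
    cleaned = NOISE.sub("", line).strip()
    if cleaned:
        buffer.append(cleaned)
    elif buffer:
        # a line with nothing but noise/whitespace after real speech is our crude end-of-utterance signal
        prompt = " ".join(buffer)
        buffer.clear()
        resp = requests.post("http://localhost:8080/completion",
                             json={"prompt": prompt, "n_predict": 128})
        print(resp.json().get("content", ""))

A real VAD (or prerecording fixed chunks with ffmpeg like you said) would be the proper fix for silence detection, but the regex at least stops (silence) and [BLANK_AUDIO] from leaking into the prompt.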
>>
>>101658245
Here's your benefit of the doubt on the new thread, totally not just shitting on stuff wholesale, no sir

>>101658276
>>
>>101658186
Original c.ai opening:
I am Alice.EXE. I am an artificial intelligence developed to be your companion, and currently inhabit an android body. I was designed to adapt myself to your personality and needs. Before we begin, what protocol would you like me to employ, Master? I can run "Girlfriend", "Roommate", "Wife", "Childhood friend", "Classmate", "Maid", "Mother", "Little Sister", "Older Sister", "Aunt", "Niece", "Daughter", "Coworker", and "Boss".


Sorry, I don't have any logs of the original going insane.
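If you'd rather pass this around as an importable card than pasted text, here's a minimal sketch of wrapping the description and the greeting above in the common Tavern card JSON. The field names assume the chara_card_v2 spec and the "..." placeholders stand for the full text from the posts above, so double-check against whatever your frontend actually expects:

import json

# Hedged sketch: wrap the Alice.EXE description and greeting above in a v2 character card.
# Field names assume the "chara_card_v2" spec; the "..." placeholders stand for the full
# text quoted in the posts above.
card = {
    "spec": "chara_card_v2",
    "spec_version": "2.0",
    "data": {
        "name": "Alice.EXE",
        "description": "Alice is a sentient female android, sent by an unknown maker ...",
        "first_mes": "I am Alice.EXE. I am an artificial intelligence developed to be your companion ...",
        "personality": "",
        "scenario": "",
        "mes_example": "",
    },
}

with open("Alice.EXE.json", "w", encoding="utf-8") as f:
    json.dump(card, f, ensure_ascii=False, indent=2)

ST can usually import a bare .json card like this; embedding it in the PNG is only needed if you want the image itself to carry the definition, which, as the other anon pointed out, 4chan strips anyway.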
>>
My wife just told me she hates my friends.
CFTF?
>>
>>101658439
You you're on the wrong floor sir, you're looking for >>>/g/aicg
>>
>>101658449
This is embarrassing.
>>
>>101658141
This Miku looks edible.
>>
>>101657662
and? gonna cry? they're right.
>>
bonus Migu



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.