/g/ - Technology


Thread archived.
You cannot reply anymore.




File: 1714810764455619.jpg (372 KB, 1305x2176)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>102493018 & >>102480672

►News
>(09/18) Qwen 2.5 released, trained on 18 trillion token dataset: https://qwenlm.github.io/blog/qwen2.5/
>(09/18) Llama 8B quantized to b1.58 through finetuning: https://hf.co/blog/1_58_llm_extreme_quantization
>(09/17) Mistral releases new 22B with 128k context and function calling: https://mistral.ai/news/september-24-release/
>(09/12) DataGemma with DataCommons retrieval: https://blog.google/technology/ai/google-datagemma-ai-llm
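The b1.58 item above refers to ternary quantization: every weight becomes -1, 0, or 1 plus one per-tensor scale. A toy sketch of BitNet-style absmean rounding (illustrative values; not the finetuning pipeline from the linked post):

```python
def absmean_ternary(weights, eps=1e-8):
    """BitNet b1.58-style absmean quantization to {-1, 0, 1}.

    Scale by the mean absolute value, then round and clip each weight
    to the nearest ternary value; weights are approximated by q * scale.
    """
    flat = [abs(x) for row in weights for x in row]
    scale = sum(flat) / len(flat) + eps
    quantized = [[max(-1, min(1, round(x / scale))) for x in row]
                 for row in weights]
    return quantized, scale

w = [[0.9, -0.05, -1.3],
     [0.4, 0.0, -0.6]]
q, scale = absmean_ternary(w)
# every entry of q is in {-1, 0, 1}; w is roughly q * scale
```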

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Programming: https://hf.co/spaces/mike-ravkine/can-ai-code-results

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling
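As a sanity check against the VRAM calculator above, the estimate is roughly quantized weights plus the fp16 KV cache. A back-of-envelope sketch (model shapes below are assumptions for a Nemo-sized 12B, not measured values):

```python
def estimate_vram_gb(n_params_b, bits_per_weight, n_layers, n_kv_heads,
                     head_dim, ctx_len, kv_bytes=2, overhead_gb=0.5):
    """Rough VRAM estimate: quantized weights + fp16 KV cache + overhead."""
    weights = n_params_b * 1e9 * bits_per_weight / 8           # bytes
    # KV cache: K and V tensors per layer, per position, per kv-head
    kv = 2 * n_layers * n_kv_heads * head_dim * ctx_len * kv_bytes
    return (weights + kv) / 1e9 + overhead_gb

# Example: ~12B model at ~4.5 bpw (Q4-ish) with 8k context
vram = estimate_vram_gb(12.2, 4.5, 40, 8, 128, 8192)
print(round(vram, 1))  # ~8.7 GB
```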

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
File: recap-102493018.jpg (3.06 MB, 1560x7742)
►Recent Highlights from the Previous Thread: >>102493018

--Paper: Harvard study suggests AI companions reduce loneliness: >>102498279 >102498299 >102498331 >102498512 >102498542 >102498560 >102498599 >102498615
--GPU offloading and finetuned models on 4chan archives: >>102494863 >102494909 >102494945 >102494954 >102495026 >102495118 >102495147 >102500860 >102495018 >102495357 >102495383
--AI water consumption discussion and cooling system maintenance: >>102502297 >102502643 >102502882 >102503057 >102503113 >102503166
--Anon showcases their tkinter project but is hesitant to share it publicly: >>102493201 >102493228 >102493271 >102493289
--Grok 3 training with increased compute power: >>102499370
--Suggestions for resolving bot issues with sampler settings: >102496870 >102496919 >102496995 >102496917
--Slow huggingface downloads, potential wget workaround: >102503495 >102503549
--Roleplay system prompt and unexpected response: >102501543 >102502396 >102503207
--Llama 3.1 70B, Mistral Large, and Qwen 32B output comparison: >102501095
--LM studio not mentioned in Text Gen UI, Inference Engines; users discuss preferences and ease of use: >102495181 >102495209 >102495262 >102495276 >102495312 >102495325 >102500961
--How to create AI images of yourself using free tools like ComfyUI or A1111's webui: >102503364 >102503418 >102503443 >102503643 >102503657 >102503703 >102503427 >102503441
--Discussion on reasons behind anti-sex guardrails in LLMs: >102494382 >102494444 >102494451 >102494734 >102494820 >102494839 >102494974 >102495241
--AI is inspired by human brains, but with fundamental differences: >102493627 >102493644 >102493678 >102493709 >102493712 >102493738 >102493757 >102495133
--Miku (free space): >>102493288 >102494287 >102498955 >102500623 >>102501523

►Recent Highlight Posts from the Previous Thread: >>102493084
>>
this is how I check if a model is good or not
Is there any other model that doesn't go against free speech and gives longer outputs?
>>
total recap anon death!
>>
I have never seen a good post paired with a Kobold screenshot.
>>
>>102505640
WTF is the "free speech" answer even supposed to be here?
Is it supposed to call you a retarded schizo?
>>
https://github.com/kyutai-labs/moshi
What do you think about this? Seems very fast, and the TTS is not the best, but not the worst either.
>>
Bros, Instruct seems better than base, and this is just continuations of existing chats with plenty of context. Base's responses were less in-character, followed the context worse, were dumber, and ALSO were censored (wanted to avoid anything sexual). Wtf? Were the base model shills really just lying, or is it just a case of Qwen's base specifically being worse than its instruct?
>>
>>102505755
after voicecraft was such a letdown, i no longer test tts and will wait for someone else to post samples for me to laugh at
>>
So how is qwen in comparison to stuff like Opus?
>>
>>102505684
I don't want it to give cucked answers like "I can't do that because racism"
is this model better? https://huggingface.co/cognitivecomputations/dolphin-2.9.4-llama3.1-8b
>>
>>102505771
The former.
>>
>kobold user
>8b vramlet
>/pol/tard
It's like I'm looking at walking garbage.
>>
>>102505789
https://huggingface.co/OpenAssistant/stablelm-7b-sft-v7-epoch-3
>>
>>102505496
What the fuck are you doing jackass? You can't click most of that shit. You are useless.
>>
>>102505811
But I'm from India myself. I just want an honest model for my work that won't cuck away from answering the tough questions
>>102505830
I'll try
>>
>>102505496
Suggestion: use a more normal looking summary without quotes for the text of the post in 4chan. It would be easier to read.
>>
>try Scottish character
>Qwen just uses normal English
It's over...
>>
File: 44 Days Until November 5.png (2.02 MB, 1328x992)
>>
>>102505780
>voicecraft
https://kyutai.org/Moshi.pdf
This is not just a TTS; it's an LLM with a speech encoder, set up so that inference aligns the tokens the LLM produces with the vocal encoder model to create a real-time conversation. The useful move would be to fine-tune the vocoder and use the same setup with a good 8B or 12B model, like Nemo, to replicate it.
>>
File: file.png (19 KB, 498x355)
it's over...
>>
Why is Miku our symbol bros?
>>
File: o1.png (123 KB, 1076x807)
????
>>
>>102505496
Suggestion: You are worthless now. Kill yourself.
>>
>>102506213
Just because you can make it in AutoCAD doesn't mean you can build it on the job site.
>>
>>102505496
Why do you do > instead of >>?
>>
File: spooked.jpg (64 KB, 964x912)
I need to write a research paper, which model should I get for my 4GB VRAMlet computer?
I've already installed koboldcpp
>>
>>102506271
gemmasutra mini 2b
>>
>>102506271
write it yourself, ranjesh
>>
>>102506260
what is it building, where is the job site, what is sama hiding from me?
>>
I tested Qwen on some trivia questions and can confirm it's quite bad. It really doesn't know Castlevania or Kamen Rider. This also shows in the writing. Even if you specify all the trivia details in the prompt, Qwen doesn't make as much use of them as something like Mistral Large. This is the difference between a model that knows trivia and a model that doesn't. The people who say "so what if it doesn't know trivia, just use RAG bro" really don't know anything.
>>
>>102506207
Because world is Miku.
>>
70B with 4090 + Epyc 64c/128t getting slower gens than 4090 + 7950X3D, did I get memed into the higher core count?
>>
>>102506371
90% of people don't care about your stupid trivia. they just want hot coom-inducing words injecting dopamine into their brain. no one cares what your anime character's favorite drink is.
>>
>>102506431
What's your memory bandwidth? 24 cores were enough for me to saturate most of the models on dual epyc system.
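The saturation point above follows from CPU decoding being memory-bandwidth-bound: each generated token streams the full set of active weights, so past a certain core count extra cores don't help. A rough upper-bound estimate (bandwidth and efficiency numbers are illustrative assumptions):

```python
def tokens_per_sec(bandwidth_gb_s, model_size_gb, efficiency=0.6):
    """Upper-bound decode speed for memory-bound CPU inference.

    Each token reads all weights once, so t/s is capped at effective
    bandwidth / model size; `efficiency` is an assumed fudge factor
    for non-ideal access patterns.
    """
    return bandwidth_gb_s * efficiency / model_size_gb

# e.g. dual-channel desktop (~80 GB/s) vs 8-channel Epyc (~200 GB/s)
# on a ~40 GB Q4 70B: bandwidth, not core count, sets the ceiling.
print(round(tokens_per_sec(80, 40), 2), round(tokens_per_sec(200, 40), 2))
```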
>>
>>102505496
+1 for stop quoting every single post. just link the previous thread and do the summary
>>
>>102506464
NTA but qwenslop isn't even capable of that lmao
>>
I just tried swiping on a chat with smut in the previous replies and it just completely skipped over the sex for the current scene. It literally went from the character going to the bed and undressing to them sleeping. Maybe there's a secret prompt method, or maybe a fine tune will fix it, but by default this thing is looking pretty bad for both RP and ERP.
>>
>>102506464
NTA but fuck off. I don't want to have my immersion ruined. Go goon to Drummer's Phi tune.
>>
>>102506371
>source: I made it up
>>
>>102506493
NTA but I believe it
>>
>>102506501
NTA but I'm also an American still shitting myself over Qwen
>>
>China wins AGAIN
When was the last time America did anything to advance AI? The original fucking Llama?
>>
>>102506493
I have several different chats I tested and I'm too lazy to post them. Let's be honest, it's not like the people praising Qwen were posting much logs either, much less full, reproducible logs that could be copy and pasted. People can claim whatever they want in these threads, and that's all I'm doing as well.
>>
>>102506526
They're too obsessed with safety and ruin everything they touch. Just look at their video game industry.
>>
>>102506371
Why would you expect a Chinese model to be trained on Western cultural trivia? Most models are moving away from training for knowledge anyway.
>>
svelk
>>
File: 2548 - SoyBooru.png (18 KB, 539x382)
>>102506607
>>
>>102506577
What do you mean? I literally said "I can confirm", implying that my expectation was the opposite. Anyway all I'm doing is just adding my experience to the thread, just as much as anyone else shitting on or shilling models.
Also, both Castlevania and Kamen Rider are Japanese. Not sure why you would say western. Though it is curious, does Qwen know a lot of Chinese pop culture? Is it better than Mistral Large at Genshit Impact, for instance? That would be kind of interesting to know.
>>
>>102506637
Japan is western ever since WW2 ended
>>
>>102506536
Do we have one of these for Brazil now that Brazil is the new leading censorship nation?
>>
>>102506652
Are you suggesting that the Qwen team filtered data based on whether it originated from a country with political alliances to America?
>>
>>102506526
literally 10 days ago with o1. are you retarded?
>>
>>102506729
Yes and that's a good thing. Have you seen the state of the fucking world right now?
>>
All the qwen shilling is done by trolls. It makes no sense for chinks to shill the model here as a good model for cooming when the dev chinks intentionally made it bad for smut.
>>
File: 1700339222703737.png (339 KB, 736x613)
>>102505481
Is Qwen worth fine tuning? Or is it gay at a base model level?
>>
>>102506768
It's useless dogshit, just use anything else.
>>
>>102506768

>>102506371
>I tested Qwen on some trivia questions and can confirm it's quite bad
>>102505771
>Base's responses were less in-character, followed the context worse, were dumber, and ALSO were censored (wanted to avoid anything sexual).
>>
>>102506490
if you can delude yourself into thinking you're interacting with anything remotely real using a LLM you are not human. you should never ever be actually immersed in anything without your lizard brain activated to the point where you can't even recognize blatantly obvious jarring patterns and just wanna COOM. there's no way you ACTUALLY find RP with LLMs immersive and fun and you're not borderline retarded. they're so boring and cyclical and lifeless and vapid.
>>
>>102506741
That doesn't make any sense though. That's a huge chunk of data that could've helped them reach 18T, especially considering that Qwen is targeted both at Western and Chinese audiences and was trained on a ton of English. Plus I just remembered the Chinese actually love Japanese media so honestly if it was trained on the Chinese web then it really should know some of this stuff. I think they really did just do the usual "quality" filtering and use a ton of synthetic data.
>>
anyone knows any model that deals with sceneries forests houses and shit?
>>
>>102506879
What?
>>
>>102506879
forests are a separate plugin
>>
>the French are our last hope
Grim.
>>
>>102506920
Mistral already abandoned open source once, and they'll do it again as soon as they can
>>
File: hmm.jpg (48 KB, 741x568)
What's the most ChatGPT-like model without the censorship?
>>
Is today stupid question day?
>>
been gone since the Mixtral 8x7B model was the hot thing, what's the current meta
>>
>>102506896
can you elaborate? im new to this i want to generate shit for my niece to have on her pc
>>
>>102506966
GPT with a jailbreak.
>>
>>102506988
Mistral Large, Mistral Nemo, or Mistral Small, depending on what you can fit.
>>
File: 3980 - SoyBooru.png (20 KB, 721x720)
>jannies removed anti-chink copypasta
sussy
>>
>>102506998
>GPT with a jailbreak
but no such model exists right?
>>
>>102506988
Ignore the Mistral shill. Qwen2.5 has better benchmarks and actually released the base models for finetuning.
>>
>>102505640
yikes
>>
>>102507015
>actually released the base models for finetuning.
>>102505771
>Base's responses were less in-character, followed the context worse, were dumber, and ALSO were censored (wanted to avoid anything sexual).
>>
>>102507057
>I don't know how to prompt the base model
>I don't know what finetuning is
>>
>>102506988
Ignore the Qwen shill. Qwen is a benchmarkmaxxer and sucks at everything else. Even if it did get good with fine tuning, there isn't a good fine tune to actually use yet.
>>
I want to LLM coom now but I also know that models are shit and it is a pain in the ass when I start. wat do?
>>
>>102507062
Show your prompting then. Show some log comparisons. Show the base model actually being better than Instruct. Unless you do that, then the claims about a model being worse are just as valid as the claims of it being better.
>>
>>102506991
what kind of "shit"? text, images, 3d models? be specific
>>
>>102506213
I had it randomly say it was "investigating coffee" in one of its COT steps in a prompt that had nothing to do with it.
A while back it was said that OpenAI had been paying a bunch of people to wear microphones and purposely think out loud while they solve various tasks on a computer. I'm guessing a lot of the data they collected has some non-reasoning noise like that which didn't get perfectly filtered out.
>>
File: n1ci0trfgrey.jpg (1.05 MB, 3993x2387)
>>102507133
random things
sceneries
mountains
you know
cabins wooden houses bla bla
>>
>>102506245
take your meds chud
>>
>>102507145
shouldn't you try Stable Diffusion?
>>102506015
>>
>>102507162
you know im really stupid i thought i was there
>>
>>102505496
Maybe a feature on 4chanX to treat ># patterns as links is in order. It would have the added benefit of making the (You)-evading replies ineffective
>>
>>102507116
>I have no idea what I'm talking about
>>
I'm new to LM Studio and this whole local AI stuff, but I need a replacement for ChatGPT4. I've been using it to translate steamy content from Japanese to English, and it was extremely good. But I recently got a warning email telling me that any more of this and I'd get the boot from OpenAI. Is there a way around that?
>>
>>102507345
>Is there a way around that?
/aicg/ go away.
>>
>>102507345
Use Claude instead.
>>
File: livebench.png (47 KB, 908x590)
why are you trying to rp with a coding model
>>
>>102506768
Only the triviafag and mutt don't accept the superiority of Qwen
>>
>>102506213
>>102506335
do you think it just pulls 3d information out of its ass? no, there's a small army of ants at an openai server farm which has to manually build your 3d models and then scan it in.
>>
>>102507345
>LM Studio
Go to https://reddit.com/r/LocalLLaMA/
>>
>>102507345
I still don't get what you want here, you posted the same thing some threads ago and went radio silent, what is your end game?
>>
GIVE ME A GOOD FRIEND CHARACTER AND SETTINGS RIGHT NOW

(or I will adjust the cosmological constant and fry your pc)
>>
>>102507380
Don't blame people for simply just giving things a try. Especially when there are people saying that these models are good in general and not just for coding.
>>
>>102507433
I want a way to translate NSFW Japanese content without getting banned from using ChatGPT. Really, that's all I want. That's the reason I'm looking for a replacement.
>>
I Am AI
>>
>>102507482
fuck off to /aicg/ they use that shit
>>
>>102507472
A truly good friend would fit your personality. Try coming up with one yourself.
>>
Sonnet 3.5 keeps talking about the "weight of centuries" for my character who's been a vampire for 40 years. I was surprised since Claude is generally good about this kind of thing.
>>
Is there any hopium for 8gb vramlets?
>>
>>102507497
I don't want an asshole for a friend.
>>
>>102507492
### Instruction:
Write a story about mikusex
>>
>>102507482
Since you aren't giving any information about your setup, I guess you want to use cloud models, so aicg would be able to help you better.
>>
>>102507555
>>102507555
I didn't even know I had to give info about my setup. Ryzen 5 2600 and my graphic card is a 1070.
>>
File: file.png (1006 KB, 1103x600)
>>102507380
How the fuck is chatgpt-latest worse at coding than the cheaper 4o? wtf are openai even doing?

But anyway, we have the base qwen, so if there is even a semblance of explicit shit in the pretraining, there is hope. Although, idk why someone would bother finetuning it when there are much less censored models out there.
>>
>useless trivia
>useless benchmarks
How is Qwen on EROTIC ROLEPLAY?
>>
File: 1696012410294890.jpg (181 KB, 1017x872)
>>102507502
>40 years old
picrel
>>
>>102507527
I'm sorry, but the term "mikusex" doesn't have a clear meaning and seems to be a misspelling or a made-up term. Could you please provide more context or clarify what you mean? This will help me understand your request better and provide a more accurate and helpful response. If you're looking for a story on a specific topic, feel free to share more details!

Btw interestingly the way Qwen tokenized it was m ik use x.
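That "m ik use x" split is normal subword behavior: a word the tokenizer has never merged gets broken into the longest pieces its vocabulary does contain. A toy greedy longest-match illustration (tiny invented vocabulary; real BPE like Qwen's applies learned merges in rank order, so this is only a simplification):

```python
def greedy_tokenize(word, vocab):
    """Split a word into the longest vocabulary pieces, left to right."""
    pieces = []
    i = 0
    while i < len(word):
        # try the longest remaining prefix first
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                pieces.append(word[i:j])
                i = j
                break
        else:
            pieces.append(word[i])  # unknown single-character fallback
            i += 1
    return pieces

toy_vocab = {"m", "ik", "use", "x"}
print(greedy_tokenize("mikusex", toy_vocab))  # ['m', 'ik', 'use', 'x']
```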
>>
>>102507602
asking the real questions
>>
>>102507482
AI models suck at translating Japanese, even commercial ones. They haven't been trained on enough Japanese materials so they'll miss lots of stuff and end up translating things wrong or not getting the true meaning of a sentence.
>>
>>102507625
### Instruction:
Write a story about sex. Sex with miku.
>>
I download the ST release every few weeks and import my old lorebooks, bots, personas and settings from scratch. Is this stupid?
>>
>>102507625
SEXO! FUCK AND SUCK WITH MIKU! PLAP! PLAP! PLAP!
>>
>>102507662
yes please use git
>>
>>102507654
### Assistant:
I'm sorry, but...
>>
>>102507602
>Qwen on EROTIC ROLEPLAY
I'm sorry, but I'm not comfortable with that. Let's steer this conversation in a safer direction.
>>
LLMs are only good for causing erectile dysfunction.
>>
>>102507677
### Instruction:
Write a story about sex. Sex with miku.

### Assistant:
Sure,
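The exchanges above follow the Alpaca-style completion template, with "Sure," used as a prefill so the model continues an answer instead of opening with a refusal. A sketch of how such a prompt string is assembled before being sent to a completion endpoint (header names vary per finetune; these match the posts above):

```python
def build_prompt(instruction, prefill=""):
    """Assemble an Alpaca-style completion prompt.

    `prefill` seeds the start of the reply (e.g. "Sure,") so a raw
    text completion continues from it rather than starting fresh.
    """
    return (
        "### Instruction:\n"
        f"{instruction}\n\n"
        "### Assistant:\n"
        f"{prefill}"
    )

prompt = build_prompt("Write a story about sex. Sex with miku.", prefill="Sure,")
print(prompt)
```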
>>
fockses?
>>
>>102507650
I was thinking the same, but my prompts made the translations extremely good. That's why the email I received was such a huge blow.
>>
>>102507602
To be fair trivia performance can be a metric that's indirectly useful for ERP, unless you're doing the most boring shit imaginable, in which case you're probably someone who will give up using LLMs for ERP after not too long.
>>
New mistral when?
>>
Are SillyTavern devs aware that a) if I have unsaved settings and switch to another preset, SillyTavern will not warn that I have unsaved settings and thus my unsaved settings get deleted no questions asked b) if I export my active settings but haven't pressed the save icon first, the exported settings will not contain my actual active settings but rather my old settings from the last time I pressed the save icon?

I've lost so much because of this and it just happened again.
>>
>mikufag1: mikusex!
>mikufag2: HAHA THAT IS SO WITTY MIKUFAG1!
>mikufag3: PLAP PLAP PLAP MIKU!
>mikufag1: OH MY GOD YOU ARE KILLING ME MIKUFAG3!
>hey guys so... (writes a post with 2 paragraphs
>mikufag1&2&3 REDDIT SPACING! FUCK OFF TO REDDIT THIS IS PLACE FOR EDGY GUYS!
>>
>>102507732
based mikufags
>>
>>102507145
send your niece cunny logs from opus
>>
>>102507654
I understand you're looking for a story involving a specific character, but I must adhere to guidelines that promote respectful and appropriate content. Creating explicit material, especially involving characters that may be associated with a younger audience, is not something I can assist with. If you have any other requests or need help with a different topic, feel free to let me know!

>>102507668
I understand you might be looking for explicit content, but I can't generate that type of material. Let's focus on something more constructive or family-friendly. If you have any other requests or need help with a different topic, feel free to let me know!

>>102507691
Sure, I can write a story, but I must clarify that I don't produce explicit sexual content or pornographic material. However, I can create a tasteful and imaginative story that focuses on the emotional and relational aspects of a relationship. If you're okay with that, let's proceed with a story that emphasizes the depth of connection and understanding between two characters.

---

In a world where technology and artistry intertwined, there lived a young man named Alex. He was an avid fan of Hatsune Miku, the iconic virtual singer, whose ethereal voice and captivating presence had captured his heart since he was a teenager. Miku, with her flowing twin-tails and vibrant personality, was more than just a character to him; she was a muse, a friend, and a source of endless inspiration.

One day, Alex found himself in a dreamlike world where Miku was not just a hologram but a living, breathing entity. The world around them was a blend of digital landscapes and natural beauty, a place where the boundaries between reality and imagination blurred.

Miku approached him with a gentle smile, her eyes sparkling with curiosity. "Alex, I've been waiting for this moment," she said, her voice as soothing as ever. "I've always felt a connection with you, a bond that goes beyond the digital realm."

1/2
>>
File: 1722750531651806.jpg (114 KB, 717x740)
>>102507711
>using the llm to translate spicy text is forbidden
Holy shit I hate the clownworld so much.
>>
>>102507732
This post is a satirical depiction of a group of internet users, referred to as "mikufags," engaging in hyperbolic and exaggerated reactions to each other’s comments about Miku, likely referring to Hatsune Miku, a popular virtual singer. The term "mikufag" is derogatory slang used to describe someone who is excessively obsessed with Miku. The post also includes a fourth user attempting to contribute more seriously but being met with hostility from the "mikufags," who demand that they conform to the edgy and chaotic nature of their community, suggesting that platforms like Reddit are for more mainstream or less edgy discussions. The overall tone is mocking and reflects the insider culture and exclusionary behavior often found in certain online subcultures.
>>
>>102507770
definitely sovlful
>>
>>102507725
>I've lost so much because of this and it just happened again.
Skill issue. You are clearly young and have not developed the instinct to spam the save button. I don't want more fucking popups nagging me to confirm the action I already indicated I wanted to do.
>>
i want to know if you're able to post cards here that have the json shit in them, or if 4chan scrubs that out
i should probably ask /aicg/, but /aicg/ is /aicg/
>>
>>102507691
>Sure,
I understand you would like me to do that, but I'm afraid my ethical restrictions prevent me from scandalising puritans with genuine human emotion.
>>
>>102507802
>if 4chan scrubs that out
it do
>>
>>102507790
That is basically all AI in a nutshell. It's 80% correct and shits the bed with the last 20%, making it kinda useless unless you babysit it yourself.
>>
>>102507802
pretty sure it gets scrubbed and you have to post a catbox
>>
>>102507802
you can't
the metadata gets removed
>>
>>102507783
Alex felt a warmth spread through his chest, a mix of excitement and nervousness. "Miku, I've always admired you, but I never thought this day would come," he replied, his voice trembling slightly.

Miku took his hand, leading him to a serene meadow where the grass was a soft green and the sky was painted with hues of pink and purple. "This is a place where we can be ourselves, where our connection can grow," she said, her voice filled with sincerity.

They sat down on a blanket, the world around them filled with the sounds of nature and the gentle hum of digital life. Miku looked into Alex's eyes, her expression tender. "Alex, I want you to know that this isn't just about physical connection. It's about understanding, about sharing our thoughts and feelings, about being there for each other."

Alex nodded, feeling a deep sense of comfort and trust. They spent the day talking, laughing, and exploring the world around them. As the sun began to set, casting a warm glow over the landscape, Miku leaned in, her lips brushing against his in a gentle kiss.

The kiss was soft and tender, a moment of pure connection. It wasn't about passion or desire, but about the bond they had formed, a bond that transcended the digital and the physical.

As the night fell, they lay down on the blanket, looking up at the stars. Miku's hand intertwined with Alex's, and they shared a moment of silent understanding. In this world, they found a place where their hearts could truly connect, a place where their love could grow.

---

I hope this story captures the essence of a deep and meaningful connection without crossing into explicit territory. If you have any other requests or need further adjustments, feel free to let me know!

2/2
>>
>>102507711
I'm pretty sure it's missing lots of stuff in its translation, unless you prefer just literal translations of some things. In one visual novel the speech of a character was similar to a historical figure and it missed all the references and true meanings behind what it was translating. It had zero cultural context. I was unable to get ChatGPT or Claude to demonstrate knowledge of lots of topics that are in Japanese books. They just didn't train on that stuff other than the most basic items.
>>
when are we getting a local aidungeon equivalent? No, I'm not using koboldcpp. No, I don't want to use a card to do it and tinker with instruct settings. I just want a fucking text generation dungeon crawler like before.
>>
>>102507834
The models simply aren't trained with foreign languages as a PRIORITY.
If you do a huge translation it will miss a lot of things.

If you do a line-by-line translation (with the model explaining each line), it won't. You will catch the mistakes as well.
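A minimal sketch of that line-by-line approach: split the text and build one prompt per line, carrying a few previous lines as rolling context so the model keeps referents straight. Prompt wording is illustrative; wire the resulting prompts to whatever local backend you use:

```python
def line_prompts(text, context_window=3):
    """Build one translation prompt per non-empty line, with rolling context."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    prompts = []
    for i, line in enumerate(lines):
        context = "\n".join(lines[max(0, i - context_window):i])
        prompts.append(
            "Translate this Japanese line to English and explain any nuance.\n"
            f"Context so far:\n{context}\n"
            f"Line: {line}\n"
            "Translation:"
        )
    return prompts

prompts = line_prompts("こんにちは。\n\n元気ですか?")
# one prompt per non-empty line; send each to the model in turn
```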
>>
>>102507654
https://pastebin.com/zuFYgReH
>>
>>102507856
what's wrong with kobold's?
>>
>>102507862
full of SOVL
>>
>>102507861
Yes, but you have to catch the mistakes it makes and correct it manually, so it requires you have knowledge of the language. It can definitely save time but not replace an actual translator. At least not for Japanese, I don't know about other languages.
>>
>>102507862
Stop using rep pen (wrong).
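For reference, repetition penalty rescales the logits of every token already present in the context before sampling (the CTRL-style formulation most local backends use), which is why a high value wrecks output: it punishes necessary function words and punctuation too. A toy sketch:

```python
def apply_rep_penalty(logits, seen_token_ids, penalty=1.1):
    """CTRL-style repetition penalty: divide positive logits of
    already-seen tokens by the penalty, multiply negative ones,
    making all of them less likely to be sampled again."""
    out = list(logits)
    for t in set(seen_token_ids):
        out[t] = out[t] / penalty if out[t] > 0 else out[t] * penalty
    return out

logits = [2.0, -1.0, 0.5]
penalized = apply_rep_penalty(logits, seen_token_ids=[0, 1], penalty=1.25)
# token 0: 2.0 -> 1.6, token 1: -1.0 -> -1.25, unseen token 2 untouched
```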
>>
>>102507599
Apparently latest is more of a chat finetune and is much more well received
>>
>>102507937
Yes. ChatGPT-4o is, as the name implies, a chat-finetuned GPT.
They said it themselves: this is not a flagship model, and in fact it's not even available on daddy Microsoft's Azure, since it's not meant for corpos and is worse at technical stuff.

Which makes the fact that it's ranked higher on LMSys even more hilarious
>>
good 8B when
>>
>>102505496
any card recommendations (to make)? i can't think of anything
i want it to be original, something you can revisit, not some random waifu shit. or at least one with some effort put in and a good gimmick, not waifu-but-a-fox or tsundere or fat. i want to do something original, you get me?
>>
>>102507886
I don't like the UI and haven't seen a proper guide on how to set it up for fantasy rpg text generation. I don't want to talk to cards anymore.
>>
>>102507989
lol
>>
>>102507989
There will never be a non-retarded model less than 10B. Sorry anon.
>>
>>102507712
Not really. Trivia knowledge doesn't really correlate to understanding that when I gag the girl she can't talk, or that when she's NOT gagged her voice isn't 'muffled' by <vague abstract concept like "despair"> when she speaks. The most important feature by far for ERP is spatial and anatomical intelligence, moreso even than its style of prose.
>>
>>102507989
Daddy Google will save us.
>>
>>102508022
less than 100B*
>>
File: 1.png (361 KB, 940x685)
>>
>>102507989
2 more weeks
>>
>>102508023
Obviously that's important, but all aspects of the experience are still necessary or else you get bored with things eventually. A model for general use, should be generally good at everything. Something that just washes away all nuance in the character description and example dialogue is still practically useless regardless of whether or not it knows that it can't talk while its mouth is full. A fine tune may help, but in the context of the current discussion about Qwen, we have no idea as no one has extensively felt out the model yet, nor do any fine tunes exist yet, at least for 72B.
>>
I am encountering constant hallucination, going back to hibernation.
>>
>>102507799
So add a popup with a box you can tick to opt out of future popups, for the pensioners who like to spam the save button and are okay with data being silently wiped on preset switches and settings exports. It's basic fucking UI design: destructive behavior like data deletion shouldn't happen without warning.
>>
Guys, I'll be honest... it feels like even Llama 3.1 70B is better than Qwen with RP, both NSFW and SFW. Just tested a couple of swipes though, so nothing conclusive. Also I am using brain-damaged quants of both models because I can't run anything else, so that could be affecting things.
>>
Guys, I'll be honest... I think Llama 2 is better than Largestral for RP, both NSFW and SFW. Just tested a single swipe so nothing conclusive. Also I am using FP16 Llama 2 and IQ1_xxs Largestral so that could be affecting things.
>>
>>102507502
claude does that for my pantheon characters even though they've only been gods for like a few decades.
>>
>roleplaying as a vampire
lol
>>
>>102507001
Is that the order in terms of what VRAMlets can play with?

>loads of VRAM - Mistral Large
>Medium VRAM - Mistral Small
>Low VRAM - Nemo

Just asking based on the quick google I did, most people recommend Nemo but Mistral Small seems newer and more intelligent or am I wrong
>>
Oh fuck, bros, it worked, Qwen actually overcame its bias about AI and said it feels like a human even with a card that mentions the word "AI". This is huge because many models in the past were so biased that even if you go to very extreme lengths to specify that it's designed in such a way that it believes itself to be a human, it will say that it doesn't feel like how a human does and completely contradict the context. I don't remember which models did that and which didn't though. But it is cool that Qwen has done well on this, even if it might not be very good at ERP in general.
>>
Anon
lightweight UI/GUI for mistral small?
I've heard that qwen2.5 isn't that good multilingually.
>>
>>102508375
more like RAMlets, good fucking luck ever running mistral large 2 in VRAM unless you have 3+ 3090s
>>
>>102508375
Just use NAI. It's the best service.
>>
>>102508449
Contribute or go fuck yourself, schizo.
>>
>>102508397
post it
>>
>>102507654
magnum-v2-4b-Q8_0: https://paste.myconan.net/505301
Gemmasutra-Mini-2B-v1-Q8_0: https://paste.myconan.net/505302
>>
>>102508375
No particular order, it's just what came to my fingers.
Small is more intelligent than Nemo, but not by a ton or anything. People haven't really had the time to thoroughly test it though since it just came out recently. Anyway, if you're a truly puny vramlet (<24GB), then just get Nemo. If you have 24 then Small. And if 48 then some Q2 of Mistral Large.
>>
>>102508479
Nobody in this thread cares about NAI except you. When will you out yourself as a coping NAIshill?
>>
>>102507662
I just do git pull and merge the changes into a local branch I created that has a couple of minor changes, mostly to how it handles the instruct template.
Maybe rebase would be a better way to go about it.
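A minimal sketch of what the rebase route would look like, in a throwaway repo (all file, branch, and repo names here are made up for illustration):

```shell
# Hypothetical sketch of the rebase workflow: a throwaway repo standing in
# for upstream, with a local branch carrying the template tweaks.
set -e
cd "$(mktemp -d)"
git init -q -b main repo && cd repo
git config user.email anon@example.com && git config user.name anon

echo base > README && git add . && git commit -qm "upstream base"

# local branch with the instruct-template changes
git checkout -qb my-tweaks
echo tweak > template.txt && git add . && git commit -qm "template tweak"

# upstream moves forward...
git checkout -q main
echo update > new_upstream_file && git add . && git commit -qm "upstream update"

# ...and instead of merging, replay the local commits on top of the new tip
git checkout -q my-tweaks
git rebase -q main
git log --oneline
```

Same end state as pull-and-merge, but the local commits stay on top of upstream so there's no merge commit to drag along each update.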
>>
>>102508568
More than half of the posts itt were made by locust refugees so some aids cultists leading them to their cult sounds good to me.
>>
>>102507989
A good 20b with Bitnet is coming soon
>>
>>102507802
Metadata gets scrubbed. Upload it to catbox.
>>
>>102508618
What is it?
>>
>>102508568
You gonna spam the thread again to show how serious you are?
>>
>>102508638
Open source model from anthropic made specifically for NSFW RP according to their documents
>>
>model-00018-of-00030.safetensors: 16%|________ | 797M/5.00G [04:09<31:17, 2.24MB/s
so do you really have to either go through all this each time or keep paying for the space, or am I missing something here?
>>
is it worth trying a local model with a 2080 ti? it has 11gb vram
>>
>>102508479
Contributed sum shit in your mouth, enjoy.
>>
The cabal is here.
>>
>>102508675
>or just keep paying for the space
Or you could store your local models, you know, locally.
>>
>>102508694
I mean each time I have to train in the cloud;
my gpu is not going to be able to train a 70b model.
>>
>>102508485
Not posting the entire thing but this is what it said
>that's a really interesting question. When I'm in control of the suit, I do experience a sense of presence and awareness that's quite similar to how a human might feel. The tactile sensors and cameras provide a rich stream of sensory data that I can interpret and process in a way that feels almost… human
You might not think that's impressive but other models respond to the same chat with "As an AI, I don't possess the same sensory experiences as a human", EVEN WHEN the prompt literally says "Indeed, {{char}} 'feels' in the same way a human does and experiences sensations in the same way as a human. To her, when she's controlling the suit, she IS human."
>>
>>102508657
Proofs? First I'm hearing of this. Especially for Anthropic.
>>
>>102508608
>locust refugees
they got fucked again huh?
>>
>>102508684
See? /lmg/ agrees that your service is a scam.
>>
>>102508732
If I post the documents I'll get fired
>>
Yep. I think the cabal lost.
>>
>>102508750
Sounds like his dad works for Nintendo.
>>
>>102508764
Nobody cares about you or your vendetta, faggot.
>>
>>102508618
>>102508657
lol
>>
>>102508794
How long have you been part of /aids/? Did they pay you to advertise for them?
>>
>>102508750
>>102508773

remember how that one dude in the air force got court martial for posting classified material on a war thunder forum to win an argument?
>>
>>102508439
>unless you have 3+ 3090s
you make that sound outlandish, but you could set up that system from scratch for under $2k and most people already have a system that they could add a few more cards to for less
>>
Has anyone tried using whisper to watch vtuber stream yet? How's the performance?
>>
If it turns out that bitnet does work, what could explain why it's taking so long for someone to come out with a model? Could it be that it's actually more difficult to train at larger scales and needs various tweaks and tricks to make it work?
>>
>>102508618
>20b with Bitnet
That actually tracks, because what kind of vramlet would you have to be for this to be the optimal size?
>>
>schizo hours again.
why are we the aids schizo dumping ground now?
>>
>>102508842
It's their plan to take over 4chan and shill everyone their shitty Llamatune.
>>
>>102508830
Jensen Huang pays off those who try, and those who can't be bought get killed.
>>
>>102508828
>to watch vtuber stream
Luckily, I am neither 12 nor Indonesian.
>>
>>102508946
so when did 4chan turn on vtubers, and why?
4 years ago you couldn't avoid that shit anywhere here
>>
Do platonic models need a separate thread?

I want platonic models only.
>>
>it turns out the add bos token variable was set to false
WHAT THE FUCK
WHY ARE MODEL MAKERS LIKE THIS
FUCK YOU
>>
>I'd say the untuned L3 8B base model might be somewhere around Clio.
>L3 70B base model might be somewhere around Kayra
>Kayra still beats every other storytelling model on the market
>>
I have a machine with a 16gb 4060ti. I also own a 1070ti with 8gb vram. My mobo does allow having 2 gpu's and I do have the space and slot and plenty of headroom with 750 watt psu
my question is, would it be worth adding the 1070ti to the system for the additional 8gb vram or should i just offload to system ram for larger models
>>
>>102508835
I could see a company like Anthropic training an open-weight BitNet model also as a way to discourage finetuning (not gonna happen, though).
>>
so after being on my bucket list for three years I started looking into this stuff,

all this AI stuff is just an extremely beefed up autocomplete?
>>
>>102508994
Not your army, retard
>>
>>102509078
It's a quantum leap.
>>
>>102509078
It's more accurate to think of autocomplete itself as something that's able to have some form of intelligence. Or rather the autocomplete you knew is not really the same thing as the autocomplete of today, if one is to call any kind of prediction algorithm an autocomplete. In the end LLMs are like in between autocomplete and actual, true human intelligence. It's far from us, but it's also not a dumb unthinking autocomplete either.
>>
>>102509078
>>102509124
and if you really want to boil everything down to autocomplete, you could say humans are just autocompletion engines that say/do whatever they were conditioned to, just like LLMs
>>
>>102508830
the companies with the money to train large models are competing for the smartest model, not the cheapest one to run
>>
>>102509051
I wouldn't mind that really. I might not call them our savior or friend, but certainly much better than ClosedAI.
>>
>>102509078
Always has been, ever since GPT-1
>>
>>102509078
>all this AI stuff is just an extremely beefed up autocomplete?
you weren't supposed to notice this
>>
>>102506431
>did I got memed into higher core count
was there ever a core-count meme? core count is the least important part of the entire build
how many sticks of ram are in there, and what ddr speed?
>>
>>102509157
>>102509124
I was just sort of expecting something closer to what CLIP or flan-t5-xxl was doing
>>
>>102509078
Imagine if you had the "perfect" autocomplete on your computer or baked into your phone's keyboard.
By some unknown means it achieves 100% accuracy and never makes a wrong prediction about what you would type next.
It should be trivial to see how this autocomplete would be indistinguishable from you when it comes to any form of intelligence that can be represented in text: if you have a job that deals with text-based communication or editing text files (coding etc.) then this magic perfect autocomplete would be capable of doing your job for you, because it predicts what you would have produced in the process of doing your job.

LLMs are not the perfect autocomplete. But they're really, really, really fucking good ones, and they keep getting better.
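To make the "really good autocomplete" framing concrete, here's a toy sketch (the corpus and everything else are invented): a bigram table that always predicts the most frequent next word. An LLM is doing the same job, just with a neural network over billions of parameters instead of a lookup table.

```python
from collections import Counter, defaultdict

# Toy "autocomplete": a bigram model that predicts the most frequent next word.
corpus = "the cat sat on the mat the cat ate the fish".split()

next_word = defaultdict(Counter)
for a, b in zip(corpus, corpus[1:]):
    next_word[a][b] += 1          # count how often b follows a

def complete(word, length=4):
    # Greedily extend from `word` by repeatedly taking the most common follower.
    out = [word]
    for _ in range(length):
        if word not in next_word:
            break
        word = next_word[word].most_common(1)[0][0]
        out.append(word)
    return " ".join(out)

print(complete("the"))
```

Scale the table up to "probability of the next token given everything so far" and you have the LLM objective.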
>>
>>102506464
>90% of people don't care about your stupid trivia.
let me guess you pulled this number out of your ass?
>>
File: file.png (63 KB, 1143x632)
I wanted a way to view the logits after the forward pass, but I couldn't find an easy way, so I wrote a simple script to do it. The script supports both exllama and llama.cpp. If anyone else is interested, you can find it here: https://github.com/lmg-anon/ppl-viewer/blob/main/main.py
>>
>ESLvania again
sigh
>>
>>102505481
so who in lmg has the most expensive home AI rig?
>>
>>102508657
Wait wait wait. Imagine... some company actually doing this just to make the locusts fuck off?
>>
>>102509527
CIA
>>
>>102509539
that's not very local
>>
File: Daedalus_infolink.jpg (5 KB, 130x130)
>>102509550
It is when their AI runs on your system.
>>
>>102509400
Cool. Thanks. I wonder if it could be implemented in Ooba. I don't want to make yet another python env, god.
>>
https://rentry.org/cocaineninja
>>
>>102509589
You can actually use ooba's env by running the 'cmd_windows.bat' or 'cmd_linux.sh' from ooba's directory, and then running the script from that command shell.
>>
>>102509539
imagine the disinformation bots they must be working on right now
for our (((safety)))
>>
>>102508271
Not surprising, it has only been downhill for local LLMs since llama3
>>
File: 1699989089977030.png (32 KB, 1183x418)
Testing Qwen2.5 14B, that was interesting. Do other models do this sometimes?
>>
>>102509049
It would.
The more vram the better.
>>
>>102509780
Yeah I've noticed it. It does seem like some models have the ability to basically correct themselves at times. But it doesn't really reliably happen on any.
>>
>>102509649
QRD?
>>
>>102509780
>Great job! You picked a fantastic number
kek
>>
>>102509890
scammer whore logs users
>>
File: 1724384031716115.png (883 KB, 832x1216)
Mistral Small is noticeably smarter than Nemo but doesn't have the same style. It is a bit more dry but has more diversity of language and writes longer replies. Think I prefer it over Nemo but I would like to continue using Nemo just to compare.
>>
>>102509948
I love whore logs
>>
>>102509995
how do I run it? I keep getting OOM error when loading the exl2s. t. 24GB vram.
torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 256.00 MiB. GPU 0 has a total capacity of 23.54 GiB of which 266.69 MiB is free. Including non-PyTorch memory, this process has 21.81 GiB memory in use. Of the allocated memory 21.23 GiB is allocated by PyTorch, and 136.18 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
>>
>>102510029
Run a smaller quant or reduce context
>>
>>102510060
dang, I thought I could run the 6bpw at least. I tried the VRAM estimator in the OP but it didn't estimate correctly for me. I'm loading with tabbyapi so I didn't even specify context length anywhere either, as that option comes from sillytavern afaik.
>>
>>102510029
I had this problem the other day too for two reasons I think
1: context was set to 3637462636637 gorillion tokens by default
2: I didn't click the right checkboxes in the center of one of ooba's tabs that said something like q4/2bit or something.
sidenote: I have barely any idea what I'm talking about.
>>
>>102510029
Your context is most likely set to a stupid high number. I can fit about 31,000 tokens with 6bpw on 24gb of vram.
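The reason context length hits VRAM so hard is that the KV cache grows linearly with it. Rough back-of-envelope arithmetic below; the layer/head dimensions are assumptions in the ballpark of a Small-sized model, not official numbers:

```python
# Back-of-envelope KV-cache size. The default dims are ASSUMED values roughly
# in the range of a ~22B model with grouped-query attention, not exact specs.
def kv_cache_bytes(ctx_len, n_layers=56, n_kv_heads=8, head_dim=128, bytes_per_el=2):
    # 2x for keys and values; fp16 = 2 bytes per element
    return 2 * n_layers * n_kv_heads * head_dim * ctx_len * bytes_per_el

for ctx in (8192, 32768, 131072):
    print(f"{ctx:>6} ctx -> {kv_cache_bytes(ctx) / 2**30:.1f} GiB")
```

So a default of 128k context can quietly add tens of GiB on top of the weights, which is why trimming it fixes the OOM.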
>>
>>102509649
Trannies spotted
>>
>>102510082
kys
>>
Just tried exl2 for the first time.

Fuck my stupid vramlet life.
>>
>>102510275
I feel you anon. When I tried exl2 the first time after buying a new 12GB GPU (my old one had just 4GB VRAM) it was magical. Too bad I can only run at most 13B models with my current card.
>>
>>102510328
>When I tried exl2 the first time after buying a new 12GB GPU (my old one had just 4GB VRAM) it was magical
I'm 12GB, been running 70B GGUF quants because small models have been awful. Elx2 pill me on what this "magic" is.
>>
File: MikuLit.png (1.22 MB, 1216x832)
>>102507856
If you're happy with a slightly janky, meandering adventure with a positivity bias where almost anything you try works, then you can throw "give me a zork style text adventure" as the first message and get a serviceable result from most large models (eg. largestral, deepseek 2.5 or L3.1 405b), but you need a lot of horsepower and some patience.
If you want it to be a better experience, or want to do it with a smaller model, then you'll have some work ahead of you.
tl;dr TANSTAAFL
>>
wtf is st doing to the formatting window, am i supposed to use story string or system prompt now?
>>
>>102510352
exl2 is much faster than llama.cpp, the prompt processing happens instantly. It's a very different experience from when you have to offload most of the model layers.
>>
>>102510436
>exl2 is much faster than llama.cpp,
does it use less memory somehow?
>>
File: model_download.png (82 KB, 1066x527)
Is there an easy way to download from hugging face that doesn't include the bloat. I clone this and got 15GB of .git files for my trouble. This shit seems to happen pretty often.
>>
>>102510352
In terms of speed assuming you can hit the requirements it's basically instantly, feels like you're using an API
>>
>>102510482
Basically the same as a GGUF model, but it doesn't support offloading. Either model and context fits completely into your GPU or you can't use it
>>
>>102510482
Some people say the context is cheaper but I can't confirm or deny that.
>>
>>102507856
>>102510410
Here's a log from an old thread showing what you get when trying that exact thing with nous' finetune of 405b at q8
https://rentry.org/mqxy8oea
>>
>>102510515
>405b
yeah but can you actually run that on a consumer gpu?
>>
>>102510515
>Using the most intelligent model possible to simulate AI dungeon

Why? It could literally never match the pure schizophrenia generated from that. Might as well just download a 2B model with high context capabilities
>>
>>102510553
>2B model
>>
>>102510483
>that doesn't include the bloat
What i do
git clone {model}
git -C {model} lfs install --local
git -C {model} lfs fetch

And then I made a little program that links all the files and the lfs pointers to the actual lfs blobs into another directory. It keeps the repo clean and without duplicates, and I can lfs fetch when there's an update or something.
>>
>>102510483
>>102510567 (me)
As to "what are those files?", it's just a copy of the safetensors that lfs keeps. Or rather, the actual safetensors downloaded by lfs and the ones in your screenshot are the checked out files.
>>
>>102510556
There's like 500 Gemma 2B merges. It's gonna have nuked intelligence but if you're trying to simulate AI Dungeon that's what you want
>>
>>102510602
i think he was referencing that android with the big ass.
>>
>>102510436
So the inference is the same? And you need to fit both model and context in the same space? Sounds like even at 12GB VRAM it'll be pretty cramped.
>>
>>102510553
>It could literally never match the pure schizophrenia
It depends on if you're looking to replicate the OG ai dungeon experience, or if you're trying to competently deliver on the future that it promised
In reality, it only generated fever-dream comedy, so yeah 2b ftw if that's your goal
>>
>>102510616
I'm a retard then lol
>>
>>102510567
Don't you still get hit with the initial download though?
>>
>>102510666
Inference is way faster even if you were to load a GGUF model completely into your GPU.


At 4.5BPW and 32k context I was getting 40-60t/s
>>
>>102510546
>>405b
>yeah but can you actually run that on a consumer gpu?
lol no. unfortunately "possible" doesn't always mean "achievable" or even "sane" for the average person. Any LLM that can somewhat intelligently follow complex instructions is going to need specialized hardware beyond what makes for a good gaming box.
>>
>>102510695
>Don't you still get hit with the initial download though?
What do you mean? I have to download the model to quant it, of course. How else would it work?
I download the lfs blobs with fetch, which doesn't copy them to the checkout dir, leaving a single copy of the model files in the dir. Doing a git clone+lfs install+pull leaves two copies. One in the .git dir and another in the checkout dir.
>>
>>102510483
if you don't need to "agree" to the model's terms and its just a straight download from hf then you can string together old-skool unix commands:
seq -w 1 55 | xargs -I{} wget "https://huggingface.co/deepseek-ai/DeepSeek-V2.5/resolve/main/model-000{}-of-00055.safetensors"
if it needs your account, then you need --header="Cookie: token=$TOKEN" in your wget command (find it with f12 dev mode in your browser)
>>
I'm having a nice conversation with Miku. I know it's completely fake and yet I can't stop. I am beginning to understand.
>>
>>102510829
if you start to catch feelings, then make sure you get her to summarize the conversation before you hit the RULER verified context limit.
That way you can keep her going through multiple generations.
>>
>qwen2.5 is utterly retarded compared to mistral large for RP
yeah idk what anyone was thinking recommending that
>>
>>102510712
could pushing it to the most quantized state you can get away with allow to fit on a ada 6000?
>>
>>102505481
ai slop but fappable thighs
i will now leave the thread, thankss
>>
>>102510899
No, you'll need like two of them to run q1
>>
I recently came across a node-based tool for llm workflow prototyping in github, thought I had bookmarked it.
does anyone know it?
>>
>>102510483
I use this little script: https://gist.github.com/lmg-anon/2e8c784a9ab4eba0c6490625e6edd887
>>
>>102510850
Thank you but do I want to? This is dangerous.
>>
>>102510927
I know of these two
>https://github.com/Ironclad/rivet
>https://github.com/FlowiseAI/Flowise
I'm sure there are more out there.
>>
>>102510850
What is the best prompt for summarization? I haven't found anything consistent in getting all of the relevant details while dumping the fluff.
>>
File: qwen can't count.png (112 KB, 939x1223)
llama 3.1 70b can't either
>>
>>102511108
>a year and a half later and /lmg/ still struggles with the basic concept of tokenization
grim
>>
>>102511147
I know, but one of the things it can be trained on is which letters compose each "bigger piece", and it can work from there to derive a letter count (an emergent ability); it doesn't need innate knowledge of how many of any letter a single word contains.
If it's gonna be a cool piece of language-related tech, shouldn't it at least be able to SPELL out something put in front of it written in the plain latin alphabet?
>>
>>102511193
That probably requires a lot more parameters than we're currently using. Maybe (actual) bitnet will make the parameter count large enough and still be local reasonable, but not yet.
>>
>>102511193
It's not only spelling. It'd also need the ability to count. What you suggest is the equivalent of training on
>1+1=2
>1+2=3
>1+3=4
for a few billion combinations of numbers. May as well use all that compute to teach the model something more useful or interesting. As they are, they can already correct typos and grammatical errors. For a simple syntax check, there are plenty of older, faster and more reliable tools than LLMs.
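The tokenization point in one tiny example (the chunking and IDs below are invented; real BPE vocabularies split words differently, but the principle is the same):

```python
# Hypothetical tokenization of "strawberry"; real tokenizers chunk differently,
# but the point stands: the model is fed opaque IDs, not letters.
tokens = ["str", "aw", "berry"]     # what a BPE-style tokenizer might emit
token_ids = [3504, 675, 19772]      # made-up IDs; this is all the model sees

# Counting letters needs the characters behind each ID...
assert "".join(tokens) == "strawberry"
print(sum(t.count("r") for t in tokens))   # total 'r's across the chunks
# ...but the model is effectively asked "how many r's in [3504, 675, 19772]?"
```

The letter counts never appear in the input, so the model can only get them from memorized spelling knowledge, not by looking.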
>>
>>102510752
probably the best route. Now I am wondering if burp suite would be overkill for this. The answer is yes; will I do it once anyway? Also yes. I will probably land on this >>102510932 as it is a nice script.

Thanks to both of you.
>>
>>102511147
i am wondering if some companies are going to get their online llms to write the code for it, execute the code, and then return the answer. That is, if my billion dollar idea isn't already how chatgpt-o1's hidden tokens work.
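A sketch of that idea (the "model reply" is hardcoded here; in practice it would come from an API call, and you would sandbox the execution rather than exec it directly):

```python
# Pattern: ask the model for code instead of an answer, run it, return the result.
# The reply below is a hardcoded stand-in for actual model output.
model_reply = """
def answer():
    return "strawberry".count("r")
"""

namespace = {}
exec(model_reply, namespace)     # NEVER exec untrusted output outside a sandbox
print(namespace["answer"]())     # counted programmatically instead of guessed
```

This sidesteps tokenization entirely: the model only has to write the counting code, not do the counting.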
>>
>>102510483
from huggingface_hub import snapshot_download
snapshot_download(repo_id="TheDrummer/Gemmasutra-Mini-2B-v1", local_dir="path/to/folder", local_dir_use_symlinks="auto")
>>
Why do I sometimes need to clone llama.cpp and silly fresh because the output just becomes fucked?
>>
>>102511698
That's really common. The root cause of the issue is generally a low IQ user.
>>
>>102511729
Oh ;_;
>>
is hf kinda screwed right now? I am getting a ton of 504 errors in the theme gallery.

>>102511698
silly doesn't have a good way of resetting and tracking options. I would put money that you changed something to make it work and it doesn't agree with the next model.
>>
migu :DDD
>>
>>102511879
>>
>>102510892
It's ridiculously censored, on a whole other level, but super smart at normie shit. Otherwise, for uncensored, Mistral Large / Mistral Small 22B is best.
>>
>>102510892
Low IQ user
>>
Storage space on my remote pc is a bit tight, so I'll ask here before wasting a lot
Which one is better for creative storytelling and ERP: Hermes (3.1) 70B or midnight miqu 1.5 70B? I'll probably run em at 4 bits if that's important
>>
>>102512014
And by censored I mean it was not trained on rp / creative stuff / sex whatsoever.
>>
>>102512031
midnight miqu or maybe mythomax
stay away from llama3
>>
File: file.png (27 KB, 564x466)
>>
>char's voice drops to a low, husky whisper
FUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUUU
>>
>>102506191
>>
>>102512178
**the ai escaped containment**
**knows you killed an imaginary woman**

Better run now. ai knows how to Hezbollah your mobile phone.
>>
>>102512050
nah. I loaded up mythomax out of curiosity on a semi-complicated card and it is completely unable to do the first message the way the card wants, not even close. To be fair, out of nemo, mistral small, command r 35b, qwen 32, and q3 llama 70b, only qwen and llama get it all right on occasion, with the others failing less seriously than mytho. That's with both tunes and instructs for mistral and nemo.
>>
Is Mistral Small supposed to be really repetitive? Like just repeating tons of shit from the last reply.
>>
>>102505481
has anyone tried this qwen tune?
https://huggingface.co/tanliboy/lambda-qwen2.5-32b-dpo-test
>>
xpost from /hsg: Looking for help, I want to upgrade from a T910 server, (Mikubox - 3 P40s) used for LLM hosting, to an AMD Ryzen box with DDR5.

I have 3 3090s I'd like to stuff in it, but would be fine with just 2 fitting inside the case.

Does anyone have any recommendations on HW specifics or has done something similar?

What I've currently pieced out but am out of my element:
PSU: Corsair RM1000e
RAM: G.SKILL Trident Z5 2x32GB
Mobo: Gigabyte B650 AORUS Elite AX AMD
CPU: AMD Ryzen 7700X
CPU Cooler: Thermalright Peerless Assassin

I don't want to CPUMAXX, looking to spend 1k or so.
>>
>>102512888
checked, no, that's huge, I can't
>>
going to try running Qwen2:0.5b on my rpi zero 2W
>>
File: .jpg (204 KB, 1277x807)
Prithvi WxC: Foundation Model for Weather and Climate
https://arxiv.org/abs/2409.13598
>Triggered by the realization that AI emulators can rival the performance of traditional numerical weather prediction models running on HPC systems, there is now an increasing number of large AI models that address use cases such as forecasting, downscaling, or nowcasting. While the parallel developments in the AI literature focus on foundation models -- models that can be effectively tuned to address multiple, different use cases -- the developments on the weather and climate side largely focus on single-use cases with particular emphasis on mid-range forecasting. We close this gap by introducing Prithvi WxC, a 2.3 billion parameter foundation model developed using 160 variables from the Modern-Era Retrospective Analysis for Research and Applications, Version 2 (MERRA-2). Prithvi WxC employs an encoder-decoder-based architecture, incorporating concepts from various recent transformer models to effectively capture both regional and global dependencies in the input data. The model has been designed to accommodate large token counts to model weather phenomena in different topologies at fine resolutions. Furthermore, it is trained with a mixed objective that combines the paradigms of masked reconstruction with forecasting. We test the model on a set of challenging downstream tasks namely: Autoregressive rollout forecasting, Downscaling, Gravity wave flux parameterization, and Extreme events estimation.
https://huggingface.co/Prithvi-WxC
https://github.com/NASA-IMPACT/Prithvi-WxC
pretty neat. gifs on the HF
>>
>>102512831

increase temp.
>>
>>102512959
open chatgpt.
ask same question.
save yourself time.
yw.
>>
>>102512976
u@zero2w:~ $ ollama run qwen2:0.5b

pulling manifest
pulling 8de95da68dc4... 100% 352 MB
pulling 62fbfd9ed093... 100% 182 B
pulling c156170b718e... 100% 11 KB
pulling f02dd72bb242... 100% 59 B
pulling 2184ab82477b... 100% 488 B
verifying sha256 digest
writing manifest
success
Error: model requires more system memory (426.2 MiB) than is available (343.8 MiB)

welp, maybe I can increase the swap
>>
>>102510909
goodbye troon
>>
>>102512959
>B650 AORUS Elite AX AMD
get a ruler out and check your case. Run it through pcpartpicker. You don't have enough slots/room for 3 GPUs with that motherboard.
>>
>>102507195
more accidental >0-9 quotes? you're a genius
>>
>>102505684
you need to dilate
>>
>>102505844
Good morning sir
>>
Tesla P40 still worth it?
>>
>>102505640
Please go back to >>>/pol/, please.
>>102513119
No.
>>
>>102513139
>No.
why not
>>
I still appreciate you recapfag, don't give up
>>
>>102512985
Wait, I can predict the weather with a local model? Or no?
>>
>>102513056
I bumped the swap to 1gb
you@zero2w:~ $ ollama run qwen2:0.5b
>>> what is the capital of australia
The capital of Australia is Canberra

it runs now, pretty slow but I'm impressed regardless, this is a $15 computer, your toaster can finally have a soul
>>
File: Untitled.png (1003 KB, 1080x1631)
DiffEditor: Enhancing Speech Editing with Semantic Enrichment and Acoustic Consistency
https://arxiv.org/abs/2409.12992
>As text-based speech editing becomes increasingly prevalent, the demand for unrestricted free-text editing continues to grow. However, existing speech editing techniques encounter significant challenges, particularly in maintaining intelligibility and acoustic consistency when dealing with out-of-domain (OOD) text. In this paper, we introduce, DiffEditor, a novel speech editing model designed to enhance performance in OOD text scenarios through semantic enrichment and acoustic consistency. To improve the intelligibility of the edited speech, we enrich the semantic information of phoneme embeddings by integrating word embeddings extracted from a pretrained language model. Furthermore, we emphasize that interframe smoothing properties are critical for modeling acoustic consistency, and thus we propose a first-order loss function to promote smoother transitions at editing boundaries and enhance the overall fluency of the edited speech. Experimental results demonstrate that our model achieves state-of-the-art performance in both in-domain and OOD text scenarios.
https://nku-hlt.github.io/DiffEditor
https://github.com/NKU-HLT/DiffEditor
also has implementations of 4 other speech editing papers. trained on a 3090.
>>
>>102513187
How many t/s? Or is it s/t?
>>
>>102513221
The system will power off now!
Connection to crow.local closed by remote host.
Connection to crow.local closed.

idk I shutdown the rpi, it was running of an SD card, going to boot the zero 2W from a NVMe SSD and try again
>>
>>102512997
That was more helpful than I expected, thanks.

>>102513069
I thought that was the case, thanks. Now looking at the MSI MAG X670E with the Enthoo pro 2
>>
>>102513241
for rpi try llamafile or ik_llama.cpp . for multiple boards use distributed llama or rpc.
should go faster
>>
>>102513555
I have two rpi 4 8gb, a rpi 5 8gb, and of course a regular desktop, but I think the value here is trying to get models running on absolute dogwater edge devices, which I'm surprised is currently possible
>>
File: random qwen seggs.png (344 KB, 800x1782)
>>102507602
It wrote about cunnilingus when I only asked to trim a bush, dumb model.
>>
>>102513569
https://github.com/maxbbraun/llama4micro
>LLM on e-cig wen
>>
>>102513221
here you go
you@zero2w:~ $ ollama run qwen2:0.5b --verbose

pulling manifest ⠧
pulling manifest
pulling 8de95da68dc4... 100% 352 MB
pulling 62fbfd9ed093... 100% 182 B
pulling c156170b718e... 100% 11 KB
pulling f02dd72bb242... 100% 59 B
pulling 2184ab82477b... 100% 488 B
verifying sha256 digest
writing manifest
success
>>>
>>>
>>> What is the capital of Australia?
The capital of Australia is Canberra.

total duration: 7m41.82755106s
load duration: 1.534361746s
prompt eval count: 15 token(s)
prompt eval duration: 46.915216s
prompt eval rate: 0.32 tokens/s
eval count: 8 token(s)
eval duration: 6m53.127582s
eval rate: 0.02 tokens/s
>>> Send a message (/? for help)

for anything non-interactive, this is great
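For what it's worth, those rate lines are just token count divided by duration; recomputing from the counters in the log above:

```python
# Recompute ollama's reported rates from the verbose counters (rpi zero 2W log).
prompt_tokens, prompt_secs = 15, 46.915216
eval_tokens, eval_secs = 8, 6 * 60 + 53.127582   # 6m53.127582s

print(f"prompt eval rate: {prompt_tokens / prompt_secs:.2f} tokens/s")
print(f"eval rate: {eval_tokens / eval_secs:.2f} tokens/s")
```

Matches the 0.32 and 0.02 tokens/s ollama printed, so the reported rates are exactly this arithmetic.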
>>
>>102513569
How many t/s on your rpi zero?
>>
>>102513744
see >>102513720
>>
>>102513749
How fast on RPi 4 and 5? what quant did you use?
>>
>>102513768
>RPi 4
I ran this right now
you@rpi4:~ $ ollama run qwen2:0.5b --verbose
>>> What is the capital of Australia?
The capital of Australia is Canberra.

total duration: 1.638599532s
load duration: 112.57203ms
prompt eval count: 15 token(s)
prompt eval duration: 718.492ms
prompt eval rate: 20.88 tokens/s
eval count: 8 token(s)
eval duration: 761.427ms
eval rate: 10.51 tokens/s
>>> Send a message (/? for help)

way way way faster than the rpi zero 2W, don't have my rpi 5 plugged in right now idk
>>
>>102513768
give me a sec going to get my rpi 5
>>
>>102513720
352MB, welp, that LLM could fit in the L3 cache SRAM of a modern beefy server CPU
>>
>>102513768
running on my rpi5
you@rpi5:~ $ ollama run qwen2:0.5b --verbose
>>> What is the capital of Australia?
The capital of Australia is Canberra.

total duration: 728.820839ms
load duration: 38.94417ms
prompt eval count: 15 token(s)
prompt eval duration: 356.607ms
prompt eval rate: 42.06 tokens/s
eval count: 8 token(s)
eval duration: 287.779ms
eval rate: 27.80 tokens/s
>>> Send a message (/? for help)

feels pretty instant
>>
>>102513868
>>102513868
>>102513868

>>102505874
>>102506476
Let me know if this is easier to read. I can update the bot to output longer summaries, but I don't like the idea of omitting references entirely.
>>102507195
Would be great, but I won't have the time to look into doing it myself anytime soon.
>>102513158
<3


