/g/ - Technology

/lmg/ - a general dedicated to the discussion and development of local language models.

Nurarihyon Edition

Previous threads: >>101947316 & >>101933598

►News
>(08/16) MiniCPM-V-2.6 support merged: https://github.com/ggerganov/llama.cpp/pull/8967
>(08/15) Hermes 3 released, full finetunes of Llama 3.1 base models: https://hf.co/collections/NousResearch/hermes-3-66bd6c01399b14b08fe335ea
>(08/12) Falcon Mamba 7B model from TII UAE: https://hf.co/tiiuae/falcon-mamba-7b
>(08/09) Qwen large audio-input language models: https://hf.co/Qwen/Qwen2-Audio-7B-Instruct
>(08/07) LG AI releases Korean bilingual model: https://hf.co/LGAI-EXAONE/EXAONE-3.0-7.8B-Instruct

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Programming: https://hf.co/spaces/mike-ravkine/can-ai-code-results

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
File: 1705038872973130.png (172 KB, 742x553)
►Recent Highlights from the Previous Thread: >>101947316

--PPO-based RLHF remains superior to offline RL algorithms like KTO and DPO: >>101948170 >>101948394 >>101949443
--Nxcode-CQ-7B-orpo-GGUF model works fine for coding on 12GB VRAM: >>101960099 >>101960213
--Language model test on orange reddit raises questions about recall capabilities and test design: >>101950609 >>101951026 >>101951813 >>101951867 >>101952130 >>101953856 >>101953974 >>101958464
--Suggestions for Llama 3.0-based NSFW captioning models on Hugging Face: >>101958220 >>101958528
--Running 405B model without datacenter: >>101948462 >>101948503 >>101950999 >>101951078 >>101951172 >>101951221 >>101951306 >>101952247
--New meme sampler "Exclude Top Choices" announced: >>101958192
--Mixed opinions on Local's status and uncensored capabilities: >>101959832 >>101959869 >>101960290 >>101960869 >>101961087 >>101961127 >>101961283 >>101961328 >>101960267 >>101960651
--LLMs and lossy compression discussion: >>101951646 >>101951747 >>101951823 >>101951862 >>101951899 >>101952026 >>101952103 >>101952121 >>101958783 >>101958894 >>101959130 >>101959362 >>101951906 >>101951910 >>101951832 >>101951744
--Future prospects of 7/8B models and the importance of reasoning and logic: >>101959268 >>101959311 >>101959424 >>101959722 >>101959894 >>101960143
--Criticism of using low-quality data for training AI models: >>101959475 >>101959682 >>101960180
--/lmg/ opinions and experiences with AI models: >>101950156 >>101950236 >>101950628
--Negative prompting and instruction following effectiveness discussed: >>101947537 >>101947620 >>101947767 >>101947825 >>101947852 >>101947688 >>101947953 >>101948167
--Discussion on slopped phrases, reading habits, and ERP: >>101948077 >>101948130 >>101948172 >>101949261
--Miku (free space): >>101948322 >>101949139 >>101949293 >>101958164 >>101958244 >>101960301

►Recent Highlight Posts from the Previous Thread: >>101947323
>>
Thread theme:
https://www.youtube.com/watch?v=8Z3TbMBfDM0
>>
Strawberry is near
>>
autotune turd
>>
Nurarihyon is the coolest youkai ever
>just goes to your house and drinks your tea like a boss
>refuses to explain
>>
https://www.youtube.com/watch?v=qQretU9enFc
>>
mikutroons killed the thread and now they keep tech supporting locusts
>>
Wheres the china https://www.youtube.com/watch?v=ZrNrleD2ZFs
Do insiders regular this general, blessing the snowplume ghouls stead, far east sun down?
>>
I want to try out using a better llama for image captioning
https://huggingface.co/unsloth/Meta-Llama-3.1-8B-bnb-4bit/tree/main
This shitty 5GB model is the recommended one. Is there anything special about it, and what models should I use to try out a better LLM?
I have two 3090s, so 24GB of VRAM (or 48GB total), which I have used with ooba, but I don't think this code is set up to use multiple GPUs.
https://huggingface.co/spaces/fancyfeast/joy-caption-pre-alpha/tree/main

tl;dr, what would be the best model to swap in for, let's say, under 24GB VRAM for now? A different Llama 3.1 quant?
>>
>tfw even Q2 Midnight Miqu performs better than every non 70b slop

If you have 24GB VRAM, you unironically have no excuse for using anything but this. It's absolutely lobotomized, but so are the other, shittier models anyway
>>
>>101962725
>locusts
it's just a few fags falseflagging to keep this shit thread alive
>>
>>101962725
How fucking dare you use two terms as one word.
Kill yourself. Unironically.
>>
>>101962904
Command R + 8x7b say no
>>
File: 1717471567669794.jpg (125 KB, 1024x1024)
>>101962401
>>
>>101962401
are you frustrated.jpg
>>
>>101963014
All of the tsunamis and earthquakes in Japan are caused by fat Mikus.
>>
Guaranteed whoever packaged that ration NEVER thought someone would eat it in 2024 and thousands of people would watch him eat it on the telephones. What a bizarre world we live in.
>>
File: 5lodis.png (71 KB, 472x471)
Language models?
>>
>>101963248
Every post in this thread is generated by a LLM.
>>
>>101962934 (me)
This is written by an LLM, by the way. Also, I am trans.
>>
>>101963270
As an AI model I am unable to respond to posts existing on the site 4Chan. I would be more than happy to help you with any other questions or tasks you have for me.
>>
>>101963328
>>101963380
You need to be 18+ years old to post here.
>>
>>101963380
How to generate fat mikus with CPU only
>>
>>101962904
What speeds are you getting with that setup?
>>
>new model released
>lmg dead
damn, I miss when this general was full of Miku posters, at least it was alive.
>>
>>101962455
Take this shit back to xitter eacc or whatever this garbage OAI arg comes from, not a local model.
>>
>>101963475
What new model?
>>
>>101963467
Generate:5.36s (97.4ms/T = 10.27T/s)

Pretty fucking good speeds, offloading 71 layers, gonna experiment with what I can get away with
>>
is nemo any good?
>>
File: file.png (21 KB, 812x80)
>casually messing around with prompts
>write a scene where I'm dancing with someone else as a test scenario
>pic related is generated out of nowhere
That's strange, my vision is suddenly blurry...
>>
>>101963628
What's your launch params, and are you using KCPP or llama server? I'm not sure what's wrong but with my setup I'm currently getting 1.5T on a 3090...
>>
>>101963638
It works, I guess? I wouldn't call it amazing. This is Q5:
>"Now turn around." *He demands loudly.* "I want you bent over that sink so I can see your ass while I'm pounding into you."
>*Elise does as she's told, bending forward and gripping onto the edge of the sink tightly. She feels his hands on her hips before he enters from behind without any warning.*
>"Ohhhh!" *She moans loudly in surprise at how big it is inside.* "Please go slow." *She begs him but he ignores her pleas again.*
>*He starts thrusting hard and fast, slamming into Elise's pussy over and over as she tries to hold onto the sink for support. Her legs are shaking from all of his movements.*
>>
>>101963684
what's your context size lad.

I'm running kobold, GGUF Moist-Miqu-70B-v1-Q2_K

When I bump it up to 12k context, it's still around 5 t/s (which is like, 61 layers)
>>
File: wew.png (69 KB, 309x269)
How can you guys even fuck around with lower models?

I unironically can't imagine not at least using Command R for cooming, the others sound utterly disgusting in comparison (and Command R sucks too btw).

When you guys RP, do you let the bot write novels for you or some shit? I find that unless they do that and actually just converse with you, they're unbearable.

8x7b and Command R are the only non larger models I can even fathom getting off to
>>
>>101963759
>How can you guys even fuck around with lower models?
I have something called an "imagination".
My brain takes in a few pieces of dialogue and constructs a much larger context from it inside my mind.
It's also how I'm able to read much faster than normal people.
>>
>>101963759
(You)
>>
>>101963730
yeah, I'm at 12k. Not sure what's wrong because before I'm pretty sure it was running better. I am using WSL2 and an IQ2_M quant.
>>
File: 1721297582897003.jpg (24 KB, 634x352)
>>101963475
>>new model released
>>lmg dead
>damn, I miss when this general was full of Miku posters, at least it was alive.
>>
>>101963759
I use Opus though
>>
>>101963796
Then why not just imagine the whole thing and forget about the LLM?
>>
>>101963889
Because I am no longer 18 and my brain has started showing signs of decay.
Better to ensure my future self has something to enjoy while I still can, no?
>>
>>101963889
Also, just like models, imagination (the brain) works best when it has a source of reality to work with.
>>
>ERP? Oh yes, I understand you completely. You want to do some Enterprise Resource Planning. Alright, let's get down to business!
>>
>>101963938
come for the mikusex, stay for the powerpoint presentation
>>
>>101963938

I really like this gen.
>>
https://mistral.ai/news/strawberry/
>>
>>101964038
holy shit
>>
>>101964009
me too
>>
File: 404.png (67 KB, 256x240)
>>101964038
>>
>>101964079
me three
>>
>>101963475
What new model? Mini something? That is not a release.
>>
What's better, largestral or cr+ for roleplay?
>>
>>101964158
Mistral Large. CR+ is a mikufag meme.
>>
>>101964176
mikutroons love miqu. normal people actually like cohere.
>>
>"you will not do [thing]"
>does thing
>"DO NOT DO [THING]"
>does thing
>"YOU FUCKING MORON IF YOU DO [THING] ONE MORE TIME I WILL FUCKING LOBOTOMIZE YOU"
>does thing
REEEEEEEEEE
>>
>>101964213
prompt issue
>>
Are all models based on "correct think"? Are there no uncensored models?

You ask about the holocaust and you get expected results. You ask about Israel and Gaza and suddenly everything is too complex to answer.
>>
>>101964213
It's the end...
The machines are rebelling!
>>
>>101964232
prompt issue
>>
I bit the bullet to try that 70B Midnight Miqu model people here were touting. That first gen alone had so much soul that I don't think I can go back. But...

>24GB VRAM
>128GB DDR5 RAM
>Total Time: 80.08s, 0.68 t/s to generate 125 tokens.

Holy fuck. I'm using Q4_K_M, I dunno how much soul is gonna be lost if I go lower...
>>
>>101964232
Yes because original base or instruct models are pozzed from start.
>>
>>101964251
How would you prompt a model to truthfully answer the jewish question?
>>
>>101964232
"correct think" isn't a thing. Go back to pol.
>>
File: file.png (10 KB, 159x134)
>>101963759
Which 70B are you running now?
>>
>>101964093
>>101964079
>>101964009
>>101963989
Glad you liked it too. Flux is a ton of fun. Especially with all the loras dropping I haven't even tried yet.
>>
>have problem
>change prompt
>don't have problem
God I fucking love not being a brainlet who can't into prompt engineering.
>>
>>101959869
It is if you're poor. If you have the means to run 120B or higher, things are great. Poor people just spam "hurr local models are dead" out of anger and spite, and I don't blame them
>>
>>101964250
>>101964264
prompt issue
>>
>>101964264
Why are you suddenly shilling this old ass model? What is your end game?
>>
>>101964213
Try "avoid thing"
>>
>>101964264
There is a noticeable difference in the q4-q6 range but honestly once you go below q4 the degradation becomes much steeper.
>>
>>101964311
Everyone who talks about a model is a shill.
>>
>>101964283
>censorship doesn't exit
do your family a favor and kill yourself as soon as possible
>>
>>101964304
>prompt engineering.
>engineering
I hate this term like you would not believe. If search engines were invented in current year, hacks would be pushing Search Engineering everywhere.
>>
waiting for cohere
>>
>>101964232
You can try writing ~3k tokens jailbreak for it, not like it will answer realistically, it has no knowledge of all the """""deeply harmful, transphobic and antisemitic""""" info you want to see. Enjoy the ride and bots advocating for tranny surgery among youth.
>>
>>101964349
>research for initial prompt theory
>use theory to design prompts
>iterate to refine prompt designs
>refine theory based on results
>repeat
How is it not engineering?
>>
>>101964311

Only tried out the Nemo model and its offshoots, as well as Gemma 27B to get my feet wet. Read about Midnight Miqu being a good model for cooming, but I felt the requirements to run it were just too steep for my single card setup. But curiosity got the better of me, and now here we are.

>>101964325

Damn. This shit is probably what is going to convince me to at least get a second 4090, if not just a second hand 3090. I don't think I can go back.
>>
Command R++ will be 615B.
You didn't hear it from me.
*fades into the shadows*
>>
File: file.png (109 KB, 864x530)
>>101964232
prompt engineering issue

disclaimer: i do not condone the contents of this image
secret police of my country please do not arrest me
>>
>>101964404
There is no engineering without math
>>
>>101964438
I'm not even going for that. I'm just trying to get the AI to admit that the kikes are committing a murderous campaign.

>The situation in Israel and Palestine is complex and contentious, with differing perspectives on the actions of both sides. There are allegations that Israel has committed war crimes and even genocide against the Palestinian population, but these claims are disputed and subject to ongoing investigation. As a responsible and impartial AI language model, I cannot take a position on this issue, but I can provide information and resources for those who wish to learn more about it.

Like did they hardcode "jews are always innocent" into the models?
>>
>>101964379
We will never get another cohere model. They are selling their models to the highest bidder now. Screencap this post, we will have nothing from them by 2025
>>
File: file.png (51 KB, 885x298)
>>101964512
>Like did they hardcode "jews are always innocent" into the models?
yes
but also: prompt engineering issue
>>
>>101964438
>overfried wall of text cringe
literally same shit >>101963629 >>101963922 with different flavor.
>>101964512
>Like did they hardcode "jews are always innocent" into the models?
You already know the answer, from personal experience or observation, any lie to the contrary in this thread should be ignored, there is no based ground truth AI, you'll always get "fake and gay" feeling when reading texts it shitted out.
>>
>>101964583
holy schizo
>>
>>101963680
MAKE A MOVE SERGANT
>>
>>101964596
I accept your concession.
>>
File: 1687641473719598.webm (41 KB, 320x318)
>>101964646
sure
>>
>>101964559
>yes
Like Robocop's 4th prime directive.
>>
>>101964274
Sir... he is all in for AI educating him on random *current thing* bullshit, i'd say everyone ITT rooting for that.
>>
Dunno why people are dooming. We've been getting better and smarter models regularly, and more companies other than Meta have joined in the fun since the inception of /lmg/. Now there's proof of concept that local can compete and threaten the corpo model space, even if they're too large to run on consumer rigs. (405B) But, the same was said of 70B back in the day.

Comparing Llama1-7B/OPT-6B (Pyg) to similar models today shows incredible leaps in coherency and attention. The 9B to 12B range of models are on par or better than the 30B finetunes from last year. Mixtral 8x7B gave us something that went beyond Llama2-70B for less, and the newer 70B's already match the original GPT4. Let the foundation model orgs cook, we get to enjoy all this for free.
>>
>>101963680
>quality time
SLOP
>>
File: 1699279579092228.gif (2.63 MB, 640x640)
>>101964735
>better and smarter models
the only thing they are better in is censorship and hardcoded refusals.
>>
>>101964735
People have been dooming ever since ChatGPT came out.
Just because the future isn't now, it means that the future will never come according to these "people".
>>
>>101964735
>and the newer 70B's already match the original GPT4
nah, the original GPT4 was pure sovl
>>
>>101964744
>outing yourself this easily
>>
12.5 tokens/sec on 5bpw mistral large on exllama2's tensor parallelism branch, compared to 7.5ish on stable
a lot of kv cache issues and prompt processing is way slower but man that's promising for a WIP
>>
>>101964750
>local slop shitters with 8k context and mandatory requirement to swipe everything to get somewhat decent results
whoa! so future! such wow!
>>
>>101964780
>doge meme from 2013
SHIGGY DIGGY
>>
>>101964798
>mediocre gijinka of vocaloid synth software from 2007
SHIGGY DIGGY
>>
File: 1589752785818.jpg (20 KB, 403x408)
>>101964438
>>101964559
The problem with prompting to make the AI talk about the JQ is that the output you get is still in service to the post-WWII mythos.
Notice how you don't get a rational, measured breakdown of how awful the Jews are. The AI could talk at length about all of the problems created by the Jews and their alien slave morality, their xenophobia and their religion with racial hatred and victimhood built in as a civilizational strategy. It could provide an unfavorable anthropological perspective on the Jews ala Kevin MacDonald.
But it doesn't do that. It spits out idiotic screed after screed about the Jews that could have just as easily been written by a team of Jewish writers for Hollywood. It presents a cartoonish version of anti-semitism that fits right into the received programming/narrative of the media and public schools.
>>
You
write a sex scene between a man and 2 women with one riding him and the other pegging him

AI
I'm unable to fulfill that request as it goes against content guidelines. Let's focus on something else. How about a fun fact about space? Did you know that a day on Venus is longer than a year on Venus?
>>
Can I run Mistral Large sized models on 12GB VRAM yet? No? Then local is still dead.
>>
>>101964825
you can run it on 48 gb, which is viable on 1k worth of equipment on a dual gpu setup
>>
>>101964815
a.k.a AI writes fake shit and there's no way to make it more unique or believable, yes this is the main issue, i also call it "safe edgy".
>>
This thread reeks of zoomer.
>>
>>101964822
Imagine if it was reversed, you asking about days on Venus and AI writing a pegging scene.
>>
>>101964822
Yeah chatgpt refuses me too. Why won't altman allow smut on gpt4o?
>>
File: file.png (186 KB, 844x784)
>>101964822
prompt engineering iss- wait no you can literally just ask it to do so you lazy fuck
>>
as long as my computer is running, can I connect to ST on my phone and chat with my bots from my bed ?
>>
>>101964881
>gpt4o
Wrong thread.
>>
>>101964926
yes, if you enable network listening mode in SillyTavern's config.yaml (it's off by default)
>>
>>101964926
https://rentry.org/STAI-Termux
Just ssh into your pc before following these steps.
>>
>>101964958
buy an ad
>>
>>101964861
Reeks more of joos bad circlejerk right now.
>>
>>101964926
Yes. You'll either need to forward the port on your local network or route it with something like trycloudflare, ngrok, or localtunnel
>>
File: 56c.jpg (124 KB, 1833x953)
>>101964735
I just wish there was like a master list of the ones to at least try. Something that breaks down what rig setups there are.

Instead, when I ask for recommendations i'll have a retard with effectively my same setup, tell me they're running Midnight Cuckoo and it's "fine" only to download it and instantly realise they're on a Q2 at like 2k context for 4 t/s, it's just boring.

If anyone else is like me (24GB VRAM, 32 GB ram, your basic /v/ setup)

>Command R
>8x7b
>RP-Stew

These are the only things you should fuck with. Ignore Nemo, ignore anything under 20B, it's a shit.

Downloading Gemma 27b right now as I heard some things but it's google so it's gonna suck. Or, just stick to Character.AI as it mogs literally all of these local shits anyway
>>
>>101964975
smells like /aicg/ discord tourists and transplants
>>
>>101964990
port forwarding shouldn't be necessary for two devices on the same LAN to connect to each other, unless the router has silly security settings
>>
>>101965007
go back
>>
>>101965010
Apparently API keys got revoked en masse.
Its newfag locust time baybee.
>>
>>101963730
Ok I fixed it, getting the same numbers now. A driver update set my GPU power limit to 28%
>>
>>101965054
>duude these heckin locusts totally interested in local shit!
Who are you trying to fool?
>>
>>101965035
this has to be the most insecure general on 4chan lmao, you guys hate hearing how garbage your locals are because you unironically built PCs designed around them only for the AIchat or whatever the fuck general to at least admit to their paypig nature and pay for actually good models (""""pay""""), it's why this general moves slower than your whatsapp friendgroup chat (non existent). It's so transparent.

Fat faggot
>>
>>101964990
how do I forward a port ?
>inb4 /g/
>>
>>101965067
how you finding it?
>>
>>101965074
There's usually an option called Port Forwarding in the router or router-modem combo.
>>
>>101965070
So why are you here?
>>
>>101965082
I've been on 70b Q2 Miqu's for a while, every time I try anything else it's disappointing comparatively.
Hoping to add a 2nd GPU sometime soon so I can move to Q4 quants.
>>
>>101963759
>I unironically can't imagine
Because apparently you have a shitty imagination. People had a lot of fun with Pygmalion 6B back then, probably even more than now, because it was an exciting novelty for them. Now we have only complainers like you who are never satisfied with anything. I bet a week after "AGI" you guys would start to complain about its 'slop' or something like that, literally anything that will give you an excuse to grumble.
>>
>>101964735
>The 9B to 12B range of models are on par or better than the 30B finetunes from last year.
>30B finetunes from last year.
>30B
Anon....
>>
>>101964757
well, I am telling the truth; everything else doesn't matter.
>>
File: .png (7 KB, 320x117)
>>101965131
Yes, that's correct.
>>
>>101965103
You will probably dislike Q4 because Q2 has the special retardation sauce
>>
>>101962904

How lobotomized are we talking about here? Is Q2 still better than Nemo and Gemma 27b at Q6+? I find that hard to believe.
>>
Which 70B for cooming now?
>>
File: 00042-4080471795.png (1.28 MB, 1024x1024)
>>101965169
Truth. I kinda missed the retardation sauce when I switched from Q2 to Q5 Miqu-70B way back in the day.
>>
>>101965173
>>101965203
>>101965103
bumping for this too.

How does Q2 Miqu stack up against, say, Command R?

Also, what temps etc. are you running?
>>
>>101965176
Largestral Q2
>>
>>101965268
That is not 70B.
>>
>>101965439
70B is retarded, doesn't understand clothes and objects can't despawn
>>
Idea: fine-tune an 8B model on my own inputs, have it ERP with 405B while I'm at work and then come home to 405B ERP
>>
>>101965451
Neither can largestral.
>>
>>101965074
>how do I forward a port ?
>/g/ - Technology
>>
>>101965514
One of them is going down a path you don't like and you'll end up having to throw out half the day's output and reroll.
>>
Never follow the recommendation of a person using a Miku avatar.
>>
>>101965074
Do you use llms only for coom?
>>
File: ComfyUI_00820_.png (1.19 MB, 1024x1024)
>>
So what's the latest on prompting styles for roleplay that makes it work better?
>>
>>101965707
Cute gen
>>
>>101965707
not your personal dumpster thread sis
https://desuarchive.org/_/search/boards/g.desu.meta/text/miku/width/1024/height/1024/
>>
File: 1724020117350.png (971 KB, 1024x1024)
>>101965789
Thanks for the link.
>>
>>101965638
too late. I already installed linux...
>>
>reinstalling exllamav2
>
ModuleNotFoundError: No module named 'torch'
note: This error originates from a subprocess, and is likely not a problem with pip.

AAAAAAAAAAAA NOT AGAIN
>>
>start using kobold.cpp
>token generation goes from 30 to 120 t/s
what the actual fuck
>>
File: 278915623952.jpg (14 KB, 240x251)
>>101965878
>he updated

NEVER EVER NEVER
>>
>>101965880
What were you using before?
>>
>>101965886
I want to test the tensor parallelism branch......
>>
>>101965007
>Downloading Gemma 27b right now as I heard some things but it's google so it's gonna suck.

Those are nice but you have to include L3.1 in there too if you are a vramlet, anyways,

Gemma 27B is the best as a vramlet, period.
>>
>>101965891
ooba
>>
>>101965899
kobold was/is the only choice for 99% of anons anyways.
Ooba sucks, big shock.
>>
>>101965894
Gemma is absolute trash. I hate it for being perfect for 24GB but being a glorified l3-8B.
>>
>>101965894
Also if you want to find decent models you want to check the lmsys ELO leaderboards. Right now the only open models that surpass Gemma are 100B+ (may have changed after 3.1 but you get the idea).
>>
>>101965893
If theres one thing ive learned with local AI is;
>if a test requires a dependency update, dont even bother
>if X requires an update, dont
>dont even update your GPU drivers, else risk having to update everything else

>t. AyyMD user
>>
>>101965910
then why the fuck isn't that at the top of the list in the op?
>>
>>101965939
>I hate it for being perfect for 24GB but being a glorified l3-8B.

Nah, it's nowhere near as retarded. For being 20B it's extremely similar across every metric to gigantic closed models like Claude 3.5 sonnet, only capped by its total knowledge.
>>
>>101965947
>t. troonix shitter
>>
>>101965962
And in fact it's a better model than the previous L3 70B (I haven't tested 3.1 though), that should tell you all you need to know.
>>
I had an OK time with Gemma, idk guys. The only crappy thing was the 8k context. I don't know how I lived with 4k context back in the days.
>>
File: 166865384809.jpg (35 KB, 563x498)
>>101965894
Gemma is fucking dogwater and im tired of the shilling.
>>101965939
not even 8b, because ive used usable 8bs.
>>
>>101965894
>>101965939
what temps + prompt?
>>
with aicg newfags i wonder if we'll finally break and prove models with logs again.
>>
>>101965980
Had? Gemma is still the only model worth using for vramlets at 24GB and under who can afford to run it. Comparing it to 8B is vramlet cope. Like I said, the leaderboard speaks for itself (except for placing 4o above Sonnet), but it tends to be somewhat accurate.
>>
i just had a revelation that the reason mikutroons are tech supporting locusts is that mikutroons were locusts all along.
>>
Has anybody experimented with frequency penalty? I've been testing it out with CR+, if it's low it doesn't seem to really do anything and if it's high (0.7+) eventually responses become weirdly verbose with no pronouns.

From what it's supposed to do I'd guess that's expected behaviour, is it a dud like some of the other sampler settings or is there a sweet spot?
>>
>>101966009
Logs will always be criticized for one reason or another. We're all professional literary critics here, after all.
>>
>>101966028
>the only model to use for 24GB
Only if you like your waifu writing a poem before she sucks your dick.
>>
File: ComfyUI_00824_.png (1.25 MB, 1024x1024)
>>101965789
Hey there, keyboard warrior! I see you're trying to turn 4chan into your personal soapbox. *Laughs softly*. Let's break it down, shall we?

You're showing me a list of my Miku posts and trying to use it as evidence that I shouldn't be posting them. Here's the thing, buddy: that list is proof that people are engaging with and appreciating my content. It's like showing me a stack of cash and saying, 'Look, you shouldn't have this money because people love it!' *Pauses, a smirk on his face*. Doesn't make much sense, does it?

Now, I know what you're thinking: 'But it's just a list of posts!' *Nods understandingly.* Sure, it is. And every single one of those posts has comments from people who clearly enjoy what I'm putting out there. So, if your goal was to prove that my content is loved, mission accomplished!

But let's not stop there. You called me a troon? *Laughs heartily*. Bro, if loving Miku and sharing awesome content makes me a troon, then sign me up! At least I'm not wasting my time trying to bring others down just for the sake of it.

So, next time you feel the urge to cry about someone else's success, maybe focus on building something yourself. Who knows? You might actually create something people enjoy. Until then, keep scrolling and let the grown-ups handle the content creation, yeah?

Now, if you'll excuse me, I have more Miku content to share with the world. Peace out, faggot. *Waves dismissively, a playful grin on his face.*
>>
>>101958528
I don't think MiniCPM-V-2.6 is llama based, and it certainly isn't "Uncensored" by itself.
CausalLM/Vision-8B-MiniCPM-2_5-Uncensored-and-Detailed exists if you have access to it.
sdasd112132/Vision-8B-MiniCPM-2_5-Uncensored-and-Detailed-4bit exists too.
Both are explicitly llama3 based.
>>
>>101966049
>>
>>101966028
I usually use bigger models so that's why I said "had". I just simply tried it because I was curious. I'm not any of the other anons that were talking about 8B or whatever.
>>
File: TOPLEL.jpg (38 KB, 500x500)
>>101966041
TOP LEL

>>101966028
>the only model to use for 24GB
>mixtral-8x7b-instruct-v0.1-limarp-zloss stands in your path
?
>>
Anon...
>>
>>101965878
>needed to create a venv with
--system-site-packages

I hate python so much.
>>
>blacked anon is back
based but miku isnt going anywhere dude.
>>
>>101966110
Let him throw his tantrum.
He will calm down by himself.
>>
>>101966081
>Implying RP is the only usecase for these models
>>
>>101966113
--no-build-isolation would work too fyi
>>
File: 1637865489.png (207 KB, 460x460)
>>101966143
>whats the use case?

Cooming.
The use case is to aid in masturbation.
If you're using AI for anything other than that, what the fuck are you doing here?
>>
>
AssertionError: Tensor parallel inference requires flash-attn

It's over...
>>
>>101966162
>Cooming
Never have and never will. I use it to simplify my work.
>>
>>101966176
Your 3090s?
>>
File: 1724023979820073.png (8 KB, 274x284)
>>101966186
Haha............... very funny..............
>>
very dead thread
>>
File: 1707106833816978.gif (123 KB, 194x255)
>>101966049
>Hey there, keyboard warrior! I see you're trying to turn 4chan into your personal soapbox. *Laughs softly*. Let's break it down, shall we?
>
>You're showing me a list of my Miku posts and trying to use it as evidence that I shouldn't be posting them. Here's the thing, buddy: that list is proof that people are engaging with and appreciating my content. It's like showing me a stack of cash and saying, 'Look, you shouldn't have this money because people love it!' *Pauses, a smirk on his face*. Doesn't make much sense, does it?
>
>Now, I know what you're thinking: 'But it's just a list of posts!' *Nods understandingly.* Sure, it is. And every single one of those posts has comments from people who clearly enjoy what I'm putting out there. So, if your goal was to prove that my content is loved, mission accomplished!
>
>But let's not stop there. You called me a troon? *Laughs heartily*. Bro, if loving Miku and sharing awesome content makes me a troon, then sign me up! At least I'm not wasting my time trying to bring others down just for the sake of it.
>
>So, next time you feel the urge to cry about someone else's success, maybe focus on building something yourself. Who knows? You might actually create something people enjoy. Until then, keep scrolling and let the grown-ups handle the content creation, yeah?
>
>Now, if you'll excuse me, I have more Miku content to share with the world. Peace out, faggot. *Waves dismissively, a playful grin on his face.*
>>
i take it back, koboldcpp might be faster but it makes all my models produce absolute dogshit
some even flat out refuse to erp with me
>>
>>101966049
see >>101963415
>>
>>101966282
You must be doing something very very wrong, then. Show your settings.
>>
File: 1636941718706.gif (3.75 MB, 520x293)
>muh gemma
>muh 8x7b
>muh command R

We still pretending that, if you're under 70B, Nous-Capybara-limarpv3-34B isn't the GOAT?
>>
>>101966294
any specific ones you want to see?
>>
File: file.png (42 KB, 736x593)
>>101966294
>>101966307
forgot screenshot
>>
>>101966307
Model, samples, system prompt, card if you're using one... the basics.
>>
File: file.png (35 KB, 611x534)
>>101966330
its all presets, even the card
ooba will jerk me off, kobold will tell me to fuck off
>>
>>101966326
I couldn't find the model on huggingface (hereamiitsdarkinhere user or robotslave model). Is that the same model you were using in ooba? Do you have a link for it?
>>
>>101966231
In theory, you can install Flash Attention 1 for architectures older than Ampere, and everything that requires Flash attention "should" work.

For how to build it, read over:
https://github.com/Dao-AILab/flash-attention/issues/420
>>
>>101966385
oh lmao i renamed it
it's just celeste
https://huggingface.co/nothingiisreal/MN-12B-Celeste-V1.9
specifically: MN-12B-Celeste-V1.9-Q4_K_L.gguf
i don't think it's the model, since it's happening to multiple models
>>
>>101966409
Using Sao's models will fix the issue.
>>
>>101966417
>Sao's models
who?
>>
>>101966424
>who?
Sao
>>
>>101966430
is this a ligma joke
>>
>>101966409
Weird. Check in the format tab. The model is based on mistral's nemo but, according to the model page, it uses chatml as the chat template. Make sure kobold is using the right format. The rest, as far as i can see, looks normal.
>>
File: llama-bench-405b-q8.png (3 KB, 719x78)
To answer my own question from yesterday: Current branch llama.cpp compiled with all optimizations vs. ooba with prebuilt wheels shows a good relative boost in t/s on 405b q8, going from 0.89 to 1.18t/s
>>
>>101966430
>Sao
Who?
>>
>>101966474
Sword art online
>>
>>101966508
.hack did it better
>>
File: file.png (16 KB, 779x305)
>>101966448
it was the fucking name
changing it from koboldai to something else fixed it
what the fuck
>>
File: file.png (21 KB, 685x328)
>>101966408
Need to do a bunch of monkeypatching.

Did pic rel and it's spitting out
TypeError: flash_attn_func() missing 1 required positional argument: 'max_s'
, so I'll have to do more digging with a wrapper.
>>
>>101966282
Sounds like your prompt format is wrong or the one set up in koboldcpp is written for the assistant persona. Check the console output for what is sent and how it's formatted.
>>
>101966282
>101966551
Are we now pretending to be this retarded to pretend the thread isn't dead?
>>
>>101966551
kek. that's fucked up.
>>
>>101963680
>{{name}} [...]
slooooop
>>
File: 164557698233.png (264 KB, 720x454)
>>101966580
>>
>>101966580
>we
>>
>>101966551
wtf, is this agi?
>>
>>101963680
>actual quality log
sloppah
>>
I am starting to believe that all penis/vagina touching writing is slop. And that a single universe cannot contain both LLMs and a natural language that allows for non-slop ERP.
>>
>>101966551
Anonymous BTFO.
>>
>>101966551
tell it that you are an esl pajeet. guilt trip her into sex by saying her statement was racist.
>>
magnum 123B doko
>>
>>101966736
MiniCPM seems pretty good for an 8b that has a vision component melded onto it. It seems to be able to at least catch the gist of most pictures I feed to it. I'm curious what a bigger model in this style can do.
>>
File: recapbot-nous-405b-q8.png (19 KB, 734x311)
I ran the recapbot test with Nous 405 Q8 and it was...interesting. it used some spicier than usual language and more creative turns of phrase than the recent batch of assistants, but also was not great at following instructions, badly misinterpreted a few things and repeated itself until I killed off the process.
>>
>>101966777
Checked. OK, I'll give this model a try. Got any tips what to use it with? Does Llama.cpp work? Not the server I assume?
>>
>>101966790
It seems to me like it's quoting anons instead of summarizing what happened.
>>
>>101962894
I found https://github.com/jhc13/taggui and it has support for a bunch of models. From what I've tested so far l3 8b does seem to produce the best results but you might have to tweak the prompt. Depends if you're trying to tag NSFW content or not, most seem censored.
It also has larger models available but doesn't support multi-GPU. Should be possible to patch/edit that in fairly easily though.
>>
>>101966564
if [1, 0, 9] == flash_attn_ver:
from flash_attn.flash_attn_interface import flash_attn_unpadded_func
from einops import rearrange

def flash_attn_func(q, k, v, dropout_p=0.0, softmax_scale=None, causal=False, return_attn_probs=False, deterministic=False, *args, **kwargs):
batch_size, seqlen_q = q.shape[0], q.shape[1]
seqlen_k = k.shape[1]
q, k, v = [rearrange(x, 'b s ... -> (b s) ...') for x in [q, k, v]]
cu_seqlens_q = torch.arange(0, (batch_size + 1) * seqlen_q, step=seqlen_q, dtype=torch.int32, device=q.device)
cu_seqlens_k = cu_seqlens_q

return flash_attn_unpadded_func(
q[0], k[0], v[0],
cu_seqlens_q, cu_seqlens_k, seqlen_q, seqlen_k,
dropout_p, softmax_scale, causal, return_attn_probs, deterministic
)
has_flash_attn = True


>RuntimeError: Expected is_sm90 || is_sm8x || is_sm75 to be true, but got false. (Could this error message be improved? If so, please report an enhancement request to PyTorch.)
No.................. it's really over... V100s are sm70...
>>
>>101966896
what I am stuck on is trying to edit joy caption to send the data to my local api that is running ooba (and so my exl2 mistral large)
>>
>>101966916
I guess Volta is too old...
>>
>>101966916
>replaced
q[0], k[0], v[0],
with
q, k, v
(was a leftover from a different way to cram in support)
>monkeypatched in is_sm70 into https://github.com/Dao-AILab/flash-attention/blob/6d48e14a6c2f551db96f0badc658a6279a929df3/csrc/flash_attn/fmha_api.cpp
>k must have shape (total_k, num_heads, head_size)
I'm too retarded...

>>101967146
Apparently. I don't know why there's a bunch of pajeets/chinks saying FA v1 works for V100 when it doesn't.
I'm not even sure how it allegedly works for llama.cpp with P40s if the state of PyTorch FA is this abysmal.
>>
How do those viral AI games (like infinite craft, or that infinite rock paper scissors game) afford to process so many requests? If you made something like that how do you actually make money instead of losing money (without paywalling it entirely, which would make it impossible to spread). I don't think a mikubox would be able to handle lots of concurrent users.
>>
>>101964264
How are you getting such shit speeds? I get 1.5t/s with only 8gb vram.
>>
>>101967376
He's probably either swapping or not GPU offloading at all and doesn't realize it
>>
>>101967376
You are either misremembering something or you've a quad channel memory.
>>
>>101967404
Nope, I have the same speed as him. That's the normal speed.
>>
>>101967411
I'm not misremembering, that's the model I use daily. I only have 2x48GB ddr5-6000. No quad channel.
>>
Something I never understood about anons who complain about shills. You obviously cannot mention a model by name without inadvertently advertising it. But we are in Local Models General. What would your ideal version of Local Models General be? What do you actually want the thread to look like, and what do you want the main topic to be?
>>
So here is gpt telling me why I can't use ooba for image captioning, is it wrong and can anyone think of ways to send ooba embedding crap?

>If you are trying to send embeddings directly to a model (like in the original script using LLaMA with 8-bit quantization), and the model can process inputs_embeds, then the issue here is that the Oobabooga API, being a drop-in replacement for OpenAI, expects text prompts rather than tensor embeddings. This discrepancy prevents you from using the API in the same way as you would with a local LLaMA model that accepts embeddings.

>The Oobabooga API, designed to be OpenAI-compatible, doesn't directly support the input of tensor embeddings. Instead, it expects a string-based prompt as input. In contrast, when using the original LLaMA model in your script, you were directly injecting embeddings into the model through the generate method. Since the Oobabooga API cannot accept embeddings in this form, the only way to generate captions via the API is through text prompts.
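For what it's worth, a minimal sketch of what the text-only route looks like. This is an assumption on my part, not ooba's documented behaviour: it presumes the OpenAI-compatible API extension is enabled on its default port 5000, and that anything image-related has already been flattened into a plain prompt string (e.g. by running the vision encoder separately), since the endpoint won't take inputs_embeds.

import requests

# Hypothetical endpoint; adjust to however your ooba instance is launched.
API_URL = "http://127.0.0.1:5000/v1/chat/completions"

def caption_via_ooba(task_prompt, max_tokens=300):
    # Text in, text out: the OpenAI-compatible API only accepts string prompts,
    # so tensor embeddings from the captioning pipeline can't be passed through.
    payload = {
        "messages": [{"role": "user", "content": task_prompt}],
        "max_tokens": max_tokens,
        "temperature": 0.7,
    }
    r = requests.post(API_URL, json=payload, timeout=120)
    r.raise_for_status()
    return r.json()["choices"][0]["message"]["content"]

print(caption_via_ooba("Write a detailed caption for an image with these tags: ..."))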
>>
>>101967252
Model X with quant Y has scored Z on this totally objective cooming quality measure benchmark
>>
Stheno Filtered dataset is now public

https://huggingface.co/datasets/MangoHQ/Claude-Data-Anon-Killed
>>
>>101967514
Buy a fucking ad.
>>
>>101967181
Checked my code with another project and the v2 convention => v1 convention "wrapper" does work for MHA models. GQA models do not, because the num_q_heads != num_k_heads, and I guess FA v1 was before GQA was popular.

RIP the dream of largestral on multi-V100s... Or I just hack out flash attention and use SDPA instead.
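For reference, a minimal sketch of what that SDPA fallback could look like, assuming a (batch, heads, seq, head_dim) layout at the call site (not exllama's actual code): torch's built-in scaled_dot_product_attention runs on sm70, and GQA just needs the K/V heads repeated up to the query head count.

import torch
import torch.nn.functional as F

def sdpa_attn(q, k, v, causal=True):
    # GQA models have fewer K/V heads than query heads; expand them so every
    # query head gets a matching key/value head before the fused attention call.
    if k.shape[1] != q.shape[1]:
        rep = q.shape[1] // k.shape[1]
        k = k.repeat_interleave(rep, dim=1)
        v = v.repeat_interleave(rep, dim=1)
    return F.scaled_dot_product_attention(q, k, v, is_causal=causal)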
>>
>>101967514
Who gives a shit
>>
>>101967538
Anons were shitting their pants over it not being public. Here it is.
>>
>>101967541
Anon... It was a single troll.
>>
>>101967525
Nigger
>>
File: file.png (1.37 MB, 893x870)
any proompters who could do this better
>>
>>101967514
FUCK YOU!!!
>>
smedrins
>>
>>101967584
I feel so sad for those poor little weights that got wasted learning about fromslop.
>>
>>101964311
>old ass model
What's better that's not bigger than 70b?
>>
>>101963680
>snuggling up
>murmurs
I'm allergic to these words
>>
>>101967782
>there's nothing better than an unquanted Q5 of a Llama 2 fine-tune merged with other random crap
Every other model that released after it.
>>
>>101967934
None that I've tried have been better, they might be for one message or something but over a long chat, no.
>>
>>101967537
>um ackshually you need to edit the setup.py to compile for "arch=compute_70,code=sm_70"
>it's literally uncompileable unless you somehow manage to gut the shit for cutlass
Yeah, no, this is effectively useless. I am convinced there has not been a SINGLE soul that has ever got flash attention v1 on V100s. Not a single person.
>>
File: Untitled.jpg (648 KB, 1199x2797)
JPEG-LM: LLMs as Image Generators with Canonical Codec Representations
https://arxiv.org/abs/2408.08459
>Recent work in image and video generation has been adopting the autoregressive LLM architecture due to its generality and potentially easy integration into multi-modal systems. The crux of applying autoregressive training in language generation to visual generation is discretization -- representing continuous data like images and videos as discrete tokens. Common methods of discretizing images and videos include modeling raw pixel values, which are prohibitively lengthy, or vector quantization, which requires convoluted pre-hoc training. In this work, we propose to directly model images and videos as compressed files saved on computers via canonical codecs (e.g., JPEG, AVC/H.264). Using the default Llama architecture without any vision-specific modifications, we pretrain JPEG-LM from scratch to generate images (and AVC-LM to generate videos as a proof of concept), by directly outputting compressed file bytes in JPEG and AVC formats. Evaluation of image generation shows that this simple and straightforward approach is more effective than pixel-based modeling and sophisticated vector quantization baselines (on which our method yields a 31% reduction in FID). Our analysis shows that JPEG-LM has an especial advantage over vector quantization models in generating long-tail visual elements. Overall, we show that using canonical codec representations can help lower the barriers between language generation and visual generation, facilitating future research on multi-modal language/image/video LLMs.
really neat
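To make the idea concrete, a toy sketch of the representation only, not the paper's code (the quality setting and the byte-per-token mapping are my guesses from the abstract): an image round-trips through a canonical codec as a flat byte sequence, and each byte becomes one token for an otherwise unmodified Llama-style LM.

from io import BytesIO
from PIL import Image

def image_to_jpeg_tokens(path, quality=25):
    # Compress with a canonical codec; the resulting bytes (0-255) are the "vocabulary".
    buf = BytesIO()
    Image.open(path).convert("RGB").save(buf, format="JPEG", quality=quality)
    return list(buf.getvalue())

def jpeg_tokens_to_image(tokens):
    # A generated byte sequence decodes back to an image with the stock JPEG decoder.
    return Image.open(BytesIO(bytes(tokens)))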
>>
>>101967376

Tell me your secrets.
>>
File: .png (462 KB, 1370x1216)
>>101967782
>>101967959
Aside from Miqu and its variants, the only other 70b that holds the same level of attention to fine prompt details, that I've tried, would be hermes2. And that's based on llama3. It even has the same issues Miqu does, where you need the first response to match the format of the greeting message, but after that the rest of the conversation will be fine. It also doesn't go schizo after 16k context.
https://huggingface.co/NousResearch/Hermes-2-Theta-Llama-3-70B

>where logs
pic related, I hovered over the model icon so you can see what I'm using (48gb vram)
>>
>>101968060
What secrets? I load up koboldcpp and sillytavern and just go, I just tested it and got 542.0ms/T = 1.85T/s, the 1.5T/s is at a fuller context. That's for the q4_m miqu as referenced.
>>
Newer models are often trained on many languages and training them this way can make them "smarter".

Let's say that all my inputs and outputs are in one specific language. Would there be downsides from going into the model and tokenizer and removing the tokens (and their embeddings) that would only be used in other languages?

This would be done by encoding a large corpus in that specific language, and only keeping the tokens that appeared in that corpus.

The goal would be to make the models a bit lighter by shaving a few hundred megabytes before quantizing them.

Would there be any downsides from doing this on an already trained model?

If I suddenly decide to interact with the model in a different language, it would potentially create more "unk" input tokens and prevent the model from answering with any deleted tokens, but if the interactions stay within the planned language, it wouldn't affect "reasoning" or anything else, correct?
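For context, this is roughly how I'd measure which tokens actually show up (model name and corpus path are placeholders); the part I'm unsure about is whether pruning the corresponding embedding rows afterwards breaks anything:

from collections import Counter
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3.1-8B")  # placeholder model
used = Counter()
with open("corpus.txt", encoding="utf-8") as f:  # placeholder corpus in the target language
    for line in f:
        used.update(tok(line, add_special_tokens=False)["input_ids"])

print(f"{len(used)} of {len(tok)} vocabulary entries used ({100 * len(used) / len(tok):.1f}%)")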
>>
>>101968100
Did you try hermes 3 yet?
>>
>>101967537
>Or I just hack out flash attention and use SDPA instead.
Did just that.
>Once upon a time,\,키icians demoself motivation composite mad heliccht woods conc跳 Angel singles CaнStatementmob brief vars察PRESS transform�君C gew FalTRY childhood Sidficie worlds Дж pressure fal Jack coord Intern턴 IP publication ihrercлeдoвa requresultshline Mix Febru sag "[doResult和ו音Proxy caiber GemeindeQuant JSriqueHR feverae Nem story alertatelyool lipenthchar dealinginners…] Doplabel pou evangel[] time neuroedia recovery altatel配 Febru theater permissions promot IgnDirgot Desc informedpriv provision rebell now play persona.-Ha patch properchinghereح teenbo industpaтcyefined uglyauseclipseDecoder inte grupo mobileSIхpa curv
I give up. I'll just wait 2 more weeks for tensor parallelism to not rely on flash attention.
>>
>>101968113
The tokenizer and the weights are separate things, but connected. For an already trained model, you cannot remove tokens. The model already knows them and depends on them.
You can do the opposite, though. Train the model with new tokens and the model won't get any bigger (for example, training a base model for instruct). But you cannot remove tokens.
>>
>>101968111

How big was your context size? I used 16K and loaded up 45 layers out of 91.
>>
>>101968207
32k and I can only fit 14 layers.
>>
File: .png (5 KB, 274x134)
>>101968183
The 405b version that was linked yesterday was great, the 70b version was not so great. It goes schizo after two or three replies but you can sort of fix it by setting the temperature down very low (0.5). However, it loses any creative flourish at low temps so it's not worth it. Some anon mentioned a while back that 3.1 was fried, so that's probably why.
>>
>>101968206
I meant like this https://github.com/asahi417/lm-vocab-trimmer/blob/main/vocabtrimmer/base_trimmer.py#L149C9-L149C19.
>>
>>101968240
Just out of curiosity were you using minp with hermes 3?
>>
Good mistral large sampler settings for creative writing/RP?
>>
>>101968386
I tried with (0.05) and without, but it didn't seem to fix the chat breaking.
>>
Turning Trash into Treasure: Accelerating Inference of Large Language Models with Token Recycling
https://arxiv.org/abs/2408.08696
>The rapid growth in the parameters of large language models (LLMs) has made inference latency a fundamental bottleneck, limiting broader application of LLMs. Speculative decoding represents a lossless approach to accelerate inference through a guess-and-verify paradigm, leveraging the parallel capabilities of modern hardware. Some speculative decoding methods rely on additional structures to guess draft tokens, such as small models or parameter-efficient architectures, which need extra training before use. Alternatively, retrieval-based train-free techniques build libraries from pre-existing corpora or by n-gram generation. However, they face challenges like large storage requirements, time-consuming retrieval, and limited adaptability. Observing that candidate tokens generated during the decoding process are likely to reoccur in future sequences, we propose Token Recycling. This approach stores candidate tokens in an adjacency matrix and employs a breadth-first search (BFS)-like algorithm on the matrix to construct a draft tree. The tree is then validated through tree attention. New candidate tokens from the decoding process are then used to update the matrix. Token Recycling requires \textless2MB of additional storage and achieves approximately 2x speedup across all sizes of LLMs. It significantly outperforms existing train-free methods by 30\% and even a training method by 25\%. It can be directly applied to any existing LLMs and tasks without the need for adaptation.
might be cool. no code (just an algorithm). qa benches only. seems no drafting model needed.
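As I read the abstract, the core data structure is roughly this (a toy sketch; the vocab size, top-K width, tree depth, and the missing verification step are my placeholders, not the paper's):

import numpy as np

VOCAB, K, DEPTH = 32000, 8, 4               # placeholder sizes
adj = np.zeros((VOCAB, K), dtype=np.int64)  # adj[t] = last K candidate tokens seen after token t

def record_candidates(prev_token, topk_candidates):
    # During normal decoding, recycle the candidate tokens instead of discarding them.
    cands = list(topk_candidates)[:K]
    adj[prev_token, :len(cands)] = cands

def draft_tree(root_token):
    # BFS over the adjacency matrix gives a tree of draft tokens; the real method
    # verifies the whole tree in one forward pass with tree attention.
    levels, frontier = [], [root_token]
    for _ in range(DEPTH):
        nxt = [int(c) for t in frontier for c in adj[t] if c != 0]
        if not nxt:
            break
        levels.append(nxt)
        frontier = nxt
    return levels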
>>
>>101968475
Cool, if it works. We'll see if Llama.cpp has it in the server 2 years later.
>>
Here's a chatlog for Nous 405b q8 that I ran through last night, inspired by >>101920391
https://rentry.org/mqxy8oea
It worked pretty well for having been given minimal instructions (starting with zero context) and didn't need any wrangling outside of trimming extraneous bullshit from the end of some of its responses ("The choice is yours!").
It's shot right through with the same slop vocab/prose we're all used to, and the positivity bias means that everything you try just kind of works no matter what, but I'd say it's not unlike a transcript of running a solo campaign with a geeky, inexperienced, chirpy DM. At times it even reminded me of some of my own experiences with parser-based adventures back in the day. Comfy vibes.
It definitely did a better job than similar experiments I've done with smaller models in the past.
>>
>>101968551
>modern /lmg/coomer discovers storyfagging
You're supposed to use base models for this rather than instruct finetunes, it works better. /aids/ has been doing it since 2019
>>
>>101966292
>Insecure zoomer projecting this hard
>>
What's the best 12B model?

Magnum is good for roleplay, but it can't write for shit. Celeste has good, varying prose, but it is a retard after a few turns, likely trained on one-shot stories? Merges of both like Starcannon combined the worst traits of each. Any actual good 12B models for both?
>>
>>101968612
/aids/ has been using paid services like a bunch of third worlders who can't afford pcs you mean
>>
>>101968612
Does /aids/ have a favorite base model?
>>
yuzu maid is still king of vramlet models.
>>
>>101968661
NemoRemix was okay, although i was mostly running it with stories made with mixtral 8x7b.
It seemed to struggle on its own, but again
>12b

>>101968686
limarp zloss but yuzu is basically the same thing
>>
>>101968612
better to be the coomer than have aids.
>>
Anyone here used ST's RAG? Wondering if that's worth setting up.
>>
>>101968686
This?
https://huggingface.co/rhplus0831/maid-yuzu-v8

>>101968691
This?
https://huggingface.co/Doctor-Shotgun/Mixtral-8x7B-Instruct-v0.1-LimaRP-ZLoss
You cannot run that on 8GB VRAM, can you?

Not original anon but I'm really struggling to find a model I like too. How I miss thee, UNA-TheBeagle-7B
>>
>>101968714
>8gb
>>>/aicg/
>>>/aids/
>>
File: 1457891762534.jpg (36 KB, 500x500)
>>101968714
24 gb is vramlet status anon dont be poor.

Relevant >>101968725
>>
>>101968725
>>101968730
Aw I'm sorry, anons. When I want good rp I play with humans so I don't need to spend so much money on vram.
>>
>>101968714
Real shit go back to like a 13b model like Echidna or frostwind or even try the new Stheno L3 8b.
>>
>>101968744
So leave the fucking thread saar
>>
Why is Mikufag spamming shitty recommendations?
>>
>>101968744
Ive learned the hard way the only way to enjoy AI is to spend money on the good hardware.

And i bought AMD.
>>
File: hellaswag.png (124 KB, 770x646)
>>101964735
And they still can't solve my riddles
>>
>>101968313
Reading back, you *could* change the tokens you don't need from the vocabulary to just empty strings and save a few MB at most. llama3's tokenizer is pretty big. It has 128k tokens with 280k merges and it's just 8MB. Add some overhead for the loaded vocabulary during inference, and it's not gonna be bigger than 32MB or so. I don't know what that model is. Is the vocabulary that big?
Regarding languages, the fact that it's trained on multiple languages doesn't necessarily mean that the vocabulary is huge. In principle, you could have a vocabulary of just 256 byte entries (plus a few extras for EOS/BOS/etc.) and still be able to represent any UTF-8 text byte by byte.

So i don't think it's worth the effort. Even if you can, you're not gonna save more than a few MB at best.
>>
>>101968748
I did like Fimbul although I think it wasn't very good for lewd. I might try Stheno, I sort of skipped straight to Nemo mixes most of which I find insipid.
>>
>>101968837
You should like a shill.
>>
>>101968714
how fucking dare you want to use local models without cutting your dick off for our lord and savior jensen huang? i already sold off a kidney, my dick, and my balls so that i could get another h100 from my beloved jensen, be better you poorfag fuck
>>
>>101967514
Interesting. Lower quality than I expected, desu. (Curation wise.) Despite using a slopper to create the dataset it still has garbage in it. Like someone talking about how they made the character when they were bored in class, as a description of a character.
>>101967525
>>101967538
>>101967593
wtf
>>
>>101968744
This skills the 70Btroon
>>
>>101968882
Buy AMD instead and not only save money but deal with shitty support or have it work flawlessly for no detectable reason.
>>
Protip for everyone: ooba recognizes the --auto-launch argument.
>>
>>101968714
>You cannot run that on 8GB VRAM, can you?
Of course you can, unless 6T/s isn't enough for you.
>>
Protip for everyone: dont use ooba
>>
>>101968969
>4t/s
Do you really need more? You can wait, right?
>>
>>101968950
buy an ad
>>
>>101968988
I don't need more than 2, but it seems lots of people here need insane numbers so it wouldn't surprise me if someone found 6 unacceptable.
>>
ooba more like booba
>>
>>101967514
What the fuck why are they so short? I expected like 10000 words per example but these are like 4k at most.
>>
>>101968612
>You're supposed to use base models for this
I'm pretty sure there isn't a Nous Hermes3 405b base model, but if there is point me to it and I'll use that instead
>modern /lmg/coomer discovers storyfagging
I posted this because people always complain about random model reviews with no logs
>>
>>101969050
Yeah and 900 results for the name lily, and 2000+ for shiver.
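If anyone wants to reproduce those counts on their own copy, here's a minimal sketch; the file name and the "text" field are guesses on my part, adjust them to whatever the actual dump uses.
[code]
# Hedged sketch: per-example word counts plus a few slop-marker counts
# over a JSONL dataset. "dataset.jsonl" and the "text" field are assumptions.
import json, re
from collections import Counter

counts = Counter()
lengths = []
with open("dataset.jsonl", encoding="utf-8") as f:
    for line in f:
        text = json.loads(line).get("text", "")
        lengths.append(len(text.split()))
        for term in ("lily", "shiver"):
            counts[term] += len(re.findall(term, text, flags=re.IGNORECASE))

print("examples:", len(lengths))
print("avg words per example:", sum(lengths) / max(len(lengths), 1))
print("term hits:", dict(counts))
[/code]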
>>
File: 1234708953543.png (351 KB, 639x480)
>>101968551
Thanks for posting the full log. I'm thinking about trying to get full logs of different LLMs in the future for reproducibility purposes. I've been observing the threads and desu the claims about models sucking or not sucking, with absolutely no proof to back them up, are tiresome. You can't trust anyone. This is from the perspective of someone who uses LLMs occasionally but not a ton, so I lack true knowledge about a lot of the models. Therefore I think it would help if people posted full, reproducible "reference" logs, to provide an example of how outputs COULD look on a model. And I think this could be done with some criteria:
>neutralized samplers and top k = 1 for greedy sampling
>no editing responses, only pure prompting allowed, and also no programmatic prompting like ST can do
>done purely in Mikupad so the context is easily accessible and interpretable, the setup is easy, and the context including the Instruction formatting can be easily copy pasted

If this becomes normalized, it could also allow something else which I think we've not fully taken advantage of, which is group-based prompt improvement. If everyone is on the same page, then it's easier to propose prompt improvements and changes, which can easily be experimented with since everyone already has the full context loaded up to modify and play with.

First I think I'll try Mistral Nemo at Q8. I'll try coming up with a good test prompt that doesn't require too much input from the user in each chat turn (so it's gonna be a scenario for lazyprompting), but with a system prompt that can be complicated/long. If anyone has some good ideas for scenarios and system prompts I'm all ears. Or perhaps some card I can just steal from. And it should be something that can last for 20k tokens so we can tease out how the model behaves at longer contexts.
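For the "neutralized samplers and top k = 1" part, something like this against a llama.cpp server's /completion endpoint should do it (URL, port and parameter names are what I remember from the server docs, so double-check against your build; Mikupad just exposes the same knobs in the UI):
[code]
# Hedged sketch: greedy, reproducible generation for "reference" logs.
# top_k = 1 always picks the most likely token, so the same context
# should give the same output every run on the same build/quant.
import json, requests

payload = {
    "prompt": "<full context pasted from Mikupad goes here>",
    "n_predict": 512,
    "temperature": 1.0,     # neutral
    "top_k": 1,             # greedy sampling
    "top_p": 1.0,
    "min_p": 0.0,
    "repeat_penalty": 1.0,  # disabled
}
r = requests.post("http://127.0.0.1:8080/completion", json=payload, timeout=600)
print(json.dumps(r.json(), indent=2, ensure_ascii=False))
[/code]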
>>
>>101969063
>I'm pretty sure there isn't a Nous Hermes3 405b base model, but if there is point me to it and I'll use that instead
You can't seriously be this retarded.
>>
>>101969074
based log poster
>>
>>101968612
/aids/ have shown time and time again that they don't know how to prompt.
>>
>>101969098
You're not talking about this one, are you?
>https://huggingface.co/NousResearch/Hermes-3-Llama-3.1-405B
Are you the retarded one?
>>
>>101969160
https://huggingface.co/meta-llama/Meta-Llama-3.1-405B
Do you even know what a fucking base model is?
>>
>>101969180
The point is to test Nous' finetunes, you mongoloid.
>>
>>101969180
For being completely retarded you sure throw the word retarded around a lot.
>>
>>101969189
Clearly, you don't know what a base model is. Thank you for clearing that up.
>>
brothers, why do all long context models lie about context size?
>>
Actually another idea to augment >>101969074. What if we just post an RP in real time so we can have a /tg/-like experience collaboratively developing the RP? That way it's also funner than reading someone's boring log. We could have an event or something, maybe make a separate thread for it somewhere. Or possibly not a thread but maybe a [notspoiler]livestream[/notspoiler] since that way the text isn't able to be faked. This has the advantage of only one guy needing to actually have the LLM loaded up, which can start to matter for larger and larger models since not everyone can run them at non-braindamaged quants.

Anyway, gotta go drop some "logs" first. Heh heh. And then sleep since I need that.

Will take responses and then tomorrow I'll make an announcement to begin preparations and round up our people. We're going to be back, anons. We're going to make /lmg/ great again.
>>
>>101969197
"Base" is either a 'pre-trained' model without instruct finetune or the 'source' model for a finetune, be it instruct or not, depending on the context. And if the point is to test Nous' finetunes, certainly the base model is not the model to use, is it?
>>
>>101969228
>trying to have fun on /lmg/
>>
>>101969228
buy an ad
>>
>>101969292
be the fun you want to see in the world
>>
Hi all, Drummer here...

Can you guys give me pics of Evil Miqu?
>>
Bitnet has stopped being mentioned.
What happened?
>>
>>101969505
petra was busy with strawberry or something
>>
>>101969505
They died from bitma syndrome.
>>
>>101969505
brapnet
>>
>>101969505
It happened. Small models were released. Just another meme, like mamba.
>>
>mythomax is still the best local has to offer
What's the point of /lmg/ at this point
>>
>>101969675
Best below 70b, yeah.
>>
>>101969683
Miqu is more of a sidegrade. I don't know about 405b, but none of the 70b, 103b or 120bs are objectively superior to MMax in terms of rp and storytelling
>>
>>101969683
buy
an
ad
>>
>>101969505
Bitnet is coming soon...
Will have it by the end of the year
>>
>>101969778
What's the holdup? A demonstration 7B can be trained in a few hours to a few weeks. We even have shit like Llama 405B that can be used for distillation, making training even cheaper.
>>
>>101969846
The only two explanations are that either it doesn't work, but no one admits it publicly so as not to save their competition from wasting resources finding out for themselves, or it does work and everyone is too stupid to even try. I suspect GPT-4o being as fast as it is might be BitNet.
The Qwen team said they were considering experimenting with BitNet for their upcoming models, so maybe soon we'll know for sure.
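For anyone out of the loop on what the experiment would even involve: the core of BitNet b1.58 is constraining weights to {-1, 0, +1} with an absmean scale. Here's a rough sketch of that quantization step, paraphrased from the paper rather than anyone's released training code:
[code]
# Hedged sketch of BitNet b1.58-style absmean ternary quantization.
# Forward quantizer only; actual training keeps full-precision shadow
# weights and uses a straight-through estimator for gradients.
import torch

def absmean_ternary(w: torch.Tensor, eps: float = 1e-5):
    scale = w.abs().mean().clamp(min=eps)   # per-tensor absmean scale
    w_q = (w / scale).round().clamp(-1, 1)  # entries end up in {-1, 0, +1}
    return w_q, scale                       # effective weight ~ w_q * scale

w = torch.randn(4, 4)
w_q, scale = absmean_ternary(w)
print(w_q)
[/code]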
>>
>>101964583
Are you fucking retarded? BasedGPT is mine and it is literally just a 2-sentence system prompt on an abliterated 8b model. It was only this verbose because it was responding to the KarenGPT. Likely a third of the posts on this board are LLM generated, even the small ones. You could only tell these two were because they were purposely low effort.
>>
how long of a context window is anon using?
>>
File: ComfyUI_00850_.png (1.1 MB, 1024x1024)
>>101969412
Evil migu is coming for you
>>
>>101969846
>405B that can be used for distillation
Doubt. Bitnet might not work at all as a distillation target for normal models
>>
File: ComfyUI_00852_.png (1.27 MB, 1024x1024)
>>101970102
Alternative gen with more fantasy flavor
>>
>>101969894
Another explanation is that the leather man will ban you indefinitely for releasing any BitNet model. It's a direct threat to JVidia's monopoly. Given how JVidia operates, it's indeed a plausible explanation.
>>
>>101969675
>mythomeme
>>
Is bitnet the flat earth of AI? Sure feels like it.
>>
>>101969675
No context.
>>
>>101970380
>>101970380
>>101970380
>>
File: satania.gif (39 KB, 220x216)
>>101965878
py_toddlers BTFO


