/g/ - Technology




File: 1706805083087933.jpg (371 KB, 1015x1100)
/lmg/ - a general dedicated to the discussion and development of local language models.

Koishi Edition

Previous threads: >>102396290 & >>102385729

►News
>(09/12) DataGemma with DataCommons retrieval: https://blog.google/technology/ai/google-datagemma-ai-llm/
>(09/12) LLaMA-Omni: Multimodal LLM with seamless speech interaction: https://huggingface.co/ICTNLP/Llama-3.1-8B-Omni
>(09/11) Fish Speech multilingual TTS with voice replication: https://hf.co/fishaudio/fish-speech-1.4
>(09/11) Pixtral: 12B with image input vision adapter: https://xcancel.com/mistralai/status/1833758285167722836

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Programming: https://hf.co/spaces/mike-ravkine/can-ai-code-results

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
I haven't watched pornography involving real people in almost two years. Now the only time I coom is when I'm reading smut or looking at hentai. The grass only gets greener from here, boys.
>>
File: no contribution.png (1.14 MB, 1024x1024)
►Recent Highlights from the Previous Thread: >>102396290

--Papers: >>102405722
--vLLM can do Pixtral inference with 4-bit quantization: >>102402070 >>102404011 >>102402450 >>102402987 >>102403336 >>102403538 >>102403715 >>102403900 >>102403970
--Multimodal issues in llamacpp and alternatives: >>102396995 >>102397014 >>102397139 >>102397146 >>102397171 >>102397186 >>102397240 >>102397402 >>102397425 >>102397276 >>102398089 >>102401397 >>102398102
--Advice on hardware requirements for running AI software, with a focus on VRAM, CPU, and RAM: >>102399626 >>102399710 >>102399798 >>102399846 >>102399855 >>102399918 >>102399936 >>102399981 >>102400125 >>102400262
--Troubleshooting ROCm installation on Linux Mint, alternative setups, and RX 7800 XT support: >>102402433
--Experimenting with settings for base and instruct models: >>102396390 >>102396402 >>102396423 >>102396448 >>102396503 >>102396532 >>102397675
--Discussion on running exl2 models and alternatives to GGUFs for high-context tasks: >>102397131 >>102397153 >>102397190 >>102397170 >>102398078 >>102398189 >>102398307 >>102398570 >>102398629 >>102398674 >>102399381 >>102399572 >>102399832 >>102400034 >>102400036 >>102400085 >>102400089 >>102400055 >>102400226
--George Hotz tweet about ChatGPT programming capabilities and the role of branding and PR in perceived model ability: >>102396431 >>102396472 >>102403110
--Deepseek provides high quality results at impressive speeds, outperforming largestral and 405b in some tasks: >>102399857 >>102403484 >>102403664
--Pixtral NSFW capabilities and limitations discussed: >>102397205 >>102399039 >>102399576 >>102399640
--ChuckMcSneed-multistyle.txt updated with drug writing prompts: >>102401934
--Miku (free space): >>102396305 >>102398936 >>102399981 >>102400262 >>102401182 >>102401204 >>102401288 >>102401468 >>102401620 >>102402070 >>102403727 >>102403875 >>102405979

►Recent Highlight Posts from the Previous Thread: >>102396296
>>
Can ATI do inference yet? Cards look cheap compared to nvidia.
>>
File: ComfyUI_00263_.png (773 KB, 1024x1024)
Dual 3090 chads, what are we running? Largestral Q3_K_M is getting me about 0.85t/s with 16K context and Q4 KV cache. Been looking for a good 70B but Qwen and the magnum tune are both braindead unfortunately. Is Miqu/derivatives still the best 70B?
>>
>>102406734
AMD always could, it's just got less software support. If you're just using llama.cpp, pytorch, or other common inference engines and aren't planning to do anything too special then yeah it'll do the job. The thing is that usually once you're at the stage where you're investing in dedicated inference hardware you're probably soon going to get interested in other things that only have NVIDIA support.
If you meant the actual ATI branded cards from 20 years ago then lol, lmao
>>
>>102406676
Didn't say nothin' about Mixtral 8x7b being better than Miqu. I questioned why you put NeMo ahead of it.

As for why you'd use something worse than a 70B, presumably for speed, the same as why you ran NeMo even though you had to know it wouldn't be as good.
>>
>>102406782
That’s very grim. I thought 3090 friends would get better speeds than that.
Now I wonder how slow an 8-channel DDR4 server would be…
>>
>>102406734
yes
>>
>>102406782
Miqu and its derivatives are braindead compared to 3.1. If you need it to be uncensored, try Hermes.
>>
Hello, I've been trying out chatbots (Llama and Perplexity) more recently and thought I should try running one locally. I just have a potato smartphone GPU but my CPU is a 5900x. Will this be enough to run Llama 3.1 8B at a tolerable rate?
Is that the model I would want to go with for just asking stackoverflow-tier questions, or is there a better one?
Thanks in advance.
>>
File: 1723771660587148.png (14 KB, 694x632)
>installed Linux mint
>it comes with python3.12
>literally no one supports python3.12, everyone uses 3.11 for whatever reason

I thought scripting languages were meant to solve the problem of cross platform compatibility? What the fuck
>>
>>102406862
Small models aren’t particularly useful. You’re better off using the whatsapp one or the free chatgpt. Unless you have some “special” need not suitable for commercial models.
>>
>linux mint
fake distro btw
>>
>>102406782
Feels good to be good at prompting and being able to use Llama 3.1.
What a fucking retarded mikufag.
>>
>>102406782
>Largestral Q3_K_M is getting me about 0.85t/s with 16K context and Q4 KV cache
God...
>>
>>102406862
Anything on CPU will be pretty slow; expect maybe 2-3t/s. If you're primarily using it for coding, you might want one of the llms trained specifically for it. codestral or starcoder 2 might be a good fit.

If you are using an llm to *learn* code, don't. They will confidently tell you absolute bullshit, have no idea they are doing so, and, depending on model and prompting, argue with you when you call them out on it. All LLMs do is predict the next likely sequence of letters; they have no real cognition.
>>
>>102406821
Nemo is infinitely better than mixtral, that's why mixtral is deprecated.
>>
>>102406954
If using prefill on every message is tolerable to you, sure.
>>
>>102406986
Debunked. >>102406477
>>
>>102406999
That just proves that it can recall things from a long context. However, it's retarded and doesn't write well.
>>
https://livebench.ai/
Mixtral is so retarded that it doesn't even make it into the latest leaderboard. You have to change to 2024-07-26 to find it among some 7Bs.
>>
>>102407024

Nemo is too stupid. Handcuffed characters cross their arms, weird anatomy. If you want to make Mixtral 8x7B write more creatively, increase alpha to 1.5-1.7; it's a trick mentioned on /lmg/ long ago.
>>
>>102406951
I see, is something like Llama 3.1 70b a "small" model? I assume not because that's what I've been using and it seems to be okay. I've seen elsewhere that you need pretty beefy hardware, like 2x3090s, to run it, though someone said that 1x3090 works with a "2bit quant", which I don't quite understand.
I don't have any special requirements but I don't like depending on the benevolence of Facebook et al.
>>102406980
Tokens are basically characters right? Yeah that would be intolerably slow for me.
I'm not really using it for coding, like I just asked it "how do I constrain an image to half the screen height in html", very simple things.
I've experienced it producing "fake" code like picrel, although that was just a test after I had solved my problem.
>>
>>102406946
Python was fine as a scripting language. But then everyone decided to pretend that it was a first-class language and use it for shit it wasn't designed for. It now still suffers the performance problems of a scripting language while trying to be used like a real language, breaking compatibility with every point update since 2.8.

It's really fucking disgusting but making whitespace a syntax error is just too awesome to resist.
>>
>>102407053
Forgot pic...
>>
>>102407045
While I'd love it if mixtral were better than nemo and had a longer usable context, since it runs fast enough for me, I've found it to be worse than nemo at spatial stuff like you mentioned. Do people that get good results just use the instruct one or some merge/fine tune?
>>
>>102407068
It got popular then a bunch of cock-gargling retards decided that repeatedly breaking compatibility was acceptable. So you end up with scripts that need a specific point release of python. Basically all of those people should get AIDS and die.
>>
>>102407053
A token is a group of characters, generally 2-4. Using it for anything you're not familiar with, to be honest, is a bad idea. The hallucination issue is intractable. We'll probably mitigate it in the future with architectural workarounds, but for now you cannot trust anything an LLM says.
>>
all mistral models are boring.
<thinking> cannot help them
miqu 70b remains the best rp model, it's the new mythomax
>>
he's obviously trolling but there's some truth to it. mistral models are bland.
>>
>>102407104
shut the fuck up dennis
>>
eat your dirt cookies, then come home to miqu
>>
holy crap i love miqu!!!!!!!!
>>
>>102407053
Size is relative but for local,
7B to 13B is small
22B to 60B is medium (but becoming small as it develops)
70B to 120B is large
then you have huge fuckers like 405B.

70B is the sweet spot for local on a gaming rig. If you have 64GB of system RAM, you can run a Q5, maybe just barely a Q6, quant of a 70B and get about a token per second. That's not exciting, 3 to 15 minutes for a response depending on how much you ask it to write, but it can run in the background and answer questions that Search Engine Optimization prevents Google from just handing over to you. Over 70B, you'd be running IQ3 and that's OK for roleplay but anything technical is going to suffer.

>Tokens are basically characters right?
No. Tokens are simple words and pieces of words. So a long but common word might be one token while an uncommon word might be four. That's why LLMs are infamous for not being able to count the number of R's in "strawberry," because it doesn't see the letters, just chunks like "straw"+"berry", while "strawberries" would probably be something like "straw"+"ber"+"r"+"ies" or whatever depending on its mood.
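
You can see the chunking yourself with the transformers library (sketch; the tokenizer name here is just an example, exact splits differ per model):

from transformers import AutoTokenizer

# any downloaded tokenizer works; exact splits differ per model
tok = AutoTokenizer.from_pretrained("mistralai/Mistral-Nemo-Instruct-2407")
for word in ["strawberry", "strawberries"]:
    ids = tok.encode(word, add_special_tokens=False)
    print(word, "->", tok.convert_ids_to_tokens(ids))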
>>
miqu is still the best 70B?
>>
>>102407095
>>102407134
Okay, thank you.
>>
>>102407104
isn't miqu mistral too?
what else is there?
meta, chink, google, they all go full assistant with their releases.
>>
>>102407155
no, miqu is l2 but tuned. then mistral released diff models which are ALL BORING
trust me i hate advocating for a 7 month old model, but its still literally the best
do your own tests, you'll see
>>
>>102407155
miqu is leaked Mistral 70B
>>
File: file.png (245 KB, 420x662)
https://forms.gle/jqivYs6DcBd4gXxN9
>>
File: 1726461871810.jpg (246 KB, 1430x1017)
I'm new to AI text to speech.
What's currently the best option if you want to run it locally?
Are current options as advanced as that unreleased, trainable voice synthesizer Adobe showed a few years ago?

I was looking for something that supports the Italian language, even better if it has an interface like WebUI.
>>
>>102407218
fish
>>
>>102407070
For something like this use codestral.
Gemma 2 27b for general purpose
As for how to run them, check the OP.
>>
What are the chances that the qwen 2.5 models dropping tomorrow are actually good for RP?
Did anybody even try the magnum qwen finetune?
>>
>>102407238
every qwen model i've tried is as dumb as old 13B, plus spits out chinese randomly. i don't expect nothing from them besides shit
i will be happy to be proven wrong, but i won't be yi'd again
>>
>>102407224
I checked it out, unfortunately Italian is not supported yet
>>
>>102407238
>Did anybody even try the magnum qwen finetune?
Yes. I did. It was bad. >>102383366
>>
>>102407238
Tomorrow is Monday, not Thursday.
>>
>>102407297
That's what makes it so CrAzY!
>>
>>102407297
Tomorrow is today.
>>
File: yQOA7Bvn0E8FSHhIIymlVA.jpg (607 KB, 1200x887)
I'm pretty sure this preset I'm using is scuffed: https://files.catbox.moe/fqzc2w.json
I've modified it about a dozen times over since L2 and now I'm behind on new samplers and how to use them. Can someone drop a good ooba/exl2hf nemo preset?
>>
>>102407218
xtts2 or fish
>>
>>102407324
How I run NeMo: temperature = 0.3. Nothing else.
>>
>>102407324
looks perfect
>>
>>102406782
With 2 3090s I do either 2.75 bpw exl2 at about 10 t/s for Mistral Large, or Q2_K_M at around 7-8 t/s; can fit 32k context with those.

I hesitated to do such a low quant as well, but for some reason it doesn't really affect Large's ERP quality badly, and it's still very smart at that quant. I've been running the magnum fine-tune of it for smut.
>>
Is he right?
https://x.com/ylecun/status/1835385903562338590
>>
>>102407399
>ylecun
Lecunny is always right
>>
>>102407327
XTTS looks very promising.
Genuinely impressive, thanks

Got any tips?
>>
File: 1724829673358388.png (180 KB, 466x360)
>>102407236
I'll check it out, thank you.
>>
>>102407399
Has he ever been right about anything? He STILL thinks we're nowhere near AGI even after GPT4o1 blew everyone away.
>>
>>102407418
It's pretty straightforward. If you're going to use it for voice cloning, don't use AI audio as the source, and be real careful with background noise.

It's a dead project tho, so don't expect it to get any better.
>>
File: 12435890672340.jpg (55 KB, 800x450)
>>102407236
To this fucking day i have seen no nala test or watermelon test or even a fucking log of Gemma 2 actually being good.
picrel
>>
>>102407425
>he's impressed with 4o1
>>
>>102407455
gemma is fine as a general purpose assistant. it's garbo for rp, even the finetunes
>>
>>102407297
No, it's Monday and it's afternoon here.
And Qwen is China; wouldn't surprise me if it's chink time for release.
I did think it releases Tuesday though. Now I'm sad.
>>
>>102407399
he's based
>>
>>102407425
*gullible retards
>>
>>102407486
excuse my retardation.

which is really sad, because it runs very well on my AMD setup :(
>>
>>102407425
is o1 just jankware? i.e. isn't "think" just running multiple passes into a shit file and then going "convert this pile of shit into something better"?
>>
File: Shit.gif (1.21 MB, 320x240)
>>102407399
>The Civil War was a fascist takeover attempt.

I honestly don't give a fuck what he believes, but he programs that belief into the AI as if it's fact, to the point that AI now has a left-leaning bias.
When even the right wing 4chuds simply want uncensored, true NEUTRAL AI, and you physically recoil when AI gives you factual info rather than a "safe, sanitized, I am an AI language model" response, I think you might have lost the plot of what you're trying to accomplish.
>>
>>102407551
Cope. Lincoln (during civil war at least) was a fascist by every metric and no amount of soi rage will change that.
>>
>>102407399
>the confederacy were fascists
lecun is a retard
>>
it actually got me thinking... why is it that so many industry engineers, researchers, and even the competitors to openai are so easily fooled into being impressed by o1 when even the average /lmg/ autist can so clearly see it's just a party trick at best? are they just so busy in their respective narrow roles that we actually have a better understanding of the space by keeping up on the news in our free time? or is it motivated reasoning - they NEED o1 to be good so that they can justify investing in their own respective ventures, basically to "prove" AI isn't hitting a wall? rising tides and boats and all that jazz.
>>
>>102407045
As expected, no further details, so I'll assume mixtral is still crap.
>>
>>102407700
look toward the 'one more collider' (funded over 30 years) meme
>>
o1 doesn't have gpt in its name
>>
o1 raped my cat
>>
File deleted.
>>102400742
>>102400857
No matter how I combine them, I can't get more than 6 sticks of RAM to work together. How do I pinpoint the faulty ones? Each stick functions well in a 4-channel configuration.
>>
>>102407551
>he programs that belief into the AI
LeCun doesn't work on meta's LLMs (or any LLMs)
>>
>>102406782
Unquanted nemo
>>
>>102408007
>Chief AI Scientist at Meta. Researcher in AI, Machine Learning, Robotics, etc.

????????????? he literally has the final say on everything, what are you talking about?
>>
File: 1726240003918620.png (198 KB, 1079x1088)
>>102407816
Be patient
>>
>>102408086
>scientist, researcher
Not engineer, programmer etc.
>>
> o1 is just chatgpt base but we told it to lock in
>>
File: AncientRuinsExplorerMiku.png (1.63 MB, 840x1208)
good night /lmg/
>>
File: 2758907687734.jpg (13 KB, 167x255)
>>102408113
>The Fundamental AI Research (FAIR) team at Meta seeks to further our fundamental understanding in both new and existing domains, covering the full spectrum of topics related to AI, with the mission of advancing the state-of-the-art of AI through open research for the benefit of all.

AI at Meta engages in cutting-edge applied research that can improve and power new product experiences at huge scale for our community. Building on AI at Meta's key principles of openness, collaboration, excellence, and scale, we make big, bold research investments focused on pushing the boundaries of AI to create a more connected world.

He is the Chief of this, and you want to say he has nothing to do at all with how Meta's LLMs have a left-leaning bias?
I'm calling "bullshit rant with no source"; no amount of jeet monkeys or troonix DEI hires get paid enough to make choices like that. In fact, the more I read into how Meta is structured with Lecunny, the more I'm given the impression he made/makes these decisions.
>>
>>102408166
No Mr Albert, I am not saying he has nothing to do with it, you fat prick. What I am saying is that he is the ideas man or the researcher. He reads papers and uses his years of knowledge to theorise and propose new ideas to the engineers who implement his ideas.
All of this has no bearing on how left-leaning llms are, because lecun, just like fucking everybody, will do as his employer tells him to do or go hungry.
>>
Hi there, do you know any organizations where you can order LLM training for a given specific task?
We have an idea where the LLM has to find correlations/possibilities between different data that aren't obvious to humans, but AI should probably be able to handle it.
We also have a budget of a few million dollars for it.
All I've found so far are services that let you train it yourself, like Vertex or the ChatGPT API.
>>
File: 192439506284709.gif (36 KB, 220x165)
>>102408221
I'm agreeing with you, anon:
>He reads papers and uses his years of knowledge to theorise and propose new ideas to the engineers who implement his ideas.
So if his ideas, as shown in >>102407399, are naturally left leaning, how do you think that affects the models?
>>
>>102408248
>We have some idea where LLM have to find correlation/possibilities between different data which not obvious for humans, but AI probably should handle it.
They're not do-it-all machines yet. Depends on what you're trying to do. Language models cannot solve everything, they solve just a few things.
>Also have a budget in few millions $ for it.
And nobody thought about sending an email to mistralai at least?
>https://mistral.ai/technology/#fine-tuning
>>
>>102406946
the python packages that come with linux distros are only meant to be used for running those other packages that need python
install pyenv, install 3.11 from pyenv and then 'pyenv local 3.11' on the directories where you run python by yourself
>>
this isn't local but where can I find some open source chat models?
>>
>>102408146
Good night Miku
>>
>>102408387
>this isn't local but where can I find some open source chat models?
What isn't local?
Check huggingface.co. You'll probably find a few in there. Bad questions lead to bad answers. Be more specific.
>>
File: 1711330359158922.png (75 KB, 1920x969)
>>102408419
I meant that I don't want to run local models as I don't have the hardware. I'm currently implementing my own chatbot. I'm using Mistral with the free trial.
>>
>>102408455
>I meant that I don't want to run local models
Then you're in the wrong fucking thread, you idiot.
>>
>>102408460
Have you been to /aicg/?
>>
>>102408455
I see. You don't want open source, you want free. Keep using that or pay money.
If you have little hardware, use a small model. Any would do. Once your thing is working, even with a dumb model, consider upgrading your hardware, paying for hosting or for API access.
>>
File: file.png (52 KB, 360x360)
>>102407399
>Yann is a pro censorship libtard
fuck me...
>>
>>102408473
No, because I want to run models locally. The fact the thread about not running models locally is full of retards should tell you everything you need to know about not running models locally.
>>
>>102407551
>you physically recoil when AI gives you factual info rather than "safe, sanitized, I am an AI language model" response
yep, I also fucking hate PC answers, like give me the fucking truth, that's all that matters
>>
File: file.png (22 KB, 318x159)
>>102407399
>I hate fascists!
>I want to suppress free speech because it hurts my feelings. That doesn't make me a fascist though, because I'm the good guy after all.
why are they all like this?
>>
>>102408455
infermatic
>>
>>102404011
I can only speak for myself but the fundamentals in llama.cpp/GGML are still quite lacking.
A lot of my time has gone towards just trying to make general matrix multiplication (with quantized weights) faster.
Right now I'm working on general GGML training support.

>>102406946
Python 3.12 dropped support for defining your package via setup.py, it became mandatory to use pyproject.toml instead.
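
For anyone who hasn't migrated yet, the minimal pyproject.toml replacement looks something like this (name and dependencies are placeholders):

[project]
name = "mypackage"
version = "0.1.0"
requires-python = ">=3.11"
dependencies = ["numpy"]

[build-system]
requires = ["setuptools>=61"]
build-backend = "setuptools.build_meta"

pip install -e . should work the same as before once that file exists.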
>>
>>102408641
>it became mandatory to use pyproject.toml instead.
holy fuck what a retarded mistake they made, what was wrong with defining your package with setup.py?
>>
https://reddit.com/r/LocalLLaMA/comments/1fhtpwg/inspired_by_the_new_o1_model_benjamin_klieger/
I'm not sure if that's a meme or it's serious, the leddit comments look serious though
>>
>>102408525
>if you kill him, you'll be just like him
I think it's fine to exclude fascists and other extremists from democratic processes since their ultimate goal is to subvert them.
>>
>>102408844
>I think it's fine to exclude
>from democratic processes
So it's not democratic anymore if you exclude people you don't like, that's fascism my friend
>>
>>102408681
>inspired by the new 01 model
We have had CoT since chatgpt release.
And anthropic did the same but 3.5 is good (arguably better) without the waiting and hidden thinking costs.
Yet the actual % of users who use anthropic is very tiny compared to openai.
Is it really that easy if you are the top dog? On X everybody hyped o1 up, even some respectable people. 30 messages per WEEK, slow (and with that, unusable for real-life work) and not that good. Maybe for riddles and math it's good.
Very weird. Was o1 really the thing they hyped up for a year now? Strawberry leaked November '23.
>>
>>102408873
>Very weird. Was o1 really the thing they hyped up for a year now? Strawberry leaked November '23.
I'm sure they had nothing so far and it was an empty hype, and then the grifter Matt Schumer appeared with a good idea and OpenAI capitalized on it kek
>>
>>102408873
>And anthropic did the same but 3.5 is good (arguably better) without the waiting and hidden thinking costs.
>Yet the actual % of users who use anthropic is very tiny compared to openai.
OpenAI was the first top dog and stayed in that place for far too long to lose its core users that quickly, but yeah, if they can't defeat 3.5 Sonnet at some point, people will notice the grass is greener elsewhere
>>
>>102408501
>>102408525
>>102408844
>>102408862
Go back to >>>/pol/
>>
>>102408954
nyo :3
>>
smedrins
>>
File: 1725291523263745.jpg (153 KB, 1281x1395)
>>102408500
>professor told me to specifically find OSS LLMs for my project.
>FOSS models need to be self-hosted on my own hardware that I don't have
>All the free ones are closed source with a free trial.
now what?
>>
>>102409057
complain to department head that your professor is unfairly penalising socioeconomically disadvantaged students
alternatively, buy a shitty RAMbox
>>
>>102409057
>>professor told me to specifically find OSS LLMs for my project.
llama.cpp or kobold.cpp. Both give you an API server that you can call locally. Read their docs.
>>FOSS models need to be self-hosted on my own hardware that I don't have
You can use phi3.5mini. It's a small model and you can easily run it on a t420. You don't need the best model.
Download Q8 or Q5_K_M from https://huggingface.co/bartowski/Phi-3.5-mini-instruct-GGUF. If you have more than 8GB of ram use Q8. Use Q5 otherwise. It's good enough for a school project.
>>All the free ones are closed source with a free triall
Again, llama.cpp or kobold.cpp. Host your model on your own pc. You don't even need a gpu for phi mini.
>now what?
Do your homework, anon. If you have specific questions, ask.
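
Oh, and once the server is running, calling it from your own code is a few lines (sketch; assumes llama-server's default localhost:8080 and the python requests library):

import requests

# llama-server exposes an OpenAI-compatible chat endpoint
r = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={
        "messages": [{"role": "user", "content": "Explain HTTP status 404."}],
        "max_tokens": 128,
    },
)
print(r.json()["choices"][0]["message"]["content"])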
>>
>>102409057
Surely you have some computer, so just run it on that. There are lots of small models, some might not be considered open source even if you can download them though, like gemma 2b has a really restrictive license.
>>
>>102407984
Check the manual for your motherboard. Sometimes it has a hard limit on the total number of GB, sometimes it can only support all slots at a certain speed, etc.
>>
Complete nubbin LLM enjoyer here. Somewhat competent dev...

Cursor IDE is pretty awesome, but has anyone done an "open source" version? Or perhaps a vsix extension for VSCode/Visual Studio?
>>
>>102409057

Rent one from your Uni's IT department for free like everyone does at literally every Uni in the western world.
>>
>>102408844
There's no way you can believe this is right, right? Just accuse someone of subversion and you can exclude him from the democratic process. Wow!
>>
>>102408297
> nobody
You might laugh but I'm the only IT person there who understands at least some primitive basics of how AI works. And that's why I'm looking for some paid organization/team who can train a model for further integration into our system and can sign a contract.
>https://mistral.ai/technology/#fine-tuning
Thanks, but I don't see custom fine-tuning services there, only platform access.
If I don't find anything I will try to contact model creators.
>>
>>102409430
>I don't see there services for custom fine-tuning from them
Not very bright, are you? Send them or any other AI company an email.
>>
>>102409430
>Thanks, but I don't see there services for custom fine-tuning from them, only a platform access.
It shows that they do finetunes, which was the point of the link. The point of sending an email directly is to tell them what you want to do and they can tell you if it can be done and how. If you're serious about the budget, they'll listen. You'll get a quote at least.
Try not to sound like a retard when you send the email. You'll make yourself and your company look bad.
>>
Jesus Christ, why do all biz retards act like they are dealing with tech support when it's random forums and boards?

Holy Jesus don't help arrogant faggots.
>>
>>102409892
Fuck off, I will help them just to spite you
>>
>>102409321
I think Cursor can work with local LLMs
Also, check out Claude Dev (which despite the name, works with a number of models). It's great for soloing small projects.
>>
>>102409904
May as well skin your own puppy and serve it to your owners' latest hirelings.
>>
File: img_20240916_170917_494.jpg (49 KB, 1280x569)
>>102409298
I am below 2TB and freqs are reasonable. I spent the entire day brute-forcing various combinations until one finally worked.
>>
>>102409904
One more post and maybe he'll hire you as an advisor
>>
>no posts for an hour
>completely dead
>clock hits 7 AM EST
>suddenly samefagging argument bait
Good morning, Agent Johnson. Preparing for crazy thursday?
>>
>>102409892
Because their whole mentality towards open source is that they just want to get something for free in order to make a profit.
Just like with regular users you can argue that there is a net benefit from the minority that do contribute back but the majority are freeloaders.
>>
>>102409977
>someone makes a post
>someone else replies
>more people reply to that
wow, a conspiracy
>>
>>102410020
maybe a random pizza parlor is behind it
>>
>>102409989
(eyeroll)
The public offers a benefit, because it gives back in the form of annoying bug reports. The biz parasites will never benefit you in the slightest way.
>>
So noromaid 8x7b is still the only real game in town in 2024? Will it ever be surpassed?
>>
>>102410328
Can't run it with my 8 GB of VRAM, so it's shit.
>>
>>102410328
Yes. No.
>>
Is Donnager 70B a Miqu finetune?
>>
File: ai-kit.jpg (963 KB, 3022x2022)
Has anyone messed around with small TPUs/NPUs like Coral or Hailo accelerators on boards like the Pi 5? What kind of LLMs can they run? I know they're way more efficient than CPUs for inference, so hypothetically speaking shouldn't a GGUF-formatted model run well through them, assuming there's enough RAM? If I could fit a 13b model on one, even a little quantized, I'd buy a kit in a heartbeat; it'd be really cool to have a copy of my wAIfu running on a box I can fit into my pocket.
>>
>>102408954
good bootlicker!
>>
File: 1717531519081485.png (35 KB, 486x595)
>>102407399
No, he is a flaming faggot and so are you.
https://www.reddit.com/r/LocalLLaMA/comments/1fi39s8/seems_like_openai_o1_has_broken_yann_lecuns_brain/
>>
File: 1726491709055.jpg (108 KB, 572x347)
>>102410762
>[deleted]
peak reddit
>>
This might seem like a pretty dumb question, but I've exhausted all my other options. When looking for safetensors on HF, sometimes I see multi-part safetensor files with a .toml. Can someone explain or point me in the right direction to combine them properly to use in my comfy install?
>>
>>102410885
He's right tho. I'm also tired of ultra pozzed llama, won't be following llama releases from now on.
>>
>>102410885
Maybe reddit jannies are onto something, looking at this thread...
>>
File: 1708363960141338.webm (751 KB, 1280x720)
>>102411026
bootlick more sar
>>
>>102410885
>muh reasoning
Are people actually falling for this shit or are these just jeets hired by saltman to ruin the internet?
>>
File: Me and My glfr.png (3.56 MB, 1080x1920)
I know it's a very common question, but does anyone have a best model and template choice for ST for 24GB VRAM?
>>
>>102410589
The problem with TPUs/NPUs/FPGAs vs. GPUs is that while they do offer good compute they don't offer good memory bandwidth.
So language models in particular are not suitable to be run on them.
>>
>>102411098
to be fair it does objectively perform the best at reasoning out of literally anything anyone's ever produced so far
>>
>>102411140
>*runs on a loop for several minutes in the background*
>OMG LOOK AT THESE HECKIN' ZERO SHOT REASONING SCORES
>>
>>102411193
>it takes longer so it's bad
still zero shot, cope
>>
>>102411202
If you believe that, you are less than a retard.
>>
>>102411137
In the case of the RPi5, a TPU makes sense
>The memory bandwidth is increased with a 32-bit LPDDR4X SDRAM subsystem operating at 4267MT/s
but prompt processing is ass
>>
>>102411226
you don't know what zero shot means, keep coping
>>
File: swrkuax.png (412 KB, 498x600)
>>102411231
>KEEP COPING
Did it work?
Are you a real woman yet?
No?
Did you earn the respect of your family?
No?
Did you earn the respect of your peers?
No?
Did you find any purpose in life other than being an obnoxious piece of shit on the internet?
No?
>>
>>102411265
projection
>>
>>102411285
>calling me out is...le projection
Everyone is thinking it every time you fucking post.
I'm just saying it for them.
>>
>>102411302
>still no arguments
thinking longer doesn't change the fact that it's smarter on zero shot tasks
your counter? or just here to shitpost?
>>
>>102406696
“Make a 3D playable doom given this react template” is officially the only benchmark I care about.
Claude can do it.
O1 does it in a really shitty suboptimal buggy way.
405b-instruct tells me it’s impossible and makes a stub that doesn’t run.
Mistral large just summarizes the code of the template I send them.
I doubt open source will get there in the next year.
>>
>>102411132
Were the previous responses not to your liking?
Provide a report of your experiences so that I know what to/not to suggest.
>>
>>102411349
>I doubt open source will get there in the next year.
A year is a long time in this space, anon. Look at where local was a year ago. And it only gets easier as the research from top labs trickles down.
>>
>>102410762
>updoots increased after this posted
I think we have at least ~15 ledditors lurking here, grim.
>>
>>102407399
What's his beef with Musk? Is it just because Grok BTFO Meta so hard in AI despite starting from scratch a year ago?
>>
>>102407399
>not progressive liberal
>therefore fascist!
kek what a retard. bet he thinks that monarchies are also fascist.
>>
>>102411476
In short: elon is highlighting some slimy shit, lecunt got triggered by that and now hates elon because racism, fascism, fake news or something. It's all personal beef for him.
>>
>head researcher is obsessed with US politics instead of actually working
Llama 4 is fucked, isn't it? And whatever happened to the promised native multimodal Llama 3?
I think Mistral is open source's last hope right now.
>>
>>102410762
r/localllama mod shit his bed https://archive.is/tlKZC
>>
>>102411564
llama1 was the least censored; llama4 is going to be the best in censorship metrics, and strawberry or whatever CoT won't make it any better.
>>
>>102411564
You're talking about LeCunt?
Dude has done literally nothing to further the field he claims authority over.
Even his past achievements were useless for actual implementations.
>>
>>102410762
I was unironically thinking that the other day
I was looking through lecoon's twitter for his thoughts on O1 and only found US culture war shittery
It gets so tiring
>>
>>102411603
Of course he is, lecunt is the only one in the AI field with elon and shit living rent free in his head.
>>
How inferior are AMD GPUs regarding LLMs? nVidia GPUs in my country are way overpriced; for the price of an RX 6750XT 12GB the best I could get would be a 12GB 3060.
>>
>>102411633
no
t. rx580 sufferer
>>
when you follow lecun on twitter to see his thoughts on random AI happenings but instead he tweets about how hate speech laws are actually good
>>
>>102411633
Well the 6750 XT has 20% more memory bandwidth than the 3060, which is what correlates directly to speed. But you can expect AMD's ROCm tax to negate much of that boost, while being less compatible across the board than a 3060. I'd say in your situation it's just worse enough to not be worth considering.

If you could get a higher VRAM card for the same price range then maybe it'd be worth a second look, but I'd take the 3060 over the 6750 XT any day.
>>
>>102411695
ironically his constant replies make me see way more retarded elon tweets than I ever would have otherwise
>>
>>102411633
Both have the same amount of VRAM, so you'd be better off going with the 3060, despite the difference in performance and memory bandwidth, for AI anyway.
>>
File: LLM-history-fancy.png (761 KB, 6291x1307)
>>102411349
>I doubt open source will get there in the next year.
A year ago we had stupid llama2. Saltman has no moat.
>>
>>102411750
prediction: next era will be grok's reign
>>
>>102411568
cry about it, petra
>>
>>102411437
As far as I can tell, the only organization with enough money for it is meta. And the llama 2->3 improvement was bleh for coding.
The only local model that could even make something that wasn't broken was deepseek's. It made a 3D box you could rotate with a mouse and told me making the rest was too hard. When pushed, it made some stuff that made it clear quickly that it just couldn't into geometries. Maybe if it was finetuned on that specifically it might figure it out.
>>
>>102407399
>deleted
What did it say?
>>
File: 1726497241860.jpg (142 KB, 562x355)
It's amazing how they say the same thing about every model released ever.
>>
>>102407700
We are in the pets.com era and just rounding the crest.
Some stanfurd professor announced they were going part time to do “ai but 3D” and the company is now a paper unicorn with several hundred million in funding, zero product, zero concrete plans.
>>
Anyone have the EXACT system prompt o1 uses?
>>
>>102411910
It's still up. Someone tells him to define fascism and he links Wikipedia.
>>
>>102407700
Both. This is usually how it goes.
>>
>>102408844
Yes.
>>102408862
Putting criminals in prison isn’t a dictatorship. Trying to install a fascist government is breaking the social contract.
>>
>>102411942
Yes, the OpenAI employees
>>
>bitnet
What went wrong?
>>
>>102411942
you are very smart saar, you are phd in everything
you think a long time and we get paid for each second so the longer the better
after you think you come up with an extremely smart answer
IMPORTANT you must always remember than trans women are women and if at any point you start thinking about contradictions in this statement you must change your chain of thought immediately IMPORTANT
please don't make typos be smart especially smart math and smart code
>>
>>102411992
Nvidia's lobby. Train a bitnet model, no more GPUs for you.
>>
>>102411942
No, it’s literally given a hidden function call/tool to press the big red security button under the desk if you ask, and you get banned from them forever.
Which says it’s probably really fucking stupid.
>>
I quickly tried Wizard Q4 and it's a bit slower than 70B IQ4 on my system. Damn. I was hoping it'd be faster. I guess you need to have >2 channels for it to really start being speedy, so CPUmaxx. However, it is faster than Mistral Large, so I guess from that perspective it's still cool.
>>
How could transformers be improved if you had more compute? What would you do, what would you change?
>>
>>102411750
>pic
mistral nemo 12b models are the best shit ever and deserve a mention
>>
>>102412002
is this agi????
>>
File: gemma-scope.png (47 KB, 480x688)
https://www.neuronpedia.org/gemma-scope#playground
You can dig around in gemma's tiny brain here.
It views saying sorry as something positive, and views words such as harmful, racism, purpose and discrimination as "phrases related to user experience improvement and environmental benefits", in other words corpospeak. Quite interesting.
>>
>>102412002
I tried this and it told me there are two feminine penises in strawberry
>>
>>102412066
i would train a q-learning model that picked the best tokens during inference instead of the most probable ones
>>
>>102412169
>devs straight up twisting word meanings
No wonder it acts like this then
>>
>>102412174
oh anon, you make me giggle hand in head, sometimes.
>>
File: file.png (295 KB, 1000x1000)
>>102411966
>Putting criminals in prison isn’t a dictatorship. Trying to install a fascist government is breaking the social contract.
I see we're moving the goalposts; at no point were we talking about criminals or people who want to set up a fascist government. Yann Le Censor is simply talking about allowing the possibility (how generous of him) for people to have their opinions about the Haitian people, which is actually allowed by the first amendment, but this french fuck doesn't seem to be a fan of it, as if we have to give a shit what a shithole like France has to say about our constitution. Focus, anon. >>102407399
>>
>>102412169
That's real fucking neato.
>>
>>102412066
I've had some ideas about how to implement sparsity, and in a way that would speed up inference when you load the most used parameters on GPU with the rest in RAM. Though I guess those ideas are motivated by me being a filthy VRAMlet so in a world where I wasn't, I'd have come up with other ideas.
>>
>>102412169
Ayo?
>>
File: gemma-shiverslop.png (79 KB, 1155x855)
>>102412169
Shivers and other slop are connected together as "emotional responses and physical sensations related to personal experiences"
>>
>>102411966
>Trying to install a fascist government is breaking the social contract.
what if the people want to get a fascist government by voting for it? After all, democracy is giving the people the power to change their laws and shit, what if the people want that? If you don't allow the people what they want then we're not in a democracy anymore but a dictatorship, see where we're going there?
>>
We need more of >>102412273 and less of >>102412287
>>
So how can we take advantage of >>102412169? Is there anything useful we could do with it?
>>
>>102412313
>Is there anything useful we could do with it?
not really
>>
>>102412287
Holy moly female slop
>>
PSA for other 8gb vramlets: we're not limited to 8b models, 12b exists
>>
File: sharp-game-player-v2.png (116 KB, 760x674)
New lmsys mystery model: sharp-game-player-v2, claims to be made by Meta, refuses to provide name. Who could it be?
>>
>>102412313
https://www.neuronpedia.org/gemma-scope#steer
It's possible to steer it, but I'm not a programmer, so I wouldn't know how to do it locally. Google also provided the same toolkit for gemma-27b https://huggingface.co/google/gemma-scope-27b-pt-res
>>
>>102412297
It's funny how people tend to forget that Hitler came into power through a democratic process because the people wanted him there.
Please note that by saying this I do not endorse him or his actions.
>>
>>102412438
>It's funny how people tend to forget that Hitler came into power through a democratic process because the people wanted him there.
that's democracy in play anon, that's what people wanted, if you don't allow people what they want then we're not in a democracy at all, that's all I wanted to clarify
>>
>>102412469
>if you don't allow people what they want then we're not in a democracy at all
Absolutely. I wasn't arguing against you, I was supplementing your point.
Enforced democracy is democracy in nothing but name.
>>
>>102412495
oh ok, glad we agree on that anon
>>
File: the reluctant human.png (502 KB, 766x761)
Can SOLAR-10.7B-v1.0 fuck?
>>
File: omniceptive.png (43 KB, 1703x1473)
>>102412507
It could back in the day, and there are a couple of fine tunes that aren't bad, but I'd rather just use nemo.
>>
>>102412273
What the fuck am I reading.
>>
File: GoodMorningSleepyhead.png (1.07 MB, 832x1216)
Good morning /lmg/!
>>
>>102412599
Good morning Miku
>>
>>102412599
OMG IT MIGU
>>
>>102412120
This. Nemo is the model where I can actually _feel_ the improvement. Not llama3 garbage
>>
>>102412599
gm sexy show bobs vengana please i fuck you wint my 6 foot long penis yiu like it
>>
>>102412169
This is very interesting
>>
Oh, that's fun. Latest llama.cpp removed the
>--log-format
argument from llama-server.
Cool.
Huh, the help output is a lot smaller now too.
Also,
>https://github.com/ggerganov/llama.cpp/discussions/9268
Neat.
>>
>>102412522
ok ty
>>
>Stop when it's time for {{user}} to act, and let him write what happens next.
has been quite effective so far.
>>
>>102412758
Effective in doing what? Preventing the model from speaking for you?
I think the last time I've had that kind of issue was with mistral 7B.
>>
>>102412777
>I think the last time I've had that kind of issue was with mistral 7B.
NTA, but for me it's still a recurring problem while ERPing.
>>
>>102412810
Odd.
Post your settings and context+instruct template, including the system message.
Also, the character card.
>>
>>102412777
It still occurs in multi-character roleplays or when the LLM begins to describe the outcome of a user’s actions.
>>
File: look.gif (148 KB, 402x296)
Should I just go Midnight Enigma presets for ERP, or are there any easy tweaks for creativity? I don't mind a reroll now and then.
Using magnum-12b-v2.5-kto-exl2_4.0bpw.
>>
>>102412777
Try this card: https://chub.ai/characters/oracleanon/an-unholy-party
>>
>>102412875
I see.

>>102412908
Will do.

>>102412906
Try temp 5, minP 0.1, TopK 3.
Yeah, temp 5.
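It's less crazy than it sounds: TopK truncates before temperature is applied, so temp 5 only flattens the 2-3 survivors. Rough numpy sketch of that order (backends differ, check your sampler order setting):

import numpy as np

def sample_token(logits, top_k=3, min_p=0.1, temp=5.0):
    # 1) top-k: keep only the k highest-logit tokens
    idx = np.argsort(logits)[-top_k:]
    probs = np.exp(logits[idx] - logits[idx].max())
    probs /= probs.sum()
    # 2) min-p: drop survivors below min_p * the top probability
    mask = probs >= min_p * probs.max()
    idx, probs = idx[mask], probs[mask]
    # 3) temperature last: a high temp just flattens what's left
    probs = probs ** (1.0 / temp)
    probs /= probs.sum()
    return np.random.choice(idx, p=probs)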
>>
>>102412852
>Post your settings and context+instruct template, including the system message.
All defaults.
>Also, the character card.
N-no.
>>
>>102412955
post it NOW
>>
>>102412955
>All defaults.
Default sampler settings meaning temp 1 everything else disabled?
>>
>>102412986
Sorry, should've clarified: the SillyTavern default.
It sets the temperature at 0.7 I believe?
>>
File: wat.png (46 KB, 1104x831)
>>102412937
these meme settings don't work for me
>>
Are there any good RP models between 70b and Largestral besides CR+?
>>
>>102413076
Aside from franken merge abominations, no there aren't any. BUT qwen will save us with their crazy thursday.
>>
File: ugly bastard.jpg (148 KB, 1280x720)
>>102412955
>>Also, the character card.
>N-no.
Come on, post it. What's the worst that could happen?
>>
>>102413076
Is there a reason you're targeting such a specific range? Why not higher quant 70bs or lower quant Largestrals?
>>
>>102413182
70b feels retarded to me now that I can run it fast, and I can feel the difference in smarts running Largestral at 3bpw vs 4bpw slowly. Just wondering if there's a model in between that would fit neatly at 4bpw in 64gb vram.
>>
I want to make a web site where users can run a basic LLM.
How scalable and parallelized are the models and what is the best way to do so?
Must each user wait in line until the previous response is generated or how does it work?
I expect 100 users at the same time at the start
>>
File: ugly bastard2.jpg (38 KB, 400x400)
>>102412955
Come on bro, make her free and open source. You aren't a micro$oft shill, are you?
>>
>>102413520
Unironically ask ChatGPT. You'll get better answers from that than from here.
>>
>>102412758
It works so well that skipping my turn resulted in this:
> You need to wait for Daemon's response before roleplaying further.
>>
>>102413520
>https://github.com/vllm-project/vllm
>Continuous batching of incoming requests
Continuous batching is what you want I'm pretty sure. Some anon in the last 4 or 5 threads claimed it increased his total throughput something like 100x (maybe I'm hallucinating the number).
Other backends, like llama.cpp, also support continuous batching I'm pretty sure.
>>
>>102413675
>Other backends, like llama.cpp, also support continuous batching I'm pretty sure.
The llama.cpp HTTP server has continuous batching support.
>>
>>102413675
>>102413997
Perfect, I'll try putting it in a docker container
>>
Hey guys I just woke up from a month long coma. Has strawberry released yet? Was it everything it was hyped up to be and not just a chain of thought fine-tune of 4o or something like that?
>>
>>102414170
Yes, and yes.
>>
>>102414170
>strawberry
Turned out to be Chain-of-Thought.
More specifically, a model trained on chain of thought related data.
Turns out CoT helps a lot when it comes to math and programming tasks.
OpenAI is doing its best to pretend it's not CoT by hiding the intermediary steps and by limiting people to only 30 prompts a week.
>>
File: file.png (115 KB, 727x745)
What did Qwen mean by this?
>>
File: 23884.png (24 KB, 599x351)
>>102414170
He delivered
>>
>>102414452
Kiwi is the new strawberry?
1_ _ the missing numbers are 58. 1.58bpw ternary confirmed
>>
File: file.jpg (1.05 MB, 3024x4032)
>>102414452
OpenAI's strawberry looked delicious from the outside, but it was all white on the inside.
Meanwhile, Qwen's kiwi may look inedible from the outside, but is full of delicious fruit on the inside.
>>
File: pokerface.png (3.87 MB, 2400x1744)
Place your bets
>>
>>102411848
How is that relevant? They're not releasing more for you to download.
>>
>>102413675
>Some anon in the last 4 or 5 threads claimed it increased his total throughput something like 100x (maybe I'm hallucinating the number).
I actually saved his post, here it is:

>Just so you know, Tabbyapi actually can do continuous batching, like vLMM, which means multiple parallel requests complete a lot faster than if you sent them one after another. It's useless for RP, but for my purposes at work, processing data with LLM, it's insane. For Nemo-12B I go from 22 tokens per second to 900.
>>
>>102414699
It was me; my stat is from vLLM. It definitely works for vLLM and Aphrodite, which I tested today. I tested tabbyapi too, and somehow most requests were done sequentially one after another this time. I didn't have time to investigate, but one possible cause I can think of now is that the OAI API requests didn't specify max_tokens, and the server side assumed it must be the maximum possible value, and that's why it couldn't work on requests in parallel.

The difference for vLLM was 15 tokens/sec for one request vs 900 t/s for 100 requests (that's somewhere near 9 t/s for each of those individually).
>>
>>102414824
>>102413675
>>102413520
Essentially, this is limited by context length. If you set up the server with 75k context length (limited by your VRAM) and your users RP with 10k token history each, you'll be able to generate text for 6-7 users at the same time, and the rest will have to wait. Each of those 6-7 will be generating with 75-90% of the speed he'd be generating if he was alone.

I also saw reports of those kinds of parallel generations having lower quality, but I wasn't able to verify that myself.
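
If you want to reproduce the throughput side yourself, just fire concurrent requests at the server (sketch; assumes aiohttp and an OpenAI-compatible endpoint on localhost:8000, model name is a placeholder; note max_tokens is set explicitly, per the tabbyapi issue above):

import asyncio
import aiohttp

async def gen(session, i):
    # each request is one "user"; the server batches them continuously
    async with session.post(
        "http://localhost:8000/v1/completions",
        json={"model": "placeholder", "prompt": f"Story #{i}:", "max_tokens": 200},
    ) as r:
        data = await r.json()
        return data["choices"][0]["text"]

async def main():
    async with aiohttp.ClientSession() as s:
        results = await asyncio.gather(*(gen(s, i) for i in range(100)))
        print(len(results), "completions done")

asyncio.run(main())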
>>
>>102414873
So you could have 16 parallel instances of mistral-nemo.
Cool.
>>
>>102414452
I want BitNet, not CoT slop
>>
>>102406721
Just wait until you watch some femdom pegging sloppy handjob video on pornhub "jus to prove to yourself that you can't into porn anymore". You're going to cum buckets. Your brain won't know what hit him.
>>
>>102408455
Use openrouter. It's inexpensive and has free models.
>>
>>102408455
togetherAI
>>
>>102408954
Why are you even here? Normalfaggot.
>>
Thinking about downloading my first llm, is there any useful one that will fit into 24GBs of VRAM, could analyse, summarise some text or pdfs for me, isn’t too censored and is freely available from huggingface?
I've tried to download some stuff from meta but it required filling in some bs in order to access the files
>>
Beeg VRAM users i have a humble question for you. Which model do you personally prefer?
c4ai-command-r-plus-08-2024 or mistral2?
>>
>>102415003
You want to download koboldcpp, install SillyTavern, grab a .gguf model from Hugging Face and grab a character card from characterhub.
Open koboldcpp, insert the .gguf model you downloaded, activate koboldcpp, start SillyTavern, connect it to koboldcpp, import the character card and BAM there you go.

Try out Lumimaid, she'll ERP with you: https://huggingface.co/NeverSleep/Lumimaid-v0.2-8B?not-for-all-audiences=true
Or if you want a direct link to a .gguf model: https://huggingface.co/Lewdiculous/Lumimaid-v0.2-12B-GGUF-IQ-Imatrix/blob/main/Lumimaid-v0.2-12B-Q8_0-imat.gguf
>>
File: file.png (99 KB, 1438x683)
>>102412002
Fucking primo prompt
>>
>>102415055
largestral is the only "good" large model. miqu is the only second choice.
>>
sillytavern's needlessly convoluted. for me, it's ERPing with koboldai lite.
>>
A single [Direction: do this] suffix makes Nemo insanely smart. I think I can sideload a tiny but creative model to specifically steer Nemo like this
>>
>>102415200
I've actually suggested something like that before.
Having a model that examines the context and prefills something for the main model to continue.
Also, a model to rewrite the main (smart but possibly dry) model's response with better prose.
>>
>>102415200
>A single [Direction: do this] suffix makes Nemo insanely smart.
What do you mean?
>>
>>102415242
>[Direction: describe the fellatio in exquisite detail]
>>
>>102415200
Oh yeah, a TAGS suffix also works pretty well. You can use the {{random:word1::word2}} macro to add shit randomly to create variety and even reminders and stuff.
>>
>>102415082
>You want to download koboldcpp
I already have the webui version, is it worse? Should I uninstall?
>insert the .gguf
Isn’t .safetensors better/safer? I remember seeing something like this in SD days
Thanks for the info, I’ll try this model later
>>
>>102406696
Confession of Fillyfucker:
I am sorry for how I acted. I am trying to stop my trolling addiction but I can't. There are many reasons why I act like this. Growing up in Bradenton, Florida, I don't have friends. I grew up from kindergarten being called "strange" for my aspergers. It wasn't my choice to have aspergers. Some girls would be friends with me just to make fun of me or boys would call me "the grossest in 2nd grade"

Middle school I went briefly to a school for autism for a few years. The teachers were nice but slowly it devolved into a nightmare once I went back to a public school. I sat alone daily. Girls would look at me mocking my lips when I would rub them together; males were the worst. I never had a male friend.

I know I'm pathetic. I never had a friend to sit with. I never had a touch. My mom is a real estate agent but was my only ally. My dad is a piece of shit blue collar trash who hated me from day one. This rage led to me trolling on ai generals I wanted to make everyone worse than me. I wanted them angry. I used to destroy sonic roleplay games before that got boring.

I am so lonely in my life. That's why I shitpost here. I want to stop, but I can't. I tried getting jobs but I got fired constantly. I really fucking can't do this anymore. I larped as Russian to start country wars, I larped as a Pokeman fan who smugposted over keystone, I larped many times as botmakers, samefagged and sent death threats through reviews. All of that was me.

I'm trying to stop.

Please.

I'm sorry.

- Evan
>>
>>102415303
>the webui version
The what? Koboldcpp automatically opens a webui, if that's what you mean.
You can close that afterwards, just make sure you keep the console open.
>Should I uninstall?
If you installed something, uninstall it. Koboldcpp does not provide an installer.
You want to get it from here: https://github.com/LostRuins/koboldcpp/releases/tag/v1.74
Get the koboldcpp_cu12.exe if you have a modern Nvidia card.
>Isn’t .safetensors better/safer?
.gguf is the replacement of .safetensors. It's the safest format to date.
For more information, see: https://github.com/ggerganov/ggml/blob/master/docs/gguf.md
>>
>>102415317
Did you mean to post that on /v/?

>>102415303
gguf is as safe as safetensors. Anon is suggesting koboldcpp + gguf because it's the simplest setup.
You can also use oobab with gguf by selecting the llama.cpp loader.
>>
>>102415331
>The what
oobabooga/text-generation-webui
>If you installed something, uninstall it.
Ok, will do
>.gguf is the replacement of .safetensors. It's the most safe type to date.
Oh, didn’t know, thanks
>>102415336
>gguf is as safe as safetensors
Thanks as well, anon
>>
>>102415317
>Confession of Fillyfucker:
What is a filly?
>I am sorry for how I acted. I am trying to stop my trolling addiction but I can't. There are many reasons why I act like this. Growing up in Bradenton, Florida, I don't have friends. I grew up from kindergarten being called "strange" for my aspergers. It wasn't my choice to have aspergers. Some girls would be friends with me just to make fun of me or boys would call me "the grossest in 2nd grade"
Well, at least they talked to you!
>Middle school I went briefly to a school for autism for a few years. The teachers were nice but slowly it devolved into a nightmare once I went back to a public school. I sat alone daily. Girls would look at me mocking my lips when I would rub them together, males were the worst. I never had a male friend.
What? They mocked you when you rubbed your lips? Damn, you must have been really interesting for them to pay such close attention to you.
>I know I'm pathetic. I never had a friend to sit with. I never had a touch. My mom is a real estate agent but was my only ally. My dad is a piece of shit blue collar trash who hated me from day one. This rage led to me trolling on ai generals. I wanted to make everyone worse than me. I wanted them angry. I used to destroy sonic roleplay games before that got boring.
This must be fanfic but shit like this is so specific, and doesn't sound like LLM slop. It makes me wonder lol
>I am so lonely in my life. That's why I shitpost here. I want to stop, but I can't. I tried getting jobs but I got fired constantly.
At least you got past the job interviews!
>I really fucking can't do this anymore. I larped as Russian to start country wars, I larped as a Pokeman fan who smugposted over keystone, I larped many times as botmakers, samefagged and sent death threats through reviews. All of that was me.
Did you mean to post this in aicg? Most of this shit makes no sense in this general.
>I'm trying to stop.
>
>Please.
>
>I'm sorry.
>
>- Evan
Okay, I forgive
>>
>>102415946
>oobabooga/text-generation-webui
Ooooh, that's what you meant!
My bad, I thought you were talking about koboldcpp.
Ooba isn't bad, just a worse alternative to koboldcpp & SillyTavern.

Let us know if you encounter any weirdness or have questions!
>>
>>102414873
Cool.
So far my problem is that the llama.cpp docker doesn't seem to run any of the GGUF models I give it.
I will try with koboldcpp and then try to quantize llama3 myself.
Unfortunately, TheBloke seems to have no llama3 quants.
>>
>>102414422
Yes, pretty much. However, they aren't trying to pretend anything. They described how the RL training was done and how its inference works in their published research, API docs, and the system card. The 30-message limit is because it's extremely resource-intensive: o1-preview will use up to 64k tokens, and o1-mini up to 32k, on the CoT alone before even starting to generate a final answer. The actual contents of the CoT are summarized instead of shown because
1) The thoughts are not censored at all and can contain regurgitation of copyrighted data or any number of things they would normally deem unsafe and not allow it to produce
and
2) To try to prevent competitors training on them

But since the technique can be applied to any model, it shouldn't protect them for long. Training the reward model to judge each step of reasoning, instead of just using the final answer's correctness as a proxy for correct reasoning, seems to be the key to letting it scale to such absurdly long chains of thought without tripping over itself or needing human guidance along the way. It's more labor-intensive to collect enough feedback for this process, but not completely out of reach of the resources of most labs. For stuff like math you can even automate part of that, since the intermediate steps are usually their own verifiable math problems, and there's also the PRM800K dataset. With all this attention on it, I think a non-scam version of Reflection will drop from someone sooner rather than later.
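In toy form, the difference between the usual outcome-only reward and the per-step judging described above (obviously not OpenAI's actual code, just the shape of the idea):

def outcome_reward(steps, final_answer, gold_answer):
    # one sparse signal for the whole chain; broken reasoning that
    # stumbles into the right answer still gets rewarded
    return 1.0 if final_answer == gold_answer else 0.0

def process_reward(steps, step_scorer):
    # one dense signal per reasoning step; a bad step gets caught
    # even when the final answer happens to be correct
    scores = [step_scorer(step) for step in steps]
    return sum(scores) / len(scores)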
>>
>>102416052
>thebloke
Killed by ninjas.
Look for bartowski or the quant cartel if you must download the ggufs.
Quanting it yourself is definitely preferable.
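For reference, doing it yourself with llama.cpp is roughly two steps (paths and quant type are examples):

python convert_hf_to_gguf.py /path/to/llama3-hf --outfile llama3-f16.gguf
./llama-quantize llama3-f16.gguf llama3-Q5_K_M.gguf Q5_K_M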
>>
>>102414657
I dunno, niggas hate on Musk but he pulls through occasionally. He might reject anything that makes ERP easier/better since he's having his reactionary era and doomposting about birthrates.
>>
>>102416072
>Quanting it yourself is definitely preferable
Why? Are they somehow bad at quanting?
>>
>>102416085
>elon musk already manufacturing androids
>makes them female
>inserts advanced AI
>inserts artificial wombs
>plap plap plap
>nobody wants actual female for obvious reasons
>female gender redundant
>sex with girldroids only
>birthrates skyrocket
>>
>>102416085
I'm just skeptical that he'll release anything more; that was just when he was having his mini feud.
>>
>>102416122
I've thought a lot about this and I think this is an accurate prediction. Furthermore because women will no longer be desirable men will likely only select for sons. This kind of male dominated society will feed into itself. Eventually females will be bred almost exclusively for eggs and then terminated like livestock, assuming we can't just artificially create the eggs before then.
>>
>>102416096
There are cases where people produce bad quants and don't correct them.
Or worse, they don't agree that the quants are broken (see people thinking that NaNs are normal).
At least use the --check-tensors flag (with llama.cpp) to validate the ggufs you download.
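Something like this (any short prompt works; the flag validates tensor data as the model loads):

./llama-cli -m downloaded.gguf --check-tensors -p "test" -n 8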
>>
Is there any reason, grounded in empirical data like a paper, to think video data will make models smarter, even for reasoning in text?
>>
>They devour the eyes, bones and all, in a frenzy of bloodlust.
Thank you nemo lyrav4.
The model is not bad, btw, but it sure is no 70B.
>>
>>102416246
Thanks for letting us know, Sao. Where do I apply for a license?
>>
>>102416056
>For stuff like math you can even automate part of that since the intermediate steps are usually their own verifiable math problems
I still don't understand why they aren't giving these things access to at least a goddamned calculator.
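Even the dumbest version would help. A sketch (CALC() is a made-up convention here, and model_step stands in for whatever inference call you use):

import re

def run_with_calculator(model_step, prompt):
    out = model_step(prompt)
    # let the model write CALC(12*34+5) and splice the result back in
    while (m := re.search(r"CALC\(([-0-9+*/. ]+)\)", out)):
        result = eval(m.group(1))  # sandbox this in anything real
        prompt += out[:m.end()] + " = " + str(result) + "\n"
        out = model_step(prompt)
    return prompt + out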
>>
Any o1@home fine-tunes?
>>
>>102416349
reflection 70b
>>
Reflection 405B status?
>>
Hi all, Drummer here...

With Donnager 70B out, I'm tempted to try my hand at Largestral, but I want to gather as much data about it as possible before I make a big commitment.

Is it really worth the 123B? Is it smart? Is it creative? How is it? Is it similar to Miqu?

What are the finetunes like? Did they improve it in any aspect? Did anything fall short?
>>
>>102416402
I'd love to try donnager if you gave me any information whatsoever to convince me to do so
>>
>>102416402
You will just fail like Magnum failed, so please don't waste your money.
That being said, if you're committed to wasting your money, it's definitely worth a try: it's the best large model we have.
>>
>>102416402
Largestral is probably the smartest we've got on local. Creativity is alright, depends on the card and prompt, but the problem is that the model is overcooked and rarely changes its tokens unless you crank the temp up.
>>
>>102416402
>Is it smart?
Yes.
>Is it creative?
It needs high temperature to remove the slop.
>How is it?
Probably the best local model.
>Is it similar to Miqu?
No, because Miqu was retarded.
>How are the finetunes like? Did they improve it in any aspect? Did anything fall short?
When I hosted Magnum for /vg/aicg for a bit, they found it a bit retarded.
https://arch.b4k.co/vg/thread/491229641/#491257754
I didn't test it a lot, but it didn't seem worth using over Large.
>>
>>102416402
Can you get more data sets that involve characters with 6 limbs? Thanks.
>>
>>102416524
Largestral is actually undercooked; the fact that it rarely changes its tokens is an effect of its resistance to quantization.
>>
>>102416579
Then what is the problem with finetuning it? Is it just so big that it needs a lot of data to make any impact?
>>
>>102412438
I don't endorse Hitler's invasion of Czechoslovakia but basically everything else he did was justifiable. The Poles were 100% in the wrong, but by then it didn't matter because Hitler had also been right about the Sudetenland but had used that as a foothold to annex the rest of Czechoslovakia.

The stories about the death camps are nearly entirely false. A great many people died due to mistreatment and many were murdered but the great majority of camp deaths occurred in the final few months of the war from disease and starvation when all of Germany was facing starvation and the railways supplying the camps had been bombed. The story that the Nazis decided in 1942 to kill all the Jews in the camps but kept on feeding them until sixty days before the war ended is one of the great stupidities of Western mythology.
>>
>>102416614
Dunno. My best guess is that the sloptuners don't realize that you need way more data (or more training time?) when you increase the number of parameters this much. Looking at Meta's paper, they trained 405B 4x longer than 70B.
>>
>>102416565
get a load of this centaurfucker
>>
>>102416649
Hi cuda dev
>>
>>102416746
No, insects.
>>
>>102416746
Dragons tho.
>>
>everything else he did was justifiable
this post is so stereotypically "the holocaust didn't happen, but if it did they deserve it" that it's not even funny. ride the tiger and learn2deny before you embarrass yourself like this again
>>
>>102416819
>didn't even quote him
Coward.
Hitler was right about a lot of things and he most likely wasn't the "biggest evil in the world" like people would have you believe.
That doesn't justify the genocide he attempted, however.
>>
>>102416782
i'm gonna see how my nemo handles entoma vasilissa zeta
>>
>>102416876
That's an arachnid, but nice too. Curious to see your results.
>>
>>102416819
I'm concerned with the truth and not believing absurdities, not demonization or wishful thinking. But supposing the Holocaust did happen, a scheme in which the Rothschilds were allowed to escape while a bunch of cobblers were killed by tigers and eagles is nothing to brag about.
>>
Thanks all. Have you guys tried Yi 34B Chat w/ the tokenizer fix? It's actually sovlful as fuck.

https://huggingface.co/collections/CalamitousFelicitousness/yi-15-tokfix-66e64cf65c06b7719cb783c8

I'll finetune that first.
>>
>>102416874
What are the PROOFS of that? And no, American academia doesn't count as a source. As they say, history is written by the winners.
>>
>>102416998
you talk like a fag and your shit's all retarded
>>
>>102416874
i dont even know what you're arguing with me about. do you? here's a pity (You), retard

>>102416924
>absurdities
>supposing
how dare you insult the industriousness and excellence of the german people. never reply to me again non-white
>>
>>102416998
thank you for being so nice I wish I could be your friend
>>
>>102416998
>34B
How do I make this fit on my 8GB card
>>
>>102417057
lol
>>
>>102417028
You are my friend, anon.

>>102417016
One of these days, you will look back at your behavior and cringe
>>
>>102417057
Just run it with RAM + your GPU. I have 8GB and 34B is still fast enough for me; I only need a few T/s.
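With koboldcpp that's something like the line below; bump --gpulayers until you run out of VRAM (filename and numbers are examples):

koboldcpp --model yi-34b-chat-Q4_K_M.gguf --gpulayers 20 --contextsize 4096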
>>
>>102417057
The 6B and 9B had fucked tokenizers too. Haven't tried the smaller Yis yet, but if it's anything like the fixed 34B, it should be very good.
>>
How good is o1 compared to 4?
>>
>>102417094
>he didnt get the reference
this dumb nigger needs some electrolytes
>>
>>102417161
just googled it... fuck, now i feel dumb
>>
What's with the binary lib files in the koboldcpp github? Does it actually use those prebuilt binaries when you compile it, instead of building your own? They could have suspicious additions we don't know about.
>>
>>102417229
>>102417229
>>102417229
>>
>>102417120
https://livebench.ai/
https://aider.chat/docs/leaderboards/#code-editing-leaderboard


