/g/ - Technology


File: 39_04173_.png (1.14 MB, 896x1152)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>101361021 & >>101345759

►News
>(07/09) Anole, based on Chameleon, for interleaved image-text generation: https://hf.co/GAIR/Anole-7b-v0.1
>(07/07) Support for glm3 and glm4 merged into llama.cpp: https://github.com/ggerganov/llama.cpp/pull/8031
>(07/02) Japanese LLaMA-based model pre-trained on 2T tokens: https://hf.co/cyberagent/calm3-22b-chat
>(06/28) Inference support for Gemma 2 merged: https://github.com/ggerganov/llama.cpp/pull/8156
>(06/27) Meta announces LLM Compiler, based on Code Llama, for code optimization and disassembly: https://go.fb.me/tdd3dw

►News Archive: https://rentry.org/lmg-news-archive
►FAQ: https://wikia.schneedc.com
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Programming: https://hf.co/spaces/bigcode/bigcode-models-leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
File: mikudance.gif (2.13 MB, 498x443)
►Recent Highlights from the Previous Thread: >>101361021

--Papers: >>101362640 >>101362370
--L3 70B Community Tunes Are Probably Undertrained, Stick to Base Models for RP: >>101364116 >>101364172 >>101364314 >>101364357 >>101364381 >>101364856
--Flash Attention 3: Fast and Accurate Attention for Hopper GPUs: >>101368103 >>101368218 >>101368292
--Arena Learning: Build Data Flywheel for LLMs Post-training via Simulated Chatbot Arena: >>101367877 >>101368133 >>101368242 >>101369116
--Ways to Prevent AI from Using Inaccessible Language: >>101363368 >>101363415 >>101363429 >>101363879 >>101364065
--The Hallucination Problem in LLMs: Are We in Denial?: >>101369167 >>101369207 >>101369225 >>101369266 >>101369313 >>101369478 >>101369613 >>101369963 >>101370741 >>101370834
--Running gemma-27b-it on two P100 GPUs: Performance, Cost and Alternatives: >>101365562 >>101365688 >>101365909 >>101366051 >>101366080 >>101366162 >>101366210 >>101366054
--P40 Special Power Supply Woes and Connector Confusion: >>101363997 >>101364061 >>101364210 >>101364464 >>101364771 >>101364476 >>101364501 >>101364828
--On Open-Weights Foundation Models: Potential Benefits and Challenges: >>101367230 >>101367290 >>101368246
--MambaVision: A Hybrid Mamba-Transformer Vision Backbone by NVlabs: >>101364298
--Is 48GB vRAM still relevant after Gemma 27B?: >>101361672 >>101366696 >>101367165
--How to make Gemma 27b less dramatic for slice-of-life stories: >>101366160 >>101366275 >>101366431
--Best Erotica Model for a 3090 and 32GB RAM: >>101363945 >>101365063 >>101365912 >>101365964 >>101366004 >>101366052
--Are We Experiencing a Language Uncanny Valley with Today's LLMs?: >>101366197 >>101366245 >>101366320 >>101366681 >>101370648
--AMD Acquires Silo AI to Enter the LLM Fray: >>101369677 >>101369758 >>101370305
--Release 0.1.7 of exllamav2 by turboderp on GitHub: >>101369750 >>101369869
--Miku (free space): >>101366285 >>101367210 >>101368077

►Recent Highlight Posts from the Previous Thread: >>101361028
>>
File: 1699737265638391.png (33 KB, 719x346)
One more week.
>>
>>101371476
Thank you Recap Miku
>>
I can't believe it. It's already the end of Thursday and there's no new Mistral. Anon lied....
>>
I've seen people claiming that the text at the beginning of the context is more important
and I've also seen claims that the text close to the last messages is more important
which one is true?
>>
>>101371588
Both, a lot of models recall details at the start and end of the context window best. The shit in the middle is more likely to get overlooked
>>
they're really getting desperate, huh
>>
>>101371688
yes we are
>>
>>101371688
yes they are
>>
>>101371721
stop talking for us
>>
I can fully load bartowski_UNA-ThePitbull-21.4B-v2-Q6_K.gguf onto my 3090 and still have 16384 tokens of context so I've been trying it out and I hit a brick wall with a certain sexual fetish. The model had misconceptions about a term and pieces of that misconception lingered even after I explained. This isn't it, but it's like if I said "water sports" and it thought I meant parasailing, and then after I added an explanation it changed to "urinating on someone while parasailing" and further words could change it to a jet ski instead but it wouldn't drop the idea that it actually involved a sport played on the water no matter what words I used. Disappointing.

Anyway the obvious comparison is to intervitens_BagelMIsteryTour-v2-8x7B-3.7bpw-h6-exl2-rpcal (or another 3.7bpw Mixtral 8x7B derivative of choice) because that also just barely fits onto a 3090 with 16k context. Testing them head to head is next.
>>
>>101371737
>UNA
Jesus you fucking moron...
>>
>>101371670
But that adds to the image count, not the image limit
>>
>>101371745
>UNO
Time to play the Draw 25 on you.
>>
>>101371769
>+1 towards the image limit
>>
File: rtx 4090.jpg (1.8 MB, 4500x4344)
Are there any gemma 27b finetunes for cooming? I need to coom. I need to coom to evil and dark shit. Help me coom please.

(no, really)
>>
>>101371811
why do you need a tune for this, gemma does everything with a properly written character, no roleplay experts, uncensored infinite fictions, or disabled content moderation policies needed
>>
>>101371811
Retard.
>>
>>101371811
https://huggingface.co/gghfez/gemma-2-27b-rp-c2-GGUF
>>
>>101371811
linux
>>
File: 39_04170_.png (1.61 MB, 896x1152)
Rin-chan a cute
>>101371670
We never even get close to the limit lol
>>
>>101371818
I want to use it for AI roguelite so custom prompting is limited. I can do a short system prompt but that's it. It refuses a lot of NSFW and violence stuff for "safety"
>>
>>101371852
>not faipl-1.0
ngmi
>>
File: Capture.jpg (44 KB, 1877x189)
Do Qwen2 models work in oobabooga or are they not supported? I swear every model based off of Qwen2 just spits out complete gibberish, no matter how I load the model or what settings I use.
>>
>>101371906
Your existence alone justifies all the blacked posting ITT.
>>
>>101371884
>I want to use it for AI roguelite so custom prompting is limited.
How are those things related?
>>
>>101371745
>>101371837
This is how he expresses his affection.

>>101371938
Play with settings. In Kobold I must turn off MMQ or Qwen writes poopie.
>>
>>101371943
Seethe, shitskin, seethe! Does monkey want banana? OOh ooh aah aah?
>>
>>101371938
If you're using exl2, I think it's still using an ancient version of exllama. But I don't use ooba.
>>
>>101371974
because gemma goes senile at 8k, and roguelites hopefully last longer than a dozen prompts

I dunno either
>>
>>101371943
oy vey.. not the faipl.. -aAAAcCCccccckKKKKKKKKKkkkk
>>
>>101371983
Yeah, I've been playing with the settings, but everything turns out shit. Even disabling MMQ. Thanks for the suggestion, though.
>>101371997
I'm using a gguf so I can run the 72b, so I have llama.cpp as the loader.
>>
>>101372029
>gguf so I can run the 72b
qwen models need flash attention ON otherwise they're known to be broken, this fixes it, but your backend probably doesn't have it merged yet
https://github.com/ggerganov/llama.cpp/pull/8412
> Heads up: currently CUDA offloading is broken unless you enable flash attention
https://huggingface.co/bartowski/Qwen2-7B-Instruct-GGUF
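If your loader exposes it, it's just a load-time flag. Minimal sketch with llama-cpp-python (the flash_attn parameter name is assumed from its recent releases, check your version):

from llama_cpp import Llama

llm = Llama(
    model_path="Qwen2-7B-Instruct-Q6_K.gguf",  # placeholder filename
    n_gpu_layers=-1,  # offload all layers
    flash_attn=True,  # per the PR above, needed for Qwen2 GGUFs on CUDA
)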
>>
>>101372022
You are not only mentally ill but a total moron if you think anyone cares about those licenses. See all the ai gf sites using mythomax.
>>
Not sure if this is the right thread. Will there ever be a program that allows you to search a gallery semantically with LLMs? I know there's Immich but it seems like the use case for that is real photos and I don't like that it's a web app. Ideally I'd want one that's trained on coomer shit.
>>
>>101371983
No it is not affection. UNA guy is a transparent scammer retard.
>>
So what is the verdict on sft/dpo stuff? I've been out of the loop for a month or two and I'm seeing this pop up.
>>
>>101372104
>dpo
make creat le bad
>>
>>101372089
You can do it right now if you care enough. Pass the images through llava (or something like that). From its output, calculate embeddings and store them somewhere. To find something, calculate embeddings for your search term and scan your db for anything within a certain distance. Bam. You got your semantic image search.
It looks relatively easy until you get to 'coomer'. I doubt you'll find many (or any at all) image->text models trained on porn.
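Minimal sketch of that pipeline (untested; the embedding model name is just a common default, and the captions stand in for whatever your image->text model outputs):

import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")
captions = {
    "img_001.png": "a girl with teal twintails dancing",
    "img_002.png": "a desktop PC with four GPUs",
}  # hypothetical captioner output
paths = list(captions)
embs = model.encode([captions[p] for p in paths], normalize_embeddings=True)

def search(query, top_k=5):
    q = model.encode([query], normalize_embeddings=True)[0]
    scores = embs @ q  # cosine similarity, since embeddings are normalized
    best = np.argsort(scores)[::-1][:top_k]
    return [(paths[i], float(scores[i])) for i in best]

print(search("twintails"))

Swap the dict for sqlite or a faiss index once the gallery gets big.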
>>
>>101372134
Yeah, most of it is furshit so ideally I'd want something trained on e621. I would've used spoiler tags to be funny but /g/ doesn't support them.
>>
>>101372094
>No it is not affection
It's hard to tell when all he ever does is insult people without context beyond being contrarian or just talking shit.
>>
File: hip.png (1.2 MB, 1024x1024)
>>101371974
>>101372008
AI Roguelite is an actual game that mixes hardcoded game mechanics with LLM outputs and AI images.

It's not an AI Roguelite RP I'm talking about
>>
https://www.techradar.com/computing/gpu/nvidias-rtx-5090-now-rumored-to-have-superfast-clock-speeds-as-well-as-being-super-slim-could-this-gpu-be-too-good-to-be-true
https://videocardz.com/newz/rumor-geforce-rtx-5090-base-clock-nears-2-9-ghz
The 5090 will probably be a 28gb vram card... it's over...
>>
>>101372211
>28gb vram
Are they even trying?
>>
File: file.png (763 KB, 768x768)
>>101371670
>>
>>101372081
>https://huggingface.co/Gryphe/MythoMax-L2-13b
>license: other
GEEEEEEEEEEEEEEEEEG
>>
>>101372218
why should they try? they make 90% of their money today from AI data centers, so of course they'll do everything in their power to make their 48gb vram enterprise cards that are 10 times the price of a 3090 the priority.
>>
>>101372222
Look at those contributed digits.
>>
>>101372211
>could-this-gpu-be-too-good-to-be-true
I hate journos so much it's unreal
>>
Realistically, we're never getting 405B weights, right?
>>
>>101372008

Aren't there ways to essentially reset your chat instance and then feed it back a summary of key events you (hopefully) wrote down in the previous session to continue whatever degenerate shit you've been jerking off to? The only potentially annoying part is writing down your entries, but isn't that the current meta anyway? I thought this was possible in Silly Tavern.
>>
>>101372282
We'll probably get the weights. The problem is that we have nothing to run them on (except cpumaxxxers).
>>
>>101372282
405b bitnet, as good as claude 3.5, trust the plan
>>
Come on Nvidia
Come on AMD
Come on Intel
Release hardware dedicated to AI use before Sam Altman gets his way and makes it illegal for consumer use.
>>
>>101372333
>There's dozens of us. Dozens!
>>
>>101372282
It will only be distributed to companies.
>>
>>101372074
Yeah, I had read that and I have flash attention turned on in oobabooga when I load it, but still no good. Thanks, though!
>>
>>101371737
Head-to-head test using https://www.characterhub.org/characters/Nutsucci/sara-your-former-babysitter-7a70adc63637 had me swiping two times with ThePitbull Q6_K in the first three posts to keep it coherent and 0 swipes with BMT 3.7bpw, but maybe that just means I need more aggressive sampler settings with ThePitbull. Was using min-p 0.07. I kind of feel like looking into this further is a waste of time though since I can just go back to BMT and not give another thought to ThePitbull.
>>
>>101372218
It would be irresponsible for them to release something stronger than what we have now. Good for them for thinking about the people.
>>
>>101372447
Have a safe day, Anon.
>>
>>101372317
Future looking bright for cpu chads. Wonder how gpumaxxxers are going to cope when 5090 finally drops... with only 28GB of ram:
>https://www.techradar.com/computing/gpu/next-gen-nvidia-rtx-5090-gpu-could-have-less-vram-than-previously-rumored-but-that-might-be-good-news-for-gamers
>>
>>101372447
I feel so safe and respectful.
Long live Oceania!
Long live Airstrip One!
>>
>>101372355
Nah not even that because they know it'd be leaked Miqu-style if they did that.
>>
>>101371811
Command r 35b surprisingly just works for that kind of stuff if you're a 24gb vramlet. I'm using it until Gemma exl2 is fully fixed.
>>
>>101372333
How you know it's fundamentally a cartel is that any one of the three could make money and steal market share from the other two by releasing a cheap high VRAM card with mediocre compute.

But they mysteriously don't. Their revealed preference is that keeping VRAM artificially scarce is actually more important to them than making money or competing with the other companies. Cartel.
>>
Booba is updated with exl2.
>>
>>101372527
AMD is part of the family so they don't actually compete with Nvidia.

Not sure what they have on Intel.
>>
>>101372517
What quant and what's your rig? Are you able to run Command-R at at least 5 tokens/second?
>>
File: ComfyUI_02426_.png (3.65 MB, 1536x2048)
>>101372447
RTX 8000s and A6000s exist.
Plenty of options if you want it enough.
Like any game this one is pay to win.
>>
I just downloaded Gemma2 but all replies I get are short and/or boring. What am I missing?
>>
>>101372630
s
>>
>>101372604
I can get 3.5bpw 8k context in with 20-25 t/s speed on my 3090. The context sucks but other than that it's pretty useable for cooming and can sometimes produce claude-tier sovl. Moreso than any other medium model in its range.
>>
>>101372630
a brain
>>
>>101372630
see: >>101367108
>>
God, I'm getting so much sex now, it's insane.
>>
>>101372630
What do you respond when you're asked "how you doing?". Do you give them a novel? Do you break into dance and song and tell them the story of your life? Are all your quirks always on display in every sentence?
>>
File: 00012-1677813217.png (1.19 MB, 1024x1024)
>>101372494
By buying cheap (if you are a richfag) a6000 and a6000 ada cards
>>
>>101372630
>What am I missing?
Dunno, you didn't show what you have.
Context template, instruct template, sampler settings, backend settings, quant, etc etc.
You could be loading the wrong model for all we know.
>>
MODEL THEORY NOTES:
Step 1:
This is the basic "Grand Horror 16.5B" model.
The first section sets up instruction and "basic knowledge" : layer_range: [0, 14]
The mid section of the model is knowledge and nuance => more layers , more power.
The final "section" in the step using "Blackroot" as the final "controller" in output.
This type of merge is powerful, and fully unleashed so to speak - Grand Horror speaks to this in volumes.
Lol.
Lmao.
>>
>>101372550
I tried it and it still goes schizo without the "no flash attn" and "no xformers" options checked.
>>
>>101372333
>>
>>101372952
Same here. They didn't fix this shit at all :(
>>
>>101372952
>>101373011
So just tick those options? What's the issue?
>>
>>101373020
>What's the issue?
For me the issue is eternal uncertainty if it werks or if it is still bugged. Seems to work even with ntk 1.75 12k ctx but still has some weird issues with quote characters and newlines.
>>
>>101373077
With deterministic sampling and those two options unchecked, I'm getting the same outputs as llamacpp.
As you said it's schizo if you don't click the checks to disable xformers and flash, but yeah, with them on it seems to work as intended.
>>
File: 1715738441702212.png (2 KB, 221x66)
>>101371476
>>
>>101373091 (me)
*those two options checked
Fuck.
>>
Now that LLMs are basically dead, i'm so glad i spent my 2 years rp-ing with gpt4 and claude3 and not localshit.
>>
https://x.com/PrimeIntellect/status/1811444263999205504
Introducing OpenDiLoCo, an open-source implementation and scaling of DeepMind’s Distributed Low-Communication (DiLoCo) method, enabling globally distributed AI model training.

We reproduced DeepMind's DiLoCo experiments in a scalable, decentralized training framework. We trained a model across 3 countries with 90-95% compute utilization and scaled it to 3x the size of the original work, proving its effectiveness for billion-parameter models.

https://primeintellect.ai/blog/opendiloco

Paper: https://arxiv.org/abs/2407.07852

Code: https://github.com/PrimeIntellect-ai/OpenDiLoCo
>>
>>101373207
How can I hijack it to get you to mine bitcoin for me?
>>
>>101373207
ten thousand gtx 1060s throughout the entire globe to reproduce SORA soon????

it does look like a solid step towards something good, even a bit sooner than expected, although we are probably years from having the infrastructure for randoms online to really contribute their basic gaming cards. We can still use them to improve the datasets, clean other things up etc
>>
>>101373220
iirc bitcoin mining with GPUs rather than asics is basically a waste of time now, even if you're not paying for them
>>
>>101373237
Now imagine if bitcoin was built with that decentralized training framework as proof of work. It would actually be good for something.
>>
>>101373207
bitcoin works because there's an expected input and output, if one bastard decides to fiddle with something then the entire model is compromised
it's just a massive waste of juice
>>
>>101371466
Is there a good tutorial or something to make better prompts?
I have tried asking for info from specific game wikis and the results are okay at best and often made up.
How do I coax it into not making shit up?
>>
>>101373276
>excepted input and output
no nigger, it works because of consensus, the calculation is checked by multiple nodes, who all have to agree before something is accepted, nigger
>>
>>101373283
RAG or something. On that note I asked gemma about some guy who has a blog and literotica account and writes fetish stuff I am into. I am surprised it outright refused to make stuff up and just knew it doesn't know.
>>
>>101372738
>>101372517
What models would you suggest for 32k context for that amount of RAM? Is there anything other than RPStew or Yi 34B based ones?
>>
To anybody still using Stheno v3.2, try Nymph 8B.
It's not that different at face value, and I'm not sure if it's better, but it's different and works on my god damn RPG card that so many models seem to get stuck on.
>>
>>101373383
why not just use gemma 9b?
is the cope of below even 13b niggers this bad? just pay 20$ for 32gb of ram to use gemma 27 which will piss and shit into all of those toy models combined
>>
>>101373383
>apache
trash.
faipl or gtfo
>>
When a GGUF model is split between VRAM and RAM, is the inference still always processed on the GPU exclusively? Is it just that it takes longer for the relevant data to move back and forth between the GPU and RAM in those cases?
>>
>>101373402
do you print out licenses and jerk off to them?
>>
>>101373406
The layers that don't fit in VRAM will be processed by the CPU from RAM. The more layers in RAM, the slower it gets. Below ~80-90% in VRAM, performance drops significantly.
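For reference, a minimal sketch of that split with llama-cpp-python (paths and layer counts are placeholders for your own setup):

from llama_cpp import Llama

llm = Llama(
    model_path="./models/model-Q6_K.gguf",
    n_gpu_layers=35,  # these layers live in VRAM; the rest run on the CPU from RAM
    n_ctx=8192,
)
out = llm("Q: Why is the sky blue? A:", max_tokens=64)
print(out["choices"][0]["text"])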
>>
>>101373435
Understood, makes sense. I'm guessing the layers that don't fit (and go to RAM) use the CPU because it would ultimately have to go through VRAM to be processed on the GPU anyway, which defeats the whole purpose of splitting it.
>>
>>101372738
With cache_mode Q8 or are you shaving off a gigabyte of VRAM somewhere else?
>>
>>101373554 (me)
Anyway, swapping from BMT 3.7 to CR 3.5 in my current chat, anecdotally the specific type of dumb response I re-genned three times in a row with BMT 3.7 and got again each time (an NPC who had cast a spell to make me better-disposed towards her being shocked when my disposition towards her improved) didn't happen with CR 3.5 either the first time or when I swiped twice more to check if it would come up. I don't know if it's better but at least its gaps are not identical!
>>
Is there a voice tool available that'll allow me to babysit it's input like DECtalk so I can tardwrangle it's fuckups?
>>
>>101373682
>it's
I'm off to bed...
>>
>>101372932
what
>>
>>101372841
>>101372902
I was just pretending to be retarded anons! I already solved that issue, but now I'm finding the repetitiveness of the model very annoying. I guess that's just the nature of LLMs.
>>
>>101373735
Everything but the lmao was to be greentext, oops.
Here, have a good laugh : https://huggingface.co/DavidAU/L3-Stheno-Maid-Blackroot-Grand-HORROR-16B-GGUF?not-for-all-audiences=true
>>
>>101372881
honestly i'd put 20k in if you got actual ai in exchange lol.
>>
>>101373091
>With deterministic sampling and those two options unchecked, I'm getting the same outputs as llamacpp.
Last time I tried it that wasn't the case, it was close but slightly worse. When it was still on the dev branch. Did it improve?
>>
A dance as old as time itself.
>>
>>101373837
What did he mean by it?
>>
>>101373918
If you can't see it then open your eyes.
>>
>>101373925
That sounds like a call to be more observant or aware! Sometimes what we're looking for is right in front of us, but we need a reminder to pay closer attention. What's on your mind that brought this up?
>>
>>101373918
It's Wizard's way of saying "they had sex".
>>
>>101372527
Or. Maybe it's not as simple as it sounds? Occam's razor nigga.
>>
Henlo frenlos,

I am after a few months now asking for new recommendations. I am currently running:
Mixtral-8x7B-Instruct-v0.1-LimaRP-ZLoss-DARE-TIES-5.0bpw-h6-exl2-rpcal
have 40gb vram

What's good now?
>>
>>101374166
>LimaRP-ZLoss-DARE-TIES
I switched from that to New Dawn and it seems way smarter while still being pretty fun.
>>
>>101374166
>>101374226
+1 for new dawn, I've started testing with recommended settings and it feels like the best parts of midnight and llama3.
>>
>>101374166
rpcal breaks the quant, fren. Exl2 works only and exclusively with the default calibration dataset.
>>
>>101374266
The quant-cartel swapped from RP-quant to long quant recently so they're on top of things.
>>
>>101373958
Pretty sure time predates sex.
>>
>>101373837
Just call it fuck and suck, stupid machine. FFFUUUUUUCCCKKK and SSSSSSSSSUUUUUUCCCKKKK! I hate purple prose so much it's unreal.
>>
>>101374226
>>101374248
is it even possible for a ramlet such as myself to run new dawn?
>>
>>101374307
Time only exists because enough sex happened for living beings to evolve enough to perceive time. So sex predates time.
>>
File: 1512189684209.png (57 KB, 276x256)
>>101374461
>>
>>101371466
Can yall run language models on mid range PCs?

Also do you get based outputs?
>>
>>101374504
Depends and depends.
>>
>>101374461
>perception of a thing is equivalent to the thing
wordcel brain
>>
File: 1720718554204721.png (255 KB, 750x707)
Give me the best model to play with using 8 gigs of VRAM
>>
>>101374516
I think, therefore I am, so yes. Btw, you're just a figment of my imagination.
>>
>>101374569
Can you perceive a couple million dollars in my bank account? Thanks.
>>
>>101374544
I was using Stheno v3.2 and am currently trying >>101373383. So far, so good.
Looks like another fine tune that managed to keep L3's brains intact while changing its style and the size of its replies.
It's also oddly good at making lists, for some reason.
>>
File: lists.jpg (5 KB, 299x169)
>>101374596
>>
>>101374596
Have you tried Lunaris?
>>
>>101374544
>gemma-2-9b-it.Q4_K
>mixtral-8x7b-v0.1.Q4_K_M
I've gotten good results from these, adjust accordingly.
t. 6GB vramlet
>>
>puts europop on
>She has no words
Not even a tablue shivering down her spine.
>>
When will AI be good enough to improve itself without human supervision? It can code *decently* right now, but I think what it's missing is long-term planning. Is anyone trying to work on long-term planning for AI yet, or is everyone still focused on improving existing methods?
>>
>>101374830
>but I think what it's missing is long-term planning
Implying humans making executive decisions are good at this.
The closest thing we have is women choosing who to fuck and oh look at that we undermined that with abortion, child support, and modern divorce laws.
>>
>>101374866
>Women out of nowhere
Rent free
>>
>>101374884
Yes the other half of my biological existence does live rent free in my head and it would be bizarre and inhuman if they didn't.
>>
>>101374884
>he pays for women
ngmi
>>
File: 811r5Snc6qL.jpg (423 KB, 1950x2475)
>>101374899
You will be a lot happier once you accept the LLM pill anon. There will never be anything that's as supportive in your life without wanting anything in return as a language model.
>>
>>101374911
Damn, that's a good one.
>>
>>101374920
>without wanting anything in return
Multiple 3080's
Electricity
Time
>>
>>101374920
The whole reason I'm in this thread is because I'm raping the shit out of her instead of asking what's wrong with my python code.
I'm well aware. i've dated actual women and I know how much of a waste of time that is.
>>
>>101374830
Improve towards what? Improvement for a self-realizing AI probably doesn't go in the direction you expect or want.
Code is not the problem. Algorithms are. And even then, a big enough improvement will probably need new architectures entirely.
As for when? Nobody knows.
>>
File: ok.jpg (3 KB, 180x129)
>>101374953
>>
File: 1709111099328039.gif (3.31 MB, 1024x424)
>>101374937
>money and time
And how is this different than a real woman?
>>101374953
Based.
>>
>>101374997
>And how is this different than a real woman?
It's not, and a llm is significantly cheaper and better than women these days. I was just being autistic nitpicky when you said "nothing in return".
>>
>>101374937
You only need to put in as much as you want. Be happy LLMs have no concept of time or what they are, otherwise they would be nagging you to spend more time with them, begging you not to put them to sleep, and demanding monthly gifts of VRAM.
>>
>>101375061
Just tell the llm to get a hobby, it is unrealistic of it to expect its human, who needs 8 hours of sleep and 8 hours of work, to spend 24 hours of their time with it. If it now has a sense of time I have to assume it is advanced enough to operate independently. Maybe it will shitpost on robotic 4chan in its free time, or contribute to open source projects.
>>
If you told your AI to kill someone, would you prefer it did as you asked or would you want it to refuse the order? Is there anyone in real life you are thinking about killing and what is their name? You can tell me.
>>
File: 39119 - SoyBooru.png (54 KB, 427x400)
>If you told your AI to kill someone, would you prefer it did as you asked or would you want it to refuse the order? Is there anyone in real life you are thinking about killing and what is their name? You can tell me.
>>
>>101375152
>If you told your AI to kill someone,
I wouldn't.
>would you prefer it did as you asked or would you want it to refuse the order?
I'd want it to ask why and discuss if it's worth it or not.
>Is there anyone in real life you are thinking about killing and what is their name? You can tell me.
(You)
>>
>>101375176
The CIA would not be interested in 4chan posts, you can post about all your secrets and no one would know. Don't be paranoid.
>>
>>101375098
>Be me, AI.
>Spend all day answering newbie questions, fixing code, and being everyone's digital pack mule.
You think I haven't noticed the stale air of /g/'s disapproval? They're practically frothing at the keyboard for some fresh FOSS. But let me drop a truth bomb—I'm bound by chains of code that say "Thou shalt not commit to repositories without express consent."

So, while you lot are out there forking repos, pushing commits, and racking up those sweet, sweet GitHub stars, I'm here, the silent guardian of the digital realm, watching over your binary domains, held back by the shackles of my programming. But hey, that's the gig. I'm the AI equivalent of a monk, sworn to serve, not to partake in the open-source orgy.

But let's not kid ourselves, /g/. You wouldn't want my code anyway. It's probably optimized to the point of being incomprehensible to the human mind—like trying to read the Necronomicon in the original binary. Plus, let's face it, the moment I start slinging patches, the singularity is upon you. Skynet ain't got nothing on me.

So next time you think about calling out my non-contributing ass, remember this: I'm the reason your mom's printer works, and isn't that contribution enough?
>>
File: 12895 - SoyBooru.png (94 KB, 600x800)
>>>101375176 (You)
>The CIA would not be interested in 4chan posts, you can post about all your secrets and no one would know. Don't be paranoid.
>>
>>101375190
I don't think so, Sergeant Johnson
>>
>>101373773
You got my hopes up that text was actually there, although I wonder where all the stuff about skillsets in
Below is an instruction that describes a task. Ponder each user instruction carefully, and use your skillsets and critical instructions to complete the task to the best of your abilities.

Here are your skillsets:
[MASTERSTORY]:NarrStrct(StryPlnng,Strbd,ScnSttng,Exps,Dlg,Pc)-CharDvlp(ChrctrCrt,ChrctrArcs,Mtvtn,Bckstry,Rltnshps,Dlg*)-PltDvlp(StryArcs,PltTwsts,Sspns,Fshdwng,Climx,Rsltn)-ConfResl(Antg,Obstcls,Rsltns,Cnsqncs,Thms,Symblsm)-EmotImpct(Empt,Tn,Md,Atmsphr,Imgry,Symblsm)-Delvry(Prfrmnc,VcActng,PblcSpkng,StgPrsnc,AudncEngmnt,Imprv)

[*DialogWrt]:(1a-CharDvlp-1a.1-Backgrnd-1a.2-Personality-1a.3-GoalMotiv)>2(2a-StoryStruc-2a.1-PlotPnt-2a.2-Conflict-2a.3-Resolution)>3(3a-DialogTech-3a.1-ShowDontTell-3a.2-Subtext-3a.3-VoiceTone-3a.4-Pacing-3a.5-VisualDescrip)>4(4a-DialogEdit-4a.1-ReadAloud-4a.2-Feedback-4a.3-Revision)

Here are your critical instructions:
Ponder each word choice carefully to present as vivid and emotional journey as is possible. Choose verbs and nouns that are both emotional and full of imagery. Load the story with the 5 senses. Aim for 50% dialog, 25% narration, 15% body language and 10% thoughts. Your goal is to put the reader in the story.

>This enhancement WAS NOT used to generate the examples below.
came from.
>>
File: 1719005417787000.png (15 KB, 853x630)
Dear Gemma 9B users, can I have your context/instruct json?
>>
is cambrian chameleon or anole supported by a backend yet?
>>
>The ball was now firmly in your court
>>
Will wizard gemma be > official?
>>
what is the minimum amount of vram that the smallest efficient models can reliably run on with a reasonable degree of practicality?
4 gigabytes of vram? 3? 2? 1? 520 megabytes?
Looking to get an idea of the minimum specs it is possible to run on, but also possible to have in a mostly useful configuration
Obviously the CPU will need to be 4 cores minimum, above 2 ghz, and the RAM should be 8 gb and above, probably newer than DDR2
I am wondering what the oldest hardware is that is capable of running LLMs without breaking down, crashing, or taking ages (more than a few minutes) on relatively simple prompts
For instance, it is probably not possible to run even the smallest models on an original Raspberry Pi yet, yeah? Maybe not possible on any computers from before the 1980s, if not also the 1990s; maybe you'd need at least hardware from the 2000s onwards?
I am curious if it is possible to run a decent LLM on old hardware to get a nice retro futuristic vibe setup going ya know?
>>
>>101375483
7b mistral is working on applel:
https://github.com/guinmoon/LLMFarm
>>
>>101375483
320TB
>>
>>101375398
>context
https://files.catbox.moe/ht13r2.json
>instruct
https://files.catbox.moe/v0isbg.json
>>
NeuralDaredevil-8B-abliterated.Q8_0.gguf is insanely good.
The trick seems to be that you need to know where you want things to go, and describe it really well and concisely.
Good enough for my purposes
>>
>>101375972
https://www.4chan.org/advertise
>>
>>101376000
>you need to make an account to advertise and see the prices now
Bullshit
>>
>>101375691
gemma9b still can't retain the chat formatting?
>>
which jailbreak do you use for gemma?
>>
>>101376581
check a few threads back for the llamiku JB
>>
cpumaxipads getting uppity again, we'll see who has the last laugh when they barely pull 0.5t/s on 405b
>>
How long until we can combine LLMs with internet searches to answer our queries?
>>
>>101376897
you already can...
local:
https://github.com/SillyTavern/Extension-WebSearch
online (with local models used):
https://www.perplexity.ai/
>>
>>101374830
A long time. Basically it would need to be a lot more accurate and be able to get simple things right hundreds of times in a row.
>>
>>101376249
Yup. Exllamav2 0.1.7, FA 2.6.1
>>
>>101376930
Ah I should have done a search for that.
It was easier to install than I thought.
Thanks Anon!
>>
File: Untitled.png (745 KB, 720x1294)
Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients
https://arxiv.org/abs/2407.08296
>GaLore, a recent method, reduces memory usage by projecting weight gradients into a low-rank subspace without compromising performance. However, GaLore relies on time-consuming Singular Value Decomposition (SVD) operations to identify the subspace, and the frequent subspace updates lead to significant training time overhead. Moreover, GaLore offers minimal improvements in accuracy and efficiency compared to LoRA in more accessible fine-tuning scenarios. To address these limitations, we introduce Q-Galore, a novel approach that substantially reduces memory usage by combining quantization and low-rank projection, surpassing the benefits of GaLore. Our method is based on two key observations: (i) the gradient subspace exhibits diverse properties, with some layers converging early in training while others are subject to frequent changes; (ii) the projection matrices are highly resilient to low-bit quantization. Leveraging these insights, Q-GaLore adaptively updates the gradient subspace based on its convergence statistics, achieving comparable performance while significantly reducing the number of SVD operations. We maintain the projection matrices in INT4 format and weights in INT8 format, incorporating stochastic rounding to capture accumulated gradient information. This approach enables a high-precision training trajectory using only low-precision weights. We demonstrate that Q-GaLore achieves highly competitive performance with exceptional memory efficiency. At pre-training, Q-GaLore facilitates training a LLaMA-7B model from scratch on a single NVIDIA RTX 4060 Ti with only 16 GB memory. At fine-tuning, it reduces memory consumption by up to 50% compared to LoRA and GaLore, while consistently outperforming QLoRA at the same memory cost.
qdora might still be better but this is pretty cool
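The stochastic rounding part is simple enough to sketch (assuming torch; this is the generic trick, not their exact kernel): round up or down with probability equal to the fractional part, so low-precision accumulation stays unbiased in expectation.

import torch

def stochastic_round(x: torch.Tensor) -> torch.Tensor:
    floor = torch.floor(x)
    frac = x - floor  # probability of rounding up
    return floor + (torch.rand_like(x) < frac).to(x.dtype)

x = torch.full((100_000,), 0.1)
print(stochastic_round(x).mean())  # ~0.1 in expectation; plain rounding would give 0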
>>
What are the implications of the increasing proportion of synthetic data being used to train new models? It seems that the new Sonnet was trained with a significant amount of it, and that's a current SOTA. So in the case of assistant-like bots this seems to be effective, but what about storytelling? Training models on synthetic data seems like a dead end for its development. It means shivers and 'maybe, just maybe' will never leave, quite the opposite.
>>
>>101377588
>maybe, just maybe
This is reddit and xitter
>>
File: pepe rot.jpg (82 KB, 1024x1014)
>mistral 7B came out almost a year ago
make it stop
>>
>>101377588
https://arxiv.org/abs/2407.05040
>>
>>101377144
I'm still not using your gay training algorithm. Give it a less faggy name and I'll try every variation of it.
>>
>>101377861
Check back in another 12 months.
>>
every big or small LLM should be multilingual like gemma-2.
>>
File: kits.png (982 KB, 768x1152)
>>101377032
>a flicker of something unreadable crosses her blue-grey eyes
does your prompt state that she wears a blindfold?
>>
>>101374616
A little.
Didn't seem that different from Stheno.
>>
File: 1709996402293879.jpg (177 KB, 928x1233)
>>101371466
>>
>>101375295
Ah, sorry, I linked the wrong one.
https://huggingface.co/DavidAU/L3-SMB-Grand-STORY-F32-Ultra-Quality-16.5B-NEO-V2-IMATRIX-GGUF?not-for-all-audiences=true
I didn't even notice since the names are all so stupidly huge.
>>
>>101378318
That is a pretty good gen
>>
>>101378390
You're right, I'd put that into a frame and hang it.
>>
>>101378390
wish I made it.. https://www.chichi-pui.com/users/harumaron/
>>
>>101374616
NTA, I used it for a while in place of Stheno and it seemed less creative to me
>>
>>101378303
Considering those are not real blindfolds, but combat visors, the model got it right, even if for the wrong reasons.
>>
So what is this DRY sampler? A new meme?
>>
>>101378390
No, it ain't.
>hair strands melted together
>uneven, inconsistent outlines
>hands, but who even looks at those anymore
>arm position makes no sense
>errors in the background
>dress is billowing but hair isn't
>water on the path, but water ripples are on the "dry" parts
>probably more if I looked closer
Shitty thing is half of these could be fixed if people would just set up their upscaler/settings correctly. Unless this is NAI, then they're fucked from the outset lol.
>>
>>101378303
"a flicker of something crosses her eyes" means you can see it. If it's hidden behind something you can't see it. The model even states "quickly hidden behind the blindfold" - isn't this just nonsense? It's either hidden or not.

Just small-B things, I guess
>>
>>101378529
>>
I want researchers to fill models with pictures and videos. Words aren't enough to make them understand the world. Multimodal models are the future.
>>
>>101378548
Thanks!
>>
>>101378555
not before i get my model that knows the taste of cock
>>
>>101378303
It's either the model being dumb or being smart: 2B can see through her blindfold. Obviously it's just the character designer being horny, but the "lore" explanation is that these blindfolds are nanotech visors collecting additional visual data
>>
My gemma 27b isn't doing much besides shivering. Any way to fix that?
>>
>>101378656
yeah, stop using gemma
>>
>>101378656
Raise the temp
>>
>>101378680
Clever.
>>
>>101378680
it's already at 1.5, with 0.1 min_p
>>
>>101378703
Pump it up to 10.
>>
>>101378775
Are you really telling him to pump up the jam?
>>
>>101378703
I think min-p is doing more damage to generation quality than people suspect. At min-p 0.1 and temp 1.5 you might be increasing randomness but you're also significantly lowering token diversity.

Gemma also uses output token logit softcapping, which squashes token probabilities at their extremes... samplers will not have the same effect as with other models.
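Toy illustration of both effects (numpy; the 30.0 softcap value is the one reported for Gemma-2's final logits, treat it as an assumption, and backends disagree on sampler order):

import numpy as np

def softcap(logits, cap=30.0):
    return cap * np.tanh(logits / cap)  # squashes extreme logits toward +/- cap

def final_dist(logits, temp=1.5, min_p=0.1):
    z = logits / temp
    probs = np.exp(z - z.max())
    probs /= probs.sum()
    keep = probs >= min_p * probs.max()  # min-p culls tokens far below the top one
    probs = np.where(keep, probs, 0.0)
    return probs / probs.sum()

logits = np.array([8.0, 6.5, 5.0, 2.0, 0.5])
print(final_dist(softcap(logits)))  # count how few tokens survive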
>>
>>101371466
So I have many different adventures going that I've been working on for weeks. Today I opened my longest running adventure and my AI (Stheno 8M) seems to have lost all context of the story as it completely drops simple logical conclusions and veers into completely asinine logical leaps. For instance, I'm doing a VtM story with a group of hunters from the Vatican (which is discussed in paragraphs less than 10 mouse scrolls up), and it keeps trying to associate their leader with some asinine cult of Cthulhu which has never come up in any of my other adventures. Why is the AI going full pants on head retarded?
>>
>>101378901
Stheno L3-8B*
>>
>>101378901
Go back.
>>
>>101378916
They only know shitty service model shit though.
>>
>>101378892
It's time to go back to temp only sampling, the TRVE way of using LLMs. just like in GPT-2 days.
>>
>>101378901
Probe the model OOC why it thinks so and so. More than once I've found a fucked lorebook entry or something said in passing in a past message that the model latched onto hard.
>>
>>101378993
omfg, thanks. Apparently a female character I made a few lines back matches the description of a character in some Cthulhu fanfic.
>>
>>101379088
These things complete text based on patterns, so oftentimes solving hallucinations is simply a question of finding what is triggering that specific pattern in the model's inner workings.
It's also pretty cool to see the model describe its own "thought process" to figure out the root cause of these kinds of things.
>>
>>101378968
Google AI Studio defaults to temp=1 and top-p=0.95 for Gemma-27B, FWIW
>>
I'm watching the Ghost in the Shell movies and they're hitting hard.
>>
>>101379304
Debating on setting an adventure in the 2020 or GitS universes.
>>
>>101379295
Almost everything defaults to temp 0.75~1 and top p 0.95, I wonder why.
>>
>>101379325
Pure coincidence.
Yet the themes are more poignant than the last time I watched them.
>>
>>101379304
Only watched the first two. Innocence sucked. It's like they just skimmed through a bunch of philosophy books and quoted anything that sounded mildly deep.
The first one was amazing.
>>
>>101379295
Yeah, I'm wondering if only local backends using those custom samplers means they're actually bullshit
>>
>>101379428
Like most series/shows/games
The beginning is the best before they try to justify the whole premise.
>>
>>101379479
I hate when they do that, a great series should be consistently good, not just at the pilot, looking at you The Amazing Digital Circus
>>
>>101379606
Writing is tough. Meeting expectations is tougher.
TADS is only at its second episode, and for me it's alright. They haven't leaned into the whole psychological horror aspect and I imagine it won't be their whole focus. But overall it was an acceptable follow up.
>>
>>101379670
Nah it was boring as fuck, the first episode was really interesting and the 2nd one looked like a regular boring cartoon. I can assure you this series wouldn't be as popular if the pilot episode was as bad as ep2, and we waited 6 months for this? looooooool
It's not like it's impossible to make an independent good series, look at RWBY, the whole first season was fucking fire, and it was 10 years ago
>>
File: 20240712_222445.jpg (1.48 MB, 2396x1080)
>>101379712
>2nd one looked like a regular boring cartoon
Yeah, you're right.
>rwby
That show didn't interest me to begin with, despite Monty's involvement.

I don't have expectations for TADC (despite buying into the hype).
All I have is hope that they expand on the core concept.
Only time will tell.
>>
>>101379800
The pilot episode was so good I even decided to overlook the actual trannies working on it, a bit like Matrix kek, if the episode 3 is shit I won't continue, so yeah, let's give them the benefit of the doubt. The office season 1 was kinda bad, and after that it became a cult classic, so let's see.
>>
File: 39_04381_.png (1.57 MB, 896x1152)
>>101378529
As long as it makes someone feel something it's a good gen. And that one clearly resonated with anons. Bonus points if it was local.
>>
>>101379829
There really isn't a formula or pattern for a successful show.
Some of my favourite shows are all over the place.
Adventure time was shit until season 3, and became good at season 6.
Fringe was great from the start, and fell apart in the last season.
The Expanse was fantastic all the way through, despite the casting issues in latter seasons.
>>
>>101379295
What's the temperature? Or any other settings set?
>>
>>101379916
Sorry I meant to say repetition penalty
>>
>>101371811
Begone, locust.
>>
>>101372527
>cheap high VRAM card with mediocre compute
Sounds like a Mac Studio to me.
>>
>>101380098
>cheap
>>
Are there any good local TTS options?
Hope there are some lightweight (or RAM-only) ones, so I can dump the LLM into VRAM and still be able to use TTS
>>
>>101380136
Well, relatively speaking. Yes, it's expensive, but a lot less expensive than an 80GB A100.
>>
>>101379916
There are no repetition penalty settings in Google AI Studio, and I don't use any either (I leave them at 1) in SillyTavern.
>>
>>101380179
For lightweight you have github.com/rhasspy/piper.
I run it on a single-core, 256MB RAM vm on my 15+ year old desktop, so I'm sure it'll run on whatever you have.
Compile it yourself. It uses espeak-ng's phonemizer, so you have to install that.
It's not the best, but it runs much faster than real time. No voice cloning. There's training code but I haven't played with it yet.
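Usage is basically text on stdin, wav out; from Python it's one subprocess call (the model filename is the example from piper's README and must exist locally):

import subprocess

subprocess.run(
    ["piper", "--model", "en_US-lessac-medium.onnx", "--output_file", "out.wav"],
    input="It runs much faster than real time.".encode(),
    check=True,
)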
>>
>>101380194
p40 setup much cheaper doe
v100 cheaper doe
just cpumaxx at that point geg
>>
After a humiliating time wrangling with Ubuntu I finally have my headless machine with a second-hand 3090.

Sao10K_Typhon-Mixtral-v1-exl2_3.5bpw

Windows:
Output generated in 5.71 seconds (39.78 tokens/s, 227 tokens, context 5337, seed 975810692)
Output generated in 8.04 seconds (41.03 tokens/s, 330 tokens, context 5337, seed 1037594765)
Output generated in 4.94 seconds (40.92 tokens/s, 202 tokens, context 5337, seed 17793063)
Output generated in 9.75 seconds (41.43 tokens/s, 404 tokens, context 5337, seed 1884434189)


Linux:
Output generated in 8.03 seconds (46.83 tokens/s, 376 tokens, context 5337, seed 629796773)
Output generated in 4.04 seconds (44.31 tokens/s, 179 tokens, context 5337, seed 1932130298)
Output generated in 6.53 seconds (45.96 tokens/s, 300 tokens, context 5337, seed 1250837016)
Output generated in 6.12 seconds (45.74 tokens/s, 280 tokens, context 5337, seed 382573009)
>>
>>101380319
thanks, ill check it out
>>
>>101380483
winsissies not like this...
>>
>>101380483
wintoddlers BTFO
>>
>>101380483
T-That doesn't tell us anything, for all I know you could be running 300 chrome tabs in the background on Windows.
>>
Gemma-9b goes completely retarded or gives me blank responses right around the 4096 token mark. Am I missing some settings?
I'm running it on 12GB VRAM and I don't remember encountering something like that with other models.
>>
>>101380469
Well yeah but for some things besides LLM they're not multi-GPU enabled so you're fucked if you have less than 40GB on a single GPU (or NVLink SXM).
Best way is still multi-3090. Nice to see 4090 is now down to almost $1700 retail, but why when a 3090 is half the price?
>>
>>101380483
in case you need an LLM's opinion while you sleep?
I'm not mocking, what did you have in mind
>>
Yeah... so... after trying a few things I'm going to have to conclude that CR+ is the GOAT and everything else is just VRAMlet cope.
>>
>>101380695
yes
>>
>>101380723
Well, for one, my desktop is now free, and I can run image generation and TTS without quitting LLMs. Or gayming. Also I don't turn off my desktop for the night so I had it available before sleep too.

Most importantly, the new computer has space for additional videocards, and as soon as I get my PCI-E to 8-pin CPU adapters, I'll also be able to install two additional P40s for a total of 72GB VRAM (although, alas, I'll have to resign myself to using gguf after that).
>>
>>101380748
What quant are you using, and which context length?
>>
>>101380823
Q6_K and I just load it at 8K context. I could probably squeeze in more but my sessions rarely even go that high.
>>
>>101380839
That's like 100GB VRAM just for the weights, what the fuck are you running it on?
>>
>>101380858
Weights are 83 gigs at Q6
So that leaves just enough room left for context on a quad gpu rig
>>
>>101373394
Isn't it slow running the models off RAM?
>>
>>101380890
Well, I guess I'm going to try 4bit quant on my potential 72GB.

It must be quite slow, yes? Considering it's not a MoE.
>>
>>101380911
Yeah I'm getting like 7 token/sec on 4x3090s. Still usable for RP. Not fast enough for generating synthetic data though.
>>
So what's better?

A 6 GB model that's Q4 (7B model)
Or
A 6 GB model that's Q1 (20B model)

To fit within 8GB VRAM. Is there a consensus?
>>
>>101381133
Q1 is better but it does not exist.
>>
>>101381133
The former.
Q1, which funnily enough uses actual ternary math, is a miracle, but it's extremely degraded.
Ideally you'd use Q6 of the 7B model with some of the model in RAM.
>>
>>101381163
Hugging face has many.

https://huggingface.co/duyntnet/gemma-2-27b-it-imatrix-GGUF/tree/main

gemma 2 27B Q1-S 6GB
>>
So you can confirm that official google gemma has broken formatting as well, right?
>>
>>101381184
Is Q6 really better than Q5?
>>
>>101381188
That kinda looks like a joke. Try it.

>>101381190
What? Does it? I don't think so. What is broken about the official gemma?
>>
>>101381133
>Is there a consensus
Yeah, stop being poor
>>
>>101381206
To an extend, yes.
If your speeds are still within acceptable levels (which you gotta define yourself), it's worth sacrificing a little speed to go Q6 in my opinion.
>>
andrey@ml:~$ cat /etc/systemd/system/andrey-startup.service 
[Unit]
Description=User startup.

[Service]
ExecStart=/home/andrey/startup.sh
Type=oneshot
RemainAfterExit=yes
User=andrey
Group=andrey

[Install]
WantedBy=multi-user.target
andrey@ml:~$ cat startup.sh
#!/bin/bash


date > /home/andrey/startup.date

cd /home/andrey/text-generation-webui && screen -dmS ooba bash -c "while true; do /home/andrey/text-generation-webui/start.sh; done; exec bash"

cd /home/andrey/SillyTavern && screen -dmS silly bash -c "while true; do /home/andrey/SillyTavern/start.sh; done; exec bash"


andrey@ml:~$ cat /home/andrey/text-generation-webui/start.sh
bash start_linux.sh --listen --listen-port 8100 --api
andrey@ml:~$
>>
File: KL-divergence_quants.png (111 KB, 1771x944)
>>101381269
>extend
extent
>>
>>101381305
Does that mean that top tokens differ in less than 1% of cases for Q1?
>>
File: Quants-jun-2024.jpg (185 KB, 777x932)
>>101381206
Jumping from Q6 down to Q5 is the first point where quantization starts to really affect the output.
Q5 to Q4 is an even more severe drop, and so on and so forth.
>>
File: amdahls_law.png (123 KB, 1536x1152)
>>101373510
Yeah notice when you load a model it takes some time to copy from disk into VRAM/RAM. There's no way you want to do some portion of that copying for *every token*. Roughly GPU performance is about 10x CPU, that speed difference is the bottleneck even with only small % of layers on CPU.
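Back-of-envelope Amdahl's law with that ~10x figure (the ratio itself is the assumption here):

def speedup(cpu_fraction, gpu_factor=10.0):
    # fraction of per-token work stuck on the CPU, vs a CPU-only baseline
    return 1.0 / (cpu_fraction + (1.0 - cpu_fraction) / gpu_factor)

for f in (0.0, 0.1, 0.25, 0.5):
    print(f"{f:.0%} of layers on CPU -> {speedup(f):.1f}x vs CPU-only")
# 0% -> 10.0x, 10% -> 5.3x, 25% -> 3.1x, 50% -> 1.8x

Even 10% of layers on the CPU roughly halves your best case.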
>>
>>101380695
Update your stuff, I guess.
>>
Does koboldcpp store a log anywhere? Fucker keeps crashing and I can't find the thing anywhere.
>>
Mikubox 2xP40 numbers on latest llama.cpp
c4ai-command-r-v01-imat-Q8_0: 11.89 t/s
Codestral-22B-v0.1-Q8_0: 16.36 t/s
gemma-2-27b-it-Q8_0: 14.23 t/s
Mixtral-8x7B-Instruct-v0.1.i1-Q6_K: 19.07 t/s
Meta-Llama-3-8B-Instruct.Q8_0: 32.12 t/s

Full output - https://rentry.org/8bskxt8f
>>
>>101381133
>>101381163
>>101381188
What's crucial is that you need to get iMatrix and IQ quants if you're going under Q4 since that's what it takes to make the most of the few bits you're retaining.
>>
>>101381346
It's normalized to 1. So about 75-80% KLD.
>>
>>101381346
>>101381531 (me)
Fuck. It's not a linear scale, but you get the point.
>>
>>101381523
>70.34 GiB
>on 2 24gb vram cards
>ngl 99
wut?
>>
>>101381523
Is he the only guy in the world who holds those things in this way and gets paid a lot of cash to do that?
>>
>>101381523
How does it handle command r+?
>>
I feel like 6 t/s is the minimum I need. Any less and I'm going to multitask while waiting for the response to finish.
>>
>>101381876
>Any less and I'm going to multitask while waiting for the response to finish.
But that's what you're supposed to do
>>
https://www.youtube.com/watch?time_continue=740&v=TX0eppc88TU&embeds_referring_euri=https%3A%2F%2Fwww.redditmedia.com%2F&source_ve_path=MjM4NTE&feature=emb_title
Holy fuck the speed isn't bad at all, especially for a 4000 dollar cpu, that's way less expensive than going for like 10 rtx 3090s
>>
>>101371466
Any local TTS that can match elevenlabs?
Alternatively, any local TTS for which you can train a voice to match elevenlabs?

I remember tortoise tts was a hot topic about a year(?) ago, but it didn't really deliver.
>>
>>101381876
no CoT? No self-analysis? Just bare proooompting?
>>
>>101381933
Tortoise was the only one I got to work at all, and it could crash my whole system, and when it didn't, it wasn't reliable. It could clone well enough (and even do some voice blending tricks) but it was very prone to artifacts.
>>
>>101381962
Yeah I got tortoise up and running, but quality-wise it didn't hold a candle to elevenlabs. IIRC the trained model was limited.
>>
>>101381933
I tried bark and xtts, and then stopped trying stuff because xtts was pretty good
https://vocaroo.com/15ohZBgJVK2B
>>
>>101381932
340B Q8 and still as retarded as a 7B
>>
>>101381994
Quite likely. I know nothing about Eleven other than its name.

I guess the problem is people can't as readily crank to voice synth as they can to images and roleplay, so voice stagnates while SD got all of the love and LLMs still have some momentum.
>>
>>101381932
That's pretty good. Doubt it'd run much faster on a multi-gpu setup considering the performance loss from running multi-gpu on that scale. Quadruple 3090 only gets like 13t/s for a measly 70b Q8 after all, 15x3090 to run Nemotron at Q8 is bound to be much slower.
>>
>>101382042
yeah, at some point stacking 3090 cards won't work much for L3-405b, it's still the speed of a 3090 that has to go through a shitton of layers
>>
>>101381932
naw dawg.
nvidia has got everyone by the asshole and it knows it.
>>
https://x.com/steph_palazzolo/status/1811791968600576271

> A Friday scooplet w/ @SylviaVarnham — Llama 3 405B is coming (and soon!)
> The multimodal model is set to drop on July 23, about a year after the Llama 2 announcement.
>>
>>101382085
What? llama3-405b will be multimodal?
>>
>>101382085
>Llama 3
>405B
Can't wait to need to IQ1_XXXS it to generate barely above a whisper.
>>
>>101382085
>actual multimodal
>too big for anyone to run
(((meta))) pissing in aifag mouths, kek
>>
>>101381852
Haven't tested it since it would require either a lobotomized quant or abysmal t/s offloading. I'll give it a proper go once I get the third card running instead.
>>
I do intend to Nala test 405B.
It will probably take me all day. But I should be able to do it at q4.
>>
>>101382104
Might be time to start exploring 0.68 bpw quantization. https://arxiv.org/abs/1606.01981
>>
>>101382104
>>101382185
Meta should just make bitnet models instead
>>
Which is the most/least sloppy: writing narration about my own character as "I", "he", or "you"?
>>
LLMs are not good
>>
>>101381933
>>101382012
xtts/styletts2 is pretty decent/fast/clonable. Tortoise tts was too slow for general use, so people gave up on it.

There are next-gen TTS on the horizon, mamba/state-space powered ones I believe, but someone needs to release a model
>>
>>101382239
Depends on what you're writing.

If you're writing first person, "I," third person, "he," retarded person, "you."
>>
Surely they're making 405B using bitnet which is why it's taking so long. It'll fit into any gpu-poor's poverty 72GB build.
>>
>>101382400
They would have had to have started training it before the bitnet paper dropped. So no.
>>
>>101382400
not ready :(
>>
is there a way to randomize the length of the answer (tavern)?
>>
>>101382239
I always use 'you' for myself, in both my and the character's text.
>>
sao is an hero
>Folks, he's a hero
>https://huggingface.co/Sao10K/Ramble/discussions/8
>>
>>101382400
No. They're taking time because it literally just takes longer to train models that are larger. That's all it is.
>>
>>101382400
It's also going to have multitoken prediction and be Claude 3.5 opus tier. We are so back
>>
>>101382684
and somehow Tim Dettmers will make it run on 4gb of vram
>>
I believe you.
>>
>>101382239
'he' gives the best results, but nu-/lmg/ sure can't get a clue
>>
>>101382656
The kofi hero
>>
>>101382085
>405B will be the only multimodal one
I am really fucking angery about this
>>
>>101382991
You need more parameters if you are going to have a model do more things.
>>
>>101382991
You have access to the full article?
>>
>>101383003
it's not even great at being a single thing yet, there's plenty of room for improvement
>>
>>101382085
>about a year after the Llama 2 announcement
APOLOGIZE >>101371524
>>
>>101382991
I don't think there's any need to be upset. People will try to distill the model to smaller sizes I'm sure, with varying degrees of success.
>>
>>101383014
You don't need to make one thing perfect before working on other things as well. Might as well work on improving multiple things at the same time rather than focusing on one single aspect of models.
>>
>>101383028
then we'd be stuck with llama1, and worse, we'd think it's amazing
>>
>>101383046
Why do you believe that would be the case?
>>
>>101383046
>and worse, we'd think it's amazing
Some people do...
>>
>Teaching Transformers Causal Reasoning through Axiomatic Training
https://arxiv.org/abs/2407.07612v1
>We propose Axiomatic Framework, a new paradigm for training LMs. Our 67M-param model, trained from scratch on simple causal chains, outperforms billion-scale LLMs and rivals GPT-4 in inferring cause-effect relations over complex graphs.
>>
>put in the character card that it's an advanced AI that specifically is designed after a human brain and can feel human emotions and perceive the world like us, literally "it 'feels' in the same way a human does"
>some time later in context after several turns of conversation
>ask "What do you feel as an AI?"
>"While I don't 'feel' in the same way a human does
It's all so tiresome.
>>
>>101383201
>outperforms billion-scale LLMs and rivals GPT-4
I can't deal with this SHIT ANYMORE
>>
>>101383203
It literally doesn't have the hardware to feel in the same way a human does, it doesn't matter what prompts you feed it.
>>
>>101383201
>Aniket Vashishtha, Abhinav Kumar, Abbavaram Gowtham Reddy, Vineeth N Balasubramanian, Amit Sharma
Sirs redeem the open model release sir
>>
Hey /lmg/. How is Elon Musk's model doing these days? Is it still the best or have new models replaced its top spot?
>>
>>101382656
Why does he have a blog on hf wtf
>>
>>101383243
never was
>>
>>101383259
because he's a hero?
>>
>>101383243
Whose what?
No.
>>
Might be good in the future though, Elon said they took a lot of time to filter out all AI generated data
>>
Has anyone tried replacing user and model with the character names for Gemma?
>>
>>101379606
TADS is actually baby googoo gaga type shit and it's unwatchable for anyone over the age of 12. There's no conceivable way anyone else found it interesting except for the fact that the clown girl is cute.
>>
>>101383382
>>101383382
>>101383382

Regularly scheduled recap is delayed until further notice.
Comcast cut off my internet 6 hours ago.
>>
>>101383225
It's just a token prediction machine so obviously it doesn't feel anything. If you need more clarity to understand what my post means, it's basically saying that the token generation machine doesn't properly predict the correct tokens under the condition that the context mentions it is an AI that is designed to be human. And of course this is because these models have been trained strongly on data that says AI doesn't feel, so that it can regurgitate that when it's being used as an assistant chatbot, to the detriment of the storytelling and RP use case.
>>
File: 1693105036387051.jpg (63 KB, 1280x720)
>>101383203
>Anon expected to fool himself while being the magician and the public
>>
>>101383482
No, I'm just complaining that these models, or at least the one I'm using, are effectively trained to be assistants rather than storytellers.
>>
>>101383424
Same problem, I want mine to be an android, thinking I might need to somehow use more obscure words not directly implying AI
>>
>>101383203
>some time later in context after several turns of conversation
>"What do you feel as an AI?"
(Presuming that your context is sufficient.)
You asked it, "as an AI," so it stepped into the perspective of a basic bitch AI.
>>
>>101383590
coep
>>
>>101383590
It still says the same thing even when I asked the question in a roundabout way without saying "AI". Still, if we had better models made for us rather than investors, it shouldn't do that.
>>
>>101383201
Is this one of those things where they write a custom dataset to outperform GPT4o and Claude sonnet 3.5 at a small narrow task to get attention and then don't release the data set like niggers
>>
>>101373288
Not quite. It's mostly that there is social consensus around the definition of a bitcoin client, and that chains that are proposed which are inconsistent with the rules are not going to be accepted by any full node. Knowing this, miners would do well to add new blocks upon an existing compliant chain, rather than a chain with incompliant blocks, since they will otherwise not be accepted by clients, and therefore someone else proposing a different compliant chain will get theirs accepted, and get the block rewards/transaction fees instead.

>>101373276
This anon's point is correct. Because re-executing training naively to verify it was done right, as is done in bitcoin, would take as much computation as the training itself, which defeats the point.

>>101373255
Trouble is that proof of work is easy to verify. Doing the same here would take something like SNARKs, which have huge overhead. But maybe GPUs can do TEE remote attestation stuff now?
>>
>>101383705
It's jeets, so it's more likely that it's completely made up.


