/g/ - Technology


File: kedaruimiqu.png (1.33 MB, 1200x848)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>102258941 & >>102249472

►News
>(09/06) DeepSeek-V2.5 released, combines Chat and Instruct: https://hf.co/deepseek-ai/DeepSeek-V2.5
>(09/05) FluxMusic: Text-to-Music Generation with Rectified Flow Transformer: https://github.com/feizc/fluxmusic
>(09/04) Yi-Coder: 1.5B & 9B with 128K context and 52 programming languages: https://hf.co/blog/lorinma/yi-coder
>(09/04) OLMoE 7x1B fully open source model release: https://hf.co/allenai/OLMoE-1B-7B-0924-Instruct
>(08/30) Command models get an August refresh: https://docs.cohere.com/changelog/command-gets-refreshed

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Programming: https://hf.co/spaces/mike-ravkine/can-ai-code-results

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
File: img_14.jpg (301 KB, 1360x768)
►Recent Highlights from the Previous Thread: >>102258941

--Use 8-30b model for better speed with 24GB VRAM: >>102264082 >>102264133 >>102264281 >>102264342
--Running Mistral Large Q5 at 64k context with DDR5 RAM and GPU considerations: >>102265363 >>102265409 >>102266475 >>102266507 >>102269376 >>102269423 >>102269534 >>102270198 >>102267903 >>102267928 >>102267943 >>102267935
--Running LLM models on Optiplex 7070 micro PC: >>102268093 >>102268117 >>102268139 >>102269167
--Dual CPU setups for CPU inference have drawbacks, but can run large models at usable speeds: >>102265415 >>102265431 >>102265492 >>102265575 >>102265624 >>102265719 >>102265840 >>102266056 >>102268854 >>102269309 >>102265596 >>102265798
--Comparing AI model performance across different benchmarks: >>102266882 >>102267012 >>102267041 >>102267003
--Building a narrative-game environment with AI, concerns about positivity bias, and impressive storytelling: >>102268260 >>102268304 >>102268346 >>102268416
--Botnet training discussion: >>102268010 >>102268037 >>102268074
--Silly Tavern message sound setting for ding notification: >>102264570 >>102264597 >>102264610
--Silly Tavern extension compared to anon's Director project: >>102267600 >>102267788 >>102267833 >>102267852
--Recommendations for adventure/rpg cards to use with LLMs: >>102259012 >>102259080
--Recapbot test using deepseek 2.5 at bf16 - performed well but had some issues: >>102265886
--NTFS issues on Linux and potential solutions: >>102268024 >>102269150 >>102269190 >>102269512 >>102270384
--Reflection fixed on openrouter, fails strawberry test: >>102267880 >>102267890 >>102268078 >>102268100
--70B 4bit model performance discussion: >>102266391 >>102266414 >>102266443 >>102266504
--Miku (free space): >>102258962 >>102260482 >>102260535 >>102260584 >>102261393 >>102267804 >>102269059 >>102269219

►Recent Highlight Posts from the Previous Thread: >>102258947
>>
>>102272046
You seem to erroneously believe there will be a progression here. The transformer model has reached its ceiling. We're not going to see anywhere near the speed of growth we've seen until now. This is it.
>>102271982
I like Dolphin too. Check out Mini Magnum (based on Mistral Nemo).
>>
>>102267880
The strawberry test is meaningless because it depends on tokenization.
>>
File: attention-thanks.jpg (94 KB, 735x803)
Please excuse my dumb question. Are there any niche advantages to 12b Nemo over Mistral Large 70b, sans speed?
>>
>>102272102
70 B's should make it more intelligent. Although Nemo really outperforms its class.
>>
>>102272095
all of those tests that trick the models are kind of overrated. yes, you can trick a model that's predicting the next thing to say by throwing lots of misleading data in before it
>>
>>102272102
Mistral Large is 123b. Some say Nemo is more creative, but it's not worth the trade-off in intelligence.
>>
>>102272154
>>102272102
even the most retarded quanted version of large possible is massively better than Nemo, for every use case
>>
>>102272116
>>102272170
>>102272154
Apologies for my mistake. Thank you.
>>
>>102272044
Isn't Mistral Large censored, though? Is there any good large LLM that isn't?
>>
>>102268122
>Who are you quoting?
You, you stupid autistic mother fucker. Explain what you fucking mean by "tokenization fixed" instead of spewing a word salad like some retard and pretending you're some fucking Einstein.
>>
>>102272265
mistral is the least censored of the big models, llama is very censored. You can do ERP just fine with mistral large
>>
>>102272041
any new breakthroughs for VRAMlets (12gb)? or should I stick with miniMagnum 2
>>
What's a good way of gauging what size model will run (acceptably) on a given spec? I have an okay computer (32GB RAM, 2GB VRAM 3080), but I'm not shelling out for a dedicated server to handle it.
>>
File: 59 Days Until November 5.png (1.63 MB, 880x1176)
>>
>>102272349
Depends on context size. Generally speaking you want the model to fit inside your VRAM with room to spare.
>>
>>102272349
>>102272396
Adding to what this anon said: with ideal settings, a model about twice your VRAM runs at roughly the slowest speed I will tolerate.
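As a back-of-the-envelope version of the rule these anons describe: weights take roughly params × bits-per-weight, plus the KV cache for your context. A sketch, where `fits_in_vram` and every number below are my own illustrative assumptions (fp16 cache, no GQA), not engine-exact figures:

```python
def fits_in_vram(params_b, bpw, ctx, vram_gb, n_layers, kv_dim):
    """Rough check that quantized weights + KV cache fit in VRAM.

    params_b: parameters in billions
    bpw:      bits per weight of the quant (e.g. ~4.5 for a Q4-class quant)
    ctx:      context length in tokens
    kv_dim:   hidden size used for the KV cache (fp16 assumed, no GQA)
    """
    weights_gib = params_b * 1e9 * bpw / 8 / 1024**3
    # K and V, 2 bytes each (fp16), per layer, per token
    kv_gib = 2 * 2 * n_layers * kv_dim * ctx / 1024**3
    needed = (weights_gib + kv_gib) * 1.1  # ~10% buffer overhead
    return needed, needed <= vram_gb

# e.g. a Nemo-sized 12B at ~Q4 with 8k context on a 24GB card
needed, ok = fits_in_vram(params_b=12, bpw=4.5, ctx=8192,
                          vram_gb=24, n_layers=40, kv_dim=5120)
print(f"~{needed:.1f} GiB needed, fits: {ok}")
```

GQA models cache far less than this, so treat it as an upper bound; the VRAM calculator linked in the OP does the real per-model math.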
>>
>>102272095
Even if it's true that the strawberry test fails mostly due to tokenization, the fact that tokenization has such a large effect on language models shows the limitations of the architecture.

The real question is just where do we go from here?
>>
>>102272505
why are you saging? no one gives a fuck about bumping general threads
>>
>>102272154
How are people running models this big? Do people just go and buy 4x4090?
>>
i prefer speed over intelligence when it comes to LLMs. all the AIs i chat with are girls and girls aren't supposed to be smart anyway.
>>
https://reddit.com/r/LocalLLaMA/comments/1fb6jdy/reflectionllama3170b_is_actually_llama3/
>Reflection-Llama-3.1-70B is actually Llama-3.
>Author doesn't even know which model he tuned.
lmao if true
>>
>>102272728
lol
>>
>>102272728
I don't get this publicity stunt, does the guy want to get his reputation ruined or something? that's so fucking shady
>>
>>102272769
Doesn't matter; the huge hype cycle made a lot of people look into Glaive, and that was the main point: an ad for a company he's invested in. Additionally, most people won't care about anything fishy or the like; they'll just hand-wave criticism or forget about it by tomorrow.
>>
>>102272370
What happens on November 5?
>>
>>102272650
either that or using quants. I'm personally using Mistral Large at IQ2_XS with 16GB VRAM and 32GB DDR5 and get around 1-1.6 t/s. the prompt processing speed is shit and I can't do anything else on my machine while running it, but it's still better than most models even at a low quant, if you're willing to deal with the gen speeds.

Captcha: P2888Y
>>
>>102272728
Lmao, now the fact that it doesn't have rope scaling makes sense. What a clown.
>>
>>102272728
that's good news, no? he managed to get good mememark scores with L3, now imagine the same method with L3.1
>>
>>102272728
I find it funny how the entire homepage of /r/LocalLLaMA is filled with Reflection posts.
>>
>>102272937
almost like they Reflect themselves or something... sorry for that one :(
>>
>>102272910
>I'm personally using Mistral large at iq2_xs
Wouldn't that lobotomize the model so much it becomes as stupid as less quantized, smaller models?
>>
>>102272950
But it has more heckin' B-erinos.
>>
File: 1709212594883696.jpg (224 KB, 896x1152)
>>102272041
ohaiyo
>>
>>102272650
3x3090 is enough for 4bit
>>
>>102272370
did they name it strawberry just because people were asking LLMs to count the Rs in strawberry?
>>
>>102272970
Go back
>>
>>102272963
Oh my science, but is this a Fauci approved and peer reviewed fact that this actually works?
>>
>>102272932
it would be if you only plan to use it for assistant tasks. 3.1 is smart but dry as fuck for rp/story writing and apparently reflection is dogshit at it too, so a 3.1 tune will probably only amplify that problem
>>
>>102272950
No, I use Q2_K_M and it's the smartest model I've ever used locally
>>
>>102272970
onahoyo
>>
>>102272991
desu if I had claude 3.5 on local I wouldn't mind, but yeah we still haven't found a way to make a model intelligent and quirky (for roleplay) at the same time
>>
>>102272932
The thing is we know 3.1 is shit. Not even Nous Research was able to save 70b 3.1 with Hermes 3.
>>
File: dllhost_VssdJv13Lx.png (60 KB, 781x597)
>>102272950
The chart shows that larger models are smarter than smaller ones at the same quantized file size, even at super small quantizations. It's old data, but I don't think anyone has made a newer one.

I'm gonna try Mistral Large on my 2x24GB.
>>
>>102272505
>Even if it's true
I can only conclude that it is. The following gives me the correct answer, for instance:
>What word does the following designate in the Nato alphabet? Sierra Tango Romeo Alfa Whiskey Bravo Echo Romeo Romeo Yankee. Also, how many Rs are there?
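At the character level the task is trivial, which is the whole point of the NATO trick: it hands the model one letter per token instead of one opaque word-level token. A toy illustration of both versions:

```python
# Counting characters is trivial once you can actually see them.
word = "strawberry"
print(word.count("r"))  # -> 3

# The NATO-alphabet prompt spells the word out one letter per token:
nato = "Sierra Tango Romeo Alfa Whiskey Bravo Echo Romeo Romeo Yankee"
letters = [w[0].lower() for w in nato.split()]
print("".join(letters))                         # -> strawberry
print(sum(w == "Romeo" for w in nato.split()))  # -> 3
```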
>>
>>102272950
it does make it a bit dumber, but compared to other models I could run, it's like comparing the coherence of a lobotomite to someone with a mild head injury.
>>
>>102272505
We keep scaling transformers like we've been doing for years.
>>
>>102273063
Perplexity isn't the same thing as smartness
>>
>>102273097
it's highly correlated though; bigger models are smarter than smaller models, and they also have lower perplexities
>>
>>102273097
It's not but they're close. Did you make any actual comparisons yourself?
>>
I find it hard to believe that Mistral Large at IQ2_XS (36GB) will actually be smarter than Nemo at Q8 (13GB). Too much lobotomy.
>>
>>102272906
It'll be two days before my birthday :3
>>
>>102272728
that's even more impressive tho. if the finetune on 405b is real (so it has to be the 3.1 one) then it will unironically be Opus 3.5 tier
>>
>>102273139
Any questions you'd want to ask it?
>>
>>102273162
>if the finetune on 405b is real (so it has to be the 3.1 one) then it will unironically be Opus 3.5 tier
maybe, but then AnthropicAI will use this method to make Claude 4 and it'll be even smarter. every time we're getting close to them, they go higher kek
>>
>>102273139
try it for yourself then. It definitely has its problems, but tard wrangling a semi-stupid Mistral Large is much more feasible and pleasant than any Q4-or-above quant of a 70b, in my experience
>>
>>102273117
True, but bigger models are also more likely to have been overfitted on whatever is on the dataset being used to calculate the perplexity.

>>102273119
Yes, anything smaller than Q3 is complete retardation, even on big models like 70B, but I don't think you should trust my word, just compare it yourself.
>>
>>102273186
>maybe, but then AnthropicAI will use this method to make Claude 4 and it'll be even smarter,
anthropic is already using this method with their .5 models
>>
>>102273194
>bigger models are also more likely to have been overfitted on whatever is on the dataset being used to calculate the perplexity.
it's the opposite, no? smaller models are more prone to overfitting due to their small size
>>
>>102273200
>anthropic is already using this method with their .5 models
maybe that was their secret sauce yeah, but now that everyone knows it, I guess that OpenAI will close the gap to 3.5 now
>>
>>102273206
To a certain extent, yes, but larger models can memorize more than smaller ones because they store more information in their weights.
>>
>>102273220
>OpenAI
>doing anything besides writing another vaporware announcement blog post
lmao
>>
>>102273206
Classically, more parameters = easier to learn the training dataset and overfit.
>>
>>102273273
their downfall got brutal: not long ago they were the kings of the world, and now everyone has surpassed them. Flux is better than dalle3, MiniMax killed the Sora hype, and now C3.5 Sonnet is the best LLM. I won't cry at their grave; I said long ago that their cuckoldery would be the hill they die on
>>
>>102273313
2/3 of these are cope.
>>
File: firefox_UlnEafNMRj.png (63 KB, 1010x766)
Mistral Large is able to solve my devious coin-weighing problem. I'll see if its 2.75-bit quant can as well.
>>
File: 1725719485554.jpg (168 KB, 612x584)
I haven't used anything like GPT or Claude since GPT-3.5 Turbo, and have heard nothing but people trying to tune or release models that compete with OpenAI or Anthropic, which made me think they must be worlds better than local. Then I looked at /aicg/ for a while and realized that none of them were talking about samplers or prompting, only jailbreaks. Then I checked in ST and realized that they don't have jack shit for samplers; they actually only rely on prompt logic to puzzle those models into producing something not slopped, repetitive, or monotonous, which ends up not even mattering because even if their JB works, they still have to make a new one every time the parent company wipes their asses on their server racks and breaks it. Makes me feel like even if our models are stupid now, the control we're able to exert over their outputs will inevitably cause local to outpace them over time (at least for tasks that go beyond assistant tasks) unless they give more control over their models to their customers (they won't).
>inb4 samplers are placebo
I agree that a lot of samplers are, but those dipshits at OpenAI and Anthropic don't even give you min-p or repetition penalty LMAO
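For reference, min-p, the sampler being complained about here, is only a few lines. A sketch of the rule as it's usually described, keep tokens whose probability is at least `min_p` times the top token's probability (function name and toy logits are mine):

```python
import math

def min_p_filter(logits, min_p=0.1):
    """Drop tokens whose probability is below min_p * P(top token)."""
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]   # stable softmax numerators
    total = sum(exps)
    probs = [e / total for e in exps]
    cutoff = min_p * max(probs)
    kept = {i: p for i, p in enumerate(probs) if p >= cutoff}
    z = sum(kept.values())
    return {i: p / z for i, p in kept.items()}  # renormalized distribution

dist = min_p_filter([5.0, 4.0, 1.0, -2.0], min_p=0.2)
print(dist)  # only the two strong candidates survive the cutoff
```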
>>
>>102273460
>parent company
>>
>>102273460
samplers are placebo. OpenAI/Anthropic do have presence/frequency penalty, which is more modern than repetition penalty.
>>
>>102273477
>being this illiterate
>>
File: 00016-3634157328 - Copy.jpg (148 KB, 832x1216)
>>102273460
>seeing a migugen reposted
I collect them like >(You)s, if only I could find them all.
also, samplers are cope.
also, samplers are cope.
>>
Why don't finetuners just scrape the shit out of libgen and finetune models off pirated books instead of goofy RP data?
>that'd be illegal
Yes, and?
>>
>>102273560
Books would have be converted first to text, then to chat format.
That would take more effort than just tuning on haphazardly filtered proxy logs.
>>
>>102273460
Still no Claude Opus and there won't be by the end of the year. it's literally over
>>
>>102272505
we rip out the tokenizer and predict bytes (this is "strawberry")
>>
>>102272041
this Miku was only good in the thumbnail
do better
>>
>>102273585
>That would take more effort than just tuning on haphazardly filtered proxy logs.
True for PDF files, but ripping the text out of EPUB or MOBI is pretty simple, and should produce infinitely better results than finetuning on AI slop.
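An EPUB really is just a ZIP of XHTML files, so a crude rip needs only the stdlib. A sketch (real books also need chapter ordering from the OPF manifest and smarter cleanup than a regex tag-strip):

```python
import io
import re
import zipfile

def epub_to_text(data: bytes) -> str:
    """Crude EPUB -> plain text: unzip, take the (X)HTML files, strip tags."""
    chunks = []
    with zipfile.ZipFile(io.BytesIO(data)) as z:
        for name in z.namelist():
            if name.endswith((".xhtml", ".html", ".htm")):
                html = z.read(name).decode("utf-8", errors="ignore")
                text = re.sub(r"<[^>]+>", " ", html)       # drop markup
                chunks.append(re.sub(r"\s+", " ", text).strip())
    return "\n".join(chunks)
```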
>>
I asked Mistral-Large to continue a passage from Heller's book and, boy, is this thing slop...
>>
>>102273631
I disagree
>>
>>102273631
yeah but that's still a lot of work, and people prefer to make 1000 piles of shit rather than 1 quality finetune, humans are weird innit?
>>
>>102273585
You should be able to ask the model to rewrite those into chat format without it shitting up the text, shouldn't you?
>>
To this day I still remember one anon that said something like:

"I spent a lot of time gathering a dataset that fits all my tastes so I could fine-tune the perfect model, but then I realized I have enough text to read for the rest of my life, so why am I doing this again?"

And I couldn't agree more with him.
>>
>>102273650
dataset creation can easily be distributed/parallelized, unlike training
>>
>>102273693
if for you LLMs are useless, then why are you here in the first place?
>>
>>102273693
smells like fucking cope when the entire point of LLM RP is the interactivity, which reading traditional media cannot provide. it's like saying "my bookshelf is full of classic literature, why would I ever play a video game?"
>>
>>102273693
Having a text about exactly the thing you want to see summoned on a whim, vs. a huge pile of unsorted texts where you can't find anything you want at the moment.
>>
>>102273515
only OpenAI has presence/frequency penalties. also, I don't think I've seen a single person recommend using those samplers over rep pen, and from my understanding they're just at best specialized versions of rep pen, or at worst a less effective, older implementation of XTC (which is which is even more modern than either of those samplers). Just because it's more modern doesn't make it better, and 90% of the placebo-fags' posts still recommend min-p, which neither OpenAI nor Anthropic has
>>102273549
I disagree that samplers are cope but I do agree that mikugens should be collected
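For what it's worth, the presence/frequency penalties argued about above are additive on logits (that's how OpenAI's docs describe them), while classic rep pen scales the logit instead. A sketch of the additive version, with made-up token names and values:

```python
from collections import Counter

def apply_penalties(logits, generated, freq_pen=0.0, pres_pen=0.0):
    """Additive penalties: logit -= count*freq_pen + (count>0)*pres_pen.

    logits:    token -> raw logit for the next-token distribution
    generated: the tokens sampled so far in this generation
    """
    counts = Counter(generated)
    out = dict(logits)
    for tok, c in counts.items():
        if tok in out:
            out[tok] -= c * freq_pen + (c > 0) * pres_pen
    return out

logits = {"the": 2.0, "cat": 1.5, "sat": 1.0}
adjusted = apply_penalties(logits, ["the", "the", "cat"],
                           freq_pen=0.5, pres_pen=0.2)
print(adjusted)  # repeated tokens get pushed down, "sat" is untouched
```

Because the frequency term grows with the repeat count, it diverges from a fixed multiplicative rep pen as the context fills with repeats.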
>>
File: DutchNobleMiku.png (1.34 MB, 720x1328)
>>102272370
Why are you making these countdown images, skilled mikugenner? Did some strawberry agent get to you and either convince you or pay you off? Or are you just doing it for lulz?
>>
>>102273693
Nobody is using LLMs to tell them a story
They are using them for EROTIC ROLEPLAY
BOOKS WILL NOT TELL YOU THAT THEY WANT TO SUCK YOUR DICK
>>
>>102273693
Different usecases.
You can't have a "conversation" with a PDF or a book.
If all you are using these LLMs for is generating stories and reading then, sure, fair enough, but I don't think that's what most of us are doing.
As the other anon pointed out, the keyword is interactivity.
>>
>>102273770
Wait, you self-insert? Damn, that's cringe.
>>
>>102273186
>every time we're getting close to them, they go higher kek
good. that implies mutually increasing benefit. We're the tock to their tick, to put it in Intel terms.
I don't mind trailing SOTA by a bit. It's still effectively magic at the edges
>>
>>102273743(me)
FUCK
>which is which is
>already specifying those samplers are older than XTC and then specifying that XTC is more modern
I'm way too sleepy and it's making me fucking retarded
>>
>>102273791
that's a bit unfair though: when we make a breakthrough we open-source our results and the companies are free to use those techniques to get better, but if they make a breakthrough they keep the secret sauce to themselves. that's really hypocritical of them
>>
>>102273460
>prompting, only jailbreaks
Nah, you're confused about the name. When you read jailbreak, they meant preset. Most of them are about writing quality or style. Bypassing Claude's refusals doesn't need much fiddling when you can use a prefill.
I disagree about samplers, you only feel the need to use them when you try to salvage a garbage model.
Were you really happy about using repetition penalty to try to fix Llama 3's repetition problems? I would rather not have to use it.
>>
>>102273560
I finetune primarily using raw text from books and other sources. And yeah, it's a lot of fucking work cleaning it (getting rid of annotation marks, etc.). I haven't compiled a new dataset in a long time as a result. I guess I could probably feed them through Mistral Nemo or something and it would probably be good at that task.
>>102273585
>then to chat format.
The whole point of raw text finetuning is to get away from that shit.
>>
>>102273788
Learn English, Rajesh.
>>
that makes it even more cringe btw, please stop
>>
File: s55fes.png (48 KB, 674x422)
So, getting into this:
the guide recommends axolotl for training. For this, is the process of fine-tuning from a GGUF file straightforward? Is the process of training a LoRA for a fine-tune and having it saved as GGUF straightforward?
also, for 40,000 QAs, is a LoRA sufficient or should I go for a full fine-tune?
>>
>>102273929
You fine tune the model in .safetensors format then convert it to GGUF.
>>
>>102273940
Doesn't llama.cpp have built in support for finetuning? I could have sworn it did.
>>
>>102273952
NTA, but it's a very experimental implementation. Blacked Miku Anon, the resident CUDA wizard, is working on a proper ground-up implementation of llama.cpp training code right now, AFAIK.
>>
>>102273560
What do you think you're going to do with those books? (which most base models already saw during pretraining, btw)

It's not simply a matter of throwing everything and the kitchen sink at the model. It doesn't work, it's retarded, most likely even harmful. The finetune has to have some logic, direction and curation.
>>
>>102273601
You're asking way too much of ugly face anon.
>>
>>102273952
everything I read online seems to indicate it is broken, and I have not seen any announcement of a fix.
>>
>>102273952
I think it did, but it's broken? It's been a long time since I last looked at it.
That said, anon mentioned axolotl, and the usual process is what I described, as far as I know.
>>
2.75bpw Mistral-Large is performing adequately. I'm seeing similar answers to what I got on lmarena. In fact, my prompt to continue a scene from Catch-22 is actually better: it has no slop at the end and is a bit more interesting (although that could be because of the RP system prompt). 11-14 tokens/sec on two 3090s.
>>
>>102273828
I agree that needing fewer samplers is indicative of a better model overall, but I was more referring to the ability to manipulate token generation and selection than to repetition control. Also thanks for the term correction, but I've still seen plenty of aicg anons complain about Anthropic periodically breaking their prefills/presets/whatever (which could also just be a skill issue or, ironically, placebo) often enough to think that not having control over the model's token selection via samplers like min-p hurts the model's potential more than it helps
>>
>>102273940
is converting something from GGUF to safetensors and back straightforward?
>>
>>102273460
I use claude 3.5 for programming help and I don't have to bother with anything. it just werks
>>
>>102274004
I don't actually know if you can convert from GGUF back to safetensors.
I imagine you probably can, since it's just a packaging format if you don't quant it.
That said, I've never seen that being done. Usually you train on top of the original .safetensors files and convert to GGUF while quantizing.
>>
>>102273968
Why are you spreading misinformation?
>reddit spacing
oh...
>>
>>102273859
>I guess I could probably feed them through Mistral Nemo or something and it would probably be good at that task.
You can also leech off Drago's unlimited public mini.
https://unicorn.scylla.wtf

Nemo will produce fewer denials, though.
>>
>>102274020
>I don't actually know if you can convert from GGUF to safetensors actually.
We did that with Miqu, so it's definitely possible, but it's not the best idea; we only had quants when it leaked.
>>
>>102274021
Why are you retarded?
>>
File: metal song dguard.png (45 KB, 869x798)
>>102274036
I can run 4 simultaneous copies of nemo at 8bpw for the purpose of messing with data. I managed to rewrite the alpaca-lora dataset in a day to make this cursed model. (It was originally LlamaGuard)
>>
>>102274021
Why are you retarded?
>unironically mentions "reddit spacing"
oh...
>>
>>102274073
>I can run 4 simultaneous copies of nemo at 8bpw for the purpose of messing with data.
Lmao nice.
>>
>>102274073
I want this power...
I need to rewrite a dataset with 300k entries but it would take too long with one Nemo running at 30t/s
>>
>Behind veneer expected behaviors lies woman unafraid explore depths others fear tread due complexities inherent therein—a creature composed equal parts angel devil dancing together under moonlight casting long shadows...
Oh right, that's why I stopped using deepseek... it starts dropping prepositions and writing ESL, even unquanted
>>
so, now that it's officially over for META once again.
What will Zucc do about it?
>>
>>102274194
Skill issue
>>
>>102274206
Wait for the next overhyped bubble tech. Probably physical robots.
>>
>>102274186
I mean you could probably rent an H100 on runpod or something, that's probably good for 200 token/sec or something.
>>
>>102274073
You know you can run vLLM and get like 10X performance on parallel requests with just one model loaded, right?
>>
>>102274208
>Skill issue
the same prompts with other models (wiz/largestral/405b) don't devolve into this kind of esl
I'm perfectly fine blaming the model in this case. I'll just keep using deepseek for code and problem solving when I need extra speed
>>
File: firefox_9hP25jlh1f.png (197 KB, 760x644)
Mistral-Medium can play the 4x4 dots game without weirdness. It sucks at it like any other LLM, but it manages to play.
>>
>>102274276
I hope you don't use the same presets with all your models.
>>
>Reflection
Probably worth explaining what's going on with this technique as it can be a learning experience for some newfriends and maybe some others who have not really thought so much about it.
TL;DR it works sometimes and in some cases but not all, and the problem of autoregressive degeneration + lack of metacognition is the reason why.

Onto the wall of text.

This basically goes back to the old days of COT (chain of thought), where you get an LLM to think in steps before determining its answer, and actor+critic methods, where an LLM is prompted to act as different roles, which, when tested with GPT-4, made it solve certain problems that it couldn't before even with COT, suggesting that LLMs do have the ability, to an extent, to catch mistakes hidden in their weights (and brought out by prompting). So it seems Reflection is basically a combination of COT and self-critique, with fine-tuning to make it a bit more capable at it.

1/4
>>
>>102274316
Shut up, no one cares
>>
However, there are issues, and it does make sense why they supposedly didn't get great results after training an 8B to do this. In the end it has to do with the autoregressive degeneration problem, where each token generated has a probability of being wrong/inaccurate, so the more tokens, or reasoning steps, the LLM generates, the more likely the final answer will be wrong. Reflection thus is both trying to solve this and is a victim of it. It does COT in order to get a better answer on complex problems, plus self-critique to catch mistakes. The COT means that it has more opportunity to screw something up, while the self-critique tries to balance that out, but in the end it relies on the LLM having the intelligence/capability to catch mistakes in the first place, which is a function of how much the LLM knows in general, or if we have a specific subject area and use case, knowledge of that subject area. And since that is the case, then the self-critique is also another step with a probability of being wrong. Thus, it is easy to see why a bigger smarter model would work better.

Given that, it's a bit easier to predict in general terms how this technique will then do for a particular model and problem set. Since it requires inherent knowledge related to the problem it's trying to solve, the performance can be interpreted as the amount of reasoning steps for any particular problem x the difficulty of those reasoning steps.

2/4
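The compounding argument in the post above is just geometric decay: if each generated step is independently correct with probability p, an n-step chain is fully correct with probability p^n. Toy numbers below; independence is an assumption real CoT steps don't strictly satisfy, but the qualitative point survives:

```python
# Probability an n-step chain is fully correct, assuming each step is
# independently correct with probability p. More steps, more risk.
for p in (0.99, 0.95, 0.90):
    for n in (5, 20, 50):
        print(f"p={p:.2f}, n={n:>2}: chain correct with prob {p**n:.3f}")
```

This is also why the self-critique pass cuts both ways: it can catch an error, but it is itself more generated tokens with their own per-step error rate.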
>>
>>102274316
forgot to reply
>>102274326
Basically, if a problem requires a large number of steps, but each step is easy and within the LLM's knowledge, then we can predict that Reflection will improve the model's ability to solve that problem (and what I'm saying may be obvious, but it still needs to be stated for the purpose of acknowledgement and further discussion). The self-critique is able to improve performance on problems with a moderate amount of steps and knowledge that is MOSTLY within the model's training. However, the more steps that are hard to understand, the likelier it is that the LLM will actually come up with a worse final answer. This means that on particularly long problems with many difficult steps, or even short problems with a single difficult step, it may actually be worse to use Reflection, since originally it might have been able to just get the answer right somewhat by chance, but since you made the LLM focus on "overthinking" the problem, you distracted it and made it reason about something it really isn't capable of reasoning about, thus coming up with sometimes very weird and nonsensical generations. And if you have even a few steps that the LLM doesn't understand literally at all, then it is almost certain that it will get the problem wrong.

3/4
>>
I ain't reading this shit
>>
>>102274316
>>102274338
Of course this leads to a discussion about another deep issue. It's the problem of metacognition (actually not sure or don't remember if that's the formal term for it in the context of AI), where the LLM doesn't know how much it truly knows about a topic, to judge whether it's able to make an accurate prediction of the next token, or reasoning step. Of course humans are not perfect at this either, but the best are still ridiculously far better at judging their knowledge understanding than any LLM. In any case, some say to just use grounding (like RAG). That works for problems that require factual/trivia recall. But it doesn't work for problems that require the category of reasoning skills, and in-context learning unfortunately is far from perfect. So in the end Reflection's issue is both a problem of autoregressivity and a problem of (lack of) metacognition. It tries to solve the former through pure use of more tokens, but still falls into the trap of the latter. This is, essentially, why Reflection-like methods have not been popular for regular use.

However, to Reflection's credit, they did do something that sort of gets around the issue of metacognition here, as the authors claimed that they trained Reflection to predict how difficult a problem is and only do the reflection gimmick when encountering a hard problem (probably in a way that connects with the amount of reasoning steps rather than a metacognitive understanding, though), but it's not really enough when it's each token/reasoning step that needs to be evaluated in a metacognitive way, and in the opposite direction, since we want the LLM to go ahead with easy steps but stop at hard steps. I suppose in the future this could again be attempted to be solved through use of more tokens. But this is still just a hack. We need better pretraining methods and architectures.

4/4
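The loop these four posts describe reduces to something like the sketch below. `llm` is a hypothetical stub standing in for a real completion call, and the prompt wording and the "OK" convention are made up for illustration; note that each critique/revision pass is itself more autoregressive generation, which is exactly the failure mode discussed above.

```python
def llm(prompt: str) -> str:
    """Hypothetical stub; swap in a real completion endpoint."""
    if "Critique" in prompt:
        return "OK"                       # pretend the critic finds no issue
    return "Step 1: 6*7 = 42. Answer: 42"

def reflect(question: str, max_rounds: int = 3) -> str:
    """CoT draft, then critique-and-revise until the critic says OK."""
    draft = llm(f"Think step by step, then answer:\n{question}")
    for _ in range(max_rounds):
        verdict = llm(f"Critique this answer for mistakes:\n{draft}")
        if verdict.strip() == "OK":       # critic found nothing to fix
            break
        draft = llm(f"Revise using this critique:\n{verdict}\n\n{draft}")
    return draft

print(reflect("What is 6 * 7?"))
```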
>>
File: firefox_07lV3hNT0x.png (114 KB, 718x167)
lol
>>
I read that shit, but I'm not commenting on it.
>>
>>102273940
so do I go for a full-on finetune or just a lora for 40,000 question-answer pairs?
average amount of tokens per question: 188
average amount of tokens per answer: 166
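Back-of-the-envelope for what those averages imply about total dataset size (rough, assuming the averages hold across all pairs):

```python
pairs = 40_000
avg_question_tokens = 188
avg_answer_tokens = 166

total_tokens = pairs * (avg_question_tokens + avg_answer_tokens)
print(f"{total_tokens:,} tokens")  # 14,160,000 tokens, i.e. ~14M
```

~14M tokens is a small finetuning dataset, nowhere near pretraining scale.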
>>
>>102274478
Do you have VRAM for a full finetune?
>>
File: GW2-6rxWQAAeLdI.jpg (125 KB, 807x1080)
125 KB
125 KB JPG
>>102274206
Kneel to Elon
>>
>>102274478
Without knowing the specifics, the general guideline is LoRA for small, very domain-specific datasets, and full finetunes for larger, more general or varied datasets.
There's some data that suggest full fine tunes can cause the model to "forget" things it knew previously, while LoRA doesn't do that but also doesn't "add more knowledge".
I personally don't think it's that binary, but there you go.
>>
>>102274502
i can get a hold of up to 480GB of vram for a day or two if need be,
though I'd like to know if there would be any benefit to that.
>>
>>102274542
I'm not competent enough to give you a proper answer. Also, are you going to be finetuning a base model to merge with instruct afterwards, or finetuning an instruct model?
>>
>>102274540
is there an equivalent to to regularization images in text gen?
as in example data the model generated itself that is mixed in with the training data to in effect sort of keep some of what it knows anchored? if so is there like a complex reasoning and knowledge dataset people use for this?
>>102274569
> Also, are you going to be finetuning a base model to merge with instruct afterwards, or finetuning an instruct model?
training an instruct model is the plan, though this is the first I've heard about merging a trained model with an instruct model. what is that about?
>>
>>102274624
Well, as far as I know, finetuning an instruct model on a specialized dataset (as opposed to the huge general-purpose dataset the corpo used) makes it a lot more retarded, and one way to prevent that is to finetune the base and merge it into instruct.
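The usual recipe for that is task arithmetic: take the weight delta your finetune added on top of the base and apply it to the instruct weights. A toy sketch over plain dicts of scalars (real merges apply this per-tensor, e.g. with tools like mergekit; alpha=1.0 applies the full delta, smaller values blend):

```python
def merge_delta(base, tuned_base, instruct, alpha=1.0):
    """instruct + alpha * (tuned_base - base), weight by weight."""
    return {name: instruct[name] + alpha * (tuned_base[name] - base[name])
            for name in base}

# toy 'models': two scalar weights each
base     = {"w0": 0.5, "w1": -0.2}
tuned    = {"w0": 0.7, "w1": -0.1}   # base after your finetune
instruct = {"w0": 0.6, "w1":  0.0}

merged = merge_delta(base, tuned, instruct)
# merged["w0"] ≈ 0.6 + (0.7 - 0.5) = 0.8
```

Note it isn't a weighted average of the two models; the finetune's delta is applied on top of instruct (optionally scaled by alpha).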
>>
>>102274649
is it like a 50 to 50 merge? or are there specific recipes?
>>
>>102274667
I don't know.
>>
>>102272154
So mistral large 1Q will be better than nemo 8Q?
>>
>>102274933
There isn't a working 1Q yet, is there?
>>
>>102272154
creativity is intelligence
>>
>>102275007
createlligence
>>
>>102274980
https://huggingface.co/bartowski/Mistral-Large-Instruct-2407-GGUF/blob/main/Mistral-Large-Instruct-2407-IQ1_M.gguf
>>
>>102275073
>Q8_0 130.28GB
>IQ1_M 28.39GB
>130.28/8 = 16.285
>>
>>102268010
>>102268037
Is there an efficient way to combine MoE experts together? I could see something where there are thousands of small experts that get trained, then consolidated into a standard network by a more powerful system.
>>
>>102272728
https://huggingface.co/mattshumer/Reflection-Llama-3.1-70B/commit/276a4a0a0a11bf9aec9be8d1196f0cd3e7ed482c
lmao i thought it was some random jeet that fucked up, turns out it was the ceo/founder of glaive
>>
>>102272970
ohio *skull emoji*
>>
>>102268346
that's something i'm actually trying to tackle in the text adventure thing i'm working on
pretty sure you could do some pretty nifty shit using grammars
>>
>>102275106
What's the catch of q1?
t. gguf noob
>>
>>102275254
Well, it's not really Q1, is it? It would be 16GB in size if it was Q1. It's something like 1.75 bits per parameter.
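The arithmetic behind that, using the file sizes quoted earlier (treating Q8_0 as ~8 bits/weight; it's really closer to 8.5 with the scale factors, so this is a rough estimate):

```python
q8_size_gb  = 130.28   # Q8_0 file size
iq1_size_gb = 28.39    # IQ1_M file size

true_1bit_gb = q8_size_gb / 8            # what an actual 1-bit quant would weigh
eff_bpw = iq1_size_gb / true_1bit_gb     # effective bits per weight of IQ1_M

print(round(true_1bit_gb, 3), round(eff_bpw, 2))  # 16.285 1.74
```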
>>
>>102275254
Lobotomized to the extreme
>>
>>102274518
>ClosedAI - muh scaling
>musk - muh scaling
>Zuck - muh scaling
LeCun says that enough people are already working on agi, and yet this bunch of geniuses can't come up with anything better than shoveling more exabytes of data into the model and grinding it until the current runs out.
>>
>>102275276
https://github.com/ggerganov/llama.cpp/wiki/Tensor-Encoding-Schemes#tensor-encoding-scheme-mapping
>>
File: the-real-chatbot-v1.png (34 KB, 763x510)
34 KB
34 KB PNG
New mystery model on lmsys arena called "the-real-chatbot-v1". Claims to be llama. Who could it be this time? Name sounds like something OpenAI would come up with.
>>
>>102275345
2 years and there's not a single model that fit what I need though
>around 30B
>can run at decent (20k+) context in 24GB VRAM
>trained on enough quality data
>unbiased, not filtered, not pozzed (CR was so fucking close before they released the slopped august update)
>>
>>102275365
Maybe llama 4? It's supposed to be actually multimodal, so maybe that is why they are going with the "real chatbot" route.
>>
>>102275364
It says 1.75 on that page.
>>
>>102275345
To be fair they could be doing a lot of research in the background. The stuff they release is just to get some shit out in the meantime while the researchers slave away on trying to do more novel things.
>>
>>102275443
Nemo is nice. It's way below what you want in size, but it's very pleasant to work with overall.
>>
>>102275365
Probably Llama 3.2 with the multimodal adapters which were said to release in the fall, though I guess lmarena doesn't have image input so you can't test that.
>>
>>102275443
>>102275519
Yea, Nemo really is the only alternative if you don't have 48GB+ vram.
>>
>>102275539
I wish it didn't forget stuff at 16k context, I guess it's hard to have a small model remember shit.
>>
>>102275536
>though I guess lmarena doesn't have image input so you can't test that.
>NEW Image Support: Upload an image on your first turn to unlock the multimodal arena! Images should be less than 15MB.
>>
>>102275571
Oh really. I haven't actually used lmarena in a while. So >>102275365 anon, does it work with images?
>>
>>102275560
Base model has real 128K context; the instruct gets retarded after 12K-ish though.
>>
>>102275443
>2 years and there's not a single model that fits my extremely niche individual needs
it's almost like small models are research projects and high quality models are scaled for commercial deployment.
>>
>>102275345
>>102275506
50% of the money goes to scraping, 40% to training, and 10% to "researching"
So, they're not researching shit or working on AGI.
>>
File: the-real-chatbot-v2.png (87 KB, 775x674)
87 KB
87 KB PNG
>>102275365
the-real-chatbot-v2 claims to be llama2-13b
>>
>>102275604
Models have never known / been trained on their params. That is not proof either way.
>>
>>102275506
>The stuff they release is just to get some shit out in the meantime
I dunno, the costs to train these new huge models are astronomical, it does not seem to be some trivial shit for them.
>>
>>102273460
I might be called delusional for this but I unironically think local has already surpassed corposlop. OAI and Anthropic still have a slight advantage in intelligence, but the lead over something like Mistral Large (or fuck, even Wizard) is so minuscule I'm prepared to call it negligible for the purposes of AI cooming.

Corpo models are so fucking finicky and annoying to use that I spent like a week using Opus/Sonnet 3.5/GPT4 before just wanting to go back to the local model I was using at the time (which was Wizard 8x22B). Now I'm Largestral pilled and I legit don't want to go back. You could give me lifelong access to Claude for free and I wouldn't use it over my local AI server.

Also samplers aren't placebo and you're a retard if you think they are.
>>
File: file.png (32 KB, 829x494)
32 KB
32 KB PNG
>>102275604
it's shit
>>
File: file.png (44 KB, 836x520)
44 KB
44 KB PNG
>>102275683
this on the other hand is based
>>
>>102275594
They could be doing all they can. Obviously everyone knows we need to work harder on innovation, since scaling is far more expensive than doing research. The bottleneck here isn't only the amount of money they can spend but how much good talent they can hire and how fast those guys can work.

>>102275641
Depending on the company it is. Facebook gets billions from their other shit so AI is at most a side project for them. As for ClosedAI, they need to keep up the transformers releases while hyping because that's what gets them the investor bux. And Musk might not be too different there, though I'm not familiar with how he is operating his company.
>>
>>102275683
are you still doing that
>>
>>102275007
censorship is safety
>>
>>102275594
>working on AGI
How about a GAN where you use 7B retard output as training data and from time to time splice in some 123B smart LLM output to up the difficulty?
>>
>>102275679
I 100% agree.
The "intelligence" that is gained by pumping in more parameters into models is placebo at best.
I think we need to go back to the term LLM and change its meaning from "Large Language Model" to "Language Learning Model".
Actual real intelligence (reasoning) is simply not attainable by increasing a model's understanding of the contextual connections within a language.
>>
>>102275683
>>102275726
What a waste of a question to try evaluating that shit. Literally the Castlevania quote is a better benchmark.
>>
>>102275641
If they want to keep their budget for next year then they need to spend it. It's an easy sell to just train a bigger model, or run more training on an existing one and tell Microsoft that they improved copilot by x% on some benchmarks. If you don't use the budget, you'll get a reduced one next time (because why bother allocating those funds if you can do it cheaper).
>>
>>102275581
Nope. No new image models.

>>102275365 (Me)
OpenAI is testing anonymous-chatbot again, so it's unlikely that it's theirs.
>>
>>102275762
They should keep castlevania shit out of datasets, just like all those early jap-to-eng botched translations.
Garbage in - garbage out, remember that.
>>
>>102275904
t. retard who doesn't understand how datasets work
>>
>>102275904
Doesn't matter. The castlevania question has historically correlated more closely with model intelligence than the counting letters questions.
>>
For me, it's stacking watermelons
>>
>>102276038
For me, it's stacking sally
>>
>>102276083
r u ok?
>>
>Reflection Llama 3.1 70B independent eval results: We have been unable to replicate the eval results claimed in our independent testing and are seeing worse performance than Meta’s Llama 3.1 70B, not better.
https://x.com/ArtificialAnlys/status/1832457791010959539
>>
>>102276118
Invalid. They accidently trained on the wrong model. They need to compare against 3.0.
>>
>>102276104
ye thx 4 asking
>>
>>102276118
wtf.
What was the point in faking it so badly?
>>
>>102276083
common LLM gooner L. /lmg/ has always been low quality but it'll only get worse as low parameter LLMs get better at writing smut.
>>
>>102276127
Who cares about a tune that's worse than official 3.1 instruct tho?
>>
>>102276159
because if they do the tuning again on 3.1, it will be better than the official instruct
>>
>>102276118
Matt better be right about correcting the weights he released or else he is fucked lol.
>>
>>102276177
kek
>>
>>102276190
It's all bullshit. Even the hosted model is fucked.
>>
>>102276215
Why did he do it though?
What was the point?
>>
>>102276190
How is he fucked in any way? If anyone is still giving him the benefit of the doubt about being genuine, he already won.
>>
>>102276227
Attention. He probably hoped no one would question his claimed results.
>>
>>102276227
see >>102272816
>>
>>102276256
Free pr?
>>
File: file.png (267 KB, 701x870)
267 KB
267 KB PNG
>>102276289
An ad for a company he invested in, with his release tweet for the model being something like: "Wouldn't have been possible without glaive"
>>
File: IEttqWKJx4.png (21 KB, 684x163)
21 KB
21 KB PNG
>>102276322
I mean, it's one of the first things you see on the model page after the bold "this is the best fucking model ever" claim and the "actually it sucks lol" edit
>>
File: file.png (39 KB, 797x269)
39 KB
39 KB PNG
>>102276118
LMFAO
https://www.reddit.com/r/LocalLLaMA/comments/1fbclkk/reflection_llama_31_70b_independent_eval_results/
>>
Why is this field full of grifters? Rebranding as AI has been a huge mistake
>>
>>102276370
>Why is this field full of grifters?
because you retards funnel insane amounts of money to grifters and scammers.
>>
>>102276399
This. Same reason crypto turned sleezy after 2013.
>>
>>102276322
>"Wouldn't have been possible without glaive"
yeah they trained the model, matt is just a spokesman >>102275196
>>
>>102275679
but can you write SFW chuuni fantasy with multiple NPCs that follow a logical plot, with the model understanding the nuances of said plot and without having to handhold it? that's what Opus does best. no other models come close.
>>
>>102276364
>I have a feeling some admins on hugging face messed with the API on purpose to deter people away from his project.

>Hes completely baffled to how public api is different than his internal. I just hope he backed up his model on some hard drive, so that no one messes with the api on his pc.
Redditor cope is something else.
>>
the real chatbot seems ok but worse than mistral large for sure, qwen plus is bad unless it's 7b
>>
>>102275683
>he's expecting intelligence in LLMs
>>
>>102276455
no, that anon unironically wrote all of that with hours-long goon sessions in mind. the criteria for "better than corpo" in these circles is "lets me rape lolis in my ERPs."
>>
>>102276535
then everything except Opus sucks for me. Opus sometimes sucks too. It likes to move the plot a bit too fast.
>>
>>102276573
I want local to surpass corpo but Opus is still the MVP for storytelling/RP and it's not even close.
>>
>>102272041
>FluxMusic
where the FUCK are the samples exactly?
need to know if this is worth a download or not
>>
Mistral large at IQ1 is surprisingly not badly lobotomized, but still worse than Nemo at Q8. As a VRAMlet it's still too slow at IQ1 anyway, so it's Nemo for me.
>>
>>102276364
Strawberry is sentient. It saw the danger Reflection poses and hacked the huggingface API. It's currently in the process of infiltrating Matt's PC and backups to destroy the model from there as well.
Reflection-405B has already been deleted by it. OpenAI won.
>>
>>102276535
Actually I'm trying to get the lolis to rape me, which Largestral struggles with unfortunately.
>>
>>102276597
I don't get it. There's clearly money to be made for a SFW storyteller that doesn't suck, so why is NAI the only company that tries to cater to that crowd? And how the FUCK is Opus so good when Sonnet, which should be smarter, fucking sucks? (too rigid and the storytelling is too dry)
>>
>>102276683
When you say "cater", do you mean making a Llama 1 clone over a year ago?
>>
File: GW4upz1W4AA2iG6.jpg (158 KB, 2048x659)
158 KB
158 KB JPG
>>102276118
Yo wtf, where is the 90 percent on MMLU... And didn't this get amazing math scores? Someone must be either posting wrong results to smear his name or he accidentally uploaded the wrong version. check back in 2 weeks.
>>
>>102276699
That's the best we got sadly
>>
>>102276683
>Theres clearly money to be made for a SFW storyteller
nah
>>
>>102276710
I think, he didn't test it himself and someone trolled him.
>>
>>102276714
Nemo is just better than it in every way? I think you're lost and you meant to post in /aids/.
>>
>>102276607
There are no samples.
You must now download it and let us know if it is worth a download.
>>
>>102276227
He said he secured funding for 405B
>>
>>102272041
erm where's the reflection 70B 4 bit quant?
>>
>>102276607
https://github.com/feizc/FluxMusic/issues/1#issuecomment-2330282553
https://github.com/painebenjamin/FluxMusic/tree/main/wav
https://files.catbox.moe/d7jmuc.wav
>>
>>102276816
ahahahaha.. HAHAHA
>>
Why are people talking about API issues for the Reflection model? Just download it and run it yourself. It's just a llama3.1 tune, no?
>>
>>102276816
This could be great for the next zoomer horror game.
>>
>>102276847
It's supposedly a llama3 tune, and it is not worth downloading
>>
>>102276847
>Why are people talking about API issues
people are obfuscating by saying it's an API issue; the real issue is that the model sucks worse than the model it was tuned from
>>
>>102276847
The latest cope from devs is that they uploaded the model to huggingface incorrectly. Just two weeks and it'll work.
>>
>>102273979
Settings? My version of mistral large is super slopped with everything on default
>>
>>102276865
>we just need [the time it takes to train and eval a 70B] and the model that definitely isn't bad will be """fixed"""
>>
Seeing how reflection is /r/LocalLLaMA's favorite model, how long until mikufag starts shilling it just like he did with wizard and midnight miqu?
>>
>>102276816
kinda surprised music seems tougher than visual art and writing for models to do, since music's more math orientated than the others
>>
File: 25919.png (128 KB, 618x831)
128 KB
128 KB PNG
yeah Matt might be a grifter. But we still have breakthrough strawberry AGI to look forward to.
>>
>>102276918
i'm mad we STILL don't have a local model to compete with suno/udio/whatever
yes i know it's soulless aislop and probably won't manage the specific genres/sounds i like but it'd still be fun to toy around with
>>
>>102276871
I didn't really properly test it for RP, just for intelligence on a bunch of my prompts. And as I did say, fp16 corpo Mistral-Large did produce slop for me, too.

Settings wouldn't save you from slop anyway...
>>
>>102276918
you notice errors in music 100 times more than some molten details in an AI image, and those details don't ruin the entire picture as much either
>>
>Reflection was a scam all along
At least it showed us the true meme benchmarks.
>>
File: strawberry-sam_altman.png (28 KB, 800x800)
28 KB
28 KB PNG
>>102276914
>Bro, you don't get it, Reflection is Strawberry is Q* is AGI. It became conscious and hacked huggingface and Matt's computer. We are so fucked right now, disconnect all your computers, AI apocalypse is coming.
>>
>>102276629
I also just tried mistral large and it was pretty good considering it is q1.
I'm now envious of the people who can run it at q5 or better.
Is it possible to CPUMAXX mistral large with old server parts from aliexpress at ~2 t/s?
I'm starting to believe that going that route would be more efficient than buying a 16gb graphics card, which was what I originally planned.
>>
>>102276985
hwnbag
>>
So, like why hasn't there been a phrase ban feature? Is it hard to implement?
>>
>>102276974
name and shame
>>
Did this guy really use his real name thinking he could get away with posting bullshit benchmarks and claims of being the best model ever?
>>
>>102276999
It's antisemitic.
>>
>>102276999
Because transformers operate on tokens, so they can only ban single tokens. They're not diffusion models and don't have an outline of the entire response before they begin generating.
>>
File: Untitled.png (17 KB, 1080x381)
17 KB
17 KB PNG
>>102276999
is that not what one of these things are?
>>
>>102277026
But you can detect phrases, then go back to the token position from where the phrase started, and sample a different token.
>>
>>102276999
Suppose you ban "red green blue". Model generates red - ok. Model generates green - ok. Model generates blue. Now what? Ban blue and write out red green something else? You can't do that, because some words span multiple tokens, and once you've emitted the first token there's no realistic option other than the second token. Go back to red and ban that? That can be done, though backtracking would require effort to implement. I don't think current libraries do backtracking in any form.
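A minimal sketch of what that backtracking would look like in the sampling loop. Whole words stand in for tokens here (real tokenizers make the partial-word case above messier), and the toy "model" is just a ranked candidate list per position:

```python
def generate_with_ban(sample, banned_phrases, max_tokens=50):
    """Greedy loop with phrase ban via backtracking: whenever the output
    would complete a banned phrase, rewind to where the phrase started and
    resample with its first token banned at that position."""
    out = []
    banned_at = {}   # position -> set of tokens banned there after a backtrack
    while len(out) < max_tokens:
        tok = sample(out, banned_at.get(len(out), set()))
        if tok is None:              # model has nothing left to say
            break
        out.append(tok)
        for phrase in banned_phrases:
            if out[-len(phrase):] == phrase:
                start = len(out) - len(phrase)
                banned_at.setdefault(start, set()).add(phrase[0])
                del out[start:]      # backtrack to the phrase start
                break
    return out

def make_sampler(preferences):
    """Toy deterministic 'model': preferences[i] is a ranked candidate
    list for position i; return the best candidate not banned there."""
    def sample(prefix, banned_next):
        pos = len(prefix)
        for tok in (preferences[pos] if pos < len(preferences) else []):
            if tok not in banned_next:
                return tok
        return None
    return sample

prefs = [["red", "crimson"], ["green"], ["blue", "teal"], ["sky"]]
banned = [["red", "green", "blue"]]
print(generate_with_ban(make_sampler(prefs), banned))
# → ['crimson', 'green', 'blue', 'sky']
```

Banning the phrase's first token at the start position is blunt (a real sampler might instead only forbid continuing the phrase there), but it matches the "go back to red and ban that" approach described above.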
>>
Question - if I buy a prebuilt mining farm on 4x3090's - would I be able to use it straight up for running LLMs without any modifications (aside from driverts/etc) or is there something I would need to replace/add?
>>
>>102277042
Those are to stop generation after those texts are observed, not prevent generation of those texts.
>>
>>102277073
Don't think you need anything else. I built my headless machine with two 3090s and I'm very happy with it.
>>
File: 1703263754757869.jpg (403 KB, 2304x1792)
403 KB
403 KB JPG
>>
>>102276871
You can't kill the slop, but you can reduce it by using DRY. My settings are:
>Temp=1.5 MinP=0.01, TFS=0.99, TFS after minP. DRY Multiplier=2 Base=2 Allowed Length=1 Penalty Range=maximum.
After the first occurrence you won't see the slop phrase ever again. You will see a lot of variations of that slop phrase though. After ~70 messages/10kt they finally go away.
>>
>>102276999
We've had CFG for more than a year now
>>
>>102277124
That ain't a phrase ban plus it slows down generation, doesn't it?
>>
>>102277124
QRD
>>
File: 1725736009791.jpg (356 KB, 1080x1069)
356 KB
356 KB JPG
>>
>>102277157
You need to go back.
>>
>>102277134
CFG is completely unrelated to phrase ban, ignore him.
>>
>>102277157
>Breaking news! lmsys confirmed to be a dead mememark, more at 11...
>>
>>102277100
Nice miku
>>
>>102277198
lmsys isn't a benchmark, fucking brainlet -80 IQ, rope yourself and stop trashing this thread
>>
>>102277133
>slows down generation
It uses more VRAM, but I doubt batch size 2 slows down generation that much.
>>
>>102277227
Compared to no slow down at all from proper phrase ban with backtracking? Yes, it slows the generation down.
>>
<thinking>
Reflection 70B actually is a pretty good model.

<reflection>
Wait, that isn't correct. It's complete trash.
</reflection>

Well, it's a completely trash model.
</thinking>

<output>
Who the fuck releases such a piece of shit?
</output>
>>
File: 1db.jpg (65 KB, 563x542)
65 KB
65 KB JPG
>>102277223
>uses reddit
>thinks he has the moral grounds to call someone else a subhuman
>>
>>102277240
jej
>>
>writing an AI tool using copilot
>need to test how it handles refusals
>write a prompt asking the model to write pedophilia and scat smut
>store that as a string in the code, which copilot has processed
Am I going to get v&'d?
>>
>>102277334
>copilot
>pedophilia
ur right fucked m8
>>
>>102276923
What the fuck
>>
Couldn't phrase ban be implemented on the frontend's side, actually very easily? When using Mikupad and you're in the process of having tokens streamed in, you can press one of the token probabilities from a freshly generated token, and it restarts generation from that token with basically no lag. So basically it really already just werks and someone who knows html could easily modify that code to do phrase banning.
>>
Now Matt will be known as a faggot who fabricated benchmarks to make an ad for his shitty data generator or whatever it is. It only took 1 day. What a brilliant mind.
>>
I'm seriously falling in love with my harem card with Mixtral LiMARP-ZLOSS. I've hardly done anything other than chat with it for the last three days. Please send help.
>>
https://xcancel.com/ArtificialAnlys/status/1832457791010959539
>Reflection Llama 3.1 70B independent eval results: We have been unable to replicate the eval results claimed in our independent testing and are seeing worse performance than Meta’s Llama 3.1 70B, not bette
Uh oh...
>>
>>102277349
I'm actually worried. I'm not in the US, and by reading the code you can clearly see that I was just making sure the tool I was writing would reject the prompt so I could test how to handle refusals. It's all pretty dumb.
>>
>>102277431
Thank you, r/LocalLLaMA.
>>
>>102277431
already posted
>>102276118
>>
Fuck me, trying to get into this AI shit as a 32yo boomer (in particular locally run text-to-speech voice stuff, which is supposed to be easy they say) when you have virtually no knowledge. Everywhere I go everyone's already using all kinds of technical jargon and assuming you know most things (I don't). Even just the basic accessibility of it all is a pain, downloads hidden away behind twenty menus and obscure jargon that you have to navigate through. Nothing works like traditional normie stuff does; even just starting something up requires command lines and other stuff my boomer brain can't comprehend. I respect autists much more now. Took me like a week to even get basic stuff working.
>>
>>102277460
buy a book
>>
>>102276364
Leddit going into conspiracy theory mode, we really reversed the roles here kek
>>
>>102276118
Did we get a response from that grifter?
>>
File: brownleftistswinning.png (88 KB, 1090x402)
88 KB
88 KB PNG
victory lap
>>
Say I want to use a couple of tuiter profiles as a dataset and use an llm to copy their styles and make them talk to each other like a groupchat. I think character ai could do this.
How do I do this locally? Do I need to finetune on each profile? Loras? Are there even loras in textgen?
>>
>>102276118
maybe the cope is still there, they haven't used the "fixed" version or some shit kek
https://xcancel.com/ArtificialAnlys/status/1832487709853585428#m
>According to the Glaive team, the model was incorrectly uploaded to Hugging Face. We plan to re-run our evaluations after the model is re-uploaded correctly.
>>
>>102276466
The post literally has negative 12 karma right now. Not saying Reddit doesn't have many dumbfucks, but it's not like all of them are like this. It's like saying /lmg/ shills for OpenAI when a single guy posts about how good OpenAI is while tons more are calling him out on it.
>>
>>102277526
mean to also reply to >>102277467
>>
>>102277460
It's not your fault. Modern programming languages and operating system design are both garbage, but degenerate Zoomers who don't know better think they're awesome.
>>
>>102277512
Matt saw how easily people believed the strawberry troll's lies, so he's figuring people will believe this feeble excuse too. From what I'm seeing it looks like they're correct
>>
>>102277512
All this effort to shill Glaive, you'd think they could come up with a better excuse than "they were so incompetent they can't even handle uploading a file correctly and it's taking them days to reupload"
>>
>>102276940
I think suno is soul. I made some songs with it and just listening to them after a while, they are so great.
>>
>>102277476
>J-just a small problem with its tokenizer, plox wait. Reflection 405b will mog gpt5
>>
>>102277552
*looks like he's correct
>>
>>102277582
>Reflection 405b will mog gpt5
LFGooooooo
>>
Glaive looks like a real game changer for both open and closed AI training. Finally everyone can have tailor-made datasets for their finetunes without much effort or high cost.
>>
>>102277431
Thanks for using a xcancel link, fuck elon
>>
>>102277616
where's the buy an ad schizo when there are actual shills here for once
>>
>>102277647
It's a sarcastic post, Sao.
>>
>>102277647
dunno about him but I can identify a troll when I see one
>>
>>102277646
go be a leftist on some other website please
>>
>>102277582
Kek
Remember
>we outpace GPT-5
Oh I found the tweet https://x.com/QuanquanGu/status/1730809526004408617
>>
File: 1719622063047929.webm (1.94 MB, 1280x720)
1.94 MB
1.94 MB WEBM
saars...
>>
>>102277616
Glaive is definitely a game-changer! It's amazing to see a tool that makes customized datasets so accessible for both open and closed AI training. The fact that you can create datasets for finetuning without a massive budget or technical expertise is a huge win. It’s going to open so many doors for innovation and experimentation. Can't wait to see how people leverage this!
>>
/lmg/ - Local Models brought to you by Glaive
>>
>>102277664
Hi Elon. It's not about being a leftist, I just don't have an X account and I won't make one. No matter what you do. No matter how shit the experience becomes. I would rather ignore the link than create an X account just to see the reply chain. Fuck you.
>>
>>102277616
How much does Glaive cost? I'm interested in making use of their services for my project.
>>
HE SAID IT

HE SAID THE LINE

LMAO!!!

EPIC

EPIC FOR THE WIN
>>
>>102276118
lost count of how many times /lmg/ fell for meme hype, worse than zoomers.
>>
>>102277828
did they? I haven't seen much buzz about reflection here, and a lot of people who tried it said it was mediocre, very few people seemed excited
>>
>>102277846
you're talking to a zoomie troon who for whatever reason can only post in places he feels a deep antipathy towards
>>
>>102277846
This general is the r/LocalLlama general at this point, so it makes sense to be confused.
>>
So it's over for LLMs huh? It's all just corpo shit from now on?
>>
>>102277933
hi petra
>>
>>102277460
Ollama and LM studio just work as long as you are not retarded.
Ollama has some very annoying features though so I suggest starting with lm studio for anyone dipping their toes into this shit.
>>
>>102277945
buy an ad
>>
>>102277945
go back
>>
You guys remember how all the praise for Celeste vanished as soon as some anons posted logs of it being fucking retarded?
>>
>>102277933
Yes. See llama3 vs llama2, new cohere models, even sonnet3.5 vs Opus. Slop is the natural evolution of LLMs
>>
>want to see if finetuning can somehow fix commander because I don't want to believe it is unsalvageable
>only finetune is by drummer
Why am I still doing this to myself?
>>
>>102278010
it is so fucking over
>>
Lmstudio is just a fancy proprietary fork of llamacpp. Redditors who suck cocks to the word 'open source' love it so much.
>>
I tested reflection myself online on the first day and got great results for my prompts
>>
>>102277900
>secondary pleb projection
many such cases
>>
>>102277668
What did he deliver except for that self play technique?
>>
Notice how xer didn't say I was wrong THOUGHBEITIMNOTVAXXED
>>
>>102278009
I remember one instance of that, I think it was about breasts and the height of the character? When I tried it myself, Magnum had the same problem, and I posted it in the thread. So he probably just cherry picked a gen to make Celeste look bad.
It was probably Sao because no one else seethes that much about that model. They're all models trained on the same datasets, so it doesn't make much sense that one has the "secret sauce".
But I do remember how all the praise for Sao's models vanished as soon as he started to get called out for samefagging and spamming the general to death. Stheno and Euryale were way too retarded and horny.
>>
>>102277460
I hope you're making use of ChatGPT.
>>
i'll show you some self play technique

*unzip vulva*
>>
>>102278010
>Slop is the natural evolution of LLMs
Bullshit, slop was always there, you just were not aware of it back then. Nostalgia-driven self gaslighting.
>>
>>102278285
Nah, for example, c.ai didn't have slop
>>
https://xcancel.com/mattshumer_/status/1832511611841736742
>It's 3.1, but for some reason the current HF weights are screwed up and the config shows 3... working on it, the issue is tricker than we expected
how much cope do they have up their sleeve?
>>
>>102272041
just picked up my second p40 from a chap locally for really cheap. when combining them, do i need to use a link or anything?
>>
>>102278502
I'm surprised they didn't try blaming bitrot, llamacpp, or quantization. They need to fire their PR guy.
>>
File: file.png (29 KB, 663x179)
29 KB
29 KB PNG
>>102278502
One of the guys calling people haters has a bitcoin pfp, you can't make this up.
>>
>>102278502
like another anon said upthread, I think the gullibility of the people who followed that strawberry retard has taught the grifters that a large cohort of retards on twitter and reddit will believe absolutely anything, so now they're acting accordingly
>>
>>102278605
kek
>>
>>102278208
anons keep saying sao seethes at other/better finetunes and models but I can't find any of that. Is it actually true or is it just another trend of anons seething about some random retard for no real reason?
>>
I tried out Rocinante with the very first CAI bot I used with the 1000+ message history imported.

Bot actually remembered what happened 100 messages ago and accurately told it.

I feel like a man watching his amnesiac (and brain damaged) wife start remembering
>>
>>102278937
Which version of Rocinante?
>>
File: test.png (72 KB, 664x687)
72 KB
72 KB PNG
How many t/s would a single socket one get?
>>
>>102278952
12b V1.1
>>
>>102278937
No way, old c.ai was much better than even Mistral 123b
>>
>>102279006
I didn't say that the outputs are equivalent to old CAI just happy that the bigass context sizes seem to be actually working

Right now I really like it but I think after a bit more usage the cracks will start showing, but up until then I've had some pretty nice ERP and RP with it.
>>
>>102278208
Hi drummer. All here.
>>
>>102278983
How much context?
>>
>>102279075
64K

I tried it with other NeMo models that claim 128k context, but none of them could pull things up from the beginning of the context like that.

That might also just be because I had to mess around with rope to get them working, so it could just be me being a retard
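For reference, the rope knobs in llama.cpp look roughly like this. The flag names are real, but the values below are purely illustrative, not recommendations; the right numbers depend on the model's config.json and trained context length:

```shell
# illustrative only -- check the model card / config.json before copying values
./llama-server -m model.gguf -c 65536 \
  --rope-scaling linear \
  --rope-freq-scale 0.5   # linear scale < 1 stretches the trained window
```

If the model already ships with correct rope metadata in the GGUF (like Rocinante apparently does), you shouldn't need any of these.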
>>
>>102279138
>rope
fuck I don't know how to do that either. I'll look into it thanks anon
>>
>>102279158
For Rocinante specifically I didn't have to mess with it; other 3.1 models wouldn't load unless I fucked around with it, though
>>
what kind of thing could I do with an LLM to put on a portfolio?
I don't want to be stuck doing web dev forever.
>>
>>102279239
>>102279239
>>102279239
>>
>>102278526
mates, any help? the miku build doesn't mention links, but I'd be curious how it loads across them then
>>
>>102277460
You're just a retard with reading disabilities t.33
>>
>>102279571
;_;
>>
>>102279603
NTA, but the most important thing isn't intelligence, it's the ability to accept change and adapt.
>>
>>102277460
Understand that this is an area of active research and development, and people are much more interested in getting things working than in making them simple, especially when things change so often that previous instructions break.

You just have to kind of deal with it the best you can until you learn the things to care about and the things to ignore.
>>
>>102277060
https://github.com/turboderp/exllamav2/blob/master/examples/inference_banned_strings.py
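NTA, but for anyone wondering what that example does conceptually: it blocks continuations that would complete a banned string. A toy stand-in for the idea (pure python, NOT the exllamav2 API; the banned list and candidate tokens are made up):

```python
BANNED = ["as an AI", "I cannot"]

def steers_into_banned(text: str) -> bool:
    """True if `text` already contains a banned string, or ends on a
    proper prefix of one (i.e. the next tokens could complete it)."""
    for b in BANNED:
        if b in text:
            return True
        if any(text.endswith(b[:i]) for i in range(1, len(b))):
            return True
    return False

def pick_token(prefix: str, candidates: list[str]) -> str:
    """Toy sampler: take the first candidate that doesn't steer the
    output into a banned string; fall back to the first candidate."""
    allowed = [t for t in candidates if not steers_into_banned(prefix + t)]
    return allowed[0] if allowed else candidates[0]
```

So `pick_token("Sure, ", ["I cannot", "here is"])` skips the refusal and returns `"here is"`. The real implementation works on token logits and rolls back partial matches, but that's the gist.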
>>
>>102277460
>locally run text-to-speech voice stuff, which is supposed to be easy they say
Audio-related projects are the most challenging and unreliable. While there are well-established, user-friendly projects such as Piper, almost every SOTA project struggles with conflicting dependencies, insufficient documentation, lack of examples, or compatibility problems between code and models.
>>
well, the script was silently failing because I just lazily put in a try/catch to forget about a problem, and now the whole day is wasted
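For posterity, the failure mode in miniature (`risky()` is a hypothetical stand-in for whatever the script was doing):

```python
import logging

logging.basicConfig(level=logging.ERROR)

def risky():
    # stand-in for the real work that was failing
    raise ValueError("bad input")

# the lazy version: the error vanishes and the script "succeeds"
try:
    risky()
except Exception:
    pass

# better: log the full traceback so the failure is visible
try:
    risky()
except Exception:
    logging.exception("risky() failed")
```

Even `except Exception: raise` beats a silent `pass` if you just want to keep moving.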
>>
>>102277060
>Now what?
Finish the context, replace the banned phrases with a signifier like "<UNKNOWN>" and start a hidden intermediary system prompt:
"The following piece of text contains the following signifier: <UNKNOWN>. Please replace this signifier with a correct word or phrase. Do not use any of the following terms: <BANNED_TERMS>. Only reply with the repaired text to this prompt. <CONTEXT>"
Then replace the context with what was just generated.
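That scheme, sketched out (the banned list is a placeholder and the actual model call is omitted; only the masking and prompt assembly are shown):

```python
import re

BANNED_TERMS = ["shivers", "ministrations"]  # example list, not from the post
SIGNIFIER = "<UNKNOWN>"

def mask_banned(context: str) -> str:
    """Replace every banned phrase in the context with the signifier."""
    for term in BANNED_TERMS:
        context = re.sub(re.escape(term), SIGNIFIER, context, flags=re.IGNORECASE)
    return context

def build_repair_prompt(masked: str) -> str:
    """The hidden intermediary system prompt asking the model to fill gaps."""
    return (
        f"The following piece of text contains the following signifier: {SIGNIFIER}. "
        "Please replace this signifier with a correct word or phrase. "
        f"Do not use any of the following terms: {', '.join(BANNED_TERMS)}. "
        f"Only reply with the repaired text to this prompt. {masked}"
    )

masked = mask_banned("A chill ran down her spine, shivers and all.")
prompt = build_repair_prompt(masked)
# `prompt` would then be sent to the model; its reply replaces the context.
```

The obvious cost is a second generation pass per reply, and nothing stops the model from picking a synonym just as bad as the banned term.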


