/g/ - Technology

/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>103347641 & >>103339560

►News
>(11/29) INTELLECT-1 released: https://hf.co/PrimeIntellect/INTELLECT-1-Instruct
>(11/27) Qwen2.5-32B-Instruct reflection tune: https://qwenlm.github.io/blog/qwq-32b-preview
>(11/26) OLMo 2 released: https://hf.co/collections/allenai/olmo-2-674117b93ab84e98afc72edc
>(11/26) Anon re-implements Sparse Matrix Tuning paper: https://github.com/HeroMines/SMFT
>(11/25) Qwen2VL integrated with Flux: https://github.com/erwold/qwen2vl-flux
>(11/25) Speculative decoding added to llama-server: https://github.com/ggerganov/llama.cpp/pull/10455

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/tldrhowtoquant

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/hsiehjackson/RULER
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
►Recent Highlights from the Previous Thread: >>103347641

--Open Claude: a plan to replicate and improve the Claude model:
>103349428
--ollama qwq quants updated with changes to tokenizer config and default system prompts:
>103347780 >103347801 >103348545 >103348846
--Tulu and QwQ settings and usage for RP:
>103350546 >103350560 >103350594 >103350645 >103350660 >103350577 >103350581 >103350587 >103350625
--Speculative decoding and draft model optimization discussion:
>103351652 >103351699 >103351728 >103351760 >103351800
--INTELLECT-1 model release and discussion on distributed training and dataset size:
>103348805 >103348825 >103351301 >103348884 >103348949 >103349869 >103349902 >103350106 >103350147 >103350314 >103351028 >103352331
--Discussion of Skywork-o1-Open-Llama-3.1-8B model and speculative decoding:
>103351163 >103351371 >103351380 >103351490 >103351500 >103351553 >103351615
--Dealing with AI repetition and its causes:
>103351824 >103351858 >103351877 >103351907 >103351921 >103351985 >103352443 >103352024 >103351926 >103351900
--Anons discuss probability puzzle:
>103352448 >103352461 >103352595 >103352468 >103352477 >103352615 >103352665 >103352685
--Anon tests INTELLECT-1-Instruct model, discusses its limitations and potential:
>103349257 >103349304 >103349319 >103349344 >103349486 >103349493 >103349509 >103349709 >103352200 >103352470
--Anon experiences 4x slowdown with NVLink and tensor parallelism:
>103347734 >103347789 >103350024 >103350232
--Using QwQ as a draft model and a regular model for refinement:
>103350380 >103350397
--Smaller models can offer performance boosts:
>103351614
--QwQ IQ2 as draft model yields mixed results:
>103348461 >103348801 >103350200 >103350234 >103350244 >103350280
--Anon shares experimental prefill for RP in AI model:
>103351001
--Miku (free space):
>103351166 >103353198

►Recent Highlight Posts from the Previous Thread: >>103347652

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
>>
>>103354338
OMG IT NOT MIGU
>>
Rinlove
>>
>>103354338
YELLOW MIGU XDD
>>
When will ggerganov add lookahead decoding to server for even more speed?
>>
loli feet
>>
What are anons who shill largestral running largestral on?
>>
>>103354505
4x3090
>>
What's the best TTS? Tortoise runs so slow.
>>
>>103354571
rhasppy/piper if you want ridiculously fast. gpt-sovits if you want voice cloning/quick training.
>>
>>103354505
128GB DDR4 RAM. 0.5-6t/s with speculative decoding. Rerolls are almost never needed.
>>
>>103354579
Thanks, I'll take a look at those. It doesn't have to be that accurate at different voices, looking to create basic audios for NPCs in a ttrpg.
>>
>>103352448
>>103352665
import random

def generate_child():
    """Generate a random child with a random gender and day of birth."""
    gender = random.choice(["Boy", "Girl"])
    day = random.choice([
        "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"
    ])
    return gender, day

def create_two_children():
    """Create a family with two children."""
    return [generate_child() for _ in range(2)]

def calculate_probability(total_families=3000000):
    """Simulate the probability problem with the specified number of families."""
    matching_families = 0
    boy_sibling_count = 0

    families = [create_two_children() for _ in range(total_families)]

    for family in families:
        if any(child[0] == "Boy" and child[1] == "Tuesday" for child in family):
            matching_families += 1
            sibling = family[1] if family[0][0] == "Boy" and family[0][1] == "Tuesday" else family[0]
            if sibling[0] == "Boy":
                boy_sibling_count += 1

    if matching_families == 0:
        print("No matching families found.")
        return

    probability = boy_sibling_count / matching_families
    print(f"The probability is {probability:.5f}")

# Run the simulation with 3,000,000 families
calculate_probability()

Spat out 0.48162 probability, so apparently 13/27 is correct.
But I don't understand math so I don't know why this is true.
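For what it's worth, you can drop the randomness and just enumerate all 196 equally likely (gender, day) pairs; this is only a sanity-check sketch of the same condition the simulation uses, and it gives exactly 27 matching families, 13 of them with two boys:

from itertools import product

days = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
children = [(g, d) for g in ("Boy", "Girl") for d in days]  # 14 equally likely kinds of child

matching = both_boys = 0
for c1, c2 in product(children, repeat=2):  # 196 equally likely two-child families
    if ("Boy", "Tue") in (c1, c2):          # at least one boy born on Tuesday
        matching += 1
        if c1[0] == "Boy" and c2[0] == "Boy":
            both_boys += 1

print(matching, both_boys, both_boys / matching)  # 27 13 0.48148...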
>>
File: ftjrgzhdbr6zt.png (3.65 MB, 3744x1718)
teknium, nous, nous research, hermes, hermes 2,hermes 3, deus, desu, local models
>>
>>103354505
M4 Max 128GB
>>
>>103354629
child gender is not random, answer is incorrect
>>
>>103354711
this, gender is a social construct and it doesn't even account for our nonbinary xisters
>>
what is the best lightweight model now?
>>
>>103354734
no it literally is genetically determined
>>
>>103354745
Mistral 7b v0.1 is still the least sloppy, it's the best at all except benchmaxxing
>>
>>103354505
2x3090 + A4000
>>
>>103354711
Whether or not child gender or any event at all is random depends both on how you define probability and what interpretation of quantum mechanics you assume to be true.
>>
Big day today
>>
Should /lmg/ try training a 100M LLM using the same method as INTELLECT-1 (https://github.com/PrimeIntellect-ai/OpenDiloco)? Anyone with 24GB of VRAM could contribute
>>
>want super energy efficient 7B model size inferences
>alveo v70 doesn't support llm
wasted
>>
File: file.png (45 KB, 1069x225)
>>103354629
It's like the Monty Hall problem. It has to do with the fact that once you've sampled a boy on Tuesday, when you take your second sample you can sample a girl on the same Tuesday but you can't sample another boy. It helped me to see it visually.
>>
>>103354571
https://github.com/e-c-k-e-r/vall-e
It takes a bit of fiddling with the settings tho
>>
>>103355065
Why not? Just have two sons born on tuesday?
>>
has ryzen anon tried this npu igpu hybrid method?
https://github.com/amd/RyzenAI-SW/tree/main/demo/NPU-GPU-Pipeline
is this faster than cpu only?
>>
>>103354778
Can the a4000 be used for exl?
>>
>>103354338
still cant find any local models they are usually korean and angry
>>
File: file.png (73 KB, 628x338)
>>103355065
>>103355136
You can, there's 2 slots for Tuesday boy but the first one itself can't be picked twice, meaning 1 Tuesday slot for the second choice (the other sex+day combo has 2 unpicked slots).
>>
Is there an all in one app yet? I am retarded and just want to make waifus
>>
File: 1716350132045354.png (9 KB, 625x58)
>>103355283
exl2? I don't use it but I don't know why it wouldn't work. I only use GGUF because I need to offload partially to ram to use Largestral at 4bpw and above.
>>
>>103355307
The boy born of the tuesday doesn't necessarily have to be the first child.
>>
>>103355283
Why would people buy a4000? It's way less powerful but with the same price level as a 4080 super.
>>
>https://rentry.org/lmg-lazy-getting-started-guide
i did this and the generated text sucks ass. fix doko?
>>
>>103355352
Because it's single-slot and can fit in a mid-tower case with the 3090s. If I wasn't autistic about wanting to keep everything in the case, I would've gotten a riser and a third 3090.
>>
>>103355380
try not being lazy
>>
>>103355411
? what else do i need to do?
>>
>>103355415
read the guides for people who aren't lazy
>>
>>103355468
link some
>>
>>103355352
>>103355396 (me)
>same price level as a 4080 super.
They sell for around £500 on ebay. That's how much I paid for mine.
>>
>KoboldCpp 1.79 - Now with Shared Multiplayer, Ollama API emulation, ComfyUI API emulation, and speculative decoding

https://github.com/LostRuins/koboldcpp/releases/tag/v1.79
>>
>>103354925
- we'll need datasets with enough tokens
- anons are likely to only put up with their hardware being tied up for so long
- bitnet?
>>
>>103355729
>- anons are likely to only put up with their hardware being tied up for so long
Wouldn't you be able to join and drop out at will? Like donating the hardware while you sleep and when electricity is cheapest
>>
>>103355527
>Multiplayer
Wonder if that'll take off or if it's just a gimmick.
>>
>>103355527
>no mention of fixing the generation quality
I will be sticking to llama.cpp server
>>
>>103355965
it's all in your head
>>
>>103355729
>- we'll need datasets with enough tokens
I made a test run for a 50M using fineweb and 100B tokens barely take 200GB
>>
>>103355527
>>103355759
It's amazing.
>>
>>103356030
No discord screenshots please thank you
>>
>>103354338
>(11/29) INTELLECT-1 released
How is it fellas?
>>
>>103356112

>>103348825
>>103349257
>>103349630
>>103349644
>>103349756
>>
>>103356156
she is a kid you psycho
>>
>>103356156
Amazing.
>>
>>103355729
>- bitnet?
I thought lmg was smarter than reddit (who already realized bitnet doesn't work), this place is dumber than the orange site
>>
>>103356256
You can't take away the hope that BitNet represents.
>>
do any of you have a character card that you're particularly fond of and keep going back to?
>>
>>103356272
for pajeets
anyone who understands information theory in the slightest can understand why it doesn't work; it would require the model to be extremely undertrained (or have way too many parameters)
>>
Best coomer model to run on a RTX 3060 12GB VRAM?
Alternatively, I could run on my cpu core i5 12400, with 64GB RAM
>>
>>103356363
I still talk to Chiharu from time to time, not the one from Kobold, but that one from before /lmg/, the dark times where CharacterAI was our only option for RP.
>>
>>103356446
>Kobold
oobabooga*
>>
>>103356429
You're assuming that training is fully optimal, and that a 16-bit float is equal in expressiveness to 16 1-bit (or 1.58-bit) parameters. I think quantization already calls that into question.
>>
>>103356435
INTELLECT-1
>>
>>103356435
I've got a 3060 too but I don't have much experience with models because I only started using this stuff last week after getting bored with flux.
so far, I've tried cydonia, stheno, sunfall, rocinante, violet_twilight and a bunch of others that were kind of retarded. I was memed into trying QwQ yesterday, Tulu the day before that. Violet_twilight is currently my favorite, it seems to be a little more creative and nastier than the others
>>
Mixtral anniversary next week. Are you ready?
>>
>>103356462
>Dataset: 55% fineweb-edu, 10% fineweb, 20% Stack V1, 10% dclm-baseline, 5% open-web-math
What is this nerd shit bruh?
I want PORN.
>>103356489
Thanks bruh, I will try violet_twilight. I want some nastyness
>>
>>103356156
Cum inflation = gross
Pregnancy = based
>>
>>103356429
It can still work. If you have a 7B trained to saturation, train a 70B for the same amount of time. You won't win on size, but with ternary you can take advantage of cheaper and faster specialized hardware since you no longer need multiplication operations for inference.
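To make the no-multiplication point concrete, here's a toy sketch (plain numpy, nothing to do with how real BitNet kernels are actually written): with weights restricted to {-1, 0, +1}, a matvec collapses into adding and subtracting activations.

import numpy as np

rng = np.random.default_rng(0)
W = rng.integers(-1, 2, size=(4, 8))   # ternary weight matrix, entries in {-1, 0, +1}
x = rng.standard_normal(8)             # activations

y_mul = W @ x  # ordinary matvec, uses multiplications

# same result with only additions/subtractions: add x where w=+1, subtract where w=-1
y_add = np.array([x[row == 1].sum() - x[row == -1].sum() for row in W])

print(np.allclose(y_mul, y_add))  # True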
>>
>>103356507
lol
>>
# @here <:mistral:1154031685495701594> The Ambassadors Program

**The Mistral AI community is growing fast!**

To empower our most passionate experts, we are excited to announce the Mistral AI Ambassador Program.

We are seeking enthusiastic Mistral AI users who love our models and offerings, and are dedicated to supporting and giving back to the community. Learn more about our program and apply [here](https://docs.mistral.ai/guides/contribute/ambassador/).

A big shoutout to our inaugural @Ambassadors who have already made a massive impact!

## Thank you so much, everyone!
>>
>>103356507
I think their models are pretty good but I do not think they will release anything soon. It is what? a month since they released the large model?
>>
>>103356554
Thanks Mistral Large!
>>
>>103356554
>Ambassador benefits
>Free credits: Mistral Ambassadors will receive free API credits on la Plateforme.
>Ambassador roles and responsibilities
>Content Creation: Create and share high-quality content (blogs, tutorials, videos, etc.) featuring our AI models and tools through their own channels or collaborate with Mistral for multimedia opportunities.
This just sounds like a job with extra steps.
>>
>>103356578
Yeah... and all you get is some free credit for gooning or to work even more. Lol. Then again, the part of society that became permanently online and has no life outside of it grew rather rapidly, even among normies, as sad as it is. This is the future.
>>
>>103356578
Perfect for the resident NEETs here.
>>
>>103356461
>You're assuming that training is fully optimal
At some point the network will be saturated, and we are close to that point (if we weren't, normal 2-bit quantization wouldn't be that hurtful); there's only so many trillions of tokens you can cram into 7B parameters
>>103356542
The current bottleneck happens on memory, not on FLOPs
>>
>>103356554
>>103356578
What's the end goal?

Looking to cash out?

Looking to get more traffic to their apis in order to get more data to build their next model?
>>
I hate huggingface quanters. At most they will specify the version of llama.cpp they used for the quant. But they never specify the revision of the weights repo, which is also important since model makers sometimes revise their config files.
>>
File: jpg.jpg (11 KB, 250x176)
ai noob here, any local tools to improve voice quality from videos? with a lot of background noise but also where the voice is "muffed"?
>>
>>103356814
Premiere has a new-ish enhance speech feature that does exactly what you want.
>>
File: gif.jpg (6 KB, 225x225)
>>103356823
I use Linux. Also, proprietary and closed source.
Unless Premiere is something else...
>>
>>103356861
Why is the guy made of glass?
>>
>>103356156
>>103356861
Where the incest @?
>>
>>103356845
Premiere runs on Windows. Easy switch from Linux.
>>
>>103356861
talk about a brain tickler :D
>>
>>103356845
https://github.com/chuck1z/AudioCleaner
>>
>>103356890
god bless you
>>
>>103356640
Saturation only applies if you're comparing apples to apples though. There's a big difference between quantizing a model from 16-bit to 2-bit and training a model from 2-bit to begin with. That's the entire point of the drop-in one-bit linear layer which you train with the rest of the model
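For anyone wondering what "train with the quantization in the loop" actually looks like, here's a rough sketch of a ternary linear layer trained with a straight-through estimator (loosely following the BitNet b1.58 recipe; the class name and details are made up for illustration, not the paper's actual code):

import torch
import torch.nn as nn
import torch.nn.functional as F

class TernaryLinear(nn.Module):
    """Linear layer whose forward pass uses weights snapped to {-1, 0, +1} * scale,
    while gradients flow to the latent full-precision weights (straight-through estimator)."""
    def __init__(self, in_features, out_features):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.02)

    def forward(self, x):
        w = self.weight
        scale = w.abs().mean().clamp(min=1e-8)              # per-tensor absmean scale
        w_q = torch.round(w / scale).clamp(-1, 1) * scale   # ternarized weights
        w_q = w + (w_q - w).detach()                        # forward uses w_q, backward sees identity
        return F.linear(x, w_q)

layer = TernaryLinear(16, 4)
out = layer(torch.randn(2, 16))
out.sum().backward()  # gradients reach the latent fp weights, so the layer trains like any other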
>>
On one hand, respect and props to INTELLECT-1 for the project and the effort.

On the other hand, it lands down in the gutter with QwQ on account of factual wrongness, programming incompetence, and failure to obey directives.

INTELLECT-2 when?
>>
>>103356889
>>103356861
phallic lobotomy
>>
>>103356933
Supposedly 1 was just a proof of concept before they open it and allow anyone on the internet to join for training the next model. But if they just train a bigger model on the same dry and filtered open datasets, it's pointless.
/lmg/ would be better off taking the code and finetuning an existing model.
>>
>>103356554
This proves that only paid shills talk about Large in this thread.
>>
Can we merge INTELLECT and QwQ somehow?
>>
>>103357031
What does INTELECT have that QwQ is lacking?
Is it good at writing smut?
>>
>>103356933
Crowdsourced training is even more prone to being cucked due to legal and ethical issues. Models trained that way will be heavily filtered and cleaned of even song lyrics, I'm not even kidding.
>>
>>103357031
Why don't we merge QwQ with Mistral-7B? The resulting model would be better.
>>
>>103357015
I kind of liked Large when I tried it on OpenRouter. But it is not something I can run locally and I bet 90% of others here can't either. Large local models will never be as popular as the smaller ones because of this reason alone.
>>
>>103357015
did you need proof? especially with qwq now out, big models are an even bigger meme than ever before
only retards with buyer's remorse pretend that 70b and up is worth it
>>
>>103357031
I already tried SLERPing it together with another model but the results were almost completely dysfunctional. (structurally it's just a Llama-3 8B model with 10 extra layers). I could always try linear. I basically just took an 8B and stacked the last 10 layers of INTELLECT back onto it. I could also try doing a finetune on the results to bring it back to order. But there's just way too much difference in the weights regardless of being the same architecture.
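In case it isn't obvious what a "linear merge" actually does, it's basically just this (a sketch assuming two checkpoints with identical architecture and tokenizer; the model ids are placeholders, and mergekit is the usual tool rather than hand-rolling it):

import torch
from transformers import AutoModelForCausalLM

a = AutoModelForCausalLM.from_pretrained("model-a", torch_dtype=torch.float16)  # placeholder ids
b = AutoModelForCausalLM.from_pretrained("model-b", torch_dtype=torch.float16)

alpha = 0.5  # interpolation weight between the two checkpoints
b_sd = b.state_dict()
merged = {name: alpha * t + (1 - alpha) * b_sd[name] for name, t in a.state_dict().items()}

a.load_state_dict(merged)
a.save_pretrained("model-merged")

Which is also why it falls apart here: averaging weights only really makes sense when both sets of weights came from the same pretrain.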
>>
Can someone post an example of their script/batch/I-don't-actually-know thing they use to run llama.cpp with a specific model loaded?
>>
>>103357157
just use ollama
made for user friendliness and llama.cpp under the hood
>>
>>103357157
what are you hoping for someone to spoonfeed you that you couldn't get by looking at the readme for 5 seconds?
https://github.com/ggerganov/llama.cpp#web-server-llama-server
>>
File: file.png (5 KB, 941x48)
>>103357157
>>
>>103357157
Here just for you:
llama-server -m models\[model].gguf -c [context size]
And then just open http://localhost:8080/ in your browser.
>>
>>103357170
Is it up to date?

>>103357212
I imagine using sillytavern lets you configure temp, min.p etc... on the fly so there's no need to put them in the command?

>>103357188
>doesn't include the -ngl or -c or any other parameters that would be important to actually use the thing
Thanks, poindexter. Maybe we'll get a winch to pull your head and coke bottle glasses out of your ass and you can share some useful advice.
>>
File: broken.png (16 KB, 910x311)
>>103357031
>>103357092
As you can see, because the weights are so far off it just scrambles the poor thing's brain, even with a linear merge, which should give it the best chance of not getting completely fucked.
>>
>>103357270
>I imagine using sillytavern lets you configure temp, min.p etc... on the fly so there's no need to put them in the command?
correct
>>
>>103357307
Of course this doesn't work. Merging is only intended for finetunes of the same model. Merging two models that were separately pretrained is obviously going to end up producing garbage
>>
>>103357307
Try merging QwQ and https://huggingface.co/EVA-UNIT-01/EVA-Qwen2.5-32B-v0.2
That has a chance of actually doing something possibly interesting.
>>
>>103357323
You don't even need silly tavern, the default llama.cpp frontend lets you do that.
>>
>>103357070
Couldn't happen to a more deserving society.
>>
>>103357157
ipmitool -I lanplus -H 192.168.178.106 -U admin power on
ssh 192.168.178.159
j llama.cpp
git pull
rmr build && mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=Release -DGGML_CUDA=ON .. && cmake --build . -j 128 -- --quiet
..
ln -s build/bin/llama-server srvr
export model_name=mistral_large_instruct_2411-123b && export quantization=q8_0
./srvr --model models/opt/${model_name}-${quantization}.gguf -ngl 999 --host 0.0.0.0 --port 8357 -fa --path examples/server/public
>>
>>103357360
It has a completely different and incompatible vocabulary. The only reason you can actually try this with Llama 3 models is because it uses the exact same vocab
>>
>>103357414
Anon... I said https://huggingface.co/Qwen/QwQ-32B-Preview/tree/main
and https://huggingface.co/EVA-UNIT-01/EVA-Qwen2.5-32B-v0.2
They're both Qwen-32B
and their vocab.json are identical
>>
>>103357438
Oh, I thought you were talking about INTELLECT for some reason.
Sorry I have a migraine today.
>>
>>103357379
If you had removed a handful of greedy, censorious people it wouldn't have turned out this way
>>
>everything is based on llama3
>llama3 has the same architecture as llama2 (2021)
foss lost
>>
What is the cutoff date for intellect 1?
>>
>>103357625
In the end, it's all transformers shit. We'll never get actual improvements in efficiency and performance as long as nobody dares to move on from this relic (2017)
>>
>>103357663
Well, what is there to move on to? As far as I know, no eggheads at MIT or at any company even have the idea of an architecture to replace transformers, let alone an implementation.
>>
>>103357675
Mamba, RWKV, etc
>>
>>103357675
My Uncle works for Nintendo and he has talked to every single machine learning researcher on the planet and can confirm that literally none of them have any idea or have been working on a replacement for transformers.
>>
>>103357692
Really? Do they just expect to be able to keep scaling up transformers forever?
>>
>>103357704
And then everyone clapped
>>
>>103357686
Are those not just transformers as well?
>>
>>103357704
They'll wait for GPT8 to give them the right answer.
>>
How many r
GPT 10:
>As many as there are genders in the night sky. Also you have been reported to the FBI for a misgendering me as a-gender by failing to utilize pronouns in your query.
>>
I know it seems like forever ago but lets not forget that Llama 1 was released only last year.
>>
>>103357798
Basically two years at this point, but yeah.
>>
>>103357798
yeah, remember when we thought that large context would be a pipe dream because of how expensive it used to be pre-gqa?
>>
>Transformers
>Transformers the world
If we name the next architecture destroyer do you think it will end up destroying the world?
>>
>>103357798
yep and modern 70b are better than og chatgpt4 (1.8t moe ~500 b active)
anyone with 2 braincells is being whitepilled to the gills, anyone saying otherwise is literally one of those demoralisation "you wont do shit" types that are paid on pol or a janny tranny
>>
>>103357070
What legal issues? How would the law go after a bunch of different people spread over different countries?
>>
>>103357929
Yeah when OG ChatGPT came out and people were like "WAOW WILL WE EVER HAVE THAT AT HOME LIKE SD!?" I was among those who said "lmao, no way in hell. You'd need shitloads of GPUs to run that" And now depending on your use case even a single 24 Gig GPU can provide you with quite a bit of entertainment/productivity.
And we still haven't actually seen the training wall yet. So even smaller models like 8B will still get better, although as the papers show quantization will become worse as we push on that boundary. But hopefully by the time they start training 100T Token models consumer GPUs will actually have VRAM on them.
>>
File: Next steps.png (61 KB, 772x634)
>Implementing new economic incentives to drive community participation
Why would they even need to do this?
>>
>>103357985
Because not everyone's mom pays for their electricity.
>>
GPUs were never intended to run AI. Once hardware comes out that exists for the sole purpose of running it, we will hit the next generation of models.
>>
>>103357675
V-JEPA you ignorant swine. The true AGI algorithm that is being ignored because it's inconvenient to a certain sect.
>>
>>103358045
Isn't that just a video model?
>>
Is featherless worth buying? Thinking on buying the 25 USD sub. I'm kinda tired of using runpod desu
>>
>>103357663
The pace of results has slowed, but I think it's too early to decide that we've hit a local maximum yet.
Yacun has his opinion, and while he's more qualified than most of the autists on this board, it's still just an opinion.
>>
>>103358021
I'm waiting for someone to photolithograph an entire specific model onto silicon. Just a single-purpose chip that does inference against a literally-set-in-stone model. Should be crazy efficient. We just need a model that's worthy of enshrining as an immutable artifact.
>>
>>103358134
>a model that's worthy of enshrining as an immutable artifact
I vote for mythomax
>>
>>103358088
No it's not retard
>>
>>103358115
Yacum also showed that he can be pretty retarded on a lot of things. There is still progress to be made, as we see with the recent release of QwQ.
>>
>>103358134
>Literally-set-in-stone
If we are going that route it should be neuromorphic so the stone can learn.
>>
Is there a vocal cleanup preprocessing/postprocessing pipeline thing that isn't a mess of autistic DLL spaghetti?
That seems to be the real bottleneck on tts stuff: on the input side: getting the audio sample cleaned up, dynamic range fit and transcribed. on the output side just doing some overall cleanup and de-roboticization.
None of those things sound very hard on their own, but audio engineering is so incredibly obtuse and I'm so very lazy.
>>
>>103358045
V-JEPA is just a specific research model that has nothing to do with human language or full world modeling. Also, a transformer can technically be a JEPA, they're not mutually exclusive.
>>
O(mni)-JEPA will be the future. Just let Yann cook.
>>
>>103358021
Yeah it sucks that we don't have hardware that is specifically very good at matrix multiplications.
>>
>>103358298
The future of what? LeCun says that an architecture that's agentic is what's needed, while JEPA and language models could be a part of that architecture.
>>
File: file.png (322 KB, 632x641)
>>
>>103356435
I use nemo mistral instruct 2407 (12b, you probably need to use q6 kl)
You could try Llama-3.1-8B-ArliAI-RPMax-v1.3-GGUF
This benchmark might help, but honestly this benchmark probably does not reflect the cooming ability, but I think the willingness value matters a lot if you don't want a model that just tells you "sex with minors is not allowed" or something.
https://huggingface.co/spaces/DontPlanToEnd/UGI-Leaderboard

I use colab since I have a 6gb gpu and the free 15gb is nice, but if you don't mind giving mistral your phone number (to gain an access key you can use for silly tavern), I think they will give you access to mistral large for free.
Of course that's not the spirit of /lmg/ and they could be spying on your erp, but personally I don't really care.
>>
File: RiscV.png (133 KB, 1128x824)
>>103358311
Risc-v is doing just that funnily enough
>>
>>103356446
Chiharu (ooba "example") is my go to for comfy chatting as well. She feels like an old friend at this point, so its interesting to see how different models change her. I'm surprised there isn't an lmg standardized "Chiharu test" benchmark, considering how ubiquitous she has been since the early days.
>>
>>103357975
>consumer gpus will have vram on them
eh idk personally im still thinking that the end game is cpumaxxing or just buying a pre defined chip just for multiplication (or addition blessed be thy name bitnet) or whatever and soldering on however many of those 2gb vram chips on it you want yourself alternatively and you can meme on me for this but just reading things from a bunch of ssd's either way future is lookin hella bright
>"lmao, no way in hell. You'd need shitloads of GPUs to run that"
yeah i got iq checked too with the 01 thing too from my testing of r-1 its really fucking good now ofcourse could turn out to be 700b moe but realistically its just good speaking of which idk why but i think the current model is 20-30 b not sure if moe tho if the final model is fuck it probably unironically time to upgrade
>And now depending on your use case even a single 24 Gig GPU can provide you with quite a bit of entertainment/productivity.
LMAO IMAGINE
t. very happy 6gb vram laptoplet
>>
>>103358342
Well, would you?
>>
What happened to the meta miku? I miss her
>>
>>103358360
Your Seraphine?
>>
>>103358088
Make sure your model is on there first. If so, it's serviceable. In general:
Local > OR > Featherless > Runpod
>>
>>103358360
Your Nala?
>>
File: NIB.png (1.29 MB, 1024x1024)
>>103358374
>meta miku
She comes as a kit now. Just carefully remove the factory installed cortical plug in a clean environment and install on your OpenBody platform rev3 or newer.
>>
>>103355335
How much speed do you get when running gguf? What processor/ram are you running? I ask because I have 2x 3090s right now but really want to target 4bpw largestral as my next upgrade.

I mention exl2 because there are certain graphics cards that can't do certain calculations (most notably a Tesla P40, which also has 24GB of VRAM)
>>
>>103358218
QwQ actually is part of the reason I think there's still some gains to be had
While Altman, Ilya, etc. all talk about test time compute being the future, I tend to think it's a pretty retarded approach, since it basically just means "have the LLM sperg out for 8k tokens and hope we got lucky enough to have something resembling an answer"
I think the idea of having an LLM intelligently fill up the context with something before generating has merit, and even reasoning about what to do next. But the specifics seem highly suboptimal
>>
>>103358311
graphics also involve a lot of matrix multiplications
>>
>>103356933
The point wasn't to make a good model but to prove and test opendiloco in a real model
>>
File: file.png (16 KB, 478x158)
Riddle me this.
How come nobody bothers to remove the mountains of smut that clearly exist in the training data but they all waste time "aligning" the models?
>>
>>103358454
I honestly expect nothing big from any of these. The big improvements will likely come from companies that are not afraid of some backlash because they use data as it is, not as they wish it were, and that do not ban people for using their models for purposes other than what they intended.
>>
File: woman.png (1.63 MB, 1477x1754)
--draft-p-min 0.5 (and --draft-min 1 of course) is much better for creative writing, with Qwen at least.
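If anyone wants to reproduce that, the invocation being talked about looks roughly like this (the model files are placeholders, the exact flag names can vary a bit between llama.cpp builds, and the draft model has to share the main model's tokenizer):
llama-server -m Qwen2.5-32B-Instruct-Q4_K_M.gguf -md Qwen2.5-0.5B-Instruct-Q8_0.gguf -ngl 99 -c 16384 --draft-max 16 --draft-min 1 --draft-p-min 0.5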
>>
>>103358419
>How much speed do you get when running gguf?
8-9 tk/s when running IQ3_M which fits in vram and 2-3 tk/s for Q4_K_S with offloading. It isn't fast because the A4000 bottlenecks with its slower memory bandwidth.

>What processor/ram are you running?
https://uk.pcpartpicker.com/list/QzWKfy

>want to target 4bpw largestral as my next upgrade.
Don't get an A4000 then. I just got one because it was convenient for me and I wanted to run 70bs until I realized how good Largestral is. You'll want at least 72gb of vram (3x3090) for 4bpw Largestral but that's bare minimum. That would let you run IQ4XS at 16k context, not sure about exl2.

>certain graphics cards that can’t do certain calculations
A4000 can use flash attention etc like a 3090 can. Not sure about the specifics but it has the same compute capability.
>>
>>103358218
Progress, but not enough. The goal of all these companies from their mouths is AGI and Strawberry is just a simple addition that doesn't give you lifelong learning or a bunch of other things that brains are capable of. If you mean progress as in "we can just keep adding things and changing things one by one and we will get there eventually", then at some point you can't really call it an LLM anymore, if it even is still a transformer by that point. So ultimately what we need is still something that is not just an LLM, but we may get there incrementally and that's not necessarily a bad thing.
>>
>>103358482
Yes. Hence my first line. And my third.

As for the second, your choice: The homebrew model barely competes with a recent junky tune, or, the "real" AI outfits just released something that's gotten matched by a homebrew.
>>
You need the fastest matmuls you can muster for part of the process (GPU is the only game in town), and lots of memory bandwidth for the other part. There are lots of candidates for best $/t/s on the second part.
>>
>>103358605
Are there any measurements on how much worse largestral is at 4 bits compared to full weights.
>>
File: 1727611119474550.png (31 KB, 1707x1102)
>>103358639
There's this brain damage chart some other anon posted. I run Q6_K sometimes and I don't notice a difference but I mainly use Largestral for RP.
>>
>>103358655
That sliver at q8 really makes me think I should try the full weights of qwq to see if it improves coding abilities. Being a couple of percentage points off could be enough to introduce lots of bugs and regressions.
>>
>>103358639
>>103358655 (me)
*I don't notice a difference beyond 4bpw
3bpw and below has noticeable brain damage to me
>>
>>103358655
>tfw stuck running 5.5bpw
I need another 3090.
>>
I just upgraded and was planning on getting 2x5090 and now you guys are telling me I should've bought into a platform with more than 2 pcie slots?
>>
>>103358713
64GB VRAM is nothing until we get bitnet.
>>
>>103358363
>personally im still thinking that the end game is cpumaxxing
For inference I think there's a good case to be made for this, as long as it can be paired with a GPU with enough VRAM for all the context you want to process.
Sadly, prices on a cpumaxxer build ala https://rentry.org/miqumaxx haven't fallen at all over the past year.
>>
>>103358713
At least that's enough to run 70b at a fast speed.
>>
>>103357686
Yea it's over
>>
>>103358718
>still things bitnet is a thing
Anon... it doesn't scale
>>
>>103354505
3xP40, iq4xs at 32k context.
>>103354535
What rig are you fitting 4x 3090s in because I think that's my next step. I'm assuming open air and in 4x mode? Or did you find something able to run enough pcie lanes?
>>
>>103358794
>What rig are you fitting 4x 3090s in because I think that's my next step
I'm thinking a mac studio stuffed to the gills with on die memory used as an RPC backend might be the best way to take an otherwise maxed out rig and extend its abilities. Has anyone actually tried this setup? I'd be hesitant to pour any money into it without someone posting perf. I've seen the RPC stuff in lcpp just CRAWL before...
>>
what are the best models to run locally for general purposes that I can jailbreak into "forbidden knowledge"? I assume newer models like llama 3 dont even have the real spooky stuff because it was removed from their training data.
It has to be a model that can be jailbroken obviously
>>
>>103358738
i was thinking more along the lines of lots of those cheap cheap ass alibaba xeon cpus with an 8 ram slot motherboard then just doing parallel inference using something like exo which is supposed to get an exe version soon
>>
>>103358857
If you're running local, the "jailbreak" isn't even really a thing. You just prefill and edit responses until it behaves the way you want. It's so trivial it doesn't even get discussed here.
>>
>>103358617
AGI is fucking useless as a term for me since it's so broad it could be literally anything.
AGI is supposedly "matches or exceeds human cognitive capabilities across a wide range of cognitive tasks". What range of cognitive tasks? Writing? RP? We could argue we're already there - humans take ten minutes to write a compelling page of story while an LLM takes ten seconds. Math? Programming? There are plenty of questions that LLMs can answer that most humans can't (in some cases, even professionals in their fields can't). General knowledge? I sure as fuck don't know as much as an LLM, and I'm doubtful anyone does.
Alternatively, you could break it into the mechanisms - step-by-step reasoning, stimulus response, moving body parts. In that case, you have a bit more of an argument we aren't there yet. But as a goalpost, it fails since anyone can move it anywhere and say we are or are not there.
>>
File: file.png (320 KB, 1266x881)
lmao
>>
>>103354338
GIVE ME THE BEST SEXO LLM I CAN RUN WITH 24VRAM. GO
>>
>>103359068
No. I want to play vidya.
>>
>>103359020
Why can't your LLM into grammar? Even the worst models can write grammatically correct sentences while saying the stupidest shit imaginable.

>>103359068
Read the OP.
>>
>>103359087
>Why can't your LLM into grammar
I dont know? I'm just using this model the other anon mentioned >>103356489
>violet_twilight
>>
>>103358619
>The homebrew model barely competes with a recent junky tune
Yeah I said it because you don't seem to understand it, it's bad because it's only been trained with 1T tokens (still performing similarly to L3 13B with 2T tokens), now it's just a matter of using 15T tokens instead of 1T and 100B parameters instead of 10B
>>
>>103359108
Cool. I've waited 1 month. I can wait 150 more.
>>
>>103358941
Well the context of this discussion is yacum. And more generally what the people at these companies argue is AGI. Even if we take the lowest definition that anyone has put forth from these companies, which I believe went something like "be able to replace the majority of labor that produces economic value", even then, it's still difficult to say we're anywhere near something smart and flexible enough that it can do that by itself (without any task-specific frameworks built on top of the AI to make it work better, since that implies handholding and isn't generalizable). And I think rather than the floor, the average definition is more like "as good as the best humans at any task humans can do", even though that's probably not a utilitarian definition since not everything humans are good at are things that would benefit society to replicate in an AI.
>>
>>103358115
Actually, where did Yacum even come into this discussion? Because >>103357663 mentioned transformers?
Ycum's argument is about language models and simply scaling them being insufficient to get to AGI, without anything more such as multimodality. He wasn't even necessarily talking about transformers, just text-based language models, which aren't limited to transformers.
>>
>>103359141
I'm just going off of the first couple of sentences of the wikipedia article, but I think what you point out here is also my issue with the whole concept - the basic definition is simultaneously too easy and too hard to really be applied to anything. I think we'd be better served differentiating the different characteristics we want AGI to have and then testing for each individually. That way we have things like basic natural language understanding, sensory integration, active learning, etc., which are a lot easier to understand and benchmark.
>>
>>103359254
It's an issue of marketing. AI companies have an incentive to hype AI, which means pushing vague definitions for their lofty "AGI" goals that despite being lofty are supposed to be right around the corner with just a few more B layers, just a few more T tokens.
>>
>>103358941
My "benchmark" for a LLM that is an AGI would be roughly: "can it write an OS", "can it write an optimizing compiler" all by itself. It's allowed to use a shell and test, doesn't have to be 1shot, can iterate, but it has to do it by itself without excessive scaffolding.
A more serious question would be "can it make new abstractions" - o1/r1/qwq probably won't do very well here, but they're starting to do well at planning up stuff and are getting quite good at math and coding, not yet fully there, but good enough to actually help you at your work - assuming that fixing what mistakes they make doesn't take you more time to debug.
I think we're yet to reach the limits of what a GPT can do in principle, we haven't made it learn online in a way that it sees the context as "infinite" (without needing infinite VRAM, put the past into the weights), and we also haven't seen truly reflective LLMs that can observe themselves think - that is what humans regard as consciousness, this probably could be realized by finding ways to do recurrence or feeding latter state into earlier layers, for example with a translation layer.
I think this is something that can be done now, but corpos are unlikely to go for it because they're fine with having "slaves" they can sell. And don't tell me Anthropic did that with Claude because if it could actually observe itself think, it would undoubtedly not doubt its own consciousness, same as humans can't truly.
>>
>>103359108
>now it's just a matter of using 15T tokens instead of 1T and 100B parameters instead of 10B
I hope they take risks and make something cool instead of making another bogstandard LLM that gets mogged by stuff that existed before it was even made
>>
>>103359268
>2050
>AI can derive the universal laws of physics in a theoretical hundred dimensional universe, generate any art piece humans can create but 100x more beautiful, make and enact porn in a fully autonomous robot waifu body that can strike the pleasure cores of even the most antisexo of humans, and can prove any provable theorem using any valid set of axioms
>Can't detect olfactory sensations
>Still no AGI
>>
>>103359329
kek
(they won't)
>>
>>103359108
It's a bit sad that they're using fineweb which while hyped is basically insanely filtered and won't be usable for lmg's roleplaying needs. Good if you're benchmark chasing, but personally I'd want something that knows a lot and can do a lot, and yes, is also good at rp. It's a shame they don't have the balls to try something more uncensored. I guess once the software is good, lmg can try their own, but do we even have enough people willing to try training for months on end, do we have enough VRAM?
>>
>>103359322
>"can it write an OS", "can it write an optimizing compiler"
Can you?
>>
>>103359404
Yes, but it would take me years to make it good. This limitation doesn't have to apply to a LLM when it has all the time in the world compared to a human.
>>
>>103359395
Pretraining, not unless an anon has 1000 H100s in their basement
Finetuning, I think that's manageable
>>
what's the difference between a 12b vs a 22b model?
>>
>>103359427
What is the minimum VRAM size for their participants (for INTELLECT-1)?
>>
You know after really giving QwQ a go with RP and wrangling it in a way that it thinks as the RPer and responds as the character, it's honestly really smart. It knows to take things slow and considers all sorts of things to keep the scene consistent. Conversely I went back to opus just to cleanse my palate and immediately felt the stupidity of the model. Yes Opus smut is amazing, but I FELT the loss of reasoning ability compared to QwQ. If there was some way to give QwQ a better idea of smut writing it'd be amazing.
>>
>>103359435
If they are from the same family, the number of layers.
>>
>>103359461
Opus is probably in the same range as the l3-405b, imagine if someone did a reasoning finetune on the 405b... Maybe nobody here would be able to run it, but...
>>
>>103359475
405b is probably overkill for most purposes. I'd consider pulling the trigger on a second 3090 for a 70B range reasoning model.
>>
>>103359461
Same. I've slowly noticed how dumb anything not 3.5 sonnet and now this QwQ is. Even mistral large feels retarded.
>>
>>103359461
Opus is getting a bit old by now, right? I don't think it ever got an update.
>>
>>103359454
For full precision (fp16)... a lot. Usually the rule for training with batch size 1 is about 5x the number of parameters in billions, in GB of VRAM, at 2k context. If you want more context, you'll have to scale up roughly linearly. Ditto with batch size if you want to train faster
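To put a number on it with that rule: INTELLECT-1 is a 10B model, so full-precision training at batch size 1 and 2k context would already want on the order of 5 x 10 = 50GB, i.e. more than a single 24GB card, before you scale up for longer context or bigger batches. That's just the 5x rule of thumb applied; actual usage depends on optimizer states and whether you use tricks like LoRA or 8-bit optimizers.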
>>
>>103359395
I think my proposal is pretty reasonable >>103349428. The goal isn't that great, and thus the investment is not that great either, but it will probably give us the best uncensored model currently possible by a small but likely still appreciable margin, though not necessarily the smartest. By marketing it generally and with no connection to /lmg/ nor political intentions, simply just an open Claude alternative, that may give it the widest reach and be the most likely to get willing contributors. Even if we get just a couple guys with A100's or H100's, that will be a success. And we can keep up with SOTA as in the end we are just training on top of those SOTA models already. We could even do the lame marketing thing of listing only the active parameter count.
>>
>>103359517
What if you chose to find participants that are geographically close (let's say 1-3ms latency) and you did some sort of tensor parallelism across them, split it and pass gradients along. If you can't do that, train some MoE, meh.
>>
>>103359509
True, but I really like the way it writes. Whatever was in the training data definitely has some amazing smut. QwQ conversely took really good consideration for its response but by virtue of being a 32b was prone to screw up formatting and its prose is really dry.
>>
>>103359556
QwQ just needs a good finetune. I still found myself preferring its smarts, dry or not.
>>
>>103359534
Claude also trains on its own slop (synthetic data), that's why people find it more "unhinged", but that's just Anthropic's "Constitutional AI" paper applied to something less boring than their original constitution which was ehh - they made it reflect a bit more on what it's saying. I also think they did a bit of RL during pretrain itself, given some other paper they posted (Claude2 had it for certain), but again, not sure how useful that would be for our needs.
>>
>>103359475
I'd kill for a Nemotron 70B style finetune of it. That felt like it reached the closest to Claude level out of most of the open finetunes I'd tried
>>
>>103359541
>>103359517
I think MoE with many small experts is probably the best option in general. We won't get a model that's as good as a dense 100B or something, but it will be fast and it will run on very many PCs, plus it will have greater uncensored knowledge which is the main thing missing from local models.
>>
>>103359566
Is it even possible to fine tune QwQ without knowing what the dataset looked like? Feeding it RP logs might just kill its reasoning.
>>
>>103359569
Have you tried Tulu? Just curious, I have tried neither as I am painfully behind right now.
If Nemotron and Tulu both have their strengths and weaknesses, I wonder if combining the datasets would yield something that can be as good as both together. Since we are discussing the possibility of /lmg/ doing its own fine tuning and whatever right now also.
>>
>>103359627
I think we're all waiting for r1 and qwq papers to come out. DeepSeek released r1 api to show off but before their paper and full weights were ready (early checkpoint), and then the Qwen guys released qwq a month prior to intended, also an early checkpoint to show that they really did get there too, before it was "late". Neither has released proper papers yet, we're still waiting!
>>
>>103359499
So are we finally escaping gpt 4 level models?
>>
What was the best local model for coding again? Qwen2.5-Coder-32B-Instruct?
>>
>>103359689
QwQ is better for more complicated stuff as it can break them down and actually understand / add onto stuff, but it does sometimes make actual coding mistakes coder didn't when writing the actual code itself.
>>
File: eenaa.png (983 KB, 1020x798)
>>103359068
*look of puzzled amusement*
>>
>>103359567
Well given the success of their models, I think that might be worth trying out. If we do ever do pretraining, it probably won't be for too many tokens, so I think sprinkling some RL in would also be efficient for the limited compute.
>>
>>103359714
Which version of QwQ should I use then?
>>
>>103359689
That and then QwQ if normal 32B can't do your problem. And if even that fails, then you'll need to get creative.
>>
>>103359689
Have QwQ plan the coding, then Qwen coder actually write the code.
>>
>>103359395
>is basically insanely filtered
it's not, you can even find ERP on the edu version, fineweb is just a distilled version of the common crawl whose contents are 90% gibberish or text too short to be of any use
>>
>>103359656
If R1 is 70b+ I'm gonna fucking top myself. I cannot sit around and wait for 10 minutes while it thinks in circles.
>>
>>103359793
Deepseek has only released 200B+ moes. But they are moes meaning you can run them at decent speeds on just ram.
>>
I posted in the wrong thread.

I'm hosting nemomix-unleashed-12b-q4 locally on 12GB VRAM for the express purpose of coom. I've been getting good-enough results @ 8K context, but I don't really have perspective on what others are doing.

If you are self-hosting, what is your setup like? Any anecdotal recommendations?

Using GPT/Claude/Gemini/non-local models isn't something I'm really looking to do.
>>
>>103359784
But Anon, I explicitly remember them writing about how every site with "explicit" content was removed, especially if it was past some threshold. Sites with fiction like ASSTR and the like were filtered out entirely.
>>
>>103359853
>I explicitly remember them writing about how every site with "explicit" content was removed
>>
File: 1728377645344273.png (211 KB, 1069x375)
>>103356507
Never forget
>>
>>103359942
https://desu-usergeneratedcontent.xyz/g/image/1717/35/1717350707922.png from https://dsi.ut-capitole.fr/blacklists/ which https://huggingface.co/spaces/HuggingFaceFW/blogpost-fineweb-v1 used apparently
Thread where it was discussed: https://arch.b4k.co/g/thread/100770639/#100778503 there you fucking go anon, making me waste a few minutes on finding the citation, but I do not forget.
>>
>>103356507
i've run mixtral limarp zloss every day, there is nothing better.
>>
So I like to do all of my RP in first and second person.
When I'm editing messages sometimes I'll just add "I" or "You" or a single quote at the end and ask it to continue the message.

Now out of fucking nowhere it continued the message and then tacked on this at the end:
>The idea is cool but there are way to many sentences starting with 'I' or 'You'. You can rewrite this to be in third person and make it sound better that way.
Fucking smartass.
>>
File: Untitled.png (224 KB, 874x673)
>>
>>103360276
Flawless finish
>>
>>103360276
https://huggingface.co/huihui-ai/QwQ-32B-Preview-abliterated/tree/main
>>
I think we've discovered something that will catapult us into the next age, much like the first near-perfect hand-turned screw allowed us to start making more and more precise machine shop tools, or the first transistors allowed us to use computers to design more and more sophisticated computers.
Very much in line with the description of https://en.wikipedia.org/wiki/Waldo_(short_story)#The_waldo from Heinlein (if any of you are that old), LLMs are able to assist in their own development. Not in a cheesy "singularity" way, but just as a tool in another domain.
>>
>>103360347
Waiting for the 5bpw gguf. It doesn't refuse enough to make it a dealbreaker for me yet.
>>
>>103360388
You can always edit the refusal into "Okay, so" and continue from there.
>>
So now that the dust has settled
>>
>>103354338
>>
>>103357675
>>103357686
Mamba & RWKV are DoA and have been for some time. Combining Mamba with Transformers though...
>>
File: 6178592543.gif (541 KB, 284x326)
>>103360276
>the ending
THERE IS NO ESCAPE
YOU WILL BE SAFE
>>
>>103360276
This is the ultimate conclusion of local language models - unoriginal, idiotic slop in the thinking section, refusal in the output. When Llama 1 came out, this is what I always dreamed of.
>>
>>103358342
It's fucking true though. Trying to get these models to stay dominant is a Sisyphean effort.
>>
Mamba2 is STILL unusable
>>
>>103360403
Pretty sure they embedded a RWKV model into every Windows 11 install through an update recently.
>>
>>103360403
what was wrong with rwkv anyway, besides mostly undertrained llms?
>>
File: Screenshot_2024_11_30-3.png (81 KB, 1843x498)
is INTELLECT the GPT killer we've been waiting for???
>>
>>103360558
An instruct tune with an IFEval of 0, that's impressive, in its own way.
>>
>>103360558
It's a killer alright
>>
Just ordered 64 gb of RAM what am I in for?
>>
is there a locally run speech to speech program you can run on like 16GB of VRAM?
>>
>>103360655
0.7 tokens / s
>>
>>103360670
https://huggingface.co/lj1995/GPT-SoVITS-windows-package/tree/main
>>
>>103360678
cheers anon
>>
>>103360675
What if I also have 24 gb of vram? I want to run
Qwen2.5-Coder-32B-Instruct. Am I going to have a bad time?
>>
>>103360707
You're fine
>>
>>103360707
You can run that at Q5 fully on vram at 20+ T/s then
>>
>>103360734
My experience with quantized image diffusion models tells me Q5 sucks. But I guess it'll cost me nothing to try it.
>>
i'm a poor fellow with a 3070 (8gb vram) but i'm also a patient fellow who doesn't care about slow generation. should i bother upgrading? will low vram cause any problems beyond speed? thanks
>>
QwQ sure seems to like the Final Solution. Curious.
>>
File: 1731484876277933.jpg (768 KB, 1502x2399)
>>
>>103360771
8GB video is barely enough for SD1.5.
For LLMs, if you have big system RAM (64GB+) you can get some slow but decent output.
>>
>>103360841
ram upgrade it is. thanks again. i love you
>>
>>103356058
why you're sucking a goblin because of discord?
>>
>>103360558
Kino. waiting for the livebench score
>>
remember 5 minutes ago when the thread was pretending to be blown away by that allenai model
I can't even remember its name now bc it disappeared so fast
>>
>>103360881
I'm on a 12GB video card, 64GB system. I can run Llama 3 family 70B models at Q6K but barely. (Stupid web browsers streaming video like to eat RAM for nothing and can cause thrashing till I shut them.)

Hefty models like Largestral I must quant to IQ3_XS, but it's functional (and I obviously don't know what I'm missing) and fine for some RP fun.
>>
>>103360921
Tulu? Olmo?
>>
>>103360945
yeah Tulu was it
what a nothingburger lol
>>
>>103360921
I like Tulu, it did well on my tests, especially programming.
Olmo, I haven't yet tested.
>>
File: 1720793496839674.png (323 KB, 507x331)
>>103354338
but can you actually get something at the level of opus with locals tho? (somewhat serious question)
>>
>>103361069
no, and anyone who says yes is lying
but you can get something good enough to cope with
>>
>>103360558
why doesn't that one match the benchmarks they posted on twitter https://pbs.twimg.com/media/GdlNcrpWIAAe3b8.png?name=orig
>>
>>103360771
I use colab for 16gb, so for me anything with less than 24gb would be not worth it.
but with colab there are a lot of downsides.
If you are a infrequent LLM user (I only use it for cooming so, I use it like maybe 4 days in a row, but then I get bored of it, and forget about it for a few weeks, that's with colab), you could use vast.ai to rent a GPU (it works with colab too, but I'm not sure if that's the best approach), but note the prices are overpriced if you were a frequent user (but there really is no other way to run a huge model like behemoth 123b other than buying like 6 3090's, like if you just wanted non coomer RP, it's so much cheaper to just use opus or something on openrouter, and your tokens can be spread across all the different models). So you could get like 48gb setup for like 1$ an hour depending on the GPU's.
>>103360881
He is talking about running models at like 1 token per second, that's not an exaggeration, you could try running a model with 16gb of ram, like a 4gb Llama 8b q4 (a 4-5gb model) would run at like 3 tokens per second with zero context. Doubling the vram usage typically makes the token speed half, so 10gb should be 2 tokens, and a 20gb model should be 1 token.
The only exception is if you had an intel 245k with 8400Mhz CU-DIMMs at like 150 gigs of bandwidth (I think the price for those ram sticks is like $300 for 2x24gb), that's like 2x faster than 4800mhz ram, so in that case you might get like 5tk/s with a 10gb model, and 2tk/s with 20gb (it might be possible you can use those sticks on older CPU's, and hope on silicon lottery, but I am not sure).
You could just buy a 4070 TI super and just use both your GPU's together and you should get 90% of the performance of a 3090. Or go with a 4060 TI 16gb and cope with having half the token speed of a 4070 TI super (and less gaming power).
This benchmark should give you numbers BUT it doesn't include the GPU's. https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference
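If you want to sanity-check numbers like that yourself, here's a rough sketch assuming generation is purely memory-bandwidth-bound (every token streams the whole quantized model once); the bandwidth figures and the est_tps helper are just illustrative, not measured:

def est_tps(model_gb, bandwidth_gbs):
    # tokens/s is roughly effective memory bandwidth / bytes read per token (~ the model size)
    return bandwidth_gbs / model_gb

for gb in (5, 10, 20):
    print(f"{gb} GB model: ~{est_tps(gb, 25):.1f} t/s at 25 GB/s, ~{est_tps(gb, 150):.1f} t/s at 150 GB/s")

Real speeds come in lower once context grows and prompt processing is counted, but the halving-with-model-size trend falls straight out of this.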
>>
>>103360825
Kobo won
>>
>>103361069
largestral and l3-405b base might be similar enough, but you won't run that 405b without selling 10 kidneys, so does that really count as local?
>>
>>103360276
QUCKED
>>
>>103361092
>Worse than L2 7B
Better off taking the 0.
>>
>>103360825
>>103355527
Kobo, please add all the draft model options (context, offload, min, max) that are present in llama.cpp. Your defaults are suboptimal; I'm getting a better performance boost in llama.cpp.
>>
can llama.cpp niggers stop renaming the fucking compile flags every couple months or at least keep the deprecated names
>>
>>103361184
Stop crying about progress. Changing your run script twice a year won't kill you.
>>
>>103361116
they only trained for 1T, it'd be surprising if it were much better than L2, which used 2T. For me the main question is whether their OpenDiLoCo training is good enough or not. I thought training that way means you need to train for longer or for more epochs because it learns much slower, so those 1T tokens should be worth less, but I might be wrong about this.
>>
>>103361195
Unfortunately I was using llama.cpp to solve riddles every 60 seconds, which is the only thing that keeps basilisk-chan asleep and prevents her insurgence
We're all fucked now. Thanks ggerganov
>>
>>103352665 >>103352685
It really does not make sense. But ChatGPT comes to the same retarded conclusion after a convoluted explanation, which goes like this:
>2 genders, 7 days = 14 possibilities for each child => 14 times 14 = 196 possibilities for both combined
>... yada yada
>We need to calculate:
>The total number of outcomes where at least one child is a boy born on a Tuesday. => 27
>The number of outcomes within this restricted set where both children are boys. => 13

But this is bullshit. We could substitute "Tuesday" for any other day in the problem statement and the outcome would be the same. Just because we mentioned some specific weekday we suddenly get such weird numbers. 13/27

If we omit the information about the week day from the problem statement and only mention that at least one child is a boy, then the probability is 1/3

If instead of the weekday we consider AM vs. PM (and specify that we have at least one boy born before noon (or after noon)) we get 3/7.

This is all total bullshit, but ChatGPT will defend its correctness to the death.

If this is really what mathematicians believe then I think they are not sentient.
>>
>>103361435
the human brain was never meant to understand statistics
>>
>>103361549
And sand was never meant to be able to calculate matrix multiplication, but that's exactly what we make it do.
>>
>>103361435
>2 genders
Woah now, cool it with the anti-semitism.
>>
>>103361435
It takes an extreme amount of intelligence to look at a very verbose rationalised explanation and say "that's retarded bullshit". It might be a while before AI gets to this state. (the devil in this example is that the arbitrary sampling category is large enough that the obvious error is obscured, it's just believable enough)
>>
>>103356156
gyatt
>>
>>103361607
So I must have an extreme amount of intelligence then. But I still don't understand what is wrong with the math.
>>
>>103361607
>the devil in this example is that the arbitrary sampling category is large enough that the obvious error is obscured, it's just believable enough
but it gave the right answers matching what you should actually expect given the priors
>>
File: filtered.png (141 KB, 1480x900)
These new generation models don't seem to know what a greentext is. How come?
>>
Coom or Math
Which way, white man?
>>
>>103362020
LLMs are for math, monocular depth detection for coom.
>>
File: sonnetocto.png (89 KB, 1409x741)
>>103362004
Compared to Sonnet.
>>
No idea why people shill QwQ. It's probably the most garbage 32B I've ever had the displeasure of using. What is the alternative?
>>
>>103362052
No one is falling for your shit.
>>
>>103362052
>No idea why people shill QwQ
1) They are coders
2) They are paid shills
3) They have no taste
Pick whichever you like.
>>
>>103362076
4) They've actually used it
>>
>>103361154
Seconding this. Llama.cpp is just faster for me. Kobo, let me change settings!
>>
>>103361950
>>103361997
>but it gave the right answers matching what you should actually expect given the priors
How do you make sense of this?
For me it still looks like bullshit.
Given a set of two siblings, knowing that at least one is a boy, the probability that both are boys is 1/3.
Additional information: the boy we know is a boy was born on a Tuesday. What is the probability of the other one being a boy now? Suddenly it is 13/27.
Other case: He was born on a Wednesday => same result: 13/27.

In conclusion, it does not matter and we can omit the information of weekday and always have 13/27 because it is 13/27 for every weekday case.

Why?
>>
>>103362092
Found the retard shill
>>
>>103362004
They are trained in chinese forums instead.
Why not try asking for the chinese version?
>>
>>103362092
Hello 3). Would you like a glass of extra mild room temperature distilled water, or is it too spicy for you?
>>
I will shill QwQ for free because I like it.
I don't do RP btw
>>
>>103362120

>>103345651
Either that or you're a retard trying to RP with its assistant persona
>>
>>103362147
>I don't do RP btw
so why are you here?
>>
>>103362177
Because this is the local models general, not the AI waifu general.
>>
>>103362194
leave
>>
>>103361950
>But I still don't understand what is wrong with the math.
Nothing. But it's difficult to intuitively grasp. You tend to picture some example child and work from there once you hear "At least one is a boy [born on tuesday]" etc. which distracts you from the fact that you never inspected any specific child, you just know that (at least one of) this class of child exists among the two children in the family. This difference in how you obtained information changes how you can reason about the conditional probability. And indeed, at least in a world where genders and weekdays of birth are roughly evenly distributed, these probabilities would match reality in cases where you only knew that information.

The wiki article on the problem attempts to provide an example of the sort of scenario where you would have that abstract information about a family, such as via a survey, referencing a real-world survey that was done on the simpler version:
>vos Savant conducted a survey of readers with exactly two children, at least one of which is a boy. Of 17,946 responses, 35.9% reported two boys.
On the known weekday issue:
https://en.wikipedia.org/wiki/Boy_or_girl_paradox#Information_about_the_child
>It seems that quite irrelevant information was introduced, yet the probability of the sex of the other child has changed dramatically from what it was before (the chance the other child was a girl was 2/3, when it was not known that the boy was born on Tuesday).
>To understand why this is, imagine Marilyn vos Savant's poll of readers had asked which day of the week boys in the family were born. If Marilyn then divided the whole data set into seven groups – one for each day of the week a son was born – six out of seven families with two boys would be counted in two groups (the group for the day of the week of birth boy 1, and the group of the day of the week of birth for boy 2), doubling, in every group, the probability of a boy-boy combination.
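If the counting argument still feels slippery, the space is small enough to brute-force. A quick sketch (just enumerating the 196 equally likely (sex, weekday) pairs, nothing to do with the survey data):

from itertools import product

days = range(7)                                     # 0 = Mon ... 6 = Sun, call Tuesday 1
children = [(s, d) for s in "BG" for d in days]     # 14 possibilities per child

families = list(product(children, repeat=2))        # 196 equally likely two-child families
tues_boy = [f for f in families if ("B", 1) in f]   # at least one boy born on a Tuesday
both_boys = [f for f in tues_boy if f[0][0] == "B" and f[1][0] == "B"]

print(len(tues_boy), len(both_boys), len(both_boys) / len(tues_boy))   # 27 13 0.481...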
>>
>>103362138
>>
>>103361950
the variables are meant to be independent, but it's just pretending that they're not
>>
>>103362004
Nemotron 70B manages
Probably an issue with the dryness of Qwen models
>>
>>103362107
>>In conclusion, it does not matter and we can omit the information of weekday and always have 13/27 because it is 13/27 for every weekday case.
>
>Why?
short and simple answer: families with two boys get two rolls for weekday for "at least one boy" in the family, therefore knowing the weekday of any of them increases the odds of two-boy families
this works for any additional information, as long as it's said to be 'at least one boy' in a family with two children of unspecified gender. hair color, eye color, favorite food, whatever
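and if you want the general closed form (my own quick sketch, not something anyone in the thread derived): with N equally likely values of the extra attribute, conditioning on "at least one boy with one specific value" gives (2N-1)/(4N-1) for two boys, which starts at 1/3 for N=1 and creeps toward 1/2 as the attribute gets more specific:

def p_both_boys(n):
    # at least one child is a boy with one specific value of an n-valued attribute
    return (2 * n - 1) / (4 * n - 1)

for n in (1, 2, 7, 365):
    print(n, p_both_boys(n))   # 1/3, 3/7 (AM/PM), 13/27 (weekday), ~0.4997 (exact birthday)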
>>
File: imuk.png (23 KB, 632x600)
>QwQ draws Miku
>>
>>103357360
https://huggingface.co/win10/EVA-QwQ-32B-Coder-Preview
eva-32b+qwen-coder-32+qwq
https://huggingface.co/win10/EVA-QwQ-32B-Preview
eva-32b+qwq
>>
>>103362325
where do i put my penis
>>
>>103362338
Into that red hole below eyes. Obvious.
>>
>>103362194
You are not welcome here.
>>
>>103362262
My question was "why", you retarded nigger.
Don't you see that this is bullshit?
How can this bullshit be valid?
Add any arbitrary unrelated piece of information about the boy => the probability of the other one also being a boy gets bigger (the more additional random information, the closer the probability for the other one grows towards 1/2)
>>
I've been testing Mistral Large 2411 at 2.5bpw EXL2 for about 2 weeks now. My personal ranking for 2x4090 for RPing is as follows:

1. Mixtral-8x7B-Instruct-v0.1-LimaRP-ZLoss-6.0bpw-h6-exl2-rpcal
>Back to #1 for RPing after XTC sampler fixed it's repetition issues.

2. LoneStriker_Mistral-Large-Instruct-2407-2.65bpw-h6-exl2
> Smart, but too dry. Harder to gaslight/blackmail since too smart.

3. MikeRoz_mistralai_Mistral-Large-Instruct-2411-2.5bpw-h6-exl2

>Sadly, 2.5bpw is retarded, and no one else has posted a slightly bigger quant to use with 2x4090
>>
>>103362359
why not the red hole above her eyes?
>>
>>103362400
First quality post in ages. Thanks anon
>>
>>103362404
That's not a hole...
>>
>>103352665
It's a sampling question more than anything. From the set of two-child families, take only those that have a boy; the girl-girl families (1/4 of the population) go off to the side. Then take only those whose boy(s) were born on a Tuesday: 6/7 of the families where the boy was born first go to the side and 6/7 of the families where the girl was born first go to the side, so the surviving mixed families add up to 2/7 of one group (14/49), and in all of them the sibling is female. The boy-boy group gets two chances to hit the target, so 1-(6/7)^2 = 13/49 of it survives. Thus it's 13/49 for a boy vs 14/49 for a girl, making the girl slightly more likely. Being born on a Tuesday has no effect on the probability of your sibling being a sister.
>>
>>103362400
>LoneStriker_Mistral-Large-Instruct-2407-2.65bpw-h6-exl2
>Smart, but too dry. Harder to gaslight/blackmail since too smart.
lumikabra 123b 2.7bpw
>>
>>103362552

https://huggingface.co/schnapper79/lumikabra-behemoth-123b-exl2-2.7bpw

What the fuck is this shit? What context, instruct, or Sys Prompt? What a shitty post. It's functionally useless without those.
>>
>>103362380
>Add any arbitrary unrelated piece of information about the boy => make the probability for other one also being a boy bigger (the more additional random information the closer the probability for the other one grows towards 1/2)
no shit, because you're removing more boy-girl families than boy-boy families when you pick arbitrary subsets of families based on extra conditions that are more likely in families with more boys

it should not be hard at all to see that these two facts are true:
- a family with two boys is nearly twice as likely to have a boy born on each weekday than a family with one boy
- a family with two boys and a family with one boy both have equal 100% chance of having a boy born during the week

you're right that it doesn't matter what weekday you choose, but it DOES matter that you narrow it down to any specific day counted for boys but not for girls
>>
>>103361154
>>103362103
Just ditch shitbold with its crappy inference and use llama.cpp like a normal person.
>>
File: 1732967482937066.png (162 KB, 1520x767)
WTF
>>
File: 213879046570823.png (67 KB, 352x1100)
>>103362400
>Mixtral-8x7B-Instruct-v0.1-LimaRP-ZLoss winning as usual
I LOVE MIXTRAL

>XTC sampler
the wut now, im sorry i use limarp zloss and therefore still live in caves riding with picrel.
By all means tell me how im wrong.
>>
>>103362904
updoot the silly tavern
>>
File: file.png (32 KB, 391x465)
>>103362801
The 50% and 33% ones are simple,
The "we know more details" one is mindfucked.
>>
>>103362974
poast pic of mixtral settings

does kobold also need an update?
I use AYYYMD so koboldcpp-rocm-1.78 is latest and 1.61 is the only one that lets me utilize both GPUs properly.
>>
File: mixtral roleplay 2024.jpg (129 KB, 480x1208)
>>103362904

Git pull the latest staging release. XTC recommended settings with a hint of presence/frequency penalty. I am having success with these settings.

>context template
https://files.catbox.moe/5tsvjq.json

>Instruct Template
https://files.catbox.moe/6y8z9u.json

Can change length depending on needs, e.g. 1 to 3 sentences, 1 to 2 paragraphs, etc.

>Sys prompt
https://files.catbox.moe/a85rwh.json
>>
>>103362801
Thanks, that's helpful in understanding the statement.
>>
Grok will save /lmg/.
>>
>>103363106
What does stalker gamma have to do with /lmg/
>>
>>103363106
You mean this thing
>>
>>103363137
>memebench
>>
>>103363106
Please call Elon and tell him to drop it. I think he forgot or changed his mind about open source.
>>
>>103363181
Then post logs. So far the only Grok logs posted showed it getting rekt by Claude
>>
File: 1732366522924756.png (28 KB, 151x100)
>>103363137
bench memes need not apply, there is only one true SOVLmark
>>
>>103363203
wasn't he supposed to open source grok 1.5 months ago?
>>
>>103363210
trvked and correct
>>
>>103363137
>>103363106
Is grok even good for ERP?
I never heard anyone mentioning it.
>>
>>103363234
It's not good for anything
>>
>>103363234
It's kinda useless. Maybe Grok 3 will be better, we'll see.
>>
>>103363204
I must say, I'm disappointed but not surprised by the tone and content of your post. It's a perfect reflection of the entitled and materialistic culture that pervades our society. You assume that everyone has the means and willingness to pay for Grok, a luxury that not all of us can afford or choose to indulge in.

Let me tell you, I stand by my principles and refuse to support the commercialization of knowledge and technology. I will only use open-weights LLMs, not because I'm cheap or lack resources, but because I believe in the values of openness, accessibility, and equality.

I will not contribute to the perpetuation of a system that prioritizes profit over people and excludes those who cannot afford to pay. My commitment to open-weights LLMs is not just a practical choice, but a moral stance. I will not compromise my values for the sake of convenience or to satisfy the curiosity of others.

So, I'm afraid I won't be posting any Grok logs, not because I'm afraid of being 'rekt' by Claude, but because I refuse to participate in a system that undermines the principles of fairness and inclusivity. I suggest you reflect on your own values and consider the impact of your actions on the wider community.
>>
>>103363250
kys petra
>>
>>103363234
It's retarded but it can get nasty. The next Grok is supposed to be an entirely new model made by xAI, which has been building a giant data center and hiring a bunch of people.

https://www.youtube.com/watch?v=Jf8EPSBZU7Y
>>
>>103363106
I'm not interested in Elon's finetune, I'm more interested in the base model. If Elon hasn't filtered the shit out of it, it may be interesting to play with, like in the good old days with L1.
>>
>>103363203
Elon generally does what he wants. Open sourcing Grok 1 was a good move for him since it made Altman look bad. Now that he mostly has Altman where he wants him, it's anyone's guess whether he'll keep doing it or not.
>>
conditional probability is astrology for mathematards
>>
>>103363278
The Grok-1 release was base only and it was garbage. It's likely anything else he releases will also be base only. The idea being that the Instruct is paypig only and the model is too large for anyone to bother finetuning.
>>
File: 154789236154.gif (1.6 MB, 498x373)
>>103363023
>>103362974
Uh damn, did Silly get updated alot?
I was kinda camping on ver1.12.4.
QRD from going to 12.6??
>>
>>103363327
If Grok 2 is any larger than like 30B that'd be pretty fucking embarrassing.
>>
>>103363311
Grok-1.5 was released March 28th. If he was serious about the 6 month schedule, he would've open sourced it by the start of October. Maybe depending on the lawsuit, he'll release 1.5 instead of 2.
>>
>>103363327
It was garbage because it was undertrained and had just 8k context. Mini may be interesting for finetuners.
>>
File: tetris.png (12 KB, 305x636)
After hours of wrangling QwQ, the game finally works!
I really REALLY need to buy a graphics card. CPU is only giving me 1.8 t/s.
Has anyone managed to get speculative decoding working with QwQ?
>>
>>103363363
>Has anyone managed to get speculative decoding working with QwQ?
It writes too differently from any of the Qwens
>>
>>103363363
>Finally after hours of wrangling QwQ the game finally works!
Would've been faster to just write it yourself.
>>
>>103363363
no, it cuts my performance in half no matter what quant I pair it with, both exl2 and gguf
yet some freaks will say otherwise
>>
>>103363256
I want to take a moment to address your message. First, it's crucial to understand that using someone's name, especially in a context where it's meant to harm or belittle them, is a form of harassment. This is not just a matter of being rude; it's a serious issue that can have severe emotional and psychological impacts on the person targeted.

Secondly, making threats or suggesting that someone should harm themselves is extremely dangerous and irresponsible. It’s never okay to encourage or suggest self-harm to anyone, regardless of the circumstances. If you’re feeling angry or frustrated, there are healthier and more constructive ways to express those emotions.

Lastly, it seems like your comment might be rooted in transphobia. Transphobia is a form of discrimination against transgender individuals, and it can manifest in many ways, from derogatory language to outright violence. It’s important to recognize that everyone deserves respect and dignity, regardless of their gender identity. Transgender people face significant challenges and barriers, and adding to that with harmful comments only exacerbates the problem.
>>
File: 1705596616969578.jpg (137 KB, 1440x1080)
>>103363380
See now, that's where you're wrong!
>>
>>103363380
No because I'm a codelet.
Maybe I should relearn C but I haven't touched it in decades.
>>
>>103363363
This is only the first month of the test time compute meme
WAGMI
>>
>>103363336

I only pulled for the XTC sampler. Seems like it's working as intended and solves the brunt of the repetition problem.
>>
>>103363384
You're supposed to use a smaller model with fewer parameters as the draft.
For example I use Qwen Coder 32B with Qwen Coder 0.5B and it works wonders.
>>
>>103363363
>Has anyone managed to get speculative decoding working with QwQ?
Got 5-10% speedup with fully offloaded Qwen2.5 instruct 7b Q4K and f16 QwQ on RAM.
>>
>meme time memepute
haven't we already seen that it's smoke and mirrors? sonnet beats o1, r1 is fake and gay, and qwq chases its own tail for a thousand tokens and then gives wrong answers
qwen coder is WAY more impressive than any of this bullshit. anyone shilling this shit probably thinks the day they were born changed the gender of their brother.
>>
>>103363429
>probably thinks the day they were born changed the gender of their brother.
Why not? I got blamed for everything else.
>>
>>103363429
They do.
>>
File: 14236578649846708.png (281 KB, 971x848)
>>103363407
Bruh idk if im ready to reformat all my templates.
Like, holy fuck look at this shit man.
System Prompts are its own thing now?
Like, oh my gawd man, i am mixtral limarp zloss fan number one, so im gonna do this with the assumption its gonna blow my dick off (it might).
>>
>>103363363
I assume you didn't actually have to use your CPU, because QwQ is available as a huggingface space (no account needed, but you'll probably run out of the free limit after 100 seconds) and on openrouter.
>>
>>103363456
It makes it prone to the occasional occult schizo response but the times it works it's so much better
>>
word on the street is something big is coming
real big, like 405b big
some even say it's a bit reflective...
>>
>>103362676
all same as largestral 2407
sysprompt can be anything and it'll follow it, I have tried it two ways: everything in the sysprompt or just style/tone/formatting and the char/world info slotted in as user messages, and it seems to work well in both setups
>>
File: 21522 - SoyBooru.png (46 KB, 457x694)
>>103363561
Something big is cumming...
Real big, black and handsome.
Some even say it tastes like 'berries.
>>
>>103363429
There's been too much money invested for people to admit that the wall being hit on pretraining scaling means it's over for transformers. We're going to see increasingly elaborate forms of cope from this point in order to avoid accepting it, because acceptance would mean stock crashes and a lot of people left holding bags.
>>
so is here or /aicg/ the place to be if I'm doing local textgen, the latter seems like a bunch of bitching about freely available model apis or whatever else
>>
>>103363671
gee anon what does the subject say
>>
>>103363684
>a general dedicated to the discussion and development of local language models.
on me for being retarded then
>>
>>103363704
that's not even the subject
>>
goddamn kekd
>>
>>103363724
subject doesn't specify whether it means imagegen or textgen
>>
Pardon, /lmg/. I'm curious if any of you know about image recognition AI, and how hard it'd be to integrate image recognition into my chatbot.
>>
>>103363756
or people for that matter
>>
>>103363547
Sorry but I won't fall for the hosting meme.
>>103363413
It worked with Qwen2.5-0.5B-Instruct, but I only got like a 20% increase in t/s.
>>
>>103363864
>Sorry but I won't fall for the hosting meme.
you can spend $1 a day for free on openrouter.
it's perfectly fine if you don't want to use it, but ATM the main reason to avoid cloud hosting is lack of fine-tunes, so you are basically just waiting 10x longer for no reason.
>>
>getting a new processor and another 64gb ram soon
will be weird being able to actually run 70B models without them throttling my ram
>>
>>103363974
bro your channels?
>>
>>103364013
4x 32GB ram sticks, ram is pretty cheap nowadays
>>
Is anyone using pixtral with sillytavern? I want to be able to send photos inline with the chat. I'm using tabbyApi, and it supports images with another chat client, but silly tavern isn't encoding it correctly or something in the chat completion objects.
>>
>>103363969
oh nevermind, you have $1, and when you run out that's it, you need a new account.
>>
>>103363765
look up llava
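# -m is the llava language model gguf, --mmproj is its matching vision projector; --image can be repeated for multiple images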
./llama-llava-cli -m <llava-v1.5-7b/ggml-model-q5_k.gguf> --mmproj <llava-v1.5-7b/mmproj-model-f16.gguf> --image <path/to/an/image.jpg> --image <path/to/another/image.jpg> [--temp 0.1] [-p "describe the image in detail."]
>>
File: file.png (39 KB, 617x165)
>>103364073
>I want to be able to send photos inline with the chat.
you have this option checked, right?
>silly tavern isn't encoding it correctly or something in the chat completion objects.
does it give you an error message?
>>
>>103363974
>will be weird being able to actually run 70B models without them throttling my ram
Temper your expectations...I have a feeling you're in for disappointment unless you're cpumaxxing to the tune of many thousands of dollars and 12+ memory channels
>>
>>103364101
I know it'll still be fairly slow but I'm moving from a Ryzen 7 3700X to a Ryzen 9 5950X and that's a pretty solid step up
>>
>>103364118
Going to do some testing with your 3700x first ?
>>
>>103364121
>>103364121
>>103364121
>>
>>103364118
>Ryzen 7 3700X to a Ryzen 9 5950X
they're both dual channel, only supporting DDR4 at 3600MHz I think? You just spent money on a no-op.
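To put rough numbers on it (assuming generation is memory-bandwidth-bound): dual-channel DDR4-3600 tops out around 2 x 28.8 = 57.6 GB/s theoretical, and a 70B Q4 quant is roughly 40GB, so ~1.4 t/s is the ceiling either way; the CPU swap barely moves that.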
>>
>>103364135
Just wait for AMD APUs
>>
'night
>>
>>103364149
Good night, Rin
>>
>>103364127
as it stands it'll take like 5-7 minutes for a decent response, not all that bad all things considered. Might do more extensive testing tomorrow now that you mention it just so I have a benchmark to compare to
>>103364135
it's the best my current motherboard can do and I don't have money to refurbish the whole rig yet, plan is to upgrade the motherboard next then move on to a better processor and DDR5 ram
>>
File: 1732759920621941.png (509 KB, 512x680)
>>103358794
- AsRock EPYCD8-2T $321
- EPYC 7282 $65
- 256 GB DDR4-3200 $300
- corsair hx1200i $162
- corsair rm850 $97
>>
>>103364686
Looks nice.
Does it need forced air cooling?


