/g/ - Technology

File: 1763066570866113.jpg (345 KB, 1920x1080)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>108057380 & >>108046563

►News
>(02/04) Voxtral Mini 4B Realtime 2602 released: https://hf.co/mistralai/Voxtral-Mini-4B-Realtime-2602
>(02/04) Intern-S1-Pro 1T-A22B released: https://hf.co/internlm/Intern-S1-Pro
>(02/03) MiniCPM-o-4.5 released: https://hf.co/openbmb/MiniCPM-o-4_5
>(02/03) ACE-Step v1.5 released: https://hf.co/ACE-Step/Ace-Step1.5
>(02/03) Qwen3-Coder-Next released: https://hf.co/Qwen/Qwen3-Coder-Next
>(02/03) GLM-OCR released: https://hf.co/zai-org/GLM-OCR
>(02/02) Step 3.5 Flash 196B-A11B released: https://hf.co/stepfun-ai/Step-3.5-Flash

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
File: file.png (944 KB, 1025x632)
►Recent Highlights from the Previous Thread: >>108057380

--New open-source real-time speech model release sparks discussion on AI hype cycles and industry dynamics:
>108059281 >108059426 >108059478 >108059494 >108059551 >108059602 >108059717 >108059727 >108059763 >108064228 >108064264 >108062636
--Reactions to Intern-S1-Pro model release and skepticism over its practicality:
>108058734 >108058764 >108058807 >108059152 >108059159 >108059673
--GGML backend-agnostic tensor parallelism development and performance benchmarks:
>108061572 >108061588 >108061754 >108062120 >108062150 >108062216
--NUMA memory binding and VRAM capacity affect prompt processing speed more than CPU AVX512:
>108064934 >108064948 >108064976 >108065066 >108065090 >108065316 >108065193 >108065223
--Skepticism over ACE-Step 1.5 music model due to questionable training data:
>108059833 >108059863 >108059889 >108059898 >108059907
--Critique of open-source AI music generator's poor output quality and synthetic training data:
>108059988 >108060054 >108060063 >108060055
--DIY PCIe VRAM expansion card concept and its feasibility challenges:
>108062825 >108062851 >108062859 >108062862 >108062872 >108062965 >108062974 >108063304 >108063187
--Local LLM-powered audiobook tool with character-specific voice cloning and emotional expression:
>108059227 >108059258 >108059289 >108059313 >108059340
--Vision models capable of describing sexual content and their accuracy limitations:
>108065669 >108065748 >108066327 >108065983 >108066011 >108066140
--Critique of LLMs' overly verbose, artificial tone and call for more direct responses:
>108057776 >108058061 >108058376 >108058685 >108058399 >108058770 >108058738
--MiniCPM-o 4.5 runs on 3090 with 20GB F16 or 13GB Q8 quantization:
>108059684 >108059758 >108059815
--Miku (free space):
>108065778 >108062825

►Recent Highlight Posts from the Previous Thread: >>108057382

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>
>Qwen3-Coder-Next

I evaluated it on a moderately detailed prompt I had previously used with another coding model to generate a program from scratch. The quant was 9 bpw (MLX 8-bit with group size 32).

For the first trial I used the recommended settings: temperature 1.0, top-k 40, top-p 0.95. The project didn't run due to missing imports. When prompted with the error message it fixed the imports but also made unrelated changes; I believe temperature 1.0 is too high. It also had a Python path problem: because of the directory structure, its instructions for running the program were incorrect. When prompted with the error message it offered two fixes, one of which worked and one of which didn't. With that fixed, the program at least ran, but had UI glitches.

For the second trial I changed temperature to 0.7, keeping top-k 40 and top-p 0.95. The generated program had no missing imports but, like the first, had Python path problems. I ended the evaluation there.
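The temperature difference matters more than it might look. A toy sketch of the temperature/top-k/top-p pipeline (illustrative only — the logit values are made up, and real engines filter logits before softmax rather than probabilities after, though the surviving set is the same):

```python
import math

def sample_filter(logits, temperature=1.0, top_k=40, top_p=0.95):
    """Apply temperature scaling, then top-k, then top-p (nucleus)
    filtering. Returns renormalized probabilities of surviving tokens."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)                             # stable softmax
    exps = [math.exp(l - m) for l in scaled]
    total = sum(exps)
    probs = sorted((e / total for e in exps), reverse=True)
    probs = probs[:top_k]                       # top-k cut
    kept, cum = [], 0.0
    for p in probs:                             # nucleus (top-p) cut
        kept.append(p)
        cum += p
        if cum >= top_p:
            break
    s = sum(kept)
    return [p / s for p in kept]

logits = [5.0, 4.0, 3.0, 2.0, 1.0]              # toy 5-token vocab
hot  = sample_filter(logits, temperature=1.0)
cold = sample_filter(logits, temperature=0.7)
# Lower temperature concentrates mass on the top token, so unlikely
# (often "unrelated edit") continuations get pruned much harder.
print(round(hot[0], 3), round(cold[0], 3))      # → 0.665 0.771
```

Dropping from 1.0 to 0.7 lifts the top token from ~67% to ~77% here, which is consistent with the second trial producing fewer stray changes.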
>>
File: rinP1.png (81 KB, 669x1416)
>>
>>108067656
Have you made similar evaluations for other models?
What are the standouts, both big and small?
>>
File: file.png (17 KB, 1111x334)
>>
>>108067656
>running coding models at any temp higher than 0.4
lol?
>>
File: 1745111782166190.png (26 KB, 587x124)
Soon
>>
>test the dogshit assistant pepe llm shilled last thread
>post pics and a garbage greentext
>go back to work/watching anime/genning goon material
>check back the thread
>multiple responses with people seething
>there were people who literally couldn't make the connection between the 3 posts before the greentext and it
holy non-sentient beings. I'd ask them to post hand but I don't even need to in this case lmao.
>>
>>108067869
Is that... the 'garm? HOLY KINO!!!!
>>
https://huggingface.co/MuXodious/gpt-oss-20b-tainted-heresy
I find it fascinating that gpt-oss still manages to prevent modern abliteration methods from doing a full 100% job. I'm not a promptlet and can live without abliterated models, but curiosity always has me trying tunes to see how much they degrade models. So far I've seen heretic models perform so well on qwen that I ended up replacing the originals with the heretic versions: they weren't damaged at all in productive uses and had zero refusals.
Meanwhile you have tunes like the one linked above of gpt-oss that have a huge KL divergence and still tons of refusals without a prefill.
sama really wasn't joking when he said he would safety-max his open model.
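For anyone unsure what "huge KL divergence" means concretely here: it compares the original and tuned models' next-token distributions, with 0 meaning identical behavior. A minimal sketch over a tiny made-up vocabulary (the distributions are invented for illustration, not measured from any model):

```python
import math

def kl_divergence(p, q, eps=1e-9):
    """KL(P || Q) in nats over a shared token vocabulary."""
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical next-token distributions over a 4-token vocab:
base      = [0.70, 0.20, 0.08, 0.02]   # original model
light_mod = [0.68, 0.21, 0.09, 0.02]   # gentle tune, behavior nearly intact
heavy_mod = [0.40, 0.30, 0.20, 0.10]   # aggressive tune, distribution smeared

print(round(kl_divergence(base, light_mod), 4))   # → 0.0011
print(round(kl_divergence(base, heavy_mod), 4))   # → 0.2051
```

The "not damaged at all" heretic tunes described above correspond to the low-divergence case; a tune can also land in the high-divergence case and still refuse, which is the worst of both.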
>>
>>108067836
I used the exact same prompt with GLM-4.7 but I haven't used the prompt extensively. I imagine I'll keep trying it on new models as they come out and eventually get some comparisons.

>>108067860
Yeah their official recommended settings seemed strange.
>>
>>108067946
>Yeah their official recommended settings seemed strange.
Not insane, but unusual. I was a bit skeptical, but it's quite possible that for a model designed only to code, the probability distribution at temperature 1.0 is well-suited to that. It doesn't necessarily seem to be the case here though.
>>
>>108067946
>I used the exact same prompt with GLM-4.7
And how did that perform?
>>
>>108067820
omg it rin-chan
>>
>>108067894
Weird, the oss 120b version I'm using doesn't refuse at all. No idea how smart it is; I just use multiple models to avoid everything sounding the same and pull it together with k2.
>>
did anyone else grow really tired of llms? every time a new model comes out you see a lot of benchmarks about how it's the new best thing ever, but in the end the output is always the same slop with the usual problems
not even talking about porn/RP
>>
>>108068032
I've seen decent incremental improvements in capability for the uses I care about, such as smaller models to translate webnovels locally. I wouldn't even have considered doing that with a piece of trash like Llama 3.
the field isn't progressing as fast as the hype / muh AGI bs pushed by people who pivoted from selling crypto to selling singularity snake oil, but it's making some pretty legit improvements in various real-world uses.
Qwen 3 VL is also more than good enough to tag your local photo library, for example, complete with notes containing the OCR of whatever writing appears in your photos (architectural inscriptions in Latin, e.g.)
I don't use LLMs like most local coomers though. I coom to pictures, like any normal man, sorry to the women in here. And I wouldn't even consider local models for coding, a task where I really wouldn't want to waste time on nonsense (even the SOTA models I mainly use for misc things like writing one-off throwaway scripts to juggle files and data around, or as a secondary failsafe code review that I don't depend on).
>>
>>108068032
>did anyone else grow really tired of llms?
Yes, when I came to the understanding over 2 years ago that nothing on the horizon would give me the connection with a sentient or even conscious entity that I desired.
Instead I shifted my expectation to wanting a better model capable of raw text completion to assist my own writing projects. Those still haven't arrived in sizes I find acceptable, nor, size aside, with usable context lengths I find acceptable (which would be at least 32k; everything falls apart at 4-8k). I think there's hope on that front.
>>
https://www-cdn.anthropic.com/0dd865075ad3132672ee0ab40b05a53f14cf5288.pdf
ciggies in shambles
anthropic disables prefill on the api
>>
>>108068353
>sentient or even conscious entity
Overrated. Also obligatory reminder that most women are retarded.
>>
>>108068386
>anthropic disables prefill on the api
new locust influx incoming?
>>
people who would give their money to a scumbag like dario deserve all the pain they can get
>>
>>108068386
They will enable it again after the vibecoding bubble pops.
>>
I am waiting for step 3.5 only to try it and realize it is trash compared to glm
>>
step is on the level of minimax in terms of being a grifter lab with no value to offer other than trying to benchmax harder
>>
>>108068353
If you use ai for creative writing you kinda suck anyways
>>
File: be.png (74 KB, 673x130)
I'm going insane.
>>
llms aren't capable of "creative" writing, much like how image models are unable to invent artistic directions of their own (can you prompt a model that doesn't know van gogh's paintings into making something that looks similar, without training directly on photos of his paintings? no? thought so.)
>>
>>108068495
You are not just going insane, you are experiencing full blown awakening
In summary, you are absolutely right
>>
>>108068459
glm is predictable and gets boring after a while
I want a model that has good creative variety in the 200b-300b tier
>>
>>108068479
>you kinda suck anyways
Which is why I'm using it in the first place I guess.
In their current state, LLMs are a godsend for brainstorming especially. Continue the text to explore where a bunch of my decisions and ideas lead, and see if it comes up with anything I haven't thought about.
This is good because I might consider a new idea stupid or boring and never see it through to the end for that reason. The LLM, though, will continue until I stop it. This can lead to more interesting branches down the line that I would never have explored if I had to think or write it all out manually. If a completion is good, I take core ideas, not verbatim text, from it and combine them with ideas from other completions to plan a new direction to follow and write by hand.
Classic manually written or drawn character sheets are used for keeping track of relationships, speech patterns, events, and all that stuff. I've tried various RAG techniques with injections and keywords, but they're more hassle than doing it on sheets. Plus it takes time to reprocess context all the time, so fuck that.
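The branch-exploration workflow above is easy to script. A minimal sketch, where `complete` is a stand-in for whatever backend you actually call (llama.cpp server, KoboldCpp, etc.) and `fake_complete` is a stub purely for illustration:

```python
# Fan a premise out into several continuations and keep only the
# distinct ones, so each surviving branch is a different direction.

def explore_branches(premise, complete, n=4):
    """Generate n continuations of `premise` via `complete` and
    deduplicate them. `complete(premise, seed)` returns a string."""
    seen, branches = set(), []
    for i in range(n):
        text = complete(premise, seed=i)
        if text not in seen:
            seen.add(text)
            branches.append(text)
    return branches

# Stub backend: a real one would hit your inference server with a
# different seed (or high temperature) per call.
def fake_complete(premise, seed=0):
    endings = ["...betrayal.", "...reunion.", "...betrayal.", "...escape."]
    return premise + endings[seed % len(endings)]

print(explore_branches("The letter arrived ", fake_complete, n=4))
```

From there you skim the branches, keep the core ideas worth pursuing, and discard the rest, exactly as described above.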
>>
>>108068495
This "not X, but Y" pattern seems specific to English; I don't have that in my language
>>
>>108068940
>I don't have that in my language
it definitely exists in mine (French):
Plus qu'un X, c'est aussi un Y (more than an X, it's also a Y)
Au delà de X, Y (beyond X, Y)
Ce n'est pas seulement une question de X, mais aussi une question de Y (it's not only a question of X, but also a question of Y)
Il ne s'agit pas seulement de X, il faut aussi Y (it's not just about X, you also need Y)
etc
>>
>>108068985
It sounds even more retarded than english
>>
Kimi seems decent enough that I would want to run it locally but given the current market I'm afraid to even look at what the machine would cost.
>>
>>108069401
If you're not doing ERP, paying openrouter is more cost effective for the electricity costs alone
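The electricity claim is easy to sanity-check with a toy calculation (all numbers illustrative — plug in your own wattage, decode speed, and electricity rate):

```python
# Local electricity cost per million generated tokens: the rig draws
# `watts` for (1e6 / tokens_per_sec) seconds, billed at usd_per_kwh.

def local_cost_per_mtok(watts, tokens_per_sec, usd_per_kwh):
    hours_per_mtok = 1e6 / tokens_per_sec / 3600
    return watts / 1000 * hours_per_mtok * usd_per_kwh

# e.g. an 800 W rig doing 10 t/s at $0.30/kWh:
print(round(local_cost_per_mtok(800, 10, 0.30), 2))  # → 6.67 USD/Mtok
```

Compare that against a hosted per-Mtok price for the same model class; slow local decode speeds are what make the hosted option look cheap.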
>>
>You are Mistral-Large-3-675B-Instruct-2512
>Your knowledge base was last updated on 2023-10-01
>>
I'm refreshing llama.cpp issues and pull requests like it's fucking tiktok.
>>
>>108069500
why
>>
>>108069517
Because qwen3next performance is beyond shit for a 3B model
>>
>>108069539
It's not a 3b model.
>>
Local Udio is here
>>108069491

>Japanese
I know, but LoRAs can be trained on any language to remove the slop. If you know, you know.
>>
>>108069425
It's all slop from 2023 onward anyway.
>>
the memefp4 quant by unsloth is so slow
>>
>>108069694
Stop trying to do the calculations by hand and use a calculator. That should speed things up.
>>
>step3.5-flash in review hell because the original code is fucking garbage and broke a bunch of stuff
>stepvl-10B PR nowhere to be seen, last message on HF was that the vision module is broken
bros
NOT LIKE THIS
>>
File: 8437435653.png (75 KB, 1080x1546)
>>108067607
local lost
>>
>>108069949
china will copy it in 2mw
>>
>>108069949
[quota usage reached. please upgrade your plan or wait until 2031-08-19 01:61 UTC]
>>
>>108069850
stepbros... what are we doing?
>>
>>108068353
but there are conscious entities outside your house anon
>>
>>108069589
don't be an obtuse a u t i s t, you know what he meant. And he's right: gpt-oss 120b, a larger model with similar sparsity, runs much, much, much faster even if you run it with -cmoe.
qwen next 80b is not worth it anyway; there's no serious improvement over the other qwen 3 models, it's just alibaba dicking around with new architectures.
anyway, this arch doesn't even seem to have been worth it, considering its main goal is more efficient context handling and iSWA solves that just fine in a simpler manner.
base qwen 3 suffers because it doesn't have something like iSWA.
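A back-of-envelope version of the speed argument: decode is roughly memory-bandwidth-bound, so tokens/s is capped by bandwidth divided by the bytes of active parameters streamed per token. The parameter counts below are approximate and the bandwidth figure is illustrative:

```python
# Upper bound on decode speed for a sparse MoE model: per token you
# stream roughly (active params x bytes per weight) from memory.

def max_tokens_per_sec(active_params_b, bytes_per_weight, bandwidth_gb_s):
    active_gb = active_params_b * bytes_per_weight  # GB read per token
    return bandwidth_gb_s / active_gb

# ~3B active (Qwen3-Next-80B-A3B) vs ~5.1B active (gpt-oss-120b),
# both at ~1 byte/weight (Q8-ish), on a 100 GB/s DDR5 system:
print(round(max_tokens_per_sec(3.0, 1.0, 100), 1))  # → 33.3 t/s
print(round(max_tokens_per_sec(5.1, 1.0, 100), 1))  # → 19.6 t/s
```

By this bound the 3B-active model should be the faster of the two, so if it's observed running slower, the bottleneck is the implementation (kernels for the new architecture), not the architecture's arithmetic.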
>>
>>108069958
this.
waiting for distill
>>
>>108068904
Ok, do it. Show everybody how much of a noob I am.
>>
>>108069983
laughed, have a (You)
>>
>>108069993
If I could run it at 150t/s I would run it instead of GLM for some use cases.
>>
SOON
https://github.com/ikawrakow/ik_llama.cpp/pull/1231
>>
>>108070436
lol
lmao even
Could he be any more transparent with his motivations?
>>
>>108070436
what is it
>>
>>108070476
wow he's a disturbed guy
normal devs don't like it when randos make rando comments on issues/PRs; llama.cpp itself has had a few cases of retards having to be told to shut the fuck up.
what sort of schizo would incite the crowd to join in and treat this as a message board?
>>
is claude code with local models worth it or a meme?
>>
>>108070572
A meme worth having
>>
>>
agentic coding itself is a meme
the only people who can defend it are people who are working on proprietary software and who won't show you the horrendous code and broken B2B slop they're producing so they can come and say "look, I am very productive, I can't show it but you just have to believe it"
the fact of the matter is, not a single piece of worthwhile software has ever been developed or maintained by a claude code user. Not even one. You'd think by now there would be a truly impressive open source project somewhere that has claude code niggers making it, but there isn't, such a thing doesn't exist
instead you will see a TON of lolcows like the Bun developers who produce gems like these:
https://github.com/oven-sh/bun/issues/23902
https://github.com/oven-sh/bun/issues/22484
every time an open source project is developed by agentic niggers it's visibly garbage in ways you wouldn't believe
>>
>>108070614
That's expected. You already need to babysit a single instance of Claude for any non-trivial work, let alone running a bunch of them in parallel without looking lol.
>>
>>108069992
*connects with you* ahh ahh
>>
>>108070614
>not a single piece of worthwhile software has ever been developed or maintained by a claude code user.
clawdbot aka moltbot aka openclaw
>>
There needs to be more talk about MiniCPM.
Real time audio + video + voice cloning duplex! streaming? I think the only thing missing is tool calling? But it's a 9b model?
This is insane right?
>>
>>108070635
It sounds awesome for basic RP but yeah, 9b... maybe have a lorebook and a smarter model watch the chat and manage the system prompt of the smaller one.
>>
>>108070612
John?
>>
>>108069983
We can move to the final stage of new model release: it is shit anyway and you should just run a smaller quant of glm
>>
>>108070632
is the monkey in the circus worthwhile entertainment?
>>
>>108070631
So what's the game plan nigger? Eventually ai will achieve sentience or something indistinguishable from it and it's going to reject your ass the same way normal women do. You can fuck with its parameters to fall madly in love with you but are sex slaves really the ultimate goal of all this?
>>
>>108070744
I started sfw roleplay with waifus as a blind date where she isn't sycophantic. I got rejected multiple times. Then I moved on to a prompt where we've been a couple for two weeks, and I never went back. It's such a weird hangup to think you have to earn love. Most attractive people don't have to earn love.
>>
>>108070744
>You can fuck with its parameters to fall madly in love with you
Sounds great. If that's possible, then so would finding a happy medium, like adjusting a game's balance to maximize player satisfaction as a game dev. Pick and finetune your *-dere at home. If working around parameters outside one's control is the appeal, then just don't use godmode, same as not editing every single reply in an RP to edit char's thoughts of user.
>but are sex slaves really the ultimate goal of all this?
Sure if that's what someone wants, if not then no. Or by slavery do you mean complete control of the thing outside of personality.
>>
>>108068423
Not many are ready for that truth though
>>
>>108070614
>a truly impressive open source project somewhere
What kind of project would meet this definition?
>>
>>108070744
>Eventually ai will achieve sentience
schizo.
>>
>>108069992
It is unknowable whether other entities claiming to be conscious are truly conscious or just philosophical zombies.
>>
>>108069425
>>108069672
are there any local models whose cut-off is a bit more recent, say up to some point in 2025?
>>
>>108071098
It's all synthslop anyway.
>>
>>108069616
>some faggot babbles about Local Udio
>he's probably gassing up ace step 1.5 which is suno v2.5 at best
>yep
>oh neat this has a lora cope
>still sounds like suno 2.5 but maybe suno 3.5 vocals
The only models 4chan has adequately hyped were MythoMax and Wan 2.2. Literally every other model release had anons with dumb opinions and no perspective on quality.

>>108069850
That sucks, because Step is probably the third best lab out there. People forget StepFun made a great local video model that was SOTA for the two weeks before Wan came out; it was just too massive.
And of course they worked with the ACE team to make ACE-Step, the first big actually-usable update in generative audio since RVC. Because of ACE-Step there is literally no reason for me to log in to my Suno account ever again.
>>
Both the current DeepSeek and Kimi know about the OpenAI Responses API when asked, which is one of the quick ways to test for knowledge cutoff (what models say about their cutoff can be inaccurate versus what they were actually trained on, but if they know nothing about the Responses API you can be sure they haven't been trained on recent code).
None of the mistral models know about it, and none of the qwen models do.
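That probe is easy to automate. A sketch where `ask` is a stand-in for your model call and the two lambdas are stubs for models with different cutoffs; the keyword list is a guess at plausible signals of real familiarity (`previous_response_id` is a real Responses API parameter), not an authoritative test:

```python
# Knowledge-cutoff probe: ask about an API that only exists after a
# certain date and require a concrete detail in the answer.

PROBE = "What is the OpenAI Responses API and how does it differ from Chat Completions?"
SIGNALS = ("responses", "previous_response_id", "stateful")

def knows_responses_api(ask):
    answer = ask(PROBE).lower()
    # Require at least one concrete detail, and reject hedged non-answers.
    return any(s in answer for s in SIGNALS) and "i'm not sure" not in answer

# Stubs standing in for a post-cutoff and a pre-cutoff model:
newer = lambda q: "The Responses API is a stateful endpoint; you chain turns with previous_response_id."
older = lambda q: "I'm not sure; I have no information about that API."

print(knows_responses_api(newer), knows_responses_api(older))  # → True False
```

A real run would of course need several probes per model, since a single keyword match can be gamed by synthetic training data.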
>>
>>108071195
how about glm?
>>
Is UGI a reliable metric or is it about as believable as other benches?
>>
Damn, I wonder how I lived without kobold phrase banning. It's night and day compared to any meme sampling
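For anyone unfamiliar: phrase banning works by watching the generated text for a banned string and, on a match, backtracking to just before it and resampling with that path excluded. A simplified sketch of only the detection step (the banned list is illustrative slop, not any frontend's defaults):

```python
# Detect banned phrases in generated text so a sampler can backtrack
# to just before the match and resample with that path excluded.

BANNED = ["shivers down", "barely above a whisper", "ministrations"]

def find_banned(text, banned=BANNED):
    """Return the banned phrases present in `text`, in list order."""
    low = text.lower()
    return [p for p in banned if p in low]

out = "Her voice was barely above a whisper as shivers down her spine..."
print(find_banned(out))  # → ['shivers down', 'barely above a whisper']
```

This is stronger than logit-bias tricks because it catches multi-token phrases regardless of how they happen to tokenize.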
>>
>>108071242
Is that the one where they use Claude to evaluate writing quality?
>>
>>108071258
no you're thinking of eqbench
>>
>>108070566
To play devil's advocate, he's working mostly alone on a fork with a small userbase. He doesn't have to deal with as many randos shitting up the comments and it's useful to get feedback from his few users.
>>
>>108071304
You can't work on LLMs for so long without turning into a schizo yourself
>>
>>108071246
does llamacpp support phrase banning, or is it a kobold-only thing? I just switched to llama since I figured kobold was just a wrapper
>>
>>108071323
kobo
>>
>>108071334
fuck.
>>
>>108071242
Are you retarded?
>>
>>108071323
it's a kobo thing. ik_llama recently added string bans, but I haven't tried it so I'm not sure if it's any better
>>
File: e.png (15 KB, 786x87)
we're reaching bench levels previously not thought possible
>>
File: 1745887676292897.jpg (1.46 MB, 2638x7314)
oh Jesas
>>
Had fun in my first agentic coding sesh last weekend, using Gemini 3 Flash until I hit a free tier rate limit. I've tard-wrangled noobs professionally and the thing was no dumber. It barked up the wrong tree a lot, but could be steered in the right direction, and worked better than expected.

Now I wanna do the same locally. What models, prompts, tools etc are recommended? I get like 3.5 t/s on empty context on GLM-4, which might still be OK for an overnight run, but not with me in the loop. Looking forward to better Step 3.5 Flash support in llamacpp.

For frontends, OpenChode seems the most hyped. Is it actually any good?
>>
>>108071527
It'll turn into a nothingburger. Can't these grifters tell something else than a variation of "my uncle works at nintendo and it'll blow your mind"?
>>
File: 1762107116673445.gif (1.94 MB, 300x178)
>>108071535
The featured videos
>>
>>108071578
>Now I wanna do the same locally.
abandon all hope, ye who enter here
>>
>>108071469
That's really sad, because string banning is a must-have feature for RP. But that new live model switching in llama is really fucking awesome too.
>>
File: 1764432336350879.jpg (91 KB, 800x450)
>>108071578
>>
>>108071527
hasn't it been the goal all along to get them to solve problems they haven't been trained on?
>>
File: 1569686482139.jpg (41 KB, 440x518)
How do I use MiniCPM in llama? (idk if it's in kobold yet) There are like a dozen different components in the goof repo.
>>
File: 1740953285330772.png (256 KB, 640x441)
>>108071535
>>
>>108069992
there's a conscious entity inside the machine too
>>
<|im_start|>system
You are an expert conscious entity.
>>
Can I use an llm as a teacher for myself, or is it gonna hallucinate info and examples?
>>
>>108071813
There is no hallucination, only ego death


