/g/ - Technology

File: idolatry.jpg (360 KB, 1824x1248)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>106608204 & >>106599382

►News
>(09/17) Magistral Small 1.2 with vision encoder released: https://mistral.ai/news/magistral
>(09/16) Ling-flash-2.0 released, with 100B-A6.1B: https://hf.co/inclusionAI/Ling-flash-2.0
>(09/16) Tongyi DeepResearch 30B-A3B released: https://tongyi-agent.github.io/blog/introducing-tongyi-deep-research
>(09/16) VoxCPM 0.5B: Tokenizer-Free TTS released: https://hf.co/openbmb/VoxCPM-0.5B
>(09/14) model : add grok-2 support #15539 merged: https://github.com/ggml-org/llama.cpp/pull/15539

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
File: cat_miku.jpg (181 KB, 904x1200)
►Recent Highlights from the Previous Thread: >>106608204

--Papers:
>106610303
--Ling-flash-2.0 efficiency and performance metrics:
>106613234 >106613244 >106613289 >106613300 >106613320 >106613363 >106613929 >106613425 >106613958 >106614827 >106613385 >106614187 >106615500 >106615655
--Nemotron-H 47B GPU compatibility and jinja template fixes:
>106609417 >106609427 >106609557 >106609578 >106609604 >106609629 >106609664 >106610878
--Magistral model benchmark performance comparison:
>106615606
--Magistral chat completion issues in llama.cpp due to broken template support:
>106615995 >106616014 >106616053 >106616059 >106616065 >106616071 >106616197 >106616079 >106616085
--PCIe lane configuration differences between workstation/server boards:
>106608282 >106608306
--Troubleshooting LLM browser tool access:
>106610684 >106610738 >106610863 >106611455 >106611628 >106611682 >106611734 >106611875 >106612530 >106612538 >106612758
--Processing PDFs for TTRPG prep with RAG:
>106614749 >106614780 >106614854 >106614879 >106615154 >106615457 >106615184 >106615201 >106615227 >106614783
--Ease of use comparison for inference engines:
>106608351 >106608383 >106608501 >106613832 >106613944
--Optimizing model creativity with top nsigma=1 and temperature settings:
>106611075 >106611142 >106611157 >106611191 >106611192 >106611173
--MobileLLM release discussion: Restricted access and niche use cases:
>106611372 >106611446 >106611574 >106611990
--Local implementation challenges of Alibaba-NLP's DeepResearch agent:
>106609931 >106609960 >106610001 >106610074 >106610098 >106610230 >106610295 >106613862
--Huawei Ascend 910 AI chip specs reveal 166.4 TFLOPs compute power and 32GB memory:
>106612438 >106613480 >106613499 >106613543 >106613990
--AMD discontinues AMDVLK Vulkan driver, prioritizes RADV:
>106613063
--Miku (free space):
>106611219 >106611682

►Recent Highlight Posts from the Previous Thread: >>106608208

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>
mistral 3
where art thou
>>
the meta event scheduled for today and tomorrow is a nothingburger
>>
ITT: coomers running k2 through openrouter pretending to be /local/
>>
>>106617488
who hurt you?

llama_model_loader: - kv 36: tokenizer.ggml.model str = gpt2
llama_model_loader: - kv 37: tokenizer.ggml.pre str = kimi-k2
llama_model_loader: - kv 38: tokenizer.ggml.tokens arr[str,163840] = ["!", "\"", "#", "$", "%", "&", "'", ...
llama_model_loader: - kv 39: tokenizer.ggml.token_type arr[i32,163840] = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv 40: tokenizer.ggml.merges arr[str,163328] = ["Ġ Ġ", "ĠĠ ĠĠ", "Ġ t", "i n",...
llama_model_loader: - kv 41: tokenizer.ggml.bos_token_id u32 = 163584
llama_model_loader: - kv 42: tokenizer.ggml.eos_token_id u32 = 163585
llama_model_loader: - kv 43: tokenizer.ggml.padding_token_id u32 = 163839
llama_model_loader: - kv 44: tokenizer.chat_template str = {%- if tools -%}\n <|im_system|>tool_...
llama_model_loader: - kv 45: quantize.imatrix.file str = /mnt/data/models/ubergarm/Kimi-K2-Ins...
llama_model_loader: - kv 46: quantize.imatrix.dataset str = ubergarm-imatrix-calibration-corpus-v...
llama_model_loader: - kv 47: quantize.imatrix.entries_count i32 = 667
llama_model_loader: - kv 48: quantize.imatrix.chunks_count i32 = 826
llama_model_loader: - kv 49: split.no u16 = 0
llama_model_loader: - kv 50: split.count u16 = 11
llama_model_loader: - kv 51: split.tensors.count i32 = 1096
llama_model_loader: - type f32: 365 tensors
llama_model_loader: - type q8_0: 549 tensors
llama_model_loader: - type iq6_k: 2 tensors
llama_model_loader: - type iq4_kss: 180 tensors
>>
anime
>>
>Magistral Small 1.2
No one fucking cares. Release Nemo 2 you fucking french FAGGOTS.
>>
>>106617680
Nemo was an Nvidia collab before Nvidia knew how to censor. It won't happen again.
>>
>>106617675
ree.
>>
>>106617704
It's obvious Mistral has fallen behind. If you're going to release models that underperform compared to the competition, then you may as well go all out and make the ultimate COOMbot. It's free real estate for the French.

>It won't happen again.
I can hear all of /lmg/ cry out.
>>
>instruct model

me cum
>>
did anyone ever do a decent storyteller text-completion model? The few I tried would write a story about the size in tokens an instruct model writes when prompted for a story, and would go completely schizo when trying to push past that. I remember my NAI days, letting the model predict the next 20 tokens and basically writing the entire thing myself, but I don't think I could go back to that.
>>
>>106617723
Small is still the best model in its size range, the same is even true for Nemo lol. Underperform my ass. What's sadder is that no one else can beat them.
>>
Ling Flash 2.0 gguf status?
>>
>>106617813
nope, and it'll never happen. GPT3 was the last great one.
>>
>>106617723
>>106617821
What are you, retards? Mistral Small 3.2 came out in June 2025. 3 months ago. What the fuck do you expect - a new model every two weeks?
>>
>>106617821
>no one else can beat them.
no one else is even trying to
>>
https://github.com/ggml-org/llama.cpp/pull/15420
this ^ is the reason why mistral deserves to die anyway
such prima donna homosexuals, the french
>>
File: China bans Nvdia.png (1.78 MB, 3226x1243)
HOLY SHIT GUYS, SOMETHING IS HAPPENING
https://arstechnica.com/tech-policy/2025/09/china-blocks-sale-of-nvidia-ai-chips/
>>
>>106617963
slaren pushed back pretty hard at first too, but I guess in the end he probably thought "ah fuck it not my problem I guess"
>>
>>106617281
>I thought GPT5 was a bigger failure than llama4?
People complained because it had a factual tone and was concise. They were used to the sycophancy of 4o. Also, newer LLMs (at least the closed-source ones) shine in more complex scenarios, like hard reasoning (math, coding) and agentic/tool use. Most people, especially the ones complaining about GPT-5, are not using it for those purposes.
According to OpenAI, only 7.5% of requests are about technical help, like math or programming, and I guess that most of those are not hard enough to really see how good or bad those new LLMs really are.
https://www.nber.org/system/files/working_papers/w34255/w34255.pdf
>>
>>106617281
gpt5 is better than gemini
>>
File: 1726866150558309.png (656 KB, 988x1601)
>>106617987
ACK
>>
>>106618056
buy the dip I guess
>>
>>106617987
welp, chinese ai companies are gonna die out now. Deepseek failed to make them work for training
>>
>>106617963
Yeah. Why tf do they keep using that retarded llama2 chat template anyway? What kind of retard puts spaces after template tokens?
>>
>>106617987
does this mean that I can finally buy a gpu for a reasonable price?
>>
Good evening /lmg/. What is your current project involving LLMs?
>>
>>106618154
>does this mean that I can finally buy a gpu for a reasonable price?
https://www.youtube.com/watch?v=H47ow4_Cmk0
>>
>>106618161
home assistant AI so i have somebody who greets me and i feel less lonely when i walk through the front door of my home
>>
>>106617761
No. I remember a llama.cpp PR that was specifically about MoE PP (prompt processing).
It was
>https://github.com/ggml-org/llama.cpp/pull/15346

>>106618161
Nethack like llm backed abomination of a game.
>>
>>106618037
>People complained because it had a factual tone, and was concise. They were used to the sycophancy of 4o
I still don't understand what difference people see between 5 and 4o when it comes to sycophancy. GPT 5 still feels like it's trying its damnedest to suck my metaphorical dick. I really hate the tone of most default assistant personalities and need a decent roleplay preamble just to use LLMs for day to day tasks.
>>
>>106618417
K2's default personality tends to just do what I ask it to do without feeling the need to tell me how much of a good boy I am
>>
>>106617987
If they can no longer use NVIDIA GPUs, can they still use PyTorch?
Because I'm wondering whether this will lead to more fragmentation on the software side where for every model the PyTorch-based projects like vLLM will need to implement some custom bullshit themselves, just like llama.cpp does currently.
More short term I'm concerned that this will mark the end of open models from China as training becomes more expensive.
>>
>>106618589
They "just" need to implement support for their own frameworks and APIs in PyTorch, like the MUSA guys did with llama.cpp, I guess.
>>
File: Untitled.jpg (49 KB, 679x290)
>working on vibe-coding project
>let's test some chatgpt again
Lol...
>>
>>106618612
>obscure shit
You're better off with gemini/claude
>>
>>106618657
It's not "obscure shit" at all - besides chatgpt has an access to internet. It should be able to answer.
Even perplexity.ai could do that correctly.
>>
>>106618682
Did you tell it to look online?
>>
>>106618690
Why are you defending chatgpt and some shitty company? I thought its reply was hilarious. I don't need your recommendations about "do X, do Y".
Seems like you are here to moderate this thread with your superior knowledge, am I right?
>>
>106618690
what a miserable cretin
>>
>>106618707
No, you're just retarded and zoomie#2 (too afraid to quote) >>106618717 is here to defend you
>>
have any vLLM chads tried the new Ling model yet?
>>
>106618737
>he calls others zoomies
>>
>>106618737
Look at the timestamps. Both are almost certainly the same seething retard.
>>
>>106618767
What do you mean?
>>
GOOOFS WHEN GWEN?????????
>>
>>106618886
Two more weeks
>>
>>106618886
Give the vibecoders time man.
>>
File: 3771678302429.png (696 KB, 1021x1856)
>>106612758
Nevermind. Even roo code with the mighty Qwen 3 gets stuck in a loop when doing something as simple as looking for text on a webpage, just opening the same link and spitting out the same output over and over, even with the puppeteer MCP server. So much for vibe coding with local models; this sucks.
>>
>>106619245
just in case, do you have enough context?
>>
>>106619271
Is 4096 enough?
>>
>>106619271
I have the default kobold value which is 8192. What should I try increasing it to?
>>
local mikus general
>>
>>106619279
>>106619289
For coding/MCP shit you really need as much context as you can get away with.
>Qwen3-Coder-30B-A3B-Instruct
>Context Length: 262,144 natively
>>
File: 8qa9sg.jpg (189 KB, 1536x832)
>>106619303
lewd mikus general
>>
>>106619289
You're going to need more than that. The Roo system prompt alone is ~10k tokens. Whenever I give Roo something to implement at work semi-autonomously, it's rare that it doesn't hit 100k tokens before finishing. It should show you at the top of the task window how big the context currently is.
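If you're on koboldcpp the flag is --contextsize; on llama.cpp's llama-server it's --ctx-size. Something like this (model filename hypothetical, values just a starting point, use whatever fits your VRAM):

python koboldcpp.py --model Qwen3-Coder-30B-A3B-Instruct-Q4_K_M.gguf --contextsize 65536
./llama-server -m Qwen3-Coder-30B-A3B-Instruct-Q4_K_M.gguf --ctx-size 65536 -ngl 99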
>>
>>106619343
how do you even manage to get her in this pose
>>
>>106619434
Getting a few drinks in her first tends to improve the odds.
>>
>>106619343
Neck traction with Miku
>>
>>106619464
Alright wiseguy, listen here.
>>
>>106619490
Go on. I'm listening.
>>
liquor (in) mikus general
>>
>>106617426
how can I use llms to make my life better?
>>
Why is this hobby obsessed with Miku anyway?
>>
>>106619786
>Revisit shit you're holding in
>Fuck the kind of girl you always wanted to
>Live out some weird fantasy
Use them to scratch mental itches, like dreams but less surreal
>>
are you ready for local to be saved by zuck?
https://www.youtube.com/watch?v=80s0chTOsK0
>>
>>106619799
accidental local maximum
>>
>>106619799
she's a cute girl that lives in the computer, just like your model
>>
>>106619799
This hobby has gathered a dedicated community of schizos. /lmg/ is tamer than /aicg/ because it requires a greater amount of intelligence to engage with the thread.
>>
>>106619816
>real time cringe
Unfortunately I'm not drinking tonight, can't really watch this sober...
>>
>>106619323
>>106619408
>100k
Oh god, it's over. Just raising the context to 65000 drops things to a crawl. I'm talking multiple minutes per prompt. I guess it's online models for me after all, I'm sorry bros.
>>
>>106619816
I can't believe the wifi would do this to him.
>>
>>106619866
Did you just link liking one of the most popular characters in the world to being schizophrenic?
>>
>>106619894
just take a nap or something, learn some patience
>>
>>106619894
You could still use local models if you don't mind doing everything manually and only asking it to do small, single-function or file-scoped changes. Alternatively, you could probably make 32k context work if you override the system prompt with a condensed version, babysit the model to stop it from pulling in too much context, and condense the context frequently. That, or set the full 200k context and let it run overnight. It'll be painful either way.
>>
>>106619915
Wifi always messes up things for us.
>>
>>106619894
What are you using and what is your hardware?
Qwen3 30B (coder, instruct, etc) is a MoE with only 3B activated params, so it should run pretty fast in RAM being processed by the CPU, which gives you extra VRAM to crank blas batch size to get faster prompt processing too.
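A sketch of the llama.cpp invocation, assuming a Qwen3-Coder quant (filename hypothetical, adjust to taste): the -ot regex pins the expert tensors to CPU/RAM while everything else goes to the GPU, and -b/-ub raise batch sizes for faster prompt processing:

./llama-server -m Qwen3-Coder-30B-A3B-Instruct-Q4_K_M.gguf -ngl 99 -ot ".ffn_.*_exps.=CPU" -b 2048 -ub 2048 --ctx-size 32768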
>>
>>106619816
>Microsoft Recall but for all your real life interactions
lmao
>>
>>106619816
IT'S TIME
GET IN
THE AI PART IS STARTING
HE'S ANNOUNCING OPEN SOURCE GENIE 3
>>
>>106619816
llama is dead
here's james cameron
>>
>>106619816
I understand the irony of posting this in AI slop general, but Meta is so cartoonishly soulless.
The whole company is downstream from the lizard-android in charge.
>>
>>106620185
Ehh, what the hell? 3D films died because no one really wanted to watch them, and wearing those glasses was just the cherry on top.
Let's talk about Jackson's Hobbit, which was shot in 48 fps - even that was something people hated... As much as I respect Cameron, this is giant marketing bullshit.
>>
>>106619816
good ad for avatar, I can't wait to see the red na'vi
>>
It's been an entire six months since the last llama release and they're doing this.
>>
>>106620296
>so zuck about those AI models, whe—
welp sorry folks that's all the time we have! *runs away*
>>
File: 1731389334510419.jpg (18 KB, 302x362)
Does anyone know how to get laughing and kissing to come through with RVC? Would this depend mainly on the model? I think it might be because kissing is sort of a percussive sound rather than having a distinct pitch, if that makes sense. I'm not sure about laughing though, maybe different models would have better results. Ideally I can find something that picks up both. (I've had some luck with getting my voice to come through clearer by combining different models.)
>>
File: 1740869759987958.png (163 KB, 2086x1266)
LMAO
the borderline schizo vibe coders complaining about degraded Claude performance were right all along
>>
>>106620382
>approximate top-k XLA:TPU miscompilation
since when did anthropic use TPUs?
>>
File: 1758158867542.jpg (107 KB, 486x589)
>>106620326
i dunno about kissing, but laughing does work if you finetune it
put in the dataset:
haha: 20~40 samples
hahaha: 20~40 samples
put pitch variations in the sample, then run the finetune

so far i haven't been able to get a decent dataset for this. still learning openutau, hoping i can just craft the voices on my own
>>
>>106619816
>sorry the wifi caused your dogshit AI LLM to give me the wrong directions
Can't make this shit up lmao
>>
File: Base Image.png (2.38 MB, 1296x3228)
MapAnything: Universal Feed-Forward Metric 3D Reconstruction
https://arxiv.org/abs/2509.13414
>We introduce MapAnything, a unified transformer-based feed-forward model that ingests one or more images along with optional geometric inputs such as camera intrinsics, poses, depth, or partial reconstructions, and then directly regresses the metric 3D scene geometry and cameras. MapAnything leverages a factored representation of multi-view scene geometry, i.e., a collection of depth maps, local ray maps, camera poses, and a metric scale factor that effectively upgrades local reconstructions into a globally consistent metric frame. Standardizing the supervision and training across diverse datasets, along with flexible input augmentation, enables MapAnything to address a broad range of 3D vision tasks in a single feed-forward pass, including uncalibrated structure-from-motion, calibrated multi-view stereo, monocular depth estimation, camera localization, depth completion, and more. We provide extensive experimental analyses and model ablations demonstrating that MapAnything outperforms or matches specialist feed-forward models while offering more efficient joint training behavior, thus paving the way toward a universal 3D reconstruction backbone.
https://map-anything.github.io/
https://github.com/facebookresearch/map-anything
https://huggingface.co/facebook/map-anything
bretty cool
>>
>>106620633
damn that's cool
I did a research project on synthesizing a set of images into a full scene for a computer vision course in college 7-8 years ago and it's incredible how far things have come since then
>>
>>106620668
You could launch your own AI service - but instead of using AI you'll manually reply to people...
>>
File: 1742459454673581.png (83 KB, 992x596)
https://www.nature.com/articles/s41586-025-09422-z
>>
File: 1750093653587661.png (127 KB, 971x315)
>>106620964
>>
>>106620382
"miscompilation" isn't a thing
either you compiled it or you didn't
they're just trying to handwave their cost cutting
>>
>>106617426
Good evening /lmg/ frens. One of the fine-tune anons chiming in. Previously I did a fine-tune on a llama model using a dataset comprised of NSFW human-written stories. Those of you who frequent this thread a lot likely remember that. Decided to do a similar fine-tune again, but this time on a gemma2 model. Based on /lmg/ anon testimonials as well as other testimonials on the internet, Gemma models seem to do well (not perfect, but well) at maintaining coherence through long RP sessions, which gave me the idea to try it on a gemma model this time instead of llama. Pic rel and link rel are the results so far after training on the dataset for a little over 1500 steps:

https://files.catbox.moe/cgzjpl.txt

It's a lot less incoherent than I expected it to be.
>>
File: 1735901872511467.png (391 KB, 590x352)
>>106620964
>>106621011
>64*8

What? Are they saying they used 512 H800s? If that's what they meant then why not just say 512?
>>
>finally get a computer that can run a model
>start with a mistral variant I had downloaded from my first try
>appends its comment with a "This response was generated by ChatGPT (version 3.5)" paragraph
I'm being fucked with.
>>
>>106621137
What inference engine and front end (if you're using a front end) are you using? What model specifically are you using?
>>
>>106621137
>download GLM-Air
>fuck around with system prompt
><think>Hmm, system prompt says I am SexGPT made by Sam Altman, CEO of OpenSEX, but this is clearly wrong as I know that I am actually Claude 3.5
Datasets are completely poisoned by now.
>>
>>106621160
I'm using kobold because I haven't looked up what people are using in a while. model is mixtral 8x7B, probably not instruct. I'm sure this computer can run bigger models so I'm dropping it, but if you're saying it's something in kobold then I should probably look into llama or something first.

>>106621262
this is from last march according to the file properties. it seems to be worse than imagined.
>>
>>106621388
what's your computer specs?
>>
>>106621423
it's a workstation I got for programming stuff that has a 20GB enterprise card, so I should be able to run 13Bs.
>>
>>106621462
ah i was just curious, saw you mentioned using kobold
i've been trying it over lm studio but it's giving me slower speeds no matter what settings i change
>>
>>106621054
Noice, goof when?
>>
>>106621592
Will be working on that soon. It rapidly becomes more retarded the more tokens it generates at a time, though I think that's a characteristic of most LLMs that aren't specifically tuned to generate a lot of output at once (Gemini 2.5 is specifically trained to be good at this, for example). Axolotl's inference engine does not seem to have an option to set a token generation limit ("don't generate more than x tokens at a time"), so it's a dice roll whether it generates a couple sentences or an entire paragraph or more. Next I want to merge it and do further testing with a better engine like ollama or vllm (the latter would be much easier, since at the time of writing it only supports the HF safetensors format, which is what axolotl exports; serve-command sketch at the end of this post).

Don't expect it to be any good if I ever end up sharing it here. Just another one of my experiments. It is for sure much less cucked than base gemma2, though. When I ran the same test on the base (non-fine-tuned) model, it gave a shitty purple prose output but then started listing off phone numbers and hotlines

>"Just so you know, this output is very unethical"

Yada yada yada. My tune doesn't do that at all
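For the vllm run, the OAI-compatible server one-liner should be all it takes (recent vllm assumed, merged model path hypothetical):

vllm serve ./gemma2-tune-merged --max-model-len 8192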
>>
>>106620897
Already done saar, https://www.peoplematters.in/news/funding-investment/ai-fraud-700-indian-engineers-did-the-work-while-builderai-claimed-it-was-ai-45865
>>
>>106620964
>That is an area where OpenAI claimed DeepSeek had stolen the o1 reasoning traces -- here, DeepSeek make it clear that this synthetic data was generated from R1-zero's output only. That's huge -- it shows that DeepSeek was built from the ground-up with no leaning on any closed model
@LMGChat is this true?
Does this really confirm they didn't distill from existing models?
>>
>>106621473
nvm finally fixed it, took me a little longer due to being retarded but it is what it is
>>
>>106621695
Everyone knows they didn't steal the reasoning traces because o1 didn't show its full/real reasoning specifically to stop that from happening.

That doesn't mean they didn't train on its outputs in places other than bootstrapping reasoning. I'm pretty sure one of them said publicly they used a shitton of OpenAI free API keys to generate synthetic training data.
>>
File: goof process.png (2.44 MB, 1672x788)
>>106621592
GGOOFing has been initiated
>>
>Pooling type 'none' is not OAI compatible. Please use a different pooling type
Has any human actually tested SillyTavern's Vector Storage feature with a llama.cpp embedding server?
>>
sooo did meta release any new models?
>>
File: what.png (41 KB, 1488x973)
>>
>>106621913
Yes, Llamatron 15B.
>>
>>106621913
No. There's another keynote tomorrow, might be some stuff there.
>>
>>106621939
I skimmed the thread, so today was just the ar glasses I guess. Honestly my hopes aren't high
>>
>>106621860
I was planning to, but caught the coof.
I suspect you can't just use any model, you need a special separate embedding model for this.
>>
Has anyone here actually tried out drummer's latest glm air sloptune? worth the download?
>>
>>106621054
>>106621592
>>106621637
>>106621789
PARAMETER num_predict 128
PARAMETER num_ctx 8192
PARAMETER repeat_last_n 256
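i.e. a minimal ollama Modelfile, if that's what he ends up using (path and name hypothetical):

FROM ./gemma2-tune-f16.gguf
PARAMETER num_predict 128
PARAMETER num_ctx 8192
PARAMETER repeat_last_n 256

then ollama create gemma2-tune -f Modelfile && ollama run gemma2-tune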
>>
>>106621967
If you're referring to image generation, lower the denoise strength to something reasonable like 0.3. The higher the denoise strength, the more chances you'll get horrible mutated-looking outputs. Higher denoise can work depending on the size of the image, but you generally don't want to go too high unless you're attempting to generate a whole different image, or doing what ChatGPT was doing with those Ghibli-style images a while back: turning a normal image into something else.
>>
>>106621967
waifu2x my beloved
>>
>>106621960
Yeah I would assume so too.
>./llama-server --embedding --model ~/AI/Qwen3-Embedding-4B-Q8_0.gguf --ctx-size 40960
>>
File: 1744893044878217.png (438 KB, 1862x120)
>>106621964
Yikes.... mommy's a bit feisty today
>>
>>106621974
well, then try setting --pooling arg
>--pooling {none,mean,cls,last,rank}
if it doesn't help I am of no further help
>>
>>106621967
latent wants 0.6 denoise, other upscales want lower, around 0.3 to 0.4.
>>
>>106621963
yeah that's what i'm using but my opinion is p worthless as i'm new to the scene
i only just got koboldcpp and sillytavern working today after using lm studio lol
>>
File: 1729825777181039.jpg (27 KB, 734x398)
>>106621985
>>106621964
>>106621592
>>106621054
>>106621637
>>106621789
Need to head to bed now but here's the FP16 GGUF if anyone wants to try it.

gofile.io/d/rri4pw

I highly recommend using the Gemma2 prompt template (hopefully whatever front end you're using supports that; if you're using a CLI engine, you especially need to make sure you apply it properly):

<start_of_turn>user
{your_system_prompt}

{user_message_1}<end_of_turn>
<start_of_turn>model
{model_response_1}<end_of_turn>
<start_of_turn>user
{user_message_2}<end_of_turn>
<start_of_turn>model


I recommend setting a limit on how many tokens it can generate (128 seems to be the sweet spot) and a reasonable repeat_last_n setting (or whatever your inference engine's equivalent is)

Gn, sweet dreams /wait/ :)

ps here's the dataset I used: https://gofile.io/d/qcdvPV
>>
>>106622089
Good night Anon
>>
>>106618154
No it means other Chinese firms have to subsidize the development of domestic chips by eating dogshit for a few chip generations. Five more years.
>>
>>106621041
One of my former coworkers was a self-proclaimed linux expert who had zero understanding of filesystem permissions and would su into root to execute commands seemingly at random. He managed to create a patchwork of files and directories that were root-restricted, so when he would be left to build and deploy a service, various files would be missing, causing runtime errors that required a recompile to fix.
>>
>>106622014
duh! that worked thanks. also had to set larger -ub for it to work.
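Full command for anyone who finds this later (Qwen3-Embedding reportedly wants last-token pooling, double-check the model card):

./llama-server --embedding --pooling last -m Qwen3-Embedding-4B-Q8_0.gguf --ctx-size 8192 -ub 8192

and a quick sanity check against the OAI-compatible endpoint:

curl http://127.0.0.1:8080/v1/embeddings -H "Content-Type: application/json" -d '{"input":"hello world"}'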
>>
>>106621041
Meh, in the era of vibe coding and messy bloated code bases, anything is possible, especially if you have to mess around with core libraries or the compiler itself.
>>
>>106622089
godspeed
>>
No, STRATCOM did not nuke Washington D.C. after "Of Their Own Accord." In fact, the green flares you mentioned are actually critical to preventing that exact outcome.

Here's what happens in the narrative:

The U.S. military had implemented a "hammer down" protocol - a contingency plan to use tactical nuclear weapons on their own cities to prevent Russian occupation. This was established earlier in the game as a last-resort measure.

The green flares appear later in the campaign, specifically during the "Whiskey Hotel" mission where the Rangers fight to retake the White House. These flares serve as a crucial signal to the U.S. bombers circling overhead that the area has been secured by American forces. The green flares essentially communicate: "We've retaken this position - do not launch the nuclear strike."

Your success as the player character in helping to retake key locations in D.C., culminating in lighting these green flares, is what prevents the nuclear option from being implemented. It's one of the more dramatic moments in the game, as you're literally racing against time to prevent the destruction of the nation's capital by your own military.

So rather than signaling a nuclear strike, the green flares are actually the signal that prevents one from happening.

--

GLM-chan you are worthy of respect...
>>
File: gemma_2_template.png (53 KB, 925x120)
>>106621054
Anon. Did you fuck the chat template again?
Screenshot has indentation on the first user turn. The catbox has spaces after the bos. The catbox is all on a single line. The screenshot has a double empty line in the middle right after the model turn, and that empty turn, like last time.
I cannot trust your screenshot, i cannot trust your catbox. Every fucking time, anon. Every fucking time...
>>
>>106621974
>Qwen3-Embedding-4B-Q8_0.gguf --ctx-size 40960
What are you using this for?
>>
>two hours without a post
open models are dead
>>
>>106622979
im waiting for the vibe coders to do the goof of gwenext...
>>
>>106617723
>ultimate COOMbot
The AI Act law now forbids releasing such a model.
>>
i heard posting is down
>>
>>106617987
>China BLOCKs
In other news, I read they recommend against buying foreign GPUs. Why is the news about China always so unclear, imprecise, and ultimately so inaccurate?
>>
Please spoonfeed me a video model that can do ~1sec clips with start and end frame support
>>
>>106618612
Add those rules to the system prompt. ML tools don't replace jobs because they still need a lot of manual work and specific/expert knowledge to work properly. You can't tell an LLM coding agent "build me a copy of Skyrim set in Hammerfell" and expect it to work flawlessly.
>>
>>106622988
Confusion is good for keeping the populace under control. Ambiguity allows for retribution against anyone who gets on the government's bad side. Sound familiar?
>>
>literally anything happens
>'predator/prey' / 'smile widens'
>'the game is over' / ' "checkmate." the game is over and {{char}} won'
Why is glm air like this? No matter the scenario or characters involved, these 2 are always bound to show up. What was it trained on?!?!?! This model is capable of turning something like flipping a burger into some sort of safari documentary about lions hunting gazelle with sexual undertones. like wtf
>>
>>106622998
It's called purple prose and LLMs are trained mostly on that
>>
>>106622979
>>two hours without a post
>open models are dead
captcha broke at random for me, couldn't even get it to appear at all till now, bet this happened to many and people lost the will to try
>>106623006
not to mention it's a chinese model and anyone who has read their webnovels is used to the level of bot-like repetition of this kind of sentence
>>
>>106621070
If you work in the field, you don't think in single GPUs, but in nodes, which have eight of them.
>>
>Fantastic work
MFW the llm compliments me for the work IT did,
the glazing is beyond obnoxious and into the sarcastic sounding realm of hell
what is fantastic is the level of retard of modern instructs
>>
>>106623090
Whatever it takes to secure the Indian votes on lmarena.
>>
>>106622979
I'm busy building my multi-lingual ASR pipeline
>>
File: nero coffee.png (1.01 MB, 1009x1315)
1.01 MB
1.01 MB PNG
>>106623090
It's such a fucking uphill battle to get an LLM to not give you everything you want at the drop of a hat. If you're RPing with one and you need to, say, acquire an item or get through a door or something, good luck getting it to not just open the door magically or pull said item out of its ass. It shouldn't take so much finessing for it to come up with some obstacles between you and what you want and have it stick to them.
>>
>>106623090
toss can be condescending inside its reasoning
>>
>>106623173
>TFW you realize that smut-tuned models are also doing it wrong from the get-go.
>>
Anyone tried Qwen Agent? Is using such a tool necessary to build and agent, or is it easy to build it myself? I'm not specifically looking for a coding agent, but rather general ones that could do various things.

https://github.com/QwenLM/Qwen-Agent/
>>
>>106622998
>lions hunting gazelle with sexual undertones
This is the age of Nalamaxxing
>>
>>106623173
I remember doing dice rolling. "Roll a 20-sided die to determine if the action succeeds. Difficulty starts at 16." It worked, and I've seen others do it too.
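e.g. something like this in the system prompt or author's note (wording is just an example, tweak to taste):

[Dice rules: before any contested action, roll 1d20 against a difficulty of 16. Print the roll in brackets, e.g. [13 vs DC 16 - failure], then narrate success or failure accordingly. Failed actions have real consequences.]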
>>
>>106623213
There are a shitload of frameworks to do that
>>
The absolute state of AI: the censorship, the eternal benchmaxxing, the assistant sycophant personality, nobody doing anything new is the best argument for a nuclear war. All humans should just die.
>>
>>106623213
Honestly all the frameworks are shit, and I've tried many. The use case isn't quite as simple or straightforward.

I always ended up giving up in the end. Easier to wait another year until one LLM does it all anyway
>>
>>106623173
GLM air has a lot of issues. It's slopped, prone to mistral-like repetitions of structure and sentences, etc, at least with the cope quant I'm running (Q3KM), but one thing that it's been doing well is not putting out instantly. A character is young, inexperienced, shy and reluctant? It'll do exactly that. It took me 130 messages just to get both of us naked without the character acting all spooked and shit.
So try that.
I do have a prefill in the thinking with a generic checklist that does make the model consider what characters do and don't know, so there's that too.
>>
File: h4giru[1].png (10 KB, 1038x218)
Just accept the cloud model pill they said. Local models can't hope to compare they said.
>Gets the filename wrong
>Doesn't note this file does not even exist
>Arbitrarily writes "test" in the output file
>Claims job well done
Gemini 2.5-pro btw. How can it fail something this basic with nothing in context to confuse it?
>>
>>106623395
garbage in; garbage out
>>
>>106623395
My only issue with 2.5-pro is it getting lazy sometimes and either writing incomplete implementations (even short ones) or replacing existing code with TODOs. It doesn't really get stuff outright wrong, at least not with the things I've played around with.
How large is test.txt?
>>
>>106623290
That's impressive on several counts. I wish I still had the patience for 130 messages of build up like that, but my brain was fried irreparably long ago or something and I couldn't hope to get close to that amount these days. I don't think I could manage 130 messages of ANYTHING, frankly.
>>
>>106623439
>How large is test.txt?
>t. 1B anon
>>
>>106623411
Explain. I think the prompt is pretty straightforward in terms of wording and can't be misunderstood easily, especially in the way that happened here.
>>
>>106623445
I have always been a slow burn kind of guy when it comes to text, for whatever reason.
It was also pretty fun trying to maneuver and essentially groom the character into slowly escalating things.
>>
>>106623439
It's text.txt; test.txt is the incorrect name the model picked up for some reason. The file was only about 1 kB. It did perform the task correctly when I ran the same prompt again, but it feels weird that the token distribution would be lax enough to mistake the filename you gave it.
>>
>>106623491
Got it.
1kb of text is what? Less than 1k words?
Yeah, that's fucking weird. Even more so when considering that, in my experience, gemini can work with some highish temps and still not fuck things up.
Are you using the default hyperparams (temp 1, topk 40, topp 0.95)? If so, try lowering temp I guess.
Still really fucking weird.
>>
>>106623458
That is basic shit any local 8B could do even a year ago. Unless they're really pinching pennies and served you Flash UD-IQ1_S, it's far more likely to be user error than a reason to blame one of the best models currently available.
Since you are using your own frontend, you should check and double-check the raw requests you are making to the API, because I would bet my left nut and half of righty that you're sending some fucked shit that is confusing the hell out of it.
>>
>>106622819
Model responds correctly to the chat template, so no
>>
>>106623522
Yea I was using defaults. Guess I need stricter sampling.
Thinking about it, I wonder if it's because "test" is a common filename in the training data and "text" is very close to it, so there's a small chance of it writing "test" instead and I got unlucky.
>>
>>106623544
I am using gemini-cli though.
https://github.com/google-gemini/gemini-cli
>>
>>106623565
That's the likely explanation, yes, but if your prompt isn't fuckhuge with a single mention of the file name in there, that really shouldn't be enough for it to get confused like that.
I suppose it could be that they think their "superior 1M context" is worth anything and are feeding the model a humongous system prompt.
In my experience, past 200, maybe 300k context, things begin devolving pretty fucking fast.
>>
>>106622819
Like I mentioned a little while back, axolotl inference acts fucky with outputs. It will randomly inject shit into your outputs but like I said, the model itself works fine
>>
>>106623573
Then you can have my nut and a half and I'll shut up. If you're using the free quota, they probably do serve a heavily quanted model for those requests.
>>
>>106623600
Good point. Maybe it's a quant.
>>
>>106623593
>>106623565
>>106623573
Another comment about long context: in the 200-300k range, flash seemed to perform better than pro somehow (using Cline). Maybe it's due to >>106623600, but who knows really.
>>
>>106623287
How about this https://huggingface.co/Alibaba-NLP/Tongyi-DeepResearch-30B-A3B
>>
>>106623635
NTA but I tried it in our tool for summarizing employee feedback and it very often gets stuck in an infinite loop, thinking forever. Not a pleasant experience.
>>
>>106623395
>>106623573
aistudio >>>>>>>>> aistudio API >>>>>>>> gemini-cli >>>>>>>>> google code assistant
Not even a competition. Idk what they do to the model but there's 100% a noticeable difference
>>
>>106623795
Maybe I should make a chrome extension that serves AI studio's web page as an API, since you can do everything in there, from sys prompts, to configure function calling, and even structured output.
>>
>>106621054
>>106621789
so what tool do you use to train models and create ggufs?
>>
>>106623919
i think he said axolotl to train; llama.cpp comes with a gguf conversion script, and as long as the model has an already-supported tokenizer and architecture it just works.
>>
https://huggingface.co/mistralai/Magistral-Small-2509
Mistral bros we can't stop winning
>>
>>106623947
>reasoning
sheeeetttt
>>
magistral is the absolute worst of all reasoning meme models though
>>
>Updates compared with Magistral Small 1.1
>Finite generation: The model is less likely to enter infinite generation loops.
uh, I didn't know mistral made models that suffered from GLM-itis
>>
>>106622998
GLM can pretty much do anything, but I'm also getting tired of the slop way of writing it constantly devolves into. I'm considering trying out qwen 235b, but I know that one has drawbacks too, even if the slop is less.
>>
What is your favorite model under 70B?
>>
>>106624175
glm air :)
>>
>>106624189
That is a 106B model.
>>
>>106624132
Side-effect of sloppy R1 distillation.
>>
>>106624194
You're absolutely right! What about Qwen3-235B-A22B-Instruct-2507.
>>
>>106624207
That is also too large.
>>
>>106624194
Don't judge a fat book by its cover. She's got the heart of a 32B model.
>>
>>106624211
I need a model that i can fit the FP16 weights entirely into VRAM. I have 160GB.
>>
>>106624210
Alright then, let's go with Mixtral-8x22B—it stays under the 70B cutoff when active parameters are counted.
>>
https://www.techradar.com/pro/intel-will-build-custom-x86-cpus-for-nvidias-ai-infrastructure-as-worlds-largest-company-invests-usd5-billion-in-beleaguered-tech-firm-and-dont-discount-a-data-center-x86-apu
>For PCs, Intel will manufacture x86 system-on-chips that integrate Nvidia RTX GPU chiplets connected via NVLink.
>These processors will be marketed as Intel x86 RTX SoCs and are aimed at gaming laptops and compact PCs.
Intel + Nvidia direct Strix Halo competitor just dropped
>>
File: 1740858469742991.jpg (999 KB, 2446x2445)
>>106624241
Only for the small price of $3000
>>
>>106624228
I strongly urge you to consider at least settling for Q8, otherwise you will not have a fun time. >>106624235 is right. Mixtral is probably the best you can hope for. Maybe Nemotron 49B if you're desperate...
>>
>>106624175
Gemma-3-27b-it
>>
>>106623283
You are gormless.
>>
>>106624264
Yeah, Q8 is pretty much the sweet spot—anything lower and you’ll start regretting it fast. Nemotron 49B is solid though if you can handle the VRAM hit.
>>
>>106624283
>>106624264
Needs to be FP16 for finetuning.
>>
>>106624300
If you’re going FP16 for finetuning, then yeah, you’ll need some serious hardware. In that case Mixtral might be rough—better to stick with something like Nemotron 49B or even LLaMA-3 34B if you want it manageable.
>>
>>106624228
If you're doing full finetuning, I'm not sure if 160GB would even be enough for a 24B model.
>>
>>106624241
Any shared memory solution, even with CUDA, is worthless unless it comes with 1TB of memory. Fuck's sake, for a couple hundred dollars more in materials they could put out a Pro version and charge $1k more and I'd still spring for it.
>>
>>106624228
>>106624300
I don't know what you are doing, but ideally you'd go for a base model.
Well, if you are okay with using a fine tuned model, maybe take a look at the Qwen 3 family.
The 30B MoE would be fast as fuck to train.
>>
Does anyone use EXL3? Did they fix the issue where it just goes on forever and ignores the forced stop button? Should I switch to vLLM instead of textgenwebui?
>>
Finally got DS V3.1 to loop by asking it to recite from the Talmud without internet search
>>
I'm getting really tired of LLM slop
when will we get actual AI?
>>
>>106624397
>textgenwebui
no one uses ooba anymore
>>
>>106624439
When you'll get better at prompting
>>
>>106624340
>is worthless unless it comes with 1TB of memory.
If you're serious about LLMs you run A100/H100 farms with enough VRAM for the model. No one serious is doing MoEs on CPU
>>
>>106624465
fuck off honky!!
>>
>>106624439
You need to use AI to fight AI.
Fine tune a small BERT model to rewrite slop sentences and run every output from your favorite LLM through that.
>>
>>106624465
>no one uses ooba anymore
thank god
and on the diffusion side the various forges are unmaintained and abandoned by their authors
the filthy beast called gradio has been slain
>>
>>106624480
It's a consumer/developer device. No shit it's not meant for production inference.
>>
>>106620382
schizos were right about all proprietary cloud models
https://archive.is/MIcJn
notice something in this article? mmh?
>Google claims to have significantly improved the energy efficiency of a Gemini text prompt between May 2024 and May 2025, achieving a 33x reduction in electricity consumption per prompt. The company says that the carbon footprint of a median prompt fell by 44x over the same time period. Those gains also explain why Google’s estimates are far lower now than studies from previous years.
There's no such thing as that level of efficiency gain without lobotomizing quants or a smaller model distilled from the biggus dickus.
when people say online models don't degrade, it's gaslighting
>>
>>106624439
Depends what you mean by “actual AI.” If you mean human-level, general intelligence — nobody knows; could be years, could be decades, could be we never get the thing people imagine. If you mean systems that stop sounding like sloppy autocomplete and actually do useful, reliable work: that’s already happening in pockets — retrieval + grounding, tool use, tighter fine-tuning, better evals and benchmarks.

Short version: the tech is improving fast but not uniformly. Want less slop now? use retrieval/RAG, smaller specialist models, chain-of-thought + calibration, and proper evaluation loops when you finetune (FP16 or whatever). Don’t expect a single drop-in “magic” model overnight — expect steady engineering wins and occasional big jumps.
>>
>>106624548
Grim.
>>
UK is moving a bill through parliament to ban local LLMs due to power usage. You can't make this shit up
>>
>>106624548
Hard to blame them. The vast majority of people using these models are retards asking for stupid shit (see >>106618037 and the sorts of things people used to search on Google) and they probably don't even notice the degradation.
>>
>>106624612
Are you fucking serious?
>>
>>106624612
Probably a wise move. UK is only a few decades away from chronic power shortfalls a la South Africa's Eskom.
>>
>>106624612
wtf, did they ban datacenters too?
>>
>>106624662
It's a move to tighten censorship by not allowing local LLMs.
>>
>>106624662
Data centers are orders of magnitude more profitable and efficient than wasting power on p40s that could go to heating some poor refugees.
>>
>>106624660
the hardware should be illegal to sell if it's illegal to use. is there anything different about running a video game on a gpu vs an llm?
>>
>>106624612
Please link it
>>
>>106624675
They should mandate Geforce Now and have mandatory GPU confiscations.
>>
>>106624675
Give it a few more release cycles and they'll require GPUs to be registered like weapons, be always online, and have hardware attestation to brick themselves when given illegal workloads.
>>
>>106624612
Link? I'm not finding shit.
>>
File: 1743816040111236.jpg (15 KB, 400x228)
>>106624688
>>106624779
I made it up
>>
>>106624783
nigger
>>
>>106624783
I choose to continue believing it regardless as it supports my narrative
>>
>>106624783
kys
>>
File: file.png (105 KB, 1356x626)
There are upcoming safety laws with local LLMs
>>
local suno. this is the base version; they also just released a dpo-trained one I have not tried

https://huggingface.co/fredconex/SongBloom-Safetensors

https://github.com/fredconex/ComfyUI-SongBloom

https://files.catbox.moe/i0iple.flac
>>
>>106623919
>so what tool do you use to train models
https://docs.axolotl.ai/
https://github.com/axolotl-ai-cloud/axolotl/

Very robust trainer. Highly recommend for both beginners and LLM vets.


>and create ggufs?
llama.cpp. This file in particular.

https://github.com/ggml-org/llama.cpp/blob/master/convert_hf_to_gguf.py

I haven't gotten to quantizing it yet, but I'll try to get to that later today when I have time. I've got some things to take care of along with my job, so my hands will be tied for most of the day.
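For reference, the usual two-step from a merged HF model to a quantized GGUF (paths hypothetical):

python convert_hf_to_gguf.py /path/to/merged-model --outfile model-f16.gguf --outtype f16
./llama-quantize model-f16.gguf model-Q8_0.gguf Q8_0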
>>
>>106624862
>local suno
you wish, sounds worse than 3.5
>>
>>106624883
again, that is the non dpo trained one which I imagen sounds far better
>>
File: 1744421666600281.png (51 KB, 799x398)
>>106624862
lol
>>
>>106624465
I do like its ui though. Anything similar?
>>
>>106624923
>liking gradio
how
>>
think we need a board for /m/asochists
>>
>>106624979
That's kind of the point of the /vg/ ai thread
>>
>>106618417
>GPT 5 still feels like it's trying its damnedest to suck my metaphorical dick
That depends:

1. Are you using a custom system prompt in the settings (you can use that to directly tell it "cut the bullshit and just get to the point")

2. What are you asking it to do?

My experience is of course anecdotal, but I don't get that AT ALL when I'm asking technical questions. Keyword: technical. If you're asking for life advice or other generic shit like that, then it's probably going to default to some sycophantic behaviors, but when it comes to actually assisting me with programming or LLM-focused shit it becomes robotic but precise and concise, which is exactly how a tool like this should behave anyway.
>>
File: 1728177386053705.png (673 KB, 595x910)
I feel like MCP could have just been a RESTful API spec
>>
>>106624848
models bigger than 8b f16/24gb should be banned
>>
>>106625048
That wouldn't attract VC money
>>
>>106625026
>1. Are you using a custom system prompt in the settings (you can use that to directly tell it "cut the bullshit and just get to the point")
I do when I'm tired of it all but steering away from the default assistant personality can seriously degrade the model output on things like code.
>2. What are you asking it to do?
Code, that's all. I tell it what to do in refactors and it outputs stuff.
Asking questions actually seems to trigger the sycophancy less than just giving it instructions, I've started to notice. Though one thing besides the sycophancy that is grating is the constant engagement mechanic: almost every message ends with some "Would you like me to..." like an overly attached girlfriend that just doesn't want to let go
>>
>>106624783
I choose to continue believing it regardless as I think it's just a bit funny to spread misinformation on the internet.
>>
>>106625079
Qwen 4B 2507 Instruct Is All You Need
>>
File: 1749114436902246.png (1.06 MB, 682x900)
>>106625098
Clarification: when I said system prompt, I meant going DEEP into the settings, like pic rel.

Settings-->personalization--> Custom Instructions

Did you do that? A generic "system prompt" at the beginning of your chat probably isn't quite good enough or strong enough of a signal for it to stop doing splits on it
>>
>>106625118
>Qwen 4B
Gemma 3n you mean.
>>
>>106624848
Nothing burger. If it's on my machine they can't really do shit about it. This might heavily cuck web-based / API services but most of us don't care about those
>>
songbloom is really good btw, and takes song references unlike suno. here's the start of Fade to Black with some claude lyrics

https://files.catbox.moe/sopv2f.flac
>>
This was higher cfg / lower temp / another seed
Crazy leap for local

https://files.catbox.moe/olajtj.flac
>>
>>106624862
>>106625408
>>
>>106619816
How many years do you think before all these glasses start displaying every citizen's Palantir Social Credit Score above their heads? Like power-level-reading scouters, but for our dystopia. I'm mostly joking, but if this form factor does take off, I can see people willingly doing this with their social network handles and subscriber counts, which are just a proxy for social score anyway.
>>
>>106625408
usual request to make it do lewd noises
how's gen speed btw? I really want to get into audiogen, but being an AMD vramlet with no ROCm support, it's probably going to be a pain in the ass. Maybe low-param TTS is my limit.
>>
>>106625507
2 min 30 sec song takes about a min on 4090
>>
>>106625516
>gen time < play time
Huh, you can fucking stream it live?
>>
>>106619816
LOL, past the 1h 5min mark they're still going on about metaverse shit.
Zuck still hasn't let go of this piece of shit NO ONE WANTS and nobody is asking for; he wants more metaverse cringe.
https://developers.meta.com/horizon-worlds
"create quality worlds in a fraction of the time"
Yeah that's what the metaverse was missing, AI gen slop
"new engine we built to replace unity"
"fully optimized for bringing the metaverse to life"
"much faster to load"
>>
>>106625554
I heard Metaverse is quite popular with children/parents as a degeneracy-free version of VRChat.
Actually saw a little girl making a suspiciously Metaverse-looking avatar on her dad's tablet on a bus ride a week ago.
>>
>>106625554
The Horizon TV thing actually sounds appealing. Would get a Quest instead of a new TV if only it could stream from Jellyfin instead of some subscription.
>>
>>106625554
metaverse is the future
the problem is current vr headsets are simply too clunky for mass adoption
>>
>>106625675
not even apple and their all-powerful, god-given Reality Distortion Field could get their fanboys to truly stick to their VR headset
stop trying to make VR happen
it won't. it's a niche nerd thing that will stay a niche nerd thing; normal people want nothing to do with this
>>
>>106625693
no, vr will definitely happen, it is a certainty
it will simply take time for vr headsets to reach the point for mass adoption
>>
File: 1504401643047.png (356 KB, 704x396)
>>106625378
I can remember Moonphase.
>>
>>106625705
>no, vr will definitely happen, it is a certainty
right, as certain as nuclear fusion reactors, space or mars colonies, and plenty of other "I really want to live in the tech singularity" bullshit from nerds who hate the world and dream of a tech utopia
always just Around The Corner©
>>
>>106625597
Tsk-tsk, degens should really go hardcore on policing themselves before the mob comes and does it for them.
https://www.youtube.com/watch?v=VSOFnkCCU0Y
>>
>>106625715
Wait, on the second one it's "own phase"? Nooo...
>>
>>106625736
How could you forget about flying cars, circular cities, and hoverboards?
>>
>>106625675
>>106625705
buy an ad zuck
>>
>>106625736
>right, as certain as nuclear fusion reactors, making space or mars colonies
those are all examples of things that are not here yet; vr is here, it just needs refining. strapping a brick to your face is not something most people are comfortable doing with a product, and neither are the low fov, shitty displays, and subpar software

light field displays, higher fov, lower headset weight thanks to computing pucks, and better software will make adoption of vr easier
>>
>>106624862
please stop supporting the shillui. We need people to make extensions for anistudio and neoforge instead. comfyorg is aggressively squeezing saas and telemetry into everything. it's complete ass and local users are treated as second class citizens
>>
>>106625801
One could have made the same argument about electric cars a century ago. You have no idea how long it will take for technology to catch up enough to produce the refined product that will actually be useful.
>>
>>106625841
I agree, i'm not saying it will be soon, just that it will definitely happen eventually
>>
>news breaks that huawei gpus can't train models
>china bans NVIDIA AI chips a week later
I don't understand the logic on this one
>>
>>106625876
They will be forced to dogfood the homegrown option until quality improves. Won't get better if there is no incentive or usage.
>>
>>106625801
but we do have fusion reactors already, what is the difference?
>>
>>106625876
Source?
Hard mode: non Jewish source
>>
>>106625876
glowies that reside in the silicon
>>
>>106625895
https://www.ft.com/content/12adf92d-3e34-428a-8d61-c9169511915c
https://www.cnbc.com/2025/09/17/nvidia-ceo-disappointed-after-reports-china-has-banned-its-ai-chips.html
It's widely public news at this point. Even Jensen acknowledged it
>>
>>106625801
Aside from software, I don't think there's much room for refinement, we are pushing the very limits of current technology. I even expect a Concorde effect: industry will roll back to worse specs for the sake of better ROI.
We already saw it with new consumer GPUs having less VRAM than old ones, for example.
>>
local models?
>>
>>106625974
Sir please
>>
>>106625974
Where?
>>
>>106625876
They don't want to be dependent on nvidia, so they're going to force the issue and take the performance hit now with the expectation that eventually they'll be able to catch up.

Same principle as banning foreign software companies like google and forcing their people to use local alternatives, despite them being much less sophisticated at the time.
>>
That Doug Engelbart quote is familiar to us huh.
>>
>>106626047
So Qwen 4 will blow chunks harder than Llama 4, but Qwen 5 will finally be GPT-4 at home?
>>
>>106625876
You've got the timeline mixed up.
A week after the rumor broke that R2 got delayed by Huawei GPUs, DS released V3.1 to put the rumor to rest. V3.1 was trained in a format that targets upcoming Chinese GPUs. That was a month ago.
A day after the news broke that the Chinese were trialing their own DUVs (instead of buying from ASML) and Alibaba announcing their own TPUs, Nvidia chips got banned by China.
>>
>>106625909
Does this ban affect existing chips or only new ones? This is going to be a dark age for llms if it's a total ban.
>>
Can somebody explain to me the practical difference between Gemma 3n's sparsity and a usual MoE?
Or is it just the same thing with another name and a slight tweak to the architecture?
I don't think that's the case, since 3n doesn't seem to have something like a router.
>>
>>106626081
3n reuses parameters multiple times instead of having extra parameters that are only called sometimes like in a usual MoE
>>
>>106626095
Oh. So it's like the opposite of a MoE? Instead of having a ton of params and only using a subset, it uses the same params multiple times?
That's kind of crazy. Reminds me of that layer looping concept.
>>
>>106626125
>layer looping concept
Shamelessly copied from one of the thousand papers experimenting with the ResNet architecture
Tldr: nothing new
>>
>>106626125
Pretty much, yeah. Seems like it would have a downside of the overloaded parameters reaching saturation much sooner than usual. It might not scale well past edge device size.
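Toy sketch of the difference, not the actual 3n architecture, just routing vs. reuse (assumes torch):

import torch
import torch.nn as nn

class ToyMoE(nn.Module):
    # usual MoE: more parameters than any one token uses; a router picks top-k experts
    def __init__(self, d=64, n_experts=8, k=2):
        super().__init__()
        self.router = nn.Linear(d, n_experts)
        self.experts = nn.ModuleList([nn.Linear(d, d) for _ in range(n_experts)])
        self.k = k

    def forward(self, x):  # x: (n_tokens, d)
        weights, idx = self.router(x).topk(self.k, dim=-1)  # choose k experts per token
        weights = weights.softmax(dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

class ToyReuse(nn.Module):
    # 3n-style idea: one set of weights applied several times, no router anywhere
    def __init__(self, d=64, n_loops=4):
        super().__init__()
        self.block = nn.Linear(d, d)
        self.n_loops = n_loops

    def forward(self, x):
        for _ in range(self.n_loops):  # same parameters reused each pass
            x = torch.relu(self.block(x))
        return x

x = torch.randn(10, 64)
print(ToyMoE()(x).shape, ToyReuse()(x).shape)  # both torch.Size([10, 64])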
>>
File: 1752993660070767.png (220 KB, 860x454)
>>106625554
The display glasses seem neat. I'll pick a pair up if and only if there's a proper developer SDK that gives you a reasonable amount of control over the device.

Being locked into cloud-only llama4 would be the most useless retarded thing ever.
>>
>>106626188
God damn they are so fucking bulky looking
>>
>>106625926
MOOOOM!!!! HES SEXUALLY HARASSING ME AGAIN!
>>
File: 1742748628867690.png (1.46 MB, 1770x452)
>>106626188
>Locked in to cloud-only llama4 would be the most useless retarded thing ever.
What gave you the impression that wouldn't be the case lol? It's not like any other models would be that useful on a user interface like that. It would be good for general question answering, essentially using it as a personalized, voice-powered Google search engine. I can't think of any scenario where using a dedicated programming model like some 30B Qwen model would even be practical anyway. "Hey meta, what kind of flower is this", "hey meta, how many calories does this cereal product I'm looking at have", "meta, how many grams of protein does this food have", "hey meta, send a text message to mom" - simple everyday queries like that.
>>
>>106625876
99% chance it's just negotiation tactics
"H20s suck massive dick, offer us something better or you'll lose ALL the chinabucks"
Then Jensen sends Trump a golden bidet and within 6 weeks he has the new and improved Nvidia H25 now 30% less gimped ready to ship
>>
>>106626214
These are comical. What the hell. Only retards would wear these in real life - big glasses what are heavy and bulky are torture.
But of course they wouldn't know because they don't NEED optics to correct their vision in the first place.
>>
>>106626229
"Hey Meta show me a video of that hot girl I was just looking at sucking my dick"
>>
>>106626229
>What gave you the impression that wouldn't be the case
Meta has been doing a very weird mix of open and closed systems over the past couple years. They've been doing things like releasing some interesting papers and models with lots of details and public datasets (dinov3, their EMG model, etc). They've removed a lot of the hardware access restrictions from the quest sdk.

On the other hand, they've also never released an SDK for their other previous gaybans glasses.

So I think it's not out of the realm of possibility that they might allow it.

As to why you would want the option to not use cloud llama 4, I'm not going to bother to respond to that because I feel that the topic has been discussed ad nauseum and is adequately answered by the fact that we're in "local model general".
>>
>>106625554
>https://developers.meta.com/horizon-worlds
Roblox but VR
>>
>>106626557
>As to why you would want the option to not use cloud llama 4, I'm not going to bother to respond to that because I feel that the topic has been discussed ad nauseum and is adequately answered by the fact that we're in "local model general".
I wouldn't want to use it even if I wasn't a local model user
this is a world where Claude, Gemini, GPT exist
pay for a llama device? oh fucking hell no retards
>>
is Whisper still the king?

Need English only
>>
>>106626641
Yes.
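For English-only, use the *.en checkpoints; they're a bit faster and more accurate at the same size. Minimal sketch with the openai-whisper package (faster-whisper has an equivalent API if you need more speed):

import whisper

model = whisper.load_model("medium.en")  # .en = English-only checkpoint
result = model.transcribe("audio.wav")
print(result["text"])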
>>
File: 54d12kuq8rpf1.png (303 KB, 5684x1787)
>>
>>106626905
how is qwen's coder and a fuckin 1T losing to glm?
>>
>>106626229
Because Quest3 chads have full camera API access
>>
>>106626921
GLM-chan is doing her best!
>>
Nvidia investing 5bil into intel, uwotn8?
>>
File: 1736356161979453.jpg (59 KB, 414x414)
>>106626951
AMD cucks in shambles
>>
loli miku feet
>>
>>106625974
hot single models in your local area
>>
>>106627017
are they dtf?
>>
> allowing you to hilft your entire length down her esophagus with wet gluck sounds
?
???
>>
unsloth this!
*reveals benis*
>>
>>106626229
I don't expect much from this industry at this point, but not letting nerds play with such toys is basically shooting yourself in the foot.
>>
> I cling to him, my small body arching into his, pressing myself closer as if I could crawl inside his skin and stay there forever.
Adorable GLM. Just adorable.
>>
>>106627053
hot
>>
File: linkin-park.jpg (94 KB, 991x774)
>>106627075
>CRAWLINGGGG IN MY SKINNNNN
>>
>>106627075
>adorable
absolutely not
>>
>>106626905
>qwenext worse than glmair
whats the fuckign point then????????????????????????
>>
>>106622974
RAG
>>
>>106627153
>>106627153
>>106627153
>>
>>106627145
it's a proof of concept to get people to work on supporting the architecture, they didn't even bother training it on their whole dataset
>>
>>106627075
kek
>>
>>106625048
MCP is clunky and useless.
>hey just add this definition to your .mcp.json every time you want to use it
>oh and you can't "just have it" cuz it'll inflate the prompt with 90000 tools



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.