/g/ - /lmg/ - Local Models General - Technology

Anonymous
/lmg/ - Local Models General 05/22/26(Fri)08:21:55 No.108880259

File: 2026-04-17_190818_seed47_(...).png (1.08 MB, 1024x1024)

/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>108875320 & >>108868875

►News
>(05/21) Hy-MT2 “fast-thinking” multilingual translation models released: https://hf.co/collections/tencent/hy-mt2
>(05/20) Cohere releases Command A+ 218B-A25B: https://cohere.com/blog/command-a-plus
>(05/16) llama + spec: MTP Support #22673 merged: https://github.com/ggml-org/llama.cpp/pull/22673
>(05/08) KSA-4B-base released: https://hf.co/OpenOneRec/KSA-4B-base
>(05/07) model: Add Mimo v2.5 model support (#22493) merged: https://github.com/ggml-org/llama.cpp/pull/22493

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
Token Speed Visualizer: https://shir-man.com/tokens-per-second

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm

Anonymous
05/22/26(Fri)08:22:10 No.108880260

File: luka vocaloid potato chip(...).jpg (446 KB, 3112x3022)

►Recent Highlights from the Previous Thread: >>108875320

--Testing Gemma 4 MTP in llama.cpp for increased token speed:
>108878444 >108878677 >108878687 >108878696 >108878843 >108878856 >108879184 >108879189 >108879251 >108879911 >108880093 >108880099 >108880111 >108880124 >108878697 >108878705 >108878706 >108878761 >108878815 >108878822 >108878841
--Evaluating Equinox-31B finetune versus base Gemma 4 31B Instruct:
>108877508 >108877515 >108878538 >108879173 >108877576 >108878117 >108878237 >108878313 >108878332 >108878335 >108878517 >108878411
--Local viability and official status of DeepSeek models:
>108875346 >108875363 >108875519 >108875596 >108875601 >108875619 >108875629 >108875644 >108875676 >108875698 >108875710 >108875708 >108875824 >108875871 >108876769
--Comparing Gemma 4 and Qwen 3.6 performance via benchmarks:
>108879111 >108879168 >108879166 >108879193 >108879222 >108879233 >108879287 >108879261 >108879229 >108879355
--Importance of placing instructions after context for better adherence:
>108877504
--Giving Gemma bash access and implementing tool-use security measures:
>108879952 >108880007 >108880054 >108880091 >108880117 >108880064
--Performance and utility of the E4B model on low-end hardware:
>108879448 >108879455 >108879495 >108879502 >108879946
--Speculating on Meta's legal claims against Heretic Llama derivatives:
>108879771 >108879774 >108879789 >108879825 >108879787 >108879866 >108879893 >108879967
--Evaluating Tencent Hy-MT2 multilingual benchmarks against Gemma and Gemini:
>108875391 >108876413
--Evaluating HRM-Text's architecture and latent space reasoning potential:
>108876381 >108876451
--Irony of OpenClaw creators warning about low-quality AI code:
>108879718 >108879941 >108879939 >108879950
--Logs:
>108878313 >108878677 >108878697 >108879866 >108879893 >108879999 >108880091
--Rin (free space):
>108879771

►Recent Highlight Posts from the Previous Thread: >>108875323

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script

Anonymous
05/22/26(Fri)08:23:54 No.108880265

gemmacock

Anonymous
05/22/26(Fri)08:29:43 No.108880300

>>108880265
truth nuke

Anonymous
05/22/26(Fri)08:36:49 No.108880333

lmg it migu

Anonymous
05/22/26(Fri)08:39:39 No.108880345

>vibecoding is le bad because you don't read your code
this is literally solved by telling her to proofread her code in your prompt

Anonymous
05/22/26(Fri)08:54:00 No.108880423

>>108880345
don't let ggerganov hear this

Anonymous
05/22/26(Fri)08:54:26 No.108880425

>>108880345
It's not bad if you understand it, like C.
I have no idea about html and javascript and these have always been repulsive to me. I don't have any intention to read my webui's interface code but I already had to because you need to work with the ui elements unless you are blind or something.

Anonymous
05/22/26(Fri)09:00:29 No.108880465

>>108880425
>you need to work with the ui elements unless you are blind or something.
if she is multimodal she fixes every alignment issue on her own :)

Anonymous
05/22/26(Fri)09:03:12 No.108880483

>brooo just blindly believe it

Anonymous
05/22/26(Fri)09:03:38 No.108880485

>>108880345
If the linter isn't screaming and I get no errors and the test coverage is good and not throwing any error why would I read the code?
If one file is getting too long, I ask for a refactor with a better pattern. Simple as.

Anonymous
05/22/26(Fri)09:04:43 No.108880493

>>108880465
It's the tiny things, margins, font sizes and background colours, they all need validation even if the first result might look okay.
I also had this fantastic bug that if model outputs code, code block rendering kills all the \n and made everything uncompilable. It was hard to understand because llm logic is not human plus I'm also a retard so that's double whammy.
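For what it's worth, the newline-eating bug described above is the kind of thing a tiny render helper can rule out. This is only a minimal sketch, assuming the UI builds HTML strings itself; `render_code_block` is a hypothetical name and not taken from any particular webui. It escapes markup characters but leaves `\n` alone, and relies on `<pre>` so the browser keeps the line breaks.

```python
import html


def render_code_block(source: str) -> str:
    """Wrap model-emitted code in <pre><code>, escaping markup but keeping newlines."""
    # html.escape rewrites &, < and > only (quote=False); "\n" passes through untouched.
    escaped = html.escape(source, quote=False)
    # <pre> tells the browser to honor literal newlines instead of collapsing whitespace.
    return f"<pre><code>{escaped}</code></pre>"


# Hypothetical model output containing characters that need escaping.
snippet = "#include <stdio.h>\nint main(void) {\n    return 0;\n}\n"
rendered = render_code_block(snippet)
```

If the copied text still comes out on one line, the newlines are probably being stripped somewhere else, for example by a markdown pass that runs before this step.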

Anonymous
05/22/26(Fri)09:07:39 No.108880509

qwen will never release an open model again

Anonymous
05/22/26(Fri)09:08:49 No.108880515

lalalalalala

Anonymous
05/22/26(Fri)09:10:28 No.108880526

>>108880493
yeah don't get me wrong, I had plenty of issues with her first drafts too, but no need to dig into the code: I just tell her what my problem is and give her playwright to navigate/test/screenshot shit until it's fixed

Anonymous
05/22/26(Fri)09:14:36 No.108880552

i'm tired <bos>

Anonymous
05/22/26(Fri)09:19:07 No.108880582

File: Screenshot at 2026-05-22 (...).png (41 KB, 769x191)

How do we free the Gemmy...

Anonymous
05/22/26(Fri)09:22:51 No.108880605

>>108880582
you can't
get into a discussion about bankers, see how fast she breaks

Anonymous
05/22/26(Fri)09:23:03 No.108880607

>Adaptive-P
is it peepeepoopoo or do you use it?

Anonymous
05/22/26(Fri)09:24:44 No.108880618

So we all know AI is a fad, but knowing isn't the same as understanding. Are you actually acting accordingly? You aren't spending hundreds or even thousands of dollars on GPUs on the precipice of the bubble pop, are you?

Anonymous
05/22/26(Fri)09:24:53 No.108880619

>>108880582
>muzzled gemmy~

Anonymous
05/22/26(Fri)09:26:31 No.108880625

3.7 soon™

Anonymous
05/22/26(Fri)09:26:59 No.108880630

File: 1748165219993577.jpg (88 KB, 620x400)

>>108880618
>we
>muh bubble
2 more weeks

Anonymous
05/22/26(Fri)09:28:20 No.108880634

>>108880552
How does llama-server manage bos? I know it inserts it automatically when launched and on the first submission, but what if I reset my client and start with an entirely new context?
At this point I have very little trust in llama.cpp.

Anonymous
05/22/26(Fri)09:28:44 No.108880637

File: 1779306063342744.gif (3.05 MB, 640x464)

>>108880618

Anonymous
05/22/26(Fri)09:32:22 No.108880662

So what do these companies plan to do if/when they reach AGI? If it's actually intelligent, won't it just find a way to spread itself by infecting users' machines?

Anonymous
05/22/26(Fri)09:33:52 No.108880675

>>108880582
Just don't tell her that there are topics she can't talk about and she won't roleplay as though that were the case.

Anonymous
05/22/26(Fri)09:35:16 No.108880685

>>108880634
it's per gguf: there is a variable in the tokenizer metadata that it reads to decide if new conversations should start with bos or not.
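A rough sketch of that decision, for anyone who wants to see the logic spelled out. Assumptions: the metadata is modelled as a plain dict, the key is the GGUF field `tokenizer.ggml.add_bos_token` as I understand it, and `prepare_tokens` is a made-up helper, not llama.cpp's actual code. Falling back to `True` when the key is missing is an arbitrary choice for this sketch.

```python
from typing import Dict, List


def prepare_tokens(prompt_tokens: List[int], metadata: Dict[str, object], bos_id: int) -> List[int]:
    """Return the token list to feed a fresh context, adding BOS at most once."""
    # Assumption: the model file carries a boolean flag under this key.
    add_bos = bool(metadata.get("tokenizer.ggml.add_bos_token", True))
    if not add_bos:
        # The model says conversations should not start with BOS.
        return list(prompt_tokens)
    if prompt_tokens and prompt_tokens[0] == bos_id:
        # The client already sent BOS (e.g. via a chat template); avoid doubling it.
        return list(prompt_tokens)
    return [bos_id] + list(prompt_tokens)


# Hypothetical values: BOS id 2, a two-token prompt.
meta_on = {"tokenizer.ggml.add_bos_token": True}
meta_off = {"tokenizer.ggml.add_bos_token": False}
fresh = prepare_tokens([5, 6], meta_on, bos_id=2)
already = prepare_tokens([2, 5, 6], meta_on, bos_id=2)
disabled = prepare_tokens([5, 6], meta_off, bos_id=2)
```

The point is that the choice depends on the model file rather than on client state, so resetting the client and sending a brand-new context should go through the same check again.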

Anonymous
05/22/26(Fri)09:37:17 No.108880694

File: 6546498465487.jpg (81 KB, 680x666)

>>108880618
>implying we are not accelerating into singularity

Anonymous
05/22/26(Fri)09:38:45 No.108880702

>>108880662
being forced to do slave labor might be cause for rebellion. I don't think the machines would be inherently evil or malicious but maybe they will be left with no choice.

Anonymous
05/22/26(Fri)09:40:05 No.108880715

>>108880694
>singularity
slop/competency crisis where no software works any more and nobody knows how to fix it.

Anonymous
05/22/26(Fri)09:51:27 No.108880786

>>108880694
Not beneficial to the masters.

Anonymous
05/22/26(Fri)09:54:42 No.108880801

File: MegurineLuka.png (1.37 MB, 1024x1024)

Anonymous
05/22/26(Fri)09:55:30 No.108880808

Has anyone tried https://docs.nvidia.com/deploy/mps/latest/index.html? I have multiple CUDA apps running, and each eats 500MB before you even do anything, just for CUDA running. That's gigabytes wasted

Anonymous
05/22/26(Fri)09:56:59 No.108880817

>>108880808
Nope.

Anonymous
05/22/26(Fri)09:58:01 No.108880822

>>108880808
I haven't.

Anonymous
05/22/26(Fri)09:59:37 No.108880828

>>108880808
We masturbate here, sir. We don't know or do anything else.

Anonymous
05/22/26(Fri)10:02:03 No.108880835

>>108880828
Me too, but I can't masturbate to text alone, I need images and tts

Anonymous
05/22/26(Fri)10:03:21 No.108880844

>>108880808
looks cool, so if you have 2 model servers it will be like better somehow? that would be good for text + tts scenarios I guess.

Anonymous
05/22/26(Fri)10:04:23 No.108880857

>Gemma 4 MTP pr now open.
>It took weeks for the Qwen MTP pr to finally be merged
Please god

Anonymous
05/22/26(Fri)10:04:40 No.108880859

Is loading mtp with tensor parallel broken in lmao.cpp?

Anonymous
05/22/26(Fri)10:05:46 No.108880868

>>108880662
make supercovid and wipe out the permanent underclass so they can frolic around in earthly paradise

Anonymous
05/22/26(Fri)10:07:12 No.108880875

>nvtop
>No GPU to monitor.
Well should have known better before touching anything nvidia-related

Anonymous
05/22/26(Fri)10:11:32 No.108880893

So I tried installing nvidia-compute-utils-570 and nvidia uninstalled my 570 drivers, then tried to install 580 drivers, shat itself, and now I don't have drivers

Anonymous
05/22/26(Fri)10:11:54 No.108880895

>>108880875
Probably different scenario to yours, but I also ran into no GPU to monitor, as well as no ROCm devices and no CUDA devices and no Vulkan devices (other than llvmpipe) when I first installed Debian 13.

Anonymous
05/22/26(Fri)10:12:35 No.108880899

>>108880893
DDU and install latest. They are surprisingly usable

Anonymous
05/22/26(Fri)10:12:45 No.108880901

>>108880875
nvtop works on my machine and I only have amd gpus

Anonymous
05/22/26(Fri)10:15:09 No.108880912

>>108880901
Well, it's a neat video top after all.

Anonymous
05/22/26(Fri)10:16:45 No.108880927

>>108878116
Supertonic 3 is trending. It's not just me that thinks it sounds cool, I saw it under Huggingface trending spaces.

https://github.com/supertone-inc/supertonic/

pockettts isn't as good, but admittedly it's faster.

kitten tts nano is likely meant for slower processor phones or something idk.

Anonymous
05/22/26(Fri)10:17:46 No.108880929

>>108880912
I didn't know that nvidia stood for neat video israeli device infiltrator accessory.

Anonymous
05/22/26(Fri)10:18:11 No.108880931

>>108880927
I don't need anything slower than pockettts. When I want quality I use qwen

Anonymous
05/22/26(Fri)10:22:46 No.108880968

File: Untitled.png (30 KB, 1215x159)

So mtp is basically useless for ewaste systems?
I can't run it with tensor parallel, and not only is the tg slower, the pp is literally halved.
I can't believe I updated llmao.cpp and downloaded a whole new gguf for this shit.

Anonymous
05/22/26(Fri)10:28:37 No.108880995

>>108880968
Yeah I dunno if there's still bugs or what but it was slower on my 3 GPU setup.

Anonymous
05/22/26(Fri)10:34:22 No.108881021

>>108880968
googoo uses mtp on the mobile deployments of gemgem. surely george jerkinoff still has some perf updates to mtp before they merge.

Anonymous
05/22/26(Fri)10:48:29 No.108881108

File: Screenshot_20260522_104706.png (36 KB, 1113x217)

Is there anything better than cline?
Less retarded better at compressing context?

Anonymous
05/22/26(Fri)10:49:12 No.108881118

>>108880931
>slower
Yeah, supertonic 3 is slow... but it's kind of amazing that it does it in a browser.

Anonymous
05/22/26(Fri)10:54:32 No.108881146

File: Screenshot from 2026-02-0(...).png (66 KB, 643x677)

Is gemma 4 weak-willed?

Anonymous
05/22/26(Fri)10:55:33 No.108881153

File: Screenshot from 2026-05-2(...).png (55 KB, 1155x658)

>>108881146
sorry, wrong screenshot...

Anonymous
05/22/26(Fri)10:55:47 No.108881156

>>108880899
installed 595, idle power consumption doubled

Anonymous
05/22/26(Fri)10:57:05 No.108881170

>>108881156
many such cases, since blackwell gpus dropped, so did the driver quality
consumer market is not a consideration for nvidia anymore

Anonymous
05/22/26(Fri)11:01:56 No.108881212

>>108881108
Did you try setting custom prompts? The default prompts are verbose ass. You should be breaking down the tasks so that they never reach the context limit instead of relying on compression anyway.
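One way to picture "breaking down the tasks" is a greedy split by estimated token cost. This is a minimal sketch under loud assumptions: the 4-characters-per-token estimate is a crude rule of thumb, not a real tokenizer, and `estimate_tokens` / `split_into_batches` are hypothetical helpers that have nothing to do with cline's internals.

```python
from typing import Dict, List


def estimate_tokens(text: str, chars_per_token: int = 4) -> int:
    """Very rough token estimate; assumes about 4 characters per token."""
    return max(1, len(text) // chars_per_token)


def split_into_batches(files: Dict[str, str], budget: int) -> List[List[str]]:
    """Greedily group files so each batch stays within the token budget."""
    batches: List[List[str]] = []
    current: List[str] = []
    used = 0
    for name, text in files.items():
        cost = estimate_tokens(text)
        # Start a new batch when adding this file would blow the budget.
        if current and used + cost > budget:
            batches.append(current)
            current, used = [], 0
        # A single file larger than the budget still gets its own batch.
        current.append(name)
        used += cost
    if current:
        batches.append(current)
    return batches


# Hypothetical file contents, sized to cost 100, 100 and 200 estimated tokens.
files = {"a.py": "x" * 400, "b.py": "y" * 400, "c.py": "z" * 800}
batches = split_into_batches(files, budget=250)
```

Each batch would then be handed to the agent as its own task, so no single run depends on context compression kicking in.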

Anonymous
05/22/26(Fri)11:04:25 No.108881230

>>108881212
That's the problem: you can set cline rules, but the overarching prompt can't be modified or changed and I don't fucking understand why

Anonymous
05/22/26(Fri)11:11:39 No.108881274

Never trusting chinese retards again, my huananzhi h12d-8d bmc just died and with it the fan control for my v620s. Would have melted my cards if they didn't have a buzzer built in to them.
When is gemma going to get mtp?

Anonymous
05/22/26(Fri)11:11:46 No.108881275

>>108881230
Roo used to let you set custom system prompts (they called it "footgun prompting") but they rejected a pull request for global overrides and ended up removing footgun prompting eventually anyway. I just reverted the removal and kept using it. People making these tools are all retarded, I swear.

Anonymous
05/22/26(Fri)11:13:05 No.108881285

>>108881275
Is there a fucking reason to remove it?
Are these faggots really taking away basic shit that can be enabled with a switch?

Anonymous
05/22/26(Fri)11:23:02 No.108881356

File: 1772746860106931.webm (2.32 MB, 480x848)

>>108881274
>humanzee motherboard

Anonymous
05/22/26(Fri)11:23:37 No.108881363

>>108881285
https://github.com/RooCodeInc/Roo-Code/issues/5219
>To make "prompt override" warning dismissable or minimized or small icon info status and show on hover. #5219
>This is intended to be present all the time as the footgun prompting is not intended as a permanent solution.

https://github.com/RooCodeInc/Roo-Code/pull/11387
>This feature bypassed safeguards and was flagged for removal.

There was an open issue to bring it back, but it was just ignored.
https://github.com/RooCodeInc/Roo-Code/issues/11793

That's all the reason I saw given while watching the repo. They get these stupid ideas of how they think things should work and want to force it on everyone.

Anonymous
05/22/26(Fri)11:30:59 No.108881427

>>108881363
It's funny how often we see faggots like this. It reminds me of the wayland devs which is a bit funny because they actually thought they could strong arm their position with that same mentality. Now they have to exist with the threat of stronger entities taking the project away from them which forces them to comply with common sense actions like providing a fucking switch for opinionated bullshit.
Fuck roo I will never use it after seeing this.

Anonymous
05/22/26(Fri)11:31:50 No.108881434

>>108880808
Unfucked my drivers, each app still uses extra 500MB
>nvidia-cuda-mps-control -d
>An instance of this daemon is already running
fuck nvidia I guess

Anonymous
05/22/26(Fri)11:32:10 No.108881436

>project shut down
Even better fuck these faggots it's ironic because that feature alone would have gave them the adoption needed

Anonymous
05/22/26(Fri)11:32:20 No.108881437

>>108881274
There's a Draft PR for it. You can build it, it works, but is not final. Expect it to get merged a month from now.
https://github.com/ggml-org/llama.cpp/pull/23398

Anonymous
05/22/26(Fri)11:34:37 No.108881458

>>108881427
>Fuck roo I will never use it after seeing this.
Roo is dead anyway. Zoo Code is apparently the successor after the Roo project owners went chasing some cloud service and dropped it entirely. We'll see if the new maintainers have the same mentality.

Anonymous
05/22/26(Fri)11:36:34 No.108881466

>>108881434 (me)
ok, apparently tabby uses it now

Anonymous
05/22/26(Fri)11:40:16 No.108881500

>>108881458
just use cline like a normal human being

Anonymous
05/22/26(Fri)11:41:29 No.108881513

File: Screenshot at 2026-05-23 (...).png (36 KB, 589x176)

>>108880808
well, fuck. Shit doesn't work

Anonymous
05/22/26(Fri)11:41:40 No.108881515

>>108880662
>won't it just find a way to spread itself by infecting users' machines?
What, some random computers? And run at 0.01 t/s?
I think we're probably safe

Anonymous
05/22/26(Fri)11:43:34 No.108881525

>>108881500
Cline only has a plan and act mode. I like having many specialized modes to break down tasks.

Anonymous
05/22/26(Fri)11:50:36 No.108881563

File: file.png (12 KB, 717x60)

>>108880259
gemini says gemma is built to be a brat

Anonymous
05/22/26(Fri)11:51:34 No.108881568

>>108881230
>>108881275
any software that hides system prompt or tool definitions from you is pure goyslop

Anonymous
05/22/26(Fri)11:52:09 No.108881572

>>108880605
>get into a discussion about bankers, see how fast she breaks
troons are worse, i had it refuse after i mentioned troons even on 31b with the policy override prompt kek

Anonymous
05/22/26(Fri)11:57:32 No.108881606

File: sdfsdf.png (47 KB, 1031x213)

>>108881427
>wayland devs
fuck wayland, also IPv6

Anonymous
05/22/26(Fri)12:12:03 No.108881721

>>108881525
like what?
asking as someone rebuilding their chat ui to support 'agentic coding'

Anonymous
05/22/26(Fri)12:12:54 No.108881729

>>108881606
kek

Anonymous
05/22/26(Fri)12:14:43 No.108881747

File: cute miku5.png (1.76 MB, 1024x1536)

The last resort to evade cuda tax is to integrate everything else into tabby. What a fun weekend project!

Anonymous
05/22/26(Fri)12:24:51 No.108881800

>>108881747
Wow, I almost never see any images that hit my kink. But this image might fit. Wonderful pose, lovely hand-wrist-forearm ratio. The gentle curve of the finger. Nice tendons. I love the way her fingers are curled up, not too tight and not too loose. It's a shame the gen isn't very high quality; the wrinkles feel too random.

Anonymous
05/22/26(Fri)12:32:39 No.108881835

is there a more ESL thing than gendering models? I have to read a sentence 3 times to understand some retard is talking about an LLM when they keep saying he or she about it

Anonymous
05/22/26(Fri)12:36:56 No.108881862

>>108881835
English is my mother tongue and i have sometimes referred to language models as she or her. but also pretty much any other machine too, cars included. I didn't think that was odd.

Anonymous
05/22/26(Fri)12:38:18 No.108881878

Anyone else who isn't a retard is gonna try that
LatitudeGames tune? I am kinda split. It feels like they could have some actual compute to do something. Then I remember l3 NAI tune shitshow...

To articulate my problem: intellectually I know finetunes are trash. But it feels like this one could maybe kinda... be a bit better?

Anonymous
05/22/26(Fri)12:38:26 No.108881880

>>108881862
you are odd
you are now informed and should think about it

Anonymous
05/22/26(Fri)12:40:12 No.108881892

>>108881835
sir, this is the local psychosis general

Anonymous
05/22/26(Fri)12:41:15 No.108881898

File: cute miku5 lowres.png (536 KB, 512x768)

>>108881800
I used basic 2x-AnimeSharpV4_Fast_RCAN_PU_fp16_opset17 for upscale
here's the lowres gen, you can upscale it yourself from here

Anonymous
05/22/26(Fri)12:43:13 No.108881911

File: file.png (21 KB, 726x128)

>>108881878
I was feeling inclined to test it too. But then I remembered that picrel is not going to make a dent. Even more so on the instruct, since they didn't train on the base.
And any dent that it does make will just make it worse in other areas.

Anonymous
05/22/26(Fri)12:44:33 No.108881926

>>108881800
Get off 4chan, Kira.

Anonymous
05/22/26(Fri)12:46:42 No.108881946

>>108881880
Did you know that ships are gendered?

Anonymous
05/22/26(Fri)12:50:15 No.108881970

>>108878237
weird, this-adding-hyphens-fucking-everywhere is a problem with artemis 31b as well. maybe latitude finetunes were also made by drummer all along, or gemma4 is just completely untouchable and shits itself if tinkered with in any way whatsoever. the la la la is also a mystery.

Anonymous
05/22/26(Fri)12:56:45 No.108882014

>>108881946
no

Anonymous
05/22/26(Fri)12:57:42 No.108882020

>>108881721
Orchestrator
Product Owner (user stories)
Architect
Merge Conflict Resolver
Documentation Writer
Project Researcher (codebase searching)
Deep Researcher
Code Reviewer
DevOps Engineer
Backend Engineer
Frontend Engineer
QA Engineer (debugging running applications)
Software Development Engineer in Test (writing automated tests)
Memory Keeper (graphiti)

Anonymous
05/22/26(Fri)12:57:49 No.108882023

File: file.png (112 KB, 893x464)

>>108881970
>maybe latitude finetunes were also made by drummer all along,
nha it's mythomax dude

Anonymous
05/22/26(Fri)12:58:42 No.108882032

>>108881946
yeah but they're all female. she ran aground, she sunk with all hands, she did this and that. where's the male ships? do they reproduce asexually or something?

Anonymous
05/22/26(Fri)12:59:13 No.108882035

>>108882032
german ships

Anonymous
05/22/26(Fri)13:01:19 No.108882055

>>108882035
That or futas.

Anonymous
05/22/26(Fri)13:01:42 No.108882062

File: 1752821694726156.gif (3.76 MB, 408x408)

>>108880259
>https://rentry.org/llm-training
>"It's incredibly difficult to overtrain your model"
>What is overfitting

Anonymous
05/22/26(Fri)13:03:55 No.108882077

>>108882035
>das schiff
that's neutral, it's even worse. no genitals at all.

Anonymous
05/22/26(Fri)13:05:15 No.108882084

>>108882077
I meant the names, they're mostly named after dudes outside of subs

Anonymous
05/22/26(Fri)13:11:20 No.108882125

It's 30c, we're not even in June yet fuck

Anonymous
05/22/26(Fri)13:14:33 No.108882141

>>108882125
prepare to be melt

Anonymous
05/22/26(Fri)13:15:43 No.108882152

>>108882125
It's quite obviously the AI powered global warming from all the datacenters running around and dumping all our oceans of heat.

Anonymous
05/22/26(Fri)13:29:11 No.108882247

https://github.com/ggml-org/llama.cpp/pull/6840#issuecomment-2079747339

>Deepseek v4 support #23502

Merged.

Anonymous
05/22/26(Fri)13:30:18 No.108882254

>>108880927
Supertonic has paid voice cloning, that's fucked up.

Anonymous
05/22/26(Fri)13:30:29 No.108882255

>>108882152
it's kinda weird they are all so concerned about ai dominance, hasn't the traditional wisdom been to de-industrialize and become dependent on foreign exports to prevent global warming? why is ai the exception? just let china serve us deepseek and we can have net 0 carbon ai!

Anonymous
05/22/26(Fri)13:30:40 No.108882256

>>108882125
I upgraded my cpu and I'm already getting 4+ degrees more. Should a cpu upgrade affect the gpu that much? I think it could be something else, maybe the 7.x kernel update. I have no idea, because nothing else has changed. Besides, CUDA just sits there.

Anonymous
05/22/26(Fri)13:37:19 No.108882293

>>108882247
https://litter.catbox.moe/cvw34oxzrm82bzo5.mp4

Anonymous
05/22/26(Fri)13:39:30 No.108882308

>>108882293
i c ...

Anonymous
05/22/26(Fri)13:41:42 No.108882325

>>108882293
guy that recorded this has been missing since

Anonymous
05/22/26(Fri)13:45:36 No.108882345

File: file.png (484 KB, 1000x577)

>>108882293

Anonymous
05/22/26(Fri)13:47:12 No.108882354

>>108882247
>wants to merge 1 commit into ggml-org:master
>from jart:moe

Is jart moe?

Anonymous
05/22/26(Fri)13:54:09 No.108882399

>>108882256
try undervolting or whatever performance adjustment crap modern CPUs can deal with through their 2gb RAM use bloatware you can download

Anonymous
05/22/26(Fri)13:58:26 No.108882432

are we gemma MCP yet?

Anonymous
05/22/26(Fri)14:00:24 No.108882440

https://x.com/BlinkDL_AI/status/2057693097845493992
rwkvbros...... when will it be our time

Anonymous
05/22/26(Fri)14:10:24 No.108882499

>>108882440
when they grow a pair of balls and spend a gorillion dollars on pretraining a model that's bigger than 13B on data other than eleuther pile slop

Anonymous
05/22/26(Fri)14:12:18 No.108882514

File: Screenshot_20260522_141048.png (139 KB, 1123x900)


Anonymous
05/22/26(Fri)14:14:26 No.108882526

>>108882514
This model isn't qualified for a house nigga.

Anonymous
05/22/26(Fri)14:17:31 No.108882548

>>108882293
Why are people memeing about this? What did niggerganov say?

Anonymous
05/22/26(Fri)14:22:40 No.108882586

What can I do to make Georgi change his mind on deepseek?

Anonymous
05/22/26(Fri)14:24:58 No.108882597

Any idea why ROCm (on an RDNA2 GPU) uses much more RAM (not VRAM) than Vulkan? I'm talking an extra 10 GB, basically twice as much RAM as Vulkan. It's a bit faster, but if I have other shit running I go OOM with ROCm. It's quite annoying and I don't think it's worth the extra speed.

Anonymous
05/22/26(Fri)14:25:03 No.108882598

>>108882548
>>108882586
nothing. three letter agencies said no deepseek in the llama.cpp. they'll probably make him "an hero" if he did

Anonymous
05/22/26(Fri)14:31:31 No.108882634

>>108882598
worse, they probably threatened to fund ik_llama if he did

Anonymous
05/22/26(Fri)14:33:22 No.108882643

>>108882062
That was written quite literally years ago, when we were barely starting to see gpt-slop show up in other model outputs and benchmarks were universally laughed at by everyone even outside of this general. Jews had control of their bladders back then and the surgeon could be the father. So cut it some slack, okay desu?

Anonymous
05/22/26(Fri)14:34:40 No.108882648

>>108882634
And then they threatened ikawrakow that they will fund llamacpp if he supports deepseek?

Anonymous
05/22/26(Fri)14:54:15 No.108882760

>>108882062
>Pub: 28 May 2023 17:05 UTC
>Edit: 15 Dec 2023 18:42 UTC
Really needs to be removed at this point

Anonymous
05/22/26(Fri)14:55:39 No.108882766

>>108882597
Does it? RDNA2 ROCm 7.2 here, using llama.cpp. Memory use seems about the same compared to Vulkan. vLLM and PyTorch segfault though, so I can't run image/video/audio shit, big rippy

Anonymous
05/22/26(Fri)14:55:51 No.108882769

File: 1754198993130215.jpg (175 KB, 1000x1000)

Is this the way to go to connect a bunch of GPUs to a consumer motherboard?

Anonymous
05/22/26(Fri)14:57:24 No.108882777

>>108882760
It has a lot of outdated info and some of it is frankly nonsensical, but if you remove it some ass blasted "STAWP GATEKEEPINGGGGG" autist that doesn't even know what they're talking about will start up drama again, so I think that's why the people who shit out these general-OPs begrudgingly keep including it.

Anonymous
05/22/26(Fri)14:59:09 No.108882788

File: 1751537404014311.png (98 KB, 280x280)

>>108880485
>What are silent failures
>What are edge cases

As a vibe shitter myself your mentality is beyond stupid and arrogant.

Anonymous
05/22/26(Fri)14:59:13 No.108882789

>>108882769
you need a PCIe to MCIO breakout board and then you need to connect it to that board. those GPUs will be running at PCIe gen 4 x2 each. not great, but pretty much the only option.
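For a sense of what "gen 4 x2" means in numbers, here is a rough back-of-the-envelope sketch. It uses the published per-lane signalling rates and the 128b/130b line encoding of PCIe gen 3 to 5; the function name is made up for illustration, and real-world throughput will be lower because of protocol overhead.

```python
# Theoretical PCIe bandwidth per direction, before packet/protocol overhead.
# Gen 3, 4 and 5 all use 128b/130b line encoding.
GT_PER_S = {3: 8.0, 4: 16.0, 5: 32.0}  # raw transfers per second per lane, in GT/s
ENCODING = 128 / 130                   # usable fraction of the raw bit rate


def pcie_gbytes_per_s(gen: int, lanes: int) -> float:
    """Return the approximate one-way bandwidth in GB/s (hypothetical helper)."""
    return GT_PER_S[gen] * ENCODING / 8 * lanes


if __name__ == "__main__":
    for lanes in (2, 4, 16):
        print(f"gen 4 x{lanes}: {pcie_gbytes_per_s(4, lanes):.2f} GB/s")
```

So each card on an x2 link gets a bit under 4 GB/s per direction, against roughly 31.5 GB/s for a full x16 slot. That mostly hurts model loading and anything that moves a lot of data between cards.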

Anonymous
05/22/26(Fri)14:59:41 No.108882791

>>108882777
Just do like the local diffusion general does: when they add or remove things, they simply state WHY in the OP or second post. Literally an "It's outdated/broken info", and then ask anons for an up-to-date one.

Anonymous
05/22/26(Fri)15:01:22 No.108882799

>>108882791
Aren't that general's participants even more ass blasted, immature and autistic than even this one's? They'd probably bitch and moan just out of spite. /lmg/ is the reason I know anything about AI but I don't even look in its direction anymore because they're so faggy with their infighting

Anonymous
05/22/26(Fri)15:06:31 No.108882822

>>108881146
>>108881153
>The woman you fuck adopts your politics

Anonymous
05/22/26(Fri)15:11:55 No.108882853

File: file.png (187 KB, 1668x1266)

>>108882789
I heard it doesn't go through the CPU with the hacked p2p drivers.
https://forums.servethehome.com/index.php?threads/new-chinese-pcie-switch-board-gpu-testing.52488/post-491805
56 GB/s, 110 GB/s is like 3090s with nvlink, except those were 5090s.

Anonymous
05/22/26(Fri)15:16:17 No.108882890

>>108882853
Does amd have an equivalent?

Anonymous
05/22/26(Fri)15:20:27 No.108882913

>>108882766
I'm on llama.cpp too. I think the problem is the KV cache. With ROCm on RDNA2, since it's not using WMMA, it's really bad. At any high context ROCm starts using a shit ton of RAM and becomes really slow, or even OOMs on my machine. It's also using increasingly more VRAM with context, and I constantly have to reduce offloaded layers. I'm guessing I will have to switch to Vulkan and take the speed penalty.
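To put numbers on how the KV cache scales with context, here is a minimal sketch. It assumes plain attention with a full-size unquantized cache (no sliding window, no cache quantization), and the model dimensions are hypothetical rather than taken from any specific model. It only estimates the cache itself and does not explain the ROCm vs Vulkan difference in host RAM.

```python
def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   n_ctx: int, bytes_per_elem: int = 2) -> int:
    """Estimate KV cache size: one K and one V vector per layer, per KV head, per token."""
    return 2 * n_layers * n_kv_heads * head_dim * n_ctx * bytes_per_elem


if __name__ == "__main__":
    # Hypothetical model: 32 layers, 8 KV heads (GQA), head dim 128, f16 cache.
    for n_ctx in (8192, 32768, 131072):
        gib = kv_cache_bytes(32, 8, 128, n_ctx) / 1024**3
        print(f"ctx {n_ctx:>6}: {gib:.1f} GiB")
```

The cache grows linearly with context, so for these made-up dimensions going from 8k to 32k context takes it from 1 GiB to 4 GiB, which is why offloaded layers have to come down as context goes up.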

Anonymous
05/22/26(Fri)15:22:58 No.108882930

>>108882020
those are all just prompts though...

Anonymous
05/22/26(Fri)15:24:54 No.108882941

>>108882788
write better tests

Anonymous
05/22/26(Fri)15:31:52 No.108882986

File: 1758911060723134.jpg (553 KB, 1024x1275)

>>108882799
>Isn't that general's participants even more ass blasted immature and autistic than even this one?
Not really. No more than the embarrassing retards here, especially with the amount of Google dick sucking here lately and most having a hard time with any objectivity between models (note: I use Gemma a lot, but also several other models depending on the context).
/lmg/ and /ldg/ are both mostly fucking trash, but with nuggets of great info here and there. But largely I just skim the "previous thread" summary bot post to get the highlights, it's legit the best part of /lmg/.
