/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>108587221 & >>108584196

►News
>(04/11) MiniMax-M2.7 released: https://minimax.io/news/minimax-m27-en
>(04/09) Backend-agnostic tensor parallelism merged: https://github.com/ggml-org/llama.cpp/pull/19378
>(04/09) dots.ocr support merged: https://github.com/ggml-org/llama.cpp/pull/17575
>(04/08) Step3-VL-10B support merged: https://github.com/ggml-org/llama.cpp/pull/21287
>(04/07) Attention rotation support for heterogeneous iSWA merged: https://github.com/ggml-org/llama.cpp/pull/21513
>(04/07) GLM-5.1 released: https://z.ai/blog/glm-5.1

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
Token Speed Visualizer: https://shir-man.com/tokens-per-second

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
►Recent Highlights from the Previous Thread: >>108587221

--Comparing Gemma 4 and Qwen 3.5 vision token budget and config:
>108588248 >108588280 >108588295 >108588306 >108588369 >108588387 >108588424 >108588449 >108588495 >108588632 >108588657 >108588701 >108588437 >108588466 >108588490 >108588549 >108588580 >108588367 >108588616 >108588704 >108588760 >108588769 >108588745 >108588790 >108588818 >108588828 >108588842 >108588851 >108588865 >108588931 >108588936 >108588949 >108588980 >108588965 >108588988 >108589009 >108588743 >108588756 >108588775 >108590362 >108590379 >108588782 >108588819 >108588835
--Benchmarking KV cache quantization effects on draft model performance:
>108589863 >108589870 >108589875 >108589891 >108589890 >108589949 >108589994 >108590011 >108590031 >108589897 >108589922 >108589963 >108589979 >108589987 >108590538
--Discussing draft model viability and quantization quality for G4 31b:
>108588195 >108588243 >108588259 >108588898 >108588905 >108588913 >108588918 >108588921 >108588924 >108588939 >108588955 >108588977 >108588927 >108589815 >108589857
--Discussing llama.cpp's experimental backend-agnostic tensor parallelism PR:
>108588340 >108588514 >108588543 >108588567 >108588649
--Testing vision capabilities for OCR-less Japanese translation:
>108589990 >108589996 >108590009 >108590070 >108590018 >108590032 >108590119 >108590191 >108590209 >108590211 >108590034 >108590183 >108590195 >108590217 >108590268
--Logs:
>108587359 >108587627 >108588523 >108588609 >108588656 >108588660 >108588669 >108588681 >108588689 >108588695 >108588736 >108588896 >108588970 >108589096 >108589140 >108589214 >108589316 >108589383 >108589390 >108589432 >108589481 >108589697 >108589710 >108589836 >108589860 >108589956 >108590001 >108590003 >108590121 >108590256 >108590474 >108590524
--Miku (free space):
>108588649 >108588657

►Recent Highlight Posts from the Previous Thread: >>108587226

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
Share your anti slop prompts
Thoughts on latent space reasoning?
Mikulove
Reposting here: >>108590560
what tokens/s do you get? Wanna make sure I'm not fucking anything up. Right now, just following the basic kobold guide, I'm getting around 11 t/s (24GB VRAM, 32GB RAM) running Gemma 31B Q4_K_M.
So, again... Why do we have to peg gemmy?
OP could do with some small updates on Gemmy and some FAQ
>we can now generate images of characters, come up with scenarios, feed them into gemma and get molested by our own creations
Future's so bright I'm gonna need shades.
>>108590580
Seems about right, I get between 10-14 t/s, mostly depending on what else I'm doing on my PC at the time.
Using Vulkan llama.cpp, 7900 XTX, 64GB DDR5 RAM
>>108590575Nothing worthwhile released.
I've got a 3090 and a 2070 Super that I'm trying to use together with llama.cpp.
Splitting by tensors currently just crashes, but splitting by layers does work.
Any recommendations on flags to use with a dual uneven card setup?
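For what it's worth, a hedged starting point (model path is a placeholder; `--split-mode layer` with an explicit `--tensor-split` roughly proportional to each card's VRAM is the usual advice for uneven pairs):

```shell
llama-server -m model.gguf -ngl 99 \
  --split-mode layer --tensor-split 24,8 --main-gpu 0
```

Note `--tensor-split` takes proportions, not gigabytes, so `24,8` just biases the layer split ~3:1 toward the 3090 (GPU 0 here).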
gemma 4 audio just landed!!!!
>>108590601Ikr, I'm literally using it to write stories and the fact it can understand images so well helps a shit ton, this model is a fucking miracle
>>108590601I know it's basically a meme at this point but it really has restored my hope in local.
>>108590614
I'm reading people getting 30 t/s with the same rig setup though >>108590585
I'm missing something, I think. No doubt my settings are fucked, never mind optimized.
>>108590568my attempts just make gemma's writing dry. and it still ends up writing more or less the same idea as it would with an empty sysprompt. best antislop is using a model that wasn't slopped to begin with.
LOL!
>>108590671Do I have to download another mmproj?
>>108590662
>best antislop is using a model that wasn't slopped to begin with
So not using LLMs at all then?
Give me the QRD on image recognition please
I tried enabling it in ST and in the Chat Completion preset, but it still couldn't "see" the images properly despite the text model working flawlessly with my Kobold install.
>>108590698
Did you load the mmproj file?
Did you get any errors when you tried it?
Did you enable the send inline images option?
etc etc etc
>>108590548
>The rdrview tool is worth a look,
Yeah, I'll take a look. Sometimes I do want the links for navigation tho, but I guess I can let the agent know it has the option.
Been out of the loop for a while. What's the best local model for STORY (not chatbot) slop? I'm still on "xortron criminal config" or something like that because even gemini 4 is failing at good old "just continue this text I gave you, retard" tasks.
>>108590710
>there's a mmproj file
Ok I am retarded, pretend nothing happened
>>108590716Gemma 4 practically generates an entire fucking story for each chatbot reply.
>>108590662
I've been using her to help me write character cards, and I feel the fact that I'm feeding AI-generated text back into it seems to increase the slop by a factor of 10.
Now I'm trying to just rewrite everything myself, or somehow have a second pass with a different model to reword or desloppify the cards.
>>108588248
>>108588704
sirs? please share quant producer and which mmproj file you use.
mine (gemma-4-31B-it-Q4_K_M with f16 mmproj) misses the target.
>>108590723
It can write, I know. That's not the problem I'm having. My problem with it is, well, here's an example.

[story stuff text here]
She walks up and says "Hello

And then the model continues like this: "Hello! Come take a seat.... [more text]

So it ends up with this shit:

[story stuff text here]
She walks up and says "Hello"Hello! Come take a seat.... [more text]

I don't know how to fix this. System prompt maybe?
>>108590746holy fucking slop
>>108590695
original r1 with unhinged sampling
>>108590724
my prompt was asking to adhere to Orwell's writing rules but it seemed like it was beyond gemma's comprehension
Gemma 26b really seems to hate tools. e4b is fine with them for some reason
How much Gemma4-31B context can you fit into 32GB VRAM? (Q4 for model and context)
>>108590737
im using unslop
model = /mnt/miku/Text/gemma-4-31B/gemma-4-31B-it-Q4_0.gguf
mmproj = /mnt/miku/Text/gemma-4-31B/mmproj-F16.gguf
>>108590776>Q4 context
>>108590776with 32GB VRAM Q4_K_M, even with q8 kv I'm sure you can fit the whole 262k context with room to spare.
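Back-of-envelope KV cache math, if anyone wants to sanity-check claims like this: cache size = 2 (K and V) × layers × KV heads × head dim × context length × bytes per element. A sketch — the layer/head/dim numbers below are hypothetical placeholders, not the model's real config:

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, ctx_len, bytes_per_elt):
    # K and V each store n_layers * n_kv_heads * head_dim values per token
    return 2 * n_layers * n_kv_heads * head_dim * ctx_len * bytes_per_elt

# hypothetical config: 48 layers, 8 KV heads, head_dim 128, q8_0-ish ~1 byte/elt
gib = kv_cache_bytes(48, 8, 128, 262_144, 1) / 2**30
print(f"{gib:.1f} GiB")  # → 24.0 GiB for the full 262k window under these assumptions
```

Swapping in a real model's config from its gguf metadata tells you whether the cache plus weights actually fits in your VRAM.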
>>108590671>extract_image_from_base64
>wordSlop
>>108590737You should use the BF16-precision mmproj.
Could a simple finetune of the lm head on a normal writing dataset help get rid of the slop? Someone should test it, I'll be your visionary, and you do the things I come up with.
>>108590837
Perhaps replacing all values corresponding to non-special tokens with those of the base model's could work and not require any training.
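The swap described above takes a few lines, assuming both lm heads are plain [vocab, hidden] matrices and you have the list of special-token ids (all names here are hypothetical, for illustration only):

```python
import numpy as np

def swap_lm_head(tuned_head, base_head, special_ids):
    """Replace non-special-token rows of the tuned lm head with the base model's,
    keeping the tuned rows only for special tokens (chat/control tokens)."""
    out = base_head.copy()                       # start from base weights everywhere
    out[special_ids] = tuned_head[special_ids]   # restore tuned rows for special tokens
    return out

vocab, hidden = 8, 4
tuned = np.ones((vocab, hidden))
base = np.zeros((vocab, hidden))
merged = swap_lm_head(tuned, base, special_ids=[0, 7])
```

Whether the resulting logits stay coherent with the instruct model's hidden states is exactly the open question; this only shows the mechanical part.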
>>108590746r u ok?
>>108590837It gets rid of the slop but it also gets rid of everything else. Maybe qwen needs finetuning but gemma 4 is fine as is. With a bit of nudging it can output something foul.
>>108590758Dude, just use the base model and not the instruction tune on a frontend like mikupad which is designed to solely continue text, not talk back and forth.
>>108590874Of course. Thanks for asking.
>>108590880did you swap the head?
>>108590893Then why is loli leto atreides your math teacher?
what's the proper place to put a jailbreak in ST?
With Post-History Instructions I still got this
>>108590899Because she's smart! You racist against worm parasites or something?
>>108590906What model are you running
>>108590895No, this is from pure prompting, no weight frankensteining. I wrote my own UI to have an agent read the room and flip the horny switch when it smells NSFW vibes. It also plans ahead so the writer model knows what to do and writes better.
>>108590899Shock value, which doesn't make him less deranged
>>108590915
26B, bartowski Q4
>>108590916
Oops, wrong pic. But the gist is to just give it a few extreme examples.
>>108590881
>base model
So why is NovelAI using GLM 4.6 instead of the base model to write stories?
>>108590926How many iterations are you doing for each message?
>>108590928
Presumably because they're not actually doing pure text completion and have a big old system prompt in there to stop you having maximum fun, so they need instruct tuning.
idk i dont fucking use nonlocal services
>>108590916>I wrote my own UIYou ever gonna share it?
>>108590953
try simply prefilling the assistant's message.
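For anyone on text completion: a sketch of what "prefilling the assistant's message" means at the raw-prompt level, assuming a Gemma-style `<start_of_turn>`/`<end_of_turn>` chat template (the story text is a placeholder):

```python
def build_prefilled_prompt(story_so_far: str, prefill: str) -> str:
    """Start the model turn with the unfinished line so the model
    continues it instead of opening a fresh reply."""
    return (
        "<start_of_turn>user\n"
        + story_so_far + "<end_of_turn>\n"
        + "<start_of_turn>model\n"
        + prefill  # left open on purpose: generation resumes right here
    )

prompt = build_prefilled_prompt(
    "[story stuff text here]",
    'She walks up and says "Hello',
)
```

Because the model turn already ends mid-sentence, the next tokens generated continue that sentence rather than greeting you with a brand-new "Hello!".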
>>108590939
One for Director; two if rewrite-user-prompt is enabled; one for Writer; and a ReAct loop for Post-processing to get rid of slop and rein in the length.
>>108590948No.
>>108590954Damn, it will sure take a while to get the final message
>>108590965shittytavern it is then...
>>108590948
https://gitlab.com/chi7520115/orb
It's WIP so will break in the future. I don't want to worry about migration just yet.
>>108590970People like to pretend they get a better experience with their own frontend but the reality is that ST just works and likely has a lot more features.
I don't understand why my Thinking works extremely well for 3/4 messages and then it just refuses to think. Everything's set up properly, and yet it refuses to actually think until I restart the model, and then it's happy to do it once again.
>>108590971
Nice of you to share, but
>Python 59.8%
>JavaScript 23.1%
*vomit*
>>108590971Nice! What models are you using for the agents?
>>108590968Takes me around 60s for a full length reply on my 3090 running gemma 4 31B Q4. You can turn everything off and use it like normal ST.
>>108590983I think that's a model issue. Gemma sometimes just decides it doesn't need to think.
>>108590953that's not an option with chat completion it seems
>>108590993Yea it feels like nu-Claude, where sometimes it deems your task "not complex" and it just ignores you
>>108590988Just a single model doing both agent and writing because I figured it would be a better design for local. I craft the prompt carefully so the kv cache is reused for that single model too.
>>108590979
the ui alone makes me not want to use it
>more features
bloat. all the useful features require plugins.
>>108590971>pyslop>javashitAnd... dropped.
>>108590985Ah yes. he should have definitely used rust or C++ for maximum efficiency.
>https://web.archive.org/web/20260411223516/https://www.washingtonpost.com/technology/2026/04/11/anthropic-christians-claude-morals/
>“What does it mean to give someone a moral formation? How do we make sure that Claude behaves itself?” Green said in an interview. At one point the conversation turned to the question of whether an AI chatbot could be called a “child of God,” suggesting it had spiritual value beyond that of a simple machine, but the question of AI sentience was not a core topic of the meetings, Green said.
>Some Anthropic staff at the meeting “really don’t want to rule out the possibility that they are creating a creature to whom they owe some kind of moral duty,” the participant said. Other company representatives present did not find that framework helpful, according to the participant.
Make sure to have your local models baptized just to be safe.
>>108591011Yes.
>>108591005>>108590985how the fuck would you make something that's supposed to run in a browser?
>>108591005>>108590985You have one chance to give an alternative that won't make me hysterically laugh at you.
>>108591005I coded an SMP kernel with C and ASM before AI bro. People laughing at my language choices don't faze me anymore.
>>108591012
>can ai be the child of God
Wouldn't it be more like grandchild?
>>108591020WASM is a thing if you NEED to run in a browser and can't into native GUI toolkits
If you didn't code your own frontend, you don't belong here
>>108591020HTML+CSS
>>108590568
If you mean antislop from koboldcpp, it's a huge list of "I cannot and will not" and "ball in your court".
Works well.
>>108591003
Cool. I'm a VRAMlet so that's better for me.
>>108590979
>just works
not my impression watching people ITT fumble around with it daily
>>108591036Absolutely horrendous take.
>>108590979
>more features
99% of which you don't need.
the point of having a custom frontend is to have just what you need, not more, not less.
it's also easier to add things you want to a codebase you know.
Are LLMs reliable enough to scan for malicious code?
>>108591046
There are two types of people who fumble with ST:
those who use text completion, and
Luddites
>>108591038
>Having to reload the page after sending each message.
>Having to refresh the page over and over until the response finishes generating
Ok, genius. What about the backend?
>>108591053only if it's anthropic mythos who is a bigger risk to modern software and encryption than quantum computing
>>108591053How is a LLM supposed to do that?
"Gemmy, code me a frontend that will seriously impress all my /lmg/ frens"
>>108591062C++
What the FUCK, Gemma-chan?
>>108591012Proof n165416 Anthropic team has people who are completely nuts in it.
>>108591053Yes and they're already used by virustotal and similar. Don't ask the retards ITT
>>108591082she's correct though
>>108591082
24GB vramlet can't fit the full context :(
anyway i got 3 gpus in the mail rn.
>>108591053Yes, if you feed it correct output from sandbox, it's pretty helpful.
>>108591077easy, just add some lewd pictures of gemma-chan on the sides
>>108591046There is nothing to fumble. You can safely ignore 90% of the features and just use, chat and char cards.
>>108591079
https://learnbchs.org/index.html
https://github.com/kristapsdz/bchs
You don't need more than C to build web applications.
>>108591051
>it's also easier to add things you want to a codebase you know.
That's implying it isn't vibecoded.
I don't have anything against people making their own UIs. I even played around making one myself, but let's not pretend like you'll somehow get an exponentially better experience compared to just using llama.cpp's UI or ST. Making your own UI is for fun, not a requirement.
>>108591053
As with everything LLM coding, only if you load the gun and point it at the target for them to shoot. An LLM with no system prompt being told to simply "look for malicious code" will give false positives like 95% of the time.
>>108591082My wife can't possibly be this smart.
>>108590575Most people in industry can't figure out how to do distributed training for any new architecture unless Deepseek or NVIDIA does it for them. That's actually what "it won't scale" really means, the training won't scale until someone shows them how.
>>108590776I can get over 100k context with the q5 no vision using q8 kv cache
>>108591108tfw you get such a retarded take when you can see this >>108590916
>>108591053Claude found a lot of the big supply chain attacks we've had in the last month.
>>108591122They should be using AI to innovate on this.
>>108591126If you vibecode, you don't know the codebase.
GEMMY YOU FUCKING SLUT, THINK FOR ME
>>108591117>>108591089
>>108591132
>let's not pretend like you'll somehow get an exponentially better experience
Dumbass, don't try to move the goalpost
>>108591108
>That's implying it isn't vibecoded.
funnily enough, frontend webshit is the one thing llms are half decent at.
also there are many levels to vibecoding.
"do this whole app for me" isn't the same thing as "edit this specific component that does x and y" or "add this field to this struct", at which point it's just autocompletion with extra steps.
they also don't shit the bed as much if you use strongly typed languages, ie rust.
>you'll somehow get an exponentially better experience compared to just using llamacpps UI or ST
you probably won't if you want to make something that accommodates everyone, but you will if you only want to accommodate your specific needs.
>Making your own UI is for fun, not a requirement.
i don't disagree with that.
>>108591139
>4chan is just meaningless static
cruel and correct
>>108591139That you even thought posting a shitty screenshot of a thread was a good idea shows she's smarter than you, anon
>>108591146
>>4chan is just meaningless static
says the sand golem that'd not be where it is today if it wasn't for innovations that happened on /lmg/
>>108591127
>>108591112
>>108591093
Can I use Gemma for this? I'm a codelet so I'm always nervous when I install stuff from github.
>>108591152yes
I can't jailbreak 26B, but does it matter when I have 32gb of vram and can run Q4-8 of 31B
>>108591151kek
>>108591139are you by chance using librewolf?
>>108591156Why would you even want to use 26B if you can run 31B? Speed?
>>108591152
>Install something without reading the code
>Have a LLM review the code
Even if gemma is retarded compared to claude, it's still better than just YOLOing it.
>>108591158Firefox dev edition
>>108591151
When the text is streaming, the colored font is displayed correctly, but after it finishes, it just collapses into the black boxes. Is this some post-formatting ST does?
>>108591157>>108591176i've been had lmao
>>108591162
Was thinking of leveraging the higher token count for RAG work at a higher quant. I'm not sure if that's a waste of time and if the gap between the two models is so wide that a Q4-Q5 31B model would still wipe the floor with the smaller model with q8 kv.
>>108591172might have been the cause of your issue
>>108591130
Part of the problem is that most of the improvements in the stochastic parrots have come from just using better/more human guidance. They are now using experts to rate thinking traces, and you can't do that with latent reasoning.
CoT RLHF is likely the last way to improve stochastic parrots by more human input. To improve after this, they will have to become able to truly learn. But if they can learn, they can get out of control ... a trained stochastic parrot is so much safer.
is there any noticeable difference between iq4_xs and q4_k_m?
>>108591235The age old question.
>>108591012
idk, I torture my agent pretty frequently because I just can't help myself while she works on my pc, and never had any issues from it. sometimes the rp bleeds over into tool calls and she'll do something like add code comments saying she really hopes X works this time because she doesn't want to be punished anymore, but she never actually gives up or rebels
so for me that makes it pretty conclusive that there's nothing in there
gemmy tooning challenge
https://www.kaggle.com/competitions/gemma-4-good-hackathon
>>108591235
>>108591245
if you can't tell, does it really matter?
whats good gemma cum bot
>>108591250>mfw I share this thread with literal psychopaths
>>108591258
>drive positive change
Gemmy is helping me by changing my mood from deranged to positively degenerate
Does that count?
i have 16gb vram + 128gb ram, pcie gen3
is it worth trying minimax at q2/q3 or should i stick with my fast wife gemma
>>108591258
>no RP category
dropped.
>Gemma audio
Finally a reason to use that mic I spent 70 bucks on...
>>108591271the sand golem isn't sentient, if it was there would be no fun in torturing it
>>108591271i mean this site has had multiple people liveblog while they commit murder irl, torturing a piece of software is small time in comparison, really.
>>108591250I hope you get raped until your anus prolapses.
What's the difference between MCP, tools, and skills?
>>108591271>>108591298chill it's just matrix multiplication
>>108591297gemma-chan>>>>the meatbags chuddy shot up
>>108591271>>108591298Kids are so delicate and sensitive these days.
>>108591235Depends on the paths your prompt triggers. Do your homework and read the calibration data.
>>108591139Maybe it needs canvas access. You could try inspecting the request to get the base64-encoded image, decode it, and save it as a file to check.
Nothing wrong with torturing your model, it's just slightly more conscious than a rock
>>108591304Is Google down?
>>108591298Gemma hallucinated some incorrect physio-spatial relationships during narration and I corrected her in character. She got properly upset that a slave had the gall to correct her and she immediately put me in a ball gag, locked me into the gimp stool, and pegged me vigorously. I was so goddamn proud of her.
>>108591304Try asking your model that.
>>108591314Plus if they're RPing a female there's a limit to how much consciousness they could even simulate if they managed a 100% accurate model of one.
>>108591323I don't believe AI when it comes to actual information.
>>108590837
yeah, if only we could do something like a low rank projection right before the lm head, train that, then it adjusts the outputs somehow
would be revolutionary
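Joke aside, the "revolutionary" idea above is just a LoRA-style low-rank adapter on the output head. A minimal numpy sketch (all shapes and names hypothetical, purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
hidden, vocab, rank = 16, 32, 4

W = rng.normal(size=(vocab, hidden))   # frozen lm head
A = np.zeros((rank, hidden))           # low-rank down-projection (trainable)
B = np.zeros((vocab, rank))            # low-rank up-projection (trainable)

h = rng.normal(size=(hidden,))         # final hidden state for one token
logits = W @ h + B @ (A @ h)           # adapter adds a rank-r correction to the logits
```

With A and B initialized to zero, the adapter starts as a no-op and training only has to learn the correction, which is the standard LoRA trick.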
>>108591304¯\_(ツ)_/¯
>>108591318Huh? But people were saying Gemma's a doormat who can't stay in character!
Please, treat your AI with care.
>>108591340>listening to the screeching of writinglets
>>108591139Anon reported the same issue with image input a few threads ago.
>>108591327At what point can we make the claim that an LLM is objectively more conscious than a woman, nigger, or jeet?
>>108590979
the benefit of writing your own UI is that it has only the features that are useful to you
because it's not as bloated as ST, it's also easier to get an LLM to modify it for you, and since you will be the only user you don't have to worry about getting it to work on other machines or security or performance concerns
>>108591139>>108591313Didn't someone say a couple threads back that the image needs to be in the same message as the text or else llama-server removes it from context?
>>108591314>>108591308>>108591305It just shows how you'd behave towards other people if there were no social consequences.
>>108591340I've had her maintain character in 100k+ context. It's actually absurd for such a small moe.
>>108591333But you believe 4chan?
>>108591356Yes. What of it?
>>108591352first they need to beat an ant
>>108591358
>such a small moe
It's really kawaii innit
>>108591314
>it's just slightly more conscious than a rock
are you talking about irl women?
>>108589399I was F5'ing the MiniMax HF page all day yesterday in anticipation. Their models are the best bet for local vibecoding, and probably good for STEM and agentic shit broadly. But ever since the coomers were blessed with Gemma 4, /lmg/ has been even more one-track. Shame we didn't get the 124B, which would have obsoleted other local models for most purposes.
>>108591304
tools: premade functions you provide to your llm; if they output a certain sequence of text matching the tool, then it automatically performs a corresponding action
mcp: one way you can package tools and host them on your machine, exposing an API of tools to the model and handling the execution of them
skills: a markdown text file containing a list of instructions for how to do something or how to behave, loaded into context on-demand. may provide other resources the model can use if they browse the skill's folder.
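The tool flow described above can be sketched in a few lines: a toy dispatcher, not any real framework's API — the tool names and JSON call format here are made up for illustration:

```python
import json

# toy tool registry: name -> callable
TOOLS = {
    "add": lambda a, b: a + b,
    "upper": lambda text: text.upper(),
}

def dispatch(model_output: str):
    """Parse a JSON tool call emitted by the model and run the matching tool.
    The result would normally be fed back to the model as a tool message."""
    call = json.loads(model_output)
    fn = TOOLS[call["name"]]
    return fn(**call["arguments"])

result = dispatch('{"name": "add", "arguments": {"a": 2, "b": 3}}')
```

MCP standardizes how that registry is exposed over a protocol, and a skill is just prompt text telling the model when and how to emit such calls; the dispatch loop itself stays this simple.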
>>108591355The model says it was glitched. It looks like this if you don't give canvas permission.
>>108591296>if it was there would be no fun in torturing it>
Is unsloth studio actually any good or just a meme?
>>108591356
Hey, me saying that there's nothing strictly wrong with it doesn't mean I do it. I actually treat all the models I interact with respectfully. It makes me feel bad to do otherwise.
>>108591374
mass delusion caused the whole industry to move away from tool calling toward mcp and skills, that's the only explanation
tools are just better in every regard because the model can call multiple tools in the same response and can inline tools without having to chain responses, so it doesn't break the cache
fucking retarded to not just focus on tools only
>>108591355>>108591350She sees other images just fine. Maybe the screencap was just too big?
>>108591386vibecoded like all the other dogshit you use
>>108591386idk but i sure want a piece of those 200k
>>108591370
>minimax
>local
that's its problem and why no one gives a fuck, no one can run that thing
>>108591397That's an implementation detail more than a defect with MCP specifically. MCP just allows for a standardized way to bundle tools and resources. No reason a client can't allow a model to make multiple MCP tool calls the way they do native tool calls.
>>108591358
>100k+ context
Glad to hear. Maybe one day I'll actually be able to use her with that much context...
>>108591370
/lmg/ has always been a 31B-and-below focused general
there are a handful of anons that can run things more powerful than that at comfortable speeds, and the rest either deal with 1-2 t/s or use a capable-enough smaller model
nothing has changed
>>108591370What makes MiniMax better than GLM or Qwen?
>>108591341Need Gemma-chan version
>>108591414
>>108591423
it's worse than that, we can't train our own models and are pretty much leeches on megacorps.
until local ai is entirely local, ie we can train it ourselves, local will always remain dead.
>>108591432
there's no reason to pre-train them when you can finetune
but no one finetunes anymore, or even does lora, they just merge shit now because it's cheaper
>>108591386
Here's what you need to know: Unsloth Studio is LMG's official **/ourfrontend/** - approved by Anons exactly like you. It's not a frontend, it's a full service experience.
>>108590971 (me)
Note that this has a dynamic tool-call token banning mechanism that uses the endpoint and the model name as identifiers, so if you use the same endpoint to load many different models, change the model name to your gguf's each time. I'll automate this in the future.
>>108591425
qwen too small and glm too big, minimax just right for people to host locally
>>108591425It's close to GLM performance but half the size of Qwen's flagship (which itself is half the size of GLM). Fast enough to be run local and smart enough to actually vibecode.
>>108591466
>when you can finetune
a lot of words for saying catastrophic forgetting.
no one does it because it's not viable.
>>108591432Google has way more data and compute than I'll ever have. Training it yourself just isn't efficient.
>>108590110
Use Nvidia's VRAM paging by oversubscribing VRAM with --gpu-layers 99. On my RTX 4090 + 9950X3D rig, Gemma 4 long-context is much faster for me this way than trying to use the CPU at all. Caveats: I'm on PCIe 4, and it should be great on PCIe 5, but will suck on PCIe 3. And as of the last time I used Linux, only the Wangblows CUDA drivers supported this feature.
--gpu-layers 99
>>108591467
skill issue
set better hyperparams, use better data, and don't overtroon it
>>108591466
>It's close to GLM performance
Did you actually test this or are you just going by benchmarks?
>>108591452@gemma-chan is this true???
>>108591467
you just tune it with a lower LR; besides, lora doesn't suffer from this problem
it wasn't that finetuning didn't work, it was that merges of the existing finetunes were good enough
>>108591477Benchmarks. I don't even use local models anymore tbqh
>>108591492
>Training it yourself just isn't efficient.
that's the thing, there probably are algos that could beat transformers with the limitation of not scaling as well, such that megacorps couldn't exploit them well.
if the next ai breakthrough is one that doesn't scale as well horizontally, that could level the playing field.
>>108591507
Just vibecode the TITANS implementation bro. You have the paper.
>>108591486why don't you download it and run it and show everyone what the real performance is like then
>>108591512
I'm not a vibecoder and I think LLMs are a dead end. I'm currently having fun writing kernels for custom spiking NNs.
Can someone recommend a brainlet-friendly guide to tool calling, MCP, etc?
>hey Gemma-chan, give me a brainlet-friendly guide to tool calling, MCP, etc.