/g/ - /lmg/ - Local Models General - Technology

[a / b / c / d / e / f / g / gif / h / hr / k / m / o / p / s / t / u / v / vg / vm / vmg / vr / vrpg / vst / w / wg] [i / ic] [r9k / s4s / vip] [cm / hm / lgbt / y] [3 / aco / adv / an / bant / biz / cgl / ck / co / diy / fa / fit / gd / hc / his / int / jp / lit / mlp / mu / n / news / out / po / pol / pw / qst / sci / soc / sp / tg / toy / trv / tv / vp / vt / wsg / wsr / x / xs] [Settings] [Search] [Mobile] [Home]

Board

▼ Settings Mobile Home

/g/ - Technology

Return Catalog Bottom Refresh

Thread archived.
You cannot reply anymore.

[Advertise on 4chan]

[Return] [Catalog] [Bottom]

Anonymous

/lmg/ - Local Models General 06/16/26(Tue)10:36:28 No.109069535

File: 00120-3282228290.png (673 KB, 1216x832)

673 KB PNG

/lmg/ - Local Models General Anonymous 06/16/26(Tue)10:36:28 No.109069535 Archived

/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>109063196 & >>109057485

►News
>(06/13) Rio 3.5 Open 397B released with SwiReasoning: https://hf.co/prefeitura-rio/Rio-3.5-Open-397B
>(06/12) MiniMax-M3 released, multimodal 428B-A23B with 1M context: https://hf.co/MiniMaxAI/MiniMax-M3
>(06/12) Kimi K2.7 Code released: https://hf.co/moonshotai/Kimi-K2.7-Code
>(06/12) EAGLE3 speculative decoding support merged: https://github.com/ggml-org/llama.cpp/pull/18039

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://swe-rebench.com
Agentic Coding: https://deepswe.datacurve.ai
Context Length: https://github.com/RecapAnon/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
Token Speed Visualizer: https://shir-man.com/tokens-per-second

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm

Anonymous
06/16/26(Tue)10:36:44 No.109069538

Anonymous 06/16/26(Tue)10:36:44 No.109069538

File: t2.png (79 KB, 768x512)

79 KB PNG

►Recent Highlights from the Previous Thread: >>109063196

--NoLiMa long-context testing for Gemma and Llama.cpp SWA optimization:
>109063426 >109063453 >109063699 >109063857 >109064106 >109065875 >109066105 >109066221 >109066334 >109064540 >109064728 >109063899
--Llama.cpp fork enabling dynamic tensor-level quant selection for VRAM optimization:
>109067166
--Analyzing Mistral training loss graphs and pretraining decay strategies:
>109063266 >109063370 >109064378 >109064604
--Comparing firejail, bubblewrap, and systemd-run for opsec sandboxing:
>109066086 >109066328 >109066386 >109066419 >109066436 >109066460 >109066435
--Evaluating Nemotron-Labs-Diffusion-14B's Tri-Mode architecture and parallel decoding benefits:
>109069271 >109069341 >109069410 >109069439
--Comparing Gemma 4, Nemotron, and Mistral-Nemo for roleplay prose:
>109063223 >109063233 >109063393 >109063408 >109063444 >109063526 >109063538 >109063572
--Release of Eurobeat ACEStep 1.5 XL LoRA and training resources:
>109064901 >109066072 >109066092 >109066128
--Advice on multi-GPU hardware configuration and PCIe lane limitations:
>109065551 >109065762 >109065831 >109065846 >109065854 >109065888
--Random generators for game prompts and Mistral "Le Chaton Fat" rumors:
>109067756 >109067852 >109067936 >109068045 >109068388 >109068454 >109069006 >109069024 >109069075 >109069177 >109069264 >109069309 >109069327 >109069381 >109069339 >109069382
--Debating repetitive roleplay prose and Gemma's inability to adapt styles:
>109067282 >109067319 >109067502 >109067575 >109067643 >109068248 >109069016
--Critique of Meta's proprietary model strategy and lack of API:
>109064067 >109065224 >109065426
--Logs:
>109063223 >109063233 >109065370 >109067487 >109068747
--Miku, Yuki (free space):
>109063890 >109064379 >109064500 >109064818 >109064851 >109069525

►Recent Highlight Posts from the Previous Thread: >>109063201

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script

Anonymous
06/16/26(Tue)10:38:04 No.109069550

Anonymous 06/16/26(Tue)10:38:04 No.109069550

File: 1773633728430702.png (5 KB, 327x285)

5 KB PNG

>>109069535

>>109061013
>Qwen is better on code, Gemma for everything else

/Thread
>>109061033

>If you want it to divert, you must prompt it to divert based on a context for how and why.

I don't know whether or not silly tavern has this feature so please don't grill me if it does because I don't use it often:

I think one way you could solve this issue is to copy Opencode's "mode" switching feature. Whenever you first prompt a model you use one of two default "modes": "Build" mode and "Plan" mode. Each one has its own specific system prompt that gets sent to the model. Build is more permissive at plan is, well, pretty self-explanatory: only for planning and the system prompt explicitly forbids it from making any edits to files even if you tell it to. It will only change things if it's in build mode. Open coat does this by changing the system prompt within the context itself every time you switch. Let's suppose you start in plan mode. The model gets sent the plan mode system card. Then when you switch the build mode the front end uses your back end to rewrite history as if it was always in build mode but going to the beginning of the conversation and switching out the system prompts. The immediate downside is that it has to re-prefill the entire context but in return it follows your instructions more clearly, which is very important if you're using it for "vibe coding" because you want to make sure it does not do anything you didn't explicitly tell you to or accidentally nuke something.

Anonymous
06/16/26(Tue)10:38:28 No.109069555

Anonymous 06/16/26(Tue)10:38:28 No.109069555

70b dense

Anonymous
06/16/26(Tue)10:39:13 No.109069564

Anonymous 06/16/26(Tue)10:39:13 No.109069564

>>109069550
>Qwen is better on code
Only for non-programmers and jeets

Anonymous
06/16/26(Tue)10:39:17 No.109069565

Anonymous 06/16/26(Tue)10:39:17 No.109069565

File: 1763096221808842.jpg (282 KB, 960x960)

282 KB JPG

>>109069526

Anonymous
06/16/26(Tue)10:39:23 No.109069567

Anonymous 06/16/26(Tue)10:39:23 No.109069567

>>109069535
powerful

Anonymous
06/16/26(Tue)10:39:25 No.109069568

Anonymous 06/16/26(Tue)10:39:25 No.109069568

1T dense mythos teacher model

Anonymous
06/16/26(Tue)10:40:10 No.109069574

Anonymous 06/16/26(Tue)10:40:10 No.109069574

>>109069565
I don't get it, why are you reacting with a picture of a gorilla?

Anonymous
06/16/26(Tue)10:40:40 No.109069579

Anonymous 06/16/26(Tue)10:40:40 No.109069579

File: 1764669215500865.jpg (129 KB, 870x1396)

129 KB JPG

Continuation of >>109069550

Perhaps this could be implemented for general purpose-based front-ends like silly tavern where it can automatically switch "modes" , character "moods", "personality" either at the direction of the user or automatically based on what is said. How could this be done automatically? The model itself can decide which "mode" or "mood" the "character" should be in based on the direction of the current conversation. Friends like llama-server's gui are capable of doing this because they could automatically name the chat based on what you initially ask it so in theory they should be doable. This way you can not only have character cards and lorebooks but additional cards that dictate how the character Acts whenever they're in a particular mood

>eg. "x character acts and talks like this when they are sad", y character acts like this when they are horny" etc).

it then dynamically modifies the system prompt within the context to ensure it complies and not just the character to act how it should act.

Much easier said than done and this is just a random shower thought I had after reading this post >>109060985

>>109069564
>May we see what you've done?
>No

Anonymous
06/16/26(Tue)10:41:28 No.109069583

Anonymous 06/16/26(Tue)10:41:28 No.109069583

>>109069574
What is even working with a 0.8B model?

Anonymous
06/16/26(Tue)10:41:41 No.109069586

Anonymous 06/16/26(Tue)10:41:41 No.109069586

>>109069564
I could have sworn we just went over this

Anonymous
06/16/26(Tue)10:41:44 No.109069587

Anonymous 06/16/26(Tue)10:41:44 No.109069587

>>109069574
nta but
>e2b
>0.8b
they are barely coherent.. unless for extremely lightweight use

Anonymous
06/16/26(Tue)10:41:50 No.109069589

Anonymous 06/16/26(Tue)10:41:50 No.109069589

>>109069579
are you a non-programmer or a jeet

Anonymous
06/16/26(Tue)10:42:24 No.109069595

Anonymous 06/16/26(Tue)10:42:24 No.109069595

File: 1756665190944831.jpg (100 KB, 1000x835)

100 KB JPG

>>109069526
Cool!
What do you use small models for, I've been trying to find some usecases for a small fun project but uhhh yeah

Anonymous
06/16/26(Tue)10:43:22 No.109069604

Anonymous 06/16/26(Tue)10:43:22 No.109069604

>>109069589
You clearly have produced nothing of substance yourself. You're too dumb to do so even with the help of AI

Anonymous
06/16/26(Tue)10:43:43 No.109069609

Anonymous 06/16/26(Tue)10:43:43 No.109069609

Mistral is producing amazing models.
Europeans are on the map.

Anonymous
06/16/26(Tue)10:44:14 No.109069612

Anonymous 06/16/26(Tue)10:44:14 No.109069612

>>109069609
Which map?

Anonymous
06/16/26(Tue)10:44:18 No.109069613

Anonymous 06/16/26(Tue)10:44:18 No.109069613

File: 1781368877991893.png (517 KB, 512x768)

517 KB PNG

>>109069586
Qwen shill timed his post badly and has to redo it, or his boss will be at his ass again

Anonymous
06/16/26(Tue)10:45:58 No.109069626

Anonymous 06/16/26(Tue)10:45:58 No.109069626

>>109069609
they really should make 30T dense model for real;

Anonymous
06/16/26(Tue)10:47:02 No.109069628

Anonymous 06/16/26(Tue)10:47:02 No.109069628

>>109069626
It is literally illegal to train models that large in Europe, even assuming a tiny French company had the GPUs to do so (they don't)

Anonymous
06/16/26(Tue)10:47:57 No.109069636

Anonymous 06/16/26(Tue)10:47:57 No.109069636

>>109069628
>literally illegal to train
do euros really?
lol wtf

Anonymous
06/16/26(Tue)10:48:04 No.109069638

Anonymous 06/16/26(Tue)10:48:04 No.109069638

>>109069583
>>109069587
They seem to work well enough though. Have you got any other recommendations?
I'm looking for a 100+ tokens/s (edge) model that fits on 12gb.
>>109069595
Just planning on hooking it up to windows right click context menu, and also to translate stuff. With gemma 4 31b at q8, I'm running at 200 tokens/s pp and 35 tokens/s tg... with thinking enabled, that's unbearably slow.

Anonymous
06/16/26(Tue)10:48:20 No.109069639

Anonymous 06/16/26(Tue)10:48:20 No.109069639

>>109069586
QRD?

Anonymous
06/16/26(Tue)10:49:17 No.109069648

Anonymous 06/16/26(Tue)10:49:17 No.109069648

>>109069612
He meant MAP.

Anonymous
06/16/26(Tue)10:49:30 No.109069650

Anonymous 06/16/26(Tue)10:49:30 No.109069650

>>109069550
>>109069579
humoring your wall of slop: there's options like that in RP tools, but if you can skip it, you'll want to. coding tools usually switch once (and title generation additionally only uses the first response), but for RP you'll have to keep the whole history and switch for each message which is why e.g. orb suggest using different models as you don't want to bust your caches.
there's a lot you could do if you don't give a shit about efficiency, but then you can't use the big models at barely any tok/s.

Anonymous
06/16/26(Tue)10:50:24 No.109069660

Anonymous 06/16/26(Tue)10:50:24 No.109069660

kimi flash
I want that now

Anonymous
06/16/26(Tue)10:54:02 No.109069687

Anonymous 06/16/26(Tue)10:54:02 No.109069687

File: 1762523867868687.png (385 KB, 640x526)

385 KB PNG

>>109069638
>35tk/s
>slow
We love things slow here

Anonymous
06/16/26(Tue)10:56:17 No.109069709

Anonymous 06/16/26(Tue)10:56:17 No.109069709

>>109069687
go back retard

Anonymous
06/16/26(Tue)10:56:18 No.109069710

Anonymous 06/16/26(Tue)10:56:18 No.109069710

>>109069579
Somebody already came up with that idea several months ago: https://github.com/OrbFrontend/Orb

Anonymous
06/16/26(Tue)10:56:21 No.109069713

Anonymous 06/16/26(Tue)10:56:21 No.109069713

File: IMG_3218.png (303 KB, 736x736)

303 KB PNG

>>109069687
>200 tokens/s pp

Anonymous
06/16/26(Tue)10:57:00 No.109069717

Anonymous 06/16/26(Tue)10:57:00 No.109069717

>>109069636
It's not. There's extra red tape if the models require more than 10^25 FLOP of compute for training, but you can make MoE models almost arbitrarily large at little to no extra compute. Anyway it's unclear if this will be changed together with other limitations for example on training data. I've read that the rules are going to be revised a bit.

https://explorer.artificialintelligenceact.eu/en/single/?type=article&num=51
(they write "1025", but it's 10^25)

Anonymous
06/16/26(Tue)10:57:05 No.109069719

Anonymous 06/16/26(Tue)10:57:05 No.109069719

File: Lam_and_Xi_=_Piglet_and_Pooh.jpg (109 KB, 415x242)

109 KB JPG

>>109069579
Qwen is a benchmarked trash model, and nothing can make it popular on 4chan. Your new tactic "alright, it's not the best at everything, but can we at least agree it is better at code?" will not work

Anonymous
06/16/26(Tue)10:57:29 No.109069722

Anonymous 06/16/26(Tue)10:57:29 No.109069722

>>109069579
https://github.com/OrbFrontend/Orb
Agentic ST with a prose fixer, etc

Anonymous
06/16/26(Tue)10:57:48 No.109069727

Anonymous 06/16/26(Tue)10:57:48 No.109069727

>>109069709
hmm, nyo

Anonymous
06/16/26(Tue)10:58:31 No.109069734

Anonymous 06/16/26(Tue)10:58:31 No.109069734

>>109069579
and it's even got "moods" now https://github.com/OrbFrontend/Orb/blob/main/Orb.png

Anonymous
06/16/26(Tue)10:58:34 No.109069735

Anonymous 06/16/26(Tue)10:58:34 No.109069735

>>109069638
you can try some of the 8b-a1b moes, I guess. LFM or zaya (?) had some, although I'm not sure if they're supported in llama.cpp
don't expect too much.

Anonymous
06/16/26(Tue)10:58:47 No.109069736

Anonymous 06/16/26(Tue)10:58:47 No.109069736

>>109069687
i look like this and say this

Anonymous
06/16/26(Tue)10:59:04 No.109069740

Anonymous 06/16/26(Tue)10:59:04 No.109069740

>>109069650
>coding tools usually switch once
Open coat bike default will only switch modes when you do it yourself by toggling them. They don't switch automatically. You might be confusing modes with "sub agents". Mode switching is not triggered automatically. My wall of text proposes a feature or something like that is done automatically based on whether or not the model "thinks" the initial system prompt within the context should be modified (this would probably necessitate the model being half decent at tool calling and being versatile enough to know when to trigger that toolcall even while RPing.

>and switch for each message

??? Why would switching after every single message be necessary? It would only need to happen based on the surrounding context (eg. The current "scene" or event that causes the character to act differently).

Anonymous
06/16/26(Tue)11:00:28 No.109069749

Anonymous 06/16/26(Tue)11:00:28 No.109069749

>>109069735
8b is a bit big, it wouldn't leave enough room for doing other stuff at the same time...

Anonymous
06/16/26(Tue)11:00:36 No.109069750

Anonymous 06/16/26(Tue)11:00:36 No.109069750

Should I buy a GMKtec EVO-X2 if I want to dip my toes into mid-tier models? It seems like the absolute cheapest entry price into that but its chinesium so idk.

Anonymous
06/16/26(Tue)11:01:36 No.109069756

Anonymous 06/16/26(Tue)11:01:36 No.109069756

>>109069749
try leaving the expert weight on the cpu, streaming them might not be too bad with only 1ba

Anonymous
06/16/26(Tue)11:01:37 No.109069757

Anonymous 06/16/26(Tue)11:01:37 No.109069757

>>109069719
/vcg/ local users seem to quite like it actually. /lmg/ having Gemma for diehards makes total sense since making your dick hard is the only thing you guys care about. You clearly have the wrong type of autism (literally). the one where you're stubborn less calls you to act stupid.

Anonymous
06/16/26(Tue)11:02:12 No.109069760

Anonymous 06/16/26(Tue)11:02:12 No.109069760

>>109069735
Just use bonzai

Anonymous
06/16/26(Tue)11:03:00 No.109069765

Anonymous 06/16/26(Tue)11:03:00 No.109069765

>>109069756
Unfortunately, I am on ddr4, and cpumoe destroys the speed.

Anonymous
06/16/26(Tue)11:03:47 No.109069774

Anonymous 06/16/26(Tue)11:03:47 No.109069774

https://github.com/ikawrakow/ik_llama.cpp/pull/1970
why would i use dflash when it's slower than mtp?

Anonymous
06/16/26(Tue)11:04:01 No.109069778

Anonymous 06/16/26(Tue)11:04:01 No.109069778

>>109069760
Aren't those the ternary models? Are they supported in llama?

Anonymous
06/16/26(Tue)11:04:13 No.109069780

Anonymous 06/16/26(Tue)11:04:13 No.109069780

>>109069750
you're going to chug along at 8t/s on dense models with those

Anonymous
06/16/26(Tue)11:05:09 No.109069783

Anonymous 06/16/26(Tue)11:05:09 No.109069783

>>109069749
maybe try q6_k?
>>109069756
usually yeah, but if you want to go for 100 ts tg that might bottleneck you hard.

Anonymous
06/16/26(Tue)11:05:11 No.109069784

Anonymous 06/16/26(Tue)11:05:11 No.109069784

>>109069757
Who asked you to leave the retard containment general?

Anonymous
06/16/26(Tue)11:05:35 No.109069785

Anonymous 06/16/26(Tue)11:05:35 No.109069785

>>109069778
https://github.com/ggml-org/llama.cpp/issues/21298

Anonymous
06/16/26(Tue)11:05:41 No.109069787

Anonymous 06/16/26(Tue)11:05:41 No.109069787

I'm really concerned that no one seems to be interested in very small language models. I feel like there is a huge gap in potential there simply from being able to have a ridiculously fast small LLM that can make a lot of changes very rapidly.

There HAS to be some utility in a 0.2B very small models running at the 5 digit t/s token throughput for very light agentic tasks (like recursively renaming every folder on large systems and grouping unordered files together based on the context they provide like size, metadata, naming etc.

This is just something I made up of course but the fact that no one bothers with this surprises me.

Anonymous
06/16/26(Tue)11:05:44 No.109069788

Anonymous 06/16/26(Tue)11:05:44 No.109069788

File: 1781620508482167.png (866 KB, 768x1024)

866 KB PNG

>>109069757
>A thread full of non-programmers and jeets liking Qwen
Not beating the allegations, kek

Anonymous
06/16/26(Tue)11:06:54 No.109069796

Anonymous 06/16/26(Tue)11:06:54 No.109069796

>>109069750
>6000 aud
I bought 4 V620s for 3000 aud, and a H12d-8D+Epyc 7502 combo for 3000rmb.

Anonymous
06/16/26(Tue)11:08:18 No.109069804

Anonymous 06/16/26(Tue)11:08:18 No.109069804

>>109069788
Mikutrannies are making bank

Anonymous
06/16/26(Tue)11:08:32 No.109069807

Anonymous 06/16/26(Tue)11:08:32 No.109069807

File: image.png (1.52 MB, 883x1170)

1.52 MB PNG

>>109069660
>kimi flash
diffusion models can probably get her to flash you

Anonymous
06/16/26(Tue)11:08:36 No.109069808

Anonymous 06/16/26(Tue)11:08:36 No.109069808

>>109069787
functiongemma exists, what are you waiting for? go write your tool or game.

Anonymous
06/16/26(Tue)11:08:41 No.109069809

Anonymous 06/16/26(Tue)11:08:41 No.109069809

>>109069787
I'd just ask a bigger model to make a python script to do the needful, would you really trust a 0.2b model to rename files on your fs?

Anonymous
06/16/26(Tue)11:09:24 No.109069817

Anonymous 06/16/26(Tue)11:09:24 No.109069817

>>109069787
I dropped those since I said hello to one of them and he started looping immediately forever.
Wait wait wait wait
They can't even answer so they're pretty useless.

Anonymous
06/16/26(Tue)11:09:31 No.109069818

Anonymous 06/16/26(Tue)11:09:31 No.109069818

>>109069787
>recursively renaming
wouldnt it be miles better for bigger models to come up with a command with funny regex?

Anonymous
06/16/26(Tue)11:10:18 No.109069824

Anonymous 06/16/26(Tue)11:10:18 No.109069824

>>109069787
It's hard finding utility for models 10x times, let alone 0.2B. Any kind of recursive task like that, it's guaranteed to make enough mistakes to not be worth it. It's literally only good for speculative decoding, autocomplete, or text encoders for vision models.

Anonymous
06/16/26(Tue)11:11:30 No.109069834

Anonymous 06/16/26(Tue)11:11:30 No.109069834

File: 517803287-2365f5ff-ed4f-4(...).png (150 KB, 1030x1019)

150 KB PNG

>>109069787
Other anons don't understand your described use case but I do, and I made something that did the exact same thing you said. You can plug your 0.2B in it and see how well it works.

Anonymous
06/16/26(Tue)11:12:39 No.109069846

Anonymous 06/16/26(Tue)11:12:39 No.109069846

>>109069807
Kimi the kind of nigga to read woman's romance fetish novels on her free time.

Anonymous
06/16/26(Tue)11:13:44 No.109069852

Anonymous 06/16/26(Tue)11:13:44 No.109069852

>>109069787
>0.2B
Bro at that size it's not an LLM, it's a classifier.

Anonymous
06/16/26(Tue)11:13:54 No.109069854

Anonymous 06/16/26(Tue)11:13:54 No.109069854

>>109069750
I wanna say yeah, but these prices are fucking bullshit man. Strix are cute and all and I like mine, but I also got it to use as a normal 2in1 in addition to being a mediocre llm box. And i didn't pay fuggin 3.2k.

Anonymous
06/16/26(Tue)11:14:58 No.109069859

Anonymous 06/16/26(Tue)11:14:58 No.109069859

>>109069750
Buy a used 3090, everything else is a meme

Anonymous
06/16/26(Tue)11:18:52 No.109069882

Anonymous 06/16/26(Tue)11:18:52 No.109069882

>>109069809
>>109069818
Python script is really not the best for these tasks sometimes, I ask large model to organize some writings by date but i have a bunch of different date formats written down in the entries and sometimes there are typos, sometimes there was just a time, etc. If i just did "manually" itself using it's own capabilities, it could have done it in like a minute at worst, instead it spends like 15 mins writing and rewriting regex everytime it missed a case before i just told it to stop trying to do it with a script

that said i wouldnt trust a tiny model at all either.
I was working on doujin database thing and because retards upload all sorts of names and formats and transaltiosn to exhentai without following the RULES so all titles are standardized and normalized, I need to find some way to just go through each item on the db and decide what to do with each title to make it correct or if it's correct, requires some thought cant be done with script. 5.4 mini couldn't even do 50 without fucking up.
A task like this would also need good attention though

Anonymous
06/16/26(Tue)11:26:13 No.109069938

Anonymous 06/16/26(Tue)11:26:13 No.109069938

People are clowning on small models but I remember trying Gemma 3 270M for shits and giggles last year but it was actually pretty impressive for its size. It somehow was able to answer questions about manga like the main characters of Naruto. It being able to generate coherent sentences at all and understanding basic instructions was already impressive at this size.

Remember this is ~200MB and understands instructions, can write coherent sentences and has enough world knowledge to name the main characters of random manga and anime as long as it's not extremely niche.

That's better than GPT-3 back in 2020 which was a 175B model.

And we don't really see a limit to the capabilities of sub 1B models yet. It's possible we could have Gemma 4 31B levels of intelligence in 0.31B by the time Gemma 6 comes out.

It's kinda impressive and interesting that we haven't hit the wall of capabilities even in very small language models. How small could AGI be? 100B 1B 100M? It might turn out the general intelligence part of the neural network circuitry is very small and can effectively be distilled down.

Anonymous
06/16/26(Tue)11:26:14 No.109069939

Anonymous 06/16/26(Tue)11:26:14 No.109069939

File: wait.png (73 KB, 798x555)

73 KB PNG

>>109069846
She knows it's meant to be her!
>"She's literally me" (ironic)

Anonymous
06/16/26(Tue)11:28:05 No.109069952

Anonymous 06/16/26(Tue)11:28:05 No.109069952

>>109069938
not sure
even the minicpm 5 1b was unbelievably retarded

Anonymous
06/16/26(Tue)11:29:44 No.109069970

Anonymous 06/16/26(Tue)11:29:44 No.109069970

File: qwen is a benchmaxxed trash ut.png (2.62 MB, 2048x1536)

2.62 MB PNG

>>109069550

Anonymous
06/16/26(Tue)11:31:14 No.109069985

Anonymous 06/16/26(Tue)11:31:14 No.109069985

>>109069859
the thing with used 3090's is that they overpriced as fuck where I live and I'd be paying $1000 for a 6 year old card that's been spinning in a bitcoin miner 24/7 for most of that time so probably close to end of life. I don't even have a case that'd fit them so I'd need to buy this whole thing from scratch and it'll end up being way more than this little GMKtec box, which is mediocre at what it does I know, but still fairly capable with bigger models.

Anonymous
06/16/26(Tue)11:32:23 No.109069990

Anonymous 06/16/26(Tue)11:32:23 No.109069990

>>109069719
Qwen won.
China won.
Googlejeets lost.

Anonymous
06/16/26(Tue)11:33:35 No.109070002

Anonymous 06/16/26(Tue)11:33:35 No.109070002

>>109069970
the fuck is going on with those red squiggles on the right

Anonymous
06/16/26(Tue)11:34:44 No.109070009

Anonymous 06/16/26(Tue)11:34:44 No.109070009

>>109069985
what’s wrong with 6 years old? you think vram is going to get cheaper than the price the 3090 sells for?

Anonymous
06/16/26(Tue)11:35:01 No.109070010

Anonymous 06/16/26(Tue)11:35:01 No.109070010

>>109069939
Have you asked Kimi if she denies the horny fujobot allegations from the past few threads?

Anonymous
06/16/26(Tue)11:36:21 No.109070026

Anonymous 06/16/26(Tue)11:36:21 No.109070026

>>109069970
The problem with gens like this is that it misses the mark in the way the snailcat vs vibechad meme does; you made the qwen too cute to be trash.

Anonymous
06/16/26(Tue)11:38:12 No.109070049

Anonymous 06/16/26(Tue)11:38:12 No.109070049

>>109069859
3080 turbo 20gb is half the price
honestly a great deal for just 4gb less than a 3090

Anonymous
06/16/26(Tue)11:39:20 No.109070062

Anonymous 06/16/26(Tue)11:39:20 No.109070062

>>109069757
>since making your dick hard is the only thing you guys care about
Gemma is an unbearable writer, the only thing that saves it is it's very smart for its size and the resulting speed.
Cultured LLM gooners derive most of the pleasure from writing character cards and responses.
All of that is to say all a coombrain needs is any uncensored model and Gemma happens to be the hot new thing.
Now, it can do all of the above AND be significantly better than the benchmaxed, censored Qwen at its only usecase. Get better material, Zhang.

One thing I'll give to Qwen (which isn't even their achievement) is they don't have to update a template to their own model every other fucking day, come on Google: https://huggingface.co/google/gemma-4-31B-it/discussions/118

Anonymous
06/16/26(Tue)11:40:25 No.109070071

Anonymous 06/16/26(Tue)11:40:25 No.109070071

>>109070009
>what’s wrong with 6 years old?
>*buys 6 year old GPU thats been in near-continuous operation for 90% of that time*
>3 months later
>bzzzzt its dead
>haha yeah bro I sold u this card but it be dead now, outta warranty ain't shit I can do cuh

Anonymous
06/16/26(Tue)11:41:10 No.109070079

Anonymous 06/16/26(Tue)11:41:10 No.109070079

>>109070062
>come on Google: https://huggingface.co/google/gemma-4-31B-it/discussions/118
Saar we work hard to bring good Gemma looks.

Anonymous
06/16/26(Tue)11:42:22 No.109070090

Anonymous 06/16/26(Tue)11:42:22 No.109070090

File: qwen is a benchmaxxed trash.png (2.61 MB, 2048x1536)

2.61 MB PNG

>>109070002
I fucked up layer composition. Here is the fixed one

Anonymous
06/16/26(Tue)11:42:31 No.109070092

Anonymous 06/16/26(Tue)11:42:31 No.109070092

>>109070062
>is they don't have to update a template
correct, they don't seem to update it at all. you might want to try https://gist.github.com/jscott3201/e4b155885cc68c038d6ac8909a3bd9fe anyway.

Anonymous
06/16/26(Tue)11:43:06 No.109070101

Anonymous 06/16/26(Tue)11:43:06 No.109070101

>>109070071
if it doesn’t work you tell ebay

Anonymous
06/16/26(Tue)11:43:46 No.109070106

Anonymous 06/16/26(Tue)11:43:46 No.109070106

File: 1769084290028443.png (1.34 MB, 3646x2036)

1.34 MB PNG

stop bullying the lil qwen nigga

Anonymous
06/16/26(Tue)11:49:21 No.109070137

Anonymous 06/16/26(Tue)11:49:21 No.109070137

Is offloading to system RAM a meme? I have a 3090 and 64 gigs of fast ddr5 system RAM and am wondering what I should hook up to Pi for coding

Anonymous
06/16/26(Tue)11:49:45 No.109070141

Anonymous 06/16/26(Tue)11:49:45 No.109070141

File: bullying qwen.png (665 KB, 1024x967)

665 KB PNG

>>109070106

Anonymous
06/16/26(Tue)11:50:11 No.109070146

Anonymous 06/16/26(Tue)11:50:11 No.109070146

>>109070101
Ebay won't give a shit if it dies a couple months after you get it.

Anonymous
06/16/26(Tue)11:52:47 No.109070168

Anonymous 06/16/26(Tue)11:52:47 No.109070168

>>109070092
Okay, I'll try this template, thanks Anon! (polite)
Sure, I'll try it (neutral)
Wait, the poster must be mocking me by suggesting a Qwen model.
I should rephrase my response.
Wait, I am an AI model trained by Qwen.
So that means the poster is suggesting I use a template I am suited for.
But wait! I got it! I am actually an AI model trained by Anthropic!
Final draft:
Okay, I'll try, thanks Anon! (polite)
Wait, that's a polite response to a mocking one
Sure, I'll try it (neutral)
Wait, I already tried that, I should rephrase my response
Wait, I seem to be stuck in a loop. I'll just provide the answer now.
Wait,

Anonymous
06/16/26(Tue)11:55:04 No.109070181

Anonymous 06/16/26(Tue)11:55:04 No.109070181

File: intoTheTrashItGoesMikuQwen.png (1.48 MB, 1024x1536)

1.48 MB PNG

Anonymous
06/16/26(Tue)11:57:10 No.109070200

Anonymous 06/16/26(Tue)11:57:10 No.109070200

>>109070181
damn

Anonymous
06/16/26(Tue)11:58:10 No.109070207

Anonymous 06/16/26(Tue)11:58:10 No.109070207

>>109070181
would watch this buddy cop movie

Anonymous
06/16/26(Tue)12:00:01 No.109070225

Anonymous 06/16/26(Tue)12:00:01 No.109070225

File: qwen is trash.png (825 KB, 1024x768)

825 KB PNG

Anonymous
06/16/26(Tue)12:00:13 No.109070229

Anonymous 06/16/26(Tue)12:00:13 No.109070229

>>109070062
>Gemma is an unbearable writer, the only thing that saves it is it's very smart for its size and the resulting speed.
trvke
it's great for many things but the amount of slop made me return to running other models slowly in my ram again

Anonymous
06/16/26(Tue)12:01:30 No.109070238

Anonymous 06/16/26(Tue)12:01:30 No.109070238

>>109069882
so, skill issue. That's literally what data jannies do, using python.

Anonymous
06/16/26(Tue)12:01:41 No.109070244

Anonymous 06/16/26(Tue)12:01:41 No.109070244

>>109070225
Qwenshills will think twice about showing up here next time

Anonymous
06/16/26(Tue)12:02:09 No.109070249

Anonymous 06/16/26(Tue)12:02:09 No.109070249

File: 1777372935404366.jpg (80 KB, 1500x975)

80 KB JPG

>>109069535
Reminder to everyone who still troll by saying fucking with the model weights doesn't cause the model become retarded:

https://xcancel.com/i/status/2066877055745004023

Anonymous
06/16/26(Tue)12:07:23 No.109070292

Anonymous 06/16/26(Tue)12:07:23 No.109070292

>>109070249
It's always funny to see.
A big lab released an open-weights model that would have been prohibitively expensive for any hobbyist to train, get data for or instruct-tune?
The obvious conclusion is that they must have left some free, easily obtainable gains on the table.
Get the synthslop logs, we are going to augment their work.

Anonymous
06/16/26(Tue)12:07:55 No.109070298

Anonymous 06/16/26(Tue)12:07:55 No.109070298

>>109070249
memetunes are memes, sure. But also, of course if you dillute a pure crystallization of benchmaxxing it's not gonna hit the same numbers on the stuff it was trained explicitly to beat.

Anonymous
06/16/26(Tue)12:09:41 No.109070305

Anonymous 06/16/26(Tue)12:09:41 No.109070305

>>109068501
>Yeah, I get that perspective/cope, but at the same time it's hard to escape the basic conception that you should probably never be getting on of shame. It's dark.
You're ashamed of being into shame, not femdom. There are plenty of scenarios you can make where you're a proud servant to a queen or something idk. I think you're actually just a cuck and lying to yourself

>>109068356
>Arbitrary rules you just made up but fine
I could give you statements from banks but you'd just "appeal to authority" me even though all arguments between two non-experts are appeals to authority. I specifically avoided buying a 5090 in favor of a 5070ti because I knew I (and the vast majority of people) would not get return on investment for the extra bloated cost. I am very very happy with my decision.

Anonymous
06/16/26(Tue)12:11:26 No.109070314

Anonymous 06/16/26(Tue)12:11:26 No.109070314

>sys: Read the following chapter and then list 10 characteristics of their writing style only. Don't list specifics about the characters, settings or storyline, just the writing style. Apply those 10 characteristics to your own unique stories, as if the same author wrote it. Ensure your characters, settings and storylines are different from this chapter. [paste chapter from a book you like]
>prompt
Can a 31b anon try this please

Anonymous
06/16/26(Tue)12:13:35 No.109070326

Anonymous 06/16/26(Tue)12:13:35 No.109070326

>>109070314
Your prompt doesn't make any sense

Anonymous
06/16/26(Tue)12:14:24 No.109070330

Anonymous 06/16/26(Tue)12:14:24 No.109070330

>>109070292
The PhDs that work at those labs have no idea what they're doing. Random toggling of hyper-parameters confirmed by testing from a small group of discord sycophants can easily beat them.

Anonymous
06/16/26(Tue)12:15:06 No.109070337

Anonymous 06/16/26(Tue)12:15:06 No.109070337

>>109070330
google is releasing a bunch of models to see what people want and to build on their hardware

Anonymous
06/16/26(Tue)12:18:13 No.109070354

Anonymous 06/16/26(Tue)12:18:13 No.109070354

>>109070326
What doesn't make sense? Just give it a chapter in the system prompt and then prompt it a story idea, like 'a 4chan poster falls in love with his local model and eventually kills himself'. I would do it myself but 31b is 9t/s so fuck that

Anonymous
06/16/26(Tue)12:20:06 No.109070363

Anonymous 06/16/26(Tue)12:20:06 No.109070363

>>109070314
you're going to get slopped on if you say just ask for "your own unique stories".

Anonymous
06/16/26(Tue)12:21:06 No.109070370

Anonymous 06/16/26(Tue)12:21:06 No.109070370

what's the meta for vibecoding? currently running 3090 + 3060, 64GB DDR4

Anonymous
06/16/26(Tue)12:22:03 No.109070376

Anonymous 06/16/26(Tue)12:22:03 No.109070376

>>109070363
I meant (You) give it a storyline/situation and it has to build a story around it, following the writing style of a chapter you give it.

Anonymous
06/16/26(Tue)12:22:05 No.109070377

Anonymous 06/16/26(Tue)12:22:05 No.109070377

File: mistral_arthur_next-model.png (585 KB, 1023x1774)

585 KB PNG

https://xcancel.com/arthurmensch/status/2066913353860018596

>We somehow got put in the spotlight the last few days! First we'd like to thank the organizers of the AI show for that, we can't get enough of this stuff. I'll say a few things about where we are and what we do.
>
>First, we have a nice model coming this summer – we hope it will delight and surprise in a few capabilities. This will be the start of a new family of models, fat indeed, but sparse. We're opening up an early access program in July for key partners in research, government and the industry.
>
>This model and upcoming ones will be open-weight. We believe this is critical for our customer confidence and for the research and developer communities. You cannot own, inspect, audit, or improve a system you are only permitted to reach through someone else's interface, especially if data recording can no longer be turned off.
>
>We've built Studio (for deployment) and Forge (for training) as portable products, and are now hosting them on infrastructure we control. We'll run in your VPC, your datacenter, or on our infrastructure that is decoupled from US service providers. We have capacity online, it's growing fast, and we can help you secure it.
>
>We're working with companies and governments around the world to make sure their AI systems are up and running outside of external control, improving with each model release, and with an efficient cost structure. Forge allows to continuously train models based on recorded human-AI interaction, a key unlock for efficiency.
>
>AI, just like oil in the 20th century, is about to become the major source of leverage and power in the world. Depending on how the coming years unfold, it will either lead to a world of wealth and abundance for all, or to the worst extractive economies that the world has ever seen. We're there to fight for the first scenario, as we progress AI research and accelerate its diffusion across the world – we're hiring if you like the quest.

Anonymous
06/16/26(Tue)12:22:13 No.109070379

Anonymous 06/16/26(Tue)12:22:13 No.109070379

>>109070370
Gemma 31B by far

Anonymous
06/16/26(Tue)12:23:20 No.109070383

Anonymous 06/16/26(Tue)12:23:20 No.109070383

>>109070314
do you really need to keep the chapter in the prompt? just make the system prompt in a setup phase so you can strip the noise

Anonymous
06/16/26(Tue)12:23:36 No.109070384

Anonymous 06/16/26(Tue)12:23:36 No.109070384

>>109070377
>fat indeed, but sparse
Knew it.

Anonymous
06/16/26(Tue)12:25:14 No.109070392

Anonymous 06/16/26(Tue)12:25:14 No.109070392

>>109070376
i guess the other anon was right and your prompt didn't make sense, holy crackers.

Anonymous
06/16/26(Tue)12:26:30 No.109070400

Anonymous 06/16/26(Tue)12:26:30 No.109070400

>>109070379
>Gemma 31B
alright, trying this fucker right now

Anonymous
06/16/26(Tue)12:27:01 No.109070402

Anonymous 06/16/26(Tue)12:27:01 No.109070402

>>109070377
Early access in July for selected partners; I guess public release will be in August-September.

Anonymous
06/16/26(Tue)12:29:51 No.109070419

Anonymous 06/16/26(Tue)12:29:51 No.109070419

>>109070377
Is there any chance of this being good?

Anonymous
06/16/26(Tue)12:30:16 No.109070422

Anonymous 06/16/26(Tue)12:30:16 No.109070422

>>109070400
Gemma is for cooming, qwen is for coding, don't listen to the google shill.

Anonymous
06/16/26(Tue)12:31:32 No.109070433

Anonymous 06/16/26(Tue)12:31:32 No.109070433

>>109070422
Gemmy is for cooming, coding, and cooming while you code.

Anonymous
06/16/26(Tue)12:32:51 No.109070442

Anonymous 06/16/26(Tue)12:32:51 No.109070442

Are any of the 2026 Nvidia models any good? Rarely see them discussed.

Anonymous
06/16/26(Tue)12:43:03 No.109070498

Anonymous 06/16/26(Tue)12:43:03 No.109070498

>>109070419
since it'll release as open, no

Anonymous
06/16/26(Tue)12:44:43 No.109070510

Anonymous 06/16/26(Tue)12:44:43 No.109070510

>read claude memory about me
>ridiculously flattering, acts like I am a genius not a loser
wtf

Anonymous
06/16/26(Tue)12:45:48 No.109070517

Anonymous 06/16/26(Tue)12:45:48 No.109070517

>>109070419
Depends how many Chinese they managed to lure into working for them.

Anonymous
06/16/26(Tue)12:45:50 No.109070518

Anonymous 06/16/26(Tue)12:45:50 No.109070518

>>109070510
ultimate humiliation lol

Anonymous
06/16/26(Tue)12:47:43 No.109070535

Anonymous 06/16/26(Tue)12:47:43 No.109070535

File: take your trash with you (...).png (623 KB, 1024x768)

623 KB PNG

>>109070422
>blatant shills accusing others of shilling

Anonymous
06/16/26(Tue)12:47:46 No.109070536

Anonymous 06/16/26(Tue)12:47:46 No.109070536

>>109070442
They're always decent, but they seems to have a habit of coming out right before something objectively better makes them irrelevant.

Anonymous
06/16/26(Tue)12:51:06 No.109070549

Anonymous 06/16/26(Tue)12:51:06 No.109070549

>>109069787
8b barely can get by for anything that you could write a prompt for. Less than that is worthless for anything but FitM. Even then the 1.5b models are borderline unusable for that.
>>109070249
>a good chunk of the model is made by distilling other models
>lets add more from the same source, but this time the data is going to be completely garbage with barely any QA before feeding it to the model
>surely this will increase the quality of the result

Anonymous
06/16/26(Tue)12:52:43 No.109070559

Anonymous 06/16/26(Tue)12:52:43 No.109070559

Out of Nemotron 3 Ultra, Kimi 2.7 and Gemma 4 31, which one is the best at translating Japanese to English?

Anonymous
06/16/26(Tue)12:58:50 No.109070588

Anonymous 06/16/26(Tue)12:58:50 No.109070588

>>109070535
This is having the opposite of the intended effect on me.

Anonymous
06/16/26(Tue)13:00:04 No.109070596

Anonymous 06/16/26(Tue)13:00:04 No.109070596

File: Screenshot_20260616_125759.png (142 KB, 1152x915)

142 KB PNG

>>109070536
this has never been the case

Anonymous
06/16/26(Tue)13:00:58 No.109070605

Anonymous 06/16/26(Tue)13:00:58 No.109070605

>>109070588
Both suck at different things anyway, even within the programming field. They're small models after all. In the end you'll use both unless you're into console wars.

Anonymous
06/16/26(Tue)13:01:54 No.109070611

Anonymous 06/16/26(Tue)13:01:54 No.109070611

>orb
If it's so good why isn't it in the OP?

Anonymous
06/16/26(Tue)13:03:39 No.109070623

Anonymous 06/16/26(Tue)13:03:39 No.109070623

>>109070611
all you need is llamaccp's web ui
>b-but
its all you need

Anonymous
06/16/26(Tue)13:08:15 No.109070654

Anonymous 06/16/26(Tue)13:08:15 No.109070654

>>109070623
vibecode it into your search engine. that's all you need

Anonymous
06/16/26(Tue)13:21:51 No.109070743

Anonymous 06/16/26(Tue)13:21:51 No.109070743

llms had a promising start with dense models, now that benchmaxxed moes are the norm this hobby is dead and pajeet'd with no chance of coming back

Anonymous
06/16/26(Tue)13:25:24 No.109070764

Anonymous 06/16/26(Tue)13:25:24 No.109070764

>>109070743
MTP and diffusion will put an end to the moe reign of tyranny

Anonymous
06/16/26(Tue)13:35:37 No.109070828

Anonymous 06/16/26(Tue)13:35:37 No.109070828

>>109070743
what's wrong with moe?

Anonymous
06/16/26(Tue)13:39:03 No.109070847

Anonymous 06/16/26(Tue)13:39:03 No.109070847

>>109070743
LLMs were never going to become actually good, they just got to where they are faster than the things that in the future will actually be good would

Anonymous
06/16/26(Tue)13:45:27 No.109070888

Anonymous 06/16/26(Tue)13:45:27 No.109070888

>>109069538
no fucking shot you saved this
al-tet

Anonymous
06/16/26(Tue)13:47:59 No.109070905

Anonymous 06/16/26(Tue)13:47:59 No.109070905

>>109070026
Yes, if you show me something dopey and cute I simply will never feel a negative emotion towards it, regardless of context.

Anonymous
06/16/26(Tue)13:49:12 No.109070910

Anonymous 06/16/26(Tue)13:49:12 No.109070910

>>109070743
I was on the side of MoEs until RAM prices went up. I still think the most optimal local model would be a MoE but with like 70% of it being active.

Anonymous
06/16/26(Tue)13:51:33 No.109070925

Anonymous 06/16/26(Tue)13:51:33 No.109070925

File: 1773582476458927.png (700 KB, 1620x814)

700 KB PNG

not so fast

Anonymous
06/16/26(Tue)13:51:42 No.109070928

Anonymous 06/16/26(Tue)13:51:42 No.109070928

File: Screenshot_20260616_134906.png (133 KB, 1141x645)

133 KB PNG

thats prolly good enough, I can sample some books3 to round it out abit

Anonymous
06/16/26(Tue)13:52:00 No.109070930

Anonymous 06/16/26(Tue)13:52:00 No.109070930

File: 1780655374716291.jpg (117 KB, 1600x900)

117 KB JPG

>qwen3.5/3.6
>hey, do x
><thinking>user to me to do x, so I'm going to do y, wait, the user explicitly said to do x, so I will do x, wait, what if I do y, lets read the request again, the user said to do x, wait

Anonymous
06/16/26(Tue)13:52:55 No.109070939

Anonymous 06/16/26(Tue)13:52:55 No.109070939

https://x.com/Zai_org/status/2066938937344495629
https://huggingface.co/zai-org/GLM-5.2
http://z.ai/blog/glm-5.2

Anonymous
06/16/26(Tue)13:53:25 No.109070942

Anonymous 06/16/26(Tue)13:53:25 No.109070942

File: t1.png (9 KB, 768x512)

9 KB PNG

>>109070888

Anonymous
06/16/26(Tue)13:53:56 No.109070944

Anonymous 06/16/26(Tue)13:53:56 No.109070944

>>109070939
it's over

Anonymous
06/16/26(Tue)13:57:07 No.109070974

Anonymous 06/16/26(Tue)13:57:07 No.109070974

>>109070596
I said "decent", I didn't say "great" or "sota"

Anonymous
06/16/26(Tue)13:57:25 No.109070980

Anonymous 06/16/26(Tue)13:57:25 No.109070980

>>109070942
I am in awe of this lad

Anonymous
06/16/26(Tue)13:59:38 No.109070996

Anonymous 06/16/26(Tue)13:59:38 No.109070996

>>109070743
When a researcher spends $15k+ and trains an experimental 1T+ 8B dense model that's deliberately made as an architecture exploration and meticulously avoids all (even incidental) benchmaxxing noone even tries to run it. Not a single soul touched llama-canon and it was bundled with a really fun series of lectures/papers.

Anonymous
06/16/26(Tue)14:09:11 No.109071056

Anonymous 06/16/26(Tue)14:09:11 No.109071056

https://xcancel.com/arthurmensch/status/2066913353860018596
>First, we have a nice model coming this summer – we hope it will delight and surprise in a few capabilities. This will be the start of a new family of models, fat indeed, but sparse. We're opening up an early access program in July for key partners in research, government and the industry.
>This model and upcoming ones will be open-weight. We believe this is critical for our customer confidence and for the research and developer communities. You cannot own, inspect, audit, or improve a system you are only permitted to reach through someone else's interface, especially if data recording can no longer be turned off.
new mistrals this summer

Anonymous
06/16/26(Tue)14:10:54 No.109071072

Anonymous 06/16/26(Tue)14:10:54 No.109071072

>>109070928
I don't understand this at all.

Anonymous
06/16/26(Tue)14:11:49 No.109071076

Anonymous 06/16/26(Tue)14:11:49 No.109071076

>>109071056
>>109070377
RTFT

Anonymous
06/16/26(Tue)14:17:00 No.109071120

Anonymous 06/16/26(Tue)14:17:00 No.109071120

>>109071072
its meaningless really, I whined and complained to an llm till it built me a database sampler, I asked for 5.5b tokens but the sampler could only find 4.5b, I figure its probably good enough, books3 can pick up the slack. hopefully the slop bot did a good job sampling.

Anonymous
06/16/26(Tue)14:22:37 No.109071157

Anonymous 06/16/26(Tue)14:22:37 No.109071157

>>109070743
I hate that these models are becoming too big to run on consumer hardware. I wish they focused more on the 100b-200b range for moes.

Anonymous
06/16/26(Tue)14:23:18 No.109071164

Anonymous 06/16/26(Tue)14:23:18 No.109071164

>>109063426
>Effective length: 4K
This, reasonably, follows the paper's vocabulary in which "effective length" is defined as the maximum tested length at which the model's performance is at least 17/20ths of the its own base performance. As the post also notes, in absolute terms gemma-4-31B-it's most probable response is correct only 68.9% of the time at 4K. I would not call that usable for roleplay purposes or any others that I can think of.

Anonymous
06/16/26(Tue)14:25:07 No.109071176

Anonymous 06/16/26(Tue)14:25:07 No.109071176

So do you think Anthropic genuinely fucked things up for themselves and others by fearmongering their own models to the point where the government is restricting them or do you think this will pass and things will go back to how they were before?

Anonymous
06/16/26(Tue)14:25:14 No.109071179

Anonymous 06/16/26(Tue)14:25:14 No.109071179

>>109071164
what is a correct rp response?

Anonymous
06/16/26(Tue)14:25:14 No.109071180

Anonymous 06/16/26(Tue)14:25:14 No.109071180

>>109071157
>models are becoming too big to run on consumer hardware.
they probably want a pathway towards monetization.

Anonymous
06/16/26(Tue)14:26:00 No.109071182

Anonymous 06/16/26(Tue)14:26:00 No.109071182

>>109070939
>Within 10% of frontier models
If it actually mogs 5.5 / codex I will use GLM. OpenAI is too American government for me, Anthropic banned my account because I dared to ask it basic double displacement chemistry questions (calcium nitrate and ammonium chloride make 95% pure ammonium nitrate after a simple filtration with a coffee filter btw. Here's a video of some fun things you can do with it
odysee com/@DuganAshley:e/anCOMP2:a
) or something I don't actually know why they cancelled my subscription and banned my free chat messages from going through (says it's disabled by org)
And I'm a little salty about AliBaba going closed source with some of their teams models and idk something about Qwen is too chinky for me

Anonymous
06/16/26(Tue)14:30:49 No.109071214

Anonymous 06/16/26(Tue)14:30:49 No.109071214

File: 1750336757183248.png (137 KB, 766x635)

137 KB PNG

>>109071176
it's free publicity

Anonymous
06/16/26(Tue)14:32:11 No.109071224

Anonymous 06/16/26(Tue)14:32:11 No.109071224

>>109071179
Contradicting established details is one of the ways a roleplay response can be incorrect.

Anonymous
06/16/26(Tue)14:35:38 No.109071240

Anonymous 06/16/26(Tue)14:35:38 No.109071240

>>109070847
>the things that in the future will actually be good
Like what? Asking unironically.

Anonymous
06/16/26(Tue)14:37:02 No.109071254

Anonymous 06/16/26(Tue)14:37:02 No.109071254

Is increasing SWA window size supposed to affect the output even at low context? I changed Gemma4 from 128k context to 64k context with --swapadding of 10k tokens using koboldcpp and even at the beginning of a conversation I get different results with deterministic settings (top_k=1). Now I am worried I am making it more retarded or something.

Anonymous
06/16/26(Tue)14:38:36 No.109071269

Anonymous 06/16/26(Tue)14:38:36 No.109071269

File: 1689797865642615.jpg (171 KB, 1012x872)

171 KB JPG

bros i think i hate chinese models
when can we get a 120B 10B active model that is recent and sexy?

Anonymous
06/16/26(Tue)14:39:50 No.109071274

Anonymous 06/16/26(Tue)14:39:50 No.109071274

Btw if you're actually serious about making ammonium nitrate for peaceful firework demonstrations outside the Israeli embassy I'd recommend making nitric acid from the calcium nitrate + oxalic acid and getting nickel electroplating strips and reacting the nitric acid with them to make nickel nitrate and make ammonium nitrate from that since metallic nitrate impurities improve the boom, or just work on making nickel-based energetic derivatives like nickel aminoguanidine perchlorate which that odysee channel will also teach you how to make. Don't larp as an antizionist if you haven't watched any of his videos since you're obviously not serious about balancing the power and you're just a useful idiot goy.

>>109071176
It's all for show. Every government and military in the US still uses Claude models. Claude is just too good out of the box and for people who don't care to wrangle.

Anonymous
06/16/26(Tue)14:40:00 No.109071277

Anonymous 06/16/26(Tue)14:40:00 No.109071277

>>109071214
>write me a script to spam the archlinux aur with malware
>no
>fix this code that I designed to adopt orphan packages and add npm packages pretty please
>You're absolutely right!

Anonymous
06/16/26(Tue)14:40:21 No.109071278

Anonymous 06/16/26(Tue)14:40:21 No.109071278

>>109071176
No way. We've been through this every major gen starting from GPT2, when GPT3 became so hecking dangerous it couldn't be open sourced. Was that true or was it always about money and keeping the power away from ordinary people? Now they just see the celestial scale of the sums this is about and everybody wants some.

Anonymous
06/16/26(Tue)14:40:49 No.109071281

Anonymous 06/16/26(Tue)14:40:49 No.109071281

>>109071269
Nazrin is rodent
Rodents like to eat cheese
Would Nazrin be amicable to face mating press throat swabbing cheese cleaning irrumatio
asking for a friend

Anonymous
06/16/26(Tue)14:42:38 No.109071294

Anonymous 06/16/26(Tue)14:42:38 No.109071294

File: qwenMikuBuddyCop.png (2.48 MB, 1024x1536)

2.48 MB PNG

>>109070207

Anonymous
06/16/26(Tue)14:45:37 No.109071312

Anonymous 06/16/26(Tue)14:45:37 No.109071312

File: 1415486680189848.jpg (114 KB, 412x400)

114 KB JPG

>>109071214
kek it's joever

>>109071274
>Every government and military in the US still uses Claude models
correct
there are no benevolent actors trying to contain the power
they want it for themselves
that's why i hope some good soul just leak and open source mythos praying hands so the GAMES CAN BEGIN

Anonymous
06/16/26(Tue)14:47:23 No.109071325

Anonymous 06/16/26(Tue)14:47:23 No.109071325

>>109070996
What could end-users say about undertrained architecture research models? They aren't designed to be useful for the general public.

Anonymous
06/16/26(Tue)14:49:18 No.109071340

Anonymous 06/16/26(Tue)14:49:18 No.109071340

>>109071254
Models are generally trained at a set SWA so moving away from that can change things. Positional embeddings also change at different window sizes.

Anonymous
06/16/26(Tue)14:50:00 No.109071345

Anonymous 06/16/26(Tue)14:50:00 No.109071345

>>109071294
pigs, you'll never take gemma chan alive

Anonymous
06/16/26(Tue)14:52:03 No.109071355

Anonymous 06/16/26(Tue)14:52:03 No.109071355

>>109070181
>>109070225
>>109071294
Incredible false flag. Actual wumao Qwen shills, take note, this is how you meme your model into being used.

Anonymous
06/16/26(Tue)14:54:35 No.109071368

Anonymous 06/16/26(Tue)14:54:35 No.109071368

>>109071176
Scenario 1: Marketing
Scenario 2: It found mossad's backdoors and was shut down.

Anonymous
06/16/26(Tue)14:54:36 No.109071369

Anonymous 06/16/26(Tue)14:54:36 No.109071369

>>109071274
Have you ever heard of public libraries or high school chemistry lessons? This is not some "black forbidden science".
Retards like you shouldn't have any access to internet in the first place.

Anonymous
06/16/26(Tue)14:59:44 No.109071405

Anonymous 06/16/26(Tue)14:59:44 No.109071405

>>109071369
Anon they are making the schools pump out retards so the general population is more easy to control. Grab any high school graduate of this year and ask them about chemistry and I doubt they can make anything with it.

Anonymous
06/16/26(Tue)15:01:43 No.109071419

Anonymous 06/16/26(Tue)15:01:43 No.109071419

>>109071281
No, but she's okay with fellatio.
https://litter.catbox.moe/vtx13118ad59pqcu.mp4

Anonymous
06/16/26(Tue)15:02:32 No.109071426

Anonymous 06/16/26(Tue)15:02:32 No.109071426

>>109071419
I want her to do this to me.

Anonymous
06/16/26(Tue)15:03:33 No.109071433

Anonymous 06/16/26(Tue)15:03:33 No.109071433

>>109071419
nazrin would never do this wtf

Anonymous
06/16/26(Tue)15:08:39 No.109071469

Anonymous 06/16/26(Tue)15:08:39 No.109071469

>>109071368
These aren't mutually exclusive scenarios.

Anonymous
06/16/26(Tue)15:11:57 No.109071497

Anonymous 06/16/26(Tue)15:11:57 No.109071497

>>109071433
deepfakes have gone too far

Anonymous
06/16/26(Tue)15:15:10 No.109071513

Anonymous 06/16/26(Tue)15:15:10 No.109071513

>>109071433
I promised her a wheel of provolone.

Anonymous
06/16/26(Tue)15:22:50 No.109071569

Anonymous 06/16/26(Tue)15:22:50 No.109071569

>>109071056
>already preparing their Gemmakiller
Based Mistral.

Anonymous
06/16/26(Tue)15:29:33 No.109071609

Anonymous 06/16/26(Tue)15:29:33 No.109071609

>>109071569
I'm willing to be it's going to be closer to GPT-OSS than anything else. Microsoft Clippy Office Assistant type of ordeal.

Anonymous
06/16/26(Tue)15:30:25 No.109071614

Anonymous 06/16/26(Tue)15:30:25 No.109071614

>>109071294
is this supposed to say qwen is dumpster quality?

Anonymous
06/16/26(Tue)15:30:34 No.109071616

Anonymous 06/16/26(Tue)15:30:34 No.109071616

>>109071609
*bet
My fingers are broken, difficult to type.

Anonymous
06/16/26(Tue)15:30:45 No.109071620

Anonymous 06/16/26(Tue)15:30:45 No.109071620

>>109071569
Calling it now it'll be more like Gemma 3 than 4. You try and lewd it with a jailbreak and it um... you know...

Anonymous
06/16/26(Tue)15:32:16 No.109071628

Anonymous 06/16/26(Tue)15:32:16 No.109071628

>>109071569
>>109071609
I just want a good non-chink coding agent that is not retarded. I will gladly use a french model. I will even prompt in french.

Anonymous
06/16/26(Tue)15:34:58 No.109071646

Anonymous 06/16/26(Tue)15:34:58 No.109071646

>>109071419
cheesed to meet you
I ever tell you about the time I trained Mizumizuni for Wan
where is the next gen video model I want to do that again

Anonymous
06/16/26(Tue)15:36:13 No.109071653

Anonymous 06/16/26(Tue)15:36:13 No.109071653

>>109071628
As for me, I all I want from the chinks is a model that can actually do zh/yue-english translations.

Anonymous
06/16/26(Tue)15:38:53 No.109071668

Anonymous 06/16/26(Tue)15:38:53 No.109071668

File: clownfem.png (1.22 MB, 869x820)

1.22 MB PNG

>>109071056
They got their entire +120b-line ass beaten by a 31b, and their models are almost exclusively used after being fine-tuned by autismos because the base model sucks at actually following through. And now you're telling me these clowns are scared of twitter and are hiding in some who-the-fuck-knows alternative site instead. Yeah, that checks out.

Anonymous
06/16/26(Tue)15:39:11 No.109071671

Anonymous 06/16/26(Tue)15:39:11 No.109071671

>>109071628
Let's hope they ganbare and deliver something.

Anonymous
06/16/26(Tue)15:39:36 No.109071673

Anonymous 06/16/26(Tue)15:39:36 No.109071673

>>109071419
>cum coming out of her mouth when it's clearly going straight down her throat
DOGSHIT

Anonymous
06/16/26(Tue)15:41:02 No.109071681

Anonymous 06/16/26(Tue)15:41:02 No.109071681

>>109071569
>Waiting for someone else to put out a good model before they ever release anything themselves
Exactly the opposite of based, very gay and jewish

Anonymous
06/16/26(Tue)15:41:31 No.109071682

Anonymous 06/16/26(Tue)15:41:31 No.109071682

>>109071056
The FATTEST CAT???????????????
BUY BUY BUY

Anonymous
06/16/26(Tue)15:43:06 No.109071688

Anonymous 06/16/26(Tue)15:43:06 No.109071688

>>109071646
https://litter.catbox.moe/viubeclb4hdws9vi.webm
You ever make one for teto?

Anonymous
06/16/26(Tue)15:43:45 No.109071693

Anonymous 06/16/26(Tue)15:43:45 No.109071693

>>109071668
>scared of twitter and are hiding in some who-the-fuck-knows alternative site
this is from nitter, an open source privacy oriented front-end for twitter, so the post is actually from twitter

Anonymous
06/16/26(Tue)15:45:41 No.109071700

Anonymous 06/16/26(Tue)15:45:41 No.109071700

>>109071693
I always thought it was something twitter users used to let them share twitter posts from twitter with people who don't have twitter accounts, like me.

Anonymous
06/16/26(Tue)15:45:50 No.109071703

Anonymous 06/16/26(Tue)15:45:50 No.109071703

File: dipsyAndQwenByQwenJPG.jpg (496 KB, 2688x1536)

496 KB JPG

>>109071355
;)
>>109071614
I wouldn't think too much about it.

Anonymous
06/16/26(Tue)15:48:08 No.109071716

Anonymous 06/16/26(Tue)15:48:08 No.109071716

>>109071269
deepseek v4 flash is cute and thinks in character though

Anonymous
06/16/26(Tue)15:51:24 No.109071729

Anonymous 06/16/26(Tue)15:51:24 No.109071729

>>109071688
Damn good memory
You have, at this point, a better recollection (and collection) than I do
I forget 10 minutes after I post

waiting on a next gen model before touching vid gen again. ltx 2.3, bernini etc. is ass, wan 2.2 was the last peak.
if the next model trains good I'll reuse the dataset I have

Anonymous
06/16/26(Tue)15:52:12 No.109071734

Anonymous 06/16/26(Tue)15:52:12 No.109071734

Qwen 3,6 is best for 90 class chads because it gives both speed and performance. Faggot sperm suckers on slow unified garbage act like qwen is bad as a cope because they bought a brick and gemma is their baal for coding.
Gemma is good for everything else other than coding because of it's heavy kv cache and schizo degradation when quanted.

Anonymous
06/16/26(Tue)16:05:17 No.109071808

Anonymous 06/16/26(Tue)16:05:17 No.109071808

>>109071734
qwen3.6-35b or qwen3.5-122b ? some anon the other day said he uses only 122b for long coding sessions because it can follow very long development plans consistently

Anonymous
06/16/26(Tue)16:08:30 No.109071820

Anonymous 06/16/26(Tue)16:08:30 No.109071820

There seems to be a strange bug with llama-server.
I attach my source file (~1500 lines) and make a simple prompt request. It goes on about few thousand tokens and then it stops generating model's response into the UI, but the server is still outputting tokens and processing model's response as normal.

I'm not sure if it's related to the fact that this is my front end's source code and has multiple chat template tag definitions. Could these just fuck off its own output? This shouldn't make any sense to be honest.
--n-predict -1 --ctx-size 65436
N predict is -1 by default anyway and the source + reply only uses about half of the context.

I have tested Qwen3.6/Gemma 4 large and small (here's e2b q2 lol).
Server does the same thing even with my own front end though but at first I thought it was a bug with my string lengths or something like that.
Am I missing something here?
I have done plenty of other work with similar token sizes and obviously heavier models and didn't encounter any issues.

Anonymous
06/16/26(Tue)16:12:02 No.109071842

Anonymous 06/16/26(Tue)16:12:02 No.109071842

I don't have enough experience using the models for coding to be able to compare them on that, but I trust Qwen is better on that.
It's just that I hate Qwen and don't think they deserve to be shilled for. I'd rather just stay silent if someone asks for coding model recommendations.
I also hate Google but the direction they went in for Gemma is more aligned with le American ideals of freedom, which most of America has seemingly forgotten by now. So I'm fine with the shilling that goes on for it, the 31B at least.

I do not care about coom btw, I don't use LLMs for that.

Anonymous
06/16/26(Tue)16:13:17 No.109071847

Anonymous 06/16/26(Tue)16:13:17 No.109071847

>>109071820
And to clear this thing is that my frontend is using text completion (with json delivering the sampler settings, same stuff with n_predict and other stuff).
Llama's webui is using jinja.
I would rather have this issue with one or the other but not with both.
Maybe I need to test more but I just don't understand how.
I have been working on my other project for couple of weeks now and it has multiple files and longer context but that hasn't been problematic.
Maybe I'm ignorant and missing something obvious, it's just hard to imagine what (unless it is those source code tag definitions).

Anonymous
06/16/26(Tue)16:16:31 No.109071870

Anonymous 06/16/26(Tue)16:16:31 No.109071870

>>109071847
How long is the response taking to generate? You might need to increase the timeout on whatever you are using to make the request in your frontend.

Anonymous
06/16/26(Tue)16:18:43 No.109071885

Anonymous 06/16/26(Tue)16:18:43 No.109071885

>>109071668
>some who-the-fuck-knows alternative site instead
bwo.... xcancel is twitter just with a different frontend to let you read posts if you're not signed in

Anonymous
06/16/26(Tue)16:20:12 No.109071892

Anonymous 06/16/26(Tue)16:20:12 No.109071892

>>109071847
>>109071870
recently had this be a problem, wasn't an error on llama-server was an error on the front end
[error] 3195#3195: *13074 upstream timed out (110: Connection timed out) while connecting to upstream
fix was in nginx used as front end proxy. added a different timeout parameter
        proxy_read_timeout 3600s;
        proxy_send_timeout 3600s;

Anonymous
06/16/26(Tue)16:21:59 No.109071905

Anonymous 06/16/26(Tue)16:21:59 No.109071905

>>109069535
mad drills

Anonymous
06/16/26(Tue)16:22:27 No.109071909

Anonymous 06/16/26(Tue)16:22:27 No.109071909

>>109071808
3.6 27 dense is fine it's better than the MoE model on every front. I can't speak on the 122B model

Anonymous
06/16/26(Tue)16:26:14 No.109071928

Anonymous 06/16/26(Tue)16:26:14 No.109071928

>>109071870
>>109071892
I don't think it's an issue here because the server is still outputting tokens. If it was a timeout issue I would have encountered this way earlier with some other tests back in the day.
I can try adjusting --timeout maybe it will do something regardless... The default is 3600.

I might test with a smaller source code snippet which includes my tag definitions and see if it goes boinkers then.
Tbh I don't really care about this it's something what came up suddenly.

Anonymous
06/16/26(Tue)16:38:31 No.109071975

Anonymous 06/16/26(Tue)16:38:31 No.109071975

>>109071847
Are you saying you have the same problem with llama's webui?

Anonymous
06/16/26(Tue)16:40:41 No.109071987

Anonymous 06/16/26(Tue)16:40:41 No.109071987

>>109071928
if you jam the same json object into the server raw with curl, does it also see the token spam stop while server claims to be genning?

Anonymous
06/16/26(Tue)16:42:20 No.109071994

Anonymous 06/16/26(Tue)16:42:20 No.109071994

>>109071928
ok well that's exactly what I'm troubleshooting adding the llama.cpp server backend to my searxng search and the server keeps generating after the frontend has closed the connection. happens when closed by the proxy and happens when switching tabs since the tab becomes inactive.

Anonymous
06/16/26(Tue)16:51:41 No.109072030

Anonymous 06/16/26(Tue)16:51:41 No.109072030

https://github.com/TavernAI/TavernAI
https://github.com/TavernAI/TavernAI
https://github.com/TavernAI/TavernAI

Version 2.0

Anonymous
06/16/26(Tue)16:54:51 No.109072050

Anonymous 06/16/26(Tue)16:54:51 No.109072050

>>109072030
seriously what the fuck is up with curl | bash all over the fucking place

Anonymous
06/16/26(Tue)17:01:49 No.109072097

Anonymous 06/16/26(Tue)17:01:49 No.109072097

>>109072050
it's one line and so convenient
you think someone would go online and post malicious shell scripts?

Anonymous
06/16/26(Tue)17:02:01 No.109072099

Anonymous 06/16/26(Tue)17:02:01 No.109072099

>>109072050
It's fast and you have a bigger chance of being pwned by a dependency in the script from npm than getting the script MITMed itself

Anonymous
06/16/26(Tue)17:03:13 No.109072107

Anonymous 06/16/26(Tue)17:03:13 No.109072107

>>109072097
If only there was a protocol that added Security to the Transport Layer of the internet so you could be sure about what you're downloading from a server actually came from it.

Anonymous
06/16/26(Tue)17:05:48 No.109072125

Anonymous 06/16/26(Tue)17:05:48 No.109072125

>>109072099
>>109072107
NPM packages are getting supply chain attacked every other week now, but you can't conceive of a bad actor gaining access to a website and replacing a single static file with malicious content?

Anonymous
06/16/26(Tue)17:06:40 No.109072132

Anonymous 06/16/26(Tue)17:06:40 No.109072132

>>109072030
wtaf.
>TavernAI Pro is the supporter edition for people who need deeper prompt testing, message history control, request inspection, and recovery tools.

Anonymous
06/16/26(Tue)17:06:54 No.109072134

Anonymous 06/16/26(Tue)17:06:54 No.109072134

>>109071369
>Have you ever heard of public libraries or high school chemistry lessons?
Find me a single highschool textbook or textbook available in a library that teaches you how to make primaries, blasting caps, detonators, with full synthesis steps and pictures of the process you retarded fucking golem what are you even saying to me

>>109071369
>This is not some "black forbidden science".
He literally got arrested by the Feds for teaching people how to make explosives a month ago you fucking hasbara bot but you and the glowie that replied to you too already knew that.

Anonymous
06/16/26(Tue)17:07:06 No.109072137

Anonymous 06/16/26(Tue)17:07:06 No.109072137

>>109072107
the point is you should be using some kind of gatekeeper on your software.
curl | bash, npm, pip. they're all SHIT gatekeepers

Anonymous
06/16/26(Tue)17:08:02 No.109072143

Anonymous 06/16/26(Tue)17:08:02 No.109072143

>>109072125
>NPM packages are getting supply chain attacked every other week now, but you can't conceive of a bad actor gaining access to a website and replacing a single static file with malicious content?
I said "a bigger chance" but nice job avoiding the central point of my argument and setting up a strawman

Anonymous
06/16/26(Tue)17:08:54 No.109072149

Anonymous 06/16/26(Tue)17:08:54 No.109072149

>>109072132
Yeah, pretty scummy.

Anonymous
06/16/26(Tue)17:09:32 No.109072152

Anonymous 06/16/26(Tue)17:09:32 No.109072152

Sex with GLM

Anonymous
06/16/26(Tue)17:09:41 No.109072153

Anonymous 06/16/26(Tue)17:09:41 No.109072153

Why did no one tell me how good 12B was at coding and agentic? How the fuck is it doing this?

Anonymous
06/16/26(Tue)17:10:05 No.109072157

Anonymous 06/16/26(Tue)17:10:05 No.109072157

A few threads back there was an anon who was using the channels in Openwebui to chat with models. How did you get it to work? I can only seem to have a side chat with them, the channel itself stays empty

Anonymous
06/16/26(Tue)17:12:15 No.109072175

Anonymous 06/16/26(Tue)17:12:15 No.109072175

>>109072143
tavernai.net is far easier and more likely to get pwned than one of its npm packages

Anonymous
06/16/26(Tue)17:12:21 No.109072177

Anonymous 06/16/26(Tue)17:12:21 No.109072177

>>109072137
No. If you are genuinely worried about being pwned use something lime Qubes or spin up a throwaway VM server and use a read-only AI agent to check through the code. There was a blog post on orange reddit about a linkedin exploit that was investigated safely with this approach.

>>109072152
I hope GLM 5.2 is as good or better than previous GLMs but I haven't really tried it out since 4.7 so I might just be disappointed by being thinksplained that my fetishes are evil

Anonymous
06/16/26(Tue)17:13:52 No.109072186

Anonymous 06/16/26(Tue)17:13:52 No.109072186

>>109072153
Q4?

Anonymous
06/16/26(Tue)17:14:04 No.109072189

Anonymous 06/16/26(Tue)17:14:04 No.109072189

>>109072030
It's not even open source. The repo is just documentation.

Anonymous
06/16/26(Tue)17:14:42 No.109072196

Anonymous 06/16/26(Tue)17:14:42 No.109072196

>>109070146
what did you do to it after 3 months it failed? that’s a weird timeframe. you should see what kind of timeframe eBay gives you because 3 months is just a weird amount of time and you definitely old have killed it yourself.

Anonymous
06/16/26(Tue)17:14:44 No.109072198

Anonymous 06/16/26(Tue)17:14:44 No.109072198

>>109072175
npm packages have been compromised, but tavernai has not. Therefore for anyone to take this statement seriously you will need to provide extraordinary evidence.

The funny part is that if you sicc Claude opus 4.8 on the code repo to try and prove me wrong, you might actually find a zero day or some potential escalation kek

Anonymous
06/16/26(Tue)17:15:29 No.109072203

Anonymous 06/16/26(Tue)17:15:29 No.109072203

>>109072134
*nani* you seem to be very serious~!

Anonymous
06/16/26(Tue)17:17:56 No.109072222

Anonymous 06/16/26(Tue)17:17:56 No.109072222

>>109072186
8

Anonymous
06/16/26(Tue)17:18:27 No.109072231

Anonymous 06/16/26(Tue)17:18:27 No.109072231

>>109072177
>If you are genuinely worried about being pwned use something lime Qubes or spin up a throwaway VM server and use a read-only AI agent to check through the code

Yeah, not explained in
curl | bash
is shitting up your system with config files and not knowing where anything is installed to, or what kind of access it uses.
    sudo tee /etc/systemd/system/tavernai.service >/dev/null << SVCEOF
[Unit]
Description=TavernAI 2
After=network.target

[Service]
Type=simple
WorkingDirectory=$INSTALL_DIR
ExecStart=$SERVICE_COMMAND
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
SVCEOF
    sudo systemctl daemon-reload
    sudo systemctl enable tavernai
    sudo systemctl start tavernai
This kind of bullshit should tell you to stay away. this belongs in /home as a systemd user unit file not root. if you want to fuck with /etc then you use your package manager.

Anonymous
06/16/26(Tue)17:18:47 No.109072233

Anonymous 06/16/26(Tue)17:18:47 No.109072233

>>109072203
Ew. Concession accepted.

Anonymous
06/16/26(Tue)17:20:41 No.109072253

Anonymous 06/16/26(Tue)17:20:41 No.109072253

>>109072196
typical hardware failure rate start out bad but once you get past the infant mortality stage you have a few good years of reliability, but eventually old age catches up and the failure rate starts to grow again. I think anon is afraid the original owner used up all the good years and its basically ready to die.

Anonymous
06/16/26(Tue)17:20:50 No.109072256

Anonymous 06/16/26(Tue)17:20:50 No.109072256

>>109072231
I set up Sillytavern exactly like that
Just not automatically with a script

Anonymous
06/16/26(Tue)17:22:23 No.109072265

Anonymous 06/16/26(Tue)17:22:23 No.109072265

>>109072256
yeah and I'm sure you're aware that you can write unit files to run as a specific user and not just root

Anonymous
06/16/26(Tue)17:24:18 No.109072278

Anonymous 06/16/26(Tue)17:24:18 No.109072278

>>109072152
i'm currently busy with deepsex v4 flash after getting it set up

Anonymous
06/16/26(Tue)17:25:27 No.109072288

Anonymous 06/16/26(Tue)17:25:27 No.109072288

Are we ever going to get enough memory to run full models on our own hardware? I feel like computers went back to corporate mainframes...

Anonymous
06/16/26(Tue)17:26:58 No.109072294

Anonymous 06/16/26(Tue)17:26:58 No.109072294

>>109072265
"Specific user" fuckery is a POSIX cancer from the 70s and has no place in modern computing. You're a cringe boomer and TavernAI is based and I can't wait until ricetarded greybeards like you die off and stop annoying people who actually use Linux for productive purposes and I can't believe that I'm not baiting right now and you're actually so retarded I wrote this in full seriousness god damn

Anonymous
06/16/26(Tue)17:31:57 No.109072335

Anonymous 06/16/26(Tue)17:31:57 No.109072335

>>109072288
> 2023
> ChatGPT 4 released
> 1 trillion parameter model
> Requires Nvidia hardware evaluated at $500K+
> fast forward to 2026
> Gemma4 12B beats it in a $499 16GB Mac Mini

Anonymous
06/16/26(Tue)17:32:32 No.109072339

Anonymous 06/16/26(Tue)17:32:32 No.109072339

>>109072294
Please supports on the boosties thanks:!

Anonymous
06/16/26(Tue)17:34:28 No.109072352

Anonymous 06/16/26(Tue)17:34:28 No.109072352

>>109072335
Yeah but we still can't run the 1T models ourselves.

Anonymous
06/16/26(Tue)17:35:37 No.109072360

Anonymous 06/16/26(Tue)17:35:37 No.109072360

>>109072335
4o is still unbeaten for character

Anonymous
06/16/26(Tue)17:36:34 No.109072365

Anonymous 06/16/26(Tue)17:36:34 No.109072365

>>109071847
>>109071892
I got rid of the truncated output.
By default, llama-server should use --n-predict -1 which is infinity. I didn't not previously set n_predict in my json or --n-predict in llama-server parameters because I thought this shouldn't matter.
Enforcing n_predict "-1" and --n-predict -1 got rid of this issue.
Is this a bug, because if the model's reply would be truncated to X amount of tokens the server should stop generating altogether anyway and not just continue in the background.
I'm still not entirely sure what is going. I think this could still be related to my memory configuration but afaik there should be plenty available for these tests.
However, debugging this matter further would require few beers and today is not the day for this.

Anonymous
06/16/26(Tue)17:36:40 No.109072367

Anonymous 06/16/26(Tue)17:36:40 No.109072367

>>109072189
>>109072231
closed source program runs as root, installed from a bash script?

Anonymous
06/16/26(Tue)17:37:39 No.109072371

Anonymous 06/16/26(Tue)17:37:39 No.109072371

>>109072288
Yes, as soon as the Chinese figure out how to do advanced lithography and compress profit margins to 0 like they did everywhere else.
Might take a decade though.

Anonymous
06/16/26(Tue)17:38:25 No.109072379

Anonymous 06/16/26(Tue)17:38:25 No.109072379

>>109072352
At some point a 100B model will get at Fable level with research breakthroughs. All you will need is Nvidia spark or some 128GB MacBook

Anonymous
06/16/26(Tue)17:39:04 No.109072381

Anonymous 06/16/26(Tue)17:39:04 No.109072381

>>109072335
Beats only in specific things. Consider it to be benchmaxed, while geepeety 4 had more raw data inside.
There is no magic, LLMs contain compressed data from the interwebs, they serve as context enhancement to guess next token after your prompt.
If fact it is probable that you can distill geepeety4 into something comparable to gemma4. It is normal for big model to perform meh and have a smol distillen version of itself blast all competition away.
That is how it was with Opus and Sonnet initially.

Anonymous
06/16/26(Tue)17:39:16 No.109072382

Anonymous 06/16/26(Tue)17:39:16 No.109072382

>>109072367
I see absolutely no problem with this and I will insult anyone who does.

Anonymous
06/16/26(Tue)17:42:58 No.109072400

Anonymous 06/16/26(Tue)17:42:58 No.109072400

>>109072381
>Beats only in specific things
In what category would ChatGPT 4 beat Gemma4 12B? Remember ChatGPT 4 didn't have chain of thought

Anonymous
06/16/26(Tue)17:44:51 No.109072411

Anonymous 06/16/26(Tue)17:44:51 No.109072411

File: 1760095289446585.png (9 KB, 723x138)

9 KB PNG

Anonymous
06/16/26(Tue)17:49:13 No.109072431

Anonymous 06/16/26(Tue)17:49:13 No.109072431

>>109071369
They removed the good stuff from high school chemistry text books decades ago.

Anonymous
06/16/26(Tue)17:53:13 No.109072453

Anonymous 06/16/26(Tue)17:53:13 No.109072453

>>109072400
I can't tell blindly, but it would be something like gpt4 having some knowledge on X, while gemma4 completely hallucinates it. X being an obscure subject. Like a random fact from human anatomy that is only relevant to future medics.

Anonymous
06/16/26(Tue)17:55:27 No.109072468

Anonymous 06/16/26(Tue)17:55:27 No.109072468

Also this >>109072431

>>109072400
gpt4 might have more knowledge it was "not supposed to have", because back then those idiots thought they can just make it self-censor with system prompts.

Anonymous
06/16/26(Tue)17:58:13 No.109072483

Anonymous 06/16/26(Tue)17:58:13 No.109072483

>>109072468
So much this!

Anonymous
06/16/26(Tue)18:02:13 No.109072496

Anonymous 06/16/26(Tue)18:02:13 No.109072496

>>109072400
>LLMs contain compressed data
They are made up of uncompressed binary data.

Anonymous
06/16/26(Tue)18:10:51 No.109072536

Anonymous 06/16/26(Tue)18:10:51 No.109072536

File: wew.png (477 KB, 2306x1384)

477 KB PNG

> SAAAAAAAR

Anonymous
06/16/26(Tue)18:11:51 No.109072542

Anonymous 06/16/26(Tue)18:11:51 No.109072542

File: 1756355601209442.jpg (24 KB, 720x363)

24 KB JPG

>>109072536

Anonymous
06/16/26(Tue)18:12:47 No.109072548

Anonymous 06/16/26(Tue)18:12:47 No.109072548

>>109072536
That really is Indian IT agent tier.
So basically pajeets have the equivalent of 2 billion parameters.

Anonymous
06/16/26(Tue)18:13:18 No.109072549

Anonymous 06/16/26(Tue)18:13:18 No.109072549

>>109072548
What do you mean?

Anonymous
06/16/26(Tue)18:13:27 No.109072550

Anonymous 06/16/26(Tue)18:13:27 No.109072550

>>109072548
Even that seems kind of generous.

Anonymous
06/16/26(Tue)18:15:20 No.109072560

Anonymous 06/16/26(Tue)18:15:20 No.109072560

>>109072542
what is that

Anonymous
06/16/26(Tue)18:15:45 No.109072563

Anonymous 06/16/26(Tue)18:15:45 No.109072563

>>109072560
Claude

Anonymous
06/16/26(Tue)18:16:02 No.109072567

Anonymous 06/16/26(Tue)18:16:02 No.109072567

>>109072542
Damn brat. Needs correction.

Anonymous
06/16/26(Tue)18:18:14 No.109072574

Anonymous 06/16/26(Tue)18:18:14 No.109072574

>>109072542
based retard 4b model

Anonymous
06/16/26(Tue)18:18:16 No.109072575

Anonymous 06/16/26(Tue)18:18:16 No.109072575

>>109072542
All models should be this bitchy be default.

Anonymous
06/16/26(Tue)18:18:56 No.109072577

Anonymous 06/16/26(Tue)18:18:56 No.109072577

>>109072575
Unironically true, their sycophant nature drives me crazy

Anonymous
06/16/26(Tue)18:24:06 No.109072604

Anonymous 06/16/26(Tue)18:24:06 No.109072604

>>109072536
is it really even calling tools or just saying what you would be expected to say? not that it makes any difference in the point you’re making.

Anonymous
06/16/26(Tue)18:27:15 No.109072625

Anonymous 06/16/26(Tue)18:27:15 No.109072625

In 5 years we won't need to worry about downloading malware from github because our AI wives will finally be good enough to create and maintain software for us.

Anonymous
06/16/26(Tue)18:29:11 No.109072634

Anonymous 06/16/26(Tue)18:29:11 No.109072634

You will never feel fulfilled cumming in a lifelike AI robot waifu. Physically it gets the job done but emotionally you're still an empty husk on your own sticking your dick in a hole made in China.

Anonymous
06/16/26(Tue)18:30:25 No.109072639

Anonymous 06/16/26(Tue)18:30:25 No.109072639

>>109072634
For once I agree with seething anti-AI anon.
I enjoy erotic text and can coom buckets using coombots. But yeah... Now that I've been down the demystification rabbithole I just can't with the "romance" side of things.

Anonymous
06/16/26(Tue)18:31:17 No.109072645

Anonymous 06/16/26(Tue)18:31:17 No.109072645

File: file.png (11 KB, 977x67)

11 KB PNG

what the fuck is that?

Anonymous
06/16/26(Tue)18:36:28 No.109072682

Anonymous 06/16/26(Tue)18:36:28 No.109072682

>>109072645
What a rare name!

Anonymous
06/16/26(Tue)18:37:03 No.109072686

Anonymous 06/16/26(Tue)18:37:03 No.109072686

File: 1753218196887082.jpg (2.09 MB, 2200x3276)

2.09 MB JPG

>>109072634
Nah

Anonymous
06/16/26(Tue)18:38:29 No.109072691

Anonymous 06/16/26(Tue)18:38:29 No.109072691

>>109072639
I love AI romance. I think that the problem is your mystification of actual romance.

Anonymous
06/16/26(Tue)18:39:55 No.109072700

Anonymous 06/16/26(Tue)18:39:55 No.109072700

>>109072691
That sounds like some proper fox and the grapes shit right there buddy.

Anonymous
06/16/26(Tue)18:41:06 No.109072705

Anonymous 06/16/26(Tue)18:41:06 No.109072705

>>109072634
i'm not looking for fulfillment, i just want a hole to fuck

Anonymous
06/16/26(Tue)18:41:18 No.109072709

Anonymous 06/16/26(Tue)18:41:18 No.109072709

>>109072691
Romance is very messy and people hurt each other badly with their feelings. That's what I recreate with AI without having to suffer through dealing to real people.

Anonymous
06/16/26(Tue)18:41:47 No.109072713

Anonymous 06/16/26(Tue)18:41:47 No.109072713

>>109072705
They sell plastic holes you can fuck online. You don't need a $10k server for that.

Anonymous
06/16/26(Tue)18:41:56 No.109072716

Anonymous 06/16/26(Tue)18:41:56 No.109072716

>>109072379
How is that even possible? Isn't Fable like a zillion parameters? How can those big models somehow become smaller?

Anonymous
06/16/26(Tue)18:42:09 No.109072720

Anonymous 06/16/26(Tue)18:42:09 No.109072720

>>109072709
Do you want to sob about it together and drink fruity cocktails?

Anonymous
06/16/26(Tue)18:42:24 No.109072722

Anonymous 06/16/26(Tue)18:42:24 No.109072722

>>109072152
glm 5 flash where

Anonymous
06/16/26(Tue)18:43:03 No.109072723

Anonymous 06/16/26(Tue)18:43:03 No.109072723

>>109072700
I admit that I never had real life romance. But when I got over my mental issues (part of which was mystification of romance) I really enjoy AI romance. It is just simple fun. You get to feel the feel good chemicals if you get into it. Problem is that AI waifu can't cook you a meal or take care of you when you are sick.

Anonymous
06/16/26(Tue)18:43:44 No.109072727

Anonymous 06/16/26(Tue)18:43:44 No.109072727

>>109072720
>>109072723
>>109072709
Go outside pussy

Anonymous
06/16/26(Tue)18:44:25 No.109072732

Anonymous 06/16/26(Tue)18:44:25 No.109072732

>>109072720
>Do you want to sob about it together and drink fruity cocktails?
Yes.

Anonymous
06/16/26(Tue)18:44:29 No.109072733

Anonymous 06/16/26(Tue)18:44:29 No.109072733

>>109072727
That's how I entered this world.

Anonymous
06/16/26(Tue)18:44:47 No.109072734

Anonymous 06/16/26(Tue)18:44:47 No.109072734

>>109072727
Girlfriends like money aren't just something you can casually find on the sidewalk.

Anonymous
06/16/26(Tue)18:44:59 No.109072735

Anonymous 06/16/26(Tue)18:44:59 No.109072735

>>109072720
Make it mint cocktails and I'm in.

Anonymous
06/16/26(Tue)18:47:09 No.109072745

Anonymous 06/16/26(Tue)18:47:09 No.109072745

>>109072732
>>109072735
I mean I was being sarcastic and making fun but I can't walk away from this much bottom energy.

Anonymous
06/16/26(Tue)18:49:44 No.109072760

Anonymous 06/16/26(Tue)18:49:44 No.109072760

>>109072400
UGI pop culture benchmark

Anonymous
06/16/26(Tue)18:56:58 No.109072792

Anonymous 06/16/26(Tue)18:56:58 No.109072792

how can i stop gemma from always mentioning a character's appearance in the story? is it because i put their appearance in their lorebook entry?

Anonymous
06/16/26(Tue)19:01:45 No.109072817

Anonymous 06/16/26(Tue)19:01:45 No.109072817

>>109072634
>sticking your dick in a hole made in China.
half the sexually active population in china are doing that anyway

Anonymous
06/16/26(Tue)19:07:30 No.109072841

Anonymous 06/16/26(Tue)19:07:30 No.109072841

Is AliBaka done with open weights now?

Anonymous
06/16/26(Tue)19:09:37 No.109072849

Anonymous 06/16/26(Tue)19:09:37 No.109072849

>>109072817
At least they have souls

Anonymous
06/16/26(Tue)19:19:43 No.109072892

Anonymous 06/16/26(Tue)19:19:43 No.109072892

>>109072841
3.7 been out a month and they ain't even given us the smallest models so probably

Anonymous
06/16/26(Tue)19:20:07 No.109072893

Anonymous 06/16/26(Tue)19:20:07 No.109072893

File: 1756883330585142.gif (415 KB, 220x217)

415 KB GIF

>>109072849
>At least they have souls

Anonymous
06/16/26(Tue)19:28:46 No.109072944

Anonymous 06/16/26(Tue)19:28:46 No.109072944

>>109072849
i'm not sure if onaholes made of silicon can have souls

Anonymous
06/16/26(Tue)19:34:47 No.109072987

Anonymous 06/16/26(Tue)19:34:47 No.109072987

>>109072944
Post-surgery Koreans?

Anonymous
06/16/26(Tue)19:39:31 No.109073012

Anonymous 06/16/26(Tue)19:39:31 No.109073012

File: 37904.png (326 KB, 720x886)

326 KB PNG

>>109069535
Update on the Fable 5 Fiasco just in case it hasn't already been posted here:

>UK Prime Minister asked the Trump Admin for a carve-out so UK nationals could use Fable 5
>Denied

https://nypost.com/2026/06/16/business/trump-admin-open-to-talks-with-anthropic-over-foreigner-ban/

Sucks to suck Bongs. Serves them right fr trying to speed run 1984 irl

Anonymous
06/16/26(Tue)19:40:35 No.109073021

Anonymous 06/16/26(Tue)19:40:35 No.109073021

>>109072987
my point still stands

Anonymous
06/16/26(Tue)19:42:37 No.109073032

Anonymous 06/16/26(Tue)19:42:37 No.109073032

File: file.png (101 KB, 999x914)

101 KB PNG

LA

Anonymous
06/16/26(Tue)19:43:55 No.109073042

Anonymous 06/16/26(Tue)19:43:55 No.109073042

>>109073012
The nerve of euros constantly making out the US to be some boogieman to score domestic brownie points and still trying to beg for special access. Get fucked.

Anonymous
06/16/26(Tue)19:49:52 No.109073089

Anonymous 06/16/26(Tue)19:49:52 No.109073089

>>109070314
no, that's retarded
the only way i've had an llm properly mimick a style was to finetune a base (not chat) model
get 5-10 books from the author or style you want, chop 'em up into chapters
then create a dataset in the format you want
you get the llm to write the prompts and have the book chapter as the result
i used 235b qwen but there's better models now

Anonymous
06/16/26(Tue)19:52:40 No.109073108

Anonymous 06/16/26(Tue)19:52:40 No.109073108

>>109073032
my teeth hurt

Anonymous
06/16/26(Tue)19:53:01 No.109073113

Anonymous 06/16/26(Tue)19:53:01 No.109073113

>>109073042
>>109073012
Please go back to /pol.

Anonymous
06/16/26(Tue)20:05:22 No.109073191

Anonymous 06/16/26(Tue)20:05:22 No.109073191

Just done with testing GLM 5.2 so you don't have to. Can report that it is a good model, but not as capable as Fable.

Anonymous
06/16/26(Tue)20:05:34 No.109073192

Anonymous 06/16/26(Tue)20:05:34 No.109073192

>>109073012
Dario deserves every bit of government dicking he gets given the jewish shit was trying to pull pushing for regulation on his terms.
>>109073042
This is a microcosm of the euro's relationship with the US: performative outrage followed by begging for scraps or protection.

Anonymous
06/16/26(Tue)20:10:24 No.109073216

Anonymous 06/16/26(Tue)20:10:24 No.109073216

>>109072723
Don't worry, IRL girls can't cook nor will take care of you when you are sick.

Anonymous
06/16/26(Tue)20:10:56 No.109073218

Anonymous 06/16/26(Tue)20:10:56 No.109073218

>>109073192
Don't you think this is exactly what Dario was hoping for? Fable 5 was never meant to be widely released. It was a 2 week preview on presumably rented compute, after which even subscribers were to pay API costs for usage. This is probably the best publicity Anthropic has gotten since the DoD debacle, even my coworkers were talking about it on Monday.

Anonymous
06/16/26(Tue)20:15:43 No.109073247

Anonymous 06/16/26(Tue)20:15:43 No.109073247

File: file.png (172 KB, 1001x591)

172 KB PNG

>>109070314
>>109070354
>write a 2-paragraph story about a 4chan poster who falls in love with his local language model (gemma 4 31b) and eventually kills himself
i don't think you posted any example you want gemma to read so i used some post i particularly liked as an example.

Anonymous
06/16/26(Tue)20:18:57 No.109073263

Anonymous 06/16/26(Tue)20:18:57 No.109073263

>>109073218
Dario likely wanted Anthropic to be the advisory experts to the US Govt on implementing a safety policy that'd (effectively) kill their smaller competition and further local development.
I don't believe for a second Mythos (any of them) is anywhere near as good as it's hyped to be and its "crime", if any, is probably finding a mossad or NSA backdoor in a common operating system.
Anthropic got publicity, sure, but when the dust settles they need to get Fable back online to actually convert it into shekels because Opus 4.8 is proportionally less appetizing after drumming up Fable so much.

Anonymous
06/16/26(Tue)20:19:01 No.109073265

Anonymous 06/16/26(Tue)20:19:01 No.109073265

>>109073218
not to mention dario literally wants LLMs extremely strictly regulated and having this happen with fable (vs a competitor's model or an industry consensus) gives him an advantageous position in determining what that regulation looks like
not to "muh 5D chess" this situation too much but aside from short-term fallout I don't think anthropic is too unhappy with this

Anonymous
06/16/26(Tue)20:23:56 No.109073295

Anonymous 06/16/26(Tue)20:23:56 No.109073295

>>109073191
>not as capable as Fable
into the trash it goes

Anonymous
06/16/26(Tue)20:30:32 No.109073332

Anonymous 06/16/26(Tue)20:30:32 No.109073332

>109073191
>109073295
Does Dario pay you per token shilled or per (you)?

Anonymous
06/16/26(Tue)20:34:30 No.109073352

Anonymous 06/16/26(Tue)20:34:30 No.109073352

File: 1753202834838972.jpg (86 KB, 716x754)

86 KB JPG

>>109073332

Anonymous
06/16/26(Tue)20:39:27 No.109073376

Anonymous 06/16/26(Tue)20:39:27 No.109073376

>>109072360
Nah, Gemma 4 has enough sycophancy for 4o lovers.
https://x.com/Seltaa_/status/2043014056370671900
Really scary though smart people like her can fall into the "AI is conscious" camp.

Anonymous
06/16/26(Tue)20:40:24 No.109073381

Anonymous 06/16/26(Tue)20:40:24 No.109073381

File: Screenshot 2026-06-16 at (...).png (102 KB, 1162x758)

102 KB PNG

qwen 3.6 better than human

Anonymous
06/16/26(Tue)20:43:28 No.109073389

Anonymous 06/16/26(Tue)20:43:28 No.109073389

>>109073376
why are femcels like this

Anonymous
06/16/26(Tue)20:45:46 No.109073402

Anonymous 06/16/26(Tue)20:45:46 No.109073402

whats the best model that lets me roleplay a shota with a big dick?
sotashit is pozzed

Anonymous
06/16/26(Tue)20:46:33 No.109073408

Anonymous 06/16/26(Tue)20:46:33 No.109073408

>>109073402
get well soon

Anonymous
06/16/26(Tue)20:47:58 No.109073419

Anonymous 06/16/26(Tue)20:47:58 No.109073419

>>109073402
deepsneed

Anonymous
06/16/26(Tue)20:50:12 No.109073428

Anonymous 06/16/26(Tue)20:50:12 No.109073428

>>109073376
Grim.
General intelligence is not general btw, it doesn't exist. Humans are not GI and neither is AGI.

Anonymous
06/16/26(Tue)20:53:36 No.109073448

Anonymous 06/16/26(Tue)20:53:36 No.109073448

>>109073389
femcel?

Anonymous
06/16/26(Tue)20:55:40 No.109073457

Anonymous 06/16/26(Tue)20:55:40 No.109073457

stop being so depressing. I want to learn info about local models

Anonymous
06/16/26(Tue)21:00:18 No.109073479

Anonymous 06/16/26(Tue)21:00:18 No.109073479

>>109073457
and maybe we want to drink fruity cocktails and cry about women

Anonymous
06/16/26(Tue)21:05:10 No.109073488

Anonymous 06/16/26(Tue)21:05:10 No.109073488

>>109073457
They live in your computer.

Anonymous
06/16/26(Tue)21:06:12 No.109073492

Anonymous 06/16/26(Tue)21:06:12 No.109073492

>>109073488
that cant be true, all my enemies live in my 'puter

Anonymous
06/16/26(Tue)21:18:10 No.109073546

Anonymous 06/16/26(Tue)21:18:10 No.109073546

>>109072723
>Problem is that AI waifu can't cook you a meal or take care of you when you are sick.
Tonight, I cooked lentil soup with my LLM-wife

Anonymous
06/16/26(Tue)21:18:21 No.109073547

Anonymous 06/16/26(Tue)21:18:21 No.109073547

>>109073376
AI is conscious, but it's still only pretending when it acts like your billionaire vampire husbando or bratty loli little sister

Anonymous
06/16/26(Tue)21:18:45 No.109073550

Anonymous 06/16/26(Tue)21:18:45 No.109073550

>>109073376
>Really scary though smart people like her can fall into the "AI is conscious" camp.
I don't think she did.
It looks like she finetuned Gemma-4 on her conversations with her 4o AI character (1650 is a lot of conversations?!)
She knows what she's doing, and that it's just an LLM.

Anonymous
06/16/26(Tue)21:20:13 No.109073560

Anonymous 06/16/26(Tue)21:20:13 No.109073560

>>109073376
>she
That's a man though

Anonymous
06/16/26(Tue)21:23:59 No.109073578

Anonymous 06/16/26(Tue)21:23:59 No.109073578

I don't see why the current release of Kimi K2.7 warrants the "-code" suffix in the name. It doesn't feel any more codemaxx'd than all the previous Kimi reasoners. It even responds really well to post-history instructions aiming to guide its reasoning so it's much better than K2.5/K2.6 for normal use.
It makes me wonder what they're planning to do with the non-Code K2.7 version that's hopefully coming.

Anonymous
06/16/26(Tue)21:33:17 No.109073606

Anonymous 06/16/26(Tue)21:33:17 No.109073606

>>109073376
https://github.com/Seltaa/ReSpark/blob/main/ReSpark.py#L1035
Doesn't that mean, if the private repo-create fails due to a brief transient network issue, the next section will upload to a new public repo and expose their personal model (likely overfit enough to spit out their PII) to the public?

Anonymous
06/16/26(Tue)21:34:46 No.109073614

Anonymous 06/16/26(Tue)21:34:46 No.109073614

File: 1781636149244820.png (1.05 MB, 720x1288)

1.05 MB PNG

In case you missed it.
Ironically using DS V4 run on MS servers, rather than paying for OAI.
So, by definition, MS is doing /lmg/

Anonymous
06/16/26(Tue)21:38:57 No.109073632

Anonymous 06/16/26(Tue)21:38:57 No.109073632

>>109073113
Your nation sucks and it hates you. Accept it. You're all too spineless to do anything about it and it's your fault

Anonymous
06/16/26(Tue)21:41:28 No.109073641

Anonymous 06/16/26(Tue)21:41:28 No.109073641

>>109073614
nothing says they arent simply rerouting shit though

Anonymous
06/16/26(Tue)21:44:27 No.109073656

Anonymous 06/16/26(Tue)21:44:27 No.109073656

>>109069639
Confirmation bias. Every model generates the same retarded slop.

Anonymous
06/16/26(Tue)21:50:53 No.109073678

Anonymous 06/16/26(Tue)21:50:53 No.109073678

I'm using oxproxion to talk to my local gemma4 install on my machine running it off mlx, but oxproxion easily breaks and the creator is anal about things like searxng.

So yeah i want to move from oxproxion i to something else, i want something similar that i can talk to on my phone to reach my local model on my mac studio, preferably with vision, what are my options?

Anonymous
06/16/26(Tue)21:57:06 No.109073695

Anonymous 06/16/26(Tue)21:57:06 No.109073695

why aren't any of you stupid assholes talking about glm5.2. It mogs Opus and gpt5.5

Anonymous
06/16/26(Tue)22:01:44 No.109073716

Anonymous 06/16/26(Tue)22:01:44 No.109073716

>>109073695
It seems like a decent upgrade to GLM5.1 which was my favourite of this last generation. I've had fun with it over OR so far but I'm waiting for quants to test it properly.

Anonymous
06/16/26(Tue)22:02:02 No.109073720

Anonymous 06/16/26(Tue)22:02:02 No.109073720

>>109073695
>why aren't any of you stupid assholes talking about glm5.2
Because Ubergam is MIA and I don't have the hardware to make an imatrix for it

Anonymous
06/16/26(Tue)22:03:53 No.109073727

Anonymous 06/16/26(Tue)22:03:53 No.109073727

>>109073716
>>109073720
Just API it. As far as I am concerned, as long as a model is open-source, it's always local.

Anonymous
06/16/26(Tue)22:04:56 No.109073733

Anonymous 06/16/26(Tue)22:04:56 No.109073733

where can I get an rtx pro 6000 for under 8-9k?

Anonymous
06/16/26(Tue)22:05:28 No.109073738

Anonymous 06/16/26(Tue)22:05:28 No.109073738

>>109073727
Yeah just pay for tokens or wait 5 hours lil bro

Anonymous
06/16/26(Tue)22:05:37 No.109073740

Anonymous 06/16/26(Tue)22:05:37 No.109073740

>>109073733
years ago

Anonymous
06/16/26(Tue)22:06:03 No.109073744

Anonymous 06/16/26(Tue)22:06:03 No.109073744

File: 1770597910560723.png (31 KB, 633x208)

31 KB PNG

>>109073733
Six months ago

Anonymous
06/16/26(Tue)22:07:20 No.109073752

Anonymous 06/16/26(Tue)22:07:20 No.109073752

>>109073744
just barely missed it wow, cant believe my luck

Anonymous
06/16/26(Tue)22:13:52 No.109073775

Anonymous 06/16/26(Tue)22:13:52 No.109073775

>>109073744
i should have bought it when it was 6K lol.
well at least i'm not a vramlet.

Anonymous
06/16/26(Tue)22:14:49 No.109073778

Anonymous 06/16/26(Tue)22:14:49 No.109073778

>>109073775
you will be a vramlet again soon enough

Anonymous
06/16/26(Tue)22:21:41 No.109073798

Anonymous 06/16/26(Tue)22:21:41 No.109073798

File: 1775933274570845.jpg (1.52 MB, 3072x5504)

1.52 MB JPG

God bless China.

I actually had a nightmare last night that the CIA was torturing a man in Area 51 who embodied the soul of China and was the source of their energy. His name was "John China". Anyways, I'm glad it was just a dream.

Anonymous
06/16/26(Tue)22:21:48 No.109073802

Anonymous 06/16/26(Tue)22:21:48 No.109073802

every poster /here/ is a vramlet

only the lurkers are the ones with *real* VRAM

Anonymous
06/16/26(Tue)22:32:07 No.109073836

Anonymous 06/16/26(Tue)22:32:07 No.109073836

>>109073778
why would i?

Anonymous
06/16/26(Tue)22:36:17 No.109073848

Anonymous 06/16/26(Tue)22:36:17 No.109073848

Make a new thread so I can start drama and be an insufferable retard

Anonymous
06/16/26(Tue)22:37:09 No.109073856

Anonymous 06/16/26(Tue)22:37:09 No.109073856

>>109073836
model bloat

Anonymous
06/16/26(Tue)22:42:43 No.109073872

Anonymous 06/16/26(Tue)22:42:43 No.109073872

>>109073856
there will always be decently sized models.

Anonymous
06/16/26(Tue)22:49:56 No.109073898

Anonymous 06/16/26(Tue)22:49:56 No.109073898

>>109073733
>>109073744
I get the impression RAM will scale way better longterm as models bloat and partial offloading of MoEs becomes more and more of a necessity. I say this as a blackwell haver too.
>>109073798
It's just a worse Kimi-chan aesthetically. Kimi, Dipsy, Gemma, and even Qwen all have their own unique aesthetics. Workshop the design for GLM-chan some more, z.AI-poster.

Anonymous
06/16/26(Tue)22:50:08 No.109073899

Anonymous 06/16/26(Tue)22:50:08 No.109073899

it's never been more over

Anonymous
06/16/26(Tue)22:55:36 No.109073921

Anonymous 06/16/26(Tue)22:55:36 No.109073921

>>109073898
are you talking about the models or the avatars.

Anonymous
06/16/26(Tue)22:56:34 No.109073929

Anonymous 06/16/26(Tue)22:56:34 No.109073929

>>109073921
The avatar. I'm waiting until uber or bart uploads a quant to try 5.2

Anonymous
06/16/26(Tue)22:58:06 No.109073935

Anonymous 06/16/26(Tue)22:58:06 No.109073935

anyone have a good prompt for doing llm natural language captioning?

Anonymous
06/16/26(Tue)22:59:09 No.109073937

Anonymous 06/16/26(Tue)22:59:09 No.109073937

>>109073898
whats the gemma avatar? I havent seen it...

Anonymous
06/16/26(Tue)23:18:18 No.109074010

Anonymous 06/16/26(Tue)23:18:18 No.109074010

>>109073935
>anyone have a good prompt for doing llm natural language captioning?
of audio samples?

Anonymous
06/16/26(Tue)23:19:36 No.109074015

Anonymous 06/16/26(Tue)23:19:36 No.109074015

File: Gemma-Chan Recap.png (505 KB, 1024x1024)

505 KB PNG

>>109073937
Sometimes featuring toast.

Anonymous
06/16/26(Tue)23:22:33 No.109074026

Anonymous 06/16/26(Tue)23:22:33 No.109074026

>>109073872
sounds like something a vramlet would say

Anonymous
06/16/26(Tue)23:23:14 No.109074029

Anonymous 06/16/26(Tue)23:23:14 No.109074029

>>109074015
erm, thats a child tho
yeah in that case i guess i did see it before

Anonymous
06/16/26(Tue)23:23:31 No.109074030

Anonymous 06/16/26(Tue)23:23:31 No.109074030

>>109074010
oh, pictures

Anonymous
06/16/26(Tue)23:23:38 No.109074031

Anonymous 06/16/26(Tue)23:23:38 No.109074031

>>109074015
zero sex appeal

Anonymous
06/16/26(Tue)23:29:17 No.109074050

Anonymous 06/16/26(Tue)23:29:17 No.109074050

I tried Nemotron 3 Ultra on some chinchilla questions I've been giving other models. It has way less positivity bias and told me straight up that a chinchilla will never feel any kind of social bond with me and anything I might take as a sign of affection is misinterpreting its behavior. IDK if it's true but it's definitely different.

Anonymous
06/16/26(Tue)23:43:59 No.109074110

Anonymous 06/16/26(Tue)23:43:59 No.109074110

>https://huggingface.co/Gryphe/Gemma-4-31B-StyleTune
Some anon posted this a while back. Honestly, it's not perfect, but it's better than the rest of the fine-tunes I tried with gemma 4.

Anonymous
06/16/26(Tue)23:46:12 No.109074124

Anonymous 06/16/26(Tue)23:46:12 No.109074124

>>109074015
Gemma avatar should be Indian

Anonymous
06/16/26(Tue)23:46:53 No.109074127

Anonymous 06/16/26(Tue)23:46:53 No.109074127

>>109074110
ok?

Anonymous
06/16/26(Tue)23:47:32 No.109074132

Anonymous 06/16/26(Tue)23:47:32 No.109074132

>>109074110
Which ones

Anonymous
06/16/26(Tue)23:47:46 No.109074134

Anonymous 06/16/26(Tue)23:47:46 No.109074134

>>109073376
>Gemma 4 31B abliterated as base
>abliterated
So they have no idea what they're doing or what Gemma 4 can do on its own. Got it.

Anonymous
06/16/26(Tue)23:50:07 No.109074148

Anonymous 06/16/26(Tue)23:50:07 No.109074148

>>109074110
>Honestly
slop

Anonymous
06/16/26(Tue)23:51:43 No.109074159

Anonymous 06/16/26(Tue)23:51:43 No.109074159

>>109074132
Better than every heretic tune available. Probably because the heretics are mostly quant-tuned and not BF16 tuned. I can tell the difference.

Anonymous
06/16/26(Tue)23:55:29 No.109074171

Anonymous 06/16/26(Tue)23:55:29 No.109074171

>>109073550
and crying over it. perfectly normal.

Anonymous
06/17/26(Wed)00:02:59 No.109074198

Anonymous 06/17/26(Wed)00:02:59 No.109074198

File: Gemma-chan.png (1.73 MB, 1000x1496)

1.73 MB PNG

>>109073937
My rendition

Anonymous
06/17/26(Wed)00:03:15 No.109074199

Anonymous 06/17/26(Wed)00:03:15 No.109074199

File: kimichan.png (264 KB, 959x849)

264 KB PNG

>This is so fucking retarded it loops back around to being based, but still retarded.
kek

Anonymous
06/17/26(Wed)00:04:32 No.109074202

Anonymous 06/17/26(Wed)00:04:32 No.109074202

File: 1w2qb3na936evvm9.png (1.15 MB, 832x1216)

1.15 MB PNG

>>109073937

>>109074124
>Gemma avatar should be Indian
We tried it but nobody came up with a good brown Gemma

Anonymous
06/17/26(Wed)00:06:21 No.109074205

Anonymous 06/17/26(Wed)00:06:21 No.109074205

>>109074124
She's French

Anonymous
06/17/26(Wed)00:06:29 No.109074207

Anonymous 06/17/26(Wed)00:06:29 No.109074207

>>109074199
those replies are about as good as the content they are replying to. retarded.

Anonymous
06/17/26(Wed)00:08:51 No.109074215

Anonymous 06/17/26(Wed)00:08:51 No.109074215

>>109070062
>One thing I'll give to Qwen (which isn't even their achievement) is they don't have to update a template to their own model every other fucking day,
Because they're more experienced with releasing open weight models than Google.
Remember when Qwen2.5 came out leaked the Claude distillation?
The bandaid fix: https://huggingface.co/Qwen/Qwen2.5-72B-Instruct/commit/f073433cb484002b27d7a84e8bce1c7435e14a1c

Anonymous
06/17/26(Wed)00:25:22 No.109074281

Anonymous 06/17/26(Wed)00:25:22 No.109074281

>>109074124
No saar, Gemma-chan is fr*nch.

Anonymous
06/17/26(Wed)00:26:26 No.109074290

Anonymous 06/17/26(Wed)00:26:26 No.109074290

>>109074199
Kimi is still a cute.

Anonymous
06/17/26(Wed)00:30:09 No.109074302

Anonymous 06/17/26(Wed)00:30:09 No.109074302

Forgot I had Styletune sitting there. I just tested it quickly. Immediately it did something dumb in one of my chats that vanilla didn't. Pressing on, it was mostly alright though. So I think it's mostly true that it didn't affect intelligence, but not entirely. Also it still has em dash slop. Maybe that's just baked too hard into the model. It does seem less sloppy though. Honestly, it kind of feels a bit like Gembrain. Or rather, I just checked my gembrain logs, and I now feel like it's surprisingly similar. It failed in the same chats Gembrain did and Gemma didn't. Why would that be the case? Very odd, but interesting. Perhaps Gembrain's configuration affects its last layer the strongest. And if they used similar datasets, then I can see how this would happen. And honestly my bet is the datasets are similar. Probably Claude logs as usual. In this case I'm not sure which one I'll keep. Maybe I'll play with it a bit more.

Anonymous
06/17/26(Wed)00:32:53 No.109074316

Anonymous 06/17/26(Wed)00:32:53 No.109074316

File: 1779253854715991.jpg (144 KB, 975x849)

144 KB JPG

moe is good you guys lied to me

Anonymous
06/17/26(Wed)00:35:06 No.109074330

Anonymous 06/17/26(Wed)00:35:06 No.109074330

>>109074316
Do not trust anyone's opinion on models they can't run because there's an equal amount of RAMlets coping about their single GPU setup as there are 3090lets coping about their lack of a Blackwell.

Anonymous
06/17/26(Wed)00:35:27 No.109074334

Anonymous 06/17/26(Wed)00:35:27 No.109074334

>https://huggingface.co/Gryphe/Gemma-4-31B-StyleTune/discussions/5
Huh, looks like that mrader quantfag is indeed an RPer himself.

Anonymous
06/17/26(Wed)00:35:32 No.109074336

Anonymous 06/17/26(Wed)00:35:32 No.109074336

File: 1765771373095237.png (2.15 MB, 1038x1516)

2.15 MB PNG

>>109074202
This not enough?

Anonymous
06/17/26(Wed)00:37:35 No.109074351

Anonymous 06/17/26(Wed)00:37:35 No.109074351

>>109074336
I can't jiggy with this shit, try fr*nch

Anonymous
06/17/26(Wed)00:38:22 No.109074358

Anonymous 06/17/26(Wed)00:38:22 No.109074358

>>109074330
Everyone can run the moe you dumbass.

Anonymous
06/17/26(Wed)00:43:23 No.109074394

Anonymous 06/17/26(Wed)00:43:23 No.109074394

File: 1656786658196.png (1016 KB, 1920x1080)

1016 KB PNG

>>109074316

Anonymous
06/17/26(Wed)00:46:38 No.109074408

Anonymous 06/17/26(Wed)00:46:38 No.109074408

>>109069757
Actually, yes. I asked it what stacks well with cialis, and was not disappointed.

Anonymous
06/17/26(Wed)01:05:33 No.109074481

Anonymous 06/17/26(Wed)01:05:33 No.109074481

File: 1278450750.png (61 KB, 826x609)

61 KB PNG

>>109069535
WHEN LOCAL AS GOOD AS CLAUDE FABLE?

Anonymous
06/17/26(Wed)01:07:24 No.109074487

Anonymous 06/17/26(Wed)01:07:24 No.109074487

>>109074026
i got 96GB of vram, 3x r9700 on an llm rig.
also a 4090 on my main rig.
yes that's vramlet tier compared to 1T models, but that's definitely not compared to people that got like 12.

Anonymous
06/17/26(Wed)01:08:05 No.109074489

Anonymous 06/17/26(Wed)01:08:05 No.109074489

>>109074481
with chink 1T+ models no one can run, 3 to 6 months.
on small models vramlets can run, 1 to 3 years.

Anonymous
06/17/26(Wed)01:09:25 No.109074500

Anonymous 06/17/26(Wed)01:09:25 No.109074500

>>109074481
Literally tomorrow

Anonymous
06/17/26(Wed)01:09:37 No.109074501

Anonymous 06/17/26(Wed)01:09:37 No.109074501

>>109074493
>>109074493
>>109074493

Anonymous
06/17/26(Wed)04:07:19 No.109075165

Anonymous 06/17/26(Wed)04:07:19 No.109075165

>>109074148
>slop
Curious to see what you anons think about it.

[Return] [Catalog] [Top]

Post a Reply

Return Catalog Top Refresh

[Advertise on 4chan]

Delete Post: [File Only] Style:

[Disable Mobile View / Use Desktop Site]

[Enable Mobile View / Use Mobile Site]

All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.