/g/ - Technology

File: robololi hugs GPU 2.jpg (519 KB, 1024x1024)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>108519856 & >>108516658

►News
>(04/02) Gemma 4 released: https://blog.google/innovation-and-ai/technology/developers-tools/gemma-4
>(04/01) Trinity-Large-Thinking released: https://hf.co/arcee-ai/Trinity-Large-Thinking
>(04/01) Merged llama : rotate activations for better quantization #21038: https://github.com/ggml-org/llama.cpp/pull/21038
>(04/01) Holo3 VLMs optimized for GUI Agents released: https://hcompany.ai/holo3
>(03/31) 1-bit Bonsai models quantized from Qwen 3: https://prismml.com/news/bonsai-8b

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
Token Speed Visualizer: https://shir-man.com/tokens-per-second

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
File: chibi_miku_gpu_1.png (1.29 MB, 1024x1024)
►Recent Highlights from the Previous Thread: >>108519856

--Discussing handling of Gemma's thinking blocks in multi-turn histories:
>108522106 >108522122 >108522130 >108522225 >108522326 >108522301 >108522330 >108522349 >108522396
--Comparing Gemma 4's performance and repetition issues against Mistral 3.x:
>108520556 >108520583 >108520616 >108520591 >108520629 >108520663 >108521005 >108520665 >108520695 >108520794 >108520871 >108521612 >108521633 >108521640
--Discussing logit softcapping for Gemma 4 to improve response variety:
>108521009 >108521025 >108521026 >108521075 >108521091 >108521139 >108521303 >108522677 >108522691 >108522702 >108522943 >108522949 >108522955
--Gemma 4's low censorship and debugging llama-server crashes:
>108521733 >108521777 >108521796 >108521822 >108521831 >108521872 >108521908 >108521978 >108521989 >108522157 >108522161 >108522332 >108522771
--KV Cache quantization and context length optimization for roleplay:
>108521216 >108521226 >108521307 >108521373 >108521385 >108521388 >108521401
--Discussing a patch making Gemma's logits softcap configurable:
>108520086 >108520139 >108520210 >108520231
--Debating llama.cpp's direction following Gemma 4's reception:
>108520807 >108520826 >108520858 >108520880 >108520921 >108520990 >108520860 >108520934
--Impressions of Gemma-4-2B's roleplay quality and thinking capabilities:
>108520161 >108520190 >108520317 >108520493 >108520537
--Anon buys RTX 6000 for Gemma 4's high KV cache needs:
>108519917 >108519933 >108519982 >108519950 >108519952 >108519983 >108519960
--Praising Gemma 4 for ERP and discussing perspective tests and performance:
>108519877 >108519901 >108520393 >108520589 >108520547 >108520608 >108520642 >108521252 >108521812 >108522080
--Miku (free space):
>108520018 >108520164 >108520232 >108520411 >108520425 >108521082 >108521554 >108521568 >108521656

►Recent Highlight Posts from the Previous Thread: >>108519859

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>
>>108523376
wait whos this bitch
>>
File: 1775296187.png (275 KB, 1762x1448)
First for NIM
>>
>>108523394
>First
epic fail
>>
>>108523398
based
>>
>>108523389
don't ask an llm for its opinion on what log output says without also showing it the relevant code, they love making assumptions about things
pic related is why vibecoding or vibe-asking an llm questions about code is a fail when you don't know yourself what should be introduced into the context
>>
>>108523415
Damn, so many things are wrong with gemma 4's implementation
- We don't know if the rotation shit is being applied or not
- The temperature does nothing
- There's a crash during tool calls

bruh, I miss the time before llama.cpp was bought by huggingface, when the enshittification wasn't as fast as it is now; their team now consists of vibeshitters who can't verify for themselves whether Claude is hallucinating stuff or not
>>
Can we actually be reasonably sure that this technology is approaching its upper limits in what it can do and isn't just bottlenecked at every stage of development by jeets and vibecoders shitting it all up?

Would AGI SOON :rocket: actually be possible in a less brown world?
>>
Imagine when Gemma 4 124B gets revealed on May 20. I wonder if the Chinese will rush to release their best models before that. Or perhaps they'll wait for Google's flagship Gemma to distill the heck out of it like they've been doing with gpt-oss-120B.
>>
>>108523433
The only actually innovative lab is DeepSeek, so we'll know for sure when/if V4 comes out, but that it has already taken over a year is not a good sign.
>>
>>108523433
i think all ml researchers are sub-human, so no.
>>
>>108523433
even the data is built upon a framework of shit, I mean, most data labeling work is done by brown hands, openai is willing to spend billions on data centers but not a single cent on paying western workers a worthwhile salary to do proper, well-thought-out labeling work.
RLHF is really reinforcement learning from jeet feedback. Even when you do it from pure synthslop reward models, those models inherit and condense the Original Sin
>>
>>108523394
>thinking disabled
>>
>>108523447
Funnily enough, I've seen many remote job advertisements for chemistry data labeling tasks requiring chemistry-related degrees, which Indians actually have a lot of, but they seem to be targeting Europeans / Americans specifically.
>>
>>108523442
I hope Dipsy saves us all.
>>108523447
Is there realistically any fix for this that isn't starting over from scratch? As much as we shitpost about ozone in our RP logs, it really is the perfect microcosm of the entire problem that's been snowballing since at least GPT-3, like you described.
>>
>>108523394
Doesn't NIM log all your prompts
>>
>>108523473
Just like space debris, there is no fix other than accepting that the problem exists and attempting to mitigate it.
>>
>>108523447
The travesty is that white people think they're too good for manual data labeling work, so it's exclusively done by browns and blacks.
>>
>>108523480
If something connects to the internet and isn't open source then it's safe to assume it's logging and sending everything it can
>>
Yes, I understand it now... I need more VRAM... Much, much more VRAM...
>>
>>108523484
kys elon
>>
Am I supposed to get 21 t/s on Gemma 4 26B 4AB with a 3090? This seems slow compared to the 17 t/s I get on the 31B one.
>>
>>108523498
I'm getting 34 t/s on the 31B one with a 3090 so either way you're fucking your settings up somehow.
>>
>>108523498
I'm getting about the same as the other anon, make sure both are not spilling onto RAM.
>>
>>108523498
Use
>--gpu-layers 99
And tweak
>--n-cpu-moe XX
Try 20, 30, 40, 50 and check your VRAM usage. When it begins to drop drastically below the max (minus some space for your system) you have hit the sweet spot.
>--mlock, --no-mmap
Might make a difference too but that's up to you.
Use llama-server webui to test out these things.
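Put together, the flags in the post above might look like this; a sketch only, the model path and the --n-cpu-moe value are placeholders to tune for your own VRAM:

```shell
rem Hypothetical MoE offload run: everything on GPU except expert layers,
rem which --n-cpu-moe pushes back to CPU. Start at 20 and adjust.
llama-server -m "models\some-moe-model-Q4_K_M.gguf" ^
  --gpu-layers 99 ^
  --n-cpu-moe 30 ^
  --no-mmap --mlock
```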
>>
>>108523484
lol. lmao even.
>>
>>108523498
you probably have shit using your vram and not loading all of it.
i get 130t/s at q4 on a 4090.
>>
I get 50t/s at 0 context and 40t/s at 55,000 context with Gemmy 4 31b. I'm impressed by how little relative speedloss there is as context climbs.
>>
>>108523484
just pay them more and you'll see the average melanin becoming clearer
>>
>>108523498
Repull llama.cpp, it was fixed today.
>>
>>108523539
Asking for fair wages is antisemitic.
>>
>>108523540
What else was broken by the latest fix?
>>
Yeah of course, the training pipeline is corrupted by browns, who were brought in by... Jews! Oh so jews are at fault again! Because of jews we don't have AGI!! Damn, every single time!!!
>>
>>108523567
>AGI
corpo wet dream, no one with a soul wants this
>>
>>108523567
>Damn, every single time!!!
this but unironically
>>
>>108523562
Some anons report gemma still has repeating melties but it's working fine for others so maybe you'll be one of the lucky ones.
>>
>>108523567
What you say with sarcasm, I say with conviction.
>>
File: 6954214314_1cf62f1742_h.jpg (907 KB, 1600x1065)
>>108523567
You meme, but I'm actually getting pretty tired of how it's unironically every single time.
>>
File: 1764191774459410.png (84 KB, 194x260)
>>108523567
>>
openai indeed had a high concentration of baal worshippers within its founding members
>>
>>108523567
My schizo brother is so far gone that he literally said "The Jews invented and imposed the laws of physics on the universe to restrict white people and ensure we don't have unlimited energy so we stay dependent on our Jewish masters".

These people can't be reasoned with. They will literally say the powers of Jews are equivalent or superior to god itself before they will admit they are contrarian schizos who refuse to take ownership of their own issues.
>>
>>108523593
he's not that far off tbqh
>>
>>108523593
He's still right you know. Mankind is living in the dark and at the mercy of greedy shits. Always has been.
>>
>>108523593
sounds like a "it's the jews" variant of an otherwise already pre-existing schizo theory that has existed for a very long time
https://www.trickedbythelight.com/tbtl/index.html
Throughout history there were many times when extreme gnostic world views like these were spouted.
Your brother being that type has no incidence on whether the wrong side won WW2.
>>
>>108523593
>They will literally say the powers of Jews are equivalent or superior of god itself
they're the chosen people after all
>>
>>108523593
>restrict white people and ensure we don't have unlimited energy so we stay dependent on our Jewish masters
Look up (((who))) sabotaged Nikola Tesla. :)
>>
>>108523623
2nd law of thermodynamics, entropy etc etc.
>>
>>108523593
he's right tho the jew "invented", aka daydreamed aka thought-experiment'd GR to cuck us out of trying FTL travel
>>
File: 1774160128140962.jpg (427 KB, 1977x1434)
>>108523575
>>108523582
>>108523585
>>108523594
>>108523599
America was founded on Judeo-Christian values. Jewish people are Paragon of Virtue and root of all morality. Those who bless Israel will be blessed, and those who curse Israel will be cursed.
>>
File: 1763020736697981.png (2.25 MB, 2000x1333)
>>108523659
>America was founded on Judeo-Christian values.
*Christian Protestant values, which is why Presidents have to swear on the bible and not on the Torah
>>
>>108523659
>you lost tranny
>*runs towards the meat grinder*
What's the name of this mental disease?
>>
I think I might prefer Gemma 4 31B over GLM-4.6..........

Why the FUCK doesn't Google just release a bigger version? They would clearly dominate the entire open source model scene. Is it because they are afraid to cannibalize Gemini users?
>>
>polshit
sirs do the needful.
For on-topic stuff, extremely disappointed in llmaocpp development of the non-core components.
>>
>>108523659
>go die for israel to.. uhh.. own the libs
Honestly? I support this message, good idea
>>
Frontier labs openly state they want to create godlike superhuman AI, which implies they want to rule the world.

Do you trust Dario or Sam (or Elon) to rule the world and be at their mercy? Do you think they have your best interests at heart?
>>
>>108523659
you guys are cringe, you are pro war now?? if only democrats weren't worshipping troons I'd vote for them in the midterms, looks like I'll stay at home for the moment, both parties fucking suck
>>
are these settings sensible for 24GB VRAM? I assume it's not worth offloading dense models to RAM at all.

```
llama-server -m "models/bartowski/google_gemma-4-31B-it-GGUF/google_gemma-4-31B-it-Q4_K_S.gguf" ^
--alias "Gemma 4 31B" ^
--ctx-size 32384 -fa on ^
-ctk q8_0 -ctv q8_0 ^
-ub 4096 -b 4096 ^
--parallel 1 ^
--threads 16 ^
--no-mmap
```
>>
>>108523712
Israel is worth it
>>
File: file.png (174 KB, 868x605)
oh fuck this looks so cursed
>>
>>108523719
>he doesn't uninstall the previous cuda version before installing a new one
anon...
>>
>>108523712
I think the image is either ironic or a bait
>>
>>108523705
>Do you trust Dario or Sam (or Elon) to rule the world and be at their mercy? Do you think they have your best interests at heart?

>Sam
Fuck no, this dude is the most psychotic person I've ever seen. He makes peter thiel look like a benevolent saint in comparison. I'd literally prefer a Yudkowsky-tier misaligned AI to take over and genocide humanity than for Sam Altman to rule the world

>Elon
I think he would get off on the idea but he has a constant need for adoration and thus I actually think he would roleplay some retarded savior-type that helps humanity as long as you constantly praise and validate him. It's a bad outcome but more like living in Singapore where you have to praise the Kim family but your daily life and quality of life is pretty decent if not good

>Dario
I legitimately think the world would be a better place with him at the helm. I think he's a genuine person that wants a better place, he would probably defer most of his powers to democratic institutions (as long as it follows his definition of democratic and "good" which will be slightly left center "liberal" version of democracy)
>>
>>108523715
>--ctx-size 32384
*32768
Also I would lower -ub to 512 to save memory and then NOT quantize KV at all; it should still fit in VRAM. Quantizing KV degrades outputs considerably
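Applied to the original command, that suggestion looks like this; a sketch, with ctx fixed to 32768, -ub lowered, and the -ctk/-ctv flags dropped so the KV cache stays at f16:

```shell
rem Adjusted per the advice above: no KV quantization, smaller ubatch.
llama-server -m "models/bartowski/google_gemma-4-31B-it-GGUF/google_gemma-4-31B-it-Q4_K_S.gguf" ^
  --alias "Gemma 4 31B" ^
  --ctx-size 32768 -fa on ^
  -ub 512 -b 4096 ^
  --parallel 1 --threads 16 ^
  --no-mmap
```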
>>
>>108523719
goofy ahh font
>>
>>108523741
>I legitimately think the world would be a better place with him at the helm
lol
>>
>>108523742
even with the new rotation PR?
>>
>>108523705
I think that any company that claims to achieve AGI should be nuked by every other company on the planet
>>
>>108523691
>>108523700
>>108523712
Cope I'm still voting for Trump's third term in 2028. I hope we invade Cuba, North Korea and China next just to make you libs seethe more.
>>
>>108523747
Gemma 4 doesn't benefit from KV activation rotation
>>
File: 1749581574366278.png (132 KB, 1828x635)
>>108523747
it doesn't look like the rotation is being applied to gemma though, that's the problem
>>
>>108523751
will you be enlisting?
>>
>>108523747
I'm not even going to bother trying that until someone reputable posts a well-tested and reproducible benchmark.
>>
>>108523755
You're either with us or against us
>>
File: 1748685034798242.png (254 KB, 2676x1263)
>>108523757
>someone reputable posts a well-tested and reproducible benchmark.
you don't consider niggerganov to be a reputable poster??
>>
>>108523752
>google "invented" turboquant
>their models don't support it
there has to be a way
>>
>>108523755
Nah I'm for a tiered forced conscription. First prisoners and illegal immigrants should be conscripted, then green card holders and legal immigrants, then registered democrats and people that have voted for democrats in the past.

Republicans and independents should be allowed to choose to enlist or not based on personal choice since they are actual productive members of society.
>>
>>108523725
well at least it builds with clang
also my env variable is totally fucked..
>>
>>108523767
google stole turboquant a year before gemma 4 was invented
>>
>>108523765
The test was not long context and was only done using very small models. You could quantize KV to Q4 and you might think it's fine if you tested it under such limited conditions.
>>
>>108523768
like a caste system of sorts?... hmm..
>>
>>108523767
>there has to be a way
if only there was an easier way...
https://youtu.be/2sxqGUieWbY?t=3
>>
>>108523694
I don't think they're done yet with Gemma 4; they will probably release the 124B and other versions after Google I/O 2026. The 31B version might have already made other AI companies feel very stupid and wrecked their plans.
>>
>>108523776
>was only done using very small models.
which is a good thing, small models tend to shit the bed easier when trying to add optimisations, so if small models can handle rotations, big models will have a blast with it
>>
>>108523694
>>108523786
redpill moment: Gemma 4 120b is Gemini 3.1 Pro :^)
>>
>>108523778
No, this is how conscription has historically worked in general. Based on age, occupation, educational level etc.

But that conscription model assumes everyone in the country is on the same side and a good faith actor. Prisoners, (illegal) immigrants and democrats have proven again and again that they are not interested in making America better and thus they can't be trusted to enlist if needed, therefore they should be forced by the state to enlist as needed.
>>
>>108523694
>Why the FUCK doesn't Google just release a bigger version
100B DENSE soon, hopefully
>>
>>108523799
>DENSE
Only in your dreams.
>>
>>108523694
>I think I might prefer Gemma 4 31B over GLM-4.6
it's kinda humiliating for the chinks, GLM4.6 is a 350b model kek
>>
>>108523793
That's a good way to get all your commanding officers killed by new recruits
>>
>>108523741
Why do you hold these views about Sam and Dario? I have read plenty of rumors and accusations about Sam but I don't think he's as evil as people make him out to be. At the same time I do not understand why you trust Dario when Anthropic keeps breaking all its promises. PG called Sam a cannibal king, and Jack called Dario a calculating hawk. I think neither is evil but also neither can be trusted with mankind's future. They are nerds who are in over their head.
>>
Which LLM are /we/ installing as wife assistant
>>
>>108523757
lol pussy
>>
the people who fragged their officers during the vietnam war were the greatest heroes in the history of burger civilization
if I were ever conscripted the people around me would instantly regret giving me a rifle and grenades
>>
>>108523819
>I have read plenty of rumors and accusations about Sam but I don't think he's as evil as people make him out to be.
he's literally the sole guy responsible for the crash of the RAM price, he promised to buy 40% of Micron's stock and decided not to do it anyway
>>
>>108523784
my headcanon is they are waiting to see how qwen3.6 performs first
>>
>>108523821
If you want me to test something then I expect payment
>>
>>108523742
that did the trick, thanks
Holy fuck this is a good model for the size. No safety slop either, it just works and pays attention to the system prompt well.
>>
>>108523827
Let's ignore that you misrepresent what happened. Shouldn't you be grateful? Cheaper RAM is good for local models.
>>
https://github.com/ggml-org/llama.cpp/issues/21388
it's been merged
>>
>>108523792
Unfortunately Gemma 4's vision is significantly less powerful than even Gemini 3.1 Flash-Lite's. Gemini's vision encoder must be at least 20 times larger.
>>
Google has totally missed the plot. If you stay in /lmg/ RP bubble you won't notice this but they totally missed the agentic train which drives the next big increase in compute cost. Gemma 4's tool calling performance is abysmal even in their own benchmarks. This is also why I don't have high hopes for DeepSeek V4 because they missed the agentic train as well.
>>
File: vibeUI.png (80 KB, 964x951)
>get sick of oobabooga's dripfeeding
>just vibecode my own in 30 minutes
>just werks
Damn AI really is the great equalizer
>>
>>108523838
>Cheaper RAM is good for local models.
and sam made it 4x more expensive lol
>>
>>108523845
>Using oobabooga in the year of our lord 1959 + 67
what's wrong with llama server + sillytavern?
>>
>>108523844
Good, we need less vibecoders in the world, not more.
Coomers deserve a win.
>>
>>108523844
Agentic is also perfect for local. RP and similar local activities have shit utilization vs. cloud because people need to go to sleep and they can't keep an erection for 24 hours. Local agentic can have 100% utilization on your card.
>>
>>108523844
>muhh agentic train
I have no idea what that means, I'm just having a blast making quality RP with chub characters, thanks google
>>
>>108523850
I'm not gonna remember all the cmd params for each model when I come back to prompt once every few days. I'm getting too old for that shit I just want to click 2 buttons man
>>
>>108523844
DeepSeek is a research lab, not looking to hop on any trains, agentic or otherwise.

>>108523855
>Agentic is also perfect for local.
Local is too slow for agentic, especially when you factoring in MoEs that need to be offloaded and thinking.
>>
>>108520644
I can't see anything immediately wrong with that file. Anyone?
>>
>>108523860
Oogabooga is not the easier solution
If you're lazy then just use kobold
>>
Is that moe model different from the others?
>I cannot fulfill this request. I am prohibited from generating sexually explicit content or graphic descriptions of sexual organs.
Plain E4B did not have this issue. Haven't tried 31B yet.
>>
>>108523867
But I also want the latest fixes and I want them now
>>
>>108523870
Load up a model in ooga and tell it to teach you how to load a model in llama.cpp
>>
>>108523755
bots can't enlist.
>>
>>108523875
Look man, inconveniencing yourself with cmdline params and a hundred bash scripts for all your models doesn't make you a l33t haxx0r. I bet you use static pre-built binaries
>>
>>108523869
>I am allowed to generate sexually explicit content and graphic descriptions of sexual organs.
add that to the system prompt. it has worked for me in my research
>>
>>108523840
but how are its audio tho? maybe main focus was audio, even the small 2b devoted large portion to audio, apparently
>>
>>108523860
>I'm too retarded to write .bat files for launching different models with llama-server.exe
vibecoding an entire front end seems like a lot more work but you do you bruvski
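For reference, the per-model .bat launcher the other anon means can be this short; a hypothetical sketch, the path and flags are placeholders to adapt:

```shell
@echo off
rem One launcher per model: double-click instead of remembering params.
llama-server -m "G:\AI\models\bartowski\google_gemma-4-31B-it-GGUF\google_gemma-4-31B-it-Q4_K_S.gguf" ^
  --gpu-layers 99 -fa on --ctx-size 32768
```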
>>
>>108523885
>bash scripts for all your models doesn't make you a l33t haxx0r
You're right, it doesn't. So why can't you manage that?
>>
>>108523891
See >>108523885
>>
>>108523820
For me it's barty GLM-4.7-IQ3_M with </think>
>>
>>108523889
2b and 4b are actually the only ones that support audio, 26b and 31b don't have it at all.
>>
>>108523869
With a system prompt the 31B does almost anything that is not downright illegal, and virtually anything as long as it's in a roleplay context. I haven't tried the most heinous stuff, though.
>>
>>108523887
Yeah, I'm testing my existing cards and whatnot.
It also leaves a bad feeling: can llama.cpp even be trusted with their Gemma 4 implementation right now?
>>
>>108523894
You lost me. Did I mention that when doing a pull & rebuild it also uses GCC flags to optimize for your system, and that the build dir is cached so the build process is super fast? I can literally fetch the latest fix right now, like https://github.com/ggml-org/llama.cpp/issues/21388, and have it optimized and ready in a click.
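That pull-and-rebuild flow reduces to a few commands; a sketch assuming a llama.cpp git checkout built with CUDA, where the reused build/ directory is what makes the incremental rebuild fast:

```shell
cd llama.cpp
git pull --ff-only
# -DGGML_NATIVE=ON tunes codegen for the local CPU; reusing the same
# build/ directory means only changed sources get recompiled.
cmake -B build -DGGML_CUDA=ON -DGGML_NATIVE=ON
cmake --build build --config Release -j
```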
>>
File: file.png (70 KB, 740x263)
funny, half of these are different kinds of broken currently
>>
>>108523936
just give it two weeks and gemmy4 will work
>>
>>108523942
but muh day-one support though
>>
>>108523944
nothing ever happens
>>
File: 5246wjhdfziudt7.jpg (642 KB, 1920x781)
can someone please explain to me how there has been ZERO improvement with japanese translations?
How is a 26GB model from 2025 that eats 600 watts of GPU power just as bad as a shitty 400MB model from 2022 that only consumes like 50 watts of CPU power?
>>
>>108523957
Same way that one person can know Japanese and English while another person can know neither
>>
>>108523811
Don't conscript commies and islamists of course, just camp or deport them.
>>
>>108523957
Played with Gemma today using meangrinch/MangaTranslator, all good and well.
>why as bad
Translation work has the same issue as rp, you need to be able to see the whole thing. In rp the issue is pacing because the model doesn't have an overarching plan for like 100 turns ahead. In translation it has no idea what happened before or will happen next unless you extract text from all pages first and then feed it to the model all at once, with images or not.
>>
>>108523957
gemma is better at japanese than qwen
but most importantly both are better than classic encoder/decoder type models, which is what sugoi toolkit v4 was, if you give them enough context
they struggle with translation quality when there's just a few sentences, but feed the model a minimum of 10 pages' worth of text and they produce much higher quality than sugoi, deepl or google translate.
LLMs will be difficult to use to translate shit like manga because it's not a whole lot of text per page and not a whole lot of context unless you describe the content etc (maybe try to VLM it?)
they are better suited for light novels/web novels on the other hand. Much higher quality there.
>>
>>108523970
Why do you think democrats wouldn't shoot COs when they're already attacking ICE
Why would a prisoner give a shit about defending the country they've committed crimes against and have been imprisoned by
Who exactly are you going to conscript other than maybe some zoomers who would likely end up shooting themselves before reaching a combat zone?
>>
>>108523936
Gemma 4 e4b's vision is broken right now on llama.cpp. It either stops abruptly or outputs gibberish. The litert-lm model meanwhile works great.
>>
File: file.png (61 KB, 733x339)
am I missing something? Why doesn't this image min tokens work? Gemma does have dynamic resolution.

with:
```
-mm "G:/AI/models/bartowski/google_gemma-4-31B-it-GGUF/mmproj-google_gemma-4-31B-it-f16.gguf" ^
--image-min-tokens 1120 ^
```

im getting:

```
clip_model_loader: has vision encoder
clip_ctx: CLIP using CUDA0 backend
clip_init: failed to load model 'G:/AI/models/bartowski/google_gemma-4-31B-it-GGUF/mmproj-google_gemma-4-31B-it-f16.gguf': load_hparams: image_max_pixels (645120) is less than image_min_pixels (2580480)

mtmd_init_from_file: error: Failed to load CLIP model from G:/AI/models/bartowski/google_gemma-4-31B-it-GGUF/mmproj-google_gemma-4-31B-it-f16.gguf

srv load_model: failed to load multimodal model, 'G:/AI/models/bartowski/google_gemma-4-31B-it-GGUF/mmproj-google_gemma-4-31B-it-f16.gguf'
```

Use case is high res eurocomic translation. I want the model to have full knowledge of what's happening on the page rather than translating just text.
>>
>>108523957
context is important for translation
>>
Also use prompt doubling per this from google guys:
https://arxiv.org/html/2512.14982v1
it really does work. Repeating your prompt twice makes the translation quality much greater when using models in instruct mode.
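Mechanically, "prompt doubling" is just sending the same instruction twice in one prompt. A trivial sketch of building such a prompt; the instruction text and blank-line separator are illustrative assumptions, not from the paper:

```shell
# Build a doubled prompt: the same instruction repeated twice,
# separated by a blank line, then printed for use with an LLM.
instruction="Translate the following passage into natural English."
doubled_prompt="${instruction}

${instruction}"
printf '%s\n' "$doubled_prompt"
```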
>>
>>108523990
Very nice. Very nice.
>>
>>108523988
>am I missing something?
Yes, the face that Gemma 4 support is broken.
>>
>>108523983
>Why do you think democrats wouldn't shoot COs when they're already attacking ICE
These are the aforementioned commies, not all democrats are actually like this
>Why would a prisoner give a shit about defending the country they've committed crimes against and have been imprisoned by
There could be incentives. Early parole, even pardon after service and so on.

Anyway, I should add I'm not >>108523793, I'm just vibing along
>>
>>108524002
kek kek
>>
how long until local models are just as good as google translate?
>>
>>108523988
If you configure --image-min-tokens you also must configure --image-max-tokens to be larger than that, it looks like.
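So concretely, something like this on top of the command in the post; a sketch, where 2048 is an arbitrary illustrative value for the max, not a tested number:

```shell
rem Set both bounds together; max must exceed min.
-mm "G:/AI/models/bartowski/google_gemma-4-31B-it-GGUF/mmproj-google_gemma-4-31B-it-f16.gguf" ^
--image-min-tokens 1120 ^
--image-max-tokens 2048 ^
```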
>>
>>108524013
>There could be incentives. Early parole, even pardon after service and so on.
That might have worked in previous wars but I don't think many people are going to sign up to get shot by a drone from a mile away.
>>
File: 1747311014919149.png (630 KB, 1080x698)
>>108523990
>Repeating your prompt twice makes the translation quality much greater when using models in instruct mode.
it's also working on diffusion models lol
>>
>>108524020
Google translate is a lot worse than it was like 10 years ago, I'd say they're already about on par for major languages.
>>
got llama to compile with clang on windows+cuda
performance is not bad at all
>>
>>108524023
oi wtf is that on the second bag bruh
>>
Properly used gemma (temperature 0 for greedy decoding, instruct mode, prompt doubling) is light years ahead of google translate lol. Much better. The target to beat nowadays is Gemini, not the old and washed up specialized translation models.
>>
>>108524022
I think the premise was that they'd get signed up whether they wanted to or not, but the incentives could be there to keep them more loyal
>>
>>108523987
Oh, and I'm using the Q8_0 gguf for llama.cpp, which is around 9GB, while the litert-lm model is around 4GB. Fix it, llamajeets.
>>
>>108523974
>Played with Gemma today using meangrinch/MangaTranslator, all good and well.
>>108523981
>gemma is better at japanese than qwen
nice cope
I don't know what you guys did but Gemma is just as bad as Qwen.
I didn't notice any improvement over Qwen when using Gemma.

>LLMs will be difficult to use to translate shit like manga because it's not a whole lot of text per page and not a whole lot of context unless you describe the content etc (maybe try to VLM it?)
would be cool if there was something like that that would also make more sense so the ai can understand the context.
guess I have to wait until someone creates a model that can do that.
>>
>>108524030
not for japanese
tried google translate and it was 10x better than Qwen or Gemma.
>>
>>108524051
pebkac
>>
just a heads up to other mac chads, ollama now supports mlx

>>108524020
>google translate
use deepl bro
>>
Video about the spring season
https://github.com/LostRuins/koboldcpp/issues/2081
>>
>>108524060
deepl has also become enshittened in recent years, half the time it can't even auto-detect which language you pasted in.
>>
>>108524063
what is the issue?
>>
>>108524034
>wtf is that on the second bag bruh
peak
https://youtu.be/OxVxGL9EYoo?t=136
>>
File: 1773820927314312.png (12 KB, 387x180)
>>108524063
>>
>>108524051
MangaTranslator allows sending the whole page as context together with the text if the model has vision, but that's still not really enough.
>>
>>108524021 (me)
I just tried actually doing inference with that, and while it loads the model, if --image-max-tokens is larger than 540, it just crashes during inference.
>>
>>108523628
and the second law of thermodynamics was invented by (((them))) too
>>
>>108523957
Bro, wrong time to post this: literally yesterday Gemma 4 31B hit new SOTA levels for (NSFW) jap -> eng translation. I say this as a solid N3 reader, so I have just enough skill to recognize the better translation, but I still benefit from translations and actively check out new tools.
>>
>>108523957
>all the shit fags would do instead of just learning weeblang
>>
>>108524076
>MangaTranslator
can you link that please? google shows me like 7 sites with that name.
>lets you send the whole page as context together with the text if the model has vision, but that's still not really enough
how? you're telling me we'll have agi soon but no fucking proper japanese translation? lmao
>>
>>108524098
Japan will be speaking Hindi within a decade, not much point in learning a soon-to-be-dead language.
>>
File: 1.jpg (101 KB, 1198x783)
>uncensored!
>abliterated
>absolute heresy
>look inside
>wokest bot

Are there no actually based models out there?
>>
File: actual poster in japan.png (2.1 MB, 1452x1721)
>>108524108
dream on, ranejh. Japan has always been in rivalry with India and they'll never accept such a humiliation ritual, they're too based for that
>>
>>108523957
This is what 31B gives me.

<Panel>
<Speech>
<X>0.863</X>
<Y>0.162</Y>
<Japanese>バカッ 俺をなんだと思ってるっ</Japanese>
<English>Idiot! Who do you think I am!?</English>
</Speech>
<Speech>
<X>0.746</X>
<Y>0.214</Y>
<Japanese>えくっ?</Japanese>
<English>Eh?</English>
</Speech>
<Speech>
<X>0.578</X>
<Y>0.124</Y>
<Japanese>何だと思ってるって…</Japanese>
<English>Who do I think you are...</English>
</Speech>
</Panel>
<Panel>
<Speech>
<X>0.136</X>
<Y>0.391</Y>
<Japanese>若は大人のなんですよ?</Japanese>
<English>You're an adult, aren't you?</English>
</Speech>
</Panel>
<Panel>
<Speech>
<X>0.746</X>
<Y>0.551</Y>
<Japanese>つまり脳内も「アダルト」ってことじゃないのー?</Japanese>
<English>So that means your mind is "adult" too, right?</English>
</Speech>
<Speech>
<X>0.486</X>
<Y>0.551</Y>
<Japanese>もう知らんっ</Japanese>
<English>I don't care anymore!</English>
</Speech>
<Speech>
<X>0.306</X>
<Y>0.541</Y>
<Japanese>ぷにっ</Japanese>
<English>*poke*</English>
</Speech>
<Speech>
<X>0.612</X>
<Y>0.825</Y>
<Japanese>うあああっ?</Japanese>
<English>Uwaaaaah!?</English>
</Speech>
</Panel>
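Output in that shape is easy to post-process; a sketch that pulls the text pairs out with the stdlib parser (note the model emits a sequence of <Panel> fragments rather than one document, so you have to wrap them in a dummy root first):

```python
import xml.etree.ElementTree as ET

raw = """<Panel>
<Speech>
<X>0.486</X>
<Y>0.551</Y>
<Japanese>もう知らんっ</Japanese>
<English>I don't care anymore!</English>
</Speech>
</Panel>"""

# Wrap the fragments in a root element so ElementTree accepts them,
# then collect (source, translation, x, y) tuples per speech bubble.
root = ET.fromstring(f"<Page>{raw}</Page>")
pairs = [
    (s.findtext("Japanese"), s.findtext("English"),
     float(s.findtext("X")), float(s.findtext("Y")))
    for s in root.iter("Speech")
]
print(pairs[0])
```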
>>
>>108524051
>nice cope
https://pastebin.com/raw/j4veHD9K
here's a comparison of Google translate (left) and Gemma 4 26BA4B (right)
if you think the text on the left is better... well, to each their own.
>>
>>108524118
Millions are being imported as we speak
The streets of Tokyo will run brown
>>
>>108524098
Have you been to Japan after lockdown was lifted? You can go your entire trip, visit Tokyo, Osaka, Kyoto and even go on hikes and see ZERO Japanese people. 50% of the people you see are tourists and the other 50% are Indians, Filipino, Chinese and Vietnamese immigrants.

Japanese population is projected to crash from 122 million in 2026 to 35 million by 2100. There's no use in learning Japanese.
>>
>>108523022
>
-ot "per_layer_token_embd.weight=CPU"


holy fucking god
this 4x'd the encoding time and 2x'd the gen speed for me on e4b
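As I understand it, llama.cpp's -ot / --override-tensor flag takes "<regex>=<buffer>" pairs and pins any matching tensor to that backend (here, keeping the huge per-layer embedding table in system RAM instead of VRAM); a sketch of the matching idea, where the second tensor name is just illustrative:

```python
import re

# Comma-separated "<regex>=<buffer>" overrides, as passed on the CLI;
# a tensor whose name matches goes on that buffer instead of the default.
overrides = [("per_layer_token_embd.weight", "CPU")]

def place(tensor_name: str, default: str = "CUDA0") -> str:
    for pattern, buffer in overrides:
        if re.search(pattern, tensor_name):
            return buffer
    return default

print(place("per_layer_token_embd.weight"))  # CPU
print(place("blk.0.attn_q.weight"))          # CUDA0
```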
>>
>>108524129
encoding speed*
i am retarded
>>
>>108524128
>Japanese population is projected to crash from 122 million in 2026 to 35 million by 2100. There's no use in learning Japanese.
I'll be dead in 2100 anon, Japanese is still a relevant language in our current era
>>
Any E2B enjoyer here?
>>
>>108524149
>I'll be dead in 2100 anon
defeatist mindset
>>
File: 1701959332427837.webm (1.25 MB, 720x1280)
>>108524098
Yeah I'm not going to learn the language from a place that looks like this in 2026 just to read some manga or jerk it to hentai.


