/g/ - Technology

File: HEN SHIN.jpg (179 KB, 1024x1024)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>108447705 & >>108441758

►News
>(03/24) GigaChat 3.1 released: https://hf.co/collections/ai-sage/gigachat-31
>(03/17) Rakuten AI 3.0 released: https://global.rakuten.com/corp/news/press/2026/0317_01.html
>(03/16) Mistral Small 4 released: https://mistral.ai/news/mistral-small-4
>(03/11) Nemotron 3 Super released: https://hf.co/nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
Token Speed Visualizer: https://shir-man.com/tokens-per-second

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
File: what's in the box.jpg (235 KB, 1536x1536)
►Recent Highlights from the Previous Thread: >>108447705

--Vision models failing deformity edge cases:
>108452331 >108452376 >108452412 >108452429 >108452466 >108452484 >108452523 >108452546 >108452616 >108452385 >108452409 >108452509 >108452607 >108452626 >108452841 >108452845 >108452849 >108452867
--Xeon 6 LLM inference benchmarks debated over AMX optimization gaps:
>108448422 >108448451 >108448886 >108449507 >108451237 >108452095 >108450210 >108452136
--Nvidia Nemotron reasoning challenge puzzles:
>108448817 >108448837 >108448859 >108449204 >108449216 >108448873 >108448945
--Direct-io PR discussion and gemma3 loading failures:
>108451404 >108451435 >108451499 >108451511 >108451525 >108451530 >108451534 >108451515
--Skepticism toward TurboQuant's 2-bit quantization claims:
>108450002 >108450011 >108450054 >108451136 >108450065 >108451386 >108450294
--Qwen 3.5's niche use cases and performance tradeoffs debated:
>108450432 >108450443 >108450488 >108450499 >108450517 >108450519 >108450534 >108450554 >108450571 >108450589 >108450599 >108450615 >108450634 >108452303
--Optimal context window sizes for coding tasks:
>108451293 >108451325 >108451838 >108451330 >108451306 >108451406 >108451432
--Exploring LLM integration for dynamic NPC interactions in ASCII games:
>108447855 >108447871 >108447952 >108447980 >108448029 >108448043 >108448103 >108448045 >108448058
--TurboQuant claims 6x memory reduction and 8x speedup with zero accuracy loss:
>108451313 >108451431 >108451594 >108451872
--PocketTTS.cpp achieves GPU-like CPU inference speeds:
>108451512 >108451553 >108451556 >108451562
--GigaChat-3.1-Ultra Russian model released with DeepSeek architecture:
>108448539 >108448567
--Step3.5 MTP support PR for llama.cpp:
>108450936 >108451133 >108451275
--Miku, Luka, and Dipsy (free space):
>108450983 >108452704 >108448061 >108452647 >108452749

►Recent Highlight Posts from the Previous Thread: >>108447707

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>
uoh yeah
>>
>>108453575
cum in between her thighs
>>
4 boobs prove the dog test was ass
>>
You are going to support Israeli-American chip design by buying the Intel® Arc™ Pro B70, aren't you?
>>
>>108453655
If the cost per GB of VRAM is good, yes.
Otherwise, no.
>>
>>108453652
this is horrific, jesus
>>
>>108453655
*Judeo-Christian chip design
>>
>>108453655
Does Intel have their own thing like zluda yet?
>>
>>108453655
redpill me on the b70. i hear their software stack is the real issue. just how bad is it, and feasibly how long might it take for it to catch up?
>>
>>108453570
artist?
>>
>>108453733
noobai piloted by the local autist
>>
Outside of stepfun, which I'm too poor to run, shit feels pretty stagnant of late
Shoulda gone for the 128GB sticks after all
>>
File: 1762969619953363.png (21 KB, 684x75)
>>108453655
No
>>
>>108453755
Cheaper than a 5090.
>>
>>108453794
yeah..
>>
>>108453794
It's not competing with 5090s though, it's competing with V100s or AMD R9700s.
>>
this was the least noisy format I could come up with for formatting 4chan threads. I just re-serialized the IDs and skipped the names and dates. is this good enough?
>>
File: gemma4.jpg (537 KB, 1264x1737)
Will my gemma-4 27B be very light on my gpu?
>>
>>108453929
It's clean and should be fine. You could also format it as "No.1" instead of square brackets and put a newline before the comments to make it more recognizable as 4chan threads to the models.
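The "No.1" re-serialization suggested above is only a few lines against the raw thread JSON. A minimal sketch (the 'posts'/'no'/'com' field names come from the public read-only API; the HTML stripping here is deliberately crude and drops formatting like greentext spans):

```python
import html
import re

def serialize_thread(thread_json):
    """Flatten a thread.json payload into plain text, one 'No.N' header
    per post, with a blank line between posts."""
    lines = []
    for post in thread_json["posts"]:
        com = post.get("com", "")                  # comment body, as HTML
        com = re.sub(r"<br\s*/?>", "\n", com)      # <br> -> newline
        com = re.sub(r"<[^>]+>", "", com)          # drop remaining tags
        com = html.unescape(com)                   # &gt; -> > etc.
        lines.append(f"No.{post['no']}\n{com}\n")
    return "\n".join(lines)
```

Quote links (the `>>108...` anchors) survive as plain text after unescaping, which is probably what you want for training data.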
>>
>>108453570
>cunny holding onahole
Sounds redundant.
>>
>>108453951
Yes! All that's left is to get gemma 4 onto your GPU.
>>
>>108453951
thou shan't redeem le gemma
>>
I'm still laughing at the dog test from the previous thread.
Can't wait to be replaced by an LLM because I couldn't see the fifth leg on an image with four legs visible at best...
Will the labs start benchmaxxing on the Dog Shit Vision Test if we mention it enough times? Like with mesugaki?
>>
>>108453961
>You could also format it as "No.1" instead of square brackets and put a newline before the comments to make it more recognizable as 4chan threads to the models.
yup, I'll buy it.
>>
>>108454005
https://arxiv.org/html/2505.23941v1
imagine being this proud of being an ignorant pajeet coming to defend muh vision model lady
>>
>>108453951
>KV cache quantization
That's basically irrelevant. All the new models use so little memory for KV cache.
>>
>>108454053
>All the new models use so little memory for KV cache
even with the current efficiency gains there's no way we'll get to 1M locally without some aggressive form of quant
>>
>blacklist "guttural"
>model starts writing "gutteral"
>blacklist "gutteral"
>model starts writing "gutural"
Come on now these are not even words.
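The whack-a-mole is visible even in a toy version of the blacklist: a string-level ban can only match spellings you've already seen, so each new misspelling sails through. A minimal illustration (pure string matching, nothing model-specific):

```python
def violates(text, banned):
    """Return True if any banned spelling appears in the text (case-insensitive)."""
    t = text.lower()
    return any(word in t for word in banned)

# The two spellings blacklisted so far in the post above.
banned = {"guttural", "gutteral"}

assert violates("a guttural groan", banned)       # known spelling: caught
assert violates("a Gutteral moan", banned)        # known spelling: caught
assert not violates("a gutural growl", banned)    # next invention: not caught
```

Token-level bans (antislop samplers, logit bias) have the same structural problem, just one layer down: you ban token sequences, and the model finds a different tokenization of roughly the same sound.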
>>
>>108454079
The power of the embedding space.
>>
blacklist, antislop sampler, grammars, all of that was always a total cope. the LLM always wants to fit the square pattern into the round box and life finds a way.
>>
>>108454079
>bl*cklist
denylist*
>>
>>108454079
Why should only meatbags be allowed to make typos?
>>
File: file.png (54 KB, 860x581)
>>108454034
Haha yeah. (What vision model lady?)
Anyway, if you ever wonder about the state of Israel (they are lobbied to hell and have no need to return anything), pic related is Israel controlling the United States.
>>
extremely organic posting
>>
File: 1749508287708464.png (63 KB, 1080x500)
https://xcancel.com/arcprize/status/2036860080541589529#m
lawl
>>
>>108453951
So will I be able to run 72b instead of 12b in the near future
Yes I am stupid but answer the question please
>>
>>108454064
I just tried loading Qwen3.5 397B with yarn and it needs 31GB for 1M context. That's local.
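The 31GB figure is easy to sanity-check against the standard KV-cache formula: K and V each store n_layers × n_kv_heads × head_dim values per token. A sketch (the example shape below is an assumed generic GQA config, NOT Qwen3.5's actual architecture; models with MLA-style latent caches come in far smaller, which is how you land at ~31GB for 1M):

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, n_ctx, bytes_per_elem=2):
    """Standard GQA KV-cache size: the factor 2 is for K plus V,
    bytes_per_elem=2 means an fp16 cache."""
    return 2 * n_layers * n_kv_heads * head_dim * n_ctx * bytes_per_elem

# Illustrative shape: 64 layers, 4 KV heads, head_dim 128, fp16, 1M context.
gb = kv_cache_bytes(64, 4, 128, 1_000_000) / 1024**3   # ~122 GB
```

Running the same formula backwards from 31GB tells you the effective per-token cache footprint a model actually has, which is the number that decides whether 1M local context is realistic.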
>>
>>108454132
>moving goalposts
>>
>>108454133
It's just for the kvcache.
>>
>>108454132
>one thousand dollars
>0.035%
wew lads
>>
>>108454133
For that you want BitNet.
>>
>>108454132
This just in: random word generator bad at understanding 2d environments until it's added to the training data
>>
>>108454153
It's been two years already. Where are the 72b bitnets?
>>
>>108454132
> Thinking it is playing another game
> Holding on to early hypothesis
yup, that's vision models in a nutshell.
very big on assumptions, very very overfit to the image datasets they were trained on.
I'm glad a mainstream benchmark went this route. I bet if they ever make even a minor alteration to their benchmark it will keep throwing LLMs off and reveal that the emperor never had any clothes to begin with and all pretense of generalization was a lie.
>>
it's not coming this week either, is it?
>>
>>108454191
big week for Gemma 4
>>
an employee leaked that the new deepshit would be much bigger than the previous, then removed his post on chinese social media
methinks all this ebegging for ds is going to turn into sour ewhining quick
>>
>>108454210
I'm a big boy I can handle it.
Also source?
>>
File: yeah right.png (346 KB, 3404x746)
>>108454132
kek, get pwned Jensen
https://www.theverge.com/ai-artificial-intelligence/899086/jensen-huang-nvidia-agi
>>
>>108453699
Pytorch is actually mostly fine and stable, except for memory stats reporting on anything newer than the A-series cards. As long as you can get stuff working there, you can get transformers and ComfyUI working, and easily hack up anything else to get a good experience. That's the benefit of going mainline over IPEX, which was a hack in the first place: they started during Pytorch 2.5, and now at Pytorch 2.10 Intel's backend is pretty good there. However, for anything lower level your only real choices for multi-GPU inference are https://github.com/intel/llm-scaler (their fork of vLLM) or, with mainline stuff, Vulkan with llama.cpp and other forks, since the SYCL backend is half baked, OpenVINO is not mature, and ipex-llm has been abandoned since last year so it's outdated for the newer models. ik_llama doesn't support SYCL, as that was what caused the whole debacle in the first place, and Vulkan is an afterthought there; I have no clue about the other forks, but I think at least kobold.cpp works too. There's some stuff around OpenVINO but none of it is really mature yet. That is the real issue with Intel: the lower-level software, where you really want to squeeze out the juice, isn't there. But for ComfyUI and other stuff at the Pytorch layer of things, it might be fine.
>>
>>108454210
2T or 3T?
I can almost run 2T at 1.X bit
>>
File: never ending loop.png (898 KB, 1080x1084)
>>108454132
they will train their model on those test and say they reached AGI, then AGI 4 comes and destroys everything, and then they will train their model on...
>>
What's the best model to use as a Claude Code substitute that fits in 128GB?
>>
>>108454319
Minimax 2.5
>>
>>108453227
What boards did you choose to scrape?
>>
>>108454268
Such is the case with chasing benchmarks. Hopefully these companies don't just keep doing this for the rest of our lives, haha... lol...
>>
>>108451136
>>108450054
>>108451431
ok seriously guys, you have to explain why BitNet is bad; I am using its techniques as a core component of my models
>>
>>108454331
/g/ /pol/ /sci/ /lit/ /his/ /tg/ /out/
do you have any recommendations? /pol/ seems to move the fastest; it looks like I'm going to have to do some sampling so it doesn't dominate.
>>
>>108454347
They only need to keep up the charade for a few more quarters until they can cash out and let it collapse
>>
>>108454366
Depends on what you're trying to achieve
>>
>>108454364
some dude on discord said it'll never work because it makes training insanely expensive, source his asshole of course, and some here took that screenshot as gospel truth despite it contradicting the original paper itself
>>
>>108454381
on the other hand the paper is obviously gospel...
>>
>>108454381
as opposed to random /lmg/ dude and microsoft jeet saying it works when literally not a single soul in the industry is making a model with it, in an era where hardware costs are going up the wazoo, compute is being limited even in strong AI labs (qwen guys complained about lack of access to compute) and everyone would very much like a model compression technique that actually worked
you are all deluded ai psychotics, bitnet is the fruit of years of coping and zero production
>>
>>108454364
>I am using its techniques as a core component of my models
As in "bitnet" (ternary 2.x bit quantization) or a model trained that way like the actual bitnet?
>>
>>108454404
>qwen guys
even said they'd try bitnet at some point before qwen3
>>
File: 1518376309230.png (3 KB, 279x237)
>>108453570
How do you even describe this bodytype to an AI without explicitly requesting it to generate loli porn?
>>
>>108454417
Clearly they tried and found it's shit
>>
>>108454434
really really wish they'd have openly said so, then we could have buried this meme for good
>>
>>108454393
The paper reported the training costs for actually training bitnet and full precision models as nearly equivalent. You are free to reproduce their experiments with a small model and make a name for yourself by calling Microsoft out for publishing fraudulent papers.
>>
>>108454434
yeah in this industry it's rare for people to openly call other people's work outright shit, if they gave it a try and it's shit the silent treatment is the most likely outcome.
>>
>>108454432
"Compact"
>>
>>108454432
This is /lmg/. Why is explicitly requesting lolis an issue?
>>
>>108454442
Nah people would be like "real BitNet has never been tried"
>>
the bitnet meme will survive as long as jeets have access to the internet and dream of running a llm on their 50 bucks phone
>>
>>108454432
short and petite? height? small or nearly nonexistent tits? permanently stuck in pre-bloom? come on man
>>
>>108454366
>do you have any recommendations
Well, would you be able to scrape the archive sites like Desu or NotArch? Then you wouldn't need an active userbase and could scrape years' worth of activity.
Also I would get rid of /pol/, pretty sure almost all the posts over there are already bots.
>>
avocado blt rise up
>>
>>108454480
I'm waiting for the news any day now that Meta is cancelling Avocado and firing everyone involved
>>
>>108454452
I was more puzzled by how one is able to request the absolute embodiment of sex that this bodytype is from a generative model without invoking any sexual connotations when formulating a prompt.
>>
Cloud models are down but I don't know if we can benefit from this somehow.
Why aren't we benefitting?
>>
>>108454372
idk really, it was a bit of a lark. I was reading a thread and thought it would make good training data, so I figured I'd see how hard it was to scrape 4chan, and it turns out they have a really simple and free API, so it was way easier than expected. I got that board list by asking Claude which boards are more text-driven; given the source is an imageboard and the target is a fucking text model, that seemed like the main consideration. But idk, I still need to do some test shots on those images and see if I can get a model to annotate them accurately and quickly enough. Even if I don't annotate every image on every thread, a few here and there might still be useful training data. I'm not trying to achieve anything specific really; I don't expect the data to improve any benchmarks. It's just a little bit of fun I guess, see what happens.
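For anyone else curious, the "really simple and free" API is just static JSON over HTTP, no keys. A sketch of the read path (the a.4cdn.org endpoint shapes are from the public read-only API; board "g" is just an example, and their rules ask you to keep it to roughly one request per second):

```python
import json
import urllib.request

CATALOG = "https://a.4cdn.org/{board}/catalog.json"
THREAD = "https://a.4cdn.org/{board}/thread/{no}.json"

def fetch_json(url):
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

def thread_numbers(catalog):
    """catalog.json is a list of pages, each holding a 'threads' list."""
    return [t["no"] for page in catalog for t in page["threads"]]

# Usage sketch:
#   for no in thread_numbers(fetch_json(CATALOG.format(board="g"))):
#       posts = fetch_json(THREAD.format(board="g", no=no))["posts"]
```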
>>
>>108454477
If he goes for archive sites, he can keep pre-2016 /pol/
>>
>>108453575
don't look at me like that, it makes me hard
>>
>>108454504
>ollama
>>
>>108453575
delicious tummy and plump thighs is back
>>
>>108454502
Are you proposing a creative writing exercise or are you unaware of diffusion models trained on booru tags?
>>
>>108454366
/v/. Despite being a videogame board, practically everything gets discussed there at one point or another.
>>
>>108454079
I vaguely remember a model writing sory when I banned sorry. When it really wants to say something, it will try to find a way.
>>
the current spam of irrelevant garbage is why we should ask mossad to kill brittle
>>
>>108454504
>no pantyshot
dude wtf
>>
>>108454366
>pol seems to move the fastest
just add a +100 logit bias to the word jew, same result
>>
>>108454512
I just checked the two sites mentioned, desu and notarch, neither has it. does anyone archive /pol/?
>>108454532
I added it to the list
>>
deepseek ocr just got merged in llmao
ocr2 soon
>>
>>108454563
4plebs and archived.moe
>>
what's the pro of ocr vs image to text
>>
Lmao.
>>
>>108454408
I'm training a tiny model exactly like this yes
>>108454458
>>108454442
>>108454447
>>108454434
>>108454417
>>108454404
>>108454393
>>108454381
thank you everyone for the conversation, I'll be stopping the training promptly then shooting myself in the face. After that I will just train a tiny qwen3 and stop thinking that I'm smart.
>>
>>108453929
>random religionfag manifesting into the imaginary thread
Yeah, that's about right, alongside the /pol/ termites that need to chew their way into any space possible because even they find their board intolerable
>>
>>108454528
Just playing dumb and daydreaming of more tangible labels than just artist names I guess.
>>
>>108454613
>I'm training a tiny model exactly like this yes
That's sick. Keep going.
>>
>>108454582
Image to text is OCR (Optical Character Recognition). I'm not entirely sure what you're asking.
If you want to figure out the layout of a book or a site, a graph, whatever, you need more than just the text. Nowadays "OCR" is used more generally to mean "the model kind of understands images", which includes transcribing text in images.
>>
>>108454638
so the "ocr" in ds is the same capability as the image to text in qwen 3.5 for example?
if I show it something outside of text, will it be able to describe it?
>>
>>108454635
it's not, actually, because it has been performing worse than a fucking bitmamba ternary model I've been trying, and that shit was made by a Brazilian. It's been driving me nuts for a week; I thought I was doing something wrong. That's why when that anon mentioned BitNet being ass I wanted to know why
>>
>>108454653
Can you do a write up of exactly what you tried and how badly it performed? Either someone might be able to help you fix it, or it might shut people up asking about bitnet every other thread.
>>
>>108454645
>if I show it something outside of text, will it be able to describe it?
I'm not sure. But if it does, it's probably fairly limited. Seems like an experimental model. llama.cpp just committed compatibility for DS's first OCR model. Give it a go. It's a small model.
https://github.com/ggml-org/llama.cpp/pull/17400
The one for DSOCR2 is being worked on as well.
https://github.com/ggml-org/llama.cpp/pull/20975
>>
>>108454709
thanks anon, that's what I meant then, "ocr" is specifically to decode text, not generally do image to text
>>
PSA: Save your cum. This week will be huge.
>>
>>108454757
>This week will be huge.
Longer than 7 days? Damn. How things change.
>>
>>108454757
Are you saying... it is going to cost as much as RAM?
I better start stockpiling. Thank you insider-anon!
>>
>>108454576
that'll work, the API is less generous but they have data dumps, I can just download the full thing. text is only 89gb

https://archive.org/details/4plebs-org-data-dump-2026-01
>>
>>108454829
I lied, it's actually broken out so it's even easier to manage. I think I might download /x/ too, could be fun.
>>
https://www.claudescode.dev/?window=since_launch
you can see some pretty funny instances of ai psychosis in the projects with the most commits and most lines added.
https://github.com/synaptent/aragora?tab=readme-ov-file
>Individual LLMs are unreliable. Their personas shift with context, their confidence does not correlate with accuracy, and they often optimize for plausible agreement instead of truth.
>Aragora treats that as a systems problem. It coordinates heterogeneous models through structured debate and review, preserves receipts and provenance, and stops truthfully when evidence is insufficient. The goal is not just faster AI output, but governed AI-assisted execution you can actually inspect.
>>
>>108453684
judeo*
>>
>>108454875
>pure vibecoded repo
yikes
>>
File: tq.png (198 KB, 1100x821)
vulkan: add TQ3_0 (TurboQuant) 3.5-bit KV cache quantization
https://github.com/ggml-org/llama.cpp/pull/21010
>>
>>108454929
shutdown already lol
>>
>>108454929
Aaaaand, it's closed.
>>
if only wilkin received the same treatment.
>>
>>108454504
stop posting these they're retarded.
>>
>>108453813
V100s are EOL with key features missing and R9700s still don't have good Pytorch support with issues like https://github.com/ROCm/ROCm/issues/5674 and https://github.com/ROCm/ROCm/issues/6007 still unresolved months after the fact. People will gamble on the B70 because of that.
>>
so what's the current meta for ERP? is it STILL Mistral Nemo?
>>
Is cloode down?
>>
>>108451495
Sure, though it's half vibecoded and then fixed by me without cleaning or anything, so I should clean it up a bit. When I release it I'll send the github link here.
>>
File: file.png (123 KB, 1386x515)
>https://arxiv.org/pdf/2603.19664
Isn't this even better than cache quantization?
>>
>>108455122
>We verify this empirically: under greedy decoding, generating 30 tokens with and without the cache yields 100% token identity across all six models tested (four architecture families, 135M to 4B parameters).
This testing is not enough. Why only 30 tokens? Why not 128 to 1024 tokens? We need to see if the equivalence holds the more tokens you generate.
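Worth noting: for vanilla attention the KV cache is a pure speed optimization, so with-vs-without greedy identity holds essentially by construction; the interesting question is exactly the one above, whether whatever the paper actually drops stays faithful over longer horizons. A toy single-head model makes the baseline equivalence concrete (everything here, shapes and weights, is made up for illustration and has nothing to do with the paper's models):

```python
import numpy as np

rng = np.random.default_rng(0)
V, D = 16, 8                          # toy vocab size and model width
E = rng.normal(size=(V, D))           # tied embedding/unembedding matrix
Wq, Wk, Wv = (rng.normal(size=(D, D)) for _ in range(3))

def last_logits(tokens):
    """Full recompute: causal single-head attention, logits at the last position."""
    x = E[tokens]                                  # (T, D)
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q[-1] @ k.T / np.sqrt(D)
    w = np.exp(scores - scores.max()); w /= w.sum()
    return (w @ v) @ E.T

def greedy(prompt, n_new, use_cache):
    toks = list(prompt)
    ks, vs = [], []                                # the KV cache
    for _ in range(n_new):
        if use_cache:
            for t in toks[len(ks):]:               # only new tokens hit Wk/Wv
                ks.append(E[t] @ Wk)
                vs.append(E[t] @ Wv)
            scores = (E[toks[-1]] @ Wq) @ np.array(ks).T / np.sqrt(D)
            w = np.exp(scores - scores.max()); w /= w.sum()
            out = (w @ np.array(vs)) @ E.T
        else:
            out = last_logits(toks)                # recompute everything
        toks.append(int(out.argmax()))
    return toks[len(prompt):]

# greedy(p, n, True) and greedy(p, n, False) agree token for token.
```

So a 30-token identity check only becomes evidence of anything once the method deviates from exact caching; until then it's true for free.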
>>
>>108455122
Sounds too good to be true, also
>x is all you need
Gay. But if it's real then I'd like to see it applied to a real usable model
>>
>>108455147
Because it won't be relevant for us if it's 1000 tokens per second because the average person ain't rich
>>
>c.ai dead
>Claude dead
>OpenRouter banning users left and right
Local WONNED
>>
>>108455199
nta, but what the fuck are you on about?


