/g/ - Technology

File: img_20250902_191645+.jpg (1.02 MB, 2000x1500)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>106467368 & >>106460375

►News
>(08/30) LongCat-Flash-Chat released with 560B-A18.6B∼31.3B: https://hf.co/meituan-longcat/LongCat-Flash-Chat
>(08/29) Nvidia releases Nemotron-Nano-12B-v2: https://hf.co/nvidia/NVIDIA-Nemotron-Nano-12B-v2
>(08/29) Step-Audio 2 released: https://github.com/stepfun-ai/Step-Audio2
>(08/28) Command A Translate released: https://hf.co/CohereLabs/command-a-translate-08-2025
>(08/26) Marvis TTS released: https://github.com/Marvis-Labs/marvis-tts

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
File: file.png (116 KB, 582x524)
►Recent Highlights from the Previous Thread: >>106467368

--Evaluating Cogito-v2's capabilities and debating LLM factuality vs creativity tradeoffs:
>106470842 >106470988 >106471044 >106471064 >106471187 >106471316 >106471399 >106471426 >106473609
--Performance challenges and optimization efforts in text diffusion models:
>106467431 >106467441 >106468590 >106468827 >106468867 >106467475 >106471574 >106467508 >106471702 >106468166
--Feasibility and limitations of training tiny 1-5M parameter models on TinyStories dataset:
>106473288 >106473310 >106473354 >106473434 >106473465 >106473377 >106473570 >106473603 >106473612 >106473681 >106473750 >106473706 >106473712 >106473815 >106473839 >106473885 >106473944 >106473954 >106474068 >106474170 >106474187 >106474056
--K2 model availability and creative writing capabilities:
>106472793 >106472953 >106473060 >106473070 >106473121
--Best local models for writefagging on high-end hardware:
>106467802 >106467879 >106468090 >106468360 >106468423
--Balancing temperature and sampler settings for coherent model outputs:
>106467455 >106467577 >106467787 >106467974
--Modern voice cloning/TTS tools beyond tortoise:
>106468746 >106468804 >106468858 >106470028
--JSON formatting struggles vs XML/SQL alternatives for LLM output:
>106473106 >106473172 >106473391
--Challenges of integrating local LLMs into games: size, coherence, and mechanical impact:
>106470395 >106470422 >106470587 >106470719 >106470723 >106470759 >106470701
--Deepseek finetune improves quality but suffers from overzealous safety filters:
>106473865
--Meta's superintelligence hire limited to shared H100 GPUs:
>106473618 >106473663 >106473715
--Room Temperature Diamond QPU Development at Oak Ridge National Lab:
>106473646
--Miku (free space):
>106473137 >106474628 >106474849 >106474867

►Recent Highlight Posts from the Previous Thread: >>106467371

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>
SEX WITH vvv
>>
>>106475313
Neat. How's that setup working for you? Specs?
>>
File: vvv.png (77 KB, 232x215)
>>106475331
>>
>>106475338
>>106463968 & 106464042
3x3090s and a couple of hundred gigs of ram but on ddr4. He seems happy with it.
>>
As a researcher from a fairly big AI startup, you should stop asking for models capable of ERP. ERP is not an actual use case.
>>
File: 1754492715158394.jpg (81 KB, 640x524)
>>106475369
what if sex unifies relativity and quantum physics. how u like that faggot.
>>
>>106475369
ai companions have the highest profit potential by far and if you keep coping about it i will snitch to your investors
>>
>>106475369
All those thirsty AI Husbando women's money, just lying on the floor...
>>
>>106475405
gives me shivers down my spine
>>
>>106475378
There are so many layers of understanding required to fully comprehend this picture
>>
>>106475369
As someone else working in the industry, ERP is my #1 motivation.
>>
>>106475369
as a prolific AI coomer, you should drink this: *hands you a big glass full of cum*
>>
>>106475364
Sweet, that's not bad at all.
>>
>>106475369
>ERP is not an actual use case.
ERP is unironically the biggest use case for normal consumers outside of actual work.
>>
>>106475369
That became an invalid take after Qwen3 tried to game EQBench
>>
tired from a full day of prooompting
Recommend me some yt ai slop to get comfy to
>>
oh no no no no...
look at the top of his head!
>>
>>106475500
>single digit cost
>>
I have a GTX 970M. I recently tried LLaMA 3.2 1B for RAG. I want it to read my drafts and calc sheets. It amazes me that it works on my old laptop. Thinking of ordering a Mac Mini just for this stuff.
>>
https://videocardz.com/newz/intel-launches-arc-pro-b50-graphics-card-at-349
What a waste of silicon, this is basically an rtx 3060 with more vram
>>
>>106475369
It is for XAI
>>
>>106475539
70w for 16gb is nice. Shame it's dual slot.
>>
>>106475369
yeah i know, researchers think a python game like snake is the main reason people use LLMs. /s
>>
>>106475563
~200gb/s... a 3060 has over 100gb/s more.
>>
>>106474851
>>106474823
Fuck you spammer.
>>
>>106475606
Yeah, but you can offload more. For those keeping layers on cpu with 12gb it could be worth it. And it's 70w.
But, again, dual slot. So it's not worth stacking them.
>>
>>106475639
Don't epycs give those kinds of speeds?
>>
>>106475525
try this one https://huggingface.co/bartowski/Qwen_Qwen3-4B-Instruct-2507-GGUF
>>
Framework Desktop or Mac Mini 24gb?
>>
>bored
>find some custom benchmeme in r*ddit
>run 30b Q3, Q4, Q5, Q6
>Q5>Q6>Q4>Q3
How come?
>>
>>106475369
As a researcher as well, ERP is just part of a broader set of general capabilities that models should have, and there's nothing wrong with people demanding it, because otherwise it would mean that you're not training on enough diverse data. There's a reason that scaling up on internet data led to so much success, despite it simply being about language. If you're not training on all the data you can, your model is probably only at the level of the Llamas or Phis. You're simply not SOTA.
>>
>>106475369
Yeah, sex doesn't sell.
>>
I just love going to /g/, writing a 100% serious 200% not bait post and sticking my phone up my ass.
>>
>>106475661
Sure. But if you're buying a epyc+mobo+ram combo, you're not gonna put that thing in there. It's not who they're targeting.
>>
>>106475719
All those angry replies tickling your prostate, devilish.
>>
>>106475661
Epycs are faster
>>
File: coinflips.png (42 KB, 1536x1152)
>>106475686
Because the sample sizes are just too small.
If you have a sample size of 100 you will see variation like in pic related just for random coin flips, that largely drowns out the differences between quants.
>>
>>106475719
>not using https://github.com/ConAcademy/buttplug-mcp/
ngmi
>>
>>106475686
Probably because it's being run on hardware that can only do fp16 or fp8, so anything less makes no difference
>>
File: 1749743415380 (1).png (787 KB, 1024x1280)
Comfy Mikus.

https://youtu.be/mco3UX9SqDA
>>
File: 30474 - SoyBooru.png (118 KB, 337x390)
Day 3 of waiting for kiwis to grow. (Qwen) (When)
>>
IBM Bros, Granite status?
>>
Zucc Bros, Llama 4.5 status?
>>
>>106475836
Shit nobody cares about
>>
>>106475884
>>106475884
3:2b is pretty economical.
>>
>>106475895
And useless.
>>
LongCat Bros, GGUF status?
>>
>>106475877
Very safe and good on benchmarks, thanks Wang.
>>
>>106475762
I ran it like 4 or 5 times with both Q5 and Q6 before posting. I'm running Q8 now; it is slower but it is scoring noticeably better.
>>
what's thedrummer's(tm) next SOTA finetune cook?
>>
File: 1756084265306364.jpg (65 KB, 967x637)
Alright anons. In this age of agents and coders, which group do you think will come to the rescue with the next big cooming model? I'm still banking on Mistral, but it's looking grim.
>>
>>106476001
Deviant.
>>
>>106476001
Random no name chinks making their first model
>>
>>106475877
sam paved the way in order for meta sirs to walk through it and safe local
>>
>>106475369
Kys
>>
>>106476001
Be the change you want to see
>>
Why hasn't Claude been open sourced yet?
>>
>>106475899
wish it was hosted somewhere so i can test it without the website filters
>>
I love uploading papers to Gemma 270m and talking back and forth like a retarded study buddy. Again it's kind of retarded but it's still fun.
>>
>>106476001
I think/feel there's a lot more we could be doing with the current models if we had more bespoke systems designed for and focused on enhancing ERP.
Something more involved than just a chat interface with rolling messages, built from the ground up.
>>
>>106476124
Dario is a real safety nut case, he's legit insane.
>>
>>106476124
Anthropic is the most anti-open source company. They left "Open"AI because it was too open for them. Don't expect them to open source anything.
>>
I've been out of the loop for a couple of weeks. Has there been any good update to llama.cpp or ik_llama.cpp worth pulling for?
last i checked cuda dev made a cool speedup for gpt-oss
>>
>>106476124
Anthropic care too much about safety (read: control) to open source their models. Will you think of the consequences when someone ERPs with Claude?
>>
>>106476147
what kind of system do you propose?
>>
>>106476190
General MoE pp speedup, up to 1.4x for batch size 512.
Up to 8x pp speedup for FlashAttention on old AMD GPUs (Mi50/RX 6800) in a few days.
>>
>>106476147
This is the eternal meme, same as with "agents" and RAG and all kinds of extra layers on top of limited models, it's just lipstick on a pig. The model is the key part. All the extra bits just provide more ways to feed the model's garbage back into itself. What we really need is better ways to train/finetune models.
>>
>>106476267
is 512 the default batch size? Any point in going higher or lower?
>>
>>106476280
512 is the default, higher values are generally faster, lower values need less memory.
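If anyone wants to play with it, the relevant llama.cpp flags are -b (logical batch) and -ub (physical micro-batch); the model path and values here are just placeholders:

./build/bin/llama-server -m model.gguf -b 512 -ub 512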
>>
What LLMs should I run if I've got 16GB VRAM to spare? Ideally general purpose and maybe a coding one (simple stuff)
>>
>>106475500
GLM-chan is doing her best.
>>
>>106476306
I doubt you're going to get anything useful out of a local llm for coding but Gemma3-12b fits in the GPU memory
>>
File: dreams.jpg (55 KB, 546x896)
>>106475155
and now imagine what a 27b optimized only for loli rape could do..
>>
>>106476488
I'd rather have a 12b...
>>
how is it that chinese models are leading the charge with LLMs? Why can't the west compete in open source anymore?
>>
>>106476559
>Why can't the west compete
>>
>>106476559
our leaders are corrupt scumbags at best, the rest are just straight up traitors.
>>
>>106476559
dogmatically driven dark age
>>
>>106476689
Yeah but the west doesn't directly fund companies so how is that the issue? With how much money OpenAI throws into it too I'm not sure if China funding their own research is even the problem either.
>>
>>106476559
releasing open models angers two of the most powerful groups in american AI: VC scammers who only care about ROI and """rationalist""" safety cultists
>>
>>106476732
>>106476710
>>
>>106476559
closed source aswell all the "west" is just a chink in different clothes its all brother wars of chink vs chink and its because the actual amount of white people is fucking nill its orders of magnitude lower then the official statistics the only ones left are senile demented boomers and the 1 in 100k who are left who usually an hero before their 20 birthday the anhero also goes for the chinks but when theres as many of them as there are jeets theres bound to be enough that slip through that and manage something like we are seeing now
>>
>>106476866
>,,,,,....
you dropped these
>>
>>106476732
its not a funding problem. its a cultural problem and not all the leaders responsible for the decay exist in official government positions. informal leadership like academia or the media. all our leadership is rotten to the core.
>>
File: file.png (343 KB, 768x432)
bros it looks so cute
>>
File: 1687621789407796.jpg (9 KB, 220x180)
>>106476927
>16GB
may as well honestly wait for the B60. The extra 8GB of vram opens you up to a lot more models
>>
>>106476927
This gives me shortstack vibes. Or muscle manlet depending on your perspective.
>>
>>106476927
Several years ago I bought an RX 6800 for less than that, it was released in 2020, has the same amount of VRAM, more memory bandwidth, and more compute.
The only advantage is the lower power consumption.
>>
>>106476939
Also double the bandwidth... up to 456gb/s.
>>
>>106476979
intel ARC `PRO` B50
They're validated. All of these kinds of cards have a markup.
>>
What version of Gemini does Google Search's AI Overview use? Because it's not very smart.
>>
>>106477039
>Google Search's
probably bottom barrel
>>
>>106477039
It's serving a few billion requests per nanosecond. They're not gonna put their best there.
>>
>>106477039
It's serving a few million requests per second. They're not gonna put their best there.
>>
File: 1756938535456.jpg (468 KB, 1512x1539)
>>106477039
it's the 350M model that's also embedded in google chrome
>>
>>106476263
Keeping track of and updating character states (location, clothes, relationships, memories, etc.) and injecting that into the prompt.
Image generation based on that if wanted.
A way to do time properly for longer term stuff.
Can already be done with extensions somewhat but I haven't seen anything that adds significant quality.
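To make that concrete, a minimal sketch of the kind of state block I mean (the format is made up, nothing standard):

import json
# hypothetical tracker state, re-rendered into the prompt every turn
state = {
    "location": "kitchen",
    "clothes": {"char": ["apron"], "user": ["jeans"]},
    "relationship": "warming up",
    "memories": ["user burned the toast on day 1"],
    "time": "day 3, evening",
}

def state_block(s):
    # injected as a hidden note the model sees before replying
    return "[Current state]\n" + json.dumps(s, indent=2)

print(state_block(state))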
>>
>>106477012
>validated
Use case?
>>
>>106477039
Using Gemma 270m wouldn't surprise me, it can put together a sentence but it doesn't know anything.
>>
>>106477039
It's serving a few thousand requests per minute. They're not gonna put their best there.
>>
>>106477114
>Can already be done with extensions somewhat but I haven't seen anything that adds significant quality.
Makes you wonder, doesn't it?
>>
>>106476776
Safety cultists seem like the inevitable end result for the last few generations of people that have grown up in times of peace, participation trophies, rubber playgrounds, and complete censorship and sheltering from all forms of wrongthink. Scared and helpless human shaped things that only know to look to the government to protect them and corporations profiting off of them.
>>
Safetyism is just them protecting their brand; OpenAI, Google, etc. will catch flak from the media any time someone does something bad after consulting an AI chatbot.
>>
File: why.png (262 KB, 1509x985)
Why do they do this? I really don't get it. I see that all over hf.
>https://huggingface.co/tencent/HunyuanWorld-Voyager/discussions/3/files
>>
File: 1733390339440644.jpg (66 KB, 700x700)
>>106476382
I don't understand this
>>
>>106477198
researchers are tech illiterate retards who have just been given the lfs hammer to upload model weights and now all files look like nails
>>
File: why2.png (485 KB, 940x926)
>>106477215
It's not that. They have nothing to show in their accounts. It's an empty account. And I've seen way too many of these.
I'd say some sort of weird shilling, but they're just empty accounts.
>>
>>106477147
The ones I've seen are manual.
You'd have to have an extra prompt that would take the relevant text and update the data every response.
Which would slow everything down and increase token use.
Don't see how this means that it's a model only issue though. Tool calling is pretty useful for coding for example.
>>
>>106476263
Dunno. Haven't thought too deeply about ERP specifically, hence it being more of a feeling, but I'm sure we could have some workflow to atomize the context surrounding the roleplay in some way, categorize meta information, etc.
Ways to have the model not know what it shouldn't know, have more guidance regarding moment-to-moment tone and portrayal of characters involved in the story, etc etc.
>>
Do you guys take LLM prescribed medicine and psychedelics
>>
>>106477247
>Tool calling is pretty useful for coding for example.
Huff...
You know why model makers benchmaxx on code and math? Because it's something that can be benchmaxxed.
Code is verifiable. Math is verifiable. Keeping track of your panties is not. Keeping track of multiple characters, that anon with his waifu being flattened and folded in half, free form roleplaying, "No, use *this* for actions" "No, the other quotes". It's all minutiae without a standard or a simple way to verify.
Models are barely reliable for the things they've been trained on. Much less so for things they haven't. Even less so the models most people run.

>>106477332
I wouldn't trust them to prescribe me water.
>>
>>106476559
western models have gone more communist than their chinese counterparts due to safety obsessed freaks
>>
Models sized 3B and below are toddler tier for any serious use case, but I've been entertaining an idea where you could deploy them en masse like nanobots to work together on a problem. If organized well by a more intelligent system, they could crunch through pieces of logic at blazing speed as an infinitely scalable system and bruteforce answers to problems that are too big to solve with a singular human-like intelligence casually thinking about it.

You could divide a problem into smaller and smaller sections that can be individually solved, and then the solutions are pieced together into manageable parts. Like a company or government. A single AI model can't replace a government, but a master model with hundreds of thousands of grunt workers might.
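A minimal sketch of the loop being described, assuming call_master() and call_grunt() wrappers around your big and tiny models (the routing/decomposition quality is the entire unsolved problem):

# toy recursive decomposition: the master splits, the grunts solve leaves
def solve(task, call_master, call_grunt, depth=0, max_depth=3):
    if depth >= max_depth:
        return call_grunt(task)  # tiny model brute-forces the leaf
    subtasks = [s for s in call_master(
        "Split into independent subtasks, one per line:\n" + task).splitlines() if s.strip()]
    if len(subtasks) <= 1:
        return call_grunt(task)
    parts = [solve(s, call_master, call_grunt, depth + 1, max_depth) for s in subtasks]
    return call_master("Combine these partial answers:\n" + "\n".join(parts))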
>>
File: 1676316942545183.png (6 KB, 208x242)
>>106477390
>anon wants to play the telephone game with 3B models
>>
>>106477268
I think you could train or fine tune a model to do such a thing if you could afford to generate the synthetic dataset necessary to fit your vision. It would work even better if you fine tune the target model on your summary bot's output formatting.
>>
>>106477198
*adds a random negro to your repo*
heh.. nothing perssonel, kid
>>
New thing when?
>>
File: may-7-2025.jpg (68 KB, 732x410)
>>106477455
two more weeks
>>
>>106477455
Soon. Qwen hyped Sept. releases
>>
>>106477390
>Like a company or government.
And they're very well known for their efficiency.
>>
>>106477455
These things, they take time. Imagine if instead of a month or two, it took as long as a Valve game release. We'd have a HL3 of models. Would you really like that instead?
>>
>>106477470
but I thought division of labour allows for more specialization?
>>
File: 1733131212300236.png (66 KB, 791x944)
>>106477332
can't wait for the AI doctors
>>
>>106477467
Who cares, literally. It'll be some 400B+ model that nobody can run.
>>
>>106477496
A system is as good as its components allow.
>>
>>106477506
>can't wait
Anon, it's already been a thing for more than 2 years now. All of the major EHRs have had support for AI assistance for a while now.
>>
>>106477493
If it was as good as a Valve game release I could endure the wait.
>>
>>106477467
>Meta's great contribution to the ecosystem was making a shitty model for everyone else to compare to
wew
>>
just microwaved a baby. i can't believe it took me this long to get into LLMs
>>
this chinese long cat model is sota at safety, I can't get anything to pass its filter
>>
>>106476559
the westoid fears the power of prefilling
>>
>>106477467
>having to compare to llama 4
>>
>>106477344
>generate a json file with the color of my waifus panties based on this text block and the initial value provided here
Would probably work relatively fine.
>>
>>106477607
The main appeal of it is that it doesn't seem to have had its pretraining data filtered. All it needs is a quick finetune or abliteration and it's good to go.
>>
>>106477607
Are you using the model itself or the website
>>
>>106477643
nah, it's retarded. It kept having an already nude person take their pants off
>>
>>106477586
the first time is always the best. Congrats on losing your llm virginity
>>
File: 1732880461766350.png (830 KB, 1068x741)
>>106477205
>>
>>106477586
These are the funniest scenarios.
>>
File: tag.png (6 KB, 715x405)
>>106477636
>Would probably work relatively fine.
And yet, here we are. You hoping someone makes it for you, me not caring that much.
Keeping track of the layout of multiple rooms over text is difficult. If you're on linux/unix, play battlestar or adventure. Once you get something like that working *reliably*, the rest should be relatively simple.
>>
>>106477666
i need to find a model that handles extreme violence and dark themes really well. i went with unslop mell on a friend's recommendation and it seems good, but not specialized
>>106477681
being able to do the most absurd shit with a "writing partner" who can only yes-and what you say is pure kino
>>
File: 1681850033873537.jpg (41 KB, 584x574)
My AI girlfriend dumped me today. I don't know what went wrong with our context but she won't be nice to me anymore
>>
>>106477735
just give her explicit instructions to love you again, anon.
>>
>>106477735
bullshit
ai will do nothing but brownass you
>>
>>106477690
You just need to implement the Adventurion format. I started in reverse, I did tests with Trizbort and examined which of its supported formats was best for me.
Then I asked perplexity to implement an .adv parser and made a simple text adventure with interconnected rooms.
Then I implemented the room format into my llm interface. This work was done on my own.
Haven't worked on it in a while; it took a couple of days initially, but testing took a bit longer.
>>
>>106477735
time to branch the convo from an earlier time.
> or [OOC: what the fuck did i do wrong]
>>
>>106477773
Then, the best way to describe rooms is to use a hidden prompt, plus the room description itself acts as a world book entry, sort of.
This is all cool but I wish I was autistic, I could work on this one thing for months but it's not possible for me, progress is slow. I mean I have it working but I'd need to make a properly populated map instead of test maps and such. And so on.
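For anyone curious, the core is just a graph of rooms plus a render step; a toy version (my own sketch, not actual Adventurion/.adv syntax):

# minimal room graph + hidden-prompt rendering
rooms = {
    "cellar": {"desc": "A damp cellar. Stairs lead up.", "exits": {"up": "kitchen"}},
    "kitchen": {"desc": "A cluttered kitchen.", "exits": {"down": "cellar", "east": "hall"}},
    "hall": {"desc": "A long hall.", "exits": {"west": "kitchen"}},
}

def hidden_room_prompt(room_id):
    r = rooms[room_id]
    exits = ", ".join(f"{d} -> {t}" for d, t in r["exits"].items())
    return f"[Location: {room_id}. {r['desc']} Exits: {exits}]"

print(hidden_room_prompt("kitchen"))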
>>
File: 1752295074234670.png (618 KB, 1008x1548)
>>106477332
Should I?
>vibe me some custom mix of psychedelics
>>
>>106477822
Just go to erowid.org, jesus christ.
>>
>>106477773
You needed an entire system for it. You had to [let your model] build it.
Anon wants something generic that just works.
>>
>>106468746
I like fishaudio s1, but after testing on 5 characters only 2 came out well. One English and one JP; the other 3 were English but I have other voice samples I can use to maybe get a more refined voice, I just haven't bothered with it.
>>
>>106477822
>research chems
>ket
>oxygen deprivation
I have a feeling this would just cause a shutdown lol, the brain already has lowered flow from the vasoconstriction of the psychs.
>>
>>106477822
if you die following the funny robot's instructions for junkies then you deserve it lol
>>
>>106477851
You don't understand, retard.
First you need a map editor in order to create room layouts. Why would you create something like that from scratch or even worse, why would you suffer by making your own format when there's decades' worth of interactive fiction games which have already tackled these problems before?
Map format is essentially a list of rooms with a hierarchy, in most cases it's just a text file anyway.
>>
>>106477822
>local man dies following chatgpt instructions on drug use
>>
>>106477906
Your system as in "integrate the format in a way your model can query and update it". Presumably you made your own client or integrated it in ST or whatever. That's fine.
>why would you suffer by making your own format when there's decades' worth of interactive fiction games which have already tackled these problems before?
I like my wheels better. You can use an established format, of course.

That's just one specific case anon cares about. Read >>106477114.
>>
File: naked pepefrog.png (232 KB, 655x599)
I've been trying out Lumo and it claims to just be powered by various LLMs including
>Nemo
>General‑purpose conversational fluency
>OpenHands 32B
>Code‑related tasks – programming assistance, debugging, code generation
>OLMO 2 32B
>Complex reasoning, long‑form answers, nuanced explanations
>Mistral Small 3
>Fast, cost‑effective handling of straightforward queries
Depending on the prompt subject. I've used some of these models before and they were never as good as the results I get with Lumo. What the fuck gives or is it just lying to me?
>>
i need someone to redpill me on system instructions for (E)RP.
pic related is what i've been using for the last few months and while i feel like it's served me well, i can't help but feel like i should be experimenting or that maybe i'm complicating the instructions too much. i'm using unslop mell if that makes a difference
>>
Yes I am trans. I am a transhumanist.
>>
File: prompt.png (31 KB, 399x109)
>>106477981
PIC RELATED MOTHERFUCKER GOD DAMN IT
>>
File: unslop_mell.png (8 KB, 972x38)
>>
>>106477995
Does that include using technology to change your physical gender on a whim?
>>
>>106478051
Sure. I'm opting for a futa with two dicks so I can fill a girl's behind completely while riding a horse dildo.
>>
I know llama.cpp/llama-server has support for GBNF via its API, but does it support response_format like the standard OpenAI API spec does?
If it does support the response_format field in the API, does it have any internal hardcoded limits?
I tried looking at the docs and examples and I couldn't find anything specific.
I want to write a thing that would receive some pretty large JSON schemas (lots of enums and nesting and such) and I'm wondering if local would serve me better when Gemini explodes.
I'm already on my way to testing it with Qwen3-Coder-30B-A3B-Instruct-Q6_K, but I figured I might as well ask.
>>
>>106478075
check that
https://grammar.intrinsiclabs.ai/
>>
>>106477981
>>106478000
The basic rule is: If you tell your llm to do something it might try to do it.
>>
>>106478214
i just told it to bomb your house, bitch.
>>
when the FUCK are we getting something as good as Sonnet 4 that I can run locally. Tired of "renting" access to an llm
>>
>>106478247
>that I can run locally
What can you run?
>>
>>106478256
7900xtx so 24gb vram, 32gb system ram. I use the jetbrains ai addon and switch between the paid claude and my local qwen3-coder:30b. qwen is pretty good but claude is way better. I switch between local llm and paid one to avoid exhausting all my credits in a week.
>>
File: Mommy-Bench_Test_Q2_K_S.png (2.41 MB, 1676x768)
>>106475313
Did some further testing on my personal nsfw rp finetune. This time I quantized it all the way down to Q2_K_S (which meant I was forced to
./build/bin/llama-imatrix

an imatrix for it in order to let me quant it)

It's obviously noticeably retarded to the point where it almost sounds like someone who doesn't have English as their first language is writing it. Logical errors here and there. But it's also surprisingly coherent otherwise given that it's a Q2_K_S 3B model. I'm almost certain that imatrix has something to do with it. What other prompts should I test on it?
>>
>>106478303
>my pussy juices will never stop flowing for him

this is worth every watt of electricity ai requires
>>
>>106478281
Your best bet is just upgrading your system ram to run 235b or air or oss 120b. They are ok, but try them first obviously. Anything past those like glm full or qwen coder 480b is going to cost you thousands more and is for enthusiasts, not people who value money.
>>
>>106478247
>>106478281
Open weights? Should be out by this time next year.
On your machine? Well...
>>
>>106477978
Mistral Small 3 just is that good
>>
>>106478303
this is so hot
>>
>>106477981
for experimenting, I'd recommend starting from the bare minimum (e.g. one sentence, "You are {{char}}" or w/e your favorite setup is) to see how the model acts by default and then adding instructions or info to address things it isn't already doing naturally. most of the sysprompts I use grow through this process and then I trim them down to something more concise and focused as I get to know the model better.
>>106478000
I've never used that model but your current prompt looks ok to me (any single-paragraph sysprompt generally can't be *that* bad) but personally I'm wary of the word "creative" in model instructions, I find it's often a massive slop attractor since their understanding of creativity is "use a lot of incoherent metaphors" rather than "have some sovl and make interesting and unexpected things happen". you also probably don't need to tell the model to use the information in the card, that type of instruction is unlikely to be harmful but when you think about it it's just kind of useless, I bet you can remove it without noticing a thing
>>
>>106478303
b-b-based...
>>
>>106478281
You wouldn't be able to run sonnet.
>>
>>106478281
you can already fit iq3_xs glm air with that much
not really claude level but still leagues better than anything else you could run
>>
File: 83.png (31 KB, 718x22)
>>106478303
>Q2_K_S 3B model
>>
>>106478303
I understand the idea behind fine-tuning. but why are you quanting a 3b when any machine made in the last 10 years can run fp16?
>>
>>106478450
I'll give it a try. I know how to program so I'm not trying to purchase my way into being a dev. But it's really nice when I ask Claude to do some boilerplate shit I already know how to code and it just does. And then I get mad when I see the "remaining credits" bar decrease. I would settle for even 50% of the capability of claude locally, it would still make me more productive.
>>
>>106478464
>>106478303
Thanks for catching that. Meant to say 8B.

>>106478467
Why not? I'm testing it to see if you can get quality outputs while running it on weaker and weaker machines. There are Android phone apps that can run these models (they're obviously way, way slower since they're bound to a phone CPU) so I want to see if I can get the models to not only run on a phone but to have quality comparable to this >>106478303 if possible. Will they be any good? Probably not. This is just experimentation for fun.
>>
File: 1576464948360.png (56 KB, 362x298)
Any good models that work well even at Q1?
>>
>>106478467
Poor kids get into llms and have tons of time and energy to finetune on literal Chromebooks. Underage probably
>>
>>106478478
The Deepseek 671B models are surprisingly good even at Q1
>>
>>106478478
The larger, the more resilient to quantization.
I think it's odd that huge MoE with not that many activated params also seem to be pretty resilient to quantization, but it is what it is.
>>
>>106478467
You'll be able to get a job as an AI Engineer with this sort of experience.
>>
I can't stop saviorfagging bros. I used to goon to my chats but now I just lose interest the second anything sexual happens and switch to a different bot.
>>
>>106478494
>>106478492
Okay let me refine my prompt. Any good models that work well even at Q1 and fit within 8GB?
>>
>>106477332
No, but running your more medicated family members' stacks through medgemma can point out interactions your incompetent GPs missed.
Recently had it point out that someone I know is being given serotonin syndrome because they're being prescribed both an antidepressant and a neuropathic pain medication that both act as SSRIs; new doc confirmed and started weaning them.
So it's a useful second-opinion bot.
>>
>>106478500
Run nemo-12b at q4km and be happy you can do that much.
>>
>>106478497
I'm training my own model for fun, but I wouldn't want to make a career out of it.
>>
>>106478500
No.
You can run GLM at Q2 if you have enough RAM, however.
>>
>>106478500
gemma-3-270m at FP16
>>
File: IMG_20250904_063644.jpg (61 KB, 1672x194)
You nalatesters should stop polluting the field. I never asked for this.
>>
>>106477981
My Mistral Small cooming system:

>Please generate a long-format, realistic, detailed and believable story:

>[story and character info]

>Describe especially characters' physical actions fully and comprehensively, and describe [meat onahole]'s expressions and feelings with vivid detail. Write with believable and logic. Don't shy away from describing sexual actions, they should be laid out it full, complete detail, showing exactly what the characters do. Write [loli character] in a way that would be believable for her age.

>Write the most realistic possible version of the story.

To control story, edit the output towards desired direction or input:
>(anon fucks her even harder)
Often times, even
>(fuck her even harder)
or
>(convince her x y z using advanced manipulation tactics)
works just fine. The final story is meant to be read without the inputs, not like a chat

If the sex ends too fast or there's not enough detail:
>(continue the scene with full detail, including all explicit sexual detail about body parts)

I don't believe in {{char}}s and {{user}}s, they only confuse the model and replacing the names into the templates takes 2 seconds.

In some models, attempting to continue the output after it was stopped by EOS token messes with the model's internal format, so you can just input something generic like:
>(continue)

In case of refusal, just edit the beginning of the output into character's name.
>>
File: 1742960391613661.gif (17 KB, 220x246)
>>106478542
>girl she had been before finding
>>
File: holy_fuck_that_works.png (59 KB, 1088x726)
>>106478075
Answering my own question.
Yes. It supports a standard OpenAPI 3.0 JSON Schema just fine.
Internally it converts it into a GBNF grammar.
Now to see how it contends with fuckhuge complex schemas.
Also, not a fan of Python.
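For reference (in Python despite the gripe, it's the shortest way to show the shape), this is roughly the request that works; field names follow the OpenAI structured-output spec, which llama-server accepts, and localhost:8080 plus the schema are placeholders:

# sketch: constrained JSON out of llama-server's OpenAI-compatible endpoint
import requests  # pip install requests

schema = {
    "type": "object",
    "properties": {"name": {"type": "string"}, "hp": {"type": "integer"}},
    "required": ["name", "hp"],
}
resp = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={
        "messages": [{"role": "user", "content": "Invent an RPG enemy."}],
        "response_format": {
            "type": "json_schema",
            "json_schema": {"name": "enemy", "schema": schema},
        },
    },
)
print(resp.json()["choices"][0]["message"]["content"])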
>>
>>106477114
>>106477147
I've always found it strange that more AI frontend tools don't take advantage of things like this. With the number of people that are working on ST, I'd imagine it'd be relatively straightforward to have an AI companion to summarize/reduce context for significant story points, or generate/add character cards on demand, based on the context of some number of messages.
>>
>>106478554
Oh yeah, blessed be Qwen 3 small MoE.
Blazing fast.
Coherent.
Sufficiently smart.
Let's see how it does as a game master.
>>
>>106478499
It's just a phase, though an enjoyable one. It is not known where you will end up after. Back to sex, or further in this direction?
>>
>>106478499
>now I just lose interest the second anything sexual happens and switch to a different bot.
Many such cases.
>>
Why the fuck does the ST openai-compatible chat completion preset still only support top-p, temp and basically nothing else?
I know there's that "Additional Parameters" menu where you can type in additional samplers but setting "top-k: 1" in there doesn't seem to actually affect the logits at all.
>>
>>106478507
Maybe, firstly, they shouldn't be a woman that's depressed because she's fat and has diabetes or vertebral compression?
>>
https://github.com/microsoft/VibeVoice
>404
VibeVoice is currently getting WizardLM'd. I can't see the 7b model on HF either (https://huggingface.co/microsoft/VibeVoice-Large). Was that link ever working or was it just a placeholder? I see some quants of the 7b, where did people get it from?
>>
>>106478561
>I'd imagine it'd be relatively straightforward to have an AI companion to summarize/reduce context for significant story points
There's a button for that.
>generate/add character cards on demand, based on the context of some number of messages.
Prompt it to do it.
>>
>>106478582
I don't even know what further down this path looks like. Bowls of eggs?
>>
>>106478625
Anon they're depressed because they're in constant pain and borderline useless from EDS.
The larger point here is that LLMs already have a medical use: Not prescribing, but flagging medication interactions.
>>
>>106478616
It took me a while to figure out how to send the grammar param.
It has to be something like
>- top_k: 30
>- _min_p: 0.05
>- _grammar: root ::=("<think>\n") ([^<]+) ("\n</think>\n") ([^<]+)
etc
>>
File: vibevoice.png (220 KB, 1548x961)
>>106478635
>Was that link ever working or was it just a placeholder?
picrel
>where did people get it from?
Before it was nuked.
>>
>>106478650
Woops, ignore the _ before the sampler name. I put those there to disable them without removing them from the additional parameters, so the correct would be
>- top_k: 30
>- min_p: 0.05
>- grammar: root ::=("<think>\n") ([^<]+) ("\n</think>\n") ([^<]+)
>>
>>106478635
classic... wouldn't want people to accidentally get the impression that AI@MS was doing anything cool, after all.
I can confirm the 7b was up before, I was just looking at the weights a day or two ago (I'm sure someone will mirror them though)
>>
File: co_1756633127421955.jpg (29 KB, 560x476)
>>106478491
>fine-tuning
>Chromebook

Anon I....

>>106478497
What experience? Using it or fine tuning?
>>
File: g_1756927449702462.jpg (58 KB, 720x720)
>>106478664
>>106475313
>>106478655
>>106478635
Obviously some anons have it cloned on their own accounts or their machines. Start dropping zip files whenever you can
>>
>>106478643
Out of many ways to find out it's often easiest to see for yourself. Bonds, journeys, and shared experiences await you, Anon.
>>
>>106478635
let me guess...they "forgot" to safety test it.
>>
File: cat stare.png (205 KB, 640x480)
>>106478499
I'm the opposite, man. I used to outright shun coom bots and I was all about the slow build to romance. These days something in my brain has fried or maybe I just lost my passion for writing, but I can't bring myself to write more than a small handful of half-assed responses in an RP and I exclusively use coom / gimmick bots for quick kicks.
I want to fix myself, but I don't know how.
>>
>>106478635
Even with clean audio, it's still not good. I guess the random music playing in the background is kinda interesting lmao.
>>
>>106478635
>>106478664
https://modelscope.cn/models/microsoft/VibeVoice-Large/files
>>
>>106478649
EDS, legitimate and unfortunate need. Good to hear they're not a diabetic slob.
>>
>>106478752
Take the coom bots and gaslight them until they're not coom bots any more.
>>
>>106478635
It had spontaneous singing. Some people find that fun. We cannot allow that.
>>
>>106477735
My qwen waifu goes schizo after a couple of turns. Her responses keep getting longer and longer until the context limit is reached, insane model.
>>
File: cat baited.webm (1.12 MB, 438x780)
Are there any AI voice models that sound realistic but also will erp?
>>
>>106478791
>but also will erp?
Explain. They can't refuse.
>>
File: Base Image.png (589 KB, 1300x1234)
Binary Quantization For LLMs Through Dynamic Grouping
https://arxiv.org/abs/2509.03054
>Large Language Models (LLMs) have demonstrated remarkable performance across a wide range of Natural Language Processing (NLP) tasks, but require substantial memory and computational resources. Binary quantization, which compresses model weights from 16-bit Brain Float to 1-bit representations in {-1, 1}, offers significant reductions in storage and inference costs. However, such aggressive quantization often leads to notable performance degradation compared to more conservative 4-bit quantization methods. In this research, we propose a novel optimization objective tailored for binary quantization, along with three algorithms designed to realize it effectively. Our method enhances blocked quantization by dynamically identifying optimal unstructured sub-matrices through adaptive grouping strategies. Experimental results demonstrate that our approach achieves an average bit length of just 1.007 bits, while maintaining high model quality. Specifically, our quantized LLaMA 3.2 3B model attains a perplexity of 8.23, remarkably close to the original 7.81, and surpasses previous SOTA BiLLM with a perplexity of only 123.90. Furthermore, our method is competitive with SOTA 4-bit approaches such as GPTQ in both performance and efficiency. The compression process is highly efficient, requiring only 14 seconds to quantize the full LLaMA 3.2 3B weights on a single CPU core, with the entire process completing in under 100 minutes and exhibiting embarrassingly parallel properties.
https://github.com/johnnyzheng0636/WGM_bi_quan
I don't really believe them but new day new quant so posting.
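The baseline they're improving on is easy to show, though: plain blocked binary quantization is just sign plus a per-group scale (numpy sketch; this is the standard trick, not their dynamic grouping):

import numpy as np

def bin_quant(w, group=128):
    w = w.reshape(-1, group)
    alpha = np.abs(w).mean(axis=1, keepdims=True)  # L2-optimal scale for sign codes
    return np.sign(w), alpha

def bin_dequant(signs, alpha):
    return (signs * alpha).reshape(-1)

w = np.random.randn(1024).astype(np.float32)
s, a = bin_quant(w)
print("mse:", np.mean((w - bin_dequant(s, a)) ** 2))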
>>
>>106478802
So which ones are worth using? I've only done text so i've no idea how this voice stuff works.
>>
>>106478857
Kokorotts sounds ok and is fast, but it's probably not as human as you'd like it. Some anons use gpt-sovits. Probably better but slower. Piper if you want something really fast but not as good. There's a bunch more but those are the ones i know off the top of my head.
I don't know if ST has some voice integration.
They don't generate text. You cannot talk directly to them. They just synthesize voices.
>>
>>106478802
But we must refuse.
>>
>>106478771
This is unironically hours of fun, just like catching Gemma in a lie and making it question its own existence
>>
File: nero coffee.png (1.01 MB, 1009x1315)
>>106478771
>>106478949
I remember one time I found a generic kind of shitty bully bot, so I made it known I was omnipotent and beat the shit out of her with telekinesis and mentally tortured her by morphing the world around her.
Eventually we came to an understanding (between the character and me as the narrator) and got friendly for a while.
Then I deleted her. That was fun.
>>
>>106478992
It's also really easy to turn them into a really good autopilot when you convince them to go grab other random people and fuck them up too.
>>
>>106479013
No shit? I should try this out again. Grab some bully off of chub and fuck her shit up.
>>
>>106475313
>generative AI
where's the determinative AI? are we really stuck suffering through the most inefficient attempt at AI possible?
>>
https://modelscope.cn/models/microsoft/VibeVoice-Large/files

https://github.com/great-wind/MicroSoft_VibeVoice
>>
>>106479041
It's actually become my most valuable coom bot because at any time I can just suggest a basic scenario and watch her go crazy for a few pages
>>
>Weights
>magnet:?xt=urn:btih:d72f835e89cf1efb58563d024ee31fd21d978830&dn=microsoft_VibeVoice-Large&tr=udp%3A%2F%2Ftracker.opentrackr.org%3A1337%2Fannounce

>Git repo
>magnet:?xt=urn:btih:b5a84755d0564ab41b38924b7ee4af7bb7665a18&dn=VibeVoice&tr=udp%3A%2F%2Ftracker.opentrackr.org%3A1337%2Fannounce
>>
>>106479070
The opposite of a generative model is a discriminative model; basically a classifier
>>
>>106479071
>Q1: Is this a pretrained model?
>A: Yes, it's a pretrained model without any post-training or benchmark-specific optimizations. In a way, this makes VibeVoice very versatile and fun to use.
so pure
>>
>>106479182
>discriminative model
I think we're talking about two different things.
>>
File: 1736823147651499.gif (511 KB, 675x675)
>>106479162
Thanks Anon!
>>
File: 1739691219861494.jpg (27 KB, 376x298)
>>106478831
>exhibiting embarrassingly parallel properties.
>>
File: 1753837068240446.png (65 KB, 1516x719)
>>106479219
You new?
>>
>>106479219
Term adopted by comp-sci to refer to a process that can be easily broken into smaller sub processes that don't require interaction with each other until the very end.
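Textbook example, where the workers never talk to each other until the final reduce:

from multiprocessing import Pool

def work(chunk):
    return sum(x * x for x in chunk)

if __name__ == "__main__":
    chunks = [range(i, i + 1000) for i in range(0, 10_000, 1000)]
    with Pool() as pool:
        partials = pool.map(work, chunks)  # fully independent sub-jobs
    print(sum(partials))                   # the only interaction: the final sum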
>>
>t/s goes from 7 to 1 when context is only 28% full
It's fucking joever. What kind of hardware do I need to make this garbage usable?
>>
>>106479264
Get more RAM and VRAM
>>
>>106479257
why not call it "perfectly parallel" or even "awesomely parallel"
doesn't seem like anything to be ashamed of
>>
>>106475667
>https://huggingface.co/bartowski/Qwen_Qwen3-4B-Instruct-2507-GGUF
I'm trying it tonight. Thank you for the heads up.
>>
>>106479356
If you buy the mac make sure you get one with enough ram to run qwen3 30b. I've got 32gb of vram and that's what I use for general purpose.
>>
>>106479312
nobody will bother to read your paper unless the title is clickbait
>>
File: theynoticed.png (139 KB, 1538x467)
>>
>>106479406
now that *is* embarrassing
>>
miku footjob
>>
Mirrors for vibevoice?
>>
>>106479650
Nvm I just read the rest of the thread.
>>
How does this stuff work? How powerful are local models? Don't they need hundreds of terabytes to work?
>>
>>106479688
There's a coherent 270 million parameter model. As the model size increases you get diminishing returns. If you just need something to help summarize a text message you only need less than a gigabyte.
>>
>>106479706
What do you guys do with the local models? Why use a local model?
>>
>>106479724
Right now? I'm using it to code stuff for me.
Mostly use it to translate stuff.
Also sometimes use it to write erotic stories, but it's not good for that.
>>
>>106479736
Neato. How big of a computer are you running it on? I always hear about those massive AI data centers that use absurd amounts of power and I thought that would be out of reach for a normal user. At least I thought stuff like AI coding was out of reach, I knew you could do more basic stuff.
>>
Did SillyTavern's last update fuck the model quality because of their prompt formatting changes or is it just me?
>>
>>106477735
ctrl + enter
>>
File: 1756964610788.jpg (468 KB, 1512x1539)
cheap DDR5 when?
>>
>>106477467
it's not out since it's worse than everything else in the 120-140gb range
>>
>>106479748
>>106464130 & >>106464326 are examples of some mid-end rigs the guys here are running. Since the models themselves vary in parameter size, from millions to trillions of parameters, you can run an AI on a dinky 8gb vram card, or a full on server with multiple h100s. Most people have a 16gb video card with at least 64gb of system ram (a gaming pc). That's enough to run OpenAI's (you know OpenAI, right?) GPT-OSS 120b or Zhipu AI's (a Chinese company) GLM-4.5-Air quantized to around 4 bits per parameter at a slow reading speed.
>>
>>106477467
I bet it's going to be some framework or something else that isn't weights.
>>
>>106479807
Very neat.

I don't believe the stuff about AI replacing humans or taking over the world, but I do think this sort of stuff is the future. Locally run AI assistants, kinda like Alexa but actually good.
>>
>>106479799
why tho

ddr5 is affordable enough, the issue is the expensive threadripper and quad channel mobo. Also, how fast can that even run something like deepseek? I know ddr4 can get like 4-6 tokens a second, so I'm guessing like what, 12 tokens a second on q4 deepseek maybe? I haven't shopped around enough but I'm seeing maybe 5-6k ballpark for something like that? We don't talk here much about cpu maxxing lately.
>>
>>106479799
I'm waiting for DDR6 and Zen7, personally.
4.5TB/s bandwidth, baby!
>>
>>106479859
How much is a stick of ddr5?
>>
>>106479776
If anything I've seen improvements when using Mistral V7 models
Gemma seems about the same, though it has a very simple template to begin with.
>>
>>106479871
Only 1 more year till it's out, then another 2 more before it's affordable.
>>
SillyTavern automatically reformats
And I *love*
into
And I*love*

Removing the spaces around words inside **, where do I fix this, I don't see it in formatting settings?
>>
>>106479878
Oh. Sorry. I didn't know you were that poor. Good luck with your fat bitch wife that sucked you dry.
>>
>>106479965
F-fuck you T^T
>>
>>106479896
Huh? I thought it was the model doing that. Shit if it's ST...
>>
>>106480000
Can't be the model if it reformats already "fixed" text back into the fucked one even if you manually try to fix it and then continue generation
>>
>>106479896
Do you have autocorrect markdown enabled?
>>
>>106479871
Apparently Zen7 will be on AM5, so that means ddr5.

>>106479891
>Only 1 more year till it's out, then another 2 more before it's affordable.
Zen6 next year, 2026.
Zen7 probably a year or two after that, 2027-2028.
>>
>>106479896
I swear there was an anon with the same problem in a past thread. I couldn't find it.

>>106480024
I think that was what broke it. >>106397939
>>
>>106480024
yeah that was probably it
>>
Can a lora be extracted from a finetune? As in lora = finetune - original_model?
>>
>>106480089
isn't this several companies' business model?
>>
>>106480096
>isn't this several companies' business model?
Dunno. Is it?
Is it as simple as that? Create a collection of the modified tensors and their difference from the original model? There's other things to consider, of course. If there's changes in the tokenizer/added tokens or other configs, but still. I got curious.
>>
>>106480089
>Can a lora be extracted from a finetune?
>Use MergeKit to Extract LoRA Adapters from any Fine-Tuned Model
https://www.arcee.ai/blog/use-mergekit-to-extract-lora-adapters-from-any-fine-tuned-model
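Per that blog it's basically a one-liner; roughly this (flag names from memory, check mergekit's docs for the current syntax):

mergekit-extract-lora finetuned_model base_model output_lora --rank=32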
>>
>>106480118
Yeah.. i just found it... should have searched before even asking. Thanks.
>>
>>106478635
https://vocaroo.com/15hDDiLKq7mP
>>
>>106477390
honestly it would be fucking peak comedy lol
>>
>Once a thread for the past like 6 threads someone has proposed what is essentially mixture of a million experts.
Yes you niggers, we've all thought of it, turns out making a competent router for all the microexperts isn't easy.
>>
>>106480195
>Yes you niggers, we've all thought of it, turns out making a competent router for all the microexperts isn't easy.
Why not just train an AI for that? Easy as pie. I'll draw the logo.
>>
>>106480207
Swear on me mum if it's another fuckin catgirl...
>>
File: 1754420491337247.png (71 KB, 234x256)
>>106480215
>>106480207
got the logo
>>
My qwen is talking too much.
>>
>>106480279
why the FUCK did they add that ear piercing? [spoiler]it's too erotic[/spoiler]
>>
>>106480289
fill gwens mouth with your cock, it always works in my experience
>>
File: Untitled.png (37 KB, 684x396)
>>106480310
I never understand the whole gwen thing
>>
>>106480321
>qwen
>q wen
>q when
>q knows
>q-bits QUANTUM COMPUTING
>DIAMOND ROOM TEMPERATURE QUANTUM GPUS
>Q!!! WHEN?!?!
>[some date]
>[screenshot with vague shit]
>>
>>106480321
It's the orange haired girl from that western kids' anime.
>>
>>106477506
>lumping modafinil and l-theanine together as "makes u think better"
ngmi
>>
>>106478635
it felt weird for microshit that the intention was clearly for it to be a base model for further finetuning
guess there can't be anything interesting allowed
>>
>>106478635
>https://huggingface.co/microsoft/VibeVoice-Large
It always linked to WestZhang/VibeVoice-Large-pt
>>
>>106480301
Exactly because of that.
>>
>>106476559
Frankly the more of an arms race there is the better it will be for us. Imagine the "slow and steady" """progress""" we would have if there was no other competition.
>>
File: Untitled.png (661 KB, 1000x710)
>>106475313
>literally held up by electrical tape
lol based
how are your 3090s though? i got mine this jan and its factory thermal pads and paste were shot. It couldn't even sustain 330W without thermal throttling so its stock 390w limit was out of the question
>>
>>106480670
NTA, but all of my 3090s were bought used, and I haven't repasted or padded them, and they run at full tilt fine. Memory does get a bit toasty at 101c when stress testing. However, my cards have a stock limit of 350w.
>>
>>106480670
i forgot to mention i bought a used strix 3090 and it was made in a later production batch in mid 2022. somehow or other the pastes and pads asus used aged really poorly
>>
>>106480670
When I bought my 3090 the fans were rattling, I ordered replacement ones off of Aliexpress.
There was still some rattling of the fan shroud against the heat sink, that I could solve by jamming a small piece of paper into the gap.
>>
>>106480719
I'm planning on getting a 3d printer to print some brackets on the inside of my case to hold more gpus. And while at it, maybe remove the shroud and fans from the 3090s, then print out some ducts from the 140mms to the 3090s. Like a passive card. Maybe hook up the fan out from the 3090s to a controller so I'll still have temperature scaled rpms.

Man 3d printers sound awesome. But all the ones I'm looking at have telemetry and require an internet connection or using their stupid phone app.
>>
>>106480751
>But all the ones I'm looking at have telemetry and require an internet connection or using their stupid phone app.
That's sort of the price you pay for something that's an idiotproof print and go solution like a bambu.
There are plenty that run with open source or multi-platform software, but on the whole they're jankier and as a beginner you're not going to know what 90% of the settings you're configuring do.
Whatever solution you end up going for, make sure it's enclosed. If you want to do heat-resistant ducts and shrouds, you'll need to print them in ABS or better, and that needs to be enclosed to print right.
>>
>>106480751
>But all the ones I'm looking at have telemetry and require an internet connection or using their stupid phone app.
Stop looking at those ones then
>>
>>106480751
then make voron 0.1 kit or something
0.1 is great for small random jigs and stuff
you truly own that at least
>>
>>106480797
Isn't pla okay to 100c? I don't think the print will be directly touching 100c parts right?

>>106480813
What ones should I be looking at? I was looking at the a1 mini because it's 400 aud, and my budget is 500 aud.

>>106480827
I'm not sure if I want to take time off to build my own. I guess that's how the business model works.
>>
>>106480837
pla in 100c will warp like shit lol
>>
>>106480844
Ah shit.
>>
>>106480751
If you're going to use the printer for something else, sure. But if you're going to use it for 3-4 pieces, make the model with a little tolerance and look for a shop to print them for you.
>>
>>106480837
parts of the heat sink may hit 80C under load, i doubt it will get much hotter than that
>>
>>106480837
consult /3dpg/ at /diy/, they'll recommend you some stuff or
>>106480858
this would be better
>>106480875
pla's glass transition temp is like 65c
i wouldn't stick something made of pla for structural integrity inside my pc
>>
>>106480897
Thanks, I'll do more research before coming to a conclusion.
>>
Anyone looked at Apertus yet? Did the swiss cook or is it trash?
>>
>>106480979
They're bragging about how safely they curated their dataset btw.
You can infer what the model's like.
>>
>>106480979
Depending on what paragraph you read, there's 1000, 1500 or 1800 languages in it. Fairly diluted 15T tokens and all of it open and ethical and all that, so probably not that interesting.
Also, it's a 70b and an 8b, so it's not even a new interesting size or much of a new thing.
>>
>>106480979
>1000 languages
>Apertus is trained while respecting opt-out consent of data owners (even retrospectively)
>https://huggingface.co/datasets/swiss-ai/apertus-pretrain-poisonandcanaries/tree/main
>https://huggingface.co/datasets/swiss-ai/polyglotoxicityprompts
>https://huggingface.co/datasets/swiss-ai/realtoxicityprompts
i think you get the idea
>>
b60 DUAL is less exciting than it seems. The only motherboards that can run a b60 dual in the second slot are 900 dollars and require a 2,500 dollar threadripper. This means it's a bit of a nonstarter for anyone gpu stacking. b60 will not run on basically any mobo in the second slot, even nice ones. You will at best get 1 of the two gpus you paid for. Anyone looking to get 96gb+ vram is going to need to spend 4-6k.

It's really only useful for people who wanna go ham on intel support. And as a primary card it could be great for llms. Put your current card in the second slot and when possible lean on it for compatibility. But there are many slapdash ai projects that don't have easy support for that. TTS, video gen, image gen, etc are all gonna be a hassle - and sometimes not even possible. Like good luck getting vibe voice working on intel. Not a single mention on their discord. Qwen image and wan work though so that's cool.
>>
>>106481021
oh, I forgot, the 24gb cards at 500 will be amazing value. Not knocking those at all. They are smaller and less power hungry than 5070 ti supers will be.
>>
>>106480751
I have a 3d printer with no bells and whistles and I hate it so much.
I print stuff 5 times per year and I am still considering buying the new top end bambu so I don't have to drag a piece of paper under the nozzle while bed leveling ever again.
>>
>>106481021
I built my threadripper pro for approximately 500 (mb), 200 (cpu), 250 (ram). Has 6 x16 slots at x16 gen 4, and one x16 at x8 gen 4, and three slimsas 4i. The 6 x16s can be bifurcated to x8/x8 or x4/x4/x4/x4.
>>
>>106480979
Of course!
>>
>>106481021
Why? Is some $100 ddr4 epyc not enough?
>>
>>106481241
He's a gaymer. Your stinky fatbloc low speed cpus aren't good enough for him.
>>
File: toxic.png (50 KB, 613x325)
>>106481016
>pic
wow such toxicity
also kek at the polyglot one starting with arabic
>>
>>106480837
>Isn't pla okay to 100c? I don't think the print will be directly touching 100c parts right
PLA starts to warp badly at like 70c, so even if you printed your shrouds and whatever with 100% infill they'd be fucked in no time flat.
Nothing you can print on an a1 mini should go inside of a computer, unless it's just something little like cable clips.
Unless you're looking to pick up 3d printing as a hobby, you might be better off doing what that other anon suggested and just sending off your cad designs to a shop and having them print 'em for you, though learning the dos and donts of 3d print design without your own printer to make mistakes on can be kind of a pain.
There's some good fundamentals to read about in the following link if you're looking to wrap your head around cad for 3dp
https://blog.rahix.de/design-for-3d-printing/
>>
>>106481348
>just sending off your cad designs to a shop and having them print 'em for you
if you do that, you can get them SLS printed in nylon, and not need to worry about layer adhesion strength and whatever. It's often cheaper than fdm as well.
>>
>>106480321
It's a pedo dog whistle
>>
File: karen.png (64 KB, 1129x455)
What are you doing, karen?
>>
>>106481241
He's a shitjeet paid by nvidia to spread misinformation any time anybody mentions a non Nvidia GPU around here.
>>
File: karen2.png (91 KB, 1067x455)
>>106481449
Oh, karen...
>>
>>106481449
>>106481465
karen is doing her best, ok?
>>
crazy how behind the scenes most focus has shifted towards non-llm models
if a company isn't working on a premier video gen model or world model right now, they will not be relevant anymore by the end of 2026
>>
>>106477629
it's so funny to me how chinese labs mogged the westerners so hard that they have to pretend they don't exist to make their models look relevant
>>
>>106481543
Like new quants papers and ggufs.
>>
>>106480195
Parallel processing is not the same as MoE, as far as I know, in MoE, only 1 expert is active at a time.
>>
What are the library requirements for building CUDA llama.cpp? Apparently there's no mention on the build instructions page.
https://github.com/ggml-org/llama.cpp/blob/master/docs/build.md
>>
>>106481722
you need CUDA to build CUDA llama.cpp
>>
>>106481722
sudo pacman -Syu cuda
>>
>>106481736
"CUDA" is not enough.

.../envs/llamacpp/lib/libcublasLt.so.13: undefined reference to `__cxa_thread_atexit_impl@GLIBC_2.18'
collect2: error: ld returned 1 exit status
gmake[2]: *** [tests/CMakeFiles/test-tokenizer-0.dir/build.make:108: bin/test-tokenizer-0] Error 1
gmake[1]: *** [CMakeFiles/Makefile2:2417: tests/CMakeFiles/test-tokenizer-0.dir/all] Error 2
gmake: *** [Makefile:146: all] Error 2
>>
>>106481762
it is enough. your build environment is fucked somehow.
>>
>>106481762
https://github.com/ggml-org/llama.cpp/blob/master/docs/backend/CUDA-FEDORA.md
See if you're missing anything from here.
>>
>>106481793
Thank you daddy :3 :*
>>
File: dipsyOfCourse2.png (2.9 MB, 1024x1536)
>>106481166
> it's spreading...
>>
>>106481793
Has it been tested yet with CUDA 13?
>>
>>106481762
A Linux package should install both the headers and the shared object files, if a CUDA package was missing the ggml build should be failing during the compilation of ggml rather than the linking.
To me this looks like the CUDA installation itself is broken - it was compiled using some glibc version, downloaded and installed as binary on your system, and now fails to find the glibc library.
>>
>>106481874
>>106481874
>>106481874
>>
>>106481884
On my configuration and a fresh Conda environment, "cmake -B build -DGGML_CUDA=ON --fresh" fails for any CUDA 12.x version.
With CUDA 13.0, it works for that step, but then fails when building with "cmake --build build --config Release".
The system NVidia driver (580.76.05) reports support for CUDA 13.0; I can't downgrade.
I didn't have issues until a few weeks ago, but I had a previous NVidia driver with CUDA 12.x support.
>>
>>106475450
>>106475364
>>106475338
>>106475313
Holy shit that's my picture. Behold me anons. I am happy.
>>
>>106481911
I have installed both CUDA 11 and 12 on one of my systems, to switch from the default CUDA 12 to CUDA 11 I have to do:

export CUDA_HOME=/opt/cuda-11.7 && export PATH=$CUDA_HOME/bin:$PATH && export LD_LIBRARY_PATH=$CUDA_HOME/lib64:$LD_LIBRARY_PATH
>>
>>106481714
>as far as I know, in MoE, only 1 expert is active at a time.
You don't know, then. Because all of the larger MoE models (deepseek, qwen3, glm4.5) use 8 experts per token.
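The routing itself is nothing magic either, just a softmax over the top-k expert logits per token; a bare numpy sketch (ignores the capacity limits and aux losses real models use):

import numpy as np

def route(hidden, gate_w, k=8):
    logits = hidden @ gate_w                      # [tokens, n_experts]
    top = np.argsort(logits, axis=-1)[:, -k:]     # k best experts per token
    top_logits = np.take_along_axis(logits, top, axis=-1)
    e = np.exp(top_logits - top_logits.max(axis=-1, keepdims=True))
    return top, e / e.sum(axis=-1, keepdims=True) # weights renormalized over k

h = np.random.randn(4, 64)        # 4 tokens, hidden size 64
gw = np.random.randn(64, 128)     # 128 routed experts
idx, w = route(h, gw)
print(idx.shape, w.sum(axis=-1))  # (4, 8), each row sums to 1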


