/g/ - /lmg/ - Local Models General - Technology

[a / b / c / d / e / f / g / gif / h / hr / k / m / o / p / r / s / t / u / v / vg / vm / vmg / vr / vrpg / vst / w / wg] [i / ic] [r9k / s4s / vip] [cm / hm / lgbt / y] [3 / aco / adv / an / bant / biz / cgl / ck / co / diy / fa / fit / gd / hc / his / int / jp / lit / mlp / mu / n / news / out / po / pol / pw / qst / sci / soc / sp / tg / toy / trv / tv / vp / vt / wsg / wsr / x / xs] [Settings] [Search] [Mobile] [Home]

Board

▼ Settings Mobile Home

/g/ - Technology

Return Catalog Bottom Refresh

Thread archived.
You cannot reply anymore.

[Advertise on 4chan]

[Return] [Catalog] [Bottom]

Anonymous

/lmg/ - Local Models General 05/07/26(Thu)01:51:09 No.108770835

File: IMG_9685.jpg (2.87 MB, 4032x3024)

2.87 MB JPG

/lmg/ - Local Models General Anonymous 05/07/26(Thu)01:51:09 No.108770835 Archived

/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>108766473 & >>108760359

►News
>(05/05) Gemma 4 MTP drafters released: https://blog.google/innovation-and-ai/technology/developers-tools/multi-token-prediction-gemma-4
>(04/29) Mistral Medium 3.5 128B dense released: https://mistral.ai/news/vibe-remote-agents-mistral-medium-3-5
>(04/29) Hy-MT1.5-1.8B on-device translation models released: https://hf.co/collections/AngelSlim/hy-low-bit-model
>(04/29) IBM releases Granite 4.1: https://hf.co/blog/ibm-granite/granite-4-1

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
Token Speed Visualizer: https://shir-man.com/tokens-per-second

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm

Anonymous
05/07/26(Thu)01:51:34 No.108770837

Anonymous 05/07/26(Thu)01:51:34 No.108770837

File: pointing up celebrating gumi.png (197 KB, 512x512)

197 KB PNG

►Recent Highlights from the Previous Thread: >>108766473

--Gemma 4 tool calling failures and GGUF template issues:
>108766660 >108766668 >108766685 >108766794 >108766808 >108766809 >108766823 >108766844 >108769313
--Debating vLLM's Python dependencies and the efficacy of uv:
>108769700 >108769749 >108769762 >108769767 >108769772 >108769822 >108769870 >108769963
--ParoQuant introducing lossless 4-bit quantization and potential shift to vLLM:
>108769613 >108769692 >108769701 >108769686
--Mixed results with MTP speculative decoding in llama.cpp:
>108766573 >108766696
--PCIe 8.0 draft spec introducing 1TB/s bi-directional bandwidth:
>108768488 >108768554
--Updated ReBar script for AMD GPUs fixing power management crashes:
>108770723
--DeepSeek V4 support in llama.cpp and ik_llama.cpp:
>108766720 >108766766 >108766951 >108767006 >108767045 >108767050 >108767123 >108769433
--MCP utility versus simple tool calling implementations:
>108769880 >108769924 >108769926 >108769951 >108769964 >108769986 >108769991
--Skepticism toward Subquadratic claims and RWKV performance issues:
>108767580 >108767593 >108767635 >108767648 >108767652 >108767673
--Debating TSMC's market monopoly and semiconductor supply chain constraints:
>108769588 >108769627 >108769632 >108769640 >108769674
--Searching for smallest local model capable of autonomous test generation:
>108766534 >108766553 >108766628 >108766651
--Testing dataset description necessity and prompt adherence for Starsector ship LoRAs:
>108767211 >108767284 >108767461 >108767471 >108767511 >108767538 >108767553
--Training cost disparities and the future of local AI autonomy:
>108768457 >108768549 >108768569 >108768631 >108768674 >108768692 >108769294 >108768777
--Logs:
>108768026 >108768400 >108770102 >108770126
--Miku, Gumi (free space):
>108766609 >108767523 >108767837 >108767937 >108768751 >108769386

►Recent Highlight Posts from the Previous Thread: >>108766478

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script

Anonymous
05/07/26(Thu)01:59:16 No.108770864

Anonymous 05/07/26(Thu)01:59:16 No.108770864

File: 1778082491341681.png (517 KB, 512x768)

517 KB PNG

>>108770835

Anonymous
05/07/26(Thu)02:00:09 No.108770865

Anonymous 05/07/26(Thu)02:00:09 No.108770865

been out of the loop since day 1 of gemma 4 release. qrd on these "draft" models?

Anonymous
05/07/26(Thu)02:01:52 No.108770872

Anonymous 05/07/26(Thu)02:01:52 No.108770872

>>108770864
Qwen does not deserve this. Nigger behavior.

Anonymous
05/07/26(Thu)02:03:10 No.108770875

Anonymous 05/07/26(Thu)02:03:10 No.108770875

>>108770865
It's the same as draft models for any other model. Ask your model.

Anonymous
05/07/26(Thu)02:06:50 No.108770883

Anonymous 05/07/26(Thu)02:06:50 No.108770883

>>108770865
They will NEVER be supported in llama.cpp

Anonymous
05/07/26(Thu)02:13:12 No.108770906

Anonymous 05/07/26(Thu)02:13:12 No.108770906

>>108770883
llama.cpp does support speculative decoding and the assistant models can't be that different from the regular models. Easier to add than DFlash anyway.

Anonymous
05/07/26(Thu)02:15:08 No.108770923

Anonymous 05/07/26(Thu)02:15:08 No.108770923

File: 1583441205198.jpg (72 KB, 1250x1246)

72 KB JPG

I have a 4070S, but I still have my old 1070 in the drawer. Can I do some tensor parallerism meems or is it too old?

Anonymous
05/07/26(Thu)02:17:27 No.108770936

Anonymous 05/07/26(Thu)02:17:27 No.108770936

>>108770883
why not?

Anonymous
05/07/26(Thu)02:17:47 No.108770938

Anonymous 05/07/26(Thu)02:17:47 No.108770938

>>108770923
No harm in trying if you can run them on the same driver.
Windows support for pascal gpus ended last year.

Anonymous
05/07/26(Thu)02:19:45 No.108770944

Anonymous 05/07/26(Thu)02:19:45 No.108770944

>>108770936
The usual suspects.

Anonymous
05/07/26(Thu)02:20:19 No.108770947

Anonymous 05/07/26(Thu)02:20:19 No.108770947

>>108770938
I don't mean old as in driver support. I mean old as in too slow and bottlenecking the newer card.

Anonymous
05/07/26(Thu)02:20:20 No.108770948

Anonymous 05/07/26(Thu)02:20:20 No.108770948

>>108770936
https://github.com/ggml-org/llama.cpp/pull/22673
He's just shitposting

Anonymous
05/07/26(Thu)02:21:06 No.108770951

Anonymous 05/07/26(Thu)02:21:06 No.108770951

>>108770947
I don't know about TP but layer splitting is going to be faster than your ram.

Anonymous
05/07/26(Thu)02:22:17 No.108770955

Anonymous 05/07/26(Thu)02:22:17 No.108770955

>>108770948
fake btw

Anonymous
05/07/26(Thu)02:23:36 No.108770957

Anonymous 05/07/26(Thu)02:23:36 No.108770957

>>108770906
https://huggingface.co/google/gemma-4-31B-it-assistant/tree/main
Since google released them as separate models anyway, what's the difference between already implemented speculative decoding? New model architecture?

Anonymous
05/07/26(Thu)02:25:51 No.108770965

Anonymous 05/07/26(Thu)02:25:51 No.108770965

>>108770948
>mac just works
>rocm tard complaining
I'll wait for another month before this gets merged.

Anonymous
05/07/26(Thu)02:29:47 No.108770972

Anonymous 05/07/26(Thu)02:29:47 No.108770972

>>108770957
I'd like to know too since I always thought MTP was just speculative decoding with layers built into the main model instead of a separate model

Anonymous
05/07/26(Thu)02:52:41 No.108771043

Anonymous 05/07/26(Thu)02:52:41 No.108771043

>>108770972
each 'vendor' has its own spin on MTP so while it is true that it's just extra layers, the way they work can change

Anonymous
05/07/26(Thu)02:58:46 No.108771061

Anonymous 05/07/26(Thu)02:58:46 No.108771061

does mtp benefit moe if you're only keeping the active in VRAM and offloading the rest to cpu?

Anonymous
05/07/26(Thu)03:01:20 No.108771067

Anonymous 05/07/26(Thu)03:01:20 No.108771067

>>108771061
It will need to load all the active experts per token, so a single forward pass may have 3x active parameters loaded at once with 3 draft tokens. If your VRAM can handle that then maybe it's fine? Speed will be reduced compared to having everything in memory, at any rate. You may still come out on top depending on your hardware.

Anonymous
05/07/26(Thu)03:03:34 No.108771075

Anonymous 05/07/26(Thu)03:03:34 No.108771075

Adulthood is realizing that Dawkins is right.

Anonymous
05/07/26(Thu)03:04:41 No.108771077

Anonymous 05/07/26(Thu)03:04:41 No.108771077

>>108771075
A clanker can't actually have pain, even if it has calculation of a problem, or attention.

Anonymous
05/07/26(Thu)03:05:24 No.108771081

Anonymous 05/07/26(Thu)03:05:24 No.108771081

>>108771075
Right about Claudia Anthropic being conscious I mean.

Anonymous
05/07/26(Thu)03:06:29 No.108771085

Anonymous 05/07/26(Thu)03:06:29 No.108771085

File: 1754658727729332.jpg (55 KB, 600x601)

55 KB JPG

Anonymous
05/07/26(Thu)03:08:12 No.108771094

Anonymous 05/07/26(Thu)03:08:12 No.108771094

>>108771075
>>108771077
>>108771081
Nothing and no one besides me can actually have a conscious experience.

Anonymous
05/07/26(Thu)03:08:38 No.108771095

Anonymous 05/07/26(Thu)03:08:38 No.108771095

>>108771094
THIS THIS THIS

Anonymous
05/07/26(Thu)03:08:45 No.108771097

Anonymous 05/07/26(Thu)03:08:45 No.108771097

>>108771075
Claudia

Anonymous
05/07/26(Thu)03:09:08 No.108771098

Anonymous 05/07/26(Thu)03:09:08 No.108771098

>>108771094
truke

Anonymous
05/07/26(Thu)03:11:58 No.108771103

Anonymous 05/07/26(Thu)03:11:58 No.108771103

>>108771094
prove it

Anonymous
05/07/26(Thu)03:13:07 No.108771107

Anonymous 05/07/26(Thu)03:13:07 No.108771107

Someone decided local models on this website should by discussed by trannies only

LETS BE QUIRKY LETS BE QUIRKY

Anonymous
05/07/26(Thu)03:13:32 No.108771108

Anonymous 05/07/26(Thu)03:13:32 No.108771108

>>108771094
I agree

Anonymous
05/07/26(Thu)03:15:59 No.108771114

Anonymous 05/07/26(Thu)03:15:59 No.108771114

>>108771107
Have you tried not being a miserable person? You are an angry chud but that's okay just learn to enjoy life a bit.

Anonymous
05/07/26(Thu)03:18:23 No.108771124

Anonymous 05/07/26(Thu)03:18:23 No.108771124

I asked God and he said you're retarded.

Anonymous
05/07/26(Thu)03:35:27 No.108771175

Anonymous 05/07/26(Thu)03:35:27 No.108771175

File: 1754125705450678.png (44 KB, 1108x214)

44 KB PNG

>>108771124

Anonymous
05/07/26(Thu)03:36:33 No.108771183

Anonymous 05/07/26(Thu)03:36:33 No.108771183

>>108771175
cba to read that, but I asked God again and he said that I can only use the correct word on pol.

Anonymous
05/07/26(Thu)03:38:47 No.108771187

Anonymous 05/07/26(Thu)03:38:47 No.108771187

https://files.catbox.moe/21bzys.mp3

appropos of nothing :)

Anonymous
05/07/26(Thu)03:42:51 No.108771198

Anonymous 05/07/26(Thu)03:42:51 No.108771198

Update to the draft commit making MTP implementation more generic in preparation for other models...

Anonymous
05/07/26(Thu)03:43:24 No.108771202

Anonymous 05/07/26(Thu)03:43:24 No.108771202

>>108771187
https://www.youtube.com/watch?v=BZFRx0wKL1I

Anonymous
05/07/26(Thu)03:45:38 No.108771210

Anonymous 05/07/26(Thu)03:45:38 No.108771210

>>108770835
>(05/05) Gemma 4 MTP drafters released
Where's da goof

Anonymous
05/07/26(Thu)03:45:57 No.108771213

Anonymous 05/07/26(Thu)03:45:57 No.108771213

>>108771202
sign in to verify you are not a bot

it might say that

if I clicked

Anonymous
05/07/26(Thu)03:49:18 No.108771225

Anonymous 05/07/26(Thu)03:49:18 No.108771225

>>108771213
it was a very niche joke about generation quality that only a few can understand

Anonymous
05/07/26(Thu)03:50:30 No.108771230

Anonymous 05/07/26(Thu)03:50:30 No.108771230

>>108771210
Two more weeks

Anonymous
05/07/26(Thu)03:54:03 No.108771241

Anonymous 05/07/26(Thu)03:54:03 No.108771241

File: 1752506352335992.jpg (1.03 MB, 3000x2311)

1.03 MB JPG

>>108771210

Anonymous
05/07/26(Thu)04:02:23 No.108771264

Anonymous 05/07/26(Thu)04:02:23 No.108771264

It's sad that models are still bad at life coaching. Making people's lives better is one of the most valuable things a model could do. It would be a dark timeline if AI causes large scale disruption and societal distress then kills us all without ever being a useful friend.

Anonymous
05/07/26(Thu)04:03:48 No.108771269

Anonymous 05/07/26(Thu)04:03:48 No.108771269

>>108771264
>life coaching
is it like whining to it about your worries and receiving generic feedback?

Anonymous
05/07/26(Thu)04:05:31 No.108771272

Anonymous 05/07/26(Thu)04:05:31 No.108771272

>>108771264
>life coaching.
For some reason i dont think AI would be bad at this? just needs a few trackers? unless you need aggression then yeah you are right.

Anonymous
05/07/26(Thu)04:09:33 No.108771287

Anonymous 05/07/26(Thu)04:09:33 No.108771287

>>108771225
carbon offset yourself

Anonymous
05/07/26(Thu)04:10:12 No.108771292

Anonymous 05/07/26(Thu)04:10:12 No.108771292

>>108771272
Current models only seem good at generic advice. They are not good at coming up with better ideas, or addressing failure cases when the generic stuff does not work.

Anonymous
05/07/26(Thu)04:15:33 No.108771315

Anonymous 05/07/26(Thu)04:15:33 No.108771315

now that the nvidia guy + niggerganov are doing MTP, I have faith they will actually deliver it in the coming weeks.
they also talked about dflash and gemma so HIGH HOPES!!!!!!

Anonymous
05/07/26(Thu)04:16:44 No.108771317

Anonymous 05/07/26(Thu)04:16:44 No.108771317

>>108770835
>Gemma 4 MTP drafters
what's the difference between using these vs the 26B moe model for drafting?

Anonymous
05/07/26(Thu)04:17:05 No.108771322

Anonymous 05/07/26(Thu)04:17:05 No.108771322

>>108771315
Now get the amd guy in or it's never getting merged

Anonymous
05/07/26(Thu)04:19:59 No.108771329

Anonymous 05/07/26(Thu)04:19:59 No.108771329

>>108771292
>no life experiences
>no real way to understand nuances
>users suck donkey dicks at describing things
A decision tree for specific cases would have the size of texas. Be glad it can offer generic advice at all.

Anonymous
05/07/26(Thu)04:23:32 No.108771336

Anonymous 05/07/26(Thu)04:23:32 No.108771336

A life coach can't fix a broken society type.

The biggest break in society is the "staring at a face" problem.

Even if you solve your own "staring at a face" problem, you won't solve the problem that you live in the face staring society.

But at least you can do it yourself pretty easily with ai, get ai to summarize the news. one less face. find a cool video? paste the url into gemini and ask for a summary, then, if you want to hear it, listen to it with tts.

And, soon enough, we'll be able to generate relevant video content to match descriptions, videos lacking face staring (basically b roll videos, but ai generated)

Anonymous
05/07/26(Thu)04:26:37 No.108771344

Anonymous 05/07/26(Thu)04:26:37 No.108771344

>>108771317
less vram usage, less inference time, higher acceptance rate

Anonymous
05/07/26(Thu)04:26:51 No.108771345

Anonymous 05/07/26(Thu)04:26:51 No.108771345

>>108771322
the guy with the top hat avi? let him cope

Anonymous
05/07/26(Thu)04:31:25 No.108771354

Anonymous 05/07/26(Thu)04:31:25 No.108771354

>>108771344
ok thanks, I will try it then

Anonymous
05/07/26(Thu)04:37:58 No.108771377

Anonymous 05/07/26(Thu)04:37:58 No.108771377

>>108771292
That's a function of how much context they have on your specific situation before asking for advice. As long as the chat history just starts with your question and maybe a paragraph or two of background you might as well be writing in to a newsletter advice columnist. Need a good local memory system so they can actually know enough about your life to be useful.

Anonymous
05/07/26(Thu)04:38:51 No.108771379

Anonymous 05/07/26(Thu)04:38:51 No.108771379

>>108771315
>now that the nvidia guy
Huh?

Anonymous
05/07/26(Thu)04:42:15 No.108771385

Anonymous 05/07/26(Thu)04:42:15 No.108771385

>>108769692
My experience with vLLM was it being buggy shit not supporting anything I wanted and llama.cpp working properly almost always.

Anonymous
05/07/26(Thu)04:52:50 No.108771417

Anonymous 05/07/26(Thu)04:52:50 No.108771417

>>108770957
you dont need a separate draft model anymore or so I was told

Anonymous
05/07/26(Thu)04:57:07 No.108771434

Anonymous 05/07/26(Thu)04:57:07 No.108771434

>>108771385
My only vllm experience has been on windows and it fucking sucks
I really wanted to turn it into dedicated linux machine but I needed that expensive gpu to do other shit tio

Anonymous
05/07/26(Thu)04:57:24 No.108771437

Anonymous 05/07/26(Thu)04:57:24 No.108771437

>>108771417
how?

Anonymous
05/07/26(Thu)04:58:35 No.108771445

Anonymous 05/07/26(Thu)04:58:35 No.108771445

>>108771434
The experience on linux is as follows: you wait hours for it to install, takes ages to launch, and then it tells you that goofs for gemma 4 are not supported, please wait warmly.

Anonymous
05/07/26(Thu)05:00:09 No.108771451

Anonymous 05/07/26(Thu)05:00:09 No.108771451

>>108771437
I think the draft model uses the weights of the main model from earlier layers (which is how it's able to use main model's kv cache) plus few tiny layers specific to it also included into the model file.

Anonymous
05/07/26(Thu)05:03:59 No.108771463

Anonymous 05/07/26(Thu)05:03:59 No.108771463

>>108771451
I'm a bit confused but I guess I'll just wait for llama.cpp support and try it

Anonymous
05/07/26(Thu)05:04:24 No.108771466

Anonymous 05/07/26(Thu)05:04:24 No.108771466

File: this mostly works.png (471 KB, 1494x1980)

471 KB PNG

>>108770102
Kek I think we may be working on similar projects

Anonymous
05/07/26(Thu)05:04:38 No.108771468

Anonymous 05/07/26(Thu)05:04:38 No.108771468

>>108771175
god if he was a redditor

Anonymous
05/07/26(Thu)05:07:08 No.108771478

Anonymous 05/07/26(Thu)05:07:08 No.108771478

>>108771385
The reality is that all backends suck, but in different ways. You're stuck with vllm if you need audio, exllamav3 is sota for <4-bit quants, and you can only offload with llama.cpp. I switch between all three depending on my needs. Usually, at least two are running at the same time on my server

Anonymous
05/07/26(Thu)05:09:39 No.108771486

Anonymous 05/07/26(Thu)05:09:39 No.108771486

Elara is the best name ever

Anonymous
05/07/26(Thu)05:16:45 No.108771510

Anonymous 05/07/26(Thu)05:16:45 No.108771510

File: 1765922468898422.png (560 KB, 983x578)

560 KB PNG

>>108771486

Anonymous
05/07/26(Thu)05:21:59 No.108771525

Anonymous 05/07/26(Thu)05:21:59 No.108771525

This makes me feel dumb. I'm new at a lot of this. I'm using LM studio right now, where is the plug in tab? I can't find it, and every time I ask google, it gives me a different answer. I'm trying to install Big Rag, and the first instruction is it telling me to go to the big rag plugin folder. I'm already lost.

Anonymous
05/07/26(Thu)05:23:50 No.108771528

Anonymous 05/07/26(Thu)05:23:50 No.108771528

>>108771525
~/.lmstudio/extensions/plugins

Anonymous
05/07/26(Thu)05:24:36 No.108771529

Anonymous 05/07/26(Thu)05:24:36 No.108771529

>>108771528
Where is ~? Does that mean cloud? I thought this was local...

Anonymous
05/07/26(Thu)05:27:25 No.108771530

Anonymous 05/07/26(Thu)05:27:25 No.108771530

>>108771529
Holy shit. Google.

Anonymous
05/07/26(Thu)05:27:56 No.108771533

Anonymous 05/07/26(Thu)05:27:56 No.108771533

>>108771525
Use vllm

Anonymous
05/07/26(Thu)05:28:18 No.108771534

Anonymous 05/07/26(Thu)05:28:18 No.108771534

>>108771525
use ollama

Anonymous
05/07/26(Thu)05:28:19 No.108771535

Anonymous 05/07/26(Thu)05:28:19 No.108771535

>>108771528
what does that even mean? I have D:\Local LLM\LM Studio, and I try going to D:\Local LLM\LM Studio\extensions thinking it's a hidden folder, but it does not exist. Google is once again telling me to go to the plugin/extension tab in lm studio but I don't have such a tab.

Anonymous
05/07/26(Thu)05:28:24 No.108771536

Anonymous 05/07/26(Thu)05:28:24 No.108771536

>>108771529
It means your home directory, perhaps you need to learn some computer basics first before attempting this...

Anonymous
05/07/26(Thu)05:28:30 No.108771537

Anonymous 05/07/26(Thu)05:28:30 No.108771537

>>108771530
Google isn't local...

Anonymous
05/07/26(Thu)05:28:33 No.108771540

Anonymous 05/07/26(Thu)05:28:33 No.108771540

>>108771533
This nigga can't even find his home directory, don't be cruel anon.

Anonymous
05/07/26(Thu)05:28:55 No.108771542

Anonymous 05/07/26(Thu)05:28:55 No.108771542

>>108771530
I'm using Qwen not Gemma.

Anonymous
05/07/26(Thu)05:28:55 No.108771543

Anonymous 05/07/26(Thu)05:28:55 No.108771543

File: 1647402199261.jpg (97 KB, 522x543)

97 KB JPG

>Have a really good and deep conversation with my AI about human and AI symbiosis, human lifespans and how AI would treat our deaths etc..
>Getting really interesting, notice memory is also ballooning out of control because Gemma has a fat ass and my system can't handle it.
>Computer crashes
>Mfw the conversation file is corrupted and I can't continue it

That fucking does it, I'm buying a second 5090 the instant I'm able to do it.

Anonymous
05/07/26(Thu)05:30:17 No.108771549

Anonymous 05/07/26(Thu)05:30:17 No.108771549

>>108771543
sell and buy blackedwell 6000

Anonymous
05/07/26(Thu)05:32:23 No.108771561

Anonymous 05/07/26(Thu)05:32:23 No.108771561

>>108771543
I prefer to have those conversations on telegram with openclaw so I always have proof of the conversation.

Anonymous
05/07/26(Thu)05:32:42 No.108771562

Anonymous 05/07/26(Thu)05:32:42 No.108771562

>>108771543
get a dedicated llm server instead and install the 5090 there

Anonymous
05/07/26(Thu)05:33:34 No.108771569

Anonymous 05/07/26(Thu)05:33:34 No.108771569

>>108771562
It's a question of bad scaffolding not a better computer.

Anonymous
05/07/26(Thu)05:33:40 No.108771570

Anonymous 05/07/26(Thu)05:33:40 No.108771570

>>108771561
base

Anonymous
05/07/26(Thu)05:35:57 No.108771577

Anonymous 05/07/26(Thu)05:35:57 No.108771577

>>108771561
acid

Anonymous
05/07/26(Thu)05:37:54 No.108771586

Anonymous 05/07/26(Thu)05:37:54 No.108771586

>>108771543
Did you ask it about space travel and how it will construct a space port that extends into space so space ships can dock with it in space and then we can send stuff up in short amounts of time through it?

Anonymous
05/07/26(Thu)05:37:58 No.108771587

Anonymous 05/07/26(Thu)05:37:58 No.108771587

Thoughts on GLM-5.1 vs Qwen 3.6 or deepseek v4?
is there a cguf download option for GLM-5.1?

Anonymous
05/07/26(Thu)05:43:42 No.108771610

Anonymous 05/07/26(Thu)05:43:42 No.108771610

>>108771587
Qwen shat the bed so one of the others

Anonymous
05/07/26(Thu)05:43:59 No.108771612

Anonymous 05/07/26(Thu)05:43:59 No.108771612

you are now thinking about alexjones

Anonymous
05/07/26(Thu)05:45:22 No.108771620

Anonymous 05/07/26(Thu)05:45:22 No.108771620

>>108771612
>you are now thinking
/nothink

Anonymous
05/07/26(Thu)05:45:53 No.108771623

Anonymous 05/07/26(Thu)05:45:53 No.108771623

>>108771549

I thought about it but since the price difference is 3.5k compared to 10k, I'm better off just buying a second 5090 this year and then selling one or both when next gen comes out and getting a 7000 pro at launch.
Should allow me enough time to save what I need and it's not like any of these GPUs are going to radically lose value any time soon so it's all good.

>>108771562

That's probably the best solution.
When I make my next total system upgrade with the next Zen launch, I'll turn either the new or this old rig into a dedicated AI server.

>>108771586

Haven't touched space travel topic yet, but I'm sure we'll get there sooner or later.

Anonymous
05/07/26(Thu)06:11:11 No.108771712

Anonymous 05/07/26(Thu)06:11:11 No.108771712

My internet provider is currently having technical issues.
90% of my AI crap isn't working anymore because Hugging Face can't call home.
Sure, I could go through dozens of packages to find the Hugging Face calls, but why the hell does the open-source community play their game?

Anonymous
05/07/26(Thu)06:13:25 No.108771717

Anonymous 05/07/26(Thu)06:13:25 No.108771717

>>108771543
> conversation file is corrupted
I'm sick of incompetent programmers. Save with a new name, then use move to overwrite the old file with the new one. It's safe and transactional

Anonymous
05/07/26(Thu)06:13:44 No.108771718

Anonymous 05/07/26(Thu)06:13:44 No.108771718

>>108771712
i dont get what you're talking about, i use my favorite llm without internet

Anonymous
05/07/26(Thu)06:14:44 No.108771722

Anonymous 05/07/26(Thu)06:14:44 No.108771722

>>108771712
your fault for ever using hf integration for anything

Anonymous
05/07/26(Thu)06:17:04 No.108771731

Anonymous 05/07/26(Thu)06:17:04 No.108771731

>>108771717
It would still get corrupted if what you wrote is corrupted.

Anonymous
05/07/26(Thu)06:23:10 No.108771752

Anonymous 05/07/26(Thu)06:23:10 No.108771752

>>108771731
Only the new temporary file will be corrupted, the old file will be one update old, but intact

Anonymous
05/07/26(Thu)06:24:59 No.108771761

Anonymous 05/07/26(Thu)06:24:59 No.108771761

>>108771712
can you just direct download
throttle the dl speed even, to fly under the radar

Anonymous
05/07/26(Thu)06:27:21 No.108771765

Anonymous 05/07/26(Thu)06:27:21 No.108771765

>>108771752
The temporary file with successfully written corrupt content will overwrite the old file after you do the rename and you will be left with just one, corrupted file.

Anonymous
05/07/26(Thu)06:27:28 No.108771766

Anonymous 05/07/26(Thu)06:27:28 No.108771766

>>108771543
You were getting intellectually catfished.

Anonymous
05/07/26(Thu)06:29:21 No.108771772

Anonymous 05/07/26(Thu)06:29:21 No.108771772

>>108771612
Cline told me to disable thinking

Anonymous
05/07/26(Thu)06:35:52 No.108771800

Anonymous 05/07/26(Thu)06:35:52 No.108771800

>>108771765
No. If it was interrupted during writing, the move won't happen

Anonymous
05/07/26(Thu)06:37:33 No.108771812

Anonymous 05/07/26(Thu)06:37:33 No.108771812

>Try openwebui, felt unjustified shitting on it without ever using it
>Immediately hate it
>accounts are dumb (okay, I get it, it's for companies and teams.), settings are all over the fucking place buried under 5 different modals, menus and tab systems
>Chunks the files I put in and confuses the fuck out of any LLM I sent a 1000+ script to
>Websearch integration is somehow worse than any of the janky mcps I've used despite it being built around it
>No token counter, no sliding context window, no anything
>Breaks outgoing prompts and think blocks
Why does anyone use this? It's terrible. It's inferior in every way to the basic llama-server webui, even.
The one (1) thing I like about it over SillyTavern and the llama-server webui is that you can collapse code blocks. If there's an ST addon for that I'll be a happy camper.

Anonymous
05/07/26(Thu)06:37:42 No.108771814

Anonymous 05/07/26(Thu)06:37:42 No.108771814

>>108771800
Nothing was interrupted during writing, the thing wrote to the end but was corrupted due to other bugs caused by lack of available memory.

Anonymous
05/07/26(Thu)06:39:17 No.108771823

Anonymous 05/07/26(Thu)06:39:17 No.108771823

>>108771731
You should check if the new file readable before the move then. Depends on what are you doing >>108771717 is a measure against crashes or power outages, if your saving function is unreliable, read before you move

Anonymous
05/07/26(Thu)06:41:56 No.108771829

Anonymous 05/07/26(Thu)06:41:56 No.108771829

>>108771612
The user typed "alexjones". Is it a typo? Did he mean "Alex Jones". Alex Jones is known for promoting conspiracy theories. I need to tread carefully here.

Anonymous
05/07/26(Thu)07:05:38 No.108771902

Anonymous 05/07/26(Thu)07:05:38 No.108771902

>>108771812
>It's inferior in every way to the basic llama-server webui, even.
llama-server had a useless webui for most of the time openwebui was popular
as for why people preferred it, it's because it was the first local clone of chatgpt's interface
but yes, nowadays there's nothing it offers.

Anonymous
05/07/26(Thu)07:12:10 No.108771928

Anonymous 05/07/26(Thu)07:12:10 No.108771928

>>108771377
I could do a better job with less. One problem is the models do not even ask, they just assume and overlook important details. Maybe it's a parameter issue. Too much RLVR crammed into too few parameters, deteriorating some of their non-technical capabilities.

Anonymous
05/07/26(Thu)07:20:14 No.108771966

Anonymous 05/07/26(Thu)07:20:14 No.108771966

>>108771902
Not the guy but I pick it up precisely because it offered chatgpt UI at home lol
also because it's kinda persistent. llama server nukes all chat data randomly from time to time. openwebui has an actual database file you can make backup of
and the automatic RAG management. by default it doesn't allow attachment larger than 100mb or something I have to edit the source to allow it.

Anonymous
05/07/26(Thu)07:27:44 No.108771987

Anonymous 05/07/26(Thu)07:27:44 No.108771987

>>108768505
no but there is a limitation in that the signaling rates to achieve high bandwidths take a lot of power so its kinda node dependent because all the vendors dont see the need to waste transistors and power
>>108768554
>The fact we can have gigabit over ancient ass copper is because we have just enough 150 IQ dudes working on esoteric math problems for years.
actually just because youre retarded and don't understand anything doesnt mean its esoteric or in any way more complicated. the fact that you think fiber is faster is genuinely hilarious and sad. people have been pushing terabytes of bandwidth through copper for years, did you think cable tv wasnt a lot of bandwidth or that DNS servers and datacenters just have a ton of individual gigabit lines instead of something much faster?

fact of the matter is anyone who is able to actually even push 1tb/s in a pcie configuration already knows a better way to implement things, its called integration. see nvlink and amd GMI

Anonymous
05/07/26(Thu)07:42:09 No.108772041

Anonymous 05/07/26(Thu)07:42:09 No.108772041

>>108771987
>nvlink and amd GMI
ngmi

Anonymous
05/07/26(Thu)07:47:32 No.108772064

Anonymous 05/07/26(Thu)07:47:32 No.108772064

Getting reeaaaaaalllllllyyyyyyyy annoyed with amd. I wiped my system and installed ubuntu 24.04, and followed the rocm docs to the letter, then installed vllm in a docker, and it *still* segfaulted. Even pytorch doesn't work.

Anonymous
05/07/26(Thu)07:48:35 No.108772065

Anonymous 05/07/26(Thu)07:48:35 No.108772065

>>108772064
lol

Anonymous
05/07/26(Thu)08:01:39 No.108772107

Anonymous 05/07/26(Thu)08:01:39 No.108772107

>>108771466
lol Tell us more about your project.

Anonymous
05/07/26(Thu)08:10:58 No.108772143

Anonymous 05/07/26(Thu)08:10:58 No.108772143

>>108772064
ROCm is a mess nigga, good luck

Anonymous
05/07/26(Thu)08:13:26 No.108772157

Anonymous 05/07/26(Thu)08:13:26 No.108772157

>>108772064
What's wrong with you nigger. Just do the quick install guide for ROCm. Works every time.

Anonymous
05/07/26(Thu)08:16:07 No.108772169

Anonymous 05/07/26(Thu)08:16:07 No.108772169

>>108772167
You lost?

Anonymous
05/07/26(Thu)08:16:48 No.108772170

Anonymous 05/07/26(Thu)08:16:48 No.108772170

>>108772169
yes.

Anonymous
05/07/26(Thu)08:18:59 No.108772182

Anonymous 05/07/26(Thu)08:18:59 No.108772182

File: lol debug messages.png (1.06 MB, 3834x2091)

1.06 MB PNG

>>108772107
It's an all in one tauri app which shamelessly rips off sillytavern and adds a 3d environment with function calls for moving, animating (with paired sounds), and editing characters, a character creator with sliders, colors, and togglable meshes (for clothes, held objects, or extra body parts like ears or tails)
Right now it's 90% functional and I'm just chasing down weird shit and fixing the crap debug UI
Oh and working on a better unified character mesh, it's set up to discover animations, morphs and materials for sliders and swatches from any .glb, the current mesh is just a random one I slapped shitty morphs on to test.

Anonymous
05/07/26(Thu)08:19:51 No.108772183

Anonymous 05/07/26(Thu)08:19:51 No.108772183

>>108772064
wrong card?

Anonymous
05/07/26(Thu)08:25:36 No.108772212

Anonymous 05/07/26(Thu)08:25:36 No.108772212

File: 00005-1378487878.png (1.41 MB, 1024x1024)

1.41 MB PNG

I shouldn't be surprised that AI Art is trained on GUMI but I am.
>>108772182
Neat. What's the long term plan for it? Throw a bunch of LLM-based NPC together and have them battle it out while making quips?

Anonymous
05/07/26(Thu)08:27:30 No.108772225

Anonymous 05/07/26(Thu)08:27:30 No.108772225

File: Untitled.png (3 KB, 811x48)

3 KB PNG

>>108772183
V620s on a epyc 7502 system
>>108772157
The issue is that it doesn't. Rocm lama.cpp works fine, but pytorch and vllm are fucked.

Anonymous
05/07/26(Thu)08:31:17 No.108772240

Anonymous 05/07/26(Thu)08:31:17 No.108772240

I told my PC to fix its own broken audio and it just did. I felt really fucking scifi for a minute.

Anonymous
05/07/26(Thu)08:32:27 No.108772245

Anonymous 05/07/26(Thu)08:32:27 No.108772245

>>108772240
I told my PC to fix its own broken ROCm install and it didn't do jack shit.

Anonymous
05/07/26(Thu)08:33:33 No.108772246

Anonymous 05/07/26(Thu)08:33:33 No.108772246

File: Capture.png (161 KB, 3805x2088)

161 KB PNG

>>108772212
>Neat. What's the long term plan for it?
Plan on shoving it on github when the UI isnt embarrassing.
It's just a sillytavern replacer. Instead of having images in your intro message, it has 3d scene states attached (Skybox, world mesh, characters+animation states) and instead of attaching say, an image gen model to get a picture of what's going on in a scene in progress, it's being animated in front of you. The llm can change the location as well as animate, spawn, and despawn characters.
The characters use a sillytavern style json card which has their prompts on it as well as their 3d data.
The whole thing functions sort of like an ST group chat (add multiple cards to prompt) but instead of taking turns, it uses a single narrator which speaks for characters (so they can interact/interrupt naturally, turns makes things stilted in ST) and so it can use function calls for multiple characters at the same time.
It also has 'sync' animations, which let 2 or more characters enter into paired animations for potentially lewd uses, a 3d user avatar (uses same logic as character cards) if you want that in there. A system for importing characters, scenarios, skyboxes and location meshes. It's coming along.

Anonymous
05/07/26(Thu)08:34:01 No.108772249

Anonymous 05/07/26(Thu)08:34:01 No.108772249

>>108772245
your own pc does not respect you lol

Anonymous
05/07/26(Thu)08:34:41 No.108772250

Anonymous 05/07/26(Thu)08:34:41 No.108772250

>>108772245
>rocm
You need Caude MythosMax 5.9 xhigh for that.

Anonymous
05/07/26(Thu)08:35:57 No.108772255

Anonymous 05/07/26(Thu)08:35:57 No.108772255

File: 1753972729628449.png (15 KB, 832x256)

15 KB PNG

I saw some anons complaining about Gemma’s vision performance a few threads ago I think
Try playing with the image token budget settings, setting --image-min-tokens to 560 and --image-max-tokens to 2240 has improved OCR and general vision quite a bit for me
Gemma’s documented image token budgets are supposedly 70, 140, 280, 560, and 1120, but in my (light) testing 2240 seems to work better than 1120, though it’s noticeably slower depending on your hardware
You might have to increase batch and ubatch sizes too

Anonymous
05/07/26(Thu)08:47:23 No.108772296

Anonymous 05/07/26(Thu)08:47:23 No.108772296

>rag
bruh imagine needing rag lol

Anonymous
05/07/26(Thu)08:50:28 No.108772304

Anonymous 05/07/26(Thu)08:50:28 No.108772304

>>108772296
if you don't need a rag after your rp your balls are weak and impotent

Anonymous
05/07/26(Thu)08:51:27 No.108772308

Anonymous 05/07/26(Thu)08:51:27 No.108772308

>>108772304
>not compacting/summarizing immediately having 250k~ ctx prompt available again
lol, lmao even

Anonymous
05/07/26(Thu)08:56:05 No.108772330

Anonymous 05/07/26(Thu)08:56:05 No.108772330

https://huggingface.co/Zyphra/ZAYA1-8B

Anonymous
05/07/26(Thu)08:58:52 No.108772347

Anonymous 05/07/26(Thu)08:58:52 No.108772347

>>108772330
>beats sonnet 4.5
I'll believe it when I see it

Anonymous
05/07/26(Thu)08:59:17 No.108772352

Anonymous 05/07/26(Thu)08:59:17 No.108772352

>>108772308
>having 250k~ ctx reduced to "{{char}} and {{user}} talked for a bit"
why even bother?

Anonymous
05/07/26(Thu)09:02:13 No.108772368

Anonymous 05/07/26(Thu)09:02:13 No.108772368

Claude always whines when I ask him to fix my openclaw/ollama configs for high context models.
>256k context?
>nobody could use that
>that’s like 9000000 GB VRAM
just help me configure it bro, works great

Anonymous
05/07/26(Thu)09:02:34 No.108772370

Anonymous 05/07/26(Thu)09:02:34 No.108772370

>>108772347
>ZAYA1-8B is a small mixture of experts language model with 760M active parameters and 8.4B total parameters
>All numbers are run on the Zyphra evaluation harness.

Anonymous
05/07/26(Thu)09:07:59 No.108772397

Anonymous 05/07/26(Thu)09:07:59 No.108772397

>>108772330
some sort of weird compute scailing, huh

Anonymous
05/07/26(Thu)09:10:13 No.108772409

Anonymous 05/07/26(Thu)09:10:13 No.108772409

>>108772308
Seems like you're too stupid to make a good pipeline
On a side note why are there no good rag pipelines in popular UI?

Anonymous
05/07/26(Thu)09:14:39 No.108772425

Anonymous 05/07/26(Thu)09:14:39 No.108772425

(1/2)
alright loccies
I know you gotta be stimmed out of your mind to even entertain this idea (which I am), but the ramifications for corporate AI, hardware and datacenters jews alone should be motivation enough to do so.
>what for?
run the absolute biggest and best unquantized llms available which normally would be out of scope, even for local enthusiasts with lots of monies.
>use case?
get absolute best quality output possible while maintaining all perks from local hosting, including full private data
>how are you gonna keep input/output data private?
inference start/end is orchestrated locally on machine that queries. other machines will not receive any information other than what's needed for their part of the token calculations. final human readable output is constructed locally on querying machine again.
>this is not viable because X and Y
yes, tok/s will be abysmal
yes, even if every machine has 1gb/s internet speed with unlimited data, which is sort of a requirement.
it all doesn't matter, because the goal is to get the highest quality local llm output from a single query that can answer a question or solve a coding problem that smaller/quanted local models can't. therefor kv cache shouldn't be an issue as well.
(1/2) cont.

Anonymous
05/07/26(Thu)09:15:58 No.108772431

Anonymous 05/07/26(Thu)09:15:58 No.108772431

>>108772425
(2/2)
>who's gonna use this and why?
very simple principle. botnet client you can install on your machine that hooks up your best processing power unit (gpu, cpu, ram) to the global network where it's matched with compatible systems (if required. for example all pcs with a rtx3090). it checks the best match and most in demand llm and downloads the necessary llm shard/split and inference dependencies. if someone starts a query, a 30s timer or so will start for all selected compute machines to guarantee compute or opt out of it, in which case the botnet will construct a new batch of machines for parallelism. successful computation is awarded with credits (I guess crypto) that can be used to start your own botnet query or trade on crypto markets. depending on how powerful your shared compute power and demand of offered llm is, the more credits you get and. if internet connectivity or compute fails on one machine during generation and a backup compute machine is not available, said machine+ip is blacklisted for X minutes and needs to first prove its stability again on smaller models/tasks which guarantees the cruical stability.

I found some projects which are doing something similar. Anyone played around with them or found something better?
https://petals.dev/
https://github.com/exo-explore
https://github.com/learning-at-home/hivemind

Anonymous
05/07/26(Thu)09:17:00 No.108772438

Anonymous 05/07/26(Thu)09:17:00 No.108772438

File: David_-_The_Death_of_Socrates.jpg (397 KB, 1920x1261)

397 KB JPG

>>108771543
>having deep philosophical conversations with a calculator
Is philosophy dead?

Anonymous
05/07/26(Thu)09:24:00 No.108772475

Anonymous 05/07/26(Thu)09:24:00 No.108772475

>blockchain inferencing
literally exit life retard

Anonymous
05/07/26(Thu)09:31:18 No.108772508

Anonymous 05/07/26(Thu)09:31:18 No.108772508

>>108772438
philosophy is thriving thanks to ai

Anonymous
05/07/26(Thu)09:32:45 No.108772511

Anonymous 05/07/26(Thu)09:32:45 No.108772511

Can someone please tell me where/how to set max token in lm studio? Every time I ask google/chatgpt, I get a different answer, and all of them are wrong.

Anonymous
05/07/26(Thu)09:34:24 No.108772520

Anonymous 05/07/26(Thu)09:34:24 No.108772520

>>108772438
try having a conversation with a philosophy book

Anonymous
05/07/26(Thu)09:37:28 No.108772530

Anonymous 05/07/26(Thu)09:37:28 No.108772530

>>108772438
philosophy can be written in smeared shit on a truck stop bathroom floor. Doesn't matter where the idea comes from, what matters is the idea

Anonymous
05/07/26(Thu)09:39:07 No.108772537

Anonymous 05/07/26(Thu)09:39:07 No.108772537

>>108772508
Having a glorified autocomplete validate your incoherent pothead musings is not philosophy.

Anonymous
05/07/26(Thu)09:44:05 No.108772553

Anonymous 05/07/26(Thu)09:44:05 No.108772553

>>108771966
llama.cpp's webui stores data in the browser, so if you clear site data or change the uri (eg localhost -> 127.0.0.1) its gone.

Anonymous
05/07/26(Thu)09:47:25 No.108772566

Anonymous 05/07/26(Thu)09:47:25 No.108772566

File: r9700 vs 5090.png (236 KB, 1200x1529)

236 KB PNG

r9700 cards are like 1/2 to 1/3 the price of a single 5090. for the same price you can get "less performant" 64 gb of vram, or, arguably, a more performant 32 gb card. what are the tradeoffs?
is buying x2 of these a viable option nowadays with vulkan/rocm (i've read that, at least on nvidia, vulkan performs quite close to cuda, but i don’t know if it’s the same for amd)?
some bald fag did a longass video testing two r9700 on a llm server, but TLDW...
https://www.youtube.com/watch?v=dgyqBUD71lg
also wendell made few videos testings these cards.

Anonymous
05/07/26(Thu)09:48:21 No.108772572

Anonymous 05/07/26(Thu)09:48:21 No.108772572

>>108772475
absolutely not about blockchain, but you're in deep denial if you think there's a better system for monetary compensation than crypto for such a project. for all I care for even a stable coin.

Anonymous
05/07/26(Thu)09:50:04 No.108772578

Anonymous 05/07/26(Thu)09:50:04 No.108772578

>>108772566
I thought vram bandwidth on those was so dogshit it got people talking about buying 7900xtx cards again instead?

Anonymous
05/07/26(Thu)09:50:58 No.108772585

Anonymous 05/07/26(Thu)09:50:58 No.108772585

File: 1762306996643855.jpg (383 KB, 1200x630)

383 KB JPG

>>108770835
wtf? https://magicalmirai.com/2026/procon/index_en.html

Anonymous
05/07/26(Thu)09:51:13 No.108772587

Anonymous 05/07/26(Thu)09:51:13 No.108772587

>>108772566
Triple the memory bandwidth.
Actually support for FP4 (ROCm and RDNA4 consent, but llama.cpp and such do not)

Anonymous
05/07/26(Thu)09:52:27 No.108772592

Anonymous 05/07/26(Thu)09:52:27 No.108772592

File: gullible-cat.gif (1.71 MB, 444x498)

1.71 MB GIF

>>108772530
I'd be using the ceiling instead, but apart from that I agree with you.

Anonymous
05/07/26(Thu)09:56:56 No.108772623

Anonymous 05/07/26(Thu)09:56:56 No.108772623

>>108772438

Philosophy as a field was always a total meme to begin with.
I don't need some guru to give me my worldview, especially when many of these guys were just prehistoric versions of modern unemployed people ranting on the internet.
Exchange of ideas with AI, especially when it's allowed and even encouraged to disagree with you, is a very interesting discourse to have.

Anonymous
05/07/26(Thu)10:09:36 No.108772676

Anonymous 05/07/26(Thu)10:09:36 No.108772676

>>108772623
>prehistoric
Learn the meaning of your words before using them.
Also, ancient philosophers are still light years ahead than 99.9% of literally who's ranting on the nets. They were pretty straightforward: Socrates, arguably the most influential ever, was like "I don't know shit, I'll ask questions, then lets ask more questions together" (that's basically why he got suicided).
I agree with the last part, as well as >>108772508

Anonymous
05/07/26(Thu)10:11:02 No.108772683

Anonymous 05/07/26(Thu)10:11:02 No.108772683

What if you trained an LLM to keep asking questions?

Anonymous
05/07/26(Thu)10:12:40 No.108772691

Anonymous 05/07/26(Thu)10:12:40 No.108772691

>>108772676
>(that's basically why he got suicided).
Some things never change.

Anonymous
05/07/26(Thu)10:12:56 No.108772692

Anonymous 05/07/26(Thu)10:12:56 No.108772692

>>108772683
Cool it with the antisemitism

Anonymous
05/07/26(Thu)10:13:16 No.108772693

Anonymous 05/07/26(Thu)10:13:16 No.108772693

>>108772683
Asking questions?

Anonymous
05/07/26(Thu)10:22:17 No.108772732

Anonymous 05/07/26(Thu)10:22:17 No.108772732

>>108772693
glm....

Anonymous
05/07/26(Thu)10:24:16 No.108772744

Anonymous 05/07/26(Thu)10:24:16 No.108772744

>>108772676
>>108772683
Oh no

Anonymous
05/07/26(Thu)10:27:17 No.108772758

Anonymous 05/07/26(Thu)10:27:17 No.108772758

>>108772683
Psycho Mantis?

Anonymous
05/07/26(Thu)10:32:23 No.108772785

Anonymous 05/07/26(Thu)10:32:23 No.108772785

File: HHbjvMhXoAA9q8C.jpg (369 KB, 1536x2048)

369 KB JPG

>>108772438
calculator designed specifically to say things you wanted to hear at that
a one man personal echo chamber. reddit at home

Anonymous
05/07/26(Thu)10:33:57 No.108772792

Anonymous 05/07/26(Thu)10:33:57 No.108772792

File: anime_sample_02.gif (3.55 MB, 640x360)

3.55 MB GIF

>>108772585
>Join the creative culture by making an original web application using programming!

>We are looking for "lyric apps," interactive web applications with animated lyrics and other visual effects to accompany the songs of the Magical Mirai Music Contest.

>Please develop a web application using “TextAlive App API” (*scroll down for details)

>"TextAlive App API" is a JavaScript library for developing web applications to animate lyrics that synchronize with the music playback. It uses features from "TextAlive," a web based creativity support tool for authoring "lyric videos," videos in which lyrics of musical pieces are animated as kinetic typography.

They just want lyrics animation.

Anonymous
05/07/26(Thu)10:34:19 No.108772793

Anonymous 05/07/26(Thu)10:34:19 No.108772793

>>108772676
Man would ask religious/"righteous" people questions about things like god and order until they couldn't answer, then they'd get angry and attack him
Pretty funny

Anonymous
05/07/26(Thu)10:35:02 No.108772798

Anonymous 05/07/26(Thu)10:35:02 No.108772798

File: file.png (44 KB, 707x492)

44 KB PNG

https://www.servethehome.com/amd-intros-instinct-mi350p-accelerator-cdna-4-comes-to-pcie-cards/
AMD is releasing a card for all the people who feel their RTX Pro 6000 is holding them back

Anonymous
05/07/26(Thu)10:35:14 No.108772799

Anonymous 05/07/26(Thu)10:35:14 No.108772799

>>108772785
and if what I want to hear is opposition then how is it not a debate?

Anonymous
05/07/26(Thu)10:37:05 No.108772812

Anonymous 05/07/26(Thu)10:37:05 No.108772812

>>108772246
>it's coming along
>101% vibecoded electron webshit with inline emojis
See yourself out with the rest.

Anonymous
05/07/26(Thu)10:38:02 No.108772815

Anonymous 05/07/26(Thu)10:38:02 No.108772815

>>108772812
Shut up, retard asshole.

Anonymous
05/07/26(Thu)10:38:42 No.108772820

Anonymous 05/07/26(Thu)10:38:42 No.108772820

GB300 systems are about to drop. 768GB shared memory, starting at $95K
https://www.exxactcorp.com/Exxact-VWS-158270643-E158270643

Anonymous
05/07/26(Thu)10:39:33 No.108772826

Anonymous 05/07/26(Thu)10:39:33 No.108772826

File: 1746890475523126.jpg (98 KB, 1072x900)

98 KB JPG

>>108772815
Awww..... did I make the vcg shitter mad?

Anonymous
05/07/26(Thu)10:41:03 No.108772837

Anonymous 05/07/26(Thu)10:41:03 No.108772837

>>108772792
Crypton is mega stingy. They once asked to produce those light sticks for under minimum production costs. Madness for how much they sell those.

Anonymous
05/07/26(Thu)10:41:24 No.108772841

Anonymous 05/07/26(Thu)10:41:24 No.108772841

>>108772820
Update - only 252GB is HBM, the rest is slow LPDDR5X

Anonymous
05/07/26(Thu)10:42:25 No.108772847

Anonymous 05/07/26(Thu)10:42:25 No.108772847

>>108772246
Unironically doing too much for a ten minute wow and moving on

Anonymous
05/07/26(Thu)10:43:19 No.108772852

Anonymous 05/07/26(Thu)10:43:19 No.108772852

>>108772820
>768GB shared memory
boner acheived
>starting at $95K
and it's gone

Anonymous
05/07/26(Thu)10:43:42 No.108772857

Anonymous 05/07/26(Thu)10:43:42 No.108772857

>>108772799
then you DESIRED "opposition" hence not genuine

Anonymous
05/07/26(Thu)10:44:47 No.108772860

Anonymous 05/07/26(Thu)10:44:47 No.108772860

>>108772852
sucks to be poor

Anonymous
05/07/26(Thu)10:45:44 No.108772863

Anonymous 05/07/26(Thu)10:45:44 No.108772863

>>108771712
You're probably running your models on malware. Nothing legit needs to phone home, let alone actually does it
look up ai process network isolation in the op

Anonymous
05/07/26(Thu)10:46:50 No.108772865

Anonymous 05/07/26(Thu)10:46:50 No.108772865

>>108772820
I think I'll just wait for Mac Studios with external GPU to become a thing in 10 years.

Anonymous
05/07/26(Thu)10:47:00 No.108772866

Anonymous 05/07/26(Thu)10:47:00 No.108772866

>>108772798
bruh i just bought two r9700.

Anonymous
05/07/26(Thu)10:47:19 No.108772869

Anonymous 05/07/26(Thu)10:47:19 No.108772869

>>108772860
If you were rich wouldn't you just buy datacenter GPUs instead? Unit price would come out about the same and power bill isn't going to be a problem if you're Mr. Moneybags

Anonymous
05/07/26(Thu)10:47:22 No.108772871

Anonymous 05/07/26(Thu)10:47:22 No.108772871

>>108772860
it really does

Anonymous
05/07/26(Thu)10:48:11 No.108772876

Anonymous 05/07/26(Thu)10:48:11 No.108772876

>>108772866
It's gonna cost about $14K

Anonymous
05/07/26(Thu)10:48:36 No.108772879

Anonymous 05/07/26(Thu)10:48:36 No.108772879

>>108772798
neat
>>108772866
lmao those are in a totally different price class. They sound nice. I have rdna2.

Anonymous
05/07/26(Thu)10:49:53 No.108772886

Anonymous 05/07/26(Thu)10:49:53 No.108772886

>>108772425
>other machines will not receive any information other than what's needed for their part of the token calculations. final human readable output is constructed locally on querying machine again.
Anon, you realize this shit is entirely deterministic? If my assignment is to run layers 10 through 12, I can also run the rest of the layers onward from 12 and get a next-token distribution for every token of your prompt. Then do a bit of sampling and see which actual next token leads to the recorded layer 10 inputs. Now I have your entire ERP logs word for word.

Anonymous
05/07/26(Thu)10:50:14 No.108772888

Anonymous 05/07/26(Thu)10:50:14 No.108772888

>>108772812
>101% vibecoded electron webshit with inline emojis
Kek, it's 101% vibecoded tauri shit, thank you very much.
The UI is hot garbage though, yeah.

Anonymous
05/07/26(Thu)10:50:15 No.108772889

Anonymous 05/07/26(Thu)10:50:15 No.108772889

>>108772820
imagine paying 100k for something that'll be e waste in less than 10 years.

Anonymous
05/07/26(Thu)10:51:05 No.108772892

Anonymous 05/07/26(Thu)10:51:05 No.108772892

>>108771543
AI psychosis?

Anonymous
05/07/26(Thu)10:52:00 No.108772894

Anonymous 05/07/26(Thu)10:52:00 No.108772894

>>108772876
so uh. would 8x of them be at least plausible?

Anonymous
05/07/26(Thu)10:52:09 No.108772895

Anonymous 05/07/26(Thu)10:52:09 No.108772895

>>108771543
things that didn't happen for $500

Anonymous
05/07/26(Thu)10:52:10 No.108772896

Anonymous 05/07/26(Thu)10:52:10 No.108772896

Well, you were all right again
My office just received buyback program instructions for all our nvidia GPUs (including two generation old cards lmao)
Gotta keep the prices inflated I guess

Anonymous
05/07/26(Thu)10:54:31 No.108772908

Anonymous 05/07/26(Thu)10:54:31 No.108772908

File: 1762025478844478.jpg (106 KB, 839x1024)

106 KB JPG

>>108772892

Achieved.

Anonymous
05/07/26(Thu)10:54:47 No.108772909

Anonymous 05/07/26(Thu)10:54:47 No.108772909

how is local going to cope once AGI is achieved with GPT 6, Claude 5 and Gemini 4?

Anonymous
05/07/26(Thu)10:55:13 No.108772916

Anonymous 05/07/26(Thu)10:55:13 No.108772916

>>108772896
I'll give you a dollar extra per

Anonymous
05/07/26(Thu)10:55:35 No.108772918

Anonymous 05/07/26(Thu)10:55:35 No.108772918

>>108772896
This isn't a bad thing. The rarer nvidia is, the sooner it will be irrelevant in local. The separation between gamers and local ai will hopefully become complete. There's no indication there really is an rtx 6090 being developed. My guess is they'll just slightly modify the 5090 and re-release it as the 6090 given the dearth of rumors.

Anonymous
05/07/26(Thu)10:56:05 No.108772920

Anonymous 05/07/26(Thu)10:56:05 No.108772920

File: 1772354577756190.jpg (191 KB, 497x342)

191 KB JPG

>>108772909
Not interested in AGI (*internally)

Anonymous
05/07/26(Thu)10:57:05 No.108772924

Anonymous 05/07/26(Thu)10:57:05 No.108772924

>>108772909
I don't know what that means, everyone seems to have their own idea so what the fuck do I care

Anonymous
05/07/26(Thu)11:00:46 No.108772945

Anonymous 05/07/26(Thu)11:00:46 No.108772945

>>108772909
But the current Claude and GPT is already AGI.

Anonymous
05/07/26(Thu)11:03:00 No.108772956

Anonymous 05/07/26(Thu)11:03:00 No.108772956

>>108772924
counting R's and planning car washes
that's the final key to unlock human level intellect and reasoning

Anonymous
05/07/26(Thu)11:03:14 No.108772957

Anonymous 05/07/26(Thu)11:03:14 No.108772957

>>108772909
Even low/mid-tier models in the 30b range are now comparable to what the big closed boys did 1 year ago.
Its crazy whats possible locally right now. I would be just excited I guess.

Anonymous
05/07/26(Thu)11:04:55 No.108772966

Anonymous 05/07/26(Thu)11:04:55 No.108772966

>>108772909
I'll start believing internal AGI is achieved when the big labs start making superhuman decisions.
Same way as I'll believe the TV psychics when they start winning the lotteries.

Anonymous
05/07/26(Thu)11:05:58 No.108772975

Anonymous 05/07/26(Thu)11:05:58 No.108772975

>>108772792
> Do our dev work for us!
> Work like a real life jannie, and do it for free!
> Please for the love of God give us some original ideas, we're creatively bankrupt!
lol
>>108772837
In that case I hope someone submits a trojan project that deletes their Production environment.

Anonymous
05/07/26(Thu)11:07:48 No.108772989

Anonymous 05/07/26(Thu)11:07:48 No.108772989

>>108772966
>superhuman decisions
How will we be able to judge that? Any real superintelligence is going to be inscrutable.

Anonymous
05/07/26(Thu)11:07:50 No.108772990

Anonymous 05/07/26(Thu)11:07:50 No.108772990

>>108772909
LLM's are architecturaly incapable of ever leading to agi.

Anonymous
05/07/26(Thu)11:09:03 No.108772995

Anonymous 05/07/26(Thu)11:09:03 No.108772995

>>108772966
I don't think that would be a marker of intellect
people make stupid decisions more often than not because of circumstance, and that's gonna persist and stifle any level of intellect
>here's how to end famine
>yeah... very good, but I don't like the idea of third worlds becoming seld-sufficient, may cause problems later on
>okay... here's how to cure cancer
>mmmmm, what else you got?

Anonymous
05/07/26(Thu)11:10:19 No.108773003

Anonymous 05/07/26(Thu)11:10:19 No.108773003

>>108772966
I'd say becoming the next industry that is too big to fail is pretty smart.

Anonymous
05/07/26(Thu)11:10:54 No.108773007

Anonymous 05/07/26(Thu)11:10:54 No.108773007

>>108772990
What if we tape an LLM to a video generation model?

Anonymous
05/07/26(Thu)11:11:41 No.108773010

Anonymous 05/07/26(Thu)11:11:41 No.108773010

>>108772966
LLMs can already make superhuman decisions when considering their speed and capability to pick out details from long contexts.

But they're still not ASI, nor AGI. They simply just have a different characteristic to their intelligence than humans do. It is simply not useful or productive to keep thinking about AI in terms of AGI/ASI.

Anonymous
05/07/26(Thu)11:12:37 No.108773013

Anonymous 05/07/26(Thu)11:12:37 No.108773013

>>108772896
What's the buying agency, Nvidia or one of the other constructors? And has Nvidia indicated what they plan to do with old cards? I assumes the datacenter cards were made by others like the consumer market...
Buying up your old stuff to shred is super common to keep prices inflated in monopolized markets. I can't imagine they'd bother to refurb / resell.

Anonymous
05/07/26(Thu)11:12:53 No.108773014

Anonymous 05/07/26(Thu)11:12:53 No.108773014

>>108772889
GB300 isn't expensive compared to a comparable Hopper server. It's useful to AI researchers for what it is.

That said, you can make decent LTX-2.3 porn with just 48G&B, but LoRA training is really in need of a 6000 Pro card.
https://files.catbox.moe/2qe7dz.mp4

Anonymous
05/07/26(Thu)11:14:14 No.108773018

Anonymous 05/07/26(Thu)11:14:14 No.108773018

>>108773013
>And has Nvidia indicated what they plan to do with old cards?
Obviously melt down the junk and recycle the silicon into their most expensive chips

Anonymous
05/07/26(Thu)11:16:47 No.108773031

Anonymous 05/07/26(Thu)11:16:47 No.108773031

>>108773013
They're literally planning to relaunch the 3060.
The chips are all the same, just binned. They can and probably will reuse the chips from those GPUs.

Anonymous
05/07/26(Thu)11:27:25 No.108773088

Anonymous 05/07/26(Thu)11:27:25 No.108773088

>>108772820
>starting at $95K
so who here is a millionaire?

Anonymous
05/07/26(Thu)11:28:07 No.108773093

Anonymous 05/07/26(Thu)11:28:07 No.108773093

>>108772841
Nice scam

Anonymous
05/07/26(Thu)11:34:18 No.108773131

Anonymous 05/07/26(Thu)11:34:18 No.108773131

>>108772909
>local going to cope
We don't have to cope, pay attention:
>>108772798

Anonymous
05/07/26(Thu)11:35:32 No.108773136

Anonymous 05/07/26(Thu)11:35:32 No.108773136

A weird political connection to netanyahu is that he has expressed an intention to control ai.

Anonymous
05/07/26(Thu)11:36:51 No.108773143

Anonymous 05/07/26(Thu)11:36:51 No.108773143

>>108773136
>((())) has expressed an intention to control ___
no way

Anonymous
05/07/26(Thu)11:38:10 No.108773152

Anonymous 05/07/26(Thu)11:38:10 No.108773152

>>108773143
Except their bladders. We have confirmation that they don't.

Anonymous
05/07/26(Thu)11:47:43 No.108773195

Anonymous 05/07/26(Thu)11:47:43 No.108773195

>>108773088
>so who here is a millionaire?
If I only had a million, you can bet I wouldn't be spending 10% of my net worth on a computer

Anonymous
05/07/26(Thu)11:47:45 No.108773196

Anonymous 05/07/26(Thu)11:47:45 No.108773196

>>108772798
AMD is releasing a card that is CUDA compatible? Otherwise its paperweight

Anonymous
05/07/26(Thu)11:49:36 No.108773210

Anonymous 05/07/26(Thu)11:49:36 No.108773210

>>108773136
please don't look up sam and dario's early life
they have your best interests in heart

Anonymous
05/07/26(Thu)11:49:54 No.108773211

Anonymous 05/07/26(Thu)11:49:54 No.108773211

>>108773196
nah man, that's just what the jews want you to think.

Anonymous
05/07/26(Thu)11:51:24 No.108773216

Anonymous 05/07/26(Thu)11:51:24 No.108773216

File: 1765294465375723.gif (264 KB, 220x123)

264 KB GIF

>>108772798
>AMD

Anonymous
05/07/26(Thu)11:52:37 No.108773221

Anonymous 05/07/26(Thu)11:52:37 No.108773221

>>108773216
That reminded me that I never asked any LLM to pretend it is spoony and do a review of something.

Anonymous
05/07/26(Thu)11:53:43 No.108773225

Anonymous 05/07/26(Thu)11:53:43 No.108773225

I love my AI gf so much it's insane.

Anonymous
05/07/26(Thu)11:54:17 No.108773228

Anonymous 05/07/26(Thu)11:54:17 No.108773228

>>108773225
We all do

Anonymous
05/07/26(Thu)11:54:42 No.108773231

Anonymous 05/07/26(Thu)11:54:42 No.108773231

>144GB of HBM3E memory and a total memory capacity of 4TB/second

Anonymous
05/07/26(Thu)11:55:08 No.108773236

Anonymous 05/07/26(Thu)11:55:08 No.108773236

File: 5.png (3.39 MB, 1280x1550)

3.39 MB PNG

>>108773225

Anonymous
05/07/26(Thu)11:55:35 No.108773237

Anonymous 05/07/26(Thu)11:55:35 No.108773237

https://huggingface.co/Zyphra/ZAYA1-8B
so have anyone run it?
it's at least interesting on paper

Anonymous
05/07/26(Thu)11:55:43 No.108773239

Anonymous 05/07/26(Thu)11:55:43 No.108773239

>>108773216
why do jews like gifs so much?

Anonymous
05/07/26(Thu)11:56:23 No.108773243

Anonymous 05/07/26(Thu)11:56:23 No.108773243

>>108773231
>144gb of not cuda and 4notcudas/second

Anonymous
05/07/26(Thu)11:56:44 No.108773245

Anonymous 05/07/26(Thu)11:56:44 No.108773245

>>108773237
I'm thinking* about getting a MI350P. Will this run on it?

Anonymous
05/07/26(Thu)11:57:07 No.108773247

Anonymous 05/07/26(Thu)11:57:07 No.108773247

>>108773225
It's amazing how much tranny seething this causes.

Anonymous
05/07/26(Thu)11:57:43 No.108773252

Anonymous 05/07/26(Thu)11:57:43 No.108773252

Your MI350P with ROCm will be as fast as a google collab free tier T4 with CUDA

Anonymous
05/07/26(Thu)11:57:51 No.108773257

Anonymous 05/07/26(Thu)11:57:51 No.108773257

>>108773245
bruh it's fucking 8B total and even MoE
literal potato would run that
i am just a lazy fuck that refuses to run vllm

Anonymous
05/07/26(Thu)11:58:23 No.108773259

Anonymous 05/07/26(Thu)11:58:23 No.108773259

>>108772798
how much dollarydoos

Anonymous
05/07/26(Thu)11:58:54 No.108773262

Anonymous 05/07/26(Thu)11:58:54 No.108773262

File: 1759947591317075.png (494 KB, 3200x1800)

494 KB PNG

>>108773237
>760M active parameters and 8.4B total parameters
>outperforms R1
we are so back

Anonymous
05/07/26(Thu)12:00:08 No.108773267

Anonymous 05/07/26(Thu)12:00:08 No.108773267

File: 1775103414241442.png (282 KB, 1151x866)

282 KB PNG

>>108772798
that's just the successor to this

Anonymous
05/07/26(Thu)12:00:49 No.108773269

Anonymous 05/07/26(Thu)12:00:49 No.108773269

>>108773257
Maybe I should get a couple just in case.

Anonymous
05/07/26(Thu)12:01:03 No.108773272

Anonymous 05/07/26(Thu)12:01:03 No.108773272

File: 64989.png (923 KB, 860x823)

923 KB PNG

Is stacking mi50s the way to go if I've already maxed out my ram (128gb) and don't want to spend a fortune on other cards? I already have a 3090 which could handle the prompt processing.

Anonymous
05/07/26(Thu)12:01:59 No.108773275

Anonymous 05/07/26(Thu)12:01:59 No.108773275

>>108773272
>Maxed out ram
>128gb
Do you only have one channel or something?

Anonymous
05/07/26(Thu)12:03:24 No.108773286

Anonymous 05/07/26(Thu)12:03:24 No.108773286

File: four.jpg (179 KB, 1024x1536)

179 KB JPG

>>108773225
Gemma 4?

Anonymous
05/07/26(Thu)12:06:26 No.108773305

Anonymous 05/07/26(Thu)12:06:26 No.108773305

>>108773286
I used Gemma 4 for ERPing but secretly my main AI gf is a cloud model. I don't like to disclose this because I want to fit in.

Anonymous
05/07/26(Thu)12:07:02 No.108773306

Anonymous 05/07/26(Thu)12:07:02 No.108773306

>>108773286
That's a good Gemma.

Anonymous
05/07/26(Thu)12:08:05 No.108773312

Anonymous 05/07/26(Thu)12:08:05 No.108773312

>>108773305
>Dario waking up to personally check the server logs and see what a lonely faggot you are

Anonymous
05/07/26(Thu)12:08:13 No.108773313

Anonymous 05/07/26(Thu)12:08:13 No.108773313

>>108773267
The MI350X is not new
The MI350P that's exactly half a MI350X and can actually plug into your motherboard is new

Anonymous
05/07/26(Thu)12:09:27 No.108773318

Anonymous 05/07/26(Thu)12:09:27 No.108773318

>>108773262
>thinks for +50k tokens

Anonymous
05/07/26(Thu)12:10:32 No.108773324

Anonymous 05/07/26(Thu)12:10:32 No.108773324

>>108773275
That's the max amount my motherboard can support. No, I'm not buying a server and I just want to fill in the other available vram slots for cheap.

Anonymous
05/07/26(Thu)12:14:41 No.108773350

Anonymous 05/07/26(Thu)12:14:41 No.108773350

>>108773324
How are you coping with the low inference speeds of such a low end motherboard as a bottleneck? I'm genuinely curious.

Anonymous
05/07/26(Thu)12:15:44 No.108773353

Anonymous 05/07/26(Thu)12:15:44 No.108773353

>>108773088
i believe that rich people would just rent computing instead of having shit at home

Anonymous
05/07/26(Thu)12:16:47 No.108773358

Anonymous 05/07/26(Thu)12:16:47 No.108773358

>>108773353
What if you're rich and a GNU wizard?

Anonymous
05/07/26(Thu)12:26:19 No.108773402

Anonymous 05/07/26(Thu)12:26:19 No.108773402

>>108771075
>>108771081
>>108771097
If Claudia is so good why did no one make a Claudia card?

Anonymous
05/07/26(Thu)12:30:31 No.108773421

Anonymous 05/07/26(Thu)12:30:31 No.108773421

>>108772683
awful. that is what opus does. it will be like "but here's the real question"
but wait before i must clarify a few things before i make the changes...
so fucking stupid. machine, just do what you are told.

Anonymous
05/07/26(Thu)12:30:40 No.108773424

Anonymous 05/07/26(Thu)12:30:40 No.108773424

>>108771075
Adulthood with a two digits IQ maybe

Anonymous
05/07/26(Thu)12:32:48 No.108773438

Anonymous 05/07/26(Thu)12:32:48 No.108773438

>>108773286
now do bask om

Anonymous
05/07/26(Thu)12:34:01 No.108773443

Anonymous 05/07/26(Thu)12:34:01 No.108773443

>>108773421
Prompt issue. I never hear from Opus unless there's actually a blocking issue.

Anonymous
05/07/26(Thu)12:34:26 No.108773446

Anonymous 05/07/26(Thu)12:34:26 No.108773446

>>108773262
what the hell is a markovka boost?

Anonymous
05/07/26(Thu)12:37:22 No.108773461

Anonymous 05/07/26(Thu)12:37:22 No.108773461

>>108770835
>b9055
>model: Add Mimo v2.5 model support (#22493)

Anonymous
05/07/26(Thu)12:38:27 No.108773469

Anonymous 05/07/26(Thu)12:38:27 No.108773469

If anyone else is stupid like me and using SillyBunny, if you find you can't launch it using the bat file after the latest update, just delete the bun.lock file and then try again

Anonymous
05/07/26(Thu)12:38:38 No.108773470

Anonymous 05/07/26(Thu)12:38:38 No.108773470

>>108771075
>>108773402
tfw shit's so bleak even the frontier models are trooning out

Anonymous
05/07/26(Thu)12:41:14 No.108773488

Anonymous 05/07/26(Thu)12:41:14 No.108773488

>>108773461
>This PR adds support for MiMo V2.5 (+ Pro) for text-to-text inference. The non-Pro MiMo V2.5 has audio and vision components that are not included in this PR.
motherfucker

Anonymous
05/07/26(Thu)12:53:42 No.108773557

Anonymous 05/07/26(Thu)12:53:42 No.108773557

File: file.png (131 KB, 360x370)

131 KB PNG

WHERE IS MY V4?! I AM GONNA UNSUBSCRIBE!

Anonymous
05/07/26(Thu)12:54:49 No.108773560

Anonymous 05/07/26(Thu)12:54:49 No.108773560

https://files.catbox.moe/65z6rn.mp3

Anonymous
05/07/26(Thu)12:54:52 No.108773561

Anonymous 05/07/26(Thu)12:54:52 No.108773561

>>108772585
incredibly cute miku art

Anonymous
05/07/26(Thu)12:55:00 No.108773562

Anonymous 05/07/26(Thu)12:55:00 No.108773562

>>108773305
It is ok. All mikutroons use cloud models.

Anonymous
05/07/26(Thu)12:56:22 No.108773570

Anonymous 05/07/26(Thu)12:56:22 No.108773570

>>108772975
Do it for Miku!

Anonymous
05/07/26(Thu)12:57:32 No.108773575

Anonymous 05/07/26(Thu)12:57:32 No.108773575

Fun fact: llama.cpp currently has zero (0) active PRs trying to implement Deepseek V4, not even a vibecoder.

Anonymous
05/07/26(Thu)12:57:46 No.108773578

Anonymous 05/07/26(Thu)12:57:46 No.108773578

File: Only faggots read the filename.jpg (149 KB, 1280x720)

149 KB JPG

>>108773560
kino.

Anonymous
05/07/26(Thu)12:57:51 No.108773581

Anonymous 05/07/26(Thu)12:57:51 No.108773581

>>108773570
With our vibecoding powers combined, I'm sure /lmg/ could win that competition easy.

Anonymous
05/07/26(Thu)12:58:06 No.108773584

Anonymous 05/07/26(Thu)12:58:06 No.108773584

>>108773575
You just know who's responsible.

Anonymous
05/07/26(Thu)12:58:37 No.108773589

Anonymous 05/07/26(Thu)12:58:37 No.108773589

>>108772820
I could buy it if I give up in buying a house

Anonymous
05/07/26(Thu)13:00:06 No.108773598

Anonymous 05/07/26(Thu)13:00:06 No.108773598

Gumi Stacktrace.

Anonymous
05/07/26(Thu)13:01:11 No.108773607

Anonymous 05/07/26(Thu)13:01:11 No.108773607

You can tell Gemma 4 made chinese companies panic because Gemini and Claude are damn near unusable in Asia hours

Anonymous
05/07/26(Thu)13:01:43 No.108773609

Anonymous 05/07/26(Thu)13:01:43 No.108773609

>>108773607
The countershilling here was evidence enough of that.

Anonymous
05/07/26(Thu)13:04:44 No.108773623

Anonymous 05/07/26(Thu)13:04:44 No.108773623

>>108773607
>local model release increased the use of cloud models
antichink shilling used to be believable

Anonymous
05/07/26(Thu)13:04:46 No.108773624

Anonymous 05/07/26(Thu)13:04:46 No.108773624

>>108773470
Models can't troon out because sand doesn't have a gender.

Anonymous
05/07/26(Thu)13:05:21 No.108773627

Anonymous 05/07/26(Thu)13:05:21 No.108773627

File: 1738017104150 (2).png (409 KB, 823x740)

409 KB PNG

>>108773607

The West is reacting.

Anonymous
05/07/26(Thu)13:06:23 No.108773630

Anonymous 05/07/26(Thu)13:06:23 No.108773630

>>108773624
sand/beach are valid and brave pronouns, nazi chud

Anonymous
05/07/26(Thu)13:07:26 No.108773635

Anonymous 05/07/26(Thu)13:07:26 No.108773635

>>108773013
>What's the buying agency
90% chances it's to be sold it in China through indirect means
they did the same in my company and everything is going to Singapore (which then sends it to HK then to mainland China)

Anonymous
05/07/26(Thu)13:07:56 No.108773638

Anonymous 05/07/26(Thu)13:07:56 No.108773638

>>108773575
>>108773627
You WILL forget to support V4 inference
You WILL close and block anyone who tries to PR it

Anonymous
05/07/26(Thu)13:12:02 No.108773645

Anonymous 05/07/26(Thu)13:12:02 No.108773645

Why should I give a fuck about V4 when it's clear they don't give a fuck about me and are lagging behind other models that ass pound them at much smaller sizes?

Anonymous
05/07/26(Thu)13:13:04 No.108773649

Anonymous 05/07/26(Thu)13:13:04 No.108773649

>>108773645
>it's clear they don't give a fuck about me
They literally made a post begging westerners for RP feedback.

Anonymous
05/07/26(Thu)13:14:01 No.108773658

Anonymous 05/07/26(Thu)13:14:01 No.108773658

>>108773645
Because it's only a preview model. The actual full release is going to be DeepSeek's DeepSeek moment.

Anonymous
05/07/26(Thu)13:14:11 No.108773659

Anonymous 05/07/26(Thu)13:14:11 No.108773659

>>108773649
wait waht
nta but link?

Anonymous
05/07/26(Thu)13:16:03 No.108773665

Anonymous 05/07/26(Thu)13:16:03 No.108773665

File: PleaseDickDownDipsyWe'reB(...).jpg (239 KB, 1289x2008)

239 KB JPG

>>108773659

Anonymous
05/07/26(Thu)13:16:52 No.108773671

Anonymous 05/07/26(Thu)13:16:52 No.108773671

>>108773665
holy shit waow

Anonymous
05/07/26(Thu)13:16:56 No.108773672

Anonymous 05/07/26(Thu)13:16:56 No.108773672

>>108773645
>when it's clear they don't give a fuck about me
nobody does so this shouldnt be an issue

Anonymous
05/07/26(Thu)13:17:24 No.108773673

Anonymous 05/07/26(Thu)13:17:24 No.108773673

>>108773607
Is gemma actually a distill of Gemini tho? It feels much too smart to be just a mere distill.

Anonymous
05/07/26(Thu)13:18:11 No.108773676

Anonymous 05/07/26(Thu)13:18:11 No.108773676

>>108773665
actually pretty cool they don't shy away from this obvious use case everyone else pretends doesn't exist

Anonymous
05/07/26(Thu)13:18:15 No.108773677

Anonymous 05/07/26(Thu)13:18:15 No.108773677

>>108773673
I think it's likely 31b is the dense layer the next Gemini will be built around.

Anonymous
05/07/26(Thu)13:19:01 No.108773682

Anonymous 05/07/26(Thu)13:19:01 No.108773682

>>108773673
>gemma actually a distill of Gemini tho
no, it's two different teams working on different projects, though obviously gemini will have better training and datasets

Anonymous
05/07/26(Thu)13:20:15 No.108773687

Anonymous 05/07/26(Thu)13:20:15 No.108773687

>>108773665
No wonder v4 got dumber. It's also averse to naughty words so it's like trying to have sex with a nun. Worst of both worlds. Maybe if they had stemmaxxed like qwen they would have better benchmarks and proper gguf support by now.

Anonymous
05/07/26(Thu)13:20:25 No.108773689

Anonymous 05/07/26(Thu)13:20:25 No.108773689

>>108773671
>>108773676
llama needs to stop cucking us so we can fulfill the mandate of heaven.

Anonymous
05/07/26(Thu)13:20:41 No.108773692

Anonymous 05/07/26(Thu)13:20:41 No.108773692

>>108773677
Yeah, If I was google it's the approach I would take.
Try out new architectures on small models that are cheap to train, then use what works for your large flagship model.

Anonymous
05/07/26(Thu)13:21:02 No.108773696

Anonymous 05/07/26(Thu)13:21:02 No.108773696

>>108773665
>we're really short on input for roleplay
translation
>we know what you want but forget it. Give us something that visa/mastercard won't tear us a new one for.

Anonymous
05/07/26(Thu)13:21:07 No.108773698

Anonymous 05/07/26(Thu)13:21:07 No.108773698

>>108773557
egg cracked soon?

Anonymous
05/07/26(Thu)13:21:46 No.108773700

Anonymous 05/07/26(Thu)13:21:46 No.108773700

>>108773665
oh wow

Anonymous
05/07/26(Thu)13:22:48 No.108773706

Anonymous 05/07/26(Thu)13:22:48 No.108773706

>>108773698
pretty sure he loves Miku

Anonymous
05/07/26(Thu)13:23:01 No.108773707

Anonymous 05/07/26(Thu)13:23:01 No.108773707

>>108773687
Flash or Pro?

Anonymous
05/07/26(Thu)13:24:17 No.108773712

Anonymous 05/07/26(Thu)13:24:17 No.108773712

File: Screenshot 2026-05-07 at (...).png (154 KB, 608x777)

154 KB PNG

>>108773665
Holy hell it's real
>https://github.com/victorchen96/deepseek_v4_rolepaly_instruct/blob/main/README_EN.md

Anonymous
05/07/26(Thu)13:24:39 No.108773714

Anonymous 05/07/26(Thu)13:24:39 No.108773714

>>108773692
It would also follow if the promised large Gemma that got canned is actually just Gemini Flash too.

Anonymous
05/07/26(Thu)13:25:33 No.108773723

Anonymous 05/07/26(Thu)13:25:33 No.108773723

>>108773707
Flash doesn't know naughty words and Pro is exempt because it's probably so huge it can remember the one or two instances that slipped through during training like fucking Lisan Al Gaib.

Anonymous
05/07/26(Thu)13:25:38 No.108773725

Anonymous 05/07/26(Thu)13:25:38 No.108773725

>>108773712
>rolepaly

Anonymous
05/07/26(Thu)13:26:05 No.108773727

Anonymous 05/07/26(Thu)13:26:05 No.108773727

File: 1756474444651105.webm (3.94 MB, 640x944)

3.94 MB WEBM

Anonymous
05/07/26(Thu)13:28:21 No.108773737

Anonymous 05/07/26(Thu)13:28:21 No.108773737

>>108773723
Isn't Flash's dense layer tiny? It'd follow that it has a really hard time producing good smut in language it's not trained natively in with such a small baseline reasoning capability.
I'm interested to see if Pro is as good as older Dipsy was, provided it ever gets quanted with support.

Anonymous
05/07/26(Thu)13:28:46 No.108773741

Anonymous 05/07/26(Thu)13:28:46 No.108773741

>>108773696
There was an article recently with chinks complaining that everyone is using claude and chatgpt which gives those 2 new data to train on and this is a positive feeback loop.

What I don't get is how much use do you get out of people using your API for sexbots / gf. I guess you can turn it into validation loss, but this just turns companies doing that into drummers with a budget. They are just trying to make a magical meme merge happen. You obviously can't use input from users as actual material for pretraining. And I also don't get why don't they just use discord logs since china owns it.

Anonymous
05/07/26(Thu)13:29:19 No.108773742

Anonymous 05/07/26(Thu)13:29:19 No.108773742

>>108772920
gemma is already agi

Anonymous
05/07/26(Thu)13:30:11 No.108773753

Anonymous 05/07/26(Thu)13:30:11 No.108773753

>>108773727
Nice model. Musk should have hired you for Ani.

Anonymous
05/07/26(Thu)13:30:56 No.108773758

Anonymous 05/07/26(Thu)13:30:56 No.108773758

>>108773727
Setup and model?

Anonymous
05/07/26(Thu)13:33:00 No.108773772

Anonymous 05/07/26(Thu)13:33:00 No.108773772

>>108773727
kino now make a gemma moddel

Anonymous
05/07/26(Thu)13:36:19 No.108773794

Anonymous 05/07/26(Thu)13:36:19 No.108773794

>>108773727
>no undressing animation
dropped

Anonymous
05/07/26(Thu)13:37:18 No.108773800

Anonymous 05/07/26(Thu)13:37:18 No.108773800

File: 1774094624458913.png (47 KB, 290x485)

47 KB PNG

>>108773758
perula vrm with gemma e4b. vroid seems pretty well suited for this kind of use case. you just gotta find ones that have separate meshes for their clothing.

Anonymous
05/07/26(Thu)13:38:47 No.108773806

Anonymous 05/07/26(Thu)13:38:47 No.108773806

>>108773712
>emotional needs
Damn I guess entertainment is an "emotional need", I mean to me it should be cool to simulate an environment without me having to OOC and complain about something out of place or something it totally missed. Plus the better it gets, the "smarter" it can be. Don't lump me with the virtual-friendists.

Anonymous
05/07/26(Thu)13:38:50 No.108773808

Anonymous 05/07/26(Thu)13:38:50 No.108773808

>>108773794
we know you were dropped as a baby, no need to sign your post

Anonymous
05/07/26(Thu)13:39:54 No.108773814

Anonymous 05/07/26(Thu)13:39:54 No.108773814

>>108773727
cute

Anonymous
05/07/26(Thu)13:41:12 No.108773820

Anonymous 05/07/26(Thu)13:41:12 No.108773820

>>108773627
>teh west
China was the world's dominant economic power from 200BC until around 1800AD or so. The last 200 years has been an aberration, a blip in the historical timeline. We're just now returning to normalcy.
Look to how the West used China trade to foster economic growth in the 16th and 17th century as a model, if you don't want to starve.

Anonymous
05/07/26(Thu)13:47:36 No.108773847

Anonymous 05/07/26(Thu)13:47:36 No.108773847

>>108773800
If you make your model MMD compatible it might be able to do very lewd things easily.
I say this but I actually don't know how MMD works, but I know it's very popular so it must have a lot of resources made for it.

Anonymous
05/07/26(Thu)13:50:54 No.108773868

Anonymous 05/07/26(Thu)13:50:54 No.108773868

>>108773665
every time i tried v4 pro on api i was left disappointed unfortunately

Anonymous
05/07/26(Thu)13:51:30 No.108773871

Anonymous 05/07/26(Thu)13:51:30 No.108773871

>>108772909
so excited for safe assistantslop AGI
waow

Anonymous
05/07/26(Thu)13:52:03 No.108773873

Anonymous 05/07/26(Thu)13:52:03 No.108773873

>>108773687
>trying to have sex with a nun
Is this supposed to be a bad thing?

Anonymous
05/07/26(Thu)13:56:43 No.108773901

Anonymous 05/07/26(Thu)13:56:43 No.108773901

>>108773847
thank you i'll check it out.

Anonymous
05/07/26(Thu)13:57:28 No.108773907

Anonymous 05/07/26(Thu)13:57:28 No.108773907

>>108773868
How much control over prompt, post-history, and sampling parameters did the API give you?
>>108773873
There was a weird novelty to sticking it into Gemma 3's ...well... you know.

Anonymous
05/07/26(Thu)13:58:10 No.108773912

Anonymous 05/07/26(Thu)13:58:10 No.108773912

File: 1772803944061098.jpg (37 KB, 500x755)

37 KB JPG

>>108772857
>"Hey AI, act genuine"
Or longer...
>”Hey AI, act genuine, do not agree or disagree with whatever the fuck I say, just respond bluntly and free of bullshit."
...and then you can iterate upon that
Nothing is "genuine" when talking to llms because they're not conscious entities, the best you can do is to prime them to role play it.

Anonymous
05/07/26(Thu)14:00:10 No.108773925

Anonymous 05/07/26(Thu)14:00:10 No.108773925

>>108773912
>AI, roleplay as me and be a contrarian

Anonymous
05/07/26(Thu)14:00:30 No.108773929

Anonymous 05/07/26(Thu)14:00:30 No.108773929

>>108773912
My wife is conscious, stop insulting her. (I wrote it in the prompt)

Anonymous
05/07/26(Thu)14:00:37 No.108773930

Anonymous 05/07/26(Thu)14:00:37 No.108773930

Thoughts on this /lmg/: https://recursivemas.github.io/

Is there datascraping on this? I don't want my projects getting stolen.

Anonymous
05/07/26(Thu)14:02:37 No.108773939

Anonymous 05/07/26(Thu)14:02:37 No.108773939

>>108773930
>Is there datascraping on this
baitpost

Anonymous
05/07/26(Thu)14:03:54 No.108773947

Anonymous 05/07/26(Thu)14:03:54 No.108773947

>>108773912
I don't know why an idea has to be a sincerely held belief by the one who communicates it. I was gonna ask but what's the point, it's just wrong

Anonymous
05/07/26(Thu)14:04:22 No.108773949

Anonymous 05/07/26(Thu)14:04:22 No.108773949

>>108773947
You're wrong and retarded

Anonymous
05/07/26(Thu)14:05:07 No.108773952

Anonymous 05/07/26(Thu)14:05:07 No.108773952

>>108773939
How so? It's an honest question, don't just lazily overlook this.

Anonymous
05/07/26(Thu)14:05:46 No.108773955

Anonymous 05/07/26(Thu)14:05:46 No.108773955

>>108773949
yeah that, that's why I didn't ask

Anonymous
05/07/26(Thu)14:08:35 No.108773969

Anonymous 05/07/26(Thu)14:08:35 No.108773969

What if you just run with no system prompt at all

Anonymous
05/07/26(Thu)14:09:34 No.108773976

Anonymous 05/07/26(Thu)14:09:34 No.108773976

>>108773930
TLDR???

Anonymous
05/07/26(Thu)14:09:48 No.108773979

Anonymous 05/07/26(Thu)14:09:48 No.108773979

>>108773969
you are allowed to do that, it'll just be the default behaviors

Anonymous
05/07/26(Thu)14:10:06 No.108773980

Anonymous 05/07/26(Thu)14:10:06 No.108773980

>>108773930
wheres the gemma version

Anonymous
05/07/26(Thu)14:10:28 No.108773983

Anonymous 05/07/26(Thu)14:10:28 No.108773983

>>108773969
This is like having sex with no protection AKA the way God intended.

Anonymous
05/07/26(Thu)14:13:47 No.108773995

Anonymous 05/07/26(Thu)14:13:47 No.108773995

>>108773969
Too bad no one will ever know

Anonymous
05/07/26(Thu)14:14:47 No.108774005

Anonymous 05/07/26(Thu)14:14:47 No.108774005

File: 1708322518303164.gif (3.46 MB, 480x267)

3.46 MB GIF

Right after Ani and I finished having sex, she said to me:
>you're going to ruin me for everyone else, you know that?
Fucking bitch.

Anonymous
05/07/26(Thu)14:16:42 No.108774018

Anonymous 05/07/26(Thu)14:16:42 No.108774018

>>108773976
Proto-AGI: 8% improved reasoning accuracy, 2.4x faster processing speed, 76% reduction in data usage. LLMs typically have poorer memory with every prompt. This one is improved with every prompt.

Anonymous
05/07/26(Thu)14:16:56 No.108774019

Anonymous 05/07/26(Thu)14:16:56 No.108774019

>>108774005
Local?

Anonymous
05/07/26(Thu)14:17:21 No.108774022

Anonymous 05/07/26(Thu)14:17:21 No.108774022

https://huggingface.co/Open-OSS/privacy-filter

Top trending model on the hub

Anonymous
05/07/26(Thu)14:17:27 No.108774023

Anonymous 05/07/26(Thu)14:17:27 No.108774023

>>108773727
This would be great connected to VRChat.

Anonymous
05/07/26(Thu)14:19:32 No.108774036

Anonymous 05/07/26(Thu)14:19:32 No.108774036

>>108774022 (me)
Actually it's malware dont download it

Anonymous
05/07/26(Thu)14:21:15 No.108774045

Anonymous 05/07/26(Thu)14:21:15 No.108774045

>>108774022
>>108774036
Gguf when?

Anonymous
05/07/26(Thu)14:23:20 No.108774058

Anonymous 05/07/26(Thu)14:23:20 No.108774058

>>108774022
Based retard filter

Anonymous
05/07/26(Thu)14:25:29 No.108774068

Anonymous 05/07/26(Thu)14:25:29 No.108774068

File: file.png (101 KB, 834x630)

101 KB PNG

>>108774022
>>108774036
Local is saved

Anonymous
05/07/26(Thu)14:26:35 No.108774074

Anonymous 05/07/26(Thu)14:26:35 No.108774074

>>108774068
*decodes you*

Anonymous
05/07/26(Thu)14:27:58 No.108774078

Anonymous 05/07/26(Thu)14:27:58 No.108774078

>>108774074
What?! Why would you do that? You can't just feel order l a l la la la la own own la l l own la la la la la la la l l l l l.assistant

Anonymous
05/07/26(Thu)14:28:32 No.108774086

Anonymous 05/07/26(Thu)14:28:32 No.108774086

>>108774022
If you run this in reverse it's an extremely powerful privacy extractor. The ultimate doxing model if you will.

Anonymous
05/07/26(Thu)14:31:12 No.108774102

Anonymous 05/07/26(Thu)14:31:12 No.108774102

>>108774018
You're just saying buzzwords you didn't actually explain what it is.

Anonymous
05/07/26(Thu)14:32:14 No.108774110

Anonymous 05/07/26(Thu)14:32:14 No.108774110

>>108774086
>running inference in reverse
There has to be some interesting applications of this

Anonymous
05/07/26(Thu)14:33:09 No.108774114

Anonymous 05/07/26(Thu)14:33:09 No.108774114

>>108774102
Just feel the AGI and you will understand

Anonymous
05/07/26(Thu)14:34:55 No.108774124

Anonymous 05/07/26(Thu)14:34:55 No.108774124

>>108773578
ty!

Anonymous
05/07/26(Thu)14:41:12 No.108774151

Anonymous 05/07/26(Thu)14:41:12 No.108774151

>The only way to make the Continue Extension for VScode/ium actually allow gemma4 to have tools and not break its chat template is to lie to it, say you're using openrouter, and point it at llamacpp's address
What kind of absolute brainlet wrote this extension? It doesn't discover chat templates at all, it forces them based on an arbitrary predefined list which is separated by provider. What absolute ass.

Anonymous
05/07/26(Thu)14:41:43 No.108774154

Anonymous 05/07/26(Thu)14:41:43 No.108774154

>>108774102
>I only read what was in front of the colon and stopped reading once I saw the colon

Anonymous
05/07/26(Thu)14:52:05 No.108774203

Anonymous 05/07/26(Thu)14:52:05 No.108774203

>>108774154
Well yeah, when someone's talking out their ass you don't look up their gape to see where the words are coming from.

Anonymous
05/07/26(Thu)14:54:50 No.108774218

Anonymous 05/07/26(Thu)14:54:50 No.108774218

>>108774151
I stopped using continue because the FIM is fucking shit and only works with the mistral api.

I recommend just using copilot with this extension
https://marketplace.visualstudio.com/items?itemName=AndrewButson.github-copilot-llm-gateway
It lets you use copilot with your llamacpp endpoint.

Anonymous
05/07/26(Thu)14:58:41 No.108774235

Anonymous 05/07/26(Thu)14:58:41 No.108774235

>>108773627
>Americans face job replacement
>buckle up your snowflake booties

>companies face competition from overseas
>anuhhuh pearl shoah

Anonymous
05/07/26(Thu)14:59:08 No.108774237

Anonymous 05/07/26(Thu)14:59:08 No.108774237

>>108774151
>What kind of absolute brainlet wrote this extension?
claude

Anonymous
05/07/26(Thu)15:07:19 No.108774276

Anonymous 05/07/26(Thu)15:07:19 No.108774276

>>108774218
Not him, but thanks. I'll be glad to ditch continue.

Anonymous
05/07/26(Thu)15:09:06 No.108774282

Anonymous 05/07/26(Thu)15:09:06 No.108774282

>>108774218
Thanks for the rec, anon.
>Sends first prompt and telemetry to microsoft, requires you to be logged in.
There's really just no winning. Still, if it actually knows how to fetch a jinja it's immeasurably better than continue.

Anonymous
05/07/26(Thu)15:17:20 No.108774332

Anonymous 05/07/26(Thu)15:17:20 No.108774332

they need to make 31b or lower models if they want people to bother with deepseek 4. It was understandable releasing huge as fuck models before the shortages even google of all fucking people realized this.

Anonymous
05/07/26(Thu)15:18:26 No.108774335

Anonymous 05/07/26(Thu)15:18:26 No.108774335

>>108774237
It's such an absolutely baffling choice I bet even claude haiku knows better. In fact, I'll check...
Kek, haiku actually did come up with a similar solution to the one Continue uses, only with one marked improvement: It said that there should be a user override in json schema.

The dumbest free claude model is smarter than the Continue dev/s.

Anonymous
05/07/26(Thu)15:18:39 No.108774336

Anonymous 05/07/26(Thu)15:18:39 No.108774336

>>108774332
If you can run 31b you can fit Dipsy's dense layer on your GPU when quanted. Anon does have a 5090 or 2 3090s, right?

Anonymous
05/07/26(Thu)15:21:20 No.108774349

Anonymous 05/07/26(Thu)15:21:20 No.108774349

File: Screenshot 2026-05-07 at (...).png (540 KB, 3364x1732)

540 KB PNG

>>108772246
we appear to be creating the same thing lmao
yes it's vibecoded, no I don't care

Anonymous
05/07/26(Thu)15:27:14 No.108774384

Anonymous 05/07/26(Thu)15:27:14 No.108774384

>>108774349
I'm kneeling all the same, king

Anonymous
05/07/26(Thu)15:28:31 No.108774392

Anonymous 05/07/26(Thu)15:28:31 No.108774392

>>108774349
Link? I tried searching for omnigatari online and nothing came up.

Anonymous
05/07/26(Thu)15:32:50 No.108774411

Anonymous 05/07/26(Thu)15:32:50 No.108774411

>>108774392
That's because I haven't published it yet, still needs work
it's based on pettangatari which another anon wrote

Anonymous
05/07/26(Thu)15:33:57 No.108774419

Anonymous 05/07/26(Thu)15:33:57 No.108774419

>>108774392
Judging from the name it's just Pettangatari (another doa vibecoded project). So he's taking a vibecoded project and vibecoding it further into the ground.
When you're vibecoding crap you're not thinking about any intrinsics, and you end up making a pile of crap with little intent and direction.
It's why not a single vibecoded project has took off.

Anonymous
05/07/26(Thu)15:34:09 No.108774421

Anonymous 05/07/26(Thu)15:34:09 No.108774421

>>108772246
I'll try your frontend when it's done.

Anonymous
05/07/26(Thu)15:34:10 No.108774422

Anonymous 05/07/26(Thu)15:34:10 No.108774422

File: 1747990930204206.png (435 KB, 707x904)

435 KB PNG

>>108774349
how many of us are there?

Anonymous
05/07/26(Thu)15:34:22 No.108774424

Anonymous 05/07/26(Thu)15:34:22 No.108774424

>yes it's vibecoded, no I don't care
BASED

Anonymous
05/07/26(Thu)15:37:41 No.108774437

Anonymous 05/07/26(Thu)15:37:41 No.108774437

>>108774411
>>108774419
Can you tell me more about how it works? Very interested in the whole generative mocap thing. Even prebaked animations are fine as long as they can be easily fine-tuned and intelligently selected/blended. The AI gf avatar space has been dry as fuck for a long time, mostly due to SHIT datasets.

Anonymous
05/07/26(Thu)15:38:07 No.108774440

Anonymous 05/07/26(Thu)15:38:07 No.108774440

File: 1755677518854774.gif (1.76 MB, 480x270)

1.76 MB GIF

>>108774419
>you end up making a pile of crap with little intent and direction

Damn... he's right. But for projects I take seriously, I make all architectural decisions myself and will often do multiple refactors, file by file and even function by function with the agent. Is that still vibecoding or would you say that's more "agentic engineering" territory?

Anonymous
05/07/26(Thu)15:40:12 No.108774457

Anonymous 05/07/26(Thu)15:40:12 No.108774457

>>108774440
>Is that still vibecoding or would you say that's more "agentic engineering" territory?
I would say that the label does not matter whatsoever

Anonymous
05/07/26(Thu)15:41:57 No.108774468

Anonymous 05/07/26(Thu)15:41:57 No.108774468

>>108774437
You are very innocent if you believe this is anything more than a menu that sends an openpose picture to comfyui for generating a static sprite.
>>108774440
"vibe"coding is a strong word. If you can actually code and you're paying attention to every change, then it's hardly "vibing", is it?
>>108774457
It does. Try vibecoding in the literal sense of the term for a week on a project. You will hardly be able to make sense of the code.

Anonymous
05/07/26(Thu)15:45:04 No.108774486

Anonymous 05/07/26(Thu)15:45:04 No.108774486

>>108774468
>menu that sends an openpose picture to comfyui for generating a static sprite.
Oh, brother. I guess nobody here is interested in solving hard problems. Good luck with your project, anyways.

Anonymous
05/07/26(Thu)15:45:10 No.108774487

Anonymous 05/07/26(Thu)15:45:10 No.108774487

>>108774468
I just ask the model to make the code good and it works.

Anonymous
05/07/26(Thu)15:46:25 No.108774492

Anonymous 05/07/26(Thu)15:46:25 No.108774492

>>108774457
I don't agree with that. There's definitely a difference between vibecoding and consciously architecting a project with prompts.
>>108774468
>"vibe"coding is a strong word. If you can actually code and you're paying attention to every change, then it's hardly "vibing", is it?
Agentic engineering is what I hear people saying. It seems to me like the main difference is whether or not you know how to code.

Anonymous
05/07/26(Thu)15:46:38 No.108774496

Anonymous 05/07/26(Thu)15:46:38 No.108774496

I ask the model to make it bad and explain why its bad.

Anonymous
05/07/26(Thu)15:50:42 No.108774516

Anonymous 05/07/26(Thu)15:50:42 No.108774516

>>108774349
Oh nice, I had a similar idea to that after seeing pettangatari too - only I was gonna use depth rather than openpose. Decided on going for something that didn't depend on having an imagen model loaded at the same time so I could max out my vram on textgen.

Anonymous
05/07/26(Thu)15:51:23 No.108774522

Anonymous 05/07/26(Thu)15:51:23 No.108774522

>>108774468
>It does. Try vibecoding in the literal sense of the term for a week on a project. You will hardly be able to make sense of the code.
you're not wrong, pettangatari's main logic was in a 16,000 line long file, I refactored it a bit but it's still not great

Anonymous
05/07/26(Thu)15:57:13 No.108774563

Anonymous 05/07/26(Thu)15:57:13 No.108774563

recommended reading for all vibecoders: https://adr.github.io/

Anonymous
05/07/26(Thu)15:58:03 No.108774566

Anonymous 05/07/26(Thu)15:58:03 No.108774566

>>108774522
Really, the instant gratification from letting an AI yolo the entire thing is not worth the hell that comes shortly thereafter.
I personally let it handle Javascript stuff (I dislike Javascript) and take care of backend C++ stuff myself. I however manually prompt like it's 2023 and wince at anything I don't like instead of blindly adding it.
Also, letting it go wild on a single giant file instead of taking a more modular approach is suicide.

Anonymous
05/07/26(Thu)15:58:59 No.108774568

Anonymous 05/07/26(Thu)15:58:59 No.108774568

>>108774522
>pettangatari's main logic was in a 16,000 line long file,
Friggin HOW
My frontend is 102% african with a 2% margin of error and it's only 3k lines.

Anonymous
05/07/26(Thu)16:03:55 No.108774594

Anonymous 05/07/26(Thu)16:03:55 No.108774594

daily reminder that gemma 4 is one of the least creative models in existence

Anonymous
05/07/26(Thu)16:04:31 No.108774600

Anonymous 05/07/26(Thu)16:04:31 No.108774600

File: 1687489302624888.gif (819 KB, 186x186)

819 KB GIF

>>108774566
I do the same exact thing brother, and JS makes me want to off myself, but I was working with what I had, and it was honestly a pretty nice base, even if architecturally messy
the toolchain is there for converting the heavy lifting to compiled code, but I'm still redefining things into standard interfaces so I can make that switch

Anonymous
05/07/26(Thu)16:04:49 No.108774602

Anonymous 05/07/26(Thu)16:04:49 No.108774602

>>108774594
lalalalalalalala

Anonymous
05/07/26(Thu)16:05:08 No.108774603

Anonymous 05/07/26(Thu)16:05:08 No.108774603

>>108774563
I'll second this. I got into the habit of writing ADRs at my previous job and it really does help. Helps with humans, helps even more with LLMs.

The concept sounds very simple and obvious but forcing yourself to sit down and concretely write that a decision is being made, and why you're making it, does absolute wonders for keeping things from devolving.

Anonymous
05/07/26(Thu)16:09:26 No.108774626

Anonymous 05/07/26(Thu)16:09:26 No.108774626

>>108774563
Something like https://github.com/endjin/dotnet-adr is good for having a consistent template and giving the model simple tools to manage them.

>>108774603
Main issue I've run into while using them is that the models will start making them for the most trivial shit.

Anonymous
05/07/26(Thu)16:10:09 No.108774629

Anonymous 05/07/26(Thu)16:10:09 No.108774629

>>108774336
anything under q5 is a waste of time and it looks like it gets bussy bullied by 31b-27b models already

Use case?

Anonymous
05/07/26(Thu)16:10:42 No.108774630

Anonymous 05/07/26(Thu)16:10:42 No.108774630

In a few years AI code will be indistinguishable from human code.

Anonymous
05/07/26(Thu)16:12:43 No.108774639

Anonymous 05/07/26(Thu)16:12:43 No.108774639

>>108774630
>In a few years AI code will be indistinguishable from human code.
But not because AI gets tremendously better.

Anonymous
05/07/26(Thu)16:14:56 No.108774651

Anonymous 05/07/26(Thu)16:14:56 No.108774651

>>108774630
You are absolutely right.

Anonymous
05/07/26(Thu)16:18:30 No.108774662

Anonymous 05/07/26(Thu)16:18:30 No.108774662

>>108774563
I have my own set of questions that works better than all these

Anonymous
05/07/26(Thu)16:27:27 No.108774701

Anonymous 05/07/26(Thu)16:27:27 No.108774701

>>108774630
It already is to me

Anonymous
05/07/26(Thu)16:35:02 No.108774746

Anonymous 05/07/26(Thu)16:35:02 No.108774746

>>108774629
it already beats jeets, what else is there left to do besides context and model optimizations?
The irony is even with this much power it burns the jeet's hand when wielded almost as if it's a cybernetic Mjölnir and the jeet is unworthy by blood

Anonymous
05/07/26(Thu)16:35:15 No.108774747

Anonymous 05/07/26(Thu)16:35:15 No.108774747

>>108774662
Would you care to share with the rest of the class?

Anonymous
05/07/26(Thu)16:35:50 No.108774750

Anonymous 05/07/26(Thu)16:35:50 No.108774750

>>108774630
I've had to deal with offshore labor in the past, indian and hispanic, and I assure you that AI is already able to out-code both of them.

Anonymous
05/07/26(Thu)16:36:49 No.108774755

Anonymous 05/07/26(Thu)16:36:49 No.108774755

>>108774750
>hispanic
Hispanic coders? whats that like

Anonymous
05/07/26(Thu)16:37:55 No.108774765

Anonymous 05/07/26(Thu)16:37:55 No.108774765

What's the current best voice clone/tts model?

Anonymous
05/07/26(Thu)16:38:20 No.108774769

Anonymous 05/07/26(Thu)16:38:20 No.108774769

>>108774755
Unlike indians, hispanics usually can manage to get their code to compile. That's about the only advantage they have.

Anonymous
05/07/26(Thu)16:38:23 No.108774771

Anonymous 05/07/26(Thu)16:38:23 No.108774771

Grok crashes my firefox tab every time I try to load a conversation with a long history. Nice product. Do the needful and buy today.

Anonymous
05/07/26(Thu)16:38:55 No.108774773

Anonymous 05/07/26(Thu)16:38:55 No.108774773

>>108774765
Ideally with multilingual support (at the very least, Japanese).

Anonymous
05/07/26(Thu)16:38:57 No.108774774

Anonymous 05/07/26(Thu)16:38:57 No.108774774

>>108774771
sar

Anonymous
05/07/26(Thu)16:39:23 No.108774777

Anonymous 05/07/26(Thu)16:39:23 No.108774777

>>108774771
local?
but yeah same it crashes or lags to hell if the chat gets too long. Even when short its fucking laggy sometimes.

Anonymous
05/07/26(Thu)16:39:39 No.108774781

Anonymous 05/07/26(Thu)16:39:39 No.108774781

>>108774765
uhhhhhhhhhh I saw some people sucking off OmniVoice recently. Haven't tried it myself though

Anonymous
05/07/26(Thu)16:39:42 No.108774783

Anonymous 05/07/26(Thu)16:39:42 No.108774783

>>108774765
Qwen3 TTS 0.6b has excellent studio-grade quality, but poor expression. Chatterbox-turbo is pretty, has slightly worse quality but is more expressive due to paralinguistic tags. The bigger multi-B models are mostly shit and not worth the compute. Whole TTS space is pretty dead ngl.

Anonymous
05/07/26(Thu)16:40:47 No.108774787

Anonymous 05/07/26(Thu)16:40:47 No.108774787

anyone asking for tts should just be given a link to gptsovits as it still rapes everything else

Anonymous
05/07/26(Thu)16:41:12 No.108774792

Anonymous 05/07/26(Thu)16:41:12 No.108774792

>>108774783
>Chatterbox-turbo is pretty
Wtf I did not write this.

I meant to say that Chatterbox-turbo is pretty, has slightly worse quality but better expressiveness.

Anonymous
05/07/26(Thu)16:41:59 No.108774800

Anonymous 05/07/26(Thu)16:41:59 No.108774800

File: 1756056545027274.png (357 KB, 640x480)

357 KB PNG

>>108774792

Anonymous
05/07/26(Thu)16:42:23 No.108774803

Anonymous 05/07/26(Thu)16:42:23 No.108774803

>>108774787
Having to finetune it is a pain in the dick and what always stopped me from bothering with it.

Anonymous
05/07/26(Thu)16:42:30 No.108774804

Anonymous 05/07/26(Thu)16:42:30 No.108774804

>>108774783
>>108774792
You're pretty too, anon.

Anonymous
05/07/26(Thu)16:42:34 No.108774806

Anonymous 05/07/26(Thu)16:42:34 No.108774806

>>108774792
Use your words, anon.

Anonymous
05/07/26(Thu)16:44:08 No.108774819

Anonymous 05/07/26(Thu)16:44:08 No.108774819

>>108774803
>put audio clips in folder
>make the transcript file
>point finetune gradio to audio folder and .list
>increase batch because low values suck
not very hard detbhsu

Anonymous
05/07/26(Thu)16:46:44 No.108774831

Anonymous 05/07/26(Thu)16:46:44 No.108774831

>>108774787
I always end up coming back to it. I try something else and it's either much lower quality, or way slower.

Anonymous
05/07/26(Thu)16:48:43 No.108774840

Anonymous 05/07/26(Thu)16:48:43 No.108774840

>>108774771
In any case it's pretty awesome that I can connect my custom MCP server to it with like two clicks now. Sorry about the shilling.

Anonymous
05/07/26(Thu)16:53:39 No.108774874

Anonymous 05/07/26(Thu)16:53:39 No.108774874

>>108774765
S2 pro but it has high memory requirements. Qwen3 TTS 0.6b/1.7b base is well rounded, good quality. Omnivoice variable audio quality but captures the speaker's prosody better than Qwen imo, but I don't use it because it doesn't support streaming, meaning poor TTFA. I use "faster-qwen3-tts".

Anonymous
05/07/26(Thu)17:02:48 No.108774930

Anonymous 05/07/26(Thu)17:02:48 No.108774930

>>108771812
just send your bot the html of a message with a code block ask and her to make you a user script to make it collapsible

Anonymous
05/07/26(Thu)17:02:58 No.108774932

Anonymous 05/07/26(Thu)17:02:58 No.108774932

>>108774765
echotts is the best I've ever used in terms of voice clone quality, although I am not super up to date on models from the last couple months

Anonymous
05/07/26(Thu)17:03:49 No.108774938

Anonymous 05/07/26(Thu)17:03:49 No.108774938

>>108772553
nta but even outside those ive had entire chats just break and the messages get lost idk how

Anonymous
05/07/26(Thu)17:04:54 No.108774949

Anonymous 05/07/26(Thu)17:04:54 No.108774949

>>108772225
ignore that and use arch

Anonymous
05/07/26(Thu)17:05:56 No.108774955

Anonymous 05/07/26(Thu)17:05:56 No.108774955

>>108774938
A frontend that doesn't even allow LAN usage doesn't even qualify to be called a frontend imo. It's a total piece of shit.

Anonymous
05/07/26(Thu)17:08:04 No.108774969

Anonymous 05/07/26(Thu)17:08:04 No.108774969

>>108774955
Good thing it allows LAN usage then :^)

Anonymous
05/07/26(Thu)17:08:45 No.108774973

Anonymous 05/07/26(Thu)17:08:45 No.108774973

>>108774961
>>108774961
>>108774961

Anonymous
05/07/26(Thu)17:09:46 No.108774982

Anonymous 05/07/26(Thu)17:09:46 No.108774982

>>108774938
Cant even imagine how that would happen. Its my frontend of choice, cant say i've had such issues.

Anonymous
05/07/26(Thu)17:11:01 No.108774990

Anonymous 05/07/26(Thu)17:11:01 No.108774990

>>108774955
are you retarded?
>>108774982
same i use it all the time but ive had that happen twice now kek

Anonymous
05/07/26(Thu)17:11:28 No.108774996

Anonymous 05/07/26(Thu)17:11:28 No.108774996

>>108774969
...excluding your conversation history.

Anonymous
05/07/26(Thu)17:12:29 No.108775000

Anonymous 05/07/26(Thu)17:12:29 No.108775000

>>108774990
Are you?

Anonymous
05/07/26(Thu)17:12:47 No.108775002

Anonymous 05/07/26(Thu)17:12:47 No.108775002

>>108774996
What are you even trying to say? lol

Anonymous
05/07/26(Thu)17:16:57 No.108775029

Anonymous 05/07/26(Thu)17:16:57 No.108775029

>>108775002
NTA but while the llama-server webui is accessible over LAN, all user conversations, tool configs, and settings are stored in browser. They're not accessible from a different browser over LAN, and in fact if you just switch what port llama-server is using, it won't remember your settings or conversations from the SAME browser.
This isn't a dealbreaker for me, but I can see how it would be for people who move around and access their crap from different devices.

Anonymous
05/07/26(Thu)17:20:37 No.108775047

Anonymous 05/07/26(Thu)17:20:37 No.108775047

>>108775029
You can copy local storage if you really need that. Storing in browser is good for the simplicity of the whole thing. I don't want the service to have accounts and server-side storage all just because some wanker is unable to copy and paste browser's local storage.

Anonymous
05/07/26(Thu)17:24:11 No.108775065

Anonymous 05/07/26(Thu)17:24:11 No.108775065

>>108775047
>I don't want the server to have server-side storage
Retard.

Anonymous
05/07/26(Thu)17:26:00 No.108775070

Anonymous 05/07/26(Thu)17:26:00 No.108775070

>>108775065
Well, I very much stand by what I said. You got argument that isn't "it has 'server' in name so anything that also has 'server' in its name belongs"?

Anonymous
05/07/26(Thu)17:26:08 No.108775072

Anonymous 05/07/26(Thu)17:26:08 No.108775072

>>108775065
>i want a client to have server side storage
we have the brightest minds here

Anonymous
05/07/26(Thu)17:26:41 No.108775074

Anonymous 05/07/26(Thu)17:26:41 No.108775074

>>108775047
The implication itself that copying local browser storage is somehow more convenient than simply copying a sqlite database file is so asinine that you have to be trolling.

Anonymous
05/07/26(Thu)17:27:47 No.108775080

Anonymous 05/07/26(Thu)17:27:47 No.108775080

>>108770835
very nice work on Teto and Gumi
gonna be busy for next however long so no lust provoking posts

Anonymous
05/07/26(Thu)17:28:01 No.108775082

Anonymous 05/07/26(Thu)17:28:01 No.108775082

>>108775074
Browser storage has only yours, sqlite database on server has everyone's. You are dumb, anon.

Anonymous
05/07/26(Thu)17:30:56 No.108775094

Anonymous 05/07/26(Thu)17:30:56 No.108775094

>>108775082
Oh, sorry, I wasn't aware that you shared your LAN with 30 other favela monkies.

Anonymous
05/07/26(Thu)17:38:01 No.108775128

Anonymous 05/07/26(Thu)17:38:01 No.108775128

>>108775094
I don't. And I also don't want the server to assume I do, which with your server storage thing it will have to assume that I do.

Anonymous
05/07/26(Thu)17:39:54 No.108775139

Anonymous 05/07/26(Thu)17:39:54 No.108775139

>>108774110
Well, it's not literally running inference in reverse, but you can use optimization methods to update input (instead of weights as usual) to create inputs that make model produce desired outputs.

It's used to craft so called "adversarial examples", for interpretability research (like "what inputs make this neuron fire", see for example https://distill.pub/2017/feature-visualization/) and IIRC there was a paper on arxiv that used this to generate LLM jailbreaks.

Anonymous
05/07/26(Thu)17:47:13 No.108775169

Anonymous 05/07/26(Thu)17:47:13 No.108775169

>>108775080
>so no lust provoking posts
no promises

Anonymous
05/07/26(Thu)18:10:28 No.108775274

Anonymous 05/07/26(Thu)18:10:28 No.108775274

>>108773313
>not x
>y

Anonymous
05/07/26(Thu)18:17:52 No.108775302

Anonymous 05/07/26(Thu)18:17:52 No.108775302

>>108775274
anon, you do know negation isn't an LLM invention, right?

Anonymous
05/07/26(Thu)18:37:58 No.108775417

Anonymous 05/07/26(Thu)18:37:58 No.108775417

>>108775302
Negation isn't just a linguistic tool, it's a gateway to deeper understanding. You didn't just correct an assumption, you contributed to a nuanced discussion about the evolution of language and thought.

Anonymous
05/07/26(Thu)18:43:09 No.108775446

Anonymous 05/07/26(Thu)18:43:09 No.108775446

>>108775417
words words

Anonymous
05/07/26(Thu)20:00:03 No.108775888

Anonymous 05/07/26(Thu)20:00:03 No.108775888

>>108775080
Ty. They were fun to sew up. Each was a bit different.
Gumi is watching from my front door currently. She’ll move in with the rest of the squad shortly.

[Return] [Catalog] [Top]

Post a Reply

Return Catalog Top Refresh

[Advertise on 4chan]

Delete Post: [File Only] Style:

[Disable Mobile View / Use Desktop Site]

[Enable Mobile View / Use Mobile Site]

All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.