>>107910326
Config defaults from the PR: https://github.com/huggingface/transformers/pull/43031/files
vocab_size: int | None = 154880,
hidden_size: int | None = 2048,
intermediate_size: int | None = 10240,
moe_intermediate_size: int | None = 1536,
num_hidden_layers: int | None = 47,
num_attention_heads: int | None = 20,
num_key_value_heads: int | None = 20,
n_shared_experts: int | None = 1,
n_routed_experts: int | None = 64,
routed_scaling_factor: float | None = 1.8,
kv_lora_rank: int | None = 512,
q_lora_rank: int | None = 768,
qk_rope_head_dim: int | None = 64,
v_head_dim: int | None = 256,
qk_nope_head_dim: int | None = 192,
n_group: int | None = 1,
topk_group: int | None = 1,
num_experts_per_tok: int | None = 4,
norm_topk_prob: bool | None = True,
hidden_act: str | None = "silu",
max_position_embeddings: int | None = 202752,
initializer_range: float | None = 0.02,
rms_norm_eps: float | None = 1e-5,
use_cache: bool | None = True,
pad_token_id: int | None = None,
bos_token_id: int | None = 0,
eos_token_id: int | None = 1,
pretraining_tp: int | None = 1,
tie_word_embeddings: bool | None = False,
rope_parameters: RopeParameters | dict[str, RopeParameters] | None = None,
rope_interleave: bool | None = True,
mlp_layer_types=None,
attention_bias: bool | None = False,
attention_dropout: float | None = 0.0,
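Quick back-of-envelope from those defaults, not from the PR itself. My own assumptions: SwiGLU-style 3-matrix MLPs (which hidden_act="silu" usually implies), every layer being MoE (mlp_layer_types could make some dense), DeepSeek-style MLA caching (suggested by kv_lora_rank / qk_rope_head_dim), and only FFN weights counted, so no attention, embeddings, norms, or router.

# Rough sketch under the assumptions above; numbers come straight from the config defaults.
hidden_size = 2048
moe_intermediate_size = 1536
n_routed_experts = 64
n_shared_experts = 1
num_experts_per_tok = 4
num_hidden_layers = 47
kv_lora_rank = 512
qk_rope_head_dim = 64

def ffn_params(d_model: int, d_ff: int) -> int:
    # gate_proj + up_proj + down_proj of a SwiGLU block
    return 3 * d_model * d_ff

per_expert = ffn_params(hidden_size, moe_intermediate_size)
total_moe = (n_routed_experts + n_shared_experts) * per_expert      # stored per layer
active_moe = (num_experts_per_tok + n_shared_experts) * per_expert  # used per token per layer

print(f"per expert:        {per_expert / 1e6:.1f}M params")
print(f"per layer  total:  {total_moe / 1e6:.1f}M, active: {active_moe / 1e6:.1f}M")
print(f"all {num_hidden_layers} layers total: {total_moe * num_hidden_layers / 1e9:.2f}B, "
      f"active: {active_moe * num_hidden_layers / 1e9:.2f}B  (FFN only)")

# If it really is MLA, the KV cache per token per layer is the compressed latent
# plus the decoupled RoPE key, not full per-head K/V.
kv_per_tok_per_layer = kv_lora_rank + qk_rope_head_dim
print(f"MLA KV cache: {kv_per_tok_per_layer} values/token/layer, "
      f"{kv_per_tok_per_layer * num_hidden_layers * 2 / 1024:.1f} KiB/token in bf16")

So on the FFN side alone it pencils out to roughly 29B total / 2.2B active, less if some layers turn out to be dense.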
Also see https://github.com/vllm-project/vllm/pull/31386/files
llama.cpp? In 2 weeks, maybe.