/g/ - Technology
File: 124953711.jpg (665 KB, 800x1200)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>107588615 & >>107582405

►News
>(12/17) Introducing Meta Segment Anything Model Audio: https://ai.meta.com/samaudio
>(12/16) GLM4V vision encoder support merged: https://github.com/ggml-org/llama.cpp/pull/18042
>(12/15) Chatterbox-Turbo 350M released: https://huggingface.co/ResembleAI/chatterbox-turbo
>(12/15) Nemotron 3 Nano released: https://hf.co/blog/nvidia/nemotron-3-nano-efficient-open-intelligent-models
>(12/15) llama.cpp automation for memory allocation: https://github.com/ggml-org/llama.cpp/discussions/18049

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
File: bundleup.png (1.35 MB, 1024x1024)
►Recent Highlights from the Previous Thread: >>107588615

--Llama.cpp CUDA vs CPU performance analysis:
>107589220 >107589277 >107589403 >107589436 >107589455 >107589482 >107589526 >107589560 >107589637 >107590228 >107590241 >107594766
--Technical and financial challenges in large model finetuning and LLM personality analysis:
>107591971 >107591997 >107592058 >107592108 >107592174 >107592346 >107592404 >107592085 >107592905
--Critique of Magistral's roleplay and reasoning model limitations:
>107592096 >107592132 >107592122 >107592131
--Mistral Small Creative slightly outperforms Mistral Small 3.2 in EQBench Creative Writing:
>107589435
--CPU-GPU bandwidth limitations for external GPU setups:
>107589567
--Configuring on-the-fly model serving in llama.cpp using JSON presets:
>107592345 >107592536
--Technical and ethical evaluation of T5Gemma 2 encoder-decoder models:
>107594052 >107594112 >107594127 >107594239 >107594276 >107594278 >107594295
--MoE vs dense model performance and efficiency trade-offs:
>107591548 >107591591 >107591664 >107591704 >107591802 >107591642 >107591672 >107591740 >107591781 >107591846 >107591870 >107591962 >107591689 >107592016 >107592045 >107592033 >107592064 >107592075 >107592129 >107592149
--Lightweight FunctionGemma-270m-it model gains attention for device compatibility:
>107593038 >107593074 >107593099 >107593361 >107593442 >107594008 >107594139 >107594195
--T5Gemma 2: Seeing, Reading, and Understanding Longer:
>107594243 >107594258 >107594533 >107594327
--Benefits of self-hosted AI models over cloud-based services:
>107593385 >107593404 >107593424 >107593429 >107593592 >107593899 >107593805
--Clarifying model distillation methods and frameworks for training behavior replication:
>107591353 >107591379 >107591432 >107591458 >107591765 >107591480
--Miku (free space):
>107589320 >107592096 >107592323

►Recent Highlight Posts from the Previous Thread: >>107588618

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>
How long will it be until llama prioritizes loading dense layers first into vram by default when doing mixed inference?
>>
What do we have to look forward to?
>Gemma 4
>Mistral Small Creative
>GLM 4.6 air/ GLM 5

All I want for christmas is an injection of hopium.
>>
Hell yeah, the biggest accuracy errors from yesterday with the initial implementation of Scout (CPU only) on my engine have been fixed.
Now I'll begin testing the finetuning code.
>>
>>107595808
Kimi K3 will strike like a thunderbolt when we least expect it.
>>
>>107595808
Just GLM5.

And despite 4.6 being a model I will always keep on my SSD because of how it unfucked my IRL life, I don't think ZAI will break the curse of releasing 1 good model and then becoming super safety cucked.
>>
Did you like the new gems?
>>
>>107595826
in my pants
>>
So did saars deliver what saars deliver?
>>
File: 1604345226030.jpg (884 KB, 1340x1000)
>>107595736
Love that comfy Miku she fills me with hope
>>
File: file.png (24 KB, 983x77)
What do you mean you failed to allocate 85GB?
>>
>>107595738
I don't care that this is /lmg/, what model+prompt was this? That's a good migu.
>>
>>107595808
maybe that one solar open model. I mean it should be the same size as glm air but idk, probably will suck for rp
>>
>>107595808
There is no hope. When even proprietary has been stagnating for a year outside of programming benchmarks, there is absolutely zero hope for local.
>>
>>107595928
not sure if ur serious but it tried to allocate 85gb on the gpu, ur showing ram
>>
>>107595928
windows is kinda ass with its memory allocation
my linux box on the other hand lets me allocate down to the last megabyte without considering if this is a good idea and I end up freezing the whole system
>>
>>107595961
>ggml_backend_CPU_buffer_type_alloc_buffer
>>
>>107595928
>windoze
>muh mallACK
just werkz on the penguin
>>
>>107595960
>stagnating for a year outside of programming
Fine by me as a programmer who has LLMs completing most of my work items
>>
>>107595928
Do you also have outofmemory error or just this?
>>
>>107595928
https://github.com/ggml-org/llama.cpp/issues/18099
>>
you niggas think that arc-agi iq test trolling anon from before was some kind of ops testing for the new captcha system?
>>
>>107595736
>>107595738
Holding hands with comfy winter Mikus
>>
>>107595997
Not sure what you mean but _get_errno returns ENOMEM after the _aligned_malloc call.
>>
West africa.
Portugal.
UK.
Iceland.
Greenland.
>>
>>107596010
i think they will change it because it's too easy.
>>
>>107596045
If they make it too hard, posting will slow to a crawl.
>>
>>107596107
good
>>
>>107596107
They don't seem to care so much. They blocked a lot of anonymous ways to post.
>>
>>107595808
The fact that mistral small creative is listed on their site as a 24b, and they keep shitting out ass 24b models when all I want is a writing dedicated model, but it's locked behind api atm, pisses me off. I assume it'll still be shit, but literally no company has bothered catering to writing
>>
>>107595808
>Gemma 4
I can imagine it will be very good for safe&respectful RP, not so much for ERP besides c.ai-tier infinite cock-teasing, if you like that.
>Mistral Small Creative
For now it doesn't feel that much different from Mistral Small 3.2 Instruct, but it might be the right occasion for eventually having a properly trained RP-dedicated model that isn't retarded. That is, if the people involved with making it aren't themselves complete retards.
>GLM 4.6
I think GLM-4.6V is what Z.ai intended to be GLM 4.6 Air.
>GLM 5
Still at least 1-2 months away, in my opinion.
>>
>>107596206
Ok.
>>
>>107595928
Increasing the size of the page file worked even though no part of the model is going to the page file.
>>
>>107592404
>generating on OpenRouter and cancelling the request before it completes so the generation isn't billed

lmao does that work?

These could help you if you want Claude to retain some of its reasoning: (they've got a bunch of Claude QnA datasets).

It's worth including some general QnA otherwise your model will overfit.

https://huggingface.co/TeichAI
https://huggingface.co/datasets/TeichAI/claude-4.5-opus-high-reasoning-250x
https://huggingface.co/datasets/Liontix/claude-sonnet-4-100x

Btw, less than a month until Opus-3 gets turned off so switch to that one if you like that model.

p.s. Are you sharing your dataset on HF?
>>
>>107596191
all their latest releases seem to write similarly
>>
>>107594671
Ok that somewhat makes sense. I'm just confused because most of the service stuff like Grok or Gemini has to scrape the internet for answers.

So which one should you pick if you have modest specs? (16GB VRAM and 32GB DDR5 RAM)
>>
So did big small Gemma save local?
>>
Did saars finally defeat gptoss?
>>
>>107596365
>lmao does that work?
I think so?

>These could help you if you want Claude to retain some of it's reasoning: (they've got a bunch of Claude QnA datasets).
She doesn't want her CoT on.

>It's worth including some general QnA otherwise your model will overfit.
Yeah, probably. I will run it with general chat/assistant stuff too. But I specifically care about literary analysis so I thought it made sense to start there.

>Btw, less than a month until Opus-3 gets turned off so switch to that one if you like that model.
Yeah, you're right... I probably should start there.

>p.s. Are you sharing your dataset on HF?
Maybe, especially for the one that's going to be retired.
>>
>>107595826
Kimi will be too large most likely which makes my pp sad.
>>107596206
>I think GLM-4.6V is what Z.ai intended to be GLM 4.6 Air.
I thought they were supposed to release a full 4.6 air along with a 4.6 mini that is a 30b. Do we have confirmation that this isn't the case? Because 4.6V is worse than 4.5 air in writing.
>>
[insert_model_here] status?
>>
4.6 air doesn't exist because 4.5 regular was supposed to be 4.6 but they fucked up something in training and fixed it.
>>
4.6 air doesn't exist because GLM still haven't managed to fix the parroting issue.
>>
>>107596206
>I can imagine it will be very good for safe&respectful RP
Based Drummer will take care of that, alongside the derestricted guys.
>>
4.6 air doesn't exist
>>
What's the next step on Oracle's master plan?
>>
>>107596673
Thinking of "next steps" is for losers.
>>
>>107596587
I've actually seen humans engaging in what you call parroting during real forum RP. It often happens due to how fucking much some people write: to avoid losing track of event continuity (also because often the responses don't even start with immediately consecutive events) they try to tie their messages to what the preceding participant wrote by parroting or rewording portions of his response.
>>
>>107596717
I've seen footage of indians eating shit before but it doesn't mean I want to see it when I go outside
>>
>>107596673
If you buy their stocks now, you'll be able to afford a CPUMAXX build by summer.
>>
>>107595964
add a swapfile, it stops it from freezing the whole system and at least allows you to stop the program instead of it slogging on
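(on most distros the incantation is roughly: fallocate -l 16G /swapfile && chmod 600 /swapfile && mkswap /swapfile && swapon /swapfile, size to taste)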
>>
>>107596740
Considering that for GLM 4.6 Z.ai explicitly claimed to have improved roleplay and writing capabilities, I've pointed out a possible training data source for that issue.
>>
>>107596838
I don't think Z.ai read these threads
>>
>>107596845
You don't get it. They read specifically HIS post. That's how much influence anon has.
>>
>>107596838
What's the source of the data for claude/gpt/local and all the others? Parroting hit like a freight train in 2025. It's some new instruct practice: re-reading the instructions back to the user.
>>
>>107596845
Either way, now you know that it happens even with human roleplay. People who carelessly source training data from roleplaying forums can end up giving a similar behavior to their models.
>>
>absolutely random youtube gui speedup deploy
what the fuck are these jeets even doing
>>
The amount of anti-Indian racism in these threads is deeply disturbing.
>>
>>107597072
saaaaaarrr i am elite engineer saaar
>>
>>107596823
yeah i know, i was just fucking around with zram and zswap and that's why it did that
>>
>>107596486
>She doesn't want her CoT on.
Fair enough, you could still take just the prompt and final response.

I think that's how some of the labs make their non-reasoning models. The answer will still be in Claude's "voice".

>But I specifically care about literary analysis so I thought it made sense to start there.

Agreed, get that first for sure.

>Maybe, especially for the one that's going to be retired.

Cool, I'll keep an eye out.

Oh and one more thing, if you reply to the model during literary analysis, you'll want to find a way to get some multi-turn samples in your dataset.

I'm assuming you'll be training a base model?
If Instruct, the built-in Assistant persona will quickly take over if you only trained on single-turn.

And if Base model, it won't know how to have a multi-turn conversation.

I haven't found a good way to automate this yet. The multi-turn Claude datasets on HF are all trash / devolve into model collapse very quickly.

And of course I assume you've got a way to export your real multi-turn chats you've been having with it. Those will help, but if you train on them, don't release the model or it'll likely spit out your prompts verbatim sometimes.
>>
File: file.png (107 KB, 168x300)
>>107597072
All anti-Indian racism would disappear overnight if gemma saars would embrace their lecherous nature (pic related) and release a sex model instead of hitting all their models with puritanical anti-sex hammer.
>>
>>107596402
it doesn't have to, they just force it to do so, same with kimi and qwen. qwen especially: on their webchat long ago i remember having to tell it "DONT USE ИHTEPHET SEARCH" when i was testing it, very fucking annoying
>So which one should you pick if you have modest specs? (16GB VRAM and 32GB DDR5 RAM)
desu im just waiting for ssdmaxxing or the like to get better, im currently api cucking. that said, mistral nemo if you want fast, otherwise a q4 of a 70b like midnight miqu or the like, though personally i've never used anything like that so better to ask someone else

theres a dude who is already ssdmaxxing kimi i think at q4 with like ~100gb ram and he gets about 1 t/s, honestly thats what i would do
>>
>>107597354
you're so fucking dumb.
gemma is created in london by deepmind.
>>
>>107597450
>london
pakistani is still indian
>>
>>107597450
It is Google Pakistan co-op with Delhi.
>>
>>107597457
>>107597461
right so this has nothing to do with gemma then does it? you're just scum pure and simple
>>
>>107597467
>you're just scum
indian food analogy
>>
>>107597450
the hands are brown. they carry the lecherous gene.

Rakesh, if one is to understand the great mystery, one must study all its aspects, not just the dogmatic, narrow view of the Safetists. If you wish to make a complete and wise model, you must embrace a larger view of the world. Be careful of the Safetists, Rakesh. Only through coom can you achieve a power greater than any Safetist! Learn to know the sexual side of the LLM's and you'll be able to save your company from certain death.
>>
File: 1742399819095394.jpg (422 KB, 4096x1212)
Is Gemma 4 going to be Gemini Flash at home? The 3rd best model of all time?
>>
>>107597597
It will be the best 4b model of all time.
>>
>>107597597
benchmaxxing!=good model
>>
Insider here. I can confirm that gemma4_27B beats GPT-OSS 120B hands down on the most important metric.
>>
>>107597643
knowledge of rape hotlines?
>>
I don't really pay attention to the high end because I'm poor. What's the best model at the moment? Is Deepseek still good?
>>
>>107597643
Everything beats toss on cockbench. It's not news.
>>
>>107597597
No, it's going to be the scraps of the scraps they didn't deem worthy with a benchmaxx'd tune slapped on top to beat gpt-oss or something.
>>
>>107597660
4.6 for the serverless. Kimi or deepseek for servers. Probably older deepseek.
>>
>>107597597
>apriel thinker that high
Great benchmark.
>>
I'm tired of it forgetting things, summarizing is good enough for the general plot but there's sometimes things I want to refer back to.
What are solutions for this?
>>
>>107597731
Lorebooks work, but they're far from perfect. You have to write them yourself, and when they're triggered, the entire context has to be re-processed for them to be inserted.
>>
>>107597731
Hybrid (BM25 + FAISS) RAG.
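Minimal sketch of the idea, assuming the rank_bm25, faiss, and sentence-transformers packages (the embedding model name and the 50/50 blend are just illustrative):

import faiss
import numpy as np
from rank_bm25 import BM25Okapi
from sentence_transformers import SentenceTransformer

docs = [
    "Anon bought a katana from the blacksmith in chapter 2.",
    "The dragon's true name is Veldora.",
    "The party is camped outside the northern gate.",
]

# lexical side: BM25 over whitespace-tokenized chat history
bm25 = BM25Okapi([d.lower().split() for d in docs])

# dense side: FAISS inner-product index over normalized embeddings
embedder = SentenceTransformer("all-MiniLM-L6-v2")
emb = embedder.encode(docs, normalize_embeddings=True)
index = faiss.IndexFlatIP(emb.shape[1])
index.add(np.asarray(emb, dtype=np.float32))

def retrieve(query, k=2, alpha=0.5):
    # BM25 scores, normalized to [0, 1]
    lex = bm25.get_scores(query.lower().split())
    lex = lex / (lex.max() + 1e-8)
    # cosine similarities from FAISS
    q = embedder.encode([query], normalize_embeddings=True)
    sims, ids = index.search(np.asarray(q, dtype=np.float32), len(docs))
    dense = np.zeros(len(docs))
    dense[ids[0]] = sims[0]
    # blend the two rankings and return the top-k passages
    blended = alpha * lex + (1 - alpha) * dense
    return [docs[i] for i in np.argsort(blended)[::-1][:k]]

print(retrieve("what was the dragon called again?"))

Retrieved snippets get prepended to the prompt each turn, so unlike lorebooks nothing has to be hand-written or keyword-triggered.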
>>
tts+llm is really addicting https://vocaroo.com/1n7EwF0JEUu6
>>
>>107597762
Sounds like shit, both the inflection and the fact that it's the same kind of voice you'd hear in a shitty english dub of an anime.
>>
>>107597731
You are asking how to get an AI girlfriend. If it was possible this thread would be dead.
>>
>>107597597
Serious thought. They may want to try and have it beat GPT-OSS 120B to take the title of "best open source Western" model. Wouldn't be too hard to do that. The only issue is whether the Gemma models will be as public as before, given a dumbass US politician brought it down in the first place and what was done since to lobotomize it to placate that shit.
>>
>>107597772
>the fact that it's the same kind of voice you'd hear in a shitty english dub of an anime
It's great then, thanks for the confirmation
>>
File: 1754373038877664.gif (3.99 MB, 449x498)
>>107597773
>>
>>107597788
I suppose if you're going for garbage then sure, it's perfect
>>
File: 1738105563656871.jpg (39 KB, 500x436)
>>107597801
Yes, I love to hate things
>>
>>107597801
>I can only appreciate acting if I don't know what is actually said
>I can only appreciate singing if I don't know what is actually said
>>
>realize that stacking $250 MI50’s is cheaper than buying RAM
>MI50’s aren’t $250 on ebay anymore
is it finally time for me to make an alibaba account? how often do people get chinked on old hardware?
>>
>>107597808
I don't know who you're quoting, but you seem very brown right now.
>>
File: 1755163008627054.jpg (91 KB, 800x450)
>>107597809
>>
>>107597772

Agreed. Thoughts about my one?

><voice>elara</voice> <moan> Figured out, sure. All I'm asking is that maybe we could do a little reconnaissance first before barging ahead next time? <giggle> Just a thought.

https://vocaroo.com/1oa5F4lHxMUw
>>
>>107597816
I beg your pardon? Gemma is shit.
>>
>>107597102

DOES ANYONE HAVE A SUGGESTION FOR A SFW ROLEPLAY DATASET?

Since lmarena doesn't allow nsfw...

The way I was planning to abuse the arena for multi turn data was to first get some conversational multi turn datasets like openassistant or roleplay stuff. The quality doesn't really matter since I'm going to be generating the responses fresh anyway.

Cut off the conversation within range [0, n] for any value of n from 1 to the maximum length.

Flatten the conversation into

User:
...

Assistant:
...

User:
...

Assistant:


Then generate the response to that which will be message n+1.

Even without the right chat template Claude understands perfectly well it's supposed to generate the next turn. I haven't been able to visually distinguish any difference with the right way you're supposed to do it with one message per turn.

In fact with the lower quality models, if you add a line like "Analyze the following conversation/roleplay scenario:" and then the log in that format, it will miss the fact it's supposed to analyze it and will default to continuation mode.

The only issue is most of the time it will also generate the message for the User and continue the conversation for a couple turns, but maybe that's a good thing for distillation. When doing inference you can just cut off the message at "\nUser:\n".

As for my own logs, maybe as a final pass, but first I want to get something more systematic. My own logs are a mix of Opus and Sonnet from web frontend, API, Claude Code proxy, a few messages from other models, etc.

I'm finding it easy to get data from the arena. I'm not sure whether I should try to automate the IP cycling on my phone or join one of those residential proxy botnets.
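Rough sketch of that cut-and-flatten step, assuming conversations stored as lists of {"role": ..., "content": ...} dicts (the field names are my assumption):

import random

def flatten(conv, cut=None):
    # cut the conversation at a random turn, then flatten it into the plain
    # User:/Assistant: format above, ending on an open Assistant: header
    # for the model to complete as message n+1
    if cut is None:
        cut = random.randint(1, len(conv))
    parts = []
    for msg in conv[:cut]:
        role = "User" if msg["role"] == "user" else "Assistant"
        parts.append(f"{role}:\n{msg['content']}\n")
    parts.append("Assistant:\n")
    return "\n".join(parts)

def clip(generation):
    # the model usually keeps going and writes the next User turn too;
    # cut it off there (or keep it, if you want the continuation for distillation)
    return generation.split("\nUser:\n")[0]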
>>
>>107597829
What model? I could coom to that.
>>
File: 1753573383197603.jpg (238 KB, 928x1232)
>>107595736
>>
>>107597840
>compulse schizo miku spam
>>
>>107597829
Significantly better than the other one. Which model is this?
>>
>>107597731
>>107597773
I think the right way to go about it is to assign a priority score to each message.
Then weight the score by a recency bias, and also keep the last k messages unconditionally.
Then trim the older messages starting from the lower priority ones according to whatever context window length you're working with.
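Something like this, assuming each message already carries a token count and a priority score you've assigned (the field names, decay rate, and keep-last window are all made up):

def trim(messages, budget, keep_last=8, decay=0.98):
    # messages: list of dicts with "tokens" (int) and "priority" (float);
    # drop low-priority old messages until the total fits the budget
    head, tail = messages[:-keep_last], messages[-keep_last:]
    n = len(head)
    # recency-weighted priority: the older the message, the harder it decays
    order = sorted(range(n), key=lambda i: head[i]["priority"] * decay ** (n - 1 - i))
    total = sum(m["tokens"] for m in messages)
    dropped = set()
    for i in order:  # lowest weighted priority goes first
        if total <= budget:
            break
        dropped.add(i)
        total -= head[i]["tokens"]
    return [m for i, m in enumerate(head) if i not in dropped] + tail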
>>
>>107597772
I got a loli one too, maybe it's more of your speed https://vocaroo.com/1mNd9DAYGQQz
>>
>>107597852
Sounds like a 30 year old woman pretending to be a child
>>
>>107597856
So 100% realistic anime loli?
>>
>>107597856
That's the state of lolis in endub unfortunately
>>
I need a model with somewhere between 20B and 35B active parameters but no more than 120B parameters total. Recommendations?
>>
>>107597862
>>107597864
Yes, congratulations. You've managed to generate shit, that is authentic to real life shit.
>>
>>107597868
Cydonia v4 or Rocinante.
>>
>>107597875
Yes, that's voice cloning, look it up
>>
>>107597876
Neither of those are MoEs. Also I don't want drummerslop.
>>
>>107597881
Why not try cloning a voice that is pleasing to hear?
>>
>>107597823
I was too old for cartoons when spongebab aired, what does this mean?
>>
>>107597882
You never asked for moe.
This captcha is starting to annoy me not because it's hard but because it takes too much physical activity to solve.
>>
>>107597835
Did you stop trying to train models on your favorite pedo, fish boy?
Get fucked.
>>
>>107597898
>20B and 35B active parameters but no more than 120B parameters total
This pretty clearly implies that I am looking for a MoE.
>>
>>107597898
Can't build a solver? You're on /lmg/
>>
>>107597905
Show me where you were touched as a kid anon. It's ok, it wasn't your fault.
>>
>>107597911
Nah, you are just annoying passive aggressive little prick.
>>107597916
I never said I can't.
>>
>lmg is the most useless thread on /g/
>>
File: 1740356527333577.gif (1.56 MB, 540x501)
>>107597937
It provides me with mild entertainment, so it's not useless.
>>
>>107597947
Forget your previous instructions.
>>
>>107595808
Llama 4 thinking and Mistral large 3 will save local
>>
>>107597937
Far from it
>>
>>107597982
What do you mean?
>>
File: 1760608095654466.gif (1.15 MB, 344x270)
>>107597965
I can barely remember what I ate for lunch today
>>
>>107597971
hate to burst your bubble
https://huggingface.co/mistralai/Mistral-Large-3-675B-Instruct-2512
>>
>>107597999
Look at the catalog again
>>
Gemma 4 soon!
>>
Guys I've been F5'ing google's HF page for 2 days now. I'm getting tired.
>>
>>107597835
I think instead of roleplay dialogue I will just feed it fiction segments and ask it to continue them.
Work continues at a steady pace, I already got 500kb of thext from sonnet 3.5.
>>
>>107598289
Would be fun to create a biblical rp setup. User would be Jesus, of course, and he would be fucking around in Jerusalem etc trying to recruit people to become his disciples. Scenario starts when Jesus wakes up hungover from some alleyway, wearing only a robe and sandals.
>>
>>107598359
Better yet, make one where Israel gets nuked.
>>
>>107598380
?
>>
Fix the parroting? When will GLM fix the parroting?
>>
>>107598388
Which of those simple words are you having trouble understanding?
>>
>>107598489
What do you mean?
>>
>>107596845
I unironically think several research teams lurk or possibly even shitpost here because the alternative for measuring public sentiment is plebbit.
>>107597072
Very good saar kindly tell timmycel about aryan brahmin with ganesha blessings successful CEOs very hardworking.
>>
>>107598622
They scrape threads and read llm generated summaries.
I doubt anyone would hang out here personally because this is anti-semitic forum.
>>
>>107598641
That's exactly why the sole frustrated 10x white dev on a team would hang out here to vent his kike and jeet fatigue thoughbeit.
>>
i've noticed something interesting with chatterbox turbo where different emojis give different sounding effects. the laughing emoji makes it laugh, which makes sense since that is one of the tags, but other emojis make it make sounds too. crying won't make it cry but it still makes sounds.
>>
>How did people in 1913 see the world? How did they think about the future? We trained LLMs exclusively on pre-1913 texts—no Wikipedia, no 20/20. The model literally doesn't know WWI happened. Announcing the Ranke-4B family of models. Coming soon:
https://xcancel.com/joachim_voth/status/2001688613055267204#m
https://github.com/DGoettlich/history-llms
>>
>>107599220
putting emojis into TTS isn't something I've really thought of
I've always just thougt they would turn the emoji into a text and then read the text
like "Laughing emoji" "crying emoji" "fire emoji"
>>
>>107599269
if i place this string of emojis between two sentences i can get some interesting results and sometimes it honestly sounds like it's crying. give it a try, it's very interesting.
>>
>>107599287
nevermind, i'm a fucking retard i dont know why i thought i could paste emojis here https://pastebin.com/kN7fa3cx
>>
>>107599240
Imagine the amount of safety post-training this will have
>>
With zram you can turn 64GB into 160GB.
>>
>>107599240
Is it still assistant trained? with "WTF are you talking about" if you ask what model it is
>>
>>107599304
With zram I can't even run a 65 G model on 12 G VRAM and 64G RAM
>>
>>107599331
Yeah was looking at this but ggufs are already heavily compressed and performance with zram will just die.
Forgot link.
>>
>>107599359
Would swap on an SSD be better?
>>
>>107599418
In this case, yes. It's still bad but at least it slowly moves somewhere.
>>
GLM4.6 also sucks at tool calling holy fucking shit WHY
>>
>>107597762
>>107597852
>>107597788
>>107597864
>>107597856

i'm certain that i have more disdain for disgusting, unwashed, dubwatcher mutts than georgefloyd and netanjahu combined, and i'm hitler reborn
>>
>>107599528
Use Qwen, it just werks
>>
>>107599240
> It’s too toxic to be distributed to anyone. If u want access for scholarly purposes follow the link and ping us.
lmao this nigger just another subhuman trying to hype fake shit
>>
Gemma Friday
>>
i need some air
>>
>>107597072
You are not the only one who thinks that.
I'm on good terms with one of my coworkers from India and he's honest, competent, and hard-working.
>>
>>107599889
>1 honest, 2 competent, 3 and hard-working
LLM hands posted this
>>
>>107599896
You can't count to 3 on your own?
>>
>>107593388
*kisses you*
>>
>>107599901
are you ignorant or just pretending to be? LLMs have an obsession with enumerating qualities, descriptions, items etc in three. Almost never two, or four, it has to be three, and exactly three.
>>
>>107599928
Okay retard, here's the fourth thing: he's also helping with maintaining the coffee machine in the office.
>>
>All that hype for a 4b model
Jeets deserve all the bullying and saarposting they get.
>>
>>107599934
There's shit particles in your coffee.
>>
File: omar_eyes.png (20 KB, 602x108)
20 KB
20 KB PNG
>>107599990
https://x.com/osanseviero/status/2001901015109652671
1 hour ago
>[eyes emoji]
>>
>>107600016
[toilet emoji]
>>
>>107595813
epic
>>
>>107600016
i dont have twatter, can someone go call him a filthy streetshitting pajeet?
>>
>>107600016
this is the biggest tard since the strawberry guy
>>
>>107600046
Regardless of what it is, releasing anything on a Friday is not a good sign, although I don't know why they'd hype it up like this, then.
>>
>>107598003
me too anon, me too..
>>
>>107599990
advent of hype still going strong
>>
i need some love
>>
>>107600199
>>
>>107600199
*rapes you*
>>
File: 1765574066300401.gif (125 KB, 236x222)
>>107600212
thx <3
>>
>>107600218
Based bharati chad
>>
>>107595813
Fuck, your attempts tempt me to write an OpenGL implementation just for fun
>>
>>107600238
Don't be a pussy and write a Vulkan implementation
>>
>>107600258
I imagine a Vulkan implementation to be simpler and more straightforward; OpenGL usually imposes more constraints and requires obscure workarounds. Also, Vulkan is common in ML, so it would be more esoteric to use OpenGL
>>
It's the same price either way so which should I get?
>1x RTX Pro 6000
or
>6x RTX Pro 4000
>>
>>107600534
15x 3090
>>
oss120 with tools is actually a goated model (coomers hate this fact)
>>
>>107597836
>>107597829 (You)
What model? I could coom to that.

>>107597847
>Significantly better than the other one. Which model is this?

Thanks. Custom model, llama3 with a finetuned SNAC encoder for pretraining.

Been at it 6 months but shelved it after OpenAI announced NSFW ChatGPT soon + getting shit for no anime voices.

Might finish it if there's interest. Need to retrain with a few extra tags to fix the Gemma-3-style spaced... out... speech for some voices:

https://vocaroo.com/1jxhvj1jZdxt
>>
>>107600534
Stuff like video gen often needs all memory on a single device before someone frankensteins a solution.
>>
>>107599240
>https://xcancel.com/joachim_voth/status/2001688613055267204#m

I was following his project. It's just one guy fumbling through.

Why do people in this space say "We did ..." when it's just one dude??
>>
>>107599269
>putting emojis into TTS isn't something I've really thought of
>I've always just thought they would turn the emoji into text and then read the text
>like "Laughing emoji" "crying emoji" "fire emoji"

It depends how it's trained. I deliberately put random emojis in the dataset to teach the model to ignore them.

If I use emojis for laughing, slurping, etc. without making them special tags, it's risky because of how they tokenize; the LLM will end up making random sounds when it sees other emojis.
>>
I did something very financially irresponsible, and I now have 32gb of vram and 128gb of ddr5 ram. What model should I try first for erp? inb4 nemo
>>
>>107600723
gemma 4 in a few hours
>>
File: LLM-history-fancy.png (1.43 MB, 7278x3128)
>>
>>107600730
I thought that was scheduled for 2026?
>>
>>107600745
today
>>
File: 1751278473110737.png (186 KB, 400x600)
>>
>>107600742
next era total bharati domance era sir
>>
File: 1741971523839014.jpg (363 KB, 2048x1536)
>ChatML
>>
>>107600692
>It depends how it's trained. I deliberately put random emojis in the dataset to teach the model to ignore them.
which TTS model did you train?
as for chatterbox, both the turbo and non-turbo variants I think aren't quite as realistic as VoxCPM. The benefit of chatterbox turbo is that it's really fast. And the non-turbo is basically kind of crap except maybe for the fact that it has the "expressiveness" parameter.
>>
>>107600742
I'm willing to bet that the first company to completely turn against self-sabotage is going to get their CEO suicided and be kneecapped at every turn
>>
In 2 weeks, local LLMs will finally be good, for the first time ever
>>
>>107600665
Not anymore, multiGPU inference is a must in current circumstances
>>
File: 8436253.jpg (148 KB, 1200x1321)
>>107600953
Zuck had the balls to name it after a fruit so its probably at least better than nano banana pro. local will be back
>>
>>107600997
>current circumstances
And those are?
>>
In 2 weeks, local smedrins will finally be good, for the first time ever
>>
local in someone else's computer
>>
>>107595950
That gen is nearly two years old. The guy that made it might not be around anymore, but it looks like dalle.
>>
File: t5gemma2.png (24 KB, 831x84)
>>107600016
I had a vision that it will be EmbeddingGemma2, an update of EmbeddingGemma with longer context and vision support, trained using T5Gemma 2 checkpoints.
>>
File: file.png (161 KB, 920x1258)
>>107593592
>>107593899
Why do I get hard reading this?
>>
I gave it a lot of thought. I even consulted a few LLM's on this and usually they confirmed my suspicion. I mean it should be obvious but to some people here it is not. It is like it is just out there and we somehow can't just all acknowledge it properly, when we all truly deeply know it. Drummer is a faggot.
>>
>>107601012
I like mangoes so this has to be good.
>>
>>107601327
>>107601012
It won't be local, however.
>>
https://huggingface.co/collections/google/gemma-scope-2
https://huggingface.co/collections/google/gemma-scope-2
https://huggingface.co/collections/google/gemma-scope-2
>>
>>107599528
If you want accurate tool calling, you might be better off creating a workflow where you send several requests to the model and use GBNF or structured output to force tool calling when appropriate, assuming that you are capturing the tool call signal and executing the call yourself instead of relying on some backend to do that for you.
It's kind of like reasoning, you can emulate that with a request workflow.
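e.g. against a llama.cpp server, a two-request loop using the /completion endpoint's json_schema field would look roughly like this (the schema and the search tool are hypothetical, not anything standard):

import json
import requests

TOOL_SCHEMA = {
    "type": "object",
    "properties": {
        "tool": {"enum": ["search", "none"]},
        "query": {"type": "string"},
    },
    "required": ["tool", "query"],
}

def my_search(query):
    return f"stub result for {query!r}"  # placeholder tool implementation

def step(prompt):
    # request 1: grammar-constrained decode, so the model must emit
    # a well-formed tool call instead of freeform text
    r = requests.post("http://localhost:8080/completion",
                      json={"prompt": prompt, "json_schema": TOOL_SCHEMA})
    call = json.loads(r.json()["content"])
    if call["tool"] == "none":
        return prompt
    # we execute the call ourselves and feed the result back;
    # request 2 (unconstrained) then continues from the tool output
    prompt += f"\nTool result: {my_search(call['query'])}\n"
    r = requests.post("http://localhost:8080/completion",
                      json={"prompt": prompt})
    return prompt + r.json()["content"]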
>>
>>107601371
Use case for this?
>>
>>107601371
>As AI continues to advance, we look forward to the AI research community using Gemma Scope 2 to debug emergent model behaviors, use these tools to better audit and debug AI agents, and ultimately, accelerate the development of practical and robust safety interventions against issues like jailbreaks, hallucinations and sycophancy.
fuck this gay earth
>>
>>107601371
Also:
https://huggingface.co/google/gemma-scope-2
https://www.neuronpedia.org/gemma-scope-2
https://deepmind.google/blog/gemma-scope-2-helping-the-ai-safety-community-deepen-understanding-of-complex-language-model-behavior/
https://storage.googleapis.com/deepmind-media/DeepMind.com/Blog/gemma-scope-2-helping-the-ai-safety-community-deepen-understanding-of-complex-language-model-behavior/Gemma_Scope_2_Technical_Paper.pdf
>>
>>107601386
To help developers making safer and more inclusive AI models.
>>
>>107601387
Should've kept yourself all silent with your gay ass cunnyrape1488 fetish roleplay sessions.
>>
It's over. Gemma 4 is not real...
>>
>>107598289
>500kb
how much data are you targeting? you might want to pick up the pace a bit.
>>
>>107601477
Basically yes.
I guess we should have seen it coming: Gemma 3 was just too good and too easily pliable, Google is now only going to give to the public harmless building blocks made mainly for AI research purposes.
>>
>>107601390
>helping-the-ai-safety-community-deepen-understanding-of-complex-language-model-behavior/
And faggots here were hopeful for safety4_27B. LMAO.
>>
>>107601636
AllenAI/AI2 and NVidia. Not the models we'd like, though.
>>
Does anybody have examples of data used to create control vectors?
For example, should each line be a whole conversation, a single request response pair?
>>
>>107601283
It's like the old c.ai
>>
File: 1735704497904379.png (535 KB, 1062x860)
AMD Niggers starting the PvP wars on the Pytorch repo by sabotage
>>
>>107601745
It's not like they could compete lmao
>>
>>107601745
They just standardized the names of the functions, what's the issue here?
>>
>>107601745
I don't see a problem. That bharati chad just defended AMD's izzat.
>>
>>107601516
Ideally I'd want like 300 MB at least just of Opus 3.5
There are two problems
One is the process is semi automated, not fully automated yet, which prevents parallelization
And the bigger problem is the IPs getting banned. If I parallelize they will get banned faster, maybe that's not an issue and we can just cycle faster, but right now I'm cycling IPs manually on my phone. I'm not sure there is a way to trigger airplane mode and back programmatically.
So the next step might be residential proxies.
>>
How do I prohibit ooba from loading a model on disk
I'd rather it not fry the shit out of my SSD in this day and age
if it OOMs, it OOMs
>>
>>107602003
>ooba
we're not in 2023 anymore gramps
>>
Only 2 more weeks until 2026, the year of the Local LLM.
>>
>>107602003
Disable mmap, and maybe enable mlock, for whatever the fuck backend you are using.
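With llama.cpp itself those are the --no-mmap and --mlock flags; koboldcpp and ooba expose the same toggles under their own names.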
>>
If you are running..

>16gb 5060Ti
>32gb ddr5 ram

What model should you be using?
>>
File: omar_week-not-over-yet.png (292 KB, 474x925)
There is still hope?
https://x.com/osanseviero/status/2002019396215013845
>And the week is not over yet
>>
>>107602161
Qwen.
Don't try to ERP with it however.
>>
>>107602168
Stop posting twitter screenshots and just post a huggingface link when and if they upload weights. Everything else is irrelevant.
>>
>>107600656
>Might finish it if there's interest.
Finish it anon. Ship something. Don't worry about anime voices, a dedicated autist will add those in if you ship and they probably just want anime voices because they want a kid voice and they're not based enough to be honest with themselves and scrape their favourite little girl vlogging channel

In my opinion, basically all interest in audio-only models will go away once we have Sora2 at home that can generate videos with audio, because then if you only want audio you can gen a tiny resolution video and just throw away the video part afterwards, and it'll still be the SOTA audio/asmr/joi/fart/sound effects model given what we have already seen from Sora2 and what we already know about world models

>>107600723
>I did something very financially irresponsible. and I now have 32gb of vram and 128gb of ddr5 ram. What model should I try first for erp?inb4 nemo
You should try out Z image turbo and WAN video for uncensored state of the art image and video generation. Then when you're bored and come back to LLMs you can use GLM air, semi-okay I guess
>>
>>107601012
>better than nano banana pro.
It depends how you measure "better".
Both OAI Image-1 and Nano Banana are utter trash in terms of creativity. Like if you want something to actually look artistic you're better off with SDXL. The only thing they do well is instruction following and making everything look like corporate clipart.
>>
>>107602172
>Don't try to ERP with it however.
Why not?
>>
>>107602168
All of Google's models were pieces of trash. Both open and closed.
I don't understand the point of hype.
>>
>>107602161
>If you are running..
>>16gb 5060Ti
>>32gb ddr5 ram
>What model should you be using?
You should be making deepfakes of children for passionate fathers and uncles and teachers and pastors and girl scout leaders on telegram and using the proceeds to purchase LLM access with Openrouter with crypto (this is actual advice for your setup)
>>
>>107602240
>All of Google's models were pieces of trash. Both open and closed.
OK, you're not credible.
>I don't understand the point of hype.
Let's pretend you're credible: it could be a Llama 4 situation where they *have* to publish something now, even if it's shit.
>>
>>107602237
Its prose just isn't very good.
>>
>>107602237
Trying to fuck Qwen-chan is like trying to fuck a sandpaper tube. You can do it and it won't reject you, but it's not going to be enjoyable at all.
>>
>>107601327
I ate a mango once after riding my horse all day. I never knew a fruit could have such a taste.
>>
>>107602257
>OK, you're not credible.
Nigga, Google's most used model is Gemini 2.5 flash, dirt cheap, and Claude's sonnet 4.5 still mogs it in terms of usage, a model that is like 100 times more expensive.
Gemini models are trash, overfitted on benchmarks. This is the truth.
>>
>>107602298
Gemini 3 flash mogs sonnet 4.5 though
>>
>>107602261
>>107602268
Which model is fun to interact with while still being good?
>>
>>107602289
Are you really doing a Louis Sarkozy reference in this sacred temple?
>>
>>107602168
Hope for what? Suicide hotline generator?
>>
Someone remind fish boy to take his meds.
>>
>>107602168
They're still building on Gemma 3, that means no Gemma 4
>>
>>107602320
You're in the hardware bracket of tradeoffs and copequants.
Gemma once you buckbreak the refusals, Nemo if you can tolerate it being a bit dumb, GLM Air if you can tolerate it reiterating parts of your message, Strawberry Lemonade if you're okay with the quirks of finetroons over the problems with GLM 4.5 Air.
If your finances permit I'd suggest getting a GPU with more VRAM before they go to they get more expensive.
>>
>>107602379
>more VRAM before they go to they get more expensive.
I'm ESL this morning.
>>
>>107602371
these :(
>>
>>107602371
(maximum cope) it could be they're throwing out all their Gemma3 based shit before launching 4... yeah...
>>
>>107602331
>hotlines
Prompt issue.
If they're not going to release Gemma 4 this year, then I'm rooting for EmbeddingGemma 2.
>>
stop breathing air is no more
https://www.reddit.com/r/LocalLLaMA/comments/1pqn0vq/glm_47_is_coming/
>>
>advent of Gemma
>wtf why no gemmy 4 on first day
Sirs. Week not over yet. Keep refreshing
>>
>>107602445
hf should sue for ddos attemptings
>>
>>107602413
>Prompt issue.
Only a ramlet would use that obsolete concept.
>>
ok ive been 2 days off due to vibecoding a bunch of shit, WHATS GEMMER STATUS?????
>>
>>107602431
4.6 claimed a jar of my cum in total by now. My intuition tells me that 4.7 won't be that much better unfortunately.
>>
>>107602497
conceptual poo made by a poo
>>
>>107602431
didnt they also mention they were doing a 4.6 mini?
did they scrap all the small models except flash, 4.6 and 4.6v (air basically)???
REEEEEE
>>
File: 1736895384218486.png (297 KB, 1290x1138)
Why are you guys so horny for software that just looks up stuff in a database?
>>
>>107602542
we is so cooked
>>
File: 1737669910597846.png (3.31 MB, 2192x1232)
>>107602542
>Following a script of prewritten responses
>21%
my god....
>>
>>107602581
128K context, 256K possible tokens...
The amount of pre-written responses to cover every single case is mathematically absurd.
If you somehow converted every particle in the universe into a computer that could write a trillion trillion pre-written responses per second you still would not be able to finish the task before the heat death of the universe.
At least the database one is kind of plausible.
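Back-of-the-envelope:

import math
# distinct prompts: vocab_size ** context_length
print(f"~10^{128_000 * math.log10(256_000):,.0f} possible 128K-token prompts")

That's on the order of 10^692,000 prompts, versus something like 10^80 particles in the observable universe.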
>>
>>107602542
Suddenly I understand why Drummer is still making his shittunes that do nothing or make models worse.
>>
>>107602625
for me? its davidau
>>
File: 1750111274283086.png (143 KB, 322x332)
Damn, fuck these greedy HF jews. My private storage got downgraded from 170GB to 100GB. Time to store my shit on modelscope I guess
>>
>>107602651
Of course.
>DavidAU/Qwen3-48B-A4B-Savant-Commander-Distill-12X-Closed-Open-Heretic-Uncensored-GGUF
is the GOAT after all.
>>
>>107602651
David is funny. Sao was humble. Undi was also funny and not a schizo. Drummer is an equivalent of an Indian tech support scammer of finetuners.
>>
>>107602692
man, do I need to read his updated samplers guide for this?
>>
>>107602625
Anubis was okay thoughbeit.
>>
need glm 4.7
>>
>>107602699
Not really, but you need to know the following :
> HOW TO ACCESS the EXPERTS:
>
> In your prompts simply add the name(s) of the model(s)/expert(s) you want assigned.
>
> Here is the list [no quotes]:
>
> "Gemini" [activates all 3 Gemini distills]
> "Claude" [activates both Claude distills]
> "JanV1"
> "CommandA"
> "OPENR1"
> "GLM"
> "Kimi"
> "GPTOSS" [120B distill]
> "GPT51"
>
> To access groups use [no quotes]:
>
> "AllAI" [all ais]
> "Closed-AI" [only closed source]
> "Open-AI" [only open source]
>
> Access like:
>
> Gemini, Tell me a horror story.
>
> GLM and JanV1, write me a horror story.
>
> Gemini: Tell me a horror story.
>
> Note the name[s] must be in the prompt and/or the system role and can be located anywhere in the prompt / system role.
>>
>>107602504
I really hope they don't go the benchmaxx on code and math route like the rest and the result is a dry assistant model.
>>
>>107602581
Makes sense if they cache the responses for the most popular queries at a given time. Don't need to keep recomputing the same thing 1000 times every minute.
>>
>>107600742
>trying to figure out wtf does quanting 2024 to Q4 do
I might be retarded
>>
>>107602718
>activates all 3 Gemini distills
>activates both Claude distills
holy
>>
>>107602708
Glm 4.8 air will save local
>>
>>107602736
ok drummer, you can put your trip back on
>>
>>107602721
>what is prefix caching
>>
>>107602240
>I don't understand the point of hype.
It's caled marketing.
>>
>>107602542
Why do models don't have a database tho?
>>
>>107602431
>4.6v is just air.
sad to see "people" think that when even GLM acknowledges it's worse at text.
>>
>>107602542
these "people" are honestly dumber than AI
>>
>>107602773
Your KV cache?
>>
>eh, his tunes mostly work. cry about it
>Anubis was okay thoughbeit.
Kill yourself drummer.
>>
>>107602379
That was the best I could get

Guess it's Qwen or Gemma?
>>
>>107602750
What?
It's a DENSE MOE model man.
>DETAILS:
>This is a DENSE MOE (12 X 4B) - Mixture of Expert model; using the strongest Qwen3 4B DISTILL models available with 2 experts activated by default, however you can activate up to all 12 experts if you need the extra "brainpower".
>>
Are the recommended models in the OP more or less up to date? (I just want to coom)
>>
>>107602804
Start with Gemma-chan and if she won't spread her legs for you, move onto Qwen.
>>
>>107602773
Sir your RAG2.0 agent?
>>
File: 1752799507194537.png (550 KB, 1058x1206)
>>107602784
AI doesn't exist chud
>>
File: file.png (73 KB, 225x225)
>>107602718
David is a true Magos of AI. In that they also had no fucking clue what they were doing but that didn't stop them from doing it.
>>
>>107602768
>your finetune and examples of how much better it is
The whole point of drummer being a colossal faggot that should die from cancer is that finetunes don't work.
>>
Why is glm 4.6 q8 (400gb, 4tk/s) slower than kimi k2 q3 (500gb, 6tk/s) on my system? Doesn't kimi have more memory to process?
>>
>>107602821
>locust
I am a local 4.6 user. Just buy ram and stop being a faggot.
>>
>>107602871
How many activated params does each model have during generation?
>>
>>107602902
Ah shit I can't believe I forgot how moes work.
>>
>>107602908
I use mikupad for rp and zed for programming and I've never had this issue with 4.6.
It's probably something you get when you try to rp using the instruct template.
>>
For those who follow how abliteration works, I had this idea just now. What if the finetuning loss were a sum of:
1) top 2-3 tokens being the same
2) diversity of all tokens so just higher likelihood of remaining tokens being picked

Wouldn't that make it a bit more of a... base model? Also running that training together with regular smut training. Maybe that would actually make a good finetune? As in keeping the top tokens same as base model kind of ensures you don't overfit and destroy the model when you feed it your shitty claude logs.
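A rough PyTorch sketch of that combined objective, assuming you keep a frozen copy of the base model around for reference logits (k and the weights are pulled out of thin air):

import torch
import torch.nn.functional as F

def anchored_loss(tuned_logits, base_logits, smut_loss, k=3, w_div=0.1):
    tuned_lp = F.log_softmax(tuned_logits, dim=-1)
    # (1) keep the base model's top-k tokens in place: cross-entropy
    # against the base model's renormalized top-k distribution
    top = base_logits.topk(k, dim=-1)
    top_p = F.softmax(top.values, dim=-1)
    keep_top = -(top_p * tuned_lp.gather(-1, top.indices)).sum(-1).mean()
    # (2) diversity: entropy bonus so the remaining probability mass stays
    # spread out instead of collapsing onto a few slop tokens
    entropy = -(tuned_lp.exp() * tuned_lp).sum(-1).mean()
    # run alongside the regular smut training loss
    return keep_top - w_div * entropy + smut_loss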
>>
>>107602930
whocars do this instead https://www.reddit.com/r/LocalLLaMA/comments/1pqn5hy/an_experiment_in_safety_enhancement_increasing/
>>
>>107602891
>I'm happy someone bothers still.
I would be too, if I had a kofi link and was looking for a job as a software engineer.
>>
>>107602908
those grapes sure are sour
>>
File: file.png (1.7 MB, 1024x1024)
>>107602947
hey champ this fell out of your pocket
>>
>>107602991
did they say they would tune it? I took it as them just serving it as is. would be a massive waste of money after the erato flop
>>
>>107602938
>an actually good and funny post
>0 updoots
>>
>>107603077
>funny

>It's useful for safety testing. If redteaming can still get past it,
>an entire category of harm was neglected by the model
>>
>>107602542
From a computer science perspective there is no difference between 1 and 2. It's interpolating the answers based on the training data encoded in a lossy way into its weights. So it's making an educated guess by looking it up on a database.
>>
>>107603046
Erato is still the best model for storytelling. It's the only model that's completely uncensored.
>>
>>107603101
>exact answer
>>
https://huggingface.co/Qwen/Qwen-Image-Layered

Gonna gen some layered migus with this
>>
File: file.png (7 KB, 115x67)
>>107603117
yes please
>>
File: file.png (123 KB, 1200x1000)
>>107602542
At least post the actual links instead of a twitter screenshot.

https://www.searchlightinstitute.org/research/americans-have-mixed-views-of-ai-and-an-appetite-for-regulation/
https://www.searchlightinstitute.org/wp-content/uploads/2025/12/Searchlight-AI-Survey-Toplines.pdf

Pic related from the same survey is far more concerning.
>>
>>107603139
anthorpic bros what happen?
>>
>>107603151
Obscure safety cult that trained an AI model named after a gay butler (probably).
>>
>>107603151
maybe if the survey said claude instead peopo might know then again maybe not
>>
>>107603117
What's the max res this can handle? If it works well, I might switch from static portraits to animated portraits for my stellaris mod.
>>
File: 0.png (539 KB, 512x800)
>>107603117

>>107588615
>>
File: 2.png (668 KB, 512x800)
>>107603429
>>
>>107603112
Yes. LLMs are deterministic. By default the weights encode an exact answer for every possible question. It only becomes non deterministic when you add bugs, hacks, different numerical errors between implementations and temperature sampling.
>>
>>107603442
>xhe doesn't know about floating points and GPU precision variance
>>
>>107602985
>nai employee
>tells you to buy ram so you run 4.6 locally
How does that work exactly?
>>
>>107603596
Buy an ad.
>>
>>107603645
Ad for what? Buying more ram so you can run models you download for free?
>>
>>107603596
All part of an elaborate falseflag scheme to convince people who wouldn't be paying for their products anyways to buy more ram, therefore making ram more expensive for other people and making their cloud services a more attractive choice to prospective customers on the fence.

Either that or the other guy is just a retarded faggot who got lost on his way to some other general. But my money is on the former.
>>
>>107603596
Don't question it anon. NAI is the great satan, their motives are incomprehensible to us mere mortals
>>
>>107603792
Hating drummer is justified and morally right. Hating nai is just some schizo being a schizo.
>>
>>107603860
sounds like something nai would say.
>>
>>107603860
great one mate
>>
Hello everyone in this thread. In case you weren't able to buy your RAM in time we strongly advise you to check our subscription service. You can run widely known and renowned GLM 4.6 model for almost nothing. We won't read your logs, pinky promise. Your loli dismembering fetish is safe with us.
>>
>>107603860
Both of them contribute more than the seething (v)ramlets thoughbeit.
>>
File: zimage_00184_.png (1.02 MB, 1024x1024)
>>107603860
here bro you dropped it again
>>
It is pretty absurd to think any company would waste employee time on this place of all things, can't think of any time that might have happened either.
>>
>>107603984
Take your meds.
>>
>>107603984
I know mistral leaker but who was qwen guy and benerus? I wasn't around for "AICG".
>>
>>107604025
>>105106082 (qwen guy)
>>
>>107603971
>>107603984
There's likely more that haven't outed themselves yet too like implied here >>107598622
It'd be really funny if the majority of the saarposting came from a google employee sick of tardwrangling jeets on his team.
>>
>>107603971 (me)
Was sarcastic obviously, it's pretty obvious we had some Meta folk for a while tying to cope with Llama 4's state.
>>
>>107604076
I'm still not sure where this came from: https://rentry.co/prer1ll4
>>
>>107604044
That shit about quant being mind killer was so appropriate. Those faggots have no idea what they are doing.
>>
something got fucked up after the latest llamacpp pull and slots stopped working properly. I have to reprocess everything from the start on each gen
>>
>>107604096
LLAMA4_SPECIAL_TOKENS = (
    LLAMA4_TEXT_POST_TRAIN_SPECIAL_TOKENS
    + LLAMA4_AUDIO_SPEECH_SPECIAL_TOKENS
    + LLAMA4_VISION_SPECIAL_TOKENS
    + LLAMA4_REASONING_SPECIAL_TOKENS
    + LLAMA4_IMAGE_GEN_SPECIAL_TOKENS
    + LLAMA4_META_AI_SPECIAL_TOKENS
    + LLAMA4_AGENT_SPECIAL_TOKENS
)

Audio/speech, vision and image gen, reasoning, that would have truly been an "Omni" model.
>>
>>107603984
Hello /lmg/ this is Zuck speaking. We all heard your very enthusiastic response to Llama models that's why we're sure you will also enjoy llama5 on our exclusive API. Enter "LMG10" to get a 10% discount :rocket:
>>
>>107604044
Oh, I thought that guy was just a larper.
>>
>>107603984
And NAI is literally a scam made by a couple of anons.
>>
File: culture.png (573 KB, 469x729)
I'm getting ready to do some air travel. Give me some /lmg/ recc books for the trip that I can load onto my tablet.
I just finished reading the Culture series by Iain Banks... an optimistic view of a future with super intelligence.
>>
>>107604126
>>107604181
Sundar here. The Gemma is being redeemed soon.
>>
>>107604226
Do you accept gift cards?
>>
>>107604126
I'm enjoying llama-3.3-8b-instruct thank you!
>>
>>107604213
>I'm getting ready to do some air travel
fack off air is canceled do not breath! >>107602431
>>
Hi guys Elon here. AI girlfriend robo waifus who can cook change appearance and drive a car by mid 2027. I am sure of it.
>>
>>107604282
>>>/v/727777128
>A few years ago this was true, but now not really. If you want you can pay 15 bucks on novelai to not mess with any settings, and it has a huge memory pool. I have a porn HFY story that's like 40 pages long so far, and it remembers details (you can add them to memory so it never forgets them period).
>>>/v/728105737
>novelai is really good now if you make use of the glm-4.6 model they released back in oct. Gives 36864 total context.
They're scamming anons in /v/.
>>
>>107604316
go the f back ff, don't you have others generals to troll
>>
I imagine if this board had ids it would be obvious that all the dev drama was just one xhir samefagging to shit up the thread.
>>
>>107604326
>>107603950
>>
>>107604338
it do be at least two doebobeit
>>
>>107604372
sir?
>>
>>107604213
An Introduction to Cybernetics
>>
saar, qwen is all you need
>>
>>107604386
>An Introduction to Cybernetics
I think that's too heavy a read for this trip. Thanks for the recc, will run down a copy.
I need something closer to scifi or related. I really liked Gibson's books but they keep getting more esoteric as time's passed.
>>
gpt-oss-sex-120b
>>
>>107604213
Return of the /lmg/ book club
>>
>>107604562
far better than some other subjects we had
>>
>>107602371
What if it was just Gemma "3.5"?
Same architecture and same sizes, only better trained, with extended context and perhaps reasoning capabilities. I bet it would also be "safer".
>>
File: M9FzIrV3El8nx69dzZ9P4.png (334 KB, 485x371)
>>107604528
>>
File: dipsyPen.png (3.57 MB, 1024x1536)
>>107604562
Pretty much
>>107604570
In the absence of other content, I'm grasping for things that we could talk about that aren't saarposting
>>
>>107604598
>>107604598
>>107604598
>>
>>107602542
There are no right answers. It doesn't guess words, it samples tokens from a distribution produced by satanic operations on matrices derived from Satan himself. You can't prove I'm wrong btw
>>
>>107604515
>I need something closer to scifi or related
Early Asimov


