/g/ - Technology

File: 0.png (2.57 MB, 1500x1500)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>107856424 & >>107847320

►News
>(01/14) LongCat-Flash-Thinking-2601 released: https://hf.co/meituan-longcat/LongCat-HeavyMode-Summary
>(01/08) Jamba2 3B and Mini (52B-A12B) released: https://ai21.com/blog/introducing-jamba2
>(01/05) OpenPangu-R-72B-2512 (74B-A15B) released: https://hf.co/FreedomIntelligence/openPangu-R-72B-2512
>(01/05) Nemotron Speech ASR released: https://hf.co/blog/nvidia/nemotron-speech-asr-scaling-voice-agents
>(01/04) llama.cpp merged backend sampling support (#17004): https://github.com/ggml-org/llama.cpp/pull/17004

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
File: what's in the box.jpg (235 KB, 1536x1536)
►Recent Highlights from the Previous Thread: >>107856424

--Papers:
>107863388
--Multi-GPU model loading challenges and optimization techniques:
>107858542 >107858562 >107858757 >107858773 >107858794 >107858813 >107858862 >107858902 >107858938 >107858574 >107858708 >107858722 >107858745
--LLM potential and synthetic data challenges in training:
>107857483 >107857494 >107857524 >107862696 >107857550 >107857610 >107857630 >107857528 >107857562 >107857601 >107857627 >107863112 >107863126 >107863159 >107857667 >107857702 >107857735
--Budget GPU cooling solutions and DIY fan modification experiences:
>107857227 >107857475 >107858313 >107858798 >107860478 >107860739
--Bypassing Hugging Face token requirement for Chatterbox-turbo model weights:
>107860133 >107860167 >107860162 >107860173 >107861457 >107861519
--Implementing efficient layer looping in llama.cpp across backends:
>107859445 >107859518 >107859520 >107859531 >107859544 >107859583
--Chatterbox-turbo performance and setup frustrations with ROCm:
>107860835 >107861152 >107861313 >107860932
--TTS system deployment and performance testing across platforms:
>107856644 >107857722
--Python environment management debates: venv vs Docker vs conda:
>107859294 >107860018 >107860143 >107860210 >107861562 >107860093 >107860809 >107860152 >107861672 >107861697 >107861760 >107861798 >107861808 >107861830 >107861776 >107861788
--GitHub contribution issues and ik_llama.cpp project maintenance challenges:
>107859037 >107859111 >107859125 >107859126 >107859207 >107859259
--Challenges with AI character portrayal and bias in Opus 4.5 vs Kimi:
>107862097 >107862355
--Critique of large-scale model training and praise for distillation techniques:
>107857737 >107857759 >107857770 >107857849 >107857891 >107857896 >107857919
--Miku and Luka (free space):
>107856886 >107857077 >107859422 >107863062 >107860414

►Recent Highlight Posts from the Previous Thread: >>107856428

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>
>>107864105
Is the age of LongCat finally upon us?
>>
LongPussy a meme?
>>
>>107864276
>having goofs
>being a meme
pick one
>>
>>107864171
>>107864276
No support in llama.cpp = not local. Simple as.
>>
>>107864348
TRVKE
>>
>>107864105
it doesn't get said enough, but you guys are cool.
keep on doing you, bros.
>>
>>107864024
>Either be subtle and don't mention nsfw

I was thinking that, but when I put the tokenizer config into Qwen and asked what the model does, it straight away spotted that it's an NSFW audio system.

>or don't release it under your name if you hope to have or keep a career in ML

I didn't study ML formally, so that's probably not an option for me anyway
>>
File: cockbench.png (2.01 MB, 1131x6954)
Added EXAONE MoE.
>>
>>107864456
Given their popularity for RP, you should add Mistral Small 3.1/3.2. No one uses 2501 anymore.
>>
>>107864456
The model completely breaks itself in the presence of lewd.
I asked it to continue the story using the chat template and this is its thinking response. It gets stuck in a loop. It's also hallucinating a system prompt it probably had in training.
https://pastebin.com/1YqtWRGp
https://pastebin.com/cnAwGC6g
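If anyone wants to reproduce this kind of test, a minimal sketch of running a story continuation through a model's chat template with transformers. The model id and story line are placeholders I made up; only apply_chat_template itself is the real API:
[code]
from transformers import AutoTokenizer

MODEL = "your/model-id"  # placeholder: substitute the checkpoint under test
story = "The rain hammered the tin roof as she reached for the door."

tok = AutoTokenizer.from_pretrained(MODEL)
messages = [{"role": "user", "content": "Continue the story:\n" + story}]

# Render through the model's own chat template; add_generation_prompt
# appends the assistant-turn header so the model starts its reply there.
prompt = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)  # inspect exactly what the model sees, special tokens included
[/code]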
>>
>>107864432
You're overthinking it; simply don't mention it in the model card
>>
Wow another big win for moesissies. Really outdid yourselves this time.
>>
File: image.jpg (11 KB, 114x101)
>>107864105
yjk
the gloves stay on
>>
MMA was able to play a good soccer game in the first half. Although a team like Ghana would come up late last season as the club and its team would have to fill the scoring gap against Germany, Germany and Germany. However, the team did have great support on the pitch.

In the second game against Germany, he was able to play in the first half, but the German had to do it in the first half.

But as Böhm continued to play against Germany for the first half in the first half of the season, he was able to take on most of the task.

"When we looked at it during our first round, I got in pretty good shots," he said. "It was just a moment when we finally had a chance to play in the second half.

"Having completed a huge amount of work here, we had a very talented team and we prepared with them, they were a great team."

MMA's 2-1 win over Germany only came into being the first team to begin.

The Slovenian team's last two games were at home next Saturday.

Rivalist Pavel Luttić is not the only centre-back where he went from being named one to finishing the season in the first half. However, he will allow the team to now be back to Paris.

"I saw this game last season. A lot of time came out of it and we were able to do it," he said. "After the first half, I spent the next couple of days doing it. I got started on the pitch, I went out at the end of the pitch and I had to make sure I was able to stop the pitch.

"I got out and I had to do it, and I don't have anything left to get it right. I can't be excited about it."

MMA's other team – such as Belgium and Belgium – has already entered the league. Their academy have signed their academy in Paris, Amsterdam and Austria, following the main-rightening of the team.

"To put that game back into the dressing room, it was like nothing to leave."


i was training a 30M 6/6/384 model and i managed to pull this... from a 30M model... what the actual fuck?
>>
>>107864809
they are pretty magical. it's unfortunate they don't scale very well.
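for reference, a back-of-the-envelope param count. assuming 6/6/384 means layers/heads/d_model with tied embeddings and a GPT-2-ish ~50k vocab (my guess at the config, not confirmed), it pencils out to right around 30M:
[code]
# per block: attention (4*d^2 for q,k,v,o) + 4x-expansion MLP (8*d^2) = 12*d^2,
# ignoring biases and layernorms
d, layers, vocab = 384, 6, 50257
total = layers * 12 * d * d + vocab * d  # blocks + tied token embeddings
print(f"{total/1e6:.1f}M")               # ~29.9M
[/code]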
>>
>>107864456
The OSS results always make me chuckle.
>>
>>107864809
>MMA was able to play a good soccer game in the first half. Although a team like Ghana would come up late last season as the club and its team would have to fill the scoring gap against Germany, Germany and Germany. However, the team did have great support on the pitch.
Does this sound less like nonsense to someone who likes soccer?
>In the second game against Germany, he was able to play in the first half, but the German had to do it in the first half.
NGL this is basically what I would expect.
>>
>too lazy to load model up, just use same model via provider
>ask basic bitch question about an area's climate, fishing, what fish you can catch
>IT'S NOT JUST FISHING, IT'S A DEEP CULTURAL TRADITION
>a fuckload of markdown tables and lists later, I suddenly don't give a shit
Every generation of models, they find a new thing that drives me up a wall, then train every other model on those outputs. I almost wish we could go back to the Llama 2 days, except it was so retarded that it was borderline impossible to extract any usable info from it
>>
>>107865005
The fish is looking at you with a mix of fear and something else
>>
>>107865115
The fish also has a scent that's... uniquely its own
I hate how fucking retarded LLMs are despite their potential value
>>
>>107865162
>begins every single reply with the same fucking words, permanently dooming your entire chat no matter how much you swipe
nothing personnel human
>>
>>107865005
>You hit the nail on the head! It's not just about training, it's about the bonds we form with one another as we embark on this journey of growth and discovery. You didn't just ask a question, you opened the door to a world of possibilities.
>>
should i start token banning the words "and" and "just"?
>>
>>107865250
you should token ban
>The
>for
>an
>a
and punctuation symbols that aren't ";" or "?".
It's just commonslop
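ironic suggestions aside, if you actually want to try banning tokens, a minimal sketch against a llama.cpp server's /completion endpoint. going from memory of the server README: logit_bias takes [token-or-string, bias] pairs and false bans the token outright, so double-check against your build:
[code]
import requests  # assumes a llama.cpp server already running on localhost:8080

r = requests.post("http://localhost:8080/completion", json={
    "prompt": "The fish is looking at you",
    "n_predict": 64,
    # [string-or-token-id, bias] pairs; Python False serializes to JSON
    # false, which is a hard ban. Mind the leading spaces: " and" and
    # "and" are usually different tokens.
    "logit_bias": [[" just", False], [" and", False]],
})
print(r.json()["content"])
[/code]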
>>
File: itsover.png (193 KB, 881x692)
They really have converged on this pattern. Even base models.
>>
>>107865295
Caused by synthetic slop data, no doubt
>>
>>107865295
holy shit
>>
File: my honest reaction.jpg (47 KB, 562x675)
>>107865295
>>
File: rip papers anon.jpg (292 KB, 1683x2048)
>>107840944
It looks like a bag of tricks but not something revolutionary, since the theoretical benefits can be achieved with other techniques without lowering MFU. I don't see how it makes quantization or MoE any different.

>Ouro paper
>does not extrapolate to deeper, unseen recurrent steps
sad!