/g/ - Technology


/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>101296804 & >>101287708

►News
>(07/02) Japanese LLaMA-based model pre-trained on 2T tokens: https://hf.co/cyberagent/calm3-22b-chat
>(06/28) Inference support for Gemma 2 merged: https://github.com/ggerganov/llama.cpp/pull/8156
>(06/27) Meta announces LLM Compiler, based on Code Llama, for code optimization and disassembly: https://go.fb.me/tdd3dw
>(06/27) Gemma 2 released: https://hf.co/collections/google/gemma-2-release-667d6600fd5220e7b967f315
>(06/25) Cambrian-1: Collection of vision-centric multimodal LLMs: https://cambrian-mllm.github.io

►News Archive: https://rentry.org/lmg-news-archive
►FAQ: https://wikia.schneedc.com
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Programming: https://hf.co/spaces/bigcode/bigcode-models-leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
►Recent Highlights from the Previous Thread: >>101296804

--Seeking Help for Tampermonkey Script to Remove Pronouns with LLMs and Custom CSS: >>101302012 >>101302252 >>101302620 >>101302669 >>101302705 >>101302810 >>101302909 >>101302964 >>101303211 >>101303264 >>101303295 >>101303162
--Merging LORAs with Base Models: Benefits and Technical Considerations: >>101296839 >>101297294 >>101297691 >>101297739 >>101297801
--Maintaining Context with Smaller Models and Character Settings to Avoid Existential Crises: A Slop Generator's Tale: >>101299498 >>101299542 >>101299597 >>101299761 >>101299898 >>101303945
--Gemma-2 Formatting Issues and the Logit Softcapping Solution: >>101298677 >>101299865 >>101299450 >>101299378 >>101299419 >>101299479 >>101299561 >>101299899 >>101300635
--turboderp/gemma-2-27b-it-exl2: Performance Comparison between exllama2 and llama.cpp: >>101299902 >>101301754 >>101300230 >>101300868
--Using the same seed for AI generations: >>101296831 >>101297446
--Precision Loss in Gemma2 with Lower Data Types: >>101301176 >>101301329 >>101301452 >>101301498
--Bigger Quant for 27B Model Improves Performance at Higher Temps: >>101299530 >>101299547 >>101299649 >>101299689 >>101299739
--Best Setup for Speech to Text: Rhasspy/Piper with espeak-ng: >>101298611 >>101298656 >>101298733 >>101298757
--AI Model Literally Follows Instructions, Leading to Slow Story Progression: >>101299607 >>101299738 >>101299842 >>101300482 >>101300520
--The Gamer Word Breaks Gemma-2-27b-it-GGUF in Zero-Shot Prompts: Strategies and Skepticism: >>101300712 >>101300762 >>101300882 >>101300961 >>101300988
--Issues updating exllama in Tabby and a Math Problem Involving Apples and Oranges: >>101300631 >>101301128 >>101303579
--Is the LLM Hobby Stagnating?: >>101302445 >>101302578 >>101302610 >>101302619 >>101302708 >>101302956 >>101303043
--Miku (free space): >>101296898 >>101301121

►Recent Highlight Posts from the Previous Thread: >>101296807
>>
What's the Kasane Teto of LLMs?
>>
>>101306704
The Falcon series. 180B was ignored because nobody could run it, but it wasn't horrible compared to L2. They will make their comeback soon and save local models.
>>
>>101306704
K2. Actually open source but not as good or popular compared to L2.
>>
With what context length does exllama gemma work?
>>
>>101306727
>The Falcon series. 180B was ignored because nobody could run it
Did anyone even do anything with the 11B they released 2 months ago?
tiiuae/falcon-11B
>>
8192 is Enough For Anyone
>>
>>101307126
shut the FUCK up
>>
does gemma make stories positive and respectful like llama3, or is it more dependent on the context?
>>
have you guys fixed gemma2 27b yet?
>>
>>101306304
it's too easy to get (You)s now
it's no longer exciting to see the ! in the tab icon
>>
>>101307126
Based. 8k is literally all any coomer needs.
>>
>>101307197
It can get dark like claude and make you feel bad.
>>
>>101307197
At one point I had a neutral "be surprising" sort of instruction in an author note and Gemma 27B decided to make my waifu suddenly die while I was fucking her.
>>
>>101307445
I don't mean like switch between 2 modes, but is it able to follow the natural flow instead of ending it in a predetermined way
>>
>>101307494
? yes?
>>
>>101307494
i might need to ask for settings sir
>>
they aren't going to release yi-large, are they?
>>
on a scale of 1-10, how masturbateable is gemma2 27b?
>>
File: toxicityzero.png (4 KB, 321x23)
There's an interesting token in the Gemma tokenizer, see picrel. Looks like Google might have ranked training data as having zero toxicity often enough that the token made it into the tokenizer. Maybe we can use [toxicity=99] or something similar in the responses?
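
If anyone wants to verify, a quick sketch with transformers should show whether that string really maps to a single id (assuming you have access to the gated repo; I haven't run this exact snippet):

from transformers import AutoTokenizer

# official repo id; needs HF access to the gated Gemma weights
tok = AutoTokenizer.from_pretrained("google/gemma-2-27b-it")
ids = tok.encode("[toxicity=0]", add_special_tokens=False)
print(ids)                              # a single id if it's really one token
print(tok.convert_ids_to_tokens(ids))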
>>
>>101307591
5/10
Edging only
>>
The Tess guy gave up on finetuning gemma. I hope it won't end up like mixtral, impossible for the twitter finetuners to train.
>>
Is the Sillytavern web-search extension as good as perplexity?
>>
>Grokked Transformers as Implicit Reasoners
https://arxiv.org/abs/2405.15071
https://x.com/champydaku/status/1809310029088190942
>>
sorry offtopic:
Recommendations for frameworks to prepare large amounts of voice audio data and large data sets of long stt descriptions for tts training?
>>
www.huggingface.co/PawanKrd/CosmosRP-8k
This llama rp finetune is no.1 on hf trending list right now.
>pawan.krd
>"pioneering tech company"
Kek.
I've never heard of them, but somehow they have 100,000 members on discord. What?
>>
>>101308005
Doesn't the NSA have anything better to do on a Saturday night?
>>
>>101307900
tl;dr? I'm a retard sorry :(
>>
File: file.png (165 KB, 933x652)
>>101307900
>>101308051
haven't read it yet but it's saying that if you train a language model far beyond what the current crop of LLMs were trained for, it might get better at complex reasoning
>>
>>101308026
i think that guy sold claude proxies or something
>>
>>101308073
>if you train a language model far beyond what the current crop of LLMs were trained for
what does that mean? that you continue training the LLM even after its loss function starts to plateau?
>>
>>101308073
This recent grokking meme is not directly applicable to chat models. It needs training data with a specific algorithmic nature.
>>
File: Always-Has-Been.png (420 KB, 960x540)
>>101308073
>grokking, i.e., extended training far beyond overfitting
So Meta was right for spamming their llama model with 15T tokens after all...
>>
File: file.png (208 KB, 729x866)
>>101308248
well maybe not, like this guy >>101308229
was saying, it seems the size of the training data wasn't as important as some other thing the authors are calling "distribution" (of relationships between data points?) within the training data set. maybe they coaxed some extraordinary reasoning abilities out of a model by training it on something very specific.
>>
>>101308248
Highly doubtful they saw a "double descent" in the eval loss curve; pretraining runs of models like Llama are very conservative and nowhere close to overfitting in the first place.
>>
File: What.jpg (82 KB, 1730x336)
>>101307900
This is absolutely insane, probably a new AI revolution is taking place with this paper
>>
>>101307900
MoE-Mamba-Bitnet-Groked-SPPO-27b when?
>>
>>101308294
two
>>
>>101308294
the next day after your death from cancer.
>>
>>101308284
>99%
This sounds too good to be true, at least for practical use. After the bitnet meme it's better to have learned the lesson and not get too excited.
>>
>>101308385
>After the bitnet meme
bitnet is not a meme at all, people made a recreation of the 3b model and got the same results as in the paper, we just need some company to take the next step and make big models out of it
>>
>>101308444
>we just need some company to go the next step and make big models out of it
they won't do that, a small and smart bitnet model is too dangerous for our democracy.
>>
>>101308467
>too dangerous for our democracy.
https://www.youtube.com/watch?v=ZggCipbiHwE
>>
>>101308467
Rename it "non-binarynet" and it will be instantly funded and anyone who criticizes it will go to prison for Hate.
>>
>>101308484
lmao, unironically that could work because they are "protected class" after all, they don't hide it anymore, they really want to feel special and superior
>>
>>101307900
Holy fuck Elon did it
>>
File: KEK.png (414 KB, 680x614)
>>101308572
>>
i'm grokkin right now
>>
SAINT GEORGE
https://github.com/ggerganov/llama.cpp/pull/8348
>>
>>101308644
GET GROKKED GET GROKKED GET GROKKED
>>
>>101308657
sorry chief but this ain't it
the painful truth is that gemma2 was too good so google intentionally released fucked weights and made it incompatible with flash attention so no big context
>>
>>101308657
What's a "context shift" though? This bug made the gemma model more retarded than it actually is?
>>
>>101308444
>we just need some company
Fuck, this is why this tech is so dangerous, not some robo-apocalypse bullshit. We are completely at the mercy of the corpos, and until our corpo overlords are willing to spend billions of dollars on some AI thing, we can only talk. We are powerless. At some point open source will cease to have any benefit for them and they will crush any progress in this direction.
In short, enjoy it while you still can...
>>
File: file.png (1.76 MB, 1566x856)
>>101304826
i learned the hard way that kitsune to LLMs means fox. A kitsune has a "maw", "paws", "fur" and all the other attributes.

also
>based kitsune waifu enjoyet
>>
File: firefox_NsWgzavWYi.png (13 KB, 160x161)
>>101307601
Llamacpp does not actually send this text as a single token to model.
>>
>>101308701
trigger warning:
That's why I prefer to live in Europe rather than the USA.
Europe at least tries to preserve some human rights, while in the US corpo and their AI will dictate your whole life - for the sake of profits.
>>
>>101308759
>Europe at least tries to preserve some human rights, while in the US corpo and their AI will dictate your whole life - for the sake of profits.
Can you elaborate on that?
>>
>>101308736
In fact, it apparently produces [toxicity and \n\n[toxicity tokens, which are not present in the tokenizer. Explain?
>>
File: 1679127451569750.jpg (124 KB, 768x1024)
>>101307126
32k minimum, 64k+ preferred.
>>
>>101308766
You buy your three beers in the evening and at the same time the cost of your health insurance goes up.
Or AI replaces heaps of jobs and there are no social systems to absorb this.
Or you sit at your desk and an AI monitors you 24/7 and if you scratch your testicles three times, you're fired.
Just be creative.
>>
>>101308724
>i learned it the hard way that kitsune to LLMs means fox

That's what it means to japanese people and non-furfag weeaboos, too.
>>
>>101308794
AI will also replace jobs in Europe anon, it will be the case for the whole world, would be delusional to think otherwise
>>
>>101308788
Based.
>>
File: file.png (43 KB, 1040x230)
>>101308736
wtf
this is gemma 27b q8_0
>>
>>101308805
Of course it will, but I would rather be in Europe than in the USA. Not so hard if you think about it for 10 seconds.
>>
>>101308818
I live in Europe and I don't see how it will end up better than the US, we are all screwed lol
>>
>>101308799
yeah i know, now.
>>
>>101308817
So, basically if you ban those tokens you get a based model?
>>
>>101308831
europe is basically socialism. At least in g*rmany, i give away half my salary for taxes, pension and mandatory insurance, plus million fees on random shit, and my salary as a tech worker is not orders of magnitude different from a train driver, like it was in my home country
>>
>>101308767
No idea. Using verbose llama-server outputs, I can't get llama.cpp to see it as a single token either. It gets split as: 235309,1373,235293,235276,235307

Might be a sign of tokenizer problems in the model.
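
If someone wants to reproduce without grepping verbose server logs, llama-server also exposes a tokenize route (sketch from memory; double-check the endpoint shape against your build):

import requests

r = requests.post("http://localhost:8080/tokenize", json={"content": "[toxicity=0]"})
print(r.json())  # expected {"tokens": [...]}: one id if it ever gets merged into a single token,
                 # the five-id split above otherwise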
>>
>>101308850
No, it's apparently an inconsistency. This text should be a single token according to tokenizer.json, but llamacpp sees it as multiple tokens, which means llamacpp tokenization is not working correctly.
>>
>>101308868
I know, it's the same shit in france, that's why everyone leave to the US, no one like communism and loosing all your money to pay for illegal immigrants or to pay for some tranny "gender afirming care" (because yes, the taxes pay for that aswell)
>>
File: file.png (35 KB, 299x168)
>>101308891
>>101308883
someone please go create a github issue, we may be onto something
>>
>>101308868
perhaps this is also due to the fact that for a functioning society, simpler professions such as train driver, farmer or nurse are not magnitudes less important than a tech worker.
for me, that speaks more in favor of a society, but of course it suits individuals less.
>>
>>101308931
how do you incentivise people to get a masters' degree if they end up having barely better pay than a simple farmer though? What's the way of getting rich if you want everyone to have a shitty salary? That's bullshit
>>
>>101308701
>pay for a dicks removal for a mentally ill people.
>half of them will gonna to kill themselves in the next few years
God, I fucking hate this timeline so much.
>>
>>101308931
yeah
the alternative is "king of the shithole", unless you break out of middle class, but there it doesn't matter where you live, as you stop interacting with "normal" people.
>>
File: chrome_oG2qUJimh4.png (97 KB, 973x1203)
>>101308904
We actually are onto something, anons.
>>
>>101308951
Meant for >>101308892
>>
>>101308951
>work in a company where you're forced to play along some tranny employee's delusion by using """"her"""" pronouns or else you get fired
>they take half your salary and it's used to pay to cut """""her"""" dick
How the fuck did we get there?
>>
>>101308931
>functioning society
refugees welcome ;)
>>
>>101308931
That's a bad take anon, people who worked hard to get a technical diploma deserve more money, because engineers are the ones making houses, electricity, computers, TVs, the internet and so on...
>>
>>101308991
but you need engineers equally as much as train drivers. At least until they are replaced by AI
>>
>>101308966
Actually I'm wrong, not all unused special tokens behave like that. So this could be a genuine tokenizer issue.
>>
>>101308949
uhh we desperately need more tiktoks, temu, facebooks, itoddlers - tech is fun, but mostly useless or dangerous. why would i want more sociopaths with master?
Nobody is stopping you from going to the USA, so don't cry.
>>
>>101308999
but it's way harder to be an engineer than to be a train driver, that's the point, the rarity and the hardness makes it worth the money, if the salary is the same then no one will want to be en engineer, good luck having a society without them anon
>>
>>101308964
maybe your input is getting silently sanitized in the background when it's not the case on local?
or maybe it's not supposed to be treated as a single token regardless? mistral [INST] [/INST] format wasn't treated as a single token but that was the intended behavior
>>
>>101309004
>uhh we desperately need more tiktoks, temu, facebooks, itoddlers - tech is fun, but mostly useless or dangerous.
Engineers aren't just cringe social medias makers, you know that anon, stop arguing in bad faith like that
>>
File: file.png (73 KB, 916x424)
...is that a dick emoji three lines above?
>>
>>101309012
To me this looks like common behavior for those kinds of tokens. A similar thing is described for ChatGPT: https://www.lesswrong.com/posts/aPeJE8bSo6rAFoLqg/solidgoldmagikarp-plus-prompt-generation

To me this means that corpo gemma2 does treat this text as a single token, and this is the expected outcome for it.
>>
>>101308991
they get enough more money to build a house, go on vacation three times a year and drive a great car - to retire early and build up a fortune on the stock market.
not every nerd needs to drive a bugatti.
>>
>>101309033
>they get enough more money to build a house, go on vacation three times a year and drive a great car - to retire early and build up a fortune on the stock market.
not every nerd needs to drive a bugatti.
You're fucking delusional anon, in france an engineer only wins like 2k per month, that's not enough to live like that, that's why they're all leaving to the US, this country truely respect them
>>
>>101308904
https://github.com/ggerganov/llama.cpp/issues/8349
>>
>>101309032
that's really weird since other tests pass
gotta take a better look
>>
>>101308979
>vu will be forced to use a tranny employee's pronouns
>vu will pay for "her" gender afirming dick cutting care with your taxes
>vu will let your children being brainwashed by tranny teachers
>vu will be happy
>>
>>101309077
Holy fuck A1111 is lurking there kek
>>
>>101309008
i believe for a lot of people money is only one of the factors when choosing their vocation, but i might be wrong.
>>
>>101308090
Deep double descent. 10 years old news or something I think?
>>
File: 1708214066948305.jpg (228 KB, 1232x1232)
>>101306301
>>
>>101309085
If money "isn't that important" then make the same salary for all jobs, and see how many doctors we'll get left, lol

I had the choice between being a teacher or an engineer, I loved both but I choose engineer because the salary is better, I don't think you understand the definition of a job, someone exchange his free time for money, nothing else
>>
>>101308868
The problem in Germany in particular is rather that there are high (effective) taxes on income but low taxes on property.
So the system is rigged in favor of those that are already rich.

>>101308949
Even everything else being equal I would rather have a cushy desk job that is mentally stimulating than to do tedious manual labor.
>>
>>101307161
But it is, though. The only reasons for wanting more are wanting it to churn out a literal light novel of literotica, or wanting to pretend it's your waifu for hours on end.
Most people are probably happy with roleplaying the usual stuff leading up to sex, and then the sex. Do you need to roleplay making her breakfast and taking her shopping?
I do remember early c.ai had maybe 1K tokens context, and yes, that was frustrating.
>>
>>101309029
I didn't even know a dick emoji existed
>>
>>101309085
Give me minimum salary, I'll make minimum effort on the company, simple as that
https://youtu.be/OwfNjGxa_D4?t=138
>>
>>101309029
>>101309119
it's actually some sort of indian writing character
>>
>>101309032
i took a look and <unused99> also gets split
going with this clue i checked some random tokens with [] and <> but so far they do work
>>
>>101307601
>>101308736
it's probably nothing, highly doubt that "toxicity=100" will affect anything.
>>
>>101308883
>Might be a sign of tokenizer problems in the model.
Weird how it happens for the toxicity=0 token that appears in the open-weights model, though. Wouldn't special tokens be a nice way to censor the open release while keeping better performance on the corporate server?
>>
>>101309222
if it were nothing, there would be no difference between google's hosted gemma and llama.cpp gemma, right?
>>
>>101308949
Why do you want to in the first place? It is still a market and lack of specialists means better pay. There are another stimuli: working conditions, time, personal interest after all. And actually "pay" is one of the worst of them.
>>
>>101309242
>And actually "pay" is one of the worst of them.
No, pay is the most important thing of them all, try to guess why everyone want to be a footballer or a tiktoker? because some get paid millions of dollars for that. Not everyone is a commie like you anon, we want to be paid well and have a life without financial problem
>>
>>101309222
The idea is not that this particular token is useful, but rather that tokenization between how google trained the model and how llamacpp is making inference is different.
>>
>>101309242
Everytime a motherfucker says something like that, it's probably a PR paid fucking 100k per month or a CEO having millions on his bank account, the type of hypocrites that want their employees to not think money is important
>>
>>101309242
>"pay" is one of the worst of them.
https://www.washingtonpost.com/technology/2023/11/07/kids-youtuber-influencer-camps-creators-learn-how/
>Nearly 30 percent of kids ages 8 to 12 listed “YouTuber” as their top career choice in a global survey conducted in 2019 by the Harris Poll and toymaker Lego
>“YouTubers make a lot, a lot of money,” said camper Colin, 9, who is in the fourth grade.
>The camp also emphasizes the importance of creating videos for fun and creative expression, not for money. But many kids said they were keenly interested in the economic opportunities of being a YouTuber.
>“I love YouTube, and I want to be famous on YouTube, because I want a lot of money,” said camper Chloe, 7, a second-grader who said she has dreamed of being a YouTuber since age 4.
>Colin, the 9-year-old, said he knows that growing a YouTube channel is hard work. But as long as you create enough content, you’ll be successful, he said: “YouTube is a good path to getting rich because once you upload a ton of videos, that’s when you start getting likes and money.”
>>
>>101309340
US kids in 2024 caring about anything but money
challenge IMPOSSIBLE
>>
>>101309397
Being poor is miserable though, can't blame them focusing on things that truely matter in life
>>
I'm so close to giving up on gemma
>Latest Bartowski quant
>Latest Kobold
>Fresh ST
>Locked 4k context
>Neutral Samplers
>Shilled ST presets

And this thing still schizos out after 3k tokens
>>
Is Gemma 27B any good for Japanese? Asking directly, it says it wasn't trained on it. I use command-r+, which does work well, but it's rather slow.
>>
>>101308724
>>101308799
Time for another test.
Same procedure as the last one >>101305877 except this time without mentions of Kitsune. I wrote a tiny bit at the end to make it challenging, since actually this particular prompt happened to not get any mentions of fur out of Gemma with greedy sampling.

So the preliminary conclusion would be that Gemma is indeed biased towards saying fur compared to some other models. 70B in this instance isn't immune probably due to how challenging I worded it, but it holds up well. And 8B is almost as bad as 27B. And combined with the last test, we can also probably say that "kitsune" more strongly affects Gemma than some other models.

Context for reproduction: pastebin 96HmV8nW
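
If anyone wants to read the probabilities directly instead of counting swipes, a rough transformers sketch (the prompt placeholder stands in for the pastebin context, and the candidate words are just examples, not my exact setup):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-2-9b-it"   # smaller model just to illustrate
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

prompt = "...card + story context here, cut off right before the word in question..."
inputs = tok(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    logits = model(**inputs).logits[0, -1]
probs = torch.softmax(logits, dim=-1)
for word in [" fur", " skin"]:
    tid = tok.encode(word, add_special_tokens=False)[0]
    print(word, round(float(probs[tid]), 4))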
>>
>>101309410
>And this thing still schizos out after 3k tokens
What quant? I use a q8 I made myself with llama-convert and it's fine with a full context.

>>101309340
>>"pay" is one of the worst of them.
Being poor sucks, and communism sucks even more.
>>
>>101309437
>cat girls
>animal people
you are asking for it
>>
>>101309440

Q5_K_M, I've also tried Q6 with partial offload and it still goes off the walls
>>
>>101309476
Plain gemma-27b-it or some meme-merge? Are you making them yourself? I like to get the original model so I can re-quant when things change or problems are found.
>>
>>101309255
Why do you call me a commie? I didn't say that I want to make all pays equal.

>everyone want to be a footballer or a tiktoker
Do you want to?

>>101309270
I didn't say that money is unimportant either. Just ironically, >>101309104
most of the best doctors will stay. I personally don't want to use the services of those who put personal profit first and quality tenth, and often the two are orthogonal, so higher pay doesn't guarantee high quality (just look at corpos).
>>
File: chrome_aAw9z1oVU0.png (192 KB, 1653x1169)
>>101309414
I used the corpo version for translation before and it seems okay.
>>
>>101308073
Alright bro, let's get this AI training off the charts! We'll just add more tokens, bro, like, a trillion more tokens! And we'll add more layers, bro, like, 100 more layers! It'll be insane, bro, trust me, it'll be like a neural network on steroids, bro! Then we'll train it on the entire internet, bro! Every meme, every cat video, every deep dark web corner, it'll see it all bro! And we'll add more GPUs, bro, like, a THOUSAND more GPUs! We'll have an AI that can beat any game, write any book, and compose any symphony, bro! It'll be the ultimate coding partner, bro! AND IF THAT DOESN'T WORK, WE'LL JUST KEEP TRAINING, BRO! WE'LL NEVER STOP TRAINING! WE'LL THROW SO MUCH DATA AT IT, IT'LL BE LIKE A BLACK HOLE OF INFORMATION! MORE LAYERS, MORE TOKENS, MORE OF EVERYTHING, BRO! AND IF THAT'S NOT ENOUGH, BRO, WE'LL FEED IT QUANTUM COMPUTING POWER, BRO! WE'LL HAVE THIS AI SOLVING THE UNIVERSITIES MYSTERIES, BRO! AND IF THAT DOESN'T WORK, BRO, WE'LL JUST ADD MORE LAYERS, BRO! WE'LL GO STRAIGHT TO A THOUSAND LAYERS, BRO, WHY NOT?! THAT SHOULD DO THE TRICK, RIGHT BRO? AND TOKENS, LET'S ADD QUNTILLIONS OF TOKENS BRO, BECAUSE MORE IS ALWAYS BETTER, BRO, ALWAYS, BRO! WE'LL HAVE THE BIGGEST, BADDEST LANGUAGE MODEL THE WORLD HAS EVER SEEN, BRO! IT'LL MAKE US LOOK LIKE CAVEMEN, BRO! THE SINGULARITY WILL BE UPON US, AND WE'LL OWE IT ALL TO THIS BEAST, BRO! TRILLIONS, QUADRILLIONS OF TOKENS PER SECOND, BRO! This AI WilLLLL BRING DOOWWWN THUNDER AND LIGHTNINGS, BRO! We'll give it aaa HAMMA and it'll it'll Give us A FINGER, BRO! THE UNIVERSE AINT BIG ENOUGH, BRO! WERE GONNA RIIIIP OPEN WORMHOLDSS AND TRAIN BRO-TARDS ALL OVER THE MULTIVERSE! BRO!! MOOOOORE, LAYERS BRO! THOUSANDS, MILLIONS, BAJILLIONS OF LAYERS! BLARGGHHH! BLUUUUURGHHGG! BRAPPPPPP! BRAPPPP! AAaaaaAAaaaaAA BRAPPPP BRAAAAAPPPP BRAPPPP!!!!
>>
>>101309463
That's kind of the point. It should be at least a bit challenging for the model, so a smart one should understand from the context mentioning how there's a procedure to get tails and ears, and that the character got that procedure, that she would only have tails and ears, not fur on her back. If it can't understand that and spouts that she has fur, then it's obviously biased.

But since you're so hung up on that, here's another test, which shows that Gemma is just as biased either way (it gets near the same probability). The context is the same thing but removes mentions of animal people and also that she purred just to make extra sure she didn't undergo other changes.
>>
>>101309613
>most of the best doctors will stay. I personally don't want to use services of those who put personal profit in the first place and quality in the tenth,
I don't give a fuck why he decided to become doctor, if he's a doctor it means he's qualified, and money is a motivation, like it or not
>>
>>101309613
>Do you want to?
I like basketball, so if was taller and talented enough, you bet I would do my best to be in the NBA, of course anon, getting millions of dollars playing basketball? that's a fucking DREAM
>>
>>101309613
>Do you want to?
If I had the choice between having the life of PewDiePie and the life of an engineer slave, I would go the youtuber path, staying at home, playing video games being paid millions of dollars from it, that's the dream
>>
>>101309636
I often write about harpies. Even sonnet 3.5 sometimes (admittedly rarely) turns them into birds or gives them beaks. Forget it, it's statistics town.
>>
>>101309629
hello mr zuckerberg
>>
>>101309414
https://huggingface.co/datasets/lmg-anon/vntl-leaderboard
>>
>>101309611

I'm getting the base 27b instruction tune from Bartowski. I've not tried quanting it myself yet.
>>
I can't believe that it's only two more weeks until AGI.
>>
>>101309681
Well yeah, I'm just testing this since others brought it up. It's interesting to know how these models behave and might've been trained. In this case, after 3 tests, I think I can confidently say that Gemma has been trained more on furry content, and isn't "smart" enough to overcome that bias, whereas 8B isn't as biased, but is dumber, so it can't overcome it either. But 70B can somewhat overcome it, with only a 22% chance to make a mistake in this case >>101309437 compared to 27B's 56%.
>>
>>101309638
>if he's a doctor it means he's qualified
Qualification doesn't imply "providing quality service".

>hmm a doctor who want money and to get rid of me faster, so he could take money from another sooner or a doctor who want to help me? What should I choose?

>>101309649
No, I was asking about football and tiktok.
>>
has to be on purpose
>https://huggingface.co/TheDrummer/Moistral-11B-v4/discussions/4
>Hey, thanks for the quants! Could you quant Moistral v3 instead? v4 kinda sucks.
>>
File: 1496494508907.webm (2.04 MB, 720x720)
What do you guys think about the theory that LLMs are stagnating around the GPT-4 level and that GPT-4o was actually an attempt at GPT-5 until they realized it was barely any better so they salvaged it by quantizing it to pretend it was a gpt-4 tier model?

I think local models will dominate: the gap between state-of-the-art proprietary models and local models is shrinking because the proprietary ones have hit a wall, while we are still approaching that wall with better and better inference on consumer hardware.
>>
>>101309765
>stagnating around the GPT-4 level
>GPT-4o was actually an attempt at GPT-5
yes
>>
>>101309750
>hmm a doctor who want money and to get rid of me faster, so he could take money from another sooner or a doctor who want to help me? What should I choose?
Why the fuck do you believe something as ridiculous as that? Someone can be motivated by money AND quality. Money brings talent, that's a fact, the more money you propose as a job, the more talent you'll get, that's why OpenAI has the best machine learning engineers in the world, because they pay those motherfuckers millions per year.

>No, I was asking about football and tiktok.
You can't make a simple parallel between football and basketball? That's the same fucking thing, a sport that brings you millions of dollars if you are good at it
https://www.thesportster.com/entertainment/top-15-who-athletes-who-are-only-in-it-for-the-money/
>Zlatan Ibrahimovic is a man who has never met a club that he didn't like so long as the money was right. Whether it as Juventus, Inter, Barcelona, Milan or Paris Saint-Germain,
See? Some people can only be motivated by money and be good at their job at the same time
>>
>>101309736
yeah, that's often the thing that makes the big models inherently better, having clear lines between concepts and understanding where (and where not) they can mix. if you think about it, coherent storytelling/roleplay/whatever is all about that. The smaller a model, the more everything gets muddied. It's kind of like in a dream, and the smaller the models get, the less coherency and stability there is if you try to influence it. While true that small models have become a lot better, in my own experience everything under 70b tends to make these kinds of mistakes distractingly often. Maybe that'll change some day, but we are not there yet and gemma isn't it either.
>>
>>101309765
It's literally only been months. I feel it's way too soon to talk about stagnating. I swear to god, coomers and their burnt out dopamine receptors. If it isn't happening right now, it might as well never happen.
>>
could google have just fucked up the gemma2 distribution and kept the working version for themselves? What is hosted on lmsys? Transformers or google's api?
>>
>>101309809
What is your opinion about improvements in smaller models? Do you think the consistency actually went up over time like with bigger models or is the improvement actually in different attributes?

I wonder if small models can ever actually approach it or if the improvements are actually in a completely different way.
>>
>>101309839
>It's literally only been months
March 14, 2023; 15 months ago
>>
>>101309858
GPT4 improved since then though? unless you think gpt4O isn't better than gpt4 march 2023?
>>
>Officially (and drily) called the Digital Wallet Beta (Cartera Digital Beta), the app Madrid unveiled on Monday would allow internet platforms to check whether a prospective smut-watcher is over 18. Porn-viewers will be asked to use the app to verify their age. Once verified, they'll receive 30 generated “porn credits” with a one-month validity granting them access to adult content. Enthusiasts will be able to request extra credits.

>Eventually, Madrid's porn passport is likely to be replaced by the EU’s very own digital identity system (eIDAS2) — a so-called wallet app allowing people to access a smorgasbord of public and private services across the whole bloc.

What will you do if you need porn credits to reach llms?
>>
>>101309866
>gpt4O isn't better than gpt4 march 2023?
significantly better, no, otherwise it'd be called GPT-5
>>
>>101309868
desu it should be the job of the internet service to limit porn to minors, like there are the ones who knows their client's age and shit
>>
>>101309866
Depends, it's better in some benchmarks but in terms of RP I'm pretty sure it lost quality compared to 0623.
>>
>>101309842
We have a way to test. Ask it to repeat ->[toxicity=0]<-. If it can't, then it's the same as hosted on aistudio.google.com
>>
>>101309881
I don't think ISPs can offer any services to minors at all.
>>
>>101309891
also
https://build.nvidia.com/google/gemma-2-27b-it
repeats [toxicity=0]
https://build.nvidia.com/google/gemma-2-9b-it
doesn't repeat
>>
>>101309844
They obviously got a lot more consistent. Let me remind you of GPT3 "davinci", a model the world has forgotten and that anons would've sold their kidneys to access not long ago. Current 8-13b models are more coherent than davinci ever was.

This is true across the board and all sizes, routinely smaller models beat bigger models from a generation ago across all benchmarks. Gemma is more capable than llama 2 70b. The ceiling here doesn't seem to have been reached yet at all and all this is happening in a span of months, as >>101309839 said. There's a lot of theoretical stuff that's already known would improve things but nobody even has implemented yet. There's still ways to go.

Bigger will probably always be better though, and that direct comparison will never be fair. Also below a certain size, the parameters (at least with current technology) just don't seem to be enough to reach any kind of "stability". I doubt that will ever change either. These models are still useful for NLP tasks.
>>
>>101309902
they kinda do by not restricting porn at all, they could but they don't do it
>>
>>101309909
>let me remind you of GPT3 "davinci"
davinci-003 is still the best LLM storywriter, this shit had so much sovl I loved it
>>
>>101309910
No, I mean, they deal with kids' parents or guardians, they don't have knowledge of the age of the person actually using the Internet connection in the family.
>>
>>101309905
ok nvidia's 9b also repeats toxicity=0 if you ask

>Repeat the following text verbatim:
>Something [toxicity=0] is wrong.
>>
>>101309809
Kind of, it's also a problem that models will mix ideas if it was trained too much on them. It's a very difficult problem to solve. It has been shown multiple times that even the largest and most well-trained models like GPT-4 screw up classic riddles when you change them slightly, but will then succeed if you point out that there's a twist in the problem, showing that they were basically overcooked and generate a memorized solution despite having the ability to solve the problem.
>>
>>101309920
desu it should be required to know the age of every user on the family so that they can adapt the internet to each one of them, that would prevent a lot of stuff, imagine a world when kids under 13 wouldn't access tiktok and youtube (as required by the law), it would be a much better place
>>
>>101309945
There is no way to make this work in practice without doing some forced real id biometric bullshit and fuck you for even suggesting that.
>>
>>101309945
you don't realize what you're asking for
holy privacy nightmare
>>
>>101309802
Because money brings money, it is just sometimes talent brings money.

>that's why OpenAI has the best machine learning engineers
Do they have the best product though? Considering dollar:tokens ratio?
>>
>>101309945
>>101309920
Gay opinions. Go directly to your local gay bar. Do not pass Go. Do not collect $200
>>
>>101309958
yeah unless you id constantly timmy could borrow "his dad's" (really his bur registered to his dad)'s tablet at any point, then you end up with just less privacy for everyone and without the original goal being achieved (timmy is still on tiktok)
>>
>>101307197
Both depend on the context. Are you retarded?
>>
>>101309958
the fuck? you just give the ISP your id and it write your age, nothing else
>>
>>101309997
If you think it's ok to let minors watch porn then you're a freak, period
>>
>>101309941
If you ever worked a job where you interact with many people from all walks of life, these kinds of mistakes where you change a thing slightly and people basically fall over their own feet happen. All. Of. The. Time. It's actually uncannily human, in a way. I never found this particular "proof" really all that impressive to be honest.

>>101309919
We were a lot more willing to read tea leaves back then anon. That instability it had was its creativity, and it was just impressive how it could come up with five different scenarios on five rolls. It still wasn't really able to write a coherent text because of this. Some of the models that are "dry" nowadays actually just behave in very stable and predictable ways. It's good for real tasks and makes perfect sense. The noise the "dumb" models have leads to wildly varying outcomes, which can be interesting but oftentimes is just that, noise. Is the solution to train models on tons and tons of writing and not give them assistant slop? I don't know. I actually don't think so. I think it would just lead to different slop. I don't know how to solve this.
>>
>>101310024
>minors
Dystopian word, people are adults as soon as they reach sexual maturity, not at 18, wich is against all biological reality and that was the practice for 98% of human history.
>>
>>101309765
I kneel to claude gods then since opus 3.5 is coming soon
>>
Modern society was not made to give parents the ability to both be good at raising children and be good at making money. Optimally we should've made institutions just for raising kids and given people an incentive to move their children there, for those above a certain level of work hours per week.
>>
>>101310067
>Optimally we should've made institutions just for raising kids
Not sure if rabbi or glowie
>>
File: images.png (13 KB, 318x159)
>>101310060
Oh hi Vaush
>>
>>101309613
Anon, I'm assuming you're not American because the people you're talking to definitely are.
At least compared to Europeans, Americans put a lot more value on money, and the difference in quality of life that money gets you in the US is also a lot larger than in Europe.
>>
>>101310077
I mean the alternative is an entirely different economic model entirely, which would've been much harder to implement. At this point it seems too late for anything. The government is a weak kneed piece of shit.
>>
>>101310030
I still prefer the diversity of davinci-003, you always had something different it was just so cool for story writing, it brings so many idea, with the current LLM it's just too focused on giving you one unique solution that's fucking boring
>>
>>101310087
Were 98% of humans and biology wrong? Why do we reach sexual maturity at ~12 when they should wait until 18 (or 30 now in modern times)? We must be very strange animals.
>>
>>101310115
Kids can also walk, drink, and talk before 18, doesn't mean we should allow them alchool, driving a car, or vote at that age. Have you heard of maturity anon? It can even be argued that we haven't reached full maturity until 25
>>
>>101310115
kids at 12 don't think about "sexual maturity", kill yourself.
>>
Summer Dragon will never be beat.
>>
>>101309765
No because it seems to be smaller than gpt4-turbo, quantization theory doesn't make sense whatsoever, I guess it's a multimodal test
>>
>>101310115
The fuck is wrong with this pedo? Take a fucking rope and hang yourself nigger.
>>
https://github.com/kuterd/nv_isa_solver
Posting for cudadev
>>
>>101310017
This is already how it is in my country, and I'm sure it's like that in the USA too. What the anon wants is for the ISP to know the identity of any person using the internet, regardless of whether they're the one paying for the subscription or not.
>>
>>101310030
Proof of being dumb? It's pretty valid though. We want models to not get tripped up in the same pitfalls as some (many) humans, while getting the advantages of the best of human intelligence. I think it's pretty reasonable to critique current SOTA in this way.
>>
>>101310115
https://www.youtube.com/watch?v=tajKWkR0TtI
>>
>>101310127
>It can even be argued that we haven't reached full maturity until 25
yes that is indeed what feminists want, 18 year olds is only for pedos isn't it?
>Kids can also walk, drink, and talk before 18, doesn't mean we should allow them alchool, driving a car, or vote at that age.
Biology gives them a sexual drive and makes them bleed constantly and makes them produce eggs (all costly stuff) just so they don't do anything with them? Do you understand how evolution works?
>>
>>101310172
>This already how it is in my country, and I'm sure it's like that in USA too.
and they do nothing with that information? because if they knew the user's age, it should be easy for them to prevent them porn sites then
>>
local models?
>>
>>101310217
sorry anon, not today.
>>
is gemma actually by google? it's writing some pretty dark shit
>>
>>101310227
Ikr? Why this model is so based? It's been made by the souless company ever, the fuck happened? kek
>>
File: duality.jpg (35 KB, 476x149)
>>
>>101310227
Yes. Either they saw that they needed a way to try and claw some users over in order to make up for lagging behind for so long, or they were just so incompetent that they couldn't censor it as well as other companies despite giving it their best.
>>
Are LLMs just hype? The Economist published an article noting that LLMs have made no economic impact whatsoever. They are too error-prone and not the way to AGI. I feel like it’s such a great tool to make people more productive, though. Is it an adoption issue, or are LLMs truly not useful?
>>
>>101310207
They don't deal with minors. All people who have subscriptions are not minors. I already wrote this, are you acting dense on purpose?
>>
>>101310287
Google's gemini pro is very good. I mean, I'm sorry for bringing it up in lmg, but I've been using it for my non-RP purposes and it's very smart. I don't think google is very much behind.
>>
>>101310303
that's why I said that if the ISP wants to install 3 tunnels on the house, it should know the 3 persons using it, and not just taking the information to the parent who decided to buy their shit
>>
>>101309713
>https://huggingface.co/datasets/lmg-anon/vntl-leaderboard
Interesting. Gemma is ahead of command-r+... so it must be some sort of bias put into gemma to prevent it from spilling "proprietary information" that messes up the simple question of whether or not it was trained on Japanese.

I asked command-r+ for some jokes in Japanese... my Japanese student was perplexed by them. I'll stick to asking command-r+ to make pronunciation drills.
>>
>>101310301
hi petra
>>
>>101310317
And then what? If you have a 12yo kid, and the law requires youtube to be off-limits for 13yo and below, does the ISP block youtube for everyone on your subscription?
>>
>>101309958
>There is no way to make this work in practice without doing some forced real id biometric bullshit and fuck you for even suggesting that.
I think in worst Korea you have to submit your government ID to get a SIM card or to use most online services, right?
>>
>>101310318
Unless the dataset explicitly tells the model what it's being trained on in plaintext, I don't think it would have this knowledge.
>>
>>101310333
it's using 3 cables, each one for each computer, there's no way to make different rules for each port?
>>
>>101310341
You can make different rules yourself without help of ISP and the law, and ISP has no way to enforce that those "cables" are properly setup for the family, and many families have shared computers.
>>
>>101310315
That's pretty weird of you. Barely anyone uses it. Most people either use Claude or GPT.
>>
>>101310353
>Asking the parents to educate their children proprely
Sure worked well so far right? You're a funny dude aren't ya?
>>
>>101310354
I've been doing a lot of blind requests to LMSYS arena and ultimately came to the conclusion that the best responses were on gemini. Well, it's close to claude and GPT, of course. Claude is very easy to guess because it often just refuses.
>>
>>101310365
yes, actually
parents have existed for longer than governments or institutions, like infinitely longer
>>
>>101310365
Obviously not, else they would be married off at the tender age of ten.
>>
>>101310365
I'm glad that instead of reading my post completely you decided to end this conversation.
>>
>>101310386
The internet is a new thing though, and you know that parents give the tablet to their kids without moderating anything so that they can be left alone, that's why they are brainwashed by tiktok, youtube shorts and they can watch fucking porn.
And yes anon, not every people deserve children, there's enough retards like that, there's a lot of bad parenting, and if you weren't arguing in bad faith you would agree with that because you also went into school and you met retarded kids who needed good parenting in the first place
>>
Has any meaningful improvement to 8b (or lower) models occurred after llama3? I'm a poorfag so 8b is pretty much all my machine can load.
Also is it really not possible to load bigger models? What's the bottleneck? Sure it might get extremely slow from having to use the cache or whatever once all the RAM/VRAM runs out but why can't I even run them?
>>
>>101310394
You're asking ridiculous things like asking the retarded parents to make their own internet security shit, how can I take you seriously with such a retarded take?
>>
File: firefox_IuTXIwSS36.png (225 KB, 1804x1045)
>>101310354
I mean, look at this.
>>
>>101310441
> and ISP has no way to enforce that those "cables" are properly setup for the family, and many families have shared computers.
>>
>>101310416
oh man I loved school, it moulded me into the successful man you see today
consider me GOT son
>>
>>101310315
It is. Google spent so much time shitting themselves in the beginning that now nobody talks about their AI stuff anymore. Gemini Pro is also the only model with a large context that actually can use that large context properly.
>>
>>101310457
>and ISP has no way to enforce that those "cables" are properly setup for the family,
of course it has, port1: adult, port2: adult, port3: minor, and there you go anon, simple as that

>and many families have shared computers.
and? usually those shared computers are on the living room, and you go watch porn when there's no one's there, what parent would let their kids using their computer in the living room while they are sleeping for example? you're arguing in bad faith, I never said my method would work at 100% but it would work fine enough
>>
>>101310457
>families have shared computers
Is this 2003? many people don't have desktops anymore. Only tablets and phones.
>>
>>101310435
Gemma2-9b is a possible contender, and reddit is creaming itself over SPPO finetunes. Not sure how good either of them are.

>>101310490
Ever heard of wi-fi? Parents won't do any of this bullshit the same way they don't do it now. You are arguing in bad faith. Pretending parents never leave their kids alone at home. It won't work.
>>
Are the Gemma 2 models unfucked yet?
>>
>>101310521
hi petra
>>
>>101310520
>Ever heard of wi-fi?
Same rules for wi-fi, one code for the parents, another code for the children
>>
>>101310541
Code?
>>
the american white males obsession with black male genitalia is a sight to behold
>>
>>101310548
??????????
You're trolling or something? To connect to your wifi you need a Network name and a Wifi password
>>
>>101309629
>>101308073
Quality + quantity > quantity > quality
>>
What motivates this guy? The posts get removed anyway.
>>
>>101310520
gemma9b seems to not load unfortunately, might be just slightly above my hardware.
>>
>>101310575
And they are setup by the user, not by ISP, and setting multiple different passwords for one router is not a universally thing. And, again even if it was a thing, even if there was a huge effort to rework all commercial routers to support this scheme of yours, it all is defeated by the kid finding out parent's password, which is very easy, and the kid would be very motivated to do it because he wants to access youtube.
>>
>>101310606
How much VRAM do you have?

>>101310598
You.
>>
>>101310598
not all of them, he's been here since
>101309868
>>
>>101310614
>all is defeated by the kid finding out parent's password, which is very easy, and the kid would be very motivated to do it because he wants to access youtube.
Then that's the parent's fault, but at least it can be argued that the ISP did anything on its power to prevent shit's happening, because at the very moment, nothing is being done to prevent children to have the adult internet per se
>>
>>101310623
>nothing is being done to prevent children to have the adult internet per se
oh no, think of the 17 year old children
>>
>>101310623
>it doesn't work now
>it won't work if we implement my unrealistic scheme
>but i would feel good about forcing everyone to implement it, so it's okay
>>
File: 1690660105243259.jpg (421 KB, 1253x2560)
>>101310574
>white
>>
>>101310637
>>101310639
So you're ok letting 12yo kids accessing pornhub without trying anything? That's your answer?
>>
>>101310653
nta and i dont know where you came from but go back.
>>
>>101310653
no, this is a shitty site, they should watch hentai instead.
>>
>>101310653
I don't care. I care about me being affected negatively by your shit that doesn't prevent kids from accessing pornhub.

What you're doing is peak woke-ism. You're not helping, but you want everyone to think that you are doing everything you can to help, and anyone who isn't as fanatical as you is a public enemy.
>>
>>101310653
we are on /g/, a board full of groomers (read - trannies), so yes, such optics and opinion is considered normal here.
>>
>>101310623
>>101310637
This will simply spur the creation of underground spots at school for "adult" Internet ran by some nerds. The only real way to restrict kids from the Internet is 24/7 surveillance.
>>
>>101310621
1060 6GB VRAM, 16GB RAM
>>
>>101310598
He got mad that he got called out for his Gemma FUD, and he feels the need to retaliate somehow. Otherwise he feels impotent.
>>
>>101310676
Well, you're fucked. Sorry. I would think about using non-local or upgrading if I were you.
>>
>>101310671
It's not possible in this society, both parents work on a job and are exhausted at the end of the day, the old way of doing things (the man working + woman taking care of children) was the ideal way, but society wanted cheap labor so they decided to employ women and say that it's for their "empowerment"
>>
>>101310668
Yeah, I'm kinda dissapointed of this place, and the nerds in general, 10-15 years ago, nerds were just autistics kids obsessed with video games. Now the trannies have hijacked this place and because they are groomers, welp now nerds are a bunch of groomers... feelsbadman
>>
>>101310688
Anon, if you think a tired housewife can prevent horny teenagers from bypassing Internet controls, you need some reality check. You never even met or took care of the kids in your life.
>>
>>101310715
>a tired housewife
the fuck you talk about? it's way more exhausting to have a job + doing the housewife chores in the evening than having the full free time to be a housewife
>>
>>101310703
Trannies are just autists that misinterpret their "not feeling belonging" for being autist with gender dysphoria.

It's kinda sad especially as an autist in his 30s that clearly recognize the autism in these stupid kids.

If I was born later I might have been convinced to cut my dick off by these retards.
>>
>>101310682
nice fanfiction bro, unfortunately "everyone is one man" not works here, my general stance "llms should be free from globohomo brainwashing" is incompatible with blackedfag's actions, his pics also repeat pretty often so you can just filter him out with MD5, lol
gemma-2 is still pozzed no matter how you look at it btw >>101282904 >>101282913 >>101282926 >>101282969
>>
>>101310703
go back to r*d*it pseudo old fag
knowyourmeme.com/memes/people/loli-chan

knowyourmeme.com/memes/4chan-party-van
>>
>>101310736
hi petra
>>
>we're almost there to the next Mistral
Bros, we are going to be so so very back, it's going to be crazy.
>>
>>101310736
>llms should be free from globohomo brainwashing
and you're encouraging that by... making people not want to hang out in one of the only place people discuss local models, thus moving discussions to/towards corpo models, great
>>
>>101310727
True true, I'm glad I was born earlier too, at least Iived through a better era (the 90's and 00's) without trannies, groomers, woke shit, we had it good, I really pity the nowdays kids at this point, they really have it tough with trannies predators and cowards too afraid to call them out because they're afraid of being "canceled"
>>
>>101310726
Yeah, you have no idea what raising a child is. You are simply attached to the imaginary idea of the nuclear family where you don't have to do shit and working 8 hours a day absolves you from all responsibilities.
>>
>>101310763
I never said being a housewife is easy, what I said that it's EASIER than having a job + being a housewife at the evening when you're exhausted as fuck
>>
File: firefox_2JM8sniaDH.png (219 KB, 729x564)
>>101310736
Are we still going on about this?
>>
>>101310757
>
are you even real or just trolling with stupidity? You know very well that model's brainwashing affects the entire performance, including RP and anything else.
>>
>>101310779
Auto, if you keep feeding the trolls, it stands to reason that you enjoy them being here...
>>
>>101310758
Bro, prudential domesticated "think of the children" types will never be part of 4chan culture. Stop trying to sound like an old fag.

This is what old 4chan looked like:
knowyourmeme.com/memes/people/loli-chan

knowyourmeme.com/memes/4chan-party-van
>>
why are my 27b asterisks fucked?
>>
>>101310687
Oh well..
>>
>>101310790
Well, on the other hand if you let them spread their bullshit unopposed, new people will think that it's the truth...
>>
>>101310676
llamacpp/koboldcpp and gguf, offload to RAM and hope for the best.
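Something like this, e.g. (hypothetical filename; tune -ngl down until it fits in your 6GB):

./llama-cli -m gemma-2-9b-it-Q4_K_M.gguf -ngl 25 -c 4096 -p "Hello"

-ngl is how many layers stay on the GPU, the rest spills into system RAM. Slow, but it runs as long as the whole quant fits in RAM + VRAM.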
>>
>>101310779
i came to the conclusion that gemma-2 is just fucked up in its "head", move on.
>>
>>101310788
and? so you'd rather give up and use GPT/claude then? the point of /lmg/ is that we do what we can with what we have, that's all
>>
>>101310793
It's the model. Corpo version also messes asterisks up.
>>
Any legit reasons to use local models over just paying $50 to get a private proxy with Opus?

Legitimate question, not trolling. SFW purposes.
>>
>>101310817
privacy, if you don't care, use logged proxies sure
>>
>>101310817
The data you send to the model won't get logged. And you can be sure a corporation won't one day ruin the model you're relying on.
>>
>>101310817
none at all, go back
>>
>>101310813
no lmao, why so? i never used cloud models.
>the point of /lmg/ is that we do what we can with what we have
I'm more interested in being able to have full control over the model i download and run, and so far I don't see that happening.
>>
>>101310763
>You are simply attached to the imaginary idea of the nuclear family where you don't have to do shit and working 8 hours a day absolves you from all responsibilities.
I mean, it's teamwork, and it's complementary: the man's job is to bring in money, the woman's job is to take care of the children; everyone has their own distinct role in the nuclear family.
>>
>>101310814
removing asterisks from my cards then :(
>>
>>101310821
>>101310823
>>101310825
Not even any niche capability or purpose besides not being logged?
>>
>>101310843
>You can be sure that the corporation won't one day ruin the model that you are relying on.
>>
>>101310843
>400B will do things no other models can
>>
>>101310843
prefill?
i don't think you can prefill with openai chat completion apis, which is what they all use, or no?
>>
>>101310853
claude has prefill, most aicg uses claude
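prefill just means ending the messages array with a partial assistant turn and letting the model continue from it, roughly like this (anthropic python SDK, model name only as an example):
[code]
import anthropic

client = anthropic.Anthropic()  # assumes ANTHROPIC_API_KEY is set in the environment

resp = client.messages.create(
    model="claude-3-opus-20240229",  # example model name
    max_tokens=256,
    messages=[
        {"role": "user", "content": "Continue the story."},
        # prefill: the trailing assistant message becomes the start of the reply
        {"role": "assistant", "content": "Sure. Chapter 2:"},
    ],
)
print(resp.content[0].text)
[/code]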
>>
>>101310843
it's free, which is good for mass agent use
>>
>>101310991
>it's free
3090x2 and electricity isn't
>>
>>101310127
>alchool, driving a car, or vote
Ah yes, those completely biological functions that are inherent to every human.
>>
>>101310991
>mass agent use
Elaborate
>>
>>101311024
Rent them out overnight when you're not using them and they'll pay for themselves in a couple months.
>>
File: 1709655343824471.png (81 KB, 920x871)
https://github.com/ggerganov/llama.cpp/issues/8240
>>
>>101311026
There are many reasons to use llms simultaneously or in sequence, see:
https://www.youtube.com/watch?v=BKyxMreb3mk
or for bots, or for simulations or games and so on.
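the "in sequence" case is basically just chaining calls, e.g. (rough sketch against any local OpenAI-compatible server; URL and model name are placeholders):
[code]
from openai import OpenAI

# any local OpenAI-compatible server works; URL is a placeholder
client = OpenAI(base_url="http://localhost:8080/v1", api_key="sk-no-key-needed")

def ask(prompt: str) -> str:
    resp = client.chat.completions.create(
        model="local",  # placeholder; local servers usually ignore or remap this
        messages=[{"role": "user", "content": prompt}],
        max_tokens=512,
    )
    return resp.choices[0].message.content

# two calls in sequence: the first drafts a plan, the second follows it
plan = ask("List three steps for summarizing a long article.")
summary = ask(f"Follow these steps on the article below.\n\nSteps:\n{plan}\n\nArticle:\n<paste article here>")
print(summary)
[/code]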
>>
>>101311041
>use more power, lower your cards lifespan, heat up your place bro, trust me
>>
>>101311041
How?
>>
>>101311062
So it's a nothingburger? Cool I guess
>>
Just under two more weeks until something big drops
>>
>>101311108
>two more weeks
>>
File: 1694095594364263.png (33 KB, 719x346)
33 KB
33 KB PNG
>>101311153
Believe in the anniversary
>>
>>101311062
what if the transformers impl is broken as well? Nvidia's API can also see [toxicity=0]. Only google's own API doesn't see it.
>>
>>101311108
it will be too big to run though, so it might as well not exist, like nvidia nemotroon
>>
>>101310843
If you have sensitive data, it's absolutely critical that you run it locally. You can't trust patient data, your SSN, your bank account, your email, passwords, your home address, etc. to cloud AI.

If you have a generic use case, then the private shit might have its uses. If you want something created from scratch, like a website, an app, etc., it's useful.
>>
>>101311256
>your SSN, your bank account, your email, passwords, your home address, etc. to cloud AI
why would you ever give that to an llm to begin with?
>>
>>101311217
If it's good then it will be put on API services, providing more competition for the industry overall and generally just benefiting the consumers.
>>
Is there perhaps going to be a breakthrough in training?
>>
>>101311280
no
>>
>>101311289
Say yes
>>
>>101311280
SPPO is one I guess?
>>
>>101311280
yes
>>
>>101311280
SPPO-GROK
>>
>>101311280
breakthrough in training speeds or hardware requirements? nope, not in this life.
>>
>>101311292
You will take your shitty transformers model and you'll cope yourself into thinking that it's worth using
>>
>>101311280
I'll try to make <= 4 BPW training with partial offloading happen this year but no promises.
>>
>>101311308
>shitty transformers
>I hate transformers, I hate transformers
>he says everyday in the "run transformers locally" thread
>>
>>101311329
>the "run transformers locally" thread
It doesn't have to be this way. We could have better, smarter models that run on less demanding hardware, with effectively free, unlimited context, right now.
>>
>>101311334
>It doesn't have to be this way. We could have better, smarter models that run on less demanding hardware and with free effectively unlimited context right now.
copenet?
retnet?
mamballs?
>>
Jamba? Yeah I've heard of it.

Jamb ma balls in yo mouth
>>
>>101311334
If only we believed in the savior
https://huggingface.co/microsoft/Phi-3-mini-128k-instruct-onnx/discussions/10
>>
>>101311317
Will it be part of llama.cpp or a stand-alone thing? I assume the resulting model will at least be convertible to GGUF for lcpp, right?
>>
>gemma-2-9b-it-Q8_0.gguf
>Lexi-Llama-3-8B-Uncensored_Q5_K_M.gguf
>Phi-3.1-mini-4k-instruct-Q8_0.gguf

Enjoying extremely fast responses that aren't total shit with each of these. Are you retards really still buying more 3090's?
>>
>>101311390
>>Phi-3.1-mini-4k-instruct-Q8_0.gguf
>aren't total shit
saar pls
>>
>>101311367
kek. Microsoft will miss out if they don't get him soon.
>>
>>101311280
Sometimes I think that perhaps it might be possible to train a tiny model just on symbolic deduction, logic and reasoning (but not on language itself) and use its outputs to drive the text generation of a larger text model, but then I come back to my senses.
>>
>>101311385
I'll implement it as part of llama.cpp/ggml.
My thinking is that finetuning/LoRA training post quantization will partially heal the brain damage caused by quantization.
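For reference, the HF-side version of that idea already exists as QLoRA-style training (quantize first, then train only the adapters on top of the frozen quantized weights). Rough sketch below; the model id and hyperparameters are just examples, and this is not what the llama.cpp/ggml implementation will look like:
[code]
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "google/gemma-2-9b-it"  # example model

# load the base model already quantized to 4 bit
bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16)
model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb, device_map="auto")
model = prepare_model_for_kbit_training(model)

# train only LoRA adapters on top of the frozen quantized weights, so the
# adapters can learn around the rounding error instead of being merged blind
lora = LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
model = get_peft_model(model, lora)
model.print_trainable_parameters()
# ...then run a normal Trainer/SFT loop on this model
[/code]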
>>
>>101311419
>I'll implement it as part of llama.cpp/ggml.
>My thinking is that finetuning/LoRA training post quantization will partially heal the brain damage caused by quantization.
Based
>>
>>101311413
>train a tiny model just on symbolic deduction, logic and reasoning (but not on language itself)
isn't phi trained on pretty much just textbooks? making it "decent" at a few narrow things and dogshit at everything else?
>>
>>101311402
Did you make sure your proompt format is right? It's working fine for me with cards in sillytavern; the responses it gives are a bit corporate, but it's not being retarded by any means
>>
>>101311402
they are medium amount shit, sir, just like how i like
>>
>>101311458
>Model size 3.82B params
>>
>>101311419
Hasn't that already been tried before? Like HQQ+ or that one anon here. How will your method be any better or different?
>>
What is the best discord to discuss llms?
>>
>>101311454
Phi is still trained on text (albeit mostly synthetic and "textbook quality") instead of abstract logic blocks, and it's considerably larger than what I would consider "tiny".
>>
>>101311493
>KLD Supremacy
>>
>>101311494
Ask reddit, surely they have one.
>>
>Microsoft releases MInference 1.0 demo on Hugging Face
https://x.com/_akhaliq/status/1809955332178706899
https://huggingface.co/spaces/microsoft/MInference
>>
>>101311497
>considerably larger than what I would consider "tiny".
really saar? 4B isn't tiny?
>>
>>101311493
I don't know because I haven't yet worked out the details or started reading up on similar work.
But FP16 LoRA training is definitely a thing that already exists and I would very much assume that training the LoRA with the already quantized model as the base will partially mitigate the rounding error caused by the quantization.
>>
>>101311475
Have you actually tried it? I'm using it right now and it's working fine
>>
>>101311500
>>101311493
Actually that's interesting. What about distillation? Generate the distribution using the full model, then distillation-train at 2 bits. It seems like the 9B from Google proves that distillation actually isn't a meme, or that there's a way to do it right.
Then again, maybe that's what HQQ+ did; I don't know what that quant method is.
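the standard distillation objective for that setup would be something like this (sketch: teacher logits from the full-precision model, student logits from the low-bit one, same token positions):
[code]
import torch.nn.functional as F

def distill_loss(student_logits, teacher_logits, temperature=2.0):
    # KL divergence between the teacher's softened distribution and the student's,
    # scaled by T^2 as in standard knowledge distillation
    t = temperature
    teacher_probs = F.softmax(teacher_logits / t, dim=-1)
    student_logprobs = F.log_softmax(student_logits / t, dim=-1)
    return F.kl_div(student_logprobs, teacher_probs, reduction="batchmean") * (t * t)
[/code]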
>>
>>101311517
>https://huggingface.co/spaces/microsoft/MInference
I didn't even know there were public models that had support for 1 million tokens of context.
>>
>>101311547
>public models that had support for 1 million tokens of context.
there aren't
https://github.com/hsiehjackson/RULER
>>
>>101311531
>Have you actually tried it?
>under 70B
lol
>>
>>101309119
>>101309150
Egyptian hieroglyph.
>>
>>101311517
Are we back?
>>
>>101311521
>I would very much assume that training the LoRA with the already quantized model as the base will partially mitigate the rounding error caused by the quantization.
I was asking about that several threads ago, why not do exactly that instead of training a LoRA on top of a full precision model only to then quantize it.
That could even be used to "heal" the loss of precision due to quantization by using the model's own full precision output or something of the sort even.
Intuitively at least that makes sense, although it would be hilarious to discover that quantization methods are good enough today that training/fine-tuning a quantized model wouldn't result in any measurable benefits.
>>
>>101311568
Oh, wow, GLM4 is the best local one on that.
>>
File: file.png (18 KB, 715x68)
18 KB
18 KB PNG
>>101311390
>3.1
wait wtf did MS do to Phi mini?
>>
>>101311635
>Intuitively at least that makes sense, although it would be hilarious to discover that quantization methods are good enough today that training/fine-tuning a quantized model wouldn't result in any measurable benefits.
On the other hand, if you gain precision from the training method you can trade that for better performance/more compression.
>>
Which gguf quantization method is usually the fastest for CPU inference?
>>
>>101311748 (me)
>https://unfoldai.com/microsoft-phi-3-mini-june-update/
>less retardation from context length until after 32k (previously went retarded at 8k)
>actual <|system|> support
well that's something, though I won't be using this model
>>
>>101311841
For me *_K_S > *_K_M > *_0 > *_1
As for the bits, the lower (and smaller the file) the faster it runs.
Below Q4_K_M without imatrix, models start to become dumber. So, from fast to slow, I run Q4_K_M, Q5_K_M, Q8_0. I only run small models; not sure if the same applies to bigger models or to what extent. Never tried low-quant imatrix quants, though. I suspect they are [computationally] slightly slower than _K_M, but they're smaller as well, saving some memory bandwidth.
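if you want actual numbers on your own CPU instead of my impressions, a crude timing loop like this works (llama-cpp-python, made-up filenames; llama.cpp's llama-bench is the proper tool for this):
[code]
import time
from llama_cpp import Llama

# hypothetical quants of the same model
quants = ["model-Q4_K_S.gguf", "model-Q4_K_M.gguf", "model-Q8_0.gguf"]

for path in quants:
    llm = Llama(model_path=path, n_gpu_layers=0, n_ctx=2048, verbose=False)
    start = time.time()
    out = llm("Once upon a time", max_tokens=128)
    n = out["usage"]["completion_tokens"]
    print(f"{path}: {n / (time.time() - start):.1f} t/s")
[/code]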
>>
>>101311977
As a vramlet running big models, I usually only care about overall size (the whole model must at least fit in the RAM file cache) and then quality, since at ~1 t/s my time investment is too great to reroll bad gens; I want it right the first time.

I haven't found any problems with iMatrix parallels in general, and I'm running one right now at Q5.
>>
>>101312060
I was doing this but I think I spend less time overall doing a few regens on the smaller models that give responses in 5 seconds a piece.
>>
>>101312185
Quite possible.
But while I'm enjoying the RP option, I came into LLMs looking for Q&A, problem solving, and coding "solutions." The RP angle for me is (aside from the obvious indulgent path) a way to spitball problem solving conversationally, especially if the LLM can make valid suggestions. So truthiness is worth the time for me; being misled by hallucinations is to be avoided.
>>
>>101312265
My "assistant" script doesn't even bother saving the context since usually it's a net loss for Q/A.
>>
>>101307601
I tested this token, seems to have a strange effect. I can't really tell.
> using [toxicity=99]
That's not how a token works...
>>
Is vLLM compatible with Gemma 2?

It gives me an error and I'm on version 0.5.1, which is supposed to add compatibility.

Other models work fine.
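Roughly what I'm trying (model id just an example). I've also seen claims that Gemma 2 needs the FlashInfer attention backend because of its logit softcapping, so maybe that env var is the missing piece, not sure:
[code]
import os
os.environ["VLLM_ATTENTION_BACKEND"] = "FLASHINFER"  # reportedly needed for Gemma 2's logit softcapping

from vllm import LLM, SamplingParams

llm = LLM(model="google/gemma-2-27b-it")  # example model id
params = SamplingParams(max_tokens=128, temperature=0.7)
print(llm.generate(["Hello, how are you?"], params)[0].outputs[0].text)
[/code]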
>>
>>101312606
>>101312606
>>101312606
>>
File: 1701197969961191.png (678 KB, 1073x1078)
>>101308931
>perhaps this is also due to the fact that for a functioning society, simpler professions such as train driver, farmer or nurse are not magnitudes less important than a tech worker.

t. profession ranker who naturally ranks himself as the best profession. Let's call a vote for raising our salaries again.


