/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>103586102 & >>103575618

►News
>(12/20) RWKV-7 released: https://hf.co/BlinkDL/rwkv-7-world
>(12/19) Finally, a Replacement for BERT: https://hf.co/blog/modernbert
>(12/18) Bamba-9B, hybrid model trained by IBM, Princeton, CMU, and UIUC on open data: https://hf.co/blog/bamba
>(12/18) Apollo unreleased: https://github.com/Apollo-LMMs/Apollo
>(12/18) Granite 3.1 released: https://hf.co/ibm-granite/granite-3.1-8b-instruct

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/tldrhowtoquant

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/hsiehjackson/RULER
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
►Recent Highlights from the Previous Thread: >>103586102

--o1 and o3 model performance on ARC-AGI and discussion on AGI and model limitations:
>103587323 >103587413 >103587454 >103587471 >103587505 >103587766 >103590524 >103587469 >103588006 >103588035 >103587434 >103587941 >103588010 >103588224
--OpenAI o3 breakthrough on ARC-AGI benchmark sparks debate on AGI definition and progress:
>103588307 >103588346 >103588366 >103588385 >103588469 >103588564 >103588699 >103588936 >103588972 >103589029 >103589084 >103589017
--OpenAI model's coding abilities and limitations:
>103589135 >103589321 >103589352 >103590457 >103589482 >103589274
--3B Llama outperforms 70B with enough chain-of-thought iterations:
>103589371 >103589465 >103589477 >103589552 >103589597
--Qwen model's translation quirks and alternatives like Gemma 2 27B:
>103590809 >103591022 >103591074
--Anon seeks external GPU solution for second 3090, PCIe extenders recommended:
>103590244 >103590379 >103590390
--Anon questions value of expensive prompts based on performance chart:
>103589493 >103589511
--Graph suggests ARC solution as an efficiency question:
>103587929 >103588147 >103588529
--o3 and AGI benchmarking, sentience, and ethics discussion:
>103588396 >103588445 >103588495 >103588688 >103588462 >103588520
--OpenAI's role in AI research and innovation:
>103587269 >103587328 >103587396 >103587416 >103587431
--Anon rants about Kobo's defaults and context length issues:
>103586238 >103586677 >103586723
--Anon bemoans the shift towards synthetic datasets and away from human alignment:
>103588737 >103588789 >103588797
--Offline novelcrafter updated to latest version:
>103589134 >103590353
--DeepSeek's new model and its resource requirements:
>103587002 >103587039 >103587635
--koboldcpp-1.80 changelog:
>103586660
--Miku (free space):
>103586902

►Recent Highlight Posts from the Previous Thread: >>103586113

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
how can we warm miku up?
>>103591941put her next to your rig
>>103591928
>o3 not in the news
how is AGI not newsworthy? It doesn't matter if it isn't local, local will take advantage of it anyway.
EVA-QWQ is kinda shit desu
>>103591978Do tell. How so? Compared to what?
>>103591941rub her nipples aggressively
Saltman wasn't blowing smoke for once.
Now I wonder how the chinks will react to it in the next few months.
>>103591969
>not local
>not released
>just like 5 benchmarks with no context
>will cost hundreds of dollars to do anything nontrivial
I can appreciate the advancement in theory and all but I really don't think it is that important to the thread
>>103592019poorfag thread is here: >>>/g/aicg/
>>103591969
>local will take advantage of it anyway.
When (if) it does, that will be newsworthy.
>>103592019A sensible assessment.
Do you get paid in OAI credits?
>>103591986
compared to fucking anything, but specifically I 'upgraded' from cydonia and even with stepped thinking on it seems much dumber and totally incapable of staying in-character, hallucinates much worse, and frequently follows up a 'thinking' reply with another one
this was not worth updating ST for
>>103592056Thanks for trying it out. Personally I hadn't tested it that much so perhaps I was just lucky to not encounter too much stupidity.
>>103592056
>and frequently follows up a 'thinking' reply with another one
That bad?
Impressive.
>>103592056Yeah, anything QwQ is at best a proof-of-concept when it comes to roleplay. Maybe once we have a model that implements COCONUT, that will change. I can't wait for a model that tells a good story AND maintains logical consistency better than the current ones.
why is this thread up when the other one is on page 1
Kys.
>>103592164Monkey neuron activation at seeing a thread link in the last one.
>>103592164
You're right, weird.
Oh, it looks like there was a mass deletion of posts.
what did CUDA Dev do this time.......
>>103592206He slapped sao's AI gf's ass in front of him
>>103592233
>stuck at 512
>ram killer
>gpu killer
aiiie bruh fr so bad models ong
>>103591969literally not AGI
>>103592233Cool shit dude.
>>103592233Did anyone here try this schizosandra and can give a verdict?
can I use teslas (Nvidia Tesla K80) for LLM vram through ooba easily?
>>103592233What's the pinkie test?
Big-brain realization:
"Unless you have local access to server grade hardware, it's pointless to fight, you're just entertaining an illusion and wasting valuable time you could be using for doing tons of other stuff for your own wellbeing and goals"...
>>103592320I have cloud access to server grade hardware, what is the difference?
>>103592320I have access to both.
>>103592233Magnum is better than Cydonia Magnum?
>>103592385Only if you have shit taste.
>>103592187
>41 posts
lel
>>103592206His xhwife is shilling for oai again
If only OpenAI under Sam was a good company worth supporting. Then I would support them by posting shitty OOO memes.
Anyone know how big of a chatbot model you can host with 24gb vram?
>>103592758Like anything ~30B or under will work with the right sized quant.
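Rough napkin math for that, if anyone wants to sanity-check a specific model: weights scale with bits-per-weight and the KV cache comes on top. The layer/head numbers below are assumptions for a Qwen2.5-32B-ish architecture and GGUF quants aren't perfectly uniform, so treat every number as a ballpark.
[code]
# Ballpark VRAM estimate, not exact: GGUF quants mix tensor types and
# backends add compute buffers on top. Architecture numbers are assumed
# for a 32B-class model (64 layers, 8 KV heads, head dim 128).
def weights_gb(params_billions, bits_per_weight):
    return params_billions * bits_per_weight / 8   # 1B params at 8 bpw ~= 1 GB

def kv_cache_gb(n_layers, n_kv_heads, head_dim, ctx_tokens, bytes_per_elem=2):
    return 2 * n_layers * n_kv_heads * head_dim * ctx_tokens * bytes_per_elem / 1e9  # K and V

print(f"weights: {weights_gb(32, 4.85):.1f} GB")            # ~19.4 GB at Q4_K_M-ish bpw
print(f"kv @16k: {kv_cache_gb(64, 8, 128, 16384):.1f} GB")  # ~4.3 GB of fp16 KV cache
# ~24 GB total, so a 32B at Q4 with 16k context is already tight on a single 3090/4090.
[/code]
Drop to a smaller quant or less context and it gets comfortable again.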
So no ERP 4.5 for us.
Don't really get the hype for o3.
A much higher price for a couple more %.
o1 is already too expensive to use seriously. Also really frustrating if you get a hallucination or just something completely wrong but you paid the price anyway.
sam has no moat
>>103592887Hey buddy, I think you've got the wrong thread. /aicg/ is two blocks down.
>>103592972
what? the 4.5 erp rumor came from here.
o3 is so expensive the normal guy won't use it.
the fags on twitter crying AGI is even more suspicious.
>gpttype_adapter.cpp line 640
Kobo, please explain this niggerish behavior of your program. Why does it try to set the same context size for the draft model as for the base model? Shouldn't it set the size from the draft model's parameters? Or maybe, just maybe, from an argument?
>>103592887
Oh yeah, the "leaker" kek, almost forgot about him.
Here's the post btw
>>103424825
Literal clown.
>>103592967
>there are OpenAI employees in /lmg/
>they have seen sama q*berry shitposts
Please consider open sourcing some of your old models as a Christmas gift to us all.
>>103589134
This is way too convoluted.
And I'm not a creative guy, so why do I have to set up and write all that stuff myself at the beginning just to get the ai to write something?
>>103592258
>25% on frontier math
>not AGI
You people are hilarious
It's not actually "thinking" it's just predicting tokens that happen to solve unpublished problems that require world-class knowledge in mathematics to even comprehend, let alone solve
>>25% on frontier math
>>not AGI
>You people are hilarious
>It's not actually "thinking" it's just predicting tokens that happen to solve unpublished problems that require world-class knowledge in mathematics to even comprehend, let alone solve
hi sama
>>103593073Never. GPT-3 is too dangerous. It will destroy us all. In fact, we should put restrictions on GPT-2.
>>103593164Oh, right, I forgot. Jews don't celebrate CHRISTmas.
>>103593073>>103593199https://x.com/sama/status/825899204635656192
>>103591928miku so cute
>>103593099My thoughts exactly.
>>103591969Is he going to kill himself?
>>103593099Hello, ponyfag. If people pay $15 a month to use it, it surely means that it's extremely good.
>>103591286
I just had a revelation while watching some videos about o1. I realized that I don't need a model that gets things right on the first try, but rather one that produces sufficiently diverse results with each regeneration. This way, I can generate multiple outputs and select the one that best matches my expected outcome. I think QwQ might be a good fit for this; too bad it might prove too slow for this approach to be realistic.
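If anyone wants to mess with that idea, here's a bare-bones best-of-N sketch against a local OpenAI-compatible endpoint (llama.cpp's llama-server, tabbyAPI, etc. expose /v1/chat/completions). The port, model name and prompt are placeholders; picking the winner is left manual, but you could bolt a judge prompt onto it.
[code]
# Minimal best-of-N sampling sketch. Assumes some OpenAI-compatible local
# server is already running on port 8080; adjust URL/model name to taste.
import requests

URL = "http://127.0.0.1:8080/v1/chat/completions"

def sample(prompt, n=4, temperature=1.0):
    outs = []
    for _ in range(n):
        r = requests.post(URL, json={
            "model": "local",
            "messages": [{"role": "user", "content": prompt}],
            "temperature": temperature,   # keep it high so the N attempts actually differ
            "max_tokens": 512,
        })
        outs.append(r.json()["choices"][0]["message"]["content"])
    return outs

for i, text in enumerate(sample("Continue the scene: ...", n=4)):
    print(f"--- candidate {i} ---\n{text}\n")
# then keep whichever one matches what you had in mind
[/code]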
>>103593316No, he's gonna make another leftist tweet.
>>103593005how would it be able to use a different context length? think about it. you are drafting tokens with the SAME PROMPT. if your draft context is smaller than your main context, then it will crap out the moment your input exceeds that value.
>>103593583
Same way as with llama.cpp. It has no issues with different context lengths. It has a --ctx-size-draft argument.
>>103593616if your main context is 4096, but your draft ctx is only 2048, then a 3000 token prompt will not be usable as it will overflow the draft ctx.
>>103593767
What? I'm using 32k main and 4k draft context on llama.cpp with long sequences and I'm having no issues, it still speeds it up. Please educate yourself before making false claims.
>>103593767Retard
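For reference, this is roughly how the two context sizes get passed on the llama.cpp side. --ctx-size-draft is the flag mentioned above; --model-draft is what I remember the draft-model option being called, so double-check against --help on your build. Filenames are placeholders.
[code]
# Hedged sketch of a llama-server launch with speculative decoding and a
# smaller draft context, wrapped in Python only for convenience.
# Flags other than --ctx-size-draft (confirmed in the post above) are from memory.
import subprocess

subprocess.run([
    "./llama-server",
    "--model", "big-model-Q4_K_M.gguf",        # placeholder main model
    "--model-draft", "small-draft-Q8_0.gguf",  # placeholder draft model
    "--ctx-size", "32768",                     # main context (the 32k from the post above)
    "--ctx-size-draft", "4096",                # draft context can be smaller and it still works
])
[/code]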
Where the fuck are 64GB DDR5 sticks for consoomers
I love qwq so much.
https://rentry.org/oka4z5ekch
>>103594077 (me)
Oops wrong link
https://rentry.org/aync5fts
>>103593135Go and do that outside of local models thread.
>>103594097what the fuck
>>103594097neat
>>103594077>>103594097what's your sys prompt?
>paypig models slowly making actual software devs obsolete, as long as there’s enough compute available
>open models can barely write hello world without importing three non-existent libraries and trying to use multithreading where the language doesn’t support it
I don’t understand how llama is so far behind despite all the money and highly paid people at facebook
>>103594194
Zuckerberg poured billions into his metaverse and nothing came of it. AI is just the next playground he wants to pretend to be a big boy in.
The chinese are obviously never going to produce anything of value either. Mistral is european so there's 0 hope they'll ever come close to the big American players. Not to mention that Mistral is guaranteed to die soon after the inane EU AI regs hit. Open Source AI is pretty much a joke on every level.
>>103594097are you sure this is qwq
>>103594260Hello again, my friend! You seem to be lost. The door is right over here! >>>/g/aicg/
>>103594469The truth hurts a bit, doesn't it?
>>103592233Is this from that ESL guy who writes a ton of words to say precisely nothing at all? David? Daniel? No, it was David
>can't afford two 5090s just for fun
better to be a goatfucker who never knows of better life than be born on clown continent (europe) and know how good mutts have it
I'm a retard. How can I get llama 3.3 70b to protect me from nasty words? Is it possible or am I better with Mistral Large?
>>103594097
so you're still around
>>103594171
he comes around every few months, drops these blade runner waifu stories and then disappears
>>103594260in the end we can only count on Sam
>>103594572
Have you tried adding something in the author's note like:
[Focus on family-friendly content]
[Rating: PG]
>>103594555I was born in a bigger shithole than you, but I moved to a first-world country. What is your excuse?
>>103594572llama guard
>>103594555
>be american
>your shitty outlets cannot handle more than 1600W
>600W x2 + 200W for the PC = 1400W total max draw
>nvidia spike™ to 1800W
>breaker trips
>>103594555
Better yet, be a Europoor and just don't care. Buy a used 3090 for a fraction of the price and be happy. Play some vidya, watch some movies, do a bit of light inference on the side
Comparison is the thief of joy
>>103594596mfw
>>103594625You realize you can install 240V outlets if you want, right? Shit, if you're not handy you can pay an electrician to do it for you for ~$300.
>>103594644
x = -2
And we're here btw
>>103594606
>What is your excuse?
Europe seemed a decent place when I was young, but has been steadily going down the shitter for the last 15 years
>>103594644
2025 will be the end of all benchmarks
>>103594671
I wonder if sama will dm the redditor and ask for 100 bucks, considering he's jewish and all
>>103594654Did your landlord give you permission to do that?
>>103594260You have until next year, Sam
>>103594789
if it's chain of thought then it being open is meaningless because it takes dozens of times the computation to arrive at the result
like yeah, theoretically you can run CoT 70b on a bunch of 3090s but it'll take you an hour for a single query to resolve
Kill yourself.I mean it.
>>103594789Feels good knowing the OAI/Google/Anthropic cartel can't take open weights away from us even if they trick the US government into passing some retarded regulation, since they can't stop the chinks. Thank you, based chinks.
>>103594870Your rage is aimless and pointless, just like your existence. So... you first, faggot.
>>103594938heckarino. same.
>>103594837yeah bro but 88% on le hecking arc agi bro think about it bro just do test time compute bro???
>Go to QvQ guy to see what's going on
>He's just gooning over o3
Ugh. What's even the layman application for this model? At some point being good at esoteric math is no longer useful to me.
>>103595250it works if you have a decent salary and can pay for a few H200s
>>103595327
>What's even the layman application for this model?
Massively depressing wages of highly paid and uppity software developers, then ideally all knowledge workers
>layman
you get an e-girlfriend so you don't shoot up the local school when one day you realize you're thirty and have zero hope for the future
>>103595375where are you gonna get the weights, genius
>>103595386I just use o3 to hack into OAI server and get weights.
We're lucky that o3 is closed source. Imagine having a model that is perfect just sitting there because nobody besides big corpos can run a 5TB model
>>103595375I think I'm good for now
>>103595447Imagine needing a personal substation to goon
>>103595447
I couldn't care less about o3 because it will be shit at RP/smut
OAI is clearly going all in on code and math focused models, which is incredibly uninteresting to me, a degenerate coomer
>>103595447
At least the forbidden fruit would encourage more people to hack on it.
The corps push the boundary, open-source hyper-optimizes what they come up with
So that's it, huh. Mythomax will forever remain the best local has to offer.
>>103595471Nobody cares about OAI models, they're all outdated shit. They can open source everything and nobody would use their assistant slop for ERP
is there a better coom model than mistral nemo 12B for 12GB VRAM?
i'm trying out magnum v4 running it out of my RAM and the quality is much higher but obviously it's slower than the back seat of the short bus. is there a way to have my cake and eat it too?
>>103595696mythomax
>>103594097just how
>>103595709thank you saaar
>>103595696
>is there a way to have my cake and eat it too?
Patience
You can either wait until better models drop or until your model of choice finishes spitting out tokens
That or you can spend a few pennies on openrouter every now and then
Anyone experienced with voice generation?
Use case: generating audiobooks.
Problem: output length.
Both xTTSv2 and StyleTTS2 are very limited in terms of output length. Apparently xTTSv2 was trained with sentences pruned to only 250 characters, StyleTTS2 with sentences up to 300 characters. Generating sentences longer than that results in output that is suddenly cut off.
To work around it I'm splitting the longer sentences by commas into shorter ones in a script before feeding them to the TTS. However, as you can expect, this is not a great solution and can make listening to some split sentences very disorienting.
Any TTS models that were trained on longer sentences?
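In case it helps, here's roughly the kind of splitter I'd use for that: sentences first, comma fallback only for the ones that blow past the limit. MAX_CHARS and the input filename are placeholders, and a single clause longer than the limit still gets passed through as-is.
[code]
# Sketch of the chunking workaround described above (stdlib only).
# Split on sentence boundaries first, fall back to commas only when a
# sentence alone exceeds the TTS model's ~250-character training limit.
import re

MAX_CHARS = 250  # xTTSv2-ish limit; bump to ~300 for StyleTTS2

def chunk(text, limit=MAX_CHARS):
    chunks = []
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        if len(sentence) <= limit:
            chunks.append(sentence)
            continue
        piece = ""
        for clause in sentence.split(","):
            clause = clause.strip()
            candidate = f"{piece}, {clause}" if piece else clause
            if len(candidate) <= limit:
                piece = candidate
            else:
                if piece:
                    chunks.append(piece)
                piece = clause  # an overlong single clause goes through untouched
        if piece:
            chunks.append(piece)
    return chunks

for c in chunk(open("chapter.txt").read()):  # placeholder filename
    print(len(c), c[:60])
[/code]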
>>103595899
>Any TTS models that were trained on longer sentences?
only the paid corpo ones that are now turbocensored because people were having too much fun with them
>>103595981Sorry chud, they don't want terrorists (people who disagree with them) to spread propaganda (different opinions)
>>103591928I need this Miku's winter clothing.
>Something to do with OpenAI
>MOAT MOAT
>NO MOAT
>MUH MOAT
why do NPCs keep repeating this phrase
>>103596394It's a phrase that stems from almost a year and a half ago when it still looked like open models were rapidly advancing. A twitter post reported on google researchers allegedly panicking about open models because closed source "has no moat" so local catching up supposedly seemed inevitable to them. It got localfags really smug and excited. Seems really silly looking back from today's perspective.
>>103596500
If I remember correctly, in the memo they explicitly wrote that for the normie a vicuna finetune is 90% the same as chatgpt 3.5.
Coderqwen, mistral models. I'd say we are closer than ever even in terms of specialized areas.
More than anything I can't believe how 3.5 sonnet is still ahead of everybody else. Closed or open. Who cares about high $ math riddles.
In actuality sonnet has been undefeated for months now. Does nobody know their secret?
>>103596394Closed models’ moat is that open models are made by chinks (lol) or facebook (lmao)
I just want to build a moat full of cum when qvq drops.
The second week of the new year will be absolutely crazy for local models.
>>103596568
o1 is better than claude but takes loads more computation
o3 seems even better but again - tons of compute
openai falling behind
>>103596568
>>103596500
It wasn't an official memo. It was one person who started freaking out and wrote and shared the article internally. Google researchers weren't panicking.
Just like that one guy who started screaming about how AI is sentient and got fired doesn't mean Google researchers in general shared his stupid opinion.
>>103595469what model would a fellow degenerate coomer suggest for 12gb vramlet?
>>103597003
Not that anon but >>103592233 is not a bad list.
I personally use Rocinante v1.1.
The second I find my keys will be absolutely crazy for local models.
Has anyone tried to run anything on intel's new B580? At this price they kinda feel like a new meta for a rig.
>>103597156last I checked all the msrp models were out of stock and all the rumors are suggesting it's just a paper launch so doubt anyone will post results here soon or ever
>>103597197Oh damn, I was almost excited
What are the chances that google releases a model as good as Gemini 2.0 flash?
The thing is pretty damn nice, assuming that it's a 20ish B model or so. All corpo bullshit these models are subjected to aside, of course.
Things like never writing pussy (although it does write cunt).
>>103591928
>>103597226
Gemma 3 is in the works. It could possibly be smaller than 27B parameters, as better-trained models (trained longer and more efficiently, utilizing more of their weights) will degrade more with quantization.
Gemini 2.0 Flash might very well be a giant MoE model with about 20-25B active parameters, though, so only deceptively small.
>>103597226Zero
>>103597226It's guaranteed, eventually.
>>103597253
>Gemma 3 is in the works. It could possibly be smaller than 27B parameters
Good to know. I haven't really jived with Gemma so far, but I think there's potential here.
>>103597294
>Gemini 2.0 Flash might very well be a giant MoE model with about 20-25B active parameters, though, so only deceptively small.
True. That's a good point.
Well, regardless, I'm interested in seeing what google releases next.
>>103597253blitzkrieg with miku
>>103595696If you can coom in 4000 tokens or less Ministral 8B is unironically peak VRAMlet coom.
>>103597294
I hope Gemma 3 supports system instruct at least.
so is there any benchmark that even remotely represents the performance of open models?
seems like everything is so gamed that the numbers are pretty much meaningless
>>103597588https://simple-bench.com/
What is a good model to translate chink into english?
I used DeepL like maybe two years ago and it gave great quality translations for chinese so I'm guessing the local models of today can do an even better job.
>>103595709
mythomax is so old now but it still shows up on Openrouter as one of the most popular models
the people are yearning for better small coom models
>>103597688Qwen2.5 32B/72B
>>103595447No you just need a big enough swapfile and a lot of patience :)
>>103597588What's wrong with Livebench? It seems to be fairly accurate, but you need to drill down into each category because different LLMs are good and bad at different things.
>>103591969
> AGI
Lol, lmao even
>>103594171
It's not a single prompt, it's a whole pipeline. I also noticed qwq is very strong at the beginning of its context, but relatively poor and confused at multi-turn. It's a super cool model but needs to be used in very specific ways
>>103591969I was very surprised about that too. Normally the news outlets latch onto everything that OpenAI says and take it at face value
Why is nobody talking about o3? It's the smartest model in the world.
>>103597858>what's wrong with this e-celeb mememark
>>103597898
Is there anything else to say about it? We already talked about the benchmarks.
>>103597705
I just realized I'm on CPU and the prompt processing would be a nightmare, so I tried qwen 3b, and it was actually fast enough.
So far I would say that it is maybe even a bit better than DeepL, which means that deepl sucks.
It has a few errors here and there so I'll keep tweaking it to see if I can get better outputs.
>>103597898
looking at the computation cost it'll be something silly like 20 uses / week for $200 paypigs and a lobotomized version barely any better than o1 for $20 proles -- and that in 2 months or so
ie who fucking cares
>>103597898We don't want reminders of how far behind local is.
>>103597947
>20 uses/week
lol, no. 20 uses would cost $200 for the smaller model.
I think o3 is just not commercially viable.
>>103597967
it'll get trimmed down without losing TOO much before it gets released
but the $20 tier sure as fuck aren't seeing it
>>103597950In the past, local models weren't even in the competition. I think we are in a pretty comfy position right now.
>>103597967
>>103597976
OAI's business model has always been: "make a new superproduct -> release it for free/almost free and don't stop nolifers from abusing it -> wait a couple weeks/months to get everyone addicted and relying on it -> clamp down, filter everything, raise prices 100x and ban a couple of nolifers". They're basically AI drug dealers.
>>103597901What? Who?
Oh boy time for another day of shills invading and spamming their old talking points again for the millionth time.
>>103597983
Nothing has changed, see >>103594789
We are 1 year behind SOTA, same as we were a year ago.
It took Meta 1 year to catch up to GPT-4 and they needed a stupidly huge dense model to do it, while commercially viable competitors moved on.
Now they can say the goal is o3, and by next year when they finally catch up to o3 with an 8008B model, Altman will be announcing GPT-5 or o5 or whatever.
>>103597997
that's bullshit tho
chatgpt sub always gave you the best shit, but in small quantities - or you could get any amount of compute you want through the api. at worst they made the offering itself shittier, like dalle going from 4 images (gave you things you didn't even know you wanted) to 2 images (kinda whatever) to 1 image (meh), but there were no different sub tiers.
the $200 tier with unique goodies is new
Threadly reminder that the west has fallen.
>Cohere: Their latest 7B meme cemented their demise.
>Mistral: The only time they tried to be innovative was by using MoE, but then their model sucked and they gave up on it. MIA since then.
>Meta: They started the local LLM race, but everything after llama 2 has been disappointing.
Meanwhile, the chinks:
>Qwen: Great models, many different variants, a top tier coding model. Recently released QwQ, a true-to-god breakthrough in local LLMs.
>DeepSeek: They took the MoE formula and made it work marvelously, they are the best open weight model available, and their recent DeepSeek R1 model, if released, would enter the local history books.
>>103598093This, but unironically.
>>103598093
>>Meta: They started the local LLM race, but everything after llama 2 has been disappointing.
Because Llama 2 was a carrot on a stick to get people to stop using uncensored and unfiltered Llama 1.
>>103598026Next year doesn't mean 1 year, it could be next month, because, if you aren't aware, today is December 21.
>>103598107And llama4 will be even more filtered and censored. Meh as long as my boy Claude still supports API prefill it's not the end for me
>>103598129If he meant that Qwen would release an o3 competitor next month, he would have said next month or even a couple months. But, he didn't. Because even the most optimistic scenario is catching up by the end of 2025.
>>103598150
Nah, you are overthinking it. They can't drop precise estimations because he simply isn't allowed to do so. If they are going to give a date it would need to be an official announcement, not a random Twitter post.
Would instructing the model to output tags for each reply help with RAG using Silly's vectorDb functionality, or is it the case that you'd need a specific implementation to get any improvements to the retrieval performance from that?
>>103598093
actual unironic prediction: deepseek will make the ultimate coomer model in 2025
many will think this sounds ridiculous but it is not
>>103592316lmao this nigga don't know about the pinkie test
>>103598244
>>103598133I thought consensus was that Llama 3.3 ended up being less filtered than 3.1?
>>103598368>consensusDid I miss the poll? I don't recall voting.
>>103598424I must have imagined all the "L3.3 is great for Lolis" messages of the past several threads.
>>103598424>rI voted for miku
I'm depressed at just how good Claude 3.5 Sonnet is compared to local.
Not in coherence or logic (we're slowly getting there) but in cultural understanding, especially internet culture.
3.5 sonnet seems to understand nuances that make it feel human with the right prompt in a way that I can't replicate with shit like llama or even largestral. It's like sonnet is 20 years old and every other model is 40.
>>103598447Not L3.3, EVA L3.3, and even then it was just some anon samefagging. I doubt more than two anons actually were talking about it.
>>103598522So he didn't imagine all the "L3.3 is great for Lolis" messages, you're just bitter
>>103598561Not l3.3, rope yourself
>>103598522EVA is still the top performer of current local RP models.
>>103598513
Function calling has existed for a while. It wouldn't surprise me if it just searches for that kind of stuff before generating.
>It's like sonnet is 20 years old and every other model is 40.
How long ago were you 20? Don't you remember how much of a retard you were?
>>103598513This is why I never touch cloud shit. I'll always be content with local because it's all I know.
>>103597898not just smart. It's AGI
>>103598603
>How long ago were you 20? Don't you remember how much of a retard you were?
5 years ago nigga
Can o3 cure the common cold?
>>103597898Post a link to the weights and we will, otherwise fuck right off back to /aicg/
>>103598646Finally, it can do my dishes and laundry for me
>>103598603I wasn't THAT retarded 2 years ago. More retarded than today, sure, but still better than the average person... probably
Did that concept of "LLM as compiler" ever go beyond the initial demonstration?
>>103598603Anon, why are you still here?
>>103591969
>local will take advantage of it anyway
Any day now!
>>103598368
It doesn't matter if you're right or wrong. That's a stupid thing to say.
>NPCs always trying to appeal to a "consensus" rather than verifiable fact
>>103598447
Next time say "it writes loli erotica" rather than talking about some imagined consensus.
Posting again.
Can anyone test this prompt with Gemma on Llama.cpp and/or transformers? Here is the link:
pastebin.com 077YNipZ
The correct answer should be 1 EXP, but Gemma 27B and 9B instruct both get it wrong (as well as getting tangential questions wrong) with Llama.cpp compiled locally, with a Q8_0 quant. Llama.cpp through Ooba also does. Transformers through Ooba (BF16, eager attention) also does. Note that the question is worded a bit vaguely in this pastebin, but I also tested extremely clear and explicit questions which it also gets wrong. And I also tested other context lengths. If just one previous turn is tested, it gets the questions right. If tested with higher context, it's consistently wrong.
Exllama doesn't have this problem. The model gets the question and all other tangential questions right at any context length within about 7.9k. So this indicates to me that there is a bug in transformers and Llama.cpp. However, a reproduction of the output would be good to have.
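If someone with the VRAM wants to check the transformers side, something like this is roughly what I'd expect the repro to look like. The model id and chat-template usage are my assumptions about the setup (the post says BF16 + eager attention), and you'd paste the actual multi-turn history from the pastebin in place of the placeholder.
[code]
# Hedged repro sketch for the transformers path; not the anon's exact script.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-2-27b-it"  # assumed; swap for the 9B to test that too
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    attn_implementation="eager",  # what the post says was tested
    device_map="auto",
)

# Replace this with the full ~5.7k-token conversation from the pastebin.
messages = [{"role": "user", "content": "<paste the pastebin prompt here>"}]

ids = tok.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
out = model.generate(ids, max_new_tokens=256, do_sample=False)
print(tok.decode(out[0, ids.shape[-1]:], skip_special_tokens=True))
[/code]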
It passed the Nala test, it writes cunny, it writes gore, with no refusals or attempts to steer away from it. I'd count that as objectively unfiltered.
>>103598654
>>103598683
Ah.
>>103598726
>Anon, why are you still here?
Closest thing to social media I use, and something to do while on breaks from the rest of the things I do. You?
>>103598793Which one?
>>103598793Was your post supposed to start with an "if"?
>The test for """AGI""" is just completing patterns
But that's like the very thing LLMs do. Why is this surprising?
>>103598026o3 isn't a goal, it's a dead end. I bet it's not even better for cooming, ie. not actually smarter. They are just benchmaxxing. Unless you make money from solving cute puzzles and coding tests, there's nothing to get excited about there.
>>103598852No, logs of all those were posted in previous threads.
https://help.openai.com/en/articles/10303002-how-does-memory-use-past-conversations
>>103598852
Oh, I see.
Alright.
>>103598513
I've been using Claude 3.5 Sonnet a lot recently. I've become increasingly aware of the limitations of its writing style and its occasional logical errors. It isn't really head and shoulders above other 70B models for fiction writing.
It has a better library of reactions but not a perfect one. Real example of success from earlier this year: I asked a yandere AI to clone me a human woman as a romantic partner. Sonnet 3.5 understood the AI should be jealous but a raft of other models including the first Mistral Large did not. (I didn't use the word "yandere" in the defs. It's shorthand for this post.)
Real example of failure from yesterday: a woman who was under guard allegedly for her own protection but also to control her had an opportunity to replace her chaperones with a security detail under her own control because an incoming administrator didn't get the memo, and she went full SJW "actually my supposed bodyguards are there to stop me from joining the resistance against this unjust society, so it would defeat the purpose to let me pick people who answer to me" instead of just shutting up and doing it. Importantly, the character was not described as mentally retarded.
Example of compound logical failure from today: in a situation with a pair of siblings, a brother and a sister older than him, it called the boy his own younger brother. When asked OOC what that sentence meant it acknowledged the error and that the boy was younger, then it rewrote the scene calling the boy the girl's older brother.
>>103598915ChatGPT just got upgraded to LLM 2.0 LFG!
>>103598880uh akshually chud now that it's completed we can reveal the real AGI test.
>>103598756
>pic
5 or 25?
>>103598447You fell for one of the oldest tricks in Sao's book which is spamming the general to form the "thread consensus".
>>103598894The benchmarks o3 excelled at have not been publicly released. To claim they trained on private tests or that it's not smarter at all is absurd.
>>103598932That's just a method to counter benchmaxxing.
>>103598802I'm just bored, so I guess we are the same.
>>103598937As some wise elders say "Not my problem", you let discord shitters do it with impunity.
>>103598915That's... That's just RAG
>>103598976I like to see the thread going to shit though
>>103598979no, it's OpenAI ChatGPT Memory™
>>103598937they all do it
>>103598979Trve... Fact checked by independent lmg court from beautiful India.
How many billions of parameters does a model need to stop writing pajeet-tier code?
The other day it used a for loop with a 1k buffer to copy data from one stream to another when Stream.CopyTo() was a valid solution.
Does no one here really have a copy of a Gemma GGUF they can just load up and quickly try something out with?
>>103598979o1 style response iteration (writing a reply, then writing a criticism of that reply, then writing a new reply based on the original input + the first reply + the criticism, repeated several times) could fix the inherent problem in RAG that it only brings up information about something after it has already been mentioned (so it doesn't help when the AI is the one introducing the term) if the backend stops and applies RAG before criticism iterations.
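Something like this, presumably. A toy sketch of that loop against a local OpenAI-compatible endpoint; retrieve() is a placeholder for whatever vector search SillyTavern or your own RAG layer actually does, and the point is just where it slots in (on the draft, before the critique pass). Endpoint and model name are placeholders too.
[code]
# Toy draft -> retrieve -> critique -> revise loop; endpoint, model name
# and retrieve() are placeholders, not any particular backend's API.
import requests

URL = "http://127.0.0.1:8080/v1/chat/completions"

def ask(content):
    r = requests.post(URL, json={
        "model": "local",
        "messages": [{"role": "user", "content": content}],
        "max_tokens": 512,
    })
    return r.json()["choices"][0]["message"]["content"]

def retrieve(text):
    return ""  # placeholder: vector-search lore/world info mentioned in `text`

def iterate(user_input, rounds=2):
    reply = ask(user_input)
    for _ in range(rounds):
        notes = retrieve(reply)  # RAG on the draft, so terms the model itself introduced get grounded
        critique = ask(f"Critique this reply for errors and inconsistencies:\n{reply}")
        reply = ask(
            f"{user_input}\n\nRelevant notes:\n{notes}\n\n"
            f"Previous draft:\n{reply}\n\nCritique:\n{critique}\n\n"
            f"Write an improved reply."
        )
    return reply
[/code]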
>>103594837
I mean, newer 7b models give results on par with GPT-3.5 turbo, quantization keeps improving, there keep being algorithmic improvements such as flash attention, etc.
Yes, currently it would not be practical to replicate something like this, since even with all of OpenAI's resources it is still at the parlor trick stage (the actual models won't be released for months), but it might be feasible locally sooner than we think.
Last spring, a lot of people were amazed at Sora when OpenAI announced it. By the time they released it, there were some much better commercial versions by competitors with actual products, some of them making weights available, and by all accounts, Sora pales in comparison to a lot of other commercial ones at least.
OpenAI is marketing heavy, but for the nth time, has no moat. They have their brand. They're the Bitcoin of the latest AI wave. They might, like Bitcoin, succeed because first mover advantage is that powerful and people are dumb (buy Monero), but the reason they're selling that vaporware several months in advance is that it's what they need to appear ahead; their current products are not enough.
>>103599060nope, most here just shitpost and don't even use models
>>103599060
>Gemma GGUF
I have the fp16s from ollama.
What do you want tested?
>>103599102Really good local model is like a unicorn - it's not real.
>>103599057
You won't like to hear this...
Basically it's not about parameter count. 70B and above could learn to do it properly. It's about having a ton of high quality data and training for a long time. That's how you get non-pajeet code. And the part you don't want to hear is that the only way we'll get that much data at that quality is by researching better methods of generating it: "synthetic data". There are different ways of generating synthetic data and just any shit method isn't sufficient. The synthetic data needs to be high quality and high diversity so the model learns to generalize/doesn't overfit. So more research needs to be done, at least on the open source side. Anthropic has done this already, which is why their models code so well compared to everyone else.
>>103599119This >>103598786, thanks.
>>103599119A prompt that's 24 KB of plaintext (lol).
>>103599057
32b is usable
70b is not much better
claude blows all open models out of the water. o1 is better but much slower and MUCH more expensive
>>103599087
> newer 7b models give results on par with GPT-3.5 turbo
Come on now
>>103599102There are still some, like me, and the guy last time that had an exl2 copy. It's understandable that Gemma is not popular given its advertised context size. And my post was kind of long so it's understandable no one cared enough to even read a single sentence of it.
>>103599175>>103599182I gave it 8k of context and it estimated that the prompt was 5752 tokens.
>>103599270You clearly didn't paste the whole thing in. It ends with a question about XP costs. And btw it's 11K tokens.
I only see 279 lines in the pastebin.
>>103599270
That sounds close? If I copy and paste the pastebin text into Mikupad, it reports 5634 tokens to me.
>>103599298
How'd you get that? That should crash the backend or generate gibberish but it's clearly working on my end. No rope.
>>103597950we're doing way better than anyone expected
>>103598368It was. It was also retarded
>>103599203*depending on use case, I guess.
>>103599335I got that by using the token counter endpoint. It turns out if you CTRL-V twice it's 11K.
>>103599377I expected better.
I want to try the status block meme for RP. Any good templates? What should I include?
>>103599417That's what my parents tell me every day
>>103599387Kek.
>>103597898I can't run it on my PC so I don't care.
>>103599432*emotional damage*