/g/ - Luddite bros? - Technology

File: 1752024578678251.png (325 KB, 737x708)

Luddite bros? Anonymous 06/04/26(Thu)20:00:52 No.108982253 Archived

Mythos can speed up training code 52x (compared to ~4x for a skilled human working 4-8 hours on the same task)

Anonymous 06/04/26(Thu)20:02:51 No.108982265

The footnote reads: «How large the speedup gets depends heavily on how much room for improvement the starting code leaves, and it should not be read as a real-world training speedup. So the absolute multiple is not the figure to anchor on here. What is more informative is the like-for-like comparison that this experimental setup makes possible, both across models (~3x to ~52x over the past year) and against a skilled human (~4x in four to eight hours on the same task).»

Anonymous 06/04/26(Thu)20:15:00 No.108982315

>>108982265
aitoddlers btfod

Anonymous 06/04/26(Thu)20:15:59 No.108982320

>>108982253
Great news. Maybe mathfags at Anthropic will be able to produce code that's only 30 orders of magnitude slower than the norm now!

Anonymous 06/04/26(Thu)23:50:51 No.108983218

>>108982253
Is this why 4.7 and 4.8 were worse than 4.5 and 4.6 and why claude code remains a buggy pile of steaming shit?

Anonymous 06/04/26(Thu)23:53:13 No.108983231

AI will be effectively useless to all but brown people until it can be run cheaply and efficiently in a local setup.

Anonymous 06/04/26(Thu)23:56:57 No.108983247

File: 1779814570525645.png (1.16 MB, 1000x1450)

Anonymous 06/05/26(Fri)00:18:59 No.108983330

>goythos

Anonymous 06/05/26(Fri)00:22:18 No.108983338

>>108982253
why are you marketing a product which will never be sold to the public?

Anonymous 06/05/26(Fri)00:50:33 No.108983450

>look our model that we can't show you did something amazing on some vague metric that we invented!
What's the point of these shill threads?

Anonymous 06/05/26(Fri)01:00:02 No.108983482

Also works with CPU design

https://github.com/FeSens/auto-arch-tournament/blob/main/docs/auto-arch-tournament-blog-post.md

LLMs will save the computing world

Anonymous 06/05/26(Fri)01:03:14 No.108983495

>>108983247
Move Slow, Snailcat!

Anonymous 06/05/26(Fri)01:03:24 No.108983497

Over

Anonymous 06/05/26(Fri)01:46:09 No.108983632

>>108982253
We don't need faster, we need smarter. Let us know when it can count the number of days in a week with the letter 'a' without having to make a tool call, or at least make it so that it knows when it's guessing and when it's sure instead of pretending it's sure every time.

Anonymous 06/05/26(Fri)02:13:59 No.108983699

>>108983632
Ironically I wrote a simple prototype that hooked up an LLM with a wikidata database and tools for fetching triplets and storing triplets in an overriding 'user database', and that alone was enough to fix virtually everything about it. I'm also working on using this mechanism to add quality/confidence of source on the triplets, which is a bit more challenging, but should allow the LLM to choose not to believe schizobabble over primary sources.
So far it's proven extremely useful in overriding bad assumptions made by the models. You can define new facts and words and it will get them right all the time. Whereas without this hookup, it'll just hallucinate meanings and go off the rails. If it can't find predicates or entities, it always informs the user it doesn't know instead of making it up, which is great. If it finds multiple potentially matching entities, it asks the user which option makes most sense instead of guessing, unless the context is enough to disambiguate (for example, ask gpt 5.5 who is albert einstein and it will immediately go for the scientist, my system will ask if you mean the scientist or the actor now known as Albert Brooks, but if you ask about what albert einstein contributed to physics, it will disambiguate and not consider the actor as a potential match). Also, it was able to bring the answer quality of qwen 3.5 up beyond that of gpt 5.5 xhigh. This is good evidence that the thinking layer can be improved and that improving it has direct, profound impact on downstream quality.
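For anyone wondering what that kind of lookup might look like, here is a minimal sketch, not the actual prototype. It assumes a toy in-memory store instead of a real wikidata backend, and every name in it (TripleStore, answer, the entity ids, the predicate names) is made up for illustration. It shows the three behaviours described above: user triplets override base triplets, missing facts come back as "unknown" instead of a guess, and multiple matching entities come back as "ambiguous" unless the predicate being asked about narrows it down to one.

```python
from dataclasses import dataclass, field


@dataclass
class TripleStore:
    """Toy triple store: base facts plus an overriding user layer (hypothetical design)."""
    base: dict = field(default_factory=dict)    # (subject_id, predicate) -> object
    user: dict = field(default_factory=dict)    # same shape, takes priority over base
    labels: dict = field(default_factory=dict)  # lowercased label -> list of subject ids

    def store_user(self, subject, predicate, obj):
        # User-defined facts shadow whatever the base data says.
        self.user[(subject, predicate)] = obj

    def fetch(self, subject, predicate):
        key = (subject, predicate)
        if key in self.user:
            return self.user[key]
        return self.base.get(key)  # None when the fact is not known


def answer(store, label, predicate=None):
    """Return (status, payload) where status is 'answer', 'entity', 'ambiguous' or 'unknown'."""
    candidates = store.labels.get(label.lower(), [])
    if not candidates:
        return ("unknown", None)
    if predicate is None:
        # No context to disambiguate with: ask the user if several entities match.
        if len(candidates) > 1:
            return ("ambiguous", list(candidates))
        return ("entity", candidates[0])
    # Use the predicate as context: keep only entities that have this fact.
    matching = [c for c in candidates if store.fetch(c, predicate) is not None]
    if not matching:
        return ("unknown", None)
    if len(matching) > 1:
        return ("ambiguous", matching)
    return ("answer", store.fetch(matching[0], predicate))


def build_demo_store():
    # Toy data only; ids and predicates are invented, not real wikidata identifiers.
    store = TripleStore()
    store.labels["albert einstein"] = ["einstein_physicist", "einstein_actor"]
    store.base[("einstein_physicist", "physics_contribution")] = "theory of relativity"
    store.base[("einstein_actor", "stage_name")] = "Albert Brooks"
    return store


if __name__ == "__main__":
    demo = build_demo_store()
    print(answer(demo, "Albert Einstein"))
    print(answer(demo, "Albert Einstein", "physics_contribution"))
    print(answer(demo, "Albert Einstein", "favorite_color"))
```

In a real setup the LLM would call something like answer() as a tool and be told to relay "unknown" and "ambiguous" to the user verbatim rather than filling the gap itself. The confidence-of-source idea is not covered here.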

Anonymous 06/05/26(Fri)02:16:06 No.108983705

>>108983632
>count the number of days in a week with the letter 'a' without having to make a tool call
it can't, because that's not what an LLM/token predictor does. the fact that it can make a tool call is all that matters
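for reference, the tool side of this is trivial. a rough sketch, assuming the model is given a function like the hypothetical count_days_with_letter below (the name and the English day list are my own choices, not any particular framework's tool API):

```python
# English weekday names; the tool works on exact characters, which is
# the part a token predictor has no direct view of.
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]


def count_days_with_letter(letter: str) -> int:
    """Count weekday names containing the given letter, case-insensitively."""
    needle = letter.lower()
    return sum(needle in day.lower() for day in DAYS)


if __name__ == "__main__":
    print(count_days_with_letter("a"))
```

every English weekday name ends in "day", so for 'a' the answer is all seven.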

Anonymous 06/05/26(Fri)02:21:04 No.108983719

>>108983705
Yes, this is correct. The whole end-to-end shtick is an old deep learning idea that only mattered in the lab. It was the high water mark of what these things can do, but it has never been an actual thing in production, where these systems were always hooked up with tools, like we are now rediscovering and pretending it's new. Same with agents, we always had these in practice. Classical answering systems had a router entrypoint that would dispatch to deep learning goal-oriented chatbots. And those systems, within their goal-oriented niches, performed as well as early gpt-4 did, though they were nearly never available to the public. The fact gpt-4 worked about as well despite being general was really cool, though.

Anonymous 06/05/26(Fri)02:50:26 No.108983819

>>108982253
>It's hecking AGI i swear.
strawb...
>NOO!!! Not like that!!!

Anonymous 06/05/26(Fri)03:56:52 No.108984060

>>108983247
Move slow, Snailcat!