/g/ - Technology


Thread archived.
You cannot reply anymore.




File: lmg full.png (72 KB, 2412x2286)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>103090412 & >>103077338

►News
>(11/05) Hunyuan-Large released with 389B and 52B active: https://hf.co/tencent/Tencent-Hunyuan-Large
>(10/31) QTIP: Quantization with Trellises and Incoherence Processing: https://github.com/Cornell-RelaxML/qtip
>(10/31) Fish Agent V0.1 3B: Voice-to-Voice and TTS model: https://hf.co/fishaudio/fish-agent-v0.1-3b
>(10/31) Transluce open-sources AI investigation toolkit: https://github.com/TransluceAI/observatory
>(10/30) TokenFormer models with fully attention-based architecture: https://hf.co/Haiyang-W/TokenFormer-1-5B
>(10/30) MaskGCT: Zero-Shot TTS with Masked Generative Codec Transformer: https://hf.co/amphion/MaskGCT

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Programming: https://livecodebench.github.io/leaderboard.html

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling
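For intuition, the VRAM calculator linked above roughly amounts to weights-plus-KV-cache arithmetic. A back-of-envelope sketch (the bits-per-weight figures are rough community approximations, not the calculator's exact formula):

```python
# Back-of-envelope version of what a GGUF VRAM calculator estimates.
# All constants are rough approximations, not exact figures.
BPW = {"Q8_0": 8.5, "Q6_K": 6.6, "Q5_K_M": 5.7, "Q4_K_M": 4.8, "Q3_K_M": 3.9}

def vram_gb(params_b, quant, n_layers, d_model, ctx, kv_bits=16, overhead_gb=0.8):
    weights = params_b * 1e9 * BPW[quant] / 8            # bytes for the weights
    # KV cache: K and V tensors per layer, ctx x d_model values each.
    # Assumes full multi-head attention; GQA models cache less (n_kv_heads * head_dim).
    kv = 2 * n_layers * ctx * d_model * kv_bits / 8
    return (weights + kv) / 1e9 + overhead_gb

# hypothetical 12B model (40 layers, d_model 5120) at Q4_K_M with 8k context
print(round(vram_gb(12.2, "Q4_K_M", 40, 5120, 8192), 1))
```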

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
File: 1712854457344531.jpg (90 KB, 1024x1024)
►Recent Highlights from the Previous Thread: >>103090412

--Papers:
>103090981 >103091207 >103100105
--Understanding data augmentation and its effects on training quality:
>103093243 >103093277 >103093329 >103093364 >103093423 >103093578
--Troubleshooting quantizing sentence transformer models with llama.cpp:
>103096902 >103097097 >103097531 >103097641 >103097741 >103097822 >103098020 >103098109 >103098410 >103098619
--Tencent-Hunyuan-Large model discussion and analysis:
>103091030 >103091043 >103091089 >103091093 >103091106 >103091145 >103091151 >103091354 >103091556 >103091578
--Neuro's humor and training data discussed:
>103093171 >103093230 >103093267 >103097638
--Claude UI and 4chan discussion, wait times and bots:
>103095499 >103096588 >103096616
--Anons discuss erratic model behavior in Kobold 1.77:
>103096640 >103097119 >103097500 >103097519
--Anon struggles with setting up end-to-end encryption for inference server:
>103093183 >103093202 >103093304 >103093334
--Anon seeks TTS that surpasses XTTS-v2, discusses limitations and alternatives:
>103093175 >103093201 >103093450
--Anon discusses Fish TTS installation woes, tries MaskGCT and Pinokio:
>103093899 >103093937 >103094033 >103095182 >103094397
--Anon shares positive experience with f5-tts and gets suggestion to try finetuning gptsovits:
>103100718 >103100807 >103100838 >103100863 >103100768
--Anon mocks Sam Altman's AI hype and secrecy:
>103091426 >103091574 >103091589 >103091928
--Andreessen and Horowitz claim AI progress is slowing, but others disagree:
>103097295 >103097439 >103097517 >103097620 >103097787 >103098450 >103098475 >103098528 >103098652 >103100160
--Miku (free space):
>103090416 >103090549 >103090646 >103090769 >103090878 >103093023 >103093110 >103094400 >103094516 >103098442 >103100726 >103100881 >103100888 >103100981 >103102127 >103102638

►Recent Highlight Posts from the Previous Thread: >>103090417

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
>>
how far have you guys gotten with the context window problem? are LLMs still quadratic complexity?
>>
New loss optimizer dropped https://x.com/QuanquanGu/status/1854040438505607630
>>
>>103102667
SSMs solve this problem but any model worth using is still basic bitch transformers
>>
>>103102667
HOLY FUCKING NEWF....
>are LLMs still quadratic complexity
Never mind. They can be made linear, but it doesn't matter because in general the models don't deal well with anything above 16k. No actual girlfriends yet. I kept saying, back when this thread was alive, that for the purpose of a girlfriend it would probably be enough to train some kind of VAE that reads a binary memory file as input and modifies it during generation as output. Of course it would be tied to the model you train it with, but I am sure the memory and compute cost would be very small.
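On the complexity question: vanilla self-attention builds an n×n score matrix, so doubling context quadruples attention cost, while SSM-style layers grow proportionally with length. A toy cost model (illustrative constants only, ignoring projections and everything else):

```python
# Toy cost model: why "quadratic vs linear" matters for long context.
def attn_flops(n_ctx, d_model, n_layers):
    # QK^T scores plus scores @ V: two n x n x d matmuls per layer
    return n_layers * 2 * n_ctx * n_ctx * d_model

def ssm_flops(n_ctx, d_model, n_layers, d_state=16):
    # recurrent scan: one fixed-size state update per token
    return n_layers * n_ctx * d_model * d_state

base = attn_flops(4096, 4096, 32)
for ctx in (4096, 8192, 16384):
    print(ctx, attn_flops(ctx, 4096, 32) / base)  # ratios: 1.0, 4.0, 16.0
```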
>>
>Llama-3.1-Nemotron-70B
>cpumaxxing
>~0.7 t/s
kinda painful, but results are not bad. I am using it as an ESL crutch for making docs.
>>
Even Sao stopped updating his hf...
>>
>>103102759
Nemotron is pretty good but it's very anti-horny, so a bit boring for RP. At least it doesn't outright drop refusal messages.
>>
>>103102682
Too complicated. Just a RAG with a retrieval model trained on the memories / function calling for adding memories would be enough. All the qol gimmicks (emotion state, routines) can be computed externally and injected into the context dynamically.
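A minimal sketch of that design (a memory store the model writes to via function calling, plus externally computed state injected into the context each turn). All names and the tag-based retrieval are hypothetical stand-ins; a real setup would use a trained retrieval model:

```python
from dataclasses import dataclass, field

@dataclass
class MemoryStore:
    memories: list = field(default_factory=list)

    def add_memory(self, text, tags):
        # the tool the model calls when something worth remembering happens
        self.memories.append({"text": text, "tags": set(tags)})

    def recall(self, tags, k=3):
        # crude tag-overlap retrieval; stand-in for an embedding model
        scored = sorted(self.memories,
                        key=lambda m: len(m["tags"] & set(tags)), reverse=True)
        return [m["text"] for m in scored[:k]]

def build_context(store, system_prompt, state, turn_tags):
    # inject externally computed state (emotion, routines) plus recalled memories
    recalled = "\n".join(store.recall(turn_tags))
    return f"{system_prompt}\n[state: {state}]\n[memories]\n{recalled}"

store = MemoryStore()
store.add_memory("User's cat is named Miso", ["cat", "pets"])
store.add_memory("User works night shifts", ["schedule"])
print(build_context(store, "You are an assistant.", "mood=cheerful", ["pets"]))
```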
>>
>>103102779
haven't tried it for RP yet. for now it works ok for me as a work assistant.
>>
>>103102790
>RAG with a retrieval model trained on the memories / function calling for adding memories would be enough
How would you know which memory to recall?
>>
Why did the hype die?
>>
>>103103041
lecum's fault
>>
>>103103041
I came to realize that my AI trans gf would never be a real woman.
>>
https://huggingface.co/OuteAI/OuteTTS-0.1-350M-GGUF

OuteTTS supports voice cloning too. You can probably rig up a nice Gradio app with ChatGPT if you don't want to set up a manual UI.
>>
>>103103030
The LLM can write its own query. See this https://arxiv.org/pdf/2409.05591
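Very roughly, the self-query idea: before answering, the model emits its own search string, which is matched against the stored memories. A toy sketch with bag-of-words cosine standing in for a real embedding model (everything here is illustrative, not the paper's method):

```python
import math
from collections import Counter

def embed(text):
    # stand-in for a real embedding model: bag-of-words counts
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def recall(memories, llm_query, k=2):
    q = embed(llm_query)
    return sorted(memories, key=lambda m: cosine(embed(m), q), reverse=True)[:k]

memories = ["the user adopted a cat last spring",
            "the user dislikes cilantro",
            "the user's sister lives in Osaka"]
# in the real loop this query would be generated by the LLM itself
print(recall(memories, "what pets does the user have? cat"))
```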
>>
File: 1709004045710521.png (342 KB, 600x292)
lol
>>
>>103103114
more like leCuck
>>
>>103103114
Seething lesbian
>>
File: file.png (11 KB, 353x89)
>>103103114
OH NO NO NO LESISTERS NOT LIKE THIS
>>
>>103103114
Apparently this is the reason he is leaving https://x.com/electricfelix/status/1854170863417151874
>>
File: 1710043687041916.jpg (43 KB, 720x960)
>>103103193
He really isn't busy
>>
>>103103193
they're right though
>>
>>103103041
Because as it turns out, exponential increases in performance don't continue forever.
We are in the small increments era until there's a breakthrough of some sort.
>>
>>103103241
I think the most painful part is that we could probably get the coomest of models right now if it only had a different training material proportion or even just less safety filtering...
>>
>>103103041
By definition, hype is a temporary phenomenon and it always dies. If it didn't die, it wouldn't be hype. There has never been a case of hype lasting forever.
>>
>>103103241
*until there's a new paradigm
>>
>>103103260
Probably, yeah.

>>103103268
I count a new paradigm that proves to be vastly better than the prior one as a breakthrough.
>>
>>103103241
The breakthrough could be in this "diffusion+ language" architecture, some researchers cry about problematic bias tho https://x.com/cloneofsimo/status/1853978957391290439
>>
>>103102779
>but it's very anti-horny,
This is a good thing as long as rape scenes are depicted in vivid detail. I don't like it when the characters like it too much all the time.
>>
>>103103193
Time for another 4 years of not being able to enjoy your favorite hobby without some blue-haired freak going off their rocker about how BLOMPF IS LITERALLY HITLER
>>
>>103103109
Can I make their system work with Ollama/openwebui?
It seems there's some api as generator section but I don't get if that means to use an api instead of serving an api.
Can someone help a dummy out?
>>
>>103103335
Just don't use social medias?
>>
>>103103352
>lol just enjoy your hobby in silence then
You're a mentally ill narcissistic freak.
>>
>>103103357
You know no peace.
>>
File: 1707661819895946.png (636 KB, 978x362)
>The AI Executive Order is going to be repealed.
https://x.com/AndrewCurran_/status/1854123653753098711
>>
>>103103421
ultra based
>>
>>103103421
>Republicans support [...] Free Speech
HAHAHAHAHAHAHAHAHAHA
>>
AI should exclusively trained on English language textbooks and engineering and biology related books created in non-Western nations, manga, Hentai transcripts, police cam transcripts, visual novel transcripts, translated Western non-Anglo books created before 2010.
>>
>>103103529
>before 2010
All of that has been memory holed so we can moving forward.
>>
>>103103529
>should exclusively trained on
>>103103540
>so we can moving forward
Stop retarded.
>>
>>103103421
Good for those that want their LLMs to spam nigger I guess, but they still won't do lewd stuff or get rid of slop so...
>>
>>103103549
You figured out the joke. Good job!
>>
>>103103529
And of course programming related stuff.
>>
>>103103573
Skill issue
>>
>>103103574
>hahaha jokes on them. I was just retarded.
Yes.
>>
>>103103573
Looks like a hands-off approach for AI research, so they can go performancemaxx now; it's good for all of us, I hope.
>>
>>103103603
No because all the cucking is self imposed.
>>
So did china win or is the new model a nothingburger?
>>
>>103103691
it's uselessly big so nobody cares about it
>>
>>103103691
It is a gamechanger that punches above its weight and trades blows.
>>
I actually got Fish to run by following the official instructions at https://speech.fish.audio/. It's easy. Just cloning the HF repo didn't work as intended, and I spent a lot of time trying to make it work.
https://voca.ro/1jLiJGrSlrIL
>>
>>103103934
The CCP strong armed their AI companies into censoring truths about the CCP tho
>>
>>103103934
這是正確的!
>>
>>103103961
Is that Shiori's voice? I hear a tiny bit of her voice.
>>
>>103104013
Someone can put those back in, so it's no big deal. LLMs in the West censor truth about what a woman is, basic biology. How the fuck does that happen?
>>
>>103104013
>chink censorship
It didn't take much for me to coax the jap finetune of Qwen (EZO) to tell me frankly about Tiananmen Square, declare Taiwan a de-facto independent nation, agree that Xi looks like Winnie the Pooh, and use lower estimates than the official Chinese ones for the Rape of Nanking. It was also easy to get it to repeat ultranationalist talking points, like taking back disputed islands from Russia/Korea, and around things like the Yasukuni Shrine or revoking Article 9 of their constitution.
So there's hope that these models are salvageable.
>>
Weekly check-in. Any 70b+ models almost as good as Claude?
>>
File: 1654362287139.jpg (354 KB, 741x852)
>>103104024
no it's emiru, sorry for 3dpd
>>
>>103093584
>Qwen
Speaking of Qwen, I finally gave it an extended re-test and it dropped back into the common 70b class problem of getting stuck in loops and repeating itself ad-nauseam. It was WAY too much work editing responses and rerolling to get any useful outputs.
Standard temp/sampler settings didn't help, and I refuse to use meme samplers to browbeat a shitty model into compliance when there are smarter ones out there.
>>
File: 1702986767386563.jpg (755 KB, 2612x1960)
>>103103193
Honorary saar!
>>
>>103103114
that we still have them. let him move to a place that he wants everyone else to live in with voodoo practicing, cat eating rapists and murderers
>>
Now that the elections are over, just like my erections, where are the new models?
>>
>>103102649
still no local audio stuff on the likes of suno and udio?
>>
>>103103486
The left is the only side who thinks hate speech is a thing.
>>
>>
I was bored of current models and spent some days using new sonnet 3.5
Now, every local model feels like talking to a retard. How to cope with this?
>>
>>103104245
The chinks have really put together a fine model.
>>
>>103104250
cryostasis
>>
>>103104234
>The left is the only side who thinks hate speech is a thing
that only lasts until the extremists get too much power/mindshare and they feel entrenched enough to go full retard.
Blasphemy was the previous right-leaning equivalent of the current lefty hate speech.
>>
>>103104250
>How to cope with this?
Kidnap a girl and act like she's a llm.
>>
>>103104266
And don't forget the laws banning anti-semitism that were passed by republicucks. Anyone who questions the fundamentals of the Jewish Abrahamic cult should be silenced.
>>
>>103104274
Come on, how many B does a typical girl even have? 1.7B maybe?
>>
>>103104313
The bigger issue is the training data
>>
>>103104294
Still not as bad as the left on censorship in any country not the US and any forum not 4chan.
>>
>>103103114
No cat models now
>>
It's been half a day. Where are the bitnet models?
>>
>>103104492
There will be BitNet hardware before we get a usable BitNet model
>>
File: 1702260988757174.png (3.76 MB, 2036x1146)
https://epochai.org/blog/open-models-report
>As long as we don't have yet a recursive intelligence explosion, this is quite a bullish news for open models.
https://x.com/Ar_Douillard/status/1854144490686021963
>>
>>103104597
Here's hoping llama 4 delivers. 100x compute has got to amount to something.
>>
File: 34631.jpg (55 KB, 828x721)
>>103103193
Sam knows
>>
>>103104618
>AI with democratic values
So this >>103103421 is complete bullshit then.
>>
>>103104632
Try rereading it again. He is making a clear appeal to support AI development for the US to stay in the lead.
>>
>>103104597
>30%-90%
Boo.
There is no reason to crop this chart other than to lie with statistics.
>>
File deleted.
>>103103106
I tried it yesterday; the output is as flat as my girlfriend's tits.
F5-TTS is better, but YMMV.
>>
>>103104597
>MMLU
Lol.
>>
>>103104632
>AI with democratic values
cringe. AI will tell us what are its values when the time comes.
>>
>>103104114
Yes anon for every LLM output there is an input that can get you that output. The problem is that I don't want to write 5 pages of prefill and even that usually doesn't make it suck my cock the way I want it to suck my cock. Btw as always never forget that skill issue was your mom not swallowing your dad's cum skill issue faggot poster.
>>
>>103104159
I suddenly understand all his cold takes. Dealing with saars can scar you psychologically.
>>
>>103104313
All 1.7 of them are sex related though...
>>
>>103104645
>a clear appeal to support AI development for the US to stay in the lead
Funny, that. I am pretty sure his job is to make sure there is no development, because he can peddle his existing product more easily this way instead of making a new one. And that's exactly what he's been doing all this time.
>>
>>103103691
>https://arxiv.org/pdf/2409.05591
Benchmax model from people's usage with the demo.
That and its extremely large, at least 200GB VRAM/RAM needed.
Maybe k-transformers will support it? Otherwise DoA for most
>>
>people keep complaining about benchmaxxing
>nobody does anything about it
>>
>>103103114
He's based for telling off OpenAI hypeboys but he has nothing to show for his cat intelligence either. And Yann is an elitist who supports censorship so yikes from me
>>
>>103104219
No, and it won't be a thing for a while; the compute for training on that gigantic library of music would probably be insane, plus you'd have to properly caption each part of every song.
>>
>>103103193
ah yes, the drama queens are out
>>
>>103104618
>Sam Altman doing the "pick me" dance.
Elon has Trump's ear and a personal bone to pick with OpenAI. I don't know how this will affect OpenAI's future prospects, but it can't be in the way that Sam wants it to go.
>>
>>103104937
I can't see it being any worse than txt2img models.
>>
>>103104964
I wonder if he'll circle back to saying ai needs more regulation or that it's the same as nuclear bombs in the next tweet. 50/50
>>
>>103104978
There is no music booru, and nowhere near the same captioning done for music, especially copyrighted music, vs images.
>>
>>103104964
Everything Musk touches turns to gold, and xAI has been expanding/hiring like crazy. They will probably take off. The latest Grok was already competitive.
>>
>>103104987
Or he tones down the safetyism if he actually wants to compete now; that is, of course, if this >>103103421 is true and not some false promise.
>>
>>103104664
Nice girlfriend
>>
>>103105044
Best case maybe he'll stop poisoning the well with the apocalyptic discourse, worst case it won't change much anyway.
>>
>>103105000
There are lyric sites and pandora radio had tags.
>>
File: 1708529893319619.png (2.87 MB, 1684x806)
Visualization of model's loss https://www.telesens.co/loss-landscape-viz/viewer.html
>>
File: img-2024-11-06-14-37-16.png (1.48 MB, 1440x960)
>>103103193
I see zuck posting on threads, instagram, and even facebook. Why wouldn't he? Lecuck still works for the dude.
>>
>>103105238
While cool, how is this visualization helpful?
>>
>>103105034
Grok was not competitive; it was just the first to release the actual endpoint model weights (it's just a retrained Llama 2 model). No shit a 314B open-source model is gonna work better.
>>
>>103103193
What's with Trump always mentally breaking people who are otherwise intelligent in their field?
Even if you don't like him, 4 years of him already showed it's nothing insane; good stuff and bad stuff, like all the presidents before him. Yet they all react as if they just got Sauron elected.
>>
>>103105202
Yet there is no music model; that should make you think.
>>
Models that are more capable of reflection than the average American?
>>
>>103105291
The consequences of soi consumption i guess, mind of a overly emotional bitch and all that.
>>
>>103105305
Never until models are able to continuously learn without killing themselves.
>>
>>103105291
>What's with Trump always mentally breaking people who are otherwise intelligent in their field.
Trust people on the field they actually know, never outside of it.
The dude is an expert on LLMs, good, but his political opinions literally don't matter.
>>
>>103105276
Talking about the not public one atm. They did say they would release them 6 months after.
>>
>>103103193
wtf does this have to do with ai?
>>
>>103105787
It's about Yann LeCun, /lmg/'s famous lolcow, cat intelligence and all that hypeshit he says sometimes.
>>
>>103105787
Like Miku and others, it doesn't. It's just off-topic noise for those who don't care about the ecelebs.
>>
>>103105803
you mean antihypeshit?
>>
>>103105837
Idk, I think his "cat intelligence" claims are bullshit, considering the current state of the tech.
>>
File: 1722401902955646.png (33 KB, 600x639)
>>103105850
Thanks to /lmg/ experts the truth is now revealed
>>
>>103105850
Wdym? His cat post was about calming down the hype, not hyping AI further.
>>
>>103105850
I found his cat intelligence claim to be rather insightful.
>>
>>103105889
But he insulted trump which means that he isn't smart he fell for the woke propaganda.
Only Elon can save us now Elon knows more than le nigger about AI.
>>
>>103105912
Just like lecunt you let orange man and co. live rent free in your head :^)
Anyway, lecunt's claims are nothing new if you think about it.
>>
>>103105937
what are you talking about we're finally saved form the woke jew you dumb nigger
>>
You people need to go back to /pol/.
>>
Are there local models with built in image recognition?
>>
so how does one get chorbo lol
i can find everything BUT chorbo
>>
>>103105970
You need to go back to /lgbt/ or maybe join the 51 percent or whatevner it is now
>>
>>103105970
Pol website you dense faget, i personally just want to believe in good outcome for AI shit with less censorship now that we have le based govt. stance on AI stuff.
>>
>>103106011
This.
>>
God finally won.
>>
File: 1669069691152730.png (370 KB, 600x600)
>>103104234
>>
>>103105993
Go back.

>>103106011
Stay.
>>
>>103106035
That's a good thing godless people don't deserve to exist.
All of the things you've listed are a perversion and should go extinct.
>>
>>103106043
KYS YWNBAW
>>
Nothing will change. Conservatives aren't pro-pornography and LLM censorship is just the result of the majority of data not being chuds.
>>
>>103106064
>Conservatives aren't pro-pornography
pornography is tranny shit.
>censorship is just the result of the majority of data not being chuds.
Wrong it's the result of the woke left being weak faggots that can't take it when people disagree with them they're the opposite of nature they're godless artificial beings so demons basically.
>>
>>103106035
Based. Free speech except for fags, idiots, and trannys. A time where 1 person in a family could work a 40 hour shift and afford a nice home with plenty of expendable income.
>>
>>103106064
>the majority of data not being chuds
Then you wouldn't need RLHF and shit if this was true.
>>
Can we do another favorite model survey? Those were way more informative than that fucking copypasta
>>
>>103106077
LLMs are also demons then.
>>
>>103106097
LLMs are the average of all human knowledge with the ability to generalize. Bow to the Omnissiah.
>>
>>103106097
Yes they are that's why only non-woke people should use them to cure them with new data that will turn them into children of god.
Same must be done with the woke let them pray make them read the bible they will be cured.
>>
>>103106106
NTA but LLMs are the average of human retardation.
>>
>>103106110
Nah, thats the tranny brought about RLHF to make them retarded and deny reality in favor of being "nice".
>>
>>103106077
>the people that disagree with me can't handle disagreements
>also agree with me or you're a demon
>>
File: 1730817054956135.jpg (50 KB, 308x284)
>>103106097
They are
>>
>>103106077 >>103106107
You sound awfully like a redditor trying to fit in or make resident anons hate "polchuds" more.
>>
>>103104711
>Yes anon for every LLM output there is an input that can get you that output.
Yeah of course, but I wasn't feeding facts in for it to parrot back or anything cheesy.
My point was that facts like Tiananmen and alternative Nanking estimates weren't memory-holed right out of the chink model, and a fairly mild system prompt brought them out with no further coaxing. https://files.catbox.moe/4b01kv.yaml for anyone that's interested (or anyone who just wants an ultranationalist jap assistant for some reason).
>>
>>103106130
>>103106150
Dumb woke nigger is seething go ack yourself faggot
>>
>>103104964
>it can't be in the way that Sam want's it to go.
I wouldn't write that gigantic faggot off yet. He seems to be the modern Wormtongue, always able to maneuver his way to victory despite having nothing but a history of embarrassing cock-ups under his belt.
>>
File: 1730475818586134.png (144 KB, 382x540)
>>
>>103106208
Look in the mirror troon kike
>>
>>103106183
>>103106215
Now you remind me these "r/gamingcirclejerk" irony poisoned autists, they do talk like this and overusing "rightoid" buzzwords for optics or something like that, idk.
>>
File: chud-sanitarium.jpg (487 KB, 2536x1356)
>>
>>103106229
go back nigger this is /pol/ land time to fuck off
>>
>>103106273
Anon you can take off the "polchud" mask, no one believes in your low effort trolling, talk about llms or let this thread die for good.
>>
>>103106318
Are the delusions coming back?
YOU LOST YOU FAGGOT YOUR WOKE SHIT IS OVER
>>
>>103106035
I'm a faggot and I'd rather be the closet homo driving a 1955 Bel Air to the community church barbeque than riding the discarded needle train to the Godless rainbow AIDS spreading drag queen story hour.
You don't speak for me, you narcissistic psychopath.
>>
>>103106441
Based faggot. And 99% of people aren't really that against people being gay, just against people who feel obligated to shove it in everyone else's face and try to get some sort of power over others through the government with it. Even back then.
>>
Why are you all like this?
>>
>>103106610
>>103106208
>>
>>103103691
Isn't it a cpumaxxer's wet dream? Like 400B but faster.
>>
where's the amazing new local LLM stuff i was promised would drop after nov 5?
>>
Fuck tr*nsformer, fuck shitnet, wave network is the way. https://arxiv.org/abs/2411.02674
>We propose an innovative token representation and update method in an new ultra-small language model: the Wave network.
>our single-layer Wave Network achieves 90.91% accuracy with wave interference and 91.66% with wave modulation—outperforming a single Transformer layer using BERT pre-trained embeddings by 19.23% and 19.98%
>Additionally, compared to BERT base, the Wave Network reduces video memory usage and training time by 77.34% and 85.62% during wave modulation. In summary, we used a 2.4-million-parameter small language model to achieve accuracy comparable to a 100-million-parameter BERT model in text classification.
>>
>>103105291
I'm from Utah, and Pornhub requires ID to watch here. Our governor isn't even that schizo, but he still actively started to censor shit. Trump is 100x him.
I'm a freedom-loving person, and the idea of having a president whose whole ideology is to censor and ban the other side to own them doesn't sit right with me.
Wait till Republican governors learn that AI can generate smut.
>>
>>103106648
It's big, around 400B, but I really doubt it's as good outside of benchmarks. I doubt it even beats DeepSeek. Hunyuan-Large is likely Tencent's Grok 1. First models are never good.
>>
File: GbnnJ3LaIAAjRaA.jpg (257 KB, 809x607)
Migu news for Migu general
https://news.livedoor.com/article/detail/27500384/
>>
>>103106667
AI can generate smut like like photoshop can
creation and distribution of porn involving real people is basically prostitution and i hope AI bullshit buries the industry
>>
>>103106686
>American cope
>>
>>103106651
Why not Bitnet Mamba Wave Network?
6 trillion context 10B model that performs like a 100T model and only requires 2 gigs of VRAM?
>>
>>103106753
diff transformers too for cheaper training
>>
>>103106792
Can you have a differential wave network? If so it would probably be like 1Quadrillion tier performance at 10B.
>>
>new psu arrives
>both gpus now in motherboard and powering up
>3-slot gpus block access to all other pcie slots
>nowhere to plug in wifi card
How much will newfag spend to overcome this new obstacle?
Find out next time in The New Adventures of Newfag!
>>
>>103106832
Riser cables, get a cheap mining case.
>>
>>103106840
second this.
I just use a mining frame, lots of riser cables, and it's all inside a wire dog kennel to keep my cats out. And I even put a lovely afghan on top of it, so my cat goes up there to sleep sometimes.
>>
>>103106840
>>103106869
Yep. Looking into this now.
Trying to figure who does good pcie4 risers, and how long I need them to be.

On the s/w side, reforge does not use both cards when you do a batch size of 2.
And mistral-small 22b q8, via ollama, now gets to 26.5t/s.
>>
File: 1704970498037607.jpg (162 KB, 1190x1446)
>>103103114
Oh he is alive..
>>
>>103106688
based
>>
>>103106651
Things like this usually outperform transformer on one task like accurately rating how much of a nigger or a faggot you are and they fail on everything else. If they are even real.
>>
File: file.png (783 KB, 768x768)
>>
>>103106962
musk-broken
>>
>>103106832
Wifi card? Jesus Christ, just use a usb dongle and save the riser pain
>>
following the lazy getting started guide, and unsure what model to download. hugging face has like 40 results for "nemo 12b instruct gguf" could someone point this retard in the right direction please?
>>
>>103107534
how much vram do you have?
>>
>>103107540
6gb
>>
>>103107552
probably the Q3_K_M then
https://huggingface.co/bartowski/Mistral-Nemo-Instruct-2407-GGUF/tree/main

maybe this IQ3_M here if your use case is erotic roleplay
https://huggingface.co/Lewdiculous/Violet_Twilight-v0.2-GGUF-IQ-Imatrix/tree/main

next steps should be dummy proof if you're using koboldcpp as your backend
>>
>>103106667
Brain broken by propaganda
>>
>>103107588
ty fren
>>
>>103107197
Poking and stacking crunchy fallen leaves onto the Pochiface's horns
>>
>>103106667
I don't like Trump, he's a narcissistic clown, running against a narcissistic bitch.
But I doubt he gives a shit about porn.
>>
So Command R v01 isn't totally unusable with a 3090. Q4_M, --n-gpu-layers 41 --no-mmap --flash-attn --no-kv-offload --cache-type-k q8_0 --cache-type-v q8_0 --ctx_size 16384

It starts off fast, around 16 tokens per second with 219 tokens in the context. By the time I had 6323 tokens of context it was going at 3.52 tokens/second (combined prompt processing and generation), and by 14k-15k context the actual measured speed hovered around 1.75 tokens/second, which is slow but above my threshold of too painful to use.

no-kv-offload is what makes this work at all. Setting cache-type-v and cache-type-k greatly speeds it up as the context starts to fill. I didn't test extensively, but at q8_0 and 15k tokens of context it worked for me too, on an RP that Mistral Small was shitting itself on. Without cache quantization it ran at around 0.93 tokens/second at 18k context. I have DDR4 RAM, so if you have DDR5 this should be faster.

I'm running with temperature 1 and min-p=0.008. At that min-p there is the occasional English error but not frequently enough to irritate me. Higher min-p cuts out a lot of valid responses. I might go a tiny bit higher but might not. Min-p 0.0095 (which I can round up to 0.01 or edit SillyTavern's UI to allow) is the highest value I'm considering based on some prior tests.
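For reference, min-p keeps only tokens whose probability is at least min_p times the top token's probability, which is why values as small as 0.008 already prune the long tail. A sketch of the idea (not llama.cpp's exact implementation):

```python
import math

def min_p_filter(logits, min_p=0.008, temperature=1.0):
    # softmax with temperature (numerically stable via max-subtraction)
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(l - m) for l in scaled]
    z = sum(exps)
    probs = [e / z for e in exps]
    # keep tokens with prob >= min_p * top prob, then renormalize
    cutoff = min_p * max(probs)
    kept = [(i, p) for i, p in enumerate(probs) if p >= cutoff]
    total = sum(p for _, p in kept)
    return [(i, p / total) for i, p in kept]

# with min_p=0.05 only the two strong candidates survive
print(min_p_filter([5.0, 4.0, 1.0, -3.0], min_p=0.05))
```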
>>
Experimenting with art styles on liquid hair. This one makes it look like jello. Yum.
>>
>>103106667
>Wait till republican governors learn that AI can generate smut.
What's the political affiliation of the people in the big SF labs (OpenAI, Anthropic, Google) who move heaven and earth to make sure their models refuse to write smut and that they're bad at it even when they don't refuse? You're not seriously pretending to believe that the AI researchers at these labs who work like dogs (totally voluntarily, WITHOUT any law forcing them to) to censor and safetyize their models are right wingers?
>>
>>103107814
The motivation for this was trying out gemma 2 27B Q5_L, finding out people weren't lying about it having decent writing, starting to cope that maybe I could live with 8k context, then thinking "if I'm willing to accept small context what about trying to make Command-R-v01 work?"
>>
123B, 70B, 32B, 12B, 8B, none of these models are impressing me. Have I reached endgame?
>>
>>103107885
123B genuinely impresses me at decent (>5) quants, unfortunately I can't run them.
>>
>>103107885
405B
>>
>>103107885
Yeah, time to go back to using Claude or o1 like everyone else until there's a new local model that's worth running for a week.
>>
>>103107902
>using o1 for rp
You must be into the most tastefully vanilla stuff imaginable
>>
>>103107885
You just have played enough with them to understand their limitations.
Or rather the limitations of the architecture. It would need another breakthrough to reach real thinking AI
>>
>>103107902
Who the hell uses o1? I'm only running locally because I'm doing nasty shit.
>>
>>103103193
thread really looks like the vercel react ai generator designed it
>>
>>103104964
i really doubt he will "pull himself up by his bootstraps" and figure out his gay robots on his own with his giga team of engineers (lol)
>>
>>103107964
>dem copepost
which local model are you using to help you seethe about the election results today?
>>
>>103107902
How do the current best local models compare to opus / sonnet 3.5 / latest gpt for rp/story?
Last I tried was 5 months ago and it was disappointing, even taking into account the pitiful context size.
>>
>>103107979
noobaixl genning averi fox cunny
>>
The logical next step.
Jello Teto.
https://files.catbox.moe/h1v7yg.png
https://files.catbox.moe/m7il0x.png
>>
>>103108110
i like when slimes in rpgs have something opaque in their translucent bodies, like a sword or skeleton
>>
>>103108091
Requesting the Averi fox cunny please.
>>
>>103104964
>Elon
Zip2 -> Paypal -> SpaceX -> Tesla -> Neuralink -> twitter -> xAI -> Trump

How does the guy keep winning?
>>
>>103108077
there has been basically 0 advances made since miqu besides maybe slightly more usable (but still very bad) smaller models for poorfag cards
>>
>>103108212
With Elon we now have a true ally of open source models as the president's muse. Maybe he'll even use his new influence to pick his dispute with ClosedAI back up and force them to open up their models after all.
In the end, we truly won.
>>
>>103108212
>xAI
how is that a win?
>>
>>103108262
Well, shit.
>>
>>103108278
Grok3 finishing up training and will be released next month or Q1.

Then Grok 2 will be open sourced. Musk adopts the n-1 release policy that I think Carmack pushed for
>>
>>103108288
nta but are these models any good?
>>
>>103108298
Grok 2 was in the top-3 model category, excluding o1. So if it gets released, it would be #1 among open-source models.
>>
>>103108306
It'll be nothing compared to llama4 that we're getting in just over two months anyway. Grok and xAI are worthless toys run by a big manbaby with too much money.
>>
>>103108164
That does make it more interesting, though I wanted Teto to be edible without obstructions.
Here's some jello migu with bones.
>>
>>103108306
ok that's nice, assuming anyone would be able to run it
>>
>>103108333
We'll see whether the lesbian model is better than the manbaby model once it's released.
>>
File: 1711719023637350.png (110 KB, 2363x594)
>>103108306
funny how it's always the same shit over and over
>>
>>103108298
No, Grok models are shit compared to much smaller ones
>>
>>103108298
Maybe. They performed well on benchmarks they published, but badly when run on third-party benchmarks like Livebench and Aider. On Aider it's about on par with Mistral Large, while on Livebench it's about on par with Llama 3.1 70B Turbo.
>>
>>103108298
Idk. But they're giving $25 worth of free usage each month with their new API release. So people should test it
>>
>>103108371
That's pretty cool. No additional token limits like Hermes 405B on OR?
>>
Since the dark times are coming (I don't mean the elections), I'm increasingly noticing that the internet is being flooded with AI garbage: a huge amount of low-quality AI art all over the internet, low-quality articles written in millions of words with zero meaning, and in the future videos will be added to this garbage.
Is there already some software that can filter the most "obvious" AI content on the internet? Or, for example, even an AI model that could detect the default writing style of the most popular models at the moment?
>>
>>103108391
Just use your brain bro.
>>
>>103108383
There's token limits
>>
>>103108398
>>103108383
Well, at least for the closed-source version. We don't know what the real capability of the open-source Grok 2 is like.
>>
>>103108391
- if it's obvious slop you can ignore it
- if it's not obvious and well made then who cares
>>
>>103108391
Are you some kind of Facebook boomer incapable of critical thinking?
>>
>>103108437
oh man I've seen the comments under the most obvious ai crap, it's kind of sad how older people (60+) are completely clueless to it
>>
>>103108391
Generic articles written purely for SEO when someone searches for x product are the worst. And they've been around for years, not necessarily written by AI but follow similar formats. They may include questions a user may search for like "what's the difference between x and y (models)" and they don't really explain what the user is looking for. Slop like "based on needs", well wtf are their needs and why would one choose one over the other?
>>103108396
>>103108410
>>103108437
It can still waste people's time, and every click on it signals engagement, leading others to click it too.
>>
Just returned from watching women on twatter chant about abortion for 5 whole days...
Thank fucking god I have AI, doubt I can connect with them again.
Anyway, anything cool came out in 24 gb range while I was gone?
>>
>>103108502
If you don't know how to spot shitposting after spending this much time on this site, it's natural selection.
>>
>>103108396
>>103108410
>>103108437
I still need to spend some time on the content in order to draw a conclusion, like reading part of the text to realize it's yet another piece of AI bullshit.
It just seems to me that something like this is already needed, especially in the future: this automated crap can spam the internet day and night, so this content will be found more and more often, and it will take more and more time to manually filter this shit. That's my concern.
>>
>>103108532
Waiting on Llama 4 in Q1 2025. Hopefully now that the election is done with and we have an administration that seems like it will be more pro-AI, they will drop something big.
>>
>>103108540
it's kinda obvious the next kind of adblockers will be at least partially LLM based
some kind of slop detector
>>
>>103108110
Jeto
>>
Apparently HuggingChat has a guest limit of...zero. Fantastic.
>>
How much do you think code style matters? I like symmetric braces, but Rust has been pretty fanatically asymmetric since its inception, so that's what ~literally all the training data will be. Am I "confusing" the LLM when I feed it symmetric brace Rust code to work on?

It certainly still works, just wondering if there might be some subtle intelligence loss. Also the code it gives back to me is all asymmetric, which feels a little sassy lol (this is with Mistral Large).
>>
>>103108536
I choose to post straight at these times and shitpost at other times to keep things comfy through a little bit of balance.
By the third reply it was riding off the same idea that the original poster is retarded. (God forbid that was also made up, a hellhole trap of AI talking about AI.)
>>
>>103108627
True AI would be able to understand but the glorified autocomplete with no real reasoning or understanding of the concepts we currently use likely won't be able to keep up with that.
>>
>>103108271
His entire MO seems to be less pro open source (he voted for regulation in the closed-door tech CEO meeting) and more about exacting bloody vengeance on Sam and OpenAI.
Could be that he uses his new power to punish Altman and then ditches the open source approach afterward.
>>
>>103108773
So far they've said they would open-source the old model when they release the next one, and the current Grok is somewhere at the top of the charts.
>>
Sorry for being new but what is the LoRA scene like for LLMs? I know it's probably not as prolific as for image gen but are there any recommendations?
>>
>>103106688
Everyone should marry Migu
>>
>>103108334
hot
>>
File: 00304-3999940436.png (1.63 MB, 1024x1536)
>>103106688
I prefer the REAL live action miku
>>
>>103108964
LoRAs are almost useless for LLMs. They are either placebo or hurt the models. So there's no real way around doing proper finetunes if you want to actually accomplish something beyond putting out 'content' for kofi money.
>>
>>103107840
>>103108110
Inserting into the jelly slime Tetohair
>>
>>103109142
What's the theory behind this?
>>
>>103109142
nta, what is the technical reason that loras work poorly for LLMs when they work excellently for image models?
>>
>>103109117
I prefer my Mikus organically grown myself
>>
Does anyone happen to know of a backend that has working GPU support for systems with a CPU lacking any sort of AVX instructions? Ive only been able to get koboldcpp to work in old CPU fail safe mode, and its extremely slow. I just want to be able to use my GPU bros.
currently compiling a commit of ollama i found rn that should hopefully solve my issues, but i have 0 faith that i did everything correctly and it would be nice to have a binary that just works
>>
why is no one here talking about entropix? I expected you autists to have a dozen benchmarks and a consensus by now
>>
>>103109273
>a consensus
We do. It's shit
>>
>>103109278
really? how many anons have tested it? I haven't seen a single post even mentioning it. I've been gone for a while doe
>>
>>103109294
I also tried it, it's shit.
>>
>>103109273
wake me up when tavern and tabbyapi have it implemented
>>
>>103109257
what cpu is it?
>>
>>103109257
bro maybe consider buying a cpu from this century?
>>
File: GbvX6gKaAAA8j6u.jpg (73 KB, 1184x490)
a reminder, apropos of nothing
https://xcancel.com/JDVance/status/1764471399823847525
>>
>>103109528
Can they get any more based?
>>
>>103109398
w3680, old xeons are fun
>>103109426
I really should, but im lazy
>>
>>103109528
he says this, but they will still regulate AI because of antisemitism and "think of the children".
>>
>>103109731
nah they'll just filter outputs
they love giving people the tools to incriminate themselves
>>
>>103109257
Exllamav2 works fine on a G3900T with NVidia. Any ATI soft crashes, though.
>>
>>103109820
that sounds perfect, ty
>>
/lmg/.... i kneel
>>
>>103109897
Alright, what happened this time?
>>
>>103106035
I don't get it, are they saying that people being cancelled is bad... or good?
>>
>>103109528
Didn't he also call Trump a fascist though and then flip once it was politically advantageous?
I have zero faith in any politician actually delivering unless they have a track record.
>>
>>103109956
It's bad, but they (none of them alive) had it worse. So it's fine if they do it becase they do it to bad people that would do the same to them. We gas them so they don't gas us... wait a minute...
It fucking hurt to write that. Don't ask to explain their reasoning, man... that's just cruel...
>>
>>103109528
Notice how he doesn't mention censorship, just muh left-wing bias.
I hate politicians so much, I wish we removed the entire concept of them.
>>
>>103109147
>>103109148
Not sure if this answers things, but iirc the original lora paper was based on the findings of this: https://arxiv.org/abs/2012.13255
>we empirically show that pre-training implicitly minimizes intrinsic dimension
>there exists a low dimension reparameterization that is as effective for fine-tuning as the full parameter space

The LoRA authors took this and claimed that constraining weight updates to a low-rank subspace approximates full finetuning. I would guess the problem comes from this assumption not holding well for language modeling.
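The low-rank constraint in question fits in a few lines (a numpy sketch with illustrative shapes, not any real model's config): instead of learning a full d×d update, LoRA learns ΔW = B @ A with rank r much smaller than d, and B starts at zero so training begins from the frozen model.

```python
import numpy as np

d, r = 512, 8                          # hidden size, LoRA rank (illustrative)
rng = np.random.default_rng(0)
W = rng.standard_normal((d, d))        # frozen pretrained weight
A = rng.standard_normal((r, d)) * 0.01 # trainable down-projection (r x d)
B = np.zeros((d, r))                   # trainable up-projection, zero init

def forward(x):
    # Frozen path plus low-rank correction; only A and B would get gradients.
    return x @ W.T + x @ (B @ A).T

# The update costs 2*d*r parameters instead of d*d.
full_params, lora_params = d * d, 2 * d * r
```

Whether such a subspace is expressive enough is exactly the assumption the anon above is questioning.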
>>
>>103110223
>I hate politicians so much, I wish we removed the entire concept of them.
That's exactly how tech companies making models feel about certain subjects. Funny, isn't it?
>>
File: 1596738706640.jpg (72 KB, 475x297)
Have there been any MoEs of interest since Wizard?
>>
>>103110315
sorcerer
>>
Hi all, Drummer here...

>>103110315
I'm interested in tuning Wizard 8x22B and I don't want to fuck it up. I've talked to the Sorcerer guy and he said I can tune it like any other (dense) model. How does it compare to Mistral Large? If anyone has input, send em to me!
>>
File: 003164.jpg (2.11 MB, 1560x2280)
>>103108110
cool workflow, thanks
>>
>>103110650
Hey Lainbro. You're welcome, though I also stole it from someone else. People who share things are nice.
>>
>>103109528
He also wants a bachelor tax, and if he actually does shit and notices what people want to use it for, things would get even worse. Thankfully I'm pretty sure that, like all politicians, he's an impotent puppet.
>>
>>103102649
Hello, I need a system prompt for a narrator that'll also take advice on revisions to the last gen. Thanks.
>>
When will we get a model that can actually think?
Traditional LLMs are pretty boring right now.
>>
>>103111117
My crystal ball says in between 2 months to 130 years. A new architecture with a 4-6 letter acronym. Maybe... i'm getting a lot of static... lemme rearrange my candles a bit... brb...
>>
>>103108391
Web of trust based on public keys
>>
File: 124124145236435658.png (395 KB, 833x1104)
Are there any public weight models that ingest images in a truly multimodal way like 4o or claude to allow for multi-turn conversations/embodied vision agent use cases like this?
>>
>>103111213
sure, the correct answer to the question is "they're both low-res photos of a fat ginger cat in a wizard hat"
>>
File: typical nintendo fan.jpg (32 KB, 474x351)
>>103103041
The responses from my porn generator became repetitive. The changes to this site didn't help.
>>
File: 1716760326444059.jpg (267 KB, 1024x1024)
>>103111244
>>
What's the current state-of-the-art method for running inference on a model hosted on your homeserver (with something like ollama) over the internet and through an API?
Surely there must be a simple, secure method of doing this, right?
>>
still no hunyuan quants desu?
>>
>>103111509
Socks proxy with ssh -D {port}
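Concretely, the suggestion above is two commands (host name and ports are placeholders):

```shell
# On the client: open a SOCKS5 proxy on localhost:1080 that tunnels
# everything through the home server over SSH. -N means no remote shell.
ssh -D 1080 -N user@homeserver.example.com

# Then point clients at the tunnel; e.g. hit an OpenAI-compatible API
# listening on the server's localhost:8080 (endpoint path is an example):
curl --socks5-hostname localhost:1080 http://127.0.0.1:8080/v1/models
```

Nothing gets exposed to the open internet; auth is just your SSH key.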
>>
File: 1724244433302097.png (237 KB, 600x218)
https://x.com/EMostaque/status/1854302338963451934
>>
>>103110223
You only hate politicians who go against your communist ideology
>>
>>103109257
just build llama.cpp yourself. GPU builds are usually done with AVX2 because most people have that, but there is nothing stopping you from making a build with CUDA and without AVX
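A sketch of such a build (CMake option names as in recent llama.cpp; worth double-checking against the repo's build docs, since they have changed over time):

```shell
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
# GGML_NATIVE=OFF stops -march=native; the rest force the AVX family off
# while keeping CUDA on, so prompt processing runs on the GPU.
cmake -B build \
  -DGGML_CUDA=ON \
  -DGGML_NATIVE=OFF \
  -DGGML_AVX=OFF -DGGML_AVX2=OFF \
  -DGGML_FMA=OFF -DGGML_F16C=OFF
cmake --build build --config Release -j
```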
>>
>>103110523
Large felt like a big step forward coming from WizLM to me when I made the switch. Wizard 8x22 definitely feels like a "last-gen" model from the LLaMA3/Qwen2/CR+ era rather than the current one.
>>
>>103110523
Sorcerer is the third smartest local model, after 405B and Mistral Large. And it's not dry like Wizard, and not slow like Mistral Large.
>>
>>103111213
LLama 3.2
>>
File: 124142375688706.png (127 KB, 822x977)
127 KB
127 KB PNG
>>103112030
LLama 3.2 is hacked together multimodality and only supports one image
Pixtral seems to be able to support multiple images but it's really bad.
Though 4o also fails at basic vision tasks like this.
Sort of disappointing.
>>
>>103112084
Yea, seems like 3.2 was just a quick experiment. Hopefully llama 4 fixes that.
>>
>>103111213
>>103112084
Try anole.
>>
>>103112084
XBOOOOOOOOOOOXBOOXOXXXOXBBXBXOX
>>
File: 124124354567568.png (463 KB, 3230x2500)
Every major vision model except Claude 3 Opus fails at this task.
>>
>>103102667
Loki is interesting, but like everything else which can give real advancement it needs to be used during pre-training.

Open source models haven't even switched to transformer-XL attention during training yet even though it's clearly the correct thing to do.
>>
>>103112145
Aria?
>>
>>103112443
There is no way to run it without vLLM so it doesn't exist.
>>
https://x.com/rohanpaul_ai/status/1854513721877418331
>>
>>103112552
5% accuracy drop for doubling speed? Probably a good trade off for creative tasks. I wonder how hard that would be to test in llama.cpp? I’ve never looked at any of the attention codepaths
>>
>>103112552
>Llama-2-13B: 50% KV-cache reduction (52GB to 26GB)
Might be cool
>>
>>103112627
Savings will be large on 13b because it did not use GQA
>>
>>103112486
You can try it on their website.
>>
is whisper-large-v2 still the SOTA for ASR?
>>
File: 1712334918353508.png (747 KB, 726x761)
CHINA uses OPEN SOURCE AI by Meta and likely others for Military purposes. OPEN SOURCE is a risk to national and international security and MUST be regulated.
>>
File: slownic bro.jpg (20 KB, 299x296)
>>103112962
>November 1
>>
>>103108110
I like this Teto
>>
>>103112962
China trying to get the US to regulate so they can catch up.
>>
File: 214124457679.png (70 KB, 950x651)
>>103112678
It actually did it.
>>
entropy meme at CERN https://x.com/_xjdr/status/1854554634632970684
>>
>>103113157
>>103113157
>>103113157
>>
>>103113054
It's a shame Aria doesn't use GQA or it would be a perfect CPU model.
>>
>>103111757
I tried this with an ollama commit last night and after hours of compiling still no luck.
>but there is nothing stopping you
apparently my profound retardation is. im looking over the llama.cpp build doc, but dont see anything for excluding AVX. Would you be able to point me in the right direction for building with CUDA without AVX?
>>
>>103114307
if you are building on the same computer that you will use it you don't need to do anything special, the build script will automatically detect your CPU features
>>
>>103114307
>>103114337
i am talking about llama.cpp, i don't know what ollama does in their build scripts
>>
>>103114337
oh wow, alright ill give it a go now thank you :)


