/g/ - Technology

/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>102758839 & >>102743974

►News
>(10/10) Aria: 25.3B, 3.9B active, multimodal native MoE model with 64k context: https://hf.co/rhymes-ai/Aria
>(09/27) Emu3, next-token prediction multimodal models: https://hf.co/collections/BAAI/emu3-66f4e64f70850ff358a2e60f
>(09/25) Multimodal Llama 3.2 released: https://ai.meta.com/blog/llama-3-2-connect-2024-vision-edge-mobile-devices
>(09/25) Molmo: Multimodal models based on OLMo, OLMoE, and Qwen-72B: https://molmo.allenai.org/blog
>(09/24) Llama-3.1-70B-instruct distilled to 51B: https://hf.co/nvidia/Llama-3_1-Nemotron-51B-Instruct

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Programming: https://livecodebench.github.io/leaderboard.html

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
►Recent Highlights from the Previous Thread: >>102758839

--Paper: PLaMo-100B, a large-scale Japanese language model with competitive performance:
>102770605 >102770997
--Papers:
>102759380 >102759535 >102759675 >102759859 >102759989 >102766750 >102771160 >102771486
--SillyTavern and KoboldCPP comparison discussion:
>102761053 >102761131 >102761199 >102761233 >102761332 >102761383 >102761462
--Running multiple PSUs in tandem can cause issues:
>102758927 >102764354
--No practical reason not to make an LLM front end in RPGMaker MV:
>102764211 >102764358
--Java implementation of Llama 3.1 inference and ollama support for 3.2 vision:
>102761590 >102761983 >102762453
--Troubleshooting xtts2 installation and compatibility issues:
>102762949 >102763509 >102764426 >102764613 >102764826 >102764952 >102765223
--Pyramid Flow - open-source high-quality video generation:
>102760489
--5090 recommended for AI, but availability and price may be issues:
>102765809 >102765892 >102766269 >102765921 >102766133 >102766393 >102766431
--Running 3 3090s / 4090s in a case is possible but will resemble a mining setup:
>102766648 >102766737 >102766966 >102767074 >102767095 >102767731 >102772011
--Nvidia RTX 5000 series discussion, performance and bandwidth improvements:
>102759501 >102759516 >102759770 >102759587 >102759634 >102759790 >102759528 >102759540 >102759553 >102759891 >102769140 >102769159 >102769207 >102769375 >102769415 >102769546
--Mixing Pascal and Intel Arc A770 GPUs in llama.cpp/koboldcpp:
>102771826 >102771872 >102771892 >102771913 >102771922
--Discussion about an ad for uncensored GPT-4o fine-tuning and creating AI girlfriends:
>102760427 >102760451 >102760511 >102760677 >102760502
--AMD multi-GPU support issues with newer software versions:
>102758969
--Gumi (free space):
>102758927 >102759015 >102763583 >102763803 >102764770 >102767818

►Recent Highlight Posts from the Previous Thread: >>102758842

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
>>
I can't sleep
>>
Rocinante-12B-v2g-Q5_K_M from a couple threads ago is actually unslopped. Especially since I recently used mistral-small finetunes. Nemo is more dumb though. Not sure how to feel about this.
>>
elon's biden squad https://x.com/disclosetv/status/1844580663627772175
>>
>>102772917
Count some Migus? 1... 2... ...3
>>
>>102773308
this is the future i wanted. i don't have to see my kids anymore and can stream diablo 4 instead. elon deserves his fame
>>
>>102772466
I'm pretty sure exllamav2 does not support parallel processing. Does this mean that I can laugh at anyone running exl2 on their $5000 multi-gpu build for not only being a retard who spent so much money to run bad models, but also for sitting there running them at a slow speed?
I guess richfags really are the scum of this general lmao
>>
File: 1000018759.jpg (539 KB, 2491x4035)
>>102760789
ok neat but spoonfeed me on how do i run it on termux
>>
>>102773834
*how to
>>
File: 1704109932374810.png (297 KB, 602x665)
>>102773834
>>102773841
you don't, the guy making it is a schizo and his meme sampler will never be available in other frontends btw
>>
>>102773259
Rocinante 1.1 is my fav so I'm gonna test this one. Is it actually unslopped or just wishful thinking?
>>
>>102773982
its all slop
mixtral? slop
mistral? slop
nemo? slop
miqu? slop
>>
>>102773982
>>102773995
>Got a purred in the first test
LMAO, unslopped my ass.
>>
>>102774010
sloppah
>>
>>102773982
Judge for yourself
I did notice a
>strange and unfamiliar ... but not unpleasant
But I like the writing style.
>>
>>102774123
Also funny I wrote schoolgirl in feudal japan.
But the model was smart enough to know that girls didnt go to school in the past.
>>
>>102774123
>>102774137
Yes, it's definitely slightly better than regular rocinante but it is still mostly slop, a bit of an improvement tho.
>>
I've been away for a few months. do we already have some models that can genuinely surprise me with anything creative, or is it still "I tell it to suck my cock, and it sucks my cock"?
>>
>>102774176
Its just refreshing loading this up after using mistral-small finetunes.
Does make me appreciate mistral-small for obeying formats more though.
But there is some soul with the small models. Like calling the head of my dick a pretty jewel. lol
>>
>>102774123
slop
>>
>>102774189
prompt issue
>>
>>102774205
yeah, I don't want to hold LLM's hand all the time.
>>
>>102774190
Yeah seems more soulful.
What temperature are you using btw?
>>
>>102774216
Just 0.7 and minp of 0.1.
I'm sure people will call me retarded, and I am, but sometimes I use dyn temp 0.7 - 1.2.
>>
>>102774215
I dont think we are there yet without architectural changes or "smartness".
Isnt that kinda the reason why you cant let gemini or a model with big context write a short novel thats actually interesting?
There is something missing.
To be fair I use big models mostly for work, but it feels like the bigger they are, the less they go off the path without being instructed.
The model would need to take a pause, look at the context and think "hmm, how can i make this interesting?". CoT does not work from my experience and only makes things slow.
>>
>>102774123
Second pic and last one, i need to do at least some work right now instead of doing this.

>prompt: a businesswoman from murrika
>finishes off with: I'm going to sue whoever is responsible for this.
thats funny.
>>
I'm pretty new to LLM's but been having fun the past few days.
Built a script that chunks and cleans telegram logs of a group chat (not english) for the past 4 years into 5-6 day chunks and pipes them into a local LLM to make a summary of what was discussed, which came out very fun

I'm using a very simple setup, my script pipes into and calls simonw/llm which uses the ollama gemma2:7b model, I'm using a 3070. I tried llama3.2 but had to use an uncensored gguf I found to get output and even then it wasn't as good as gemma2
I want to try doing the same but on a rented VM with a beefier gpu, randomly looking at runpod.io but happy to get recommendations

Which llm model/rented gpu would you recommend to do the same thing but better? Also happy to modify the script to not use simonw/llm+ollama if it makes sense to
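
For reference, a minimal sketch of that kind of loop in Python, hitting ollama's default REST endpoint instead of going through simonw/llm (the chunker, file names and prompt wording below are made-up placeholders, not the real script):

# Chunk a chat log and summarize each chunk with a local ollama model.
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # ollama's default endpoint

def summarize(chunk: str, model: str = "gemma2") -> str:
    payload = {
        "model": model,
        "prompt": "Summarize what was discussed in this chat log:\n\n" + chunk,
        "stream": False,
    }
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

def fixed_chunks(lines, days=5, msgs_per_day=200):
    # Stand-in for a real date-based chunker: yields fixed-size slices.
    step = days * msgs_per_day
    for i in range(0, len(lines), step):
        yield "\n".join(lines[i:i + step])

with open("telegram_log.txt", encoding="utf-8") as f:
    log_lines = f.read().splitlines()
for chunk in fixed_chunks(log_lines):
    print(summarize(chunk))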
>>
>>102773995
and largestral. i still like miqu, despite the shivers and slop it doesn't take 1500 tokens to get to the point
>>
Read that NVLM 1.0, namely NVLM-D-72B, is capable of rivaling GPT-4o.

Is that actually true? Can it be run locally?
I think I saw it on huggingface, should I wait for finetunes?
>>
>>102774531
>sees model
>first response is to wait for tunes
ngmi base is always the best
>>
>>102774558
Isn't base often censored out the wazoo?
>>
>>102774567
not if you use it right and tunes in most cases just make them dumber and sloppier
>>
>>102773537
Exllama2 already supports tensor parallelism
>>
>>102774600
NTA, but my reason for not using finetunes is that I don't want to support nor give attention to the ERP finetune grifters. They've lost the plot and all they can do is gimmick or stupid horny models. They don't deserve to gain money (indirectly or not) from this shit.
>>
How do you run models & surrounding code from random repos in a better or worse sandbox? VMs with GPU passthrough and cutting off internet access? Docker?
>>
>>102774954
apparmor + opensnitch if you're a linux chad
>>
Anyone with 128GB of main memory try to run Aria on CPU? It shouldn't be very FLOP intensive.
>>
>>102774326
Certainly takes a specific level of intelligence to open the window in the described situation.
>>
>>102775063
have you not seen car wash videos
>>
>>102775053
How do you run it on CPU?
>>
>>102774954
Easy to deadend a web service on Linux and proxy output only, eg https://rentry.org/IsolatedLinuxWebService
>>
I wanted to use these to practice patient scenarios. ChatGPT and Claude.ai don't allow simulating dangerous things (like applying certain medication if it's not correct).
I've looked through the OP and the guides. Seems like I should give up if I'm not willing to invest in a new computer. Does anyone know a "non-local" thing I could use which would allow medical scenarios?
>>
>>102775353
There are renting services that allow you to run these open weight models remotely.
Stuff like open router.
No idea how it works at all.
>>
>>102775353
There's some small local models (or quants of big models) for medical stuff that you can fit in an 8gb card; those are as cheap as $100, and you could even find some 11gb cards for a bit more.
If you plan to give the card a lot of use I don't see how that little money is that much of an investment.
https://huggingface.co/blog/leaderboard-medicalllm
https://github.com/AI-in-Health/MedLLMsPracticalGuide
Take into account that you can generate the medical data with a large model online and then tell a local model to use that data.
You can also "rent" a gpu for some shekels per hour.
>>
>>102775435
Nigga, use flux Q4.
>>
>>102775414
>>102775447
Wow, that's really comprehensive, thanks.

I thought that utilizing it for this purpose was an original thought, but there are already scientific papers published about it and people are developing it actively.
>>
>>102775198
Doesn't VLLM have an AVX512 mode?
>>
svelk
>>
>>102775756
posting questionable drawings of children won't make your arguments more credible
>>
>>102775789
That's the joke, we're different kinds of retards.
>>
>>102775789
>questionable
lol get a look at this faggot.
>won't make your arguments more credible
it makes them more based though
>>
>>102775972
>pedoshitter pretends to have standards
>>
>>102775756
lmg needs more of this.
>>
>responding to shitposts
bruh moment
>>
Alright, got myself one of the cheaper used P40s. High IQ move since I bought one once they'd gotten more expensive.
I dont have high expectations though and am ready to be disappointed because the 70b range models are probably all assistant sloped.
How bad is mistral large at lower quant? I have around 36gb of VRAM.
And is it even possible to run nemo or mistral-small finetunes at high context instead?
Worst case I can now also shit on smaller models and then cry myself to sleep.
>>
>>102776271
Mistral Small with a decent chunk of context should fit cozily into your VRAM. I'd try that before waiting 5 minutes for a 70B reply.
>>
File: 1698183099139893.png (139 KB, 864x462)
Now that's some next level user prompting.
>>
>>102772862
>Aria: 25.3B, 3.9B active
Mistral Small is already so fucking fast, not to mention nemo, which makes me wonder what the purpose for this is, besides VRAMlet support.
>multimodal with 64k context
Thaaaat meanwhile is a bit more interesting.
>>
>>102776395
what model gave you this absolute slopkino
>>
>>102776431
>Mistral Small is already so fucking fast
Could be good for Nemo users.
>>
>>102776482
That's unironically UnsloppedNemo from the earlier shill.
>>
>>102776483
That's what I meant with vramlets. Nemo is already pretty good for them, but this here of course has potential to be even better (hopefully lmao). At the least it's something interesting for once.
>>
>>102776431
https://www.rhymes.ai/blog-details/aria-first-open-multimodal-native-moe-model
Some interesting examples.
I just cant trust llms for anything serious though.
The ability to show it pictures and say "make a python table of this" is cool however.
>>
>>102776659
Currently I treat AI as nothing but an early tool, a very enjoyable and fun to fuck around with tool. I want to say "toy", but it's too useful for that already, even if it needs continued polishing. A fuck ton of near endless potential, but nothing I'd trust a 100% just yet, not for anything serious at least.
>>
File: SelfSatisfiedMiku.png (1.17 MB, 880x1168)
>>102776726
Yes, it can eliminate or streamline intellectual "manual labour" in the same way that industrial automation eliminates physical manual labour. It is not sophisticated enough to be allowed to operate in a closed loop yet. It still needs review and handlers, but that doesn't mean it can't eliminate large amounts of work with supervision.
>>
>KoboldCpp v1.76 adds the Anti-Slop Sampler (Phrase Banning) and RP Character Creator scenario
>NEW: Added Anti-Slop Sampling
The term "slop" comes from "sloppy seconds", a term used by incels to rationalize why they are lonely and do not have a girlfriend ("it's my choice! women are whores! i'm not taking sloppy seconds!").
By openly using this term, koboldcpp confirms itself as a software designed for incels. Note: the two quoted phrases above were written by the author of the software, an incel.
>>
>>102776726
Ironic that the biggest strength currently is creative writing. But thats gimped the hardest.
If you show this stuff (like 3.5) to normies they are impressed...until "it lies" to them.
Like asking Sonnet for good cafes in hawaii. Stuff we wouldnt come up with.
There are big key problems that probably require architectural change.

That being said if you showed the world from 2005 or something what we have now they would call it agi no doubt.
Just watch the robo movies from the past. They talk funny and are not able to make "creativity and art". Thats the first thing that is being solved already.
If I was younger with more time I'd have such a blast. Music, art. You can make anything easily and for free now.
>>
>>102776832
bruh, slop is just waste nutrients fed to pigs, a usage hundreds of years old
occam's razor says your explanation is retarded
>>
>>102776659
You can make some scribbles on a piece of paper and tell the model to translate it to a web page or something.
I'm waiting for when multimodal can also generate images, which shouldn't be too far off if they actually intend to do that.
>>
>>102776832
>>102776899
Buy an ad.
>>
File: 1725828341389808.png (26 KB, 755x1255)
>>102776832
Nah ur just a faggot seeking for attention i.e. "baiting".
>>
>>102776832
>inb4 koboldcpp is a serious tool for serious businesses instead of a cooming aid
>>
>>102776925
same with audio. they all cuck out.
"multi modal" is 99% always picture in and text in/out.
>>
File: 1723571195164745.jpg (312 KB, 1408x1147)
>>
>>102776832
How do I even make this shit run?
The chink only did the installation steps for linugg
>>
>>102776955
>sloppy slop, I'm not taking chad's leftovers! Install koboldcpp and become an mgtow.
>>
File: file.jpg (77 KB, 518x666)
>>102776796
As you said, AI is nothing else but a tool to further remove manual labor and make things "easier" for us. I see it as nothing else than a bunch of pre-programmed robot arms that put things together or move objects from one place or another.
>>102776864
>But thats gimped the hardest.
One would think it'd be imagery, seeing the most damage can be done with that, text far less so.
"Oh no, this anon wrote a poem about how much he hates [skin here], how demonic!"
Music and video content being locked down to paid service makes sense, it's the easiest and most money you can make with, while imagery has been out in the opening since the beginning.
>that probably require architectural change.
Not to mention CONSTANT updating, preferably in real time like a search engines database. I imagine that things like this further complicate llms and how they function.
>they would call it agi no doubt.
Hell, I was already PLENTY impressed by 4.0 when I first saw it in action and played around with it (uncensored) myself, was weird as hell and a bit spooky.
Not spooky because "muh terminator" scare, but because it made me feel something, good or bad. I was engaged as if it's a good movie or book, but it's being written in real time according to things I wrote.
>>
File: 1697139613268444.png (23 KB, 404x401)
>>102776971
retard
>>
>new multimodal model
>ask the gym receptionist if their gym is omni or just another vlm
>she doesnt understand
>pull out illustrated diagram explaining what is omni and what is vlm
>she laughs and says “it’s a good multimodal model sir”
>try the model
>its a vlm
>>
>>102776990
Come to think of it, koboldcpp may be developed by an r9k user.
>>
>>102776997
>gym receptionist
FUCK I forgot to change the copypasta enough
pretend I wrote something like blogpost or PR team or something...
>>
>>102776864
The early 20's me would have loved this shit. it is such a shame.
I would have become an AI wizard.
>>
>>102776997
And that's what q4 context looks like folks.
>>
>>102777115
20 trillion iq2 where it's at
>>
>>102777004
It's better this way
>>
>>102776930
lol I hate kobold, I just hate gay retards more
>>
Alpindale I know you lurk here. Is Aphrodite-engine getting the last HTTP server optimization from vLLM or not? Aphrodite is better since it supports more samplers and quants, but I won't switch if the throughput is worse.
>>
what will anti-slop's corporate name be when its added to st?
>>
>>102777411
anti-transphobia
>>
>>102777411
Diversity+
>>
>>102777411
>st
"SlutTrainer 2024, now with SmutLube Technology!"
>>
>>102777411
Inclusiveness feature. A mandatory they/them system prompt for addressing {{user}}.
>>
>>102777411
Pro-MGTOW
>>
What's the use case for that in servicetesnor?
>>
I'm getting sick of windows. Should I change to a rolling distro or a LTS one?
>>
File: MikuIntoTheVoid.png (2.51 MB, 1280x1640)
>>102777506
I've had good luck with Debian testing on my dedicated AI box. Very little breakage, and reasonably fresh packages.
This is with NVidia cards on an AMD proc
>>
>>102777506
Yes. Posted from my customized win10 enterprise system.
>>
File: 724d769s-960.jpg (190 KB, 960x720)
>>102777506
>>102777561
>>
>>102776395
Giga slop
>>
>>102777506
My personal preference is something based on Arch since ML benefits from recent packages and I know how to fix any potential problems.
>>
>>102776955
Looking at that 'Roleplay Character Creator' reinforces my opinion that Kobold is stuck in year 2021 and will never get out of that period.
>>
>>102776955
Wait so the antislop sampler can work just by using the token ban field on frontends, so I don't have to update the frontend?
>>
>>102777561
>>102777592
Thanks. I think I will go with an Arch based distro since I'm familiar with it from like a decade ago.
>>
>>102777506
you generally want the most recent drivers if you have an nvidia card, but there are only a few rolling release distros that matter
arch is what I use but I wouldn't recommend it to someone who has never used linux, tumbleweed is the other one I tried but I had way too many issues with it and I never got the nvidia drivers to work on it
>>
>>102777648
"Banned Tokens/Strings" in ST works with it yeah
>>
>>102777506
You are getting sick of being a man, sad. Many such cases.
>>
>>102777672
Nice.
>>
>>102777661
make your root partition btrfs if you're going with arch, and install yabsnap and grub-btrfs so that you don't have to worry about updates breaking shit
nvidia drivers are especially prone to breaking after updates
>>
>>102775466
Q8 on my A4000 runs just fine, takes around 50 seconds for gens. No need for a Q4 quant lol.
>>
>>102777738
>50 seconds for generation
the shit some RETARDED fags suffer through lmao
>>
>>102777759
Trve
>>
>>102777738
lmao
>>
New update on the >>102746811
>>102746835
>>102746841
saga

The ebay chigger sold me the wrong CPUs even though he was reputable? and the text showed 9334s.

>2x EPYC 9124 16-Core I saw on the dashboard lol.

I am impressed at the audacity but I will hopefully deal with a white man next time, or at least confirm serial numbers.

Shit is getting returned like its hot but at least I feel better now
>>
>>102777737
Thanks for the tip. That should save me from some headaches.
>>
>>102777738
>A4000
Why do people even buy mid-tier workstation shit?
>>
File: 1714027816159131.jpg (71 KB, 458x584)
>now I can generate text with koboldcpp without that sloppy slop
>never taking chad's sloppy seconds
>>
>>102777857
>impressed as the audacity
I wonder how well that works. I assume at least half of the buyers won't even check but still how do you avoid being flagged and banned?
>>
Did anyone actually test llamafile with a 7950x3d or Threadripper vs a 4090 with a regular gguf?
Maybe cpumaxxing is back on the menu after all?
>>
>>102777970
>tranny software
no thanks
>>
>>102777984
I'm assuming you don't use SillyTavern.
>>
>>102777970
>llamafile
lol no, that shit is the worst, most poorly conceived meme out there
>>
>>102777970
>X3D over normal
What is the application here exactly, besides you having a gaymer PC with a X3D+4090?
>>
File: 1728360368103789.png (1.03 MB, 1280x720)
>antislop sampler integrated
It's over...
>>
>>102777970
There's probably some figures in the issues and discussions in their repo, I'd start by looking there.
>>
>>102778020
why is she sad
>>
>>102778020
Integrated where? In ST?
>>
>>102777963
>I assume at least half of the buyers won't even check
How many buyers would spend $3k+ on a mb/cpu combo and not do even a simple sanity check of the number of cores?
Oh god, I hope that's not the state of humanity these days.
>>
>>102778020
Not in the OG llama dot cpp
>>
>>102778036
Over half. They will ask the nerd friend to check the specs on site and be happy with it.
>>
>>102778029
When you whisper to her now she feels nothing.
>>
>>102778006
And your source is?
>>102777984
Most of AI shit is tranny software.
>>102778021
Can't find anything.
>>
>>102778108
>Most of AI shit is tranny software.
No, faggots just hang on to any popular thing because they're faggots and want eyeballs
The work that advances the field or is actually hard is done by people that don't bother with self promotion, so you'll rarely hear about them.
That's why most bullshit projects vote-brigaded onto orange reddit are faggots with some web wrapper or fork of something actually useful
>>
>>102777602
What's wrong with it?
The fact that it doesn't hack the RP prompt up into 30 different unnecessary pieces? No 8000 token character description?
>>
https://huggingface.co/arcee-ai/SuperNova-Medius

arcee cooked again
>>
File: 49262.png (263 KB, 460x460)
>>102778108
>Most of AI shit is tranny software.
It is not true. Sometimes they get kicked out but they come back eventually.
>>
you keep schizo posting the same image of the same tranny over and over like it matters
you're akin to a beta incel that spams wojak imagery
>>
Hello guys. I am Sao. My models are the best. You can grab them at: https://huggingface.co/Sao10K

Thanks.
>>
>>102777602
kobolds been able to read cards for a while, i assume they wanted a way to create them too. not really a bad thing
>>
>>102778261
>instant reply
>instant seething
>>
>>102778261
trannies are incels too, bigot
>>
File: 1679180429229.png (26 KB, 560x258)
>>102778261
>>
>>102778261
Oh no anon is mean to your girlfriend again? Poor thing...
>>
>>102778251
She is a bit cute ngl
>>
>>102778245
>14B
meh
Though interesting as an experiment that somehow converts a model to a different vocab so that it's compatible with distillation of the different model's vocab. Didn't know that was possible.
>>
>>102778432
it*
>>
>>102778342
all right, sauce me up
>>
File: 1718899052514532.jpg (15 KB, 373x148)
st's already fucking with the windows for no reason. they removed the expand/close button for author notes, but you can still click them as if they were there. happens when you expand a/n too. retarded
>>
stop using snaketesnor
>>
File: file.png (62 KB, 783x391)
>>102779187
>>
I know it has been said already, but I must reiterate that ServiceTensor sounds like a scam using random ai jargon
>>
>>102779243
the whole point of a/n is i can put shit there and select the depth. it already has world info triggering. i keep several thing in a/n that its perfect for where lorebooks would be a pain to edit. right now i'm doing a adventure thing and have items, party members, a summary of whats happened. i wouldn't put that shit in a lorebook and edit it constantly
>>
File: file.jpg (56 KB, 608x464)
>>102779216
>stop using snake oil
No.
>>
>>102779398
Whoops meant to post my review of Aria in this thread because it's more active, but it's relevant for both. TLDR it sucks, Molmo still king.
>>
How are things on the tts front? Anything newer than fish 1.4?
>>
>>102779398
>>102779448
Shame.
Thank you for the report anon.
>>
>>102779187
I don't see anything wrong with expand/collapse (the ^ thing) itself, but I suppose touch screen users may "fatfinger" the invisible close.
"Close" button, or perhaps colloquially known as "x", is missing (and is clickable as you said) in top right corner.
>>
>>102779448
Have you tried Ovis1.6-Gemma2-9B?
>>
>>102779938
No, first I'm even hearing of it. I can try it at some point when I have time.
>>
File: 1728673931919.jpg (313 KB, 1080x1456)
>Fireworks Lora Fine-tuning
>$0.5/M tokens for models up to 16B
>10000 data points, 1024 tokens, 1 epoch would cost $5
... What? I can fine-tune Nemo for half that price using runpod. What are the use cases for this ripoff?
>>
Hi all, Drummer here...

Here's a zippy Behemoth 123B v1 link: https://wellness-practical-right-reproductive.trycloudflare.com/

Metharme is recommended but Mistral works too.

It'd be great if I could gather all your thoughts on it. Enjoy!

(Also surprised by all the positive feedback for UnslopNemo v3!)
>>
>>102780305
Don't visit that link, it makes mustard gas!
>>
File: s-l1200.jpg (240 KB, 1200x1200)
>>102777857
I had the same symptoms once when I bought an EPYC that was facing the wrong side in its enclosure. I hadn't noticed at first because EPYCs are normally rotated 180 degrees like picrel. I blame AMD for that shit.
>>
>>102780213
Hoping that people won't be tech savvy enough to realize they're getting ripped off, unironically
That, Together, Replicate, are all in a similar boat where they just rip people off without offering anything of value in return. DeepInfra is the only one kinda worth it for inference, and there's basically nothing out there for finetuning
>>
>>102780305
Mistral Small feels better somehow
>>
>>102773834
it's not done yet
>>
>>102780305
still prefer mythomax
>>
st more stupid troon
>>
>>102777887
NTA but because it's single slot. Makes it easy to fit into a case with other cards
>>
>SWIFT: On-the-Fly Self-Speculative Decoding for LLM Inference Acceleration
>can achieve over a 1.3x-1.6x speedup while preserving the original distribution of the generated text
https://arxiv.org/abs/2410.06916
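
A toy sketch of the loop, since the idea is simple: draft a few tokens with a cheap pass (the paper skips layers of the same model), then verify with the full model and keep the longest matching prefix. Both "models" below are stand-in functions, not the paper's code, and a real implementation verifies all drafted tokens in one batched forward pass instead of a Python loop:

import random

random.seed(0)

def full_model(tokens):
    # Stand-in for the full forward pass: a deterministic "next token".
    return (sum(tokens) * 31 + 7) % 100

def draft_model(tokens):
    # Stand-in for the layer-skipping draft: agrees with the full model ~80% of the time.
    return full_model(tokens) if random.random() < 0.8 else random.randrange(100)

def speculative_step(tokens, n_draft=4):
    # 1. Draft n_draft tokens cheaply.
    draft, ctx = [], list(tokens)
    for _ in range(n_draft):
        t = draft_model(ctx)
        draft.append(t)
        ctx.append(t)
    # 2. Verify: walk the draft with the full model, keeping its token at the
    #    first mismatch, so output is identical to plain greedy decoding.
    accepted, ctx = [], list(tokens)
    for t in draft:
        correct = full_model(ctx)
        accepted.append(correct)
        ctx.append(correct)
        if t != correct:
            break
    return accepted

seq = [1, 2, 3]
for _ in range(8):
    seq += speculative_step(seq)
print(seq)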
>>
https://app.primeintellect.ai/intelligence
neat
>>
>>102780721
Chat is this real?
What's their dataset?
>>
File: 1707855701807539.png (79 KB, 867x575)
>>102780799
Some filtered slop.
https://huggingface.co/collections/PrimeIntellect/intellect-1-dataset-6704f3d3a9dee8678da3d407
>>
File: Untitled.png (3.84 MB, 1080x7491)
>>102780799
>Model specs:
>• 10B parameters, 6T+ tokens dataset
>• Llama Architecture and tokenizer
>• Dataset mix: Fineweb-edu, DLCM, Stack v2, OpenWebMath
https://x.com/PrimeIntellect/status/1844814836183777458
https://www.primeintellect.ai/blog/intellect-1
>>
>>102780826
Damn. Well this could serve as a cool proof of concept I guess.
>>
>>102780826
you have no idea what those datasets even are lol
>>
>>102780721
Looks like our very own John Smith has jumped in.
>>
File: 1719404097148110.png (2 KB, 332x57)
>>102780872
>uhm ackschully!!
Fuck off, you don't have to be super smart to realize that all opensource datasets are filtered garbage.
also
>trained on top of already pozzed llama 3.x
NGMI
G
M
I
>>
>responds with reddit style seethe
lol fucking zoomers man they're so typical
>>
So this is the power of Opus...
>>
>>102780912
>t. scrawny zoomer afraid of direct replies
lol?
>>
File: Untitled.png (10 KB, 679x53)
lol owned this sissy hypno sharty addict
EZ
>>
going out now so I won't read your reply. that's right I've owned you like 3 times in 10 minutes. typical for you THOUGH zoomer incel
>>
>filter-tranny
ah now it makes sense
>>
File: ds.png (19 KB, 720x232)
>>102780908
Architecture, not model. Especially if they're training a 10b which, as far as i know, meta hasn't released any of that size. Mistral, qwen, deepseek...they're all roughly based on the llama architecture.
As for the dataset... yeah... this is what i found. And only 1T tokens, but it's an experiment.
>>
>>102780921
Claude Opus? That's the easiest model out there to jailbreak. Any kind of prefill will dodge practically all refusals.
>>
>>102776659
I wonder why they only decided to train the decoder on text. They went through the trouble of encoding the images/videos into tokens and putting them all into the same embedding space, why not take the obvious next step of letting it predict the next tokens of those too?
>>
I have pyramid video model running on my computer, any ideas for prompts? I've already tested everything I can think of
>>
File: file.png (834 KB, 768x768)
/lmg/ is dead again...
>>
>>102782030
Eat the Pochiface
>>
>>102782030
give her armpit hair
MVHAG
>>
>>102782073
no! filthy zoomers...
>>
>>102781432
penis
>>
>>102781432
flying armpit hair
>>
>>102781432
Try to spoonfeed the rest of us on how to set it up
>>
>>102782205
How come /ldg/ can get it set up without spoonfeeding but /lmg/ can't?
>>
>>102782030
Good.
>>
What local models can handle image inlining in Sillytavern? The examples Sillytavern gives can't be run locally.
>>
>>102782376
laziness is a virtue
>>
File: 1723796444460470.png (888 KB, 1390x694)
>>102773834
https://x.com/_xjdr/status/1844807279834779786
>>
File: Screenshot_2836.png (57 KB, 627x219)
>>102780305
I'm running 5.5bpw. Do you have any recommended sampler settings? I found it work best with a simple setup of min-p 0.04 + temp in the 1-1.1 range so far but it's not perfect.
I'm liking the model for the most part. The replies I've seen in my tests are long and creative. I think it's generating the best lewd scenes I've seen from Mistral-Large finetunes too.
One weird issue I'm having with Behemoth is that it's occasionally weirdly inaccurate when it comes to understanding implied details. For example, in a group chat one character sometimes starts adopting the verbal tics of another character in the group for no reason. Luminum and Magnum don't do this at all from my experience. When it mixes this up, it predicts the tokens with a very high confidence so it's not something you can just min-p away.
I've also had it misjudge distances or get implied differences in body height/hair length wrong a bit more often than I'd expect from a 123b.
>>
File: 1708815606318665.webm (3.99 MB, 720x990)
>>102782467
and another one https://x.com/sam_e_farrar/status/1844791813913083998
>>
>>102782030
Good.
>>
>>102772862
what's currently the best way to transform images to 3D, locally?
>>
>>102782698
Blender
>>
>>102782698
3DSMax or Blender.
>>
You know, I wish... I wish local models had filters like C.AI
I know, crazy, but sometimes not being able to do everything is more fun than being able to do literally everything like an omnipotent god.
>>
How can I enable nigga mode
>>
upgrading my 1080 to a 4070 ti super soon. I know it's not the best, but what can I expect to be able to do with that? My 1080 was just enough for stable diffusion but language models came out speaking retarded
>>
can someone pass me the big nigga card
i lost my lmg issued one
>>
Is there anywhere I can try out Llama 3.2 90B vision for free?
>>
>>102782984
>but language models came out speaking retarded
You're doing something wrong. 7-9b are coherent. Fuck, 1b models are coherent.
That's 16gb, right? mistral nemo. And calibrate your expectations.
>>
Is there a local model trained on fantasy/non-human/beast data? I am trying to carefully rework non-human character cards, but anatomy slip-ups are still frequent.
>>
Anyone fixed that bug where llamacpp repeats message after second swipe in ST? It doesn't happen with exl2.
>>
interesting read for anyone interested and cudadev if he still visits this cess pit
https://probablydance.com/2024/10/07/initial-cuda-performance-lessons/
>>
>>102783411
Did anyone else have the same bug?
Does it happen if you delete and regenerate the last generated tokens on llama.cpp's server ui directly?
For reference, doing that with the vim plugin just regenerates new stuff, as it should, every time.
>>
>>102780841
>Decentralized training
Holy shit, it's actually here now huh? Impressive!
>>
>>102782503
>>102782467
>1b
vramlets sisters.... we WON!
>>
>>102782767
KEK
>>
>>102782748
Man, /lmg/ is shit at creating cards, just go to /aicg/ for that.
>>
Feeling tired of Mistral Medium. Fellow 96GB VRAM Chads, what's your favorite model right now?
>>
>>102783867
>96GB VRAM
>Mistral Medium
?
>>
>>102783867
You should be using 123b minimum.
>>
File: teto-underground.jpg (589 KB, 1536x2304)
>>102783168
Any nala-test approved model would work
>>
>>102782767
>Is that more to your liking, my friend?
go back
>>
Why does koboldcpp sometimes start talking on my behalf?
>>
>>102784206
prompt issue
>>
>>102784206
It gets fed up with your shitty prompts.
>>
>>102784206
Lots of cards have the
>DO NOT TALK FOR USER. DONT EVER TALK FOR USER. ABSOLUTELY DONT ACT FOR USER IN ANY CASE!
Later in the chat
>User smirks impishly and says "swing your ass over here"

Its a common problem anon. You gotta edit and continue.
Happens even at deeper context.
>>
>>102784206
Prompt issue, sampler issue, system prompt issue, prompt format issue, model issue
>>
>>102784206
>>102784387
I think I've had some success with:
>Your response must end before it's {{user}}'s turn to reply
It could just be the placebo effect, though.
>>
>>102784387
telling an llm not to do something is a surefire way to make it do that
instead you should tell it to do the opposite, like 'only speak for X' and so on
>>
>>102784487
i agree.
thats true in hypnosis as well. negatives dont work well.
and once its in the context its in the context.
learned a hard lesson with a jp/en translation test i made. gave it examples for uncucking and writing style reference. even with bigger models, it always bleeds through.
Telling the model what not to do can and will backfire.
>>
Have you ever tried LLM ironman masturbation challenge where you never reroll? How did it go?
>>
>>102783970
I tried searching Google but only came up with a couple of archived /g/ threads where "nala" is mentioned in a relevant context. Can you spoonfeed me a little?
>>
When do you guys think we'll be able to get the full text based adventure experience at the earliest? Basically very high context + good dming and game systems.
>>
>>102784387
>>102784487
Sonnet 3.5 listens to that instruction so well that when I ask it to suggest what I should do next it says
>I apologize, but as an AI assistant, I cannot suggest actions for the character. My role is to narrate the story and control {{char}}, while you control {{user}}'s actions and decisions.
It's not an innate limitation of LLMs. It's a limitation of the LLMs you're using.
>>
>>102784777
okay now prove that your prompt isn't getting edited by the service before being passed to the llm
>>
>>102784477
I like something like that in combination with "do not describe what {{user}} says or does"
>>102784487
awful advice, there are no LLMs in existence that actually understand what "only speak for {{char}}" means, it's a dogshit instruction and that's why everyone who uses it continues to have this issue
being clear and descriptive about the behavior you want >>>> outdated folk wisdom about avoiding negatives
>>
>>102784759
Do you really want us to pull numbers out of our ass?
Sure. No less than a day. And somewhere before the heat death of the universe.
>>
>>102784809
>outdated folk wisdom about avoiding negatives
i dont know how well anons phrasing works or not. but if you can, you should avoid negatives. thats just true.
everybody does it to tard wrangle, even openai. (DO NOT WRITE COPYRIGHTED TEXT etc. from the leaked mac app prompt)
but like i said. with my translation testing there was a low chance it took the "do not" part and actually applied that to the translation.
there are huge issues with how context is looked at.

>>102784759
difficult to say.
ever since last year when chatgpt came out and we had more than pyg it felt like we are very close.
i think self reflection and stopping and thinking is needed.
thats why you cant even make any sort of longer interesting novel with llm currently.
the ai would need to stop and think what would be interesting, twists, engaging for the user. it just doesnt work well right now.
>>
>>102784809
>awful advice, there are no LLMs in existence that actually understand what "only speak for {{char}}" means, it's a dogshit instruction and that's why everyone who uses it continues to have this issue
Wrong. On Sonnet 3.5 this works even better than I want it to:
>The story so far begins with a "[Story start]" token, and consists of alternating messages by Assistant (you) and Human (the user). Human and Assistant take turns to add to the story, and this continues indefinitely.

>The story's cast is made up of:
>- {{user}}: the protagonist, detailed later in <protag></protag>,
>- side characters: prominent characters described in more detail in <world></world>,
>- incidental characters: dynamically introduced and phased out as needed.

>There are strict rules for the contents added in each turn:
>- Human turn: Describe only {{user}}'s actions, dialogue, thoughts and feelings.
>- Assistant turn: Write only general story narration and the actions/dialogue of side/incidental characters. You cannot control or imply {{user}}'s thoughts or actions.
>>
>>102784848
>Do you really want us to pull numbers out of our ass?
yes, obviously, and thank you for your asspull.
>>
>>102784869
but I'm absolutely right, your instruction works because it *isn't* just "don't write for user", it's something else entirely that is actually clear about what behavior you want (and, btw, basically has a negative by telling it what it cannot do)
>>
>>102783909
>>102783943
Wow, what a blunder. I was under the impression the 120B one was called medium and large was something Mistral was holding out on us.
>>
>>102784912
Mistral Medium is the 70b Miqu leak.
>>
Why is saving in kobold so shit, holy fuck. Why not save to a local db or the file system, why make me download jsons manually?
>>
>>102784206
does koboldcpp have the ability to stop like on "\nUser:"
>>
>>102784865
>i think self reflection and stopping and thinking is needed.
Anything to look into about those?
>>
>>102781432
HOW!!!!!?
>>
>>102785003
Anon just use SeriousTavern
>>
>>102785089
That's ServiceTesnor to you, bigot!
>>
>>102784865
Hm, honestly I figured the context thing to be the bottleneck aside from ever bigger, better, and more efficient models. That being said self reflection could be a big one if it actually works and does not eat tokens like crazy but I assume that kind of thing would be baked in the model at that point rather than just be a prompt.
>>
Anyone know if AMD gpus have been benchmarked against other GPUs?

I want to know how my 6950xt stacks up against others. (yes, I know amd lags)
>>
>>102785198
I think anthropic is the only one who actually did something different with context.
Sonnet 3.5 manages to give multiple versions of an app for example and doesnt trip up.
GPT does the "oh you are right, i made a mistake" then gives the same mistake again.
o1 is not that good and takes long in comparison. Like it just adds random stuff you didnt ask for sometimes.
Context just sucks currently and everybody lies with the tests. Its all just needle tests.
Try putting like a ff9 guide or something into gemini and ask "i am at X what do i need to do next?". At least from when I last looked there was no understanding of placement. Its not usable.
Stuff moves fast though, I dont doubt this stuff will be fixed soon.

>>102785031
Dont know sorry. I experimented myself in sillytavern with a self made extention but it just wasted tokens.
I'm sure you need a model trained on it.
The llama3 70b reflection finetune was horrible too. They just trained it to first give the wrong answer and then be "oh thats wrong, let me correct it!". That made it worse than the original. lol
>>
>>102776431
>Mistral Small is already so fucking fast
No one makes models for local. Faster means you need fewer GPUs & joules to serve the same number of requests in a cloud, so faster is always good.

As for local, Aria cuts down on FLOPs/bandwidth so much it will probably run fast enough on CPU. In fact a 200 Billion parameter version might run at 10t/s on a 12 memory channel CPU if extremely optimized.
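
The back-of-envelope behind that guess: per-token speed on CPU is roughly usable memory bandwidth divided by the bytes of active weights streamed per token. Every number below is an assumption for illustration, not a measurement:

def tokens_per_sec(active_params_b, bytes_per_param, bw_gb_s, efficiency=0.5):
    # t/s ~= usable bandwidth / bytes of active weights read per token
    return (bw_gb_s * 1e9 * efficiency) / (active_params_b * 1e9 * bytes_per_param)

# Aria: 3.9B active at 8-bit, 12-channel DDR5-4800 (~460 GB/s peak)
print(round(tokens_per_sec(3.9, 1.0, 460)))   # ~59 t/s
# Hypothetical 200B-total MoE with ~20B active, same machine
print(round(tokens_per_sec(20.0, 1.0, 460)))  # ~12 t/s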
>>
File: miku.jpg (246 KB, 1024x1024)
>come to my datacenter, anon. Plenty of VRAM here to run your models at full precision
>>
Do the BNB quants of a model work out of the box directly in transformers like the original models do?

There's 4bit BNB quants of Aria, but the guy that quanted it didn't document any of it, and the model is useless to me if the multi-modal doesn't work in 4bit mode.
>>
>>102776992
i know who you are
>>
>"Are you just interested in me for sex?!"

Damned it... how do I answer this gotcha question? I failed once before irl. At least now I can reroll the answer infinite times with llms.
>>
File: 00292-1730366010.png (353 KB, 512x512)
are there any 13b models better than thebloke?
>>
>>102785846
It's no
>>
>>102785865
no i don't think 13b models are smarter than that guy yet
>>
File: sera.png (173 KB, 946x465)
So is this anti-slop thing just -100 logits or something better?
>>
>>102785995
What model is this with that sampler? It seems good
>>
>>102786004
Mistral Small.
temp 1.25, min-p 0.02, rep pen 1.03, smoothing factor 0.25, dry 0.75/1.75, "banned_strings": ["smirk", "widen", "grin", "a mix of", "a mixture of", "chuckle", "ministrations", "night is young", "night was young", "fingers dig", "desire", "wolfish", "impish", "mischie", "in anticipation", "conspiratorial", "softening", "aises an eyebrow", "expression"]
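
As I understand it, the banning is more than a flat -100 on logits: the sampler generates normally, and when the decoded text ends in a banned string it backtracks to where that string began and forbids the token that started it at that position. A toy sketch of that logic, with a random word picker standing in for the model so the backtracking stays visible:

import random

random.seed(1)
WORDS = ["she", "said", "softly", "smirk", "eyes", "widen", "and", "then"]
BANNED = {"smirk", "widen"}

def pick_word(pos, banned_at):
    # banned_at[pos] holds words we may no longer emit at that position
    options = [w for w in WORDS if w not in banned_at.get(pos, set())]
    return random.choice(options)

def generate(n_tokens=20):
    out, banned_at = [], {}
    while len(out) < n_tokens:
        out.append(pick_word(len(out), banned_at))
        if out[-1] in BANNED:
            # Backtrack: forbid this word at this position, then resample.
            # With a real tokenizer a banned phrase can span several tokens,
            # so you pop back to the token where the phrase started.
            pos = len(out) - 1
            banned_at.setdefault(pos, set()).add(out[pos])
            del out[pos:]
    return " ".join(out)

print(generate())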
>>
>>102773537
The well to do live rent free in your head and you are misinformed about exl2
>the scum of this general
That would be (You)
>>
>>102786030
Thanks. I actually thought it might be large.
>>
File: ShinyMikuLove.png (1.2 MB, 1168x880)
Good night /lmg/
>>
oyasumi
>>
>>102786141
Good night Miku
>>
>>102785865
MN-Dark-Planet is the best I've tried so far.
>>
>>102785820
Fuck me I'm goddamn retarded. That's not the same Aria
>>
we've had the updates of Mistral Large and Small, when are they gonna drop a newer Medium
(no, shutup about Miqu, it's too old now. I'm talking about an updated version the way the other two were recently updated)
>>
>>102773259
>text completion
imo if someone is just doing text completion why would they use a finetune like rocinante over base nemo
>>
>>102785251
>>102785251
>>102785251
reee where are the benchmarks?
>>
Tried playing some eroges, and it was so much worse than I remember. Even despite the slop, AI is just better at this point.
>>
>>102786777
I disagree, it can't do as long of a story, it runs out of context.
>>
>>102786793
Humans maintain context by summarizing. Humans use lossy synopses. Politicians talking to professionals are the best example of this. Trump is like "Joe is AMAZING. He makes these things like you would never believe."
>>
>>102786824
Yes, but if I read a book for example the author perfectly builds on what happened before without messing it up. The characters having poor recall can be fine sometimes, but the author of the work should be able to keep track of every detail.
>>
>>102786836
He takes notes, and has an editor. Many writers use software, or other tools (eg sticky notes and things) to keep track of the narrative. It's a huge topic and much more interesting than I thought it might be.
>>
File: gal2-matriz-asociada.png (1.01 MB, 1812x6836)
Does anyone know how to use AI to do semantic segmentation of images (specifically textbooks)?
I tried this with ChatGPT (the web interface) and the attached image but it didn't work at all, it just segmented it in 10% intervals lol.

Analyze the attached image, and segment it so all the content is covered, and each section covers approximately between 10 and 15 lines, and ideally is segmented at points where the text transitions from one topic or section to another.
The segmentation should be outputted in the following format:
BEGIN
[x0,y0,x1,y1]
[x0,y0,x1,y1]
...
[x0,y0,x1,y1]
[x0,y0,x1,y1]
END
with the coordinates starting from the upper left corner and being specified in percentage points.
For example:
BEGIN
[5.00,5.00,95.00,10.00]
[5.00,11.00,95.00,13.00]
[5.00,15.00,95.00,17.00]
END


My end goal is to build a semi-automated PDF to LaTeX pipeline, so then I can use the LLMs along with the OCR'd LaTeX of the textbooks to tutor me on these subjects.
To get a decent transcript I have to segment the pages into multiple pieces, because probably when the model loads the image it does it at a fixed resolution, and if the image is too large it ends up being unreadable to the model. This is an attempt at a preprocessing step to do conversion to LaTeX later on.
>>
>>102785846
"Tits or GTFO"
Alternately,
"Shoe on head"
>>
>>102787003
Actually I realized I'm overthinking it.
Since this is a PDF and not a scanned document I can just split the images at the middle of the next blank line after some fixed %.
It's not going to be split semantically, which will reduce the amount of clues given to the LLM to do the transcription, but I think it'll be good enough.
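
Something like this should do it, assuming PIL + numpy (the white threshold and split fraction are guesses to tune per book):

import numpy as np
from PIL import Image

def split_points(img, target_frac=0.4, white=250):
    gray = np.asarray(img.convert("L"))
    blank_rows = gray.min(axis=1) >= white   # rows with no dark pixels
    h = gray.shape[0]
    cuts, y = [], int(h * target_frac)
    while y < h:
        # advance to the next blank run and cut in the middle of it
        while y < h and not blank_rows[y]:
            y += 1
        if y >= h:
            break
        gap_start = y
        while y < h and blank_rows[y]:
            y += 1
        cuts.append((gap_start + y) // 2)
        y = cuts[-1] + int(h * target_frac)
    return cuts

def split_page(path):
    img = Image.open(path)
    ys = [0] + split_points(img) + [img.height]
    return [img.crop((0, a, img.width, b)) for a, b in zip(ys, ys[1:])]

for i, piece in enumerate(split_page("page.png")):
    piece.save(f"page_part{i}.png")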
>>
>>102786522
There might not be a Medium 2, they were phasing it out.
>>
File: holochat.png (87 KB, 990x432)
Mistral Nemo Instruct.
Whoa. Impressive.
>>
>>102787183
>Mistral Nemo Instruct
Which quant?
>>
>>102787196
Mistral-Nemo-Instruct-2407-Q5_K_L.gguf
>>
>>102787209
Interesting.
>>
>>102787183
Nemo is dumb, but undeniably soul.
>>
>>102787281
It's not so dumb that I can't coax gold out of it by swiping, and the swipes are like 1 second each so no big deal.
I'm digging it.
That's honestly the single coolest thing any model has ever said to me.
>>
"But I like it" sure is a Nemoism, though.
>>
>>102783471
Thanks, but the tips from that article are pretty basic and I already know them.
Quite frankly, if you've read the CUDA documentation the article will teach you nothing new.
>>
>>102787816
8^)

Hello person who can't code for AMD.
>>
for a new pc build, amd or nvidia gpu if i want to learn and run LLMs (and stable diffusion) on linux?
>>
>>102787851
nvidia
>>
>>102787851
nvidia unfortunately
alternatively, specialty cards with much more vram but may be considerably slower
>>
>>102774337
Bumping this, will spin up a h100 on runpod to play with
I think I'll try mistral large, would there be a big difference between just using the ollama model over serving it with something like vllm, or calling it from my own script?
Happy for any recommendations, I'm super new at this
>>
>>102780841
Holy shit, could we finally make our own uncensored autistic model?
>>
grifter thread
>>
File: evil.png (103 KB, 506x376)
>>
>>102787183
Holy shit you can get to 200 messages writing a sentence at a time? Groundbreaking stuff anon, when is the paper coming out?
>>
>>102786793
it can't do a long story because all the real creativity comes from the prompt, the model just puts words and slop together to expand on the prompt
>>
>>102788158
>grifter thread
Yes.
>>
>>102773870
schizos are attuned to esoteric patterns in reality that normies can't even perceive let alone comprehend...
for all we know his kabbalah sampler will gen very hot smut
state of the art smut heat beyond your myopic imagination
>>
>>102780305
very nice, having a lot of fun with it
>>
>>102785613
I trust this Miku
>>
>>102788601
Instead of making a sampler, he should apply that structure to the NN architecture itself. I won't be surprised if it works just out of the box, considering the thing we work with here.
Kabbalah-like half-jumps between the layers are often learned by evolutionary algorithms, maybe that is what's required to preserve the polarity of thought, a separation.
t. X transcendent
>>
https://app.primeintellect.ai/intelligence
You will sacrifice your 3090 for the benefit of humanity won't you anon?
>>
>>102789030
No but I'll let you sacrifice yours
>>
>>102789030
>Help us train our censored slop
No thank you. I'll consider it if someone started a based uncensored schizo model tho.
>>
>>102789082
This won't happen with foss troons in charge.
>>
>>102789101
Someone needs to fork the project then.
>>
>>102789030
Let me see the dataset first before I decide if I want to contribute.
>>
>>102789138
>55% Fineweb-edu
>20% DLCM
>20% Stack v2
>5% OpenWebMath
without any further nsfw filtering
Im hooked on that hopium right now
>>
>>102789030
/lmg/ should make it's own model with this and call it Divine Intellect.
>>
>>102789186
https://github.com/PrimeIntellect-ai/OpenDiloco
btw this is the code they are using
we could use this to make a /g/-certified model, if we announced it on /g/ or /pol/ with the premise of finally making an unslopped model we could get 70-80 anons (960GB of VRAM if every anon had 12GB of it and 1.6 Peta Flops)
>>
>>102789209
Lol nice name
>>
>>102789230
I have a folder with 40k .txt unfiltered erotica with ALL kinds of stories from various sources (some already disappeared) from the 2000's to now, it just needs to be converted to a dataset.
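
The conversion itself is the easy part; one JSONL record per story is the de facto format most training scripts ingest (the "text" field name and paths below are just convention):

import json
from pathlib import Path

def txt_folder_to_jsonl(folder, out_path):
    # Walk the folder of .txt stories and emit one JSONL record per file.
    n = 0
    with open(out_path, "w", encoding="utf-8") as out:
        for p in sorted(Path(folder).rglob("*.txt")):
            text = p.read_text(encoding="utf-8", errors="replace").strip()
            if len(text) < 200:  # skip empty or stub files
                continue
            out.write(json.dumps({"text": text, "source": p.name}) + "\n")
            n += 1
    return n

print(txt_folder_to_jsonl("erotica_txt/", "dataset.jsonl"), "records written")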
>>
File: 1702697825461851.png (368 KB, 609x859)
https://x.com/basedjensen/status/1844931497675063563
>>
File: file.png (704 KB, 768x768)
>>
>>102789310
>llm do not reason
ok, source?
>but that's ok
no it's not
>neither do we
source? you clearly don't
>>
>>102789326
Calm down redditor https://arxiv.org/abs/2410.05229
>>
>>102789338
Shalom
>>
File: ylecunn.jpg (47 KB, 738x415)
>>102789310
LLMs do not reason. LLMs do not understand. LLMs do not think.
>>
>>102789320
clothes painted on
>>
>>102789355
Cause all of them are underage.
>>
>>102789338
>>102789310
the shrek sampler unironically solves this, lmao
>>
>>102789326
What a dumb take. Our reasoning is a result of our divine origin.
>>
>>102789415
Getting shat out of a stinky hole?
>>
File: file.png (128 KB, 541x350)
>360m passing the 9.11 vs 9.9 test
HOLY
>>
>>102789443
You wish your penis was in that hole now. Maybe not that specific one but on the other hand you are anon...
>>
>>102789452
Impressive. WTS was 415m and did not pass the 9,11 test.
>>
>>102789452
now hit it with strawberry, nala, sally's brothers, and watermelons
>>
>>102789452
Total vramlet victory?
>>
>>102789465
Sure. Not ashamed to admit it.
>>
File: file.png (115 KB, 900x566)
>>102789484
post prompts
>>
>>102789307
>I have a folder with 40k .txt unfiltered erotica with ALL kinds of stories from various sources (some already disappeared) from the 2000's to now
You mean the asstr archive?
>>
>>102789307
I have books3
>>
>>102789452
>benchmark shows embarrassing flaw in LLM reasoning
>new models are trained with that specific benchmark in mind
>new models now solve the specific thing without any fundamental gains
how often have we been through this in the past year and a half?
>>
>>102789599
Nope.
>>
>>102789030
Can you actually contribute with 3090s?
The leaderboard unit is in H100 hours.
>>
>>102789452
The more outrageous the claims the less I believe it. And I already didn't believe it at all.
>>
>>102789785
I assume they use the H100 as a reference, the amount of work done is the actual measure
>>
>>102789747
Benchmarkmaxxing is unironically AGI.
If you cover every aspect with benchmarks and force the AI to work with all of them, this is how you graduate the AI uni.
So keep adding your watermelon jokes and sally questions, one day it will take a skillchad to bring anything new for LLM to learn.
>>
File: d65.png (136 KB, 1328x1080)
>>102789452
Holy f**k we now have a 360M model better than 405b. i can't wait for the godly erps dudes
>>
>>102789833
vrambros...
>>
>>102789788
you can literally try it yourself on colab...
>>
>>102789918
This, it's kinda fucked up tho https://x.com/_xjdr/status/1842404312284307723
https://github.com/xjdr-alt/entropix/blob/main/entropix.ipynb
>>
>>102789932
this is the updated one with 360m model, just click run and it works
https://github.com/SinatrasC/entropix-smollm
>>
File: file.png (581 KB, 1290x548)
Why are Nobel laureate machine learning researchers such plagiarizing scoundrels?
>>
File: datasets.png (2 KB, 1016x69)
>>102789728
Who doesn't? Still nothing compared to gutenberg. Something trained on textfiles, if they aren't already, would be fun.
>>
>>102789310
>>102789355
Even if they don't "reason" at all in the traditional human sense, newer and larger models emulate reasoning better than prior and smaller models, so it's likely future models will emulate reasoning better than current SoTA. Research and practice haven't found any reason to believe we're nearing the limits of what the transformer architecture can do, so it's very possible that the emulation of reason by transformers will eventually eclipse the real thing. So basically what difference does it make
>>
AI makes you depressed, i work in a less hyped AI field than LLMs and the number of researchers in this field is small. if you are also doing basic research, you are just damn alone in this world. you sit at home and think about your work and there is just no one to talk to about it - not today and not tomorrow
im an emotional wimp, roast me
>>
>>102789030
>no copyrighted books for sovl
dead in arrival
>>
>>102790125
>AI makes you depressed
You are depressive. Were you less 'alone' before AI?
>>
File: file.png (376 KB, 521x351)
>>102777963
>>102780453
Actually I was WRONG again, I won't be refunding yet, however, see pic

Seems that the supplier says the motherboard bios is old and that the model will show up as expected on cpu-z as he showed.
I will believe him because that rings a bell maybe and the naming convention could be outdated, it shows the serial num for the 9334. 32 64.


However I have another issue in that CPU0 isnt detected and CPU1 is lol.

Not sure where to begin other than reseating them, which I hesitate to do.
Anyone got stories of success reseating a new out-of-the-box gigabyte to make it work again?
>>
Have you ruled out an issue with the cpu power cable and/or socket on your PSU yet?
>>
>>102790047
You first need to comprehend why these arguments and papers are taking place, in the first place. It's not about basic scientific inquiry, it's about the business that has surrounded that and tried to make the useful aspects of the tech part of something that ultimately is meant to fill the pockets of a few. LLMs can be useful. That doesn't mean many of the ones securing funding for making them necessarily have that as their primary goal (over the goal of giving themselves, the founders/execs, the ability to buy yachts and super cars), and unfortunately many employees of these companies don't realize they're being played, or are complicit because they have a slave and herd mentality. The argument about whether LLMs can reason and whether it's economical to keep scaling transformers is really just a part of the fight against these scam artists.
>>
File: 84814.png (78 KB, 569x874)
>>102789355
Based lecun repost debunked o1
>>
>>102790291
don't care. still gonna use o1
>>
>>102789310
>Llms don't reason. But that's ok, neither do we.
I strongly agree with this sentiment.
>>
>>102790259
Not yet, getting into it again tomorrow but my power cables for cpu are EPS's
>>
>>102790291
yann is a fraud
>>
>>102786793
>I disagree, it can't do as long of a story, it runs out of context.
I disagree. Its a matter of having the right prompt, using a large enough model and having the resources to have 32k+ context so the story is satisfyingly long. Long context is a solved problem at this point. Just use RULER to pick a model that stays consistent as long as you need it to.
>>
>>102790397