/g/ - Technology


File: file.png (2.65 MB, 1024x1024)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>101757601 & >>101749053

►News
>(08/05) vLLM GGUF loading support merged: https://github.com/vllm-project/vllm/pull/5191
>(07/31) Gemma 2 2B, ShieldGemma, and Gemma Scope: https://developers.googleblog.com/en/smaller-safer-more-transparent-advancing-responsible-ai-with-gemma
>(07/27) Llama 3.1 rope scaling merged: https://github.com/ggerganov/llama.cpp/pull/8676
>(07/26) Cyberagent releases Japanese fine-tune model: https://hf.co/cyberagent/Llama-3.1-70B-Japanese-Instruct-2407
>(07/25) BAAI & TeleAI release 1T parameter model: https://hf.co/CofeAI/Tele-FLM-1T

►News Archive: https://rentry.org/lmg-news-archive
►FAQ: https://wikia.schneedc.com
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Programming: https://hf.co/spaces/bigcode/bigcode-models-leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
File: threadrecap.png (1.48 MB, 1536x1536)
►Recent Highlights from the Previous Thread: >>101757601

--OpenAI cuts GPT-4o price amidst competition from Anthropic: >>101760015 >>101760089 >>101760156 >>101760127 >>101760161 >>101760420 >>101760540 >>101760702 >>101760777 >>101760661
--How to prompt LLM to call external APIs using function calling: >>101762712 >>101764500 >>101764831 >>101764932
--Anon generates responses for lmsys_chat_1m_clean dataset with GPT-4 and Claude 3.5 Sonnet: >>101762642
--Anon discusses 405B model and providers, with some anons preferring local models and criticizing Together's prices and reputation: >>101759055 >>101759065 >>101759099 >>101759133 >>101759275 >>101759284 >>101759434 >>101759474 >>101759515 >>101759573 >>101759612 >>101759717 >>101759903 >>101760068 >>101759548 >>101759777 >>101759863 >>101759929 >>101760063 >>101760095 >>101760112 >>101760225 >>101760114
--OpenAI used muP before others, possibly related to µTransfer technique: >>101763343 >>101763361
--Lambda GPU Cloud pricing discussion: >>101763607 >>101764189 >>101764197
--Idefics3-8B-Llama3 model supports multimodal tasks: >>101766156 >>101766174 >>101766250
--FLUX.1 video model and AI development discussion: >>101757726 >>101757742 >>101757804 >>101757844 >>101757810 >>101757849 >>101757943 >>101759880 >>101760044 >>101760092 >>101763158 >>101760094
--Dan Hendrycks trying to push finetuning resistant method as law mandated: >>101760368 >>101760390 >>101760444 >>101760613
--8b draft spec decoding performance and small vs large model discussion: >>101764345 >>101764353 >>101764408
--ZLUDA project taken down, to be rebuilt from pre-AMD version: >>101759347
--Lambda's 8x H100 deal is cheaper than RunPod: >>101763416 >>101763594 >>101763596 >>101763626 >>101763629
--Anons discuss 405B and Mistral Large models on openrouter: >>101758409 >>101758472 >>101758750 >>101758846 >>101758785 >>101758989
--Miku (free space): >>101761322 >>101764185 >>101765085

►Recent Highlight Posts from the Previous Thread: >>101757610
>>
what is the best programming model that can fit in 12gb of vram?
>>
>>101767160
gemmasutra 2b
>>
>>101767164
>2b
in the trash it goes
>>
https://www.cerebras.net/cerebras-customer-spotlight-overview/spotlight-aleph-alpha/
>>
>>101767259
Fine.. fine... what about it?
>>
>>101767164
Can you shut the fuck up about that amateur clown? His models are shit
>>
>>101767356
>His models are shit
proof?
>>
>>101767356
It was a joke, anon, jesus fuck... remember when people recommended phi2? remember when people recommended tinyllama? no? oh....
>>
>>101767356
hi sao
>>
File: girl_infront_of_houses.jpg (235 KB, 1075x717)
So, what are you guys doing that warrants a local model?
>>
>>101767379
c.ai doesn't let me coom properly and no fucking way i'm paying for tokens lmao
>>
>>101767379
Reliability and principle. Once a model is released, it cannot be made worse than at launch. And i don't like online services. I like owning what i have. As for the use, i just find them interesting.
>>
>>101767379
grooming a 14 year old
>>
>>101767456
Your wife is spilling out the beans on reddit!!!
>>
File: how-to-adjust-flame.png (381 KB, 512x512)
https://stovemastery.com/what-causes-a-gas-stove-to-explode/
does anyone know what model writes these? I want to rp with it
>>
>>101767456
just read https://vndb.org/v415/ instead of shitty aislop
>>
>>101767497
3.5 turbo?
>>
>>101767160
Nemo or mini-magnum. You'll have some of it on regular ram probably but it should run great.
>>
>>101767555
>use this finetune trained on rp logs for coding
>>
>>101767356
This. We should be talking more about InternLM 2.5 20B instead. This model beats Gemma 2 27B and comes really close to Llama 3.1 70B in a bunch of benchmarks. 64.7 on MATH 0-shot is absolutely insane; 3.5 Sonnet gets just 71.1. And with 8-bit quants, you should be able to fit it on a 4090.
>>
>>101767593
I approve of this post
>>
>>101767589
Missed the programming part. My apologies.
>>
Thank you
>>
>>101767593
buy an ad
>>
>>101767379
holy shit when I looked at the thumbnail of this I could have sworn I saw the words "child pussy" but then I zoomed in and it was normal. you guys saw it too right
>>
>>101767715
No. You're paranoid. It's a perfectly normal picture.
>>
>>101767715
Absolutely DISGUSTING!!!! This should be deleted IMMEDIATELY, whoever posted this must be sick and depraved! God Bless America
>>
>every model i don't like (aka i can't run it because i'm a vramlet) is a shill post
>>
>feeding
>>
File: 1708228248091609.png (20 KB, 522x226)
>>101767379
I can't imagine being a poorfag and having to deal with retarded shit like this
>>
>>101755678
Alright, okay. So, dolphin-2.9.3-mistral-nemo-12b.
During my test battery, there's a moment where I ask the model to create a list of people then in the next message I ask it to create a lewd story featuring that character.
Celeste (1.6) and mini-magnum did stellar at that point, with other models failing to follow the prompt or to make a good story. dolphin so far seems to fall in the latter case. It writes the story but it defaults to keeping it short, doesn't write much detail, and it's extremely evasive with sex stuff.
I'll try prompting around it and see if I can extract good results in that aspect before continuing.
One good thing about it is that I don't have to cheat by adding "Sure, {{user}}", "Continuing", etc at the end of my prefill with it.
Other models would go slightly schizo or have a large incidence of not following on user's last message.
>>
File: ComfyUI_05567_.png (1.19 MB, 720x1280)
Multiple proxies down, you know what that means....
>>
Stop using Assistant.
>>
>>101768091
but she likes it
>>
File: GOaNiSVaIAAIIyS.jpg (101 KB, 1170x981)
https://x.com/sama/status/1821207141635780938

happening
>>
>>101768137
>another self-mythologizing tease
yawn
call me when openai actually delivers something substantial again
>>
File: 1721611039411706.gif (235 KB, 500x500)
>>101766139
>Using crunchdog as a way to say current models have soul
Crunchdog is just an extremely funny card, it'd probably be soulful on any model. That doesn't mean that sex and love shit will be, which is what's been lacking in soul.
>>
What model is good to help me try to learn programming?
Is it qwen2?
I can run 30b models in slow mode btw
>>
>>101768668
try codestral 22b
>>
Anybody have a model that they like for 24GB vramlets?
>>
>>101768701
sorry, this thread is for people who locally run Llama 405B only
>>
>>101768693
Downloading it right now
>>
>>101768771
literally kill yourself
>>
>>101768793
It's just a prank, bro!
>>
>>101768793
Figuratively chill down
>>
>>101768137
I hate that fucking faggot with every fiber of my being. I wish Anthropic would have hit the scene first. 99.9 percent of the population only knows ChatGPT
>>
>>101768832
If anthropic hit the scene first we would have bezos posting cryptic pictures of legumes instead.
Choice is an illusion.
Local is the only way.
>>
>>101767443
Boomer sensibilities. Nobody cares about ownership of tech slop except old people (You)
>>
"Als mir klar wurde, dass Intelligenz simuliert werden kann, habe ich die Idee unserer Einzigartigkeit aufgegeben"
>>
>>101767593
wtf? why do you keep shilling this garbage? fuck off and die
>>
>>101768701
https://huggingface.co/TheDrummer/Big-Tiger-Gemma-27B-v1-GGUF
makes child rape stories and doesnt refuse
>>
>>101768923
>Nobody cares about ownership of tech slop except old people (You)
ok
>>
>>101768137
journalists must be slobbering all over his boots after that post
>>
>>101768137
Yann Lecun said research isn't secret. If OpenAI ships proto-AGI he should retire
>>
>>101768832
He and the other entrepreneurs ultimately sold the software company, which developed apps for Android and iOS that let users selectively share their location with other people, to Green Dot Corporation in 2012 for $43.4 million
47.555427053884706, 7.606273996838664
>>
Tried out Celeste 1.6 yesterday at 60k context.

I'm thinking it's kino
>>
>>101769043
ad
>>
File: 1647399149433.jpg (292 KB, 1027x1273)
Is there an ElevenLabs tier voice cloning model yet?
>>
>>101768944
Checked. That's more than I need it for, but a decent baseline for model freedom.
>>
>>101769141
Nothing changed since yesterday. Ask again tomorrow.
>>
>>101768944
I'm not into rp but an uncucked gemma2 27b would be nice.
>>
>>101769043
It's definitely better than 1.9 in my experience.
>>
File: 1720869141748493.jpg (94 KB, 875x916)
>>101767112
>>
>>101767715
get your mind out of the gutter
>>
nu cum wen?
>>
>More questions?
>Contact me via Discord or ask on the /img/
>/img/
Are the magnum shills unable to spell the name of the general right?
https://rentry.org/MagnumProxy#more-questions
>>101769766
>>
File: 1722778333267421.png (60 KB, 640x640)
>>101769843
Lol, my bad anon! Thank u for highlighting that
>>
new 8b sota just dropped: https://huggingface.co/LGAI-EXAONE/EXAONE-3.0-7.8B-Instruct

>4.2 Output: All rights, title, and interest in and to the Output generated by the Model, whether in its original form or modified, are and shall remain the exclusive property of the Licensor.
>>
File: file.png (29 KB, 718x256)
>>101769935
HOLY SHEET
>>
>>101769935
>>101769953
>Weird custom architecture 4k context
holy sheet indeed.
>>
>>101769953
>>101769991
46.8 on arena hard is llama 70b-tier
>>
>>101769935
>>101769953
>korean
biggest liars after chinese
>>
>>101769953
wait even LG is jumping in on this now?
Few more weeks and Hot Pockets will be releasing open source foundational models at this rate.
>>
>>101770048
>biggest liars after chinese
uhm source? isn't this literally the first korean llm?
>>
>>101770112
isn't the solar team korean? though I guess theirs was more of a continued pretrain
>>
>>101770112
Not to mention even if it is Korean it's coming from a reputable electronics firm. It's not like half the chinese models that come from literally who startups.
>>
>another model release
>another American melty
How do we solve their insecurity problem?
>>
Exaone instruct template off of tokenizer_config.json for people who don't want to submit to the model gate conditions
[CODE]"chat_template": "{% for message in messages %}{% if loop.first and message['role'] != 'system' %}{{ '[|system|][|endofturn|]\n' }}{% endif %}{{ '[|' + message['role'] + '|]' + message['content'] }}{% if message['role'] == 'user' %}{{ '\n' }}{% else %}{{ '[|endofturn|]\n' }}{% endif %}{% endfor %}{% if add_generation_prompt %}{{ '[|assistant|]' }}{% endif %}",[/CODE]
>>
>>101770154
Here's hoping that the data is primarily from English sources.
>>
>>101770199
how do make code block on /g/ without looking like retard plzhalp
>>
>>101770048
No. We are back. The LK99 of LLMs is here.
>>
>>101770199
Okay, so
>[|assistant/user/system|][|endofturn|]
If those are actual special tokens rather than being tokenized as strings, then alright. Better than fucking mistral's.
>>
been away for a few months, is stheno is the best bang for buck for 12gb VRAM?
>>
>>101770257
no, gemmasutra 2b is better nowadays
>>
>>101770257
It never was. Buy an ad, shill.
>>
>>101770257
Yes, kind of. Depends on your taste, but I'd tell you to also try nemo-instruct and some of its fine tunes like celeste 1.6, mini magnum, and dolphin.
There's also Gemma 9b, but I never gave that one a proper try.
>>
Drummer mindbroke the local schizo
>>
>>101770248
Later mistral releases did actually add an [INST] and [/INST] token.
>>
>>101770275
tell me what's better then faggot
>>
Gemma 27b is still the only local model that can write javascript without semicolons, it's fucking infuriating. I want to like magnum 32b, but it just ignores instructions

>>101770213
[CODE][CODE][/CODE][/CODE]
>>
>>101770213
I don't know if i can nest code tags.
Let's see what it looks like in the post.

>>
>>101770296
True. Should have mentioned that.
>>
>>101770276
sweet I'll try all of those. thanks for not being a schizo like that other retard
>>
>>101770326
kek. that didn't quite work.
Escaping them?
\
like this?
\

>>
>>101770269
I can go bigger than that if there are gains to be had
>>
File: nala exaone.png (158 KB, 936x532)
Nala test for Exaone 3.0 7.8B
This is definitely promising.
There's even new slop that we've never seen before. Like "tingles through your muscles"
>>
>>101770519
holy formatting
>>
>>101770519
Imagine if they find-replaced shivers down your spine with that.
Would be hilarious.
The broken formatting is bad, but the text itself is pretty okay considering its size.
A shame about 4 fucking k context, but I suppose it could be used to generate "un-sloped" data to train other models with.
>>
are there any c.ai alternatives which can be self/locally hosted? if not, maybe we could build one.
>>
>>101770519
Spatial reasoning seems all over the place tho
>>
>>101770562
Hey, if you give me a week and a couple of H100s I can train a pretty convincing simulacrum for you.
>>
there was some random gemma model on the anthracite hf page today anyone have a reup?
>>
>>101770565
>>101770561
>>101770551
there might be eot token issues right now with my current half-year-old sillytavern setup. I literally just threw together a template and started messing around with it as soon as I found out about it.
>>
>>101770614
Fair. The way it's breaking format does look like an issue with the template, BOS/EOS tokens, etc.
>>
>>101770627
The way it's breaking the formatting is because whatever shitty old version of ST I'm using omits the last character for some fucking reason. So it always ends with an unpunctuated sentence or an unclosed asterisk or unfinished quote.
>>
Maybe the times are changing but I'm not liking magnum-12b-v2. Ironically it's too coom-brained. I found myself switching to Nemo mid-session and having to wrangle it way less. It's a shame cuz I really liked mini-magnum and magnum 72b.
>>
>>101770761
There's a 12b v2?
Gotta download it I guess.
>>
>>101770519
gemmasutra 2b is better
>>
>>101770761
That's a shame. I already felt like Nemo was too horny of a model.
>>
File: sportsballt1.1.png (216 KB, 925x624)
Hmm very interesting.
It will stay relatively coherent above simple t=1.0 but gets a little schizo, but if you give it something surrealistic to describe it does pretty well. This is at t=1.1
With meme samplers it might be even better.
>>
File: file.png (2.7 MB, 1024x1024)
>>
>>101770809
Only ever tried 2b-it but it's still pretty good. If I knew someone with a lol computer wanting to try erp out I would probably recommend a 2b gemma model.
Exaone is not bad though. It's got some slop unique to itself, which means its dataset has some things that other models' datasets do not. Which from a tinkerer's perspective makes it interesting. But I wouldn't tell someone to go and ERP with it. 4k context is kind of sad, and the default RoPE settings in the config don't work, so someone will have to feel it out if they want to try and extend the context.
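If anyone wants to try stretching it anyway, here's a hypothetical sketch via llama-cpp-python (the scale value is a guess to illustrate the knob, not a known-good setting, and the quant filename is made up):
[CODE]
from llama_cpp import Llama

llm = Llama(
    model_path="exaone-3.0-7.8b-instruct.Q8_0.gguf",  # hypothetical quant filename
    n_ctx=8192,           # past the native 4k
    rope_freq_scale=0.5,  # linear RoPE scaling: 4k / 0.5 = 8k effective
)
[/CODE]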
>>
File: exaonestrawberry.png (10 KB, 849x181)
holy shit exaone beats gpt4|o at the strawberry test.
>>
>>101770977
Cosmic Miku looks tired.
>>
>>101771076
I'd love to see how it tokenizes the word.
>>
>>101770170
I trust a random 4chan post saying a model is good or bad more than any benchmark. And I don't trust 4chan posts about a model being good or bad.
>>
>>101770809
race to the absolute bottom
>>
>>101771219
in both English and Korean.
There could be some weird interplay between the korean word/tokenization for strawberry and the English one that affords it a workaround.
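Easy enough to check yourself; a minimal sketch, assuming the tokenizer loads with stock transformers (the repo is gated, so you may need to accept the license first):
[CODE]
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("LGAI-EXAONE/EXAONE-3.0-7.8B-Instruct")
for word in ["strawberry", "딸기"]:  # English and Korean
    ids = tok.encode(word, add_special_tokens=False)
    print(word, ids, tok.convert_ids_to_tokens(ids))
[/CODE]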
>>
>>101768923
Slave level thinking
>>
>>101769935
>https://huggingface.co/LGAI-EXAONE/EXAONE-3.0-7.8B-Instruct
>Max sequence length 4,096
:-)
>>
>>101768923
This
>>
eta on qlora-pipe exaone support?
>>
error 400 from both suno and udio right now, weird.
anti-ai folks having a melty or something?
>>
File: sovl_.png (132 KB, 604x515)
>hurr durr new models are woke and sloppy
You just don't use good tunes, picrel is llama merge I'm testing right now. So far I like the prose and it looks really promising.
>>
>>101771752
Reading that sent a shiver down my spine
>>
>>101771752
>mfw reading this
Kek, good bait
>>
>>101771752
Unironically pls post some good recs
>>
>>101771752
That's amazing.
It even has journeys.
I can't remember the last time I've unironically seen journeys.
Plenty of bonds however.
>>
>>101771435
yeah, so tired of this... 8k is barely enough to coom...
>>
>>101771776
>>101771781
>>101771788
>>101771831
guess the model
>>
>>101771887
>guess the model
Mytho
>>101765965
>Wanna hear something funny? You were pissing me off so I decided to false-flag logs here, posting mythomax logs and calling it some other model (gemma, llama, mistral). You have no idea how much laugh I got when you were whining how sloppy it was and there are no good models anymore because local peaked at mythomax (kek).
>>
>>101771887
NOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO
>>
File: mythomax.png (85 KB, 657x350)
>>101771905
>Mytho
ding ding ding
>>
>>101771752
>"And who knows? Maybe XYZ
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA I HAVE SEEN THIS LLAMA3ISM WAY TOO MANY TIMES BY NOW AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
>>
>>101771938
>LLAMA3ISM
but is mytho l2 doebit
>>
>>101771947
mytho is llama merge
>>
>>101771752
This isn't bad. This is how most of the greatest writers in the world write
>>
>>101771932
gemmasutra 2b is INDEED better
>>
>>101771956
yes but not llama3
>>
>>101771932
Do xwin-mlewd too.
That was my favorite.
>>
>>101771959
the instant switch xD
>>
>>101771932
Myothmax is still king
>>
>>101771995
king of slop
>>
>>101772003
King of sovl, but I do wish for an update, ngl.
>>
>rename 2b to model of your choosing
>edit entire reply into gptslop
wow every model is bad local is dead
>>
>>101771980
unfortunately I don't have it on my hard drive
>>
>>101771947
It is not mytho. And who knows is distinctly l3.
>>
File: meds.png (241 KB, 512x497)
>>101772019
>>
>nobody would EVER troll on /lmg/
>>
>>101772034
nice copium but it is mytho, you can recognize it easily by the purple prose
>>
>>101772049
>trolling by saying an old model is bad
purpose?
if anything he should shat on nemos if he wanted to cause true chaos like say magnum mini 2 is big sloppy
>>
>>101772071
the most sucked off model in the history of /lmg/ is just some old model? people already agree nemo is fucking retarded
>>
>>101772088
>the most sucked off model in the history of /lmg/ is just some old model?
yes? it is and was always slop even if it was the best slop of its size for its time
>>
>>101772088
From my point of view people who say mythomax was any good are trolling. They were just impressed by the vocabulary, but beside that mythomax was stupid, quite often incoherent and had a lot of slop and cliche.
>>
>>101772049
I'm just sick of people shilling their useless models
>>
>>101772088
Name 1 L2 merge that writes as well as Nemo
>>
Here's a fun dumb thing for you guys to try.
Temp 2, TopK 2 (yep) minP 0.05 just in case.
See how you end up sampling the top 2 tokens with almost 50/50 odds most of the time.
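If you'd rather see the numbers than eyeball swipes, here's the back-of-the-envelope version (plain softmax math; ignores whatever order your backend applies samplers in):
[CODE]
import math

def top2_dist(logits, temp=2.0):
    # keep the top 2 logits, apply temperature, renormalize
    a, b = sorted(logits, reverse=True)[:2]
    pa, pb = math.exp(a / temp), math.exp(b / temp)
    return pa / (pa + pb), pb / (pa + pb)

# raw logits one nat apart: temp 2 squashes the gap to roughly a coin flip
print(top2_dist([10.0, 9.0]))  # (0.622..., 0.377...)
[/CODE]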
>>
>>101772071
There is no such thing as mini-magnum 2.
>>
>>101772140
there is none, but just because one shit smells less than the other doesn't mean they aren't both shit
>>
>>101772140
Mythalion-Kimiko is pretty good. The older models in general are less slopped because they hadn't yet reached the level of intelligence required to notice the slop-web in human creative-writing. But they're also less intelligent.
Nemo blows any L2 model out of the water as far as conceptual understanding goes, regardless of size. But it would be inaccurate to say that there weren't perfectly good L2-13B models for cooming to back in the day (which wasn't even that long ago in the grand scheme of things).
>>
>>101772156
https://huggingface.co/intervitens/mini-magnum-12b-v1.1
> New version is available! anthracite-org/magnum-12b-v2
close enough for me, i don't care about your exact branding
>>
i see no reason to use any model that isnt by mistral whether you have vram or not, nobody else is capable of making a good model right now
>>
File: graph.png (4 KB, 502x397)
>>101772212
i can see one
i can see one
i can see one
>>
retards in this shithole can't even write a prompt or change samplers but will spend good money to run shitquants of large and say it's the models fault
>>
>>101772203
It's just not the same T-T
>>
>>101771932
Fake news. MythoGAWD never wrote like this.
>>
>>101772254
>retards in this shithole can't even write a prompt or change samplers
they can't even have the model answer >>101749214
>>
>>101772231
this. While I kinda like mistral models, the repetition issue is terrible. And don't say "just use rep penalty": firstly, it doesn't always work as it should, and secondly, it lobotomizes the model. You basically force the model to not use tokens it wants to use; sure, it can look semi-coherent, but the intelligence hit is visible.
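For reference, the standard penalty most backends implement really is that blunt. A minimal sketch of the usual CTRL-style formula:
[CODE]
def apply_rep_penalty(logits, seen_token_ids, penalty=1.2):
    # push down every token that already appeared, regardless of
    # whether repeating it would actually be correct here
    out = list(logits)
    for t in set(seen_token_ids):
        out[t] = out[t] / penalty if out[t] > 0 else out[t] * penalty
    return out

# the token at index 2 was used before, so its logit drops from 3.0 to 2.5
print(apply_rep_penalty([1.0, -0.5, 3.0], seen_token_ids=[2]))
[/CODE]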
>>
File: GS1bXAmbwAAoOV0.jpg (377 KB, 1400x1800)
>>
>>101772312
End of scene.
>>
Tess-3 (Mistral-Large-2-123B) and Trinity-2 (Codestral)

Dropping two new models today, before I fly out to Defcon.

Tess-3 on Mistral-Large-2-123B (General-LLM): https://huggingface.co/migtissera/Tess-3-Mistral-Large-2-123B

Trinity-2 on Codestral (Code-LLM):
https://huggingface.co/migtissera/Trinity-2-Codestral-22B

Both are uncensored. Codestral scores 78 on HumanEval.
>>
>>101772356
didn't he say like yesterday he wouldn't do sub 70bs anymore?
>>
https://philome.la/johnayliff/seedship/play/index.html
>>
>>101772373
There's no reason to do 70b+ anymore.
>>
https://new.reddit.com/r/StableDiffusion/comments/1emi1j9/opensource_amd_gpu_implementation_of_cuda_zluda/
>a based gentleman wanted to help AMD by making Cuda compatible with their cards
>AMD sent a ban notice to him
If that's not a sign that AMD is a controlled oposition, then I don't know what else to say
>>
>>101772065
>you can recognize it easily by the purple prose
What does purple prose mean?
>>
is Mistral great because of the pretrain or because of the finetune?
>>
>>101772518
mistral can't finetune for shit
see 8x22b vs wizlm
>>
>>101772513
>What does purple prose mean?
A literary term! "Purple prose" is a pejorative term used to describe writing that is overly elaborate, flowery, and excessively ornate. It's characterized by the use of overly complex vocabulary, convoluted sentence structures, and an abundance of adjectives and adverbs.
End of scene.
>>
>>101772549
So basically me, also why specifically purple?
>>
File: OEMhYF15BjZUc7S0nN-u7.png (78 KB, 989x590)
>>101772530
>see 8x22b vs wizlm
>Microsoft WizardLM-2-8x22B 11.7 %
>>
>>101772569
>also why specifically purple?
The term "purple" is thought to have originated from the idea that the writing is so elaborate and excessive that it's almost "royal" or "imperial" in its grandeur – much like the rich, regal color purple. However, in this context, the term is not meant to be complimentary, but rather to suggest that the writing is overly indulgent and self-aggrandizing.
>>
>>101772587
I didn't know they had a term for literature that describes me so well.
>>
>>101772646
based purplechad
>>
>>101772513
>What does purple prose mean?
In short - describing for the sake of describing. Have you ever tried to write an essay (for X words) and realized that you are short on words so you added a lot of useless fillers? Now make the fillers sound elaborate, melodramatic and hyperbolic - this is a recipe for purple prose.
>>
How would you measure that
>>
>>101772570
Hallucination is soul. See how no claude is on the list because they are too good and souful.
>>
>>101772570
>eval with no correlation whatsoever to RP quality
>>
>>101772718
>swiping endlessly is sovl
Fuck your sovl then, buddy. You can choke on it.
>>
>>101772718
>Hallucination is soul
this is why I refuse to use temp smaller than 4
>>
read a book on 1.3b
>>
>>101769953
where are the standard benches, though. All of these can be pretty easily gamed if you've trained on gpt4o outputs since style is really influential
>>
>>101772752
Every single time I have seen someone mention soul in this thread the example was of an LLM typing like a retard or like a schizo.
>>
I like Lyra. Donate to Sao's Ko-fi today.
>>
https://huggingface.co/nothingiisreal/L3.1-70B-Celeste-V0.1-BF16
70b Celeste
>>
>>101772752
Nice try, ChatGPT.
>>
>>101772970
>It seems to be way more coherent and aware of whats going on as well as more intelligent.
"12b mogs 70" copers btfo by their own sloptunnas you love see it
>>
>>101772919
For me "sovl" is the ability to write in non-cliche way. No patterns, no sentences and phrases that are excessively present in human writing. Also the ability of model to surprise me with their answer or the direction of plot they are taking.
There are local models that have a glimpse of sovl from time to time but there is none I would call sovful. The only model I unironically found sovful was old c.AI.
>>
>>101773042
Just up the temp to the point of incoherence. Or reduce context to 1k tokens.
>>
>>101773042
>old c.AI
beamsearch
>>
>>101772530
>see 8x22b vs wizlm
What? Both were shit. The only proof that it was better was Reddit's word of mouth impulsed by the mysticism of being taken down early.
>>
>>101772970
Sao is not going to like this...
>>
>>101772970
>>101773180
Hi lemmy
>>
Haven't checked in for a minute.

Is mini magnum 12b still the best RP model for people with single GPUs?
>>
>>101772970
buy an ad
>>
>>101773042
Agreed, old c.ai and I would also add summer dragon.
>>
>>101773221
celeste 123b mogs it
>>
>>101773221
That or the base Nemo Instruct 12b if you're looking for drier prose or non RP purposes.
>>
>>101773194
>>101772970
samefolx
>>
>>101773221
No, but also, 12B isn't in the ballpark of what you can run with a single GPU like a 4090. You can run anything below Nemotron with 340B parameters with the right quantization, so I think you might need to be more specific with your question.
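Napkin math for what "the right quantization" buys you; this sketch counts weights only (KV cache and buffers come on top, and anything past your VRAM spills to system RAM):
[CODE]
def weight_gb(params_b, bits_per_weight):
    # billions of params * bits per weight / 8 bytes ~ GB (GGUF adds some overhead)
    return params_b * bits_per_weight / 8

for params, bits in [(12, 8), (70, 4), (123, 3), (340, 2)]:
    print(f"{params}B @ {bits}-bit ~ {weight_gb(params, bits):.0f} GB")
# 12B @ 8-bit ~ 12 GB, 70B @ 4-bit ~ 35 GB, 123B @ 3-bit ~ 46 GB, 340B @ 2-bit ~ 85 GB
[/CODE]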
>>
>>101773247
celeste tends to spazz out with the descriptions way more
>>
>>101773331
>trust me, this finetune that doesn't actually exists is bad
>>
>>101773406
Fuck you
>>
Finetune the rust away
>>
>>101773406
nta but the 'brand' name alone is radioactive, no one serious about the craft will touch that with a 10 foot pole (3.048 meters for eurofags)
>>
>>101773327
I have a 4090 and everything i've tried has been pretty mid compared to Mini magnum in terms of 1-1 RP.

It gives me the most realistic feel out of the ones i've tried but i've not been here for around a week and i've hardly tried everything. What other ones are good at maintaining consistent dialogue, realistic convos that feel kinda natural?
>>
>>101773327
Maybe that anon has a 3060 or something like that, so 12B is the most he can comfortably run offloading all the layers to the gpu
>>
>>101773519
Buy an ad.
>>
>>101773327
No thanks, I'll take 128k context over 0.05 t/s responses even if I have to swipe a few times.
>>
>>101773534
he has a 4090 and even then, I still would recommend Magnum to him.

Everything else fucking sucks unironically
>>
>>101773653 (me)
my name is Alpin, btw
>>
>>101773653
>Magnum
72b, 32b, 12b or mini-magnum?
>>
>>101773672
All of them are the SOTA at their respective sizes. No one else can compete.
>>
File: goliathwatermelons.png (1.21 MB, 1024x1024)
>>101768137
A cheeky nod to the number of Rs in strawberry test that we post here all the time? What's next, a photo of sama holding watermelons?
>>
>>101773687
NTA but thank you for your recommendation. Magnum is amazing.
>>
>>101772402
What? Why?
>>
wrt function calling, it does depend on the specific model, but generally the raw prompt is going to include something like "You have access to the following functions" followed by the actual json list of functions formatted like OpenAI's would be in python.

https://huggingface.co/Trelis/Mistral-7B-Instruct-v0.1-function-calling-v3

Prompt Format here is a good start. I suggest Nemo or Mistral Large and explicitly ask for JSON responses.
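As a concrete sketch of what that raw prompt tends to look like (function name and schema made up for illustration; the exact wrapper text varies per model, so check the model card):
[CODE]
import json

functions = [{
    "name": "get_weather",  # hypothetical function
    "description": "Get the current weather for a city",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}]

prompt = (
    "You have access to the following functions:\n"
    + json.dumps(functions, indent=2)
    + '\n\nTo call a function, respond with JSON only: {"name": ..., "arguments": {...}}\n\n'
    + "User: What's the weather in Paris?"
)
print(prompt)
[/CODE]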
>>
File: EzYl8b.png (101 KB, 756x838)
>8x48GB GPUs
>just lying around
>i guess it's enough for 32b?
What did Alpin mean by this?
>>
>>101773653
shill
>>101773665
it's Alpine, didn't she transition recently?
>>101773672
none of them
>>101773687
they're all shit like the people who defecated them
>>101773708
btfo, shill
>>
>>101773672
what one could a 4090 realistically run? Been using mini myself on the 4090 and it's pretty fucking good
>>
>>101773911
there's nothing better than magnum
>>
>>101773653
>>101773911
If I had a 4090 I'd probably be running CommandR.
I can run it off of ram but it's so dog slow.
>>
>>101773897
remember that next time you'll be asked to donate to fund the compute to train the models
what a bunch of frauds
>>
>>101773911
You should try mistral large.
>>
>Here's the 4-11 on...
Thanks dolphin 12b. I had never heard of that slang before.
>>
>>101768923
>noooo you can't be trusted with models stop evading my control over you
>>
>>101773932
and btw, that includes all the finetooooners associated with that organization
alpine is smuggling and has been smuggling free compute for them
>>
>>101773980
there's no shot it runs on my 4090 lmfao. I only have 32GB RAM too
>>101773929
I have it, I just struggle finding good settings online (temps etc).

So it doesn't really perform as good as it likely can
>>
>>101774159
>there's no shot it runs on my 4090 lmfao
Upgrade your ram then, you'd get way better speeds than I do with a 2070.
>>
>>101768923
Exactly
If I didn't care about it I wouldn't be here in this thread, unless I had some kind of mental disability like a few people I have seen in this place.
>>
>>101767379
do you like sam altman reading your chats?
captcha: N00T
>>
>>101774172
much total memory you got and what's your speeds like?
>>
Good morning /lmg/, any good 8B models out there?
>>
Redpill me on RPStew

I keep seeing it on reddit but nobody here mentions it
>>
>>101774330
I advise you stop reading reddit. What is RPStew?
>>
>>101774330
>I keep seeing it on reddit but nobody here mentions it
I'm sure you can figure out the reason on your own
>>
>>101774316
gemma 8b
>>
>>101774273
96gb, it starts at 1.2T/s but slows faster than I'd like. The slow speed is worth it to me though since it's pretty good.
>>
>>101774379
>1.2T/s
>slows down
are you into edging?
>>
>>101774357
>I keep seeing it on reddit but nobody here mentions it
>I'm sure you can figure out the reason on your own
Most shilling is done by the finetuners themselves. Reject the idea that number of mentions = good.
>>
>>101774424
can you redpill me on llama 3 8B finetunes
>>
I will now say something kinda obvious that helped my cooming experience. I have realized that the final stage of my fucked up fetish is too difficult for current LLMs. However, dialing it down a bit to a less complicated version has given me some very nice results that didn't require rerolls. Nemo seems like something that is finally good enough for this. Although I don't know how it will handle repeat sessions.
>>
>>101774424
Undi never shills his stuff here. His try it yourself method has captivated some sirs that now do it for him.
>>
>>101774424
>Reject the idea that number of mentions = good.
exactly, so ignore the reddit shills, shilling some random useless model
>>
>>101774479
hi sao. i was talking about you instead.
>>
>>101774462
wat fetish? vore or something?
>>
>>101774492
more like talking to voices in your head shizo
>>
Do you guys think Sao Drummer and Undi ERP with each other in some private discord?
>>
>>101774495
I am not telling. I am not becoming the next piss / stomach noises anon.
>>
>>101774504
no, they spend their time hacking into corpos and adding shivers down the spine to their training databases
>>
>>101774534
>hacking
>undi
anon please...
>>
>>101774521
I have it worse anon, my fetish is so niche that there are not even text materials by humans, so models are completely useless for that.
>>
File: file.png (449 KB, 636x350)
>>101774555
Do tell us more.
>>
>>101774539
he learned it to spread slop and poison corpo datasets to ensure that we never get unslopped model
>>
>>101774555
It is sex with minors, isn't it?
>>
>>101774570
Nah, but it's not something disgusting or very weird, just niche and to be fair it's fairly hard to represent it with text only.
I'm hoping that in a few years multimodal models with image generation will be my savior.
>>
>>101774589
no, I said niche.
>>
>>101774625
what body parts does it involve?
>>
>>101767112
Holy crap. I hadn’t looked in a while and now vast/runpod prices have totally cratered. The hype cycle is ending finally.
>>
File: file.png (1.68 MB, 1607x782)
>>101774757
>hype cycle is ending finally.
What are next?
>>
>>101773672
>>101773687
By what metrics? I like 70B and mini-magnum a lot but 30B and the new 12B were underwhelming.
>>
>>101774802
Hi Undi
>>
File: oh you.jpg (47 KB, 582x415)
>>101774729
I'm not saying shit anon
>>
>>101774757
Isn't that due to the fact that Blackwell is out and more H100s are out?
>>
>>101774828
so coprophilia?
>>
>>101767112
Where should I begin if I want to develop a personal AI assistant hosted on a local server?
>>
>>101774864
I said not disgusting, now you are trolling me so I will spill the beans.
>>
>>101774870
have you tried getting a job?
>>
>>101774859
The only new thing I think is the H100 NVL, but prices for 4090s are less than half of what they were at their peak.
>>
>>101774900
so? where are the beans, Lebovsky?
>>
>EXAONE
Should I make GGUFs or is it a nothingburger? Is >>101769935 an undercover LG employee?
>>
>>101775201
it's 4k context so we don't care
>>
>>101775201
As a curiosity to play around with it's alright.
Not really a replacement for anything we have now due to only having 4k context and struggling with some concepts we now expect 8B models to handle.
It also stays coherent at fairly high temperatures for its size. I only test things with simple sampling, so I can't say what that means when you apply meme samplers to it.
>>
>>101775214
>>101775225
I'm fine with 4K context if it's actually good compared to llama3
>>
>>101775239
>I'm fine with 4K context
no you're not
>>
>>101775239
8k was a pain with llama 3. 4k is downright insulting. What is this, 2023?
>>
>>101775246
I'm using 6208 max because higher gets too slow
>>
>>101775272
Now, I'm not a mathmagician, but I am pretty sure 6208 is more than 4096.
>>
>>101775313
yeah but "max" implies that I can cope with 4096. in fact, I only use 6208 for groups, 4096 is enough for single characters
>>
With Mistral Large 2, I'm concerned that this is turning into a full on addiction. Is 123b that much better than 70b, or does Mistral AI have that good of a dataset? It's the only Mistral model I've ever liked, actually. (Unless you count WizardLM 2 8x22)
>>
>>101775532
What 70b were you using?
>>
>>101773897
Which Discord server is this?
>>
>>101775532
Found any good largestral fine tune?
>>
>"You know, they say that for every inch below six, you might as well be missing a limb. And you, my dear {{user}}, are teetering on the edge of being a paraplegic, aren't you?"
I wasn't expecting this kind of sovl from my sph slopbot
>>
>>101775652
>teetering on the edge
Why the fuck are LLM texts so easily recognizable? Same goes for images
>>
>>101775667
Corporate influence, uncanny valley and overfitting.
>>
>>101775718
how do we solve this
>>
>>101775734
Just add a few more billion parameters. Llama 4 1700B will be a great success
>>
>>101775734
>Corporate influence
Don't tune on assistantslop. Don't filter "harmful" data out of the base model. Make a pure chat model like early c.AI.

>uncanny valley
Make model smarter. Easier said than done. While >>101775770 may work, it is a suboptimal approach.

>overfitting
Make a list of overused phrases and either filter them out or replace them with less common, but context-appropriate phrases.
>>
>>101775844
sadly no one here actually trains base models, and the people who do are more interested in benchmarks and scamming investors
>>
>>101775874
Takes massive amounts of capital to train a base model of any appreciable size in a reasonable timeframe.
>>
I got flux running but what are you supposed to do with it? There's no use for these images. Is that why random boards have dedicated AI slop threads?
>>
File: ComfyUI_00119_.png (333 KB, 512x512)
>>101776014
Have a glass of bees.
>>
Does anybody actually use the "story" format for their slop?
>>
>>101776014
gooning
>>
>>101776014
Use it alongside your text gen model to illustrate the scenes, of course.
>>
>>101776052
Sometimes I just let it run for a while and read the gems it has produced
>>
>>101776042
The fuck did bees over do to you? Fuck da wasps.
>>
>>101776101
Go drink a coke outside and you can recreate that picture.
>>
Anyone have a link to that comfy script/workflow that lets you offload the CLIP model onto a different GPU?
>>
File: ComfyUI_00185_.jpg (150 KB, 1024x1024)
>>101776014
>what are you supposed to do with it?
Make Mikus
>>
File: metal_albums.png (1.47 MB, 1204x747)
What are the best models/LORAs for creating the "80s metal album" aesthetic? I'm not sure if this actually has an art style by name. It doesn't necessarily have to actually be tailored towards album covers, this is just the best example. My goal is just to be able to generate art with this style consistently.
>>
anything I should be aware of since mistral large for coom rp?
>>
What's up with the sudden imagegen posts? Someone trying to troll again?
>>
Is the file size of a model a reliable indicator of how much memory it will require to load? I noticed with some other Llama 3.1 models, loading the model initially takes up 8GB of VRAM, then in task manager I can see more memory being allocated. Is the initial memory allocation the model itself? What is contained in the secondary memory allocation? Is that how models store context?
>>
>>101776151
here >>101689729
>>
>>101776228
Yes. Also, install Linux.
>>
>>101776179
>>>/g/ldg
>>
>>101776247
I think these are troll posts
>>
>>101776234
Thank you muchly, friend.
>>
>>101776244
Is the behavior I described unique to Windows? What would switching to Linux change? I'd love to, bit I think the system RAM fallback feature I need to run larger AI models is only available on the Windows Nvidia drivers
>>
>>101776277
Use GGUF and choose GPU layer count correctly.
>>
>>101776277
You can split between RAM and VRAM in GGUFs. Linux is simply better.
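And for the anon asking why memory keeps growing after load: the first allocation is the weights, the second is mostly the KV cache, which scales linearly with context. Rough sketch of the math, assuming an fp16 cache and Llama-3.1-8B-ish shapes (32 layers, 8 KV heads, head dim 128):
[CODE]
def kv_cache_gib(n_layers=32, n_kv_heads=8, head_dim=128, n_ctx=8192, bytes_per=2):
    # 2x for K and V; fp16 = 2 bytes per element
    return 2 * n_layers * n_kv_heads * head_dim * n_ctx * bytes_per / 2**30

print(kv_cache_gib())             # ~1.0 GiB at 8k context
print(kv_cache_gib(n_ctx=32768))  # ~4.0 GiB at 32k
[/CODE]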
>>
Are the getting started links in the OP up to date? I've been using koboldcpp with mythomax-l2-13b Q5_K_M for the past few months. Is there something better out there? Also, koboldcpp defaults to 200 gpu layers, but if I hover over it the suggested values are vastly different. Am I doing something wrong or is it correct? It works so far.
>>
>>101776356
Windows users can't use layers for the gpu? That sounds strange.
>>
File: file.png (2.61 MB, 1024x1024)
>>
>>101776461
Lol ofc they can. I just meant that Linux is better in general as an OS.
>>
File: 24-08-07 21-48-20 3047.jpg (2.79 MB, 4032x3024)
Made some Migu bumper stickers. I used chink-brand white toner off Aliexpress, it worked fine for like 1/4 the price. There's a little banding at the top of the page, but I read that hologram sticker material doesn't work super well in a laser printer. I set my printer to "label" mode, it helped. No idea how inkjet ink sticks to plastic. Fuck inkjets, I'm never going back to that bullshit.
>>
>>101776724
wait you're saying you printed those entirely at home? can you list your setup in a bit more detail?
>>
>>101776724
I like these Migus
>>
>>101776747
Yeah, at home. It looks impossible but the trick is buying a white toner cart for your printer, printing a "mask" in black and white on the hologram sticker, then swapping the black toner cart back in, and feeding the paper through again for regular color.
It helps immensely to start with an image that already has white outlines around stuff, since it hides the inevitable registration mistakes between the two printing passes. I asked bing/dall-e to make Migus with a white border. The white border also makes it easier to use the "magic select" tool for creating transparent areas in the color part, and a mask for the white part.
You can also buy a $3000 printer which does it in one pass. I used a $300 canon color laser.
>>
>>101776877
No but I mean what about the glue? Or will you have to apply that manually?
>>
>>101776750
I want to use this one because it's cute and also pantsu but bing fucked it up by cutting off the left side, it's going to take more gimp work than I feel like right now.
>>
>>101776888
It's sticker material. You peel off the back, it's self adhesive.
>>
>>101776923
>gimp
unexpectedly based
>>
Something like this comes out well with just a b&w laser printer. In sunlight the hologram material is very catchy.
>>
File: 1718373963212515.png (2.19 MB, 1152x1024)
>>101776923
>>
File: file.png (2.72 MB, 1024x1024)
spamming a few overnite imagegens cause we're at bump limit
>>
File: file.png (2.73 MB, 1024x1024)
>>
>>101777172
>we're at bump limit
huh?
>>
File: file.png (2.59 MB, 1024x1024)
>>
File: file.png (2.62 MB, 1024x1024)
>>101777183
>he doesnt know about the bump limit
>>
File: file.png (2.63 MB, 1024x1024)
>>
File: file.png (2.58 MB, 1024x1024)
>>
>>101777183
Look at this newfag and laugh
>>
>>101777172
>>101777180
>>101777190
>>101777210
>>101777255
>>101777270
all shit slop, why do you even bother with spamming the same fucking images?
>>
either blind, retarded, or both
>>
File: file.png (2.55 MB, 1024x1024)
>>101777281
look at this dumb bitch lol
>>
File: file.png (2.5 MB, 1024x1024)
>>
>>101777299
DEATH
>>
Chatbots?
>>
File: file.png (2.62 MB, 1024x1024)
>>101777352
some of them come out really fucked up, AI is weird
>>
File: file.png (2.61 MB, 1024x1024)
>>
File: ComfyUI_00004_.png (1.01 MB, 768x1024)
>>
File: file.png (2.7 MB, 1024x1024)
>>
File: file.png (2.55 MB, 1024x1024)
>>
>>101777537
are you trying to fill the thread before it archives?
>>
>>101777844
yeah he thinks his slop ai genned "art" is worth something
>>
>>101777844
anon when a thread hits bump limit it doesnt matter how much you spam it afterwards, it just doesn't bump any more. anyone who doesnt understand this is new
>>
>>101777880
so what's the point on doing this, it's just retarded flooding spam
>>
>>101777886
if you dont like it, you can look away
>>
>>101777893
it breaks 4chan rules though
>>
>>101777897
so report it and see what happens
>>
>>101777909
I know that jannies don't care about AI threads on /g/ (especially aicg, but other AI threads too), that doesn't mean it's allowed.
>>
>>101777918
neither is trolling, retard-kun
>>
>>101777918
my friend, i don't think you understand. the thread hit the bump limit. posting doesn't affect the board any more. the thread is essentially dead now. imagedumping isn't even technically against rules even when a thread is live
>>
File: gemma2_9b.jpg (330 KB, 1167x1392)
>9b model same intelligence as gpt-4
how possible
>>
>>101778034
lmsys doesn't test intelligence, it tests human preference. Learn the difference already, for fuck's sake.
>>
>>101778034
>dude gemma 2b totally beats mixtral 8x7b dude I saw it on the arena
>>
>>101778094
t. butthurt ai
>>
>>101778094
Anything could beat mixtral 8x7b.
>>
>>101778119
>>101778247
Hello google sirs
>>
>>101773672
32b v2 is probably the best I've seen for RP. This is surprising, because I wasn't impressed with 72b v1 at all. (I'm hoping 72b v2 arrives soon.)
Mistral Large 2 is definitely more nuanced and natural, and I prefer that overall. However, Magnum 32b v2 is slightly better imo due to the dataset and its instruction following, which is insane for a 32b model. It doesn't repeat itself either. Again, this is only RP. I haven't tried it for anything else yet.
>>
>>101778322
>32b is better than Mistral Large
Hi Alpin.
>>
>>101778034
>Brainless parrots ask the same questions on lmsys over and over again
.>put the short list in the data set
>Get a massive advantage over those that don't.
>>
>>101778328
>>101778328
>>101778328
>>
>>101772231
Show me a model that doesn't repeat at long ctx


