/g/ - Technology


/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>101191862 & >>101186500

►News
>(06/28) Inference support for Gemma 2 merged: https://github.com/ggerganov/llama.cpp/pull/8156
>(06/27) Meta announces LLM Compiler, based on Code Llama, for code optimization and disassembly: https://go.fb.me/tdd3dw
>(06/27) Gemma 2 released: https://hf.co/collections/google/gemma-2-release-667d6600fd5220e7b967f315
>(06/25) Cambrian-1: Collection of vision-centric multimodal LLMs: https://cambrian-mllm.github.io
>(06/23) Support for BitnetForCausalLM merged: https://github.com/ggerganov/llama.cpp/pull/7931

►News Archive: https://rentry.org/lmg-news-archive
►FAQ: https://wikia.schneedc.com
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Programming: https://hf.co/spaces/bigcode/bigcode-models-leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
►Recent Highlights from the Previous Thread: >>101191862

--The Struggle is Real: Cleaning Datasets for Machine Learning Models: >>101191983 >>101192038 >>101192091 >>101192171 >>101192277 >>101192315 >>101192363 >>101192395 >>101192396 >>101192335
--Seeking EXL2 Compatible Server with OpenAI API and Context-Free Grammar Support: >>101192778 >>101193053 >>101193098 >>101193144 >>101193166
--Qwen 2's Tess-v2.5.2-Qwen2-72B Variant: A Promising AI Model: >>101193192 >>101193460 >>101193516
--Perplexity Improves with 9b Base Model: >>101192484 >>101192604 >>101192644 >>101192655 >>101192662
--Llama.cpp's Token Generation Delay with Cached Prompts: >>101195573 >>101195626 >>101195675 >>101195962 >>101195977 >>101196044 >>101196192
--LLM Compiler: Code Optimization and Disassembly Research Experiment: >>101191929 >>101192026 >>101192176 >>101193178
--Counting Letters and Custom Compiling Gemma-2 Support in Llama.cpp: >>101192460 >>101192555 >>101192897 >>101192940 >>101192964 >>101193630 >>101193760
--Gemma's Performance in Real-World RP and Potential Combinations: >>101193239 >>101193271 >>101193316 >>101193566 >>101193637
--Gemma 2's 8K Context Limitations and Meta's Unfulfilled Promises: >>101195909 >>101195953 >>101196152 >>101196394
--AI-Generated Cat Image and LLM Writing Quality: >>101193118 >>101193151 >>101193260 >>101193552 >>101193311 >>101194021 >>101194047 >>101194093 >>101194110 >>101194134 >>101194150 >>101194234 >>101194251
--27B's Performance Improvement and Schizo Fix: >>101193819 >>101193846 >>101193868 >>101193906 >>101193945 >>101195648 >>101193954 >>101193967
--Llama-70B and Gemma-27B VRAM Performance Issues: >>101194991 >>101195001 >>101195070
--gemma2's Repetitive Answers: A Potential Inference Issue: >>101192975 >>101192983 >>101193055
--Miku (free space): >>101192212 >>101192496 >>101195485 >>101196114 >>101196225 >>101196269 >>101196461 >>101196550 >>101196766

►Recent Highlight Posts from the Previous Thread: >>101191868
>>
!!! THREADLY REMINDER !!!
llama.cpp is AGPL3.0-only
>>101188248
>>
File: 1558544371796.jpg (28 KB, 604x550)
Any new image model developments?
>>
>>101197218
Pony / pony realism
>>
Why did you start a new thread?
>>
dead general dead hobby
>>
dead technology dead future
>>
I've been trying out gemma 2 27b on lmsys.
It feels A LOT like the gemini flash model, but a bit dumber. I've also noticed that it basically gives the same response each time, even with the temperature turned up.

They overall have the same feel, and my theory is that gemma 2 is just a "fork" of an earlier checkpoint of gemini flash.
>>
>>101197411
Aloneposting on /lmg/ on a Friday late night?

Yeah, that's something people should aspire to.
>>
>>101197208
!!! THREADLY REMINDER !!!
petra's timezone is UTC+1
>>
File: 1714730734021332.jpg (42 KB, 400x400)
>>101197454
>>
File: 1719588170603034.png (1.94 MB, 2560x2560)
>>101197208
>>
>https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard
what happened to miqu?
>>
>>101197438
h-hey it's Saturday night here
>>
>>101197494
She had a good run but she was finally put out to pasture. No model is SOTA forever.
>>
What is the current coom model?
>>
>>101197579
qwen2 72b
>>
>>101197438
hey its saturday morning here
>>
>>101197579
Buy an ad.
>>
What is the current coom card?
>>
>>101197613
big nigga
>>
>>101197613
The one you wrote yourself to reflect your ideal scenario
>>
>>101197652
I don't have a type
>>
>>101197660
Ask the AI to make up a character then.
>>
File: file.png (180 KB, 275x387)
>>101197660
>
>>
File: SuomiCoreMiku.png (1.48 MB, 832x1216)
>>101197660
>I don't have a type
picrel is now your type
>>
File: petra.png (100 KB, 431x270)
>>101197660
>I don't have a type
picrel is now your type
>>
Anyone had issues with 27B generating only pad tokens indefinitely until the generation is interrupted, when loaded with Transformers?
This isn't the issue some are having with 27B outputs being low quality or schizo; this is something else where it's not working at all. Just an endless string of pad tokens in response to any input.
>>
>>101197660
Python was a mistake.
>>
>>101197771
savage
>>
>>101197771
heh
>>
>>101197771
It's a fucking bane of programming.
So many headaches because of that fucking meme "language."
>>
>>101197771
there are worse languages though, imagine using javascript as the required programming language for the fucking sites on the internet, I know it sounds crazy but...
>>
my favorite quant is IQ4_NL
>>
>>101197771
Python does have types. /lmg/ proving once again to be mostly nocoders.
>>
>>101197798
only pajeets and hobbyists use plain javascript
every company with more than 2 developers codes in typescript
>>
>>101197754
I found another guy on HF getting this issue but he couldn't solve it either. Not GGUFs, just the standard FP16 weights.
Got the latest 4.43.0.dev Transformers, but it happens with the 4.42.0.dev wheel supplied by Google too. Weird.
>>
>>101197294
sampling doesn't work so logits don't work
>>
>>101197820
And the Anon that I was replying to presumably also has a type.
But in both cases there is a lack of awareness.
>>
>>101197826
typescript is the same shit anon... it's just java script but with OOP
>>
>>101197845
it has types
>>
>>101197853
yay...
>>
>>101197828
Also do_sample is off so it's not that. I'm not getting NaNs, it's just generating <pad> endlessly.
>>
>>101197411
blame avatarfags
>>
File: .png (940 KB, 891x1064)
>>101197861
forgot to attach picrel
>>
Gemma2 27b is good at poetic metre. Never seen a model spit out multiple stanzas of perfect iambic pentameter without a single mistake.
>>
>>101197845
>it's just java script but with OOP
You have no idea what you're talking about.
>>101197826
TypeScript is just for decoration, like putting makeup on a pig.
>>
>>101197882
how did you test it out anon? last time I heard about that model there were some bugs making it schizo
>>
>>101197860
>>101197828
>>101197754
Fuck, I just needed to tick the BF16 option when loading the weights. Even though the weights are FP16. I don't get it but I'll take it, it's working now.
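If anyone else hits this: the rough equivalent when loading manually with Transformers is forcing bfloat16. A minimal sketch (model id and prompt are placeholders):
[code]
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-2-27b-it"  # placeholder for whichever checkpoint you grabbed

tokenizer = AutoTokenizer.from_pretrained(model_id)
# Loading in bfloat16 instead of float16 is the "tick the BF16 option"
# equivalent, and is what stopped the endless <pad> output here.
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

inputs = tokenizer("Hello", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=32, do_sample=False)
print(tokenizer.decode(out[0], skip_special_tokens=True))
[/code]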
>>
>Bug: quantized gemma 27b output still wrong after tokenizer fix #8183
it wouldn't be llama.cpp otherwise
>>
>>101197899
>You have no idea what you're talking about.
Oh I fucking do, anon, I fucking do. I made a site project with TypeScript and it's the same fucking shit as JavaScript; I hated my life the whole time. The fact that this failed language managed to become the main language of the fucking internet is still one of the biggest mysteries of human history
>>
File: vU8ccqx.png (62 KB, 780x550)
>>101197882
>>101197901
Picrel (this was after I gave it the feathers vs. steel riddle, that is why the topic is steel)
It's the technical accuracy of the poetic metre I am impressed by specifically, not necessarily the writing quality. Other models would have struggled with consistently keeping the correct metre and messed up stress and syllables once or twice.
>>
>>101197911
sounds like skill issue
>>
>>101197913
sounds like masochism issue
>>
>>101197911
JavaScript has OOP. TypeScript adds types, hence the name, not OOP. Stupid webshitter.
>>
File: file.png (27 KB, 644x800)
>>101197918
>0==false
>AAAAAAAAAA
you
>>
>>101197911
>the fact that this failed language managed to become the main language of the fucking internet is still one of the biggest mysteries of human history
It should have been Lua.
>>
>>101197911
>Oh I fucking do, anon, I fucking do

Not him, and I hate javascript, but no, you really don't. Give it a rest.
>>
>>101197945
Lua is relabeled BASIC.
>>
>>101197949
No it's not. The syntax is just vaguely similar. BASIC doesn't have closures or first-class functions, and you can't create prototype patterns. Also, BASIC isn't designed to be embeddable.
>>
>>101197945
No, it shouldn't be any single retarded scripting language. The web should be a collection of documents, as it was originally intended to be.
Adding any scripting was the first mistake. Trying to turn web browsers into cross platform application emulators because the average bootcamp flunkie is too stupid for regular application development was the biggest mistake.
But now that we're here, Web Assembly is the correct solution. Letting those same javascript artisans that got us here in the first place gimp it from interacting with the DOM to prevent themselves from becoming obsolete was the final mistake.
>>
>>101197974
If we had to have scripting in the web browser, Lua would have been a fine solution.
Otherwise though I agree with you.
>>
>>101197938
>>101197948
didn't expect to find fucking javascript fanboys, goddam, and you people say you hate python at the same time? you lost all credibility with that stank take
>>
>/lmg/ - i am le smart
embarrassing
>>
>ask character if she's a virgin
>no
>add it into the card
>now she acts like the shyest most boring and predictable character
wat do?
>>
File: file.png (52 KB, 1060x189)
>sweet summer child
:(
>>
>>101198056
Perfect example of the model picking up on shitty cliche smut tropes, realizing that's what it's writing and implementing that into its writing style.
Try rephrasing it to "{{char}} has never had sex before" and watch it magically fix itself.
>>
Is there any good local multi-modal model that takes image and video input yet?
>>
File: ComfyUI_00707_.png (1.19 MB, 832x1216)
>>101198076
>has never had sex before
holy shit it works anon, thank you. heres a miku pic
>>
>>101198087
>011
>011
>>
>>101197911
Okay but what's wrong with it
>>
File: file.png (196 KB, 1007x627)
geg
>>
>>101198084
Even if there were, llama.cpp wouldn't support them. There's a few that support image input, but I haven't seen any local models that take video yet.
>>
>>101198107
:(
>>
>>101197864
>shit vs shit
whoa!
>>
>>101198035
thats all of /g/ at this point, catalog is trashed with anime pics and ai jeet hype-up advertisement "threads"
>>
>>101198099
there are hundreds of better programming languages than fucking javascript and you're ok with this piece of shit being required to make the internet work? the fuck?
>>
>/lmg/ - local models general
>>
>How many r's are in "strawberry"?
>There are 2 r's in "strawberry".
>Spell the word "strawberry" and tell me how many r's are in the word.
>The word "strawberry" is spelled S-T-R-A-W-B-E-R-R-Y. There are 3 r's in the word "strawberry".

I don't think the strawberry test is good for validating a model's quality, despite yesterday's meme. Rather, I think it reminds us of the way tokenization works and that it's something we must account for when asking an LLM to do tasks that are more granular than the word/token level.
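You can look at the token boundaries directly. A minimal sketch with a Transformers tokenizer (gpt2 is just a stand-in; exact splits vary per model, so treat the printed output as illustrative):
[code]
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")  # stand-in; any BPE tokenizer shows the effect
print(tok.tokenize("strawberry"))  # e.g. ['straw', 'berry']
# The model sees whole chunks, never individual letters, so counting
# r's requires it to recall the spelling hidden inside each token.
[/code]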
>>
>>101198499
>ad hominem
whats wrong with it anon?
>>
File: file.png (314 KB, 976x750)
hi
>>
File: LOL.jpg (187 KB, 1792x1400)
>>101198558
Even the creator of javascript thinks it ruined the internet but nahh nothing's wrong with it anon 1!11!1!1
https://lunduke.substack.com/p/creator-of-javascript-apologizes
>>
>>101198570
You have this saved but you cant name a reason. I'll stop replying though since it's offtopic.
>>
>>101197787
Ironically, only people who aren't programmers or are new to coding trash on Python. There's a guy at my job who was coding in C++ for 15 years. Since he tried Python he never went back, and he always fights tooth and nail to use it in our projects. I've seen it multiple times, and my personal experience and that of the colleagues around me counts for more with me than some Python memes from r/ProgrammerHumor
>>
>>101198577
>you cant name a reason
if you knew how to read text on pictures, you'll notice the reasons are cited on his tweet
>I'll stop replying though since it's offtopic.
looks convenient to leave the debate after being proved wrong :^)
>>
>>101198570
This is just attention seeking behavior
>>
>>101198591
You're right, it's just a conspiracy theory!1!1!1
>>
>>101198523
Yeah well said. It's quite robust when you expand the instructions
>### Instruction:
>Count the instances of the given Letter within the Input string. First expand the Input string into individual letters, then count the number of instances of the given Letter.
>### Input:
>strawberry
>### Letter:
>R
>### Response:
>Step 1:
>>
I think ylecunn is correct and meta should stop wasting compute on producing llms that will never have basic intuitive behavior or genuine situational awareness.
>>
>>101198663
I think lecunt is french and should go away.
>>
>>101198674
>french man
I think he sounds based
>>
https://huggingface.co/ChuckMcSneed/control_vectors/blob/main/command-r-plus/unslop1/control_vector-commandr-unslop1.gguf
Took my control vector for a test drive to see if it works correctly. During SFW everything worked perfectly and had the style that I wanted, but when the NSFW part came, slop came out. Looks like slop during SFW and slop during NSFW have different directions inside the model.
>>
File: ComfyUI_00704_.png (1.19 MB, 1024x1024)
>>101198756
>wtfpl
>>
>>101198768
Sorry, but it will likely be removed if I release it with +NIGGER license.
>>
>>101198756
how do I use this with exl2
>>
>>101198785
release it with faipl-1.0
>https://freedevproject.org/faipl-1.0/
qrd: agpl but for weights
>>
>>101198756
how do i apply faipl-1.0?
>>
File: screenshot.png (44 KB, 1489x104)
>>101198793
>faipl-1.0
HF doesn't recognize it.
>>
>>101198918
you can select other, and then add the contents of https://freedevproject.org/faipl-1.0.txt to it
>>
>>101198924
to the LICENSE file, atleast thats how animagine and many other open source models do it
>https://huggingface.co/cagliostrolab/animagine-xl-3.1
>>
>>101198918
you can also add >license_name: faipl-1.0-sd
to the readme
>>
File: bvsed.png (1.59 MB, 832x1216)
>>101198756
>faipl-1.0
>>
>>101198586
I have multiple years of experience both in Python and other languages and it is my strong conviction that allowing retards to use dynamic typing is a terrible idea.
>>
File: file.png (343 KB, 1219x886)
>ministrations
>>
>>101198586
>There's a guy at my job who was coding in C++ for 15 years.
I'm one of those guys. I did some C++ and Java for more than 10 years and Python is still my favorite language; it's just simple and elegant, the others are convoluted pieces of shit. But I chose those shit languages because they pay well kek
>>
>>101198756
>https://huggingface.co/ChuckMcSneed/control_vectors/blob/main/command-r-plus/unslop1/example_output.md
wtf this is better than lora, why doesnt sao10k use control vectors?
>>
>>101199025
0.5 > -0.5
>>
>>101198989
there isn't a sentence in there that doesn't have slop in it.
>>
>>101199102
wat do
>>
>>101198953
>
why are you gay
>>
File: 1695511767256918.jpg (72 KB, 1079x998)
>>101198565
damn
>>
>>101199025
Because control vectors lock the model into a single direction. Not everyone has the same tastes: some people like forming bonds and going on journeys(>>101199067), some like the model to be blunt and clear. There are also issues with repetition and decreased intelligence if applied too hard.
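For anons wondering what "applying" one even means: it's just adding a fixed direction to the hidden states at some layer, scaled by a strength knob. A conceptual sketch, not llama.cpp's actual code:
[code]
import torch

def apply_control_vector(hidden, direction, strength):
    # hidden: (batch, seq, dim) activations at one layer
    # direction: (dim,) learned direction, e.g. the "unslop" vector
    # Positive strength steers generations toward the direction, negative
    # away from it; too large a magnitude gives the repetition and
    # decreased intelligence mentioned above.
    return hidden + strength * direction

hidden = torch.randn(1, 16, 4096)
direction = torch.randn(4096)
steered = apply_control_vector(hidden, direction, 0.5)
[/code]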
>>
>>101199204
what would happen if you merged multiple control vectors then
>>
>>101199134
When you prompt LLM to write prose or roleplay it will always give you slop.

There are two ways
1) obvious - stop prompting it to write slop.
2) remove slop - autoprompt to rewrite output removing all slop.

the rest is cope. No amount of ko-fi finetunes or control vectors can fix it. It's the datasets, the training. Either you have that, or you have braindead rambling.
>>
File: made-in-heaven.gif (476 KB, 220x127)
>>101199229
The universe accelerates so fast it gets reset.
>>
>>101199229
All of them would apply.
>>
>>101199242
hi x1,000,000000,00000000000,000000000000000000,000000000000000000000000000000000
>>
>>101199237
>1) obvious - stop prompting it to write slop.
how? the system prompt? the cards? i dont like prose at all but its not like theres much choice in models
>2) remove slop - autoprompt to rewrite output removing all slop.
how do i autoprompt? i know i can manually clean slop from replies but sometimes its too much, takes a toll of its own
>>
>>101197434
Interesting theory. But it's only useful if they did it the other way around - as in lobotomizing the final flash model. If a way is found to re-add ze 6 bazinglion context window and multimodal capabilities it would be huge.
>>
File: GRKcPqpaMAA9Ekq.jpg (101 KB, 1038x642)
Visionbros it is so over...
>>
>>101199300
>no chameleon
>no Cambrian

meds, schizzo
>>
>>101197208
CUDA dev, release your critical code with AGPL, with a condition that it becomes MIT or whatever in 3/6 months, or something like that, just so the non-contributing parties can't sit at the current state of the art, yet you can say that you are providing business-friendly code once it reaches a stable status
>>
>>101199314
>Chameleon
"Oh yeah, that happened": the model
Is there a single person who has tried it?
>>
File: file.png (180 KB, 1014x546)
no matter how much i reroll she answers with my weight, wat do
>>
>>101199385
use a better model
>>
what's a good API provider where I can rapidly test different models?
>>
>>101199403
for example?
>>
File: btfo.png (227 KB, 1044x840)
>>
>>101199459
i fucking hate llms so much
>>
happynameday.today
>>
File: ComfyUI_00508_-940e891e.png (1.94 MB, 1024x1024)
>>101199471
>happynameday.today
>>
>>101199459
is this stheno or something like that?
>>
>>101199548
stheno 8b 3.2, any recommendations?
>>
>>101199562
stheno works ok for prompts that aren't 80% coom focused. I did wholesome adventures with it. So you'll have to learn to write your own cards.
>>
>>101199599
gib cards
>>
>>101199459
garbage in garbage out
>>
>>101199636
Anons when the AI doesn't write a masterpiece or a symphony after typing in a single dot:
>>
>>101199609
just write it yourself nigga, it's not hard, think of a girl YOU would want to plap and describe her. You don't have to follow any weird formatting, writing plain text works too.

{{char}} is a party girl. {{char}} likes drinking and handholding with strangers. Stuff like that.
>>
>>101197434
google is distilling their models from bigger ones so I wouldn't be surprised
>>
>>101199487
Damn, Petra looking good
>>
File: file.png (227 KB, 1008x649)
>>
>>101198989
>Doing the other RPers actions
I hate people like you.
>>
File: file.png (144 KB, 1055x403)
>>
>>101197218
pixart team said they're working on a bigger model
>>
File: file.png (284 KB, 931x731)
>>
>no more cards to pat
its over..
>>
What quant to use for gemma 27b for 24GB VRAM?
>>
File: file.png (233 KB, 946x781)
>>
File: file.png (91 KB, 626x241)
what
>>
File: 33f1n4.png (213 KB, 384x480)
>>101199946
pat this
>>
>>101200080
he a man tho
>>
>>101200079
B'Hig Cox.
>>
File: file.png (107 KB, 1050x315)
least gay card
>>
>>101197831
Wait, what? If it can't know logits, how does it choose which token to add?
>>
>>101200080
Biggie Jong Un
>>
>>101200126
Are you running this with 0.2 smoothing or something?
>>
>>101200277
everything is neutralized besides
Temperature - 1.12-1.22
Min-P - 0.075
Top-K - 50
Repetition Penalty - 1.1
>>
>All these logs
>No Nala
How disappointing.
>>
>>101200234
the pat cut
>>
Thank you greg. Fuck.
It's fully possible that they are worrying about something that's actually normal behavior, as weird as it is.
It's probably a bug, yeah, but check instead of assuming.
>>
File: file.png (82 KB, 852x276)
I am going fucking insane, spent 2 hours wrangling. I need to resist the urge to ask totally irrelevant questions.
>>
Is Gemma 2 27B working on local yet or can I go back to sleep?
>>
>>101200518
If you had the ability to sleep why not just sleep?
>>
>>101200394
Working on another merge right now, so I should have some new official Nala tests soon.
Also uploading a 70B merge right now, but it keeps failing overnight (the computer it's uploading from is on a wireless ethernet bridge, so it's a bit too finicky for the HF web uploader; I've been going 1 file at a time as I have time for the last couple of days).
New Nala tests and at least 1 new model by the end of the day.
>>
>>101200544
>1 new model
licensed under FAIPL-1.0?
>>
>>101200550
If I don't make it cc-by-nc petra anon will show up and bully me.
>>
File: Untitle11d.jpg (106 KB, 640x640)
>>101200560
>cc-by-nc
b-based..
>>
Licensing model weights seems pretty dubious in the first place if I'm to be honest. What's to stop someone from just taking a set of weights, applying a 360 degree rotation to all the tensors and walking away with a different license since it's technically a different set of weights now?
>>
>>101200497
1. INST generate output
2. INST is this slop/out of character/? respond with yes or no
3. if yes, INST rewrite, goto 2. else display output to user.
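A minimal sketch of that loop in Python; generate() is a placeholder for whatever backend completion call you use, and the retry cap is an addition so step 3 can't spin forever:
[code]
def unslop_loop(generate, user_prompt, max_retries=3):
    # generate(prompt) -> str stands in for your backend's completion call.
    output = generate(f"[INST] {user_prompt} [/INST]")
    for _ in range(max_retries):
        verdict = generate(
            "[INST] Is the following reply slop or out of character? "
            f"Respond with yes or no.\n\n{output} [/INST]"
        )
        if "yes" not in verdict.lower():
            break  # passed the check, show it to the user
        output = generate(
            f"[INST] Rewrite the following reply, removing all slop:\n\n{output} [/INST]"
        )
    return output
[/code]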
>>
>>101200593
me
>>
>>101200497
Your complaint is unclear.

But IMHO it can be more fun having a meta discussion with the LLM about the RP than the RP itself. Especially on runs where the LLM is acting like AGI instead of like 77 IQ.
>>
>>101200593
Depends on whether the argument is that it's a creative work (protected), simply a list of data (not protected), or a combination (how cookbooks work under copyright law: the ingredients are unprotected as a list of facts, the instructions are protected because they are a designed procedure).
>>
>>101200629
found the artcel
>>
>>101200593
Copyright all the symmetry groups.
>>
File: file.png (341 KB, 1162x887)
>r/4chan
geg
>>
>>101200684
reads like some kind of mixtral meme model
>>
>>101200733
settings from >>101171560
model is the normal 8b stheno 3.2
>>
>>101199025
Logs?
>>
>>101200923
re-read the post
>>
File: firefox_CFznzCa19K.png (121 KB, 1536x747)
Which one should I download? I don't even know how much space it needs in RAM / VRAM
>>
>>101200948
None, download 3.3
>>
>>101200948
this post is bait, right?
>>
>>101200966
No. Before that I was looking for the max RAM required, etc.
>>
>>101200948
>IQ3_S-imat
what do the first I and the imat mean?
>>
>>101200948
RAM=model size+15%
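Same rule as a snippet, if you want numbers (the 15% is the rough figure above, not a measured constant, and long contexts push it higher):
[code]
def ram_needed_gb(gguf_size_gb, overhead=0.15):
    # Rule of thumb: file size plus ~15% for KV cache and runtime buffers.
    return gguf_size_gb * (1.0 + overhead)

print(ram_needed_gb(8.5))  # an ~8.5 GB Q8 8B comes out to roughly 9.8 GB
[/code]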
>>
>>101201000
imat is iMatrix. It is used with tiny quants so they aren't as stupid.

IQ is a different system of quants from the common K series, with a different trade-off of size and performance. An IQ quant is smaller than the K quant of the same Q number.

IQ and iMat are unrelated but can appear together, e.g. i1-IQ3_XXS
>>
>>101199237
>1) obvious - stop prompting it to write slop.

Prompting only works if it is consistent with entrainment. In any conflict between the sysprompt and entrainment, entrainment wins.
>>
>>101201064
I see, thanks for the answer anon, much appreciated
>>
I read like half of the stuff in the OP and now I'm talking to my GPU.
she's kinda stupid and keeps changing her syntax but I think I can figure this out, or try different models and stuff at least.
I've made it this far
>>
>>101200948
I think the q8 base requirement for that model is a little less than 8GB, but if you run it at a higher context the RAM requirement increases. I'd estimate q8 at 16k context would be around 11-ish GB.
>>
>>101201119
>I read like half of the stuff in the OP
You are leaps ahead of most people. Congrats.
>>
>>101201119
based
>>
>>101201119
>I read
You are leaps ahead of most people. Congrats.
>>
>>101201087
i never said to prompt it to not write slop.
i said to not prompt it to write slop.
>>
Should I have downloaded Koboldcpp instead of koboldai?
I have 12gigs of vram, but a decent processor. it looks like the main difference is streaming?
>>
call me georgi the way I'm gerganov to AI chatbots
>>
>>101201486
KoboldAI is all but defunct; kcpp is current and for mixed GPU/CPU setups, while exl2 is for rawdogging pure GPU.
>>
File: file.png (184 KB, 1037x546)
uhhh bros...
>>
>>101201527
thanks. assuming I can just copy models from the KoboldAI 'models' folder to a similar folder in Koboldcpp so I don't have to redownload them?
>>
Gemma had two major issues at launch which we know of so far.

The first was an incorrect tokenizer, which was fixed relatively quickly though a lot of GGUFs were made before that.

The second issue, which was discovered much later, was that Logit Soft-Capping, which Gemma-2 was trained with but which was initially not implemented in Transformers due to it conflicting with flash attention, was far more important than Google had believed it to be. Especially for the larger model.
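
For reference, the soft-capping in question is just a smooth tanh squash on the logits. A minimal sketch (the cap values are the ones reported for Gemma 2; treat them as assumptions):
[code]
import torch

def soft_cap(logits, cap):
    # Squashes logits smoothly into (-cap, cap): cap * tanh(logits / cap).
    # Skipping it at inference lets attention scores leave the range the
    # model was trained with, which is where the degradation comes from.
    return cap * torch.tanh(logits / cap)

# Reported Gemma 2 values: 50.0 for attention logits, 30.0 for the final logits.
x = torch.randn(4, 8) * 100
print(soft_cap(x, 50.0))
[/code]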

The first issue (broken tokenizer) has been fixed for a while, and fixed GGUFs have been uploaded to Bartowski's account. But the second issue has not been fixed in llama.cpp yet. There is a PR but it has not been merged, though it likely will be very soon based on the recent approvals.

It was first believed that GGUFs would have to be remade after the PR got merged, but a default value was added for the soft-capping which means that old GGUFs will work as soon as the PR is merged.

So to summarize, if you download a GGUF from bartowski right now it will work as soon as the PR is merged, but before then you will experience degraded performance. Especially on the 27b model, which is entirely broken at certain tasks at the moment.

It's entirely possible that there are issues beyond just these two. It's not rare for various bugs to rear their heads when a new architecture emerges after all. And I have seen some say that they are experiencing issues even after the fixes. Like this post.

It's also worth noting that since llama.cpp does not support sliding window attention at the moment it will likely perform pretty poorly with context sizes larger than 4K. There is an issue open for sliding window attention but it has not really been worked on so far since few models actually use it.

40 upvotes

I honestly had no idea how shit /lmg/ is. I just hated you cause you are mentally ill.
>>
>>101201532
Ai wuz not understand the nigger language n shieeet
>>
File: 1384537776668.gif (89 KB, 600x450)
>>101201593
>>
>>101201606
how is google releasing a broken, undercooked model our fault?
>>
>>101201709
You didn't summarize the last two days' worth of threads in ELI5 fashion so he could understand what was going on.
>>
is typhon 8x7b still king?
>>
Anons, are these good?
Nous-Capybara-limarpv3-34B
Mixtral-8x7B-Instruct-v0.1
PsyMedRP-v1-20B
Fimbulvetr v2
>>
What's the current best local model for creating an app (and so you can learn programming in the process)?
>>
>>101201995
DeepSeekCoder-V2-236B
>>
>>101201951
old. but i guess for your size theres not much better, i guess stheno is ok but its very sloppy, fimb is very nice, psymedrp? eh didnt enjoy it. mixtral instruct? use mixtral instruct limarp zloss dareties or whatever its called instead. >34b
dunno
>>
>>101201951
Switch out plain Mixtral for Mixtral-8x7B-Instruct-v0.1-LimaRP-ZLoss.
>>
>>101202018
or typhon, but right now im messing with typhon to see how it was, seems worse than stheno, probably a skill issue
>>101202038
is it better than typhon? i remember using it and switching it for typhon because everyone was shilling typhon, then i tried typhon and it was meh
>>
File: DiaJIigUwAAtzRh-orig.jpg (64 KB, 437x437)
>>101199798
That's exactly what I wanted to hear. Thanks anon!
>>
>https://huggingface.co/InferenceIllusionist/TeTO-MS-8x7b/tree/main
b-bros?
>>
>>101202093
Imagine if you had posted actual meaningful text that created conversation instead of farting something r9k would turn you away for.
>>
>>101202093
>mixtral
yawn
cute tet though
>>
According to the new HF leaderboard, Qwen is the top model, while CR+ is much lower. Does that actually align with people's usage of the models? I rarely ever hear about Qwen being good, or bad.
>>
>>101202193
> Does that actually align with people's usage of the models?
no
>>
>>101202193
For me, CR+ is solid for RP and has occasionally acted too smart.
Qwen2 liked to spontaneously disrespect which of us was playing which role.

But for things other than RP, Qwen might be better. Different models, different strengths.
>>
>>101202193
Qwen 2 is underrated for the size, but CR+ is just better.
>>
File: Chibi-Tet.png (469 KB, 512x768)
>>101202093
Howdy fren.
Just my love letter to Mixtral. I know it's not the new shiny thing out there by any means but I haven't seen many Model Stock experiments being done with it yet so my curiosity got the best of me. Might release a follow-up but only if it actually proves to be an improvement over this one.
>>101202173
>cute tet
Thanks anon. Featureless Flat 2D Mix does a pretty decent job with chibi style out of the box without LoRAs.
>>
>>101202012
What GPU would you need to run that? I have a 3090 ti.
>>
>>101202420
are you saying it's the best mixtral sexo tune?
>>
>>101201751
Between "explain like I'm 5" and /lmg/'s shit answers there is "informative". And that was informative. I have never seen an informative post like that one here.
>>
>>101202454
go back
>>
How is Qwen supposed to be pronounced anyway? Isn't it an abbreviation of Tongyi Qianwen?
>>
>>101202473
Probably something like a hissed "chwun" I would guess. Q is a hissed CH sound, and the "e" is usually close to the ə/uh sound of unstressed vowels in English, like how "the" becomes "thuh" in front of certain words or always when someone is retarded.
>>
>>101202435
MoE so you could probably get by with that 3090 if you have lots of RAM
>>
>>101202473
>>101202503
it's just kwen, I remember hearing one of their guys say it in some xitter space once
>>
>>101202435
How much system ram do you have to back the layer swaps?
I had to dial down to IQ3_XXS to get it to function on a 4070, and that was glacial because it's still 85GB at that quant, and I've only 64GB system RAM.
>>
>>101202473
The q is like a ch sound, and the wen is like if you tried pronouncing "wn".
Also you can go into Google translate and put in 通义千问 to listen to exactly how tong yi qian wen is pronounced.
>>
>>101202473
kvyen
>>
>>101202473
qwen
>>
>>101202576
>义
The excite.com mascot?
>>
If I apply a 4-bit qlora to a 3.5bpw Exllama quant of the same model the qlora was trained off, is it gonna become retarded?
>>
>>101202473
"Quwhen"
>>
>>101201995
>>101202012
Just use WLM, it's slightly better than DS but is way smaller:
>https://prollm.toqan.ai/leaderboard/coding-assistant
>>
>>101202751
if u have 128gb of ram DS is faster
>>
>>101202018
Why do you talk as if you had brain damage?
>>
>>101202193
Qwen2 is solid, but I prefer Commandr's writing style. L3 is >8k shitter.
>>
>>101202751
>Provider: Stack Overflow
>Evaluation Method: Auto-evaluation with GPT4 - Turbo
You expect me to take this seriously? WLM just scores higher because it is turboslopped.
>>
>>101202809
my mind is very unorganized so when i lazily type it turns out like this.
t. listening to ear licking asmr 24/7 to curb his internal monologue
>>
>>101202777
Whoa I didn't know that. It looks like DS has 21B active params during inference compared to WLM's 44B so I can see it definitely being faster.
>>
>>101202751
>>101202777
>>101202834
>>101202570
So for a coding assistant/teacher, with 3090ti and 32GB the best one is?
>>
>>101202882
Phi-3-Mini
>>
>>101202882
Claude 3.5
>>
>>101202882
>3090ti
look at this dude..
>32GB
XDDDDDDDDDDDDDDDDDDDDDDDDD
>>
>>101202882
Dude, either one but you need to get more RAM. It's cheap, you have no excuse.
>>
>>101202882
>i have more ram+vram than an rtx 3090ti owner
>12+64>24+32
GEEEEEEEEEEG
>>
>>101202420
from preliminary testing i give this model a 10/10
DISCLAMER: t. used stheno for the past few days
>>
File: GRP8is0acAAyZZj.png (561 KB, 2664x2775)
>>
L3-8B-Stheno-v3.2-Q5_K_M, is this fine? Or should I pick Q8 and not bother?
>>
>>101203067
linux
>>
File: GRLxTGSXIAEbA5s.png (200 KB, 680x591)
https://x.com/rohanpaul_ai/status/1806772036125008087
https://github.com/FlagOpen/FlagEmbedding/tree/master/Long_LLM/activation_beacon

Using activation beacons to increase context from 4K to 400K while maintaining the same VRAM usage, only minimally increasing inference time, and keeping the same level of quality
>>
>>101203067
I think you should buy an ad.
>>
>>101202914
>It's cheap, you have no excuse.
64GB would set me back £204
128GB would set me back £315
>>
>>101202882
>So for a coding assistant/teacher
This is the question I'm looking to answer sometime soon.
32 GB system is really tight even if you're willing to accept slow (1 word per second) output rates. 64 gets you many of the 70B class models if you quant down. DeepSeek, still not enough. You'd probably need 96 to get in the door for slow DSC.

>>101202751
>Just use WLM
Wizard can't even strawberry. It said that there's 1 r after I made it spell out the word explicitly. I'm not sure if I trust that kind of LLM thinking.
>>
>>101203105
what the fuck? i thought prices were bad here.. you can get a new ddr4 32gig stick for like 70$
>>
>>101203132
My 32GB is 2x16 GB sticks. They're imported. For some reason I couldn't get them in the UK at the time. They still cost £140 on ebay.
>>
>>101203182
what if you just threw those puny tiny sticks out and replaced them with 2 32gig sticks instead?
>>
>>101203195
I don't know if I want to drop £204 on it. I may sell the sticks first.
>>
>>101203212
ddr5?
>>
File: 00058-3694687329.png (284 KB, 512x512)
It's up, boys!

https://huggingface.co/Envoid/Llama-3-TenyxChat-DaybreakStorywriter-70B
>>
>>101203075
>Just 9 hours on 8xA800 machine for a 7b, bro.
>>
>>101203231
G.SKILL 32GB DDR4 Trident Z Royal Gold 3600Mhz PC4-28800 CL16 1.35V Dual Channel Kit
>>
>>101203256
do you really have to get the top of the line?
>>
>>101203008
kek
>>
>>101203234
gguf when
>>
>>101203234
we bac
>>
File: 1716774852831426.jpg (7 KB, 431x400)
>>101203262
>>101203234
>>The resulting model scores significantly higher on the super top secret, private NALA evaluation
>>
>>101203281
They're gold though. 3600mhz and cl 16.

What would you replace them with?
>>
>>101203234
We are finally entering the era of long brainrot names for llama3 sloptunes. That means a breakthrough is coming soon right?
>>
>>101203234
What was TenyxChat supposed to be?
>>
>>101202054
>probably a skill issue
not a skill issue, mixtral tunes are boring, smarter than stheno for sure but not as fun
>>
>>101202473
I pronounce it "Queen"
>>
>>101203396
TenyxChat is a finetune by Tenyx. I don't know who they are but their Llama-3 finetune is pretty good. (Their Mixtral finetune was good too) They get overshadowed by nous and bagel and all that shit but they have their own niche. And their models can be pretty naughty with the right coaxing despite being "safe".
>>
>>101203408
>cummming cuuuummming oohhh i'm cummmiing
so fun
>>
>>101203067
Q8, never use lesser than q8 quants on small models
>>
>>101197169
So, any new erp local models? Up to 70B
>>
what's the best model for rp? is it still midnight miqu?
>>
>>101203234
That TenyxChat has a lot of fixing to do after I tried a L3-70B-daybreak-storywriter-v0.4 just now:

Me> I turn away, about to head up the building's stairs to my second level apartment.
It>
>I see you're about to leave, and I feel a sudden surge of panic. I don't want you to go away. I want to be near you. I want to touch you. I want to feel your skin against mine. I want to smell you. I want to taste you. I want to be with you. I want to be yours. I want you to be mine. I want to belong to you. I want you to belong to me. I want to be yours alone. I want you to be mine alone. I want to be with you always. I want you to be with me always. I want to be yours forever. I want you to be mine forever.
>I'm so close now. I can smell you. I can feel your heat. I can see the sweat on your skin. I can see the muscles in your back. I can see the strength in your arms. I can see the power in your legs. I can see the beauty in your face. I can see the kindness in your eyes. I can see the love in your heart. I can see the goodness in your soul.
>I want to touch you. I want to feel your skin against mine. I want to taste your sweat. I want to smell your heat. I want to feel your muscles. I want to feel your strength. I want to feel your power. I want to feel your beauty. I want to feel your kindness. I want to feel your love. I want to feel your goodness.

This goes on for 12 paragraphs.
What a Storywriter. It must get great marks on Hemmingway app.

Another time it did a similar thing but gave distance. Same paragraph but the first sentence went 50 yards, 20 yards, 10 yards... I gave up when it "took a few steps" to go from 0.5 inches to 0.25 inches away.
>>
>>101203436
As the sun dipped below the horizon, casting long shadows across the well-trodden path, they knew that their journey had only just begun. The bonds they had forged, tempered in the fires of shared experience, would guide them forward, a beacon of hope in the face of any challenge. Together, they would embrace the unknown, knowing that life's greatest adventures were best shared with those who mattered most.
>>
>>101203545
I hate that kind of prose worse than repeated catchphrases.
That wispy affirmatively speculating about the future shit.
>they knew...only just begun
>would guide them
>face any challenge
>would embrace the unknown
>life's greatest adventures

If any of that shit's legit, tell those stories. If not, cut.
>>
>>101203545
GPTslop prose is the written equivalent of corporate memphis.
>>
>>101202054
Typhon is a pile of steaming slop shit. It's being shilled by one anon who proceeds to shit on every tune someone mentions in the thread, so it would look better by comparison. Basically badmouthing anything else and hoping that some newbie would download typhon instead.
>>
>>101203591
that's the classic useless, flowery prose of mixtral finetunes like BMT or Typhon. The main reason why I dropped them
>>
>>101203624
>who proceeds to shit on every tune someone mentions in the thread
That's every finetuner, Sao being the worst case. Even NovelAI shills come here to do that when Dreamgen or SpellBound release theirs.
>>
>>101203234
Are retards still finetuning with "shivers down my spine" and "bonds of shared trust and respect" artifical data slop or have they learned?
>>
>>101203694
>or have they learned?
lol lmao
>>
>>101203694
The slop comes from human writing. The smarter a model the more slop it will have because it will have more vectors pointing towards cliches in writing. One of the reasons dumber models generally have more sovl.
>>
>>101203624
>by one anon who proceeds to shit on every tune someone mentions in the thread
I actually don't remember this happening. Aren't you another finetuner basically doing the badmouthing?
>>
>>101203129
I'm using Bing AI to crib a program together.
>>
>>101203624
got anything better?
>>
anyone who tried the new deepseek coder v2- how does it compare to the "big ones"? (4o/sonnet 3.5)
>>
Have they fixed gemma 27b? Are my tensors safe?
>>
>>101202420
for some reason it gets braindead at around 5.2k context. likely a temp issue?
tried both sampling settings from the repo, using the instruct/context templates from the repo as well
>iq4_xs llamacpp_hf
>>
>>101203821
No, wait another week
>>
>havent lurked in months since the general went to shit
>completely in the dark for news
>check this thread
>petraposter is back
ah so that means a new model dropped and its actually good, so, what is it?
>>
>>101203831
>or its a me contaminating the model issue
>>
>>101203719
>Aren't you another finetuner basically doing the badmouthing?
I wish I could finetune anything on my trash tier hardware, I wouldn't have to rely on retarded sloptunes
>>
File: file.png (35 KB, 361x487)
>>101203855
i was never gone
>>
File: JOKER BABY.png (109 KB, 319x215)
>its another 8k context release
dropped, see you next tuesday.
>>
File: 1710161200327265.jpg (2.06 MB, 1416x1978)
>>101197169
>>
>>101203855
gemma 2 27b
>https://eqbench.com/creative_writing.html
>>
>>101203855
No, he's currently mad that he accidentally leaked his timezone yesterday.
>>101196178
>>101196185
>>101196305
>>
>>101203903
>https://eqbench.com/creative_writing.html
The benchmark is BS, but if you read the actual examples, the Gemma ones are really easier on the eyes. It's also noticeable how similar they are to the Gemini Pro samples.

It'd be entirely possible to not have sloptext. It's just that nobody cares. And no, ko-fi finetuners can't fix this.
>>
>>101203903
holy benchmark, ill believe it when i see it

>>101203916
kek what a chode
>>
>>101203916
Petra is German. "Petra" is a very common German name.
>>
why do mixtral models become retarded after 4k context
t. retard
>>
>>101203855
>new model dropped and its actually good
>goy slop from jewgle
you can't be this retarded.
>>
>>101203916
>timezone leak
wow you sure got him! /s
>>
>>101204052
>he lacks reading comprehension
do you come from africa like petraposter does?
>>
>>101204067
hi petra
>>
File: 29390 - SoyBooru.png (139 KB, 775x1232)
>>101203855
Gemma WNBAG
>>
>>101196305
>that timezone
/lmg/ has shitskins now lmao, i wonder how many of them are mikufags, hmm...
>>
>>101204139
nice deflection, petrus
>>
I'm testing SPPO again today. This time I used the Nala card. Essentially it feels the same as vanilla Instruct, but a bit less varied and with a bit more focus paid to Nala's lion features. By less varied, I mean that small differences in the Instruct formatting don't affect its response as much, compared to Meta's Instruct. Specifically I tested ST's L3 Instruct preset, vs Instruct with names, vs with the preset's system prompt and with it deleted.

Honestly though it's not a huge difference. Was kind of hard to tell. Maybe a more complex card would show the difference or if I play with this card more.
>>
>>101204168
Thank you for your report
I tried the model with my RP card and it did better than most although Stheno somehow is still the best at it.
Base instruct EOS'd on the first message, it sucks.
>>
>>101198756
I've made another unslop control vector, this time aimed at NSFW and it made the model a bit horny and a bit more optimistic as a side effect. Will do a test run to see if it's worth releasing.
>>
aicg proxies are dying left and right, so be ready
>>
>>101204408
you are already dead
>>
https://github.com/ggerganov/llama.cpp/pull/8197
So about that Gemma 2 support in ollama, worked on alongside Google engineers... It was basically a shitshow?
>>
>>101204429
nani the fuck?
>>
>>101204444
i have to agree
>>
>>101204408
So then pretend that 8B is good so that we can keep them from ruining the GPU market.
>>
>>101204470
The average aicg user has a higher IQ than the average lmg user, who's able to eat their own shit and be happy.
>>
>>101204483
/aicg/ became an extension of reddit as soon as the proxies showed up.
>>
I can run Command R+ 5bit exl2 now.
>>
>>101204510
No, /lmg/ is an extension of /r/LocalLLaMA.
>>
>>101204510
all of 4chan post 2016 is just an extension of reddit where you're allowed to say nigger and retard freely
>>
>>101204483
not the one on /g/
>>
>>101204408
and why should I care, exactly?
>>
>>101204532
good.
>>
>>101203952
It's actually really interesting to see how some models from completely different companies wrote structurally very similar stories with the same prompt. It seems like there are "groups" that write in similar ways, no doubt having to do with something that goes into the training. The benchmark itself is useless because claude rating them comes with its own biases, but at least the samples are interesting to see.
>>
>>101204444
The fact that these models are so distinct from the usual llama variants gives me a smidge of hope that once they are working properly and at full capacity they'll be better than current models in the same weight range.
It would be kind of depressing to witness all the efforts to try and make the thing work only for it to be bad.
>>
>>101204523
Pretty much. /lmg/ is just the /r/LocalLLaMA chatroom. You can see that when something happens, no one posts a source, but people will just begin discussing it out of the blue because they get their news from LocalLLaMA but come here to talk about it.
When a source is posted, it's like 3 people posting the same link at the same time they got from reddit without checking if it was already posted before.
>>
>>101204663
I think most people use Twitter for news. Too much stupid shilling on Reddit.
>>
>>101204663
>>101204523
redditor here and I agree.
>>
hey lads, just returning from a break. what's the best lewd RP model these days? On a 4090. also, how does said model compare to Claude/GPT etc?
>>
>>101204663
This is why i lurk once every few months or so, even having that tinge of redditor in me because i get my news from people who get their news from reddit feels like a tumor on my brain
CUDAdev is the only real one, still have no idea why he bothers with this place

>that said he shouldn't have suggested that idea from the other day
>>
>>101204694
Buy an ad.
>>
>>101204710
are you claiming i'm shilling the 4090? lmao jealous little faggot *revs fans*
>>
>>101204707
>still have no idea why he bothers with this place
He’s racist.
>>
>>101204753
Anyone with more than 2 brains cells to rub together is.
>>
File: 1708811302667.png (522 KB, 774x776)
>>101204792
*with less than 2 brain cells
>>
>>101204245
Tbh I still haven't tested Stheno, but yeah I think I'll stop testing SPPO. At most it's just an enhancement of Instruct, and this technique is not enough to be its own full tune, until they demonstrate one.
>>
>>101204694
Stheno 3.2, mixtral 8x7b, command R, possibly qwen 2.
>>
>>101204753
>>101204792
nice
>>
>>101204660
It's probably good I think, but still, not great when it's still an 8k class.
>>
>>101204707
>CUDAdev is the only real one, still have no idea why he bothers with this place
My opinion is that adversity is important for good discussions and actually learning things.
Reddit is designed in a way that discourages disagreement, so you end up with a lot of Dunning-Krugers who never get told that they're retarded.
And also assuming everything is shit until proven otherwise is a better approach to the wider AI space where there are grifters and scammy papers/projects everywhere.

>>101204753
I definitely do not consider myself as such.
>>
>>101204816
>0 foreskin by this ID
>>
>>101204823
thanks anon!
>>
>>101204819
It is a technique, after all. There's also the consideration of the data they used.
>>
>>101204872
For mixtral, try limarp zloss
>>
>>101204866
You forgot your trip. And I remember you got banned for Russian discussions.
>>
>>101197945
reminder that lua was created for use by petroleum engineers, not software developers
this is why lua uses pants-on-head retarded 1-based indexing
>>
>>101204866
>>101204913
I did forget my trip but I don't see what that has to do with anything.
>>
>>101204902
I keep bouncing back to zloss, it's my "just works" model.
>>
limaballs
>>
>>101204957
Pretty much.
I pretty much use just stheno these days but only because I'm playing around with chaining prompts, and mixtral is pretty slow on my system. 8b I can offload fully and get instant responses, but had I slightly better hardware, I'd 100% be using 8x7b limarp zloss . Maybe the qwen2 moe too.
>>
>>101205004
>>101205004
>>101205004
>>
>>101204816
>median american *household* income is $56k
is this bullshit or has working in tech just really skewed my perspective here
>>
>>101205087
The only 56k my white ass has ever enjoyed was a dial up modem.
All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.