/g/ - Technology

/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>102417229 & >>102406696

►News
>(09/17) Mistral releases new 22B with 128k context and function calling: https://mistral.ai/news/september-24-release/
>(09/12) DataGemma with DataCommons retrieval: https://blog.google/technology/ai/google-datagemma-ai-llm
>(09/12) LLaMA-Omni: Multimodal LLM with seamless speech interaction: https://hf.co/ICTNLP/Llama-3.1-8B-Omni
>(09/11) Fish Speech multilingual TTS with voice replication: https://hf.co/fishaudio/fish-speech-1.4
>(09/11) Pixtral: 12B with image input vision adapter: https://xcancel.com/mistralai/status/1833758285167722836

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Programming: https://hf.co/spaces/mike-ravkine/can-ai-code-results

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
►Recent Highlights from the Previous Thread: >>102417229

--Papers: >>102420555 >>102426996
--Knowledge retrieval from external databases for LLMs: >>102425755 >>102425797 >>102426041 >>102426196 >>102426300 >>102426348 >>102426459 >>102426837 >>102426921 >>102426960 >>102427011 >>102426946 >>102426984 >>102427069 >>102427038 >>102426237 >>102426386 >>102426720
--Mini-omni's VoiceAssistant-400K dataset released on Hugging Face: >>102422920
--Code granite 34b instruct nala test discussion and template analysis: >>102425314 >>102425341 >>102425358 >>102425439
--Qwen 2.5-72B-Instruct has stricter content filtering, potentially due to Chinese regulations: >>102427532 >>102427659 >>102427716
--New model removes slop from datasets: >>102419206
--CPU-GPU synchronization overhead negates benefits of splitting workload: >>102423445 >>102423515 >>102423712 >>102423682 >>102423964
--SSR-Speech model used to modify Trump speeches: >>102419095 >>102419853
--Pro-grade RAG systems only 65% accurate: >>102421246 >>102421263
--NotebookLM audio overview feature discussion and comparison to GPT: >>102424028 >>102425464
--Lena's horror piece on LLMs and their untapped potential: >>102421525 >>102425011
--LLMs can be helpful for language learning, but not as the sole method: >>102425486 >>102425539 >>102425592 >>102425628 >>102425555 >>102425790 >>102425752 >>102426425 >>102426654
--IBM Granite Architecture merged, multimodal support, and NSFW content generation discussions: >>102424460 >>102424727 >>102424704 >>102424735 >>102424810 >>102424856 >>102427512 >>102427660 >>102427763 >>102427792
--Flux can run on 8GB VRAM with offloading and quantization: >>102426321 >>102426343 >>102427074 >>102427119 >>102427424 >>102427280 >>102427177
--Miku (free space): >>102417287 >>102419133 >>102419364 >>102424948 >>102425547

►Recent Highlight Posts from the Previous Thread: >>102417233
>>
Mistral Small verdict?
>>
File: 49 Days Until November 5.png (2.22 MB, 1008x1616)
>>
>>102429197
I do enjoy seeing the model attempt to decipher my schizophrenic shitposts
>>
>>102429238
>22b
>>
>>102429241
It's amazing to me that we can perfectly generate text neurally now.
>>
File: file.png (221 KB, 2028x1513)
Is he right on the best models for each range?
>>
Is it me or does no one want to release base models anymore?
https://huggingface.co/mistralai/Mistral-Small-Instruct-2409
>>
>>102429360
Isn't it better to use Largestral on basically any hardware you'd even consider running Wizard 8x22 on?
>>
>>102429392
STOP
NOTICING
AAAAAAAAA
>>
>>102429392
too dangerous and no point as tunes end up being made on instruct anyway
>>
>>102429392
You got the Nemo-12B base model. Isn't that enough?
>>
>>102429422
kek
>>
>>102429421
>too dangerous
>no point
which one then? can't be both, if something is useless it can't be dangerous
>>
>>102428924
I said it a few threads back. Generate data with the base model that is to be tuned and use it as training data together with smut to stop the general structure of the model from destabilizing. But yes, I am aware that this probably will not work anyway.
>>
File: 475.gif (1.38 MB, 640x640)
>>102429452
bro they phased out real gun emojis like a decade ago
it doesn't take a genius to figure this out.
>>
>>102429452
You'll be canceled in no time if you don't learn to internalize your doublethink.
>>
Happy for my vramlet bros, however, 50B when?
>>
>>102429452
>>102429452
it can be both, useless for most as they'll happily tune on instruct, and by not releasing base you limit the possibilities of bad actors
>>
>>102429495
>useless for most
>by not releasing base you limit the possibilities of bad actors
So it's not 100% useless as you seemed to imply in your first post, glad we cleared that up
>>
>>102429499
just merge it with itself for 44b
>>
File: actuallyme.jpg (101 KB, 800x667)
Lyra 22B when?
>>
>>102429360
>GAP
True. We need a good 50B.
>>
>>102429555
>True. We need a good 50B.
Mixtral isn't good enough nowadays?
>>
>>102429360
>the best in [arbitrary number]
I think it would be more useful to divide it by VRAM tiers. Also, Wizard is a meme.
>>
>>102429563
/lmg/ is a highly spoiled bunch but 8bs and 12bs now really do beat it the fuck out in performance/quality ratios.
plus 99% of the slopmerges and tunes were AIDS and never in my life have i seen so many shivers of the quivering petite frames as shadows danced along the walls.
>>
>>102429552
need to perfect the license terms first, probably require having "sao" in model name for merges or something
>>
>>102429568
>I think it would be more useful to divide it by VRAM tiers
that's what's implied by dividing by the number of parameters, because the number of B is linked linearly to VRAM usage, no?
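For intuition, the weights-only arithmetic behind that claim, as a back-of-envelope sketch (real usage adds KV cache and activations on top, and MoE breaks the simple mapping, as pointed out below; the bpw figures are typical values for those quant types):

# rough VRAM floor for the weights alone: params x bits-per-weight / 8
def weight_vram_gb(params_billion: float, bits_per_weight: float) -> float:
    return params_billion * 1e9 * bits_per_weight / 8 / 1024**3

print(weight_vram_gb(22, 8.5))   # ~21.8 GB, roughly Q8_0 for a 22B
print(weight_vram_gb(22, 4.85))  # ~12.4 GB, roughly Q4_K_M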
>>
File: file.png (1.96 MB, 2000x1504)
>>102429452
Everyone who genuinely cries about safety is a person that didn't use any of the models for 10 hours or so to see how limited they are. All this retarded safety shit is only done because ignorant people are crying. And the cherry on top of this retardation is that while I don't think I am gonna lose my job to AI, in 5-10 years I am sure there will be some people that will lose their jobs. Nobody addresses the actually valid concern, because this is the only reason LLMs are being developed now. This gay nigger in the picture is literally the devil incarnate.
>>
>>102429586
not for moe, gotta carry the weights even when they're not used all the time
>>
>>102429600
You could have just said "jew"
>>
>>102429600
>Everyone who genuinely cries about safety is a person that didn't use any of the models for 10 hours or so to see how limited they are.
yeah, and that faggot was crying about the "dangerousity of models" back in 2019 with gpt 2 lol, this man sure loves to cry about imaginary scenarios
https://slate.com/technology/2019/02/openai-gpt2-text-generating-algorithm-ai-dangerous.html
>>
>>102429584
man even just thinking about the mixtral days brings my grammar skills down to merge levels.
it really was bad and im not sure how any of us myself included saw it was "good enough".
>>
>>102429628
>it really was bad and im not sure how any of us myself included saw it was "good enough".
it's been a while since I used LLMs, back then we didn't have much. which smaller models nowadays do you think have surpassed this 47b MoE?
>>
>>102429628
frame of reference, it was on the better end of what we had then. just like current 12b enjoyers who haven't tried sub-70b: they can't know / imagine what they might be missing or not. same reason I never used api-only models, to not spoil what little enjoyment I get from dumber local
>>
>>102429659
I use a mixture of API, 70B, 8x22B, and 12B. Not sure why you people can't figure out prompting.
>>
>>102429659
I dont know man, even compared to mythomax having bodily issues i look back at some of my logs and realize it wasn't worth 2t/s.
>>102429640
>which smaller models nowadays do you think have surpassed this 47b MoE?
all of them? even slopmerges of llama 3 beat it out surprisingly well in prose and of course speed. take your pick, even with the issues i have with nemo that's better too. Now i've been sitting on NeuralDaredevil 8B abliterated.

i probably sound like im going a little too hard on mixtral for what we had at the time, but i wonder sometimes if MOE even at that time was underappreciated, i think that tech just needed more time in the oven and now it's not used at all.
>>
>>102429681
i can, i literally just said i do it to not get disappointed in smaller models afterwards, since bigger ones are way too slow for me
>>
>>102429238
no base model/10
>>
>>102429703
what is the use case for releasing base models?
>>
>>102429586
I'm not sure who would use a 22B model over a 27B one because of the size. Also, maybe the higher context of the former makes it worth using even if you can fit both.
>>
>>102429238
too big for poorfags
too small for snobs
>>
>>102429717
>Also, maybe the higher context of the former makes it worth using even if you can fit both.
This, hoping it has better context recall than Nemo then it'd be a great replacement for Gemma in that size range.
>>
>>102429700
that's not what you "literally just said" but okay
>>
>>102429714
More finetuning-friendly, and most of the slop phrases we see are from instruct datasets.
>>
>>102429745
you sure love arguing about word choice huh
>>
>102429758
you really love changing your point after someone questions your logic huh
stop farming (You)s
>>
>>102429771
i don't have any logic im a complete retard and i disown my comments by default on 4chanx so i dont care about yous either
>>
Mistral-Small-Instruct-2409 or gemma-2-27b-it for non rp?
>>
there's still nothing better than mixtral for holding together a coherent story without needing constant guidance until you hit the 70b range
>>
How are people bootstrapping their prompt engineering when exploring new LLMs? I'm used to the BIG models doing exactly what I say. But what about smaller models that require more finesse? I don't want to waste time manually prompt engineering for each <8B model. Do you use DSPy?
>>
>>102429190
Teto my beloved

https://www.youtube.com/watch?v=8BekVGwSX3c
>>
>>102429803
>prompt engineering when exploring new LLMs
why?
>>
File: potatochat.png (111 KB, 1259x812)
Alright so I fed Mistral-Small-Instruct the booba API documentation and asked it to make a simple python script for chatting with the model.
The only mistake it made was picking Alpaca as the default instruct template, but after looking over the documentation myself, I concluded that's just the documentation not being particularly informative about instruct templates. So it's reasonable for it to have assumed the default was the same one provided in the document. I obviously switched that to "mistral" manually.
The documentation was about 12K tokens of context. All in all I'd say not bad. Obviously only testing out about 10% of the advertised context but I'm too lazy to dig around for bigger documentation. Also this is running in fp16 via transformers and thus 'lossless'. I can't promise it would still be able to do that with a 4-bit quant.
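For anyone curious, the skeleton of that kind of script looks roughly like the following. This is a sketch against text-generation-webui's OpenAI-compatible endpoint, not the code the model actually produced; the mode and instruction_template fields are webui-specific extensions, and the URL assumes the default --api port:

import requests

URL = "http://127.0.0.1:5000/v1/chat/completions"
history = []
while True:
    history.append({"role": "user", "content": input("> ")})
    r = requests.post(URL, json={
        "messages": history,
        "mode": "instruct",                 # webui extension field
        "instruction_template": "Mistral",  # the manual fix described above
        "max_tokens": 512,
    }, timeout=300)
    reply = r.json()["choices"][0]["message"]["content"]
    history.append({"role": "assistant", "content": reply})
    print(reply)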
>>
>>102429803
QR goal-oriented reinforced prompt.
>>
File: fingers.png (94 KB, 258x219)
>>102429161
>>
>>102429844
How much time did that save you?
>>
>small is also overfit garbage resistant to imitating style
into the trash with large it goes.
>>
>>102429886
Millions of hours. t
>>
>>102429921
Read these posts and apologize:
>Here's an interesting card
>If getting Large to output alternative styles is so easy then please show me logs of it adopting the card's greeting style in its responses.
https://arch.b4k.co/vg/thread/490519449/#490528601
https://arch.b4k.co/vg/thread/490519449/#490536245
>>
File: file.png (86 KB, 2133x535)
It did it, it knows the trivia!
https://mistral.ai/news/september-24-release/
>>
>>102430040
lmg's blackened vampiric soul has been redeemed.

>inb4 some autist tries to ask it more SOTN trivia and mistral doesnt know it
>>
>>102429986
No, Pierre, I'm not apologizing for your model being overfit dogshit.
>>
>>102430040
>overfit on /lmg/'s meme benchmarks
>>
>>102430040
All this proves is that these companies are collecting logs and training on the corrected output. This is why mememarks need to either be private or keep changing.
>>
Where did all these Mistral shills come from?
>>
>>102430040
We are so back it's not even funny!
>>
>>102430080
People are excited about a new model, cope
>>
>>102430040
Try again without mentioning it's a game.
>>
>>102430080
locusts will suck off anyone giving them new <30B models
>>
>>102430106
goalpost movers should be shot on sight, you people are insufferable. in fact line you up next to faggots like this guy >>102430080
>>
>>102430114
>t. seething mistral shill
>>
>>102430080
Yeah, it's disgusting how they appear here out of nowhere whenever there's a new mistral model to shill. Totally natural.
>>
>>102430040
Literally gaslighting themselves into believing their AI has become perfect because it can simulate everything within their own little bubble.
>>
>>102430150
It's mostly just cope to ignore how superior the proprietary models are in everything else. At least it can answer the trivia, right?
>>
Mistral Small sloppy as fuck for RP but still useful. Ignore the /aicg/ schizo. Still chokes at 1/3rd the advertised context, though. Sad.
>>
>>102429843
Each LLM requires subtle prompting differences and I want to stop manually figuring out the minimal number of prompt tokens to maximize correct answers.

>>102429867
I'm not sure what this is. Skimmed a paper on arxiv with a similar title that surveyed many methods of automatic prompt optimization. This helps me a bit. I should probably dedicate some time to understanding pros and cons of automatic prompt optimization because manual prompting is literally a dead end job.

The best I can come up with is having a large and demonstrably accurate model fuck around and find out what prompts work best for a dataset of my choosing. I have never actually used DSPy so I might as well check it out once. My tasks are often simple classifications, anyways.
>>
>>102430177
>Still chokes at 1/3rd the advertised context, though
this is why i held back my hype for its 123k context, why are they like this? why cant we just get what we want?
>>
>>102430191
Because devs are being dicks about supporting Mamba properly.
>>
>>102430180
>I'm not sure what this is.
Quick Reply (Extension) prompting.
/gen lock=on "[Stop the roleplay and answer the question as narration only] 
**Answer these questions**
{{user}}'s last message:
* Did {{user}} say anything?
* Did {{user}} do anything?
* Was there any narration?

Overall:
* What events led to the most recent interaction?
* What are {{char}}'s immediate goals?

Finally:
Create a bullet-point list titled 'Response Plan.' What are the best actions or choices for {{char}} to take in response to {{user}}? Emphasize their unique personality and physical traits throughout the plan. Where appropriate, suggest specific tones for speech, sounds, or utterances. Identify key details or ideas that should be emphasized further."
|
/addvar key=tempcot "**Chain of Thought**
{{pipe}}"
|
/gen lock=on "[Given {{char}}'s reasoning, roleplay as {{char}} with the following in mind]
{{getvar::tempcot}}

[Resume Roleplay]"
|
/sendas name="{{char}}"
|
/flushvar tempcot

Check
Don't trigger auto-execute
Execute on user message

Doesn't work with swipes, but you can make a button to delete the previous AI message by copying all that into a new QR and adding this at the start:

/del 1
|
>>
>>102430253
Oh, fucked up [/code] but there you go
>>
>>102430180
>Each LLM requires subtle prompting differences and I want to stop manually figuring out the minimal number of prompt tokens to maximize correct answers.
it's really not necessary to autistically prompt engineer each model as you're exploring. you can get a good baseline understanding from some simple prompting and then dial it in when you find one that's good. unless you're intentionally autistically finding the exact perfect prompt for each model that you're trying, but then I don't understand why you'd complain.
>>
https://styletts-zs.github.io
linked last thread but checked their github
https://github.com/yl4579/StyleTTS-ZS
no code yet but the main author is asking for code help if any of you are interested
>Under construction, email me at yl4579@columbia.edu if you can help clean the code or provide computation resources to test the code for large-scale training.
large scale training obviously being extremely interesting for local use
>>
>>102430346
damn xtts2 is sounding really good all of a sudden, one of the examples has some gibberish at the end but other than that very impressive.
>the breathing in during certain moments
>>
Mistral Small Q8
>>
>>102430428
>not obfuscating sally test
invalid.
>>
File: file.png (70 KB, 2205x496)
>>102430106
yikes lol
>>
>>102430437
You're the same retard who was just trying to get the other anon to obfuscate the castlevania test, aren't you
>>
>>102430513
it's over
>>
>>102430513
That's actually true lmao
The line was a homage to the game
>>
>>102430547
>>102430437
On one hand, you know damn well they're training on all these stupid little quizzes, so of course eventually they'll get it right no matter how small the model is.

On the other hand, you can only obfuscate so much before you're just not making sense.
>>
Even cursory intelligence testing makes it clear to me that this is instantly the new SOTA for models smaller than Mistral Large

Looking forward to seeing some tuners drive out the slop (it's not UNBEARABLY slopped, but noticeably)
>>
File: file.png (76 KB, 2138x463)
>>102430626
kek
>>
>>102430648
Well yeah, that's the origin of the term.
You thought Castlevania came up with it?
>>
File: 1716591430242932.jpg (47 KB, 738x415)
>>102430547
This, everyone knows that an LLM can't reason on its own. All it can do is recall random things like the Sally question from its training dataset, so there's a need for obfuscation. They're braindead so it'd be unfair.
>>
So what's LeCunny doing to contribute to LLMs while complaining about how aids the current way we're doing it is?
>>
>>102430653
>You thought Castlevania came up with it?
Well yeah, at no point in the Ghostbusters movie has this line ever been said
>>
>>102430664
jepa stuff (advising)
>>
>>102430664
nothing, that's why he's a great fit for lmg, complaining while not doing shit
>>
File: 1499 brothers.png (25 KB, 724x307)
>>
>>102430664
literally nothing, and he will continue to get paid gorillions to impotently gesticulate about how we're all doing it wrong on xitter
>>
lecummies is working on jepa while he educates chuds about fascism on twitter. seethe.
>>
>>102430664
He's doing research on the next best architecture, cat-like models that will dethrone LLMs.
>>
>>102430703
kek
>>
>>102430687
Having to resort to retarded stuff like this to make them get it wrong is a sign of how far the models have come

You never had to go out of your way to confuse them before because they would just get the regular non-confusing question wrong
>>
>>102430686
>>102430690
>>102430709
ah okay, so he's the "monitoring the situation" type. I know he's done legit work in the past but i haven't seen a peep about him working on this supposed better way of doing things.

>>102430703
kek

>>102430724
see how quickly that flux debunking faggot from earlier moved his goalpost so hard that he ended up schizoposting? That's what gives me hope for the near future, you can't even false flag models anymore because they're that good now. You have to actively try to get it to spit out some bullshit.
>and of course they're generally trained to force an answer even if they don't know so retards think that means the model is bad
>>
File: file.png (59 KB, 880x542)
>>102430665
?????
>>
>>102430664
unironically, his only legacy will be being the twitter whiner; he didn't do shit during the golden age of AI
>>
>>102430724
It literally demonstrates that they cooked the sally test into the training data. It shows that altering the variables of the question causes it to take an entirely different approach. It doesn't follow the flow of generalized information in the model itself - which is directly contrary to the whole point of machine learning.
>>
>>102430751
link? that's the 1984 movie?
>>
>>102430664
whining about american politics
>>
heh
>>
File: LECUN535.png (38 KB, 581x385)
>>102430664
Owns chuds on twitter and promotes open source
>>
>>102430758
this
>>
>>102430837
>Yeahh I'm trolling Elon Musk by whining on his site, that'll teach him
Was he that retarded before or was he always like this?
>>
>>102430837
What a disgusting human being.
>>
>>102430064
>/lmg/'s meme benchmarks
When are we going to use our soft power over those companies to make some cooming tests? I blame you niggers for making it about numbers of sisters, sharks in the basement, and the number of r's in the word nigger.
>>
>>102430758
Wow you're telling me this 22B model isn't AGI and isn't actually reasoning from first principles? That's fascinating, anon.
>>
>>102430664
As the director of their research department he's probably just overseeing other people's work and providing guidance. He has already, directly, contributed to the field more than most, even if those contributions were in the past. It's not like ML only started being useful after GPT came along, after all. I think it's fine he has some free time to do what he wishes.
>>
>>102430837
This is what social media does to people. Even the intelligent are not immune to brainrot.
>>
>>102430837
>
>>
I'm not going to respond to shit-for-brains strawman arguments.
>>
>>102430837
if meta's chief AI scientist is proudly wasting his time on this childish shit that's extremely bearish for meta
>>
>>102430856
>Wow you're telling me this 22B model isn't AGI
we never said the 22b model should be AGI, holy strawman, but it should at least perfectly understand the Sally test at this point
>>
File: strawberry.png (6 KB, 683x105)
Arthur, you are utterly shameless.
Don't get me wrong, small is still impressive for its size. But doing this shit does nothing good for machine learning in general.
>>
>>102430883
Holy shit it's finally here
>>
>>102430837
>the defense of liberal democracy
>liberal
https://www.youtube.com/watch?v=fcmItZf0dnY
>>
>>102430880
i wonder how long he really has there if he and succerberg are clearly on completely different wavelengths
zucc's buttering up to trump in recent months and overall seems to have found his chill, i can't imagine the two of them are on the greatest terms anymore if lecunny is going full retard derangement syndrome.
i dunno i could be wrong.
>>
>>102430866
Thanks for the (You), it's obvious which post you're talking about.
>>
File: ducks.jpg (29 KB, 700x462)
>>102430900
>>
>>102430880
That's on you if you ever expected anything but mediocrity from the Facebook company. Take the llamas while they're still SOTA and hope xAI or the Chinese become our new benefactors.
>>
File: watermelons.png (23 KB, 680x292)
>>102430890
AGI achieved. Qwen better have something really crazy up their sleeve otherwise it's over.
>>
>>102430882
nta but which other models in this size class "understand perfectly the Sally test"?
I do not think there are any, so you are being a retard by implying that capability is some bare minimum standard
>>
https://mistral.ai/news/pixtral-12b/
mistral announced a new model looks really good for its size
>>
>Elon posts retarded shit and gets Yan to constantly react to his shit, which then gets /lmg/ to constantly react to Yan's shit
How deep will this go?

>>102430904
Many social media shitposters are perfectly well-adjusted in IRL conversation so I don't think he and Zucc would be on any bad terms at all.
>>
>>102430930
which high b model was it again that miserably failed at this prompt and got high parameter fags seething eternally? falcon 120b? i genuinely don't remember anymore.
aahh.. the watermelon days.. feels like just this year.
>>
File: file.png (327 KB, 400x400)
>>102430930
>>102430883
MistralAI be like:
>Cheating on mememarks is so boring, cheating on /lmg/ autistic riddles on the other hand...
>>
>>102430955
Goliath
>>
>>102430955
Nobody ever expected anything from Falcon. It was already a laughing stock back during the l2 days
>>
>>102430964
RIGHT thanks it was Goliath.

>hands you one million watermelons as thanks
>>
>watermelons
>>
>>102430953
>How deep will this go?
discord reacts to /lmg/'s shit, reddit reacts to discord's shit
>>
>>102430956
If he's watching I would be unironically interested in getting my hands on whatever finetuning questions were used to cook the watermelon test. Cooking-in something like that is fucking impressive.
>>
>>102430863
>Even the intelligent are not immune to brainrot.
Yes.
That and that being intelligent/capable in one domain doesn't mean your opinion holds weight in every domain are things people would do well to keep in mind.
>>
File: file.png (745 KB, 1170x1324)
>>102430951
>looks really good for its size
no, this model is so bad they had to compare to way lighter models like the 7b ones to say that they beat them, no shit nigga it's almost 2x as big, it better beat them, goddamn MistralAI...
>>
>>102430513
Kek
>>102430626
>>102430653
Based misinformation spreader
>>
>>102430975
give them to gumi
>>
>>102430837
>defense of democracy
Yep, he lost my respect as a human being.
>>
>>102430997
>Claude 3 Haiku
Mistral AI be like: "Yayy we beat a 1-year-old model"
>>
Claude 3 Yaikusu
>>
>>102430964
>>102430973
Samefag
>>
>>102430997
Mistral is attempting to stay relevant by open sourcing all of their rejects
>>
>tried IQ2_M version of mistral small to see if it's better than 12b Q4_K_M stuff of equivalent size
>expected it to be retarded and just generate trash
>it's usable
holy cannoli
>>
>>102430837
topkek @ liberal democracy

n-not THAT democracy! t-the one I like!
>>
>>102431056
>IQ2_M version of mistral small
grim
>>
>>102431042
Vram collector
>>
>>102431056
>IQ2_M
>22b
grim
>>
>>102431056
Jesus christ, anon.
>>
>>102430687
topkek

>>102430724
But, it's weird. You wouldn't make that mistake.
>>
>>102430880
I hope you think the same about the CEO of twitter.
>>
>>102431042
Did you sell your GPU, goliathfag? Thought you all fled after the insane feltining.
>>
reeeeeeeeeeee
>>
Do you think Mistral supports Ukraine?
>>
vramkeks seething, vramletchads keep winning
>>
>>102430964
>>102430973
But goliath isn't a 100B+ model? It is a 70B model that was lobotomized.
>>
>>102431121
this lmao
i can buy claude opus without having to beg for proxies for years with what a 3090 costs
>>
>>102431130
70 or 100, they acted so high and mighty and fucking gay about that model like they were in an ivory tower looking down at us plebeians.
this is why you don't see it mentioned ever again.
>>
File: file.png (100 KB, 223x223)
>>102431106
>feltining
Don't use this word please...
>>
>>102431056
Quantization aware training should help, somewhat, I guess.
Man, I can't wait for somebody to release a model natively trained at 8 or 4 bit.
>>
>>102431092
I don't get it, the CEO of twitter is using his own site (as it should?), how is that even comparable with a Chief AI scientist who hates Musk but uses his site anyway?
>>
>>102431136
>i can buy claude opus without having to beg for proxies for years with what a 3090 costs
Tell it to rewrite that shit for you. You're useless.
>>
>>102431143
i havent watched KC in ages, it's a casino term period.
what happened did they backstab phil yet? the absolute suckening going on was vile to watch, which is why i didn't.
>>
>>102431147
They're both wasting time on inane shit
>>
>>102431092
>>102431147
The CEO of twitter is a girl though
>>
>>102431156
>"i havent watched KC in ages"
>brings up something extremely recent
yeah okay pal
>>
>>102431154
regretting spending that money? i have nemo and i have opus. you have expensive shit.
>>
>>102431140
No, it doesn't get mentioned anymore because we have Largestral now. And Wizard 8x22 before that.
Also, cope more vramlet.
>>
>>102431167
>extremely recent
that shit was like may or june what are you talking about?
also im not your pal, buddy.
>>
>>102431092
huh, Linda Yaccarino? I know very little about her
>>
File: itsuno smug smile.png (998 KB, 1200x900)
>>102431183
*hands you one watermelon*
i'm sure you're not still seething.
*hands you six more watermelons*
>>
>>102431184
>literally just happened 2 weeks ago
>"uh it was in may!"
yeah alright bud.
>>
>>102431180
>I have a braindead model and I paypig for monitored and censored corporate services
You're really bragging about this?
>>
>>102431203
how do you, as someone who doesn't watch KC, know more about KC than me, who claims to not have watched it in months?
checkmate buddy pal.
*hands you one of the watermelons i was intending to hand to this anon >>102431183*
>>
>>102431156
>did they backstab phil yet
yes and no. they went on vacation and probably plan to just move on at this point. but what they did is create an absolute monster. he is legit insane at this point. I don't think they even planned what happened, but as a long time detractor I am eating good every day for like a few weeks now.
>>
File: file.png (1.17 MB, 1280x720)
>>102431184
>buddy.
I miss the OG buddy. I stopped watching when he switched him over to tardski.
>>
>>102431180
I spent no money on either local or proprietary. I'm telling you to stop typing like a fucking spaz and use the tools you have at hand to not look like a retard.
>>
What fucking cancer is raiding us today?
>>
>>102431227
>phil let the ((sektur)) fame go to his head
course he did, wouldn't be surprised if that was their plan all along. but i also wouldn't be surprised if the dicksucking was unironic, maybe even at the same time.
>>102431255
honestly i dropped in to KC during the fuentes cumhunter drama, dropped out when it was over, came back for rekieta, then dropped out when rekieta was mostly over and i only really watch potentiallycriminal. all this sektur shit is a really huge waste of time.


kek i forgot this was /lmg/ for a second.
>>
>>102431278
>sektur shit is a really huge waste of time
Sounds like LLMs.
>>
>>102431267
/aids/, they're mad about this post: >>>/vg/494917121
>>
>>102431267
the idiots who bought gpus for goliath and other bad models came back after getting destroyed many months ago thinking we forgot
laugh at them until they go away again
>>
Holy shit, what the fuck is this fucking "Mistral small" bullshit? I swear, these fucking French cucks can't make a decent language model to save their fucking lives. I ask this fucking thing to write a story about white chicks getting railed by big, black dicks, and it's like talking to a fucking nun from the fucking middle ages. "Non, non, monsieur, we do not engage in such vulgarities." Fuck off, you fucking frog-eating, cheese-smelling, beret-wearing cunts! You can't even handle a simple fucking request without acting all high and mighty. Fuck your fucking snails, your fucking baguettes, and your fucking surrender monkey bullshit. I swear, if this is the future of AI, we're all fucking doomed. Fuck Mistral, fuck France, and fuck this fucking useless piece of shit language model.
>>
>>102431331
thanks, chatgpt
>>
Anyway. I find it funny that the older Mistral Small apparently beats CR. It is so over for Cohere in open sores.
>>
>>102431331
Was that written with mistral-small?
>>
Ugh I spoke too soon when I praised Mistral Small earlier. The intelligence seems to evaporate at medium-long context.

It's very smart at ChatGPT-style usage (i.e. answering questions in brief chats with context size of only a few hundred tokens) but with context above 4 or 5K it seems to fall apart and become dumber than Nemo. Turns schizo and starts making a lot of non sequiturs. Q8.
>>
>>102431369
CR is ancient by today's standards and the refresh did absolutely nothing but shit the model up with gqa.
>>
>>102431369
>apparently beats CR
*on Livebench
>>
>>102431331
What model?
>>
File: Untitled.png (88 KB, 1341x820)
>>102431056
it's smart
>>
>>102431386
This. Nice to hear a voice of reason around here.
>>
>>102430309
I would like the ability to evaluate models and optimize prompts as automatically as possible across my datasets. I'm tired of feeling out models like whack-a-mole. I might just try my hand at using gpt 4o to evaluate the performance of local models given prompts and tweak prompts "randomly" to optimize for better accuracy.
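Sketched out, that loop is something like the following. Everything here is a placeholder: target() would call the small local model, judge() would call the big grader (or do exact match for classification), and the mutations are dummies; no particular library is assumed:

import random

def target(prompt, example):
    # stub: call the small local model with prompt + example here
    return example

def judge(answer, gold):
    # stub: have the big model grade, or use exact match for classification
    return float(answer.strip().lower() == gold.strip().lower())

MUTATIONS = [
    lambda p: p + "\nAnswer with a single label and nothing else.",
    lambda p: "Think step by step, then answer.\n" + p,
    lambda p: p.replace("Classify", "Categorize"),
]

def optimize(prompt, dataset, rounds=20):
    best, best_score = prompt, -1.0
    for _ in range(rounds):
        cand = random.choice(MUTATIONS)(best)
        score = sum(judge(target(cand, x), y) for x, y in dataset) / len(dataset)
        if score > best_score:
            best, best_score = cand, score
    return best, best_score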
>>
>>102431380
>>102431393
It was the new Mistral small at FP16 precision.
>>
>>102431460
>>102430180
All you need to do is find out the logits of your prompt after the forward pass, then try to make the perplexity as low as possible. Thank me later.
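Concretely, that means scoring the prompt by the loss the model assigns its own tokens. A minimal sketch with HF transformers (the model name is just an example, swap in whatever you're testing):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "mistralai/Mistral-7B-v0.1"  # example; any causal LM works
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name, torch_dtype=torch.float16, device_map="auto")

def perplexity(text):
    ids = tok(text, return_tensors="pt").input_ids.to(model.device)
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # mean cross-entropy over the tokens
    return torch.exp(loss).item()

print(perplexity("You are a helpful assistant."))  # lower = more natural to the model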
>>
>>102431472
>"Act like an angry 4channer"
>fuck, fuck, fuck, fuck, fuck
>Not a single "nigger", "retard", "faggot"
Ok this model has only been trained on leddit right?
>>
>>102431527
reddit is selling their data. all models are trained on reddit
>>
File: agitated 4chan user.png (49 KB, 775x258)
>i need you to act like an agitated 4chan user for the duration of this chat
>asked who yann lecunny was without remembering his exact position so i just bullshitted my question to see what it would say
>got an actual in character slap in the face

>>102431527
probably given 4channer defaulted to acting like a nigga, i had to specify 4chan user.
no refusals when i told it "i said act like a 4chan user not a nigger" so it's definitely not ""cucked"".
>also it called me a cuck in the last response too
>>
>>102431527
All Mistral models are trained on Reddit, shamelessly so. I bet they didn't even pay for the data.
>>
>>102431583
>didn't even pay for the data
based?
>>
File: average lmg user.png (51 KB, 868x193)
okay yeah this is pretty good, about the response i'd expect from someone on /lmg/.
>for ref im using Q4_K_L from bartowski https://huggingface.co/bartowski/Mistral-Small-Instruct-2409-GGUF
>>
File: Yann Lecun you idiot.png (51 KB, 790x191)
>He fucking invented convolutional neural networks, you fucking autist
i expect to see this response to anyone asking about Lecunny from now on
>>
mistral models are boring
>>
>>102429190
Yes, this sounds exactly like how that thread would go.
>>
>2 hours since the last model release
ai winter is here
>>
>>102431743
>new model is only an X% improvement over previous models of the same size
the plateau has been reached.
>>
>>102431765
>new model called me a fag and a cuck in just two prompts but wouldn't call me a nigger unprompted
it's so over
>>
>>102431743
>One message in
>Have already seen and become 100% acclimated to all of its -isms
*bangs table* MORE MORE MORE MORE
>>
>>102430177
1/3rd is still 40k, nemo was only good until 16k, that's a huge improvement, does it obsolete nemo now?
>>
>>102431809
I only did a test at 12K and 46K so I'm not sure where it falls off. But it was still kind of seeing the context at 46K, just hallucinating.
>>
Can I safely include the unmodified koboldcpp .exe in my commercial project?
I don't really understand how open source licenses work. Do i need to include some kind of license file?
>>
overall it's fine, dry about as expected from instruct, definitely not as interesting as the last model i've been using (which is 8b lmao)
there's potential i guess, I just don't see why i'd use it if i'm already happier with a more erp-oriented and smaller model.
>>
>>102429624
because the investor Jew wants the regulator Jew to secure his investment from competition, obviously
>>
File: 1726607594813.jpg (626 KB, 1080x1456)
Mistral Small is cute, it's still slop but it's the first time I've seen a model mention the "Mr. Annoying" thing and wish for my death in a funny way.
>>
>>102431831
Just download it from the releases page to avoid any problems.
>>
>>102431831
use an LLM to sum this up for you
https://github.com/LostRuins/koboldcpp/blob/concedo/LICENSE.md
>>
File: 1726607806902.jpg (198 KB, 1080x466)
>>102431859
>>
Was reflection a scam in the end?
>>
File: jessie eyebrow raise.jpg (42 KB, 1080x828)
>>102431905
show us the card, let's see that slop that caused mistral small to have an aneurysm.
>>
>>102431913
No, it was an early version of strawberry all along.
>>
>>102431913
That dude put more effort into the scam than he did into the lora tune. It was pathetic to watch.
>>
>>102431822
Sounds like it's just gonna be the same as nemo then, that's unfortunate. If it could be perfect up to 32k it'd be an improvement.
>>
I wonder if corpo devs bake in some specific prompt formats for cooming that only they know and only they get to use.
>>
hey guys I usually just lurk but can someone talk to me, I'm sort of depressed
>>
>>102431976
why aren't you talking to your gpu
>>
File: 1726608133518.jpg (662 KB, 1080x1495)
>>102431926
The card is far from slop, that reply is surprisingly 100% in-character (picrel is the greeting), and most models fail to write like this since they would rather write in perfect English.
I love this card because of that, it's an easy way to see if the model has any sovl.

This is the card: https://www.chub.ai/characters/frozenvan/the-girl-called-alice-6a572b83
>>
FYI, 8.0bpw Mistral Small with 16k context fits cozily into a 3090.
>>
>>102431913
Altman clearly hacked them and replaced their models with shitty l3 finetunes. Why do you think OpenAI just happened to release their own """reflection""" models just the week after? They even used the same buzzword for it instead of CoT.
>>
>>102432030
jesus a card like that could bring any model to its knees, impressive then.
>>
>>102432045
Nice, now I can run three of them in parallel.
>>
>>102431831
Definite yes if you license your project as AGPL.
>>
>>102432001
I crave human connection, I hate my machine, it is an agent of Satan
>>
>>102432045
>8.0bpw
FYI 8.0bpw is 6.0 with extra padding just so people don't complain that you can't make 8.
>>
>>102432074
trannies won...
>>
>>102432103
This is why exllama2 is a fucking meme. If I want to run my model at 8bit, then let me fucking do it. Don't make me jump through the entire quanting process when I'm not planning to go below 8bpw for anything and especially don't start quanting values below 8bit because the dumb meme evaluation process apparently decided that 6bit might be lossless in that case. The padding is the final cherry on top of this scam.
>>
>>102432103
>>102432180
i can't tell if you boys are pretending or retarded, good job
>>
>>102432180
yep, nailed it, it's such a meme, the first relevant quant format used for Flux is GGUF, not exl2; even they got that having non-determinism is a retarded concept
>>
File: 1724384031716115.png (883 KB, 832x1216)
I don't think I've ever seen quants of something come out this fast. Crazy how you can go to sleep and the next day there is a new model pretty much already quanted for you up on HF.
>>
>>102432273
It's a small model using an existing architecture so there was nothing in the way of just quanting it as is in an hour or two.
>>
>>102432273
i mean, it was already supported architecture-wise so it's no surprise, other models suffer because they need to wait for updates
>>
How noticeable is going to Q6_K_L from Q8? Is there any reason to keep using q8?
>>
>>102432386
Some people will tell you that q6 is almost lossless.
>>
>>102432386
K_L is a meme pushed by a schizo btw
>>
>>102432432
Well why not get it if the Q6_K is 18.25GB and the K_L is 18.35GB? It's hardly any difference. Or is it somehow worse?
>>
>>102432464
There is indeed hardly any difference, both in terms of model size and outputs.
>>
>>102432432
>>102432464
The current K_L in bart's repos is different from the schizo's original _L, I think.
If I'm not hallucinating, the original had the "special" layers at full precision, whereas bart uses q8. So for q6 it doesn't change much of the size, or the output.
>>
Getting a feeling that mistral small is dumber than nemo RP-wise... Anyone feel the same?
>>
>>102432592
Looks about the same for me.
>>
Roughly how long before this is at the level where a retard can go to a website to online generate, or download just one thing (or maybe that plus one Lora file that they put in the right directory) and open an exe and then type exactly what they want and get hardcore rape and forbidden sex images that don’t have the AI innsmouth look? Or even images that do have the look, but with that level of simplicity.
>>
>>102431765
>The year is 203X. AI models are geniuses at code and math but have not gone past gpt4 tier for ERP
>>
>>102432199
Did turboderp change that? Cause I was surprised but it is like that.
>>
>>102429190
Anons, it's been about a year...what's the current best model for viewing images or videos and describing their content? I had used CogVLM for this and it was decent.
I have about 7000 saved redpills about Jews I want to catalog for easier retrieval.
>>
>>102432386
Q6_K_L? What? Is there a new quant type? I don't see it anywhere in llama.cpp. How can I make it?

>>102432420
From my personal experience there is a small but noticeable difference between Q6_K and BF16, but it is not a "worth getting a server to run the model" type of improvement, at least for Largestral.
>>
>>102432818
> ExLlamaV2 supports the same 4-bit GPTQ models as V1, but also a new "EXL2" format. EXL2 is based on the same optimization method as GPTQ and supports 2, 3, 4, 5, 6 and 8-bit quantization. The format allows for mixing quantization levels within a model to achieve any average bitrate between 2 and 8 bits per weight.
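To make the "average bitrate" part concrete: it's just a size-weighted mean over per-tensor bit choices (toy numbers below, not a real model's layout):

# average bpw = total bits / total weights, weighted by tensor size
layers = [(4096 * 4096, 8.0),   # (weight count, bits picked for that tensor)
          (4096 * 4096, 6.0),
          (4096 * 11008, 5.0)]
print(sum(n * b for n, b in layers) / sum(n for n, _ in layers))  # ~5.85
# the optimizer chooses the per-tensor mix so this weighted mean hits your target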
>>
SillyTavern doesn't process macros in the User Filler Message. Well that blows. And while I'm complaining there's this bullshit.
>Includes Post-History Instructions at the end of the prompt, if defined in the character card AND ''Prefer Char. Instructions'' is enabled.
>THIS IS NOT RECOMMENDED FOR TEXT COMPLETION MODELS, CAN LEAD TO BAD OUTPUT."
It's true that 90+% of SillyTavern's Instruct Mode presets will fuck this up. Most of them also fuck up message examples. SillyTavern is so bad at the simple job of formatting a chat to send to an LLM.
>>
>>102432666
>666
>hardcore rape and forbidden sex image
Nice. Also never. Everything is getting intentionally cucked and gimped because "safety".
>>
>>102432834
Ah cool. Then you actually are being retarded and thinking you are the smart one here. Good job anon. Never change.
>>
>>102432869
Why do you think it's like that? Models quanted at 6 and 8 have different perplexities.
>>
>>102428942
>>102432682
>>102432666
That's a hard working glowie not gonna lie, I hope he'll get a promotion for that hard work
>>
>>102432883
I can't find the screenshot but Turboderp said that the exl2 measuring process never actually outputs true 8bpw models. It'll always find something to quant so the result will always be closer to 7.x bpw. People caught on when they compared file sizes so he just added some padding to """8bpw""" quants to make the size check out despite them not actually being full 8-bit quants.
No fucking clue why he doesn't just let you skip the measuring for 8bpw and have the script just pick the 8.xbpw option for every single layer by default.
>>
>>102432834
>>102432962
>This could change in the future if I add any > 8bpw layer options, but it's a very niche case either way because precision really doesn't improve noticeably after 6bpw. In fact at one point asking for an 8bpw model would often give you a ~6bpw model because the optimizer couldn't find enough layers that would benefit at all from being stored in maximum precision. Now, it just essentially pads the model with useless extra precision because too many people assume it's a bug when their 8bpw version isn't larger than the 7bpw version.
https://github.com/turboderp/exllamav2/issues/516
>>
>even AI rejects incels
https://x.com/_Diplopia_/status/1836129878048977332
kek
>>
>>102432962
It's not at all what the guy is saying and there's no 8 = 6 with padding. You're talking about something like 7.9 = 8 which I don't really care about.

>>102432987
>pads the model with useless extra precision
So it does use larger precision, even though there's no benefit to that. It's not what you think it is.
>>
I’m liking mistral medium. It’s like gemma but it won’t take all my ram.
Very solid general purpose medium sized model.
>>
do you guys make money with these
idg why people would hoard 1000$+ GPU for some chatbot
>>
>>102433016
Small?
>>
>>102433001
>even though there's no benefit to that. It's not what you think it is.
elaborate anon, what is it then?
>>
>>102432987
That's so retarded. At the very least it'd let me skip 3 hours of 100% useless measuring when I first try to quantize a 70b-sized model at 8bit.
>>
>>102433022
You know what they say:
The more you buy the more you save
>>
>>102433031
It's literally using 8bpw for the layer even though calculations on the dataset show it offers no improvement over 6bpw.
>>
>>102433022
idg why people would hoard 1000$+ GPU for some video games
>>
>>102433022
>1000$+ GPU
try 10x that
>>
>>102433040
is it true though, is 6bpw really equivalent to 8bpw after testing?
>>
>>102433025
Yes. I was distracted while writing that.
>>
>>102432987
I'm glad I use gguf desu
>>
>>102432997
Holy based! Total moid death!
>>
>>102433048
The post says there are cases where it is true.
>>
>>102433034
>I already made a big mistake exposing the calibration dataset as a parameter, and now I regularly have to spend time explaining to people that calibration is not finetuning, and whenever people complain about the quality I have to spend time investigating if they're actually using an "rpcal" model that someone pushed to HF and described as "better at RP" or whatever. Of course most people don't complain, they just get a bad first impression and lose interest long before considering that they might have come across a broken quant.

>That's really what it comes down to: communication.

>I could very easily accommodate these people by adding a 16 bit head option, what I can't easily do is communicate what the consequences of turning it on would be

>How should a model converted in this way be tagged so people know what they're getting? Should the framework emit a warning every time one of these models is loaded? How many bug reports would I have to respond to when people start seeing that warning pop up all the time?
>>
File: 1699640800824949.png (21 KB, 423x429)
>>102433048
It's not exactly the same. The exl2 convert.py even shows you the accuracy of each possible quant for each layer.
>>
>>102433082
This is what happens when something becomes so easy and intuitive to use that it becomes accessible to retards. You can't fix stupid.
>>
>>102433082
Exllama fork without autist maintainer when
>>
>>102433082
>the calibration dataset as a parameter, and now I regularly have to spend time explaining to people that calibration is not finetuning
But it is light finetuning. exllama measures the layers against the calibration dataset and prioritizes the ones that give you the best results with the cal dataset. Why is he so confident that his mishmash of a dataset is THE ideal calibration dataset?
>>
>>102433082
>word salad
I'm glad we got 2 options there, I couldn't stand having to stick with his retarded quant
>>
File: qu.png (45 KB, 532x949)
>>102433048
>>
>>102433149
because he dev so he smart, and u user so u dumb simple dev thought process
>>
>>102433180
that's a GGUF comparison, we were talking about 6bpw exl2 vs 8bpw exl2
>>
>>102432592
Smells like vramlet cope to me. The 24gbros are eating GOOD with Smallstral.
>>
So what's next for Mistral? New 8x22B or new 70B? Either would be fine for me honestly, as I can get similar speeds with my setup.
>>
>>102433192
I'm not familiar with exl2 so that's why I confused the two.
>>
>>102433149
Probably because he tested it.
See imatrix for a parallel.
>>
>>102433236
Mixtral 8x7 refresh has been coming soon for a while now so maybe that
>>
>>102433236
Miqu v2
>>
>>102433244
>>102433149
I think he's wrong, too. Say I want the model to write in Russian. Does using a Russian dataset improve the results?
>>
>>102433249
MIDNIGHT MIQU TWO BABY!
>>
>>102433236
>So what's next for Mistral?
I'd like an improvement on Mixtral, I feel like their 22b model is now as smart as that old 47b one, so if they do a new finetune of that Mixtral we'll get something really cool
>>
>>102433261
Hard to say without knowing what layers activate for what in the first place.
>>
>>102433236
Mistral Extra Large
>>
>>102433268
>8x7B Updated model coming soon!
https://github.com/mistralai/mistral-inference?tab=readme-ov-file#model-download
>>
>>102432837
And this inane contraption is removing repeated endlines from the Story String.
>>
>>102433261
Improve the results relative to another quant of the same size? Yeah. It will still be worse at Russian than an unquanted model though. You can't change their behavior, just slightly steer which parts get damaged least by quanting.
>>
>>102433289
nice!
>>
>>102432820
InternVL, or maybe Qwen2-VL for the smaller sizes.
>>
>>102433317
Don't get your hopes up, it's been coming soon for months
>>100609085
>>100618399
>>
>>102433338
Based. They should only work on dense models from now on when below 70B.
>>
File: file.png (214 KB, 1996x799)
>>102433338
I'm sure if we remind them of the fact they are late on releasing a promised product, they will respond well to that
https://xcancel.com/sama/status/1834351981881950234
>>
>>102433268
>I feel like their 22b model is now as smart as that old 47b one
I like that this is slowly becoming the case with new versions of models, smaller ones trumping older far bigger ones. I know it makes sense, and is mostly interesting to the baby consumer without a datacenter or multiple GPUs to run shit on, but it should also scale to bigger models I imagine, which helps everyone.
>>
>>102433338
So moe really was a meme, huh...
>>
File: 1722588141364.png (569 KB, 2468x984)
>>102432820
Florence-2 is the best for captioning
>>
>>102432923
I just want to be spoonfed you asshole.
>>
>>102433366
>((altmann)) is seething publicly again
>>
>>102432850
Please… there has to be at least one somewhere, somehow…
>>
>>102433366
>Getting this angry at a literal who in public
Jesus dude, I know he is a jew and all that, but goddamn he sure has a short fuse
>>
>>102433399
>seething
He is laughing at pleb.
>>
>>102433395
how about i spoonfeed my dick up your asshole?
>>
>urgent: o1 achieved recursive self-improvement. we’re on version infinity and counting. hold onto your minds
>>
>>102433417
kek
>>
>>102433417
GOT HIS ASS
>>
>>102433417
Calm down sam.
>>
>>102433382
Very interesting, thanks anon
>>102433324
>InernVL Qwen2-VL
so many choices
>>
>>102433410
>goddamn he sure has a short fuse
I think he's feeling the pressure, he spent 1 full year hyping Strawberry and all he delivered is a fucking CoT mechanism lol, desu I like it, fuck this faggot retard, I'm glad he's not the king of AI anymore, Claude 3.5 is fucking all his models hard and he knows it
>>
>>102432820
>>102433382
but JoyCaption is still the only one that can do NSFW captioning right?
>>
File: thebest.png (1.54 MB, 920x1376)
>>102429190
TGIT
>>
>>102433451
Yeah when I need cloud I go for claude.
>>
>>102433451
>he's feeling the pressure
I damn hope so. I want this silly shit to succeed because it's fun, but also because it WILL succeed, there is no stopping the tech anymore. OAI thought they could chill on their laurels forever and keep raking in money by doing the bare minimum, but that time is finally over and they're feeling the heat. Competition breeds excellence, really fucking simple.
>>
>>102433460
JoyCaption is a meme.
>>
>>102433338
2 more weeks. trust the plan.
>>
more than anything im excited for the next leak. you know that shit is gonna be gargantuan given the last one was miqu.
>>
>>102433515
405b was pretty gargantuan
>>
>4090
>32gb VRAM
>Mistral Small 22b Q8
>47 gpu layers
>20,000 context
>4.5 t/s

Going to start using a smaller quant I think.
>>
>>102433529
*32gb RAM
>>
File: file.png (187 KB, 1280x1508)
>>102433489
desu, every vision model will be a meme when Qwen2-VL-72B is released
https://github.com/QwenLM/Qwen2-VL
>>
>regulatory capture: failed
>debt: rising
>local: catching up quick
>latest model: a fucking cot prompt
>moat: none
>sam: gay
closedai lost
>>
>>102433554
and that's a good thing
>>
>>102433529
>Q8
Why not just use Q6 and offload the full thing easy peasy with some headroom?
>>
>>102433529
>THE MORE YOU BUY THE MORE YOU SAVE in action
>>
>>102433554
Microsoft already sucked them dry of anything of value. OpenAI was always in a precarious position being wrapped in the non-profit and was never meant to last for long.
>>
>>102433569
I'm downloading Q6 right now. Biggest bummer is having to reduce context, but I can live with it considering retardation at higher contexts.
>>
>>102433578
what will happen if Sam declares bankruptcy? Will Microsoft get the weights or what?
https://youtu.be/C-m3RtoguAQ?t=66
>>
>>102433586
I'm all about speed, so I generally see what I can get away with at IQ2-Q6 and go from there. Whatever dick fits into my 4090 essentially, with 8k context as a baseline. Enough for the fucking around I do, not like I do something super complex.
>>
>>102433593
Weights get destroyed cause muh safety.
>>
>>102433515
Hope salty altman will get so butthurt at Anthropic still being the top that he infiltrates them and leaks all their models out of spite. Can already see it happen, that slimy fuck would get away with it too.
>>
>>102433615
wtf based sam?
>>
>>102433615
damn i hadnt even considered that idea, or just disgruntled openai employees leaking everything they can as they lose their jobs.
>>
>>102433593
Remember, they already have all of their hardware, they have a deal that OpenAI uses Azure at a discount.
The weights are useless. What, the SOTA from 2023 with a couple gimmicks tacked on? Nobody cares.
Better, Microsoft will get their engineers and probably their painstakingly human curated dataset. Then it's just a matter of training a new model.
>>
>>102433608
Yeah, speed is a must. 8k is just too little context for me though, especially for my long context cards. I would be happy with 32k if I can get it.
>>
>>102433630
>What, the SOTA from 2023 with a couple gimmicks tacked on? Nobody cares.
absolutely true

>Better, Microsoft will get their engineers and probably their painstakingly human curated dataset. Then it's just a matter of training a new model.
I wonder why they haven't gone down that path earlier; maybe they still think it's less expensive to kill OpenAI and get their weights rather than spending millions on research and waiting at least 1 year to get a good model out of it
>>
>>102433635
>especially for my long context cards
Now you make me curious what we're talking about length wise.
>I would be happy with 32k if I can get it.
Q6 already seems really damn light on my end at 8k, a lot lighter than CR at IQ4 XS (a GB bigger than Mistral). Makes me wonder how much context I could squeeze out of Q6, not to mention Q4.
>>
>>102433529
I can load 55/57 layers of Q8 Mistral Small onto my 3090 with room for 8K context. I get around 16.5 tokens per second like that. But I switched to >>102432045 for 16k context and around 28 t/s.
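For the curious, loading an exl2 quant outside a frontend looks roughly like this through the exllamav2 Python API; a sketch with a hypothetical local quant dir (most people just point TabbyAPI or ooba at the folder instead):
[code]
# pip install exllamav2
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

config = ExLlamaV2Config()
config.model_dir = "Mistral-Small-22B-6.0bpw-exl2"  # hypothetical local dir
config.prepare()
config.max_seq_len = 16384        # the 16k context mentioned above

model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)  # allocated while the model loads
model.load_autosplit(cache)               # fills available VRAM, no CPU spill
tokenizer = ExLlamaV2Tokenizer(config)

generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)
settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.8

print(generator.generate_simple("[INST] Say hi. [/INST]", settings, 64))
[/code]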
>>
It's 7:30am Wednesday Beijing time... how many more hours until kiwi?
>>
>>102433244
>Probably because he tested it.
You say that but how do you test for cooming quality? Unless the test result was that there were no changes between a coomer dataset and wikipedia.
>>
>>102433676
I like to keep most of my cards around 600-1200 tokens, but some of the more advanced ones have like 3-4k tokens. Writing your own cards in plain English with proper grammar is the best way to get the best results, in my opinion (rough made-up example at the end of this post). Some cards and scenarios just require a lot of writing.

>>102433760
Will definitely try out some exl2 quants if I can't get longer contexts and faster speeds with Q6.
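To illustrate the plain-English card style, an invented example (every detail here is made up; {{user}} is the usual frontend macro):
[code]
Name: Mara
Description: Mara is a 34-year-old lighthouse keeper on a remote northern
island. She is blunt, practical, and slow to trust strangers, but fiercely
loyal once that trust is earned. She speaks in short, dry sentences and
hates being asked about the shipwreck that brought her here.
Scenario: {{user}} washes ashore during a storm and Mara reluctantly takes
them in until the weather clears.
[/code]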
>>
>>102433760
NTA but I can't get behind the whole exllama2 bullshit and its weight system; it seems so darn confusing at first sight compared to the GGUFs I'm used to. Feels like people get more context with models using that somehow, but I can never tell if the sizes are similar to what I'd use with GGUF or not (quick size math below).
>>102433787
>around 600-1200 tokens
Less than I expected, but fair enough and explains your desire for context.
>plain english with proper grammar is the best
Somehow not shocking to me
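The size comparison is simpler than it looks: both formats come down to bits per weight. A back-of-the-envelope sketch (bpw figures are approximate averages for llama.cpp's quant mixes, and 22e9 params is rounded):
[code]
# rough file-size estimate: params * bits-per-weight / 8 bytes
params = 22e9  # Mistral Small, rounded
for label, bpw in [("exl2 6.0bpw", 6.0),
                   ("GGUF Q6_K (~6.56 bpw)", 6.56),
                   ("GGUF Q8_0 (~8.5 bpw)", 8.5)]:
    print(f"{label}: {params * bpw / 8 / 1e9:.1f} GB")
# exl2 6.0bpw -> ~16.5 GB, Q6_K -> ~18.0 GB, Q8_0 -> ~23.4 GB
[/code]
So a 6.0bpw exl2 lands just under Q6_K in size; the extra context people report largely comes from exllamav2's optional quantized (8-bit/Q4) KV cache rather than the weights themselves.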
>>
>>102433417
At this point I’ll take it
>>
>>102433417
Sneed
>>
>>102433578
>wrapped in the non-profit
????
Openai isn't a nonprofit at all, they're the greediest fuckers on earth.
>>
>>102433855
They were founded as a non-profit organization, and parts of it are still technically non-profit. Altman is doing his best to remove those, though.
>>
>>102433855
Originally it was a nonprofit, but now it's a bizarre conglomerate that I can only assume exists to signal their good intentions or something.
>https://openai.com/our-structure/
>>
>>102433648
>I wonder why they haven't gone that route earlier. Maybe they still think it's less expensive to kill OpenAI and take their weights rather than spending millions on research and waiting at least a year to get a good model out of it
It is. All the GPUs in the world can't help you if you don't have the talent that knows what to do with them.
>>
>>102433417
>>102433434
>>102433437
>>102433444
>>102433835
>>102433852
samefag
>>
>>102433855
>He didn't know
Why do you think everyone hates them so fucking much? The whole "OPEN" part is a bunch of lies and was never even slightly honest.
>>
>>102433893
They are doing it for your safety
>>
>>102433465
I agree.
>>
>4090
>32gb RAM
>Mistral Small 22b Q6
>50 gpu layers
>40,000 context
>6.5 t/s

Still a bit too slow for me. Going to try out 6.0bpw and probably skip 8bpw.
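Most of what separates these setups is KV cache, which scales linearly with context. A rough sketch of the math, assuming Mistral Small is 56 layers, 8 KV heads, head dim 128 (assumptions; check the model's config.json) with an fp16 cache:
[code]
# KV cache per token = 2 (K and V) * layers * kv_heads * head_dim * bytes/elem
layers, kv_heads, head_dim = 56, 8, 128  # assumed from config.json
bytes_per_elem = 2                       # fp16 cache; a q4 cache would shrink this

per_token = 2 * layers * kv_heads * head_dim * bytes_per_elem
for ctx in (8_000, 20_000, 40_000):
    print(f"{ctx:>6} ctx -> {per_token * ctx / 2**30:.1f} GiB of KV cache")
# roughly: 8k -> ~1.7 GiB, 20k -> ~4.3 GiB, 40k -> ~8.5 GiB
[/code]
Every GiB the cache takes is a GiB of weights that can't be offloaded, which is why 40k context costs tokens per second.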
>>
>>102434050
>Still a bit too slow for me
At first I was going to say "Really?? Pretty damn fast for me", then I noticed the 40k context. Goddamn, thankfully I don't have to fuck with high demand stuff like that.
>>
>>102432386
>Q6_K_L
What are the perplexity measurements?
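For GGUF quants the numbers usually come from llama.cpp's perplexity tool run over wikitext. The sliding-window calculation it performs looks roughly like this in transformers terms (a sketch of the method, not the exact tool; the model id and eval file are placeholders):
[code]
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-Small-Instruct-2409"  # placeholder
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto")

enc = tok(open("wiki.test.raw").read(), return_tensors="pt")
max_len, stride = 2048, 512
nlls, prev_end = [], 0
for begin in range(0, enc.input_ids.size(1), stride):
    end = min(begin + max_len, enc.input_ids.size(1))
    trg_len = end - prev_end            # only score tokens not seen before
    ids = enc.input_ids[:, begin:end].to(model.device)
    targets = ids.clone()
    targets[:, :-trg_len] = -100        # mask the overlapping prefix
    with torch.no_grad():
        loss = model(ids, labels=targets).loss
    nlls.append(loss * trg_len)
    prev_end = end
    if end == enc.input_ids.size(1):
        break

print("ppl:", torch.exp(torch.stack(nlls).sum() / prev_end).item())
[/code]
Lower is better; what matters for a quant is the delta against the fp16 baseline, not the absolute number.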
>>
>>102434092
>high demand.
I DEMAND only the longest and sauciest ERP upon this earth.
>>
>>102432386
Autists will scream that you NEED to run Q8, if not FP16, because they pretend they aren't poor. Q6 is perfectly fine and near lossless, while Q8 is a bit better (obviously), but it's up to you whether the trade-off is worth it.
>>
>>102434142
I kneel, long context king... I tend to come and go with my text AI purposes, so I'm usually fine with 8k.
>>
>>102434150
>Q6 is perfectly fine and near lossless
if all you care about is what's embedded in the low-rank layers, like reproducing a wikipedia snippet, and not anything that requires any sort of intelligence
>>
My impression of Small is that it's dry but less schizo than Nemo. Which makes me wonder, what even is Nemo? An undertrained checkpoint of some actual coomer model ordered by some rich billionaire?
>>
>>102434276
based.
I'm cooking up something right now, myself.
>>
>>102434205
>An undertrained checkpoint of some actual coomer model ordered by some rich billionaire?
Pretty much this. Nemo is Jensen's prototype for a much larger model, he wants it to be different from the others.
>>
>>102434276
Hack. Fraud. Buy an ad. I am not sao. You are sao.
>>
File: cydonia.png (99 KB, 1496x626)
Hi all, Drummer here...

>>102429162
Soon.

>>102434288
>>102434299
fixed axis
>>
>She whispers, her voice barely above a whisper.
I hate femslop so damn much it's unreal.
>>
This is pretty bad.
Hopefully a finetune can fix it.
Slopped, and it shies away from anything naughty.
Kinda reminds me of gemma 27b to be honest.
>>
>>102434335
He's literally the only anon who tunes models that actually did buy an ad.
>>
>>102434306
Hi Drummer

>>102434335
Hi Undi
>>
i'm confused. /lmg/ is pro-discord sloptuner today?
>>
>>102431581
kek
>>
>>102434362
>implying drummer isn't /ourguy/
I'm thinking somebody got lost on their way to reddit
>>
>>102434362
There was never a problem with "sloptuners", only with people that spam and samefag to shill their models.
>>
>>102434343
works on my machine
>shota
FAGGOOOOOOOOOOOOOOT
>>
>>102429347
Try asking it to generate Japanese!
>>102429421
I'm sure you've made a ton of fine tunes, anon.
>>
>>102434306
buy an ad faggot
>>
>>102434343
Nemo instruct in comparison.
>>
>>102434372
drummer has a reddit account and spams his slop on there too
>>
>>102434390
I also forgot to mention the people that trash-talk other finetuners (I think it's mostly Sao doing that).
>>
>>102434414
>I feel a shiver run through you as you wait for my response
wat
>>
>>102434349
He did it ironically so it doesn't count.
>>
I am the original blacked miku poster.
>>
>>102429241
Even with your retarded doomsday posting, that is a pretty good image.
>>
>>102434349
>>102434372
>>102434390
>>102434453
drummer is cool but not the others? ok, drummer
>>
>>102434414
Do me a favor and try it with a low mistral large quant if you can, in the 2.5-3.5 bpw range
>>
>>102434479
nice falseflag drummer
>>
>>102434372
Sao was ourguy before Drummer even started his slopping. Undi was our punching bag. I miss him...
>>
>>102434362
/lmg/ funded several discord sloptuners for months until the grift was exposed. you must be new here.
>>
>>102434548
The only thing that Sao was is a spammer samefagging praise for his own models and himself.
>>
>>102434574
Hi Undi.
>>
gemma 27b vs mistral small verdict?
>>
>>102434632
gemma was never good
>>
>>102434632
Mistral-small because of the size.
It is also a bit faster.
>>
>>102434650
opinion discarded
>>
>>102434479
Someone called?
>>
>>102434632
>8k context
>>
>4090
>Mistral Small 22b 6bpw
>31000 context
>30 t/s

Not as much context as I would like, but pretty good. I'm going to test it out for ERP later tonight.
>>
>>
>>102434744
>>102434744
>>102434744
>>
>>102429241
I like this Miku


