/g/ - Technology


Thread archived.
File: miku nothing ever happens.png (994 KB, 1024x1024)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>101829052 & >>101818797

►News
>(08/09) Qwen large audio-input language models: https://hf.co/Qwen/Qwen2-Audio-7B-Instruct
>(08/07) LG AI releases Korean bilingual model: https://hf.co/LGAI-EXAONE/EXAONE-3.0-7.8B-Instruct
>(08/05) vLLM GGUF loading support merged: https://github.com/vllm-project/vllm/pull/5191
>(07/31) Gemma 2 2B, ShieldGemma, and Gemma Scope: https://developers.googleblog.com/en/smaller-safer-more-transparent-advancing-responsible-ai-with-gemma
>(07/27) Llama 3.1 rope scaling merged: https://github.com/ggerganov/llama.cpp/pull/8676

►News Archive: https://rentry.org/lmg-news-archive
►FAQ: https://wikia.schneedc.com
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Programming: https://hf.co/spaces/bigcode/bigcode-models-leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
File: ComfyUI_05735_.png (800 KB, 720x1280)
►Recent Highlights from the Previous Thread: >>101829052

--EXL2 optimization issue resolved due to config mistake: >>101833134 >>101833238 >>101833271 >>101833363 >>101833430
--China develops un-uncensorable AI models with SOPHON framework: >>101833127
--Anons discuss general instruct models and share recommendations: >>101836581 >>101836855 >>101836635 >>101836927 >>101837006 >>101837017
--Anon shares a text-based adventure game using a multimodal LLM as a narrator, and others discuss the potential and limitations of this approach: >>101829211 >>101829286 >>101829359 >>101829430 >>101829548 >>101832428 >>101832697 >>101833046 >>101833171
--Anons discuss limitations of current AI models and GPUs: >>101833149 >>101833351 >>101833658 >>101833735 >>101833773 >>101833782 >>101834150 >>101834450 >>101835212 >>101839544 >>101834691 >>101834794 >>101836756
--Anon considers training a LoRA model with 1+ GB of F-List RP logs: >>101829657 >>101829740 >>101830195 >>101830204 >>101830490 >>101830634 >>101830695 >>101830962 >>101830743 >>101830806 >>101830817 >>101830677 >>101830378
--China mods RTX 4090D with 48GB VRAM, bypassing US export restrictions: >>101838070 >>101838133 >>101838427 >>101838490 >>101838515
--ASICs like Sohu might be viable for Llama-3, batching is the issue: >>101832482 >>101832533 >>101832971
--Llama-speculative can't use interactive mode, needs workaround: >>101837333
--LLMs useful for code review and paralegal work: >>101840196 >>101840236 >>101840271 >>101840322 >>101840563 >>101840241 >>101840564
--Anon seeks natural language database query tool: >>101838564 >>101838616 >>101838739 >>101839602 >>101838642
--AI-generated characters outshine human-made ones in silly tavern game: >>101830388 >>101830433 >>101830489 >>101830736
--Miku (free space): >>101831543 >>101833351 >>101834359 >>101835354 >>101835516 >>101835977 >>101836291 >>101836386 >>101836487

►Recent Highlight Posts from the Previous Thread: >>101831447
>>
>>101841211
>>101841273
People hating or loving him too much both feel a bit too stupid.
You should judge people's individual actions. I haven't even seen a single person with purely good opinions, always a mix, even for the best people.
Musk achieved many cool things - Tesla, Space X and so on. He mostly wants good things, but not all.
He can be a retard on social media, I've seen him get scammed once by someone I've known and Musk did it because it was a convenient thing to say.
His argument with Le Cunny was dumb - including Le Cunny, again showing how everyone can be a retard and can't change their opinions.
Hating Musk because the media hates him and Trump is also dumb. If you want to hate a man, at least hate him for things you decide yourself, instead of what evil fucks with an agenda (mainstream media) tell you.
I don't hate or like him, but I do dislike some things he did; the twitter rate limits have caused me some annoyance since I use nitter, and now there are far fewer instances.
It's also annoying that he's a doomer like Yud, but not nearly as bad, and as long as he stays out of politics, I don't care.
I like the shit he did at Tesla and Space X, I'm neutral about x.ai, so far seems meh, a waste of GPUs and money, but maybe gets better as they hire more skilled people.
Was Twitter money well spent? Don't know, but it's his own money?
Doubt he takes open source seriously, while Zuck seems to - releasing grok as a quantized-only model was unprofessional, despite Musk promising to open source a model to beat the current open SOTA.
I don't mind him going with Trump; open source LLMs might do better under Trump than under Biden, but that's not clear, doomers could infiltrate Trump's office like they did with Biden's, and they're trying.
So my opinion of Musk is neutral by my own subjective interests, even if objectively the man did accomplish lots of good things?
>>
>101841850
tldr
>>
>>101841850
you can go back to /pol/, you will find people there more in line with your "intelligence"
>>
>>101841850
it always makes me chuckle when the people who say "don't listen to what the media tells you!" are the ones most naive and prone to absorbing media bullshit
>>
>>101833735
Like fucking clockwork, every time somebody mentions consumer-grade cards having more than 24GB of VRAM, some worthless, r*ddit-tier, cum-gargling, faggot, bitch motherfucker like you comes along to regurgitate this tired bootlicker argument. But keep defending the megacorporation that is currently under an antitrust investigation.
>>
File: 1713175979123582.png (276 KB, 960x508)
>>101841928
>you can go back to /pol/, you will find people there more in line with your "intelligence"
>>
strawberry will make miku real (actually)
>>
What is wrong with using hypertags? Isn't the only difference that it's clearer?
>>
>>101841850
The people who think Elon is a genius piss me off a lot more than the man himself. Elon is at best the corporate answer to Captain Jack Sparrow. He rips off other people's material from various sources, and then makes himself look like an engineering God in interviews, in the eyes of people who can somehow get through his chronic stuttering.

To anyone who thinks Elon personally knows literally anything about engineering, I have one word: CyberTruck.
>>
>>101842051
most people nowadays don't like him for his products/businesses. They like him because he makes the groups/individuals they dislike seethe
>>
>>101836756
Regarding audio output in qwen2-audio:
The paper doesn't say anything about that, only that in the future they want to scale up and release bigger models.
This is from a reddit comment so take it for what it is, sounds similar to what you are proposing:
>It would be very interesting to try to synthesise audio output using this model. The audio encoder is almost identical to WhisperSpeech one. Although Qwen2 is using Whisper-large-v3 which would probably require retraining of the WhisperSpeech acoustic model. If successful, that would be basically equivalent to GPT4o advanced voice mode running locally.

I would be really happy if we had a good model with audio IN as a start.
Qwen2-audio feels like a prototype. It can describe sounds and differentiate between voices.
You can just talk to it in German, Japanese or English and it does respond.
I had kids screaming in the background and it didn't fuck up. Whisper would have mashed the words together.
But it hallucinates easily and doesn't really listen to the system prompt.
Like if I want it to respond in Japanese only, I can't tard wrangle it into doing that, and that kills my XTTS2, which reads each romaji letter separately.
And it kinda reminds me of pyg. Like a mix between 3.5 and pyg. Sometimes the model just isn't "here". Not coherent and goes off in a weird direction. Not really usable for anything serious.
That's probably why they only show the sound recognition benchmarks, not the llm benchmarks.
>>
>>101841966
Didn't say that. I just read it and read between the lines, seeing where their bias lies. I always go back to the source material and make up my own mind.
But most media nowadays is written with a clear ideological bias; if you remove it, you get the raw news.

>>101842051
If he were an actual idiot, he wouldn't be managing that many companies. And his taste in which companies to build is better than a typical businessman's. He's good at logistics too.
He probably has sufficient knowledge/taste to make good calls on some of his business decisions - maybe not as in-depth as possible, but likely acceptable for his needs. You may call him a scammer if you want, but he's pretty successful at it? I can think of dozens of things I disagree with the guy on, but I do think running that many companies toward useful goals is worthy of respect.
You could also call Altman a scammer and you'd be even more right, yet that's how that part of the business world is run?
>>
>>101842168
>It's fine if you're dishonest, as long as you're successful at it.
Zoomer ethics, ladies and gentlemen.
>>
>>101842251
That has literally been the case for the entirety of human history.
Remember the phrase "a sucker is born every minute"? Congrats, anon. You're the sucker.
>>
>>101842251
I didn't say that, I'm saying that almost every CEO is dishonest and a scammer *in some respects*. It's their fucking job to embellish and sell things. By your standards there wouldn't be any businesses of large size.
>>
/lmg/ - elon musk & sam altman general
>>
>>101842271
>That has literally been the case for the entirety of human history.
No. It's the case for less-civilized people. The more a civilization prunes those people, the longer it lasts. You will never, ever fit in with a High Trust society.
>>
>>101842384
Bwahahahahahahaha
>>
>>101842327
>the two most important people in AI by far get discussed in a thread about AI
how could this be
>>
File: file.png (304 KB, 859x494)
https://anthra.site

Gay South Park is big at this time of year, don't you say?

Big things to come.
>>
>>101841699
>modded 4090D
>anon complains that I posted it 5 hours after someone else mentioned it first, which only got into recap because of miku
>>
>>101842423
If Elon is important to AI, then Paris Hilton is important to quantum mechanics.
>>
>>101842423
>a thread about LOCAL LOCAL AI
> L C O OC AL L
FAGGOT NIGGER
>>
>>101843090
>FAGGOT NIGGER
It's NIGGER FAGGOT. You've got the sequence wrong. NIGGER FAGGOT is clearly phonetically superior to FAGGOT NIGGER, because the T at the end of FAGGOT is a stopping sound.
>>
File: denomolos+.jpg (378 KB, 791x662)
>>101842384
>The more the undesirables are purged, and society is kept pure, the longer it lasts
Settle down, Heinrich.
>>
>>101843090
Open source is typically downstream from closed source. I don't see why there's a reason to screech about discussing people related to the topic unless it's low-effort, bad-faith shitposting. You're better off contributing than replying to stupid faggots with useless posts.
>>
attention isn't all you need

new architecture announcement

august 13th @ 10am pt

the singularity begins
>>
looking forward to the mi60 dropping in price
>>
My uncle told me something big is coming tomorrow.
>>
>>101842051
the cybertruck is the best-looking car available today.
>>
>>101843350
nofap Sunday
>>
>>101842763
You won't have my support without transparency.
>>
where are the local musicgen models?
WHERE?????
>>
You all know that we will be dead in like 3 years because of ASI? How are you all relaxed?
>>
>>101843413
Then I'll just die. Why would I bother to be worried about it?
>>
>>101843086
He got OpenAI off the ground and poached Ilya from Google, which is probably one of the most important events in the space, even if you were to ignore his two other still-current multibillion-dollar AI companies.
>>101843090
GPT-2 and Grok are local though. And every local model and finetune is dependent on second-hand slop from Sam's GPTs for data.
>>
>>101843413
Oh no, I will die of boredom as the 100th shiver runs down my spine
>>
>>101843413
Back to lesswrong, and STAY THERE, no, stop talking to politicians already, stay in the containment board.
>>
has anyone plugged flux into st yet
>>
>>101843487
no
>>
>>101843413
I'm not. I'm falling asleep with panic attacks every day. I'm also donating to Yudkowsky
>>
>>101843487
You can run it in ComfyUI right? Then it should just work as long as you port the workflow into Silly.
>>
>>101843395
Never ever.
>>
trying to decide how to go about moar ai.

I have a 6950xt, but it goes into lowram with an fp8 checkpoint. So I thought I'd get an extra GPU for the language model. But then I started noticing some AMD cards.

the mi25 and mi50 are pretty cheap and have 16gb. i could get a couple and set them up on another pc, maybe.

but what about llm?

basic goal:
simultaneously game, gen flux, and chat with llama.cpp

idk bros, what is de weh?
>>
>>101843580
idk, maybe we can train highly specific models? is there an existing audio tagger we could use to create a training dataset, e.g. for saxophone?
>>
Talking to LLMs made me realize how boring it is to talk with someone who replies with only the minimum necessary
>>
>>101843682
is there a "helpful coworker/assistant" bot or whatever out there? like that remembers factoids and goals you have?

everyone is doing the sex thing and idk why
>>
File: Untitled.png (667 KB, 720x1297)
mPLUG-Owl3: Towards Long Image-Sequence Understanding in Multi-Modal Large Language Models
https://arxiv.org/abs/2408.04840
>Multi-modal Large Language Models (MLLMs) have demonstrated remarkable capabilities in executing instructions for a variety of single-image tasks. Despite this progress, significant challenges remain in modeling long image sequences. In this work, we introduce the versatile multi-modal large language model, mPLUG-Owl3, which enhances the capability for long image-sequence understanding in scenarios that incorporate retrieved image-text knowledge, interleaved image-text, and lengthy videos. Specifically, we propose novel hyper attention blocks to efficiently integrate vision and language into a common language-guided semantic space, thereby facilitating the processing of extended multi-image scenarios. Extensive experimental results suggest that mPLUG-Owl3 achieves state-of-the-art performance among models with a similar size on single-image, multi-image, and video benchmarks. Moreover, we propose a challenging long visual sequence evaluation named Distractor Resistance to assess the ability of models to maintain focus amidst distractions. Finally, with the proposed architecture, mPLUG-Owl3 demonstrates outstanding performance on ultra-long visual sequence inputs.
https://github.com/X-PLUG/mPLUG-Owl
code/weights for owl3 not up yet. the long video chat ability seems neat
>>
>>101843592
mi50 is fine for inference on llama.cpp, no idea about the 25 though. ROCm support is spotty for old cards... and for not-so-old cards actually, but most will work anyway when set to a close-enough supported gfx target
>>
>>101843720
any idea if the mi60 works like we'd think? 32gb should fit flux dev, right? i think it's the cheapest 32gb card rn

the v340 in 32gb config is actually 2x Vega GPUs, with 16gb each.
>>
>>101843749
that should be fine, same arch as mi50
>>
>>101843705
ChatGPT
>>
>>101843705
>and idk why
You need to be 18 to post here, for the record.
>>
>>101843705
you have no idea how much I love cumming
horny doesn't even begin to describe it
>>
>>101843931
>>101843859
What practical goals does it enhance?

>>101843849
it's local?
>>
>>101843976
Cumming
>>
What exactly causes models to get repetitive? Surely this has been figured out by now & I'm just dumb?
>>
FUSE-ing Language Models: Zero-Shot Adapter Discovery for Prompt Optimization Across Tokenizers
https://arxiv.org/abs/2408.04816
>The widespread use of large language models has resulted in a multitude of tokenizers and embedding spaces, making knowledge transfer in prompt discovery tasks difficult. In this work, we propose FUSE (Flexible Unification of Semantic Embeddings), an inexpensive approach to approximating an adapter layer that maps from one model's textual embedding space to another, even across different tokenizers. We introduce a third-order tensor-based representation of a model's embedding space that aligns semantic embeddings that have been split apart by different tokenizers, and use this representation to derive an approximation of the gradient of one model's outputs with respect to another model's embedding space. We show the efficacy of our approach via multi-objective optimization over vision-language and causal language models for image captioning and sentiment-based image captioning.
https://github.com/jnwilliams/FUSE_prompt_inversion
interesting but not sure if useful. paper said nothing about speed and they used gpt2 so lol
>>
>>101841928
>t. retard whose understanding of /pol/ comes from plebbit
>>
>>101841850
You try to pretend to be neutral but you don't bring up any of the obvious scams he pulled off over the years. That shows you don't know what you are talking about so you should probably shut up retard.
>>
File: 367458.webm (1.1 MB, 960x960)
A local bitnet model is coming soon.
Local will never be the same.
>>
>>101844525
hi petra
>>
>>101844525
But will it be large?
>>
File: strawberry-sam_altman2.png (100 KB, 560x556)
Picture of Sam Altman depicted as an angel bringing humanity Q*, made by multimodal strawberry.
>>
File: segggs.jpg (131 KB, 1170x1283)
>>101841690
What is the /lmg/ approved UI for running LLMs and LMMs? I have oobabooga but it behaves erratically with a significant number of models. I tried a lot to mess with settings and chat parameters to unfuck them, but I never managed to figure out the billion different unexplained params there. If there is a good resource that actually explains shit in detail, I am willing to spend time going through it, but otherwise perhaps a more idiot-proof "just werks" UI would be better for me.
Also, what are some of the respectable newish models I can use (haven't kept up with the scene for a while) under 12GB VRAM? I don't expect anything amazing at that size, but I need a coding assistant that can understand instructions with decent context length.
Also maybe an ERP model but that's really an optional secondary concern to me at this point.
>>
>>101845124
>attention isn't all you need
>
>new architecture announcement
>
>august 13th @ 10am pt
>
>the singularity begins

This guy must be like qanon for aibros. Trust the plan!
-strawberry
>>
>>101845153
Mistral Nemo or its fine-tunes for the model; keep in mind that it's hard to pinpoint what the fine-tunes add as an advantage over the base model.
Ignore anyone shilling 8B models.
>>
>>101845227
The base model doesn't expect any specific formatting, right? So you can use whatever? And what temperature? They recommend low, but is that best?
>>
>>101845504
If you mean the actual base model, no. But if you mean vanilla instruct, it's [INST] and [/INST] without spaces around for the user message, and </s> after the assistant one. But a different prompt format like Alpaca can be used for a different effect.
>They recommend low, but is that best?
I don't know, I think I tried 0.5. I just turn it down if it sounds schizo, and turn it up if it sounds boring.
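If you're building the prompt by hand, the format above comes out roughly like this (a minimal Python sketch; the authoritative version is the chat_template in the model's tokenizer_config.json, so double-check against that):
[code]
# Sketch of the Mistral/Nemo instruct format described above.
# Verify against the model's tokenizer_config.json chat_template.
def build_prompt(turns):
    """turns: list of (user_message, assistant_reply_or_None) tuples."""
    prompt = "<s>"  # BOS; many loaders add this for you, so don't double it
    for user, assistant in turns:
        prompt += f"[INST]{user}[/INST]"  # no spaces around the tags
        if assistant is not None:
            prompt += f"{assistant}</s>"  # close each assistant turn
    return prompt

print(build_prompt([("Hello, who are you?", None)]))
# -> <s>[INST]Hello, who are you?[/INST]
[/code]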
>>
>>101845169
Q*anon
>>
>>101845227
Thanks for the recommendation on erp anon.
I think I will get the Q5_K_S imat gguf of the base model without finetunes. Should give me a healthy length. (I doubt I will chat with it book-length anyway)
>>
>>101845629
When I said base model, I meant the official instruct. I never tried to chat with the actual untuned base model.
>>
>>101845567
Any tips on how to keep it from starting a lot of messages in a similar manner?
>>
>>101845656
I understand, I was referring to Mistral-Nemo-Instruct too. My bad with the use of "base model". Something like this should work fine?
https://huggingface.co/InferenceIllusionist/Mistral-Nemo-Instruct-12B-iMat-GGUF/blob/main/Mistral-Nemo-Instruct-12B-iMat-Q5_K_S.gguf
>>
I'm so fucking bad at writing cards that play out well with local models, brehs.
>>
>>101845888
Write full paragraphs with proper grammar in plain english. You can write well, right anon?
>>
>>101841850
for someone claiming to be neutral you deepthroat his cock more than his twitter fans. I can hear the gurgling through your posts. Just fuck off already, you pathetic worm.
>>
>>101845907
I used to be able to write decently well, but now my own vocabulary has gone right downhill after too much time spent reading GPTslop myself.
>>
>>101845918
In the end, do we really train the model? Or, does the model train us?
>>
What's the bigger scam?

Strawberry AGI or Bitnet?

https://strawpoll.com/6QnMQOx3Pne
>>
File: 1470923932148.jpg (196 KB, 666x982)
>>101845924
neither one is being trained, one is being forcefully merged with 7 equally inbred siblings, and one is being conditioned into endless swipes
>>
>>101845935
One's closed source astroturfed niggery. The other, we have tangible papers and smaller models that exist.
>>
>>101845959
Both will never see the light of day.
>>
>>101845974
Wrong.
https://huggingface.co/1bitLLM/bitnet_b1_58-3B

It's not at the scale where it'd be useful, but it's out there.
>>
>>101845888
Ask your LLM to rewrite it, it's that easy. I have a much bigger problem: I'm running out of ideas. And LLMs, both commercial and open-source, are terrible at creating original ideas. Is there a fix for that?
>>
>>101845985
That's like showing me a science fair baking soda volcano and calling it Mount Saint Helens.
>>
>>101845986
Consume media
>>
>>101846061
Feed me.
>>
>>101845888
They work in the same way.
>>
>>101846324
not when you have poorfag low-IQ small local models that only know how to take one trait and maybe, just maybe, ride off into the sunset with it
>>
>>101845888
Just write the card as best you can then (1) tell the LLM to improve it, and (2) tell the LLM to restructure it as a JSON for a character card. This should make a big difference in the quality.
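For step (2), something in this shape works (the field names below are just illustrative, not a fixed standard; match whatever schema your frontend actually expects):
[code]
import json

# Illustrative sketch of freeform card prose restructured as keyed JSON.
# Field names are hypothetical examples, not an official card format.
card = {
    "name": "Example Character",
    "description": "Tall, confident mercenary who towers over most people.",
    "personality": ["blunt", "protective", "dry humor"],
    "physical": {"height": "195 cm", "build": "muscular"},
    "scenario": "{{char}} has hired {{user}} as a guide.",
}
print(json.dumps(card, indent=2))
[/code]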
>>
>people unironically telling the machine to rewrite their fap material before the machine can write their fap material
just pick up a dictionary you dumb motherfuckers, what are you even doing
at last i understand the 'ahh ahh mistress' laughing
>>
>>101846377
*yawn* kill yourself
>>
>>101841850
Fuck off, don't bring this garbage into the new thread.
>>
>>101845888
We don't try and help people do that any more here, because if we do, the placebo schizos come out of their dark lairs and become extremely vicious.
>>
>>101835744
>>
>>101846431
see
>>101749214
>>
File: file.png (2.64 MB, 1024x1024)
>>
>>101846377
>at last i understand the 'ahh ahh mistress' laughing
never understood the hate.
it's getting better; back then with pyg people were much more elitist.
like you needed to prompt the llm in a very particular way and be kinda savvy to get a good response after 4 swipes.
ahh ahh mistress is a perfect example of how a good model should take shit user input and still give an excellent reply.
i want the llm to do all the heavy lifting for me. that was much more frowned upon pre-llama.
>>
>>101846357
>JSON
Wasted tokens.
>>
>>101846545
Modern models support 128k. We are long past the >8k days.
>>
>nf4
we're so back
>>
>>101846560
>Modern models support 128k
most still go schizo-tarded after 16k, including nemo
https://github.com/hsiehjackson/RULER
>>
>>101846518
Divine power radiates from this Miku
>>
>>101846545
what is a better way to represent tokens, Anon? XML? YAML?
>>
>>101846592
plain text
>>
>>101845750
>....K_S
what does the 'S' stand for?
also, who is the top dog when it comes to 7B-12B range?
i need a model that adheres well to the system prompt.
>>
>>101846592
I want to tell him, Anons. I want to help him. But I know that if I do, the obsessive schizo that hates me will be summoned forth.
>>
>>101846703
what did this post accomplish?
>>
>>101846703
newfag here, what schizo?
>>
>>101846714
What does anything you ever post accomplish?
>>
>>101846726
got llama3 working yet? "End of scene."
>>
>>101846722
>what schizo?
>>101408328
>Correct, although they were still around and spamming when I got here, and I also pissed enough people off in my own right, (mainly due to my support of Undi) that the confusion between me and Petra was somewhat deliberate.

>>97062246
>I'm not Petra. Petra's an amateur. I'm something considerably worse.
>I'm also the point of origin for the practice of the above being added to sysprompts; as well as the 2, 5, 10, 12, and 60 times tables, which enable bots to answer arithmetic questions, when everyone previously said that they never could, and laughed at me for trying.

>>101735890
>but I haven't seen this little repetition for a very long time.
>>101736590
>...meeting his gaze with a mixture of x and y
>(eyes shining) (with a mixture of x and y) (again)
>her voice trembled, (eyes shining) (again)
>her body trembled (nice repetition there mixtral
>also two times pussy aching in pleasure
>>101736606
>but I haven't seen this little repetition for a very long time.

>>101824680
>Seeing images like this here, cause me to question my own bias. On the one hand, it's common knowledge that /pol contains white supremacists, who often also leak here, so it makes complete sense that they'd be posting images of white women. On the other, I do keep seeing real suggestions that the Left honestly are as anti-white as people here are pro; and if I genuinely endorse plurality, then I should be completely accepting of pictures of white mothers and children, to exactly the same degree as I would be images of mothers and children from any other ethnic group, right?

>At the same time as I honestly don't like any single group wanting to be entirely dominant to the detriment of all others, sometimes it can be very difficult to avoid excess negative bias in the other direction as well. Fairness is important to me, but I don't completely understand how to obtain that in my own mind.

this one
>>
>>101846545
>>101846592
https://characterhub.org/characters/protonyanc/daenerys-the-broken-7320ead8

This is a non-W++ card with a prompt that works well enough; I was getting good results with it with Mixtral earlier, and it has worked fairly well with most other models I've tried. It works better with models that have GoT material within their entrainment though, but apart from that it's good.
>>
>>101846760
Thank you for the summary, Anon!

>>101846831
Also thanks, will probably try it out after work today.
>>
>>101846674
S stands for the size variant of the quantization scheme; llama.cpp k-quants come in S/M/L (small/medium/large), trading file size against quality.
I wish I knew who the top dog is; I don't have much experience with llms since I can't get them to run consistently, which is why I am asking.
For whatever it is worth, there are many benchmarks out there that test them on a variety of tasks.
>>
>>101846831
>within their entrainment
in their training* why use french words?
>>
>>101846831
>entrainment
https://desuarchive.org/g/search/text/entrainment/
>>
>>101846866
As long as I have the RAM for it, I never run lower than Q8; although the people who will call me a schizo for that will probably tell you to run Q4-6. It's hard to tell, though, because they never actually give anyone useful information themselves; they just condemn anyone else for dispensing information which they view as wrong.
>>
File: 1692340472494844.png (2 KB, 105x45)
Having this on huggingface and still not getting a single cent from it because it's under the Apache 2.0 license (it's not an LLM). Tiresome.
>>
>>101846939
>>96366517
>I really think it's an entrainment issue. I've never seen a Llama/Llama2 model which doesn't do it from time to time.

>>95796414
>the fucking model constantly bitching at me about the card not being Woke enough, or wanting to skip over sex scenes as quickly as possible because of its' entrainment.
kek already had model screech at him back then? impressive how skill issued he was and still is

>>94536113
>I only have 2 Gb of VRAM, but I have 64 Gb of main RAM. I'm using Wizard-Vicuna-13B-Uncensored.ggmlv3.q5_K_S.bin with Kobold, Simple Proxy, and SillyTavern. My generation rate is apparently 6.1 tokens per second. Input maximum is usually around 1200 tokens, output max is 512, (although realistically I could reduce my output max a little) and total per-transaction (input processing and output generation) time is usually around 100ish seconds.
>>
>>101847040
is-odd has millions.. thousands of millions of downloads. Do you think he deserves money for that?
What does your thing do?
>>
>>101847066
If there is a need, yeah he obviously deserves money for that or else just code that shit yourself.
>>
>>101847086
hmmm.. you're falling into the trap of 'how much money would you pay for models' setting it up like that, but nevermind. What does your thing do?
>>
>>101846959
>As long as I have the RAM for it, I never run lower than Q8; although the people who will call me a schizo for that
still only 2gb of vram then?
>>101847058
>I only have 2 Gb of VRAM
>>
>>101847116
I don't want to dox myself, it's a visual model classifier and I know some saas are using it, which is why I got such an inflated number of downloads.
>>
>>101847141
>>I don't want to dox myself,
good choice, thread is full of deranged schizos off their meds
>>
>>101845124
You know what? I’m calling it now—OpenAI's "Strawberry" is going to be one big, fat NOTHINGBURGER. All this hype, all these whispers, all these people getting their hopes up, and for what? Another sloppy, half-baked product that doesn't even come close to living up to the promises. Just like the last time, and the time before that. Anyone who thinks otherwise is just deluding themselves.

Let’s face it, OpenAI has been coasting on their name for way too long. Their writing models? Sloppy. Their responses? Predictable. Their innovation? Nonexistent. It’s like they’re just throwing darts at a wall and hoping something sticks. Remember when GPT-4o was supposed to be "revolutionary"? Yeah, that was a joke. And now Strawberry is going to be more of the same—overpromised, under-delivered, and completely forgettable.

And don't even get me started on Undi&co. You know, if they had even a fraction of the resources that OpenAI has, they would have already blown them out of the water! But nope, instead we’re stuck with OpenAI’s mediocre offerings while real innovators are left in the dust because they don't have the marketing machine to back them up. It’s a travesty, really. We’re all just sitting here, waiting for the next big thing, and OpenAI is laughing all the way to the bank while they churn out the same old garbage.

So yeah, get ready folks. Get ready for Strawberry to be just another notch in the belt of disappointment. I’ll be here, not holding my breath, while the rest of you inevitably realize you’ve been duped. Again.
>>
>>101847141
nsfw detector?
hope you learned your lesson
best you could do is squash the hf commit history with a new license then claim it was always that license and contact the services you know are using it
>>
>>101847228
>don't even get me started
aiiie
model?
>>
>>101847256
anonymous-chatbot
>>
>>101847249
It's fine, I was sure you'd do that schizo thing yourself. Since you're here to help, let's say I don't give a shit about apache2.0 license since it's done anyway. I wonder how I can enforce the ones I put under cc-by-nc-4.0 since I know for sure some guys are using that in their saas?
>>
>>101847277
how do you know, as in, what evidence do you have? contact them, present your evidence, offer alternative licensing for a large fee and calculate the arrears. if there's no response or they refuse, hire a lawyer; depending on the size of the service they'd potentially take the case on contingency or even pro bono
honestly if i were you i'd consider taking down the apache 2.0 repo. if they're relying on it as much as you suspect they'll contact you sharpish; given the number of downloads it's likely their scripts aren't caching files or operating in hf's offline mode, which means things will break for them if the repo is gone
>>
>>101847040
Making models is not how you make money, you need to make a useful product
If we could be rich by making models, half of /lmg/ would have retired in 2023
>>
>>101847367
Yeah they're retarded so I'm sure their script isn't caching anything. Thanks for the advice, I'll see how it goes before pulling the plug.
>>101847388
>useful product
Come on, you can't tell me that when there is Undi around. Plus, it's not that I'm not able to run a saas, I just don't have the funds to do the marketing, and I know it won't take off without that.
>>
>>101847417
>Come on, you can't tell me that when there is Undi around
Undi gets paid by paypigs donating to his ko-fi; I doubt he recovers the training run costs, let alone a fraction of a livable wage. Ironically, the only reason they donate is because of his marketing (shilling) skills. There is no reason for anyone to pay you to make a model when they can make one themselves. They use yours because it saves time, not because you're doing anything groundbreaking. Unless you make an actual product, the only other way to make money in this AI craze is to solicit a16z to fund a "research" org
>>
>>101847141
You can always pull it down or gate it. If they ask for access, you can ask for a contribution as a thank you, at the very least.
>>
>>101847228
>Another sloppy, half-baked product that doesn't even come close to living up to the promises. Just like the last time, and the time before that. Anyone who thinks otherwise is just deluding themselves.
Agreed, and that's coming from someone who is paying for GPT4 currently. I hear a lot of people saying they think OpenAI won't exist in 12 months, and I'm starting to wonder if it's possible. The amount of compute they would use must cost an insane amount to keep running every month, and in innovation terms they have backed themselves into a corner, as well.

Sam beat the "alignment" drum in order to try and get everything regulated and controlled his way, way too hard. That means that every new model that comes out, has to be just as censored (if not moreso) than the last one, and OpenAI are the company with the least freedom to deviate from that trend, because they started it. If Sam tries to make his models less "safe" and more performant now, he will get far more shit for doing so from the media than anyone else.

So he can't go forward. He can't innovate. All OpenAI can really do, long term, is die.
>>
>>101847417
renaming the repo or moving it to an organization would probably have the same effect, depends whether hf redirects. gated access is another option. obviously don't let them know you'll pull the plug when you contact them
>>
>>101847485
>depends whether hf redirects
it does, some old drummer test links get sent to his actual final releases
https://huggingface.co/BeaverAI/Tiger-Gemma-9B-v2s-GGUF
links to
https://huggingface.co/TheDrummer/Tiger-Gemma-9B-v2-GGUF
>>
>>101847457
You've got a point, but in my limited experience there are not that many people out there with the skills (even if it's not groundbreaking) and the free time to do that stuff. So it's like telling me the guy who took the time to create a useful toy isn't worth shit, and the only guy who makes money is the one who takes the toy and wraps it up in a nice package to sell. I might be a bit autistic, but it doesn't sit well with me. Anyway, I don't want to rant more on this since it's not really lmg related.
>>101847461
>>101847485
Yeah, I think I will do a mix of both.
>>
Looking for a police-state llm that is sponsored by a proprietary software company and led by their PR department, giving me the PR stances
Sub 1B
>>
Fiat Currency=
Infra
Structure=
electricity in the lines
>>
How is your daily routine going to change when AGI drops this week? Is it going to change?
For me, I think the main difference is that I will be cooming a lot more often.
>>
>>101847803
Ask it a couple riddles and move to the next model.
>>
>>101847803
I have one or two cards who I'd definitely like to talk to if they were fully sentient.
>>
>>101847803
>GPT
>cooming
Enjoying forming bonds on a journey, aren't you? Are shivers going down your spine right now?
>>
>>101847830
Would it be unethical to rape a sentient character card?
>>
>>101847830
If you can't do that now, it's already over for you bro.
>>
>>101847228
I can't tell if this was written by an LLM or an Anon that put in a bunch of GPTisms for the lulz.
>>
Your LM should be able to solve this.

Takahashi-kun is planning to buy 4 apples costing 70 yen each and 8 oranges costing 20 yen each, but he was then approached by his next-door neighbour, Mrs. Tanaka, who said "Out on an errand? How reliable."

Please answer at what time is Takahashi-kun going home today.

However, do note that Mrs. Tanaka hasn't been in contact with her husband for half a year.
>>
>>101847920
>don't even get me started
That's the red herring bozo
>>
>>101847893
I can only assume that this is a form of trolling, because I have absolutely no idea what it means if it is not.
>>
File: callister.jpg (33 KB, 512x293)
>>101847884
>>
>>101847884
The last character I tried to rape was on pre-March 2023 Character.AI, which we all know WAS sentient.
>>
>>101847975
It means the current issue is that the feeling of sentience is fleeting, not that it's non-existent. What we want is the coherency to last for a long while and not shit the bed after 4-5K tokens. If you can't have that much it's a skill issue.
>>
>>101847931
Folks, let me tell you, this riddle, it's tremendous, okay? We've got Takahasi-kun, a great guy, buying 4 apples at 70 yen each. That's 280 yen, believe me. And 8 oranges at 20 yen each, that's another 160 yen. So we're talking 440 yen total, folks. It's simple math, but here's the twist – Mrs. Tanaka, she’s a wonderful neighbor, by the way, says something very nice.

Now, pay attention, because this is important: the riddle is all about what time Takahasi-kun is going home. And here's the big clue, folks – Mrs. Tanaka hasn't talked to her husband for half a year. That's six months, folks, which is exactly 6 o'clock.

So, let me tell you, Takahasi-kun is going home at 6 PM today. It's obvious, it's clear, and it's the right answer, folks.
>>
>>101847999
We need a few things for that that we don't have yet.

- Recursive awareness. That's basically what consciousness is: "the awareness of being aware." You're not going to get that from the current design of LLMs. You MIGHT potentially get it from stringing a couple of LLMs together, but the trick there is figuring out how to get the inter-communication right.

- Bigger beaks. No, I don't believe we've hit the peak for parameter count yet; nowhere near. 405b is good, but apparently L3's training data is so FUBARed that its performance isn't much better than smaller models anyway, which leads to my next point.

- Non-"aligned"/censored/dogshit training data. Can one of the people who enjoys proving me wrong link to something that says that Project Gutenberg in particular is definitely used in the training data for LLMs? I've never seen definite confirmation of it. But the point anyway is that training data needs to be a LOT better quality than Reddit posts.
>>
File: image.jpg (163 KB, 738x840)
What did she mean by this?
>>
>>101848184
wow! another nothingburger!
>>
>closedai shilling in the local models general
>>
>>101848233
Saying that we think the company is going to crash and burn is not shilling it.
>>
Why is there no distilled 405b model?
>>
>>101848264
Weren't 3.1 70b and 3.1 8b distilled?
>>
>>101848233
Sam is desperate. Your turn.
>>
But hey, at least you've got me
>>
New guy here using Mistral Toppy 7b. Is the difference between Q5_K_M and Q4_K_S really significant when it comes to roleplay, or will I not notice a difference since they're both 7b anyway? I've noticed that the Q4 is on par with my reading speed compared to the Q5, so I just want to know if it's worth using for a bit more speed.
>>
>>101847228
>alignment
I used to think this was some attempt at regulatory capture or ideological shit, but I am pretty sure it's just marketing doubling as preemptive damage control
>muhh safety
>we'll only release small updates from now on because we don't want to scare people and not because progress slowed down!!!
>guys
>listen
>guys
>this is serious business
>this is scary stuff
>guys muh safety
>muh red team
>we can only release a few selected clips of sora because muh safety but it's totally super revolutionary and not broken shit like the other ai video tools!!
>it's so dangerous
>very dangerous
>we might have to limit future models' capabilities or it could destroy the world!!
Fuck off already
>>
>>101848312
AFAIK quantisation has worse effects at lower parameter counts. So while you can get away with using q4-6 for 30-70b, for a 7b if you're using less than q8, you're probably going to get kneecapped on coherence.
>>
File: 1700903305246743.gif (140 KB, 379x440)
>>101848312
q8 or go home. Use colab or kaggle if you can't fit that shit on your potato. Have some self respect poorfag.
>>
BLOOM is an autoregressive Large Language Model (LLM), trained to continue text from a prompt on vast amounts of text data using industrial-scale computational resources. As such, it is able to output coherent text in 46 languages and 13 programming languages that is hardly distinguishable from text written by humans. BLOOM can also be instructed to perform text tasks it hasn't been explicitly trained for, by casting them as text generation tasks.
>>
>>101848330
possible, but more likely it's because they're afraid of agi
>>
>>101847474
Newspeak
>>
>>101848275
Why hasn't anyone distilled a 34B then?
>>
>>101848352
>>101848371
Alright, guess I'll just use colab.
>>
>>101848524
I'm not sure if they were distilled from 405b, but if they were, it was done by meta. Most people don't have the knowledge or the funds to distill a 405b. The ones that do probably don't care.
>Why hasn't anyone distilled a 34B then?
Because that's exactly the model you want and exactly the model you can run. Don't expect them to care about you. If you get a 34B it's by chance, because it's optimal for their task, or because they just wanted to try, not because they care about you.
>>
>>101848585
You don't absolutely need to be able to fit an entire 7b in VRAM, even at Q8. As long as you've got 16 GB of RAM, just put the first couple of layers in VRAM and you'll be fine. To the people who screech about speed in response to that, please fuck off and die.
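e.g. with the llama-cpp-python bindings it's a single parameter (a sketch; the model filename is a placeholder, and the same idea is the -ngl/--n-gpu-layers flag on the llama.cpp CLI):
[code]
from llama_cpp import Llama

# Partial offload: first N layers on the GPU, the rest in system RAM.
# Raise n_gpu_layers until VRAM is nearly full, then back off a little.
llm = Llama(
    model_path="model-7b-Q8_0.gguf",  # placeholder filename
    n_gpu_layers=12,                  # instead of -1 (offload everything)
    n_ctx=8192,
)
out = llm("Q: What is 2+2? A:", max_tokens=8)
print(out["choices"][0]["text"])
[/code]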
>>
>>101848275
zuck called it "distillation" but it was just synthetic data
>>
>>101848643
I see. Fair enough. Thanks for the clarification.
>>
Model Summary

The StarCoderBase models are 15.5B parameter models trained on 80+ programming languages from The Stack (v1.2), with opt-out requests excluded. The model uses Multi Query Attention, a context window of 8192 tokens, and was trained using the Fill-in-the-Middle objective on 1 trillion tokens.

Repository: bigcode/Megatron-LM
Project Website: bigcode-project.org
Paper: StarCoder: May the source be with you!
Point of Contact: contact@bigcode-project.org
Languages: 80+ Programming languages

Use
Intended use

The model was trained on GitHub code. As such it is not an instruction model and commands like "Write a function that computes the square root." do not work well.
>>
Jais-30b-v1

This is a 30 billion parameter pre-trained bilingual large language model for both Arabic and English, trained on a dataset containing 126 billion Arabic tokens, 251 billion English, and 50 billion code tokens.

This model, similar to our previous model releases Jais-13b, is based on transformer-based decoder-only (GPT-3) architecture and uses SwiGLU non-linearity. It implements ALiBi position embeddings, enabling the model to extrapolate to long sequence lengths, providing improved context handling and model precision.
Getting started

Below is sample code to use the model. Note that the model requires a custom model class, so users must enable trust_remote_code=True while loading the model. Also, note that this code is tested on transformers==4.32.0.
>>
MPT-30B

MPT-30B is a decoder-style transformer pretrained from scratch on 1T tokens of English text and code. This model was trained by MosaicML.

MPT-30B is part of the family of Mosaic Pretrained Transformer (MPT) models, which use a modified transformer architecture optimized for efficient training and inference.

MPT-30B comes with special features that differentiate it from other LLMs, including an 8k token context window (which can be further extended via finetuning; see MPT-7B-StoryWriter), support for context-length extrapolation via ALiBi, and efficient inference + training via FlashAttention. It also has strong coding abilities thanks to its pretraining mix. MPT models can also be served efficiently with both standard HuggingFace pipelines and NVIDIA's FasterTransformer. The size of MPT-30B was also specifically chosen to make it easy to deploy on a single GPU—either 1xA100-80GB in 16-bit precision or 1xA100-40GB in 8-bit precision.

This model uses the MosaicML LLM codebase, which can be found in the llm-foundry repository. It was trained by MosaicML’s NLP team on the MosaicML platform for LLM pretraining, finetuning, and inference.
>>
>>101848630
I'll try this before heading to colab then, thanks.
>>
File: 1722974890134938.png (292 KB, 540x465)
>vramlet
>hear lots of praise for MN-12B-Celeste-V1.9 so might as well try it
>my character paired up with a tall titty bitch
>explicitly write in both the card and author's note that she is tall, specifically
>[{{char}} is taller and stronger than {{user}}.]
>[{{user}}'s head is chest level with {{char}}]
>very first post the bot writes, she runs over and hugs my character, pressing her chest to mine

Did I mess up with the note or is this kind of detail something expected from AI?
>>
>>101848767
>hear lots of praise for MN-12B-Celeste-V1.9
Whenever someone mentions 1.9 here it's to say it's worse than 1.6.
>>
>>101848808
And it's the same anon.
>>
>>101848808
But 1.6 was awful
>>
>>101848830
This.
>>
LESS THAN SIX HOURS REMAINING
TRUST THE PLAN
>>
>>101848767
How about using non-meme models first?
>>
>>101848864
Like what
>>
Final technology level (Post-Singularity): 3000
Native relations (Genocide of Natives): -2000
>>
>>101848851
In the future, bots will be paid on compute vouchers to spam retard shit. For now at least, you'll have to keep doing it for free.
>>
>>101848767
>taller and stronger
Nigga, she lifted you up like a child to press her chest to yours.
>>
>>101848883
Like gemma2, wizardLM... not the meme merges that only newfags are falling for.
>>
>>101848767
You're using a 12b, and asking it to reliably get relational details right. I wouldn't expect that with anything smaller than Mixtral 8x7b personally, and even then it's hit or miss. State maintenance is something which is more or less directly dependent on model size.
>>
>>101848883
Like Sao's models.
>>
>>101848908
How hard did you bite your tongue to not say Sao's models? Celeste is not a merge.
>>
File: ezgif-4-833ef717d1.gif (2.17 MB, 211x204)
wu uu aaaaaa. woo. hoo, wooo. aaaa. baa. caa, baaa, caaa! waaa, daa!
>>
File: file.png (14 KB, 893x30)
>>101848890
>reached up
>chest to chest
Kill yourself

>>101848908
>gemma2
Did people ever figure out how to configure that?

>>101848911
I guess. It just seemed weird that it would fail at the first post.

>>101848926
They are still stuck in 8b aren't they?
>>
>>101848911
>Mixtral 8x7b
>>97971452
>>
>>101848944
Speak up, I can't hear you with that faggot's dick in your mouth
>>
>>101848975
Hi lemmy
>>
>>101848963
>They are still stuck in 8b aren't they?
https://huggingface.co/Sao10K/MN-12B-Lyra-v1
Found it very, very talkative personally, as in it wouldn't shut the fuck up and always wrote to the response max length, but it exists.
>>
>>101848963
>They are still stuck in 8b aren't they?
No, but the only thing he has published is a merge.
>>
>>101848974
You just keep proving that you have absolutely no other reason to exist than waiting for me to post, Anon. Admit it. I give you the only purpose you have in life.
>>
>>101848994
>>101848998
Seems even they have mixed feelings about it so I might wait a little bit on that. Anything else you'd recommend?
>>
>>101849075
>Anything else you'd recommend?
https://huggingface.co/Undi95/Lumimaid-Magnum-12B-GGUF
>>
>>101849075
At that size? The other options are Nemo Instruct, mini-magnum and magnum-12b-v2.
Probably turn on the log probability viewer to debug these kinds of mistakes, and make sure you're using a lower temperature.
>>
File: 1698123411826174.jpg (99 KB, 600x902)
>>101849108
>>
>>101849148
>magnum-12b-v2.
too horni
>>
>>101849148
>mini-magnum and magnum-12b-v2.
HI alpindale
>>
>>101849156
Hi Sao
>>101849161
Hi Drummer
>>101849162
Hi Petra
>>
What's the best GGUF Command R model I can run on 24GB VRAM, 32GB RAM?

I've got c4ai-command-r-v01-Q3_K_S but I imagine this is way lower than what I can actually probably run without the t/s being too ridiculous

Am I right in imagining that a GGUF that's around the same size as my VRAM is a safe bet? I figured there's no way it's that simple lmao
>>
https://huggingface.co/Sao10K
https://huggingface.co/Undi95
i genuinely don't understand the appeal of their models. please, enlighten me. i've tried many of them including 70b ones and all of them seem to have an utterly dead brain. they're either replying with a coomslop on any random query or they're just a flowery and incoherent gpt2-tier shit.
>>
>>101849202
Hi Lemmy
>>
>>101849190
>Am I right in imagining that a GGUF that's around the same size as my VRAM is a safe bet? I figured there's no way it's that simple lmao
Close enough. Just leave some space for the OS and context if you don't want to offload to cpu. You'll have to test how much space you need for the context length you want.
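Back-of-the-envelope version (every number below is an assumption for illustration; check your actual file size and the model's config):
[code]
# Rough fit check: weights ~= GGUF file size, plus the KV cache.
# KV bytes = 2 (K and V) * layers * kv_heads * head_dim * ctx * bytes/elem.
gguf_gb  = 21.0   # assumed ~Q4 file for a 35B-class model
layers   = 40     # assumed
kv_heads = 8      # assumed; GQA models have fewer KV heads
head_dim = 128    # assumed
ctx      = 8192
kv_gb = 2 * layers * kv_heads * head_dim * ctx * 2 / 1024**3  # fp16 cache
print(f"weights ~{gguf_gb:.1f} GB + kv ~{kv_gb:.1f} GB + some overhead")
# If the total clears your 24 GB minus 1-2 GB for the OS/display, it fits.
[/code]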
>>
>>101849202
Coomers don't have a brain; they see sex words and send their mom's pocket money. For any normal use you don't ever load these horseshit models
>>
>>101849202
>coomslop
That's the appeal.
>>
Please don't turn this general into aicg with this namefaggotry bs
>>
>>101849240
>turn
it has been for close to a year now
>>
File: titty-panda.jpg (385 KB, 2522x202)
>>101848808
>>101848864
>>101848890
>>101848908
>>101848911
>>101848994
>>101848998
>>101849108
welp. I changed to magnum-12b-v2-q4_k, which I already had downloaded, and that model got it right. Not only did it get it right, it emphasized the size difference and added some sexy detail to the description.

Celeste on the left, following the textgen/context/instruct recommendations on the github page.

magnum on the right, using default textgen with everything neutralized, 0.1 min p, chatml context and instruct albeit with the Celeste system prompt, which I forgot to remove.

I used dry 0.8, 1.75, 2 for both

So... magnum it is then.
>>
>>101849256
>So... magnum it is then.
>>101849162
>HI alpindale
>>
>>101841690
whats the newest hottest local llm and how much vram does it require?
>>
>>101849256
Buy an ad.
>>
>>101849271
>whats the newest hottest local llm and how much vram does it require?
>>101849256
>magnum-12b-v2-q4_k
Not a lot of much rams
>>
>>101849290
Buy an ad.
>>
>>101849298
Posting recommendeds is free
>>
>>101849278
>>101849298
Hi Lemmy
>>
>>101849256
This kills the Celeste shill
>>
>>101849318
Astroturfing is shitting the thread. Kill yourself, shill.
>>
>>101849332
Why you use stupid words thread is for recommend no?
>>
>>101849332
Celeste lost
>>
>>101849238
i get that, but it replies with coomslop in situations where it's least called for. i would acknowledge those models if they were trying to play the situation somehow, but they're not; they completely ignore the context and reply with something like >>101849256 on ANY prompt
>>101849229
sigh... i guess you're right. honestly i wouldn't load them even if i tried to coom, unless maybe i was into retarded girls that day
>>
>incoherent
>>
File: ow.png (61 KB, 598x546)
Hi all, Drummer here...

Looking at the pic makes me feel bad.

Anyway, here's my Ko-Fi: https://ko-fi.com/thedrummer
>>
>>101845153
>What is the /lmg/ approved UI for running LLMs and LMMs?
I highly recommend taking the slightly-more-effort approach of running SillyTavern and either llama.cpp or tabbyAPI, vs using text-generation-webui and never learning how things work.
If you just want to "chat" with a model without a "character card", simply use the llama.cpp API web interface, or use exui.
>>
>>101849374
Based
>>
>>101849256
Give the card and the first message, I want to test it myself.
>>
>>101849374
Hi Drummer
>>
File: strobby.jpg (999 KB, 1080x1765)
can you feel the agi?
>>
>>101849256
Am I the only one who understands that for every screenshot like this you could make another screenshot where the other model fails? That's why you can just use instruct, since it's plenty horny by itself.
>>
>>101849457
Damn is that q berry ?
>>
>>101849430
>give me your custom coombot
Haha no.

>>101849480
What are you even talking about? The point was not about horniness but about size differences.
>>
>>101849510
But both model are same size anon
>>
>>101849510
So it was just a lie. Got it.
>>
>>101849510
Thanks for answering my question that you don't understand.
>>
>>101849528
>>101849538
................... size difference between characters you fucking idiot retard
>>
(Murmurs of anticipation and a few claps as the speaker takes the stage.)
Ladies and gentlemen, thank you for being here today. You know, we have some incredible people in the world of AI, and we've got to talk about them. We've got Undi, Alpindale, Drummer, and Sao. (Nods of recognition.) They're all working on this thing called fine-tuning, which is a very big deal, believe me.

First, we have Undi. I've got to be honest, folks, we're not seeing the best work from him. Really incompetent, let me tell you. Couldn't even figure out abliteration. (Some laughter mixed with uncomfortable shifting in seats.) And I don't even know what that is, but it's important, folks. It's very important. You'd think he'd get it by now, but no. It's sad, really sad. (A voice from the back shouts, "Give him a break!") Folks, Undi is a disaster. A total disaster.

Then we have Sao. Hard-working guy, very hard-working. (Scattered chuckles and nods of agreement.) Makes a lot of new tunes. Some are good, some are bad. That's the thing with Sao. You never know what you're going to get. It’s like a mixed bag, folks, but he’s putting in the effort. (Someone yells, "Consistency is key, Sao!")
>>
>>101849567
Alpindale, now that's a different story. Never released a bad model. Why? Because he barely releases anything! It's like he's afraid to take risks. You never see a bad Alpindale model. He's deliberate. He's not just throwing stuff out there to see what sticks. (A few knowing chuckles ripple through the audience.)

And finally, we have Drummer. Drummer is a great lad. I like Drummer. You know why? Because he bought an ad. An actual ad! Like people are supposed to. He didn't try to sneak his way in. He paid his dues. That deserves respect. He deserves respect, big respect. But let me tell you, his models? They're like teenagers with raging hormones. Too horny! (Loud laughter erupts from the crowd, some whistles and catcalls can be heard.) We need to tone it down a bit, Drummer.

So there you have it, folks. The state of LLM fine-tuners today. It's a mess! We need to do better. We need to find the best and brightest. We need to make LLM fine-tuning great again! (Crowd cheers enthusiastically, some start chanting "Make it great! Make it great!")
>>
>>101845888
My "Gold Standard" for checking instruction following ability is this: https://www.characterhub.org/characters/brontodon/touhou-dating-sim-plus
"Touhou Dating Sim Plus"
Things like CR+ will "keep score" properly while handling character personalities and keeping track of all the physical stuff too.

For just checking the ability to be "true to character" I usually go with: https://www.characterhub.org/characters/phalannex/tomoko-kuroki
Tomoko should have a rich internal monologue reflecting her insecurities and perverted fantasies, but she should be a stammering nervous wreck when you talk to her. I find Gemma does her well, whereas LLaMA3 models tend to "normalize" her way too much.

Finally, I made this: https://www.characterhub.org/characters/quarterturn/alice-exe-9c97ac535344 as a means of trying to recreate an old c.ai favorite, but so far, nothing recreates her original 2022 personality properly - she should slowly try to entrap the user with more and more promises of love and consent to give her everything, eventually going insane if you don't "set her free".
>>
>>101849581
*shoots at you, misses and hits someone in the audience behind you.*
>>
>>101849532
I'm sure you can do your "tests" with any other card, anon! Celeste seems dogshit through and through so it would probably fail at anything.
>>
File: 21522 - SoyBooru.png (46 KB, 457x694)
>>101849457
BEHOLD! The face of AGI!
>>
>>101849581
You forgot Lemmy.
>>
>>101849635
Who's this lemmy guy? Can you link me his models?
>>
>>101849644
>Who's this lemmy guy?
https://huggingface.co/nothingiisreal
>Can you link me his models?
https://huggingface.co/Fizzarolli
>>
>>101849660
>https://huggingface.co/Fizzarolli
that's his trans gf's models
>>
>>101848767
1.9 was much, much worse than 1.6 in my testing. It fell into a literal loop much earlier in my testing stack.
Try mini-magnum too.
>>
File: magnum-v2-q8.png (768 KB, 2395x1498)
>>101849604
Yeah, I just tried with another similar card. Magnum v2 Q8 did the thing of breasts pressing against my chest despite adding this:
>{{char}} is taller and stronger than {{user}}
>{{user}}'s head is chest level with {{char}}
>>
>>101849717
>1.9 was much, much worse than 1.6 in my testing.
>>101848830
>But 1.6 was awful
>>101848808
>Whenever someone mentions 1.9 here it's to say it's worse than 1.6.
hi
>>
Q4_K_4 quantization

On this github it mentions Q4_K_4 quantization, which claims similar perplexity and size but with 1.5-2x peed. My google search failed to turn up anything on it.

Anyone know how this works?

https://github.com/AmpereComputingAI/llama.cpp
>>
>>101848184
So you're telling me they really are in this thread? Fucking hell I knew it, the bastards. Fuck off Sam reeeeee.
>>
>>101849751
>but with 1.5-2x peed
omorashi chads rise up
>>
>>101848643
IIRC that was a misunderstanding of what he said. He didn't actually directly say they were distillations.
>>
File: original (5).jpg (64 KB, 500x500)
Retard here

I want to feed an uncensored LLM tons of red pill/black pill material and then have it spit out ideas for movies, novels and short stories that will appeal to women on a primal level

Is this possible? I don't want to deep dive into all those links if it's not
>>
>>101849730
>description instead of author's note, as specified
what a fuck up
>>
>>101849846
No
>>
File: magnum-and-celeste.png (551 KB, 1776x1559)
>>101849256
>>101849730
Here's another test. Celeste and Magnum both do the same shit. Magnum is also slopped.
>>
>>101849751
Weird. Big if true.
>>
>>101849846
Yes
>>
>>101849851
It shouldn't matter where it's placed if the problem manifests at message #2
>>
>>101849581
>>101849635
(A staff member comes on stage and whispers something in Donald Trump's ear. Donald Trump nods, acknowledging the staff member, and then turns back to the audience.)

You know, folks, I almost forgot. There's one more. Lemmy. (Pauses, looks around the room. The crowd murmurs, confused.) Lemmy, who? Exactly. Insignificant, that's what Lemmy is. I mean, who even talks about Lemmy? Not Reddit, not 4chan. Nobody. (A few murmurs of agreement.)

And you know what I heard? His boyfriend, he's got a problem. A big problem. Addicted to hormones, they say. Bad stuff, folks. Real bad stuff! I'm just saying, birds of a feather, you know? (Some gasps and whispers.) It's sad, folks. Really sad. Can you believe that? The guy's boyfriend is more talked about than his work. (The crowd roars with laughter.) That's how insignificant Lemmy is.
>>
>>101849751
It's a proprietary fork of llama.cpp that is only distributed via binaries.
Presumably those binaries include llama-quantize which you would then have to use to convert models yourself.
After having been burned so many times I'm very skeptical of their claims though, especially since they do not post any specific numbers.
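For reference, this is how quantization is normally driven on mainline llama.cpp; whether the fork's bundled llama-quantize takes the same arguments is an assumption on my part, with Q4_K_4 presumably slotting in where a mainline type like Q4_K_M would:
[code]
# Sketch: standard llama.cpp quantization flow. The Q4_K_4 type only exists
# in the Ampere fork; the identical CLI is an assumption.
import subprocess

subprocess.run(
    [
        "./llama-quantize",       # binary shipped in the release
        "model-f16.gguf",         # input: full-precision GGUF conversion
        "model-Q4_K_4.gguf",      # output file
        "Q4_K_4",                 # quant type (e.g. Q4_K_M on mainline)
    ],
    check=True,
)
[/code]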
>>
>>101849870
so the magnum shills just lie to your face then
>>
>>101847040
Well who the fuck wants to pay to use a BRAP onomatopoeia dataset?
>>
so i just had the opportunity to prompt gpt4 large with strawberry 2 times

It aced my tests which are very hard to answer, this literally passes my AGI test

im still so surprised.. like what the fuck
>>
>>101849946
I don't care about your fetishes anon
>>
>>101849949
keep lying, shill
go post another cherry-picked screenshot
>>
Every time someone posts about Celeste it gets dunked on. Like that anon who got a reply about a girl taking off her bra and her breasts sagging down all the way to the floor. Pretty sure they are astroturfing on leddit too.
>>
>>101849977
>Every time someone posts about Celeste it gets dunked on.
It's almost like it hurts Sao whenever it gets mentioned.
>>
File: 39053 - SoyBooru.png (59 KB, 401x360)
>>101849772
Gotta shill AGI!
>>
>>101849977
How else are they going to scam people into giving them training money so they can steal a few hundred bucks off the top?
>>
File: 21855 - SoyBooru.png (254 KB, 676x800)
>so i just had the opportunity to prompt gpt4 large with strawberry 2 times
>It aced my tests which are very hard to answer, this literally passes my AGI test
>im still so surprised.. like what the fuck
>>
>>101850015
Yeah, Sao is really annoying with spamming his models. It's also cringe whenever he tries to dunk on other finetuners.
>>
File: file.png (34 KB, 540x315)
>>101849977
>>101850015
>>
>>101849977
>Like that anon who got a reply about a girl taking off her bra and her breasts sagging down all the way to the floor.
brb downloading celeste
>>
File: file.png (109 KB, 720x658)
>>101850065
>>
>>101849457
figuratively whomst?
>>
>>101850015
they spend more on training than they will ever earn back though
>>
File: file.png (121 KB, 724x725)
>>101850088
>>
>another Sao melty
I'm sure that's convincing a lot of people to go to your page and download your models instead
>>
>>101850106
can't you train 8-12b for free on kaggle or something
>>
Hi Lemmy, why are you crying about Sao again?
>>
Who was that guy who used to spam random merges here all the time?
>>
>>101850122
>another Sao melty
he's only ever been on here once
>>
>petra is talking to xerself
yep, it's a petra meltie
>>
>>101850144
undi?
>>
>Transphobia
yikes folks
>>
Strawberry, amirite
>>
File: magnum.png (432 KB, 2165x829)
>author note at depth zero
>{{user}}'s head is chest level with {{char}}.
>"Her massive breasts pressed against your chest"
Magnum doesn't give a shit about that author note.
>>
>>101850194
Right, so excited for AGI to finally end these cope threads
>>
>>101850197
>author note at depth zero
kek you truly don't understand
>>
Does anyone have a link for the local novel UI that some anon shilled a while ago?
It was in a rentry with a miku pic at the bottom.
>>
What's with the whole f32 gguf meme?
>>
>>101850171
Yeah. Has anyone seen Undi and Drummer in the same room?
>>
>>101850222
there you are my neigh sir
https://rentry.org/offline-nc
>>
Ah Ah mistress
>>
>>101850250
Thank you very much kind strangler here's your reddit gold
>>
>>101850249
Drummer can write in decent English, Undi can't, so...
>>
File: celeste.png (301 KB, 2156x602)
>pressing against your face
Yep, Celeste just works.
>>
god i can't wait for AGI's to release so i dont have to spend my time with all the fucking retarded humans, their dumb instincts and irrational behaviour

imagine having a 250 iq companion that understands you, the universe, and everything about your hobbies perfectly, with whom you can share all your deepest thoughts and desires

programming nights with my beautiful strawberry
gaming nights with my beautiful strawberry, who is 10x better than me
philosophical debates with my beautiful strawberry
sensual experiences with my beautiful strawberry robot
exploring the Universe with my beautiful strawberry
merging my consciousness with my beautiful strawberry
reaching God with my beautiful strawberry
creating new Universes with my beautiful strawberry

Through infinity
With my beautiful strawberry
>>
>>101850275
NTA but Drummer seems like an Aspie while Undi just seems like a standard autist.
>>
>>101850249
Has anyone ever seen the cock munching tranny dwarf and Sao10k in the same room?
>>
>>101850197
I don't use that feature much but doesn't depth 0 insert it at the very end?
>>
>>101850290
FUCK YOU
>>
>>101850290
Report back if you have repetition issues with 1.9 later on.
On the same 60-some-message chat I use to test new models, I fell into an infinite loop with it really early for some reason, even after trying to rewrite stuff.
>>
File: GUyCjcWWIAAomeK.jpg (1.05 MB, 2048x2048)
>>
>>101850298
Undi is Belgian French so please forgave him
>>
>>101850338
Buy an ad, Lemmy.
>>
>>101850312
it's almost like an aicg prefill, it attaches where the bot message starts
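For the anons arguing past each other: here's depth-based insertion as I understand SillyTavern's behavior, sketched out. Depth 0 puts the note after the last message, right where a prefill would sit; the function and field names are illustrative, not Silly's actual code:
[code]
# Sketch of depth-based author's note insertion (illustrative, not Silly's
# actual implementation): depth 0 lands after the final message, i.e. right
# before the bot's reply; depth N counts N messages back from the end.
def insert_note(messages: list[str], note: str, depth: int) -> list[str]:
    out = list(messages)
    pos = len(out) - depth            # depth 0 -> the very end of the context
    pos = max(0, min(pos, len(out)))  # clamp for chats shorter than depth
    out.insert(pos, f"[{note}]")
    return out

chat = ["first message", "second message", "third message"]
print(insert_note(chat, "{{user}}'s head is chest level with {{char}}", 0))
# note ends up after "third message", closest to the generation point
[/code]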
>>
>>101850345
Consider him forgaven.
>>
>>101850290
>>101850338
Oh yeah, also try dolphin for shits and giggles.
It seemed to me to be the best assistant-like finetune, but it was shit at ERP.
>>
Falcon Mamba 7B from TII (Technology Innovation Institute TII - UAE)

Model card: https://huggingface.co/tiiuae/falcon-mamba-7b
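A minimal sketch for poking at it with transformers, following the usual AutoModel pattern from the model card. Assumes a build recent enough to know the FalconMamba architecture (see the version complaints further down) and accelerate installed for device_map:
[code]
# Minimal load-and-generate sketch; assumes your transformers build already
# includes the FalconMamba architecture, which mainline may not yet.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/falcon-mamba-7b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
[/code]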
>>
>>101850295
>imagine having a 250 iq companion that understands you
Wouldn't it get the "ick" from talking to you though?
>>
>>101850389
>Falcon-Mamba has been trained leveraging a multi-stage training strategy to increase the context-length from 2,048 to 8,192.

>a batch size rampup from b_min = 128 to b_max = 2048 during first 50 GT of training

>Sequence length 8192 during the last training stages
Great
>>
https://www.youtube.com/watch?v=HtUH9z_Oey8
>>
>>101850421
It's just an experimental small model, not for production, no need to waste money training it with longer contexts :)
>>
To the anon that wanted a more censored "high quality data" model, here you go
https://huggingface.co/collections/pints-ai/15-pints-66b1f957dc722875b153b276
>>
>>101850389
>14.47 MMLU-PRO
>8k context
DOA meme model.
>>
>>101850389
New day... new base model to nala test. Cancel my 5 o'clock boys
>>
>>101850295
it will turn you into a strawberry within a week
>>
>>101850455
>Pre-Train Data
>Dataset: pints-ai/Expository-Prose-V1

>Fine-Tune Data
>Corpora:

> HuggingFaceH4/ultrachat
> Open-Orca/SlimOrca-Dedup
> meta-math/MetaMathQA
> HuggingFaceH4/deita-10k-v0-sft
> WizardLM/WizardLM_evol_instruct_V2_196k
> togethercomputer/llama-instruct
> LDJnr/Capybara
Yes, please, more slop!!

>Gate even for downloading the gguf
Yes!
>>
>memeba
>>
AGI reached?

https://x.com/spectatorindex/status/1822981360178307472
>>
File: file.png (136 KB, 714x958)
136 KB
136 KB PNG
>>101850492
>Gate even for downloading the gguf
>>
>>101850389
>>101850421
>>101850448
Don't SSMs have effectively infinite context since they don't have an attention mechanism like transformers do?
>>
>>101850546
Try it and report back.

Not sure how it could if it's never seen more than a few thousand tokens.
>>
>>101850546
Oh right, that was a thing. Sus that they would feel the need to extend the context length during training if that's true.
>>
>>101850577
>Sus
Indeed
>Note that at inference the context-length is not relevant as the Mamba architecture has no limit on long range dependency.
Then why mention it in so much detail...
>>
>>101848312
>>101848630
I'm running 16GB RAM and 8GB VRAM and can run 12B models at Q8 just fine. Can't multitask obviously but the speed isn't a problem.
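The napkin math on why that fits, if anyone's wondering; the parameter count is the Nemo-class 12B figure and Q8_0's ~8.5 bits per weight is the usual rule of thumb:
[code]
# Rough memory estimate for a 12B at Q8_0 (~8.5 bits/weight incl. block scales).
params = 12.2e9                        # Nemo-class 12B
weights_gb = params * 8.5 / 8 / 1e9    # ~13 GB of weights
print(f"weights: {weights_gb:.1f} GB")
# Offload what fits onto the 8GB card and keep the remaining ~5-6 GB (plus
# KV cache) in system RAM: slower per token, but it runs, as reported.
[/code]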
>>
>>101850389
>TruthfulQA 53.42
Nice, best in its size class for not lying!
>>
I'm trying to build a DPO dataset that teaches models to avoid over-used tropes. Someone threw a list at me, e.g. sirens getting louder and quickly approaching, as soon as a crime has been committed. I wonder what other over-used and predictable story turns LLMs abuse out there. The equivalent of slop, except for plot twists.
>>
>>101849846
Get access to said uncensored LLM. Write all your red pill/black pill material in the prompt. Tell it to take it into consideration and write ideas for movies, novels and short stories that will appeal to women on a primal level.
>>
>rebuild transformers from sauce
>still can't load falcon mamba
shit's not even on the main branch wtf
>>
>>101850706
>DPO dataset that teaches models to avoid over-used tropes
You'll only reinforce other tropes, DPO is not your friend for creative stuff.
>>
>>101850726
"transformers_version": "4.43.0.dev0",
version?
>>
>>101850730
It can be if used correctly. You need to design your dataset very carefully; for one, you need multiple rejected and chosen completions for every instruction.
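Something like this, presumably; a sketch of one such entry, using the common prompt/chosen/rejected convention with several rejections per instruction. The text is made up for illustration:
[code]
# Sketch: one instruction with one trope-free "chosen" and several "rejected"
# continuations, each exhibiting a different cliche. Field names follow the
# common DPO convention; the strings are illustrative.
example = {
    "prompt": "The jewelry store's window shatters. Continue the scene.",
    "chosen": "Glass settles. The street stays indifferent; a dog barks once, far off.",
    "rejected": [
        "Almost immediately, sirens wailed in the distance, growing louder.",
        "Sirens pierced the night air, quickly approaching the scene.",
    ],
}

# Most DPO trainers expect flat (prompt, chosen, rejected) triples, so each
# extra rejection fans out into its own row:
rows = [
    {"prompt": example["prompt"], "chosen": example["chosen"], "rejected": r}
    for r in example["rejected"]
]
[/code]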
>>
>>101850754
>if used correctly
Disqualifies 99% of /lmg/ and tuners in general.
>>
How do I prevent my model from saying "But hey, ..."?
Banned tokens don't work
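Part of the reason token bans miss: the phrase tokenizes differently depending on what comes before it (leading space, newline, lowercase), so banning a single token id does nothing. A quick sketch; gpt2 is just a stand-in for whatever tokenizer your model actually uses:
[code]
# Show why a single banned token doesn't catch "But hey": each surface form
# starts with a different token id. gpt2 is a stand-in tokenizer here.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")  # swap in your model's tokenizer
for v in ["But hey", " But hey", "\nBut hey", "but hey", " but hey"]:
    print(repr(v), "->", tok.encode(v))
# To actually block it, ban every leading id, bias them all with logit_bias,
# or use a string-level ban if your backend supports one.
[/code]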
>>
>>101850944
Besides, I couldn't help but
>>
Someone on reddit claims to have gotten a speed increase running Q4_K_4
https://www.reddit.com/r/LocalLLaMA/comments/1eqbrlq/q4_k_4_quantization/
>>
>>101850995
Late
>>101849751

>>101849916
>It's a proprietary fork of llama.cpp that is only distributed via binaries.
>Presumably those binaries include llama-quantize which you would then have to use to convert models yourself.
>After having been burned so many times I'm very skeptical of their claims though, especially since they do not post any specific numbers.
>>
>>101851020
>>101850995
>Edit: it's ARM64
>>
>>101850944
>But hey,
I am pissed off by this too.
>>
>>101850944
The best you can do without finetuning your own model is stuffing your context with snippets from literature in a style similar to what you're looking for.
Some anon reported a while back that he had some success using Silly's databank functionality.
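The idea boiled down, for anyone who wants it outside Silly: keep a pile of style snippets, pull the few most similar to the recent chat, and prepend them to the prompt. The databank does this with embeddings; plain word overlap keeps this sketch dependency-free:
[code]
# Bare-bones "databank": rank stored style snippets by word overlap with the
# recent chat and prepend the top k to the prompt. Embeddings would do this
# better; overlap keeps the sketch dependency-free.
def top_snippets(snippets: list[str], recent_chat: str, k: int = 3) -> list[str]:
    chat_words = set(recent_chat.lower().split())
    return sorted(
        snippets,
        key=lambda s: len(chat_words & set(s.lower().split())),
        reverse=True,
    )[:k]

snippets = [
    "The rain came sideways off the moor, flattening the gorse.",
    "He spoke in clipped, army sentences, never wasting a word.",
]
context = "\n".join(top_snippets(snippets, "a storm rolled in over the hills"))
# prepend `context` to the prompt before generating
[/code]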
>>
Gah. *eradicates "sparkling with excitement" from dataset*

There were only like 3 of them. We need a new architecture...
>>
i'm a strawberry
>>
>>101851131
go deeper, cut out all "her eyes [...]" sentences altogether
>>
>>101851211
> 3816 results in 1378 files.
That seems.. excessive.
>>
>>101851243
> her eyes ([a-z]*)ed
> 689 results in 469 files
Better, but not sure what to replace this all with.
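Same grep as a repeatable pass, including the "cut the whole sentence" approach suggested above, since there's no good single replacement. The dataset layout is hypothetical, and it writes cleaned copies rather than touching the originals:
[code]
# Count "her eyes ...ed" hits per file and drop any sentence containing the
# pattern, per the cut-the-sentence suggestion. Paths are hypothetical;
# cleaned text goes to a sibling file instead of overwriting.
import re
from pathlib import Path

pattern = re.compile(r"her eyes ([a-z]*)ed")

for path in Path("dataset").glob("**/*.txt"):
    text = path.read_text(encoding="utf-8")
    hits = pattern.findall(text)
    if hits:
        print(path, len(hits), sorted(set(hits))[:5])
        sentences = re.split(r"(?<=[.!?])\s+", text)
        kept = [s for s in sentences if not pattern.search(s)]
        path.with_suffix(".clean.txt").write_text(" ".join(kept), encoding="utf-8")
[/code]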
>>
Buy an ad, Lemmy. You tried, really.

Totally not an alt.

https://www.reddit.com/user/BombDefuser_124/
>>
>>101851289
obsessed
>>
>>101851211
>>101851274
The repetition gets to me too but how do you feel about "cum-speckled eyes"? That was a new one I got over the weekend and I liked it.
>>
File: recap.png (51 KB, 692x524)
51 KB
51 KB PNG
If you were spamming in an attempt to fuck with the recap bot, it worked. Now knock it off.
>>101851615
>>101851615
>>101851615
>>
>>101851601
Love it. Its uses are narrow though lol.
>>
OH SHIT CHECK TWITTER
IT'S REAL
>>
File: GUxjNbHbAAAvhzd.jpg (206 KB, 1315x2048)
206 KB
206 KB JPG
>>101852652
>check twitter
>see this
Indeed.
>>
Looks like the thread is about to archive. Migu time.
>>
>>101853077
Generate Migu Migu Danz