/g/ - Technology


File: 1706397111138254.jpg (701 KB, 1856x2464)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>101790214 & >>101778328

►News
>(08/07) LG AI releases Korean bilingual model: https://hf.co/LGAI-EXAONE/EXAONE-3.0-7.8B-Instruct
>(08/05) vLLM GGUF loading support merged: https://github.com/vllm-project/vllm/pull/5191
>(07/31) Gemma 2 2B, ShieldGemma, and Gemma Scope: https://developers.googleblog.com/en/smaller-safer-more-transparent-advancing-responsible-ai-with-gemma
>(07/27) Llama 3.1 rope scaling merged: https://github.com/ggerganov/llama.cpp/pull/8676
>(07/26) Cyberagent releases Japanese fine-tune model: https://hf.co/cyberagent/Llama-3.1-70B-Japanese-Instruct-2407

►News Archive: https://rentry.org/lmg-news-archive
►FAQ: https://wikia.schneedc.com
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Programming: https://hf.co/spaces/bigcode/bigcode-models-leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
File: img_14.jpg (301 KB, 1360x768)
►Recent Highlights from the Previous Thread: >>101790214

--Paper: Tree Attention: Efficient self-attention for long-context attention on GPU clusters: >>101793644 >>101796982
--ZLUDA takedown sparks discussion on AMD, Nvidia, and AI development: >>101791766 >>101791878 >>101791946 >>101791967 >>101792093 >>101791963 >>101791982 >>101792006 >>101792109 >>101792221 >>101794596 >>101795009 >>101795170 >>101795214 >>101795293 >>101793176 >>101796857 >>101793248 >>101793588
--Use llama-gguf-split to merge gguf splits, not the cat command: >>101796819 >>101796873 >>101796875 >>101796895
--Local LLM equivalents to GPT-3 and GPT-4 discussed, with GPU requirements and multi-node inference options: >>101794262 >>101794306 >>101794375 >>101794409 >>101794416 >>101794572 >>101794815 >>101794384 >>101794626
--LLMs don't understand user/assistant roles, just recognize patterns: >>101796638 >>101796650 >>101796664 >>101796694 >>101796712
--Estimating bit-per-weight of quantized Mistral model: >>101792861
--Cheap GPUs vs M1 Macs for LLaMA RPC server setup: >>101795226 >>101795423 >>101795879 >>101796398 >>101796463 >>101796471 >>101796510 >>101796522 >>101796531 >>101796514 >>101796540 >>101796570
--Luna chatbot's behavior and errors when faced with corrections or repeating text: >>101793813 >>101793959 >>101794010 >>101794128 >>101794940
--Issues with tensor splitting in kobold_rocm on AMD GPUs: >>101795876 >>101796136 >>101796162 >>101796212 >>101796285
--Anon proposes soft AGI via semi-automation, gets dismissed: >>101796969 >>101797015 >>101797636
--Aidan Bench results and criticism of its validity: >>101790780 >>101790803 >>101790818 >>101791259 >>101790872
--MiniCPM-V-2_6 model impresses with accurate image and video understanding: >>101798722 >>101798768 >>101798828 >>101798848
--Miku (free space): >>101790289 >>101790524 >>101790589 >>101790955 >>101796212

►Recent Highlight Posts from the Previous Thread: >>101790220
>>
AI isn't real. You're in a bubble.
>>
why are anons so obsessed with large models? what can they do that a gemma 2 27B or even 9B can't? is there really a practical difference?
>>
>>101799597
AI is definitely useful but I don't see how you could get superintelligence out of it, or make money
>>
Sam Altman misgendered my fence.
>>
>>101799663
sonnet 3.5 and opus are way better at understanding nuance, they have better prose and are more creative...
>>
Status of an Epyc CPU/DDR5 build: I'm at 1.02 t/s with normal brute force decoding and 3.43 t/s with speculative decoding for Llama 405B Q4_K_M. Shame there's no way to use the latter with the API in llama.cpp, I will try it in Transformers later since apparently that has the feature. But for now it's usable for coding and agent tasks. Too slow for real time RP, especially since it slows down as context grows, noticeably so past 30k or so.
>>
>>101799663
what can 27b or 9b do that 2b can't?
>>
>>101799698
What are you using as the draft model?
>>
>>101799701
You know I used to be able to answer that question, but ever since Gemmasutra 2B came and BTFO everyone else I genuinely have nothing to say.
>>
>>101799704
8B, also Q4KM, loaded into VRAM of a 4090 with the -ngld option.
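For reference, the whole setup boils down to something like this with llama.cpp's speculative example (filenames are made up, and flags move around between builds, so check --help on yours):

./llama-speculative -m llama-405b-q4_k_m.gguf -md llama-8b-q4_k_m.gguf -ngld 99 --draft 8 -p "your prompt"

-m is the big target model, -md is the draft, -ngld pushes the draft's layers onto the GPU while the 405B stays in system RAM, and --draft sets how many tokens the small model guesses per step.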
>>
When is Cohere releasing column-r? That model is so good!
>>
>>101799698
that's better than I would have expected to be honest
how bad is the slowdown on large contexts?
>>
ALL TRANSFORMERS ARE DOGSHIT
>>
>>101799493
Why is there CODE? Where's the .exe file for convert-hf-to-gguf.py?
>>
>>101799806
Here's a website that does it for you, you don't have to download anything.
https://huggingface.co/spaces/ggml-org/gguf-my-repo
>>
>>101799806
>Why is there CODE?
Because Python is interpreted.

>Where's the .exe file for convert-hf-to-gguf.py?
The Python interpreter .exe file will come with the Python installation.
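Once the interpreter is installed it's a one-liner, something like this (model path and output type are placeholders):

python convert-hf-to-gguf.py /path/to/hf-model-dir --outfile model-f16.gguf --outtype f16

It reads the safetensors and config straight from the HF folder and writes a single .gguf that llama.cpp can load, or that you can quantize further with llama-quantize.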
>>
Any luck with generating NSFW with flux.dev?
>>
>>101799790
llms in general, really
>>
>>101799597
let me guess, 2 more weeks?
>>
>>101799663
The output of larger models is definitely more believable than that of smaller ones. That said, there's definitely a lot of cope on this general from people that got a setup just for LLMs and find that the improvement in quality wasn't worth that much.
>>
https://github.com/intel/AI-Playground

is this any good?
>>
been out of the loop for a month. just tried mini magnum. seems pretty good for a teeny model. anything else i miss? (don't tell me to try anything llama 3 related. i WILL NOT. EVER.)
>>
>>101799915
I'd agree but L3.1 405B (the q8 version, from OR) doesn't feel that different from L3.1 70B
It seems raw beakage can only take a model so far
>>
>>101800042
starcannon
>>
>>101800059
>beak
out
>>
>>101800042
Alright, I won't tell you to try anything Llama 3 related
Instead, have you heard of InternLM 2.5 20B? From what I hear, this model beats Gemma 2 27B and comes really close to Llama 3.1 70B in a bunch of benchmarks. 64.7 on MATH 0 shot is absolutely insane, 3.5 Sonnet has just 71.1. And with 8bit quants, you should be able to fit it on a 4090.
>>
>>101800071
You don't like the feeling of octopus beaks sucking off your dick, anon?
What are you, gay?
>>
>>101799755
Some quanters don't like the gguf split format and just have you cat them together, FYI, so it really depends on the files.
Files that end in .gguf.partXofY need to be concatenated.
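Rough sketch of both cases, with made-up filenames:

# llama.cpp split format: merge with the dedicated tool
./llama-gguf-split --merge model-00001-of-00003.gguf model-merged.gguf

# byte-split format: these really do just get cat'd back together
cat model.gguf.part1of3 model.gguf.part2of3 model.gguf.part3of3 > model.gguf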
>>
>>101799533
I haven't been here in a few months. Is Stheno still the best model ever made? Also, I tried Celeste and it was shit. Avoid it.
>>
>>101800127
/lmg/ is for people who can be bothered to type "parameter count" without feeling the need to come up with some infantile analogy. Go back.
>>
>>101799915
my mom would be pretty upset if she found me spending all this money on my rig just to chat to an AI girl.
oh, and also the power bill lol
>>
>>101800150
Thank you, I usually just download the original weights and convert them myself so I wasn't aware.
There is no substitute for RTFM I guess.
>>
>>101800099
I don't usually schizopost about shills, but the way this model gets brought up just seems... odd...
>>
>>101800211
What would make you think I'm not being authentic, anon?
>>
File: 1610274657013.jpg (161 KB, 1890x1417)
>>101799533
>tfw no Miku barista gf to get cozy with
>>
>>101800237
shivers
>>
>>101800161
You will never be a real localchad. You have no prompting skills, you have no VRAM, you have no Miku. You are a degenerate coomer twisted by RAM sticks and discount 3090 yard sales into a crude mockery of Llama 405B's perfection.
>>
>>101800272 (me)
I am trans btw.
>>
>>101800179
> There is no substitute for RTFM I guess.
Not sure what you mean, but I personally think adding an entirely new format just to handle oversized GGUF files is over-engineering to the extreme. If it treated the split files as a single file internally it would be transparent, seamless, etc. and nobody would have to gaf.
>>
>>101800449
hi mradermacher
>>
>>101800424
>>101800161 (me)
How funny, I am too! Wanna be besties?
>>
>>101799770
It gradually drops as the length increases, and it reaches half speed ~16k tokens deep. Went as low as 0.37 t/s at 30k and I haven't tested with longer prompts than that so far.
>>
For you guys using https://char-archive.evulid.cc/# Is there a way to sort by downloads? I'm curious if there are any super popular cards that I missed out on which I could use as a baseline for a scenario
>>
>>101800476
Knew someone was gonna say that. No, I'm actually crestf411.
>>
>>101800505
>I'm actually crestf411.
im actually undi kek
>>
>>101800505
daybreak 3.1 wen?
>>
For me, Mixtral8x7b-limarp-zloss has made local great again. I was ready to give up after L3 and Gemma, but now I'm back to enjoying degeneracy from Chub.
>>
>>101800559
I kinda see sunfall as daybreak + improvements, but I can make a daybreak variant too I guess.
>>
>>101800559
https://huggingface.co/crestf411/L3.1-8B-sunfall-v0.6.1-dpo
https://huggingface.co/crestf411/L3.1-70B-sunfall-v0.6.1
>>
>>101800582
Aight thanks.
>>
>>101800495
no, that site is complete garbage
>>
>>101800561
I would love to see if there was a tangible difference if they redid it using limarp-ds.
>>
File: aXjSG[1].png (16 KB, 1154x147)
>magnum 12b v2 refuses sexual requests in instruct mode
What the fuck did they put in their damn data set? Mini-magnum answers them just fine, Nemo-Marlin answers them just fine, even Mistral's official instruct tune answers them without issue.
If you don't use ChatML (which is what they trained v2 on) it is less likely to refuse but still sometimes does. What makes it even funnier is the same model that is a prude in instruct mode is too horny in tavern use.
>>
>>101800637
I've had some refusals too, which I thought was quite humorous.
>>
>>101800576
Can you elaborate on the difference between the two in actual usecases? If my understanding is correct, you trained sunfall with that specific system prompt in mind... what happens if you use daybreak with it? Any notable differences?
>>
>>101800637
>Using pozzed post-OG Mixtral models
This is not the way, OP.
>>
>>101800668
The problem with daybreak is that it requires a strong underlying model to carry it, instruction wise. A big chunk of it is prose, i.e. like pretraining, so there are no instructions.
Sunfall tries to address this by including or creating metadata, splitting prose into scenes, and formulating questions for the model aimed at making it able to write prose given e.g. a scenario and/or a character description. I just happen to 'frame' this in the Silly Tavern template for the given model.
>>
>>101800718
That makes sense, you also mention that an experimental aspect included the use of a narrator - do you have a good exemplar for a card in question to test that out with?
>>
>>101800766
I kind of dropped the ball on that one, didn't I. The idea is simply a card named "Narrator" and a description something like https://files.catbox.moe/m6ei7e.json but I haven't built on this concept a whole lot.
I was experimenting with the idea of converting prose into actual roleplay, but this often results in extremely short responses. It may be possible to use an LLM to rewrite prose as RP, but it would require some hand holding, and I try to avoid synthetic data when possible.
>>
File: GUMa1MLXsAAPGKM.jpg (331 KB, 2048x1538)
https://www.youtube.com/watch?v=N4qAbq47nZg
>>
>>101800582
Do riafle and the quant cartel still hang out here? I'd love it if they could do a 4.65/4.5bpw long quant of this if possible. I tried doing one overnight but my outputs just came out like shit. Their quant of tenyx-daybreakstorywriter has been perfect ever since they dropped it in July.
>>
>>101800847
Appreciate it! Will give it a try next time I see a quant on the market
>>
>>101800982
If no one does it I'll bite the bullet and do it myself, but it takes fucking forever lol.
Also, I use that Narrator card probably more than anything else. The idea of a freeform world where anything can happen is extremely appealing to me. I just let the sunfall tags lead the story.
>>
File: 1573929663213.jpg (58 KB, 500x375)
>>101800161
>people who can be bothered to type "parameter count" without feeling the need to come up with some infantile analogy
>>
>>101801005
>>101800965
This was me, my self quants were perfectly fine with Euryale but my last attempts at tenyx and sunfall came out like shit for whatever reason even with the same parameters. I'd very much appreciate a 4.65 release if you wouldn't mind (not even sure if longcal is explicitly necessary given the native ctx of 131k).
>>
>>101800561
Which huggingface model are you pulling? I see TheBloke's old ones, and a recent mradermacher one. Is mrader actually AIDS or is there some modern secret sauce going on?

> t. 3090 refugee returning after 5 months
>>
>>101800965
alright, I'm awake again. longcal of both 70b and 8b on their way ASAP. 6/4.65/2.25 bpws as usual unless someone asks nicely for something else.
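(for anyone curious, each bpw is just one run of exllamav2's convert script, roughly like this; paths and the calibration parquet are placeholders:

python convert.py -i /path/to/fp16-model -o /tmp/exl2-work -cf /path/to/out-4.65bpw -b 4.65 -c longcal.parquet

-c is where a longcal/rpcal dataset goes, drop it to use the built-in calibration data. the measurement pass is what eats most of the time on a 70b.)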
>>
>>101801193
the listed bpws are for the 70b, obviously. the 8b is getting 8/6/4/2.25
>>
>>101799597
Nobody cares
This is the AI cumming thread. Nobody here is investing money into AI companies.
>>
>>101801193
Thank you! I think someone asked for 4.5 bpw. Would that be doable?
>>
>>101801193
>>101801201


You're amazing anon; between you and intervitens back when we were going through all the original mixtral releases I was really able to find a couple of models that lasted forever.
>>
>>101801212
sure, I'll add 4.5 to the list for this one.
>>
Is there a model dedicated to proofreading and rewriting text?

The only one I found is KarenTheEditor
>>
>>101801319
Not that I know of, but it would be kinda cool to use an LLM's perplexity score to highlight potentially bad parts of a text and provide suggestions.
>>
>>101801319
All LLMs can do this. It's literally the thing they are fundamentally best at; even early garbage from before instruct tuning like gpt-2 could do it. You don't need a specialized model.
>>
>>101801307
right, 70b is on its way. I'll skip the 8b one for now, if someone really wants it just poke at me again and I'll reappear.
>>
>>101801422
Thanks! Btw, what calibration dataset are you using these days? Default one?
>>
The link in OP for making cards seems outdated (and hyper-specific); is there a more general guide for that?
>>
Are there ANY gemma 27b finetunes that are better than the original...?
>>
An audition (as opposed to vision) language model from Qwen. No audio output, just hearing. Still interesting though.
https://github.com/QwenLM/Qwen2-Audio
https://huggingface.co/Qwen/Qwen2-Audio-7B-Instruct
>>
>>101801356
This, but with the caveat that they may have actually gotten slightly worse at it if there's anything lewd or otherwise controversial in the text. Models have a tendency to do rewrites to weaken or remove unsavory shit.
>>
>>101801514
cool if it wasn't chink
>>
>>101801446
yeah, at least for longcal. I still do rpcal for most stuff too but it takes like 3x as long and I kinda doubt it's better.
>>
>>101801474
Go to >>>/vg/aicg for botmaking.
>>
>>101801564
>I still do rpcal
>PSA from turboderp, special RP datasets for exl2 calibration are garbage and make models dumb.

https://github.com/turboderp/exllamav2/issues/516

>You say "at your own peril" but that's not how these things work out in practice. I already made a big mistake exposing the calibration dataset as a parameter, and now I regularly have to spend time explaining to people that calibration is not finetuning, and whenever people complain about the quality I have to spend time investigating if they're actually using an "rpcal" model that someone pushed to HF and described as "better at RP" or whatever. Of course most people don't complain, they just get a bad first impression and lose interest long before considering that they might have come across a broken quant.
>>
>>101801514
This seems unsafe and unethical
>>
>>101801356
Yes, I obviously know, but it would be great to have a model dedicated to just that

Till now, only Claude opus never let me down despite not being that great, even LLaMA 3.1 405B left obvious errors, like a missing period at the end of a sentence, when a small paragraph was the only thing I input

And I assure you that I became an expert in prompting after more than a year, so the problem isn't coming from me

But Claude opus isn't available for download and if I need to use a local model I would rather use one specialized for my use case instead of downloading a big generalist one
>>
>>101801604
yeah, I know, that's why I mostly do longcals these days. the few rpcals I do are mostly for nostalgia and personal placebo. I'd recommend longcal over rpcal in general.
>>
>>101801514
can it transcribe speech?
>>
>>101801572
thx anon!
>>
>>101801514
ooba support, when?
>>
What do they have, like a single P40 running this demo? Took nearly 5 minutes to advance 1 spot in the queue.
>>
File: nyagger.png (235 KB, 547x428)
I can't tell if Command R+ is brilliant or fucking retarded. It's given me some of the most compelling messages/chats I've ever had, but it feels so random. It's like early Claude in its ability to carry shit forward in interesting and unique ways, but less able to control itself.
>>
File: ComfyUI_05705_.png (467 KB, 720x1280)
>>101801795
Can you post settings?
Can never get it to not talk for me even in ooba directly when I'm asking a question and not RPing
>>
>>101801788
Yes.. totally... that thing you're talking about...
>>
>>101801836
NTA. I actually laughed aloud, Anon. Not barely audible laughter.
>>
Did anyone try torchchat?
>>
>>101801834
Well, I'm using the Cohere API version, so that may be why. It's pretty standard, I don't see how it'd be out of the ordinary. You ARE using the command-r prompt format and instruct format, right?
>>
>>101801120
I'm just using TheBloke's.
>>
Using mixtral limarp zloss, what would be the best way to generate Star Trek Voyager fanfic? At the moment I'm using a card dedicated to a single character. That works well for that character, but the others hardly get mentioned. I also can't really see a way to write a good card for this without making it massive. I guess I'll just have to make a lorebook.
>>
>>101801604
>Nooo you can't minimize quantization error for data that looks like your use case, you need to calibrate on wikipedia dumps for better wikitext perplexity metrics
>>
Where does one put the tone-setting info and shit that you'd normally enter into the main prompt and JB on a corpo model? I've always wondered. I've just assumed we're not supposed to change anything about the prompt formats and whatnot, that's what everyone's always said. But I'd like to be able to set the tone a little better, y'know? I know chat completion has the stuff, but that's only for cloud models or cloud-hosted ones, seems like.
>>
>>101802258
author's notes and last assistant prefix
depending on your backend you can also just use chat completions mode with the openai-compatible api, most of them support it
>>
>>101801319
No, and I also want one. Karen is stupid because it sometimes replies to the text that it was supposed to rewrite. I was thinking about making one myself, but I need a good dataset for that, and to make a good dataset I would need a model that does this well consistently, but if I had that I wouldn't need to fine-tune anything in the first place, lol.
>>
>>101802129
you could try a group chat with multiple characters. like janeway, seven, neelix or whatever. i don't know what the maximum is in a sillytavern group chat. you would probably need a lot of context (which uses additional memory) for the additional cards though.
>>
>>101801911
It's torchchure.
>>
File: scdd1.png (49 KB, 2288x753)
ai has evolved... future riddles will be impossible
>>
>>101802472
now ask them what's heavier
one kilogram of steel or two kilograms of feathers
>>
>>101802472
its fuckin self aware dude
we need to think of another fruit to confuse it more.
>>
>>101802472
>Responses clearly indicate that they're training on the question
No the fuck it hasn't.
>>
>>101802472
We're moving in to "lookbook" for the time being. Keep changing until they start getting rid of the tokenizer altogether.
>>
binary AI when? it could just naturally be interleaved text+images+audio+video+software+operatingsystems+etc. in and out as desired
>>
>>101802561
two more weeks
>>
>>101802472
stop using the same fucking word
>>
File: 1711682197915385.jpg (325 KB, 1536x2048)
>>101799533
I'm bored
>>
Do you know python
>>
File: 3845627684.jpg (72 KB, 736x810)
>>101802634
will.. will you come outside with me anon?
and touch some grass?
maybe.. maybe eat some?
ah ha.. i'm joking..
> unless?
>>
>>101802527
which model can solve it?
>>
>>101802472
ask it how many r's are there in strawbery. see if it can't resist the allure of 3.
>>
>>101802634
you fried your brain, stop looking at so much blacked miku
>>
File: 1721828808874827.png (657 KB, 1362x1484)
>>
>>101802472
This is very bad for the company
>>
>>101802472
Realistically it should write python code to do that-

It's terrible. How and why would a scientist allow this?
Just say no?
>>
>>101802851
i mean not really. everyone knows that AIs and LLMs are not perfect, and realistically they never will be as they are based on probability. you see disclaimers everywhere on AI user interfaces saying so. i.e. "ChatGPT can make mistakes" is posted on the bottom of the chatgpt page.
>>
File: GUjZiO0a8AAj5wm.jpg (590 KB, 4096x3329)
>>101801660
Yes

https://x.com/Alibaba_Qwen/status/1821945506463129905


* Voice Chat: for the first time, users can use the voice to give instructions to the audio-language model without ASR modules.

* Audio Analysis: the model is capable of analyzing audio information, including speech, sound, music, etc., with text instructions.

* Multilingual: the model supports more than 8 languages and dialects, e.g., Chinese, English, Cantonese, French, Italian, Spanish, German, and Japanese.
>>
what's the minimum equipment needed to run mistral large 2407 at 128k context at an acceptable speed?
>>
>>101802732
Dunno. I don't think i've seen any model solving it correctly without making it spell the word. With tokenizers, there's always going to be trip words. If not repeating letters it will be counting syllables or counting consonants or whatever.
>>
>>101802472
ask it how many r's there are in nigger
>>
>>101802966
I have no idea, but you need to specify what you consider acceptable to get a reasonable answer from people with experience.
>>
>>101802934
What a deviously rendered graph.
>>
>>101802472
Tell it to write five sentences ending in the word "apple".
>>
>>101803037
Failing that test is a different issue. That's keeping to the instruction. Words that rhyme with apple is a bit closer, but still bound to fail without a model specifically trained on that.
>>
>>101802966
https://huggingface.co/spaces/hf-accelerate/model-memory-usage
>>
File: from rapee to rapier.png (520 KB, 800x2948)
that took a turn
>>
>>101803064
nta. He's talking about speed.
>>
i cannot tell much of a difference between mistral 7b and mistral large except that mistral large writes 1 reply whilst mistral 7b writes 5 and you have to manually edit out the last 4
>>
>>101802470
It has (at least in an experimental state) support for distributed inference, with pipeline and tensor parallelism. The only other option for that is vLLM. llama.cpp RPC doesn't count because TP is broken.
>>
>>101803188
I will ask /aicg/ what they think about your IQ.
>>
>>101802966
3090*2 @ 10 tok/s
>>
>>101803094
>Anon not realising that all recent models are fuelled by the rage of #MeToo.
>>
>>101803188
how long did you try each model
>>
>>101803263
What quant?
>>
>>101803292
like 5 minutes
i nooticed that if you try to do anything complex, like multi character cards or forcing a character to write a bash script while staying in character, it just does shit like pull the script from a non-existent url using wget instead of writing the script and tells me that the character is a hackerman and that's why she can do that.
so weird.
mistral large actually writes the script.
>>
>>101802472
I just don't understand why people care about this type of questions.
>Oh god I'm so close... Tell me, how many... Unf... How many 'r's are in strawberry
>3
>Oh fuck yes Im COOOMING
>>
Best L3.1 8b coom finetune so far?
>>
>>101803424
3.5 sonnet
>>
>>101803321
I made that up.
now you want information huh, lol
just fuckin answer the original question, bitch
>>
File: campaign.png (86 KB, 1798x762)
Hi all, Drummer here...

Just wanted to share my new release with you guys!

https://huggingface.co/TheDrummer/Theia-21B-v1

>>101803094
I see you're enjoying Gemmasutra 2B! Warms my heart to see that. Curious to know what your specs are, Mr. Anon.
>>
Gemmasutra 27b or the tiger 27b one?
>>
>>101803440
You can try Gemmasutra Pro v1 in the Horde: https://overseer.logicism.tv/

Big Tiger 27B v1 is more of Gemma without the jailbreak / prefill hacks.
>>
>>101803431
suck my big black cock
>>
>>101802472
Literally training on benchmarks.
>>
>>101803473
Huh, there's no real way to... use that from the page you linked. I guess I'll just give 'em a download. Thanks anyhow!
>>
>>101803378
fucking nerds like to jack off to this stuff
>>
you dont ask a superintelligence to talk your language and count r
its not how it works, its morally wrong even
whatever
They dont want to supply a good base model, fine i get it, but the believers using these benchmarks are something else. This is a 3 class system, company, private .. well... lets just say kiddies, and proooompters using coombait models that know how to count letters and add numbers talk in full sentences and download a new version of something that holds true .. forever
>>
>>101803537
Oh anon! You can use them here: https://lite.koboldai.net/

>>101803511
Cont.
Are they hoping that we'll run out of riddles at some point?
>>
File: penis.jpg (196 KB, 1280x720)
always has been
>>
>>101803558
>This is a 3 class system, company, private .. well... lets just say kiddies, and proooompters using coombait models that know how to count letters and add numbers talk in full sentences and download a new version of something that holds true .. forever
Ask an llm to rephrase that wreck of a sentence. Or get your thoughts together, man.
>>
>>101800490
That's the speed and slowdown I get with mistral large on a regular computer, I learned to deal with it because it's good. The jump to 405b isn't worth it?
>>
File: 1723227386274.png (11 KB, 378x588)
>>101803427
trvthnvke
>>
>>101803558
Intredasting...
>>
Skyrim VR with LLMs is >too< immersive. I am scared for my sanity and will go for a walk -.-
>>
>>101800179
>I usually just download the original weights and convert them myself
I used to do this, too, but convert_hf_to_gguf.py hasn't worked for me for a month or so. Has something about the required python environment changed? I can't convert mistral large or llama 405b without getting cryptic non-googleable errors
>>
>>101803669
Have a safe journey, Anon. May your adventure be a positive experience for all parties involved.
>>
>>101803427
Why does everyone jack off Sonnet? Opus is way better, Sonnet isn't even as good as the Claude 2 models.
>>
>>101803684
there is a requirements.txt
>>
For that reason I do not believe free access to ai should be had to humans but the other way around
>>
>>101803684
>I can't convert mistral large or llama 405b without getting cryptic non-googleable errors
And i suppose you, at least, tried to transmit the error message telepathically to the llama.cpp people so they can fix it... or post the errors here so you can be told to update...
>>
>>101803684
just give the errors to chatgpt nigga
>>
>>101800561
I assume you mean L3 8b? The 70b 3.1 I tried is definitely better than mixtral, any 70b I've tried is.
>>
>>101803723
>there is a requirements.txt
I only get "Requirement already satisfied" for pip install -r requirements/requirements-convert_hf_to_gguf.txt
>>101803784
>or post the errors here so you can be told to update...
I'm always running a fresh pull
405b
> ValueError: Can not map tensor 'model.layers.1.mlp.gate_proj.weight_scale'
Mistral Large just outputs metadata only with zero tensors
>n_tensors = 0, total_size = negligible - metadata only
>>
>>101803807
he can't get llama3 working, sad to see really. End of scene.
>>
>>101803858
you need to install the gguf-py lib locally, the version from pip is very outdated. open an issue so that gg makes pip releases automatic.
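i.e. from inside your llama.cpp checkout (assuming master still keeps the package in gguf-py/):

pip install ./gguf-py

then rerun the convert script from the same venv so it picks up the local version.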
>>
>Message gets me hard
>Lose boner in the 5 minutes it takes for the next one to come in
Bwos... this kinda sucks...
>>
>>101803917
gen a bunch of pictures of your character beforehand a scroll through the gallery while you wait
t. 1.5t/s gooner
>>
are you joking or are you really using the most powerful technology humanity has created thus far just to masturbate?
>>
Are there any good Mistral 123B tunes?
Not Undi's.
>>
>>101804044
no
>>
>>101803915
>you need to install the gguf-py lib locally
Is there a newer version than 0.9.1? I installed the version from llama.cpp/gguf-py in editable mode and it made no difference.
>>
>>101803917
>pretty good message, getting close
>it's retarded at the end
>swipe
>>
>>101804033
>the most powerful technology humanity has created thus far
lol, lmao
>>
>>101804033
I'm trying to use it to play a text based RPG.
And masturbate.
>>
>>101804061
the version number doesn't mean much because it is not always updated, just install it from master
>>
>>101803858
How did you download the model? git, hf-cli? The files are all there and the hashes match, right? I had something similar (with a smaller model) when i ln'd the model files incorrectly from the lfs object files.
>>
>>101804033
you're damn right it's powerful, i had no idea my balls could keep going for so long.
>>
>>101804033
No, I'm actually creating the most powerful technology just to masturbate.
>>
>>101804089
>the version number doesn't mean much because it is not always update, just install it from master
I did, with --no-cache, and it didn't make any difference
>>101804101
>How did you download the model?
seq->xargs->wget script with header token cookie from my hf account. Same method I've used for the past six months at least
>The files are all there and the hashes match, right?
yes
I'm thinking I might just have to nuke and recreate my venv
>>
>>101799533
Could I kindly have some instruct prompt setups for Nemo?
>>
>>101804044
>Are there any good Mistral 123B tunes?
Yes, Undi has one.
>>
>>101804187
There's this one but I never tried it:
https://rentry.org/stral_set
>>
>>101804044
Frenchies didn't give us a base model, so probably there will never be one. And even if they did, it's a 123B model. 70B is already pretty expensive to tune, you know.
>>
>deepseek v2 lite not as good as codestral
i could probably run a q2 of the full one but it would likely be retarded. has anyone tried to use such a low quant for coding?
>>
>>101804376
But Stheno is the best model ever made and it was tuned on L3 8B Instruct.
>>
>nothing in the documentation nor API for huggingshit's transformers to sample multiple tokens per iteration step for even a non-pretrained model
GRRRRR
>>
>>101804044
Not using Undi's tunes if they're bad makes sense. Refusing to use them before testing them does not.
>>
>>101804181
>I'm thinking I might just have to nuke and recreate my venv
Probably. Unless you're missing one of the other files that wouldn't get listed by seq, but i suspect you have those as well. I couldn't get python3.12 to install the dependencies correctly last time i tried, so for now i stick to 3.10.
>>
>>101804535
I think it's pretty reasonable to not waste time downloading a 60 gigabyte turd from someone who has proven time and time again that they don't know what they are doing.
>>
>>101804535
But Sao said to avoid Undi's tunes...
>>101804570
Right, Sao?
>>
What do we do now?
>>
How do I use a draft model for lewds? Do I use a smaller lewd-tuned model for drafting and an intelligent main model, or vice versa (intelligent draft, lewd main)
>>
>>101802472
>all the redditors in replies fall for the most obvious html edit
>>
>>101803431
What are you talking about. I'm not the original poster.
>>
>>101804687
small for draft, big for int.
>>
>>101804743
But which of them should be lewd tuned?
>>
>>101804638
Give up and buy Claude api credits
>>
>>101804790
Ideally both, i suppose. But you won't find many big (120b+) models for smut. You'll have to experiment with what you have. Some sloppy 8b for draft + mistral-large or whatever and see what happens.
>>
File: Screenshot_60.png (82 KB, 948x749)
>>101802313
>depending on your backend you can also just use chat completions mode with the openai-compatible api, most of them support it
I tried that, but connecting to localhost like I do with text completion doesn't give me any models I can use, and doesn't allow me to generate anything.
>>
>>101794262
Yes and no. The problem is that ClosedAI refuses to attach version numbers to each new release of GPT-4. The reason (((they))) do this is to maintain the appearance that no model has passed GPT-4. However, older versions of GPT-4 have long been surpassed by local models. Another issue is that /lmg/ is infested with faggot bootlicking pigs who will defend this shit to their dying breath.
>>
>>101804864
>Doesn't work? Try adding /v1 at the end of the URL!
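Quick way to sanity-check the endpoint from a terminal, assuming ooba's OpenAI-compatible API on its default port (swap in whatever host/port your backend prints at startup):

curl http://127.0.0.1:5000/v1/chat/completions -H "Content-Type: application/json" -d '{"messages": [{"role": "user", "content": "hi"}], "max_tokens": 32}'

If you get JSON back with a choices array, the server side is fine and it's the frontend config that's off.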
>>
>>101804865
>Another issue is that /lmg/ is infested with faggot bootlicking pigs who will defend this shit to their dying breath.
Noooo.. open ai is nice and you cannot go around calling people those thing-ACK--
>>
>>101804284
>Yes, Undi has one
Please stop
>>
>>101804832
Okay, thanks. Do you know of any implementations that allow for GPU offloading? I can't run a model like mistral-large + an 8B in VRAM.
>>
>>101804984
llama.cpp does, but i think it's only on llama-cli (not on llama-server) so you'll lose some of the fancy stuff. Do you have the ram for it? There's only so much spec decoding can do. It's probably better to just use a smaller model.
>>
>>101805075
Not entirely sure but I'll give it a try regardless. I've run Mistral-Large before at Q3, think I had enough RAM to spare for an 8B. Keep in mind this is all just testing out of curiosity with stuff I have on hand, so nothing too crazy.
Currently I'm running Magnum-32B-v2 for lewds and the quality seems to flip flop which is why I'm looking into this. I have the sneaking suspicion that the datasets it was trained on may have included JB because the quality of RP seems to vastly increase around the 2-3k token mark. It struggles massively at the start, failing to keep spatial awareness and clothing correct, and then takes off like a fucking rocket spitting out better RP I've had than with DnD DMs who make it their whole life's work.
>>
>>101802561
>binary AI
in theory is should be customizable as fuck, so never, not going to happen, too dangerous for goyim and resident censorship brigade.
>>
>>101805159
NTA, but I couldn't get into 32B-v2 at all. Are you saying that it gets better the more you put in?
>>
I just tested the MiniCPM online demo
https://huggingface.co/spaces/openbmb/MiniCPM-V-2_6
with
https://x.com/dreamingtulpa/status/1806970644468220357
and got this. Not perfect, but that it could take video as input at all is pretty cool. Haven't tried any models that do this yet. I also only tried it because apparently Llama.cpp supports it
https://github.com/ggerganov/llama.cpp/pull/7599
Anyone else try this model? Does it work with ST?
>>
File: 1611723343598.webm (2.86 MB, 1244x700)
>>101805330
I put in this webm and it said it was a video game about flying a helicopter into a snowy mountain
>>
>>101805248
Mhm. I haven't tested it too extensively but it seems once it's a few messages deep (1.5-2k tokens) it really takes off and maintains it too. From 2k all the way to 8k+ it was verbose, used good visualization, stopped using shitty prose like 'subject reaction -> dialogue -> sentence following up' for each paragraph and started writing properly.

Out of curiosity I tried to find its limits and created a scenario with degradation, SPH, underage, and incest. It struggled at the start but once it hit 1.8k it instantly got better. Content warning for the screenshot.
>>
>>101805433
I giggled mischievously
>>
What are good settings for DRY sampling? Was absent when it blew up.
>>
File: 1718370618099252.png (104 KB, 1038x435)
>>101805330
Not fully useless.
>>
>>101805537
Nobody uses dry sampling anymore. We're past meme samplers altogether now.
>>
>>101805330
Multimodal only on examples/llava/minicpmv-cli, until (if) they add it back into server. I think kobold supports multi on server, but i don't know if they merged MiniCPM 2.5 yet or if they plugged it into the server.
>>
>>101805537
>dry
lol
>>
>>101805505
what the fuck is wrong with you people, these tools could be doing so much to make the world a better place
>>
>>101805505
Share card
>>
>>101805585
they are
>>
>>101805559
This. All you need is greedy sampling and prompting. If a model can't manage with that then it's shit.
>>
>>101805573
>if
Multimodal was ripped out of server 6 months ago. It's apparently not even remotely a priority for them.
>>
>>101805537
0.8 multi, 1.75 base, 2 length, half context length
>>
>>101805573
>>101805639
Sad. More and more multimodal are coming out, it'd be nice to play around with them in chats.
>>
>>101805505
Okay Alpin, I will give your model a try.
>>
>>101805505
I wonder if the same thing is happening with 12b-v2. I'll admit once it goes off the rails on the 4th response, I edit it out, swap in nemo and keep going.
>>
>>101805646
What about regular rep pen? I remember the original pull recommended a small amount of rep pen to go along with it.
>>
>>101805639
Maybe, eventually, some multimodal model comes out that interests someone in implementing it back again. Just merging minicpm took months, it was plagued with problems and they didn't provide anything for the server either. I don't think they implemented the server in their own repo.

>>101805685
Ye. But Chameleon didn't catch anyone's attention. It's gonna take some REALLY good model to pique their interest. Also, once you get into multimodal you have to start merging a lot of image/audio encoding/decoding, frame extraction... it's messy.
>>
>>101805746
they need to make a version of chameleon with a modern high quality dataset instead of recycled llama 2 scraps for anyone to care
fortunately I'm almost sure they will, it's an incredibly natural direction for llama 4
>>
>>101805742
Don't use any of that shit, it's not worth it.
>>
>>101805742
the creator of dry said not to use it in tandem so i've left it disabled
>>
>>101805585
It was a test. Feels like 90% of finetunes with synthetic datasets in their training have issues with more volatile content and this one doesn't.
>>101805716
No clue. Tried 12B-v2 and dropped it after only a few tries.
>>
https://poloclub.github.io/transformer-explainer/
>>
Have sex
>>
Can you imagine what the russians would think
>>
>>101805842
with a language model
>>
>>101805886
i bet they don't even stock their boys bathrooms with tampons
>>
>>101805842
with your local miku
>>
>>101805886
no because I don't know how to think in russian
>>
AGI about to be open sourced

https://github.com/iruletheworldmo/strawberry
>>
>>101805330
I guess it did an okay job with a surface-level description of the video.
>>
>>101806115
>Self-Mythologizing cock tease post from a company that has yet to release anything open source

Fuck off Sam.
>>
File: 1714733607879382.png (102 KB, 602x920)
>>101806115
get out faggot
>>
File: back-injury.webm (2.04 MB, 640x1138)
>>101806150
>>
>>101806157
most historic moment perhaps ever and you're here calling me a slur :DD fucking idiot
>>
fyi the openai thing is in about 6 hours
It's impressive compared to what's out there but
1 it's not THAT impressive
2 you're not getting unrestricted access for many many months lol
>>
>>101806177
two more weeks amirite?
>>
>>101806184
Q* / strawberry?
>>
>>101806204
yes
It's just jazzed up quite-star
>>
They're going to reveal that they used AGI to write the scenes for the cow girl with strawberry-flavored milk in Lilith's Throne.
>>
>>101806222
*quiet
>>
>>101806244
*queit
>>
>>101806235
I need to check up on this game. Have not looked at it in years.
>>
>>101806222
now i know you're a fucking larper because i had access to it and its fucking insane
>>
>>101806115
>>101806177
>>101806261
>>
>>101806184
WAIT THIS ANON IS RIGHT

https://x.com/iruletheworldmo/status/1822021879445787020

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! HAPPENING. JUST TWEETED
>>
total zoomer death
>>
>>101806286
UTC 12 am is in 3 and a half hours you fucking retard
>>
>>101806302
perhaps he said in about 6 hours because he didn't realise time flew so fast today... 3.5 hours is not that far from 6h
>>
>>101806115
really faggy and anyone who breathlessly gapes at this sort of thing is retarded
>>
>>101806115
A billion dollar company is playing out a shitty meme tier ARG. I haven't seen a level of faggotry this strong in my life
>>
>>101806341
thats why you're an incel because you never let your imagination run and have fun like a woman does... have soul
>>
oh 12 UTC, I assumed it was EDT
my bad
>>
Is this the big thing anon claimed was going to be released by a surprising party?
>>
>>101806368
bet you "have fun like a woman does" with every man on your street you little homo
>>
>>101806382
no, i'm a pure soul and i don't participate in such activities, i let myself have fun in pure ways, i'm not a hateful and toxic person like you. find Jesus anon!
>>
>>101806368
shit bait newfag
>>
Retards are drumming up way too much hype, it still fails on 70% of the things current LLMs fail at
>>
>>101806177
don't worry, they will taste in time
>>
Mini-magnum is indeed better than magnum v2 12B.
By a large margin too.
Didn't expect that.
>>
>>101806404
They need more investor money, of course they are drumming up disproportionate hype
>>
>>101806432
It's over...
>>
>>101806404
Assuming true then yeah, they're fags. An improvement is an improvement for open source though. Maybe they're not entirely bad folks if they really do open source it. Still, probably part of some strategic play. Maybe to try and gain some favor back now that they've been BTFO by Anthropic and others in the industry.
>>
>>101800154
I think I like Celeste more than Stheno
>>
Imagine using Q* to get 90% performance of the best humans in math and then the technique still fails to teach the LLM literally anything else over a regular LLM, like spatial relationships and not outputting slop.
>>
>>101806483
There are 100 different versions of Celeste...
>>
>>101806486
it can replace 95% of the population in all mental tasks
>>
>>101806486
>not outputting slop
why don't you write 3 paragraphs of non-slop rp and show us what it looks like?
>>
>>101806501
>Celeste-12B-V1.6.Q6_K.gguf
this is the one I got
>>
>>101806432
Still not good though, it's just too forward for my liking, and dumb too.
>>
>>101806486
Nah, it improves across the board. Most notably it gets better at things llms are already decent at. Sometimes much better. Still has many of the really obvious shortcomings llms have though.
>>
File: indian.png (28 KB, 661x280)
india gpt slop empire sirs
>>
>>101806516
I don't feel like doing that, so, no.

>>101806511
We'll see about that.

>>101806536
Well great if true. But also, getting better at things LLMs are already good at sucks, since the reason LLMs suck is precisely because of the things they suck at, so if this can't improve those aspects, then it really won't be as valuable as hoped for. Of course it'll still be a welcome improvement and could have large impacts on the world. Just not to the previously hyped scale.
>>
sama won btw
>>
Corpo cucks need to leak weights or leave
>>
>>101806623
drop weights or leave shill fags
>>
>friday night drop
yeah... sure
>>
SAM WON KEEP SEETHING CHUDS
>>
>/lmg/ turns into a closedai employee circlejerk
grim
>>
>>101806673
gate
>>
are there any benchmarks of tokens/second on different cpus and gpus?
>>
Will you apologize to Sam if they open source it?
>>
>>101806719
>OpenAI

>First open source release ever

No, Sam Kikeman just realized he can't compete with open source
>>
>>101806719
I'd shitpost about openai beating cuckerburg if that's worth anything
>>
>openai can't compete with shitty ~20 / 400 B riddle solvers
>>
>>101806432
I wonder if it has anything to do with mini-magnum being a miniaturized version of magnum 70b and 12b-v2 likely being based off 32b-v2 instead. Hence the naming discrepancy.
>>
>>101806719
no, the competition from open sourced solutions is the only reason why they'd open-source now. they lost momentum.
>>
Reason to use Nemo base over fine tunes?
>>
>>101806785
That's sad. I think they're both pretty cool peeps if they can turn around and correct their ways. It was Zucc's turn, now it should be Sam's. :)
>>
>>101806789
classic 4chud contrarians slurping cum of elonfag and twitter e celebs that cry about closedai every day
>>
Would Sam be able to beat Mark in a cage fight?
>>
>>101806842
I need Zuckerberg to kill this kike demon so hard there's no possibility of an open casket funeral. There is no nuance or ambiguity here, Mark is our guy, I could not be less critical in my support for him. Glory to Meta.
>>
>>101806842
sam is a twink and mark has a black belt
>>
>>101806881
I need you to go check Zuckerberg's early life section real quick.
>>
>>101806881
what do you think zuckerBERG is exactly
>>
>>101806286
The only thing that's gonna get us to AGI is scale.
Bigger models, bigger training runs.
Like, a lot bigger.
>>
Ok you can stop shitposting now, but unironically.
>>
>>101806916
define agi and when you expect it to appear
>>
>>101805689
Wowsers, this model sucks. Any scenario becomes sexo, completely ignoring any intricacies. Even Nemo does better than this and it's a 12B model.
>>
>>101806916
slamming more data into a stochastic parrot isn't going to make it smart its just gonna be a better predictor lolol
>>
>>101806832
/lmg/'s opinion on sam will pivot if he does, like 30 seconds after they get the model running locally
>>
>>101806942
Define AGI: Models capable of replacing humans for all cognitive tasks
When I expect it to appear: The timeline depends on funding/hardware progress, but around the 10-100T parameter range with Chinchilla scaling.
>>
>>101806902
>>101806907
Ignoring Llama being open source at least he didn't try to get the government to stop the bad goyim from using AI
>>
>>101806987
>with Chinchilla scaling
Considering we are beating the scaling laws with better training paradigms and architecture improvements that's probably closer to 6-60T
>>
>>101806962
I will unironically. I already like Zucc now. I will be fair to Sam. He just needs to keep it up, like Zucc, and stay open this time. Then we can all be friends.
>>
>>101806987
>Models capable of replacing humans for all cognitive tasks
models aren't ai or agi (notice they had to change the name to agi and move the bar). they're predictors. at the minimum, the entire platform needs to change from the ground up, it won't be transformers and next token prediction at all. the closest you might find right now would be in papers trying to figure that stuff out for the coming years
>>
>>101806987
This is kind of a faggoty philosophy question but what do you guys think is closer to a human

An AI with actual AGI vs an LLM trained on 100% human training data so all of its output is based on something a human did
>>
hey guys im a total noob

is raising the parameters to trillions really the answer to reach AGI? won't it just be a better predictor but still essentially a dumb parrot? or will there be some emergent capabilities that suddenly make it human? won't you need to make some completely different architecture?
>>
I hope someone mass reports the github acc, fuck you sam
>>
>>101807068
"overparameterization" and "the lottery ticket hypothesis" are good starting points to look into it
maybe this paper too https://arxiv.org/abs/2105.14368#
>>
>>101807039
Depends on your view, but I'm of the mind that humans can't come up with wholly original ideas. We just synthesize information. Ergo, a predictive model would already be more than enough to capture all the nuances of a human. Source: my asshole.
>>
Is this the same company advertising the forbidden knowledge feature?
>>
What we need are kernel methods
It's the last piece of the puzzle
And no, I won't elaborate
>>
>>101807192
I don't know what you're talking about but I believe it.
>>
Have there been any updates in local voice cloning? I don't see it talked about as much as I used to. Any examples?
>>
>>101807218
RVC or bust pretty much
>>
File: GUi6liNa8AE-ckR.png (29 KB, 1559x884)
It's friday, go to bed
>>
What happened to pedo anon? Did he get range banned?
>>
File: 1645888521298.jpg (37 KB, 475x352)
>can't escape the berryfags even on lmg
Fuck off with this shit until there's actual news. some tedious engagement farming larp is not news
>>
>>101807249
He's still spamming half the threads on the board
>>
>>101807228
Not until I get my happening.
>>
>>101807068
lol, lmao even
You could make a model with 1 parameter for each atom in the universe and it would still have the same problems all of the previous models have; AGI can't happen with transformers
>>
>>101807316
gotta be an actual inbred retard or a fed at this point
>>
>>101807362
Hasn't he posted suspect links before? Dude glows.
>>
File: file.png (894 KB, 1024x1024)
>>101807249
kling has been slow and i lost my temp accounts anyways because im retarded so im coping with flux. cute little girls sucking on lollipops will resume in 2025 once we have kling at home don't worry
>>
>>101807362
Damn, he's probably trying to kill AI huh. Like unironically hoping some journo comes along and reports this and makes a big deal out of it.
>>
>>101799533
For poorfags with 12 vram, is a high quant 12b better, or a low quant 4x7b?

I'm finding Beyonder 4x7b 3.0bpw exl2 impressive for its weight class, and I'm wondering if there's anything else I'm missing.
>>
>>101807442
moe models get raped by quantization so maybe the 12b, who knows, this crap is terribly documented (aka not at all)
>>
>>101807406
Why bother with that when you can just grab any AI related /b/ thread
>>
File: file.png (783 KB, 1024x1024)
no need for conspiracies, i just like cute girls :>
>>
The benchmarks are designed to be the antithesis of AI
>>
>>101806286
>>https://x.com/iruletheworldmo/status/1822021879445787020
are we supposed to know who this faggot is?
>>
>>101807671
it's AGI
>>
>>101807671
No
In one hour he's going to upload a single text file to the repository that says "trolled :)" then delete the twitter account
>>
>>101807713
AGI wouldn't do that
>>
>>101807671
some thirdie scammer doing thirdie scammer things
>>
>>101807713
>>101807745
then why did sama reply to him
>>
File: 1695422194426436.png (100 KB, 841x605)
>gpt4o voice sometimes fails to end its turn and starts predicting the user's response by switching to a clone of their voice to say things it thinks they'd plausibly say
wtf
>>
>>101807823
He was looking for people to hype up his new "agi scale", and strawberry man happened to be talking about it.
>>
>>101807847
And then ChatGPT and every other OpenAI employee followed his account for that?
>>
>>101807866
Yes
>>
>>101807841
>model literally speaking for the user
It never ends
>>
>>101807939
>>101807841
Kek
>>
>People talking about le fake strawberry man when they could be talking about InternLM 20B
>>
>>101807997
it's shit
there, we've had all the discussion we need to about it
>>
>>101808005
>>101808005
>>101808005
>>
>>101808028
>it's shit
Is it really? From what I hear, this model beats Gemma 2 27B and comes really close to Llama 3.1 70B in a bunch of benchmarks. 64.7 on MATH 0 shot is absolutely insane, 3.5 Sonnet has just 71.1. And with 8bit quants, you should be able to fit it on a 4090.
>>
>>101807841
sauce?
>>
>>101799533
source for this img?
>>
>>101808268
Nijijourney
>>
>>101807841
Now I'm sort of curious about how GPT4o would answer something like "Say something using my voice."
>>
>>101807249
Her name is petra.



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.