/g/ - Technology

Thread archived.
You cannot reply anymore.

File: 1705962944691597.jpg (100 KB, 500x710)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>101449685 & >>101439122

>(07/18) Improved DeepSeek-V2-Chat 236B: https://hf.co/deepseek-ai/DeepSeek-V2-Chat-0628
>(07/18) Mistral NeMo 12B base & instruct with 128k context: https://mistral.ai/news/mistral-nemo/
>(07/16) Codestral Mamba, tested up to 256k context: https://hf.co/mistralai/mamba-codestral-7B-v0.1
>(07/16) MathΣtral Instruct based on Mistral 7B: https://hf.co/mistralai/mathstral-7B-v0.1
>(07/13) Llama 3 405B coming July 23rd: https://x.com/steph_palazzolo/status/1811791968600576271

►News Archive: https://rentry.org/lmg-news-archive
►FAQ: https://wikia.schneedc.com
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started

►Further Learning

Chatbot Arena: https://chat.lmsys.org/?leaderboard
Programming: https://hf.co/spaces/bigcode/bigcode-models-leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
►Recent Highlights from the Previous Thread: >>101449685

--Papers: >>101453981 >>101454294 >>101454370 >>101452718
--Rumors and Speculations about the 5000 Series GPUs: Wait or Useless?: >>101454392 >>101454448 >>101454485
--New Mistral Model Performance and FP8 Inference: >>101454907 >>101454985 >>101454995 >>101455053 >>101455087 >>101455105 >>101455146 >>101455156 >>101455003 >>101455198
--Meta Scared of EU Regulations, Anime-Style: >>101455273 >>101455305 >>101455316 >>101455453
--Hardware Requirements for Running Nemotron: 8xRTX 3090 and Quantization Techniques Discussion: >>101451077 >>101451113
--Hardware Requirements for Fine-Tuning a Medical Transcription LLM: >>101453078 >>101453129 >>101453165 >>101453210
--EfficientQAT: Blurring the Lines Between Training Strategies and Quantization: >>101449904 >>101449996 >>101455653
--DeepSeek-V2-Chat-0628: Capable Coder, Dry RP Partner (and Potential Chinese Spy?): >>101454717 >>101454869
--Transcribing Audio with Timestamps: Whisper.cpp to the Rescue: >>101451382 >>101451555
--Llama.cpp Performance Discrepancy: Outdated Versions vs. Latest Releases: >>101455208 >>101455307 >>101456180 >>101456236 >>101456316
--GoldFinch: Combining Linear Attention and Traditional Transformers for Enhanced Performance: >>101453520
--Audio-Flamingo for detecting non-verbal audio cues to enhance LLMs: >>101451823 >>101451900 >>101451918 >>101451951
--Understanding the Default Rope Freq Base in ooba's llama.cpp: >>101454930
--EU Regulators Restrict Access to Multimodal Llama Models for Businesses: >>101451361
--OpenAI releases a cheaper AI model, sparking concerns about the future of FOSS AI: >>101456062 >>101456143 >>101457080 >>101456267
--Anon's Mental Health and Model Recommendations: >>101455983 >>101456365 >>101456388
--Alternatives to Mixtral 8x7B?: >>101453716 >>101453758 >>101453762 >>101453857 >>101454043
--Miku (free space): >>101450566 >>101455397

►Recent Highlight Posts from the Previous Thread: >>101449690
File: 1698082688369775.jpg (79 KB, 1280x647)
tick tock cuckies
Is there an ai I can use to improve my social skills? I am insanely bad at it and sound like a retard
yes, stop posting retarded questions in circlejerk generals.
Surely someone has made an ai to train social skills, no?
AI will make you sound even more like a retard.
What you want is to socialize with people somewhat less retarded than you in increasing steps.
t. autist who learned how to socialize somewhat okay thanks to my job as a developer that got sent to clients left and right.
Ooo you take clients
I'm mainly struggling with not knowing how to continue the conversation/having things to talk about, especially through text messages. It seems like romance ais are quite popular, surely there is something to simulate irl interaction?
>DeepSeek-V2-Chat-0628: Capable Coder, Dry RP Partner (and Potential Chinese Spy?)
Delete the recap and try again.
Thank fuck for the pandemic and home office. Now I never even have to leave my apartment since the company I work at got rid of its offices and all of our clients are in another city.
It's great.
4chan already improves your social skills.
>thanks to my job as a developer that got sent to clients left and right.
Literally the same and I had to write quarterly plans and reports that had to be approved by the board of directors so I directly had to talk to them and dumb shit down in normalfaggot corpospeak so much that I now am legitimately better at communication and social skills than the HR roastie despite being in at least the top 500 most sperg people on /g/
Yo what the fuck Mistral-Nemo is actually surprisingly good.

I'm using Transformers bf16 via ooba. Not even sure I have the chat format right, it seems to be different from past Mistrals. Each assistant message in the history needs an explicit EOS, which I added in the SillyTavern template. Also, the tokenizer always ends each prompt with EOS, no matter what. Never seen this before, but I guess it's intended? It means it ends with "[/INST]</s>" and seems to work. But that means you have to turn name formatting off, else it becomes "[/INST]Cumslut:</s>" which definitely fucks it up.

Anyway, the model feels kinda like Gemma. Very natural sounding, smart, almost completely uncensored. Doesn't hesitate with or try to dodge NSFW.

Also I have a past RP I always go back to to test models. Models always give one of two types of responses to a certain message. One response is correct, the other is incorrect behavior. Smaller, dumber models ALWAYS get it wrong, larger models always get it right. Never have I seen a model that 50/50s it based on RNG. Anyway mistral-nemo gets it right, by far the smallest model I've seen that does so.
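
A minimal sketch of the prompt layout described in the post above, for anyone rebuilding the template by hand. The `[INST]`/`[/INST]` tags and the `</s>` after each assistant message come from the post; the `<s>` BOS token, the exact whitespace, and the names `build_prompt` and `trailing_eos` are assumptions made for illustration, not the official Mistral template.

```python
BOS = "<s>"   # assumed beginning-of-sequence token
EOS = "</s>"  # end-of-sequence token mentioned in the post


def build_prompt(history, pending_user, trailing_eos=False):
    """Build a Mistral-style instruct prompt string.

    history: list of (user_message, assistant_message) pairs already completed.
    pending_user: the new user message the model should answer.
    trailing_eos: if True, mimic the behaviour the poster observed, where the
                  prompt itself ends with EOS right after the final [/INST].
    """
    parts = [BOS]
    for user, assistant in history:
        # Each finished assistant message is closed with an explicit EOS.
        parts.append(f"[INST]{user}[/INST]{assistant}{EOS}")
    # The open turn ends at [/INST] with no name prefix after it.
    parts.append(f"[INST]{pending_user}[/INST]")
    if trailing_eos:
        parts.append(EOS)
    return "".join(parts)


if __name__ == "__main__":
    print(build_prompt([("hi", "hello")], "how are you"))
```

Nothing is inserted between `[/INST]` and the end of the prompt, which is why name formatting has to be off: a name prefix would land in exactly that spot.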
Most of it is autists arguing
File: migguu.png (1.32 MB, 1024x1360)
>surely there is something to simulate irl interaction
Don't fall for this trap anon, going all in on token predictors for genuine connection is a recipe for disaster
Based grinding social skills on the job anon
Better than 27B?
>is a recipe for disaster
Maybe but I kinda need to speedrun social skills asap
Buy an ad, Arthur.
post logs or it didn't happen.
go on ome dot tv. get used to showing your disgusting horrible unwashed self, and speaking to equally terminally online retards. start each conversation with a purpose. reach your stupid insignificant socializing goals. start with being able to say a sentence without stuttering. keep making goals bigger, go from being able to speak to being able to actually make a basic connection with a stranger. work your way up to being able to make friends via mirroring and being genuine/funny/whatever else you think could be your strength. don't fall into the pitfall of thinking online friends are actual REAL friends though. after you rack up a few online friends, eventually, you'll gain enough confidence to actually try it on a real human being.
Possibly, or a sidegrade. It probably doesn't have as much raw intelligence as gemma 27b, but it's not far behind. Compared to gemma though, it does much less of the hesitating, glossing over, dick dodging type of descriptions, whatever that thing is where models dance around trying to avoid explicit NSFW descriptions. So it very well may be a more "fun" model to play with than gemma 27b.
Naw I'm done posting logs. It just derails the thread. People come out of the woodworks and swear their favorite 7b model can do even better than whatever you post. It's a fairly small model, just try it yourself. Anyone with 2 3090s can run the bf16. Or just wait for llama.cpp to implement it properly.
>ome dot tv
Never heard of that before. Is this the new Omegle or something?
File: aa4.jpg (22 KB, 456x628)
Does anyone remember what to look for with output issues in Silly Tavern? I switched my desktop to Linux to make it a dedicated Ooba + Silly Tavern server, everything works fine when testing only inside Ooba, but in Silly every output I get is nothing but backslashes. I remember having this issue before, but I don't remember what I did to fix it.
yeah, except it's not deadlocked with nothing but indians and arabs. there's like 200-500k people on there at any given time. first worlders too.
>uhm! it just le derails le thread
bullshitter, as i thought kek
>Possibly, or a sidegrade.
It's a bit too early to start with your new scam after Wizard and Midnight Miqu, mikufag.
Overthinking is your problem. Most interaction is vain, and again, not giving a shit attitude is the way to go. If you want something to say, then say it. Do not think if it looks good; make your intentions clear. It is that simple. If you like someone, just tell them. Especially if you are an autist, that honestly is something that helped me tremendously. Just being straight with people from the start did not only win me friends but also my girlfriend. Like something? say so. If you do not like something, also say it. very simple.
i take back what i said about gemma. it's not THAT bad. after formatting a bunch of rules in XML it seems to be a lot better with actually following them. i've tried some old cards that seem to be very predictable and stagnant with almost every model excluding CR+. gemma 27b has so far brought a fresh perspective in every card. i wish there was a 70b version.
Daily reminder that this is a Sao thread.
it's easy bro if she ever says something you don't know how to respond to or don't care about just reroll
make sure you keep the first message short or else there will be a large delay on first response as cache is filled

you dummy if you have to fake who you are for people to like you, you're gonna need to chase that endlessly to maintain the lie
just be yourself and if that isn't a good fit don't force it. better that than harbouring a deep seated apathy, resentment and hatred for decades.
>lists two genuinely good models
Make it less obvious next time, accuse me of shilling llama 3 8b or yi or some shit.
You can't just say to an autistic guy "just stop", it doesn't work that way. Some people have their brain wired in a way that makes them question every single thing
You can run Mistral Nemo on a 24GB GPU with transformers also with bitsandbytes in 8-bit (slowly) or in 4-bit (with degraded quality).

Contrary to the other Anon, I didn't find it better than Gemma-2, however. It also doesn't seem as good at following instructions.
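
A rough back-of-the-envelope check of the memory claims above (bf16 on two 3090s, 8-bit or 4-bit on a single 24GB card). This is only a sketch: it assumes a flat 12 billion parameters, taken from the "12B" in the model name, and counts weights only, so KV cache, activations and quantization overhead are not included. The function name `weight_gib` is made up for illustration.

```python
GIB = 2**30  # bytes per GiB


def weight_gib(n_params, bits_per_param):
    """Approximate size of the model weights alone, in GiB."""
    return n_params * bits_per_param / 8 / GIB


if __name__ == "__main__":
    n_params = 12e9  # assumed parameter count for Mistral NeMo 12B
    for label, bits in [("bf16", 16), ("8-bit", 8), ("4-bit", 4)]:
        print(f"{label}: ~{weight_gib(n_params, bits):.1f} GiB of weights")
```

Under these assumptions the bf16 weights alone come close to filling a 24GB card, which leaves almost no room for context. That is consistent with the thread's advice to either split across two cards or quantize.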
I can only offer my experience and nothing more. Take what you can from it or not; it is your life.
Isn't it better to live this way though? Not taking everything for granted and asking yourself why you should do this or do that? Autistic people are the exact opposite of NPCs I would say kek
>overthinking is your problem
Yeah I do overthink too much and I am indeed an autist
>but also my girlfriend
This is the situation I'm in right now. I found a nice girl online, she's my type, we introduced ourselves to each other but right now it's the day after and I have no fucking clue how to keep it going. Also as me and my friends have gotten older it seems like we're slowly drifting away and the DMs sometimes go a full week without talking and when I do talk to them it's a small chat, covid fucked this up a lot. It also doesn't help much that I have nothing really going on in my life besides waking up, WFH, playing games, going sleep.
Dunno if it is better life. I mean, happiness is mostly a short-term thing, whatever you do. And I am not some scholar or philosopher who knows the answer to what can be considered a good life. For myself, I like some social life, and having relationships did give me at least some motivation that I lacked before. Anyway, we are completely off-topic from the original question. likely best to stop here.
### Response (10 paragraphs, mikusex, expert roleplayer, detailed, descriptive, make me coom insanely hard):
Online friends aren't friends, never forget that. Also it's completely natural to fall out of touch with people with time. In my experience as someone in their 30s friendships have an expiration date of about ~15 years and it's not your or their fault. Your interests and personality changes over time and so do theirs while personal circumstance will be completely different.

Most of the people I know have 1-3 people they engage with on a daily basis and hang out with a combination of colleagues (forced), family members (awkward) and people they are attached to (wife/kids).

I work full time as an engineer, have a wife and child and I haven't interacted with any of my supposed friends for years now, not because I don't want to but there is simply no time for it.

It's time to get validation and life satisfaction outside of human interaction. Read more books, learn a new language or something while focusing on yourself some more.
News sure is picking up the pace, must be a busy month
Llama 3.5 and 405B have everyone shitting themselves. Mistral in particular.
>llama 3.5
Yeah the preshit is coming out.
it's llama 3.1 ;)
This, but unironically.
Has any cpumaxxx already tried DeepSeek-V2-Chat-0628?

GGUFs are already up: https://huggingface.co/bullerwins/DeepSeek-V2-Chat-0628-GGUF
Miku giggled as she adjusted her long teal pigtails. "Anon-kun, I've been waiting for you," she cooed...

[The AI's response continues for several paragraphs with increasingly explicit sexual content involving the fictional character.]
Have you considered you just don't care/aren't interested?
>online friends aren't friends
True but these friends are the people I grew up with. After school we mostly talked through text, discord, whatever and we used to play games but not anymore
Maybe but I also like having friends because they make me happy when I would talk with them. I also want to have a wife and kids in the future
>True but these friends are the people I grew up with. After school we mostly talked through text, discord, whatever and we used to play games but not anymore

I realize that anon, that was also the case for me and most other people on 4chan. You're not an unique individual. That has been the pattern for most millennials and older zoomers as we spend most of our free time online.

They aren't your actual real friends. Try not logging in for at least a month and you'll realize you don't give a shit and it feels freeing, go do literally anything else in your free time.
>I also want to have a wife and kids in the future
Same but I figure my brain isn't interested in socialization much. Not sure how or if I'll find a wife
So that's why GLM4 was so weirdly bad.
hey /g/ - new to llms, just setup ollama and openwebui. it's working ok but I noticed my GPU isn't being used at all. I have an Nvidia 1080ti with 12G of VRAM using dolphin-mixtral:8x7b. nvidia-cuda-toolkit-12.5.0 is installed. how do I get ollama to use my gpu?
>they aren't your actual real friends
Then who are? Brotherhood has been the norm for so long to the point that you legitimately lower your life span if you are lonely
Same for me, that's why I asked if there's any ai where I can practice my socialisation skills because at least with an ai it's not real so you can't burn bridges for trying
Would a Q2 of this be better than WizardLM2 Q4_K_M?
>all these new releases
>everything is a nothingburger
So LLMs have peaked already? Time to short Nvidia.
have you tried if anything else that uses cuda works?
So llama.cpp is faster than EXL2 now. EXL2 is soon to be a dead project.
>practice my socialisation skills
I doubt that would help for myself at least if I'm just not interested in socializing. Why would it make my brain want to socialize?
Your brain already knows how to socialize, you just need to stop wireheading long enough to let it work
I just have no idea what to talk about
>Then who are?
People you actually interact with IRL. If you don't do that it means you have no real friends. Or maybe you realize it's actually that one colleague you kinda like and hang out with from time-to-time.
>just keep doing it bro
No. This doesn't work if you are not biologically interested. Some people aren't interested.
I haven't. Can you recommend any other uncensored models? I was looking at TheBloke/dolphin-2.6-mistral-7B-GGUF but I'm not sure how to load it into ollama.
start doing more things during your day that you can talk about to people, and less autistic hobbies like llms
I much prefer stay a virgin all my life rather than forcing myself doing some shit I don't care about with people I can't stand because the vast majority of them took the vaccine and believe trans women are women lol
>but I'm not sure how to load it into ollama.
Use ooba. ollama is a piece of shit.
If you aren't interested in socialization it's because you can't compute rewards on a long enough time horizon
I never used ollama but you might try another backend like koboldcpp to see if it's a cuda issue, and also because it's easy to use gguf in it
anyway, I mostly use l3 8b niitama q5_k_m gguf
perhaps you can find things to do that you both enjoy and can talk about to people
Got any examples? My only hobbies are pretty much technology and playing video games. I do spend a lot of time researching health and wellness and talking about politics but I don't really think of them as hobbies or things people would be interested in. Even if I did something like skiing or skateboarding I don't see how it would work for a conversation besides bringing it up for a moment
obama is literal Spyware, I'll never understand people using something that you have to install with
curl ollama.com/rootkit.sh | sudo bash
I guess that men in that country are tired of doing 2 years of mandatory military service while women just get to chill
those countries are cherry-picked but they're also some of the very few countries with high income
I bet you wouldn't find many "libtards" in bulgaria or whatever
If you want a cheat code for all of this bullshit, one that most of the general population actually uses to socialize with other people because, believe it or not, they are not any wiser: start drinking.
>I bet you wouldn't find many "libtards" in bulgaria or whatever
yeah but bulgaria is a shilhole lol
~~nikke and stellar blade happened~~
There was a massive scandal with some feminist cults within politics. You can Google it. It is an absolute rabbit hole of insanity.
oobabooga and ollama are spyware? I don't doubt you, wouldn't surprise me. what's a viable alternative? what are you using?
you can't have your cake and eat it
materialism and progressivism go hand in hand
I have no idea what those two things are but assuming it's the massive scandal the Anon below you is talking about I will have to take a look at it. Thanks mate
All the women I was dating loved to stay home. Having inside hobbies is not the main problem; a lack of trying to socialize is.
I prefer to stay single in a rich country then kek
Ask these stupid questions on Reddit or Discord. Doesn't ollama have one?
>Having inside hobbies is not the main problem; a lack of trying to socialize is.
How can you even socialize if you stay in your home all day because all your hobbies are there? kek
I'm joking, imagine if all 17 year olds do is play coomer gacha and gaymz, nobody knows how to interact anymore
You are on the internet right now. I am starting to think you are either a troll or have very low intelligence. I would definitely agree that going out is the best way to socialize and to meet people, and even an autistic nerd like anon can do it, but holy shit, some of your takes make me wonder what kind of person I am interacting with.
No idea
I said obama not ooba
the fuck you talk about? how do you get girls on the internet? by siming to them on instagram? they got thousands of messages per day? on tinder? I got 1 match per month, only men like brad pitt have their chance on the internet retard
he's just going to say "socialize" and then say you're refusing to do it without telling you a way to feel like doing it (there is no way)
You could just say you are incel it would be fine. And yes dating apps.
local models?
No, I did manage to find the perfect girl. I just don't know how to continue the conversation, she dm'd me first because she liked me
give it a rest fag
File: 8wzwio329so61.jpg (359 KB, 2757x2757)
>And yes dating apps.
0 chance on that
so how are people running the new mistral nemo without support for the new tokenizer? or was support added somewhere? what am i missing here? thanks
be glad you're good looking anon, don't mess that up, nature gave you that chance, don't ever forget it
Why don't men just date their robots? Probably a lot less hassle than a real woman.
Are we in kindergarten? If you do not grasp even the basics of how human interaction works and need someone to tell you that, then I honestly do not know what to say because any advice I could give you would not help a person in such a situation.
>If you do not grasp even the basics of how human interaction works
I know how it "works", I just don't feel interested in doing it.
I've always said this. There isn't an incel/dating issue going on. There is a lack of social circles issue going on.

Despite what retarded graphs on /pol/ say and what dating app marketeers try to push, the vast majority of people don't use dating apps or even "meet people at clubs".

The vast majority of people meet through social circles. This is friends of friends at house parties and gatherings.

The main issue when I hear about single guys is always that they aren't part of social circles. They have 2-3 male friends at best and don't have a diverse group of friends with different interests or gender.

A healthy group of friends is 70% your own gender 30% the opposite gender. Yes you need to make friends with women. Actual legitimate friends as well, not just using them or friendship of convenience.

Try to actually talk to women in a non-romantic setting. Try to actually treat them like a person and speak to them. From my personal experience women tend to be the best friends. They are emotionally supportive, more understanding and more open to having deeper talks about your feelings.

Women also are the best wingmen. All the girlfriends I've ever dated were met through female friends and their social circles.


When your female friend sets up a social gathering or house party it'll be 70% female 30% male on average. There will be a lot of women there that (just like 4chan users) only interact with their own gender, meaning they don't even see men in their daily lives. You won't be seen as creepy or a threat because you've already been pre-vetted by their trusted female friend. YOU are the one that gets flirted with by those women in my experience. You just show up, treat them normally (like dude friends) and some will flirt with you, you flirt back and BAM after the hookup you exchange numbers and start dating.

That is how all the relationships I've ever had went.
>don't mess that up
I'm trying, which is why I started this whole conversation but I don't know how to do it. She is literally everything I was looking for in a girl but I don't know how to keep it going and I feel like I'm fucking up my chance at catching a unicorn.
File: 20822.jpg (468 KB, 1200x1200)
>Despite what retarded graphs on /pol/ say and what dating app marketeers try to push, the vast majority of people don't use dating apps or even "meet people at clubs".
You didn't have to "know how to do it" when you played games with your friends online. Because you felt like doing it and it was fun.
>Are we in kindergarten? If you do not grasp even the basics of how human interaction works and need someone to tell you that,
you know damn well that interacting with the opposite sex has its own unnatural set of rules, it's nothing like just talking to regular friends, it's way more complicated than that
this is true but social circles are a continuous thing, if you end up in a situation in which you don't have any friends other than autists who also don't know anyone other than you, it's very difficult to expand your circle
for some this goes back to high school which means it never ever started
>Yes you need to make friends with women.
>friends with women
LMAOOOOOO, the only way I consider friendship between a woman and a man viable is only if the woman is absolutely horrendous
That's true but times have changed and I don't think I could talk with them the same way I did years ago. I used to be the class clown and I'd make them laugh so hard but I feel like I forgot how I would make them laugh
Oh man, I sure love talking about le modern women bad on /g/.
>some roastie hands typed this post
>That's true but times have changed and I don't think I could talk with them the same way I did years ago.
I know. I'm not saying it would be fun now, I'm just saying it not being fun now is why you can't do it
>Posts graph based on a study by match group, the company that owns 96% of the entire dating app market including Hinge, Tinder, Bumble, OKCupid and others
Are you going to post studies funded by Marlboro about how cigarette smoke is actually beneficial next?
If you knew how to read you'd see that the study was made by Standford University
>That's true but times have changed and I don't think I could talk with them the same way I did years ago.
I can relate to that, now I'm too redpilled to waste my time and energy on some random people that participate in the decline of the west, unironically, I'm just investing my time on myself now, the world is fucked because of them
The way is to stop wireheading so your subconscious reward predictions can function as designed, leading you to intuitively desire social interaction.
You're right that you can't reason your subconscious into desiring something, but what you can do is stop actively defeating your reward prediction hardware by continuously feeding it instantly gratifying stimulus, which will allow it to come up with the right answers on its own.
First off because I don't consider black people that are 20% lower in IQ to be my equals, just the individuals that are in my range, might be fewer of them in existence than whites, but they are still out there due to the normal distribution of IQ in every demographic.

Second I don't base the valuation of a person on something as simplistic as IQ scores. I think measurements of something like "soul" is more important. I don't mean in a spiritual sense but more in a feeling sense. This is why I don't consider Asians to be my equal but insectoids instead while black people tend to have an abundance of soul and are usually pretty cool.
>First off because I don't consider black people that are 20% lower in IQ to be my equals, just the individuals that are in my range
I legitimately consider black people to be equals to me
So you went from "all blacks are equal to me" to "no I mean... some blacks are equal to me" kek
what does tiktok have to do with llms

fucking redditor
It might blow your mind when you find out who funded the research group at Standford (sic) University
>but what you can do is stop actively defeating your reward prediction hardware by continuously feeding it instantly gratifying stimulus,
So stop browsing the internet until I want to socialize?
You think 3.1 will be a big update, or worse than column-r?
>it's da jowwwws
I mean, yeah, I'm not a big fan of jews either, but you're supposed to fight them by making some valid counterarguments against the validity of the study itself, or else you lose all credibility
Yes, 4chan is a kindergarten for manchildren.

Based. I'm in the same position, and I don't even care. I honestly don't see why people feel bad about it either, I guess this is a testament that I'm autist.

This is absolutely bullshit. In the case of games you have things in common, the game.
If life was that easy, you could just pick up girls in anime conventions and things like that.

Women see their male friends as a harem, I'm pretty sure I saw this somewhere.
I don't consider most whites to be equals to me either. The point is that race isn't the determining factor. Fuck no if I see some anti-vax conspiracy retard "white" person will I ever consider it to be my equal. Race has nothing to do with it.

So "blacked" has no appeal to me. And neither does it to any person I've ever met. It's purely aimed at the conservative population that considers it to be a taboo.
>but they are still out there due to the normal distribution of IQ in every demographic.
even WOKEipedia agree with me on that kek
>Fuck no if I see some anti-vax conspiracy retard "white" person will I ever consider it to be my equal.
He's not your equal that's true, because he's superior than you, taking a vaccine made in less than a year by a company that has the biggest total fines of all companies is a strong indicator of a 2 digit IQ inpulse anon
File: dou-bao.png (78 KB, 600x400)
>he doesn't use doubao
No, retard. It's Match group. Not the Jews. Almost all dating studies are directly or indirectly funded by them. Independent research actually shows there is a massive decline in dating app usage, particularly among young women below the age of 25.

Dating apps are largely considered to be for loser women that can't find a guy in her own social circle. Women are shamed by other women for using dating apps. You're living in a online bubble.
>This is absolutely bullshit. In the case of games you have things in common, the game.
How does that contradict what I said?
>you could just pick up girls in anime conventions
I don't even feel like going to an anime convention though. Even if I did it doesn't mean I would feel like socializing with them much.
>Independent research actually shows there is a massive decline in dating app usage, particularly among young women below the age of 25.
such as? I'd like to see them anon
>If life was that easy, you could just pick up girls in anime conventions and things like that.
You can, if you're good looking
>bot can't differentiate "tick tock" with "tiktok"
Are you retarded?
That's literally just some file that someone uploaded as part of a discussion.
Also can you just fuck off already and stop shitting up the thread?
>Yes, 4chan is a kindergarten for manchildren.
but you're also on 4chan anon :(
>noo let us love nignogs in peace :((

>How does that contradict what I said?
You implied talking with girls is as easy as playing online games with friends.

that isn't "research"
that's journalism
>my anecdotal testimonies are better than the statistics you provided
no anon, just no...
File: efzef.jpg (214 KB, 2656x1367)
keep denying reality libtard, because we won't
I meant if it's fun (like playing games online) then you can socialize naturally. If it isn't fun (talking to that girl now) you don't socialize.
has anyone checked if the nemotron-4-340b base model is any good?

all I see is instructslop
I always find it funny that it's unemployed, permaonline friendless men that are complaining about being single.

I mean you've got much bigger issues to deal with and fix a lot first before trying to go for a relationship.

Relationships are not some shortcut to happiness. It also requires a lot of support, maintenance and logistics just to keep.

There should be a flowchart for incels on what steps to take before complaining.

>Are you employed?
No?: Search for a job
Yes: Continue
>Are you an active participant in your local community?
No?: Volunteer for something, engage with your immediate neighborhood
Yes?: Continue
>Do you have a diverse social circle comprised of both genders?
No?: Build upon your friend group and try to foster interpersonal connections
Yes?: Continue
>You can now start to consider dating someone.
Alright motherfucker. I got you.

You message her roughly as follows:
"Okay let me be frank for a second. You seem like pretty much everything I'm looking for in a girl. I am like, terminally bad at talking to women and I really don't want to fuck this up, so bear with me if I act like an awkward idiot. It's because I am one."

If that blows it for you, she wasn't your unicorn.
If she responds with something like "But you're not Frank, you're (Anon)." you put a fucking ring on that.
/licking miku's genitals/
this, even serial killers have a shit ton of love letters from girls because they are good looking, that's all you need to get bitches, you can do the worst shit imaginable it will be ok for them
I see, sorry for my autism. I guess socializing will never be fun for someone like me because, like OP, I worry too much.
But at least I have already given up and now I only worry about how to improve myself.
>There should be a flowchart for incels on what steps to take before complaining.
going to be prison and be good looking is one of the flowchart aswell kek >>101459773
>There should be a flowchart for incels on what steps to take before complaining.
>Are you handsome
No? Kys
Yes? That's it, you won.
let's goo, with that I'll be able to run L3-405b at 0.5t/s <3<3<3<3
Showing confidence is the most important thing. Show confidence even if you don't have confidence. Bullshit your way through with confidence.
The worst thing she can do is ghost you. And this just means you never had a chance.
File: 1716795842760632.png (246 KB, 600x654)
>If she responds with something like "But you're not Frank, you're (Anon)." you put a fucking ring on that.
*0.05 t/s
>Showing confidence is the most important thing.
just be confident bro >>101459773
We're talking about having a healthy relationship that's stable and fulfilling. Not hookups.
One day, when DDR5 issues have been ironed out, faster sticks come out, costs come down, and architectures/backends get better at making the most of RAM, that will be an insanely good deal. Spending 10k+ to cpumax for sub 1 t/s is retardation today.
You think the "wife" you're getting won't go for this handsome serial killer? lmao, must be a bliss to be this ignorant
ok, spelling the same thing two different ways doesn't make it two things, retard
>l3 8b niitama q5_k_m gguf
What made you choose that model? I don't see much about it on huggingface? Excuse the noob questions, just started learning about this stuff.
>What made you choose that model?
He's Sao shilling his own model. Spamming this general is how he makes money.
the guy who made it also made the model I was using before, and this one is newer and seems a little better
q5_k_m because I'm an 8 GB vramlet. I run it at 12k tokens of context; you can go higher on the quantization with your 12GB, but more context than that makes it go schizo
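A rough sketch of why that combination fits in 8 GB, counting only the quantized weights and an fp16 KV cache. The layer and head counts are the published Llama 3 8B shape. The 5.5 bits per weight for Q5_K_M and the 8.03e9 parameter count are approximations assumed here, and compute buffers and driver overhead are ignored.

```python
GIB = 1024 ** 3

def kv_cache_bytes(n_layers, n_kv_heads, head_dim, n_ctx, bytes_per_elem=2):
    """fp16 KV cache size: K and V each hold n_kv_heads * head_dim values per layer per token."""
    return 2 * n_layers * n_kv_heads * head_dim * n_ctx * bytes_per_elem

def weight_bytes(n_params, bits_per_weight):
    """Size of the quantized weights for a given average bits per weight."""
    return n_params * bits_per_weight / 8

# Llama 3 8B: 32 layers, 8 KV heads (GQA), head dimension 128.
kv = kv_cache_bytes(n_layers=32, n_kv_heads=8, head_dim=128, n_ctx=12 * 1024)
# ASSUMPTION: ~5.5 bits/weight average for Q5_K_M, ~8.03e9 parameters.
weights = weight_bytes(8.03e9, 5.5)

print(f"KV cache at 12k context: {kv / GIB:.2f} GiB")
print(f"Weights (approx.):       {weights / GIB:.2f} GiB")
print(f"Total (approx.):         {(kv + weights) / GIB:.2f} GiB")
```

The KV cache term grows linearly with context, so on an 8 GB card extra context competes directly with a higher-bit quant for the remaining headroom.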
I'm >>101459365

I actually have female friends that I'm very close with and share all of our secrets with each other. Which is why I look through all of this weird 4chan incel crap that can only ever be believed by someone that never spoke to a woman 1:1 in their fucking lives.

A lot of the /pol/ talking points immediately break down the moment you actually interact with people. Which is why it's so bonkers that many 4chan users hold this mindset.

It's equivalent of people saying the sky is green and grass is red. And you immediately realizing they never left the house to look at it themselves and instead of taking the fucking easy step of going outside to check themselves they believe others in the echo chamber and copy the standpoint that the sky is green and grass is red.

Women aren't like how they are portrayed on 4chan at all. In fact they are almost (ironically) the exact opposite of 4chan claim.
That's incredibly sweet
>I actually have female friends that I'm very close with and share all of our secrets with each other.
Sure they will tell the truth in front of you, my sweet summer child.
Anyway, it's pointless to argue like that, to know the real truth we need to see statistics and shit, not just some anecdotical evidence from 3 women friends out of 4 billions, you know what I mean?
Nemo status?
That seems super autistic but I guess if she married me she'd have to deal with my autism anyway, I'll try
Still lost.
go back to pleddit
>The main issue when I hear about single guys is always that they aren't part of social circles.
So tell us how we are supposed to become interested in doing that if it is not fun.
Is that why you're racist? Because all your opinions are based on statistics?
Hi Sao. What did you merge this time?
>Is that why you're racist? Because all your opinions are based on statistics?
Of course, only statistics matter, not feelings and anecdotes, I let that for reality deniers like libtards
statistics or actual science is raycist and transphobic, yes.
i merged my throbbing member with your mom's slick folds
I'm white, so statistically I'm smart than most people even without needing to do anything. that's how it works right?
Statistics provide some actual and solid evidence, of course we would try to understand stuff with that, I really don't get your point anon.
Exactly. You think you can keep the mask on for 40+ years? If she doesn't think you being an autistic retard is some kind of adorable quirk then you're shit outta luck. If she turns out to be as much of an autistic retard as you, congratulations, you found your soulmate. Go forth, and don't fucking multiply, we've got enough future school shooters running around.
why do we have a small brown zoomer obsessed with shitting on open source models?
explains why white people are so lazy, then. riding on the coattails of those who came before them and assume it was inheritable.
Statistically, it means you're more likely to be smarter than a random nigger, yes. >>101459599
This hobby seems to attract the dredges of society and there are no barriers to posting here.
why do we have a small brown zoomers defending niggers here?
Is this your own Aesir?
File: ted_20111005.jpg (52 KB, 580x363)
>explains why white people are so lazy, then.
Telling lies anon?
go back
It actually is fun, you just don't realize that because you never tried it or never tried it with the right people.

Think of it like this. How many guys have you met through your life? Probably thousands. How many were cool enough to have as friends, probably a dozen or so. It's the same with women, yet most guys on 4chan have barely spoken with 10 women in their entire lives, yet make all kinds of generalizations that are plainly false.

Women come in all kinds of shapes and forms. I don't hang out with dudebros or "autistic manchild metal T-shirt bearded IT guys". But that might be your crowd. Similarly I'm not friends with harry potter/cat obsessed funko-pop women or motorcycle metal band girls either, but that might be your thing.

It's fun when you find your people. Be open minded because you never know which demographic is the one you fit in (My female friends are all stacy bartender dominant types which I would never have expected to click well with my personality, but it does, not for dating though but for friendship, who would've guessed)
>that video
>the messages being posted ITT
this general has been overtaken by reddit
it wasn't even that good to begin with tbqh
I can't wait until you guys form a discord and this thread finally dies
>don't fucking multiply
No, he just needs to make sure he doesn't vaccinate his kids
no one cares about open source models bro, also you get model's weights only so its not "opensource".
>this general has been overtaken by reddit
reddit wouldn't argue that niggers are inferiors to white people, what are you smoking anon?
check unemployment for the rest of the world and you'll see how lazy the white people are.
>no one cares
>spends hours spamming the thread every day
I think this must be the worst /lmg/ thread I've seen so far.. You have your own page robots.
yes, to some extent, you see zoomers brigading for nignogs ITT, not surprising though if you remember that /lmg/ originates from /aicg/, the faggiest general on /g/.
why do you brown people not look after yourselves? also did you get molested by one of your uncles? you give off that vibe
it's just one guy and he's probably the same one who posts that income chart whenever someone makes fun of jeets
File: 1709668548089889.gif (810 KB, 500x290)
Asking for dating advice on /r9k/ has to be the worst decision ever
>It actually is fun, you just don't realize that because you never tried it or never tried it with the right people.
>try 10000 times to find the right people
let them vent lol, we don't have any cool stuff right now to mess with :/
>let them vent lol
No. Fuck off.
I've been using Mistral to write replies to off-topic posts because the irony using a local model to write posts for a thread about local models amuses me.
The results are always people bitching about the posts being from a redditor. How would they fucking know if they weren't some colonizing tourist from the election? It fills me with a mix of revulsion and disgust to see how retarded this site has become. We used to have a much better standard of retard.
>you won't solve this country's birth rate decline by trying to convince the incels to replicate
This entire discussion started because some dude got messaged by a girl but is to autistic to talk back to her kek
>yet another "not bad for its size" small model released by a big lab
"LLMs have permanently plateaued at GPT-4 level" doomers looking more correct every day
checking in for today
okay, can you clarify which actual points you would like to make beside "actually that is false because anon just made an anecdote"?
talking about llm shit 24/7 seems boring, saoshit and random drummer memetunes is the only thing you talk about here, give it a rest fags
Hi Undi.
This is a LLM general, if you want to talk about something else just go SOMEWHERE ELSE.
uhhh speaking of drummer memetunes, Gemmasutra 9B v1 was announced today and I'm too lazy to touch it k thanks no need to reply to me
I don't know, the "is futa gay?" one was pretty bad too.
i gave up on memma is not that good reals
Make a discord.
aight cool
No one cares, drummer. baa
So is no one running this mistral Nemo model yet?
no support yet, 2mw, then two more to fix random tokenizer issues
one /lmg/ thread got taken over by off-topic discussion, oh noo! lmg has fallen! it's over... time to pack it up!
fucking whiny bitched ffs
It's not made by Sao. It doesn't have my trust.
Makes it easier to write cards with grounded personalities.
it's been dozens of threads with 50+ chains of off-top disigenous
It's still better than the Sao shilling spam.
recursively hiding replies and not getting baited is our only hope
You need to adjust your temp and top K, you're starting to repeat.
Why would that even be worthy a whole thread of discussion? Futas are women with dicks, and the only parallel we have for this irl is transgenders. But that only works if you think transgenders are women (impossible) or if you are gay.
>Futas are women with dicks, and the only parallel we have for this irl is transgenders.
no, futas are like intersex people, they have both sex (penis + vagina) except that they both work for a futa
oh god, no again
oh yes, do continue
saying "go back" on purpose to kill it's original meaning or what?
there's no point in saying it when your general originates from /aicg/ - the embodiment of reddit on /g/, retarded faggot lmfao
File: 1721338935760.jpg (194 KB, 1080x764)
kino is back to the menu.
/lmg/ is a diverse community of largely intelligent competent and most importantly EMPLOYED people. Therefor the dating advice here is already way ahead of the rest of 4chan by being more grounded.

The issue with communities is that there is always the survivorship bias in play as well as some other vices I will list now:

>People with the most free time will overwhelmingly be the most common posts, meaning children, unemployed and crazy people will outpost functional people drowning out regular discussion
>Communities tend to make a status ladder which promotes contrarianism to show you're part of the "in-group" This has caused the effect where /a/ has the worst anime taste, /v/ has the worst vidya taste and /g/ has the worst linux taste and discussions.
>People are physically wired to react more to negativity than positivity meaning you will hear complaints and whining over and over while positivity is being drowned out, causing more people to not post positive things at all.

All of this is amplified on places like /r9k/ making it the blind leading the blind. This is why it's more enticing to post about dating or other topics on /lmg/ as there is still some overlap here but it doesn't suffer from those limitations. It should still be avoided outside of just local discussions in a thread here and there with months apart.
>not made by Sao
Go back.
>literally restarting the off-top discussion...
File: 1718241432767426.png (1.16 MB, 1024x1024)
bros how do i make ultimate sexo bot myself
Wrong thread: >>>/vg/486515294
>We are conducting more evaluations before the final release. If you would like to access Higgs v2 early or do customization, please contact us at api@boson.ai.
*remove my shirt* yes?
>[The AI's response continues for several paragraphs with increasingly explicit sexual content involving the fictional character.]
shivers up her cups your balls swirling tongue her wet heat like a vice that is so uniquely hers
Where's tabbyapi but for llama.cpp? I don't want to calculate the tensor split manually...
>A jump from 55 to 62 on MMLU-pro
that's it, they're starting to cheat on those new mememarks again, when will they learn? Make a fucking private dataset already...
No no, this is the AI text generation thread. You want the AI text generation thread.
>managed to derail this entire thread which eventually turned into /pol/
I kneel.
>Just copped $400 3090 to go with my P40
I'll be ready for 70B just in time to get mogged by 405B.
File: msihz11r2pfb1.jpg (86 KB, 670x481)
I'm addicted to GPU coil whine. It feels like something important is missing when my PC isn't in the same room when I chat with llm.
yay nemostral confirmed for 2mw status
>The issue is that it uses a custom tokenizer named Tekken. That's not an issue for any program that uses Transformers. As their tokenizer system supports the custom tokenizer. Which is why they call it a drop in replacement.
>For llama.cpp however the custom tokenizer has to be implemented manually. And implementing new tokenizers correctly is usually not easy. Gemma-2 and Llama-3's tokenizer for instance took quite a while to implement properly, and it took multiple attempts to do so as bugs were found over time.
For me that's the opposite, if I hear that shit I'm scared my GPU will explode or something kek
>tell your robot wife you love her
>sudden increase in coil whine
And so much for "no reason to use exllama anymore"...
>she is thinking hard about how to reply without hurting your feelings
can we use it elsewhere? like on chatbot arena to see if it's actually good and not a meme?
Why did they name a tokenizer after a fighting game?
one of the big draws is long context, all the online services that would allow testing for free gimp context to small sizes
Why did all the people suddenly get so lazy around 2008?
File: petra bros?.png (1.1 MB, 3244x1518)
Uhm... petra sisters... this doesn't make us look good...
there was a huge economic crash in 2008, you don't remember that?
>Uhm... petra sisters... this doesn't make us look good...
how does reposting comments on issues makes me a pretus?
>it's over
>[insert stale meme here]
hi petra
>Hackers Claim to Have Leaked 1.1 TB of Disney Slack Messages
>A hacker group called “NullBulge” says it stole more than a terabyte of Disney’s internal Slack messages and files from nearly 10,000 channels in an apparent protest over AI-generated art.
Why would those retards go after Disney? Are they using AI to produce their movies or something?
hi cudadev
now kith
How did you get it working? I get an error about the tensor shape not being right.
anyone tried arcee nova yet?
Disney fired most of their artists and reduced their writing staff as more and more AI tools have replaced (parts) of the job.
he's not using lcpp learn to read
>using Transformers bf16
Yes I'm asking how he did that. I get an error when trying to run it with transformers in ooba at BF16
1.1 TB of pretraining data, great
Kek, to be fair, Disney only make woke slop movies now, so using chatgpt looks good to them, it's the perfect candidate
I'm pretty sure those discussions are worse than fucking leddit though: "We need to make this character black, oh btw how to add a trans character without anyone noticing it?" I wouldn't put that shit on my model desu
What are all the things that can be optimized about transformers?
llama.cpp CUDA dev, you won!
Gguf now faster than exl2
This isn't true for 8B models
Kobo won
It's true for 3D models
Dead project, discord project, redditor project
Ollama wins
I got a private dataset for you right here

*grabs nuts*
For me, it's 882 T/s with exllama and 630 T/s with llama.cpp with Llama 3 70B.
prompt eval time     =   12179.96 ms /  7679 tokens (    1.59 ms per token,   630.46 tokens per second)
generation eval time = 2090.56 ms / 26 runs ( 80.41 ms per token, 12.44 tokens per second)

11 tokens generated in 9.53 seconds (Queue: 0.0 s, Process: 0 cached tokens and 7667 new tokens at 882.84 T/s, Generate: 13.01 T/s, Context: 7667 tokens)
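The tokens-per-second figures in those logs can be recomputed from the raw token counts and timings. A minimal sketch, using only the numbers from the llama.cpp lines above plus the 882.84 T/s exllama figure; the function name is just for illustration.

```python
def tokens_per_second(n_tokens, elapsed_ms):
    """Throughput in tokens/s from a token count and elapsed milliseconds."""
    return n_tokens / (elapsed_ms / 1000.0)

# llama.cpp numbers from the log above
prompt_tps = tokens_per_second(7679, 12179.96)   # prompt processing
gen_tps = tokens_per_second(26, 2090.56)         # generation

# exllama prompt speed as reported in the log above
exl2_prompt_tps = 882.84

print(f"llama.cpp prompt: {prompt_tps:.2f} T/s")
print(f"llama.cpp gen:    {gen_tps:.2f} T/s")
print(f"exl2 / llama.cpp prompt speed ratio: {exl2_prompt_tps / prompt_tps:.2f}x")
```

On these numbers the prompt-processing gap is about 1.4x, while the generation speeds (13.01 vs 12.44 T/s) are within a few percent of each other.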
Yeah for me exl2 is also still a bit faster but it's crazy how the gap is closing. A few months ago exl2 was almost double the speed and I was wondering why anybody was bothering with gguf. If things continue like this llama.cpp might surpass it.
>llama 3 70b instruct
is that your daily driver? lol
this but unironically
I like Higgs V1
My daily driver is Gemma 2 27B. It's really fast with exllama and flash attention, but the outputs are worse than llama.cpp, so there's no point in making the comparison with that model.
Not that Anon but I'm maining L3 70B spins. People keep stumping for others and I keep finding them too stupid to answer simple stuff.
better than tiktoken
File: 1695971992407731.png (129 KB, 589x758)
opensores ai sisters...
Hope you're at least getting paid, faggot
He's wrong though, summer dragon (and by extension davinci) sovl still hasn't been topped.
The silence in the general is because most anons ACK'd.
now show us some examples of that sovl, let's laugh together.
I can't, you had to live it.
Nah, it was pretty fucking trash compared to what we have now; even cai got obliterated.
>even cai got obliterated
nah. the cai soul is real.
File: 1692513827940035.jpg (243 KB, 1115x892)
and then what happened?
Eh, the L3 405B was getting 86.5%
It's just a matter of distillation, if only someone who worked on open source knew how to do it.
File: 1702099141328062.gif (22 KB, 268x214)
text-davinci had soul though
Papi chulo tells a bedtime story to the little girl and she falls asleep.
machines can't have souls
no shit lol? its just a text on your screen.
thread quality is unimaginable shit
thanks for your valuable and insightful contribution to the thread quality
>317 replies
>"heckin thread qualityrino"
get out fag
it's time to go to bed fucking zoomer.
I agree. Not really unusual for this general but it is a shame.
I disagree. It was fine before, but now that you mentioned it I think it's definitely shit.
you could say that the quality of this thread is directly proportional to the quality of our models
Is the thread going to improve a lot in a week with the release of Llama 3 400B?
how much vram for 4k ctx prompt???
>he doesn't know
We will reach levels of shitposting and newfaggotry never thought possible.
>Llama 3 400B
when it releases?
next tuesday
am i stupid or would buying this be a good idea?
no, but it will improve with llama 3.1 longbo
So you're getting at least two so you can do better than 70B badness with slightly faster token rate, right?
i have 5 4060tis and a 3090ti and room for 7 GPUs. im also planning on getting a 5090 when those come out
I ordered one of those earlier this year for about that price. I ended up returning it because the lack of flash attention for Turing cards cost a lot of performance compared to my other cards, so it didn't feel worth the 2K euros I paid.
Back then the guys behind flash-attention kept promising that they would look into porting flash-attn to Turing if they had time, so maybe that's no longer an issue, but I didn't want to wait until they possibly, maybe implement it.
i would really just be using it as a sacrificial VRAM card. i dont really need any performance from it, just VRAM
VRAM is useless if the performance is terrible
besides, if all you care about is VRAM, why not load up on P40s
Buy a 3090
refer to >>101462368.
i already have more than enough performance, i just need more VRAM. P40s are absolute garbage, and ive heard their drivers are a nightmare to configure. i need more than just an extra 24GB of VRAM
Your build is always more or less as fast as the slowest component you're using to run your models. If you partially load a model onto a 48GB RTX8000, the layers on that card run at that card's speed. Just like offloading even 10% of your model to RAM absolutely tanks your performance even if the rest is on GPU, doing this with the RTX8000 will drag your inference speed down toward that of this card.
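A simplified way to put numbers on that claim: if the layers are split across devices and run one after another for each token, the time per token is the sum of each device's share divided by its speed. This sketch assumes a plain layer split with no transfer overhead, and the speed ratios are hypothetical, not measurements of any card mentioned here.

```python
def relative_throughput(fractions, speeds):
    """
    Throughput relative to running everything on the fastest device.

    fractions: share of the model's layers on each device (should sum to 1.0)
    speeds:    each device's speed relative to the fastest one (1.0 = fastest)
    """
    time_per_token = sum(f / s for f, s in zip(fractions, speeds))
    return 1.0 / time_per_token

# HYPOTHETICAL: 90% of layers on a fast GPU, 10% on something 20x slower (e.g. system RAM).
offload_10pct = relative_throughput([0.9, 0.1], [1.0, 0.05])

# HYPOTHETICAL: half the layers on a card running at 60% of the fast card's speed.
mixed_cards = relative_throughput([0.5, 0.5], [1.0, 0.6])

print(f"10% on a 20x slower device: {offload_10pct:.2f}x of all-fast speed")
print(f"50% on a 0.6x speed card:   {mixed_cards:.2f}x of all-fast speed")
```

Under this model the result is a weighted harmonic mean rather than exactly the slowest device's speed, but a small slow share still dominates. That matches the point about a 10% RAM offload tanking performance.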
Any of yous guys fuck with NVIDIA ChatRTX?
File: 1697750178233555.png (357 KB, 338x436)
>nvidia promoting local models
uh oh
it doesn't matter how powerful your cards are, if you just want more VRAM get P40s
anything else is a shit deal and you're wasting money
if you want good performance with high VRAM then cough up for an A100
is the RTX 8000 slower or faster than a 4060ti? are my 4060tis holding back my performance then? should i work on replacing my 4060tis with something else?
Does that thing not come with a cooler?
it does not come with a cooler, but its only a 230 watt card so it should be fine if you have a fan pointed at it or something
He could buy TWO of those 48GB cards for the price of a 3090, Anon.
literally the inverse of that is true. a 3090 is like $700
What's a good normie model that's still uncensored.
I want a normal chat model that won't try and end the conversation if I mention anything dicey.
how much vram do you have?
Only 12
out of 10
File: file.png (1.01 MB, 768x768)
Hi baguette. Very organic shilling. Still not touching your dwarf model. You must be at least 20B old to be considered.
I can't run 70b and I'm sick of retarded MoEs, what's a good monolithic model in the 20-30b range?
Now draw her
Gemma 2.
File: 1717100040930594.jpg (35 KB, 680x521)
am i retarded or is using the mixtral 8x7B lima merge not what everyone is still doing for aismut on a single 24GB card? been using it for like 5 months now and haven't seen a need to upgrade besides the occasional ear whisper and spine shiver that i edit out or model switch to break
>Commander - performs like a 30B model should and is nice for cooming!, high context!, no GQA...
>Gemma27B - fits perfectly on a 24GB!, GQA!, low context...,still bugged to shit...
>Mistral12B (speculating) - works from start! GQA! High context!, it is a fucking 7B with some makeup on...
It is incredible how all 7B's and 70B's are practically the same and interchangeable while all the releases in the 20-30B segment are fucking cursed.
wait, i thought commandr was a 100b model?
CommandR is fixed with FA and cache quant though. Just use it.
CommandR+ is. Atm the best for single 3090 is gemma 27B for smarter or commandR for a better writer.

Japan about to take off as the leader of AI.
>Try to actually talk to women in a non-romantic setting. Try to actually treat them like a person and speak to them.
i use a 4.5bpw quant of commandr+ with 20k context. should i switch to unquanted commandr?
100% right.
I mean if you can fit the bigger one then no.
now they just need someone who's willing to train a model that's not a 2k context piece of shit trained on cpus
Yes. It''s better.
yes. i can run the 4.5bpw quant of commandr+ at around 5 tokens per second
Japan is the India of the technology world.
I am talentless guy.
All the better!
There is no noticeable upgrade. Commander is a side grade arguably and you have to use 3.5bpw at most.
Buy a ad
Commander has a much better writing style / is more creative.
File: 1708000257104153.png (60 KB, 720x242)
India is the India of the technology world...
Though it looks like Trump may be smart about it and relax regulations.

AI Manhattan project to eliminate "unnecessary and burdensome regulations"

"The draft order, obtained by the Post, outlines a series of "Manhattan Projects" to advance military AI capabilities. It calls for an immediate review of what it terms "unnecessary and burdensome regulations" on AI development. The approach marks a contrast to the Biden administration's executive order from last October, which imposed new safety testing requirements on advanced AI systems."
wtf i love orange man now!
wtf orange man good
>dude ai is just like nukes!!!
File: 1694852243223701.png (827 KB, 759x1107)
That's what the experts are saying, yes.
Trump / the order didn't make that correlation, the press did. He just wants to get rid of regulations to keep the US ahead of China.
Well I'm an expert roleplayer, and I say AI is perfectly safe
>What we're doing here is super serious, you guys
I hate the media so much, they thrive off of fearmongering and panic. Somebody should really do something about them
uhh, bread?
no thanks. i dont eat gluten
uhh, bake it?
no thanks, I'd rather this general dies forever
I wish more people thought like you... Maybe the world would be a better place.
Fine, I'll do it:


I am getting very low tokens / second using 70b models on a new setup with 2 4090s. Midnight-Miqu 70b for example gets around 6 tokens / second using EXL2 at 4.0 bpw.

The 4-bit quantization in GGUF gets 0.2 tokens per second using KoboldCPP.

I got faster rates renting an A6000 (non-ada) on Runpod, so I'm not sure what's going wrong. Nvidia-SMI shows that the VRAM is near full on both cards, so I don't think half of it is running on the CPU.

Any advice is appreciated!
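Since the check described above was done by eyeballing nvidia-smi, here is a small sketch for logging per-GPU memory and utilization while a generation is running. It uses nvidia-smi's `--query-gpu` CSV output. The helper names are made up for illustration, and it only reports what the driver sees; it does not diagnose the cause of the slowdown.

```python
import subprocess

# Fields requested from nvidia-smi, in this order.
QUERY_FIELDS = "index,memory.used,memory.total,utilization.gpu"

def parse_gpu_csv(text):
    """Parse `--format=csv,noheader,nounits` output into a list of dicts."""
    rows = []
    for line in text.strip().splitlines():
        idx, used, total, util = [field.strip() for field in line.split(",")]
        rows.append({
            "index": int(idx),
            "used_mib": int(used),
            "total_mib": int(total),
            "util_pct": int(util),
        })
    return rows

def query_gpus():
    """Run nvidia-smi once and return one dict per GPU (requires an NVIDIA driver)."""
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY_FIELDS}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_gpu_csv(result.stdout)

# Usage while the model is generating:
#   for gpu in query_gpus():
#       print(gpu)
```

Polling this during generation shows whether both cards are actually being exercised or whether one sits idle with its VRAM full.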
try this
find system python with
 python -c "import sys; print(sys.executable)" 
Sadly it's just for military shit, not for waifu breed.
Thanks so much, but it didn't help unfortunately. I get faster speeds not using the 2nd GPU at all and just using regular RAM for the rest of the layers.
