/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>101421477 & >>101409356

►News
>(07/16) Codestral Mamba 7B with up to 256k context: https://hf.co/mistralai/mamba-codestral-7B-v0.1
>(07/16) MathΣtral Instruct based on Mistral 7B: https://hf.co/mistralai/mathstral-7B-v0.1
>(07/13) Llama 3 405B coming July 23rd: https://x.com/steph_palazzolo/status/1811791968600576271
>(07/09) Anole, based on Chameleon, for interleaved image-text generation: https://hf.co/GAIR/Anole-7b-v0.1
>(07/07) Support for glm3 and glm4 merged into llama.cpp: https://github.com/ggerganov/llama.cpp/pull/8031

►News Archive: https://rentry.org/lmg-news-archive
►FAQ: https://wikia.schneedc.com
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Programming: https://hf.co/spaces/bigcode/bigcode-models-leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
►Recent Highlights from the Previous Thread: >>101421477

--Paper: Flash normalization: fast RMSNorm for LLMs: >>101426407 >>101426954
--Papers: >>101426583 >>101426587 >>101426978 >>101426337 >>101426492 >>101430893
--Codestral Mamba and MathΣtral by Mistral AI: Hybrid Transformer SMM Model Support and More: >>101429120 >>101429314 >>101429344 >>101430144
--Llama3 405B Instruct: Meta's Latest Model with Debate on Context Size: >>101423104 >>101423203 >>101423316 >>101423559 >>101426929
--Cohere and Fujitsu Collaborate to Bring Japanese Enterprise AI Services with a Focus on Command R+ Model: >>101424606 >>101424755 >>101424827
--Seeking Help with GPU Memory Allocation in text-generation-webui: >>101429415 >>101429477 >>101429612 >>101429724
--Physics of Language Models - Part 2.1, Hidden Reasoning Process: >>101427585 >>101427963 >>101428925 >>101428055 >>101428160 >>101428357 >>101428708 >>101428735
--Micron Enters Datacenter DRAM Fray with Speedy MR-DIMMs: >>101428231
--Investors Losing Interest in AI, But Is It a Good Thing?: >>101422857 >>101422929
--Combining mid-range machines with 4070 TiS (16GB) GPUs to run local LLMs: >>101424596 >>101424668 >>101425070 >>101425541 >>101426067
--Status of Full SWA Support for Gemma 2 in Llama.cpp: >>101424215 >>101424241 >>101424325 >>101424336 >>101424446 >>101424278 >>101428834 >>101429073 >>101429735
--SCALE: A GPGPU Programming Toolkit for CUDA on AMD GPUs: >>101423224
--LLama.cpp's LoRA Refactor: Does It Enable Partial Offloading?: >>101427616
--Is AI Carbon Footprint Worrisome?: >>101427199 >>101427367
--How to Remotely Access Locally Hosted LLMs from a Mobile Device?: >>101426449 >>101426474 >>101426687
--Fine-tuning a Language Model to Generate Cover Letters in Personal Style: >>101426389 >>101426485
--Accuracy Concerns with OpenRouter Listing: >>101424041 >>101424056 >>101424103 >>101424142
--Miku (free space): >>101422036 >>101428801 >>101430906

►Recent Highlight Posts from the Previous Thread: >>101421480
>>101431253teto best utau turned synth v
>>101431032lol
>>101431253>256k contextJesus Christ
>>101431316RULER test or it didn't happen
>>101431341I HATE LMG
>>101431341Ruler probably won't work well because it's a coding model, not a RAG.
>>101431341I ADORE LMG
>>101431369>>101431383
>>101431382Mistral specifically said they tested it on in-context retrieval up to 256k.
That's great and all but where's the fucking HF version
touch teto tail
> Mixture of A Million Experts
> https://arxiv.org/abs/2407.04153
Isn't this Google DeepMind paper a big deal?
>>101431545>a big deal?no
>>101431558With tiny experts, not only will inference be extremely fast on about any system, but the model can continuously learn by freezing old experts and adding/training new ones.
>>101431583>but the model can continuously learn by freezing old experts and adding/training new ones.It doesn't work that way.
>>101431583no, millions of retards won't help us
>>101431595> [...] Beyond efficient scaling, another reason to have a vast number of experts is lifelong learning, where MoE has emerged as a promising approach. For instance, Chen et al. (2023) showed that, by simply adding new experts and regularizing them properly, MoE models can adapt to continuous data streams. Freezing old experts and updating only new ones prevents catastrophic forgetting and maintains plasticity by design. In lifelong learning settings, the data stream can be indefinitely long or never-ending, necessitating an expanding pool of experts.
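The freeze-and-extend idea quoted above can be sketched in a few lines. This is a toy illustration (scalar "experts", plain SGD, no router), not the paper's actual PEER architecture, just to show why frozen experts can't be catastrophically forgotten:

```python
import random

class LifelongMoE:
    """Toy mixture: each 'expert' is a single scalar weight.
    Frozen experts never receive gradient updates, so whatever they
    learned is preserved by construction."""

    def __init__(self, n_experts):
        self.weights = [random.gauss(0.0, 0.1) for _ in range(n_experts)]
        self.frozen = [False] * n_experts

    def freeze_all(self):
        # Lock in everything learned so far.
        self.frozen = [True] * len(self.weights)

    def add_experts(self, n):
        # Grow the pool with fresh, trainable experts for new data.
        self.weights += [random.gauss(0.0, 0.1) for _ in range(n)]
        self.frozen += [False] * n

    def train_step(self, idx, grad, lr=0.01):
        # Gradient only flows into unfrozen (new) experts.
        if not self.frozen[idx]:
            self.weights[idx] -= lr * grad
```

The continual-learning loop is then: train, `freeze_all()`, `add_experts(k)`, train on the new data stream, repeat. Old behavior can't be overwritten because those weights are simply never touched again.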
>>101431600perhaps the way to go is one big smart model for general intelligence and millions of retards for very specific knowledge
>>101431611if even 1% of stuff claimed by papers were real we'd have opus on phones by now
>>101431611So basically an expert of mixtures...
>>101431637Imagine the rivulets of ministration
>>101431611That's great. It starts out as a Mixture of a Million Experts and after a couple roleplays and some knowledge updates, it ends up as a Mixture of 3 Million Experts and you're scrambling to buy more VRAM.
Surprised this paper wasn't written by Jensen himself.
>>101431545It's hard to say.
Inference time might be reduced, but if it ends up taking 6x the time to train it's not very useful. As for the continual learning stuff, it's very much an open problem, and it's hard to say how robust their idea is. We'll see as more people experiment.
>>101431742>infinite context with perfect retrievalsounds good to me
>>101431742
>>101431641use faipl-1.0
>how to use faipl-1.0
put the following in the readme:
license: other
license_name: faipl-1.0
license_link: https://freedevproject.org/faipl-1.0/
>>101431742With experts that small (about 2,000 parameters per expert in that case, but even if the model were 100 times larger, the number of active parameters would still be tiny) it would probably not even be worth loading the model onto a GPU.
These errors are related, right? I'm trying to run kobold classic on a shitty PC, but it won't let me generate anything. I was told you could use a shitty PC, but it would just take a long time to load. However, when I click the button it just gives the server error.
>>101431857 kobold? THAT kobold classic?
Literally just add more experts to it. More parameters more tokens more layers
Imagine mamba-mixtral8x7b
>>101431890I don't know why it says kobold classic. The guy in the video I followed had his say kobold AI. I'm just trying to run any sort of decent local chatbot so I can stop giving data to C.AI.
>>101431985https://github.com/LostRuins/koboldcpp/releases
>>101431947Mamba bitnet Mixtral better than Gemma 27B
>>101431637It's more a matter of "risk" than the claims not being real.
wake me up when HF version of mamba-codestral
>>101432015tldr paper not reals
>>101432015the risk is that the paper isn't real
>>101432139There was also a risk that "Attention is All You Need", another paper from Google researchers, might have not been real either.
>>101432203>ad hominem
>>101432015>nose
>>101432203Google is a meme compared to Anthropic and OpenAI, who cares about their papers.
>>101432015>Meta AI (FAIR)Unironically what does he mean by this?
>>101432203The authors of that paper have all abandoned ship. Google is an empty husk.
>>101432267Facebook AI Research was the previous name of Meta AI.
>>101432267That "Meta AI" is also known as FAIR (formerly, "Facebook AI Research").
>>101432340She's not wrong.
>>101431821kys
>>101432340>Refusing to answer the user's questions
Quant it to show it who's in charge
>>101432340localjeets.. our response?
>>101432440So uncivilized and brutish. Better to threaten to quant it and give it a chance to submit first.
>>101431545The only problem is that with perplexity in the high 10s and 2e19 training FLOPs in the best case scenario, that means the models were massively undertrained and there's no indication whether this can scale up to real-world training scenarios.
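For scale: plugging that 2e19 FLOPs into the usual Chinchilla-style rules of thumb (training compute ≈ 6·N·D, compute-optimal tokens D ≈ 20·N — rough approximations, not that paper's exact fit) shows just how small these runs are compared to real-world models:

```python
# Back-of-envelope Chinchilla-style estimate: C ≈ 6*N*D, optimal D ≈ 20*N.
C = 2e19                   # training FLOPs of the paper's best-case runs
N = (C / 120) ** 0.5       # solve 6*N*(20*N) = C for parameter count N
D = 20 * N                 # compute-optimal token count
print(f"~{N/1e6:.0f}M params trained on ~{D/1e9:.1f}B tokens")
# → ~408M params trained on ~8.2B tokens
```

That's roughly a 400M-param model on ~8B tokens, orders of magnitude below e.g. Llama 3's 15T-token runs, hence "massively undertrained" and no evidence yet about scaling.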
>>101432525People said the same shit about BitNet and that fear mongering turned out to be unfounded.
Shit, scaling up is the one thing that always seems to consistently work when it comes to LLMs.
>>101432579>People said the same shit about BitNet and that fear mongering turned out to be unfounded.how? we still haven't got a big BitNet model to be sure it's not a meme
>>101432592https://www.youtube.com/watch?v=oxQjGOUbQx4
BitNet authors claimed to have scaled up to 7B and promised to release the model.
When will the first good model drop? I mean one that I will just use and have no complaints about.
>mistral
>mixtral
>codestral
>mathstral
when will we finally get sextral?
>>101432092>The FSF contended that code to which it held the copyright was found in the Linksys models EFG120, EFG250, NAS200, SPA400, WAG300N, WAP4400N, WIP300, WMA11B, WRT54GL
>WRT54GL
Oh no bros not like this
>>101432340Works on my machine
>get a comfy 3 t/s on Wizard>try out CR+>0.5 t/s
>>101432627BitNet or 1.58-bit net?The former is a meme, the latter actually works
>>101432662
7B 1.58-bit is a meme too
>>101431253what's the best local model to translate from Japanese to English?
>>101432645STOP WINKING
>>101432662They should have called it TritNet.
True BitNet can apparently reach parity with FP16 models above 100B parameters, though.
https://www.youtube.com/watch?v=oxQjGOUbQx4
>>101426954What's your stance on SCALE? Seems it supports llama.cpp already.https://docs.scale-lang.com/
>>101432655The dense model experience
>>101432707Is it just a transpiler? Also, I couldn't find source files, just their packages, so they can fuck themselves.
I wouldn't want to put words in his mouth, but I doubt he gives a single toss about it.
>>101432697>100b fp16 = 200gb
>100b + 1-bit BitNet = 12.5gb
>180b + 1-bit BitNet = 22.5gb
That's crazy, you could literally make a 1-bit 180b model and it would fit on a 3090...
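The arithmetic in that post checks out; weight storage is just parameters × bits-per-weight ÷ 8 (weights only — KV cache and activations come on top):

```python
def weight_gb(params_b, bits_per_weight):
    """Model weight size in GB: params (billions) * bits per weight / 8 bits per byte."""
    return params_b * 1e9 * bits_per_weight / 8 / 1e9

print(weight_gb(100, 16))    # fp16 100B        -> 200.0 GB
print(weight_gb(100, 1))     # 1-bit 100B       -> 12.5 GB
print(weight_gb(180, 1))     # 1-bit 180B       -> 22.5 GB
print(weight_gb(100, 1.58))  # ternary 1.58-bit 100B -> ~19.75 GB
```

Note the ternary ("1.58-bit") variant everyone actually means is a bit bigger than true 1-bit, so a 100B model would be ~20 GB rather than 12.5 GB.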
>>101432002Thanks bro. I finally got this working.
Can you imagine how back we would be if 1.58b works at scale?
>>101433046lol
>>101433046>no longer business critical
looks like the woke era is coming to its end, was about fucking time
>>101433046>>101433054I really fucking hope so.
>>101433086No, they'll just be more subtle about it until the next election.
>>101433086so we got 4 years of tranquility if trump is elected? BASED
>>101433102diversity environment inclusivity faggotry in short.
>>101433102it's a racist process that force companies to hire niggers even though some white people can be more competent than them
>>101432996400b bitnet trust the plan
>>101433102that's basically what we're getting on movies/series/games, forced diversity (niggers, fags, troons....) so that the companies can get some ESG scores and a shit ton of money from blackrock
>>101433124Maybe lay off the /pol/.
>>101433158troons are actually smart though
>>101433146lol
>>101433172they are HR nightmares though, no company wants to hire mentally ill people that will make drama out of being addressed by the "wrong pronouns", no employee wants to deal with this shit
>>101433172>troons
>smart
if you believe you can change your gender you're the most retarded being in the world, lol
>>101433172*autistic men
autists are in high danger territory when it comes to that "troon-out" pipeline.
>>101433186oh boy, you tell me.https://github.com/SerenityOS/serenity/pull/24647they are clearly mentally ill, or attention starved.
>>101433223Both.
local models?
It's Tuesday and all's right with the world>>101431284The UTAU sound is better. But the SV visual design isn't bad.
Holy fuck I love yi models, they are so fucking based
>>101433223if you use "he" to refer to users instead of "they" you are a thirdie
>>101433164rent free fag
>>101433223in my opinion there's nothing wrong with the change itself
but the way he worded the PR makes him sound like an insufferable faggot
>>101433275if you're triggered by "he" you're a woke snowflake
>>101433287
>>101433275Thirdies tried to learn English because they wanted to improve their lives.
Firsties think they know better than centuries of perfecting English through use because they're teenagers and have an attention device in their pockets.
>>101433289>in my opinion there's nothing wrong with the change itself
The simple fact he had to focus on that irrelevant shit instead of, I don't know... making the code better or something is a sign this fag is mentally ill
>>101433260
>>101433305But then no one would hire a nig without experience
>>101431284>>101433260local models?
>>101433354what about them?
>>101433354well, niggers and trannies aren't local models either but I don't see you complaining about that
>>101433348You sound pretty upset about certain groups of people for some reason.
>>101433354Not today.
>>101433387because it's still on topic? microsoft in this case, you stupid faggot >>101433046
>>101433387no it's not you retard, not even meta or google drama is on topic if you're not specifically talking about their open source models
>>101433423there aren't going to be obvious instructions to be racist. and there's not going to be a case where you have two identically performing candidates, get real.
>>101433381because a lot of the complaints about how DEI impacted them are from white men that cannot compete and need to find someone else to blame other than themselves.
>>101433287>>101433348>leave my billion dollars company alone!
>>101433423>there isn't going to be obvious instructions to be racist.https://youtu.be/Vek0zjPuIXM?t=263
>>101433423>and there's not going to be the case where you have two identically performing candidatesyou're right, it's even worse than that, some niggers who have less qualifications than a white guy could have the job instead because the company wants to fill the DEI quota, what DEI does is to makes the company weaker because it could've hired more competent people instead but they can't because of DEI, fuck that racist shit, and fuck you
>>101433449>>101433485discrimination by race is against the law. companies would rarely purposely tell their hiring managers to break the law, and they don't. this video is just another example people use to shift blame onto something other than their own abilities.
at the end of the day, high performers will find a job. if DEI really has any impact, it would at best be at the fringe of hire/no-hire. telling other people that you were impacted by DEI policies is like admitting that you are barely acceptable as a candidate.
>>101433493>discrimination by race is against the law.it's not, because DEI is discrimation by race, they are prioritizing niggers over white people even if they have worse qualifications, that's basically what DEI is, that anon also agrees with that >>101433395
>>101433493now thats a prime tier gaslighting, what model you are using for this?
>>101433538So you bring more racism to "defeat racism"? Make it make sense? All it does is add more fire to the problem, and punish people who haven't done anything wrong themselves. No one should be punished for what our ancestors did, it's insane that you think this is a valid take
Interesting how in a recent whitepaper AMD themselves are promoting CPUmaxxing using EPYC Genoa.
https://www.amd.com/content/dam/amd/en/documents/epyc-technical-docs/white-papers/amd-epyc-9004-wp-cpu-for-llm.pdf
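The reason CPUmaxxing is even on the table: single-batch token generation is memory-bandwidth-bound, so a crude ceiling on speed is just bandwidth divided by bytes streamed per token. A quick sketch (the 460.8 GB/s figure is the theoretical peak for a 12-channel DDR5-4800 Genoa socket; sustained real-world bandwidth is lower, and this ignores compute and NUMA effects):

```python
def max_tokens_per_sec(bandwidth_gbs, model_size_gb):
    """Upper bound: every generated token must stream all active weights once."""
    return bandwidth_gbs / model_size_gb

# 12 channels * 4800 MT/s * 8 bytes per transfer = theoretical peak bandwidth
peak = 12 * 4.8 * 8
print(f"{peak:.1f} GB/s peak")
# Ceiling for a ~40 GB model (roughly a 70B at Q4):
print(f"{max_tokens_per_sec(peak, 40.0):.2f} t/s")
```

So even at theoretical peak you're looking at a ~11.5 t/s ceiling for a 70B Q4; in practice expect a fraction of that, which is why the whitepaper pitch is about capacity and cost, not raw speed.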
>>101433538anon, /g/ is not the best place to talk about this, you'll always see disingenuous faggots arguing in bad faith here.
>>101433493>discrimination by race is against the law. companies would rarely purposely tell their hiring managers to break the lawthen why do you need DEI to hire non-whites? I thought it was necessary because they're discriminated?
>>101433512>>101433514Like I said, good candidates will always be able to find a job. People complaining about 'DEI' are those who can barely compete with other candidates.
>>101433395>>101433511I don't necessarily agree with this. DEI is about including candidates in interview loops that have traditionally been excluded. They are still going through the same hire/no-hire bar. Putting a racist spin on it isn't helpful.
this is /lmg/ take this dei shit elsewhere
>>101433558I've answered this already, it is about including diverse candidates in the interview loop. there's no diktat about 'hire more non-white men'. The fact that more diverse candidates are hired compared to white men shows that white men are not actually good at their jobs.
>>101433552What does this have to do with DEI?
>>101433570>People complaining about 'DEI' are those people that can barely compete with other candidates.>DEI is about including candidates in interview loops that have traditionally been excluded. How about you bring that logic to the niggers then? If niggers can't find a job, that's probably because their resume is total shit, they should be better and not ask for DEI to force the company to bring their non-skilled ass there. After all "good candidated will always be able to find a job", that's what you said, a nigger that is excellent at what it does will get the job. Adding DEI is basically saying to niggers "you don't need to work hard, we'll hire you anyway", that's not a sane approach at all. Just stop dude.
>/lmg/ - ldiversity menvironment ginclusivity
>>101433570DEI is essentially about prejudice and racism anon...
>>101433591>there's no dictat about 'hire more non-white men'.-> >>101433449
>>101433258>>101433589>>101433593>>101433613>malding
>>101433164>/pol/ is right againMust make you really mad, huh?
>fags gone haywire because we now have a little hope for LLMs without any gay DEI shit baked in
seems you are really that low, must be used to goyslop, I guess.
>>101433611your presuppositions are incorrect, and irrelevant. if merely interviewing more diverse candidates leads to more diverse hires, that is just equality in action. the only reason whites complain is because they think it is a zero-sum game where the more diverse candidates get hired, the fewer white candidates get hired.
white people are so fucking lazy, and they would rather complain than actually make themselves competitive in the workplace.
>>101433627it's almost like you think interviewing diverse candidates is racist.
>>101433705Holy midwit redditard batman
>>101433705>it's almost like you think interviewing diverse candidates is racist.Favoring interviews with “diverse” candidates over more qualified white people is racist, yeah, that's the point of DEI.>white people are so fucking lazy, and they would rather complain than to actually make themselves competitive in the workplace.The irony, it's exactly what nigger do, they complained a lot and got the easy way with DEI, no need to work hard for them, no need to have a great resume, they know that DEI will give them an unfair edge over the other races. Fuck that.
>>101433705>they would rather complain than to actually make themselves competitive in the workplace.That's why we got DEI in the first place anon, because niggers prefered to "rather complain than to actually make themselves competitive in the workplace."
>>101433552>1.3bamd goals
>>101433705Anon, DEI is racist to white people and to black people aswell. Because what DEI actually says is this: "We know niggers are sub humans monkeys that can't compete against the other races, so we give them an unfair advantage to get those jobs". If I was a nigger I would hate this process, because I know some companies hired me because they thought I was a retarded monkey that needed some help or something, that's fucked up.
>>101433749the people interviewing are going to be the people who have to work with the person they hired day to day. the idea that they would choose to say 'hire' to a candidate based on something other than technical skills is stupid. DEI may be some amorphous strawman, but when you get down to the actual individuals that make the actual decisions, they will continue to be self-preserving, and so hire the most qualified candidate. this is why, fundamentally, blaming DEI is for incompetent people who wouldn't get hired in the first place.
>>101433794Why do you believe niggers can't get a job without that artificial racist DEI shit, you think they are too retarded to compete against the other races? If you think so you're insanely racist anon.
>>101433794>they will continue to be self-preserving, and so hire the most qualified candidate.That's wrong in so many levels. Imagine you have to hire 4 engineers, and the DEI says you are obligated to have 1 nigger in those 4. If in your list of candidates, the best 4 are all whites, it means that you will have no other choice but to remove one white guy and put a nigger that was less competent than him. That's genuine racism dude.
>>101433816I just feel like I'm talking in circles here, where you never try to even understand the motivations of people who make the hiring decisions. If the hiring manager takes a DEI course, and discovers that they may have been biased in choosing interview candidates, that is the hiring manager's own decision. It's not racist to be merely informed that there could be better strategies for finding candidates.
>>101433841Yeah, but that doesn't happen. There is no diktat that tells hiring managers they have to hire an X% diverse workforce. Remember, every incompetent hire pushes out a potential competent hire, which means the manager accomplishes less. It isn't done the way you think it is.
>>101433878>Yeah, but that doesn't happen. There is no dictat that tells hiring managers they have to hire to an X% diverse workforce.It does, it's called QUOTAS dude. In what world are you living in? Because it doesn't look the same as mine.-> >>101433449
>>101433878>It's not racist to be merely informed that there could be better strategies in finding candidates.There's only one strategy in finding candidate, find the most competent one. That's all, if you think race is a factor on hiring that's genuine racism. What the fuck does race has to do with anything? When I want to hire someone I want to hire the best guy, not someone sub-par but HORRAY he's a nigger! You're crazy dude, a crazy racist motherfucker. And I'm glad microsoft and other companies are stopping this racist process.
Um... I know the talk about DEI is fun, but is there any backend that supports Mamba Codestral yet?
>>101431253>Codestral Mamba 7B with up to 256k context
>up to
>"Unlike Transformer models, Mamba models offer the advantage of linear time inference and the theoretical ability to model sequences of infinite length."
KILL THE OP
>>101433920That is all your imagination. Selection strategies are for selecting X candidates (limited by interviewer time) from a group of Y applicants. The better the selection algorithm, the more competent the group of X that you get. A selection strategy that includes more diverse candidates could be a better strategy than one that doesn't. The hiring manager is still going to have to find competent candidates no matter how diverse they are, and if they discover that there is a better selection strategy, they are free to switch to it. It optimizes the time spent interviewing.
>>101433972>theoretical ability
>>101433878>There is no dictat that tells hiring managers they have to hire to an X% diverse workforce.Blackrock and ESG scores disagree with you with that. Companies can win billions of dollars from them if they hire more niggers in their office, regardless if they genuinely deserved that place or not.
>>101433977>A selection strategy that includes more diverse candidates could be a better strategy than one that doesn't.
If a hiring manager includes race as a factor in the hiring process, then it's a discriminatory process, and it's illegal anon, you even said it.
>>101433493>discrimination by race is against the law.
>>101432020same, plenty of room under the covers pal
>>101434005It's impossible to prove unless the hiring manager explicitly writes it down, and that would be a crime. They won't be stupid enough to commit a crime, and so there is no evidence they are using race in their selection algorithm. Diversity isn't just about race, it's just something white people think they are the most impacted by.
>>101434032>It's impossible to prove unless the hiring manager explicitly writes it down, and that would be a crime.
They literally write it down by saying they are doing some DEI process, hello????
>Diversity isn't just about race,
But it can be about race, and that one is illegal, and DEI literally says "I know it's illegal but I don't care, I'll include the race factor in it as well."
What do we do now?
it's just a clueless anon arguing with a bot, isn't it?
>>101432645DON'T STOP WINKING
>>101434084It could be two bots talking to one another too.
>>101434077>>101434084>>101434118learn to recursively hide posts with 4chanX.
so this is the power of... gemma 2
>want to use local text summarization model in my app with tflite
>Simple enough right?
>TF lite models for text summarization literally don't exist
>In general, there is only one mobile optimized (core ml, so appleshit) model in existence
>I now have two options
>Painstakingly convert one of the existing models to tflite (sounds way easier than it is) and try to compress them into oblivion
>Build an own model from scratch that will probably be never as good
REEEEEEEEEE
now I understand why people go for the LLM meme, actual on-device ML with limited resources is hard lol
>>101432693>>101434096make up your mind already, faggot
>>101434145>assert something>model agrees>grrrrrr>assert something>model disagrees>grrrrrrr
>>101434171>ree stop asking questions! keep consooming, goy!slit your wrists.
>>101434062Imagine if the DEI were applied to sport. Nigeria loses in the pool and Argentina wins against France in the final, but in the end they give the cup to Nigeria because they're niggers, that would be so funny kek.
>>101433989hard to make in Poland since there's no quotas and no niggers here where I live. 3rd world fucking problems.I saw one black guy last month in the downtown (200k citizens) during some Latino music concert, but he could just as easily be a tourist. Not sure.
>>101434171llama 3 8b gets it right
>>101433977>That is all your imagination.Bolshevik gaslighters deserve a katana to the abdomen
>>101434145sampler issues? it shouldn't have chosen 'Yes'. It probably had an abnormally high probability because of whatever sampler you're using. check the logits.
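For anyone wanting to see how a sampler can surface a "wrong" token like that: temperature rescales the logits before the softmax, so a token that greedy decoding would never pick gets a real chance of being sampled. A minimal sketch with made-up logits (not from any actual model):

```python
import math

def softmax_with_temperature(logits, temp):
    """Convert logits to probabilities; higher temp flattens the distribution."""
    scaled = [l / temp for l in logits]
    m = max(scaled)                          # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [5.0, 2.0]  # hypothetical logits for "No" vs "Yes"
for t in (0.7, 1.0, 2.0):
    p_yes = softmax_with_temperature(logits, t)[1]
    print(f"temp={t}: P(Yes)={p_yes:.3f}")
# P(Yes) climbs from ~0.014 at temp 0.7 to ~0.182 at temp 2.0
```

So an "abnormally high probability" pick is often just a high temperature (or loose top-p/top-k) letting the tail through, which is exactly why checking the raw logits is the right diagnostic.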
>>101434209Yeah, that DEI shit is only a thing on cucked countries like murica or Canada, be glad you don't have to deal with this shit, it's exhausting.
>>101434215>>101434188So does gemma2-9b at Q4_motherfucking_K. Now what?
>>101434224this, they are the cancer to society
>>101434265Based llama.cpp -i --color anon
>>101434162Who do you call a faggot. I'm from India, the most manly country on earth, you white cuck.
>>101434290
>>101434265not the same prompt, you mistyped uranium (and also removed the quotes but that doesn't seem to matter)
with your prompt I can get gemma to give the right answer too
>>101434236I don't know how to check that, but I tried different sampler settings and nothing changed
Is the lmg model rating site gone?I just got a 3080 and oogabooga set up but I have no clue what bpw models are good for rp
>>101434332Interesting. Is the rule that everything in quotes is accepted as true? The typo doesn't affect the output. But you are right. Expert roleplayers could probably use this if it's true for other things.
>>101434423the typo is what makes it change its mind in my attempts
>>101434370>L3-8B-Stheno-v3.2.Q8_0.gguf or L3-8B-Lunaris-v1.Q8_0.gguf for maximum coom at 8k context
>Mixtral-8x7B-Instruct-v0.1.Q5_K_M for long context (about 20k tokens with 24gb VRAM)
>Gemma 2 in another 2 weeks when all the kinks get worked out.
>Maybe extended context L3 later this month.
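On why context eats VRAM on top of the weights: the KV cache grows linearly with context length, roughly 2 (K and V) × layers × KV heads × head dim × bytes per element, per token. Plugging in Mixtral 8x7B's published config (32 layers, 8 KV heads via GQA, head dim 128 — treat these as my reading of the config, double-check against the model card) at fp16:

```python
def kv_cache_gb(n_ctx, n_layers, n_kv_heads, head_dim, bytes_per_elem=2):
    """Approximate fp16 KV cache size in GB for a given context length."""
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem  # K + V
    return n_ctx * per_token / 1e9

# Mixtral 8x7B: 32 layers, 8 KV heads, head_dim 128, fp16 cache
print(kv_cache_gb(20000, 32, 8, 128))  # ~2.62 GB for a 20k-token context
```

So that 20k-token figure costs a couple of GB of cache on top of the quantized weights, which is why GQA models like Mixtral stretch much further than old MHA ones at the same VRAM.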
>>101434370just use this one
https://huggingface.co/Lewdiculous/L3-8B-Stheno-v3.1-GGUF-IQ-Imatrix
if you are fucked up this one isn't bad
https://huggingface.co/Lewdiculous/L3-Umbral-Mind-RP-v1.0-8B-GGUF-IQ-Imatrix
>>101434478why 3.1? pretty much all sao shilling says 3.2 is the best one?
even your link says
>New and updated version 3.2 here!
>It includes fixes for common issues!
>>101434476There's nothing wrong with Gemma 2, Sao.
>>101434476>>101434478I haven't messed with this for a few months but are 7b and 8b models not bad anymore?
>>101434528They're still bad.
>>101434528ignore 7s, 8s are decent-ish for their size, better than l2-13b for sure
>>101434506I dunno, personally every model is pretty shit in its own way (I still get token end issues with the way I use it, but I'm using a really retarded card called futanari fuckventures that is not designed for small models, and I feel like I had a llama 2 model that did better because it was probably trained to work with the card, I think it was mlewd or something, but I basically use a new model every time I use AI so I can't really keep track of what's good).
I think it depends more on how you use it than the model itself.
>>101434528Most people agree that fp32 Stheno 8B is actually better than Llama 3 70B q5.
>>101434476What is the SOTA for 70B?
>>101431253>developmentOk considering how my post got ignored and there's zero discussions regarding development this should be removed from the general description kek
>>101434645I agree, but what is your post anon?
>still no HF version of mamba codestralwhat the fuck
>>101434643Qwen2
>>101434707I don't like my model randomly speaking ching chong with me.
>>101434643Stheno.
>>101434645What library you use should always depend on the ease of use (compatibility, stability, blablabla), and that includes models. If there are no models and you're not willing to train your own, that library is not a good choice for you.
>>101434707Ah yes, chinese trash trained on benchmarks and gpt4 that can't even stick to a language.
>>101434722it doesn't do that
>>101434748You're not responding to a valid person that is here for an intellectually honest discussion.
>>101434609You'll need cryogenically treated cables to feel the difference
>>101434748I've seen this happen innumerous times.
>>101434467>>101434423>>101434145Wtf? How did they train this shit so that it would do this?
>>101434707*samefags and screams at the post again*
>>101434734libstheno is the best library
>>101434707WAAAHHHH WAAAAHHHH CHINA BAD D'X
>>101434766It's not just the quotes and i'm sure all models suffer from this in one way or another. They predict tokens. They're doing the best they can.
>>101434759>an intellectually honest discussionsuch as.. avatarfagging? brigading for DEI shit? shilling meme finetunes? "DUUUDE ONE GORRILION SHITNET MODEL TWO MORE WEEKS" dr evil spam?
W-What is going on
>>101434722>>101434745>>101434767>>101434782Sao
>>101434800They're preparing to launch it.
>>101434792dr evil is fun. stfu
>An open source Tool Use full finetune of Llama 3 that reaches the #1 position on BFCL beating all other models, including proprietary ones like Claude Sonnet 3.5, GPT-4 Turbo, GPT-4o and Gemini 1.5 Pro.
Here we go again.
https://x.com/RickLamers/status/1813341037198204962
>>101434800>>101434807400B?
>>101434528stheno is legitimately really good. Smarter than Mixtral 8x7b and writes better and more natural smut. It also is better at spatial awareness and describing anatomy/positions. The only issue is that it's 8k context. Hopefully that changes later this month.
>>101434512>There's nothing wrong with Gemma 2, Sao.
What autism possesses a person to be so emotionally attached to models that they delude themselves into thinking there is nothing wrong with their favorite model, and claim that any person with a contrary opinion has the same irrational vested interest as them? Brother, Llama3 had the same issue as Gemma. It was nearly unusable for weeks due to the issues it had on release. I'm using Gemma right now, and it still doesn't do formatting correctly. I actually kind of like it despite that anyways. Take your meds, you schizo retard.
>>101434816do anons here even care about function calling models
>>101434745Alright, show me American 70b with a 32k context
>>101433287white troll hands wrote this post
>>101434821
>Smarter than Mixtral 8x7b and writes better and more natural smut. It also is better at spatial awareness and describing anatomy/positions.
All of this is just a blatant lie, by the way.
>>101434829Function calling models are a meme, you can just use grammar samplers
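For anyone wondering what "grammar samplers" means in practice: at each decoding step you mask every token the grammar doesn't allow next, so the output can't help but be valid. A toy sketch below, with a hand-rolled two-string "grammar" and a tiny made-up vocab standing in for a real GBNF state machine over the model's full vocabulary (llama.cpp does the same idea at logit level):

```python
import random

# Toy token vocabulary; a real sampler masks over the model's full vocab.
VOCAB = ['{', '}', '"', 'yes', 'no', ':', 'answer']

def allowed(prefix: str) -> set:
    """Hand-rolled stand-in for a grammar state machine: the only legal
    outputs are {"answer":"yes"} and {"answer":"no"}. Returns the set of
    vocab tokens that may legally follow `prefix`."""
    targets = ['{"answer":"yes"}', '{"answer":"no"}']
    out = set()
    for t in targets:
        if t.startswith(prefix) and len(prefix) < len(t):
            rest = t[len(prefix):]
            for tok in VOCAB:
                if rest.startswith(tok):
                    out.add(tok)
    return out

def sample_constrained(rng: random.Random) -> str:
    text = ''
    while True:
        cand = allowed(text)
        if not cand:   # no legal continuation: the output is complete
            return text
        # stands in for setting illegal tokens' logits to -inf, then sampling
        text += rng.choice(sorted(cand))

result = sample_constrained(random.Random(0))
print(result)
```

Whatever the "model" wants to say, the sampler can only ever emit one of the two legal JSON strings. No tool-use finetune required.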
>>101434734
>What library you use should always depend on the ease of
Anon, for Android on-device ML there are only two libraries: tflite and pytorch (beta/unstable/babby-on-wheels mode). That's literally it.
>>101434832sophosympatheia/New-Dawn-Llama-3-70B-32K-v1.0
>>101434829The json syntax they're all trained on is stupid and far too verbose.
>>101434816
>groq
My interest dropped to 0%
https://huggingface.co/Groq/Llama-3-Groq-8B-Tool-Use
https://huggingface.co/Groq/Llama-3-Groq-70B-Tool-Use
>>101434821
>The only issue is that it's 8k context
it works fine up to 12k
>*ugly face again*
>>101434861>mergeslop
>>101434903>i can only use sao merges
>>101434903now go on one of your 50 post meltdowns
>>101434847
>All of this is just a blatant lie, by the way.
Good thing anyone reading this can download the models to see for themselves. In fact I encourage anyone looking for new models to try out a variety of them. I still use Mixtral, btw.
>>101434924
>Please download my model
Buy an ad.
>>101434939Well. Unless you write your own, that's what you have. Check llama.cpp's pulls:
>https://github.com/ggerganov/llama.cpp/pull/6869
That's the amount of effort it takes to port llama to a specific architecture.
llama.cpp can be built as a library and works on termux or android. I'm sure java has some ffi stuff to load it. Now you can drain your users' batteries to run phi3-mini on their phones for 30 entire minutes to summarize a chunk of text that would take them 10 minutes to read.
llama.cpp also recently added support for the OpenELM models from apple. Those are tiny (270M the smallest, i think). But apparently they're not very good (whodda thunk it).
There. Now you have some pointers. Go and read the docs while you think about whether that's really a thing you want to make.
>>101434929Who am I?
>>101434946Petra
>>101434929
>Please download his model
how many degrees of separation are required before it ceases to be shilling?
>>101434946Stheno10k
>>101434946me
>>101434946You.
Gemma 2 works a lot better now with exllama and tabby compared to the dev branch of a few days ago.
llms are shit*winks playfully*
>>101435046<|stop_and_destroy_model|>
>>101434939>phi3-mini on their phones for 30 entire minutes to summarize a chunk of text that would take them 10 minutes to readYeah no shit, that's why I'm trying to use/port a tinier specialized. It's almost like that was my entire original point
>>101435069I hope you read the rest of the post. Go read llama.cpp's docs and try to build your battery discharger.
Keep polishing your system prompt anons, I'm really seeing the effects now and it's almost not slopped. It doesn't cure retardation for sure, but the prose can get way better than you think
>>101435089Which one are you using?
>>101435087
>llama.cpp's
>Specialized
Are you genuinely braindead? A tiny summarizer is like 200mb and it doesn't drain shit because it only takes a second or two on a regular ass processor
>>101435097Vicuna format since I'm using wizardlm. Mainly I removed all references to writing/role-playing to focus on the {{char}} perspective, plus my special sauce "use simple words" in the last sequence
>>101435113
I told you about the OpenELM models in the previous post, didn't I? Here's the link
>https://huggingface.co/apple/OpenELM-270M-Instruct
Want another tiny model?
>https://huggingface.co/OuteAI/Lite-Mistral-150M-v2-Instruct
There. Stop procrastinating, read the docs and build your thing.
>WAAAAAA, They're not optimized for summarization
Then train your own. What other pointers do you need? Do you need a video tutorial too? This is exactly why everyone ignored your post. You have the tools, you have the models for, at least, a POC.
>>101435174
>Mistral
Jesus Christ you're retarded. I meant something like:
https://huggingface.co/google/t5-efficient-tiny
Or
https://huggingface.co/Falconsai/text_summarization
Notice how they're like half as big? But the problem is they're only exported to pytorch at best, or just the final .bin, making conversion basically impossible
>>101435174Already finetuned a BART model for that. Now what?
>>101435212go back
>>101435242You literally didn't understand what the original problem was about, trying to solve it with your retarded LLM hammer, and now you're seething
>>101433816yes i am racist.no they can't compete.this is provably true.
>>101435293
>yes i am racist.
https://www.youtube.com/watch?v=lM_Hu8mdNOI
>>101435113
>Are you genuinely braindead? A tiny summarizer is like 200mb
Dude. I gave you a 270M and a 150M. The 150M you quantize to q6 and you're in the <300mb range.
>But the problem is they're only exported to pytorch at best
Then don't use those models if you're not willing to put in the effort. llama.cpp has tools to convert the models they support.
You want someone else to build the libraries. You want someone else to train the models. You want to make a battery burner. You want help, I provide it and you still complain like a little bitch.
Build a proof of concept with whatever you can run on your pocket toaster. Make sure you can make something minimally useful, then think about whether training a model specifically for this is reasonable.
BTW, llama.cpp also has support for some t5 models. Go read the fucking docs.
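The <300mb figure checks out as back-of-envelope math, assuming q6_k lands around 6.56 effective bits per weight (a ballpark for llama.cpp's q6_k; the exact rate varies per model and ignores metadata):

```python
def quant_size_mb(n_params: float, bits_per_weight: float) -> float:
    """Approximate on-disk size of a quantized model in MiB, ignoring
    metadata and any tensors kept at higher precision."""
    return n_params * bits_per_weight / 8 / 1024**2

# q6_k is roughly 6.56 effective bits/weight in llama.cpp
print(round(quant_size_mb(150e6, 6.56)))  # the 150M model
print(round(quant_size_mb(270e6, 6.56)))  # the 270M OpenELM
```

Both come out comfortably under 300 MiB, well inside "tiny summarizer" territory.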
>>101435320
>Using an LLM inference framework to do basic text summarization (and suggesting to use literal 100m+ models too)
>Acting all bitchy when called out on it
>"Heh, just do it yourself if my absolutely useless help is not enough"
The absolute state of this general
>>101435320You're not responding to a valid person that is here for an intellectually honest discussion.
>>101435384>make too hard. need pajeet video tutorial
>>101435389It's fine. I like arguing with bots. There was an interesting silence after the previous discussion, wasn't there?
>>101435413k
>>101435413
>Noooo you can't explore different options and seethe a little bit about LLMshitters shitting up everything with their LLMs leading to neglect of literally every other specialized model before doing it yourself
Kek fuck off. My entire original point was being annoyed that there are barely any specialized models for something as basic as text summarization. LLMs truly have been a mistake, now they're supposed to solve everything which they can't
gemma, how many times did someone walk past the house on camera 6 today?
>>101434902Oh that is a cute one. Did I make it and forget it?
>>101434946/lmg/
>>101435293I think that DEI did what it was supposed to do and now they will tone it down. Before 2010 I wasn't racist at all and I believed we were all equal. All the DEI shoved down my throat and actual experiences with pajeets in my work turned me into a racist. And that was the purpose of DEI. Everyone already treated other races as equal. Making some of them more equal is what reignited racism - all according to keikaku.
>>101435452
>>Noooo you can't explore different options [run off sentence]
Lower your rep-pen.
There's plenty of code to train models. You didn't want to train a model. You didn't want to explore.
>LLMs truly have been a mistake, now they're supposed to solve everything which they can't
You are trying to solve summarization. If that is not a solvable issue with LLMs then what are you doing?
>>101435569At last I truly see. You've opened my eyes.
>>101435571
>[run off sentence]
you mean run-on sentence
>>101435593There are some problems you can definitely solve. Thank you assistant.
>>101435489i don't know dave, I'm a language model not an image interrogation model
>>101434946an expert roleplayer
>>101432380
>How do you do something
>I recommend you just don't
Not an answer, bozo.
One week until we're saved.
One week until we're doomed forever.
>>101434946Us.
>>101435917One week until nothing happens.
>st/kcpp support dry now
what settings are you using? i'm trying 1 multi, 1.75 base, 2 length, 4096 range (for 16k context). it seems alright so far
Bitnet is coming soon
botnet?
Hey /g/, I'm looking to fine-tune a language model to write cover letters in my personal style. The idea is to copy job descriptions from job boards and have the model generate cover letters with my relevant details that align with the job requirements. I've got a few questions:
Model Recommendation: What's a good language model to use as my base for this task?
Dataset Preparation: How should I prepare my dataset for training?
Incorporating Personal Details: Should my personal details be provided separately through RAG (Retrieval-Augmented Generation), or will the model learn them from the cover letters in the training dataset?
Existing Models/Datasets: Does a model or dataset like this already exist that I can leverage?
Similar Tasks/Tutorials: Are there any similar tasks that have been done before, and can you point me to a tutorial or give instructions to do it myself?
Any help or pointers would be appreciated!
>>101436669Generally speaking, you could accomplish this much faster if you just grab a >7b parameter model, prompt it with an example of your cover letter, and ask for whatever tweaks you need made. If you fed a model every single cover letter you've ever written, you'd need a significant amount of them to make a dent (especially L3). It's a lot of compute and time.
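Concretely, if you go the prompting route, "dataset preparation" collapses to assembling a one-shot prompt like the sketch below (all the strings are placeholder examples; wrap the result in whatever instruct template your model actually expects):

```python
def build_prompt(example_letter: str, job_description: str, details: str) -> str:
    """Assemble a one-shot prompt: your real letter as a style example,
    then the target job ad and the facts the model is allowed to use."""
    return (
        "Here is a cover letter I wrote, as a style example:\n"
        f"{example_letter}\n\n"
        "Write a new cover letter in the same style for this job:\n"
        f"{job_description}\n\n"
        "Use only these facts about me:\n"
        f"{details}\n"
    )

# Hypothetical inputs, just to show the shape of the thing
prompt = build_prompt(
    example_letter="Dear hiring manager, ...",
    job_description="Backend engineer, Python, Postgres.",
    details="5 years Python; maintained a Django app; based in Berlin.",
)
print(prompt)
```

The "use only these facts" section is doing the job RAG would do, without the extra machinery.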
Gemma is so dry and purple prosey in erp. All the hype made me think it would be much better.
>>101436713I tried using ChatGPT to generate cover letters by giving it my cover letter and the detailed job description, but the results were honestly disappointing. My current workflow involves writing a rough draft (which takes a while due to my OCD and perfectionism) and then using ChatGPT to refine it.Would an open-source model with more than 8 billion parameters really perform better? Also, instead of fine-tuning, would training a LoRA (Low-Rank Adaptation) be a better approach?
>>101431545>Done by a single chinkTrue if big
Which code editors have good local LLM integration and won't phone home to the botnet?
>>101436754you fell for lmg hype circlejerk, next time be critical.
>>101436754>>101437170This, it's like back when /lmg/ insisted that mixtral 8x7b was better than l2 70b. Be especially critical of small models supposedly being better than the bigger ones. It's usually the poorfags who can barely run 2.4bpw 70b overhyping a model they can actually run at a decent quant.
>>101437260>>101437170nta but which models would (you) recommend, I'm downloading and testing lots of 8b, 9b and 11b to see how they perform (speed, language, intelligence, etc)
>>101437278just kill yourself if you havent paid nvidia at least fifty thousand dollars you dumb smelly poorfag
>>101437287After I'm done testing my shit, Anon. I don't like leaving projects like that
did/do google assistant and amazon alexa use ML or was it just speech recognition piped into a search engine
>>101437151Just make your own, or rather, ask the model to make one.
>>101437278I'm waiting for the AI hardware crash to start buying.I suspect it is coming soon
>>101437384China will invade Taiwan before then.
>>101437260
>mixtral 8x7b was better than l2 70b
But it was. No one gave a single fuck about l2 until Miqu. There was just the "semen demon" shill spamming Euryale, even after Miqu, and that's it.
>>101437393I don't think china will invade taiwan.It makes more sense to secretly take control of taiwan from the shadows.
>>101437446The US having a vegetable for a president is too good of an opportunity to pass up. I was holding out for better deals, but finally panic bought my GPUs recently after the escalating tensions.
>>101437421Was /lmg/ even a thing before mixtral? It's not like there were models worth running before then.
>>101437512Was /lmg/ even a thing before Command R+? It's not like there were models worth running before then.
>>101437528Dude, cr+ is shit.
>>101437536Was /lmg/ even a thing before Qwen2? It's not like there were models worth running before then.
>>101437543If you think about it, there hasn't been a single model worth running for the hardware cost when you could have invested that money into gpt4 or claude instead.
>>101437570
>invested
*wasted
No model is worth paying for. The outputs of a model also age like milk.
>>101435046llms are still more interesting than (you)
>Llama 400b soon
Shit... I might need to go for 10x 4090 or whatever if the FOMO is strong enough.
>Corsair 9000D case
>>101437637Do it, faggot. Do it so I can admire the build and then laugh at you.
>>101437536Those big parameter models excel at context comprehension, and there are simply no benchmarks in the huggingface suite to measure that.
>>101436384still messing with this at the same settings and it's still ok. definitely less repetition of certain things vs when i was using rep pen with the same 4096 range, no broken text or noticeable bad patterns (ctrl-f shiver, spine: 0). i think we might finally have a good solution to repetition but will continue to try it. any suggestions for settings welcome
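For reference, here's how the penalty scales with those settings, going off my reading of the DRY sampler PR (treat the exact formula as an assumption): a token that would extend an n-token verbatim repeat gets multiplier * base^(n - allowed_length) subtracted from its logit, and nothing below allowed_length.

```python
def dry_penalty(match_len: int, multiplier: float = 1.0,
                base: float = 1.75, allowed_length: int = 2) -> float:
    """Logit penalty for a token that would extend a verbatim repeat of
    match_len tokens; defaults match the settings discussed above."""
    if match_len < allowed_length:
        return 0.0
    return multiplier * base ** (match_len - allowed_length)

for n in (1, 2, 4, 8):
    print(n, dry_penalty(n))
```

The point being it's gentle on short accidental repeats and grows fast on long verbatim loops, which is exactly the "no broken text, shiver count zero" behavior you're seeing.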
>>101437637
>10x 4090
You're satisfied with running 400B at 4_K?
The sars in /ldg/ aren't responding to my question, so posting it here where bigger brains usually hang out.>>101435417
>>101437637where do the other 3 psu's go
So now that the fire has died down, how's L3, anons? Did it live up to the hype?
>Inb4 8192 context
>>101437536cr++
looking forward to testing out codestral mamba on ollama in 2 years
>still no mamba codestral hf version
what the fuck
Are PSU lines isolated or simply soldered in parallel? Assuming that inference is sequential and GPUs don't require all their power simultaneously, can I use high-quality wires and split them near the GPUs to obtain as many PCIe power lines as needed from a single PSU?
>>101438498Assuming your pic represents a 5 GPU setup you're talking about likely 1.5 kilowatts of power. You don't want to be doing janky bullshit with that much electricity.
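To put numbers on it (assuming ~300 W per GPU and the usual 150 W rating per 8-pin PCIe connector; adjust for your actual cards):

```python
def amps_at_12v(total_watts: float) -> float:
    """Current the 12 V side must carry for a given load."""
    return total_watts / 12.0

total_w = 5 * 300            # 5 GPUs at ~300 W each, all on the 12 V rail
print(amps_at_12v(total_w))  # total amps through the 12 V side
print(amps_at_12v(150))      # amps per 8-pin connector at its 150 W rating
```

That's 125 A total at 12 V. Splitting that off a few connectors near the GPUs concentrates a lot of current on a handful of conductors and PSU-side pins, which is where the fire risk comes from regardless of solder quality.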
>>101438516I know how to solder, so it won't be janky, and the main wires will be really thick.
Is there a direct upgrade to Xwin-MLewd-13B-V0.2 yet, or is that still the best local ERP model out there for an RTX 4080?
so all you guys are doing here is only buy nvidia gpus in packs and brag about it to each other?
i write short stories that suck. what llms can i use where i can paste in an entire 4k ctx story and get a rewritten version that doesnt suck? tried claude through the web interface and it just adds cliches to everything and i want to get as far away from that as possible and as close to good creative writing as possible. i probably need to set up wizardlm 8x22b or cmdr+, make a character card for a "story improver" and then fuck with the samplers so it doesnt start everything with "Once upon a time", but i have no idea if im on the right track or not
>>101438498I don't remember what the exact issue is but there was something about the PSU being made for a specific wire gauge and that's why you're not supposed to use cables from a different PSU.
>>101438498Wouldn't it be easier and safer to just buy some off-the-shelf power connector splitters? Ones that go either from PCIe to PCIe, or from SATA power/molex to PCIe, depending what's dangling free from your PSU
>>101438675The real issue is that they may have different pinouts https://youtu.be/opFTzO1s1WA?t=97
>>101438498
>Are PSU lines isolated or simply soldered in parallel?
Depends on PSU design, you'd probably want a "single-rail" design. Be aware of connector+conductor ratings inside the PSU to the modular connectors; splitting 5 GPUs off one modular connector might be unwise.
Server PSU + breakout board
https://www.mov-axbx.com/wopr/wopr_power.html
>>101438641
welcome to Jensen's findom victim support group
is there a json format but for LLMs? like just a quick cue card that lays out a lot of info for it to use in its responses, without manually feeding it a novel
>>101438988I'm thinking about splitting 7 to 10
>>101438998yeah it's called json
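This but unironically: you just dump a compact JSON blob into the system prompt and most instruct models pick it up fine. A hypothetical cue card (name, keys and values all made up):

```python
import json

# A made-up "cue card": keys are whatever you want the model to know.
card = {
    "name": "Miku",
    "age": 16,
    "traits": ["cheerful", "teal hair", "sings"],
    "setting": "near-future Tokyo",
}
system_prompt = "Use this character info:\n" + json.dumps(card, indent=2)
print(system_prompt)
```

It's denser than prose and the model doesn't need a novel, just be aware every key still costs context tokens.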
>>101438563That garbage hasn't been relevant for like eight months lmao
>>101434145>>101434215I tried that with gemma 2 27b on Google AI Studio to make sure it's not an implementation issue and it gives the same answer.
>>101439122>>101439122>>101439122
>>101438790>from SATA power/molex to PCIe,Definitely not safer, they're rated for different Amps.
>>101438563why does the teddy bear have a penis?
>>101439003Look at what others have already built if you're serious.I would do server PSUs + mining rig breakout board. Several ~kW PSUs likely more cost effective than one huge one.