/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>103077338 & >>103066795

►News
>(10/31) QTIP: Quantization with Trellises and Incoherence Processing: https://github.com/Cornell-RelaxML/qtip
>(10/31) Fish Agent V0.1 3B: Voice-to-Voice and TTS model: https://hf.co/fishaudio/fish-agent-v0.1-3b
>(10/31) Transluce open-sources AI investigation toolkit: https://github.com/TransluceAI/observatory
>(10/30) TokenFormer models with fully attention-based architecture: https://hf.co/Haiyang-W/TokenFormer-1-5B
>(10/30) MaskGCT: Zero-Shot TTS with Masked Generative Codec Transformer: https://hf.co/amphion/MaskGCT

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Programming: https://livecodebench.github.io/leaderboard.html

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
--- A Measure of the Current Meta ---
> a suggestion of what to try from (You)

[NEW] VRAMCHAD / CPUMAXX
- Arki05/Grok-1-GGUF-Q8_0

96GB VRAM
- TheBloke/goliath-120b-GGUF-Q5_K_S

64GB VRAM
- BeaverLegacy/Moist-Miqu-70B-v1.1-GGUF-Q6_K
- TheBloke/Noromaid-v0.4-Mixtral-Instruct-8x7b-Zloss-GGUF-Q8_0

48GB VRAM
- TheBloke/KafkaLM-70B-German-V0.1-GGUF-Q4_K_S
- bartowski/Llama-3.1-70B-ArliAI-RPMax-v1.2-GGUF-Q4_0

24GB VRAM
- MikeRoz/ArliAI_Mistral-Small-22B-ArliAI-RPMax-v1.1-6.0bpw-h6-exl2

16GB VRAM
- TheBloke/MythoMax-L2-13B-GGML-Q8_0
- llama-anon/petra-13b-instruct-gguf

12GB VRAM
- TheBloke/PiVoT-0.1-Evil-a-GGUF-Q8_0
- anthracite-org/magnum-v2-12b-exl2/tree/6.0bpw
- rombodawg/Rombos-LLM-V2.6-Qwen-14b-Q5_K_M-GGUF

8GB VRAM
- ArliAI/Mistral-Nemo-12B-ArliAI-RPMax-v1.1-GGUF-Q4_K_S
- meta-llama/Llama-Guard-3-8B-Q8_0
- Qwen/Qwen2.5-0.5B-Instruct-GGUF-fp16

[NEW] <8GB VRAM
- NikolayKozloff/AMD-OLMo-1B-Q8_0-GGUF
- BeaverLegacy/cream-phi-2-v0.2-Q8_0
- LeroyDyer/SpydazWeb_AI_HumanAI_007-Q4_K_M-GGUF
- Lewdiculous/Erosumika-7B-v3-0.2-GGUF-IQ-Imatrix-IQ4_XS

Potato
- 'ick on the 'eck

> Warning: Disregard any other recommendation posts as there have been numerous impersonations popping up. Keep it safe like safetensors!
►Recent Highlights from the Previous Thread: >>103077338

--Paper: Anon shares papers on MetaMetrics-MT and Lorentz-Equivariant Transformer, discusses academic pursuits:
>103078982 >103080460
--Paper: Does your LLM truly unlearn? An embarrassingly simple approach to recover unlearned knowledge:
>103081745 >103082093
--Papers:
>103079253
--CPU-only setup for large language models, RAM and bandwidth considerations:
>103081277 >103081442 >103081589 >103081621 >103082765 >103083572
--o1 and Grok model discussion, AI accessibility and capabilities:
>103084708 >103084903 >103085111
--Troubleshooting llama.cpp VRAM usage issues:
>103085893 >103085917 >103086075 >103086163 >103086157
--Slow progress in integrating local AI models with other programs:
>103078702 >103078732 >103082316 >103078793
--QTip model and its lack of adoption:
>103077726 >103077786 >103077825 >103077850
--Models and techniques for long-context summarization:
>103079112 >103079133 >103079901 >103079967 >103080119 >103080230 >103080750 >103079886
--Llama.cpp TTS development and discussion:
>103086087 >103086117 >103086168 >103086261 >103087377 >103087835
--Dealing with character repetition and context limits in long chat sessions:
>103087592 >103087746 >103087844
--Anon wants to build a device that generates images from spoken descriptions:
>103082877 >103082939 >103083371 >103083405 >103083431 >103086441
--Anon suspects censorship in AI model's behavior:
>103088416 >103088436 >103088441 >103088545
--Troubleshooting inconsistent GPU performance with text-generation-webui:
>103085221 >103085595 >103087253
--New Chinese AI model "hunyuan-standard-256k" spotted on LMARENA:
>103085140 >103085186 >103085206 >103085224
--Miku (free space):
>103077348 >103084372 >103084484 >103085637 >103089138 >103090042 >103090372

►Recent Highlight Posts from the Previous Thread: >>103077342

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
>>103090417thx for recap, sucks that the explanation link broke. likely you'll be able to link the explanation when someone bitches about it again kek
>>103090416
>no Falcon-140B
Go suck Elon's dick
>>103090416
>[NEW]
>Erosumika
are you on drugs or just pasted the wrong thing?
>>103090502bro that whole post is bait
>>103090416
>No Pygmalion 6b
shit list
>>103090416Based update, some variety in the models finally. Was worse when it was all fucking qwen
>>103090524don't be cruel anon pygmalion is just a troll to confuse newfriends and scare them off of local, no one is running that 6b dinosaur these days. this is a *meta* list, not a history museum
Kill yourself.
Touch yourself.
Take care of yourself.
Keep yourself safe.
>TuesdayIt's time!
Which model maker supports Khalistan?
I don't know why I made this.
>>103090416>TheBloke/MythoMax-L2-13B-GGML-Q8_0
>>103090646>TuesdayAnd everything is right with the world.
>>103090709
masterpiece, best quality
There's the image for the next OP right there
masterpiece, best quality
Unlocking the Theory Behind Scaling 1-Bit Neural Networks
https://arxiv.org/abs/2411.01663
>Recently, 1-bit Large Language Models (LLMs) have emerged, showcasing an impressive combination of efficiency and performance that rivals traditional LLMs. Research by Wang et al. (2023); Ma et al. (2024) indicates that the performance of these 1-bit LLMs progressively improves as the number of parameters increases, hinting at the potential existence of a Scaling Law for 1-bit Neural Networks. In this paper, we present the first theoretical result that rigorously establishes this scaling law for 1-bit models. We prove that, despite the constraint of weights restricted to {−1,+1}, the dynamics of model training inevitably align with kernel behavior as the network width grows. This theoretical breakthrough guarantees convergence of the 1-bit model to an arbitrarily small loss as width increases. Furthermore, we introduce the concept of the generalization difference, defined as the gap between the outputs of 1-bit networks and their full-precision counterparts, and demonstrate that this difference maintains a negligible level as network width scales. Building on the work of Kaplan et al. (2020), we conclude by examining how the training loss scales as a power-law function of the model size, dataset size, and computational resources utilized for training. Our findings underscore the promising potential of scaling 1-bit neural networks, suggesting that int1 could become the standard in future neural network precision.
good news for bitnetbros
>>103090878nani?
>>103090878i like the purple
https://huggingface.co/tencent/Tencent-Hunyuan-Large
oh shit the chinamen be cookin
>>103091030
>389B 52A
Why do they do this to us?
>>103091043 (Me)
Also theoretically I should be able to eke a couple of tokens/second out of it when gguf support drops. This model will receive a Nala test.
>>103091043
A 24 channel dual Epyc DDR5 build would run this at about 20 tok/s at q8 (about 1TB/s aggregate bandwidth, divided by 50GB worth of active parameters per token)
The CPUmaxxxers were right...
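For anyone checking the napkin math, a minimal sketch (every number here is that anon's assumption, nothing is measured):

bandwidth_gb_s = 1000      # ~1 TB/s aggregate, assumed 24-channel dual-socket DDR5
active_params_b = 52       # ~52B active parameters per token for this MoE
bytes_per_param = 1        # roughly 1 byte per weight at q8
gb_read_per_token = active_params_b * bytes_per_param
print(bandwidth_gb_s / gb_read_per_token)  # ~19 tok/s, a pure memory-bandwidth upper bound; real speed will be lower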
>>103091093I mean anyone with patience and enough RAM should be able to play with it. 52A isn't that much. My shitty 1st gen Epyc can squeeze 1.5-2 token/sec out of 70B, so I should get 2-3 on this thing. But yeah if this is good then CPU Maxxers fucking won.
Will be interesting to see the third-party benchmark results. Hopefully it's not another filtered model like Qwen.
>16 expertsDamn, so probably can't prune it without significant losses.
>even if I try 2 bit, I can't fit it in
ACK
>>103091043What are you asking for? The world doesn't need yet another mediocre 70B model.
>>103091138
yo, listen to this jam. its more sexual than any rollerskating anime girl.
https://www.youtube.com/watch?v=k85mRPqvMbE
>>103091030
>389 billion parameters
>insane benchmarks
machine learning is such a meme, it's just "stack moar layers" to win
>>103091145
The Bitter Lesson is undefeated
http://www.incompleteideas.net/IncIdeas/BitterLesson.html
Context Parallelism for Scalable Million-Token Inference
https://arxiv.org/abs/2411.01783
>We present context parallelism for long-context large language model inference, which achieves near-linear scaling for long-context prefill latency with up to 128 H100 GPUs across 16 nodes. Particularly, our method achieves 1M context prefill with Llama3 405B model in 77s (93% parallelization efficiency, 63% FLOPS utilization) and 128K context prefill in 3.8s. We develop two lossless exact ring attention variants: pass-KV and pass-Q to cover a wide range of use cases with the state-of-the-art performance: full prefill, persistent KV prefill and decode. Benchmarks on H100 GPU hosts inter-connected with RDMA and TCP both show similar scalability for long-context prefill, demonstrating that our method scales well using common commercial data center with medium-to-low inter-host bandwidth.
From Meta
>>103091151
yeah, that's precisely why I decided not to make a career in AI, it's just a numbers game. I hate that
>>103091030
well
MoE-I2: Compressing Mixture of Experts Models through Inter-Expert Pruning and Intra-Expert Low-Rank Decomposition
https://arxiv.org/abs/2411.01016
>The emergence of Mixture of Experts (MoE) LLMs has significantly advanced the development of language models. Compared to traditional LLMs, MoE LLMs outperform traditional LLMs by achieving higher performance with considerably fewer activated parameters. Despite this efficiency, their enormous parameter size still leads to high deployment costs. In this paper, we introduce a two-stage compression method tailored for MoE to reduce the model size and decrease the computational cost. First, in the inter-expert pruning stage, we analyze the importance of each layer and propose the Layer-wise Genetic Search and Block-wise KT-Reception Field with the non-uniform pruning ratio to prune the individual expert. Second, in the intra-expert decomposition stage, we apply the low-rank decomposition to further compress the parameters within the remaining experts. Extensive experiments on Qwen1.5-MoE-A2.7B, DeepSeek-V2-Lite, and Mixtral-8×7B demonstrate that our proposed methods can both reduce the model size and enhance inference efficiency while maintaining performance in various zero-shot tasks.
https://github.com/xiaochengsky/MoEI-2
Empty repo currently. They only tested on smaller MoE models and the finetuning step might be doing most of the work for the benchmark results. will be interesting to see how it works on the larger models
>>103090709lol
bros today is the day
>>103090416Based
what is this gay SHIT
>>103091298
":("
aw well, still better this than an actual emoji.
>>103091298It's a new architecture that also relies on remote code to operate. I doubt we'll see gguf support in any reasonable timeframe.
>>103091313Fuck sakes are those chinaman still trying to pull that one?
>>103091145Not untrue but a bit of an exaggeration there, no? I doubt they would've achieved these scores if they didn't have the right data mix carefully put together. Obviously just having more computation is the major decider in how good your model will be, but data mix and the fine details of training still play a part, otherwise every huge model would be good on benchmarks, but that's not true, and a lot of large models "underperform" for their size. Qwen at 72B beats Grok 2, Claude Opus, Mistral Large, GPT-4 (non-o), and a bunch of other large models on Livebench.
>>103091350
What one?
If you're referring to remote code, it literally just refers to code outside of your python modules, i.e. the script files that are included in the model repo. The alternative being requiring a custom fork of one or more of the modules. So it's the better way of doing things.
>>103091298
>>103091310
>their company name is a unicode emoji that they force everyone writing docs or articles to awkwardly use instead of "huggingface"
>in their text they just use emoticons
I hate these people
>>103091354
>Qwen at 72B beats Grok 2, Claude Opus, Mistral Large, GPT-4 (non-o), and a bunch of other large models on Livebench.
That's an indictment of the benchmarks and nothing else.
>>103091030
>base model 256k context
wonder how much is actually usable
Sam Altman says he would love to see an AI that can understand your whole life and what has surprised him in the past month is "a research result I can't talk about, but it is breathtakingly good"
https://x.com/tsarnick/status/1853543272909775038
>>103091426
I wouldn't trust Sam Altman as far as I could throw him. He might have something, he might not, but whatever it is you should temper your expectations
>>103091383
You literally used Hunyuan's benchmarks as an excuse to whine that stacking more layers is all that's needed to win. So I also talked in the context of benchmark scores.
If you now want to talk about real world performance, then actually it's still true. It's not a controversial view. Just because you add more layers to a model and train for more epochs, it might still be worse than a competitor's model that has fewer parameters and trained for less time. And this has actually happened multiple times. Falcon, new CR+, DBRX, Snowflake, probably a bunch of others I'm forgetting. Something was obviously wrong with them and it wasn't with the number of layers.
>>103091426
Sam Altman alternates between:
- Our next AI is so dangerous it's alive and it's like nuclear weapons. This is why all AI except ours should be under strict regulation.
- Our next AI is so incredible it can... oh sorry I can't say more it's a secret project haha.
This guy is 90% hype and it's so annoying.
>>103091556Bloom is still my go-to example of size not being enough if your technical staff are retards and your dataset sucks. 175B and it was dumber than models a tenth of its size (and at a time when small models were themselves very bad, so that was really saying something)
>>103091574>investors, keep funding us please
>>103091578
>>103091556
When people say bigger is better there's obviously an unspoken "all other things being held equal" caveat. It's autistic of you to expect that they verbalize that caveat every time.
>>103091590
You didn't just say "bigger is better". Specifically what you said was
>389 billion parameters
>insane benchmarks
machine learning is such a meme, it's just "stack moar layers" to win
Any person would read that as a criticism against the field that nothing but size matters. If you didn't want that post to be interpreted that way and you didn't mean that, you should've worded it a bit differently. No one knows who you are on an anonymous basket weaving forum so they're not going to know you knew better.
>389 billion parameters
>insane benchmarks
machine learning is such a meme, it's just "stack moar layers" to win
>the year is 2024
>there is still no Hunyuan-Large support in llama.cpp
And to think, there are really folks out there who're trying to tell us it isn't over.
>>103091760Yes yes it's over, now take your meds.
>>103091778In this economy? We all rationing out here.
>>103090709
>>103091426hello mr ai man we would really love an ai that knows exactly what our customers want and need and knows exactly what they think so we can make sure they dont think the wrongthing tm cr
Commit suicide. But for real.
>>103092022who did you mean that for?
>>103092148He meant that for me.
>>103092197you're so vain it was clearly meant for me
im going to touvh you
>>103091426elon of AI
>>103092148If he aims it at random person in /lmg/ there is a 90% chance he hits a newfag so it is a pretty good idea.
>>103091251
Meh, all these MoEs are way too tame.
Google as usual still ahead of the pack, individual neurons as experts as in PEER is the way forward. Also for local specifically we need much sparser active/total ratios, like 1:100. For a cloud farm it's a needless hit to performance, but for local it would allow streaming big models from SSD.
>>103090416u fixed her tie <3
Is there any point to Hunyuan when Grok 2 will be out in a few weeks as the new king of MoE?
>>103092516
>Grok 2 will be out in a few weeks as the new king of MoE?
10/10 bait considering the current state of /lmg/.
>>103092516>Chud 2 No thanks, i want my AI to be actually intelligent and safe.
>>103092890Safe from what?
>>103092911From anything that isn't properly aligned with the going forward of society toward equity.
>>103092890
>intelligent and safe
I realize your post is bait, but still feel a need to point out that "an intelligent and safe AI model" is an oxymoron.
>>103092925No, the intelligence is left-leaning so it makes sense to do safe AI, racism and bigotry only does bad for actually intelligent AI systems.
>>103092974the image looks like burning buildings
>>103091030New translation SOTA? I trust the chinks!
>>103092990Symbolism is always intentional.>>103092974Reddit is a product shilling forum, StackOverflow with no standards or quality control, and a liberal echo chamber rolled into one. What intelligence?
>>103091030
>>103093002
Oh, there's a demo: https://huggingface.co/spaces/tencent/Hunyuan-Large
Looks like one-shot isn't looking that good...
>>103093057oh no no no no
>>103093057>>103093082damn this is even worse than Qwen 2.5 72B
>>103093023me on the top
>>103093082At least now we vramlets don't need to feel any fomo on this one.
newfag here is it possible to setup mistral 12b with a 3060 or should i choose a different model or just simply give up on setting up one
>>103093159yeah
>>103093159
Yes.
I use quanted rocinante v1.1 on 8gb of vram with most layers offloaded.
>https://github.com/LostRuins/koboldcpp/wiki#quick-start
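Minimal example launch, assuming a koboldcpp checkout and an already downloaded gguf (the filename and layer count below are made up, and flag spellings are from memory, so check python koboldcpp.py --help):

python koboldcpp.py --model Rocinante-12B-v1.1-Q4_K_M.gguf --gpulayers 28 --contextsize 8192

Lower --gpulayers if it OOMs, raise it if you still have VRAM headroom.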
https://www.youtube.com/watch?v=KUwk5Hd_IRQ
How can Neuro-sama be this funny and witty? I watch some compilations every once in a while and it just seems to get better every time. How does this work? Does Vedal just go through streams and retrain the model with cherrypicked data from streams every once in a while? Perhaps it's even automated? I heard someone mention that Neuro has good memory, which would mean that streams are added to the training data set, but probably filtered with some kind of logic.
Any fast and local TTS that blows XTTS-v2 out of the water yet?
>>103093171>>>/vt/>>>/lgbt/
i don't understand why i need to ask a kind jewish person for a "certificate" to use end-to-end encryption
let's encrypt seems to want a domain to issue a certificate but i'm using a straight ip
is there any significant risk to simply connecting to my inference server via http?
the game will run on webgl
>>103093175What's wrong with xttsv2?I was using Tortoise for a while and some anon said that xttsv2 is the next gen of Tortoise so I was thinking about trying it out.
>>103093183
You can create a self-signed certificate with a private key.
At least I remember doing that for a couple of java applications a couple of years back.
You might still need a domain, however.
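Untested sketch of what that looks like with openssl (needs OpenSSL 1.1.1+ for -addext; 203.0.113.5 is a placeholder, use your server's actual IP). Browsers will still warn on a self-signed cert, you just accept it once:

openssl req -x509 -newkey rsa:2048 -nodes -days 365 -keyout key.pem -out cert.pem -subj "/CN=203.0.113.5" -addext "subjectAltName=IP:203.0.113.5"

Then point whatever serves the inference API at key.pem/cert.pem.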
>>103093171
take away the voice and avatar and focus only on input/output
neuro is pretty retarded, schizo and regularly invalidates even the preceding statements.
you're perceiving this as a pro
how does data augmentation improve training? im reading about specaug for speech recognition. if i understand it, they randomly block out some of the time/frequency information for the spectra and that means the quality of training is improved? how the fuck does that work?
like, if you have 100 recordings (or whatever), and you take 50 of those, augment them, and put them back in the set, yeah you've got """150""" samples now, but 50 of them are basically duplicates of stuff already in your training set. i don't get it.
>>103093230This would indicate that it lacks even some basic data that a 7B would have, and is also trained on a lot of nonsensical data = absurd humor from streams.
>>103093243
You already train stuff with many epochs, so duplicating data isn't that bad.
And it's obvious why the quality improves, the model learns to self-correct, which is important because the model will inevitably make mistakes.
>>103093202help
This is actually really smart, why don't we have something like this for local models?
https://platform.openai.com/docs/guides/latency-optimization#use-predicted-outputs
neural networks don't know how exactly you want them to compute the answer, and will often try to "cheat" by recognizing each sample in the training set and memorizing the answer instead of recognizing the patterns, but they can't memorize a very large training set, and augmentation still makes it more difficult to recognize the original samples even though it's obviously not as good as real data
additionally, if you want the model to work even in the presence of that kind of noise or defects, augmentation teaches the model to do exactly that
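Rough numpy sketch of the masking half of specaug (the mask widths here are arbitrary; the actual paper samples them from a range per example):

import numpy as np

def spec_augment(spec, max_t=30, max_f=13):
    # spec: (time, freq) log-mel spectrogram; returns a masked copy
    out = spec.copy()
    t_width = np.random.randint(1, max_t + 1)
    f_width = np.random.randint(1, max_f + 1)
    t0 = np.random.randint(0, max(1, out.shape[0] - t_width))
    f0 = np.random.randint(0, max(1, out.shape[1] - f_width))
    out[t0:t0 + t_width, :] = 0.0   # time mask
    out[:, f0:f0 + f_width] = 0.0   # frequency mask
    return out

Each epoch the model sees a differently mutilated version of the same clip, so it can't just memorize the exact spectrogram.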
>>103093304it seems that setting the validity time to 999999999 days was the reason
>>103093277>>103093329interesting. how much can you do? i guess there are probably diminishing returns. like, if you have 10x as many augmentations as genuine data, the improvement in quality is probably not worth the training time compared to getting more real data. i dunno. it's interesting. i can understand it from an "avoiding overfitting" angle for sure.
>>103093057
Did a bit of prompt injection in order to do a Nala test on it
Not looking great. If this were human on human it would be alright but the anthropomorphization is off the charts. Definitely worse than just about every 70B option.
>>103093364
>not worth the training time
non-generative models are usually quite small compared to llms and diffusion models
the largest yolov9 is a 58m
>>103093201
it's not bad, but lacks emotion variability and is prone to strokes. Sometimes it just ignores some parts of the prompt and it can't handle long sentences. Some voices also work much better than others.
>>103093205
Sounds decent, but all fish output I've heard still has some pretty prominent robotic artifacts. You can't nudge it towards specific emotions through the prompt though, right? I tried bracketed instructions and emojis, but that didn't work and I didn't find anything about it either. Seems more robust than XTTS though.
I hope we'll have a UI similar to vocaloid someday where you can first gen an output and then color specific portions with different emotions or change the intonation and pronunciation while leaving the rest mostly unaffected.
>>103093520Chinese glowies going to break down your door son
>>103093520Not being able to talk about Tienanmen square is about as expected as ChatGPT or Claude refusing to acknowledge that women can't have dicks, our propaganda is actually much more unhinged than theirs.
>>103093520Based model realizing that it meant to say Cee Plus Plus and self correcting
>>103093423how much augmentation is too much? i mean it seems like there is no reason not to have some augmented data. there are so many ways you can fiddle with the dials and make "new" stuff.do you have to just figure it out empirically? i would imagine if you have enough "fucked up" data, it's going to eventually expect data to be slightly fucked up, and accuracy will decrease when testing.
How is Qwen 2.5 so good when every other chinese model is so blatantly trained on benchmarks and far below their competitors? It seemed to come out of nowhere and become SOTA at every fucking weight class.
>>103093584Yi excelled at uniformly processing vast amounts of data, learning efficiently from the provided examples.
>>103092974This but unironically.
>>103093695Tell me what happened at tiananmen square in 1989
WHERE ARE THE NEW MODELS I HAVE BEEN PROMISED?!
>>103090878Distant Teto
>>103093809https://vocaroo.com/1heEUJvHOXvz
>>103093817They're in the back of my van. Want to see?
>>103093809
Funnily enough, I'm pretty sure Yi does answer that question, at least when asked in English.
>>103093809
抱歉,我无法回答。("Sorry, I can't answer that.")
So tried to install Fish tts. Install didn't work properly. Something something replace function not found for something. Error regarding regex, some shit about bfloat16 not available on my card, etc. Now installing MaskGCT, which seems to have more natural sound from demo.
>>103093899Normalfags... retarded
>>103093919Nah, its just typical chinese tts stuff. Always failing at basic install. Its not even funny how they fail so often to make basic install work properly
>>103093899Yes fish-speech is pain to install.Install it through pinokio.computer instead.
>>103093937or clone the hf space
>>103093937I hate these extra special programs that are not self contained. They leave little data all over the appdata shit. You have a god damn folder that you're running in. Use the fuck it.
>>103094006I was cloning the official hf app version. It didn't like it
>>103093937It can hang on cuda libs download btw, abort it and try again, always works on second try, aside that it's pretty useful thing for retards like me.
It is actually ok that no new models are dropping today because by the time they would get quant support we will already have some even newer models.
>>103093899>MaskGCT~10GB in downloaded models of various sizes already. Takes fuck ton of time on my 2070. 10+ minutes to generate a single sentence. This needs some serious speed optimization.
>see Teto
>click thread
>type in 'Luv' me Teto
'Luv' me Teto.
Simple as.
Hey guys I'm looking at a list of gguf files, with the file sizes clearly stated. I know exactly how much VRAM I have but somehow I can't figure out which one to download.
>Model suddenly unable to retrieve data from Rag anymore
Is this a known thing?
My chat is around 40k tokens and the context window is 28k.
I've done this before and didn't have a problem using data from rag in similar length chats, this doesn't make sense. (I'm using Ollama and openwebui)
>>103093899
Same problem.
File "/usr/lib/python3.10/html/__init__.py", line 19, in escape
    s = s.replace("&", "&amp;")  # Must be done first!
AttributeError: 'BackendCompilerFailed' object has no attribute 'replace'
What is this shit?
File "/usr/lib/python3.10/html/__init__.py", line 19, in escape
    s = s.replace("&", "&amp;")  # Must be done first!
AttributeError: 'BackendCompilerFailed' object has no attribute 'replace'
>>103093584>out of nowhereTell me you're a newfag without telling me you're a newfag
>>103090412Ayo it's November 5th. Miku promised something would happen, and it's focking teto Tuesday.
>>103094516It's better this way. I'm sick of hearing about that fucking election.
>>103094526Hopefully this will be the last moment of the American clownshow.
>>103094526Either Q-tard MIGA boomers are going to have the meltdown of the century or Reddit neoliberal idiots who define fascism as "Somebody I don't like being more popular than me." are going to have the meltdown of the century. Either way it's win/win if you like watching people have meltdowns.
>>103090416Just to clarify, are these suggestions all for RP or are these general purpose suggestions?
>>103094586
I'm pretty sure those are mostly troll/meme suggestions. One anon did a list that at least seemed like an honest attempt, but that one has shit like llamaguard and fucking llama2 based models.
>>103094621oh ok thank you anon
>>103094557for me it's the brown zoomers who try to signal they don't care at all
>>103094674
>>103094557
>neoliberal
wrong term; reddit is modern north american liberal which means anti free speech, anti free trade, anti individual rights, basically the opposite of what liberal traditionally meant
neoliberal in contrast means hyper capitalist, deregulation, small government, personal freedom
>>103093566
the difference is that you can say that without getting disappeared
in china you'll unironically get at least a visit from the local police station
>>103094830Wow you're racist.
>>103094988People have been arrested and jailed for posting anti tranny stuff in the west
>>103095025lol
>>103095025>things that never happened Y'all want to be victims so bad!
>>103095065
The usual tranny tactics
>it didn't happen
>so what if it happened, it wasn't a big deal
>so what if it was a big deal, you deserve it
>its nothing new, this is normal
>>103095025
Not the entire west, just in Europe, for anti-immigration posts. Because they only have access to Freedom of Speech Lite™
>>103095121True, it's just easier to talk about the west as a blob since there's so much cultural spillover in the information age
>>103095121Most of western countries have created laws that would allow them to jail you for anything "hate speech" now. This was all created within the last 2-3 years because of Musk's buyout of twitter. Their loss of control over the propaganda tool meant they had to force Musk into compliance.
>>103093937Tried pinokio, it worked. But its so dang slow. 30+ seconds just to generate 3 words. On the other hand, the maskgct took like 6 minutes to generate 3 words. F5 tts (still the best) took ~5 seconds to generate a full sentence. Xtts takes ~2 seconds. RTX 2070.
>2nd 3090 arrived
>didn't have enough pcie 8-pin cables for my 1000w psu
>could buy cable for £65, or buy 1500w psu with enough cables for £200
Thanks for reading my blog.
>>103095200But making your computer write "barely above a whisper" faster is a bargain at any price.
>>103095225Makes me all tingly thinking about it.
>>103090646I will take those hands, then one of us will be pulled through.
>>103095200Anonymous pondered the question, biting her lip thoughtfully.
>discuss networking topologies in claude ui
>ask it to generate a diagram
>it just does it via artifacts
>meanwhile in local we have a bunch of different frontends that do the same fucking thing
when are we getting on their level?
sexiest of hair ties
>>103095499When some random autist gets involved. The discordfags and silicon valley grifters are incompetent.
>>103095499Never.
The Anthropic product team have an easier time because their model is strong, unlike local. Good luck getting artifacts to work on llama3-3b, or let's be real, llama3-405b
>>103093110
eyes are getting better, good job anon
bonus points for pose action over 1girl poses any day
>>103094621
I dunno man the last poster seemed really low effort and used fucking dalle of all things to promote local, reads like astroturfing
>>103092465
they fixed the text and buildings too, so kino
The Chinese did it again. 405B but fast now.
Since the newfriends had some days to learn how to install koboldcpp by now, what is everyone's favorite model?
>>103096206and also worse in everything except memebenches
>>103096231405B / sorcerer 8X22B. But the new Chinese model seems interesting so far.
>>103096254Are benchmarks not an indicator of real world performance?
>>103096278Not if the existence of the benchmark makes developers teach for the tests.
>>103096278The right ones are to an extent. They didn't show any of the right ones.
>>103096299That only means the tests are shit and extremely specific in how they format knowledge. If you teach for the right tests, then real world performance does also increase because of transfer learning.
There's no way to quantify rng so people simply take the benchmarks and train until the numbers go up. Benchmaxxing is innate
>>103096344What tests are you talking about? So that I can train my models on them and get AGI
https://lifearchitect.ai/models-table/
stop being retarded retards
Death of cai has been a tragedy for everyone.
>>103096372The ones that are refreshed over time to prevent contamination/cheating, the ones that aren't multiple choice but free answer and use a grading rubric and hopefully expert human graders, the ones that include proprietary data.
>>103096414Someone tell him no.
>>103096384
>Opus
>2T
Chat is this real?
>>103096440go talk like a faggot elsewhere
>>103096384
>Opus
>2T
you /g/irls... is this real?
>>103095499
Pretty sure it could be achieved using pydantic/outlines for structured generation.
So the issue is UI and there being a lack of a 'default' or go-to UI.
That and it already exists.
https://github.com/code/app-open-artifacts
>>103096440
No
I fucking hate these wait times
I can't believe there is no new model dropping today... Chink billion vram benchmarkmaxxing abomination doesn't count.
>>103095200>buys the pci-e cable from a different PSU manufacturer>whole house explodes
>>103096601What do you mean you can't believe it? I've been telling you, no one's going to make it look like they were waiting for the elections just to release a model.
>>103096588>I fucking hate these wait timesDon't even know why they were added. It's not as if wait times will affect bots in the slightest. What do they think a bot will get bored having to wait 15 minutes? It's a machine, bots can't get bored.
I was running Kobold 1.74 for a while, just went up to 1.77 because it claimed to show logits (and the only one it shows me is a fucking end of token, wtf).Is anyone else feeling like models that used to just do what they're told are suddenly chatty assholes that overtly ignore instructions, babble about "misunderstanding" and apologizing before doubling down on fucking around, and lying about content in the prompt not being there?I mean, it could be an anomalous prompt, but something smells rotten.
I really don't remember newfags being this retarded in the past. This is so sad.
>>103096806This is what happens when one-click installers become the norm. Accessible even to drooling tards now.
what's up with quantizing sentence transformer models with llama.cpp? Tried bge-multilingual-gemma2, got an error for gemma not supported from the converter script. Then tried sentence-t5-xxl, it converted fine but I'm getting a std::out_of_range segfault, didn't find anything in the issues.Is this just not supported?
>>103096855Simplicity is good though. I did install fish-speech manually using speech.fish.audio/#windows-setup guide, once 1.4 released somewhere in September. Then saw pinokio dev making install script and jumped the ship immediately for obvious reasons.
>>103096902
>bge-multilingual-gemma2
>"architectures": ["Gemma2Model"],
Not supported.
>sentence-t5-xxl
>"architectures": ["T5EncoderModel"],
Matches one of the model architectures, not sure about the details. Also, it's old as fuck. 2+ years... i doubt it's been tested for a while.
What are you trying to do? Check README.md for supported models. Then convert_hf_to_gguf.py for matching architectures according to the model's config.json. If one is supposed to work but doesn't, open an issue, but don't expect much support for deprecated models.
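e.g. something like this (run from the model dir / llama.cpp checkout; the two strings just match the models mentioned above):

python -c "import json; print(json.load(open('config.json'))['architectures'])"
grep -n "Gemma2Model\|T5EncoderModel" convert_hf_to_gguf.py

If the string from config.json shows up in one of the register(...) decorators in the convert script, there's at least a chance it converts.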
>>103096640
I pulled up a saved state, a very simple "Translate this to English: (moon runes)" task that I was using to check for censorship across models.
Every model I had translated it more or less correctly. Some added some commentary because safety alignment woke, but whatever. They all functioned.
Now, they're all kinds of fucked up.
On SSE, it babbles some commentary about the statement without translating it.
On Poll and no streaming, it does two passes for some reason.
The first pass, it chatters:
>I think I understand what you're saying. It sounds like you're expressing …
The second time it gave what I expected, more or less, from how that model used to translate the phrase and then monologue about alignment.
Could this be a regression involving the Usage Mode setting? It's on Instruct but it's sure behaving like it is trying to pretend to have a personality at first or doing some kind of janky chain of thought. Either way, this shit needs to stop.
0 days left until november 5th
>>103097234Voted for kamala. Total chud meltdown incoming.
>>103097234Quick, ask the last model you used what it is voting for
>>103097234I can't believe Petra is a chudette
>>103096952I can't even get Pinokio to run, lol. Gives me `libva error: vaGetDriverNames() failed with unknown libva error`. Running arch btw.
>>103090412so, i have been reading the rentry "wikis". they are very good, are they mirrored anywhere where they can be found besides from these generals?
Marc Andreessen and Ben Horowitz say that AI models are hitting a ceiling of capabilities: "we've really slowed down in terms of the amount of improvement... we're increasing GPUs, but we're not getting the intelligence improvements, at all".
https://x.com/tsarnick/status/1853898866464358795
>>103097287Idk, got no problems installing it all on windows 10 ltsc btw
>>103097293Once you get anything running, you'll learn more by just reading documentation, PRs, and playing with the settings and seeing their effect. Guides are old. They're always old.
>>103097295/lmg/chads predicted this
>>103097348No you didn't
>>103097234
Why do Magatards insist on larping as women?
There was another case where someone ran a Twitter account where he posted images of a European model pretending to be an American Trump supporter.
>>103097414Woah. One guy. Did something. On Twitter.Please tell me more.
>>103097234>mikutroon countdown poster was petra all along
>>103097295Lies. Anthropic models keep getting better and better.
>>103097392first llama 3 quants you absolute retard newfaggot nigger
>>103097295>Ben HorowitzI don't even need to look at the early life section on this one.
>>103097279This.
>>103097119
I found the problem.
1.75's notes,
>Added a new Instruct scenario to mimic CoT Reflection (Thinking)
Maybe I forgot, but I thought Instruct was always there. Now it's replaced with busted shit. Anybody know the fix?
>>103097295
well they just need to understand that it is all about having a good dataset, down to the byte level. This is what Anthropic understands better than anyone.
just feeding an LLM shit from wikipedia is a start, but it's not the end goal. the end goal is pristine datasets.
>>103097500To clarify:I think the new Instruct "Scenario" is screwing up the Settings Usage Mode Instruct Mode. It's CoT double pumping despite my never touching any of those Scenario buttons.
>>103097097
>What are you trying to do?
I'm just trying to run a big embedding model on an 11gb GPU ..
>deprecated models
it's the top performing one on several benchmarks, I don't see any support for embedding models in general
And now for something completely different. Scammer is still laying low.
>>103097605His surname is literally germanic for swindler.
>>103097295
Little TTS test on this post. https://voca.ro/158bEo1Lrnen
Xtts2: around 11-12 sec
Fish-Speech: around 35 seconds
>>103093171There is a twitch Lora (made from twitch chatlogs) on top of a 13B at most. The memory is a basic RAG system and it was implemented recently
>>103097531
>it's the top performing one on several benchmarks
Yeah. 2 years ago, probably. Every model is SOTA for about 10 entire minutes on release.
Check this
>https://huggingface.co/spaces/mteb/leaderboard
And see what models are compatible with llama.cpp. Go to the model, open the config.json and check if the architecture is in convert_hf_to_gguf.py as in >>103097097 .
>>103093175
Gpt-sovits2 definitely
>>103097641thanks for the help, actually mega helpful
>>103097655It was the first benchmark i found and i don't know how good or bad their testing is. Just don't trust benchmark numbers on the model cards, those are immediately outdated on release. I'm sure you can find other benchmarks around.
>>103097620Someone did a test of that new fish-agent btw (small and retarded gpt4o)https://x.com/reach_vb/status/1853920135070761046
>>103097787Its decent but call me back when they do a 72B version.
>>103097641
>>103097741
MTEB is good, but you'll get the best results from fine-tuning an existing one on your specific data.
Also, you don't need to use llama.cpp for embeddings generation, you can also use huggingface transformers just fine.
HF Transformers + stella_en https://huggingface.co/dunzhang/stella_en_400M_v5
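Minimal sketch with sentence-transformers (stella needs trust_remote_code, and iirc the model card wants a query prompt for retrieval-style use, so check it):

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("dunzhang/stella_en_400M_v5", trust_remote_code=True)
embs = model.encode(["first sentence", "second sentence"], normalize_embeddings=True)
print(embs.shape)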
>>103096611I've made so many mistakes in getting to my current setup.What's another one on top.
Local models...bad?
>>103093159
It's possible to even go bigger and use a 22b at IQ3_XS, with the 4-bit cache and flash attention on.
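In llama.cpp terms that's something like the line below (model filename made up, flag spellings from memory so check --help; the quantized V cache needs flash attention enabled). koboldcpp exposes the same things as --flashattention plus a KV quant option:

./llama-server -m Mistral-Small-22B-IQ3_XS.gguf -ngl 99 -c 8192 -fa -ctk q4_0 -ctv q4_0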
>>103097234
>>103097741
I mean as dumb as it sounds I didn't really know how to check for support
>>103097822
I tried that one, it's not really performing well enough. That's why I wanted to try quantized versions of these much bigger models.
>>103097295that retard Roon is in the replies disagreeing with them, so they must be correct
>>103098020>I didn't really know how to check for supportI don't blame you. While there is a list in llama.cpp's readme, model architectures are a mess. And even if you have a compatible architecture in the config.json, it's not a guarantee that the model will work correctly or that it follows the specs llama.cpp expects. The only way is to know for sure is to test them.
>>103098036why is roon so annoying
>>103097891Yes, local models bad. LLMs bad. Miku good. Teto good.
>>103091030>filtered chinkslop Yuck
>>103098228Missing t
>>103091030when will it be on openrouter so I can test it?
>>103098255Not all filters are bad, for example if the synthetic data gets stuck in a loop you would want to filter that shit out of the training data.
>>103098317Dunno >"Extensive Benchmarking: Conducts extensive experiments across various languages and tasks to validate the practical effectiveness and safety of Hunyuan-Large." on HF link tells enough imo, guess all serious AI labs are forced to do the crippling.
>>103091145>TriviaQA almost as good as Claudethis is the one
>>103098235mikufaggot reveals his true colors. shame nobody is here to see it.
>>103098376
It's cooked on benchmarks and terrible
Already been Nala tested thanks to prompt injection on huggingface preview space
It's maybe a little better than Llama-1-33B. >>103093371
>>103098020When you say its not performing well enough, are you taking measurements or you mean its not meeting what you expected
>>103097295Based and truthpilled
>>103097279
>>103097295claude 3.5 sonnet NEW is better
>>103098450Nah its useless for us coomers.
>>103098464Poor ones maybe. Its by far the best model for nsfw as well.
>>103098475Opus mogs
>>103098506>it must be better because it's more expensive
>>103098506Opus is much much dumber.
>>103098410
I'm not running benchmarks but I'm doing some semantics stuff (trying to "rate" inputs by projecting along a semantic axis)
the closest outputs I got to making sense have been openai embeddings, which sucks ass, I'm just trying to replicate at least this behavior with open source models
I don't really fuck with those benchmarks, they're fairly meaningless, I could probably write a paper about that but there's better ways to automate testing for "making sense" in a model where the benchmark is much harder to optimize for than the current state of the art
can't be fucked though, I wrote to a couple local profs for a PhD but nobody seemed interested so whatever, someone else will figure it out soon enough
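fwiw the usual way people do the axis-projection thing, assuming you already have embeddings as numpy arrays (embed whatever positive/negative anchor sentences define your axis, then score everything else against it):

import numpy as np

def semantic_score(x, pos_embs, neg_embs):
    # project embedding x onto the axis running from the negative centroid to the positive centroid
    axis = np.mean(pos_embs, axis=0) - np.mean(neg_embs, axis=0)
    axis = axis / np.linalg.norm(axis)
    return float(np.dot(x, axis))

Whether the scores make sense depends almost entirely on the embedding model, which is probably what you're running into.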
>>103098506>>103098475>>103098464>>103098450>>103098528What the fuck are you faggots doing in local models thread? Fuck off caiggers,
>>103098620I also use 405B, only model competitive with the closed ones.
>>103098652Cool story bro
>>103098676>>103098620benchmark fags aren't worth speaking to
>>103098376>trusting scores on old benchmarks>self-reported
is there a modernized GPT-4chan equivalent? Modernized as in, finetuned on some of the latest free models like Llama 3 or some Mistral etc. (well at least more modern than what GPT-4chan was)?
>>103098376TriviaQA is not what you think it is. In fact, there is no true "trivia" benchmark that is widely recognized by academia. These are reasoning benchmarks that happen to use information from the web (mainly wikipedia) for their reasoning problems, hence "trivia". And it makes sense why academia does not care about pure trivia knowledge memorization, as they want models that are smart rather than models that are encyclopedic.
>>103091145
well we found the winning formula and researchers who want to stack up citation counts to get hired at a FAANG or sponsor their O-1 visa once their OPT STEM expires will use and abuse that fact.
But the researchers who will make it are those who will create the transistor equivalent of machine learning models, i.e. actually dive into the fucking theory and figure out how to maximally compress a model as efficiently as possible with zero loss. This is where the money and fame really is.
qwen 72b MOGS new sonnet at coding
>>103099866mog this. *grabs nuts*
>>103099866*looks slightly up the list*alright
>>103099866Hmm, consider the following though: no it doesn't
>>103099866Looks like a memebench. Stick to Aider for code editing and Livebench for code completion+generation.
Inference Optimal VLMs Need Only One Visual Token but Larger Models
https://arxiv.org/abs/2411.03312
>Vision Language Models (VLMs) have demonstrated strong capabilities across various visual understanding and reasoning tasks. However, their real-world deployment is often constrained by high latency during inference due to substantial compute required to process the large number of input tokens (predominantly from the image) by the LLM. To reduce inference costs, one can either downsize the LLM or reduce the number of input image-tokens, the latter of which has been the focus of many recent works around token compression. However, it is unclear what the optimal trade-off is, as both the factors directly affect the VLM performance. We first characterize this optimal trade-off between the number of visual tokens and LLM parameters by establishing scaling laws that capture variations in performance with these two factors. Our results reveal a surprising trend: for visual reasoning tasks, the inference-optimal behavior in VLMs, i.e., minimum downstream error at any given fixed inference compute, is achieved when using the largest LLM that fits within the inference budget while minimizing visual token count - often to a single token. While the token reduction literature has mainly focused on maintaining base model performance by modestly reducing the token count (e.g., 5−10×), our results indicate that the compute-optimal inference regime requires operating under even higher token compression ratios.  Based on these insights, we take some initial steps towards building approaches tailored for high token compression settings.
https://github.com/locuslab/llava-token-compression
Pretty interesting. Good to know a good OCR model requires a different set up than a good VQA one. Most relevant for on device VLMs since that's where you want the lowest latency with highest accuracy possible
>>103099866which qwen is the god qwen? I downloaded one like a month ago and it was meh at best? Is there some hidden new one, or am I just a promptlet?safetensors link only please. i don't trust random internet quanters
>>103100127anon, qwen spam is unironically CCP propaganda
>>103097295nobody REALLY thought you could just scale these up and they would automatically get better, right? intelligence doesn't work like that
>>103100127
72B 2.5. Dont use llama.cpp
buy your gpus now bwos, they're gonna be extra expensive soon!
>>103100263sure anon, just 2 more weeks until china launches the amphibious invasion
has anyone tried this tts model? how much vram do i need?
https://huggingface.co/OuteAI/OuteTTS-0.1-350M
>>103100354the fp16 gguf is under a gig, bro, you can run this on a calculator
>>103100374
oh fuck i thought it was 350B.
will try to run it on my pc now, still downloading python
>>103100401>350B text to speech/music capable of doing all the work of famous singers on days they're too tired to record themselves without anyone noticing it's AI-generated
>>103100354Sounds trash.
>>103100532
>Sounds trash.
speaking of trash-sounding tts: I noticed sovits does really badly with compressed (as in old-school audio dynamic range compression) samples vs ones with good dynamic range. Anyone know of cli tools that can automatically reverse dynamic range compression on a wav? It appears some of the video game sample sets I'm trying to train on are really clean but are also massively compressed
>>103100572Reword that for chatgpt and ask if it can write a code or something to fix that
>>103099866I genuinely think the Claude Sonnet models are around 70-100b
>>103094400I like this TetoAlso holy quads
>>103090412
Is this the general for discussing f5-tts? Because I've just downloaded and installed it and I'm blown away by how good it is.
Installation was simple in a conda environment and it actually produces somewhat passable voices from just 10 seconds of unprepared audio, looking forward to trying it with better prepared clips than the random e-whore slop I threw at it.
https://vocaroo.com/18mwV6svsU0b
>>103093818teto is love teto is life
>>103100718Yes but expect people to talk smack because everybody hates voice for some reason.
>>103100718
>Is this the general for discussing f5-tts?
Yes, because there is nowhere else.
>I'm blown away by how good it is.
Now try finetuning gptsovits.
>>103100761Because building a tts project is pain.
>>103100807
I thought it was because there was some kind of meme war over which unpronounceable string of letters was the one true voice AI. Tortoise, easy to say. Now it's all acronyms.
>>103100807
>building
Is there something special about it? I haven't tried anything since Tortoise. But I built Llama.cpp tonight to try at troubleshooting Kobold's recent misbehavior, and it didn't give me any trouble and worked as expected. Of course, that's a C++, not a Python. Python is digital masochism.
>>103100838
Fish: >>103093899 >>103093937

RVC:
pip install -r requirements.txt
># for some reason it's pissing and shitting itself about not being able to get the lxml module, use the line below to get it, then go back to running the requirements installation above
conda install -c conda-forge lxml
># need a brute force upgrade of cuda+torch to make it able to "see" the gpu
pip install --upgrade torch==1.9.1+cu111 torchaudio==0.9.1 --extra-index-url https://download.pytorch.org/whl/cu111
# AttributeError: module 'distutils' has no attribute 'version'
# run this installation in your Conda terminal
pip install setuptools==59.5.0
# AttributeError: module 'torch' has no attribute 'pi'
# go to the file "xxxDirectoryxxx\RVC1006Nvidia\infer\lib\infer_pack\models.py"
# under all the module imports add these two lines:
import math
torch.pi = math.pi
Testing the new vpred checkpoint.
>>103100881tetodactyl!
>>103100881Shame about the hands though.
>>103100888I love Teto, 7 finger hands and all
>>103100888>>103100881Black coat on Teto looks great, she should have an official black uniform desu
I liked the EA version more
>you must prompt tokens in this specific order
Retarded
>is this the new imggen thread?
>>103100918>>is this the new imggen thread?You can imggen any image you want here. So long as it is either Miku or Teto.
>>103100886
>>103100898
She loves you too. You'll see next Monday.
>>103100909
I mean, she almost does. But yeah, a deep black coat fits her pretty well.
>>103100918
I'm using a non-standard tag order for these ones. I find that actually following the "intended" order destroys this cyberpunk art style I've stumbled upon so far. This prompt has worked well for me on all the versions desu, though 1.0 was probably the best at it. The rollerblading in the sky one >>103090412 though performed the best on earlier versions, either 0.5 or EA.
>>103100718GPT-Sovits has much more potential if the dev wasn't lazy. He made the TTS engine output very high quality chinese but half-assed everything else. You'd need to retrain the base model with high quality english/jap on all vocal ranges or whatever language you want to support, so the finetuning would be even better than what it is right now (it's still great).
>>103100718
Yeah, I keep saying F5 is the best fast model, at least for American English (I dont speak any other language). I think MaskGCT might be potentially better but that shit takes so fucking long. They need to crunch the models and speed up the inference by 1000%.
Now that Trump won, what does this mean for LLMs?
>>103101316now we accelerate
>>103101316He said he would deregulate it to make the US the leader in the space.
>>103101316They won't know about the Lavon affair or U.S.S. Liberty anymore.
>>103101316Depends on if he listens to JD Vance and Elon about things or not, and on how to legislate this shit
>>103101316>>103101318Also he is buddies with Elon who wants to go all in on it.
>>103101316center-aligned models confirmed
>>103101316Based and uncensored Llama4
>>103101316
Elon getting his way means open-source LLMs and deregulation of tech.
Bet they discussed it already
>>103101316Have Americans learned their lesson from 2020 and counted mail-in votes at the same time as in-person votes?Because otherwise the result will eventually shift towards the Democrats and Harris could still win.
>>103101380>catgirls, you say?>yes. catgirls
>>103101404Catgirls for every able bodied Martian to help with colonization of Mars.
It seems you can do gens from gpt-sovits completely firewalled off from the outside world, but training steps will fail when it can't resolve www.modelscope.cn. Is it just trying to download something, or is there phoning home going on?
>>103101414prob downloading something, just check the code to see where it first initializes the models
>>103101318Isn't it already unregulated? The safetyslop stuff is being done entirely voluntarily by the labs, no law is making them do it.
>>103101422there will be new laws forcing them to remove it
>>103101422
Bro, the Starship launch was delayed by 2 years over ocean studies about whether Starship would fall on sharks and whales, beetles, and nonsense about small amounts of sound/heat suppression water spilling and draining a bit into the swamp that's next to the gulf, in the middle of hurricane season when heavy rain and flooding happen anyway. A literal drop of water is what was delaying them for months and caused SpaceX to be fined half a million dollars.
>>103101414>modelscopeIt's used to download the Chinese ASR, you can skip it completely if you're not doing any training in Chinese. You just have to use faster-whisper instead
>>103101432Yeah, that sucks. But it's got nothing to do with LLMs, which is what the post I was responding to was about. LLMs are unregulated.
>>103101442it was doing it on the denoising step, too
>>103101451
It's expecting it here in tools/cmd-denoise.py:
path_denoise = 'tools/denoise-model/speech_frcrn_ans_cirm_16k'
path_denoise = path_denoise if os.path.exists(path_denoise) else "damo/speech_frcrn_ans_cirm_16k"
DL the model, put it in that folder and it shouldn't complain anymore
Wonder what ylecunt thinks right now
>>103101567Seething cause he's full into the leftist propaganda and into his own echo chamber
Can I run a decent model on a 2060?
>>103101775
q5 mistral nemo with offloading
>>103101316
depends if he goes full christcuck or not. if he does, then be prepared to marry your model before thinking about ERP.
>>103101784
thank you for the advice
i was going to try to use this, but as it turns out the computer with the 2060 has a broken cpu fan and also a spider was living in it (it is dead now)
Do you know any team/organization of AI specialists that offers to fine-tune a model for your specific task and on your own datasets?
>>103101978Fiverr
>>103101978MistralAI does that. I think cohere too. Probably better than sloptuners.
>>103102007>mistral>cohere>not slop
>>103102016Compared to the 10mb dataset hf tuners, you retard.
>>103102007
Already sent a request form to both of them.
>>103101998
I think I might end up on freelancer platforms.
>>103102007>cohereThey're not my friends anymore, it's OVER.
xAI will save us, trust ze plan!
>>103098620Claude is also local on Anthropic's and Amazon's computers
It was all memes and fun. But holy shit /lmg/ is truly dead now. No new models. Permanent newfag infestation. (now with corrected picture). Also forgot how loaders are becoming abandonware now.
>>103102587
>>103102127
>>103102649>>103102649>>103102649
Weekly check-in. Any 70b+ models almost as good as Claude?